SQL coding challenge #2: Relational division: Find all movies two actors cast in together

This challenge was a bit tricky, but quite interesting, so I thought it would be a good topic to talk about and walk through its solution, along with the thought process I applied to build the solution query from scratch.

If you find this article useful or simply enjoy reading it, don’t forget to clap! Or even leave a comment. Authors are always happy about claps and feedback, and it is a great way to let us know we are writing useful articles for you :-)

It makes use of subquery, aggregation with the “group by” clause, sorting using the “order by” clause, row filtering with the “where” clause, and “inner join” between a subquery and another table.

Instructions:

Given film_actor and film tables from the DVD Rental sample database find all movies both Sidney Crowe (actor_id = 105) and Salma Nolte (actor_id = 122) cast in together and order the result set alphabetically.

Column     | Type                        | Modifiers
------------+-----------------------------+----------
title | character varying(255) | not null
film_id | smallint | not null
Column     | Type                        | Modifiers
------------+-----------------------------+----------
actor_id | smallint | not null
film_id | smallint | not null
last_update | timestamp without time zone | not null
Column     | Type                        | Modifiers
------------+-----------------------------+----------
actor_id | integer | not null
first_name | character varying(45) | not null
last_name | character varying(45) | not null
last_update | timestamp without time zone | not null

The desired output:

title
-------------
Film Title 1
Film Title 2
...

Solution:

select film.title
— we get the film_id columns for any movies where either actor was cast
— we count how many rows we have for each individual film_id by grouping
— subquery returns a table with two columns, film_id and cnt
— we name this table abc
from (select film_id, count(film_id) as cnt
from film_actor
where actor_id in (105, 122)
group by film_id
order by cnt desc
) abc
— take abc and join it to film to match the film_ids and get the corresponding titles
inner join film
on abc.film_id = film.film_id
— then we only keep the movies where the row count was 2
— count 2 means both actors were cast in the same movie
where cnt=2
— sort the result alphabetically, ascending order, by title
order by title asc

Solution

Instructions, step by step:

find all movies, meaning titles of the movies.
both Sidney Crowe (actor_id = 105) and Salma Nolte (actor_id = 122) cast in together, this looks like a where clause where we want the actor_id column to be 105 and/or 122.
order the result set alphabetically, meaning an order by clause sorting the movie titles alphabetically, i.e. ascending from a to z.

Let’s take a look at the tables we have, and how they relate to each other.
film — title, film_id
film_actor — actor_id, film_id, and last_update
actor — actor_id, first_name, last_name, last_update

We can retrieve a list of film_ids where either one of the two actors was cast by selecting rows from the film_actor table and filtering on actor_id being equal to 105 or 122.

select film_id, actor_id
from film_actor
where actor_id in (105, 122)

What I want to see though is a list of only film_ids that belong to movies where BOTH actors were cast. Let’s think about how we can do that, if we were to work it out manually. We would review the list, and then take note of how many rows we have for each given movie. If the row count for a given movie/film_id is 1, it means that only one of the two actors played in it. If the row count for a given movie/film_id is 2, it means that both of the actors played in it. So we need to develop our basic query above to include a row count. When we think about getting a count, we should immediately think about the “count” function in SQL, and also aggregation, i.e. the “group by” clause.

select film_id, count(film_id) as cnt
from film_actor
where actor_id in (105, 122)
group by film_id

Now, we want to change the film_id into a title. To do so, we need to match rows between the above table we just generated, and one other table. But which one? Well, we need another table that contains the film_id column, and also the title column. Looking at our tables’ schemas above, this would be the film table. When we think about matching rows between two tables, we need to think of a “join” operation. In our case, we will apply an “inner join” on film_id between our newly generated table above, let’s call it abc, and the film table.

This query is not syntactically correct yet… but I’m using it to show how we make the inner join happen between our generated table, abc, and the film table.

(select film_id, count(film_id) as cnt
from film_actor
where actor_id in (105, 122)
group by film_id) abc
inner join film
on abc.film_id = film.film_id

To make the above query syntactically correct, we need to use the syntax for subqueries, keeping in mind that the column we want in our final result is the title column:

select title from
(select film_id, count(film_id) as cnt
from film_actor
where actor_id in (105, 122)
group by film_id) abc
inner join film
on abc.film_id = film.film_id

What we see here now is a list of all the movie titles instead of their film_id, along with the count we did earlier, which tells us whether only one of the actors was cast in the movie, or if both were cast in the movie. For example, in the screen shot above, we wee that only one of the actors was cast in “Alaska Phantom”, while both actors were cast in “Antitrust Tomatoes”. We want to filter these rows and only keep those where the cnt column is 2, which means both actors played in that movie.

select title, cnt from
(select film_id, count(film_id) as cnt
from film_actor
where actor_id in (105, 122)
group by film_id) abc
inner join film
on abc.film_id = film.film_id
where cnt=2

Let’s recall what our instructions were:
Given film_actor and film tables from the DVD Rental sample database find all movies both Sidney Crowe (actor_id = 105) and Salma Nolte (actor_id = 122) cast in together and order the result set alphabetically.

As you can see, we have the movie titles where both actors were cast, however the problem only asks that we list the title, and also that the order by alphabetical. For this, let’s only select the title and then order ascending by title.

select title from
(select film_id, count(film_id) as cnt
from film_actor
where actor_id in (105, 122)
group by film_id) abc
inner join film
on abc.film_id = film.film_id
where cnt=2
order by title asc

And there it is, our final and correct output accepted for the challenge:

We can clean up the query, add some comments, and then submit for a successful completion of this challenge, on Codewars:

Final solution

I hope you enjoyed this article, and if you did, please clap, and feel free to leave a comment. Questions are welcome, and I try my best to answer them in a clear and concise way that is easy to understand, whatever your SQL experience level may be.

Until next time, happy querying!

I am a Data Scientist with 10+ years experience in the Tech Industry, and a background in Education and Customer Service.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store