Sample dataset: Pagila¶
Aiven provides a sample database you can import in your Aiven for PostgreSQL service. This page covers information about the database and the procedure to get it up and running.
Sakila (and thus, Pagila) is a database representing a DVD rental store (remember those?!), containing information about films (like title, category, actresses), rental stores (like address, staff members, customers) and rentals, where a customer rents a film from a store through its staff.
With all these relational information, Pagila is a perfect fit to play around with PostgreSQL and the SQL language.
Load Pagila via the web console¶
Before exploring the Pagila database, follow the create new service article to spin up a PostgreSQL instance.
With your new PostgreSQL service, you can quickly load the sample dataset by clicking on the “Load sample dataset” button.
After loading the data, connect to the PostgreSQL instance using the command below. The
SERVICE_URI value can be found in the Aiven Console dashboard.
Load Pagila manually¶
pagila-data.sqlfrom our GitHub repository.
You may use the following command on your terminal:
Connect to the PostgreSQL instance using the following command. The
SERVICE_URIvalue can be found in the Aiven Console dashboard.
psqlshell, create a database named
pagilaand connect to it with the command below:
CREATE DATABASE pagila; \c pagila;
Populate the database with the command below. This might take some time.
Once the command finishes, make sure to reconnect to the database to access the imported data:
You are ready to go! You can use the Sample queries section below to explore the database. Have fun!
Entity-relationship model diagram¶
The image below shows an overview of the Pagila database tables and views, generated by DBeaver.
Here you can see how the tables relate to each other, through each fields. For example, the
film table has string columns like
description. It also relates to the table
language with the columns
original_language_id. With that information, you know that you can join both tables to get the language of each film, or to list all films for a specific language.
Let’s explore the dataset with a few queries. All the queries results were limited by the first 10 items.
List all the films by ordered by their length
select film_id, title, length from film order by length desc;
|film_id|title |length| |-------|------------------|------| |426 |HOME PITY |185 | |690 |POND SEATTLE |185 | |609 |MUSCLE BRIGHT |185 | |991 |WORST BANGER |185 | |182 |CONTROL ANTHEM |185 | |141 |CHICAGO NORTH |185 | |349 |GANGS PRIDE |185 | |212 |DARN FORRESTER |185 | |817 |SOLDIERS EVOLUTION|185 | |872 |SWEET BROTHERHOOD |185 |
List how many films there are in each film category
select category.name, count(category.name) category_count from category left join film_category on category.category_id = film_category.category_id left join film on film_category.film_id = film.film_id group by category.name order by category_count desc;
|name |category_count| |-----------|--------------| |Sports |74 | |Foreign |73 | |Family |69 | |Documentary|68 | |Animation |66 | |Action |64 | |New |63 | |Drama |62 | |Sci-Fi |61 | |Games |61 |
Show the actors and actresses ordered by how many movies they are featured in
select actor.first_name, actor.last_name, count(actor.first_name) featured_count from actor left join film_actor on actor.actor_id = film_actor.actor_id group by actor.first_name, actor.last_name order by featured_count desc;
|first_name|last_name|featured_count| |----------|---------|--------------| |SUSAN |DAVIS |54 | |GINA |DEGENERES|42 | |WALTER |TORN |41 | |MARY |KEITEL |40 | |MATTHEW |CARREY |39 | |SANDRA |KILMER |37 | |SCARLETT |DAMON |36 | |VIVIEN |BASINGER |35 | |VAL |BOLGER |35 | |GROUCHO |DUNST |35 |
Get a list of all active customers, ordered by their first name
select first_name, last_name from customer where active = 1 order by first_name asc;
|first_name|last_name| |----------|---------| |MARY |SMITH | |PATRICIA |JOHNSON | |LINDA |WILLIAMS | |BARBARA |JONES | |ELIZABETH |BROWN | |JENNIFER |DAVIS | |MARIA |MILLER | |SUSAN |WILSON | |MARGARET |MOORE | |DOROTHY |TAYLOR |
See who rented most DVDs – and how many times
select customer.first_name, customer.last_name, count(customer.first_name) rentals_count from customer left join rental on customer.customer_id = rental.customer_id group by customer.first_name, customer.last_name order by rentals_count desc;
|first_name|last_name|rentals_count| |----------|---------|-------------| |ELEANOR |HUNT |46 | |KARL |SEAL |45 | |CLARA |SHAW |42 | |MARCIA |DEAN |42 | |TAMMY |SANDERS |41 | |WESLEY |BULL |40 | |SUE |PETERS |40 | |MARION |SNYDER |39 | |RHONDA |KENNEDY |39 | |TIM |CARY |39 |
Ready for a challenge?¶
After playing around with the sample queries, can you use SQL statements to answer some of these questions?
What is the total revenue of each rental store?
select store.store_id, sum(payment.amount) as "total revenue" from store left join inventory on inventory.store_id = store.store_id left join rental on rental.inventory_id = inventory.inventory_id left join payment on payment.rental_id = rental.rental_id where payment.amount is not null group by store.store_id order by sum(payment.amount) desc;
|store_id|total revenue| |--------|-------------| | 2| 33726.77| | 1| 33689.74|
Can you list the top 5 film genres by their gross revenue?
select category.name, film.title, sum(payment.amount) as "gross revenue" from film left join film_category on film_category.film_id = film.film_id left join category on film_category.category_id = category.category_id left join inventory on inventory.film_id = film.film_id left join rental on rental.inventory_id = inventory.inventory_id left join payment on payment.rental_id = rental.rental_id where payment.amount is not null group by category.name, film.title order by sum(payment.amount) desc limit 5;
| name | title | gross revenue| |------------|-------------------|--------------| |Music | TELEGRAPH VOYAGE | 231.73| |Documentary | WIFE TURN | 223.69| |Comedy | ZORRO ARK | 214.69| |Sci-Fi | GOODFELLAS SALUTE | 209.69| |Sports | SATURDAY LAMBS | 204.72|
texttype, allowing for full text search queries, what will you search for?
-- Select all descriptions with the words "documentary" and "robot" select film.title, film.description from film where to_tsvector(film.description) @@ to_tsquery('documentary & robot');
| title | description | |-----------------|--------------------------------------------------------------------------------------------------------------------| |CASPER DRAGONFLY | A Intrepid Documentary of a Boat And a Crocodile who must Chase a Robot in The Sahara Desert | |CHAINSAW UPTOWN | A Beautiful Documentary of a Boy And a Robot who must Discover a Squirrel in Australia | |CONTROL ANTHEM | A Fateful Documentary of a Robot And a Student who must Battle a Cat in A Monastery | |CROSSING DIVORCE | A Beautiful Documentary of a Dog And a Robot who must Redeem a Womanizer in Berlin | |KANE EXORCIST | A Epic Documentary of a Composer And a Robot who must Overcome a Car in Berlin | |RUNNER MADIGAN | A Thoughtful Documentary of a Crocodile And a Robot who must Outrace a Womanizer in The Outback | |SOUTH WAIT | A Amazing Documentary of a Car And a Robot who must Escape a Lumberjack in An Abandoned Amusement Park | |SWEDEN SHINING | A Taut Documentary of a Car And a Robot who must Conquer a Boy in The Canadian Rockies | |VIRGIN DAISY | A Awe-Inspiring Documentary of a Robot And a Mad Scientist who must Reach a Database Administrator in A Shark Tank |
To clean up the environment and destroy the database, run the following commands:
\c defaultdb; DROP DATABASE pagila;
The source code for the Pagila database is available from our repository.