In [8]:
import sqlite3
con = sqlite3.connect('movie_db.db')# connect to a database (or create it, if it doesn't already exist)
cur = con.cursor()#generate a 'live' cursor 
with open("movie_db.sql") as f: 
    sql_data = f.read()
#print(sql_data[0:400],10) 
cur.executescript(sql_data);
import pandas
pandas.set_option('display.max_rows',20)
def sql_request(q):
    cur.execute(q)
    return pandas.DataFrame(cur.fetchall(),columns=[d[0] for d in cur.description])  # just using Pandas for nice tabular display of results, nothing more

Class 15 Outline

Homework 3: SQL Queries for Movie Database

Due by Sunday April 1 at midnight.

Study this jupyter notebook and ...

* [5pts] Answer the 5 extended questions
* [5pts] Answer 5 additional questions from the list of 15 more questions. (You can pick any 5.)

You must write 1-2 sentences about each result, interpretting what you've found. For example, does your query return all movies or only those which have received reviews?

Submit to UBLearns a jupyter notebook

LastName_FirstName_HW3.ipynb

Your answer for each question should be in the form of indicated in the notebook. For example, if the question is:

1. Write a query that returns movies released after 2012.

Then any query invoking 'year>2012' would be correct, such as the following:

In [6]:
query1 = "select * from movie,rating,reviewer on (movie.movieID=rating.movieID and rating.raterID=reviewer.raterID and year>2012)"
sql_request(query1)
Out[6]:
movieID title year director raterID movieID stars ratingDate raterID name
0 1 Birdman 2014 Alejandro González Iñárritu 100 1 5 2017-03-07 100 Jenna
1 41 Jurassic World 2015 Colin Trevorrow 999 41 5 2017-03-07 999 Raj
2 41 Jurassic World 2015 Colin Trevorrow 97 41 5 2017-03-07 97 Danny
... ... ... ... ... ... ... ... ... ... ...
7 6 Frozen 2013 Chris Buck Jennifer Lee 67 6 5 2017-03-07 67 Aishani
8 7 Lion 2016 Garth Davis 67 7 5 2017-03-07 67 Aishani
9 96 Hidden Figures 2016 Theodore Melfi 67 96 5 2017-03-07 67 Aishani

10 rows × 10 columns

This returns movies and their reviews, and does not include movies that were not reviewed.

In [3]:
query1 = "select * from movie where movie.year>2012"
sql_request(query1)
Out[3]:
movieID title year director
0 1 Birdman 2014 Alejandro González Iñárritu
1 41 Jurassic World 2015 Colin Trevorrow
2 6 Frozen 2013 Chris Buck Jennifer Lee
3 7 Lion 2016 Garth Davis
4 96 Hidden Figures 2016 Theodore Melfi
5 12 Get Out 2017 Jordan Peele
6 98 Spotlight 2016 Tom McCarthy
7 1199 Lala Land 2017 Damien Chazelle
8 233 Jungle Book 2016 Jon Favreau

This is all movies more recent than 2012, including those that do and don't have reviews.