Exercise 3: SQL Queries for Presidential Election Results

Download this datafile, which contains results for the last presidential election (at the county level).

Launch sqlite in a terminal and open/create a database via

In [ ]:
sqlite3 pres.db

Load the .csv via the following

In [ ]:
sqlite> .mode csv
sqlite> .separator ,
sqlite> .import 2016_US_County_Level_Presidential_Results.csv pres

Enter the following commands in order, noting their effects.

In [ ]:
sqlite> SELECT * FROM pres LIMIT 10;
sqlite> .mode column
sqlite> .header on
sqlite> SELECT * FROM pres LIMIT 10;

Try out the following commands, and comment on what you learn.

In [ ]:
sqlite> SELECT * from pres WHERE state_abbr = NY;
sqlite> SELECT * from pres WHERE state_abbr = "NY";
sqlite> SELECT votes_dem from pres WHERE state_abbr = "NY";
sqlite> SELECT votes_dem,combined_fips from pres WHERE state_abbr = "NY";

Aggregation by GROUPS

You can use 'GROUP BY' to implement partial aggregations:

SELECT a,b,c,AGG(d) FROM table GROUP BY a,b;

For example, compare the following 2 queries:

In [ ]:
sqlite> SELECT state_abbr,MAX(per_dem) from pres;
sqlite> SELECT state_abbr,MAX(per_dem) from pres GROUP BY state_abbr;

Which query above gives the maximum value of per_dem across all counties in the United States, and which gives the maximum of per_dem for each state?

Using VIEWS

You can create views, which essentially save your search queries as new tables. That is, given a huge database, you can use it to construct smaller tables (i.e., views) to help you search through the database and to help construct more complicated queries.

Consider the following:

sqlite> CREATE VIEW state_view AS SELECT state_abbr,SUM(votes_dem),SUM(votes_gop),SUM(total_votes) from pres GROUP BY state_abbr; sqlite> SELECT * FROM state_view;

Note that you can now access the table state_view as if it was a table in the database. To delete such a view, you can enter

In [ ]:
DROP VIEW IF EXISTS state_view;

Now, quit sqlite by entering

 .quit

Exporting SQL Results to an .csv File

Repoen the database using

sqlite3 pres.db

and enter the following commands:

In [ ]:
sqlite> .separator ,
sqlite> .header on
sqlite> .output votes_and_fips.csv
sqlite> SELECT votes_dem,combined_fips from pres;
sqlite> .quit

What happened?

Reopen sqlite and try the following:

In [ ]:
sqlite> SELECT COUNT(votes_dem) from pres where state_abbr = "NY";

Questions

1. Use COUNT to identify how many counties are in California (CA)

2. Use SUM to calulate the total number of votes in California.

3. Use MAX to calulate the maximum of 'per_gop' across counties in California.

Harder Question

4. Use the sum and group to create a file state_votes.csv that contains 2 columns: the sum of total_votes across a state, and the state abbreviations.