SQL – Importing Data, Creating a Database

For this exercise, we will be using an open source database manager called SQLite that runs as an extension to Firefox. It basically turns your browser into a database program. Firefox Add-On: https://addons.mozilla.org/en-us/firefox/addon/sqlite-manager/

This will suffice for our purposes this semester, but if you are going to deal with Big Data, you will need Big Boy and Big Girl data software. A discussion for another day.
Firefox: Install add-on, relaunch browser. Find Add-on in Menu Bar: Tools

Create a database

1) Create a database. Call it anything, Test, Donkey, Hog. Put it in the same folder as your data: FL_nativity
2) Import Data FL_nativity into SQL.
This can be tedious, and here’s why:
We have to tell SQL about each column, whether it is text, numbers or some variation in between.
There are multiple data types:
For this, we are just going to define the columns as either “text” or “numeric” — more on the other data types later. So you need to separately define 15 different columns as text or numeric. (When you buy Big Boy or Big Girl software, the import sequence is less tedious)
As you import FL_nativity — do yourself a favor and open it separately in a spreadsheet so you can see the various columns and what they contain. Check the number of columns in the spreadsheet versus the number of imported columns to ensure they match. You should have 15 imported columns and it should import 69 rows.
Headings
IMPORTANT NOTE: Eliminate Drudgery of renaming file names – CHECK the box import top row with file names
Id     Id2     Geography     Estimate_Total     MOE_Total     Native     Native_MOE     Puerto_Rico     Puerto_Rico_MOE     BornAbroad     BornAbroad_MOE     Naturalized     Naturalized_MOE     Noncitizen     Noncitizen_MOE
Here’s what your finished import sequence should look like:
This is what you did in SQL language
CREATE TABLE “FL_nativity” (“Id” NUMERIC,”Id2″ NUMERIC,”Geography” TEXT,”Total” NUMERIC DEFAULT (null) ,”MOE_Total” NUMERIC,”Native” NUMERIC,”Native_MOE” NUMERIC,”Puerto_Rico” NUMERIC,”Puerto_Rico_MOE” NUMERIC,”BornAbroad” NUMERIC,”BornAbroad_MOE” NUMERIC,”Naturalized” NUMERIC,”Naturalized_MOE” NUMERIC,”Noncitizen” NUMERIC,”Noncitizen_MOE” NUMERIC)
If you messed up, here is how you edit the headings
1) Structure
2) See columns, click on row and rename

WORKING WITH FL_NATIVITY

Instructions for this exercises follow this tipsheet: MySQL_Tipsheet-rsw using this data FL_nativity

Try to create a new database (see instructions on this page) and perform some basic queries.

We return to our SQL commands to see what is in the database.  Remember to have a list of your column headings handy so you spell the queries correctly.
SELECT * FROM FL_nativity
–Brings up all columns from FL_nativity
SELECT geography, total, native, naturalized, noncitizen FROM FL_nativity
–brings up these columns
Wildcard query
WHERE georgraphy LIKE ‘a%’ Finds any values that starts with “a”
Add a column
ALTER TABLE “main”.”FL_nativity” ADD COLUMN “Immigrant” NUMERIC
Fill it with stuff
UPDATE FL_nativity
SET immigrant = (naturalized + BornAbroad)
Follow this cheat sheet for additional exercises:
FL Poverty

Homework

Updated 4/5

  1. Import the FL_poverty data and clean up the labels.

–You may find it easier to edit the labels on Excel first before importing them. Remember, no spaces

2) Join the Fl_nativity and FL_poverty tables in SQL. Join them on the Id2 field. Query SELECT * to show both joined, post a screenshot.

For background, first review this entry on the basic concept of joining tables:
https://www.w3schools.com/sql/sql_join_inner.asp

Documentation on the Joining Tables command.
https://dev.mysql.com/doc/refman/5.7/en/join.html

If you can’t get it to work after an hour, then write down your progress, any questions and post that for the assignment.

3) Assuming you get it to work, run a few basic queries:

From Fl_Poverty:

The top and bottom counties with the most and least number of people in poverty.

Top three counties with the highest Non-Citizen** poverty

Bonus question:

Create a new column for the native poverty rate (poverty / population) and non-citizen poverty rate.  Determine the top and bottom counties in these categories.

**Updated the question Wednesday

Blog Post by 11:59 pm with the answers.

Quick Reference Sheet for these commands:

 
 
Major Commands:
 
SELECT
FROM
WHERE
GROUP BY
ORDER BY
 
 
SELECT: You tell the database what columns you want.
If you want everything, SELECT *
Example: SELECT * FROM Customers;
–Retrieves EVERYTHING (*) from the database called Customers
 
WHERE: A filter. Allows you to grab only certain records do numerical and text string, or a horizontal slice of your data; you set criteria that has to be true in order for rows to be returned
SELECT * FROM Customers WHERE ContactName “Maria Anders”;
–retrieves all records with Maria Anders as contact name
 
WILDCARD SEARCH
SELECT * FROM Customers WHERE ContactName LIKE “Maria%”;
–retrieves all records with Maria in ContactName. Note the addition of LIke and the wildcard %
 
GROUP BY – this groups rows together by a certain field or fields so that you can do math on those groups, such as counting the number of records in each group or summing up the numbers in a field.
SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country;
–Here we are counting the number of countries in the database
Remember this command is paired with putting an aggregate function in your SELECT statement, such as COUNT(*) or SUM(noncitizen)
ORDER BY – this sorts your results by a field or fields, ascending (default) or descending (DESC)
SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country;
–Produces a list ordered by country, alphabetized
DESC ASC
List of countries in descending order:
SELECT * FROM Customers
ORDER BY Country DESC;
 
Add a column
ALTER TABLE “main”.”FL_nativity” ADD COLUMN “Immigrant” NUMERIC
Fill it with stuff
UPDATE FL_nativity
SET immigrant = (naturalized + BornAbroad)

Links to SQL Tutorials

SQL Reference:

https://www.w3schools.com/sql/sql_quickref.asp

Tip Sheet for FL_Nativity:

MySQL_Tipsheet-rsw

Data types

https://www.w3schools.com/sql/sql_datatypes.asp

SQL Tutorial from Matt Waite:

https://github.com/tthibo/SQL-Tutorial