1. Import the FL_poverty data and clean up the labels.

–You may find it easier to edit the labels on Excel first before importing them. Remember, no spaces

2) Join the Fl_nativity and FL_poverty tables in SQL. Join them on the Id2 field. Query SELECT * to show both joined, post a screenshot.

For background, first review this entry on the basic concept of joining tables:
https://www.w3schools.com/sql/sql_join_inner.asp

Documentation on the Joining Tables command.
https://dev.mysql.com/doc/refman/5.7/en/join.html

SELECT * FROM FL_nativity, FL_poverty;
SELECT * FROM FL_nativity INNER JOIN FL_poverty ON FL_nativity.Id2 = FL_poverty.Id2;
SELECT FL_nativity.geography, FL_poverty.Total FROM FL_nativity INNER JOIN FL_poverty ON FL_nativity.Id2 = FL_poverty.Id2;

3) Assuming you get it to work, run a few basic queries:

From Fl_Poverty:

The top and bottom counties with the most and least number of people in poverty.

Top three counties with the highest black poverty

 

The cleaned data is here:

FL_povertyCLEAN

 

Part 2

Bonus Question:

Create a new column for the native poverty rate (poverty / population) and non-citizen poverty rate.  Determine the top and bottom counties in these categories.

You will need to convert the datatype to REAL in order to do division

Export A New Database:

–Join the FL_Nativity and FL_Poverty tables

–Include the Total from the Nativity, Born in the US, Not US citizen, Total poverty, Native Below Poverty and Non Citizen Below Poverty

Export Your Data into a Subset File
–See “Actions” Dialog on “Execute SQL” tab

–Click to “Save Result as CSV File”

This downloads your SQL search into a new table
The only problem: It doesn’t download the header files
—Open the .csv file and add the headers

Import the .csv into a new database called FL_POV_IMM

–Create new columns with the ratios:

Born in US / Total Population

Not Citizen / Total Population

Total Poverty / Native US Citizen below poverty level

Total poverty / Non citizen below poverty

Answer what are the highest and lowest of each of these four categories.

 

Homework

To prepare for the Denise Malan workshop, review this tutorial and dataset on Google Fusion tables, a free mapping tool.

fusionComplete

Magnet_schools

Remember, we meet Tuesday in JB Hunt 535

We will continue to work on SQL. This assignment is due Tuesday night (NOTE NEW TIME) at 11:59 pm
  1. Join FL_nativity and FL_poverty, as discussed in class
  2. Create a database table with the following fields. REMEMBER TO USE REAL instead of NUMERIC for numerical fields:

FL_nativity.geography
FL_nativity.Total_Nativity
FL_nativity.Born_In_US
FL_nativity.NOT_Citizen
FL_poverty.Total_Poverty
FL_poverty.Native_BelowP
FL_poverty.Non_Citizen_BELOWP
(your column titles may vary but the data should be the same)

3. Create new columns with the following calculations:

–The ratio of US residents to the total population (Born_In_US / Total_Nativity)
–The ratio of non-citizens to the total population (NOT_Citizen / Total_Nativity)
–The ratio of US residents in poverty (Native_BelowP /Total_Poverty)
–The ratio of non-citizens to the total population (Native_BelowP /Total_Poverty)

Export this database as a .csv file and put it in a blog post

4. Answer the following questions in a blog post

–Identify the counties and the rates of all counties with poverty rates greater than 25%

–Identify the TOP 10 counties with the highest non-citizen population by ratio

–Identify the TOP 10 counties with the highest US poverty by ratio.

Bonus Question:

–Construct a formula that determines which counties would match from the two top 10 lists of highest non-citizen by ratio and highest US poverty by ratio.

Answers for April 11 Homework

 

  1. Join FL_nativity and FL_poverty, as discussed in class
  2. Create a database table with the following fields. REMEMBER TO USE REAL instead of NUMERIC for numerical fields:

FL_nativity.geography
FL_nativity.Total_Nativity
FL_nativity.Born_In_US
FL_nativity.NOT_Citizen
FL_poverty.Total_Poverty
FL_poverty.Native_BelowP
FL_poverty.Non_Citizen_BELOWP
(your column titles may vary but the data should be the same)

3. Create new columns with the following calculations:

–The ratio of US residents to the total population (Born_In_US / Total_Nativity)

//Check your math before you create the column //

SELECT Geography, Total_Nativity, Born_In_US, (Born_In_US / Total_Nativity) from FL_POV_IMM ;

//Create Four New Columns with the appropriate Ratios //

ALTER TABLE FL_POV_IMM

ADD COLUMN BORN_US_Ratio REAL ;

ALTER TABLE FL_POV_IMM

ADD COLUMN NON_Citizen_Ration REAL ;

ALTER TABLE FL_POV_IMM

ADD COLUMN USPOVERTY_Ratio REAL;

ALTER TABLE FL_POV_IMM

ADD COLUMN FOREIGNPOVERTY_Ratio REAL;

//Create New Columns with the appropriate Ratios //

–The ratio of US Citizens to the total population (Born_In_US / Total_Nativity)

UPDATE FL_POV_IMM

SET “BORN_US_RATIO” = (BORN_IN_US / Total_Nativity) *100 ;

//Check to see you did it correctly//
SELECT Geography, Total_Nativity, Born_In_US, BORN_US_Ratio from FL_POV_IMM ;
–The ratio of non-citizens to the total population (NOT_Citizen / Total_Nativity)

UPDATE FL_POV_IMM

SET “NON_CITIZEN_RATIO” = (NOT_Citizen / Total_Nativity) *100 ;


–The ratio of US residents in poverty (Native_BelowP /Total_Poverty)

UPDATE FL_POV_IM

SET “USPOVERTY_RATIO” = (Native_BelowP / Total_Poverty) *100 ;


–The ratio of non-citizens to the total population (Native_BelowP /Total_Poverty)

UPDATE FL_POV_IMM

SET “FOREIGNPOVERTY_RATIO” = (Non_Citizen_BelowP / Total_Poverty) *100 ;

4. Answer the following questions

//NOTE: I wrote the first question with US Poverty rates in mind. Some of you answered with a TOTAL POVERTY RATIO, which also is accurate //

–Identify the counties and the rates of all counties with poverty rates greater than 25%

SELECT Geography, TOTALPOVERTY_Ratio
FROM FL_POV_IMM2
WHERE TOTALPOVERTY_RATIO > 25
Order By TOTALPoverty_Ratio DESC ;

Answer:
DeSoto County, 30.6
Hamilton County, 29.3
Putnam County, 28.7
Holmes County, 27.8
Okeechobee County, 27.7
Hardee County, 27.4
Gadsden County, 27.1
Hendry County, 26.4
Madison County, 25.7

–Identify the counties and the rates of all counties with US poverty rates greater than 25%

SELECT Geography, USPOVERTY_Ratio
FROM FL_POV_IMM
WHERE USPOVERTY_RATIO > 25
Order By USPoverty_Ratio DESC;

Hamilton County with 28.1%,
Putnam County with 27.4%,
Holmes County with 26.4%,
Madison County with 25.3%,
Gadsden County with 25.2%

Identify the TOP 10 counties with the highest non-citizen population by ratio

SELECT Geography, NON_CITIZEN_Ratio

FROM FL_POV_IMM

WHERE NON_CITIZEN_Ratio > 9.6

Order By Non_Citizen_Ratio DESC ;

Miami-Dade County with 23.6%, Hendry County with 18.03%, Broward County with 14.06%, Collier County with 13.9%, DeSoto County with 13.7%, Hardee County with 13.4%, Palm Beach County with 11.97%, Orange County with 10.1%, Glades County with 9.7%, and Osceola County with 9.6%

–Identify the TOP 10 counties with the highest US poverty by ratio.

SELECT Geography, USPOVERTY_RATIO

FROM FL_POV_IMM

WHERE USPOVERTY_RATIO > 21.8

ORDER BY USPOVERTY_RATIO DESC ;

Hamilton County with 28.06%, Putnam County with 27.4%, Holmes County with 26.4%, Madison County with 25.3%, Gadsden County with 25.2%, Okeechobee County with 23.4%, DeSoto County with 23.2%, Hardee County with 22.6%, Suwannee County with 22.1%, and Gilchrist County with 21.8%

–Construct a formula that determines which counties would match from the two top 10 lists of highest non-citizen by ratio and highest US poverty by ratio.

SELECT Geography, NON_CITIZEN_Ratio,  USPOVERTY_RATIO

FROM FL_POV_IMM

WHERE NON_CITIZEN_Ratio > 9.6 AND USPOVERTY_RATIO > 21.8 ;
NON-Citizen Ratio

DeSoto County, 13.7

Hardee County, 13.4

 

US Poverty Ratio

DeSoto County, 23.2

Hardee County, 22.6

 

 

Quick Reference Sheet for these commands:

 
 
Major Commands:
 
SELECT
FROM
WHERE
GROUP BY
ORDER BY
 
 
SELECT: You tell the database what columns you want.
If you want everything, SELECT *
Example: SELECT * FROM Customers;
–Retrieves EVERYTHING (*) from the database called Customers
 
WHERE: A filter. Allows you to grab only certain records do numerical and text string, or a horizontal slice of your data; you set criteria that has to be true in order for rows to be returned
SELECT * FROM Customers WHERE ContactName “Maria Anders”;
–retrieves all records with Maria Anders as contact name
 
WILDCARD SEARCH
SELECT * FROM Customers WHERE ContactName LIKE “Maria%”;
–retrieves all records with Maria in ContactName. Note the addition of LIke and the wildcard %
 
GROUP BY – this groups rows together by a certain field or fields so that you can do math on those groups, such as counting the number of records in each group or summing up the numbers in a field.
SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country;
–Here we are counting the number of countries in the database
Remember this command is paired with putting an aggregate function in your SELECT statement, such as COUNT(*) or SUM(noncitizen)
ORDER BY – this sorts your results by a field or fields, ascending (default) or descending (DESC)
SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country;
–Produces a list ordered by country, alphabetized
DESC ASC
List of countries in descending order:
SELECT * FROM Customers
ORDER BY Country DESC;
 
Add a column
ALTER TABLE “main”.”FL_nativity” ADD COLUMN “Immigrant” NUMERIC
Fill it with stuff
UPDATE FL_nativity
SET immigrant = (naturalized + BornAbroad)

Links to SQL Tutorials

SQL Reference:

https://www.w3schools.com/sql/sql_quickref.asp

Tip Sheet for FL_Nativity:

MySQL_Tipsheet-rsw

Data types

https://www.w3schools.com/sql/sql_datatypes.asp

SQL Tutorial from Matt Waite:

https://github.com/tthibo/SQL-Tutorial