Independent Analysis

“If we cannot do our own independent analysis, we are at the mercy of the PR machine, which has this firepower behind them.

There is this absolutely essential need to be able to do independent quantitative analysis based on raw data…So I think that is really the key in journalism.

It is not only the flashing new things that you can do fancy graphics. It is not only numbers. It is about responding to the way information is now stored, controlled, processed and allowing journalism to fulfill its mission, independently accessing those things, and without data journalism that is going to be impossible.”

–Martin Stabe, head of interactive news at the Financial Times

source:
https://journals.uio.no/index.php/TJMI/article/viewFile/882/1160

Transparency

Reliability: How sure are we that we got the right answer? That we’ve done everything correctly?
Replicability: If we had to do it all again, would we get the same answer? If someone else did it, would they?
Transparency: If our results are challenged, can we show exactly what we’ve done to defend it?

–Matt Waite

Data Diary
Record the actions you took, the commands you ran, and the thinking behind what you are doing.

Data Biography Template

Interviewing Data

Sample Data Diary Entry

FBI Crime Data for Arkansas

Uniform Crime Reporting, Read General Resources

https://ucr.fbi.gov/

Gather Data For All Arkansas Cities, All Crimes

https://www.ucrdatatool.gov/Search/Crime/Local/OneYearofData.cfm

Using the Data:
1. Four Corners Test
2. Sorting and Filtering:
–Ascending
–Descending
Question #1: Using the four corners test, describe:
–The number of cities
–How the crime rate is organized
Question #2: Sorting. Where does Fayetteville rank statewide for total property crime?
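
The four corners test and the sorting step above can also be sketched outside a spreadsheet. The example below is a minimal Python illustration, not the class method. It assumes the four corners test means checking the size and the edges of the table. The place names and counts are made up for illustration and are not FBI figures, and `rank_of` is a hypothetical helper.

```python
# Hypothetical rows: made-up place names and counts, NOT FBI figures.
rows = [
    {"city": "Town A", "property_crime": 120},
    {"city": "Town B", "property_crime": 450},
    {"city": "Town C", "property_crime": 75},
    {"city": "Town D", "property_crime": 300},
]

# Four corners test (as assumed here): look at the size and edges of the table.
print("Rows:", len(rows))
print("Columns:", list(rows[0].keys()))
print("First row:", rows[0])
print("Last row:", rows[-1])

# Sort descending so the largest total comes first.
ranked = sorted(rows, key=lambda r: r["property_crime"], reverse=True)


def rank_of(city, ranked_rows):
    """Return the 1-based position of a city in an already sorted list."""
    for position, row in enumerate(ranked_rows, start=1):
        if row["city"] == city:
            return position
    raise ValueError(f"{city} not found")


print("Town D ranks", rank_of("Town D", ranked), "of", len(ranked))
```

Sorting ascending instead only requires dropping `reverse=True`.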

 

BUILDING A CALCULATION: PERCENTAGE CHANGE

1. New number – Old number. This is the amount, or raw, change. Subtract the old number from the new: = New – Old.

2. Percentage change: (N – O)/O, or (New – Old) divided by Old. Divide the amount change by the old number: = Change/Old.
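
The two steps above can be written as a short function. This is a minimal Python sketch. The name `percent_change` and the sample numbers are assumptions for illustration. The result is multiplied by 100 so it reads as a percentage rather than a fraction, which a spreadsheet would normally handle through cell formatting.

```python
def percent_change(old, new):
    """Return the change from old to new as a percentage of old."""
    if old == 0:
        # Dividing by zero is undefined, so there is no percentage change.
        raise ValueError("Percentage change is undefined when the old number is 0")
    change = new - old         # Step 1: amount (raw) change
    return change / old * 100  # Step 2: divide by the old number, as a percent


# Hypothetical counts, for illustration only.
print(percent_change(200, 250))  # an increase
print(percent_change(200, 150))  # a decrease
```

A negative result means the number went down. The AP guidance quoted later in this lesson warns against percent change comparisons from a small base, since a move from 1 to 3 is a 200 percent increase.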

Calculations

Calculate a violent crime rate per 1,000 people

–Insert Column

–Crime Rate Formula: =(Violent Crime/Population)*1000
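
The same rate formula can be sketched in Python. The places, counts and populations below are made up for illustration and are not taken from the FBI data. The function name `rate_per_1000` is an assumption.

```python
def rate_per_1000(count, population):
    """Return incidents per 1,000 residents: (count / population) * 1000."""
    if population <= 0:
        raise ValueError("Population must be greater than 0")
    return count / population * 1000


# Hypothetical (place, violent crimes, population) rows, NOT real figures.
places = [
    ("Town A", 30, 12000),
    ("Town B", 45, 9000),
    ("Town C", 5, 800),
]

# Compute a rate for each place, rounded to two decimals for display.
rates = {name: round(rate_per_1000(crimes, pop), 2) for name, crimes, pop in places}

# Find the place with the highest rate.
highest = max(rates, key=rates.get)
print(rates)
print("Highest rate:", highest, rates[highest])
```

In this made-up example the smallest town has the highest rate even though it has the fewest crimes, which is one reason to report raw numbers alongside rates.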

Answer these questions

–The place with the highest violent crime rate in Arkansas in 2014 is xxx with a violent crime rate of xxx per 1,000

 

 

Homework Assignment: 50 Points

Part 1: Reading and Questions

Identify two things in the articles below that you found interesting or important to your work as a journalist or a researcher. Post your comments with your homework by 11:59 p.m. Sunday, March 12.

Reading a Data Dictionary

How to Avoid 10 Common Mistakes In Data Reporting

AP Stylebook on Data Journalism

Definition from the AP Stylebook, 2016

Editor’s Note: The 2016 AP Stylebook launches today. Stylebook editors announced in April the plan to lowercase internet and web when the 2016 Stylebook came out and those changes take effect today. Subscribers get access to about two dozen new food entries plus two new fashion entries today, as well as the revised internet and web entries.

data journalism
Data sources used in stories should be vetted for integrity and validity. When evaluating a data set, consider the following questions:
–What is the original source for the data? How reliable is it? Can we get answers to questions about it?
–Is this the most current version of the data set? How often is the data updated? How many years of data have been collected?
–Why was the data collected? Was it for purposes of advocacy? Might that affect the data’s reliability or completeness? Does the data make intuitive sense? Are there anomalies (outliers, blank values, different types of data in the same field) that would invalidate the analysis?
–What rules and regulations affect the gathering (and interpretation) of the data?
–Is there an alternative source for comparison? Does the data for a parallel industry, organization or region look similar? If not, what could explain the discrepancy?
–Is there a data dictionary or record layout document for the data set? This document would describe the fields, the types of data they contain and details such as the meaning of codes in the data and how missing data is indicated. If the data collectors used a data entry form, is the form available to review? For example, if the data entry was performed by inspectors, is it possible to see the form they used to collect the data and any directions they received about how to enter the data?
Data and the results of analysis must be represented accurately in stories and visualizations. Any limitations of the data must also be conveyed. If one point in the analysis is drawn from a subset of the data or a different data set altogether, explain why this was done.
Use statistics that include a meaningful base for comparison (per capita, per dollar). Data should reflect the appropriate population for the topic: for example, use voting-age population as a base for stories on demographic voting patterns. Avoid percentage and percent change comparisons from a small base. Rankings should include raw numbers to provide a sense of relative importance.
When comparing dollar amounts across time, be sure to adjust for inflation. When using averages (that is, adding together a group of numbers and dividing the sum by the quantity of numbers in the group), be wary of extreme, outlier values that may unfairly skew the result. It may be better to use the median (the middle number among all the numbers being considered) if there is a large difference between the average (mean) and the median.
Correlations should not be treated as a causal relationship. Where possible, control for outside factors that may be affecting both variables in the correlation. Use round numbers where possible, particularly to avoid a false appearance of precision. Be clear about limitations of sample size in reporting on data sets. See the polls and surveys section for more specific guidance on margin of error.
Try not to include too many numbers in a single sentence or paragraph.
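
The Stylebook's point about averages and outliers can be illustrated with a short calculation. This is a minimal Python sketch using the standard `statistics` module. The values are made up for illustration and do not come from any data set.

```python
from statistics import mean, median

# Hypothetical values with one extreme outlier (400).
values = [40, 42, 45, 47, 400]

average = mean(values)   # sum of the values divided by how many there are
middle = median(values)  # the middle number when the values are sorted

print("Mean:", average)
print("Median:", middle)

# A large gap between the two suggests outliers are skewing the mean.
print("Gap:", round(average - middle, 1))
```

Here the single outlier pulls the mean far above the median, so the median is the better description of a typical value.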

Data Checklist

(Daniel Lathrop, Dallas Morning News)

— Review methodology with one or more other data people
— Check results against other available comparable data
— Ensure all record counts are consistent across stages
— Check averages

— Examine outputs to ensure logical consistency (do things that should add up to 100% add up to 100%?)
— Recheck all coding line by line if possible or in aggregate if not

— Re-read all programs/scripts
— Re-run entire analysis from scratch
— Check each number against analysis or source material prior to publication
— Recheck each number against analysis or source material on each draft
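
Two items on the checklist above, consistent record counts and totals that should add up to 100%, lend themselves to automated checks. The sketch below is one possible Python approach, not part of the original checklist. The function names, stage names and numbers are assumptions, and it assumes no records should be dropped between stages.

```python
import math


def check_record_counts(stage_counts):
    """Raise ValueError if the record count differs between stages."""
    if len(set(stage_counts.values())) != 1:
        raise ValueError(f"Record counts differ across stages: {stage_counts}")


def check_shares_total(shares, expected=100.0, tolerance=0.1):
    """Raise ValueError if the shares do not add up to the expected total."""
    total = sum(shares)
    if not math.isclose(total, expected, abs_tol=tolerance):
        raise ValueError(f"Shares add up to {total}, expected {expected}")
    return total


# Hypothetical counts and percentages, for illustration only.
check_record_counts({"raw": 4, "cleaned": 4, "analyzed": 4})
print("Record counts match")
print("Shares total:", round(check_shares_total([33.3, 33.3, 33.4]), 1))
```

The small tolerance allows for rounding, since percentages rounded to one decimal place will not always sum to exactly 100.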

Data Story on Arkansas Crime

Watch Video on Excel Best Practices 

Use This Dataset:
2010-2014-March7 -class

Answer the following questions with the 2014 data.

–Determine which locality has the highest violent crime rate in Arkansas for 2014. Determine which one has the highest property crime rate for Arkansas in 2014.

–Fayetteville’s property crime rate is XXX, placing it #XXX statewide.

–Fayetteville’s violent crime rate is XXX, placing it #XXX statewide.

Now, analyze the 2010 data, and answer the same questions.

–Determine which locality has the highest violent crime rate in Arkansas for 2010. Determine which one has the highest property crime rate for Arkansas in 2010.

–Fayetteville’s property crime rate is XXX, placing it #XXX statewide.

–Fayetteville’s violent crime rate is XXX, placing it #XXX statewide.

Write a story.

Assume this information was released by the Federal Bureau of Investigation on Tuesday, March 7 at a press conference in Fayetteville.

Write a 250-300 word story based on your findings. Some questions you might want to consider: How does Fayetteville compare to the rest of the state? What is the trend between 2010 and 2014? Which localities had the greatest increase? The biggest decrease?

Post the reading questions, data answers and crime story all in a single Word document on Blackboard here.