I thought that the first point in the American Press Institute article was very important – you should not give too much credit to the data you have, nor should you rely on it too heavily. I’m definitely guilty of assuming that statistics I read in academia or journalism are perfect, and of taking them at face value. This tip is a good reminder to take data with a grain of salt, and to consider where it came from and how it was collected.

Tip #3 in the same article was also immensely helpful. I’ve heard of cleaning data before but never knew what that actually meant or how it worked. I had no idea that you can simply run your file through something like OpenRefine, which will clear up discrepancies between cells that are written differently but mean the same thing. Last semester I took a project-based class that involved analyzing massive spreadsheets of donations to Super PACs. Something like OpenRefine would have been great, because we were essentially told to fix discrepancies by hand or just disregard them.
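
To make the idea of "cells written differently for the same meaning" more concrete, here is a minimal Python sketch of one way such cleaning can work. It is loosely modeled on the "fingerprint" key-collision idea that OpenRefine documents for its clustering feature, but it is a simplified illustration and not OpenRefine's actual code. The function names and the donor names are made up for the example.

```python
import string
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a cell to a normalized key so near-identical spellings collide."""
    # Trim whitespace, lowercase, and strip punctuation.
    cleaned = value.strip().lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    # Sort the unique words so word order no longer matters.
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group cell values that share a fingerprint."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    # Keep only groups that contain more than one distinct spelling.
    return [sorted(set(g)) for g in groups.values() if len(set(g)) > 1]


# Hypothetical donor names, invented for illustration only.
donors = [
    "Smith, John",
    "John Smith",
    "john smith ",
    "Jane Doe",
    "ACME Corp.",
    "Acme Corp",
]

clusters = cluster(donors)
for group in clusters:
    print(group)
```

Each printed group is a set of spellings that probably refer to the same donor, so a person can review them and pick one spelling to keep instead of hunting for mismatches by hand. A sketch like this only catches differences in case, punctuation, spacing, and word order; misspellings would need a fuzzier comparison.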