This is a course in finding and telling visual stories from data. We will cover fundamental principles of data analysis and visual presentation, chart types and when to use them, and how to acquire, process and “interview” data. We will make static and interactive charts and maps using free software. There will be some coding, but no prior experience is required. The emphasis is on gaining practical skills that students can apply in a newsroom setting.
We will meet in 108/Lower NG on Tuesdays from 10.00am to 1.00pm. Your instructor, Peter Aldhous, will hold office hours in B1 from 2.00pm to 5.00pm following each class. You are welcome to arrange appointments to discuss your work.
Where marked, class time will be scheduled for one of you to critique and lead discussion of a recently published news graphic/interactive.
Categorical and continuous variables; basic operations for interviewing a dataset; sampling and margins of error; plotting and summarizing distributions; choosing bins for your data; basic newsroom math; correlation and its pitfalls; exploring differences between groups; scatter plots and box plots.
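As an illustration of the "sampling and margins of error" topic, here is a minimal Python sketch of the standard approximation for the margin of error on a proportion from a simple random sample. The function name, the 95% confidence multiplier of 1.96 and the poll of 1,000 people are assumptions chosen for this example; they are not taken from the class materials.

```python
import math

def margin_of_error(sample_size: int, proportion: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion from a simple random sample.

    z = 1.96 corresponds to a 95% confidence level (an assumption for this sketch).
    proportion = 0.5 is the worst case, giving the widest margin.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

# Hypothetical poll of 1,000 people
moe = margin_of_error(1000)
print(f"Margin of error: +/- {moe * 100:.1f} percentage points")
```

For a sample of 1,000 this works out to roughly plus or minus 3 percentage points, which is why a two-point lead in such a poll should not be reported as a clear lead.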
Encoding data using visual cues; choosing chart types to show comparisons, composition (parts of the whole) and connections; using color effectively; using chart furniture, minimizing chart junk and highlighting the story; avoiding pitfalls; good practice, including for interactive graphics.
We will use Tableau Public to explore and visualize data including traffic accidents in Berkeley, combining a map and charts into an interactive online dashboard.
Introduction to databases and Structured Query Language (SQL) for manipulating data prior to visualization. We will use SQLite and the SQLite Manager Firefox plugin to explore data including drug company payments to doctors. We will also make pivot tables in LibreOffice Calc.
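To give a flavor of the kind of query involved, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name, column names and payment figures are invented for illustration and do not come from the class dataset; in class the same style of query would be run through SQLite Manager.

```python
import sqlite3

# In-memory database with a small, made-up payments table (hypothetical data)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (doctor TEXT, company TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [
        ("Dr. A", "Company X", 1500.0),
        ("Dr. A", "Company Y", 500.0),
        ("Dr. B", "Company X", 250.0),
    ],
)

# Count payments and total the amounts for each doctor, largest total first
rows = conn.execute(
    """
    SELECT doctor, COUNT(*) AS n_payments, SUM(amount) AS total
    FROM payments
    GROUP BY doctor
    ORDER BY total DESC
    """
).fetchall()

for doctor, n_payments, total in rows:
    print(doctor, n_payments, total)

conn.close()
```

Grouping and summing in this way is the SQL counterpart of a spreadsheet pivot table.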
Data search and download tricks, including the Table2Clipboard and DownThemAll! Firefox plugins; manipulating URLs and using APIs to acquire data; scraping data from the web with Import.io; cleaning data with OpenRefine; converting data between different formats using Mr. Data Converter.
Introduction to R, RStudio and R packages including ggplot2 for visualization; the grammar of graphics; manipulating data with R. We will visualize data including measures of the wealth and wellbeing of nations, exporting charts in vector image formats.
Further work with R/RStudio, as necessary. We will then use Inkscape to edit, refine and annotate charts exported from R.
Basic mapping principles: projections, geocoding, geodata formats; approaches to putting data onto maps, including choropleth maps, scaled symbols, hexagonal binning and cartograms. We will also discuss the data made available for your final projects; students may add further data, or suggest entirely different datasets, with the instructor’s approval.
We will use QGIS to make a multi-layered map from data on seismic hazards and historical earthquakes. We will also learn how to use QGIS and its plugins to process geodata, including converting between formats, simplifying data, joining maps to external data and hexagonal binning of points.
There will be no class meeting this week. Instead, one-on-one meetings will be arranged with the instructor to discuss your final projects.
We will use TileMill, Leaflet and some simple JavaScript to create an interactive online version of the earthquake and seismic hazard map. This will include an API call so that the map automatically updates to include new quakes.
Basic principles of network visualization and analysis; we will then use Gephi to create network graphs illustrating voting patterns in the U.S. Senate.
We will use D3 to code from scratch an online interactive visualization of the gender pay gap across occupations. This will be a challenging exercise, intended as an introduction to the huge possibilities offered by a JavaScript code library that powers many of today’s most impressive online news visualizations.
Arvind will present his tool Lyra, a visualization design environment developed to unleash the expressivity of D3 without the need to code from scratch. He will lead you through visualization exercises including the emulation of published graphics from The New York Times.
We will continue work on the D3 visualization, as necessary, and conclude with a discussion of lessons learned, and next steps to continue to develop your data manipulation and visualization skills.
Alberto Cairo: The Functional Art: An Introduction to Information Graphics and Visualization
Nathan Yau: Data Points: Visualization That Means Something
Further reading/viewing will be recommended to support weekly class material.
Unexcused absence from two classes will drop you one letter grade; a third unexcused absence will result in an F. Excused absences will be permitted only in extraordinary circumstances. Regardless of the reason for an absence, students will be responsible for any assignments due and for learning material covered in class.
Class participation, weekly assignments: 45%
Final project: 45%
Attendance: 10%
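The weights above combine as a simple weighted sum. The sketch below assumes each component is scored on a 0 to 100 scale, which this syllabus does not specify; the dictionary keys, function name and example scores are hypothetical. It does not model the absence policy, which is applied separately.

```python
# Component weights as listed in the syllabus
WEIGHTS = {
    "participation_and_assignments": 0.45,
    "final_project": 0.45,
    "attendance": 0.10,
}

def overall_score(scores: dict) -> float:
    """Weighted sum of component scores, each assumed to be on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise KeyError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

# Hypothetical student: 90 for participation/assignments, 80 for the project, full attendance
example = {
    "participation_and_assignments": 90,
    "final_project": 80,
    "attendance": 100,
}
print(f"Overall score: {overall_score(example):.1f}")
```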
Students must turn off the ringers on their cell phones before class begins. Students may not check e-mail, social media sites or other websites during lecture portions of class or while working on class exercises.
The high academic standard at the University of California, Berkeley, is reflected in each degree that is awarded. As a result, it is up to every student to maintain this high standard by ensuring that all academic work reflects his/her own ideas or properly attributes the ideas to the original sources.
These are some basic expectations of students with regard to academic integrity:
Any work submitted should reflect your own individual thoughts, and should not have been submitted for credit in another course unless you have prior written permission from this instructor to re-use it in this course.
All assignments must use “proper attribution,” meaning that you have identified the original source of words or ideas that you reproduce or use in your assignment. This includes drafts and homework assignments!
If you are unclear about expectations, ask your instructor.
If you need disability-related accommodations in this class, if you have emergency medical information you wish to share with the instructor, or if you need special arrangements in case the building must be evacuated, please inform the instructor as soon as possible by seeing him after class or making an appointment to visit during office hours. If you are not currently listed with DSP (Disabled Students’ Program) but believe that you could benefit from their support, you may apply online.