I spent a couple days in DC last week so I could attend Visualized: Political Data, a one-day conference dedicated to exploring data visualizations in the context of political science, political campaigns, and political journalism.
I felt conflicted about flying all the way from SFO to DCA for a one-day event, but since political data viz is both where I came from (working with Gallup’s opinion poll data) and where I’d potentially like to go again, I figured it would be worth it.
I'm glad I went. I was a little disappointed that some of the slated speakers had cancelled (including Lisa Strausfeld, who I had been looking forward to seeing), but overall the sessions were engaging and insightful. I definitely picked up some good info – the highlights of which I'll share with you now.
Topic 1: Public Data / Open Government
Derek Willis (NYT), Ryan Sibley (Sunlight Foundation), and Rebecca Williams (Data.gov) all tackled topics related to working with government data.
Willis began with the example of collecting leave of absence records for congressional representatives. This is a metric that should be tracked and available but isn’t – either due to human error or bad record-keeping practices. Plus, even if we had accurate data on congressional leaves of absence, we still wouldn’t know why a representative was absent.
So while it’s nice to play with government datasets, you should really be looking beyond that – combining datasets from various places and scraping data from government apps to fill in the pieces.
Willis said the biggest opportunities for improvement are in:
- Developing efficient information markets
- Linking offline and online actions
- Improving speed and accuracy in public data reporting.
Check out some of Willis’ projects: @unitedstates and Open Elections.
Sibley went over a few of the datasets that Sunlight has opened:
Political influence:
- FEC
- State campaign finance
- Lobbying data
- Revolving door data (also see OpenSecrets.org)
- Advisory Committee on Transparency
Criminal justice:
- Criminal justice data collection
- Bureau of Justice Statistics
- U.S. Sentencing Commission
- Local police departments
Sibley also discussed some of the issues that make it hard to achieve Sunlight’s goals regarding open data.
- Methodologies can keep us from understanding the whole story.
  - Example: When requesting data on the cause of death by law enforcement, the CDC suppressed smaller numbers to protect individual identities.
- Old technology stands in the way of improving reporting.
  - Example: Senators don’t disclose their campaign finance reports digitally at this time.
- In the criminal justice system, there’s no way to track individual cases over time because the police, courts, corrections, and re-entry organizations don’t share data with each other.
- In politics, there’s no way to track all the different players.
  - Example: There is no unique ID for donations by organization, so if a company’s name changes, it’s difficult to match it with its previous name. When unique IDs are missing, it’s hard to identify which organizations are affiliated.
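To make the missing-ID problem concrete, here is a rough sketch of the kind of name cleanup analysts fall back on when donor records have no unique identifier. This is my own illustration, not something Sibley showed: the suffix list, the function names, and the sample records are all made up.

```python
import re
from collections import defaultdict

# Hypothetical list of legal suffixes to ignore when comparing names.
SUFFIXES = {"inc", "incorporated", "corp", "corporation", "co", "company", "llc", "ltd"}


def normalize_name(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop common legal suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(w for w in words if w not in SUFFIXES)


def group_donations(records):
    """Sum (donor_name, amount) pairs under each normalized donor name."""
    totals = defaultdict(float)
    for donor, amount in records:
        totals[normalize_name(donor)] += amount
    return dict(totals)


# Invented sample records: three spellings of one donor, plus another donor.
records = [
    ("Acme Corp.", 500.0),
    ("ACME Corporation", 250.0),
    ("Acme, Inc.", 100.0),
    ("Globex LLC", 75.0),
]
totals = group_donations(records)
print(totals)
```

Cleanup like this only catches spelling variants. If a company actually changes its name, nothing in the text links the old name to the new one, which is exactly why a shared unique ID matters.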
To help resolve some of these issues, Sibley implored everyone to use the data. Demonstrated interest will lead to the release of more and better data.
Williams had similar things to say regarding government data collection – sometimes government data simply isn’t properly collected, structured, public, or reusable. This is where coalitions and partnerships between the government and the private sector (like freelawfounders.org) become valuable.
The biggest advantage of government data is that you can vote on it. The more you can point to people losing money, the more actionable the response will be.
Topic 2: Maps
Alicia Parlapiano (New York Times) and Alyson Hurt (NPR) both spoke about maps – mostly U.S. state maps.
Parlapiano laid out four goals for representing state data:
- Accessibility
- Context
- Fairness
- Revelation
The “fairness problem” relates to how the physical size of different states can misrepresent the data by masking huge differences in population. Take national elections for example: Montana and Delaware have the same number of electoral votes (3), but when you look at a map, Montana’s size dwarfs Delaware’s, which – at a glance – leaves the impression that Montana’s votes count more.
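A quick back-of-the-envelope calculation shows how lopsided this is. The sketch below is my own, not from the talk; the land-plus-water areas are approximate figures in square miles, and the variable and function names are just for illustration.

```python
# Approximate total area (square miles) and electoral votes for each state.
# The area figures are rounded and included only for illustration.
states = {
    "Montana": {"area_sq_mi": 147_040, "electoral_votes": 3},
    "Delaware": {"area_sq_mi": 2_489, "electoral_votes": 3},
}


def area_per_vote(state: str) -> float:
    """Map area a state occupies for each of its electoral votes."""
    s = states[state]
    return s["area_sq_mi"] / s["electoral_votes"]


# How much more map area Montana gets per electoral vote than Delaware.
ratio = area_per_vote("Montana") / area_per_vote("Delaware")
print(f"Montana gets about {ratio:.0f}x the map area per electoral vote")
```

With these figures, each of Montana's electoral votes takes up roughly 59 times as much map area as each of Delaware's, even though the votes count the same.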
Cartograms can help solve the fairness problem for electoral votes.
Grid maps – made of squares or hexagons – are also good options when you want to show state totals without giving any one state more visual weight than another.
Hurt explained that the challenge with grid maps is geographic fidelity. Consider offering context by placing a geographically accurate image next to the cartogram.
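To show the basic idea behind a grid map, here is a minimal sketch that gives every state one equal-sized cell. It is my own toy example, not code from either speaker: the row and column positions for these few northeastern states are invented, and real tile maps use carefully hand-tuned layouts.

```python
# Hypothetical (row, column) positions for a handful of northeastern states.
# These coordinates are illustrative only, not a published layout.
GRID = {
    "VT": (0, 0), "NH": (0, 1), "ME": (0, 2),
    "NY": (1, 0), "MA": (1, 1),
    "CT": (2, 0), "RI": (2, 1),
}


def render_grid(positions, cell_width=4):
    """Render states as fixed-width text cells so each gets equal visual weight."""
    rows = max(r for r, _ in positions.values()) + 1
    cols = max(c for _, c in positions.values()) + 1
    cells = [["" for _ in range(cols)] for _ in range(rows)]
    for state, (r, c) in positions.items():
        cells[r][c] = state
    # Pad every cell to the same width, then trim trailing spaces on each row.
    return "\n".join(
        "".join(cell.ljust(cell_width) for cell in row).rstrip()
        for row in cells
    )


print(render_grid(GRID))
```

The tradeoff Hurt described shows up right away: every state is the same size, but the positions are only a loose approximation of where the states really sit.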
Parlapiano and Hurt also emphasized that maps aren’t the only way to present state data – and they frequently aren’t even the best way. Jonathan Schwabish also touched on this point. He suggested asking four questions before choosing to make a map:
- Should it be a bar chart?
- Should it be a scatterplot?
- Should it be a table?
- Should it simply be a sentence?
Topic 3: Elections
Andy Cotgreave (Tableau) and Sarah Newhall (Blue State Digital) spoke about using data visualization to engage people with political data.
Cotgreave specifically analyzed how the U.K. media used Twitter in the recent elections.
Cotgreave's discoveries:
- Sentiment tracking is dangerous – This is especially true with Twitter because Twitter demographics are not representative of the real world. However, Twitter stats are very shareable, and people take it for granted that the data is accurate.
- Polls vs. actual results – For the U.K. election, none of the polls were correct. Even though the pollsters got it wrong, the poll results still drove party strategy, the media, and the agenda. Pollsters are sometimes too shocked by outliers and will massage the data too much. In the most recent elections, the outliers weren’t really outliers at all.
- Answer a new question every day – Play with the data and post it on a day-to-day basis because that’s what keeps people engaged. If you build out all these fancy apps for visualizing – but the data is boring – people won’t stay interested. For example, if you’re visualizing trends over time but the trend never changes, no one will care.
Newhall came at it from the marketing campaign side of things. She emphasized that data does a handful of things when used to influence people:
- It proves what’s at stake.
- It motivates people to act.
- It turns defense into offense.
- It shapes the narrative and provides clear explanations.
- It informs and inspires.
Topic 4: Data for Humans
Jamie Chandler (George Washington University), Jonathan Schwabish (Urban Institute), and Ben Casselman (FiveThirtyEight) spoke about communicating data to general audiences.
From Chandler:
- Today we have an opportunity to take data and communicate it to people in a simple, elegant way – and cut through all the talk that they get from the political world.
- Data has been a big part of elections over the last couple cycles. We have all this data that’s being used to drive the vote, but we can also use that data in our messaging. We can communicate very complex things to the public in very simple ways. Simple data points that will shift our perception of previously held beliefs.
- We’re missing the bridge between data science and communications. Politicians, CEOs, and other leaders rely on their gut feel 67% of the time. They’ll disregard what the data scientist recommends. Communications professionals have not been exposed to this level of sophistication [data analysis].
- In the campaign world, we have to start thinking about how we connect the people who talk to the public and the people who do the analysis. Vocativ is an online news agency that partners a data scientist with a journalist to mine the web for news.
- If we’re going to bring data into the conversation, we have to remember who the average person is. In general, only 66% are able to remember visuals. Statistical literacy is generally very low. People have biases when it comes to politics.
From Schwabish:
- Think carefully and critically about your graphics – because people will learn something from them.
- Mix Modalities
  - Example: Equal population mapper from Slate
  - Example: “The site now stores 90 million gallons of radioactive water, more than enough to fill Yankee Stadium.” - WashPo
- Big enough, but not too big
  - Superimposing Jupiter on Earth's horizon
  - Steve Jobs presenting the MacBook Air in an envelope
- Have a Soul
  - Identifiable Victim Effect. We’re more likely to provide aid to an actual person.
  - The problem with statistics is that they don’t activate our moral emotions.
- Communicating requires connecting.
From Casselman:
- How do we:
  - Reach readers who aren't data junkies? (Without disappointing the nerds)
  - Communicate methodology? (Without convoluting the story)
  - Reflect uncertainty? (Without sowing confusion)
- Readers should be able to understand:
  - Where the data comes from
  - Limitations of the data
  - The assumptions and decisions we've made while working with the data
  - How our analysis supports our conclusions
  - How confident we are in these conclusions
- Sometimes the uncertainty is the story itself.
More coverage of Visualized: Political Data: Twitter, Azavea report