Categories ▸ Visualization
Data Visualization: A Practical Introduction will begin shipping next week. I’ve written an R package that contains datasets, functions, and a course packet to go along with the book. The socviz package contains about twenty-five datasets and a number of utility and convenience functions. The datasets range in size from things with just a few rows (used for purely illustrative purposes) to datasets with over 120,000 observations, for practicing with and exploring.
I taught my Data Visualization seminar in Philadelphia this past Friday and Saturday. It covers most of the content of my book, including a unit on making maps. The examples in the book are from the United States. But what about other places? Two of the participants were from Canada, and so here’s an example that walks through the process of grabbing a shapefile and converting it to a simple-features object for use in R.
As part of the run-up to the release of Data Visualization (out in about ten days! Currently 30% off on Amazon!), I’ve been playing with graphing different kinds of data. One great source of rich time-series data is mortality.org, which hosts a collection of standardized demographic data for a large number of countries. Mortality rates are often interesting to look at as a heatmap, as we get data for a series of ages (e.
Since the U.S. midterm elections I’ve been playing around with some Congressional Quarterly data about the composition of the House and Senate since 1945. Unfortunately I’m not allowed to share the data, but here are two or three things I had to do with it that you might find useful.
The data comes as a set of CSV files, one for each congressional session. You download the data by repeatedly querying CQ’s main database by year.
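A set of per-session CSV files like this usually has to be stacked into one table before anything else can happen. The sketch below is not the code from the post (which works in R, and whose data can't be shared); it is a minimal Python illustration using only the standard library. The directory layout, the function name `read_sessions`, and the added `source_file` column are all assumptions made for the example.

```python
import csv
from pathlib import Path


def read_sessions(data_dir):
    """Read every CSV file in data_dir and stack the rows into one list.

    Each row is a dict keyed by that file's header row. A 'source_file'
    key (a hypothetical name) records which file the row came from, so
    the session can still be recovered after the files are combined.
    """
    rows = []
    # Sort the paths so the combined rows come out in a stable order.
    for path in sorted(Path(data_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                row["source_file"] = path.name
                rows.append(row)
    return rows
```

Calling `read_sessions("data/")` on a folder of session files would return one list of dictionaries. Keeping the filename on every row is a cheap way to retain the session identifier when it appears only in the file name and not in the data itself.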
The American Sociological Association released some data on its special-interest sections, including some demographic breakdowns. Dan Hirschman wrote a post on Scatterplot looking at some of the breakdowns. Here are some more. I was interested in two things: first, the relative prevalence of Student and Retired members across sections, and second, the distribution of women across sections. About 53% of all ASA members are women, substantially higher than some other social sciences and many other academic disciplines.
Yesterday, Vox ran a story about changes in food consumption patterns in the United States over the past few decades. It featured this graph:
Vox Time Series
When I saw it, one of those little bells went off in my head:
As a rule, when you see a sharp change in a long-running time-series, you should always check to see if some aspect of the data-generating process changed—such as the measurement device or the criteria for inclusion in the dataset—before coming up with any substantive stories about what happened and why.
To close out what has become demography week, I combined the US monthly birth data with data for England and Wales (from the same ONS source as before), so that I could look at the trends together. The monthly England and Wales data I have to hand runs from 1938 to 1991. I thought combining the monthly tiled heatmap and the LOESS decomposition would work well as a poster, so I made one.
Amateur demography week continues around here. Today we are looking at the population of England and Wales since 1961, courtesy of some data from the UK Office for National Statistics. We have data on population counts by age (in nice, detailed, yearly increments) broken down by sex. We’re going to tidy the data, make a pyramid for a year, and then make an animated gif that shows the changing age distribution of the population over more than fifty years.
Yesterday I came across Aaron Penne’s collection of very nice data visualizations, one of which was of monthly births in the United States since 1933. He made a tiled heatmap of the data, taking care when calculating the average rate to correct for the varying number of days in different months. Aaron works in Python, so I took the opportunity to play around with the data and redo the plots in R.
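The month-length correction is simple arithmetic: divide each monthly total by the number of days in that particular month, so that February is not made to look like a slump just because it is short. The sketch below is neither Aaron Penne's code nor the R version from the post; it is a small Python illustration. The function name `births_per_day` and the monthly total are made up for the example.

```python
import calendar


def births_per_day(year, month, births):
    """Average daily births for one month.

    calendar.monthrange returns (weekday of the 1st, days in month),
    so leap-year Februaries are handled automatically.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    return births / days_in_month


# Hypothetical count, for illustration only: the same monthly total
# implies a different daily rate in months of different lengths.
monthly_total = 310_000
for year, month in [(1936, 1), (1936, 2), (1937, 2)]:
    rate = births_per_day(year, month, monthly_total)
    print(f"{year}-{month:02d}: {rate:,.0f} births per day")
```

Plotting the daily rate, not the raw monthly count, keeps the calendar's uneven month lengths from showing up in the heatmap as a spurious seasonal pattern.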
On Twitter the other day, Philip Cohen put up some data on changes in Bachelor’s degrees awarded between 1995 and 2015. The data come from the National Center for Education Statistics. It seemed like a good candidate for drawing as a figure, so I had a go at it:
Changes in the number of Bachelor’s degrees awarded over the past twenty years.
Afterwards, I was messing around with the data and wanted to draw some time-series plots for the various subject areas the NCES tracks.