Thu, Feb 7, 2019


I am stuck at home sick today, so I decided to provide a relational analysis of the Stats Package Wars that have been bubbling away for the past week. True in all its details. If you want something slightly more constructive, consider Data Visualization: A Practical Introduction, or The Plain Person’s Guide to Plain-Text Social Science.

Wed, Jan 2, 2019

Dataviz Course Packet Quickstart

Chapter 2 of Data Visualization walks you through setting up an R Project, and takes advantage of RStudio’s support for RMarkdown templates. That is, once you’ve created your project in RStudio, you can choose File > New File > R Markdown, then choose “From Template” on the left side of the dialog box that pops up, and select the “Data Visualization Notes” option on the right.

Thu, Dec 27, 2018

French Mortality Poster

Based on the heatmaps I drew earlier this month, I made a poster of two centuries of data on mortality rates in France for males and females. It turned out reasonably well, I think. I will probably get it blown up to a nice large size and put it up on the wall. I’ve had very good results with PhD Posters for work like this over the years, by the way.

Wed, Dec 12, 2018

Teaching and Learning Materials for Data Visualization

Data Visualization: A Practical Introduction will begin shipping next week. I’ve written an R package that contains datasets, functions, and a course packet to go along with the book. The socviz package contains about twenty-five datasets and a number of utility and convenience functions. The datasets range in size from things with just a few rows (used for purely illustrative purposes) to datasets with over 120,000 observations, for practicing with and exploring.

Sun, Dec 9, 2018

Canada Map

I taught my Data Visualization seminar in Philadelphia this past Friday and Saturday. It covers most of the content of my book, including a unit on making maps. The examples in the book are from the United States. But what about other places? Two of the participants were from Canada, and so here’s an example that walks through the process of grabbing a shapefile and converting it to a simple-features object for use in R.

Tue, Dec 4, 2018

Heatmaps of Mortality Rates

As part of the run-up to the release of Data Visualization (out in about ten days! Currently 30% off on Amazon!), I’ve been playing with graphing different kinds of data. One great source of rich time-series data is, which hosts a collection of standardized demographic data for a large number of countries. Mortality rates are often interesting to look at as a heatmap, as we get data for a series of ages (e.

Mon, Nov 19, 2018

Zero Counts in dplyr

Here’s a feature of dplyr that occasionally bites me (most recently while making these graphs). It’s about to change mostly for the better, but is also likely to bite me again in the future. If you want to follow along, there’s a GitHub repo with the necessary code and data. Say we have a data frame or tibble and we want to get a frequency table or set of counts out of it.
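
As a minimal sketch of the kind of situation described here (this is not the code from the post’s repo): `count()` reports only the combinations that actually occur in the data, so empty groups silently vanish. The tibble and its column names below are made up for illustration, and `tidyr::complete()` is just one way to put the zero rows back.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: there is no row with kind "b" and level "low"
df <- tibble(
  kind  = c("a", "a", "b"),
  level = c("low", "high", "high")
)

# count() returns only the observed combinations, so b/low is missing
observed <- count(df, kind, level)

# complete() adds the missing combinations and fills their counts with 0
filled <- complete(observed, kind, level, fill = list(n = 0L))
```

Comparing `observed` and `filled` shows the extra row for the combination that never occurs in the data.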

Sat, Nov 17, 2018

Congress Over Time

Since the U.S. midterm elections I’ve been playing around with some Congressional Quarterly data about the composition of the House and Senate since 1945. Unfortunately I’m not allowed to share the data, but here are two or three things I had to do with it that you might find useful. The data comes as a set of CSV files, one for each congressional session. You download the data by repeatedly querying CQ’s main database by year.

Tue, Nov 6, 2018

Spreading Multiple Values

Earlier this year my colleague Steve Vaisey was converting code in some course notes from Stata to R. He asked me a question about tidily converting from long to wide format when you have multiple value columns. This is a little more awkward than it should be, and I’ve run into the issue several times since then. I’m writing down the answer (or, an answer) here so that I can find it again myself.
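
One way to sketch the problem, assuming a made-up long-format table with two value columns. This uses `pivot_wider()` from tidyr 1.0.0, which was released after this post was written, so it is not necessarily the answer the post itself gives.

```r
library(tidyr)
library(tibble)

# Hypothetical long data: one row per id and year, with two value columns
long <- tribble(
  ~id, ~year, ~income, ~hours,
  1,   2017,  100,     40,
  1,   2018,  110,     38,
  2,   2017,  90,      35,
  2,   2018,  95,      36
)

# Spread both value columns at once; each new column is named <value>_<year>
wide <- pivot_wider(long, names_from = year, values_from = c(income, hours))
```

The result has one row per `id` and a separate column for each value and year pair, such as `income_2017` and `hours_2018`.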

Wed, Sep 12, 2018

ASA Section Demographics

The American Sociological Association released some data on its special-interest sections, including some demographic breakdowns. Dan Hirschman wrote a post on Scatterplot looking at some of the breakdowns. Here are some more. I was interested in two things: first, the relative prevalence of Student and Retired members across sections, and second, the distribution of women across sections. About 53% of all ASA members are women, a substantially higher share than in some other social sciences and many other academic disciplines.

Sociology and other distractions, since 2002. View an index of posts by category. R-related posts also appear on R-Bloggers.


