Tue, Oct 15, 2019

Parsing SDA Pages

SDA is a suite of software developed at Berkeley for the web-based analysis of survey data. The Berkeley SDA archive (http://sda.berkeley.edu) lets you run various kinds of analyses on a number of public datasets, such as the General Social Survey. It also provides consistently formatted HTML versions of the codebooks for the surveys it hosts. This is very convenient! For the gssr package, I wanted to include material from the codebooks as tibbles or data frames that would be accessible inside an R session.

Thu, Oct 10, 2019

Back in the GSSR

The General Social Survey, or GSS, is one of the cornerstones of American social science and one of the most-analyzed datasets in Sociology. It is routinely used in research, in teaching, and as a reference point in discussions about changes in American society since the early 1970s. It is also a model of open, public data. The National Opinion Research Center already provides many excellent tools for working with the data, and has long made it freely available to researchers.

Mon, Aug 26, 2019

This Is Just the Verse

They pluck your plums, your mum and dad
They eat them for their supper, too
They gobble all the fruit you had
And leave some bullshit note for you

But they were robbed blind in their day
Of damsons, prunes, and blackthorn sloes
Their breakfast treats all poached away
Thefts justified with old-style prose

“Forgive us” both your parents moan
“They were delicious, sweet, and cold”
They wonder why I never phone

Sat, Aug 3, 2019

Rituals of Childhood

Back in April, in Ireland, my nephew Luke made his first communion alongside his school classmates. I did much the same thing myself in much the same place about forty years ago. My brother tells me that the preparation nowadays is a little more humane than the version we enjoyed. But there is as much anticipation beforehand, and no less excitement on the day. Luke’s little suit lacked the stylish navy-blue velvet panels mine sported in 1980, but in essence the event was the same in its purpose, its form, and in most of its details.

Sun, Jun 23, 2019

Earned Doctorates

PhDs awarded in selected disciplines, 2006-2016.

Thierry Rossier asked me for the code to produce plots like the one above. The data come from the Survey of Earned Doctorates, a very useful resource for tracking trends in PhDs awarded in the United States. The plot is made with geom_line() and geom_label_repel(). The trick, if it can be dignified with that term, is to use geom_label_repel() on a subset of the data that contains the last year of observations only.
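
The labeling trick can be sketched roughly as follows. This is a minimal illustration, not the code from the post: the data frame `phds`, its column names, the numbers in it, and the helper `plot_phds()` are all made up for the example, and the real Survey of Earned Doctorates data will look different.

```r
# Toy data standing in for the real series (all values are invented).
phds <- data.frame(
  discipline = rep(c("Sociology", "Economics"), each = 3),
  year       = rep(2014:2016, times = 2),
  n          = c(650, 640, 630, 1100, 1150, 1200)
)

# Keep only the rows from the final year; these are the ones to label.
last_year <- subset(phds, year == max(year))

# Hypothetical helper: draw a line for every series across all years,
# but pass only the last-year subset to the label layer.
# Calling it requires the ggplot2 and ggrepel packages.
plot_phds <- function(data, labels) {
  ggplot2::ggplot(data, ggplot2::aes(x = year, y = n, color = discipline)) +
    ggplot2::geom_line() +
    ggrepel::geom_label_repel(
      data = labels,
      mapping = ggplot2::aes(label = discipline)
    ) +
    ggplot2::theme(legend.position = "none")
}

# Usage, once both packages are installed:
# plot_phds(phds, last_year)
```

Because the label layer sees only one row per discipline, each line gets a single label at its right-hand end, which is why the legend can be dropped.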

Mon, May 13, 2019

Baby Name Animation

I was playing around with the gganimate package this morning and thought I’d make a little animation showing a favorite finding about the distribution of baby names in the United States. This is the fact—I think first noticed by Laura Wattenberg, of the Baby Name Voyager—that there has been a sharp, relatively recent rise in boys’ names ending in the letter ‘n’, at the expense of names with ‘e’, ‘l’, and ‘y’ endings.

Fri, Mar 22, 2019

A Quick and Tidy Look at the 2018 GSS

The data from the 2018 wave of the General Social Survey was released during the week, leading to a flurry of graphs showing various trends. The GSS is one of the most important sources of information on various aspects of U.S. society. One of the best things about it is that the data is freely available for more than forty years’ worth of surveys. Here I’ll walk through my own quick look at the data, in order to show how R can tidily manage data from a complex survey.

Mon, Mar 18, 2019

Frank Oz Muppets and the Big Five Personality Traits

In case you are searching for a unified account of Frank Oz Muppets in terms of the Big Five Personality Traits (and, to be clear, someone on the internet was earlier today), I’m providing it here for posterity. This version includes the “Henson Area”, which is optional but both clarifies the strictly psychological aspects and provides a bridge to a fully social theory of Frank Oz Muppets.

A unified account of Frank Oz Muppets in terms of the Big Five Personality Traits, and vice versa.

Tue, Mar 12, 2019

Installing Socviz

I’ve gotten a couple of reports from people having trouble installing the development version of the socviz library that’s meant to be used with Data Visualization: A Practical Introduction. As best as I can tell, the difficulties are being caused by GitHub’s rate limits. The symptom is that, after installing the tidyverse and devtools libraries, you try install_github("kjhealy/socviz") and get an error something like this: Error in utils::download.file(url, path, method = download_method(), quiet = quiet(): cannot open URL https://api.

Tue, Mar 12, 2019

The Persistence of the Old Regime, Again

A few years ago I wrote a post about the stickiness of college and university rankings in the United States. It’s been doing the rounds again, so I thought I’d revisit it and redraw a few of the graphs I made then. In 1911, Kendric Babcock made an effort to rank US Universities and Colleges. In his report, Babcock divided schools into four Classes, beginning with Class I: The better sort of school.

Sociology and other distractions, since 2002. R-related posts also appear on R-Bloggers.


I am Professor of Sociology at Duke University. I’m affiliated with the Kenan Institute for Ethics, the Markets and Management Studies program, and the Duke Network Analysis Center.


