Data Visualization for Social Science: A Practical Introduction with R and ggplot2
I’m writing a book on data visualization, provisionally titled Data Visualization for Social Science: A practical introduction with R and ggplot2. As part of that process, largely because I’ve benefited so much myself from the availability of open and widely shared tools for software development, I’m making the draft version of the book available as its own website.
Here are two small sites I made recently, which I may continue to tweak and expand. The first, plain-text.co, presents “The Plain Person’s Guide to Plain-Text Social Science”. It is designed to address some questions about managing research and writing projects in the social sciences using plain-text and free or mostly-free tools like Emacs (or other text editors), R, pandoc, and make. The second, vissoc.co, which I’ve mentioned before, compiles notes from a short course in data visualization I taught last semester.
The Gravitational Waves paper that was in the news yesterday has almost a thousand authors. (Actually there’s more than one paper—there’s the “discovery” paper and the “implications” paper.) Out of interest, I fed the list of authors in the “implications” paper into R and constructed an affiliation network with ties based on the university or research institute listed. Then I colored the nodes by the country of the primary institutional affiliation.
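For readers curious about the mechanics, here is a minimal sketch of one way to build that kind of affiliation network in base R. It is not the code used for the post. The author and institution names are made up, and treating two authors as tied whenever they list the same institution is just one plausible reading of "ties based on the university or research institute listed".

```r
# Toy data: author and institution names are hypothetical, for illustration only.
authors <- data.frame(
  author      = c("A", "B", "C", "D", "D"),
  institution = c("Uni X", "Uni X", "Uni Y", "Uni Y", "Uni X"),
  stringsAsFactors = FALSE
)

# Author-by-institution incidence matrix: 1 if the author lists that institution.
incidence <- (unclass(table(authors$author, authors$institution)) > 0) * 1

# One-mode projection: entry [i, j] counts the institutions authors i and j share.
co_affil <- incidence %*% t(incidence)
diag(co_affil) <- 0  # ignore each author's tie to themselves

# Edge list of tied pairs, taken from the upper triangle so each pair appears once.
ties <- which(co_affil > 0 & upper.tri(co_affil), arr.ind = TRUE)
edges <- data.frame(
  from = rownames(co_affil)[ties[, "row"]],
  to   = rownames(co_affil)[ties[, "col"]]
)
print(edges)
```

An edge list like this can be handed to a network package for layout and plotting. Coloring nodes by country would then only need a lookup from each author to the country of their primary institution.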
ASA Section Membership and Revenues.
I taught a half-sized introductory seminar on data visualization last semester. It’s an introduction to some principles of data visualization for working social scientists, and is focused mostly on teaching people how to use ggplot effectively. I’ve made the (slightly rough-and-ready) course notes available as a website. The notes include numerous code samples and .Rmd files for every week, and there’s a GitHub repository containing all the material to build the site, including the datasets used to make the plots.
A few days ago, Matt Yglesias shared this tweet from Liz Ann Sonders, Chief Investment Strategist with Charles Schwab, Inc:
“DailyShot: Here is a comparison of the monetary base with the S&P500 ... Coincidence? pic.twitter.com/QsdNhJdbRP”
Liz Ann Sonders (@LizAnnSonders), January 15, 2016
Matt remarked that “Friends don’t let friends use two y-axes”. It’s a good rule. The topic came up a couple of times during the data visualization short course I taught last semester.
I’m teaching a short graduate seminar on Data Visualization with R this semester. Following Matt Salganik, I wanted students to be able to submit homework or other assignments as R Markdown files, but to have a way to make sure their R code passed some basic stylistic checks provided by lintr before they submitted it to me. Students write .Rmd files containing discussion or notes interspersed with chunks of R code.
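As a rough illustration of the idea, and not the workflow described in the post itself, the sketch below writes a tiny R Markdown file to a temporary location and runs lintr's default checks on it. It assumes the lintr package is installed. The file contents are invented for the example.

```r
library(lintr)

# Build the chunk fence programmatically so this example stays easy to paste.
fence <- strrep("`", 3)

# Hypothetical homework file: a little prose plus one R chunk with a style problem
# (it assigns with `=` rather than `<-`).
rmd_path <- tempfile(fileext = ".Rmd")
writeLines(c(
  "Some notes about the plot.",
  "",
  paste0(fence, "{r}"),
  "x = c(1, 2, 3)",
  "mean(x)",
  fence
), rmd_path)

# lint() reads the file, extracts the R chunks, and applies the default linters.
lints <- lintr::lint(rmd_path)
print(lints)
```

A student could run the same `lint()` call on their own file before handing it in, and treat an empty result as a sign that the basic style checks pass.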
The United Kingdom’s election results are being digested by the chattering classes. So, yesterday afternoon I thought I’d see if I could grab the election data to make some pictures. Because the ever-civilized BBC has election web pages with a sane HTML structure, this proved a lot more straightforward than I feared. (Thanks also in no small part to statistician Hadley Wickham’s rvest scraping library, alongside many other tools he has contributed to the community of social scientists who use R to do data analysis.)
Over the past few months, I’ve had several people ask me about the tools I use to put papers together. I maintain a page of resources somewhat grandiosely headed “Writing and Presenting Social Science”. Really it just makes public some configuration files and templates for my text editor and related tools. Things have changed a little recently—which led to people asking the questions—so I will try to lay out the current setup here.
I have been asked by my superiors to give a brief demonstration of the surprising effectiveness of even the simplest techniques of the new-fangled Social Networke Analysis in the pursuit of those who would seek to undermine the liberty enjoyed by His Majesty’s subjects. This is in connection with the discussion of the role of “metadata” in certain recent events and the assurances of various respectable parties that the government was merely “sifting through this so-called metadata” and that the “information acquired does not include the content of any communications”.
The Emacs Social Science Starter Kit is a drop-in collection of packages and settings for Emacs 24 aimed at people like me: that is, people doing social science data analysis and writing, using some combination of tools like R, git, LaTeX, Pandoc, perhaps some other programming languages (e.g., Python or Perl), and plain-text formats like Markdown and Org-Mode. More information on the kit is available here. Some of its highlights are listed here.