The United Kingdom’s election results are being digested by the chattering classes. So, yesterday afternoon I thought I’d see if I could grab the election data to make some pictures. Because the ever-civilized BBC has election web pages with a sane HTML structure, this proved a lot more straightforward than I feared. (Thanks also in no small part to statistician Hadley Wickham’s rvest scraping library, alongside many other tools he has contributed to the community of social scientists who use R to do data analysis.)
A side-note to the enjoyable exchange with Dr Drang about sales trends in Apple products, which was picked up by John Gruber. The LOESS decompositions I posted looked like this:
Quarterly sales decomposition for iPhones.
One or two people remarked that these figures were shorter and wider than they were used to seeing. I did this on purpose. Following the approach taken by William Cleveland and others, the charts are banked, meaning the aspect ratio is set to make it easier to pick out trends.
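To make the idea of banking concrete, here is a minimal sketch of one simple rule, the median-absolute-slope criterion: pick the height-to-width ratio at which the median line segment sits at roughly 45 degrees on the page. This is an illustration, not necessarily the rule used for the plots in the post. The function name `bank_to_45` and the toy series are made up for the example.

```python
from statistics import median


def bank_to_45(xs, ys):
    """Return a height/width aspect ratio for a line plot of (xs, ys).

    The ratio is chosen so that the median absolute slope of the line
    segments, measured on the page rather than in data units, equals 1.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")

    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)

    # Absolute slope of each consecutive segment, in data units.
    slopes = [
        abs((y1 - y0) / (x1 - x0))
        for x0, y0, x1, y1 in zip(xs, ys, xs[1:], ys[1:])
        if x1 != x0
    ]
    if not slopes or x_range == 0:
        raise ValueError("x values must not all be identical")

    med = median(slopes)
    if med == 0:
        raise ValueError("median slope is zero; banking is undefined")

    # On-page slope = data slope * (x_range / y_range) * (height / width).
    # Setting the median on-page slope to 1 and solving for height / width:
    return y_range / (x_range * med)


# Hypothetical toy series, not the Apple sales data.
xs = [0, 1, 2, 3, 4]
ys = [0, 10, 5, 15, 10]
ratio = bank_to_45(xs, ys)
print(ratio)
```

A ratio below 1 means the plot should be wider than it is tall, which is why banked time series often look short and wide.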
Update (April 30th): I redrew the decomposition plots this morning, and added a couple more.
Another Twitter conversation, this time in the evening. Dr Drang put up a characteristically sharp post looking at sales trends in Apple Macs, iPhones, and iPads. He used moving averages to show long-term sales trends effectively, and he made a convincing argument that iPad sales are in decline. I ended up grabbing the sales data myself from barefigur.
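For readers who have not used moving averages, here is a minimal sketch of a trailing average over quarterly figures. The four-quarter window is an assumption (it is a common choice because it averages out a yearly seasonal cycle), and the numbers are invented for illustration. They are not the real sales data.

```python
from statistics import mean


def moving_average(values, window=4):
    """Trailing moving average.

    Positions that do not yet have a full window behind them are None.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(mean(values[i + 1 - window : i + 1]))
    return out


# Hypothetical quarterly unit sales (millions), with a seasonal bump
# in every fourth quarter. Not real Apple figures.
sales = [10, 14, 12, 20, 14, 18, 16, 24]
smoothed = moving_average(sales, window=4)
print(smoothed)
```

The raw series jumps around with the seasonal bump, while the smoothed series rises steadily, which is what makes a long-term trend easier to see.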
Following a conversation on Twitter this morning, here’s a quick plot of some GSS data from 2000. Respondents were asked to estimate the percentage of people in the United States who fell into a range of (not necessarily exclusive) categories: White, Black, Hispanic, Asian, and Jewish. Here we show the median guesses of White respondents and Black respondents, together with the actual percentage of people in each category, based on the 2000 Census.
Thanks, Paddy. I’m very sad to hear that Paddy O’Carroll died this weekend in Cork. He was one of my first teachers in Sociology, and a man of deep intelligence, humanity, and insight into Irish society and especially its political community. Famously disorganized in lecture, he was nevertheless sharp as a pin in conversation. I can’t count the number of times he brought me up short with some observation or anecdote that I’d spend the rest of the day thinking about.
Last Thursday I gave a talk at the American Philosophical Association’s Central Division meetings about patterns in publication and citation in some of the field’s major journals. I have a more extensive analysis of the data that’s almost done, but that deserves a paper of its own rather than a post. Here I’ll confine myself mostly to descriptive material about some broad trends, together with a bit of discussion at the end.
Update: Updated to identify Catholic schools. (And again later, with more Catholic schools ID’d.)
I took another look at the vaccination exemption data I discussed the other day. This time I was interested in getting a closer look at the range of variation between different sorts of schools. My goal was to extract a bit more information about the different sorts of elementary schools in the state, just using the data from the Health Department spreadsheet.
California Kindergarten PBE Rates by Type of School, 2014-15. (PDF available.)
I came across a report this afternoon, via Eric Rauchway, about high rates of vaccination exemption in Sacramento schools. As you are surely aware, this is a serious political and public health problem at the moment. Like Eric, I was struck by just how high some of the rates were. So I went and got the data from the California Department of Public Health, just wanting to take a quick look at it.
Update, January 22nd: Now with plots standardized per thousand films released that year.
It’s time for another episode of Data Analysis on the Bus. This one follows from an exchange on Twitter, prompted by the coverage of American Sniper, about the tendency to use the word “American” in film titles, especially when you want things to sound terribly serious. This led to a bit of freewheeling and, it has to be said, perhaps tendentious cultural theorizing on my part.
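The per-thousand standardization mentioned in the update is simple arithmetic: divide the number of matching titles in a year by the total number of films released that year, then multiply by 1,000. A minimal sketch follows, with invented counts and a hypothetical helper name `per_thousand`.

```python
def per_thousand(count, total):
    """Rate of `count` per 1,000 items out of `total`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 1000 * count / total


# Hypothetical (matching titles, total films released) by year.
# These are made-up numbers, not the data behind the plots.
titles_by_year = {
    1990: (12, 4000),
    2010: (30, 8000),
}

rates = {
    year: per_thousand(matches, total)
    for year, (matches, total) in titles_by_year.items()
}
print(rates)
```

In this toy example the raw count more than doubles, but the rate rises only from 3.0 to 3.75 per thousand, because the number of films released also doubled. That is the reason to standardize before comparing years.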
After listening to the hosts discuss probability on ATP this week, I was most of the way through writing something that, had I finished it, would have been this Dr Drang post only not nearly as good. (I will confess that my motivation was exactly the same as his: “People believe John”.) In fairness I don’t blame them for getting confused, because probability really is confusing and I’m terrible at it myself.