This past September I gave the closing keynote at posit::conf; it’s now on YouTube to watch. Keen-eyed observers will note from the title that it’s about trustworthy data visualization. But it’s also about trust a bit more generally, and how we should think about it in a world where researchers are faking results, AIs are enthusiastically confabulating, and government is destroying data infrastructure. When you find yourself giving a talk with a little tiny microphone stuck to the side of your head you have to ask yourself some hard questions, but the talk was partly about that. ❧ Continue reading…
Mamdani’s victory in the New York City mayoral election gave me the opportunity to draw a few maps, and also to learn a bit about incorporating additional spatial data into maps drawn in R. R is not a specialized piece of GIS software. ESRI’s ArcGIS is the 800lb gorilla in this world and QGIS the GIMP to its Photoshop, so to speak.
Still, you can do a lot of spatial stuff in R, grounded in the sf package and its many friends. Plus you get the benefit of all the data manipulation and analysis that R is really good at. So, having gotten the precinct-level results for the election, some maps from New York City (e.g., the clipped borough boundaries map), and GTFS data from the MTA describing the structure of the subway system, I was able to draw some things. I strongly approve of the existence of the GTFS, by the way. It’s a spec for encoding transit data and lots of cities use it. Really handy. ❧ Continue reading…
Release 2 of the 2024 GSS cross-section and 1972-2024 cumulative data are now available. I’ve updated gssr and gssrdoc to incorporate them. There are quite a few changes in the data and variables, thanks in part to some changes in data collection methods and a privacy/disclosure review.
The gssr and gssrdoc packages are the nicest way to get General Social Survey data up and running in R. The figure above shows (survey-weighted) trends derived from the immameco question. ❧ Continue reading…
Here I continue my efforts to design visualizations that are as poorly-suited as possible to being displayed on phones. It looks pretty good on a big monitor, or six feet wide on a wall.
I made a version of this plot a few years ago. I ended up revisiting it this morning because I’m updating various datasets and code. A Manhattan plot is a term sometimes used to describe a kind of scatter plot where the x-values are fairly continuous, and the y values have distributions with long tails, so the plot looks like a skyline. This one here is a bar chart rather than a scatter plot but it’s still a kind of Manhattan plot of Manhattan. ❧ Continue reading…