I’ve spent the last couple of months revising my Data Visualization book for a second edition that, ideally, will appear some time in the next twelve months. As with the first edition, I’ve posted a complete draft of the book at its website. The production process hasn’t started yet, so it’s not ready to pre-order or anything, but the site has a one-question form you can fill out that asks for your email address if you’d like to be notified with one (and only one) email when it’s available. A lot has changed since the first edition, reflecting changes both in R and ggplot specifically, and in the world of coding generally. I may end up highlighting some of those new elements in other posts. But here, I want to focus on some nerdy details involved in getting the book to its final draft. I’ll discuss Quarto, the publishing system I used, its many advantages, and its current limits with respect to the demands I made of it.
I’ve written a second edition of Data Visualization: A Practical Introduction, which ideally should come out with Princeton University Press later this year. As with the first edition, a full draft of the book is available at https://socviz.co. The production process is just getting started so there’s no new cover yet, and there isn’t a link to pre-order. But (also like last time) I’ve put up a link to a form that lets you add your email if you’d like to be notified when it’s available to buy. You’ll only get one email (from me personally, not a marketing department) if you do; no spam or anything.
This past September I gave the closing keynote at posit::conf; it’s now available to watch on YouTube. Keen-eyed observers will note from the title that it’s about trustworthy data visualization. But it’s also about trust a bit more generally, and how we should think about it in a world where researchers are faking results, AIs are enthusiastically confabulating, and government is destroying data infrastructure. When you find yourself giving a talk with a tiny microphone stuck to the side of your head you have to ask yourself some hard questions, but the talk was partly about that.
Mamdani’s victory in the New York City mayoral election gave me the opportunity to draw a few maps, and also to learn a bit about incorporating additional spatial data into maps drawn in R. R is not a specialized piece of GIS software. ESRI’s ArcGIS is the 800lb gorilla in this world and QGIS the GIMP to its Photoshop, so to speak.
Still, you can do a lot of spatial stuff in R, grounded in the sf package and its many friends. Plus you get the benefit of all the data manipulation and analysis that R is really good at. So, having gotten the precinct-level results for the election, some maps from New York City (e.g., the clipped borough boundaries map), and GTFS data from the MTA describing the structure of the subway system, I was able to draw some things. I strongly approve of the existence of the GTFS, by the way. It’s a spec for encoding transit data and lots of cities use it. Really handy.