22 July 2022
“Happy families are all alike; every unhappy family is unhappy in its own way” runs the opening sentence of Anna Karenina. Hadley Wickham echoes the sentiment in a somewhat different context: “Tidy datasets are all alike, but every messy dataset is messy in its own way”. Data analysis is mostly data wrangling. That is, before you can do anything at all with your data, you need to get it into a format that your software can read.
29 June 2022
One more from the Manhattan data. Here’s a plot of all of Manhattan’s existing buildings, with year of construction on the x-axis and height in feet on the y-axis. With a nice wide aspect ratio and the use of geom_rect() to make the columns, we get a plot that looks a little like a skyline itself. Or, as was pointed out on Twitter, a Manhattan Plot of Manhattan itself.
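As a rough illustration of the geom_rect() approach, here is a minimal sketch. It does not use the real building data: the `buildings` data frame, its column names (`year_built`, `height_ft`), and the `bar_width` value are all made up for the example, and the actual plot may well have been built differently.

```r
library(ggplot2)

# Synthetic stand-in data (an assumption, NOT the NYC building records):
# 200 random "buildings" with a construction year and a height in feet.
set.seed(1)
buildings <- data.frame(
  year_built = sample(1850:2020, 200, replace = TRUE),
  height_ft  = round(rexp(200, rate = 1 / 150)) + 20
)

# Width of each column in years (hypothetical choice).
bar_width <- 1

# Each building becomes one rectangle rising from the baseline,
# centered on its year of construction.
p <- ggplot(buildings) +
  geom_rect(
    aes(
      xmin = year_built - bar_width / 2,
      xmax = year_built + bar_width / 2,
      ymin = 0,
      ymax = height_ft
    ),
    fill = "gray20"
  ) +
  labs(x = "Year of construction", y = "Height (feet)") +
  theme_minimal()
```

Saving with something like `ggsave("skyline.png", p, width = 12, height = 3)` gives the wide aspect ratio that makes the columns read as a skyline; the file name and dimensions are placeholders.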
24 June 2022
Following up on yesterday’s first cut at a map of Manhattan’s buildings by height, which I’ll be revisiting soon, here are two maps of the city’s buildings by age. First for Manhattan alone and then for the whole of New York City. The latter is quite compressed, unfortunately, as it contains data on every building footprint in the city.
Manhattan Buildings by Nearest Decade of Construction.
New York City Buildings by Decade of Construction.
23 June 2022
New York’s Open Data initiative continues to provide a lot of really excellent material to experiment with visually. Here is a map of building footprints in Manhattan, with their roof heights designated by color. Made with R and ggplot along with the sf package. The PNG is relatively high-resolution so you should be able to zoom in a bit to see some familiar outlines.
Building footprints and roof heights in Manhattan.