21 October 2021
Prompted by a guest visit to Mine Çetinkaya-Rundel’s Advanced Data Visualization class here at Duke, I’ve updated my US and state excess death graphs. Earlier posts (like this one from February) will update as well.
I am interested in all-cause mortality in the United States for 2020. I look at each jurisdiction, ordered by how far off its 2015-2019 average it was in 2020.
All-cause mortality by jurisdiction.
The zero-percent line in this graph is average deaths between 2015 and 2019.
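As a rough sketch of the calculation behind a graph like this, each jurisdiction's 2020 deaths can be expressed as a percent difference from its 2015-2019 average, and jurisdictions can then be ordered by that difference. The counts and jurisdiction names below are made up purely for illustration, and the column names (`jurisdiction`, `year`, `n_deaths`) are my own assumptions rather than the structure of the real data.

```r
library(dplyr)

# Made-up death counts for two hypothetical jurisdictions (not real data)
mortality <- tibble(
  jurisdiction = rep(c("A", "B"), each = 6),
  year = rep(2015:2020, times = 2),
  n_deaths = c(100, 102, 98, 101, 99, 120,
               200, 198, 202, 201, 199, 210)
)

excess <- mortality %>%
  group_by(jurisdiction) %>%
  summarize(
    # Baseline: mean annual deaths over 2015-2019
    baseline = mean(n_deaths[year %in% 2015:2019]),
    # Percent difference of 2020 from that baseline
    pct_diff = 100 * (n_deaths[year == 2020] - baseline) / baseline
  ) %>%
  # Order jurisdictions by how far 2020 was from the baseline
  arrange(desc(pct_diff))

excess
```

A `pct_diff` of zero corresponds to the zero-percent line, meaning 2020 matched the 2015-2019 average.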
9 October 2021
The PDP-11/70 was a 16-bit minicomputer built by Digital Equipment Corporation in the 1970s. Amongst other things, it is well known for its front panel design, with color-coded (and color-coordinated) switches and associated blinkenlights. I have an interest in vintage computers, mostly focused on Macs from the late 1980s, that I ended up indulging a little during the pandemic. I’ve fixed up a couple of SE/30s and a Quadra 700 over the past year.
3 September 2021
I updated the covdata package for the first time in a while, as I’ll be using it to teach in the near future. As a side-effect, I ended up taking a look at what the ongoing polarization or divergence of the COVID experience is like in different parts of the United States. Here I use county-level data to draw out some of the trends. The idea is to take the time series of COVID-19 deaths and split it into deciles by some county-level quantity of interest.
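The decile idea can be sketched roughly as follows. This uses simulated data, not the covdata package's actual tables, and the column names (`fips`, `pop_density`, `week`, `deaths`) are assumptions made for the example. The county-level quantity here is a stand-in; any county-level measure could take its place.

```r
library(dplyr)
library(tidyr)

set.seed(42)

# Simulated county-level quantity of interest (hypothetical, not real data)
counties <- tibble(
  fips = sprintf("%05d", 1:200),
  pop_density = rlnorm(200, meanlog = 3, sdlog = 1)
)

# Simulated weekly death counts for every county
deaths <- expand_grid(fips = counties$fips, week = 1:10) %>%
  mutate(deaths = rpois(n(), lambda = 2))

decile_series <- counties %>%
  # Assign each county to a decile of the quantity of interest
  mutate(decile = ntile(pop_density, 10)) %>%
  inner_join(deaths, by = "fips") %>%
  # Collapse the county time series into one series per decile
  group_by(decile, week) %>%
  summarize(deaths = sum(deaths), .groups = "drop")

decile_series
```

The result has one row per decile per week, which is the shape you would want for drawing ten lines on a single plot or one small-multiple panel per decile.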
4 May 2021
Recently I came across a question where someone was looking to take a bunch of CSV files, each of which contained numerical columns, and (a) get them into R, (b) calculate the mean and standard deviation of every column in every CSV file, and (c) calculate some overall summary like the mean of all the means and the mean of all the standard deviations.
I already know how to use map_dfr() to read a lot of CSVs with the same structure into a nice tidy tibble.
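Here is a minimal sketch of the three steps, assuming every file shares the same numeric columns. The files are small throwaway CSVs written to a temporary directory so that the example is self-contained. The file names and the column names `a` and `b` are invented for the illustration.

```r
library(tidyverse)

# Write three small example CSVs to a temp directory (hypothetical data)
tmp <- file.path(tempdir(), "csv_demo")
dir.create(tmp, showWarnings = FALSE)
set.seed(1)
for (i in 1:3) {
  write_csv(
    tibble(a = rnorm(10), b = runif(10)),
    file.path(tmp, paste0("file_", i, ".csv"))
  )
}

files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)

# (a) Read every file into one tibble, recording the source file in `file`
df <- files %>%
  set_names(basename(files)) %>%
  map_dfr(read_csv, show_col_types = FALSE, .id = "file")

# (b) Mean and standard deviation of every numeric column, per file
by_file <- df %>%
  group_by(file) %>%
  summarize(across(where(is.numeric), list(mean = mean, sd = sd)),
            .groups = "drop")

# (c) Mean of the per-file means and mean of the per-file standard deviations
overall <- by_file %>%
  summarize(across(where(is.numeric), mean))

overall
```

Naming the vector of paths with `set_names()` is what lets `.id = "file"` label each row with the file it came from, which in turn makes the per-file grouping in step (b) straightforward.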