27 April 2022
On StackOverflow, a questioner with a bunch of data frames (already existing as objects in their environment) wanted to split each of them into two based on whether some threshold was met on a specific column. Every one of the data frames had this column in it. Their thought was that they’d write a loop, or use lapply after putting the data frames in a list, and write a function that split the data frames, named each one, and wrote them out as separate objects in the environment.
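As a rough illustration of the list-based approach, here is a minimal base R sketch. The data frames, the column name `score`, the cutoff of 5, and the helper `split_at()` are all made up for the example; they are not taken from the original question.

```r
# Hypothetical data: two data frames that share a numeric column `score`
df_list <- list(
  a = data.frame(id = 1:4, score = c(2, 8, 5, 9)),
  b = data.frame(id = 1:3, score = c(7, 1, 6))
)

threshold <- 5  # assumed cutoff, for illustration only

# Split one data frame into "below" and "above" pieces.
# Fixing the factor levels means both pieces always exist, even if one is empty.
split_at <- function(df, col, cutoff) {
  grp <- factor(df[[col]] >= cutoff,
                levels = c(FALSE, TRUE),
                labels = c("below", "above"))
  split(df, grp)
}

# Apply the helper to every data frame; the result is a nested named list
pieces <- lapply(df_list, split_at, col = "score", cutoff = threshold)

# Flatten one level to get names like "a.below" and "a.above"
flat <- unlist(pieces, recursive = FALSE)
names(flat)
```

Keeping the pieces in a named list like `flat` is usually enough to work with them. If separate objects really are wanted, `list2env(flat, envir = globalenv())` would create them from the list names.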
10 April 2022
As mentioned last time, we often want to build up a data frame iteratively. The map() family of functions in purrr can help with this. Here I’ll show a handy pattern for keeping track of what you’ve added to the data frame you’re making.
The map_dfr() function will take a vector, apply a function to each element of it, and then return the results as a data frame bound row-by-row. For example, the U.
8 April 2022
Let’s say we’re working with the General Social Survey. We’re interested in repeatedly fitting some model each year to see how some predictor changes over time. For example, the GSS has a longstanding question named fefam, where respondents are asked to give their opinion on the following statement:
It is much better for everyone involved if the man is the achiever outside the home and the woman takes care of the home and family.
15 February 2022
For the past few years, Jason Snell at Six Colors has conducted a survey of people who write about Apple. He asks a series of questions about the company and its products and presents a report of people’s answers. This year’s report has all the details for those interested.
I’m a subscriber to Six Colors (it’s well worth it if you like that sort of thing). In the course of chatting about the report and its graphs in the member Slack, Jason kindly shared an anonymized version of the survey data with me.