This article explains the why behind many of the decisions we make when creating a good visualization.
It’s important to think of your data visualizations in much of the same way. Our current society thrives on scroll culture, which means you need to make your visualization immune to it. It helps to assume that you care about your work the most, such that even individuals or stakeholders that requested a visualization are likely to move quickly through your work. Thus, a visualization is great when it can capture the point efficiently, quickly, and without losing the interest of your end-user(s).
You can run your regression analysis (any variation) and then apply the following code:
model.glm = glm(as.numeric(outcome == "Y") ~ Group + var1 + var2 +
#specific to your model,
data = model.df)results <- as.data.frame(get_model_data(model.glm, type = "eff", terms = c("Group"))) %>%
mutate(Group = if_else(x == 1, "Group 1", "Group 2"),
predicted = (predicted*100),
conf.low = (conf.low*100),
conf.high = (conf.high*100)) %>%
dplyr::select("Group", "pp" = "predicted", "std.error",
Review the different type = … but
type = "eff"means that you don’t have to use a true reference group, but instead, you can use a proportion of your…
I created an animated line graph with animated labels for a Data Viz Storytelling competition (Github here). The data included Denver eviction data from Jan 2019 — June 2019 and Jan 2020 — Aug 2020. I wanted to show the difference between the two years since the eviction moratorium was implemented in response to COVID in March 2020 (and extended through the end of 2020).
This code was inspired from the following two links
The data can be downloaded here. I used the 2 .csv files for Denver 2019 and Denver 2020.
If you want to isolate an effect and compare groups, we need to get rid of any other noise. For example, say you have two people. One person gets tutored and the other does not, then they both take the same test. We want to know if the tutoring sessions were effective. We cannot simply compare the test grades of these two individuals. We have to control for person-level variables that may be responsible for the different test scores in order to isolate the effect of the tutoring session. Some person-level differences that may influence a person’s ability to perform…
I have data that’s 1 row per person where people reported the number of drinks they had in a specific time-block. The columns are various time blocks, but the participant could also provide other answers that were coded differently:
NAmeans that they didn't use alcohol at all
0mean that they had no drinks at that time
-999means that they did drink in that time block, but they can't remember how much.
Here’s the fun part. We want to impute a typical person-mean for
-999. In practice, this means that if a person reported they drank in the…
Reproducible code should (at minimum) be your last step before closing out a project. You want someone (especially your future self) to be able to pick up your code and quickly and efficiently be able to read it. Typical suggestions for reproducible code surround good commenting and formatting, but I’m going to go into detail for a few tips that have made things much cleaner.
Any time you are running code that has a random process, you should set your seed. You can choose any random number for your seed. Most people choose birthdates, anniversaries, or favorite numbers. I usually…
This post is a branch of my previous post Becoming a better lap swimmer where I cover some distinctions between triathletes and swim team swimmers.
In short, if you are a strong kicker, everything else about swimming will become easier. You will have good body position in the water (see my previous post) and therefore, you will move efficiently in the water so you won’t spend as much useless energy clawing through the water. Part of being a good kicker is how you rotate your body in the water. You’ll need…
At 18 years old, I competed in an Ironman Triathlon. About 8 years later, I found myself competing for Colorado State’s swim club team and the differences between the two were stark — not only because I had transitioned from distance racing to sprint racing, but also in the general approaches. Now that I understand the distinctions between the two worlds, I cannot unsee the ‘triathlete’ stroke when I swim in my friendly neighborhood recreation center.
The main problem that I see with my fellow, local lap swimmers is that they are often swimming uphill. You probably have experienced this…
When I used to teach psychology, I would present the above image to my students and their minds would immediately wonder if this is how someone with dyslexia experiences the world. Luckily, I could tell them first-hand that this is not dyslexia.
This isn’t easier to read for someone with dyslexia, in fact, it’s much harder. The reason why any person is able to read this paragraph is due to Gestalt psychology, which generally states that the whole (a word) is greater than the sum of its parts (letters). …
trained neuroscientist | professional data scientist | lifelong feminist