This article explains the why behind many of the decisions we make when creating a good visualization.

Consider the cliché: “A picture is worth a thousand words…”

It’s important to think of your data visualizations in much the same way. Our current society thrives on scroll culture, which means you need to make your visualization immune to it. It helps to assume that you care about your work more than anyone else does, so even the individuals or stakeholders who requested a visualization are likely to move through it quickly. Thus, a visualization is great when it captures the point quickly and efficiently, without losing the interest of your end-user(s).


You can run your regression analysis (any variation) and then apply the following code:

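library(dplyr)   # mutate(), if_else(), select(), %>%
library(sjPlot)  # get_model_data()

# Linear probability model: the outcome is recoded to 0/1
model.glm <- glm(
  as.numeric(outcome == "Y") ~ Group + var1 + var2 + Group * var2,  # specific to your model
  family = gaussian(link = "identity"),
  data = model.df
)

# Predicted values for Group (type = "eff"), rescaled from proportions to percentages.
# Note: std.error is left on the original (proportion) scale.
results <- as.data.frame(get_model_data(model.glm, type = "eff", terms = c("Group"))) %>%
  mutate(
    Group = if_else(x == 1, "Group 1", "Group 2"),
    predicted = predicted * 100,
    conf.low = conf.low * 100,
    conf.high = conf.high * 100
  ) %>%
  dplyr::select("Group", "pp" = "predicted", "std.error", "conf.low", "conf.high")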

Review the different type = … options, but type = "eff" means that you don’t have to use a true reference group; instead, you can use a proportion of your…


I created an animated line graph with animated labels for a Data Viz Storytelling competition (GitHub here). The data covered Denver evictions from January to June 2019 and from January to August 2020. I wanted to show the difference between the two years because an eviction moratorium was implemented in response to COVID in March 2020 (and extended through the end of 2020).

Final Image:

This code was inspired by the following two links

Set-up and the Data

The data can be downloaded here. I used the two .csv files for Denver 2019 and Denver 2020.

#import…


This code shows different ways to build a matched sample for analysis using tidyverse. It includes contributions from multiple people.

Why Match?

If we want to isolate an effect and compare groups, we need to get rid of any other noise. For example, say you have two people. One person gets tutored and the other does not, and then they both take the same test. We want to know if the tutoring sessions were effective. We cannot simply compare the test grades of these two individuals. To isolate the effect of the tutoring sessions, we have to control for person-level variables that may be responsible for the different test scores. Some person-level differences that may influence a person’s ability to perform…


Just another data-wrangling post

The Task

I have data with one row per person, where each person reported the number of drinks they had in specific time blocks. The columns are the various time blocks, but a participant could also provide other answers that were coded differently:

  • NA means that they didn't use alcohol at all
  • 0 means that they had no drinks at that time
  • -999 means that they did drink in that time block, but they can't remember how much.

Here’s the fun part. We want to impute a typical person-mean for -999. In practice, this means that if a person reported they drank in the…
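To make the idea concrete, here is a minimal sketch of a person-mean imputation in base R. The excerpt above is cut off, so the exact rule is an assumption on my part: each -999 is replaced with the mean of that person's other valid answers (zeros included), and NA is left untouched. The data frame `drinks` and the column names `block1` to `block3` are hypothetical toy values, not the original data.

```r
# Hypothetical toy data: one row per person, one column per time block.
drinks <- data.frame(
  id     = 1:3,
  block1 = c(2, NA, 0),
  block2 = c(-999, NA, 3),
  block3 = c(4, NA, -999)
)

block_cols <- c("block1", "block2", "block3")
m <- as.matrix(drinks[block_cols])

# Flag the "drank, but can't remember how much" cells.
needs_impute <- !is.na(m) & m == -999

# Person-mean over valid answers only: treat -999 as missing for the mean.
# Assumption: zeros count toward the mean.
valid <- m
valid[needs_impute] <- NA
person_mean <- rowMeans(valid, na.rm = TRUE)  # NaN when a person has no valid answers

# Fill each flagged cell with the mean of the row (person) it belongs to.
m[needs_impute] <- person_mean[row(m)[needs_impute]]
drinks[block_cols] <- as.data.frame(m)

drinks$block2[1]  # 3   (mean of 2 and 4)
drinks$block3[3]  # 1.5 (mean of 0 and 3)
```

Person 2 never used alcohol, so all of their blocks stay NA. Whether zeros should count toward the person-mean is a judgment call that depends on how you define a "typical" amount.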


Making your code reproducible should (at minimum) be your last step before closing out a project. You want someone (especially your future self) to be able to pick up your code and read it quickly and efficiently. Typical suggestions for reproducible code center on good commenting and formatting, but I’m going to go into detail on a few tips that have made my code much cleaner.


set.seed()

Any time you are running code that has a random process, you should set your seed. You can choose any random number for your seed. Most people choose birthdates, anniversaries, or favorite numbers. I usually…
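As a small illustration of the point above, the sketch below draws the same random sample twice. The seed value 2021 and the variable names are arbitrary choices for this example.

```r
# Arbitrary seed chosen for illustration; any integer works.
set.seed(2021)
first_draw <- sample(1:100, size = 5)

# Resetting the same seed before the same call reproduces the draw.
set.seed(2021)
second_draw <- sample(1:100, size = 5)

identical(first_draw, second_draw)  # TRUE

# Without resetting the seed, the next draw continues the random stream
# and will generally differ from the first two.
third_draw <- sample(1:100, size = 5)
```

Setting the seed once at the top of a script is usually enough, as long as the code is always run from top to bottom.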


This post branches off my previous post, Becoming a better lap swimmer, where I cover some distinctions between triathletes and swim team swimmers.

Why become a better kicker?

  • Body position
  • Efficiency/reduced energy
  • Better rotation
  • Stronger core
  • Speed

In short, if you are a strong kicker, everything else about swimming will become easier. You will have good body position in the water (see my previous post) and will therefore move efficiently, so you won’t waste as much energy clawing through the water. Part of being a good kicker is how you rotate your body in the water. You’ll need…



At 18 years old, I competed in an Ironman Triathlon. About 8 years later, I found myself competing for Colorado State’s swim club team, and the differences between the two were stark, not only because I had transitioned from distance racing to sprint racing, but also because the general approaches differed. Now that I understand the distinctions between the two worlds, I cannot unsee the ‘triathlete’ stroke when I swim in my friendly neighborhood recreation center.

The main problem that I see with my fellow, local lap swimmers is that they are often swimming uphill. You probably have experienced this…


You can read this.

When I used to teach psychology, I would present the above image to my students, and they would immediately wonder whether this is how someone with dyslexia experiences the world. Luckily, I could tell them first-hand that this is not dyslexia.

This isn’t easier to read for someone with dyslexia; in fact, it’s much harder. Anyone is able to read this paragraph because of a principle from Gestalt psychology, which generally states that the whole (a word) is greater than the sum of its parts (letters). …

Lara Southard, PhD

trained neuroscientist | professional data scientist | lifelong feminist
