Tim Plante, MD MHS

Adding overlaying text “boxes”/markup to Stata figures/graphs

What’s up with adding text to figures in Stata? It’s handy to add text to your plots to orient your readers to specific observations in your figures. You might opt to highlight a datapoint, add percentages to bars, or say what part of a figure range is good vs. bad. Adding text to Stata is …

Appending/merging/combining Stata figures/images with ImageMagick

Sometimes you want to combine two figures in Stata but the –graph combine– command isn’t doing it for you for whatever reason. I’ve run into this when I’ve already merged several figures (abcd), and want to combine that with another merged figure (1234): But what I want is one image that looks like this, stacking …

Making a Bland-Altman plot with printed mean and SD in Stata

The below code outputs this Bland-Altman plot figure, which prints the mean and SD and puts a solid line for mean difference and red dotted lines for mean difference +/- 1.96*SD. (Note: the original BA paper used +/- 2*SD, but it’s reasonable to use +/-1.96*SD given it’s commonality in estimating the bounds of 95% confidence …

Making a scatterplot with R squared and percent coefficient of variation in Stata

I recently had to make some scatterplots for Figure 3 of this paper. I decided to clean up the code in case it might be helpful to others. The below code outputs a scatterplot with R-squared and %CV. I grabbed the %CV-from-a-regression code from Mehmet Mehmetoglu’s CV program. (Type –ssc install cv– to grab that …

Part 6: Visualizing your continuous exposure at baseline

Visualization of your continuous exposure in an observational epidemiology research project As we saw in Part 5, it’s important to describe the characteristics of your baseline population by your exposure. This helps readers get a better understanding of internal validity. For folks completing analyses with binary exposures, part 6 isn’t for you. If your analysis …

Using Stata’s Frames feature to build an analytical dataset

Stata 16 introduced the new Frames functionality, which allows multiple datasets to be stored in memory, with each dataset stored in its own “Frame”. This allows for dynamic manipulation of multiple datasets across multiple Frames. Stata is still simplest to use when manipulating a single dataset (or, frame). So, Stata users will probably be interested …

Mediation analysis in Stata using IORW (inverse odds ratio-weighted mediation)

Mediation is a commonly-used tool in epidemiology. Inverse odds ratio-weighted (IORW) mediation was described in 2013 by Eric J. Tchetgen Tchetgen in this publication. It’s a robust mediation technique that can be used in many sorts of analyses, including logistic regression, modified Poisson regression, etc. It is also considered valid if there is an observed …

Stata-R integration with Rcall

Stata is great because of its intuitive syntax, reasonable learning curve, and dependable implementation. There’s some cutting edge functionality and graphical tools in R that are missing in Stata. I came across the Rcall package that allows Stata to interface with R and use some of these advanced features. (Note: this worked for me as …

Part 3: Introduction to Stata

Note in 2023: We are using R and not Stata this summer so this post doesn’t apply to this year’s projects. Stata is a popular commercial statistical software package that was first released 30+ years ago. It has some really nice features, loads of top-rate documentation, a very active community, and approachable syntax. For beginners, …

Table 1 with pweights in Stata

The very excellent table1_mc program will automate generation of your Table 1 for nearly all needs (read about it here), except for datasets using pweight. I’ve been toying around with automating Excel table generation using Stata v16+ Frames features. I recently started working on a database that requires pweighting for analyses, and opted to use …

Skip to toolbar