A student shared this paper by Blakeway and colleagues. It shows the leads of the heart color-coded by region. It’s awesome. I’m posting it here mostly so I can find it again. Link to the PDF, look at the figure on Page 2: https://www.resuscitationjournal.com/article/S0300-9572(12)00053-6/pdf
Citation: Blakeway E, Jabbour RJ, Baksi J, Peters NS, Touquet R. ECGs: colour-coding for initial training. Resuscitation. 2012 May;83(5):e115-6. doi: 10.1016/j.resuscitation.2012.01.034. Epub 2012 Feb 2. PMID: 22306667. https://pubmed.ncbi.nlm.nih.gov/22306667/
Tables can render weirdly in MS Word and PowerPoint, and you can have a hard time figuring out why. Here are a few steps to fix them so they are snappy.
First: MS Word example (see PowerPoint below)
You get a table that looks like this. You start pulling your hair.
(Ignore that there are only 3 quartiles. I should have written tertile, but I made all of the figures in this post before I realized that typo.)
Step 1: fix the “paragraph” settings.
Select the entire table by clicking this symbol in the top left, then click the tiny little arrow at the bottom right of the paragraph options on the home tab:
This paragraph dialogue will open up. Change indentations to zero on left and right, for the dropdown of “special”, change it to “none”, and change spacing to zero for before and after, and change line spacing to single.
Now we go back to our table and see that it looks a little better.
Step 1B (optional): Save the fixed paragraph formatting as a new Style
For bonus points: Save this paragraph format as a new style called “Tables” that you can apply and edit as needed! While on the “Home” tab, click this little box at the bottom right of the Styles section, then click the A+ button that appears.
On the pop-up screen, name the style “Tables”, leave everything else unchanged, then hit “okay”.
Now highlight some text in your table, right click the “Tables” style in the home/Styles block, and click “update tables to match selection”.
Now to fix the paragraph settings for future tables all you need to do is select the entire table (top left symbol on table) and click the “Tables” style. You can also edit the font and paragraph settings of all tables that have your Table style applied simultaneously by editing the Tables style directly (right click on “Tables” style then click “Modify”). This is handy if you want to change the fonts from Times New Roman to Arial all at once, for example.
Step 2: Change cell size minimums
The cells are still pretty tall. Let’s see if we can fix that. Click on the symbol on the top left again to highlight the entire table, go to the layout tab, then find the cell size box for HEIGHT (we don’t care about width now). Change this to zero. (It will change itself to 0.01″ and that’s fine.)
There we go! A much neater table. We can do one better though, there’s still a bit of extra spacing in the cell margins that can probably be removed.
Step 3: Narrowing the margins
Select the entire table by clicking the symbol on the top left, go to the Layout tab, then click “Cell Margins”.
In the window that pops up, change the left and right margin to 0.03 or 0.04. The top and bottom should be zero if they aren’t already.
Now the text is a little closer to the cell border. It’s subtle, but it’s there! Notice how the “M” in “Model” is pretty close to the left border. This example uses 0.03 as left and right, you might opt to use 0.04 instead if this is too narrow for you.
Step 4: Resizing your columns
This is pretty straightforward. Double click on the column borders (where I drew x marks) to shrink the column to be the maximum width of the contents of the cells.
Now you have this:
But let’s say that you have some sort of really wide cell for some reason?
Notice that the first column has long labels. If you double click on the borders of the columns (starting with the rightmost, moving left), you get this:
…But let’s say that you wanted to have the text wrap a bit more neatly, rather than be stretched out? In this scenario, I recommend strategically inserting line breaks (“hitting enter or return”, red checks in this picture) so the text wraps at the maximum width of the cell that you want. THEN click the right border of the first column to shrink it down:
…and you get this:
If you want even more control over column width, you can directly adjust them using the sliders on the ruler. If you don’t see the ruler, you need to turn it on under “View” tab then check the box next to “Ruler”. Then click anywhere on your table and you’ll see the grey sliders appear on the ruler.
Step 5 (optional): Final tweaks
I think all cells should have the text floating in the middle (rather than all the way at the top, which is default), which can be changed by RIGHT clicking on the top left symbol then selecting table properties…
…and then selecting the “Cell” tab and clicking the “Center” option.
Now notice that things are floating nicely with vertical centering.
I also like to make the top row’s font bolded, and center all columns except the first.
Now you can fiddle with the font type and size as needed. And there you are! What I think is a nicely optimized table.
Second: PowerPoint example
You have a table that looks like this:
Step 1: Fixing paragraph settings
First, highlight the entire contents of the table (there’s no top left icon in PPT like there is in Word to highlight the entire table) then click on the little button at the home tab’s paragraph section’s bottom right.
On the pop up window, change indentation and spacing before and after to zero, special to none, and line spacing to single, and click okay.
Now we see a less unwieldy table!
The margins separating the text to the
Step 2: Reduce cell margins
Let’s shrink the cell margins, which is the distance between the text and the border of the cells. Highlight the contents of your table, then on the layout tab, drop down the options under “cell margins”. Ignore the options inside and go to custom margins.
On the pop up window, change the left and right margins to 0.03 or 0.04 and change the top and bottom to zero.
Now we have some nice narrow margins!
Step 3: Reduce cell height
Let’s shrink down the height of the cells. Highlight the contents of the table, then under the layout tab, reduce the height as much as the down button will let you. Don’t worry about the width.
Now you have a table that is nice and short. (It didn’t actually change the table in this example so nothing to look at.)
Step 4: Fix the column width
Let’s fix the width of the columns. In this example, I added some extra text in the first row. First, insert line breaks (“hit enter or return”) strategically so the cells aren’t overextended by the length of the lines. In this example, I’m inserting line breaks where the checks are:
Now double click the border line of the columns to auto-fit the width.
Now you have a nice narrow table:
Step 5 (optional): Final tweaks
I think the content of all cells should be vertically centered. Highlight the contents of the table, then hit this button under the layout tab to vertically center the content.
Now the cells are vertically centered!
I also think that all columns except the first should be centered, so highlight those columns and hit the center button. I leave the first column aligned left.
Now you might want to make all columns except the first the same width as the widest column. Specifically, notice that the Tertile 1 column (with the “ref”) is narrower than the other two. Click on the widest column and under the layout tab, note that the cell size width is 1.8″.
Now highlight all tertile columns and under the layout tab, type “1.8” (without quotes) into the cell size width box and hit enter.
Scheduling meetings with someone inside of your institution is pretty easy in Outlook since you can typically look at shared availability with the Scheduling Assistant when generating a calendar invitation. Things get a bit more complex for folks outside of your institution, which is why there are services like Doodle, WhenIsGood.net, and When2Meet.com. When you are trying to meet with 1 or 2 people outside of your institution, you can instead directly send your calendar availability in-line in an email.
Steps:
1. Pop out your email message draft
2. Click to set your cursor where you want your calendar to appear
3. Click insert –> calendar (you probably need to make your window full screen in order to see this calendar option)
Next:
4. Change the date range to “specify dates…”
5. The start date will be today. Change the end date to some other date.
6. Click “okay” and then you’ll have your calendar in-line! It also tacks on an ics file.
I recently had to make some boxplots with Stata’s –graph box– command. I find it challenging to use since its syntax differs from the –twoway– command set that I use all the time. I was using the –over– subcommand twice and wanted to change the colors of each box and its dots by group from the first –over– subcommand. I found some helpful details on the Statalist Forum here and here. Here’s what I did to accomplish this, using the help from the Statalist forum.
Some tweaks here: I wanted to rotate some labels 45 degrees with –angle– and I also aggressively labeled variables and their values so I didn’t need to manually relabel the figure (which is done with the –relabel– subcommand if needed). It takes an extra 30 seconds to label variables and values, and will save you lots of headbanging fiddling with the –relabel– command, so just label your variables and values from the start.
This example uses fake data. Code follows the picture. Good luck!
clear all
set obs 1000 // blank dataset with 1000 observations
set seed 8675309 // jenny seed
gen group = round(runiform()) // make group that is 0 or 1
gen time = round(3*runiform()) // make 4 times, 0 through 3
replace time = time+1 // now time is 1 through 4
gen tommy2tone = 100*rbeta(3,10) // fake skewed data
// now apply labels to variables.
// technically you only need to label the 3rd one
// of these since categorical variable value labels
// are shown instead of the variable label itself,
// but might as well do all 3 in case you need them
// labeled somewhere else.
label variable group "Group"
label variable time "Time"
label variable tommy2tone "Jenny Lyrics remembered, %"
// now make value labels.
* group
label define grouplab 0 "Tommy" 1 "Tutone"
label values group grouplab //DON'T FORGET TO APPLY LABELS
* time
label define timelab 1 "Time 1" 2 "Time 2" 3 "Time 3" 4 "Time 4"
label values time timelab //DON'T FORGET TO APPLY LABELS
// code for boxplot
graph box tommy2tone ///
, ///
over(group, label(angle(45))) ///
over(time) ///
scale(1.3) /// embiggen labels & figure components
box(1, color(red)) marker(1, mcolor(red)) ///
box(2, color(blue)) marker(2, mcolor(blue)) ///
asyvars showyvars leg(off)
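If you do end up needing –relabel–, here is a minimal sketch of what it looks like. It uses Stata's built-in auto dataset rather than the fake data above, and the replacement label text ("US-made", "Imported") is made up for illustration. The numbers inside –relabel– refer to the order of the categories, not to their underlying values.

```stata
* Sketch only: relabeling categories inside over() instead of
* fixing the value labels in the dataset itself.
sysuse auto, clear
* foreign has two categories (0 and 1); relabel() refers to them
* by position: 1 is the first category, 2 is the second.
graph box price, ///
    over(foreign, relabel(1 "US-made" 2 "Imported") label(angle(45)))
```

This changes the labels on this one figure only. Any other figure made from the same data would need its own –relabel–, which is why labeling variables and values up front is less work.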
Writing the first draft of a scientific conference abstract is challenging. As part of an Early Career Advisory Committee ‘Science Jam’ sponsored by the UVM CVRI, a group of us came up with a fill-in-the-blank, Mad Lib-style guide to help you complete the first draft of a scientific conference abstract.
There’s one Zip file with 2 documents:
Clinical or epidemiological-style abstract (Note: Not intended for case reports)
Basic science abstract
The first page is where you declare all of the terms and concepts, the second page is the fill-in-the-blank section that is drawn from the first page. Do the first page first. I also color coded the clinical/epidemiology one since that’s the one I’m using.
These documents use some fancy MS Word features to help you complete the sections that may not work too well with browser-based MS Word applications, so best to do on your computer with the ‘standard’ MS Word desktop app.
5/2023: Every few years I have to download my DEA certificate (“DEA license”) for credentialing. I always forget how to do it so I’m writing down the steps here in case someone else needs help with this. You need some details from your old DEA certificate, so hopefully you have a copy of that.
There are a few other links in case the one above no longer works. The first one brings you to the “Available Options” page, and requires you to do steps 2-4 below, but after finishing #4 it says it logs you out, though it doesn’t really do that. Just hit the back button in your browser as described in #6 below.
On the next page, type in your last name, SSN without dashes, your ZIP code (this might be for your office/medical center rather than your home address), and expiration date. If your DEA certificate expired since you last tried to download it, try adding 3 years to the expiration date on your last DEA certificate.
On the next screen, type in your DOB in MM-DD-YYYY. Click “validate DOB”.
Download your certificate and save as PDF. You are done.
Bonus: Do other things on the DEA website. From the page you land on for #5, hit “done” and you’ll get to the “Logout” screen, saying that you have been successfully logged out. This is a lie. Hit the back button on your browser. You’ll now see “Available Options” (“New Registration” “Collector Status Request/Update Login” “Registration Renewal” “Registration Update” “Check Registration Status” “Registration Reprint Receipt” “Registration Reprint Certificate” “Registration Validation” “Request Form 222” “ADS Registration”).
Frames were introduced in Stata 16 and are handy for (a) storing/manipulating multiple datasets simultaneously and (b) building datasets on the fly. I’ve had good luck making a table using frames. This strategy includes (1) making a new frame with as many columns as you need, specifying they are long strings (strL), and printing the top row, (2) running regressions or whatnot, (3) saving components of the regression as local macros that print as text, (4) making “display” macros for each column, (5) saving those “display” local macros to the new frame, repeating steps 2-5 as many times as needed, and (6) exporting the new frame as an Excel file.
Stata has recently introduced “tables” and “collect” features that allow you to do similar things. I find those features overly confusing, so I’ve developed this approach. I hope you find it useful! Here’s what we are trying to make:
Note: these details depend on use of concepts discussed on this other post (“return list” “ereturn list” “matrix list r(table)”, local macros, etc.). Make sure to familiarize yourself with that post.
Problems you might run into here:
This uses local macros, which disappear when a do file stops running. You need to run the entire do file from top to bottom every time, not line by line.
Stata’s -frames post- functionality is finicky. It expects exactly as many posted variables as there are variables in the new frame, and that each posted format matches the predefined format on the frame’s variable.
Three forward slashes (“///”) extend a command across lines. Two forward slashes (“//”) allow commenting but do not continue to the next line. Check your double and triple forward slashes.
Stata can’t write to an open excel file. Close the excel file before re-running.
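To make the slash and local macro points concrete, here is a tiny sketch using the auto dataset. The macro name n is just for illustration. Run it all at once from a do file.

```stata
* Sketch only: triple vs. double slashes, and a local macro
sysuse auto, clear
summarize price /// <-- TRIPLE SLASH, the command continues below
    weight      // <-- DOUBLE SLASH, comment only, the command ends here
* r(N) now holds the N from the last variable summarized (weight).
* Save it as a local macro and print it right away. If you ran these
* lines one at a time instead, `n' would be empty by the display line.
local n = r(N)
display "N from the last summarize: `n'"
```

If you swapped the triple slash on the first line for a double slash, Stata would run –summarize price– alone and then choke on the stray "weight" line.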
Code
* Reset frames and reload data
frames reset
sysuse auto, clear
*
* STEP 1 - Make a new frame with long strings
*
* Make new frame called "table" and
* define all columns/variables as being strings.
* Name all variables as col#, starting with col1.
* You can extend this list as long as you'd like.
frame create table /// <--TRIPLE SLASH
strL col1 /// name
strL col2 /// n
strL col3 /// beta
strL col4 /// 95% CI
strL col5 // <-- DOUBLE SLASH, P-value
* If you want to add more col#s, make sure you change
* the last double slashes to triple slashes, all new
* col#s should have triple slashes except the last,
* which should have double (or no) slashes
*
* Prove to yourself that you have made a new frame
* called "table" in addition to the "default" one
* with the auto dataset.
frames dir
* You could switch to the new table frame with
* the "cwf table" command, if interested. To switch
* back, type "cwf default".
*
* STEP 1B - print out your first row
*
frame post table /// <--TRIPLE SLASH
("Variable name") /// col1
("N") /// col2
("Beta") /// col3
("95% CI") /// col4
("P-value") // <--DOUBLE SLASH col5
*
* You can repeat this step as many times as you want
* to add as many rows of custom text as needed.
* Note that if you want a blank column, you need
* to still include the quotations within the parentheses.
* eg, if you didn't want the first column to have
* "variable name", you'd put this instead:
* ("") /// col1
*
* If you wanted to flip over to the table frame and
* see what's there, you'd type:
*cwf table
*bro
*
* ...and to flip back to the default frame, type:
*cwf default
*
* STEP 2 - run your regression and look where the
* coefficients of interest are stored
* and
* STEP 3 - saving components of regression as local
* macros
*
regress price weight
*
ereturn list
* The N is here under e(N), save that as a local macro
local n1_1 = e(N)
*
matrix list r(table)
* The beta is at [1,1], the 95% thresholds
* are at [5,1] and [6,1], the se is
* at [2,1], and the p-value is at [4,1].
* We weren't planning on using se in this table,
* but we'll grab it anyway in case we change our minds
local beta1_1 = r(table)[1,1]
local ll1_1 = r(table)[5,1]
local ul1_1 = r(table)[6,1]
local se1_1 = r(table)[2,1]
local p1_1 = r(table)[4,1]
*
* STEP 4 - Making "display" macros
*
* We are going to use local macros to grab things
* by column with a "display" command. Note that
* column 1 is name, which we are going to call "all" here.
* You could easily grab the variable name here with
* the "local lab:" command detailed here:
* https://www.stata.com/statalist/archive/2002-11/msg00087.html
* or:
*local label_name: variable label foreign
* ...then call `label_name' instead of "all"
*
* Now this is very important, we are not going to just
* capture these variables as local macros, we are going
* to capture a DISPLAY command followed by how we
* want the text to be rendered in the table.
* Immediately after defining the local macro, we are going
* to call the macro so we can see what it will render
* as in the table. This is what it will look like:
*local col1 di "All"
*`col1'
*
* IF YOU HAVE A BLANK CELL IN THIS ROW OF YOUR TABLE,
* USE TWO EMPTY QUOTATION MARKS AFTER THE "di" COMMAND, eg:
* local col1 di ""
*
* now we will grab all pieces of the first row,
* column-by-column, at the same time:
local col1 di "All" // name
`col1'
local col2 di `n1_1' // n
`col2'
local col3 di %4.2f `beta1_1' // betas
`col3'
local col4 di %4.2f `ll1_1' " to " %4.2f `ul1_1' // 95% ci
`col4'
local col5 di %4.3f `p1_1' // p-value
`col5'
*
* note for the P-value, you can get much fancier in
* formatting, see my script on how to do that here:
* https://blog.uvm.edu/tbplante/2022/10/26/formatting-p-values-for-stata-output/
*
* STEP 5 - Posting the display macros to the table frame
*
* Now post all columns row-by-row to the table frame
frame post table ///
("`: `col1''") ///
("`: `col2''") ///
("`: `col3''") ///
("`: `col4''") ///
("`: `col5''") // <-- DOUBLE SLASH
*
* Bonus! Insert a text row
*
frame post table /// <--TRIPLE SLASH
("By domestic vs. foreign status") /// col1
("") /// col2
("") /// col3
("") /// col4
("") // <--DOUBLE SLASH col5
* Now repeat by foreign status
* domestic
regress price weight if foreign ==0
ereturn list
local n1_1 = e(N)
*
matrix list r(table)
local beta1_1 = r(table)[1,1]
local ll1_1 = r(table)[5,1]
local ul1_1 = r(table)[6,1]
local se1_1 = r(table)[2,1]
local p1_1 = r(table)[4,1]
*
* note: you could automate col1 with the following command,
* which would grab the label from the foreign==0 value of
* foreign:
*local label_name: label foreign 0
* ... then call `label_name' in the "local col1 di".
local col1 di "  Domestic" // name, with 2 space indent
`col1'
local col2 di `n1_1' // n
`col2'
local col3 di %4.2f `beta1_1' // betas
`col3'
local col4 di %4.2f `ll1_1' " to " %4.2f `ul1_1' // 95% ci
`col4'
local col5 di %4.3f `p1_1' // p-value
`col5'
*
frame post table ///
("`: `col1''") ///
("`: `col2''") ///
("`: `col3''") ///
("`: `col4''") ///
("`: `col5''") // <-- DOUBLE SLASH
*
* foreign
regress price weight if foreign ==1
ereturn list
local n1_1 = e(N)
*
matrix list r(table)
local beta1_1 = r(table)[1,1]
local ll1_1 = r(table)[5,1]
local ul1_1 = r(table)[6,1]
local se1_1 = r(table)[2,1]
local p1_1 = r(table)[4,1]
*
local col1 di "  Foreign" // name, with 2 space indent
`col1'
local col2 di `n1_1' // n
`col2'
local col3 di %4.2f `beta1_1' // betas
`col3'
local col4 di %4.2f `ll1_1' " to " %4.2f `ul1_1' // 95% ci
`col4'
local col5 di %4.3f `p1_1' // p-value
`col5'
*
frame post table ///
("`: `col1''") ///
("`: `col2''") ///
("`: `col3''") ///
("`: `col4''") ///
("`: `col5''") // <-- DOUBLE SLASH
*
* STEP 6 - export the table frame as an excel file
*
* This is the present working directory, where this excel
* file will be saved if you don't specify a directory:
pwd
*
* Switch to the table frame and take a look at it:
cwf table
bro
* Now export it to excel
export excel using "table.xlsx", replace
*
* That's it!
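Once the long version above makes sense, the repeated blocks can be collapsed into a loop. What follows is a rough sketch of that idea, not a drop-in replacement: the frame name looptable and the three-column layout are made up for this example, and it pulls the row name from the value label of foreign with the –: label– extended macro function.

```stata
* Sketch only: the same frame post approach, looping over subgroups
frames reset
sysuse auto, clear
* Hypothetical 3-column table frame, all long strings
frame create looptable strL col1 strL col2 strL col3
frame post looptable ("Subgroup") ("N") ("Beta")
* Loop over the two values of foreign (0 = domestic, 1 = foreign)
forvalues f = 0/1 {
    quietly regress price weight if foreign == `f'
    local n = e(N)
    local beta = r(table)[1,1]
    * grab the value label for this level of foreign
    local name : label (foreign) `f'
    * one row per pass: indented name, N, formatted beta
    frame post looptable ("  `name'") ("`n'") ("`: display %4.2f `beta''")
}
* Look at the result without switching frames
frame looptable: list
```

The tradeoff is that a loop is harder to debug than the spelled-out version, so I'd get one row working by hand first.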
I recently had a dataset with two groups (0 or 1), and a continuous variable. I wanted to show how the overall deciles of that continuous variable varied by group. Step 1 was to generate an overall decile variable with an –xtile– command. Step 2 was to make a frequency histogram. BUT! I wanted these histograms to overlap and not be side-by-side. Stata’s handy –histogram– is a quick and easy way to make histograms by groups using the –by– command, but it makes them side-by-side like this, and not overlapping. (Note: see how to use –twoway histogram– to make overlapping histograms at the end of this post.)
I instead used a –collapse– command to generate a count of observations in each decile by group, then drew overlapping bars with transparent colors. Transparency is specified as the color, a percent sign, then a number (e.g., red%40), like this:
Here’s the code to make both:
clear all
// make fake data
set obs 1000
set seed 8675309
gen id=_n // ID of 1 through 1000
gen var0or1 = round(runiform())
gen continuousvalue = 100*runiform()
// make overall deciles of continuousvalue
xtile decilesbygroup = continuousvalue, nq(10)
// now make a frequency histogram of those deciles
set scheme s1color // I like this scheme
hist decilesbygroup, by(var0or1) frequency bin(10)
// make a variable equal to 1 that we will sum in collapse
gen countbygroup = 1
// now sum that variable by the 0 or 1 indicator and deciles
collapse (sum) countbygroup, by(var0or1 decilesbygroup)
// now render the count from above as a bar graph:
set scheme s1color // I like this scheme
twoway ///
(bar countbygroup decilesbygroup if var0or1==0, vertical color(red%40)) ///
(bar countbygroup decilesbygroup if var0or1==1, vertical color(blue%40)) ///
, ///
legend(order(1 "var0or1==0" 2 "var0or1==1")) ///
title("Title!") ///
xtitle("Decile of continuousvalue") ///
xla(1(1)10) ///
yla(0(10)70, angle(0)) ///
ytitle("N in Decile")
You could also offset the deciles by the var0or1 and shrink the bar width a bit to get a frequency histogram where the bars are next to each other, like this:
clear all
// make fake data
set obs 1000
set seed 8675309
gen id=_n // ID of 1 through 1000
gen var0or1 = round(runiform())
gen continuousvalue = 100*runiform()
// make overall deciles of continuousvalue
xtile decilesbygroup = continuousvalue, nq(10)
// now make a frequency histogram of those deciles
set scheme s1color // I like this scheme
hist decilesbygroup, by(var0or1) frequency bin(10)
// offset the decilesbygroup by var0or1 a bit:
replace decilesbygroup = decilesbygroup - 0.2 if var0or1==0
replace decilesbygroup = decilesbygroup + 0.2 if var0or1==1
// make a variable equal to 1 that we will sum in collapse
gen countbygroup = 1
// now sum that variable by the 0 or 1 indicator and deciles
collapse (sum) countbygroup, by(var0or1 decilesbygroup)
// now render the count from above as a bar graph:
set scheme s1color // I like this scheme
twoway ///
(bar countbygroup decilesbygroup if var0or1==0, vertical color(red%40) barwidth(0.4)) ///
(bar countbygroup decilesbygroup if var0or1==1, vertical color(blue%40) barwidth(0.4)) ///
, ///
legend(order(1 "var0or1==0" 2 "var0or1==1")) ///
title("Title!") ///
xtitle("Decile of continuousvalue") ///
xla(1(1)10) ///
yla(0(10)70, angle(0)) ///
ytitle("N in Decile")
A few quick notes here: The way that I am specifying the “bins” for the histograms here is different than how Stata specifies bins for histograms, since I’m forcing it to render by decile. If you were to generate a histogram of “continuousvalue” instead of “decilesbygroup” as in the above example, you’ll notice that the resulting histograms look a bit different from each other:
clear all
// make fake data
set obs 1000
set seed 8675309
gen id=_n // ID of 1 through 1000
gen var0or1 = round(runiform())
gen continuousvalue = 100*runiform()
// make overall deciles of continuousvalue
xtile decilesbygroup = continuousvalue, nq(10)
// now make a frequency histogram of those deciles
set scheme s1color // I like this scheme
hist decilesbygroup, title("hist decilesbygroup") by(var0or1) frequency bin(10) name(a)
hist continuousvalue, title("hist continuousvalue") by(var0or1) frequency bin(10) name(b)
Also, this code will only render frequency histograms, not density histograms, which are the default in Stata. You can also use the –twoway hist– command to overlay two bar graphs, but these might not perfectly align with the deciles. But, using the –twoway hist– allows you to use density histograms instead. See the example that follows. I suspect that most people will get what they need with the –twoway hist– command in Stata.
clear all
// make fake data
set obs 1000
set seed 8675309
gen id=_n // ID of 1 through 1000
gen var0or1 = round(runiform())
gen continuousvalue = 100*runiform()
set scheme s1color // I like this scheme
twoway ///
(hist continuousvalue if var0or1==0, bin(10) color(red%40) density) ///
(hist continuousvalue if var0or1==1, bin(10) color(blue%40) density) ///
, ///
legend(order(1 "var0or1==0" 2 "var0or1==1")) ///
title("Title!") ///
xtitle("Grouping in 10 Bins")
Stata has some handy built-in features to make boxplots. If you are trying to make a typical boxplot in Stata, go read up on the –graph box– command as described on this post.
I was asked to abstract a boxplot from an old paper and re-render it in Stata.
I used the excellent WebPlotDigitizer to abstract the points in this figure. It took me a few rounds of data abstraction to get all of the data. I generated the following variables:
(1) “row” – rows 1-8 — the “A-D” labels on the x axis are offset at 1.5, 3.5, 5.5, and 7.5, respectively,
(2) “group” – an indicator for group 1 or 2,
(3-5) “boxlow” “boxmid” and “boxhigh” – the lower, mid, and upper bounds of the box,
(6-7) “bar1” and “bar2” – the low and upper end of the bars, and
(8) “dot” – extreme values.
I used “input” at the top of a do file to load these data. The colors were taken from ColorBrewer2. The “boxes” are just really wide pcspike lines with overlaying horizontal bars as floating text boxes to indicate the bottom, median, and top of the box. One problem is the placement of the horizontal lines on and above/below the box itself, since the “―” character isn’t perfectly in the middle of the vertical space that is used to render it. To get around this, I included an offset value that could be modified to shift these lines up and down so they lined up with the boxes. The “whiskers” are an rcap. The extreme values are just scatterplot dots.
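Here is a stripped-down sketch of that approach so you can see how the pieces fit together. The numbers are made up for illustration and are NOT the abstracted data, there are only 4 rows instead of 8, and it only draws the median lines (not the lines at the bottom and top of each box). The two colors are RGB values from ColorBrewer's Dark2 palette, and the offset value is a guess that you'd tune by eye.

```stata
* Sketch only: a boxplot built from summary values
clear
* Made-up numbers, NOT the abstracted data
input row group boxlow boxmid boxhigh bar1 bar2 dot
1 1 20 30 40 10 55 70
2 2 25 35 50 15 60 .
3 1 30 38 45 22 58 .
4 2 18 28 36 12 50 65
end
* Nudge for the median lines, tune by eye so they sit on the box
local offset = 0.5
twoway ///
    (rcap bar1 bar2 row, lcolor(black)) /// whiskers
    (pcspike boxlow row boxhigh row if group==1, lwidth(vvvthick) lcolor("27 158 119")) ///
    (pcspike boxlow row boxhigh row if group==2, lwidth(vvvthick) lcolor("217 95 2")) ///
    (scatter dot row, msymbol(o) mcolor(black)) /// extreme values
    , ///
    text(`=boxmid[1]+`offset'' 1 "―", placement(c)) ///
    text(`=boxmid[2]+`offset'' 2 "―", placement(c)) ///
    text(`=boxmid[3]+`offset'' 3 "―", placement(c)) ///
    text(`=boxmid[4]+`offset'' 4 "―", placement(c)) ///
    xlabel(1.5 "A" 3.5 "B", noticks) ///
    xscale(range(0.5 4.5)) ///
    xtitle("") ///
    legend(off)
```

The whiskers are drawn first so the thick pcspike "boxes" sit on top of them.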
I was the analyst for the myPACE trial (published here in JAMA Cardiology), and needed to put together a subgroup analysis figure. I didn’t find any helpful stock code, so I wrote my own. The code uses the Frames feature that was introduced in Stata 16. It will (a) make a new frame with the required variables but no data, (b) generate a new dichotomous variable from continuous variables, (c) generate labels, (d) grab the Ns, point estimates, and 95% CI from a logistic regression, (e) grab the P-value for interaction between the primary exposure and the subgroup*, (f) write the point estimate/95% CI/P-value for interaction to the new frame, then (g) switch to the new frame and make this figure. This script uses local macros so needs to be run all at once in a do file, not line by line.
This uses a stock Stata dataset called “catheter”. The outcome of interest/dependent variable is “infect”, the primary exposure/independent variable of interest is “time”, and the subgroups are age, sex, and patient number. This uses logistic regression, but you can easily swap this model out for any other model.
*You can get more complex code to format the P-values here.
Here’s the code!
frame reset // drop all existing frames and data
webuse catheter, clear // load analytical dataset
version 16 // need Stata version 16 or newer
*
* Make an empty frame with the variables we'll add later row by row
* The variables "rowname" and "pval" will be strings so when
* you add to these variables with the --frame post-- command,
* you need to use quotes.
frame create subgroup str30 rowname n beta low95 high95 str30 pval
*
*** Age
* Need to generate a dichotomous age variable
* You don't need to do this if the variable is already dichotomous,
* ordinal, or nominal
generate agesplit = (age>=50) if !missing(age) // below 50 is 0, 50 and above is 1
*
* Now generate a label for the overall grouping
local label_group "{bf:Age}"
*
* Group 0, below age 50
* Generate label for this subgroup
local label_subgroup_0 "Under 50y"
* Now run the model for this subgroup
logistic infect time if agesplit==0
* Now save the N, beta, and 95% CI as local macros.
* There's lots you can save after a regression, type --return list--,
* --ereturn list--, and --matrix list r(table)-- to see what's there
local n_0 = e(N)
local beta_0 = r(table)[1,1]
local low95_0 = r(table)[5,1]
local high95_0 = r(table)[6,1]
* print above local macros to prove you collected them correctly:
di "For `label_subgroup_0', n=" `n_0' ", beta (95% CI)=" %4.2f `beta_0' " (" %4.2f `low95_0' " to " %4.2f `high95_0' ")"
*
* Group 1, at least 50 years old
local label_subgroup_1 "At least 50y"
logistic infect time if agesplit==1
local n_1 = e(N)
local beta_1 = r(table)[1,1]
local low95_1 = r(table)[5,1]
local high95_1 = r(table)[6,1]
di "For `label_group', subgroup `label_subgroup_1', n=" `n_1' ", beta (95% CI)=" %4.2f `beta_1' " (" %4.2f `low95_1' " to " %4.2f `high95_1' ")"
*
* Now run the model with an interaction term between the primary exposure
* and the subgroup.
logistic infect c.time##i.agesplit
* Grab the p-value as a local macro saved as pval
local pval = r(table)[4,5]
* Print that local macro to see that you've grabbed it correctly
di "P-int = " %4.3f `pval'
* Format that P-value and save that as a new local macro called pvalue
local pvalue "P=`: display %3.2f `pval''"
di "`pvalue'"
*
* Now write your local macros to the "subgroup" frame
* Each "frame post" command will add 1 additional row to the frame.
* We will graph these line by line.
* First line with overall group name and a P-value for interaction:
frame post subgroup ("`label_group'") (.) (.) (.) (.) ("`pvalue'")
* Now each subgroup by itself:
frame post subgroup ("`label_subgroup_0'") (`n_0') (`beta_0') (`low95_0') (`high95_0') ("")
frame post subgroup ("`label_subgroup_1'") (`n_1') (`beta_1') (`low95_1') (`high95_1') ("")
* Optional blank line:
frame post subgroup ("") (.) (.) (.) (.) ("")
*
*** Female sex
* This is already dichotomous, so don't need to create a new variable
* like we did for age.
local label_group "{bf:Sex}"
*group 0, males
local label_subgroup_0 "Males"
logistic infect time if female==0
local n_0 = e(N)
local beta_0 = r(table)[1,1]
local low95_0 = r(table)[5,1]
local high95_0 = r(table)[6,1]
*group 1, females
local label_subgroup_1 "Females"
logistic infect time if female==1
local n_1 = e(N) // N
local beta_1 = r(table)[1,1]
local low95_1 = r(table)[5,1]
local high95_1 = r(table)[6,1]
*interaction P-value
logistic infect c.time##i.female
local pval = r(table)[4,5]
local pvalue "P=`: display %3.2f `pval''"
*write to subgroup frame
frame post subgroup ("`label_group'") (.) (.) (.) (.) ("`pvalue'")
frame post subgroup ("`label_subgroup_0'") (`n_0') (`beta_0') (`low95_0') (`high95_0') ("")
frame post subgroup ("`label_subgroup_1'") (`n_1') (`beta_1') (`low95_1') (`high95_1') ("")
frame post subgroup ("") (.) (.) (.) (.) ("")
*
*** patient
* need to generate a patient dichotomous variable
generate patientsplit = (patient>=20) if !missing(patient) // below 20 is 0, 20 and above is 1
local label_group "{bf:Patient}"
*group 0, below 20
local label_subgroup_0 "Under 20th patient"
logistic infect time if patientsplit==0
local n_0 = e(N)
local beta_0 = r(table)[1,1]
local low95_0 = r(table)[5,1]
local high95_0 = r(table)[6,1]
*group 1, 20 and above
local label_subgroup_1 "At least the 20th patient"
logistic infect time if patientsplit==1
local n_1 = e(N)
local beta_1 = r(table)[1,1]
local low95_1 = r(table)[5,1]
local high95_1 = r(table)[6,1]
*interaction P-value
logistic infect c.time##i.patientsplit
local pval = r(table)[4,5]
local pvalue "P=`: display %3.2f `pval''"
*write to subgroup frame
frame post subgroup ("`label_group'") (.) (.) (.) (.) ("`pvalue'")
frame post subgroup ("`label_subgroup_0'") (`n_0') (`beta_0') (`low95_0') (`high95_0') ("")
frame post subgroup ("`label_subgroup_1'") (`n_1') (`beta_1') (`low95_1') (`high95_1') ("")
frame post subgroup ("") (.) (.) (.) (.) ("")
*
*** Now make the figure. You'll have to modify this so the number of rows
* in your subgroup frame matches the labels and whatnot
set scheme s1mono // I like this scheme
* Change frame to the subgroup frame
cwf subgroup
* Generate a row number by the current order of the data in this frame
gen row=_n
* Here's the code to make the figure
twoway ///
(scatter row beta, msymbol(d) mcolor(black) msize(medium)) ///
(rcap low95 high95 row, horizontal lcolor(black) lwidth(medlarge)) ///
, ///
legend(off) ///
xline(1, lcolor(red) lpattern(dash) lwidth(medium)) ///
title("Title") ///
yti("Y Title") ///
xti("X Title") ///
yscale(reverse) ///
yla( ///
1 "`=rowname[1]'" ///
2 "`=rowname[2]', n=`=n[2]'" ///
3 "`=rowname[3]', n=`=n[3]'" ///
4 " " /// blank since it's a blank row
5 "`=rowname[5]'" ///
6 "`=rowname[6]', n=`=n[6]'" ///
7 "`=rowname[7]', n=`=n[7]'" ///
8 " " /// blank since it's a blank row
9 "`=rowname[9]'" ///
10 "`=rowname[10]', n=`=n[10]'" ///
11 "`=rowname[11]', n=`=n[11]'" ///
12 " " /// blank since it's a blank row
, angle(0) labsize(small) noticks) ///
xla(0.8(.2)2.2) ///
text(1 1.1 "`=pval[1]'", placement(e) size(small)) /// these are the p-value labels
text(5 1.1 "`=pval[5]'", placement(e) size(small)) ///
text(9 1.1 "`=pval[9]'", placement(e) size(small))
*
* Now export your figure as a PNG file
graph export "myfigure.png", replace width(1000)
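One fragile spot in the code above is grabbing the interaction P-value by position with r(table)[4,5], since the column number changes if you add covariates or use a subgroup with more than 2 levels. Here is a sketch of an alternative that names the term directly with Stata's –test– command. This is a swapped-in technique, not what the script above does. For a single coefficient, the Wald test P-value from –test– should match the P-value shown in the regression table.

```stata
* Sketch only: getting the interaction P-value by name
webuse catheter, clear
generate agesplit = (age>=50) if !missing(age)
logistic infect c.time##i.agesplit
* Eyeball the column names so you know what the term is called
matrix list r(table)
* Wald test of the interaction term, P-value is saved in r(p)
test 1.agesplit#c.time
local pval = r(p)
di "P-int = " %4.3f `pval'
```

You'd still format `pval' into the pvalue macro the same way as in the script above.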