Using Stata and Graphviz to make social network graphs and hierarchical graphs

I recently had to make a figure that showed the relationships between variables. I tried a few different software packages and ultimately decided that Graphviz was the easiest. Thanks to dreampuf for their online Graphviz program! I used this web-based implementation and didn’t have to install Graphviz on my computer. So for this, we’ll be using this online Graphviz package: https://dreampuf.github.io/GraphvizOnline

I wrote a basic Stata script that inputs data and then outputs Graphviz code that you can copy and paste right into the Graphviz website above. (I strongly, strongly, strongly recommend saving your Graphviz code locally on your computer as a text file. On Windows I use Notepad++. Don’t save this code in a word processor because it will do unpredictable things to the quotes.) You can then tweak the settings of the outputted Graphviz code to your liking. See all sorts of settings in the left-hand menu here: https://graphviz.org/docs/nodes/
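For example, here’s a small sketch of the kinds of tweaks you might make in the generated code; the attribute names come from the Graphviz docs, and the values here are arbitrary:

```dot
digraph g {
    rankdir=LR;                            // lay the hierarchy out left to right
    node [shape=box, fontsize=8];          // default settings applied to every node
    ant -> bat [style=dashed, color=gray]; // per-edge styling
}
```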

Originally I wanted this to be a network graph (“neato”) but ultimately liked how it looked best in the hierarchical graph (“dot”). You can change between graph types using the engine dropdown menu on the top right of the Graphviz online website. You can also change the file type to something that journals will use, like PNG, on the top right.

Code to output Graphviz code from your Stata database

Run the following in its entirety from a Stata do file.

clear all // clear memory
//
// Now input variables called 'start' and 'end#'.
// 'start' is the originating node and 'end#' is
// every node that 'start' connects to.
// If you need additional 'end#' variables, just add them
// using strL (capital L) then the next 'end#' number.
// In this example, there's 1 'start' and 4 'end#'
// so there are 5 total columns. 
input strL start strL end1 strL end2 strL end3 strL end4
"ant" "bat" "cat" "dog" "fox"
"bat" "ent" "" "" ""
"cat" "ent" "fox" "" ""
"dog" "ent" "fox" "" ""
end // end input
// 
// The following code reshapes the data from wide
// to long and drops the blank rows created
// by that rotation (if any).
gen row = _n //make row variable
reshape long end, i(row) j(nodenum) // reshape
drop if end=="" // drop empty cells
keep start end // just need these variables
//
// The following loop renders the current Stata
// dataset as Graphviz code. Since this uses loops
// and local macros, it needs to be run all at once 
// in a do file rather than line by line. 
local max = _N
quietly {
	// Start of graphviz code:
	noisily di ""
	noisily di ""
	noisily di ""
	noisily di ""
	noisily di "// Copy everything that follows"
	noisily di "// and paste it into here:"
	noisily di "// https://dreampuf.github.io/GraphvizOnline"
	noisily di "digraph g {"
	//
	// This prints out the connection between
	// each 'start' node and all connected
	// 'end#' nodes one by one.
	forvalues n = 1/`max' {
		noisily di start[`n'] " -> " end[`n'] ";"
	}
	//
	// Global graph attributes follow.
	// "bb" sets the size of the figure
	// from lower left x, y, then upper right x, y.
	// There are lots of other settings here: 
	// https://graphviz.org/docs/graph/
	// ...if adding more, just add between the final
	// comma and closing bracket. If adding several
	// additional settings here, separate each with 
	// a comma. 
	// Note that this has an opening and closing tick
	// so the quotes inside print like characters
	// and not actual stata code quotes.
	noisily di `"graph [bb="0,0,100,1000",];"' 
	//
	// The next block generates code to render each
	// node. First, we need to reshape long(er) so that
	// all of the 'start' and 'end#' variables are all
	// in a single column, delete duplicates, and 
	// sort them. 
	rename start thing1
	rename end thing2
	gen row = _n
	reshape long thing, i(row) j(nodenum) 
	keep thing
	duplicates drop
	sort thing
	//
	// Now print out settings for each node. These
	// can be fine tuned. Lots of options for 
	// node formatting here: 
	// https://graphviz.org/docs/nodes/
	local max = _N
	forvalues n= 1/`max' {
		noisily di thing[`n'] `" [width="0.1", height="0.1", fontsize="8", shape=box];"'
	}
	// End of graphviz code: 
	noisily di "}"
	noisily di "// don't copy below this line"
	noisily di ""
	noisily di ""
	noisily di ""
	noisily di ""
}
// that's it!

The above Stata code prints the following Graphviz code in the Stata output window. This code can be copied and pasted into the Graphviz website linked above. (Make sure to save a backup of this Graphviz code as a txt file on your computer!!) Make sure your Stata window is full size before running the above Stata code, or it might insert line breaks that you have to delete manually, since the output width is (usually) determined by the window size. Likewise, if your node settings get long, Stata will insert line breaks that you’ll have to delete manually.
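One way to sidestep the line-wrapping issue, I believe, is to widen Stata’s output line length before running the do file above (the maximum allowed value varies by Stata version):

```stata
set linesize 255 // widen the output so long Graphviz lines don't wrap
```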





// Copy everything that follows
// and paste it into here:
// https://dreampuf.github.io/GraphvizOnline
digraph g {
ant -> bat;
ant -> cat;
ant -> dog;
ant -> fox;
bat -> ent;
cat -> ent;
cat -> fox;
dog -> ent;
dog -> fox;
graph [bb="0,0,100,1000",];
ant [width="0.1", height="0.1", fontsize="8", shape=box];
bat [width="0.1", height="0.1", fontsize="8", shape=box];
cat [width="0.1", height="0.1", fontsize="8", shape=box];
dog [width="0.1", height="0.1", fontsize="8", shape=box];
ent [width="0.1", height="0.1", fontsize="8", shape=box];
fox [width="0.1", height="0.1", fontsize="8", shape=box];
}
// don't copy below this line






Example figures from outputted code above using the different Graphviz engines

The “engine” dropdown in the top right toggles between the figures below. You can learn more about these here: https://graphviz.org/docs/layouts/

Not shown below are “nop” and “nop2”, which don’t render correctly for unclear reasons. Some of these layouts will need to be tweaked to be publication quality, and some frankly don’t work with this dataset. For this made-up dataset, I think dot and neato look great!

Dot (hierarchical or layered drawing of directed graphs, my favorite for this project):

Neato (a nice network graph, called “spring model” layout):

Circo, aka circular layout:

fdp (force-directed placement):

sfdp (scalable force-directed placement):

twopi (radial layout):

osage (clustered graphs):

Patchwork (clustered graph using squarified treemap):

Creating a desktop shortcut to Stata in Ubuntu Linux

I’m a Linux novice and installed Stata 18 on my new Ubuntu 24.10 dual-boot laptop. But! Stata doesn’t show up as an installed program in my launcher. I found the installed files, including the executable for the GUI-based version of Stata 18 (“xstata-se”), in the /usr/local/stata18/ folder. I wanted to make a desktop shortcut to that folder, but there doesn’t seem to be an option to make a shortcut from the Ubuntu file manager. Instead, I followed the directions from the user ‘forester’ here and did it from the terminal.

Pop open your terminal by pressing Ctrl+Alt+T and plop in the text that follows, substituting your user name where indicated:

sudo ln -s /usr/local/stata18/ /home/[your user name]/Desktop  

After you enter your password, a new link should appear on your desktop.

FYI: If you are using a different version of Stata, the number will be different for the Stata folder (e.g., stata19).

Also, I had tried to create a desktop link to the xstata-se file itself, but clicking that link wouldn’t run Stata. Popping open the parent folder that the executable lives in is pretty close, so I’ll stick with it.
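If you do want a clickable launcher for the GUI itself, a .desktop entry is the usual Ubuntu mechanism. This is a hypothetical sketch using the paths above (the file name and the Name field are my own inventions):

```
# Save as /home/[your user name]/.local/share/applications/stata18.desktop
[Desktop Entry]
Type=Application
Name=Stata 18
Exec=/usr/local/stata18/xstata-se
Terminal=false
```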

Part 8: Regressions

There are all sorts of models out there for performing regressions. This page focuses on three that are used for binary outcomes:

  1. Logistic regression
  2. Modified Poisson regression
  3. Cox proportional hazards models.

Getting set up

Before you get going, you want to explicitly define your outcome of interest (aka dependent variable), primary exposure (aka independent variable), and covariates that you are adjusting for in your model (aka other independent variables). You’ll also want to know your study design (eg case-control? cross-sectional? prospective? time to event?).

Are your independent variables continuous or factor/categorical?

In these regression models, you can specify that independent variables (the primary exposure and covariates in your model) are continuous or factor/categorical. Continuous variables are specified with a leading “c.”, and examples might include age, height, or length (e.g., “c.age”). In contrast, factor/categorical variables might be New England states (e.g., 1=Connecticut, 2=Maine, 3=Mass, 4=New Hampshire, 5=RI, and 6=Vermont). So, you’d want to specify a variable for “newengland” as i.newengland in your regression. Stata defaults to treating the smallest number (here, 1=Connecticut) as the reference group. You can change that by using the “ib#.” prefix instead, where # is the reference group; here, ib2.newengland would make Maine the reference group.

Read about factor variables in –help fvvarlist–.

What about binary/dichotomous variables (things that are 0 or 1)? Well, it doesn’t change your analysis if you specify a binary/dichotomous variable as either continuous (“c.”) or factor/categorical (“i.”). The math is the same on the back end.
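As a runnable sketch of these prefixes using Stata’s built-in auto dataset (where rep78 is a 1–5 categorical variable):

```stata
sysuse auto, clear
regress price c.mpg i.rep78   // rep78 as factor/categorical; level 1 is the reference
regress price c.mpg ib3.rep78 // same model, but now level 3 is the reference group
```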

Checking for interactions

In general, when checking for an interaction, you will need to specify whether the two variables of interest are continuous or factor/categorical and drop two pound signs in between them. See details in –help fvvarlist–. Here’s an example of how this would look, checking for an interaction between sex and age group.

regress bp i.sex##c.agegrp

      Source |       SS           df       MS      Number of obs   =       240
-------------+----------------------------------   F(3, 236)       =     23.86
       Model |   9519.9625         3  3173.32083   Prob > F        =    0.0000
    Residual |  31392.8333       236   133.02048   R-squared       =    0.2327
-------------+----------------------------------   Adj R-squared   =    0.2229
       Total |  40912.7958       239  171.183246   Root MSE        =    11.533

------------------------------------------------------------------------------
          bp | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         sex |
     Female  |     -4.275   3.939423    -1.09   0.279    -12.03593    3.485927
      agegrp |     7.0625   1.289479     5.48   0.000      4.52214     9.60286
             |
sex#c.agegrp |
     Female  |      -1.35   1.823599    -0.74   0.460    -4.942611    2.242611
             |
       _cons |   143.2667   2.785593    51.43   0.000     137.7789    148.7545
------------------------------------------------------------------------------

You’ll see the sex#c.agegrp P-value is 0.460, so that wouldn’t qualify as a statistically significant interaction.
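If the interaction involved a multi-level factor, there would be several interaction terms, and a joint test is handy; I believe –testparm– does this after the fit:

```stata
// joint Wald test of all sex-by-age-group interaction term(s)
testparm i.sex#c.agegrp
```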

Regressions using survey data?

If you are using survey analyses (eg you need to account for pweighting), you generally have to use the svy: prefix for your analyses. This includes svy: logistic, svy: poisson, etc. Type –help svy_estimation– to see what the options are.

Logistic regression

Logistic regressions provide odds ratios for binary outcomes. Odds ratios don’t approximate the risk ratio if the outcome is common (eg >10% occurrence), so I tend to avoid them, since I study hypertension, which occurs commonly in a population.

There are oodles of details on logistic regression on the excellent UCLA website. In brief, you can use “logit” to get the raw betas or “logistic” to get the odds ratios. Most people will want to use “logistic”.

Here’s an example of logistic regression in Stata. In this dataset, “race” is White/Black/other, so you’ll need to specify it as a factor/categorical variable with the “i.” prefix. However, “smoke” is binary, so you can specify it as either continuous or factor/categorical. If you don’t specify anything, it is treated as continuous, which is fine.


webuse lbw, clear
codebook race // see race is 1=White, 2=Black, 3=other
codebook smoke // smoke is 0/1
logistic low i.race smoke 

// output: 
Logistic regression                                     Number of obs =    189
                                                        LR chi2(3)    =  14.70
                                                        Prob > chi2   = 0.0021
Log likelihood = -109.98736                             Pseudo R2     = 0.0626

------------------------------------------------------------------------------
         low | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        race |
      Black  |   2.956742   1.448759     2.21   0.027     1.131716    7.724838
      Other  |   3.030001   1.212927     2.77   0.006     1.382616     6.64024
             |
       smoke |   3.052631   1.127112     3.02   0.003     1.480432    6.294487
       _cons |   .1587319   .0560108    -5.22   0.000     .0794888    .3169732
------------------------------------------------------------------------------
Note: _cons estimates baseline odds.

So, here you see that the odds ratio (OR) for the outcome of “low” is 2.96 (95% CI 1.13, 7.72) for Black relative to White participants and the OR for the outcome of “low” is 3.03 (95% CI 1.38, 6.64) for Other race relative to White participants. Since it’s the reference group, you’ll notice that White race isn’t shown. But what if we want Black race as the reference? You’d use “ib2.race” instead of “i.race”. Example:

logistic low ib2.race smoke 
// output:

Logistic regression                                     Number of obs =    189
                                                        LR chi2(3)    =  14.70
                                                        Prob > chi2   = 0.0021
Log likelihood = -109.98736                             Pseudo R2     = 0.0626

------------------------------------------------------------------------------
         low | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        race |
      White  |   .3382101   .1657178    -2.21   0.027     .1294525    .8836136
      Other  |   1.024777   .5049663     0.05   0.960     .3901157    2.691938
             |
       smoke |   3.052631   1.127112     3.02   0.003     1.480432    6.294487
       _cons |   .4693292   .2059043    -1.72   0.085     .1986269    1.108963
------------------------------------------------------------------------------
Note: _cons estimates baseline odds.


Modified Poisson regression

Modified Poisson regression (sometimes called Poisson regression with robust variance estimation or Poisson regression with sandwich variance estimation) is pretty straightforward. Note: this is different from ‘conventional’ Poisson regression, which is used for counts, not dichotomous outcomes. You use the “poisson” command with the options “, vce(robust) irr” to fit the modified Poisson model. Note: with svy data, robust VCE is the default, so you just need the “, irr” option.

As with the logistic regression section above, race is a 3-level nominal variable so we’ll use the “i.” or “ib#.” prefix to specify that it’s not to be treated as a continuous variable.

webuse lbw, clear
codebook race // see race is 1=White, 2=Black, 3=other
codebook smoke // smoke is 0/1
// set Black race as the reference group
poisson low ib2.race smoke, vce(robust) irr
// output:

Iteration 0:  Log pseudolikelihood = -122.83059  
Iteration 1:  Log pseudolikelihood = -122.83058  

Poisson regression                                      Number of obs =    189
                                                        Wald chi2(3)  =  19.47
                                                        Prob > chi2   = 0.0002
Log pseudolikelihood = -122.83058                       Pseudo R2     = 0.0380

------------------------------------------------------------------------------
             |               Robust
         low |        IRR   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        race |
      White  |    .507831    .141131    -2.44   0.015     .2945523    .8755401
      Other  |   1.038362   .2907804     0.13   0.893     .5997635      1.7977
             |
       smoke |   2.020686   .4300124     3.31   0.001     1.331554    3.066471
       _cons |   .3038099   .0800721    -4.52   0.000     .1812422    .5092656
------------------------------------------------------------------------------
Note: _cons estimates baseline incidence rate.

Here you see that the relative risk (RR) of “low” for White relative to Black participants is 0.51 (95% CI 0.29, 0.88), and the RR for other race relative to Black participants is 1.04 (95% CI 0.60, 1.80). As above, the “Black” group isn’t shown since it’s the reference group.

Cox proportional hazards model

Cox PH models are used for time-to-event data. This is a 2-part process. First, –stset– the data, thereby telling Stata that it’s time-to-event data; in the –stset– command, you specify the outcome/failure variable. Second, use the –stcox– command to specify the primary exposure/covariates of interest (aka independent variables). There are lots of different steps in setting up the stset command, so make sure to check the –help stset– page. Ditto –help stcox–.

Here, you specify the days of follow-up as “days_hfhosp” and the failure as “hfhosp”. Since follow-up is in days, you set the scale to 365.25 so that time is instead expressed in years.

stset days_hfhosp, f(hfhosp) scale(365.25)

Then you’ll want to use the stcox command to estimate the risk of hospitalization by beta blocker use, adjusted for age, sex, and race, with race relative to the group coded 1.

stcox betablocker age sex ib1.race
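After fitting, it’s worth checking the proportional hazards assumption; a minimal sketch using Stata’s built-in test of Schoenfeld residuals:

```stata
// a statistically significant P here suggests the PH assumption is violated
estat phtest, detail
```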

You’ll also want to make a Kaplan-Meier figure at the same time; read about how to do that in this other post.

Making a 15x15cm graphical abstract for Hypertension (the AHA journal)

I recently had a paper published in the AHA journal, Hypertension (here: https://www.ahajournals.org/doi/abs/10.1161/HYPERTENSIONAHA.123.22714). The submission required that I include a graphical abstract that was 15×15 cm at 300 dpi and saved in JPEG format. (That’s 15/2.54*300 = 1772 × 1772 pixels.) I’ve been trying to use EPS files to get around annoying journal image formatting requirements recently, but they really wanted a JPEG and not an EPS. It took a bit of back and forth with the journal to give them what they wanted. Here’s how I made it. It requires PowerPoint and the excellent Inkscape free and open-source program that you can download here: https://inkscape.org/
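You can sanity-check that pixel math in Stata itself:

```stata
// 15 cm at 300 dpi: centimeters -> inches -> pixels
di 15/2.54*300
```

which prints 1771.6535, rounding to 1772 pixels per side.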

This specific example works with figures and text made within PowerPoint; YMMV if you are trying to embed pictures (eg microscopy). For that, you might want to use Photoshop or GIMP or the excellent web-based equivalent, Photopea. Just remember to output a file that is 1772×1772 pixels and saved as a JPEG.

Step 1: Make a square PowerPoint slide.

  • Open PowerPoint, make a blank presentation
  • Design –> slide size –> custom slide size
  • Change width to 15 cm and height to 15 cm (it defaults to inches in the US version of PPT)
  • Make your graphical abstract.
  • Save the pptx file.
    • Note: Following this bullet point is the one I made if you want to use the general format. Obviously it’ll need to be heavily modified for your article. I selected the colors using https://colorbrewer2.org. I’m not sure I love the color palette in the end, but it worked:

Step 2: Output your PowerPoint slide as an SVG file. I use this format since it’s a vector format that uses curves and lines to make an image that can be enlarged without any loss in quality. It doesn’t use pixels.

  • While looking at your slide in PowerPoint, hit File –> export –> change file type –> save as another file type.
  • In the pop up, change the “save as type” drop down to “scalable vector graphics format (*.svg)” and click save.
    • Note: For some reason OneDrive in Windows wouldn’t let me save an SVG file to it at this step. I had to save to my desktop instead, which was fine.
  • If you get a pop up, select “this slide only”.

Step 3: Set resolution in Inkscape

  • Open Inkscape, and open your new SVG file.
    • Note: In the file browser, it might look like a Chrome or Edge html file since Windows doesn’t natively handle SVG files.
  • When you have the SVG file open in Inkscape, click file –> export. You will see the export panel open up on the right hand side. Change the file type to jpeg. Above, change the DPI to 300 and the width and height should automatically change to 1772 pixels.
  • Hit export and you should be set!

Finding outside or difficult-to-find records in UVMHN’s Epic and outside of Epic

Here’s my approach as someone who practices at UVMMC, within the UVMHN.

CareEverywhere to find records in outside/non-UVMHN Epic and *also* outside non-Epic EMRs

Epic’s CareEverywhere works well with other hospitals’ Epic implementations. Regionally, that means Dartmouth and the MGH/BWH/Partners network in Boston. (Lahey/BIDMC is switching to Epic soon as well.) In the 2020s, it has started working with non-Epic shops as well, including the Community Health Center of Burlington. Many of these non-Epic connections are vendors serving 15,000+ clinics and medical centers (eg, athenahealth, Surescripts, NextGen, ParticleHealth), so linking with these vendors will ping a broad array of clinics across the country. You need to link outside clinics and hospitals within CareEverywhere as a one-time step for each patient. I never assume that this linkage step was done, because it usually isn’t.

To do a linkage, in CareEverywhere, click the little “e” next to the patient’s name in the left hand column or under the tabs (eg chart review, results) –> ‘Request Outside Records’. This might be hidden under ‘Rarely Used’. Once there, click the link that says ‘Find Outside Charts’. High-yield linkages to try are below. Bonus: click the star next to the names of these so they’ll show up as a favorite and you don’t have to search for them in other records!

  • Community Health Center of Burlington, inc – by searching “Community Health Center of Burlington”
    • Note: CHCB is sometimes listed as using NextGen or ParticleHealth, which you’ll see below. I usually ping them directly because one of those two vendors doesn’t work and I can’t remember which one it is.
  • Practices using athenahealth EHR – by searching “athenahealth” (not aetna, it’s like the Greek goddess Athena)
    • Note: This is what the private cardiology group in Timber Lane uses
  • Surescripts record locator gateway – by searching “surescripts”
    • Note: This is what the private OB group in Tilley uses
  • NextGen Share – by searching “nextgen”
  • ParticleHealth – By searching “particlehealth”.
  • Vermont Information Technology Leaders – aka VITL, which as of 2/2024 is broken (see the separate VITL section below) by searching “Vermont Information Technology Leaders”
  • Dartmouth Health – by searching “dartmouth”
  • Mass General Brigham – aka Partners by searching “mass general”
  • PRIMARY CARE HEALTH PARTNERS – a consortia of pediatric and adult primary care practices headquartered in Williston, search “primary care health partners”
  • Veterans Affairs/Department of Defense Joint HIE – aka the VA. Search “Veterans Affairs”.
  • A few regional hospitals to consider, based upon where they live:
    • Northwest Medical Center
    • Northeastern Vermont Regional Hospital
    • Rutland Regional Medical Center

For outside Epics: Finding information in CareEverywhere is pretty straightforward for other sites using Epic. In fact, as of 2024, I’ve noticed that outside notes show up in-line with our notes in Chart Review! Super cool.

For non-Epic EMRs: There is usually one really ugly note from each group called “Summarization of episode note” or something like that in CareEverywhere –> Documents. These summarization notes are basically a snapshot of the entire medical record of these non-Epic linkages! Take a look: You’ll find labs, vitals, problem lists, notes, radiology reports, etc. They are unwieldy and usually ugly, but have lots of good info included. Keep scrolling all the way to the bottom.

Again, as of 2/2024, VITL’s CareEverywhere linkage is broken so those summarization notes for VITL don’t populate with anything useful.

VITL, aka VT’s HIE – An outstanding resource

The Vermont Information Technology Leaders (VITL) service is our regional health information exchange (HIE) for the state of VT and provides a near real-time summary of notes, labs, radiology, etc from across Vermont. I can’t overstate how incredible this service is, especially for getting outside hospital records and structured data for patients transferred from non-UVMHN hospitals in VT (eg NWMC, RRMC, NEVRH). Here’s the login: https://vitlaccess.vitl.net/login

Unfortunately, as of 2/2024, you need a separate login to get into VITL; it’s not a ‘single click’ from within Epic like HIXNY (see below). To get an account, email vhiesupport@vitl.net with (1) name, (2) email address, and (3) the location/department where the person works. I guess in theory you can do this for entire departments all at once. The VITL folks then apparently reach out to a Trained Security Officer within UVMHN (Jennifer Parks, the chief compliance officer), who verifies things, and then the VITL folks grant access. I’m guessing you then get an email to set up an account afterwards. (Perhaps cc Jennifer Parks on the initial email to the vhiesupport address to expedite things? Who knows. Seems like it would save a step.)

Anyway, nearly everything of value within VITL exists within the All Results tab in their web portal. This includes notes, labs, radiology reports, etc. If you poke around in other tabs, you’ll find problem lists, medication lists, billing codes, etc. But the best bang-for-the-buck is in the All Results tab.

HIXNY, aka NY’s HIE

HIXNY is New York’s HIE (well, it looks like it covers the eastern part of NY north of NYC, per the map here). You can find it under epic –> chart review –> encounters –> HIXNY (one of the buttons at the top next to Epiphany). This will pop up HIXNY in a separate window. Whether a patient is included is a bit hit-or-miss, as I guess it’s opt-in for patients? I’ve had pretty good luck with patients having active accounts if they are middle-aged or seniors. I bet the primary care practices across the lake must have some mechanism to get patients signed up for HIXNY as part of their care. It looks like there is some sort of consent form that institutions can have patients complete. I’m not sure that UVMHN is actively having patients complete this form.

For patients with active HIXNY accounts, it’s outstanding. You can find all sorts of records, labs, radiology, etc. Per the HIXNY website, there is functionality to access HIXNY data for patients without accounts/who have yet to provide consent in cases of emergency (aka “break the glass”). I haven’t figured out how to do this “break the glass” within HIXNY, but I haven’t been in a situation where I needed HIXNY access during an emergency.

Legacy Chart, aka pre-Epic, “old records” from hospitals in UVMHN

When UVMHN brought other hospitals into the network, it saved much of the scanned/dictated prior records in a funny app linked within Epic called ‘Legacy Chart’. This is very helpful for finding old records from Porter, CVMC, CVPH, Etown, Ti, etc. You will find it next to the HIXNY and Epiphany buttons under epic –> chart review –> encounters –> Legacy Chart. When you click on it, it will pull open a weird file structure (if there are old records to be found). I’ve found critical information from old echocardiograms, colonoscopies/endoscopies, PFTs, op notes, consult notes, H&Ps, etc that has changed management.

Pre-Epic notes using a Notes Filter

This isn’t a setting or linkage so much as it is strategically setting filters within Epic’s Chart Review –> Notes tab to pull up things from the pre-Epic era (Epic was turned on in 10/2010). Before Epic, there was our super old EMR called HISSPROD and, later, a pseudo-EMR called Maple. Lots of HISSPROD discharge summaries and notes from Maple were brought into Epic.

When you are in the chart of a patient who had lots of care pre-2010, you can build this filter. (You unfortunately can’t build this filter unless old notes of the types below exist, since they won’t appear as filter options.) Go to epic –> chart review –> notes –> filters –> type, then select as many of these as appear:

  • Amb Consult
  • Amb Eval
  • Amb General Summary
  • Amb Letter
  • Amb Procedure
  • Amb Progress Note
  • Brief Procedure Op Note
  • Clinical Progress Notes
  • Communications
  • Emergency Room Record
  • H&P – (this unfortunately also will give recent H&Ps, but also gives old H&Ps, back then called ‘History and Physical’)
  • HISSPROD Discharge Summary
  • Op Procedure Note
  • Update Letter

You might need to try to build this shortcut in a few separate patients with pre-Epic documents. The list above will (mostly) pull in records from pre-Epic times.

Printing hazard ratio on Kaplan Meier curve in Stata

I recently made a figure that estimates a hazard ratio and renders it right on top of a Kaplan Meier curve in Stata. Here’s some example code to make this.

Good luck!


// Load example dataset. I got this from the --help stset-- file
webuse diet, clear

// First, stset the data. 
stset dox /// dox is the event or censor date
, ///
failure(fail) /// "fail" is the failure vs censor variable
scale(365.25)


// Next, estimate a cox ph model by "hienergy"
stcox hienergy
// now grab the bits from output of this
local hrb=r(table)[1,1]
local hrlo=r(table)[5,1]
local hrhi=r(table)[6,1]
local pval = r(table)[4,1]
// now format the p-value so it's pretty.
// Note: these -if- blocks run in order, so when several
// conditions match, the last (smallest-threshold) one wins.
if `pval'>=0.056 {
	local pvalue "P=`: display %3.2f `pval''"
}
if `pval'>=0.044 & `pval'<0.056 {
	local pvalue "P=`: display %5.4f `pval''"
}
if `pval' <0.044 {
	local pvalue "P=`: display %4.3f `pval''"
}
if `pval' <0.001 {
	local pvalue "P<0.001"
}
if `pval' <0.0001 {
	local pvalue "P<0.0001"
}

di "original P is " `pval' ", formatted is " "`pvalue'"
di "HR " %4.2f `hrb' " (95% CI " %4.2f `hrlo' "-" %4.2f `hrhi' "; `pvalue')"

// Now make a km plot. this example uses CIs
sts graph ///
, ///
survival /// 
by(hienergy) ///
plot1opts(lpattern(dash) lcolor(red)) /// options for line 1
plot2opts(lpattern(solid) lcolor(blue)) /// options for line 2
ci /// add CIs
ci1opts(color(red%20)) /// options for CI 1
ci2opts(color(blue%20)) /// options for CI 2
/// Following this is the legend, placed in the 6 O'clock position. 
/// Only items 5 and 6 are needed, but all 6 are shown so you
/// can see the other bits that can show up in the legend. Delete
/// everything except 5 and 6 to hide the rest of the legend components
legend(order(1 "[one]" 2 "[two]" 3 "[three]" 4 "[four]" 5 "First group" 6 "Second group") position(6)) ///
/// Risk table to print at the bottom:
risktable(0(5)20 , size(small) order(1 "First group" 2 "Second group")) ///
title("Title") ///
t1title("Subtitle") ///
xtitle("Year of Follow-Up") ///
ytitle("Event-Free Survival") ///
/// Here's how you render the HR. Change the first 2 numbers to move it:
text(0 0 "`: display "HR " %4.2f `hrb' " (95% CI " %4.2f `hrlo' "-" %4.2f `hrhi' "; `pvalue')"'", placement(e) size(medsmall)) ///
yla(0(0.2)1) 

Merging Stata and R SVG vector figures for publication using Inkscape, saving as SVG or EPS files

I recently needed to make a figure for publication, and the publisher didn’t like the resolution of the figure that I provided. One option is to increase the number of pixels of the rendered figure (eg increasing the width and height); the other is to create a figure using vectors that can be zoomed in as much as you want without losing quality, so the journal can render the figure however they want. When you generate a PNG, JPEG, or TIFF figure, it renders/rasterizes the figure using pixels. Vectors instead embed lines using mathematical formulas, so the rendering of the figure isn’t tied to a specific resolution, and zooming in and out will redraw the lines at the current resolution of the screen. The widely adopted SVG vector format should be universally accepted as a figure format by publishers, but isn’t, for some dumb reason. PDFs and PS/EPS files can also handle vectors and are sometimes accepted by journals but (usually) require proprietary software to render. PS/EPS files are also annoying in that they don’t embed non-standard characters correctly (e.g., beta, gamma, alpha, delta characters).

Stata and R can easily output SVG files. The excellent and free Inkscape app/program can manipulate these to create merged SVG, PS, EPS, or PDFs that can then be provided to a journal. Inkscape is also nice because it will help you get around the problems with non-standard characters not rendering correctly in PS/EPS files since you can export nonstandard characters from SVG files as paths in PS/EPS files. I’m a GIMP proponent for most things but think Inkscape blows GIMP out of the water for manipulating vector images.

Here’s how I manipulated SVG files to make an EPS file to submit to a journal.

Step 1: In Stata or R, export your figure as an SVG file

In Stata, after making your figure, type –graph export "MYFILENAME.svg", replace fontface("YOUR PREFERRED FONT")– For my figure, I needed to provide Times New Roman, so the fontface was "Times New Roman". Note that you can’t specify a width for an SVG file. Type –help graph export– to view general export details and details specific to exporting SVG figures.
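As a minimal sketch of that command, assuming a graph is already in memory (the file name here is just an example; swap in your own):

```stata
* Export the current graph as an SVG vector file.
* "myfigure.svg" is a hypothetical file name; fontface() sets the
* font embedded in the SVG, per -help graph export-.
graph export "myfigure.svg", replace fontface("Times New Roman")
```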

In R, use the svglite package that you can read about here.

Step 2: Importing your SVGs in Inkscape

Download and install Inkscape if you haven’t already. To begin, make a new document by clicking File –> New. Change the dimensions under File –> Document Properties. I’m arbitrarily selecting US Letter and changing the format from mm to in, so I have an 8.5×11 in figure. I can change this later.

Now set the background to white, if it isn’t already, by clicking the page button and then typing six f’s in a row (ffffff). That’s the hexadecimal code for white in RGB.

Now import your figures under file –> import. If you have an R figure, do that one first. After you select your figure, an import options dialog will pop up. I set the rendered SVG DPI to 600 and left everything else at the default.

You’ll see that you’ve imported your figure, but it might be a bit bigger than your layer. That’s fine, just go back to file –> document properties and select the “resize to content” button to fix this.

You’ll notice that these R figures have black boxes where the main graphic should be. This is apparently a bug in how R outputs SVG files (I didn’t make these specific files so I’m not sure if it’s also a bug with the svglite package). It’s pretty simple to fix, and is detailed here. It turns out that there’s a piece of the figure that R doesn’t specify should be transparent, so Inkscape renders it as black. If you have this problem, follow these steps:

  1. Click on your figure, then ungroup with shift+ctrl+g (or object –> ungroup).
  2. Open the layers window on the right (if you don’t see it, open it with layer –> layers and objects).
  3. With the “selector tool” (the mouse), click the black box and see which layer is selected. Expand that layer and find the “rect” in it.
  4. Hover over the “rect” object and you’ll see a little eye icon. Click it to hide the layer and prove that it’s the offending one; you’ll be able to see the underlying object.
  5. Click it again to unhide it. Then, make sure you have that rect selected in the layers window, and click the “no fill” option in the bottom left of the screen (the white box with a red X).
  6. (Optional) Drag a box over your figure to select the entire figure and then regroup it (object –> group or ctrl+g).

Now you should be set. I had to repeat the fill color steps for the other box in this figure before regrouping BTW.
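For the curious, the “no fill” fix boils down to changing one element’s fill in the underlying SVG. The exact markup varies from file to file, so this is only an illustrative sketch (the coordinates and sizes here are made up):

```xml
<!-- Before: the rect paints black, covering the plot beneath it -->
<rect x="0" y="0" width="504" height="504" fill="#000000"/>

<!-- After clicking "no fill" in Inkscape: fill="none" paints nothing -->
<rect x="0" y="0" width="504" height="504" fill="none"/>
```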

Now that I’ve fixed the R bug (hopefully this doesn’t happen to you), I’ve imported my Stata file. It comes in waaay smaller than the R one, which is fine. I placed it below the other figure, then (1) clicked on the new image so I could see the outward-facing arrows, and (2) held down ctrl+shift while dragging a corner arrow to expand it while preserving the aspect ratio.

I’ve imported another figure below that one. I’ve more-or-less eyeballed the layout and size of these layers, but I can fine-tune the sizes so they match each other: (1) click on the first figure you want to resize, then (2) click on the little lock icon to lock the proportions (so width and height change at the same time), then (3) manually change the width to whatever you want. Then (4) repeat those steps for the other figure, specifying the same width as in step 3. Now you can move the images around so they are placed nicely.

Now you’ll want to expand the document layer so it’s the size of all of the added figures. To do that, (1) select all layers with ctrl-a or edit –> select all, then (2) go to file –> document properties, and (3) click on the “resize to content” button.

Now your layer should perfectly match the size of your figures.

Step 3: Adding overlaying text

I want to add the letters “A” and “B” to this figure since it’s two separate panels. The text tool on the left (capital “A”) allows you to add text. So, (1) click the text tool, then (2) choose your font, (3) choose the text size, here 120, (4) select the font color in the bottom left, here black, then (5) click where you want to add your text, and start typing.

You might not get the placement perfect the first time. If you need to move it around, click the selector tool/mouse cursor icon on the top left to move the added text layer around. If you want to edit the text, select the text tool again and re-select your added text.

If your text is outside the bounds of your document layer, you might want to use the “resize to content” button one more time (hit ctrl-a, then go to file –> document properties, and hit the “resize to content” button).

Step 4: Saving and exporting

Step 4a: Saving as SVG for future editing with Inkscape

Inkscape’s default file format is an SVG, so I recommend saving your file as an SVG to start. Do that with ctrl+s or file –> save.

Step 4b: Saving as EPS, which is the file you’ll want to send to the publisher

The semi-proprietary EPS format is typically accepted by publishers, so you’ll want to generate this one to send off to the journal. This is done under the file –> “save as…” option (or shift+ctrl+s). In the dropdown, select “Encapsulated PostScript (*.eps)”.

On this popup, I unchecked the button next to “rasterize filter effects” and checked “embed fonts”. If you are using nonstandard characters (e.g., alpha, beta, gamma, delta), instead check the “convert text to paths” button. This changes the text into vectors drawn on the image rather than actual font-based text. Set the resolution for rasterization to 600 DPI. Hopefully nothing is actually rasterized, since avoiding rasterization was the point of this little exercise.

Note that you’ll now have 2 separate files, one SVG and one EPS (if you did both steps 4a and 4b), so for any additional edits, you’ll want to remember to overwrite your SVG and your EPS files.

Step 4c (optional): Export as PDF so you can share the figure with coauthors

You might want to also save as a PDF since people are familiar with these. I don’t know that I would provide a PDF to a journal, probably just an EPS file. It’s nice to have PDFs to share with coauthors since they are such universally-accepted file formats. Instead of using the “save as…” option, I recommend using the file –> export option (shift+ctrl+e) to export. This will pop up an export window on the bottom right. Set your directory, select file type as PDF, then click on the little settings icon.

On the settings pop-up, I selected “no” for rasterize filter effects. Embedding the fonts might be preferable to converting text to paths since it will retain compatibility with screen reading software. Set the DPI to 600. I also left the default for compensating for rounding. Whatever that means.

Then pop out of that window and click the big export button and you’re done!

Automatically deleting Outlook calendar invites and sending a reply

We have 2 separate email accounts at UVM if you are on the medical faculty: (1) the College of Medicine aka med.uvm.edu and (2) the hospital aka uvmhealth.org. They don’t integrate. (There’s actually a 3rd one with the University aka uvm.edu without the med, but that easily forwards to the College of Medicine email.) Having two separate inboxes is kind of a pain but is somewhat manageable. Having two separate calendars is nonsense. I use my College of Medicine email as my primary and only calendar.

To keep my hospital calendar from being used accidentally by well-meaning people trying to send me calendar invites, I set up a rule in my hospital Outlook account that automatically sends a reply to all calendar invitations. This reply says that the invite was deleted and to send the invite to my College of Medicine account instead. Works like a charm.

Here’s how I did it.

Step 1: Open up the desktop version of Outlook for the calendar you don’t want to use

This won’t work with the internet browser version of Outlook. For med.uvm.edu, use your desktop version of Outlook, which you can install on any device.

For uvmhealth.org, accessing the Outlook desktop app is a bit more complicated. On campus, log into a hospital-owned desktop and open up Outlook. Off campus, log into the Citrix Workspace portal using your hospital credentials then open up the virtual desktop that’s under the “Desktops” tab.

Wait for the virtual desktop to open then open up the desktop version of Outlook.

Step 2: Open up the “Rules” function

This is hidden under the Home –> Move –> Rules –> “Create rule…” if your window isn’t all the way expanded or Home –> Rules –> “Create rule…” if your window is really big.

Step 3: Make a rule

In the Rules and Alerts pop-up, select “new rule…”

On the Rules Wizard, select “Start from a blank rule… Apply rule on messages I receive” and click next.

In your new rule, under Select Conditions, scroll way down, select “which is a meeting invitation or update”, and click next.

For Select actions, select “have server reply using a specific message” at the very least. You can also select “delete it” if you want to really delete the invite, but you can also just say you’re deleting it and folks will assume you really did. Same effect, in my opinion.

After checking at least the “have server reply…” option, click the link in the bottom half of the window to pop up the “specific message” to reply with.

This pops up a blank message that you can fill out. Make sure to fill out a very clear subject line as well as the actual message that will go back to the person who sent you the calendar invite. Hit “save and close” when done. Then click “next” on the Select Options menu.

For Exceptions, I didn’t include any exceptions so this was blank and I clicked next.

Then turn the rule on and you should be good to go! Give it a test by sending a calendar invite from your other account.

EKG leads (inferior, lateral, anterior, right) color coding in a great manuscript

A student shared this paper by Blakeway and colleagues. It shows the leads of the heart color-coded by region. It’s awesome. I’m posting it here mostly so I can find it again. Link to the PDF, look at the figure on Page 2: https://www.resuscitationjournal.com/article/S0300-9572(12)00053-6/pdf

Citation: Blakeway E, Jabbour RJ, Baksi J, Peters NS, Touquet R. ECGs: colour-coding for initial training. Resuscitation. 2012 May;83(5):e115-6. doi: 10.1016/j.resuscitation.2012.01.034. Epub 2012 Feb 2. PMID: 22306667. https://pubmed.ncbi.nlm.nih.gov/22306667/