Generic start of a Stata .do file

I took the Stata programming class at the Johns Hopkins School of Public Health during grad school. It was taught by Dorry Segev. If you are at the school, I highly, highly, highly recommend taking it in term 4 and doing all of the assignments. It saved me many hours of labor in writing up my thesis. It’s a phenomenal class.

One of the biggest takeaways from the class was using a .do file as much as possible when interacting with Stata. As in 99% of the time.

Below is the stock header and footer of every .do file that I make. Steps to success:

  1. Open a blank .do file
  2. Paste the code from below
  3. Save it in the same folder as your dataset
  4. Close Stata
  5. In Windows File Explorer, find your new .do file and open it up then get rolling.

When you open the .do file through File Explorer, Stata automatically sets its working directory to the folder the file is in, so you don’t have to type the full path to your data. For example, you can write:

use data.dta, clear

…and not

use c:\windows\users\myname\work\research\001project\data\data.dta, clear
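
If you are ever unsure which folder Stata is working in, you can ask it. The lines below are a minimal sketch and not part of the stock header: they print the working directory and then check that the dataset is there before you try to load it. The file name data.dta is just the placeholder from the example above.

```stata
pwd // prints the folder Stata is currently working in

// confirm file gives an error if the file is not there; capture catches
// that error and stores its code in _rc (0 means no error)
capture confirm file "data.dta"
if _rc != 0 {
    display as error "data.dta is not in the working directory"
}
```

If the message shows up, the .do file and the dataset are probably not saved in the same folder.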

******************************HEADER STARTS HERE********************************
// at the beginning of every do file:
macro drop _all // remove macros from previous work, if any
capture log close // Close any open logs. Capture will ignore a command that gives 
//                   an error. So if there isn't an open log, instead of giving you 
//                   an error and stopping here, it'll just move onto the next line.
clear all // clean the belfries
drop _all // get rid of everything!

log using output.log, replace text // change the name of this to whatever you'd like

// The purpose of this .do file is... [say why you are writing this do file]

version 14 // Every version of Stata is slightly different, but all are backwards 
//            compatible with previous ones. If you open up this do file with a way 
//            newer version, it'll run it in version 14 compatibility mode. Change 
//            this to the current version of Stata that you are using. This will 
//            also keep your code from running on older versions of stata that will 
//            break with new code that it isn't designed to handle. 

set more off, permanently // so you don't have to keep clicking through stata to 
//                           keep it running

set linesize 255 // this keeps longer lines from getting clipped. Helpful for making 
//                  tables.

capture mkdir pictures // this makes a folder called pictures using Stata's own 
//                        mkdir command, so the same line works on Windows and 
//                        Mac. Capture skips the error you would otherwise get if 
//                        the folder already exists. Save your pictures here.

******************************IMPORT DATA HERE**********************************
* working with stata dta file: 
// use data.dta, clear // change this with whatever data file you are using and 
//                        remove the double slashes 
* working with excel file: 
// import excel using "name of file.xlsx", firstrow clear // firstrow imports 
//                       the first row of the sheet as variable names. Change to 
//                       the appropriate file name and delete the double slashes 
//                       as needed. 

******************************CODE STARTS HERE********************************** 
// ... you get the idea

******************************FOOTER STARTS HERE********************************
// At the very end of your .do file: 
log close
// Fin.
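
To show what the pictures folder is for, here is a minimal sketch that draws a graph and saves it there. It would go in the "CODE STARTS HERE" part of the .do file. It assumes the auto dataset that ships with Stata, and the graph and the file name mpg_weight.png are made up for illustration.

```stata
capture mkdir pictures // in case the folder does not exist yet
sysuse auto, clear // example dataset that ships with Stata
scatter mpg weight // any graph command works here
graph export "pictures/mpg_weight.png", replace // replace overwrites an old copy
```

The forward slash in the path works on both Windows and Mac, so the same line can be shared between computers.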

Table 1 program

Here are a few simple Stata programs that will write a CSV file for your Table 1.

This has three parts:

  1. Header (writes the first line of the table with names and the row with Ns),
  2. Program for continuous variables, and
  3. Program for dichotomous variables.

Each time you want to add another line to your table, just call the appropriate program followed by the variable of interest.

webuse auto.dta, clear

set seed 12345
gen treatment = .
replace treatment = round(runiform()) // make a random treatment 
// variable that's 0 or 1

// Header + second row with Ns
quietly {
capture log close table1 // force closes any open log with the same name
log using "my_table_1.csv", text replace name(table1) // replace will erase 
// any CSV files you started already with the same name
noisily disp ",All,Group 1,Group 2" // line 1
count // counts all observations so that r(N) holds the overall N
local nall=r(N)
count if treatment==0
local ntreatment0=r(N)
count if treatment==1
local ntreatment1=r(N)
noisily disp "N," `nall' "," `ntreatment0' "," `ntreatment1'
log close table1
} // end quietly

// Program for continuous variables
capture program drop table1_cont // drops any programs with the same name
program define table1_cont
quietly {
syntax varlist
capture log close table1
log using "my_table_1.csv", text append name(table1) // append will 
// keep writing onto existing tables
foreach var of varlist `varlist' {
sum `var'
local `var'mean = r(mean)
local `var'sd = r(sd)
local `var'n=r(N)
sum `var' if treatment==0
local `var'mean0 = r(mean)
local `var'sd0 = r(sd)
local `var'n0=r(N)
sum `var' if treatment==1
local `var'mean1 = r(mean)
local `var'sd1 = r(sd)
local `var'n1=r(N)

noisily disp "`var' (Mean (SD))," ///
%3.1f ``var'mean' " (" %3.1f ``var'sd' "),"  ///
 %3.1f ``var'mean0' " (" %3.1f ``var'sd0' "),"  ///
 %3.1f ``var'mean1' " (" %3.1f ``var'sd1' ")"
} // end varlist loop
log close table1
} // end quietly
// every program define needs a matching end
end

// program for dichotomous variables
capture program drop table1_dichotomous
program define table1_dichotomous
quietly {
syntax varlist
capture log close table1
log using "my_table_1.csv", text append name(table1)
foreach var of varlist `varlist' {
sum `var'
local `var'n= r(N)
local `var'mean = r(mean)*100
sum `var' if treatment==0
local `var'n0 = r(N)
local `var'mean0 = r(mean)*100
sum `var' if treatment==1
local `var'n1 = r(N)
local `var'mean1 = r(mean)*100
noisily disp "`var' (N (%))," ///
``var'n' " (" %3.1f ``var'mean' "),"  ///
``var'n0' " (" %3.1f ``var'mean0' ")," ///
``var'n1' " (" %3.1f ``var'mean1' ")" 
} // end varlist loop
log close table1 
} // end quietly
end

// now just call these programs as needed: 

table1_cont trunk 
table1_cont weight 
table1_dichotomous foreign 
// and so-on
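
Because both programs read a varlist and loop over it, a single call should also accept several variables at once. This is a sketch that assumes the header and both programs above have already been run in the same session. Used in place of the separate calls, it writes the same rows.

```stata
table1_cont trunk weight // one row per variable, in the order listed
table1_dichotomous foreign

type "my_table_1.csv" // prints the finished CSV in the results window
```

The type command is only there to check the file without leaving Stata.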

Here is the output from the example above:

,All,Group 1,Group 2
N,74,43,31
trunk (Mean (SD)),13.8 (4.3),13.0 (3.9),14.8 (4.6)
weight (Mean (SD)),3019.5 (777.2),2906.0 (746.1),3176.8 (804.1)
foreign (N (%)),74 (29.7),43 (32.6),31 (25.8)

If opened with MS Excel, it will look like this: