The very excellent table1_mc program will automate generation of your Table 1 for nearly all needs (read about it here), except for datasets that use pweights. I've been toying around with automating Excel table generation using the Frames features in Stata v16+. I recently started working on a database that requires pweighting for analyses, and took it as an opportunity to use Frames to automate generation of a pweight-adjusted Table 1.
v1.3 of my code (updated 2024-2) to automate this lives here: https://www.uvm.edu/~tbplante/p_weight_table1_v1_3.do
You can just put:
do https://www.uvm.edu/~tbplante/p_weight_table1_v1_3.do
…in your Stata do file and it’ll pull in the entire script! Note that full instructions will show up in the Stata output window when you run the above line. The instructions below are incomplete. This code does not produce P-values.
How to use this do file
// Step 1a: close all open frames, drop all macros,
// and open your dataset
frames reset
macro drop _all
webuse multistage, clear
//
// Step 1b: Figure out where your present working directory is;
// this is where the Excel spreadsheet will be saved.
// Change the working directory with the "cd"
// command as needed.
pwd
//
// Step 2: Declare your data to be pweighted
svyset county [pweight=sampwgt], strata(state) fpc(ncounties) || school, fpc(nschools)
//
// Step 3: If your columns require the generation of pweighted
// tertiles, quartiles, or whatnot, do that now.
// For this example, we'll do by quartile of weight.
// note: per this website: https://www.stata.com/support/faqs/statistics/percentiles-for-survey-data/
// ...Only the pweight needs to be specified when making
// weighted quartiles.
xtile weightquart=weight [pweight=sampwgt], n(4)
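// Optional check (not part of the original script): using the svyset
// design from step 2, the weighted proportion of observations in each
// quartile should be roughly 0.25 (ties in weight can shift it a bit).
svy: tabulate weightquart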
//
// Step 4: Recode binary variables so they are 0 and 1 (if needed)
// Note: in this dataset, it's 1 and 2 for male and female,
// respectively.
gen female = 1 if sex==2 // recode sex to female, where 1 is female
replace female=0 if sex==1 // male is now 0
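// Optional check (not part of the original script): cross-tabulate the
// original and recoded variables to confirm the recode. Any missing
// values of sex stay missing in female.
tabulate sex female, missing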
//
// Step 5: Label your variables and, for categorical variables, their values
// Note: The variables are already labeled but we are doing it
// again for completeness' sake.
//
// Continuous variables
label variable weight "Weight in lbs"
label variable height "Height in in" // I don't know why people are 400 in tall. that's 33 ft.
// Nominal variables (same process would happen for ordinal or continuous variables)
label variable race "Race" // Race is nominal so need to also define values of race
label define racelabels 1 "White" 2 "Black" 3 "Other"
label values race racelabels // Apply the labels!!!
// Binary variables, no need to apply labels
label variable female "Female sex"
//
// Step 6: Call the do file
// Note: Instructions on this program's use will show right
// after it's called. Look at the Stata output window.
do https://www.uvm.edu/~tbplante/p_weight_table1_v1_3.do
//
// Step 7: Now follow the instructions that appear in the Stata
// output window!
table1pweight_start table1 1 4 weightquart weight %10.1f
table1pweight_contn_sd table1 1 4 weightquart height %10.1f
table1pweight_bin table1 1 4 weightquart female %10.0f
table1pweight_cat table1 1 4 weightquart race %10.0f
table1pweight_end table1 1 4 weightquart weight %10.2f
//
// CLOSE THE NEW EXCEL FILE BEFORE RE-RUNNING STEP 7
// OR YOU'LL GET AN ERROR.
//
// Step 8: Look at the Excel output! Here, it's a file called
// table1.xlsx that's sitting in your pwd (see step
// 1b above). You might notice blanks for the 2nd and
// 3rd columns, but that's because of a stratum with a single
// sampling unit. You can confirm numbers using the survey
// tools.
// Remember! This is where your Excel file is saved:
pwd
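Step 8 mentions confirming the numbers with the survey tools. A minimal sketch of what that could look like is below. It assumes steps 1 through 4 have already been run (so the svyset design, weightquart, and female all exist) and that the table reports weighted means and proportions within each column. That last part is my assumption about the output, so compare against what you actually see in table1.xlsx.

```stata
// Weighted mean of height within each weight quartile,
// followed by the corresponding subpopulation standard deviations
svy: mean height, over(weightquart)
estat sd

// Weighted proportion female within each weight quartile
// (the mean of a 0/1 variable is a proportion)
svy: mean female, over(weightquart)

// Weighted distribution of race within each weight quartile
svy: proportion race, over(weightquart)
```

These commands only print to the Stata output window; they don't touch the Excel file.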