Extracting variable labels and categorical/ordinal value labels in Stata

Stata lets you label variables and also the individual values of categorical or ordinal variables. For example, in the –sysuse auto– dataset, the variable “foreign” is labeled “Car origin”, and its values 0 and 1 are labeled “Domestic” and “Foreign”. It isn’t terribly intuitive to extract the variable label of foreign (here, “Car origin”) or the labels of its values (here, “Domestic” and “Foreign”).

Here’s a script that you might find helpful to extract these labels and save them as local macros that you can reuse later. This example generates a new string variable that holds those value labels. I use this sort of code to automate the labeling of figures with the value labels, but this is a pretty simple example for now.

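Remember to run the entire script from top to bottom in one go. Local macros only exist during the run that defines them, so if you run the script a few lines at a time from the do-file editor, the macros from earlier lines will be gone. Good luck!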

sysuse auto, clear
// Note: the macro functions used below are 
//       documented at --help macro--, under "Macro 
//       functions for extracting data attributes"
// 
// 1. Extract the label for the variable itself.
//    If we look at the --codebook-- for the 
//    variable "foreign", we see...
//
codebook foreign
//
//    ...that the label for "foreign" is "Car 
//    origin" (see it in the top right of 
//    the output).  Here's how we grab the 
//    label of "foreign", save it as a macro, 
//    and print it.
//
local foreign_lab: variable label foreign
di "`foreign_lab'"
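//
//    Optional: if a variable has no label, the macro 
//    above comes back empty. A minimal sketch of a 
//    fallback to the variable's name (the macro name 
//    "foreign_lab_safe" is made up for illustration):
//
local foreign_lab_safe: variable label foreign
if "`foreign_lab_safe'" == "" local foreign_lab_safe "foreign"
di "`foreign_lab_safe'"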
//
// 2. Extract the label for the values of the variable.
//    If we look at --codebook-- again for "foreign", 
//    we see...
//
codebook foreign
// 
//    ...that 0 is "Domestic" and 1 is "Foreign". 
//    Here's how to grab those labels, save macros, 
//    and print them.
//
local foreign_vallab_0: label (foreign) 0
local foreign_vallab_1: label (foreign) 1
di "For `foreign_lab', 0 is `foreign_vallab_0' and 1 is `foreign_vallab_1'"
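//
//    Optional: the value labels are stored under 
//    their own name, separate from the variable. 
//    A short sketch of how you might grab that name 
//    and list everything it holds (the macro name 
//    "foreign_lblname" is just for illustration):
//
local foreign_lblname: value label foreign
di "`foreign_lblname'"
label list `foreign_lblname'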
//
// 3. Now you can make a string variable that holds 
//    the value labels.
//  
gen strL foreign_vallab = "" // start with an empty string variable
replace foreign_vallab="`foreign_vallab_0'" if foreign==0
replace foreign_vallab="`foreign_vallab_1'" if foreign==1
// 
// 4. You can also label this new string variable 
//    using the label from #1
// 
label variable foreign_vallab "`foreign_lab' as string"
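//
//    Optional shortcut: if all you need is the string 
//    version of a labeled variable, --decode-- does 
//    #3 in one line. A quick sketch, with a made-up 
//    variable name ("foreign_decoded"), plus a check 
//    that it matches the variable built by hand above:
//
decode foreign, generate(foreign_decoded)
assert foreign_decoded == foreign_vallab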
// 
// BONUS: You can automate this with a loop, using 
//    --levelsof-- to list the distinct values of 
//    each categorical variable. There is only one 
//    labeled categorical variable in this dataset 
//    (foreign), so this loop runs over just that one. 
// 
sysuse auto, clear
foreach x in foreign {
	local `x'_lab: variable label `x' // #1 from above
	gen strL `x'_vallab = "" // start of #3
	label variable `x'_vallab "``x'_lab' as string" // #4 from above
	levelsof `x', local(valrange)
	foreach n of numlist `valrange' { 
		local `x'_vallab_`n': label (`x') `n' // #2 from above
		replace `x'_vallab = "``x'_vallab_`n''" if `x' == `n' // end of #3
	}
}
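//
// Using the macros in a figure: a minimal sketch of 
//    the figure labeling mentioned at the top. Run 
//    it in the same go as the loop so the macros 
//    still exist. The choice of plot (mpg against 
//    weight) and the title text are only for 
//    illustration.
//
twoway (scatter mpg weight if foreign == 0) ///
	(scatter mpg weight if foreign == 1), ///
	legend(order(1 "`foreign_vallab_0'" 2 "`foreign_vallab_1'")) ///
	title("Mileage and weight by `foreign_lab'")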