{"id":779,"date":"2021-05-27T15:59:44","date_gmt":"2021-05-27T19:59:44","guid":{"rendered":"https:\/\/blog.uvm.edu\/tbplante\/?p=779"},"modified":"2026-03-25T11:15:54","modified_gmt":"2026-03-25T15:15:54","slug":"stata-r-integration-with-rcall","status":"publish","type":"post","link":"https:\/\/blog.uvm.edu\/tbplante\/2021\/05\/27\/stata-r-integration-with-rcall\/","title":{"rendered":"Stata-R integration with Rcall"},"content":{"rendered":"\n<p>Stata is great because of its intuitive syntax, reasonable learning curve, and dependable implementation. There&#8217;s some cutting edge functionality and graphical tools in R that are missing in Stata. I came across the <a href=\"https:\/\/github.com\/haghish\/rcall\" target=\"_blank\" rel=\"noreferrer noopener\">Rcall<\/a> package that allows Stata to interface with R and use some of these advanced features. (Note: this worked for me as far back as Stata 16. I have no reason to think that this wouldn&#8217;t work with Stata 13 or newer.)<\/p>\n\n\n\n<p>Below details:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>How to get set up with Rcall<\/li>\n\n\n\n<li>Example 1: How to make figures with the &#8216;ggplot2&#8217; and &#8216;ggstatsplot&#8217; R packages<\/li>\n\n\n\n<li>Example 2: How to estimate the Charlson comorbidity index or Elixhauser comorbidity score with the &#8216;comorbidity&#8217; R package<\/li>\n\n\n\n<li>Example 3: Estimating race by last name and sex by first name with the &#8216;predictrace&#8217; R package<\/li>\n\n\n\n<li>Example 4: Making a heatplot with lattice\/levelplot<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Installation of R, R packages, and Rcall (you only need to do this once)<\/h2>\n\n\n\n<p><a href=\"https:\/\/cran.r-project.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">Download<\/a> and install R. 
<em>(Note: If you have problems getting this working with an old version of R, such as with outdated packages, uninstall R and then also delete the folder that holds the locally installed packages. On my Windows PC, it was at: C:\\Users\\[username]\\AppData\\Local\\R\\win-library\\4.5\\. The &#8220;Local&#8221; folder is hidden, so in Windows Explorer allow hidden folders to be seen under options &#8211;&gt; view &#8211;&gt; hidden files and folders).<\/em> Open R and install the readstata13 package, which is required to install Rcall. While you&#8217;re at it, install ggplot2 and <a href=\"https:\/\/indrajeetpatil.github.io\/ggstatsplot\/\" target=\"_blank\" rel=\"noreferrer noopener\">ggstatsplot<\/a>. <em>Note<\/em>: ggplot2 is included in the excellent multi-package collection called <a href=\"https:\/\/www.tidyverse.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">Tidyverse<\/a>. We are going to install Tidyverse instead of ggplot2 alone. Tidyverse also installs several other packages useful in data science that you might need later. These include dplyr, tidyr, readr, purrr, tibble, stringr, and forcats. I have also gotten an error saying &#8220;&#8216;Rcpp_precious_remove&#8217; not provided by package &#8216;Rcpp&#8217;&#8221;, which was fixed by installing Rcpp, so install that too.<\/p>\n\n\n\n<p>In R, type:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>install.packages(\"readstata13\")\ninstall.packages(\"tidyverse\")\ninstall.packages(\"ggstatsplot\")\ninstall.packages(\"Rcpp\")<\/code><\/pre>\n\n\n\n<p>It&#8217;ll prompt you to set up an install directory and choose your mirror\/repository. Just pick one geographically close to you. After these finish installing, you can close R. <\/p>\n\n\n\n<p>Rcall&#8217;s installation is within Stata (as usual for Stata programs) but originates from Github, not the usual SSC install. 
You need to install a separate package to allow you to install things from <a rel=\"noreferrer noopener\" href=\"https:\/\/github.com\/haghish\/github\" target=\"_blank\">Github<\/a> in Stata. From the Stata command line, type:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>net install github, from(\"https:\/\/haghish.github.io\/github\/\")<\/code><\/pre>\n\n\n\n<p>Now install Rcall itself from the Stata command line:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>github install haghish\/rcall, stable<\/code><\/pre>\n\n\n\n<p>If all goes well, it should install!<\/p>\n\n\n\n<p><em>Edit: <\/em>As of July 2024, there seems to be a problem with the installation. If you get an error saying &#8220;github package was not found&#8221; and &#8220;please update your GitHub package&#8221;, you might try installing an older version of rcall than the current one (3.1.0 as of 7\/2024), like this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>github install haghish\/rcall, version(3.0.7)<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Using Rcall<\/h2>\n\n\n\n<p>Read the Rcall help file (type &#8211;help rcall&#8211; in Stata) for an overview. Also read the <a rel=\"noreferrer noopener\" href=\"https:\/\/github.com\/haghish\/rcall\" target=\"_blank\">Rcall overview<\/a> on Github. In brief, you can send datasets from Stata to R using &#8211;rcall st.data()&#8211;. You can kick things back to Stata with &#8211;st.load(<em>name of R frame<\/em>)&#8211;. &#8211;rcall clear&#8211; reboots R as a new instance. <\/p>\n\n\n\n<p>There are four modes for using Rcall: vanilla, sync, interactive, and console. For our purposes, we are going to focus on the interactive mode since this allows you to manipulate R from within a do file.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Example 1: Make a figure in ggplot2 using Stata and Rcall<\/h2>\n\n\n\n<p>Here&#8217;s some demo code to make a figure with ggplot2, which is the standard for figures in R. 
There&#8217;s a handy <a rel=\"noreferrer noopener\" href=\"https:\/\/github.com\/rstudio\/cheatsheets\/blob\/master\/data-visualization-2.1.pdf\" target=\"_blank\">cheat sheet here<\/a>. <a rel=\"noreferrer noopener\" href=\"https:\/\/mgimond.github.io\/ES218\/Week04c.html\" target=\"_blank\">This intro page<\/a> is quite helpful. <a href=\"https:\/\/psyteachr.github.io\/introdataviz\/introduction.html\" target=\"_blank\" rel=\"noreferrer noopener\">This overview<\/a> is excellent. Check out the demo figures from <a rel=\"noreferrer noopener\" href=\"http:\/\/r-statistics.co\/Top50-Ggplot2-Visualizations-MasterList-R-Code.html\" target=\"_blank\">this page<\/a> as well. <strong><em>If your ggplot command extends across multiple lines, make sure to end each line (except the final line) with the three forward slash (&#8220;\/\/\/&#8221;) line break notation that is used by Stata.<\/em><\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ load sysuse auto dataset\nsysuse auto, clear\n\/\/ set up rcall, clean session and load necessary packages\nrcall clear \/\/ starts with a new instance of R\nrcall: library(ggplot2) \/\/ load the ggplot2 library\n\/\/ move Stata's auto dataset over to R and prove it's there.\nrcall: data&lt;- st.data() \/\/ move auto dataset to r\nrcall: names(data) \/\/ prove that you can see the variables. \nrcall: head(data, n=10) \/\/ now look at the first 10 rows of the data in R\n\/\/ now make a scatterplot with ggplot2, note the three slashes for line break\nrcall: e&lt;- ggplot(data, aes(x=mpg, y=weight)) + \/\/\/\n       geom_point()\nrcall: ggsave(&quot;ggtest.png&quot;, plot=e)\n\/\/ figure out where that PNG is saved:\nrcall: getwd()<\/code><\/pre>\n\n\n\n<p>Note: rather than using the three forward slashes, you can also change the delimiter to a semicolon, like the following. Just remember to change it back to normal (&#8220;cr&#8221;). Here&#8217;s an equivalent to above with semicolon delimiters. 
Note that the ggplot bit that spreads across two lines no longer has any slashes. This looks a bit more like &#8220;true R code&#8221;.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sysuse auto, clear\n#delimit ;\nrcall clear ; \nrcall: library(ggplot2) ; \nrcall: data&lt;- st.data() ;\nrcall: names(data) ;\nrcall: head(data, n=10) ;\nrcall: e&lt;- ggplot(data, aes(x=mpg, y=weight)) + \n       geom_point() ;\nrcall: ggsave(&quot;ggtest.png&quot;, plot=e) ; \nrcall: getwd() ; \n#delimit cr<\/code><\/pre>\n\n\n\n<p>Here&#8217;s what it made! It was saved in my Documents folder, but check the output above to see where your working directory is. <\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"773\" height=\"765\" src=\"https:\/\/blog.uvm.edu\/tbplante\/files\/2021\/05\/image-1.png\" alt=\"\" class=\"wp-image-785\" style=\"width:514px;height:509px\" srcset=\"https:\/\/blog.uvm.edu\/tbplante\/files\/2021\/05\/image-1.png 773w, https:\/\/blog.uvm.edu\/tbplante\/files\/2021\/05\/image-1-300x297.png 300w, https:\/\/blog.uvm.edu\/tbplante\/files\/2021\/05\/image-1-768x760.png 768w\" sizes=\"auto, (max-width: 773px) 100vw, 773px\" \/><\/figure>\n\n\n\n<p>You can get much more complex with the figure, like specifying colors by foreign status, specifying dot size by headroom size, adding a loess curve with 95% CI, and adding some labels. You can swap out the &#8220;rcall: e &lt;- ggplot(&#8230;)&#8221; bit above for the following. <strong><em>Remember to end every non-final line with the three forward slashes.<\/em><\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>rcall: e&lt;- ggplot(data, aes(x=mpg, y=weight)) + \/\/\/\n\tgeom_point(aes(col=foreign, size=headroom)) + \/\/\/\n\tgeom_smooth(method=&quot;loess&quot;) +   \/\/\/\n\tlabs(title=&quot;ggplot2 demo&quot;, x=&quot;MPG&quot;, y=&quot;Weight&quot;, caption=&quot;Caption!&quot;)<\/code><\/pre>\n\n\n\n<p>Here&#8217;s what I got. 
Varying dot size by a third variable can be done in <a href=\"https:\/\/www.stata.com\/support\/faqs\/graphics\/gph\/graphdocs\/scatterplot-with-weighted-markers\/\" target=\"_blank\" rel=\"noreferrer noopener\">Stata using weighted markers<\/a>, as FYI.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"517\" height=\"527\" src=\"https:\/\/blog.uvm.edu\/tbplante\/files\/2021\/05\/image-2.png\" alt=\"\" class=\"wp-image-801\" srcset=\"https:\/\/blog.uvm.edu\/tbplante\/files\/2021\/05\/image-2.png 517w, https:\/\/blog.uvm.edu\/tbplante\/files\/2021\/05\/image-2-294x300.png 294w\" sizes=\"auto, (max-width: 517px) 100vw, 517px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Let&#8217;s make a figure in ggstatsplot using Stata and Rcall<\/h3>\n\n\n\n<p>Here&#8217;s some demo code to make a figure with <a rel=\"noreferrer noopener\" href=\"https:\/\/indrajeetpatil.github.io\/ggstatsplot\/\" target=\"_blank\">ggstatsplot<\/a> (which is very awesome and you should check it out). <strong><em>If your ggstatsplot command extends across multiple lines, make sure to end each line (except the final line) with the three forward slash (&#8220;\/\/\/&#8221;) line break notation that is used by Stata.<\/em><\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ load sysuse auto dataset\nsysuse auto, clear\n\/\/ set up rcall, clean session and load necessary packages\nrcall clear \/\/ starts with a new instance of R\nrcall: library(ggstatsplot) \/\/ load the ggstatsplot library\nrcall: library(ggplot2) \/\/ need ggplot2 to save the png\n\/\/ move Stata's auto dataset over to R and prove it's there.\nrcall: data&lt;- st.data() \/\/ move auto dataset to r\nrcall: names(data) \/\/ prove that you can see the variables. 
\nrcall: head(data, n=10) \/\/ now look at the first 10 rows of the data in R\n\/\/ let&#039;s make a violin plot using ggstatsplot\nrcall: f &lt;- ggbetweenstats( data = data, x=foreign, y=weight, title=&quot;title&quot;)\nrcall: ggsave(&quot;ggstatsplottest.png&quot;, plot=f)\n\/\/ figure out where that PNG is saved:\nrcall: getwd()<\/code><\/pre>\n\n\n\n<p>If you check your working directory (it was my &#8220;Documents&#8221; folder in Windows), you&#8217;ll find this figure as a PNG:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"957\" height=\"960\" src=\"https:\/\/blog.uvm.edu\/tbplante\/files\/2021\/05\/image.png\" alt=\"\" class=\"wp-image-784\" style=\"width:539px;height:540px\" srcset=\"https:\/\/blog.uvm.edu\/tbplante\/files\/2021\/05\/image.png 957w, https:\/\/blog.uvm.edu\/tbplante\/files\/2021\/05\/image-300x300.png 300w, https:\/\/blog.uvm.edu\/tbplante\/files\/2021\/05\/image-150x150.png 150w, https:\/\/blog.uvm.edu\/tbplante\/files\/2021\/05\/image-768x770.png 768w\" sizes=\"auto, (max-width: 957px) 100vw, 957px\" \/><\/figure>\n\n\n\n<p>You can customize the output of ggstatsplot figures by editing the ggplot2 components that make it up. You&#8217;d insert the following into the ggstatsplot code in the parentheses following &#8220;ggbetweenstats&#8221; to put the y scale on a log axis, for example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ggplot.component = ggplot2::scale_y_continuous(trans='log') <\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Quick do file to automate R-Stata integration and make ggplot2 or ggstatsplot figures <\/h2>\n\n\n\n<p>I made a <a rel=\"noreferrer noopener\" href=\"http:\/\/www.uvm.edu\/~tbplante\/rcall_ggplot2_ggstatsplot_setup_v1_0.do\" target=\"_blank\">do file<\/a> that simplifies the setup of Rcall. Specifically, it 1. Sets R&#8217;s working directory to match your current Stata working directory, 2. Starts with a fresh R instance, 3. 
Loads your current Stata dataset in R, and 4. Loads ggplot2 and ggstatsplot in R. <\/p>\n\n\n\n<p>To use, just load your data, run a &#8220;do&#8221; command followed by the URL to my do file, then run whatever ggplot2 or ggstatsplot commands you want. This assumes you have installed R, the required packages, and Rcall (see the very top of this page). If you get an error, <a href=\"http:\/\/www.uvm.edu\/~tbplante\/rcall_ggplot2_ggstatsplot_setup_alt_v1_0.do\" target=\"_blank\" rel=\"noreferrer noopener\">try using this alternative version of the do file<\/a> that doesn&#8217;t try to match Stata&#8217;s and R&#8217;s working directories.<\/p>\n\n\n\n<p>Example code:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ Step 1: open dataset\nsysuse auto, clear\n\/\/ Step 2: run the do file, hosted on my UVM directory:\ndo https:\/\/www.uvm.edu\/~tbplante\/rcall_ggplot2_ggstatsplot_setup_v1_0.do\n\/\/ if errors with above, use this do file instead:\n\/\/ do https:\/\/www.uvm.edu\/~tbplante\/rcall_ggplot2_ggstatsplot_setup_alt_v1_0.do\n\/\/ Step 3: run whatever ggplot2 or ggstatsplot code you want:\nrcall: e&lt;- ggplot(data, aes(x=mpg, y=weight)) + \/\/\/\n\tgeom_point(aes(col=foreign, size=headroom)) + \/\/\/\n\tgeom_smooth(method=&quot;loess&quot;) +   \/\/\/\n\tlabs(title=&quot;ggplot2 demo&quot;, x=&quot;MPG&quot;, y=&quot;Weight&quot;, caption=&quot;Caption!&quot;)\nrcall: ggsave(&quot;ggtest.png&quot;, plot=e)<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Example 2: Using &#8220;comorbidity&#8221; R package in Stata with Rcall to estimate Charlson comorbidity index or Elixhauser comorbidity score<\/h2>\n\n\n\n<p>Read all about this handy package <a rel=\"noreferrer noopener\" href=\"https:\/\/cran.r-project.org\/web\/packages\/comorbidity\/index.html\" target=\"_blank\">here<\/a> and in the <a rel=\"noreferrer noopener\" href=\"https:\/\/cran.r-project.org\/web\/packages\/comorbidity\/comorbidity.pdf\" target=\"_blank\">PDF reference manual<\/a>. 
In R, type:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>install.packages(\"comorbidity\") <\/code><\/pre>\n\n\n\n<p>Here&#8217;s some semicolon-delimited Stata code, to run from a Stata do file, that applies the Charlson comorbidity index to some Stata data.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>webuse australia10, clear \/\/ load a default stata dataset with an ICD10 variable\ngen id=_n \/\/ make an ID by row as there's no ID variable in this dataset\n\n#delimit ;\nrcall clear ; \nrcall: library(comorbidity) ; \/\/ load comorbidity package\nrcall: data&lt;- st.data() ; \/\/ move data to r\nrcall: names(data) ; \/\/ look at data\nrcall: head(data, n=10) ; \/\/ look at rows\nrcall: charlston &lt;- comorbidity(x=data, id=&quot;id&quot;, code = &quot;cause&quot;, \n\t\tmap = &quot;charlson_icd10_quan&quot;, assign0 = FALSE) ;\nrcall: score(x=charlston, weights = &quot;charlson&quot;, assign0=FALSE) ;\nrcall: mergeddata &lt;- merge(data, charlston, by=&quot;id&quot;) ; \/\/ merge the original &amp; new charlson data\nrcall: head(mergeddata, n=10) ; \/\/ look at rows\nrcall: st.load(mergeddata) ; \/\/ kick the merged data back to stata\n#delimit cr\n<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Example 3: Predicting race by last name and sex by first name using the &#8216;predictrace&#8217; package<\/h2>\n\n\n\n<p>I came across this &#8216;predictrace&#8217; package: <a href=\"https:\/\/jacobkap.github.io\/predictrace\/\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/jacobkap.github.io\/predictrace\/<\/a><\/p>\n\n\n\n<p>&#8230;which says that it implements the methods described in this paper: <\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Tzioumis, K. Demographic aspects of first names. <em>Sci Data<\/em> <strong>5<\/strong>, 180025 (2018). 
https:\/\/doi.org\/10.1038\/sdata.2018.25<\/li>\n\n\n\n<li><a href=\"https:\/\/www.nature.com\/articles\/sdata201825\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.nature.com\/articles\/sdata201825<\/a><\/li>\n<\/ul>\n\n\n\n<p>Here&#8217;s how to use this package in Stata using Rcall. <strong><em>There is a lot of nuance in this package so make sure to read the paper and the github page<\/em><\/strong>. <\/p>\n\n\n\n<p>In R, type: <\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>install.packages(\"predictrace\")<\/code><\/pre>\n\n\n\n<p>Then in Stata, write a do file that generates a variable called &#8220;lastname&#8221; containing lowercase last names (for race matching) and a variable called &#8220;firstname&#8221; containing lowercase first names (for sex matching). Below is the code to estimate race by last name (first batch of code) and then sex by first name (second batch of code). This is semicolon delimited.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Estimating race by last name<\/h3>\n\n\n\n<p>(Note: this considers &#8220;Hispanic&#8221; to be a race and not an ethnicity&#8230;)<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ Clear memory, input dataset of first and last names.\n\/\/ Here, I'm formatting the strings so they are up to 100 characters\n\/\/ in length so they don't get clipped (str100). \n\/\/ If I specified str5 then \"Flintstone\" would be \"Flint\". 
\nclear all\ninput str100 firstname str100 lastname\n\"Jacob\" \"Peralta\"\n\"Rosa\" \"Diaz\"\n\"Terrence\" \"Jeffords\"\n\"Amy\" \"Santiago\"\n\"Charles\" \"Boyle\"\n\"Regina\" \"Linetti\"\n\"Raymond\" \"Holt\"\n\"Michael\" \"Hitchcock\"\n\"Norman\" \"Scully\"\nend\n\ncompress firstname \/\/ optional, shortens string format from 100 char to the minimum length\ncompress lastname \/\/ optional, shortens string format from 100 char to the minimum length\n\n\/\/ now replace all names with their lower case variant:\nreplace firstname = lower(firstname) \/\/ first name isn't used here fyi\nreplace lastname = lower(lastname)\n\n#delimit ;\nrcall clear ; \nrcall: library(predictrace) ; \/\/ load predictrace\nrcall: data&lt;- st.data() ; \/\/ move data to r\nrcall: names(data) ; \/\/ look at names of data\nrcall: head(data, n=10) ; \/\/ look at first 10 rows\nrcall: lastnamevector &lt;-data&#091;, &quot;lastname&quot;] ; \/\/ make vector from lastname column \nrcall: data = merge(data, predict_race(lastnamevector), by.x = &#039;lastname&#039;, by.y = &#039;name&#039;, sort = FALSE) ; \nrcall: head(data, n=10) ; \/\/ look at rows\nrcall: st.load(data) ; \/\/ kick the merged data back to stata\n#delimit cr<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Estimating sex by first name<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ Clear memory, input dataset of first and last names.\n\/\/ Here, I'm formatting the strings so they are up to 100 characters\n\/\/ in length so they don't get clipped (str100). \n\/\/ If I specified str5 then \"Flintstone\" would be \"Flint\". 
\nclear all\ninput str100 firstname str100 lastname\n\"Jacob\" \"Peralta\"\n\"Rosa\" \"Diaz\"\n\"Terrence\" \"Jeffords\"\n\"Amy\" \"Santiago\"\n\"Charles\" \"Boyle\"\n\"Regina\" \"Linetti\"\n\"Raymond\" \"Holt\"\n\"Michael\" \"Hitchcock\"\n\"Norman\" \"Scully\"\nend\n\ncompress firstname \/\/ optional, shortens string format from 100 char to the minimum length\ncompress lastname \/\/ optional, shortens string format from 100 char to the minimum length\n\n\/\/ now replace all names with their lower case variant:\nreplace firstname = lower(firstname)\nreplace lastname = lower(lastname) \/\/ last name isn't used here fyi\n\n#delimit ;\nrcall clear ; \nrcall: library(predictrace) ; \/\/ load predictrace\nrcall: data&lt;- st.data() ; \/\/ move data to r\nrcall: names(data) ; \/\/ look at names of data\nrcall: head(data, n=10) ; \/\/ look at first 10 rows\nrcall: firstnamevector &lt;-data&#091;, &quot;firstname&quot;] ; \/\/ make vector from firstname column \nrcall: data = merge(data, predict_gender(firstnamevector), by.x = &#039;firstname&#039;, by.y = &#039;name&#039;, sort = FALSE) ; \nrcall: head(data, n=10) ; \/\/ look at rows\nrcall: st.load(data) ; \/\/ kick the merged data back to stata\n#delimit cr<\/code><\/pre>\n\n\n\n<p>Special thanks to Katherine Wilkinson for her R brilliance in debugging this. <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Example 4: Making a heatplot with lattice\/levelplot<\/h2>\n\n\n\n<p>I wanted to make a heatplot ranging from -1 to +1 and wanted the negatives to be a different color from the positives, and have them turn more muted as they get to zero. I couldn&#8217;t quite figure out how to do this with the &#8211;twoway contour&#8211; or &#8211;plotmatrix&#8211; commands. 
It was pretty simple to do these in R, just needed to use the <a href=\"https:\/\/r-graph-gallery.com\/38-rcolorbrewers-palettes.html\" data-type=\"link\" data-id=\"https:\/\/r-graph-gallery.com\/38-rcolorbrewers-palettes.html\" target=\"_blank\" rel=\"noreferrer noopener\">RColorBrewer package<\/a>, specifying &#8220;BrBG&#8221; as the palette. I used the &#8220;lattice&#8221; package and its &#8220;levelplot&#8221; command as described in <a href=\"https:\/\/r-graph-gallery.com\/27-levelplot-with-lattice.html\" data-type=\"link\" data-id=\"https:\/\/r-graph-gallery.com\/27-levelplot-with-lattice.html\" target=\"_blank\" rel=\"noreferrer noopener\">this post<\/a>.<\/p>\n\n\n\n<p>In R, install the lattice and RColorBrewer packages (lattice was already installed on my R desktop):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>install.packages(\"lattice\")\ninstall.packages(\"RColorBrewer\") <\/code><\/pre>\n\n\n\n<p>Now in Stata, input the data you&#8217;re interested in rendering in order of columns left to right then each column. Then, kick it to R, load the necessary libraries, grab your colorbrewer scheme of choice, label the axes, and then use the &#8220;levelplot&#8221; to render this. It will save a PDF in your R working directory. 
I couldn&#8217;t for the life of me figure out how to save it as a PNG, but there were diminishing returns on figuring that out.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>clear all\ninput x y z \n-0.8 0.2 0.4\n0.7 -0.4 0.1\n0.9 -0.2 -0.4\nend \n\n#delimit ;\nrcall clear ; \nrcall: data&lt;- st.data() ; \/\/ move data to r\nrcall: names(data) ; \/\/ look at names of data\nrcall: head(data, n=10) ; \/\/ look at first 10 rows\nrcall: library(lattice) ; \/\/ load packages\nrcall: library(RColorBrewer) ;\nrcall: colors &lt;- colorRampPalette(brewer.pal(11, &quot;BrBG&quot;)) ; \/\/ get colors (BrBG has at most 11)\nrcall: colnames(data) &lt;-c(&quot;alfa&quot;, &quot;bravo&quot;, &quot;charlie&quot;) ; \/\/names of columns and rows\nrcall: rownames(data) &lt;-c(&quot;one&quot;, &quot;two&quot;, &quot;three&quot;) ;\nrcall: levelplot( \n\t\tt(data&#091;c(nrow(data):1) , ]), \n\t\tcol.regions=colors, \n\t\txlab=&quot;xlabel!&quot;, \n\t\tylab=&quot;ylabel!&quot;,\n\t\tat=seq(min(-1), max(1), length.out=100),\n\t\tscales=list(y=list(rot=0), x=list(rot=45))\n\t) ; \/\/ pull data, apply colors &amp; labels, set color axis range, rotate labels\nrcall: getwd() ; \/\/ figure out where that PDF is saved:\n#delimit cr<\/code><\/pre>\n\n\n\n<p>Here&#8217;s your output!<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"884\" height=\"844\" src=\"https:\/\/blog.uvm.edu\/tbplante\/files\/2021\/05\/image-5.png\" alt=\"\" class=\"wp-image-1884\" style=\"width:605px;height:auto\" srcset=\"https:\/\/blog.uvm.edu\/tbplante\/files\/2021\/05\/image-5.png 884w, https:\/\/blog.uvm.edu\/tbplante\/files\/2021\/05\/image-5-300x286.png 300w, https:\/\/blog.uvm.edu\/tbplante\/files\/2021\/05\/image-5-768x733.png 768w\" sizes=\"auto, (max-width: 884px) 100vw, 884px\" \/><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>Stata is great because of its intuitive syntax, reasonable learning curve, and dependable implementation. 
There&#8217;s some cutting edge functionality and graphical tools in R that are missing in Stata. I came across the Rcall package that allows Stata to interface with R and use some of these advanced features. (Note: this worked for me as &hellip; <a href=\"https:\/\/blog.uvm.edu\/tbplante\/2021\/05\/27\/stata-r-integration-with-rcall\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Stata-R integration with Rcall<\/span><\/a><\/p>\n","protected":false},"author":4473,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[477491],"tags":[703377,703379,703378,703380,703498,703499,686490,686734,703505,686197,686916,703496,686336,703497,502556],"class_list":["post-779","post","type-post","status-publish","format-standard","hentry","category-stata-code","tag-charlson","tag-charlson-comorbidity-index","tag-elixhauser","tag-elixhauser-comorbidity-score","tag-estimating-race-by-name","tag-estimating-sex-by-name","tag-ggplot2","tag-ggstatsplot","tag-heatplot","tag-r","tag-r-stata-integration","tag-race-by-name","tag-rcall","tag-sex-by-name","tag-stata"],"_links":{"self":[{"href":"https:\/\/blog.uvm.edu\/tbplante\/wp-json\/wp\/v2\/posts\/779","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.uvm.edu\/tbplante\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.uvm.edu\/tbplante\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.uvm.edu\/tbplante\/wp-json\/wp\/v2\/users\/4473"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.uvm.edu\/tbplante\/wp-json\/wp\/v2\/comments?post=779"}],"version-history":[{"count":63,"href":"https:\/\/blog.uvm.edu\/tbplante\/wp-json\/wp\/v2\/posts\/779\/revisions"}],"predecessor-version":[{"id":2175,"href":"https:\/\/blog.uvm.edu\/tbplante\/wp-json\/wp\/v2\/posts\/779\/revisions\/2175"}],"wp:attachment":[{"href":"https:\/\/blog.uvm.edu\/tbplante\/wp-json\/wp\/
v2\/media?parent=779"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.uvm.edu\/tbplante\/wp-json\/wp\/v2\/categories?post=779"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.uvm.edu\/tbplante\/wp-json\/wp\/v2\/tags?post=779"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}