Julia for Stata users: Part 1 – getting set up

I’m a big fan of Stata because of its simplicity and excellent documentation. I have a lot of colleagues that use R and I see why they like it, but I’m not a fan of the R syntax. I’ve played with Python and liked the syntax quite a bit but found simply installing Python and its packages to be really annoying on Windows.

The Julia language seems to have some of the nice features of R (e.g., arrays start at 1, Tidyverse-like packages, a Julia take on ggplot2) and syntax similar to Python. Julia has a feature called “broadcasting” (with shorthand being just a dot or “.”) that applies a function or operator to every element of a vector or column, one element at a time. The help files seem okay, though pretty brief. Julia users aren’t typically huge jerks on online forums. I’ve read online that the speed of Julia is very attractive to R users (but I do epidemiology work in ‘relatively’ small datasets, e.g., <1 million observations, so speed of my statistical packages doesn't really make a difference in my analyses). It is similar to Stata in that missing values sort last, as if they were larger than any number. Unlike Stata, though, a comparison involving a missing value (e.g., “missing > 40”) returns missing rather than true. Also unlike Stata, multithreading is included in the free base version (Stata only uses a single core unless you pay for a more expensive version), although Julia starts with a single thread unless you launch it with something like “julia --threads auto”. Finally, Julia seems to be pretty well-developed for AI/ML (as are R and Python), which is an area where Stata leaves something to be desired.
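To make broadcasting and missing values a bit more concrete, here is a minimal sketch in base Julia (no packages needed). The vectors x and ages are made up for illustration.

```julia
# Broadcasting: a dot applies a function or operator to each element.
x = [1, 4, 9]
roots = sqrt.(x)      # square root of each element
doubled = x .* 2      # each element times 2

# Missing values: comparisons propagate `missing`, while sorting puts it last.
ages = [34, missing, 51]
over_40 = ages .> 40                        # the middle element stays `missing`
over_40_strict = coalesce.(over_40, false)  # treat "unknown" as false
sorted_ages = sort(ages)                    # `missing` ends up at the end

println(roots)
println(doubled)
println(over_40)
println(over_40_strict)
println(sorted_ages)
```

The coalesce step matters for Stata users: in Stata, “age > 40” is true for missing ages, whereas here you have to decide explicitly what a missing comparison should count as.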

My main reservation about Julia is that it’s relatively new since it was only started ~13 years ago (it’s currently 2025) so it’s not going to have every package or instructional post under the sun. It’s old enough that I expect the language to be reasonably well-developed though. I’m also not a huge fan of capital letters in coding since I’m used to Stata and everything in Stata is lower case. That’s DEFINITELY not the case in Julia, and Julia is not at all forgiving if you type “using pkg” instead of “using Pkg”. But I thought I’d give it a go!

To the point of Tidyverse, my perspective is that much of the rise of R’s popularity is in the development and uptake of Tidyverse. Tidyverse is a meta-package that brings together a bunch of individual packages that simplify data analysis. Core to Tidyverse is the concept of “tidy data”, meaning that:

  1. Each variable is a column; each column is a variable.
  2. Each observation is a row; each row is an observation.
  3. Each value is a cell; each cell is a single value.

…for Stata users, that should sound familiar since it’s the exact data structure that Stata uses. The only difference is that Stata calls columns “variables” and rows “observations”. Conceptually, tidy data and the Tidyverse make R run like Stata. For Julia, there is an implementation of the Tidyverse called Tidier that started ~2 years ago (it’s 2025). Like Tidyverse, Tidier is a meta-package including lots of subpackages:

  • TidierData – For data manipulation, like R’s dplyr and tidyr
  • TidierPlots – For figures, a Julia implementation of R’s ggplot2
  • TidierFiles – For reading and writing different filetypes, like R’s haven and readr
  • TidierCats – For managing categorical data, like forcats
  • TidierDates – For managing dates/time, like lubridate
  • TidierStrings – For managing strings, like stringr
  • …and a few others

Unlike Stata, Julia is intended to have any number of “datasets” open at once: some can just be a single string, some can be a vector of data, and some can be a full dataframe. However, Tidier is designed to function on only a single dataframe, so if you need to include something from a “dataset” outside of the current dataframe, you need to precede the Tidier command with @eval and use a dollar sign in front of whatever the non-current-dataframe thing is. This is called interpolating, and details are here.

There aren’t many drawbacks to using Tidier that I can find, other than it being pretty new and it taking about 15-30 seconds to load the first time you load (“using”) it in a Julia session.

This series of posts documents my foray into the Julia language, being an epidemiology-focused user of Stata on Windows, with an emphasis on using Tidier. Note that Mac and Linux users can probably follow along without any problems, though the installation might be slightly different.

Things that annoy me about Julia

As I’m putting these pages together, I’m coming back to document what annoys me about Julia. Here’s an incomplete list:

  • Lack of forgiveness with capitalization. I’m sorry that I typed “pkg” instead of “Pkg”. Cmon though, can you let it slide? Or at least let me know that it’s a capitalization problem?
  • Pkg isn’t auto-loaded with Julia. C’mon…
  • Strings with backslashes or dollar signs ($) and probably other characters will confuse Julia. The simple workaround is to smush ‘raw’ before the opening quote of these strings, e.g., raw"$50,000"
  • Sorting functions appear to sort by capital letters first, so “Zebra” would be sorted to be above “apple”. The workaround is to generate a lowercase column and sort on that. It’s clunky.
  • Slowness with loading packages the first time (in 2025). Tidier takes 30 seconds to load, and that’s not unique to Tidier. Yes, Julia is reported to be faster than R, but it doesn’t seem zippy when you are loading things.
  • Needing to manually load packages before using them. It would be nice for Julia to load packages on the fly when called the first time.
  • No way to ‘reset’ Julia like the ‘clear all’ command does to Stata. The current workaround is to close down Julia and open it up fresh.
  • No ‘cheat sheets’ for Julia packages (in 2025) like the awesome ones that have been written for Stata and R. There is a nice ‘general’ Julia cheat sheet though.
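Two of the annoyances above have short workarounds in base Julia. Here is a minimal sketch; the file path and the animals vector are made up for illustration. For a plain vector, passing “by = lowercase” to sort avoids having to generate a separate lowercase column.

```julia
# Raw strings: put `raw` before the opening quote so $ and \ are taken literally.
price = raw"$50,000"
path = raw"C:\Users\me\data.csv"   # hypothetical Windows path

# Sorting: the default order puts capital letters before lowercase ones.
animals = ["Zebra", "apple", "Mango"]
default_order = sort(animals)                # capitals first
ignore_case = sort(animals, by = lowercase)  # compare lowercase versions instead

println(price)
println(path)
println(default_order)
println(ignore_case)
```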

Installing Julia on Windows 11

Installing it from the command line prompt is incredibly simple. It automatically sets the PATH and whatnot. Steps: Hit win key + R to open the run prompt, then type “cmd” without quotes to open the command line in Windows (or just hit the start button and type “cmd” and click on “command prompt”), and paste in the install command listed here: https://julialang.org/install/

You can also manually install from the Julia Downloads page, but I would honestly follow the guidance on the install page above and do the command line prompt install. But here’s the Downloads page in case you need it: https://julialang.org/downloads/

The Julia download page also lists a portable version of Julia that you can run off a thumbdrive or as a folder on your desktop without needing to formally install it!: https://julialang.org/downloads/ (I’d get the 64 bit version, unlikely that you’ll run into a 32 bit windows PC these days. You can check your Windows version in cmd by entering “wmic os get osarchitecture”.) You just unzip the folder where you want it. You can run Julia by clicking “bin\julia.exe”. It takes about 30 seconds to open, but it works!

Running Julia on Windows 11

There are a few ways to run Julia. Since I’m just starting out, I’ll be using Option 1, which is a lot simpler.

Option 1 – Write Julia’s *.jl scripts in Notepad++ and run by copying/pasting them into Julia running in the Windows Terminal — aka the ‘low tech’ way

I’m a HUGE fan of Notepad++ as a text editor, and it works very well to write Julia scripts. (Notepad++ is for Windows only. If you are on MacOS or Linux, you might want to try Sublime Text Editor instead.) You’ll want to get the markup to reflect the Julia language. Steps:

  1. Download and install Notepad++ (get the 64-bit version, aka x64)
  2. Add Julia markup to Notepad++ as follows:
    • Download the XML file here: https://github.com/JuliaEditorSupport/julia-NotepadPlusPlus (click the XML file then click the “download as raw” button that’s a couple over from the “raw” button).
    • In Notepad++, click Language –> User defined language –> Define your language… and then import the XML file.
  3. To test it out, select Julia markup from the language list (Language –> way below the alphabetized list click “Julia”).

FYI, there is a portable version of Notepad++, so you can install it from your thumbdrive or local folder without having to install it. This would match a portable version of Julia. Details: https://portableapps.com/apps/development/notepadpp_portable

Now make a new script and save it as a Julia script file with the extension “*.jl” (that’s a J and an L). When you save with that file extension, Notepad++ automatically knows it’s for the Julia language and will apply your downloaded Julia markup.

Here’s an example program. I adapted an example counter to 1 billion from section 2.1 of this great Julia Primer from Bartlomiej Lukaszuk: Romeo and Julia, where Romeo is Basic Statistics. Note that you can add an underline in large numbers to make them a bit more readable, so “1000000000” in Julia is equivalent to “1_000_000_000”.

using Dates
# Dates is a standard library that ships with Julia, so it does not
# need to be installed with "add" before you can load it
for i in 1:1_000_000_000
	if i == 1
		println("Starting!")
		println(now())
		println(" ")
	end
	if i == 500_000_000
		println("Half way through. I counted to 500 million.")
		println(now())
		println(" ")
	end
	if i == 1_000_000_000
		println("Done. I counted to 1 billion.")
		println(now())
	end
end
# fin

If you copy and paste that into a *.jl file, you’ll see colored markup.

Double clicking the Julia link on the start menu opens up Julia in Windows Terminal by default (on my computer).

Here’s what Windows Terminal looks like with Julia running:

FYI, you can also open Julia from within Windows Terminal. Just pop open the run prompt (win key + R) then type “wt” to open up Windows Terminal. It opens up PowerShell by default, but there’s a little drop down that allows you to switch to Julia. Neat!

In a couple of seconds, Julia will load:

There are a few modes in the Julia interface to know about that you can read about here:

  • REPL mode (“julia>”) – This is the interface for Julia. Enter in all of your commands here. If you are in another mode, hit backspace to get back to REPL.
  • Help (“help?>”) – Enter a question mark (“?”) to enter the help context. You can query commands by typing the command name here.
    • Note: You need to load any package before accessing the help file btw, so if you type “?” then “Pkg”, you get an error. First, load Pkg with “using Pkg” then “?” then “Pkg”, and you’ll get the help file.
    • Note: You can also find help files for subcommands of packages. For example, after loading Pkg, you can type “Pkg.status()” to get a list of all installed packages. In the help screen (again, hit “?” to access it), type “Pkg.status” to see a help file for Pkg.status(). Similarly, if you want to learn about subpackages within Tidier, you need to list them after a dot, e.g., “Tidier.TidierDates”, or subsubpackages/commands by stringing multiple dots, e.g., “Tidier.TidierDates.difftime”.
    • Hit backspace to go back to REPL.
  • Shell (“shell>”) – Hit the semicolon (“;”) to get to the system shell. I have no idea how accessing the shell is of any value to me for what I’d use Julia for.
    • In Windows, you need to secondarily open Powershell by then typing “powershell” or the command prompt by typing “cmd”.
    • Hit backspace to go back to REPL.
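  • Pkg to install/add new packages (“pkg>”, usually shown with the environment name in front, like “(@v1.x) pkg>”) – Aka installing new programs. Hit the closing square bracket (“]”) to get here.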
    • Type “add” and the name of the package of interest to install the package. E.g., “add Dates”.
    • Type “status” to see installed packages.
    • Type “update” to update all packages.
    • Hit backspace to go back to REPL.

But how do you run scripts from Notepad++ in Julia’s REPL mode? When you want to run bits of your script, simply copy and paste them into the Julia REPL interface. Note: One little glitch is that when you paste into REPL, the last line doesn’t run until you hit enter. You’ll see it just chilling on the input (e.g., “julia> end”) after you copy/paste. A workaround is to have the last line of your code be a comment that isn’t actually needed to run your code. I’m using “# fin” as my last-line comment because it’s super classy. If you see “# fin” hanging out on the “julia>” input line in REPL, just delete it before pasting anything else or your first line will follow “# fin” and be interpreted as a comment by Julia.

If you copy the “count to a billion” code above from Notepad++ and paste it into the Windows Terminal running Julia, you get the following:

Cool! Julia counted to 1 billion in a little over 1 second. WOW THAT’S FAST!
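If you’d rather have Julia report the elapsed time than compare timestamps by eye, one option is the built-in @elapsed macro. This is only a sketch: count_to is a made-up helper, and Julia’s compiler can optimize a loop this simple very aggressively, so treat the number as a rough guide rather than a benchmark.

```julia
# Hypothetical helper: count from 1 to n and return the final count.
function count_to(n)
    counter = 0
    for i in 1:n
        counter += 1
    end
    return counter
end

# The first call compiles the function, so do a tiny warm-up run first.
count_to(10)

# @elapsed returns the number of seconds the expression took to run.
seconds = @elapsed count_to(1_000_000_000)
println("Counted to 1 billion in ", seconds, " seconds")
```

Wrapping the loop in a function is the usual Julia advice for speed, since code inside functions gets compiled and optimized as a unit.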

In Part 2, I show an alternative way of running a script you are updating and saving in Notepad++ from within Julia, rather than copying and pasting everything.

Option 2 – Visual Studio Code with the Julia extension, aka the R Studio of Julia

R Studio is the most iconic Integrated Development Environment (IDE) for R. There were a few projects intending to be IDEs for Julia (e.g., Juno, which was built on the Atom editor) that have halted development in support of Visual Studio (VS) Code’s Julia extension. VS Code is a general-purpose IDE developed by Microsoft that’s widely used in computer science and is available for all sorts of operating systems, not just Windows. VS Code should not be confused with Visual Studio, which is another IDE that Microsoft makes. The core of VS Code is open source, but the specific VS Code download from Microsoft isn’t.

Despite being a long-time Windows and Office user, I’m really not a fan of Microsoft products. They tend to be bloated and include whatever is trendy in the business world. VS Code is pretty bloat-free but does include a ‘bloaty’ AI assistant. I find VS Code to be a bit overwhelming, which is why I like Notepad++.

Downloading VS code:

You can also install VS Code as a portable app, so presumably you can have a portable version of Julia and also a portable version of VS Code both running from a thumbdrive or desktop folder without requiring an install. Details are here: https://code.visualstudio.com/docs/editor/portable

I might one day switch over to VS Code, but for now I’m sticking with Notepad++ and Windows Terminal.

Option 3 – Using Pluto notebooks, aka the Julia equivalent to Jupyter notebooks

Jupyter notebooks allow you to write and execute code in a browser, and see results in-line. Jupyter notebooks are great but require Python for their use. Pluto is very similar to Jupyter notebooks except that it is implemented using Julia code alone (no requirement for Python) and it is ‘reactive’, meaning that changing one cell automatically re-runs the cells that depend on it. You can read all about Pluto here.

After installing it (‘using Pkg’ and ‘Pkg.add("Pluto")’), load it with ‘using Pluto’ and then run Pluto in a browser by typing ‘Pluto.run()’. You can then run Julia interactively in your browser. Each Pluto cell holds a single expression, so if you want to put several lines of code in one cell, you need to add a “begin” before all of the code and “end” after all of the code (indenting what’s in between is the convention). Here’s an example of what this looks like, adapting some code for counting to 1 billion from above. (Note that this was taking A REALLY LONG TIME so I made it count to 1 million and not 1 billion.) You’ll see that the code has “begin” and “end” and everything is indented in between. The Julia output shows up below.
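As a rough idea of what a multi-line Pluto cell looks like, here is a minimal sketch of a “begin”/“end” block. It is a simpler stand-in for the counting example (it sums the numbers 1 to 1 million), and the variable names are made up. The same block also runs as-is in the regular Julia REPL.

```julia
begin
    # Everything between begin and end is treated as a single expression,
    # which is what lets it live in one Pluto cell.
    n = 1_000_000
    total = sum(1:n)
    println("The sum of 1 to ", n, " is ", total)
end
```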

Pluto is REALLY COOL and seems user friendly for simple projects. I’m worried about how slow it was compared to the Julia in Windows Terminal. I’ll probably explore it a bit later. For now, I’m sticking with Notepad++ and Julia in Windows Terminal.

Installing packages in Julia

Stata is nice in that it comes with built-in functionality that allows you to do 95% of what you’re trying to do right out of the box. Occasionally you’ll need to install additional ado programs via SSC. Julia (like R) is limited in what it can do out of the box and requires you to install additional packages to add necessary functionality. Julia’s packages are a 2-step process:

  1. Add (ie install) packages of interest.
  2. In your script, call the package with “using”. (There’s also something called “import” that you can read about here that you may want to use instead of “using”. I’m still not entirely sure of the fundamental differences between “import” and “using”, so I’m sticking with “using” for now.)
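As far as I can tell, the practical difference is this: “using” makes a package’s exported names available directly, while “import” of a package brings in only the package name, so everything has to be prefixed with it. A minimal sketch using two standard libraries that ship with Julia (Dates and Printf), so nothing needs to be installed:

```julia
# `using` brings the package's exported names into scope.
using Dates
visit = Date(2025, 1, 15)   # Date is exported by Dates, so no prefix is needed

# `import` brings in only the package name; prefix everything with it.
import Printf
label = Printf.@sprintf("%.2f", 3.14159)   # rounds to two decimals

println(visit)
println(label)
```

With “import”, it is always obvious which package a command came from, at the cost of more typing. For interactive data work, “using” is the more common choice.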

The great thing about Julia is that its package management is VERY simple and baked into its interface. It ships with a package manager “Pkg” (details here) that you can load (“using Pkg”) and then use to install other packages from a script, or alternatively you can switch over to the package management interface. In the REPL (at the “julia>” prompt), simply hit a closing square bracket (“]”) to open the package manager. Exit the package manager back to REPL by hitting backspace.

Above, you’d just hit the “]” key and see:

Again, just hit backspace to get out of the package manager back into Julia’s REPL interface.

Here’s an example of using Pkg in REPL or a script to download Tidier. Since Tidier is a metapackage with TONS of other packages inside of it, it’ll take a bit of time to install (“add”), and also will take a bit of time to load (“using”) the first time. Subsequent times that you load (“using”) Tidier will be faster, but it’ll still take ~15 seconds to load each time.

using Pkg # need to make Pkg available to Julia
Pkg.add("Tidier") 
# fin

Example for using the package manager, which you enter by hitting “]” in REPL:

add Tidier

Then hit backspace to exit the package manager back to REPL.

After these packages are installed, you can use them, but you need to make them available to Julia in your script, such as the following (Tidier will take a bit to load the first time, be patient — it’s faster with future loads):

using Tidier
[code that uses Tidier]
# fin

To update packages, use the Pkg.update() functionality, in REPL:

using Pkg
Pkg.update("Tidier")

Or in the Pkg interface (“]”):

update Tidier

Continue reading part 2 here.