20M digits of pi in 1 minute using Julia (gist.github.com)
109 points by juliusgeo on March 3, 2023 | 77 comments


I've found Julia to be an excellent language for Project Euler [1]. Besides the speed, you can use Unicode identifiers, so the solution can more closely follow the math.

1. https://projecteuler.net/


That doesn't seem to be a particularly distinguishing feature. I know that C#, Python, and JavaScript can all use non-ASCII identifiers. I believe most widely used languages can too, but I'm less certain of the details.


I think the distinction is that Julia's parser does a relatively good job assigning code-points to unary operators/binary operators/identifiers as appropriate, combined with the fact that multiple dispatch inherently looks more like math.


The difference is that Julia makes it easy to insert these characters (\symbol) so programming with them becomes more natural.



Julia will read 2π (fixed: thx kps) as 2 times pi, which follows mathematical notation. In all the other languages you have mentioned, I believe even if you can represent pi in Unicode (π) you will have to write 2*π. This can become very irritating when you get to, e.g., Python/NumPy, where you will litter your code with * and ** instead of the conventional mathematical notation. It gets a bit tiring if you are looking at more than a page of code. Aesthetics matter.
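For example, in the REPL:

    julia> r = 3.0;

    julia> 2π * r   # 2π parses as 2 * π (implicit multiplication by juxtaposition)
    18.84955592153876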


If you really care about mathematical notations, you should be using Mathematica, not Julia.


Mathematica is both closed source, and very idiosyncratic. It is primarily useful as a way to check things, but I have never seen anyone in actual research use Mathematica.


Mathematica is definitely better than Julia for symbolic math, but for computational math or algorithm development, Matlab is basically unusable.


I find that really surprising, would you elaborate? I love Julia, but I haven't found a REPL- or Pluto-based workflow that I am anywhere near as productive in for prototyping algorithms as I am in Matlab. And it typically doesn't even run that much slower.


Typo, sorry: that "Matlab" was supposed to be "Mathematica". IMO Julia is also better than Matlab here (I use editor+repl, but Pluto is also pretty good), but Matlab is definitely a contender in this area.


Editor from within the repl or outside? If outside, what triggers the repl to reload files? If inside, which editor? I never found a good way to copy and paste a line I got right and liked into a file. I tried drdocstrings but it never seemed to feel smooth. It doesn't help that all the keys in the Julia REPL default to being Emacs-like and all my muscle memory is Vim-ish. So even after making my repl editor vim, there is stuff in the repl that feels foreign.


editor outside the repl. Revise means that you get auto-reload on save.
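(For anyone unfamiliar, the Revise.jl workflow looks roughly like this; the file and function names here are just placeholders:)

    julia> using Revise

    julia> includet("algo.jl")   # "tracked" include: definitions update when the file is saved

    julia> myfunc(10)            # edit algo.jl in your editor, save, and just call it again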


Thanks! I thought Revise only worked if I used the internal editor.


Why would you use Matlab to "prototype algorithms"? Maybe we have different notions what that means but Matlab is certainly not aimed at that. It's aimed at engineers who want to evaluate their data or design/control embedded devices, and who are experts at their field of engineering while having very little computer science, numerics or programming background.


I know no one who uses Matlab for the uses you mention. Basically everyone is using it for algorithm development, and that includes both students and companies. And the way I understand it, algorithm development/prototyping is also its core application.

Not that I like doing this in Matlab, but that's what everyone uses, though some are moving to python.


I work in embedded DSP and controls, and even with a very strong programming background, I find Matlab a lot faster than Julia or NumPy for verifying that my approach to something is going to work. Not because of all the toolboxes, which I mostly don't use, but because the workflow makes it easier to try stuff fast and visualize it in meaningful ways.


The Julia REPL allows you to type Unicode with \blah<tab>. Also, if you use `?` help mode, you can give it a Unicode sequence and it will tell you how to type it, which is a nice quality-of-life feature:

    help?> Ψ₂ˣ
    "Ψ₂ˣ" can be typed by \Psi<tab>\_2<tab>\^x<tab>


While true, one distinguishing feature is the sheer number of Unicode binary and unary operators that can be used and overloaded. Take for example ∪ for set union and ∩ for set intersection (e.g. `set1 ∪ set2`).

(You can also type these easily via latex-style completions, e.g. with \cup<TAB> for ∪ or \cap<TAB> for ∩)
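For instance:

    julia> [1, 2, 3] ∩ [2, 3, 4]
    2-element Vector{Int64}:
     2
     3

    julia> [1, 2] ∪ [2, 3]
    3-element Vector{Int64}:
     1
     2
     3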


Nitpick: is that really computing 20M digits correctly? Reading the code gives me the impression that this does all float computations in 20 million digits (or 20 million-ish? If BigFloat internally is binary, it might use a bit more), but I would expect that you need to compute using a few more digits to avoid rounding errors in the last digits.

Edit: also, reading https://stackoverflow.com/a/67919533, it seems you can do

  BigFloat(π, precision=20000000)
How does that perform?


I ran the following in my Julia REPL

    julia> @time BigFloat(π, precision=20000000);
     11.208211 seconds (60.53 k allocations: 2.932 GiB, 0.16% gc time)

Which is pretty good considering I ran this in WSL on my laptop and Mathematica in a different comment took 10 seconds. (Plain Windows took ~26 seconds for some reason?)


I'm too late to update my own comment, but I now suspect BigFloat(π, precision=20000000) may allocate a BigFloat with the requested 20-million-bit precision and then store the constant π, which has much lower precision, in it.

What value does that print?


It definitely does calculate it; otherwise it would diverge significantly at lower digits from the Gauss-Legendre and Chudnovsky algorithms that I compared it to. It just defers the calculation to MPFR, the C library Julia binds to for BigFloat calculations.


Also, one additional comment: you are setting precision in number of bits, not in decimal digits. It obviously runs much faster when it is only computing 20000000 bits (about 6 million decimal digits), not 20000000 digits of precision.
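(For reference, MPFR counts precision in bits, so N decimal digits needs roughly N·log2(10) ≈ 3.32·N bits. A sketch of requesting a true 20M decimal digits:)

    julia> bits = ceil(Int, 20_000_000 * log2(10))   # bits needed for 20M decimal digits
    66438562

    julia> @time BigFloat(π, precision=bits);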


> (Plain Windows took ~26 seconds for some reason?)

Maybe the allocations were the reason?


BigFloat internally is not a fixed-width binary format; it uses an array of limbs because it is made for arbitrary precision. There will absolutely be small errors at the end, but it's easier to say 20M than 1.999999M digits of pi.


In Wolfram Mathematica on my desktop PC with a 7 year old processor:

    Timing[N[\[Pi], 20000000];]
    {10.1094, Null}
On a fairly recent laptop CPU I got 7.6 seconds.

A fair ways to go still!


While this is impressive, it is neither the same algorithm, nor even the same type of language (I already know Julia is much slower than C). I don't think it is much of a relevant comparison for a benchmark when almost nothing is the same except for the expected answer. Perhaps if Gauss-Legendre were implemented in Mathematica it would only take 3 seconds! There's no way for us to know.


does Mathematica actually compute it or is it just loading the value that it already knows?


Definitely computing. For arbitrary precision evaluation like this, Mathematica has always had cutting-edge algorithms built in.

As a random example: large matrix operations have been multi-threaded for ages.


7.6 seconds seems too long to be just loading a value from disk


it's not using the same algorithm right?


This is pretty cool. One minor gripe is that there are a lot of useless type asserts cluttering things up; none of the usages of ::BigFloat here were needed.


This is absolutely true—I was just putting them everywhere possible in an attempt to avoid allocations.


I don’t know much about Julia, why would it potentially cause allocations?


If the code were type-unstable, that could cause allocations, but in cases like this it wouldn't. Type stability is the ability of the compiler to predict the types of variables from the types of the inputs. Type-unstable code can be found using `@code_warntype` and will cause allocations due to boxing and dynamic dispatch.
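(A tiny illustration; `unstable` is a made-up function whose return type depends on a runtime value rather than on the input type:)

    julia> unstable(x) = x > 0 ? 1 : 1.0   # returns Int or Float64 depending on the value

    julia> @code_warntype unstable(2)      # flags Body::Union{Float64, Int64}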


Thank you for the feedback. I will try to remove as many of the explicit type hints as possible. The main reason I added them everywhere is that I sometimes found instability at very low digit counts because of type promotion.


Yeah this is a minor gripe with Julia. It has type inference but it's risky to use.


True, but you can remove a lot of the risk by inspecting the results of type inference with `@code_warntype` in places where you suspect it might not be working.

I believe JET.jl can automate global program analysis, but I haven't used it.


Just wanted to mention that SuperPi, the benchmark that has been widely used for at least a decade for measuring single-threaded PC/CPU performance, uses the same algorithm, so it's not really obscure. That said, multi-threaded algorithms make more sense on modern multi-core architectures; something like this, for example:

http://www.numberworld.org/y-cruncher/


I still remember when the SuperPI 1M <10s record was beaten... with a CPU overclocked to 5GHz+: https://tweakers.net/ext/i/1160952453.gif

The world record remains at just under 4s, with 8GHz+ overclocks: https://hwbot.org/benchmark/superpi_-_1m/halloffame

Clock cycle efficiency doesn't seem to have improved much in the last decade.



The following implementation of the Chudnovsky algorithm takes about the same time in Gambit-Scheme on M1 Mac for 20M digits (and allocates 16GB):

  (define (pi digits)
    (let* ((A 13591409)
           (B 545140134)
           (C 640320)
           (C^3/24 (quotient (expt 640320 3) 24))
           (D 12))
      (define (-1^n*g n g) (if (odd? n) (- g) g))
      (define (split m n)
        (if (= 1 (- n m))
          (let* ((6n (* 6 n)) (g (* (- 6n 5) (- (+ n n) 1) (- 6n 1))))
            (list g (* C^3/24 (expt n 3)) (* (-1^n*g n g) (+ (* n B) A))))
          (let* ((mid (quotient (+ m n) 2))
                 (gpq1 (split m mid))
                 (gpq2 (split mid n))
                 (g1 (car gpq1)) (p1 (cadr gpq1)) (q1 (caddr gpq1))
                 (g2 (car gpq2)) (p2 (cadr gpq2)) (q2 (caddr gpq2)))
            (list (* g1 g2) (* p1 p2) (+ (* q1 p2) (* q2 g1))))))
      (let* ((num-terms (inexact->exact (floor (+ 2 (/ digits 14.181647462)))))
             (sqrt-C (integer-sqrt (* C (expt 100 digits))))
             (gpq (split 0 num-terms))
             (g (car gpq)) (p (cadr gpq)) (q (caddr gpq)))
        (quotient (* p C sqrt-C) (* D (+ q (* p A)))))))
I am surprised it is much slower in Julia (as per what is noted in the gist).


Interesting! What does Gauss Legendre look like in Gambit-Scheme? I also wonder if perhaps there is a way to get the Julia implementation up to par…
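(For readers following along, the core Gauss-Legendre iteration is short. A rough Julia sketch, not the gist's actual code, so the guard bits and iteration count are only approximate:)

    function gauss_legendre_pi(digits)
        # ~3.33 bits per decimal digit, plus some guard bits
        setprecision(BigFloat, ceil(Int, digits * log2(10)) + 64) do
            a, b = BigFloat(1), 1 / sqrt(BigFloat(2))
            t, p = BigFloat(1) / 4, BigFloat(1)
            # each iteration roughly doubles the number of correct digits
            for _ in 1:ceil(Int, log2(digits))
                # note: a - (a + b)/2 == (a - b)/2, so t updates in one step
                a, b, t, p = (a + b) / 2, sqrt(a * b), t - p * ((a - b) / 2)^2, 2p
            end
            (a + b)^2 / (4t)
        end
    end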


I will give it a try, time permitting. The Scheme code above is a variation of something I converted from Common Lisp (original code by Robert Smith) in 2013. Its speed depends on bignum multiplication speed. I imagine most serious bignum implementations use the Schönhage–Strassen algorithm for very large numbers.


Thanks for looking into this! That is interesting, I wonder if it varies a great deal based on the implementation of the BigFloat rather than the language that it is implemented in.


Here is a video [1] and associated java source code [2] from pi day in 2018 to generate 1M (or more) digits of pi using the same Gauss-Legendre algorithm.

[1] https://www.youtube.com/watch?v=8RONJPOgZjw

[2] https://github.com/R-e-t-u-r-n-N-u-l-l/Java-Algorithms/blob/...


My algorithm only takes about 1.5 seconds for 1M digits. These are very cool resources though, I will read them.


Thanks! I haven't run any timings on the link I posted. I suspect julia would be much faster than java, though! I just thought it was neat that your post in March reminded me of an earlier pi day post on the same algorithm. I like the explosion from 1M to 20M.


I would love to see the timings for Java runtimes. The various implementations of BigFloats seem to be very customized to the language, so maybe you could do it faster!


On my computer the Julia algorithm gives 0.98 seconds for 1M digits and 40.7 seconds for 20M (Zen 2 CPU).


Thanks for the data!


Isn't this more hardware dependent? A 2012 Intel notebook vs a 2022 AMD laptop would yield different results with the same code. Also, well-tuned assembly can do these way faster. Python wrapped with assembly functions, run on a 6GHz Intel machine with a ramdisk, would also outperform this. I always feel this kind of claim is not helpful in day-to-day usage.


That is why I compared it to the stdlib implementation.


Is 1-minute fast?


I've done 2.5 billion digits in 88 seconds with ycruncher. That's about 28.4 million digits a second?

http://ewams.net/?date=2020/04/17&view=sockets_vs_cores_for_...

Though that is with a broadwell CPU and about 3 years ago. Might see if I can get my hands on a 6348 ice lake CPU and see how she burns.


All variables held the same, Julia often beats C, NumPy, and most other languages.

Julia also supports transparent Cluster API and CUDA integration, i.e. the hot mess you hope never to have to maintain professionally.

In my humble opinion, the utility of this high-level language makes it fundamentally different. Thus, it is slowly becoming more mainstream as the ecosystem stabilizes.

Remember to have fun =)


I just ran the same thing in Mathematica and I got 20 million digits in just over 7 seconds.

You can have "very high level functional programming" and 8x the speed of Julia in a mathematics-oriented language!


Mathematica does several optimizations before execution, and as such can exploit a priori knowledge of many problems.

The danger of course is when Wolfram takes those assumptions into a room of mathematicians. It is controversial for all the wrong reasons. =)

It is a common question by the way =)

https://stackoverflow.com/questions/60121757/julia-vs-mathem...


Julia already is a very high-level mathematics-oriented programming language, though.

The reason Mathematica is so much faster here is it’s using a different algorithm. When you compare using the same algorithm, Julia is 10-100x faster than Mathematica. https://julialang.org/benchmarks/


But last I checked there's Julia for Python? So you could get the best of both worlds (nice syntax of Python and the speed of Julia).


I feel like you’re misunderstanding how Julia syntax works. Julia syntax is Python syntax, for the most part. They’re not identical, but the differences are very small, and typically favor Julia—for example, compare:

    x*(y.^2)

To the Python equivalent:

    np.matmul(x, list(map(lambda v: v**2, y)))

Note also that the first will be much faster because it can fuse the matrix multiplication with the exponentiation. That's because Julia is a single language, and the whole thing is compiled at the same time. Python can't fuse the above code because NumPy is written in C, not Python.

Julia is just Python but fast.
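(One nuance: what Julia fuses is chains of dotted/broadcast operations, which compile to a single loop with no temporary arrays. A quick illustration:)

    julia> x = rand(1000); y = rand(1000);

    julia> z = @. x + 2 * y^2;   # @. dots every operation: one fused pass, one allocation for z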


Yes, Julia has wrappers for many other languages, so there are options if you really need some specific chunk of C/C++, Fortran, Python, Octave or R etc.

Yet a polyglot project is bad design, and tends to become an abomination over time. This is one reason many Go programmers rewrote libraries rather than saturate their code with cgo/C and SWIG/C++ wrappers. Similarly, people are writing scalable versions of libraries in Julia to improve transparent parallel performance for problems too big for a single machine to feasibly handle.

Things are still undergoing change, but of course the commercial Nvidia binary blobs will haunt everyone for a while. =)


In the article I compare my implementation to the MPFR pi implementation, which is in pure C. So the fact that I'm able to get close to that performance in Julia is impressive (I think).


When I googled digits of pi I found http://www.numberworld.org/y-cruncher/ where they get 25M digits in about one second (rather than one minute) depending on what kind of computer you use.


y-cruncher is the state of the art, but it is also not written in a language like Julia, and doubtless occupies many more lines of code


Y-cruncher is closed source; there's no way to know the number of lines of code it uses compared to the Julia version.


That is true, but isn’t it then more useful to provide an open source method of computation at these ranges?


genuine question: would that amount of digits be useful for any application?



Thank you for posting this! This is exactly what I was looking for.


TLDR: "For JPL's highest accuracy calculations, which are for interplanetary navigation, we use 3.141592653589793"...Using pi to 15 decimal places, you'd get the distance of Voyager 1 wrong by less than 1cm!


No, it is basically only useful as a programming exercise. You only need like 40 digits to compute the circumference of the universe to the precision of a hydrogen atom.


What is the circumference of the universe?


2.8×10^39 picometers is roughly the circumference of the observable part of our universe. A hydrogen atom has a Bohr radius of about 53 picometers.

So 40 digits of pi would indeed be in the ballpark for computing anything distance-wise at the circumference of the observable universe down to the size of the hydrogen electron cloud.
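(Back-of-the-envelope, taking ~53 pm for the Bohr radius:)

    julia> ceil(log10(2.8e39 / 53))   # decimal digits needed to resolve one Bohr radius
    38.0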


One of the siblings posted a link from JPL about the number of digits they use for their highest precision calculations. In that article they also state that 37 decimals would be sufficient for calculating the circumference of the observable universe down to the accuracy of the diameter of a hydrogen atom.[0]

[0] https://www.jpl.nasa.gov/edu/news/2016/3/16/how-many-decimal... (Right before the last inline image)


https://github.com/philipl/pifs

but this is a joke hahaha



