Why are you restricting the discussion just to data science? In "general" science there are way more devs than just in data science / statistics, and Julia absolutely shines there. Don't get me wrong, the language is general purpose, the ecosystem is a bit niche for now, but still, it seems wild to restrict comments to such a small field as data science.
I'm a mathematician at a research university, and maybe two of my colleagues are using Julia. Despite their proselytism, everyone else is using Python, C, or math-specific software such as GAP, Matlab, or Mathematica.
Depends on what exactly you would like evidence for. Your dissenting comment is that Julia is not popular yet. With that I can easily agree, but that is also not directly related to whether it is an amazing tool, which was my claim.
In terms of examples of hard sciences where it shines: it is the only ecosystem in existence that combines high-quality differential equation solvers with autodiff through them. Compare DifferentialEquations.jl to any other package in any other language. The rich capabilities of that package depend on Julia's multiple dispatch and aggressive devirtualization. Python/JAX/TensorFlow/PyTorch, wonderful as they are on their own, are nowhere near these capabilities. Matlab and Mathematica do not have them. The famous C/Fortran/C++ libraries are also far less capable in comparison.
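To make that concrete, here is a minimal sketch of what "autodiff through the solver" means, assuming DifferentialEquations.jl and ForwardDiff.jl are installed (the function names are mine, not from either package):

```julia
using DifferentialEquations, ForwardDiff

# Exponential decay: du/dt = -p[1] * u
decay(u, p, t) = -p[1] * u

# Solve the ODE for a given parameter vector and return the final state.
function final_state(p)
    prob = ODEProblem(decay, 1.0, (0.0, 1.0), p)
    return solve(prob, Tsit5()).u[end]
end

# ForwardDiff pushes dual numbers straight through the solver: multiple
# dispatch means no solver-specific autodiff glue is needed.
ForwardDiff.gradient(final_state, [0.5])
```

The dual numbers are just another number type flowing through generic solver code, which is exactly the multiple-dispatch story above.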
Am I the only one who actually likes the taste of RC the most? That aside, Julia is relatively young for a programming language, at about ten years old. It's still possible for it to find a niche, even if it is not the one it aimed for. It took until the late 00s for Python to enter the data science niche in the first place.
Sure, RC is fine, but if you 1-base your arrays then you've chosen to be the pariah. I have no patience for technology that just chooses to be special for the sake of it.
Let’s be serious: I have yet to see a convincing argument for 0-based being better than 1-based, or the other way around. And designing a language based on “what everybody else does” is definitely not the right approach.
The arguments for 0-based indexing with exclusive upper bounds:

- hi - lo == len;
- 0-length intervals can be expressed without hi < lo;
- if you have found the first index at which a predicate is true and the first index at which it is no longer true, those indices themselves are usable as lower and upper bounds, so you may write my_slice[first_true_idx..first_subsequent_false_idx] to get the sub-slice in which the predicate holds throughout;
- using mod to restrict indices to the valid range is idx % n rather than (idx - 1) % n + 1.
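In code, the length and wrap-around points look like this (Julia syntax; the helper names are just for illustration):

```julia
# Length of a range under each convention:
len_halfopen(lo, hi)  = hi - lo        # 0-based, exclusive hi; empty when hi == lo
len_inclusive(lo, hi) = hi - lo + 1    # 1-based, inclusive hi; empty needs hi == lo - 1

# Restricting an index to the valid range:
wrap0(i, n) = i % n                    # 0-based: results land in 0 .. n-1
wrap1(i, n) = mod1(i, n)               # 1-based: mod1(i, n) == (i - 1) % n + 1 for positive i
```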
The argument for 1-based indexing with inclusive upper bounds is that binary heaps are easier to write this way. Edit: And that people who are not primarily programmers may find 0-based indexing weird.
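For reference, the heap arithmetic in question, sketched in Julia (hypothetical helper names):

```julia
# 1-based: the root lives at index 1, and the index maps stay clean
heap_parent(i) = i ÷ 2
heap_left(i)   = 2i
heap_right(i)  = 2i + 1

# 0-based: the root lives at index 0 and the same maps pick up corrections:
#   parent(i) = (i - 1) ÷ 2,  left(i) = 2i + 1,  right(i) = 2i + 2
```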
> The argument for 1-based indexing with inclusive upper bounds is that binary heaps are easier to write this way.
Nope. The argument for 1-based indexing with inclusive upper bounds is that, to express ranges, exclusive upper bounds require you to have a bogus just-one-beyond-the-largest-valid-index element and a way to get the successor of every index. This is a giant PITA for anything but integers (where the sentinel is MAX_INT and the successor is just x + 1), which is why every single programming language I can think of also has end-inclusive ranges in some contexts. If you don't believe me, try writing regexps with end-exclusive ranges.
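Character ranges make the successor problem concrete. A tiny Julia sketch (Julia's Char ranges happen to be inclusive, like regex character classes):

```julia
letters = 'a':'z'    # inclusive Char range: 26 elements, like regex [a-z]
'q' in letters       # => true
# With an exclusive upper bound you would have to name the successor of 'z',
# i.e. exactly the bogus one-past-the-end element described above.
```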
On the other hand, there is no natural way to express empty ranges with 1-based inclusive indexing: you have to fall back on hi < lo, as in Julia's a[1:0].
If there are no convincing arguments one way or the other, then "what everybody else does" becomes the convincing argument. Why change established conventions for no good reason? There's no inherent reason for curly braces to indicate control blocks and square brackets to indicate indexing, but if a language swapped the two, what would you say?
Indeed, and if you add up all the users of 1-based languages (Fortran, Matlab, R, Excel, etc.) and 0-based languages (C, C++, Java, Python, etc.), I think you’ll find that the 1-based languages have vastly more programmers.
Going by popularity, 1-based indexing is the established convention.
Indexing is the primary operation in Excel and one of the first things you learn how to do in the language (selecting a range of cells). It’s how you refer to anything, by indexing into the global cell space.
Given Excel’s massive success and nontechnical user base, which again is larger than that of all other languages combined, it’s hard for me to see how 1-based indexing was a mistake. I have experience teaching novices how to program, and 0-based indexing is always a sticking point of confusion. So from my perspective, 1-based indexing is the right choice for Excel given the user base and programming style.
More used in scientific computing, which was my point: only programmers think of indexing as an offset rather than counting from the first positive integer. Nobody outside of programming in C/Unix-inspired languages starts counting at 0.
When someone wrote “Going by popularity, 1-based indexing is the established convention” and you replied “Yes, but not so much in scientific computing” I understood that as “1-based indexing is less used in scientific computing compared to general computing”.
Fortran, R, and Matlab have 1-based arrays. That's pretty common among scientific computing languages. I'd hardly call those pariahs, and Fortran's older than C.
More programming in the world is done in a 1-based language, Excel, which has more users than all other languages combined.
0-based languages are in the minority as far as installed base and usage goes. Being 1-based is a decision dependent on target audience and application.
Fortran, one of the earliest languages, is 1-based, so there’s plenty of historical precedent.
Matlab, one of the most commercially successful languages in history, is 1-based, so it’s not exactly a barrier to adoption.
If you're counting the number of people who use Excel spreadsheets as "programming in Excel", then that's great. I no longer need to take you seriously.