Why are you restricting the discussion just to data science? In "general" science there are way more devs than just in data science / statistics, and Julia absolutely shines there. Don't get me wrong, the language is general purpose, the ecosystem is a bit niche for now, but still, it seems wild to restrict comments to such a small field as data science.
I'm a mathematician at a research university, and maybe two of my colleagues are using Julia. Despite their proselytism, everyone else is using Python, C, or math-specific software such as GAP, Matlab, or Mathematica.
Depends on what exactly you would like evidence for. Your dissenting comment is that Julia is not popular yet. With that I can easily agree, but that is also not directly related to whether it is an amazing tool, which was my claim.
In terms of examples of hard sciences where it shines: it is the only ecosystem in existence that combines high-quality differential equation solvers with autodiff through them. Compare DifferentialEquations.jl to any other package in any other language. The rich capabilities of that package depend on Julia's multiple dispatch and aggressive devirtualization. Python/JAX/TensorFlow/PyTorch, wonderful as they are on their own, are nowhere near these capabilities. Matlab and Mathematica do not have them. The famous C/Fortran/C++ libraries are also far less capable in comparison.
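To make that concrete, here is a minimal sketch of what "autodiff through the solver" means, assuming DifferentialEquations.jl and ForwardDiff.jl are installed (the function names are mine, not from either package):

```julia
using DifferentialEquations, ForwardDiff

# Exponential decay: du/dt = -p[1] * u
decay(u, p, t) = -p[1] * u

# Solve the ODE for a given parameter vector and return the final state.
function final_state(p)
    prob = ODEProblem(decay, 1.0, (0.0, 1.0), p)
    return solve(prob, Tsit5()).u[end]
end

# ForwardDiff pushes dual numbers straight through the solver: multiple
# dispatch means no solver-specific autodiff glue is needed.
ForwardDiff.gradient(final_state, [0.5])
```

The dual numbers are just another number type flowing through generic solver code, which is exactly the multiple-dispatch story above.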
Am I the only one who actually likes the taste of RC the most? That aside, Julia is relatively young for a programming language, at about ten years old. It's still possible for it to find a niche, even if it is not the one it aimed for. It took until the late 00s for Python to enter the data science niche in the first place.
Sure, RC is fine, but if you 1-base your arrays then you've chosen to be the pariah. I have no patience for technology that just chooses to be special for the sake of it.
Let’s be serious: I have yet to see a convincing argument for 0-based being better than 1-based, or the other way around. And designing a language based on “what everybody else does” is definitely not the right approach.
The arguments for 0-based indexing with exclusive upper bounds:

- hi - lo == len;
- 0-length intervals can be expressed without hi < lo;
- if you have found the first index at which a predicate is true and the first index at which it is no longer true, those indices themselves are usable as lower and upper bounds, so you may write my_slice[first_true_idx..first_subsequent_false_idx] to get the sub-slice in which the predicate holds throughout;
- using mod to restrict indices to the valid range is idx % n rather than (idx - 1) % n + 1.
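In code, the length and wrap-around points look like this (Julia syntax; the helper names are just for illustration):

```julia
# Length of a range under each convention:
len_halfopen(lo, hi)  = hi - lo        # 0-based, exclusive hi; empty when hi == lo
len_inclusive(lo, hi) = hi - lo + 1    # 1-based, inclusive hi; empty needs hi == lo - 1

# Restricting an index to the valid range:
wrap0(i, n) = i % n                    # 0-based: results land in 0 .. n-1
wrap1(i, n) = mod1(i, n)               # 1-based: mod1(i, n) == (i - 1) % n + 1 for positive i
```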
The argument for 1-based indexing with inclusive upper bounds is that binary heaps are easier to write this way. Edit: And that people who are not primarily programmers may find 0-based indexing weird.
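For reference, the heap arithmetic in question, sketched in Julia (hypothetical helper names):

```julia
# 1-based: the root lives at index 1, and the index maps stay clean
heap_parent(i) = i ÷ 2
heap_left(i)   = 2i
heap_right(i)  = 2i + 1

# 0-based: the root lives at index 0 and the same maps pick up corrections:
#   parent(i) = (i - 1) ÷ 2,  left(i) = 2i + 1,  right(i) = 2i + 2
```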
> The argument for 1-based indexing with inclusive upper bounds is that binary heaps are easier to write this way.
Nope. The argument for 1-based indexing with inclusive upper bounds is that, to express ranges, exclusive upper bounds require you to have a bogus just-one-beyond-the-largest-valid-index element and a way to get the successor of every index. This is a giant PITA for anything but integers (where the sentinel is MAX_INT and the successor is just x + 1), which is why every single programming language I can think of also has end-inclusive ranges in some contexts. If you don't believe me, try writing regexps with end-exclusive ranges.
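Character ranges make the successor problem concrete. A tiny Julia sketch (Julia's Char ranges happen to be inclusive, like regex character classes):

```julia
letters = 'a':'z'    # inclusive Char range: 26 elements, like regex [a-z]
'q' in letters       # => true
# With an exclusive upper bound you would have to name the successor of 'z',
# i.e. exactly the bogus one-past-the-end element described above.
```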
On the other hand, there is no natural way to express empty ranges with 1-based inclusive indexing: you have to fall back on hi < lo, as in Julia's a[1:0].
If there are no convincing arguments one way or the other, then "what everybody else does" becomes the convincing argument. Why change established conventions for no good reason? There's no inherent reason for curly braces to indicate control blocks and square brackets to indicate indexing, but if a language swapped the two, what would you say?
Indeed, and if you add up all the users of 1-based languages (Fortran, Matlab, R, Excel, etc.) and 0-based languages (C, C++, Java, Python, etc.), I think you’ll find that the 1-based languages have vastly more programmers.
Going by popularity, 1-based indexing is the established convention.
Indexing is the primary operation in Excel and one of the first things you learn how to do in the language (selecting a range of cells). It’s how you refer to anything, by indexing into the global cell space.
Given Excel’s massive success and nontechnical user base, which again is larger than that of all other languages combined, it’s hard for me to see how 1-based indexing was a mistake. I have experience teaching novices how to program, and 0-based indexing is always a sticking point of confusion. So from my perspective, 1-based indexing is the right choice for Excel given the user base and programming style.
More used in scientific computing, which was my point: only programmers think of indexing as an offset rather than counting from the first positive integer. Nobody outside of programming in C/Unix-inspired languages starts counting at 0.
When someone wrote “Going by popularity, 1-based indexing is the established convention” and you replied “Yes, but not so much in scientific computing” I understood that as “1-based indexing is less used in scientific computing compared to general computing”.
Fortran, R, and Matlab have 1-based arrays. That's pretty common among scientific computing languages. I'd hardly call those pariahs, and Fortran's older than C.
More programming in the world is done in a 1-based language, Excel, which has more users than all other languages combined.
0-based languages are in the minority as far as installed base and usage goes. Being 1-based is a decision dependent on target audience and application.
Fortran, one of the earliest languages, is 1-based, so there’s plenty of historical precedent.
Matlab, one of the most commercially successful languages in history, is 1-based, so it’s not exactly a barrier to adoption.
If you're counting the number of people who use Excel spreadsheets as "programming in Excel", then that's great. I no longer need to take you seriously.