
Julia was always gonna be the RC Cola of data science languages. Python and R were too far ahead and had too much mind share


Why are you restricting the discussion just to data science? In "general" science there are way more devs than just in data science / statistics, and Julia absolutely shines there. Don't get me wrong: the language is general purpose and the ecosystem is a bit niche for now, but it still seems wild to restrict comments to such a small field as data science.


> Julia absolutely shines there

Do you have evidence for that?

I'm a mathematician at a research university, and maybe two of my colleagues are using Julia. Despite their proselytism, everyone else is using Python, C, or math-specific software such as GAP, Matlab, or Mathematica.


Depends on what exactly you would like evidence for. Your dissenting comment is that Julia is not popular yet. With that I can easily agree, but that is also not directly related to whether it is an amazing tool, which was my claim.

In terms of examples of hard sciences where it shines: It is the only tool in existence that has, at the same time, high-quality differential equation solvers and autodiff on them. Compare DifferentialEquations.jl to any other package in any other language. The rich capabilities of the aforementioned package depend on the multiple dispatch + aggressive devirtualization used in Julia. Python/JAX/TensorFlow/PyTorch, while wonderful on their own, are nowhere near these capabilities. Matlab/Mathematica do not have these capabilities. The famous C/Fortran/C++ libraries are also far less capable in comparison.


Once again, I'm looking for evidence, not your say-so...


Quoting from my reply so the evidence is easier for you to notice:

> Compare DifferentialEquations.jl to any other package in any other language

If you do not know how to do such a comparison for yourself, this thread has details https://news.ycombinator.com/item?id=31883793


Am I the only one who actually likes the taste of RC the most? That aside, Julia is relatively young for a programming language, 10 years old. It's still possible for it to find a niche, even if it is not the one it aimed for. It took until the late 00s for Python to enter the data science niche in the first place.


Sure, RC is fine, but if you 1-base your arrays then you chose to be the pariah. I have no patience for technology that just chooses to be special for the sake of it.


Let’s be serious: I have yet to see a convincing argument for 0-based being better than 1-based, or the other way around. And designing a language based on “what everybody else does” is definitely not the right approach.


The argument for 0-based indexing with exclusive upper bounds is that hi - lo == len; that 0-length intervals can be expressed without hi < lo; that if you have found the first index at which a predicate is true and the first index at which it is no longer true, those indices themselves are usable as lower and upper bounds, so you may write my_slice[first_true_idx..first_subsequent_false_idx] to get the sub-slice in which the predicate is continuously true; and that using mod to restrict indices to the valid range is idx % n rather than (idx - 1) % n + 1.
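A minimal Python sketch of those properties, since Python slices are 0-based and half-open. The helper `first_index` and the sample data are made up for illustration, not taken from any library:

```python
def first_index(items, pred, start=0):
    """Return the first index >= start where pred holds, or len(items) if none."""
    for i in range(start, len(items)):
        if pred(items[i]):
            return i
    return len(items)


def is_even(x):
    return x % 2 == 0


data = [1, 3, 4, 6, 8, 9, 10]

# First index where the predicate holds, then first index where it stops holding.
lo = first_index(data, is_even)
hi = first_index(data, lambda x: not is_even(x), lo)

# Both indices are directly usable as half-open bounds [lo, hi).
run = data[lo:hi]

# hi - lo is the length of the slice, with no +1 correction.
length_matches = (hi - lo) == len(run)

# A zero-length interval is just lo == hi; no need for hi < lo.
empty = data[lo:lo]

# Wrapping indices into the valid range is a plain idx % n.
n = len(data)
wrapped = [data[idx % n] for idx in range(n, n + 3)]
```

With 1-based inclusive bounds, the same wrap-around would be written `(idx - 1) % n + 1`, as noted above.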

The argument for 1-based indexing with inclusive upper bounds is that binary heaps are easier to write this way. Edit: And that people who are not primarily programmers may find 0-based indexing weird.
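To make the binary heap point concrete, here is a rough Python sketch of the index arithmetic in both conventions. The function names are hypothetical, and the 1-based layout assumes the root lives at index 1 with slot 0 left unused:

```python
# 1-based layout: root at index 1, slot 0 unused.
def parent_1(i):
    return i // 2


def children_1(i):
    return 2 * i, 2 * i + 1


# 0-based layout: root at index 0.
def parent_0(i):
    return (i - 1) // 2


def children_0(i):
    return 2 * i + 1, 2 * i + 2


def same_tree(n):
    """Check that both layouts describe the same tree for nodes 1..n (1-based numbering)."""
    for i in range(1, n + 1):
        left, right = children_1(i)
        if children_0(i - 1) != (left - 1, right - 1):
            return False
        if i > 1 and parent_0(i - 1) != parent_1(i) - 1:
            return False
    return True
```

The two layouts encode the same tree; the 1-based formulas simply avoid the extra `+ 1` and `- 1` terms.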


> Edit: And that people who are not primarily programmers may find 0-based indexing weird

Alas, if only our forefathers had called it "offset" instead of "index".


> The argument for 1-based indexing with inclusive upper bounds is that binary heaps are easier to write this way.

Nope. The argument for 1-based indexing with inclusive upper bounds is that, to express ranges, exclusive indexes require you to have a bogus just-one-beyond-the-largest-valid-index element and a way to get the successor for every index. This is a giant PITA for anything but integers (MAX_INT, and x+1), which is why every single programming language I can think of also has end-inclusive ranges in some contexts. If you don't believe me try writing regexps with end-exclusive ranges.

On the other hand, there is no natural way to express empty ranges with 1-based indexing.


I'd love to learn more about writing regexps with various kinds of ranges. I don't see the issue, but it's very likely that I'm missing something.


Here's what a regexp for a C-style identifier looks like (ranges are inclusive):

    [a-zA-Z_][a-zA-Z0-9_]*

Here's what the same regexp would look like if regexp ranges were end-exclusive:

    [a-{A-[_][a-{A-[0-:_]*

Do you see the issue now? Of course if regexps really worked that way, you'd be better off doing this instead:

    [a-zzA-ZZ_][a-zzA-ZZ0-99_]*

But that specific trick only works because regexp ranges occur in a set union context.
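For anyone who wants to check where those odd-looking bounds come from, here is a small Python sketch. It assumes ASCII ordering; `exclusive_bound` is a made-up helper, and only the inclusive pattern is something Python's `re` module actually accepts with this meaning:

```python
import re

# The real, end-inclusive pattern for a C-style identifier.
IDENT = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def exclusive_bound(last_char):
    """Successor character that a hypothetical end-exclusive range would have to name."""
    return chr(ord(last_char) + 1)


# The "one past the end" characters for a-z, A-Z and 0-9.
bounds = {c: exclusive_bound(c) for c in "zZ9"}

accepts_identifier = IDENT.fullmatch("foo_1") is not None
rejects_leading_digit = IDENT.fullmatch("1foo") is None
```

The successor of each range's last character is an unrelated punctuation character, which is why the end-exclusive version reads so poorly.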


I see, the `a-z` range has an inclusive upper bound. That makes sense and it really should be like that.


If there are no convincing arguments one way or the other, then "what everybody else does" becomes the convincing argument. Why change established conventions for no good reason? There's no reason for curly braces to indicate control blocks and square braces to indicate indexing, but if a language swapped the two, what would you say?


Indeed, and if you add up all the users of 1-based languages (Fortran, Matlab, R, Excel, etc.) and 0-based languages (C, C++, Java, Python, etc.), I think you’ll find that the 1-based languages have vastly more programmers.

Going by popularity, 1-based indexing is the established convention.


> I think you’ll find that the 1-based languages have vastly more programmers.

Any shred of evidence for that? Listing four languages for each won't cut it.


Excel alone has more users than all other languages combined, so it's not even close.


And how many of those users index anything? Heck, given the languages you'd write Excel macros in, they really screwed up by picking 1-based.


Indexing is the primary operation in Excel and one of the first things you learn how to do in the language (selecting a range of cells). It’s how you refer to anything, by indexing into the global cell space.

Given Excel’s massive success and nontechnical user base, which again is larger than all languages combined, it’s hard for me to see how 1-based indexing was a mistake. I have experience teaching novices how to program, and 0-based indexing is always a sticking point of confusion. So from my perspective, 1-based indexing is the right choice for Excel given the user base and programming style.


Yes, but not so much in scientific computing, where scientists and mathematicians do a lot of the coding.


Because 1-based languages like Fortran, Matlab, R or Mathematica are less used in scientific computing than elsewhere?


More used in scientific computing, which was my point, because only programmers think of indexing based on offset instead of the first positive integer. Nobody outside of programming in C/Unix inspired languages starts counting at 0.


I may have misinterpreted your comment.

When someone wrote “Going by popularity, 1-based indexing is the established convention” and you replied “Yes, but not so much in scientific computing” I understood that as “1-based indexing is less used in scientific computing compared to general computing”.


Oh, either I replied to the wrong person or misread the parent.


Fortran, R and Matlab have 1-based arrays. That's pretty common with scientific computing languages. I'd hardly call those pariahs, and Fortran's older than C.


More programming in the world is done in a 1-based language, Excel, which has more users than all other languages combined.

0-based languages are in the minority as far as installed base and usage goes. Being 1-based is a decision dependent on target audience and application.

Fortran, one of the earliest languages, is 1-based, so there’s plenty of historical precedent.

Matlab, one of the most commercially successful languages in history, is 1-based, so it’s not exactly a barrier to adoption.


If you're counting the number of people who use Excel spreadsheets as "programming in Excel" then that's great. I no longer need to take you seriously.


What else would it be? Excel is a programming language. Using it is using a programming language. Am I missing something?


This argument could be made about any language, though. Python's indentation significance? Rust's match (vs switch)? R's arrow assignment?


Both R and Matlab also use 1-based indexing. If anything, among languages used for data science and scientific computing, it's Python that is special.


Julia was originally pitched as 'C-like speed with Matlab-like syntax', which explains 1-based arrays.


That must be why Fortran never caught on.



