I kinda feel this way about variable names in physics. You could call the (x,y,z) components of the magnetic field (L,M,N), see [0]. There are so many people who call that utterly wrong, but really it's totally fine and merely a source of confusion.
It could, but it's important to keep in mind that the filesystem architecture there is also very different: a parallel filesystem with disaggregated data and metadata.
When you run `ls -l` you could potentially be enumerating a directory with one file per rank, or worse, one file per particle or something. You could try making the read fast, but I also think that it makes no sense to have that many files: you can do things to reduce the number of files on disk. Also many are trying to push for distributed object stores instead of parallel filesystems... fun space.
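For the "reduce the number of files" part, a rough sketch of what I mean: write one shared file collectively with parallel HDF5 instead of file-per-rank. This assumes an MPI-enabled h5py build; the dataset name and sizes are made up.

    # Rough sketch: one shared file instead of one file per rank.
    # Assumes h5py built against parallel HDF5 plus mpi4py; names/sizes are illustrative.
    from mpi4py import MPI
    import numpy as np
    import h5py

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    n_local = 1_000_000  # particles owned by this rank (toy number)

    with h5py.File("particles.h5", "w", driver="mpio", comm=comm) as f:
        # One dataset spanning all ranks; each rank writes its own slice.
        dset = f.create_dataset("x", shape=(size * n_local,), dtype="f8")
        dset[rank * n_local:(rank + 1) * n_local] = \
            np.random.default_rng(rank).random(n_local)

Then `ls -l` has a single entry to stat instead of one per rank (or per particle), and the metadata servers see one file for the whole run.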
Which is horrifying: if an ESL author must publish in English but does not have a full grasp of the nuances and meaning conveyed by English wording, they should involve an editor... not a word machine that doesn't understand either.
Wording matters when conveying information; ESL speakers should be working with fellow humans when publishing in a language they do not feel comfortable writing on their own.
As with many areas, it’s easier to recognize “correct” than to generate “correct.” When I lived in Germany I would often use the early online translation tools to help refine my written German and it was useful to see how they corrected it, and it was usually a matter of “of course that’s the right way, I see it now!”
I think you're generalizing widely from your experience to the ability of researchers publishing these millions of papers, of which, as of 2024, at least ~13% were being LLMed.
Seems important to know. LLMs lie and mislead and change meaning and completely ignore my prompt regularly. If I'm just trying to get my work out the door and in the editing phase, I don't trust myself to catch errors introduced by an LLM in my own work.
There is a lack of awareness of the importance of the conveyed meaning in text, not just grammatical correctness. Involving people, not word machines, is the right thing to do when improving content for publication.
Does the grant cover an editor's involvement? How much does the experiment need to be trimmed back, and how much does the sample size need to be reduced: how much data must be sacrificed to support that?
If no one on a given team fluently speaks, writes, and understands the cultural context of the required output language, that team will need to find a solution.
It should not be a word machine (which, I should point out, does not have a brain).
Solving this problem might just involve using some of the resources to support the output being correct in the required language. You can call that a "cost".
How would this work for a personal blog? Would I need to be careful not to endorse or even talk about companies and products? And if I didn't have to, wouldn't that open the door for advertising masquerading as news or opinion? Genuinely interested in this.
Were you paid to talk about the product? If not, then it’s constitutionally protected speech. If there is any kind of payment, it’s advertising. If it’s advertising, follow the law.
If a company sends you a free sample in exchange for writing a review, and you get to keep it regardless of your conclusion, is that a payment? If so, that shuts down a way for consumers to get reviews of products before purchasing, but if not, the company might find various non-payment ways to influence what the reviewer writes.
I've had the bad luck that my first prescription was quite wrong: incorrect axis for astigmatism, and incorrect spherical (I basically have only astigmatism, no spherical). So for years I was suffering through the days. Optometrists flat out refuse to correct such mistakes (I've been to many!), preferring only minor changes. I finally started ordering a bunch of glasses cheaply online, and eventually found a prescription that works for me. Cannot trust optometrists anymore.
> With books, movies, series and music I like to own physically only my personal favorites that I know will use many times.
I don't own many DVDs, but when I see them in my bookshelf they remind me of the story they tell, the context when I watched them, and my theories about society, sci-fi, and utopias that those movies inspired.
With books it's even more extreme. I get the physical copy sometimes more to have a physical representation of an idea than to read the book.
I have books I've never opened since I read them that are just markers like that. I have one copy I've never opened: I read it as an e-book but got the physical copy just to complete my set and have it on my shelf.
I also have books I've suddenly taken down to reference 20+ years after I read them, where the text was not online. So I like to have physical copies of books where the idea mattered, because I know that some day I'll want to pass that idea on and will need a refresher.
> There are countries running on 95%+ renewable electricity right now[1].
That list ignores that countries trade electricity. If, say, Germany produces a lot of electricity from renewables, it can sell the excess to France (which has lots of nuclear). If Germany is low on electricity, it can buy from its neighbors. The system as a whole is nowhere near 90%.
The one exception is probably countries that run on all hydro and biomass.
You created a straw man.
There are 10 countries that fit the 95%+ criteria; all of them probably have an excess of production and zero interest in buying power.
But yeah, what you say could theoretically happen.
1. Procrastination seems to be a type of early stopping. I knew I had a good strategy in school!
2. Something that seems to be sorely missing in machine learning (I'm not a ML expert) are error bars. If you take the example of the figure at the end, as you increase the number of parameters in the model, your error bars become larger (at least in the overfitting regime), and they are infinite when you have more parameters than data points. Indeed, chi^2 tests are usually used in physics/astro to test for this. Of course, you need error bars on the data points to do this. So perhaps the difficulty is really in assigning meaningful uncertainties to your pictures/test scores/politicians.
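To make the chi^2 point concrete, a toy sketch (everything here is made up: a straight-line truth, known per-point sigma, polynomial fits of increasing degree):

    import numpy as np
    from numpy.polynomial import Polynomial

    rng = np.random.default_rng(0)

    # Toy data: a straight line plus noise, with known per-point uncertainty sigma
    n = 20
    x = np.linspace(0.0, 1.0, n)
    sigma = 0.1
    y = 1.0 + 2.0 * x + sigma * rng.normal(size=n)

    for degree in (1, 3, 6, 10):
        fit = Polynomial.fit(x, y, degree)
        resid = y - fit(x)
        dof = n - (degree + 1)          # data points minus fitted parameters
        chi2_red = np.sum((resid / sigma) ** 2) / dof
        print(f"degree={degree:2d}  reduced chi^2 = {chi2_red:.2f}")

A reduced chi^2 well below 1 flags that the model is fitting the noise; at degree 19 here (parameters = data points) the residuals vanish and dof hits zero, so the statistic is undefined, which is the same regime as the infinite error bars in the figure. And as the comment says, none of this works unless you actually have meaningful sigmas on the data.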
It is very counterintuitive. It is also a very common observation that has taken everybody by surprise for almost 2 decades by now. At the beginning, people were very resistant to the idea, even when every experiment confirmed it.
The catch is that you need a huge amount of data to train those.
It also seems to have limits. There have been a few well-documented cases where our current huge and very well trained networks got error rates that were lower than the rate of mislabeling in the data.
Can’t provide a reference, but I can confirm that this is common knowledge. It’s why e.g. GPT-3 outperforms GPT-2.
Though as stable diffusion shows, network architecture still matters a lot!
Note that the article points out you’ll get more overfitting as your number of parameters approaches that of the training set, which is what I suspect you’ve seen. The trend does reverse later on, but only once the parameter count is orders of magnitude beyond that point, and I don’t know if that ever happens outside of ML. It’s a lot of parameters.
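If you want to see the reversal without a GPU, a toy random-features regression usually shows it. Everything here is made up (sizes, target function, seeds), and the exact curve depends on the seed, but test error typically gets worse as p approaches n_train and improves again far past it:

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy regression: 40 noisy training points from a smooth target
    n_train, n_test = 40, 1000
    def target(x):
        return np.sin(2 * np.pi * x)
    x_tr = rng.uniform(-1, 1, n_train)
    y_tr = target(x_tr) + 0.1 * rng.normal(size=n_train)
    x_te = np.linspace(-1, 1, n_test)
    y_te = target(x_te)

    def relu_features(x, p):
        # Fixed random ReLU features; only the linear readout below is fit.
        r = np.random.default_rng(42)
        w, b = r.normal(size=p), r.uniform(-1, 1, p)
        return np.maximum(0.0, np.outer(x, w) + b)

    for p in (5, 10, 20, 40, 80, 400, 4000):
        # lstsq returns the minimum-norm solution once p > n_train (interpolating regime)
        coef, *_ = np.linalg.lstsq(relu_features(x_tr, p), y_tr, rcond=None)
        mse = np.mean((relu_features(x_te, p) @ coef - y_te) ** 2)
        print(f"p={p:5d}  test MSE={mse:.3f}")

The nasty bump sits near p = n_train = 40, and the large-p side only wins because the minimum-norm solution keeps the weights small; it's a toy version of the curve, not a claim about how big real models need to be.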
Other than being a psychological trick, what purpose could pointing out the lack of evidence at the time have? Instead they could have written something like "We found the problem in 2021 and promptly fixed it. We first learned that it had been exploited in 2022."
[0] page 907: https://onlinelibrary.wiley.com/doi/epdf/10.1002/andp.190532...