Hacker News: deathwarmedover's comments

I wondered initially if this was a Winamp port for older Macs.

It requires macOS 13.0 (High Sierra, 2017) or later, which is several releases after it stopped being called OS X. 10.11 (El Capitan, 2015) was the last OS X.


Careful! High Sierra is actually macOS 10.13.

By contrast, macOS 13 is Ventura, from 2022.

(I personally would accept someone referring to High Sierra as “OS X” because it’s still version 10 of the Macintosh OS, even if Apple dropped that branding a few years earlier.)


As an occasional enjoyer of OS X 10.5 on PowerPC, I can recommend... iTunes. It is actually really decent, as is most of Jobs-era stuff.

I don't have anything to play FLAC or Vorbis, but the machine has more urgent problems... <https://www.rollc.at/posts/2024-07-02-tibook/>


The repo only goes back a week. I just think OP hasn't kept up with Apple's naming conventions.


Not that I doubt the value of the work, and the reasoning of its performance directly affecting common operations makes intuitive sense, but I would have liked to hear more about what concrete problems were being solved. Was there any interesting data across the V8 ecosystem about `JSON.stringify` dominating runtimes?


It doesn't need to dominate runtimes when it's being called by hundreds of millions of pages every day. The power saving worldwide will be considerable.


You could check if it made it into archive.org?


I had also seen criticism of it here: https://screenfont.ca/learn/


I notice the author is using the wallpaper from Ubuntu 8.04 (Hardy Heron), released in April 2008.


When I read this piece, with its generally terrible writing and poor spelling and grammar, I initially wondered if it was a spoof from GPT-3 or some other bot.

When I decided it wasn't, I checked the dateline, as it seemed likely to have been written a decade or so ago.


When the author started with "this is not fool-proof by any means" the first thing that came to my mind was linguistic fingerprinting.


He does advise using Google Translate to move the content from the original language into a different one and back again to alleviate this.


I think this is a great first stab at the problem, but for two reasons I think a robust solution needs more work:

- The first is that, as someone else pointed out, Google is almost certainly logging your translation queries.

- Secondly, even if you do it offline (as someone else suggested) the approach itself might not work. Success in linguistic forensics isn't based (as we might naively assume) on catching obscure words that a particular individual has a tendency to overuse. It's based on subtle shifts in the relative frequency of functional words. Depending on the proximity of the source and target language, round-trip machine translation might not change this.
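To make that concrete, here's a toy sketch of the kind of function-word profile such methods compare. The word list and per-1000-token scaling are my own illustrative choices, not any standard forensic tool:

```python
import re
from collections import Counter

# A small sample of English function words (illustrative, not exhaustive).
FUNCTION_WORDS = {"the", "of", "and", "a", "to", "in", "that", "it",
                  "is", "was", "for", "on", "with", "as", "but", "not"}

def function_word_profile(text):
    """Relative frequency of each function word, per 1000 tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t in FUNCTION_WORDS)
    total = len(tokens)
    return {w: 1000 * counts[w] / total for w in FUNCTION_WORDS}
```

Two profiles from the same author tend to stay close under this kind of measure even when the vocabulary changes, which is why round-trip translation may not shift it enough.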


In forensic linguistics you typically measure a lot of metrics, not just word frequencies: use of punctuation and whitespace, sentence lengths and structures, etc. Attribution also isn't the only use of forensic linguistics. You can also look at influences, ideas, people, publications, and so on, for instance to infer something about the reader or to analyze influence networks.
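A crude stdlib-only sketch of a few such measurements (the specific features and their normalizations are my own illustrative picks):

```python
import re
import statistics

def style_features(text):
    """A few crude stylometric features: sentence lengths and punctuation rates."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]  # words per sentence
    n_chars = max(1, len(text))
    return {
        "mean_sentence_len": statistics.mean(lengths),
        "stdev_sentence_len": statistics.pstdev(lengths),
        "comma_rate": text.count(",") / n_chars,
        "semicolon_rate": text.count(";") / n_chars,
    }
```

Real systems extract hundreds of features like these and compare candidate authors with a distance measure or classifier.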

I got interested in forensic linguistics many years ago when an article in a somewhat shady publication mentioned me. I got curious and started reading anything I could find on the topic. I was eventually able to identify the author, but mostly by tricking him into admitting it after I had a ranked list of candidates. He was second on a list of about 4-5 people (out of a candidate set of perhaps 300). Not half bad for the rather crude methods I used. I was rather pleased with myself.

I've used similar techniques later to look at influence networks in companies.


Interestingly, at Google Translate now:

Upcoming changes to history

Translation history will soon only be available when you are signed in and will be centrally managed within My Activity. Past history will be cleared during this upgrade, so make sure to save translations you want to remember for ease of access later.


Ha, "Hey Google, NSA here, do you have the server logs of people translating this passage?"

Wait, why ask Google, they probably can just look in their own surveillance database.

Geez, if Google Translate queries are logged, that's... a lot of information.


I guess you could skirt around this by using something to tag the various parts of speech in your original text (using something like Python's NLTK) and replacing them with randomly picked synonyms from a thesaurus?

Pretty sure it would obscure the original writer although possibly at the cost of obscuring the original meaning.
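Something like this stdlib-only sketch, with a toy hand-rolled thesaurus standing in for NLTK/WordNet and skipping the part-of-speech tagging step:

```python
import random
import re

# Toy thesaurus standing in for a WordNet lookup (illustrative only).
THESAURUS = {
    "big": ["large", "huge", "sizable"],
    "fast": ["quick", "rapid", "speedy"],
    "said": ["stated", "remarked", "noted"],
}

def paraphrase(text, seed=0):
    """Replace every word that has synonyms with a randomly chosen one."""
    rng = random.Random(seed)
    def swap(match):
        word = match.group(0)
        options = THESAURUS.get(word.lower())
        return rng.choice(options) if options else word
    return re.sub(r"[A-Za-z]+", swap, text)
```

A real version would need POS tagging to avoid, say, swapping the verb "saw" for the noun's synonyms, which is exactly where the meaning starts to get mangled.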


If you use wordpress.com from Tor and use Google Translate from Tor, what do they learn more about you than just using wordpress.com?

(I have no clue)


I think what we’re concluding here is that using Google to obscure the linguistic style is flawed, because a state actor could obtain the original linguistic style from Google records, or from their own records of snooped traffic.

In other words: the blog should find a way to obscure linguistic style offline.


They can see the original text, which can then be analyzed.


Wonderful. Then someone DEFINITELY won't want to read your blog.


Reminds me of how I was sad when the bridge for Shoreditch Overground station blocked the Tea Building, but that was before the first Street View image it seems: https://www.google.co.uk/maps/@51.5231608,-0.0776078,3a,75y,...


The title is perhaps missing "... for spoken and/or non-English sources, preferably not at all".

If we should stop using this test, what should we start using? In the author's comment on the study, they noted "There are ways to study linguistic complexity".

I'm aware, for example, of this python project which provides F-K scores along with 7 other readability metrics to consider: https://pypi.org/project/py-readability-metrics/
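The F-K grade itself is also simple enough to compute by hand. A stdlib-only sketch using the published formula, with a crude vowel-group heuristic for syllables (real packages use better syllable counters):

```python
import re

def count_syllables(word):
    # Crude heuristic: count runs of consecutive vowels ("y" counts as a vowel).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)
```

Which also makes the study's point obvious: the formula only sees sentence and word lengths, so it says nothing about the actual comprehensibility of spoken or non-English text.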


You can do a timed disable (like a pause) from the Pi-hole UI: I've done 5-minute disables to e.g. get through purchase check-out flows with tight coupling to trackers.


>Wikipedia articles [...] are all written like advertisements.

Editors do actively work against this. I've seen this template in the wild often enough: https://en.wikipedia.org/wiki/Wikipedia:Advertising


I don't mean to be pompous, but they clearly are not very effective. Every single article on a city is written with a subtle advertorial tone, whether to promote tourism or simply give the city a positive reputation.

