
I think there may be more to overcomplicated brews than lifestyle or status. It's a desire to get technical in a world where so much of life has been automated away from the consumer's view or ability to affect it. Coffee is a way to tinker with something in the same way previous generations tinkered with their cars or radios or whatever. It's an outlet for creativity and technical skill building for those that engage in brewing.

Coffee shops will sell rituals, status, prestige, sophistication or the appearance thereof. Same as every other business. But that's not to say the product can't be superior - it can. Nor does it mean that every product that makes use of those marketing techniques is superior; but even if it isn't, if the customer walks away happy, they must've done something right, right?

Don't we do the same with the technology we're working on? How often is it truly better beyond any critique? People were getting stuff done even before our products were around with less fuss and a different set of problems. We do what we do to end up busy with stuff so we can do what we do all over again, don't we? I digress.

It does get tiresome when everyone is trying to sell you an experience, and it becomes disappointing when the selling of experiences becomes so commoditized that the thing being sold loses its credibility as something special on account of being sold as such. Is it a crisis of authenticity?

To each, their own. I used to tinker with espresso-based drinks, but I'm mostly over it. I've learned to discern (some) better coffee beans from others, but I mostly don't drink that - I can't justify paying that much for a coffee I brew myself and that I may botch out of being in a hurry. It's also a distraction that takes time I don't have anymore. But it was fun to explore for a while and I now own a very fancy looking espresso machine, grinder and all sorts of accessories.


I'm working on a tool to generate and host full stack web apps from prompts (just like everyone else). I'm loving it. I'm using LLMs to do as much of my coding as possible, so in a way I'm eating my own dog food, although it's a more developer-driven effort than what the end product will be.

Strange thing is, the most time-consuming part of getting this ready for a user-facing launch is not the code generation, but all the scaffolding/queues/storage needed to run it.


Looks good, might give it a try. I was looking for something similar to provide a unified interface for GPT and Claude and eventually hacked something together myself, as none of the solutions I found could deal with structured output properly across both vendors.
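Roughly the shape of what I mean, as a minimal sketch rather than my actual code: the model names are placeholders, and the "tell the model to emit JSON, then parse it" approach is just the lowest common denominator (Anthropic's tool-use route is nicer but needs more code).

    # Minimal sketch of a vendor-neutral "give me JSON back" helper.
    # Assumes the official openai and anthropic Python SDKs; model names are placeholders.
    import json
    from openai import OpenAI
    from anthropic import Anthropic

    SYSTEM = "Reply with a single JSON object only, no prose, no code fences."

    def structured(provider: str, prompt: str) -> dict:
        if provider == "openai":
            resp = OpenAI().chat.completions.create(
                model="gpt-4o",  # placeholder
                messages=[{"role": "system", "content": SYSTEM},
                          {"role": "user", "content": prompt}],
                response_format={"type": "json_object"},
            )
            text = resp.choices[0].message.content
        elif provider == "anthropic":
            msg = Anthropic().messages.create(
                model="claude-3-5-sonnet-20240620",  # placeholder
                max_tokens=1024,
                system=SYSTEM,
                messages=[{"role": "user", "content": prompt}],
            )
            text = msg.content[0].text
        else:
            raise ValueError(f"unknown provider: {provider}")
        # Be defensive: some models wrap the JSON in markdown fences anyway.
        text = text.strip().removeprefix("```json").removeprefix("```").removesuffix("```")
        return json.loads(text)

In practice you'd also want a retry when json.loads fails, since both vendors occasionally slip prose in.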


I miss the 00s internet. I miss IRC and geeking out for the sake of it. Maybe I'm just missing my younger years, but I think there was a distinct feeling back then, of wonder and being amongst the first to tinker with these promising technologies that were going to change the world for the better and now it's 2024 and we've screwed it all up.


A lot of things got worse, it's not just nostalgia.

The spread of social media from the mid-00s onwards, and especially in the 10s, was a tragedy, but not for the main reasons people normally think. The way people organized back then (forums, IRC channels, blogs, etc) was more authentic, as there was no tangible corporate interest in keeping you hooked to it through underhanded algorithmic manipulation to drive engagement. There was no sponsored content, no farming of every piece of data about users to feed an endlessly greedy advertisement machine. It was just people and their genuine interests.

Part of the problem is that geek culture became mainstream. When I was a kid in the 90's, me and my friends were considered the weird bunch for liking videogames, computers, tabletop RPG, etc. Sometime around mid-00s it became mainstream, and brought along people who prior to that had no interest in that niche of culture, and with them that culture meaningfully changed for the worse.

There's more to it, but I've rambled enough. If there's one positive thing I can think of, it's that at least the general positivity surrounding tech is gone. That skepticism is healthy, especially considering how things have worsened since then.


On the other side, a lot of it wasn't sustainable. Just how many forums vanished all of a sudden because the owner died, ran out of money or was simply fed up with moderating bullshit and infighting, not to mention the ever-increasing compliance workload/risk (yeeting spam, warez and especially CSAM)?

A lot of the early-ish Internet depended on the generosity of others - Usenet, IRC, Linux distros or SourceForge for example, lots of that was universities and ISPs - and on users keeping to the unwritten contract of "don't be evil". Bad actors weren't the norm, especially as there were no monetary incentives attached to hacking. Yes, you had your early worms and viruses (ILOVEYOU, remember that one?), you had your trolls (DCC SEND STARTKEYLOGGER 0 0 0), but in general these were all harmless.

Nowadays? Bad actors are financially motivated on all sides - there are malware-as-a-service shops, bitcoin and other cryptocurrencies attract both thieves and money launderers, and you can rent out botnets for a few bucks an hour that can take down anyone not hiding behind one of the large CDNs. CSAM spreaders are even more of a threat than before... back in the day, they'd fap off in solitude to teen pageants; nowadays virtually every service that allows UGC uploads has to deal with absurd amounts of CSAM, and they're all organized in the darknet to exchange tips about new places / ways to hide their crap in the clearnet because Tor is just too slow.

And honestly it's hard to cope with all of that, which means that self-hosting is out of the question unless you've got a looot of time for dealing with bad actors of all kinds, and people flock to the centralized megapolises and walled gardens instead. A subreddit for whatever ultra-niche topic may feed Reddit and its AI, but at least Reddit takes care of botnets, CSAM and spam.

I think that Shodan and Let's Encrypt (or rather, Certificate Transparency) are partially to blame for the rise of cybercrime. Prior to both, if you just didn't share your domain name outside your social circle, chances were high you'd live on unnoticed in the wide seas of the Internet. But now, when you all but have to get an HTTPS certificate to avoid browser warnings, applying for that certificate means your domain name appears in a public registry that can be, is, and will be mined by bad actors, and then visited by Shodan or by bad actors directly, all looking for common pitfalls or a zero-day patch you failed to apply within the first 15 minutes after the public release.
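To illustrate how low the bar for the mining side is, here's a sketch against crt.sh, one public Certificate Transparency search frontend. The query parameters and response field names are from memory, so treat them as assumptions rather than a stable API contract:

    # Sketch: list certificates logged for a domain via crt.sh's JSON output.
    # Parameters and field names are assumptions, not a documented contract.
    import requests

    def ct_entries(domain: str) -> list:
        resp = requests.get(
            "https://crt.sh/",
            params={"q": domain, "output": "json"},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()

    for entry in ct_entries("example.com")[:10]:
        # Fall back gracefully if the field names differ.
        print(entry.get("name_value"), entry.get("not_before"))

Every newly issued certificate shows up in logs like these shortly after issuance, which is exactly the window the mass scanners work in.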


I remember when admins of phpBB boards asked for PayPal donations to pay server bills every 6 months! I feel like running the same forums now should cost almost nothing for infrastructure. The moderation is still a killer though.


I don't disagree that it was not really sustainable outside that small-ish timeframe of the mid-90s to mid-00s. The change for the worse was perhaps unavoidable. And there are things that changed for the worse that neither you nor I talked about. For example, I really miss how online gaming worked back in the early 2000s (no matter how janky it was), when there was no real monetary incentive for companies trying to keep people playing on their online platforms.

Maybe the fact that I recognize that the way things changed was unavoidable fuels my general disdain for internet culture nowadays, and my skepticism towards tech innovations in a broader sense. Oh well.


> For example, I really miss how online gaming worked back in the early 2000s (no matter how janky it was), when there was no real monetary incentive for companies trying to keep people playing on their online platforms.

I'd also blame rampant cheating for that. It's damn expensive to keep up with pirates, but cheaters are an entirely different league... the most advanced cheats these days are using dedicated PCI cards to directly manipulate memory with barely any ability for the host to detect or prevent it [1]. Through the grapevine, I hear there are developers charging hundreds of dollars per month to develop and maintain these things.

On top of that, up until the late '00s no one cared too much about racist slurs, sexism or other forms of discrimination. Maybe you'd get yeeted off a server if you overdid it. But nowadays? Ever since GTA SA and its infamous Hot Coffee mod, there are a loooooot of "concerned parent" eyeballs on gaming, there are advertisers/sponsors looking out for their brand image, and game developers also don't want to be associated with such behavior. And so, they took away self-hosted servers so that they could moderate everything that was going on... and here we are now.

[1] https://github.com/mbrking/ceserver-pcileech


> I'd also blame rampant cheating for that. It's damn expensive to keep up with pirates, but cheaters are an entirely different league

I'm not an avid gamer, but it's not hard to notice that multiplayer games nowadays mean "all the players in the world". Most games don't have a local mode to play with multiple controllers or over LAN. They don't even want to allow custom groups to play with. Cheating is way easier to manage at a small scale.


People certainly cared about racism, sexism, and other discrimination back then. They just put up with it because there was no movement to change it. It got worse any time I spoke up, so I learned to keep my head down.

Do not mistake my tolerating slurs and other insults in order to enjoy Nintendo games for being okay with it, or with the people who did it, or with the people who did it and still remember being able to do it without consequence as a better time.


Back in the day you didn't typically play on massively populated online servers with matchmaking against completely anonymous strangers.

You typically played with a small group of people. LAN houses with people who were there physically, or groups of friends (even if they were online friends).

Even for stuff such as bnet when I played Diablo 2 or WC3, you typically created a game instance, and over time you could recognize the people playing. You curated friends lists, so you knew to avoid the ones who behaved in a way that didn't jibe with the rest of the group.

Perhaps it was not scalable, and a change for the worse was unavoidable. There was a simplicity in those interactions that is completely lost and may be impossible to capture again. An echo of a time long past.


Even then, it had the same problem you still face with in-person tabletop groups. If you find a good group that does a session 0 where everyone respects what's laid down, it's fantastic. If not, it's no better than a matchmaking lobby with the worst teenagers. In-person or online or with a small group makes no difference if the norms they all agree on are trash.

Things are better now because you can find that group that aligns with your values. You aren't stuck with the shitty guild that tolerates your differences (at best) because there are enough people online and gaming to where there's probably another that fits better. And it's even better offline because you can connect with those few people in your nowhere little town who aren't butts.

edit: for example

https://news.ycombinator.com/item?id=40347601


Eh, I think things are much worse now. It's the reason why I seldom play online, and when I do I have absolutely no desire to communicate with anyone (when I play online the first thing I do is mute everyone else. I don't want to read what they write, much less listen to their voices).

There is no community; I am on a centralized server being matched against random people. And when there is a community, it's normally a cesspool where online interaction is at best meaningless. See Twitter for example (no matter if it was before or after the retarded buffoon that acquired it, it was always a toxic dump).

Anyway, what is past is past. I talk about those times without much nostalgia (I was a broke teenager at the time, not really the happiest of times). I just rationalize about how things got worse since then.


I see that you are too young to remember MUDs, netrek, or hunt(6)


> the most advanced cheats

How do they work? I know that in a game (Red Dead Redemption 2, for example) cheaters have infinite health and so forth. How? The server is responsible for validating all actions performed by players to prevent cheating, such as verifying movement, health, ammunition, and other game variables. It is not supposed to accept health values sent by the client without verifying against expected game logic. The server is the authoritative source. It is not supposed to rely on the client for authoritative game state, and if it does, it is fundamentally and terribly flawed.
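For reference, this is roughly what I mean by server-authoritative, as an illustrative sketch (not from any real engine, and not RDR2 specifically): the client only reports inputs and events, and the server validates them against its own state and computes health itself.

    # Illustrative sketch of a server-authoritative design.
    from dataclasses import dataclass

    @dataclass
    class Player:
        health: int = 100
        ammo: int = 30

    class GameServer:
        def __init__(self) -> None:
            self.players = {"alice": Player(), "bob": Player()}

        def handle_shot(self, shooter_id: str, target_id: str) -> None:
            shooter = self.players[shooter_id]
            target = self.players[target_id]
            if shooter.ammo <= 0:
                return  # client claims a shot it could not have fired
            shooter.ammo -= 1
            # Re-run the hit check server-side (range, line of sight, spread...)
            # instead of trusting anything the client says about the hit or damage.
            if self._hit_check(shooter_id, target_id):
                target.health = max(0, target.health - 25)

        def _hit_check(self, shooter_id: str, target_id: str) -> bool:
            return True  # placeholder for the real simulation

A cheat can still lie about inputs (aim, movement), which is a different problem, but it should never be able to simply set its own health.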


> The server is the authoritative source. It is not supposed to rely on the client for authoritative game state, and if it does, it is fundamentally and terribly flawed.

Indeed. Likely the client is responsible for certain state things and/or implicitly trusted with state updates. How this happens is that most of your game devs are not paid enough or given enough time to do it right. Engines are selected (generally not built) for their ability to get shit to market fast, and multiplayer is an afterthought, hacked on. And management just shrugs and says we'll force players to run anti-cheat ring0 nonsense.


Say you have a shooter with support for surround sound and immersive sound effects, aka "an enemy comes from behind, so make the sound appear from the rear left". For that to render properly the client needs to know where the enemy is positioned, which is information a cheat can read out of RAM and display as an alert for the cheater. Or your average aimbot - the precise position of the enemy is (by definition) known to the client, so a cheat can "take over" keyboard and mouse when it sees an enemy and achieve a perfect headshot.

Or in racing games, extremely precise braking and steering assistance. Everything that a gamer can do, a cheat can also do.


In the case of aimbots: they are very easy to detect though, and you can always look for patterns, even in the case of triggerbots.

As far as assistance goes: I despise it. Modern games have "aim assist" which is just a built-in aimbot. sighs


> On the other side, a lot of it wasn't sustainable. Just how many forums vanished all of a sudden because the owner died, ran out of money or was simply fed up with moderating bullshit and infighting, not to mention the ever-increasing compliance workload/risk (yeeting spam, warez and especially CSAM)?

But also lots of commercial social media sites and forums disappeared, because the company went bust or changed its focus.


> And honestly it's hard to cope with all of that, which means that self-hosting is out of the question unless you've got a looot of time for dealing with bad actors of all kinds

That doesn’t follow from the points you made. What follows is that you have to deal with whatever percentage of bad actors you get. If not, you have to contract someone to handle that part of the job or do the entire job. There are plenty of options in those scenarios that look nothing like today’s feudalism.

For example, I have several sites I self-host on cheap VMs with lighttpd and BunnyCDN. I can do anything I want with the whole site, including moving suppliers. They have no comments. I have both an email address and Facebook messaging for anyone who wants to contact me.

For comments, the main problem is catching spam or illegal content. That just means a 3rd-party provider needs to see the content, make a decision based on the customer’s needs, and the customer’s server needs to apply the edit they made. Disqus already implemented much of this concept, but it could be modified for more owner control.

The stronger control some want over comment quality requires more time and controls. There are tools to help with that. The Lobste.rs site has great moderation tools. MetaFilter added a cheap, paid system for account creation that filtered tons of spam. Most implementations just need laborers to enforce their view of social norms, which might or might not be easy, and might not be right or worth keeping around either.

That leads to countering the last assumption some commenters have: that the methods used should keep the sites around as long as Google or Facebook. Most human activity is temporary. Much of it has little long-term value. Many sites will serve their purpose for a specific time. Others might go up and down. These possibilities are fine for non-mission-critical uses. Life will go on.


> That doesn’t follow from the points you made. What follows is that you have to deal with whatever percentage of bad actors you get. If not, you have to contract someone to handle that part of the job or do the entire job. There are plenty of options in those scenarios that look nothing like today’s feudalism.

Well, the "eternal september" problem... back in the '00s you could reasonably run a forum or a blog even if you're some high school kid, all you needed was your parents and 10 bucks a month for some shitty virtuozzo/UML VPS. No need to deal with stuff like setting up a CDN just to survive some random asshat thinking they can DDoS you off the 'net.


You’re right that the problems increased. Although, I needed a phone line tied up for as long as I was online. I used to DDoS myself.


>When I was a kid in the 90's, me and my friends were considered the weird bunch for liking videogames, computers, tabletop RPG, etc. Sometime around mid-00s it became mainstream,

This does make sense, of course: 1990s->2005-ish is ~15 years, it's 2024 today. The "weird" kids became adults and replaced the previous and outgoing generation and their norms.


That's not how it works. If a minority of people like Thing A when they're teenagers that doesn't mean suddenly when they're adults everyone will like Thing A. It just means a minority of adults will like Thing A.

Put another way, what do you think happened to all the "normal" kids? They would have become adults too, so wouldn't you expect the "normal" to replace the previous and outgoing generation rather than this one particular minority?


>Put another way, what do you think happened to all the "normal" kids?

Silent majority. It wouldn't surprise me if most "normal" kids simply minded their own "weird" business and waited for the winds to shift more in their favour.

What is mainstream today was counterculture 20~30 years ago, which coincidentally is about right for generational shifts in trends.


Nah, in the 90s nerdy kids were definitely the minority, even among kids.

What happened is that in early to mid 2000s, careers that nerdy kids flocked to became desirable because they were well paid.

To this day I find something vaguely amusing about the push to get more girls to code, and how it's implied that women not flocking to it is the result of some kind of conspiracy to keep them away from nice jobs or whatever.

By all means, I think this push is a good thing. Especially as I have a daughter and I'll certainly teach her the ropes when she is a little older, maybe try to code some silly games with her, that sort of stuff.

But in the 90s when I was a kid? Girls were absolutely repelled by anything nerdy. When my group of friends found a girl who had any remote interest in nerdy things, they would fall over one another to try to accommodate her. Fairly pathetic when I remember it in hindsight. There was this active desire to feel less like outcasts by having our own tastes validated by someone from the outgroup, that sort of thing.

It was a different world. Weird to think that it was a mere 3 decades ago.


I miss the 90s internet too, and even though some things have objectively gotten worse (most people interact in proprietary networks, as opposed to using open standards), as a parent I think part of that feeling is still there, since my kids are doing some of what I was doing back then, only instead of IRC they're mostly using Discord now.

The one thing I do believe is legit to miss, and not just rose-colored nostalgia lenses, is that back in the 90s, I remember all or the vast majority of what I found online was not tied to profit in any way. I'm not against profit per se, but I do believe you get a very different network when people create content because they want to share something they're interested in, as opposed to them trying to make a living out of it.


Yeah, half of my Facebook feed is shit that is intentionally wrong so that people will interact with it because they get paid based on engagement.


Firefox was good, Electron wasn't a thing and Microsoft didn't own most game studios... we really messed up huh.


Well, who can realistically oppose billions of dollars coming to ruin something?


> now it's 2024 and we've screwed it all up.

Moved to tears. A strong sense of 'How Time Flies'.


Maybe folks were waiting for someone else to make the internet better for the many when it’s now that group itself who could.


If you miss IRC, come to Libera.


> ... to tinker with these promising technologies that were going to change the world for the better and now it's 2024 and we've screwed it all up.

Perhaps I'm overly optimistic but considering what we do have, I'd hardly call it a screw up. Far from it. I grew up in the 90s and looking around I'm amazed at what we have.

Last week was the first time in 5 years that I physically went to the bank, and it was only due to a rare edge case scenario that their online services (until now) don't cover. Just about all admin in my life is done online.

And there's so much tech to tinker with. Raspberry pi, Arduino, PCs, ... Connect it to your mobile device and it just explodes what you can do, if you have the energy and time for it. Fun / nerdy tech is (for the most part) dirt cheap now. Sensors, electric motors, microcontrollers - it's all there readily available for basically nothing.

... and considering the personal tech / mobile devices. I remember interviewing for a job in the biggest city in the country some 16 years ago. Printed paper map, getting paper tickets for the subway, getting lost and almost missing the interview. That's unthinkable nowadays. I'd have the map on my phone and I'd let my mobile phone guide me through the subway, with the ticket on the phone.


> Printed paper map, getting paper tickets for the subway, getting lost and almost missing the interview. That's unthinkable nowadays. I'd have the map on my phone and I'd let my mobile phone guide me through the subway, with the ticket on the phone.

The kind of surveillance that walks hand in hand with this is what hackers of the 90s intended to prevent from happening.


I agree somewhat. There really was a sense of wonder. What's coming next? Where will it go?

I don't think it's all screwed up though. The difference we're seeing is what happens when the marketeers (no offence intended) take over from the technologists. Everything has to have a point, be commercialised. Learn to filter that stuff out and the nerds and geeks are still there, doing interesting things; you just have to fight harder to see it.


I miss it too.


In the future people might think the same about Bitcoin and AI.


> In the future people might think the same about Bitcoin and AI.

Perhaps, but these will be different people than those who miss the 90-00 internet.


Ofc, because it will be the people growing up in the 20s.


Not even close. No one cares about Bitcoin even today.


> In the future people might think the same about Bitcoin

In 2017 I thought I had missed Bitcoin, but still managed to mine a meager amount on my parents' computers. Nowadays I've been proven wrong.


I already think like that of the first years of Bitcoin. It had the same energy.


It did. I'm not sure AI has this energy. We quickly skipped the "early tinkering" phase of AI and jumped straight to the annoying "let's put this technology into everything" stage like the blockchain craze of ~2018+. Perhaps the difference is how "top-down" AI has been. Most of the push has come from massive companies trying to get people to use it instead of people finding it organically.


The early internet of the 90s onwards had a pretty easy learning curve. Think: HTML and a JavaScript fart button. Whereas machine learning and large language models, aka AI, have a pretty steep learning curve that I'm working on now. Bitcoin is kind of both, where one can easily dabble in coin trading and/or dive into the complex world of cryptography.


>We quickly skipped the "early tinkering" phase of AI

Was it skipped, or was it spread over many decades with AI winters interspersed throughout?


I miss when "fed-pegged lightning side chains" peddled by Blockstream, Luke, Greg et al was the biggest load of buzzwordy self-serving bullshit in the space.


Yeah, we will fondly reminisce about the planet-destroying ponzi scheme which made it so convenient to pay for illegal goods, scams and ransoms to cryptolockers. What a nice unnecessary ecological catastrophe we managed to concoct out of nothing.


Won't compare


We already do..


I don't want to defend Altman. He may or may not be a good actor. But as an engineer, I love the idea of building something magical, yet lately that's not straightforward tinkering - unless you force your way - because people raise all sorts of concerns that they wouldn't have 30 years ago. Google (search) was built on similar data harvesting and we all loved it in the early days, because it was immensely useful. So is ChatGPT, but people are far more vocal nowadays about how what it's doing is wrong from various angles. And all their concerns are valid. But if OpenAI had started out by seeking permission to train on any and every piece of content out there (like this comment, for example) they wouldn't have been able to create something as good (and bad) as ChatGPT. In the early search days, this was settled (for a while) via robots.txt, which for all intents and purposes OpenAI should be adhering to anyway.

But it's more nuanced for LLMs, because LLMs create derivative content, and we're going to have to decide how we think about and regulate what is essentially a new domain and method and angle on existing legislation. Until that happens, there will be friction, and given we live in these particular times, people will be outraged.

That said, using SJ's voice given she explicitly refused is unacceptable. It gets interesting if there really is a voice actor that sounds just like her, but now that OpenAI has ceased using that voice, the chances of seeing that play out in court are slimmer.


Google search linked to your content on your site. It didn't steal your content, it helped people find it.

ChatGPT does not help people find your content on your site. It takes your content and plays it back to people who might have been interested in your site, keeping them on its site. This is the opposite of search, the opposite of helping.

And robots.txt is a way of allowing/disallowing search indexing, not stealing all the content from the site. I agree that something like robots.txt would be useful, but consenting to search indexing is a long, long way from consenting to AI plagiarism.


The point is we couldn't have a way of consenting to AI training until after we had LLMs. And I'm guessing we will have one, pretty quickly.


>The point is we couldn't have a way of consenting to AI training until after we had LLMs.

Sure we could have. Even if we're talking the web 1.0 age, Congress passed a law for email as early as the early 00's, which is why every newsletter has to have a working unsubscribe link. So it's not impossible to do.

Regardless, consent is a concept older than the internet. Have an option of "can we use your data for X?" and a person says yes/no. It's that simple. We can talk about how much we cared 30 years ago, but to be frank that mistrust is all on these tech companies. They abused the "ask for forgiveness" mentality and proceeded to make the internet nearly impossible to browse without an ad blocker. Of course people won't trust the scorpion one more time.

As a contrast, look at Steam. It pretty much does the same stuff on the inside, but it reinvests all that data back into the platform to benefit users. So people mind much less having a walled garden for their game library (and will even go to war for it). Short-sighted, but I understand where it comes from.


Licenses allowing derivative works without attribution preceded LLMs.


> But if OpenAI had started out by seeking permission to train on any and every piece of content out there...

But why would anyone seek permission to use public data? Unless you've got Terms and Conditions on reading your website or you gatekeep it to registered users, it's public information, isn't it? Isn't public information what makes the web great? I just don't understand why people are upset about public data being used by AI (or literally anything else. Like open source, you can't choose who can use the information you're providing).

In the case being discussed here, it's obviously different, they used the voice of a particular person without their consent for profit. That's a totally separate discussion.


>why would anyone seek permission to use public data?

First of all, it's not all public data. Software licenses should already establish that just because something is on the internet doesn't mean it's fair game.

>Unless you've got Terms and Conditions

The New York Times did:

https://help.nytimes.com/hc/en-us/articles/115014893428-Term...

Even if you want to bring up an archive of the pre-lawsuit TOS, I'd be surprised if that mostly wasn't the same TOS for decades. OpenAI didn't care.

>Isn't public information what makes the web great?

No. Twitter is "public information" (not really, but I'll go with your informal definition here). If that's what "public information" becomes, then maybe we should curate for quality instead of quantity.

Spam is also public information, and I don't need to explain how that only makes the internet worse. And honestly, that's what AI will become if left unchecked.

> Like open source, you can't choose who can use the information you're providing

That's literally what software licenses are for. You can't stop people from ignoring your license, but breaking that license leaves you wide open to lawsuits.


The right to copy public information to read it does not grant the right to copy public information to feed it into a for-profit system to make a LLM that cannot function without the collective material that you took.


That's the debatable bit, isn't it. I will keep repeating that I really don't see a difference between this and someone reading a bunch of books/articles/blog posts/tech notes/etc etc and becoming a proficient writer themselves, even though they paid exactly 0 money to any of these or even asked for permission. So what's the difference? The fact that AI can do it faster?


> That's the debatable bit, isn't it.

If people used the correct term for it, "lossy compression", then it would be clearer that yeah, definitely there's a line where systems like these are violating copyright and the only questions are:

1. where is the line that lossy compressions is violating copyright?

2. where are systems like chatgpt relative to that line?

I don't know that it's unreasonable to answer (1) by saying that even an extremely lossy compression can violate copyright. I mean, if I take your high-res 100MB photo and downsample it to something much smaller, losing even 99% of it, distributing that could still violate your copyright.


Again, how is that different than me reading a book then giving you the abridged version of it, perhaps by explaining it orally? Isn't that the same? I also performed a "lossy compression" in my brain to do this.


> is that different than me reading a book then giving you the abridged version of it, perhaps by explaining it orally?

That seems like a bad example, I think you are probably free to even read the book out loud in its entirety to me.

Are you able to record yourself doing that and sell it as an audiobook?

What if you do that, but change one word on each page to a synonym of that word?

10% of words to synonyms?

10% of paragraphs rephrased?

Each chapter just summarized?

The first point that seems easier to agree on isn't really about the specific line, just a recognition that there is a point that such a system crosses where we can all agree that it is copying and that then the interesting thing is just about where the boundaries of the grey area are (i.e. where are the points on that line that we agree that it is and isn't copying, with some grey area between them where we disagree or can't decide).


> how is that different than ...

In one case, you are doing it and society is fine with that because a human being has inherent limitations. In other case, a machine is doing it which has different sets of limitations, which gives it vastly different abilities. That is the fundamental difference.

This also played out in the streetview debate - someone standing in public areas taking pictures of surroundings? No problem! An automated machine being driven around by a megacorp on every single street? Big problem.


I think that must be it.

There's an unstated assumption that some authors of blog posts have: if I make my post sufficiently complex, other humans will be compelled to link to my post and not rip it off by just paraphrasing it or duplicating it when somebody has a question my post can answer.

Now with AIs this assumption no longer holds and people are miffed that their work won't lead to engagement with their material, and the followers, stars, acknowledgement, validation, etc. that comes with that?

Either that or a fundamental misunderstanding of natural vs. legal rights.


human vs. bot is all the difference:

- a human will be an organic visitor that can be advertised to. A bot is useless

- A human can one day be hired for their skills. An AI will always be in control of some other corporate entity.

- volume and speed is a factor. It's the buffet metaphor, "all you can eat" only works as long as it's a reasonable amount for a human to eat in a meal. Meanwhile, a bot will in fact "eat it all" and everyone loses.

- Lastly, commercial value applies to humans and bots. Even as a human I cannot simply rehost an article on my own site, especially if I pretend I read it. I might get away with it if it's just some simple blog, but if I'm pointing to patreons and running ads, I'll be in just as much trouble as a bot.

> I really don't see a difference between this and someone reading a bunch of books/articles/blog posts/tech notes/etc etc and becoming a proficient writer themselves

Tangential, but I should note that you in fact cannot just apply/implement everything you read. That's the entire reason for the copyright system. Always read the license or try to find a patent before doing anything commercially.


To me it's more like photocopying the contents of a thousand public libraries and then charging people for access to your private library. AI is different because you're creating a permanent, hard copy of the copyrighted works in your model vs. someone reading a bunch of material and struggling to recall it.


Quantity has a quality all its own.


You state that as a fact. Is that your opinion, or based on a legal fact?


> Google (search) was built on similar data harvesting and we all loved it in the early days, because it was immensely useful. So is ChatGPT, but people are far more vocal nowadays about how what it's doing is wrong from various angles.

Part of that is that we've seen what Google has become as a result of that data harvesting. If even basic search engines are able to evolve into something as cancerous to the modern web as Google, then what sorts of monstrosities will these LLM-hosting corporations like OpenAI become? People of such a mindset are more vocal now because they believe it was a mistake to have not been as vocal then.

The other part is that Google is (typically) upfront about where its results originate. Most LLMs don't provide links to their source material, and most LLMs are prone to hallucinations and other wild yet confident inaccuracies.

So if you can't trust ChatGPT to respect users, and you can't trust ChatGPT to provide accurate results, then what can you trust ChatGPT to do?

> It gets interesting if there really is a voice actor that sounds just like her, but now that OpenAI has ceased using that voice, the chances of seeing that play out in court are slimmer.

It's common to pull things temporarily while lawyers pick through them with fine-toothed combs. While it doesn't sound like SJ's lawyers have shown an intent to sue yet, that seems like a highly probable outcome; if I were in either legal team's shoes, I'd be pulling lines from SJ's movies and interviews and such and having the Sky model recite them to verify whether or not they're too similar - and OpenAI would be smart to restrict that ability to their own lawyers, even if they're innocent after all.


> Google (search) was built on similar data harvesting and we all loved it in the early days

Google Search linked back to the original source. That was the use case: to find a place to go, and you went there. Way less scummy start than OpenAI.


As an engineer, the current state of LLMs is just uninteresting. They basically made a magical box that may or may not do what you want if you manage to convince it to, and fair chance it'll spout out bullshit. This is like the opposite of engineering.


In my opinion, they're extremely interesting... for about a week. After that, you realise the limitations and good-old-fashioned algorithms and software that has some semblance of reliability start to look quite attractive.


They're free because they could not achieve network effects if there was a monetary cost involved for users. So basically if they weren't free they wouldn't exist (at scale).


With or without EU regulations, client software could decide to discard all cookies once the user has "left" the site. Or it could block cross domain cookies of its own volition. Yes, it doesn't fix the fundamental issue, but it does address it for those that want to fix it against the tide. Yes, it comes with drawbacks, but it is what it is so long as we don't collectively move towards paying for content, ideally in micro form.

I sometimes come across articles in local publications that ask me to subscribe - dude, seriously? Do you expect me to subscribe to an Alaskan publication when I live half the world away and could not care less about what happens there, but just want to read this one article that seems interesting?

So instead we have ad funded websites that have to do what they have to do in order to make some money and keep publishing whatever it is they publish. Hence tracking cookies.

Everyone's needs would be better served if we could pay for content the same way we did back in the day of printed newspapers. You buy today's edition and you get today's edition, and no one except the newsagent is tracking you (if you happen to regularly buy the newspaper from her, she'll remember you, and she may even suggest additional newspapers to buy, but that's to be expected, right? We dislike machine tracking, not humans remembering our buying habits).

Alas, we don't have that. We have intrusive tracking and subscriptions, even though technically it's something we could build in weeks (if only the payment companies didn't make it unfeasible, for their own benefit).

And people do sometimes try to figure it out. Bundles come to mind. Everything -- except micro transactions allowing you to purchase just. this. article. And while micro transactions don't exclude tracking, companies are more likely (is this wishful thinking?) to be careful with a paying customer's experience than with freeloaders, which is what we insist on being, while making demands as to what publishers can do with our data.


> Everyone's needs would be better served if we could pay for content the same way we did back in the day of printed newspapers.

This is one option. Another is that advertisement goes back to those days: you associate an advertisement with the content and with a rough geographical location, and that's it. Non-personalised ads are still possible.


And so has the internet. Some use it for good, others for evil.

These are behaviours and traits of the user, not the tool.


I can drive to school and back in a 5-litre V8 or in a Nissan Leaf.

Neither thing is evil, or good, but the choice of what is used and what is available to use for a particular task has moral significance.


So much of the horribly unmaintainable code I've come across was the result of overengineering that whenever I see an abstraction I have a compulsive urge to scream into the void. Some (the simplest ones) are ok, but other times an abstraction that seems like a good idea on day 0 evolves into an unmanageable hodgepodge of hacks that makes an innocent developer that bumps into it years later question their choice of career.


I don't mind working from home, but I find WFH-accommodating calls exhausting; it feels like they take far longer than the equivalent in-person discussion, in addition to losing the ad-hoc nature of office conversations. The context switch feels heavier on calls than it does in person. Also, the way daily calls tend to be scheduled (everyone at once, let's hash out everything for everyone so we can then get on with things) means I have to sit in on conversations that are irrelevant to me, and for some reason having noise coming out of my headphones means I can't focus on anything else. On the opposite end, I'm now sat in the office while colleagues are having a meeting right next to me and I'm able to focus on writing this reply (or maybe coding, if I weren't procrastinating on HN).

Maybe it's my audio setup, but I've tried multiple types of headphones to no avail. I feel the quality of my work (and life) would improve if I could cope better with calls, but despite my efforts, I can't do it. I've removed myself from calls with an adjacent team that mostly don't concern me, and asked them to ping me only if they need me. That saved 30m to 2 hours of my life and saved me from becoming exhausted before the day has properly begun.

There's also a problem in that _some_ people prefer WFH because they can slack off more (or have multiple jobs); that means they're not always available and not fully committed, with whatever implications for the people who rely on them and the company/product.

Maybe it's down to the pace of the work. Startups, I feel, benefit from in office huddling. Corporations with ample time on their hands could be better suited to WFH.


It sounds to me like you're WFH in a culture that doesn't do async coordination well. I don't know how large an org you're in, but some targeted team process refresh goals/training for better async coord would seem like it could improve your working environment. (edit: this could be a suggestion put to your team leads/managers with the objective of improving culture & efficiency)

Some of that is explicitly writing up short coordinating docs/memos instead of real-time hashing it out, and comes with some of the attendant advantages of writing and thinking often discussed in HN posts.


It's a hybrid 20-ish person team and I work from the office 4 days a week (my choice), but yeah, you're right that we fail at async coordination and fall back onto calls. I've suggested alternatives (meet in the office a couple of days a week) but it has failed to materialise and the only workable solution I found has been to remove myself from noisy/unproductive calls.

I hear what you're saying about docs but knowing everyone I don't feel we could do it. Pre-covid everyone-in-the-office worked well, but you need to drag people into the office these days so it seems we're stuck with an ineffective hodgepodge of calls and in person fractional team chats.


I feel like in a hybrid work situation it's unfortunate that it can fall into the worst-of-both-worlds type situations.

Shifting coordinating-doc habits is hard, but a lot of times the docs are being generated in pieces in an uncoordinated way already. Or it could be a matter of team leadership pointing back to the right contexts - issue discussion etc.


> but I find WFH-accommodating calls exhausting

How many of these are truly necessary?

