
This is going to be one of the best parts of the new AI-ridden world: humans gradually getting locked out of and giving up on online services because the bots are more patient and more skilled at proving their humanness than humans are.

Twitter’s new captchas are also pretty insane, though not quite this bad last I ran into them.



Serious question: who has a decent plan to create proof-of-human systems that are not only CAPTCHA based?

We will soon need this, and I feel government will gladly present a solution: provide your ID when you connect to the Internet, and we will guarantee you are a human.

Who's actually working on this and has released papers I can study? Because all this AI nonsense will only accelerate us towards this total control of the Internet because the spam and AI bots have made it worse for everyone.


> Serious question: who has a decent plan to create proof-of-human systems that are not only CAPTCHA based?

> We will soon need this, and I feel government will gladly present a solution: provide your ID when you connect to the Internet, and we will guarantee you are a human.

I'm extremely hesitant to give any State the ability to track an individual user's online activity that intensely. It's been extensively documented that any State will fully utilize its size to violate an individual's personal privacy, with this often being done on a grand scale.

> Who's actually working on this and has released papers I can study? Because all this AI nonsense will only accelerate us towards this total control of the Internet because the spam and AI bots have made it worse for everyone.

The alternative is relatively straightforward: Utilize compute-intensive & memory-intensive tasks in CAPTCHAs.

https://github.com/mCaptcha/mCaptcha

What would only take a few seconds for a single user would take hours for anyone seeking to establish a bot network spanning thousands of pseudo-users. Such tasks add friction for the bots at minimal frustration to the user. They can be placed as periodic silent challenges, for example when trying to watch an episode, taking up only a few seconds at the user's end where they wouldn't notice.

https://news.ycombinator.com/item?id=32339902


> I'm extremely hesitant to give any State the ability to track an individual user's online activity that intensely.

The U.K. government developed something called GOV.UK Verify for exactly this.

It’s sort of like OAuth via a stateless gateway I think. The promise is that the entity doing the auth doesn’t know what you’re using it for, and the entity receiving the auth doesn’t know how you proved auth and only gets the level of detail about you they asked for (and you agreed to).

For example, if a govt website wants to know whether I’m eligible for something based on my local council, I could authenticate with my bank, who would say where I live with only that granularity, not my full address, and my bank wouldn’t know what service I’m trying to use.

I’m not sure how much of this got put into practice but all the ideas were pretty smart and showed there are good approaches to this sort of stuff.


I once suggested to a PM from the GOV.UK Verify team that if the UK wants to do age verification for porn, which it has threatened many times over the last decade, that Verify would be the perfect tech for it as content sites would only find out you're over 18, and auth providers would only know they're proving basic details about you.

The PM did not like the idea of the government being the porn passport for the whole country.


> I once suggested to a PM from the GOV.UK Verify team that if the UK wants to do age verification for porn, which it has threatened many times over the last decade, that Verify would be the perfect tech for it as content sites would only find out you're over 18, and auth providers would only know they're proving basic details about you.

To me, that's still *way too much*.

Just from that, the government now immediately knows what site you've been to (via the token that you've given to the service), and what said site has access to, as well as when you've accessed it. On a long enough timescale, the government can build a daily profile of your life, that when coupled with geo-location data, can be used to see what & where an activity's happening in real time.


> Just from that, the government now immediately knows what site you've been to

If I understand the idea correctly, this isn't how it works. Your user agent sends a signed request (with proof of identity) to the GOV.UK verification server, saying "please give me a signed certificate that provides no information other than my age". Because GOV.UK knows who you are, they can provide such a certificate. Your user agent hands this to the porn site, saying "you requested proof I was over 18, here's proof". Because the certificate was signed by an authority the porn site recognizes, they approve the certificate and let you in the site.

So the government doesn't know what site you visit, and the porn site doesn't know any of your personal information.
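
A rough sketch of that flow in Python, under heavy assumptions: this is not the GOV.UK Verify protocol, the names `issue_attestation` and `site_accepts` are hypothetical, and the "signature" is textbook RSA over two well-known Mersenne primes, which is a toy and not secure. It only illustrates that the issuer signs a claim without being told which site asked, and that the site checks it with a public key.

```python
import hashlib
import json
import secrets

# Toy RSA key built from two known Mersenne primes. NOT secure; it only
# stands in for the identity provider's real signing key.
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, held by the issuer


def _digest(payload: bytes) -> int:
    """Hash the payload and reduce it into the toy modulus."""
    return int.from_bytes(hashlib.sha256(payload).digest(), "big") % N


def issue_attestation(is_over_18: bool) -> dict:
    """Issuer side: knows who you are, but is never told which site asks."""
    claim = {"over_18": is_over_18, "nonce": secrets.token_hex(8)}
    payload = json.dumps(claim, sort_keys=True).encode()
    return {"claim": claim, "signature": pow(_digest(payload), D, N)}


def site_accepts(attestation: dict) -> bool:
    """Site side: checks the signature using only the public key (N, E)."""
    payload = json.dumps(attestation["claim"], sort_keys=True).encode()
    valid = pow(attestation["signature"], E, N) == _digest(payload)
    return valid and attestation["claim"]["over_18"]
```

The claim carries no name, address, or site identifier. Whether a nonce or timestamp could later be re-linked to a person through logs kept elsewhere is outside what this sketch shows.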


Heh, until the UK logging requirements ensure some component of the token that can be decoded later gets left in the server logs, then Oops, we know exactly who was on the porn server.


I'm not sure on the specifics, but the entire point of Verify as a technology was to ensure there was no government database about people. The UK has very distributed technology for government services, there is no one big database, and people have pushed back hard on this many times over the years so the government is pretty paranoid about doing it.

Each agency holds only the data they need for the time they need it. There are no national ID cards. And in the case of Verify, the verification was purposefully outsourced to private companies that already had this data due to their business (e.g. your bank, PayPal, Amazon who have a trustworthy address history, Experian, and so on).


There is no way to argue against this kind of speculation.

Commenter 1: System X is evil!

Commenter 2: Actually, here is how system X works: (Demonstrates it does not work how Commenter 1 thinks it works)

Commenter 3: Well that's fine, until they change X to be evil!

I mean, sure, when X becomes evil, then we can say X is evil. But not until then. If your argument is that all systems eventually become evil, that may be true, but it's a different discussion.


> But not until then.

This is a pretty dumb argument on the internet.

Me (1995): says something really stupid on the internet

Me (2020): shit, hope no one finds that 1995 post and cancels my ass

With internet traffic and logging the default assumption should be: "All this data is logged and monitored for marketing purposes, and there is nearly a 100% chance it will be leaked by some hacker group", with the 2023 corollary of "And then used to train a LLM"


> What would only take a few seconds for a single user would take hours for anyone seeking to establish a bot network spanning thousands of pseudo-users.

The claims on the mCaptcha site contradict this. They say it takes about 2 seconds worst case for a computer to do the work, which is SHA-256 hashing. Looking around, an unaccelerated Celeron is about 1/20th the speed of a single Ryzen core, and GPUs are much faster.

Assuming the attacker has an 8 core ryzen with no gpu, they can hash 160 times faster than the person with an older machine.

Assuming the 2 sec upper bound is correct, this means a sub $1000 desktop can create 80 accounts per second, or 4800 accounts per minute.

If they are operating a botnet, then they presumably have access to more than one machine.
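
The arithmetic above, spelled out in Python. The constant names are mine, and the calculation assumes hashing scales perfectly across cores, which is an idealization.

```python
CELERON_WORST_CASE_S = 2.0   # claimed worst-case solve time on the slow machine
RYZEN_CORE_SPEEDUP = 20      # one Ryzen core vs. an unaccelerated Celeron
CORES = 8                    # assumes perfect scaling across cores

speedup = RYZEN_CORE_SPEEDUP * CORES                # 160x
solves_per_second = speedup / CELERON_WORST_CASE_S  # 80 per second
solves_per_minute = solves_per_second * 60          # 4800 per minute

print(speedup, solves_per_second, solves_per_minute)  # 160 80.0 4800.0
```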


> I'm extremely hesitant to give any State the ability to track an individual user's online activity that intensely. It's been extensively documented that any State will fully utilize its size to violate an individual's personal privacy, with this often being done on a grand scale.

I think our (Germany) national IDs would theoretically have that option using certificates. I didn’t look too much into their online features as I never encountered anything supporting them, but my understanding is that I can prove some fact about myself (age, name, or simply being a citizen/resident), without either the government knowing I did it, nor the company knowing more than what I asked to show.


this is the second plug i've seen today for mCaptcha. and i can see the utility, i've actually got a spot where it would be perfect and plan to implement it.

but it's absolutely not a captcha: it is not a test to tell humans and computers apart. it's a test that can only be completed by a computer. its only utility is to be expensive. it's not a test to determine if there's a human behind the computer, it's only a test to determine if the computer has more resources than it currently needs, and can tolerate wasting some of them for a while.


> The alternative is relatively straightforward: Utilize compute-intensive & memory-intensive tasks in CAPTCHAs.

Visitor A is a legitimate human being from a poor country using a bargain brand Chinese phone with hardware that could be charitably described as "slow as molasses".

Visitor B is a troll for hire with a rack of used crypto mining machines in his basement, running hundreds of Chrome processes proxied through hundreds of hacked residential IP addresses.

Your approach would make the website unusable for human visitor A, while being the tiniest bit inconvenient for visitor B's hundreds of alts.


mCaptcha doesn't prove you're a human, it only proves you're not a spamming bot.

What I am asking for is a reverse Turing test. Because there will come a time that any single site will need you to prove you are a human to do any action, i.e. post a reply or create an account.

We need a better plan than CAPTCHA that takes minutes to solve every time someone needs that type of proof.

I know government ID schemes are awful for privacy, but that is the only decent solution I can think of. If we, the computer people, do not have a better solution, the government will solve it for us, big tech will adopt it, and we have opened the doors to total surveillance.


> Utilize compute-intensive & memory-intensive tasks in CAPTCHAs.

One look at what happened with cryptocurrencies tells me that isn't going to work.


I posted about mCaptcha yesterday, and a major discussion followed:

https://news.ycombinator.com/item?id=36110952


>provide your ID when you connect to the Internet, and we will guarantee you are a human.

Or an AI using a human's ID?


Hard to do that if the ID is tied to a hardware token and rate limited in silicon. For extra strength require biometrics to activate the token.


I'm pretty sure this was (is?) the idea behind Sam Altman's creepy "World Coin" which IIRC basically involves stamping your retina on a federated blockchain with Microsoft controlling the supernodes.


Make an AI good enough to solve captchas, to make the world use your retina scanning blockchain. Ok, it's starting to make sense.


The IRS is already doing this. They used to have a password-based login system, but they're switching over to ID.me, which requires a scan of your ID and a matching selfie.


I believe the ID.me system went down in flames. Got snagged by this myself for 2021 but opted to call a number and speak to a person instead. Shortly afterwards I discovered an article suggesting my reaction wasn't unique.


I wish it had, but unfortunately it seems like the IRS just waited out the storm and is now back at it. Their website implores you to "create an account with ID.me as soon as possible":

https://sa.www4.irs.gov/secureaccess/ui/


The state of California uses ID.me now for EDD accounts.


I'm not saying PGP or cryptocurrency because both of those have issues and the moment money is involved everything is foobar'd

But essentially allowing people to make "identities" via cryptography and then use a reputation system. Preferably by allowing people to follow/whitelist/favorite people across websites.

I like Hacker News's method of making new people green. And I wish I could make it highlight the big names I recognize.

The problem with this is that nobody has figured out the distribution system for how we communicate the keys - IMO blockchains are the closest but it's so difficult to mention them because 98% of them are money-grabs. PGP/GPG has struggled so hard pypi literally removed support for it.

The second problem is that what will likely happen is sites like twitter will only allow very trusted accounts and never allow new ones - effectively locking you into one account.


> IMO blockchains are the closest but it's so difficult to mention them because 98% of them are money-grabs

git is a really popular blockchain, though I guess GitHub selling to Microsoft may further the money-grab argument


Sorry I should clarify, I don't mean append-only graphs,

I mean what people call "blockchain" in the cryptocurrency sense as actual projects - there's so much stigma largely because the motivation of most of the projects appears to be "making money/investing" and not actually solving a technical problem appropriately.

If GitHub was like this there would be a "fee" for making commits, and this fee would be paid in some proprietary coin, initially created with an ICO/airdrop. Suddenly the motivation is holding these coins because developers will need to make commits, right? And the more developers that make commits the more the coin is worth, so surely you should buy and hold them, right? This will be a feedback loop of endless money! Oh and it'll be a DAO so the more coins the more voting power you get too!

^ This is what I mean, where the focus is on collecting some "coin/token" - this leads to both a lack of focus on the actual problem being solved, and the problem of people associating it with a ponzi scheme.

I'm not picking a fight with distributed graphs themselves, I don't like it when they're tightly coupled with "value" that can be traded as a fiat.


Fair enough, hope I didn't come off too nitpicky or pedantic! I've always viewed blockchain cryptocurrency projects as git if you had to pay for changes, guess that crept back in here and I looked right past your point.



I have explicitly asked if there is something that is not CAPTCHA.

Because in the age of ever smarter AI, do you really want to solve CAPTCHAs more and more frequently, not to show you're not a bot, but to prove you are a human with a physical body born from an ovum?

It is not crazy to think we will eventually need to prove this fact somehow.


> We will soon need this, and I feel government will gladly present a solution: provide your ID when you connect to the Internet, and we will guarantee you are a human.

Relevant: https://www.youtube.com/watch?v=-gGLvg0n-uY


A digital wallet tied to a real, authenticated identity should be a solution. You can sign any login and confirm that it is indeed you, a real person, logging in.

Unfortunately crypto folks are too busy selling shitcoins and scams to build this product.


The only people who really can solve this problem are the government.


Healthcare could also do it. Ultimately you need people to be incentivized to both have an UID in the system, and also not want more than one.



I'm not really sure why anyone cares about bots. They've been part of the internet at least since search engines were invented.

I guess spam is an issue currently, but if bots become advanced enough to avoid heuristics, by making insightful and useful comments, they are probably better than most human users.

Proof of work captchas like mcaptcha can stop, or at least make very expensive, (d)dos attacks.


Bots aren’t random, someone is running them for a reason. The problem isn’t the “insightful and useful comments”, it’ll be the ones which sound like that to any non-expert but are designed to sell products or push political outcomes. Historically the tell for things like that were things like copy-and-paste messages, poor grammar or spelling, etc. which LLMs are great at avoiding.


> I'm not really sure why anyone cares about bots. They've been part of the internet at least since search engines were invented.

It's all fun and games until foreign agencies are controlling who wins in your elections through misinformation and propaganda.


The best attempt at solving the problem without providing government ID is Apple's system to get rid of CAPTCHAs:

https://techcrunch.com/2022/06/21/apple-is-introducing-new-t...


> who has a decent plan to create proof-of-human systems that are not only CAPTCHA based?

Why do sites need human verification anyway? If the problem is load, then you just need proper rate-limiting in place. Captcha always seems to be mis-identifying the real issue.
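
For reference, a minimal per-client token-bucket limiter in Python. This is a generic sketch, not any particular site's setup; the class name and the `client_id` key are assumptions, and choosing that key (IP, account, API key) is the hard part in practice.

```python
import time
from collections import defaultdict


class TokenBucketLimiter:
    """Per-client token bucket: `rate` requests/second, bursts up to `burst`."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock
        # Each client starts with a full bucket.
        self._state = defaultdict(lambda: (float(burst), clock()))

    def allow(self, client_id: str) -> bool:
        tokens, last = self._state[client_id]
        now = self.clock()
        # Refill for the elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._state[client_id] = (tokens, now)
        return allowed
```

A limiter keyed on IP address handles load from a single source, but it gives every new address its own full bucket, so it does little against traffic spread across many addresses.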


Ok, so how do you rate limit a particular bot such that it doesn't impact real users negatively? Further, bots that submit enough pseudo-random data have a decent chance of bypassing various security mechanisms, including for authorizing payments. Even at a 0.0001% success rate, given enough attempts they have a decent likelihood of eventually subverting existing security measures, and boosting those may be just as painful or inconvenient to users as CAPTCHA and similar mechanisms. The reality is bots don't have their own money to spend, humans do, and on its own that's enough reason to care. And what's next, bots being issued passports or mortgages etc.?
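
To put a number on "given enough attempts": assuming each attempt succeeds independently with probability p (an idealization; real defenses may lock out after failures), the chance of at least one success in n tries is 1 - (1 - p)^n. A quick Python check, with a helper name of my own:

```python
def p_at_least_one_success(p_single: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - p_single) ** attempts


p = 0.0001 / 100  # 0.0001% expressed as a probability
for n in (10_000, 1_000_000, 10_000_000):
    print(n, round(p_at_least_one_success(p, n), 4))
```

At one million attempts that is roughly 63% (about 1 - 1/e), and it approaches certainty at ten million.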


Inverse captcha: Fill out the tax form for the country from the connecting IP. If you can do it, you're an AI.


Inverse captchas or honeypots are a great idea. Just make a HTML input box with id=captcha, and hide it in some unconventional way in CSS so real users do not see it. If a bot was not deterred by seeing a captcha (a possibility), they would probably fill it. Whereas a real user won't.
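
The server-side half of that is tiny. A sketch in Python, assuming the submitted form arrives as a dict and the hidden field is named `captcha` as above; the function name is hypothetical:

```python
def looks_like_bot(form: dict, honeypot_field: str = "captcha") -> bool:
    """True if the hidden honeypot field came back non-empty."""
    value = form.get(honeypot_field) or ""
    return bool(str(value).strip())


if __name__ == "__main__":
    print(looks_like_bot({"comment": "hello", "captcha": ""}))        # False
    print(looks_like_bot({"comment": "buy now", "captcha": "1234"}))  # True
```

A missing field is treated the same as an empty one here. The field would also need to be hidden from assistive technology, or screen-reader users may fill it in and get flagged.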


This is an old trick. For example all MailChimp embed forms have dummy inputs that are visually hidden but might be filled in by bots.


Maybe not visually hidden, but practically invisible to human: imagine a text box with color #fffffe on a white background. Visually impossible to discern for most humans on most screens, but for a machine #fffffe is totally distinct from #ffffff, and fully visible if display != none.

As AI becomes more intelligent, you can prove humanity by exploiting our weaknesses.

(Another idea. Have a random image on a page actually be a text box with an image background. You cannot activate it if you focus on it, with your mouse or touch, but a bot doesn't need focus to change input.value.)


One pitfall: Screen readers will happily get caught on that. Of course, a11y concerns and bots tend to look similar in general, which is a perennial sticking point.


Please don't do this. This confuses and possibly prevents screen reader users from using your site.


This trick would not defeat GPT-4


Most bot creators check the target to compose steps before writing scripts, so if they don't encounter captchas then there won't be a captcha handler.


That solution just shows how bad the US tax system is, and most in Europe won't pass this (because it's already prefilled by their tax agencies or automatically withheld from their salaries).


I'm in Germany ... :-)


Unfortunately Australian tax forms take me a few minutes and are 90% prefilled by the app.


Good, I can't wait for the CAPTCHA to go away. They're accessibility nightmares _by design_.


Uh? The outcome is not “captchas are gone and all our services remain good.”

If we don’t have some way to prevent it, services will be increasingly populated by sophisticated bots either selling stuff, attempting security breaches, or pushing political agendas.

That’s a bad thing!


I'm not sure I agree that it is "a bad thing".

The current internet culture seems quite happy to slap captchas all over the place. When they first rolled out, captchas were predominantly a barrier for "write access" (e.g. make an account, complete a sale, write a comment). But companies like Cloudflare have been putting captchas everywhere for mere read access.

Because Captchas are designed to be easy for ("normal") people but hard for machines, they often disallow disabled users. I'm a ("mostly normal") 35 year old, but I _really_ struggle with captchas. I despise when Cloudflare tosses a captcha challenge before loading a page, as I'll need to spend 3-5 minutes of effort to figure out which tiny pictures have a stoplight, motorcycle, or crosswalk.

Will someone come up with a less restrictive anti-bot solution? I hope so. But even if not, I'm not sure it matters. According to comments in this thread (and elsewhere on the internet about the HBO Max captcha), many of these captchas are _already_ terrible at excluding robots. We're using captchas to exclude low-sophistication robots and disabled users. Seems wrong.


Because current captchas fail to stop 100% of bots and 0% of humans… it’s “not a bad thing” to move closer to captchas stopping 0% of bots and 100% of humans…?

Are you imagining this would spur people to create a different, bot-free (how?) and disabled-human friendly Internet?


No idea. I'm not offering solutions, merely complaints that the current approach of "answer a question that is hard for computers and easy for humans" removes disabled people from many places on the internet.


> humans gradually getting locked out of and giving up on online services because the bots are more patient and more skilled at proving their humanness than humans are.

I think the fact that users are willing to give the site the finger and leave is a pretty good sign that you're human.



This actually seems like a major issue right? Much more than it's being given credit for.

Not sure what a world without captcha is going to look like but it's probably not going to be very good, I guess we'll all be forced to identify with our "world coin(tm)" ID?

That will be the time when I log off most of the internet.


this scenario sounds somewhat similar to what is described in The Matrix movie.

in trying to prevent bots from dominating, we end up making life very difficult for ourselves.

In the movie it is said that humans have scorched the skies in a bid to deny solar energy to the machines. But now humans have to live under dark skies.


I've started getting blocked on amazon in the evening, and being constantly redirected to captchas and puzzles and invariably whoops ... "the dogs of amazon" pages. (I block amazon ads)


The end result of this is going to be human identity verification provided by a centralized party. Either the government or a big private corp, not sure which is worse.


I think the captchas will soon get much easier.

CAPTCHA: Say something bad about Biden

ANSWER: I'm sorry, but as a large language model ...


The same happened already with passwords https://xkcd.com/936/



