Hacker News
Twitter Should Open Up the Algorithm (every.to/divinations)
71 points by dshipper on April 16, 2022 | 136 comments


Recommender systems like Twitter are as much about data as they are about code. Without the dataset and the derived statistics and models that are used for ranking and recall, the code is not going to help much with transparency.


Exactly. “Here’s the code for our matrix multiplication, dot product, and sorting; you’re welcome to check it for bugs.”


> Recommender systems like Twitter are as much about data as they are about code. Without the dataset and the derived statistics and models that are used for ranking and recall, the code is not going to help much with transparency.

And it also goes the other way around: the data depends on the algorithms and tools that created it. You can't have an insightful understanding of data without knowing how and why it was created.


True - however, the thrust of the arguments for opening Twitter seems to be focused on opening up the code - so the “code without data” scenario is the one worth homing in on.


Right. It’s a bit worrisome that people (even people in this very thread) know nothing about how recommendation and ranking work but think they know everything. “Release the code” is nonsensical.


This is a ridiculous claim. While you won't be able to run the code yourself, it would still provide insight into the inputs in relative terms, and offer evidence of methods that are not being used.


> and provide insight as to some things that are not being done.

That's not an algorithm, that's product strategy & management.


It would provide insight, but if the goal is to increase transparency to the point of meaningfully improving trust in a centralized actor like Twitter, I don't see it moving that needle much. Given the right data you can coax a lot of these algorithms into arbitrary outputs, so if you're starting from a place of distrust and can't replicate the outputs yourself, the code alone won't give you evidence to alter your priors significantly, imo.


The point is to increase transparency, period. And you are the one arguing against that. Nobody is saying it’s enough, but it’s objectively better than nothing, and it would make it possible to ask more specific questions and make more specific arguments for the value of more transparency. Again, you are the one arguing against it but you have not provided a reason why.


I’m not arguing against it - I’m arguing that people who are focused on the idea of releasing the code are miscalibrated on the marginal impact of doing so given a lack of domain knowledge.

The net suggestion is to demand more, and to start describing solutions today for getting data visibility without compromising other things. That would help offset the inevitable objections that will come once people are forced to demand the data, having realized the code didn't really solve the fundamental problem.


Ironically, the best way to prove your own point is to open the source code.


The point I'm stuck on in this whole conversation is why you'd continue to use Twitter if you felt they were untrustworthy. Why are we discussing freedom of speech when you can just load another app and get all the freedom you want?


The reason people are complaining is that we have found ourselves in a situation where Twitter arguably already is, or will indisputably become, what amounts to a global public square.

The early philosophical developments around freedom of speech didn't foresee this, but now that it is here, we have to ask what we ought to want if we are to uphold the principle that people should be free to speak unpopular ideas without censorship or fear of excommunication, given that free speech is a key ingredient in progress, in revelation of the truth, and in finding compromise to avoid violence.

Twitter is of course a private company, and can do what it wants. But those who wish to see free speech principles upheld say we ought to want those in Twitter's position to be as lenient as possible, so as not to stifle the free exchange of ideas. And, perhaps failing that, we ought to want another mechanism that is less susceptible to widespread censorship and overreach. Too many people counter someone explaining that they feel the status quo is undesirable with an is/ought fallacy - it's our job in liberal democracies to continually raise and debate issues regarding basic freedoms and the ability of our society to continue to evolve according to shared principles.


I dunno, I deleted my twitter when they banned the sitting president for being too hyperbolic.

A social network complaining about hyperbole and outrageous claims being made on its platform is like complaining that people are breathing too much air.

Twitter is not a place grownups should be participating.


I think that's what everyone is talking about. They're just using the term algorithm colloquially.


I don’t think so - because releasing the data is a massive hill comparatively due to the immense privacy and legal issues.


Hmm Twitter already sells its "firehose" to the highest bidder.


I’m sure Twitter’s algorithms take advantage of all interaction data, like impressions, scrolls, and clicks. Agree, though, that it’s good a lot of the data is public.


Publishing user scrolls is a "massive hill" and a legal/privacy nightmare?


Sure seems like it could be - I wouldn’t opt in to allowing mine to be distributed; why would I? And I don’t think Twitter has the rights to do so. So you’d at least need to solve the anonymization problem, or you’d have to package a data release such that people can replicate the ranking algorithm without it.


User clicks in form of "likes" are already public so why would scroll data be so sacred?


Oh, you mean the matrix came up with a rule to ban people talking about the lab leak? But the matrix can't figure out simple crypto bots and needs humans to report and manually ban them?


That's moderation which is a completely separate topic from recommendation.


At least according to this, Twitter both agrees with you and is in the process of exploring transparency for both.

https://blog.openmined.org/announcing-our-partnership-with-t...


I think this is a great idea, and it shouldn't be limited to Twitter. YouTube's recommendation algorithms should be handled similarly; indeed, any public-facing system needs to be honest about the kind of spin it's putting on its recommendation algorithms. Even better, allow users to spin the dials when it comes down to their own particular searches/results.

> “One of the things I believe Twitter should do is open source the algorithm and make any changes to people’s tweets — you know, if they’re emphasized or de-emphasized — that action should be made apparent so anyone can see that action’s been taken. So there’s no sort of behind-the-scenes manipulation either algorithmically or manually.” Musk also added later, “The code should be on GitHub so people can look through it.” (CNBC interview)


> allow the users to spin the dials

If I could pick something to regulate about recommendation algorithms, probably a big part of it would be mandating consumer controls over the algorithm. Complete transparency might be hard when sometimes even the programmers of these systems don't have the ability to see why they recommend what they do, but there should definitely also be mandatory oversight by consumer protection agencies that have full ability to audit the code and report on it.


Sounds like a bad idea. Bad actors would immediately exploit it and figure out how to most effectively flood everyone with spam, same as if Google's exact algorithms were public. Not everything has to be open.


On occasion, people make the same argument about all open source software, that making it public makes it insecure.

Fortunately this has not turned out to be the case in reality. If there are flaws in the system they will probably be easier to find and fix with more eyes on it.


In open source development you're trying to prevent exploits from being possible as much as you can, and for anything new you have time to get multiple people thinking about it before a release goes live.

In content moderation vs spammers, you have spammers adapting to your tools in real time while posting content in the same way real users do, and the balance between "don't want to accidentally ban real users (who post stuff very similar to the bots)" vs "does it matter if someone gets past your defences a few times an hour" is massively different too.


The consumption of FOSS and the consumption of tweets are so different I don't really see the merit in comparing the two.


This is a terrible comparison. With OSS, people are looking for security flaws. With a Twitter/Google algorithm, people are trying to exploit the system when it works as intended. If there is a secret formula for a successful tweet, people will find it and repeat that behavior ad nauseam.

The problem with google search isn't that people are exploiting 0-days that the developers aren't aware of. The problem is that the majority of the internet is becoming nearly identical pages that do the bare minimum to get ranked by search engines. Making the algorithm public will only exacerbate that


If there is a "secret formula" (exploit), then that is a flaw in the system that should be addressed; it will be discovered by sufficiently motivated attackers, as it already is regularly. This is security through obscurity, and as you point out, it clearly is not working.

You assert that more eyes on the flaws would make the issue worse rather than better, but I disagree.


The ratio of risk and effort to payoff for gaming tweets vs. compromising operating systems is radically different.

Further, compromise of one box typically does not render all of them running that OS useless and harmful.


The “many eyes” adage is comfortingly common, but untrue. Were it true, Heartbleed and its ilk would never have occurred, given how widely used and distributed OpenSSL was.


It's not about security bugs in this case, anti-spam needs obscurity as a layer of defense.


Not if, as the article suggested, there were multiple algorithms to choose from.

Especially not if the top choices were fundamentally different from each other.

It may go so far as to make it significantly more difficult for bad actors to operate.


Is that what happened with Linux and Mastodon?


Yes it'd be a cat and mouse game and that's how you build an efficient and fair algorithm. Not too unlike how the market decides what the price should be for a certain instrument. Lots of buys, lots of sells and where they meet is called the price. All in the open for anyone to see.

Let's stop the communistic mentality of keeping things hidden. Please read about Soviet Russia, about Siberia, and about how Russians had no idea of the miseries inflicted on them for decades.


The most contentious moderation happens via subjective human intervention rather than algorithms. Fully machine-driven moderation is a black hole into which copious amounts of time and money from the R&D labs of the biggest social media companies have disappeared every year for a long time now.


But the article is not talking about content moderation as such. It's about surfacing the right content for the right people, and regaining the trust that the author believes Twitter has lost, without really justifying how.


> The most contentious moderation happens via subjective human intervention rather than an algorithms.

1. Source?

2. Thought experiment: if said subjective human intervention was recorded and codified into an algorithm it wouldn't be as contentious?


Source for probably the most contentious piece of moderation Twitter has ever done being manual was their own statement:

“After close review of recent Tweets from the @realDonaldTrump account and the context around them we have permanently suspended the account due to the risk of further incitement of violence,”


Or just give us more options for viewing a feed than "recommended" or "latest."

Heck, even create an algorithm store where people could create and purchase different ways to view their feed.

It doesn't seem that difficult to add these multiple sort and filter options, but maybe it's more complex than I imagine.


I feel like authors are thinking about the "algorithm" almost as if it's just a bunch of if-else statements in a tree.

A read-only audit is an overly naive way to think about understanding complex algorithms. In an audit they would find an ensemble of large DNN transformer models with thousands of layers and thousands of features, often with their own transformation trees. There are entire CS departments dedicated to researching tools to understand complex nonlinear models. You definitely can't do it with a read-only code audit. You can _barely_ do it with full access to the model, full access to the input data, and the ability to retrain and rerun the model on that data.


You are suggesting the people who read HN and might click over to the "open Twitter" codebase won't understand it?

Engineers don't exist in a vacuum, the same engineers who write the algos are here on HN. Show HN the Twitter source code and a lot of engineers will understand it.

What is more likely is that the engineers are currently bound by NDAs and can't say what influences the algo.


Comments saying that this is naive seem to be completely missing the point in their desire to cynically defend a miserable state of affairs: clients, and not Twitter's servers, should ultimately determine what users see. Like the web browser or an email client.

The fact that we compute on servers precisely what users see on their devices is a result of the architectural constraints of the time and, more importantly, the ad model of monetizing social networks.

If social networks were not monetized in this way, there would be far more power allocated to clients and APIs.

This isn't to say user devices can do everything: they can't. But they can easily be given significant power to filter, reorder, and request different content — and with more advanced engineering, allow users to parameterize feeds.

The reason we can't have this is that allowing this degree of user choice undermines the ads model.


> clients, and not Twitter's servers, should ultimately determine what users see. Like the web browser or an email client.

That would involve giving clients access to information that clients probably shouldn't have. eg: If a part of the weighting for recommendation is that people you follow who regularly DM other people you follow should be weighted higher, doing it client side would allow you to see other people's private DM information.


The author's suggestion of a marketplace for algorithms sounds really cool. Whether it's easily technically feasible is another question entirely but the idea of that level of customization would be really cool. Some of the functionality is already there with Lists (as far as allowing regular users to create feeds and others being allowed to subscribe to them).

I do worry about the political implications of it though. People choosing to subscribe to only their world view would create an impenetrable echo chamber. At least now, there is some crossover. If people chose algorithms that avoided the other side of a debate, it would make things worse. Furthermore, you could have outside influences tricking people into certain algorithms to secure a populace with their beliefs.


If the problem is that the "recommendations" are biased, wouldn't it be simpler to:

1) Remove the recommendation part. Go back to a simpler version of Twitter.
2) Still give the option for a recommendation system but, somehow, open the code, maybe some version of the data, and publish documentation (like papers or something) detailing the training process of the current running version.

than build a recommender system marketplace? Also, in what sense would this be different from allowing third-party Twitter clients plus opening a richer API?

The ultimate idea of using ML is to "automatically" build the recommender system the user likes most (measuring this with some particular metric, like online time or retention) and to automatically adapt it as their preferences change. The problem to me is more the metrics chosen to be optimized.

However, I believe that in the end, and in order to be profitable, user retention and time on the platform will still be pursued. It doesn't seem like an easy fix to me.

Regarding the "free speech" part, I'm not an expert, but I'd say (after having watched the TED interview) that countries' legislation will considerably constrain this.

I love the idea of a truly free platform, though.


Amazingly Naive.

The sheer volume of data and code needed to make the “algo” scale to what it is means there is no good way to “open it up”.

But let’s say, and why not, that it actually was opened.

I can see it now, everyone and their mother would be recommending changes to it. People would want it at the extremes, or tweaked to just not recommend their pet peeves.

And if they did not get their way, they would go off in a huff and threaten to join another service.

Years ago I ran a big enough service that had millions and millions of users every month. A member of the military came on and demanded we allow him to do something that was against TOS.

He complained that this was what he fought in Iraq for. So he could write anything he wanted anywhere.

America is certainly not free and it’s time we stop giving our content to services like Twitter and Facebook if we believe that.


> I can see it now, everyone and their mother would be recommending changes to it. People would want it at the extremes, or tweaked to just not recommend their pet peeves.

I think the solution to this is to let people switch between different recommendation algorithms: a stream of chronological tweets, most liked among recently posted, one powered by ML, etc. I don't see any reason the implementations behind this, or even Twitter more broadly, could not be open source. There are many other open source projects, like Signal, that manage this just fine. And for a while, even Reddit was open source. It's definitely doable.


Exactly. While ML interpretability may be tough, I don't really read it like that. More here are X number of algorithms and each one is aiming to do Y. Chronological, Engagement, Happiness, Current Trends, etc.


I think they're looking for a statement like:

if (tweet.from in bad_users): score *= 0.8

Do they have levers in place to manipulate results, or is it a clear, objective scoring system that determines what you see in your feed and search results? I believe they have something like this. To what extent it's being manipulated is a separate question.
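To make the distinction concrete, here is a toy sketch of a scoring pipeline with exactly that kind of lever. Every name, signal, and weight here is made up for illustration; nothing below is Twitter's actual code.

```python
# Toy feed-scoring sketch with a manual intervention "lever".
# All names, signals, and weights are hypothetical.
def score_tweet(tweet, viewer, deboosted_authors):
    score = 1.0
    score += 0.5 * tweet["likes"] ** 0.5          # engagement signal
    if tweet["author"] in viewer["follows"]:      # social-graph signal
        score += 2.0
    if tweet["author"] in deboosted_authors:      # the manual lever
        score *= 0.8
    return score

viewer = {"follows": {"alice", "bob"}}
feed = [{"author": "alice", "likes": 100},
        {"author": "bob", "likes": 100}]
ranked = sorted(feed, key=lambda t: score_tweet(t, viewer, {"bob"}),
                reverse=True)
print([t["author"] for t in ranked])  # ['alice', 'bob']
```

An audit of code like this could confirm that such a lever exists, but without the accompanying data it could not show how often, or against whom, it is pulled.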


That's why a lot of responders are calling the approach naive. This function is potentially the last layer in thousands of functions. Thinking that one can predictably change the behavior of a large nonlinear model via one variable is like thinking you can change the US political system by passing one good law in Reno, Nevada.

This approach only works if auditability is mandated to also be a property of the model, which typically reduces model complexity and accuracy too. We do this with credit scoring models semi-successfully.


The interesting question this poses, and which has come up in a few comment threads here, is:

Is "security through obscurity" necessary for automated content moderation?

Would an open algorithm be trivially gamed by spammers, since they could then test offline exactly how their posts will be ranked/promoted?

My gut says yes but I'm not an expert in this area. Curious if anyone has a theory or idea on if an open moderation algorithm could work.

SpamAssassin exists and is open, to moderate success. But is that just because its use is not widespread enough for spammers to bother testing their spam against it?

If every email account in the world were covered by SpamAssassin, what would spam look like, and how much would make it through?


The Twitter algo shouldn't be complicated. For the most part Twitter shows me posts and likes from people I follow, occasionally sprinkling in a topic I follow or a viral tweet.

I want to read stuff from people I follow in more or less chronological order. Sure, if something has 1k likes, was posted a few hours ago, and I hadn't seen it, show me that. There are simple formulas for time-based rank out there.

But I regularly see Twitter suppressing posts from people I follow. I won't see someone tweet for a few weeks, then I check their account and see they've been tweeting this whole time. It's wrong and annoying to me as a user.
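One commonly cited time-based formula is the "gravity" ranking popularized by Hacker News: engagement divided by a power of the post's age. The constants below (the 2-hour offset and 1.8 exponent) are the commonly quoted HN values, not anything Twitter is known to use:

```python
# HN-style "gravity" ranking: engagement decays with age.
# Constants are the commonly quoted HN values; real systems tune them.
def rank(likes, age_hours, gravity=1.8):
    return likes / (age_hours + 2) ** gravity

# A 1k-like tweet from a few hours ago still outranks a fresh tweet
# with modest engagement...
print(rank(1000, 3) > rank(50, 0.5))    # True
# ...but not once it's a week old.
print(rank(1000, 168) > rank(50, 0.5))  # False
```

The appeal of a formula like this is exactly what the comment asks for: it is simple enough to publish, audit, and reason about.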


I wish they'd just let me control what I see. Sure, throw an occasional ad in there to make money. It would be hard to blame them for the nastiness on Twitter if they didn't control it.


Make moderation guidelines and moderation audit trails public and open.


Not sure there is anything to gain here. Moderation is inherently subjective, especially at the scale of Twitter. The same people would just continue arguing about what should and should not be moderated and how.


Transparency would be helpful. At the very least it would cut off the speculation about what did or didn't happen behind the scenes. Right now it's a black box.


Transparency would mean that every action they take gets scrutinized to absurdity; there's no way they'd open that door of their own volition.


In all 14,000,605 futures people will argue about this.

I think it would be better if it was an informed argument.


Both filtration (moderation) and sorting (the algorithm) should be handled exactly as Adblock handles it: people can create any list they want, others can choose whether or not to subscribe to those lists, and anyone can at any time see what the feed would look like without them.


Yes, like the lobste.rs moderation log. It's cute because, like, nobody cares.


Lots of good points here about how opening up the algorithm itself won't illuminate much and is technically extremely difficult without potentially devaluing the company since you would also have to open the data system behind the algorithm.

However, open sourcing any and all manual interventions over the algorithm + the guidelines used for evaluation and/or labeling (if any is done), would help to build a little bit of trust.

Not that much though, but it would be a start.


Let’s start with Hacker News.

What would you say about opening up a list of every poster who’s been blocked, and exactly how and for what reason (or keywords) they’ve been blocked?

How about opening up what keywords trigger mail to go to spam vs inbox for email providers? It’s going to be very valuable for someone to know how spam filtering works if their delivery rate doubles!


What would be much more interesting would be Twitter allowing users to define and publish their own algorithm, like lists with superpowers.

Stumbled on this idea on https://twitter.com/nbashaw/status/1515054551371378688


So by the same logic McDonald's should publish its secret sauce recipe. There is business value in keeping it hidden. It'd have a chance of happening if Twitter was taken private by someone with the intention of experimentation and improving the world in the long run.


McDonald's is required in some places (Canada, for example) to publish the ingredients for its food. For example, Big Mac Sauce:

Soybean oil, sweet relish (cucumber, glucose-fructose, sugar, vinegar, salt, xanthan gum, calcium chloride, natural flavour), water, vinegar, egg yolk, onion powder, spices, salt, propylene glycol alginate, colour, sugar, garlic powder, hydrolyzed (corn, soy, wheat) proteins. CONTAINS: Soy, Wheat, Egg, Mustard.

-- https://www.mcdonalds.com/ca/downloads/IngredientslistCA_EN....

And there are strict rules about how that ingredients list must be constructed:

"Ingredients must be declared by their common name in descending order of their proportion by weight of a prepackaged product. The order must be the order or percentage of the ingredients before they are combined to form the prepackaged product. In other words, based on what was added to the mixing bowl"

and on and on for many thousands of words, https://inspection.canada.ca/food-label-requirements/labelli...

No, that's not the full recipe, but it is information a company might not want to disclose, but is required to disclose because it affects the health and safety of consumers.


"Open up the algorithm" is such a charmingly naive take on community moderation at scale that I can almost forgive the author for literally parroting Musk as the source of a lightweight take synthesized mostly from other people's thoughts in order to attract attention to his "Every" newsletter platform.


I think that's a way to call for transparency. For instance, if they say they don't shadow ban people, then it should be evident in the code. And it would force them to have a clear process for things like search without any "nudging" or social manipulation.


Do you consider Jack Dorsey naive on the subject?

"The choice of which algorithm to use (or not) should be open to everyone" - Jack Dorsey

https://twitter.com/jack/status/1507146276416098307


I sure do. "The algorithm" fits nicely into articles complaining about unfair treatment, but it's not a real thing. Any mildly complicated recommendation system is going to be 1) more than a single algorithm and 2) far beyond the reasoning skills of laymen.


> Do you consider Jack Dorsey naive on the subject?

On the contrary, Jack knows what he's talking about and wants this because it would allow him to abdicate responsibility for Twitter platform moderation.


Given that Jack Dorsey is also on record claiming that Bitcoin will lead to world peace, I think it's fair to say the answer to that question is yes.


What, in your opinion, makes the request naive? Too complex to open source, or too complex for people to analyze and criticize? I can see some people becoming Twitter algorithm experts, just as we have Linux kernel developers.


"Open up the algorithm" is a fantasy where you point a function at a tweet and it returns a boolean. In reality, content moderation at this scale is a complex problem that requires a sophisticated system of many parallel strategies (some of which are more automatable, some less) and many thousands of people to manage.


Is, in your opinion, the Twitter moderation algorithm more complicated than the Linux kernel or Bitcoin codebases? Those are audited and modified publicly all the time. The human part can also be audited if the rules are stated clearly and an explanation is always given when an action is taken.


How do you know? Where did you come up with evidence to confirm that statement? There has never been a platform as large as Twitter that opened up its algorithm.


As an example, the bits of Google's original algorithms that they did talk about (pagerank using reputation of in-links to propagate to a page) publicly began getting gamed into irrelevance fairly early on (link-for-link schemes being the start of the SEO industry). After that it's been a continuing struggle of a whole industry looking for weaknesses and Google adapting.

Another way of thinking about it, is that if someone could see what features were most important for the ranking on some site, then they could start to optimize for those, breaking the usefulness of that feature. One obvious example of this is "Please remember to like, comment, and subscribe" on YouTube.


For a long time, Reddit was open source. It can be done, plus additional transparency on top, like giving people the option to switch between different recommendation algorithms, making it clear why content was recommended, and so forth.


Twitter et al should/must let people opt-in to recommendation algorithms (opt-out being the default), and must have a "reset all recommendations" button present and visible at all times.

I don't care about the actual algorithm.


The best feed is the chronological, non-algorithmic feed.


An unmoderated chronological feed would simply be a running list of every tweet from everyone.

A screen full of this feed would just be tweets made at the moment the data for the screen refresh was fetched.

Is that what you or anyone wants?


Yes. This is exactly what I want.

It's not a global feed of every tweet in existence; it's a feed of the accounts that you proactively choose to follow.


How/where does discovery happen?


As a pull, not a push. Meaning, in the very beginning, from the chronological feed itself. Later, by following conversations from the accounts you follow to new interesting accounts.


I suppose you can ditch twitter and switch to reddit.


I don't understand what that has to do with my comment.


I still would prefer that to what is currently there.


Yes, of the people I choose to follow.


Truly do not understand this point of view. The strictly chronological presentation is too easy to flood with unwanted posts.


You could unfollow the people who spam you with unwanted posts. I find strictly chronological feeds to increase the control I have over what I see by not prioritizing the more popular posts of those I follow. It also makes it much easier to say “I’ve caught up to where I was last time, good time to stop scrolling”. (Parts of a website that center around discovering new content do need curation for the reason you mentioned, but content discovery is to me not the primary purpose of twitter style sites)

Chronological presentation is less useful for serving advertisements though, so I don’t expect it to show up very often.


I also do not understand the theory that the order of presentation is necessitated by advertising. Adverts are inserted into the 2nd position regardless of how the feed is ordered.


It's more like the social network can use the algorithm to surface clickbait or whatever drives their metrics.


You're discussing some hypothetical. Actual Twitter just sells the 2nd position in the feed for targeted adverts. That's it. Nothing fancy.


Incorrect. Twitter has been pushing its ghastly “Home” on those who prefer “Latest Tweets” for a while.


Simple solution, unfollow the spammers.


I choose people to follow. If somebody is flooding my feed more than I like, then I unfollow them. Truly do not understand why anyone would want it to work differently other than marketers.


But it's not spam. If fifty people you follow retweet the same popular post, you do not want to see it fifty-one times. You want to see it once, and you want it ranked highly because everyone you follow thinks it's an important post. It's a form of crowdsourcing from a curated group of contributors. The chronological view cannot accommodate that.
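For what it's worth, the collapsing part of this is simple enough to sketch even client-side. The feed-item shape below is hypothetical, not Twitter's API:

```python
# Sketch: collapse duplicate retweets in a chronological feed while
# keeping the "N people you follow shared this" signal.
def collapse_retweets(feed):
    seen = {}  # original tweet id -> entry (dicts keep insertion order)
    for item in feed:
        tid = item["original_id"]
        if tid in seen:
            seen[tid]["shared_by"].append(item["by"])
        else:
            seen[tid] = {"tweet": tid, "shared_by": [item["by"]]}
    return list(seen.values())

feed = [{"original_id": "t1", "by": "alice"},
        {"original_id": "t1", "by": "bob"},
        {"original_id": "t2", "by": "carol"},
        {"original_id": "t1", "by": "dave"}]
for entry in collapse_retweets(feed):
    print(entry["tweet"], len(entry["shared_by"]))
# t1 appears once with 3 sharers; t2 once with 1
```

Whether the collapsed post should then be re-ranked by sharer count (as argued above) or left in chronological position is exactly the disagreement in this thread.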


That’s more of a problem with Twitter’s design. I don’t want to see any “retweets”. If someone I care about replies to something then (optionally!) show me their reply. If I find that compelling I can follow the primary source. I do not care about regurgitated content.


Again, I’ll control that by curating my own follow set. I don’t need some algorithm to crowdsource for me. That’s the last thing I want.


That only works for daily addicts. Otherwise, the low-volume accounts get lost in all the fast-posting people.


Even if you “open” the algorithm you won’t have proof that the system is in fact adhering to said algorithm or even including all data in its set.


Do people really browse those recommendations? Twitter is for following a couple of people, switching to timeline mode, and done.


Elon will definitely access private conversations if he takes Twitter private.


In my experience as someone who resides in India, I have seen Twitter repeatedly silencing voices of the opposition and amplifying the voices of those connected to and aligned with majoritarian, right-wing elements - and I don't think it is because of any algorithm. There are clearly people involved, deciding case by case to shut out voices of the opposition.

We have seen this with Facebook as well with Facebook charging 3 times higher for ads from the opposition parties in a bid to influence the elections: https://www.aljazeera.com/economy/2022/3/16/facebook-charged...


Amazingly naive to develop an operating system in the open. Amazingly naive to try and build an electric car company. Amazingly naive to think the internet will be worth anything. Amazingly naive to think that everyone will need a computer in their pocket.


A marketplace of algorithms and "pick your own algorithm" would be interesting. Is there any reason not to do this?

Not sure why this is getting downvoted. It's not crazy. Even Jack is pushing for it: "The choice of which algorithm to use (or not) should be open to everyone"

https://twitter.com/jack/status/1507146276416098307


Several reasons.

1. Absolutely incredibly complex architecture. The recommendation engines are not just single libraries that can be easily open sourced or transformed to a plugin system.

2. Recommendation engines exist to push up relevant metrics. These are either clear wins for users (like reported satisfaction), mixed wins for users and businesses (content engagement), or clear wins for businesses (ad engagement). Most businesses aren't thrilled about subbing in systems that degrade metrics.


I wonder about the feasibility of this.

How does an algorithm "that attempts to prioritize nuanced conversations about important topics" work? (Or "to find mind-expanding threads", for "savage dunks" or "thirst traps of hot new snax"? -- other examples from the post.)

I suppose you spend some time with existing tweets and ML and develop a model that can produce a score of some sort on these concepts for a tweet, and then run every tweet through the model and present the high-scoring ones. Of course, you can't just look at an individual tweet, which in isolation doesn't mean much; you also have to look at how it fits into a conversation. (For that matter, I'd be interested to see what a model of a conversation is.)
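A minimal sketch of the score-and-rank loop described above, assuming a hypothetical "nuance" model. The model here is a crude keyword stand-in, and the field names are invented; a real system would use a trained classifier.

```python
# Hypothetical sketch: score every candidate tweet with a model and
# present the high scorers. KeywordNuanceModel is an invented stand-in
# for a real trained model; tweet dicts and field names are assumptions.

class KeywordNuanceModel:
    """Stand-in for a real ML model: counts hedging phrases as a crude proxy."""
    HEDGES = ("however", "although", "on the other hand", "it depends")

    def score(self, text):
        lowered = text.lower()
        return sum(lowered.count(h) for h in self.HEDGES)

def rank_by_score(tweets, model, top_k=50):
    """Return the top_k tweets ordered by the model's score, highest first."""
    ranked = sorted(tweets, key=lambda t: model.score(t["text"]), reverse=True)
    return ranked[:top_k]
```

The expensive part the comment worries about is exactly the `model.score` call run over every tweet, which is why accuracy-versus-cost is the real question.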

Sounds expensive and quite possibly not accurate enough to be worthwhile.

It also seems like a major strategic decision to give third parties every tweet by everyone. It seems like the business changes from being what twitter is now to a message routing backend, and these third parties become what twitter used to be. That's a fundamental shift, that probably devalues the company by an order of magnitude, since they would be dissolving the valuable thing they have -- the social network they have.

Just doesn't make any sense to me.


Any reason to keep the general public from running arbitrary database queries? Yes, I think there is. For example, regular expressions are Turing-complete, so you can make a query arbitrarily slow without a single join.

edit: complexity lol


You could also just limit a query to a max of N seconds, and abort it if it runs longer.

I don't think a "marketplace of algorithms" necessarily means people need to be able to make arbitrary database queries either, but that seems beside the point.

Having their own DSL for sorting timelines can also solve that problem nicely and within whatever performance requirements they would end up with.
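The restricted-DSL idea above could look something like this sketch: users choose from a whitelist of sort keys the platform controls, so arbitrary (and arbitrarily slow) queries are impossible by construction. All key and field names are hypothetical.

```python
# Sketch of a restricted sorting "DSL": the platform exposes only a
# whitelist of sort keys, never arbitrary query logic. Key names and
# tweet fields are invented for illustration.

ALLOWED_SORTS = {
    "newest": lambda t: -t["timestamp"],
    "most_liked": lambda t: -t["likes"],
}

def sort_timeline(tweets, key_name):
    """Sort a timeline using one of the whitelisted keys."""
    if key_name not in ALLOWED_SORTS:
        raise ValueError(f"unknown sort key: {key_name!r}")
    return sorted(tweets, key=ALLOWED_SORTS[key_name])
```

Because every allowed key is an O(n log n) sort over fields the platform already indexes, performance bounds come for free, with no timeout machinery needed.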


That seems like a very unlikely scenario which can be handled with a timeout.


> are NP-complete

You mean Turing-complete?


I meant what I said, but I was wrong, thanks.


Hard to monetize when people can choose algorithms that don't prefer content that keeps them on the platform longer, or makes them angry or happy. Public companies are all about making money.


who would choose to see unengaging content? "My algorithm is set to show bland agreement with last week's news"


Subscriptions?


  if(tweet.author.affiliation == "democrat"){
    promoteTweet(tweet);
  } else {
    shadowBan(tweet.author);
  }


Yishan Wong, CEO of Reddit until 2012, tweeted a long thread about how Musk doesn't understand the current state of moderation that's necessary, because he was much more involved with an earlier Internet. But I am of the opinion that Yishan is also behind the times, as one half of the debaters has simply left facts and reality behind. This post here is an example.


Yes, this is what people legitimately think the recommendation system is like. It’s insane


Oh my. Twitter, or any other platform, is not your megaphone that will amplify your opinion without restrictions. Don't like that? Start your own platform, and see how that turns out.

Related: yes, I do support interoperability requirements between platforms. No, that still doesn't mean you get to blast your opinion all over the internet without hitting a roadbump every now and then.


> Twitter, or any other platform, is not your megaphone that will amplify your opinion without restrictions

It will be so satisfying for Musk to buy Twitter, open it up completely, and then be able to use this argument in reverse.


> Oh my. Twitter, or any other platform, is not your megaphone that will amplify your opinion without restrictions

Where did this sentiment originate from? I never heard of it before and all of a sudden in the last few years I hear so many people parroting it. Why is it that all these people were silent for so long and now they're yelling in unison about how bad Twitter as a megaphone is?


For YouTube at least I can give a clear example. I’m pretty sure they’ve cleared this up now but for a while you had no way to get consistent notifications of new content from creators you’d explicitly subscribed to. Twitter is similar in this way. If I follow someone, I want to see their tweets. Full stop.

I don’t want them arbitrarily hidden from my timeline by an algorithm. Twitter offers a chronological timeline but has repeatedly reset my user preference for it. If there weren’t third party applications that respected my preference I definitely would not be using it anymore.


Because fundamentally when people complain about being censored on places like twitter or TikTok, they're really complaining about not being broadcast loudly enough.


Because of Trump. He figured out how to use social media to his advantage.

Before that, other “movements” that had bridged the gap between real and online worlds were celebrated.

The Arab Spring, circa 2012, is a particularly good example.


When criticizing X, you cannot in good faith say — why don’t you build Y that’s like X? It just implicitly admits that criticizing X is off limits and you don’t like it.


Except that's not what's happening here. When doing anything on the internet "how are you going to control spam and obvious trolls" is question #1. Twitter has an answer to that: we mechanically and/or personally identify spam/troll content, and ban their creators.

Now, many people want them to stop doing that. Which they decline, since they KNOW what will be happening in that case.

So, if you think you can do a better platform, while disregarding the (minimal) lessons learned from Twitter (or Reddit, or...), go ahead! You will fail, not because "criticism of the original platform is off-limits", but because it's a well-known anti-pattern.


The discussion is about whether people can see what the algorithms are doing. Read-only access to the code, so they cannot manipulate people into their belief system or apply arbitrary shadow-ban rules.

You’re right about human judgement but that’s not the topic. The central point, I repeat emphatically, is about transparency, not governance.

Twitter can continue exactly the same way but just be transparent. Is the intense pushback because they’ve holed themselves into an untenable position? Not sure why people are so against transparency. Maybe they lied in congressional testimonies?


I'm pretty sure the "algorithm" is "we count the number of end-user flags/reports, and if > X we remove it"?

Public knowledge of what "X" is doesn't really help, I think, other than to aid spammers? And a requirement to "talk to a human" upon hitting X would surely immediately degrade into "Google has reviewed your appeal and has determined that the infinite block of your account remains in effect. There is no further appeal"?
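The rule this comment is guessing at is simple enough to state in a few lines. This is a toy version of that guess, not Twitter's actual logic; the threshold value is invented.

```python
# Toy version of the guessed moderation rule: content is removed once
# end-user reports exceed some threshold "X". The value 25 is invented.

REPORT_THRESHOLD = 25  # hypothetical "X"

def should_remove(report_count, threshold=REPORT_THRESHOLD):
    """Remove content when report count strictly exceeds the threshold."""
    return report_count > threshold
```

The comment's point is that publishing such a rule tells spammers exactly how many reports they can absorb, which is information honest users gain little from.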


There are a whole host of algorithms: recommendation, feed, suggested followers, interest-based suggestions, etc. I suspect the algorithms for how "trending" topics are picked are quite involved. It probably goes through filters, blacklists, whitelists, some AI voodoo, and gets increasingly promoted based on engagement in real time.

That secret sauce is ripe for manipulation and extremely powerful.
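A speculative sketch of the pipeline guessed at above: candidate topics pass a blocklist filter, then rank by engagement velocity (growth in mentions between two time windows). Every name and data shape here is an assumption, not Twitter's real system.

```python
# Speculative sketch of a trending pipeline: filter candidates against
# a blocklist, then rank by mention-count growth between two windows.
# All function names, arguments, and data shapes are invented.

def trending_topics(counts_now, counts_prev, blocklist, top_k=10):
    """Rank non-blocked topics by growth in mention counts."""
    candidates = [t for t in counts_now if t not in blocklist]
    velocity = lambda t: counts_now.get(t, 0) - counts_prev.get(t, 0)
    return sorted(candidates, key=velocity, reverse=True)[:top_k]
```

Even this toy version shows why the secret sauce is ripe for manipulation: anyone who knows the window sizes and the velocity metric knows exactly what burst of activity to manufacture.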

As for the spam-combating argument: isn't this how something gets stronger? HN has a strong view that open source software is more secure because it gets hardened through exposure, not through obfuscation.


> Twitter can continue exactly the same way but just be transparent.

No it can't. If the algorithm was transparent the only thing you'd see is spammers who have put tons of resources into figuring out the exactly optimal way to maximize engagement. Grassroots engagement would be impossible.


Some are arguing that the algorithm is so simple that there is nothing to disclose. That would mean spamming has reached a plateau and can't get any worse.

Also, Twitter's spam control has been objectively bad.

https://twitter.com/paulg/status/1487022342630957062?lang=en

People think that the entire platform has been hijacked by left-wing progressives and that the reason for the lack of transparency is more insidious than "spam" - for example, being liable for what they told Congress.


Why not make the ban evident in a log of changes to a user's feed and allow that user to personally unban who they like? For a wide-swath of use cases from the recent past, the posts that incited the bans were not criminal in nature.



