Hacker News
Twitter Should Open Up the Algorithm (every.to/divinations)
71 points by dshipper on April 16, 2022 | 136 comments


Recommender systems like Twitter are as much about data as they are about code. Without the dataset and the derived statistics and models that are used for ranking and recall, the code is not going to help much with transparency.


Exactly. “Here’s the code for our matrix multiplication, dot product, and sorting; you’re welcome to check it for bugs.”


> Recommender systems like Twitter are as much about data as they are about code. Without the dataset and the derived statistics and models that are used for ranking and recall, the code is not going to help much with transparency.

And it also goes the other way around: the data depends on the algorithms and tools that created it. You can't have an insightful understanding of data without knowing how and why it was created.


True - however, the thrust of the arguments for opening Twitter seems to be focused on opening up the code - so the “code without data” scenario is the one worth homing in on.


Right. It’s a bit worrisome that people (even people in this very thread) know nothing about how recommendation and ranking work but think they know everything. “Release the code” is nonsensical.


This is a ridiculous claim. While you won't be able to run the code yourself, it would still provide insight into the inputs in relative terms, and offer evidence of methods that are not being used.


> and provide insight as to some things that are not being done.

That's not an algorithm, that's product strategy & management.


It would provide insight, but if the goal is to increase transparency to the point of meaningfully improving trust in a centralized actor like Twitter, I don't see it moving that needle much. Given the right data you can coax a lot of these algorithms into arbitrary outputs, so if you're starting from a place of distrust and can't replicate the outputs yourself, the code alone won't give you evidence to alter your priors significantly, imo.


The point is to increase transparency, period. And you are the one arguing against that. Nobody is saying it’s enough, but it’s objectively better than nothing, and it would make it possible to ask more specific questions and make more specific arguments for the value of more transparency. Again, you are the one arguing against it but you have not provided a reason why.


I’m not arguing against it - I’m arguing that people who are focused on the idea of releasing the code are miscalibrated on the marginal impact of doing so given a lack of domain knowledge.

The net suggestion is to demand more, and to start describing solutions today for getting data visibility without compromising other things. That would help offset the inevitable objections that will come once people are forced to demand the data, having realized the code didn't really solve the fundamental problem.


Ironically, the best way to prove your own point is to open the source code.


The point I'm stuck on in this whole conversation is why you'd continue to use Twitter if you felt they were untrustworthy. Why are we discussing freedom of speech when you can just load another app and get all the freedom you want?


The reason people are complaining is that we have found ourselves in a situation where Twitter arguably already is, or will indisputably become, what amounts to a global public square.

The early philosophical developments around freedom of speech didn't foresee this, but now that it is here, we have to ask what we ought to want if we are to uphold the principle that people should be free to speak unpopular ideas without censorship or fear of excommunication, given that free speech is a key ingredient in progress, in revelation of the truth, and in finding compromise to avoid violence.

Twitter is of course a private company, and can do what it wants. But those who wish to see free speech principles upheld say we ought to want those in Twitter's position to be as lenient as possible, so as not to stifle the free exchange of ideas. And, perhaps failing that, we ought to want another mechanism that is less susceptible to widespread censorship and overreach. Too many people counter someone explaining that they feel the status quo is undesirable with an is/ought fallacy - it's our job in liberal democracies to continually raise and debate issues regarding basic freedoms and the ability of our society to continue to evolve according to shared principles.


I dunno, I deleted my twitter when they banned the sitting president for being too hyperbolic.

A social network complaining about hyperbole and outrageous claims being made on its platform is like complaining that people are breathing too much air.

Twitter is not a place grownups should be participating.


I think that's what everyone is talking about. They're just using the term algorithm colloquially.


I don’t think so - because releasing the data is a massive hill comparatively due to the immense privacy and legal issues.


Hmm Twitter already sells its "firehose" to the highest bidder.


I’m sure Twitter’s algorithms take advantage of all interaction data, like impressions, scrolls, and clicks. Agree, though, that it’s good a lot of the data is public.


Publishing user scrolls is a "massive hill" and a legal/privacy nightmare?


Sure seems like it could be - I wouldn’t opt in to allowing mine to be distributed; why would I? And I don’t think Twitter has the rights to do so. So you’d at least need to solve the anonymization problem, or you’d have to package a data release such that people can replicate the ranking algorithm without it.


User clicks in form of "likes" are already public so why would scroll data be so sacred?


Oh, you mean the matrix came up with a rule to ban people talking about the lab leak? But the matrix can't figure out simple crypto bots and needs humans to report and manually ban them?


That's moderation which is a completely separate topic from recommendation.


At least according to this, Twitter both agrees with you and is in the process of exploring transparency for both.

https://blog.openmined.org/announcing-our-partnership-with-t...


I think this is a great idea, and it shouldn't be limited to Twitter. YouTube's recommendation algorithms should be handled similarly; indeed, any public-facing system needs to be honest about the kind of spin it's putting on its recommendation algorithms. Even better, allow users to spin the dials when it comes down to their own particular searches/results.

> “One of the things I believe Twitter should do is open source the algorithm and make any changes to people’s tweets — you know, if they’re emphasized or de-emphasized — that action should be made apparent so anyone can see that action’s been taken. So there’s no sort of behind-the-scenes manipulation either algorithmically or manually.” Musk also added later, “The code should be on GitHub so people can look through it.” (CNBC interview)


> allow the users to spin the dials

If I could pick something to regulate about recommendation algorithms, probably a big part of it would be mandating consumer controls over the algorithm. Complete transparency might be hard when sometimes even the programmers of these systems don't have the ability to see why they recommend what they do, but there should definitely also be mandatory oversight by consumer protection agencies that have full ability to audit the code and report on it.


Sounds like a bad idea. Bad actors would immediately exploit it and figure out how to most effectively flood everyone with spam, same as if Google's exact algorithms were public. Not everything has to be open.


On occasion, people make the same argument about all open source software, that making it public makes it insecure.

Fortunately this has not turned out to be the case in reality. If there are flaws in the system they will probably be easier to find and fix with more eyes on it.


In open source development you're trying to prevent exploits from being possible as much as you can, and for anything new you have time to get multiple people thinking about it before a release goes live.

In content moderation vs spammers, you have spammers adapting to your tools in real time while posting content in the same way real users do, and the balance between "don't want to accidentally ban real users (who post stuff very similar to the bots)" vs "does it matter if someone gets past your defences a few times an hour" is massively different too.


The consumption of FOSS and the consumption of tweets are so different I don't really see the merit in comparing the two.


This is a terrible comparison. With OSS, people are looking for security flaws. With a Twitter/Google algorithm, people are trying to exploit the system when it works as intended. If there is a secret formula for a successful tweet, people will find it and repeat that behavior ad nauseam.

The problem with google search isn't that people are exploiting 0-days that the developers aren't aware of. The problem is that the majority of the internet is becoming nearly identical pages that do the bare minimum to get ranked by search engines. Making the algorithm public will only exacerbate that


If there is a "secret formula" (exploit), then that is a flaw in the system that should be addressed; it will be discovered by sufficiently motivated attackers, as it already is regularly. This is security through obscurity, and as you point out, it clearly is not working.

You assert that more eyes on the flaws would make the issue worse rather than better, but I disagree.


The ratio of risk and effort to payoff for gaming tweets vs. compromising operating systems is radically different.

Further, compromise of one box typically does not render all of them running that OS useless and harmful.


The “many eyes” adage is comfortingly common, but untrue. Were it true, Heartbleed and its ilk would never have occurred, given how widely used and distributed OpenSSL was.


It's not about security bugs in this case, anti-spam needs obscurity as a layer of defense.


Not if, as the article suggested, there were multiple algorithms to choose from.

Especially not if the top choices were fundamentally different from each other.

It may go so far as to make it significantly more difficult for bad actors to operate.


Is that what happened with Linux and Mastodon?


Yes it'd be a cat and mouse game and that's how you build an efficient and fair algorithm. Not too unlike how the market decides what the price should be for a certain instrument. Lots of buys, lots of sells and where they meet is called the price. All in the open for anyone to see.

Let's stop the communistic mentality of keeping things hidden. Please read about Soviet Russia, about Siberia, and about how Russians had no idea of the miseries inflicted on them for decades.


The most contentious moderation happens via subjective human intervention rather than algorithms. Fully machine-driven moderation is a black hole into which copious amounts of time and money from the R&D labs of the biggest social media companies have disappeared every year for a long time now.


But the article is not talking about content moderation as such. It's about surfacing the right content for the right people, and regaining the trust that the author believes Twitter has lost, without really justifying how.


> The most contentious moderation happens via subjective human intervention rather than an algorithms.

1. Source?

2. Thought experiment: if said subjective human intervention was recorded and codified into an algorithm it wouldn't be as contentious?


Source for probably the most contentious piece of moderation Twitter has ever done being manual was their own statement:

“After close review of recent Tweets from the @realDonaldTrump account and the context around them we have permanently suspended the account due to the risk of further incitement of violence,”


Or just give us more options for viewing a feed than "recommended" or "latest."

Heck, even create an algorithm store where people could create and purchase different ways to view their feed.

It doesn't seem that difficult to add these multiple sort and filter options, but maybe it's more complex than I imagine.


I feel like authors are thinking about the "algorithm" almost as if it's just a bunch of if-else statements in a tree.

A read-only audit is an overly naive way to think about understanding complex algorithms. In an audit they would find an ensemble of large DNN transformer models with thousands of layers and thousands of features, often with their own transformation trees. There are entire CS departments dedicated to researching tools to understand complex nonlinear models. You definitely can't do it with a read-only code audit. You can _barely_ do it with full access to the model, full access to the input data, and the ability to retrain and rerun the model on that data.


You are suggesting the people who read HN and might click over to the "open Twitter" codebase won't understand it?

Engineers don't exist in a vacuum, the same engineers who write the algos are here on HN. Show HN the Twitter source code and a lot of engineers will understand it.

What is more likely is that the engineers are currently bound by NDAs and can't say what influences the algo.


Comments saying that this is naive seem to be completely missing the point in their desire to cynically defend a miserable state of affairs: clients, and not Twitter's servers, should ultimately determine what users see. Like the web browser or an email client.

The fact that we compute on servers precisely what users see on their devices is a result of the architectural constraints of the time and, more importantly, the ad model of monetizing social networks.

If social networks were not monetized in this way, there would be far more power allocated to clients and APIs.

This isn't to say user devices can do everything: they can't. But they can easily be given significant power to filter, reorder, and request different content — and with more advanced engineering, allow users to parameterize feeds.

The reason we can't have this is that allowing this degree of user choice undermines the ads model.


> clients, and not Twitter's servers, should ultimately determine what users see. Like the web browser or an email client.

That would involve giving clients access to information that clients probably shouldn't have. eg: If a part of the weighting for recommendation is that people you follow who regularly DM other people you follow should be weighted higher, doing it client side would allow you to see other people's private DM information.


The author's suggestion of a marketplace for algorithms sounds really cool. Whether it's easily technically feasible is another question entirely but the idea of that level of customization would be really cool. Some of the functionality is already there with Lists (as far as allowing regular users to create feeds and others being allowed to subscribe to them).

I do worry about the political implications of it though. People choosing to subscribe to only their world view would create an impenetrable echo chamber. At least now, there is some crossover. If people chose algorithms that avoided the other side of a debate, it would make things worse. Furthermore, you could have outside influences tricking people into certain algorithms to secure a populace with their beliefs.


If the problem is that the "recommendations" are biased, wouldn't it be simpler to:

1) Remove the recommendation part. Go back to a simpler version of Twitter.
2) Still give the option for a recommendation system but, somehow, open the code, maybe some version of the data, and publish documentation (like papers or something) detailing the training process of the current running version.

than build a recommender system marketplace? Also, in what sense would this be different from allowing third-party Twitter clients plus opening a richer API?

The ultimate idea of using ML is to "automatically" build the recommender system the user likes most (measuring this with some particular metric, like online time or retention) and to automatically adapt it as their preferences change. The problem to me is more the metrics chosen to be optimized.

However, I believe that in the end, and in order to be profitable, user retention and time on the platform will still be pursued. It doesn't seem like an easy fix to me.

Regarding the "free speech" part, I'm not an expert, but I'd say (after having watched the TED interview) that countries' legislation will considerably constrain this.

I love the idea of a truly free platform, though.


Amazingly Naive.

The sheer volume of data and code needed to make the “algo” scale to what it is means there is no good way to “open it up”.

But let’s say, and why not, that it actually was opened.

I can see it now, everyone and their mother would be recommending changes to it. People would want it at the extremes, or tweaked to just not recommend their pet peeves.

And if they did not get their way, they would go off in a huff and threaten to join another service.

Years ago I ran a big enough service that had millions and millions of users every month. A member of the military came on and demanded we allow him to do something that was against TOS.

He complained that this was what he fought in Iraq for. So he could write anything he wanted anywhere.

America is certainly not free and it’s time we stop giving our content to services like Twitter and Facebook if we believe that.


> I can see it now, everyone and their mother would be recommending changes to it. People would want it at the extremes, or tweaked to just not recommend their pet peeves.

I think the solution to this is to let people switch between different recommendation algorithms: a stream of chronological tweets, most liked among recently posted, one powered by ML, etc. I don't see any reason the implementations behind this, or even Twitter more broadly, could not be open source. There are many other open source projects, like Signal, that manage this just fine. And for a while, even Reddit was open source. It's definitely doable.


Exactly. While ML interpretability may be tough, I don't really read it like that. More here are X number of algorithms and each one is aiming to do Y. Chronological, Engagement, Happiness, Current Trends, etc.


I think they're looking for a statement like:

if (tweet.from in bad_users): score *= 0.8

Do they have levers in place to manipulate results, or is it a clear, objective scoring system that determines what you see in your feed and search results? I believe they have something like this. To what extent it's being manipulated is a separate question.
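To make the distinction concrete, here is a toy sketch of a scoring pipeline with exactly that kind of lever. Every name, signal, and weight here is made up for illustration; nothing below is Twitter's actual code.

```python
# Toy feed-scoring sketch with a manual intervention "lever".
# All names, signals, and weights are hypothetical.
def score_tweet(tweet, viewer, deboosted_authors):
    score = 1.0
    score += 0.5 * tweet["likes"] ** 0.5          # engagement signal
    if tweet["author"] in viewer["follows"]:      # social-graph signal
        score += 2.0
    if tweet["author"] in deboosted_authors:      # the manual lever
        score *= 0.8
    return score

viewer = {"follows": {"alice", "bob"}}
feed = [{"author": "alice", "likes": 100},
        {"author": "bob", "likes": 100}]
ranked = sorted(feed, key=lambda t: score_tweet(t, viewer, {"bob"}),
                reverse=True)
print([t["author"] for t in ranked])  # ['alice', 'bob']
```

An audit of code like this could confirm that such a lever exists, but without the accompanying data it could not show how often, or against whom, it is pulled.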


That's why a lot of responders are calling the approach naive. This function is potentially the last layer in thousands of functions. Thinking that one can predictably change the behavior of a large nonlinear model via one variable is like thinking you can change the US political system by passing one good law in Reno, Nevada.

This approach only works if auditability is mandated to also be a property of the model, which typically reduces model complexity and accuracy too. We do this with credit scoring models semi-successfully.


The interesting question this poses, and which has come up in a few comment threads here, is:

Is "security through obscurity" necessary for automated content moderation?

Would an open algorithm be trivially gamed by spammers, since they could then test offline exactly how their posts will be ranked/promoted?

My gut says yes but I'm not an expert in this area. Curious if anyone has a theory or idea on if an open moderation algorithm could work.

SpamAssassin exists and is open, to moderate success. But is that just because its use is not widespread enough for spammers to bother testing their spam against it?

If every email account in the world were covered by SpamAssassin, what would spam look like, and how much would make it through?


The Twitter algo shouldn't be complicated. For the most part Twitter shows me posts and likes from people I follow, occasionally sprinkling in a topic I follow or a viral tweet.

I want to read stuff from people I follow in more or less chronological order. Sure, if something has 1k likes, was posted a few hours ago, and I hadn't seen it, show me that. There are simple formulas for time-based rank out there.

But I regularly see Twitter suppressing posts from people I follow. I won't see someone tweet for a few weeks, then I check their account and see they've been tweeting this whole time. It's wrong and annoying to me as a user.
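One commonly cited time-based formula is the "gravity" ranking popularized by Hacker News: engagement divided by a power of the post's age. The constants below (the 2-hour offset and 1.8 exponent) are the commonly quoted HN values, not anything Twitter is known to use:

```python
# HN-style "gravity" ranking: engagement decays with age.
# Constants are the commonly quoted HN values; real systems tune them.
def rank(likes, age_hours, gravity=1.8):
    return likes / (age_hours + 2) ** gravity

# A 1k-like tweet from a few hours ago still outranks a fresh tweet
# with modest engagement...
print(rank(1000, 3) > rank(50, 0.5))    # True
# ...but not once it's a week old.
print(rank(1000, 168) > rank(50, 0.5))  # False
```

The appeal of a formula like this is exactly what the comment asks for: it is simple enough to publish, audit, and reason about.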


I wish they'd just let me control what I see. Sure, throw an occasional ad in there to make money. It would be hard to blame them for the nastiness on Twitter if they didn't control it.


Make moderation guidelines and moderation audit trails public and open.


Not sure there is anything to gain here. Moderation is inherently subjective, especially at the scale of Twitter. The same people would just continue arguing about what should and should not be moderated and how.


Transparency would be helpful. At the very least it would cut off the speculation about what did or didn't happen behind the scenes. Right now it's a black box.


Transparency would mean that every action they take gets scrutinized to absurdity; there's no way they'd open that door of their own volition.


In all 14,000,605 futures people will argue about this.

I think it would be better if it was an informed argument.


Both filtration (moderation) and sorting (the algorithm) should be handled exactly as Adblock handles it: people can create any list they want, others can choose whether or not to subscribe to those lists, and anyone can at any time see what the feed would look like without them.


Yes, like the lobste.rs moderation log. It's cute because, like, nobody cares.


Lots of good points here about how opening up the algorithm itself won't illuminate much and is technically extremely difficult without potentially devaluing the company since you would also have to open the data system behind the algorithm.

However, open sourcing any and all manual interventions over the algorithm + the guidelines used for evaluation and/or labeling (if any is done), would help to build a little bit of trust.

Not that much though, but it would be a start.


Let’s start with Hacker News.

What would you say about opening up a list of every poster who’s been blocked, and exactly how and for what reason (or keywords) they’ve been blocked?

How about opening up what keywords trigger mail to go to spam vs inbox for email providers? It’s going to be very valuable for someone to know how spam filtering works if their delivery rate doubles!


What would be much more interesting would be Twitter allowing users to define and publish their own algorithm, like lists with superpowers.

Stumbled on this idea on https://twitter.com/nbashaw/status/1515054551371378688


So by the same logic McDonald's should publish its secret sauce recipe. There is business value in keeping it hidden. It'd have a chance of happening if Twitter was taken private by someone with the intention of experimentation and improving the world in the long run.


McDonald's is required in some places (Canada, for example) to publish the ingredients for its food. For example, Big Mac Sauce:

Soybean oil, sweet relish (cucumber, glucose-fructose, sugar, vinegar, salt, xanthan gum, calcium chloride, natural flavour), water, vinegar, egg yolk, onion powder, spices, salt, propylene glycol alginate, colour, sugar, garlic powder, hydrolyzed (corn, soy, wheat) proteins. CONTAINS: Soy, Wheat, Egg, Mustard.

-- https://www.mcdonalds.com/ca/downloads/IngredientslistCA_EN....

And there are strict rules about how that ingredients list must be constructed:

"Ingredients must be declared by their common name in descending order of their proportion by weight of a prepackaged product. The order must be the order or percentage of the ingredients before they are combined to form the prepackaged product. In other words, based on what was added to the mixing bowl"

and on and on for many thousands of words, https://inspection.canada.ca/food-label-requirements/labelli...

No, that's not the full recipe, but it is information a company might not want to disclose, but is required to disclose because it affects the health and safety of consumers.


"Open up the algorithm" is such a charmingly naive take on community moderation at scale that I can almost forgive the author for literally parroting Musk as the source of a lightweight take synthesized mostly from other people's thoughts in order to attract attention to his "Every" newsletter platform.


I think that's a way to call for transparency. For instance, if they say they don't shadow ban people, then it should be evident in the code. And it would force them to have a clear process for things like search without any "nudging" or social manipulation.


Do you consider Jack Dorsey naive on the subject?

"The choice of which algorithm to use (or not) should be open to everyone" - Jack Dorsey

https://twitter.com/jack/status/1507146276416098307


I sure do. "The algorithm" fits nicely into articles complaining about unfair treatment, but it's not a real thing. Any mildly complicated recommendation system is going to be 1) more than a single algorithm and 2) far beyond the reasoning skills of laymen.


> Do you consider Jack Dorsey naive on the subject?

On the contrary, Jack knows what he's talking about and wants this because it would allow him to abdicate responsibility for Twitter platform moderation.


Given that Jack Dorsey is also on record claiming that Bitcoin will lead to world peace, I think it's fair to say the answer to that question is yes.


What, in your opinion, makes the request naive? Too complex to open source, or too complex for people to analyze and criticize? I can see some people becoming Twitter algorithm experts, just as we have Linux kernel developers.


"Open up the algorithm" is a fantasy where you point a function at a tweet and it returns a boolean. In reality, content moderation at this scale is a complex problem that requires a sophisticated system of many parallel strategies (some of which are more automatable, some less) and many thousands of people to manage.


Is, in your opinion, the Twitter moderation algorithm more complicated than the Linux kernel or Bitcoin codebases? Those are audited and modified publicly all the time. The human part can also be audited if the rules are stated clearly and an explanation is always given when an action is taken.


How do you know? Where did you come up with evidence to confirm that statement? There has never been a platform as large as Twitter that opened up its algorithm.


As an example, the bits of Google's original algorithms that they did talk about (pagerank using reputation of in-links to propagate to a page) publicly began getting gamed into irrelevance fairly early on (link-for-link schemes being the start of the SEO industry). After that it's been a continuing struggle of a whole industry looking for weaknesses and Google adapting.

Another way of thinking about it, is that if someone could see what features were most important for the ranking on some site, then they could start to optimize for those, breaking the usefulness of that feature. One obvious example of this is "Please remember to like, comment, and subscribe" on YouTube.


For a long time, Reddit was open source. It can be done, plus additional transparency on top, like giving people the option to switch between different recommendation algorithms, making it clear why content was recommended, and so forth.


Twitter et al should/must let people opt-in to recommendation algorithms (opt-out being the default), and must have a "reset all recommendations" button present and visible at all times.

I don't care about the actual algorithm.


The best feed is the chronological, non-algorithmic feed.


An unmoderated chronological feed would simply be a running list of every tweet from everyone.

A screen full of this feed would just be tweets made at the moment the data for the screen refresh was fetched.

Is that what you or anyone wants?


Yes. This is exactly what I want.

It's not a global feed of every tweet in existence; it's a feed of the accounts that you proactively choose to follow.


How/where does discovery happen?


As a pull, not a push. Meaning, in the very beginning, from the chronological feed itself. Later, by following conversations from the accounts you follow to new interesting accounts.


I suppose you can ditch twitter and switch to reddit.


I don't understand what that has to do with my comment.


I still would prefer that to what is currently there.


Yes, of the people I choose to follow.


Truly do not understand this point of view. The strictly chronological presentation is too easy to flood with unwanted posts.


You could unfollow the people who spam you with unwanted posts. I find strictly chronological feeds to increase the control I have over what I see by not prioritizing the more popular posts of those I follow. It also makes it much easier to say “I’ve caught up to where I was last time, good time to stop scrolling”. (Parts of a website that center around discovering new content do need curation for the reason you mentioned, but content discovery is to me not the primary purpose of twitter style sites)

Chronological presentation is less useful for serving advertisements though, so I don’t expect it to show up very often.


I also do not understand the theory that the order of presentation is necessitated by advertising. Adverts are inserted into the 2nd position regardless of how the feed is ordered.


It's more like the social network can use the algorithm to surface clickbait or whatever drives their metrics.


You're discussing some hypothetical. Actual Twitter just sells the 2nd position in the feed for targeted adverts. That's it. Nothing fancy.


Incorrect. Twitter has been pushing its ghastly “Home” on those who prefer “Latest Tweets” for a while.


Simple solution, unfollow the spammers.


I choose people to follow. If somebody is flooding my feed more than I like, then I unfollow them. Truly do not understand why anyone would want it to work differently other than marketers.


But it's not spam. If fifty people you follow retweet the same popular post, you do not want to see it fifty-one times. You want to see it once, and you want it ranked highly because everyone you follow thinks it's an important post. It's a form of crowdsourcing from a curated group of contributors. The chronological view cannot accommodate that.
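For what it's worth, the collapsing part of this is simple enough to sketch even client-side. The feed-item shape below is hypothetical, not Twitter's API:

```python
# Sketch: collapse duplicate retweets in a chronological feed while
# keeping the "N people you follow shared this" signal.
def collapse_retweets(feed):
    seen = {}  # original tweet id -> entry (dicts keep insertion order)
    for item in feed:
        tid = item["original_id"]
        if tid in seen:
            seen[tid]["shared_by"].append(item["by"])
        else:
            seen[tid] = {"tweet": tid, "shared_by": [item["by"]]}
    return list(seen.values())

feed = [{"original_id": "t1", "by": "alice"},
        {"original_id": "t1", "by": "bob"},
        {"original_id": "t2", "by": "carol"},
        {"original_id": "t1", "by": "dave"}]
for entry in collapse_retweets(feed):
    print(entry["tweet"], len(entry["shared_by"]))
# t1 appears once with 3 sharers; t2 once with 1
```

Whether the collapsed post should then be re-ranked by sharer count (as argued above) or left in chronological position is exactly the disagreement in this thread.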


That’s more of a problem with Twitter’s design. I don’t want to see any “retweets”. If someone I care about replies to something then (optionally!) show me their reply. If I find that compelling I can follow the primary source. I do not care about regurgitated content.


Again, I’ll control that by curating my own follow set. I don’t need some algorithm to crowdsource for me. That’s the last thing I want.


That only works for daily addicts. Otherwise, the low-volume accounts get lost in all the fast-posting people.


Even if you “open” the algorithm you won’t have proof that the system is in fact adhering to said algorithm or even including all data in its set.


Do people really browse those recommendations? Twitter is for following a couple of people, switching to timeline mode, and done.


Elon will definitely access private conversations if he takes Twitter private.


In my experience as someone who resides in India, I have seen Twitter repeatedly silencing voices of the opposition and amplifying the voices of those connected to and aligned with majoritarian, right-wing elements - and I don't think it is because of any algorithm. There are clearly people involved, deciding case by case to shut out voices of the opposition.

We have seen this with Facebook as well with Facebook charging 3 times higher for ads from the opposition parties in a bid to influence the elections: https://www.aljazeera.com/economy/2022/3/16/facebook-charged...


Amazingly naive to develop an operating system in the open. Amazingly naive to try and build an electric car company. Amazingly naive to think the internet will be worth anything. Amazingly naive to think that everyone will need a computer in their pocket.


A marketplace of algorithms and "pick your own algorithm" would be interesting. Is there any reason not to do this?

Not sure why this is getting downvoted. It's not crazy. Even Jack is pushing for it: "The choice of which algorithm to use (or not) should be open to everyone"

https://twitter.com/jack/status/1507146276416098307


Several reasons.

1. Absolutely incredibly complex architecture. The recommendation engines are not just single libraries that can be easily open sourced or transformed to a plugin system.

2. Recommendation engines exist to push up relevant metrics. These are either clear wins for users (like reported satisfaction), mixed wins for users and businesses (content engagement), or clear wins for businesses (ad engagement). Most businesses aren't thrilled about subbing in systems that degrade metrics.


I wonder about the feasibility of this.

How does an algorithm "that attempts to prioritize nuanced conversations about important topics" work? (Or "to find mind-expanding threads", for "savage dunks" or "thirst traps of hot new snax"? -- other examples from the post.)

I suppose you spend some time with existing tweets and ML and develop a model that can produce a score of some sort on these concepts for a tweet, and then run every tweet through the model and present the high-scoring ones. Of course, you can't just look at an individual tweet, which in isolation doesn't mean much; you also have to look at how it fits into a conversation. (For that matter, I'd be interested to see what a model of a conversation is.)
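A minimal sketch of the score-and-rank loop described above, assuming a hypothetical "nuance" model. The model here is a crude keyword stand-in, and the field names are invented; a real system would use a trained classifier.

```python
# Hypothetical sketch: score every candidate tweet with a model and
# present the high scorers. KeywordNuanceModel is an invented stand-in
# for a real trained model; tweet dicts and field names are assumptions.

class KeywordNuanceModel:
    """Stand-in for a real ML model: counts hedging phrases as a crude proxy."""
    HEDGES = ("however", "although", "on the other hand", "it depends")

    def score(self, text):
        lowered = text.lower()
        return sum(lowered.count(h) for h in self.HEDGES)

def rank_by_score(tweets, model, top_k=50):
    """Return the top_k tweets ordered by the model's score, highest first."""
    ranked = sorted(tweets, key=lambda t: model.score(t["text"]), reverse=True)
    return ranked[:top_k]
```

The expensive part the comment worries about is exactly the `model.score` call run over every tweet, which is why accuracy-versus-cost is the real question.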

Sounds expensive and quite possibly not accurate enough to be worthwhile.

It also seems like a major strategic decision to give third parties every tweet by everyone. It seems like the business changes from being what twitter is now to a message routing backend, and these third parties become what twitter used to be. That's a fundamental shift, that probably devalues the company by an order of magnitude, since they would be dissolving the valuable thing they have -- the social network they have.

Just doesn't make any sense to me.


Any reason to keep the general public from running arbitrary database queries? Yes, I think there is. For example, regular expressions are Turing-complete, so you can make a query arbitrarily slow without a single join.

edit: complexity lol


You could also just limit a query to a max of N seconds, and abort it if it runs longer.

I don't think a "marketplace of algorithms" necessarily means people need to be able to make arbitrary database queries either, but that seems beside the point.

Having their own DSL for sorting timelines can also solve that problem nicely and within whatever performance requirements they would end up with.
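The restricted-DSL idea above could look something like this sketch: users choose from a whitelist of sort keys the platform controls, so arbitrary (and arbitrarily slow) queries are impossible by construction. All key and field names are hypothetical.

```python
# Sketch of a restricted sorting "DSL": the platform exposes only a
# whitelist of sort keys, never arbitrary query logic. Key names and
# tweet fields are invented for illustration.

ALLOWED_SORTS = {
    "newest": lambda t: -t["timestamp"],
    "most_liked": lambda t: -t["likes"],
}

def sort_timeline(tweets, key_name):
    """Sort a timeline using one of the whitelisted keys."""
    if key_name not in ALLOWED_SORTS:
        raise ValueError(f"unknown sort key: {key_name!r}")
    return sorted(tweets, key=ALLOWED_SORTS[key_name])
```

Because every allowed key is an O(n log n) sort over fields the platform already indexes, performance bounds come for free, with no timeout machinery needed.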


That seems like a very unlikely scenario which can be handled with a timeout.


> are NP-complete

You mean Turing-complete?


I meant what I said, but I was wrong, thanks.


Hard to monetize when people can choose algorithms that don't prefer content that keeps them on the platform longer, or makes them angry or happy. Public companies are all about making money.


who would choose to see unengaging content? "My algorithm is set to show bland agreement with last week's news"


Subscriptions?


  if(tweet.author.affiliation == "democrat"){
    promoteTweet(tweet);
  } else {
    shadowBan(tweet.author);
  }


Yishan Wong, CEO of Reddit until 2012, tweeted a long thread about how Musk doesn't understand the current state of moderation that's necessary, because he was much more involved with an earlier Internet. But I am of the opinion that Yishan is also behind the times, as one half of the debaters has simply left facts and reality behind. This post here is an example.


Yes, this is what people legitimately think the recommendation system is like. It’s insane


Oh my. Twitter, or any other platform, is not your megaphone that will amplify your opinion without restrictions. Don't like that? Start your own platform, and see how that turns out.

Related: yes, I do support interoperability requirements between platforms. No, that still doesn't mean you get to blast your opinion all over the internet without hitting a roadbump every now and then.


> Twitter, or any other platform, is not your megaphone that will amplify your opinion without restrictions

It will be so satisfying for Musk to buy Twitter, open it up completely, and then be able to use this argument in reverse.


> Oh my. Twitter, or any other platform, is not your megaphone that will amplify your opinion without restrictions

Where did this sentiment originate from? I never heard of it before and all of a sudden in the last few years I hear so many people parroting it. Why is it that all these people were silent for so long and now they're yelling in unison about how bad Twitter as a megaphone is?


For YouTube at least I can give a clear example. I’m pretty sure they’ve cleared this up now but for a while you had no way to get consistent notifications of new content from creators you’d explicitly subscribed to. Twitter is similar in this way. If I follow someone, I want to see their tweets. Full stop.

I don’t want them arbitrarily hidden from my timeline by an algorithm. Twitter offers a chronological timeline but has repeatedly reset my user preference for it. If there weren’t third party applications that respected my preference I definitely would not be using it anymore.


Because fundamentally when people complain about being censored on places like twitter or TikTok, they're really complaining about not being broadcast loudly enough.


Because of Trump. He figured out how to use social media to his advantage.

Before that, other “movements” that had bridged the gap between real and online worlds were celebrated.

The Arab Spring, circa 2012, is a particularly good example.


When criticizing X, you cannot in good faith say — why don’t you build Y that’s like X? It just implicitly admits that criticizing X is off limits and you don’t like it.


Except that's not what's happening here. When doing anything on the internet "how are you going to control spam and obvious trolls" is question #1. Twitter has an answer to that: we mechanically and/or personally identify spam/troll content, and ban their creators.

Now, many people want them to stop doing that. Which they decline, since they KNOW what will be happening in that case.

So, if you think you can do a better platform, while disregarding the (minimal) lessons learned from Twitter (or Reddit, or...), go ahead! You will fail, not because "criticism of the original platform is off-limits", but because it's a well-known anti-pattern.


The discussion is about whether people can see what the algorithms are doing. Read-only access to the code, so they cannot manipulate people into their belief system or apply arbitrary shadow-ban rules.

You’re right about human judgement but that’s not the topic. The central point, I repeat emphatically, is about transparency, not governance.

Twitter can continue exactly the same way but just be transparent. Is the intense pushback because they’ve holed themselves into an untenable position? Not sure why people are so against transparency. Maybe they lied in congressional testimonies?


I'm pretty sure the "algorithm" is "we count the number of end-user flags/reports, and if > X we remove it"?

Public knowledge of what "X" is doesn't really help, I think, other than to aid spammers? And a requirement to "talk to a human" upon hitting X would surely immediately degrade into "Google has reviewed your appeal and has determined that the infinite block of your account remains in effect. There is no further appeal"?
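The rule this comment is guessing at is simple enough to state in a few lines. This is a toy version of that guess, not Twitter's actual logic; the threshold value is invented.

```python
# Toy version of the guessed moderation rule: content is removed once
# end-user reports exceed some threshold "X". The value 25 is invented.

REPORT_THRESHOLD = 25  # hypothetical "X"

def should_remove(report_count, threshold=REPORT_THRESHOLD):
    """Remove content when report count strictly exceeds the threshold."""
    return report_count > threshold
```

The comment's point is that publishing such a rule tells spammers exactly how many reports they can absorb, which is information honest users gain little from.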


There are a whole host of algorithms: recommendation, feed, suggested followers, interest-based suggestions, etc. I suspect the algorithms for how "trending" topics are picked are quite involved. It probably goes through filters, blacklists, whitelists, some AI voodoo, and gets increasingly promoted based on engagement in real time.

That secret sauce is ripe for manipulation and extremely powerful.
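A speculative sketch of the pipeline guessed at above: candidate topics pass a blocklist filter, then rank by engagement velocity (growth in mentions between two time windows). Every name and data shape here is an assumption, not Twitter's real system.

```python
# Speculative sketch of a trending pipeline: filter candidates against
# a blocklist, then rank by mention-count growth between two windows.
# All function names, arguments, and data shapes are invented.

def trending_topics(counts_now, counts_prev, blocklist, top_k=10):
    """Rank non-blocked topics by growth in mention counts."""
    candidates = [t for t in counts_now if t not in blocklist]
    velocity = lambda t: counts_now.get(t, 0) - counts_prev.get(t, 0)
    return sorted(candidates, key=velocity, reverse=True)[:top_k]
```

Even this toy version shows why the secret sauce is ripe for manipulation: anyone who knows the window sizes and the velocity metric knows exactly what burst of activity to manufacture.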

As for the spam-combating argument: isn't this how something gets stronger? HN has a strong view that open source software is more secure because it gets hardened through exposure, not through obfuscation.


> Twitter can continue exactly the same way but just be transparent.

No it can't. If the algorithm was transparent the only thing you'd see is spammers who have put tons of resources into figuring out the exactly optimal way to maximize engagement. Grassroots engagement would be impossible.


Some are arguing that the algorithm is so simple that there is nothing to disclose. That would mean spamming has reached a plateau and can't get any worse.

Also, Twitter's spam control has been objectively bad.

https://twitter.com/paulg/status/1487022342630957062?lang=en

People think that the entire platform has been hijacked by left-wing progressives and that the reason for the lack of transparency is more insidious than "spam" - for example, being liable for what they told Congress.


Why not make the ban evident in a log of changes to a user's feed and allow that user to personally unban who they like? For a wide-swath of use cases from the recent past, the posts that incited the bans were not criminal in nature.



