Comments

ZestyData t1_jegdmzo wrote

Putting aside the political undertones behind many people's desire to publish "the algorithm", this is a phenomenal piece of educational content for ML professionals.

Here we have a world-class complex recommendation & ranking system laid bare for all to read into and develop upon. This is a veritable gold mine of an educational resource.

631

Educational-Net303 t1_jeggs0s wrote

Yeah, like Elon or not, the push for open source is always going to be beneficial to the community. Ironic how twitter is more open than ____AI.

308

Erosis t1_jegj2l9 wrote

Twitter is already established as a brand to near saturation and Elon has more money than god. It's the perfect combo for ML philanthropy. Now waiting for that Tesla vision algorithm...

93

FinancialElephant t1_jeh33j9 wrote

Most infrastructure code, like computer vision code, device drivers, etc., is either not culturally relevant or has little cultural relevance.

I don't think it makes any sense to prioritize them when things like twitter have much more direct cultural impact. It would be great if my network card driver was open source, but does it really matter? Is it worth prioritizing? Will it likely have any cultural relevance? To most people the answer to all these questions is no.

−6

pier4r t1_jegm5a1 wrote

> world-class complex recommendation & ranking system

https://twitter.com/amasad/status/1641879976529248256?s=20

I mean surely it is great but my recommendations weren't exactly stellar in those years.

38

Ulfgardleo t1_jegoe8z wrote

This part is not used for recommendations though. This is for analytics and internal testing and ensuring that different groups (+elon) don't get disadvantaged.

30

ZestyData t1_jeh12gm wrote

Idk man, as a fairly well-seasoned MLE I find their general architecture and the scale of their combined models to be fascinating in and of itself.

Twitter sucks ass - but this is a beautiful piece of ML Engineering.

8

grumpyp2 t1_jegkfri wrote

Where to even start? It’s such a huge project 😳

24

LetMeGuessYourAlts t1_jegmjkk wrote

Readme.md

Sorry, had to 🤓

68

Internationalizard t1_jegnp8m wrote

I checked the commit history but it has only one commit. So this is a pretty straightforward place to start: https://github.com/twitter/the-algorithm/commit/7f90d0ca342b928b479b512ec51ac2c3821f5922

22

lordofbitterdrinks t1_jegql5s wrote

So how do we know this is the repo used by Twitter and not some stripped-down version of it?

14

ZestyData t1_jeh198p wrote

This quite obviously isn't the repo used by twitter.

It is a pretty large and well put together documentation epic & consolidation of multiple microservices.

Whether the content is 100% reflective of what's deployed is completely unclear. But it's not "fake", that's for sure; it's genuinely too many man-years of work to not be in essence real.

52

f10101 t1_jegs5yt wrote

It will take time, but I'd imagine it should be possible to derive a method of determining this by observation.

Algorithms like this will have fingerprints.

14

MjrK t1_jegtjqj wrote

We don't and likely we won't know.

Unless perhaps someone internal checks and leaks important missing details later on...

But for now, it does seem robust enough to be reflective of what they have probably been using up to some recent point, though that's still just speculation.

10

LoaderD t1_jegsuar wrote

> Here we have a world-class complex recommendation

...You know this is twitter's recommender system right? All the tweets I interact with are ML related from very 'left' people like Jeremy Howard.

My recommender system could legit be:

if interested_in_finance_or_ML:
    recommend_alt_right_hate_speech_accounts()
    recommend_crypto_scam_ads()

24

Educational-Net303 t1_jegta0z wrote

Get rid of the if statement and you just recreated Twitter's recommendation algorithm

28

Roger_Cockfoster t1_jegy0u7 wrote

In fairness, it doesn't really matter what you interact with. Twitter is just a sewer of alt-right hate speech for everyone.

−9

Necessary-Meringue-1 t1_jegshy4 wrote

It's a pretty cool resource to get to look at an enterprise recommendation algorithm like that.

As an aside, if you want a chuckle, search the term "Elon" in the repo:
https://github.com/twitter/the-algorithm/search?q=elon
https://github.com/twitter/the-algorithm/search?q=elon&type=issues

[edit 1] Since it's gone now, here's the backup provided by u/MjrK: https://i.imgur.com/jxqaByA.png

[edit 2] lol
https://github.com/twitter/the-algorithm/commit/ec83d01dcaebf369444d75ed04b3625a0a645eb9#diff-a58270fa1b8b745cd0bd311bed9cd24c983de80f96e7bd445e16e88b61e492b8L225

100

t98907 t1_jegy79j wrote

However, it does not seem to affect the recommendation algorithm.

−25

Necessary-Meringue-1 t1_jegz6md wrote

I think we can safely go with Occam's Razor here. I would assume the "influential celebrity" is the "power_user" type, see: https://i.imgur.com/s6ntUil.png

Either way, I'm not surprised they are giving tweets from Musk their own type. Why wouldn't they? It probably became necessary to deal with his antics.

30

midasp t1_jeh2awl wrote

It's kinda nice to see PageRank is still being used as one of the components of the algorithm

48

midnitte t1_jeh1mia wrote

Seems to be deleted now, which wouldn't be surprising...

19

codingwoman_ t1_jeh2iw5 wrote

Well, the devil is in the details, don't miss the fun part in the commit messages :)

> Please note we have force-pushed a new initial commit in order to remove some publicly-available Twitter user information. Note that this process may be required in the future.

40

codingwoman_ t1_jeh31os wrote

I'm still able to access this link though, even in a private browser window.

6

junkboxraider t1_jegcvli wrote

Wonder whether they included the Elon+1000 and Can'tBlockHim mods in this version?

41

CommunismDoesntWork t1_jegsr5b wrote

As far as I know, there was never any evidence to back up those claims

13

londons_explorer t1_jegw8tx wrote

The claims are plausible accidents from a technical perspective. It's very possible for a system which does blocklists to choke up on the longest Blocklist it has ever seen and fail to add new things to the list.
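
To illustrate the kind of accident described above, here is a minimal sketch, assuming a hypothetical blocklist with a hard size cap that drops writes once it is full. The class, its names, and the cap are invented for illustration; nothing here is taken from Twitter's code, and it is not a claim about how their system actually works.

```python
class CappedBlocklist:
    """Hypothetical store of the accounts that have blocked one user."""

    def __init__(self, max_entries):
        # Assumed hard limit, e.g. sized for the longest list seen so far.
        self.max_entries = max_entries
        self._blockers = set()

    def add(self, blocker_id):
        """Record a block. Returns False when the write is silently dropped."""
        if blocker_id in self._blockers:
            return True
        if len(self._blockers) >= self.max_entries:
            # Failure mode: no error is raised, the new block just never lands.
            return False
        self._blockers.add(blocker_id)
        return True

    def contains(self, blocker_id):
        return blocker_id in self._blockers

    def __len__(self):
        return len(self._blockers)


blocklist = CappedBlocklist(max_entries=3)
results = [blocklist.add(f"user{i}") for i in range(5)]
print(results, len(blocklist))
```

In this sketch the first three blocks succeed and every later one is dropped without an error, so from the outside it looks like the account "can't be blocked" even though nobody special-cased it.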

7

mikiex t1_jegkzo3 wrote

If it's anything like their algorithm that shows me the tweets from a trending topic, I wouldn't want it.

13

Long_Educational t1_jegm6fp wrote

There is too much money at stake for there not to be additional invisible weights that are able to be tweaked by Twitter behind the scenes.

For example, I would imagine a 2-billion-dollar stake by the Saudis would purchase huge influence. This goes for anyone else that Elon "hangs" with during the Olympics, the Super Bowl, or the FIFA World Cup.

6

mcilrain t1_jegtja1 wrote

Those are probably part of the advertisement system.

21

midnitte t1_jeh1w7t wrote

I wonder if this is an effort to save face after the source code leak

3

[deleted] t1_jegn6js wrote

[deleted]

−13

ElectronicHorse3219 t1_jegs3wq wrote

We get it, space man bad, but it’s a for-profit company. Nobody was expecting 100% of the code. How much did you pay for the self-driving bridge?

36

master3243 t1_jeh48sn wrote

I don't take any CEO's words at face value without considering the monetary values and incentives behind that tongue.

A large project like this being open-sourced, even if it's a very old or heavily stripped down version, is always a great thing for the community.

21