Submitted by hcarlens t3_11kzkla in MachineLearning

I run mlcontests.com, a website that aggregates ML competitions across Kaggle and other platforms.

I've just finished a detailed analysis of 200+ competitions in 2022, and what winners did (we found winning solutions for 67 competitions).

Some highlights:

  • Kaggle still dominant with the most prize money, most competitions, and most entries per competition...
  • ... but there are 10+ other platforms with interesting competitions and decent prize money, and dozens of single-competition sites
  • Almost all competition winners used Python; one used C++, one used R, and one used Java
  • 96% (!) of Deep Learning solutions used PyTorch (up from 77% last year)
  • All winning NLP solutions we found used Transformers
  • Most computer vision solutions used CNNs, though some used Transformer-based models
  • Tabular data competitions were mostly won by GBDTs (gradient-boosted decision trees; mostly LightGBM), though ensembles with PyTorch are common
  • Some winners spent hundreds of dollars on cloud compute for a single training run, others managed to win just using Colab's free tier
  • Winners have largely converged on a common toolkit: the PyData stack for the basics, PyTorch for deep learning, LightGBM/XGBoost/CatBoost for GBDTs, and Optuna for hyperparameter optimisation (a minimal sketch of this stack follows the list below)
  • Half of competition winners are first-time winners; a third have won multiple comps before; half are solo winners. Some serial winners won 2-3 competitions just in 2022!
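
To make that common toolkit concrete, here's a minimal sketch of a typical tabular workflow: LightGBM tuned with Optuna, evaluated on a held-out split. All of the data, parameter ranges, and trial counts below are placeholders rather than settings from any specific winning solution.

```python
import lightgbm as lgb
import optuna
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Placeholder data standing in for a competition's training set.
X, y = make_classification(n_samples=5000, n_features=30, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2, random_state=0)

def objective(trial):
    # Search space is illustrative only; winners tune far more carefully.
    params = {
        "objective": "binary",
        "metric": "auc",
        "verbosity": -1,
        "num_leaves": trial.suggest_int("num_leaves", 16, 256),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "feature_fraction": trial.suggest_float("feature_fraction", 0.5, 1.0),
    }
    train_set = lgb.Dataset(X_train, label=y_train)
    valid_set = lgb.Dataset(X_valid, label=y_valid, reference=train_set)
    model = lgb.train(
        params,
        train_set,
        num_boost_round=500,
        valid_sets=[valid_set],
        callbacks=[lgb.early_stopping(50, verbose=False)],
    )
    preds = model.predict(X_valid, num_iteration=model.best_iteration)
    return roc_auc_score(y_valid, preds)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params)
```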

Way more details as well as methodology here in the full report: https://mlcontests.com/state-of-competitive-machine-learning-2022?ref=mlc_reddit

[Chart: most common Python packages used by winners]

When I published something similar here last year, I got a lot of questions about tabular data, so I did a deep dive into that this year. People also asked about leaderboard shakeups and compute cost trends, so those are included too. I'd love to hear your suggestions for next year.

I managed to spend way more time on this analysis than last year thanks to the report sponsors (G-Research, a top quant firm, and Genesis Cloud, a renewable-energy cloud compute firm) - if you want to support this research, please check them out. I won't spam you with links here, there's more detail on them at the bottom of the report.

491

Comments


backhanderer t1_jb9ph2v wrote

Thanks for this. I knew PyTorch was dominant but didn’t realise it was this dominant for deep learning!

75

hcarlens OP t1_jb9q3cm wrote

Yeah, not just competitive ML but the research community as a whole seems to have almost entirely switched to PyTorch now (based on the Papers With Code data). I was expecting to see some people using JAX though!

37

TubasAreFun t1_jbapwmb wrote

JAX does offer some general matrix math that can be more useful/faster than torch alone. I often do deep learning with torch and then use JAX on top to train statistical models (e.g. fusing features from multiple models, raw features, etc. into a single regression/inference)
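
(Not the commenter's actual code — just a rough sketch of what "JAX on top of torch features" can look like: a small ridge regression over concatenated feature blocks, fit with jax.grad. All arrays below are random stand-ins for embeddings exported from torch models plus raw features.)

```python
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
ka, kb, kc, ky = jax.random.split(key, 4)

# Stand-ins for features fused from multiple sources.
feats_a = jax.random.normal(ka, (1000, 64))  # e.g. embeddings from torch model A
feats_b = jax.random.normal(kb, (1000, 32))  # e.g. embeddings from torch model B
raw     = jax.random.normal(kc, (1000, 8))   # raw tabular features
X = jnp.concatenate([feats_a, feats_b, raw], axis=1)
y = jax.random.normal(ky, (1000,))           # stand-in regression target

def loss(w, X, y):
    # Ridge-regularised linear regression on the fused features.
    pred = X @ w
    return jnp.mean((pred - y) ** 2) + 1e-3 * jnp.sum(w ** 2)

grad_fn = jax.jit(jax.grad(loss))
w = jnp.zeros(X.shape[1])
for _ in range(500):
    w -= 0.1 * grad_fn(w, X, y)  # plain gradient descent on the regression head
print(loss(w, X, y))
```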

16

saintshing t1_jbdbgwy wrote

Is PyTorch also better than TF for use cases where I have to do training/inference on mobile?

2

WirrryWoo t1_jb9v31t wrote

Thank you for the analysis! Any insights on forecasting and time series related problems? Do most solutions use GRUs? Thanks!

16

hcarlens OP t1_jb9woj6 wrote

I found that for a lot of time-series problems, people often treated them as if they were standard tabular/supervised learning problems. There's a separate page of the report which goes into these in detail: https://mlcontests.com/tabular-data?ref=mlc_reddit

For example, for the Kaggle Amex default prediction competition, the data is time-series in the sense that you're given a sequence of customer statements, and then have to predict the probability of them defaulting within a set time period after that. The winner's solution mostly seemed to flatten the features and use LightGBM, but they did use a GRU for part of their final ensemble: https://www.kaggle.com/competitions/amex-default-prediction/discussion/348111
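
As an illustration of that "flatten, then GBDT" pattern (not the winner's actual code; column names are made up), the aggregation step might look something like this:

```python
import pandas as pd

# Hypothetical statements table: one row per (customer, statement), as in the
# Amex-style setup described above.
statements = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "balance":     [120.0, 150.0, 90.0, 40.0, 55.0],
    "spend":       [30.0, 45.0, 20.0, 10.0, 12.0],
})

# Flatten each customer's sequence into fixed-length aggregate features so a
# GBDT like LightGBM can treat the problem as ordinary tabular data.
flat = statements.groupby("customer_id").agg(
    balance_mean=("balance", "mean"),
    balance_last=("balance", "last"),
    spend_mean=("spend", "mean"),
    spend_std=("spend", "std"),
)
print(flat)
```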

The M6 forecasting competition finished recently, I'm looking forward to seeing what their winners did: https://m6competition.com/

13

jamesmundy t1_jb9rdku wrote

Really detailed analysis! For the Real Robot Challenge you mentioned (very cool!), were people able to test on the robot before the competition/during training?

12

hcarlens OP t1_jb9vu1f wrote

Yeah that one is really cool! They had an initial competition stage, open to everyone, where evaluation was done in a simulation environment (in software) as opposed to real robots. Competitors were given data from dozens of hours of actual robot interaction which they could use to train their policies.

The teams that qualified there made it through to the real robot stage. At that point they could submit their policies for weekly evaluation on actual robots - so they could have a few practice runs on the actual robots before the final leaderboard run.

8

Lucas_Matheus t1_jbatxn8 wrote

Amazing. This seems like a great way to learn how things are currently being done in ML

10

Svaruz t1_jbadan9 wrote

That’s a lot of work 🫡. Thanks

6

senacchrib t1_jbbsgju wrote

This is amazing, thank you! Where do L2R problems fall in your classification? Tabular?

5

hcarlens OP t1_jbdv11j wrote

Thanks! I didn't create a separate category for learning-to-rank problems because they often overlap with other domains.

For example, some of the conservation competitions (https://mlcontests.com/state-of-competitive-machine-learning-2022/#conservation-competitions) are L2R problems on image data.

Or the Amazon/AIcrowd competitions (https://mlcontests.com/state-of-competitive-machine-learning-2022/#nlp--search) which were L2R with NLP.

In reality the mapping of competition:(competition type) is almost always one:many, and I'm planning on updating the ML Contests website to reflect that!

If I'd had more time and better data I would have sliced the data in multiple different ways to also look into e.g. L2R problems specifically in more depth.

2

senacchrib t1_jbfczfo wrote

What you accomplished is wonderful enough. I agree wholeheartedly with your 1:n mapping

2

cesarebo t1_jba5k13 wrote

Good job u/hcarlens and MLcontests team! Thank you for the insights!

4

Alchera_QQ t1_jbbfukn wrote

Can somebody elaborate on the discrepancy between PyTorch and TF?

I keep hearing that Torch is preferred for research and academic purposes, but TF seems to be very close in terms of accuracy and performance.

What makes Torch over an order of magnitude more popular here?

4

NitroXSC t1_jbdp6ge wrote

Interesting meta-study with many remarkable trends.

This is seen from the competitor's side, but what is currently the best website to set up a simple prediction competition? I'm asking since I'm planning to create a small competition for students in one of the courses I teach (no large files needed).

3

hcarlens OP t1_jbdvb7c wrote

Thanks! It depends on exactly what you're planning, but CodaLab (https://codalab.lisn.upsaclay.fr/competitions/) or their new platform CodaBench (https://www.codabench.org/) will probably work well.

They both allow you to run competitions for free, and people do use them for teaching purposes (you'll see some examples if you browse through the list of competitions).

I'm planning on writing a shorter blog post on running competitions soon.

4

scaldingpotato t1_jbflxan wrote

For noobs like me: GBDT = gradient-boosted decision tree

3

ilovekungfuu t1_jbi2xao wrote

Thank you very much u/hcarlens!
This distills so much information.

Do you think that we might be at a plateau in terms of new methods being tried out (say, over the last 24 months)?

3

hcarlens OP t1_jbnreef wrote

Hi! I'm not sure I fully understand your question, but if you're asking whether the rate of progress in competitive ML is slowing down, I think probably not. A lot of the key areas of debate (GBDTs vs NNs on tabular data, CNNs vs Transformers in vision) are still seeing a lot of research, and I expect the competitive ML community to adopt new advances when they happen. Also in NLP there's a move towards more efficient models, which would also be very useful.

2

XGDragon t1_jbc09qd wrote

Awesome roundup. I wonder, was https://grand-challenge.org/ included in your analysis?

2

hcarlens OP t1_jbdv2vl wrote

Thanks! Yes, but I didn't manage to get as much data as I wanted for the competitions on there. I emailed some of the competition organisers but didn't get a response.

2

sLqHA3RbL2MBi t1_jbgya0w wrote

This was a fascinating read, thanks a lot!

2