Updated: 2024-02-26 Mon 10:47

NeurIPS 2023 Retrospect


Thus another NeurIPS has ended. This was my third, and each one I've attended has felt larger than the last. Here I reflect on my experience this time around. Some context: I have just finished my PhD and am on the job market, ideally looking for an ML research position in industry. While the main reason I went was to present my paper, I also attended to get a sense of the current job market and to make connections with companies and researchers in industry and academia.

I presented our approach (which we call MEKRR, pronounced "maker") for successfully transferring pretrained GNNs to the task of predicting energies of atomistic systems. It combines GNN feature representations with kernel mean embeddings and ridge regression. Do have a look at the arXiv paper (I will update this link with the conference paper once it is available in the official NeurIPS proceedings) or the code base :)


I will write more about MEKRR in a separate note and link it here later. The gist is that taking learned features from GNNs trained on an upstream dataset and running KRR on these features, together with some kernel tricks for dealing with sets / point clouds, works really well. Will kernels make a comeback? Probably not. Can kernels, given a strong feature map, improve performance on small to medium-sized datasets compared to fine-tuning? Probably yes!
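To make the gist concrete, here is a minimal sketch of the general recipe (not the actual MEKRR implementation): each structure is a set of per-atom feature vectors (here random arrays standing in for pretrained GNN features), the set is collapsed to a fixed-size vector via a mean embedding, and kernel ridge regression is fit on top. All names and the toy data are illustrative.

```python
import numpy as np

def mean_embedding(atom_features):
    # Collapse a variable-size set of per-atom feature vectors (n_atoms, d)
    # into a single fixed-size vector (d,) by averaging.
    return atom_features.mean(axis=0)

def rbf_kernel(X, Y, gamma=0.5):
    # Gaussian (RBF) kernel between rows of X and rows of Y.
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def krr_fit(K, y, lam=1e-3):
    # Solve (K + lam * I) alpha = y for the dual coefficients.
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), y)

def krr_predict(K_test_train, alpha):
    # Predictions are kernel evaluations against the training set.
    return K_test_train @ alpha

# Toy data: random "GNN features" for structures with varying atom counts.
rng = np.random.default_rng(0)
train_sets = [rng.normal(size=(rng.integers(5, 15), 8)) for _ in range(20)]
test_sets = [rng.normal(size=(rng.integers(5, 15), 8)) for _ in range(5)]

X_train = np.stack([mean_embedding(s) for s in train_sets])
X_test = np.stack([mean_embedding(s) for s in test_sets])
y_train = X_train.sum(axis=1)  # synthetic stand-in for per-structure energies

alpha = krr_fit(rbf_kernel(X_train, X_train), y_train)
y_pred = krr_predict(rbf_kernel(X_test, X_train), alpha)
```

The mean embedding is the simplest way to handle sets of atoms; it is what makes the kernel well-defined over point clouds of varying size.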

Expo hall and companies

The expo hall was bustling. My sense is that the companies can be categorized into

  1. Big Tech such as MAMAA (previously known as FAANG) and some older tech companies such as IBM,
  2. trading companies such as Jane Street or DE Shaw,
  3. lots of smaller companies serving LLMs and other models as a service, or speeding up inference using quantization and other postprocessing techniques,
  4. the rest, including peripheral companies using ML (Sony, Disney), biology / drug / medicine companies, and publishing houses.

In general it was a pretty good place to get in touch with companies. I made some genuinely good contacts which I will cherish whether or not they lead to a job. I disliked talking to recruiters who in the end just forwarded me to their company's general recruitment page; that feels like a waste of time on both their part and mine.

What I would have done differently

  • Go through all booths on the expo day instead of spacing it out over the week
  • Get over fear of talking and engage with people instead of circling around wasting time
    • Tip: Use the booths you are not really interested in, or that have low engagement, to warm up. It's fun to see what people are up to, and they probably enjoy people actually talking to them.

I spent a couple of hours each day going through the hall in a systematic manner (my wife remarked that I came home with a lot of good swag, mostly really high-quality socks, which is simply explained by my visiting almost every booth…). I think this is worthwhile, but it drains your energy and takes up a lot of time. Looking back, I wish I had done it all on the Sunday I arrived, so that I could have focused on the posters and talks during the rest of the week. This time I didn't spend much time at all on the research part of the conference, which I slightly regret. But I met with the companies and engaged in some way or another with about 80% of the booths, which I count as a success.

Another part is getting over yourself and just talking to people. At the start of the conference I was quite shy and wondered why people would want to engage with me, but honestly, the whole reason these companies are present at NeurIPS is to talk to attendees (even if the chat does not actually lead anywhere), at least out of courtesy. I should work on overcoming this fear and just throw myself out there. Talking to low-stakes companies first helped me get over this barrier, I felt.

Meetings and parties

I went to a couple of parties, which was great: a good time to reconnect with people I haven't seen in a while and to meet new people. One party was thrown by one of the UK initiatives for AI safety, and it was interesting to chat about the state of things. It seemed the onus was on first making people aware of the problem and of the ways it can be approached and potentially solved, working similarly to a think tank. After this party ended we went to the Cohere party, thrown in an amazing multi-layered building. Really cool party; would go again.

While the above party was more general and open, I also went to an open bar hosted by Imbue. This was super cozy and intimate. I made some great contacts there and spoke to many of the people on the team. The party was hosted outside the usual district, which probably meant fewer people attended, but on the other hand, the people who did come were there for a reason. I thoroughly enjoyed myself; it was one of the highlights of the conference for me.

The actual research conference

LLMs were everywhere. I don't think this should have come as a surprise to anyone. In one sense I feel like ChatGPT and its kind have been the first to actually deliver on the promise of AI to the consumer (at least during my lifetime; I would be interested to hear contrasting viewpoints), so it's only natural that research trails this development, as we have a pretty poor understanding of what actually goes on inside these models. On the other hand, there's the question of what the role of academia and publishing should be: to always cater to industry, or to do the high-risk research that enables the Next Big Thing (TM)? Pretty hard to do that kind of research when so much of it comes down to compute.

MEKRR Poster

Presenting my poster was a blast. We had a lot of activity and I received great feedback. Someone even came and wanted a selfie with me and the poster! That is the highest flattery I've ever received in my academic career, by far. Some senior researchers found this work interesting and really engaged with me. This cemented my confidence in this line of work, and I hope that others will continue investigating how kernels can fit into a neural network world.