[LLVM-DEV'25] LLVM ♥ ML Workshop

Following up on Tanya’s announcement: this workshop is the continuation of last year’s LLVM :hearts: ML, with pretty much the same format.

…and before we continue, yes, it can be about AI; in fact maybe we should’ve had AI in the title, since technically that encompasses ML, and besides we already included AI-ish topics last year. Plus it’s all the rage now. Alas, we didn’t change the name, for the sake of continuity.

As announced:

The workshop aims to bring together LLVM contributors and researchers that work on applying machine learning techniques to LLVM, including compiler optimizations, latency estimation, IR comprehension, code generation, or data set curation (to name a few) - to discuss their current, ongoing projects, and explore ways to better collaborate and plot pathways for integration into community LLVM.

Just like before, the workshop is an unconference. It’s a half-day workshop with two parts:

In the first part, folks who want to propose topics or update the community on their work give a short presentation followed by Q&A. If you’re interested in presenting in the first half, please DM @mtrofin by Oct. 1st so we can figure out how to budget the time; we’ll get back to you on Oct. 2nd to confirm timing. In addition (and optionally), if you’re presenting and want to share more information with participants ahead of time, send us whatever materials you’d like at the latest a week before the workshop, so we can group them all into one Discourse post a few days beforehand.

The second part will be a round table on topics the workshop participants choose. We’ll ask a volunteer to take notes, which will then be shared back on Discourse together with the slides presented in the first part. Looking forward to seeing you all in October!

@mtrofin @jdoerfert @boomanaiden154-1


To my knowledge we won’t have the necessary AV setup; besides, online participation wouldn’t really work well for the lively discussions we want to encourage, not only in the second half but also in the first.

It’d be great if you could make it in person! If not, the presentations and minutes will be shared afterwards.

Agenda for LLVM :hearts: ML

General Conference Registration Site

Monday Oct 27, 2025, 1:30pm - 5:30pm, Grand Ballroom Salon A-D

@ydebl: Does AI Boost Compiler Engineers’ Productivity?
@JDevlieghere: AI & LLDB
Yifan Zhang: ProfGen: LLM-Powered Generation of LLVM IR Benchmarks for Profile-Guided Optimization
@Jaddyen: EmitC for MLGO
@boomanaiden154-1: Large Scale Training for RL Guided Register Allocation
@DataCorrupted: Careful What You Wish For - What we learned using MLGO for size optimization

– BREAK –

@chhzh123: Magellan: Autonomous Discovery of Novel Compiler Optimization Heuristics with AlphaEvolve
@svkeerthy: IR2Vec Embeddings: Lessons Learned and The Road Ahead
@vshah: Towards Learning Latency Impact of Cache Misses: Retrofitting Learned BB Latency Models to Process LBR Traces

– Round Table(s) on whatever topics we select ad-hoc –


Notes from the roundtable section during today’s workshop:

  • Why do we use TensorFlow for MLGO?

    • Difference between the framework used during training and during deployment.

    • EmitC is aimed at the deployment scenario.

    • Nothing is stopping you from using a different ML framework at training time. Mobile app folks mentioned that they used PyTorch for some of their experiments.

    • torch.compile

      • Mobile app folks did not use it

      • Used ExecuTorch (the embedded counterpart to torch.compile). Still too difficult to deploy because all the PyTorch dependencies are still attached.

      • EmitC still looks like the ideal solution.

      • PyTorch infra apparently supports emitting object files; that is how people working on optimizing code size for mobile apps have been deploying models.

  • AI Slop

    • Anecdotes about Claude Code being sycophantic.

  • MLP vs LLM for optimizing code

    • LLMs can maybe one-shot optimizations; MLPs need a lot of training.

    • Anecdotes from Aiden about LLMs still needing training to perform well.

    • Meta code size experiments (Cummins et al.) using LLMs to generate phase orderings

      • Can work because you only need to invoke the LLM once.

      • Lots of engineering problems deploying dynamic phase orderings.

    • LLMs vs IR2Vec

    • Compilation as a service using LLMs (neural-compilation-esque)

  • Applying to new optimizations

    • Feature request: instruction scheduling - VLIW target

      • Mircea’s advice: start small to see if things are feasible.

      • Headroom analyses would be very helpful. It’s currently hard to tell how good the current heuristics are.

    • What Google has been working on

      • Inliner for performance (using contextual profiling)

      • Other regalloc-related work (existing trace-based cost modelling)

      • Code layout

    • InputGen for Loop Unrolling

    • The biggest problem is still the reward signal.

    • Hyperparameter search

      • Some anecdotes about using brute force to get 10-30% gains on GPU kernels

      • TVM might be using some ML models to make these sorts of tiling/etc. decisions.

      • Vizier (a Google project) and ES (evolution strategies) could help with this.

        • ES has the advantage of being relatively simple (see the first sketch after these notes).

    • How do we encode the context for the ML models? Best practices for designing inputs?

      • Two approaches: feature engineering for specific optimizations, or using embeddings (like IR2Vec). Embeddings can be mixed with features if desired.

      • MLGO inliner uses ~32 features per function.

      • What is best can be very problem dependent.

      • Feature extraction is costly; anecdotes of lots of timeouts caused by expensive feature extraction.

    • Anecdote about trying GA (genetic algorithms) for tuning compiler flags. You have to come up with an initial generation, and can run into lots of crashes, hangs, and flags not working (see the second sketch after these notes).

      • Sort of like fuzzing.

      • Things have not really improved recently around reliability/deviating from the happy path.

    • CompilerGym

      • Google was looking at using it over maintaining their own.

      • Some researchers were interested in interactive mode training.

      • Mircea is not sure that there are enough examples of workable training processes to try and build out a good general framework.

      • Python dependency hell can bring about its own challenges.

      • There was some work around making interactive mode in LLVM MLGO infra work with CompilerGym.

    • We have been working on models/deployment methodologies. What about big/more complex models like GNNs?

      • What about model optimization techniques?

      • So far, no evidence suggesting that bigger models are significantly better for the problems that we are working on.

      • Definitely interested in using model optimization techniques if we start seeing benefits to using big models.

      • Open question of how well this works/if we want to do this.

      • Also tradeoff between more parameters and more sophisticated feature extraction.

      • A higher parameter count can also lead to overfitting, especially in the supervised case, preventing generalization.

    • How does ML change compilation times?

      • E.g. in graphics, ML is used to reduce the time it takes to produce a frame.

      • There was a GSoC project to predict passes that won’t do anything, so they can be skipped to reduce compilation time. Some progress was made; the exact results can be looked up.

      • Showed that it was possible. No one did large scale supervised training.

      • IR2Vec might help enable this.

      • At Google: we cannot have a compilation run for more than 15 minutes, so we have to keep compilation time about the same with or without ML.
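
Since ES came up as “relatively simple”, here is a minimal sketch of the kind of evolution-strategies loop one could use for tuning a couple of numeric heuristic parameters. Everything in it (the parameter names, the ranges, the `measure_cost` objective) is a made-up placeholder for illustration, not anything taken from LLVM or MLGO.

```python
import random

# Hypothetical tunables; names and starting values are purely illustrative.
params = {"inline_threshold": 225.0, "unroll_count": 4.0}

def measure_cost(candidate):
    """Placeholder objective. In practice this would build and measure a
    benchmark with the candidate parameters (code size, latency, ...);
    stubbed out with an arbitrary quadratic so the sketch runs standalone."""
    return ((candidate["inline_threshold"] - 300.0) ** 2
            + (candidate["unroll_count"] - 8.0) ** 2)

best, best_cost = dict(params), measure_cost(params)
sigma = 10.0  # mutation step size

for _generation in range(50):
    # (1+lambda)-style ES: score a few Gaussian perturbations of the current
    # best and keep the winner only if it improves on the incumbent.
    scored = []
    for _ in range(8):
        cand = {k: v + random.gauss(0.0, sigma) for k, v in best.items()}
        scored.append((measure_cost(cand), cand))
    cost, cand = min(scored, key=lambda c: c[0])
    if cost < best_cost:
        best_cost, best = cost, cand

print(best, best_cost)
```

The whole loop is a handful of lines, which is the “relatively simple” appeal; candidate evaluations are also independent of each other, so they parallelize trivially.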

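And to illustrate the crashes/hangs/rejected-flags point from the GA-for-flag-tuning anecdote: a rough sketch of what the fitness-evaluation step has to guard against. The flag pool, source file name, and size-based fitness are illustrative assumptions, not a description of the actual experiments.

```python
import os
import random
import subprocess

# Illustrative flag pool; a real experiment would draw from a much larger set.
FLAG_POOL = ["-O1", "-O2", "-O3", "-funroll-loops", "-fno-vectorize"]

def random_individual(k=3):
    # One member of the initial generation: a random subset of flags.
    return random.sample(FLAG_POOL, k)

def fitness(flags, source="bench.c", timeout_s=60):
    """Fitness = object-file size in bytes. Hangs, crashes, and flag
    combinations the compiler rejects all get an infinite cost so the GA
    steers away from them, much like a fuzzing harness guarding its target."""
    cmd = ["clang", *flags, "-c", source, "-o", "out.o"]
    try:
        proc = subprocess.run(cmd, capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return float("inf")  # hang
    if proc.returncode != 0:
        return float("inf")  # crash, or a flag combination clang rejects
    return os.path.getsize("out.o")
```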

…and here are the slides.

(please let me know if there are any access issues)


Also, we’ll use the Nov 7th MLGO meeting for post-workshop feedback - what went well, what can we improve, what can we add next time, etc.

Details on how to join are in the agenda document.


Hi! I believe @vshah’s presentation on Learning Latency Impact of Cache Misses is missing from the slides folder. Would it be possible to add that one as well, please? Thank you!

There was something wonky with the permissions; I think I fixed it now. Could you please check? Thanks!

Yep, I see it now. Thank you! We’re still missing:

  • Large Scale Training for RL Guided Register Allocation
  • Careful What You Wish For - What we learned using MLGO for size optimization
  • Magellan: Autonomous Discovery of Novel Compiler Optimization Heuristics with AlphaEvolve

Will these be uploaded as well?
I’m particularly interested in Peter’s experiments with MLGO for size. Thanks!

Must be the same problem: the owner used their corporate account to create the slides. @DataCorrupted @boomanaiden154-1 could you make a copy of yours using a private account? Thanks!

Done. It should be there now.


@ioghiban @mtrofin Done.
