I'm one of the Souper developers and can offer a few pointers that may be more helpful than the GitHub repo. Here's a somewhat recent blog post showing things that Souper can do:
http://blog.regehr.org/archives/1252
Here's a talk I gave at the University of Washington last winter (before Souper did synthesis):
https://www.youtube.com/watch?v=Ux0YnVEaI6A
Recently we've been teaching Souper to use dataflow facts such as LLVM's known bits and demanded bits. Perhaps interestingly, Souper can also teach LLVM how to compute these facts more precisely.
One really common question about Souper is "why operate on LLVM IR instead of machine instructions?" One of the main answers is "so that we can interact with dataflow analyses." This isn't something that previous superoptimizers have done. It seems to be working out really well.
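For readers who haven't met known bits: the analysis tracks, for each bit of a value, whether it is provably 0, provably 1, or unknown, and every operation gets a transfer function over those facts. A minimal sketch, assuming a simple (zeros, ones) mask-pair representation; this is illustrative only, not Souper's or LLVM's actual code:

```python
from collections import namedtuple

# zeros: bits proven to be 0; ones: bits proven to be 1.
KnownBits = namedtuple("KnownBits", ["zeros", "ones"])

def known_and(a: KnownBits, b: KnownBits) -> KnownBits:
    # A result bit is 0 if either input bit is 0, and 1 only if both are 1.
    return KnownBits(zeros=a.zeros | b.zeros, ones=a.ones & b.ones)

def known_or(a: KnownBits, b: KnownBits) -> KnownBits:
    # A result bit is 1 if either input bit is 1, and 0 only if both are 0.
    return KnownBits(zeros=a.zeros & b.zeros, ones=a.ones | b.ones)

# x is known even (low bit 0); y is completely unknown.
x = KnownBits(zeros=0b1, ones=0)
y = KnownBits(zeros=0, ones=0)
print(known_and(x, y))  # low bit of (x & y) is still provably 0
```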
It uses a Redis cache? I wonder if future versions could use a global cache somewhere on the internet, and use cloud solvers. It seems like you'd need a Redis cache per version, but that shouldn't be hard to do -- or just add a version key to each cache request.
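For what it's worth, the version-key idea is nearly a one-liner in practice. A minimal sketch assuming the redis-py client and a local Redis server; the key scheme and version constant are hypothetical, not Souper's actual cache layout:

```python
import hashlib
import redis

SOUPER_VERSION = "0.1.0"  # hypothetical; would come from the build
r = redis.Redis()         # assumes a Redis server on localhost

def cache_key(query: str) -> str:
    # Hash the solver query and prefix it with the version, so results
    # produced by different Souper versions never collide.
    digest = hashlib.sha256(query.encode()).hexdigest()
    return f"souper:{SOUPER_VERSION}:{digest}"

def lookup(query: str):
    return r.get(cache_key(query))

def store(query: str, result: str) -> None:
    r.set(cache_key(query), result)
```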
Seems weird to have this type of tool as a local tool rather than a cloud-based one.
It would be neat to have a service you can query with a snippet of IR that gives you back an optimal version of it. One could continually increase the size of these snippets as more cloud resources become available.
Having statistics on which patterns are common, even if they are not optimized, would be really interesting. You could then use these statistics to target future optimization research.
It's a good idea. Keep in mind that an optimiser takes a bit of code and replaces it with a different bit of code. So a "cloud" solution would certainly require some thought towards security.
Maybe a tall order, but if someone has it in them to provide a clear and simple explanation of SMT/SAT at the level of an intrigued layman, that'd be amazing. I've looked at the Wikipedia pages a few times and always just feel kinda overwhelmed. I'm probably missing the education in formal logic/math to understand why these are fruitful models, but they obviously are. Is this a kind of forefront of artificial reasoning research? What are some examples of problems they can help solve, and approximately what would be involved in modelling those problems within the formalization?
They're a good model in part because they're the most fundamental of a number of equivalent formalisms, and the one that has seen the most work on fast, practical solver implementations.
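To give a concrete taste of the modelling involved: you phrase a claim as a formula over, say, fixed-width bitvectors, and the solver either proves it holds for every input or hands back a counterexample. A tiny sketch using Z3's Python bindings (not tied to Souper; the rewrites are just examples):

```python
# Check compiler-style rewrites over all 2^32 values of a 32-bit x.
from z3 import BitVec, prove

x = BitVec("x", 32)

# Valid: multiplying by 2 is the same as shifting left by 1.
prove(x * 2 == x << 1)   # prints: proved

# Invalid: signed x + 1 > x overflows when x is INT_MAX.
prove(x + 1 > x)         # prints a counterexample such as x = 0x7fffffff
```

The appeal is that the solver reasons about all 2^32 inputs symbolically instead of enumerating them, which is exactly what a superoptimizer needs in order to trust a rewrite.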
A benchmark would not be relevant here. Souper is intended to detect missed optimizations in LLVM code already generated by, say, Clang, and to point out new patterns to recognize and transform to the LLVM developers. The improvements brought by Souper are distilled over a long period and mixed in with other, unrelated LLVM improvements. On the other hand, every improvement suggested by Souper has looked like a good idea to an LLVM developer who let themselves be convinced at some point, and many of them simply remove instructions, which at least reduces the size of the generated binary even if nothing else.
>Alternatively, you may immediately let Souper modify the bitcode and let it apply the missed optimization opportunities by using the Souper LLVM opt pass.
So it seems like a benchmark of the program before and after this process would be relevant.
Absolutely. See [1] for instance. Superoptimization has been around since the 80s; the first superoptimizer, by Alexia Massalin, targeted the 68000 instruction set.
They had to restrict themselves to instruction sequences of length 3 due to limited resources, and say that to go beyond instruction sequences of length 4 they would need a better approach.
Massalin's superoptimizer required humans to check the correctness of optimizations. It also didn't generate parameterized peephole optimizations; it directly optimized concrete instruction sequences.
It means an exhaustive search of possible operation sequences to find the optimal one. So the result is optimal, but the approach is impractical beyond very short sequences. The applications are processing very short bits of code and trying to apply the result, e.g. dropping it into some real-world code or informing humans about deficiencies in a compiler's optimizer.
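To make "exhaustive search" concrete, here is a toy sketch in Python: enumerate straight-line programs shortest-first and keep the first one that matches the target on a handful of test inputs. The opcode set and test vectors are made up for illustration; a real superoptimizer would verify the surviving candidate with a solver (or, in Massalin's day, by hand) rather than trusting test vectors.

```python
import itertools

MASK = 0xFFFFFFFF  # model 32-bit arithmetic

# Candidate binary operations: (name, function).
OPS = [
    ("add", lambda a, b: (a + b) & MASK),
    ("sub", lambda a, b: (a - b) & MASK),
    ("and", lambda a, b: a & b),
    ("or",  lambda a, b: a | b),
    ("xor", lambda a, b: a ^ b),
]

def run(prog, x):
    """Evaluate a straight-line program; value 0 is the input x."""
    values = [x]
    for (_, fn), i, j in prog:
        values.append(fn(values[i], values[j]))
    return values[-1]

def programs(length):
    """Yield every program of the given length."""
    for ops in itertools.product(OPS, repeat=length):
        # Instruction k may read any of the k+1 values defined so far.
        operand_sets = [itertools.product(range(k + 1), repeat=2)
                        for k in range(length)]
        for picks in itertools.product(*operand_sets):
            yield [(ops[k], picks[k][0], picks[k][1]) for k in range(length)]

def search(target, max_len, tests=(0, 1, 2, 12345, 0x80000000, MASK)):
    for length in range(1, max_len + 1):  # shortest first => optimal
        for prog in programs(length):
            if all(run(prog, t) == target(t) & MASK for t in tests):
                return prog  # candidate; verify before trusting it
    return None

# Find a one-instruction replacement for x * 2.
prog = search(lambda x: x * 2, max_len=1)
print([(op[0], i, j) for op, i, j in prog])  # [('add', 0, 0)]
```

Even with five opcodes the search space grows super-exponentially with program length, which is why Massalin stopped at a few instructions and why modern tools lean on solvers to prune and verify instead.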