They missed a great opportunity to call this BloTorch (Bayesian Learning and Optimisation?), but I'm very excited to see such methods gaining more traction!
We built (internally) an engineering optimization framework on top of Botorch. Bayesian optimization is a great tool to have in the toolbox for mapping a design space and finding the limiting cases with fewer calls to the expensive physics-based solvers. The computational budget often winds up being 5-10x smaller than it would have been using traditional design-of-experiments sampling methods.
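For anyone curious what that looks like in practice, here's a minimal sketch of the kind of loop involved. Everything here is a toy placeholder: `expensive_solver` stands in for the real FEA call, and the budget/settings are illustrative, not our actual framework.

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import ExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

def expensive_solver(x: torch.Tensor) -> torch.Tensor:
    # Stand-in for a physics-based solver; returns one scalar per design point.
    return -((x - 0.3) ** 2).sum(dim=-1, keepdim=True)

bounds = torch.tensor([[0.0, 0.0], [1.0, 1.0]])  # two design parameters
train_X = torch.rand(5, 2)                        # small initial DoE
train_Y = expensive_solver(train_X)

for _ in range(20):  # each iteration spends exactly one solver call
    gp = SingleTaskGP(train_X, train_Y)
    fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))
    acqf = ExpectedImprovement(gp, best_f=train_Y.max())
    candidate, _ = optimize_acqf(
        acqf, bounds=bounds, q=1, num_restarts=5, raw_samples=64
    )
    train_X = torch.cat([train_X, candidate])
    train_Y = torch.cat([train_Y, expensive_solver(candidate)])
```

The savings over space-filling DoE come from spending each solver call where the surrogate expects it to be most informative, rather than on a fixed grid.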
There are also recent Bayesian optimal experimental design methods that let you design experiments directly via gradient ascent. Not sure how they compare with BayesOpt on your problem, though.
How flexible was Botorch for your task, and was it difficult to map the problem to a Botorch-specific format? And what kind of understanding did you have of the underlying cost/objective function, and what kind of surrogate model did you use? Just a basic GP (Gaussian process)?
It wasn’t terribly difficult once we got a feel for the underlying API. GP surrogates work out of the box, but you can plug in other kinds as well. Same goes for acquisition functions.
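To give a feel for it, swapping the acquisition function is basically a one-line change. This sketch assumes a GP `gp` already fitted to `train_X`/`train_Y`, as in the loop sketch above:

```python
from botorch.acquisition import ExpectedImprovement, UpperConfidenceBound

# Either of these drops into the same optimize_acqf call:
acqf = ExpectedImprovement(gp, best_f=train_Y.max())  # classic EI
acqf = UpperConfidenceBound(gp, beta=2.0)             # more exploratory
```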
As far as the objective, usually we’re calling an external physics-based solver (e.g. finite elements) and post-processing the solution to get the quantity we’re trying to optimize. There’s almost never any gradient information, so Bayesian optimization winds up being the method of choice.
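Concretely, the objective ends up being a thin wrapper around the solver invocation, something like this. The solver CLI and file formats here are entirely hypothetical:

```python
import subprocess
from pathlib import Path
import torch

def objective(X: torch.Tensor) -> torch.Tensor:
    """One external solver run per design point; no gradient information."""
    values = []
    for params in X:
        # Hypothetical input/output conventions for the solver.
        Path("case.inp").write_text(" ".join(f"{p:.6f}" for p in params.tolist()))
        subprocess.run(["my_solver", "case.inp", "-o", "case.out"], check=True)
        values.append(float(Path("case.out").read_text()))  # post-processed scalar
    return torch.tensor(values).unsqueeze(-1)
```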
If you’re not from a traditional engineering background, some of my terminology may have been confusing. Here’s a less jargon-heavy version:
- We use Bayesian optimization to find the optimum (or worst-case) configuration of real manufactured objects and systems.
- Bayesian optimization lets us arrive at that design configuration faster than explicit, physics-based simulation of many samples within the space of all possible configurations.
- We built the framework to do that using Botorch.
- It’s not an uncommon practice by any means, but the availability of tools like Botorch now makes it a lot easier to implement Bayesian optimization in-house, vs relying on a vendor-based engineering tool.
Are the tolerances of the optimum important in your case? (i.e., how sensitive it is to errors in the design parameters) If so, did you use any method to incorporate this information into the optimization?
It depends on the needs of the specific application. What typically happens is that you’d use BO to globally converge to within some tolerance and use the resulting surrogate to get a map of where the interesting regions are. You can then more densely sample these candidate regions or switch to a gradient-based method (via finite difference). For uncertainty information, we usually add this as a noise parameter, either on the input samples, or as part of the GP kernel.
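For the observation-noise part, a minimal sketch in BoTorch terms, assuming you can estimate a per-sample variance (e.g. from repeated runs). In recent BoTorch versions this goes straight into SingleTaskGP; older versions used a separate FixedNoiseGP model:

```python
import torch
from botorch.models import SingleTaskGP

train_X = torch.rand(10, 2)
train_Y = torch.randn(10, 1)                 # toy observations
train_Yvar = torch.full_like(train_Y, 1e-2)  # placeholder variance estimates

gp = SingleTaskGP(train_X, train_Y, train_Yvar=train_Yvar)
```

Noise on the input samples themselves (i.e., tolerances on the design parameters) is more involved and gets handled at the kernel or problem-formulation level.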
I've never used the library directly, but it was quite useful as a method for hyperparameter search in Optuna (for choosing machine learning tuning parameters).
Same here, that's how I first came across it :) In my tests so far, it performs a lot better than other samplers and needs a lot fewer trials to find good hyperparameters.
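For reference, it's just a sampler swap. Here's a sketch with a toy objective; note that in newer Optuna releases BoTorchSampler lives in the separate optuna-integration package:

```python
import optuna
from optuna.integration import BoTorchSampler

def objective(trial: optuna.Trial) -> float:
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    dropout = trial.suggest_float("dropout", 0.0, 0.5)
    # Stand-in for training a model and returning the validation loss.
    return (lr - 1e-3) ** 2 + dropout

study = optuna.create_study(direction="minimize", sampler=BoTorchSampler())
study.optimize(objective, n_trials=30)
```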
It's also very useful for simulation-based optimization. As an example, we use it extensively for the design of particle accelerators, where the simulations are typically expensive and need to run on supercomputers. We have built our own library[0] for enabling this, which in the end uses BoTorch (through Ax[1]) under the hood.
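For anyone who wants to try the Ax route directly, the service API keeps the loop pretty small. A sketch, with a toy quadratic standing in for the accelerator simulation; note the create_experiment signature has shifted a bit across Ax versions:

```python
from ax.service.ax_client import AxClient

ax_client = AxClient()
ax_client.create_experiment(
    name="toy_design",
    parameters=[
        {"name": "x1", "type": "range", "bounds": [0.0, 1.0]},
        {"name": "x2", "type": "range", "bounds": [0.0, 1.0]},
    ],
    objective_name="loss",
    minimize=True,
)

for _ in range(20):
    params, trial_index = ax_client.get_next_trial()
    # Stand-in for submitting the expensive simulation and reading the result.
    loss = (params["x1"] - 0.5) ** 2 + (params["x2"] - 0.2) ** 2
    ax_client.complete_trial(trial_index=trial_index, raw_data=loss)
```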