Optimizing with Low Budgets: A Comparison on the Black-Box Optimization Benchmarking Suite and OpenAI Gym

Authors: Elena Raponi, Nathanael Carraz Rakotonirina, Jeremy Rapin, Carola Doerr, Olivier Teytaud

Published: 2023 (Journal Paper)

Source: IEEE Transactions on Evolutionary Computation

Algorithm: Low-budget black-box optimizer benchmarking

arXiv: 2310.00077

DOI: 10.1109/TEVC.2023.3346788

Summary

Compares Bayesian-optimization-oriented ML tools with classical derivative-free optimizers under limited evaluation budgets. The study updates earlier benchmarking work by testing both COCO/BBOB functions and direct policy search in OpenAI Gym, showing that BO methods are strong at small budgets but can be overtaken by other black-box optimizers as budgets grow.
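The kind of low-budget comparison the study performs can be illustrated with a minimal, self-contained sketch. This is not the paper's actual experimental code (the authors use established benchmarking platforms); it only shows the shape of such a comparison: two simple derivative-free baselines, random search and a (1+1) evolution strategy, run on a sphere function (in the style of BBOB f1) under the same small evaluation budget.

```python
import math
import random

def sphere(x):
    """Sphere function, in the style of BBOB f1 (shift/rotation omitted)."""
    return sum(xi * xi for xi in x)

def random_search(f, dim, budget, rng):
    """Sample uniformly in [-5, 5]^dim and return the best value found."""
    best = math.inf
    for _ in range(budget):
        x = [rng.uniform(-5, 5) for _ in range(dim)]
        best = min(best, f(x))
    return best

def one_plus_one_es(f, dim, budget, rng, sigma=1.0):
    """(1+1)-ES with one-fifth-success-rule step-size adaptation."""
    x = [rng.uniform(-5, 5) for _ in range(dim)]
    fx = f(x)
    for _ in range(budget - 1):
        y = [xi + sigma * rng.gauss(0, 1) for xi in x]
        fy = f(y)
        if fy <= fx:
            x, fx = y, fy
            sigma *= 1.5              # success: enlarge the step size
        else:
            sigma *= 1.5 ** (-0.25)   # failure: shrink (one-fifth rule)
    return fx

# Same fixed low budget for every optimizer, as in budget-constrained benchmarking.
rng = random.Random(42)
budget, dim = 100, 5
print("random search:", random_search(sphere, dim, budget, rng))
print("(1+1)-ES:     ", one_plus_one_es(sphere, dim, budget, rng))
```

In the paper, the same protocol is applied at scale: many optimizers (including BO-based tools), many functions, several dimensions and budgets, with repeated runs to support statistically meaningful conclusions.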

Abstract

The growing ubiquity of machine learning (ML) has led it to enter various areas of computer science, including black-box optimization (BBO). Recent research is particularly concerned with Bayesian optimization (BO). BO-based algorithms are popular in the ML community, as they are used for hyperparameter optimization and, more generally, for algorithm configuration. However, their efficiency decreases as the dimensionality of the problem and the budget of evaluations increase. Meanwhile, derivative-free optimization methods have evolved independently in the optimization community. There is therefore a pressing need to understand whether cross-fertilization is possible between the two communities, ML and BBO, i.e., whether algorithms that are heavily used in ML also work well in BBO and vice versa. Comparative experiments often involve rather small benchmarks and exhibit visible problems in the experimental setup, such as poor initialization of baselines, overfitting due to problem-specific tuning of hyperparameters, and low statistical significance. With this paper, we update and extend a comparative study presented by Hutter et al. in 2013. We compare BBO tools for ML with more classical heuristics, first on the well-known BBOB benchmark suite from the COCO environment and then on Direct Policy Search for OpenAI Gym, a reinforcement learning benchmark. Our results confirm that BO-based optimizers perform well on both benchmarks when budgets are limited, albeit with a higher computational cost, while they are often outperformed by algorithms from other families when the evaluation budget becomes larger. We also show that some algorithms from the BBO community perform surprisingly well on ML tasks.

Tags

  • Black-box optimization

  • Low-budget optimization

  • Bayesian optimization

  • Derivative-free optimization

  • BBOB

  • COCO

  • OpenAI Gym

  • Benchmarking

  • Reinforcement learning