Abstract
This package implements a modified Constrained Adversarially Robust Bayesian Optimization (CARBO) sampler based on the paper Constrained robust Bayesian optimization of expensive noisy black-box functions with guaranteed regret bounds. This sampler robustly optimizes an objective function along with inequality constraints when the inputs are subject to noise. The algorithm details are described in the Others section.
APIs
CARBOSampler(*, seed: int | None = None, independent_sampler: BaseSampler | None = None, n_startup_trials: int = 10, deterministic_objective: bool = False, constraints_func: Callable[[FrozenTrial], Sequence[float]] | None = None, rho: float = 1e3, beta: float = 4.0, local_ratio: float = 0.1, n_local_search: int = 10)
- `seed`: Seed for random number generator.
- `independent_sampler`: Sampler used for initial sampling (for the first `n_startup_trials` trials) and for conditional parameters; it is used whenever `sample_independent` is called. If not specified, a random sampler with the same `seed` is used.
- `n_startup_trials`: Number of initial trials.
- `deterministic_objective`: Whether the objective function is deterministic or not. If `True`, the sampler will fix the noise variance of the surrogate model to the minimum value (slightly above 0 to ensure numerical stability).
- `constraints_func`: An optional function that computes the objective constraints. It must take an `optuna.trial.FrozenTrial` and return the constraints. The return value must be a sequence of `float`s. A value strictly larger than 0 means that a constraint is violated. A value equal to or smaller than 0 is considered feasible. If `constraints_func` returns more than one value for a trial, that trial is considered feasible if and only if all values are equal to 0 or smaller. The `constraints_func` will be evaluated after each successful trial. The function won't be called when trials fail or are pruned, but this behavior is subject to change in future releases. Currently, the `constraints_func` option is not supported for multi-objective optimization.
- `rho`: The mixing coefficient for the acquisition function. If this value is large, the parameter suggestion puts more priority on satisfying the constraints.
- `beta`: The coefficient for LCB and UCB. If this value is large, the parameter suggestion becomes more pessimistic, meaning that the search is inclined to explore more.
- `local_ratio`: The epsilon parameter in the CARBO algorithm that controls the size of $W(\theta)$. This value must be in $[0, 1]$.
- `n_local_search`: How many times the local search is performed.
Note that, due to a limitation of the algorithm, only non-conditional numerical parameters can be sampled by the CARBO algorithm; categorical and conditional parameters are handled by random search (the `independent_sampler`).
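For instance, a configuration that leans harder on constraint satisfaction might look like the sketch below. The specific values are arbitrary illustrations, not recommended settings; only the parameters documented above are used.

```python
import optunahub

CARBOSampler = optunahub.load_module("samplers/carbo").CARBOSampler

# Arbitrary illustrative values: a larger `rho` puts more priority on
# constraints, a larger `beta` makes the search more pessimistic (more
# exploratory), and `local_ratio` controls the epsilon-box size.
sampler = CARBOSampler(
    seed=42,
    n_startup_trials=5,
    rho=1e4,
    beta=9.0,
    local_ratio=0.2,
    n_local_search=20,
)
```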
Installation
$ pip install torch --index-url https://download.pytorch.org/whl/cpu
$ pip install scipy
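As an optional sanity check (a suggestion, not part of the package docs), you can confirm the dependencies import cleanly:

$ python -c "import torch, scipy; print(torch.__version__, scipy.__version__)"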
Example
```python
import numpy as np
import optuna
import optunahub


def objective(trial: optuna.Trial) -> float:
    x = trial.suggest_float("x", 0.0, 2 * np.pi)
    y = trial.suggest_float("y", 0.0, 2 * np.pi)
    # Store the constraint value so that `constraints` can retrieve it later.
    c = float(np.sin(x) * np.sin(y) + 0.95)
    trial.set_user_attr("c", c)
    return float(np.sin(x) + y)


def constraints(trial: optuna.trial.FrozenTrial) -> tuple[float]:
    c = trial.user_attrs["c"]
    return (c,)


CARBOSampler = optunahub.load_module("samplers/carbo").CARBOSampler
sampler = CARBOSampler(seed=0, constraints_func=constraints)
study = optuna.create_study(sampler=sampler)
study.optimize(objective, n_trials=30)
```
Others
Notations
In this section, we use the following notations:
- $x \in [0, 1]^D$, an input vector,
- $B_\epsilon \coloneqq [-\frac{\epsilon}{2}, \frac{\epsilon}{2}]^D$, an $\epsilon$-bounding box,
- $\xi \in B_\epsilon$, an input noise,
- $f: [0, 1]^D \rightarrow \mathbb{R}$, an objective function,
- $g_c: [0, 1]^D \rightarrow \mathbb{R}$, the $c$-th constraint,
- $\text{LCB}_{h}: [0,1]^D \rightarrow \mathbb{R}$, the lower confidence bound of a function $h$,
- $\text{UCB}_{h}: [0,1]^D \rightarrow \mathbb{R}$, the upper confidence bound of a function $h$.
Suppose we would like to solve the following max-min problem: $\max_{x \in [0,1]^D} \min_{\xi \in B_\epsilon} f(x + \xi) \quad \text{subject to} \quad g_c(x + \xi) \geq 0~(\text{for } c \in \{1,2,\dots,C\}),$ where the actual input noise $\xi$ is assumed to be drawn from $B_\epsilon$ uniformly.
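To make the max-min semantics concrete, the following is a minimal sketch (assuming the toy objective and constraint from the Example section) that approximates the inner $\min_{\xi \in B_\epsilon}$ by grid sampling. The box size `epsilon`, the grid resolution, and the hard $-\infty$ feasibility rule are illustrative choices here, not the package's behavior.

```python
import numpy as np

# Toy objective and constraint from the Example section, rewritten in the
# g(x) >= 0 convention used in this section.
def f(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    return np.sin(x) + y

def g(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    return -(np.sin(x) * np.sin(y) + 0.95)  # feasible when >= 0

epsilon = 0.2  # illustrative box size
offsets = np.linspace(-epsilon / 2, epsilon / 2, 11)  # grid over B_epsilon

def robust_value(x: float, y: float) -> float:
    """min over the noise box of f; -inf if any noise realization in the
    box violates g >= 0 (a hard feasibility rule, for illustration only)."""
    xs = x + offsets[:, None]
    ys = y + offsets[None, :]
    if np.any(g(xs, ys) < 0.0):
        return -np.inf
    return float(f(xs, ys).min())

# A robustly feasible point: sin(x) * sin(y) stays near -1 inside the box.
print(robust_value(np.pi / 2, 3 * np.pi / 2))  # ~5.6
```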
Algorithm Details
1. Train Gaussian process regressors for each function $f, g_1, \dots, g_C$ using the past observations.
2. Solve the following max-min problem: $x_{\star} \in \text{arg}\max_{x \in [0,1]^D}\min_{\xi \in B_\epsilon} \text{UCB}_{f}(x+\xi) + \rho \sum_{c=1}^C [\text{UCB}_{g_c}(x+\xi)]^{-},$ where $[a]^{-} \coloneqq \min(0, a)$.
3. Solve the following minimization problem: $\xi_{\star} \in \text{arg}\min_{\xi \in B_\epsilon} \text{LCB}_{f}(x_\star+\xi) + \rho\sum_{c=1}^C [\text{LCB}_{g_c}(x_\star+\xi)]^{-}.$
4. Evaluate each function at $x = x_{\star} + \xi_{\star}$.
5. Go back to step 1.
In principle, $[\text{UCB}_{g_c}(x+\xi)]^{-}$ and $[\text{LCB}_{g_c}(x+\xi)]^{-}$ quantify the upper and lower confidence bounds of the violation amount. Please note that steps 2 and 3 are modified from the original paper because our setup assumes that the same input noise $\xi$ is used for the objective and every constraint evaluation. Also, the order of the min/max operation and the summation is flipped in our implementation. A rough sketch of the penalized acquisition in steps 2 and 3 is given below.
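The following is a minimal, self-contained sketch of steps 2 and 3, written with scikit-learn Gaussian processes and random-search inner loops purely for readability (scikit-learn is used here only for the sketch; it is not a dependency of this package). The actual sampler uses its own GP surrogates and optimizers, so everything here (the kernel defaults, candidate counts, and helper names such as `worst_case`) is an illustrative assumption.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
rho, beta, epsilon, dim = 1e3, 4.0, 0.2, 2  # illustrative values

# Fit toy GP surrogates for f and a single constraint g on random data.
X = rng.random((20, dim))
gp_f = GaussianProcessRegressor().fit(X, np.sin(X).sum(axis=1))
gp_g = GaussianProcessRegressor().fit(X, X.sum(axis=1) - 1.0)

def ucb(gp, x):  # upper confidence bound at a batch of points
    mu, sigma = gp.predict(x, return_std=True)
    return mu + np.sqrt(beta) * sigma

def lcb(gp, x):  # lower confidence bound
    mu, sigma = gp.predict(x, return_std=True)
    return mu - np.sqrt(beta) * sigma

def penalized(bound, x):
    # bound(f) + rho * [bound(g)]^-, where [a]^- = min(0, a): the second
    # term softly penalizes the estimated constraint violation.
    return bound(gp_f, x) + rho * np.minimum(0.0, bound(gp_g, x))

def worst_case(x, n_noise=64):
    # Inner min over B_epsilon in step 2, approximated by noise samples.
    xi = rng.uniform(-epsilon / 2, epsilon / 2, (n_noise, dim))
    return penalized(ucb, np.clip(x[None, :] + xi, 0.0, 1.0)).min()

# Step 2: outer max over candidate x of the worst-case penalized UCB.
candidates = rng.random((128, dim))
x_star = max(candidates, key=worst_case)

# Step 3: adversarial noise minimizing the penalized LCB at x_star.
xi = rng.uniform(-epsilon / 2, epsilon / 2, (256, dim))
vals = penalized(lcb, np.clip(x_star[None, :] + xi, 0.0, 1.0))
xi_star = xi[np.argmin(vals)]
print(x_star + xi_star)  # next point to evaluate
```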
- Package
- samplers/carbo
- Author
- Optuna Team
- License
- MIT License
- Verified Optuna version
- 4.4.0
- Last update
- 2025-08-08