Abstract
This package implements a modified Constrained Adversarially Robust Bayesian Optimization (CARBO) sampler based on the paper Constrained robust Bayesian optimization of expensive noisy black-box functions with guaranteed regret bounds.
This sampler robustly optimizes a function subject to inequality constraints when the input incurs noise.
The algorithm details are described in the Others section.
APIs
CARBOSampler(*, seed: int | None = None, independent_sampler: BaseSampler | None = None, n_startup_trials: int = 10, deterministic_objective: bool = False, constraints_func: Callable[[FrozenTrial], Sequence[float]] | None = None, rho: float = 1e3, beta: float = 4.0, local_ratio: float = 0.1, n_local_search: int = 10)
- seed: Seed for the random number generator.
- independent_sampler: Sampler used for initial sampling (for the first n_startup_trials trials) and for conditional parameters, i.e., whenever sample_independent is called. If None, a random sampler with the same seed is used.
- n_startup_trials: Number of initial trials.
- deterministic_objective: Whether the objective function is deterministic. If True, the sampler fixes the noise variance of the surrogate model to the minimum value (slightly above 0 to ensure numerical stability).
- constraints_func: An optional function that computes the objective constraints. It must take an optuna.trial.FrozenTrial and return the constraints as a sequence of floats. A value strictly larger than 0 means that a constraint is violated; a value equal to or smaller than 0 is considered feasible. If constraints_func returns more than one value for a trial, that trial is considered feasible if and only if all values are equal to 0 or smaller. The constraints_func will be evaluated after each successful trial. The function won't be called when trials fail or are pruned, but this behavior is subject to change in future releases. Currently, the constraints_func option is not supported for multi-objective optimization.
- rho: The mix-up coefficient for the acquisition function. If this value is large, the parameter suggestion puts more priority on constraints.
- beta: The coefficient for LCB and UCB. If this value is large, the parameter suggestion becomes more pessimistic, meaning that the search is inclined to explore more.
- local_ratio: The epsilon parameter in the CARBO algorithm that controls the size of W(theta). This value must be in [0, 1].
- n_local_search: How many times the local search is performed.
Note that because of the limitation of the algorithm, only non-conditional numerical parameters can be sampled by the CARBO algorithm, and categorical and conditional parameters are handled by random search, as illustrated in the sketch below.
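A minimal sketch of this fallback behavior (the objective here is hypothetical and only for illustration): the numerical parameter goes through CARBO after the startup trials, while the categorical parameter is sampled by the independent (random) sampler.

import optuna
import optunahub

CARBOSampler = optunahub.load_module("samplers/carbo").CARBOSampler

def objective(trial: optuna.Trial) -> float:
    x = trial.suggest_float("x", -5.0, 5.0)  # numerical: handled by CARBO
    # Categorical parameters fall back to the independent sampler (random search).
    activation = trial.suggest_categorical("activation", ["relu", "tanh"])
    return x**2 + (0.0 if activation == "relu" else 1.0)

study = optuna.create_study(sampler=CARBOSampler(seed=0))
study.optimize(objective, n_trials=20)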
Installation
$ pip install torch --index-url https://download.pytorch.org/whl/cpu
$ pip install scipy
Example
import numpy as np
import optuna
import optunahub


def objective(trial: optuna.Trial) -> float:
    x = trial.suggest_float("x", 0.0, 2 * np.pi)
    y = trial.suggest_float("y", 0.0, 2 * np.pi)
    c = float(np.sin(x) * np.sin(y) + 0.95)
    trial.set_user_attr("c", c)
    return float(np.sin(x) + y)


def constraints(trial: optuna.trial.FrozenTrial) -> tuple[float]:
    c = trial.user_attrs["c"]
    return (c,)


CARBOSampler = optunahub.load_module("samplers/carbo").CARBOSampler
sampler = CARBOSampler(seed=0, constraints_func=constraints)
study = optuna.create_study(sampler=sampler)
study.optimize(objective, n_trials=30)
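After optimization, the feasible trials can be inspected through the user attribute set above. A small follow-up sketch, assuming the example above has already run (recall that a constraint value less than or equal to 0 is feasible and the study minimizes by default):

feasible = [
    t
    for t in study.trials
    if t.state == optuna.trial.TrialState.COMPLETE and t.user_attrs["c"] <= 0.0
]
if feasible:
    best = min(feasible, key=lambda t: t.value)  # default direction is minimize
    print(best.params, best.value)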
Others
Please look at the GitHub README.md for the compiled version.
Notations
In this section, we use the following notations:
- $x \in [0, 1]^D$, an input vector,
- $B_\epsilon \coloneqq [-\frac{\epsilon}{2}, \frac{\epsilon}{2}]^D$, an $\epsilon$-bounding box,
- $\xi \in B_\epsilon$, an input noise,
- $f: [0, 1]^D \rightarrow \mathbb{R}$, an objective function,
- $g_c: [0, 1]^D \rightarrow \mathbb{R}$, the $c$-th constraint,
- $\text{LCB}_{h}: [0, 1]^D \rightarrow \mathbb{R}$, the lower confidence bound of a function $h$,
- $\text{UCB}_{h}: [0, 1]^D \rightarrow \mathbb{R}$, the upper confidence bound of a function $h$.
Please note that both $g_c$ and $f$ are standardized internally so that their distributions match the prior assumed by the Gaussian process.
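For reference, a common construction of these confidence bounds from the Gaussian process posterior mean $\mu_h$ and standard deviation $\sigma_h$ is shown below; the package's exact scaling may differ, but the sampler's beta argument plays the role of $\beta$ here:

$$\text{LCB}_h(x) = \mu_h(x) - \sqrt{\beta}\,\sigma_h(x), \qquad \text{UCB}_h(x) = \mu_h(x) + \sqrt{\beta}\,\sigma_h(x).$$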
Suppose we would like to solve the following max-min problem: $\max_{x \in [0, 1]^D} \min_{\xi \in B_\epsilon} f(x + \xi) \text{ subject to } g_c(x + \xi) \geq 0 \; (\text{for } c \in \{1, 2, \dots, C\}),$ where the actual input noise $\xi$ is assumed to be drawn uniformly from $B_\epsilon$.
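To make the inner minimization concrete, here is a small grid-search sketch that approximates $\min_{\xi \in B_\epsilon} f(x + \xi)$ for an arbitrary function. This is purely illustrative (the robust_value helper is hypothetical; the sampler itself estimates such quantities through the Gaussian process surrogate):

import numpy as np

def robust_value(f, x, epsilon, n_grid=11):
    # Approximate the min over the epsilon-bounding box by a grid of noise offsets.
    offsets = np.linspace(-epsilon / 2, epsilon / 2, n_grid)
    grid = np.stack(np.meshgrid(*[offsets] * x.size), axis=-1).reshape(-1, x.size)
    return min(f(np.clip(x + xi, 0.0, 1.0)) for xi in grid)

# Example: a 2-D quadratic on [0, 1]^2; the robust value is the worst case nearby.
f = lambda z: float(-np.sum((z - 0.5) ** 2))
print(robust_value(f, np.array([0.5, 0.5]), epsilon=0.1))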
Algorithm Details
1. Train Gaussian process regressors for each function $f, g_1, \dots, g_C$ using the past observations.
2. Solve the following max-min problem: $x_{\star} \in \text{arg}\max_{x \in [0, 1]^D}\min_{\xi \in B_\epsilon} \text{UCB}_{f}(x + \xi) + \rho \sum_{c = 1}^C [\text{UCB}_{g_c}(x + \xi)]^{-}$, where $[a]^{-} \coloneqq \min(0, a)$.
3. Solve the following minimization problem: $\xi_{\star} \in \text{arg}\min_{\xi \in B_\epsilon} \text{LCB}_{f}(x_\star + \xi) + \rho\sum_{c = 1}^C [\text{LCB}_{g_c}(x_\star + \xi)]^{-}$.
4. Evaluate each function at $x = x_{\star} + \xi_{\star}$.
5. Go back to 1.
In principle, $[\text{UCB}_{g_c}(x + \xi)]^{-}$ and $[\text{LCB}_{g_c}(x + \xi)]^{-}$ quantify the upper and lower confidence bounds of the violation amount. Please note that Steps 2 and 3 are modified from the original paper because our setup assumes that the same input noise $\xi$ is used for the objective and all constraint evaluations. Also, the order of the min or max operation and the summation is flipped in our implementation. A schematic sketch of one iteration follows.
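The toy sketch below mirrors Steps 2-4 but is not the package's actual implementation: the fixed-width ucb/lcb stand-ins replace the real GP posterior bounds, and random candidate search replaces the package's local search.

import numpy as np

rng = np.random.default_rng(0)
rho, epsilon, D = 1e3, 0.1, 2

# Toy functions on [0, 1]^2 and fixed-width confidence bounds (illustration only;
# the real sampler derives these bounds from fitted Gaussian processes).
def f(x): return float(np.sin(x[0]) + x[1])
def g(x): return float(np.sin(x[0]) * np.sin(x[1]) + 0.95)  # feasible if >= 0
width = 0.2
ucb = lambda h: (lambda x: h(x) + width)
lcb = lambda h: (lambda x: h(x) - width)

def penalized(bound_f, bound_gs, x):
    # Acquisition value: bound of f plus rho times the summed violation amounts [a]^-.
    return bound_f(x) + rho * sum(min(0.0, bg(x)) for bg in bound_gs)

def min_over_noise(value, x, n=64):
    # Inner adversarial minimization over the epsilon-bounding box, by sampling.
    xis = rng.uniform(-epsilon / 2, epsilon / 2, size=(n, D))
    return min(value(np.clip(x + xi, 0.0, 1.0)) for xi in xis)

# Step 2: outer max over candidate x of the inner min over noise (UCB-based).
candidates = rng.uniform(0.0, 1.0, size=(256, D))
x_star = max(
    candidates,
    key=lambda x: min_over_noise(lambda z: penalized(ucb(f), [ucb(g)], z), x),
)

# Step 3: adversarial noise minimizing the LCB-based value at x_star.
xis = rng.uniform(-epsilon / 2, epsilon / 2, size=(64, D))
xi_star = min(
    xis, key=lambda xi: penalized(lcb(f), [lcb(g)], np.clip(x_star + xi, 0.0, 1.0))
)

# Step 4: evaluate at x = x_star + xi_star (here we just print the point).
print(np.clip(x_star + xi_star, 0.0, 1.0))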
- Package: samplers/carbo
- Author: Optuna Team
- License: MIT License
- Verified Optuna version: 4.4.0
- Last update: 2025-10-01