CAVE

Meta Data

Metadata, i.e. the number of instances and parameters as well as the configuration budget. If multiple configurator runs are compared, the statistics apply to the best run.

Run with best incumbent
  # Train instances                           302       Cutoff                            300
  # Test instances                            302       Walltime budget                   172800
  # Parameters                                26        Runcount budget                   inf
  # Features                                  138       CPU budget                        inf
  # Duplicate feature vectors                 14        Deterministic                     False
  # Evaluated configurations                  2922      # Runs per config (min)           0
  # Default evaluations                       7         # Runs per config (mean)          23.8494
  # Incumbent evaluations                     2000      # Runs per config (max)           2000
  Budget spent evaluating configurations      NaN       Total number of config runs       69688
  # Changed parameters                        21
Best configuration

Comparison of the default and incumbent parameter settings. Parameters whose values differ between the default and the incumbent are listed first.

Parameter                      Default    Incumbent
---- Changed parameters ----
sp-clause-activity-inc         1          1.45757
sp-clause-decay                1.4        1.65565
sp-first-restart               100        541
sp-learned-clause-sort-heur    0          18
sp-learned-clauses-inc         1.3        1.18221
sp-learned-size-factor         0.4        1.05424
sp-orig-clause-sort-heur       0          12
sp-rand-var-dec-freq           0.001      0.005
sp-resolution                  1          0
sp-restart-inc                 1.5        1.64232
sp-use-pure-literal-rule       1          0
sp-var-activity-inc            1          0.828371
sp-var-dec-heur                0          16
sp-variable-decay              1.4        1.40302
sp-max-res-lit-inc             1          inactive
sp-max-res-runs                4          inactive
sp-rand-var-dec-scaling        1          0.781189
sp-res-cutoff-cls              8          inactive
sp-res-cutoff-lits             400        inactive
sp-res-order-heur              0          inactive
sp-rand-phase-scaling          1          0.67862
---- Unchanged parameters ----
sp-clause-del-heur             2          2
sp-phase-dec-heur              5          5
sp-update-dec-queue            1          1
sp-rand-phase-dec-freq         0.001      0.001
Performance Analysis

Different ways of analyzing the performance of the final incumbent and of the algorithm's default parameter configuration.

Performance Table

The most common way to compare the performance of algorithms on the same set of instances. The entries in the table depend on the cost metric of the configurator run. For scenarios optimizing running time, these include average runtime, penalized average runtime, and the number of timeouts. The p-value of the paired permutation test uses 10000 permutations and tests against the null hypothesis that the mean performance of default and incumbent is equal.

            Train                      Test                       p-value
            Default     Incumbent     Default     Incumbent
PAR10       659.968     11.295        608.726     3.04           0.00010
PAR1        69.902      2.355         63.362      3.04           0.00010
Timeouts    62/302      1/302         55/302      0/302          0.00010
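As a rough sketch of how such numbers arise, the following computes PAR10 (timeouts counted as 10 times the cutoff of 300 seconds, per the standard PARk definition) and a paired permutation test on hypothetical runtimes, not the report's actual measurements:

```python
import random

CUTOFF = 300          # seconds, from the scenario metadata
PENALTY_FACTOR = 10   # PAR10 counts a timeout as 10 * cutoff

def par_k(runtimes, cutoff=CUTOFF, k=PENALTY_FACTOR):
    """Penalized average runtime: runs reaching the cutoff cost k * cutoff."""
    return sum(t if t < cutoff else k * cutoff for t in runtimes) / len(runtimes)

def paired_permutation_test(costs_a, costs_b, n_perm=10000, seed=0):
    """Two-sided paired permutation test on the mean cost difference.
    Under H0 the per-instance labels (default/incumbent) are exchangeable,
    so we randomly flip the sign of each paired difference."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(costs_a, costs_b)]
    observed = abs(sum(diffs))
    hits = 0
    for _ in range(n_perm):
        s = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(s) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)   # add-one smoothing, never exactly 0

# Hypothetical per-instance runtimes (300 = timed out at the cutoff)
default_rt = [120.0, 300.0, 45.0, 300.0, 80.0]
incumbent_rt = [3.0, 12.0, 1.5, 7.0, 2.2]
print(par_k(default_rt), par_k(incumbent_rt))
print(paired_permutation_test(default_rt, incumbent_rt))
```

With only five instances the smallest achievable p-value is limited; the report's 0.00010 reflects 302 paired instances.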
Empirical Cumulative Distribution Function (eCDF)

Depicts cost distributions over the set of instances. Since these are empirical distributions, the plots show step functions. They provide insights into how well configurations perform up to a certain cost threshold. For runtime scenarios, this is the fraction of instances from the set solved within a given timeframe. The left plot shows the training data, the right plot the test data.

[Plot: training set] [Plot: test set]
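The step function behind such a plot can be computed as follows (a pure-Python sketch with made-up runtimes):

```python
def ecdf(costs):
    """Empirical CDF: sorted cost thresholds and, for each, the fraction
    of instances solved within that cost. Plotting these points as a step
    function yields the eCDF curve."""
    xs = sorted(costs)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]

# Hypothetical runtimes on five instances
points = ecdf([4.0, 1.0, 9.0, 2.5, 6.0])
# each step raises the solved fraction by 1/n at the next observed cost
```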

Scatterplot

Scatter plots show the costs of the default and optimized parameter configuration on each individual instance. Aggregated cost values in tables lose this per-instance information, so scatter plots provide a more detailed picture. They show whether overall performance improvements are explained by only a few outliers or by improvements across the entire instance set. The left plot shows the training data, the right plot the test data.

[Plot: training set] [Plot: test set]

Algorithm Footprints

The instance features are projected into a two- or three-dimensional space using principal component analysis (PCA), and the footprint of each algorithm is plotted, i.e., on which instances the default or the optimized configuration performs well. In contrast to the other analysis methods in this section, these plots provide insights into which of the two configurations performs well on specific types or clusters of instances. Inspired by Smith-Miles.
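A minimal sketch of the projection step, assuming plain PCA via SVD on the feature matrix (CAVE's exact preprocessing may differ):

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project instance feature vectors onto their first principal
    components: center the data, take the SVD, and keep the top
    right-singular vectors as the projection basis."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))   # 20 hypothetical instances, 6 features
Z = pca_project(X, 2)          # 2d coordinates for the footprint plot
```

The footprint is then drawn by coloring each projected instance by which configuration performs well on it.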

Default vs Incumbent 2d

Comparing footprints of default (left) and incumbent (right) in two-dimensional feature space.

[Plot: default] [Plot: incumbent]

Default 3d

Projection of the feature space into three dimensions, shown from different viewpoints to aid interpretation.

[Plots: four viewpoints of the 3d default footprint]

Incumbent 3d

Projection of the feature space into three dimensions, shown from different viewpoints to aid interpretation.

[Plots: four viewpoints of the 3d incumbent footprint]

Configurator's Behavior

Analysis of the trajectory and the runhistory returned by a configurator to gain insights into how the configurator tried to find a well-performing configuration.

Configurator Footprint

Analysis of the iteratively sampled configurations during the optimization procedure. Multi-dimensional scaling (MDS) is used to reduce the dimensionality of the search space and plot the distribution of evaluated configurations. The larger the dot, the more often the configuration was evaluated on instances from the set. Configurations that were incumbents at least once during optimization are marked as red squares. Configurations acquired through local search are marked with an 'x'. The downward triangle denotes the final incumbent, whereas the orange upward triangle denotes the default configuration. The heatmap and the colorbar correspond to the predicted performance in that part of the search space.

Interactive
  [Plot]
Static
  [Plots]
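The MDS step can be sketched with classical (Torgerson) MDS over a configuration distance matrix; the actual distance measure CAVE uses over mixed parameter types is not reproduced here:

```python
import numpy as np

def classical_mds(D, n_components=2):
    """Classical (Torgerson) MDS: embed points so that their pairwise
    Euclidean distances approximate the given distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered Gram matrix
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:n_components]   # largest eigenvalues first
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))

# Hypothetical distances between four configurations at square corners
P = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
D = np.linalg.norm(P[:, None] - P[None], axis=2)
Z = classical_mds(D)   # 2d coordinates whose distances reproduce D
```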

Cost over time

Depicts the average cost of the best configuration found so far (using all trajectory data) over the time spent by the configurator (including target algorithm runs and the overhead generated by the configurator). If the curve flattens out early, too much time may have been allotted to the configurator run; a curve that is still improving at the end of the budget indicates that the configuration budget should be increased. The plotted standard deviation gives the uncertainty over multiple configurator runs.
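The best-so-far curve itself is just a running minimum over the trajectory; a sketch assuming a hypothetical (time, cost) trajectory format:

```python
def cost_over_time(trajectory):
    """trajectory: list of (wallclock_time, cost) entries in the order the
    configurator reported them. Returns the best-so-far cost at each time,
    which plotted as a step function gives the cost-over-time curve."""
    best = float("inf")
    curve = []
    for t, cost in trajectory:
        best = min(best, cost)
        curve.append((t, best))
    return curve
```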

Parallel Coordinates

Previously used by Golovin et al. to study the frequency of chosen parameter settings in black-box optimization. Each line corresponds to one configuration in the runhistory and shows the parameter settings and the corresponding (estimated) average cost. To handle large configuration spaces with hundreds of parameters, only the (at most) 10 most important parameters, based on a fANOVA parameter importance analysis, are plotted. To emphasize better configurations, the performance is encoded in the color of each line, ranging from blue to red. These plots provide insights into whether the configurator focused on specific parameter values and how these values correlate with cost.

[Plot]
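The core data transform behind a parallel-coordinates plot is per-axis min-max scaling, so that every parameter axis shares the range [0, 1]; a sketch with hypothetical configurations:

```python
def minmax_scale(column):
    """Scale one parameter's values to [0, 1] so all axes share a range."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.5 for _ in column]   # constant axis: draw in the middle
    return [(v - lo) / (hi - lo) for v in column]

# Each row is one configuration: (param values..., cost); hypothetical data
configs = [(100, 1.4, 12.0), (541, 1.65, 3.0), (300, 1.5, 6.5)]
axes = list(zip(*configs))                  # transpose to per-axis columns
scaled = [minmax_scale(ax) for ax in axes]  # one polyline per configuration
```

The last axis (cost) also drives the blue-to-red color of each line.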
Parameter Importance

Parameter importance analysis to determine which parameters most influence the analyzed algorithm's performance.

Importance Table

Parameters are sorted by their average importance value. Note that the values are not directly comparable, since the different techniques provide different metrics (see the respective tooltips for details).

Parameter                      fANOVA    Ablation    LPI
sp-var-dec-heur                65.06     73.90       91.36
sp-orig-clause-sort-heur       1.31      21.94       -
sp-phase-dec-heur              5.94      -           -
sp-restart-inc                 -         1.44        4.05
sp-first-restart               -         -           1.59
sp-learned-clause-sort-heur    1.12      2.02        -
sp-variable-decay              -         -           1.50
fANOVA

fANOVA (functional analysis of variance) computes, for each parameter (and for pairs of parameters), the fraction of the variance in the cost space that is explained by changing that parameter, marginalized over all other parameters. Parameters with high importance scores have a large impact on performance. To this end, a random forest is trained as an empirical performance model on the data from the available runhistories.
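A toy illustration of the single-parameter importance on a full grid of two hypothetical parameters (real fANOVA predicts costs with the random forest instead of enumerating a grid):

```python
import itertools
import statistics

# Hypothetical parameter domains and an additive toy cost function
values = {"a": [0, 1, 2], "b": [0, 1]}
cost = lambda a, b: 10 * a + b          # parameter "a" dominates the cost

grid = [cost(a, b) for a, b in itertools.product(values["a"], values["b"])]
total_variance = statistics.pvariance(grid)

def importance(param):
    """Variance of the marginal mean cost (averaged over the other
    parameter) as a fraction of the total variance over the grid."""
    marginal_means = []
    for v in values[param]:
        pts = [cost(a, b)
               for a, b in itertools.product(values["a"], values["b"])
               if {"a": a, "b": b}[param] == v]
        marginal_means.append(statistics.fmean(pts))
    return statistics.pvariance(marginal_means) / total_variance

print(importance("a"), importance("b"))  # "a" explains almost all variance
```

For an additive cost like this one, the two single-parameter fractions sum to 1; with interactions, the remainder is attributed to higher-order terms.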

Importance
  ---- Single importance ----
  sp-var-dec-heur      65.0623
  sp-phase-dec-heur    5.9379
Marginals
  sp-var-dec-heur: [Plot]
  sp-phase-dec-heur: [Plot]
Ablation

Ablation analysis determines parameter importance by comparing two parameter configurations, typically the default and the optimized configuration. It uses a greedy forward search to determine the order in which to flip parameter settings from the default to the incumbent such that each step decreases the cost maximally.

[Plots]
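The greedy forward search can be sketched as follows, with a hypothetical separable surrogate cost standing in for real target-algorithm runs:

```python
def ablation_path(default, incumbent, cost):
    """Greedy forward search from default to incumbent: at each step flip
    the single remaining parameter whose change reduces cost the most.
    Returns the flip order with the cost after each flip."""
    current = dict(default)
    path = []
    remaining = [p for p in default if default[p] != incumbent[p]]
    while remaining:
        best_p = min(remaining,
                     key=lambda p: cost({**current, p: incumbent[p]}))
        current[best_p] = incumbent[best_p]
        remaining.remove(best_p)
        path.append((best_p, cost(current)))
    return path

# Hypothetical 3-parameter configuration space and surrogate cost
default = {"x": 0, "y": 0, "z": 0}
incumbent = {"x": 1, "y": 1, "z": 0}
cost = lambda c: 10.0 - 6 * c["x"] - 2 * c["y"]
path = ablation_path(default, incumbent, cost)   # "x" is flipped first
```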

Local Parameter Importance (LPI)

Using an empirical performance model, the performance changes of a configuration along each parameter are calculated. To quantify the importance of a parameter, the variance of the predicted cost values obtained by varying that parameter is computed, along with its fraction of the summed variances over all parameters. This analysis is inspired by the human tendency to look for improvements in the neighborhood of individual parameters of a configuration.

sp-var-dec-heur
[Plot]
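A minimal LPI sketch, with a toy quadratic surrogate in place of the random forest performance model (parameter names are hypothetical):

```python
import statistics

incumbent = {"p1": 0.5, "p2": 0.5}
surrogate = lambda c: 100 * (c["p1"] - 0.3) ** 2 + (c["p2"] - 0.5) ** 2

def local_parameter_importance(incumbent, surrogate, grid=11):
    """Vary one parameter over [0, 1] while fixing the others at the
    incumbent; report each parameter's share of the summed variances."""
    variances = {}
    for p in incumbent:
        costs = [surrogate({**incumbent, p: i / (grid - 1)})
                 for i in range(grid)]
        variances[p] = statistics.pvariance(costs)
    total = sum(variances.values())
    return {p: v / total for p, v in variances.items()}

lpi = local_parameter_importance(incumbent, surrogate)
```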
Feature Analysis

Analysis of the instance features to gain insights into the instance set that was used during the optimization.

Feature Importance

Reduction of the out-of-the-bag root mean squared error of the random forest empirical performance model by applying forward selection on the set of instance features. Using this method, we can identify a set of instance features that suffices to obtain prediction accuracy comparable to using the full set of features.

Feature                    Error
None                       0.989727
nclausesOrig               0.225080
Pre_featuretime            0.186257
vars_reduced_depth_64      0.181692
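A sketch of the forward-selection loop, substituting a least-squares fit and its training RMSE for the report's random-forest out-of-bag error (feature names and data are synthetic):

```python
import numpy as np

def forward_selection(X, y, names, n_keep=3):
    """Greedy forward selection: repeatedly add the feature that most
    reduces the RMSE of a least-squares fit with an intercept."""
    chosen, results = [], []
    for _ in range(n_keep):
        best = None
        for j in range(X.shape[1]):
            if j in chosen:
                continue
            cols = chosen + [j]
            A = np.column_stack([X[:, cols], np.ones(len(y))])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            rmse = float(np.sqrt(np.mean((A @ coef - y) ** 2)))
            if best is None or rmse < best[1]:
                best = (j, rmse)
        chosen.append(best[0])
        results.append((names[best[0]], best[1]))
    return results

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
y = 3 * X[:, 2] + 0.1 * rng.normal(size=50)   # feature "f2" dominates
res = forward_selection(X, y, ["f0", "f1", "f2", "f3"], n_keep=2)
```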
forward-selection-barplot
Plot
forward-selection-chng
Plot
Violin and Box Plots

Box and violin plots show the distribution of each feature value across the instances. Box plots show the quantiles of the distribution, and violin plots show the approximated probability density of the feature values. Such plots are useful to inspect the instances and to detect their characteristics. For example, if a distribution has two or more modes, this could indicate that the instance set is heterogeneous, which can cause problems in combination with the racing strategies configurators typically use. NaN values are removed from the data.
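The quantities a box plot displays, with NaN removal as described, can be computed as:

```python
import statistics

def box_stats(values):
    """Summary statistics a box plot displays, after dropping NaNs."""
    clean = sorted(v for v in values if v == v)   # NaN != NaN drops NaNs
    q1, med, q3 = statistics.quantiles(clean, n=4)
    return {"min": clean[0], "q25": q1, "median": med,
            "q75": q3, "max": clean[-1]}
```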

[Box/violin plots, one per instance feature:]
BINARYp, Basic_featuretime, CG_coeff_variation, CG_entropy, CG_featuretime,
CG_max, CG_mean, CG_min, DIAMETER_coeff_variation, DIAMETER_entropy,
DIAMETER_featuretime, DIAMETER_max, DIAMETER_mean, DIAMETER_min,
HORNY_VAR_coeff_variation, HORNY_VAR_entropy, HORNY_VAR_max, HORNY_VAR_mean,
HORNY_VAR_min, KLB_featuretime, LPSLack_coeff_variation, LPSLack_max,
LPSLack_mean, LPSLack_min, LP_OBJ, POSNEG_RATIO_CLAUSE_coeff_variation,
POSNEG_RATIO_CLAUSE_entropy, POSNEG_RATIO_CLAUSE_max,
POSNEG_RATIO_CLAUSE_mean, POSNEG_RATIO_CLAUSE_min, POSNEG_RATIO_VAR_entropy,
POSNEG_RATIO_VAR_max, POSNEG_RATIO_VAR_mean, POSNEG_RATIO_VAR_min,
POSNEG_RATIO_VAR_stdev, Pre_featuretime, SP_bias_coeff_variation,
SP_bias_max, SP_bias_mean, SP_bias_min, SP_bias_q10, SP_bias_q25,
SP_bias_q50, SP_bias_q75, SP_bias_q90, SP_unconstraint_coeff_variation,
SP_unconstraint_max, SP_unconstraint_mean, SP_unconstraint_min,
SP_unconstraint_q10, SP_unconstraint_q25, SP_unconstraint_q50,
SP_unconstraint_q75, SP_unconstraint_q90, TRINARYp, UNARY,
VCG_CLAUSE_coeff_variation, VCG_CLAUSE_entropy, VCG_CLAUSE_max,
VCG_CLAUSE_mean, VCG_CLAUSE_min, VCG_VAR_coeff_variation, VCG_VAR_entropy,
VCG_VAR_max, VCG_VAR_mean, VCG_VAR_min, VG_coeff_variation, VG_max, VG_mean,
VG_min, cl_featuretime, cl_num_coeff_variation, cl_num_max, cl_num_mean,
cl_num_min, cl_num_q10, cl_num_q25, cl_num_q50, cl_num_q75, cl_num_q90,
cl_size_coeff_variation, cl_size_max, cl_size_mean, cl_size_min, cl_size_q10,
cl_size_q25, cl_size_q50, cl_size_q75, cl_size_q90,
cluster_coeff_coeff_variation, cluster_coeff_entropy, cluster_coeff_max,
cluster_coeff_mean, cluster_coeff_min, gsat_BestAvgImprovement_CoeffVariance,
gsat_BestAvgImprovement_Mean, gsat_BestSolution_CoeffVariance,
gsat_BestSolution_Mean, gsat_FirstLocalMinRatio_CoeffVariance,
gsat_FirstLocalMinRatio_Mean, gsat_FirstLocalMinStep_CoeffVariance,
gsat_FirstLocalMinStep_Mean, gsat_FirstLocalMinStep_Median,
gsat_FirstLocalMinStep_Q10, gsat_FirstLocalMinStep_Q90,
horn_clauses_fraction, lobjois_featuretime,
lobjois_log_num_nodes_over_vars, lobjois_mean_depth_over_vars, lpIntRatio,
lpTIME, ls_gsat_featuretime, ls_saps_featuretime, nclauses, nclausesOrig,
nvars, nvarsOrig, reducedClauses, reducedVars,
saps_BestAvgImprovement_CoeffVariance, saps_BestAvgImprovement_Mean,
saps_BestSolution_CoeffVariance, saps_BestSolution_Mean,
saps_FirstLocalMinRatio_CoeffVariance, saps_FirstLocalMinRatio_Mean,
saps_FirstLocalMinStep_CoeffVariance, saps_FirstLocalMinStep_Mean,
saps_FirstLocalMinStep_Median, saps_FirstLocalMinStep_Q10,
saps_FirstLocalMinStep_Q90, sp_featuretime, unit_featuretime,
vars_clauses_ratio, vars_reduced_depth_1, vars_reduced_depth_16,
vars_reduced_depth_256, vars_reduced_depth_4, vars_reduced_depth_64
Correlation

Correlation of features based on the Pearson product-moment correlation. Since instance features are used to train an empirical performance model in model-based configurators, it can be important, depending on the machine-learning algorithm, to remove correlated features in a pre-processing step. Darker fields correspond to a larger correlation between the features.

[Plot]
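The matrix behind such a heatmap is just the Pearson correlation over the feature rows; a sketch with synthetic features, one pair deliberately near-collinear:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
f1 = rng.normal(size=n)
f2 = 2 * f1 + 0.05 * rng.normal(size=n)   # nearly collinear with f1
f3 = rng.normal(size=n)                   # independent feature
features = np.vstack([f1, f2, f3])

# Pearson product-moment correlation matrix (what the heatmap shows);
# strongly correlated pairs are candidates for removal in pre-processing
C = np.corrcoef(features)
```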
Clustering

Clustering of the instances in 2d; the color encodes the cluster assigned to each instance. Similar to ISAC, we use k-means to cluster the instances in the feature space. As pre-processing, we apply standard scaling and a PCA to 2 dimensions. To choose the number of clusters, we use the silhouette score over the range of 2 to 12 clusters.

[Plot]
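The cluster-count selection can be sketched with a plain numpy implementation of k-means and the silhouette score (synthetic 2d features and a smaller k range than the report's 2 to 12, for brevity):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Plain Lloyd's algorithm (a stand-in for the k-means step above)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array(
            [X[labels == j].mean(axis=0) if np.any(labels == j)
             else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def silhouette_score(X, labels):
    """Mean silhouette coefficient; singleton clusters score 0."""
    scores = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        own = (labels == labels[i]) & (np.arange(len(X)) != i)
        if not own.any():
            scores.append(0.0)
            continue
        a = d[own].mean()                          # mean intra-cluster dist
        b = min(d[labels == c].mean()              # nearest other cluster
                for c in set(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Two well-separated synthetic "instance clusters" in a 2d feature space
rng = np.random.default_rng(3)
X2d = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(10, 0.1, (20, 2))])
best_k = max(range(2, 6),
             key=lambda k: silhouette_score(X2d, kmeans(X2d, k)))
```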