pytabkit.bench.eval package
Submodules
pytabkit.bench.eval.analysis module
- class pytabkit.bench.eval.analysis.ResultsTables
Bases:
object
- get(coll_name, n_cv=1, tag='paper')
- Parameters:
coll_name (str)
n_cv (int)
tag (str)
- Return type:
- pytabkit.bench.eval.analysis.get_benchmark_results(paths, table, coll_name, use_relative_score=True, return_percentages=True, val_metric_name=None, test_metric_name=None, rel_alg_name='BestModel', use_ranks=False, use_normalized_errors=False, use_grinnorm_errors=False, use_task_mean=True, use_geometric_mean=True, shift_eps=0.01, filter_alg_names_list=None, simplify_name_fn=None, n_splits=10, use_validation_errors=False)
- Parameters:
paths (Paths)
table (MultiResultsTable)
coll_name (str)
use_relative_score (bool)
return_percentages (bool)
val_metric_name (str | None)
test_metric_name (str | None)
rel_alg_name (str)
use_ranks (bool)
use_normalized_errors (bool)
use_grinnorm_errors (bool)
use_task_mean (bool)
use_geometric_mean (bool)
shift_eps (float)
filter_alg_names_list (List[str] | None)
simplify_name_fn (Callable[[str], str] | None)
n_splits (int)
use_validation_errors (bool)
- Return type:
Tuple[Dict[str, float | ndarray], Dict[str, Tuple[float | ndarray, float | ndarray]]]
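The use_geometric_mean and shift_eps parameters suggest that per-task errors can be aggregated with a shifted geometric mean. The following is a minimal sketch of that aggregation under this assumption; the helper shifted_geometric_mean is hypothetical and not part of pytabkit, and the library may differ in details (for example, whether the shift is subtracted again afterwards).

```python
import math
from typing import Sequence


def shifted_geometric_mean(errors: Sequence[float], shift_eps: float = 0.01) -> float:
    """Geometric mean of (error + shift_eps) over tasks.

    Hypothetical helper for illustration only. The shift keeps the
    logarithm finite when an error is exactly zero.
    """
    if not errors:
        raise ValueError("errors must be non-empty")
    log_sum = sum(math.log(e + shift_eps) for e in errors)
    return math.exp(log_sum / len(errors))


# Example: three per-task errors, one of which is exactly zero.
task_errors = [0.10, 0.02, 0.0]
print(shifted_geometric_mean(task_errors, shift_eps=0.01))
```

Compared with an arithmetic mean, this aggregation is less dominated by tasks with large absolute errors, and shift_eps limits the influence of errors very close to zero.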
- pytabkit.bench.eval.analysis.get_display_name(alg_name)
- Parameters:
alg_name (str)
- Return type:
str
- pytabkit.bench.eval.analysis.get_ensemble_groups(task_type_name)
Generates groups of methods that should be evaluated.
- Parameters:
task_type_name (str) – ‘class’ or ‘reg’
- Returns:
A dict of lists {alg_group_name: [alg_name_1, alg_name_2, …]}
- Return type:
Dict[str, List[str]]
- pytabkit.bench.eval.analysis.get_opt_groups(task_type_name)
Generates groups of methods that should be evaluated.
- Parameters:
task_type_name (str) – ‘class’ or ‘reg’
- Returns:
A dict of lists {alg_group_name: [alg_name_1, alg_name_2, …]}
- Return type:
Dict[str, List[str]]
- pytabkit.bench.eval.analysis.get_simplified_name(alg_name)
- Parameters:
alg_name (str)
- pytabkit.bench.eval.analysis.get_t_mean_confidence_interval(values)
- Parameters:
values (ndarray)
- Return type:
Tuple[ndarray, ndarray]
pytabkit.bench.eval.colors module
- pytabkit.bench.eval.colors.bilin_int(x, values)
- Parameters:
x (float)
values (List[Tuple[float, float]])
- Return type:
float
- pytabkit.bench.eval.colors.bisection_find(f, y, xmin, xmax, n=50)
- Parameters:
f (Callable[[float], float])
y (float)
xmin (float)
xmax (float)
- Return type:
float
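The signature of bisection_find suggests a bisection search for an x in [xmin, xmax] with f(x) = y, using n halving steps. The sketch below illustrates that technique under the assumption that f is monotonically increasing on the interval; bisection_find_sketch is a hypothetical stand-in, not the library's implementation.

```python
from typing import Callable


def bisection_find_sketch(f: Callable[[float], float], y: float,
                          xmin: float, xmax: float, n: int = 50) -> float:
    """Approximate x in [xmin, xmax] with f(x) == y.

    Assumes f is monotonically increasing on [xmin, xmax] (an assumption
    of this sketch, not a documented property of the library function).
    """
    lo, hi = xmin, xmax
    for _ in range(n):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid  # the solution lies in the upper half
        else:
            hi = mid  # the solution lies in the lower half
    return 0.5 * (lo + hi)


# Example: invert x -> x**2 on [0, 2] to approximate sqrt(2).
root = bisection_find_sketch(lambda x: x * x, 2.0, 0.0, 2.0)
print(root)
```

Each step halves the search interval, so n=50 steps shrink it by a factor of 2**50.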
- pytabkit.bench.eval.colors.more_percep_uniform_hue(x)
Returns a hue value that should change somewhat perceptually uniformly with x.
- Parameters:
x (float) – a value between 0 and 1
- Returns:
Hue value for HSV space
- Return type:
float
pytabkit.bench.eval.evaluation module
- class pytabkit.bench.eval.evaluation.AlgFilter
Bases:
object
- class pytabkit.bench.eval.evaluation.AlgTaskTable
Bases:
object
- __init__(alg_names, task_infos, alg_task_results)
- Parameters:
alg_names (List[str])
task_infos (List[TaskInfo])
alg_task_results (List[List[Any]])
- filter_algs(alg_names)
- Parameters:
alg_names (List[str])
- Return type:
- filter_n_splits(n_splits)
Limits the number of split results to n_splits and removes all algs for which there is a task with fewer than n_splits split results.
- Parameters:
n_splits (int)
- Return type:
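The filtering rule described for filter_n_splits can be sketched on plain Python containers. The data layout below (a dict mapping each alg name to a list of per-task lists of split results) and the function name filter_n_splits_sketch are assumptions made for illustration; AlgTaskTable stores its data differently.

```python
from typing import Dict, List


def filter_n_splits_sketch(results: Dict[str, List[List[float]]],
                           n_splits: int) -> Dict[str, List[List[float]]]:
    """Keep only algs with at least n_splits results on every task,
    and truncate each task's results to the first n_splits entries.

    Hypothetical layout: results[alg_name][task_idx] is a list of
    per-split results.
    """
    filtered = {}
    for alg_name, task_results in results.items():
        # Drop the alg if any task has too few split results.
        if any(len(splits) < n_splits for splits in task_results):
            continue
        filtered[alg_name] = [splits[:n_splits] for splits in task_results]
    return filtered


results = {
    "alg_a": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    "alg_b": [[0.1, 0.2, 0.3], [0.4]],  # second task has only one split
}
print(filter_n_splits_sketch(results, n_splits=2))
```

Here alg_b is removed because one of its tasks has fewer than two split results, and alg_a keeps only the first two results per task.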
- map(f)
- rename_algs(f)
- Parameters:
f (Callable[[str], str])
- Return type:
- to_array()
- Return type:
ndarray
- class pytabkit.bench.eval.evaluation.ArrayTableAnalyzer
Bases:
TableAnalyzer
Intermediate class that analyzes using the same number of splits for each method
- __init__(f=None, use_weighting=False, separate_task_names=None, post_f=None)
- Parameters:
separate_task_names (List[str] | None)
- print_analysis(alg_task_table, val_table=None)
- Parameters:
alg_task_table (AlgTaskTable)
val_table (AlgTaskTable | None)
- Return type:
None
- class pytabkit.bench.eval.evaluation.DefaultEvalModeSelector
Bases:
EvalModeSelector
- select_eval_modes(eval_modes)
- Parameters:
eval_modes (List[Tuple[str, str, str]])
- Return type:
List[Tuple[str, Tuple[str, str, str]]]
- class pytabkit.bench.eval.evaluation.EvalModeSelector
Bases:
object
- select(alg_name, task_results)
- Parameters:
alg_name (str)
task_results (List)
- Return type:
Tuple[List[str], List[List]]
- select_eval_modes(eval_modes)
- Parameters:
eval_modes (List[Tuple[str, str, str]])
- Return type:
List[Tuple[str, Tuple[str, str, str]]]
- class pytabkit.bench.eval.evaluation.GreedyAlgSelectionTableAnalyzer
Bases:
ArrayTableAnalyzer
Greedily selects a portfolio of methods: each step adds the method whose addition most improves the best performance in the portfolio.
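The greedy portfolio selection described for GreedyAlgSelectionTableAnalyzer can be illustrated as follows. This is a minimal sketch under stated assumptions: lower errors are better, portfolio performance is the mean over tasks of the best error among selected methods, and the function name greedy_portfolio and the dict-of-lists layout are hypothetical rather than taken from the library.

```python
from typing import Dict, List


def greedy_portfolio(errors: Dict[str, List[float]], max_size: int) -> List[str]:
    """Greedily build a portfolio of methods.

    errors[name][task_idx] is the error of a method on a task (lower is
    better). At each step, add the method that yields the lowest mean of
    per-task best errors within the portfolio.
    """
    n_tasks = len(next(iter(errors.values())))
    selected: List[str] = []
    best = [float("inf")] * n_tasks  # best error per task so far

    while len(selected) < min(max_size, len(errors)):
        def portfolio_score(name: str) -> float:
            # Mean per-task best error if `name` were added.
            return sum(min(b, e) for b, e in zip(best, errors[name])) / n_tasks

        candidates = [name for name in errors if name not in selected]
        choice = min(candidates, key=portfolio_score)
        selected.append(choice)
        best = [min(b, e) for b, e in zip(best, errors[choice])]
    return selected


errors = {"A": [0.1, 0.6], "B": [0.2, 0.3], "C": [0.5, 0.1]}
print(greedy_portfolio(errors, max_size=2))  # ['B', 'C']
```

In the example, B has the lowest mean error on its own, and C is added next because it complements B on the second task more than A does on the first.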
- class pytabkit.bench.eval.evaluation.MeanTableAnalyzer
Bases:
TableAnalyzer
- __init__(f=None, use_weighting=False, separate_task_names=None, post_f=None)
- Parameters:
separate_task_names (List[str] | None)
- get_intervals(alg_task_table, std_factor=2.0)
- Parameters:
alg_task_table (AlgTaskTable)
std_factor (float)
- Return type:
List[Tuple[float, float]]
- get_means(alg_task_table)
- Parameters:
alg_task_table (AlgTaskTable)
- Return type:
List[float]
- print_analysis(alg_task_table)
- Parameters:
alg_task_table (AlgTaskTable)
- Return type:
None
- class pytabkit.bench.eval.evaluation.MultiResultsTable
Bases:
object
- __init__(train_table, val_table, test_table, alg_tags, alg_configs)
- Parameters:
train_table (AlgTaskTable)
val_table (AlgTaskTable)
test_table (AlgTaskTable)
alg_tags (List[List[str]])
alg_configs (List[Dict[str, Any]])
- get_test_results_table(eval_mode_selector, val_metric_name=None, test_metric_name=None, alg_group_dict=None, val_test_groups=None, use_validation_errors=False, use_train_errors=False)
- Parameters:
eval_mode_selector (EvalModeSelector) – Decides how to select results from the different available ensembled/bagged results and how to name them
val_metric_name (str | None) – Name of the validation metric (used for optimizing over multiple algorithms)
test_metric_name (str | None) – Name of the test metric
alg_group_dict (Dict[str, AlgFilter] | None) – Optional dictionary of name: alg_filter. For each such pair, an additional algorithm with the given name will be added to the resulting table. Its results are computed as follows: On each split of each task, out of all the algorithms where the alg_filter returns True, the one with the best validation error is chosen, and then its test error is used.
val_test_groups (Dict[str, Dict[str, str]] | None) – Similar to alg_group_dict, but allows a different alg to be used for the test score than the one selected by validation error. Specifically, for each name: pairs in val_test_groups.items(), the key of pairs with the best validation error is determined, and the test score of the alg that pairs maps this key to is returned.
use_validation_errors (bool) – If True, use validation errors instead of test errors.
use_train_errors (bool) – If True, use train errors instead of test errors.
- Returns:
- Return type:
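The selection rules described for alg_group_dict and val_test_groups can be sketched on nested lists. The layout val[alg][task][split] and the function names below are assumptions made for illustration; the library operates on AlgTaskTable objects and AlgFilter instances instead.

```python
from typing import Dict, List

Errors = Dict[str, List[List[float]]]  # errors[alg][task_idx][split_idx]


def select_by_validation(val: Errors, test: Errors,
                         pairs: Dict[str, str]) -> List[List[float]]:
    """For each task and split, find the key of `pairs` with the lowest
    validation error and return the test error of the alg it maps to.
    """
    keys = list(pairs)
    result = []
    for t in range(len(val[keys[0]])):
        row = []
        for s in range(len(val[keys[0]][t])):
            best_key = min(keys, key=lambda a: val[a][t][s])
            row.append(test[pairs[best_key]][t][s])
        result.append(row)
    return result


def best_of_group(val: Errors, test: Errors, members: List[str]) -> List[List[float]]:
    """alg_group_dict-style selection: the alg chosen by validation error
    is also the one whose test error is reported."""
    return select_by_validation(val, test, {name: name for name in members})


val = {"X": [[0.1, 0.4]], "Y": [[0.2, 0.3]]}
test = {"X": [[0.15, 0.5]], "Y": [[0.25, 0.35]]}
print(best_of_group(val, test, ["X", "Y"]))  # [[0.15, 0.35]]
```

On the first split X has the better validation error, so its test error is used; on the second split Y is chosen. Because the choice is made per split and per task, the resulting virtual algorithm can be better than any single member on the test set, but it never uses test errors for the selection itself.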
- static load(task_collection, n_cv, paths, alg_filter=None, split_type='random-split', max_n_splits=None, max_n_algs=None)
- Parameters:
task_collection (TaskCollection)
n_cv (int)
paths (Paths)
alg_filter (AlgFilter | None)
max_n_splits (int | None)
max_n_algs (int | None)
- class pytabkit.bench.eval.evaluation.NormalizedLossTableAnalyzer
Bases:
ArrayTableAnalyzer
- class pytabkit.bench.eval.evaluation.RankTableAnalyzer
Bases:
ArrayTableAnalyzer
- class pytabkit.bench.eval.evaluation.TableAnalyzer
Bases:
object
- __init__(post_f=None)
- Parameters:
post_f (Callable[[float], float] | None)
- print_analysis(alg_task_table)
- Parameters:
alg_task_table (AlgTaskTable)
- class pytabkit.bench.eval.evaluation.TaskWeighting
Bases:
object
- __init__(task_infos, separate_task_names)
Computes a weighting of tasks, downweighting tasks that have similar tasks.
- Parameters:
task_infos (List[TaskInfo]) – Task infos.
separate_task_names (List[str] | None) – Names of tasks that should not be grouped together with other tasks
- get_n_groups()
- Return type:
int
- get_task_weights()
- Return type:
ndarray
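One way to downweight tasks that have similar tasks is to give each task the weight 1 / (size of its group), so that every group of similar tasks contributes a total weight of 1. The sketch below illustrates this idea only; the grouping criterion (a shared name prefix before the first underscore) and the function task_weights_sketch are hypothetical, since the source does not state how TaskWeighting decides which tasks are similar.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple


def task_weights_sketch(
    task_names: List[str],
    separate_task_names: Optional[List[str]] = None,
    group_key: Callable[[str], str] = lambda name: name.split("_")[0],
) -> Tuple[List[float], int]:
    """Return (weights, n_groups) with weight = 1 / group size.

    group_key is a hypothetical similarity criterion (name prefix).
    Tasks listed in separate_task_names always form their own group.
    """
    separate = set(separate_task_names or [])
    keys = [
        ("single", name) if name in separate else ("group", group_key(name))
        for name in task_names
    ]
    counts = Counter(keys)
    weights = [1.0 / counts[key] for key in keys]
    return weights, len(counts)


names = ["kddcup_a", "kddcup_b", "adult"]
print(task_weights_sketch(names))                # ([0.5, 0.5, 1.0], 2)
print(task_weights_sketch(names, ["kddcup_b"]))  # ([1.0, 1.0, 1.0], 3)
```

In this sketch, the returned weights play the role of get_task_weights() and the group count plays the role of get_n_groups().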
- class pytabkit.bench.eval.evaluation.WinsTableAnalyzer
Bases:
ArrayTableAnalyzer
- pytabkit.bench.eval.evaluation.alg_comparison_str(alg_task_table, alg_names)
- Parameters:
alg_task_table (AlgTaskTable)
alg_names (List[str])
- pytabkit.bench.eval.evaluation.alg_results_str(alg_task_table, alg_name)
- Parameters:
alg_task_table (AlgTaskTable)
alg_name (str)
- pytabkit.bench.eval.evaluation.get_ranks(values)
- Parameters:
values (ndarray)
- Return type:
ndarray
pytabkit.bench.eval.plotting module
- pytabkit.bench.eval.plotting.coll_name_to_title(coll_name)
- Parameters:
coll_name (str)
- Return type:
str
- pytabkit.bench.eval.plotting.extend_runtimes(times, task_type_name, keep_gpu=True)
- Parameters:
times (Dict[str, float])
task_type_name (str)
keep_gpu (bool)
- Return type:
Dict[str, float]
- pytabkit.bench.eval.plotting.get_equidistant_blue_colors(n)
- Parameters:
n (int)
- pytabkit.bench.eval.plotting.get_equidistant_colors(n)
- Parameters:
n (int)
- pytabkit.bench.eval.plotting.get_plot_color(alg_name)
- Parameters:
alg_name (str)
- pytabkit.bench.eval.plotting.get_plot_color_idx(alg_name)
- Parameters:
alg_name (str)
- pytabkit.bench.eval.plotting.gg_color_hue(n, saturation=1.0, value=0.65)
- Parameters:
n (int)
saturation (float)
value (float)
- pytabkit.bench.eval.plotting.plot_benchmark_bars(paths, tables, filename=None, coll_names=None, val_metric_name=None, test_metric_name=None, alg_names=None, simplify_name_fn=None, use_geometric_mean=True, shift_eps=0.01)
- Parameters:
paths (Paths)
tables (ResultsTables)
filename (str | None)
coll_names (List[str] | None)
val_metric_name (str | None)
test_metric_name (str | None)
alg_names (List[str] | None)
simplify_name_fn (Callable[[str], str] | None)
use_geometric_mean (bool)
shift_eps (float)
- pytabkit.bench.eval.plotting.plot_cdd(paths, tables, coll_names, alg_names, val_metric_name=None, test_metric_name=None, filename=None, tag=None, use_validation_errors=False)
- Parameters:
paths (Paths)
tables (ResultsTables)
coll_names (List[str])
alg_names (List[str])
val_metric_name (str | None)
test_metric_name (str | None)
filename (str | None)
tag (str | None)
use_validation_errors (bool)
- pytabkit.bench.eval.plotting.plot_cdd_ax(ax, paths, tables, coll_name, alg_names, val_metric_name=None, test_metric_name=None, tag=None, use_validation_errors=False)
- Parameters:
ax (Axes)
paths (Paths)
tables (ResultsTables)
coll_name (str)
alg_names (List[str])
val_metric_name (str | None)
test_metric_name (str | None)
tag (str | None)
use_validation_errors (bool)
- pytabkit.bench.eval.plotting.plot_cumulative_ablations(paths, tables, filename=None, val_metric_name=None, test_metric_name=None, use_geometric_mean=True, shift_eps=0.01)
- Parameters:
paths (Paths)
tables (ResultsTables)
filename (str | None)
val_metric_name (str | None)
test_metric_name (str | None)
use_geometric_mean (bool)
shift_eps (float)
- pytabkit.bench.eval.plotting.plot_pareto(paths, tables, coll_names, alg_names, val_metric_name=None, test_metric_name=None, use_ranks=False, use_normalized_errors=False, filename=None, filename_suffix=None, tag=None, use_grinnorm_errors=False, use_geometric_mean=True, shift_eps=0.01, use_validation_errors=False, arrow_alg_names=None, plot_pareto_frontier=True, alg_names_to_hide=None, subfolder=None, pareto_frontier_width=2.0, use_2x3=False)
- Parameters:
paths (Paths)
tables (ResultsTables)
coll_names (List[str])
alg_names (List[str])
val_metric_name (str | None)
test_metric_name (str | None)
use_ranks (bool)
use_normalized_errors (bool)
filename (str | None)
filename_suffix (str | None)
tag (str | None)
use_grinnorm_errors (bool)
use_geometric_mean (bool)
shift_eps (float)
use_validation_errors (bool)
arrow_alg_names (List[Tuple[str, str]] | None)
plot_pareto_frontier (bool)
alg_names_to_hide (List[str] | None)
subfolder (str | None)
pareto_frontier_width (float)
use_2x3 (bool)
- pytabkit.bench.eval.plotting.plot_pareto_ax(ax, paths, tables, coll_name, alg_names, val_metric_name=None, test_metric_name=None, use_ranks=False, use_normalized_errors=False, tag=None, use_geometric_mean=True, use_grinnorm_errors=False, shift_eps=0.01, use_validation_errors=False, arrow_alg_names=None, plot_pareto_frontier=True, alg_names_to_hide=None, pareto_frontier_width=2.0)
- Parameters:
ax (Axes)
paths (Paths)
tables (ResultsTables)
coll_name (str)
alg_names (List[str])
val_metric_name (str | None)
test_metric_name (str | None)
use_ranks (bool)
use_normalized_errors (bool)
tag (str | None)
use_geometric_mean (bool)
use_grinnorm_errors (bool)
shift_eps (float)
use_validation_errors (bool)
arrow_alg_names (List[Tuple[str, str]] | None)
plot_pareto_frontier (bool)
alg_names_to_hide (List[str] | None)
pareto_frontier_width (float)
- pytabkit.bench.eval.plotting.plot_scatter(paths, filename, tables, coll_names, alg_name_1, alg_name_2, test_metric_name=None, val_metric_name=None, use_validation_errors=False)
- Parameters:
paths (Paths)
filename (str)
tables (ResultsTables)
coll_names (List[str])
alg_name_1 (str)
alg_name_2 (str)
test_metric_name (str | None)
val_metric_name (str | None)
use_validation_errors (bool)
- pytabkit.bench.eval.plotting.plot_scatter_ax(paths, tables, ax, coll_name, alg_name_1, alg_name_2, test_metric_name=None, val_metric_name=None, use_validation_errors=False)
- Parameters:
paths (Paths)
tables (ResultsTables)
ax (Axes)
coll_name (str)
alg_name_1 (str)
alg_name_2 (str)
test_metric_name (str | None)
val_metric_name (str | None)
use_validation_errors (bool)
- pytabkit.bench.eval.plotting.plot_schedule(paths, filename, sched_name)
- Parameters:
paths (Paths)
filename (str)
sched_name (str)
- Return type:
None
- pytabkit.bench.eval.plotting.plot_schedules(paths, filename, sched_names, sched_labels)
- Parameters:
paths (Paths)
filename (str)
sched_names (List[str])
sched_labels (List[str])
- Return type:
None
- pytabkit.bench.eval.plotting.plot_stopping(paths, tables, classification)
- Parameters:
paths (Paths)
tables (ResultsTables)
classification (bool)
- pytabkit.bench.eval.plotting.plot_stopping_ax(ax, paths, tables, method, classification)
- Parameters:
ax (Axes)
paths (Paths)
tables (ResultsTables)
method (str)
classification (bool)
- pytabkit.bench.eval.plotting.plot_winrates(paths, tables, coll_name, alg_names, val_metric_name=None, test_metric_name=None)
- Parameters:
paths (Paths)
tables (ResultsTables)
coll_name (str)
alg_names (List[str])
val_metric_name (str | None)
test_metric_name (str | None)
- pytabkit.bench.eval.plotting.shorten_coll_names(coll_names)
- Parameters:
coll_names (List[str])
- Return type:
List[str]
pytabkit.bench.eval.runtimes module
pytabkit.bench.eval.tables module
- pytabkit.bench.eval.tables.generate_ablations_table(paths, tables)
- Parameters:
paths (Paths)
tables (ResultsTables)
- pytabkit.bench.eval.tables.generate_architecture_table(paths, tables)
- Parameters:
paths (Paths)
tables (ResultsTables)
- pytabkit.bench.eval.tables.generate_ds_table(paths, task_collection, include_openml_ids=False)
- Parameters:
paths (Paths)
task_collection (TaskCollection)
include_openml_ids (bool)
- pytabkit.bench.eval.tables.generate_individual_results_table(paths, tables, filename, coll_name, alg_names, test_metric_name=None, val_metric_name=None)
- Parameters:
paths (Paths)
tables (ResultsTables)
filename (str)
coll_name (str)
alg_names (List[str])
test_metric_name (str | None)
val_metric_name (str | None)
- pytabkit.bench.eval.tables.generate_preprocessing_table(paths, tables)
- Parameters:
paths (Paths)
tables (ResultsTables)
- pytabkit.bench.eval.tables.generate_refit_table(paths, tables, alg_family)
- Parameters:
paths (Paths)
tables (ResultsTables)
alg_family (str)
- pytabkit.bench.eval.tables.generate_stopping_table(paths, tables)
- Parameters:
paths (Paths)
tables (ResultsTables)