pytabkit.bench.eval package

Submodules

pytabkit.bench.eval.analysis module

class pytabkit.bench.eval.analysis.ResultsTables

Bases: object

__init__(paths)
Parameters:

paths (Paths)

get(coll_name, n_cv=1, tag='paper')
Parameters:
  • coll_name (str)

  • n_cv (int)

  • tag (str)

Return type:

MultiResultsTable

pytabkit.bench.eval.analysis.get_benchmark_results(paths, table, coll_name, use_relative_score=True, return_percentages=True, val_metric_name=None, test_metric_name=None, rel_alg_name='BestModel', use_ranks=False, use_normalized_errors=False, use_grinnorm_errors=False, use_task_mean=True, use_geometric_mean=True, shift_eps=0.01, filter_alg_names_list=None, simplify_name_fn=None, n_splits=10, use_validation_errors=False)
Parameters:
  • paths (Paths)

  • table (MultiResultsTable)

  • coll_name (str)

  • use_relative_score (bool)

  • return_percentages (bool)

  • val_metric_name (str | None)

  • test_metric_name (str | None)

  • rel_alg_name (str)

  • use_ranks (bool)

  • use_normalized_errors (bool)

  • use_grinnorm_errors (bool)

  • use_task_mean (bool)

  • use_geometric_mean (bool)

  • shift_eps (float)

  • filter_alg_names_list (List[str] | None)

  • simplify_name_fn (Callable[[str], str] | None)

  • n_splits (int)

  • use_validation_errors (bool)

Return type:

Tuple[Dict[str, float | ndarray], Dict[str, Tuple[float | ndarray, float | ndarray]]]

pytabkit.bench.eval.analysis.get_display_name(alg_name)
Parameters:

alg_name (str)

Return type:

str

pytabkit.bench.eval.analysis.get_ensemble_groups(task_type_name)

Generates groups of methods that should be evaluated.

Parameters:

task_type_name (str) – ‘class’ or ‘reg’

Returns:

A dict of lists {alg_group_name: [alg_name_1, alg_name_2, …]}

Return type:

Dict[str, List[str]]

pytabkit.bench.eval.analysis.get_opt_groups(task_type_name)

Generates groups of methods that should be evaluated.

Parameters:

task_type_name (str) – ‘class’ or ‘reg’

Returns:

A dict of lists {alg_group_name: [alg_name_1, alg_name_2, …]}

Return type:

Dict[str, List[str]]

pytabkit.bench.eval.analysis.get_simplified_name(alg_name)
Parameters:

alg_name (str)

pytabkit.bench.eval.analysis.get_t_mean_confidence_interval(values)
Parameters:

values (ndarray)

Return type:

Tuple[ndarray, ndarray]

pytabkit.bench.eval.colors module

pytabkit.bench.eval.colors.bilin_int(x, values)
Parameters:
  • x (float)

  • values (List[Tuple[float, float]])

Return type:

float

pytabkit.bench.eval.colors.bisection_find(f, y, xmin, xmax, n=50)
Parameters:
  • f (Callable[[float], float])

  • y (float)

  • xmin (float)

  • xmax (float)

Return type:

float
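
The entry above gives only the signature, so the following is a minimal sketch of what a bisection search with these arguments could look like, not the library's implementation. It assumes that f is monotonically increasing on [xmin, xmax] and that n is the number of halving steps; both are assumptions, and the name bisection_find_sketch is hypothetical.

```python
from typing import Callable


def bisection_find_sketch(f: Callable[[float], float], y: float,
                          xmin: float, xmax: float, n: int = 50) -> float:
    """Approximate x in [xmin, xmax] with f(x) == y.

    Assumption (not confirmed by the source): f is increasing on the interval.
    """
    lo, hi = xmin, xmax
    for _ in range(n):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid  # the solution lies in the upper half
        else:
            hi = mid  # the solution lies in the lower half
    return 0.5 * (lo + hi)


# Usage: invert x**3 at y = 8 on [0, 5].
x = bisection_find_sketch(lambda t: t ** 3, 8.0, 0.0, 5.0)
```

Each step halves the interval, so 50 steps shrink it by a factor of 2**50, which is typically far below plotting precision.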

pytabkit.bench.eval.colors.more_percep_uniform_hue(x)

Returns a hue value that should change approximately perceptually uniformly with x.

Parameters:

x (float) – A value between 0 and 1.

Returns:

Hue value for HSV space.

Return type:

float

pytabkit.bench.eval.evaluation module

class pytabkit.bench.eval.evaluation.AlgFilter

Bases: object

class pytabkit.bench.eval.evaluation.AlgTaskTable

Bases: object

__init__(alg_names, task_infos, alg_task_results)
Parameters:
  • alg_names (List[str])

  • task_infos (List[TaskInfo])

  • alg_task_results (List[List[Any]])

filter_algs(alg_names)
Parameters:

alg_names (List[str])

Return type:

AlgTaskTable

filter_n_splits(n_splits)

Limits the number of split results to n_splits and removes every algorithm for which at least one task has fewer than n_splits split results.

Parameters:

n_splits (int)

Return type:

AlgTaskTable
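
The filtering rule described for filter_n_splits can be illustrated on plain Python containers. This is a sketch only: the nested dict/list layout, the function name, and the choice to keep the first n_splits results are assumptions made for illustration and do not reflect how AlgTaskTable stores its data.

```python
from typing import Dict, List


def filter_n_splits_sketch(results: Dict[str, List[List[float]]],
                           n_splits: int) -> Dict[str, List[List[float]]]:
    """results[alg_name][task_idx] is a list of per-split results (hypothetical layout)."""
    filtered = {}
    for alg_name, task_results in results.items():
        # Drop the algorithm if any task has fewer than n_splits results.
        if any(len(split_results) < n_splits for split_results in task_results):
            continue
        # Otherwise keep only the first n_splits results of every task (assumption).
        filtered[alg_name] = [split_results[:n_splits] for split_results in task_results]
    return filtered


example = {
    "A": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],  # 3 splits on both tasks
    "B": [[1.0, 2.0], [3.0]],                 # only 1 split on the second task
}
kept = filter_n_splits_sketch(example, n_splits=2)
```

In this example, "B" is removed because one of its tasks has a single split result, and "A" is truncated to two results per task.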

map(f)

rename_algs(f)
Parameters:

f (Callable[[str], str])

Return type:

AlgTaskTable

to_array()
Return type:

ndarray

class pytabkit.bench.eval.evaluation.ArrayTableAnalyzer

Bases: TableAnalyzer

Intermediate class that analyzes using the same number of splits for each method

__init__(f=None, use_weighting=False, separate_task_names=None, post_f=None)
Parameters:

separate_task_names (List[str] | None)

print_analysis(alg_task_table, val_table=None)
Return type:

None

class pytabkit.bench.eval.evaluation.DefaultEvalModeSelector

Bases: EvalModeSelector

select_eval_modes(eval_modes)
Parameters:

eval_modes (List[Tuple[str, str, str]])

Return type:

List[Tuple[str, Tuple[str, str, str]]]

class pytabkit.bench.eval.evaluation.EvalModeSelector

Bases: object

select(alg_name, task_results)
Parameters:
  • alg_name (str)

  • task_results (List)

Return type:

Tuple[List[str], List[List]]

select_eval_modes(eval_modes)
Parameters:

eval_modes (List[Tuple[str, str, str]])

Return type:

List[Tuple[str, Tuple[str, str, str]]]

class pytabkit.bench.eval.evaluation.FunctionAlgFilter

Bases: AlgFilter

__init__(f)

class pytabkit.bench.eval.evaluation.GreedyAlgSelectionTableAnalyzer

Bases: ArrayTableAnalyzer

Greedily selects a portfolio of methods: at each step, the method whose addition most improves the best performance within the portfolio is added.
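
The greedy portfolio idea can be sketched on a plain error matrix. The sketch assumes that lower values are better and that portfolio quality is the mean over tasks of the per-task best error among the selected methods; the aggregation actually used by this class is not stated here, and greedy_portfolio_sketch is a hypothetical name.

```python
from typing import List

import numpy as np


def greedy_portfolio_sketch(errors: np.ndarray, n_select: int) -> List[int]:
    """errors has shape (n_algs, n_tasks); lower is better (assumption)."""
    n_algs, n_tasks = errors.shape
    selected: List[int] = []
    best_so_far = np.full(n_tasks, np.inf)  # per-task best error in the portfolio
    for _ in range(min(n_select, n_algs)):
        candidates = [i for i in range(n_algs) if i not in selected]
        # Portfolio score if candidate i were added: mean of per-task minima.
        scores = [np.minimum(best_so_far, errors[i]).mean() for i in candidates]
        best_idx = candidates[int(np.argmin(scores))]
        selected.append(best_idx)
        best_so_far = np.minimum(best_so_far, errors[best_idx])
    return selected


errors = np.array([
    [0.2, 0.2, 0.2],  # alg 0: solid everywhere
    [0.1, 0.5, 0.5],  # alg 1: best on task 0 only
    [0.5, 0.1, 0.1],  # alg 2: best on tasks 1 and 2
])
order = greedy_portfolio_sketch(errors, n_select=2)
```

Here alg 0 is picked first because it has the lowest mean error on its own, and alg 2 second because it complements alg 0 on more tasks than alg 1 does.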

class pytabkit.bench.eval.evaluation.MeanTableAnalyzer

Bases: TableAnalyzer

__init__(f=None, use_weighting=False, separate_task_names=None, post_f=None)
Parameters:

separate_task_names (List[str] | None)

get_intervals(alg_task_table, std_factor=2.0)
Return type:

List[Tuple[float, float]]

get_means(alg_task_table)
Parameters:

alg_task_table (AlgTaskTable)

Return type:

List[float]

print_analysis(alg_task_table)
Parameters:

alg_task_table (AlgTaskTable)

Return type:

None

class pytabkit.bench.eval.evaluation.MultiResultsTable

Bases: object

__init__(train_table, val_table, test_table, alg_tags, alg_configs)

get_test_results_table(eval_mode_selector, val_metric_name=None, test_metric_name=None, alg_group_dict=None, val_test_groups=None, use_validation_errors=False, use_train_errors=False)
Parameters:
  • eval_mode_selector (EvalModeSelector) – Decides how to select results from the different available ensembled/bagged results and how to name them

  • val_metric_name (str | None) – Name of the validation metric (used for optimizing over multiple algorithms)

  • test_metric_name (str | None) – Name of the test metric

  • alg_group_dict (Dict[str, AlgFilter] | None) – Optional dictionary of name: alg_filter. For each such pair, an additional algorithm with the given name will be added to the resulting table. Its results are computed as follows: On each split of each task, out of all the algorithms where the alg_filter returns True, the one with the best validation error is chosen, and then its test error is used.

  • val_test_groups (Dict[str, Dict[str, str]] | None) – Similar to alg_group_dict, but allows the test score to come from a different algorithm than the one with the best validation error. Specifically, for each name: pairs item in val_test_groups.items(), the key of pairs with the best validation error is determined, and the test score of the algorithm given by the value associated with that key is returned.

  • use_validation_errors (bool) – If True, use validation errors instead of test errors.

  • use_train_errors (bool) – If True, use train errors instead of test errors.

Return type:

AlgTaskTable
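
The selection rule described for alg_group_dict (per task and split, pick the algorithm with the best validation error and report its test error) can be written compactly with NumPy. This is an illustrative sketch under assumptions: the array layout (n_algs, n_tasks, n_splits), the "lower is better" convention, and the function name are hypothetical and not taken from MultiResultsTable.

```python
import numpy as np


def best_val_test_errors_sketch(val_errors: np.ndarray,
                                test_errors: np.ndarray) -> np.ndarray:
    """Both inputs have shape (n_algs, n_tasks, n_splits); lower is better (assumption).

    Returns an array of shape (n_tasks, n_splits) with the test error of the
    algorithm that has the lowest validation error on that task and split.
    """
    best_alg = np.argmin(val_errors, axis=0)  # shape (n_tasks, n_splits)
    picked = np.take_along_axis(test_errors, best_alg[None, :, :], axis=0)
    return picked[0]


# Two algorithms, one task, two splits.
val = np.array([[[0.1, 0.4]],
                [[0.3, 0.2]]])
test = np.array([[[1.0, 2.0]],
                 [[3.0, 4.0]]])
group_test_errors = best_val_test_errors_sketch(val, test)
```

On the first split the first algorithm has the lower validation error, so its test error is used; on the second split the second algorithm is chosen.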

static load(task_collection, n_cv, paths, alg_filter=None, split_type='random-split', max_n_splits=None, max_n_algs=None)

class pytabkit.bench.eval.evaluation.NormalizedLossTableAnalyzer

Bases: ArrayTableAnalyzer

class pytabkit.bench.eval.evaluation.RankTableAnalyzer

Bases: ArrayTableAnalyzer

class pytabkit.bench.eval.evaluation.TableAnalyzer

Bases: object

__init__(post_f=None)
Parameters:

post_f (Callable[[float], float] | None)

print_analysis(alg_task_table)
Parameters:

alg_task_table (AlgTaskTable)

class pytabkit.bench.eval.evaluation.TaskWeighting

Bases: object

__init__(task_infos, separate_task_names)

Computes a weighting of tasks that downweights tasks for which similar tasks exist.

Parameters:
  • task_infos (List[TaskInfo]) – Task infos.

  • separate_task_names (List[str] | None) – Names of tasks that should not be grouped together with other tasks.

get_n_groups()
Return type:

int

get_task_weights()
Return type:

ndarray
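
One way to realize the downweighting described for TaskWeighting is to give every group of similar tasks the same total weight and to split it evenly among the group's members. The sketch below illustrates that idea only; how this class actually decides which tasks are similar is not documented here, so the group_key callable, the task names, and the function name are all hypothetical.

```python
from collections import Counter
from typing import Callable, List, Optional

import numpy as np


def task_weights_sketch(task_names: List[str],
                        group_key: Callable[[str], str],
                        separate_task_names: Optional[List[str]] = None) -> np.ndarray:
    """Returns one weight per task; weights sum to 1 and each group has equal total weight."""
    separate = set(separate_task_names or [])
    # Tasks listed as separate always form their own singleton group.
    groups = [("separate", name) if name in separate else ("group", group_key(name))
              for name in task_names]
    counts = Counter(groups)
    weights = np.array([1.0 / counts[g] for g in groups])
    return weights / weights.sum()


names = ["a_1", "a_2", "b_1"]
by_prefix = lambda name: name.split("_")[0]  # hypothetical similarity rule
grouped = task_weights_sketch(names, by_prefix)
ungrouped = task_weights_sketch(names, by_prefix, separate_task_names=["a_2"])
```

With grouping, the two "a" tasks share one group's weight; marking "a_2" as separate gives all three tasks equal weight.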

class pytabkit.bench.eval.evaluation.WinsTableAnalyzer

Bases: ArrayTableAnalyzer

pytabkit.bench.eval.evaluation.alg_comparison_str(alg_task_table, alg_names)

pytabkit.bench.eval.evaluation.alg_results_str(alg_task_table, alg_name)

pytabkit.bench.eval.evaluation.get_ranks(values)
Parameters:

values (ndarray)

Return type:

ndarray

pytabkit.bench.eval.plotting module

pytabkit.bench.eval.plotting.coll_name_to_title(coll_name)
Parameters:

coll_name (str)

Return type:

str

pytabkit.bench.eval.plotting.extend_runtimes(times, task_type_name, keep_gpu=True)
Parameters:
  • times (Dict[str, float])

  • task_type_name (str)

  • keep_gpu (bool)

Return type:

Dict[str, float]

pytabkit.bench.eval.plotting.get_equidistant_blue_colors(n)
Parameters:

n (int)

pytabkit.bench.eval.plotting.get_equidistant_colors(n)
Parameters:

n (int)

pytabkit.bench.eval.plotting.get_plot_color(alg_name)
Parameters:

alg_name (str)

pytabkit.bench.eval.plotting.get_plot_color_idx(alg_name)
Parameters:

alg_name (str)

pytabkit.bench.eval.plotting.gg_color_hue(n, saturation=1.0, value=0.65)
Parameters:
  • n (int)

  • saturation (float)

  • value (float)

pytabkit.bench.eval.plotting.plot_benchmark_bars(paths, tables, filename=None, coll_names=None, val_metric_name=None, test_metric_name=None, alg_names=None, simplify_name_fn=None, use_geometric_mean=True, shift_eps=0.01)
Parameters:
  • paths (Paths)

  • tables (ResultsTables)

  • filename (str | None)

  • coll_names (List[str] | None)

  • val_metric_name (str | None)

  • test_metric_name (str | None)

  • alg_names (List[str] | None)

  • simplify_name_fn (Callable[[str], str] | None)

  • use_geometric_mean (bool)

  • shift_eps (float)

pytabkit.bench.eval.plotting.plot_cdd(paths, tables, coll_names, alg_names, val_metric_name=None, test_metric_name=None, filename=None, tag=None, use_validation_errors=False)
Parameters:
  • paths (Paths)

  • tables (ResultsTables)

  • coll_names (List[str])

  • alg_names (List[str])

  • val_metric_name (str | None)

  • test_metric_name (str | None)

  • filename (str | None)

  • tag (str | None)

  • use_validation_errors (bool)

pytabkit.bench.eval.plotting.plot_cdd_ax(ax, paths, tables, coll_name, alg_names, val_metric_name=None, test_metric_name=None, tag=None, use_validation_errors=False)
Parameters:
  • ax (Axes)

  • paths (Paths)

  • tables (ResultsTables)

  • coll_name (str)

  • alg_names (List[str])

  • val_metric_name (str | None)

  • test_metric_name (str | None)

  • tag (str | None)

  • use_validation_errors (bool)

pytabkit.bench.eval.plotting.plot_cumulative_ablations(paths, tables, filename=None, val_metric_name=None, test_metric_name=None, use_geometric_mean=True, shift_eps=0.01)
Parameters:
  • paths (Paths)

  • tables (ResultsTables)

  • filename (str | None)

  • val_metric_name (str | None)

  • test_metric_name (str | None)

  • use_geometric_mean (bool)

  • shift_eps (float)

pytabkit.bench.eval.plotting.plot_pareto(paths, tables, coll_names, alg_names, val_metric_name=None, test_metric_name=None, use_ranks=False, use_normalized_errors=False, filename=None, filename_suffix=None, tag=None, use_grinnorm_errors=False, use_geometric_mean=True, shift_eps=0.01, use_validation_errors=False, arrow_alg_names=None, plot_pareto_frontier=True, alg_names_to_hide=None, subfolder=None, pareto_frontier_width=2.0, use_2x3=False)
Parameters:
  • paths (Paths)

  • tables (ResultsTables)

  • coll_names (List[str])

  • alg_names (List[str])

  • val_metric_name (str | None)

  • test_metric_name (str | None)

  • use_ranks (bool)

  • use_normalized_errors (bool)

  • filename (str | None)

  • filename_suffix (str | None)

  • tag (str | None)

  • use_grinnorm_errors (bool)

  • use_geometric_mean (bool)

  • shift_eps (float)

  • use_validation_errors (bool)

  • arrow_alg_names (List[Tuple[str, str]] | None)

  • plot_pareto_frontier (bool)

  • alg_names_to_hide (List[str] | None)

  • subfolder (str | None)

  • pareto_frontier_width (float)

  • use_2x3 (bool)

pytabkit.bench.eval.plotting.plot_pareto_ax(ax, paths, tables, coll_name, alg_names, val_metric_name=None, test_metric_name=None, use_ranks=False, use_normalized_errors=False, tag=None, use_geometric_mean=True, use_grinnorm_errors=False, shift_eps=0.01, use_validation_errors=False, arrow_alg_names=None, plot_pareto_frontier=True, alg_names_to_hide=None, pareto_frontier_width=2.0)
Parameters:
  • ax (Axes)

  • paths (Paths)

  • tables (ResultsTables)

  • coll_name (str)

  • alg_names (List[str])

  • val_metric_name (str | None)

  • test_metric_name (str | None)

  • use_ranks (bool)

  • use_normalized_errors (bool)

  • tag (str | None)

  • use_geometric_mean (bool)

  • use_grinnorm_errors (bool)

  • shift_eps (float)

  • use_validation_errors (bool)

  • arrow_alg_names (List[Tuple[str, str]] | None)

  • plot_pareto_frontier (bool)

  • alg_names_to_hide (List[str] | None)

  • pareto_frontier_width (float)

pytabkit.bench.eval.plotting.plot_scatter(paths, filename, tables, coll_names, alg_name_1, alg_name_2, test_metric_name=None, val_metric_name=None, use_validation_errors=False)
Parameters:
  • paths (Paths)

  • filename (str)

  • tables (ResultsTables)

  • coll_names (List[str])

  • alg_name_1 (str)

  • alg_name_2 (str)

  • test_metric_name (str | None)

  • val_metric_name (str | None)

  • use_validation_errors (bool)

pytabkit.bench.eval.plotting.plot_scatter_ax(paths, tables, ax, coll_name, alg_name_1, alg_name_2, test_metric_name=None, val_metric_name=None, use_validation_errors=False)
Parameters:
  • paths (Paths)

  • tables (ResultsTables)

  • ax (Axes)

  • coll_name (str)

  • alg_name_1 (str)

  • alg_name_2 (str)

  • test_metric_name (str | None)

  • val_metric_name (str | None)

  • use_validation_errors (bool)

pytabkit.bench.eval.plotting.plot_schedule(paths, filename, sched_name)
Parameters:
  • paths (Paths)

  • filename (str)

  • sched_name (str)

Return type:

None

pytabkit.bench.eval.plotting.plot_schedules(paths, filename, sched_names, sched_labels)
Parameters:
  • paths (Paths)

  • filename (str)

  • sched_names (List[str])

  • sched_labels (List[str])

Return type:

None

pytabkit.bench.eval.plotting.plot_stopping(paths, tables, classification)

pytabkit.bench.eval.plotting.plot_stopping_ax(ax, paths, tables, method, classification)

pytabkit.bench.eval.plotting.plot_winrates(paths, tables, coll_name, alg_names, val_metric_name=None, test_metric_name=None)
Parameters:
  • paths (Paths)

  • tables (ResultsTables)

  • coll_name (str)

  • alg_names (List[str])

  • val_metric_name (str | None)

  • test_metric_name (str | None)

pytabkit.bench.eval.plotting.shorten_coll_names(coll_names)
Parameters:

coll_names (List[str])

Return type:

List[str]

pytabkit.bench.eval.runtimes module

pytabkit.bench.eval.runtimes.get_avg_predict_times(paths, coll_name, per_1k_samples=False)
Parameters:
  • paths (Paths)

  • coll_name (str)

  • per_1k_samples (bool)

Return type:

Dict[str, float]

pytabkit.bench.eval.runtimes.get_avg_train_times(paths, coll_name, per_1k_samples=False)
Parameters:
  • paths (Paths)

  • coll_name (str)

  • per_1k_samples (bool)

Return type:

Dict[str, float]

pytabkit.bench.eval.tables module

pytabkit.bench.eval.tables.generate_ablations_table(paths, tables)

pytabkit.bench.eval.tables.generate_architecture_table(paths, tables)

pytabkit.bench.eval.tables.generate_collections_table(paths)
Parameters:

paths (Paths)

pytabkit.bench.eval.tables.generate_ds_table(paths, task_collection, include_openml_ids=False)

pytabkit.bench.eval.tables.generate_individual_results_table(paths, tables, filename, coll_name, alg_names, test_metric_name=None, val_metric_name=None)
Parameters:
  • paths (Paths)

  • tables (ResultsTables)

  • filename (str)

  • coll_name (str)

  • alg_names (List[str])

  • test_metric_name (str | None)

  • val_metric_name (str | None)

pytabkit.bench.eval.tables.generate_preprocessing_table(paths, tables)

pytabkit.bench.eval.tables.generate_refit_table(paths, tables, alg_family)

pytabkit.bench.eval.tables.generate_stopping_table(paths, tables)

Module contents