# Reproducing results of "Rethinking Early Stopping: Refine, Then Calibrate"

Here, we document how to reproduce the results from our paper [Rethinking Early Stopping: Refine, Then Calibrate](https://arxiv.org/abs/2501.19195). For general instructions on how to set data paths and use Slurm, we refer to the installation page. The sections below cover only the steps specific to this paper.

## Installation

```bash
pip install probmetrics[extra]  # to get smECE
pip install pytabkit[bench,dev]
```

### Original environment

The original conda environment for exact reproduction is stored in `original_requirements/conda_env_2025_01_15.yml`.

## Downloading datasets

Download the zipped datasets (`dataset-latest.zip`) of the TALENT benchmark from [here](https://drive.google.com/drive/folders/1j1zt3zQIo8dO6vkO-K-WE6pSrl71bf0z) and extract them into a folder. Then, run

```commandline
python3 scripts/download_data.py --import_talent_class_small --talent_folder=
```

where the value passed to `--talent_folder` should be the `data` folder inside the unzipped archive.

## Running experiments

Experiments can be run using `python3 scripts/run_probclass_experiments.py`. Afterwards, plots can be generated using `python3 scripts/create_probclass_plots.py`.
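
The steps above (dataset import, experiments, plots) can be chained so that a failure in one step stops the later ones. The following is only a minimal sketch, not part of the repository: the function names `run_step` and `run_pipeline` and the `DRY_RUN` variable are hypothetical helpers introduced for illustration, and the sketch assumes that the commands are run from the repository root and that the scripts need no arguments beyond those shown above.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the three commands documented above.
# None of these helpers exist in the repository; they are illustrative only.

# Run a command, or only print it when DRY_RUN=1 (assumed variable name).
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1: path to the `data` folder inside the unzipped TALENT archive.
run_pipeline() {
  local talent_data="$1"

  # Fail early if the folder does not exist, before starting any import.
  if [ ! -d "$talent_data" ]; then
    echo "TALENT data folder not found: $talent_data" >&2
    return 1
  fi

  # Chain with && so that a failing step prevents the following ones.
  run_step python3 scripts/download_data.py --import_talent_class_small \
      --talent_folder="$talent_data" \
    && run_step python3 scripts/run_probclass_experiments.py \
    && run_step python3 scripts/create_probclass_plots.py
}
```

After sourcing this file, calling `run_pipeline` with the path to the extracted `data` folder runs the three steps in order. Setting `DRY_RUN=1` beforehand only prints the commands, which is a cheap way to check the path before starting a long run.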