Aggrigator Experiments 🐊

Aggrigator is a lightweight and modular Python library for aggregating uncertainty in deep learning workflows, especially useful for tasks like segmentation or per-pixel analysis.

With an intuitive API and a suite of built-in strategies, Aggrigator lets you:

Reduce pixel-wise uncertainty maps into scalar scores for ranking or evaluation.
Apply patch-based, class-specific, or thresholded aggregation strategies.
Integrate spatial correlation metrics like Moran’s I or Geary’s C.
Compare strategies side-by-side with insightful summaries and plots.
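To make these capabilities concrete, here is a minimal NumPy sketch of what such reductions could look like. It does not use Aggrigator's actual API: every function name and parameter below is hypothetical, and the 4-neighbour (rook) weighting in the Moran's I helper is only one common choice.

```python
import numpy as np


def mean_aggregate(unc_map):
    """Average uncertainty over all pixels."""
    return float(np.mean(unc_map))


def threshold_aggregate(unc_map, threshold):
    """Mean of the pixels with uncertainty >= threshold (0.0 if none qualify)."""
    selected = unc_map[unc_map >= threshold]
    return float(selected.mean()) if selected.size else 0.0


def patch_aggregate(unc_map, patch_size):
    """Highest mean uncertainty over non-overlapping square patches.

    Assumes the map is at least patch_size pixels in each dimension.
    """
    h, w = unc_map.shape
    # Crop so both dimensions are multiples of patch_size.
    h_crop, w_crop = h - h % patch_size, w - w % patch_size
    patches = unc_map[:h_crop, :w_crop].reshape(
        h_crop // patch_size, patch_size, w_crop // patch_size, patch_size
    )
    return float(patches.mean(axis=(1, 3)).max())


def class_aggregate(unc_map, pred_mask, class_id):
    """Mean uncertainty over pixels predicted as class_id (0.0 if none)."""
    selected = unc_map[pred_mask == class_id]
    return float(selected.mean()) if selected.size else 0.0


def morans_i(unc_map):
    """Moran's I with binary weights on horizontal and vertical neighbours."""
    z = unc_map - unc_map.mean()
    denom = (z ** 2).sum()
    if denom == 0:
        return 0.0  # constant map: autocorrelation is undefined, report 0
    # Products over adjacent pairs; each pair counts twice (symmetric weights).
    cross = 2.0 * ((z[:, :-1] * z[:, 1:]).sum() + (z[:-1, :] * z[1:, :]).sum())
    h, w = unc_map.shape
    weight_sum = 2.0 * (h * (w - 1) + (h - 1) * w)
    return float(unc_map.size / weight_sum * cross / denom)
```

Each helper maps a 2D uncertainty map to a single float, so the resulting scores can be used directly to rank images. A perfect checkerboard map gives a Moran's I of -1 under this weighting, while smooth, clustered maps give positive values.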

Aggrigator is designed to be modular, explainable, and research-friendly. This repository also includes the code used to reproduce the results presented in the original publication introducing the library.

📖 For full documentation and contribution guidelines, see the Aggrigator source code, and open an issue or pull request to get involved!

Prerequisites

Set up the environment by running the following commands. Be careful to choose the PyTorch version that matches your installed CUDA version.

micromamba env create -f environment.yml
micromamba activate aggr_experiments

evaluation/scripts/evaluate_aurc.py relies on the fd-shifts repository, which depends on "numpy<2.0.0". However, Aggrigator requires "numpy>=2.0.0" for optimal functionality. To avoid dependency conflicts, we recommend the following:

Clone the fd-shifts repository after setting up your environment.
Edit the pyproject.toml file to replace "numpy>=1.22.2,<2.0.0" with "numpy>=2.0.0".
Then install fd-shifts from your local clone:

(aggr_experiments) pip install -e /path-to-local/fd-shifts

This modification is safe because the functions of fd-shifts used in this experiment are compatible with numpy>=2.0.0. Once everything is installed and configured correctly, run the test suite to make sure all components work as expected:

(aggr_experiments) pytest -v
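The manual pyproject.toml edit described above could also be scripted. The following is a minimal sketch, assuming the pin appears in the file exactly as quoted in this README; the helper names are hypothetical and not part of fd-shifts or Aggrigator.

```python
from pathlib import Path

OLD_PIN = "numpy>=1.22.2,<2.0.0"  # pin quoted in this README
NEW_PIN = "numpy>=2.0.0"


def relax_numpy_pin(text):
    """Return text with the numpy pin relaxed; raise if the old pin is absent."""
    if OLD_PIN not in text:
        raise ValueError(f"{OLD_PIN!r} not found; edit the file by hand")
    return text.replace(OLD_PIN, NEW_PIN)


def patch_pyproject(path):
    """Rewrite the pyproject.toml at path in place with the relaxed pin."""
    path = Path(path)
    original = path.read_text(encoding="utf-8")
    path.write_text(relax_numpy_pin(original), encoding="utf-8")
```

Calling patch_pyproject("/path-to-local/fd-shifts/pyproject.toml") before the pip install -e step would apply the change. The function fails loudly rather than silently doing nothing if the upstream pin has since changed.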

Evaluation

To quantify the impact of choosing an aggregation method in one's use case, the repo offers answers to the following five questions:

How similar are the aggregated uncertainty scores produced by the different aggregators?
cf. experiments/correlation_analyses/*.ipynb
When translating a UQ method into a real-world scenario, how does the aggregator affect its reliability?
cf. evaluation/scripts/evaluate_auroc.py & evaluation/scripts/evaluate_aurc.py
To what extent does parameter choice in non–parameter-free aggregators modify method reliability?
cf.
How can an aggregator impact the selection of an optimal UQ model in benchmarking environments?
cf.
How can spatial measures improve the aggregation performance of context-free aggregators?
cf. evaluation/analyse_spatial_methods.py
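As a rough illustration of the first two questions, the sketch below compares two aggregators by the rank correlation of their scores and evaluates one set of scores with an area under the risk-coverage curve (AURC). This is not the code in the scripts listed above: the function names are hypothetical, the AURC shown is one common empirical estimator (the mean selective risk over all coverage levels), and the fd-shifts implementation may differ in detail.

```python
import numpy as np


def rank_correlation(scores_a, scores_b):
    """Spearman correlation as Pearson correlation of ranks (assumes no ties)."""
    ranks_a = np.argsort(np.argsort(scores_a))
    ranks_b = np.argsort(np.argsort(scores_b))
    return float(np.corrcoef(ranks_a, ranks_b)[0, 1])


def aurc(uncertainty, errors):
    """Empirical AURC: mean selective risk over all coverages (lower is better)."""
    order = np.argsort(uncertainty, kind="stable")  # most confident first
    sorted_errors = np.asarray(errors, dtype=float)[order]
    n = sorted_errors.size
    # Risk at coverage k/n is the mean error of the k most confident samples.
    risks = np.cumsum(sorted_errors) / np.arange(1, n + 1)
    return float(risks.mean())
```

Here uncertainty holds one aggregated score per image and errors holds the matching per-image loss or failure indicator. An aggregator whose scores rank failures last yields a lower AURC than one that ranks them first.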
