
FairTune: Optimizing Parameter-Efficient Fine-Tuning for Fairness in Medical Image Analysis

🌟 Accepted to ICLR 2024! | Paper Link


Abstract: Training models with robust group fairness properties is crucial in ethically sensitive application areas such as medical diagnosis. Despite the growing body of work aiming to minimise demographic bias in AI, this problem remains challenging. A key reason for this challenge is the fairness generalisation gap: High-capacity deep learning models can fit all training data nearly perfectly, and thus also exhibit perfect fairness during training. In this case, bias emerges only during testing when generalisation performance differs across subgroups. This motivates us to take a bi-level optimisation perspective on fair learning: Optimising the learning strategy based on validation fairness. Specifically, we consider the highly effective workflow of adapting pre-trained models to downstream medical imaging tasks using parameter-efficient fine-tuning (PEFT) techniques. There is a trade-off between updating more parameters, enabling a better fit to the task of interest vs. fewer parameters, potentially reducing the generalisation gap. To manage this tradeoff, we propose FairTune, a framework to optimise the choice of PEFT parameters with respect to fairness. We demonstrate empirically that FairTune leads to improved fairness on a range of medical imaging datasets.

Installation

Python >= 3.8 and PyTorch >= 1.10 are required to run the code.
Main packages: PyTorch, Optuna, Fairlearn
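Before running anything, it can help to confirm the main dependencies are importable. This is a small stand-alone check (not part of the repository), using only the standard library:

```python
# Sanity-check that the main packages listed above are importable.
# Assumes they were installed beforehand, e.g. `pip install torch optuna fairlearn`.
import importlib.util

required = ["torch", "optuna", "fairlearn"]
missing = [name for name in required if importlib.util.find_spec(name) is None]
print("missing packages:", missing or "none")
```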

Dataset Preparation

We follow the preprocessing steps from MEDFAIR; please refer to its documentation. Detailed instructions for preparing the datasets are given in the Appendix of the paper.

After preprocessing, specify the paths to the metadata and pickle files in config.yaml.
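The exact schema of config.yaml is defined by the repository; as a purely hypothetical illustration, the per-dataset entries might look like:

```yaml
# Hypothetical config.yaml fragment; key names and paths are illustrative only.
# Check the repository's config.yaml for the actual schema.
ham10000:
  metadata_path: /data/ham10000/metadata.csv
  pickle_path: /data/ham10000/images.pkl
```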

Dataset

Due to data use agreements, we cannot share download links directly. Please register for and download the datasets using the links in the table below:

| Dataset | Access |
| --- | --- |
| CheXpert | https://stanfordmlgroup.github.io/competitions/chexpert/ |
| OL3I | https://stanfordaimi.azurewebsites.net/datasets/3263e34a-252e-460f-8f63-d585a9bfecfc |
| PAPILA | https://www.nature.com/articles/s41597-022-01388-1#Sec6 |
| HAM10000 | https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DBW86T |
| OASIS-1 | https://www.oasis-brains.org/#data |
| Fitzpatrick17k | https://github.com/mattgroh/fitzpatrick17k |
| Harvard-GF3300 | https://ophai.hms.harvard.edu/datasets/harvard-glaucoma-fairness-3300-samples/ |

Run the HPO search to find the best mask (Stage 1)

```shell
python search_mask.py --model [model] --epochs [epochs] --batch-size [batch-size] \
    --opt [opt] --lr [lr] --lr-scheduler [lr-scheduler] --lr-warmup-method [lr-warmup-method] \
    --lr-warmup-epochs [lr-warmup-epochs] --tuning_method [tuning_method] --dataset [dataset] \
    --sens_attribute [sens_attribute] --objective_metric [objective_metric] \
    --num_trials [num_trials] --disable_storage --disable_checkpointing
```

The searched mask will be saved under FairTune/<model>/<dataset>/Optuna_Masks/.

You can use different metrics as objectives for the HPO search; please check parse_args.py for the available options.
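Conceptually, Stage 1 searches over binary masks that decide which parameter groups of the backbone are fine-tuned, scoring each mask by a validation objective. The repository uses Optuna for this (see search_mask.py); the toy sketch below substitutes random search and a made-up objective so it runs stand-alone, and every name in it is illustrative:

```python
# Toy sketch of the Stage-1 mask search. The real implementation uses Optuna
# and scores each mask by a validation fairness metric after fine-tuning;
# here a placeholder objective stands in so the loop is self-contained.
import random

NUM_BLOCKS = 12  # e.g. number of backbone blocks (assumed, for illustration)

def validation_fairness(mask):
    # Placeholder objective: pretend fine-tuning a moderate number of
    # blocks gives the best validation fairness (best value is 0).
    return -abs(sum(mask) - NUM_BLOCKS // 2)

def search_mask(num_trials=100, seed=0):
    rng = random.Random(seed)
    best_mask, best_score = None, float("-inf")
    for _ in range(num_trials):
        # Each trial proposes a binary mask over the parameter groups.
        mask = [rng.randint(0, 1) for _ in range(NUM_BLOCKS)]
        score = validation_fairness(mask)
        if score > best_score:
            best_mask, best_score = mask, score
    return best_mask, best_score

best_mask, best_score = search_mask()
print("best mask:", best_mask, "score:", best_score)
```

Optuna replaces the random proposals with a guided sampler (e.g. TPE), but the outer-loop structure — propose a mask, evaluate validation fairness, keep the best — is the same bi-level idea described in the abstract.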

Fine-tune on the downstream task using the searched mask (Stage 2)

```shell
python finetune_with_mask.py --model [model] --epochs [epochs] --batch-size [batch-size] \
    --opt [opt] --lr [lr] --lr-scheduler [lr-scheduler] --lr-warmup-method [lr-warmup-method] \
    --lr-warmup-epochs [lr-warmup-epochs] --tuning_method [tuning_method] --dataset [dataset] \
    --sens_attribute [sens_attribute] --cal_equiodds --mask_path [mask_path] --use_metric auc
```

The results will be saved in a CSV file located at FairTune/<model>/<dataset>/.
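In Stage 2, the mask loaded via --mask_path gates which parameter groups receive gradient updates. The sketch below shows the idea with illustrative names; in a PyTorch model this would amount to setting requires_grad per parameter group:

```python
# Illustrative sketch (not the repository's code): a searched binary mask
# decides which parameter groups are trainable during Stage-2 fine-tuning.
param_groups = [f"block_{i}" for i in range(12)]       # hypothetical group names
mask = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]            # hypothetical searched mask

trainable = {group: bool(m) for group, m in zip(param_groups, mask)}
# In PyTorch this would translate to something like:
#   for name, p in model.named_parameters():
#       p.requires_grad = trainable[group_of(name)]    # group_of is hypothetical
print(sum(trainable.values()), "of", len(trainable), "groups will be fine-tuned")
```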

Note: When working with the PAPILA and OL3I datasets, it is advisable to use a weighted loss because of their high class imbalance; pass the --compute_cw argument to enable it.
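A common way to derive such class weights (and plausibly what --compute_cw does, though the repository's exact formula may differ) is the inverse-frequency heuristic w_c = n_samples / (n_classes * n_c), as in scikit-learn's "balanced" mode. A stand-alone sketch with made-up labels:

```python
# Inverse-frequency class weights for an imbalanced dataset (sketch; the
# actual --compute_cw implementation may differ). Labels below are made up.
from collections import Counter

labels = [0] * 90 + [1] * 10  # heavily imbalanced toy dataset
counts = Counter(labels)
n, k = len(labels), len(counts)
weights = {c: n / (k * counts[c]) for c in counts}
print(weights)  # the minority class gets a proportionally larger weight
```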

Citing FairTune

@inproceedings{dutt2023fairtune,
  title={Fairtune: Optimizing parameter efficient fine tuning for fairness in medical image analysis},
  author={Dutt, Raman and Bohdal, Ondrej and Tsaftaris, Sotirios A and Hospedales, Timothy},
  booktitle={International Conference on Learning Representations},
  year={2024}
}

Contributors

ondrejbohdal, raman1121

