
nam's Introduction

NAM: Neural Additive Models - Interpretable Machine Learning with Neural Nets

Overview | Installation | Paper


NAM is a library for generalized additive models research. Neural Additive Models (NAMs) combine some of the expressivity of DNNs with the inherent intelligibility of generalized additive models. NAMs learn a linear combination of neural networks that each attend to a single input feature. These networks are trained jointly and can learn arbitrarily complex relationships between their input feature and the output.
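
Formally (as in the NAM paper), a NAM fits a generalized additive model

g(E[y]) = β + f₁(x₁) + f₂(x₂) + ⋯ + f_K(x_K)

where g is a link function and each shape function f_i is parameterized by its own neural network.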

Overview

import pandas as pd
import sklearn.datasets
import pytorch_lightning as pl
from pytorch_lightning.callbacks.model_checkpoint import ModelCheckpoint
from pytorch_lightning.loggers import TensorBoardLogger
from nam.config import defaults
from nam.data import FoldedDataset, NAMDataset
from nam.models import NAM, get_num_units
from nam.trainer import LitNAM
from nam.utils import *  # plotting helpers (plot_mean_feature_importance, plot_nams)
Define the experiment configuration
config = defaults()
print(config)
>> Config(activation='exu', batch_size=1024, cross_val=False, data_path='data/GALLUP.csv', decay_rate=0.995, device='cpu', dropout=0.5, early_stopping_patience=50, experiment_name='NAM', feature_dropout=0.5, fold_num=1, hidden_sizes=[64, 32], l2_regularization=0.5, logdir='output', lr=0.0003, num_basis_functions=1000, num_epochs=1, num_folds=5, num_models=1, num_splits=3, num_workers=16, optimizer='adam', output_regularization=0.5, regression=False, save_model_frequency=2, save_top_k=3, seed=2021, shuffle=True, units_multiplier=2, use_dnn=False, wandb=True)
Creating and preprocessing the dataset
housing = sklearn.datasets.fetch_california_housing()

dataset = pd.DataFrame(data=housing.data, columns=housing.feature_names)
dataset['target'] = housing.target

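# The California Housing target (median house value) is continuous, so switch to regression.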
config.regression = True

dataset = NAMDataset(config,
                     data_path=dataset,
                     features_columns=dataset.columns[:-1],
                     targets_column=dataset.columns[-1])


## Getting the training dataloaders
dataloaders = dataset.train_dataloaders()
Define the NAM model
model = NAM(
  config=config,
  name="NAM_GALLUP",
  num_inputs=len(dataset[0][0]),
  num_units=get_num_units(config, dataset.features),
)
model
>>> NAM(
  (feature_nns): ModuleList(
    (0): FeatureNN(
      (model): ModuleList(
        (0): ExU(in_features=1, out_features=1000)
        (1): LinReLU(in_features=1000, out_features=64)
        (2): LinReLU(in_features=64, out_features=32)
        (3): Linear(in_features=32, out_features=1, bias=True)
      )
    )
    (1): FeatureNN(
      (model): ModuleList(
        (0): ExU(in_features=1, out_features=4)
        (1): LinReLU(in_features=4, out_features=64)
        (2): LinReLU(in_features=64, out_features=32)
        (3): Linear(in_features=32, out_features=1, bias=True)
      )
    )
    (2): FeatureNN(
      (model): ModuleList(
        (0): ExU(in_features=1, out_features=184)
        (1): LinReLU(in_features=184, out_features=64)
        (2): LinReLU(in_features=64, out_features=32)
        (3): Linear(in_features=32, out_features=1, bias=True)
      )
    )
  )
)
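
The first-layer widths differ across feature nets because get_num_units sizes each subnetwork by the cardinality of its feature. A plausible reconstruction of the rule, inferred from the defaults above (units_multiplier=2, num_basis_functions=1000) and the printed widths (4, 184, 1000); the library's actual implementation may differ:

import numpy as np

def get_num_units_sketch(config, features):
    # Hypothetical re-implementation for illustration; see nam.models for the real one.
    # features: array of shape (num_examples, num_features)
    return [
        min(config.num_basis_functions,
            config.units_multiplier * len(np.unique(features[:, i])))
        for i in range(features.shape[1])
    ]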
Training loop
for fold, (trainloader, valloader) in enumerate(dataloaders):

    tb_logger = TensorBoardLogger(save_dir=config.logdir,
                                  name=f'{model.name}',
                                  version=f'fold_{fold + 1}')

    checkpoint_callback = ModelCheckpoint(filename=tb_logger.log_dir +
                                          "/{epoch:02d}-{val_loss:.4f}",
                                          monitor='val_loss',
                                          save_top_k=config.save_top_k,
                                          mode='min')

    litmodel = LitNAM(config, model)
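    # Note: these argument names match older PyTorch Lightning releases; on recent
    # versions, pass callbacks=[checkpoint_callback] to pl.Trainer and use
    # train_dataloaders=... in trainer.fit instead.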
    trainer = pl.Trainer(logger=tb_logger,
                         max_epochs=config.num_epochs,
                         checkpoint_callback=checkpoint_callback)
    trainer.fit(litmodel,
                train_dataloader=trainloader,
                val_dataloaders=valloader)


## Testing the trained model
trainer.test(test_dataloaders=dataset.test_dataloaders())

## Output
>>> [{'test_loss': 236.77108764648438,
  'test_loss_epoch': 237.080322265625,
  'Accuracy_metric_epoch': 0.6506563425064087,
  'Accuracy_metric': 0.6458333134651184}]
NAM Visualization
fig1 = plot_mean_feature_importance(litmodel.model, dataset)

fig2 = plot_nams(litmodel.model, dataset, num_cols=3)

[Figure: mean feature importance and per-feature shape plots]

Usage

$ python main.py -h
usage: Neural Additive Models [-h] [--num_epochs NUM_EPOCHS]
                              [--learning_rate LEARNING_RATE]
                              [--batch_size BATCH_SIZE] --data_path DATA_PATH
                              --features_columns FEATURES_COLUMNS
                              [FEATURES_COLUMNS ...] --targets_column
                              TARGETS_COLUMN [TARGETS_COLUMN ...]
                              [--weights_column WEIGHTS_COLUMN]
                              [--experiment_name EXPERIMENT_NAME]
                              [--regression REGRESSION] [--logdir LOGDIR]
                              [--wandb WANDB]
                              [--hidden_sizes HIDDEN_SIZES [HIDDEN_SIZES ...]]
                              [--activation {exu,relu}] [--dropout DROPOUT]
                              [--feature_dropout FEATURE_DROPOUT]
                              [--decay_rate DECAY_RATE]
                              [--l2_regularization L2_REGULARIZATION]
                              [--output_regularization OUTPUT_REGULARIZATION]
                              [--dataset_name DATASET_NAME] [--seed SEED]
                              [--num_basis_functions NUM_BASIS_FUNCTIONS]
                              [--units_multiplier UNITS_MULTIPLIER]
                              [--shuffle SHUFFLE] [--cross_val CROSS_VAL]
                              [--num_folds NUM_FOLDS]
                              [--num_splits NUM_SPLITS] [--fold_num FOLD_NUM]
                              [--num_models NUM_MODELS]
                              [--early_stopping_patience EARLY_STOPPING_PATIENCE]
                              [--use_dnn USE_DNN] [--save_top_k SAVE_TOP_K]

optional arguments:
  -h, --help            show this help message and exit
  --num_epochs NUM_EPOCHS
                        The number of epochs to run training for.
  --learning_rate LEARNING_RATE
                        Hyperparameter: learning rate.
  --batch_size BATCH_SIZE
                        Hyperparameter: batch size.
  --data_path DATA_PATH
                        The path for the training data
  --features_columns FEATURES_COLUMNS [FEATURES_COLUMNS ...]
                        Name of the feature columns in the dataset
  --targets_column TARGETS_COLUMN [TARGETS_COLUMN ...]
                        Name of the target column in the dataset
  --weights_column WEIGHTS_COLUMN
                        Name of the weights column in the dataset
  --experiment_name EXPERIMENT_NAME
                        The name for the experiment
  --regression REGRESSION
                        Boolean flag indicating whether we are solving a
                        regression task or a classification task.
  --logdir LOGDIR       Path to dir where to store summaries.
  --wandb WANDB         Using wandb for experiments tracking and logging
  --hidden_sizes HIDDEN_SIZES [HIDDEN_SIZES ...]
                        Feature Neural Net hidden sizes
  --activation {exu,relu}
                        Activation function to use in the hidden layers.
                        Possible options: (1) relu, (2) exu
  --dropout DROPOUT     Hyperparameter: Dropout rate
  --feature_dropout FEATURE_DROPOUT
                        Hyperparameter: Prob. with which features are dropped
  --decay_rate DECAY_RATE
                        Hyperparameter: Optimizer decay rate
  --l2_regularization L2_REGULARIZATION
                        Hyperparameter: l2 weight decay
  --output_regularization OUTPUT_REGULARIZATION
                        Hyperparameter: feature reg
  --dataset_name DATASET_NAME
                        Name of the dataset to load for training.
  --seed SEED           seed for torch
  --num_basis_functions NUM_BASIS_FUNCTIONS
                        Number of basis functions to use in a FeatureNN for a
                        real-valued feature.
  --units_multiplier UNITS_MULTIPLIER
                        Number of basis functions for a categorical feature
  --shuffle SHUFFLE     Shuffle the training data
  --cross_val CROSS_VAL
                        Boolean flag indicating whether to perform cross
                        validation or not.
  --num_folds NUM_FOLDS
                        Number of N folds
  --num_splits NUM_SPLITS
                        Number of data splits to use
  --fold_num FOLD_NUM   Index of the fold to be used
  --num_models NUM_MODELS
                        The number of models to train.
  --early_stopping_patience EARLY_STOPPING_PATIENCE
                        Early stopping epochs
  --use_dnn USE_DNN     Deep NN baseline.
  --save_top_k SAVE_TOP_K
                        Indicates the maximum number of recent checkpoint
                        files to keep.
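
For example, a hypothetical regression run on a CSV export of the California Housing data used above (the file path is illustrative; the flags are those listed in the help):

$ python main.py --data_path data/housing.csv \
    --features_columns MedInc HouseAge AveRooms AveBedrms Population AveOccup Latitude Longitude \
    --targets_column target \
    --regression True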

Citing NAM

@misc{kayid2020nams,
  title={Neural additive models library},
  author={Kayid, Amr and Frosst, Nicholas and Hinton, Geoffrey E},
  year={2020}
}
@article{agarwal2020neural,
  title={Neural additive models: Interpretable machine learning with neural nets},
  author={Agarwal, Rishabh and Frosst, Nicholas and Zhang, Xuezhou and Caruana, Rich and Hinton, Geoffrey E},
  journal={arXiv preprint arXiv:2004.13912},
  year={2020}
}

nam's People

Contributors

amrmkayid, nickfrosst


nam's Issues

Not installing on Colab

Hi everybody. I am trying to use the code on Colab.
I used the command
!pip install git+https://github.com/AmrMKayid/nam
but I got the following error message:
Collecting sklearn (from nam==0.0.2)
  Downloading sklearn-0.0.post7.tar.gz (3.6 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

I would appreciate any help.
Thanks in advance
Regards
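
(Editor's note, a guess rather than a confirmed fix: the package depends on the deprecated sklearn PyPI stub, whose 0.0.post releases deliberately fail at build time. The stub's documented escape-hatch environment variable may let the install proceed:

!SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True pip install git+https://github.com/AmrMKayid/nam

Replacing the sklearn dependency with scikit-learn in the package's requirements would be the proper fix.)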

Error in example code / NAMDataset

The example code is incomplete: it is missing the pandas import (import pandas as pd).
Additionally, I suggest importing sklearn.datasets explicitly.
Even with those fixes, I still cannot run your example code due to an error thrown in NAMDataset.

precisely:

dataset = NAMDataset(config,
                      data_path=dataset,
                      features_columns=dataset.columns[:-1],
                      targets_column=dataset.columns[-1])

results in

Traceback (most recent call last):

  File ~/.../site-packages/nam/data/base.py:76 in __init__
    self.features, self.features_names = transform_data(self.raw_X)

  File ~/.../site-packages/nam/data/utils.py:86 in transform_data
    cat_ohe_step = ('ohe', OneHotEncoder(sparse=False, handle_unknown='ignore'))

TypeError: OneHotEncoder.__init__() got an unexpected keyword argument 'sparse'
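
(For context: scikit-learn 1.2 renamed this argument to sparse_output and later releases removed the old sparse name entirely, which is what raises the TypeError. A version-tolerant construction, a sketch rather than the library's actual code:)

from sklearn.preprocessing import OneHotEncoder

try:
    # scikit-learn >= 1.2 uses sparse_output
    ohe = OneHotEncoder(sparse_output=False, handle_unknown='ignore')
except TypeError:
    # older scikit-learn releases use sparse
    ohe = OneHotEncoder(sparse=False, handle_unknown='ignore')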

About graphing.py

Hi, I have a question about graphing.py. In plot_nams, why is mean_pred subtracted from feat_contrib? Couldn't the graph be drawn without subtracting mean_pred, so that 0 is a value that does not contribute to the prediction? Is there a reason the zero line of the graph should correspond to the mean of the prediction?
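
(Editor's note, a plausible explanation rather than an official answer: in an additive model the shape functions are identified only up to additive constants that can be shifted into the bias term, so centering each feature's contribution fixes that ambiguity and makes 0 read as "at the average prediction". A toy illustration with a hypothetical contribution vector:)

import numpy as np

# Hypothetical values of one shape function f_i over the dataset
feat_contrib = np.array([0.2, 0.7, 1.1])

# Subtracting the mean shifts the curve but not its shape; the constant
# can be absorbed into the model's bias, leaving predictions unchanged.
centered = feat_contrib - feat_contrib.mean()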

How to run this code

Hi,

I am using Google Colab to test this model on some real data:

!pip install git+https://github.com/AmrMKayid/nam

but when the notebook runs the pytorch_lightning import,

import pytorch_lightning as pl

I obtain:

ImportError Traceback (most recent call last)
in ()
----> 1 import pytorch_lightning as pl
2 from pytorch_lightning.callbacks.model_checkpoint import ModelCheckpoint
3 from pytorch_lightning.loggers import TensorBoardLogger
4
5 from nam.config import defaults

10 frames
/usr/local/lib/python3.7/dist-packages/torchtext/vocab.py in ()
11 from typing import Dict, List, Optional, Iterable
12 from collections import Counter, OrderedDict
---> 13 from torchtext._torchtext import (
14 Vocab as VocabPybind,
15 )

ImportError: /usr/local/lib/python3.7/dist-packages/torchtext/_torchtext.so: undefined symbol: _ZN2at6detail10noopDeleteEPv

Can you help me? Or should I use a local version of your code? In any case, what libraries do I need to run your code successfully? Thanks!!

SAG

Multi-task learning

Hello!

Thanks for sharing the code. I am wondering how to do multi-task learning with the code.

Thanks!

Getting the trained model's predictions

Hello. Thank you for this helpful code repository.

I have a question regarding using a trained model and obtaining its predictions.

In the documentation, you mention that:

"""

Testing the trained model

trainer.test(test_dataloaders=dataset.test_dataloaders())
"""

However, executing this code provides us only with some error metrics (e.g., test loss, accuracy), and not the actual trained model's predictions.

Can you help me with getting the model predictions and then converting them to a NumPy array?
I need the predictions in a NumPy array to calculate some statistics on them (for a classification task).

Thank you.
Best regards
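
(Editor's sketch of one way to do this, assuming the underlying module can be called directly on a batch of features and that its forward pass returns the prediction, possibly together with per-feature contributions; check NAM.forward for the exact return value:)

import numpy as np
import torch

litmodel.model.eval()
preds = []
with torch.no_grad():
    for batch in dataset.test_dataloaders():
        x = batch[0]  # assumes each batch is a (features, targets, ...) tuple
        out = litmodel.model(x)
        # if forward returns (prediction, per-feature outputs), keep the prediction
        logits = out[0] if isinstance(out, tuple) else out
        preds.append(logits.cpu().numpy())
preds = np.concatenate(preds)
# for classification, map raw outputs to probabilities, e.g. 1 / (1 + np.exp(-preds))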

Basis for FeatureNN and LinReLU

Hello! According to Section 3 of the NAM paper,

Feature nets in NAMs are selected amongst (1) DNNs containing 3 hidden layers with 64, 64 and 32 units and ReLU activation, and (2) single hidden layer NNs with 1024 ExU units and ReLU-1 activation

However, the feature nets described in FeatureNN use either an ExU layer or a LinReLU layer, followed by more LinReLU layers, topped off with a standard Linear layer. May I ask:

  1. What was the basis of this feature net architecture?

  2. What was the basis for the LinReLU layer? I understand that it is similar to the ExU layer described in the paper, but without the exponential; where did this come from?

I do apologize if the answers are already in the paper and I just overlooked them while reading it!
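
(For reference, a minimal sketch of the two unit types: the paper defines an ExU unit as h(x) = f(e^w (x − b)), typically with a capped ReLU-n activation, and per this issue LinReLU is the same map without the exponential. The library's actual layers may differ in initialization and activation details:)

import torch

def exu(x, w, b):
    # ExU unit from the paper: exp(w) rescales (x - b); ReLU-1 caps the output at 1
    return torch.clamp(torch.exp(w) * (x - b), 0.0, 1.0)

def lin_relu(x, w, b):
    # the same affine form without the exponential, with a plain ReLU
    return torch.relu(w * (x - b))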

Is it working?

Hi,

Thank you for implementing NAM in PyTorch. I am looking into ways to run NAM in PyTorch on my dataset, and I am wondering whether the current state of the code works. From the commit messages, it seems the code has not been working on the Gallup dataset?
