rfml's Introduction

Radio Frequency Machine Learning (RFML) in PyTorch

Deep learning has revitalized machine learning research in recent years. In particular, researchers have demonstrated the use of deep learning for a multitude of tasks in wireless communications, such as signal classification and cognitive radio. These technologies have been colloquially termed Radio Frequency Machine Learning (RFML) by the Defense Advanced Research Projects Agency (DARPA). This repository hosts two key components to enable you to further your RFML research: (1) a library with PyTorch implementations of common RFML networks, wrappers for downloading and utilizing an open source signal classification dataset, and adversarial evasion and training methods; and (2) multiple tutorial notebooks for signal classification, adversarial evasion, and adversarial training.


Highlights

rfml.attack

Implementation of the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) that are aware of signal-to-perturbation ratios

rfml.data

Classes for creating datasets from raw-IQ samples, splitting amongst training/validation/test datasets while keeping classes and signal-to-noise ratios (SNR) balanced, and converting into a PyTorch TensorDataset

rfml.data.converters

Wrappers to load open source datasets (including downloading them from the internet if necessary) from DeepSig, Inc

rfml.nn.eval

Compute Top-K accuracy (overall and vs SNR) and confusion matrices from the models and datasets contained in this library

rfml.nn.model

Implementations of state of the art signal classification deep neural networks (DNNs) in PyTorch

rfml.nn.train

Implementation of standard training and adversarial training algorithms for classification problems in PyTorch

rfml.ptradio

PyTorch implementations of linearly modulated modems (such as PSK, QAM, etc) and simple channel models

Quick Start

Installation

The rfml library can be installed directly from pip (for Python >= 3.5).

pip install git+https://github.com/brysef/[email protected]

If you plan to directly edit the underlying library then you can install the library as editable after cloning this repository.

git clone git@github.com:brysef/rfml.git # OR https://github.com/brysef/rfml.git
pip install --user -e rfml/

Signal Classification (AMC)

The following code (located at examples/signal_classification.py) will:

  • Download the RML2016.10a Dataset from deepsig.io/datasets
  • Load the dataset into a PyTorch format with categorical labels
  • Create a Convolutional Neural Network model with PyTorch
  • Train the model to perform modulation classification
  • Evaluate the model on the test set in terms of overall accuracy, accuracy vs SNR, and a confusion matrix amongst classes
  • Save the model weights for later use

Running the above code will produce an output similar to the following. Additionally, the weights file will be saved (cnn.pt) along with a local copy of the RML2016.10a dataset (RML2016.10a.*).

> python3 signal_classification.py
.../rfml/data/converters/rml_2016.py:42: UserWarning:
About to attempt downloading the RML2016.10A dataset from deepsig.io/datasets.
Depending on your network connection, this process can be slow and error prone.  Any
errors raised during network operations are not silenced and will therefore cause your
code to crash.  If you require robustness in your experimentation, you should manually
download the file locally and pass the file path to the load_RML201610a_dataset
function.

Further, this dataset is provided by DeepSig Inc. under Creative Commons Attribution
- NonCommercial - ShareAlike 4.0 License (CC BY-NC-SA 4.0).  By calling this function,
you agree to that license -- If an alternative license is needed, please contact DeepSig
Inc. at [email protected]

warn(self.WARNING_MSG)

Epoch 0 completed!
                -Mean Training Loss: 1.367
                -Mean Validation Loss: 1.226
Epoch 1 completed!
                -Mean Training Loss: 1.185
                -Mean Validation Loss: 1.180
Epoch 2 completed!
                -Mean Training Loss: 1.128
                -Mean Validation Loss: 1.158
Training has Completed:

=======================
        Best Validation Loss: 1.158
        Best Epoch: 2
        Total Epochs: 2
=======================
===============================
Overall Testing Accuracy: 0.6024
SNR (dB)        Accuracy (%)
===============================
-4      72.3
16      82.8
-12     25.2
10      84.0
-8      49.8
-10     34.8
-14     19.0
18      83.0
-6      63.5
6       83.4
-20     12.0
12      82.2
14      82.5
2       81.3
-2      77.6
-16     13.4
-18     12.3
4       81.6
0       80.9
8       83.3
===============================
Confusion Matrix:
...
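
The evaluation step listed above (overall accuracy, accuracy vs SNR, and a confusion matrix) can be sketched without any framework. This is a minimal illustration in plain Python, not the code in rfml.nn.eval; the function names and the toy labels are hypothetical. Unlike the sample output above, the per-SNR results are returned sorted by SNR.

```python
from collections import defaultdict


def overall_accuracy(labels, predictions):
    """Fraction of examples whose predicted class matches the true class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def accuracy_vs_snr(labels, predictions, snrs):
    """Accuracy computed separately for every SNR value, ordered by SNR."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for y, p, snr in zip(labels, predictions, snrs):
        totals[snr] += 1
        hits[snr] += int(y == p)
    return {snr: hits[snr] / totals[snr] for snr in sorted(totals)}


def confusion_matrix(labels, predictions, n_classes):
    """Row = true class, column = predicted class, each row normalized to sum to 1."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for y, p in zip(labels, predictions):
        counts[y][p] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Toy data (hypothetical): three classes observed at two SNRs.
labels = [0, 1, 2, 0, 1, 2]
predictions = [0, 2, 2, 0, 1, 2]
snrs = [-10, -10, -10, 10, 10, 10]

print(overall_accuracy(labels, predictions))
print(accuracy_vs_snr(labels, predictions, snrs))
print(confusion_matrix(labels, predictions, n_classes=3))
```

Normalizing each confusion-matrix row makes classes with different numbers of examples directly comparable.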

Evading Signal Classification (FGSM)

The following code (located at examples/adversarial_evasion.py) will:

  • Download the RML2016.10a Dataset from deepsig.io/datasets
  • Load the dataset into a PyTorch format with categorical labels and only keep high SNR samples
  • Create a Convolutional Neural Network model with PyTorch
  • Load pre-trained weights (see Signal Classification (AMC))
  • Evaluate the model on the dataset with no adversarial evasion for a baseline
  • Perform an FGSM attack with a signal-to-perturbation ratio of 10 dB

Note that it's likely that this script would evaluate the network on data it also used for training, which is certainly not desired. This script is merely meant to serve as an easy example and shouldn't be used directly for evaluation.

Running the above code will produce an output similar to the following.

> python3 examples/adversarial_evasion.py
    Normal (no attack) Accuracy on Dataset: 0.831
    Adversarial Accuracy with SPR of 10 dB attack: 0.092
    FGSM Degraded Model Accuracy by 0.740
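
To make the idea of an FGSM attack constrained by a signal-to-perturbation ratio (SPR) concrete, here is a minimal sketch in plain Python. It is not the implementation in rfml.attack. The toy logistic-regression classifier, its weights, and the mapping from SPR to step size (average power per complex sample, with the sign perturbation applied to both I and Q) are assumptions made for illustration.

```python
import math


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def spr_to_epsilon(signal_power, spr_db):
    """Per-component step size that yields the requested SPR.

    A sign perturbation adds +/-eps to both I and Q of every sample,
    so the perturbation power per complex sample is 2 * eps**2.
    """
    perturbation_power = signal_power / (10 ** (spr_db / 10))
    return math.sqrt(perturbation_power / 2)


def logit(iq, weights, bias):
    """Score of a toy linear classifier over (I, Q) pairs (hypothetical model)."""
    return bias + sum(wi * i + wq * q for (i, q), (wi, wq) in zip(iq, weights))


def input_gradient(iq, weights, bias, label):
    """Gradient of binary cross-entropy w.r.t. the input of the toy classifier."""
    p = 1.0 / (1.0 + math.exp(-logit(iq, weights, bias)))
    return [((p - label) * wi, (p - label) * wq) for wi, wq in weights]


def fgsm(iq, grad, spr_db):
    """Return iq + eps * sign(grad), with eps derived from the SPR in dB."""
    signal_power = sum(i * i + q * q for i, q in iq) / len(iq)
    eps = spr_to_epsilon(signal_power, spr_db)
    return [(i + eps * sign(gi), q + eps * sign(gq))
            for (i, q), (gi, gq) in zip(iq, grad)]


# Unit-power toy example with made-up classifier weights.
iq = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
weights = [(0.5, -0.2), (0.1, 0.4), (-0.3, 0.2), (0.2, -0.6)]
grad = input_gradient(iq, weights, bias=0.0, label=1)
adversarial = fgsm(iq, grad, spr_db=10.0)

print(logit(iq, weights, 0.0), logit(adversarial, weights, 0.0))
```

The second printed score is lower than the first: the perturbation pushes the example away from its true class while keeping the perturbation 10 dB below the signal power.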

PyTorch Implementation of Linear Modulations

The following code (located at examples/pt_modem.py) will do the following:

  • Generate a random bit stream
  • Modulate that bit stream using a PyTorch implementation of a linear modem (with a symbol mapping, upsampling, and pulse shaping)
  • Corrupt the signal using AWGN generated by a PyTorch module
  • Demodulate the bit stream back using a PyTorch implementation (with match filtering, downsampling, and a hard decision on symbol unmapping)
  • Compute the bit error rate

While it is a simplistic example, the individual pieces (transmit, receive, and channel) can all be reused for your specific application.

Running the above code will produce an output similar to the following.

> python3 examples/pt_modem.py
    BER=7.763e-02, theory=7.865e-02, |diff|=1.020e-03, SNR=0, modulation=BPSK
    BER=5.502e-02, theory=5.628e-02, |diff|=1.262e-03, SNR=1, modulation=BPSK
    BER=3.740e-02, theory=3.751e-02, |diff|=1.060e-04, SNR=2, modulation=BPSK
    BER=2.340e-02, theory=2.288e-02, |diff|=5.220e-04, SNR=3, modulation=BPSK
    BER=1.269e-02, theory=1.250e-02, |diff|=1.890e-04, SNR=4, modulation=BPSK
    BER=6.500e-03, theory=5.954e-03, |diff|=5.461e-04, SNR=5, modulation=BPSK
    BER=2.250e-03, theory=2.388e-03, |diff|=1.383e-04, SNR=6, modulation=BPSK
    BER=8.000e-04, theory=7.727e-04, |diff|=2.733e-05, SNR=7, modulation=BPSK
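
The theory column above is consistent with the textbook BPSK result over AWGN, BER = 0.5 * erfc(sqrt(Eb/N0)), if the printed SNR is read as Eb/N0 in dB (for example, 7.865e-02 at 0 dB). The following sketch reproduces that comparison in plain Python. It is a simplified stand-in for the PyTorch modem in rfml.ptradio: it omits upsampling, pulse shaping, and matched filtering, and all names are hypothetical.

```python
import math
import random


def theoretical_bpsk_ber(ebn0_db):
    """BER of coherent BPSK over AWGN: 0.5 * erfc(sqrt(Eb/N0))."""
    ebn0 = 10 ** (ebn0_db / 10)
    return 0.5 * math.erfc(math.sqrt(ebn0))


def simulate_bpsk_ber(ebn0_db, n_bits, rng):
    """Map bits to +/-1, add real Gaussian noise, slice at zero, count errors."""
    ebn0 = 10 ** (ebn0_db / 10)
    sigma = math.sqrt(1.0 / (2.0 * ebn0))  # noise std for unit-energy symbols
    errors = 0
    for _ in range(n_bits):
        bit = rng.getrandbits(1)
        symbol = 1.0 if bit else -1.0              # symbol mapping
        received = symbol + rng.gauss(0.0, sigma)  # AWGN channel
        decided = 1 if received > 0 else 0         # hard decision
        errors += int(decided != bit)
    return errors / n_bits


rng = random.Random(0)
for snr_db in range(0, 8):
    ber = simulate_bpsk_ber(snr_db, 20000, rng)
    theory = theoretical_bpsk_ber(snr_db)
    print("BER={:.3e}, theory={:.3e}, SNR={}, modulation=BPSK".format(ber, theory, snr_db))
```

With only 20000 bits per point, the simulated values at the higher SNRs rest on a handful of errors, so expect them to scatter around the theory curve.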

Using EVM as a Loss Function

The Error Vector Magnitude (EVM) of the symbols can be used as a loss function as well. The following code snippet (located at examples/evm_loss.py) presents a silly, minimalist example of its use. In this code, a transmit/receive chain is constructed (see PyTorch Implementation of Linear Modulations) and the transmitted symbols are learned from some target received symbols.

The code may be better understood through a diagram.

Overview of simplistic example for utilizing symbol (EVM) loss

If the above code is executed, an output similar to the following should be observed.

> python3 examples/evm_loss.py
    Loss @ epoch 0: 1.700565
    Loss @ epoch 15: 1.455332
    Loss @ epoch 30: 1.062061
    Loss @ epoch 45: 0.700792
    Loss @ epoch 60: 0.422401
    Loss @ epoch 75: 0.220447
    Loss @ epoch 90: 0.102916
    Loss @ epoch 105: 0.044921
    Loss @ epoch 120: 0.021536
    Loss @ epoch 135: 0.006125
    Loss @ epoch 150: 0.004482

This may also be better understood through an animation.

Animation of utilizing symbol (EVM) loss
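
As a rough illustration of learning symbols by minimizing an EVM loss, the sketch below runs gradient descent on a set of symbols until they approach a QPSK target. It assumes an ideal channel (received symbols equal transmitted symbols) and defines EVM as the mean magnitude of the per-symbol error vector. Both are simplifying assumptions, and none of the names come from the rfml library.

```python
import math


def evm_loss(received, target):
    """Mean error vector magnitude between two equal-length lists of complex symbols."""
    return sum(abs(r - t) for r, t in zip(received, target)) / len(target)


def evm_gradient(received, target):
    """Gradient of evm_loss w.r.t. each received symbol (I and Q packed as a complex)."""
    n = len(target)
    grads = []
    for r, t in zip(received, target):
        err = r - t
        mag = abs(err)
        # The derivative of |err| is the unit vector along err; it is undefined at 0.
        grads.append(err / (mag * n) if mag > 0 else 0j)
    return grads


# QPSK constellation as the target, with an arbitrary starting guess.
target = [complex(i, q) / math.sqrt(2) for i in (1, -1) for q in (1, -1)]
symbols = [0.1 + 0j] * len(target)
initial_loss = evm_loss(symbols, target)
learning_rate = 0.4

for epoch in range(50):
    grads = evm_gradient(symbols, target)
    symbols = [s - learning_rate * g for s, g in zip(symbols, grads)]

final_loss = evm_loss(symbols, target)
print(initial_loss, final_loss)
```

Because the gradient of a magnitude has constant length, plain gradient descent takes fixed-size steps and ends up hovering within one step of the target. An adaptive optimizer or a decaying learning rate would settle closer.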

Spectral Mask as a Loss Function

Nearly all communications systems are frequency limited; therefore, it can be helpful to have a component of the loss function that penalizes the use of spectrum. The following simple example (located at examples/spectral_loss.py) demonstrates filtering a signal to adhere to a spectral mask. By itself, it isn't useful, as its performance is far worse than that of a standard digital filter; however, it can be incorporated into a larger machine learning workflow.

It may be easier to understand the above code with a diagram.

Overview of simplistic example for utilizing spectral loss

If the example is run, an output similar to the following will be displayed.

> python3 examples/spectral_loss.py
    Loss @ epoch 0: 20.610109
    Loss @ epoch 15: 1.159350
    Loss @ epoch 30: 0.206273
    Loss @ epoch 45: 0.039206
    Loss @ epoch 60: 0.007379
    Loss @ epoch 75: 0.001740
    Loss @ epoch 90: 0.000586
    Loss @ epoch 105: 0.000301
    Loss @ epoch 120: 0.000195
    Loss @ epoch 135: 0.000145
    Loss @ epoch 150: 0.000117

This, again, may be more easily understood through an animation.

Animation of utilizing spectral loss

Clearly, the loss function does a great job of initially killing the out-of-band energy to comply with the provided spectral mask; however, it only achieves ~20 dB of attenuation, whereas a digital filter could achieve much greater out-of-band attenuation.
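
One way to picture a spectral-mask penalty is as the average amount by which the signal's power spectrum exceeds the mask. The sketch below computes that quantity in plain Python. The peak normalization, the dB floor, and the mask shape are assumptions chosen for illustration; this is not the loss used in examples/spectral_loss.py.

```python
import cmath
import math


def power_spectrum_db(signal):
    """Power per DFT bin in dB, normalized so the strongest bin is 0 dB."""
    n = len(signal)
    powers = []
    for k in range(n):
        acc = sum(x * cmath.exp(-2j * math.pi * k * m / n) for m, x in enumerate(signal))
        powers.append(abs(acc) ** 2)
    peak = max(powers) or 1.0  # avoid dividing by zero for an all-zero signal
    # Floor at -120 dB so that empty bins do not produce log10(0).
    return [10 * math.log10(max(p / peak, 1e-12)) for p in powers]


def spectral_mask_loss(signal, mask_db):
    """Mean amount (in dB) by which the spectrum exceeds the mask; zero when compliant."""
    spectrum = power_spectrum_db(signal)
    excess = [max(0.0, s - m) for s, m in zip(spectrum, mask_db)]
    return sum(excess) / len(excess)


def tone(bin_index, n):
    """Complex exponential that falls exactly on one DFT bin."""
    return [cmath.exp(2j * math.pi * bin_index * m / n) for m in range(n)]


n = 32
# Hypothetical mask: 0 dB allowed near DC (bins 0-3 and 29-31), -20 dB elsewhere.
mask_db = [0.0 if (k <= 3 or k >= 29) else -20.0 for k in range(n)]

in_band_loss = spectral_mask_loss(tone(2, n), mask_db)
out_of_band_loss = spectral_mask_loss(tone(10, n), mask_db)
print(in_band_loss, out_of_band_loss)
```

The in-band tone incurs no penalty, while the out-of-band tone exceeds the mask by 20 dB in one of 32 bins. In a differentiable framework the same quantity could be minimized by gradient descent.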

Executing Unit Tests

Run the following from the root folder of the repository.

python3 -m pytest

Documentation

The documentation is a relatively simplistic Sphinx API rendering hosted within the repository by GitHub pages. It can be accessed at brysef.github.io/rfml.

Tutorial

This code was released in support of a tutorial offered at MILCOM 2019 (Adversarial Radio Frequency Machine Learning (RFML) with PyTorch). While the code contained in the library can be applied more broadly, the tutorial was focused on adversarial evasion attacks and defenses on deep learning enabled signal classification systems. The learning objectives and course outline of that tutorial are provided below. Of particular interest, three Jupyter Notebooks are included that demonstrate how to: train an Automatic Modulation Classification Neural Network, evade signal classification with the Fast Gradient Sign Method, and perform adversarial training.

Learning Objectives

Through this tutorial, the attendee will be introduced to the following concepts:

  1. Applications of RFML
  2. The PyTorch toolkit for developing RFML solutions
    • (Hands-On Exercise) Train, validate, and test a simple neural network for spectrum sensing
    • Advanced PyTorch concepts (such as custom loss functions and modules to support advanced digital signal processing functions)
  3. Adversarial machine learning applied to RFML
    • Overview of current state-of-the-art in adversarial RFML
    • (Hands-On Exercise) Develop an adversarial evasion attack against a spectrum sensing network (created by the attendee) using the well-known Fast Gradient Sign Method (FGSM) algorithm
    • Overview of hardening techniques against adversarial RFML
    • (Hands-On Exercise) Utilize adversarial training to harden an RFML model

Format

The primary objective of the tutorial is for the attendee to be hands-on with the code. Therefore, while a lot of information is presented in slide format, the core of the tutorial is code execution through prepared Jupyter Notebooks executed in Google Colaboratory. In the modules listed below, you can click on the solutions notebook to view a pre-run Jupyter Notebook that is rendered by GitHub, or click on Open in Colab to open an executable version in Google Colaboratory. Note that when opening Google Colaboratory you should either enable the GPU Hardware Accelerator or disable the GPU flag in the notebooks (this will make execution very slow).

Modules

| # | Time | Description | Notes/Solutions/Exercises |
|---|------|-------------|---------------------------|
| 0 | 10m | Introduction: Provide an overview of RFML with a focus on signal classification. | |
| 1 | 10m | Tutorial Objectives and Software Tools: Describe the skills that will be learned in this tutorial and introduce the format and software tools utilized for the hands-on exercises. | |
| 2 | 20m | Train/Evaluate a DNN for AMC: Train and validate a DNN using a static dataset of raw IQ data to perform an automatic modulation classification (AMC) task. After training, the performance of the network will be evaluated as a function of SNR and an averaged confusion matrix of all possible classes. | Open Solutions Notebook: Train/Evaluate a DNN for AMC; Open Notebook in Colab: Train/Evaluate a DNN for AMC |
| 3 | 15m | Adversarial RF Machine Learning: Provide an overview of adversarial machine learning techniques and how they uniquely apply to RFML. In particular, focus on adversarial evasion attacks and the well-known FGSM algorithm. | |
| 4 | 20m | Evade Signal Classification with FGSM: Develop a white-box, digital, adversarial evasion attack against a trained AMC DNN using the FGSM algorithm. | Open Solutions Notebook: Evade Signal Classification with FGSM; Open Notebook in Colab: Evade Signal Classification with FGSM |
| 5 | 15m | Physical Adversarial RF Machine Learning: Many adversarial ML techniques in the literature focus on attacks that have digital access to the classifier input; however, the primary vulnerability of RFML is to physical attacks, which are transmitted over-the-air and thus perturbations are subject to natural noise and impact their intended receiver. | |
| 6 | 15m | Hardening RFML Against Adversarial Evasion: Provide an overview of techniques by which to harden deep learning solutions against adversarial evasion attacks. In particular, study the unique defense techniques that have been proposed in RFML for both detecting adversarial examples and being robust to those adversarial examples (by still correctly classifying them). | |
| 7 | 20m | Adversarial Training: Train a DNN, with portions of the training inputs being adversarial examples generated from FGSM on the fly, in order to gain more robustness against an FGSM attack. | Open Solutions Notebook: Adversarial Training; Open Notebook in Colab: Adversarial Training |
| 9 | 10m | Conclusion: Summary of current state of adversarial RFML, the proposed next steps for research, and immediate actions to ensure robust RFML devices. | |
| 10 | 20m | Advanced Topics in PyTorch: "Expert" filters, channel models, and custom loss functions for RF. | |
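
The adversarial training module above describes replacing a portion of the training inputs with FGSM examples generated on the fly. The loop below sketches that idea on a deliberately tiny problem: a two-feature logistic-regression classifier trained with SGD in plain Python. The model, data, and hyperparameters are all hypothetical stand-ins for the DNN and IQ dataset used in the notebooks.

```python
import math
import random


def sigmoid(z):
    """Logistic function with the input clamped to avoid overflow."""
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))


def predict(w, b, x):
    """Probability of class 1 under the linear model."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def fgsm_example(w, b, x, y, eps):
    """x + eps * sign(dLoss/dx) for binary cross-entropy on the linear model."""
    err = predict(w, b, x) - y  # dLoss/dz; dLoss/dx_i = err * w_i
    return [xi + eps * math.copysign(1.0, err * wi) if err * wi != 0 else xi
            for xi, wi in zip(x, w)]


def adversarial_train(data, eps, adv_fraction, epochs, lr, rng):
    """SGD where a fraction of inputs is replaced by FGSM examples crafted on the fly."""
    data = list(data)  # shuffle a copy, not the caller's list
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            if rng.random() < adv_fraction:
                x = fgsm_example(w, b, x, y, eps)  # attack the current weights
            err = predict(w, b, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def accuracy(w, b, data, eps=0.0):
    """Accuracy on clean inputs (eps=0) or on FGSM-perturbed inputs (eps>0)."""
    hits = 0
    for x, y in data:
        if eps > 0:
            x = fgsm_example(w, b, x, y, eps)
        hits += int((predict(w, b, x) > 0.5) == bool(y))
    return hits / len(data)


rng = random.Random(0)
data = [([rng.gauss(c, 0.5), rng.gauss(c, 0.5)], y)
        for y, c in ((0, -2.0), (1, 2.0)) for _ in range(100)]
w, b = adversarial_train(data, eps=0.5, adv_fraction=0.5, epochs=20, lr=0.1, rng=rng)
clean_acc = accuracy(w, b, data)
adv_acc = accuracy(w, b, data, eps=0.5)
print(clean_acc, adv_acc)
```

The key detail is that each adversarial example is crafted against the weights as they are at that training step, so the attack tracks the model as it changes.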

Contributing

If you find any errors, feel free to open an issue; though I can't guarantee how quickly it will be looked at. Pull requests are accepted though 😃! There isn't an extensive contribution guideline, but, please follow the GitHub Flow.

In particular, ensure that you've:

  • written a passing unit test (that would have failed before)
  • formatted the code with black
  • re-built the documentation (if applicable)
  • adequately described why the change was needed (if a bug) or what the change does (if a new feature)

If you've open sourced your own work in machine learning for wireless communications, feel free to drop me a note to be added to the related projects!

  • MeysamSadeghi/Security of DL in Wireless: Attacks on Physical Layer Auto-Encoders in TensorFlow
  • RadioML/Examples: Automatic Modulation Classification using Keras
  • RadioML/Dataset: Recreate the RML Synthetic Datasets using GNU Radio
  • immortal3/AutoEncoder Communication: TensorFlow implementation of "An Introduction to Deep Learning for the Physical Layer"
  • Tensorflow/Cleverhans: Library for adversarial machine learning attacks and defenses with support for Tensorflow (support for other frameworks coming soon) -- This repository also contains tutorials for adversarial machine learning
  • BethgeLab/Foolbox: Library for adversarial machine learning attacks with support for PyTorch, Keras, and TensorFlow
  • MadryLab/robustness: Adversarial training library built with PyTorch.
  • FastAI: An extensive deep learning library along with tutorials built on top of PyTorch
  • PyTorch: The PyTorch library itself comes with excellent documentation and tutorials

License

This project is licensed under the BSD 3-Clause License -- See LICENSE.rst for more details.

Citing this Repository

This repository contains implementations of other folks' algorithms (e.g. adversarial attacks, neural network architectures, dataset wrappers, etc.) and therefore, whenever those algorithms are used, their respective works must be cited. The relevant citations for their works have been provided in the docstrings when needed. Since this repository isn't the official code for any publication, you take responsibility for the correctness of the implementations (although we've made every effort to ensure that the code is well tested).

If you find this code useful for your research, please consider referencing it in your work so that others are aware. This repository isn't citable (since that requires archiving and creating a DOI), so a simple footnote would be the best way to reference this repository.

\footnote{Code is available at \textit{github.com/brysef/rfml}}

If your work specifically revolves around adversarial machine learning for wireless communications, consider citing my journal publication (on FGSM physical adversarial attacks for wireless communications) or MILCOM conference paper (on adding communications loss to adversarial attacks).

@article{Flowers2019a,
        author = {B. {Flowers} and R. M. {Buehrer} and W. C. {Headley}},
        doi = {10.1109/TIFS.2019.2934069},
        issn = {1556-6013},
        journal = {IEEE Transactions on Information Forensics and Security},
        month = {},
        number = {},
        pages = {1-1},
        title = {Evaluating Adversarial Evasion Attacks in the Context of Wireless Communications},
        volume = {},
        year = {2019}
}
@inproceedings{Flowers2019b,
        author = {B. {Flowers} and R. M. {Buehrer} and W. C. {Headley}},
        booktitle = {MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM)},
        doi = {10.1109/MILCOM47813.2019.9020716},
        issn = {2155-7578},
        keywords = {Perturbation methods;Transmitters;Receivers;Machine learning;Bit error rate;Modulation;Neural networks},
        month = {Nov},
        number = {},
        pages = {133-140},
        title = {Communications Aware Adversarial Residual Networks for Over the Air Evasion Attacks},
        volume = {},
        year = {2019}
}

Authors

Bryse Flowers PhD student at UCSD [email protected]
William C. Headley Associate Director of Electronic Systems Laboratory, Hume Center / Research Assistant Professor ECE Virginia Tech [email protected]

Numerous others have generously contributed to this work -- see CONTRIBUTORS.rst for more details.

rfml's Issues

[Question] Any slides available?

Hi, this seems like it was a great tutorial section. The notebooks are very helpful, but do you happen to have any slides that include the intro portion of the talk?

[Question] ValueError: columns cannot be a set

I have downloaded the dataset, but encountered the following error when importing the dataset. How can I fix it?


ValueError Traceback (most recent call last)
Cell In[22], line 1
----> 1 dataset = load_RML201610A_dataset(path=data_path)
2 print(len(dataset))
3 pprint(dataset.get_examples_per_class())

File G:\xxxcode\rfml-master\notebooks..\rfml\data\converters\rml_2016.py:129, in load_RML201610A_dataset(path)
121 UNPICKLED_PATH = "RML2016.10a_dict.pkl"
123 loader = RML2016DataLoader(
124 cache_path=CACHE_PATH,
125 remote_url=REMOTE_URL,
126 unpickled_path=UNPICKLED_PATH,
127 warning_msg=WARNING_MSG,
128 )
--> 129 return loader.load(path=path)

File G:\xxxcode\rfml-master\notebooks..\rfml\data\converters\rml_2016.py:35, in RML2016DataLoader.load(self, path)
30 if not os.path.exists(path):
31 raise ValueError(
32 "If path is provided, it must actually exist. Provided path: "
33 "{}".format(path)
34 )
---> 35 return self._load_local(path=path)
37 # If this function has previously been called before to fetch the dataset from the
38 # remote, then it will have already been cached locally and unpickled.
39 if os.path.exists(self.UNPICKLED_PATH):

File G:\xxxcode\rfml-master\notebooks..\rfml\data\converters\rml_2016.py:53, in RML2016DataLoader._load_local(self, path)
51 for iq in data[(mod, snr)]:
52 builder.add(iq=iq, Modulation=mod, SNR=snr)
---> 53 return builder.build()

File G:\xxxcode\rfml-master\notebooks..\rfml\data\dataset_builder.py:156, in DatasetBuilder.build(self)
150 def build(self) -> Dataset:
151 """Build the Dataset based on the examples that have been added.
152
153 Returns:
154 Dataset: A compiled dataset consisting of the added examples.
155 """
--> 156 df = pd.DataFrame(self._rows, columns=self._keys)
157 return Dataset(df)

File G:\Users\xxx\anaconda3\envs\LYY2\Lib\site-packages\pandas\core\frame.py:675, in DataFrame.__init__(self, data, index, columns, dtype, copy)
673 raise ValueError("index cannot be a set")
674 if columns is not None and isinstance(columns, set):
--> 675 raise ValueError("columns cannot be a set")
677 if copy is None:
678 if isinstance(data, dict):
679 # retain pre-GH#38939 default behavior

ValueError: columns cannot be a set
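
Judging only from the traceback, DatasetBuilder.build passes self._keys as the columns argument, and the installed pandas version rejects a set there. A possible workaround, not confirmed by the maintainers, is to convert the keys to a list before building the DataFrame. The sketch below shows the conversion with hypothetical stand-ins for the builder's state; it assumes the column names are strings and that each row is a dict keyed by column name.

```python
def ordered_columns(keys):
    """Turn any iterable of column names (including a set) into a sorted list."""
    return sorted(keys)


# Hypothetical stand-ins for the builder's internal state seen in the traceback.
rows = [{"SNR": 10, "Modulation": "BPSK"}, {"SNR": -4, "Modulation": "QPSK"}]
keys = {"Modulation", "SNR"}

columns = ordered_columns(keys)
print(columns)  # ['Modulation', 'SNR']

# With pandas installed, the list can be passed where the set was rejected:
#   import pandas as pd
#   df = pd.DataFrame(rows, columns=columns)
```

Sorting also gives a deterministic column order, which set iteration does not guarantee.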

pre-channel FGSM

Hi Bryse,
I've added some for loops to your adversarial_evasion.py example to produce the plot below, and it seems like you're digitally inserting the FGSM attack directly into the data the eavesdropper receives, not adding it to your transmission pre-channel as in the paper this tutorial is associated with. Is there a simple way to do this with your API?
Best,
Kyle

acc
