T Cell subsets neural-network

This neural network predicts the effector/memory T cell (T lymphocyte) subset of single cells from flow cytometry data. The model was trained to classify each flow cytometry event as one of the following: CD4+ Tcm, CD4+ Tem, CD4+ Temra, CD4+ Th0, CD8+ Tcm, CD8+ Tem, CD8+ Temra, CD8+ Th0, a non-CD4+/CD8+ living cell, or cell debris. On the test dataset the model reached 99.22% accuracy: of the 1,361,486 events in the test dataset, 10,606 were predicted incorrectly.
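As a quick check, the reported accuracy follows directly from the two event counts. A minimal sketch in Python (the variable names are illustrative and not taken from the repository):

```python
# Event counts quoted in the text above.
total_events = 1_361_486
wrong_events = 10_606

# Accuracy is the fraction of events whose predicted subset matched the label.
accuracy = (total_events - wrong_events) / total_events
accuracy_percent = round(accuracy * 100, 2)

print(f"accuracy: {accuracy_percent}%")
```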

(Tcm: central memory T cell, Tem: effector memory T cell, Th0: naive T cell and Temra: effector memory T cell expressing CD45RA)

Brief explanation of flow cytometry:

Flow cytometry is a technology that measures physical characteristics of single cells using one or more lasers. For each individual cell, the visible-light scattering and fluorescence parameters can be detected. Forward scatter (FSC) and side scatter (SSC) correlate with the size and granularity of a cell, respectively. Furthermore, cells can be stained with antibodies that are specific for a cell type and conjugated to a fluorochrome. Staining cell populations with different marker antibodies, combined with flow cytometry, makes it possible to distinguish cell types/subsets. By analysing the resulting data, the cell types/subsets can be quantified, for example in a blood sample containing a large number of different cell types. Manual analysis of large flow cytometry datasets, by manually gating cell populations, can be a time-consuming task. This neural network was created to try to automate that analysis, which could save a significant amount of time when working with large and complex data.
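To make the idea of gating concrete, here is a minimal sketch of a single rectangular gate on forward and side scatter. It assumes events are plain dictionaries; the `RectangleGate` class, the threshold values and the example events are all hypothetical and are not taken from the repository. Real gates are usually drawn as polygons in analysis software and applied in sequence.

```python
from dataclasses import dataclass


@dataclass
class RectangleGate:
    """A rectangular gate on two parameters (thresholds are hypothetical)."""
    x_name: str
    y_name: str
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def contains(self, event: dict) -> bool:
        """Return True if the event lies inside the rectangle."""
        x, y = event[self.x_name], event[self.y_name]
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


# Made-up events; real data would come from an exported CSV file.
events = [
    {"FSC_A": 52_000.0, "SSC_A": 18_000.0},
    {"FSC_A": 4_000.0, "SSC_A": 2_500.0},  # low scatter, e.g. debris
    {"FSC_A": 61_000.0, "SSC_A": 22_000.0},
]

# Hypothetical gate bounds, chosen only so the example has something to exclude.
scatter_gate = RectangleGate("FSC_A", "SSC_A", 30_000, 90_000, 5_000, 40_000)
gated = [event for event in events if scatter_gate.contains(event)]

print(f"{len(gated)} of {len(events)} events fall inside the gate")
```

Repeating this by hand for every marker combination and every sample is the step the model is meant to replace.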

Dataset and model:

The dataset used to train and test this model was made by manually gating flow cytometry data from the public FlowRepository database. From the manually analysed data, CSV files were extracted and the cells were labeled with the corresponding cell subsets (see the flowcyto_data_preperation Jupyter notebook). The datasets for the different cell types were then concatenated, and train and test datasets were generated from the result (traintest Jupyter notebook).
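The label, concatenate and split steps described above could look roughly like the following sketch. It uses only the Python standard library and toy in-memory rows instead of the real CSV exports; the helper names `label_events` and `train_test_split` are hypothetical, and the notebooks may do this differently. The 20% test fraction is an assumption, although it is consistent with the event counts quoted in this README.

```python
import random


def label_events(events, subset):
    """Return copies of the events with a 'label' field set to the subset name."""
    return [{**event, "label": subset} for event in events]


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows reproducibly and split them into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Toy stand-ins for the per-subset CSV exports (values are made up).
cd4_tcm = [{"CD4": 0.9, "CD8": 0.1} for _ in range(6)]
cd8_tem = [{"CD4": 0.1, "CD8": 0.8} for _ in range(4)]

# Label each subset, then concatenate everything into one dataset.
dataset = label_events(cd4_tcm, "CD4+ Tcm") + label_events(cd8_tem, "CD8+ Tem")
train, test = train_test_split(dataset)

print(len(train), len(test))
```

Shuffling before the split matters here because the concatenated dataset is ordered by subset; without it, the test set would contain only one cell type.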

The neural network was trained on 12 parameters: FSC_A, FSC_H, FSC_W, SSC_A, SSC_H, SSC_W, CD4, CD5, CD8, CD197, CD45RA and Live/Dead. Each row in the dataset corresponds to one event (one cell or debris/doublet). The model was trained on 5,445,944 such events. Various hyperparameters, such as the learning rate, batch size, dropout layers and dense layer units, were tweaked and tested before arriving at the current model.
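To illustrate the shape of such a classifier (12 input parameters, 10 output classes), here is a pure-Python sketch of a forward pass through a small fully connected network. This is not the trained model: the single hidden layer of 32 units, the ReLU and softmax activations, the class order and the random weights are all assumptions made for illustration. Dropout is left out because it only applies during training.

```python
import math
import random

FEATURES = ["FSC_A", "FSC_H", "FSC_W", "SSC_A", "SSC_H", "SSC_W",
            "CD4", "CD5", "CD8", "CD197", "CD45RA", "Live/Dead"]
# Class order is hypothetical; the real label encoding may differ.
CLASSES = ["CD4+ Tcm", "CD4+ Tem", "CD4+ Temra", "CD4+ Th0",
           "CD8+ Tcm", "CD8+ Tem", "CD8+ Temra", "CD8+ Th0",
           "other living cell", "debris"]
HIDDEN_UNITS = 32  # hypothetical layer size

rng = random.Random(0)


def random_layer(n_in, n_out):
    """Create placeholder weights and zero biases for a dense layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(inputs, layer):
    """Compute weights @ inputs + biases for one dense layer."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


hidden_layer = random_layer(len(FEATURES), HIDDEN_UNITS)
output_layer = random_layer(HIDDEN_UNITS, len(CLASSES))


def predict(event):
    """Return the most probable class name and all class probabilities."""
    x = [event[name] for name in FEATURES]
    probabilities = softmax(dense(relu(dense(x, hidden_layer)), output_layer))
    best = max(range(len(CLASSES)), key=probabilities.__getitem__)
    return CLASSES[best], probabilities


event = {name: 0.5 for name in FEATURES}  # made-up, already scaled values
label, probabilities = predict(event)
print(label, round(sum(probabilities), 6))
```

With random weights the predicted label is meaningless; the point is only that each event becomes a 12-value vector and leaves the network as a probability distribution over the 10 classes.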

While this neural network is not perfect at predicting the cell type of each single cell, it gives a good illustration of how neural networks can be used to speed up flow cytometry analysis. As manual analysis of flow cytometry data is also not perfect, I think that one of the difficulties in further improving this model lies in the training dataset, which was generated by manual gating. Therefore, the next step would be to generate a training dataset based on automated cell clustering and to train the model on that dataset.

Contributors:

menno-meijer
