Implementation of Protein Subcellular Localization
A Multi-scale Multi-model Deep Neural Network via Ensemble Strategy on High-throughput Microscopy Image for Protein Subcellular Localization
In this study, we propose a multi-scale, multi-model deep neural network with an ensemble strategy for protein subcellular localization on single-cell high-throughput images. First, we employ a deep convolutional neural network as a multi-scale feature extractor and use global average pooling to map the features extracted at different stages into feature vectors; we then concatenate these multi-scale features to form a multi-model structure for image classification. In addition, we add Squeeze-and-Excitation blocks to the network to emphasize the more informative features. Finally, we use an ensemble method to fuse the classification results from the multi-model structure and obtain the final subcellular location of each single-cell image. Experiments on yeast cell images demonstrate the effectiveness of our method: it significantly improves the accuracy of high-throughput microscopy image-based protein subcellular localization, achieving a classification accuracy of 0.9098 on high-throughput microscopy images of yeast cells. For protein subcellular localization, our method provides a framework for processing and classifying microscopy images and lays a foundation for the study of protein and gene functions.
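To make the pipeline in the abstract concrete, here is a minimal pure-Python sketch of two of its steps: global average pooling of feature maps from several stages followed by concatenation, and fusion of several models' class probabilities. The function names are hypothetical, and the fusion rule (a plain average of probabilities) is an assumption, since the abstract does not say which ensemble method is used. Squeeze-and-Excitation blocks are not covered here.

```python
def global_average_pool(feature_map):
    """Reduce a feature map (list of channels, each an H x W nested list)
    to one mean value per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def concat_multi_scale(feature_maps):
    """Pool each stage's feature map and concatenate the resulting vectors."""
    vector = []
    for feature_map in feature_maps:
        vector.extend(global_average_pool(feature_map))
    return vector


def ensemble_average(prob_vectors):
    """Assumed fusion rule: average the class probabilities of all models
    and return the fused vector together with the winning class index."""
    n_models = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    fused = [sum(p[i] for p in prob_vectors) / n_models for i in range(n_classes)]
    best_class = max(range(n_classes), key=fused.__getitem__)
    return fused, best_class


# Toy example: an early stage with two 2x2 channels, a late stage with one 1x1 channel.
stage_early = [[[1, 2], [3, 4]], [[0, 0], [0, 4]]]
stage_late = [[[7]]]
features = concat_multi_scale([stage_early, stage_late])

# Toy class probabilities from three hypothetical classifiers.
fused, best_class = ensemble_average([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.3, 0.6, 0.1]])
print(features, best_class)
```

In the real network these operations run on tensors inside the model; the nested lists only illustrate the arithmetic.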
T. Pärnamaa, L. Parts, Accurate classification of protein subcellular localization from high-throughput microscopy images using deep learning, G3: Genes, Genomes, Genetics 7 (5) (2017) 1385–1392. https://www.g3journal.org/content/7/5/1385.full.pdf, doi:10.1534/g3.116.033654.
dataset.py
python training.py
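import keras

# ResNet34, x_train and y_train are expected to come from this repository's model and dataset code.
model = ResNet34()
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(), metrics=['accuracy'])

# Save weights only when validation accuracy improves.
# Note: the file name records the training accuracy ({accuracy}), not the monitored metric.
metric = 'val_accuracy'
filepath = "/home/dingjiaqi/Program/deepyeast-master/deepyeast-master/deepyeast/weights/A-{accuracy:.3f}.hdf5"
checkpoint = keras.callbacks.ModelCheckpoint(filepath, monitor=metric, verbose=1, save_best_only=True, mode='max')

# Multiply the learning rate by 0.1 after 5 epochs without improvement, never going below 1e-5.
reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor=metric, factor=0.1, patience=5, cooldown=0, min_lr=1e-5)
callbacks_list = [checkpoint, reduce_lr]

batch_size = 32
epochs = 50
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          shuffle=True,
          validation_split=0.1,  # Keras holds out the last 10% of the arrays, taken before shuffling
          # validation_data=(x_val, y_val),  # alternative: pass an explicit validation set
          callbacks=callbacks_list)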
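The effect of the ReduceLROnPlateau settings above (factor=0.1, patience=5, min_lr=1e-5) can be illustrated with a simplified pure-Python simulation. This is a sketch of the general plateau rule, not Keras's implementation; the function name is hypothetical, and the starting rate of 1e-3 assumes the default learning rate of keras.optimizers.Adam().

```python
def simulate_reduce_lr(val_accuracies, initial_lr=1e-3, factor=0.1,
                       patience=5, min_lr=1e-5, min_delta=1e-4):
    """Return the learning rate in effect after each epoch (simplified rule)."""
    lr = initial_lr
    best = float("-inf")
    wait = 0
    history = []
    for acc in val_accuracies:
        if acc > best + min_delta:
            # Validation accuracy improved: remember it and reset the counter.
            best = acc
            wait = 0
        else:
            wait += 1
            if wait >= patience and lr > min_lr:
                # Plateau detected: shrink the rate, but never below min_lr.
                lr = max(lr * factor, min_lr)
                wait = 0
        history.append(lr)
    return history


# A run whose validation accuracy never improves after the first epoch.
lr_history = simulate_reduce_lr([0.5] * 16)
print(lr_history)
```

Under these assumptions the rate can drop at most twice (1e-3 to 1e-4 to 1e-5) before it reaches min_lr, so later plateaus within the 50 epochs no longer change it.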
# predict and output the performance on each class
python predict.py resnet weights/weight_name.hdf5
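As an illustration of what "performance on each class" means, the following sketch computes per-class accuracy (the fraction of samples of each true class that were predicted correctly) from integer labels. It is not taken from predict.py, whose actual output format is not shown here; the function name and the toy labels are hypothetical.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Map each true class label to the fraction of its samples predicted correctly."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / total[label] for label in sorted(total)}


# Toy labels for three classes.
y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 0, 2]
class_accuracy = per_class_accuracy(y_true, y_pred)
print(class_accuracy)
```

With one-hot targets, as used with categorical_crossentropy, the integer labels would first be obtained by taking the argmax of each row.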
# extract features and visualize them with t-SNE
python feature_extration