
moleculenet's Introduction

Hi there 👋

I am a PhD student in Health Data Science at Oxford, supervised by Professor Jens Rittscher and funded by Professor Fergus Gleeson. I focus on applications of Computer Vision 👀💻 to improving the diagnosis and treatment of patients with lung cancer as part of the DART lung health project (see my role in the project).

February 2024: my first main conference paper (pre-print 📝, code 💻) was accepted to the ISBI 2024 conference! 🚀 In our work "Accurate Subtyping of Lung Cancers by Modelling Class Dependencies", we (1) construct a weakly-supervised multi-label lung cancer histology dataset from three public datasets (TCGA, TCIA-CPTAC, DHMC) and one in-house dataset (DART), and (2) propose a class-dependency injection method that allows learning robust bag representations suitable for multi-label problems under weakly-supervised settings. The dataset creation, model building, and training code is available in the dependency-mil repository.

September 2022: my first workshop paper 📝 (pre-print 📝, code 💻) was published at the MICCAI 2022 CaPTion workshop! 🚀 In our work "Active Data Enrichment by Learning What to Annotate in Digital Pathology", we (1) proposed a new comprehensive annotation protocol for lung cancer pathology, (2) proposed a new metric for comparing how well retrieval methods can prioritize examples from underrepresented classes, and (3) demonstrated that annotating the top-ranked examples and adding them to the training set improves algorithm performance more than annotating and adding random examples. Links: published paper, open-access paper, code.

December 2020: my first mini-conference working notes paper 📝 (code 💻) was published at the MediaEval 2020 Multimedia Benchmark workshop 🚀. In our work "Real-Time Polyp Segmentation Using U-Net with IoU Loss", we explored how combining differentiable IoU and BCE (binary cross-entropy) losses affects segmentation performance, measured by mean IoU and Dice score, when training a simple U-Net. Links: published open-access paper, code.
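
For readers unfamiliar with the idea, here is a minimal sketch of what a combined loss of this kind can look like. It is not the code from the paper: the function names, the equal default weights, and the `eps` smoothing term are assumptions made for illustration, and it works on flattened masks in plain Python, whereas actual training would use a tensor library with automatic differentiation.

```python
import math


def soft_iou_loss(probs, targets, eps=1e-7):
    """Return 1 - soft IoU, computed on probabilities rather than thresholded masks."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    union = sum(probs) + sum(targets) - intersection
    # eps (an assumed smoothing constant) avoids division by zero on empty masks.
    return 1.0 - (intersection + eps) / (union + eps)


def bce_loss(probs, targets, eps=1e-7):
    """Return the mean binary cross-entropy over all pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log() stays finite
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def combined_loss(probs, targets, iou_weight=1.0, bce_weight=1.0):
    """Weighted sum of both losses; the weights are illustrative, not from the paper."""
    return iou_weight * soft_iou_loss(probs, targets) + bce_weight * bce_loss(probs, targets)


# Flattened 2x2 mask: predicted foreground probabilities vs. ground truth.
predicted = [0.9, 0.2, 0.8, 0.1]
ground_truth = [1, 0, 1, 0]
print(combined_loss(predicted, ground_truth))
```

Because the soft IoU term is built from sums and products of probabilities, it stays differentiable, which is what allows it to be used directly as a training objective next to BCE.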


Public histology data sources. If you also want to start working with histopathology images, but do not have your own data or are still waiting for it, consider starting with the "Dartmouth Lung Cancer Histology Dataset" (DHMC), "The Cancer Genome Atlas" (TCGA), and "The Cancer Imaging Archive" (TCIA-CPTAC). Downloading large volumes of data is not a trivial task, so I documented my process in TCGA-lung-histology-download and TCIA-CPTAC-lung-histology-download.

Public natural image sources. If you lack medical data, you can also simulate parts of your future workflow on natural images: for example, classifying medical images for the presence or absence of particular patterns can be similar to classifying natural images for the presence or absence of particular objects. I used images from the COCO dataset. You can see my work here: GeorgeBatch/cocoapi.
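
As a rough illustration of the presence/absence framing, the sketch below turns a COCO-style annotation dictionary into one multi-hot label per image. It is not taken from GeorgeBatch/cocoapi: the helper name `presence_labels` and the toy data are hypothetical, and only the standard top-level COCO keys (`images`, `annotations`, `categories`) are assumed.

```python
def presence_labels(coco, category_names):
    """Map each image id to a multi-hot list: 1 if the category appears in the image, else 0."""
    name_to_id = {cat["name"]: cat["id"] for cat in coco["categories"]}
    wanted_ids = [name_to_id[name] for name in category_names]

    # Start with an empty set for every image so that images with no
    # annotations still receive an all-zero label.
    present = {img["id"]: set() for img in coco["images"]}
    for ann in coco["annotations"]:
        present[ann["image_id"]].add(ann["category_id"])

    return {
        image_id: [int(cat_id in cats) for cat_id in wanted_ids]
        for image_id, cats in present.items()
    }


# Toy annotation file contents (hypothetical data in the COCO layout).
toy_coco = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "categories": [{"id": 1, "name": "person"}, {"id": 18, "name": "dog"}],
    "annotations": [
        {"image_id": 1, "category_id": 1},
        {"image_id": 1, "category_id": 18},
        {"image_id": 1, "category_id": 1},
        {"image_id": 2, "category_id": 18},
    ],
}

labels = presence_labels(toy_coco, ["person", "dog"])
print(labels)
```

Several objects of the same category in one image collapse into a single 1, which mirrors asking whether a pattern is present in a medical image rather than how many times it occurs.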


Education


Here are some of the best free online resources to boost your ML/DL knowledge 🚀. I am currently working through them while skipping the repetitive parts ⏰

