
Dat Quoc Nguyen is a Senior Research Scientist and the Head of the Natural Language Processing department at VinAI Research, Vietnam. He was an Honorary Fellow in the School of Computing and Information Systems at the University of Melbourne, Australia, where he had previously been a Research Fellow. Before that, he received his Ph.D. in Computer Science from Macquarie University, Australia.

Dat Quoc Nguyen is the author of 70 peer-reviewed publications covering core NLP problems, ML methods for NLP, and their applications to low-resource languages and specific domains, with over 5000 citations and an h-index of 33 (Google Scholar). He has released many ML/NLP toolkits and datasets, which are widely used in both academia and industry. He has also created large language models and other foundation models, including PhoGPT, PhoBERT, BARTpho, XPhoneBERT and BERTweet, with millions of downloads.

Dat Quoc Nguyen's Projects

bioposdep

Tokenization, sentence segmentation, POS tagging and dependency parsing for biomedical texts (BMC Bioinformatics 2019)

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

jldadmm

A Java package for the LDA and DMM topic models

jointre

End-to-end neural relation extraction using deep biaffine attention (ECIR 2019)

jptdp

Neural network models for joint POS tagging and dependency parsing (CoNLL 2017-2018)

lftm

Improving topic models LDA and DMM (one-topic-per-document model for short texts) with word embeddings (TACL 2015)

map4lda

Improving Topic Coherence with Latent Feature Word Representations in MAP Estimation for Topic Modeling (ALTA 2015)

phow2v

Pre-trained Word2Vec syllable- and word-level embeddings for Vietnamese

rdrpostagger

A fast and accurate POS and morphological tagging toolkit (EACL 2014)

rdrsegmenter

A Fast and Accurate Vietnamese Word Segmenter (LREC 2018)

stranse

STransE: a novel embedding model of entities and relationships in knowledge bases (NAACL 2016)

transe-nmm

Neighborhood Mixture Model for Knowledge Base Completion (CoNLL 2016)

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

vndt

VnDT: A Vietnamese Dependency Treebank

vnmarmot

A state-of-the-art pre-trained model for Vietnamese POS tagging (ALTA 2017)
