

Artificial Intelligence for Business Research (Spring 2024)

Teaching Team

  • Instructor: Renyu (Philip) Zhang, Associate Professor, Department of Decisions, Operations and Technology, CUHK Business School, [email protected], @911 Cheng Yu Tung Building.
  • Teaching Assistant: Leo Cao, Full-time TA, Department of Decisions, Operations and Technology, CUHK Business School, [email protected]. Please note that Leo will help with issues related to the logistics, but not the content, of this course.
  • Tutorial Instructor: Qiansiqi Hu, MSBA Student, Department of Decisions, Operations and Technology, CUHK Business School, [email protected]. BS in ECE, Shanghai Jiaotong University Michigan Institute.

Basic Information

  • Website: https://github.com/rphilipzhang/AI-PhD-S24
  • Time: Tuesday, 12:30pm-3:15pm, from Jan 9, 2024 to Apr 16, 2024, except for Feb 13 (Chinese New Year) and Mar 5 (Final Project Discussion)
  • Location: Cheng Yu Tung Building (CYT) LT5

About

Welcome to the mono-repo of the PhD course AI for Business Research (DSME 6635) at CUHK Business School in Spring 2024. You may want to download the Syllabus of this course first. This course aims to help you:

  • Have a basic understanding of the fundamental concepts/methods in machine learning (ML) and artificial intelligence (AI) that are used (or potentially useful) in business research.
  • Understand how business researchers have utilized ML/AI and what managerial questions have been addressed by ML/AI in the recent decade.
  • Develop a taste for what state-of-the-art AI/ML technologies can do in the ML/AI community and, potentially, in your own research field.

We will meet each Tuesday at 12:30pm in Cheng Yu Tung Building (CYT) LT5 (please note the room change). Please ask for my approval if you need to join us via the following Zoom link:

  • Zoom link, Meeting ID 996 4239 3764, Passcode 386119.

Most of the code in this course will be distributed through the Google CoLab cloud computing environment to avoid incompatibility and version-control issues on your local computer. You can also always download the Jupyter Notebook from CoLab and run it on your own computer.

  • The CoLab files of this course can be found in this folder.
  • The Google Sheet to sign up for groups and group tasks can be found here.
  • The Overleaf template for scribing the lecture notes of this course can be found here.

If you have any feedback on this course, please directly contact Philip at [email protected] and we will try our best to address it.

Brief Schedule

Subject to modifications. All classes start at 12:30pm and end at 3:15pm.

| Session | Date | Topic | Key Words |
|---|---|---|---|
| 1 | 1.09 | AI/ML in a Nutshell | Course Intro, ML Models, Model Evaluations |
| 2 | 1.16 | Intro to DL | DL Intro, Neural Nets, Computational Issues in DL |
| 3 | 1.23 | Prediction and Traditional NLP | Prediction in Biz Research, Pre-processing, Word Representations |
| 4 | 1.30 | NLP (II): Word2Vec | $N$-gram, NLP Performance Evaluation, Word2Vec |
| 5 | 2.06 | NLP (III): Seq2Seq and Attention | RNN, Seq2Seq, (Self-)Attention |
| 6 | 2.20 | NLP (IV): Transformer | Transformer, BERT, GPT |
| 7 | 2.27 | NLP (V): LLM and Generative AI | Prompting, Emergence, Generative AI in Biz Research |
| 7.5 | 3.05 | Final Project Proposal | No Class, Group Meeting with Philip |
| 8 | 3.12 | Image Processing and CV (I) | CNN, AlexNet, ResNet |
| 9 | 3.19 | CV (II) | Data Augmentation, ViT, Video Understanding |
| 10 | 3.26 | Unsupervised Learning (I) | EM, LDA, Topic Modeling |
| 11 | 4.02 | Unsupervised Learning (II) | VAE, Stable Diffusion, Multimodality |
| 12 | 4.09 | Algorithm and Fairness | Algorithmic Fairness/Bias in CS and Econ, Testing Discrimination |
| 13 | 4.16 | Final Project Presentation | Show, not tell! |

Important Dates

All problem sets are due at 12:30pm right before class.

| Date | Time | Event | Note |
|---|---|---|---|
| 1.10 | 11:59pm | Group Sign-Ups | Each group has at most two students. |
| 1.12 | 7:00pm-9:00pm | Python Tutorial | Given by Qiansiqi Hu, Python Tutorial CoLab |
| 1.19 | 7:00pm-9:00pm | PyTorch Tutorial | Given by Qiansiqi Hu, PyTorch Tutorial CoLab |
| 3.05 | 9:00am-6:00pm | Final Project Discussion | Please schedule a meeting with Philip. |
| 3.12 | 12:30pm | Final Project Proposal | 1-page maximum |
| 4.16 | 12:30pm-3:15pm | Final Project Presentation | Show, not tell! |

Useful Resources

Find more in the Syllabus.

Detailed Schedule

The following schedule is tentative and subject to changes.

Session 1. Artificial Intelligence and Machine Learning in a Nutshell (Jan/09/2024)

  • Keywords: Course Introduction, Machine Learning Basics, Bias-Variance Trade-off, Cross Validation, $k$-Nearest Neighbors, Decision Tree, Ensemble Methods
  • Slides: Course Introduction, Machine Learning Basics
  • CoLab Notebook Demos: k-Nearest Neighbors, Decision Tree
  • Homework: Problem Set 1: Bias-Variance Trade-Off
  • Scribed Lecture Notes: To be updated.
  • Online Python Tutorial: Python Tutorial CoLab, 7:00pm-9:00pm, Jan/12/2024 (Friday), given by Qiansiqi Hu, [email protected]. Zoom Link, Meeting ID: 923 4642 4433, Passcode: 178146
  • References:
    • The Elements of Statistical Learning (2nd Edition), 2009, by Trevor Hastie, Robert Tibshirani, Jerome Friedman, https://hastie.su.domains/ElemStatLearn/.
    • Probabilistic Machine Learning: An Introduction, 2022, by Kevin Murphy, https://probml.github.io/pml-book/book1.html.
    • Mullainathan, Sendhil, and Jann Spiess. 2017. Machine learning: an applied econometric approach. Journal of Economic Perspectives 31(2): 87-106.
    • Athey, Susan, and Guido W. Imbens. 2019. Machine learning methods that economists should know about. Annual Review of Economics 11: 685-725.
    • Hofman, Jake M., et al. 2021. Integrating explanation and prediction in computational social science. Nature 595.7866: 181-188.
    • Bastani, Hamsa, Dennis Zhang, and Heng Zhang. 2022. Applied machine learning in operations management. Innovative Technology at the Interface of Finance and Operations. Springer: 189-222.
    • Kelly, Brian, and Dacheng Xiu. 2023. Financial machine learning, SSRN, https://ssrn.com/abstract=4501707.
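
As a rough companion to the Session 1 keywords ($k$-Nearest Neighbors, Cross Validation, Bias-Variance Trade-off), here is a minimal sketch in plain Python. It is not the course's CoLab demo: the toy two-cluster data, the function names (`knn_predict`, `cross_val_accuracy`, `make_toy_data`), and the candidate values of $k$ are all assumptions made for illustration. Small $k$ tends to give low bias and high variance, and large $k$ the opposite; cross-validation is one common way to choose between them.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points closest to x (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(xi, x)), yi)
        for xi, yi in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, k, n_folds=5, seed=0):
    """Average held-out accuracy of k-NN over n_folds disjoint folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train_X = [X[i] for i in idx if i not in held_out]
        train_y = [y[i] for i in idx if i not in held_out]
        correct += sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in fold)
    return correct / len(X)


def make_toy_data(n_per_class=50, seed=0):
    """Hypothetical data: two Gaussian clusters in the plane, one per class."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in enumerate([(0.0, 0.0), (3.0, 3.0)]):
        for _ in range(n_per_class):
            X.append((rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)))
            y.append(label)
    return X, y


X, y = make_toy_data()
cv_scores = {k: cross_val_accuracy(X, y, k) for k in (1, 5, 15)}
for k, acc in cv_scores.items():
    print(f"k={k:2d}  CV accuracy={acc:.3f}")
```

On real data, a library implementation such as scikit-learn would normally replace these hand-written helpers; the sketch only makes the mechanics visible.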

Session 2. Introduction to Deep Learning (Jan/16/2024)

Session 3. DL Basics, Predictions in Business Research, and Traditional NLP (Jan/23/2024)

  • Keywords: Optimization and Computational Issues of Deep Learning, Prediction Problems in Business Research, Pre-processing and Word Representations in Traditional Natural Language Processing
  • Slides: Deep Learning Basics, Prediction Problems in Business Research, NLP(I): Pre-processing and Word Representations
  • CoLab Notebook Demos: He Initialization, Dropout, Micrograd, NLP Pre-processing
  • Presentation: By Letian Kong and Liheng Tan.
    • Mullainathan, Sendhil, and Jann Spiess. 2017. Machine learning: an applied econometric approach. Journal of Economic Perspectives 31(2): 87-106. Link to the paper.
  • Homework: Problem Set 2: Implementing Neural Nets, due at 12:30pm, Jan/30/2024 (Tuesday).
  • Scribed Lecture Notes: To be updated.
  • References:
    • Kleinberg, Jon, Jens Ludwig, Sendhil Mullainathan, and Ziad Obermeyer. 2015. Prediction policy problems. American Economic Review 105(5): 491-495.
    • Mullainathan, Sendhil, and Jann Spiess. 2017. Machine learning: an applied econometric approach. Journal of Economic Perspectives 31(2): 87-106.
    • Kleinberg, Jon, Himabindu Lakkaraju, Jure Leskovec, Jens Ludwig, and Sendhil Mullainathan. 2018. Human decisions and machine predictions. Quarterly Journal of Economics 133(1): 237-293.
    • Bajari, Patrick, Denis Nekipelov, Stephen P. Ryan, and Miaoyu Yang. 2015. Machine learning methods for demand estimation. American Economic Review, 105(5): 481-485.
    • Farias, Vivek F., and Andrew A. Li. 2019. Learning preferences with side information. Management Science 65(7): 3131-3149.
    • Cui, Ruomeng, Santiago Gallino, Antonio Moreno, and Dennis J. Zhang. 2018. The operational value of social media information. Production and Operations Management, 27(10): 1749-1769.
    • Gentzkow, Matthew, Bryan Kelly, and Matt Taddy. 2019. Text as data. Journal of Economic Literature, 57(3): 535-574.
    • Chapter 2, Introduction to Information Retrieval, 2008, Cambridge University Press, by Christopher D. Manning, Prabhakar Raghavan and Hinrich Schutze, https://nlp.stanford.edu/IR-book/information-retrieval-book.html.
    • Chapter 2, Speech and Language Processing (3rd ed. draft), 2023, by Dan Jurafsky and James H. Martin, https://web.stanford.edu/~jurafsky/slp3/.
    • Parameter Initialization and Batch Normalization (in Chinese)
    • GPU Comparisons
    • GitHub Repo for Micrograd, by Andrej Karpathy.
    • Hand Written Notes

Session 4. Traditional NLP (Jan/30/2024)

Session 5. Deep-Learning-Based NLP: Word2Vec (Feb/06/2024)

  • Keywords: Traditional NLP Applied to Business/Econ Research, Word2Vec: Continuous Bag of Words, Skip-Gram, and GloVe, Language Model Evaluation, Word2Vec Applied to Business/Econ Research
  • Slides: NLP(II): N-Gram, Naïve Bayes, and Language Model Evaluation, NLP(III): Word2Vec
  • CoLab Notebook Demos: Word2Vec: CBOW, Word2Vec: N-Gram
  • Presentation: By Xinyu Xu and Shu Zhang.
    • Timoshenko, Artem, and John R. Hauser. 2019. Identifying customer needs from user-generated content. Marketing Science, 38(1): 1-20. Link to the paper.
  • Homework: No homework this week. You may want to think about your final project while enjoying the Lunar New Year holiday.
  • Scribed Lecture Notes: To be updated.
  • References:
    • Gentzkow, Matthew, Bryan Kelly, and Matt Taddy. 2019. Text as data. Journal of Economic Literature, 57(3): 535-574.
    • Ash, Elliot, and Stephen Hansen. 2023. Text algorithms in economics. Annual Review of Economics, 15: 659-688.
    • Tetlock, Paul. 2007. Giving content to investor sentiment: The role of media in the stock market. Journal of Finance, 62(3): 1139-1168.
    • Baker, Scott, Nicholas Bloom, and Steven Davis. 2016. Measuring economic policy uncertainty. Quarterly Journal of Economics, 131(4): 1593-1636.
    • Gentzkow, Matthew, and Jesse Shapiro. 2010. What drives media slant? Evidence from US daily newspapers. Econometrica, 78(1): 35-71.
    • Timoshenko, Artem, and John R. Hauser. 2019. Identifying customer needs from user-generated content. Marketing Science, 38(1): 1-20.
    • Li, Kai, Feng Mai, Rui Shen, and Xinyan Yan. 2021. Measuring corporate culture using machine learning. Review of Financial Studies, 34(7): 3265-3315.
    • Mikolov, Tomas, Kai Chen, Greg Corrado, and Jeff Dean. 2013. Efficient estimation of word representations in vector space. ArXiv Preprint, arXiv:1301.3781.
    • Mikolov, Tomas, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems (NeurIPS) 26.
    • Pennington, Jeffrey, Richard Socher, and Christopher Manning. 2014. Glove: Global vectors for word representation. Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1532-1543).
    • Parts I - II, Lecture Notes and Slides for CS224n: Natural Language Processing with Deep Learning, by Christopher D. Manning, Diyi Yang, and Tatsunori Hashimoto, https://web.stanford.edu/class/cs224n/.
    • Word Embeddings Trained on Google News Corpus
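
To make the Session 5 keywords "N-Gram" and "Language Model Evaluation" concrete, the following is a minimal sketch of a bigram language model with add-one (Laplace) smoothing, evaluated by perplexity. It is not taken from the course notebooks: the three-sentence training corpus, the `<unk>` handling, and the function names are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def tokenize(sentence):
    """Lowercase, split on whitespace, and add sentence boundary markers."""
    return ["<s>"] + sentence.lower().split() + ["</s>"]


def train_bigram(sentences):
    """Count contexts and bigrams; the vocabulary holds every predictable token."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"</s>", "<unk>"}
    for sent in sentences:
        tokens = tokenize(sent)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens[:-1], tokens[1:]))
        vocab.update(tokens[1:])
    return contexts, bigrams, vocab


def bigram_prob(prev, word, contexts, bigrams, vocab):
    """Add-one smoothed P(word | prev); words outside the vocabulary map to <unk>."""
    word = word if word in vocab else "<unk>"
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))


def perplexity(sentences, contexts, bigrams, vocab):
    """exp of the average negative log-probability per predicted token."""
    log_prob, n_tokens = 0.0, 0
    for sent in sentences:
        tokens = tokenize(sent)
        for prev, word in zip(tokens[:-1], tokens[1:]):
            log_prob += math.log(bigram_prob(prev, word, contexts, bigrams, vocab))
            n_tokens += 1
    return math.exp(-log_prob / n_tokens)


# Hypothetical toy corpus, made up for this example.
train = ["the firm raised prices", "the firm cut costs", "the bank raised rates"]
contexts, bigrams, vocab = train_bigram(train)

pp_plausible = perplexity(["the firm raised rates"], contexts, bigrams, vocab)
pp_shuffled = perplexity(["rates raised firm the"], contexts, bigrams, vocab)
print(f"plausible order: {pp_plausible:.2f}, shuffled order: {pp_shuffled:.2f}")
```

Lower perplexity means the model finds the text less surprising, so the sentence that follows the word order seen in training should score lower than its shuffled version.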
