This project develops machine learning models to distinguish AI-generated text from human-written text.
With the rise of AI-generated content, there is a growing need to distinguish between text generated by artificial intelligence systems and text authored by humans. This project addresses this challenge by leveraging machine learning techniques and feature engineering to classify text samples as either AI-generated or human-generated.
Four classifiers were trained and evaluated for essay classification (a minimal training sketch follows the list):
• Naive Bayes
• Logistic Regression
• Random Forest
• DistilBERT
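A minimal training sketch for the three classical models, assuming scikit-learn, TF-IDF features, and a CSV with `text` and `label` columns; the file and column names are illustrative, and DistilBERT, as a transformer model, would be fine-tuned separately (not shown).

```python
# Sketch only: trains the three classical classifiers on TF-IDF features.
# The CSV file name and the `text`/`label` column names are assumptions.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

df = pd.read_csv("train_essays.csv")                      # illustrative file name
X = TfidfVectorizer(max_features=20000).fit_transform(df["text"])
y = df["label"]                                           # 0 = human, 1 = AI-generated

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

models = {
    "Naive Bayes": MultinomialNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, "test accuracy:", round(model.score(X_test, y_test), 3))
```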
Feature engineering techniques were employed to augment the dataset and enhance classification performance. Three key features were engineered (see the computation sketch after the list):
• Text Length
• Lexical Diversity
• Flesch Reading Ease
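A minimal sketch of how these features could be computed with pandas and the `textstat` package; the exact implementation, column names, and the type-token-ratio definition of lexical diversity are assumptions, not taken from the notebooks.

```python
# Sketch only: computes the three engineered features for a text column.
import pandas as pd
import textstat

def lexical_diversity(text: str) -> float:
    # Type-token ratio: unique words divided by total words (assumed definition).
    tokens = text.split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

df = pd.DataFrame({"text": ["A short example essay.", "Another sample of writing."]})
df["text_length"] = df["text"].str.split().str.len()              # word count
df["lexical_diversity"] = df["text"].apply(lexical_diversity)
df["flesch_reading_ease"] = df["text"].apply(textstat.flesch_reading_ease)
print(df)
```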
Evaluation metrics such as precision, recall, F1-score, and accuracy were used to assess the performance of each classifier.
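A possible evaluation step, continuing from the training sketch above; scikit-learn's `classification_report` covers precision, recall, and F1 per class, with accuracy reported separately.

```python
# Sketch only: assumes `models`, `X_train`, `X_test`, `y_train`, `y_test` from the training sketch.
from sklearn.metrics import accuracy_score, classification_report

for name, model in models.items():
    y_pred = model.fit(X_train, y_train).predict(X_test)
    print(name, "accuracy:", round(accuracy_score(y_test, y_pred), 3))
    print(classification_report(y_test, y_pred, digits=3))
```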
Datasets:
• https://www.kaggle.com/competitions/llm-detect-ai-generated-text (Report #1)
• https://www.kaggle.com/datasets/thedrcat/daigt-v2-train-dataset (Report #2)
Three experimental setups were used (a cross-validation sketch follows the list):
• Sample size of 10,000 with k-fold cross-validation, implemented in the notebook "ML models with 10000 sample".
• Sample size of 3,000 with k-fold cross-validation (Naive Bayes, Logistic Regression, Random Forest), implemented in the notebook "ML models with 3000 sample".
• Sample size of 3,000 without k-fold cross-validation (Naive Bayes, Logistic Regression, Random Forest, and DistilBERT), implemented in the notebook "ML models with 3000 sample (without K-fold)".
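A k-fold cross-validation sketch for the classical models, reusing `X`, `y`, and `models` from the training sketch; the fold count and scoring metric are illustrative, since the exact settings are not stated here.

```python
# Sketch only: k-fold cross-validation over the classical models.
from sklearn.model_selection import cross_val_score

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name}: mean F1 = {scores.mean():.3f} (+/- {scores.std():.3f})")
```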
Note: The notebooks take several hours to run, so their outputs have been saved in the notebooks for initial review.