crop-disease-diagnosis-service

A crop disease diagnosis service application built on deep-learning-based image captioning and object detection

  • paper
    • Lee, D.I.; Lee, J.H.; Jang, S.H.; Oh, S.J.; Doo, I.C. Crop Disease Diagnosis with Deep Learning-Based Image Captioning and Object Detection. Appl. Sci. 2023, 13, 3148. https://doi.org/10.3390/app13053148

Contents

  1. Team
  2. Requirements
  3. Keywords
  4. Motivation & Purpose
  5. Goals
  6. System Structure
  7. Service Flow
  8. Disease Diagnostic Results
  9. Project Flow
  10. Deep Learning
  11. App
  12. Benefits
  13. References

🌱 The plant doctor in my hand: Dr. 쑥쑥 (Dr. Ssukssuk)

A crop disease diagnosis service using deep-learning-based image captioning and object detection

Korea Data Agency (한국데이터산업진흥원) logo, university logo (외대로고)

Getting Started

  1. apk download : https://github.com/DI-LEE/crop-disease-diagnosis-service/releases
  2. Settings -> Biometrics and security -> Install unknown apps -> select My Files and allow -> My Files -> select the apk and install the Dr.쑥쑥 app

For detailed instructions, see Service Flow

Team

Team Logo

๋‹์„๋ณ•

image

Team Organization

Name | Role | Contact
이동인 | Team leader; image captioning model development and deployment; backend server setup support; system architecture design | [email protected]
장승호 | Object detection model development and data augmentation; app service flow design | [email protected]
이지환 | Image captioning model development and model performance comparison study; app service flow design | [email protected]
류승기 | Backend logic design and implementation; frontend-backend communication and integration; server setup and model deployment | [email protected]
정훈서 | Object detection model development and deployment; backend server setup support; system architecture design | [email protected]
오지환 | Data collection and analysis; planning and presentation | [email protected]
양건안 | Frontend logic design and implementation; frontend-backend communication and integration; UX/UI design | [email protected]
김재원 | Planning and presentation | [email protected]

Requirements

Image-Captioning requirements for training

cd requirements
pip install -r img_cpt_requirements.txt  # install

key requirements

  • python==3.9
  • tensorflow-gpu==2.8.0

Object-detection requirements for training

cd requirements
pip install -r ob_requirements.txt  # install

key requirements

  • torch==1.12.1

APP

frontend Requirements

cd app_front
flutter pub get  # install

backend Requirements

cd requirements
pip install -r backend_requirements.txt  # install

key requirements

  • flutter==3.0.5
  • flask==2.2.2

APP frontend environment for build

Train environment

  • RTX 2070
  • CUDA Version==11.2
  • cudnn==7.6.5

Keywords

  • Image-captioning
  • Object-detection
  • Natural Language Generation
  • Diagnosis of crop disease
  • Home farming

Motivation & Purpose

Interest in urban agriculture is growing steadily every year. According to data provided by the city of Seoul, the number of urban farmers rose from 150,000 in 2010 to 1.85 million in 2020, a more than tenfold increase in ten years. Recently, inflation has driven food prices up sharply, and a 'home farming' culture, in which people grow and eat their own produce at home instead of buying it, is spreading. Although the urban agriculture market keeps growing, most urban farmers are novices rather than professional farmers. Lacking farming skills and experience, they often fail to diagnose crop diseases in time, cannot treat the crop appropriately, and end up losing the harvest; such cases are not hard to find. We therefore developed an app service that describes the condition of a crop in detail and diagnoses its disease, to help urban farmers manage their crops.

Goals

We set out to develop a crop management app for novice farmers who are unfamiliar with agriculture, such as urban farmers. When the user photographs a crop that appears to be infected, deep learning models mark the infected area with a bounding box and generate a sentence that describes the lesion in detail and diagnoses the disease. Rather than only naming the disease, the app marks the infected area and describes it in detail, which gives clear grounds for the diagnosis. Because the models diagnose the disease itself, they can diagnose it on any crop that the disease can infect, which removes the burden of building a separate diagnosis for each crop. To improve diagnostic accuracy, we also built additional data and trained the models so that no disease is diagnosed when the input is a photo of a healthy crop or of an object that is not a crop. Once a disease is diagnosed, the app introduces the conditions under which it occurs and how to manage it, so that users can easily learn how to treat and prevent it.

System Structure

image01

Service Flow

image02

Disease diagnostic results

result screen

image03

result video

default.mp4

Project Flow

image04

Deep Learning

Data Collection and Labeling

For crop images we used two datasets published openly on AI-Hub: 'Open-field crop disease diagnosis images' and 'Facility crop disease diagnosis images'. Both contain a balanced mix of healthy and diseased crop images, so we judged them suitable as training data. From the full datasets we extracted images of five selected crops (pepper, Korean zucchini, tomato, bean, and green onion), with the goal of detecting nine diseases in total. We also built caption sentences that describe each disease, distinguishing its characteristic features and its severity. Training the image captioning model used 123,913 images and 619,565 captions, and training the object detection model used 31,394 images.

Dataset

Image Captioning dataset and model weights

Dataset and Weight Download : https://drive.google.com/drive/folders/1nT2tOmWdmjItQA_5MNHqByVcMas0bzKp?usp=sharing

Data | Train | Validation
Images | 123,913 | 303
Labels | 619,565 | 1,515

Each image has five captions (619,565 / 123,913 = 5 and 1,515 / 303 = 5).

Object Detection dataset and model weights

Dataset and Weight Download : https://drive.google.com/drive/folders/1NmlqqYI_ePEpUEhjWU2qUO5R1MMMHy-L?usp=sharing

Data | Train | Validation
Source images | 4,871 | 1,099
Augmented images | 20,587 | 4,837
Total images | 25,458 | 5,936
Labels | 144,172 | 33,717

Data Labeling

Image Captioning labeling criteria

1. Use the metadata provided with the AI-Hub images

  • The metadata of each image is stored in JSON format
  • Within the metadata, the 'training information (annotations)' section is used
    • The disease/pest code (disease) identifies the type of disease
    • The crop code (crop) identifies the type of crop
    • The disease damage level (risk) identifies how severe the damage is
  • We confirmed that the annotations section of each JSON file is reflected in the image's title (file name)

<Image>

pepper image and name

<Image metadata in JSON format>

pepper annotation

<Image metadata information listed on AI-Hub>

AI-Hub metadata
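As a rough illustration of step 1, the sketch below reads the three annotation fields from one metadata record. The field names `annotations`, `disease`, `crop`, and `risk` come from the list above; the sample values and the helper name `read_annotation` are hypothetical, and real AI-Hub files contain further fields that are not shown here.

```python
import json

# Hypothetical sample record. Real AI-Hub metadata files carry more fields,
# and the numeric codes below are placeholders, not actual AI-Hub codes.
SAMPLE = '{"annotations": {"disease": 1, "crop": 2, "risk": 3}}'


def read_annotation(raw_json):
    """Return the (disease, crop, risk) codes from one metadata record."""
    meta = json.loads(raw_json)
    ann = meta["annotations"]
    return ann["disease"], ann["crop"], ann["risk"]


disease, crop, risk = read_annotation(SAMPLE)
print(f"disease={disease} crop={crop} risk={risk}")
```

The returned codes would then be mapped to a disease name, a crop name, and a severity stage before a caption is written.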

2. Use the disease keywords provided by the National Crop Pest Management System (NCPMS, 국가농작물병해충관리시스템)

  • On the NCPMS website, go to 'Disease and pest information' → 'Illustrated guide by disease/pest'
  • Searching for a disease shows which crops can contract it and what its characteristics are
  • The 'Symptom description' section provides keywords that characterize the disease
    • For example, pepper anthracnose yields the keywords circular spots, pale yellow to yellowish-brown spore masses, and shriveling

3. Generate captions that diagnose the crop disease using the two data sources above

1) Measure the level of disease damage

  • Images are labeled as early, middle, or late stage

2) Assign keywords according to the level of disease damage

  • The type and number of keywords applied differ with the level of damage
    • Ex) Pepper anthracnose
      • Early: circular spots
      • Middle: circular spots + yellowish-brown (pale yellow) spores
      • Late: circular spots + yellowish-brown (pale yellow) spores + shriveling

3) Generate caption sentences from the keywords

  • A caption sentence is generated from the crop type, disease type, damage level, and keywords
    • Ex) Pepper anthracnose (captions translated from Korean)
      • Early: "Circular spots have appeared on the pepper, so pepper anthracnose is suspected."
      • Middle: "Yellowish-brown spores and circular spots have formed on the pepper, so pepper anthracnose is suspected."
      • Late: "Circular spots have appeared on the pepper, yellowish-brown spores are visible, and it has shriveled, so pepper anthracnose is suspected."
  • So that the model can learn varied contexts, several different sentences with the same meaning are generated
    • Word-order changes
      • Korean has relatively free word order, so the sentence constituents are reordered to produce several forms of the same sentence
    • Back translation
      • Sentences are translated into a foreign language and back into Korean with the Naver Papago machine translator
      • Translating in the order Korean → English → Japanese → Korean produces new sentences that differ in form from the originals
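The keyword-to-caption procedure in step 3 can be sketched as a small template function. This is only an illustration under stated assumptions: the stage keywords are the pepper anthracnose example from the list, the English templates stand in for the Korean sentences actually used, and the names `STAGE_KEYWORDS`, `join_keywords`, and `make_captions` are hypothetical.

```python
# Keywords per severity stage for pepper anthracnose, taken from the example list.
STAGE_KEYWORDS = {
    "early": ["circular spots"],
    "middle": ["circular spots", "yellowish-brown spores"],
    "late": ["circular spots", "yellowish-brown spores", "shriveling"],
}


def join_keywords(keywords):
    """Join keywords into a readable phrase: 'a', 'a and b', 'a, b and c'."""
    if len(keywords) == 1:
        return keywords[0]
    return ", ".join(keywords[:-1]) + " and " + keywords[-1]


def make_captions(crop, disease, stage):
    """Return caption variants with the same meaning but different word order."""
    symptoms = join_keywords(STAGE_KEYWORDS[stage])
    return [
        f"The {crop} shows {symptoms}, so {disease} is suspected.",
        f"Because of {symptoms} on the {crop}, {disease} is suspected.",
    ]


for caption in make_captions("pepper", "pepper anthracnose", "late"):
    print(caption)
```

The two templates mimic the word-order variation described above; back translation would add further variants but needs an external translator, so it is not shown.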

Object Detection labeling criteria

As with image captioning, we used the 'training information (annotations)' in the AI-Hub metadata to identify the crop name and the disease type, and then drew bounding boxes. Each bounding box marks the specific part of the crop where the lesion occurs; when the disease has spread over the whole crop, the bounding box covers the entire crop instead of a part.

bounding box labeling example images

Modeling

Used Model

  • InceptionV3 + Transformer
    • Training notebook path: image-captioning/image_captioning_InceptionV3_Transformer.ipynb
  • YOLOv5m

Image Captioning

Image captioning is a technique that generates a sentence describing an image, covering several of the image's features in detail. The image captioning model used in this project follows the encoder-decoder design, one of the mechanisms used for machine translation in natural language processing. The encoder extracts features from the image, and the decoder generates a caption sentence from those features. For the encoder we used a CNN, the kind of model commonly used for image processing, and specifically InceptionV3 pretrained on the ImageNet image data. The decoder is a language model that generates natural language; two representative choices are the Attention model and the Transformer model. The Attention model is an RNN-based sentence generation model with an added algorithm that makes it focus on particular words in the sentence. The Transformer model improves on it: it does not use an RNN and instead applies Self-Attention several times, which greatly improves both the quality and the speed of sentence generation. To use the better of the two as our decoder, we compared their sentence generation performance with the BLEU score.

The BLEU score is a standard metric for sentence generation, often used in machine translation, that quantifies the similarity between a human-written sentence and a model-generated sentence. The better the generated sentence, the higher the score. Each sentence is split into tokens, and the score is computed from quantities such as the number of tokens the two sentences share. Tokens can also be compared in groups using n-grams, and the BLEU score computed with a given n-gram size is written 'BLEU_N'. In this project we compared the two models with BLEU_1, BLEU_2, BLEU_3, and BLEU_4, which use 1-grams, 2-grams, 3-grams, and 4-grams respectively, and with BLEU_AVG, the average of these four scores. When the scores were computed on the validation dataset, the Transformer model scored higher on every BLEU score except BLEU_3. Since the Transformer model generated better sentences than the Attention model, we used the Transformer model as the decoder of our image captioning model.

blue-score
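To make the BLEU_N and BLEU_AVG definitions concrete, here is a minimal single-reference sketch in plain Python. It assumes whitespace tokenization and the cumulative form of BLEU (geometric mean of the modified 1-gram to N-gram precisions, multiplied by a brevity penalty); the README does not state which BLEU variant, tokenizer, or library the project used, so treat these choices and the function names as assumptions.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(reference, candidate, n):
    """Clipped n-gram precision of the candidate against one reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


def bleu_n(reference, candidate, n):
    """Cumulative BLEU using 1-gram to n-gram precisions and a brevity penalty."""
    precisions = [modified_precision(reference, candidate, k) for k in range(1, n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # no smoothing in this sketch
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / n)
    if len(candidate) >= len(reference):
        penalty = 1.0
    else:
        penalty = math.exp(1.0 - len(reference) / len(candidate))
    return penalty * geo_mean


def bleu_avg(reference, candidate):
    """Average of BLEU_1 to BLEU_4, as described for BLEU_AVG."""
    return sum(bleu_n(reference, candidate, n) for n in range(1, 5)) / 4


reference = "circular spots appeared on the pepper".split()
candidate = "circular spots appeared on the leaf".split()
for n in range(1, 5):
    print(f"BLEU_{n} = {bleu_n(reference, candidate, n):.3f}")
print(f"BLEU_AVG = {bleu_avg(reference, candidate):.3f}")
```

Because this sketch applies no smoothing, a candidate with no matching 4-gram gets BLEU_4 = 0; library implementations usually offer smoothing options for short sentences.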

The structure of the image captioning model used in this project is as follows. First, an image is fed into InceptionV3, a CNN, which analyzes various features of the image. For example, given an image of a Korean zucchini leaf infected with downy mildew, the model analyzes the yellow speckles along the edge of the leaf, the color of the leaf, and so on. The extracted features are then passed to the Transformer model. The Transformer itself also has an encoder-decoder structure. The image features, together with the positional information of each feature, enter the Transformer encoder, are processed by Self-Attention, and are then passed to the Transformer decoder. During training, the decoder also receives the actual caption sentence, which is the ground-truth label for the image. The image features from the encoder and the caption fed to the decoder are analyzed together through the decoder's attention layers, and the Transformer decoder finally generates the caption that the model predicts for the image.
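The attention step mentioned above can be illustrated with scaled dot-product attention, the basic operation inside Transformer attention layers. The sketch below is plain Python on toy vectors and is not the project's TensorFlow code: the function names are hypothetical, and it omits learned projections, multiple heads, masking, and positional encoding.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    dim = len(keys[0])
    outputs = []
    for query in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        outputs.append([
            sum(w * value[c] for w, value in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs


# Toy example: one decoder query attending over two image-feature vectors.
image_keys = [[1.0, 0.0], [0.0, 1.0]]
image_values = [[10.0, 0.0], [0.0, 10.0]]
print(attention([[2.0, 0.0]], image_keys, image_values))
```

In self-attention the queries, keys, and values all come from the same sequence; when the decoder attends over the encoder output, the queries come from the caption tokens and the keys and values from the image features.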

Object Detection

Object detection is a technique that finds the objects in an image together with their bounding boxes. A detection algorithm typically takes an image as input and outputs a list of bounding boxes and object classes; for each bounding box it reports the predicted class and the confidence of that prediction. Until 2012, methods that were not based on neural networks were used, and from 2012 onward neural-network-based methods have been actively studied. Among these we used YOLO, a detector family first published in 2015. As of 2022, versions up to V7 have been released, and V5 is implemented in PyTorch, so it can be used effectively in a Python environment. Our team chose YOLOv5 for its strong performance and the large amount of reference material available for it.
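The detector output described above (a bounding box, a predicted class, and a confidence per detection) can be modeled with a small data structure. This is a hedged sketch: the `Detection` class, the pixel box layout, the 0.5 threshold, and the sample values are all assumptions for illustration and do not come from the project's code.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    box: tuple          # (x_min, y_min, x_max, y_max) in pixels, assumed layout
    label: str          # predicted class name
    confidence: float   # confidence of the prediction, between 0 and 1


def keep_confident(detections, threshold=0.5):
    """Drop low-confidence detections and sort the rest, best first."""
    kept = [d for d in detections if d.confidence >= threshold]
    return sorted(kept, key=lambda d: d.confidence, reverse=True)


# Made-up detections for one image.
raw = [
    Detection((40, 60, 200, 220), "anthracnose", 0.62),
    Detection((300, 310, 380, 400), "powdery mildew", 0.21),
    Detection((100, 90, 420, 500), "anthracnose", 0.91),
]
for det in keep_confident(raw):
    print(det.label, det.confidence, det.box)
```

Filtering by confidence before drawing boxes keeps weak guesses off the result screen; the right threshold depends on the trained model and would need to be tuned.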

Before building the model in earnest, we collected data. As with image captioning, we used the AI-Hub open-field crop disease diagnosis image dataset and the facility crop disease diagnosis dataset, with the goal of detecting nine diseases across five crops: pepper, Korean zucchini, tomato, bean, and green onion. However, powdery mildew and leaf blight each appear in two crops, so the actual number of prediction classes was set to seven.

After collecting the data and fixing the prediction classes, we labeled the data. We downloaded a tool called LabelImg and drew bounding boxes around the affected areas of the crops.

About 6,000 source photos were available, which was not enough to train the model, so we augmented the data. We wrote augmentation code in Python with the imgaug package and built a dataset of 31,394 images in total: 25,458 for training and 5,936 for validation. We also resized each image down to 640 * 640.
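Augmenting detection data means the box labels must follow the image. YOLOv5 label files store each box as a class index plus normalized `x_center y_center width height`, so a plain resize to 640 * 640 leaves the normalized label unchanged, while a geometric change such as a horizontal flip must also transform the label. The sketch below shows both conversions in plain Python; the README does not say which imgaug augmenters were used, so the flip is only an illustrative choice and the function names are hypothetical.

```python
def to_yolo(box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to normalized YOLO format."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return (x_center, y_center, width, height)


def hflip(yolo_box):
    """Mirror a normalized YOLO box left to right, matching a flipped image."""
    x_center, y_center, width, height = yolo_box
    return (1.0 - x_center, y_center, width, height)


# A lesion box in the lower-left quarter of a 640 x 640 image.
original = to_yolo((0, 320, 320, 640), 640, 640)
flipped = hflip(original)
print("original:", original)
print("flipped: ", flipped)
```

Libraries such as imgaug can transform boxes together with the image, which avoids writing this bookkeeping by hand for every augmentation.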

Next, we downloaded a pretrained YOLOv5 model and trained it. YOLOv5 provides five pretrained models: YOLOv5n, YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x. Going from n to x, model performance improves but training takes longer. Balancing performance against training time, our team chose YOLOv5m, and after about 15 hours of training we obtained the final model.

Below is the confusion matrix from the YOLOv5 training results.

confusion_matrix

APP

The app frontend was developed with Flutter, a free framework from Google based on the Dart language, and the backend with Flask, a web framework for Python. The server runs on AWS EC2, which is known for stable and elastic performance, and the app's UX/UI design and service flow were created in Figma, a collaborative design tool.

Frontend

The application's name, Dr.쑥쑥, conveys the idea of a 'plant doctor' that diagnoses plant diseases. In keeping with that concept, the design uses pastel green tones throughout, and the screens use a simple UX/UI that anyone can use easily. On the frontend we implemented the features planned from the start: camera, weather, disease diagnosis, and control-method guidance. The camera feature uses the image_picker package for Flutter. The weather and disease diagnosis features receive the data they need through API calls between the client and the server and display it on the app screen. Finally, the app lets users navigate to a page that introduces the control methods specific to each disease.

Backend & Server

We considered the development environment important, so we first spent enough time agreeing on and sharing the versions of the required packages and libraries.

Then, to design the APIs needed to connect with the frontend, we drew the structure diagrams ourselves, refined them, and wrote the core backend code.

Once the image captioning model and the object detection model were complete, we put considerable effort into porting both models to the backend server we had already set up. The results of both models had to appear on one screen at the same time, so there was a lot to consider when designing the logic. In the end, the backend saves the photo sent by the frontend and returns the two models' outputs, the caption and the photo with detections drawn on it, to the frontend in a single response.
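The single-response logic described above can be sketched as one function that runs both models on the uploaded photo and packs their outputs into one JSON payload. This is a minimal sketch, not the project's Flask code: the JSON keys `caption` and `image`, the base64 encoding of the annotated photo, and the stand-in model functions are all assumptions for illustration.

```python
import base64
import json


def diagnose(image_bytes, captioner, detector):
    """Run both models on one photo and return a single JSON response body."""
    caption = captioner(image_bytes)    # text from the captioning model
    annotated = detector(image_bytes)   # image bytes with boxes drawn on them
    return json.dumps({
        "caption": caption,
        "image": base64.b64encode(annotated).decode("ascii"),
    })


# Stand-ins for the real models, so the sketch runs on its own.
def fake_captioner(image_bytes):
    return "placeholder caption"


def fake_detector(image_bytes):
    return image_bytes  # a real detector would return the annotated image


response_body = diagnose(b"raw-image-bytes", fake_captioner, fake_detector)
print(response_body)
```

In a Flask backend, a route handler would read the uploaded file, call a function like this, and return the body with a JSON content type, so the app needs only one request per diagnosis.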

During this work the server became overloaded, because of the CPU and memory limits (1 GB RAM) of the instance type available for free on AWS EC2 and the size of the models we had built. To solve this, we allocated part of the EC2 instance's disk as swap memory. This gave the system more memory headroom, reduced the load on the server, and stabilized it.

Benefits

First, the service can improve the image of agriculture. By changing the perception that farming requires specialized management knowledge, it can make agriculture more accessible.

Second, through its additional services it can act as a bridge between novice and veteran farmers. It introduces the conditions in which a disease occurs and how to prevent it in short sentences, so that users can easily understand how to prevent the disease in the future.

Finally, it can have a positive effect on the growth of the agriculture and horticulture markets. Disease diagnosis can be extended to more crops and, further, to houseplants that can contract the same diseases. By developing the service toward diagnosing diseases in more kinds of plants, we expect more people to use plant care services and, ultimately, the agriculture and horticulture markets to grow.

References

Xu et al. "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention", International Conference on Machine Learning (ICML), PMLR, 2015.

Li et al. "Entangled Transformer for Image Captioning", Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019.

Ghandi et al. "Deep Learning Approaches on Image Captioning: A Review", arXiv preprint arXiv:2201.12944, 2022.

