langchain's Introduction

👥 Members


박재윤: main, crawl, gemini
김민수: vector_store, retriever
박나현: chat, gemini
최승아: logger
전영남: sql

👀 How to run

  1. git clone https://github.com/jenner9212/langchain.git
  2. cd langchain
  3. python main.py
  4. When the prompt Enter search term for arXiv papers: appears, enter the paper topic to search for.
  5. The program then collects the content of papers related to that topic.

  6. When the prompt User question: appears, enter your question. Repeat as needed. (Note: because of the token limit, an error may occur if the limit is exceeded.)
  7. Enter exit to end the program.
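
The question loop in the last two steps can be sketched as follows. This is a minimal illustration, not the project's main.py: `run_chat_loop` and its `answer` parameter are hypothetical names, with `answer` standing in for whatever produces the reply (for example Chat.generate_response).

```python
from typing import Callable


def run_chat_loop(
    answer: Callable[[str], str],
    read: Callable[[str], str] = input,
    write: Callable[[str], None] = print,
) -> int:
    """Prompt for questions until the user types 'exit'; return how many were answered."""
    answered = 0
    while True:
        question = read("User question: ").strip()
        if question == "exit":
            break  # the README's exit keyword ends the run
        if not question:
            continue  # assumption: empty input is ignored and the prompt repeats
        write(answer(question))
        answered += 1
    return answered
```

Passing `read` and `write` as parameters keeps the loop testable without a terminal; with the defaults it behaves like an ordinary console prompt.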

💻 Result

[Image: result screenshot]

How it works

  1. The program receives the paper topic to search for.
  2. Crawler.crawl collects the content of papers related to that topic.
  3. VectorStore.store splits the collected content into chunks of size 500, embeds (vectorizes) them, and stores them in vector_db.
  4. The program receives the user's query.
  5. Chat.generate_response calls Retriever.retrieve.
  6. Retriever.retrieve reads vector_db and returns the results closest to the query.
  7. The returned results and the query are passed to GeminiAPI.generate_response.
  8. ConversationChain bundles the prompt, the model to use (gemini), and the memory to retain (the returned results), and the bundle is stored in conversation.
  9. The query is passed to conversation, and Gemini's output is returned.
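
Steps 3 and 6 above (chunk, embed, store, then return the closest chunks) can be illustrated with the standard library alone. This is a toy sketch under stated assumptions, not the project's code: the chunk size is taken to be 500 characters, the word-count "embedding" replaces a real embedding model, and `ToyVectorStore`, `split_into_chunks`, `embed`, and `cosine` are hypothetical names.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def split_into_chunks(text: str, chunk_size: int = 500) -> List[str]:
    """Cut text into consecutive pieces of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """In-memory list of (chunk, vector) pairs with nearest-chunk lookup."""

    def __init__(self) -> None:
        self.entries: List[Tuple[str, Counter]] = []

    def store(self, text: str, chunk_size: int = 500) -> None:
        for chunk in split_into_chunks(text, chunk_size):
            self.entries.append((chunk, embed(chunk)))

    def retrieve(self, query: str, k: int = 3) -> List[str]:
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]
```

In a real pipeline the retrieved chunks would be placed in the prompt sent to the model, which corresponds to step 7.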

Module

main: Controls the application's main execution flow.
gemini: Calls the Gemini API and receives the response.
vector-store: Splits the text data and stores it in the vector DB to build the RAG pipeline.
retriever: Searches the vector DB for knowledge related to the user input.
crawler: Collects the document data needed to build the KB.
chat: Decides how to generate the answer from the user input and the LLM answer.
sql: Connects to an RDBMS (SQLite or MySQL) and handles SQL. -> not used
logger: Each time chat runs, stores the User question input, the Gemini response, and the time.
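
As one illustration of the logger role described above, the sketch below appends one record per chat turn to a JSON Lines file. The storage format, the `ChatLogger` name, and the field names are assumptions made for this example; the README does not say how the project's logger stores its data.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


class ChatLogger:
    """Append question/response pairs with a UTC timestamp to a JSON Lines file."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)

    def log(self, question: str, response: str) -> dict:
        record = {
            "time": datetime.now(timezone.utc).isoformat(),
            "question": question,
            "response": response,
        }
        # One JSON object per line; ensure_ascii=False keeps Korean text readable.
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
        return record
```

Calling `ChatLogger("chat_log.jsonl").log(question, response)` after each answer would give a file that can be read back line by line.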

Project description

This project aims to build a knowledge base with LangChain and to develop a QA engine on top of it.

The goal is a QA bot that uses the Google Gemini API to generate answers grounded in the collected document data. The focus is not on raising the answer quality of the QA bot itself, but on building a structure that can perform this function.

You may freely choose which documents to collect (this is part of the planning stage). You do not have to write crawling code: for AI papers, for example, you may download PDFs directly from arXiv. Alternatively, you can crawl Naver news articles, or pick tech reports in a specific field, Harvard Business Review, the NumPy documentation, and so on.

Requirements

  1. Development language: The entire project is developed in Python.
  2. API usage: LLM answers are obtained through the Gemini API. The model is gemini-pro or gemini-1.5-pro-latest.
  3. Module structure: The project consists of at least five modules, for example the ones below. (These are examples only; decide as a team during the planning stage how to implement them. The planning stage is also part of the evaluation.)
    • main: Controls the application's main execution flow. (main is mandatory.)
    • gemini: Calls the Gemini API and receives the response.
    • vector-store: Splits the text data and stores it in the vector DB to build the RAG pipeline.
    • retriever: Searches the vector DB for knowledge related to the user input.
    • crawler: Collects the document data needed to build the KB.
    • chat: Decides how to generate the answer from the user input and the LLM answer.
    • sql: Connects to an RDBMS (SQLite or MySQL) and handles SQL.
    • logger: Stores what is logged during execution.
  4. Open-source libraries: There is no restriction on which open-source libraries may be used.
  5. LLM restriction: Projects that use other LLMs, such as OpenAI GPT-4 or Claude 3 Opus, are excluded from evaluation. Testing them privately is fine, but use only Gemini in this project.
  6. Code collaboration: Source code management and collaboration take place on GitHub. Each module is developed on its own branch, and code review is done through Pull Requests. Every team member must submit at least one PR and take part as a Contributor to the project.
  7. Use of classes: Each module is designed around classes. **main.py** imports all the modules and runs the whole workflow in one go.

Required features

  1. Checking that an answer was generated correctly through the Gemini API
  2. Setting a System Prompt so that the LLM generates answers as intended
  3. Storing data in the VectorDB for RAG (at least 30 documents; here, 30 documents means roughly the text length of 30 news articles, or about 20,000 characters)
  4. Retrieving the knowledge relevant to the user input
  5. Importing the other modules from main.py and running them together
  6. Remembering the chat history so that the assigned persona, the given role, and the answer output format are maintained → the evaluation will use five user inputs, so remembering five or more turns is sufficient
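
Features 2 and 6 (a fixed system prompt plus a history of at least five turns) can be sketched in plain Python as below. The source indicates the project relies on LangChain's ConversationChain for this; the `ChatMemory` class, its method names, and the prompt layout here are hypothetical and only illustrate the idea.

```python
from collections import deque
from typing import Deque, List, Tuple


class ChatMemory:
    """Keep a system prompt and the most recent question/answer turns."""

    def __init__(self, system_prompt: str, max_turns: int = 5) -> None:
        self.system_prompt = system_prompt
        # deque(maxlen=...) silently drops the oldest turn once the limit is reached.
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def add_turn(self, question: str, answer: str) -> None:
        self.turns.append((question, answer))

    def build_prompt(self, question: str, context: List[str]) -> str:
        """Assemble system prompt, retrieved context, history, and the new question."""
        lines = [self.system_prompt, "", "Context:"]
        lines += [f"- {c}" for c in context]
        lines += ["", "History:"]
        for q, a in self.turns:
            lines += [f"User: {q}", f"Assistant: {a}"]
        lines += ["", f"User: {question}", "Assistant:"]
        return "\n".join(lines)
```

Because the system prompt is re-sent at the top of every prompt, the persona and output format stay in place across turns, while `max_turns` bounds how much history counts against the token limit.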

🧱 Tech Stack

Language

Python

Framework

langchain

gemini
