ai-fitting-room's Introduction

AI Fitting Room

👜 What if you could try on clothes with a single click? An AI fitting room built on a Virtual Try-On model

👗 Introduction

Motivation

Sometimes I have no idea which of the clothes I own to put together. At other times I have put on an outfit thinking "this should look fine", only to be dismayed that it looked nothing like what I had imagined. This project started from the wish to solve these problems with a deep learning model that fits a garment onto you, so that you can see how you would look in it without actually trying it on.

Image from TryOnDiffusion: A Tale of Two UNets

Goal

Existing virtual try-on models are mostly limited to upper-body garments.

⇒ Let's extend this into a model that can fit tops, bottoms, and dresses!

📚 Dataset

Dress-Code [repo]

Proposed in "Dress Code: High-Resolution Multi-Category Virtual Try-On"

  • High-resolution 1024 x 768 images
  • About 50,000 garment images and about 100,000 full-body images
  • Rich annotations, including keypoints, skeletons, label maps, and dense pose


📝 Modeling

DAFlow [repo]

Proposed in "Single Stage Virtual Try-on via Deformable Attention Flows" (ECCV 2022)

A simple single-stage, end-to-end model built on deformable attention, which removes the complexity of earlier multi-stage pipelines.

  • Features
    1. Pyramid feature extraction: coarse-to-fine
    2. Cascade flow estimation: DAFN and DAWarp
    3. Shallow encoder-decoder generation
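
The coarse-to-fine idea behind features 1 and 2 can be illustrated with a generic sketch. This is not DAFlow's code: the function names `upsample_flow` and `refine` are hypothetical, the flow is assumed to be an (H, W, 2) array of pixel displacements, and nearest-neighbour upsampling stands in for whatever interpolation a real model would use. Each level doubles the resolution, so the displacements are doubled too before the finer-level residual is added.

```python
import numpy as np


def upsample_flow(flow):
    """Double the resolution of an (H, W, 2) flow field.

    Nearest-neighbour repetition is used for brevity. Displacements are
    multiplied by 2 because one coarse pixel spans two fine pixels.
    """
    up = np.repeat(np.repeat(flow, 2, axis=0), 2, axis=1)
    return up * 2.0


def refine(coarse_flow, residual):
    """Combine an upsampled coarse flow with a finer-level residual."""
    up = upsample_flow(coarse_flow)
    if up.shape != residual.shape:
        raise ValueError("residual must match the upsampled flow shape")
    return up + residual


# Coarsest level: a single 1x1 flow vector (dx=1, dy=-2).
flow = np.array([[[1.0, -2.0]]])
# Two finer levels with made-up residuals (zeros, then a constant 0.5 shift).
flow = refine(flow, np.zeros((2, 2, 2)))
flow = refine(flow, np.full((4, 4, 2), 0.5))
print(flow.shape, flow[0, 0])
```

In a learned model the residual at each level would come from a network that sees the features warped by the current flow; here the residuals are fixed arrays so that only the upsample-and-add step is shown.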

Brief description of DAFlow

Input of DAFlow

♻️ Data Processing

What is 'Agnostic'?

An agnostic image is one in which the region where the target garment will be synthesized is masked in black. The original code generates agnostic images only for tops, using keypoint and DensePose data. Because this project aims to synthesize bottoms and dresses as well, lower-body and full-body agnostic processing is also required.

Agnostic for full-body

The upper body can be masked completely with keypoints and DensePose alone. However, because DensePose classifies the pelvis region as part of the upper body, the lower body cannot be masked completely with the given data alone.

⇒ We additionally used Dress-Code's label map data, which labels fashion items such as tops, pants, shoes, and hats, to mask the lower body more accurately.

  • Upper body
    • keypoint + densepose
  • Lower body
    • keypoint + densepose + label map
  • Dresses
    • Upper + Lower body masks applied together
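
The per-category combination above can be sketched as follows. This is a minimal illustration rather than the project's preprocessing code: the label ids in `LOWER_LABELS`, the category strings, and both function names are assumptions, and the two pose masks are assumed to be boolean arrays already derived from keypoints and DensePose in an earlier step.

```python
import numpy as np

# Hypothetical label ids for illustration only; check the dataset's own label list.
LOWER_LABELS = (5, 6, 12, 13)  # assumed: skirt, pants, left leg, right leg


def agnostic_mask(category, upper_pose_mask, lower_pose_mask, label_map):
    """Return a boolean (H, W) mask of the pixels to black out."""
    if category == "upper_body":
        return upper_pose_mask.copy()
    # The lower body also uses the label map, since pose data alone misses part of it.
    lower = lower_pose_mask | np.isin(label_map, LOWER_LABELS)
    if category == "lower_body":
        return lower
    if category == "dresses":
        return upper_pose_mask | lower
    raise ValueError(f"unknown category: {category}")


def apply_agnostic(image, mask):
    """Paint the masked region of an (H, W, 3) image black."""
    out = image.copy()
    out[mask] = 0
    return out


# Toy 4x4 example: rows 0-1 are upper body, row 2 is found by pose data,
# and row 3 is found only through the label map.
upper = np.zeros((4, 4), dtype=bool)
upper[:2] = True
lower_pose = np.zeros((4, 4), dtype=bool)
lower_pose[2] = True
labels = np.zeros((4, 4), dtype=np.uint8)
labels[3] = 6

image = np.full((4, 4, 3), 255, dtype=np.uint8)
lower_only = apply_agnostic(image, agnostic_mask("lower_body", upper, lower_pose, labels))
dress_mask = agnostic_mask("dresses", upper, lower_pose, labels)
```

Keeping the mask construction separate from the painting step makes it easy to inspect or adjust the mask shape before it is applied.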


🎓 Training

Fine-Tuning

When training from scratch, the initial loss was large and convergence was slow. We therefore fine-tuned from a DAFlow checkpoint that had been trained mainly on upper-body data, using the preprocessed Dress-Code dataset.

Implementation Detail

  • Epoch: 10
  • Batch size: 1
  • Device: RTX 3080 x 1
  • Data Usage
    • Image Resolution: 512 x 384
    • 1,800 paired sets for each of Upper/Lower/Dresses
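
For a sense of scale, the settings above can be written as a small config object. This is a sketch under assumptions: the class name `FineTuneConfig`, its field names, and the category strings are hypothetical, and the step counts assume all 1,800 pairs per category are used for training, since no train/validation split is stated here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical container for the fine-tuning settings listed above."""

    epochs: int = 10
    batch_size: int = 1
    image_height: int = 512
    image_width: int = 384
    pairs_per_category: int = 1800
    # Assumed category names, one per garment type.
    categories: tuple = ("upper_body", "lower_body", "dresses")

    @property
    def total_pairs(self):
        return self.pairs_per_category * len(self.categories)

    @property
    def steps_per_epoch(self):
        return math.ceil(self.total_pairs / self.batch_size)

    @property
    def total_steps(self):
        return self.steps_per_epoch * self.epochs


cfg = FineTuneConfig()
print(cfg.total_pairs, cfg.steps_per_epoch, cfg.total_steps)  # 5400 5400 54000
```

With a batch size of 1, every pair is its own optimizer step, so one epoch over all three categories is 5,400 steps under these assumptions.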

🧪 Results

Sample results during training

Results obtained during training are arranged chronologically from left to right. As training progressed, the synthesized images became more accurate and more natural. Overfitting set in after the 4th epoch.

Synthesis results for tops

Synthesis results for bottoms

Synthesis results for dresses

Inference with new images


⛔ Limitation

Failure cases: 1. synthesis fails to follow a complex arm pose; 2-3. the neckline details of a dress are lost

  • Sensitive to the shape of the agnostic mask
  • Adapts poorly to complex poses
  • Garment details are sometimes distorted

🤔 Future Works

Performance

  • Study agnostic mask shapes that are both general and effective
  • Improve synthesis so that garment details are preserved
  • Train for robustness to images with varied backgrounds and camera angles

Application

  • Combine with other VITON application areas
  • Build a demo page and a service

ai-fitting-room's People

Contributors

hamin-shim, kchyun, sjn0910
