
Comments (8)

KichangKim commented on June 10, 2024

For historical reasons, the web demo and the training program use different image pre-processing steps, so they produce slightly different results. I have no plans to release the full web demo code yet, but here are the parts that handle image pre-processing and cropping. You can use them to reproduce the web demo's results.

# image_utility.py
import math

import skimage.transform
import numpy as np
import tensorflow as tf


def calculate_image_scale(source_width, source_height, target_width, target_height):
    """
    Calculate scale for image resizing while preserving aspect ratio.
    """
    if source_width == target_width and source_height == target_height:
        return 1.0

    source_ratio = source_width / source_height
    target_ratio = target_width / target_height

    if target_ratio < source_ratio:
        scale = target_width / source_width
    else:
        scale = target_height / source_height

    return scale


def transform_and_pad_image(image, target_width, target_height, scale=None, rotation=None, shift=None, order=1, mode='edge'):
    """
    Transform image and pad by edge pixels.
    """
    image_width = image.shape[1]
    image_height = image.shape[0]
    image_array = image

    # Move the image center to the origin
    t = skimage.transform.AffineTransform(
        translation=(-image_width * 0.5, -image_height * 0.5))

    if scale:
        t += skimage.transform.AffineTransform(scale=(scale, scale))

    if rotation:
        radian = (rotation / 180.0) * math.pi
        t += skimage.transform.AffineTransform(rotation=radian)

    t += skimage.transform.AffineTransform(
        translation=(target_width * 0.5, target_height * 0.5))

    if shift:
        t += skimage.transform.AffineTransform(
            translation=(target_width * shift[0], target_height * shift[1]))

    warp_shape = (target_height, target_width)

    image_array = skimage.transform.warp(
        image_array, (t).inverse, output_shape=warp_shape, order=order, mode=mode)

    return image_array


def crop_image(image, crop_box_ratio):
    width = image.shape[1]
    height = image.shape[0]

    (left_ratio, upper_ratio, right_ratio, lower_ratio) = crop_box_ratio

    width_start = int(width * left_ratio)
    width_end = int(width * right_ratio)
    height_start = int(height * upper_ratio)
    height_end = int(height * lower_ratio)

    return image[height_start:height_end, width_start:width_end, :]


def create_crop_box_ratio_list(ratio):
    return [
        (0, 0, ratio, ratio),
        (1 - ratio, 0, 1, ratio),
        (0, 1 - ratio, ratio, 1),
        (1 - ratio, 1 - ratio, 1, 1),
        ((1 - ratio) * 0.5,
         (1 - ratio) * 0.5, (1 + ratio) * 0.5, (1 + ratio) * 0.5)
    ]


def transform_image(image, width, height):
    source_height = image.shape[0]
    source_width = image.shape[1]

    scale = calculate_image_scale(source_width, source_height, width, height)
    image = transform_and_pad_image(image, width, height, scale=scale)

    return image / 255.0


def load_image(path):
    image_raw = tf.io.read_file(path)
    image = tf.io.decode_png(image_raw, channels=3)

    return image.numpy().astype(np.float32)


def resize_image(image, size):
    return tf.image.resize(image, size=size, method=tf.image.ResizeMethod.AREA, preserve_aspect_ratio=True).numpy()
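
As a worked example of the scaling step above, here is a minimal sketch. The 1000x500 source and 512x512 target are illustrative sizes, not values from this thread, and `fitted_size` is a hypothetical helper that restates the branch rule of calculate_image_scale() while also reporting the size of the scaled content.

```python
def fitted_size(source_width, source_height, target_width, target_height):
    """Return (scale, content_width, content_height) for an aspect-preserving fit.

    Hypothetical helper: uses the same branch rule as calculate_image_scale().
    """
    source_ratio = source_width / source_height
    target_ratio = target_width / target_height

    if target_ratio < source_ratio:
        # Source is relatively wider than the target: width is the limiting side.
        scale = target_width / source_width
    else:
        # Source is relatively taller (or equal): height is the limiting side.
        scale = target_height / source_height

    return scale, round(source_width * scale), round(source_height * scale)


# Illustrative sizes only: a wide 1000x500 image into a square 512x512 input.
scale, content_w, content_h = fitted_size(1000, 500, 512, 512)
pad_rows = 512 - content_h  # rows left over for padding
print(scale, content_w, content_h, pad_rows)
```

Because transform_and_pad_image() centers the content in the target, the leftover rows end up split between the top and bottom, and mode='edge' fills them by repeating the border pixels.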

The core function is transform_image(). Here is the cropping code as well:

y = model.predict(image_transformed)[0]

if crop == 'true':
    crop_box_ratio_list = image_utility.create_crop_box_ratio_list(0.6)

    for crop_box_ratio in crop_box_ratio_list:
        image_crop = image_utility.crop_image(image, crop_box_ratio)
        image_crop = image_utility.transform_image(
            image_crop, image_width, image_height)
        image_crop = image_crop.reshape(
            (1, image_crop.shape[0], image_crop.shape[1], image_crop.shape[2]))
        y_crop = model.predict(image_crop)[0]
        y_crop = np.multiply(
            y_crop, project_data['crop_exclude_tags_vector'])

        y = np.maximum(y, y_crop)

from deepdanbooru.

KichangKim commented on June 10, 2024

@rachmadaniHaryono

  1. The four corner crops (top-left, top-right, bottom-left, bottom-right) each cover 50% of the edge length plus a small 10% overlap, which gives 60%.
  2. The excluded tags are those likely to be estimated incorrectly because of cropping. For example, if the original image has two girls and a cropped image contains only one, the crop may return the 1girl tag, which should not be included. Number-related, size-related, and angle-related tags are candidates.
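
The arithmetic behind the 60% figure can be sketched as follows. `corner_crop_overlap` is a hypothetical helper, not part of DeepDanbooru; it works on one axis in ratio units (0.0 to 1.0 of the image edge) and assumes the corner boxes are laid out as in create_crop_box_ratio_list().

```python
def corner_crop_overlap(ratio):
    """Width of the band shared by two opposite corner crops along one axis.

    Hypothetical helper. The left/top crop spans [0, ratio] and the
    right/bottom crop spans [1 - ratio, 1], both as fractions of the edge.
    """
    first_crop_end = ratio
    second_crop_start = 1.0 - ratio
    # No overlap (0.0) when the crops only touch or leave a gap.
    return max(0.0, first_crop_end - second_crop_start)


# With ratio 0.6, each crop reaches 10% past the midline,
# so the shared band is about 20% of the edge.
print(corner_crop_overlap(0.6))
# With ratio 0.5, the crops only touch at the midline.
print(corner_crop_overlap(0.5))
```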


KichangKim commented on June 10, 2024

The web demo uses the latest release ( https://github.com/KichangKim/DeepDanbooru/releases/tag/v3-20200915-sgd-e30 ).

"Use Cropping" is simple: it splits the image into multiple small, overlapping parts and estimates tags for each part independently. It then combines all estimated tags with filtering (removing tags that were mis-estimated because of the splitting).
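
The combine-and-filter step described above can be sketched with made-up numbers. Everything here is illustrative: the three tags, the scores, and the name `merge_crop_scores` are assumptions for the example, and the element-wise maximum is one way to combine the per-crop estimates.

```python
import numpy as np


def merge_crop_scores(full_scores, crop_scores_list, crop_mask):
    """Combine full-image scores with masked per-crop scores (hypothetical helper).

    crop_mask holds 0 for tags that crops must not contribute and 1 otherwise.
    """
    merged = full_scores.copy()
    for crop_scores in crop_scores_list:
        # Zero out unreliable tags from the crop, then keep the higher score.
        merged = np.maximum(merged, crop_scores * crop_mask)
    return merged


# Made-up scores for three tags: ["1girl", "2girls", "hat"].
full_scores = np.array([0.10, 0.90, 0.30])
crop_scores_list = [
    np.array([0.95, 0.05, 0.80]),  # this crop sees one girl and a hat clearly
    np.array([0.90, 0.10, 0.20]),
]
crop_mask = np.array([0.0, 0.0, 1.0])  # count tags are excluded for crops

merged = merge_crop_scores(full_scores, crop_scores_list, crop_mask)
print(merged)
```

In this example the crops raise the "hat" score, while the count tags keep the scores from the full image only.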


Superfloh commented on June 10, 2024

Sadly I'm getting different results with the v3 model and the web version, usually very similar tags but different scores. (not using the cropping option)
The "Use Cropping" idea is pretty interesting, would you mind releasing the code for that ? ^^

Also on an unrelated sidenote, the requirements.txt still has tensorflow>=2.1.0.

Model v3 result:
[image: v3 result]

Web result:
[image: web result]

Original:
[image: original]


Superfloh commented on June 10, 2024

Thank you very much. Loading and pre-processing the image with that code does indeed give the same result as the web page.
For the cropping, I'm missing project_data['crop_exclude_tags_vector']; it doesn't exist in project.json.


KichangKim commented on June 10, 2024

It is a simple mask vector of 0s and 1s with one entry per tag. If a tag is in exclude_tags, its value is 0; otherwise it is 1.

Here is my exclude_tags:

1boy
2boys
3boys
4boys
5boys
6+boys
1girl
2girls
3girls
4girls
5girls
6+girls
1koma
2koma
3koma
4koma
5koma
solo
solo_focus
text_focus
ass_focus
male_focus
out-of-frame_censoring
out_of_frame
feet_out_of_frame
head_out_of_frame
lower_body
upper_body
portrait
close-up
rating:safe
rating:questionable
rating:explicit
score:very_bad
score:bad
score:average
score:good
score:very_good
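
Building the mask vector from such a list can be sketched as below. This is an assumption about one reasonable way to do it, not the web demo's actual code: `build_crop_exclude_mask` is a hypothetical name, and the short tag lists are placeholders for the model's full tag list (in model output order) and the full exclude list above.

```python
import numpy as np


def build_crop_exclude_mask(tags, exclude_tags):
    """Return a 0/1 vector aligned with `tags` (hypothetical helper).

    An entry is 0 if the tag is in exclude_tags, and 1 otherwise.
    """
    excluded = set(exclude_tags)
    return np.array(
        [0.0 if tag in excluded else 1.0 for tag in tags], dtype=np.float32
    )


# Placeholder lists; in practice `tags` must follow the model's output order.
tags = ["1girl", "hat", "solo", "smile"]
exclude_tags = ["1girl", "solo", "upper_body"]

mask = build_crop_exclude_mask(tags, exclude_tags)
print(mask)
```

Multiplying a crop's score vector by this mask zeroes the excluded tags before the scores are merged.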


Superfloh commented on June 10, 2024

I made a vector out of the tags mentioned above and I'm getting the same result as the web version now.

In case someone else is interested in the cropping feature, here is my code:

            project_context, model, tags = dd.project.load_project(project_path)
            # The model input shape is (batch, height, width, channels).
            width = model.input_shape[2]
            height = model.input_shape[1]
            try:
                image = load_image(image_path)
                image_transformed = transform_image(image, width=width, height=height)
            except Exception:
                print("error loading the image")
                continue  # skip this image (the excerpt sits inside a loop)

            # Add the batch dimension and score the full image.
            image_shape = image_transformed.shape
            image_transformed = image_transformed.reshape((1, image_shape[0], image_shape[1], image_shape[2]))
            y = model.predict(image_transformed)[0]

            if crop == 'true':
                # 0/1 mask vector, read once instead of once per crop.
                exclude_tags = np.fromfile(project + "/exclude_tags.txt", dtype=int, sep='\n')
                crop_box_ratio_list = create_crop_box_ratio_list(0.6)
                for crop_box_ratio in crop_box_ratio_list:
                    image_crop = crop_image(image, crop_box_ratio)

                    image_crop = transform_image(image_crop, width=width, height=height)
                    image_crop = image_crop.reshape(
                        (1, image_crop.shape[0], image_crop.shape[1], image_crop.shape[2]))
                    y_crop = model.predict(image_crop)[0]

                    # Zero the excluded tags, then keep the best score per tag.
                    y_crop = np.multiply(y_crop, exclude_tags)
                    y = np.maximum(y, y_crop)

And here is the vector file, exclude_tags.txt:

exclude_tags.txt
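
The attached file is not reproduced here. As a sketch, assuming it holds one 0 or 1 per line in tag order (which is what the np.fromfile(..., sep='\n') call above would read), such a file could be written and read back like this. The five-entry mask and the temporary directory are placeholders for the example.

```python
import os
import tempfile

import numpy as np

# Placeholder mask; a real one has one entry per model tag.
mask = np.array([0, 1, 1, 0, 1], dtype=int)

with tempfile.TemporaryDirectory() as project_dir:
    path = os.path.join(project_dir, "exclude_tags.txt")

    # Write one integer per line.
    np.savetxt(path, mask, fmt="%d")

    # Read it back the same way the code above does.
    loaded = np.fromfile(path, dtype=int, sep="\n")

print(loaded)
```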

Thank you for your help.


rachmadaniHaryono commented on June 10, 2024
  1. Why a ratio of 0.6?
  2. How do you choose which tags to exclude?
