assafshocher / blindsr_dataset_generator

Downscale a set of images by randomly created kernels and save them

License: Other

Jupyter Notebook 100.00%

blindsr_dataset_generator's Issues

ValueError: Floating point image RGB values must be in the 0..1 range.

Because kernel generation calls raw_kernel_centered = kernel_shift(raw_kernel, scale_factor), some entries of the kernel are occasionally negative. As a result, some pixel values can fall below 0 after downscaling, and saving the image then raises ValueError: Floating point image RGB values must be in the 0..1 range.

This happens randomly, but when generating DIV2KRK the error occurs almost every time.

How do you deal with this problem?
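
One possible workaround, offered as a sketch and not as the author's fix: scipy.ndimage's shift interpolates with a cubic spline by default (order=3), and spline interpolation can overshoot slightly below zero. Clipping negative kernel entries before normalization, and clamping the LR image to [0, 1] before saving, avoids the error at the cost of slightly altering the kernel. The helper names sanitize_kernel and clip_for_saving are hypothetical and do not exist in the repository; the toy arrays stand in for a real kernel and image.

```python
import numpy as np


def sanitize_kernel(kernel):
    """Zero out negative entries and renormalize so the kernel sums to 1."""
    kernel = np.clip(kernel, 0.0, None)
    return kernel / kernel.sum()


def clip_for_saving(image):
    """Clamp a floating-point image to the [0, 1] range expected when saving."""
    return np.clip(image, 0.0, 1.0)


# Toy kernel with two slightly negative entries, as spline interpolation may produce.
toy_kernel = np.array([[-0.01, 0.10, 0.00],
                       [0.10, 0.62, 0.10],
                       [0.00, 0.10, -0.01]])
clean_kernel = sanitize_kernel(toy_kernel)

# Toy LR image with values just outside the valid range.
toy_lr = np.array([[-0.02, 0.5], [1.03, 0.25]])
safe_lr = clip_for_saving(toy_lr)
```

In the notebook, the first helper would be applied to raw_kernel_centered before normalization and the second to the downscaled image right before plt.imsave.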

The settings for 2x downsampling

Hi. Thank you for sharing the code.
Could you tell me how to set the following parameters when generating 2x downsampled LR images, so that the generated images are consistent with the dataset you used? Thanks.

scale_factor = np.array([4, 4])  # choose scale-factor
avg_sf = np.mean(scale_factor)  # this is calculated so that min_var and max_var will be more intuitive
min_var = 0.175 * avg_sf  # variance of the gaussian kernel will be sampled between min_var and max_var
max_var = 2.5 * avg_sf
k_size = np.array([21, 21])  # size of the kernel, should have room for the gaussian
noise_level = 0.4  # this option allows deviation from just a gaussian, by adding multiplicative noise
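
For reference only: min_var and max_var are defined relative to avg_sf, so substituting scale_factor = [2, 2] into the same formulas gives a variance range of 0.35 to 5.0. The sketch below just performs that substitution. Whether the published dataset used these values, or a different k_size or noise_level for 2x, is an assumption that this thread does not confirm.

```python
import numpy as np

scale_factor = np.array([2, 2])  # assumed value for 2x downsampling
avg_sf = np.mean(scale_factor)   # average scale factor across both axes

# Same formulas as in the snippet above, now evaluated for avg_sf = 2
min_var = 0.175 * avg_sf
max_var = 2.5 * avg_sf

print(avg_sf, min_var, max_var)
```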

Code does not work in some situations! Confusing! Help!!

Hello Mr. Assafshocher,
Thanks for your contributions. I ran into some problems when using your code.

When I downscale an image other than the one in the "example" folder of this repository, the program fails to display the LR image obtained by convolving the HR image with the kernel and downscaling it.

The only difference I could find is that the images have different resolutions.

import numpy as np
from matplotlib import pyplot as plt
import matplotlib.image as img
from scipy.ndimage import filters, measurements, interpolation
import glob

from scipy.io import savemat

import ntpath

# this option allows deviation from just a gaussian, by adding multiplicative noise
noise_level = 0.4
output_path = './kernel_with_img'


# Function for centering a kernel
def kernel_shift(kernel, sf):
    # There are two reasons for shifting the kernel:
    # 1. Center of mass is not in the center of the kernel which creates ambiguity. There is no possible way to know
    #    the degradation process included shifting so we always assume center of mass is center of the kernel.
    #
    # 2. We further shift kernel center so that top left result pixel corresponds to the middle of the sfXsf first
    #    pixels. Default is for odd size to be in the middle of the first pixel and for even sized kernel to be at the
    #    top left corner of the first pixel. That is why a different shift size is needed between odd and even sizes.
    #
    # Given that these two conditions are fulfilled, we are happy and aligned, the way to test it is as follows:
    # The input image, when interpolated (regular bicubic) is exactly aligned with ground truth.

    # First calculate the current center of mass for the kernel
    current_center_of_mass = measurements.center_of_mass(kernel)

    # The second ("+ 0.5 * ....") is for applying condition 2 from the comments above
    wanted_center_of_mass = np.array(kernel.shape) / 2 + 0.5 * (sf - (kernel.shape[0] % 2))

    # Define the shift vector for the kernel shifting (x, y)
    shift_vec = wanted_center_of_mass - current_center_of_mass

    # Finally shift the kernel and return
    return interpolation.shift(kernel, shift_vec)


# Function for generating one kernel
def generate_kernel(k_size, scale_factor, min_var, max_var):

    # Set random eigen-vals (lambdas) and angle (theta) for COV matrix
    lambda_1 = min_var + np.random.rand() * (max_var - min_var)
    lambda_2 = min_var + np.random.rand() * (max_var - min_var)
    theta = np.random.rand() * np.pi
    noise = -noise_level + np.random.rand(*k_size) * noise_level * 2

    # Set COV matrix using Lambdas and Theta
    LAMBDA = np.diag([lambda_1, lambda_2])
    Q = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
    SIGMA = Q @ LAMBDA @ Q.T
    INV_SIGMA = np.linalg.inv(SIGMA)[None, None, :, :]

    # Set expectation position (shifting kernel for aligned image)
    MU = k_size // 2 + 0.5 * (scale_factor - k_size % 2)
    MU = MU[None, None, :, None]

    # Create meshgrid for Gaussian
    [X, Y] = np.meshgrid(range(k_size[0]), range(k_size[1]))
    Z = np.stack([X, Y], 2)[:, :, :, None]

    # Calculate Gaussian for every pixel of the kernel
    ZZ = Z - MU
    ZZ_t = ZZ.transpose(0, 1, 3, 2)
    raw_kernel = np.exp(-0.5 * np.squeeze(ZZ_t @ INV_SIGMA @ ZZ)) * (1 + noise)

    # shift the kernel so it will be centered
    raw_kernel_centered = kernel_shift(raw_kernel, scale_factor)

    # Normalize the kernel and return
    kernel = raw_kernel_centered / np.sum(raw_kernel_centered)

    return kernel


def downscale(img, kernel, scale_factor, output_shape=None):
    """
    Function for downscaling an image using a kernel

    :param img: input image (H x W x C)
    :param kernel: blur kernel
    :param scale_factor: downscaling factor per axis
    :param output_shape: output size (rows, columns); computed when omitted
    """
    # output shape can either be specified or, for simple cases, can be calculated.
    # see more details regarding this at: https://github.com/assafshocher/Resizer
    if output_shape is None:
        # img.shape[:-1] drops the channel axis; floor division keeps the shape integral,
        # because np.linspace needs an integer number of samples
        output_shape = np.array(img.shape[:-1]) // np.array(scale_factor)
    output_shape = np.asarray(output_shape).astype(int)

    # First run a correlation (convolution with flipped kernel)
    out_img = np.zeros_like(img)
    # Iterate over the channels (img.shape[2]), not over the number of array dimensions
    for channel in range(img.shape[2]):
        out_img[:, :, channel] = filters.correlate(img[:, :, channel], kernel)

    # Then subsample and return
    # [:, None] turns the row indices into a column vector so they broadcast against the column indices
    return out_img[np.round(np.linspace(0, img.shape[0] - scale_factor[0], output_shape[0])).astype(int)[:, None],
                   np.round(np.linspace(0, img.shape[1] - scale_factor[1], output_shape[1])).astype(int), :]


# Load images, downscale using kernels, save and display
def main():

    # raw string so the backslashes in the Windows path are not treated as escape sequences
    images_path = r'.\datasets\DIV2KRK\gt\img_3_gt.png'

    # choose scale-factor
    # scale_factor = np.array([4, 4])
    scale_factor = np.array([8, 8])

    # this is calculated so that min_var and max_var will be more intuitive
    avg_sf = np.mean(scale_factor)

    # variance of the gaussian kernel will be sampled between min_var and max_var
    min_var = 0.175 * avg_sf
    max_var = 2.5 * avg_sf

    # size of the kernel, should have room for the gaussian
    k_size = np.array([21, 21])

    for i, path in enumerate(glob.glob(images_path)):

        # read the image
        im = img.imread(path)

        # generate the blur kernel
        kernel = generate_kernel(k_size, scale_factor, min_var, max_var)

        # downscale the image with this kernel and scale factor to obtain the LR image
        lr = downscale(im, kernel, scale_factor)

        print(i)
        print(type(kernel))
        print(kernel.shape)

        fig = plt.figure()

        # widen the horizontal spacing between subplots
        fig.subplots_adjust(wspace=0.4)

        # original image
        f1 = fig.add_subplot(1, 3, 1)
        f1.title.set_text('Original image')
        f1.imshow(im)

        # LR image
        f2 = fig.add_subplot(1, 3, 3)
        f2.title.set_text('Low-resolution (LR) image')
        f2.imshow(lr)

        # blur kernel
        f3 = fig.add_subplot(1, 3, 2)
        f3.title.set_text('Blur kernel (Gaussian)')
        f3.minorticks_on()
        f3.imshow(kernel, cmap='gray')

        fig.show()

        # save the kernel and the LR image (indentation fixed so both calls stay inside the loop)
        savemat('%s/im_%d_sf_%d_%d.mat' % (output_path, i, scale_factor[0], scale_factor[1]), {'ker': kernel})

        plt.imsave('%s/im_%d_sf_%d_%d.png' % (output_path, i, scale_factor[0], scale_factor[1]), lr, vmin=0, vmax=1)

main()

The image comes from DIV2KRK, but when I open the saved LR image in an image viewer, I see only black or white.

I am very confused about this situation.
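
One thing worth checking, as a guess and not a confirmed diagnosis: np.ndim(img) is the number of array dimensions, which is 3 for any H x W x C image, while img.shape[2] is the number of channels. A channel loop over range(np.ndim(img)) therefore never touches a fourth (alpha) channel, so for a PNG loaded as RGBA the alpha of the output would stay at zero and the result would be fully transparent. The sketch below shows the difference on a synthetic array, with a plain copy standing in for the per-channel correlation.

```python
import numpy as np

# Synthetic RGBA image: every channel, including alpha, is fully "on".
rgba = np.ones((8, 8, 4), dtype=np.float32)

# Looping over np.ndim(rgba) visits indices 0, 1, 2 only, so alpha is skipped.
out_ndim = np.zeros_like(rgba)
for channel in range(np.ndim(rgba)):
    out_ndim[:, :, channel] = rgba[:, :, channel]

# Looping over the actual channel count visits all four channels.
out_channels = np.zeros_like(rgba)
for channel in range(rgba.shape[2]):
    out_channels[:, :, channel] = rgba[:, :, channel]

alpha_with_ndim = float(out_ndim[:, :, 3].max())          # alpha never written
alpha_with_channels = float(out_channels[:, :, 3].max())  # alpha copied over
```

Printing im.shape and im.dtype for the failing image, along with lr.min() and lr.max(), would show whether this case applies.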

[Err] An error occurred when calling the function downscale

return out_im[np.round(np.linspace(0, im.shape[0] - scale_factor[0], output_shape[0])).astype(np.int64)[:, None],

File "<array_function internals>", line 5, in linspace
File "D:\Anaconda3\lib\site-packages\numpy\core\function_base.py", line 119, in linspace
raise TypeError(
TypeError: object of type <class 'numpy.float64'> cannot be safely interpreted as an integer.
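
The traceback indicates that np.linspace received a float as its num argument. Dividing the image shape by the scale factor with / produces float64 values, and recent NumPy versions refuse to interpret those as a sample count. A minimal sketch of one way around it, assuming floor division is acceptable for the image sizes in use; img_shape is a hypothetical stand-in for a real image's shape.

```python
import numpy as np

img_shape = (64, 48, 3)            # hypothetical H x W x C image shape
scale_factor = np.array([4, 4])

# True division yields float64 values, which np.linspace rejects as `num`.
float_shape = np.array(img_shape[:-1]) / scale_factor

# Floor division keeps the result as integers.
output_shape = np.array(img_shape[:-1]) // scale_factor

# Row and column indices to sample, built the same way as in downscale.
rows = np.round(np.linspace(0, img_shape[0] - scale_factor[0], output_shape[0])).astype(int)
cols = np.round(np.linspace(0, img_shape[1] - scale_factor[1], output_shape[1])).astype(int)
```

Casting an explicitly passed output_shape with .astype(int) has the same effect when the caller supplies the shape.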
