Comments (30)

bubbliiiing commented on May 23, 2024

If the format is the same, it should work... I haven't figured it out myself yet.

xiamibudayang commented on May 23, 2024

Yes, you only need to rewrite annotation.py and dataloader.py. I was stuck here for a long time, but I've got it training now. Feel free to ask if you run into problems; we can help each other out.

Dejavusd commented on May 23, 2024

> Yes, you only need to rewrite annotation.py and dataloader.py. I was stuck here for a long time, but I've got it training now. Feel free to ask if you run into problems; we can help each other out.

Do you have the code? ovo

xiamibudayang commented on May 23, 2024

annotation.py:

```python
import os
import random
'''
Generates the files under ImageSets: for each split it saves the file-name
stems shared by the raw images and their annotation images.
'''

# -------------------------------------------------------#
#   Points to the folder containing the dataset.
#   (The original script pointed at a VOC dataset in the
#   repo root; this version points at Cityscapes.)
# -------------------------------------------------------#
cityscapes_path = 'data/cityscapes'

if __name__ == "__main__":
    random.seed(0)
    print("Generate txt in ImageSets.")
    # path to the color segmentation maps
    segfilepath = os.path.join(cityscapes_path, 'SegmentationClass')
    # where the txt files listing the train/val image names are saved
    saveBasePath = os.path.join(cityscapes_path, 'ImageSets/Segmentation')

    ########################################## train ##########################################
    citys = os.listdir(os.path.join(segfilepath, 'train'))
    temp_train_seg = []
    for city in citys:
        temp_train_seg += os.listdir(os.path.join(segfilepath, 'train', city))
    total_train_seg = []  # file names of all training segmentation maps
    for seg in temp_train_seg:
        if seg.endswith("color.png"):
            total_train_seg.append(seg)

    tr = len(total_train_seg)
    print("train size", tr)
    # txt file that receives this split's names
    ftrain = open(os.path.join(saveBasePath, 'train.txt'), 'w')
    for i in total_train_seg:
        # strip the 16-character suffix 'gtFine_color.png', keeping the
        # stem plus its trailing underscore, e.g. 'aachen_000000_000019_'
        name = i[:-16] + '\n'
        ftrain.write(name)
    ftrain.close()

    ########################################### val ###########################################
    citys = os.listdir(os.path.join(segfilepath, 'val'))
    temp_val_seg = []
    for city in citys:
        temp_val_seg += os.listdir(os.path.join(segfilepath, 'val', city))
    total_val_seg = []  # file names of all validation segmentation maps
    for seg in temp_val_seg:
        if seg.endswith("color.png"):
            total_val_seg.append(seg)

    tv = len(total_val_seg)
    print("val size", tv)
    # txt file that receives this split's names
    fval = open(os.path.join(saveBasePath, 'val.txt'), 'w')
    for i in total_val_seg:
        # strip the 16-character suffix 'gtFine_color.png'
        name = i[:-16] + '\n'
        fval.write(name)
    fval.close()

    ########################################### test ##########################################
    citys = os.listdir(os.path.join(segfilepath, 'test'))
    temp_test_seg = []
    for city in citys:
        temp_test_seg += os.listdir(os.path.join(segfilepath, 'test', city))
    total_test_seg = []  # file names of all test segmentation maps
    for seg in temp_test_seg:
        if seg.endswith("color.png"):
            total_test_seg.append(seg)

    ts = len(total_test_seg)
    print("test size", ts)
    # txt file that receives this split's names
    ftest = open(os.path.join(saveBasePath, 'test.txt'), 'w')
    for i in total_test_seg:
        # strip the 16-character suffix 'gtFine_color.png'
        name = i[:-16] + '\n'
        ftest.write(name)
    ftest.close()

    print("Generate txt in ImageSets done.")
```
dataloader.py:

```python
import os

import cv2
import numpy as np
from PIL import Image
from torch.utils.data.dataset import Dataset
from utils.utils import preprocess_input, cvtColor
# When running this file directly for testing, switch to the import below
# and adjust the root path; I haven't found a cleaner fix yet.
# from utils import preprocess_input, cvtColor


class DeeplabDataset(Dataset):
    def __init__(self, annotation_lines, input_shape, num_classes, train, dataset_path):
        super(DeeplabDataset, self).__init__()
        self.annotation_lines = annotation_lines
        self.length = len(annotation_lines)
        self.input_shape = input_shape
        self.num_classes = num_classes
        self.train = train
        self.dataset_path = dataset_path

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        annotation_line = self.annotation_lines[index]
        name = annotation_line.split()[0]

        # -------------------------------#
        #   Read the image and its label from disk
        # -------------------------------#
        city = name.split('_')[0]
        if self.train:
            jpg = Image.open(os.path.join(self.dataset_path, "JPEGImages/train/" + city, name + "leftImg8bit.png"))
            png = Image.open(os.path.join(self.dataset_path, "SegmentationClass/train/" + city, name + "gtFine_labelTrainIds.png"))
        else:
            jpg = Image.open(os.path.join(self.dataset_path, "JPEGImages/val/" + city, name + "leftImg8bit.png"))
            png = Image.open(os.path.join(self.dataset_path, "SegmentationClass/val/" + city, name + "gtFine_labelTrainIds.png"))
        # -------------------------------#
        #   Data augmentation
        # -------------------------------#
        jpg, png = self.get_random_data(jpg, png, self.input_shape, random=self.train)
        # move the channel dimension to the front (HWC -> CHW)
        jpg = np.transpose(preprocess_input(np.array(jpg, np.float64)), [2, 0, 1])
        png = np.array(png)
        # Map every label value >= num_classes to num_classes.
        # Cityscapes trainIds run 0, 1, ..., 18 plus 255; 255 marks pixels to
        # ignore and becomes 19 here, so class 19 is the ignore class.
        png[png >= self.num_classes] = self.num_classes
        # -------------------------------------------------------#
        #   Convert to one-hot form.
        #   The +1 exists because (in VOC) some labels carry white
        #   border pixels that must be ignored; the extra class
        #   makes ignoring them convenient.
        # -------------------------------------------------------#
        # One-hot encode every pixel id: rows = height * width,
        # columns = num_classes + 1 (19 real classes + ignore class = 20)
        seg_labels = np.eye(self.num_classes + 1)[png.reshape([-1])]
        # the one-hot encoding lies along the channel (depth) axis
        seg_labels = seg_labels.reshape((int(self.input_shape[1]), int(self.input_shape[0]), self.num_classes + 1))
        return jpg, png, seg_labels

    def rand(self, a=0, b=1):
        return np.random.rand() * (b - a) + a

    def get_random_data(self, image, label, input_shape, jitter=.3, hue=.1, sat=1.5, val=1.5, random=True):
        image = cvtColor(image)
        label = Image.fromarray(np.array(label))
        h, w = input_shape

        if not random:
            # validation: letterbox to input_shape with gray padding
            iw, ih = image.size
            scale = min(w / iw, h / ih)
            nw = int(iw * scale)
            nh = int(ih * scale)

            image = image.resize((nw, nh), Image.BICUBIC)
            new_image = Image.new('RGB', (w, h), (128, 128, 128))
            new_image.paste(image, ((w - nw) // 2, (h - nh) // 2))

            label = label.resize((nw, nh), Image.NEAREST)
            new_label = Image.new('L', (w, h), 0)
            new_label.paste(label, ((w - nw) // 2, (h - nh) // 2))
            return new_image, new_label

        # resize image with random aspect-ratio jitter and scale
        rand_jit1 = self.rand(1 - jitter, 1 + jitter)
        rand_jit2 = self.rand(1 - jitter, 1 + jitter)
        new_ar = w / h * rand_jit1 / rand_jit2

        scale = self.rand(0.25, 2)
        if new_ar < 1:
            nh = int(scale * h)
            nw = int(nh * new_ar)
        else:
            nw = int(scale * w)
            nh = int(nw / new_ar)

        image = image.resize((nw, nh), Image.BICUBIC)
        label = label.resize((nw, nh), Image.NEAREST)

        # random horizontal flip
        flip = self.rand() < .5
        if flip:
            image = image.transpose(Image.FLIP_LEFT_RIGHT)
            label = label.transpose(Image.FLIP_LEFT_RIGHT)

        # place image at a random offset on a gray canvas
        dx = int(self.rand(0, w - nw))
        dy = int(self.rand(0, h - nh))
        new_image = Image.new('RGB', (w, h), (128, 128, 128))
        new_label = Image.new('L', (w, h), 0)
        new_image.paste(image, (dx, dy))
        new_label.paste(label, (dx, dy))
        image = new_image
        label = new_label

        # distort image: random hue/saturation/value jitter in HSV space
        hue = self.rand(-hue, hue)
        sat = self.rand(1, sat) if self.rand() < .5 else 1 / self.rand(1, sat)
        val = self.rand(1, val) if self.rand() < .5 else 1 / self.rand(1, val)
        x = cv2.cvtColor(np.array(image, np.float32) / 255, cv2.COLOR_RGB2HSV)
        # OpenCV float32 HSV uses hue in [0, 360], so wrap by 360
        # (the original wrapped by 1, which was a bug)
        x[..., 0] += hue * 360
        x[..., 0][x[..., 0] > 360] -= 360
        x[..., 0][x[..., 0] < 0] += 360
        x[..., 1] *= sat
        x[..., 2] *= val
        x[:, :, 1:][x[:, :, 1:] > 1] = 1
        x[x < 0] = 0
        image_data = cv2.cvtColor(x, cv2.COLOR_HSV2RGB) * 255
        return image_data, label


# used as collate_fn in the DataLoader
def deeplab_dataset_collate(batch):
    images = []
    pngs = []
    seg_labels = []
    for img, png, labels in batch:
        images.append(img)
        pngs.append(png)
        seg_labels.append(labels)
    images = np.array(images)
    pngs = np.array(pngs)
    seg_labels = np.array(seg_labels)
    return images, pngs, seg_labels


if __name__ == "__main__":
    cityscapes_path = '../data/cityscapes'
    input_shape = [384, 384]
    num_classes = 19
    with open(os.path.join(cityscapes_path, "ImageSets/Segmentation/train.txt"), "r") as f:
        train_lines = f.readlines()
    train_dataset = DeeplabDataset(train_lines, input_shape, num_classes, True, cityscapes_path)
    train_dataset[1]
```
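A minimal usage sketch (not from the thread; the batch size and worker count are placeholders) wiring this dataset into a torch DataLoader with the collate function above:

```python
# Hedged sketch: hook DeeplabDataset into a DataLoader.
from torch.utils.data import DataLoader

train_loader = DataLoader(train_dataset, batch_size=4, shuffle=True,
                          num_workers=2, collate_fn=deeplab_dataset_collate)
for images, pngs, seg_labels in train_loader:
    # (4, 3, 384, 384), (4, 384, 384), (4, 384, 384, 20) as numpy arrays;
    # the training loop would convert these to tensors.
    print(images.shape, pngs.shape, seg_labels.shape)
    break
```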

Dejavusd commented on May 23, 2024
> (quotes the annotation.py and dataloader.py code above)

The cityscapes directory I downloaded only contains gtFine and leftImg8bit; the dataset doesn't seem to include a train.txt. Do I need to create one myself?

xiamibudayang commented on May 23, 2024

The train.txt file has to be generated yourself; that annotation.py file is exactly what generates it. My code matches the author's, and the author's readme explains the workflow very clearly.

bubbliiiing commented on May 23, 2024

Sure, I'll look into it 0 0

Dejavusd commented on May 23, 2024

> Sure, I'll look into it 0 0

How do I get the minimal bounding rectangle, or the coordinate points, of a segmented object?

bubbliiiing commented on May 23, 2024

I think that's actually pretty easy... just take the maximum and minimum of the mask's x and y coordinates, wouldn't that do it?
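A minimal numpy sketch of that suggestion (not from the thread; the mask array and the class id are hypothetical inputs):

```python
import numpy as np

def mask_bbox(mask: np.ndarray, class_id: int):
    """Return (x_min, y_min, x_max, y_max) of pixels equal to class_id."""
    ys, xs = np.where(mask == class_id)
    if xs.size == 0:
        return None  # class not present in this mask
    return xs.min(), ys.min(), xs.max(), ys.max()

# Toy example: a 5x5 mask with a 2x2 block of hypothetical class 14
mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:3, 2:4] = 14
print(mask_bbox(mask, 14))  # (2, 1, 3, 2)
```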

Dejavusd commented on May 23, 2024

> I think that's actually pretty easy... just take the maximum and minimum of the mask's x and y coordinates, wouldn't that do it?

After the 50 frozen-training epochs finish, I get an OOM error: 0 bytes free.

Dejavusd commented on May 23, 2024

> I think that's actually pretty easy... just take the maximum and minimum of the mask's x and y coordinates, wouldn't that do it?

Also, how do I run inference with the converted ONNX model?

```python
import onnxruntime as nxrun
import numpy as np
from skimage.transform import resize
from skimage import io
from PIL import Image
import matplotlib.pyplot as plt
import cv2

image = io.imread("img/street.jpg")
image = np.rollaxis(image, 2, 0)

img = resize(image / 255, (3, 512, 512), anti_aliasing=True)
f_img = img[np.newaxis, :, :, :]
f_img = f_img.astype(np.float32)

sess = nxrun.InferenceSession("torch_model2.onnx")

print("The model expects input shape: ", sess.get_inputs()[0].shape)  # [1, 3, 512, 512]

input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name
result = sess.run(None, {input_name: f_img})
```
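A hedged sketch of one way to post-process `result` (not from the thread; it assumes the exported model emits class logits shaped [1, num_classes, 512, 512], which this thread does not confirm):

```python
# Assumption: result[0] holds logits of shape [1, num_classes, 512, 512].
pred = result[0]
seg_map = np.argmax(pred[0], axis=0)  # (512, 512) map of per-pixel class ids
print(seg_map.shape, np.unique(seg_map))
```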

xiamibudayang commented on May 23, 2024

> After the 50 frozen-training epochs finish, I get an OOM error: 0 bytes free.

That is probably caused by the unfreeze-phase batch size being set too large. Unfrozen training updates more parameters than frozen training, so its batch size needs to be smaller than the frozen-phase one.
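As a hedged illustration of that advice (the parameter names Freeze_batch_size and Unfreeze_batch_size follow this repo's train.py as I recall it; treat them, and the values, as assumptions):

```python
# In train.py: give the unfreeze phase a smaller batch size, since far
# more parameters (and their activations/gradients) live in GPU memory.
Freeze_batch_size   = 8  # backbone frozen: less memory per sample
Unfreeze_batch_size = 4  # backbone unfrozen: halve it to avoid OOM
```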

bubbliiiing commented on May 23, 2024

> That is probably caused by the unfreeze-phase batch size being set too large. Unfrozen training updates more parameters than frozen training, so its batch size needs to be smaller than the frozen-phase one.

Yes.

bubbliiiing commented on May 23, 2024

> Also, how do I run inference with the converted ONNX model?
> (quotes the ONNX inference snippet above)

I can't help with that one.

[deleted user] commented on May 23, 2024

> Yes, you only need to rewrite annotation.py and dataloader.py. I was stuck here for a long time, but I've got it training now. Feel free to ask if you run into problems; we can help each other out.

Hello, may I ask how good your results were when training on the cityscapes dataset with this model? Thanks.

ChouCHou-y commented on May 23, 2024

> The train.txt file has to be generated yourself; that annotation.py file is exactly what generates it. My code matches the author's, and the author's readme explains the workflow very clearly.

So this means not using the official split, and instead shuffling the data and re-splitting it?

xiamibudayang commented on May 23, 2024

> So this means not using the official split, and instead shuffling the data and re-splitting it?

No need to split it yourself: it keeps the original train and val sets. The script only writes the file names into a txt so that the corresponding images can be located during training.

xiamibudayang commented on May 23, 2024

> Hello, may I ask how good your results were when training on the cityscapes dataset with this model? Thanks.

deeplabv3+ performs quite well; it's just that the images in cityscapes are fairly large, so to train well you need to cut the original images into smaller patches before training.
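A minimal sketch of that pre-cropping step (not from the thread; the patch size and file names are placeholders), cutting a 2048x1024 Cityscapes frame into 512x512 tiles:

```python
# Hedged sketch: tile one image/label pair into 512x512 patches.
import numpy as np
from PIL import Image

def tile(img: np.ndarray, size: int = 512):
    """Split an (H, W[, C]) array into non-overlapping size x size patches."""
    h, w = img.shape[:2]
    patches = []
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            patches.append(img[y:y + size, x:x + size])
    return patches

img = np.array(Image.open('aachen_000000_000019_leftImg8bit.png'))           # (1024, 2048, 3)
lbl = np.array(Image.open('aachen_000000_000019_gtFine_labelTrainIds.png'))  # (1024, 2048)
img_patches, lbl_patches = tile(img), tile(lbl)  # 8 patches each (2 x 4 grid)
```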

ChouCHou-y commented on May 23, 2024

> deeplabv3+ performs quite well; it's just that the images in cityscapes are fairly large, so to train well you need to cut the original images into smaller patches before training.

May I ask whether the fine annotations are divided into 19 classes? I've never managed to find the class-to-RGB mapping. Could you share it?

xiamibudayang commented on May 23, 2024

> May I ask whether the fine annotations are divided into 19 classes? I've never managed to find the class-to-RGB mapping. Could you share it?

```python
labels = [
    #     name                     id   trainId  category        catId  hasInstances  ignoreInEval  color
    Label('unlabeled',              0,  255,     'void',         0, False, True,  (  0,   0,   0)),
    Label('ego vehicle',            1,  255,     'void',         0, False, True,  (  0,   0,   0)),
    Label('rectification border',   2,  255,     'void',         0, False, True,  (  0,   0,   0)),
    Label('out of roi',             3,  255,     'void',         0, False, True,  (  0,   0,   0)),
    Label('static',                 4,  255,     'void',         0, False, True,  (  0,   0,   0)),
    Label('dynamic',                5,  255,     'void',         0, False, True,  (111,  74,   0)),
    Label('ground',                 6,  255,     'void',         0, False, True,  ( 81,   0,  81)),
    Label('road',                   7,    1,     'flat',         1, False, False, (128,  64, 128)),
    Label('sidewalk',               8,    2,     'flat',         1, False, False, (244,  35, 232)),
    Label('parking',                9,  255,     'flat',         1, False, True,  (250, 170, 160)),
    Label('rail track',            10,  255,     'flat',         1, False, True,  (230, 150, 140)),
    Label('building',              11,    3,     'construction', 2, False, False, ( 70,  70,  70)),
    Label('wall',                  12,    4,     'construction', 2, False, False, (102, 102, 156)),
    Label('fence',                 13,    5,     'construction', 2, False, False, (190, 153, 153)),
    Label('guard rail',            14,  255,     'construction', 2, False, True,  (180, 165, 180)),
    Label('bridge',                15,  255,     'construction', 2, False, True,  (150, 100, 100)),
    Label('tunnel',                16,  255,     'construction', 2, False, True,  (150, 120,  90)),
    Label('pole',                  17,    6,     'object',       3, False, False, (153, 153, 153)),
    Label('polegroup',             18,  255,     'object',       3, False, True,  (153, 153, 153)),
    Label('traffic light',         19,    7,     'object',       3, False, False, (250, 170,  30)),
    Label('traffic sign',          20,    8,     'object',       3, False, False, (220, 220,   0)),
    Label('vegetation',            21,    9,     'nature',       4, False, False, (107, 142,  35)),
    Label('terrain',               22,   10,     'nature',       4, False, False, (152, 251, 152)),
    Label('sky',                   23,   11,     'sky',          5, False, False, ( 70, 130, 180)),
    Label('person',                24,   12,     'human',        6, True,  False, (220,  20,  60)),
    Label('rider',                 25,   13,     'human',        6, True,  False, (255,   0,   0)),
    Label('car',                   26,   14,     'vehicle',      7, True,  False, (  0,   0, 142)),
    Label('truck',                 27,   15,     'vehicle',      7, True,  False, (  0,   0,  70)),
    Label('bus',                   28,   16,     'vehicle',      7, True,  False, (  0,  60, 100)),
    Label('caravan',               29,  255,     'vehicle',      7, True,  True,  (  0,   0,  90)),
    Label('trailer',               30,  255,     'vehicle',      7, True,  True,  (  0,   0, 110)),
    Label('train',                 31,   17,     'vehicle',      7, True,  False, (  0,  80, 100)),
    Label('motorcycle',            32,   18,     'vehicle',      7, True,  False, (  0,   0, 230)),
    Label('bicycle',               33,   19,     'vehicle',      7, True,  False, (119,  11,  32)),
    Label('license plate',         -1,   -1,     'vehicle',      7, False, True,  (  0,   0, 142)),
]
```
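A hedged sketch of using this table to color a predicted trainId map (it assumes the trainId numbering shown above; the Label namedtuple mirrors cityscapesScripts, and `seg_map` is a hypothetical prediction):

```python
from collections import namedtuple
import numpy as np

# Mirrors the Label tuple from cityscapesScripts (an assumption here).
Label = namedtuple('Label', ['name', 'id', 'trainId', 'category', 'catId',
                             'hasInstances', 'ignoreInEval', 'color'])
# `labels` is the list quoted above.

palette = np.zeros((256, 3), dtype=np.uint8)
for l in labels:
    if 0 <= l.trainId < 256:
        palette[l.trainId] = l.color

seg_map = np.full((4, 4), 14, dtype=np.uint8)  # hypothetical prediction: all 'car'
color_map = palette[seg_map]                   # (4, 4, 3) RGB visualization
```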

miscedence12 commented on May 23, 2024

@xiamibudayang May I know which files your cityscapes folder contains? When I run annotation.py it keeps complaining that there is no train folder.
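For reference, the layout the scripts above expect (reconstructed from the paths in annotation.py and dataloader.py, not confirmed by the poster; the city and file names are illustrative):

```
data/cityscapes/
├── JPEGImages/
│   ├── train/aachen/aachen_000000_000019_leftImg8bit.png
│   └── val/...
├── SegmentationClass/
│   ├── train/aachen/aachen_000000_000019_gtFine_color.png
│   │                aachen_000000_000019_gtFine_labelTrainIds.png
│   ├── val/...
│   └── test/...
└── ImageSets/Segmentation/   # train.txt / val.txt / test.txt are written here
```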

ChouCHou-y commented on May 23, 2024

> Yes, you only need to rewrite annotation.py and dataloader.py. I was stuck here for a long time, but I've got it training now. Feel free to ask if you run into problems; we can help each other out.

May I ask what mIoU you reached when reproducing this on cityscapes? I only get around 50 and would like some advice. If it's convenient, could I get your contact details? I can pay.

duduzai2019 commented on May 23, 2024

> Yes, you only need to rewrite annotation.py and dataloader.py. I was stuck here for a long time, but I've got it training now. Feel free to ask if you run into problems; we can help each other out.

```python
x[..., 0] += hue * 360
x[..., 0][x[..., 0] > 1] -= 1
x[..., 0][x[..., 0] < 0] += 1
x[..., 1] *= sat
x[..., 2] *= val
x[x[:, :, 0] > 360, 0] = 360
x[:, :, 1:][x[:, :, 1:] > 1] = 1
x[x < 0] = 0
```

May I ask what this block of code does?

bubbliiiing commented on May 23, 2024

It's the color-space augmentation: random hue/saturation/value jitter in HSV.

duduzai2019 commented on May 23, 2024

Got it 😘

ChouCHou-y commented on May 23, 2024

> Yes, you only need to rewrite annotation.py and dataloader.py. I was stuck here for a long time, but I've got it training now. Feel free to ask if you run into problems; we can help each other out.

Hello, I tried the approach you described, but the mIoU only reaches 59 and I can't tell where it went wrong. I'd really like to ask you about it.

Chihanxx commented on May 23, 2024

@xiamibudayang Hello, may I ask how your training went after the changes? What mIoU did you reach? I'd like to ask for your advice.

Chihanxx commented on May 23, 2024

@Dejavusd Hello, did modifying those two files work for you? I'd like to ask for your advice.

Voyagerlemon commented on May 23, 2024

@xiamibudayang I'd like to ask: besides the hyperparameters the author set in train.py, are any other changes needed? Could I see your specific hyperparameters?

Voyagerlemon commented on May 23, 2024

@xiamibudayang Hello, I'd like to ask how the code in callbacks.py should be changed to match the rewritten annotation.py and dataloader.py.
