basicocr's Issues

Modify the demo code so that it works on both CPU and GPU

#coding: utf-8
import torch
from torch.autograd import Variable
import utils
import dataset
import os
from PIL import Image

import models.crnn as crnn

#os.environ["CUDA_VISIBLE_DEVICES"] ="1"
model_path = './data/netCRNN_ch_nc_21_nh_128.pth'
img_path = './data/image33.jpg'
alphabet = u'ACIMRey万下依口哺摄次状璐癌草血运重'
#print(alphabet)
nclass = len(alphabet) + 1

# Check whether a GPU is available
if torch.cuda.is_available():
    model = crnn.CRNN(32, 1, nclass, 128).cuda()
    pre_model = torch.load(model_path)
else:
    model = crnn.CRNN(32, 1, nclass, 128)
    # Remap tensors saved on a GPU onto the CPU
    pre_model = torch.load(model_path, map_location=lambda storage, loc: storage)

print('loading pretrained model from %s' % model_path)
for k, v in pre_model.items():
    print(k, len(v))
model.load_state_dict(pre_model)

converter = utils.strLabelConverter(alphabet)

transformer = dataset.resizeNormalize((100, 32))
image = Image.open(img_path).convert('L')

# Move the input to the GPU only if one is available
if torch.cuda.is_available():
    image = transformer(image).cuda()
else:
    image = transformer(image)

image = image.view(1, *image.size())
image = Variable(image)

model.eval()
preds = model(image)

_, preds = preds.max(2)
preds = preds.squeeze(2)
preds = preds.transpose(1, 0).contiguous().view(-1)

preds_size = Variable(torch.IntTensor([preds.size(0)]))
raw_pred = converter.decode(preds.data, preds_size.data, raw=True)
sim_pred = converter.decode(preds.data, preds_size.data, raw=False)
print('%-20s => %-20s' % (raw_pred.encode('utf8'), sim_pred.encode('utf8')))

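Loss is wrong with multi-GPU training, but single-GPU training is fine

A question about CRNN: when I train on multiple GPUs, the loss seems wrong. Compared with a single-GPU model trained to the same loss value, the multi-GPU model predicts very poorly, while the single-GPU model predicts correctly. @wulivicte, have you run into this problem, and how did you solve it? Thank you very much.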

How can I get the probability of the sequence output by CRNN?

Hello,

I'm wondering whether CRNN can also output the probability of each decoded sequence.

For example:

--h-e-l-l-oo- => 'hello' with a probability of 0.89

How can I get that?

I can't find where these probabilities are exposed in CTCLoss(), or where I could print them. In __init__.py the CTC classes are defined as follows:

class _CTC(Function):
    def forward(self, acts, labels, act_lens, label_lens):
        is_cuda = True if acts.is_cuda else False
        acts = acts.contiguous()
        loss_func = warp_ctc.gpu_ctc if is_cuda else warp_ctc.cpu_ctc
        grads = torch.zeros(acts.size()).type_as(acts)
        minibatch_size = acts.size(1)
        costs = torch.zeros(minibatch_size)
        loss_func(acts,
                  grads,
                  labels,
                  label_lens,
                  act_lens,
                  minibatch_size,
                  costs)
        self.grads = grads
        self.costs = torch.FloatTensor([costs.sum()])
        return self.costs

    def backward(self, grad_output):
        return self.grads, None, None, None


class CTCLoss(Module):
    def __init__(self):
        super(CTCLoss, self).__init__()

    def forward(self, acts, labels, act_lens, label_lens):
        """
        acts: Tensor of (seqLength x batch x outputDim) containing output from network
        labels: 1 dimensional Tensor containing all the targets of the batch in one sequence
        act_lens: Tensor of size (batch) containing size of each output sequence from the network
        label_lens: Tensor of (batch) containing label length of each example
        """
        _assert_no_grad(labels)
        _assert_no_grad(act_lens)
        _assert_no_grad(label_lens)
        return _CTC()(acts, labels, act_lens, label_lens)
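
A note on the question above, as a minimal sketch rather than anything taken from this repo: CTC defines the probability of a label sequence as the sum over all frame-level alignments that collapse to it, and the CTC loss is the negative log of that sum, so exp(-cost) for a single sample should be the same quantity. The pure-Python example below assumes you already have per-timestep softmax probabilities (the function names and the `blank=0` convention are assumptions for illustration). It shows both the probability of the single greedy path and the full sequence probability from the CTC forward recursion.

```python
def greedy_decode(probs, blank=0):
    """Best-path decoding.

    probs: list of per-timestep probability lists (each row sums to 1).
    Returns (label indices, probability of that single best path).
    """
    labels, path_prob, prev = [], 1.0, None
    for step in probs:
        k = max(range(len(step)), key=step.__getitem__)
        path_prob *= step[k]
        # Collapse repeats, then drop blanks
        if k != blank and k != prev:
            labels.append(k)
        prev = k
    return labels, path_prob


def ctc_sequence_prob(probs, labels, blank=0):
    """Probability of `labels`, summed over all alignments (CTC forward pass)."""
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    size = len(ext)
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]
    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # Skipping a blank is allowed only between two different labels
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)


# Two frames, two classes: index 0 is blank, index 1 is a label
probs = [[0.4, 0.6], [0.3, 0.7]]
labels, path_prob = greedy_decode(probs)
seq_prob = ctc_sequence_prob(probs, labels)
print(labels, path_prob, seq_prob)
```

The greedy path probability is only a lower bound on the sequence probability, because several alignments map to the same text. For real sequence lengths the recursion should be done in log space to avoid underflow.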

Is this RARE?

Hello, is this model RARE? Why is it called CRNN here?

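Question about environment setup

Hello, I read the textbox experiment write-up and have a question about the environment setup: how should I configure it? Do I install Docker on Ubuntu 14.04? I don't understand this sentence: "On the server, a caffe_ys container was created with nvidia-docker from the image gds/keras-th-tf-opencv. Following Caffe's dependency files, the GPU version was compiled." ╮(╯▽╰)╭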

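How do I fine-tune after loading the pretrained model?

When I run crnn_main.py, my classes differ from those of the pretrained model, so I need to make changes. I edited the Chinese characters in key.py and changed requires_grad in crnn_main, but I couldn't find where to change the number of classes. Could you tell me where to change it?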

How to train a new model

Hi,
Can anyone please tell me how to train this model for transfer learning?
The model itself works fine, but when I train it I get the following error.

Traceback (most recent call last):
  File "crnn_main.py", line 191, in <module>
    train_iter = iter(train_loader)
  File "/usr/local/lib/python2.7/dist-packages/torch/utils/data/dataloader.py", line 303, in __iter__
    return DataLoaderIter(self)
  File "/usr/local/lib/python2.7/dist-packages/torch/utils/data/dataloader.py", line 143, in __init__
    self.sample_iter = iter(self.sampler)
  File "/home/pranay/crnn.pytorch/dataset.py", line 99, in __iter__
    random_start = random.randint(0, len(self) - self.batch_size)
  File "/usr/lib/python2.7/random.py", line 242, in randint
    return self.randrange(a, b+1)
  File "/usr/lib/python2.7/random.py", line 218, in randrange
    raise ValueError, "empty range for randrange() (%d,%d, %d)" % (istart, istop, width)
ValueError: empty range for randrange() (0,-33, -33)
Exception AttributeError: "'DataLoaderIter' object has no attribute 'shutdown'" in <bound method DataLoaderIter.__del__ of <torch.utils.data.dataloader.DataLoaderIter object at 0x7ff08e6da050>> ignored

I run the code from the terminal with the following command:
python crnn_main.py --trainroot="Pranay/train_set/" --valroot="Pranay/validation_set" --alphabet='0123456789abcdefghijklmnopqrstuvwxyz !-%.'"'"',#&$\/[]:()?;'

I just want to check that training works before I train the complete model, so I picked a validation set of 10 images and a training set of 30 images. I know 30 images are not enough to train a useful model, but I only want to see training run on the CPU. I think this error has something to do with the small size of the training data.

When I try to run with more images (150-200), my computer hangs, which is why I am trying a small subset.

I am training on ICDAR 2015 data.
I generated the .mdb data from an image path list like this (just a sample; I have more data in both lists):

['../test_images/word_11.png',
'../test_images/word_12.png',
'../test_images/word_13.png',
'../test_images/word_14.png',]

and a label list like this:
['Genaxis Theatre',
'[06]',
'62-03',
'Carpark',
]

I have generated the train.mdb and lock.mdb files in both the train_set and validation_set folders. I changed the alphabet to include the new special characters:
--alphabet='0123456789abcdefghijklmnopqrstuvwxyz !-%.'"'"',#&$/[]:()?;'

Please, can you help me train my model? Any help is really appreciated.

Thanks
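
A note on the traceback above, offered as a guess rather than a confirmed diagnosis: the failing line is `random.randint(0, len(self) - self.batch_size)`, and the reported range `(0,-33, -33)` means `len(self) - self.batch_size` was -34. That is what a 30-sample training set with a batch size of 64 would produce (the batch size is an assumption; it is not stated in the post). So the sampler is being asked for a batch larger than the dataset, and a smaller batch size or more data should avoid the error. The sketch below reproduces the arithmetic with a hypothetical helper, `random_start`, that is not part of the repo and fails with a clearer message.

```python
import random


def random_start(num_samples, batch_size, rng=random):
    """Pick the start index of one contiguous batch.

    Hypothetical helper for illustration: it mirrors the failing expression
    random.randint(0, len(self) - self.batch_size) from the traceback.
    """
    if batch_size > num_samples:
        raise ValueError(
            "batch_size (%d) is larger than the dataset (%d samples); "
            "use a smaller batch size or more data" % (batch_size, num_samples)
        )
    # randint is inclusive on both ends, so the last valid start is
    # num_samples - batch_size
    return rng.randint(0, num_samples - batch_size)


# 30 samples with a batch of 10: any start in 0..20 is valid
start = random_start(30, 10)
print(start)

# 30 samples with a batch of 64: 30 - 64 = -34, the case from the traceback
try:
    random_start(30, 64)
except ValueError as err:
    print(err)
```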
