
aggregation-cross-entropy's People

Contributors: summerlvsong, whang94


aggregation-cross-entropy's Issues

Why use ResLSTM for Offline Handwritten Chinese Text Recognition ?

Nice work!

I found this network architecture in the paper for handwritten Chinese text recognition:

(126,576)Input − 8C3 − MP2 − 32C3 − MP2 − 128C3 − MP2 − 5×256C3 − MP2 − 512C3 − 512C3 − MP2 − 512C2 − 3×512ResLSTM − 7357FC − Output

Is the ResLSTM part necessary for Chinese text recognition?

Question about Eq. (2) in the paper

Why can the general loss function be estimated by Eq. (2) in Section 3? Isn't each probability term in the estimate much larger than the corresponding term in the general loss function?

KL Divergence

In Section 3.2 of the paper (https://arxiv.org/pdf/1904.08364.pdf) it is mentioned:
"We borrow the concept of cross-entropy from information theory, which is designed to measure the “distance” between two probability distributions."

Wouldn't KL divergence be a better way to measure the distance between the two probability distributions?
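For what it's worth, cross-entropy and KL divergence differ only by the entropy of the target distribution, which is fixed here (it comes from the label counts), so minimizing either objective gives the same optimum and the same gradients with respect to the model. A small sketch in plain Python (the distributions are made-up examples):

```python
import math

def cross_entropy(p, q):
    # H(P, Q) = -sum_k p_k * log q_k
    return -sum(pk * math.log(qk) for pk, qk in zip(p, q))

def entropy(p):
    # H(P) = -sum_k p_k * log p_k
    return -sum(pk * math.log(pk) for pk in p if pk > 0)

def kl_div(p, q):
    # KL(P || Q) = sum_k p_k * log(p_k / q_k)
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)

p = [0.5, 0.25, 0.25]  # fixed target distribution (from label counts)
q = [0.4, 0.4, 0.2]    # predicted distribution

# The two objectives differ only by H(P), a constant w.r.t. the model:
# H(P, Q) = H(P) + KL(P || Q)
gap = cross_entropy(p, q) - kl_div(p, q)  # equals entropy(p)
```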

HWDB results

Hi, have you experimented on HWDB 2.0–2.2? If so, could you share your results for ACE? Thanks.

It doesn't work on my data; why don't you provide a pretrained model?

I pretrained a model using CTC loss and it works well. Then I loaded the weights and continued training with the ACE loss. The loss seemed to come down, but the test results were terrible, with almost everything wrong.

Here is my implementation of ACELoss.

device = torch.device("cuda:" + cfg.TRAIN.GPU_ID if torch.cuda.is_available() else "cpu")

class ACELoss(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, input_, target, target_lens):
        # input_: (T, batch, num_class) softmax probabilities.
        # target: flattened label sequence for the whole batch; blank is class 0.
        w, bs, num_class = input_.size()
        aggregations = torch.zeros(bs, num_class)
        idx = 0  # running offset into the flattened target; must not reset per sample
        for i in range(bs):
            for j in range(target_lens[i]):
                aggregations[i][target[idx]] += 1
                idx += 1
            aggregations[i][0] = w - target_lens[i]  # remaining time steps are blanks
        target = aggregations.to(device)

        input_ = torch.sum(input_, 0) / w  # aggregate over time, normalize by length
        target = target / w

        # Epsilon inside the log keeps absent classes from producing log(0) = -inf.
        loss = (-torch.sum(torch.log(input_ + 1e-10) * target)) / bs
        return loss
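For what it's worth, the per-character inner loop can be replaced with scatter_add_; a self-contained sketch assuming the same (T, batch, num_class) layout with blank at index 0 (function and argument names are illustrative):

```python
import torch

def ace_loss(probs, target, target_lens, blank=0):
    # probs: (T, batch, num_class) softmax outputs.
    # target: 1-D LongTensor, the concatenated labels of the whole batch.
    T, bs, num_class = probs.shape
    counts = torch.zeros(bs, num_class)
    offset = 0
    for i in range(bs):
        seq = target[offset:offset + target_lens[i]]
        # Count each label occurrence for sample i in one call.
        counts[i].scatter_add_(0, seq, torch.ones(seq.numel()))
        counts[i][blank] = T - target_lens[i]  # blanks fill the remaining steps
        offset += target_lens[i]
    # Aggregate predictions over time and compare the two count distributions.
    pred = probs.sum(0) / T
    return -(counts / T * torch.log(pred + 1e-10)).sum() / bs
```

With a uniform prediction over 3 classes and a 2-label target over 4 time steps, the loss reduces to -log(1/3), which is a quick sanity check that the normalization is consistent.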

Loss declines but accuracy stays near 0

I trained a CRNN model on the Synth90k dataset. Although the loss declines step by step, the accuracy stays near 0 the whole time. What causes this problem?

Can't reproduce the results in your CVPR paper

I reproduced CRNN+CTC and tested it on the IIIT5K+SVT+IC03+IC13 test sets, getting a WER of 0.153, the same as the result reported in the paper.
I also reproduced CRNN+ACE loss, but only got a WER of 0.205 on the same test sets. Any advice?
My environment:
pytorch 1.2.0
batch size 60
trained only on the 8-million synthetic images released by Jaderberg
iterations 1000k
adadelta rho 0.9

How can the character order of a word be predicted accurately?

I recreated your project and found that the input ground truth is converted into a bag of characters, which loses the order, and the prediction only provides the character counts. The order can barely be judged from the 2D positions in the network output. I would like to ask how to accurately predict the character order of a word.
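Speculatively, one way to read an order out of the per-step output is CTC-style best-path decoding: take the argmax class at each time step, collapse repeats, and drop blanks. ACE only supervises the counts, but the per-step distribution still tends to peak in reading order. A minimal sketch, assuming class 0 is the blank:

```python
def greedy_decode(path, blank=0):
    """Collapse a per-time-step argmax path into a label sequence."""
    decoded, prev = [], None
    for c in path:
        if c != prev and c != blank:
            decoded.append(c)
        prev = c
    return decoded

# e.g. the argmax path [0, 2, 2, 0, 3, 3, 1, 0] decodes to [2, 3, 1]
```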

Combined with CTC?

Very nice work!
Can this method be combined with CTC in the 1D case to further improve performance? Does it conflict with CTC during training?
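The paper doesn't prescribe such a combination, but mechanically the two losses can share the same per-step output: CTC consumes the log-probabilities and ACE consumes the time-aggregated probabilities, joined by a weighting factor. A sketch with made-up shapes and an untuned weight `lam` (all names here are hypothetical):

```python
import torch
import torch.nn as nn

T, bs, num_class = 40, 2, 37  # time steps, batch, classes (blank = index 0)
log_probs = torch.log_softmax(torch.randn(T, bs, num_class), dim=2)
targets = torch.tensor([[3, 7, 12, 5], [5, 9, 0, 0]])  # padded label matrix
input_lens = torch.full((bs,), T, dtype=torch.long)
target_lens = torch.tensor([4, 2])

# CTC term on the log-probabilities.
ctc = nn.CTCLoss(blank=0, zero_infinity=True)
ctc_loss = ctc(log_probs, targets, input_lens, target_lens)

# ACE term: compare time-aggregated probabilities with label counts.
probs = log_probs.exp()
counts = torch.zeros(bs, num_class)
for i in range(bs):
    for c in targets[i][: target_lens[i]]:
        counts[i][c] += 1
    counts[i][0] = T - target_lens[i]  # blanks fill the remaining steps
ace_loss = -(counts / T * torch.log(probs.sum(0) / T + 1e-10)).sum() / bs

lam = 0.1  # hypothetical weighting; would need tuning
total = ctc_loss + lam * ace_loss
```

Whether this actually helps, or whether the two objectives pull the alignment in conflicting directions, is an empirical question the sketch doesn't answer.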

Need help! Loss is NaN

For this line: torch.log(input)

The 'input' is the softmax score (0–1).
If the k-th class does not appear in an input, the accumulated softmax score over all time steps for the k-th class is very likely to be 0, and torch.log(input) then becomes NaN.

How do you make sure that 'input' is never 0 inside 'torch.log(input)'?
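One common fix is to clamp the aggregated scores (or add a small epsilon) before taking the log, so an absent class contributes a large-but-finite penalty instead of -inf; a minimal sketch, assuming the aggregated score tensor is called `probs`:

```python
import torch

# An absent class can aggregate to exactly 0, so log gives -inf
# (and 0 * -inf = nan once multiplied by the target counts).
probs = torch.tensor([1.0, 0.0, 0.0])
safe = torch.log(probs.clamp_min(1e-10))  # or: torch.log(probs + 1e-10)
# All entries of `safe` are finite, bounded below by log(1e-10).
```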

When will the code be publicly available?

Hello, your "Aggregation Cross-Entropy for Sequence Recognition" paper is excellent work. I just want to check whether you will release the code or not. Thanks!
