Can anyone guide me on how to train on a custom dataset, for example CORD?

How to train on custom dataset? about docformer HOT 2 CLOSED

shabie commented on July 20, 2024

How to train on custom dataset?

from docformer.

Comments (2)

uakarsh commented on July 20, 2024 1

Here is how you can do it:

Write a dataset class that reads an image path and the labels associated with it. For sequence labeling, where we predict a class for each piece of text, the labels would have shape (seq_len, 1), since each one has a single class.
Next, extract the features using the create_features function.
From the dataset object, return five items: (1) resized_scaled_img, (2) x_features, (3) y_features, (4) input_ids, (5) labels.
Use a collate function in PyTorch to batch the multiple outputs of each dataset item, then pass the batch on as follows:

Use ResNetFeatureExtractor to get the image features of resized_scaled_img.
Use DocFormerEmbedding to get the embeddings of x_features and y_features.
Use LanguageFeatureExtractor to get the features of the tokenized words.

Having obtained these outputs, pass them through the DocFormerEncoder.
Then attach a linear layer suited to your task.
Compute the loss and backpropagate it.
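
As a rough illustration of the dataset and collate steps above, here is a minimal sketch in plain PyTorch. It does not call the real docformer code: `fake_create_features` is a hypothetical stand-in for create_features that returns random tensors, and the tensor shapes, `SEQ_LEN`, `DocDataset` and `collate_fn` are all assumptions chosen for illustration. The labels are stored with shape (seq_len,) instead of (seq_len, 1), because that is the layout `nn.CrossEntropyLoss` expects for class indices.

```python
import torch
from torch.utils.data import Dataset, DataLoader

SEQ_LEN = 8  # hypothetical fixed sequence length, for illustration only


def fake_create_features(image_path, seq_len=SEQ_LEN):
    """Hypothetical stand-in for create_features: returns random tensors.

    The shapes below are assumptions, not the shapes docformer produces.
    """
    return {
        "resized_scaled_img": torch.rand(3, 32, 32),
        "x_features": torch.randint(0, 100, (seq_len, 4)),
        "y_features": torch.randint(0, 100, (seq_len, 4)),
        "input_ids": torch.randint(0, 1000, (seq_len,)),
    }


class DocDataset(Dataset):
    """Holds (image_path, labels) pairs and returns the five items per sample."""

    def __init__(self, samples, feature_fn=fake_create_features):
        self.samples = samples
        self.feature_fn = feature_fn

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        image_path, labels = self.samples[idx]
        item = self.feature_fn(image_path)
        # One class index per sequence position.
        item["labels"] = torch.as_tensor(labels, dtype=torch.long)
        return item


def collate_fn(batch):
    """Stack each of the five per-sample tensors along a new batch dimension."""
    return {key: torch.stack([item[key] for item in batch]) for key in batch[0]}


# Four dummy samples; the image paths are placeholders and are never opened.
samples = [(f"img_{i}.png", [i % 3] * SEQ_LEN) for i in range(4)]
loader = DataLoader(DocDataset(samples), batch_size=2, collate_fn=collate_fn)
batch = next(iter(loader))
```

Stacking only works here because every sample has the same sequence length; with variable-length inputs the collate function would need to pad first.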

We will shortly wrap these steps in a small boilerplate so that the code becomes clearer and more concise. Hope this helps.
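
In the meantime, the last three steps (encoder output, linear layer, loss and backward pass) could look roughly like the sketch below. This is not the docformer API: the `encoder` here is a small hypothetical stand-in for everything up to and including DocFormerEncoder, and `TokenClassifier`, `HIDDEN`, `NUM_CLASSES` and the random inputs are assumptions made up for the example.

```python
import torch
from torch import nn

# Hypothetical sizes, chosen only to keep the example small.
HIDDEN, NUM_CLASSES, SEQ_LEN, BATCH = 16, 3, 8, 2


class TokenClassifier(nn.Module):
    """Wraps an encoder and adds a linear layer for per-token classification."""

    def __init__(self, encoder, hidden, num_classes):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, features):
        # (batch, seq_len, hidden) -> (batch, seq_len, num_classes)
        return self.head(self.encoder(features))


# Stand-in encoder; in practice this would be the feature extractors
# followed by DocFormerEncoder.
encoder = nn.Sequential(nn.Linear(HIDDEN, HIDDEN), nn.ReLU())
model = TokenClassifier(encoder, HIDDEN, NUM_CLASSES)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Random tensors standing in for one collated batch.
features = torch.rand(BATCH, SEQ_LEN, HIDDEN)
labels = torch.randint(0, NUM_CLASSES, (BATCH, SEQ_LEN))

# One training step: forward, loss, backward, parameter update.
logits = model(features)
loss = criterion(logits.reshape(-1, NUM_CLASSES), labels.reshape(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Flattening the logits to (batch * seq_len, num_classes) and the labels to (batch * seq_len,) lets `nn.CrossEntropyLoss` treat every token as an independent classification example.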

gnanaravindhan commented on July 20, 2024

@uakarsh Thanks for the detailed step-by-step explanation, I'll have a go at it.
