Comments (7)
Have a look here: https://github.com/uakarsh/docformer/blob/master/examples/DocFormer_for_MLM.ipynb
The error occurs because the input is not batched, i.e. it has a shape of (....) rather than (batch_size, ....).
from docformer.
@uakarsh Thank you for your help! Does this mean that the Usage section of the README can't actually be used? I was trying to demo it for my study group. I tried encoding['resized_scaled_img'] = encoding['resized_scaled_img'].unsqueeze(0) to add a batch dimension of size 1, but that didn't work either.
It can be used; we just need to pass the argument add_batch_dim=True to the dataset.create_features function.
What you did won't work either, because there are more features than just the image, i.e. you need to unsqueeze the other features as well. I have updated the README; hope it helps.
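For anyone who wants to add the batch dimension by hand, here is a minimal sketch of unsqueezing every feature at once. It assumes the encoding is a plain dict of PyTorch tensors; the helper add_batch_dim_to_all, the two some_* keys, and all shapes are hypothetical placeholders, and only 'resized_scaled_img' comes from this thread.

```python
import torch


def add_batch_dim_to_all(encoding):
    """Return a new dict in which every tensor gains a leading batch axis of size 1."""
    return {
        key: value.unsqueeze(0) if torch.is_tensor(value) else value
        for key, value in encoding.items()
    }


# Dummy unbatched features. Only 'resized_scaled_img' is a key taken from the
# thread; the other keys and every shape here are illustrative placeholders.
encoding = {
    "resized_scaled_img": torch.zeros(3, 384, 500),
    "some_box_feature": torch.zeros(512, 8, dtype=torch.long),
    "some_token_feature": torch.zeros(512, dtype=torch.long),
}

batched = add_batch_dim_to_all(encoding)
for key, value in batched.items():
    print(key, tuple(value.shape))
```

Unsqueezing only one key leaves the remaining features unbatched, which matches the behaviour described above.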
Thank you so much, it runs now! Unsqueezing each feature also works for me, but add_batch_dim is more straightforward. Are there any examples of follow-up steps (i.e., what the resulting tensor means in terms of the input image)? I can't find that in the README or the examples.
Maybe you can have a look at the notebook I shared previously. In that notebook, you can go through the DocFormerForMLM class and look at its forward method. I will briefly describe it here:
All shapes below are for the default configuration:
- self.embeddings encodes the spatial features of the bounding boxes (size -> (512, 768))
- self.resnet extracts the image features (size -> (512, 768))
- self.lang_emb extracts the language features from the words in the bounding boxes (size -> (512, 768))
- self.encoder calculates the attention and forward propagates it (size -> (512, 768))
Then, for a downstream task, linear layers are attached on top of the encoder output. Hope it helps.
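As a rough illustration of attaching a linear layer to the encoder output, here is a sketch under stated assumptions rather than the repository's code. It assumes the batched output has shape (batch_size, 512, 768), reading 512 as the number of token positions and 768 as the hidden size; the random tensor stands in for the real encoder output, and num_labels, token_head, doc_head, and the mean pooling are all hypothetical choices.

```python
import torch
from torch import nn

batch_size, seq_len, hidden_size = 1, 512, 768
num_labels = 5  # hypothetical number of classes for a downstream task

# Stand-in for the encoder output; a real run would use the model's output.
encoder_output = torch.randn(batch_size, seq_len, hidden_size)

# Token-level head: one prediction per token position.
token_head = nn.Linear(hidden_size, num_labels)
token_logits = token_head(encoder_output)

# Document-level head: pool over token positions first (mean pooling is an
# assumption here), then predict once per document.
doc_head = nn.Linear(hidden_size, num_labels)
doc_logits = doc_head(encoder_output.mean(dim=1))

print(tuple(token_logits.shape))  # (1, 512, 5)
print(tuple(doc_logits.shape))    # (1, 5)
```

Under these assumptions, each of the 512 rows is a 768-dimensional representation of one token position, so a token-level head suits tasks such as token classification, while a pooled head suits whole-document classification.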
Ok, understood, thank you very much for your help. I'll close the issue since the example runs!