Comments (6)
Interesting. I would have expected positional encoding (possibly a different encoding than a simple linear mesh) to help.
This suggests a few possible explanations (here "1x1 filter/conv" always refers to the conv in the frequency domain inside the Spectral Transform block):
- The 1x1 filter doesn't take frequency into account, i.e. it is frequency agnostic (this also applies to the original FFC paper).
- Some sort of spatial information is latently encoded in the feature maps. In this case the 1x1 convolution does take frequency into account, but the positional encoding is redundant.
- The frequency/phase domain isn't really important; each pixel in the spectral image is just interpreted as a different hash of the entire input feature map.
- The 1x1 filter doesn't actually do anything. Perhaps the real power comes from applying BN and ReLU to the spectral image before the iFFT. Or perhaps it's inappropriate to apply BN/ReLU there, since it limits what the post-iFFT image can look like.
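To make the "frequency agnostic" hypothesis concrete, here is a minimal NumPy sketch. It is not the LaMa or FFC code: the helper names (`spectral_features`, `add_linear_mesh`, `conv1x1`) are made up for illustration, and the mesh is assumed to be two extra channels running linearly from 0 to 1. It checks whether shuffling the frequency bins commutes with the 1x1 conv, with and without the mesh channels.

```python
import numpy as np

def spectral_features(x):
    """Real FFT of a (C, H, W) map, real and imaginary parts stacked as channels."""
    spec = np.fft.rfft2(x, norm="ortho")  # (C, H, W // 2 + 1), complex
    return np.concatenate([spec.real, spec.imag], axis=0)

def add_linear_mesh(feats):
    """Append two channels holding each bin's (row, column) position in [0, 1]."""
    _, h, w = feats.shape
    rows, cols = np.meshgrid(np.linspace(0, 1, h), np.linspace(0, 1, w), indexing="ij")
    return np.concatenate([feats, rows[None], cols[None]], axis=0)

def conv1x1(weight, feats):
    """A 1x1 convolution: the same channel-mixing matrix applied at every bin."""
    return np.einsum("oc,chw->ohw", weight, feats)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))       # toy feature map, 3 channels
feats = spectral_features(x)             # shape (6, 8, 5)
perm = np.roll(np.arange(feats.shape[1]), 1)  # move every frequency row by one

# Without the mesh: conv(shuffled bins) equals shuffled conv(bins).
w_plain = rng.standard_normal((4, 6))
plain_agnostic = np.allclose(conv1x1(w_plain, feats[:, perm]),
                             conv1x1(w_plain, feats)[:, perm])

# With the mesh: the output now depends on where each bin sits.
w_mesh = rng.standard_normal((4, 8))
mesh_agnostic = np.allclose(conv1x1(w_mesh, add_linear_mesh(feats[:, perm])),
                            conv1x1(w_mesh, add_linear_mesh(feats))[:, perm])

print(plain_agnostic, mesh_agnostic)
```

Without the mesh the check passes, so the bare 1x1 conv cannot tell which frequency a bin belongs to. With the mesh it fails for generic weights, so the conv at least has access to position. Whether a trained network actually uses that information is a separate question that this sketch does not answer.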
from lama.
Wow! Great that you've noticed that :)
We experimented with positional encoding in the spectral domain only a little. It did not help inpainting on our benchmarks, but it might work in other cases. We did not explore the feature thoroughly enough to say anything for sure.
I'll be happy to hear back if this feature helps :)
Hi @BrianPugh, have you done any further research on that?
I have not had the chance or the resources to run experiments with these changes.
Great idea! I agree with you. Maybe I can do some experiments.
Hi, have you done an experiment? What was the result?