mahmoudnafifi / c5
Reference code for the paper "Cross-Camera Convolutional Color Constancy" (ICCV 2021)
License: Apache License 2.0
What dataset is the pre-trained model trained on? If I want to run tests, would datasets like Cheng and Gehler make good test samples? I am just concerned about whether these two sets are already in the training set of the pre-trained models.
Hi Dr. Afifi, Thanks for your great work!
I have read many of your inspiring papers on color constancy. Personally, I find most of your papers elegant and easy to understand, except 'C5', which has really confused me.
I do not know whether I have misunderstood related concepts such as transfer learning or transductive learning; I just could not follow the core training and testing process of your method.
I have three questions below:
C5 approaches this problem through the lens of transductive inference: additional unlabeled images are provided as input to the model at test time
However, you also highlight that even during training, the model is trained on both labeled and unlabeled images:
Our system is trained using labeled (and unlabeled) images from multiple cameras, but at test time our model is able to look at a set of the (unlabeled) test set images from a new camera.
For instance, suppose you train on sensor1's images and test on sensor2:
During training, the query image is a labeled image from sensor1, and the additional images are selected from sensor1 without labels. Am I right?
During testing, the query image is a labeled image from sensor2, and the additional images are selected from sensor2 without labels; in other words, the same as the training process except for the dataset. Am I right?
If so, the paper's statement that no labels are used seems to be at odds with the truth:
In contrast, our technique requires no ground-truth labels for the unseen camera, and is essentially calibration-free for this new sensor.
If not, I am confused about how the model's parameters could be updated without labels during testing.
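Reading the quoted passage, the test-time input seems to be a small set: the query image plus additional unlabeled images from the same unseen camera. A minimal sketch of how such an input could be assembled (this is a hypothetical helper of my own, not code from the C5 repo):

```python
import numpy as np

def build_c5_test_input(query_img, unlabeled_pool, m=7, seed=0):
    """Sketch of the test-time input described in the paper: the query
    image plus m - 1 additional unlabeled images from the same unseen
    camera. Hypothetical helper, not part of the C5 repository.

    query_img: (H, W, 3) array; unlabeled_pool: list of (H, W, 3) arrays
    from the test camera, with no ground-truth illuminant labels attached.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(unlabeled_pool), size=m - 1, replace=False)
    extra = [unlabeled_pool[i] for i in idx]
    # Stack into an (m, H, W, 3) set; the network pools over this set
    # dimension, so no labels from the new camera are ever required.
    return np.stack([query_img] + extra, axis=0)
```

Note that only images, never labels, enter this set, which is how the "calibration-free" claim can hold at test time.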
For the camera-specific setting, it is clear that leave-one-out means you use n − 1 images for training and the remaining image for testing, looping n times. However, I do not understand what the paper, which focuses on the cross-sensor setting, says here:
we adopt a leave-one-out cross-validation evaluation approach: for each dataset, we exclude all scenes and cameras used by the test set from our training images. For a fair comparison with FFCC [13], we trained FFCC using the same leave-one-out cross-validation evaluation approach.
Can you describe the details of the leave-one-out method here? For instance, how did you re-train the FFCC method using it?
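My reading of the quoted protocol is that one camera (with all its scenes) is held out at a time, and training uses only the other cameras. A sketch of that split logic, under that assumption (hypothetical helper, not code from the C5 repo):

```python
def leave_one_out_splits(images_by_camera):
    """Cross-sensor leave-one-out as I understand it: hold out one camera
    at a time, train on images from all other cameras, and test on the
    held-out camera. Hypothetical helper, not code from the C5 repo.

    images_by_camera: dict mapping camera name -> list of image ids.
    """
    for test_cam, test_imgs in images_by_camera.items():
        train_imgs = [img
                      for cam, imgs in images_by_camera.items()
                      if cam != test_cam
                      for img in imgs]
        yield test_cam, train_imgs, test_imgs
```

Re-training FFCC with the same splits would then mean feeding it each `train_imgs` set and evaluating on the corresponding `test_imgs`.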
The three questions above may be interrelated.
I would be very grateful for your reply~
Hello, I am very interested in your C5 work, but I have some doubts about the format of the input image.
When I use my own dataset, should its data be 16-bit? Is it OK if I directly input a .hdr file?
Once the output is generated, what operations should be performed to make it match human perception, such as CCM, AE, and other operations?
Many thanks!
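On the 16-bit question: raw pipelines typically normalize integer sensor data to [0, 1] using the camera's black and white levels. A sketch of that step, where the default levels are only placeholders (the true values are camera-specific and should come from your sensor's metadata):

```python
import numpy as np

def normalize_raw(img, black_level=0, white_level=65535):
    """Normalize a 16-bit, black-level-subtracted raw image to [0, 1].
    The black/white levels here are placeholder assumptions; read the
    real values from your camera's metadata. Illustrative sketch only."""
    img = img.astype(np.float64)
    img = (img - black_level) / float(white_level - black_level)
    return np.clip(img, 0.0, 1.0)
```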
Hi Mahmoud,
For my work, one of my tasks is to reproduce the results reported in your paper. For this, I am testing the provided pre-trained model on the INTEL-TAU dataset (7022 images) with m=7. Since the images are already black-level subtracted, as mentioned on the dataset website, I am directly passing resized PNG images (384×256) to the model along with the corresponding .json files (illuminant information). I am also using cross-validation, but without the G multiplier for testing. My results are as follows: Mean: 2.61, Median: 1.77, Best25: 0.57, Worst25: 1.44, Worst05: 2.16, Tri: 1.95, and Max: 28.39.
There is slight variation in the results, except for Worst25, which varies a lot. As I understand it, one reason could be the random sample selection inherent in cross-validation. Is that so, or is there another important step I am missing?
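For reference when comparing such numbers, the statistics quoted above are the standard summaries of the per-image recovery angular error. A sketch of how they are commonly computed (my own illustrative code, not taken from the C5 repo):

```python
import numpy as np

def angular_error_deg(est, gt):
    """Recovery angular error in degrees between an estimated and a
    ground-truth illuminant RGB vector: the metric behind the
    Mean/Median/Best25/Worst25 figures quoted above."""
    est = np.asarray(est, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    cos = np.dot(est, gt) / (np.linalg.norm(est) * np.linalg.norm(gt))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def summarize(errors):
    """Common color-constancy summary statistics over per-image errors."""
    e = np.sort(np.asarray(errors, dtype=np.float64))
    n = len(e)
    q = max(n // 4, 1)
    return {
        "mean": e.mean(),
        "median": float(np.median(e)),
        "best25": e[:q].mean(),    # mean of the best 25% of errors
        "worst25": e[-q:].mean(),  # mean of the worst 25% of errors
        "max": e.max(),
    }
```

Since Worst25 averages only the largest errors, it is the statistic most sensitive to which images land in each cross-validation fold.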
Another thing to mention: during testing I did not mask out the color checker present in the scenes, which you mentioned in the paper. Could you please provide details on how you did that? I think the coordinates of the color checker in each scene would need to be known for the masking.
Thanks for your great work! It has indeed sparked a lot of inspiration for me. However, there are several aspects that I would like to discuss further:
The paper mentioned: "To allow the network to reason about the set of additional input images in a way that is insensitive to their ordering, we adopt the permutation invariant pooling approach of Aittala et al."
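My understanding of that pooling step, sketched with a max over the set dimension followed by concatenation (a simplification; the exact pooled statistic and wiring in the paper may differ, so treat this as an assumption of mine):

```python
import numpy as np

def set_pool(features):
    """Permutation-invariant pooling sketch: max-pool per-image feature
    vectors across the set dimension, then concatenate the shared pooled
    vector back onto each per-image feature. Reordering the input images
    leaves the pooled part unchanged. features: (m, d) array."""
    pooled = features.max(axis=0, keepdims=True)      # (1, d)
    pooled = np.broadcast_to(pooled, features.shape)  # (m, d)
    return np.concatenate([features, pooled], axis=1) # (m, 2d)
```

Because the max is taken over all m images, the network's view of the set is the same no matter how the additional images are ordered.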
Regarding the number of additional unlabeled images (m), it appears that they were used in both the training and testing stages. From the ablation study, it seems that various values of m were only tested on the test camera, as illustrated in Table 4. I have a question about this:
When m equals 1, I understand that this means only the query image is used during testing. If so, my question is:
Thank you for taking the time to answer these questions. Your responses will be greatly beneficial to my understanding.
Hi Mahmoud,
Thanks for sharing the great work!
May I know whether some preprocessing is applied to the images before they are input to the network? Here is what I observed: when the input is from FFCC's dataset (https://github.com/google/ffcc/tree/master/data), the output looks good, but with other data, such as NUS from http://cvil.eecs.yorku.ca/projects/public_html/illuminant/illuminant.html, the output is bad. Is this because I missed some preprocessing steps?
Thanks
Simon
Hi @mahmoudnafifi, I have a question about your data augmentation method.
I think I am a little confused about the color space transform (CST) step in a general ISP, which converts a white-balanced raw image into the CIE XYZ color space.
(I referenced the ICCV 2019 tutorial by your supervisor, Professor Michael Brown.)
As far as I know, a CST only changes the axes (or basis) used to represent a color; it does not change the underlying color itself.
So, as I understand it, the CIE XYZ images (with WB applied) of the same scene captured by two different devices differ, because they represent the (different) colors observed by the different sensors, expressed in the canonical CIE XYZ space.
However, according to the data augmentation method presented in the paper, the statement above is wrong.
According to your paper's method, since images in CIE XYZ space are device-independent, augmentation into the raw space of each device is possible via the transformation to, and inverse transformation from, CIE XYZ space.
I would appreciate it if you could let me know which of the two views is correct and where I am mistaken.
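The augmentation described above can be sketched as a raw-to-raw mapping that passes through CIE XYZ. This is my own illustration of the idea, with the 3×3 CST matrices as illustrative placeholders:

```python
import numpy as np

def raw_to_raw_via_xyz(img_a, cst_a, cst_b):
    """Map a white-balanced raw image from sensor A to sensor B through
    CIE XYZ: raw_A -> XYZ via A's CST, then XYZ -> raw_B via the inverse
    of B's CST. This is the sense in which XYZ acts as a
    device-independent bridge for augmentation. cst_a and cst_b are the
    cameras' 3x3 color space transform matrices (illustrative only)."""
    h, w, _ = img_a.shape
    pix = img_a.reshape(-1, 3)
    xyz = pix @ cst_a.T                   # sensor A -> CIE XYZ
    raw_b = xyz @ np.linalg.inv(cst_b).T  # CIE XYZ -> sensor B
    return raw_b.reshape(h, w, 3)
```

Note that when A and B are the same camera (`cst_a == cst_b`), the round trip is the identity, which is consistent with XYZ acting purely as a shared intermediate space.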
Is there a sample json file so I can generate json files for new data?
Thanks,
Simon