
c5's People

Contributors

mahmoudnafifi, manipopopo


c5's Issues

what dataset is the pre-trained model trained on?

What dataset is the pre-trained model trained on? If I want to run a test, would datasets like Cheng and Gehler be good test samples? I am just concerned that these two sets might already be in the training set for the pre-trained models.

Some questions about the Training and Testing details

Hi Dr. Afifi, thanks for your great work!
I have read many of your inspiring papers on color constancy. In my view, most of them are elegant and easy to follow, except 'C5', which has really confused me.

I am not sure whether I am misunderstanding the related learning paradigms, such as transfer learning or transductive learning, but I just cannot follow the core training and testing procedure of your method.

I have three questions below:

  1. From your abstract, it seems that the additional unlabeled images are only used at test time:

C5 approaches this problem through the lens of transductive inference: additional unlabeled images are provided as input to the model at test time

However, you also highlight that even during training, the model is trained on both labeled and unlabeled images:

Our system is trained using labeled (and unlabeled) images from multiple cameras, but at test time our model is able to look at a set of the (unlabeled) test set images from a new camera.

  2. I am not sure that I understand the meaning of the query image and the additional images in your training and testing procedures.

For instance, suppose you train on sensor1's images and test on sensor2:

  • Training

The query image is a labeled image from sensor1, and the additional images are selected from sensor1 without labels. Am I right?

  • Testing

The query image is a labeled image from sensor2, and the additional images are selected from sensor2 without labels; effectively the same as in training except for the dataset. Am I right?

If so, then the paper's statement that no ground-truth labels are used would seem inaccurate:

In contrast, our technique requires no ground-truth labels for the unseen camera, and is essentially calibration-free for this new sensor.

If not, I am confused about how the model's parameters could be updated without labels during testing (see the sketch at the end of this issue).

  3. The leave-one-out evaluation approach

For the camera-specific setting, it is clear that leave-one-out means using n-1 images for training and the remaining image for testing, looping n times. However, I do not understand what the paper, which focuses on the cross-sensor setting, says here:

we adopt a leave-one-out cross-validation evaluation approach: for each dataset, we exclude all scenes and cameras used by the test set from our training images. For a fair comparison with FFCC [13], we trained FFCC using the same leave-one-out cross-validation evaluation approach.

Can you describe the details of the leave-one-out method here? For instance, how did you re-train FFCC using this approach?

The three questions above may be connected with each other.
I would be very grateful for your reply.
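For readers puzzling over the same point, here is a minimal sketch of how I read the test-time procedure: the query image from the unseen camera is batched together with m-1 unlabeled images from that same camera, and the network produces an estimate in a single forward pass with its weights frozen. The class `model`, the function name, and the data handling below are placeholders of mine, not the repository's actual API.

```python
# Hedged sketch of C5-style transductive inference at test time (placeholder
# names; this is NOT the repository's actual interface).
import random
import torch

def infer_illuminant(model, query_img, unlabeled_pool, m=7, device="cpu"):
    """Estimate the query image's illuminant using m-1 extra unlabeled images.

    query_img:      (3, H, W) tensor from the unseen test camera
    unlabeled_pool: list of (3, H, W) tensors from the SAME camera, no labels
    """
    model.eval()                                          # no fine-tuning, no calibration
    extras = random.sample(unlabeled_pool, m - 1)         # additional unlabeled images
    batch = torch.stack([query_img] + extras).to(device)  # (m, 3, H, W)
    with torch.no_grad():                                 # parameters are never updated
        illuminant = model(batch)                         # single forward pass over the set
    return illuminant                                     # e.g. a normalized RGB vector
```

As I read the paper, training draws the same kind of m-image set from a labeled training camera, and only the query image's ground-truth illuminant enters the loss; at test time no loss is computed at all, which is why no labels from the new sensor are needed.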

some question about the input image type

Hello, I am very interested in your C5 work, but I have some doubts about the format of the input image.
When I use my own dataset, should its data be 16-bit? Is it OK if I directly feed in an .hdr file?
Once the output is generated, what operations (e.g., CCM, AE, and so on) should be applied to produce an image that looks right to the human eye?
Many thanks
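As a rough, generic sketch of the post-processing usually applied to view a white-balanced linear raw image (nothing here is specific to this repository: the identity CCM, the 2.2 gamma, and the 16-bit assumption are placeholders to be replaced with your camera's actual values):

```python
# Generic visualization sketch: white-balance a linear 16-bit raw PNG with the
# estimated illuminant, then apply a placeholder CCM and gamma for display.
import cv2
import numpy as np

def visualize(path_16bit_png, illuminant_rgb, ccm=np.eye(3), black_level=0.0):
    raw = cv2.imread(path_16bit_png, cv2.IMREAD_UNCHANGED).astype(np.float32)
    raw = cv2.cvtColor(raw, cv2.COLOR_BGR2RGB)
    raw = np.clip(raw - black_level, 0, None) / (65535.0 - black_level)  # -> [0, 1] linear

    wb = raw / np.asarray(illuminant_rgb, dtype=np.float32)  # per-channel white balance
    wb = np.clip(wb / wb.max(), 0.0, 1.0)

    srgb_linear = np.clip(wb @ ccm.T, 0.0, 1.0)  # camera RGB -> sRGB (placeholder CCM)
    srgb = np.power(srgb_linear, 1.0 / 2.2)      # simple gamma instead of the exact sRGB curve
    return (srgb * 255.0).astype(np.uint8)
```

As far as I understand, color-constancy models of this kind expect linear, black-level-subtracted raw-RGB input, so tone-mapped or display-referred data would need to be linearized first.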

Testing results

Hi Mahmoud,
For my work, one of my tasks is to reproduce the results reported in your paper. To do so, I am testing the provided pre-trained model on the INTEL-TAU dataset (7022 images) with m=7. Since the images are already black-level subtracted, as mentioned on the dataset website, I am directly passing resized PNG images (384×256) to the model along with the corresponding .json files (illuminant information). I am also using cross-validation, but without the G multiplier for testing. My results are: Mean: 2.61, Median: 1.77, Best25: 0.57, Worst25: 1.44, Worst05: 2.16, Tri: 1.95, and Max: 28.39.

There is only a slight variation from the paper's results, except for Worst25, which differs a lot. As I understand it, one reason could be the random sample selection in cross-validation. Is that so, or is there some other important step I am missing?

One more thing: during testing I did not mask out the color checker present in the scenes, which you mention doing in the paper. Could you please provide details on how you did that? I assume the coordinates of the color checker in each scene would need to be known for the masking.
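For anyone comparing numbers, the summary statistics above are normally computed from the recovery angular error using the standard definitions below; a mismatch in how Best25/Worst25 are defined (or in which images end up in each fold) can easily explain a gap in the tail statistics. This is a generic sketch, not the repository's evaluation code.

```python
# Standard recovery angular error and the usual color-constancy summary statistics.
import numpy as np

def angular_error_deg(est, gt):
    """Angle in degrees between estimated and ground-truth illuminant vectors."""
    est = est / np.linalg.norm(est, axis=-1, keepdims=True)
    gt = gt / np.linalg.norm(gt, axis=-1, keepdims=True)
    cos = np.clip(np.sum(est * gt, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

def summarize(errors):
    e = np.sort(np.asarray(errors, dtype=np.float64))
    q1, q2, q3 = np.percentile(e, [25, 50, 75])
    k = max(1, len(e) // 4)
    return {
        "mean": e.mean(),
        "median": q2,
        "trimean": (q1 + 2 * q2 + q3) / 4.0,  # Tukey's trimean ("Tri")
        "best25": e[:k].mean(),               # mean of the lowest 25% of errors
        "worst25": e[-k:].mean(),             # mean of the highest 25% of errors
        "max": e.max(),
    }
```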

Many detailed questions

Thanks for your great work! It has indeed sparked a lot of inspiration for me. However, there are several aspects I would like to discuss further.

The paper mentioned: "To allow the network to reason about the set of additional input images in a way that is insensitive to their ordering, we adopt the permutation invariant pooling approach of Aittala et al."

1. Could you elaborate on why insensitivity to ordering is crucial? Specifically, I am curious whether a sufficiently large training dataset would inherently cover all potential orderings. (A small pooling sketch appears at the end of this issue.)

Regarding the number of additional unlabeled images (m), it appears that they were used in both the training and testing stages. From the ablation study, it seems that various values of m were only tried on the test camera, as illustrated in Table 4. I have a question about this:

2. During training, did you experiment with varying values of m, or was a fixed number (for example, 8) used throughout?

When m equals 1, I understand that this means only the query image is used during testing. If so, my question is:

3. Could you clarify whether m=1 simply signifies the zero-shot condition, i.e., plain inference, or whether it means the single query image is used for self-calibration, followed by fixing the parameters and then running inference?

4. From the results shown in Table 4, it does not seem that the results keep improving as m increases (i.e., error(m=13) > error(m=7)). Could you provide some insight into this?

5. Have you considered using additional labeled images for fine-tuning? If so, would this lead to better results than the current method?

Thank you for taking the time to answer these questions. Your responses will be greatly beneficial to my understanding.
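On question 1, the practical reason order-insensitivity is built in rather than learned is that the m additional images form an unordered set: with m! possible orderings, asking the network to learn the symmetry from data would waste capacity and still not guarantee it. Below is a minimal sketch in the spirit of the max-pool-and-concatenate set pooling of Aittala et al.; shapes and names are illustrative only, not the repository's code.

```python
# Permutation-invariant set pooling sketch: each image's features are
# concatenated with a max-pool over the whole set, so shuffling the m inputs
# cannot change the result.
import torch

def set_pool(features):                               # features: (m, C, H, W)
    pooled, _ = features.max(dim=0, keepdim=True)     # (1, C, H, W), order-free
    pooled = pooled.expand_as(features)               # broadcast back to every image
    return torch.cat([features, pooled], dim=1)       # (m, 2C, H, W)

# Quick check: permuting the set leaves the (permuted) output unchanged.
x = torch.randn(7, 8, 16, 16)
perm = torch.randperm(7)
assert torch.allclose(set_pool(x)[perm], set_pool(x[perm]))
```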

about preprocessing of the input image

Hi Mahmoud,

Thanks for sharing the great work!
May I know whether any preprocessing is applied to the images before they are fed to the network? Here is what I observed: when the input comes from FFCC's dataset (https://github.com/google/ffcc/tree/master/data), the output looks good, but with other data such as NUS (http://cvil.eecs.yorku.ca/projects/public_html/illuminant/illuminant.html), the output is bad. Is this because I missed some preprocessing steps?

Thanks
Simon
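A likely culprit, offered only as a guess: the FFCC release ships already-prepared images, while the NUS PNGs still contain the sensor's black level and use the camera's native bit depth, so they need black-level subtraction and normalization before they resemble the expected input. A generic sketch follows; the default saturation value and resize target are placeholders, and the per-camera black level comes from the NUS dataset's own metadata.

```python
# Generic raw preprocessing sketch for NUS-style PNGs: subtract the camera's
# black level, normalize by the saturation point, and resize. Values here are
# placeholders, not this repository's code.
import cv2
import numpy as np

def preprocess_raw_png(path, black_level, saturation=2**14 - 1, size=(384, 256)):
    img = cv2.imread(path, cv2.IMREAD_UNCHANGED).astype(np.float32)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = np.clip(img - black_level, 0, None) / float(saturation - black_level)
    img = np.clip(img, 0.0, 1.0)                 # clip (near-)saturated pixels
    return cv2.resize(img, size, interpolation=cv2.INTER_AREA)
```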

Question about your data augmentation method and CIE XYZ color space

Hi, @mahmoudnafifi , I have a question about your data augmentation method.

I think I am a little confused about the color space transform (CST) step in a general ISP, which converts a white-balanced raw image into the CIE XYZ color space.
(I am referencing the 2019 ICCV tutorial by your supervisor, Professor Michael Brown.)

As far as I know, the CST only changes the axes (or basis) used to represent a color; it does not change the underlying color itself.
So, as I understand it, the CIE XYZ images (with WB applied) of the same scene from two different devices are different, because they represent the colors observed by different sensors, which differ from each other, expressed in the canonical CIE XYZ space.

However, the data augmentation method presented in the paper seems to contradict what I said above:
since images in CIE XYZ space are treated as device-independent, raw-space augmentation for each device is possible by converting to CIE XYZ and then applying the inverse transformation.

I would appreciate it if you could let me know which of the two views is correct and where I am mistaken.
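For what it's worth, the two views are closer than they appear: a 3x3 CST maps white-balanced camera RGB only approximately into CIE XYZ, so the XYZ image is treated as approximately device-independent rather than being exactly identical across sensors. As I read the augmentation idea, it leans on that approximation to move an image from one camera's raw space to another's. A toy sketch with made-up matrices, not the paper's implementation:

```python
# Sketch of the mapping the augmentation relies on:
# raw_A --CST_A--> (approx.) device-independent XYZ --inverse CST_B--> raw-like B.
# A 3x3 CST cannot perfectly invert each sensor's spectral sensitivities, so the
# result is an approximation of camera B's raw response, which is the point of
# using it as augmentation rather than as ground truth.
import numpy as np

def raw_to_raw(raw_a, cst_a, cst_b):
    """raw_a: (H, W, 3) white-balanced raw from camera A;
    cst_a, cst_b: 3x3 camera-to-XYZ matrices for cameras A and B (placeholders)."""
    xyz = raw_a @ cst_a.T                    # camera A raw -> CIE XYZ
    raw_b = xyz @ np.linalg.inv(cst_b).T     # CIE XYZ -> camera B raw space
    return np.clip(raw_b, 0.0, None)
```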

json files

Is there a sample json file so I can generate json files for new data?

Thanks,
Simon
