I have found https://ar

Neural Abstract Reasoner: claims that they have achieved 80% accuracy on this dataset (CLOSED)

fchollet commented on September 17, 2024

Neural Abstract Reasoner: claims that they have achieved 80% accuracy on this dataset

Comments (5)

jmmcd commented on September 17, 2024

"As this work is still in progress, these are preliminary results evaluated on grids up to 10×10."

I guess this 80% is only on the grids of small sizes, which are disproportionately the "easy" tasks involving mere rotation/mirroring.
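One way to check how much the 10×10 restriction narrows the benchmark is to count the tasks whose grids all fit within that limit. The sketch below assumes tasks are loaded as dicts in the ARC JSON layout (`"train"` and `"test"` lists of `{"input": ..., "output": ...}` grid pairs); the helper names `max_grid_side` and `fraction_within` and the toy tasks are hypothetical, made up for illustration, and are not taken from the paper.

```python
def max_grid_side(task):
    """Largest height or width over every grid in an ARC-style task dict."""
    sides = []
    for split in ("train", "test"):
        for pair in task.get(split, []):
            for key in ("input", "output"):
                grid = pair.get(key)
                if grid:  # test outputs may be absent for hidden tasks
                    sides.append(len(grid))
                    sides.append(max(len(row) for row in grid))
    return max(sides) if sides else 0


def fraction_within(tasks, limit=10):
    """Return (fraction, names) of tasks whose grids are all at most limit x limit."""
    if not tasks:
        return 0.0, []
    small = [name for name, task in tasks.items() if max_grid_side(task) <= limit]
    return len(small) / len(tasks), small


# Toy data (hypothetical): one 3x3 task and one task with a 12-wide output grid.
toy_tasks = {
    "small": {"train": [{"input": [[0] * 3] * 3, "output": [[1] * 3] * 3}],
              "test": [{"input": [[0] * 3] * 3}]},
    "large": {"train": [{"input": [[0] * 5] * 5, "output": [[1] * 12] * 4}],
              "test": []},
}
share, names = fraction_within(toy_tasks, limit=10)
print(share, names)
```

Running this over the real task files would show which tasks the preliminary evaluation could have covered, though it says nothing about whether those tasks are actually the easier ones.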

enceladus2000 commented on September 17, 2024

Has anyone tried implementing this paper? I can't find a working demonstration anywhere.

hassanshallal commented on September 17, 2024

Hi, I worked on it for a few months in 2020. I haven’t read that article yet, though. Hassan

Sebastian-0 commented on September 17, 2024

I think there are several strange things about their paper.

In a later presentation (2021) they say "NAR achieves 61.13% accuracy on the Abstraction and Reasoning Corpus" with no further explanation. Is that measured on the entire dataset (i.e. all grid sizes)? If not, why is the number different? Their poster uses the exact same graphs for motivation as the old article, yet reports different numbers as the result, see: https://eucys2021.usal.es/computing-03-2021/
As far as I understand it, they evaluate on the public test set, yet they compare it to the Kaggle competition, which ran on completely different, hidden tasks.
They claim to solve 78.8% of 100 hidden tasks but don't explain how they get .8 when the tests are binary.
There is no discussion of the impact of excluding all larger grids, which is especially relevant when they are comparing against the Kaggle competition.
There is no source code, no one (AFAIK) has reproduced the results, and there is no official benchmark against the hidden test set.

It's possible they have devised an approach that is better than the previous state-of-the-art, but at this point I find it hard to take their numbers at face value.
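The point about 78.8% on 100 binary tasks can be checked with a few lines of arithmetic: with pass/fail scoring over N items, the only reachable percentages are k/N × 100 for integer k. The sketch below lists which counts k would round to a reported percentage. The function name `matching_counts` is hypothetical, and the 500-outcome case is only an illustrative guess (for example, several runs or several test outputs per task); nothing in the paper confirms how the figure was computed.

```python
from fractions import Fraction


def matching_counts(percent, n_items, decimals=1):
    """Counts k in 0..n_items where k/n_items*100, rounded to `decimals`, equals `percent`.

    `percent` is passed as a string (e.g. "78.8") so it is parsed exactly.
    """
    target = Fraction(percent)
    return [
        k
        for k in range(n_items + 1)
        if round(Fraction(100 * k, n_items), decimals) == target
    ]


# 100 binary outcomes can only give whole percentages, so nothing matches 78.8.
print(matching_counts("78.8", 100))  # → []

# Assumption for illustration only: 500 binary outcomes in total.
print(matching_counts("78.8", 500))  # → [394]
```

So a figure like 78.8% needs a denominator other than 100 binary outcomes, which is exactly the detail the paper would have to state for the number to be interpretable.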

hassanshallal commented on September 17, 2024

Thank you for sharing your thoughts. I share your perspective on some of these points and also noticed that the paper used data augmentation. I am not convinced that using deep learning in any shape or form can tackle escalating levels of generalization, from local (robustness) to broad (flexibility) and finally to extreme generalization.
