Comments (15)

behrica commented on June 24, 2024

Root cause is this:

(def newsgroups (sklearn.datasets/fetch_20newsgroups :subset "all" :remove (builtins/tuple [ "headers" "footers" "quotes"])))
(.hashCode newsgroups)

failing with:

 Unhandled java.lang.Exception
   TypeError: unhashable type: 'Bunch'

from libpython-clj.

behrica avatar behrica commented on June 24, 2024

There seems to be a consensus in the Java community that .hashCode implementations should never throw exceptions.

behrica commented on June 24, 2024

But this seems to be a very special case in Python, where a Python type is not hashable.
Bunch extends dict, and dict is not hashable in Python.
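
A minimal Python sketch of why this happens, assuming only what is stated above (that Bunch is a dict subclass). The `Bunch` class and the `hash_error` helper below are hypothetical stand-ins for illustration, not scikit-learn's actual code:

```python
# Hypothetical stand-in for sklearn's Bunch: a plain dict subclass.
class Bunch(dict):
    pass


def hash_error(obj):
    """Return the TypeError message raised by hash(obj), or None if hashing works."""
    try:
        hash(obj)
    except TypeError as exc:
        return str(exc)
    return None


# dict sets __hash__ to None, and subclasses inherit that.
print(dict.__hash__)            # None
print(Bunch.__hash__)           # None
print(hash_error(Bunch(a=1)))   # the "unhashable type" message seen in the report
print(hash_error("text"))       # None, since str is hashable
```

A subclass could opt back in by defining its own `__hash__`, but a plain dict subclass stays unhashable.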

jjtolton commented on June 24, 2024

Interesting. I can tell you that the user interface philosophy so far has been:

  1. Default to Clojure idioms, unless adopting them would prevent certain Python behavior. For example, automatically casting a Python list to a vector would rule out calling .append()-style methods on the Python list.
  2. Allow opt-in Python idioms where appropriate. For example, :bind-ns lets a user bind a Python module to a Clojure namespace symbol.

This is the first time I'm aware of that there has been a conflict with a Java idiom. For instance, hash([]) throwing an error is expected Python behavior. I suppose the acceptable solution would be: "tools that expect Java objects should get objects that behave like Java objects, while Python code that expects Python objects should get objects that behave like Python objects." I'm sure there's a more elegant way to phrase that, and I can already see the conceptual difficulty of figuring out how to approach the problem.

The simple approach would be to patch the hashing behavior for Java, so maybe that's best. I don't think many libpython-clj users would be overly upset that calling hash on an unhashable object would return nil rather than throw an error.

cnuernber commented on June 24, 2024

Hmm. Or embrace and extend the python dict type somehow to support clojure's algorithm for hashing.

behrica commented on June 24, 2024

I think returning 0 is also acceptable for a Java hashCode implementation (better than nil).

behrica commented on June 24, 2024

There are some reports that Clerk cannot render the "newsgroup" objects; not sure if it is for the same reason.
https://clojurians.slack.com/archives/CLR5FD4ET/p1701986985003569

behrica commented on June 24, 2024

Maybe the code here:

(throw (Exception. ^String error-str))))

should catch "TypeError: unhashable type" and, instead of throwing, return 0.

This is just to address the point that "not hashable" is expected in Python, while in Java it is not.

jjtolton commented on June 24, 2024

> Hmm. Or embrace and extend the python dict type somehow to support clojure's algorithm for hashing.

As a hacker, I would love this approach and I think it would fulfill the original intent of introspectable datastructures. Less hacker-inclined devs may fume a bit at the potential implications of sets of mutable dicts and dicts as keys, and other potential footguns. Not sure what the effort would be to make this behavior opt-in or toggle-able.

behrica commented on June 24, 2024

As a far simpler example, we can take this for discussion:

(hash
 (py/->py-dict {:a 1}))

This should NOT fail in my view, but it does:


1. Unhandled java.lang.Exception
  TypeError: unhashable type: 'dict'


                  ffi.clj:  707  libpython-clj2.python.ffi/check-error-throw
                  ffi.clj:  705  libpython-clj2.python.ffi/check-error-throw
                 base.clj:  180  libpython-clj2.python.base/hash-code
                 base.clj:   -1  libpython-clj2.python.base/hash-code
        bridge_as_jvm.clj:  231  libpython-clj2.python.bridge-as-jvm/generic-python-as-map/reify
                Util.java:  173  clojure.lang.Util/hasheq
                 core.clj: 5197  clojure.core/hash
                 core.clj: 5190  clojure.core/hash
                     REPL:   45  testLibPy.testLibPy/ev
                     

behrica commented on June 24, 2024

In my view, in the same way we have a default behaviour for toString:

(str
 (py/->py-dict {:a 1}))

which should return "a string" for any libpython object, we should return "a number" for any libpython object when .hashCode is called on it.
(The same goes for equals().)

Which algorithm to use to calculate the hashcode is then a less important consideration.
(Returning 0 would already be better than an exception.)

jjtolton commented on June 24, 2024

Well this discussion has also inspired me to open a new issue, because analytically I think a very important tool that is currently missing is the equivalent of py->clj and clj->py, analogous to js->clj and clj->js. Then it would be rather straightforward to do the (.hashCode (py->clj dict)). The obvious issue of course is that there is not a 1:1 correspondence, and it may be only marginally more useful than casting to json.

The issue with str, @behrica, is that Python objects are free to implement (or not) their own str implementation, and there would be a lot of unhelpful and borderline random behavior using str as a hashing key.

behrica commented on June 24, 2024

I am not suggesting to use str as hashing key.
In my view, we should "catch" "TypeError: unhashable type",
here:

(throw (Exception. ^String error-str))))

and either:
return 0, as hashcode or
return id() of the object as hashcode

Both will comply with the hashcode/equals rules of Java, I believe:
https://www.baeldung.com/java-equals-hashcode-contracts
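
The two fallbacks can be sketched on the Python side as follows. This is only an illustration of the idea, not the libpython-clj code; the function name `jvm_style_hash` and its `fallback` parameter are hypothetical:

```python
def jvm_style_hash(obj, fallback="zero"):
    """Hash obj as Python does, but never raise for unhashable objects.

    fallback="zero" returns the constant 0; fallback="id" returns id(obj).
    Both option names are made up for this sketch.
    """
    try:
        return hash(obj)
    except TypeError:
        # Unhashable Python object (dict, list, set, ...).
        return 0 if fallback == "zero" else id(obj)


d1 = {"a": 1}
d2 = {"a": 1}

print(jvm_style_hash("text") == hash("text"))   # True
print(jvm_style_hash(d1))                       # 0
print(jvm_style_hash(d1, "id") == id(d1))       # True
# d1 == d2 by value, yet their id-based hashes differ:
print(d1 == d2, jvm_style_hash(d1, "id") == jvm_style_hash(d2, "id"))  # True False
```

The constant 0 satisfies the Java rule that equal objects have equal hash codes regardless of how equals is defined. The id() variant satisfies it only if equals is identity-based; whether that holds for the bridged objects is not established in this thread. Note also that this sketch catches every TypeError, which is broader than matching the "unhashable type" message specifically.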

behrica commented on June 24, 2024

Or check here whether the Python object is hashable:
the attribute "__hash__" can be checked for being nil.

(->
 (py/->python "")
 (py/get-attr "__hash__"))
;; => #object[tech.v3.datatype.ffi.Pointer 0x5406e4ba "{:address 0x00007EFD4368E8B0 }"]
;;
(->
 (py/->py-dict {:a 1})
 (py/get-attr "__hash__"))
;; => nil
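
For comparison, here is the same check written directly in Python. It is a sketch with hypothetical helper names (`declares_hash`, `really_hashable`), and it also shows one case where the attribute check alone is not enough:

```python
from collections.abc import Hashable


def declares_hash(obj):
    """True if the object's type has a non-None __hash__ (the attribute check above)."""
    return getattr(type(obj), "__hash__", None) is not None


def really_hashable(obj):
    """True only if hash(obj) actually succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(declares_hash(""), declares_hash({"a": 1}))      # True False
print(isinstance({"a": 1}, Hashable))                  # False

# A tuple declares __hash__, but hashing fails if it contains a list.
nested = ([1, 2],)
print(declares_hash(nested), really_hashable(nested))  # True False
```

So a nil check on "__hash__" would cover dict and Bunch, but a container such as a tuple holding a list would still raise when hashed.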
