Code Monkey home page Code Monkey logo

Comments (15)

tfrederiksen avatar tfrederiksen commented on August 25, 2024 2

Sorry, I didn't find time to study your proposal today. But from what I've seen in your exchange it looks good to me too.

from sisl.

pfebrer avatar pfebrer commented on August 25, 2024 1

Something like this (of course here the behavior is oversimplified):

class Executor:
     def __init__(self, func, obj, indices):
         self.func = func
         self.obj = obj
         self.indices = indices

     def __call__(self, *args, **kwargs):
         for i in self.indices:
             self.func(self.obj, *args, **kwargs)

class BoundReader:
     def __init__(self, func, obj):
         self.func = func
         self.obj = obj
   
    def __call__(self, *args, **kwargs):
         return self.func(self.obj, *args, **kwargs)

    def __getitem__(self, key):
        return Executor(self.func, self.obj, key)

class ReaderFunction:
     def __init__(self, func):
         self.func = func

     def __get__(self, owner, instance):
         return BoundReader(instance)

class Sile:
     
    @ReaderFunction
    def read_geometry(...):
        ...
     
     
 

from sisl.

zerothi avatar zerothi commented on August 25, 2024 1

Agreed. But the interface looks a bit complicated to me. Lets first start with the other things, and we can continue down the rabbit hole later ;)

from sisl.

zerothi avatar zerothi commented on August 25, 2024
          I have a somewhat working solution, see in the `sile-slicer` branch, a test is here:
open("/tmp/test.file", 'w').write("""1

C   0.00000000  0.00000000  0.00000000
2

C   0.00000000  0.00000000  0.00000000
C   1.000000  0.00000000  0.00000000
3

C   0.00000000  0.00000000  0.00000000
C   1.000000  0.00000000  0.00000000
C   2.00000  0.00000000  0.00000000
""")

from sisl.io import xyzSile

sile = xyzSile("/tmp/test.file")
print(sile.read_geometry(), help(sile.read_geometry))
print(sile.read_geometry[0](), help(sile.read_geometry[0]))
print(sile.read_geometry[1]())
print(sile.read_geometry[1:]())

It isn't complete, but just to get an idea?
I quickly ran into problems when decoration the function, since that happens at class-creation, the function won't bind self and thus they won't be passed, hence the __get__ work-around... A bit ugly IMHO.
Other suggestions on the implementation side?

Originally posted by @zerothi in #578 (comment)

from sisl.

zerothi avatar zerothi commented on August 25, 2024

@tfrederiksen @pfebrer lets continue the discussion here.

Please see the updated branch 584-sile-slicer, it contains more functionality and patches the documentation.

@pfebrer did the changes correspond to your remarks?

from sisl.

pfebrer avatar pfebrer commented on August 25, 2024

It is a bit strange that SileSlicer.__getitem__ stores the key and returns itself, no? This allows things like:

sile = sisl.get_sile("file.xyz")

sile.read_geometry[2][:3][2:4:5][-1] # Will only care about the last selector.

Also, coupled with the fact that once you create the bound slicer you set it as an attribute, this could result in very weird behavior:

sile = sisl.get_sile("file.xyz")

first = sile.read_geometry[0]
second = sile.read_geometry[1]

first() # will read the second geometry

Even more weirdly:

sile = sisl.get_sile("file.xyz")

reader = sile.read_geometry[:10]

sile.read_geometry() # Will read the first ten geometries

Probably sile.read_geometry[3:8] should return a new object :)

from sisl.

zerothi avatar zerothi commented on August 25, 2024

Yes, but this was what I thought you suggested?

I also thought it was a bit strange, but... :)
I previously always returned a new object, but then it can't be bound? What did you mean?

from sisl.

zerothi avatar zerothi commented on August 25, 2024

Ok, it is there now! :) I think this should be fully functional! :)

Perhaps we should discuss how it should work. :)

  • slices will take care of everything,
  • should we allow double slicing? Currently this is not allowed
  • do we need performance stuff that omits certain aspects? I.e. quick-skips for non-returned indices?
  • does this cover all aspects?
  • should we have a .m method? And what should it do?
  • should we allow a dict in the slicing, e.g. read_geometry[<slice>, {...}] for additional functionality?

from sisl.

pfebrer avatar pfebrer commented on August 25, 2024

I was thinking that in the future you could have something like sanitize_selection and everything that is passed into __getitem__ should go through this sanitzer. Just like it is done for the atoms. For now I think it's ok if we just support slices, single integers or arrays of integers, mimicking the behavior of @tfrederiksen's branch. Then we can always add new functionality if it makes sense.

should we allow double slicing? Currently this is not allowed

I believe for now it's better to keep it simple and don't allow double slicing.

do we need performance stuff that omits certain aspects? I.e. quick-skips for non-returned indices?

If there are no negative indices (or you know the length of the data), I think it would be nice to stop reading when you reach the last requested item.

should we have a .m method? And what should it do?

I don't know what's the best name, but I think there should be a method, for if you need to pass more arguments than just the selection of indices. I can't tell right now what would I use this method for, but it's probably a good idea to have it. Maybe we can include it later when it is more clear what would it be useful for.

should we allow a dict in the slicing, e.g. read_geometry[, {...}] for additional functionality?

If the dict contains arguments for the function, I think it's better not to do it because it feels very hacky. Instead, one should go through the method discussed above, since it feels much more natural to pass arguments to a function.

from sisl.

pfebrer avatar pfebrer commented on August 25, 2024

Oh by the way, I think it would be cool if the SileSlicer would have an __iter__ method that it's just like the call but yields the items read one by one. Although this sounds simple only if the indices are sorted from small to big.

from sisl.

zerothi avatar zerothi commented on August 25, 2024

I was thinking that in the future you could have something like sanitize_selection and everything that is passed into __getitem__ should go through this sanitzer. Just like it is done for the atoms. For now I think it's ok if we just support slices, single integers or arrays of integers, mimicking the behavior of @tfrederiksen's branch. Then we can always add new functionality if it makes sense.

Agreed!

should we allow double slicing? Currently this is not allowed

I believe for now it's better to keep it simple and don't allow double slicing.

Agreed, I really also think this is a corner case, but it could be useful for passing around objects that reads geometries

do we need performance stuff that omits certain aspects? I.e. quick-skips for non-returned indices?

If there are no negative indices (or you know the length of the data), I think it would be nice to stop reading when you reach the last requested item.

Agreed, lets try this as well.

should we have a .m method? And what should it do?

I don't know what's the best name, but I think there should be a method, for if you need to pass more arguments than just the selection of indices. I can't tell right now what would I use this method for, but it's probably a good idea to have it. Maybe we can include it later when it is more clear what would it be useful for.

Ok, we'll wait

should we allow a dict in the slicing, e.g. read_geometry[, {...}] for additional functionality?

If the dict contains arguments for the function, I think it's better not to do it because it feels very hacky. Instead, one should go through the method discussed above, since it feels much more natural to pass arguments to a function.

Agreed.

Oh by the way, I think it would be cool if the SileSlicer would have an __iter__ method that it's just like the call but yields the items read one by one. Although this sounds simple only if the indices are sorted from small to big.

Hmm. That might be a bit complicated. I think the interface would result in something like this:

for read in read_geometry[:10]:
    geom = read()

or similar? would this be expected?

from sisl.

pfebrer avatar pfebrer commented on August 25, 2024

For the iterator, I was thinking that it could already call the reader and yield the output. Although I noticed the problem that then you can't pass extra arguments. Is that why you suggested the interface? Perhaps the method .m could be useful for that.

Although now that you wrote it like that, I kind of like the functionality too. Hmm now I don' know what do I prefer 😅

from sisl.

zerothi avatar zerothi commented on August 25, 2024

Yeah, my problem was the arguments, hence that solution. Well, to my taste I don't particularly see the big need, doing:

for geom in read_geometry[:10]():
...

looks fine to me, albeit there may be some initial overhead.

from sisl.

pfebrer avatar pfebrer commented on August 25, 2024

Yes, but that is not the same, because it will read all the geometries. For example you can't break to stop the reading early. And you need to have enough memory to store all the geometries.

Another application of taking geometries on the fly could be to generate animations of trajectories.

from sisl.

zerothi avatar zerothi commented on August 25, 2024

@tfrederiksen would you be ok with abandoning #578 and continue down this path outlined here?

I do believe that this might be more versatile in the long run, and a bit simpler for end-users, no weird arguments, just slices.
If that's agreed, I will work on moving all sile_read_multiple to this approach.

from sisl.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.