pydelta's Issues

Use pandas' _constructor for metadata bookkeeping

Since (I guess?) 0.16, pandas has supported a _constructor property on DataFrame and Series subclasses, which pandas uses whenever it needs to create a new instance of one of its data types. We could use this to keep track of our metadata more easily and robustly than with the current manual implementation, where, e.g., corpus / scalar always creates a plain DataFrame and we have to copy the metadata over manually. The idea would be to implement something like:

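import pandas as pd

class Corpus(pd.DataFrame):

    # attribute names pandas should carry over to derived objects
    _metadata = ['metadata']

    @property
    def _constructor(self):
        # pandas calls this whenever it needs to create a new object of our type
        def constructor(*args, **kwargs):
            result = Corpus(*args, **kwargs)
            result.metadata.update_from(self)  # copy our metadata to the new object
            return result
        return constructor

    ...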

It would require getting rid of much of the conversion and copying code, though.
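For reference, here is a minimal, self-contained sketch of the mechanism on a toy class. It is not pydelta's actual Corpus: the plain dict used for metadata and the metadata keyword argument are assumptions made for illustration. It relies on pandas' documented subclassing hooks (_constructor and _metadata); which operations propagate _metadata attributes depends on the pandas version, so treat the behaviour as something to verify rather than a guarantee.

```python
import pandas as pd


class Corpus(pd.DataFrame):
    """Toy DataFrame subclass that carries a `metadata` dict (illustrative only)."""

    # Attribute names listed here are copied by pandas in __finalize__.
    _metadata = ["metadata"]

    def __init__(self, *args, metadata=None, **kwargs):
        super().__init__(*args, **kwargs)
        # Hypothetical: a plain dict stands in for pydelta's Metadata object.
        self.metadata = dict(metadata or {})

    @property
    def _constructor(self):
        # Tell pandas to build Corpus objects instead of plain DataFrames.
        return Corpus


corpus = Corpus({"a": [1.0, 2.0], "b": [3.0, 4.0]}, metadata={"language": "de"})
scaled = corpus / 2  # pandas creates the result via _constructor

# On recent pandas versions the result is a Corpus again, and the metadata
# attribute is carried over by reference (the dict is shared, not copied).
print(type(scaled).__name__, scaled.metadata)
```

Returning the class itself rather than a closure keeps the sketch short; the closure variant from the issue would additionally allow copying or merging metadata at construction time.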

Installation under Python 3.10 is not working

I tried installing with pip directly from the repository:

pip install git+https://github.com/cophi-wue/pydelta@next

and got this error:

ERROR: Package 'delta' requires a different Python: 3.10.6 not in '<3.10,>=3.7.1'
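To illustrate why 3.10.6 is rejected, here is a rough sketch of the check implied by the constraint '<3.10,>=3.7.1'. It is a simplification using tuple comparison, not pip's actual implementation, and the helper name satisfies is made up for this example.

```python
def satisfies(version: str, lower=(3, 7, 1), upper=(3, 10)) -> bool:
    """Return True if `version` lies in the range >=3.7.1,<3.10 (simplified)."""
    parts = tuple(int(p) for p in version.split("."))
    # Versions compare component by component, so 3.10 is newer than 3.9,
    # and every 3.10.x release falls outside the "<3.10" bound.
    return lower <= parts < upper


for candidate in ("3.7.0", "3.9.13", "3.10.6"):
    print(candidate, satisfies(candidate))
```

So until the upper bound in the package metadata is raised, installing into a Python 3.7.1 to 3.9 environment is the only way to satisfy the constraint.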

Switch K-Medoids to scikit-learn-extras

The KMedoids implementation (which is planned to be integrated for 0.20, see sklearn issue 7694) currently doesn't compile against the pydelta/next requirements list. Since it is pulled in via requirements.txt, we'll probably need to either remove it from there or update the dependency.

.evaluate() fails with current pandas

    distances.evaluate()
  File "/tmp/pydelta/delta/deltas.py", line 787, in evaluate
    result["Simple Score"] = self.simple_score()
  File "/tmp/pydelta/delta/deltas.py", line 772, in simple_score
    in_group_df, out_group_df = self.z_scores().partition()
  File "/tmp/pydelta/delta/deltas.py", line 731, in z_scores
    return DistanceMatrix((self - deltas.mean()) / deltas.std(),
  File "/tmp/pydelta/.venv/lib/python3.8/site-packages/pandas/core/ops/__init__.py", line 660, in f
    new_data = dispatch_to_series(self, other, op)
  File "/tmp/pydelta/.venv/lib/python3.8/site-packages/pandas/core/ops/__init__.py", line 266, in dispatch_to_series
    return type(left)(bm)
  File "/tmp/pydelta/delta/deltas.py", line 630, in __init__
    self.metadata = Metadata(metadata, **kwargs)
  File "/tmp/pydelta/delta/util.py", line 31, in __init__
    self.update(*args, **kwargs)
  File "/tmp/pydelta/delta/util.py", line 84, in update
    self._update_from(arg)
  File "/tmp/pydelta/delta/util.py", line 53, in _update_from
    self.__dict__.update(d)
TypeError: 'NoneType' object is not iterable

It works with pandas < 0.25.
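Judging from the last frames of the traceback, pandas re-creates the object via type(left)(bm), i.e. without a metadata argument, so Metadata(None) ends up calling dict.update(None). The following sketch reproduces that final error and shows one possible guard. The Metadata class here is a simplified stand-in, not the actual class from delta/util.py, and skipping None is only one way to handle it.

```python
# Reproduce the final error from the traceback in isolation.
try:
    {}.update(None)
except TypeError as err:
    message = str(err)
    print("TypeError:", message)


class Metadata:
    """Simplified stand-in for a metadata container (hypothetical)."""

    def __init__(self, *args, **kwargs):
        self.update(*args, **kwargs)

    def update(self, *args, **kwargs):
        for arg in args:
            if arg is None:
                # pandas may construct the subclass without any metadata
                continue
            self.__dict__.update(arg)
        self.__dict__.update(kwargs)


# With the guard, constructing from None yields empty metadata instead of failing.
print(vars(Metadata(None)))
print(vars(Metadata({"corpus": "demo"}, words=100)))
```

A guard like this only avoids the crash; the result would still have lost its metadata, which is what the _constructor approach is meant to address.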
