
cbfv's Issues

CompositionError: ( is an invalid formula!

Processing Input Data: 92%|█████████▏| 1263/1377 [00:00<00:00, 20629.89it/s]

CompositionError Traceback (most recent call last)
in <cell line: 1>()
1 for f in['jarvis','magpie','mat2vec','oliynyk','onehot','random_200']:
----> 2 X_train_unscaled,y_train,formulae_train,skipped_train = generate_features(df, elem_prop=f, drop_duplicates=False, extend_features=False, sum_feat=True)
3 #it has to be tested again with bg data which I have created by deleting the duplicates
4
5 SEED=42

4 frames
/usr/local/lib/python3.10/dist-packages/CBFV/composition.py in generate_features(df, elem_prop, drop_duplicates, extend_features, sum_feat, mini)
281 if 'x' in formula:
282 continue
--> 283 l1, l2 = _element_composition_L(formula)
284 formula_mat.append(l1)
285 count_mat.append(l2)

/usr/local/lib/python3.10/dist-packages/CBFV/composition.py in _element_composition_L(formula)
97
98 def _element_composition_L(formula):
---> 99 comp_frac = _element_composition(formula)
100 atoms = list(comp_frac.keys())
101 counts = list(comp_frac.values())

/usr/local/lib/python3.10/dist-packages/CBFV/composition.py in _element_composition(formula)
86
87 def _element_composition(formula):
---> 88 elmap = parse_formula(formula)
89 elamt = {}
90 natoms = 0

/usr/local/lib/python3.10/dist-packages/CBFV/composition.py in parse_formula(formula)
62 expanded_formula = formula.replace(m.group(), expanded_sym)
63 return parse_formula(expanded_formula)
---> 64 sym_dict = get_sym_dict(formula, 1)
65 return sym_dict
66

/usr/local/lib/python3.10/dist-packages/CBFV/composition.py in get_sym_dict(f, factor)
26 f = f.replace(m.group(), "", 1)
27 if f.strip():
---> 28 raise CompositionError(f'{f} is an invalid formula!')
29 return sym_dict
30

CompositionError: ( is an invalid formula!
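
The error message only shows the leftover text ("("), not the formula it came from, and the loop stops at the first failure. A minimal sketch for locating the offending rows is below. `find_failures` and `toy_parser` are hypothetical names used for illustration and are not part of CBFV; the toy parser stands in for whatever parser is actually used.

```python
def find_failures(formulas, parser):
    """Run parser on every formula; return (formula, error message) pairs for those that raise."""
    failures = []
    for formula in formulas:
        try:
            parser(formula)
        except Exception as exc:  # broad on purpose: the goal is to report, not to handle
            failures.append((formula, str(exc)))
    return failures


def toy_parser(formula):
    # Stand-in for illustration only: rejects unbalanced parentheses.
    if formula.count("(") != formula.count(")"):
        raise ValueError(f"{formula!r} has unbalanced parentheses")
    return formula


bad = find_failures(["Fe2O3", "Ca(OH2", "NaCl"], toy_parser)
print(bad)
```

The traceback above lists `parse_formula` in `CBFV/composition.py`, so passing that function as `parser` should presumably report the same failures that `generate_features` hits, but I have not verified this against the installed version.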

Usage instructions for pip-installed cbfv (possible bug)

The PyPI page has no description, and how to use the pip-installed package is not obvious from the README. Taylor and I are both having trouble with it. Installed in its own conda environment:

(cbfv) C:\Users\sterg>pip install cbfv
Collecting cbfv
  Downloading cbfv-1.0.0-py3-none-any.whl (5.0 kB)
Collecting numpy
  Using cached numpy-1.21.2-cp38-cp38-win_amd64.whl (14.0 MB)
Collecting pytest
  Downloading pytest-6.2.5-py3-none-any.whl (280 kB)
     |████████████████████████████████| 280 kB 819 kB/s
Collecting pandas
  Using cached pandas-1.3.3-cp38-cp38-win_amd64.whl (10.2 MB)
Collecting tqdm
  Downloading tqdm-4.62.3-py2.py3-none-any.whl (76 kB)
     |████████████████████████████████| 76 kB 5.5 MB/s
Collecting python-dateutil>=2.7.3
  Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting pytz>=2017.3
  Downloading pytz-2021.3-py2.py3-none-any.whl (503 kB)
     |████████████████████████████████| 503 kB 2.2 MB/s
Collecting iniconfig
  Downloading iniconfig-1.1.1-py2.py3-none-any.whl (5.0 kB)
Collecting atomicwrites>=1.0
  Downloading atomicwrites-1.4.0-py2.py3-none-any.whl (6.8 kB)
Collecting py>=1.8.2
  Downloading py-1.10.0-py2.py3-none-any.whl (97 kB)
     |████████████████████████████████| 97 kB 2.2 MB/s
Collecting pluggy<2.0,>=0.12
  Downloading pluggy-1.0.0-py2.py3-none-any.whl (13 kB)
Collecting colorama
  Using cached colorama-0.4.4-py2.py3-none-any.whl (16 kB)
Collecting packaging
  Downloading packaging-21.0-py3-none-any.whl (40 kB)
     |████████████████████████████████| 40 kB ...
Collecting attrs>=19.2.0
  Downloading attrs-21.2.0-py2.py3-none-any.whl (53 kB)
     |████████████████████████████████| 53 kB 1.0 MB/s
Collecting toml
  Using cached toml-0.10.2-py2.py3-none-any.whl (16 kB)
Collecting six>=1.5
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting pyparsing>=2.0.2
  Using cached pyparsing-2.4.7-py2.py3-none-any.whl (67 kB)
Installing collected packages: six, pyparsing, toml, pytz, python-dateutil, py, pluggy, packaging, numpy, iniconfig, colorama, attrs, atomicwrites, tqdm, pytest, pandas, cbfv
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
skrebate 0.6 requires scikit-learn, which is not installed.
skrebate 0.6 requires scipy, which is not installed.
automatminer 1.0.3.20200727 requires matminer==0.6.2, which is not installed.
automatminer 1.0.3.20200727 requires pymatgen==2020.01.28, which is not installed.
automatminer 1.0.3.20200727 requires scikit-learn==0.22.2, which is not installed.
automatminer 1.0.3.20200727 requires tpot==0.11.0, which is not installed.
auto-xrd 0.0.1 requires pymatgen, which is not installed.
auto-xrd 0.0.1 requires scipy, which is not installed.
tensorflow 2.5.0rc1 requires absl-py~=0.10, which is not installed.
tensorflow 2.5.0rc1 requires astunparse~=1.6.3, which is not installed.
tensorflow 2.5.0rc1 requires flatbuffers~=1.12.0, which is not installed.
tensorflow 2.5.0rc1 requires gast==0.4.0, which is not installed.
tensorflow 2.5.0rc1 requires google-pasta~=0.2, which is not installed.
tensorflow 2.5.0rc1 requires grpcio~=1.34.0, which is not installed.
tensorflow 2.5.0rc1 requires h5py~=3.1.0, which is not installed.
tensorflow 2.5.0rc1 requires keras-nightly~=2.5.0.dev, which is not installed.
tensorflow 2.5.0rc1 requires keras-preprocessing~=1.1.2, which is not installed.
tensorflow 2.5.0rc1 requires typing-extensions~=3.7.4, which is not installed.
dtw-python 1.1.10 requires scipy>=1.1, which is not installed.
tensorboard 2.4.1 requires absl-py>=0.4, which is not installed.
tensorboard 2.4.1 requires google-auth<2,>=1.6.3, which is not installed.
tensorboard 2.4.1 requires google-auth-oauthlib<0.5,>=0.4.1, which is not installed.
tensorboard 2.4.1 requires grpcio>=1.24.3, which is not installed.
tensorboard 2.4.1 requires markdown>=2.6.8, which is not installed.
tensorboard 2.4.1 requires requests<3,>=2.21.0, which is not installed.
tensorboard 2.4.1 requires tensorboard-plugin-wit>=1.6.0, which is not installed.
tensorboard 2.4.1 requires werkzeug>=0.11.15, which is not installed.
tensorflow 2.5.0rc1 requires numpy~=1.19.2, but you have numpy 1.21.2 which is incompatible.
tensorflow 2.5.0rc1 requires six~=1.15.0, but you have six 1.16.0 which is incompatible.
Successfully installed atomicwrites-1.4.0 attrs-21.2.0 cbfv-1.0.0 colorama-0.4.4 iniconfig-1.1.1 numpy-1.21.2 packaging-21.0 pandas-1.3.3 pluggy-1.0.0 py-1.10.0 pyparsing-2.4.7 pytest-6.2.5 python-dateutil-2.8.2 pytz-2021.3 six-1.16.0 toml-0.10.2 tqdm-4.62.3

(cbfv) C:\Users\sterg>python3
Python 3.9.7 (tags/v3.9.7:1016ef3, Aug 30 2021, 20:19:38) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import cbfv
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'cbfv'
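
Two things in the log above may be worth checking. The wheels pip selected are tagged cp38, while the REPL reports Python 3.9.7, so `python3` may be resolving to a different interpreter than the one pip installed into. Separately, other tracebacks on this page import the package as `from CBFV import composition` or show a `composition_based_feature_vector` directory, so the import name may not be lowercase `cbfv`. The sketch below only checks which of those names the current interpreter can locate; `find_importable` is a hypothetical helper, not something CBFV provides.

```python
import importlib.util
import sys


def find_importable(candidates):
    """Return the candidate top-level module names that this interpreter can locate."""
    return [name for name in candidates if importlib.util.find_spec(name) is not None]


# Confirm which interpreter is actually running before blaming the package.
print("interpreter:", sys.executable)
print("version:", sys.version.split()[0])
print("importable:", find_importable(["cbfv", "CBFV", "composition_based_feature_vector"]))
```

If the interpreter path is outside the conda environment, running the environment's own `python` instead of `python3` would be the first thing to try.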

Invalid formulas during generate_features()

Some formulas in my datasets are not recognized, and I get the error:

raise CompositionError(f'{f} is an invalid formula!')
CompositionError: ,65 is an invalid formula!

This happens in the get_sym_dict() function. Is there a way to automatically drop formulas containing unrecognized symbols?
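
Until something like that exists in the library, one option is to pre-filter the formulas before calling generate_features(). The sketch below is a heuristic, not CBFV's own parser: `looks_like_formula` is a hypothetical helper that accepts only element-like tokens with optional amounts and balanced parentheses. It is stricter than necessary in some cases and passing it does not guarantee that CBFV will accept the formula. The example strings are made up; a leftover like ",65" would be consistent with a decimal comma, but that is a guess.

```python
import re

# Element-like token: capital letter, optional lowercase letter, optional amount such as 2 or 0.65.
_TOKENS = re.compile(r"(?:[A-Z][a-z]?(?:\d+(?:\.\d+)?)?)+")
# Closing parenthesis with an optional multiplier, e.g. ")2".
_CLOSE = re.compile(r"\)(?:\d+(?:\.\d+)?)?")


def looks_like_formula(formula):
    """Heuristic check: balanced parentheses and nothing but element-like tokens."""
    if not isinstance(formula, str):
        return False
    depth = 0
    for ch in formula:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    if depth != 0:
        return False
    # Drop the parentheses (and group multipliers), then require only tokens to remain.
    flat = _CLOSE.sub("", formula.replace("(", ""))
    return _TOKENS.fullmatch(flat) is not None


formulas = ["Fe2O3", "Ca(OH)2", "Fe0,65Ni0,35", "(", "Al2(SO4)3"]
kept = [f for f in formulas if looks_like_formula(f)]
rejected = [f for f in formulas if not looks_like_formula(f)]
print(kept)
print(rejected)
```

With a DataFrame, the same check could be applied as a mask, for example `df[df["formula"].map(looks_like_formula)]`, assuming the formulas live in a column named `formula`. Keeping the rejected list around makes it easier to fix the inputs rather than silently losing rows.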

`Me` element missing, not accounted for in "exotic" elements checking

Exception has occurred: ValueError       (note: full exception trace is shown but execution is paused at: _run_module_as_main)
'Me' is not in list
  File "C:\Users\sterg\miniconda3\envs\vickers\Lib\site-packages\composition_based_feature_vector\composition.py", line 133, in _assign_features
    row = elem_index[elem_symbols.index(elem)]
  File "C:\Users\sterg\miniconda3\envs\vickers\Lib\site-packages\composition_based_feature_vector\composition.py", line 295, in generate_features
    feats, targets, formulae, skipped = _assign_features(matrices,
  File "C:\Users\sterg\Documents\GitHub\sparks-baird\VickersHardnessPrediction\vickers_hardness\utils\mpds.py", line 12, in <module>
    X, y, formulae, skipped = generate_features(df)
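
As a workaround on the caller's side, formulas containing symbols that are not real elements (such as the "Me" placeholder here) can be flagged before featurizing. A minimal sketch, assuming the set of known symbols is supplied by the caller; `unknown_symbols` is a hypothetical helper and the small `known` set is a placeholder, not CBFV's element table.

```python
import re


def unknown_symbols(formula, known):
    """Return the element-like tokens in formula that are not in the known set."""
    tokens = re.findall(r"[A-Z][a-z]?", formula)
    return sorted(set(tokens) - set(known))


known = {"H", "C", "O", "Fe"}  # placeholder subset, for illustration only
print(unknown_symbols("MeFeO3", known))  # ['Me']
print(unknown_symbols("Fe2O3", known))   # []
```

In practice `known` would be the full list of symbols in whichever element property table is being used.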

`generate_features(..., extend_features=...)` InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Processing Input Data: 100%|██████████| 1794/1794 [00:00<00:00, 7378.49it/s]
	Featurizing Compositions...
Assigning Features...: 100%|██████████| 1778/1778 [00:00<00:00, 3426.03it/s]
NOTE: Your data contains formula with exotic elements. These were skipped.
	Creating Pandas Objects...

---------------------------------------------------------------------------
InvalidIndexError                         Traceback (most recent call last)
<ipython-input-45-22826a03d387> in <module>()
      1 from CBFV import composition
----> 2 X, y, formulae, skipped = composition.generate_features(df, extend_features="R")

4 frames
/usr/local/lib/python3.7/dist-packages/CBFV/composition.py in generate_features(df, elem_prop, drop_duplicates, extend_features, sum_feat, mini)
    307         extended = pd.DataFrame(extra_features, columns=features)
    308         extended = extended.set_index('formula', drop=True)
--> 309         X = pd.concat([X, extended], axis=1)
    310 
    311     # reset dataframe indices

/usr/local/lib/python3.7/dist-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    309                     stacklevel=stacklevel,
    310                 )
--> 311             return func(*args, **kwargs)
    312 
    313         return wrapper

/usr/local/lib/python3.7/dist-packages/pandas/core/reshape/concat.py in concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    305     )
    306 
--> 307     return op.get_result()
    308 
    309 

/usr/local/lib/python3.7/dist-packages/pandas/core/reshape/concat.py in get_result(self)
    526                     obj_labels = obj.axes[1 - ax]
    527                     if not new_labels.equals(obj_labels):
--> 528                         indexers[ax] = obj_labels.get_indexer(new_labels)
    529 
    530                 mgrs_indexers.append((obj._mgr, indexers))

/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_indexer(self, target, method, limit, tolerance)
   3440 
   3441         if not self._index_as_unique:
-> 3442             raise InvalidIndexError(self._requires_unique_msg)
   3443 
   3444         if not self._should_compare(target) and not is_interval_dtype(self.dtype):

InvalidIndexError: Reindexing only valid with uniquely valued Index objects
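
A possible explanation, offered as a guess rather than a diagnosis: the log shows 1794 rows processed but only 1778 featurized, so the two frames passed to pd.concat(..., axis=1) may have different formula indexes, and a column-wise concat of frames whose indexes differ needs unique labels. The sketch below shows one way to check for repeated labels and to drop them before concatenating. `duplicated_labels` is a hypothetical helper and the two small frames are made-up stand-ins for `X` and `extended`.

```python
import pandas as pd


def duplicated_labels(frame):
    """Return the index labels that occur more than once."""
    return frame.index[frame.index.duplicated()].unique().tolist()


# Made-up stand-ins: X lost a row, extended still has a repeated formula.
X = pd.DataFrame({"feat": [1.0, 2.0]}, index=["Fe2O3", "NaCl"])
extended = pd.DataFrame({"R": [0.1, 0.2, 0.3]}, index=["Fe2O3", "Fe2O3", "NaCl"])

print(duplicated_labels(extended))

# Keep the first row for each repeated label, then align on the index.
extended_unique = extended[~extended.index.duplicated(keep="first")]
combined = pd.concat([X, extended_unique], axis=1)
print(combined)
```

Dropping repeated formulas changes the dataset, so this only makes sense when duplicates are not meaningful. If they are, the extra column would need to be aligned by row position instead.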

Accompanying paper

Which paper accompanies this package? Could it be cited in the README?
