decile-team / submodlib
Summarize Massive Datasets using Submodular Optimization
License: MIT License
Installation environment:
Error:
` /tmp/pip-build-env-m7fawz9p/overlay/lib/python3.8/site-packages/pybind11/include/pybind11/numpy.h:1195:51: error: ‘is_pod_struct’ is not a member of ‘pybind11::detail’
/tmp/pip-build-env-m7fawz9p/overlay/lib/python3.8/site-packages/pybind11/include/pybind11/numpy.h:1195:74: error: template argument 1 is invalid
struct compare_buffer_info<T, detail::enable_if_t<detail::is_pod_struct::value>> {
^
/tmp/pip-build-env-m7fawz9p/overlay/lib/python3.8/site-packages/pybind11/include/pybind11/numpy.h:1195:77: error: template argument 2 is invalid
struct compare_buffer_info<T, detail::enable_if_t<detail::is_pod_struct::value>> {
^
/tmp/pip-build-env-m7fawz9p/overlay/lib/python3.8/site-packages/pybind11/include/pybind11/numpy.h:1195:82: error: expected unqualified-id before ‘>’ token
struct compare_buffer_info<T, detail::enable_if_t<detail::is_pod_struct::value>> {
^
/tmp/pip-build-env-m7fawz9p/overlay/lib/python3.8/site-packages/pybind11/include/pybind11/numpy.h:1392:19: error: ‘is_pod_struct’ was not declared in this scope
static_assert(is_pod_struct::value,
^
/tmp/pip-build-env-m7fawz9p/overlay/lib/python3.8/site-packages/pybind11/include/pybind11/numpy.h:1392:34: error: expected primary-expression before ‘>’ token
static_assert(is_pod_struct::value,
^
/tmp/pip-build-env-m7fawz9p/overlay/lib/python3.8/site-packages/pybind11/include/pybind11/numpy.h:1392:35: error: ‘::value’ has not been declared
static_assert(is_pod_struct::value,
^
cpp/submod/FacilityLocation2.cpp: In member function ‘void FacilityLocation2::pybind_init(ll, const pybind11::array_t&, bool, const std::unordered_set&, bool)’:
cpp/submod/FacilityLocation2.cpp:77:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (size_t idx = 0; idx < num_rows; idx++) {
^
cpp/submod/FacilityLocation2.cpp:78:34: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (size_t idy = 0; idy < num_columns; idy++) {
^
In file included from cpp/submod/FacilityLocation2.cpp:10:0:
cpp/submod/FacilityLocation2.h: In constructor ‘FacilityLocation2::FacilityLocation2(ll, const std::vector<std::vector >&, bool, const std::unordered_set&, bool)’:
cpp/submod/FacilityLocation2.h:39:33: warning: ‘FacilityLocation2::denseKernel’ will be initialized after [-Wreorder]
std::vector<std::vector>denseKernel; //size n_master X n
^
cpp/submod/FacilityLocation2.h:32:7: warning: ‘bool FacilityLocation2::partial’ [-Wreorder]
bool partial; //if masked implementation is desired, relevant to be used in ClusteredFunction
^
cpp/submod/FacilityLocation2.cpp:125:1: warning: when initialized here [-Wreorder]
FacilityLocation2::FacilityLocation2(ll n_, std::vector<std::vector> const &denseKernel_, bool partial_, std::unordered_set const &ground_, bool separateMaster_): n(n_), mode(dense), denseKernel(denseKernel_), partial(partial_), separateMaster(separateMaster_) {
^
In file included from cpp/submod/FacilityLocation2.cpp:10:0:
cpp/submod/FacilityLocation2.h: In constructor ‘FacilityLocation2::FacilityLocation2(ll, const std::vector<std::unordered_set >&, const std::vector<std::vector<std::vector > >&, const std::vector&)’:
cpp/submod/FacilityLocation2.h:47:17: warning: ‘FacilityLocation2::clusterIndexMap’ will be initialized after [-Wreorder]
std::vectorclusterIndexMap; //mapping from datapont index to index in cluster kernel, size = n
^
In file included from cpp/submod/FacilityLocation2.cpp:10:0:
cpp/submod/FacilityLocation2.h:32:7: warning: ‘bool FacilityLocation2::partial’ [-Wreorder]
bool partial; //if masked implementation is desired, relevant to be used in ClusteredFunction
^
cpp/submod/FacilityLocation2.cpp:228:1: warning: when initialized here [-Wreorder]
FacilityLocation2::FacilityLocation2(ll n_, std::vector<std::unordered_set> const &clusters_,std::vector<std::vector<std::vector>> const &clusterKernels_, std::vector const &clusterIndexMap_): n(n_), mode(clustered), num_clusters(clusters_.size()), clusters(clusters_), clusterKernels(clusterKernels_), clusterIndexMap(clusterIndexMap_), partial(false), separateMaster(false) {
^
In file included from cpp/submod/FacilityLocation2.cpp:10:0:
cpp/submod/FacilityLocation2.h: In copy constructor ‘FacilityLocation2::FacilityLocation2(const FacilityLocation2&)’:
cpp/submod/FacilityLocation2.h:39:33: warning: ‘FacilityLocation2::denseKernel’ will be initialized after [-Wreorder]
std::vector<std::vector>denseKernel; //size n_master X n
^
cpp/submod/FacilityLocation2.h:32:7: warning: ‘bool FacilityLocation2::partial’ [-Wreorder]
bool partial; //if masked implementation is desired, relevant to be used in ClusteredFunction
^
cpp/submod/FacilityLocation2.cpp:268:1: warning: when initialized here [-Wreorder]
FacilityLocation2::FacilityLocation2(const FacilityLocation2& f)
^
error: command '/usr/bin/gcc' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for submodlib
Failed to build submodlib
ERROR: Could not build wheels for submodlib, which is required to install pyproject.toml-based projects
`
How can I fix this?
Your requirements.txt and setup.py files don't match, which causes pip install . to fail. If you update the install_requires list in setup.py to install_requires=["numpy==1.22.0", "scipy", "scikit-learn", "numba"], it installs successfully.
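A mismatch like the one described above can be spotted before running pip. The following is a minimal sketch, not part of submodlib: the helper names and the sample requirements.txt contents are assumptions made for illustration, and the parsing only handles simple requirement lines.

```python
import re


def requirement_names(lines):
    """Return the set of normalized package names found in requirement strings."""
    names = set()
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        # The package name is everything before the first version operator,
        # extras bracket, environment marker, or whitespace.
        name = re.split(r"[<>=!~\[;\s]", line, maxsplit=1)[0]
        names.add(name.lower().replace("_", "-"))
    return names


def missing_from_setup(requirements_txt_lines, install_requires):
    """List packages named in requirements.txt but absent from install_requires."""
    return sorted(requirement_names(requirements_txt_lines) - requirement_names(install_requires))


# Hypothetical file contents, for illustration only.
requirements_txt = ["numpy==1.22.0", "scipy", "scikit-learn", "numba", "fastdist"]
install_requires = ["numpy==1.22.0", "scipy", "scikit-learn", "numba"]
print(missing_from_setup(requirements_txt, install_requires))  # ['fastdist']
```

Any name the helper reports would need to be added to install_requires (or removed from requirements.txt) for the two files to agree.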
Installing submodlib with the provided Alternative 1 causes the following error:
ERROR: Command errored out with exit status 1:
command: /home/venkat/BADRI/venv/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-vdi9e14s/sklearn/setup.py'"'"'; __file__='"'"'/tmp/pip-install-vdi9e14s/sklearn/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-vdi9e14s/sklearn/pip-egg-info
cwd: /tmp/pip-install-vdi9e14s/sklearn/
Complete output (18 lines):
The 'sklearn' PyPI package is deprecated, use 'scikit-learn'
rather than 'sklearn' for pip commands.
Here is how to fix this error in the main use cases:
- use 'pip install scikit-learn' rather than 'pip install sklearn'
- replace 'sklearn' by 'scikit-learn' in your pip requirements files
(requirements.txt, setup.py, setup.cfg, Pipfile, etc ...)
- if the 'sklearn' package is used by one of your dependencies,
it would be great if you take some time to track which package uses
'sklearn' instead of 'scikit-learn' and report it to their issue tracker
- as a last resort, set the environment variable
SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True to avoid this error
More information is available at
https://github.com/scikit-learn/sklearn-pypi-package
If the previous advice does not cover your use case, feel free to report it at
https://github.com/scikit-learn/sklearn-pypi-package/issues/new
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
Please switch the dependency from sklearn to scikit-learn to avoid this error.
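For a project that carries the deprecated name in a dependency list, the rename can be applied mechanically. This is a small sketch under the assumption that the dependencies are available as plain requirement strings; the function name is hypothetical. Any version pin on sklearn is dropped, because the placeholder package's version numbers (such as 0.0.post3 in the log below) do not correspond to scikit-learn releases.

```python
import re


def replace_deprecated_sklearn(requirements):
    """Return a copy of the requirement list with 'sklearn' renamed to 'scikit-learn'."""
    fixed = []
    for req in requirements:
        # Extract the bare package name before any version operator or marker.
        name = re.split(r"[<>=!~\[;\s]", req.strip(), maxsplit=1)[0]
        fixed.append("scikit-learn" if name.lower() == "sklearn" else req)
    return fixed


print(replace_deprecated_sklearn(["numpy", "sklearn", "scipy"]))
# ['numpy', 'scikit-learn', 'scipy']
```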
When I installed the package in Colab, I got the following error:
Downloading https://test-files.pythonhosted.org/packages/91/e7/2bb3aeea86faadfe5f23adddf7f8d7375e70e390571da4da916c4e261c48/sklearn-0.0.post3.tar.gz (3.7 kB)
error: subprocess-exited-with-error
I would like to implement my own submodular function classes by inheriting from the SetFunction class. However, I couldn't figure out how to do it, or whether it is possible at all, because much of the implementation is abstracted away by the calls to cpp_obj. Is there a way to achieve what I want?
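The thread does not say whether SetFunction can be subclassed from Python. As a possible workaround, assuming only evaluation and naive greedy maximization are needed, a set function can be written in pure Python without touching cpp_obj. Everything below (PySetFunction, Coverage, the method names) is hypothetical and is not submodlib's API.

```python
from abc import ABC, abstractmethod


class PySetFunction(ABC):
    """Hypothetical pure-Python base class; not part of submodlib."""

    def __init__(self, n):
        self.n = n  # the ground set is {0, ..., n-1}

    @abstractmethod
    def evaluate(self, subset):
        """Return f(subset) for a set of indices."""

    def marginal_gain(self, subset, element):
        """Return f(subset + element) - f(subset); zero if already selected."""
        if element in subset:
            return 0.0
        return self.evaluate(subset | {element}) - self.evaluate(subset)

    def maximize(self, budget):
        """Naive greedy: repeatedly add the element with the largest marginal gain."""
        selected, order = set(), []
        for _ in range(min(budget, self.n)):
            candidates = [e for e in range(self.n) if e not in selected]
            gains = {e: self.marginal_gain(selected, e) for e in candidates}
            best = max(gains, key=gains.get)
            selected.add(best)
            order.append((best, gains[best]))
        return order


class Coverage(PySetFunction):
    """f(A) = number of distinct items covered by the elements of A."""

    def __init__(self, covers):
        super().__init__(len(covers))
        self.covers = covers

    def evaluate(self, subset):
        covered = set()
        for e in subset:
            covered |= self.covers[e]
        return float(len(covered))


f = Coverage([{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1}])
print(f.maximize(budget=2))  # [(2, 4.0), (0, 3.0)]
```

This re-evaluates the function for every candidate, so it is far slower than a memoized C++ implementation; it is meant for prototyping a new function, not for large datasets.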
Your library and documentation are great, thanks for sharing!
Can you please add a license file? Without a license this library cannot be used.
Error:
ERROR: Failed building wheel for submodlib
Failed to build submodlib
ERROR: Could not build wheels for submodlib which use PEP 517 and cannot be installed directly
================
my pip list:
Package Version
alabaster 0.7.12
attrs 19.3.0
autograd 1.3
autograd-gamma 0.5.0
Babel 2.10.1
certifi 2022.5.18.1
charset-normalizer 2.0.12
colorama 0.4.4
dm-tree 0.1.6
docutils 0.18.1
formulaic 0.2.4
gast 0.4.0
grpcio 1.34.1
idna 3.3
imagesize 1.3.0
importlib-metadata 4.11.4
interface-meta 1.2.4
Jinja2 3.1.2
joblib 1.1.0
latexcodec 2.0.1
lifelines 0.26.4
llvmlite 0.38.1
MarkupSafe 2.1.1
mkl-fft 1.3.1
mkl-random 1.2.2
mkl-service 2.4.0
multipledispatch 0.6.0
numba 0.55.2
numpy 1.20.1
packaging 21.3
pip 21.2.2
portpicker 1.3.9
pybtex 0.24.0
pybtex-docutils 1.0.2
Pygments 2.12.0
pyparsing 3.0.9
pytz 2022.1
PyYAML 6.0
requests 2.28.0
retrying 1.3.3
scikit-learn 0.23.0
scipy 1.4.1
semantic-version 2.8.5
setuptools 61.2.0
six 1.16.0
snowballstemmer 2.2.0
Sphinx 5.0.1
sphinxcontrib-applehelp 1.0.2
sphinxcontrib-bibtex 2.4.2
sphinxcontrib-devhelp 1.0.2
sphinxcontrib-htmlhelp 2.0.0
sphinxcontrib-jsmath 1.0.1
sphinxcontrib-qthelp 1.0.3
sphinxcontrib-serializinghtml 1.1.5
stg 0.1.2
tensorflow-addons 0.11.2
tensorflow-estimator 2.5.0
tensorflow-federated 0.17.0
tensorflow-model-optimization 0.4.1
tensorflow-privacy 0.5.2
threadpoolctl 3.1.0
tqdm 4.64.0
typeguard 2.13.3
urllib3 1.26.9
wheel 0.37.1
wincertstore 0.2
zipp 3.8.0
Thank you
Hi,
The following tests fail when installing the software.
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
====================================================================================================== short test summary info =======================================================================================================
FAILED tests/test_all.py::TestAll::test_dense_cpp_eval_groundset[GraphCut] - AssertionError: Eval on groundset is not >= 0 or is NAN or is INF
FAILED tests/test_all.py::TestAll::test_dense_py_eval_groundset[GraphCut] - AssertionError: Eval on groundset is not >= 0 or is NAN or is INF
FAILED tests/test_all.py::TestAll::test_mi_dense_cpp_eval_groundset[GraphCutConditionalGain] - AssertionError: Eval on groundset is not >= 0 or is NAN or is INF
FAILED tests/test_all.py::TestAll::test_mi_dense_py_eval_groundset[GraphCutConditionalGain] - AssertionError: Eval on groundset is not >= 0 or is NAN or is INF
============================================================================================ 4 failed, 687 passed, 335 warnings in 17.34s ============================================================================================
pip install . does not install all dependencies
Add fastdist to setup.py
stopIfZeroGain in maximize() checks 0 as an Int instead of Float
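The issue title does not show the actual check inside maximize(). As a general illustration only, and not the library's code, a floating-point gain is usually compared against zero with a tolerance rather than with an exact equality test. The helper name and the tolerance value are assumptions.

```python
import math


def is_zero_gain(gain, tol=1e-9):
    """Treat gains within tol of zero as zero (tol is an illustrative choice)."""
    return math.isclose(gain, 0.0, abs_tol=tol)


# Mathematically zero, but not exactly 0.0 in binary floating point.
gain = 0.1 + 0.2 - 0.3
print(gain == 0, is_zero_gain(gain))  # False True
```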
While the pip installation works for Google Colab, I am still running into issues installing on a Windows system with C++20 VS build tools. There are quite a few warnings; more importantly, there are several errors that I've identified from the full log output (PIP Log Output.txt):
C:\Users\nbeck\AppData\Local\Temp\pip-install-d83pcjkv\submodlib_df4056be6b38418fbc6a76e7ace5d2d4\cpp\SetFunction.cpp(12) : error C4716: 'SetFunction::getEffectiveGroundSet': must return a value
C:\Users\nbeck\AppData\Local\Temp\pip-install-d83pcjkv\submodlib_df4056be6b38418fbc6a76e7ace5d2d4\cpp\SetFunction.cpp(10) : error C4716: 'SetFunction::marginalGainWithMemoization': must return a value
C:\Users\nbeck\AppData\Local\Temp\pip-install-d83pcjkv\submodlib_df4056be6b38418fbc6a76e7ace5d2d4\cpp\SetFunction.cpp(9) : error C4716: 'SetFunction::marginalGain': must return a value
C:\Users\nbeck\AppData\Local\Temp\pip-install-d83pcjkv\submodlib_df4056be6b38418fbc6a76e7ace5d2d4\cpp\SetFunction.cpp(8) : error C4716: 'SetFunction::evaluateWithMemoization': must return a value
C:\Users\nbeck\AppData\Local\Temp\pip-install-d83pcjkv\submodlib_df4056be6b38418fbc6a76e7ace5d2d4\cpp\SetFunction.cpp(7) : error C4716: 'SetFunction::evaluate': must return a value
C:\Users\nbeck\AppData\Local\Temp\pip-install-d83pcjkv\submodlib_df4056be6b38418fbc6a76e7ace5d2d4\cpp\optimizers\LazierThanLazyGreedyOptimizer.cpp(49) : error C4716: 'printSortedSet': must return a value
C:\Users\nbeck\AppData\Local\Temp\pip-install-d83pcjkv\submodlib_df4056be6b38418fbc6a76e7ace5d2d4\cpp\SetFunction.cpp(13) : error C4716: 'SetFunction::maximize': must return a value
All the errors revolve around the lack of a return statement in functions that have return types. As a suggestion, perhaps the SetFunction class could instead declare its functions as pure virtual functions so that they can be abstract definitions/declarations (as they do not do anything by their current definition). Furthermore, perhaps the SetFunction class could declare a blank virtual destructor as not all compilers will bind delete calls from base-class pointers to the most-derived class's destructor without this definition.
Try a few test cases where we compare the greedy ordering obtained by submodlib and datk for some functions.
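One way such a comparison could be written is sketched below. It assumes both libraries return the greedy result as a list of (element, gain) pairs; that format is an assumption for datk, and the function name is hypothetical. Gains are compared with a tolerance because the two implementations may accumulate floating-point error differently.

```python
import math


def orderings_match(order_a, order_b, rel_tol=1e-6):
    """Return True if both greedy orderings pick the same elements with close gains."""
    if len(order_a) != len(order_b):
        return False
    for (elem_a, gain_a), (elem_b, gain_b) in zip(order_a, order_b):
        if elem_a != elem_b:
            return False
        if not math.isclose(gain_a, gain_b, rel_tol=rel_tol, abs_tol=1e-9):
            return False
    return True


# Illustrative values only, not real output from either library.
print(orderings_match([(2, 4.0), (0, 3.0)], [(2, 4.0000001), (0, 3.0)]))  # True
```

Ties between equal gains can legitimately be broken differently by two implementations, so a strict ordering check like this one may need relaxing for functions where ties are common.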
See https://pypi.org/project/sklearn/ -> https://pypi.org/project/scikit-learn/
Thanks,
Andreas
I'm installing it on a Windows machine and am facing this issue.
Building wheels for collected packages: submodlib
Building wheel for submodlib (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for submodlib (pyproject.toml) did not run successfully.
│ exit code: 1
Sharing the complete logs
error.odt
The C++ implementation of the SetCover function repeatedly uses the word 'concept' as a variable name; however, 'concept' is a reserved keyword in C++20, which causes compilation errors during pip installation.