Comments (4)
I'm not sure about Docker, but on LC systems, a significant source of the slowdown appears to be importing pandas:
herbein1@quartz386 ~/Repositories/flux-framework/flux-accounting (core-autotools)
❯ time python3.7 -c 'import pandas'
python3.7 -c 'import pandas' 0.58s user 0.49s system 6% cpu 15.915 total
❯ flux python -m cProfile -s cumtime ./libexec/flux/cmd/flux-account.py
usage: flux-account.py [-h] [-p PATH] [-o OUTPUT_FILE]
{view-user,add-user,delete-user,edit-user,view-job-records,create-db,add-bank,view-bank,delete-bank,edit-bank,print-hierarchy}
...
flux-account.py: error: the following arguments are required: subcommand
315329 function calls (307279 primitive calls) in 19.592 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
462/1 0.006 0.000 19.592 19.592 {built-in method builtins.exec}
1 0.000 0.000 19.592 19.592 flux-account.py:12(<module>)
823/8 0.005 0.000 19.582 2.448 <frozen importlib._bootstrap>:978(_find_and_load)
823/8 0.004 0.000 19.582 2.448 <frozen importlib._bootstrap>:948(_find_and_load_unlocked)
485/9 0.004 0.000 19.538 2.171 <frozen importlib._bootstrap>:663(_load_unlocked)
400/9 0.002 0.000 19.537 2.171 <frozen importlib._bootstrap_external>:722(exec_module)
842/9 0.001 0.000 19.516 2.168 <frozen importlib._bootstrap>:211(_call_with_frames_removed)
4 0.000 0.000 19.373 4.843 __init__.py:5(<module>)
2 0.000 0.000 19.245 9.623 __init__.py:4(<module>)
79/64 0.000 0.000 9.602 0.150 <frozen importlib._bootstrap_external>:1048(exec_module)
79/64 0.015 0.000 9.602 0.150 {built-in method _imp.exec_dynamic}
545 0.001 0.000 9.286 0.017 {method 'extend' of 'list' objects}
1 0.000 0.000 9.285 9.285 lazy.py:93(_lazy)
594 0.002 0.000 9.284 0.016 __init__.py:1088(<genexpr>)
593 0.004 0.000 9.283 0.016 __init__.py:100(resource_exists)
593 0.011 0.000 9.270 0.016 __init__.py:74(open_resource)
3353 8.672 0.003 8.672 0.003 {built-in method posix.stat}
558/78 0.001 0.000 7.595 0.097 {built-in method builtins.__import__}
33 0.001 0.000 4.977 0.151 __init__.py:1(<module>)
400 0.009 0.000 4.889 0.012 <frozen importlib._bootstrap_external>:793(get_code)
400 4.760 0.012 4.778 0.012 <frozen importlib._bootstrap_external>:914(get_data)
756 0.002 0.000 4.729 0.006 genericpath.py:16(exists)
595 4.531 0.008 4.531 0.008 {built-in method io.open}
2468/1454 0.002 0.000 4.445 0.003 <frozen importlib._bootstrap>:1009(_handle_fromlist)
649 0.007 0.000 4.246 0.007 <frozen importlib._bootstrap>:882(_find_spec)
644 0.001 0.000 4.235 0.007 <frozen importlib._bootstrap_external>:1272(find_spec)
644 0.004 0.000 4.234 0.007 <frozen importlib._bootstrap_external>:1240(_get_spec)
1442 0.015 0.000 4.220 0.003 <frozen importlib._bootstrap_external>:1356(find_spec)
2570 0.002 0.000 3.944 0.002 <frozen importlib._bootstrap_external>:74(_path_stat)
728 0.003 0.000 3.907 0.005 <frozen importlib._bootstrap_external>:84(_path_is_mode_type)
668 0.001 0.000 3.907 0.006 <frozen importlib._bootstrap_external>:93(_path_isfile)
1 0.000 0.000 3.710 3.710 __init__.py:106(<module>)
1 0.000 0.000 2.167 2.167 api.py:5(<module>)
6 0.000 0.000 1.365 0.228 api.py:3(<module>)
1 0.000 0.000 1.211 1.211 __init__.py:25(<module>)
1 0.000 0.000 0.944 0.944 groupby.py:8(<module>)
1 0.000 0.000 0.928 0.928 frame.py:12(<module>)
One potential workaround, which goes against traditional Python style but might be worthwhile here, would be to import the flux-accounting subpackages into flux-account.py only after argument parsing has been performed. This wouldn't make flux-account.py any faster in general, but it would be significantly faster for the -h case and for improperly invoked commands.
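A minimal sketch of that deferred-import idea, assuming an argparse-based entry point like the one in the usage output above. The function names `build_parser`, `run_subcommand`, and `main` are hypothetical, and `sqlite3` merely stands in for whichever flux-accounting subpackages pull in pandas; this is not the project's actual code.

```python
import argparse


def build_parser():
    # Only argparse is needed to build the parser, so -h and usage
    # errors never trigger the expensive imports.
    parser = argparse.ArgumentParser(prog="flux-account.py")
    parser.add_argument(
        "-p", "--path", default=":memory:",
        help="specify location of database file",
    )
    subparsers = parser.add_subparsers(dest="subcommand", required=True)
    subparsers.add_parser("view-user", help="view a user's information")
    return parser


def run_subcommand(args):
    # Deferred import: paid only after the arguments parsed successfully.
    # sqlite3 is a stand-in (assumption) for the heavy subpackages.
    import sqlite3

    conn = sqlite3.connect(args.path)
    conn.close()
    return 0


def main(argv=None):
    # parse_args() exits on -h or on a usage error before any heavy import.
    args = build_parser().parse_args(argv)
    return run_subcommand(args)
```

Calling `main(["-h"])` or `main([])` exits inside `parse_args()`, so the import in `run_subcommand` is never reached in those cases. A correctly invoked subcommand still pays the full import cost.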
I notice this delay too. I wonder if it is something to do with the virtual environment.
FWIW, when I build flux-accounting alongside flux-core, I notice that the delay isn't nearly as large:
[fluxuser@045fd022ca63 ~]$ time flux account -h
usage: flux-account.py [-h] [-p PATH] [-o OUTPUT_FILE]
{view-user,add-user,delete-user,edit-user,view-job-records,create-db,add-bank,view-bank,delete-bank,edit-bank,print-hierarchy}
...
Description: Translate command line arguments into SQLite instructions for the
Flux Accounting Database.
positional arguments:
{view-user,add-user,delete-user,edit-user,view-job-records,create-db,add-bank,view-bank,delete-bank,edit-bank,print-hierarchy}
sub-command help
view-user view a user's information in the accounting database
add-user add a user to the accounting database
delete-user remove a user from the accounting database
edit-user edit a user's value
view-job-records view job records
create-db create the flux-accounting database
add-bank add a new bank
view-bank view bank information
delete-bank remove a bank
edit-bank edit a bank's allocation
print-hierarchy print accounting database
optional arguments:
-h, --help show this help message and exit
-p PATH, --path PATH specify location of database file
-o OUTPUT_FILE, --output-file OUTPUT_FILE
specify location of output file
real 0m0.777s
user 0m0.789s
sys 0m0.246s
We talked briefly about this last week, but I wonder if it would be feasible to drop the dependency on pandas and instead use the Cursor object interface to fetch records and interact with the flux-accounting database. I haven't looked over this repo's interaction with pandas too closely just yet, but I'm pretty sure I have been using pandas mostly for easy formatting and indexing of SQL query results. If it does in fact take this long just to load pandas, it might be worth converting those calls so we don't have to use it anymore.
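As a rough illustration of what that conversion could look like, here is a sketch that uses only the standard-library `sqlite3` Cursor interface to fetch rows and print them in aligned columns. The `users` table, its columns, and the helper names `fetch_rows` and `format_table` are assumptions made for the example, not the flux-accounting schema or API.

```python
import sqlite3


def fetch_rows(conn, query, params=()):
    # cursor.description holds one 7-tuple per result column;
    # the first item of each tuple is the column name.
    cur = conn.cursor()
    cur.execute(query, params)
    headers = [col[0] for col in cur.description]
    return headers, cur.fetchall()


def format_table(headers, rows):
    # Pad every cell to the widest value in its column.
    table = [list(headers)] + [[str(v) for v in row] for row in rows]
    widths = [max(len(r[i]) for r in table) for i in range(len(headers))]
    lines = [
        "  ".join(cell.ljust(w) for cell, w in zip(r, widths)).rstrip()
        for r in table
    ]
    return "\n".join(lines)


# Hypothetical table used only for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, bank TEXT, shares INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [("alice", "A", 1), ("bob", "B", 2)],
)

headers, rows = fetch_rows(
    conn, "SELECT username, bank, shares FROM users WHERE bank = ?", ("A",)
)
print(format_table(headers, rows))
```

Setting `conn.row_factory = sqlite3.Row` is another option if name-based access to columns is what pandas was mainly providing.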
I haven't noticed any significant startup costs recently when installing flux-accounting RPMs in my Docker container, and since this issue was opened, the pandas dependency has been removed from the project, so I think I can safely close this issue. I can always re-open it if a similar problem crops up again.