cernceph / ceph-scripts
Small helper scripts for monitoring/managing a Ceph cluster
License: GNU General Public License v2.0
Do you know, or have a tool, to take a pg ID that can then map to a specific RBD?
Hi,
I cannot run script "ceph-pool-pg-distribution.py".
There's this error:
root@ld3955:~
# ./ceph-pool-pg-distribution.py
Traceback (most recent call last):
File "./ceph-pool-pg-distribution.py", line 7, in <module>
from cephinfo import cephinfo
ImportError: No module named cephinfo
root@ld3955:~
Please advise how to fix this.
When osd_max_backfills is not high enough, the script will keep increasing the weight. A solution would be to take backfill_wait into account and count it together with the backfilling status.
Would you merge a PR for this?
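A minimal sketch of the proposed change, assuming the script can obtain one state string per PG (for example "active+remapped+backfill_wait"). The helper name count_backfill_pgs is hypothetical and is not taken from the script:

```python
def count_backfill_pgs(pg_states):
    """Count PGs that are backfilling or still queued for backfill.

    pg_states is an iterable of Ceph PG state strings such as
    "active+remapped+backfilling". Hypothetical helper, not the script's API.
    """
    busy = {"backfilling", "backfill_wait"}
    # A PG state is a "+"-joined list of flags; match on whole flags only.
    return sum(1 for state in pg_states if busy & set(state.split("+")))
```

Counting queued PGs as well means the script would stop raising weights while backfills are still waiting for a free osd_max_backfills slot.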
Hello,
First of all, thank you for this script!
I had a few issues running the script with Python 3 on CentOS 7.
$ python3 -V
Python 3.6.8
$ python3 scripts/upmap-remapped.py
Error loading remapped pgs
The output of subprocess.check_output is of type bytes, so it needs to be decoded before split can be used:
for line in subprocess.check_output(['ceph', 'osd', 'pool', 'ls', 'detail']).split('\n'):
OSDS is of type bytes and not list:
OSDS = subprocess.check_output(['ceph', 'osd', 'ls', '-f', 'json'])
In valid_osds, the conversion of osds to str is useless and breaks the lookup in OSDS:
def valid_osds(osds):
    valid = []
    for osd in osds:
        if str (osd) in OSDS:
            valid.append(osd)
    return valid
I'll submit a PR with the various fixes.
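One possible shape of those fixes, as a sketch rather than the actual PR. It assumes that 'ceph osd ls -f json' prints a JSON list of integer OSD ids; the function names parse_pool_lines and parse_osd_ids, and the extra known_osds parameter, are hypothetical:

```python
import json


def parse_pool_lines(raw):
    """Decode the bytes returned by check_output before splitting into lines."""
    return raw.decode("utf-8").split("\n")


def parse_osd_ids(raw):
    """Turn the bytes of 'ceph osd ls -f json' into a list of ids.

    Assumption: the command prints a JSON list such as [0, 1, 2].
    """
    return json.loads(raw.decode("utf-8"))


def valid_osds(osds, known_osds):
    """Keep only the ids present in known_osds, comparing like with like."""
    known = set(known_osds)
    return [osd for osd in osds if osd in known]
```

In the script, the raw argument would be the return value of subprocess.check_output, for example parse_osd_ids(subprocess.check_output(['ceph', 'osd', 'ls', '-f', 'json'])).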
There's a bug in the "in_timeframe" routine in the ceph-gentle-reweight script.
If current_day is not in allowed_days but current_time falls between start_time and end_time, the script is still allowed to run. It shouldn't be, because the current day is not allowed.
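A sketch of the intended check, where both the day and the time must match. The signature is hypothetical, and the day numbering (ISO weekdays, Monday = 1) is an assumption about how the script counts days:

```python
from datetime import datetime, time


def in_timeframe(now, start_time, end_time, allowed_days):
    """Return True only if both the day and the time of 'now' are allowed.

    Assumption: allowed_days holds ISO weekday numbers (Monday = 1).
    """
    # Reject a disallowed day first, whatever the time of day is.
    if now.isoweekday() not in allowed_days:
        return False
    current = now.time()
    if start_time <= end_time:
        return start_time <= current <= end_time
    # Window wraps past midnight, e.g. 22:00 to 04:00.
    return current >= start_time or current <= end_time
```

Checking the day before the time makes it impossible for a matching time window to override a disallowed day.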
jq has supported the -e option since version 1.4; running earlier versions with -e results in:
$ ceph daemon mon.$(hostname -s) mon_status 2>/dev/null | jq -e '.state == "leader"'
jq: Unknown option -e
Use jq --help for help with command-line options,
or see the jq documentation at http://stedolan.github.com/jq
$ jq --version
jq version 1.3
The script redirects that error message to /dev/null:
$ ceph daemon mon.$(hostname -s) mon_status 2>/dev/null | jq -e '.state == "leader"' &>/dev/null
which leads to an incorrect result.
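One way to avoid depending on the jq version is to evaluate the same condition without jq. The sketch below swaps jq for Python's json module; it only assumes what the jq filter above already implies, namely that mon_status prints a JSON object with a "state" field. The function name is_leader is hypothetical:

```python
import json


def is_leader(mon_status_json):
    """Return True if the mon_status output reports state == "leader".

    Any unparsable or empty input counts as "not leader" instead of
    being silently discarded.
    """
    try:
        status = json.loads(mon_status_json)
    except ValueError:
        return False
    return isinstance(status, dict) and status.get("state") == "leader"
```

The input would be the text printed by 'ceph daemon mon.$(hostname -s) mon_status', captured for example with subprocess.check_output and decoded to str.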
Hi,
Thank you for creating such a great collection of scripts. After upgrading my Ceph cluster to Nautilus, the crush-reweight-by-utilization.py script stopped working.
# python crush-reweight-by-utilization.py --verbose --max-change=0.05
Traceback (most recent call last):
File "crush-reweight-by-utilization.py", line 211, in <module>
if VERBOSE: raise(e)
KeyError: 'osd_stats'
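A defensive lookup could tolerate both layouts. This is only a sketch under an unverified assumption: that the script reads 'osd_stats' from a JSON dump, and that newer releases nest that key under a 'pg_map' object instead of placing it at the top level. The helper name find_osd_stats is hypothetical:

```python
def find_osd_stats(pg_dump):
    """Return the osd_stats list from a parsed JSON dump.

    Assumption (not confirmed by the report above): the key is either at
    the top level or nested under "pg_map", depending on the Ceph release.
    """
    for container in (pg_dump, pg_dump.get("pg_map", {})):
        if "osd_stats" in container:
            return container["osd_stats"]
    raise KeyError("osd_stats")
```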
It would be nice to have a utility that can run on any node and return what node it is (mon/mgr/mds/osd/rgw)
For a script, it's difficult to be sure what kind of node it is running on. For example, if someone is writing a script that needs to do different things depending on whether it runs on a mon or an osd node, there's no easy way to be sure without combining several checks, such as looking at running services or examining /var/lib/ceph/XXX for non-empty folders.
So the expected behavior would be, when run on an OSD node,
$ ceph-whatami
OSD
When run on a mon+mgr node
$ ceph-whatami
MON
MGR
When run on a RGW node,
$ ceph-whatami
RGW
This would come in handy for any scripts that need to behave differently depending on whether they're running on mon/osd/mgr/rgw/mds ceph nodes.
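A rough sketch of such a utility, based only on the directory check mentioned above. The mapping of subdirectory names to roles (in particular "radosgw" for RGW) is an assumption about the default layout under /var/lib/ceph, and detect_roles is a hypothetical name:

```python
import os

# Assumed subdirectories of /var/lib/ceph and the role each one indicates.
ROLE_DIRS = {
    "mon": "MON",
    "mgr": "MGR",
    "mds": "MDS",
    "osd": "OSD",
    "radosgw": "RGW",
}


def detect_roles(base="/var/lib/ceph"):
    """Return the roles whose data directory exists and is non-empty."""
    roles = []
    for subdir, role in ROLE_DIRS.items():
        path = os.path.join(base, subdir)
        try:
            if os.path.isdir(path) and os.listdir(path):
                roles.append(role)
        except OSError:
            # Unreadable directory: treat the role as not detected.
            continue
    return roles
```

Printing "\n".join(detect_roles()) would give output in the format requested above. A non-empty directory only shows that a daemon was configured on the node at some point, so a real tool would probably also check running services.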
Hi there,
I'm a newbie to Ceph and Python. I got an error when I tried the script.
Kindly advise how to fix it. Thanks in advance.
Traceback (most recent call last):
File "./ceph-gentle-reweight.py", line 190, in <module>
main(sys.argv[1:])
File "./ceph-gentle-reweight.py", line 185, in main
reweight_osds(drain_osds, max_pgs_backfilling, max_latency, delta_weight, target_weight, test_pool, start_time, end_time, allowed_days, interval, really)
File "./ceph-gentle-reweight.py", line 90, in reweight_osds
latency = measure_latency(test_pool)
File "./ceph-gentle-reweight.py", line 46, in measure_latency
latency_ms = 1000*float(latency)
ValueError: could not convert string to float:
Thanks,
Allen
I tried this invocation of ceph-gentle-reweight:
ceph-gentle-reweight -o 151,153,155,157,159,161,163,165,167,169,171,173,175,177,179 -l 15 -d -0.01 -t 0
and saw this traceback:
reweight_osds: changing all osds by weight -0.01 (target 0.0)
check current time: 02:00:11
check current day: 2
get_num_backfilling: PGs currently backfilling: 0
measure_latency: measuring 4kB write latency
Traceback (most recent call last):
File "./ceph-gentle-reweight", line 191, in <module>
main(sys.argv[1:])
File "./ceph-gentle-reweight", line 186, in main
reweight_osds(drain_osds, max_pgs_backfilling, max_latency, delta_weight, target_weight, test_pool, start_time, end_time, allowed_days, interval, really)
File "./ceph-gentle-reweight", line 90, in reweight_osds
latency = measure_latency(test_pool)
File "./ceph-gentle-reweight", line 46, in measure_latency
latency_ms = 1000*float(latency)
ValueError: could not convert string to float:
This is with the system Python 2.7.5 on CentOS 7.9.
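The empty text after "could not convert string to float:" indicates that float() received an empty string, so the latency measurement produced no output. A minimal sketch of a more tolerant conversion follows; parse_latency_ms is a hypothetical helper, and how the script actually obtains the latency string is not shown in the tracebacks:

```python
def parse_latency_ms(raw):
    """Convert a latency in seconds (as text) to milliseconds.

    Returns None when the text is empty or not a number, so the caller
    can skip the round or report a clear error instead of crashing.
    """
    text = raw.strip()
    if not text:
        return None
    try:
        return 1000 * float(text)
    except ValueError:
        return None
```

A caller could treat None as "latency unknown" and print the command that produced no output, which would make the underlying failure easier to diagnose.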