ntt-sic / masakari
[UNMAINTAINED]
Home Page: https://wiki.openstack.org/wiki/Masakari
License: Apache License 2.0
Hi masakari team.
Thank you for this project and your support.
We have an operational issue with this project.
Some information:
OpenStack version: Newton
environment:
1 controller node
2 compute nodes
We cloned this project, built it, and successfully ran masakari-controller on the controller node and masakari-instancemonitor on the compute nodes.
Some failed test scenarios are listed below:
Additionally, we installed masakari-hostmonitor on the controller. When we destroy compute1, masakari-hostmonitor logs that compute1 is offline, but masakari-controller is not informed of this event, and in the end none of the instances are recovered on the other compute node.
Thank you for your attention.
Vahid
All 4 processes use different namespaces for their directories and process names. All processes should live under the masakari/* namespace to make maintenance and operation easier.
Hi!
Today I tried to install Masakari on CentOS 7 with OpenStack Kilo, and I got some ImportErrors.
Are there requirements for the Python and OpenStack versions this program supports?
While creating the database with:
openstack@openstack-VirtualBox:~/masakari/masakari_controller/db$ sudo ./create_database.sh
/home/openstack/masakari/masakari_controller/db
/home/openstack/masakari
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/openstack/masakari/masakari_controller/db/create_tables.py", line 22, in <module>
from sqlalchemy_utils.functions import database_exists, create_database
ImportError: No module named sqlalchemy_utils.functions
we get an import error for the sqlalchemy_utils module. It should be listed in masakari-controller/requirements.txt and installed as a dependency.
This is quite obvious and may not be considered a bug as such. But I would like to point out here that requiring a specific 'user name' to be present for package installation to be successful doesn't seem like a good idea.
IMO the following two approaches look better:
In our case the 2nd approach seems more appropriate. For example, I am running OpenStack as a user named 'ubuntu', so naturally I should be able to run masakari as the user 'ubuntu'.
EDIT: I think what we want here is to have the same user running openstack and masakari. Hence while installing masakari we should be able to 'configure' the user. As a result of this someone running openstack as 'abc' user will be able to configure/install/use masakari via the user 'abc'.
While setting up and running Masakari for the first time, the instance monitor failed to start without generating any log files. Running the service's Python files directly from the source prints the errors to the console. There should be a way to report such errors directly in the instancemonitor.log file even if the service fails to start.
service masakari-instancemonitor start
Starting masakari-instancemonitor: *
service masakari-instancemonitor status
* masakari-instancemonitor is not running
python /opt/masakari/instancemonitor/masakari_instancemonitor.py
Traceback (most recent call last):
File "/opt/masakari/instancemonitor/masakari_instancemonitor.py", line 29, in <module>
import libvirt_eventfilter as evf
File "/opt/masakari/instancemonitor/libvirt_eventfilter.py", line 29, in <module>
from libvirt_callback import *
File "/opt/masakari/instancemonitor/libvirt_callback.py", line 223, in <module>
from httplib2 import Http
ImportError: No module named httplib2
pip install httplib2
service masakari-instancemonitor start
Result: SUCCESS
* masakari-instancemonitor is running
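One way to get such startup failures into the log file is to configure logging before the heavy imports and wrap them in a try/except. The following is only a minimal sketch of that idea, not the project's actual startup code: the function name start_with_logging, the logger name, and the log format are assumptions made for illustration.

```python
import importlib
import logging


def start_with_logging(module_names, log_path):
    """Import the modules a service needs, logging any failure to log_path.

    Hypothetical helper: returns True if every import succeeded,
    False if one of them raised ImportError.
    """
    logger = logging.getLogger("masakari-instancemonitor")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    try:
        for name in module_names:
            importlib.import_module(name)
    except ImportError:
        # logger.exception writes the message plus the full traceback
        # to the log file instead of only to the console.
        logger.exception("Service failed to start")
        return False
    finally:
        # Flush and detach the handler so repeated calls do not
        # duplicate log lines.
        handler.close()
        logger.removeHandler(handler)
    return True
```

With this in place, a missing httplib2 would leave an "ImportError: No module named httplib2" traceback in the log file rather than a silent "is not running" status.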
The character "-" is invalid in the name of a Python package. We get an import error while creating the database:
openstack@openstack_3:~/masakari/masakari-controller/db$ ./create_database.sh
/home/openstack/masakari/masakari-controller/db
/home/openstack/masakari
/usr/bin/python: No module named masakari-controller.db
Also, masakari-controller does not contain an __init__.py, so it is not recognized as a package.
IMO directory structure should be like:
masakari-controller
`---controller
`---__init__.py and other py files
masakari-instancemonitor
`---instance_monitor
`---__init__.py and other py files
and so on.
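The naming problem can be checked mechanically: every dot-separated part of a module path has to be a valid Python identifier for the import statement to accept it. The sketch below is an illustration only, written for Python 3; the helper name is_importable_name is hypothetical and is not part of Masakari.

```python
import keyword


def is_importable_name(dotted_name):
    """Return True if every part of a dotted module path is a valid,
    non-keyword Python identifier (so `import <dotted_name>` can parse)."""
    parts = dotted_name.split(".")
    return all(
        part.isidentifier() and not keyword.iskeyword(part)
        for part in parts
    )


# The hyphenated directory name from the report is rejected,
# while an underscore variant is accepted.
print(is_importable_name("masakari-controller.db"))   # False
print(is_importable_name("masakari_controller.db"))   # True
```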
hostmonitor hits corosync's scaling limits because it relies on the hosts listed under "Online" in the crm_mon command's output to check whether each compute host is working or not. Pacemaker-remote, which does not rely on a corosync connection, has been available since pacemaker-1.1.11. If hostmonitor also checked each host's status through Pacemaker-remote, it would no longer be bound by corosync's limits.
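As a rough illustration of the proposal, the sketch below collects both cluster nodes and Pacemaker-remote nodes from crm_mon text output. It assumes the commonly seen line shapes "Online: [ ... ]" and "RemoteOnline: [ ... ]"; the exact output varies between Pacemaker versions, and the function name parse_online_nodes is hypothetical. It does not reflect how hostmonitor actually parses the output.

```python
import re

# Assumed line format: an optional leading "*" bullet, a label,
# and a bracketed, space-separated list of host names.
_NODE_LINE = re.compile(r"^\s*\*?\s*(Online|RemoteOnline):\s*\[(.*?)\]\s*$")


def parse_online_nodes(crm_mon_output):
    """Return {"Online": set, "RemoteOnline": set} from crm_mon text."""
    nodes = {"Online": set(), "RemoteOnline": set()}
    for line in crm_mon_output.splitlines():
        match = _NODE_LINE.match(line)
        if match:
            label, hosts = match.group(1), match.group(2)
            nodes[label].update(hosts.split())
    return nodes


sample = """\
Online: [ compute1 compute2 ]
RemoteOnline: [ compute3 ]
OFFLINE: [ compute4 ]
"""
status = parse_online_nodes(sample)
# A host counts as alive if it appears in either list.
alive = status["Online"] | status["RemoteOnline"]
```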
(1) Prevent VMHA from being re-processed after the recovery controller restarts.
(2) Implement a DB lock to avoid race conditions.
(3) Fix the Nova API response handling, since the API response differs between Juno and Kilo.
Hi, I have read your documentation and would like to use Masakari, but I'm having trouble finding a step-by-step guide or other documentation to get started with. Which parts should be installed on the controller, which on the compute nodes, and what are the prerequisites for installing Masakari? I have installed corosync and pacemaker; what else do I need to do?
A notification from a monitoring process to the controller is identified by a pattern of values in 'rscGroup', 'EVENT_ID' and 'DETAIL'. This is not developer friendly and is hard to maintain. We should redefine which events are sent to the controller and rewrite each event as a readable string or similar.
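One possible shape for such a redefinition is an enumeration of readable event names plus a translation table for the old field patterns. Everything in this sketch is hypothetical: the event names, the placeholder field values in LEGACY_PATTERNS, and the function to_event_type are invented for illustration and do not correspond to the values Masakari actually sends.

```python
from enum import Enum


class EventType(Enum):
    """Hypothetical readable event names."""
    HOST_DOWN = "host_down"
    INSTANCE_CRASHED = "instance_crashed"
    PROCESS_STOPPED = "process_stopped"


# Placeholder (rscGroup, EVENT_ID, DETAIL) patterns. The real values
# are not given in this report, so these tuples are made up.
LEGACY_PATTERNS = {
    ("group-a", "event-1", "detail-1"): EventType.HOST_DOWN,
    ("group-b", "event-2", "detail-1"): EventType.INSTANCE_CRASHED,
    ("group-c", "event-3", "detail-2"): EventType.PROCESS_STOPPED,
}


def to_event_type(notification):
    """Map a legacy notification dict to an EventType, or None if unknown."""
    key = (
        notification.get("rscGroup"),
        notification.get("EVENT_ID"),
        notification.get("DETAIL"),
    )
    return LEGACY_PATTERNS.get(key)
```

Keeping the translation table in one place would let the monitors switch to sending the readable names while the controller still accepts the old pattern during a transition.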
The controller should use the Python clients, since they track Nova API changes and can be configured for any Keystone domain and project model. Currently, the controller uses curl commands to call the OpenStack APIs. That approach does not generalize across the Keystone project (tenant) models and Nova API versions that Masakari users would deploy.
Hello Masakari team! Can you help me? I installed Masakari on CentOS 9 with OpenStack Zed. When I run the "openstack segment list" command, I get a 'segment' when I should get an empty list. I previously deployed Masakari on Ubuntu 22 with OpenStack Yoga and it worked. What is the problem, and how can I fix it?
On CentOS and Ubuntu, the process path of the instance_monitor is different.
The rpm package installs the script as /usr/bin/masakari-instancemonitor, and the 'ps' output shows something similar.
We have to make this consistent between the rpm and deb packaging.
Add a Python installer (setup.py) to make deployment easy on any OS. The deb package is only supported on some Linux distributions, such as Debian and Ubuntu. Python works across platforms, which would let users deploy Masakari easily.
I've just recently installed Masakari into our OpenStack environment and have started to notice false notifications from masakari-hostmonitor. When masakari-controller receives the false notification, it disables nova-compute on the node and attempts a migration. The migration eventually fails with:
Compute service of node-13.local is still in use
This leaves me with two problems:
I know that corosync/pacemaker are not reporting the node as down because fencing never takes place. There is no sign of any attempt to fence the node and no failed resources are shown when running crm status.
Here's the masakari-controller log after receiving a false notification:
2016-11-04 07:49:54.436 30958 INFO controller.masakari_util [Thread:notification_list(a0752992-61f9-447e-b9bf-5d4099d09be9)] Do update_notification_list_dict.
2016-11-04 07:49:54.438 30958 INFO controller.masakari_util [Thread:notification_list(a0752992-61f9-447e-b9bf-5d4099d09be9)] Succeeded in update_notification_list_dict.
2016-11-04 07:49:54.729 30958 INFO controller.masakari_worker [Thread:vm_list(52)] Do get_vm_list_by_id.
2016-11-04 07:49:54.732 30958 INFO controller.masakari_worker [Thread:vm_list(52)] Succeeded in get_vm_list_by_id. Return_value = (0L, 'node-12.local')
2016-11-04 07:49:54.732 30958 INFO controller.masakari_util [Thread:vm_list(52)] Call Evacuate API with c332baca-ca06-43d1-afb2-543b2d483feb to node-12.local
2016-11-04 07:49:54.739 30958 INFO controller.masakari_worker [Thread:vm_list(55)] Do get_vm_list_by_id.
2016-11-04 07:49:54.743 30958 INFO controller.masakari_worker [Thread:vm_list(55)] Succeeded in get_vm_list_by_id. Return_value = (0L, 'node-12.local')
2016-11-04 07:49:54.743 30958 INFO controller.masakari_util [Thread:vm_list(55)] Call Evacuate API with 18d95395-3a83-41ab-a5f2-2c49ef59a4d5 to node-12.local
2016-11-04 07:49:54.759 30958 INFO controller.masakari_worker [Thread:vm_list(46)] Do get_vm_list_by_id.
2016-11-04 07:49:54.761 30958 INFO controller.masakari_worker [Thread:vm_list(46)] Succeeded in get_vm_list_by_id. Return_value = (0L, 'node-12.local')
2016-11-04 07:49:54.761 30958 INFO controller.masakari_util [Thread:vm_list(46)] Call Evacuate API with 89419892-1ead-4150-8dc4-2906e86cc1d9 to node-12.local
2016-11-04 07:49:54.782 30958 INFO controller.masakari_worker [Thread:vm_list(58)] Do get_vm_list_by_id.
2016-11-04 07:49:54.783 30958 INFO controller.masakari_worker [Thread:vm_list(58)] Succeeded in get_vm_list_by_id. Return_value = (0L, 'node-12.local')
2016-11-04 07:49:54.783 30958 INFO controller.masakari_util [Thread:vm_list(58)] Call Evacuate API with 143345a1-7d68-4f1b-8ebb-c21b1b6df046 to node-12.local
2016-11-04 07:49:54.804 30958 ERROR controller.masakari_util [Thread:vm_list(52)] Fails to call Instance Evacuate API onto node-12.local: Compute service of node-13.local is still in use. (HTTP 400) (Request-ID: req-1e41c332-c3f1-4798-b3db-881dab855f23)
2016-11-04 07:49:54.804 30958 ERROR controller.masakari_worker [Thread:vm_list(52)] <class 'novaclient.exceptions.BadRequest'>
2016-11-04 07:49:54.804 30958 ERROR controller.masakari_worker [Thread:vm_list(52)] Compute service of node-13.local is still in use. (HTTP 400) (Request-ID: req-1e41c332-c3f1-4798-b3db-881dab855f23)
2016-11-04 07:49:54.804 30958 ERROR controller.masakari_worker [Thread:vm_list(52)] File "/opt/masakari/controller/masakari_worker.py", line 209, in _do_node_accident_vm_recovery
self.rc_util_api.do_instance_evacuate(uuid, evacuate_node)
2016-11-04 07:49:54.805 30958 ERROR controller.masakari_worker [Thread:vm_list(52)] File "/opt/masakari/controller/masakari_util.py", line 744, in do_instance_evacuate
on_shared_storage=True)
2016-11-04 07:49:54.805 30958 ERROR controller.masakari_worker [Thread:vm_list(52)] File "/opt/masakari/masakari_ve/local/lib/python2.7/site-packages/novaclient/api_versions.py", line 402, in substitution
return methods[-1].func(obj, *args, **kwargs)
2016-11-04 07:49:54.805 30958 ERROR controller.masakari_worker [Thread:vm_list(52)] File "/opt/masakari/masakari_ve/local/lib/python2.7/site-packages/novaclient/v2/servers.py", line 1744, in evacuate
body)
2016-11-04 07:49:54.805 30958 ERROR controller.masakari_worker [Thread:vm_list(52)] File "/opt/masakari/masakari_ve/local/lib/python2.7/site-packages/novaclient/v2/servers.py", line 1856, in _action_return_resp_and_body
return self.api.client.post(url, body=body)
2016-11-04 07:49:54.805 30958 ERROR controller.masakari_worker [Thread:vm_list(52)] File "/opt/masakari/masakari_ve/local/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 222, in post
return self.request(url, 'POST', **kwargs)
2016-11-04 07:49:54.806 30958 ERROR controller.masakari_worker [Thread:vm_list(52)] File "/opt/masakari/masakari_ve/local/lib/python2.7/site-packages/novaclient/client.py", line 117, in request
raise exceptions.from_response(resp, body, url, method)
2016-11-04 07:49:54.806 30958 INFO controller.masakari_util [Thread:vm_list(52)] Do update_vm_list_by_id_dict.
2016-11-04 07:49:54.810 30958 INFO controller.masakari_util [Thread:vm_list(52)] Succeeded in update_vm_list_by_id_dict.
2016-11-04 07:49:54.810 30958 INFO controller.masakari_worker [Thread:vm_list(52)] Recovery process has been terminated abnormally.
Masakari currently uses MySQLdb to access its DB. Some environments and distributions use not only MySQL but also PostgreSQL or other databases. To enable Masakari to work with other databases, we need to replace MySQLdb with SQLAlchemy.
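With SQLAlchemy, the backend is chosen through a connection URL of the form dialect+driver://user:password@host:port/database, so supporting another database mostly means building a different URL from configuration. The sketch below shows only that URL construction using the standard library. The function name build_db_url, the DRIVERS table, and the sample database name vm_ha are assumptions for illustration, and the choice of the pymysql and psycopg2 drivers is one option among several.

```python
from urllib.parse import quote_plus

# Assumed mapping from a configured backend name to a SQLAlchemy
# dialect+driver prefix.
DRIVERS = {
    "mysql": "mysql+pymysql",
    "postgresql": "postgresql+psycopg2",
}


def build_db_url(backend, user, password, host, database, port=None):
    """Build a SQLAlchemy-style connection URL for the given backend."""
    if backend not in DRIVERS:
        raise ValueError("unsupported backend: %s" % backend)
    netloc = host if port is None else "%s:%d" % (host, port)
    # quote_plus escapes characters such as "@" or "/" in credentials
    # so they cannot be confused with URL delimiters.
    return "%s://%s:%s@%s/%s" % (
        DRIVERS[backend], quote_plus(user), quote_plus(password),
        netloc, database)


url = build_db_url("mysql", "masakari", "p@ss", "localhost", "vm_ha")
```

The resulting string could then be passed to sqlalchemy.create_engine(url), leaving the rest of the controller code independent of the database in use.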