
hass-data-detective's Introduction


Introduction

The HASS-data-detective package helps you explore and analyse the data in your Home Assistant database. If you are running HASS-data-detective on a machine with Home Assistant installed, it will automatically discover your database and, by default, collect information about the entities in it. See the notebooks directory for examples of using the detective package. If you are on a Raspberry Pi, you should use the official JupyterLab add-on, which includes HASS-data-detective.

Installation on your machine

You can either pip install HASS-data-detective for the latest released version from PyPI, or pip install git+https://github.com/robmarkcole/HASS-data-detective.git --upgrade for the bleeding-edge version from GitHub. Note that, due to the matplotlib dependency, libfreetype6-dev is a requirement on aarch64 platforms (e.g. Raspberry Pi).

Which version to install?

The 3.0 version from PyPI requires the existence of a states_meta table, which is not present in older Home Assistant databases. If you get the error (sqlite3.OperationalError) no such table: states_meta, then you should install the earlier release with pip install HASS-data-detective==2.6
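
If you are unsure which schema you have, you can check for the table directly. A minimal sketch, assuming a SQLite database (the path is hypothetical; point it at your own home-assistant_v2.db):

import sqlite3

conn = sqlite3.connect("/config/home-assistant_v2.db")
tables = {row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")}
conn.close()

# Newer schemas include states_meta; if absent, use HASS-data-detective==2.6
print("states_meta present:", "states_meta" in tables)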

Run with Docker locally

You can use the detective package within a Docker container, so there is no need to install anything on your machine (assuming you already have Docker installed). Note that this will pull the jupyter/scipy-notebook Docker image the first time you run it, but subsequent runs will be much faster. Note that there is no image available for Raspberry Pi.

From the root directory of this repo run:

docker run --rm -p 8888:8888 -e JUPYTER_ENABLE_LAB=yes -v "$PWD":/home/jovyan/work jupyter/scipy-notebook

Follow the URL printed to the terminal, which opens a JupyterLab instance. Open a new terminal in JupyterLab and navigate to the work directory containing setup.py, then run:

~/work$ pip install .

You can now navigate to the notebooks directory and start using the detective package. Note that you can install any package you want from PyPI, but it will not persist when the container restarts.

Try out detective online

You can try out the latest version of detective from PyPI without installing anything. If you click on the 'launch binder' button above, detective will be started in a Docker container online using the BinderHub service. Run the example notebook to explore detective, and use the Upload button to upload your own home-assistant_v2.db database file for analysis. Note that all data is deleted when the container shuts down, so this service is just for trying out detective.

Development (VScode)

  • Create a venv: python3 -m venv venv
  • Activate venv: source venv/bin/activate
  • Install requirements: pip3 install -r requirements.txt
  • Install detective in development mode: pip3 install -e .
  • Install Jupyterlab to run the notebooks: pip3 install jupyterlab
  • Open the notebook at notebooks/Getting started with detective.ipynb

Running tests

  • Install dependencies: pip3 install -r requirements_test.txt
  • Run: pytest tests

Contributors

Big thanks to @balloob and @frenck; check out their profiles!


hass-data-detective's Issues

Warn about `mysql://`

We want users to use mysql+pymysql:// because, with a bare mysql:// URL, SQLAlchemy defaults to a MySQL driver that does not support Python 3.

We should print a warning if we detect this, telling people to configure the URL manually. Maybe we could catch an ImportError?
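
A minimal sketch of such a check, run when the database URL is first seen (the function name is hypothetical, not the package API):

import warnings
from urllib.parse import urlparse

def check_db_url(url: str) -> str:
    # Warn on a bare mysql:// scheme, since SQLAlchemy's default MySQL
    # driver does not support Python 3
    if urlparse(url).scheme == "mysql":
        warnings.warn("Use mysql+pymysql:// instead of mysql://; the default "
                      "MySQL driver does not support Python 3.")
    return url

check_db_url("mysql://user:password@localhost/homeassistant")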

Make graphs accessible and saveable

It would be nice to return the graph objects so people can add their own formatting, such as changing the title. Plot functions should return the graph objects.
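
A sketch of the intended pattern, with a hypothetical plot helper that returns the matplotlib Axes so callers can restyle and save the figure:

import matplotlib.pyplot as plt

def plot_entity(df, entity):
    # Plot one entity column and hand the Axes back to the caller
    fig, ax = plt.subplots()
    df[entity].plot(ax=ax)
    ax.set_title(entity)
    return ax

# ax = plot_entity(sensors_num.data, "sensor.temperature")
# ax.set_title("My own title")
# ax.figure.savefig("temperature.png")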

Add db type as attribute

As pointed out by @balloob: 'Most DBAPIs have built in support for the datetime module, with the noted exception of SQLite. In the case of SQLite, date and time types are stored as strings which are then converted back to datetime objects when rows are returned.' Detective should have the db type as an attribute, allowing for db-type-specific behaviour if required.
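
A sketch of how the attribute could be derived from the connection URL, assuming SQLAlchemy (already a dependency for the engine):

from sqlalchemy.engine.url import make_url

url = make_url("sqlite:////config/home-assistant_v2.db")  # example URL
db_type = url.get_backend_name()  # "sqlite", "mysql", "postgresql", ...

if db_type == "sqlite":
    pass  # e.g. convert the string-encoded datetimes here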

Add export interesting data to csv

CSV is the common format for sharing tabular data. Add a convenience function for exporting interesting data, passing in a list of sensor or binary_sensor entities to export.
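
A minimal sketch of the convenience function (names are hypothetical), assuming the data is already pivoted into a dataframe with one column per entity:

import pandas as pd

def export_to_csv(df: pd.DataFrame, entities: list, path: str = "export.csv") -> None:
    # Write only the requested entity columns, keeping the datetime index
    df[entities].to_csv(path)

# export_to_csv(sensors.data, ["sensor.temperature", "binary_sensor.motion_at_home"])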

Tests

To do: create some tests.

Initialisation

Change initialisation to return True/False indicating whether the connection to the database is OK.
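
A sketch of the proposed behaviour, assuming a trivial test query is run during initialisation (the helper name is hypothetical):

from sqlalchemy import create_engine, text

def connection_ok(url: str) -> bool:
    # Run a trivial query; return False on any connection/driver failure
    try:
        engine = create_engine(url)
        with engine.connect() as conn:
            conn.execute(text("SELECT 1"))
        return True
    except Exception:
        return False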

Install in HASS virtual env?

Not knowing the dependencies and interactions: should this be installed in the same virtual env that HASS has been installed in, a new/separate one, or just globally?

Given it gets installed as a HASS.IO add-on, I'm tending towards the former.

Setup CI

It appears there is a migration from travis-ci.org to something else; it is confusing how to set this up.

DataError: No numeric types to aggregate

db.fetch_all_data()
binary_sensors = BinarySensors(db.master_df)

Returns:

---------------------------------------------------------------------------
DataError                                 Traceback (most recent call last)
<ipython-input-8-14f6c61226db> in <module>
      1 from detective.core import BinarySensors
----> 2 binary_sensors = BinarySensors(db.master_df)

/usr/local/lib/python3.6/dist-packages/detective/core.py in __init__(self, master_df)
    281         # Pivot
    282         binary_df = binary_df.pivot_table(
--> 283             index='last_changed', columns='entity', values='state')
    284 
    285         # Index to datetime

/usr/local/lib/python3.6/dist-packages/pandas/core/frame.py in pivot_table(self, values, index, columns, aggfunc, fill_value, margins, dropna, margins_name)
   5301                            aggfunc=aggfunc, fill_value=fill_value,
   5302                            margins=margins, dropna=dropna,
-> 5303                            margins_name=margins_name)
   5304 
   5305     def stack(self, level=-1, dropna=True):

/usr/local/lib/python3.6/dist-packages/pandas/core/reshape/pivot.py in pivot_table(data, values, index, columns, aggfunc, fill_value, margins, dropna, margins_name)
     85     # if we have a categorical
     86     grouped = data.groupby(keys, observed=False)
---> 87     agged = grouped.agg(aggfunc)
     88     if dropna and isinstance(agged, ABCDataFrame) and len(agged.columns):
     89         agged = agged.dropna(how='all')

/usr/local/lib/python3.6/dist-packages/pandas/core/groupby/groupby.py in aggregate(self, arg, *args, **kwargs)
   4654         axis=''))
   4655     def aggregate(self, arg, *args, **kwargs):
-> 4656         return super(DataFrameGroupBy, self).aggregate(arg, *args, **kwargs)
   4657 
   4658     agg = aggregate

/usr/local/lib/python3.6/dist-packages/pandas/core/groupby/groupby.py in aggregate(self, arg, *args, **kwargs)
   4085 
   4086         _level = kwargs.pop('_level', None)
-> 4087         result, how = self._aggregate(arg, _level=_level, *args, **kwargs)
   4088         if how is None:
   4089             return result

/usr/local/lib/python3.6/dist-packages/pandas/core/base.py in _aggregate(self, arg, *args, **kwargs)
    346         if isinstance(arg, compat.string_types):
    347             return self._try_aggregate_string_function(arg, *args,
--> 348                                                        **kwargs), None
    349 
    350         if isinstance(arg, dict):

/usr/local/lib/python3.6/dist-packages/pandas/core/base.py in _try_aggregate_string_function(self, arg, *args, **kwargs)
    302         if f is not None:
    303             if callable(f):
--> 304                 return f(*args, **kwargs)
    305 
    306             # people may try to aggregate on a non-callable attribute

/usr/local/lib/python3.6/dist-packages/pandas/core/groupby/groupby.py in mean(self, *args, **kwargs)
   1304         nv.validate_groupby_func('mean', args, kwargs, ['numeric_only'])
   1305         try:
-> 1306             return self._cython_agg_general('mean', **kwargs)
   1307         except GroupByError:
   1308             raise

/usr/local/lib/python3.6/dist-packages/pandas/core/groupby/groupby.py in _cython_agg_general(self, how, alt, numeric_only, min_count)
   3970                             min_count=-1):
   3971         new_items, new_blocks = self._cython_agg_blocks(
-> 3972             how, alt=alt, numeric_only=numeric_only, min_count=min_count)
   3973         return self._wrap_agged_blocks(new_items, new_blocks)
   3974 

/usr/local/lib/python3.6/dist-packages/pandas/core/groupby/groupby.py in _cython_agg_blocks(self, how, alt, numeric_only, min_count)
   4042 
   4043         if len(new_blocks) == 0:
-> 4044             raise DataError('No numeric types to aggregate')
   4045 
   4046         # reset the locs in the blocks to correspond to our

DataError: No numeric types to aggregate

Add a describe method

Add a describe method which prints out general information about the database: its size, the number of entities, and how far back the data goes.
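
A sketch of what describe() could report, assuming the pre-3.0 schema where the states table carries entity_id and last_changed (the method and queries are illustrative, not the package API):

def describe(db):
    rows = db.perform_query("SELECT COUNT(*) FROM states").fetchone()[0]
    entities = db.perform_query(
        "SELECT COUNT(DISTINCT entity_id) FROM states").fetchone()[0]
    first, last = db.perform_query(
        "SELECT MIN(last_changed), MAX(last_changed) FROM states").fetchone()
    print(f"{rows} state rows across {entities} entities, from {first} to {last}")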

plot_sensor() not plotting all data

It appears that plot_sensor() only plots a limited amount of data compared to what is shown in the Prophet plot. This could be related to the timestamps, which also look wrong.

Data must be accessed correctly for datetime index

To return a dataframe with a datetime index, it is necessary to access the data attribute with a list, such as:
sensors_binary.data[['binary_sensor.motion_at_home']]

If accessed without a list, a series with a non-datetime index is returned:
sensors_binary.data['binary_sensor.motion_at_home']

We need to add checks that the data is accessed and returned correctly, with a datetime index.
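
A minimal sketch of such a check (the helper name is hypothetical):

import pandas as pd

def ensure_datetime_frame(data: pd.DataFrame, entities) -> pd.DataFrame:
    if isinstance(entities, str):
        entities = [entities]  # force DataFrame access, never a Series
    df = data[entities]
    if not isinstance(df.index, pd.DatetimeIndex):
        df.index = pd.to_datetime(df.index)
    return df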

Make timestamps consistent

The formatting of timestamps varies across the different plot functions. We need to select one format and make it universal.

doesn't accept include

Hello,
I installed Jupyter and homeassistant-notebook, and when I try running the getting-started notebook I get the following error for my YAML file:

YAML tag !include_dir_merge_named is not supported

It also adds an extra config/ directory to whatever is included (example: I use [switch: !include config/switches.yaml] and my switches.yaml file is at homeassistant/config/switches.yaml, but it gets resolved to homeassistant/config/config/switches.yaml, so it won't be found).
I'm not using hass.io; I have Home Assistant running on Ubuntu 18.
Thanks

Create sensor class

Create a sensor superclass, since attributes such as entities are common to the sensor classes. With more experience, the form of this superclass should become more obvious.
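
One possible shape for the superclass (a sketch, not the final design):

class Sensors:
    """Common base: hold the pivoted dataframe and shared attributes."""

    def __init__(self, data):
        self.data = data

    @property
    def entities(self):
        return list(self.data.columns)


class NumericalSensors(Sensors):
    pass  # numeric-only behaviour (correlations, plotting) goes here


class BinarySensors(Sensors):
    pass  # binary-only behaviour goes here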

Make correlations sensible

The calculated correlations can be exactly 1.0 or -1.0, which must be due to insufficient data. Remove correlations computed from insufficient data.
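
pandas supports this directly: DataFrame.corr accepts a min_periods argument, so pairs with too few overlapping observations come back as NaN and can be dropped. A sketch, with the threshold of 30 chosen arbitrarily and df assumed to be the pivoted numeric dataframe:

corrs = df.corr(min_periods=30)  # NaN where fewer than 30 paired observations
corrs = corrs.dropna(how="all").dropna(axis=1, how="all")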

ValueError: Array must be all same time zone

Hi, I'm starting to play around with your example hass-detective notebook in my own environment. I've pulled the master branch and made a few changes to your notebook code to pick up what's in master. When I try to use the NumericalSensors class I get a ValueError exception.

sensors_num = detective.NumericalSensors(parser.master_df)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/opt/conda/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in _convert_listlike(arg, box, format, name, tz)
    302             try:
--> 303                 values, tz = tslib.datetime_to_datetime64(arg)
    304                 return DatetimeIndex._simple_new(values, name=name, tz=tz)

pandas/_libs/tslib.pyx in pandas._libs.tslib.datetime_to_datetime64()

ValueError: Array must be all same time zone

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-5-8918ff9e5054> in <module>()
----> 1 sensors_num = detective.NumericalSensors(parser.master_df)

~/work/HASS-data-detective/detective/core.py in __init__(self, master_df)
    183             index='last_changed', columns='entity', values='state')
    184 
--> 185         sensors_num_df.index = pd.to_datetime(sensors_num_df.index)
    186         sensors_num_df.index = sensors_num_df.index.tz_localize(None)

We hit Daylight Savings Time a few weeks ago, so there's a mix of UTC-5 and UTC-6 in my database. I'm not sure what the right solution is... converting to Unix time, perhaps. The way I managed to fix the problem was to modify NumericalSensors's __init__ to read:

        sensors_num_df.index = pd.to_datetime(sensors_num_df.index, utc=True)
        sensors_num_df.index = sensors_num_df.index.tz_localize(None)

Error when using a MySQL db

0.83.3 - External MySQL db using history.

Jupyter Error:


Successfully connected to database
Error with query: 
            SELECT entity_id, COUNT(*)
            FROM states
            GROUP BY entity_id
            ORDER by 2 DESC
            
Connection error, check your URL
---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py in _execute_context(self, dialect, constructor, statement, parameters, *args)
   1192                         parameters,
-> 1193                         context)
   1194         except BaseException as e:

/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/default.py in do_execute(self, cursor, statement, parameters, context)
    508     def do_execute(self, cursor, statement, parameters, context=None):
--> 509         cursor.execute(statement, parameters)
    510 

OperationalError: no such table: states

The above exception was the direct cause of the following exception:

OperationalError                          Traceback (most recent call last)
<ipython-input-1-b0a168cdf11c> in <module>
      1 from detective.core import db_from_hass_config
----> 2 db = db_from_hass_config()

/usr/local/lib/python3.6/dist-packages/detective/core.py in db_from_hass_config(path, **kwargs)
     16 
     17     url = config.db_url_from_hass_config(path)
---> 18     return HassDatabase(url, **kwargs)
     19 
     20 

/usr/local/lib/python3.6/dist-packages/detective/core.py in __init__(self, url, fetch_entities)
     39             print("Successfully connected to database")
     40             if fetch_entities:
---> 41                 self.fetch_entities()
     42         except:
     43             print("Connection error, check your URL")

/usr/local/lib/python3.6/dist-packages/detective/core.py in fetch_entities(self)
     62             """
     63             )
---> 64         response = self.perform_query(query)
     65         entities = [e[0] for e in list(response)]
     66         print("There are {} entities with data".format(len(entities)))

/usr/local/lib/python3.6/dist-packages/detective/core.py in perform_query(self, query)
     47         """Perform a query, where query is a string."""
     48         try:
---> 49             return self.engine.execute(query)
     50         except:
     51             print("Error with query: {}".format(query))

/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py in execute(self, statement, *multiparams, **params)
   2073 
   2074         connection = self.contextual_connect(close_with_result=True)
-> 2075         return connection.execute(statement, *multiparams, **params)
   2076 
   2077     def scalar(self, statement, *multiparams, **params):

/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py in execute(self, object, *multiparams, **params)
    946             raise exc.ObjectNotExecutableError(object)
    947         else:
--> 948             return meth(self, multiparams, params)
    949 
    950     def _execute_function(self, func, multiparams, params):

/usr/local/lib/python3.6/dist-packages/sqlalchemy/sql/elements.py in _execute_on_connection(self, connection, multiparams, params)
    267     def _execute_on_connection(self, connection, multiparams, params):
    268         if self.supports_execution:
--> 269             return connection._execute_clauseelement(self, multiparams, params)
    270         else:
    271             raise exc.ObjectNotExecutableError(self)

/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py in _execute_clauseelement(self, elem, multiparams, params)
   1058             compiled_sql,
   1059             distilled_params,
-> 1060             compiled_sql, distilled_params
   1061         )
   1062         if self._has_events or self.engine._has_events:

/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py in _execute_context(self, dialect, constructor, statement, parameters, *args)
   1198                 parameters,
   1199                 cursor,
-> 1200                 context)
   1201 
   1202         if self._has_events or self.engine._has_events:

/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py in _handle_dbapi_exception(self, e, statement, parameters, cursor, context)
   1411                 util.raise_from_cause(
   1412                     sqlalchemy_exception,
-> 1413                     exc_info
   1414                 )
   1415             else:

/usr/local/lib/python3.6/dist-packages/sqlalchemy/util/compat.py in raise_from_cause(exception, exc_info)
    263     exc_type, exc_value, exc_tb = exc_info
    264     cause = exc_value if exc_value is not exception else None
--> 265     reraise(type(exception), exception, tb=exc_tb, cause=cause)
    266 
    267 if py3k:

/usr/local/lib/python3.6/dist-packages/sqlalchemy/util/compat.py in reraise(tp, value, tb, cause)
    246             value.__cause__ = cause
    247         if value.__traceback__ is not tb:
--> 248             raise value.with_traceback(tb)
    249         raise value
    250 

/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py in _execute_context(self, dialect, constructor, statement, parameters, *args)
   1191                         statement,
   1192                         parameters,
-> 1193                         context)
   1194         except BaseException as e:
   1195             self._handle_dbapi_exception(

/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/default.py in do_execute(self, cursor, statement, parameters, context)
    507 
    508     def do_execute(self, cursor, statement, parameters, context=None):
--> 509         cursor.execute(statement, parameters)
    510 
    511     def do_execute_no_params(self, cursor, statement, context=None):

OperationalError: (sqlite3.OperationalError) no such table: states [SQL: '\n            SELECT entity_id, COUNT(*)\n            FROM states\n            GROUP BY entity_id\n            ORDER by 2 DESC\n            '] (Background on this error at: http://sqlalche.me/e/e3q8)


HA addon log:


2018/12/14 15:01:12 [error] 607#607: *4 connect() failed (111: Connection refused) while connecting to upstream, client: 172.30.32.1, server: hassio.local, request: "GET /api/sessions?1544817671254 HTTP/1.1", upstream: "http://127.0.0.1:28459/api/sessions?1544817671254", host: "xxxxxx.duckdns.org", referrer: "https://xxxxxx.duckdns.org/lab?"
2018/12/14 15:01:12 [error] 607#607: *5 connect() failed (111: Connection refused) while connecting to upstream, client: 172.30.32.1, server: hassio.local, request: "GET /api/terminals?1544817671256 HTTP/1.1", upstream: "http://127.0.0.1:28459/api/terminals?1544817671256", host: "xxxxxx.duckdns.org", referrer: "https://xxxxxx.duckdns.org/lab?"
[W 15:01:15.262 LabApp] All authentication is disabled.  Anyone who can connect to this server will be able to run code.

Add anonymise and export data

It would be nice to get people sharing their datasets, but they will want to anonymise them first. Add this ability.
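
A sketch of one approach: hash entity IDs to stable pseudonyms before export, keeping the domain so the data stays interpretable (the function and salt are illustrative):

import hashlib

def anonymise_entity(entity_id: str, salt: str = "change-me") -> str:
    # Keep the domain (e.g. binary_sensor) but hash the object id
    domain, _, _ = entity_id.partition(".")
    digest = hashlib.sha256((salt + entity_id).encode()).hexdigest()[:8]
    return f"{domain}.{digest}"

# anonymise_entity("binary_sensor.motion_at_home") -> e.g. "binary_sensor.4be61a73"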

load_hass_config: print warning when stubbing include

We don't support any of the !include tags and replace them with an empty dictionary. We should print a warning so the user knows we're not loading them. That way it makes sense when the recorder URL cannot be found.
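
A minimal sketch of the stubbing plus warning, assuming the config is read with PyYAML's SafeLoader:

import warnings
import yaml

def _stub_include(loader, node):
    # Replace unsupported Home Assistant include tags with an empty dict,
    # but tell the user so missing config (e.g. recorder) makes sense
    warnings.warn(f"YAML tag {node.tag} is not supported; replacing with {{}}")
    return {}

for tag in ("!include", "!include_dir_merge_named", "!include_dir_merge_list"):
    yaml.SafeLoader.add_constructor(tag, _stub_include)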

FileNotFoundError using !include

When running All Cells, I get the following error:

FileNotFoundError: [Errno 2] No such file or directory: '/config/includes/lights/includes/lights/lights.yaml'

The lights.yaml file is in /config/includes/lights, so it seems the folder is added twice.

The relevant config.yaml part

automation: !include_dir_merge_list automation/
hue: !include includes/lights/hue.yaml
light: !include includes/lights/lights.yaml

I also got the message YAML tag !include_dir_merge_list is not supported, but I guess that's not an issue.

Also query events table

Currently detective only queries the states table, but we also want to query the events table. We need to decide how to aggregate the two datasets.
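
A sketch of an events query mirroring the existing per-entity counts from the states table (column names per the standard Home Assistant schema):

query = """
    SELECT event_type, COUNT(*)
    FROM events
    GROUP BY event_type
    ORDER BY 2 DESC
"""
# response = db.perform_query(query)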
