latent-calendar's Introduction

Hello 👋

My name is Will Dean. My background is in statistics, and I have been working as a data scientist in the transportation and real estate industries.

I enjoy working with geographical data, exploring user behavior, and seeing where they intersect.

The tools I reach for frequently are Python and Docker. I will use libraries like:

  • PyMC
  • geopandas
  • typer

Or will create my own like:

I am no stranger to opening a GitHub Issue, PR, or both 😄

I love to learn and collaborate, so feel free to connect with me or keep up with my blog.

latent-calendar's Issues

Change "vocab" to MultiIndex

Storing the columns in flat wide format limits generalization to more time slots per day.

A MultiIndex would allow pandas support for:

  • summing across
  • pivoting / reshaping

Need to ensure compatibility with sklearn and with DOWHour / segments.
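
As a rough illustration of the two operations listed above, here is a minimal sketch using plain pandas. The level names `day_of_week` and `hour` and the toy data are assumptions for the example, not the library's actual vocab.

```python
import numpy as np
import pandas as pd

# Hypothetical wide-format data: one row per calendar,
# one column per (day_of_week, hour) time slot.
columns = pd.MultiIndex.from_product(
    [range(7), range(24)], names=["day_of_week", "hour"]
)
df_wide = pd.DataFrame(
    np.ones((2, len(columns)), dtype=int),
    index=["a", "b"],
    columns=columns,
)

# Summing across hours within each day: group the column level
# by transposing, grouping on the index level, and transposing back.
per_day = df_wide.T.groupby(level="day_of_week").sum().T

# Reshaping to long format: the column levels become regular columns.
long = df_wide.melt(ignore_index=False)
```

With named levels, adding more slots per day would only change the second level's values rather than the column naming scheme.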

Additional arg types in `stack_segments`

`stack_segments` takes a list of Series but might benefit from accepting a list of DataFrames or Series:

mondays = ...
tuesdays = ...

df_segments = stack_segments([mondays, tuesdays])

weekends = ... 
df_segments_with_weekends = stack_segments([df_segments, weekends])
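
One way the mixed input might be handled is sketched below. The helper name `stack_rows` is hypothetical and this is not the library's `stack_segments` implementation; it assumes each Series is one segment whose name becomes the row label.

```python
import pandas as pd


def stack_rows(segments):
    """Stack Series (one row each) and DataFrames (many rows) into one DataFrame.

    Hypothetical helper for illustration only.
    """
    frames = [
        # A named Series becomes a single-row DataFrame labeled by its name.
        s.to_frame().T if isinstance(s, pd.Series) else s
        for s in segments
    ]
    return pd.concat(frames)


slots = ["slot_0", "slot_1"]  # toy time slots, assumed for the example
mondays = pd.Series([1, 0], index=slots, name="mondays")
tuesdays = pd.Series([0, 1], index=slots, name="tuesdays")
weekends = pd.Series([1, 1], index=slots, name="weekends")

df_segments = stack_rows([mondays, tuesdays])
df_segments_with_weekends = stack_rows([df_segments, weekends])
```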

Add to methodology

More details on the problem formulation and how it relates to other problems

Support for python 3.11 via dataclasses

Importing the package raises an error on Python 3.11.

Potential fix:

  • have TimeLabeller use default_factory
  • migrate away from dataclasses?

>>> import latent_calendar
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.11/site-packages/latent_calendar/__init__.py", line 1, in <module>
    import latent_calendar.extensions
  File "/usr/local/lib/python3.11/site-packages/latent_calendar/extensions.py", line 85, in <module>
    from latent_calendar.plot.core import (
  File "/usr/local/lib/python3.11/site-packages/latent_calendar/plot/__init__.py", line 14, in <module>
    from latent_calendar.plot.core import (
  File "/usr/local/lib/python3.11/site-packages/latent_calendar/plot/core/__init__.py", line 1, in <module>
    from latent_calendar.plot.core.calendar import (
  File "/usr/local/lib/python3.11/site-packages/latent_calendar/plot/core/calendar.py", line 14, in <module>
    from latent_calendar.plot.elements import (
  File "/usr/local/lib/python3.11/site-packages/latent_calendar/plot/elements.py", line 31, in <module>
    @dataclass
     ^^^^^^^^^
  File "/usr/local/lib/python3.11/dataclasses.py", line 1230, in dataclass
    return wrap(cls)
           ^^^^^^^^^
  File "/usr/local/lib/python3.11/dataclasses.py", line 1220, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dataclasses.py", line 958, in _process_class
    cls_fields.append(_get_field(cls, name, type, kw_only))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dataclasses.py", line 815, in _get_field
    raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'latent_calendar.vocab.HourFormatter'> for field hour_formatter is not allowed: use default_factory
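
The first option could look like the following minimal sketch. `HourFormatter` here is a stand-in with a hypothetical field, and `TimeLabeller` is reduced to the one field named in the traceback; only the `default_factory` pattern is the point.

```python
from dataclasses import dataclass, field


@dataclass
class HourFormatter:
    # Hypothetical field, for illustration only.
    midnight_zero: bool = True


@dataclass
class TimeLabeller:
    # Using `hour_formatter: HourFormatter = HourFormatter()` here is what
    # triggers the ValueError above on Python 3.11, because the default
    # instance is unhashable. A factory builds a fresh default per instance.
    hour_formatter: HourFormatter = field(default_factory=HourFormatter)


first = TimeLabeller()
second = TimeLabeller()
```

Each instance gets its own `HourFormatter`, so the defaults compare equal but are not shared.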

Plot block across multiple days

CalendarEvent does not support rectangles that span multiple days.

An example would be a single block from Monday through Friday, 9 to 5.

Generalize time slot sizes

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "hour": np.linspace(0, 24, 100)[:-1]
})

# Like before (floored to the hour)
# 0.75 -> 0, 23.75 -> 23
discretizer = HourDiscretizer(col="hour")
discretizer.transform(df)

# Proposed: floored to the half hour
# 0.75 -> 0.5, 23.75 -> 23.5
discretizer = HourDiscretizer(col="hour", minutes=30)
discretizer.transform(df)

The class below is affected, and the option should be available wherever this functionality is exposed.

class HourDiscretizer(BaseEstimator, TransformerMixin):
    """Discretize the hour column."""

    def __init__(self, col: str = "hour") -> None:
        self.col = col

    def fit(self, X: pd.DataFrame, y=None):
        return self

    def transform(self, X: pd.DataFrame, y=None):
        X[self.col] = (X[self.col] // 1).astype(int)
        self.columns = list(X.columns)
        return X

    def get_feature_names_out(self, input_features=None):
        return self.columns
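
The floor step in `transform` might generalize along these lines. This is only a sketch: the function name `floor_to_slot` is hypothetical, and it assumes `minutes` divides 60 evenly so that slots align with hour boundaries.

```python
def floor_to_slot(hour, minutes: int = 60):
    """Floor a fractional hour to the start of its time slot.

    Hypothetical helper; `minutes` is the slot length.
    """
    if minutes <= 0 or 60 % minutes != 0:
        raise ValueError("minutes must be a positive divisor of 60")
    # Convert to minutes, floor to a whole number of slots, convert back.
    return hour * 60 // minutes * minutes / 60


hourly = floor_to_slot(0.75)                 # 0.0
half_hourly = floor_to_slot(0.75, minutes=30)  # 0.5
late = floor_to_slot(23.75, minutes=30)      # 23.5
```

Because it only uses arithmetic operators, the same expression should also apply element-wise to a pandas Series such as `X[self.col]`. Note that the result is a float rather than the integer produced by the current `transform`.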

Test plots

Create some tests for the plotting outcomes with pytest-mpl

Bayesian `LDA`

Need a fully Bayesian LDA in order to make use of custom priors.

Building off the examples here

# (n_components, n_time_slots)
prior = df_segments.to_numpy()

from latent_calendar import BayesianLatentCalendar

model = BayesianLatentCalendar(prior=prior)
model.fit(df_model)

Link to model document in html rendering

Add badges

Badges for tests, docs, and pypi version
