
stravalyse's People

Contributors

felixvanoost, floriecai, jac08h

stravalyse's Issues

Ignore virtual activities when generating a geo data file

Summary
When generating a geo data file from a set of activity data that contains activities of type VirtualRide or VirtualRun, the tool will include these activities in the file. Virtual activities contain valid polylines and time information and are otherwise indistinguishable from non-virtual ones, but they should not be included in the geo data file as they did not take place in the real world.

Steps to reproduce

  1. Run the tool with the command line argument -gu (generate and upload geospatial data) with a set of activity data that contains activities of type VirtualRide or VirtualRun.

Observed behaviour
The tool generates and uploads a geo data file that includes the virtual activities.

Expected behaviour
The tool should generate and upload a geo data file that excludes the virtual activities.
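
A minimal sketch of the filtering described above, assuming the activities are held in a pandas DataFrame with a "type" column. The function name and the sample data are hypothetical and are not taken from the tool itself.

```python
import pandas as pd

# Activity types treated as virtual (the two types named in this issue).
VIRTUAL_TYPES = {"VirtualRide", "VirtualRun"}


def drop_virtual_activities(activities: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of the activities without virtual rides and runs."""
    is_virtual = activities["type"].isin(VIRTUAL_TYPES)
    return activities.loc[~is_virtual].copy()


# Hypothetical sample data, for illustration only.
sample = pd.DataFrame({
    "name": ["Morning ride", "Indoor race", "Treadmill tour"],
    "type": ["Ride", "VirtualRide", "VirtualRun"],
})
real_world = drop_virtual_activities(sample)
print(real_world["name"].tolist())
```

Applying the filter before the polylines are decoded would keep the virtual activities out of the geo data file without touching the stored activity data.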

Remove the use of the shell in subprocess

The tool currently passes shell=True to all subprocess calls, which are used mainly to interface with the HERE CLI. This presents a security risk and causes issues with cross-platform behaviour; the calls should be refactored to avoid using the shell entirely.
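
As a rough sketch of the refactoring, a command can be passed to subprocess.run as a list of arguments so that no shell is involved. The helper name run_command is hypothetical, and the Python interpreter stands in for the HERE CLI here so that the example runs without it installed.

```python
import subprocess
import sys
from typing import Sequence


def run_command(args: Sequence[str]) -> str:
    """Run a command without a shell and return its standard output."""
    completed = subprocess.run(
        list(args),           # argument list: no shell parsing or quoting involved
        capture_output=True,
        text=True,
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


# The Python interpreter is used as a stand-in for the HERE CLI.
output = run_command([sys.executable, "-c", "print('hello')"])
print(output.strip())
```

On Windows, a CLI installed as a .cmd shim may need its full path resolved first (for example with shutil.which) once the shell is no longer used; this is worth verifying on each target platform.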

Ignore indoor activities when generating a geo data file

Summary
When generating a geo data file from a set of activity data that contains activities with polylines that are marked as indoor / trainer (trainer = "true"), the tool will include these activities in the file. Activities recorded with certain third-party apps (e.g. Wahoo Fitness) will still enable the GPS and record coordinates for indoor activities, which causes the corresponding polylines to be non-null. The geospatial data for these activities is typically of exceptionally poor quality and should therefore be excluded from the geo data file.

Steps to reproduce

  1. Run the tool with the command line argument -gu (generate and upload geospatial data) with a set of activity data that contains activities with polylines that are marked as trainer = "true".

Observed behaviour
The tool generates and uploads a geo data file that includes the indoor / trainer activities.

Expected behaviour
The tool should generate and upload a geo data file that excludes the indoor / trainer activities.
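
One possible shape for this check, assuming each activity is available as a dict with a "trainer" key. The helper accepts both a boolean and the string form quoted above, since the exact representation in the stored data is an assumption here.

```python
def is_indoor(activity: dict) -> bool:
    """Return True if the activity is flagged as indoor / trainer."""
    flag = activity.get("trainer", False)
    if isinstance(flag, str):
        # Handle a flag stored as text, e.g. "true".
        return flag.strip().lower() == "true"
    return bool(flag)


# Hypothetical sample data, for illustration only.
activities = [
    {"name": "Lunch run", "trainer": False},
    {"name": "Turbo session", "trainer": "true"},
    {"name": "Rollers", "trainer": True},
]
outdoor = [a for a in activities if not is_indoor(a)]
print([a["name"] for a in outdoor])
```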

Parse formatted tables returned by HERE CLI version 1.1.0+

Summary
The nicely-formatted data tables returned by release 1.1.0 onwards of the HERE CLI contain UTF-8 characters that are not handled by the tool, causing an error when trying to upload a geo data file to HERE XYZ. The tool should be updated to handle the formatting used in newer releases of the HERE CLI.

Steps to reproduce

  1. Ensure that the currently installed HERE CLI is version 1.1.0 or newer using the command here -V in a terminal window.
  2. Run the tool with the command line argument -gu (generate and upload geospatial data).

Observed behaviour
The HERE XYZ upload process will fail with the following traceback:

Traceback (most recent call last):
  File "run.py", line 83, in <module>
    main()
  File "run.py", line 74, in main
    here_xyz.upload_geo_data(STRAVA_GEO_DATA_FILE)
  File "C:\Users\Felix\Documents\GitHub\Strava-Heatmap-Tool\here_xyz.py", line 76, in upload_geo_data
    space_id = _get_space_id()
  File "C:\Users\Felix\Documents\GitHub\Strava-Heatmap-Tool\here_xyz.py", line 35, in _get_space_id
    line = process.stdout.readline()
  File "C:\Users\Felix\Anaconda3\envs\Felix\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 359: character maps to <undefined>

Expected behaviour
The tool should correctly parse the output from newer releases of the HERE CLI and upload the geo data file as normal.
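
The traceback shows the output being decoded with the Windows default codec (cp1252). A minimal sketch of one way around this is to request UTF-8 decoding explicitly when opening the pipe. The child process below is a stand-in for the HERE CLI and the characters it prints are an assumption: it writes a box-drawing border whose last character, U+2510, is encoded as the bytes E2 94 90, and 0x90 has no mapping in cp1252.

```python
import subprocess
import sys

# Child process that writes a UTF-8 table border, standing in for the HERE CLI.
# Escapes keep the child source ASCII-only on every platform.
CHILD_CODE = (
    "import sys; "
    "sys.stdout.buffer.write('\\u250c\\u2500\\u2500\\u2510\\n'.encode('utf-8'))"
)

with subprocess.Popen(
    [sys.executable, "-c", CHILD_CODE],
    stdout=subprocess.PIPE,
    encoding="utf-8",   # decode explicitly rather than with the locale default
) as process:
    line = process.stdout.readline().rstrip("\n")

print(len(line))
```

Whether the HERE CLI always emits UTF-8 is not confirmed here, so passing errors="replace" as well may be a reasonable safeguard.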

Incorrect date parsing causing exception when generating geo data file

Summary
When generating a geo data file from a set of activity data that contains activities with certain names that appear like dates or times, the tool incorrectly converts these names into DateTime objects and a ValueError exception occurs when attempting to write the corresponding activities to the geo data file.

Steps to reproduce

  1. Run the tool with the command line argument -g (generate geo data file) with a set of activity data that contains activities with names that appear like dates or times. Two examples that caused this issue are "name": "10/09/2014" and "name": "Sun sun sun sun sun" (interpreted by the DateTime parser as 'Sunday').

Observed behaviour
The following exception occurs when attempting to generate the geo data file:

Traceback (most recent call last):
  File "strava_analysis_tool.py", line 113, in <module>
    main()
  File "strava_analysis_tool.py", line 90, in main
    geo.export_geo_data_file(config['paths']['geo_data_file'], activity_dataframe)
  File "C:\Users\Felix\Documents\GitHub\Strava-Heatmap-Tool\geo.py", line 114, in export_geo_data_file
    activity_map_geodataframe.to_file(file_path, driver='GeoJSON', encoding='utf8')
  File "C:\Users\Felix\Anaconda3\lib\site-packages\geopandas\geodataframe.py", line 504, in to_file
    to_file(self, filename, driver, schema, **kwargs)
  File "C:\Users\Felix\Anaconda3\lib\site-packages\geopandas\io\file.py", line 130, in to_file
    colxn.writerecords(df.iterfeatures())
  File "C:\Users\Felix\Anaconda3\lib\site-packages\fiona\collection.py", line 342, in writerecords
    self.session.writerecs(records, self)
  File "fiona/ogrext.pyx", line 1195, in fiona.ogrext.WritingSession.writerecs
  File "fiona/ogrext.pyx", line 412, in fiona.ogrext.OGRFeatureBuilder.build
ValueError: Invalid field type <class 'datetime.datetime'>

Expected behaviour
The tool should correctly identify and parse only valid ISO 8601 strings into DateTime objects when reading from the activity data file.
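
A small sketch of strict parsing, assuming timestamps in the activity data use the layout "2014-09-10T07:30:00Z". The format string and the helper name are assumptions for illustration; anything that does not match exactly is left unparsed instead of being guessed at.

```python
from datetime import datetime
from typing import Optional

# Assumed timestamp layout, e.g. "2014-09-10T07:30:00Z".
ISO_8601_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_iso_8601(value) -> Optional[datetime]:
    """Return a datetime for an exact ISO 8601 match, otherwise None."""
    try:
        return datetime.strptime(value, ISO_8601_FORMAT)
    except (TypeError, ValueError):
        return None


print(parse_iso_8601("2014-09-10T07:30:00Z"))
print(parse_iso_8601("10/09/2014"))
print(parse_iso_8601("Sun sun sun sun sun"))
```

Restricting the conversion to known date fields, rather than attempting it on every string value, would also stop activity names from being examined at all.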

Add an option to filter results by time

Add an option to select the timeframe of the results, e.g. last month, last 3 weeks, or April 2019 - September 2019.
This timeframe would then be used to output the summary and graphs.

As the dates are already stored in a pandas DataFrame, filtering them by date should be possible. I'm not sure where the user would input the requested date - another command line argument?

What do you think?

Oh and PS: Thank you for writing this tool! I was hoping to find something similar. I think this app has only scratched the surface of its potential - a GUI, more customizable graphs, you name it (you did, in other issues :) ) - and this could be really, really cool.
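
A rough sketch of the date filtering suggested in this issue, assuming the start dates live in a timezone-naive datetime column. The column name "start_date_local" and the function name are assumptions; the end date is treated as exclusive so that "April 2019 - September 2019" becomes the range 2019-04-01 to 2019-10-01.

```python
import pandas as pd


def filter_by_date(activities, start=None, end=None, column="start_date_local"):
    """Return the activities with start <= date < end (either bound optional)."""
    dates = activities[column]
    mask = pd.Series(True, index=activities.index)
    if start is not None:
        mask &= dates >= pd.Timestamp(start)
    if end is not None:
        mask &= dates < pd.Timestamp(end)
    return activities.loc[mask]


# Hypothetical sample data, for illustration only.
sample = pd.DataFrame({
    "name": ["March ride", "April run", "September ride", "October hike"],
    "start_date_local": pd.to_datetime(
        ["2019-03-15 09:00", "2019-04-01 07:30", "2019-09-30 18:00", "2019-10-02 10:00"]
    ),
})
selected = filter_by_date(sample, start="2019-04-01", end="2019-10-01")
print(selected["name"].tolist())
```

The two bounds could then be exposed as optional command line arguments, with relative ranges such as "last month" converted to explicit dates before filtering.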

Rename the main module to be more descriptive

The current main module name run.py is generic and doesn't provide any useful information about its functionality. It should be renamed to something more descriptive, like strava_analysis_tool.py instead.

Use stream option when uploading a geo data file

The HERE XYZ CLI offers a command-line option to upload GeoJSON files to the server using a streaming method, which results in significantly reduced upload times (3-4x faster) when the geo data file is large (>1000 activities). The tool should use this streaming method by default.

Automatically upload the activity geo data file to the HERE XYZ platform

The tool currently generates a file containing LineStrings and relevant metadata for all Strava activities with geospatial data in GeoJSON format. Using an online mapping platform like HERE XYZ, this file can be used to produce an interactive map of the activities in a similar fashion to the paid 'personal heatmaps' feature on Strava.

HERE XYZ offers both a CLI and an API to automate this (for now) manual uploading process. As an initial step, the CLI should be used to create an XYZ project (if one doesn't already exist) and automatically upload the file after generation.

Check for empty polylines when generating a geo data file

Summary
When generating a geo data file from a set of activity data that contains empty polyline strings, the tool will include the activities with empty polylines (i.e. no geospatial data) in the file. Attempting to upload this file to HERE XYZ via the CLI returns the error coordinates must have at least two elements.

Activities recorded directly with the Strava app and tagged as 'indoor' or 'trainer' have the polyline string value null. However, indoor activities uploaded as FIT files from other devices can cause the polyline string to be empty ("") instead, a case the tool does not currently check for.

Steps to reproduce

  1. Run the tool with the command line argument -gu (generate and upload geospatial data) with a set of activity data that contains one or more empty polyline strings.

Observed behaviour
The tool generates a geo data file that includes the activities with empty polyline strings (no geospatial data):

"geometry": { "type": "LineString", "coordinates": [ ] } },

Uploading this file to HERE XYZ using the command line option -gu will return the following error:

HERE XYZ: Error uploading geospatial data to space ID "xxxxxxxx"

Expected behaviour
The tool should check for the presence of empty polyline strings when generating a geo data file and prevent the corresponding activities from being included.
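
A minimal sketch of such a check, covering both the null and the empty-string case. The "polyline" key, the helper name, and the sample values are hypothetical; the real field name in the activity data may differ.

```python
def has_geo_data(polyline) -> bool:
    """Return True only for non-empty polyline strings (rejects None and "")."""
    return isinstance(polyline, str) and polyline.strip() != ""


# Hypothetical sample data, for illustration only.
activities = [
    {"name": "Evening ride", "polyline": "_p~iF~ps|U_ulLnnqC"},
    {"name": "Strava indoor", "polyline": None},
    {"name": "FIT upload", "polyline": ""},
]
with_geo_data = [a for a in activities if has_geo_data(a["polyline"])]
print([a["name"] for a in with_geo_data])
```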

ValueError with data without commutes

When the activity data contains no commutes, the correct summary is shown but the application then raises a ValueError when generating the commute plots.

*Correct summary output*

Analysis: No commutes found
/home/jh/Strava-Analysis-Tool/venv/lib/python3.8/site-packages/pandas/core/arrays/datetimes.py:1099: UserWarning: Converting to PeriodArray/Index representation will drop timezone information.
  warnings.warn(
Traceback (most recent call last):
  File "strava_analysis_tool.py", line 101, in <module>
    main()
  File "strava_analysis_tool.py", line 93, in main
    analysis.display_commute_plots(activity_dataframe)
  File "/home/jh/SAT/analysis.py", line 331, in display_commute_plots
    _generate_commute_count_plot(commute_data, ax3, colours)
  File "/home/jh/SAT/analysis.py", line 99, in _generate_commute_count_plot
    sns.barplot(x=data.index.to_period('M'),
  File "/home/jh/Strava-Analysis-Tool/venv/lib/python3.8/site-packages/seaborn/categorical.py", line 3147, in barplot
    plotter = _BarPlotter(x, y, hue, data, order, hue_order,
  File "/home/jh/Strava-Analysis-Tool/venv/lib/python3.8/site-packages/seaborn/categorical.py", line 1616, in __init__
    self.establish_colors(color, palette, saturation)
  File "/home/jh/Strava-Analysis-Tool/venv/lib/python3.8/site-packages/seaborn/categorical.py", line 316, in establish_colors
    lum = min(light_vals) * .6
ValueError: min() arg is an empty sequence

Steps to reproduce

  1. $ python strava_analysis_tool.py

Use new HERE platform

Migrate to HERE's new platform and remove calls to the now deprecated 'Data Hub'. The tool should create a catalog and Interactive Mapping Layer (IML) to store all uploaded geospatial activity data.

For now, the tool will continue to use a naive / wasteful uploading approach by always uploading all the activities (even if they already exist in the IML).

Replace usage of the HERE CLI with the xyz-spaces-python package

Until recently, the HERE CLI was the easiest way to upload geospatial data to a HERE XYZ space for viewing. Earlier this year, HERE began development of a native Python library for HERE XYZ, xyz-spaces-python, which should make both development and user installation much easier. The tool should be updated to replace usage of the HERE CLI with xyz-spaces-python with basic feature parity.

Define a consistent colour palette across plots

The tool currently generates plots using a hard-coded and inconsistent set of colours. A colour palette should be defined and used when generating plots to create a consistent visual theme.

Add an option to refresh the activity data

The tool currently updates the activity data file with any new activities, but does not store any changes made to existing ones. This situation is prone to occur whenever an activity is modified through the Strava platform - for instance, when a follower gives kudos or the athlete adds a description - after it has already been stored locally in the file.

To allow the activity data file to always reflect these latest changes, the tool should have an option to 'refresh' the data by wiping the file and re-requesting it from scratch.
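
One way the option could be exposed, sketched with argparse. The flag name --refresh-data matches the option mentioned in a later issue; the program name, help text, and the handling comment are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a command line parser with a refresh flag."""
    parser = argparse.ArgumentParser(prog="strava_analysis_tool.py")
    parser.add_argument(
        "--refresh-data",
        action="store_true",
        help="wipe the local activity data file and request all activities again",
    )
    return parser


args = build_parser().parse_args(["--refresh-data"])
if args.refresh_data:
    # A real implementation would delete the data file and re-request it here.
    print("Refreshing activity data")
```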

Exclude stationary activity types from the mean activity distance plot

When generating a plot of mean activity distance over time, the tool should ignore activity types that are stationary by nature (CrossFit, rock climbing, weight training, workout, and yoga). This prevents the activities from being included in the plot legend and displaying a distance of 0, which provides no useful information and clutters up the plot.

Display information on dataset health

There is currently no easy way to see an overview of the 'health' of the activity dataset (e.g. how many activities are manual or are flagged) or how many activities contain extended sensor data (e.g. heart rate, cadence, or measured power). The tool should display the following information in a similar format to the summary statistics:

  • Number of manual activities
  • Number of flagged activities
  • Number of activities with heart rate data
  • Number of cycling activities with cadence data
  • Number of cycling activities with measured power data
  • Number of activities with temperature data
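
The counts listed above could be gathered along these lines. This is a sketch under assumptions: the field names (manual, flagged, has_heartrate, average_cadence, device_watts, average_temp) are assumed to be present in the stored activity data, and only activities of type "Ride" are counted as cycling.

```python
def dataset_health(activities):
    """Count activities by data-quality flag and by available sensor data."""
    rides = [a for a in activities if a.get("type") == "Ride"]
    return {
        "Manual activities": sum(bool(a.get("manual")) for a in activities),
        "Flagged activities": sum(bool(a.get("flagged")) for a in activities),
        "Activities with heart rate data": sum(bool(a.get("has_heartrate")) for a in activities),
        "Cycling activities with cadence data": sum(a.get("average_cadence") is not None for a in rides),
        "Cycling activities with measured power data": sum(bool(a.get("device_watts")) for a in rides),
        "Activities with temperature data": sum(a.get("average_temp") is not None for a in activities),
    }


# Hypothetical sample data, for illustration only.
sample = [
    {"type": "Ride", "has_heartrate": True, "average_cadence": 85.0,
     "device_watts": True, "average_temp": 18},
    {"type": "Ride", "manual": True},
    {"type": "Run", "flagged": True, "has_heartrate": True, "average_cadence": 80.0},
]
health = dataset_health(sample)
for label, count in health.items():
    print(f"{label}: {count}")
```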

Fix average speed calculation in summary statistics

Fix the incorrect average speed calculation reported in the summary statistics. The current calculation is as follows:

Mean speed (m/s) * 3.6

This is incorrect because it takes the mean of the mean speeds reported for each activity, so a short activity carries the same weight as a long one. It should instead be:

Total distance (m) / total moving time (sec) * 3.6
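
A short worked example of the difference between the two calculations. The field names (distance in metres, moving_time in seconds, average_speed in m/s) and the sample values are assumptions chosen to make the arithmetic easy to follow.

```python
def mean_of_means_kmh(activities):
    """Current calculation: unweighted mean of per-activity mean speeds."""
    return sum(a["average_speed"] for a in activities) / len(activities) * 3.6


def overall_average_kmh(activities):
    """Proposed calculation: total distance divided by total moving time."""
    total_distance = sum(a["distance"] for a in activities)
    total_time = sum(a["moving_time"] for a in activities)
    if total_time == 0:
        return 0.0
    return total_distance / total_time * 3.6


# Hypothetical sample: a fast short-duration ride and a slow long-duration one.
sample = [
    {"distance": 10000, "moving_time": 1000, "average_speed": 10.0},
    {"distance": 10000, "moving_time": 4000, "average_speed": 2.5},
]
print(round(mean_of_means_kmh(sample), 1))    # 22.5
print(round(overall_average_kmh(sample), 1))  # 14.4
```

In this sample, 20 km is covered in 5000 s of moving time, which is 4 m/s or 14.4 km/h, whereas averaging the two mean speeds overstates it as 22.5 km/h.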

Pass geopandas dataframe to HERE XYZ directly

The tool currently generates a .geojson file containing the geospatial activity data and passes it to the add_features_geojson function in the xyz-spaces-python package to upload the data to HERE Studio. This is a holdover from the HERE CLI that was previously being used, with which a file was the only way to upload data to the server.

xyz-spaces-python also allows data to be uploaded directly from a GeoPandas dataframe using the add_features_geopandas function. This saves the tool from generating a .geojson file unnecessarily, which should significantly reduce overall processing and uploading times.

The option to generate a .geojson file should be broken out into a separate command line argument, so that it is only created when requested by the user.

Create a setup guide for HERE XYZ Studio

Create a setup guide for HERE XYZ Studio describing the basic procedure for:

  • Browsing through the data space created by the tool
  • Creating a new project and adding a data space to it
  • Changing the base map style
  • Applying conditional formatting to the line types and colours (e.g. based on activity type)

Implement the use of a tool configuration file

The tool currently relies on a mix of environment variables (for the Strava client ID and secret) and hard-coded values (e.g. file paths) for configuration, which somewhat restricts its flexibility. These configurable values should be imported via a user-modifiable file instead.
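
A minimal sketch of what such a file could look like, using the INI format read by Python's configparser. The format, section names, keys, and values are all assumptions for illustration; the configuration is read from a string here so that the example is self-contained.

```python
import configparser

# Hypothetical configuration contents; a real tool would read these from a file.
EXAMPLE_CONFIG = """
[strava]
client_id = 12345

[paths]
geo_data_file = data/strava_geo_data.geojson
"""

config = configparser.ConfigParser()
config.read_string(EXAMPLE_CONFIG)

client_id = config["strava"].getint("client_id")
geo_data_file = config["paths"]["geo_data_file"]
print(client_id, geo_data_file)
```

Secrets such as the Strava client secret may still be better kept out of a file that could end up under version control.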

Check for an empty DataFrame when generating commute statistics

Summary
When generating commute statistics from a set of activity data that contains no activities marked as a commute on Strava (i.e. only activities with the attribute "commute" = false), the tool displays an empty DataFrame instead of indicating that no commutes are present.

Steps to reproduce

  1. Run the tool with a set of activity data that contains no activities marked as a commute.

Observed behaviour
The tool displays an empty DataFrame:

Commute statistics:

Empty DataFrame
Columns: []
Index: []

Expected behaviour
The tool displays a message indicating that no commutes are present in the activity data:

Analysis: No commutes found
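
A sketch of the check, assuming a boolean "commute" column and a "distance" column in metres. The function name and the wording of the non-empty summary are hypothetical; only the "No commutes found" message comes from this issue.

```python
import pandas as pd


def summarise_commutes(activities: pd.DataFrame) -> str:
    """Return a commute summary, or a message when there are no commutes."""
    commutes = activities[activities["commute"]]
    if commutes.empty:
        return "Analysis: No commutes found"
    total_km = commutes["distance"].sum() / 1000
    return f"Commutes: {len(commutes)}, total distance: {total_km:.1f} km"


# Hypothetical sample data, for illustration only.
no_commutes = pd.DataFrame({"commute": [False, False], "distance": [5000, 20000]})
one_commute = pd.DataFrame({"commute": [True, False], "distance": [5000, 20000]})
print(summarise_commutes(no_commutes))
print(summarise_commutes(one_commute))
```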

Update to Python 3.8

Update the tool to run on Python 3.8 and generate the necessary requirements.txt (for Pip) and environment.yml (for Anaconda) files.

Store activity data using pandas directly

Store the Strava activity data directly from a pandas DataFrame to a file in JSON format using the built-in pandas to_json and read_json methods. This should help reduce loading and processing times, which will better support larger datasets and allow additional metadata (e.g. addresses obtained by reverse geocoding) to be stored without impacting the responsiveness of the tool.
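
A minimal round trip with to_json and read_json, written to a temporary directory so that it leaves nothing behind. The file name, the column names, and the choice of orient="records" are assumptions for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, for illustration only.
activities = pd.DataFrame({
    "name": ["Morning ride", "Lunch run"],
    "distance": [12000, 5000],
})

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "activities.json")
    activities.to_json(path, orient="records")        # write one JSON object per activity
    restored = pd.read_json(path, orient="records")   # read the file back into a DataFrame

print(restored["name"].tolist())
```

Column types are inferred again on reading, so date columns and other non-trivial types are worth checking after a round trip.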

Delete all HERE IML features when refreshing data

Delete all the features (activities) stored in the HERE interactive mapping layer (IML) when the --refresh-data option is selected. This ensures that the activity data stored on HERE is always up to date.
