data's Issues

Project recorded GPS track onto OSM/Halfmile track

There are parts of my recorded GPS track with bad data. For geotagging my photos, it would be good to:

  1. Project my own track onto the OSM track, so that every photo gets a position on the trail.

  2. Compare the iPhone GPS coordinates to coordinates interpolated from the watch track.
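The projection step is a closest-point problem; in practice shapely's `line.project()` / `line.interpolate()` handles it, but the geometry is simple enough to sketch with the stdlib (planar coordinates assumed, so run it on reprojected data):

```python
def project_onto_segment(p, a, b):
    """Return the closest point to p on segment a-b (planar coords)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return a
    # Parameter along the segment, clamped to the segment's endpoints
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def snap_to_line(p, coords):
    """Closest point to p over all segments of a polyline."""
    best = None
    best_d2 = float('inf')
    for a, b in zip(coords, coords[1:]):
        q = project_onto_segment(p, a, b)
        d2 = (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2
        if d2 < best_d2:
            best_d2, best = d2, q
    return best
```

Each photo's recorded position would then be replaced by `snap_to_line(photo_coord, osm_track_coords)`.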

Create slope angle shading?

Looks to be one line of code with GDAL. Then maybe try vectorizing the result, so it can go into the MVT tileset too.
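For reference, this is essentially what `gdaldem slope` computes per cell (GDAL uses Horn's eight-neighbor kernel; the sketch below uses plain central differences on an elevation grid, interior cells only):

```python
from math import atan, degrees, sqrt


def slope_degrees(dem, i, j, cellsize):
    """Slope in degrees at interior cell (i, j) of a 2D elevation grid,
    via central differences. Elevations and cellsize share one unit."""
    dzdx = (dem[i][j + 1] - dem[i][j - 1]) / (2 * cellsize)
    dzdy = (dem[i + 1][j] - dem[i - 1][j]) / (2 * cellsize)
    return degrees(atan(sqrt(dzdx ** 2 + dzdy ** 2)))
```

A 10 m rise over a 10 m cell in one direction gives a 45° slope, which is the kind of threshold slope-angle shading buckets on.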

Cut and upload National Forest maps

I forget where I am with US forest topo maps. In any case, I need to figure out the GDAL commands to cut them into seamless tiles. Do I need to create a metadata.json file for these raster tiles?

Find accurate length of trail

Figure out why the current tracks are shorter than the "known" length of the trail (2653 mi, or 2662 mi to Manning Park). The distances I'm getting are 2D: 2592 mi; 3D: 2607.9 mi.

Reproduce

from shapely.ops import linemerge

import geom
from data_source import Halfmile
from geom import reproject

hm = Halfmile()
trail = hm.trail_full(alternates=False)
merged = linemerge([*trail.geometry])
projected = reproject(merged, geom.WGS84, geom.CA_ALBERS)

Projected 2D length

projected.length
# 4171048.0624210313 m

Haversine

from haversine import haversine

# haversine() returns kilometers by default
dists = []
for a, b in zip(merged.coords, merged.coords[1:]):
    # coords are (lon, lat); haversine expects (lat, lon)
    dist = haversine((a[1], a[0]), (b[1], b[0]))
    dists.append(dist)

sum(dists)
# 4171.3919978317 km

Projected 3D length

from math import sqrt

dists = []
for a, b in zip(projected.coords, projected.coords[1:]):
    x = b[0] - a[0]
    y = b[1] - a[1]
    z = b[2] - a[2]
    dist_m = sqrt(pow(x, 2) + pow(y, 2) + pow(z, 2))
    dists.append(dist_m)

sum(dists)
# 4196977.125911263 m

Create overlay vector tiles

Now that I've figured out how to use (online) vector tiles as an overlay/fill layer in the mobile app (nst-guide/mobile#9), create vector tiles with (all?) the overlays.

Example file sizes:

  • NF boundaries GeoJSON: 17MB
  • NP boundaries GeoJSON: 2MB
  • Wilderness boundaries GeoJSON: 17MB

An mbtiles file with all of the above:

  • 6.5MB including all attributes
  • 3.2MB with attributes stripped (under 10% of the combined GeoJSON size)

So figure out which attributes you need:

I think just an identifier is best, because then these vector tiles never have to be updated.
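The identifier-only idea can be sketched as a GeoJSON preprocessing pass before tiling (the `UNIT_ID` key is a placeholder, not the real field name in these datasets):

```python
def keep_only_id(geojson, id_key='UNIT_ID'):
    """Drop every feature property except a stable identifier, so the
    tiles stay small and never need regeneration when attribute data
    changes. 'UNIT_ID' is a placeholder key name."""
    for feature in geojson['features']:
        props = feature.get('properties') or {}
        feature['properties'] = {id_key: props.get(id_key)}
    return geojson
```

The full attribute table then lives outside the tiles, keyed by that identifier.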

What layers should be combined?

Because the individual layer file sizes are so small, you should probably package as many of them together as you can. So have basically three vector tilesets:

Minimal required:
  • OpenMapTiles: I don't really want to touch this process, except maybe to add contours. The default map shouldn't include non-essential layers.
  • Contours: these could be combined with OMT, but for now keep them separate, because it's not worth the $8 to re-upload all the OMT tiles to S3.
  • Hillshade: this has to stand alone because it's raster.

Anything non-default:

Because the file sizes are small, package up lots of layers into a single non-default tileset.

  • National Park boundaries
  • National Forest boundaries
  • Wilderness boundaries
  • Historical wildfires
  • No-camping boundaries (though maybe these should be separate, since they could be updated more frequently)
  • Land cover types? from national land cover dataset
  • State boundaries?
  • Slope-angle shading? But then the raster would have to be converted to vector.
  • Protected-area database: might be helpful for state-level public lands. But maybe take out the NPS/NF boundaries to avoid redundancy?

Generate no-fire polygons

I.e., use a DEM and find all the areas above 10,000 feet in Yosemite (or whatever the cutoff elevation is). Do that for each area.
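The DEM step reduces to thresholding an elevation grid; polygonizing the resulting mask (e.g. with `rasterio.features.shapes`) would then give the no-fire polygons. A minimal sketch:

```python
def cells_above(dem, threshold_m):
    """Boolean mask of DEM cells at or above a cutoff elevation.
    10,000 ft is about 3048 m."""
    return [[z >= threshold_m for z in row] for row in dem]
```

Running this per management area with that area's elevation cutoff gives one mask (and eventually one polygon set) per area.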

Transit Data Formatting

Now that I've cleaned up the original transit API code, the question is how to format the data to be consumed by the website.

There are naturally two layers: a route/line layer and a stop/point layer. The route should have, at a minimum, properties for:

  • the url of the provider
  • the color of the route. What should the default color be?

The stops should have, at a minimum, a property for:

  • Is it a stop near the PCT or not?

Things to validate:

  • make sure that the color of each route is validated, so that the Mapbox style can use just ["get", "color"] and the value is already a correctly formatted color
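The color validation could look like the sketch below, which normalizes to lowercase hex and falls back to a default when a feed's value is missing or malformed (the default hex value here is an arbitrary placeholder, and the function name is hypothetical):

```python
import re

HEX_COLOR = re.compile(r'^#[0-9a-fA-F]{6}$')


def normalize_color(value, default='#4264fb'):
    """Return a color string safe to reference via ["get", "color"] in a
    Mapbox style, falling back to `default` for bad input."""
    if isinstance(value, str):
        value = value.strip()
        if HEX_COLOR.match(value):
            return value.lower()
        # Some feeds omit the leading '#'
        if HEX_COLOR.match('#' + value):
            return ('#' + value).lower()
    return default
```

Run at tile-generation time, this guarantees the style expression never sees an invalid color.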

Generate information about polygon areas the trail passes through.

I.e. I have a National Parks boundaries dataset. What are interesting ways to connect that data to the trail?

  • Find parks within a buffer (~10 mi?) of the trail. I think this might only add Ross Lake NRA, but it's good anyway.
  • Keep only the National Park identifier in the vector tile. This makes the tiles really small, and they never need to change (so they can be cached longer?). Then link to the attribute data in the browser (how?).
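As a quick first pass before a true polygon buffer/intersection (which shapely would handle more accurately), the ~10 mi filter could just test park centroids against trail vertices with a great-circle distance; both helper names here are hypothetical:

```python
from math import asin, cos, radians, sin, sqrt


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lon, lat) points."""
    lon1, lat1, lon2, lat2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + \
        cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 3958.8 * asin(sqrt(h))


def parks_near_trail(parks, trail_coords, buffer_mi=10):
    """Keep parks whose (name, (lon, lat) centroid) comes within
    buffer_mi of any trail vertex. Crude but dependency-free."""
    return [name for name, centroid in parks
            if any(haversine_miles(centroid, pt) <= buffer_mi
                   for pt in trail_coords)]
```

Anything this pass keeps would still need the real intersection to compute lengths and entry/exit mileposts.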

Attributes:

  • Length of PCT within park
  • Mile start/end of PCT going north/south
  • Link to Wikipedia? Can get description, main photo,
  • ...

Split wilderness multipolygons into polygons before intersecting with trail

A single wilderness area may be stored as one MultiPolygon. So when I intersect wilderness areas with the trail, I see polygons way out in Nevada that aren't connected to California.

Instead, first split the multipolygons into individual polygons (keeping the attributes), then intersect. It might even be better to do two steps: first intersect with the multipolygons, then split into polygons, because then there's less data in the reshape.
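The split step, operating directly on GeoJSON features (shapely's `geom.geoms` would do the same on geometry objects), could be sketched as:

```python
def explode_multipolygons(features):
    """Split each GeoJSON MultiPolygon feature into one Polygon feature
    per part, copying the attributes, so a later intersection can drop
    disjoint pieces (e.g. the Nevada polygons) individually."""
    out = []
    for f in features:
        geom = f['geometry']
        if geom['type'] == 'MultiPolygon':
            for poly_coords in geom['coordinates']:
                out.append({
                    'type': 'Feature',
                    'properties': dict(f['properties']),
                    'geometry': {'type': 'Polygon',
                                 'coordinates': poly_coords},
                })
        else:
            out.append(f)
    return out
```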

Tile Generation

Thoughts for generating raster tiles

Tile Sources

  • FS Topo
  • USGS topo/National Map

Generating list of tiles

I want to generate a list of tiles that are within x miles of the trail. Given a GeoJSON file of a buffer polygon, use supermercado:

cat buffer.geojson | supermercado burn 10

If you want to see the tiles on geojson.io, pipe the list of tiles back through mercantile:

cat buffer.geojson | supermercado burn 10 | mercantile shapes --collect > tiles.geojson
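The tile indices these tools emit come from the standard Web Mercator (slippy map) scheme; mercantile does this math, but it's short enough to sketch:

```python
from math import asinh, floor, pi, radians, tan


def lnglat_to_tile(lng, lat, zoom):
    """Web Mercator tile containing a (lng, lat) point: x from a linear
    mapping of longitude, y from the Mercator projection of latitude."""
    n = 2 ** zoom
    x = floor((lng + 180.0) / 360.0 * n)
    y = floor((1.0 - asinh(tan(radians(lat))) / pi) / 2.0 * n)
    return x, y, zoom
```

Running this over the buffer polygon's covered points at z10 reproduces the `burn 10` output.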

Rasters to tiles

gdal2tiles.py: https://gdal.org/programs/gdal2tiles.html

Hosting tiles:

https://github.com/vincentsarago/lambda-tiler

References:

Supermercado: https://github.com/mapbox/supermercado
Mercantile: https://github.com/mapbox/mercantile

Generate no camping polygons for the Sierra

Take two miles around the trail, starting with just Yosemite. Find all water-body features in that buffer and generate a 100 ft buffer around each.

Also a 4 mile buffer around Tuolumne Meadows and Yosemite Valley.

Create style-optimized basemap vector tiles

I tested out https://github.com/mapbox/vtshaver on the current OSM Liberty style.json, and it took 14/2525/*.pbf from 328KB to 139KB.

The size of the whole 14/**/*.pbf zoom level for CA, OR, and WA is 2.93GB, so if this ratio held for the entire area, that could go down to 1.2GB! It would be nothing to download everything within 20 miles of the trail.

Sentinel Imagery

Imagery is collected in the Military Grid Reference System. Here's a KML you can use to find imagery intersections with the trail.

The Sentinel AWS Readme lists how the file structure is set up. For example, the UTM grid square that intersects Glacier Peak, WA on July 22, 2019 is here:

> aws s3 ls s3://sentinel-s2-l2a/tiles/10/U/FU/2019/7/22/0/ --request-payer
                           PRE R10m/
                           PRE R20m/
                           PRE R60m/
                           PRE auxiliary/
                           PRE qi/
2019-07-22 21:15:40     553239 metadata.xml
2019-07-22 21:15:49       1035 productInfo.json
2019-07-22 21:15:40       1491 tileInfo.json
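A sketch of building that key prefix from its parts, matching the layout in the listing above (month and day are not zero-padded in this bucket):

```python
from datetime import date


def sentinel_tile_prefix(utm_zone, lat_band, grid_square, d, sequence=0):
    """S3 key prefix for a Sentinel-2 L2A scene:
    tiles/{zone}/{band}/{square}/{year}/{month}/{day}/{sequence}"""
    return (f'tiles/{utm_zone}/{lat_band}/{grid_square}/'
            f'{d.year}/{d.month}/{d.day}/{sequence}')
```

For example, `sentinel_tile_prefix(10, 'U', 'FU', date(2019, 7, 22))` reproduces the Glacier Peak path shown above.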

The tileInfo.json has helpful information, including a GeoJSON representation of the square's geometry (in UTM coordinates, not WGS84), cloudyPixelPercentage (!!), and the path to the image.

{
  "path" : "tiles/10/U/FU/2019/7/22/0",
  "timestamp" : "2019-07-22T19:11:13.752Z",
  "utmZone" : 10,
  "latitudeBand" : "U",
  "gridSquare" : "FU",
  "datastrip" : {
    "id" : "S2A_OPER_MSI_L2A_DS_SGS__20190722T231540_S20190722T190551_N02.13",
    "path" : "products/2019/7/22/S2A_MSIL2A_20190722T185921_N0213_R013_T10UFU_20190722T231540/datastrip/0"
  },
  "tileGeometry" : {
    "type" : "Polygon",
    "crs" : {
      "type" : "name",
      "properties" : {
        "name" : "urn:ogc:def:crs:EPSG:8.8.1:32610"
      }
    },
    "coordinates" : [ [ [ 600000.0, 5400000.0 ], [ 709800.0, 5400000.0 ], [ 709800.0, 5290200.0 ], [ 600000.0, 5290200.0 ], [ 600000.0, 5400000.0 ] ] ]
  },
  "tileDataGeometry" : {
    "type" : "Polygon",
    "crs" : {
      "type" : "name",
      "properties" : {
        "name" : "urn:ogc:def:crs:EPSG:8.8.1:32610"
      }
    },
    "coordinates" : [ [ [ 600001.0, 5399999.0 ], [ 709799.0, 5399999.0 ], [ 709799.0, 5290201.0 ], [ 600001.0, 5290201.0 ], [ 600001.0, 5399999.0 ] ] ]
  },
  "tileOrigin" : {
    "type" : "Point",
    "crs" : {
      "type" : "name",
      "properties" : {
        "name" : "urn:ogc:def:crs:EPSG:8.8.1:32610"
      }
    },
    "coordinates" : [ 600000.0, 5400000.0 ]
  },
  "dataCoveragePercentage" : 100.0,
  "cloudyPixelPercentage" : 0.0,
  "productName" : "S2A_MSIL2A_20190722T185921_N0213_R013_T10UFU_20190722T231540",
  "productPath" : "products/2019/7/22/S2A_MSIL2A_20190722T185921_N0213_R013_T10UFU_20190722T231540"
}

If you look into a given folder for (utm square / year / month / day / snapshot number / resolution), you'll see this:

> aws s3 ls s3://sentinel-s2-l2a/tiles/10/U/EU/2019/7/22/0/R60m/ --request-payer
2019-07-22 21:07:22     512645 AOT.jp2
2019-07-22 21:07:22    2840382 B01.jp2
2019-07-22 21:07:22    3182126 B02.jp2
2019-07-22 21:07:22    3344222 B03.jp2
2019-07-22 21:07:22    3297405 B04.jp2
2019-07-22 21:07:22    3529444 B05.jp2
2019-07-22 21:07:22    3755698 B06.jp2
2019-07-22 21:07:22    3736390 B07.jp2
2019-07-22 21:07:22    3341917 B08.jp2
2019-07-22 21:07:22    3750817 B09.jp2
2019-07-22 21:07:22    3707894 B11.jp2
2019-07-22 21:07:22    3553773 B12.jp2
2019-07-22 21:07:22    3727852 B8A.jp2
2019-07-22 21:07:22     317802 SCL.jp2
2019-07-22 21:07:22    3711655 TCI.jp2
2019-07-22 21:07:22    3152856 WVP.jp2

Each of the B*.jp2 files is a different band; Wikipedia explains which band is which. Easiest for my purposes, though, is TCI.jp2, which I believe stands for True Color Image. At 10 m resolution, each file is around 129MB.

If you download a few images and run gdal2tiles.py --srcnodata="0,0,0" *.jp2 tiles, you can get an image like this:

image

--srcnodata="0,0,0" means that all fully black areas become transparent, as seen above. Because of the satellites' paths, it won't always be simple to find contiguous scenes to display.

Note that Lambda isn't really a great place for this processing, since it has a scratch disk space limit of 500MB.

Note that even while signed in to the AWS CLI, you need to add --request-payer to do anything with a requester-pays bucket, e.g.:

aws s3 ls s3://sentinel-s2-l2a/ --request-payer

Compute average slope for trail

Output (values are in meters):

total elevation change is: 346781.89000000095
total distance is: 4171048.0624210313
average slope is: 0.08314022874114661

The script:

from math import sqrt

from shapely.ops import linemerge

import geom
from data_source import Halfmile


def main():
    hm = Halfmile()
    gdf = hm.trail_full(False)
    # Reproject to meters
    gdf = geom.reproject_gdf(gdf, geom.WGS84, geom.CA_ALBERS)

    line = linemerge(gdf.geometry.values)

    # Calculate horizontal distance and vertical distance for each segment of
    # the data:
    dist_list, dz_list = calculate_dist_and_elev_change(line)
    total_elevation_change = sum([abs(x) for x in dz_list])
    total_distance = sum(dist_list)
    average_slope = total_elevation_change / total_distance

    print(f'total elevation change is: {total_elevation_change}')
    print(f'total distance is: {total_distance}')
    print(f'average slope is: {average_slope}')


def calculate_dist_and_elev_change(line):
    dist_list = []
    dz_list = []
    for coord, next_coord in zip(line.coords, line.coords[1:]):
        dx = next_coord[0] - coord[0]
        dy = next_coord[1] - coord[1]
        dz = next_coord[2] - coord[2]
        dist = sqrt(dx ** 2 + dy ** 2)
        dist_list.append(dist)
        dz_list.append(dz)

    return dist_list, dz_list


if __name__ == '__main__':
    main()

Get peaks within distance of trail

This can definitely use a larger buffer than normal. For a buffer of, say, 10 miles around the trail, find every OSM point (just points?) tagged as a peak or a pass. Many of these will already be linked to Wikipedia; for the others, you can do a geosearch at that point, for that title, with a radius of around 100-300 m.
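The geosearch could hit the MediaWiki API; here's a sketch that just builds the query URL (endpoint and parameter names follow the public MediaWiki geosearch module; the function name is hypothetical):

```python
from urllib.parse import urlencode


def geosearch_url(lat, lng, radius_m=300):
    """Build a MediaWiki geosearch query URL for articles near a
    peak's coordinates."""
    params = {
        'action': 'query',
        'list': 'geosearch',
        'gscoord': f'{lat}|{lng}',   # pipe-separated lat|lon
        'gsradius': radius_m,
        'gslimit': 10,
        'format': 'json',
    }
    return 'https://en.wikipedia.org/w/api.php?' + urlencode(params)
```

Fetching that URL returns candidate article titles, which could then be matched against the OSM peak's name tag.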
