A node app that synchronizes data from multiple sources to the Places database.
git clone https://github.com/nationalparkservice/places-sync.git
npm install
- Add your sources (a guide for this will be written in the future)
- "Load" "MasterCache"
  - sqlite database (currently)
    - schema:
      - *key text,
      - foreign_key text,
      - *process text,
      - *#source text,
      - hash text,
      - last_updated,
      - data blob,
      - is_removed numeric
- "Load" sourceA
  - Loads sourceA into memory
- "Load" sourceB
  - Loads sourceB into memory
- Get Updates From A since B last updated
- Apply Updates From A to B
- Save Source B
- Write Successful updates to MasterCache
- Close Source A
- Close Source B
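
The order of the steps above can be sketched as a single function. This is only an illustration, not the project's actual code: the `syncAtoB` name, the `steps` object, and the `configs` shape are all hypothetical, and the steps are written as plain synchronous functions to keep the ordering easy to follow (a real implementation would most likely be asynchronous).

```javascript
// Hypothetical sketch of the sync order described above.
// `steps` supplies the five operations; none of these names are taken from places-sync.
function syncAtoB (steps, configs) {
  const masterCache = steps.load(configs.masterCache)
  const sourceA = steps.load(configs.a)
  const sourceB = steps.load(configs.b)
  try {
    // Find what changed in A since B was last updated from it
    const updates = steps.getUpdates(sourceA, sourceB, masterCache)
    steps.applyUpdates(sourceB, updates)
    steps.save(sourceB)
    // Record the updates in the master cache only after B has been saved
    steps.applyUpdates(masterCache, updates)
    steps.save(masterCache)
    return updates
  } finally {
    // Always close both sources, even if a step above throws
    steps.close(sourceA)
    steps.close(sourceB)
  }
}
```

Passing the steps in as an object keeps the sketch independent of any particular source type.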
The "Load" Function
- Some sources (text files, GeoJSON, JSON, etc.) are loaded entirely into memory as a sqlite database
- Other sources (those that are queryable) are loaded "as is"
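
The choice between the two loading paths might look something like the sketch below. It assumes a hypothetical `type` field on each source's configuration, and the list of file-based types (including `csv`) is a guess for illustration, not the project's real list.

```javascript
// Hypothetical set of source types that have to be read in full; the real list may differ.
const FILE_TYPES = new Set(['csv', 'geojson', 'json'])

// Decide how a source should be loaded, based on a hypothetical `type` field.
function loadStrategy (sourceConfig) {
  return FILE_TYPES.has(sourceConfig.type)
    ? 'memory' // read the whole file into an in-memory sqlite database
    : 'direct' // queryable source: connect and query it "as is"
}
```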
The "Get Updates" Function
- Ordered Tasks:
  - Determines the last time source B was updated from source A
  - Pulls all information into memory from source A that was created since the last sync
- Unordered Tasks:
  - Run Ordered Tasks
  - Gets all keys from Source A to determine if anything was deleted since the last sync
  - Gets all keys in the master cache, so we can determine what was previously synchronized from A to B
- Determine changes (getUpdates.js and getUpdates.sql)
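
As a rough illustration of the "determine changes" step, the three inputs gathered above can be compared with plain sets. The real rules live in getUpdates.js and its SQL companion. The function name, the record shape, and in particular the meaning given to "missing" below are assumptions made for this sketch.

```javascript
// Hypothetical classification of changes between source A and the master cache.
// changedInA: records in A created or changed since the last sync, e.g. [{ key: 'a1' }]
// allKeysInA: every key currently in A
// cachedKeys: keys the master cache lists as previously synchronized from A to B
function classifyChanges (changedInA, allKeysInA, cachedKeys) {
  const inA = new Set(allKeysInA)
  const inCache = new Set(cachedKeys)
  const changed = new Set(changedInA.map(record => record.key))
  const changes = []

  for (const record of changedInA) {
    // A key the cache has seen before was edited; otherwise it is new
    changes.push({ key: record.key, action: inCache.has(record.key) ? 'updated' : 'created' })
  }
  for (const key of inA) {
    // Assumed meaning of "missing": in A, never synchronized, and not picked up by the date filter
    if (!inCache.has(key) && !changed.has(key)) changes.push({ key, action: 'missing' })
  }
  for (const key of inCache) {
    // Previously synchronized but no longer present in A
    if (!inA.has(key)) changes.push({ key, action: 'removed' })
  }
  return changes
}
```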
The "Apply Updates" Function
- Accepts the object returned from "Get Updates"
- It creates two "bins"
  - A list of "removes" (anything marked as "removed")
  - A list of "inserts" (anything marked as "updated", "created", or "missing")
- It then adds the removes and inserts to the source B object in memory
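
The binning described above could be sketched as follows. The change shape (`{ key, action, data }`) and the `pendingRemoves` / `pendingInserts` fields on the source object are hypothetical names chosen for this example, not the project's actual structure.

```javascript
// Split the "Get Updates" result into the two bins described above.
// Each change is assumed to look like { key, action, data }.
function binChanges (changes) {
  const bins = { removes: [], inserts: [] }
  const insertActions = new Set(['updated', 'created', 'missing'])
  for (const change of changes) {
    if (change.action === 'removed') bins.removes.push(change)
    else if (insertActions.has(change.action)) bins.inserts.push(change)
    // Any other action is ignored in this sketch
  }
  return bins
}

// Queue both bins on the in-memory source object so a later save step can run them.
function applyUpdates (source, changes) {
  const { removes, inserts } = binChanges(changes)
  source.pendingRemoves.push(...removes)
  source.pendingInserts.push(...inserts)
  return source
}
```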
The "Save" Function
- Gets all updates to be run on the source
- Gets all deletes to be run on the source
- Pulls down the metadata object (this is used for extra information, such as a foreign key)
- It writes / deletes the information from the source
- Once the source has been successfully updated, it runs the "Apply Updates" and "Save" functions on the masterCache database
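
A simplified picture of the save steps is sketched below. It models the backing store as a `Map` instead of a real database or file, and the `pendingInserts`, `pendingRemoves`, `records`, and `metadata.foreignKeys` names are all hypothetical. The returned list is what a caller could then pass to "Apply Updates" and "Save" on the master cache.

```javascript
// Hypothetical save step: run queued inserts and removes against the source's store.
function save (source) {
  const metadata = source.metadata || {}
  const written = []
  for (const change of source.pendingInserts) {
    // Attach extra information from the metadata object, such as a foreign key
    const foreignKey = metadata.foreignKeys ? metadata.foreignKeys[change.key] : undefined
    source.records.set(change.key, { data: change.data, foreignKey })
    written.push(change)
  }
  for (const change of source.pendingRemoves) {
    source.records.delete(change.key)
    written.push(change)
  }
  // Clear the queues so the same changes are not run twice
  source.pendingInserts = []
  source.pendingRemoves = []
  return written
}
```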
The "Close" Function
- Some databases and files require the connection to be closed; this step closes that connection
- This step is required for all sources, as it clears the source object out of memory
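
In code, the two responsibilities of the close step might look like this sketch. The `connection` and `records` fields are hypothetical stand-ins for whatever the source object actually holds.

```javascript
// Hypothetical close step: release the connection if there is one, then drop in-memory data.
function close (source) {
  if (source.connection && typeof source.connection.close === 'function') {
    source.connection.close()
  }
  // Drop references so the loaded data can be garbage collected
  source.connection = null
  source.records = null
  return source
}
```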