Comments (5)
Hi Girish,
Thanks for your interest. Let me try and update the documentation to address these questions. I will get back to you when I have something ready.
-andrew
from ai-predictivemaintenance.
Hi Girish,
I have updated the Developers' Manual doc. Hope it provides enough context to understand how the solution is implemented.
https://github.com/Azure/AI-PredictiveMaintenance/blob/master/docs/Developer-Manual.md
Please let me know what you think.
Thanks,
-andrew
Data sent to IoT Hub by the Generator is read (using Azure Event Hub Connector) and processed by solution's Spark Structured Streaming Job running on the Databricks cluster created during solution provisioning.
Andrew, a quick question: in addition to being routed to Azure Event Hub (via routes), the device data is also routed to an Azure Blob container. If Spark reads its data from Event Hub (via the Spark connector for Event Hubs), when is the data in Azure Blob Storage used? I don't see a corresponding workflow that operates on the telemetry data in Azure Blob Storage. Or is the data on ABS used for batch training, again via Spark? Could you please elaborate on why data from IoT Hub is sent to two routes and what the workflow is for each of them?
Hi Girish,
This diagram would, perhaps, be the best answer to your question: https://github.com/Azure/AI-PredictiveMaintenance/blob/master/docs/img/data_flow.png
The "snapshot" is the data accumulated on ABS. It is not used in the production pipeline (the right side of the diagram); its purpose is to enable modeling. We provide an example notebook for ingesting this snapshot data along with failure records from Storage Tables. This imitates a production scenario where telemetry is collected over a period of time, whereas failure/maintenance logs are manually populated with new data.
The DataGeneration notebook provides a "shortcut" for generating seed data, but in reality, you would need to collect that data from your machines over a sufficiently long period of time (perhaps, months at least).
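To make the modeling side more concrete, here is a minimal sketch of the idea behind combining snapshot telemetry with failure records. It is not the code from the solution's notebooks: the field names (`machine_id`, `timestamp`, `temperature`), the `label_telemetry` helper, and the 24-hour horizon are all assumptions made for illustration, and plain Python lists stand in for the data that would really come from ABS and Storage Tables.

```python
from datetime import datetime, timedelta

# Hypothetical telemetry rows, standing in for the snapshot accumulated on ABS.
telemetry = [
    {"machine_id": "m1", "timestamp": datetime(2018, 6, 1, 0, 0), "temperature": 71.2},
    {"machine_id": "m1", "timestamp": datetime(2018, 6, 2, 0, 0), "temperature": 70.8},
    {"machine_id": "m2", "timestamp": datetime(2018, 6, 1, 0, 0), "temperature": 69.5},
]

# Hypothetical failure log, standing in for records kept in Storage Tables.
failures = [
    {"machine_id": "m1", "timestamp": datetime(2018, 6, 1, 12, 0)},
]


def label_telemetry(telemetry, failures, horizon):
    """Flag each telemetry row whose machine fails within `horizon` after it."""
    # Group failure times by machine for quick lookup.
    failure_times = {}
    for failure in failures:
        failure_times.setdefault(failure["machine_id"], []).append(failure["timestamp"])

    labeled = []
    for row in telemetry:
        times = failure_times.get(row["machine_id"], [])
        # A row is positive if a failure occurs at or after the reading,
        # but no later than `horizon` after it.
        fails_soon = any(
            timedelta(0) <= t - row["timestamp"] <= horizon for t in times
        )
        labeled.append({**row, "fails_within_horizon": fails_soon})
    return labeled


labeled = label_telemetry(telemetry, failures, timedelta(hours=24))
for row in labeled:
    print(row["machine_id"], row["timestamp"], row["fails_within_horizon"])
```

The labeled rows are the kind of input a training step would consume. In a real deployment the same join would more likely be done in Spark over the full snapshot.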
Hope this explains it...
Hi Andrew,
When I deploy the solution via pdm-arm.json, the deployment succeeds and I can see the health of the machines on the dashboard. The streaming (real-time) path works and the health of the machines is displayed.
However, the training flow never seems to start. That is, the workflow in which data is pushed into ABS and then picked up for retraining the model (DataIngestion, featurization, model creation and operationalization) never runs. I don't see the notebooks going into a running state at any time.
How do we make sure that the entire pipeline, from streaming data insights to training the model, works?
Also, it would be great if you could add a section on how to build the entire project, which would enable us to make modifications, redeploy, and customize it for our own flow.
Regards,
/Girish BK