
Comments (5)

wdecay avatar wdecay commented on July 30, 2024

Hi Girish,

Thanks for your interest. Let me try and update the documentation to address these questions. I will get back to you when I have something ready.

-andrew

from ai-predictivemaintenance.

wdecay avatar wdecay commented on July 30, 2024

Hi Girish,

I have updated the Developers' Manual doc. Hope it provides enough context to understand how the solution is implemented.

https://github.com/Azure/AI-PredictiveMaintenance/blob/master/docs/Developer-Manual.md

Please let me know what you think.

Thanks,
-andrew


girishkumarbk avatar girishkumarbk commented on July 30, 2024

Data sent to IoT Hub by the Generator is read (using Azure Event Hub Connector) and processed by solution's Spark Structured Streaming Job running on the Databricks cluster created during solution provisioning.

Andrew: A quick question: along with the data being routed to Azure Event Hub (via routes), we also notice that device data is routed to an Azure Blob container. If Spark picks up its data from Event Hub (via the Spark connector for Event Hubs), when is the data in Azure Blob Storage used? I don't see a corresponding workflow that operates on the telemetry data in Azure Blob Storage. Or is the data in ABS used for batch training, again via Spark? Could you please elaborate on why data from IoT Hub is routed to two endpoints, and what the workflow is for each of them?
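For context, the dual routing described here would typically be declared in the IoT Hub resource of the ARM template. A hypothetical fragment (route and endpoint names are illustrative, not taken from pdm-arm.json) sending the same device messages to both a custom Event Hub endpoint and a Blob Storage endpoint might look like:

```json
{
  "routing": {
    "routes": [
      {
        "name": "telemetryToEventHub",
        "source": "DeviceMessages",
        "endpointNames": [ "eventHubEndpoint" ],
        "isEnabled": true
      },
      {
        "name": "telemetryToBlobArchive",
        "source": "DeviceMessages",
        "endpointNames": [ "blobStorageEndpoint" ],
        "isEnabled": true
      }
    ]
  }
}
```

With two enabled routes sharing the same source, IoT Hub delivers each message to both endpoints: one feeds the real-time Spark job, the other accumulates the archival "snapshot" on Blob Storage.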


wdecay avatar wdecay commented on July 30, 2024

Hi Girish,

This diagram would, perhaps, be the best answer to your question: https://github.com/Azure/AI-PredictiveMaintenance/blob/master/docs/img/data_flow.png

The "snapshot" is the data accumulated on ABS. It is not used in the production pipeline (the right side of the diagram); its purpose is to enable modeling. We provide an example notebook for ingesting this snapshot data along with failure records from Storage Tables. This imitates a production scenario where telemetry is collected continuously over a period of time, whereas failure/maintenance logs are manually populated with new data.
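As a rough illustration of what such an ingestion step does, here is a minimal pure-Python sketch (not the actual notebook code; the machine IDs, the 2-hour horizon, and the data are made up) that joins snapshot telemetry with failure records to produce labeled training examples:

```python
from datetime import datetime, timedelta

# Hypothetical telemetry snapshot rows from Blob Storage:
# (machine_id, timestamp, sensor reading)
telemetry = [("M1", datetime(2024, 1, 1, h), 0.1 * h) for h in range(6)]

# Hypothetical failure log from Storage Tables: (machine_id, failure time)
failures = [("M1", datetime(2024, 1, 1, 5))]

def label_telemetry(telemetry, failures, horizon=timedelta(hours=2)):
    """Label each reading 1 if the machine fails within `horizon`, else 0."""
    labeled = []
    for machine, ts, value in telemetry:
        label = int(any(
            m == machine and ts <= f_ts <= ts + horizon
            for m, f_ts in failures
        ))
        labeled.append((machine, ts, value, label))
    return labeled

labeled = label_telemetry(telemetry, failures)
# Readings within 2 hours of the failure at hour 5 are labeled positive.
```

The same join-and-label idea applies at scale; the real notebook would do it with Spark DataFrames rather than Python lists.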

The DataGeneration notebook provides a "shortcut" for generating seed data, but in reality, you would need to collect that data from your machines over a sufficiently long period of time (perhaps, months at least).
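A minimal sketch of what generating such seed data could look like (purely illustrative; the actual DataGeneration notebook is more elaborate): a sensor signal that drifts upward as the machine degrades, ending in a failure event.

```python
import random

def generate_seed_telemetry(n_cycles=100, seed=42):
    """Generate one synthetic run-to-failure sequence: vibration rises
    with degradation, and the final cycle is marked as a failure."""
    rng = random.Random(seed)
    rows = []
    for cycle in range(n_cycles):
        degradation = cycle / n_cycles          # 0 (healthy) -> ~1 (failed)
        vibration = 1.0 + 2.0 * degradation + rng.gauss(0, 0.05)
        rows.append({"cycle": cycle,
                     "vibration": vibration,
                     "failed": cycle == n_cycles - 1})
    return rows

data = generate_seed_telemetry()
```

In production, as the comment notes, this synthetic history would be replaced by months of real telemetry plus genuine maintenance logs.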

Hope this explains it...


girishkumarbk avatar girishkumarbk commented on July 30, 2024

Hi Andrew,

When I deploy the solution via pdm-arm.json, it succeeds, and I can then see the health of the machines on the dashboard. The streaming (real-time) path works and the machines' health is displayed.

However, the training flow never starts. That is, the data pushed into ABS is never picked up for retraining the model via the DataIngestion, Featurization, and Model Creation and Operationalization workflow. I never see the notebooks enter the running state.

How do we make sure that the entire pipeline from streaming data insights to training the model works ?

Also, it would be great if you could add a section on how to build this entire project; that would enable us to make modifications, redeploy, and customize it for our own flow.

Regards,
/Girish BK

