chuka-j-uzo / data-streaming-etl-iubh

The Data-Streaming-ETL-IUBH repository is a real-time streaming application. Its source is a Python app that simulates data streamed from a moving truck; the application captures that data and ingests it into a data store for processing, analysis and visualization.

License: Other

Python 54.55% Dockerfile 4.65% Jupyter Notebook 40.81%
docker docker-compose docker-container docker-images elyra grafana-dashboard grafana-datasource grafana-panel grafana-prometheus kafka kafka-consumer kafka-producer mysql-database mysql-server prometheus-metrics pyspark python spark spark-sql sqlalchemy-python

data-streaming-etl-iubh's Introduction

Regression_Analysis_PowerConsumption_Data

Build a basic linear regression model on CSV training data with Python, and then evaluate the model's performance on the test data using mean squared error and R-squared.

Here we analyze a dataset with 471,744 instances or entries using Pandas, sklearn, matplotlib & Seaborn.
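
As a rough illustration of the two evaluation metrics named above, here is a minimal sketch that fits a one-feature least-squares line and scores it with mean squared error and R-squared. It deliberately uses only the Python standard library instead of sklearn so that it is self-contained; the function names and the toy numbers are assumptions made for this example and are not taken from the repository or its dataset.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(xs, slope, intercept):
    """Apply the fitted line to new inputs."""
    return [slope * x + intercept for x in xs]


def mean_squared_error(y_true, y_pred):
    """Average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy numbers for illustration only (not from the power consumption dataset).
train_x, train_y = [1.0, 2.0, 3.0, 4.0], [3.1, 4.9, 7.2, 8.8]
test_x, test_y = [5.0, 6.0], [11.0, 13.1]

slope, intercept = fit_line(train_x, train_y)
test_pred = predict(test_x, slope, intercept)
print("MSE:", mean_squared_error(test_y, test_pred))
print("R-squared:", r_squared(test_y, test_pred))
```

The model is fitted on the training split only and scored on the held-out test split, which mirrors the train-then-evaluate order described above.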

data-streaming-etl-iubh's People

Contributors

chuka-j-uzo

Forkers

jkab016

data-streaming-etl-iubh's Issues

Unable to access the Ranger admin portal

Ranger installation was successful and the Ranger admin portal should be accessible at http://localhost:6080/ with the credentials "admin/rangerR0cks!".

However, we are still unable to connect to Ranger's admin panel. Running "systemctl status ranger-admin" in the terminal showed that the Ranger admin service was not running. Starting it with "systemctl start ranger-admin" did not help, and neither did reinstalling Ranger.
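
Alongside the systemctl checks, it can help to confirm whether anything is accepting TCP connections on the portal's port at all. The following is a minimal sketch of such a probe, assuming the portal is expected on localhost:6080 as stated above; the helper name port_is_open is made up for this example.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Host and port are taken from the portal URL quoted in this issue.
    if port_is_open("localhost", 6080):
        print("Something is listening on localhost:6080")
    else:
        print("Nothing is accepting connections on localhost:6080")
```

A refused connection is consistent with the service not having started. An open port with an unreachable web page would instead point toward the web application itself.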

Once this is sorted, we should be able to deploy access control as shown in our pipeline diagram.

Screenshot from 2023-04-17 11-17-19.

Apache Atlas fails to run: Maven clean install fails to install Kafka-Hook

Apache Atlas Web UI runs, but we are unable to connect the Kafka-Hook to it or even install other hooks required for it to run.

We attempted to solve the problem by running the mvn clean install -DskipTests command in the kafka-bridge directory, the root of our Kafka Bridge project, where the pom.xml file is located, but we got the following error:

central-https: https://repo.maven.apache.org/maven2/com/puppycrawl/tools/checkstyle/5.5/checkstyle-5.5.jar (638 kB at 85 kB/s)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  42:53 min
[INFO] Finished at: 2023-04-12T11:50:49+01:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:2.9.1:check (checkstyle-check) on project kafka-bridge: Execution checkstyle-check of goal org.apache.maven.plugins:maven-checkstyle-plugin:2.9.1:check failed: Plugin org.apache.maven.plugins:maven-checkstyle-plugin:2.9.1 or one of its dependencies could not be resolved: Could not find artifact org.apache.atlas:atlas-buildtools:jar:1.0 in central-https (https://repo.maven.apache.org/maven2) -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginResolutionException
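
The key detail in the log above is the "Could not find artifact" clause, which names both the missing coordinates and the repository that was searched. As a small sketch, the pattern below pulls those pieces out of a Maven log; the function name and the returned dictionary keys are assumptions for this example, and the regular expression is only matched against the message format seen in this log.

```python
import re

# Matches e.g. "Could not find artifact g:a:jar:1.0 in repo-id (https://...)".
MISSING_ARTIFACT = re.compile(
    r"Could not find artifact (?P<coords>\S+) "
    r"in (?P<repo_id>\S+) \((?P<repo_url>[^)]+)\)"
)


def find_missing_artifacts(log_text):
    """Return one dict per 'Could not find artifact' message in the log text."""
    results = []
    for match in MISSING_ARTIFACT.finditer(log_text):
        entry = {
            "coords": match.group("coords"),
            "repo_id": match.group("repo_id"),
            "repo_url": match.group("repo_url"),
        }
        parts = entry["coords"].split(":")
        # Only split the common groupId:artifactId:packaging:version form.
        if len(parts) == 4:
            keys = ("group_id", "artifact_id", "packaging", "version")
            entry.update(zip(keys, parts))
        results.append(entry)
    return results


sample = (
    "[ERROR] Could not find artifact org.apache.atlas:atlas-buildtools:jar:1.0 "
    "in central-https (https://repo.maven.apache.org/maven2) -> [Help 1]"
)
for item in find_missing_artifacts(sample):
    print(item)
```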

The error message suggested that Maven was not able to find the artifact "org.apache.atlas:atlas-buildtools:jar:1.0". One possible solution was therefore to add the repository where this artifact is hosted to our Maven settings.xml file. Here are the steps we followed without success:

  1. Open our settings.xml file located in ~/.m2/settings.xml with our text editor.
  2. Add the following code snippet inside the tag:

<repository>
    <id>atlas-repository</id>
    <name>Apache Atlas Repository</name>
    <url>https://repository.apache.org/content/repositories/releases/</url>
</repository>
  3. Save the settings.xml file and re-run the mvn clean install -DskipTests command.

When that change did not help, we tried one last option: adding the following repository to our pom.xml file:

<repositories>
  <repository>
    <id>atlas-repository</id>
    <name>Apache Atlas Repository</name>
    <url>https://repository.apache.org/content/groups/public/</url>
  </repository>
</repositories>

This addition still did not work. For this reason, we couldn't successfully install the Kafka-Hook required for Apache Atlas to run and pull Kafka topics. This hindered our ability to run Atlas for Data Discovery, Data Lineage and Data Governance in general.
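
Since Maven consults the local repository before any remote one, one thing worth checking is whether the missing jar was ever installed locally. The sketch below builds the path where Maven's standard local repository layout would place the artifact. It assumes the default local repository location of ~/.m2/repository (a custom localRepository setting would change this), and the helper name local_repo_path is made up for this example.

```python
from pathlib import Path


def local_repo_path(group_id, artifact_id, version, packaging="jar", repo_root=None):
    """Build the expected file path of an artifact in a Maven local repository.

    Layout: <root>/<groupId with dots as slashes>/<artifactId>/<version>/
            <artifactId>-<version>.<packaging>
    """
    # Assumption: default local repository unless repo_root is given.
    root = Path(repo_root) if repo_root else Path.home() / ".m2" / "repository"
    file_name = f"{artifact_id}-{version}.{packaging}"
    return root.joinpath(*group_id.split("."), artifact_id, version, file_name)


if __name__ == "__main__":
    # Coordinates copied from the error message in this issue.
    jar = local_repo_path("org.apache.atlas", "atlas-buildtools", "1.0")
    state = "present" if jar.is_file() else "absent"
    print(f"{jar} is {state}")
```

If the file is absent, neither the local repository nor the configured remote repositories supplied the artifact, which matches the resolution failure reported by the build.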

A solution will be most appreciated!
