Department of Electronics & Telecommunication,
Don Bosco Institute of Technology, Mumbai.
Course Coordinator: Mr. Jithin Isaac
To install the Hadoop ecosystem via the Hortonworks Data Platform (HDP) Sandbox and explore the Apache Ambari UI on an AWS EC2 instance, and to get familiar with HDFS commands.
- Software:
- Hortonworks Data Platform Sandbox https://www.cloudera.com/downloads/hortonworks-sandbox/hdp.html
- Apache HDFS https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html
- Cloud: AWS EC2 with Amazon Linux 2 OS (Open ports 22, 8080 & 2222)
- Installation of HDP 2.6.5 in an AWS EC2 instance
- Explore Ambari UI at port 8080
- Work with HDFS via UI
- Work with HDFS commands via SSH at port 2222
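Before opening the Ambari UI or connecting over SSH, it can help to confirm that ports 22, 8080 and 2222 are actually reachable from your machine. The following is a minimal sketch, not part of the official setup: the function names `port_is_open` and `check_ports` are hypothetical helpers, and the host you pass in is assumed to be the public DNS name or IP of your own EC2 instance.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def check_ports(host, ports=(22, 8080, 2222)):
    """Map each port to True/False depending on whether it accepts connections."""
    return {port: port_is_open(host, port) for port in ports}


# Example (replace the placeholder with your instance's public DNS or IP):
# for port, ok in check_ports("<your-ec2-public-dns>").items():
#     print(port, "open" if ok else "closed or filtered")
```

A port reported as closed usually points to a missing inbound rule in the EC2 security group, or to the sandbox not having finished starting.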
- Install HDP via https://jithinsisaac.github.io/posts/hdp_sandbox/
- Additional help via https://www.cloudera.com/tutorials/getting-started-with-hdp-sandbox.html
- Work with HDFS commands via
- geolocation.csv - This is the geolocation data collected from the trucks. It contains records showing truck location, date, time, type of event, speed, etc.
- trucks.csv - This data was exported from a relational database and shows information on truck models, driverid, truckid, and aggregated mileage.
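As one possible starting point for the command-line part, the sketch below builds a handful of `hdfs dfs` commands for the two CSV files and prints them. It is only an illustration under some assumptions: `/tmp/data` is a hypothetical HDFS target directory, the CSV files are assumed to be in the current working directory of the SSH session, and the helper names `hdfs_commands` and `run_all` are made up for this example. It also assumes Python 3.8 or newer; the Python on the sandbox may be older, in which case the printed commands can simply be typed by hand.

```python
import os
import shlex
import subprocess

HDFS_DIR = "/tmp/data"  # hypothetical HDFS directory, choose your own


def hdfs_commands(local_files, hdfs_dir=HDFS_DIR):
    """Build a list of `hdfs dfs` commands (as argument lists) for the given files."""
    first_name = os.path.basename(local_files[0])
    first_in_hdfs = f"{hdfs_dir}/{first_name}"
    return [
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],              # create the directory
        ["hdfs", "dfs", "-put", "-f", *local_files, hdfs_dir],  # copy local files to HDFS
        ["hdfs", "dfs", "-ls", hdfs_dir],                       # list the directory
        ["hdfs", "dfs", "-du", "-h", hdfs_dir],                 # show space used per file
        ["hdfs", "dfs", "-tail", first_in_hdfs],                # print the end of a file
        ["hdfs", "dfs", "-get", first_in_hdfs, "copy_of_" + first_name],  # copy back to local
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print("$", shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Dry run: only prints the commands, nothing is executed.
run_all(hdfs_commands(["geolocation.csv", "trucks.csv"]))
```

Calling `run_all(..., dry_run=False)` inside the sandbox would execute the commands; keeping the dry run as the default avoids accidental changes to HDFS while you are still checking paths.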
- ADD THE PROCEDURE THAT YOU FOLLOWED FOR COMPLETING THE EXPERIMENT HERE
(Please attach necessary screenshots & video clips for the same)
- HDP
a. HDP installation logs
b. Apache Ambari UI
- HDFS
a. Basic data transfer from local to HDFS via Ambari UI
b. Usage of at least 5 HDFS commands via SSH
- ADD SCREENSHOTS OF YOUR OUTPUT HERE ALONG WITH VIDEO
- Submitted on 11-09-2021
- Submitted by Mr/Ms. XYZ
- Roll No. 111