
Datastage Project

DataStage is an ETL tool used to extract data from different data sources, transform it according to business requirements, and load it into a target database. A data source can be of any type, such as a relational database, a file, or an external data source.
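The extract, transform, and load steps described above can be illustrated outside DataStage with a minimal Python sketch. It is not how DataStage is implemented: the CSV text, the `emp_target` table, and the `SAL >= 1000` rule are assumptions chosen for illustration, and an in-memory SQLite database stands in for the target.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for a file or table extract.
SOURCE_CSV = """empno,ename,sal
7369,SMITH,800
7499,ALLEN,1600
7839,KING,5000
"""

def extract(text):
    """Extract: read rows from the source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: apply a business rule (keep SAL >= 1000) and cast types."""
    return [
        (int(r["empno"]), r["ename"], float(r["sal"]))
        for r in rows
        if float(r["sal"]) >= 1000
    ]

def load(rows, conn):
    """Load: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS emp_target (empno INTEGER, ename TEXT, sal REAL)"
    )
    conn.executemany("INSERT INTO emp_target VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for the target database
load(transform(extract(SOURCE_CSV)), conn)
loaded = conn.execute(
    "SELECT empno, ename, sal FROM emp_target ORDER BY empno"
).fetchall()
print(loaded)
```

In a DataStage job the same three roles are played by a source stage, one or more processing stages, and a target stage.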

  • Important client interfaces

  • Three components comprise the DataStage client: DataStage Administrator, DataStage Designer, DataStage Director.

  • RESPONSIBILITIES FOR Datastage Administrator:

The DataStage administrator develops new tools and processes to ensure effective use of the DataStage product, and administers and maintains the shared DataStage environment. This includes designing and sizing the environment.

  • RESPONSIBILITIES FOR Datastage Designer:

The DataStage Designer is the primary interface to the metadata repository and provides a graphical user interface that enables you to view, edit, and assemble DataStage objects from the repository needed to create an ETL job. An ETL job should include source and target stages.

  • RESPONSIBILITIES FOR Datastage Director:

DataStage Director has three view options: The Status view displays the status, date and time started, elapsed time, and other run information about each job in the selected repository category. The Schedule view displays job scheduling details. The Log view displays all of the events for a particular run of a job.

  • Oracle connector

The Oracle connector is used to access Oracle database systems and perform various read, write, and load functions. Before using it, set the required user privileges.

  • Parameter sets

Parameter sets enable you to expose different parameters to the user and to return different information based on the parameters the user specifies. You can only use one parameter set at a time.

########################################################################################

The project consists of 4 major jobs. The following sources, targets, and stages are used:

ORA - data from Oracle Database
DS - Dataset Files
FILE - Sequential Files
AGG - Aggregation
SHARED - Shared Containers
TRANSFORMER
MERGE
JOIN
LOOKUP
FUNNEL
SORT
FILTER
REMOVE_DUPLICATES
ORACLE_DATABASE
EXECUTE_COMMANDS

- ( 1 ) Requirement: SOURCE --> TARGET

[image]

Implementation:

[image]

- ( 2 ) Requirement: SOURCE --> TARGET

[image]

Implementation:

[image]

[image]

- ( 3 ) Requirement: SOURCE --> TARGET

[image]

Implementation:

[image]

- ( 4 ) Requirement: SOURCE --> TARGET

[image]

Implementation:

[image]

SEQUENCE JOBS

A sequence job specifies a sequence of parallel jobs or server jobs to run. You specify the control information, such as the different courses of action to take depending on whether a job in the sequence succeeds or fails.
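The control flow of a sequence job can be pictured with a small Python sketch. This is only an analogy, not DataStage's own mechanism: the job names, the `run_sequence` helper, and the success or failure links are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Each step holds: the job to run, the next step on success, the next step on failure.
# None means "stop the sequence".
Step = Tuple[Callable[[], bool], Optional[str], Optional[str]]

def run_sequence(steps: Dict[str, Step], start: str) -> List[str]:
    """Run jobs starting at `start`, following the success/failure links."""
    history: List[str] = []
    current: Optional[str] = start
    while current is not None:
        job, on_success, on_failure = steps[current]
        ok = job()
        history.append(f"{current}:{'OK' if ok else 'FAILED'}")
        current = on_success if ok else on_failure
    return history

# Hypothetical jobs: each returns True on success and False on failure.
def load_emp() -> bool:
    return True

def load_dept() -> bool:
    return False  # simulate a failing job

def aggregate() -> bool:
    return True

def notify_failure() -> bool:
    return True

steps: Dict[str, Step] = {
    "load_emp": (load_emp, "load_dept", "notify_failure"),
    "load_dept": (load_dept, "aggregate", "notify_failure"),
    "aggregate": (aggregate, None, "notify_failure"),
    "notify_failure": (notify_failure, None, None),
}

history = run_sequence(steps, "load_emp")
print(history)
```

Because `load_dept` fails in this example, `aggregate` is skipped and the failure branch runs instead.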


########################################################################################

Some more jobs:


1] SRC :- EMP(ORA)
   TRG(FILE) :- EMPNO,ENAME,SAL,SEQ_NUM
   DATA :- ONLY FOR SAL >= 1000 & GENERATE THE SEQ_NUM.

2] SRC :- EMP(ORA), DEPT(ORA)
   TRG(DS) :- RANK_NUM,EMPNO,ENAME,SAL,DEPTNO,DNAME,LOC
   RANK_NUM = SEQUENCE NUMBERS
   DATA :- TOP 3 SAL RECORDS

3] SRC :- EMP
   TRG(DS) :- EMPNO,ENAME,SAL,COMM,TSAL,DEPTNO,RANK_NUM
   RANK_NUM = SEQUENCE NUMBERS

   DATA :- TSAL = SAL + COMM[HANDLE NULL]
	   BOTTOM 5 TSAL RECORDS.

4] SRC :- EMP
   TRG[2 ORA TABLES] :- EMPNO,ENAME,SAL,TAX,SEQ_NUM
   SEQ_NUM = SEQUENCE NUMBERS

   DATA :- TAX = SAL * 0.10

	   DATA INTO THE TRGS MUST BE LOADED IN AN ALTERNATE WAY.
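One possible reading of job 4 is sketched below in Python, assuming that "alternate way" means consecutive rows go to the two targets in turn. The sample `emp` rows and the names `target_1` and `target_2` are hypothetical, and lists stand in for the two Oracle tables.

```python
# Hypothetical EMP rows: (EMPNO, ENAME, SAL)
emp = [
    (7369, "SMITH", 800),
    (7499, "ALLEN", 1600),
    (7521, "WARD", 1250),
    (7839, "KING", 5000),
]

target_1, target_2 = [], []  # stand-ins for the two Oracle target tables

for seq_num, (empno, ename, sal) in enumerate(emp, start=1):
    tax = round(sal * 0.10, 2)  # TAX = SAL * 0.10
    row = (empno, ename, sal, tax, seq_num)
    # Assumption: odd sequence numbers go to the first target, even to the second.
    if seq_num % 2 == 1:
        target_1.append(row)
    else:
        target_2.append(row)

print(target_1)
print(target_2)
```

In a Transformer stage the equivalent would be two output links whose constraints test the sequence number modulo 2.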


@PARTITIONNUM + ((@INROWNUM -1) * @NUMPARTITIONS) +1
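The expression above yields sequence numbers that do not collide across partitions. A small Python simulation makes this visible; it assumes `@PARTITIONNUM` is zero-based and `@INROWNUM` starts at 1 within each partition, and the partition and row counts are arbitrary example values.

```python
def sequence_number(partition_num: int, in_row_num: int, num_partitions: int) -> int:
    """Mirror of: @PARTITIONNUM + ((@INROWNUM - 1) * @NUMPARTITIONS) + 1"""
    return partition_num + (in_row_num - 1) * num_partitions + 1

num_partitions = 3       # example value for @NUMPARTITIONS
rows_per_partition = 4   # example: every partition receives 4 rows

generated = [
    sequence_number(p, r, num_partitions)
    for p in range(num_partitions)              # @PARTITIONNUM: 0, 1, 2
    for r in range(1, rows_per_partition + 1)   # @INROWNUM: 1, 2, 3, 4
]

for p in range(num_partitions):
    start = p * rows_per_partition
    print(f"partition {p}: {generated[start:start + rows_per_partition]}")
```

Each partition produces its own interleaved series (partition 0 gives 1, 4, 7, ..., partition 1 gives 2, 5, 8, ...), so values are unique without any coordination between partitions. The numbers are gap-free only when every partition receives the same number of rows.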


-------------------------------------------------------------

1] SRC :- EMP(ORA),DEPT(ORA)
   TRG(2) :- DEPTNO,DNAME,LOC,SUM_SAL,TAX_SUM_SAL

	TRG1(ORA) :- 
	SUM(SAL) GROUP BY DEPTNO 
	TAX_SUM_SAL = SUM_SAL * 0.50 IF DEPTNO = 10
	TAX_SUM_SAL = SUM_SAL * 0.25 IF DEPTNO = 20
	TAX_SUM_SAL = SUM_SAL * 0.10 IF DEPTNO = 30
		
	DATA FOR TRG1(DS) :- (DEPTNO = 10 OR DEPTNO = 20) AND TAX_SUM_SAL >= 2000
		   
	TRG2(FILE) :- SUM_SAL BETWEEN 4000 & 15000 

2] SRC :- EMP(ORA)
   TRG(DS) :- EMPNO,ENAME,SAL,DEPTNO

	DEPTNO = P1(INTEGER) & SAL >= P2(INTEGER)

3] SRC :- EMP(ORA),DEPT(ORA)
   TRG() :- 
	CREATE A PARAMETER SET WITH REQUIRED PARAMETERS & USE IN THIS JOB :-

	TRG1(FILE) :- EMPNO,ENAME,SAL,COMM,TSAL,TAX,TRATE,NETSAL
	
	TSAL = SAL + COMM[HANDLE NULL]
	TRATE = P1(FLOAT/DOUBLE)
	TAX = TSAL * TRATE
	NETSAL = TSAL - TAX

	DATA ONLY FOR NETSAL >= P2(FLOAT/DOUBLE)

	TRG2(DS) :- SEQ_NUM,DEPTNO,DNAME,SUM_SAL,TAX_SUM_SAL,TAX_RATE
	TAX_RATE = P3(FLOAT/DOUBLE)
	SUM_SAL = SUM(SAL) GROUP BY DEPTNO 
	TAX_SUM_SAL = SUM_SAL * TAX_RATE

	SEQ_NUM = GENERATE SEQUENCE NUMBER
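The TRG1 derivations in job 3 can be checked with a short Python sketch. It assumes that "HANDLE NULL" means a NULL `COMM` is treated as 0, and the sample rows and the values chosen for `P1` and `P2` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical job parameters (the values a parameter set would supply).
P1_TRATE = 0.25        # TRATE
P2_MIN_NETSAL = 1000.0 # minimum NETSAL to keep

# Hypothetical EMP rows: (EMPNO, ENAME, SAL, COMM); COMM may be NULL (None).
emp: List[Tuple[int, str, float, Optional[float]]] = [
    (7369, "SMITH", 800.0, None),
    (7499, "ALLEN", 1600.0, 300.0),
    (7839, "KING", 5000.0, None),
]

def derive(row: Tuple[int, str, float, Optional[float]], trate: float) -> Dict[str, object]:
    empno, ename, sal, comm = row
    tsal = sal + (comm if comm is not None else 0.0)  # assumption: NULL COMM counts as 0
    tax = tsal * trate                                # TAX = TSAL * TRATE
    netsal = tsal - tax                               # NETSAL = TSAL - TAX
    return {
        "EMPNO": empno, "ENAME": ename, "SAL": sal, "COMM": comm,
        "TSAL": tsal, "TAX": tax, "TRATE": trate, "NETSAL": netsal,
    }

# Keep only rows with NETSAL >= P2.
result = [
    r for r in (derive(row, P1_TRATE) for row in emp)
    if r["NETSAL"] >= P2_MIN_NETSAL
]

for r in result:
    print(r)
```

Without the NULL handling, `SAL + COMM` would itself be NULL for employees with no commission, and those rows would drop out of the `NETSAL` filter.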


- ( 1 )

[image]

- ( 2 )

[image]

- ( 3 )

[image]

- ( 4 )

[image]

- ( 5 )

[image]

- ( 6 )

[image]

- ( 7 )

[image]

Made with 💖 & 🔥 by MKM.

Contributors

mohitkm2302
