Pentaho Data Refinery

This project contains several PDI Job and Transformation steps for use in building and publishing analysis models. The job steps include Build Model and Publish Model. The transformation steps include Annotate Stream and Shared Dimension.

Build Model creates an analytic model and stores it in a variable called ${JobEntryBuildModel.Mondrian.Schema.Model Name} where Model Name is the name you specified in the step.
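As a small illustration of the naming pattern above, the following sketch builds the variable name for a given model name. The helper names `build_model_variable` and `variable_reference` are hypothetical and are not part of PDI; the only detail taken from this README is the `JobEntryBuildModel.Mondrian.Schema.` prefix.

```python
# Hypothetical helpers, not part of PDI: they only assemble the variable
# name pattern described in this README.
PREFIX = "JobEntryBuildModel.Mondrian.Schema."


def build_model_variable(model_name: str) -> str:
    """Return the variable name for the model name entered in Build Model."""
    return PREFIX + model_name


def variable_reference(model_name: str) -> str:
    """Return the ${...} form used to reference the variable in a job."""
    return "${" + build_model_variable(model_name) + "}"


print(variable_reference("Sales Model"))
```

The model name is used as typed, including any spaces, so a later reference presumably has to match it exactly.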

Publish Model uses the model and connection information generated by Build Model and publishes a Data Source to the selected BA Server.

Annotate Stream allows you to instruct the Build Model step how to use a particular field when generating the model.

Shared Dimension allows you to specify a separate dimension table that can later be linked to the fact table.

Troubleshooting

Model Annotations

When your published schema doesn't reflect the Model Annotations you specified through Annotate Stream and Shared Dimension, check your PDI job logs first. Each annotation that is applied prints either a success message or a failure message. A failure in Annotate Stream only prevents that particular annotation from being applied. Any failure in a Shared Dimension annotation also causes the Link Dimension annotation to fail, which means the dimension will not be available in your model. For failed annotations, review the annotation properties to ensure they are correct and consistent with each other. For example, when specifying parent attributes, the parent must be in the same dimension and hierarchy. All names are case sensitive.
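The parent attribute rule can be checked before running a job. The sketch below is a standalone illustration, not PDI code: the dictionary keys (`name`, `dimension`, `hierarchy`, `parent`) and the function `parent_attribute_problems` are assumptions made for this example. It uses exact string comparison because names are case sensitive.

```python
def parent_attribute_problems(annotations):
    """Return a list of messages for parents that break the rule above.

    Each annotation is a dict with the hypothetical keys
    name, dimension, hierarchy and (optionally) parent.
    """
    by_name = {a["name"]: a for a in annotations}
    problems = []
    for a in annotations:
        parent = a.get("parent")
        if not parent:
            continue  # top-level attribute, nothing to check
        p = by_name.get(parent)  # exact, case-sensitive lookup
        if p is None:
            problems.append(f"{a['name']}: parent '{parent}' not found")
        elif (p["dimension"], p["hierarchy"]) != (a["dimension"], a["hierarchy"]):
            problems.append(
                f"{a['name']}: parent '{parent}' is in a different dimension or hierarchy"
            )
    return problems


annotations = [
    {"name": "Year", "dimension": "Date", "hierarchy": "Calendar"},
    {"name": "Month", "dimension": "Date", "hierarchy": "Calendar", "parent": "year"},
]
for message in parent_attribute_problems(annotations):
    print(message)
```

Here `Month` names its parent as `year`, which does not match `Year`, so the check reports the parent as not found.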

Dealing with Auto Modeled schema elements

Auto Modeling creates an Attribute for every field in your data source. An annotation specified for a given field removes any elements the auto modeler created for that field. An auto-modeled field is identified as a dimension with a single hierarchy and a single level where the names of all three are equal. This means that if you create a field annotation with those same properties, it can be removed when you have a second annotation on the same field. For example, assume you have a field product_id. The auto modeler creates an attribute with Dimension, Hierarchy and Level all named Product ID. You want to keep the level, but also create a measure, Product Count, on the product_id field. In this case you have to specify two annotations, one Create Attribute and one Create Measure. Create the Create Attribute annotation after the Create Measure annotation; in the opposite order, Create Measure would identify the annotated attribute as auto-modeled and remove it from the model.
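The ordering effect in the product_id example can be reproduced with a toy simulation. This is a simplified sketch of the behavior described above, not the actual Data Refinery implementation; the `Level` class, the tuple format for annotations, and every function name are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    dimension: str
    hierarchy: str
    name: str
    source_field: str


def looks_auto_modeled(level, levels):
    """Only level in its dimension, and dimension, hierarchy and level names match."""
    in_dimension = [l for l in levels if l.dimension == level.dimension]
    return len(in_dimension) == 1 and level.dimension == level.hierarchy == level.name


def apply_annotation(levels, measures, annotation):
    """Drop auto-modeled-looking levels on the field, then add the new element."""
    kind, source_field, payload = annotation
    kept = [
        l for l in levels
        if not (l.source_field == source_field and looks_auto_modeled(l, levels))
    ]
    measures = list(measures)
    if kind == "attribute":
        kept.append(Level(*payload, source_field))
    elif kind == "measure":
        measures.append(payload)
    return kept, measures


def run(annotations):
    # Start from what the auto modeler is described as producing for product_id.
    levels = [Level("Product ID", "Product ID", "Product ID", "product_id")]
    measures = []
    for annotation in annotations:
        levels, measures = apply_annotation(levels, measures, annotation)
    return levels, measures


create_attribute = ("attribute", "product_id", ("Product ID", "Product ID", "Product ID"))
create_measure = ("measure", "product_id", "Product Count")

attribute_first = run([create_attribute, create_measure])
measure_first = run([create_measure, create_attribute])

print("attribute first:", [l.name for l in attribute_first[0]], attribute_first[1])
print("measure first:  ", [l.name for l in measure_first[0]], measure_first[1])
```

In this simulation, applying Create Attribute first leaves only the measure, because the explicitly created level is indistinguishable from an auto-modeled one. Applying Create Measure first keeps both the level and the measure.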

Known Issue - Dimension with Multiple Hierarchies

When using model annotations to create a dimension with multiple hierarchies, do not give any hierarchy the same name as the dimension, and do not leave any hierarchy name empty. No errors are reported, but your published model will be incorrect.
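Because this issue produces no errors, a check before publishing may help. The following is a minimal sketch under the assumption that you can list the hierarchy names planned for each dimension; `hierarchy_name_problems` is a hypothetical helper, not part of PDI.

```python
def hierarchy_name_problems(dimension, hierarchies):
    """Flag hierarchy names that trigger the known issue described above.

    Only applies when the dimension has more than one hierarchy.
    """
    problems = []
    if len(hierarchies) < 2:
        return problems
    for name in hierarchies:
        if name is None or not name.strip():
            problems.append(f"{dimension}: empty hierarchy name")
        elif name == dimension:  # exact match, names are case sensitive
            problems.append(f"{dimension}: hierarchy is named the same as the dimension")
    return problems


for message in hierarchy_name_problems("Date", ["Date", "Fiscal"]):
    print(message)
```

A dimension with a single hierarchy is not flagged, since the known issue is described only for dimensions with multiple hierarchies.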
