tap-powerbi-metadata's Introduction

Welcome to the tap-powerbi-metadata Singer Tap!

This Singer tap was created using the Meltano SDK for Taps.


Installation

pipx install git+https://github.com/dataops-tk/tap-powerbi-metadata.git

Configuration

Accepted Config Options

  • client_id - The unique client ID for the Power BI tenant.
  • tenant_id - The unique identifier for the Power BI tenant.
  • username - Username to use in the auth flow.
  • password - Password to use in the auth flow.
  • start_date - Optional. Earliest date of data to stream.

Note:

  • A sample config file is available at .secrets/config.json.template

  • A full list of supported settings and capabilities for this tap is available by running:

    tap-powerbi-metadata --about
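As a rough illustration of the options listed above, the sketch below builds a config with placeholder values and prints it as JSON, which could then be saved as .secrets/config.json. All values are made up, the start_date format is an assumption, and treating every key except start_date as required is inferred only from the "Optional" label above; the template file and the --about output remain the authoritative reference.

```python
import json

# Placeholder values only; replace with real credentials and never commit them.
config = {
    "client_id": "00000000-0000-0000-0000-000000000000",
    "tenant_id": "00000000-0000-0000-0000-000000000000",
    "username": "user@example.com",
    "password": "<password>",
    "start_date": "2021-01-01T00:00:00Z",  # optional; this date format is an assumption
}

# Assumption: everything not marked "Optional" in the list above is required.
REQUIRED_KEYS = ("client_id", "tenant_id", "username", "password")


def missing_keys(cfg):
    """Return the required keys that are absent or empty in cfg."""
    return [key for key in REQUIRED_KEYS if not cfg.get(key)]


if missing_keys(config):
    raise SystemExit(f"Missing config keys: {missing_keys(config)}")

print(json.dumps(config, indent=2))
```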

Source Authentication and Authorization

NOTE: Access to the Power BI REST API requires a service account (aka "Service Principal"), which must be created by someone with admin access to Azure Active Directory (AAD).

More information on this process is available under the Automation with service principals topic on docs.microsoft.com.

TODO: Test out this process of creating a new tenant and service principal for testing purposes and so users/developers won't have to run this in prod to know it works properly.

Usage

You can easily run tap-powerbi-metadata by itself or in a pipeline using Meltano.

Executing the Tap Directly

tap-powerbi-metadata --version
tap-powerbi-metadata --help
tap-powerbi-metadata --config CONFIG --discover > ./catalog.json

How to Contribute

See the SDK dev guide for more instructions on how to use the Singer SDK to develop your own taps and targets.

Upgrading the SDK Version

To upgrade the version of the SDK being used, go to the Release History tab on the PyPI page for the SDK and copy the version number only.

Then, at the command prompt in the repo, run the following, replacing the version number with the one you copied:

poetry add singer-sdk==0.2.0

Initialize your Development Environment

If you've not already installed Poetry:

pipx install poetry

To update your local virtual environment:

poetry install

Testing locally

Execute the tap locally with the poetry run prefix:

poetry run tap-powerbi-metadata --help

poetry run tap-powerbi-metadata --config=.secrets/config.json > Activity.jsonl
cat Activity.jsonl | target-csv
cat Activity.jsonl | target-snowflake --config=.secrets/target-config.json
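Before piping the file into a target, it can help to check what the tap actually wrote. Singer output is one JSON message per line, each with a "type" field such as SCHEMA, RECORD or STATE. The sketch below counts messages by type; the sample lines and the "Activity" stream name are illustrative assumptions, not real tap output.

```python
import json
from collections import Counter


def count_message_types(lines):
    """Count Singer messages by their "type" field, skipping blank lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        counts[json.loads(line)["type"]] += 1
    return counts


# Illustrative sample; real output would come from Activity.jsonl.
sample = [
    '{"type": "SCHEMA", "stream": "Activity", "schema": {"properties": {}}, "key_properties": []}',
    '{"type": "RECORD", "stream": "Activity", "record": {"Id": "1"}}',
    '{"type": "STATE", "value": {}}',
]

print(dict(count_message_types(sample)))
```

To inspect a real run, pass an open file handle instead of the sample list, for example count_message_types(open("Activity.jsonl")).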

Create and Run Tests

Create tests within the tap-powerbi-metadata/tests subfolder and then run:

poetry run pytest

You can also test the tap-powerbi-metadata CLI interface directly using poetry run:

poetry run tap-powerbi-metadata --help

tap-powerbi-metadata's People

Contributors

jtimeus-slalom, aaronsteers

tap-powerbi-metadata's Issues

Use of ComplexType results in "TypeError: 'bool' object is not iterable"

Whenever I use the ComplexType and then pipe the activity to target-csv, I get a TypeError: 'bool' object is not iterable error:

Traceback (most recent call last):
  File "c:\python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\john.timeus\.local\bin\target-csv.exe\__main__.py", line 7, in <module>
  File "c:\users\john.timeus\.local\pipx\venvs\target-csv\lib\site-packages\target_csv.py", line 141, in main
    state = persist_messages(config.get('delimiter', ','),
  File "c:\users\john.timeus\.local\pipx\venvs\target-csv\lib\site-packages\target_csv.py", line 59, in persist_messages
    validators[o['stream']].validate(o['record'])
  File "c:\users\john.timeus\.local\pipx\venvs\target-csv\lib\site-packages\jsonschema\validators.py", line 129, in validate
    for error in self.iter_errors(*args, **kwargs):
  File "c:\users\john.timeus\.local\pipx\venvs\target-csv\lib\site-packages\jsonschema\validators.py", line 105, in iter_errors
    for error in errors:
  File "c:\users\john.timeus\.local\pipx\venvs\target-csv\lib\site-packages\jsonschema\_validators.py", line 300, in properties_draft4
    for error in validator.descend(
  File "c:\users\john.timeus\.local\pipx\venvs\target-csv\lib\site-packages\jsonschema\validators.py", line 121, in descend
    for error in self.iter_errors(instance, schema):
  File "c:\users\john.timeus\.local\pipx\venvs\target-csv\lib\site-packages\jsonschema\validators.py", line 105, in iter_errors
    for error in errors:
  File "c:\users\john.timeus\.local\pipx\venvs\target-csv\lib\site-packages\jsonschema\_validators.py", line 312, in required_draft4
    for property in required:
TypeError: 'bool' object is not iterable
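The last frame of the traceback is a loop over the schema's "required" value, so that value must have been a boolean rather than a list of property names. The sketch below reproduces only that failure mode in plain Python. The missing_required helper is hypothetical and merely mirrors the loop shown in the traceback. Whether ComplexType is what emits a boolean "required" is an assumption; the issue does not show the generated schema.

```python
def missing_required(instance, required):
    """Mirror the draft 4 'required' check: iterate over the required names."""
    return [name for name in required if name not in instance]


record = {"Id": "abc"}

# Draft 4 form: "required" is a list of property names.
print(missing_required(record, ["Id", "Name"]))  # ['Name']

# Boolean form: iterating over True raises the error seen in the traceback.
try:
    missing_required(record, True)
except TypeError as exc:
    print(exc)  # 'bool' object is not iterable
```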

Clean up README

The repo still has the boilerplate README; update it to make sense for this tap.

Code efficiency analysis

As discussed, after we got everything working, I wanted to perform this quick analysis of streams.py and tap.py for any "extra" or unhelpful code snippets which could possibly be moved into the SDK itself. Excepting the final change proposed here in #12, I didn't see anything at all, really, which I would remove.

Here's a quick overview of line counts:

  • tap.py (total 47 lines including whitespace):
    • Docstring and imports: 21 lines
    • CLI boilerplate: 3 lines (including code comment)
    • Tap class definition: 15 lines
    • Constants declarations: 5 lines
  • streams.py (total 254 lines including whitespace):
    • Docstring and imports: 25 lines
    • Authentication: 20 lines (1 Authenticator class definition and 1 authenticator Stream class property)
    • Custom Source API Logic: 53 lines total over 3 areas:
      • get_next_page_token() for looping and pagination logic: 29 lines
      • get_url_params() for defining the interface with the REST API: 19 lines
      • parse_response() to translate the REST response into 'records': 5 lines
    • Stream schema definition, which specifies all field names and types: 139 lines for approximately 81 fields (including some commented-out WIP)

Array of Complex Types throws error

When trying to add an Array of Complex Types to my stream, I got the following error:

TypeError: 'ComplexType' object is not callable

This is an example record:

{"Id": "00000000-0000-0000-0000-000000000000", "Datasets": [{"DatasetId": "00000000-0000-0000-0000-000000000000", "DatasetName": "Dataset Final"}]}

Below is the code that I tried in the PropertiesList:

ArrayType("Datasets",ComplexType(
             StringType("DatasetId"),
             StringType("DatasetName")
            )
        ),
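Whatever helper syntax the SDK ends up accepting, the JSON Schema for the "Datasets" field of the example record would presumably be an array whose items are objects with two string properties. The sketch below writes that schema as a plain dict and checks the example record against it with a small hand-rolled checker. The conforms helper is hypothetical, covers only the three types used here, and is not part of the SDK.

```python
import json

record = json.loads(
    '{"Id": "00000000-0000-0000-0000-000000000000", '
    '"Datasets": [{"DatasetId": "00000000-0000-0000-0000-000000000000", '
    '"DatasetName": "Dataset Final"}]}'
)

# Plain JSON Schema for the "Datasets" field: an array of objects.
datasets_schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "DatasetId": {"type": "string"},
            "DatasetName": {"type": "string"},
        },
    },
}


def conforms(value, schema):
    """Tiny checker for the string/array/object types used in this example."""
    kind = schema.get("type")
    if kind == "string":
        return isinstance(value, str)
    if kind == "array":
        return isinstance(value, list) and all(
            conforms(item, schema["items"]) for item in value
        )
    if kind == "object":
        return isinstance(value, dict) and all(
            conforms(value[name], sub)
            for name, sub in schema["properties"].items()
            if name in value
        )
    return False


print(conforms(record["Datasets"], datasets_schema))
```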

Add tap to pypi

Use Poetry and CI/CD to automate builds and publish them to PyPI.
