Exohood-Workteam-Slack

License: MIT

Singer tap that extracts data from a Slack workspace and produces JSON-formatted data following the Singer spec.

This is an exohood-workteam-slack compatible tap connector.

How to use it

The recommended method of running this tap is to use it from exohood-workteam-slack. When running it from exohood-workteam-slack, you don't need to configure this tap with JSON files, and most things are automated. Please check the related documentation at Tap Slack.

If you want to run this Singer Tap independently please read further.

Installation

It is highly recommended to install tap-slack in its own isolated virtual environment. For example:

python3 -m venv ~/.venvs/tap-slack
source ~/.venvs/tap-slack/bin/activate
pip3 install Exohood-Workteam-Slack
deactivate

Setup

The tap requires a Slack API token to interact with your Slack workspace. You can obtain a token for a single workspace by creating a new Slack App in your workspace and assigning it the relevant scopes. As of right now, the minimum required scopes for this App are:

  • channels:history
  • channels:join
  • channels:read
  • files:read
  • groups:read
  • links:read
  • reactions:read
  • remote_files:read
  • remote_files:write
  • team:read
  • usergroups:read
  • users.profile:read
  • users:read
  • users:read.email (only required if you want to extract user emails as well)

Create a config file containing the API token and a start date, e.g.:

{
  "token":"xxxx",
  "start_date":"2020-05-01T00:00:00"
}
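As a minimal sketch, the config file above could be checked and written from Python before running the tap. The helper names `validate_config` and `write_config` are hypothetical and not part of the tap; the only keys assumed to be required are the two shown in the example.

```python
import json
from datetime import datetime
from pathlib import Path

# The two keys shown in the example config above.
REQUIRED_KEYS = ("token", "start_date")


def validate_config(config: dict) -> dict:
    """Raise ValueError if a required key is missing or start_date is not ISO 8601."""
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"missing config keys: {', '.join(missing)}")
    # fromisoformat accepts the "2020-05-01T00:00:00" form used in the example
    # and raises ValueError for anything it cannot parse.
    datetime.fromisoformat(config["start_date"])
    return config


def write_config(config: dict, path: str) -> None:
    """Validate the config, then write it to `path` as JSON."""
    Path(path).write_text(json.dumps(validate_config(config), indent=2))
```

Calling `write_config({"token": "xxxx", "start_date": "2020-05-01T00:00:00"}, "slack.config.json")` would produce a file equivalent to the example.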

Private channels

Optionally, you can also specify whether you want to sync private channels or not by adding the following to the config:

    "private_channels":"false"

By default, private channels will be synced.

Joining Public Channels

By adding the following to your config file, you can have the tap auto-join all public channels in your organization.

"join_public_channels":"true"

If you do not elect to have the tap join all public channels, you must invite the bot to every channel you wish to sync.

Specify channels to sync

By default, the tap will sync all channels it has been invited to. However, you can limit the tap to sync only the channels you specify by adding their IDs to the config file, e.g.:

"channels":[
    "abc123",
    "def345"
  ]

Note that this must be the channel ID, not the name, as recommended by the Slack API. To get the ID for a channel, either use the Slack API or find it in the channel's URL.
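For the URL route, a small sketch like the following could pull the ID out of a channel link. It assumes the two URL shapes commonly seen in Slack (`.../archives/<CHANNEL_ID>` and `app.slack.com/client/<TEAM_ID>/<CHANNEL_ID>`) and that channel IDs start with `C`, `G`, or `D`; the function name `channel_id_from_url` is hypothetical.

```python
import re
from typing import Optional

# Assumption: the channel ID follows "/archives/" or "/client/<TEAM_ID>/" and is
# an uppercase alphanumeric string starting with C, G, or D.
CHANNEL_ID_PATTERN = re.compile(r"/(?:archives|client/[A-Z0-9]+)/([CGD][A-Z0-9]+)")


def channel_id_from_url(url: str) -> Optional[str]:
    """Return the channel ID embedded in a Slack URL, or None if none is found."""
    match = CHANNEL_ID_PATTERN.search(url)
    return match.group(1) if match else None
```

The returned value is what goes into the `channels` list in the config.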

Archived Channels

You can control whether or not the tap will sync archived channels by including the following in the tap config:

  "exclude_archived": "false"

It's important to note that a bot CANNOT join an archived channel, so unless the bot was added to the channel before it was archived, it will not be able to sync the data from that channel.
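The optional settings described so far (`private_channels`, `join_public_channels`, `channels`, `exclude_archived`) can be combined in one config. The sketch below assembles such a config in Python; `build_config` is a hypothetical helper, and it deliberately omits any option left as `None` so that the tap's own defaults apply. Boolean options are written as the strings `"true"` and `"false"`, matching the examples above.

```python
from typing import Iterable, Optional


def build_config(
    token: str,
    start_date: str,
    *,
    private_channels: Optional[bool] = None,
    join_public_channels: Optional[bool] = None,
    exclude_archived: Optional[bool] = None,
    channels: Optional[Iterable[str]] = None,
) -> dict:
    """Assemble a tap config dict, leaving out options that were not set."""
    config = {"token": token, "start_date": start_date}
    flags = {
        "private_channels": private_channels,
        "join_public_channels": join_public_channels,
        "exclude_archived": exclude_archived,
    }
    for key, value in flags.items():
        if value is not None:
            # The examples above use string values rather than JSON booleans.
            config[key] = "true" if value else "false"
    if channels:
        config["channels"] = list(channels)
    return config
```

For instance, `build_config("xxxx", "2020-05-01T00:00:00", private_channels=False, channels=["abc123", "def345"])` yields a dict that can be dumped to the config file with `json.dumps`.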

Date Windowing

Due to the potentially high volume of data when syncing certain streams (messages, files, threads), this tap implements date windowing based on a configuration parameter.

For example, including

"date_window_size": "5"

in the tap config will cause the tap to sync 5 days of data per request for applicable streams. If no value is defined, the default is to window requests 7 days at a time.
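To illustrate the idea of date windowing, here is a minimal sketch that splits a date range into consecutive windows of a given size. It is not the tap's actual implementation: the function name `date_windows` and the half-open window boundaries are assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def date_windows(
    start: datetime, end: datetime, window_days: int = 7
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive (window_start, window_end) pairs covering [start, end)."""
    if window_days < 1:
        raise ValueError("window_days must be at least 1")
    step = timedelta(days=window_days)
    window_start = start
    while window_start < end:
        # The last window is clipped so it never runs past `end`.
        window_end = min(window_start + step, end)
        yield window_start, window_end
        window_start = window_end
```

With a window size of 5, the range from 2020-05-01 to 2020-05-13 is split into three windows: May 1 to May 6, May 6 to May 11, and May 11 to May 13. Each window would correspond to one batch of requests.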

Usage

It is recommended to follow Singer best practices when running taps either on their own or with a Singer target.

In practice, it will look something like the following:

~/.venvs/tap-slack/bin/tap-slack --config slack.config.json --catalog catalog.json | ~/.venvs/target-stitch/bin/target-stitch --config stitch.config.json
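When running the tap on its own, it can help to inspect what it emits before wiring up a target. Per the Singer spec, a tap writes one JSON message per line, each with a `type` of `SCHEMA`, `RECORD`, or `STATE`. The sketch below counts `RECORD` messages per stream; `count_records` is a hypothetical helper, not part of the tap.

```python
import json
from collections import Counter
from typing import Iterable


def count_records(lines: Iterable[str]) -> Counter:
    """Count Singer RECORD messages per stream from line-delimited JSON."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        # SCHEMA and STATE messages are ignored; only RECORD messages carry rows.
        if message.get("type") == "RECORD":
            counts[message["stream"]] += 1
    return counts
```

Passing `sys.stdin` to `count_records` in a small script lets the tap's output be piped into it in place of a target.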

Replication

The Slack Conversations API does not natively store last-updated timestamp information about a Conversation. In addition, Conversation records are mutable. Thus, tap-slack requires a FULL_TABLE replication strategy to ensure the most up-to-date data is replicated when replicating the following Streams:

  • Channels (Conversations)
  • Channel Members (Conversation Members)

The Users stream does store information about when a User record was last updated, so tap-slack uses that timestamp as a bookmark value and prefers using an INCREMENTAL replication strategy.
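The difference between the two strategies can be sketched as follows. With FULL_TABLE, every record is emitted on every run; with INCREMENTAL, only records newer than a saved bookmark are emitted, and the bookmark then advances. This is an illustration rather than the tap's code: the function name `incremental_sync`, the use of an integer `updated` field, and the strict greater-than comparison are assumptions.

```python
from typing import Iterable, List, Tuple


def incremental_sync(records: Iterable[dict], bookmark: int) -> Tuple[List[dict], int]:
    """Return the records updated after `bookmark` and the new bookmark value."""
    # Assumption: each record carries an integer `updated` timestamp.
    fresh = [record for record in records if record["updated"] > bookmark]
    # If nothing is new, the bookmark stays where it was.
    new_bookmark = max((record["updated"] for record in fresh), default=bookmark)
    return fresh, new_bookmark
```

On the next run, the returned bookmark is passed back in, so unchanged User records are skipped.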

Table Schemas

Channels (Conversations)

  • Table Name: channels
  • Description:
  • Primary Key Column: id
  • Replication Strategy: FULL_TABLE
  • API Documentation: Link

Channel Members (Conversation Members)

  • Table Name: channel_members
  • Description:
  • Primary Key Columns: channel_id, user_id
  • Replication Strategy: FULL_TABLE
  • API Documentation: Link

Messages (Conversation History)

  • Table Name: messages
  • Description:
  • Primary Key Columns: channel_id, ts
  • Replication Strategy: INCREMENTAL
  • API Documentation: Link

Users

  • Table Name: users
  • Description:
  • Primary Key Column: id
  • Replication Strategy: INCREMENTAL
  • API Documentation: Link

Threads (Conversation Replies)

  • Table Name: threads
  • Description:
  • Primary Key Columns: channel_id, ts, thread_ts
  • Replication Strategy: FULL_TABLE for each parent message
  • API Documentation: Link

User Groups

  • Table Name: user_groups
  • Description:
  • Primary Key Column: id
  • Replication Strategy: FULL_TABLE
  • API Documentation: Link

Files

  • Table Name: files
  • Description:
  • Primary Key Column: id
  • Replication Strategy: INCREMENTAL query filtered using date windows and lookback window
  • API Documentation: Link

Remote Files

  • Table Name: remote_files
  • Description:
  • Primary Key Column: id
  • Replication Strategy: INCREMENTAL query filtered using date windows and lookback window
  • API Documentation: Link

Testing the Tap

Install test dependencies:

make venv

To run tests:

make unit_test

Linting

Install test dependencies:

make venv

To run linter:

make pylint

Contributors

elakuromi, shiraotsuki, xpunk7
