
chrome-snowplow-inspector's Introduction




As of January 8, 2024, Snowplow is introducing the Snowplow Limited Use License Agreement, and we will be releasing new versions of our core behavioral data pipeline technology under this license.

Our mission to empower everyone to own their first-party customer behavioral data remains the same. We value all of our users and remain dedicated to helping our community use Snowplow in the optimal capacity that fits their business goals and needs.

We reflect on our Snowplow origins and provide more information about these changes in our blog post here → https://eu1.hubs.ly/H06QJZw0


Overview

Snowplow is a developer-first engine for collecting behavioral data.

Thousands of organizations like Burberry, Strava, and Auto Trader rely on Snowplow to collect, manage, and operationalize real-time event data from their central data platform to uncover deeper customer journey insights, predict customer behaviors, deliver differentiated customer experiences, and detect fraudulent activities.


Why Snowplow?

  • 🏔️ “Glass-box” technical architecture capable of processing billions of events per day.
  • 🛠️ Over 20 SDKs to collect data from web, mobile, server-side, and other sources.
  • ✅ A unique approach based on schemas and validation ensures your data is as clean as possible.
  • 🪄 Over 15 enrichments to get the most out of your data.
  • 🏭 Stream data to your data warehouse/lakehouse or SaaS destinations of choice — Snowplow fits nicely within the Modern Data Stack.

➡ Where to start? ⬅️

  • Snowplow Community Edition: equips you with everything you need to start creating behavioral data in a high-fidelity, machine-readable way. Head over to the Quick Start Guide to set things up.
  • Snowplow Behavioral Data Platform: looking for an enterprise solution with a console, APIs, data governance, and workflow tooling? The Behavioral Data Platform is our managed service that runs in your AWS, Azure or GCP cloud. Book a demo.

The documentation is a great place to learn more.

Would you rather dive into the code? Then you are already in the right place!


Snowplow technology 101

Snowplow architecture

The repository structure follows the conceptual architecture of Snowplow, which consists of six loosely-coupled sub-systems connected by five standardized data protocols/formats.

To briefly explain these six sub-systems:

  • Trackers fire Snowplow events. Currently we have 15 trackers, covering web, mobile, desktop, server and IoT.
  • Collector receives Snowplow events from trackers. Currently we have one official collector implementation with different sinks: Amazon Kinesis, Google PubSub, Amazon SQS, Apache Kafka and NSQ.
  • Enrich cleans up the raw Snowplow events, enriches them and puts them into storage. Currently we have several implementations, built for different environments (GCP, AWS, Apache Kafka) and one core library.
  • Storage is where the Snowplow events live. Currently we store the Snowplow events in a flat file structure on S3, and in the Redshift, Postgres, Snowflake and BigQuery databases.
  • Data modeling is where event-level data is joined with other data sets and aggregated into smaller data sets, and business logic is applied. This produces a clean set of tables which make it easier to perform analysis on the data. We officially support data models for Redshift, Snowflake and BigQuery.
  • Analytics are performed on the Snowplow events or on the aggregate tables.

For more information on the current Snowplow architecture, please see the Technical architecture.


About this repository

This repository is an umbrella repository for all loosely-coupled Snowplow components and is updated on each component release.

Since June 2020, all components have been extracted into their dedicated repositories (more info here) and this repository serves as an entry point for Snowplow users and as a historical artifact.

Components that have been extracted to their own repository are still here as git submodules.

Trackers

A full list of supported trackers can be found on our documentation site. Popular trackers and use cases include:

  • Web: JavaScript, AMP, React Native, Flutter
  • Mobile: Android, iOS, React Native
  • Gaming: Unity, C++, Lua
  • TV: Roku, iOS, Android
  • Desktop & Server: Command line, .NET, Go, Java, Node.js, PHP, Python, Ruby, Scala, C++, Rust, Lua

Loaders

Iglu

Data modeling

Web

Mobile

Media

Retail

Testing

Parsing enriched event


Community

We want to make it super easy for Snowplow users and contributors to talk to us and connect with one another, to share ideas, solve problems and help make Snowplow awesome. Join the conversation:

  • Meetups. Don’t miss your chance to talk to us in person. We are often on the move with meetups in Amsterdam, Berlin, Boston, London, and more.
  • Discourse. Our forum for all Snowplow users: engineers setting up Snowplow, data modelers structuring the data, and data consumers building insights. You can find guides, recipes, questions and answers from Snowplow users and the Snowplow team. All questions and contributions are welcome!
  • Twitter. Follow @Snowplow for official news and @SnowplowLabs for engineering-heavy conversations and release announcements.
  • GitHub. If you spot a bug, please raise an issue in the GitHub repository of the component in question. Likewise, if you have developed a cool new feature or an improvement, please open a pull request, we’ll be glad to integrate it in the codebase! For brainstorming a potential new feature, Discourse is the best place to start.
  • Email. If you want to talk to Snowplow directly, email is the easiest way. Get in touch at [email protected].

Copyright and license

Snowplow is copyright 2012-2023 Snowplow Analytics Ltd.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this software except in compliance with the License.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

chrome-snowplow-inspector's People

Contributors

igneel64, jethron, matus-tomlein, miike, vineshtv


chrome-snowplow-inspector's Issues

Simpler support for local schemas

The new version of Micro allows a user to mount a volume from their local filesystem to the Docker container which then uses an embedded version of Iglu to retrieve schemas.

Currently we support adding schemas from the local filesystem and storing them in Chrome storage, but it would be ideal if we could use the File System Access API instead, which would handle file browsing, persistence, etc. We'll probably need to consider how we refresh/reload schemas when they change on the local disk, to make sure we're always using the latest versions.

Relevant Discourse question: https://discourse.snowplowanalytics.com/t/can-chromeextension-resolve-schemas-from-micro/6532?u=mike
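As a starting point, here is a sketch of how files in a mounted directory could be filtered down to schema candidates. The helper name and the directory-walking outline are illustrative, not the extension's actual code:

```typescript
// Hypothetical helper: decide which paths inside a mounted schema
// directory look like Iglu schema versions (.../jsonschema/MODEL-REV-ADD).
function looksLikeSchemaFile(path: string): boolean {
  return /\/jsonschema\/[0-9]+-[0-9]+-[0-9]+$/.test(path);
}

// The browser side (not runnable outside Chrome) would be roughly:
//   const dir = await window.showDirectoryPicker();
//   walk dir.values() recursively and load entries passing the check above.
```

Re-walking the directory on each validation pass (rather than caching handles) would also address the refresh/reload concern, at some performance cost.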

Extension options iframe not working

For some reason the options dialogue does not work for me (Chrome version 96). I can see there is an iframe which doesn't load for me.
Is there any other way to add custom Iglu repositories?

Screenshot 2021-12-17 at 15 08 05

Static repo problem

Hi,

there seems to be another problem with the static repo. I'm never able to connect, even when importing from my resolver file. I've tried to debug it, and from what I can grasp, the extension uses the wrong path for looking up schemas.

My settings:
image

As you can see here, the extension tries to look for schemas in storage.googleapis.com/schemas instead of storage.googleapis.com/my-repo/schemas:
image

Maybe this is of help to find the culprit?

Cheers
Andreas

Cannot save local schemas

I want to debug a custom event data structure, but the Snowplow Inspector of course does not recognize this event. I found I could add my schema in a local registry; however, whenever I tried to add my schema and press Save Schemas, nothing happened.

I tried to debug it myself, slightly hindered by the minified code. In the end I found the following code:

    update(e) {
        return new Promise((t=>chrome.storage.local.get("localSchemas", (({localSchemas: r})=>{
            if (r && "string" == typeof r) {

This if-statement always fails, as "localSchemas" does not seem to be set, and r is undefined as a result. I manually ran

chrome.storage.local.set({ localSchemas: "{}" })

And after doing this, saving local schemas actually works. I'm not sure if my installation just broke somehow or if others also have this issue. If it's the latter, hopefully this info helps you fix the problem.
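Based on the report above, a minimal sketch of a defensive read that would avoid the uninitialized-storage case (helper name hypothetical, not the extension's real code):

```typescript
// Hypothetical helper: normalize whatever is in chrome.storage into a
// usable JSON string, falling back to "{}" when the key was never set.
function normalizeLocalSchemas(stored: unknown): string {
  if (typeof stored === "string" && stored.length > 0) return stored;
  return "{}";
}

// The update() shown above could then read storage defensively:
// chrome.storage.local.get("localSchemas", ({ localSchemas }) => {
//   const schemas = JSON.parse(normalizeLocalSchemas(localSchemas));
//   ...
// });
```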

Support firing test events

Build an interface to generate test events.

  • Take a collector URL (prod, mini, micro, whatever)
  • Allow selection of the different hit-types (Pageview, Struct Event, Self Describing Event, etc)
  • Generate a form for user to fill out any details.
  • Grab browser information from the currently running browser as defaults
  • Allow selection of schemas for SDE/contexts (populate from the current local schemas/schema cache, or enter a new Iglu URI), and generate an appropriate form for the custom data
  • Import existing JSON for SDE/contexts
  • Save a history of previous test events, so they can be replayed to test schema changes
  • Allow saving real production events to history so they can be used as test cases

Font size in latest version

For some reason the latest build uses too large a font in the inspector; as you can see in the screenshot, you can barely fit one event on the screen. I tried reinstalling the extension.

2018-07-16 09_52_02-window

Indicate tab being debugged

If the debugger is actively monitoring a tab, the extension badge should update in the main browser window when that tab is active so it's easier to know which tab is firing the events. This is useful if you're running the debugger in a separate window and not docked in the page.

Feature request - customize display of events

I'd like to customize how events are displayed, so that it displays what's important to me.

It's mildly annoying that the top third of my display is taken up with fields I don't need to read, in an excessively large font. There seems to be no way to collapse this top section, so maybe adding that would be a quick win. Even nicer if I could turn off fields I don't care about.

The other issue I have is that my structured events' se_pr field is a stringified JSON object, and it would be much easier to read if there was a way to parse it and show the object instead of the string.

I figured I might be able to solve both problems with a custom schema. But can't work out how to make a local schema that overrides the one specified in the event. So that would also be really useful. I have no idea whether a custom schema will let me parse the se_pr field into separate fields though.

Show activity when new events happen

See if there's a way to indicate that a new event has been seen since you last viewed the Inspector page.

Maybe highlight the devtools tab somehow if you're monitoring other requests in e.g. the Network tab or Elements tab.

Maybe make the badge be more useful and show the most recently seen event in the active tab, so you can see if some tracking worked at a glance.

Read the network user ID from the response of the first non-anonymous request

It is a bit confusing not to see the network user id for events sent in the first request to the collector after the user gave consent for tracking – users are looking specifically at this to see if the user identification is working as expected.

Perhaps we could read the ID from the set-cookie header in the collector response?

Happy to contribute a PR for this if it is desirable!
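A sketch of the suggested approach, assuming the collector's default cookie name sp; a deployment with a renamed cookie would need this to be configurable:

```typescript
// Sketch: extract the network user ID from a collector response's
// Set-Cookie header. Assumes the default cookie name "sp".
function networkUserIdFromSetCookie(header: string): string | null {
  const match = header.match(/(?:^|,\s*)sp=([^;,\s]+)/);
  return match ? match[1] : null;
}
```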

Valid OPTIONS requests are marked as invalid

Hi,

I am sending Snowplow events with POST requests. The Snowplow JS tracker sets the Content-Type to application/json. This causes the browser to send a pre-flight OPTIONS request to the tracker to ensure that this Content-Type is accepted (CORS).

This is all fine, however the Snowplow Inspector marks the pre-flight OPTIONS request as a bad event. This means that for every event that fires successfully, there is a bad event preceding it.

image
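One way to address this would be to skip pre-flight requests before payload parsing, so an OPTIONS request is never scored as an (empty, hence bad) event. The RequestEntry shape below is illustrative, not the extension's real type:

```typescript
// Sketch: filter out CORS pre-flight requests before validation.
interface RequestEntry {
  method: string;
  url: string;
}

function isPreflight(req: RequestEntry): boolean {
  return req.method.toUpperCase() === "OPTIONS";
}

// Usage: entries.filter((e) => !isPreflight(e)) before validation.
```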

Filter beacons by collector or app_id

Publishing sites are starting to have a lot of different Snowplow collectors on them and it gets noisy not being able to differentiate.

  • Add a filter option to select only matching collector strings.
  • Add a filter option to select only matching app_id strings.
  • Colour code the pixels for specific collector+app_id combinations in a deterministic way? Or possibly have two colours on each row, one for collector and one for app_id?
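The deterministic colour-coding idea can be sketched as a stable hash from the collector + app_id pair to a hue, so the same combination always renders the same colour across events and sessions:

```typescript
// Sketch: deterministic hue per collector + app_id combination.
function stableHue(collector: string, appId: string): number {
  const key = `${collector}\u0000${appId}`;
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) >>> 0;
  }
  return hash % 360; // e.g. render as hsl(<hue>, 70%, 50%)
}
```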

Inspector isn't working anymore

I've been using this amazing tool for several months now and as of today it doesn't work anymore. I visited 2 websites that have properly setup JS tracking codes and both working fine (checked in our system), but the inspector doesn't output anything, it's just blank now.

Chrome version: Version 65.0.3325.146 (Official Build) (64-bit)
Inspector version: 0.2.2

chrome plugin stopped working

The Chrome plugin stopped working and I can't find a reason why. No matter how many Snowplow events are triggered, nothing shows up in the Snowplow Inspector. I reinstalled both Chrome and the extension with no success. The tool is amazing and I really want it back ;)

screen shot 2017-07-17 at 12 42 20 Amazing screenshot, where you cannot see anything...

Chrome Version 59.0.3071.115 (Official Build) (64-bit)
Using Snowplow Inspector 0.1.4

Clear Events doesn't clear selected event

[Clear Events] button doesn't clear the selected event on the right-hand side. I would expect it to clear everything.

image

Before clicking [Clear Events]

image

After clicking [Clear Events]

Import Bad Rows

It would be good to be able to copy JSONL & base64 payloads in from the bad rows in S3/Kibana and load up all the beacon information and leverage the schema validation features to ease debugging.
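A sketch of the decoding step: once a bad row's base64 payload is decoded (e.g. with atob in the browser), the result is a tracker-protocol querystring that can be mapped back to fields for the existing beacon view. The function name is illustrative:

```typescript
// Sketch: parse a decoded bad-row payload (a tracker-protocol
// querystring such as "e=pv&p=web") back into its fields.
function parseBadRowQuerystring(query: string): Record<string, string> {
  const fields: Record<string, string> = {};
  for (const pair of query.split("&")) {
    const [key, value = ""] = pair.split("=");
    fields[key] = decodeURIComponent(value);
  }
  return fields;
}
```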

Copy dataLayer event as Snowplow schema

It'd be nice to be able to copy a dataLayer event (or JSON example) and generate a schema from it. Quicktype seems to do a reasonably good job of this and can merge similar classes which is quite useful so it can generalise a single schema from multiple events.

Stream rows from ElasticSearch

The Import Bad Rows feature works if you copy the bad values from e.g. Kibana, but Kibana doesn't really provide any good way to copy in bulk. Since the bad row format isn't very searchable it can make it hard to find what you're looking for in a sea of bad rows.

Now that CORS shouldn't be a problem any more & talking to newer Snowplow Mini instances is possible, it would be good to read bad (maybe even good?) rows straight from ElasticSearch. This should also help for testing non-web platforms since you'll be able to see events from a whole bunch of devices rather than just the current browser in real time and get schema validation from anything sent to Mini.

Dark Theme Support

The styles contrast a lot with the rest of DevTools if you have the Dark theme enabled. Should detect the theme used and adjust the palettes accordingly.

This will probably involve a revamp of how we bundle in Bulma since we'll actually be using variables.

Assets will also need dark-friendly versions.

Inspector has stopped working with old version of SDK

Hey there,

I am currently using an old version of the JS SDK in production (v2.4.3) and have been using this Chrome extension to test events. Last week, I noticed that this extension stopped displaying any events but confirmed that data was still being tracked.

I noticed that a new version of this extension had been released (v0.2.16) so decided to see if I built v0.2.15 if the extension would begin working again. Luckily, v0.2.15 continues to work so I do have a work-around for now.

Let me know if there are any plans to support older versions of the SDK or if there's any more information that I can send over to help debug. At the very least, I wanted to make you aware so it's documented.

Thanks!

Can not get custom schema validation to work

Hi

I guess I am doing something wrong, but I cannot get this to work. I have added an S3 bucket URL that looks something like this:

https://s3-eu-west-1.amazonaws.com/penfold45-iglu-schemas/ (not real url)

Then in here I have the following directory structure.

schemas/com.penfold45/random_name_here/jsonschema/1-0-0

Then my Snowplow code is set up correctly to post data based on this schema (I am getting the data into my database).

But the inspector always says "Unrecognised"

The bucket and JSON file are public and are definitely accessible when I go directly to the URL.

Any thoughts?

UTF-8 Encoding

Could we update to use UTF-8, or have the option? Our French characters show up encoded, which causes people to think they are captured that way. However, our database shows them correctly.

Add support for the Data Structures API

We should add support for the Data Structures API which will require configuration of:

  • organization_id
  • auth_token

to fetch schemas and validate instances against these schemas.

https://docs.snowplowanalytics.com/docs/understanding-tracking-design/managing-data-structures-via-the-api/
https://console.snowplowanalytics.com/api/schemas/docs/index.html?url=/api/schemas/docs/docs.yaml#/

Ability to retrieve a single schema version using GET /organizations/{organizationId}/schemas/{schemaHash}/versions/{versionNumber}
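A sketch of the endpoint construction from the path above. The base URL is an assumption taken from the linked docs; verify it against the current console API before relying on it:

```typescript
// Sketch: build the versioned-schema endpoint for the Data Structures API.
function schemaVersionUrl(
  organizationId: string,
  schemaHash: string,
  versionNumber: string,
  base = "https://console.snowplowanalytics.com/api/schemas"
): string {
  return `${base}/organizations/${organizationId}/schemas/${schemaHash}/versions/${versionNumber}`;
}

// The request itself would attach the auth_token, e.g.:
// fetch(schemaVersionUrl(orgId, hash, "1-0-0"),
//       { headers: { Authorization: `Bearer ${authToken}` } });
```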

Remote Debugging for other platforms

The extension already seems to work OK using the built-in Remote Debugging feature in DevTools; this way you can test on (Android) mobile devices, but only for apps that support it (pretty much Chrome, but potentially other WebViews). But this doesn't help for native apps or trackers on other platforms.

So we basically want some way to intercept arbitrary traffic (ala mitmproxy/Fiddler/Charles) to sniff the requests.

For a first round, it looks like Charles at least can export an entire session as a HAR file. Since this is the same format Chrome gives us, we can already parse that, so we just need an import capability.

But a real-time solution would be better, though I suspect much harder.

Field ti_nm is specified as an unrecognised event in Snowplow Inspector

I was QA'ing some tags here and noticed the name was missing from a transaction item event.

Turns out the field was unrecognised by this inspector:

Screen Shot 2020-03-12 at 09 50 33

To be fair, this is also not specified in Snowplow's tracker protocol: https://github.com/snowplow/snowplow/wiki/snowplow-tracker-protocol#352-transaction-item-parameters

We're running 2.12.0 of the JS tracker here and it's pushing ti_nm for the field, which is showing up in Redshift/BigQuery here.

Should I raise this on the Snowplow Github too?

Change Order of Beacon Information

@kingo55 suggested at the Snowplow Discourse:

Event-specific information should float to the top of the details column (Self-desc JSON, Struct fields, page titles etc) ahead of more generic details (user IDs, time, app ID etc)

Context validity doesn't update correctly

Hi!

I often have the problem that the event list shows me invalid events, but all the contexts inside seem to be valid. You can see the problem in the first picture. The product context should be invalid as it is missing a category key, but it is still green. Only after I click on one of the test suite tests and then go back to the invalid event does the context show the error correctly (second picture):

image

image

This goes both ways. After I have fixed an event context, the next event will show 'invalid' until I switch to a test and back.

Cheers
Andreas

Can't get static repo to work

Hi,

I've a static repo on Google Cloud Storage, which works as expected for Snowplow. The whole bucket has public access enabled.
However, I cannot get it to work in the extension. I tried it manually and via resolver import, but with no luck. The first time I set it up, Chrome asked for additional permissions to allow the extension to connect to the bucket location, which I accepted. I can also see that https://storage.googleapis.com/* is added to the exceptions correctly.

Do I need some specific metadata to be set on my objects? Any other ideas, why this doesn't work?

Thanks for your help!
Andreas

Displaying Queued Events? (feature request)

Hi all!

We're batching POST requests in batches of 10 and then using a beacon config to send them. One of the issues with QAing this setup is that there's a lag between tracking the event and the event appearing in the Snowplow debug panel: since no network request has gone out yet, we need to wait until a multiple of 10 events has been tracked before we can debug. This often results in me sending a bunch of dummy events to hit that multiple-of-10 number.

I was thinking this feature could add events that are "queued" (perhaps in "outline" text formatting, a gray background, or some similar visual indication that they haven't been sent yet) to the events list on the left. I haven't developed much in Chrome extensions, but with a bit of preliminary digging it looks like this would mean overriding localstorage from a content script within the window, and then communicating changes to localstorage to the extension (where they'd then have to be parsed and displayed)

Then, when the event actually goes out in a network request, we can match its event ID and then mark the event as resolved by changing its styling to match events that have been sent

(I was actually thinking of taking a shot at implementation if I can budget some hours towards it this week)

Thanks - love the plugin! It's been a life-saver for testing our snowplow setup.

Snapshot versions of schemas

Hi Snowplow Extension Owners,

We would like the validator to support pre-release SNAPSHOT versions. Oftentimes, our engineers will want to develop their schemas alongside any frontend code they write. Until it is in production, the schema may not be finalized. We want to release SNAPSHOT versions into our schema repository and use the Snowplow extension to validate those schemas.

The format of the schema version would look something like 0-1-0-BRANCH-SNAPSHOT.

This probably means updating the Regex at https://github.com/snowflake-analytics/chrome-snowplow-inspector/blob/ec53e19c5233da5595a1f2a3218a0f58c92a184e/src/validator.ts#L11

const SCHEMA_PATTERN = /^iglu:([a-zA-Z0-9_.-]+)\/([a-zA-Z0-9_-]+)\/([a-zA-Z0-9_-]+)\/([1-9][0-9]*(?:-(?:0|[1-9][0-9]*)){2})$/;

to have (?:-[a-zA-Z0-9_-]+)? at the end to capture that SNAPSHOT tag.

const SCHEMA_PATTERN = /^iglu:([a-zA-Z0-9_.-]+)\/([a-zA-Z0-9_-]+)\/([a-zA-Z0-9_-]+)\/([1-9][0-9]*(?:-(?:0|[1-9][0-9]*)){2}(?:-[a-zA-Z0-9_-]+)?)$/;

Interested to hear other thoughts.
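For reference, the amended pattern proposed above can be exercised directly. One caveat worth noting: the existing version group requires a non-zero MODEL, so the 0-1-0-BRANCH-SNAPSHOT example would additionally need the leading [1-9] relaxed.

```typescript
// The amended pattern with the suggested (?:-[a-zA-Z0-9_-]+)? suffix group.
const SCHEMA_PATTERN =
  /^iglu:([a-zA-Z0-9_.-]+)\/([a-zA-Z0-9_-]+)\/([a-zA-Z0-9_-]+)\/([1-9][0-9]*(?:-(?:0|[1-9][0-9]*)){2}(?:-[a-zA-Z0-9_-]+)?)$/;
```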

How can we clear the extension's cache

We are trying to develop a new schema or make changes to existing ones. Since the extension and Snowplow don't support SNAPSHOT versions, we are reloading our incomplete version multiple times into our static Iglu website. We want to validate our changes using the extension, but it appears the schema is cached. How can we clear the cache manually so we can continue testing?

See how extension caches the schema here:
https://github.com/snowflake-analytics/chrome-snowplow-inspector/blob/master/src/validator.ts#L28

Invalid type error for nullable types

  "schema":"iglu:com.example/interaction/jsonschema/1-0-0",
  "data":{
    "id":"",
    "target_url":"http://google.com",
    "method":null,
    "direction":null,
    "type":"ABC"
  }
}

image

where

"method": {
  "type": ["string", "null"],
  "enum": ["onClick", "onScroll", "onHover"]
}

Import options incorrect

When I click "Import", "HAR File" is already checked. If I click "HAR File" it does nothing; however, if I click "Bad Rows" my file explorer pops up and allows me to import a HAR file. If I click "ElasticSearch" I get the "Bad Rows Import" pop-up. If I click the "Ngrok Tunnel" option, the ElasticSearch option pops up.

Add ability to 'Copy As' for a given event

Similar to the network debug panel it would be nice to have the ability to 'Copy As' an event.

Initially we could keep this simple (cURL support) by right clicking on an event and copying as cURL command which would generate the equivalent command to the users clipboard.

Longer term, it'd be nice to add support for code generation (i.e., Javascript 3).

Structured event values always null

I've confirmed that we are tracking category, action, label, property and value for structured events, but the Chrome extension always shows "null" for the value.

screen shot 2017-11-02 at 4 03 05 pm

screen shot 2017-11-02 at 4 03 32 pm

Firefox version of latest builds

In 2019, there were Firefox builds published to Github Releases.

Would it be possible to bring these back for the latest versions of the extension?

Bulk Export Events

A Google User asks via the Chrome Web Store:

Is there a way to export snowplow events data to excel or any other format.

At the moment this is only supported at an event-by-event level. We should support exporting all events in the timeline at once. This probably needs to be in different formats, like the existing Copy As functionality, but should also support more bulk formats (e.g. CSV) for loading directly into databases, etc.

This will probably add more chrome to the timeline and we can move the Clear Events button alongside it to be closer to the actual events which makes more sense as an affordance.
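The CSV half of this can be sketched as a simple serialiser over decoded events; the column list here is caller-supplied and illustrative:

```typescript
// Sketch: flatten decoded events to CSV for bulk export. Quoting
// handles commas and embedded double quotes.
function eventsToCsv(
  events: Array<Record<string, string>>,
  columns: string[]
): string {
  const quote = (v: string) => `"${v.replace(/"/g, '""')}"`;
  const header = columns.map(quote).join(",");
  const rows = events.map((e) =>
    columns.map((c) => quote(e[c] ?? "")).join(",")
  );
  return [header, ...rows].join("\n");
}
```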

Add support for custom Snowplow paths

Currently the extension supports /i and /tp2, but handles other paths inflexibly (only if they contain tv). Consider adding regex support for other paths in the options.
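A sketch of what a configurable matcher might look like; the default pattern list and the custom-pattern option are illustrative, not the extension's current behaviour:

```typescript
// Sketch: match collector paths against built-in defaults plus an
// optional user-supplied pattern from the options page (hypothetical).
const DEFAULT_PATHS = [/^\/i$/, /\/tp2$/];

function isCollectorPath(path: string, custom?: RegExp): boolean {
  if (DEFAULT_PATHS.some((p) => p.test(path))) return true;
  return custom ? custom.test(path) : false;
}
```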

Link click tracking

The extension works great for the most part, but it does not seem to detect native JS tracker link clicks (given link tracking is enabled) or custom link clicks fired as structured events. We see the events in our event pipeline but not in your extension.
Is there any guidance how to surface those in your chrome extension?

Variable type display

It would be very convenient to see the variable type, especially whether what I get is a number/string/boolean.

With inspector open, can't see any `requests` sent before Snowplow panel is opened

hi all! (again)

it can be inconvenient sometimes to have to have the "Snowplow" panel open in the chrome debugger before the extension begins listening to / populating requests[]

i'm not sure if chrome lets you execute extensions code after the chrome inspector/debugger is open, but before your tab has been selected (similar to the way the Network panel works by default), but if possible, that would be, imo, more ideal

still familiarizing myself with chrome extension architecture so i'll let you know if i have any actionable suggestions / have a pull request

thanks!
Scotty

Manifest v3

Worth upgrading once Firefox support is where we need it.
