Documentation

Developers

API References
Data Subject Request API

Data Subject Request API Version 1 and 2

Data Subject Request API Version 3

Platform API

Platform API Overview

Accounts

Apps

Audiences

Calculated Attributes

Data Points

Feeds

Field Transformations

Services

Users

Workspaces

Warehouse Sync API

Warehouse Sync API Overview

Warehouse Sync API Tutorial

Warehouse Sync API Reference

Data Mapping

Warehouse Sync SQL Reference

Warehouse Sync Troubleshooting Guide

ComposeID

Warehouse Sync API v2 Migration

Bulk Profile Deletion API Reference

Calculated Attributes Seeding API

Data Planning API

Custom Access Roles API

Group Identity API Reference

Pixel Service

Profile API

Events API

mParticle JSON Schema Reference

IDSync

Client SDKs
AMP

AMP SDK

Android

Initialization

Configuration

Network Security Configuration

Event Tracking

User Attributes

IDSync

Screen Events

Commerce Events

Location Tracking

Media

Kits

Application State and Session Management

Data Privacy Controls

Error Tracking

Opt Out

Push Notifications

WebView Integration

Logger

Preventing Blocked HTTP Traffic with CNAME

Linting Data Plans

Troubleshooting the Android SDK

API Reference

Upgrade to Version 5

Cordova

Cordova Plugin

Identity

Direct Url Routing

Direct URL Routing FAQ

Web

Android

iOS

Flutter

Getting Started

Usage

API Reference

iOS

Initialization

Configuration

Event Tracking

User Attributes

IDSync

Screen Tracking

Commerce Events

Location Tracking

Media

Kits

Application State and Session Management

Data Privacy Controls

Error Tracking

Opt Out

Push Notifications

Webview Integration

Upload Frequency

App Extensions

Preventing Blocked HTTP Traffic with CNAME

Linting Data Plans

Troubleshooting iOS SDK

Social Networks

iOS 14 Guide

iOS 15 FAQ

iOS 16 FAQ

iOS 17 FAQ

iOS 18 FAQ

API Reference

Upgrade to Version 7

React Native

Getting Started

Identity

Roku

Getting Started

Identity

Media

Unity

Upload Frequency

Getting Started

Opt Out

Initialize the SDK

Event Tracking

Commerce Tracking

Error Tracking

Screen Tracking

Identity

Location Tracking

Session Management

Xbox

Getting Started

Identity

Web

Initialization

Content Security Policy

Configuration

Event Tracking

User Attributes

IDSync

Page View Tracking

Commerce Events

Location Tracking

Media

Kits

Application State and Session Management

Data Privacy Controls

Error Tracking

Opt Out

Custom Logger

Persistence

Native Web Views

Self-Hosting

Multiple Instances

Web SDK via Google Tag Manager

Preventing Blocked HTTP Traffic with CNAME

Facebook Instant Articles

Troubleshooting the Web SDK

Browser Compatibility

Linting Data Plans

API Reference

Upgrade to Version 2 of the SDK

Xamarin

Getting Started

Identity

Web

Alexa

Quickstart
Android

Overview

Step 1. Create an input

Step 2. Verify your input

Step 3. Set up your output

Step 4. Create a connection

Step 5. Verify your connection

Step 6. Track events

Step 7. Track user data

Step 8. Create a data plan

Step 9. Test your local app

HTTP Quick Start

Step 1. Create an input

Step 2. Create an output

Step 3. Verify output

iOS Quick Start

Overview

Step 1. Create an input

Step 2. Verify your input

Step 3. Set up your output

Step 4. Create a connection

Step 5. Verify your connection

Step 6. Track events

Step 7. Track user data

Step 8. Create a data plan

Java Quick Start

Step 1. Create an input

Step 2. Create an output

Step 3. Verify output

Node Quick Start

Step 1. Create an input

Step 2. Create an output

Step 3. Verify output

Web

Overview

Step 1. Create an input

Step 2. Verify your input

Step 3. Set up your output

Step 4. Create a connection

Step 5. Verify your connection

Step 6. Track events

Step 7. Track user data

Step 8. Create a data plan

Python Quick Start

Step 1. Create an input

Step 2. Create an output

Step 3. Verify output

Media SDKs

Android

iOS

Web

Server SDKs

Node SDK

Go SDK

Python SDK

Ruby SDK

Java SDK

Tools

Linting Tools

mParticle Command Line Interface

Smartype

Guides
Partners

Introduction

Outbound Integrations

Outbound Integrations

Firehose Java SDK

Inbound Integrations

Kit Integrations

Overview

Android Kit Integration

JavaScript Kit Integration

iOS Kit Integration

Compose ID

Glossary

Data Hosting Locations

Migrate from Segment to mParticle

Migrate from Segment to mParticle

Migrate from Segment to Client-side mParticle

Migrate from Segment to Server-side mParticle

Segment-to-mParticle Migration Reference

Rules Developer Guide

API Credential Management

The Developer's Guided Journey to mParticle

Guides

Getting Started

Create an Input

Start capturing data

Connect an Event Output

Create an Audience

Connect an Audience Output

Transform and Enhance Your Data

Platform Guide
The New mParticle Experience

The new mParticle Experience

The Overview Map

Observability

Observability Overview

Observability User Guide

Observability Span Glossary

Introduction

Data Retention

Connections

Activity

Live Stream

Data Filter

Rules

Tiered Events

mParticle Users and Roles

Analytics Free Trial

Troubleshooting mParticle

Usage metering for value-based pricing (VBP)

Analytics

Introduction

Setup

Sync and Activate Analytics User Segments in mParticle

User Segment Activation

Welcome Page Announcements

Settings

Project Settings

Roles and Teammates

Organization Settings

Global Project Filters

Portfolio Analytics

Analytics Data Manager

Analytics Data Manager Overview

Events

Event Properties

User Properties

Revenue Mapping

Export Data

UTM Guide

Query Builder

Data Dictionary

Query Builder Overview

Modify Filters With And/Or Clauses

Query-time Sampling

Query Notes

Filter Where Clauses

Event vs. User Properties

Group By Clauses

Annotations

Cross-tool Compatibility

Apply All for Filter Where Clauses

Date Range and Time Settings Overview

Understanding the Screen View Event

Analyses

Analyses Introduction

Segmentation: Basics

Getting Started

Visualization Options

For Clauses

Date Range and Time Settings

Calculator

Numerical Settings

Segmentation: Advanced

Assisted Analysis

Properties Explorer

Frequency in Segmentation

Trends in Segmentation

Did [not] Perform Clauses

Cumulative vs. Non-Cumulative Analysis in Segmentation

Total Count of vs. Users Who Performed

Save Your Segmentation Analysis

Export Results in Segmentation

Explore Users from Segmentation

Funnels: Basics

Getting Started with Funnels

Group By Settings

Conversion Window

Tracking Properties

Date Range and Time Settings

Visualization Options

Interpreting a Funnel Analysis

Funnels: Advanced

Group By

Filters

Conversion over Time

Conversion Order

Trends

Funnel Direction

Multi-path Funnels

Analyze as Cohort from Funnel

Save a Funnel Analysis

Explore Users from a Funnel

Export Results from a Funnel

Cohorts

Getting Started with Cohorts

Analysis Modes

Save a Cohort Analysis

Export Results

Explore Users

Saved Analyses

Manage Analyses in Dashboards

Journeys

Getting Started

Event Menu

Visualization

Ending Event

Save a Journey Analysis

Users

Getting Started

User Activity Timelines

Time Settings

Export Results

Save A User Analysis

Dashboards

Dashboards––Getting Started

Manage Dashboards

Dashboard Filters

Organize Dashboards

Scheduled Reports

Favorites

Time and Interval Settings in Dashboards

Query Notes in Dashboards

User Aliasing

Analytics Resources

The Demo Environment

Keyboard Shortcuts

Tutorials

Analytics for Marketers

Analytics for Product Managers

Compare Conversion Across Acquisition Sources

Analyze Product Feature Usage

Identify Points of User Friction

Time-based Subscription Analysis

Dashboard Tips and Tricks

Understand Product Stickiness

Optimize User Flow with A/B Testing

User Segments

APIs

User Segments Export API

Dashboard Filter API

IDSync

IDSync Overview

Use Cases for IDSync

Components of IDSync

Store and Organize User Data

Identify Users

Default IDSync Configuration

Profile Conversion Strategy

Profile Link Strategy

Profile Isolation Strategy

Best Match Strategy

Aliasing

Data Master
Group Identity

Overview

Create and Manage Group Definitions

Introduction

Catalog

Live Stream

Data Plans

Data Plans

Blocked Data Backfill Guide

Personalization
Predictive Attributes

Predictive Attributes Overview

Create Predictive Attributes

Assess and Troubleshoot Predictions

Use Predictive Attributes in Campaigns

Predictive Audiences

Predictive Audiences Overview

Using Predictive Audiences

Introduction

Profiles

Calculated Attributes

Calculated Attributes Overview

Using Calculated Attributes

Create with AI Assistance

Calculated Attributes Reference

Audiences

Audiences Overview

Real-time Audiences

Standard Audiences

Journeys

Journeys Overview

Manage Journeys

Download an audience from a journey

Audience A/B testing from a journey

Journeys 2.0

Warehouse Sync

Data Privacy Controls

Data Subject Requests

Default Service Limits

Feeds

Cross-Account Audience Sharing

Approved Sub-Processors

Import Data with CSV Files

Import Data with CSV Files

CSV File Reference

Glossary

Video Index

Analytics (Deprecated)
Identity Providers

Single Sign-On (SSO)

Setup Examples

Settings

Debug Console

Data Warehouse Delay Alerting

Introduction

Developer Docs

Introduction

Integrations

Introduction

Rudderstack

Google Tag Manager

Segment

Data Warehouses and Data Lakes

Advanced Data Warehouse Settings

AWS Kinesis (Snowplow)

AWS Redshift (Define Your Own Schema)

AWS S3 Integration (Define Your Own Schema)

AWS S3 (Snowplow Schema)

BigQuery (Snowplow Schema)

BigQuery Firebase Schema

BigQuery (Define Your Own Schema)

GCP BigQuery Export

Snowflake (Snowplow Schema)

Snowplow Schema Overview

Snowflake (Define Your Own Schema)

APIs

Dashboard Filter API (Deprecated)

REST API

User Segments Export API (Deprecated)

SDKs

SDKs Introduction

React Native

iOS

Android

Java

JavaScript

Python

Object API

Developer Basics

Aliasing

AWS Kinesis (Snowplow)

Prerequisites

The Snowplow Unified Log is stored in an S3 bucket and you are required to write an IAM policy to grant Analytics programmatic access to the respective S3 bucket.

If there are additional enrichments required, such as joining with user property tables or deriving custom user_ids, please contact us.

Instructions

To connect your real time Snowplow data to Analytics, follow the instructions below:

  1. In Analytics, click on the gear icon and select Project Settings. Project Settings
  2. Select the Data Sources tab. Data Sources
  3. Select New Data Source. New Data Source
  4. Select Snowplow Kinesis. Select Snowplow Kinesis
  5. Click Next. You will need to use this API Key in step 4 of Create the Lambda Function. API Key

Create an IAM Role for the Lambda

Your AWS Lambda needs to have an Execution Role that allows it to use the Kinesis Stream and CloudWatch. (For more information on setting up IAM Roles, please see the official AWS tutorial.)

  1. Go to IAM Management in the Console and choose Roles from the sidebar. IAM Roles
  2. Click Create role. Create Role
  3. For the type of trusted entity select AWS Service and for the service that will use this role choose Lambda. Click Next: Permissions at the bottom of the screen. Trusted Entity
  4. Now you need to choose a permission policy for the role. The Lambda needs to have read access to Kinesis and write access to CloudWatch logs - for that we will choose AWSLambdaKinesisExecutionRole. Search for AWSLambdaKinesisExecutionRole in the search and mark the checkbox as shown below. Permission Policy
  5. Click Next: Review at the bottom of the screen.
  6. On the next screen provide a name for the newly created role under Role Name, then click Create role to finish the process.

Create the Lambda Function

The Lambda function can be created either directly through AWS Console or through other tools like the AWS CLI. For this integration, the recommended memory setting is 256 MB and because the JVM has to cold start when the function is called for the first time on a new instance, you should set a high timeout value; 90 seconds should be safe.

As with the IAM Role, we will be using the AWS Console to get our Lambda function up and running. Make sure you are in the same region as where your Kinesis streams are defined.

  1. On the Console navigate to the Lambda section and click Create a function (runtime should be Java 8). Function Role
  2. Write a name for your function in Name. In the Role dropdown pick Choose an existing role; then in the dropdown below choose the name of the role you created in the previous step. Click Create function. Create Function
  3. The Lambda has been created, although it does not do anything yet. We need to provide the code and configure the function:

    a. Take a look at the Function code box. In the Handler textbox paste: com.snowplowanalytics.indicative.LambdaHandler::recordHandler

    b. From the Code entry type dropdown pick Upload a file from Amazon S3. A textbox labeled S3 Link URL will appear. We are hosting the code through our hosted assets. You will need to choose the S3 bucket in the same region as your AWS Lambda function: for example if your Lambda is us-east-1 region, paste the following URL: s3://snowplow-hosted-assets-us-east-1/relays/indicative/indicative-relay-0.4.0.jar in the textbox. Take a look at this table to pick the right bucket name for your region. Make sure Runtime is Java 8. Lambda Code

  4. Get the API Key from step 4 from the Analytics UI.
  5. Below Function code settings you will find a section called Environment variables.

    a. In the first row, first column (the key), type INDICATIVE_API_KEY. In the second column (the value), paste your API Key. Environment Variables

    b. The relay lets you configure the following filters:

      - UNUSED_EVENTS: events that will not be relayed to Analytics;
      - UNUSED_ATOMIC_FIELDS: fields of the [canonical](https://github.com/snowplow/snowplow/wiki/canonical-event-model) Snowplow event that will not be relayed to Analytics;
      - UNUSED_CONTEXTS: contexts whose fields will not be relayed to Analytics.

Out of the box, the relay is configured to use the following defaults:

Unused events Unused atomic fields Unused contexts
app_heartbeat etl_tstamp application_context
app_initialized collector_tstamp application_error
app_shutdown dvce_created_tstamp duplicate
app_warning event geolocation_context
create_event txn_id instance_identity_document
emr_job_failed name_tracker java_context
emr_job_started v_tracker jobflow_step_status
emr_job_status v_collector parent_event
emr_job_succeeded v_etl performance_timing
incident user_fingerprint timing
incident_assign geo_latitude
incident_notify_of_close geo_longitude
incident_notify_user ip_isp
job_update ip_organization
load_failed ip_domain
load_succeeded ip_netspeed
page_ping page_urlscheme
s3_notification_event page_urlport
send_email page_urlquery
send_message page_urlfragment
storage_write_failed refr_urlscheme
stream_write_failed refr_urlport
task_update refr_urlquery
wd_access_log refr_urlfragment
pp_xoffset_min
pp_xoffset_max
pp_yoffset_min
pp_yoffset_max
br_features_pdf
br_features_flash
br_features_java
br_features_director
br_features_quicktime
br_features_realplayer
br_features_windowsmedia
br_features_gears
br_features_silverlight
br_cookies
br_colordepth
br_viewwidth
br_viewheight
dvce_ismobile
dvce_screenwidth
dvce_screenheight
doc_charset
doc_width
doc_height
tr_currency
mkt_clickid
etl_tags
dvce_sent_tstamp
refr_domain_userid
refr_device_tstamp
derived_tstamp
event_vendor
event_name
event_format
event_version
event_fingerprint
true_tstamp

To change the defaults, you can pass in your own lists of events, atomic fields or contexts to be filtered out. For example:

Environment variable key Environment variable value
UNUSED_EVENTS page_ping,file_download
UNUSED_ATOMIC_FIELDS name_tracker,event_vendor
UNUSED_CONTEXTS performance_timing,client_context

Similarly to setting up the API key, the first column (key) needs to be set to the specified environment variable name in ALLCAPS. The second column (value) is your own list as a comma-separated string with no spaces.

If you only specify the environment variable name but do not provide a list of values, then nothing will be filtered out.

If you do not set any of the environment variables, the defaults will be used.

  1. Scroll down a bit and take a look at the Basic settings box. There you can set memory and timeout limits for the Lambda. As mentioned earlier, we recommend setting 256 MB of memory or higher (on AWS Lambda the CPU performance scales linearly with the amount of memory) and a high timeout time of 1 minute 30 seconds.

Basic Settings kinesis

  1. As a final step, add your Snowplow enriched Kinesis stream as an event source for the Lambda function. You can follow the official AWS tutorial if you are using AWS CLI or do it directly from the AWS Console using the following instructions. Scroll to the top of the page and from the list of triggers in the Designer configuration up top, choose Kinesis. Kinesis Trigger

Take a look at the Configure triggers section which just appeared below. Choose your Kinesis stream that contains Snowplow enriched events. Set the batch size to your liking - 100 is a reasonable setting. Note that this is a maximum batch size, the function can be triggered with fewer records. For the starting position we recommend Trim horizon, which starts processing the stream from an observable start (Alternatively, you can select At timestamp to start sending data from a particular date). Click the Add button to finish the trigger configuration. Make sure Enable trigger is selected.

Basic Settings

  1. Save the changes by clicking the Save button in the top-right part of the page.

Validate Your Data

Go to your Indicative project to check if you are receiving data. You can also go to the debug console to troubleshoot the relay in real time.

Was this page helpful?

    Last Updated: December 20, 2024