Warehouse Sync API Tutorial

Use this tutorial to configure your first Warehouse Sync pipeline using the mParticle Postman collection, and then use the data from your pipeline to create an mParticle audience. Postman is an easy, friendly environment for both developers and non-developers to work with APIs.

This tutorial is not a complete guide to all Warehouse Sync features. For a complete API reference, see the Warehouse Sync API Reference.

Prerequisites

  • Install the latest version of the Postman desktop application. You can download Postman from https://www.postman.com/downloads/.
  • Fork the mParticle Warehouse Sync Postman Collection to your workspace:

    Run in Postman

    A copy of the Warehouse Sync environment is included. You can download it again here.

Step 1. mParticle setup

Create Platform API credentials

You need credentials to use the Platform API to create Warehouse Sync pipelines.

To create a Platform API credential:

  1. After signing in to the mParticle app as a user with the Admin role, click the gear icon in the bottom left corner.
  2. Click Platform.
  3. Select the API Credentials tab.
  4. Click the green Add Credential button in the top right.
  5. Give the credential a name, check the Platform checkbox, and select Admin from the Permissions dropdown menu. Click the green Save button.

    Add api credential screen
  6. Copy the Client ID and Client Secret values for use in a later step.
  7. Click Done.
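The Postman collection exchanges these credentials for a bearer token automatically. If you later want to call the Warehouse Sync API outside of Postman, the Platform API uses a standard OAuth client-credentials exchange. The sketch below shows the shape of that token request; the endpoint and audience values are assumptions based on the Platform API documentation, so confirm them in the Platform API Overview before relying on them.

POST https://sso.auth.mparticle.com/oauth/token
{
  "client_id": "{{PLATFORM_API_CLIENT_ID}}",
  "client_secret": "{{PLATFORM_API_CLIENT_SECRET}}",
  "audience": "https://api.mparticle.com",
  "grant_type": "client_credentials"
}

The access_token returned in the response is then sent as an Authorization: Bearer header on subsequent Platform API and Warehouse Sync API requests; the Postman collection handles this exchange for you.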

Step 2. Data warehouse setup

Work with your warehouse administrator or IT team to ensure your warehouse is reachable and accessible by mParticle.

  1. Whitelist the mParticle IP address range so your warehouse will be able to accept inbound API requests from mParticle.
  2. Ask your database administrator to perform the following steps in your warehouse to create a new role that mParticle can use to access your database. Select the correct tab for your warehouse (Snowflake, Google BigQuery, Amazon Redshift, or Databricks) below.

Snowflake

Run the following commands from your Snowflake instance:

USE ROLE ACCOUNTADMIN;

// mParticle recommends creating a unique role for warehouse sync 
CREATE ROLE IF NOT EXISTS {{role_name}};

GRANT USAGE ON WAREHOUSE {{warehouse}} TO ROLE {{role_name}};
GRANT USAGE ON DATABASE {{database}} TO ROLE {{role_name}};
GRANT USAGE ON SCHEMA {{database}}.{{schema}} TO ROLE {{role_name}};

// Grant SELECT privilege on any tables/views mP needs to access
GRANT SELECT ON TABLE {{database}}.{{schema}}.{{table}} TO ROLE {{role_name}};

// mParticle recommends creating a unique user for mParticle
// Mark your new user as a legacy service user to exclude it from Snowflake's multifactor authentication policy
CREATE OR REPLACE USER {{user_name}} PASSWORD = "{{unique_secure_password}}" TYPE = LEGACY_SERVICE;
GRANT ROLE {{role_name}} TO USER {{user_name}};

CREATE OR REPLACE STORAGE INTEGRATION {{storage_integration_name}}
          WITH TYPE = EXTERNAL_STAGE
          STORAGE_PROVIDER = S3
          ENABLED = TRUE
          STORAGE_AWS_ROLE_ARN = "arn:aws:iam::{{mp_pod_aws_account_id}}:role/ingest-pipeline-data-external-{{mp_org_id}}-{{mp_acct_id}}"
          STORAGE_AWS_OBJECT_ACL = "bucket-owner-full-control"
          STORAGE_ALLOWED_LOCATIONS = ("s3://{{mp_pod}}-ingest-pipeline-data/{{mp_org_id}}/{{mp_acct_id}}");


GRANT USAGE ON INTEGRATION {{storage_integration_name}} TO ROLE {{role_name}};

Where:

  • role_name: The ID of the role mParticle will assume while executing SQL commands on your Snowflake instance. mParticle recommends creating a unique role for warehouse sync.
  • warehouse: The ID of the Snowflake virtual warehouse compute cluster where SQL commands will be executed.
  • database: The ID of the database in your Snowflake instance from which you want to sync data.
  • schema: The ID of the schema in your Snowflake instance containing the tables you want to sync data from.
  • table: The ID of the table containing data you want to sync. Grant SELECT privileges on any tables/views mParticle needs to access.
  • user_name: The ID of the user mParticle will log in as while executing SQL commands on your Snowflake instance. mParticle recommends creating a unique user for warehouse sync.

    • If creating a new user, use a new unique_secure_password.
  • storage_integration_name: The ID of a Snowflake external storage integration allowing mParticle to unload data from your Snowflake instance to an S3 storage bucket.
  • mp_pod: The mParticle region ID of your data hosting location, one of US1, US2, AU1, or EU1.
  • mp_pod_aws_account_id: The mParticle provided ID for the data hosting location where your organization resides. Use the corresponding value for your mParticle instance:

    • US1: 338661164609
    • US2: 386705975570
    • AU1: 526464060896
    • EU1: 583371261087
  • mp_org_id: The mParticle provided ID of your organization where this connection will be stored. This can be found from your API client setup in step 1.
  • mp_acct_id: The mParticle provided ID of the account where this connection will be stored. This can be found from your API client setup in step 1.

Google BigQuery

Create a new service account for mParticle

  1. Go to console.cloud.google.com, log in, and navigate to IAM & Admin > Service Accounts.
  2. Select Create Service Account.
  3. Enter a new identifier for mParticle in Service account ID. In the example below, the email address is the service account ID. Save this value for your Postman setup.

  4. Under Grant this service account access to project, select BigQuery Job User under the Role dropdown menu, and click DONE.

  5. Select your new service account and navigate to the Keys tab.
  6. Click ADD KEY and select Create new key. The value for service_account_key will be the contents of the generated JSON file. Save this value for your Postman setup.

Identify your BigQuery warehouse details

Navigate to your BigQuery instance from console.cloud.google.com.

  • Your project_id is the first portion of Dataset ID (the portion before the .). In the example above, it is mp-project.
  • Your dataset_id is the second portion of Dataset ID (the portion immediately after the .). In the example above, it is mp-dataset.
  • Your region is the Data location. This is us-east4 in the example above.

Grant access to the dataset in BigQuery

  1. From your BigQuery instance in console.cloud.google.com, click Sharing and select Permissions.
  2. Click Add Principal.
  3. Assign two Roles, one for BigQuery Data Viewer, and one for BigQuery User.
  4. Click Save.

Amazon Redshift

  1. Navigate to your AWS Console, log in with your administrator account, and navigate to your Redshift cluster details.
  2. Run the following SQL statements to create a new user for mParticle, grant the necessary schema permissions to the new user, and grant the necessary access to your tables/views.
-- Create a unique user for mParticle
CREATE USER {{user_name}} WITH PASSWORD '{{unique_secure_password}}';

-- Grant schema usage permissions to the new user
GRANT USAGE ON SCHEMA {{schema_name}} TO {{user_name}};

-- Grant SELECT privilege on any tables/views mP needs to access to the new user
GRANT SELECT ON TABLE {{schema_name}}.{{table_name}} TO {{user_name}};
  3. Navigate to the Identity and Access Management (IAM) dashboard, select Roles in the left-hand nav bar, and click Create role.

  4. In Step 1 Select trusted entity, click AWS service under Trusted entity.
  5. Select Redshift from the dropdown menu titled “Use cases for other AWS services”, and select Redshift - Customizable. Click Next.

  6. In Step 2 Add permissions, click Create Policy.

  7. Click JSON in the policy editor, and enter the following permissions before clicking Next.
  • Replace {{mp_pod_aws_account_id}} with one of the following values according to your mParticle instance’s location:

    • US1 = 338661164609
    • US2 = 386705975570
    • AU1 = 526464060896
    • EU1 = 583371261087
{
    "Statement": [
        {
            "Action": "sts:AssumeRole",
            "Effect": "Allow",
            "Resource": "arn:aws:iam::{{mp_pod_aws_account_id}}:role/ingest-pipeline-data-external-{{mp_org_id}}-{{mp_acct_id}}",
            "Sid": ""
        }
    ],
    "Version": "2012-10-17"
}

  8. Enter a meaningful name for your new policy, such as mparticle_redshift_assume_role_policy, and click Create policy.

  9. Return to the Create role tab, click the refresh button, and select your new policy. Click Next.

  10. Enter a meaningful name for your new role, such as mparticle_redshift_role, and click Create role.

Your configuration will differ between Amazon Redshift and Amazon Redshift Serverless. To complete your configuration, follow the appropriate steps for your use case below.

Make sure to save the value of your new role’s ARN. You will need to use this when setting up Postman in the next section.

Amazon Redshift (not serverless)

  1. Navigate to your AWS Console, then navigate to Redshift cluster details. Select the Properties tab.

  2. Scroll to Associated IAM roles, and select Associate IAM Roles from the Manage IAM roles dropdown menu.

  3. Select the new role you just created. The name for the role in this example is mparticle_redshift_role.

Amazon Redshift Serverless

  1. Navigate to your AWS Console, then navigate to your Redshift namespace configuration. Select the Security & Encryption tab, and click Manage IAM roles.

  2. Select Associate IAM roles from the Manage IAM roles dropdown menu.
  3. Select the new role you just created. The name for the role in this example is mparticle_redshift_role.

Databricks

Warehouse Sync uses the Databricks-to-Databricks Delta Sharing protocol to ingest data from Databricks into mParticle.

Complete the following steps to prepare your Databricks instance for Warehouse Sync.

1. Enable Delta Sharing

  1. Log into your Databricks account and navigate to the Account Admin Console. You must have Account Admin user privileges.
  2. Click Catalog from the left hand nav bar, and select the metastore you want to ingest data from.
  3. Select the Configuration tab. Under Delta Sharing, check the box labeled “Allow Delta Sharing with parties outside your organization”.
  4. Find and save your Databricks provider name: the value displayed under Organization name. You will use this value for your provider name when creating the connection between the mParticle Warehouse Sync API and Databricks.

2. Configure a Delta Sharing recipient for mParticle

  1. From the Unity Catalog Explorer in your Databricks account, click the Delta Sharing button, and select Shared by me.
  2. Click New Recipient in the top right corner.
  3. Within the Create a new recipient window, enter mParticle_{YOUR-DATA-POD} under Recipient name where {YOUR-DATA-POD} is either us1, us2, eu1, or au1 depending on the location of the data pod configured for your mParticle account.
  4. In Sharing identifier, enter one of the following identifiers below, depending on the location of your mParticle account’s data pod:

    • US1: aws:us-east-1:e92fd7c1-5d24-4113-b83d-07e0edbb787b
    • US2: aws:us-east-1:e92fd7c1-5d24-4113-b83d-07e0edbb787b
    • EU1: aws:eu-central-1:2b8d9413-05fe-43ce-a570-3f6bc5fc3acf
    • AU1: aws:ap-southeast-2:ac9a9fc4-22a2-40cc-a706-fef8a4cd554e

3. Share your Databricks tables and schema with your new Delta Sharing recipient

  1. From the Unity Catalog Explorer in your Databricks account, click the Delta Sharing button.
  2. Click Share data in the top right.
  3. Within the Create share window, enter mparticle_{YOUR-MPARTICLE-ORG-ID}_{YOUR-MPARTICLE-ACCOUNT-ID} under Share name where {YOUR-MPARTICLE-ORG-ID} and {YOUR-MPARTICLE-ACCOUNT-ID} are your mParticle Org and Account IDs.

    • To find your Org ID, log into the mParticle app. View the page source. For example, in Google Chrome, go to View > Developer > View Page Source. In the resulting source for the page, look for “orgId”:xxx. This number is your Org ID.
    • Follow a similar process to find your Account ID (“accountId”:yyy) and Workspace ID (“workspaceId”:zzz).

  4. Click Save and continue at the bottom right.
  5. In the Add data assets section, select the schemas and tables you want to send to mParticle. Make sure to remember your schema name: you will need this value when configuring your Databricks feed in mParticle.

  6. Click Save and continue at the bottom right until you reach the Add recipients step. (You can skip the Add notebooks step.)
  7. In the Add recipients step, make sure to add the new mParticle recipient you created in Step 2.
  8. Finally, click the Share data button at the bottom right.

Unsupported data types between mParticle and Databricks

Databricks Delta Sharing does not currently support the TIMESTAMP_NTZ data type.

Other data types that are not currently supported by the Databricks integration for Warehouse Sync (for both user and events data) include:

If you are ingesting events data through Warehouse Sync, the following data types are unsupported:

While multi-dimensional, or nested, arrays are unsupported, you can still ingest simple arrays with events data.

Step 3. Postman setup

Once you have installed Postman, configure the collection environment settings and variables.

Update Postman environment settings

  1. Ensure you forked the mParticle Warehouse Sync API Postman Collection, as described in the Prerequisites section of this tutorial. In Postman, click the Environments tab from the left navigation menu.
  2. If you successfully forked the Warehouse Sync API collection, you’ll see it in the list of Environment configurations. You can rename it to something more meaningful by right-clicking next to the name and choosing the Rename option.

    Postman page with environment displayed
  3. Replace the placeholders (replace_me) with the correct values for your environment. You must update the values under the column labeled Current value.

    • Replace PLATFORM_API_CLIENT_ID and PLATFORM_API_CLIENT_SECRET with your new Platform API credentials.
    • Replace WORKSPACE_ID, ACCOUNT_ID, and ORG_ID with the corresponding values for your mParticle account.

      • To find your Org ID, log into the mParticle app. View the page source. For example, in Google Chrome, go to View > Developer > View Page Source. In the resulting source for the page, look for “orgId”:xxx. This number is your Org ID.
      • Follow a similar process to find your Account ID (“accountId”:yyy) and Workspace ID (“workspaceId”:zzz).
    • Replace POD with the regional pod your mParticle account is deployed on. Look at the URL in your browser where you are signed into mParticle. The POD is one of the following values: US1, US2, EU1, AU1.
  4. Enter the data warehouse usernames and passwords you saved in “Step 2. Data warehouse setup,” according to the data warehouse you are using:

    • For Snowflake, replace SNOWFLAKE_PASSWORD and SNOWFLAKE_STORAGE_INTEGRATION with the values you saved in step 2. Please refer to the Snowflake documentation to determine your account_identifier and region.
    • For BigQuery, replace BIG_QUERY_SERVICE_ACCOUNT_ID with the service account ID you used in BigQuery, and BIG_QUERY_SERVICE_ACCOUNT_KEY with the key from the generated JSON file in step 2.
    • For Redshift, replace REDSHIFT_USER, REDSHIFT_PASSWORD, and REDSHIFT_AWS_IAM_ROLE_ARN with the values created in step 2.
  5. After updating all necessary values, run COMMAND-S (or CTRL-S) to save your changes.

Update the Postman collection

  1. Ensure you forked the mParticle Warehouse Sync API Postman Collection as described in the Prerequisites section. In Postman, click the Collections tab on the left hand navigation.
  2. Once successfully forked, you’ll see the collection in the list of available collections.
  3. Click Warehouse Sync API, then select the Variables tab.
  4. Replace replace_me placeholders with the values corresponding to your environment. Ensure you update the values in the Current value column.

    • Replace INGEST_PIPELINE_SLUG and INGEST_PIPELINE_NAME with the slug and name you want to use to identify your new pipeline.
    • Replace SQL_QUERY with the SQL query mParticle will use to retrieve data from your warehouse. SQL is a powerful language, and you can use advanced expressions to filter, aggregate, and join your data. Work with your database administrator if you need help crafting the right SQL query:

      • Your query can contain a timestamp column that mParticle will use to keep track of which rows need to be loaded.
      • Your query should contain one or more user identity columns that mParticle will use to perform identity resolution, ensuring that data ends up on the correct user profile.
      • As part of the SQL query, you must specify how columns in the query map to attributes on a user’s profile. You do this with column aliasing in SQL. For example, in the following query, the column cid in Snowflake is mapped to the mParticle attribute customer_id (see the example query after this list).

        sql example

        If you don’t provide an alias, mParticle uses the name of the column in your database. If an attribute of this name does not already exist on the user’s profile, mParticle creates a new attribute with this name.

    • Before using the query in mParticle, test it outside of mParticle to ensure it returns the data you expect.
    • To learn more, check out the Warehouse Sync SQL Reference.
  5. After updating all necessary values, run COMMAND-S (or CTRL-S) to save your changes.
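For example, the following query (illustrative only; the table and column names are hypothetical) aliases the warehouse column cid to the mParticle attribute customer_id, includes an email identity column for identity resolution, and includes an updated_at timestamp column that an incremental pipeline can use to track which rows still need to be loaded:

-- Illustrative example: table and column names are hypothetical
SELECT
    cid        AS customer_id,        -- aliased to the mParticle attribute customer_id
    email,                            -- identity column used for identity resolution
    propensity AS propensity_to_buy,  -- ingested as the user attribute propensity_to_buy
    updated_at                        -- timestamp column an incremental pipeline can monitor
FROM {{database}}.{{schema}}.{{table}};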

Step 4. Create your first Warehouse Sync pipeline

Creating a Warehouse Sync pipeline requires completing the following steps:

  1. Create a partner feed.
  2. Create a connection.
  3. Create a data model.
  4. Create a field transformation.
  5. Create the pipeline.

After configuration, you can monitor the pipeline.

Create the partner feed

First, you must create a data feed by submitting a request to the Feeds API. mParticle uses this feed for rules and connections in your workspace. You must provide a value for module_name that corresponds with the data warehouse you are using.

Valid values for module_name are:

  • Redshift
  • BigQuery
  • Snowflake
  • Databricks

These values are case sensitive. For example, if you use snowflake instead of Snowflake, you will encounter errors later in your configuration.

  1. In Postman, ensure the environment drop-down is pointed to the Environment configuration you recently imported.
  2. Expand the Warehouse Sync Collection and open the Feeds folder.
  3. Click Create Warehouse Sync Feed.
  4. Click the Body tab to see the information you will pass to the API in order to create the feed.

    Postman page showing body of post request
    • module_name must be one of Snowflake, BigQuery, Redshift, or Databricks

    Values surrounded by double braces (for example: {{WORKSPACE_ID}}) are taken from the variables you updated in previous steps of this tutorial.

  5. Verify all values you changed, and click the blue Send button. mParticle returns a success message with details about your new feed. If the request fails, mParticle returns an error message with additional information.

Create the connection

The next step is to create a connection between mParticle and your data warehouse.

  1. In Postman, ensure the environment drop-down is pointed to the Environment configuration you recently imported.
  2. Expand the Warehouse Sync Collection and open the Connections folder.
  3. Click POST Create Snowflake/BigQuery/Redshift Connection, selecting the appropriate endpoint for the data warehouse you are using.
  4. Click the Body tab and replace each "replace_me" placeholder with the correct value for your specific warehouse.

    The values in {{Sample Values}} are taken from the environment variables you updated in earlier steps. Make sure these values match the values for your organization’s data warehouse. You may need to work with your database administrator to ensure you have the correct values.

  5. Verify all values you changed, and click the blue Send button. mParticle returns a success message with details about the configuration you just created. If the request fails, mParticle returns an error message with additional information.

    Postman page showing body of post request

Create the data model

The next step is to create a data model. A data model is a SQL query that mParticle sends to your warehouse specifying exactly what columns, rows, and fields of data you want to ingest through your pipeline.

In the case of pipelines that ingest user profile data, the data model is also responsible for mapping ingested data fields to mParticle user attributes.

To create a data model using the Warehouse Sync API:

  1. In Postman, expand Warehouse Sync API and click Data Models.
  2. Click Create Data Model.
  3. Select the Body tab and enter your data model.

    Postman page showing body of post request
  4. The values in {{Sample Values}} are taken from the variables you updated in previous steps.
  5. Once you are confident all values are correct, click the blue Send button. mParticle returns a success message with details about the data model you just created. If the request fails, mParticle returns an error message with additional information.

For more details about using SQL to create a data model, with example queries and best practices, see the Warehouse Sync SQL Reference.

Create the field transformation

You can create a field transformation to specify exactly how fields in your database should map to fields in the mParticle JSON schema.

For detailed instructions on how to create a field transformation, read Event Data Mapping.

To use your field transformation, add the field_transformation_id to the request body of your API call when creating your pipeline in the next step.

Create the pipeline

The final step is to create the pipeline. You pass in the connection and data model configurations previously created along with your sync mode and scheduling settings.

  1. In Postman, expand Warehouse Sync Collection and open the Pipelines folder.
  2. Click Create Pipeline.
  3. Select the Body tab, and update the sync_mode and schedule settings as follows (an example request body sketch follows this list):

    • For sync_mode:

      • Set type to either incremental or full.
      • If you set type to incremental, set iterator_field to the name of the column in your SQL query that mParticle will monitor to track changes that need to be synced.
      • If you set type to incremental, set iterator_data_type to the data type of your iterator.

        • Valid values for Snowflake are: timestamp_ltz, timestamp_ntz, timestamp_tz, datetime, date, timestamp_unixtime_ms, timestamp_unixtime_s.
        • Valid values for Google BigQuery are: datetime, date, timestamp, timestamp_unixtime_ms, timestamp_unixtime_s.
        • Valid values for AWS Redshift are: date, timestamp, timestamptz, timestamp_unixtime_ms, timestamp_unixtime_s.
        • Valid values for Databricks are: date, timestamp, timestamp_unixtime_ms, timestamp_unixtime_s.
    • For schedule:

      • Set type to interval, once, or on_demand.
      • If you set type to interval, set frequency to hourly, daily, weekly, or monthly.
      • Set start to the date-time value when you want recurring syncs to begin.
    Postman page showing request results

    The values in {{Sample Values}} are taken from the variables you updated in previous steps. You can optionally update the environment variable. It is currently set to target your mParticle development environment, but you can change it to target your production environment.
  4. Set field_transformation_id to the ID of a custom field transformation to map data from your warehouse to specific fields in the mParticle JSON schema. To learn more about custom field transformations and how to create them, see the Field Transformations API.
  5. You can associate an optional data plan with your pipeline to help improve the quality of, and grant you more control over, the data imported to mParticle. In the request body of your API request, include the parameters plan_id and plan_version and set them to the values of the data plan and version you want to use. You must use a data plan version that is active and exists in the mParticle workspace you are using.
  6. Once you are confident all values are correct, click the blue Send button. If successful, mParticle returns a success message with details about the configuration you just created. If it was not successful, mParticle returns an error message with additional information.
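For reference, the sketch below shows the shape of the sync_mode and schedule portions of the request body using only the settings discussed above. The specific values (iterator field name, start time, data plan version) and the {{...}} placeholder names are illustrative assumptions, and the other fields in the collection's prefilled body (pipeline name, slug, and the connection and data model references) are omitted; see the Warehouse Sync API Reference for the authoritative schema.

{
  "sync_mode": {
    "type": "incremental",
    "iterator_field": "updated_at",
    "iterator_data_type": "timestamp_ltz"
  },
  "schedule": {
    "type": "interval",
    "frequency": "hourly",
    "start": "2024-01-01T00:00:00Z"
  },
  "environment": "development",
  "field_transformation_id": "{{FIELD_TRANSFORMATION_ID}}",
  "plan_id": "{{DATA_PLAN_ID}}",
  "plan_version": 1
}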

Monitor the pipeline

Once a pipeline has been created, you can monitor its status using the additional requests provided in the Postman collection.

  1. In Postman, expand the Warehouse Sync Collection and open the Pipelines folder.
  2. Click Get Pipeline Report.
  3. Click the blue Send button.
  4. mParticle sends a detailed message with the pipeline’s current status. After creating a pipeline, there is an approximate one-minute delay until the pipeline is created in the mParticle backend, so submitting a Get Pipeline Report request immediately after creation results in a Not Found error. Try again after several minutes.

    Postman page showing success status
  5. While a pipeline is ingesting data, you can monitor it in mParticle as you would with any other input. From mParticle, go to Data Master > Live Stream to inspect the incoming data from Snowflake.

    mParticle live stream page
  6. Once the data is ingested, the data points appear on the user’s profile. Go to Activity > User Activity and look up a sample profile. If the attributes do not appear as expected, validate the mapping in the SQL query you provided earlier.

    mParticle live stream page with data points

Step 5. Activate the data

Now that the data has been loaded into mParticle, it’s time to put it to use by creating an audience using the newly ingested data and sending it to a downstream integration.

Create an audience

Create an audience that uses one of the ingested attributes as part of its qualifying criteria:

  1. In the mParticle app, go to Audiences > Standard.
  2. Click the green New Standard Audience button in the upper right corner.
  3. Give the audience a name.
  4. Select a date range, or choose “All available data.”
  5. Select the Warehouse Sync feed you created earlier with the Feeds API.
  6. Add an audience criterion that leverages one of the data points you ingested from your warehouse. In the example below, we only want to consider users who have a propensity-to-buy score greater than 0.7.

    mparticle page showing user attribute
  7. Add any other criteria you want to be considered. Click Save As Draft when you are done.
  8. Click Calculate to run the audience calculation.

Connect the audience to an output

In the final step, we will send users who qualified for this audience to Iterable for further targeting. If your organization doesn’t use Iterable, pick a different integration that you do use.

After the audience has been fully calculated, connect it to an output:

  1. If you aren’t still there, go to Audiences > Standard.
  2. In the row for the audience you just created, click the Connect button in the Actions column.

    mparticle audiences page
  3. Click Connect Output.

    mparticle audience connect page
  4. Click the green Add Output button.
  5. Click the Iterable tile in the integrations directory, or pick another integration your organization uses if you don’t have Iterable.
  6. Select the Audience checkbox and click Configure.

    mparticle audience setup page
  7. Enter a configuration name, your Iterable API key, and the user ID that is used to identify users, and then click Save & Open in Connections.

    mparticle audience configuration page
  8. Provide the ID of the list in Iterable that the user data should be loaded into, and click Add Connection.

    mparticle connect output page
  9. Click the Send button to send the audience to Iterable.

    mparticle standard audience page
  10. Data starts flowing to Iterable.

    mparticle standard audience page showing data being sent

Now you can open the audience in Iterable.

    Last Updated: November 20, 2024