Data plans help you improve data quality and control across the enterprise:
A data plan is a codified set of expectations about the extent and shape of your data collected with mParticle. A plan contains the following elements:
Depending on your goals, create a single plan or multiple plans. You decide whether all of your feeds and inputs share the same plan, or if you create a unique plan for every individual client-side and server-side input. Two common scenarios indicate a need for multiple plans:
Although most of the data planning steps can be completed in the Data Plan UI, Step 3 requires a small code change. You must have developer support for the following:
Platform | Versions | Repo |
---|---|---|
Web | v2.1.1 or later | Github |
iOS | v7.13.0 or later | Github |
Android | v5.12.0 or later | Github |
Python | v0.10.8 or later | Github |
Java | v2.2.0 or later | Github |
Roku | v2.1.3 or later | Github |
Optionally, you can review the JSON Reference to learn more about the full JSON structure.
To create and implement a data plan:
Remember that you can easily create and delete plans, so feel free to experiment.
Data Plan Builder is a Google Sheet add-on and template that helps you create a data plan:
Choose from one of the industry-specific templates or the generic template.
```json
{
  "version_description": "",
  "version_document": {
    "data_points": [
      {
        "description": "This is an example Custom Event. Use it as a reference when creating your own.",
        "match": {
          "type": "custom_event",
          "criteria": {
            "event_name": "My Custom Event",
            "custom_event_type": "other"
          }
        },
        "validator": {
          "type": "json_schema",
          "definition": {
            "properties": {
              "data": {
                "additionalProperties": true,
                "properties": {
                  "custom_attributes": {
                    "additionalProperties": false,
                    "properties": {
                      "my string attribute (enum validation)": {
                        "description": "An example string attribute using enum validation.",
                        "type": "string",
                        "enum": [
                          "two seater",
                          "three seater",
                          "sectional"
                        ]
                      },
                      "my string attribute (regex validation)": {
                        "description": "An example string attribute using regex validation.",
                        "type": "string",
                        "pattern": "^[a-zA-Z0-9_]*$"
                      },
                      "my numeric attribute": {
                        "description": "An example numeric attribute using range validation.",
                        "type": "number",
                        "minimum": 0,
                        "maximum": 100
                      },
                      "my boolean attribute": {
                        "description": "An example boolean attribute.",
                        "type": "boolean"
                      },
                      "my shared attribute": {
                        "description": "An example shared attribute. This will appear on every event in the \"example_events\" group (see column titled \"Group\" on the Events tab)",
                        "type": "number"
                      }
                    },
                    "required": [
                      "my string attribute (enum validation)",
                      "my string attribute (regex validation)",
                      "my numeric attribute",
                      "my boolean attribute",
                      "my shared attribute"
                    ],
                    "type": "object"
                  }
                },
                "required": [],
                "type": "object"
              }
            }
          }
        }
      }
    ]
  }
}
```
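To make the validation keywords in the example plan more concrete, here is a minimal sketch of how `enum`, `pattern`, `minimum`, and `maximum` constrain a value. The `checkAttribute` helper is hypothetical and written only for illustration; it is not mParticle's validator, and type checks are left out of the sketch.

```javascript
// Illustrative only: a tiny checker for the enum, pattern, minimum, and
// maximum keywords used in the example plan. Not mParticle's validator.
function checkAttribute(value, rule) {
  const problems = [];
  if (rule.enum && !rule.enum.includes(value)) {
    problems.push("value is not in the enum list");
  }
  if (rule.pattern && !new RegExp(rule.pattern).test(String(value))) {
    problems.push("value does not match the pattern");
  }
  if (rule.minimum !== undefined && Number(value) < rule.minimum) {
    problems.push("value is below the minimum");
  }
  if (rule.maximum !== undefined && Number(value) > rule.maximum) {
    problems.push("value is above the maximum");
  }
  return problems;
}

// Rules copied from three of the attributes in the example data point.
const exampleRules = {
  enumAttribute: { enum: ["two seater", "three seater", "sectional"] },
  regexAttribute: { pattern: "^[a-zA-Z0-9_]*$" },
  numericAttribute: { minimum: 0, maximum: 100 },
};

// An empty array means the value satisfies the rule.
console.log(checkAttribute("sectional", exampleRules.enumAttribute));
console.log(checkAttribute("armchair", exampleRules.enumAttribute));
console.log(checkAttribute("not valid!", exampleRules.regexAttribute));
console.log(checkAttribute(150, exampleRules.numericAttribute));
```

The first call reports no problems; the other three each report one, because the value is outside the enum list, contains characters the pattern excludes, or exceeds the maximum.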
Once you have the JSON from Data Plan Builder, paste it into the Data Plan import window (as explained in Step 1.3 below), or store the file and upload it using the Data Planning API.
To create a plan:
To start verifying incoming data against your plan, you first need to activate it. To do this, click the Activate button on your data plan’s home screen. Then in the Activate modal, use the Status dropdown to select the environment in which you want to activate your data plan (`dev` or `prod`). (You also have the option to save the plan as a draft to return to later.)
Now that your plan is active, you need to ensure that incoming data is tagged with your plan’s ID. Continue to the next step to learn how.
Before mParticle validates incoming data against the plan, the data must be tagged with a plan ID, an environment, and optionally a plan version. This is the step that requires a small code change, as mentioned in Prerequisites.
- Environment (`development` or `production`): The environment of your data. mParticle uses this value to look for plans that are activated in the given environment.
To find your plan ID, navigate to the plan listing page. In the following image, `fintech_template` is the plan ID and should be used in the code snippets below:
Include the plan ID and environment in all batches sent to mParticle.
Example Code in Four Languages
You can copy and paste the following example code in JSON, Swift, Kotlin, or JavaScript for your developer to implement:

```json
{
  "context": {
    "data_plan": {
      "plan_id": "mobile_data_plan",
      "plan_version": 2
    }
  },
  "environment": "development",
  "user_identities": {...},
  "events": [...]
}
```

```swift
let options = MParticleOptions(
    key: "REPLACE WITH APP KEY",
    secret: "REPLACE WITH APP SECRET"
)
options.dataPlanId = "mobile_data_plan"
options.dataPlanVersion = 2
options.environment = MPEnvironment.development
MParticle.sharedInstance().start(with: options)
```

```kotlin
val options = MParticleOptions.builder(this)
    .credentials("REPLACE WITH APP KEY", "REPLACE WITH APP SECRET")
    .environment(MParticle.Environment.Development)
    .dataplan("mobile_data_plan", 2)
    .build()
MParticle.start(options)
```

```javascript
window.mParticle = {
  config: {
    isDevelopmentMode: true,
    dataPlan: {
      planId: 'mobile_data_plan',
      planVersion: 2,
    }
  }
};
```
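If you build batches yourself rather than through an SDK, the tagging step can be sketched as a small helper that adds the `context.data_plan` object and `environment` shown in the JSON example. The `tagBatch` function is hypothetical and not part of any mParticle SDK; only the field names come from the JSON example above.

```javascript
// Illustrative helper (not part of any mParticle SDK): returns a copy of a
// batch with the data plan context and environment set.
function tagBatch(batch, planId, environment, planVersion) {
  const dataPlan = { plan_id: planId };
  // plan_version is optional; it is omitted unless a version is passed in.
  if (planVersion !== undefined) {
    dataPlan.plan_version = planVersion;
  }
  return {
    ...batch,
    environment,
    context: { ...(batch.context || {}), data_plan: dataPlan },
  };
}

const untaggedBatch = { user_identities: {}, events: [] };
const taggedBatch = tagBatch(untaggedBatch, "mobile_data_plan", "development", 2);
console.log(JSON.stringify(taggedBatch, null, 2));
```

Returning a copy instead of mutating the input keeps the original batch reusable, for example when the same events are sent under a different plan during testing.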
Now that you have tagged incoming data, use Live Stream to debug violations as they occur.
Once your plan is validating data, violations reports help monitor your data quality. To view violations, click Unique Violations in the header row of your data plan’s home screen. This will display a violation report like the one below:
Your data needs change over time. Data plans can be easily updated to reflect these changes.
Smaller changes can be made directly to an existing plan version. Updates to active data plans are live immediately: simply update the plan in the UI and save your changes. For larger changes, we recommend creating a new plan version, which lets you track changes over time and revert to an older version if necessary.
If you’re using Data Plan Builder, make the update in the builder and follow the instructions to export a new data plan version into mParticle.
To view the version history of a data plan:
Once you are confident that your plan reflects the data you want to collect, you can block unplanned data from being forwarded to downstream systems. Learn more about blocking data in the next section.
Using Data Plans, you can block unplanned data from being forwarded to downstream systems. You can think of this feature as an allowlist (sometimes called a whitelist) for the data you want to capture with mParticle: any event, event attribute, or user attribute that is not included in the allowlist can be blocked from further processing.
`custom_attributes` can be blocked from client-side kits. Other unplanned event attributes will not be blocked client-side. Learn more in the Blocking data sent to mParticle Kits section.
To prevent blocked data from being lost, you can opt for blocked data to be forwarded to an Output with a Quarantine Connection. To illustrate a typical workflow, assume you choose to configure an Amazon S3 bucket as your Quarantine Output:
Anytime a data point is blocked, the Quarantine Connection will forward the original batch and metadata about what was blocked to the configured Amazon S3 bucket. You will then be able to:
In most cases, data collected by the mParticle SDK is sent to mParticle and then forwarded on to an integration partner server-to-server. However, in cases where a server-to-server integration cannot support all required functionality for an integration, an embedded kit can be used instead.
You can learn which integrations are kits for a given SDK here:
By default, the current Block feature supports blocking for server-side integrations. If you would like to enable blocking for mParticle kits, you need to follow additional steps outlined below for each of our most popular SDKs: Web, Android and iOS.
Before you can enable the blocking feature, you need to create a data plan and initialize the respective SDK with a data plan ID in your code. Read our “Getting Started” section for detailed guidance.
Platform | Versions | TTL | Repo |
---|---|---|---|
Web | v2.1.1 or later | 60 min | Github |
iOS | v8.1.0 or later | 10 min | Github |
Android | v5.15.0 or later | 10 min | Github |
Our SDKs are served by a CDN that caches SDK configuration, including your data plan, for some period of time (the “TTL”). As a result, updates to a data plan can take time before they are reflected in your client code. To avoid caching a plan version while you are iterating on it:
You can now turn on Block settings for the type of data you would like to block by completing the following steps:
For Web, you can use the developer console to verify when a kit’s underlying SDK uploads an event to the partner’s API. For iOS and Android, you can typically use verbose console logs or a proxy such as Charles Proxy. Depending on your block settings, you should see unplanned data removed from payloads. For example, if you have not planned “Bad Event A”, “Bad Event A” will not be forwarded to a specific partner integration.
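As a rough illustration of the allowlist behavior you are verifying here, the sketch below splits events into forwarded and blocked sets. The plan shape (`plannedEvents`), the event shape, and the `applyBlocking` helper are all hypothetical, and the sketch assumes blocking is turned on for both unplanned events and unplanned event attributes.

```javascript
// Hypothetical plan shape for illustration: planned event names mapped to
// the attribute names planned for each event.
const plannedEvents = new Map([["Good Event", ["color", "size"]]]);

// Splits events into those that would be forwarded and those that would be
// blocked because the event name is not in the allowlist. Unplanned
// attributes on planned events are dropped from the forwarded copy.
function applyBlocking(events, planned) {
  const forwarded = [];
  const blocked = [];
  for (const event of events) {
    const allowedAttributes = planned.get(event.name);
    if (!allowedAttributes) {
      blocked.push(event);
      continue;
    }
    const attributes = {};
    for (const [key, value] of Object.entries(event.attributes || {})) {
      if (allowedAttributes.includes(key)) {
        attributes[key] = value;
      }
    }
    forwarded.push({ ...event, attributes });
  }
  return { forwarded, blocked };
}

const blockingResult = applyBlocking(
  [
    { name: "Good Event", attributes: { color: "red", debug: "x" } },
    { name: "Bad Event A", attributes: {} },
  ],
  plannedEvents
);
console.log(blockingResult.forwarded.map((event) => event.name));
console.log(blockingResult.blocked.map((event) => event.name));
```

In this sketch, "Good Event" is forwarded without its unplanned `debug` attribute, and "Bad Event A" ends up in the blocked set.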
Follow your usual software development process to deploy your code changes to production. Remember to also promote your data plan version to prod through the mParticle UI to start blocking production data that does not match your plan. Plan versions active on production are locked in the UI to prevent accidental updates. The recommended flow for updating a production plan is to clone the latest version and to deploy a new version after testing.
To protect shared resources, every mParticle account includes a memory quota for active data plan versions. The byte size of a plan version’s JSON representation is a good estimate of its memory footprint. The typical data plan version size is approximately 50 KB.
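Because the byte size of the JSON representation approximates the memory footprint, you can estimate it locally before uploading a plan version. This is a minimal sketch: `planVersionSizeInBytes` is a hypothetical helper, and it assumes compact (non-pretty-printed) JSON encoded as UTF-8, which may differ slightly from how mParticle measures size.

```javascript
// Estimates the size of a plan version as the UTF-8 byte length of its
// compact JSON representation.
function planVersionSizeInBytes(planVersion) {
  const json = JSON.stringify(planVersion);
  return new TextEncoder().encode(json).length;
}

// A tiny stand-in for a downloaded plan version document.
const samplePlanVersion = {
  version_description: "",
  version_document: { data_points: [] },
};
const sizeInBytes = planVersionSizeInBytes(samplePlanVersion);
console.log(`${sizeInBytes} bytes (${(sizeInBytes / 1024).toFixed(2)} KB)`);
```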
You can verify your current usage, check the size of a data plan, and if needed, take action to reduce your memory quota usage:
Contact your mParticle representative if you need more memory provisioned for your account.
To enable validation, you need to point your code to a data plan ID with at least one active version. For a version to be considered active, its status has to be set to `dev` or `dev & prod`.
You can either pin your code to a specific data plan version or omit the version, in which case mParticle will match the data you send with the latest plan version that is active in a given environment (`dev` or `prod`). Learn more about how to implement a data plan with Getting Started.
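The difference between pinning and omitting the version can be sketched as follows. The version list, the status strings other than `dev` and `dev & prod`, and the `resolveVersion` helper are hypothetical. The sketch also assumes that "latest" means the highest version number and that a pinned version must itself be active in the environment; neither detail is confirmed here.

```javascript
// Hypothetical representation of a plan's versions and their statuses.
const planVersions = [
  { version: 1, status: "dev & prod" },
  { version: 2, status: "dev" },
  { version: 3, status: "draft" },
];

// Returns true if a status means the version is active in the environment.
function isActiveIn(status, environment) {
  if (environment === "dev") {
    return status === "dev" || status === "dev & prod";
  }
  if (environment === "prod") {
    return status === "dev & prod";
  }
  return false;
}

// Picks the version used for validation: the pinned one if given (assumed
// to require an active status), otherwise the latest active version.
function resolveVersion(versions, environment, pinnedVersion) {
  const active = versions.filter((v) => isActiveIn(v.status, environment));
  if (pinnedVersion !== undefined) {
    return active.find((v) => v.version === pinnedVersion) || null;
  }
  if (active.length === 0) {
    return null;
  }
  return active.reduce((latest, v) => (v.version > latest.version ? v : latest));
}

console.log(resolveVersion(planVersions, "dev"));     // unpinned, dev
console.log(resolveVersion(planVersions, "prod"));    // unpinned, prod
console.log(resolveVersion(planVersions, "prod", 2)); // pinned, not active in prod
```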
You can plan for and validate the following events:
The following events are not yet included:
You can plan for and validate the following user identifier types:
The `Other` identifiers allow you to enter up to ten different custom strings against which to validate data.
Here’s an example schema configuration for a screen event called “Sandbox Page View”:
This configuration states the following:
- The `custom_attributes` object is required and any additional attributes that are not listed below should be flagged; the behavior for additional attributes is implied by the validation dropdown for the `custom_attributes` object.
- `anchor` is a string, and it’s required.
- `name` is a string, and it’s optional.
Let’s look at a couple of examples to see this schema validation in action.
```javascript
window.mParticle.logPageView(
  'Sandbox Page View',
  {
    'anchor': '/home',
    'name': 'Home',
  }
)
```
This event passes validation.
```javascript
window.mParticle.logPageView(
  'Sandbox Page View',
  {
    'name': 'Home',
  }
)
```
This event fails validation since the required `anchor` attribute is excluded.
```javascript
window.mParticle.logPageView(
  'Sandbox Page View',
  {
    'anchor': '/home',
  }
)
```
This event passes validation: The `name` attribute is excluded but optional.
```javascript
window.mParticle.logPageView(
  'Sandbox Page View',
  {
    'anchor': '/home',
    'label': 'Home'
  }
)
```
This event fails validation: The `label` attribute is unplanned and `custom_attributes` has been configured to disallow additional attributes. You can change this behavior by changing the validation of the `custom_attributes` object to `Allow add'l attributes` (see below).
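The four outcomes above can be reproduced with a small standalone check. This is only a sketch of the required/optional/additional-attribute logic described for "Sandbox Page View"; the schema object and the `validateCustomAttributes` helper are hypothetical and do not call mParticle.

```javascript
// Hypothetical schema mirroring the "Sandbox Page View" configuration:
// anchor is required, name is optional, nothing else is allowed.
const sandboxPageViewSchema = {
  required: ["anchor"],
  properties: { anchor: { type: "string" }, name: { type: "string" } },
  additionalProperties: false,
};

// Returns a list of violations; an empty list means the attributes pass.
function validateCustomAttributes(attributes, schema) {
  const violations = [];
  const present = Object.keys(attributes);
  for (const key of schema.required) {
    if (!present.includes(key)) {
      violations.push(`missing required attribute: ${key}`);
    }
  }
  const planned = Object.keys(schema.properties);
  for (const key of present) {
    if (!planned.includes(key) && !schema.additionalProperties) {
      violations.push(`unplanned attribute: ${key}`);
    }
  }
  return violations;
}

const sandboxResults = [
  validateCustomAttributes({ anchor: "/home", name: "Home" }, sandboxPageViewSchema),
  validateCustomAttributes({ name: "Home" }, sandboxPageViewSchema),
  validateCustomAttributes({ anchor: "/home" }, sandboxPageViewSchema),
  validateCustomAttributes({ anchor: "/home", label: "Home" }, sandboxPageViewSchema),
];
sandboxResults.forEach((violations, index) => {
  console.log(`Example ${index + 1}:`, violations.length === 0 ? "passes" : violations);
});
```

Setting `additionalProperties` to `true` in the sketch makes the fourth example pass, which corresponds to allowing additional attributes on the `custom_attributes` object.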
If you’re looking for an example of how to implement events that conform to your data plan, download your data plan and check out our Snippets Tool. This tool will show you how to implement every data point in your plan for any of our SDKs.
Since various mParticle features (Audiences, Calculated Attributes, Forwarding Rules, some integrations) will automatically convert string representations of numbers and booleans to their respective types, data planning does not distinguish between raw numeric or boolean values (e.g. `42` or `true`) and their string representation (e.g. `"42"` or `"true"`). As long as the value can be converted to a type, it is considered valid.
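A minimal sketch of this "convertible counts as valid" rule is shown below. The `matchesType` helper is hypothetical, and the exact conversion rules mParticle applies (for example, how it treats whitespace or capitalized `"True"`) are not specified here, so the sketch only accepts the plain forms mentioned above.

```javascript
// Returns true if a value already has the given type or is a string that
// can be converted to it. Conversion rules here are an assumption.
function matchesType(value, type) {
  if (type === "number") {
    if (typeof value === "number") {
      return Number.isFinite(value);
    }
    return (
      typeof value === "string" &&
      value.trim() !== "" &&
      Number.isFinite(Number(value))
    );
  }
  if (type === "boolean") {
    return typeof value === "boolean" || value === "true" || value === "false";
  }
  if (type === "string") {
    return typeof value === "string";
  }
  return false;
}

console.log(matchesType(42, "number"), matchesType("42", "number"));
console.log(matchesType(true, "boolean"), matchesType("true", "boolean"));
console.log(matchesType("forty-two", "number"));
```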
You can validate specific attributes differently depending on detected type. Learn more about how type validation works here.
Number can be validated in two ways:
- `minimum` and `maximum` keywords. Learn more here.
- `enum` keyword. Learn more here.
String can be validated in three ways:
- `enum` keyword. Learn more here. Within an enum value, commas are not allowed.
Ingestion, plan validation (and blocking), and event forwarding occur in the following sequence:
Use any API client or SDK to send data to the Events API, and tag the data with your plan ID and, optionally, a plan version. For instructions, see Step 1 in Getting Started.
If you are using an mParticle kit to forward data to a destination, and you have enabled blocking of bad data, you can configure popular client SDKs to block bad data before it is forwarded to a kit. Learn more about blocking bad data before it is sent to kits here.
Your data then passes through the mParticle Rules engine. You can use your Rules to further enrich or fix your data.
Data is then validated and, optionally, blocked. You can see dev data being validated in real-time with Live Stream.
Data is then sent to the mParticle profile storage system. When you block bad data, it is dropped before being stored on a profile. Learn more about what happens when data is blocked here.
Your data then passes through the rest of the mParticle platform and is sent outbound, including:
During plan enforcement, mParticle will generate violations when actual data does not match expectations. mParticle tracks the following types of violations:
The event type and name combination is not expected.
The attribute is not expected on a specific event.
The user attribute or identity is not expected.
This means the attribute is expected, but it has one or more data quality violations such as:
You can’t block the following items:
- `user_attribute_change` can’t be blocked as unplanned data
Blocked data is dropped from your data stream before it is consumed by other mParticle features, such as:
For debugging and reporting purposes, blocked data is shown in Live Stream and the Data Plan Report. Unless you create a Quarantine Connection, you won’t be able to recover blocked data.
Blocking data does not impact MTU or (ingested) event counts.
To prevent blocked data from being lost, you can opt for blocked data to be forwarded to an Output with a Quarantine Connection. To illustrate a typical workflow, assume you choose to configure an Amazon S3 bucket as your Quarantine Output.
Anytime a data point is blocked, the Quarantine Connection will forward the original batch and metadata about what was blocked to the configured Amazon S3 bucket. You will then be able to:
Learn more about how to use quarantined data in the Blocked Data Backfill Guide.
We’ve developed tools for you to be able to lint your Swift, Kotlin/Java, and JavaScript/TypeScript code. For more information, see Linting Tools.
The mParticle Snippets tool helps you to generate example code blocks that log events using the mParticle SDKs in a way that conforms to a specified data plan.
For example, if a data plan includes a data point for a custom event with 10 different attributes, you can create the exact code that will log that event with all of its attributes by running the data plan through the Snippets tool.
This is helpful when integrating the mParticle SDK into your app if you are unsure which method to call to log a specific event or how to ensure that all of an event’s attributes are captured correctly.
To use the Snippets tool:
For example:
```javascript
// Data Plan Point 6
// User added product to cart
let product = mParticle.eCommerce.createProduct('productName', 'productId', 19.199, 1)
mParticle.eCommerce.logProductAction(mParticle.ProductActionType.AddToCart, [product])
```
For this data point, you must first create the product being added to the cart using the `mParticle.eCommerce.createProduct()` method, passing in `productName`, `productId`, `19.199`, and `1` for the product’s name, ID, price, and quantity.
To log the event, your app must call the `mParticle.eCommerce.logProductAction()` method, passing in the product object just created and the product action type (`AddToCart`).
Visit the mParticle developer documentation to learn more about integrating the SDKs into your application.
For more information about the Snippets Tool, visit the GitHub repo.