When setting up Warehouse Sync, you may encounter problems with one or more components of your configuration.
The following troubleshooting guides cover each of these problem categories. If you are still encountering issues after following the appropriate steps below, contact mParticle support.
Connectivity issues are often the result of an incomplete or incorrect first-time configuration.
Before troubleshooting, verify the following:
Check the settings supplied in your POST {baseURI}/connections API call. Are they correct for the data warehouse instance you are trying to connect to?

Errors or incompatibilities in the SQL syntax of your data model will return errors and prevent the sync from succeeding.
Before troubleshooting, run the SQL query outside of mParticle. If it doesn’t run successfully or return the expected results, the issue is likely in your query, independent of Warehouse Sync.
Verify your SQL syntax. While most data warehouses support common SQL syntax, you may encounter exceptions in the SQL extensions for your warehouse. For example:
Using an explicitly quoted lower-case identifier:

SELECT current_timestamp AS "tstamp" FROM tableXYZ ...

in your SQL query will fail if iterator_field is tstamp in your data model.

Workaround 1: remove the explicit identifier quotes:

SELECT current_timestamp AS tstamp FROM tableXYZ ...
"iterator_field": "tstamp"

Workaround 2: force UPPER CASE:

SELECT current_timestamp AS "TSTAMP" FROM tableXYZ ...
"iterator_field": "TSTAMP"
If the error is related to the timestamp field in the query, ensure that:
Pipeline issues are typically caused by security problems, the timestamp field provided in the data model, or other environmental factors.
The report API returns some type of error message, or it returns "successful_records": 0. For example:
{
  "pipeline_id": "string",
  "status": "idle",
  "connection_status": "healthy",
  "data_model_status": "valid",
  "latest_pipeline_run": {
    "id": 0,
    "pipeline_id": "string",
    "type": "scheduled",
    "status": "success",
    "errors": [
      {
        "message": "string"
      }
    ],
    "logical_date": "2023-10-25T18:11:57.321Z",
    "started_on": "2023-10-25T18:11:57.321Z",
    "ended_on": "2023-10-25T18:11:57.321Z",
    "range_start": "2023-10-25T18:11:57.321Z",
    "range_end": "2023-10-25T18:11:57.321Z",
    "successful_records": 0,
    "failed_records": 0
  }
}
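A response like the one above can be checked programmatically before escalating. This is a minimal sketch, assuming the report payload has the shape shown; the function name and summary strings are illustrative, not part of the mParticle API:

```python
import json

def summarize_latest_run(report_payload: str) -> str:
    """Summarize the latest_pipeline_run section of a pipeline report."""
    report = json.loads(report_payload)
    run = report.get("latest_pipeline_run") or {}
    messages = [e["message"] for e in run.get("errors", []) if e.get("message")]
    if run.get("status") != "success" or messages:
        return "run failed: " + ("; ".join(messages) or "no error message")
    if run.get("successful_records", 0) == 0:
        return "run succeeded but synced 0 records; check the timestamp field and sync range"
    return "ok: %d records synced" % run["successful_records"]
```

A run that reports "status": "success" but "successful_records": 0 is the symptom described above, and usually points at the timestamp field in the data model rather than connectivity.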
Before troubleshooting, verify the following:
Importing and mapping problems usually result from incorrect mapping between data rows in the warehouse and user profiles or attributes in mParticle.
Before troubleshooting, verify the following:
Provide mParticle support or your account representative with the event batch JSON object from your mParticle Livestream, or the MPID and batch ID for the event, as well as a CSV of the source data.
If a table schema changes and validation is still occurring, you may need to wait 24 hours for the cache in BigQuery to clear and reset before trying again.
The incremental sync mode uses the specified iterator field to track what data has been synchronized, in a monotonically increasing fashion. If you need to synchronize a specific window of time, you can create a new full, once pipeline and use the from and until parameters to capture the desired data interval. You may reuse your existing connection, field transformation, and data model.
First, use the Get a Specific Pipeline endpoint to retrieve details of an existing pipeline for the specific time window synchronization. You may want to reuse the following parameters:
pipeline_type
connection_id
field_transformation_id
data_model_id
partner_feed_id
iterator_field
iterator_data_type
environment
data_plan_id
data_plan_version
Then, use the Create a Pipeline endpoint to create your new pipeline. In this example, we create a new full, once pipeline to retrieve data from 2022-07-01T16:00:00Z to 2022-08-01T16:00:00Z.
{
  "id": "sync-specific-time-window",
  "name": "Sync Specific Time Window",
  "pipeline_type": "events",
  "connection_id": "existing-connection-id",
  "field_transformation_id": "existing-field-transformation-id",
  "data_model_id": "existing-data-model-id",
  "partner_feed_id": 1234,
  "state": "active",
  "sync_mode": {
    "type": "full",
    "iterator_field": "updated_at",
    "iterator_data_type": "timestamp",
    "from": "2022-07-01T16:00:00Z",
    "until": "2022-08-01T16:00:00Z"
  },
  "schedule": {
    "type": "once"
  },
  "environment": "development",
  "data_plan_id": "example-data-plan-id",
  "data_plan_version": 2
}
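The two steps above can also be combined in code: fetch the existing pipeline, copy the reusable parameters, and assemble the request body for the new one-time full sync. This is a sketch only; the helper name is hypothetical, the field names follow the example above, and the actual HTTP call and authentication are omitted:

```python
def build_window_pipeline(existing: dict, pipeline_id: str, name: str,
                          window_from: str, window_until: str) -> dict:
    """Build a one-time full-sync pipeline body that reuses an existing
    pipeline's connection, field transformation, and data model."""
    return {
        "id": pipeline_id,
        "name": name,
        "pipeline_type": existing["pipeline_type"],
        "connection_id": existing["connection_id"],
        "field_transformation_id": existing["field_transformation_id"],
        "data_model_id": existing["data_model_id"],
        "partner_feed_id": existing["partner_feed_id"],
        "state": "active",
        "sync_mode": {
            "type": "full",
            "iterator_field": existing["iterator_field"],
            "iterator_data_type": existing["iterator_data_type"],
            "from": window_from,
            "until": window_until,
        },
        "schedule": {"type": "once"},
        "environment": existing["environment"],
        "data_plan_id": existing["data_plan_id"],
        "data_plan_version": existing["data_plan_version"],
    }
```

Passing the parameters retrieved from the Get a Specific Pipeline response produces a body equivalent to the JSON example above.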