CategoriesAll

Event

Kafka is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.

Prerequisites

To enable the Kafka integration, you will need a list of comma-separated bootstrap servers that identify an initial subset of servers, “brokers,” in the Kafka cluster. You will also need to provide the name of the Kafka topic stream to publish to.

Each broker is a host and port pair, separated by a colon:

localhost:9092
localhost:9092,anotherhost:9092

mParticle will pick the first available broker from the list and use that broker to discover the full set of brokers in the cluster. Because membership in a cluster dynamically changes and can be quite large, it is not necessary to specify all of the brokers. Forwarding will make use of all brokers regardless of which brokers were initially specified.

The cluster of brokers is managed by an additional Zookeeper server that handles membership changes and leader election processes.

Authentication

mParticle will only transmit data to Kafka encrypted over TLS. Brokers must use a certificate issued by a globally trusted certificate authority and containing a CN (common name) or SAN (subject alternative name) that matches the hostname.

Username and password authentication is optional.

If no username is provided, the Kafka security protocol used is SSL.

If a username is provided, the Kafka security protocol used is SASL_SSL. Note that it is not possible to have a password without a username. The SASL-enabled mechanisms supported are PLAIN (default), SCRAM-SHA-256, SCRAM-SHA-512.

Reference the Kafka documentation for more information about configuring your streams.

With username and password

Below is an example configuration using the PLAIN SASL mechanism. The configuration settings are contained within the .properties file for your broker:

ssl.client.auth=none
sasl.enabled.mechanisms=PLAIN
security.inter.broker.protocol=SASL_SSL

# The following settings are needed to allow mParticle to verify your certificate
ssl.keystore.location=your-keystore-location
ssl.keystore.password=your-keystore-password
ssl.keystore.type=JKS
ssl.key.password=your-key-password

# Replace with your specific host name and port
listeners=SASL_SSL://your-host.name:port

java.security.auth.login.config=/path/to/jaas.conf

The keystore may be in PKCS12 or JKS format.

In your jaas.conf, create a username and password for mParticle:

KafkaServer {
  org.apache.kafka.common.security.plain.PlainLoginModule required
  username="admin"
  password="admin-secret"
  user_admin="admin-secret"
  user_mparticle="mparticle-secret";
};

The KafkaServer block in jaas.conf is in addition to another Client block, stored within the file and containing credentials for use with the Zookeeper server that manages brokers in the cluster.

The username and password keys are used by the broker as login credentials for login into other brokers. Note that these should not be used as the User Name and Password in the Configuration Settings.

The user_xxxxxx keys identifies valid accounts and passwords pairs for logging into the broker itself. The User Name and Password in the Configuration Settings must match one of the pairs with the user_ prefix. For example, a valid user in the above example is either admin or mparticle.

To avoid placing username and password information in configuration files, you can write a custom login module or use the SCRAM mechanisms to store credentials within the Zookeeper server.

Without username and password

In the .properties file for your broker, set the following

ssl.client.auth=none
security.inter.broker.protocol=SSL

# The following settings are needed to allow mParticle to verify your certificate
ssl.keystore.location=your-keystore-location
ssl.keystore.password=your-keystore-password
ssl.keystore.type=JKS
ssl.key.password=your-key-password

# Replace with your specific host name and port
listeners=SSL://your-host.name:port

Further information

For more information on Apache Kafka security configuration, see https://kafka.apache.org/documentation/#security. More information about the general structure of the jaas.conf file can be found here: https://docs.oracle.com/javase/7/docs/technotes/guides/security/jgss/tutorials/LoginConfigFile.html.

Data processing notes

  • The Kafka integration accepts data from all platform types.
  • Kafka topics are divided into partitions. Ordering is only guaranteed within a partition. Events forwarded to a Kafka topic are assigned an mParticle user ID as the partitioning key so that each user’s events are sent to the same partition and are received in order. In the rare case that an event does not have a user ID, the event is forwarded to a random partition.

Supported Identities

mParticle forwards the following identifiers to Kafka, where available:

  • Android ID
  • GAID (Google Advertising ID)
  • IDFA (iOS Advertising ID)
  • IDFV (iOS Vendor ID)
  • Customer ID
  • Email address
  • Facebook Audience ID
  • Facebook ID
  • Google ID
  • Microsoft ID
  • mParticle ID
  • Twitter ID
  • Yahoo ID
  • Other (and Other2, Other3, and Other4)

Supported events

mParticle forwards the following event types to Kafka:

  • App Event
  • Application State Transition
  • Breadcrumb
  • Commerce Event
  • Crash Report
  • First Run
  • Network Performance
  • Opt Out
  • Profile
  • Push Message
  • Push Registration
  • Screen View
  • Session Start / End
  • Uninstall
  • User Attribute Change
  • User Identity Change

Data format

mParticle forwards data to Kafka as event batches in our standard outgoing JSON format.

Configuration Settings

Setting Name Data Type Default Value Description
Kafka Bootstrap Servers string This is a comma-separated list of Kafka bootstrap servers.
Topic Name string This is your Kafka topic.
Authentication Mechanism string PLAIN This is the mechanism used for sending username and password to the Kafka cluster. It must be one of PLAIN, SCRAM-SHA-256, SCRAM-SHA-512. This is only used when the user name is non-empty.
User Name string User Name for your Kafka account
Password string Password for your Kafka account

Connection Settings

Setting Name Data Type Default Value Description
Unique ID string All
Send Lifecycle Events bool True All
Send Screen Views bool True All
Send Crash Events bool True All
Send Network Performance Events bool True All
Send Custom Events bool True All
Send Push Registrations and Receipts bool True All
Send as Batch bool True All
Include Location Information bool True All
Include User Identities bool True All
Send Profile Change Events bool True All
Send Commerce Events bool True All
Include Metadata bool True All
Include User Attribute Change Events bool False All
Include User Identity Change Events bool False All
Send Batches without Events bool False All

Was this page helpful?