
Shashi Shankar
Mar 19, 2023
Easily Generate and Feed Test Data to an Amazon Kinesis Stream
What is KDG
Kinesis Data Generator (KDG) is a tool provided by AWS for generating sample data streams that simulate real-world scenarios. It allows users to create and customize data streams with various parameters such as message frequency, size, and format.
This open source tool consists of static HTML and JavaScript. To open it, visit: https://awslabs.github.io/amazon-kinesis-data-generator/web/producer.html
This tool is commonly used during development, testing, and demonstration phases of projects that involve streaming data processing with services like Amazon Kinesis. Developers can use Kinesis Data Generator to generate synthetic data that resembles the actual data their applications will process, enabling them to test their data processing pipelines, applications, and analytics workflows under different scenarios.
Kinesis Data Generator is particularly useful for verifying the scalability, reliability, and performance of streaming data applications and services. It helps developers identify potential issues and optimize their solutions before deploying them in production environments. Additionally, it can be utilized in training sessions and workshops to illustrate the capabilities of AWS streaming data services.
How to Use KDG
· Create an Amazon Cognito user in your AWS account
· Open KDG using the link above
· Log in with the Cognito user credentials
· Create a record template of your choice representing the structure of test data
· Populate the template with test data
· Save the template for future use
How to Activate Data Generation
Log in to KDG.
Provide the following information on the KDG UI screen:
o Select the region where the stream was created
o Provide the delivery stream name
o Provide the number of records to be sent per second **
o Provide a record template with sample data
o Click the Send button
** Please note that KDG is not intended for load testing. However, according to Amazon documentation, the number of records per second can be in the thousands, depending on the complexity of the test data record structure.
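For readers who want to script a similar flow outside the browser, the sketch below mirrors the inputs listed above: a region, a delivery stream name, a records-per-second count, and a record source. It is not how KDG itself is implemented. It assumes boto3 is installed and AWS credentials are configured, and the function names (encode_record, chunk, send_for_seconds, firehose_sender) are hypothetical helpers introduced only for illustration.

```python
import json
import time
from typing import Callable, Dict, Iterator, List

# Firehose PutRecordBatch accepts at most 500 records per call.
MAX_BATCH = 500


def encode_record(record: Dict) -> Dict[str, bytes]:
    """Serialize one record as newline-terminated JSON in the shape Firehose expects."""
    return {"Data": (json.dumps(record) + "\n").encode("utf-8")}


def chunk(items: List, size: int = MAX_BATCH) -> Iterator[List]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_for_seconds(
    put_batch: Callable[[List[Dict[str, bytes]]], None],
    make_record: Callable[[], Dict],
    records_per_second: int,
    seconds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send roughly `records_per_second` records each second; return the total sent.

    Pacing is approximate: the loop sleeps one second after each burst and
    does not subtract the time spent sending.
    """
    sent = 0
    for _ in range(seconds):
        records = [encode_record(make_record()) for _ in range(records_per_second)]
        for batch in chunk(records):
            put_batch(batch)
            sent += len(batch)
        sleep(1)
    return sent


def firehose_sender(region: str, stream_name: str) -> Callable[[List[Dict[str, bytes]]], None]:
    """Return a put_batch function bound to one Firehose delivery stream."""
    import boto3  # imported lazily so the helpers above work without boto3

    client = boto3.client("firehose", region_name=region)

    def put_batch(batch: List[Dict[str, bytes]]) -> None:
        # This sketch ignores FailedPutCount in the response; real code should retry.
        client.put_record_batch(DeliveryStreamName=stream_name, Records=batch)

    return put_batch
```

A call such as send_for_seconds(firehose_sender("us-east-1", "my-stream"), make_record, 100, 10) would send about 100 records per second for ten seconds, where the region and stream name are placeholders and make_record is any function that returns one record as a dict. Like KDG, this is meant for functional testing rather than load testing.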
How to Create Record Templates

Reference: Amazon AWS Documentation
Enclose the record layout in curly braces {}.
Define how many sensors to simulate.
Define the randomness for sensor data, such as temperature or current.
The KDG template format uses double curly braces ‘{{ }}’ to enclose items that should be replaced before the record is sent to Amazon Kinesis.
Here is an example:
{
"sensor_id": "{{random.number({"min":1,"max":48})}}",
"timestamp": "{{date.now}}",
"current_flow": "{{random.float({"min":0,"max":1000,"precision":0.01})}}",
"status": "{{random.arrayElement(["low","high","very high"])}}"
}
In the sample above, KDG picks a random sensor number between 1 and 48, and for each sensor it sends a random current reading and a status of ‘low’, ‘high’ or ‘very high’.
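To make the shape of the generated records concrete, here is a minimal Python sketch that produces records with the same four fields as the template above. It is an approximation written for illustration, not KDG's own logic: the function name make_sensor_record is hypothetical, the ISO 8601 timestamp format is an assumption (KDG's default date format may differ), and every value is emitted as a string because the template wraps each placeholder in quotes.

```python
import random
from datetime import datetime, timezone
from typing import Dict

# Status values taken from the template above.
STATUSES = ["low", "high", "very high"]


def make_sensor_record(rng: random.Random) -> Dict[str, str]:
    """Build one record shaped like the sample template."""
    return {
        # Sensor number between 1 and 48, inclusive.
        "sensor_id": str(rng.randint(1, 48)),
        # Assumption: ISO 8601 in UTC; the real KDG output format may differ.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # Reading between 0 and 1000, rounded to two decimal places.
        "current_flow": f"{rng.uniform(0, 1000):.2f}",
        "status": rng.choice(STATUSES),
    }


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(3):
        print(make_sensor_record(rng))
```

Passing a seeded random.Random instance (for example random.Random(42)) makes the sensor numbers, readings, and statuses repeatable, which can help when a test needs to compare pipeline output against known input.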
To create the Cognito user needed to log in to KDG, along with the IAM roles it requires, Amazon provides a CloudFormation template. It can be found at: https://awslabs.github.io/amazon-kinesis-data-generator/web/help.html
The CloudFormation template will create the following resources in your AWS account:
· An IAM role that gives the Lambda function permission to create Cognito resources.
· An IAM role that is assigned to authenticated Cognito users. This role has only enough permission to use the KDG.
· An IAM role that is assigned to unauthenticated Cognito users. This role has only enough permission to create Cognito analytics events.
· A Lambda function to bootstrap the Cognito Lambda install from GitHub.
· A Lambda function to set up Cognito.
The Cognito Lambda function will create the following resources in your AWS account:
· A Cognito User Pool.
· A Cognito Federated Identity Pool.
· A Cognito User, with the username and password specified by you when you created the CloudFormation stack.
· The necessary relationships between the roles, users and pools.
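After the stack finishes, one way to confirm what it created is to list the stack's resources and count them by type. The sketch below assumes boto3 is installed and credentials are configured; the function names and the stack name passed in are hypothetical. Because the Cognito pools and user are created by the Lambda function rather than by CloudFormation directly, they would not be expected to appear in this listing.

```python
from collections import Counter
from typing import Dict, Iterable, List


def count_resource_types(stack_resources: Iterable[Dict]) -> Counter:
    """Count stack resources by their CloudFormation resource type."""
    return Counter(resource["ResourceType"] for resource in stack_resources)


def fetch_stack_resources(stack_name: str, region: str) -> List[Dict]:
    """Fetch the resource descriptions for one CloudFormation stack."""
    import boto3  # imported lazily so count_resource_types works without boto3

    client = boto3.client("cloudformation", region_name=region)
    response = client.describe_stack_resources(StackName=stack_name)
    return response["StackResources"]
```

For example, count_resource_types(fetch_stack_resources("my-kdg-stack", "us-east-1")) should report three AWS::IAM::Role entries and two AWS::Lambda::Function entries if the stack matches the list above; the stack name and region here are placeholders, and the stack may contain additional resource types not listed in this article.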
To summarize, Amazon’s Kinesis Data Generator (KDG) is a very handy tool for generating test data for stream pipelines that use Amazon Data Firehose (formerly Kinesis Firehose). It is browser-based and does not require installation. Record templates created in KDG can be saved for reuse and easily integrated into a stream pipeline testing strategy.