Sample dataset generator for Aiven for Apache Kafka®
====================================================

Learning to work with streaming data is much more fun when data is actually flowing, so to get you started on your Apache Kafka® journey this article shows how to generate fake streaming data into a topic.

.. note::

   The following example is based on Docker images, which require `Docker <https://www.docker.com/>`_ or `Podman <https://podman.io/>`_ to be executed.

   The example also assumes you have an Aiven for Apache Kafka® service running. You can create one following the :doc:`dedicated instructions `.

Fake data generator on Docker
-----------------------------

To learn data streaming you need a continuous flow of data, and for that you can use the `Dockerized fake data producer for Aiven for Apache Kafka® <https://github.com/aiven/fake-data-producer-for-apache-kafka-docker>`_. To start using the generator:

1. Clone the repository:

   .. code::

      git clone https://github.com/aiven/fake-data-producer-for-apache-kafka-docker

2. Copy the file ``conf/env.conf.sample`` to ``conf/env.conf``.
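
   For example, from the root of the cloned repository:

   .. code::

      cp conf/env.conf.sample conf/env.conf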

3. Create a new access token via the `Aiven Console `_ or the following command in the :doc:`Aiven CLI `, changing the ``max-age-seconds`` appropriately for the duration of your test:

   .. code::

      avn user access-token create \
        --description "Token used by Fake data generator" \
        --max-age-seconds 3600 \
        --json | jq -r '.[].full_token'

   .. tip::

      The above command uses `jq <https://stedolan.github.io/jq/>`_ to parse the result of the Aiven CLI command.

      If you don't have ``jq`` installed, you can remove the ``| jq -r '.[].full_token'`` section from the command and parse the JSON result manually to extract the access token.
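
   As a sketch of such manual parsing, assuming ``python3`` is available: save the JSON output to a file and extract the token with the Python standard library. The JSON shape (a list of token objects with a ``full_token`` field) is the one implied by the ``jq`` filter above; the file name ``token.json`` is arbitrary.

   .. code::

      # Save the CLI output to a file instead of piping it to jq
      avn user access-token create \
        --description "Token used by Fake data generator" \
        --max-age-seconds 3600 \
        --json > token.json

      # The output is a list of token objects; print the first full_token
      python3 -c 'import json; print(json.load(open("token.json"))[0]["full_token"])'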

4. Edit the ``conf/env.conf`` file, filling in the following placeholders:

   * ``my_project_name``: the name of your Aiven project
   * ``my_kafka_service_name``: the name of your Aiven for Apache Kafka® service
   * ``my_topic_name``: the name of the target topic; can be any name
   * ``my_aiven_email``: the email address used as username to log in to Aiven services
   * ``my_aiven_token``: the access token generated in the previous step
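
   As an illustration, a filled-in configuration could look like the following sketch. The variable names here are hypothetical; keep the keys that ``conf/env.conf.sample`` actually defines and replace only the placeholder values:

   .. code::

      # Hypothetical sketch of conf/env.conf after filling the placeholders;
      # the real key names are defined in conf/env.conf.sample
      PROJECT_NAME="my-project"
      SERVICE_NAME="my-kafka-service"
      TOPIC="fake-data-topic"
      USERNAME="name.surname@example.com"
      TOKEN="<the access token created in the previous step>"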

5. Build the Docker image with:

   .. code::

      docker build -t fake-data-producer-for-apache-kafka-docker .

   .. tip::

      Every time you change any parameters in the ``conf/env.conf`` file, you need to rebuild the Docker image for the changes to take effect.

6. Start the streaming data flow with:

   .. code::

      docker run fake-data-producer-for-apache-kafka-docker

7. Once the Docker image is running, check in the target Aiven for Apache Kafka® service that the topic is being populated. You can do this in the *Topics* tab of the `Aiven Console `_ if the Kafka REST option is enabled. Alternatively, you can use tools like :doc:`kcat ` to achieve the same.
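
   For instance, assuming you have downloaded the service's SSL credentials (``service.key``, ``service.cert``, and ``ca.pem``) and substituted your own host, port, and topic name for the placeholders, a ``kcat`` consumer could look like this sketch:

   .. code::

      # Consume from the target topic over SSL; HOST, PORT, the certificate
      # paths, and my_topic_name are placeholders for your service's details
      kcat -b HOST:PORT \
        -X security.protocol=ssl \
        -X ssl.key.location=service.key \
        -X ssl.certificate.location=service.cert \
        -X ssl.ca.location=ca.pem \
        -t my_topic_name -C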