A Quick Demo of Apache Beam with Docker

Deploy Flink & Beam with Docker

git clone https://github.com/ecesena/docker-beam-flink.git
cd docker-beam-flink
docker-compose up -d
docker-compose scale taskmanager=2
docker ps
CONTAINER ID IMAGE      ... NAMES
3d59d952d152 beam-flink ... dockerbeamflink_taskmanager_2
4cce6219be80 beam-flink ... dockerbeamflink_taskmanager_1
3b7b6b32b4de beam-flink ... dockerbeamflink_jobmanager_1
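
To confirm the scale-up without reading the table by eye, the TaskManager containers can be counted by name. This is a minimal sketch, assuming container names contain `_taskmanager_` as in the listing above; `count_taskmanagers` is a hypothetical helper, not something shipped with the repository.

```shell
# Hypothetical helper: count container names that look like Flink TaskManagers.
# Reads one container name per line on stdin and prints the number of matches.
count_taskmanagers() {
  # grep -c exits non-zero when nothing matches; "|| true" keeps the status clean
  # while still printing 0.
  grep -c '_taskmanager_' || true
}

# Usage against a running Docker daemon (assumes docker is on PATH):
#   docker ps --format '{{.Names}}' | count_taskmanagers
```

After `docker-compose scale taskmanager=2`, the usage line should print 2 if both TaskManagers came up.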

Run HelloWo — ehm, WordCount

open http://$(docker-machine ip default):48080
  1. Click “Submit new Job” in the left menu; beam-starter-0.1.jar should already be uploaded.
  2. Check the box next to beam-starter-0.1.jar.
  3. Click “Submit” (or “Show Plan”). No additional parameters are needed.
docker exec -it dockerbeamflink_taskmanager_1 /bin/bash
cat /tmp/output.txt*
...
live: 13
long: 15
look: 14
lord: 90
lose: 6
...
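
The raw output is in no useful order, so a quick way to check the result is to rank words by count. This is a minimal sketch, assuming every line has the `word: count` shape shown above; `top_words` is a hypothetical helper name.

```shell
# Hypothetical helper: print the N most frequent words from "word: count" lines.
# Reads lines on stdin; N defaults to 10.
top_words() {
  n="${1:-10}"
  # Split fields on ':' and sort numerically on the count, largest first.
  sort -t: -k2,2nr | head -n "$n"
}

# Usage inside the container:
#   cat /tmp/output.txt* | top_words 5
```

Applied to the five sample lines above, `top_words 2` would print `lord: 90` followed by `long: 15`.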

Build a Beam Pipeline

git clone https://github.com/ecesena/beam-starter
cd beam-starter
mvn clean package
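
By default Maven writes the packaged artifact under `target/`, so that is where to look for the jar to upload through the Flink UI. This is a minimal sketch for listing what the build produced; `list_built_jars` is a hypothetical helper, and the exact jar name depends on the project's pom.xml, which is not shown here.

```shell
# Hypothetical helper: list jar files directly under a Maven project's target/ directory.
# Takes the project directory as its first argument (defaults to the current directory).
list_built_jars() {
  project_dir="${1:-.}"
  # Print nothing (rather than an error) if target/ does not exist yet.
  find "$project_dir/target" -maxdepth 1 -type f -name '*.jar' 2>/dev/null | sort
}

# Usage from the beam-starter checkout, after the build finishes:
#   list_built_jars .
```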

Emanuele Cesena

Forging the Everdragons2 NFT. Former security at Pinterest.
