Cloud & Engineering

We regularly write about our technical experiences (good and bad) and what we're learning from the market.

Write and deploy an Apache Beam pipeline with Dataflow

Posted by Sheng Wu on 02 April 2019

tech, gcp, dataflow, Apache Beam, Fast-Data, parquet, csv

Overview 

Apache Beam is a unified programming model; the name Beam combines Batch and strEAM. It handles both batch and streaming data and can run on different runners, such as Google Cloud Dataflow, Apache Spark, and Apache Flink. The Beam programming guide documents how to develop a pipeline and the ...

Continue reading

Access KSQL server in Google Kubernetes Engine locally in 5 steps

Posted by Tabish Ghani on 25 March 2019

kubernetes, container, kafka, dataflow, Fast-Data

Overview

Apache Kafka supports both local and cloud deployment, so you can publish data from an on-premises environment and trigger services in the cloud. It is at the heart of our stacks that require real-time processing. Confluent KSQL (a streaming SQL engine) allows stream processing through a simple, interactive SQL interface...

Continue reading