Imagine you work for a company that produces tires. For many years, your company has been collecting sensor data from the tires it has sold. Now the time has come to extract value from that data. You will lead the implementation of a data-driven predictive maintenance service, so that your company can take over responsibility for timely maintenance from the customer (“servitization”). You need to gather the right expertise around you and build a digital infrastructure. But how do you limit the amount of digital and technical knowledge the team needs if digitalization is not your core business?

This blog…

Using AWS Timestream, FastAPI on Lambda and HTTP API

Imagine that you invest in wind turbines and even build them in your backyard. To gain full insight into the performance of your investment, you install sensors to measure power production and set up a weather station. But how do you get this data into the cloud in a practical way? As my colleague Jaap de Koning and I pondered this hypothetical situation, we set out to investigate the latest technologies on AWS.

This blog discusses new functionality that lets you set up your own IoT platform with…

How to create a serverless deployment with Lambda, SNS, SQS and Kinesis

tl;dr Testing and updating machine learning models can be done safely and systematically using the Rendezvous architecture. This architecture lets you run multiple model versions in parallel, by decoupling them with the message queue SQS and streaming platform Kinesis. A rendezvous function is responsible for selecting the model result with the highest relevance. Although the Rendezvous architecture is relatively easy to implement, defining the input/output structure of the models can be tricky. The input/output structure is the contract between models and the rendezvous function, but new model versions may need different information and therefore complicate the contract.
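As a rough illustration of the rendezvous step described above, here is a minimal Python sketch. It assumes every model version emits a result that follows one shared contract and carries a numeric relevance score. The `ModelResult` class, its fields, and the `rendezvous` function are hypothetical names chosen for this example, and the results are held in an in-memory list rather than read from SQS or Kinesis.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ModelResult:
    """Hypothetical output contract shared by every model version."""
    request_id: str
    model_version: str
    prediction: float
    relevance: float  # assumed score: higher means this result is preferred


def rendezvous(results: Iterable[ModelResult], request_id: str) -> Optional[ModelResult]:
    """Return the most relevant result for one request, or None if no model answered."""
    candidates = [r for r in results if r.request_id == request_id]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.relevance)


# Two model versions answer the same request; a third result belongs to another request.
results = [
    ModelResult("req-1", "v1", 0.42, relevance=0.6),
    ModelResult("req-1", "v2", 0.47, relevance=0.9),
    ModelResult("req-2", "v1", 0.10, relevance=0.8),
]
best = rendezvous(results, "req-1")
```

Because the selection only depends on the shared fields, a new model version can run in parallel as long as it fills in the same contract. Any extra information a new version needs would have to be added to that contract, which is where the difficulty mentioned above comes from.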

In the book

From asynchronous HTTP request to parquet file

tl;dr Serverless functions are great for lightweight cloud architecture and rapid provisioning. However, serverless sometimes adds complexity to the deployment process. I compare Python and Go with respect to ease of deployment when setting up a simple data factory on AWS Lambda. The factory makes many HTTP requests, validates the responses, and creates parquet files. Go turns out to have advantages over Python thanks to its packaging system, cross-platform compilation, and native parquet implementation.

Serverless functions are a great way to simplify your cloud stack and quickly produce new functionality. Serverless contributes to scalability, transferability, and flexibility of the entire…

During workshops, I often see participants wrestle with software installation before they can get started. This wastes already limited time that could be spent on learning. Wouldn’t it be nice if this could be avoided?

For our PyData workshop with 80 participants, we decided to develop a stack that provisions a dedicated, zero-setup environment for each participant. We opted for a cloud-based environment because this allowed us to design the structure. In this blog post, I share our experiences and provide the source code, which you could use for your own workshop.

The structure is as follows. First, the challenge…

Dick Abma

Data Engineer at BigData Republic
