Handle very many pipelined web, database, and micro-service requests concurrently and efficiently


Pipelined functions can make for memory-efficient, readable code. These are functions where the output of one feeds into the input of the next. Some example applications:

  • a multi-stage web scraper across different but related data sources
  • a machine learning pipeline with data extraction, transformation, and processing steps
  • doing line-by-line operations on large datasets
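Before adding concurrency, the stages-feeding-stages idea can be sketched with plain generators. This is a minimal example under my own assumptions (the stage names and data are made up): because each stage is a generator, lines stream through one at a time instead of being loaded into memory all at once.

```python
def read_lines(lines):
    """First stage: yield raw lines (stands in for a file or network source)."""
    for line in lines:
        yield line

def strip_blank(lines):
    """Second stage: drop trailing whitespace and skip empty lines."""
    for line in lines:
        line = line.rstrip()
        if line:
            yield line

def to_upper(lines):
    """Third stage: transform each surviving line."""
    for line in lines:
        yield line.upper()

raw = ["alpha\n", "\n", "beta\n"]
pipeline = to_upper(strip_blank(read_lines(raw)))
print(list(pipeline))  # ['ALPHA', 'BETA']
```

Nothing runs until the final `list()` pulls items through; each line flows through all three stages before the next one is read.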

Python’s asyncio library makes concurrent operations fast and resource-efficient, especially for IO-bound tasks.
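The pipeline idea can be combined with asyncio by making each stage a coroutine connected to the next by a queue, so slow IO in one stage overlaps with work in the others. This is a hedged sketch, not the article's actual implementation; the URLs and the `asyncio.sleep` standing in for a real request are illustrative.

```python
import asyncio

async def produce(urls, out_q):
    """First stage: feed work items into the pipeline."""
    for url in urls:
        await out_q.put(url)
    await out_q.put(None)  # sentinel: no more work

async def fetch(in_q, out_q):
    """Second stage: simulate an IO-bound request per item."""
    while (url := await in_q.get()) is not None:
        await asyncio.sleep(0.01)  # stands in for a real HTTP request
        await out_q.put(f"body of {url}")
    await out_q.put(None)  # pass the sentinel downstream

async def collect(in_q, results):
    """Final stage: gather the processed items."""
    while (item := await in_q.get()) is not None:
        results.append(item)

async def main(urls):
    q1, q2, results = asyncio.Queue(), asyncio.Queue(), []
    # All three stages run concurrently, linked by the queues.
    await asyncio.gather(produce(urls, q1), fetch(q1, q2), collect(q2, results))
    return results

print(asyncio.run(main(["https://example.com/a", "https://example.com/b"])))
```

To scale the slow stage, you could start several `fetch` workers reading from the same input queue, at the cost of losing output ordering.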

This article will show you how to build your own pipelines that can run very efficiently, especially if you do a lot of…

5 tips to help you put this beast to work

BigQuery is Google’s analytical database; it is insanely fast at querying large datasets and lets you get started for free. At the time of writing, the first 1 TB of data processed is free, and the first 10 GB of data stored is free. In this article I’ll show you some tips on how to get the most out of this amazing system.


BigQuery is a distributed system that combines distributed storage with a compute cluster, often throwing 2,000 or more slots (vCPUs) at your query in parallel. Use these tips to put that power to good use.

Tip #1: Generate Test Data on the Fly with Unnest

You’ll see this in…
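The teaser above is cut short, but the technique it names is BigQuery standard SQL's `UNNEST` paired with `GENERATE_ARRAY`: the function builds an array of integers and `UNNEST` expands it into rows, so you can generate test data without loading a table (and since no table is scanned, such a query typically bills 0 bytes). A sketch, with illustrative column names, wrapped in Python only so the query string is easy to reuse:

```python
# BigQuery standard SQL: GENERATE_ARRAY(1, 1000) builds [1, 2, ..., 1000],
# and UNNEST turns that array into one row per element. The `bucket`
# column is a made-up example of deriving extra test fields.
test_data_query = """
SELECT
  id,
  MOD(id, 10) AS bucket
FROM UNNEST(GENERATE_ARRAY(1, 1000)) AS id
"""

print(test_data_query.strip())
```

Paste the query into the BigQuery console (or pass the string to a client library) to get 1,000 rows of synthetic data on the fly.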

Dewald Abrie

Exploring the limits of software, data, and ML.
