Building and maintaining an ETL pipeline for Let’s Do This with Python, MongoDB and Amazon Redshift

Let’s Do This (Y Combinator winter 2018) are a fast-growing startup aiming to be the Airbnb of the endurance sports world. We carried out one of their key data analytics objectives: replicating NoSQL (MongoDB) data into a relational database (Amazon Redshift) in a format designed to be as easy as possible for staff across their business to query.

To speed up the process and stay within budget, we used a third-party data integration provider, Stitch, whilst also writing our own SQL to tailor the results to the client’s exact specifications.
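To illustrate the general idea of reshaping nested MongoDB documents into flat relational rows, here is a minimal Python sketch. It is a generic illustration only, not the code used on this project and not a description of how Stitch works internally. The document shape, the field names (`location`, `tickets`), the `__` column separator and the helper names `flatten_document` and `split_child_rows` are all assumptions made up for the example.

```python
from typing import Any


def flatten_document(doc: dict[str, Any], parent_key: str = "", sep: str = "__") -> dict[str, Any]:
    """Flatten nested dicts into one level, joining keys into column names."""
    flat: dict[str, Any] = {}
    for key, value in doc.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse so that {"location": {"city": ...}} becomes "location__city".
            flat.update(flatten_document(value, column, sep))
        else:
            flat[column] = value
    return flat


def split_child_rows(doc: dict[str, Any], array_field: str, id_field: str = "_id") -> list[dict[str, Any]]:
    """Turn an embedded array into rows for a child table keyed by the parent id."""
    rows = []
    for position, item in enumerate(doc.get(array_field, [])):
        row: dict[str, Any] = {"parent_id": doc[id_field], "position": position}
        if isinstance(item, dict):
            row.update(flatten_document(item))
        else:
            # Scalars in the array go into a single generic column.
            row["value"] = item
        rows.append(row)
    return rows


# Hypothetical document, invented for this example.
event = {
    "_id": "evt_1",
    "name": "City 10k",
    "location": {"city": "London", "country": "GB"},
    "tickets": [
        {"type": "standard", "price": 25},
        {"type": "charity", "price": 0},
    ],
}

# One row for the parent table, and one row per embedded ticket for a child table.
parent_row = flatten_document({k: v for k, v in event.items() if k != "tickets"})
ticket_rows = split_child_rows(event, "tickets")
```

Splitting embedded arrays into a separate child table, rather than packing them into a single column, keeps each table queryable with ordinary joins, which suits the goal of making the data easy to query.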

As with many technology companies today, part of Let’s Do This’ business model relies on gathering data from a range of different sources. We used our software development expertise to build a data aggregation and cleaning pipeline that is efficient, scalable, and easy to extend and maintain, following modern industry best practices along the way.

We turned to Lambert Labs when we needed help with our data ingestion and ETL processes. Work was always completed professionally, and they helped make our data pipelines more efficient. George and his team really go above and beyond to ensure targets are met on time and I can wholeheartedly recommend them to anyone. (Sam Browne, CEO, Let’s Do This)

Find out more about how we build robust Python applications in the cloud.