Who are Ballotpedia?
Ballotpedia are a nonprofit online political encyclopedia covering federal, state, and local politics, elections, and public policy in the United States. They are a premier resource for unbiased information on United States elections, politics, and policy.
Why Lambert Labs?
Ballotpedia chose to collaborate with Lambert Labs because of our expertise and experience in running web-based Extract, Transform, Load (ETL) workloads in AWS. We were a particularly good fit given our proficiency with Python, Scrapy, Docker containerisation, and PostgreSQL databases, technologies that lend themselves well to running modern web data extraction applications in the cloud.
What we did – technical implementation
- Conducted an AWS Well-Architected Review to identify elements of Ballotpedia’s AWS infrastructure that would benefit from improvement or reconfiguration
- Containerised a Python Scrapy application and Python scripts using Docker and Amazon ECR
- Ran containers in AWS using ECS Fargate tasks and EventBridge rules
- Extracted HTML data into Amazon RDS PostgreSQL databases and PDF files into Amazon S3 buckets
- Used Optical Character Recognition (OCR) libraries to extract text from searchable and non-searchable PDFs, and PyQuery to extract data from HTML
- Integrated Scrapy with Zyte for browser automation
- Set up an automated testing suite using Pytest
- Set up a CI/CD pipeline (linting, formatting, unit tests, database migrations, containerisation) in CircleCI
- Used Alembic with SQLAlchemy for database migrations
- Set up monitoring and logging in Amazon CloudWatch
- Used RDS CloudFormation templates to satisfy Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO)
- Automated the creation and running of ECS Fargate tasks via JSON templates and Boto3 scripts
- Set up pre-commit hooks (linting, formatting, unit tests) in local development
- Used Docker Compose to simplify and streamline local development
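To give a flavour of the JSON-template-plus-Boto3 approach mentioned in the list above, here is a minimal sketch. It is not the code used on the project: the template contents, the helper names (`render_task_definition`, `build_run_task_kwargs`, `register_and_run`), the CPU and memory sizes, and the `scrapy crawl` command are all illustrative assumptions. Only `register_task_definition` and `run_task` are real Boto3 ECS client calls.

```python
import json
from string import Template

# Hypothetical Fargate task definition template. Every value here is an
# assumption made for illustration, not Ballotpedia's real configuration.
TASK_TEMPLATE = Template("""
{
  "family": "$family",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "$execution_role_arn",
  "containerDefinitions": [
    {
      "name": "$family",
      "image": "$image",
      "essential": true,
      "command": ["scrapy", "crawl", "$spider"]
    }
  ]
}
""")


def render_task_definition(family, image, spider, execution_role_arn):
    """Fill the JSON template and parse it into register_task_definition kwargs."""
    rendered = TASK_TEMPLATE.substitute(
        family=family,
        image=image,
        spider=spider,
        execution_role_arn=execution_role_arn,
    )
    return json.loads(rendered)


def build_run_task_kwargs(cluster, family, subnets, security_groups):
    """Build the keyword arguments for running one Fargate task."""
    return {
        "cluster": cluster,
        "taskDefinition": family,
        "launchType": "FARGATE",
        "count": 1,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "securityGroups": security_groups,
                "assignPublicIp": "DISABLED",
            }
        },
    }


def register_and_run(task_definition, run_kwargs):
    """Register the task definition, then start the task (needs AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    ecs = boto3.client("ecs")
    ecs.register_task_definition(**task_definition)
    return ecs.run_task(**run_kwargs)
```

Keeping the template rendering separate from the AWS calls means the generated task definition can be checked in unit tests without touching a real account; a schedule could then be attached with an EventBridge rule that targets the registered task definition.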
We continue to work with Ballotpedia on an ongoing basis, conducting regular AWS Well-Architected Reviews and providing AWS consultancy with a specific focus on solutions architecture.
If you would like to find out more about Lambert Labs, please read about our Python development services, AWS consulting services or AWS Well-Architected Framework Review services.