Data Engineer
About the Role
You will build scalable, reliable data pipelines; develop and maintain streaming and batch ETL; and administer large cloud-deployed relational and non-relational databases. You will also design, document, automate, and execute test plans to ensure dataset quality; collaborate on architecture with an emphasis on scalability and security; drive systems toward near-real-time operation; and participate in generating and analyzing features.
Requirements
- Experience building streaming and batch data pipelines and ETL
- Expertise in Python
- Expertise in PostgreSQL, including PL/pgSQL development and the administration of large databases
- Experience with scalability solutions, including multi-region replication and failover
- Experience with data warehouse and orchestration technologies such as Trino, ClickHouse, and Airflow
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience
- Deep understanding of programming and experience with at least one programming language
- English language proficiency
- Knowledge of Kubernetes and Docker
- 4+ years of professional experience in a relevant data role
- Knowledge of blockchain technology or the mining pool industry
- Experience with agile development methodology
- Experience delivering and owning web-scale data systems in production
- Experience with Kafka or Kafka-compatible platforms, preferably Redpanda and Redpanda Connect
Responsibilities
- Build scalable and reliable data pipelines that provide accurate data feeds from internal and external systems
- Administer scalable, performant, cloud-deployed production relational and non-relational databases
- Collaborate on architecture definitions with a focus on scalable and secure solutions
- Drive data systems toward near-real-time operation
- Design, document, automate and execute test plans to ensure dataset quality
- Participate in generating and analyzing features
