Data Engineer

About the Role

You will build scalable and reliable data pipelines, develop and maintain streaming and batch ETL, and administer large cloud-deployed relational and non-relational databases. You will also design, document, automate, and execute test plans to ensure dataset quality; collaborate on architecture with an emphasis on scalability and security; drive systems toward near real-time operation; and participate in generating and analyzing features.

Requirements

  • Experience building streaming and batch data pipelines and ETL
  • Expertise in Python
  • Expertise in PostgreSQL and PL/pgSQL development and administration of large databases
  • Experience with scalability solutions and multi-region replication and failover
  • Experience with data warehouse and orchestration technologies such as Trino, ClickHouse, and Airflow
  • Bachelor's degree in Computer Science, Engineering, or related field or equivalent experience
  • Deep understanding of programming and experience with at least one programming language
  • English language proficiency
  • Knowledge of Kubernetes and Docker
  • 4+ years of professional experience in a relevant data field
  • Knowledge of blockchain technology or the mining pool industry
  • Experience with agile development methodology
  • Experience delivering and owning web-scale data systems in production
  • Experience with Kafka or Kafka-compatible systems, preferably Redpanda and Redpanda Connect

Responsibilities

  • Build scalable and reliable data pipelines that provide accurate data feeds from internal and external systems
  • Administer scalable and performant cloud-deployed production relational and non-relational databases
  • Collaborate on architecture definitions with a focus on scalable and secure solutions
  • Drive data systems to be as near real-time as possible
  • Design, document, automate and execute test plans to ensure dataset quality
  • Participate in generating and analyzing features