Senior Software Engineer, Data Platform

About the Role

You will design, implement, and operate data infrastructure that processes petabytes of blockchain and transactional data in real time. You will build reliable data services that integrate with multiple blockchains, develop and optimize ETL pipelines and streaming workflows, and create data models for fast querying. You will deploy, scale, and monitor large database clusters, automate routine scaling and maintenance tasks, and implement observability to ensure performance and high availability. You will also optimize systems for operational speed and deliver pragmatic first-version solutions that can be iterated on quickly.

Requirements

  • Bachelor's degree or equivalent in Computer Science or a related field
  • 8+ years of hands-on experience architecting distributed systems and shipping production deployments
  • Exceptional programming skills in Python
  • Proficiency in SQL or SparkSQL
  • Experience with data stores such as ClickHouse, Elasticsearch, Postgres, Redis, and Neo4j
  • Familiarity with orchestration tools such as Airflow, dbt, Luigi, Azkaban, or Storm
  • Expertise in data processing and streaming technologies such as Spark, Kafka, and Flink
  • Experience deploying and monitoring infrastructure in the public cloud using Docker, Terraform, Kubernetes, and Datadog
  • Proven ability to load, query, and transform very large datasets

Responsibilities

  • Build highly reliable data services to integrate with multiple blockchains
  • Develop complex ETL pipelines to transform and process petabytes of data in real time
  • Design and architect data models for optimal storage and sub-second query latency
  • Deploy, scale, and monitor large database clusters with a focus on performance and high availability
  • Implement data pipeline orchestration and streaming workflows
  • Automate scaling and maintenance tasks such as PgBouncer provisioning, disk scaling, and cluster updates
  • Deliver pragmatic first-version solutions quickly and iterate based on stakeholder feedback