Research Crawling Engineer

About the Role

You will design, build, and operate large-scale web data acquisition systems for research and model development. You will implement and maintain distributed crawlers, handle anti-bot and rate-limiting challenges, extract and normalize data from dynamic sites, and build pipelines for cleaning, deduplication, and dataset construction. You will monitor crawl performance and data quality, optimize infrastructure for cost and reliability, and collaborate with researchers to meet modeling needs.
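The cleaning-and-deduplication step described above can be sketched minimally. This is a hypothetical illustration, not the team's actual pipeline: it assumes exact-duplicate detection by hashing whitespace- and case-normalized text (production systems typically add near-duplicate methods such as MinHash on top of this):

```python
import hashlib

def normalize(text: str) -> str:
    # Collapse whitespace and lowercase so trivially different copies hash alike.
    return " ".join(text.lower().split())

def dedupe(docs):
    # Keep the first occurrence of each normalized document; drop exact duplicates.
    seen = set()
    unique = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique
```

Hashing normalized text rather than comparing documents pairwise keeps the step linear in corpus size, which matters at TB–PB scale.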

Requirements

  • Strong programming experience in one or more of Go, Rust, Python, Java, or C++
  • Experience building web crawlers or large-scale data pipelines
  • Solid understanding of HTTP networking and browser behavior
  • Familiarity with distributed systems and parallel processing
  • Experience working with large datasets (TB–PB scale preferred)
  • Ability to debug unstable or adversarial environments

Responsibilities

  • Build and maintain large-scale web crawlers
  • Design high-throughput, fault-tolerant systems for data collection
  • Handle anti-bot systems, rate limits, and dynamic, JS-heavy sites
  • Develop pipelines for cleaning, deduplication, filtering, and normalization
  • Construct and maintain datasets for research and model training
  • Monitor crawl performance, coverage, and data quality, and iterate quickly
  • Collaborate with research teams to align data collection with modeling needs
  • Optimize infrastructure for cost, latency, and reliability
  • Own end-to-end data acquisition pipelines
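As a flavor of the rate-limiting work above, here is a minimal per-host politeness-delay sketch. The class name `HostRateLimiter` and the fixed-interval policy are illustrative assumptions; real crawlers usually combine this with robots.txt rules and adaptive backoff:

```python
import time
from collections import defaultdict

class HostRateLimiter:
    """Enforce a minimum delay between successive requests to the same host.

    NOTE: hypothetical sketch; a fixed per-host interval stands in for
    whatever adaptive policy a production crawler would use.
    """

    def __init__(self, min_interval: float):
        self.min_interval = min_interval
        self.last_request = defaultdict(float)  # host -> monotonic timestamp

    def wait(self, host: str) -> float:
        # Sleep just long enough to honor the per-host interval;
        # return the delay actually imposed.
        now = time.monotonic()
        elapsed = now - self.last_request[host]
        delay = max(0.0, self.min_interval - elapsed)
        if delay:
            time.sleep(delay)
        self.last_request[host] = time.monotonic()
        return delay
```

Keying the delay on the host rather than the whole crawl lets many hosts be fetched in parallel while each individual site still sees a bounded request rate.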

Benefits

  • Benefits package
  • Equity package
  • Fully remote work