Urgently Hiring

Web Scraping Specialist

Grass

Grass is a network that allows users to share their unused internet bandwidth and earn rewards. It features a Sovereign Data Rollup that uses a network of nodes, validators, and a ZK processor to source and structure web data.

Distributed
About Grass

Grass is a network with a mission to redefine Internet incentive structures. Its infrastructure consists of two key components: the Grass app, which allows users to share their unused internet bandwidth with the network and earn rewards, and the Sovereign Data Rollup. The rollup comprises a network of nodes, routers, a validator, a ZK processor, and a data ledger, which collectively source unstructured web data and transform it into structured datasets.


About the Role

You will build, test, and refine code to extract data from diverse online sources, handling complexities like pagination and dynamic AJAX content. You will clean and format extracted data to meet quality standards and design efficient NoSQL storage solutions. You will deploy and manage scraping jobs on cloud platforms, monitor processes, and resolve issues to maintain continuous data flow. You will apply machine learning methods where useful for data cleaning or categorization and contribute to open source projects related to scraping and data processing.

Requirements

  • Portfolio or examples of past web scraping projects demonstrating the ability to extract data from complex sites
  • Proficiency in Python or JavaScript
  • Experience with BeautifulSoup, Scrapy, or Selenium
  • Knowledge of asynchronous programming, multithreading, and distributed scraping
  • In-depth knowledge of HTML, CSS, JavaScript, and the DOM
  • Experience with NoSQL databases such as MongoDB or Cassandra
  • Experience with cloud services (AWS, Google Cloud, or Azure) for deploying scraping jobs
  • Experience applying machine learning algorithms for data cleaning, categorization, or predictive analysis is a plus
  • Active participation in relevant open source projects

Responsibilities

  • Write, test, and refine code that extracts data from various online sources, ensuring reliability and efficiency
  • Perform data retrieval, handling pagination and dynamic AJAX-loaded content
  • Clean and format extracted data to meet quality standards
  • Design and manage databases for scraped data, optimizing access speed and data integrity
  • Monitor scraping processes, identifying and resolving issues to maintain continuous data flow
  • Optimize scraping processes for distributed and large-scale crawling

Benefits

  • Benefits package
  • Equity package