Machine Learning Engineer
Grass is a decentralized network that allows users to share their unused internet bandwidth and earn rewards. The infrastructure facilitates the sourcing and transformation of web data into structured datasets for AI.
About Wynd Network
Grass is a decentralized network for accessing the public web, allowing users to sell their unused network resources to companies and AI labs. It functions as a residential proxy network: users earn points by running the Grass software, and those points will be converted to a network stake. The network is designed to provide web data for training AI models, with a focus on privacy and security for its users. The infrastructure consists of the Grass app and the Sovereign Data Rollup, which includes a network of nodes, routers, validators, a ZK processor, and a data ledger that together facilitate the sourcing and transformation of unstructured web data into structured datasets.
About the Role
You will develop, fine-tune, and deploy large language models for NLP tasks such as text generation, summarization, translation, and sentiment analysis. You will design and implement scalable data processing and analysis pipelines, analyze complex time series data, and apply OCR to extract and normalize text. You will build, maintain, and improve data-driven models and algorithms, ensure data quality and integrity, collaborate with cross-functional teams to deliver solutions, and continuously research and adopt best practices in machine learning and data science.
Requirements
- Bachelor’s, Master’s, or Doctoral degree in Data Science, Computer Science, Statistics, or a related field
- Minimum of 3 years of work or research experience with large datasets
- Strong coding skills in Python or other object-oriented programming languages
- Graduate-level knowledge of statistics including hypothesis testing, regression analysis, and probability
- Good communication skills and ability to articulate complex data concepts to non-technical stakeholders
- Experience working in a high-output team and thriving in a fast-paced startup environment
Responsibilities
- Develop, fine-tune, and deploy LLMs for NLP tasks
- Design and implement data processing and analysis pipelines
- Analyze time series data and provide actionable insights
- Design, implement, and maintain data-driven models and algorithms
- Ensure data quality and integrity across processes
- Implement OCR solutions to extract and normalize text
- Collaborate with cross-functional teams to deliver data solutions
- Research and apply best practices in machine learning and data science
- Contribute to the development and improvement of internal data processing tools and infrastructure
Benefits
- Remote work
- Equity
- Benefits package
