Urgently Hiring

Senior AI Data Scientist

Posted 2 weeks ago | Senior | Salary: 130K - 150K | Full Time | AI Jobs by Chaos Labs

Skills

Tooling, RAG, Agent, Evaluation, Vector Database, Modeling, Finance, DeFi, Observability, LLM, MCP, Python, Analysis, API, Embedding, Prompting

About the Role

You will design, evaluate, and evolve agentic systems that reason, plan, and act over complex financial workflows. You will define agent behavior, memory, and tool-use strategies, emphasizing correctness and controllability, and develop and maintain LLM evaluation frameworks that measure accuracy, faithfulness, latency, cost, regressions, and edge cases. You will design structured prompting, schemas, and tool-calling strategies for production LLM systems, and build and operate MCP servers, including their schema design, permissions, and safety boundaries. You will analyze model behavior and failure modes, convert qualitative issues into measurable signals, and partner with engineering to productionize research, covering observability, retries, state, and reliability. You will also optimize system performance and cost across models and agent architectures, and mentor engineers and data scientists on best practices for applied LLM and agentic systems.

Requirements

6+ years of experience in data science, applied ML, or AI-focused software roles
3+ years building production AI/ML systems with ownership beyond experimentation
Deep hands-on experience with LLMs, agentic patterns, and tool-calling systems
Strong Python skills and comfort working close to production systems and APIs
Experience with RAG pipelines, embeddings, and vector databases
Strong intuition for model behavior, trade-offs, and failure analysis
Experience applying ML or statistical methods to financial data (crypto/DeFi a plus)
Track record of building reusable ML or data abstractions that improved team velocity or decision-making

Responsibilities

Design and own single-agent and multi-agent systems that reason, plan, and act over complex financial workflows
Define agent behavior, memory, and tool-use strategies with emphasis on correctness and controllability
Develop and maintain LLM evaluation frameworks covering accuracy, faithfulness, latency, cost, regressions, and edge cases
Design structured prompting, schemas, and tool-calling strategies for production LLM systems
Build and operate MCP servers including schema design, permissions, and safety boundaries
Analyze model behavior and failure modes and turn qualitative issues into measurable signals
Partner with engineering to productionize research insights including observability, retries, state, and reliability
Optimize system performance and cost across models, workflows, and agent architectures
Mentor engineers and data scientists and set best practices for applied LLM and agentic systems
