Senior AI Data Scientist
About the Role
You will design, evaluate, and evolve agentic systems that reason, plan, and act over complex financial workflows, defining agent behavior, memory, and tool-use strategies with an emphasis on correctness and controllability. You will develop and maintain LLM evaluation frameworks that measure accuracy, faithfulness, latency, and cost, and that catch regressions and edge cases. You will design structured prompting, schemas, and tool-calling strategies for production LLM systems, and build and operate MCP servers, including their schema design, permissions, and safety boundaries. You will analyze model behavior and failure modes, convert qualitative issues into measurable signals, and partner with engineering to productionize research, covering observability, retries, state, and reliability. You will also optimize system performance and cost across models and agent architectures, and mentor engineers and data scientists on best practices for applied LLM and agentic systems.
Requirements
- 6+ years of experience in data science, applied ML, or AI-focused software roles
- 3+ years building production AI/ML systems with ownership beyond experimentation
- Deep hands-on experience with LLMs, agentic patterns, and tool-calling systems
- Strong Python skills and comfort working close to production systems and APIs
- Experience with RAG pipelines, embeddings, and vector databases
- Strong intuition for model behavior, trade-offs, and failure analysis
- Experience applying ML or statistical methods to financial data (crypto/DeFi a plus)
- Track record of building reusable ML or data abstractions that improved team velocity or decision-making
Responsibilities
- Design and own single-agent and multi-agent systems that reason, plan, and act over complex financial workflows
- Define agent behavior, memory, and tool-use strategies with emphasis on correctness and controllability
- Develop and maintain LLM evaluation frameworks covering accuracy, faithfulness, latency, cost, regressions, and edge cases
- Design structured prompting, schemas, and tool-calling strategies for production LLM systems
- Build and operate MCP (Model Context Protocol) servers, including schema design, permissions, and safety boundaries
- Analyze model behavior and failure modes and turn qualitative issues into measurable signals
- Partner with engineering to productionize research insights, including observability, retries, state, and reliability
- Optimize system performance and cost across models, workflows, and agent architectures
- Mentor engineers and data scientists and set best practices for applied LLM and agentic systems
