AI Engineer (GOON-CORE)
Skills
MLX, Quantization, On-Device Inference, Anime, Arm NN, PyTorch, LoRA, Vector Database, Prompt Engineering, AWS, CI/CD, LLM, vLLM, Python, Go, Rust, Docker, Content Safety, TTS, A/B Testing, QLoRA, FlashAttention, RLAIF, Persona Engineering, Semantic Tagging, Emotion Classifier, RoBERTa, Tiny-Mistral, Diffusion, Core ML, NNAPI, Vast.ai, Canary Release, Hentai, Ero-Manga, Waifu Tropes
About the Role
You will fine-tune state-of-the-art large language models and maintain private datasets of policy-safe chat logs, applying LoRA/QLoRA and RLAIF to improve behavior. You will design multi-level prompt and persona stacks, build long-term memory with vector databases and semantic tagging, and implement emotion/kink classifiers to steer replies. You will integrate diffusion and TTS models, optimize on-device inference (Core ML / NNAPI), and implement tiered content-safety filters and A/B experiments. You will containerize services with Docker, autoscale on cloud and spot GPUs, and run CI/CD pipelines with nightly retrains and canary releases. You will pair with product, design, and mobile teams, prototype rapidly, and mentor junior engineers.
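To give candidates a sense of the LoRA technique the role centers on, here is a minimal NumPy sketch of the core idea (a low-rank additive weight update), not a description of this team's actual training stack; all dimensions and names below are illustrative:

```python
import numpy as np

# LoRA: instead of updating a frozen weight W (d_out x d_in), learn a
# low-rank update B @ A with rank r << min(d_out, d_in), scaled by alpha/r.
rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 512, 512, 8, 16

W = rng.standard_normal((d_out, d_in)) * 0.02  # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.02      # trainable down-projection
# Real LoRA initializes B to zeros; made nonzero here so the merge
# equivalence check below is non-trivial.
B = rng.standard_normal((d_out, r)) * 0.02     # trainable up-projection

x = rng.standard_normal((d_in,))

# Forward pass: base path plus scaled low-rank adapter path.
y_adapter = W @ x + (alpha / r) * (B @ (A @ x))

# At inference time the adapter merges into the base weight with no overhead.
W_merged = W + (alpha / r) * (B @ A)
y_merged = W_merged @ x
assert np.allclose(y_adapter, y_merged)

# Trainable parameters drop from d_out*d_in to r*(d_in + d_out).
full, lora = d_out * d_in, r * (d_in + d_out)
print(f"full: {full}, LoRA: {lora} ({lora / full:.1%})")
```

QLoRA applies the same adapter math on top of a quantized (e.g. 4-bit) base model, which is why the two appear together throughout this posting.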
Requirements
- Proven track record fine-tuning LLMs (link to repo or model card)
- Fluency in Python and PyTorch
- Experience with Rust or Go is a plus
- Deep familiarity with hentai, ero-manga, waifu tropes, and r/GoonsGoneWild culture
- 3+ years shipping machine learning to production
- Ability to design content-safety systems that preserve user experience
- iOS and Android on-device ML optimization (extra credit)
- Diffusion or TTS model experience for adult content (extra credit)
Responsibilities
- Fine-tune state-of-the-art LLMs using LoRA and QLoRA
- Build and curate private policy-safe chat datasets and apply RLAIF
- Design multi-level prompt stacks and persona systems
- Implement auto-temperature and other runtime controls
- Architect long-term memory with vector database and semantic tagging
- Implement emotion and kink classifiers to steer replies
- Integrate diffusion and TTS models for multimodal responses
- Optimize on-device inference using Core ML and NNAPI
- Build tiered content-safety filters with in-prompt guardrails and hard classifier gates
- Run A/B tests to balance freedom and user safety
- Containerize services with Docker and autoscale on cloud and spot GPUs
- Maintain CI/CD pipelines for nightly retrains and canary releases
- Collaborate with product, design, and mobile teams and mentor juniors
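The tiered content-safety item above pairs a soft, in-prompt guardrail with a hard deterministic gate that cannot be prompted around. A minimal sketch of that shape, with a stub keyword classifier and a stub model standing in for the real components (all names hypothetical):

```python
from dataclasses import dataclass
from typing import Callable

# Tier 1: soft guardrail prepended to every generation.
GUARDRAIL = (
    "You must refuse any request involving minors, non-consent, "
    "or real-person impersonation."
)

@dataclass
class GateResult:
    allowed: bool
    reason: str

def keyword_gate(text: str, blocklist: set[str]) -> GateResult:
    """Tier 2: hard gate. A deterministic check (here a toy blocklist;
    in practice a trained classifier) that runs outside the prompt."""
    hits = {w for w in blocklist if w in text.lower()}
    if hits:
        return GateResult(False, f"blocked terms: {sorted(hits)}")
    return GateResult(True, "clean")

def tiered_filter(user_msg: str,
                  generate: Callable[[str], str],
                  blocklist: set[str]) -> str:
    # Gate the input before spending any generation budget.
    if not keyword_gate(user_msg, blocklist).allowed:
        return "[refused]"
    prompt = f"{GUARDRAIL}\n\nUser: {user_msg}"
    reply = generate(prompt)
    # Gate the output too: the guardrail alone is advisory, not enforced.
    if not keyword_gate(reply, blocklist).allowed:
        return "[redacted]"
    return reply

# Demo with a stub model.
echo = lambda prompt: "sure, let's chat"
print(tiered_filter("hello", echo, {"badword"}))           # sure, let's chat
print(tiered_filter("badword please", echo, {"badword"}))  # [refused]
```

Gating both input and output is the point of the "hard classifier gates" bullet: the in-prompt tier shapes behavior, while the classifier tier enforces policy regardless of what the model emits.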
Benefits
- Revenue-share tokens once Foxy passes 1M MAU
