About Turing:
Based in San Francisco, California, Turing is the world’s leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing supports customers in two ways: first, by accelerating frontier research with high-quality data, advanced training pipelines, and top AI researchers who specialize in coding, reasoning, STEM, multilinguality, multimodality, and agents; and second, by applying that expertise to help enterprises transform AI from proof of concept into proprietary intelligence with systems that perform reliably, deliver measurable impact, and drive lasting results on the P&L.
Role Overview:
We are seeking highly skilled and motivated AI Research Scientists with a Master's or Ph.D. in a biology field to join our team. In this role, you will be instrumental in pushing the boundaries of AI capabilities by creating and evaluating high-level, RLMF-style questions designed to serve as headroom for advanced AI models like Gemini. This position requires deep domain expertise, a keen understanding of current AI limitations, and a commitment to rigorous, verifiable evaluation.
What does day-to-day look like:
You’ll solve and explain advanced biology problems and questions, integrating text and visuals. Your day might include:
- Explaining gene expression regulation with annotated diagrams of transcription/translation processes.
- Solving population genetics problems using Punnett squares, Hardy-Weinberg equations, and evolutionary models.
- Describing physiological systems (e.g., circulatory, respiratory) using integrated anatomical illustrations.
- Answering complex questions on biochemistry, enzymatic reactions, and metabolic pathways using both narrative and schematic models.
Key Responsibilities:
- Develop High-Level Evaluation (HLE) Questions: Create challenging and novel questions that require advanced reasoning and specialized knowledge, aiming to identify areas where current state-of-the-art AI model
- Ensure Domain Expertise: Design questions that necessitate the depth and precision of a graduate-level expert, covering a wide range of highly specialized topics.
- Identify Headroom: Formulate questions whose answers are currently unknown to Gemini 2.5 Pro or that challenge its capabilities, with an added benefit if they also pose a challenge to OpenAI's SOTA model (O3 Pro).
- Ensure Automatic Verifiability: Construct questions with single, definitive, and concise answers to enable objective and streamlined evaluation.
- Incorporate High-Quality Visual Inputs: Develop multimodal questions involving high-quality images, avoiding poor resolution or unclear content.
- Promote Diversity: Contribute to a diverse dataset of topics and question types, avoiding over-representation of specific failure modes.
- Require Expert-Level Reasoning: Design questions that demand expert-level reasoning rather than simple keyword or internet lookups.
- Evaluate Model Responses using Eduarena: Utilize the Eduarena platform to set up side-by-side comparisons between Gemini-2.5-Pro and O3-Pro, and to test against O3-DeepResearch.
- Provide Detailed Feedback: Document model failures, explain reasoning discrepancies, and provide correct answers and concise solutions for verification.
- Maintain Detailed Records: Accurately record questions, solutions, final answers, and links to evaluation sessions in a shared tracking sheet.
Perks of Freelancing With Turing:
- Work in a fully remote environment.
- Opportunity to work on cutting-edge AI projects with leading LLM companies.
- Potential for contract extension based on performance and project needs.
Offer Details:
- Commitment required: at least 4 hours per day and a minimum of 30 hours per week, with 4 hours of overlap with PST. (Two time-commitment options are available: 30 hrs/week or 40 hrs/week.)
- Engagement type: Contractor assignment/freelancer (no medical/paid leave)
- Duration of contract: 1 month
Evaluation Process:
- Shortlisted candidates will be sent a job interest form.
- Based on your answers, you will be contacted to discuss the pre-onboarding requirements.