Annotator - STEM

Full-time

Remote

Skills

Python

Cyber Security

Overview

About Turing:

Turing is one of the world’s fastest-growing AI companies accelerating the advancement and deployment of powerful AI systems.

Turing helps customers in two ways: working with the world’s leading AI labs to advance frontier model capabilities in thinking, reasoning, coding, agentic behavior, multimodality, multilinguality, STEM and frontier knowledge; and leveraging that work to build real-world AI systems that address mission-critical priorities for companies.

About the Role:

Annotators are the core builders of SkillsBench. You will design and write AI evaluation tasks: structured challenges given to large language model (LLM) agents running inside automated environments. Each task you create tests whether an AI agent performs significantly better when given domain-specific knowledge (skills) than without it. Your tasks feed directly into Turing's commercial AI evaluation pipeline, which is used by clients.

What You Will Do:

Write clear, unambiguous task instructions that define exactly what an AI agent must produce, where to save it, and what rules to follow

Create reference solutions that demonstrate the correct approach and pass all automated checks

Write human-readable verifier descriptions listing every check the automated test suite will run

Author domain-specific skill files that teach an AI agent the conventions, workflows, and edge cases relevant to the task, without leaking expected answers

Ensure the no-skills variant of each task is identical to the with-skills variant except for the absence of skill files

Work within the task structure (instruction, environment, solution, tests) and follow Turing's task quality standards

Required:

Bachelor's degree or higher in a relevant technical or domain-specific field (Computer Science, Engineering, Finance, Data Science, Linguistics, etc.)

Experience: 1–3 years in a domain where you have hands-on practical expertise (software development, financial analysis, document processing, data science, etc.)

Must Have:

Strong written English; ability to write precise, unambiguous instructions

Genuine hands-on expertise in at least one of the SkillsBench domains (coding, finance, document generation, audio/ML, etc.)

Ability to think from an AI agent's perspective — what would a model get wrong without guidance?

Comfort reading and producing structured file outputs (JSON, DOCX, XLSX, Markdown)

Nice to Have:

Prior experience with LLM evaluation, prompt engineering, or AI benchmark design

Familiarity with Python scripting

Experience with Docker or containerised environments

Domains:

Power Systems & Control

Cybersecurity

Network & System Engineering

Offer Details:

Commitment required: 40 hours per week, with a 4-hour overlap with PST
Engagement type: Contractor assignment (no medical/paid leave)
Duration of contract: 2 months (expected start date is next week)
Location: India, Pakistan, Nigeria, Kenya, Egypt, Ghana, Bangladesh, Turkey, Mexico
