GenAI Engineer – Test Agent / CI Integration

Full-time

Hybrid

Skills

Python

E2E Testing

Overview

Location: Gurugram, India (Hybrid)

Experience: 4–8 Years

Employment Type: Full-Time

Job Description:

We are looking for a highly skilled GenAI Engineer – Test Agent / CI Integration to build and scale intelligent AI-powered testing systems for next-generation applications. The ideal candidate will work on automated test-agent frameworks, synthetic data generation, evaluation harnesses, and CI/CD-integrated AI testing pipelines.

This role requires strong expertise in Python backend development, LLM evaluation frameworks, retrieval-grounded testing systems, and modern DevOps practices.

Key Responsibilities:

AI Test Agent Development

• Design and develop autonomous AI-driven test agents for validating GenAI and LLM-powered applications

• Build systems for:

• Synthetic data generation

• Test-case synthesis

• Scenario generation

• Adversarial and edge-case testing

• Develop reusable evaluation harnesses for benchmarking model quality, accuracy, safety, and reliability

Context-Aware Test Generation

• Integrate test agents with BLK’s knowledge/context graph for retrieval-grounded testing

• Enable contextual test generation using RAG pipelines and graph-based retrieval systems

• Ensure generated tests align with enterprise knowledge sources and real-world workflows

CI/CD & Automation

• Integrate AI test agents into CI/CD pipelines as first-class pipeline jobs

• Automate regression testing, evaluation runs, and quality scoring during deployments

• Build scalable validation workflows for continuous model monitoring and release gating

Evaluation Frameworks & Quality Engineering

• Work with LLM evaluation frameworks such as:

• DeepEval

• Ragas

• Custom evaluation frameworks

• Develop automated scoring mechanisms for:

• Hallucination detection

• Faithfulness

• Relevance

• Toxicity

• Response quality

• Integrate with pytest and existing QA ecosystems

Backend & Infrastructure

• Build and maintain Python backend services powering evaluation workflows

• Optimize distributed evaluation execution for scalability and performance

• Collaborate with platform, MLOps, and DevOps teams for production deployment

Required Skills & Qualifications:

Technical Skills

• Strong proficiency in Python

• Experience with:

• pytest

• CI/CD pipelines (GitHub Actions, Jenkins, GitLab CI, etc.)

• REST APIs & backend development

• Hands-on experience with:

• LLM evaluation frameworks (DeepEval, Ragas, LangSmith, custom evaluators)

• RAG systems and retrieval pipelines

• Synthetic dataset generation

• Prompt engineering and evaluation strategies

AI/ML & GenAI Expertise

• Strong understanding of:

• Large Language Models (LLMs)

• Agentic systems

• AI evaluation methodologies

• Context grounding and knowledge retrieval

• Familiarity with vector databases, embeddings, and knowledge graphs

DevOps & Automation

• Experience integrating AI workflows into CI/CD environments

• Understanding of automated quality gates and testing orchestration

Preferred Qualifications

• Experience working with knowledge graphs or graph databases

• Exposure to LangChain, LlamaIndex, or similar orchestration frameworks

• Familiarity with Kubernetes, Docker, and cloud platforms (AWS/GCP/Azure)

• Experience in enterprise-scale AI platform engineering

Important Note:

This is a niche requirement and not a regular GenAI developer role. We are specifically looking for candidates with experience in:

• AI validation/testing

• QE automation

• Python backend

• CI/CD integration

• LLM evaluation frameworks

• RAG and retrieval-grounded systems
