Mid-Level (2-5 years)

Generative AI Engineer

You'll be building and optimising our generative AI systems, getting hands-on with large language models (LLMs) and making sure they actually work in the real world. This isn't just research; it's about making our products smarter and our customers happier. You'll be a key part of the team that turns cutting-edge AI concepts into tangible features.

Job ID: JD-TECH-GEAI-002
Department: Technical Roles
NOS Level: Level 5-6 (OFQUAL equivalent)
OFQUAL Level: Level 5-6
Experience: Mid-Level (2-5 years)

Role Purpose & Context

Role Summary

The Generative AI Engineer is responsible for taking generative AI models from concept to production, making sure they're robust, performant, and deliver real value. You'll be knee-deep in code, building out features and fixing the inevitable quirks that come with working with LLMs. This role sits right at the heart of our product development, translating complex AI research into practical, user-facing applications. When you do this well, our products feel genuinely intelligent and intuitive, which means happier customers and new revenue streams. If it's not done right, we risk shipping features that don't quite work, leading to frustration and wasted effort. The tricky part is keeping up with the insane pace of AI innovation while still delivering stable, reliable systems. But honestly, the reward is seeing your work directly impact how people use our products, making them genuinely better.

Reporting Structure

Key Stakeholders

Internal:

External:

Organisational Impact

Scope: This role directly impacts our product's intelligence and our ability to deliver innovative features. Your work will shape how our customers interact with our platform, driving user engagement and, ultimately, our market position. Get it right, and we're seen as leaders; get it wrong, and we're just another company dabbling in AI.

Performance Metrics

Quantitative Metrics

  1. Metric: Model Inference Latency (P95)
     Desc: The time it takes for our generative AI models to respond to a request.
     Target: Maintain P95 latency below 500ms for user-facing features.
     Freq: Weekly, monitored via production dashboards.
     Example: If a user asks a question, we want the AI to answer within half a second, 95% of the time. If it's consistently hitting 700ms, that's a problem we need to fix.
  2. Metric: Task-specific Evaluation Score
     Desc: How accurately and appropriately our models generate content for specific tasks (e.g., summarisation, code generation, question answering).
     Target: Achieve 85% accuracy/relevance on internal evaluation benchmarks for new features.
     Freq: Per feature release and monthly re-evaluation.
     Example: For our new content generation tool, if at least 17 out of 20 outputs are considered 'good' by human evaluators, we're hitting our target. If it drops to 12, we know something's off.
  3. Metric: Deployment Frequency
     Desc: How often you're able to push new or updated generative AI features/models to production.
     Target: Deploy new model versions or feature updates at least once every two weeks.
     Freq: Tracked through CI/CD pipelines.
     Example: If we're only pushing code once a month, we're moving too slowly. The goal is to iterate quickly, so getting new improvements out every fortnight is key.
  4. Metric: Cost-per-inference Optimisation
     Desc: The computational cost associated with each model request.
     Target: Reduce average cost-per-inference by 10% quarter-over-quarter without sacrificing quality.
     Freq: Monthly cost reports.
     Example: If each API call to our LLM costs £0.01, we're looking to get that down to £0.009 next quarter. Small savings add up quickly when you're running millions of inferences.
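
As a rough illustration of how the P95 latency target could be checked, here is a minimal Python sketch. It assumes latencies are available as a plain list of millisecond values and uses the nearest-rank percentile method; the function names and sample data are hypothetical, and a real monitoring dashboard may calculate the percentile differently.

```python
def p95(latencies_ms):
    """Return the nearest-rank 95th percentile of a list of latencies (ms)."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Nearest-rank method: rank = ceil(0.95 * n), computed with integers
    # to avoid floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_latency_target(latencies_ms, target_ms=500):
    """True if the P95 latency is strictly below the target (500ms by default)."""
    return p95(latencies_ms) < target_ms


# Hypothetical sample: 95 fast requests and 5 slow ones.
samples = [120] * 95 + [700] * 5
print(p95(samples))                   # 120
print(meets_latency_target(samples))  # True
```

With this method, a handful of slow outliers does not breach the target, but once more than 5% of requests are slow the P95 value jumps to the slow latency.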

Qualitative Metrics

  1. Metric: Code Quality & Maintainability
     Desc: How clean, well-documented, and easy-to-understand your code is, making it simple for others to pick up or debug.
     Evidence: Positive feedback in code reviews, fewer bugs reported in your modules, clear and concise documentation (docstrings, READMEs), and your solutions being adopted as patterns by others.
  2. Metric: Problem-Solving Effectiveness
     Desc: Your ability to diagnose and fix issues with generative AI models, especially when things go wrong in unexpected ways.
     Evidence: Successfully debugging complex model behaviours (e.g., hallucinations, unexpected outputs), identifying root causes quickly, and proposing robust solutions that prevent recurrence. You're the person others come to when they're stuck on a tricky model issue.
  3. Metric: Collaboration & Knowledge Sharing
     Desc: How well you work with other teams and share what you've learned, helping everyone else get better at GenAI.
     Evidence: Actively participating in team discussions, offering helpful advice in code reviews, contributing to internal knowledge bases or wikis, and helping junior team members get unstuck. You're seen as someone who makes the team smarter.
  4. Metric: Proactive Issue Identification
     Desc: Spotting potential problems with models or pipelines before they become big, user-facing issues.
     Evidence: Raising concerns about data drift, potential model biases, or scaling challenges before they impact production. You're not just reacting to fires; you're trying to prevent them.

Primary Traits

Supporting Traits

Primary Motivators

  1. Motivator: Solving Hard, Novel Technical Problems
     Daily: You'll spend your days grappling with tricky issues like reducing model hallucinations, optimising inference speed for a new architecture, or figuring out how to get a RAG pipeline to work reliably with really messy data. It's a constant puzzle.
  2. Motivator: Direct Impact on Product & Users
     Daily: Your code and models won't just sit in a research paper; they'll be integrated into our products, and you'll see users interacting with them. You'll get feedback (good and bad) and know your work is making a difference.
  3. Motivator: Continuous Learning in a Rapidly Evolving Field
     Daily: You'll be expected (and encouraged) to spend time exploring new models, frameworks, and research papers. This isn't a static role; you'll constantly be learning and applying the latest techniques.

Potential Demotivators

Honestly, this role isn't for everyone. If you need a perfectly stable, predictable environment, you'll probably struggle. You'll rerun the same analysis three times because stakeholders keep changing the question. The 'urgent' request that disrupted your Thursday will get deprioritised on Friday. You'll build a beautiful model that never gets deployed because the business moved on, or a new, better open-source model came out last week. If you need to see every piece of work make it to production, or if you get frustrated by ambiguity and constant change, you'll find this tough. We won't pretend it's easy.

Common Frustrations

  1. The unreasonable effectiveness of 'magic': Stakeholders often treat LLMs as magical black boxes, leading to wildly unrealistic expectations and feature requests that defy the current laws of AI physics.
  2. The data janitor reality: 80% of building a good RAG system is the unglamorous work of cleaning, chunking, and preparing messy, unstructured source documents.
  3. Chasing a moving target: You spend three months fine-tuning a model for a specific task, only for GPT-5 to be released, making your work obsolete overnight. The pace is relentless.
  4. The GPU budget scrutiny: Your requests for more A100/H100 compute are scrutinised by finance like a capital expense, forcing you to justify every pound spent on experimental model training.
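
To make the chunking step mentioned above a little more concrete, here is a minimal Python sketch that splits a document into overlapping word-based chunks. The function name, chunk size, and overlap values are illustrative assumptions rather than a description of any actual pipeline; real systems often chunk by tokens or by document structure instead.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into chunks of up to chunk_size words, with overlapping edges."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    # str.split() with no arguments also collapses messy whitespace and newlines.
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once this chunk has reached the end of the document.
        if start + chunk_size >= len(words):
            break
    return chunks


# Hypothetical usage with a tiny document and small chunk settings.
print(chunk_text("one  two\nthree four   five", chunk_size=3, overlap=1))
# ['one two three', 'three four five']
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of storing some words twice.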

What Role Doesn't Offer

  1. A slow, predictable pace with clearly defined, unchanging requirements.
  2. A role where you only work on greenfield projects; there's plenty of existing code to maintain and improve.
  3. A role where you're handed perfectly clean datasets; you'll be doing a lot of data wrangling.
  4. A role where every single model or feature you build makes it to production; some experiments just won't pan out.

ADHD Positives

  1. The fast-paced nature and constant novelty of generative AI can be highly engaging for those with ADHD, providing continuous stimulation and new challenges.
  2. The ability to hyper-focus on complex technical problems, like optimising a model or debugging a tricky RAG pipeline, can be a significant strength.
  3. The need for rapid iteration and experimentation aligns well with a preference for dynamic, hands-on work rather than long, monotonous tasks.

ADHD Challenges and Accommodations

  1. Managing multiple concurrent tasks and shifting priorities can be challenging; we use structured project management tools and daily stand-ups to help keep things on track.
  2. Detailed documentation can feel tedious; we encourage using AI tools for initial drafts and pair programming for review to make it less of a solo burden.
  3. Maintaining focus during long, uninterrupted coding sessions might be difficult; we support flexible work patterns and regular breaks to help manage energy levels.

Dyslexia Positives

  1. Strong spatial reasoning and pattern recognition, often associated with dyslexia, are incredibly valuable for understanding complex model architectures and identifying trends in data.
  2. The emphasis on logical problem-solving and abstract thinking in AI engineering can be a great fit, as these strengths are often pronounced.
  3. Visual tools for model architecture design (e.g., flowcharts, diagrams) and data visualisation are heavily used, playing to visual processing strengths.

Dyslexia Challenges and Accommodations

  1. Reading and writing extensive documentation or research papers can be demanding; we provide access to text-to-speech tools and encourage the use of AI summarisation for long documents.
  2. Careful attention to syntax in code is crucial; we use robust IDEs with strong auto-completion, linting, and pair programming for code reviews to catch errors collaboratively.
  3. Proofreading written communications (emails, reports) might take more effort; we encourage using grammar checkers and peer review for important documents.

Autism Positives

  1. A deep focus on logic, systems, and detail is highly beneficial for debugging models, optimising algorithms, and ensuring the precision of AI systems.
  2. The preference for clear, direct communication and objective data analysis aligns well with the technical nature of the role.
  3. The opportunity to specialise in complex technical areas, becoming a subject matter expert in specific generative AI techniques, can be very rewarding.

Autism Challenges and Accommodations

  1. Navigating ambiguous requirements or rapidly changing stakeholder expectations can be difficult; we strive for clear, written specifications and provide a Senior Engineer as a consistent point of contact.
  2. Unplanned social interactions or noisy open-plan environments can be overwhelming; we offer options for focused work in quieter spaces or remote work, and schedule meetings with clear agendas.
  3. Interpreting subtle social cues in team dynamics might be a challenge; we foster a culture of direct, respectful feedback and provide clear expectations for collaboration.

Sensory Considerations

Our main office is a modern, open-plan space which can sometimes be a bit lively. That said, we have quiet zones, noise-cancelling headphones available, and plenty of flexibility to work from home a few days a week. We're happy to discuss specific needs to make sure you're comfortable and can do your best work.

Flexibility Notes

We believe in output, not hours. We offer flexible start and end times, and a hybrid working model. If you need specific adjustments, let's talk about them – we're committed to making this a great place to work for everyone.

Key Responsibilities

Experience Levels Responsibilities

  Level: Mid-Level Generative AI Engineer

  Responsibilities:

  1. Independently build and deploy core components of our generative AI features, like a new RAG pipeline for customer support or a text summarisation module for our internal tools. You'll own it from start to finish for smaller features.
  2. Take ownership of optimising existing generative models for performance, latency, and cost. This means digging into inference settings, trying out different PEFT methods (like LoRA), and making sure we're not burning money on GPUs.
  3. Identify and debug complex issues in our generative AI systems, whether it's models hallucinating, RAG pipelines returning irrelevant context, or unexpected API errors. You'll be the one figuring out 'why did it say that?'
  4. Propose and implement improvements to our prompt engineering strategies, constantly refining how we talk to our LLMs to get better, more reliable outputs for specific tasks. This is a continuous process.
  5. Collaborate closely with product managers and other engineering teams to understand requirements and integrate generative AI features seamlessly into our existing products. You'll be the technical voice in those discussions.
  6. Contribute to our internal documentation and knowledge sharing, making sure that what you've built and learned is clearly explained for others. Yes, it's boring sometimes, but future-you (and everyone else) will thank you.
  7. Participate in code reviews, offering constructive feedback to peers and learning from their approaches. We all get better together.

  Supervision: You'll typically have weekly check-ins with a Senior Generative AI Engineer or your manager. For routine tasks, you'll work independently, but for anything novel or particularly tricky, you're expected to flag it and get guidance.

  Decision: You'll make routine technical decisions within the scope of your assigned projects, like choosing an appropriate embedding model or fine-tuning technique. Budget decisions above £2,000 or significant architectural changes will need approval from a Senior Engineer or your manager. You'll inform stakeholders about progress and potential roadblocks, but consult on major changes to timelines or scope.

  Success: Success looks like reliably delivering high-quality, performant generative AI features that meet product requirements, proactively identifying and solving technical challenges, and contributing positively to the team's overall knowledge and capabilities. Basically, you're building stuff that works, and you're getting better at it every day.

Decision-Making Authority

Save 10-15 hours weekly with AI-powered engineering tools!

Let's be real, a lot of engineering work involves repetitive tasks, boilerplate code, and sifting through mountains of information. But what if you could offload some of that grunt work to AI? That's exactly what we're doing here at Zavmo.

Tool: Boilerplate Code Generation

Benefit: Use a code-generation model like GitHub Copilot or similar tools to instantly create data loading scripts, model class skeletons, unit tests, and even complex API integrations. This means less time writing repetitive code and more time on core logic and innovation.

Tool: Research Paper Summariser

Benefit: Feed the latest ArXiv papers, technical blogs, or internal documentation into an LLM to get concise summaries of key innovations, methodologies, and results. Stay current with the rapidly evolving GenAI landscape without having to read every 20-page paper in full detail.

Tool: Interactive Debugging Assistant

Benefit: Paste complex error messages, confusing code blocks, or even a tricky prompt into a chat interface and ask an LLM to explain potential causes, suggest fixes, or refactor the code for clarity. It's like having a senior engineer on standby 24/7.

Tool: Automated Documentation Writer

Benefit: Use an AI tool to automatically generate docstrings, README files, API documentation, and even internal wikis directly from your code. Ensure your projects are always well-documented without the manual grind, freeing you up for more impactful work.

Weekly time savings potential: Roughly 10-15 hours per week on routine tasks.
Typical tool investment: You'll have access to our internal AI tools and external subscriptions, typically costing around £50-£150 per month, paid for by us.

Competency Requirements

Foundation Skills (Transferable)

Beyond the technical wizardry, we need people who can think clearly, work well with others, and adapt when things inevitably go sideways. These are the bedrock skills that make a great engineer.

Functional Skills (Role-Specific Technical)

This is where the rubber meets the road. You'll need solid hands-on experience with the tools and techniques that make generative AI actually work in a product setting.

Technical Competencies

Digital Tools

Industry Knowledge

Regulatory Compliance Regulations

Essential Prerequisites

Career Pathway Context

These are the foundational skills we expect you to bring to the table. If you've got these locked down, you're in a great position to grow into this role and beyond. We're looking for someone who can hit the ground running on the technical side, not someone who needs to learn Python from scratch.

Qualifications & Credentials

Emerging Foundation Skills

Advancing Technical Skills

Future Skills Closing Note

This isn't just about keeping up; it's about staying ahead. We'll support your learning with resources, time, and opportunities to apply these new skills. Your growth here is our growth.

Education Requirements

Experience Requirements

You'll need roughly 2-5 years of hands-on experience in machine learning engineering or a related technical role. This should include practical experience building, training, and deploying machine learning models, ideally with at least 1-2 years specifically focused on generative AI, large language models, or natural language processing. We're looking for someone who has moved beyond just following tutorials and has actually shipped AI-powered features.

Preferred Certifications

Recommended Activities

Career Progression Pathways

Entry Paths to This Role

Career Progression From This Role

Long Term Vision Potential Roles

Sector Mobility

The skills you'll gain in this role are highly transferable across almost any industry. Generative AI is transforming everything from finance and healthcare to media and manufacturing. You'll be a sought-after expert in a rapidly expanding field, with opportunities in product companies, research labs, or even starting your own venture.

How Zavmo Delivers This Role's Development

DISCOVER Phase: Skills Gap Analysis

Zavmo maps your current competencies against all requirements in this job description through conversational assessment. We evaluate your foundation skills (communication, problem-solving), functional skills (prompt engineering, RAG pipelines), and readiness for career progression.

Output: Personalised skills gap heat map showing strengths and priorities, estimated time to competency, neurodiversity accommodations.

DISCUSS Phase: Personalised Learning Pathway

Based on your DISCOVER results, Zavmo creates a personalised learning plan prioritised by impact: foundation skills first, then functional skills. We adapt to your learning style, pace, and neurodiversity needs (ADHD, dyslexia, autism).

Output: Week-by-week schedule, each module linked to specific job responsibilities, checkpoints and milestones.

DELIVER Phase: Conversational Learning

Learn through conversation, not boring modules. Zavmo uses 10 conversation types (Socratic dialogue, role-play, coaching, case studies) to build competence. Practice difficult QBR presentations, negotiate tough renewals, and handle churn conversations in a safe AI environment before facing real clients.

Example: "For 'Stakeholder Mapping', Zavmo will guide you through analysing a complex enterprise account, identifying key decision-makers, and building an engagement strategy."

DEMONSTRATE Phase: Competency Assessment

Zavmo automatically builds your evidence portfolio as you learn. Every conversation, practice scenario, and application example is captured and mapped to NOS performance criteria. When ready, your portfolio supports OFQUAL qualification claims and demonstrates competence to employers.

Output: Competency matrix, evidence portfolio (downloadable), qualification readiness, career progression score.
