Executive Summary
EMA’s draft Annex 22 — the first comprehensive EU regulatory framework dedicated to artificial intelligence in pharmaceutical manufacturing — was published for stakeholder consultation on July 7, 2025, with the consultation window closing October 7, 2025. Finalization is expected during 2026, with a typical 6-12 month grace period after publication before legal implementation. As of May 2026, sponsors are in a critical preparation window: the framework’s contours are clear from the draft, the consultation feedback is informing finalization, and the operational work required to prepare is real but tractable.
This article translates the current state of Annex 22 into operational implications for pharma sponsors. We cover where the timeline actually stands, what the consultation surfaced, the scope clauses that materially shape the operational impact, how Annex 22 interacts with the EU AI Act and FDA guidance, and the work sponsors should be doing now between May and year-end to be ready when Annex 22 finalizes.
Where the Timeline Actually Stands
The Annex 22 trajectory has moved through several recognizable stages over 2024-2026. EMA’s Inspectors’ Working Group, in collaboration with PIC/S and with FDA and UK MHRA as observers, drafted the annex over multiple iterations. The draft was published for stakeholder consultation by the European Commission on July 7, 2025, alongside companion revisions to Chapter 4 (Documentation) and Annex 11 (Computerised Systems). The consultation, accessible via the European Commission consultation page, closed October 7, 2025.
Following the consultation, EMA’s working group is reviewing the feedback and finalizing the documents. Finalization is expected during 2026. Typical EMA practice provides a 6-12 month grace period between publication of the finalized document and the date legal implementation is enforced, which means enforcement is likely to begin somewhere between late 2026 and the end of 2027, depending on the specific publication date.
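The enforcement date follows mechanically from the publication date plus the grace period. A minimal sketch of that arithmetic, assuming the 6-12 month range described above; the publication dates used here are hypothetical placeholders, since the real date is not yet known:

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months, clamping the day to the month's end."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def implementation_window(publication: date, min_grace: int = 6, max_grace: int = 12):
    """Return (earliest, latest) enforcement dates for a given publication date."""
    return add_months(publication, min_grace), add_months(publication, max_grace)


# Hypothetical publication dates, chosen only to bracket "during 2026".
for publication in (date(2026, 6, 30), date(2026, 12, 31)):
    earliest, latest = implementation_window(publication)
    print(f"published {publication}: enforcement between {earliest} and {latest}")
```

Swapping in the actual publication date once it is announced gives the window to plan against.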
The practical implication for pharma sponsors operating in Europe: the window for proactive preparation runs from now through finalization. Sponsors that wait until finalization to begin operational work will find themselves implementing under time pressure, and as the Rephine analysis of Annex 22 preparation indicates, the grace period after publication is typically too short for organizations starting from scratch to build the required disciplines.
Two additional considerations affect the timeline. First, EMA’s finalization will need to align with the EU AI Act implementation cadence, which is on its own timeline. Second, the relationship between Annex 22 and PIC/S guidance will need to be articulated, since PIC/S guidance applies to manufacturers serving multiple jurisdictions. Both of these introduce some uncertainty into the precise finalization date, but neither changes the operational implication: sponsors should be preparing now.
Consultation Outcomes and What Industry Said
The consultation period produced substantive industry feedback. The published analyses and industry summaries — from PDA, ISPE, EFPIA, and individual major sponsors — signal where the feedback has converged.
Scope of the static/deterministic model restriction. The draft annex restricts critical GMP functions to “static, deterministic” AI models — those that produce consistent outputs for the same inputs — and excludes generative AI and large language models from those critical functions. As described in Continuous Intelligence’s analysis of Annex 22, this scope clause has been a major focus of industry feedback. The industry position has generally been that the restriction is appropriate for direct quality-impacting functions but that the definitional boundary needs clarification, particularly for hybrid systems that combine deterministic and learned components.
Definition of “high impact” AI. The annex’s risk-based approach triggers the strictest requirements for high-impact AI affecting safety, quality, or data integrity. Industry feedback has focused on the operational specifics of the high-impact threshold and on how borderline cases should be classified. The clarity of the threshold materially shapes implementation cost.
Validation expectations for AI used in non-critical functions. The annex applies a lighter validation expectation to AI used in non-critical functions, but the line between critical and non-critical is consequential. Industry has asked for clearer definitions, particularly for use cases that operate adjacent to critical functions (e.g., predictive maintenance for critical equipment).
Human-in-the-loop expectations. The annex articulates expectations for human-in-the-loop oversight that vary by impact tier. Industry feedback has focused on the operational specifics of what evidences adequate human oversight, particularly for systems where human review is structurally challenging (e.g., real-time process control).
Lifecycle management and explainability. The annex’s expectations for lifecycle management and explainability are foundational, but industry has flagged that the explainability expectations need calibration to model class. Some AI techniques are inherently less explainable, and the practical operational consequence of the explainability expectation depends on how the expectation is interpreted.
The consultation feedback is not public in aggregate form, but the contours visible through trade press, industry working group communications, and individual sponsor positions indicate that finalization will likely refine — rather than reverse — the draft’s risk-based structure.
The Scope Clauses That Matter Most
Three scope clauses in the draft materially shape the operational implications.
Static and deterministic restriction for critical functions. AI used in processes directly impacting product quality or patient safety must be static and deterministic. Generative AI and LLMs are excluded from critical functions. This is the single most operationally consequential clause: it limits the AI architectural choices available for the highest-risk pharma applications and shapes vendor selection accordingly. Sponsors planning to deploy LLMs in critical manufacturing functions would need to revisit those deployments under Annex 22.
Risk-based tiering. The risk-based approach triggers the strictest requirements only for high-impact AI. Lower-impact uses receive proportional requirements. This is the structural backbone of the annex and aligns with FDA’s risk-based credibility framework. Sponsors with a tier classification SOP that maps onto the EMA structure are well-positioned; sponsors without one will need to develop one quickly.
Human-in-the-loop oversight. Human oversight is required for high-impact AI, with the operational specifics varying by use case. The clause is broadly consistent with the FDA’s GMLP and GAIP principles, but the operational specifics matter. As the Merit Solutions analysis of Annex 22 explains, the human-in-the-loop expectation is the practical enforcement mechanism for the broader risk-based framework.
The combination of these clauses produces a recognizable operational picture. Sponsors operating in Europe will need to: (1) inventory all AI in manufacturing systems, (2) classify each use by impact tier, (3) confirm that high-impact uses meet the static/deterministic restriction, (4) demonstrate human-in-the-loop oversight where required, and (5) maintain lifecycle management evidence aligned with the annex’s expectations. None of these are conceptually new disciplines for pharma quality; the work is in extending them to AI.
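The five steps above can be sketched as a simple inventory check. This is an illustrative sketch, not the annex’s own schema: the `AIUseCase` fields, the two-level `ImpactTier`, and the rules in `find_gaps` are all assumptions made for illustration, and the real classification criteria belong in the sponsor’s tier classification SOP.

```python
from dataclasses import dataclass
from enum import Enum


class ImpactTier(Enum):
    HIGH = "high"    # affects safety, quality, or data integrity
    LOWER = "lower"  # proportional requirements apply


@dataclass
class AIUseCase:
    name: str
    tier: ImpactTier
    is_static: bool           # assumption: model is frozen after release
    is_deterministic: bool    # same input produces the same output
    is_generative: bool       # generative AI or LLM
    human_oversight: bool     # documented human-in-the-loop review
    lifecycle_evidence: bool  # lifecycle management records exist


def find_gaps(use_case: AIUseCase) -> list[str]:
    """List the open gaps for one inventoried use case (hypothetical rules)."""
    gaps = []
    if use_case.tier is ImpactTier.HIGH:
        if not (use_case.is_static and use_case.is_deterministic):
            gaps.append("not static and deterministic")
        if use_case.is_generative:
            gaps.append("generative AI or LLM in a critical function")
        if not use_case.human_oversight:
            gaps.append("no documented human-in-the-loop oversight")
    # Assumption: lifecycle evidence is checked for every tier.
    if not use_case.lifecycle_evidence:
        gaps.append("no lifecycle management evidence")
    return gaps


# Hypothetical inventory entries, for illustration only.
inventory = [
    AIUseCase("visual inspection classifier", ImpactTier.HIGH,
              is_static=True, is_deterministic=True, is_generative=False,
              human_oversight=True, lifecycle_evidence=True),
    AIUseCase("LLM batch-record drafting aid", ImpactTier.HIGH,
              is_static=False, is_deterministic=False, is_generative=True,
              human_oversight=True, lifecycle_evidence=False),
    AIUseCase("warehouse demand forecast", ImpactTier.LOWER,
              is_static=False, is_deterministic=True, is_generative=False,
              human_oversight=False, lifecycle_evidence=True),
]

for use_case in inventory:
    print(use_case.name, "->", find_gaps(use_case) or "no gaps flagged")
```

Keeping the rules in one function makes the gap register reproducible: rerunning it after a remediation step shows which entries have closed.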
Interaction With the EU AI Act
Annex 22 does not exist in isolation. AI use cases in pharma manufacturing may also fall under the EU AI Act’s high-risk classification, with corresponding documentation, conformity assessment, and post-market monitoring obligations. The two regimes overlap but are not identical, and managing both efficiently requires deliberate framework design.
The EU AI Act’s high-risk classification is aimed at AI used in safety-critical applications, which can include healthcare and pharmaceutical manufacturing uses where AI affects patient safety. Annex 22 applies specifically to AI in GMP manufacturing of medicinal products and active substances. High-impact AI under Annex 22 may therefore also be high-risk under the AI Act, and where both apply, the documentation and conformity assessment requirements differ in important ways.
The practical implication: sponsors operating in Europe need a framework that satisfies both regimes simultaneously. Parallel compliance programs are operationally expensive; an integrated framework is materially more efficient. Sponsors that build their QMS extension with both regimes in mind from the start avoid the more expensive remediation work that parallel compliance produces.
A key area of interaction is post-market monitoring. The EU AI Act requires post-market monitoring for high-risk AI systems; Annex 22 requires lifecycle management and ongoing performance monitoring. The two requirements can be satisfied by a single well-designed monitoring program, but the documentation has to satisfy both. Quality leaders should anticipate this in their monitoring infrastructure design.
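One way to picture a single monitoring program feeding both regimes is a shared record that is written once and tagged for each obligation. The sketch below is hypothetical throughout: the field names, the `acceptance_min` threshold, and the regime labels are illustrative assumptions, not terms taken from either text.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass(frozen=True)
class MonitoringRecord:
    use_case: str
    model_version: str
    metric: str
    observed: float
    acceptance_min: float  # hypothetical acceptance limit set at validation
    review_date: date

    @property
    def within_limit(self) -> bool:
        return self.observed >= self.acceptance_min


def to_evidence(record: MonitoringRecord) -> dict:
    """Flatten one record into a row that can be filed under both regimes."""
    row = asdict(record)
    row["review_date"] = record.review_date.isoformat()
    row["within_limit"] = record.within_limit
    # Hypothetical labels showing which obligations this record supports.
    row["supports"] = [
        "Annex 22 ongoing performance monitoring",
        "EU AI Act post-market monitoring",
    ]
    return row


record = MonitoringRecord(
    use_case="visual inspection classifier",
    model_version="1.4.0",
    metric="recall on challenge set",
    observed=0.962,
    acceptance_min=0.95,
    review_date=date(2026, 11, 30),
)
print(json.dumps(to_evidence(record), indent=2))
```

The design choice is to record the observation once and derive regime-specific views from it, rather than maintaining two sets of monitoring data that can drift apart.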
Interaction With FDA’s Posture
FDA participated in the Annex 22 drafting as an observer, alongside the UK MHRA. The relationship between Annex 22 and FDA’s domestic posture has been deliberately coordinated, but the two frameworks are not identical. The FDA’s January 2025 draft guidance on AI for regulatory decision-making, available through the FDA guidance landing page, takes a credibility-framework approach that complements Annex 22’s risk-based tiering rather than duplicating it.
For multinational sponsors, the operational implication is that a single AI governance framework can be designed to satisfy both FDA and EMA expectations, with marginal localizations for specific jurisdictional differences. As the BioSlice Blog analysis of the EU consultation notes, the coordination between EMA and FDA on AI in regulated manufacturing has been deliberate and ongoing.
Where the two frameworks differ in emphasis: EMA’s Annex 22 is more prescriptive about the static/deterministic restriction for critical functions; FDA’s posture is more flexible on architecture but more rigorous on credibility evidence. Sponsors operating in both jurisdictions need to satisfy the more restrictive of the two for critical functions while maintaining the credibility evidence FDA expects.
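One piece of evidence a sponsor might collect for the static/deterministic restriction is a repeatability probe: feed the same inputs to the model several times and confirm the outputs are identical. The sketch below is a minimal illustration under stated assumptions; `is_repeatable`, `output_fingerprint`, and both toy models are hypothetical, and passing such a probe would support, not replace, the validation evidence either regulator expects.

```python
import hashlib
import itertools
import json
from typing import Any, Callable, Sequence


def output_fingerprint(output: Any) -> str:
    """Hash a JSON-serializable output so repeated runs can be compared exactly."""
    payload = json.dumps(output, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def is_repeatable(model: Callable[[Any], Any], inputs: Sequence[Any], runs: int = 3) -> bool:
    """Return True if every input yields one identical output across all runs."""
    for item in inputs:
        fingerprints = {output_fingerprint(model(item)) for _ in range(runs)}
        if len(fingerprints) != 1:
            return False
    return True


# Toy model 1: a fixed rule, so the same input always gives the same output.
def threshold_model(sample: dict) -> dict:
    return {"reject": sample["impurity_pct"] > 0.5}


# Toy model 2: internal state changes on every call, so outputs drift.
_call_counter = itertools.count()


def drifting_model(sample: dict) -> dict:
    return {"score": sample["impurity_pct"] + next(_call_counter)}


samples = [{"impurity_pct": 0.2}, {"impurity_pct": 0.7}]
print("threshold_model repeatable:", is_repeatable(threshold_model, samples))
print("drifting_model repeatable:", is_repeatable(drifting_model, samples))
```

Hashing the serialized output keeps the comparison exact and leaves a compact artifact that can be stored alongside the validation record.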
What to Do Now: A May-to-Year-End Plan
For sponsors operating in Europe, the May 2026 to year-end window is the right time to operationalize Annex 22 preparation. A workable plan:
| Window | Workstream | Deliverable |
|---|---|---|
| May – June | AI inventory in manufacturing systems | Comprehensive inventory of all AI use in manufacturing, including vendor-embedded features |
| June – July | Tier classification SOP aligned with Annex 22 structure | Published SOP with risk-based classification criteria |
| July – August | Apply classification to inventory | Classified inventory with high-impact use cases identified |
| August – September | Gap assessment for high-impact uses | Documented gaps against Annex 22 requirements with remediation plans |
| September – October | Remediation work for highest-priority gaps | Resolved gaps for the use cases with the largest regulatory exposure |
| October – November | Human-in-the-loop oversight design | Validated oversight workflows for high-impact AI |
| November – December | Lifecycle management and monitoring infrastructure | Operational monitoring for high-impact AI with defined response procedures |
By year-end 2026, a sponsor following this plan would have a published tier classification SOP, an inventory, a gap-and-remediation register for high-impact uses, validated oversight workflows, and operational monitoring infrastructure. This is a defensible posture going into the implementation window.
Operational Implications for Sponsors
The operational implications of Annex 22 are significant but tractable for sponsors that begin work now.
Vendor relationships will need to be revisited. Vendor contracts for AI-enabled manufacturing systems will need to address model architecture (static/deterministic versus dynamic), model version pinning, change notification, and validation cooperation. Sponsors with contracts negotiated before Annex 22 should anticipate amendments or renegotiations.
Some current AI deployments will need to be reconsidered. AI use in critical functions that does not meet the static/deterministic restriction will need to be reconsidered. This is most acute for generative AI and LLM deployments in critical manufacturing contexts — although as discussed in the case-pattern article, most pharma LLM deployments have not been in critical manufacturing.
Documentation burden will increase. Annex 22’s lifecycle management and explainability expectations require documentation that traditional CSV programs do not typically produce. Sponsors will need to extend their documentation discipline rather than build parallel AI-specific documentation.
Cross-functional coordination becomes more important. Manufacturing AI use cases typically involve IT, Engineering, Quality, and Regulatory — and the Annex 22 work requires all four to align on tier classification, validation, and ongoing monitoring. Sponsors with weak cross-functional coordination will find Annex 22 implementation slower than sponsors with strong governance.
Inspector training will lag. EMA inspectors will be applying Annex 22 starting from implementation, but their depth of AI expertise will vary. Sponsors should anticipate that early inspections under Annex 22 may be inconsistent and should prepare documentation that is intelligible to inspectors with varying AI backgrounds.
The broader strategic implication is that Annex 22 will not be the last EU regulatory action on AI in pharma. Sponsors that build AI governance frameworks expecting ongoing regulatory evolution — rather than treating Annex 22 as a one-time compliance event — will be materially better positioned for the post-2027 regulatory environment than sponsors who treat it as a discrete project. The framework being built now should be designed to accommodate evolution, not to solve a single regulatory release.
How the Annex 22 work interacts with Chapter 4 and Annex 11 revisions
A practical implementation point that often gets underweighted: Annex 22 was published in consultation alongside revisions to Chapter 4 (Documentation) and Annex 11 (Computerised Systems). The three documents are designed to work together, and sponsors that focus exclusively on Annex 22 risk missing the operational changes the Chapter 4 and Annex 11 revisions will require. As noted in Epista’s preparation analysis for the GMP revisions, the three documents collectively introduce material changes to documentation expectations, computerized system validation, and AI-specific controls.
The Chapter 4 revisions are expected to extend documentation expectations to include AI-related artifacts: model documentation, training data lineage, validation evidence, and ongoing performance records. The Annex 11 revisions are expected to address computerized systems incorporating AI components, including the interaction between AI lifecycle management and broader CSV discipline. Annex 22 is the dedicated AI annex, but it operates in the context that the Chapter 4 and Annex 11 revisions establish.
Quality leaders preparing for the implementation window should be reading all three documents as a coherent set, not Annex 22 in isolation. Documentation programs designed to satisfy Annex 22 alone will miss the broader expectations that the companion revisions establish, and remediation under inspection pressure later is materially more expensive than preparation now.
Inspector preparation and the early enforcement environment
An underappreciated dimension of the implementation window is how the inspector cadre will be prepared to enforce Annex 22. EMA and national competent authority inspectors have varying degrees of AI expertise; even after Annex 22 finalizes, the operational interpretation of the framework will evolve through actual inspection experience over the first year or two of enforcement. Sponsors anticipating the implementation window should expect inconsistent interpretation across inspections during this learning period, and should design their documentation to be intelligible to inspectors with varying AI backgrounds.
The 21 CFR Part 11-style architecture that pharma quality leaders already know is the right reference for documentation design. Inspectors familiar with computerized system validation will navigate AI documentation built on the same scaffolding more readily than documentation built on AI-specific architecture. The familiar scaffolding is itself a compliance posture: it reduces the cognitive load on inspectors and produces faster, smoother inspections.
The cost of not acting now
A final strategic dimension worth understanding: the cost of waiting until Annex 22 finalizes before beginning operational work. Sponsors that defer preparation face several real costs.
First, the work compresses into the grace period after publication, which is typically too short for the full operational build. Programs that compressed analogous preparation into grace periods for other regulatory actions have consistently delivered a lower-quality compliance posture under time pressure.
Second, vendor contract renegotiations cannot reasonably be compressed. Vendor contracts for AI-enabled manufacturing systems often run on multi-year cycles; sponsors that defer preparation will find themselves needing to renegotiate during the grace period, when leverage is weakest because the deadline is fixed.
Third, organizational learning takes time. Building genuine QA capability for AI requires repeated exposure to the work, not just training. Sponsors that start the operational work now could have up to eighteen months of organizational learning ahead of the implementation window, depending on the publication date; sponsors that defer will be starting that learning under deadline pressure.
Fourth, regulatory engagement opportunities are time-sensitive. Pre-submission meetings, scientific advice procedures, and similar regulator engagements that can clarify operational expectations are most useful when conducted ahead of the implementation deadline. Sponsors that defer preparation typically have fewer opportunities to engage productively with regulators during the implementation window itself.
The cumulative effect is that the cost of waiting is materially higher than the cost of acting now. Quality leaders making the business case for proactive preparation should be explicit about these costs, because the conservative-feeling decision to wait often produces the more expensive outcome.
References & Sources
- Stakeholders’ Consultation on EudraLex Volume 4: Chapter 4, Annex 11 and New Annex 22 — European Commission Public Health. Official consultation page documenting the July 7 to October 7, 2025 consultation window for Annex 22 and companion revisions.
- Multistakeholder workshop on expert contributions to AI guidance development (Annex 22) — European Medicines Agency. Reference for the EMA’s stakeholder engagement process during Annex 22 development.
- Annex 22: EMA’s AI Regulation for Pharma Manufacturing — Continuous Intelligence. Industry analysis of the static/deterministic restriction and other key clauses in the draft Annex 22.
- How to Prepare for Annex 22 — Rephine. Practitioner-grade guidance on the implementation timeline and the operational work sponsors should do during the grace period.
- EU GMP Annex 22: The New AI Regulatory Standard — Merit Solutions. Industry analysis of Annex 22’s human-in-the-loop expectations and broader operational implications.
- EU Consults on New GMP Rules for AI in Pharma Manufacturing — BioSlice Blog. Documentation of the consultation process and the EMA-FDA coordination on AI in regulated manufacturing.







