Technical details about AI-assisted checks

Background

When is a PDF document accessible? This question is not easy to answer. The document must pass no fewer than 31 checkpoints with a total of 136 failure conditions. These are listed in detail in the Matterhorn Protocol, the checklist for PDF accessibility.

Of the 136 conditions in the protocol, 41 require human judgment. For example, conventional software cannot recognize whether a paragraph is a heading or part of the body text (i.e., whether an H or P tag would be correct). So far, no PDF testing tool has solved this problem.

PAC 2026 introduces artificial intelligence (AI) features. With these new features, PAC can cover some of the Matterhorn Protocol's checkpoints that previously required human judgment. PAC 2026 thus significantly reduces the manual testing effort and makes it easier for beginners in particular to access professional PDF/UA testing.

Functionality

The AI built into PAC 2026 was trained on many accessible PDFs. From these documents, it learned to recognize areas that are tagged in a semantically correct way.

Semantic areas (structural elements)

The AI-supported checks recognize the following semantic areas of a document:

  • Body text (P)
  • Headings (H)
  • Lists (L)
  • Tables (Table)
  • Images and graphics (Figure)
  • Tables of contents (TOC)
  • Captions (Caption)
  • Footnotes and notes (Note)
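
As an illustration of how recognized semantic areas could be turned into a check, the following minimal sketch compares the tag actually used in a document with the tag expected for the detected area. It is not PAC's implementation: the Region record, the function names, and the sample data are hypothetical, and the detected areas are simply given as input rather than produced by a model. The only facts taken from above are the eight areas and their tags; the treatment of H1 to H6 as headings follows the standard PDF structure types.

```python
from dataclasses import dataclass

# Semantic areas from the list above, mapped to their structure tags.
SEMANTIC_AREA_TAGS = {
    "body text": "P",
    "heading": "H",
    "list": "L",
    "table": "Table",
    "image or graphic": "Figure",
    "table of contents": "TOC",
    "caption": "Caption",
    "footnote or note": "Note",
}


@dataclass
class Region:
    """Hypothetical record: a document region, its actual tag,
    and the semantic area a classifier detected for it."""
    page: int
    actual_tag: str
    detected_area: str


def normalize_tag(tag: str) -> str:
    """Treat the numbered heading tags H1..H6 as the generic heading tag H."""
    if len(tag) == 2 and tag[0] == "H" and tag[1] in "123456":
        return "H"
    return tag


def find_mismatches(regions):
    """Return (region, expected_tag) pairs where the actual tag
    differs from the tag expected for the detected area."""
    mismatches = []
    for region in regions:
        expected = SEMANTIC_AREA_TAGS[region.detected_area]
        if normalize_tag(region.actual_tag) != expected:
            mismatches.append((region, expected))
    return mismatches


# Made-up sample data for illustration only.
regions = [
    Region(page=1, actual_tag="P", detected_area="heading"),
    Region(page=1, actual_tag="P", detected_area="body text"),
    Region(page=2, actual_tag="H2", detected_area="heading"),
    Region(page=3, actual_tag="Figure", detected_area="table"),
]

for region, expected in find_mismatches(regions):
    print(f"Page {region.page}: tagged {region.actual_tag}, "
          f"but looks like {region.detected_area} ({expected})")
```

In this sample, the heading tagged as P on page 1 and the table tagged as Figure on page 3 would be reported, while the correctly tagged paragraph and the H2 heading would pass.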