Untagged PDF Documents
Most PDF documents are either a graphical representation of the original document (unstructured) or a combination of images and electronic text (structured). PDF documents created by scanning paper documents are unstructured (images) and cannot be searched or read by screen readers. PDF documents created from authoring tools using Adobe Distiller, through a "print" option, or using server-side and other desktop tools are structured documents. While these documents are searchable, they are not functionally usable by screen readers. For a document to be accessible, it must be converted into a tagged format.
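As a rough illustration of the three categories described above (unstructured, structured, tagged), the following sketch inspects the raw bytes of a PDF file for a few well-known dictionary keys. It is a heuristic under stated assumptions, not a PDF parser: the function name `classify_pdf` and the category labels are hypothetical, and the check can miss these keys when they are stored inside compressed object streams, which many newer PDF files use. Acrobat's own accessibility check remains the reliable test.

```python
import re

# Byte patterns for PDF dictionary keys. These are illustrative assumptions:
# a tagged PDF normally has a /StructTreeRoot entry and "/Marked true" in its
# catalog, and text content normally requires at least one /Font resource.
_STRUCT_TREE = re.compile(rb"/StructTreeRoot\b")
_MARKED_TRUE = re.compile(rb"/Marked\s+true\b")
_FONT = re.compile(rb"/Font\b")
_IMAGE = re.compile(rb"/Subtype\s*/Image\b")


def classify_pdf(data: bytes) -> str:
    """Guess the category of a PDF from its raw bytes (heuristic only)."""
    if not data.startswith(b"%PDF-"):
        raise ValueError("not a PDF file")
    # Structure tree plus the Marked flag suggests a tagged document.
    if _STRUCT_TREE.search(data) and _MARKED_TRUE.search(data):
        return "tagged"
    # Fonts without tags suggest searchable text that is not yet tagged.
    if _FONT.search(data):
        return "structured (untagged)"
    # Images without any fonts suggest a scanned, image-only document.
    if _IMAGE.search(data):
        return "unstructured (image-only)"
    return "unknown"
```

To try it on a file, read the file in binary mode and pass the bytes, for example `classify_pdf(open("report.pdf", "rb").read())`, where `report.pdf` is a placeholder name. A result of "unknown" or "structured (untagged)" does not prove the document lacks tags; it only means the keys were not visible in uncompressed form.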
Adding Tags to Structured Documents
- Open the PDF document in Adobe Acrobat Professional
- Tag the document by selecting Advanced > Accessibility > Add Tags to Document from the menu bar. The document will now be tagged
- Check for Accessibility by selecting Advanced > Accessibility > Full Check from the menu bar
- Make any required changes and edits to the tag tree
- Re-check for Accessibility
Converting an Unstructured (Image) Document into a Tagged Document
Print documents are often scanned as images (TIFF, GIF, etc.) and converted into PDF documents for publishing on the web. Before a document can be converted into an accessible tagged document, the scanned image must be converted into a searchable electronic format. Optical Character Recognition (OCR) software is used to convert the image to electronic text. The clarity and resolution of the image affect the accuracy of the recognition, and the document will need to be edited to correct recognition errors. This can be done in either of two ways:
Option 1: Process the document within Adobe Acrobat
- Open the image-only PDF document
- Use the OCR feature built into the application
- In Adobe Acrobat 7.0, select Document > Recognize Text Using OCR > Start.
- From the Recognize Text dialog box, select the page range and then select Edit
- From the Recognize Text Settings dialog box, select Formatted Text & Graphics
- Select OK
- Follow the steps under "Adding Tags to Structured Documents" to add tags to the document.
Option 2: Process the document using commercial OCR software and MS Word
- Use any OCR software to scan and process the document
- Edit and correct the document within the OCR tool and save it as an MS Word document
- Open the document in MS Word
- Edit and format the document for readability and visual presentation, adding alternative text to images
- Convert the document to a tagged PDF format
- Open the document in Adobe Acrobat and check for Accessibility