Using Optical Character Recognition (OCR) with AXIOM 4.8 & AXIOM Cyber 4.8

With the release of AXIOM and AXIOM Cyber 4.8, investigators can now leverage Optical Character Recognition (OCR) technology to help them recover embedded text in PDFs, scanned documents, and images. Text extraction using OCR is available for users with active Magnet AXIOM Cyber licenses or both Magnet AXIOM Computer and Smartphone licenses.

OCR in AXIOM is currently optimized for extracting text from PDFs, scanned documents, and images of documents.

Optical Character Recognition in AXIOM and AXIOM Cyber

Before using the new OCR functionality in AXIOM or AXIOM Cyber, decide whether to run OCR during the initial scan of your evidence or afterwards from AXIOM Examine. To run it during the initial scan, configure OCR to run automatically during the post-processing actions portion of a search. As seen in the screenshot below from AXIOM Process, a dashboard shows the status of the OCR processing.

NOTE: Investigators needing quick access to evidence for analysis may prefer running OCR from AXIOM Examine after completing an initial review of the case.

After processing the files, you can view the extracted text in the "Text extracted using OCR" preview card, found within the Details panel in AXIOM Examine. You can also search the extracted text using the standard keyword searching capabilities of AXIOM or AXIOM Cyber, and you can include text extracted using OCR as an attachment for artifacts in HTML exports. OCR in AXIOM and AXIOM Cyber is currently optimized to extract text from files with Latin characters.

Two important items are worth noting. First, if a PDF document or picture was recovered through carving, its extracted text will not appear in the "Text extracted using OCR" card in the File System Explorer; view the extracted text for carved evidence from the Artifacts Explorer instead. Second, if you add more evidence to your case, you will need to run OCR again to extract text from the new PDF documents and pictures. AXIOM will only process the new files, so files that already went through OCR are not processed again.
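The "only process the new files" behavior can be pictured with a short Python sketch. This is not AXIOM's implementation and does not call any AXIOM API; it only illustrates the general idea of skipping files that were already handled. The function names, the extension list, and the use of SHA-256 hashes to recognize previously seen files are all assumptions made for the illustration.

```python
import hashlib
from pathlib import Path

# Hypothetical list of file types to send to OCR (an assumption, not AXIOM's list).
OCR_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def select_new_files(evidence_root: Path, processed: set[str]) -> list[Path]:
    """Return OCR candidates not seen before and record them in `processed`.

    Files are identified by content hash, so a file with the same content
    as one already recorded is skipped even if its name differs.
    """
    pending = []
    for path in sorted(evidence_root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in OCR_EXTENSIONS:
            continue
        digest = file_digest(path)
        if digest not in processed:
            processed.add(digest)
            pending.append(path)
    return pending
```

In this sketch, calling select_new_files a second time after adding evidence returns only the newly added PDFs and pictures, because the hashes of earlier files are already in the processed set.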

