Question

OCR Font Recommendations

asked on March 3, 2014

I would like to know whether Laserfiche has a recommended font that works best with OmniPage Zone OCR. The client is experiencing incorrect reading of numbers and letters: 4s are being read as 9s and 8s as 0s. Also, the letter "s" reads as 5. Different Zone OCR settings have been tried, to no avail. Any information or recommendation will be appreciated. Your prompt response is appreciated, as the client is becoming very impatient.

 

Answer

APPROVED ANSWER
replied on March 3, 2014

Laserfiche doesn't have any recommended fonts, but of course some will work better than others; think Wingdings vs. Arial, for example.  Generally, if you're using Times New Roman, Arial, Calibri, or other upright clear fonts, you should be fine.  

 

If you're already using a clear font, you should probably check the quality of the scan.  Laserfiche recommends 300 DPI for OCR, and this could easily be the reason the characters aren't being identified correctly.

 

Feel free to respond with any other questions!  Please mark this reply as user approved if it answered your question! 

replied on April 16, 2015

Hi Rob, is there a recommended font size? What is the smallest size that can be OCR'd?

Replies

replied on March 4, 2014

Other fonts you could try include:

 

OCR-A

OCR-B

 

Also, rather than 300dpi, I've had noticeable increases in recognition by using 240dpi, which keeps the file size significantly below that of 300dpi images.

 

Finally, and this may be obvious, avoid unbleached and recycled paper for pages used for document or file separation purposes.

 

-Ben

 

edit: In the above, I meant to say that 240dpi is noticeably better than 200dpi for OCR.

replied on December 6, 2017

I've been having issues with this myself, so I set up a simple test. I typed the commonly misread characters in commonly used fonts into Word, took a screenshot, set it as a sample image, and then ran it through different OCR activities in Quick Fields. The Speed and Balanced settings were terrible in all tests, so all OCR was set to Accuracy.

O 0 o I l L 1 B 8 4 9 s S 5 z Z 2 - Original Text Used for tests
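If you want to repeat this kind of test across many fonts and settings, the comparison can be scripted. Here is a minimal sketch in Python, assuming you have copied the OCR result out as plain text; `score_ocr` and the sample output are made up for illustration and are not part of Quick Fields or a real result from these tests.

```python
# The test string used above, one confusable character per token.
EXPECTED = "O 0 o I l L 1 B 8 4 9 s S 5 z Z 2"


def score_ocr(expected: str, actual: str):
    """Compare space-separated characters position by position.

    Returns (fraction read correctly, list of (expected, read_as) misreads).
    """
    exp_tokens = expected.split()
    act_tokens = actual.split()
    misreads = [(e, a) for e, a in zip(exp_tokens, act_tokens) if e != a]
    # Characters missing from the end of the OCR output count as failures too.
    missing = max(len(exp_tokens) - len(act_tokens), 0)
    correct = len(exp_tokens) - len(misreads) - missing
    return correct / len(exp_tokens), misreads


# Hypothetical OCR output: capital O read as zero, small l read as one.
sample_output = "0 0 o I 1 L 1 B 8 4 9 s S 5 z Z 2"
accuracy, misreads = score_ocr(EXPECTED, sample_output)
print(f"{accuracy:.0%} correct, misreads: {misreads}")
```

Keeping the list of misreads, not just the percentage, makes it easier to see which character pairs a given font or bias setting trips on.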

I tested a number of the most popular fonts, including Tahoma, Arial, Comic Sans, Verdana, Book Antiqua, and Courier, at both 11 point and 14 point. I also tried a more obscure font called Consolas, which has a slash through the zero, to see if that would improve results.

 

If you know you're scanning for numbers or letters you can set the advanced properties for the Zone OCR to favor numbers or letters, so I did Zone OCR with no bias, bias for letters, and bias for numbers.

Nothing read with 100% accuracy. All of the fonts had failures: zero and capital O were the worst at 11 point, and I, l, L, and 1 failed at 14 point. When OCR is mixing up small l and the number 1, or capital O and the number 0, the only fix is to use Zone OCR with a bias in favor of numbers or letters.

For 11 point, the only fonts that performed well consistently were Tahoma and Consolas. Consolas read zeros as small a's very consistently, so you could do a fix in Pattern Match and use that. Tahoma read O's as 0's.
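To show what that kind of cleanup looks like, here is a rough sketch written as a standalone Python function rather than as an actual Pattern Match configuration. It assumes the field is known to contain digits only; the substitution table is built from the misreads mentioned in this thread, and `fix_numeric_field` is a made-up name.

```python
import re

# Letters that OCR tends to substitute for digits. The pairs come from the
# misreads reported in this thread; "a" -> "0" is the Consolas slashed-zero case.
LETTER_TO_DIGIT = str.maketrans({
    "O": "0", "o": "0", "a": "0",
    "I": "1", "l": "1",
    "S": "5", "s": "5",
    "B": "8",
    "Z": "2", "z": "2",
})


def fix_numeric_field(raw: str) -> str:
    """Map look-alike letters to digits in a field known to be numeric."""
    cleaned = raw.strip().translate(LETTER_TO_DIGIT)
    # Refuse to guess if anything other than digits is left over.
    if not re.fullmatch(r"\d+", cleaned):
        raise ValueError(f"unexpected characters in numeric field: {raw!r}")
    return cleaned


print(fix_numeric_field("1a4S"))  # letters replaced by their look-alike digits
```

This only works when you know the field should be all digits, and it cannot catch one digit being read as another (a 4 read as a 9, for example), so it is a complement to a better scan or larger text, not a replacement.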

At 14 point the zeros and capital O's read better, but almost all fonts failed on small L and the number 1.

With OCR there are more improvements with increasing the size of the text rather than changing the font.
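One way to see why size matters is to work out how many pixels the OCR engine actually gets per line of text. A point is 1/72 of an inch, so the nominal height in pixels is point size times DPI divided by 72. A quick sketch (the function name is just for illustration, and the real glyph height is somewhat smaller than the nominal point size):

```python
POINTS_PER_INCH = 72  # typographic points per inch


def text_height_px(point_size: float, dpi: int) -> float:
    """Nominal text height in pixels for a given point size and scan DPI."""
    return point_size * dpi / POINTS_PER_INCH


for pt in (11, 14):
    for dpi in (200, 240, 300):
        print(f"{pt} pt at {dpi} dpi = {text_height_px(pt, dpi):.1f} px")
```

By this arithmetic, 14 point text scanned at 240dpi comes out at about the same pixel height as 11 point text at 300dpi, so bumping the font size and raising the scan resolution are two routes to the same thing.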

Hope that helps!
