We have some customers who use Quick Fields to scan low-quality (e.g. historical) documents. The OCR engine tends to have a hard time reading the text on these documents. For example, even with lots of image cleanup, the word "invoice" can be read as:
- 1nvoice
- inv0lce
- Inu0.ce
Our typical approach is to use regular expressions to catch these irregularities. The problem is that regular expressions can get us only so far: sometimes there are simply too many possible misreadings to account for.
I was thinking it would be great to be able to apply some sort of "fuzzy" comparison algorithm, similar to the fuzzy search feature in LFFTS. Basically, we could say "if a given text is within 3 character edits of 'invoice', return true". That would match all the words in the bulleted list above (the last one only if case is ignored, since the capital "I" would otherwise count as a fourth edit).
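
To make the idea concrete, here is a minimal sketch of that kind of comparison using Levenshtein (edit) distance, written in Python purely for illustration. This is not an existing Quick Fields or LFFTS feature, and the names `edit_distance` and `fuzzy_match` are made up for this example. It assumes "within 3 characters" means at most 3 single-character insertions, deletions, or substitutions, and that case is ignored.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep on a match)
            ))
        previous = current
    return previous[-1]


def fuzzy_match(text: str, target: str, max_distance: int = 3,
                ignore_case: bool = True) -> bool:
    """Hypothetical helper: True if text is within max_distance edits of target."""
    if ignore_case:
        text, target = text.lower(), target.lower()
    return edit_distance(text, target) <= max_distance


if __name__ == "__main__":
    # The OCR misreadings from the list above, compared against "invoice".
    for candidate in ["1nvoice", "inv0lce", "Inu0.ce"]:
        print(candidate, fuzzy_match(candidate, "invoice"))
```

One thing to watch with a fixed threshold: 3 edits is quite loose for a 7-letter word, so unrelated short words can match too (for example, "voice" is only 2 edits from "invoice"). Scaling the allowed distance to the length of the target word might be a safer default.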