Question

QF pattern

Quick Fields

Updated March 2, 2015

asked on March 2, 2015

I am trying to create a pattern that will find any word prior to another word. Specifically, I have a document with the word "description" in it and I want to have any word directly before "description" be filled into metadata. Hope this makes sense. Thank you :)


Replies

replied on March 2, 2015

So if your text is "Document Description" and you want to capture the word "Document", I would use the following pattern.

(\w+) +Description

The \w+ means one or more word characters (letters, digits, or underscores), and the parentheses mean capture only that part of the pattern.
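
To try this pattern outside Quick Fields, here is a minimal sketch using Python's re module as a stand-in. Quick Fields' own pattern-matching engine may differ in small details, and the sample string is just the example from this reply.

```python
import re

# Sample text from the reply above; Python's re is only a stand-in for Quick Fields.
text = "Document Description"

# (\w+) captures one or more word characters; " +" matches one or more literal spaces.
match = re.search(r"(\w+) +Description", text)
captured = match.group(1) if match else None
print(captured)
```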


replied on March 2, 2015

Hi Nicholas! There are a few ways of doing this. If you know there is exactly one space between the word "description" and the word you want, you can use this pattern:

([^\s]+)\sdescription

That is, "one or more of anything except a whitespace character, then a whitespace character, then the word description (capturing the part before the whitespace character)".

Alternatively you can do

(\S+)\sdescription

where \S is the equivalent of [^\s].

Note that these patterns will also capture punctuation and symbols, so they'll match #O'Malley!! in

Hello Mr. #O'Malley!! description
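
A small sketch of that behavior, again using Python's re module as a stand-in for Quick Fields (the engine there may behave slightly differently). The \w+ variant is added here only for comparison and is not part of the reply above.

```python
import re

# Sample line from the reply above.
text = "Hello Mr. #O'Malley!! description"

# \S and [^\s] are equivalent: both capture any run of non-whitespace characters.
non_space = re.search(r"(\S+)\sdescription", text)
negated_class = re.search(r"([^\s]+)\sdescription", text)

# For comparison only: \w+ cannot end on "!!", so this variant finds nothing here.
word_only = re.search(r"(\w+)\sdescription", text)

print(non_space.group(1), negated_class.group(1), word_only)
```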

You can also do something like

(\S+)\s+description

if there could be more than one whitespace character before the word 'description'.
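
To illustrate the difference between \s and \s+, here is a minimal sketch in Python (a stand-in for Quick Fields, which may differ in details). The sample string with three spaces is made up for the illustration.

```python
import re

# Hypothetical sample: three spaces between the word and "description".
text = "Invoice   description"

single = re.search(r"(\S+)\sdescription", text)   # exactly one whitespace character
multi = re.search(r"(\S+)\s+description", text)   # one or more whitespace characters

print(single)          # no match, because two extra spaces are left over
print(multi.group(1))  # the word before the run of spaces
```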


replied on March 2, 2015

Thank you for these responses. They didn't quite work, but only because I left out a bit of information. Zone OCR reads the text as follows:

NAME
NICHOLAS J MARTIN
DESCRIPTION

I am trying to grab the last name separately from everything else for metadata. I'm not sure of the best way to do this. Some employees have a middle initial, some don't, and some have a full middle name, so the inconsistencies make it difficult. I thought grabbing the one word before "description" would work, but "description" is on a separate line. I hope I am making sense. :)


SELECTED ANSWER

replied on March 2, 2015

The reason my regex wouldn't work here is most likely that your word DESCRIPTION is upper-case (assuming you are using a case-sensitive match).

Try

(\S+)\s+DESCRIPTION

(or try making your match case-insensitive). Make sure you use \s+ for the whitespace, as this will also match the newlines that your sample seems to contain.
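
As a rough check against the Zone OCR sample posted above, here is a minimal Python sketch. Python's re module and its IGNORECASE flag are stand-ins here; how case sensitivity is toggled in Quick Fields itself is not shown, and the assumption is that the OCR lines are separated by plain newlines.

```python
import re

# OCR output from the post above, assuming lines are joined by "\n".
ocr_text = "NAME\nNICHOLAS J MARTIN\nDESCRIPTION"

# Case-sensitive, lower-case keyword: fails against the upper-case OCR text.
lower_only = re.search(r"(\S+)\s+description", ocr_text)

# Upper-case keyword; \s+ also spans the newline before DESCRIPTION.
upper = re.search(r"(\S+)\s+DESCRIPTION", ocr_text)

# Alternative: keep the lower-case keyword but ignore case.
ignore_case = re.search(r"(\S+)\s+description", ocr_text, re.IGNORECASE)

last_name = upper.group(1)
print(lower_only, last_name, ignore_case.group(1))
```

Because only the last run of non-whitespace characters before the keyword is captured, a middle initial or full middle name should not affect the result.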


replied on March 2, 2015

Yes Sir! That worked like a charm. Thank you so much.

