Solved How to use the 'Extract Regex Values' action of Utility - Strings

VJR

Well-Known Member
Staff member
#1
How do I use the 'Extract Regex Values' action?
What are the 'Named Values' input and output parameters? Both are collections.
I do not see a string output when I try a simple regular expression to retrieve only the numbers from an alphanumeric string.
Thanks.
 
#2
Hi @VJR,

Not sure if you’ve sorted this already, but if not hopefully this explains how to use it:

For the Extract Regex Values action you have the following inputs:

Target String (text) - this is the string you wish to perform the Regular Expression on.

Regex Pattern (text) - as the name suggests, the Regular Expression pattern to apply to the Target String.

Named Values (collection) - this is a collection containing 2 text fields called ‘Name’ and ‘Value’. Value should be left blank and the Name is what you want to call the result of your Regex.

So to put this into a scenario, let’s say you have a string of text that consists of a product name and a 4-digit product code. You want to extract just the product code. It is always in a different position and sometimes other single digits can appear in the string, so you’ve decided Regex will be the best way to get what you need. In this example your inputs are:

Target string - text data item with value:

“Product name Abc - 1234 Additional info”

(1234 being the product code we’re looking to extract)

Named Values - this is a collection with 2 text fields called ‘Name’ and ‘Value’. You will need as many rows as the number of outputs you want your Regular Expression to yield. For each row, ‘Value’ will be empty (this is where the result will be stored), and ‘Name’ is the identifier of our regex output.
In this case, we only want one output which is the product code, so we’ll add one row to the collection with: Name = ProdCode and Value = empty.

Regex Pattern - you need to reference the named Value within your Regex. So in this example our text input for the Regex Pattern would be: “(?<ProdCode>\d{4})”

The results are then returned from the action in the Named Values collection.

I.e. in this example the output would be the one row collection with:
Name = ProdCode
Value = 1234
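
For anyone who wants to try the pattern outside Blue Prism, here is a minimal Python sketch of the example above. The helper `extract_named_values` is hypothetical: it only imitates the action's inputs and output and is not Blue Prism's implementation. Note that Python writes a named group as `(?P<name>...)` rather than the `(?<name>...)` form used in the Regex Pattern field.

```python
import re

def extract_named_values(target: str, pattern: str, names: list[str]) -> dict[str, str]:
    """Hypothetical stand-in for the action: fill each name from the first match."""
    match = re.search(pattern, target)
    values = {}
    for name in names:
        # A name with no matching group (or no match at all) stays empty,
        # like a Named Values row whose Value is left blank.
        if match and name in match.re.groupindex:
            values[name] = match.group(name) or ""
        else:
            values[name] = ""
    return values

target = "Product name Abc - 1234 Additional info"
# Python spelling of the thread's (?<ProdCode>\d{4})
result = extract_named_values(target, r"(?P<ProdCode>\d{4})", ["ProdCode"])
print(result)
```

The dictionary plays the role of the one-row Named Values collection: the key is the Name and the value is the extracted text.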

Hopefully that makes sense, let me know if any questions.

Rich
 

VJR

#3
Hi Rich,

No, I hadn't come back to this since the original post. Thank you for your detailed response.

I've tried implementing it per the above instructions, but I am unclear about referencing the Named Value within the Regex Pattern parameter. I've tried different combinations without getting the desired result. What would the Regex Pattern field look like to fetch only the numbers from an input string like "ABCD12345" when the Name column in the collection is called "Only Numbers"?

Also, the 'Named Values' input and output parameters have been set to [Output] and Output respectively, where Output is the name of the resulting collection.
 
#4
I tried it like this and it worked. I'm not a Regex expert at all, but it didn't seem to like the space in 'Only Numbers'.

'\d' looks for a digit, and the '+' after it returns all the consecutive digits.
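
A quick way to see the space problem outside Blue Prism is the Python sketch below. It is only an illustration: Python spells a named group `(?P<name>...)`, and the helper `is_valid_group_name` is a hypothetical name, but the rule that a group name cannot contain a space holds here too.

```python
import re

def is_valid_group_name(name: str) -> bool:
    """Return True if `name` is accepted as a named group by Python's re module."""
    try:
        re.compile(rf"(?P<{name}>\d+)")
        return True
    except re.error:
        # Raised for names that are not valid identifiers, e.g. containing a space.
        return False

print(is_valid_group_name("Only Numbers"))  # name with a space
print(is_valid_group_name("OnlyNumbers"))   # same name without the space

# With the space removed, the digits of the sample input are captured.
match = re.search(r"(?P<OnlyNumbers>\d+)", "ABCD12345")
print(match.group("OnlyNumbers"))
```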

FYI, I used the same collection for input and output; the attached screenshots show the initial values and the current values after it has been run.

Hope this helps.

Rich

[Screenshots: Unknown-2.png, Unknown-1.png, Unknown.png]
 

VJR

#5
Hi Rich,

Thanks. That works perfectly fine.
Another regex that I used in this case was [0-9]+, which also worked after removing the space. So it looks like the space was the culprit.


Cheers.
 
#6
Good to hear, you’re welcome!

Yeah, as I said before, I’m no Regex expert, but as far as I know [0-9] and \d are similar. I’m sure there are some differences, but I’m not sure what they are.
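
One difference that can be shown with a small Python sketch: in Python 3, `\d` on text matches any Unicode decimal digit, while `[0-9]` matches only the ten ASCII digits. As far as I know the .NET regex engine treats `\d` the same way by default, but take that as an assumption and test it in your own environment.

```python
import re

# The digits 1, 2, 3 written in Arabic-Indic script (Unicode decimal digits).
arabic_indic = "\u0661\u0662\u0663"

# \d accepts them, [0-9] does not.
matches_backslash_d = re.fullmatch(r"\d+", arabic_indic) is not None
matches_ascii_class = re.fullmatch(r"[0-9]+", arabic_indic) is not None

# For plain ASCII input the two patterns behave identically.
ascii_same = (re.fullmatch(r"\d+", "12345") is not None) == (
    re.fullmatch(r"[0-9]+", "12345") is not None
)

print(matches_backslash_d, matches_ascii_class, ascii_same)
```

For inputs like "ABCD12345" the two are interchangeable, which is why both worked in this thread.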

Rich
 

VJR

#8
Hi Alagupandi,

I know this action is very tricky to understand, and there is no documentation explaining it.
Below are the screenshots.

You need to set a Named Value output parameter in the collection at design time (4.jpg) and then give the same parameter in the actual Regular Expression (2.jpg).

This serves my purpose for an input string of "ABCD123456", but it does not return multiple values in the output collection if the input is something like "ABCD1234XYZ6789". I have been trying to figure out how it returns multiple matches. If you do find a way, do let me know.

Attn. @Rich : If you have something on multiple outputs please post here. Thanks.
 

Attachments

#10
Hi,
I have the string below and I want to get only the date (5/2/2018). Can you please suggest a regular expression for this?
Wed 5/2/2018, 4:35 PM
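
One possible pattern for this, sketched in Python as a way to test it: one or two digits, a slash, one or two digits, a slash, then four digits. The group name `Date` is just an illustrative choice, and in the Regex Pattern field the group would be written `(?<Date>...)` rather than Python's `(?P<Date>...)`.

```python
import re

target = "Wed 5/2/2018, 4:35 PM"

# Day and month may be one or two digits; the year is assumed to be four digits.
date_pattern = r"(?P<Date>\d{1,2}/\d{1,2}/\d{4})"

match = re.search(date_pattern, target)
date_value = match.group("Date") if match else ""
print(date_value)
```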
 
#11
I get only one match (the first one), but there should be other matches. What if I want to add more than one match to the collection? What do I write in the Regex Pattern?

(?<Invoice>[0-9]{7}-[0-9]{4})
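
For comparison, this is how "first match only" versus "all matches" looks in a general-purpose regex library. It is a Python sketch with made-up sample text, not a description of what the Blue Prism action does internally; inside Blue Prism something like a code stage would presumably be needed to loop over matches, which is an assumption on my part.

```python
import re

# Hypothetical sample text containing two invoice numbers.
sample = "Paid 1234567-0001, pending 7654321-0002"

# Python spelling of (?<Invoice>[0-9]{7}-[0-9]{4})
pattern = re.compile(r"(?P<Invoice>[0-9]{7}-[0-9]{4})")

# search() stops at the first match, like the behaviour described above.
first_only = pattern.search(sample).group("Invoice")

# finditer() walks over every non-overlapping match in the text.
all_matches = [m.group("Invoice") for m in pattern.finditer(sample)]

print(first_only)
print(all_matches)
```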
 
#12
Great, now what do I do if I want to have more than one match?
 
#19
I managed to get multiple values out of one regex, using "ABCD1234XYZ6789" as the input and extracting the numbers. To do that, you have to assume the text is matched from left to right; the regex is as follows:


Code:
(?<myout1>\d+)[A-Z]{3}(?<myout2>\d+)
In this case, I'm specifying one or more digits as the pattern to go into my collection row labelled myout1 (which is 1234). This is followed by A-Z characters (in this case exactly 3 of them). The digits after that (again one or more) are the pattern to store in the myout2 row of the output collection. Attached is my result after running the above regex.
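
The same two-group pattern can be checked quickly in Python. This is only a sketch for testing the regex itself: Python needs `(?P<name>...)` where the pattern above uses `(?<name>...)`, and the dictionary stands in for the output collection.

```python
import re

target = "ABCD1234XYZ6789"

# Two named groups separated by exactly three uppercase letters.
pattern = r"(?P<myout1>\d+)[A-Z]{3}(?P<myout2>\d+)"

match = re.search(pattern, target)
# groupdict() returns every named group with its captured text.
values = match.groupdict() if match else {}
print(values)
```

Because each group needs its own row, this approach works when the number of values and the text between them are known in advance.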
 

Attachments

#20

Thanks for your explanation. But what should I do if I need to find multiple numbers in different formats within a paragraph?
Example: if there are multiple numbers in a paragraph (like: claim #12356321-220, Claim No:8522133-456 and Claim No:8522133-456), what regex should I use to get all the claim numbers in the paragraph? Please advise.
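
As a starting point for testing, here is a Python sketch that handles the two prefixes in the example ("claim #" and "Claim No:") and collects every occurrence. The pattern and the group name `Claim` are assumptions based only on the sample text, and collecting all matches like this is ordinary regex-library behaviour rather than something the action is confirmed to do.

```python
import re

paragraph = "claim #12356321-220, Claim No:8522133-456 and Claim No:8522133-456"

# "claim", optional spaces, then either "#" or "No:", then digits-hyphen-digits.
claim_pattern = re.compile(
    r"claim\s*(?:#|no:)\s*(?P<Claim>\d+-\d+)",
    re.IGNORECASE,  # accept both "claim" and "Claim"
)

# Every occurrence, in the order it appears in the paragraph.
claims = [m.group("Claim") for m in claim_pattern.finditer(paragraph)]

# The same list with repeats dropped, keeping first-seen order.
unique_claims = list(dict.fromkeys(claims))

print(claims)
print(unique_claims)
```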
 