Getting number from line of text in word doc

jakeg023

New Member
Hey All,

I'm trying to create a decision-making process for a Word document. The idea is to determine whether a candidate for a job posting has the required number of years of experience in their resume. The resumes are standardized, and total years of experience is pretty easy to find. What I would like to do is have the Word VBO read in that line of the Word document and then, if the number of years is over a certain threshold, pass that resume through. Any ideas on how to do this? I know the Word VBO has the Find Text option, but I'm not sure how to implement this correctly.

Thanks All!
 

Rich

Member
Hi @jakeg023

Not sure of the functionality within the MS Word VBO, but if you can extract the line you’re looking to search, you have a couple of options.

If the number is in the same position in the line of text 100% of the time, you could use a simple Mid function in a calculation stage to get the number from its fixed position.

If the number isn’t always in the same position, you might want to consider using a regular expression to extract the number. See this post for details on how to do that:
http://www.rpaforum.net/threads/how-to-use-the-extract-regex-values-action-of-utility-strings.681/

Hope that helps,
Rich
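
For anyone who has not worked with regular expressions before, here is a minimal sketch of the idea in Python rather than Blue Prism. The function name extract_years and the sample lines are made up for illustration; the pattern simply grabs the first whole or decimal number it finds in the line.

```python
import re

def extract_years(line: str):
    """Return the first number found in `line` as a float, or None if there is none.

    Hypothetical helper for illustration; it is not part of any Blue Prism VBO.
    """
    # \d+ matches the whole-number part, (?:\.\d+)? optionally matches a decimal part.
    match = re.search(r"\d+(?:\.\d+)?", line)
    return float(match.group()) if match else None

print(extract_years("Total years of experience: 7.5"))
print(extract_years("No experience listed"))
```

The same pattern should carry over to a regex action in Blue Prism, but the exact inputs that action expects are described in the linked post, not here.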
 

tmaaranen

New Member
And as a third option, if your document length varies, you can search for a piece of text, let's say "Work experience in years:", using InStr to get the position of that text. Add 25 or 26 to that number (the 25 characters in "Work experience in years:", plus 1 if a space follows the colon). Now you have the position of your data and you can Mid it as the previous poster suggested.
 

jakeg023

New Member
Thanks for the replies, everyone. tmaaranen, I think your example seems a bit simpler. Could you expand on what you mean by InStr? I would ideally like to find "number years of experience: xx", select that, then trim it down to only the years so I could bring in a decision stage based on those numbers. Thanks in advance!
 
Mid("Example",3,3) will result in "amp". So in your case, let's say that "number years of experience: xx" is stored in Variable1. Then your Mid function would look like Mid(Variable1, 29, 2), since "number years of experience: " (including the trailing space) is 28 characters long and the digits therefore start at position 29.
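
To double-check the position arithmetic, here is a small Python sketch that emulates a 1-based Mid. The helper mid and the sample value "12" are assumptions made for illustration; Python slicing is 0-based, so the helper shifts the start by one.

```python
def mid(text: str, start: int, length: int) -> str:
    """Emulate a 1-based Mid(text, start, length) using 0-based Python slicing."""
    return text[start - 1 : start - 1 + length]

label = "number years of experience: "  # includes the trailing space
line = label + "12"                     # "12" is a made-up sample value

print(mid("Example", 3, 3))         # amp
print(len(label))                   # 28
print(mid(line, len(label) + 1, 2)) # 12
```

Starting at len(label) + 1 instead of a hard-coded number keeps the offset right if the label text ever changes.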
 

jakeg023

New Member
Yeah, I’m pretty familiar with parsing an already selected string of text; I’m just trying to figure out how to pull out this string of text. The number of years of experience will change depending on the resume, so I can’t just “find text” for a specific number of years. I know regex is probably the easiest way, but I’ve never worked with it before, so I’m trying to figure out how to do it without that.
 

VJR

Well-Known Member
Hi jakeg023,

You will need to tweak the process to fit your requirements, but below is the basic idea of what you can do.

Refer to the attachments:
1. Process Diagram

2. Input Word File: This is what the input Word file looks like.
Note that there is a space after each colon ":".
Your case may be different.

3. Find Text Input params: Find the desired text
3. Find Text Output params: Return a flag whether text is found or not.

The Decision stage checks whether this flag is True or not.
You can take the required action on the resume if the flag is False.
Continue only if the flag is True.

4. Select All: Select the entire text in the document

5. Get Clipboard: Get the text from the Clipboard that was copied using the 'Copy to Clipboard' action.

6. Calcs stage - Get years of experience: Find the position of the text using InStr and use Mid to fetch the years of experience.
"Total years of experience:" -> This string has 26 characters, and note that there is a space after the colon.
InStr returns the starting position of the first letter of this string, which is "T". So if the starting position is 1, you will need to add 27 characters [26 (above string length) + 1 (space)] to reach the numbers, which then start at position 28.
From that position we fetch 5 characters to accommodate 2 digits before and 2 digits after the decimal point, ??.?? (5 characters in total).
You can adjust this as per your requirement. You can also make use of the Trim function.

7. Clipboard after running process: This is how the Clipboard looks when the data from the file is copied onto it.

8. Total Years after running process: The number of years returned in the 'Total Years' data item after running the process.
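
The logic of steps 3 to 6 can be sketched outside Blue Prism as follows. This is a Python illustration under stated assumptions, not the actual process: the helpers instr, mid, total_years and passes, the sample clipboard text, and the threshold are all made up. Unlike the calc stage described above, the sketch keeps only the first whitespace-separated token of the 5 characters, so a shorter number followed by a newline still parses.

```python
LABEL = "Total years of experience:"  # 26 characters, followed by one space in the file

def instr(text: str, search: str) -> int:
    """1-based position of `search` in `text`, or 0 if it is absent (InStr-style)."""
    return text.find(search) + 1

def mid(text: str, start: int, length: int) -> str:
    """Emulate a 1-based Mid(text, start, length)."""
    return text[start - 1 : start - 1 + length]

def total_years(clipboard: str):
    """Return the years of experience as a float, or None if it cannot be read."""
    pos = instr(clipboard, LABEL)
    if pos == 0:
        return None  # corresponds to the Find Text flag being False
    # Skip the label plus the space after the colon, then read up to 5 characters.
    raw = mid(clipboard, pos + len(LABEL) + 1, 5)
    tokens = raw.split()  # also drops newline characters, like a Trim would
    if not tokens:
        return None
    try:
        return float(tokens[0])
    except ValueError:
        return None

def passes(clipboard: str, minimum: float) -> bool:
    """Decision: pass the resume through only if enough experience is found."""
    years = total_years(clipboard)
    return years is not None and years >= minimum

sample = "Name: John Smith\r\nTotal years of experience: 12.50\r\nSkills: VBA"
print(total_years(sample))
print(passes(sample, 10))
```

Returning None for a missing label or an unreadable number lets the decision treat both cases the same way as a resume that does not meet the threshold.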
 

Attachments

  • 1. Process Diagram.JPG
  • 2. Input Word File.JPG
  • 3. Find Text Input params.JPG
  • 3. Find Text Output params.JPG
  • 4. Select All.JPG
  • 5. Get Clipboard.JPG
  • 6. Calcs stage - Get years of experience.JPG
  • 7. Clipboard after running process.JPG
  • 8. Total Years after running process.JPG

jakeg023

New Member
VJR, this worked perfectly! Thanks so much. Was also able to modify and extract names, former employment dates, etc. I really appreciate the detailed response.
 

jakeg023

New Member
VJR, one more question. Is there any way to make the calc stage more variable? I'm thinking of the employee name. I would like it to be able to recognize John Smith as well as a longer name, without picking up any text after it.
 

VJR

Well-Known Member
This is tricky and will depend on several factors, such as what comes after the name: a newline character, a tab, another line of text, or something else.

Screenshot #1. Suppose the Word file contains a long name like this.

2. The Calc stage will need to contain something like:
Mid([Clipboard], InStr([Clipboard], "Name:") + 6, InStr([Clipboard], "Address:") - InStr([Clipboard], "Name:") - 6)

- Since "Name:" is 5 characters long, the Mid function above starts at an offset of 5 + 1 (space) = 6.
- Assuming there is an "Address:" heading after the employee name, we find its position using InStr([Clipboard], "Address:").
- Mid then reads everything between the start of the name and the "Address:" heading.
- Keep the two offsets (the + 6 and the - 6) the same.

3. You can see that after running the process, the long name is retrieved.

But you can also see that the text cursor is on the next line, which implies that there is a newline character after the name and before the address in the Word file, and that it is also retrieved in the final Name data item.
If you want to ignore it, there is nothing more to do.
If you want to get rid of it, you have a few options:
- Find out what kind of newline character it is in the Word file: a carriage return (Cr), a line feed (Lf), or both (CrLf).
- Either find out whether the BP Replace function can replace that newline character with a blank,
OR subtract that many characters from the length passed to the Mid function above. For example, if there is a Cr and a Lf, then using - 8 instead of - 6 in the length, i.e. InStr([Clipboard], "Address:") - InStr([Clipboard], "Name:") - 8, should drop the last two characters and remove the CrLf, giving you just the employee name.

Refer to the two screenshots below:

With InStr([Clipboard], "Name:") - 8

4. For removing newline.JPG

Cursor moved up to the end of the name:

5. After removing the Newline.JPG
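
The "read everything between two headings" idea can also be sketched in Python. This is only an illustration: the helper text_between and the sample clipboard text are made up, and it trims surrounding whitespace (including Cr and Lf) instead of subtracting a fixed number of characters, so it does not depend on which newline characters follow the name.

```python
def text_between(text: str, start_label: str, end_label: str):
    """Return the trimmed text between two labels, or None if either label is missing."""
    start = text.find(start_label)
    if start == -1:
        return None
    start += len(start_label)          # move past the first label
    end = text.find(end_label, start)  # look for the second label after it
    if end == -1:
        return None
    # strip() removes the space after the colon and any trailing Cr/Lf characters.
    return text[start:end].strip()

clipboard = "Name: Johnathan Alexander Smith\r\nAddress: 1 Main Street"
print(repr(text_between(clipboard, "Name:", "Address:")))
```

This still relies on a known heading following the name, which is the same assumption the Mid expression above makes about "Address:".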
 

Attachments

  • 1. Word file.JPG
  • 2. Calc stage - emp name.JPG
  • 3. Emp Name Data item.JPG

PakamSuman

New Member
Hi,

Please help me: how do I get the text from a Word document in Blue Prism?

Thanks,
Pakam Suman
 

Attachments

  • Invoice Sample.jpg
Hi, I would like to know whether you resolved the problem. Thank you.
 