Compare two excel worksheets

AmyT

New Member
I would like to compare 2 excel workbooks. Each workbook has a sheet and one column with user id. One workbook is the master file (say file A) and the other is a smaller file (say file B).
I have read each of the workbooks in a collection each . I would like to identify all those users in file B that are not in file A and write it to another excel file.
What is the best way to do this? I am a beginner and would appreciate any help.
Thanks
 

VJR

Well-Known Member
Hi AmyT,

Suppose File A is in ColA (collection) and File B is in ColB.

- You can loop through ColB.
- During the looping, check if each value of ColB is present in ColA. Use the 'Collection Contains Value' action. It returns a True/False Found flag.
If Found is false then copy that value to a new collection ColC.
- After the loop is finished dump the ColC to the desired Excel Workbook. There is an action to do it in one shot.

You can make use of the diagram and the functions shown in the below link.
http://www.rpaforum.net/threads/how-to-get-unique-values-from-a-collection.757/post-1872
 
Last edited:

VJR

Well-Known Member
Hi AmyT,

Just in case if the multiple loopings will slow down the process, is there for any specific purpose you have taken 2 loops since the above link that I suggested has only 1 loop? Or may be you have used it for a different purpose.
 

AmyT

New Member
Apologize for the confusion- I had earlier used 2 loops which is brutally inefficient, I prefer the solution recommended by you and will be updating the process. Thanks for the tip.
 
  • Like
Reactions: VJR
Hi, VJR solution should work fine, i did that once. The problem in my case, was that my files were big, im talking more than 10.000 row each, so the process was taking more time than the available. The solution, at least for me, was the use of a code stage. Basically i pass both files to collections(this collections were the inputs of the code stage) and later go over all of each items using a simple loop and then comparing values, nothing too elaborate or complicated. The time went from 20 mins to just 30 seconds like that.
Hope this can help you too.
Andres.
 

VJR

Well-Known Member
Hi arodriguezm, That's the perfect way to do by making use of the Code stage in scenarios where large data is involved.
Your response will surely be of help to anyone landing on this thread.
 

AmyT

New Member
Hi, VJR solution should work fine, i did that once. The problem in my case, was that my files were big, im talking more than 10.000 row each, so the process was taking more time than the available. The solution, at least for me, was the use of a code stage. Basically i pass both files to collections(this collections were the inputs of the code stage) and later go over all of each items using a simple loop and then comparing values, nothing too elaborate or complicated. The time went from 20 mins to just 30 seconds like that.
Hope this can help you too.
Andres.
Thanks for the tip.
 
Top