Enhancement to Code under "Utility- File Management" VBO to Get Pages and Word count

Rushitv

New Member
Hello All,

I am trying to get the page count and word count for the files in a given folder using the "Get Files" action in the "Utility- File Management" VBO. In Patterns CSV I'm simply using "*.*" to get all files, but the output collection does not contain the "Page" and "Word count" columns.

I checked the Code stage for Get Files and these two parameters are not defined in it.

Can someone please suggest what enhancement the Code stage needs in order to return these columns in the output collection?

Please refer to the image below for more details.


View attachment 1561011989671.png
 

joshwgray

New Member
That object uses the FileInfo class, whose properties expose certain information about a file. Page count and word count are not among those properties. To extend that object to get the word count you could try:
  1. Read the text:
    C#:
    var fileContents = File.ReadAllText(file.FullName);
  2. Use something like Regex to get the word count:
    C#:
    MatchCollection collection = Regex.Matches(fileContents, @"\S+");
    var wordCount = collection.Count;
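  3. To get the page count of a PDF file, you could add this method to the global code of the object and call it in the Get Files action. It counts the /Type /Page markers in the raw file, so it does not apply to Word or text files:
    C#:
    // Needs the System.IO and System.Text.RegularExpressions namespaces.
    public static int getNumberOfPages(string pathDocument){
        using (StreamReader sr = new StreamReader(File.OpenRead(pathDocument)))
        {
            // [^s] excludes the /Type /Pages tree nodes.
            return new Regex(@"/Type\s*/Page[^s]").Matches(sr.ReadToEnd()).Count;
        }
    }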
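The steps above could be collected into one helper, roughly as sketched below. The class name FileStats and its method names are made up for illustration and are not part of the VBO. This assumes plain-text input for the word count and uncompressed PDF content for the page count; a .docx file is a zipped package, so neither method is likely to give a meaningful number for Word documents.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper class (illustrative names, not part of the VBO).
public static class FileStats
{
    // Counts runs of non-whitespace characters in already-loaded text.
    public static int CountWords(string text)
    {
        return Regex.Matches(text, @"\S+").Count;
    }

    // Counts "/Type /Page" markers in raw PDF content.
    // "[^s]" skips the "/Type /Pages" tree nodes.
    // Assumption: page objects are not stored in compressed object streams;
    // if they are, this will under-count.
    public static int CountPdfPageMarkers(string rawPdf)
    {
        return Regex.Matches(rawPdf, @"/Type\s*/Page[^s]").Count;
    }

    // Convenience wrapper for a plain-text file on disk.
    public static int CountWordsInFile(string path)
    {
        return CountWords(File.ReadAllText(path));
    }
}
```

Keeping the counting separate from the file reading makes it possible to check the regular expressions against short strings before pointing them at real files.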
 

Sachin_Kharmale

Active Member
Hi Rushitv,

To add Pages and Word Count columns to the Files collection, replace the code in the Get Files Code stage with the code below:
Dim objTable As DataTable
Dim objRow As DataRow
Dim aFiles As FileInfo()
Dim aPatterns As String()
Dim oDirectory As New DirectoryInfo(Folder)

Try

    'Create a data table to output as a collection
    objTable = GetDataTable( _
        "Path," _
        & "Folder," _
        & "Name," _
        & "Extension," _
        & "Created," _
        & "Last Accessed," _
        & "Last Written," _
        & "Read Only," _
        & "Bytes," _
        & "Pages," _
        & "Word Count", _
        "System.String," _
        & "System.String," _
        & "System.String," _
        & "System.String," _
        & "System.DateTime," _
        & "System.DateTime," _
        & "System.DateTime," _
        & "System.Boolean," _
        & "System.Double," _
        & "System.String," _
        & "System.String")

    'Protect escaped commas (\,) before splitting the pattern list
    Patterns_CSV = Patterns_CSV.Replace("\,", "?")
    aPatterns = Patterns_CSV.Split(",")

    For Each sPattern As String In aPatterns
        sPattern = sPattern.Replace("?", ",")
        aFiles = oDirectory.GetFiles(sPattern.Trim)
        For Each oFile As FileInfo In aFiles
            objRow = objTable.NewRow()
            objRow("Path") = oFile.FullName
            objRow("Folder") = oFile.DirectoryName
            objRow("Name") = oFile.Name
            objRow("Extension") = oFile.Extension
            objRow("Created") = oFile.CreationTimeUtc
            objRow("Last Accessed") = oFile.LastAccessTimeUtc
            objRow("Last Written") = oFile.LastWriteTimeUtc
            objRow("Read Only") = oFile.IsReadOnly
            objRow("Bytes") = oFile.Length
            'Placeholders only: FileInfo has no page or word count
            objRow("Pages") = ""
            objRow("Word Count") = ""
            objTable.Rows.Add(objRow)
        Next
    Next

    'Keep distinct rows only (a file can match more than one pattern)
    objTable = objTable.DefaultView.ToTable(True, _
        "Path", _
        "Folder", _
        "Name", _
        "Extension", _
        "Created", _
        "Last Accessed", _
        "Last Written", _
        "Read Only", _
        "Bytes", _
        "Pages", _
        "Word Count")
    Files = objTable
    Success = True
    Message = ""
Catch e As Exception
    Success = False
    Message = e.Message
End Try


You will then get output like this:
View attachment 1561021563095.png


I hope this helps.
 

Sachin_Kharmale

Active Member
The file object has no method for getting the page count or word count.

Does your input folder contain files of a single type, such as Word or text files?

If so, we need to handle this case another way, because according to your first screenshot all the files are Word files.
 

Sachin_Kharmale

Active Member
In that case, use the code above to get the file names from the source directory. It returns the Files collection. Then:
1. Loop over the output Files collection.
2. Use the MS Word VBO to get the page count and word count for each file, one at a time.
3. Add the page count and word count values to the Files collection.
4. End the file loop.
The final result is the Files collection with the file details plus the page and word counts.
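If you would rather do this loop inside a Code stage than in the process flow, the same logic might look like the C# sketch below. FileCollectionEnricher and the getCounts callback are hypothetical names; the callback stands in for whatever actually returns the page and word counts for a path (for example a call into Word), which is not shown here. The values are written as strings because the Pages and Word Count columns above are declared as System.String.

```csharp
using System;
using System.Data;

// Hypothetical helper (illustrative names, not part of any VBO).
public static class FileCollectionEnricher
{
    // Fills the "Pages" and "Word Count" columns for every row of the
    // Files table. getCounts is an assumed callback returning
    // (pages, words) for a given file path.
    public static void FillCounts(DataTable files, Func<string, Tuple<int, int>> getCounts)
    {
        foreach (DataRow row in files.Rows)
        {
            string path = (string)row["Path"];
            Tuple<int, int> counts = getCounts(path);
            row["Pages"] = counts.Item1.ToString();
            row["Word Count"] = counts.Item2.ToString();
        }
    }
}
```

Passing the counting step in as a callback keeps the loop independent of how the counts are obtained, so it can be tried with fixed values before wiring in Word.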
 