Reading text files with specific prefix - vba

I have a folder with lots of text files, each containing the following lines (in random order):
A = ...
B = ...
C = ...
Now I would like to import these text files into an Excel spreadsheet,
where each prefix gets its own column and each file gets its own row.
Example with 2 files:
File 1:
A = 1
B = 2
C = 3
File 2:
A = 4
B = 5
C = 6
I would like the Excel sheet to look like this:
NR / A / B / C
1 / 1 / 2 / 3
2 / 4 / 5 / 6
I am still learning VB, and this is a bit beyond me.
I have found a macro like this:
Sub Read_Text_Files()
Dim sPath As String, sLine As String
Dim oPath As Object, oFile As Object, oFSO As Object
Dim r As Long
'Files location
sPath = "C:\Test\"
r = 1
Set oFSO = CreateObject( _
"Scripting.FileSystemObject")
Set oPath = oFSO.GetFolder(sPath)
Application.ScreenUpdating = False
For Each oFile In oPath.Files
If LCase(Right(oFile.Name, 4)) = ".txt" Then
Open oFile For Input As #1 ' Open file for input.
Do While Not EOF(1) ' Loop until end of file.
Input #1, sLine ' Read data
If Left(sLine, 1) = "A=" Then 'Now i need to write this to the first column of that row
If Left(sLine, 1) = "B=" Then 'For the second column.
Range("A" & r).Formula = sLine ' Write data line
r = r + 1
Loop
Close #1 ' Close file.
End If
Next oFile
Application.ScreenUpdating = True
End Sub

Do you know how to open files in VBA for reading using syntax like Open and Line Input?
If not, read this: https://stackoverflow.com/a/11528932/2832561
I found this by googling for "VBA open file read"
Do you know how to work with and parse strings (and arrays) using functions like Mid, Left, Right, Split and Join?
If not, try reading this: http://www.exceltrick.com/formulas_macros/vba-split-function/
I found this by googling for "VBA String functions parse text"
Do you know how to work with Workbook and Worksheet objects and assign values to Range objects in Excel?
If not, try reading this: http://www.anthony-vba.kefra.com/vba/vbabasic2.htm
I found this by googling for "Workbook Worksheet Range VBA"
Once you have had a chance to try putting together a solution using these pieces, you can post specific questions on any issues you run into.
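Putting those three pieces together, a minimal sketch might look like the following. It assumes every line has the form `A = 1`, that the keys A, B and C go to columns B, C and D of the active sheet, and that the files use Windows line endings. `ParseKeyValue` and `ImportPrefixedFiles` are hypothetical names; `C:\Test\` is the folder from the question.

```vba
Option Explicit

' Hypothetical helper: splits "A = 1" into key "A" and value "1".
' Returns False when the line contains no "=".
Function ParseKeyValue(ByVal sLine As String, ByRef sKey As String, ByRef sValue As String) As Boolean
    Dim pos As Long
    pos = InStr(sLine, "=")
    If pos = 0 Then Exit Function
    sKey = UCase$(Trim$(Left$(sLine, pos - 1)))
    sValue = Trim$(Mid$(sLine, pos + 1))
    ParseKeyValue = True
End Function

' Hypothetical driver: one row per .txt file, columns NR / A / B / C.
Sub ImportPrefixedFiles()
    Const sPath As String = "C:\Test\"   ' folder from the question
    Dim sFile As String, sLine As String, sKey As String, sValue As String
    Dim fNum As Integer, r As Long, col As Long

    Range("A1:D1").Value = Array("NR", "A", "B", "C")
    r = 2
    sFile = Dir(sPath & "*.txt")
    Do While Len(sFile) > 0
        Cells(r, 1).Value = r - 1          ' running file number
        fNum = FreeFile
        Open sPath & sFile For Input As #fNum
        Do While Not EOF(fNum)
            Line Input #fNum, sLine        ' read one whole line
            If ParseKeyValue(sLine, sKey, sValue) Then
                ' InStr maps "A" -> 1, "B" -> 2, "C" -> 3; 0 means unknown key
                col = InStr("ABC", sKey)
                If Len(sKey) = 1 And col > 0 Then Cells(r, col + 1).Value = sValue
            End If
        Loop
        Close #fNum
        r = r + 1
        sFile = Dir()
    Loop
End Sub
```

Because the column is looked up from the key, the order of the lines inside each file does not matter.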

Related

Inconsistent page count of a PDF document

I'm trying to get the number of pages in the PDF document. Some of my PDFs are created in Word (saved as PDF), some of them are Xeroxed into the directory (not sure if this matters).
After hours of research I've found that this is easier said than done. The page count rarely comes back correct, even though most PDFs do in fact have /Count inside the binary content.
For example I've used the following code; it is supposed to open the document in Binary Mode, look for /Count or /N and get the number next to it which is supposed to give me the page count.
Public Sub pagecount(sfilename As String)
On Error GoTo a
Dim nFileNum As Integer
Dim s As String
Dim c As Integer
Dim pos, pos1 As Integer
pos = 0
pos1 = 0
c = 0
' Get an available file number from the system
nFileNum = FreeFile
'OPEN the PDF file in Binary mode
Open sfilename For Binary Lock Read Write As #nFileNum
' Get the data from the file
Do Until EOF(nFileNum)
Input #1, s
c = c + 1
If c <= 10 Then
pos = InStr(s, "/N")
End If
pos1 = InStr(s, "/count")
If pos > 0 Or pos1 > 0 Then
Close #nFileNum
s = Trim(Mid(s, pos, 10))
s = Replace(s, "/N", "")
s = Replace(s, "/count", "")
s = Replace(s, " ", "")
s = Replace(s, "/", "")
For i = 65 To 125
s = Replace(s, Chr(i), "")
Next
pages = Val(Trim(s))
If pages < 0 Then
pages = 1
End If
Close #nFileNum
Exit Sub
End If
'imp only 1000 lines searches
If c >= 1000 Then
GoTo a
End If
Loop
Close #nFileNum
Exit Sub
a:
Close #nFileNum
pages = 1
Exit Sub
End Sub
However, most of the time, it defaults to pages = 1 (under a:).
I've also updated this to 10000 to be sure that it hits the /Count line, yet it still does not give me the correct count.
If c >= 10000 Then
GoTo a
End If
I also came across this reddit
Is there another way to do this, something I can utilize in my app?
Any help is greatly appreciated.
Background:
This is for a legacy vb6 app where I'm attempting to let the user manipulate the PDF files. I added a ListBox that displays all PDF documents in a particular directory. When user double clicks on any one of the files, i display it in a WebBrowser component inside my application.
EDIT: [Image: the /Count line, viewed in binary mode, for 3 different documents]
I double checked the page count, and /Count displays the correct page count for each of the three documents.
Regular expressions have limits, but I prefer to use them for searching for strings and I think this would be a good place to use one. You may want to play with the pattern because I did this relatively quickly with only a little testing.
Add a reference to Microsoft VBScript Regular Expressions 5.5 to your project. Then you can try the sample code below.
Private Sub Command1_Click()
Dim oRegEx As RegExp
Dim fHndl As Integer
Dim sContents As String
Dim oMatches As MatchCollection
On Error GoTo ErrCommand1_Click
'Open and read in the file
fHndl = FreeFile
Open some pdf file For Binary Access Read As fHndl
sContents = String(LOF(fHndl), vbNull)
Get #fHndl, 1, sContents
Close #fHndl 'We have the file contents so close it
fHndl = 0
'Instantiate and configure the RegEx
Set oRegEx = New RegExp
oRegEx.Global = True
oRegEx.Pattern = "((?:/Count )(\d+))"
Set oMatches = oRegEx.Execute(sContents)
'Look for a match
If oMatches.Count > 0 Then
If oMatches(0).SubMatches.Count > 1 Then
'SubMatches(0) is the outer group ("/Count n"); SubMatches(1) holds only the digits
MsgBox CStr(oMatches(0).SubMatches(1)) & " Pages"
End If
End If
Exit Sub
ErrCommand1_Click:
Debug.Print "Error: " & CStr(Err.Number) & ", " & Err.Description
If Not oRegEx Is Nothing Then Set oRegEx = Nothing
If Not oMatches Is Nothing Then Set oMatches = Nothing
End Sub
An explanation of the RegEx pattern:
() creates a group; the outer pair captures the whole match
?: inside the parentheses makes the group non-capturing
/Count followed by a space is a literal string
\d+ is a greedy quantifier that matches digits 1 or more times; the parentheses around it capture the number
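A PDF can contain more than one /Count entry, because every node in the page tree carries its own count and the root node holds the total. As a hedged sketch, the matching step could take the largest value rather than the first one. `MaxPageCount` is a hypothetical name, and the late-bound `VBScript.RegExp` object is used so the snippet needs no project reference. Other dictionaries (outlines, for example) can also use /Count, and a page tree stored in a compressed object stream will not be visible at all, so treat the result as an estimate.

```vba
' Hypothetical helper: returns the largest number that follows "/Count"
' in the raw file contents, or 0 when nothing matches.
Function MaxPageCount(ByVal sContents As String) As Long
    Dim oRegEx As Object, oMatch As Object
    Dim n As Long, best As Long

    Set oRegEx = CreateObject("VBScript.RegExp")
    oRegEx.Global = True
    oRegEx.Pattern = "/Count\s+(\d+)"

    For Each oMatch In oRegEx.Execute(sContents)
        n = CLng(oMatch.SubMatches(0))   ' the captured digits
        If n > best Then best = n
    Next

    MaxPageCount = best
End Function
```

It would be called with the string read from the file, e.g. `MsgBox MaxPageCount(sContents) & " Pages"`.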

Writing Fixed width text files from excel vba

This is the output of a program.
I have specified what shall be width of each cell in the program and my program shows correct output.
What I want is for the cell content to be right-aligned within its field. E.g. the highlighted figure 9983.54 has a field width of 21. The text file currently uses the first 7 columns of that field, but I want it to use the last 7 columns.
Please see expected output image.
I am not getting any clue how to do this. I am not a very professional programmer but I love coding. This text file is used as input to some other program and i am trying to automate writing text file from excel VBA.
Can anyone suggest a way to get this output format?
Here is the code which gave me first output
Option Explicit
Sub CreateFixedWidthFile(strFile As String, ws As Worksheet, s() As Integer)
Dim i As Long, j As Long
Dim strLine As String, strCell As String
'get a freefile
Dim fNum As Long
fNum = FreeFile
'open the textfile
Open strFile For Output As fNum
'loop from first to last row
'use 2 rather than 1 to ignore header row
For i = 1 To ws.Range("a65536").End(xlUp).Row
'new line
strLine = ""
'loop through each field
For j = 0 To UBound(s)
'make sure we only take chars up to length of field (may want to output some sort of error if it is longer than field)
strCell = Left$(ws.Cells(i, j + 1).Value, s(j))
'add on string of spaces with length equal to the difference in length between field length and value length
strLine = strLine & strCell & String$(s(j) - Len(strCell), Chr$(32))
Next j
'write the line to the file
Print #fNum, strLine
Next i
'close the file
Close #fNum
End Sub
'for example the code could be called using:
Sub CreateFile()
Dim sPath As String
sPath = Application.GetSaveAsFilename("", "Text Files,*.txt")
If LCase$(sPath) = "false" Then Exit Sub
'specify the widths of our fields
'the number of columns is the number specified in the line below +1
Dim s(6) As Integer
'starting at 0 specify the width of each column
s(0) = 21
s(1) = 9
s(2) = 15
s(3) = 11
s(4) = 12
s(5) = 10
s(6) = 186
'for example to use 3 columns with field of length 5, 10 and 15 you would use:
'dim s(2) as Integer
's(0)=5
's(1)=10
's(2)=15
'write to file the data from the activesheet
CreateFixedWidthFile sPath, ActiveSheet, s
End Sub
Something like this should work:
x = 9983.54
a = Space(21-Len(CStr(x))) & CStr(x)
Then a will be 14 spaces followed by x:
a = "              9983.54"
Here 21 is the desired column width --- change as necessary. CStr may be unnecessary for non-numeric x.
If you're going to right-justify a lot of different data to different width fields you could write a general purpose function:
Function RightJust(val As String, width As Integer) As String
RightJust = Space(width - Len(val)) & val
End Function
Then you call it with RightJust(CStr(9983.54), 21).
Also note that VBA's Print # statement accepts Spc(n), which you can use to produce fixed-width output, e.g. Print #fNum, Spc(n); a. Before this statement, calculate n as n = 21 - Len(CStr(a)).
Hope that helps
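One case the one-liner does not cover is a value longer than its field: Space is then asked for a negative length and raises a run-time error. A hedged sketch that truncates instead, mirroring the Left$ truncation in the question's code (`PadLeft` is a hypothetical name):

```vba
' Hypothetical helper: right-justifies text in a field of fieldWidth characters.
' Values longer than the field are cut to fieldWidth characters, as the
' question's code does with Left$.
Function PadLeft(ByVal text As String, ByVal fieldWidth As Long) As String
    If Len(text) >= fieldWidth Then
        PadLeft = Left$(text, fieldWidth)
    Else
        PadLeft = Space$(fieldWidth - Len(text)) & text
    End If
End Function
```

In the question's loop, the line that builds strLine could then become `strLine = strLine & PadLeft(CStr(ws.Cells(i, j + 1).Value), s(j))`.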

Excel VBA user defined function to find images in folder (match excel names to folder names of images)

Currently I am using a function to match image names from an Excel sheet against an image folder, but I want one more thing: if I save an image and forget to add its name in Excel, it should tell me that I forgot to add the name.
For example, if I save 3 images in the image folder
16095_1.jpg,16095_2.jpg,16095_3.jpg
and I add the image names in the Excel sheet as
16095_1.jpg,16095_2.jpg
then it should warn me that I forgot one image name in the Excel cell.
My image name format is 16095_1.jpg,16095_2.jpg
The function I am using is:
Function findimage(Path As String, ImageList As String)
Dim results
Dim x As Long
Dim dc 'double comma
results = Split(ImageList, ",")
If Not Right(Path, 1) = "\" Then Path = Path & "\"
For x = 0 To UBound(results)
results(x) = Len(Dir(Path & results(x))) > 0
Next
dc = InStr(ImageList, ",,")
If dc = 0 Then
findimage = Join(results, ",")
Else
findimage = ("Double_comma")
End If
End Function
This function takes a folder path and a variable number of patterns (a ParamArray). It uses the Dir function to iterate over the file names in the folder and compares each name against the patterns with the Like operator, counting the files that match at least one pattern.
Usage:
getFileCount("C:\Users\Owner\Pictures","*.gif","*.png")
getFileCount("C:\Users\Owner\Pictures","*.gif")
getFileCount("C:\Users\Owner\Pictures","apple_*.gif","banana_*.gif", "orange_##.*")
getFileCount("C:\Users\Owner\Pictures","#####_#.gif")
Function getFileCount(DirPath As String, ParamArray Patterns() As Variant) As Integer
Dim MyFile As String
Dim count As Integer, x As Long
If Not Right(DirPath, 1) = "\" Then DirPath = DirPath & "\"
MyFile = Dir(DirPath, vbDirectory)
Do While MyFile <> ""
For x = 0 To UBound(Patterns)
If MyFile Like Patterns(x) Then
count = count + 1
Exit For
End If
Next
MyFile = Dir()
Loop
getFileCount = count
End Function
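Counting matches does not by itself say which image was left out of the sheet. A minimal sketch of that check follows, assuming the cell holds a comma-separated list as in the question and that file names contain no spaces. `MissingFromList` and `findMissingImages` are hypothetical names, and the optional pattern parameter is an addition for narrowing the search to one product.

```vba
' Hypothetical helper: True when fileName is not one of the comma-separated names.
Function MissingFromList(ByVal fileName As String, ByVal ImageList As String) As Boolean
    Dim normalized As String
    ' Wrap both sides in commas so "1.jpg" does not match inside "16095_1.jpg".
    ' Assumes file names contain no spaces, so stray spaces can be removed.
    normalized = "," & Replace(ImageList, " ", "") & ","
    MissingFromList = (InStr(1, normalized, "," & fileName & ",", vbTextCompare) = 0)
End Function

' Hypothetical UDF: returns the files in Path matching FilePattern
' that ImageList does not mention, as a comma-separated string.
Function findMissingImages(ByVal Path As String, ByVal ImageList As String, _
                           Optional ByVal FilePattern As String = "*.jpg") As String
    Dim f As String, missing As String
    If Right$(Path, 1) <> "\" Then Path = Path & "\"
    f = Dir(Path & FilePattern)
    Do While Len(f) > 0
        If MissingFromList(f, ImageList) Then
            If Len(missing) > 0 Then missing = missing & ","
            missing = missing & f
        End If
        f = Dir()
    Loop
    findMissingImages = missing
End Function
```

With the example from the question, `findMissingImages("C:\Images", "16095_1.jpg,16095_2.jpg", "16095_*.jpg")` would be expected to report 16095_3.jpg; the folder path here is only a placeholder.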

VBA code to delete files in a directory that contains specific characters

I need help with a VBA macro that deletes files in a directory whose names contain more than 2 "_" and that are older than 3 months. However, there are some folders and subfolders in the directory that must not be touched or modified.
E.g, Hi_Thanks_for_your_help or Hi_Thank_You etc.
Const DIR = "x"
Const MAX_AGE = 3 ' Unit: Months
Dim oFSO
Dim aExclude
Sub XLS()
aExclude = Array("x")
Set oFSO = CreateObject("Scripting.FilesystemObject")
deleteFiles oFSO.GetFolder(DIR)
Set oFSO = Nothing
End Sub
'=================================
Function isExclude(sPath)
Dim s, bAns
bAns = False
For Each s In aExclude
If InStr(1, sPath, s, vbTextCompare) = 1 Then
bAns = True
Exit For
End If
Next
isExclude = bAns
End Function
'=================================
Function isOldFile(fFile)
' Old file if "MAX_AGE" months before today is greater than the file modification time
isOldFile = (DateAdd("m", -MAX_AGE, Date) > fFile.DateLastModified)
End Function
This is as far as I got with the code. What I'm missing is how to check whether a file name contains more than 2 "_" and, if so and the file is older than 3 months, delete it.
Thanks in advance! Cheers!
Dim pathname As String
pathname = "" ' full path of the file to check
' If the name has more than 2 "_" and the file is more than 3 months old, delete it
If fileNameCount("file_name") And DateDiff("m", FileDateTime(pathname), Now()) > 3 Then
' code to delete the file starts here
......
End If
Public Function fileNameCount(filename As String) As Boolean
fileNameCount = False
Dim count As Long
Dim temp() As String
temp = Split(filename, "_")
count = UBound(temp, 1)
If (count > 2) Then
fileNameCount = True
End If
End Function
I have written part of the code for you. The function fileNameCount returns True when the file name contains more than 2 '_' characters, and DateDiff gives the age of the file in months. When both conditions are true, you should go on to delete the file, which is the part I have not written.
What you need to do is
1) Pass in the "file_name" argument; you need to work out how to get the file name
2) Pass in the right pathname of the file
3) Write the code for deletion of files
I didn't test the code, so it might contain errors. I hope this helps with what you're trying to do.
To get the amount of "_" in a file, I would use something similar to this:
Dim a
Dim c As Integer
a = Split("File_Name_Here", "_")
c = Ubound(a)
Here c is the number of "_" in the file name: if the name splits into 3 substrings, it contained 2 "_", so c > 2 means more than 2. As for the age of the file, FileDateTime("FilePath") returns the date and time the file was created or last modified.
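The two conditions from the question can be combined into a single test. This is a hedged sketch only: `CountUnderscores` and `ShouldDelete` are hypothetical names, and the reference date is passed in as a parameter so the check can be tried with fixed dates.

```vba
' Hypothetical helper: number of "_" characters in a file name.
Function CountUnderscores(ByVal fileName As String) As Long
    CountUnderscores = Len(fileName) - Len(Replace(fileName, "_", ""))
End Function

' Hypothetical helper: True when the name has more than 2 underscores and
' the file was last modified more than maxAgeMonths months before today.
Function ShouldDelete(ByVal fileName As String, ByVal lastModified As Date, _
                      ByVal maxAgeMonths As Long, ByVal today As Date) As Boolean
    ShouldDelete = (CountUnderscores(fileName) > 2) And _
                   (DateAdd("m", -maxAgeMonths, today) > lastModified)
End Function
```

Inside a loop over the Files collection of a FileSystemObject folder, this could be used as `If ShouldDelete(oFile.Name, oFile.DateLastModified, MAX_AGE, Date) Then oFile.Delete`, after the excluded folders have been skipped with isExclude.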

Function to count number of lines in a text file

Need a function that will accept a filename as parameter and then return the number of lines in that file.
It should take under 30 seconds to get the count of a 10 million line file.
Currently have something along the lines of - but it is too slow with large files:
Dim objFSO, strTextFile, strData, arrLines, LineCount
CONST ForReading = 1
'name of the text file
strTextFile = "sample.txt"
'Create a File System Object
Set objFSO = CreateObject("Scripting.FileSystemObject")
'Open the text file - strData now contains the whole file
strData = objFSO.OpenTextFile(strTextFile,ForReading).ReadAll
'Split by lines, put into an array
arrLines = Split(strData,vbCrLf)
'Use UBound to count the lines
LineCount = UBound(arrLines) + 1
wscript.echo LineCount
'Cleanup
Set objFSO = Nothing
If somebody is still looking for a faster way, here is the code:
Const ForAppending = 8
Set fso = CreateObject("Scripting.FileSystemObject")
Set theFile = fso.OpenTextFile("C:\textfile.txt", ForAppending, True) ' VBScript has no named arguments; True = create if missing
WScript.Echo theFile.Line
Set Fso = Nothing
Of course, the processing time depends very much on the file size, not only on the number of lines. Compared with the RegEx method, the TextStream.Line property is at least 3 times quicker.
The only alternative I see is to read the lines one by one (EDIT: or even just skip them one by one) instead of reading the whole file at once. Unfortunately I can't test which is faster right now. I imagine skipping is quicker.
Dim objFSO, txsInput, strTemp, arrLines
Const ForReading = 1
Set objFSO = CreateObject("Scripting.FileSystemObject")
strTextFile = "sample.txt"
Set txsInput = objFSO.OpenTextFile(strTextFile, ForReading)
'Skip lines one by one
Do While txsInput.AtEndOfStream <> True
txsInput.SkipLine ' or strTemp = txsInput.ReadLine
Loop
wscript.echo txsInput.Line-1 ' Returns the number of lines
'Cleanup
Set objFSO = Nothing
Incidentally, I took the liberty of removing some of your 'comments. In terms of good practice, they were superfluous and didn't really add any explanatory value, especially when they basically repeated the method names themselves, e.g.
'Create a File System Object
... CreateObject("Scripting.FileSystemObject")
For very large files, the following is the fastest and most efficient way I know of:
Dim oFso, oReg, sData, lCount
Const ForReading = 1, sPath = "C:\file.txt"
Set oReg = New RegExp
Set oFso = CreateObject("Scripting.FileSystemObject")
sData = oFso.OpenTextFile(sPath, ForReading).ReadAll
With oReg
.Global = True
.Pattern = "\r\n" 'vbCrLf
'.Pattern = "\n" ' vbLf, Unix style line-endings
lCount = .Execute(sData).Count + 1
End With
WScript.Echo lCount
Set oFso = Nothing
Set oReg = Nothing
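The commented-out pattern above hints that line endings vary between files. Since both CRLF and LF endings contain exactly one line feed, counting vbLf alone covers both cases. The sketch below does that without RegExp; `CountLinesInText` is a hypothetical name, and unlike the snippet above it does not count a trailing line break as an extra line. It still needs the whole file in memory, so it is not a cure for very large files.

```vba
' Hypothetical helper: counts lines in text that uses CRLF or LF endings.
' Written without type declarations so it runs in both VBScript and VBA.
Function CountLinesInText(sData)
    Dim nBreaks
    If Len(sData) = 0 Then
        CountLinesInText = 0
        Exit Function
    End If
    ' Every CRLF or LF line ending contains exactly one line feed
    nBreaks = Len(sData) - Len(Replace(sData, vbLf, ""))
    ' A trailing line break ends the last line; it does not start a new one
    If Right(sData, 1) = vbLf Then nBreaks = nBreaks - 1
    CountLinesInText = nBreaks + 1
End Function
```

It would be called with the result of ReadAll, e.g. `lCount = CountLinesInText(sData)`.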
You could try some variation on this
Const ForReading = 1
c = 0
Set fso = CreateObject("Scripting.FileSystemObject")
Set theFile = fso.OpenTextFile(filespec, ForReading, False)
Do While theFile.AtEndOfStream <> True
theFile.SkipLine
c = c + 1
Loop
theFile.Close
WScript.Echo c,"lines"
txt = "c:\YourTxtFile.txt"
j = 0
Dim read
Open txt For Input As #1
Do While Not EOF(1)
Input #1, read
j = j + 1
Loop
Close #1
If the file ends with an empty last line, the result is (j - 1).
This works when each line of the text file holds a single column, because Input # treats commas as field separators.
To count all the lines in a text file, the code below keeps track of the line number while reading:
Set t1=createObject("Scripting.FileSystemObject")
Set t2=t1.openTextFile ("C:\temp\temp1\temp2_VBSCode.txt",1)
Do Until t2.AtEndOfStream
strlinenumber = t2.Line
strLine = t2.Readline
Loop
msgbox strlinenumber
t2.Close
I was looking for a faster way than what I already had to determine the number of lines in a text file. I searched the internet and came across 2 promising solutions. One was based on SQL, the other was the FSO-based solution I found here by Kul-Tigin. I tested them and this is part of the result:
Number of lines   Time elapsed   Variant
--------------------------------------------------------
110               00:00:00.70    SQL
110               00:00:00.00    Vanilla VBA (my solution)
110               00:00:00.16    FSO
--------------------------------------------------------
1445014           00:00:17.25    SQL
1445014           00:00:09.19    Vanilla VBA (my solution)
1445014           00:00:17.73    FSO
I ran this several times with large and small numbers of lines. Time and again the vanilla VBA came out on top. I know this question is old, but for anyone still looking for the fastest way to determine the number of lines in a csv/text file, here is the code I use.
Public Function GetNumRecs(ASCFile As String) As Long
Dim InStream As Long
Dim Record As String
InStream = FreeFile
GetNumRecs = 0
Open ASCFile For Input As #InStream
Do While Not EOF(InStream)
Line Input #InStream, Record
GetNumRecs = GetNumRecs + 1
Loop
Close #InStream
End Function