How do I set max character length of filenames? - vb.net

I am currently using the following code to remove the defined special characters and white spaces from any file names within my defined directory:
For Each file As FileInfo In files
newName = Regex.Replace(file.Name, "[!##$%^&*()_ ]", "")
If (file.Name <> newName) Then
newPath = Path.Combine(dir, newName)
file.CopyTo(newPath)
End If
Next
Edit: How do I trim the characters of the new file name (newName) to all but the first 26 characters?
Answer:
For Each file As FileInfo In files
If (file.Name.Length >= 36) Then
Dim maxLen As Integer = 26 - file.Extension.Length
newName = $"{Regex.Replace(Path.GetFileNameWithoutExtension(file.Name), "[!##$%^&*()_ ]", "").Substring(0, maxLen)}{file.Extension}"
newPath = Path.Combine(dir, newName)
file.CopyTo(newPath, True)
ElseIf (file.Name.Length < 36) Then
newName = Regex.Replace(file.Name, "[!##$%^&*()_ ]", "")
If (file.Name <> newName) Then
newPath = Path.Combine(dir, newName)
file.CopyTo(newPath)
End If
End If
Next

You can use LINQ as follows:
Dim dir = "c:\myFolder"
Dim except = "[!##$%^&*()_ ]".ToArray
For Each file As FileInfo In files
    Dim maxLen As Integer = 26 - file.Extension.Length
    ' Where keeps repeated characters; Except is a set operation and would also drop duplicates.
    Dim kept = Path.GetFileNameWithoutExtension(file.Name).
        Where(Function(c) Not except.Contains(c)).
        Take(maxLen).
        ToArray
    Dim newPath = Path.Combine(dir, $"{New String(kept)}{file.Extension}")
    file.CopyTo(newPath, True)
Next
Suppose you have a file with name:
abcdefg_hijk!lm#pno#pq%r(stuvy)x$z.dbf
The newPath output will be:
c:\myFolder\abcdefghijklmpnopqrstu.dbf
If that is what you need to do.
Edit:
Alternative using RegEx:
Dim except = "[!##$%^&*()_ ]"
For Each file As FileInfo In files
    Dim maxLen As Integer = 26 - file.Extension.Length
    Dim cleaned = Regex.Replace(Path.GetFileNameWithoutExtension(file.Name), except, "")
    ' Substring throws if the cleaned name is shorter than maxLen, so clamp the length.
    Dim newName = $"{cleaned.Substring(0, Math.Min(maxLen, cleaned.Length))}{file.Extension}"
    Dim newPath = Path.Combine(dir, newName)
    file.CopyTo(newPath, True)
Next
So the newPath for a file with name:
n_BrucesMiddle NH 12 34 5 W3_H.dbf
... will be:
c:\myFolder\nBrucesMiddleNH12345W3.dbf
The unwanted characters have been removed and the maximum length of the new file name (newName) including the extension is 26.
Here's regex101 example.
Again, if that is what you need. Good luck.

Use String.Remove
newName = newName.Remove(26)
Note: the string must be longer than 26 characters, otherwise Remove can throw an ArgumentOutOfRangeException.
EDIT:
If you want the extension to remain (assuming a three-letter extension such as .txt), use this instead:
newName = newName.Remove(26, newName.Length - 30)

To rename files you can use .MoveTo method.
From the docs:
Moves a specified file to a new location, providing the option to
specify a new file name.
You probably want to shorten only the name part so that the extension remains unchanged.
This approach supports any file extension, not only three-character ones.
For Each file As FileInfo In files
Dim newName As String = Path.GetFileNameWithoutExtension(file.Name).Remove(26)
Dim newPath = Path.Combine(file.DirectoryName, $"{newName}{file.Extension}")
file.MoveTo(newPath)
Next

If you are asking how to get a subset of a String, use the Substring method. For the first 26 characters, use the second overload, Substring(startIndex, length).
Example:
Dim filename As String = "this is a terrible f^ln#me! that is (way) too long.txt"
Dim newFilename As String = Regex.Replace(filename, "[!##$%^&*()_ ]", String.Empty).Substring(0, 26)
Live Demo: Fiddle
Update
As per your comment, here is how you'd trim down just the filename without the extension:
Dim filename As String = "this is a terrible f^ln#me! that is (way) too long.txt"
Dim extension As String = Path.GetExtension(filename)
Dim shortFilename As String = Path.GetFileNameWithoutExtension(filename)
Dim newFilename As String = Regex.Replace(shortFilename, "[!##$%^&*()_ ]", String.Empty).Substring(0, 26) & extension
Live Demo: Fiddle
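Several of the snippets above call Substring or Remove with a fixed length, which throws when the cleaned name is already shorter than the limit. A minimal sketch that combines the clean-up and the length cap in one function, assuming the 26-character limit includes the extension. The `FileNameHelper` module and `ShortenFileName` function are hypothetical names, not part of any answer above:

```vbnet
Imports System
Imports System.IO
Imports System.Text.RegularExpressions

Module FileNameHelper
    ' Hypothetical helper: strips the unwanted characters, then caps the
    ' total length (name + extension) at maxTotalLength.
    Function ShortenFileName(fileName As String, maxTotalLength As Integer) As String
        Dim extension As String = Path.GetExtension(fileName)
        Dim baseName As String = Path.GetFileNameWithoutExtension(fileName)
        Dim cleaned As String = Regex.Replace(baseName, "[!##$%^&*()_ ]", "")
        ' Clamp both values so Substring never reads past the end of a short name.
        Dim maxBase As Integer = Math.Max(0, maxTotalLength - extension.Length)
        Dim keep As Integer = Math.Min(maxBase, cleaned.Length)
        Return cleaned.Substring(0, keep) & extension
    End Function
End Module
```

With the example from the answer above, `ShortenFileName("n_BrucesMiddle NH 12 34 5 W3_H.dbf", 26)` should give `nBrucesMiddleNH12345W3.dbf`, and a short name such as `a_b.txt` simply comes back as `ab.txt`.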

Related

Extracting and save relevant data from a txt file

I tried to write a code but I didn't succeed at all. Could someone help me please?
I want my program to read the data.txt file
The data.txt file contains:
Name: Christian
Phone: x
Address: x
Name: Alexander
Phone: x
Address: x
I would like the program to save the names in a output file: output_data.txt
The output_data.txt file should contain:
Christian
Alexander
This is what I have so far:
Using DataReader As New Microsoft.VisualBasic.FileIO.TextFieldParser("data.txt")
Dim DataSaver As System.IO.StreamWriter
DataReader.TextFieldType = FileIO.FieldType.Delimited
DataReader.SetDelimiters("Name: ")
Dim Row As String()
While Not DataReader.EndOfData
Row = DataReader.ReadFields()
Dim DataSplited As String
For Each DataSplited In Row
My.Computer.FileSystem.WriteAllText("output_data.txt", DataSplited, False)
'MsgBox(DataSplited)
Next
End While
End Using
But output_data.txt does not end up containing what MsgBox(DataSplited) shows. MsgBox(DataSplited) splits the text at Name: but also shows the rest, such as address and phone. I don't know what the problem is.
This will work:
Dim strFile = "c:\test5\data.txt"
Dim InputBuf As String = File.ReadAllText(strFile)
Dim OutBuf As String = ""
Dim InBufHold() As String = Split(InputBuf, "Name:")
For i = 1 To InBufHold.Length - 1
OutBuf += Trim(Split(InBufHold(i), vbCrLf)(0)) & vbCrLf
Next
File.WriteAllText("c:\test5\output_data.txt", OutBuf)
Alternatively, using LINQ:
Dim names = File.ReadLines("data.txt").
Where(Function(line) line.StartsWith("Name: ")).
Select(Function(line) line.SubString(5).Trim())
File.WriteAllText("output_data.txt", String.Join(vbCrLf, names))

How can I format value between 2nd and 4th underscore in the file name?

I have VBA code to capture filenames to a table in an MS Access Database.
The values look like this:
FileName
----------------------------------------------------
WC1603992365_Michael_Cert_03-19-2019_858680723.csv
WC1603992365_John_Non-Cert_03-19-2019_858680722.csv
WC1703611403_Paul_Cert_03-27-2019_858679288.csv
Each filename has 4 _ underscores and the length of the filename varies.
I want to capture the value between the 2nd and the 3rd underscore, e.g.:
Cert
Non-Cert
Cert
I have another file downloading program, and it has "renaming" feature with a regular expression. And I set up the following:
Source file Name: (.*)\_(.*)\_(.*)\_(.*)\_\-(.*)\.(.*)
New File Name: \5.\6
In this example, I move the 5th section of the file name to the front, and add the file extension.
For example, WC1603992365_Michael_Cert_03-19-2019_858680723.csv would be saved as 858680723.csv in the folder.
Is there a way that I can use RegEx to capture 3rd section of the file name, and save the value in a field?
I tried VBA code, and searched SQL examples, but I did not find any.
Because the file name length is not fixed, I cannot use LEFT or RIGHT...
Thank you in advance.
One possible solution is to use the VBA Split function to split the string into an array of strings using the underscore as a delimiter, and then return the item at index 2 in this array.
For example, you could define a VBA function such as the following, residing in a public module:
Function StringElement(strStr, intIdx As Integer) As String
Dim strArr() As String
strArr = Split(Nz(strStr, ""), "_")
If intIdx <= UBound(strArr) Then StringElement = strArr(intIdx)
End Function
Here, I've defined the argument strStr as a Variant so that you may pass it Null values without error.
If supplied with a Null value or if the supplied index exceeds the bounds of the array returned by splitting the string using an underscore, the function will return an empty string.
You can then call the above function from a SQL statement:
select StringElement(t.Filename, 2) from Filenames t
Here I have assumed that your table is called Filenames - change this to suit.
This is the working code that I completed. Thank you for sharing your answers.
Public Function getSourceFiles()
Dim rs As Recordset
Dim strFile As String
Dim strPath As String
Dim newFileName As String
Dim FirstFileName As String
Dim newPathFileName As String
Dim RecSeq1 As Integer
Dim RecSeq2 As Integer
Dim FileName2 As String
Dim WrdArrat() As String
RecSeq1 = 0
Set rs = CurrentDb.OpenRecordset("tcsvFileNames", dbOpenDynaset) 'open a recordset
strPath = "c:\in\RegEx\"
strFile = Dir(strPath, vbNormal)
Do 'Loop through the balance of files
RecSeq1 = RecSeq1 + 1
If strFile = "" Then 'If no file, exit function
GoTo ExitHere
End If
FirstFileName = strPath & strFile
newFileName = strFile
newPathFileName = strPath & newFileName
FileName2 = strFile
Dim SubStrings() As String
SubStrings = Split(FileName2, "_")
Debug.Print SubStrings(2)
rs.AddNew
rs!FileName = strFile
rs!FileName68 = newFileName 'assign new files name max 68 characters
rs!Decision = SubStrings(2) 'extract the value between the 2nd and 3rd underscore, and add it to Decision Field
rs.Update
Name FirstFileName As newPathFileName
strFile = Dir()
Loop
ExitHere:
Set rs = Nothing
MsgBox ("Directory list is complete.")
End Function

Delete specific symbol and number at the end of filename if exist

My application downloads many different files from the network. Some of the file names may contain an additional number in brackets, as below:
report78-12-34-34_ex 'nothing to be removed
blabla3424dm_d334(7) '(7) - to be removed
erer3r3r3_2015_03_03-1945-user-_d334(31).xml '(31) - to be removed
group78-12-34-34_ex.html 'nothing to be removed
somereport5_6456 'nothing to be removed
As you can see, if (number) appears in a filename it has to be removed. Do you have a reliable method that would do the job?
I got some code from rakesh, but it does not work when the string doesn't contain (number):
string test="something(3)";
test=Regex.Replace(test, @"\d", "").Replace("()","");
It fails, for example, with a file like UIPArt3MilaGroupUIAPO34mev1-mihe-2015_9_23-21_30_5_580.csv, which becomes UIPArtMilaGroupUIAPOmev-mihe--_.csv
I would also prefer not to use regex.
This avoids Regex and checks the string inside the parentheses, removing the substring only if the enclosed string is a number.
Private Function NewFileName(ByVal FileName As String) As String
If FileName Like "*(*)*" Then
Try
Dim SubStrings() As String = Split(FileName, "(", 2)
NewFileName = SubStrings(0)
SubStrings = Split(SubStrings(1), ")", 2)
SubStrings(0) = NewFileName(SubStrings(0))
SubStrings(1) = NewFileName(SubStrings(1))
If IsNumeric(SubStrings(0)) Then
NewFileName &= SubStrings(1)
Else
Return FileName
End If
Catch
Return FileName
End Try
Else
Return FileName
End If
End Function
I would do something like this:
Public Function GetFileName(ByVal fileName As String) As String
Dim lastOpenBracketPos As Integer = fileName.LastIndexOf("(")
Dim lastCloseBracketPos As Integer = fileName.LastIndexOf(")")
If lastOpenBracketPos <> -1 AndAlso lastCloseBracketPos <> -1 AndAlso lastCloseBracketPos > lastOpenBracketPos Then
Dim bracketsText As String = fileName.Substring(lastOpenBracketPos, lastCloseBracketPos-lastOpenBracketPos+1)
If IsNumeric(bracketsText.Trim("("c, ")"c)) Then
Return fileName.Replace(bracketsText,"")
End If
End If
Return fileName
End Function
Drawing on the code here, I wrote my own version, because the number of brackets in the filename has to be checked before anything else is done with it: only if there is exactly one opening and one closing bracket should the check proceed. Do you see any issue I'm missing, or anything that could be tuned?
Private Function DeleteBrackets(ByVal fn As String) As String
Dim countOpenBracket As Integer = fn.Split("(").Length - 1
Dim countCloseBracket As Integer = fn.Split(")").Length - 1
'-- If only one occurrence of ( and one occurrence of )
If countOpenBracket = 1 And countCloseBracket = 1 Then
Dim filextension = IO.Path.GetExtension(fn)
Dim filewithoutExtension As String = IO.Path.GetFileNameWithoutExtension(fn)
'Debug.Print("Original file name = " & fn)
'Debug.Print("File name without extension = " & filewithoutExtension)
'Debug.Print("Extension = " & IO.Path.GetExtension(fn))
If filewithoutExtension.EndsWith(")") Then
fn = filewithoutExtension.Remove(filewithoutExtension.LastIndexOf("("))
'Debug.Print("After removing last index of ( = " & fn)
'Debug.Print("Adding again extension = " & fn & filextension)
End If
'Debug.Print(fn)
End If
Return fn
End Function
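Two things stand out in DeleteBrackets as written: when it strips the bracket group it returns the name without re-appending the extension, and it never checks that the bracket content is a number, so a name like report(draft).txt would also be cut. A minimal regex-free sketch that addresses both, assuming the input is a bare file name without a directory part. The `BracketCleaner` module and `RemoveTrailingCounter` function are hypothetical names:

```vbnet
Imports System
Imports System.IO

Module BracketCleaner
    ' Hypothetical helper: removes a trailing "(digits)" group from the
    ' name part of a file name and keeps the extension.
    Function RemoveTrailingCounter(fileName As String) As String
        Dim extension As String = Path.GetExtension(fileName)
        Dim baseName As String = Path.GetFileNameWithoutExtension(fileName)
        If Not baseName.EndsWith(")") Then Return fileName
        Dim openPos As Integer = baseName.LastIndexOf("("c)
        If openPos < 0 Then Return fileName
        ' Text between the last "(" and the closing ")".
        Dim inner As String = baseName.Substring(openPos + 1, baseName.Length - openPos - 2)
        If inner.Length = 0 Then Return fileName
        ' Only strip the group when it contains digits and nothing else.
        For Each c As Char In inner
            If Not Char.IsDigit(c) Then Return fileName
        Next
        Return baseName.Substring(0, openPos) & extension
    End Function
End Module
```

Names without a trailing numeric group, such as group78-12-34-34_ex.html or the UIPArt3... example from the question, are returned unchanged.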

Find most recent fileS, return the last x number of files IF made within a minute of each other

The situation I'm in is the following:
I need to return the path of the most recent fileS in a folder. The number of files that I need to return is specified by "numberOfFiles" and is in descending order from most recent.
E.g,
File1.doc - Last modified at 8:42:00 PM
File2.doc - Last modified at 8:43:00 PM
File3.doc - Last modified at 8:44:00 PM
numberOfFiles = 2, should return an array of;
File3.doc's path
File2.doc's path
This much is working, with the code below.
Option Explicit
Sub test()
Dim FileName As String
Dim FileSpec As String
Dim MostRecentFile As String
Dim MostRecentDate As Date
Dim Directory As String
Dim resultArray() As String
Dim groupedArray() As String
Dim fileCounter As Integer
Dim groupedArrayCounter As Integer
Dim resultArrayCounter As Integer
Dim i As Integer
Dim numberOfFiles As Integer: numberOfFiles = 2
Directory = "C:\Test\"
FileSpec = "File*.doc"
If Right(Directory, 1) <> "\" Then Directory = Directory & "\"
fileCounter = 0
FileName = Dir(Directory & FileSpec, 0)
If FileName <> "" Then
MostRecentFile = FileName
MostRecentDate = FileDateTime(Directory & FileName)
Do While FileName <> ""
If FileDateTime(Directory & FileName) > MostRecentDate Then
MostRecentFile = FileName
MostRecentDate = FileDateTime(Directory & FileName)
ReDim Preserve resultArray(fileCounter)
resultArray(fileCounter) = FileName
fileCounter = fileCounter + 1
End If
FileName = Dir()
Loop
End If
groupedArrayCounter = 0
resultArrayCounter = UBound(resultArray)
ReDim groupedArray(numberOfFiles - 1)
For i = numberOfFiles To 1 Step -1
groupedArray(groupedArrayCounter) = resultArray(resultArrayCounter)
groupedArrayCounter = groupedArrayCounter + 1
resultArrayCounter = resultArrayCounter - 1
Next i
MsgBox "Done"
End Sub
One last requirement has been put on at the last minute, and I'm not sure how I can achieve it. While I need to be able to return numberOfFiles amount of the most recent files (which works), I must only do so if the files are modified within 60 seconds or less of each other (This also needs to be done in descending order from the most recent - in this example, File3). For example;
If file 2 is made within 60 seconds of file 3, add it to the final array
If file 1 is made within 60 seconds of file 2, add it to the final array
Etc until there are no more files or we have exceeded numberOfFiles
Help greatly appreciated
Edit:
I know this can be done somehow using DateDiff("s", var1, var2), I'm just not entirely sure how the logic will work going in descending order starting from the uBound of my array
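One way to read the requirement is as a chain: sort the files newest-first, always keep the newest, and keep each following file only while it is within 60 seconds of the one kept before it. A minimal sketch of that logic, written in VB.NET rather than VBA and working on name/date pairs instead of calling Dir and FileDateTime. The `RecentFiles` module, `TakeRecentChain` function and its parameters are hypothetical names:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module RecentFiles
    ' Hypothetical helper: walks the files newest-first and keeps each one only
    ' while it was modified within maxGapSeconds of the file kept before it.
    Function TakeRecentChain(files As IEnumerable(Of KeyValuePair(Of String, DateTime)),
                             numberOfFiles As Integer,
                             maxGapSeconds As Double) As List(Of String)
        Dim result As New List(Of String)
        Dim previous As DateTime
        Dim ordered As IEnumerable(Of KeyValuePair(Of String, DateTime)) =
            files.OrderByDescending(Function(f As KeyValuePair(Of String, DateTime)) f.Value)
        For Each entry As KeyValuePair(Of String, DateTime) In ordered
            If result.Count >= numberOfFiles Then Exit For
            ' The newest file is always kept; a larger gap to the previous file ends the chain.
            If result.Count > 0 AndAlso (previous - entry.Value).TotalSeconds > maxGapSeconds Then Exit For
            result.Add(entry.Key)
            previous = entry.Value
        Next
        Return result
    End Function
End Module
```

In VBA the same comparison would presumably be `DateDiff("s", olderDate, newerDate) <= 60` inside a loop that walks the sorted array downwards from its UBound and exits as soon as the gap is too large.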

Split string from file

I have a file in which the name of a book and its author is on each line. (EX: "Douglas Adams,The Hitchhiker's Guide To The Galaxy" is one line of the file). I can read each line into a temporary string, but when I split it at the comma to put the author and book in different arrays, it won't work.
Here is my code:
objReader = New StreamReader(AppPath() + "books\books.txt")
i = 1
Dim temp() As String
Dim tempStr As String
Do While objReader.Peek() <> -1
tempStr = objReader.ReadLine()
temp = tempStr.Split(New Char() {","c})
temp(0) = authors(i)
temp(1) = books(i)
i = i + 1
Loop
I already initialized objReader and i earlier, and I imported System.IO, too.
I have tried to change the delimiters to semicolons, slashes, and backslashes in both the code and the file, but it doesn't work. I can confirm the file loads correctly.
You have to put the strings into the arrays; you're doing it the other way around:
authors(i) = temp(0)
books(i) = temp(1)
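For reference, a minimal sketch of the whole loop with the assignment the right way round. It uses lists instead of pre-sized arrays and splits on the first comma only, which is an assumption about the file format (one author, then the title). The `BookParser` module and `ParseBooks` procedure are hypothetical names:

```vbnet
Imports System
Imports System.Collections.Generic

Module BookParser
    ' Hypothetical helper: splits "author,title" lines into two parallel lists.
    Sub ParseBooks(lines As IEnumerable(Of String),
                   authors As List(Of String),
                   books As List(Of String))
        For Each line As String In lines
            ' Split on the first comma only, so commas inside a title survive.
            Dim parts As String() = line.Split(New Char() {","c}, 2)
            If parts.Length < 2 Then Continue For ' skip lines without a comma
            authors.Add(parts(0).Trim())
            books.Add(parts(1).Trim())
        Next
    End Sub
End Module
```

The lines could come from `File.ReadLines` on the books.txt path used in the question.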