I have a question regarding a piece of coding I would like to make more efficient in Visual Basic. What I am trying to do is the following:
I have a folder containing 100 .csv files (comma delimited), these files have around 5000 lines and around 200 columns. The order of columns may vary from one file to another, and some column are missing in some files.
My goal is to create one big .csv file that combines all the 100 .csv files, with a selection of column that I specify in advance.
Here is how I proceed:
Create an array to store the name of the columns I want in the final “big .csv”
Loop through all the files in the folder. For each file,
For each line in the file, use the Split function, to create an array containing all the values for a given line.
create a mapping array, that store the position of the column in the file for each column name chosen in the first step (do this only for the first line of each file)
Write in a file (“the big .csv”) the header (do this only once)
Write in the same big file, for each line of each file, the data based on the position of the column.
So that process works well, I get the outcome that I want but it is very slow… (It takes ~40min on my computer for ~200 files which once appended contains 500,000 lines and 200 columns. A colleague has managed to do a similar process, appending all the files, using the data.table package in R, and he is able to perform the same appending with the same .csv tables in 5-10min on the same computer)
I was wondering if there is a better alternative than going through the file “cell by cell”? Could I identify the column I don’t want from the source file and delete them entirely? Is there a function to append files together rather than reading each cell and then write them back?
Edit: Alternatively, is there another programming language that is significantly more efficient (Python? Power-Shell?) to do this kind of files manipulation?
Edit2: More details about why I consider it slow.
Edit3: The piece of code relevant to my question as requested in the comments:
Public Module Public_Variables
'Initializr technical parameters
Public Enable_SQL_Upload As String = "Yes"
Public Enable_CSV_Output As String = "Yes"
Public Enable_Runlog As String = "Yes"
Public MPF_Type As String
'Initialize Path and Folder location
Public Path As String '= "L:\Prophet\1902_Analysis\results\RUN_200" ' "S:\Users\Jonathan\12_Project_Space\Input" '"K:\Prophet\1809\Model_B2_LS_Analysis\results\RUN_31"
'Initialize parameters for Actuarial SQL database connection
Public SQLServer As String '= "MELAIPWBAT01"
Public SQLDataBase As String '= "VALDATA"
Public SQLTableName As String '= "RPT_1902_" & Right(Path, 7) '"aMPF_Retail_1808"
Public HeaderScopeFileName As String
Public ValidFilesFileName As String = "S:\Users\Jonathan\12_Project_Space\Tables\ValidFiles.txt"
Public InputFileName As String = "S:\Users\Jonathan\12_Project_Space\Tables\Input.txt"
Public tsrl As String = Now.Year.ToString() + Now.Month.ToString() + Now.Day.ToString() + "_" + Now.Hour.ToString() + "h" + Now.Minute.ToString() + "m" + Now.Second.ToString() + "s"
Public RunLogFileName As String = "\\Melaipwbat01\c$\Users\AACT064\Desktop\SQL_CSV_BULK_INSERT\RunLog" & tsrl & ".csv"
Public RunLogFile As IO.StreamWriter = My.Computer.FileSystem.OpenTextFileWriter(RunLogFileName, False)
Public HeaderFile1 As String '= "K:\Prophet\1903\MPF\MPF_SNAPSHOT\C_TROC.rpt" ' "K:\Prophet\1903\Model_B2_LS_Analysis\results\RUN_200\C_TROC.rpt" '"K:\Prophet\1809\Model_B2_LS_Analysis\results\RUN_31\C_DIO0.rpt"
Public HeaderFile2 As String '= "K:\Prophet\1901\Model_GRP\results\RUN_16\CORS_0.rpt" '"K:\Prophet\1809\Model_B2_LS_Analysis\results\RUN_31\C_DIO0.rpt"
Public ValidFilesFile As New System.IO.StreamReader(ValidFilesFileName)
'UserForm Design
'Public UserFormHeight_Start As Integer = 800
'Public UserFormWidth_Start As Integer = 800
Public UserFormHeight_ProgressBarExtension As Integer = 0
Public UserFormWidth_ProgressBarExtension As Integer = 0
'Initialize Misc.
Public Input_Array(,) As String
Public TextLine As String
Public TextLineSplit() As String
Public ValidFilesArray(,) As String
Public RecordCount As Integer = 0
Public BodyString As String = ""
Public SqlCommandText1 As String = ""
Public SqlCommandText2 As String = ""
'Initialize IS variables
Public Is_RPT_Name As Integer = 1
Public Is_RPT_Type As Integer = 2
Public Is_RPT_Valid As Integer = 3
Public Is_Name As Integer = 1
Public Is_Type As String = 2
Public Is_Not_found As Integer = -1
Public Is_RetailDCS As Integer = 0
Public Is_GroupDCS As Integer = 1
Public Is_MPF_Type As Integer
Public Is_Input_Header As Integer = 1
Public Is_Input_Path As Integer = 2
Public Is_Input_Server As Integer = 3
Public Is_Input_Database As Integer = 4
Public Is_Input_TableName As Integer = 5
Public Is_Input_HeaderFileRetailDCS As Integer = 6
Public Is_Input_HeaderFileGroupDCS As Integer = 7
'Initialize temp variables
Public temp_Valid As Integer
'Initialize the header of the SQL table that is created from that application
Public HeaderScope(,) As String
Public HeaderMapId(,) As String
Public HeaderStringSQL As String = ""
Public HeaderStringCSV As String = ""
Public MappingFound As Boolean
'Initialization for the files looping
Public temp_file As Integer = 1
Public temp_line As Integer
Public temp_headerfile As String
Public FileSize As Integer
Public TimerCounter As Integer = 0
End Module
Public Class UF_UserForm
Private Sub UF_UserForm_Load(sender As Object, e As EventArgs) Handles MyBase.Load
'Me.Height = UserFormHeight_Start
'Me.Width = UserFormWidth_Start
TB_Input.Text = InputFileName
CB_SQLUpload.Text = Enable_SQL_Upload
CB_CSVOutput.Text = Enable_CSV_Output
CB_Runlog.Text = Enable_Runlog
GB_Progress.Visible = False
End Sub
Private Sub B_Run_Click_1(sender As Object, e As EventArgs) Handles B_Run.Click
'Disable the button, switch to 'Progress' tab
B_Run.Enabled = False
TabControl1.SelectedIndex = 1
'Start the Timer
Timer1.Interval = 1000
TimerCounter = 0
Timer1.Start()
'Initilialize the parameters with the Text Box values
TB_Server.Enabled = False
TB_Database.Enabled = False
TB_TableName.Enabled = False
TB_Path.Enabled = False
TB_FileType.Enabled = False
CB_SQLUpload.Enabled = False
CB_CSVOutput.Enabled = False
CB_Runlog.Enabled = False
Enable_SQL_Upload = CB_SQLUpload.Text
Enable_CSV_Output = CB_CSVOutput.Text
Enable_Runlog = CB_Runlog.Text
'Extract the inputs from the input.txt file
Dim temp_Input As Integer
Dim InputFirstCol As String
Dim InputSecCol As String
Dim Nb_Of_Runs As Integer
Dim EndOfLoop As Boolean
InputFileName = TB_Input.Text
Dim InputFile As New System.IO.StreamReader(InputFileName)
temp_Input = 0
EndOfLoop = False
Do While InputFile.Peek() <> -1 And EndOfLoop = False
TextLine = InputFile.ReadLine()
TextLineSplit = TextLine.Split(",")
InputFirstCol = TextLineSplit(0)
If InputFirstCol <> "#" And InputFirstCol <> "" And InputFirstCol <> "--End--" Then
InputSecCol = TextLineSplit(1)
Else
InputSecCol = ""
End If
If InputFirstCol = "--End--" Then
EndOfLoop = True
Else
If InputFirstCol = "#" Then
temp_Input = temp_Input + 1
ElseIf InputFirstCol = "HeaderScope" Then
ReDim Preserve Input_Array(7, temp_Input)
Input_Array(Is_Input_Header, temp_Input) = InputSecCol
ElseIf InputFirstCol = "Path" Then
Input_Array(Is_Input_Path, temp_Input) = InputSecCol
ElseIf InputFirstCol = "Server" Then
Input_Array(Is_Input_Server, temp_Input) = InputSecCol
ElseIf InputFirstCol = "Database" Then
Input_Array(Is_Input_Database, temp_Input) = InputSecCol
ElseIf InputFirstCol = "TableName" Then
Input_Array(Is_Input_TableName, temp_Input) = InputSecCol
ElseIf InputFirstCol = "HeaderFileRetailDCS" Then
Input_Array(Is_Input_HeaderFileRetailDCS, temp_Input) = InputSecCol
ElseIf InputFirstCol = "HeaderFileGroupDCS" Then
Input_Array(Is_Input_HeaderFileGroupDCS, temp_Input) = InputSecCol
End If
End If
Loop
Nb_Of_Runs = temp_Input
'Create an array to store the timer per run
Dim Timer_Array(Nb_Of_Runs) As String
'Let's start the loop for each run
For temp_Run = 1 To Nb_Of_Runs
'Initialize date stamp variables and create the date stamp (called ts)
Dim now As DateTime = DateTime.Now
Dim ts As String = now.Year.ToString() + now.Month.ToString() + now.Day.ToString() + "_" + now.Hour.ToString() + "h" + now.Minute.ToString() + "m" + now.Second.ToString() + "s"
Dim temp_full_count As Integer = 0
Dim File_Count As Integer = 0
Dim temp_count As Integer = 0
'Open the.csv file
Dim OutputCSVFileName As String = "\\Melaipwbat01\c$\Users\AACT064\Desktop\SQL_CSV_BULK_INSERT\SQL_Upload" & ts & ".csv" ' "S:\Users\Jonathan\12_Project_Space\Output\RPT_Files" & ts & ".csv"
Dim SQLUploadFileName As String = Replace(OutputCSVFileName, "\\Melaipwbat01\c$", "C:")
Dim SQLQueriesFileName As String = "\\Melaipwbat01\c$\Users\AACT064\Desktop\SQL_CSV_BULK_INSERT\SQL_Queries" & ts & ".csv"
Dim outFile As IO.StreamWriter = My.Computer.FileSystem.OpenTextFileWriter(OutputCSVFileName, False)
Dim SQLQueriesFile As IO.StreamWriter = My.Computer.FileSystem.OpenTextFileWriter(SQLQueriesFileName, False)
'Store the inputs from the Input_Array
SQLServer = Input_Array(Is_Input_Server, temp_Run)
TB_Server.Text = Input_Array(Is_Input_Server, temp_Run)
SQLDataBase = Input_Array(Is_Input_Database, temp_Run)
TB_Database.Text = Input_Array(Is_Input_Database, temp_Run)
SQLTableName = Input_Array(Is_Input_TableName, temp_Run)
TB_TableName.Text = Input_Array(Is_Input_TableName, temp_Run)
Path = Input_Array(Is_Input_Path, temp_Run)
TB_Path.Text = Input_Array(Is_Input_Path, temp_Run)
HeaderScopeFileName = Input_Array(Is_Input_Header, temp_Run)
TB_FileType.Text = Input_Array(Is_Input_Header, temp_Run)
HeaderFile1 = Input_Array(Is_Input_HeaderFileRetailDCS, temp_Run)
TB_HeaderRetailDCS.Text = Input_Array(Is_Input_HeaderFileRetailDCS, temp_Run)
HeaderFile2 = Input_Array(Is_Input_HeaderFileGroupDCS, temp_Run)
TB_HeaderGroupDCS.Text = Input_Array(Is_Input_HeaderFileGroupDCS, temp_Run)
'Open the folder location and store all the files objetcs into files()
Dim files() As String = IO.Directory.GetFiles(Path)
'Open the Header scope .txt file
Dim HeaderScopeFile As New System.IO.StreamReader(HeaderScopeFileName)
'Initialize the variable ValidFilesArray
temp_Valid = 1
Do While ValidFilesFile.Peek() <> -1
TextLine = ValidFilesFile.ReadLine()
TextLineSplit = TextLine.Split(", ")
ReDim Preserve ValidFilesArray(3, temp_Valid)
ValidFilesArray(Is_RPT_Name, temp_Valid) = TextLineSplit(Is_RPT_Name - 1)
ValidFilesArray(Is_RPT_Type, temp_Valid) = TextLineSplit(Is_RPT_Type - 1)
ValidFilesArray(Is_RPT_Valid, temp_Valid) = TextLineSplit(Is_RPT_Valid - 1)
temp_Valid = temp_Valid + 1
Loop
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'' Display the Progress Group Box ''''''''''''''''''''''''''''''''''''''''''''''''
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
L_ProgressPC.Text = "Initialisation"
ProgressBar.Value = 0
GB_Progress.Visible = True
'Me.Height = UserFormHeight_Start + UserFormHeight_ProgressBarExtension
'Me.Width = UserFormWidth_Start + UserFormWidth_ProgressBarExtension
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'' Nb of Files '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' The goal of this piece of code aims at calculating the number of .rpt files
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
TB_Runlog.Text = "Checking number of .rpt files..." & Environment.NewLine & TB_Runlog.Text
For Each file As String In files
If CheckValidRPTFile(file, False) = True Then
File_Count = File_Count + 1
L_NbOfFiles.Text = File_Count
L_NbOfRuns.Text = temp_Run & "/" & Nb_Of_Runs
End If
Application.DoEvents()
Next
TB_Runlog.Text = "... " & " Run number " & temp_Run & Environment.NewLine & TB_Runlog.Text
TB_Runlog.Text = "... " & File_Count & " rpt files founds" & Environment.NewLine & TB_Runlog.Text
TB_Runlog.Text = "---------------------------------------" & Environment.NewLine & TB_Runlog.Text
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' The key is to define the header with the field names and the field types
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim HeaderNbOfField As Integer = 0
Do While HeaderScopeFile.Peek() <> -1
HeaderNbOfField = HeaderNbOfField + 1
'We split the libe into an array using the comma delimiter
TextLine = HeaderScopeFile.ReadLine()
TextLineSplit = TextLine.Split(", ")
ReDim Preserve HeaderScope(2, HeaderNbOfField)
HeaderScope(Is_Name, HeaderNbOfField) = TextLineSplit(0)
HeaderScope(Is_Type, HeaderNbOfField) = TextLineSplit(1)
Loop
'That array stores the position of a given field in the file
'It is important to initialise the array to -1
'When further down we assign HeaderMapId, if a value remains "-1" il will mean the field was not assigned thus not found.
'We will then assign a default value for those not found fields
ReDim HeaderMapId(1, HeaderNbOfField)
For temp_headermapini = 0 To HeaderNbOfField
HeaderMapId(Is_RetailDCS, temp_headermapini) = Is_Not_found
HeaderMapId(Is_GroupDCS, temp_headermapini) = Is_Not_found
Next
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'' Header ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' The goal of this piece of code is to populate the variable HeaderMapId
' HeaderMapId stores the position of each field define is HeaderScope variable
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
L_ProgressPC.Text = "Find position of each field in the flat file"
'We use HeaderFile as the base to define the position of each field
'The application will function properly only if every .rpt file in the folder have same header as HeaderFile
For temp_header = 0 To 1
'We loop through 2 different type of MPFs
'Typicaly coming from Retail DCS and Group DCS
If temp_header = Is_RetailDCS Then
temp_headerfile = HeaderFile1
ElseIf temp_header = Is_GroupDCS Then
temp_headerfile = HeaderFile2
Else
temp_headerfile = "" 'Error
End If
Dim objReaderHeader As New System.IO.StreamReader(temp_headerfile)
'We read line by line until the end of the file
Do While objReaderHeader.Peek() <> -1
TextLine = objReaderHeader.ReadLine()
'We only care about the header, which starts with the character "!" in prophet .rpt files
If Strings.Left(TextLine, 1) = "!" Then
Dim temp_scope As Integer
Dim temp_scope_id As Integer
'We split the line in an array delimited by comma
TextLineSplit = TextLine.Split(", ")
'We loop through the array
'Once a field match one of the field define in HeaderScope, we store the position of that field in HeaderMapId
temp_scope_id = 1
For Each s As String In TextLineSplit
For temp_scope = 1 To UBound(HeaderScope, 2)
If HeaderScope(Is_Name, temp_scope) = s Then
HeaderMapId(temp_header, temp_scope) = temp_scope_id
End If
Next
temp_scope_id = temp_scope_id + 1
Next
End If
Loop
Next
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'' Header''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Create the query for SQL table creation
Dim temp_field As Integer
For temp_field = 1 To HeaderNbOfField
If temp_field = 1 Then
HeaderStringSQL = "(Prophet_Name varchar(255),"
HeaderStringCSV = "Prophet_Name,"
End If
'We replace the bracket by underscore to avoid crashes when creating the SQL table
HeaderStringSQL = HeaderStringSQL & Replace(Replace(HeaderScope(Is_Name, temp_field), "(", "_"), ")", "_") & " " & HeaderScope(Is_Type, temp_field)
HeaderStringCSV = HeaderStringCSV & Replace(Replace(HeaderScope(Is_Name, temp_field), "(", "_"), ")", "_")
If temp_field <> HeaderNbOfField Then
HeaderStringSQL = HeaderStringSQL & ","
HeaderStringCSV = HeaderStringCSV & ","
Else
HeaderStringSQL = HeaderStringSQL & ")"
HeaderStringCSV = HeaderStringCSV & ","
End If
Next
If Enable_CSV_Output = "Yes" Then
'Remove braquets and single quotes
outFile.WriteLine(Replace(Replace(Replace(HeaderStringCSV, ")", ""), "(", ""), "'", ""))
End If
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'' Body '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' We loop through all the files and pick the information we need based on HeaderMapId
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'We loop through each file in the folder location
For Each file As String In files
Dim objReader As New System.IO.StreamReader(file)
Dim fileInfo As New IO.FileInfo(file)
'We only loop through the valid .rpt files
If CheckValidRPTFile(file, True) = True Then
'Count the number of files we go through
temp_count = temp_count + 1
'Count the number of lines in the file (called FileSize)
'Dim objReaderLineCOunt As New System.IO.StreamReader(file)
'FileSize = 0
'Do While objReaderLineCOunt.Peek() <> -1
'TextLine = objReaderLineCOunt.ReadLine()
'FileSize = FileSize + 1
'Loop
FileSize = 1000000
'temp_line = 1
''We loop through line by line for a given file
'Do While objReader.Peek() <> -1
Dim TextLines() As String = System.IO.File.ReadAllLines(file)
For Each TextLine2 In TextLines
'Update the Progress Bar
temp_full_count = temp_full_count + 1
If temp_full_count Mod 200 = 0 Then
ProgressBar.Value = Int(100 * temp_count / File_Count)
L_ProgressPC.Text = "Processing file " & fileInfo.Name & " " & temp_count & "/" & File_Count & " - line " & temp_line & "/" & FileSize
L_RecordsProcessed.Text = temp_full_count
Application.DoEvents()
End If
'We split the libe into an array using the comma delimiter
'TextLine = objReader.ReadLine()
TextLineSplit = TextLine2.Split(", ")
'Skip line that are not actual prophet records (skip header and first few lines)
If Strings.Left(TextLine2, 1) = "*" Then
'We loop through the number of field we wish to extract for the file
For temp_field = 1 To HeaderNbOfField
If temp_field = 1 Then
BodyString = "('" & Strings.Left(fileInfo.Name, Len(fileInfo.Name) - 4) & "',"
End If
If MPF_Type = "RetailDCS" Then
Is_MPF_Type = Is_RetailDCS
ElseIf MPF_Type = "GroupDCS" Then
Is_MPF_Type = Is_GroupDCS
Else
MsgBox("Is_MPF_Type value is nor recognized")
End
End If
'The array HeaderMapId tells us where to pick the information from the file
'This assumes that each file in the folder have same header as the 'HeaderFile'
If HeaderMapId(Is_MPF_Type, temp_field) = Is_Not_found Then
BodyString = BodyString & "98766789"
Else
BodyString = BodyString & TextLineSplit(HeaderMapId(Is_MPF_Type, temp_field) - 1)
End If
If temp_field <> HeaderNbOfField Then
BodyString = BodyString & ","
Else
BodyString = BodyString & ")"
End If
Next
'We replace double quotes with single quotes
BodyString = Replace(BodyString, """", "'")
'This Line is to add records to the .csv file
If Enable_CSV_Output = "Yes" Then
'Remove braquets and single quotes
outFile.WriteLine(Replace(Replace(Replace(BodyString, ")", ""), "(", ""), "'", ""))
End If
End If
temp_line = temp_line + 1
Next
TB_Runlog.Text = "Completed: " & fileInfo.Name & Environment.NewLine & TB_Runlog.Text
temp_file = temp_file + 1
End If
Next
outFile.Close()
ProgressBar.Value = Int(100 * temp_count / File_Count)
L_RecordsProcessed.Text = temp_full_count
Application.DoEvents()
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'' Upload to SQL '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' The goal of this code is to create the SQL table
' And push the .csv created into the SQL table using BULK INSERT function
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
L_ProgressPC.Text = "Creating SQL Table"
Application.DoEvents()
If Enable_SQL_Upload = "Yes" Then
'First query is to create the table
SqlCommandText1 = "CREATE TABLE [" & SQLDataBase & "].[analysis]." & SQLTableName & "_" & ts & " " & HeaderStringSQL
'Second query is to populate the data
SqlCommandText2 = "BULK INSERT [" & SQLDataBase & "].[analysis]." & SQLTableName & "_" & ts & " " & " FROM '" & SQLUploadFileName & "' WITH ( FIELDTERMINATOR = ',', ROWTERMINATOR = '\n' , FIRSTROW=2)"
SQLQueriesFile.WriteLine(SqlCommandText1)
SQLQueriesFile.WriteLine(SqlCommandText2)
SQLQueriesFile.Close()
Using connection As New SqlConnection("Data Source=" & SQLServer & ";Integrated Security=True;Connection Timeout=2000;Initial Catalog=" & SQLDataBase & ";")
connection.Open()
Dim command As New SqlCommand(SqlCommandText1, connection)
command.ExecuteNonQuery()
Dim command2 = New SqlCommand(SqlCommandText2, connection)
command2.CommandTimeout = 2000
command2.ExecuteNonQuery()
connection.Close()
End Using
End If
Timer_Array(temp_Run) = L_Timer.Text
Next 'temp_Run
RunLogFile.Close()
Timer1.Stop()
L_ProgressPC.Text = "Job completed"
One way to speed up your program is to mininize the number of access to disk. Right now, you are reading each file twice, line-by-line. Each file most likely fits in memory. So, what you can do, is read all lines of a file in memory, and then process its lines. This will be much faster.
Something like:
'We only loop through the valid .rpt files
If CheckValidRPTFile(file, True) = True Then
''Count the number of files we go through
'temp_count = temp_count + 1
''Count the number of lines in the file (called FileSize)
'Dim objReaderLineCOunt As New System.IO.StreamReader(file)
'FileSize = 0
'Do While objReaderLineCOunt.Peek() <> -1
' TextLine = objReaderLineCOunt.ReadLine()
' FileSize = FileSize + 1
'Loop
'temp_line = 1
''We loop through line by line for a given file
'Do While objReader.Peek() <> -1
Dim TextLines() As String = System.IO.File.ReadAllLines(file)
For Each TextLine In TextLines
'We split into an array using the comma delimiter
'TextLine = objReader.ReadLine()
TextLineSplit = TextLine.Split(", ")
I looked at your updated code. You have a few additional performance issues in your code.
Reading and writing to network shared folder files is not very efficient especially when there is a lot of back and forth access to the file because everything goes through the network rather than direct local drive access.
Probably the most inefficient part of your program is iterating through your 200 fields for each line of the files. Let's say we have 5000 lines per file on average, then this means 200 x 5000 x 284 = 284 millions iterations!
The efficient method for accessing network shared files is to read an entire file in memory using System.IO.ReadAllLines() or System.IO.File.ReadAllText(), and then process its contents. Similarly, writing to a network shared file should consist of building the file contents in memory (if possible) using StringBuilder or a List(Of String), and then write the entire file to the network share with System.IO.File.WriteAllText() of System.IO.File.WriteAllLines. This should be the preferred way of accessing network shared files for best performance.
For the second performance issue, the loop through fields can be simplified as follows.
Dim BodyStringBuilder As New StringBuilder("")
BodyStringBuilder.Append("('" & Strings.Left(fileInfo.Name, Len(fileInfo.Name) - 4) & "',")
If MPF_Type = "RetailDCS" Then
Is_MPF_Type = Is_RetailDCS
ElseIf MPF_Type = "GroupDCS" Then
Is_MPF_Type = Is_GroupDCS
Else
MsgBox("Is_MPF_Type value is nor recognized")
End
End If
'Skip line that are not actual prophet records (skip header and first few lines)
If Strings.Left(TextLine2, 1) = "*" Then
'We loop through the number of field we wish to extract for the file
For temp_field = 1 To HeaderNbOfField
'The array HeaderMapId tells us where to pick the information from the file
'This assumes that each file in the folder have same header as the 'HeaderFile'
If HeaderMapId(Is_MPF_Type, temp_field) = Is_Not_found Then
BodyStringBuilder.Append("98766789,")
Else
BodyStringBuilder.Append(TextLineSplit(HeaderMapId(Is_MPF_Type, temp_field) - 1) & ",")
End If
Next
' Replace last "," by ")".
BodyStringBuilder.Remove(BodyStringBuilder.Length - 1, 1).Append(")")
'We replace double quotes with single quotes
BodyString = Replace(BodyString.ToString, """", "'")
If I manually put my address in for EmailMessage.To.Add(GetDelimitedField(x, strEmailRep, ";")) It sends me the message just fine. However If I use the code as is below which is using a list that looks like ;email1#mail.com;email2.mail.com
Then it gives an error that email address cannot be blank
Somewhere in GetDelimitedField is erasing addresses. I'm not sure where the problem is actually occurring. Here is all the code involved with this.
strmsg = "LOW STOCK ALERT: Component (" & rsMPCS("MTI_PART_NO") & ") has reached or fallen below it's minimum quantity(" & rsMPCS("MIN_QTY") & ")."
Dim EmailMessage As MailMessage = New MailMessage
EmailMessage.From = New MailAddress("noreply#mail.com")
For x = 1 To GetCommaCount(strEmailRep) + 1
EmailMessage.To.Add(GetDelimitedField(x, strEmailRep, ";"))
Next
EmailMessage.Subject = ("LOW STOCK ALERT!")
EmailMessage.Body = strmsg
EmailMessage.Priority = MailPriority.High
EmailMessage.IsBodyHtml = True
Dim smtp As New SmtpClient("smtp.mycompany.com")
smtp.UseDefaultCredentials = True
smtp.Send(EmailMessage)
Public Function GetCommaCount(ByVal sText As String)
Dim X As Integer
Dim Count As Integer
Dim Look As String
For X = 1 To Len(sText)
Look = Microsoft.VisualBasic.Left(sText, X)
If InStr(X, Look, ";", 1) > 0 Then
Count = Count + 1
End If
Next
GetCommaCount = Count
End Function
Public Function GetDelimitedField(ByRef FieldNum As Short, ByRef DelimitedString As String, ByRef Delimiter As String) As String
Dim NewPos As Short
Dim FieldCounter As Short
Dim FieldData As String
Dim RightLength As Short
Dim NextDelimiter As Short
If (DelimitedString = "") Or (Delimiter = "") Or (FieldNum = 0) Then
GetDelimitedField = ""
Exit Function
End If
NewPos = 1
FieldCounter = 1
While (FieldCounter < FieldNum) And (NewPos <> 0)
NewPos = InStr(NewPos, DelimitedString, Delimiter, CompareMethod.Text)
If NewPos <> 0 Then
FieldCounter = FieldCounter + 1
NewPos = NewPos + 1
End If
End While
RightLength = Len(DelimitedString) - NewPos + 1
FieldData = Microsoft.VisualBasic.Right(DelimitedString, RightLength)
NextDelimiter = InStr(1, FieldData, Delimiter, CompareMethod.Text)
If NextDelimiter <> 0 Then
FieldData = Microsoft.VisualBasic.Left(FieldData, NextDelimiter - 1)
End If
GetDelimitedField = FieldData
End Function
You can split the list easier using string.Split:
Dim strEmails = "a#test.com;b#test.com;c#test.com;"
Dim lstEmails = strEmails.Split(";").ToList()
'In case the last one had a semicolon:
If (lstEmails(lstEmails.Count - 1).Trim() = String.Empty) Then
lstEmails.RemoveAt(lstEmails.Count - 1)
End If
If (lstEmails.Count > 0) Then
lstEmails.AddRange(lstEmails)
End If
Code for reading and splitting from file:
Public Sub LoadAccount()
currentfilereader = New StreamReader(filename)
Dim Seperator As Char = " "c
For count As Integer = 0 To NumUsers - 1
textstring = currentfilereader.ReadLine
Dim words() As String = currentfilereader.ReadLine.Split(Seperator)
Username = words(0)
Password = words(1)
If words(2) = "1" Then
AccessGranted = True
Else
AccessGranted = False
End If
Users(count, 0) = Username
Users(count, 1) = Password
Users(count, 2) = AccessGranted
Next
currentfilereader.Close()
End Sub
Code for logging in:
Public Sub Login()
Dim InvalidUsername, InvalidPassword As Boolean
InvalidUsername = True
InvalidPassword = True
LoginName = Form1.tbun.Text
LoginPassword = Form1.tbpw.Text
For count As Integer = 0 To NumUsers - 1
If LoginName = Users(count, 0) Then
InvalidUsername = False
If LoginPassword = Users(count, 1) Then
InvalidPassword = False
CurrentUsername = LoginName
CurrentPassword = LoginPassword
CurrentAccessGranted = Users(count, 2)
loggedin = True
Else
MsgBox("Invalid Password")
End If
Else
MsgBox("Invalid Username")
End If
Next
End Sub
Code for calculating number of users:
Public Sub NumberOfUsers()
currentfilereader = New StreamReader(filename)
NumUsers = File.ReadAllLines("Accounts.txt").Length
MsgBox("There are " & NumUsers & " users")
End Sub
I have added a MsgBox to show the number of users to make sure all is working fine which returns the value of 2, since I currently have 2 lines in the text file, "a a 1" and "b b 1".
However when this line runs, Dim words() As String = currentfilereader.ReadLine.Split(Seperator), it returns null.
The purpose of subtracting 1 from the NumUsers in the count is since the count starts at zero along with the array. Meaning that if I didn't it would check 3 times if there is only 2 users in the file. But I just can't seem to figure out what is wrong and why it is returning null.
You call ReadLine twice for each user:
textstring = currentfilereader.ReadLine
Dim words() As String = currentfilereader.ReadLine.Split(Seperator)
This means that for the first user you read both lines, and for the second user you read nothing, leading to the empty split array.
Replace
Dim words() As String = currentfilereader.ReadLine.Split(Seperator)
With
Dim words() As String = textstring.Split(Seperator)
I am trying to make a program that requires the user to sign in and if the right username and password is entered then open a form for that user. The way I have it done just now is the username is stored in a variable and I was wondering how I can use the text from the variable in the form name for example:
UserName = User1 then
FrmUser1.show
UserName = JohnSmith then
FrmJohnSmith.show
Here is an excerpt from the code I have so far:
Dim Path As String = "Account_File.text "
Dim Read_File() As String = File.ReadAllLines(Path)
Dim NoOfLines As Integer = Read_File.Length
Dim UserName(NoOfLines) As String
Dim Password(NoOfLines) As String
Dim LastNonEmpty As Integer = -1
Dim InputUserName As String = ""
Dim InputPassword As String = ""
If TxtUserName.Text = "" Then
MsgBox("Please enter a username.")
GoTo InvalidDetails
Else
InputUserName = TxtUserName.Text
End If
If TxtPassword.Text = "" Then
MsgBox("Please enter a Password.")
GoTo InvalidDetails
Else
InputPassword = TxtPassword.Text
End If
For i = 0 To NoOfLines
Dim SplitString() As String = Split(Read_File(i))
For j As Integer = 0 To SplitString.Length - 1
If SplitString(j) <> "" Then
LastNonEmpty += 1
SplitString(LastNonEmpty) = SplitString(j)
End If
Next
ReDim Preserve SplitString(LastNonEmpty)
LastNonEmpty = -1
UserName(i) = SplitString(0)
Password(i) = SplitString(1)
Next
For i = 0 To NoOfLines
If UserName(i) = InputUserName And Password(i) = InputPassword Then
frm.show()
Else
MsgBox("Either the username or password is incorrect.")
End If
Next
Some sample code:
Dim formname = "Form" + TxtUserName.Text
Dim typename = Me.GetType().Namespace + "." + formname
Dim type = Me.GetType().Assembly.GetType(typename)
If type IsNot Nothing Then
Dim form = CType(Activator.CreateInstance(type), Form)
form.Show()
End If
Which assumes that all forms live in the same assembly and have the same namespace. Do keep in mind that your customer isn't very likely to be thrilled to have to call you every single time he gets a new employee. Or for that matter is going to like you asking for username + password without providing the kind of security guarantees that a Windows logon already provides. Don't do this.
i am getting this problem in some systems, some systems working properly, here my code is,
Dim fileName As String = "FaultTypesByMonth.csv"
Using writer As IO.StreamWriter = New IO.StreamWriter(fileName, True, System.Text.Encoding.Default) '------------ rao new ----
Dim Str As String
Dim i As Integer
Dim j As Integer
Dim headertext1(rsTerms.Columns.Count) As String
Dim k As Integer = 0
Dim arrcols As String = Nothing
For Each column As DataColumn In TempTab.Columns
arrcols += column.ColumnName.ToString() + ","c
k += 1
Next
writer.WriteLine(arrcols)
For i = 0 To (TempTab.Rows.Count - 1)
For j = 0 To (TempTab.Columns.Count - 1)
If j = (TempTab.Columns.Count - 1) Then
Str = (TempTab.Rows(i)(j).ToString)
Else
Str = (TempTab.Rows(i)(j).ToString & ",")
End If
writer.Write(Str)
Next
writer.WriteLine()
Next
writer.Close()
writer.Dispose()
End Using
Dim FileToDelete As String = Nothing
Dim sd As New SaveFileDialog
sd.Filter = "CSV Files (*.csv)|*.csv"
sd.FileName = "FaultTypesByMonth"
If sd.ShowDialog = Windows.Forms.DialogResult.OK Then
FileCopy(fileName, sd.FileName)
MsgBox(" File Saved in selected path")
FileToDelete = fileName
If System.IO.File.Exists(FileToDelete) = True Then
System.IO.File.Delete(FileToDelete)
End If
End If
FileToDelete = fileName
If System.IO.File.Exists(FileToDelete) = True Then
System.IO.File.Delete(FileToDelete)
End If
when i am trying to save this file in desired path, then i am getting this error.
if save in shared folder i am not getting this error
system.io.ioexception the process cannot access because it is being used by another process...
what i am doing wrong,Help me