I have an Excel sheet and it's humongous: around 200,000 rows that need to be processed.
All I have to do is read it and process each row with a query against a DB2 table. I have written the program, but it takes more than 8 hours to process 5,000 rows.
Is there a way I can read the Excel file and execute the query simultaneously? I want the two to be independent of each other. I cannot use Parallel.For, since creating that many thread instances just for reading gives no advantage, and pipes and queues have been of no use. This uses the DOM method, and it does not read a row, it reads a string; if there is a null value in the row, it throws a null exception when the row is executed. I am comfortable with BackgroundWorkers and the TPL. Any idea or code would be appreciated. No DLLs can be used apart from Open XML.
Ideally I do not want to add to an array; I want the values in two different variables and to process them as they are read.
Read a row (only 2 columns, ignore the other columns).
Create a thread to execute the query for that row and, in parallel, read the next row.
Merge the results into one single table.
Display the results. Sounds simple, but there are challenges.
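For what it's worth, that read/query split maps onto a producer/consumer pattern. The sketch below is not the code from this question: it assumes a BlockingCollection handing (date, TCID) pairs from one reader task to a few query workers, and ReadRowsFromExcel and QueryDb2 are hypothetical stand-ins for the Open XML reading and the ProcessQueryODBC call shown further down. Merging into a single DataTable would still need to happen on one thread or under a lock, since DataTable is not thread-safe for concurrent writes.
Imports System.Collections.Concurrent
Imports System.Collections.Generic
Imports System.Threading.Tasks

Module PipelineSketch
    'Bounded queue so the reader cannot run arbitrarily far ahead of the DB2 workers.
    Private ReadOnly Pairs As New BlockingCollection(Of Tuple(Of String, String))(boundedCapacity:=1000)

    Sub RunPipeline()
        'Producer: stream the two columns out of the sheet and enqueue them as they are read.
        Dim reader = Task.Run(Sub()
                                  For Each pair In ReadRowsFromExcel("C:\temp\big.xlsx")
                                      Pairs.Add(pair)
                                  Next
                                  Pairs.CompleteAdding()
                              End Sub)

        'Consumers: a handful of workers drain the queue and run the DB2 query for each pair.
        Dim workers(3) As Task
        For w As Integer = 0 To workers.Length - 1
            workers(w) = Task.Run(Sub()
                                      For Each pair In Pairs.GetConsumingEnumerable()
                                          QueryDb2(pair.Item1, pair.Item2)
                                      Next
                                  End Sub)
        Next

        reader.Wait()
        Task.WaitAll(workers)
    End Sub

    'Hypothetical: yield (date, TCID) pairs from the worksheet via the Open XML SDK.
    Private Function ReadRowsFromExcel(path As String) As IEnumerable(Of Tuple(Of String, String))
        Return New List(Of Tuple(Of String, String))()
    End Function

    'Hypothetical: stands in for ProcessQueryODBC and the DataTable merging.
    Private Sub QueryDb2(dateText As String, tcid As String)
    End Sub
End Module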
Try
    Using spreadsheetDocument As SpreadsheetDocument = SpreadsheetDocument.Open(fileName, False)
        Dim workbookPart As WorkbookPart = spreadsheetDocument.WorkbookPart
        Dim worksheetPart As WorksheetPart = workbookPart.WorksheetParts.First()
        Dim sheetData As SheetData = worksheetPart.Worksheet.Elements(Of SheetData)().First()
        For Each r As Row In sheetData.Elements(Of Row)()
            For Each c As Cell In r.Elements(Of Cell)()
                Dim text As String = Nothing
                Try
                    'CellValue is missing for empty cells, which is what this Try/Catch swallows
                    text = c.CellValue.Text
                    Debug.Print(text)
                    If text IsNot Nothing AndAlso Trim(text).Length > 0 Then
                        Arr.Add(text)
                    End If
                    j += 1
                Catch
                End Try
            Next
        Next
    End Using
Catch ex As Exception
    MsgBox("Exception caught: " & ex.Message)
    Debug.Print(ex.Message)
    End
End Try
myArr = CType(Arr.ToArray(GetType(String)), String())
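As an aside, the null exceptions mentioned above usually come from cells that have no CellValue at all, or whose value lives in the shared-string table rather than in the cell itself. A small helper along these lines (a sketch; it assumes the usual DocumentFormat.OpenXml.Packaging and DocumentFormat.OpenXml.Spreadsheet imports) avoids the per-cell Try/Catch:
'Sketch: return Nothing for empty cells and resolve shared strings explicitly.
Private Function GetCellText(workbookPart As WorkbookPart, cell As Cell) As String
    If cell Is Nothing OrElse cell.CellValue Is Nothing Then Return Nothing
    Dim raw As String = cell.CellValue.Text
    If cell.DataType IsNot Nothing AndAlso cell.DataType.Value = CellValues.SharedString Then
        'For shared strings the cell stores an index into the workbook's shared-string table.
        Dim sst = workbookPart.SharedStringTablePart.SharedStringTable
        Return sst.ChildElements(Integer.Parse(raw)).InnerText
    End If
    Return raw
End Function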
This is the process that divides the data into two parameters:
For i As Integer = 2 To myArr.Count - 1 Step 2
    If i And 1 Then
        i = i - 1
    Else
        dstr = DateTime.FromOADate(Double.Parse(myArr(i).ToString())).ToShortDateString()
        'Debug.Print(dstr.ToString & "----->" & i.ToString & "TCID--->" & myArr(i + 1).ToString)
        DQueue.Enqueue(DateTime.FromOADate(Double.Parse(myArr(i).ToString())).ToShortDateString())
        Tqueue.Enqueue((myArr(i + 1).ToString()))
        TCArr.Add((myArr(i + 1).ToString()))
        dc.Merge(ProcessQueryODBC(dstr, myArr(i + 1).ToString))
        If dc.Rows.Count > 0 Then
            dt.Merge(dc)
        Else
            nFound.Merge(CreateDT(dstr, myArr(i + 1).ToString()))
        End If
    End If
Next
Instead of opening a DB connection through ODBC, can you export your data to a CSV file and then let DB2 perform the import?
somestring = "import from ""myfile.csv"" of DEL ...."
DoCmd.RunSQL somestring
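If you try that route, the CSV can be produced straight from the pairs already pulled out of the sheet. A rough sketch reusing the myArr from the question (the output path is an arbitrary choice); DB2 can then load the file in one bulk operation with something like IMPORT FROM myfile.csv OF DEL INSERT INTO yourtable instead of 200,000 single-row queries:
'Sketch: dump the date/TCID pairs to a comma-delimited file for DB2's IMPORT ... OF DEL.
Using csv As New System.IO.StreamWriter("C:\temp\myfile.csv", False)
    For i As Integer = 2 To myArr.Length - 2 Step 2
        Dim dstr = DateTime.FromOADate(Double.Parse(myArr(i))).ToShortDateString()
        csv.WriteLine(dstr & "," & myArr(i + 1))
    Next
End Using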
I coded a simple program that reads a text file line by line, and if the current line contains alphabetic characters (a-z, A-Z) it writes that line into another txt file.
If the current line doesn't contain alphabetic characters, it won't write that line into the new text file.
I created this because I have members registering at my website and some of them use only numbers as a username. I will filter them out and only keep the alphabetic names. (Please focus on this project; I know I could just use PHP for this.)
That works great already, but it takes a while to read line by line and write into the other text file (write speed about 150 KB in 1 minute, and it's not my drive, I have a fast SSD).
So I wonder if there is a faster way. I could use ReadAllLines first, but on large files that just freezes my program, so I don't know if that works either (I want to focus on large 1 GB+ files).
This is my code so far:
If System.IO.File.Exists(FILE_NAME) = True Then
    Dim objReader As New System.IO.StreamReader(FILE_NAME)
    Do While objReader.Peek() <> -1
        Dim myFile As New FileInfo(output)
        Dim sizeInBytes As Long = myFile.Length
        If sizeInBytes > splitvalue Then
            outcount += 1
            output = outputold + outcount.ToString + ".txt"
            File.Create(output).Dispose()
        End If
        count += 1
        TextLine = objReader.ReadLine() & vbNewLine
        Console.WriteLine(TextLine)
        If CheckForAlphaCharacters(TextLine) Then
            File.AppendAllText(output, TextLine)
        Else
            found += 1
            Label2.Text = "Removed: " + found.ToString
            TextBox1.Text = TextLine
        End If
        Label1.Text = "Checked: " + count.ToString
    Loop
    MessageBox.Show("Finish!")
End If
First of all, as hinted by Sean Skelly, updating UI controls repeatedly is an expensive operation.
But your bigger problem is File.AppendAllText:
If CheckForAlphaCharacters(TextLine) Then
    File.AppendAllText(output, TextLine)
Else
    found += 1
    Label2.Text = "Removed: " + found.ToString
    TextBox1.Text = TextLine
End If
AppendAllText(String, String)
Opens a file, appends the specified string to the file, and then closes the file. If the file does not exist, this method creates a file, writes the specified string to the file, then closes the file.
Source
You are repeatedly opening and closing a file, which causes overhead. AppendAllText is a convenience method: it performs several operations in a single call, but you can now see why it doesn't perform well inside a big loop.
The fix is easy: open the file once when you start your loop and close it at the end. Make sure you always close the file properly, even when an exception occurs; for that, either call Close in a Finally block or keep your file-write operations inside a Using block.
You could remove the print to the console as well, since display management has a cost too, or only print status updates every 10K lines or so.
When you've done all that, you should notice improved performance.
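In other words, something of this shape (a bare sketch using the names from the question; the poster's full final version follows below):
'Sketch: one reader and one writer, each opened once; UI updated only every 10,000 lines.
Using objReader As New System.IO.StreamReader(FILE_NAME), sw As New System.IO.StreamWriter(output, True)
    Dim TextLine As String = objReader.ReadLine()
    While TextLine IsNot Nothing
        count += 1
        If CheckForAlphaCharacters(TextLine) Then
            sw.WriteLine(TextLine)
        Else
            found += 1
        End If
        If count Mod 10000 = 0 Then Label1.Text = "Checked: " & count.ToString()
        TextLine = objReader.ReadLine()
    End While
End Using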
My final code. It works a lot faster now (500 MB in 1 minute):
Using sw As StreamWriter = File.CreateText(output)
    For Each oneLine As String In File.ReadLines(FILE_NAME)
        Try
            If changeme = True Then
                changeme = False
                GoTo Again2
            End If
            If oneLine.Contains(":") Then
                Dim TestString = oneLine.Substring(0, oneLine.IndexOf(":")).Trim()
                Dim TestString2 = oneLine.Substring(oneLine.IndexOf(":")).Trim()
                If CheckForAlphaCharacters(TestString) = False And CheckForAlphaCharacters(TestString2) = False Then
                    sw.WriteLine(oneLine)
                Else
                    found += 1
                End If
            ElseIf oneLine.Contains(";") Or oneLine.Contains("|") Or oneLine.Contains(" ") Then
                Dim oneLineReplac As String = oneLine.Replace(" ", ":")
                Dim oneLineReplace As String = oneLineReplac.Replace("|", ":")
                Dim oneLineReplaced As String = oneLineReplace.Replace(";", ":")
                If oneLineReplaced.Contains(":") Then
                    Dim TestString3 = oneLineReplaced.Substring(0, oneLineReplaced.IndexOf(":")).Trim()
                    Dim TestString4 = oneLineReplaced.Substring(oneLineReplaced.IndexOf(":")).Trim()
                    If CheckForAlphaCharacters(TestString3) = False And CheckForAlphaCharacters(TestString4) = False Then
                        sw.WriteLine(oneLineReplaced)
                    Else
                        found += 1
                    End If
                Else
                    errors += 1
                    textstring = oneLine
                End If
            Else
                errors += 1
                textstring = oneLine
            End If
            count += 1
        Catch
            errors += 1
            textstring = oneLine
        End Try
    Next
End Using
I have an Excel file with a lot of PC names on a server. I want to execute the systeminfo command for each one, extract the OS from the output, and put the OS into an Excel cell automatically. To do so, I used the following code in the VBA module and in the batch file, respectively.
However, whenever the server can't reach a PC, the cmd window is stuck until I manually close it. Since the list is actually 148 names long, knowing a way to automatically close those windows after, say, 8 seconds would be really helpful.
I tried to look for a way to multi-thread VBA, only to find out that it is a single-threaded language. I then tried to start a second batch file alongside the one I'm actually using, to forcefully kill it after a set amount of time, but it seems the second batch starts only after the first has terminated, making it useless.
VBA
Sub Test()
'
' Test Macro
' I'm not an expert in VBA, I just picked it up for this task, so a lot of the code will be redundant. Bear with me.
'
'
    Dim i As Integer
    'a is basically i - 1.
    a = 1
    'I needed 148 cells for the project
    Dim models(1 To 147) As String
    For i = 2 To 148
        models(a) = Cells(i, 3).Value
        a = a + 1
    Next i
    a = 1
    For i = 2 To 148
        'Not totally sure what the next five lines actually do, but "metodo" is the name of the batch file.
        Dim strShellCommand As String
        strShellCommand = "C:\Users\Administrator\Desktop\metodo.bat " + models(a)
        Set oSh = CreateObject("WScript.Shell")
        Set oEx = oSh.Exec(strShellCommand)
        strBuf = oEx.StdOut.ReadAll
        'I took out of the string everything that wasn't purely the OS name
        Dim FinalString As String
        FinalString = Right(strBuf, 26)
        FinalString = Left(FinalString, 25)
        'This is the line that prints the OS names into Excel cells
        ActiveSheet.Cells(i, 10) = FinalString
        a = a + 1
    Next i
End Sub
Then there is the batch file:
set nome=%1
shift
systeminfo /s %nome% |findstr /c:"Microsoft Windows "
You can add a control loop after the Set oEx = oSh.Exec(strShellCommand) line, like this:
Set oEx = oSh.Exec(strShellCommand)
TimeOut = 8        'seconds to wait before giving up on the process
LoopCount = 0
Do 'Control loop
    Application.Wait Now + TimeValue("0:00:01")   'VBA has no WScript.Sleep, so wait one second this way
    LoopCount = LoopCount + 1
Loop Until (oEx.Status <> 0) Or (LoopCount >= TimeOut)
If oEx.Status = 0 Then 'Timeout occurred
    oEx.Terminate
    ReturnValue = "[Process terminated after timeout!]" & vbCrLf
Else
    ReturnValue = "[Process completed]" & vbCrLf
End If
Each pass through the loop takes one second (the Application.Wait call), so with TimeOut = 8 the process gets roughly 8 seconds before it is terminated.
Good luck.
I have a program that saves data from a simple program on our shop floor. The issue I am having is that I am getting duplicate entries, I believe due to collisions when two different people write to the file at the same time.
Do While done = 0 And attempt < 1
    done = 1
    Try
        Using theWriter As New System.IO.StreamWriter(ReportText, True)
            For Each currentrow As DataGridViewRow In Me.ReportGrid.Rows
                ' The Cells are the Columns
                For Each currentcolumn As DataGridViewCell In currentrow.Cells
                    theWriter.Write(currentcolumn.Value & vbTab)
                Next
                theWriter.WriteLine()
                Me.ReportGrid.Rows.Remove(currentrow) 'the new line added
            Next
            theWriter.Close()
        End Using
    Catch When Err.Number = 75
        wait(2000)
        done = 0
        attempt = attempt + 1
    Catch When Err.Number <> 53
        Notify()
    End Try
Loop
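One common way to deal with this kind of collision is to open the file for exclusive access, so that a second writer fails immediately with an IOException, and then wait and retry. A rough sketch (retry count and delay are arbitrary; it reuses the grid-writing loop from the question):
Dim written As Boolean = False
For attempt As Integer = 1 To 5
    Try
        'FileShare.None: only one station can have the report open for writing at a time.
        Using fs As New System.IO.FileStream(ReportText, System.IO.FileMode.Append, System.IO.FileAccess.Write, System.IO.FileShare.None)
            Using theWriter As New System.IO.StreamWriter(fs)
                For Each currentrow As DataGridViewRow In Me.ReportGrid.Rows
                    For Each currentcolumn As DataGridViewCell In currentrow.Cells
                        theWriter.Write(currentcolumn.Value & vbTab)
                    Next
                    theWriter.WriteLine()
                Next
            End Using
        End Using
        written = True
        Exit For
    Catch ex As System.IO.IOException
        'Another writer has the file locked: back off briefly and try again.
        System.Threading.Thread.Sleep(2000)
    End Try
Next
If Not written Then Notify()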
I'm a bit of a newbie, so any advice would be great. I have a program that opens a CSV and then saves it as a CSV with a different name. There will be a set of rules to change fields, but I haven't got that far yet.
When I run this on a small CSV file (about 4 columns and rows) it works fine, but with a larger file it fails with the error above. I'm sure it's something daft, but I'm at a loss.
Thanks,
Dean
Dim FileName = tbOpen.Text
Dim fileout = tbSave.Text
Dim lines = File.ReadAllLines(FileName)
Dim output As New List(Of String)
For Each line In lines
    Dim fields = line.Split(","c)
    If fields(1) = "" Then 'This is where the error is triggered
        fields(1) = "Norman"
    End If
    If fields(3) = "" Then
        fields(3) = "Blue Leather"
    End If
    If fields(4) = "" Then
        fields(3) = "Interlined"
    End If
    output.Add(String.Join(","c, fields))
Next
File.WriteAllLines(fileout, output)
Try
    Dim a As String = My.Computer.FileSystem.ReadAllText(tbSave.Text)
    Dim b As String() = a.Split(vbNewLine)
    ListBox2.Items.AddRange(b)
Catch ex As Exception
    MsgBox("error")
End Try
Keep in mind that arrays start at index 0. VB is notorious for stretching this concept: normally, an array with 10 elements has indexes from 0 to 9, but in VB a declaration like Dim arr(10) gives you indexes from 0 to 10, which is actually 11 elements.
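For what it's worth, an index-out-of-range error on the marked line can also simply mean that a row (often a blank line at the end of the file) split into fewer fields than the code expects, so guarding the access avoids it. A sketch using the same variable names as the question:
For Each line In lines
    Dim fields = line.Split(","c)
    If fields.Length < 5 Then
        output.Add(line)   'too few columns (e.g. a trailing blank line): pass the row through untouched
        Continue For
    End If
    If fields(1) = "" Then fields(1) = "Norman"
    If fields(3) = "" Then fields(3) = "Blue Leather"
    If fields(4) = "" Then fields(4) = "Interlined"   'note: the question assigns fields(3) here, which may be a typo
    output.Add(String.Join(",", fields))
Next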
Can someone help me with this code?
I have a DataGridView with 2 columns, and what I want to do is use PsTools' PsLoggedOn command to get the name of every person logged in and append that result to the "LOGGED_IN" column. The problem is that if there is no user logged into a PC, the process takes about 5 minutes to post an error message.
Now, what I want is this: if half a second has gone by, just forget the row it's currently querying and move on to the next row in the column.
Here is the VB.NET code I want to focus on:
Dim RowCount As Integer = datagridView1.RowCount
For i = 0 To RowCount - 2
    'PERFORM PSLOGGEDON ROUTINE
    Dim Proc1 As New Process
    Proc1.StartInfo = New ProcessStartInfo("psloggedon")
    Proc1.StartInfo.Arguments = "-l \\" & datagridView1.Rows(i).Cells(0).Value & ""
    Proc1.StartInfo.RedirectStandardOutput = True
    Proc1.StartInfo.UseShellExecute = False
    Proc1.StartInfo.CreateNoWindow = True
    Proc1.Start()
    'INSERT RESULTS IN LOGGED_IN COLUMN
    datagridView1.Rows(i).Cells(1).Value = Proc1.StandardOutput.ReadToEnd
Next
Can someone please show me how to write the code to get that done?
Use the Process.WaitForExit(milliseconds) overload:
Instructs the Process component to wait the specified number of milliseconds for the associated process to exit.
Return value: Boolean. Returns true if the associated process has exited; otherwise, false.
You can then use Process.Kill to kill the process if it did not exit in the given time.
Something like this:
Dim RowCount As Integer = datagridView1.RowCount
For i = 0 To RowCount - 2
    'PERFORM PSLOGGEDON ROUTINE
    Dim Proc1 As New Process
    Proc1.StartInfo = New ProcessStartInfo("psloggedon")
    Proc1.StartInfo.Arguments = "-l \\" & datagridView1.Rows(i).Cells(0).Value & ""
    Proc1.StartInfo.RedirectStandardOutput = True
    Proc1.StartInfo.UseShellExecute = False
    Proc1.StartInfo.CreateNoWindow = True
    Proc1.Start()
    'Kill the process if it has not finished within 5 seconds
    If Not Proc1.WaitForExit(5000) Then
        Proc1.Kill()
    End If
    'INSERT RESULTS IN LOGGED_IN COLUMN
    datagridView1.Rows(i).Cells(1).Value = Proc1.StandardOutput.ReadToEnd
Next