Read several lines from file at once - vb.net

The following code works just fine:
Function ReadLines(FN As String, n As Integer) As String()
Dim T(0 To n) As String
With My.Computer.FileSystem.OpenTextFileReader(FN)
For i As Integer = 0 To n
T(i) = .ReadLine()
Next
End With
Return T
End Function
However, if the file is located on a distant server, this might prove horrendously slow. Is there a way to do the same more efficiently? I could read the whole file at once too, but this also is fairly inefficient...

The BufferedStream class is designed to reduce the number of system I/O calls when a file is read (or written) sequentially, so wrapping the file stream in one should make your reads more efficient:
Function ReadLines(FN As String, n As Integer) As String()
Using fs As FileStream = File.Open(FN, FileMode.Open, FileAccess.Read, FileShare.ReadWrite) ' requires Imports System.IO
Using bs As New BufferedStream(fs)
Using sr As New StreamReader(bs)
Dim lines = New List(Of String)(n)
For i As Integer = 0 To n
Dim line As String = sr.ReadLine()
If (line Is Nothing) Then
Exit For
End If
lines.Add(line)
Next
Return lines.ToArray()
End Using
End Using
End Using
End Function

Dim lines = IO.File.ReadLines(fileName).
Skip(firstLineIndex).
Take(lineCount).
ToArray()
Unlike the File.ReadAllLines method, File.ReadLines doesn't read the whole file in a single operation. It exposes the contents of the file as an IEnumerable(Of String) but doesn't read anything from the file until you enumerate it. That code will have to read every line up to the last one you want but won't read anything beyond that.
To be honest, I'm not sure how much quicker it will be though, because it might actually use StreamReader.ReadLine under the hood. It's worth testing though. The alternative would be to just read larger chunks of the file and break it into lines yourself, stopping when you've read at least as much as you need.
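The "read larger chunks and split them yourself" alternative could look roughly like the sketch below. The names ReadFirstLines and chunkSize are made up for illustration, and the 64K default is an arbitrary assumption, not a measured optimum. Whether this actually beats ReadLine over a network share would need to be measured.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Text

Module ChunkedLineReader
    ' Hypothetical helper: returns up to lineCount lines, reading the file in
    ' chunkSize-character blocks and splitting on CR, LF or CR LF by hand.
    Function ReadFirstLines(fileName As String, lineCount As Integer,
                            Optional chunkSize As Integer = 65536) As String()
        Dim lines As New List(Of String)(lineCount)
        Dim current As New StringBuilder()
        Dim buffer(chunkSize - 1) As Char
        Dim lastWasCr As Boolean = False

        ' The last argument asks the reader to use a larger internal buffer too.
        Using reader As New StreamReader(fileName, Encoding.UTF8, True, chunkSize)
            While lines.Count < lineCount
                Dim charsRead As Integer = reader.Read(buffer, 0, buffer.Length)
                If charsRead = 0 Then Exit While ' end of file

                For i As Integer = 0 To charsRead - 1
                    Dim c As Char = buffer(i)
                    If c = ControlChars.Lf AndAlso lastWasCr Then
                        ' Second half of a CR LF pair; the line was already added.
                        lastWasCr = False
                        Continue For
                    End If
                    lastWasCr = (c = ControlChars.Cr)
                    If c = ControlChars.Cr OrElse c = ControlChars.Lf Then
                        lines.Add(current.ToString())
                        current.Clear()
                        If lines.Count = lineCount Then Exit For
                    Else
                        current.Append(c)
                    End If
                Next
            End While
        End Using

        ' Keep a final line that has no trailing line break.
        If lines.Count < lineCount AndAlso current.Length > 0 Then
            lines.Add(current.ToString())
        End If
        Return lines.ToArray()
    End Function
End Module
```

Reading stops as soon as the requested number of lines has been collected, so at most one chunk beyond the last needed line is fetched.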

Related

Search/Replace in VB.NET

Been following the threads for sometime as a novice (more of a newbie) but am now starting to do more.
I can read how to open a text file but am having trouble understanding the .replace functionality (I get the syntax just can't get it to work).
Scenario:
Inputfile name = test_in.txt
replace {inputfile} with c:\temp\test1.txt
I'm using test.txt as a template for a scripting tool and need to replace various values within to a new file called test_2.txt.
I've got variables defining the input and output files without a problem, I just can't catch the syntax for opening the new file and replacing.
You really have not given us that much to go on. But a common mistake in using String.Replace is that it makes a copy of the source which needs to be saved to another variable or else it will go into the bit bucket. So in your case, something like this should work.
Dim Buffer As String 'buffer
Dim inputFile As String = "C:\temp\test.txt" 'template file
Dim outputFile As String = "C:\temp\test_2.txt" 'output file
Using tr As TextReader = File.OpenText(inputFile)
Buffer = tr.ReadToEnd
End Using
Buffer = Buffer.Replace("templateString", "Hello World")
File.WriteAllText(outputFile, Buffer)
Try something like this:
Dim sValuesToReplace() As String = New String() {"Value1", "Value2", "Value3"}
Dim sText As String = IO.File.ReadAllText(inputFilePath)
For Each elem As String In sValuesToReplace
sText = sText.Replace(elem, sNewValue)
Next
IO.File.WriteAllText(sOutputFilePath, sText)
It depends if you want to replace all values with only one value, or with different values for each. If you need different values you can use a Dictionary:
Dim sValuesToReplace As New Dictionary(Of String, String)()
sValuesToReplace.Add("oldValue1", "newValue1")
sValuesToReplace.Add("oldValue2", "newValue2")
'etc
And then loop through it with:
For Each oldElem As String In sValuesToReplace.Keys
sText = sText.Replace(oldElem, sValuesToReplace(oldElem))
Next

Visual Basic Read File Line by Line storing each Line in str

I am trying to loop through the contents of a text file, reading it line by line. During the loop there are several points where I need to use the file's contents.
Dim xRead As System.IO.StreamReader
xRead = File.OpenText(TextBox3.Text)
Do Until xRead.EndOfStream
Dim linetext As String = xRead.ReadLine
Dim aryTextFile() As String = linetext.Split(" ")
Dim firstname As String = Val(aryTextFile(0))
TextBox1.Text = firstname.ToString
Dim lastname As String = Val(aryTextFile(0))
TextBox2.Text = lastname.ToString
Loop
Edit: What I am trying to do is read say the first five items in a text file perform some random processing then read the next 5 lines of the text file.
I would like to be able to use the lines pulled from the text file as separated string variables.
It is not clear why you would need to hold 5 lines at a time, since your code sample only processes one line at a time. If you think that reading 5 lines at once will be faster, this is unlikely: .NET buffers file reads internally, so both approaches will probably perform the same. Reading one line at a time is the simpler pattern, so look into that first.
Still, here is an approximate version of the code that does processing every 5 lines:
Sub Main()
Dim bufferMaxSize As Integer = 5
Using xRead As New System.IO.StreamReader(TextBox3.Text)
Dim buffer As New List(Of String)
Do Until xRead.EndOfStream
If buffer.Count < bufferMaxSize Then
buffer.Add(xRead.ReadLine)
Continue Do
Else
PerformProcessing(buffer)
buffer.Clear()
End If
Loop
If buffer.Count > 0 Then
'whatever is still in the buffer when the file ends has not been
'processed yet: this is the last batch of 1-5 lines,
'which also needs to be processed
PerformProcessing(buffer)
End If
End Using
End Sub
Here is mine. Really easy. It just reads each location from the file and copies the Copy1 folder to those locations. This is my first program :) Really proud of it.
Imports System.IO
Module Module1
Sub Main()
For Each Line In File.ReadLines("C:\location.txt")
My.Computer.FileSystem.CopyDirectory("C:\Copy1", Line, True)
Next
Console.WriteLine("Done")
Console.ReadLine()
End Sub
End Module

VB.NET Speed up cycle function

I have this function to write bytes in a bin file.
Public Shared Function writeFS(path As String, count As Integer) As Integer
Using writer As New BinaryWriter(File.Open(path, FileMode.Open, FileAccess.Write, FileShare.Write), Encoding.ASCII)
Dim x As Integer
Do Until x = count
writer.BaseStream.Seek(0, SeekOrigin.End)
writer.Write(CByte(&HFF))
x += 1
Loop
End Using
Return -1
End Function
I have a TextBox that holds the count value. Count is the number of bytes to write into the file.
The problem is that when I want to write 1 MB or more, it takes 10+ seconds because of the loop.
I need a better/faster way to write the hex value FF at the end of the file 'count' times.
I'm sorry if I haven't explained this very well.
This should be better:
Public Shared Function writeFS(path As String, count As Integer) As Integer
Using writer As New BinaryWriter(File.Open(path, FileMode.Open, FileAccess.Write, FileShare.Write), Encoding.ASCII)
Dim x As Integer
Dim b as Byte
b = CByte(&HFF)
writer.BaseStream.Seek(0, SeekOrigin.End)
Do Until x = count
writer.Write(b)
x += 1
Loop
End Using
Return -1
End Function
That way you're not calling CByte every time. And there is no need to move to the end of the stream after each write.
Some questions first:
Why is the function Shared? Why do you use FileShare.Write? FileShare.Write means that OTHER processes can write to the file while YOU write to it. Why do you write single bytes that are all the same? Why does the function return -1 every time? Would a Sub be a better fit? And why not use a simple For loop instead of Do Until?
Public Sub writeFS(path As String, count As Integer)
Using Stream As New FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read)
Stream.Write(Enumerable.Repeat(Of Byte)(255, count).ToArray, 0, count)
End Using
End Sub
OK, if you need to write 100 MB this doesn't fit, because the whole array is built in memory first, but then you can partition your writes.
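Partitioning the writes might look something like this sketch: one fixed-size block of FF bytes is filled once and written repeatedly until count bytes have been appended. AppendFF and the chunkSize default are illustrative assumptions, not anything from the question.

```vbnet
Imports System
Imports System.IO

Module FillWriter
    ' Hypothetical helper: appends count bytes of &HFF to the file,
    ' writing at most chunkSize bytes per call to Write.
    Sub AppendFF(path As String, count As Long, Optional chunkSize As Integer = 81920)
        ' Fill one reusable block with FF bytes.
        Dim chunk(chunkSize - 1) As Byte
        For i As Integer = 0 To chunk.Length - 1
            chunk(i) = &HFF
        Next

        Using stream As New FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read)
            Dim remaining As Long = count
            While remaining > 0
                ' The last block may be shorter than chunkSize.
                Dim toWrite As Integer = CInt(Math.Min(remaining, CLng(chunk.Length)))
                stream.Write(chunk, 0, toWrite)
                remaining -= toWrite
            End While
        End Using
    End Sub
End Module
```

Memory use stays at chunkSize bytes no matter how large count is, and FileMode.Append positions the stream at the end of the file so no Seek is needed.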

linq submitchanges runs out of memory

I have a database with about 180,000 records. I'm trying to attach a PDF file to each of those records. Each PDF is about 250 KB in size. However, after about a minute my program starts taking about a GB of memory and I have to stop it. I tried removing the reference to each LINQ object once it's updated, but that doesn't seem to help. How can I make it clear the reference?
Thanks for your help
Private Sub uploadPDFs(ByVal args() As String)
Dim indexFiles = (From indexFile In dataContext.IndexFiles
Where indexFile.PDFContent = Nothing
Order By indexFile.PDFFolder).ToList
Dim currentDirectory As IO.DirectoryInfo
Dim currentFile As IO.FileInfo
Dim tempIndexFile As IndexFile
While indexFiles.Count > 0
tempIndexFile = indexFiles(0)
indexFiles = indexFiles.Skip(1).ToList
currentDirectory = 'I set the directory that I need
currentFile = 'I get the file that I need
writePDF(currentDirectory, currentFile, tempIndexFile)
End While
End Sub
Private Sub writePDF(ByVal directory As IO.DirectoryInfo, ByVal file As IO.FileInfo, ByVal indexFile As IndexFile)
Dim bytes() As Byte
bytes = getFileStream(file)
indexFile.PDFContent = bytes
dataContext.SubmitChanges()
counter += 1
If counter Mod 10 = 0 Then Console.WriteLine(" saved file " & file.Name & " at " & directory.Name)
End Sub
Private Function getFileStream(ByVal fileInfo As IO.FileInfo) As Byte()
Dim fileStream = fileInfo.OpenRead()
Dim bytesLength As Long = fileStream.Length
Dim bytes(bytesLength) As Byte
fileStream.Read(bytes, 0, bytesLength)
fileStream.Close()
Return bytes
End Function
I suggest you perform this in batches, using Take (before the call to ToList) to process a particular number of items at a time. Read (say) 10, set the PDFContent on all of them, call SubmitChanges, and then start again. (I'm not sure offhand whether you should start with a new DataContext at that point, but it might be cleanest to do so.)
As an aside, your code to read the contents of a file is broken in at least a couple of ways - but it would be simpler just to use File.ReadAllBytes in the first place.
Also, your way of handling the list gradually shrinking is really inefficient - after fetching 180,000 records, you're then building a new list with 179,999 records, then another with 179,998 records etc.
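For the file-reading aside, a minimal sketch of the File.ReadAllBytes version follows. GetFileBytes is a made-up name standing in for the question's getFileStream. As far as I can tell, the problems in the original are that Dim bytes(bytesLength) allocates one element too many (VB array bounds are inclusive), that the return value of Read is ignored even though Read may return fewer bytes than requested, and that the stream is not closed if an exception is thrown.

```vbnet
Imports System.IO

Module PdfFileReader
    ' Hypothetical replacement for the question's getFileStream:
    ' ReadAllBytes opens the file, reads exactly its contents and closes it.
    Function GetFileBytes(fileInfo As FileInfo) As Byte()
        Return File.ReadAllBytes(fileInfo.FullName)
    End Function
End Module
```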
Does the DataContext have ObjectTrackingEnabled set to true (the default value)? If so, then it will try to keep a record of essentially all the data it touches, thus preventing the garbage collector from being able to collect any of it.
If so, you should be able to fix the situation by periodically disposing the DataContext and creating a new one, or turning object tracking off.
OK. To use the smallest amount of memory we have to update the DataContext in blocks. I've put some sample code below. It might have syntax errors since I'm typing it in Notepad.
Dim DB as YourDataContext = new YourDataContext
Dim BlockSize as integer = 25
Dim AllItems = DB.Items.Where(function(i) i.PDFfile.HasValue=False)
Dim count = 0
Dim tmpDB as YourDataContext = new YourDataContext
While (count < AllITems.Count)
Dim _item = tmpDB.Items.Single(function(i) i.recordID=AllItems.Item(count).recordID)
_item.PDF = GetPDF()
Count +=1
if count mod BlockSize = 0 or count = AllItems.Count then
tmpDB.SubmitChanges()
tmpDB = new YourDataContext
GC.Collect()
end if
End While
To Further optimise the speed you can get the recordID's into an array from allitems as an anonymous type, and set DelayLoading on for that PDF field.

VB.NET - Load a List of Values from a Text File

I Have a text file that is like the following:
[group1]
value1
value2
value3
[group2]
value1
value2
[group3]
value3
value 4
etc
What I want to be able to do, is load the values into an array (or list?) based on a passed in group value. eg. If i pass in "group2", then it would return a list of "value1" and "value2".
Also these values don't change that often (maybe every 6 months or so), so is there a better way to store them instead of a plain old text file so that it makes it faster to load etc?
Thanks for your help.
Leddo
Is this a homework question?
Use the StreamReader class to read the file (you will probably need .EndOfStream and ReadLine()) and use the String class for the string manipulation (probably .StartsWith(), .Substring() and .Split()).
As for the better way to store them: "IT DEPENDS". How many groups will you have, how many values will there be, how often is the data accessed, etc.? It's possible that the original wording of the question will give us a better clue about what they were after here.
Addition:
So, assuming this program/service is up and running all day, and that the file isn't very large, then you probably want to read the file just once into a Dictionary(of String, List(of String)). The ContainsKey method of this will determine if a group exists.
Function GetValueSet(ByVal filename As String) As Dictionary(Of String, List(Of String))
Dim valueSet = New Dictionary(Of String, List(Of String))()
Dim lines = System.IO.File.ReadAllLines(filename)
Dim header As String = Nothing
Dim values As List(Of String) = Nothing
For Each line As String In lines
If line.StartsWith("[") Then
If Not values Is Nothing Then
valueSet.add(header, values)
End If
header = GetHeader(line)
values = New List(Of String)()
ElseIf Not values Is Nothing Then
Dim value As String = line.Trim()
If value <> "" Then
values.Add(value)
End If
End If
Next
If Not values Is Nothing Then
valueSet.add(header, values)
End If
Return valueSet
End Function
Function GetHeader(ByVal line As String) As String
Dim index As Integer = line.IndexOf("]")
Return line.Substring(1, index - 1)
End Function
Addition:
Now if you're running a multi-threaded solution (that includes all ASP.Net solutions) then you either want to make sure you do this at application startup (for ASP.Net that's the Application_Start handler in Global.asax), or you will need locking. WinForms and Services are by default not multi-threaded.
Also, if the file changes you need to restart the app/service/web-site, or you will need to add a file watcher to reload the data (and then multi-threading will need locking because this is no longer confined to application startup).
OK, here is what I ended up coding:
Public Function FillFromFile(ByVal vFileName As String, ByVal vGroupName As String) As List(Of String)
' open the file
' read the entire file into memory
' find the starting group name
Dim blnFoundHeading As Boolean = False
Dim lstValues As New List(Of String)
Dim lines() As String = IO.File.ReadAllLines(vFileName)
For Each line As String In lines
If line.ToLower.Contains("[" & vGroupName.ToLower & "]") Then
' found the heading, now start loading the lines into the list until the next heading
blnFoundHeading = True
ElseIf line.Contains("[") Then
If blnFoundHeading Then
' we are at the end so exit the loop
Exit For
Else
' its another group so keep going
End If
Else
If blnFoundHeading And line.Trim.Length > 0 Then
lstValues.Add(line.Trim)
End If
End If
Next
Return lstValues
End Function
Regarding a possible better way to store the data: you might find XML useful. It is ridiculously easy to read XML data into a DataTable object.
Example:
Dim dtTest As New System.Data.DataTable
dtTest.ReadXml("YourFilePathNameGoesHere.xml")