How To Read From Text File & Store Data So To Modify At A Later Time - vb.net

What I am trying to do may be better for use with SQL Server but I have seen many applications in the past that simply work on text files and I am wanting to try to imitate the same behaviour that those applications follow.
I have a list of URL's in a text file. This is simple enough to open and read line by line, but how can I store additional data from the file and query the data?
E.g.
Text File:
http://link1.com/ - 0
http://link2.com/ - 0
http://link3.com/ - 1
http://link4.com/ - 0
http://link5.com/ - 1
Then I will read the data with:
Private Sub ButtonX2_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles ButtonX2.Click
OpenFileDialog1.Filter = "*txt Text Files|*.txt"
If OpenFileDialog1.ShowDialog() = DialogResult.OK Then
Dim AllText As String = My.Computer.FileSystem.ReadAllText(OpenFileDialog1.FileName)
Dim Lines() = Split(AllText, vbCrLf)
Dim list = New List(Of Test)
Dim URLsLoaded As Integer = 0
For i = 0 To UBound(Lines)
If Lines(i) = "" Then Continue For
Dim URLInfo As String() = Split(Lines(i), " - ")
If URLInfo.Count < 6 Then Continue For
list.Add(New Test(URLInfo(0), URLInfo(1)))
URLsLoaded += 1
Next
DataGridViewX1.DataSource = list
LabelX5.Text = URLsLoaded.ToString()
End If
End Sub
So as you can see, above I am prompting the user to open a text file, afterwards it is displayed back to the user in a datagridview.
Now here is my issue, I want to be able to query the data, E.g. Select * From URLs WHERE active='1' (Too used to PHP + MySQL!)
Where the 1 is the corresponding 1 or 0 after the URL in the text file.
In the above example the data is being stored in a simple class as per below:
Public Class Test
Public Sub New(ByVal URL As String, ByVal Active As Integer)
_URL = URL
_Active = Active
End Sub
Private _URL As String
Public Property URL() As String
Get
Return _URL
End Get
Set(ByVal value As String)
_URL = value
End Set
End Property
Private _Active As String
Public Property Active As String
Get
Return _Active
End Get
Set(ByVal value As String)
_Active = value
End Set
End Property
End Class
Am I going completely the wrong way about storing the data after importing from a text file?
I am new to VB.NET and still learning the basics but I find it much easier to learn by playing around before hitting the massive books!

Working example:
Dim myurls As New List(Of Test)
myurls.Add(New Test("http://link1.com/", 1))
myurls.Add(New Test("http://link2.com/", 0))
myurls.Add(New Test("http://link3.com/", 0))
Dim result = From t In myurls Where t.Active = 1
For Each testitem As Test In result
MsgBox(testitem.URL)
Next
By the way, LINQ is magic. You can shorten your loading/parse code to 3 rows of code:
Dim Lines() = IO.File.ReadAllLines("myfile.txt")
Dim myurls As List(Of Test) = (From t In lines Select New Test(Split(t, " - ")(0), Split(t, " - ")(1))).ToList
DataGridViewX1.DataSource = myurls
The first line reads all lines in the file to an array of strings.
The second line splits each line in the array, and creates a test-item and then converts all those result items to an list ( of Test).
Of course this could be misused to sillyness by making it to a one-row:er:
DataGridViewX1.DataSource = (From t In IO.File.ReadAllLines("myfile.txt") Select New Test(Split(t, " - ")(0), Split(t, " - ")(1))).ToList
Wich would render your load function to contain only following 4 rows:
If OpenFileDialog1.ShowDialog() = DialogResult.OK Then
DataGridViewX1.DataSource = (From t In IO.File.ReadAllLines("myfile.txt") Select New Test(Split(t, " - ")(0), Split(t, " - ")(1))).ToList
LabelX5.Text = ctype(datagridviewx1.datasource,List(Of Test)).Count
End If

You can query your class using LINQ, as long as it is in an appropriate collection type, like List(of Test) . I am not familiar completely with the VB syntax for LINQ but it would be something like below.
list.Where(Function(x) x.Active == "1").Select(Function(x) x.Url)
However, this isnt actually storing anything into a database, which i think your question might be asking?

I think you are reinventing the wheel, which is not generally a good thing. If you want SQL like functionality just store the data in a SQL DB and query it.
There are a lot of reasons you should just use an existing DB:
Your code will be less tested and thus more likely to have bugs.
Your code will be less optimized and probably perform worse. (You were planning on implementing a query optimizer and indexing engine for performance, right?)
Your code won't have as many features (locking, constraints, triggers, backup/recovery, a query language, etc.)
There are lots of free RDBMS options out there so it might even be cheaper to use an existing system than spending your time writing an inferior one.
That said, if this is just an academic exercise, go for it. However, I wouldn't do this for a real-world system.

Related

Best way to loop through a list within a sub-loop

First things first, I did a search before posting this question but the answers are so generic that I could not find anything similar to what I'm trying to do. I apologize in advance if this has been answered before and I did not find it.
I have a scenario where I already collected a large list of rows (>10K) from the SQL server and put into an array (List) of strings. These rows are consisted of filenames. Because I already put them on a list, I don't want to query the SQL server again and instead want work with what I already have in memory.
This is the code I'm trying to get right:
'Valid file types for InfoLink1, InfoLink2, InfoLink3
Dim lstValidImageFormats As New List(Of String)({".JPG", ".JPEG", ".JPE", ".BMP", ".PNG", ".TIF", ".TIFF", ".GIF"})
Dim lstValidSTLFormats As New List(Of String)({".STL"})
Dim lstValidSTEPFormats As New List(Of String)({".STP", ".STEP"})
'////////////////
'// Components //
'////////////////
'We don't check Parts.InfoLink because all formats are allowed in this field
'Parts.InfoLink1 - File in Infolink1 columns MUST BE images
For i = 0 To arrStrComponentsInfolink1Values.Count - 1 'We have 10K rows with filenames in this list
For Each FileExtension As String In lstValidImageFormats
If arrStrComponentsInfolink1Values.Item(i).EndsWith(FileExtension) = False Then
End If
Next
Next
I'm trying to parse each item (filename) I have in the array arrStrComponentsInfolink1Values and check if the filename DOES NOT end with one of the extensions in the list lstValidImageFormats. If it doesn't, then I'll send the filename to another list (array).
My difficulty here is about how to iterate through each item in arrStrComponentsInfolink1Values, then check if each filename ends with one of the extensions declared in lstValidImageFormats, do what I want to do with the item if it DOES NOT end with one of those extensions, and then proceed to parse the next item in arrStrComponentsInfolink1Values.
I sincerely don't know what's the best way/performance efficient to do that.
My code above is empty because algorithmically I don't know the best approach to do what I want without querying the SQL server again with something like AND filename NOT LIKE '%.JPG' AND filename NOT LIKE '%.JPEG' AND filename NOT LIKE '%.JPE' AND filename NOT LIKE '%.BMP'...
Because I already have the data in the memory in a list, performance would be much better if I could use what I already have.
Any suggestions or material I could read to learn how to do what I'm looking for?
Thank you!
Here's how I would tackle this:
Dim invalidFormatFiles = _
From x In arrStrComponentsInfolink1Values _
Let fi = New FileInfo(x) _
Where Not lstValidImageFormats.Contains(fi.Extension.ToUpperInvariant()) _
Select x
For Each invalidFormatFile In invalidFormatFiles
' Do your processing
Next
I ended up doing this and it worked:
Dim lstValidImageFormats As New List(Of String)({".JPG", ".JPEG", ".JPE", ".BMP", ".PNG", ".TIF", ".TIFF", ".GIF"})
Dim lstValidSTLFormats As New List(Of String)({".STL"})
Dim lstValidSTEPFormats As New List(Of String)({".STP", ".STEP"})
'////////////////
'// Components //
'////////////////
'We don't check Parts.InfoLink because all formats are allowed in this field
'Parts.InfoLink1 - File in Infolink1 columns MUST BE images
Dim intExtCounter As Integer = 0
For i = 0 To arrIntComponentsInfolink1UNRs.Count - 1 'We have 10K rows with filenames in this list
intExtCounter = 0
For j = 0 To lstValidImageFormats.Count - 1
If arrStrComponentsInfolink1Values.Item(i).EndsWith(lstValidImageFormats.Item(j)) = True Then
intExtCounter += 1
End If
Next
If intExtCounter = 0 Then 'At least one file extension was found
arrIntComponentsInfolink1UNRsReportSectionInvalidExtensions.Add(i) 'File extension is not in the list of allowed extensions
End If
Next
But #41686d6564 answer was the best solution:
Dim newList = arrStrComponentsInfolink1Values.Where(Function(x) Not lstValidImageFormats.Contains(IO.Path.GetExtension(x))).ToList()
Thank you!

Why does my CSV Writer skips some lines? vb.net

I dont know why my CSV writer skips some lines. I tried to export today a gridview and saw it by coincindence. The grid has 250 rows. It writes 1-60 and then goes on at 140-250. 61-139 are missing in the excel table?
to the code: First I create a list whith all columns, because the writer method shall be able to also write only specific columns, but this is another button.
Then the grid and the list is given to the csv Writer.
Private Sub Button4_Click(sender As Object, e As EventArgs) Handles Button4.Click
Dim list As List(Of Integer) = New List(Of Integer)
For Each column As DataGridViewColumn In datagridview2.Columns
list.Add(column.Index)
Next
Module3.csvwriter(list, datagridview2)
MsgBox("exportieren beendet")
End Sub
The csv-Writer creates a csv-body-String for every datarow.
For each row, append every columnvalue to body if the columnindex is in the list.
The best is, that I have seen how the method build row 61, but then it is missing in the csv :/
Public Sub csvwriter(list As List(Of Integer), grid As DataGridView)
Dim body As String = ""
Dim myWriter As New StreamWriter(Importverzeichnis.savecsv, True, System.Text.Encoding.Default)
Dim i As Integer
For i = 0 To grid.Rows.Count - 1
For ix = 0 To grid.Columns.Count - 1
If list.Contains(ix) Then
If grid.Rows(i).Cells(ix).Value IsNot Nothing Then
body = body + grid.Rows(i).Cells(ix).Value.ToString + ";"
Else
body = body + ";"
End If
End If
Next
myWriter.WriteLine(body)
body = ""
Next
myWriter.Close()
End Sub
Has anyone an idea? Do I overlook something?
I found an answer.
1. The csv file is complete. every row is in there.
The problem was or is, that in my datarows is a sign " which escapes the delimiter ; sign.
To solve the problem I save the file now as .txt file,
because then the user gets asked everytime when he opens the file, which one is the delimiter, which is the escaping sign.. this way the user is forced to put in the right settings, while a .csv file uses standard configurations..

how to input data into an array from a text file that are vbTab separated?

I am having trouble turning a set of data from a .txt file into arrays, basically, what i have in the text file is:
Eddy vbtab 20
Andy vbtab 30
James vbtab 20
etc..
I want to set up the names as a Names array, and numbers as number array.
Now what I have done is
strFilename = "CustomerPrices.txt"
If File.Exists(strFilename) Then
Dim srReader As New StreamReader(strFilename)
intRecords = srReader.ReadLine()
intRows = intRecords
For i = 0 To intRows - 1
intLastBlank = strInput.IndexOf(vbTab)
strName(intPrices) = strInput.Substring(0, intLastBlank)
dblPrices(intPrices) = Double.Parse(strInput.Substring(intLastBlank + 1))
But when I debug I get a problem "Object Reference not set to an instance of an object"
Can anyone give me some advise?
Thanks
Separate arrays are probably a bad idea here. They group your data by fields, when it's almost always better to group your data by records. What you want instead is a single collection filled with classes of a particular type. Go for something like this:
Public Class CustomerPrice
Public Property Name As String
Public Property Price As Decimal
End Class
Public Function ReadCustomerPrices(ByVal fileName As String) As List(Of CustomerPrice)
Dim result As New List(Of CustomerPrice)()
Using srReader As New StreamReader(fileName)
Dim line As String
While (line = srReader.ReadLine()) <> Nothing
Dim data() As String = line.Split(vbTab)
result.Add(new CustomerPrice() From {Name = data(0), Price = Decimal.Parse(data(1))})
End While
End Using
Return result
End Function
Some other things worth noting in this code:
The Using block will guarantee the file is closed, even if an exception is thrown
It's almost never appropriate to check File.Exists(). It's wasteful code, because you still have to be able to handle the file io exceptions.
When working with money, you pretty much always want to use the Decimal type rather than Double
This code requires Visual Studio 2010 / .Net 4, and was typed directly into the reply window and so likely contains a bug, or even base syntax error.

linq submitchanges runs out of memory

I have a database with about 180,000 records. I'm trying to attach a pdf file to each of those records. Each pdf is about 250 kb in size. However, after about a minute my program starts taking about about a GB of memory and I have to stop it. I tried doing it so the reference to each linq object is removed once it's updated but that doesn't seem to help. How can I make it clear the reference?
Thanks for your help
Private Sub uploadPDFs(ByVal args() As String)
Dim indexFiles = (From indexFile In dataContext.IndexFiles
Where indexFile.PDFContent = Nothing
Order By indexFile.PDFFolder).ToList
Dim currentDirectory As IO.DirectoryInfo
Dim currentFile As IO.FileInfo
Dim tempIndexFile As IndexFile
While indexFiles.Count > 0
tempIndexFile = indexFiles(0)
indexFiles = indexFiles.Skip(1).ToList
currentDirectory = 'I set the directory that I need
currentFile = 'I get the file that I need
writePDF(currentDirectory, currentFile, tempIndexFile)
End While
End Sub
Private Sub writePDF(ByVal directory As IO.DirectoryInfo, ByVal file As IO.FileInfo, ByVal indexFile As IndexFile)
Dim bytes() As Byte
bytes = getFileStream(file)
indexFile.PDFContent = bytes
dataContext.SubmitChanges()
counter += 1
If counter Mod 10 = 0 Then Console.WriteLine(" saved file " & file.Name & " at " & directory.Name)
End Sub
Private Function getFileStream(ByVal fileInfo As IO.FileInfo) As Byte()
Dim fileStream = fileInfo.OpenRead()
Dim bytesLength As Long = fileStream.Length
Dim bytes(bytesLength) As Byte
fileStream.Read(bytes, 0, bytesLength)
fileStream.Close()
Return bytes
End Function
I suggest you perform this in batches, using Take (before the call to ToList) to process a particular number of items at a time. Read (say) 10, set the PDFContent on all of them, call SubmitChanges, and then start again. (I'm not sure offhand whether you should start with a new DataContext at that point, but it might be cleanest to do so.)
As an aside, your code to read the contents of a file is broken in at least a couple of ways - but it would be simpler just to use File.ReadAllBytes in the first place.
Also, your way of handling the list gradually shrinking is really inefficient - after fetching 180,000 records, you're then building a new list with 179,999 records, then another with 179,998 records etc.
Does the DataContext have ObjectTrackingEnabled set to true (the default value)? If so, then it will try to keep a record of essentially all the data it touches, thus preventing the garbage collector from being able to collect any of it.
If so, you should be able to fix the situation by periodically disposing the DataContext and creating a new one, or turning object tracking off.
OK. To use the smallest amount of memory we have to update the datacontext in blocks. I've put a sample code below. Might have sytax errors since I'm using notepad to type it in.
Dim DB as YourDataContext = new YourDataContext
Dim BlockSize as integer = 25
Dim AllItems = DB.Items.Where(function(i) i.PDFfile.HasValue=False)
Dim count = 0
Dim tmpDB as YourDataContext = new YourDataContext
While (count < AllITems.Count)
Dim _item = tmpDB.Items.Single(function(i) i.recordID=AllItems.Item(count).recordID)
_item.PDF = GetPDF()
Count +=1
if count mod BlockSize = 0 or count = AllItems.Count then
tmpDB.SubmitChanges()
tmpDB = new YourDataContext
GC.Collect()
end if
End While
To Further optimise the speed you can get the recordID's into an array from allitems as an anonymous type, and set DelayLoading on for that PDF field.

VB.NET - Load a List of Values from a Text File

I Have a text file that is like the following:
[group1]
value1
value2
value3
[group2]
value1
value2
[group3]
value3
value 4
etc
What I want to be able to do, is load the values into an array (or list?) based on a passed in group value. eg. If i pass in "group2", then it would return a list of "value1" and "value2".
Also these values don't change that often (maybe every 6 months or so), so is there a better way to store them instead of a plain old text file so that it makes it faster to load etc?
Thanks for your help.
Leddo
This is a home work question?
Use the StreamReader class to read the file (you will need to probably use .EndOfStream and ReadLine()) and use the String class for the string manipulation (probably .StartsWith(), .Substring() and .Split().
As for the better way to store them "IT DEPENDS". How many groups will you have, how many values will there be, how often is the data accessed, etc. It's possible that the original wording of the question will give us a better clue about what they were after hear.
Addition:
So, assuming this program/service is up and running all day, and that the file isn't very large, then you probably want to read the file just once into a Dictionary(of String, List(of String)). The ContainsKey method of this will determine if a group exists.
Function GetValueSet(ByVal filename As String) As Dictionary(Of String, List(Of String))
Dim valueSet = New Dictionary(Of String, List(Of String))()
Dim lines = System.IO.File.ReadAllLines(filename)
Dim header As String
Dim values As List(Of String) = Nothing
For Each line As String In lines
If line.StartsWith("[") Then
If Not values Is Nothing Then
valueSet.add(header, values)
End If
header = GetHeader(line)
values = New List(Of String)()
ElseIf Not values Is Nothing Then
Dim value As String = line.Trim()
If value <> "" Then
values.Add(value)
End If
End If
Next
If Not values Is Nothing Then
valueSet.add(header, values)
End If
Return valueSet
End Function
Function GetHeader(ByVal line As String)
Dim index As Integer = line.IndexOf("]")
Return line.Substring(1, index - 1)
End Function
Addition:
Now if your running a multi-threaded solution (that includes all ASP.Net solutions) then you either want to make sure you do this at the application start up (for ASP.Net that's in Global.asax, I think it's ApplicationStart or OnStart or something), or you will need locking. WinForms and Services are by default not multi-threaded.
Also, if the file changes you need to restart the app/service/web-site or you will need to add a file watcher to reload the data (and then multi-threading will need locking because this is not longer confined to application startup).
ok, here is what I edned up coding:
Public Function FillFromFile(ByVal vFileName As String, ByVal vGroupName As String) As List(Of String)
' open the file
' read the entire file into memory
' find the starting group name
Dim blnFoundHeading As Boolean = False
Dim lstValues As New List(Of String)
Dim lines() As String = IO.File.ReadAllLines(vFileName)
For Each line As String In lines
If line.ToLower.Contains("[" & vGroupName.ToLower & "]") Then
' found the heading, now start loading the lines into the list until the next heading
blnFoundHeading = True
ElseIf line.Contains("[") Then
If blnFoundHeading Then
' we are at the end so exit the loop
Exit For
Else
' its another group so keep going
End If
Else
If blnFoundHeading And line.Trim.Length > 0 Then
lstValues.Add(line.Trim)
End If
End If
Next
Return lstValues
End Function
Regarding a possible better way to store the data: you might find XML useful. It is ridiculously easy to read XML data into a DataTable object.
Example:
Dim dtTest As New System.Data.DataTable
dtTest.ReadXml("YourFilePathNameGoesHere.xml")