StreamReader splits a CSV line in two where a lowercase letter is followed by an uppercase letter - vb.net

I am using an ASP.NET MVC application to upload Excel data, exported as CSV, to a database. While reading the CSV file with a StreamReader, any line that contains a lowercase letter immediately followed by an uppercase letter is split into several lines. Example:
Line: "1,This is nothing but the Example to explanationIt results wrong, testing example"
This line is split into:
Line 1: 1,This is nothing but the Example to explanation"
Line 2: ""
Line 3: It results wrong, testing example
whereas the CSV file itself looks right: "1,This is nothing but the Example to explanationIt results wrong, testing example"
Code:
    Dim csvFileReader As New StreamReader("my csv file Path")
    While Not csvFileReader.EndOfStream()
        Dim _line = csvFileReader.ReadLine()
    End While
Why is this happening, and how can I resolve it?

When a cell in an Excel spreadsheet contains multiple lines and the sheet is saved as a CSV file, Excel separates the lines within the cell with a lone line-feed character (ASCII value 0x0A). Each row in the spreadsheet is terminated by the usual carriage-return/line-feed pair (0x0D 0x0A). When you open the CSV file in Notepad, it does not display the lone LF character at all, so the text appears to run together on one line. So even though Notepad doesn't show it, the CSV file actually contains this:
    1,"This is nothing but the Example to explanation{LF}It results wrong",testing example{CR}{LF}
According to the MSDN documentation on the StreamReader.ReadLine method:
A line is defined as a sequence of characters followed by a line feed ("\n"), a carriage return ("\r"), or a carriage return immediately followed by a line feed ("\r\n").
Therefore, when you call ReadLine, it will stop reading at the end of the first line in a multi-line cell. To avoid this, you would need to use a different "read" method and then split on CR/LF pairs rather than on either individually.
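As a rough illustration of splitting on CR/LF pairs only, here is a minimal sketch. The sample string stands in for the result of File.ReadAllText on the real file, and the module name is made up for the example. It only separates rows; it does not deal with quoted commas or escaped quotes.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module CrLfSplitSketch
    Sub Main()
        ' Stand-in for File.ReadAllText("my csv file Path"):
        ' row 1 has a multi-line cell (lone LF), row 2 is an ordinary row.
        Dim csvText As String =
            "1,""first line" & vbLf & "second line"",testing example" & vbCrLf &
            "2,plain cell,another example" & vbCrLf

        ' Passing a String array selects the overload that treats CR+LF as one
        ' separator, so the lone LF inside the cell does not end the row.
        Dim rows As String() = csvText.Split({vbCrLf}, StringSplitOptions.RemoveEmptyEntries)

        For Each row As String In rows
            ' Show the embedded LF as a visible marker instead of a line break.
            Console.WriteLine(row.Replace(vbLf, "{LF}"))
        Next
    End Sub
End Module
```

This keeps the multi-line cell inside its row, but each row is still an unparsed string.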
However, this isn't the only issue you will run into when reading CSV files. For instance, you also need to handle the way quotation characters inside a cell are escaped in CSV. Unless you really need your own implementation, it's better to use an existing library to read the file. In this case, Microsoft provides a class in the .NET Framework that properly handles reading CSV files (including ones with multi-line cells). The class is TextFieldParser, and it's in the Microsoft.VisualBasic.FileIO namespace. Here's a link to the MSDN page that explains how to use it to read a CSV file:
http://msdn.microsoft.com/en-us/library/cakac7e6
Here's an example:
    ' At the top of the file:
    Imports Microsoft.VisualBasic.FileIO

    Using reader As New TextFieldParser("my csv file Path")
        reader.TextFieldType = FieldType.Delimited
        reader.SetDelimiters(",")
        While Not reader.EndOfData
            Try
                ' ReadFields returns one row, including any multi-line cells
                Dim fields() As String = reader.ReadFields()
                ' Process fields in this row ...
            Catch ex As MalformedLineException
                ' Handle exception ...
            End Try
        End While
    End Using

Related

VB writeline writes corrupt lines to text file

Is it possible for the following code to produce NUL values within a text file?
    Dim temp_str = "123456;1234567"
    My.Computer.FileSystem.WriteAllText(Path & "stats.txt", temp_str, False)
It seems simple, but it writes quite often, and I'm seeing several files accessed by the application whose contents show up as a run of NUL characters when opened in Notepad++. Some other editors show just squares, and it seems like each character is represented by a block/NUL.
So far I've been unable to reproduce this on my test system. I just find the files on a COMX module's file system that's been running in the field and comes back faulty, but I've been seeing enough of these files to make it a problem that needs to be solved.
Does anyone have an idea to prevent this behaviour?
Hard to say what the problem is without more code, but try this if you want to replace the existing contents of the file:
    Dim fileContent = "My UTF-8 file contents"
    Using writer As IO.StreamWriter = IO.File.CreateText(fullPathIncludingExtension)
        writer.Write(fileContent)
    End Using
Or this if you want to append UTF-8 text:
    Dim newLines = "My UTF-8 content to append"
    ' File.AppendText returns a StreamWriter; File.AppendAllText does not
    Using writer As IO.StreamWriter = IO.File.AppendText(fullPathIncludingExtension)
        writer.Write(newLines)
    End Using
If you want to append Unicode (UTF-16) text, you must use a different constructor for StreamWriter:
    Using writer As IO.StreamWriter = New IO.StreamWriter("full/path/to/file.txt", True, Text.Encoding.Unicode)
        writer.Write(MyContentToAppend)
    End Using
Note that the True argument to the constructor specifies that you want to append text.

Having issues with illegal characters in file path

I am writing a program in VB.NET which loops through a file containing file paths and performs an action on each. Each file path is on its own line, and I'm looping through the file like this:
    Dim FileContents As String
    FileContents = System.IO.File.ReadAllText("C:\File.txt")
    Dim FileSplit As String()
    FileSplit = FileContents.Split(vbCrLf)
    For Each ThisLine In FileSplit
        Dim FileModified As Date
        FileModified = System.IO.File.GetLastWriteTime(ThisLine)
        'Do something here
    Next
Contents of File.txt is:
Y:\Users\localadmin\Desktop\MakeShadowCopy\FileInfo.vb
Y:\Users\localadmin\Desktop\MakeShadowCopy\FindFiles.vb
Y:\Users\localadmin\Desktop\MakeShadowCopy\MakeShadowCopy.sln
Y:\Users\localadmin\Desktop\MakeShadowCopy\MakeShadowCopy.v12.suo
The loop works fine, but it is throwing an exception on the line with GetLastWriteTime() on it, saying that the path contains illegal characters, but it is just a normal string with a file path in it.
If anyone has any ideas, or knows how to escape the string going into GetLastWriteTime(), that would be much appreciated :)
Thanks!
The lines in your file are probably not correctly terminated with vbCrLf.
If that is the case, Split cannot divide your input into lines correctly, and you end up passing the whole text to GetLastWriteTime.
Instead of ReadAllText you could use ReadAllLines and leave the line splitting to the Framework, which knows how to handle the line feed and carriage return codes.
    For Each ThisLine In System.IO.File.ReadAllLines("C:\file.txt")
        Dim FileModified As Date
        FileModified = System.IO.File.GetLastWriteTime(ThisLine.Trim())
    Next
Also add a Trim to the ThisLine variable to remove any unseen characters erroneously added to the line.
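Another cause worth checking, assuming .NET Framework with Option Strict Off: Split(vbCrLf) then binds to the Char-based overload, so the string is narrowed to its first character and the text is split on CR only. That leaves an LF at the start of every path after the first, which GetLastWriteTime rejects as an illegal character. If you prefer to keep ReadAllText, the separators can be passed as a String array instead. A minimal sketch, where the sample paths and the module name are made up:

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module SplitLinesSketch
    Sub Main()
        ' Stand-in for System.IO.File.ReadAllText("C:\File.txt"); the paths are hypothetical.
        Dim fileContents As String = "C:\a.txt" & vbCrLf & "C:\b.txt" & vbCrLf

        ' A String array selects the String-based overload, so CR+LF is consumed whole.
        ' Lone LF and CR are listed as fallbacks for files with other line endings.
        Dim separators As String() = {vbCrLf, vbLf, vbCr}
        Dim lines As String() = fileContents.Split(separators, StringSplitOptions.RemoveEmptyEntries)

        For Each thisLine As String In lines
            ' Brackets make any stray leading or trailing character visible.
            Console.WriteLine("[" & thisLine & "]")
        Next
    End Sub
End Module
```

RemoveEmptyEntries also drops the empty element that a trailing line break would otherwise produce.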
Two ideas:
Use For instead of For Each and check whether you get the exception on the very first iteration. If not, the problem may be one specific file path; inspect the iteration variable's value in that case.
Open the file in a hex editor and make sure each line is terminated properly. You might have only a CR (13) or only an LF (10) character at the end, rather than both as is normal on Windows.

VBA Reading From a UCS-2 Little Endian Encoded Text File

I have a whole bunch of text files exported from Photoshop that I need to import into an Excel document. I wrote a macro to get the job done, and it seemed to work just fine for my test document. But when I tried loading some of the actual files produced by Photoshop, Excel started putting all the data in a separate column except for the first line.
My code that reads the text file:
    Open currentDocPath For Input As stream
    Do Until EOF(stream)
        Input #stream, currentLine
        columnContents = Split(currentLine, vbTab)
        For n = 0 To UBound(columnContents)
            ActiveSheet.Cells(row, Chr(64 + colum + n)).Value = columnContents(n)
        Next n
        row = row + 1
    Loop
    Close stream
The text files I am reading look like this, only with much more data:
"Name" "Data" "Info" "blah"
"Name1" "Data1" "Info1" "blah1"
"Name2" "Data2" "Info2" "blah2"
The problem seemed pretty trivial, but when I load it into Excel, instead of looking like it does above, it looks like this:
ÿþ"Name" "Data" "Info" "blah"
Name1
Data1
Info1
blah1
Name2
Data2
Info2
blah2
Now I am not sure why this is happening. It seems like the first two characters in the first row are there because those bytes declare the text encoding. Somehow those characters keep the first row formatted correctly while the remaining rows lose their quotation marks and all get moved to new lines.
Could someone who understands UCS-2 Little Endian text encoding explain how I can work around this? When I convert the files to ASCII it works fine.
Cheers!
edit: Okay so I understand now that the encoding is UTF-16 (I don't know a whole lot about character encoding). My main issue is that it's formatting strangely and I don't understand why or how to fix it. Thanks!
As I mentioned in my comment, it appears the file you're trying to import is encoded in UTF-16.
In this vbaexpress.com article, someone suggested that the following should work:
    Dim GetOpenFile As String
    Dim MyData As String
    Dim r As Long
    GetOpenFile = Application.GetOpenFilename
    r = 1
    Open GetOpenFile For Input As #1
    Do While Not EOF(1)
        Line Input #1, MyData
        Cells(r, 1).Value = MyData
        r = r + 1
    Loop
    Close #1
Obviously I can't test it myself, but maybe it'll help you.
Why not just tell Excel to import the file? Microsoft has probably put hundreds of thousands of person-hours into that code. Record a macro of the import to get the code easily.
Remember, Excel is a tool for non-programmers to do programming things. Use it instead of trying to replace it.
These are the replacement file functions that you use for new code. Add a reference to Microsoft Scripting Runtime.
Opens a specified file and returns a TextStream object that can be used to read from, write to, or append to the file.
object.OpenTextFile(filename[, iomode[, create[, format]]])
Arguments
object
    Required. Object is always the name of a FileSystemObject.
filename
    Required. String expression that identifies the file to open.
iomode
    Optional. Can be one of three constants: ForReading, ForWriting, or ForAppending.
create
    Optional. Boolean value that indicates whether a new file can be created if the specified filename doesn't exist. The value is True if a new file is created, False if it isn't created. If omitted, a new file isn't created.
format
    Optional. One of three Tristate values used to indicate the format of the opened file. If omitted, the file is opened as ASCII.
The format argument can have any of the following settings:
Constant              Value   Description
TristateUseDefault    -2      Opens the file using the system default.
TristateTrue          -1      Opens the file as Unicode.
TristateFalse          0      Opens the file as ASCII.

How to search data in a txt file through Visual Basic

I have this txt file with the following information:
National_Insurence_Number;Name;Surname;Hours_Worked;Price_Per_Hour so:
eg.: aa-12-34-56-a;Peter;Smith;36;12
This data is written to the txt file through a VB form, which works totally fine. The problem comes on another form, which I expect to do the following:
The user enters the employee's NI number into a text box.
The program then searches the file for that NI number and, if it is found,
fills in the appropriate text boxes with its data.
(Then the program calculates tax and national insurance, which I have working fine.)
So basically the problem is telling the program to search for that NI number and put each ";"-delimited field into its corresponding text box.
Thanks for all.
You just need to parse the file like a CSV. You can use Microsoft.VisualBasic.FileIO.TextFieldParser to do this, or you can use CsvHelper - https://github.com/JoshClose/CsvHelper
I've used CsvHelper in the past and it works great. It lets you create a class with the structure of the records in your data file, then imports the data into a list of these for searching.
You can look here for more info on TextFieldParser if you want to go that way -
Parse Delimited CSV in .NET
    Dim afile As FileIO.TextFieldParser = New FileIO.TextFieldParser(FileName)
    Dim CurrentRecord As String() ' this array will hold each line of data
    afile.TextFieldType = FileIO.FieldType.Delimited
    afile.Delimiters = New String() {";"}
    afile.HasFieldsEnclosedInQuotes = True
    ' parse the actual file
    Do While Not afile.EndOfData
        Try
            CurrentRecord = afile.ReadFields
        Catch ex As FileIO.MalformedLineException
            Stop
        End Try
    Loop
I'd recommend using CsvHelper though; the documentation is pretty good, and working with objects is much easier than working with the raw string data.
Once you have found the record you can then manually set the text of each text box on your form or use a bindingsource.
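To make the search step concrete, here is a minimal sketch using TextFieldParser. The file name employees.txt, the module name, and the FindEmployee helper are all hypothetical, and it assumes one record per line in the order NI number;Name;Surname;Hours worked;Price per hour.

```vbnet
Imports System
Imports Microsoft.VisualBasic.FileIO

Module EmployeeLookupSketch
    ' Returns the fields of the first record whose first field matches niNumber,
    ' or Nothing if no record matches. FindEmployee is a hypothetical helper.
    Function FindEmployee(filePath As String, niNumber As String) As String()
        Using parser As New TextFieldParser(filePath)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(";")
            While Not parser.EndOfData
                Dim fields As String() = parser.ReadFields()
                ' Skip short rows; compare NI numbers ignoring case.
                If fields.Length >= 5 AndAlso
                   String.Equals(fields(0), niNumber.Trim(), StringComparison.OrdinalIgnoreCase) Then
                    Return fields
                End If
            End While
        End Using
        Return Nothing
    End Function

    Sub Main()
        ' "employees.txt" is a placeholder path for the data file.
        Dim record As String() = FindEmployee("employees.txt", "aa-12-34-56-a")
        If record Is Nothing Then
            Console.WriteLine("No employee with that NI number.")
        Else
            ' On a form these would be assignments such as txtName.Text = record(1).
            Console.WriteLine(record(1) & " " & record(2) & ": " &
                              record(3) & " hours at " & record(4) & " per hour")
        End If
    End Sub
End Module
```

Returning the whole field array keeps the lookup separate from the form, so the caller decides which text box gets which field.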

line input not working as expected in VBA

I have a text file that I open and attempt to read line by line. I have used the same code before on other files with no problem, but for some reason this particular file is strange. When I run the following command:
    Line Input #1, read_string
the string read_string contains every line in the file concatenated together. When I look at the special characters in the file I do see a carriage return. Just so you know what the file looks like, here are the first two lines (daniweb formatting is too strange to print text here):
k_arr[8'h1C]= {10'b001111_0100,10'b110000_1011} ;
k_arr[8'h1C]= {10'b001111_0100,10'b110000_1011} ;
Does anybody know how I can read each line? Apparently Line Input doesn't work for this file.
Try:
    Dim lines() As String
    lines = Split(read_string, vbCr) 'splitting with Carriage Return delimiter
    'did it work?
    Debug.Print lines(1)
    Debug.Print lines(2)
Each element of the lines array should now contain one line of your text file.
If it didn't work, try with another delimiter instead of vbCr, e.g. vbLf (line feed).