Need a more efficient way to merge text files - vb.net

This is my code at present
Dim Paths() As String = Directory.GetFiles(".", "files*.txt")
For Each Path As String In Paths
    File.AppendAllText("merged.txt", File.ReadAllText(Path), Encoding.Default)
Next
The problem is that with this method, performance is poor when dealing with several large files.
Is there a more efficient way to merge text files? Maybe reading all the files into a StreamReader first and then creating the output file in one operation?

Try this:
using (StreamWriter sw = new StreamWriter("merge.txt"))
{
    // Search the current directory; the pattern goes in the second argument
    string[] paths = Directory.GetFiles(".", "files*.txt");
    foreach (string path in paths)
    {
        using (StreamReader sr = new StreamReader(path))
        {
            sw.Write(sr.ReadToEnd());
            sw.WriteLine("");
        }
    }
}
I think the slow part is File.AppendAllText, which opens, writes to, and closes the output file once for every text file in the directory.
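A further variation, sketched here under some assumptions, is to skip text decoding entirely and copy the raw bytes of each input into a single output stream that stays open for the whole merge. The class name FileMerger and the method Merge are hypothetical. The sketch assumes .NET 4 or later (for Stream.CopyTo), that the output file does not itself match the search pattern, and that all inputs share one encoding. Unlike the snippet above, it inserts no line break between files.

```csharp
using System;
using System.IO;

public static class FileMerger
{
    // Hypothetical helper: concatenates every file in 'directory' that matches
    // 'pattern' into 'outputPath', copying bytes without decoding them.
    public static void Merge(string directory, string pattern, string outputPath)
    {
        string[] paths = Directory.GetFiles(directory, pattern);
        // GetFiles does not guarantee an order, so sort for a predictable result.
        Array.Sort(paths, StringComparer.Ordinal);

        // The output file is opened once and stays open for the whole merge.
        using (FileStream output = File.Create(outputPath))
        {
            foreach (string path in paths)
            {
                using (FileStream input = File.OpenRead(path))
                {
                    // CopyTo moves the data in buffered chunks, so a large
                    // input file is never held in memory all at once.
                    input.CopyTo(output);
                }
            }
        }
    }
}
```

It could be called as FileMerger.Merge(".", "files*.txt", "merged.out"). Because bytes are copied verbatim, any byte order mark at the start of an input file ends up in the middle of the merged output.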

Related

Save picture directly to stream? [duplicate]

I have a filename pointing to a text file, including its path, as a string. Now I'd like to load this .csv file into a MemoryStream. How should I do that?
For example, I have this:
Dim filename as string="C:\Users\Desktop\abc.csv"
Dim stream As New MemoryStream(File.ReadAllBytes(filename))
You don't need to load a file into a MemoryStream.
You can simply call File.OpenRead to get a FileStream containing the file.
If you really want the file to be in a MemoryStream, you can call CopyTo to copy the FileStream to a MemoryStream.
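A minimal sketch of that second option, assuming .NET 4 or later for Stream.CopyTo. The class name StreamLoading and the method LoadIntoMemory are made up for illustration.

```csharp
using System.IO;

public static class StreamLoading
{
    // Hypothetical helper: copies the file at 'path' into a MemoryStream.
    public static MemoryStream LoadIntoMemory(string path)
    {
        MemoryStream memory = new MemoryStream();
        using (FileStream file = File.OpenRead(path))
        {
            file.CopyTo(memory);
        }
        // CopyTo leaves the position at the end; rewind so callers read from the start.
        memory.Position = 0;
        return memory;
    }
}
```

The caller owns the returned stream and should dispose it when done. For a file that is read only once, passing the FileStream from File.OpenRead directly to the consumer avoids the extra copy.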
I had an XML file being read from disk, using the old XmlReader API. How to read the XML file into memory, and then work with it in memory, instead of reading the disk repeatedly? Based on VB answer from Centro (upvoted) but with a Using block, and in C#.
The key line:
MemoryStream myXMLDocument = new MemoryStream(File.ReadAllBytes(@"c:\temp\myDemoXMLDocument.xml"));
Re the OP's question, if you wanted to load a CSV file into a MemoryStream:
MemoryStream myCSVDataInMemory = new MemoryStream(File.ReadAllBytes(@"C:\Users\Desktop\abc.csv"));
The following snippet reads through the XML document now that it's in a MemoryStream. It is basically the same code as when the data came from a FileStream that pointed to a file on disk. Yes, the XmlTextReader API is old and clunky, but it's what I had to work with in this app.
string myXmlFileName = @"c:\temp\myDemoXMLDocument.xml";
using (MemoryStream myXmlDocument = new MemoryStream(File.ReadAllBytes(myXmlFileName)))
using (XmlTextReader myXmlTextReader = new XmlTextReader(myXmlDocument))
{
    myXmlTextReader.WhitespaceHandling = WhitespaceHandling.None;
    myXmlTextReader.Read(); // read the XML declaration node
    while (!myXmlTextReader.EOF)
    {
        if (myXmlTextReader.Name == "Batch" && myXmlTextReader.IsStartElement())
        {
            string batchIdentifier = myXmlTextReader.GetAttribute("BatchIdentifier");
            myXmlTextReader.Read(); // advance to the first tag inside <Batch>
            // Walk this batch until its end tag (or the end of the document)
            while (!myXmlTextReader.EOF &&
                   !(myXmlTextReader.Name == "Batch" && myXmlTextReader.NodeType == XmlNodeType.EndElement))
            {
                if (myXmlTextReader.Name == "Transaction" && myXmlTextReader.IsStartElement())
                {
                    // Start a new set of items
                    string transactionID = myXmlTextReader.GetAttribute("ID");
                }
                myXmlTextReader.Read(); // read the next node, possibly another Transaction tag
            }
        }
        // This Batch tag is completed (or the node was not a Batch). Move to the next node.
        myXmlTextReader.Read();
    }
} // leaving the using blocks closes the XML reader and the memory stream
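You can copy it from a file stream into a memory stream like so:
string fullPath = Path.Combine(filePath, fileName);
MemoryStream memoryStream = new MemoryStream();
using (FileStream fileStream = new FileStream(fullPath, FileMode.Open, FileAccess.Read))
using (Image image = Image.FromStream(fileStream))
{
    // Re-encode the image as JPEG into the memory stream
    image.Save(memoryStream, ImageFormat.Jpeg);
}
// The file stream is closed when the using block ends
memoryStream.Position = 0; // rewind before reading the stream back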

Upload file with metadata and check in to SharePoint folder using Client Object Model

Hi, I'm trying to upload a file with metadata to SharePoint 2010 using the client API, and to check in the file after I'm done. Below is my code:
public void UploadDocument(SharePointFolder folder, String filename, Boolean overwrite)
{
    var fileInfo = new FileInfo(filename);
    var targetLocation = String.Format("{0}{1}{2}", folder.ServerRelativeUrl,
        Path.AltDirectorySeparatorChar, fileInfo.Name);
    using (var fs = new FileStream(filename, FileMode.Open))
    {
        SPFile.SaveBinaryDirect(mClientContext, targetLocation, fs, overwrite);
    }
    // doesn't work
    SPFile newFile = mRootWeb.GetFileByServerRelativeUrl(targetLocation);
    mClientContext.Load(newFile);
    mClientContext.ExecuteQuery();
    //check out to make sure not to create multiple versions
    newFile.CheckOut();
    // use OverwriteCheckIn type to make sure not to create multiple versions
    newFile.CheckIn("test", CheckinType.OverwriteCheckIn);
    mClientContext.Load(newFile);
    mClientContext.ExecuteQuery();
    //SPFile uploadFile = mRootWeb.GetFileByServerRelativeUrl(targetLocation);
    //uploadFile.CheckOut();
    //uploadFile.CheckIn("SOME VERSION COMMENT I'D LIKE TO ADD", CheckinType.OverwriteCheckIn);
    //mClientContext.ExecuteQuery();
}
I'm able to upload the file, but I can't add any metadata and the file is left checked out. I want to add some metadata and check in the file after I'm done.
My SharePointFolder class has the serverRelativeUrl of the folder path to upload to. Any help greatly appreciated.
You need to set credentials on the client context before calling ExecuteQuery() and SaveBinaryDirect().
For example:
mClientContext.Credentials = new NetworkCredential("LoginID","LoginPW", "LoginDomain");
SPFile newFile = mRootWeb.GetFileByServerRelativeUrl(targetLocation);
mClientContext.Load(newFile);
mClientContext.ExecuteQuery();

Winrt StreamWriter & StorageFile does not completely Overwrite File

A quick search here yielded nothing. I have started using some rather roundabout ways to use StreamWriter in my WinRT application. Reading works well, but writing behaves differently. When I select a new file to write to, there is no problem: the file is created as I expect. When I choose to overwrite an existing file, it is only overwritten up to the point where my stream stops writing; if the original file was larger, its old contents remain past that point.
The code is as such:
public async void WriteFile(StorageFile selectedFileToSave)
{
    // At this point, selectedFileToSave is from the Save File picker so can be a new or existing file
    Encoding enc = new UTF8Encoding();
    Stream dotNetStream = await selectedFileToSave.OpenStreamForWriteAsync();
    StreamWriter writeStream = new StreamWriter(dotNetStream, enc);
    // Do writing here
    // Close
    writeStream.Write(Environment.NewLine);
    await writeStream.FlushAsync();
    await dotNetStream.FlushAsync();
}
Can anyone offer clues on what I could be missing? There are lots of functions missing in WinRT, so not really following ways to get around this
Alternatively you can set length of the stream to 0 with SetLength method before using StreamWriter:
var stream = await file.OpenStreamForWriteAsync();
stream.SetLength(0);
using (var writer = new StreamWriter(stream))
{
    writer.Write(text);
}
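The effect of SetLength(0) can be reproduced outside WinRT. The sketch below is an illustration under assumptions, not the WinRT API: it uses a desktop .NET FileStream opened with FileMode.OpenOrCreate as a stand-in for a stream opened over an existing file, since that mode also leaves the old bytes in place. The class name OverwriteDemo and the method Overwrite are hypothetical.

```csharp
using System.IO;
using System.Text;

public static class OverwriteDemo
{
    // Hypothetical helper: replaces the whole content of 'path' with 'text'.
    public static void Overwrite(string path, string text)
    {
        // FileMode.OpenOrCreate keeps any existing bytes, which is the
        // situation described in the question (assumed stand-in for WinRT).
        using (FileStream stream = new FileStream(path, FileMode.OpenOrCreate, FileAccess.Write))
        {
            // Truncate first, so nothing of the old content survives
            // past the end of the new, possibly shorter, content.
            stream.SetLength(0);

            // UTF8Encoding(false) writes no byte order mark.
            using (StreamWriter writer = new StreamWriter(stream, new UTF8Encoding(false)))
            {
                writer.Write(text);
            }
        }
    }
}
```

Without the SetLength(0) call, writing "short" over a longer file would leave the tail of the old content after the new text.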
Why not just use the helper methods in the FileIO class? You could call:
await FileIO.WriteTextAsync(selectedFileToSave, newTextContents);
If you really need a StreamWriter, first truncate the file by calling
await FileIO.WriteBytesAsync(selectedFileToSave, new byte[0]);
and then continue with your existing code.

How to Read a pre-built Text File in a Windows Phone Application

I've been trying to read a pre-built file with car maintenance tips; there's one tip on each line of my "Tips.txt" file. I've tried 4 or 5 different approaches, but it's not working: it compiles, but I get an exception. Here's what I've got:
using (IsolatedStorageFile store = IsolatedStorageFile.GetUserStoreForApplication())
{
    using (StreamReader sr = new StreamReader(store.OpenFile("Tips.txt", FileMode.Open, FileAccess.Read)))
    {
        string line;
        while ((line = sr.ReadLine()) != null)
        {
            (App.Current as App).MyTips.Insert(new DoubleNode(line));
        }
    }
}
I'm getting this "Operation not permitted on IsolatedStorageFileStream", from the info inside the 2nd using statement. I tried with the build action of my "Tips.txt" set to resource, and content, yet I get the same result.
Thanks in advance.
Since you've added it to your project directory, you can't read it using Isolated Storage methods. There are various ways you can load the file. One way would be to set the text file's build type to Resource, then read it in as a stream:
//Replace 'MyProject' with the name of your XAP/Project
Stream txtStream = Application.GetResourceStream(new Uri("/MyProject;component/myTextFile.txt",
    UriKind.Relative)).Stream;
using (StreamReader sr = new StreamReader(txtStream))
{
    //your code
}

Reading large csv files

Which is the most performant way to read a large csv file in .NET?
Using FileStream? or another class?
Thanks!
You can use the StreamReader returned by FileInfo.OpenText:
Dim file As New FileInfo("path\to\file")
Using reader As StreamReader = file.OpenText()
    While Not reader.EndOfStream
        Dim nextLine As String = reader.ReadLine()
        ProcessCsvLine(nextLine)
    End While
End Using
If you want to read it all into memory, a simple File.ReadAllText() will do just fine.
EDIT: If your file is indeed very large, then you can use the StreamReader class, see here for details. This approach is sometimes inevitable but should mostly be avoided for style reasons. See here for a more in-depth discussion.
The most efficient way of doing this is to take advantage of deferred execution in LINQ. You can create a simple LINQ-to-text function that reads one line at a time, works on it, and then continues. This is really helpful when the file is very large.
I would avoid the ReadBlock and ReadToEnd methods of the StreamReader class, since they read a large block of text at once or even the entire file. This ends up consuming more memory than reading one line at a time.
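// Extension methods must live in a static class; the class name here is arbitrary.
public static class StreamReaderExtensions
{
    public static IEnumerable<string> Lines(this StreamReader source)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        string line;
        // Yield one line per iteration, so only the current line is held in memory
        while ((line = source.ReadLine()) != null)
        {
            yield return line;
        }
    }
}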
Note that the function is an extension method of the StreamReader class. This means it can be used as follows:
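class Program
{
    static void Main(string[] args)
    {
        using (StreamReader streamReader = new StreamReader("TextFile.txt"))
        {
            var tokens = from line in streamReader.Lines()
                         let items = line.Split(',')
                         select String.Format("{0}{1}{2}",
                             items[1].PadRight(16),
                             items[2].PadRight(16),
                             items[3].PadRight(16));

            // The query is deferred: lines are only read as the loop asks for them,
            // so it must be enumerated while the reader is still open.
            foreach (string token in tokens)
                Console.WriteLine(token);
        }
    }
}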
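On .NET 4 or later, File.ReadLines already returns a lazily evaluated sequence of lines, so a similar query can be written without a custom extension method. The following is a sketch under that assumption. The class name CsvColumns and the method Column are hypothetical, and the plain Split(',') does not handle quoted fields that contain commas.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class CsvColumns
{
    // Hypothetical helper: lazily yields the value at 'index' from each line
    // of a comma-separated file, skipping lines with too few fields.
    public static IEnumerable<string> Column(string path, int index)
    {
        // File.ReadLines reads the file line by line as the sequence is
        // enumerated, instead of loading the whole file up front.
        return File.ReadLines(path)
                   .Select(line => line.Split(','))
                   .Where(items => items.Length > index)
                   .Select(items => items[index]);
    }
}
```

Nothing is read from disk until the caller enumerates the result, for example with a foreach loop, and the file is closed once enumeration finishes or the enumerator is disposed.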
I had very good experience with this library:
http://www.codeproject.com/KB/database/CsvReader.aspx
I am using this library for Csv reading. This is really nice to use
http://www.codeproject.com/Articles/11698/A-Portable-and-Efficient-Generic-Parser-for-Flat-F