Is there a way I can write to CSV faster? [duplicate] - vb.net

Could somebody please tell me why the following code is not working? The data is saved to the CSV file, but it is not separated into columns: everything ends up in the first cell of each row.
StringBuilder sb = new StringBuilder();
foreach (DataColumn col in dt.Columns)
{
sb.Append(col.ColumnName + ',');
}
sb.Remove(sb.Length - 1, 1);
sb.Append(Environment.NewLine);
foreach (DataRow row in dt.Rows)
{
for (int i = 0; i < dt.Columns.Count; i++)
{
sb.Append(row[i].ToString() + ",");
}
sb.Append(Environment.NewLine);
}
File.WriteAllText("test.csv", sb.ToString());
Thanks.

The following shorter version opens fine in Excel; maybe your issue was the trailing comma.
.net = 3.5
StringBuilder sb = new StringBuilder();
string[] columnNames = dt.Columns.Cast<DataColumn>().
Select(column => column.ColumnName).
ToArray();
sb.AppendLine(string.Join(",", columnNames));
foreach (DataRow row in dt.Rows)
{
string[] fields = row.ItemArray.Select(field => field.ToString()).
ToArray();
sb.AppendLine(string.Join(",", fields));
}
File.WriteAllText("test.csv", sb.ToString());
.net >= 4.0
And as Tim pointed out, if you are on .net>=4, you can make it even shorter:
StringBuilder sb = new StringBuilder();
IEnumerable<string> columnNames = dt.Columns.Cast<DataColumn>().
Select(column => column.ColumnName);
sb.AppendLine(string.Join(",", columnNames));
foreach (DataRow row in dt.Rows)
{
IEnumerable<string> fields = row.ItemArray.Select(field => field.ToString());
sb.AppendLine(string.Join(",", fields));
}
File.WriteAllText("test.csv", sb.ToString());
As suggested by Christian, if you want to escape special characters in fields, replace the loop block with:
foreach (DataRow row in dt.Rows)
{
IEnumerable<string> fields = row.ItemArray.Select(field =>
string.Concat("\"", field.ToString().Replace("\"", "\"\""), "\""));
sb.AppendLine(string.Join(",", fields));
}
One last suggestion: you could write the CSV content line by line instead of as a whole document, to avoid holding a big document in memory.
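A minimal sketch of the line-by-line idea, combined with the quote escaping shown above. The class name CsvStreamingSketch and its Quote helper are made up for this illustration; only standard System.Data and System.IO calls are used.

```csharp
using System;
using System.Data;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class CsvStreamingSketch
{
    // Wrap a field in quotes and double any embedded quotes.
    // Convert.ToString turns null and DBNull into an empty string.
    private static string Quote(object field)
    {
        return "\"" + Convert.ToString(field).Replace("\"", "\"\"") + "\"";
    }

    // Write the table to any TextWriter, one line at a time,
    // so the whole document never sits in memory as a single string.
    public static void WriteCsv(DataTable dt, TextWriter writer)
    {
        writer.WriteLine(string.Join(",",
            dt.Columns.Cast<DataColumn>().Select(c => Quote(c.ColumnName))));

        foreach (DataRow row in dt.Rows)
        {
            writer.WriteLine(string.Join(",", row.ItemArray.Select(f => Quote(f))));
        }
    }

    // Convenience overload that streams straight to a file.
    public static void WriteCsv(DataTable dt, string path)
    {
        using (var writer = new StreamWriter(path))
        {
            WriteCsv(dt, writer);
        }
    }
}
```

Calling CsvStreamingSketch.WriteCsv(dt, "test.csv") would replace the StringBuilder plus File.WriteAllText pair. The TextWriter overload is there mainly so the output can be checked against a StringWriter.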

I wrapped this up into an extension class, which allows you to call:
myDataTable.WriteToCsvFile("C:\\MyDataTable.csv");
on any DataTable.
public static class DataTableExtensions
{
public static void WriteToCsvFile(this DataTable dataTable, string filePath)
{
StringBuilder fileContent = new StringBuilder();
foreach (var col in dataTable.Columns)
{
fileContent.Append(col.ToString() + ",");
}
fileContent.Replace(",", System.Environment.NewLine, fileContent.Length - 1, 1);
foreach (DataRow dr in dataTable.Rows)
{
foreach (var column in dr.ItemArray)
{
fileContent.Append("\"" + column.ToString() + "\",");
}
fileContent.Replace(",", System.Environment.NewLine, fileContent.Length - 1, 1);
}
System.IO.File.WriteAllText(filePath, fileContent.ToString());
}
}

A new extension function based on Paul Grimshaw's answer. I cleaned it up and added the ability to handle unexpected data (empty data, embedded quotes, and commas in the headings).
It also returns a string which is more flexible. It returns Null if the table object does not contain any structure.
public static string ToCsv(this DataTable dataTable) {
StringBuilder sbData = new StringBuilder();
// Only return Null if there is no structure.
if (dataTable.Columns.Count == 0)
return null;
foreach (var col in dataTable.Columns) {
if (col == null)
sbData.Append(",");
else
sbData.Append("\"" + col.ToString().Replace("\"", "\"\"") + "\",");
}
sbData.Replace(",", System.Environment.NewLine, sbData.Length - 1, 1);
foreach (DataRow dr in dataTable.Rows) {
foreach (var column in dr.ItemArray) {
if (column == null)
sbData.Append(",");
else
sbData.Append("\"" + column.ToString().Replace("\"", "\"\"") + "\",");
}
sbData.Replace(",", System.Environment.NewLine, sbData.Length - 1, 1);
}
return sbData.ToString();
}
You call it as follows:
var csvData = dataTableObject.ToCsv();

If your calling code is referencing the System.Windows.Forms assembly, you may consider a radically different approach.
My strategy is to use the functions already provided by the framework to accomplish this in very few lines of code and without having to loop through columns and rows. What the code below does is programmatically create a DataGridView on the fly and set the DataGridView.DataSource to the DataTable. Next, I programmatically select all the cells (including the header) in the DataGridView and call DataGridView.GetClipboardContent(), placing the results into the Windows Clipboard. Then, I 'paste' the contents of the clipboard into a call to File.WriteAllText(), making sure to specify the formatting of the 'paste' as TextDataFormat.CommaSeparatedValue.
Here is the code:
public static void DataTableToCSV(DataTable Table, string Filename)
{
using(DataGridView dataGrid = new DataGridView())
{
// Save the current state of the clipboard so we can restore it after we are done
IDataObject objectSave = Clipboard.GetDataObject();
// Set the DataSource
dataGrid.DataSource = Table;
// Choose whether to write header. Use EnableWithoutHeaderText instead to omit header.
dataGrid.ClipboardCopyMode = DataGridViewClipboardCopyMode.EnableAlwaysIncludeHeaderText;
// Select all the cells
dataGrid.SelectAll();
// Copy (set clipboard)
Clipboard.SetDataObject(dataGrid.GetClipboardContent());
// Paste (get the clipboard and serialize it to a file)
File.WriteAllText(Filename,Clipboard.GetText(TextDataFormat.CommaSeparatedValue));
// Restore the current state of the clipboard so the effect is seamless
if(objectSave != null) // If we try to set the Clipboard to an object that is null, it will throw...
{
Clipboard.SetDataObject(objectSave);
}
}
}
Notice that I also preserve the contents of the clipboard before I begin and restore them once I'm done, so the user does not get a bunch of unexpected garbage the next time they paste. The main caveats to this approach are: 1) your class has to reference System.Windows.Forms, which may not be the case in a data abstraction layer; 2) this was written against the .NET 4.5 framework (DataGridView itself has shipped with Windows Forms since .NET 2.0, so older targets should also work); and 3) the method will fail if the clipboard is being used by another process.
Anyway, this approach may not be right for your situation, but it is interesting nonetheless, and can be another tool in your toolbox.

I did this recently but included double quotes around my values.
For example, change these two lines:
sb.Append("\"" + col.ColumnName + "\",");
...
sb.Append("\"" + row[i].ToString() + "\",");

Try changing sb.Append(Environment.NewLine); to sb.AppendLine();.
StringBuilder sb = new StringBuilder();
foreach (DataColumn col in dt.Columns)
{
sb.Append(col.ColumnName + ',');
}
sb.Remove(sb.Length - 1, 1);
sb.AppendLine();
foreach (DataRow row in dt.Rows)
{
for (int i = 0; i < dt.Columns.Count; i++)
{
sb.Append(row[i].ToString() + ",");
}
sb.AppendLine();
}
File.WriteAllText("test.csv", sb.ToString());

4 lines of code:
public static string ToCSV(DataTable tbl)
{
StringBuilder strb = new StringBuilder();
//column headers
strb.AppendLine(string.Join(",", tbl.Columns.Cast<DataColumn>()
.Select(s => "\"" + s.ColumnName + "\"")));
//rows
tbl.AsEnumerable().Select(s => strb.AppendLine(
string.Join(",", s.ItemArray.Select(
i => "\"" + i.ToString() + "\"")))).ToList();
return strb.ToString();
}
Note that the ToList() at the end is important; I need something to force an expression evaluation. If I was code golfing, I could use Min() instead.
Also note that the result will have a newline at the end because of the last call to AppendLine(). You may not want this. You can simply call TrimEnd() to remove it.

Try to put ; instead of ,
Hope it helps

The error is the list separator.
Instead of writing sb.Append(something... + ',') you should put something like sb.Append(something... + System.Globalization.CultureInfo.CurrentCulture.TextInfo.ListSeparator);
You must use the list separator character configured in your operating system (as in the example above), or the list separator of the client machine where the file is going to be viewed. Another option would be to configure it in the app.config or web.config as a parameter of your application.
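As a small sketch of that suggestion, the helper below joins fields with the list separator of a given culture. The names ListSeparatorSketch and JoinFields are invented for this example; TextInfo.ListSeparator is the same property used above.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, not part of the original answer.
public static class ListSeparatorSketch
{
    // Join fields with the list separator of the given culture,
    // falling back to the current culture when none is supplied.
    public static string JoinFields(IEnumerable<string> fields, CultureInfo culture = null)
    {
        string separator = (culture ?? CultureInfo.CurrentCulture).TextInfo.ListSeparator;
        return string.Join(separator, fields);
    }
}
```

Fields that themselves contain the separator would still need quoting, as discussed in the other answers.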

To write to a file, I think the following method is the most efficient and straightforward: (You can add quotes if you want)
public static void WriteCsv(DataTable dt, string path)
{
using (var writer = new StreamWriter(path)) {
writer.WriteLine(string.Join(",", dt.Columns.Cast<DataColumn>().Select(dc => dc.ColumnName)));
foreach (DataRow row in dt.Rows) {
writer.WriteLine(string.Join(",", row.ItemArray));
}
}
}

Read this and this?
A better implementation would be
var result = new StringBuilder();
for (int i = 0; i < table.Columns.Count; i++)
{
result.Append(table.Columns[i].ColumnName);
result.Append(i == table.Columns.Count - 1 ? "\n" : ",");
}
foreach (DataRow row in table.Rows)
{
for (int i = 0; i < table.Columns.Count; i++)
{
result.Append(row[i].ToString());
result.Append(i == table.Columns.Count - 1 ? "\n" : ",");
}
}
File.WriteAllText("test.csv", result.ToString());

To mimic Excel CSV:
public static string Convert(DataTable dt)
{
StringBuilder sb = new StringBuilder();
IEnumerable<string> columnNames = dt.Columns.Cast<DataColumn>().
Select(column => column.ColumnName);
sb.AppendLine(string.Join(",", columnNames));
foreach (DataRow row in dt.Rows)
{
IEnumerable<string> fields = row.ItemArray.Select(field =>
{
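string s = field.ToString();
// Quote only when needed: the field contains a comma, a quote, or a line break.
if (s.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0)
s = string.Concat("\"", s.Replace("\"", "\"\""), "\"");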
return s;
});
sb.AppendLine(string.Join(",", fields));
}
return sb.ToString().Trim();
}

Here is an enhancement to vc-74's post that handles commas the same way Excel does. Excel puts quotes around data if the data has a comma but doesn't quote if the data doesn't have a comma.
public static string ToCsv(this DataTable inDataTable, bool inIncludeHeaders = true)
{
var builder = new StringBuilder();
var columnNames = inDataTable.Columns.Cast<DataColumn>().Select(column => column.ColumnName);
if (inIncludeHeaders)
builder.AppendLine(string.Join(",", columnNames));
foreach (DataRow row in inDataTable.Rows)
{
var fields = row.ItemArray.Select(field => field.ToString().WrapInQuotesIfContains(","));
builder.AppendLine(string.Join(",", fields));
}
return builder.ToString();
}
public static string WrapInQuotesIfContains(this string inString, string inSearchString)
{
if (inString.Contains(inSearchString))
return "\"" + inString+ "\"";
return inString;
}

Here is my solution, based on previous answers by Paul Grimshaw and Anthony VO.
I've submitted the code in a C# project on Github.
My main contribution is to avoid explicitly creating and manipulating a StringBuilder and to work only with IEnumerable instead. This avoids allocating a big buffer in memory.
public static class Util
{
public static string EscapeQuotes(this string self) {
return self?.Replace("\"", "\"\"") ?? "";
}
public static string Surround(this string self, string before, string after) {
return $"{before}{self}{after}";
}
public static string Quoted(this string self, string quotes = "\"") {
return self.Surround(quotes, quotes);
}
public static string QuotedCSVFieldIfNecessary(this string self)
{
return (self == null) ? "" : (self.Contains('"') || self.Contains('\r') || self.Contains('\n') || self.Contains(',')) ? self.Quoted() : self;
}
public static string ToCsvField(this string self) {
return self.EscapeQuotes().QuotedCSVFieldIfNecessary();
}
public static string ToCsvRow(this IEnumerable<string> self){
return string.Join(",", self.Select(ToCsvField));
}
public static IEnumerable<string> ToCsvRows(this DataTable self) {
yield return self.Columns.OfType<object>().Select(c => c.ToString()).ToCsvRow();
foreach (var dr in self.Rows.OfType<DataRow>())
yield return dr.ItemArray.Select(item => item.ToString()).ToCsvRow();
}
public static void ToCsvFile(this DataTable self, string path) {
File.WriteAllLines(path, self.ToCsvRows());
}
}
This approach combines nicely with converting IEnumerable to DataTable as asked here.

StringBuilder sb = new StringBuilder();
SaveFileDialog fileSave = new SaveFileDialog();
IEnumerable<string> columnNames = tbCifSil.Columns.Cast<DataColumn>().
Select(column => column.ColumnName);
sb.AppendLine(string.Join(",", columnNames));
foreach (DataRow row in tbCifSil.Rows)
{
IEnumerable<string> fields = row.ItemArray.Select(field =>string.Concat("\"", field.ToString().Replace("\"", "\"\""), "\""));
sb.AppendLine(string.Join(",", fields));
}
fileSave.ShowDialog();
File.WriteAllText(fileSave.FileName, sb.ToString());

public void ExportToCSV(DataTable dtDataTable, string strFilePath)
{
StreamWriter sw = new StreamWriter(strFilePath, false);
//headers
for (int i = 0; i < dtDataTable.Columns.Count; i++)
{
sw.Write(dtDataTable.Columns[i].ToString().Trim());
if (i < dtDataTable.Columns.Count - 1)
{
sw.Write(",");
}
}
sw.Write(sw.NewLine);
foreach (DataRow dr in dtDataTable.Rows)
{
for (int i = 0; i < dtDataTable.Columns.Count; i++)
{
if (!Convert.IsDBNull(dr[i]))
{
string value = dr[i].ToString().Trim();
if (value.Contains(','))
{
value = String.Format("\"{0}\"", value);
sw.Write(value);
}
else
{
sw.Write(dr[i].ToString().Trim());
}
}
if (i < dtDataTable.Columns.Count - 1)
{
sw.Write(",");
}
}
sw.Write(sw.NewLine);
}
sw.Close();
}

Possibly the easiest way is to use:
https://github.com/ukushu/DataExporter
especially if the cells of your DataTable contain \r\n characters or the separator symbol. Almost none of the other answers will work with such cells.
All you need to write is the following code:
Csv csv = new Csv("\t");//Needed delimiter
var columnNames = dt.Columns.Cast<DataColumn>().
Select(column => column.ColumnName).ToArray();
csv.AddRow(columnNames);
foreach (DataRow row in dt.Rows)
{
var fields = row.ItemArray.Select(field => field.ToString()).ToArray();
csv.AddRow(fields);
}
csv.Save();

Most existing answers can easily cause OutOfMemoryException, so I decided to write my own answer.
DON'T DO THIS:
Using a DataSet plus a StringBuilder makes the data occupy memory three times over at once:
Load All Data into DataSet
Copy all data into StringBuilder
Copy the data to string using StringBuilder.ToString();
Instead you should write each row to a FileStream separately. There is no need to create the whole CSV in memory.
Even better, use a DataReader instead of a DataSet. That way you can read billions of records from the database one by one and write them to a file one by one.
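A rough sketch of the row-by-row approach, written against the IDataReader interface so that it works with any provider's reader. The class name DataReaderCsvSketch and the Escape helper are invented for this illustration, and the quoting rule (quote only fields containing a comma, quote, or line break) is an assumption rather than part of the answer.

```csharp
using System;
using System.Data;
using System.IO;

// Hypothetical helper, not part of the original answer.
public static class DataReaderCsvSketch
{
    // Quote a value only when it contains a comma, a quote, or a line break.
    private static string Escape(object value)
    {
        string s = Convert.ToString(value);   // DBNull becomes an empty string
        bool needsQuotes = s.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + s.Replace("\"", "\"\"") + "\"" : s;
    }

    // Stream every record from the reader to the writer; only one row
    // is held in memory at a time.
    public static void WriteCsv(IDataReader reader, TextWriter writer)
    {
        var fields = new string[reader.FieldCount];

        for (int i = 0; i < fields.Length; i++)
            fields[i] = Escape(reader.GetName(i));
        writer.WriteLine(string.Join(",", fields));

        while (reader.Read())
        {
            for (int i = 0; i < fields.Length; i++)
                fields[i] = Escape(reader.GetValue(i));
            writer.WriteLine(string.Join(",", fields));
        }
    }
}
```

With a database, the reader would come from the command's ExecuteReader() and the writer would be a StreamWriter over the target file. For a quick check without a database, DataTable.CreateDataReader() supplies a compatible reader.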
If you don't mind using an external library for CSV, I can recommend the most popular CsvHelper, which has no dependencies.
using (var writer = new StreamWriter("test.csv"))
using (var csv = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
// Header row
foreach (DataColumn dc in dt.Columns)
{
csv.WriteField(dc.ColumnName);
}
csv.NextRecord();
// Data rows, written to the file one record at a time
foreach (DataRow dr in dt.Rows)
{
foreach (DataColumn dc in dt.Columns)
{
csv.WriteField(dr[dc]);
}
csv.NextRecord();
}
}

In case anyone else stumbles on this, I was using File.ReadAllText to get CSV data and then I modified it and wrote it back with File.WriteAllText. The \r\n CRLFs were fine but the \t tabs were ignored when Excel opened it. (All solutions in this thread so far use a comma delimiter but that doesn't matter.) Notepad showed the same format in the resulting file as in the source. A Diff even showed the files as identical. But I got a clue when I opened the file in Visual Studio with a binary editor. The source file was Unicode but the target was ASCII. To fix, I modified both ReadAllText and WriteAllText with third argument set as System.Text.Encoding.Unicode, and from there Excel was able to open the updated file.
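A minimal sketch of that fix, assuming the file is UTF-16 (which is what System.Text.Encoding.Unicode means in .NET). The class and method names are invented for this example.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, not part of the original answer.
public static class EncodingRoundTripSketch
{
    // Read and write with the same explicit encoding so the rewritten file
    // keeps the byte-order mark and two-byte characters of the source.
    public static void RewriteTabDelimited(string sourcePath, string targetPath)
    {
        string text = File.ReadAllText(sourcePath, Encoding.Unicode);

        // ... modify the text here ...

        File.WriteAllText(targetPath, text, Encoding.Unicode);
    }
}
```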

Related

Automating Import of large csv files(~3gb) with C#

I am a bit new to this, but my goal is to import the data from a CSV file into a SQL table and add extra values to each row: the file name and the date. I was able to accomplish this using Entity Framework and iterating through each row of the file, but with files of this size it takes too long to complete.
I am looking for a faster way to do this import. I was looking into using CsvHelper with SqlBulkCopy, but was not sure whether there is a way to pass in the additional values needed for each row.
public void Process(string filePath)
{
InputFilePath = filePath;
DateTime fileDate = DateTime.Today;
string[] fPath = Directory.GetFiles(InputFilePath);
foreach (var file in fPath)
{
string fileName = Path.GetFileName(file);
char[] delimiter = new char[] { '\t' };
try
{
using (var db = new DatabaseName())
{
using (var reader = new StreamReader(file))
{
string line;
int count = 0;
int sCount = 0;
reader.ReadLine();
reader.ReadLine();
while ((line = reader.ReadLine()) != null)
{
count++;
string[] row = line.Split(delimiter);
var rowload = new ImportDestinationTable()
{
ImportCol0 = row[0],
ImportCol1 = row[1],
ImportCol2 = TryParseNullable(row[2]),
ImportCol3 = row[3],
ImportCol4 = row[4],
ImportCol5 = row[5],
IMPORT_FILE_NM = fileName,
IMPORT_DT = fileDate
};
db.ImportDestinationTable.Add(rowload);
if (count > 100)
{
db.SaveChanges();
count = 0;
}
}
db.SaveChanges();
//ReadLine();
}
}
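}
catch (Exception ex)
{
// The original snippet omitted its catch block; logging here is a placeholder.
Console.WriteLine("Failed to import " + fileName + ": " + ex.Message);
}
}
}
static int? TryParseNullable(string val)
{
int outValue;
return int.TryParse(val, out outValue) ? (int?)outValue : null;
}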
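One way the extra per-row values could be carried into a bulk load is to stage the parsed lines in a DataTable that already has the file name and date as ordinary columns. The sketch below is untested against a real database; the column names are taken from the snippet above, while the class name ImportBatchSketch and the BuildBatch method are invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper, not part of the original question.
public static class ImportBatchSketch
{
    // Build an in-memory batch whose last two columns carry the values
    // that are the same for every row of one file.
    public static DataTable BuildBatch(IEnumerable<string> lines, string fileName, DateTime fileDate)
    {
        var table = new DataTable();
        table.Columns.Add("ImportCol0", typeof(string));
        table.Columns.Add("ImportCol1", typeof(string));
        table.Columns.Add("ImportCol2", typeof(int));
        table.Columns.Add("ImportCol3", typeof(string));
        table.Columns.Add("ImportCol4", typeof(string));
        table.Columns.Add("ImportCol5", typeof(string));
        table.Columns.Add("IMPORT_FILE_NM", typeof(string));
        table.Columns.Add("IMPORT_DT", typeof(DateTime));

        foreach (string line in lines)
        {
            string[] row = line.Split('\t');

            // Same idea as TryParseNullable: unparsable numbers become NULL.
            int parsed;
            object col2 = int.TryParse(row[2], out parsed) ? (object)parsed : DBNull.Value;

            table.Rows.Add(row[0], row[1], col2, row[3], row[4], row[5], fileName, fileDate);
        }
        return table;
    }
}
```

A table built this way can be handed to SqlBulkCopy.WriteToServer(DataTable). For a file of several gigabytes, the lines would presumably be fed in batches rather than all at once, so that memory use stays bounded.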

INSERT INTO MSSQL from textfile contains NULL values on INTEGER

I have problems inserting (bulk-loading) NULL values from a text file into MSSQL.
When I replace the NULL value with a number, it works without a problem.
Two columns are set to ALLOW NULLS:
PublicationCaption and PublicationNumber
Here is an example of the text file:
1#DI#Dagens Industri#435#358#2016-10-19
2#DN#Dagens Nyheter#NULL#359#2016-10-19
I think there is some problem with the foreach loop in code where I need add something to make this work.
Here is the code I'm using
public static DataTable Publication()
{
DataTable dtPublication = new DataTable();
dtPublication.Columns.AddRange(new DataColumn[6] { new DataColumn("ID", System.Type.GetType("System.Int32")),
new DataColumn("PublicationCode", System.Type.GetType("System.String")),
new DataColumn("PublicationCaption",System.Type.GetType("System.String")),
new DataColumn("PublicationNumber", System.Type.GetType("System.Int32")),
new DataColumn("ProductNumber", System.Type.GetType("System.Int32")),
new DataColumn("CreatedDate", System.Type.GetType("System.DateTime")),
});
for (int i = 0; i < dtPublication.Columns.Count; i++)
{
dtPublication.Columns[i].AllowDBNull = true;
}
string txtData = File.ReadAllText(@"C:\Publication2.txt", System.Text.Encoding.Default);
foreach (string row in txtData.Split('\n'))
{
if (!string.IsNullOrEmpty(row))
{
dtPublication.Rows.Add();
int i = 0;
foreach (string cell in row.Split('#'))
{
dtPublication.Rows[dtPublication.Rows.Count - 1][i] = cell;
i++;
}
}
}
return dtPublication;
}
I'm getting the following error when debugging: "The input string had an incorrect format. Unable to store in the PublicationNumber column. Type Int32 is expected."
I would appreciate any advice or help in solving this problem.
Thanks for your time.
The database doesn't know that the "NULL" string you try to insert actually means a null value. To fix this, change the "NULL" string to DBNull.Value:
if (cell == "NULL")
dtPublication.Rows[dtPublication.Rows.Count - 1][i] = DBNull.Value;
else
dtPublication.Rows[dtPublication.Rows.Count - 1][i] = cell;
There is no feature that translates a string "NULL" to a nullable field in a DataTable. You have to implement it yourself:
object value = DBNull.Value;
if(!"NULL".Equals(cell, StringComparison.InvariantCultureIgnoreCase))
value = cell;
dtPublication.Rows[dtPublication.Rows.Count - 1][i] = value;
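Put together, the loading loop might look like the sketch below. It takes the file contents as a string so it can be tried without the file on disk. The class name PublicationParserSketch, the Trim call (which removes the trailing \r left by Split('\n') on Windows line endings), and the invariant Locale are additions for this illustration, not part of the answers above.

```csharp
using System;
using System.Data;
using System.Globalization;

// Hypothetical helper, not part of the original answers.
public static class PublicationParserSketch
{
    public static DataTable Parse(string text)
    {
        // Invariant locale so "2016-10-19" converts the same way on every machine.
        var table = new DataTable { Locale = CultureInfo.InvariantCulture };
        table.Columns.Add("ID", typeof(int));
        table.Columns.Add("PublicationCode", typeof(string));
        table.Columns.Add("PublicationCaption", typeof(string));
        table.Columns.Add("PublicationNumber", typeof(int));
        table.Columns.Add("ProductNumber", typeof(int));
        table.Columns.Add("CreatedDate", typeof(DateTime));

        foreach (string rawLine in text.Split('\n'))
        {
            string line = rawLine.Trim();
            if (line.Length == 0) continue;

            DataRow row = table.NewRow();
            string[] cells = line.Split('#');
            for (int i = 0; i < cells.Length && i < table.Columns.Count; i++)
            {
                // The literal text NULL becomes a real database null;
                // everything else is converted by the column's data type.
                row[i] = cells[i].Equals("NULL", StringComparison.OrdinalIgnoreCase)
                    ? (object)DBNull.Value
                    : cells[i];
            }
            table.Rows.Add(row);
        }
        return table;
    }
}
```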

JavaFX troubles with removing items from ArrayList

I have two TableViews (tableProduct, tableProduct2). The first is populated from a database; the second is populated with items the user selects from the first (the addMeal method, which also copies them into a plain ArrayList). After adding or deleting a few objects, the user can save the current data from the second table to a txt file. It seems to work fine at the beginning, but the problem shows up somewhat randomly: I add a few items, save, delete a few items, save, and everything is fine. Then, after a few such actions, one last object stays in the txt file even though the TableView is empty. I can't do anything to remove it, and I get no errors.
Any ideas what's going on?
public void addMeal() {
productData selection = tableProduct.getSelectionModel().getSelectedItem();
if (selection != null) {
tableProduct2.getItems().add(new productData(selection.getName() + "(" + Float.parseFloat(weightField.getText()) + "g)", String.valueOf(Float.parseFloat(selection.getKcal())*(Float.parseFloat(weightField.getText())/100)), String.valueOf(Float.parseFloat(selection.getProtein())*(Float.parseFloat(weightField.getText())/100)), String.valueOf(Float.parseFloat(selection.getCarb())*(Float.parseFloat(weightField.getText())/100)), String.valueOf(Float.parseFloat(selection.getFat())*(Float.parseFloat(weightField.getText())/100))));
productlist.add(new productSimpleData(selection.getName() + "(" + Float.parseFloat(weightField.getText()) + "g)", String.valueOf(Float.parseFloat(selection.getKcal())*(Float.parseFloat(weightField.getText())/100)), String.valueOf(Float.parseFloat(selection.getProtein())*(Float.parseFloat(weightField.getText())/100)), String.valueOf(Float.parseFloat(selection.getCarb())*(Float.parseFloat(weightField.getText())/100)), String.valueOf(Float.parseFloat(selection.getFat())*(Float.parseFloat(weightField.getText())/100))));
}
updateSummary();
}
public void deleteMeal() {
productData selection = tableProduct2.getSelectionModel().getSelectedItem();
if(selection != null){
tableProduct2.getItems().remove(selection);
Iterator<productSimpleData> iterator = productlist.iterator();
productSimpleData psd = iterator.next();
if(psd.getName().equals(String.valueOf(selection.getName()))) {
iterator.remove();
}
}
updateSummary();
}
public void save() throws IOException {
File file = new File("C:\\Users\\Maciek\\Desktop\\test1.txt");
if(file.exists()){
file.delete();
}
FileWriter fw = null;
BufferedWriter bw = null;
try {
fw = new FileWriter(file);
bw = new BufferedWriter(fw);
Iterator iterator;
iterator = productlist.iterator();
while (iterator.hasNext()) {
productSimpleData pd;
pd = (productSimpleData) iterator.next();
bw.write(pd.toString());
bw.newLine();
}
} catch (IOException e) {
e.printStackTrace();
} finally {
bw.flush();
bw.close();
}
}
And yeah, I realize the addMeal method inside the if statement looks scary, but don't mind it; that part is all right after all...
You only ever check the first item in productlist to determine whether the item should be removed. Since you do not seem to write to that list anywhere without making a matching modification to the items of tableProduct2, both lists stay in the same order and you can simply remove by index from each.
public void deleteMeal() {
int selectedIndex = tableProduct2.getSelectionModel().getSelectedIndex();
if(selectedIndex >= 0) {
tableProduct2.getItems().remove(selectedIndex);
productlist.remove(selectedIndex);
}
updateSummary();
}
This way you also prevent issues when there are two equal items in the list, which could lead to the first one being deleted when the second one is selected.
and yeah, I realize addMethod [...] looks scary
Yes, it does, so it's time to rewrite this:
Change the properties in productData and productSimpleData to float and don't convert the data to String until you need it as String.
if (selection != null) {
float weight = Float.parseFloat(weightField.getText());
float weight100 = weight / 100;
float calories = Float.parseFloat(selection.getKcal())*weight100;
float protein = Float.parseFloat(selection.getProtein())*weight100;
float carb = Float.parseFloat(selection.getCarb())*weight100;
float fat = Float.parseFloat(selection.getFat())*weight100;
productData product = new productData(
selection.getName() + "(" + weight + "g)",
calories,
protein,
carb,
fat);
productlist.add(new productSimpleData(product.getName(), calories, protein, carb, fat));
tableProduct2.getItems().add(product);
}
Also, this kind of loop can be rewritten as an enhanced for loop:
Iterator iterator;
iterator = productlist.iterator();
while (iterator.hasNext()) {
productSimpleData pd;
pd = (productSimpleData) iterator.next();
bw.write(pd.toString());
bw.newLine();
}
Assuming you've declared productlist as List<productSimpleData> or a subtype, you can just do
for (productSimpleData pd : productlist) {
bw.write(pd.toString());
bw.newLine();
}
Furthermore, you could rely on try-with-resources to close the writers for you:
try (FileWriter fw = new FileWriter(file);
BufferedWriter bw = new BufferedWriter(fw)){
...
} catch (IOException e) {
e.printStackTrace();
}
Also, there is no need to delete the file, since Java overwrites the file by default and only appends data if you specify this in an additional constructor parameter for FileWriter.

Using C# remove unnecessary “TABLE_NAME” from Excel worksheets

I am uploading an Excel file that contains unnecessary tables such as "_xlnm#Print_Titles", which I need to remove. This is my method, but it does not remove them. Can anyone tell me why?
static string[] GetExcelSheetNames(string connectionString)
{
OleDbConnection con = null;
DataTable dt = null;
con = new OleDbConnection(connectionString);
con.Open();
dt = con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if ((dt == null) )
{
return null;
}
String[] excelSheetNames = new String[dt.Rows.Count];
int i = 0;
foreach (DataRow row in dt.Rows)
{
excelSheetNames[i] = row["TABLE_NAME"].ToString();
if ((excelSheetNames[i].Contains("_xlnm#Print_Titles") || (excelSheetNames[i].Contains("Print_Titles"))))
{
if (true)
{
row.Table.Rows.Remove(row);
dt.AcceptChanges();
}
}
i++;
}
return excelSheetNames;
}
Instead of removing items in the foreach loop, we'll find them and add them to a list, then we'll go through that list and remove them from your data table.
static string[] GetExcelSheetNames(string connectionString)
{
OleDbConnection con = null;
DataTable dt = null;
con = new OleDbConnection(connectionString);
con.Open();
dt = con.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if ((dt == null))
{
return null;
}
String[] excelSheetNames = new String[dt.Rows.Count];
var rowsToRemove = new List<DataRow>();
for (int i = 0; i < dt.Rows.Count; i++)
{
var row = dt.Rows[i];
excelSheetNames[i] = row["TABLE_NAME"].ToString();
if ((excelSheetNames[i].Contains("_xlnm#Print_Titles") || (excelSheetNames[i].Contains("Print_Titles"))))
{
rowsToRemove.Add(dt.Rows[i]);
}
}
foreach (var dataRow in rowsToRemove)
{
dt.Rows.Remove(dataRow);
}
return excelSheetNames;
}
It turns out that those _xlnm and "$" entries are sheets that users normally shouldn't access.
You can solve this in 2 ways.
Ignore them
Drop them
The former is highly recommended.
To do this you need to use the following code:
if (!dt.Rows[i]["Table_Name"].ToString().Contains("FilterDatabase") && !dt.Rows[i]["Table_Name"].ToString().EndsWith("$'"))
{
}
You can either use .Contains() and/or .EndsWith() to filter out those sheets.
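As a sketch of the "ignore them" option, the names can be filtered after they are read instead of removing rows from the schema table. The markers "Print_Titles" and "FilterDatabase" are the ones used in the snippets above; the class and method names are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answers.
public static class SheetNameFilterSketch
{
    // Keep only the names that do not look like Excel's internal ranges.
    public static string[] KeepUserSheets(IEnumerable<string> tableNames)
    {
        return tableNames
            .Where(name => !name.Contains("Print_Titles")
                        && !name.Contains("FilterDatabase"))
            .ToArray();
    }
}
```

In GetExcelSheetNames, the input would be the TABLE_NAME value of each row, for example dt.Rows.Cast<DataRow>().Select(r => r["TABLE_NAME"].ToString()).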

Export LinqPAD results to Excel file without the rowcount metadata

I was trying to write to an excel file just the resultset from a query but I keep getting the header column with the row count, which is messing up the subsequent data processing I need to do. I could go in the exported file and delete the first row, but it would be much better if I could export a dataset without the header row.
Here's my hack, I wonder if anyone has a better way to do it. I am taking the generated html and using regex to yank out the header row:
public string DumpToHtmlString<T>(T objectToSerialize, string filePath )
{
string strHTML = "", outpuWithoutHeader ="";
try
{
var writer = LINQPad.Util.CreateXhtmlWriter(true);
writer.Write(objectToSerialize);
strHTML = writer.ToString();
outpuWithoutHeader = Regex.Replace(strHTML, "<tr><td class=\"typeheader\"((\\s*?.*?)*?)<\\/(tr|TR)>", "", RegexOptions.Multiline);
System.IO.File.WriteAllText(filePath, outpuWithoutHeader );
}
catch (Exception exc)
{
Debug.Assert(false, "Investigate why ?" + exc);
}
return outpuWithoutHeader;
}
Is the objectToSerialize an IEnumerable? If so, the LINQPad beta has a WriteCsv method which is designed to create Excel-friendly CSV files:
Util.WriteCsv(data, @"c:\temp\results.csv");
Otherwise, you're safer using the LINQ-to-XML DOM for modifying the output rather than regex. The following code illustrates how to remove formatting from LINQPad output; you can adapt it to remove headings and totals as well:
XDocument doc = XDocument.Load (...);
XNamespace xns = "http://www.w3.org/1999/xhtml";
doc.Descendants (xns + "script").Remove ();
doc.Descendants (xns + "span").Where (el => (string)el.Attribute ("class") == "typeglyph").Remove ();
doc.Descendants ().Attributes ("style").Where (a => (string)a == "display:none").Remove ();
doc.Descendants (xns + "style").Remove ();
doc.Descendants (xns + "tr").Where (tr => tr.Elements ().Any (td => (string)td.Attribute ("class") == "typeheader")).Remove ();
doc.Descendants (xns + "i").Where (e => e.Value == "null").Remove ();
foreach (XElement anchor in doc.Descendants (xns + "a").ToArray ())
anchor.ReplaceWith (anchor.Nodes ());
var presenters = doc.Descendants (xns + "table")
.Where (el => (string)el.Attribute ("class") == "headingpresenter")
.Where (e => e.Elements ().Count () == 2)
.ToArray ();
foreach (var p in presenters)
{
var heading = p.Elements ().First ().Elements ();
var content = p.Elements ().Skip (1).First ().Elements ();
if (stripFormatting)
p.ReplaceWith (heading, new XElement (xns + "p", content));
else
p.ReplaceWith (
new XElement (xns + "br"),
new XElement (xns + "span", new XAttribute ("style", "color: green; font-weight:bold; font-size: 110%;"), heading),
content);
}
// Excel centre-aligns th even if the style says otherwise. So we replace them with td elements.
foreach (var th in doc.Descendants (xns + "th"))
{
th.Name = xns + "td";
if (!stripFormatting && th.Attribute ("style") == null)
th.Add (new XAttribute ("style", "font-weight: bold; background-color: #ddd;"));
}
string finalResult = doc.ToString().Replace ("Ξ", "").Replace ("▪", "");