Convert xls File to csv, but extra rows added? - vb.net

So, I am trying to convert some xls files to csv, and everything works great except for one part. The SaveAs function in the Excel interop seems to export all of the rows, including blank ones. I can see these rows when I look at the file in Notepad: all of the rows I expect, 15 rows with two single quotes, then the rest are just blank. I then have a stored procedure that takes this csv and imports it into the desired table. This works on spreadsheets that have been manually converted to csv (e.g. open, File -> Save As, etc.).
Here is the line of code I am using for SaveAs. I have tried xlCSV, xlCSVWindows, and xlCSVMSDOS as the file format, but they all do the same thing.
wb.SaveAs(aFiles(i).Replace(".xls", "B.csv"), Excel.XlFileFormat.xlCSVMSDOS, , , , False) 'saves a copy of the spreadsheet as a csv
So, is there some additional step or setting I need to keep the extraneous rows out of the csv?
Note that if I open this newly created csv, and then click Save As, and choose csv, my procedure likes it again.

When you create a CSV from a Workbook, the CSV is generated from the sheet's UsedRange. The UsedRange expands as soon as formatting is applied to a cell, even one without any contents, which is why you are getting blank rows. (You can also get blank columns for the same reason.)
When you open the generated CSV all of those no-content cells no longer contribute to the UsedRange due to having no content or formatting (since only values are saved in CSVs).
You can correct this issue by updating your used range before the save. Here's a brief sub I wrote in VBA that would do the trick. This code would make you lose all formatting, but I figured that wasn't important since you're saving to a CSV anyway. I'll leave the conversion to VB.Net up to you.
Sub CorrectUsedRange()
    Dim values As Variant
    Dim usedRangeAddress As String

    'Get the UsedRange address before deleting the cells
    usedRangeAddress = ActiveSheet.UsedRange.Address
    'Store the cell values in an array
    values = ActiveSheet.UsedRange.Value
    'Delete all cells in the sheet (this also discards formatting)
    ActiveSheet.Cells.Delete
    'Restore the values to their original locations
    ActiveSheet.Range(usedRangeAddress).Value = values
End Sub
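If changing the workbook before the save is not an option, the blank rows and columns described above can also be stripped from the CSV after export. The following is only a minimal sketch in Python (not VB.Net), assuming the exported file is small enough to hold in memory; the function names `trim_blank_edges` and `trim_csv_text` are made up for illustration.

```python
import csv
import io


def trim_blank_edges(rows):
    """Drop trailing all-empty rows, then trailing all-empty columns."""
    rows = [list(r) for r in rows]
    # Remove rows at the bottom whose cells are all empty or whitespace.
    while rows and all(cell.strip() == "" for cell in rows[-1]):
        rows.pop()
    # Find the last column that holds a value in any remaining row.
    width = 0
    for row in rows:
        for idx, cell in enumerate(row):
            if cell.strip() and idx + 1 > width:
                width = idx + 1
    return [row[:width] for row in rows]


def trim_csv_text(text):
    """Parse CSV text, trim the blank edges, and serialise it again."""
    rows = trim_blank_edges(csv.reader(io.StringIO(text)))
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical export: two blank columns and two blank rows at the edges.
    exported = "id,name,,\n1,Ann,,\n2,Bob,,\n,,,\n,,,\n"
    print(trim_csv_text(exported), end="")
```

Blank rows in the middle of the data are kept on purpose; only the trailing edges that an oversized UsedRange would add are removed.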

Tested your code with VBA and Excel 2007, and it works nicely.
However, I could partly reproduce the problem by formatting an empty cell below my data cells as bold. I then got empty single quotes in the csv. But this was also the case when I used SaveAs.
So, my suggestion would be to clear all non-data cells and then save your file. This way you can at least rule out this source of error.

I'm afraid that may not be enough. It seems there's an Excel bug that makes even deleting the non-data cells insufficient to prevent them from being written out as empty cells when saving as csv.
http://answers.microsoft.com/en-us/office/forum/office_2010-excel/excel-bug-save-as-csv-saves-previously-deleted/2da9a8b4-50c2-49fd-a998-6b342694681e

Another way, without a script: hit Ctrl+End. If that ends up in a row AFTER your real data, select the rows from the first one after your data down to at least the row it ends up on, right click, and choose "Clear Contents".

Related

Preventing excel from altering cell contents to take literal values

I have scripted out a module that reads data I paste in from a SQL dump and converts it into a data insert script for SQL. It is working great; the only problem is cells that contain items like:
11-20
Instead of filling my value as '11-20', it is converting it to '20-Nov'.
I have tried reading the cell through Text, Value, and Value2. Text comes the closest to right; the others do a math calculation that throws my overall sheet off even worse, namely with dates. I have also tried things such as Range("X:X").Clear and ClearFormats. These do not do the trick either.
How do I force my string read of this cell to be the literal CSV content, and ignore the formula/calculations that excel is tossing at me?
EDIT:
Thanks to BraX for the solution!
I was unable to accomplish this by copying from one tab to the next within Excel, but I did get it to work by pausing my operation with a message box prompting the user to navigate to SQL and put the data on the clipboard. This works perfectly now!
Cells.Select
Selection.NumberFormat = "@"
MsgBox "Please navigate to SQL and copy your data to be inserted, including headers. When done click OK"
Range("A1").Select
ActiveSheet.Paste
Before adding the data to the sheet, set the NumberFormat to @ (Text) for the affected columns.
Example:
ActiveWorkbook.Worksheets("Sheet1").Range("A:A").NumberFormat = "@"
That will format the cells as Text and prevent the values from being interpreted as a date when the data is added. Once Excel decides it's a date, it's too late.
I am pretty sure all it takes is to format the cells. Right click on the range of cells you want formatted as plain text and click "Format Cells". The "Number" tab shows the "Category" of formatting (General, Number, Currency, etc.). Select "Text" and click OK. Any text will then be kept exactly the way you write it, so even if you write "11-5-18" it will be just that and won't be treated as a date or anything else.
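The same principle, treating every pasted value as literal text, can also be sketched outside Excel. The Python below is a rough illustration only, assuming the SQL dump is tab-separated with a header row; the function names and the table name `demo` are hypothetical, and every value is emitted as a quoted string, which may not suit numeric columns.

```python
import csv
import io


def sql_literal(value):
    """Quote a value as a SQL string literal, doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_inserts(tsv_text, table):
    """Turn a tab-separated dump (header row first) into INSERT statements."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    columns = ", ".join(header)
    statements = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        # Cells stay plain strings, so '11-20' is never reinterpreted as a date.
        values = ", ".join(sql_literal(cell) for cell in row)
        statements.append(f"INSERT INTO {table} ({columns}) VALUES ({values});")
    return statements


if __name__ == "__main__":
    dump = "code\tsize_range\nA1\t11-20\n"
    for statement in build_inserts(dump, "demo"):
        print(statement)
```

Because the parser never converts cell contents, there is no equivalent of Excel deciding a value is a date.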

Excel sheet Deletes the formulas present in the sheet when I open it. How to avoid this?

I'm uploading an Excel file that contains several sheets to my server, which encodes it to base64, so I decode it as required and process it by adding data to sheet 5 as column1 and column2 with a certain number of rows. At the time of uploading, sheet 5 has some specific formulas that make changes in other sheets. When I open the file that the server sends back as the response after editing, this prompt appears:
"Excel found unreadable content in 'MyDownloadedExcelData.xlsx'. Do you want to recover the contents of this workbook? If you trust the source of this workbook, click Yes." with Yes and No buttons,
and when I click Yes and open the sheet, all the formulas are deleted.
I see something like
Excel was able to open the file by repairing or removing the unreadable content.
Removed Records: Formula from /xl/calcChain.xml part
Repaired Records: Cell information from /xl/worksheets/sheet1.xml part, etc.
So, how do I make sure the formulas in the sheet are retained?
Using VBA you could have an on close event that pastes values and an on open event that recreates the formulas. Your file would essentially save with static data, but then be used with functions intact.
If this solution is of interest I can help provide some coding framework.

Excel VBA; Search rows grabbing values and pasting them into another worksheet

I have two workbooks;
(WB1) with two sheets; "Input" and "Output"
and
(MacroWB) with the macro and a "Column Header" list.
Example file: "Messy" sheet = input, "Organized" = output
https://drive.google.com/file/d/0B-leh2Ii2uh9bDBFbDBHbGcxbUU/view?usp=sharing
I need help coding a macro to do the following:
1) Create a loop to go through each row of the "Input" sheet searching for values matching cells in the "Column Header" list.
2) When a matching value is found, take the data from the cell immediately to its right (in the "Input" sheet) and paste it into the corresponding column of the "Output" worksheet.
3) Once every "Column Header" item has been searched/pasted for that row; move to the next row of the "Input" sheet. Rinse and repeat until all rows of the "Input" sheet have been searched/pasted.
Here is an example, the letters are to be column headers and the numbers are to be copied to the appropriate "Output" sheet column.
https://drive.google.com/file/d/0B-leh2Ii2uh9TXRGTnFDRU1jY0U/view?usp=sharing
Keep in mind that the actual data file has ~50 columns and ~3000 rows.
Also that the data is not all Letter/Numbers like the table above, it is more like the data in the linked .xlsx file.
If there is anything I haven't been clear about, please ask and I will try my best to clarify. Also, I may be WAY overthinking this; if so, please let me know.
THANK YOU ANYONE THAT CAN GET ME GOING IN THE CORRECT DIRECTION!!!
-Joe
Skip the VBA and use Text to Columns on the Data tab. I'm always copying html and it works 99% of the time. If the html is pretty and properly formatted you may get away with using the fixed width option; otherwise go for delimited and choose "tab". If tab doesn't work, try using spaces, assuming that your cells don't contain spaces.
The other option, which has worked for me on the rare occasions that Text to Columns doesn't, is saving the text in Word as rtf and then opening that in Notepad++ (which everyone should have). Copy from Notepad++ to Excel and that usually fixes the problem.
EDIT: If you right click before pasting and click "Paste Special", this regularly helps with html pasting.
In your sample file, I used the following formula in A2 of Organized sheet (assumed 50 as max columns in Messy):
=IFERROR(OFFSET(Messy!A1,0,MATCH(Organized!A$1,OFFSET(Messy!A1,0,0,1,50), )),"")
Dragging it to H11 produced the following result:
The sample data is not complete, and some 'tags' in Messy sheet are not consistent (SiteID vs SITE_ID), but it should help you get started.
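The row-by-row logic from the question (find each known tag, take the cell to its right, place it under the matching output column) can be sketched in a few lines. This is a Python illustration under simple assumptions, not the VBA macro that was asked for: the sheets are represented as lists of rows, and the name `organize` is made up. Like the MATCH formula above, it takes the first occurrence of a tag in each row.

```python
def organize(messy_rows, headers):
    """For each input row, pick the value to the right of each known tag."""
    wanted = set(headers)
    organized = []
    for row in messy_rows:
        found = {}
        # Walk the row: a tag cell is followed by its value cell.
        for idx in range(len(row) - 1):
            tag = row[idx]
            if tag in wanted and tag not in found:
                found[tag] = row[idx + 1]
        # Emit values in header order, leaving gaps blank like IFERROR(..., "").
        organized.append([found.get(h, "") for h in headers])
    return organized


if __name__ == "__main__":
    messy = [["B", 2, "A", 1], ["C", 3, "A", 4]]
    for line in organize(messy, ["A", "B", "C"]):
        print(line)
```

One caveat: a data value that happens to equal a tag name would be treated as a tag, and inconsistent tags (SiteID vs SITE_ID) would need to be normalised first.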

Excel Reference External Sheet

An excel question for you gurus. I've tried searching high and low and haven't come up with an effective solution.
I'm trying to create a formula that will lookup a value in an external sheet. I'm using the SUMPRODUCT formula and it works perfectly. Formula is below:
=SUMPRODUCT(--('File\Path\[file.xlsx]SheetName'!$D$1:$D$1000=$B3), --('File\Path\[file.xlsx]SheetName'!$O$1:$O$1000=$A3), 'File\Path\[file.xlsx]SheetName'!$Q$1:$Q$1000)
The issue I'm running into, however, is that the source file is updated every day. Although the workbook name stays the same, the sheet name changes. A random string gets assigned to the source sheet name each time it is updated. As such SheetName becomes SheetName ase341.
Is there a way to have the formula read the external sheet number instead of the name? I want the formula to update regardless of the sheet name. If there's no way to read the sheet position is there a way to change the sheet name via a formula in an external workbook?
Usage Example
I have a workbook (analysis) and it pulls data from another workbook (source). Source is updated every day with new data. The data in Source is updated by downloading a report from the internet and saving over the old source file. As such, the file name stays the same but whatever is inside the file is always different (including the sheet name). There is always only ever one sheet in the Source with the same number of columns, always in the same position.
There is a really neat way to refer to a block of cells in an external workbook in which the sheetname or even the block address may vary. Say we have:
=SUM('C:\Users\James\Desktop\[Book1.xlsx]Sheet1'!$B$2:$B$9)
However, the sheet name may vary. First assign a Defined Name to the block in Book1 (say XXX).
Then we can use:
=SUM('C:\Users\James\Desktop\Book1.xlsx'!XXX)
It does not matter if the sheetname changes, the Defined Name will change with it!
Your issue would be most efficiently solved with VBA, but if you're just getting started this might not be the best route.
You can get the sheetname or filename with just a formula, though:
http://www.ozgrid.com/VBA/return-sheet-name.htm

How to copy conditionally formatted cells from Calc as a table into Writer

I have a LibreOffice Calc spreadsheet that uses some conditional formatting of cells. I would like to copy it into Writer as a table. The colours/formats of the cells should remain as they were due to the conditional formatting in Calc. Unfortunately when I do that, the formatting vanishes.
How can I copy it keeping the formatting?
Of course the Writer version no longer has to be conditional, but I need to keep current colours.
My work is done, so if necessary I can do the trick in Calc first (abandon the "conditional" part and just preserve the formatting as is). However, due to the amount of data I would prefer not to do it manually.
Is a macro the only way to do that?
Use Insert -> Object -> OLE Object
Choose Create from file
Pick the right .ods file.
If you want to modify further (in my case - I need to create many tables from one spreadsheet as the original file is humongous - up to CL column) - do not tick "Link to the file" option.
After pressing OK, the spreadsheet is inserted as is (cloned and embedded), with the conditional formatting. Can be further modified (e.g. rows/cols can be deleted, hidden or whatever is needed). The conditional formatting remains active.
I personally prefer to copy as an image. This ensures the format is always exactly as it was in the spreadsheet and that no weird OLE/DDE links go wrong.
However, you specifically ask for a table. For that there are three (or 2.5) options:
Insert the entire spreadsheet as an object. In Windows that can be done as Ister describes in his answer. This will be editable as an inline mini-sheet (Writer will invoke Calc for any editing actions).
Insert a part of the sheet as an object: Select what you want in the document, copy to the clipboard, go to Writer and select Edit->Paste Special. Then select the OLE option, or if on Linux, select "calc8". This will be editable as an inline mini-sheet.
Insert as HTML. This creates a standalone table. Formatting will not be 100% as in the sheet, as fonts, etc, will be reset by Writer, but it is a native Writer table that you can manipulate in Writer without invoking Calc. Colors, etc, are preserved.
If you use any of the object embedding options, you'll notice that formulae are kept intact (when not referring to data outside the pasted sheet or region). If you want all the data to be verbatim, then you need an intermediate step:
Select the data in your original sheet that you wish to copy
Copy to the clipboard
Create a new sheet and place the cursor in the same spot as the first cell of the copied data (e.g. if your copied region is B4:X99, then place the cursor in B4 of the new sheet)
Select Edit->Paste Special
In the Paste Special window, check only the following options and click OK:
Text
Numbers
Date & Time
Formats