Delete top row of excel file using SSIS - sql

I have an excel file that has a header row which is a row that I want to delete. The header row in thsi file are the cells of A1 to W1 merged into one. This causes a problem when I try to read the file because I am expecting column names. Proper column names exist in the second row of the file, which is why I want to delete the first.
To accomplish this I thought I'd be able to use the 'Excel Source' item in SSIS since it supports a SQL option to write a query. What I want to do is something like this:
SELECT * from ExcelFile WHERE Row > 1
My file only has data in columns A thru W.
I don't know what syntax I can use in the query to do this. The query builder that is in the Excel Source item will allow me to do many things with columns but I don't see an option for doing anything with rows. Searching online and using the help didn't get me anywhere.

None of these solutions will work because the Excel driver will be confused by the merged first line. You won't be able to use any driver features such as skip first row to do this. You need to run some script to open the Excel file and delete the row manually.
There is some basic sample script at this site:
http://www.sqlservercentral.com/Forums/Topic1327014-1292-1.aspx
The code below is adapted from the code written by snsingh at that site.
You would obviously want to use connnection manager properties, not hard coded paths
Excel needs to be installed on the SSIS Server for it to work - this is the only way to use Excel automation.
Dim filename As String
Dim appExcel As Object
Dim newBook As Object
Dim oSheet1 As Object
appExcel = CreateObject("Excel.Application")
filename = "C:\test.xls"
appExcel.DisplayAlerts = False
newBook = appExcel.Workbooks.Open(filename)
oSheet1 = newBook.worksheets("Sheet1")
oSheet1.Range("A1").Entirerow.Delete()
newBook.SaveAs(filename, FileFormat:=56)
appExcel.Workbooks.Close()
appExcel.Quit()

You don't need to use a syntax.
Go to control flow..
Pull in a data flow task.
Add a excel file source...add a conection manager
With excel sheet.
Open your connection manager and then check the box which says.
Column names In first row. That's it and add ur destination.

Related

Adding a macro file to an XLSX/XLSM file using Packaging

I'm using System.IO.Packaging to build simple Excel files. One of our customers would like to have an autorun macro that updates data and recalcs the sheet.
Pulling apart existing sheets I can see that all you need to do is add the vbaProject.bin file and change a few types in the _rels. So I made the macro in one file, extracted the vbaProject.bin, copied it into another file, and presto, there's the macro.
I know how to add package parts when they are in XML format, like the sheets or the workbook itself, but I've never added a binary file and I can't figure it out. Has anyone done this before?
Ok I got it. Following TnTinMn's suggestion:
Open a new workbook and type in your macro. Change the extension to
zip, open it, open the xl folder and copy out the vbaProject.bin
to somewhere easy to find.
In your .Net code, make a new Part and add it to the Package as
'xl/vbaProject.bin'. Copy over byte-for-byte from the
vbaProject.bin you extracted above. It will be compressed as you
add the bytes.
Then you have to add a relationship to the workbook that points to
your new file. You can find those relationships in
xl/_rels/workbook.xml.rels.
You also have to add a content type entry at the root of the
document, which goes into the [Content Types].xls. This happens automatically when you use the ContentType parameter of CreatePart
And finally, change the extension to .xlsm or .xltm
I'm extracting the following from many places in my code, so this is pseudo...
'the package...
Dim xlPackage As Package = Package.Open(WBStream, FileMode.Create)
'start with the workbook, we need the object before we physically insert it
Dim xlPartUri As URI = PackUriHelper.CreatePartUri(New Uri(GetAbsoluteTargetUri("/", "xl/workbook.xml"), UriKind.Relative)) 'the URI is relative to the outermost part of the package, /
Dim xlPart As PackagePart = xlPackage.CreatePart(xlPartUri, "application/vnd.ms-excel.sheet.macroEnabled.main+xml", CompressionOption.Normal)
'add an entry in the root _rels folder pointing to the workbook
xlPackage.CreateRelationship(xlPartUri, TargetMode.Internal, "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument", "xlWorkbook") 'it turns out the ID can be anything unique
'now that we have the WB part, we can make our macro relative to it
Dim xlMacroUri As URI = PackUriHelper.CreatePartUri(New Uri(GetAbsoluteTargetUri("/xl/workbook.xml", "vbaProject.bin"), UriKind.Relative))
Dim xlMacroPart as PackagePart = xlPackage.CreatePart(xlPartUri, "application/vnd.ms-office.vbaProject", CompressionOption.Normal)
'time we link the vba to the workbook
xlParentPart.CreateRelationship(xlPartUri, TargetMode.Internal, "http://schemas.microsoft.com/office/2006/relationships/vbaProject", "rIdMacro") 'the ID on the macro can be anything as well
'copy over the data from the macro file
Using MacroStream As New FileStream("C:\yourdirectory\vbaProject.bin", FileMode.Open, FileAccess.Read)
MacroStream.CopyTo(xlMacroPart.GetStream())
End Using
'
'now write data into the main workbook any way you like, likely using new Parts to add Sheets

Identifying exact location of data entry issues when bulk copying or importing from excel to access

One of the requirements of a project that I have is to allow users to import or copy and paste in bulk a few hundred rows from excel to access. However, there is a reasonable chance due to human error that there will be some data validation issues between the imported data and the table structure/referential integrity rules. I would like to be able to identify exactly the field/s and record/s where these issues are occuring so that I can point them out to the user for correction.
As such the standard error essages like 'you cannot add or change a record because a related record is required in...' or 'data type mismatch in criteria or expression' are not descriptive enough to the exact location of the problem so even if I catch them I can't really give a better descriptor anyway
I am debating importing to a completely free text temporary table, then looping an insert to move one row at a time from the temp table to the properly validated table and using dbfailonerror to catch issues on individual records that need correction (the user needs to correct them I can't do this through code)
My question is whether this is a reasonable approach, is there a better/easier way, or a way to get a more specific error from access rather than using a loop?
Thanks
There are 2 ways to do this. I'm not sure what method you are using to do the import but if is as simple as copying rows from the excel sheet to the table Access will generate a Paste_errors table that will show the rows it couldn't import. The wizard will do the same thing but I think its prone to crashing.
The way I typically do it is actually have the end user use an excel template with a VBA backend that does the uploading. You can check each value conditionally and give a better descriptive alert and/or shuttle any defective rows to a temporary table for you to review.
You can do this the opposite way and do the import through Access VBA but that would be more coding since you would have to create an Excel object in code, open the sheet, etc.
I setup a Quickbooks export of a accounts receivable table by creating a User DSN on the local machine pointing to the Access file, opening an ADO recordset and looping through the rows one column at a time applying logic to each row.
Quickbooks would ask for an existing file to dump the data into so I made that a template file on the network. It sounds like your users may have to enter directly into the spreadsheet so you would have to distribute the template but the results are the same.
Example of Looping through the sheet and validating rows
Create a DSN file to the database and store it on a shared drive, this way you don't have to hardcode the string and if you need to change something, you only have to change the dsn file instead of redistributing templates.
Public Sub Upload
'Declare the main recordset that the records will be uploaded to.
Dim rstUpload as New Adodb.Recordset
'Declare a utility recordset object that you can reuse to check referential tables
Dim rstCheck as New Adodb.recordset
'Declare a utility command object that you can reuse to check referential tables
Dim SQLCommand as New Adodb.Command
'Declare the connection object to the database
Dim dataConn as New Adodb.Connection
'A tracking flag if you find something in a row that won't upload
Dim CannotUpload as Boolean
'Open the connection to the access database
dataConn.Open "\\Server\Share\Mydatabase.dsn" 'Your dsn file'
Set SQLCommand.ActiveConnection = DataConn
rst.Open "yourTable", dataConn, adOpenDynamic, adLockBatchOptimistic
For i = 1 to 100 ' Rows
*You may want to do a pass through the rows so you can get an accurate count, usually just loop through the rows until a column that must have data is blank. If your users are skipping rows that is going to be a problem.
rstUpload.AddNew
'Set the tracking Flag to False indicating you can upload this row, this will be changed if any field cannot be validated
CannotUpload = False
'First Column/Field: 'Non critical field, any value will do
rstUpload("Field1").Value = Range(i,1).Value '
'Second Column/Field has a referential integrity constraints
'Run a query against the table that has the values you are trying to validate.
SQLCommand.CommandText = "Select IDField From YourTable where ID = " & Range(i,2).Value
Set rstCheck = SQLCommand.Execute
'Check if any records were returned, if none than the value you are checking is invalid
If rstCheck.RecordCount > 0 Then 'we matched the value in the cell
rstUpload ("Field2").Value = Range(i,2).Value
else
'Design a flagging method, maybe highlight the cell in question
CannotUpload = True
End if
....continue through your columns in the same fashion, reusing the SQLCommand Object and the Utility recordset when you need to lookup something in another table
'After last column
If CannotUpload = False Then
'Nothing triggered the flag so this row is good to go
rstUpload.Update
Else
'Cannot Upload is true and this row won't be uploaded
rstUpload.Cancel
End If
Next i
dataconn.Close
set dataConn = Nothing
set rstUpload = Nothing
set rstCheck = Nothing
set SQLCommand = Nothing

reg . Teradata SQL / BTEQ / BTET . Export to an excel report with 2 tabs using BTEQ

I can do this beautifully in SQL Assistant and the result will show up in 2 tabs which can be saved to an excel file with 2 tabs.
BT;
sel <query1>
;
Sel <query2>
;
ET
I want to know if something similar can be done in BTEQ. I am prolly hoping against hope but I thought I'd had done my asking around. Using BTEQ I can export 2 different queries to 2 different excel files , and that is fine. I want to know if, its poss. to export the o/p of 2 queries into a single excel file with 2 tabs
and a small corollary question : In SQL Assistant, is it poss to export the result of 2 queries run serially to a 2 tab , excel report by some method other than BTET
( P.S BTET in BTEQ is meaningless , I am using it only in SQL Ass. So dont mean using BTET )
In SQL Assistant there's a setting Tools -> Options -> General -> Use a separate Answer window for which is set by default to Each Query. When you run multiple Select either using F5 or F9 you get all result sets as tabs in a single window and the File -> Save then asks for Save all sheets? This is independent of BT/ET.
In BTEQ there's no way to get multiple results into one file, in fact there's no support for Excel, it's only the very old DIF format which doesn't support multiple tabs.
I have had similar problems.
What you can do is use BTEQ to export the data to a CSV and then use a VBS script to open up the Excel file and copy the CSV in a new worksheet.
First you need to delete the old one otherwise it will get renamed.
Option Explicit
Dim MyPath, objExcel, objWorkbook, objSourceData, objTargetSheet
MyPath = "C:\Users\user\"
Set objExcel = CreateObject("Excel.Application")
objExcel.Application.Visible = True
Set objWorkbook = objExcel.Workbooks.Open(MyPath & "ExcelFile.xlsx")
objExcel.Application.DisplayAlerts = False
objExcel.Sheets("worksheetname").Delete
objExcel.Application.DisplayAlerts = True
Set objSourceData = objExcel.Workbooks.Open(MyPath & "csvfilename.csv")
objSourceData.sheets(1).Copy objworkbook.Sheets(objworkbook.Sheets.Count)
objworkbook.Save
objworkbook.Close
objExcel.Application.Quit
WScript.Quit

VB.NET - Working with tab delimited text file

I need help with how to work on text file (like database).
I create excel GUI (with macro's), that search imputed string in sheets with lots of data and display entire row with matching string (for people with installed MS office)
Now I must create alternative VB.Net application working only on tab delimited text files (without ADO.Net) for people who haven't installed MS office, and I don't know how start to work with it.
import them? if yes, then how.
working directly on them? if yes, then how.
My text files is exported excels files/sheets to tab delimited .txt, with loots of columns (100+) with headers, and lots of rows 500+
need help :)
thx
If you want to get the headers from the first line of the file then do this ...
Sub Main()
Dim dt = New DataTable
Dim lines = File.ReadAllLines("TextFile1.txt")
Dim headers = lines(0).Split(vbTab)
For Each header In headers
dt.Columns.Add(header)
Next
For Each line In lines.Skip(1)
Dim parts = line.Split(vbTab)
dt.Rows.Add(parts)
Next
End Sub

Combine several Excel files into one using VB

I have a folder with several excel files that have a date field, i.e. 08-24-2010-123320564.xls. I want to be able to have some VB scripting that will simply take the files that start with todays date and merge them into one file.
08-24-2010-123320564.xls
08-24-2010-123440735.xls
08-24-2010-131450342.xls
into
08-24-2010.xls
Can someone please help?
Thanks
GabrielVA
assuming you just want to append rows in simple spreadsheets, follow this logic:
Psuedocode
use an excel macro
(you could just as well automate excel from vb but why vbscript alone since you need excel anyway?)
have it to a dir listing (dir function)
dim a date_start variable init to "new"
dim a merged_spreadsheet as new doc default to nothing
loop thru result of dir
if date_start <> start of filename
if merged_spreadsheet is not nothing
save it
set it to nothing
store start of date (left mid function) in date_start
if merged_spreadsheet is nothing
make a new one
open the file from the dir command's loop
select all the data
copy it
go to first empty row in merged_spreadsheet
paste it
loop files
if merged_spreadsheet is not nothing
save it
If you're not happy with all those 'nothings', you can set a separate flag to keep track of whether you have a merged_spreadsheet or not. Think about what happens for just one file in a date, no files at all, etc.
Of course you will tear out your hair finding out how to automate those excel functions. The secret to turning 'hard' into 'pretty darn easy' is this:
macro recorder will reveal the automation commands
They are not intuitive. So record a macro. Then do things that you'll need to do in your code. Stop recording and look at the result.
For example:
* load/save files
* select only entered fields
* Select all
* copy / switch files / paste
* create new sheet
* click in various single cells and type (how to examine/set a cell's contents)
In summary-
(1) Know exactly the steps to what you're doing
(2) Use macro recorder to give away the secrets of the excel object model. Steal its secrets.
Really this won't be all that hard if you marry these two concepts cleverly. Since the macro will be vbscript (at least if you use office 97 ;-) you can probably run it from vbs or vb6 if you want. A hop to vb.net shouldn't be that hard either.
Aspose makes it pretty easy to work with excel-files in .NET http://www.aspose.com