I am currently creating an SSIS package which imports data from Excel to SQL Server. My problem is that I have 4,998 rows in my Excel source, but every time I run the SSIS package it imports 5,010 rows. I don't know where it is getting the excess data. How can I fix this? Can anyone help me, please? Thanks!
Excel does this sometimes. It keeps track of the sheet's "used range", which sometimes extends beyond the data that actually exists. This used range is what SSIS will import.
If you look at your import table, you should be seeing 12 blank rows, unless you've imported the header row, if any, as a data row. That's another possibility: does your Excel sheet have multiple header rows (many do)?
It is possible to apply some kind of SQL to Excel data sources, but I avoid it because I don't trust Excel's handling of data. The solution is probably to identify a key column, which should always have a non-NULL, non-blank value of a certain type (e.g. a number or a date) in every "real" data row, then delete the rows that break this rule from your import table.
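For example, a minimal cleanup step after the load, assuming a staging table dbo.ImportStaging and a key column KeyColumn that every real row must fill (both names are hypothetical):

    -- Remove the phantom rows that Excel's used range dragged in.
    -- dbo.ImportStaging and KeyColumn are placeholder names; use your own.
    DELETE FROM dbo.ImportStaging
    WHERE KeyColumn IS NULL
       OR LTRIM(RTRIM(CAST(KeyColumn AS NVARCHAR(255)))) = N'';

Run this after each load, before any downstream processing, so the 12 phantom rows never reach your destination table.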
Excel importing is no fun!
I have an ETL process set up to take data from an Excel spreadsheet and store it in a database using SSIS. However, one of the columns in the Excel file is formatted as a percent, and it will sometimes erroneously be stored as a NULL value in the database, as if there were some sort of translation error.
(The original question included a screenshot of the exact format being used for the column in Excel.)
Interestingly, these percent values do load properly on some days, but for some reason one particular Excel sheet I was given as an example of this issue will not load any of them at all when put through the SSIS processor.
In Excel, these values show up like "50.00%", and when SSIS translates them properly they display as the decimal equivalent in the database, "0.5", which is what I want instead of the NULL values. The data type I am using in SSIS for this is Unicode string [DT_WSTR], and it is saved as an NVARCHAR in the database.
Any insight as to why these values will sometimes not translate as intended? I have tried messing around with the data types in SSIS/SQL Server, but that has resulted either in no change or in an error. When I put test values in the Excel sheet, such as "test", to see if it imports anything at all from this column, it does seem to work (just not for the percent numbers that I need).
The issue was caused by the "mixed data types" that were present in the first few rows of my data (the "mixed" part being blank fields), which would explain why some sheets would work and others wouldn't.
https://stackoverflow.com/a/542573/11815822
Setting the connection string to account for this fixed the issue.
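For reference, the usual form of that fix is adding IMEX=1 to the Extended Properties of the Excel connection string, which tells the driver to read mixed-type columns as text instead of guessing a type from the first rows (the provider version and file path below are illustrative):

    Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\MyFolder\MyFile.xlsx;
    Extended Properties="Excel 12.0 Xml;HDR=YES;IMEX=1";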
Does anyone know why a duplicate table gets generated when using the SQL Server Import and Export Wizard to import an Excel file? For example, a table named LN$LN appears alongside the original table named LN$.
Thanks!
I think there may be hidden sheets in your Excel file. Can you try unhiding them and checking?
I was just having an issue similar to this and then noticed something in the Excel spreadsheet I had been given: an entire duplicate of the data set was hiding in collapsed rows, visible only by the change in row-number color and an extra line. Oddly, Unhide and setting the row height did not seem to work, but I was able to expand the rows back out and delete them by selecting the rows above and below and double-clicking over the collapsed boundary (just as you can double-click a column border to expand it to fit).
If anyone is experiencing duplicate or unexpected data during an import, check for something like this as well. Note that it can also appear in the middle of the sheet (it does not have to be at the end); I found several of these in the file I was given.
In my current database I have a table whose data is either entered manually or arrives in an Excel sheet every week. Before we had the "manual entry option", the table would be dropped and replaced by the Excel version.
Now, because there is data that only exists in the original table, this can no longer be done.
I'm trying to find a way to update the original table with changes and additions from the Excel table while preserving all rows that are not in the new sheet.
I've been attempting to simply use an insert query and an update query, but I can't find a way to detect changes in a record.
Any suggestions? I can provide the current SQL if you'd find that helpful.
Based on what I have read so far, I think I can offer some suggestions:
It appears you have control of the MS Access database. I would suggest adding a field called "Source" to your data table. Modify the form in the Access database to store something like "m" for manual entry in the Source field. When you import the Excel sheet, store an "e" for Excel in the field.
You would need to do a one-time scrub of the data to mark existing records as manual or Excel entries. There are a couple of ways to do that through automation/queries, which I can explain in detail if you want.
Once past these steps, your Excel process is fairly simple: you can delete all records with Source = "e" and then do a full Excel import. Manual records remain unchanged. A sketch of the queries is below.
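A minimal sketch of that weekly refresh, assuming the main table is called DataTable and the Excel import lands in a table called ExcelImport (all names here are hypothetical):

    -- Clear out last week's Excel-sourced rows; manual rows ('m') survive.
    DELETE FROM DataTable
    WHERE Source = 'e';

    -- Reload from the freshly imported Excel data, tagging each row.
    INSERT INTO DataTable (Field1, Field2, Source)
    SELECT Field1, Field2, 'e'
    FROM ExcelImport;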
This concept will also allow you to add new sources and codes later and to handle each differently if needed. You just need to spend some time cleaning up your old data; I think you will find it worth it in the end.
Good Luck.
I am attempting to upload an Excel document into Access. I have used VBA to unhide all columns and rows and then delete columns and rows that are not being used. All of the worksheets upload into Access properly except one. This particular worksheet attempts to upload a field and label it Field 12. I am unable to find a way to delete this field. Any help?
It is probably the first column after your data...
Try deleting the columns to the right of your data, either in VBA or in Excel (an actual delete, not just clearing the contents). I've found this typically happens when the columns to the right of your data contained data at one point and Access/Excel still sees them as containing data. Then try your import again.
Alternatively, you could upload into a new Access staging table before pulling your desired known columns into the final table through an INSERT query, as sketched below. Then you can delete the staging table if you like, or delete it before the next import. This way, each import can have its own "added columns".
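A minimal sketch of that staging approach, with hypothetical table and column names:

    -- Pull only the columns you trust from the staging table; any phantom
    -- 'Field 12'-style columns are simply never referenced.
    INSERT INTO FinalTable (CustomerName, OrderDate, Amount)
    SELECT CustomerName, OrderDate, Amount
    FROM StagingImport;

    -- Optionally discard the staging table until the next import.
    DROP TABLE StagingImport;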
We have a relatively simple Reporting Services report that our users commonly export to Excel. I've noticed that the files produced by the Excel export seem unusually large. If I open one of these files and just click Save, without making any changes, the file size drops to about half of its previous size. Has anyone else run into this, and is there a known workaround?
You've mentioned that the report is relatively simple, but this is important to check. The export to Excel will go to extraordinary lengths to try and maintain how your report looks.
If you have lots of different borders or colours (particularly if different formatting is determined by the data in your report) this will bloat the file.
Also check if many columns with very small and unusual sizes are created in the exported worksheet. The export does this to try and match alignment in Excel with the original report.
Try recreating your report as a basic table with no formatting or headers/footers and see if you can reproduce the problem. If Excel's behaviour is acceptable then add each piece of formatting back until it goes awry. Please let us know what you find.
I don't have an immediate solution, but a common problem in Excel is files bloating because one or more of the worksheets have saved all 64K rows instead of just the rows being used. The fix in Excel is to select all the unused lower rows, delete them, save the spreadsheet, then close and reopen it. I'd therefore pursue the angle of extra rows being saved in the export, and see if there is a way to keep that from happening.
What tool are you using when exporting to Excel?
I have also managed to reduce the number of rows in my Excel worksheet by copying it to another worksheet and then deleting the original sheet.
You could also try copying only the data in your worksheet and pasting it into a new Excel workbook (file).
I had an issue where the exported Excel files took an extremely long time to open, and they would stop responding every time you clicked on a cell.
Also, extra and merged columns would appear in the exported Excel files.
The fix was to make sure my header text boxes lined up with the beginning and end of the columns in the data table just below them. Once the two were aligned, there were no more extra columns in the exported Excel spreadsheets, and performance was back to normal.
Here's the reference that helped me understand the issue:
http://www.codegur.press/12747988/issue-report-export-to-excel-in-rdlc-report
Maybe I am answering your question very late. Here's the solution for exporting to CSV.
You need to give the field the name that you want to see as its column header (not the column name) in the designer.
By default, all the text headers are exported as separate columns along with the table columns, so make sure you set the design name (DataElementName) in the Properties pane to the name you want to see.
The other important thing to note is the DataElementOutput option: it defaults to Auto, meaning the element will be exported. You can change it if you don't want the element to be exported.
Last but not least: after you export, the data may look messed up. Select the whole first column, go to the Data tab -> Text to Columns, choose comma as the delimiter, and click Finish. That should solve your issue.