SSIS flat file with string larger than 50 - sql

SSIS by default sets the data type to string with a length of 50. What if the string in a certain column is longer than 50, and I also can't use Suggest Types (it sucks!)?
Is there a way to fix this other than manually increasing the sizes, i.e. manually editing the column lengths/data types on the flat file connection manager's Advanced tab? Ideally the data types would be set from the data types of the mapped destination (SQL Server) columns.

You can set the data types in the flat file connection manager, in the Advanced section.

I've heard good things about BIDS Helper, but haven't used it myself.
I haven't found a way to change the default length, or to stop it from resetting when the connection manager changes. I was pleased to find that you can select all columns at once in the advanced editor and change them simultaneously; that's something...

The best way I could do this was to write C# code that modifies the SSIS package XML file and increases the string length values by looking up the column lengths of the destination table (using an INFORMATION_SCHEMA query).
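As a rough illustration of that approach, here is a minimal C# sketch. It assumes an SSIS 2012-style .dtsx in which flat file columns appear as DTS:FlatFileColumn elements carrying DTS:ObjectName and DTS:MaximumWidth attributes (verify those names against your own package version); the connection string, package path and table name are placeholders:

using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Xml.Linq;

class WidenFlatFileColumns
{
    static void Main()
    {
        // Look up the destination column lengths (placeholder table name).
        var lengths = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        using (var conn = new SqlConnection("Server=.;Database=MyDb;Integrated Security=true"))
        using (var cmd = new SqlCommand(
            "SELECT COLUMN_NAME, CHARACTER_MAXIMUM_LENGTH " +
            "FROM INFORMATION_SCHEMA.COLUMNS " +
            "WHERE TABLE_NAME = 'MyDestinationTable' AND CHARACTER_MAXIMUM_LENGTH > 0", conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    lengths[reader.GetString(0)] = reader.GetInt32(1);
            }
        }

        // Bump each flat file column's width in the package XML to match the destination.
        XNamespace dts = "www.microsoft.com/SqlServer/Dts";
        var package = XDocument.Load(@"C:\packages\MyPackage.dtsx");
        foreach (var col in package.Descendants(dts + "FlatFileColumn"))
        {
            var name = (string)col.Attribute(dts + "ObjectName");
            if (name != null && lengths.TryGetValue(name, out int len))
                col.SetAttributeValue(dts + "MaximumWidth", len);
        }
        package.Save(@"C:\packages\MyPackage.dtsx");
    }
}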

Related

Proper formatting of Excel sheets to avoid errors in SQL querying?

1. What do you avoid when creating and filling out an Excel spreadsheet of data for a SQL database (certain formats, characters, character-length issues)?
2. Does it matter how dates are formatted?
3. What VARCHAR or INTEGER errors have you seen?
4. Finally, what SQL or Python queries did you use to address the errors you found in questions 1-3?
The easiest way would be to import a TXT or CSV export from Excel into your database with a database IDE (e.g. Oracle SQL Developer).
→ Depending on the database, different requirements must be observed.
The main focus is on correct formatting with regard to the locale settings (Excel & database):
Excel date format YYYY-M-DD HH24:MM / database timestamp YYYY-MM-DD HH24:MM:SS.FFFF
→ That would not work.
In addition, make sure that Excel does not cut off any numbers:
Excel long-number format 89632150000 (original 896321512345)
→ Excel automatically shortens the number with its default settings.
The length of a text value must not exceed the maximum length specified for the assigned (VARCHAR) column.
I think these would be the main points to look out for.

Import PostgreSQL dump into SQL Server - data type errors

I have some data that was dumped from a PostgreSQL database (allegedly using pg_dump) and needs to be imported into SQL Server.
While the data types are OK, I am running into an issue where there seems to be a placeholder for NULL: I see a backslash followed by an uppercase N in many fields. Below is a snippet of the data as viewed from within Excel; the left column has a Boolean data type, and the right one an integer.
Some of these are supposed to be of the Boolean data type, and having two characters in there is most certainly not going to fly.
Here's what I tried so far:
Import via a dirty read - keeping whatever data types SSIS decided each field had - to no avail: there were error messages about truncation on all of the Boolean fields.
Creating a table for the data based on the correct data types. This was more fun... I needed to do the same as in the dirty read, as the source would otherwise not load properly, and I also had to transform the data into the correct data type for insertion into the destination; yet I am still getting truncation errors where there most certainly shouldn't be any.
Here is a sample expression in my derived column transformation editor:
(DT_BOOL)REPLACE(observation,"\\N","")
The data type should be Boolean.
Any suggestion would be really helpful!
Thanks!
Since I was unable to circumvent the SSIS rules in order to get my data into my tables without an error, I took the quick-and-dirty approach.
The solution that worked for me was to have the source read each column as if it were a string, and to make every field in the destination table VARCHAR. That destination table is used as a staging table; once the data is in SQL Server, I can manipulate it as needed.
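For anyone who needs the same quick-and-dirty staging approach outside of SSIS, here is a minimal C# sketch of the idea; the file, table and column names are made up, and it assumes a tab-delimited dump and a staging table whose columns are all VARCHAR:

using System.Data;
using System.Data.SqlClient;
using System.IO;

class LoadStaging
{
    static void Main()
    {
        // Treat every column as a plain string, mirroring the all-VARCHAR staging table.
        var table = new DataTable();
        table.Columns.Add("observation", typeof(string));
        table.Columns.Add("score", typeof(string));

        foreach (var line in File.ReadLines(@"C:\data\pg_dump_extract.txt"))
            table.Rows.Add(line.Split('\t'));   // the \N placeholders load as-is

        using (var conn = new SqlConnection("Server=.;Database=MyDb;Integrated Security=true"))
        {
            conn.Open();
            using (var bulk = new SqlBulkCopy(conn) { DestinationTableName = "dbo.Staging" })
                bulk.WriteToServer(table);
        }
        // Once the rows are staged, the \N values can be turned into NULLs and
        // cast to the real types in T-SQL.
    }
}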
Thank you #cha for your input.

Text was truncated or one or more characters had no match in the target code page including the primary key in an unpivot

I'm trying to import a flat file into an OLE DB destination in a SQL Server database.
Here's the field that's giving me trouble:
Here are the properties of that flat file connection, specifically the field:
Here's the error message:
[Source - 18942979103_txt [424]] Error: Data conversion failed. The data conversion for column "recipient-name" returned status value 4 and status text "Text was truncated or one or more characters had no match in the target code page.".
What am I doing wrong?
Here is what fixed the problem for me. I did not have to convert to Excel; I just modified the DataType to "text stream" when choosing the data source (Figure 1). You can also check the "Edit Mappings" dialog to verify the change to the size (Figure 2).
Figure 1
Figure 2
After failing by increasing the length and even changing the data type to text, I solved this by creating an XLSX file and importing that instead. It accurately detected the data types instead of setting all columns to varchar(50). It turns out nvarchar(255) for that column would have done it too.
I solved this problem by ORDERING my source data (xls, csv, whatever) so that the longest text values are at the top of the file. Excel is great: use the LEN() function on your challenging column, order by that length value with the longest values at the top of your dataset, save, and try the import again.
SQL Server may be able to suggest the right data type for you (even when it does not choose the right type by default): clicking the "Suggest Types" button (shown in your screenshot above) lets SQL Server scan the source and suggest a data type for the field that's throwing the error. In my case, choosing to scan 20,000 rows to generate the suggestions, and using the resulting suggested data type, fixed the issue.
While an approach proposed above (#chookoos, here in this Q&A: convert to an Excel workbook and import) resolves those kinds of issues, the solution in another Q&A is excellent because you can stay with your csv, tsv or txt file and perform the necessary fine-tuning without creating a Microsoft-product-related workaround.
I've resolved it by checking the 'UNICODE' checkbox.
You need to increase the column length for the particular column while importing the data.
Choose a data source >> Advanced >> increase the column length from the default 50 to 200 or more.
Not really a technical solution, but the SQL Server 2017 flat file import is totally revamped; it imported my large-ish file with 5 clicks and handled the encoding / field-length issues without any input from me.
SQL Server Management Studio's data import looks at the first few rows to determine the source data specs, so shift your records around so that the longest text is at the top.
None of the above worked for me. I SOLVED my problem by saving my source data (Save As) as a single-worksheet Excel 5.0/95 xls file and importing it without column headings. Also, I created the table in advance and mapped the columns manually instead of letting SQL Server create the table.
I had a similar problem against 2 different databases (DB2 and SQL Server); I finally solved it by using CAST in the source query from DB2. I also took advantage of the query to adapt the source column to varchar and avoid the useless blank spaces:
CAST(RTRIM(LTRIM(COLUMN_NAME)) AS VARCHAR(60) CCSID UNICODE
FOR SBCS DATA) COLUMN_NAME
The important issue here is the CCSID conversion.
It's usually because the column is still 50 characters in the connection manager. I resolved the problem by going to Connection Manager --> Advanced and changing the length to 100, or maybe 1000 if the data is big enough.

Fixed Length Text File to SQL Data Table

I have a text file (~100,000+ rows) where each column is a fixed length, and I need to get it into a SQL Server database table. Each one of our clients is required to get this data, but each text file is slightly different, so we have to manually go in and adjust the character spacing in a SQL stored procedure.
I was wondering if there is a way that we can use XML/XSD/XSLT instead. This way, I would not have to go in and manually edit the stored procedures.
What we do currently is this:
1.) SQL server stored procedure reads a text file from the disk
2.) Each record is split into an XML element and dumped into a temporary table
3.) Using SQL Server's string manipulation, each element is parsed
4.) Each parsed column is dumped into the destination table
For clarification, here are a couple of examples...
One client's text file would have the following:
Name [12 Characters]
Employer [20 Characters]
Income [7 Characters]
Year-Qtr [5 Characters]
JIM JONES HOMERS HOUSE OF HOSE100000 20113
Another client's text file would have the following:
Year-Qtr [5 Characters]
Income [7 Characters]
Name [12 Characters]
Employer [20 Characters]
20113100000 JIM JONES HOMERS HOUSE OF HOSE
They basically all have the same fields; some may have a couple more or a couple less, just in different orders.
Using SQL Server's XML processing functions to import a fixed-length text file seems like a backwards way of doing things (no offense).
You don't need to build your own application; Microsoft has already built one for you. It's ingeniously called the BCP utility. If needed, you can create a format file that tells BCP how to import your data. The best part is that it's ridiculously fast, and you can import the data into SQL Server from a remote machine (as in, the file doesn't have to be located on the SQL Server box to import it).
To address the fact that you need to be able to change the column widths, I don't think editing the format file would be too bad; a sketch of what such a format file might look like is below.
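For example, a non-XML bcp format file for the first client layout in the question (Name 12, Employer 20, Income 7, Year-Qtr 5) might look roughly like this; the first line should match your bcp version, and the database, table and column names are hypothetical:

13.0
4
1  SQLCHAR  0  12  ""      1  Name      SQL_Latin1_General_CP1_CI_AS
2  SQLCHAR  0  20  ""      2  Employer  SQL_Latin1_General_CP1_CI_AS
3  SQLCHAR  0  7   ""      3  Income    ""
4  SQLCHAR  0  5   "\r\n"  4  YearQtr   ""

You would then load it with something like "bcp ClientDb.dbo.ClientIncome in client1.txt -f client1.fmt -S yourserver -T", and only the format file (widths and column order) has to change per client.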
Ideally you would be able to use a delimited format instead of an ever-changing fixed-length format; that would make things much easier. It might be quick and easy for you to import the data into Excel, save it in a delimited format, and then go from there.
Excel, Access, all the flavors of VB and C# have easy-to-use drivers for treating text files as virtual database tables, usually with visual aids for mapping the columns. And reading and writing to SQL Server is of course cake. I'd start there.
100K rows should not be a problem unless maybe you're doing it hourly for several clients.
I'd come across FileHelpers a while back when I was looking for a CSV parser. The example I've linked to shows how you can use basic POCOs decorated with attributes to represent the file you are trying to parse, so you'd need a customer-specific POCO in order to parse each client's files.
I haven't tried this myself, but it could be worth a look.
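For what it's worth, here is a rough sketch of a client-specific POCO for the first layout in the question, using the FileHelpers attribute and engine names as I recall them from its documentation (double-check them against the library):

using FileHelpers;

[FixedLengthRecord]
public class Client1Record
{
    [FieldFixedLength(12), FieldTrim(TrimMode.Both)]
    public string Name;

    [FieldFixedLength(20), FieldTrim(TrimMode.Both)]
    public string Employer;

    [FieldFixedLength(7), FieldTrim(TrimMode.Both)]
    public int Income;

    [FieldFixedLength(5), FieldTrim(TrimMode.Both)]
    public string YearQtr;
}

// Usage, e.g. from a console app or a script task:
//   var engine = new FileHelperEngine<Client1Record>();
//   Client1Record[] records = engine.ReadFile(@"C:\data\client1.txt");

Each client would then get its own record class with the widths in that client's order.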

SSIS 2005 - How to Import a Fixed Width Flat File?

I have a flat file that looks something like this:
junk I don't care about \n
\n
columns names\n
val1 val2 val3\n
val1 val2 val3\n
columns names \n
val1 val2 val3\n
I only care about the lines with values. These value lines are all in fixed-width format and have the same line length. The other junk lines and column-name lines can have any width.
When I try the flat file fixed-width option or the ragged-right option, the preview looks all wrong. Any ideas on the easiest way to get this into SSIS?
You cannot use the fixed-width option, and I seem to recall that the ragged-right option only applies if the raggedness is confined to the last column.
You can use the ragged-right option to read each entire line into a single string column and then use derived columns.
Alternatively, pre-process the file (possibly in SSIS, using ragged-right with a conditional split, outputting to a flat file) to filter out the lines you are going to ignore; then you can use a flat file connection manager on the resulting file.
Another option is to code a data source script task by hand.
It would be nice if you could use more complex files by being able to define new connection manager layouts on the outputs of other data flows, but that is not currently available in SSIS.
This is basically the same problem I posed in this question: How to process ragged right text files with many suppressed columns in SSIS or other tool?
Try this after removing the junk at the top manually:
Set up the task with the fixed-width option.
Add the columns manually on the Advanced tab. Here you need to add 3 columns, each of length 4.
If that works, you can then use a Script Task to read the flat file and remove the junk before the data flow task runs.
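Here is a rough sketch of that pre-processing step as plain C# (the kind of logic the Script Task would run); the file paths are made up, and it assumes the value lines can be recognized purely by their fixed line length:

using System.IO;
using System.Linq;

class StripJunkLines
{
    const int DataLineLength = 12;   // 3 fixed-width columns of 4 characters each

    static void Main()
    {
        // Keep only the fixed-width value lines; junk and column-name lines
        // have a different length and are dropped.
        var dataLines = File.ReadLines(@"C:\data\raw_input.txt")
                            .Where(line => line.Length == DataLineLength);

        File.WriteAllLines(@"C:\data\clean_input.txt", dataLines);

        // Point the flat file connection manager (fixed-width option, columns
        // defined as above) at clean_input.txt.
    }
}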