Import fixed width text to SQL

We have records in this format:
99 0882300 25 YATES ANTHONY V MAY 01 12 04 123456 12345678
The width is fixed and we need to import it into SQL. We tried bulk import, but it didn't work because the file isn't ',' or '\t' delimited. The fields are separated by runs of spaces of varying lengths, which is where our dilemma lies.
Any suggestions on how to handle this? Thanks!

The question is pretty old but might still be relevant.
I had exactly the same problem as you.
My solution was to use BULK INSERT together with a FORMAT file.
This allows you to:
- keep the code much leaner
- keep the mapping for the text file in a separate file that you can easily tweak
- skip columns if you fancy
To cut to the chase, here is my data format (it is a single line, wrapped here for display):
608054000500SS001 ST00BP0000276AC024 19980530G10379 00048134501283404051N02912WAC 0024 04527N05580WAC 0024 1998062520011228E04ST 04856 -94.769323 26.954832
-94.761114 26.953626G10379 183 1
And here is my SQL code:
BULK INSERT dbo.TARGET_TABLE
FROM 'file_to_upload.dat'
WITH (
BATCHSIZE = 2000,
FIRSTROW = 1,
DATAFILETYPE = 'char',
ROWTERMINATOR = '\r\n',
FORMATFILE = 'formatfile.Fmt'
);
Please note the ROWTERMINATOR and DATAFILETYPE parameters set there.
And here is the format file
11.0
6
1 SQLCHAR 0 12 "" 1 WELL_API SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 19 "" 2 SPACER1 SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 8 "" 3 FIELD_CODE SQL_Latin1_General_CP1_CI_AS
4 SQLCHAR 0 95 "" 4 SPACER2 SQL_Latin1_General_CP1_CI_AS
5 SQLCHAR 0 5 "" 5 WATER_DEPTH SQL_Latin1_General_CP1_CI_AS
6 SQLCHAR 0 93 "" 6 SPACER3 SQL_Latin1_General_CP1_CI_AS
I put documentation links below, but note the following:
- the "" in the fifth column indicates the field terminator (for a .csv it would obviously be ","), which in our case is just the empty string "";
- the second column is always "SQLCHAR", since it is a text file; this must stay so even if the destination field in the data table is, for example, an integer (as in my case).
Bonus note: in my case I only needed three fields, so the stuff in the middle I just called "SPACER", and the format file ignores it (you change the numbers in the sixth column, see the documentation).
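For reference, the documented way to skip a field entirely is to set the table column number (the sixth column) to 0. A hedged sketch of the same format file with the SPACER fields skipped, so the target table only needs the three real columns:
11.0
6
1 SQLCHAR 0 12 "" 1 WELL_API    SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 19 "" 0 SPACER1     ""
3 SQLCHAR 0 8  "" 2 FIELD_CODE  SQL_Latin1_General_CP1_CI_AS
4 SQLCHAR 0 95 "" 0 SPACER2     ""
5 SQLCHAR 0 5  "" 3 WATER_DEPTH SQL_Latin1_General_CP1_CI_AS
6 SQLCHAR 0 93 "" 0 SPACER3     ""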
Hope it answers your needs; it works fine for me.
Cheers
Documentation here:
https://msdn.microsoft.com/en-us/library/ms178129.aspx
https://msdn.microsoft.com/en-us/library/ms187908.aspx

When you feel more at home with SQL than with import tools, you can bulk import the file into a single VARCHAR(255) column of a staging table, then process all the records with SQL and transform them into your destination table:
CREATE TABLE #DaTable(MyString VARCHAR(255))

INSERT INTO #DaTable(MyString)
VALUES ('99 0882300 25 YATES ANTHONY V MAY 01 12 04 123456 12345678')

INSERT INTO FinalTable(Col1, Col2, Col3, Name)
SELECT CAST(SUBSTRING(MyString, 1, 3) AS INT) AS Col1,
       CAST(SUBSTRING(MyString, 4, 7) AS INT) AS Col2,
       CAST(SUBSTRING(MyString, 12, 3) AS INT) AS Col3,
       SUBSTRING(MyString, 15, 6) AS Name
FROM #DaTable
result: 99 882300 25 YATES
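Side note: if some rows may contain non-numeric garbage, TRY_CAST (available from SQL Server 2012) returns NULL instead of raising a conversion error. A minimal sketch:
SELECT TRY_CAST(SUBSTRING(MyString, 1, 3) AS INT) AS Col1 -- NULL instead of an error on bad data
FROM #DaTable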

To import from TXT to SQL:
CREATE TABLE #DaTable (MyString VARCHAR(MAX));
And to import from a file:
BULK INSERT #DaTable
FROM 'C:\Users\usu...IDA_S.txt'
WITH
(
CODEPAGE = 'RAW'
)
3rd party edit
The sqlite docs on importing files have an example of inserting records into a pre-existing temporary table from a file whose first row holds the column names:
sqlite> .import --csv --skip 1 --schema temp C:/work/somedata.csv tab1
My advice is to import the whole file into a new table (TestImport) with one column, like this
sqlite> .import C:/yourFolder/text_flat.txt TestImport
and save it to a db file
sqlite> .save C:/yourFolder/text_flat_out.db
And now you can do all sorts of ETL with it.
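The ETL step itself can then be plain SQL against the single imported column. A sketch using sqlite's substr(); the column name 'line' and the field widths are assumptions, check yours with .schema TestImport:
CREATE TABLE parsed AS
SELECT CAST(substr(line, 1, 3) AS INTEGER) AS col1, -- 'line' is the assumed column name
       CAST(substr(line, 4, 8) AS INTEGER) AS col2,
       trim(substr(line, 15, 20))          AS name
FROM TestImport;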

I did this for a client a while back and, sad as it may seem, Microsoft Access was the best tool for the job for his needs. It's got support for fixed width files baked in.
Beyond that, you're looking at writing a script that translates the file's rows into something SQL can understand in an insert/update statement.
In Ruby, you could use the String#slice method, which takes an index and a length, which is exactly how fixed width file definitions are usually expressed. Read the file in, parse the lines, and write them back out as SQL statements.

Use SSIS instead.
It is much clearer and offers various options for importing (text) files.

Related

How to generate a fixed width file with different data types in each row in SSIS

I want to find out if there is a better way to present data the way I need using T-SQL, generating a text file in SSIS from a SQL command (Exec stored proc).
I need to present data in a text file with a fixed width for each column, e.g. col1 - 10 alpha characters, col2 - numeric with zero pads, col3 - blanks, etc.
The total number of characters in a line must be exactly 275.
However, each row is going to have different data and different column requirements.
For example:
First row: '1', col1 - 22 alpha characters, col2 - numeric with zero pads, 10 characters, col3 - blanks (fill up to 275)
Second row: '2', col1 - date, 6 characters, col2 - 3 blanks, col3 - 30 alpha characters, col4 - numeric, col5 - blanks (fill up to 275)
What I came up with is to concatenate each row into one big string and then UNION ALL the rows.
In SSIS I use Ragged Right format without column headers, and the text file comes out exactly as I want, but I wonder if there is a better way to do this.
I figured out how to manipulate the data with the various functions, so I'll just present the code without them to keep everything simple:
SELECT CONCAT('1', COL1, COL2, REPLICATE(' ', n...)) AS [123]
FROM MYTABLE
UNION ALL
SELECT CONCAT('2', COL1, REPLICATE(' ', 3), COL2, COL3, REPLICATE(' ', n...)) AS [123]
FROM MYTABLE2
The layout of the results in the text file should look like this (I just put in random data to make the example easier to visualize):
1Microsoft SQL Server 0000002017
208202019 John Doe 00000015
208202019 Jane Doe 00000109
208202019 Will Smith 00001996
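For the padding pieces elided above (the n... placeholders), helpers along these lines keep every segment at its exact width. A sketch with illustrative widths:
SELECT CONCAT(
    '1',
    LEFT(CONCAT(COL1, REPLICATE(' ', 22)), 22),  -- alpha: space-pad/truncate to 22
    RIGHT(CONCAT(REPLICATE('0', 10), COL2), 10), -- numeric: zero-pad to 10
    REPLICATE(' ', 275 - 33)                     -- blanks: fill the line out to 275
) AS [123]
FROM MYTABLE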
Rephrasing your task: you need to set a custom header for your Flat File, and output rows in Ragged Right format.
An alternative approach is the following. Create a string variable and fill it with the desired header row data. On the Data Flow task that fills the Flat File Destination, locate the [Flat File Destination].[Header] property and set its expression to the string variable above. This creates a file with the defined header. Then, on the data flow itself, create a computed string column in which you format your output string, and save that column into the Flat File Destination.
This is more of an SSIS approach, since you do not have to write complex SQL statements.

Additional 0 in varbinary insert in SSMS

I have a problem when I am trying to move a varbinary(max) field from one DB to another.
If I insert like this:
0xD0CF11E0A1B11AE10000000
The result begins with an additional '0':
0x0D0CF11E0A1B11AE10000000
And I cannot get rid of this. I've tried many tools, like the SSMS export tool or BCP, but without any success. It would be better for me to solve it in a script anyway.
I don't have much knowledge about varbinary (a program generates it); my only goal is to copy it :)
0xD0CF11E0A1B11AE10000000
This value contains an odd number of characters. Varbinary stores bytes, and each byte is represented by exactly two hexadecimal characters. You're either missing a character, or you're not storing bytes.
Here, SQL Server is guessing that the most significant digit is a zero, which does not change the numeric value. For example:
select 0xD0C "value"
,cast(0xD0C as int) "as_integer"
,cast(0x0D0C as int) "leading_zero"
,cast(0xD0C0 as int) "trailing_zero"
value    as_integer   leading_zero   trailing_zero
------   ----------   ------------   -------------
0d0c     3340         3340           53440
Or:
select 1 "test"
where 0xD0C = 0x0D0C
test
-------
1
It all comes down to SQL Server assuming that varbinary always represents whole bytes.
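If you control the copy script, you can at least detect the ambiguity before converting. A sketch; CONVERT with style 2 expects an even number of hex digits:
DECLARE @hex varchar(max) = 'D0CF11E0A1B11AE10000000'; -- the value without its 0x prefix
IF LEN(@hex) % 2 = 1
    PRINT 'Odd number of hex digits - the source value is ambiguous';
SELECT CONVERT(varbinary(max), @hex, 2); -- style 2 = hex string without 0x prefix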

converting input fields from text to columns in FoxPro or SQL

I have a set of input data in FoxPro. One of the fields, the grp field, is a concatenated string whose individual portions are delimited by the pipe symbol, "|". Here are some examples of the values it can take:
grp:
ddd|1999|O|%
bce|%
aaa|2009|GON|Fixed|big|MAE|1
bbb|PAL|N|Fixed|MAE|1
aaa|SMK|O|Fixed|MAE|1|1
ddd|ERT|O|%
eef|%|N|%
afd|2000|O|%
afd|200907|O|%
swq|%|O|%
%
I would like to write a query that separates the data above into individual fields and outputs them to another SQL table, where the deciding factor for the separation is the pipe symbol. Taking the first two rows as an example, the output should read:
Record 1:
Field1 = ddd Field2 = 1999 Field3 = O Field4 = %
Record 2:
Field1 = bce Field2 = % Field3 holds no value Field4 holds no value
It will not be known in advance what the greatest number of pipe symbols in the data will be. In the example above, it is 6, in records 3 and 5.
Is it actually possible to do this?
You can create a cursor and append the data into it using APPEND FROM (another way would be to use two ALINES() calls, one for the rows and one for the column data). For example, using your data as a text variable:
Local lcSample, lcTemp, lnFields, ix
TEXT to m.lcSample noshow
ddd|1999|O|%
bce|%
aaa|2009|GON|Fixed|big|MAE|1
bbb|PAL|N|Fixed|MAE|1
aaa|SMK|O|Fixed|MAE|1|1
ddd|ERT|O|%
eef|%|N|%
afd|2000|O|%
afd|200907|O|%
swq|%|O|%
%
ENDTEXT
lnFields = 0
Local Array laText[1]
For ix=1 To Alines(laText, m.lcSample)
m.lnFields = Max(m.lnFields, Occurs('|', m.laText[m.ix]))
Endfor
#Define MAXCHARS 20 && max field width expected
Local Array laField[m.lnFields,4]
For ix = 1 To m.lnFields
m.laField[m.ix,1] = 'F' + Ltrim(Str(m.ix))
m.laField[m.ix,2] = 'C'
m.laField[m.ix,3] = MAXCHARS
m.laField[m.ix,4] = 0
Endfor
lcTemp = Forcepath(Sys(2015)+'.txt', Sys(2023))
Strtofile(m.lcSample, m.lcTemp)
Create Cursor myData From Array laField
Append From (m.lcTemp) Delimited With "" With Character "|"
Erase (m.lcTemp)
Browse
However, in the real world this doesn't sound very realistic. You should know something about the data ahead of time.
Also, you could use FoxyClasses' import facilities to get the data. It lets you choose the delimiters, map the columns, etc., but requires some developer intervention for the final processing of the data.
The ALINES() function makes parsing easy. You could apply it to each line in turn. @Cetin has already shown you how to find out how many fields you need.
I had to do something very similar with some client data. They provided a field containing a space-separated list of numbers that needed to be pulled out into a single column of numbers to match to an offer. Initially I dumped it to a text file and imported it back into a new table, something like the following:
create table groups ;
(group1 c(5), group2 c(5), group3 c(5), group4 c(5), group5 c(5))
select grp from infile to file grps.tmp noconsole plain
select groups
append from grps.tmp delimited with "" with character "|"

SQL Bulk Insert skipping last 849 Lines from Text File

Good day all! For some reason my BULK INSERT statement is skipping the last 849 lines of the text file I am reading. I know this because when I manually add my own last line, I don't see it in the table after the insert is done, and when debugging I see the message (133758 row(s) affected) while the text file has 134607 lines, excluding the first 2.
My query looks like this:
BULK INSERT #TEMP FROM 'C:\Test\Test.txt'
WITH (FIELDTERMINATOR ='\t', ROWTERMINATOR = '0x0a', FIRSTROW = 2, MAXERRORS = 50, KEEPNULLS)
I have checked whether there are more columns than the table has, and that's not the case. I have changed MAXERRORS from 10 to 20 to 30 to 40 to 50 to see if anything changes, but the rows affected count stays the same. Is there maybe something I haven't handled or am missing?
Thanks awesome people.
P.S. I am using this insert for another text file and table, with different column headers and fewer columns in the text file, and it works perfectly.
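One hedged diagnostic: BULK INSERT's ERRORFILE option writes the rejected rows (plus an .Error.Txt description file) to disk, which should reveal what is in the missing lines; the log path below is illustrative:
BULK INSERT #TEMP FROM 'C:\Test\Test.txt'
WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '0x0a', FIRSTROW = 2,
      MAXERRORS = 50, KEEPNULLS,
      ERRORFILE = 'C:\Test\Test_errors.log')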

Function to import data from a txt file to a SQL-Server table

I am quite new to SQL Server.
I have a .txt file with values separated like this:
20120101;001;2;0;0;0;0;0;8
20120102;002;3;0;0;0;0;0;8
20120103;003;4;0;0;0;0;0;8
...
This file is on a server, and I need to import it automatically every day at the same hour and pick up the changes (the file changes every day). I was thinking of writing a function to do it, but I don't know if that is even possible.
Any help would be appreciated. Thanks in advance.
INSERT INTO your_table
SELECT a.*
FROM OPENROWSET(BULK 'c:\input.csv', FORMATFILE = 'c:\Format.fmt') AS a;
Example Format.fmt:
9.0
6
1 SQLCHAR 0 15 ";" 1 col1 SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 25 ";" 2 col2 SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 6 ";" 3 col3 SQL_Latin1_General_CP1_CI_AS
4 SQLCHAR 0 5 ";" 4 col4 SQL_Latin1_General_CP1_CI_AS
5 SQLCHAR 0 15 ";" 5 col5 SQL_Latin1_General_CP1_CI_AS
6 SQLCHAR 0 100 "\r\n" 6 col6 SQL_Latin1_General_CP1_CI_AS
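Since the import has to run automatically every day, one approach is to wrap the insert in a stored procedure and have a SQL Server Agent job execute it on a schedule. A sketch; the object names are illustrative:
CREATE PROCEDURE dbo.ImportDailyFile
AS
BEGIN
    INSERT INTO your_table
    SELECT a.*
    FROM OPENROWSET(BULK 'c:\input.csv', FORMATFILE = 'c:\Format.fmt') AS a;
END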
You could use SQL Server Integration Services (SSIS) to import flat files on a scheduled basis.
Here is an example that loops through all CSV files of the same format in a given folder and loads them into the database:
How do I move files to an archive folder after the files have been processed?
Here is another example:
Validate CSV file Data before import into SQL Server table in SSIS?
The following answer shows how to schedule an SSIS package to run from inside a SQL Server Agent job:
How do I create a step in my SQL Server Agent Job which will run my SSIS package?
Hope that gives you an idea.