Import fixed-width UTF-8 file into SQL Server 2008 R2, variable file names - sql

I have to import text files with different names (like the following) into SQL Server 2008.
XYZ0000746263.txt
XYZ0000746269.txt
XYZ0000745860.txt
The filename always starts with XYZ, and the number is always higher than in the previous file.
The format of the file is fixed-width, with UTF-8 encoding.
SHINST 1020130613
SHINSD0745860182650 940PI67100000 dataw11 2012CH 01002601900100 848CRU
SHINSD0745860182650 940PI67066900 dataa12 9434CH 00701801400030 848CRU
SHINSD0745860182650 940PI67160300 adsfaf13 1205CH 04203601000160 848CRU
SHINSD0745860182650 940PI67171300 data 14 1205CH 01803501200120 848ND1
SHINSD0745860182650 940PI67079000 asdfs15 8400CH 00702601400040 848ND1
SHINSD0745860182620 940PI67053900 data 16 6877CH 01904101100130 848ND1
SHINSD0745860182620 940PI67156100 text 17 3003CH 08906202902460 848ND2
SHINSD0745860182650 940PI67110700 alskdjf18 1000CH 02603900900130 848ND2
SHINSD0745860182620 940PI67123900 asfasdffa19 8048CH 01502300900020 848ND2
SHINSD0745860182650 940PI67066300 data 20 8952CH 01002601900090 848ND2
SHINSF000012
The first line contains SHINST, then the number of records in the file, then a date in the format YYYYMMDD.
The data records start with SHINSD, then a 13-digit number, then the fixed-width fields.
The last line contains SHINSF, then a six-digit number giving the total number of lines in the file.
I want to automatically import files in this format into an SQL table. How can that be done?

You can try with:
BULK INSERT tablename FROM 'c:\File.txt' WITH (FIELDTERMINATOR = ' ')
If ' ' (space) is your field delimiter.
Of course, the table's columns must match those in the file (count and length).
You can make a procedure and pass it only the file name and path, as in these examples.
The only problem is the first and last rows in the file, because they are different and I don't know if you need to insert them as well. If you need to ignore them, maybe you'll need a script to eliminate them.
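A rough sketch of that procedure idea (the table dbo.ShinsRecords, the column name FirstColumn, the FIRSTROW option, and the trailer cleanup are my assumptions, not part of the question; UTF-8 handling may need extra care on 2008 R2):

CREATE PROCEDURE dbo.ImportShinsFile
    @FilePath NVARCHAR(260)   -- full path of one XYZ*.txt file
AS
BEGIN
    DECLARE @sql NVARCHAR(MAX);

    -- BULK INSERT needs a literal file name, so build the statement dynamically.
    SET @sql = N'BULK INSERT dbo.ShinsRecords
                 FROM ''' + @FilePath + N'''
                 WITH (
                     FIELDTERMINATOR = '' '',
                     ROWTERMINATOR   = ''\n'',
                     FIRSTROW        = 2      -- skip the SHINST header line
                 );';
    EXEC sp_executesql @sql;

    -- The SHINSF trailer line is still loaded; remove it afterwards
    -- (or strip it from the file before the import).
    DELETE FROM dbo.ShinsRecords WHERE FirstColumn LIKE 'SHINSF%';
END;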

Related

Oracle SQL regex: extract every instance of a string and preceding/following characters

I'm pulling data from an Oracle CLOB field containing tens of thousands of characters. The data look like this:
...
196|9900000296567|V|
197|S05S53499|D|
198|TO|20170128000000|50118.0|||T|N|
196|9900009777884|V|
197|H02FC07599|D|
198|01|20170128000000|64452.0|||T|N|
198|02|20170128000000|14235.0|||T|N|
196|9900014386487|V|
197|S10C20599|D|
198|1|20170128000000|6246.0|||T|N|
196|9900015184256|V|
197|S13G44199|D|
198|L|20170128000000|1731.0|||T|N|
198|N|20170128000000|5915.0|||T|N|
196|9900018826270|V|
197|S10C20599|D|
198|01|20170128000000|3678.0|||T|N|
198|02|20170128000000|25286.0|||T|N|
...
I want to extract every occurrence of a string (e.g. S10C20599) with the preceding 25 characters and following 75 characters. If this bit is not possible I'd happily settle for the same number of preceding and following characters. I don't care if I get overlaps in the extracted data, and the code should not error if the search string occurs <25 characters from the beginning of the file or <75 characters from the end.
Thanks for any tips.
If there is only one value, you can use:
select regexp_substr(col, '.{0,25}S10C20599.{0,75}')
Otherwise, you need some sort of recursive or hierarchical query to fetch multiple values from a single string.
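A hedged sketch of that approach, assuming the CLOB sits in a single-row table t with a column col (both names are placeholders): the occurrence argument of REGEXP_SUBSTR pulls one match per level, and CONNECT BY LEVEL generates enough levels to cover every occurrence.

select regexp_substr(col, '.{0,25}S10C20599.{0,75}', 1, level) as chunk
from t
connect by level <= regexp_count(col, 'S10C20599');
-- Any extra levels simply return NULL rows, which can be filtered out.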

Import all but last three rows of CSV file into SQL table

I have a CSV file that is created by another process (that I can't change) which includes a time stamp and user name in a line below the data. I need to import the data without the final line, because it causes an error due to it being an invalid column value.
If I manually remove this line (which I want to avoid doing), my SQL can successfully import the data using:
BULK INSERT #TempReport
FROM 'D:\ac2000\Reg.csv'
WITH
(
FORMAT = 'CSV',
FIRSTROW = 2, -- second row so skip header row in csv file
FIELDTERMINATOR = ',', --CSV field delimiter
FIELDQUOTE = '"', -- Double quote mark is a text delimiter
ROWTERMINATOR = '\n'
)
I know that there is also a LASTROW option in BULK INSERT, but the CSV will have a different number of rows each time, so I need a way to calculate the number of rows without importing it! ... or at least, not importing it with the method above, which only results in another error.
You can READ the csv file without importing it.
SELECT *
FROM OPENROWSET (
BULK N'd:\temp\data.csv',
FORMATFILE = 'D:\temp\fmt.fmt',
FIRSTROW=2
) j
For example
data.csv
ID, User
1,User1
10,User 10
11,User 11
2,user2
Format File
13.0
2
1 SQLCHAR 0 2 "," 1 PersonID ""
2 SQLCHAR 0 25 "\r\n" 2 FirstName SQL_Latin1_General_CP1_CI_AS
You can use a .fmt or an .xml format file and generate it using bcp.
You can read more about format files and how to generate them by following the links below.
Non-XML format file
XML format file
Generate format file
And here is the documentation about OPENROWSET.
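Putting the two pieces together, a rough sketch (the file and format-file paths are just the placeholders used above; it assumes a single trailing line, as in the question body, and that the trailing line still parses under the format file, so the OPENROWSET count includes it):

DECLARE @rows INT, @sql NVARCHAR(MAX);

-- Count every row after the header; the trailing line is included in the count.
SELECT @rows = COUNT(*)
FROM OPENROWSET (
    BULK N'D:\ac2000\Reg.csv',
    FORMATFILE = 'D:\temp\fmt.fmt',
    FIRSTROW = 2
) j;

-- Load rows 2 .. @rows, i.e. everything except the header and the trailing line.
SET @sql = N'BULK INSERT #TempReport
             FROM ''D:\ac2000\Reg.csv''
             WITH (FORMAT = ''CSV'', FIRSTROW = 2,
                   LASTROW = ' + CAST(@rows AS NVARCHAR(10)) + N');';
EXEC sp_executesql @sql;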

SQL Server 2012 output to txt file is adding spaces to end of string

I have a project to output a query to .txt from SQL Server 2012 where the data type is varchar(25) and the string length is variable. The output is adding spaces to the end of the string to make the output equal to the column width. I have tried rtrim(); however, the spaces do not exist in the source data. I cannot change the column data type.
Data example is 'XXXXX.XXX' and output looks like this 'XXXXX.XXX '
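For reference, a minimal sketch of the kind of query described (the table and column names are hypothetical); note that if the padding is added by the export step rather than stored in the rows, trimming inside the query will not remove it:

SELECT RTRIM(ItemCode) AS ItemCode   -- ItemCode is the varchar(25) column
FROM dbo.SourceTable;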

invalid input syntax for integer with postgres

I have a table:
id | detail
1 | ddsffdfdf ;df, deef,"dgfgf",/dfdf/
When I did: insert into details values(1,'ddsffdfdf ;df, deef'); => it got inserted properly.
When I copied that inserted value from the database to a file, the file had: 1 ddsffdfdf ;df, deef
Then I loaded the whole csv file into the pgsql database, with values in the format: 1 ddsffdfdf ;df, deef
The error ERROR: invalid input syntax for integer: "1 ddsffdfdf ;df, deef" is obtained. How can I solve the problem?
CSVs need a delimiter that Postgres will recognize to break the text into respective fields. Your delimiter is a space, which is insufficient. Your CSV file should look more like:
1,"ddsffdfdf df, deef"
And your SQL should look like:
COPY details FROM 'filename' WITH CSV;
The WITH CSV is important because it tells Postgres to use a comma as the delimiter and parses your values based on that. Because your second field contains a comma, you want to enclose its value in quotes so that its comma is not mistaken for a delimiter.
To look at a good example of a properly formatted CSV file, you can output your current table:
COPY details TO '/your/filename.csv' WITH CSV;
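A small end-to-end sketch of the above (the table definition and the file path are assumptions based on the question):

-- table roughly as described in the question
CREATE TABLE details (id integer, detail text);

-- /tmp/details.csv contains the single line:
-- 1,"ddsffdfdf ;df, deef"
COPY details FROM '/tmp/details.csv' WITH CSV;

SELECT * FROM details;   -- id = 1, detail = 'ddsffdfdf ;df, deef'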

SQL loader to load data into specific column of a table

I recently started working with SQL*Loader and am enjoying the way it works.
We are stuck on a problem where we have to load all the columns from a csv file (say 10 columns in Excel), but the destination table contains around 15 fields.
FILLER works when you want to skip columns in the source file, but I am unsure what to do here.
Using a staging table helps, but is there any alternative?
Any help is really appreciated.
thanks.
You have to specify the columns in the control file.
Recommended reading: SQL*Loader Control File Reference
10 The remainder of the control file contains the field list, which provides information about column formats in the table being loaded. See Chapter 6 for information about that section of the control file.
Excerpt from Chapter 6:
Example 6-1 Field List Section of Sample Control File
1 (hiredate SYSDATE,
2 deptno POSITION(1:2) INTEGER EXTERNAL(2)
NULLIF deptno=BLANKS,
3 job POSITION(7:14) CHAR TERMINATED BY WHITESPACE
NULLIF job=BLANKS "UPPER(:job)",
mgr POSITION(28:31) INTEGER EXTERNAL
TERMINATED BY WHITESPACE, NULLIF mgr=BLANKS,
ename POSITION(34:41) CHAR
TERMINATED BY WHITESPACE "UPPER(:ename)",
empno POSITION(45) INTEGER EXTERNAL
TERMINATED BY WHITESPACE,
sal POSITION(51) CHAR TERMINATED BY WHITESPACE
"TO_NUMBER(:sal,'$99,999.99')",
4 comm INTEGER EXTERNAL ENCLOSED BY '(' AND '%'
":comm * 100"
)
In this sample control file, the numbers that appear to the left would not appear in a real control file. They are keyed in this sample to the explanatory notes in the following list:
1 SYSDATE sets the column to the current system date. See Setting a Column to the Current Date.
2 POSITION specifies the position of a data field. See Specifying the Position of a Data Field.
INTEGER EXTERNAL is the datatype for the field. See Specifying the Datatype of a Data Field and Numeric EXTERNAL.
The NULLIF clause is one of the clauses that can be used to specify field conditions. See Using the WHEN, NULLIF, and DEFAULTIF Clauses.
In this sample, the field is being compared to blanks, using the BLANKS parameter. See Comparing Fields to BLANKS.
3 The TERMINATED BY WHITESPACE clause is one of the delimiters it is possible to specify for a field. See TERMINATED Fields.
4 The ENCLOSED BY clause is another possible field delimiter. See Enclosed Fields.
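Applied to the question above, a rough sketch of a control file (the file, table, and column names are placeholders): list only the ten CSV columns in the field list, and the remaining five table columns are simply left NULL or take their defaults.

-- load 10 comma-separated columns into a 15-column table
LOAD DATA
INFILE 'source_data.csv'
APPEND
INTO TABLE target_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
  col01, col02, col03, col04, col05,
  col06, col07, col08, col09, col10
)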