SQL loader to load data into specific column of a table - sql

Recently started working on SQL Loader, enjoying the way it works.
We are stuck on a problem: we have to load all the columns of a CSV file (say, 10 columns exported from Excel), but the destination table contains around 15 fields.
FILLER works when you want to skip columns in the source file, but I am unsure what to do here.
Using a staging table helps, but is there any alternative?
Any help is really appreciated.
thanks.

You have to specify the columns in the control file
Recommended reading: SQL*Loader Control File Reference
10 The remainder of the control file contains the field list, which provides information about column formats in the table being loaded. See Chapter 6 for information about that section of the control file.
Excerpt from Chapter 6:
Example 6-1 Field List Section of Sample Control File
1  (hiredate  SYSDATE,
2   deptno    POSITION(1:2) INTEGER EXTERNAL(2)
              NULLIF deptno=BLANKS,
3   job       POSITION(7:14) CHAR TERMINATED BY WHITESPACE
              NULLIF job=BLANKS "UPPER(:job)",
    mgr       POSITION(28:31) INTEGER EXTERNAL
              TERMINATED BY WHITESPACE, NULLIF mgr=BLANKS,
    ename     POSITION(34:41) CHAR
              TERMINATED BY WHITESPACE "UPPER(:ename)",
    empno     POSITION(45) INTEGER EXTERNAL
              TERMINATED BY WHITESPACE,
    sal       POSITION(51) CHAR TERMINATED BY WHITESPACE
              "TO_NUMBER(:sal,'$99,999.99')",
4   comm      INTEGER EXTERNAL ENCLOSED BY '(' AND '%'
              ":comm * 100"
   )
In this sample control file, the numbers that appear to the left would not appear in a real control file. They are keyed in this sample to the explanatory notes in the following list:
1 SYSDATE sets the column to the current system date. See Setting a Column to the Current Date.
2 POSITION specifies the position of a data field. See Specifying the Position of a Data Field.
INTEGER EXTERNAL is the datatype for the field. See Specifying the Datatype of a Data Field and Numeric EXTERNAL.
The NULLIF clause is one of the clauses that can be used to specify field conditions. See Using the WHEN, NULLIF, and DEFAULTIF Clauses.
In this sample, the field is being compared to blanks, using the BLANKS parameter. See Comparing Fields to BLANKS.
3 The TERMINATED BY WHITESPACE clause is one of the delimiters it is possible to specify for a field. See TERMINATED Fields.
4 The ENCLOSED BY clause is another possible field delimiter. See Enclosed Fields.
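Applied to the question, a control file only needs to name the columns that the CSV actually supplies; table columns that are not listed are simply not loaded and stay NULL (or take their column default in a conventional path load). The following is a minimal sketch, not a tested configuration: the file name, table name, and all column names are hypothetical, and the last three entries only illustrate how extra table columns could be filled without any data in the file.

```sql
-- SQL*Loader control file sketch. All names below are hypothetical.
LOAD DATA
INFILE 'source.csv'
APPEND
INTO TABLE target_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
  -- The 10 CSV fields, listed in the order they appear in the file.
  col01,
  col02,
  col03,
  col04,
  col05,
  col06,
  col07,
  col08,
  col09,
  col10,
  -- Optional: table columns populated by SQL*Loader itself, not from the file.
  load_date    SYSDATE,
  source_name  CONSTANT 'csv_import',
  row_id       SEQUENCE(MAX, 1)
  -- The remaining table columns are not listed and are left alone.
)
```

FILLER is only needed for the opposite case, where the file has fields that the table does not.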

Related

Replace comma (,) with dot (.) in SQL and Float Datatype

I created a Table with numeric values like 9,35 or 10,5 in it. The Datatype is float. The table looks like this in the short version:
Currency | Euro | 2018 |
USD | 1 | 9,35 |
Now I want to update my table and replace every comma (,) with a dot (.).
I tried it with this code:
update dbo.[Table]
set [2018] = replace([2018], ',','.')
It says that 24 rows are affected, but when I refresh my table nothing has changed.
If I use this code:
select replace ([2018],',','.') from dbo.[Table]
Then it works fine, but it doesn't update my table...
Numeric columns do not contain a separator; a separator is only added when the data is displayed. Your environment was probably set up with a culture that uses a comma instead of a decimal point when it displays data. The comma is not stored with the value.
But, all you need to do is specify the format when you display the data, meaning in a report, form, app, whatever. That's where you specify how to format the values.
I would not format the data in the actual SQL query (e.g. converting the data to a string and specifying the format), since it makes it harder to do aggregations and other numeric operations on the client, and takes up more space in memory (which may not be a problem until you get to a massive scale).
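To make the point concrete, here is a small sketch for SQL Server 2012 or later, purely to demonstrate that the separator is a display choice and not something to do in production queries. The variable names are hypothetical; only the value 9,35 comes from the question.

```sql
-- SQL Server 2012 or later (FORMAT is not available in older versions).
-- The variable names are hypothetical.
DECLARE @stored float = 9.35;

-- The same stored number rendered under two different cultures.
SELECT FORMAT(@stored, 'N2', 'de-DE') AS german_style,
       FORMAT(@stored, 'N2', 'en-US') AS us_style;

-- If a value arrives as text with a comma, replace the comma BEFORE converting.
DECLARE @raw varchar(20) = '9,35';
SELECT CAST(REPLACE(@raw, ',', '.') AS float) AS parsed_value;
```

This also explains why the UPDATE appears to do nothing: REPLACE works on a string version of the float, the result is converted straight back to float, and the stored number is the same as before.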

Display certain sequence only in VARCHAR

I have a column error_desc with values like:
Failure occurred in (Class::Method) xxxxCalcModule::endCustomer. Fan id 111232 is not Effective or not present in BL9_XXXXX for date 20160XXX.
What SQL query can I use to display only the number 111232 from that column? The number starts at the 66th position of the VARCHAR column and ends at the 71st.
SELECT substr(ERROR_DESC,66,6) as ABC FROM bl1_cycle_errors where error_desc like '%Fan%'
This solution uses regular expressions.
The challenge I faced was filtering out alphanumerics. To detect the standalone number we have to retain only numbers and filter out plain strings, alphanumerics, and punctuation.
Pure strings and words not containing numbers can be easily filtered out using
[^[:digit:]]
Possible combinations of alphanumerics are:
1. Begins with a character, contains numbers, may end with characters or punctuation:
[a-zA-Z]+[0-9]+[[:punct:]]*[a-zA-Z]*[[:punct:]]*
2. Begins with numbers and then contains letters, may contain punctuation:
[0-9]+[[:punct:]]*[a-zA-Z]+[[:punct:]]*
3. Begins with numbers and then contains punctuation, may contain letters:
[0-9]+[a-zA-Z]*[[:punct:]]+[a-zA-Z]*
Combining these regular expressions with the | operator, we get:
select trim(REGEXP_REPLACE(error_desc,'[^[:digit:]]|[a-zA-Z]+[0-9]+[[:punct:]]*[a-zA-Z]*[[:punct:]]*|[0-9]+[[:punct:]]*[a-zA-Z]+[[:punct:]]*|[0-9]+[a-zA-Z]*[[:punct:]]+[a-zA-Z]*',' '))
from error_table;
This will work in most cases.
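If the number always follows the literal text "Fan id", as it does in the sample row, a narrower alternative is to anchor on that text and capture only the digits. This is a sketch under that assumption, for Oracle 11g or later; the inline sample_errors row is a stand-in for the real table.

```sql
-- Oracle 11g or later (uses the subexpression argument of REGEXP_SUBSTR).
-- sample_errors is a hypothetical stand-in for the real table;
-- the row text is the sample from the question.
WITH sample_errors (error_desc) AS (
  SELECT 'Failure occurred in (Class::Method) xxxxCalcModule::endCustomer. Fan id 111232 is not Effective or not present in BL9_XXXXX for date 20160XXX.'
  FROM dual
)
SELECT REGEXP_SUBSTR(error_desc, 'Fan id ([0-9]+)', 1, 1, NULL, 1) AS fan_id
FROM sample_errors;
```

For the sample row this returns 111232, and it does not depend on the number sitting at a fixed character position.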

SQL - need help in parsing text of a field

I have a select query that fetches a field with complex data. I need to parse that data into a specified format. Please help with your expertise:
selected string = complexType|ChannelCode=PB - Phone In A Box|IncludeExcludeIndicator=I
expected output - PB|I
Please help me in writing a sql regular expression to accomplish this output.
The first step in figuring out the regular expression is to be able to describe it in plain language. Based on what we know from your post (and as others have said, more info is really needed), some assumptions have to be made.
I'd take a stab at it by describing it like this, which is based on the sample data you provided: I want the sets of one or more characters that follow the equal signs but not including the following space or end of the line. The output should be these sets of characters, separated by a pipe, in the order they are encountered in the string when reading from left to right. My assumptions are based on your test data: only 2 equal signs exist in the string and the last data element is not followed by a space but by the end of the line. A regular expression can be built using that info, but you also need to consider other facts which would change the regex.
Could there be more than 2 equal signs?
Could there be an empty data element after the equal sign?
Could the data set after the equal sign contain one or more spaces?
All these affect how the regex needs to be designed. All that said, and based on the data provided and the assumptions as stated, next I would build a regex that describes the string (really translating from the plain language to the regex language), grouping around the data sets we want to preserve, then replace the string with those data sets separated by a pipe.
with tbl(str) as (
  select 'complexType|ChannelCode=PB - Phone In A Box|IncludeExcludeIndicator=I' from dual
)
select regexp_replace(str, '^.*=([^ ]+).*=([^ ]+)$', '\1|\2') result from tbl;

RESU
----
PB|I
The match regex explained:
^ Match the beginning of the line
. followed by any character
* followed by 0 or more 'any characters' (refers to the previous character class)
= followed by an equal sign
( start remembered group 1
[^ ]+ which is a set of one or more characters that are not a space
) end remembered group one
.*= followed by any number of any characters but ending in an equal sign
([^ ]+) followed by the second remembered group of non-space characters
$ followed by the end of the line
The replace string explained:
\1 The first remembered group
| a pipe character
\2 The second remembered group
Keep in mind this answer is for your exact sample data as shown, and may not work in all cases. You need to analyse the data you will be working with. At any rate, these steps should get you started on breaking down the problem when faced with a challenging regex. The important thing is to consider all types of data and patterns (or NULLs) that could be present and allow for all cases in the regex so you return accurate data.
Edit: Check this out, it parses all the values right after the equal signs and allows for nulls:
with tbl(str) as (
  select 'a=zz|complexType|ChannelCode=PB - Phone In A Box|IncludeExcludeIndicator=I - testing|test1=|test2=test2 - testing' from dual
)
select regexp_substr(str, '=([^ |]*)( |||$)', 1, level, null, 1) output, level
from tbl
connect by level <= regexp_count(str, '=')
ORDER BY level;

OUTPUT               LEVEL
-------------------- ----------
zz                            1
PB                            2
I                             3
                              4
test2                         5

How to query for special characters

I have a large table filled with vendor information.
I need to split this list into two separate lists based on column VENDOR_NAME. One list where VENDOR_NAME is all regular characters and numbers, another list where VENDOR_NAME is special/foreign characters.
I am not sure what the SELECT statements would be to view this information off of the existing master table. Then I could just create two new tables.
VENDOR_NAME only numbers and regular characters
VENDOR_NAME only foreign characters
Example:
Regular: BLUE RIBBON TAG & LABEL CORP
Foreign: 俞章平
Regular: ULSTER-SOCIETY OF GASTROENTEROLOGY/1
Foreign: 马建忠
You could use the function ASCIISTR():
ASCIISTR takes as its argument a string, or an expression that
resolves to a string, in any character set and returns an ASCII
version of the string in the database character set. Non-ASCII
characters are converted to the form \xxxx, where xxxx represents a
UTF-16 code unit.
To get all strings without special characters:
SELECT * FROM table
WHERE INSTR(ASCIISTR(vendor_name),'\') = 0
You have to take care, of course, that strings with '\' would be filtered out by this as well, since the backslash is translated to '\005C' by ASCIISTR. Maybe like this:
WHERE INSTR(REPLACE(ASCIISTR(vendor_name),'\005C','_' ),'\') = 0
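Putting the two conditions side by side, the split into two tables could be sketched as follows. The table names vendor_master, vendors_ascii, and vendors_foreign are hypothetical; only vendor_name comes from the question.

```sql
-- Oracle sketch; all table names are hypothetical.

-- Names made up of ASCII characters only (a literal backslash is allowed).
CREATE TABLE vendors_ascii AS
SELECT *
FROM vendor_master
WHERE INSTR(REPLACE(ASCIISTR(vendor_name), '\005C', '_'), '\') = 0;

-- Names containing at least one non-ASCII character.
CREATE TABLE vendors_foreign AS
SELECT *
FROM vendor_master
WHERE INSTR(REPLACE(ASCIISTR(vendor_name), '\005C', '_'), '\') > 0;
```

Rows where vendor_name is NULL satisfy neither condition, so they end up in neither table and would need separate handling.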

Import fixed width UTF-8 file into SQL 2008R2, variable file names

I have to import text files with different names (like the following) into SQL Server 2008.
XYZ0000746263.txt
XYZ0000746269.txt
XYZ0000745860.txt
The filename always starts with XYZ, and the number is always higher than the file before.
The format of the file is fixed-width, with UTF-8 encoding.
SHINST 1020130613
SHINSD0745860182650 940PI67100000 dataw11 2012CH 01002601900100 848CRU
SHINSD0745860182650 940PI67066900 dataa12 9434CH 00701801400030 848CRU
SHINSD0745860182650 940PI67160300 adsfaf13 1205CH 04203601000160 848CRU
SHINSD0745860182650 940PI67171300 data 14 1205CH 01803501200120 848ND1
SHINSD0745860182650 940PI67079000 asdfs15 8400CH 00702601400040 848ND1
SHINSD0745860182620 940PI67053900 data 16 6877CH 01904101100130 848ND1
SHINSD0745860182620 940PI67156100 text 17 3003CH 08906202902460 848ND2
SHINSD0745860182650 940PI67110700 alskdjf18 1000CH 02603900900130 848ND2
SHINSD0745860182620 940PI67123900 asfasdffa19 8048CH 01502300900020 848ND2
SHINSD0745860182650 940PI67066300 data 20 8952CH 01002601900090 848ND2
SHINSF000012
The first line contains SHINST, then the number of records in the file, then a date in the format YYYYMMDD.
The records contain SHINSD, then a 13-digit number, then the fixed-width records.
The last line contains SHINSF, then a six-digit number with the total number of lines of the file.
I want to automatically import files in this format into an SQL table. How can that be done?
You can try with:
BULK INSERT tablename FROM 'c:\File.txt' WITH (FIELDTERMINATOR = ' ')
This works if ' ' (space) is your field delimiter.
Of course, the table's columns must match the file (count and length).
You can wrap this in a procedure that takes only the file path as a parameter.
The only problem is the first and last rows of the file, because they are different and I don't know if you need to insert them as well. If you need to ignore them, you may need a script to remove them first.
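One way to handle both the changing file name and the header/footer rows is to load every line into a one-column staging table and split it afterwards. This is an untested sketch: the table name, file path, and output column names are hypothetical, and only the 13-digit number after SHINSD is cut out, because the question gives no other field widths. BULK INSERT does not accept a variable as the file name, hence the dynamic SQL. As far as I know, BULK INSERT on SQL Server 2008 R2 also cannot read UTF-8 directly, so a file containing non-ASCII characters may need converting first.

```sql
-- SQL Server sketch; dbo.ShinsStaging and the path are hypothetical.
CREATE TABLE dbo.ShinsStaging (line varchar(8000));

DECLARE @path nvarchar(260) = N'c:\import\XYZ0000745860.txt';

-- Build the statement dynamically because the file name changes per run.
DECLARE @sql nvarchar(max) =
    N'BULK INSERT dbo.ShinsStaging FROM ''' + REPLACE(@path, N'''', N'''''')
  + N''' WITH (ROWTERMINATOR = ''\n'')';
EXEC sp_executesql @sql;

-- Keep only detail rows; the SHINST header and SHINSF footer are skipped.
SELECT SUBSTRING(line, 7, 13)         AS record_number,
       SUBSTRING(line, 20, LEN(line)) AS remaining_fields
FROM dbo.ShinsStaging
WHERE line LIKE 'SHINSD%';
```

The whole-line load assumes the data contains no tab characters, since tab is the default field terminator. The remaining fields can be split with further SUBSTRING calls once their widths are known.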