zero padding in teradata sql - sql

Table A
Id varchar(30)
I'm trying to re-create logic that requires 9-digit Ids irrespective of the actual length of the value in the Id field.
So for instance, if the Id is of length 6, I'll need to left pad with 3 leading zeros. The actual length can be anything ranging from 1 to 9.
Any ideas how to implement this in Teradata SQL?

If the actual length is 1 to 9 characters, why is the column defined as VARCHAR(30)?
If it was a numeric column it would be easy:
CAST(CAST(numeric_col AS FORMAT '9(9)') AS CHAR(9))
For strings there's no FORMAT like that, but depending on your release you might have an LPAD function:
LPAD(string_col, 9, '0')
Otherwise it's:
SUBSTRING('000000000' FROM CHAR_LENGTH(string_col)+1) || string_col
If there are more than nine characters, all previous calculations will return them.
If you want to truncate (or want a CHAR instead of a VARCHAR result) you have to add a final CAST AS CHAR(9).
And finally, if there are leading or trailing blanks you might want to use TRIM(string_col)
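The SUBSTRING-based expression can be tried out without a Teradata system. The following is a minimal sketch that runs an equivalent expression in SQLite through Python's sqlite3 module; SQLite is only a stand-in here (its substr, length and trim are assumed to behave like SUBSTRING, CHAR_LENGTH and TRIM for this purpose), and the helper name pad_id is hypothetical.

```python
import sqlite3

# SQLite equivalent of:
#   SUBSTRING('000000000' FROM CHAR_LENGTH(TRIM(string_col))+1) || TRIM(string_col)
# The bound parameter plays the role of string_col.
PAD_SQL = "SELECT substr('000000000', length(trim(?)) + 1) || trim(?)"


def pad_id(raw: str) -> str:
    """Left-pad raw with zeros to 9 characters (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(PAD_SQL, (raw, raw)).fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    for sample in ["123456", "7", " 42 ", "1234567890"]:
        print(repr(sample), "->", pad_id(sample))
```

In this sketch a value longer than nine characters comes back unpadded and untruncated, which is why a final CAST AS CHAR(9) is needed if truncation is wanted.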

Related

Oracle - Why is a CHAR column automatically adding a leading zero?

I am working with an Oracle DB 11g
I have a database table with the primary key being a CHAR(4) - Though only numbers are used for this column.
I noticed that there are some records that for example show '0018' or '0123'.
So I noticed a few odd things and need some help with them:
-Does a CHAR column "automatically" pad zeros to a value?
-Also I noticed when writing SQL that if I DON'T use quotes in my where clause it returns results, but if I do use quotes it does not. So for example:
DB CHAR(4) column has a key of '0018'
I use this query
SELECT * FROM TABLE_A WHERE COLUMN_1=18;
I get the row as expected.
But when I try the following
SELECT * FROM TABLE_A WHERE COLUMN_1='18';
This does NOT work but this does work again
SELECT * FROM TABLE_A WHERE COLUMN_1='0018';
So I am a bit confused how the first query can work as expected without quotes?
Does a CHAR column "automatically" pad zeros to a value?
No. From the documentation:
If you insert a value that is shorter than the column length, then Oracle blank-pads the value to column length.
So if you insert the number 18 it will be implicitly converted to the string '18  ', with two trailing spaces. You can see that in this fiddle, which also shows the comparisons.
That means something else is zero-padding your data - either your application/code before inserting, or possibly in a trigger.
Also I noticed when writing a SQL that if I DONT use quotes in my where clause that it returns results, but if I do use quotes it does not
The data type comparison and conversion rules are shown in the documentation too:
When comparing a character value with a numeric value, Oracle converts the character data to a numeric value.
When you do:
SELECT * FROM TABLE_A WHERE COLUMN_1=18;
the string '0018' is implicitly converted to the number 18 so that it can be compared with your numeric literal. The leading zeros are meaningless once it's converted, so '0018', '018 ' and '18  ' would all match.
With your zero-padded column value that matches and you do get a result: 18 ('0018' converted to a number) = 18
That means that every value in the table has to be converted before it can be compared, which also means that if you have a normal index on column_1 then it wouldn't be utilised in that comparison.
When you do:
SELECT * FROM TABLE_A WHERE COLUMN_1='18';
the column and literal are the same data type so no conversion has to be applied (so a normal index can be used). Oracle will use blank-padded comparison semantics here, because the column is char, padding the shorter literal value to the column size as '18  ', and then it will only match if the strings match exactly - so '18  ' would match but '0018' or ' 18 ' or anything else would not.
With your zero-padded column value that does not match and you don't get a result: '0018' != '18  ' ('18' padded to length 4)
When you do:
SELECT * FROM TABLE_A WHERE COLUMN_1='0018';
the column and literal are the same data type so no conversion is needed, and no padding is applied as the literal is already the same length as the column value. Again it will only match if the strings match exactly - so '0018' would match but '18  ' or ' 18 ' or anything else would not.
With your zero-padded column value that matches and you do get a result: '0018' = '0018'
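The three cases above can be modelled in a few lines. This is a simplified Python sketch of the two comparison rules described in this answer, not Oracle's actual implementation; the function names and the default width of 4 (for a CHAR(4) column) are assumptions made for illustration.

```python
def matches_number(stored: str, literal: int) -> bool:
    """Model of COLUMN_1 = 18: the character value is converted to a number."""
    return int(stored.strip()) == literal


def matches_string(stored: str, literal: str, width: int = 4) -> bool:
    """Model of COLUMN_1 = '18': blank-padded comparison.

    The shorter operand is padded with trailing spaces to the column width,
    then the strings must match exactly.
    """
    return stored.ljust(width) == literal.ljust(width)


if __name__ == "__main__":
    print(matches_number("0018", 18))      # numeric comparison
    print(matches_string("0018", "18"))    # '0018' vs '18  '
    print(matches_string("0018", "0018"))  # exact string match
```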
Does a CHAR column "automatically" pad zeros to a value?
No. A CHAR column pads with trailing spaces, never zeros, up to the fixed size of the character field, so the leading zeros must have been added before the values were stored.
So I am a bit confused how the first query can work as expected without quotes?
Because of implicit type conversion. The system casts either the char to numeric or the numeric to char: it either drops the leading zeros and compares numeric values, or it pads to the same data type and then compares. I'm pretty sure it goes from character to numeric, and thus the leading zeros are dropped when comparing.
See: https://docs.oracle.com/cd/B13789_01/server.101/b10759/sql_elements002.htm for more details on data type comparison and implicit casting
More:
SELECT * FROM TABLE_A WHERE COLUMN_1='18'; I think the 18 is already character data, so it becomes '18  ' (note the 2 spaces after 18) and is compared to '0018'.
SELECT * FROM TABLE_A WHERE COLUMN_1=18; column_1 gets cast to numeric, so 18=18.
SELECT * FROM TABLE_A WHERE COLUMN_1='0018'; column_1 is already a char(4), so '0018' = '0018'.

Regex to split values in PostgreSQL

I have a list of values coming from a PGSQL database that looks something like this:
198
199
1S
2
20
997
998
999
C1
C10
A
I'm looking to parse this field a bit into individual components, which I assume would take two regexp_replace function uses in my SQL. Essentially, any non-numeric character that appears before numeric ones needs to be returned for one column, and the other column would show all non-numeric characters appearing AFTER numeric ones.
The above list would then be split into this layout as the result from PG:
I have created a function that strips out the non-numeric characters (the last column) and casts it as an Integer, but I can't figure out the regex to return the string values prior to the number, or those found after the number.
All I could come up with so far, with my next to non-existent regex knowledge, was this: regexp_replace(fieldname, '[^A-Z]+', '', 'g'), which just strips out anything not A-Z, but I can't get it to work with strings before numeric values, or after them.
For extracting the characters before the digits:
regexp_replace(fieldname, '\d.*$', '')
For extracting the characters after the digits:
regexp_replace(fieldname, '^([^\d]*\d*)', '')
Note that:
-If there are no digits, the first will return the original value and the second an empty string. This way the concatenation still equals the original value in that case.
-The concatenation of the three parts will not return the original if there are non-numeric characters surrounded by digits: those will be lost.
This also works for any non-alphanumeric characters such as #, [, ! etc.
Final SQL
select
fieldname as original,
regexp_replace(fieldname, '\d.*$', '') as before_s,
regexp_replace(fieldname, '^([^\d]*\d*)', '') as after_s,
cast(nullif(regexp_replace(fieldname, '[^\d]', '', 'g'), '') as integer) as number
from mytable;
See fiddle.
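For readers without a PostgreSQL instance at hand, the same three patterns can be checked with Python's re module. This is a sketch under the assumption that the patterns behave the same way in both regex engines for simple single-line values like those in the question; split_value is a hypothetical helper name.

```python
import re


def split_value(value: str):
    """Return (before, number, after) using the same patterns as the SQL above."""
    # Everything from the first digit onwards is removed: what remains is the prefix.
    before = re.sub(r"\d.*$", "", value, count=1)
    # The leading non-digits and the first run of digits are removed: what remains is the suffix.
    after = re.sub(r"^([^\d]*\d*)", "", value, count=1)
    # All non-digits are removed; an empty result maps to None (like NULLIF in the SQL).
    digits = re.sub(r"[^\d]", "", value)
    number = int(digits) if digits else None
    return before, number, after


if __name__ == "__main__":
    for sample in ["198", "1S", "2", "C1", "C10", "A"]:
        print(sample, "->", split_value(sample))
```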
This answer relies on information you delivered, which is
Essentially, any non-numeric character that appears before numeric
ones needs to be returned for one column, and the other column would
show all non-numeric characters appearing AFTER numeric ones.
Everything non-numeric before a numeric value goes into the first column.
Everything non-numeric after a numeric value goes into the second column.
So the assumption is that every value has a numeric part in it.
select
val,
regexp_matches(val,'([a-zA-Z]*)\d+') AS before_numeric,
regexp_matches(val,'\d+([a-zA-Z]*)') AS after_numeric
from
val;
Attached SQLFiddle for a preview.

Pad numbers with leading zeros in an Access query

I have a column of numbers between 0 - 6 digits long. For those with fewer than 6 digits I need to pad with zeros so that they are all 6 digits, e.g. 12563 becomes 012563 and 23 becomes 000023. Can someone recommend a solution?
Probably the easiest way to pad numbers with leading zeros would be to use the Format() function, as in
Format(fieldName, "000000")
If you're searching on this (like for PIN numbers, where '12' would be represented as '000012'), here's an example using Gord's correct answer:
SELECT CStr(Format(fieldName,"000000")) FROM table WHERE CStr(Format(fieldName,"000000"))="000012";
I had a similar issue. I couldn't change the field on the actual file because it was a split database and it had to be changed on the data source (Database_be). I went to the data source and made the change from Number to Short Text to all tables and that was it... Like magic!!
Try:
Update TABLE set DIGITS = string(6- len(DIGITS),"0")
TABLE is the table where your numbers are stored.
DIGITS is the field that contains your numbers.
The above does NOT work.
Corrected version:
Update TABLE set DIGITS = string(6- len(DIGITS),"0")&DIGITS
The number '6' can be altered for whatever the total length of your field.
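To see why the first version fails, here is a small Python model of both expressions. It only mirrors the string arithmetic; the function names are hypothetical, and Access may behave differently from Python when the value is already longer than the target width.

```python
def pad_broken(digits: str, width: int = 6) -> str:
    # Mirrors string(6 - len(DIGITS), "0"): only the zeros are kept, the value is lost.
    return "0" * (width - len(digits))


def pad_fixed(digits: str, width: int = 6) -> str:
    # Mirrors string(6 - len(DIGITS), "0") & DIGITS: the zeros followed by the value.
    return "0" * (width - len(digits)) + digits


if __name__ == "__main__":
    print(pad_broken("23"))  # zeros only
    print(pad_fixed("23"))   # zeros plus the original digits
```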

SQL Select by condition on an integer field

I have an integer column in my table. It is product id and has values like
112233001
112233002
113311001
225577001
This numbering (AABBCCDDD) is formed of 4 parts:
AA : first level category
BB : second level category
CC : third level category
DDD : counter
I want to check condition in my SELECT statement to select rows that for example have BB = 33 and AA = 11
Please help
Would this suffice:
select x from table where field >= 113300000 and field < 113400000
SELECT * FROM YOURTABLE
WHERE
substr(PRODUCT_ID, 3, 2)='33'
AND
substr(PRODUCT_ID, 1, 2)='11'
or
SELECT * FROM YOURTABLE
WHERE
PRODUCT_ID LIKE '1133%'
And yes, in short, you have to convert to a string.
reference of substr
Purpose
The SUBSTR functions return a portion of char, beginning at character position, substring_length characters long. SUBSTR calculates lengths using characters as defined by the input character set. SUBSTRB uses bytes instead of characters. SUBSTRC uses Unicode complete characters. SUBSTR2 uses UCS2 code points. SUBSTR4 uses UCS4 code points.
If position is 0, then it is treated as 1.
If position is positive, then Oracle Database counts from the beginning of char to find the first character.
If position is negative, then Oracle counts backward from the end of char.
If substring_length is omitted, then Oracle returns all characters to the end of char. If substring_length is less than 1, then Oracle returns null.
char can be any of the datatypes CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB. Both position and substring_length must be of datatype NUMBER, or any datatype that can be implicitly converted to NUMBER, and must resolve to an integer. The return value is the same datatype as char. Floating-point numbers passed as arguments to SUBSTR are automatically converted to integers.
Select field from table where substr(field, position, substring_length) = value
This seems like it could work. Otherwise you may have to cast them as strings and parse the values out that you need which would make your queries much slower.
SELECT *
FROM table t
WHERE t.field >= 113300000
AND t.field < 113400000
You need to use the _ wildcard character:
SELECT *
FROM TABLE
WHERE
FIELD LIKE '1133_____'
Here, each _ stands for one character, so you need one _ per remaining digit to keep the length the same.
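The approaches in this thread can be compared side by side on the sample ids from the question. The sketch below uses SQLite via Python's sqlite3 as a stand-in database; the table name products and the explicit CAST to TEXT are assumptions for illustration, and other databases may convert the integer implicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_id INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?)",
    [(112233001,), (112233002,), (113311001,), (225577001,)],
)

# Three ways to select AA = 11 and BB = 33.
QUERIES = {
    # Pure integer range: no string conversion needed.
    "range": "SELECT product_id FROM products "
             "WHERE product_id >= 113300000 AND product_id < 113400000",
    # Substring of the id converted to text.
    "substr": "SELECT product_id FROM products "
              "WHERE substr(CAST(product_id AS TEXT), 1, 2) = '11' "
              "AND substr(CAST(product_id AS TEXT), 3, 2) = '33'",
    # Fixed-length LIKE pattern: one _ per remaining digit.
    "like": "SELECT product_id FROM products "
            "WHERE CAST(product_id AS TEXT) LIKE '1133_____'",
}

results = {name: [row[0] for row in conn.execute(sql)] for name, sql in QUERIES.items()}

if __name__ == "__main__":
    for name, rows in results.items():
        print(name, rows)
```

The range version is the only one that compares the integer column directly, which matches the remark above that casting to strings may make queries slower.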

sql query for alphanumeric ID in hex

I want to be able to differentiate between a string that is alphanumeric and a string that is in hex format.
My current query is:
<columnName> LIKE '?_____=' + REPLICATE('[0-9A-Fa-f]',16)
I found this method of searching for hex IDs online and I thought it was working. However, after getting a significantly larger sample size I can see a high false positive rate in my results. The problem is that this gives me all the results I do want, but it also gives me a bunch of results I don't care about. For example:
I want to see:
<url>.php?mains=d7ad916d1c0396ff
but I don't want to see:
<url>.php?mblID=2007012422060265
The difference between the 2 strings is that in the second one the 16 characters at the end are all numeric and not a hex ID. What are some ways you guys use to limit the results to hex IDs only? Thanks in advance.
UPDATE:
Juergen brought up a good point: the second number could be a hex value too. Not all hex numbers contain [a-f]. I would like to rephrase the question to state that I am looking for an ID with both letters and numbers in it, not just numbers.
The simplest way is just to add a separate clause for that restriction:
<columnName> LIKE '?_____=' + REPLICATE('[0-9A-Fa-f]',16)
AND <columnName> NOT LIKE '?_____=' + REPLICATE('[0-9]',16)
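The effect of combining the two clauses can be checked outside the database. The following Python sketch translates the two LIKE patterns into regular expressions (each _ becomes a single arbitrary character, each bracket class is repeated 16 times). The function name is hypothetical, and the sample values are assumed to contain only the query-string part, since the LIKE pattern has no leading wildcard.

```python
import re

# '?_____=' followed by 16 hex characters, matching the whole value.
HEX_ID = re.compile(r"\?.{5}=[0-9A-Fa-f]{16}", re.DOTALL)
# Same shape, but the 16 trailing characters are digits only.
ALL_DIGITS = re.compile(r"\?.{5}=[0-9]{16}", re.DOTALL)


def is_non_numeric_hex_id(value: str) -> bool:
    """True when value matches the hex pattern but not the digits-only pattern."""
    return HEX_ID.fullmatch(value) is not None and ALL_DIGITS.fullmatch(value) is None


if __name__ == "__main__":
    for sample in ["?mains=d7ad916d1c0396ff", "?mblID=2007012422060265"]:
        print(sample, "->", is_non_numeric_hex_id(sample))
```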
It should be fairly simple to determine if a string contains only numbers...
Setting up a test table:
CREATE TABLE #Temp (Data char(32) not null)
INSERT #Temp
values ('<url>.php?mains=d7ad916d1c0396ff')
,('<url>.php?mblID=2007012422060265 ')
Write a query:
SELECT
right(Data, 16) StringToCheck
,isnumeric(right(Data, 16)) IsNumeric
from #Temp
Get results:
StringToCheck IsNumeric
d7ad916d1c0396ff 0
2007012422060265 1
So, if the IsNumeric function returns 0, it could be a hex string.
This makes several assumptions:
The rightmost 16 characters are what you want to check
You only ever hit 16 characters. I don't know when the string would get too long to check.
A non-numeric character means hex. Any chance of "Q" or "~" being embedded in the string?