Teradata - joining char with varchar - sql

I have two tables, A and B, and I want to join them on A.col_1 = B.col_2. col_1 has datatype VARCHAR(35) while col_2 has datatype CHAR(35). The join condition below returns no records, which means the two tables are not being joined. Both col_1 and col_2 usually hold 8 to 11 digits. My understanding is that, even though I used "LENGTH(B.col_2 )-1", trailing spaces should not be a problem as long as the values of col_1 and col_2 are the same.
What is causing this issue?
ON A.col_1 =SUBSTR(B.col_2 ,1,LENGTH(B.col_2 )-1)
Thanks!

I guess B is the table with the CHAR column.
This shows what happens:
select char_length(cast('abc' as char(10)));
10
Your substr does not take the real length of the char string but the padded length, therefore you get the original string minus 1 space.
To solve the issue, use
SUBSTR(B.col_2, 1, LENGTH(CAST(B.col_2 AS VARCHAR(35))) - 1)
or
SUBSTR(B.col_2, 1, LENGTH(RTRIM(B.col_2)) - 1)
... and yes, char/varchar does not matter for comparison
select 1 where cast('abc' as varchar(10)) = cast('abc' as char(10))
1
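The padding behaviour described above can be modelled outside the database. The following is a minimal Python sketch, not Teradata code: `to_char` is a hypothetical helper that imitates fixed-width CHAR(n) storage by right-padding with spaces.

```python
def to_char(value: str, width: int) -> str:
    """Imitate CHAR(width) storage: right-pad the value with spaces."""
    return value.ljust(width)


stored = to_char("abc", 10)

# The padded length is what a length function sees for a CHAR value.
padded_length = len(stored)

# Trimming trailing spaces first gives the length of the real content.
trimmed_length = len(stored.rstrip(" "))

# Cutting at (padded length - 1) only removes one pad character.
cut_on_padded = stored[: padded_length - 1]

print(padded_length, trimmed_length, repr(cut_on_padded))
```

The point of the sketch is only that a length taken from the padded value says nothing about where the real content ends.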

Related

SQL SERVER 2019 - LEN(FLOOR(CAST([value] AS FLOAT))) defaulting to 12

I am running the following code to get the length of a value before the decimal place:
SELECT LEN(FLOOR(CAST([VALUE] AS FLOAT))) FROM TABLE1 WHERE VALUE2 <> 'B'
The [VALUE] column in TABLE1 is of type nvarchar(30) hence the cast. The column also contains some non-numeric values but these are filtered out by the WHERE clause as they all have a 'B' value for VALUE2.
The code works as expected and returns '6' for values with 6 digits such as '123456.123'. It also works correctly for values with less than 6 digits. However, the code simply returns '12' for any value with greater than 6 digits such as '12345678'.
I've done some googling and can't seem to find a reason for this? Any explanations / alterations / alternatives would be much appreciated!
The LEN() function expects a string expression, so the float value is implicitly converted to a string using scientific notation. The following statement demonstrates the issue and the unexpected result:
SELECT
LEN(FLOOR(CAST([VALUE] AS FLOAT))),
FLOOR(CAST([VALUE] AS FLOAT)),
CONVERT(varchar(50), FLOOR(CAST([VALUE] AS FLOAT)))
FROM (VALUES
(N'12345678')
) TABLE1 ([VALUE])
Result:
12 12345678 1.23457e+007
A possible solution, without using an integer (and/or float) conversion, is the following statement:
SELECT CHARINDEX(N'.', CONCAT([VALUE], N'.')) - 1
FROM (VALUES
(NULL),
(N'12345678'),
(N'123456.123'),
(N'99999.923')
) TABLE1 ([VALUE])
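The same idea (append a decimal point, then find the first one) can be checked quickly in Python. This is a sketch of the logic only; `digits_before_point` is a hypothetical name, and `None` stands in for SQL NULL, which CONCAT treats as an empty string.

```python
from typing import Optional


def digits_before_point(value: Optional[str]) -> int:
    """Count the characters before the first '.' in a numeric string."""
    # CONCAT treats NULL as an empty string, so None becomes "".
    text = (value or "") + "."
    # CHARINDEX is 1-based and the SQL subtracts 1; str.index is
    # already 0-based, so it yields the same number directly.
    return text.index(".")


for sample in (None, "12345678", "123456.123", "99999.923"):
    print(sample, digits_before_point(sample))
```

Because the value never leaves string form, no float conversion (and no scientific notation) is involved.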
I am running the following code to get the length of a value before the decimal place:
This value is called the log base 10 plus 1 -- at least for numbers greater than 1. So how about using:
floor(log10(value)) + 1
You can tweak this for values less than 1 (including negative values) if that is needed.
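As a rough check of the logarithm approach, here is a small Python sketch. `integer_digit_count` is a hypothetical name, and the sketch deliberately handles only values of 1 or more, as the answer notes.

```python
import math


def integer_digit_count(value: float) -> int:
    """Number of digits before the decimal point, for values >= 1."""
    if value < 1:
        raise ValueError("this sketch only handles values >= 1")
    return math.floor(math.log10(value)) + 1


for sample in (12345678, 123456.123, 99999.923, 1):
    print(sample, integer_digit_count(sample))
```

Floating-point rounding can matter for values extremely close to a power of ten, so a string-based method may be safer when the input is already text.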

How to fetch only a part of string

I have a column which has inconsistent data. The column named ID and it can have values such as
0897546321
ABC,0876455321
ABC,XYZ,0873647773
ABC,
99756
test only
The SQL query should fetch only Ids which are of 10 digit in length, should begin with a 08 , should be not null and should not contain all characters. And for those values, which have both digits and characters such as ABC,XYZ,0873647773, it should only fetch the 0873647773 . In these kind of values, nothing is fixed, in place of ABC, XYZ , it can be anything and can be of any length.
The column Id is of varchar type.
My try: I tried the following query
select id
from table
where id is not null
and id not like '%[^0-9]%'
and id like '[08]%[0-9]'
and len(id)=10
I am still not sure how should I deal with values like ABC,XYZ,0873647773
P.S - I have no control over the database. I can't change its values.
SQL Server generally has poor support for regular expressions, but in this case a judicious use of PATINDEX is viable:
SELECT SUBSTRING(id, PATINDEX('%,08[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9],%', ',' + id + ','), 10) AS number
FROM yourTable
WHERE ',' + id + ',' LIKE '%,08[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9],%';
If you normalise your data and split the delimited values into parts, you can achieve this somewhat more easily:
SELECT SS.value
FROM dbo.YourTable YT
CROSS APPLY STRING_SPLIT(YT.YourColumn,',') SS
WHERE LEN(SS.value) = 10
AND SS.value LIKE '08%'
AND SS.value NOT LIKE '%[^0-9]%';
If you're on an older version of SQL Server, you'll have to use an alternative String Splitter method (such as a XML splitter or user defined inline table-value function); there are plenty of examples on these already on Stack Overflow.
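The split-then-filter logic can be tried on the sample values from the question. This is a Python sketch of the logic, not T-SQL; `extract_ids` and `ID_PATTERN` are names made up for the illustration.

```python
import re

# Exactly ten digits, starting with 08.
ID_PATTERN = re.compile(r"08[0-9]{8}")


def extract_ids(raw_values):
    """Split each value on commas and keep only the parts that are valid IDs."""
    found = []
    for raw in raw_values:
        if raw is None:
            continue  # skip NULLs
        for part in raw.split(","):
            if ID_PATTERN.fullmatch(part):
                found.append(part)
    return found


sample = [
    "0897546321",
    "ABC,0876455321",
    "ABC,XYZ,0873647773",
    "ABC,",
    "99756",
    "test only",
]
print(extract_ids(sample))
```

Only the three 10-digit values beginning with 08 survive; the text fragments, the short number and the empty part after the trailing comma are dropped.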

Why ISNumeric() Transact-SQL function treats some var-chars as Int [duplicate]

This question already has answers here:
CAST and IsNumeric
(11 answers)
Closed 4 years ago.
SELECT some_column
FROM some_table
WHERE some_column = '3.'
Return row/rows
SELECT some_column
FROM some_table
WHERE ISNUMERIC(some_column) = 0
AND some_column IS NOT NULL
AND some_column <> ''
Does not return any non-numeric rows, although there is a row whose column value is '3.'.
Am I missing something? Please advise.
I don't understand the question. ISNUMERIC('3.') returns 1. So it would not be returned by the second query. And, presumably, no other rows would either.
Perhaps you really intend: somecolumn not like '%[^0-9]%'. This will guarantee that somecolumn has only the digits from 0-9.
In SQL Server 2012+, you can also use try_convert(int, somecolumn) is not null.
ISNUMERIC() has some flaws. It can return True/1 for values that are clearly not numbers.
SELECT
ISNUMERIC('.'),
ISNUMERIC('$')
If this is causing you issues, try using TRY_PARSE()
SELECT
TRY_PARSE('2' AS INT)
You can use this and filter for non-null results.
SELECT some_column
FROM some_table
WHERE TRY_PARSE(some_column AS INT) IS NOT NULL
ISNUMERIC is a legacy function, and I don't like it personally. It does exactly what it's supposed to do, which is usually not what you need ("Is Numeric" is a very subjective question in Computer Science). I recently wrote this query to clarify things for some co-workers:
SELECT string = x, [isnumeric says...] = ISNUMERIC(x)
FROM (VALUES ('1,2,3,,'),(',,,,,,,,,,'),(N'﹩'),(N'$'),(N'¢'),('12,0'),(N'52,3,1.25'),
(N'-4,1'),(N'56.'),(N'5D105'),('1E1'),('\4'),(''),(N'\'),(N'₤'),(N'€')) x(x);
Returns:
string isnumeric says...
---------- -----------------
1,2,3,, 1
,,,,,,,,,, 1
﹩ 0
$ 1
¢ 0
12,0 1
52,3,1.25 1
-4,1 1
56. 1
5D105 1
1E1 1
\4 1
0
\ 1
₤ 1
€ 1
TRY_CAST or TRY_CONVERT, or WHERE somecolumn not like '%[^0-9]%' as Gordon said, could be a good alternative.
For performance reasons it might not be a bad idea to pre-aggregate, persist and index the column by adding a new computed column. E.g. something like
ALTER TABLE <your table>
ADD isGoodNumber AS (ABS(SIGN(PATINDEX('%[^0-9]%',<your column>))-1))
This would return a 1 for rows only containing digits or a 0 otherwise. You can then index isGoodNumber (you pick a better name) for better performance.
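The digits-only test used in these answers (`NOT LIKE '%[^0-9]%'`, or a PATINDEX result of 0) can be modelled with a regular expression. The following Python sketch only illustrates the logic; `only_digits` is a hypothetical name.

```python
import re


def only_digits(value: str) -> bool:
    """True when the value contains no character outside 0-9."""
    return re.search(r"[^0-9]", value) is None


for sample in ("123", "3.", "12,0", "1E1", "$", ""):
    print(repr(sample), only_digits(sample))
```

Note that the empty string passes this test because it contains no offending character, which is why the question's extra `<> ''` condition is still worth keeping.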

SQL - Changing data type of an alphanumeric column

I'm on Teradata. I have an ID column that looks like this:
23
34
W7
007
021
90
GS8
I want to convert the numbers to numeric so the 007 should be 7 and 021 be 21. When a number is stored as a string, I usually do column * 1 to convert to numeric but in this case it gives me a bad character error since there are letters in there.
How would I do this in a select statement within a query?
Assuming that numeric values always start with a number, then something like this should work:
update t
set col = (case when substr(col, 1, 1) between '0' and '9'
then cast(cast(col as int) as varchar(255))
else col
end);
Or, you can forget the conversion and do:
update t
set col = trim(leading '0' from col);
Note: both of these assume that if the first character is a digit then the whole string consists of digits. The second also assumes that no value is all zeroes; for such a value it returns the empty string.
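The two approaches above can be compared on the sample IDs with a small Python sketch. This models the logic only, not Teradata syntax; `normalize_id` and `strip_leading_zeros` are hypothetical names.

```python
def normalize_id(value: str) -> str:
    """Model of the CASE version: convert only when the first character is a digit."""
    if "0" <= value[:1] <= "9":
        # Like the cast in SQL, int() fails if later characters are not digits.
        return str(int(value))
    return value


def strip_leading_zeros(value: str) -> str:
    """Model of TRIM(LEADING '0' FROM col)."""
    return value.lstrip("0")


for sample in ("23", "34", "W7", "007", "021", "90", "GS8"):
    print(sample, normalize_id(sample), strip_leading_zeros(sample))
```

The only difference on valid input shows up for an all-zero value such as "000": the first version yields "0", while the trim yields an empty string.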
Simply use TO_NUMBER(col) which returns NULL when the cast fails.

Finding rows that don't contain numeric data in Oracle

I am trying to locate some problematic records in a very large Oracle table. The column should contain all numeric data even though it is a varchar2 column. I need to find the records which don't contain numeric data (The to_number(col_name) function throws an error when I try to call it on this column).
I was thinking you could use a regexp_like condition and use the regular expression to find any non-numerics. I hope this might help?!
SELECT * FROM table_with_column_to_search WHERE REGEXP_LIKE(varchar_col_with_non_numerics, '[^0-9]+');
To get an indicator:
DECODE( TRANSLATE(your_number,' 0123456789',' '), NULL, 'number','contains char')
e.g.
SQL> select DECODE( TRANSLATE('12345zzz_not_numberee',' 0123456789',' '), NULL, 'number','contains char')
2 from dual
3 /
"contains char"
and
SQL> select DECODE( TRANSLATE('12345',' 0123456789',' '), NULL, 'number','contains char')
2 from dual
3 /
"number"
and
SQL> select DECODE( TRANSLATE('123405',' 0123456789',' '), NULL, 'number','contains char')
2 from dual
3 /
"number"
Oracle 11g has regular expressions so you could use this to get the actual number:
SQL> SELECT colA
2 FROM t1
3 WHERE REGEXP_LIKE(colA, '^[[:digit:]]+$');
COLA
----------
47845
48543
12
...
If there is a non-numeric value like '23g' it will just be ignored.
In contrast to SGB's answer, I prefer doing the regexp defining the actual format of my data and negating that. This allows me to define values like $DDD,DDD,DDD.DD
In the OP's simple scenario, it would look like
SELECT *
FROM table_with_column_to_search
WHERE NOT REGEXP_LIKE(varchar_col_with_non_numerics, '^[0-9]+$');
which finds every value that is not a non-negative integer. If you want to accept negative integers as well, it's an easy change: just add an optional leading minus.
SELECT *
FROM table_with_column_to_search
WHERE NOT REGEXP_LIKE(varchar_col_with_non_numerics, '^-?[0-9]+$');
accepting floating points...
SELECT *
FROM table_with_column_to_search
WHERE NOT REGEXP_LIKE(varchar_col_with_non_numerics, '^-?[0-9]+(\.[0-9]+)?$');
Same goes further with any format. Basically, you will generally already have the formats to validate input data, so when you will desire to find data that does not match that format ... it's simpler to negate that format than come up with another one; which in case of SGB's approach would be a bit tricky to do if you want more than just positive integers.
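The "define the valid format, then negate it" idea carries over directly to other regex engines. The following Python sketch uses the same pattern as the last query; `is_problem_value` and the sample rows are made up for the illustration.

```python
import re

# Optional minus sign, digits, optional fractional part.
NUMBER_FORMAT = re.compile(r"-?[0-9]+(\.[0-9]+)?")


def is_problem_value(value: str) -> bool:
    """True when the whole value does NOT match the expected number format."""
    return NUMBER_FORMAT.fullmatch(value) is None


rows = ["123", "-45", "3.14", "1,23", "RV12P2000", "12."]
print([row for row in rows if is_problem_value(row)])
```

Extending the accepted format (for example to allow thousands separators) only requires changing the one pattern, while the negation stays the same.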
Use this
SELECT *
FROM TableToSearch
WHERE NOT REGEXP_LIKE(ColumnToSearch, '^-?[0-9]+(\.[0-9]+)?$');
After doing some testing, I came up with this solution; let me know if it helps.
Add the two conditions below to your query and it will find the records which don't contain numeric data
and REGEXP_LIKE(<column_name>, '\D') -- this selects non numeric data
and not REGEXP_LIKE(column_name,'^[-]{1}\d{1}') -- this filters out negative(-) values
Starting with Oracle 12.2, the function to_number has an optional ON CONVERSION ERROR clause that can catch the exception and provide a default value.
This can be used to test for numeric values. Simply return NULL when the conversion fails and filter for all non-NULL values.
Example
with num as (
select '123' vc_col from dual union all
select '1,23' from dual union all
select 'RV12P2000' from dual union all
select null from dual)
select
vc_col
from num
where /* filter numbers */
vc_col is not null and
to_number(vc_col DEFAULT NULL ON CONVERSION ERROR) is not null
;
VC_COL
---------
123
1,23
From http://www.dba-oracle.com/t_isnumeric.htm
LENGTH(TRIM(TRANSLATE(<char_column>, ' +-.0123456789', ' '))) is null
If there is anything left in the string after the TRIM it must be non-numeric characters.
I've found this useful:
select translate('your string','_0123456789','_') from dual
If the result is NULL, it's numeric (ignoring floating point numbers.)
However, I'm a bit baffled why the underscore is needed. Without it the following also returns null:
select translate('s123','0123456789', '') from dual
There is also one of my favorite tricks - not perfect if the string contains stuff like "*" or "#":
SELECT 'is a number' FROM dual WHERE UPPER('123') = LOWER('123')
After doing some testing, building upon the suggestions in the previous answers, there seem to be two usable solutions.
Method 1 is fastest, but less powerful in terms of matching more complex patterns.
Method 2 is more flexible, but slower.
Method 1 - fastest
I've tested this method on a table with 1 million rows.
It seems to be 3.8 times faster than the regex solutions.
The 0-replacement solves the issue that 0 is mapped to a space, and does not seem to slow down the query.
SELECT *
FROM <table>
WHERE TRANSLATE(replace(<char_column>,'0',''),'0123456789',' ') IS NOT NULL;
Method 2 - slower, but more flexible
I've compared the speed of putting the negation inside or outside the regex statement. Both are equally slower than the translate-solution. As a result, #ciuly's approach seems most sensible when using regex.
SELECT *
FROM <table>
WHERE NOT REGEXP_LIKE(<char_column>, '^[0-9]+$');
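The difference between the two methods can be illustrated in Python. This is a sketch of the logic, not a benchmark and not Oracle code: Method 1 is modelled as "delete every digit and see whether anything is left" (Oracle treats an empty string as NULL, so "IS NOT NULL" corresponds to a non-empty remainder), and the function names are hypothetical.

```python
import re

# Translation table that deletes the characters 0-9.
DELETE_DIGITS = str.maketrans("", "", "0123456789")


def non_numeric_by_translate(value: str) -> bool:
    """Method 1 model: something remains after all digits are removed."""
    return value.translate(DELETE_DIGITS) != ""


def non_numeric_by_regex(value: str) -> bool:
    """Method 2 model: the value is not made up solely of digits."""
    return re.fullmatch(r"[0-9]+", value) is None


for sample in ("12305", "000", "12a", "-7"):
    print(sample, non_numeric_by_translate(sample), non_numeric_by_regex(sample))
```

For non-empty values the two checks agree; the regex version is simply easier to extend to signs, decimals or other formats.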
You can use this one check:
create or replace function to_n(c varchar2) return number is
begin return to_number(c);
exception when others then return -123456;
end;
select id, n from t where to_n(n) = -123456;
I tried an ORDER BY on the problematic column and found the offending rows that way.
SELECT
D.UNIT_CODE,
D.CUATM,
D.CAPITOL,
D.RIND,
D.COL1 AS COL1
FROM
VW_DATA_ALL_GC D
WHERE
(D.PERIOADA IN (:pPERIOADA)) AND
(D.FORM = 62)
AND D.COL1 IS NOT NULL
-- AND REGEXP_LIKE (D.COL1, '[[:alpha:]]')
-- AND REGEXP_LIKE(D.COL1, '[[:digit:]]')
--AND REGEXP_LIKE(TO_CHAR(D.COL1), '[^0-9]+')
GROUP BY
D.UNIT_CODE,
D.CUATM,
D.CAPITOL,
D.RIND ,
D.COL1
ORDER BY
D.COL1