Hive - how to check if numeric columns have number/decimal values? - sql

I am trying to write a Hive query that takes multiple numeric column names and checks whether each column has numeric values. If the column has numeric values, the output should be (column name, true); if the field has NULL or some string value, the output should be (column name, false).
SELECT distinct (test_nr1,test_nr2) FROM test.abc WHERE (test_nr1,test_nr2) not like '%[^0-9]%';
SELECT distinct test_nr1,test_nr2 from test.abc limit 2;
test_nr1 test_nr2
NULL 81432269
NULL 88868060
The desired output should be:
test_nr1 false
test_nr2 true
Since test_nr1 is a decimal field and it has NULL values, it should output false.
Appreciate valuable suggestions.

You can use the cast function. It returns NULL when the value cannot be cast to a numeric type.
For example:
select case when cast('23ccc' as double) is null then false else true end as IsNumber;
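To get one (column name, true/false) row per column, as the question asks, the cast check can be combined with an aggregate. This is only a sketch in Hive SQL, and it assumes that "has numeric values" means "at least one row casts to a number"; the aliases column_name and has_numeric are made up for illustration. A top-level UNION ALL needs a reasonably recent Hive version; on older ones it may have to be wrapped in a subquery.

```sql
-- Sketch only: count(expr) ignores NULLs, so the count is 0
-- when every value in the column is NULL or fails the cast.
SELECT 'test_nr1' AS column_name,
       count(cast(test_nr1 AS double)) > 0 AS has_numeric
FROM test.abc
UNION ALL
SELECT 'test_nr2' AS column_name,
       count(cast(test_nr2 AS double)) > 0 AS has_numeric
FROM test.abc;
```

If the requirement is instead that every row must be numeric, the comparison could be changed to count(cast(col AS double)) = count(*).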

You're trying to use character-class pattern-matching syntax with LIKE here, and it doesn't work in every SQL implementation, IIRC. Regexp matching, however, works in most, if not all, SQL implementations.
Since you're using Hive, this should do it:
SELECT 'test_nr1', cast(test_nr1 AS string) RLIKE '^-?[0-9]+([.][0-9]+)?$', 'test_nr2', cast(test_nr2 AS string) RLIKE '^-?[0-9]+([.][0-9]+)?$' FROM test.abc;
Keep in mind, though, that regexp matching can be slow on large tables.

How to query empty string in postgresql?

I have a table in which a column has empty fields. When I try to run a simple query like the one below, it does not
return the data.
select * from table1 where "colunm1" = ''
But the two queries below return data:
1,
select * from table1
where coalesce("colunm1", '') = ''
2,
select * from table1
where "colunm1" is null
Can someone tell me the reason?
TIA
You have described the behavior of a column that is NULL. NULL is not the same as an empty string.
It fails any equality comparison. However, you can use is null or (less preferably) coalesce().
The best-known database that treats an empty string like a NULL value is Oracle.
Logically, for a DBMS, that would be wrong.
Relational DBMSs are all about:
set theory
Boolean algebra
A NULL value is a value that we don't know. A string literal of '' is a string whose value we know: it is a string with no characters, with a length of 0. For a NULL string, we don't know how long it is, or how many characters it contains, or which ones.
So it is only logical that:
the comparison '' = '' will evaluate to TRUE
the comparison NULL = NULL will evaluate to NULL (unknown) rather than TRUE, as any comparison with a NULL value evaluates to NULL, and a WHERE clause only keeps rows for which the condition is TRUE.
And functions like COALESCE() or NVL(), IFNULL(), ISNULL() will return the first parameter if it does not contain a NULL value. That is consistent.
Except in Oracle
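The difference can be seen without touching the real table. The following is a small PostgreSQL sketch; the inline VALUES list and the names t and col are made up for illustration and stand in for table1 and "colunm1" from the question.

```sql
-- Hypothetical sample rows: an empty string, a NULL and an ordinary string.
SELECT col,
       col = ''               AS equals_empty,    -- true, NULL, false
       col IS NULL            AS is_null,         -- false, true, false
       coalesce(col, '') = '' AS coalesce_empty   -- true, true, false
FROM (VALUES (''), (NULL), ('abc')) AS t(col);
```

For the NULL row the plain comparison yields NULL rather than true, which is why the first query in the question returns nothing while the IS NULL and coalesce() variants do.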

Check if column value is Numeric. SSIS

I have a column with datatype of varchar. I would like to replace all the values that are not numeric with NULL.
So for example my column can contain a value of MIGB_MGW but also 1352. The current expression I am using with Derived Column Transformation Editor is:
(DT_I4)kbup == (DT_I4)kbup ? 1 : 0
But of course this replaces all the values I want to keep with 1. What expression would I use to keep the numeric values? (1352 in this example)
If you want a null of varchar type, you can use NULL(DT_STR). For a DT_I4 you can use NULL(DT_I4) etc.
You can then use (DT_I4)kbup in place of your 1 to return the original varchar value that you want to keep, converted to a DT_I4:
(DT_I4)kbup == (DT_I4)kbup ? (DT_I4)kbup : NULL(DT_I4)
You could just convert them with a Derived Column and then use the ignore failure option in the Error output.
Use NOT LIKE
SELECT CASE
WHEN col NOT LIKE '%[^0-9]%' THEN col
ELSE NULL
END as Only_Numeric
FROM (VALUES ('MIGB_MGW'),
('1352')) tc(col)
Result :
Only_Numeric
------------
NULL
1352
Another option, on SQL Server 2012+, is Try_Convert()
SELECT Try_Convert(float,col)
FROM (VALUES ('MIGB_MGW'),
('2.6e7'),
('2.6BMW'),
('1352')) tc(col)
Returns
NULL
26000000
NULL
1352

DB2 SQL - How can I display nothing instead of a hyphen when the result of my case statement is NULL?

All,
I'm writing a query that includes a CASE statement which compares two datetime fields. If Date B is > Date A, then I'd like the query to display Date B. However, if Date B is not > Date A, then the user who will be getting the report created by the query wants the column to be blank (in other words, not contain the word 'NULL', not contain a hyphen, not contain a low values date). I've been researching this today but have not come up with a viable solution so thought I'd ask here. This is what I have currently:
CASE
WHEN B.DTE_LNP_LAST > A.DTE_PROC_ACT
THEN B.DTE_LNP_LAST
ELSE ?
END AS "DATE OF DISCONNECT"
If I put NULL where the ? is, then I get a hyphen (-) in my query result. If I omit the Else statement, I also get a hyphen in the query result. ' ' doesn't work at all. Does anyone have any thoughts?
Typically the way nulls are displayed is controlled by the client software used to display query results. If you insist on doing that in SQL, you will need to convert the date to a character string:
CASE
WHEN B.DTE_LNP_LAST > A.DTE_PROC_ACT
THEN VARCHAR_FORMAT(B.DTE_LNP_LAST)
ELSE ''
END AS "DATE OF DISCONNECT"
Replace VARCHAR_FORMAT() with the formatting function available in your DB2 version on your platform, if necessary.
You can use the coalesce function
Coalesce (column, 'text')
If the first value is null, it will be replaced by the second one.

How do I sort a VARCHAR column in PostgreSQL that contains words and numbers?

I need to order a select query by a varchar column, using numerical and text order. The query will be run from a Java program, using JDBC over PostgreSQL.
If I use ORDER BY in the select clause I obtain:
1
11
2
abc
However, I need to obtain:
1
2
11
abc
The problem is that the column can also contain text.
This question is similar (but targeted for SQL Server):
How do I sort a VARCHAR column in SQL server that contains words and numbers?
However, the solution proposed did not work with PostgreSQL.
Thanks in advance, regards,
I had the same problem and the following code solves it:
SELECT ...
FROM table
order by
CASE WHEN column < 'A'
THEN lpad(column, size, '0')
ELSE column
END;
The size variable is the length of the varchar column, e.g. 255 for varying(255).
You can use a regular expression to do this kind of thing:
select THECOL from ...
order by
case
when substring(THECOL from '^\d+$') is null then 9999
else cast(THECOL as integer)
end,
THECOL
First you use regular expression to detect whether the content of the column is a number or not. In this case I use '^\d+$' but you can modify it to suit the situation.
If the regexp doesn't match, return a big number (9999 here, which has to be larger than any number stored in the column) so this row will fall to the bottom of the order.
If the regexp matches, convert the string to number and then sort on that.
After this, sort regularly with the column.
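A variation on the same idea, sketched here for PostgreSQL with the sample values from the question, sorts on the match result itself instead of a sentinel number, so it does not depend on picking a large enough constant. The inline VALUES list and the names t and col are made up for illustration.

```sql
-- Hypothetical sample rows taken from the question.
SELECT col
FROM (VALUES ('11'), ('abc'), ('2'), ('1')) AS t(col)
ORDER BY
    (col ~ '^[0-9]+$') DESC,                           -- numeric strings first (true before false)
    CASE WHEN col ~ '^[0-9]+$' THEN col::bigint END,   -- numeric order among the numbers
    col;                                               -- plain text order for everything else
-- Expected order: 1, 2, 11, abc
```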
I'm not aware of any database having a "natural sort" like the one that exists in PHP. All I've found is various functions:
Natural order sort in Postgres
Comment in the PostgreSQL ORDER BY documentation

Conditionally branching in SQL based on the type of a variable

I'm selecting a value out of a table that can be either an integer or an nvarchar. It's stored as nvarchar. I want to conditionally call a function that will convert this value if it is an integer (that is, if it can be converted into an integer); otherwise I want to select the nvarchar with no conversion.
This is hitting a SQL Server 2005 database.
select case
when T.Value (is integer) then SomeConversionFunction(T.Value)
else T.Value
end as SomeAlias
from SomeTable T
Note that it is the "(is integer)" part that I'm having trouble with. Thanks in advance.
UPDATE
Check the comment on Ian's answer. It explains the why and the what a little better. Thanks to everyone for their thoughts.
select case
when ISNUMERIC(T.Value) = 1 then SomeConversionFunction(T.Value)
else T.Value
end as SomeAlias
from SomeTable T
Also, have you considered using the sql_variant data type?
Each column of a result set can have only one data type. A CASE expression that mixes int and nvarchar branches resolves to int, so you will get an error as soon as a row contains a string:
Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the nvarchar value 'word' to data type int.
try this to see:
create table testing
(
strangevalue nvarchar(10)
)
insert into testing values (1)
insert into testing values ('word')
select * from testing
select
case
when ISNUMERIC(strangevalue)=1 THEN CONVERT(int,strangevalue)
ELSE strangevalue
END
FROM testing
best bet is to return two columns:
select
case
when ISNUMERIC(strangevalue)=1 THEN CONVERT(int,strangevalue)
ELSE NULL
END AS StrangvalueINT
,case
when ISNUMERIC(strangevalue)=1 THEN NULL
ELSE strangevalue
END AS StrangvalueString
FROM testing
or your application can test for numeric and do your special processing.
You can't have a column that is sometimes an integer and sometimes a string. Return the string and check it using int.TryParse() in the client code.
Use ISNUMERIC. However, it also accepts +, - and decimals, so more work is needed.
However, you can't have the columns as both datatypes in one go: you'll need 2 columns.
I'd suggest that you deal with this in your client or use an ISNUMERIC replacement
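As a rough illustration of the point above, the sketch below puts ISNUMERIC next to a stricter digits-only check. The inline VALUES list and the names t, v, isnumeric_result and is_unsigned_integer are made up for the example, and the VALUES table constructor needs SQL Server 2008 or later (on 2005 the rows would have to come from a table or a UNION ALL).

```sql
-- Hypothetical sample values; t and v are illustrative names only.
SELECT v,
       ISNUMERIC(v) AS isnumeric_result,   -- 1 for '1352' and '1.5', but also for a lone '-'
       CASE
           WHEN v <> '' AND v NOT LIKE '%[^0-9]%' THEN 1   -- digits only, at least one character
           ELSE 0
       END AS is_unsigned_integer
FROM (VALUES ('1352'), ('1.5'), ('-'), ('word')) AS t(v);
```

The digits-only pattern deliberately rejects signs and decimal points, so it would need extending if negative integers have to be accepted.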
IsNumeric will get you part of the way there. You can then add some further code to check whether it is an integer
for example:
select top 10
case
when isnumeric(mycolumn) = 1 then
case
when mycolumn not like '%[^0-9]%' then -- digits only (unsigned); convert(int, mycolumn) would raise an error on values such as '1.5'
'integer'
else
'number but not an integer'
end
else
'not a number'
end
from mytable
To clarify some other answers, your SQL statement can't return different data types in one column (it looks like the other answers are saying you can't store different data types in one column, but yours are all string representations).
Therefore, if you use ISNUMERIC or another function, the value will be cast as a string in the table that is returned anyway if there are other strings being selected.
If you are selecting only one value then it could return a string or a number; however, your front-end code will need to be able to handle the different data types.
Just to add to some of the other comments about not being able to return different data types in the same column... Database columns should know what datatype they are holding. If they don't then that should be a BIG red flag that you have a design problem somewhere, which almost guarantees future headaches (like this one).