Concat string columns in Hive

I need to concatenate three columns from my table, say a, b and c. If the length of a column is greater than 0, it should be included in the concatenation, and the result stored as another column d in the format below.
1:a2:b3:c
I have tried the following query, but I am getting NULL as the result and am not sure how to proceed.
select a,b,c,
case when length(a) >0 then '1:'+a else '' end + case when length(b) > 0 then '2:'+b else '' end + case when length(c) > 0 then '3:'+c else '' end AS d
from xyz;
Appreciate the help :)

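Use the concat() function. In Hive, + is an arithmetic operator: its string operands are cast to numbers, and a non-numeric string such as '1:' becomes NULL, which is why your query returns NULL.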
select a,b,c,
concat(
case when length(a)>0 then concat('1:',a) else '' end,
case when length(b)>0 then concat('2:',b) else '' end,
case when length(c)>0 then concat('3:',c) else '' end
) as d
from (--test dataset
select stack(4, 'a','b','c', --all
'','b','c', --one empty
null,'b','c', --null
'','','' --all empty
) as (a,b,c)
)your_data;
Result:
OK
a b c 1:a2:b3:c
b c 2:b3:c
NULL b c 2:b3:c
Time taken: 0.284 seconds, Fetched: 4 row(s)
Note that the last of the four rows is empty.
As of Hive 2.2.0, you can use the || operator instead of concat():
select a,b,c,
case when length(a)>0 then '1:'||a else '' end||
case when length(b)>0 then '2:'||b else '' end||
case when length(c)>0 then '3:'||c else '' end as d
from xyz;
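To see why the query in the question returned NULL, the two behaviours can be compared side by side. This is a small sketch, not output captured from a real cluster. It assumes Hive's usual rules: + is numeric, and concat() returns NULL when any argument is NULL. The column aliases are made up for illustration.

```sql
-- Hive sketch: contrast arithmetic + with string concatenation.
select
  '1:' + 'a'                                        as plus_result,   -- + is numeric; a non-numeric string is expected to become NULL
  concat('1:', 'a')                                 as concat_result, -- plain string concatenation
  concat('1:', cast(null as string))                as concat_null,   -- a NULL argument is expected to make the whole result NULL
  concat('1:', coalesce(cast(null as string), ''))  as concat_safe;   -- coalesce() replaces NULL with '' first
```

The length(x) > 0 test in the answer plays the same protective role as coalesce() here: length(NULL) is NULL, so the CASE falls through to else ''.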

Related

Nested Case Statement - Pulling the wrong value

Here is my syntax/question
,CASE WHEN nullif(ltrim(A),'') IS NOT NULL OR nullif(ltrim(B),'') IS NOT NULL THEN NULL
WHEN nullif(ltrim(C),'') IS NOT NULL THEN (
CASE WHEN nullif(MM,'') IS NOT NULL THEN MM
WHEN NN IS NOT NULL THEN XX
WHEN NN IS NULL THEN concat(UPPER(YY), ' ', UPPER(XX))
END
)
WHEN nullif(ltrim(D),'') IS NOT NULL OR nullif(ltrim(E),'') IS NOT NULL THEN concat(UPPER(XX), ' ', UPPER(YY))
ELSE ' '
END as 'Data_Item'
We have a series of conditions to evaluate:
If field A or field B is not null, we return NULL.
If field C is not null, we evaluate a nested CASE statement, and this is where something is off in my code.
Within that nested CASE, if MM is not null we should return MM (this is what I should be getting, but I'm not).
Otherwise we continue: if NN is not null, we return XX;
if NN is null, we concatenate YY, a space, and XX.
Lastly, if field D is not null or field E is not null, we concatenate XX, a space, and YY.
Otherwise we simply return a space, and the CASE ends.
In short, it's a semi-complex series of CASE conditions: if the second scenario is true, we have to evaluate three or four nested conditions until we find one that is true.
For whatever reason, my data always returns the last branch (D is true) rather than the nested one.
This is your CASE simplified:
CASE
WHEN A <> '' OR B <> ''
THEN NULL
WHEN C <> ''
THEN CASE
WHEN MM <> '' THEN MM
WHEN NN IS NOT NULL THEN XX
WHEN NN IS NULL THEN Concat(Upper(YY), ' ', Upper(XX))
END
WHEN D <> '' OR E <> ''
THEN Concat(Upper(XX), ' ', Upper(YY))
ELSE ' '
END
Seems to match your logic ...
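Because CASE returns the first branch whose condition is true, any row that reaches the D/E branch must have failed the C test. One way to check this against the data is to label the branch each row takes. This is only a sketch: my_table is a placeholder, since the question does not name the table, and branch_taken is a made-up alias.

```sql
-- T-SQL sketch: show which top-level branch each row falls into.
SELECT A, B, C, D, E, MM, NN,
       CASE
           WHEN A <> '' OR B <> '' THEN 'branch 1: A or B'
           WHEN C <> ''            THEN 'branch 2: C (nested CASE)'
           WHEN D <> '' OR E <> '' THEN 'branch 3: D or E'
           ELSE                         'branch 4: else'
       END AS branch_taken
FROM my_table;  -- placeholder table name
```

If the rows in question show branch 3, then C is NULL or empty for them, and the nested CASE is never evaluated.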

CONCAT with IF condition in SQL Server

I have a table with four columns containing {YES, NO, N/A} values. What I'd like to obtain is a column with the concatenated names of those columns that hold a 'YES' value, separated by a double underscore.
\, A, B, C, D
1, YES, NO, YES, N/A
2, NO, YES, N/A, N/A
3, YES, NO, NO, YES
Expected result:
A__C
B
A__D
Something like:
select CONCAT(
IF(A = 'YES', 'A'),
IF(B = 'YES', 'B'),
IF(C = 'YES', 'C'),
IF(D = 'YES', 'D'))
from my_table
I hope I understand you correctly: you want a double underscore as the separator.
This solution works without any subquery or CTE.
select substring(
iif(a='YES','__A','') + iif(b='YES','__B','') +
iif(c='YES','__C','') + iif(d='YES','__D','')
,3,100)
from table1
Note that this relies on substring('', 3, 100) returning an empty string in SQL Server rather than raising an error.
Assuming T1 is your table:
SELECT CASE WHEN LEN(X)>0 THEN LEFT(X, LEN(X)-2) ELSE '' END AS Y
FROM (
SELECT
CASE WHEN A='YES' THEN 'A__' ELSE '' END +
CASE WHEN B='YES' THEN 'B__' ELSE '' END +
CASE WHEN C='YES' THEN 'C__' ELSE '' END +
CASE WHEN D='YES' THEN 'D__' ELSE '' END AS X
FROM T1
) A
WITH ABC
as
(
Select
-- build the list with a trailing '__' after every matching column name
CASE WHEN A = 'YES' THEN 'A__' ELSE '' END +
CASE WHEN B = 'YES' THEN 'B__' ELSE '' END +
CASE WHEN C = 'YES' THEN 'C__' ELSE '' END +
CASE WHEN D = 'YES' THEN 'D__' ELSE '' END as [output]
From my_table
)
-- strip the trailing '__' when at least one column matched
Select case when len([output]) > 0 then left([output], len([output]) - 2)
else [output]
end as [output]
From ABC
Select case A when 'YES' then 'A' else '_' end +
case B when 'YES' then 'B' else '_' end +
case C when 'YES' then 'C' else '_' end +
case D when 'YES' then 'D' else '_' end as result
from my_table
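On SQL Server 2017 or later, CONCAT_WS offers another way to express the same idea, because it skips NULL arguments and only puts the separator between the values that remain. The sketch below is not taken from any of the answers. It inlines the sample rows from the question with a VALUES list, and the id column and the yes_columns alias are assumptions added for illustration.

```sql
-- T-SQL sketch (assumes SQL Server 2017+, where CONCAT_WS exists).
SELECT t.id,
       CONCAT_WS('__',
                 CASE WHEN t.A = 'YES' THEN 'A' END,  -- NULL when A is not 'YES'
                 CASE WHEN t.B = 'YES' THEN 'B' END,
                 CASE WHEN t.C = 'YES' THEN 'C' END,
                 CASE WHEN t.D = 'YES' THEN 'D' END) AS yes_columns
FROM (VALUES (1, 'YES', 'NO',  'YES', 'N/A'),
             (2, 'NO',  'YES', 'N/A', 'N/A'),
             (3, 'YES', 'NO',  'NO',  'YES')) AS t(id, A, B, C, D);
```

For the three sample rows this should give A__C, B and A__D, with no leading or trailing separator to trim.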

convert certain column names with comma separated string from sql table with conditions

For example, I have this table with different column names and a Boolean value below each one:
case1 case2 case3 case4
1 0 1 0
I want to retrieve only the names of the columns whose value is 1, so the desired result of the query is case1,case3.
Desired Output : case1,case3
The SQL query fetches only one row.
Is there any way to do this?
If I understand correctly, you could use a big case statement:
select stuff(( (case when case1 = 1 then ',case1' else '' end) +
(case when case2 = 1 then ',case2' else '' end) +
(case when case3 = 1 then ',case3' else '' end) +
(case when case4 = 1 then ',case4' else '' end)
), 1, 1, '') as columns
from your_table_name;
If you have multiple rows, the following query lists the columns that are 1 in every row:
select stuff((
(case when count(*) = sum(cast(case1 as int)) then ',case1' else '' end) +
(case when count(*) = sum(cast(case2 as int)) then ',case2' else '' end) +
(case when count(*) = sum(cast(case3 as int)) then ',case3' else '' end) +
(case when count(*) = sum(cast(case4 as int)) then ',case4' else '' end)), 1, 1, '')
as no_zero_columns
from your_table_name;
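The multi-row query can be tried without creating a table by inlining a couple of rows. This is a sketch with made-up data: the second row is an assumption added to show a column (case2) that is 1 in only some rows.

```sql
-- T-SQL sketch: a column is listed only when it is 1 in every row.
SELECT STUFF((
    (CASE WHEN COUNT(*) = SUM(CAST(case1 AS int)) THEN ',case1' ELSE '' END) +
    (CASE WHEN COUNT(*) = SUM(CAST(case2 AS int)) THEN ',case2' ELSE '' END) +
    (CASE WHEN COUNT(*) = SUM(CAST(case3 AS int)) THEN ',case3' ELSE '' END) +
    (CASE WHEN COUNT(*) = SUM(CAST(case4 AS int)) THEN ',case4' ELSE '' END)), 1, 1, '')
    AS no_zero_columns
FROM (VALUES (1, 0, 1, 0),
             (1, 1, 1, 0)) AS t(case1, case2, case3, case4);  -- inline sample rows
```

With these two rows the result should be case1,case3. When no column qualifies, STUFF is applied to an empty string and returns NULL rather than ''.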

How to get a list of all date columns which have 'NULL' in all rows

I have a table in SQL Server with 369 columns.
Out of these 369, 127 columns are of type date, and some of those 127 date columns have null values in all rows.
How do I get the list of these date columns with NULL values in all rows?
This may be a very resource-heavy operation, depending on table size, but it should do the job.
First, generate some SQL:
select
TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, column_name, DATA_TYPE, IS_NULLABLE,
cmd = 'case when count(*)=sum(case when [' + COLUMN_NAME + '] is null then 1 else 0 end) then 1 else 0 end as [' + COLUMN_NAME + '__isAllNulls],'
from information_schema.columns
where 1=1
and DATA_TYPE like '%date%'
and TABLE_NAME = '<yourtablename>'
Then write the select, pasting in the generated cmd values and removing the trailing comma from the last one:
select
<paste CMD from above results>
from <yourtablename>
EDIT2: A sample final query, after pasting in the generated cmd column, would look like this:
select
case when count(*)=sum(case when [DateAdded] is null then 1 else 0 end) then 1 else 0 end as [DateAdded__isAllNulls],
case when count(*)=sum(case when [DateModified] is null then 1 else 0 end) then 1 else 0 end as [DateModified__isAllNulls],
case when count(*)=sum(case when [DateClosed] is null then 1 else 0 end) then 1 else 0 end as [DateClosed__isAllNulls]
from yourtable
The result should be something like:
DateAdded__isAllNulls DateModified__isAllNulls DateClosed__isAllNulls
1 0 0
EDIT: A simple min() or max() would be better than my case when count(*)=sum(case ... , as @Damien_the_Unbeliever suggested in his comment, but you can still use my code to generate the SQL.
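The max() variant mentioned in the comment could look roughly like the sketch below. It relies on aggregates ignoring NULLs, so MAX over a column is NULL only when the column has no non-NULL value. The column and table names are the sample names from the query above, not real objects.

```sql
-- T-SQL sketch: MAX() skips NULLs, so a NULL result means every row was NULL.
SELECT
    CASE WHEN MAX([DateAdded])    IS NULL THEN 1 ELSE 0 END AS [DateAdded__isAllNulls],
    CASE WHEN MAX([DateModified]) IS NULL THEN 1 ELSE 0 END AS [DateModified__isAllNulls],
    CASE WHEN MAX([DateClosed])   IS NULL THEN 1 ELSE 0 END AS [DateClosed__isAllNulls]
FROM yourtable;
```

One difference to keep in mind: on an empty table MAX also returns NULL, so this version reports every column as all-NULL, whereas the count(*)=sum(...) version reports 0.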

Compatible SQL to test for not null and not empty strings

I want to write SQL that is compatible with both Oracle Database and Microsoft SQL Server.
Specifically, I want a portable SQL expression that returns true for strings that are neither NULL nor empty.
If I use:
column <> ''
it will work on Microsoft SQL Server but not on Oracle (because Oracle treats '' as NULL).
If I use:
len(column) > 0
it will work on Microsoft SQL Server but not on Oracle (because Oracle uses length() instead).
NULLIF is available on both Oracle and SQL Server. This expression should work:
NULLIF(column, '') IS NOT NULL
In both servers, if column is NULL, then the output of NULLIF will just pass the NULL value through. On SQL Server, '' = '', so the output of NULLIF will be NULL. On Oracle, '' is already NULL, so it gets passed through.
This is my test on SQL Server 2008 R2 Express:
WITH SampleData AS
(SELECT 1 AS col1, NULL AS col2
UNION ALL
SELECT 2, ''
UNION ALL
SELECT 3, 'hello')
SELECT *
FROM SampleData
WHERE NULLIF(col2, '') IS NOT NULL;
And this is my test case on Oracle 10g XE:
WITH SampleData AS
(SELECT 1 AS col1, NULL AS col2 FROM DUAL
UNION ALL
SELECT 2, '' FROM DUAL
UNION ALL
SELECT 3, 'hello' FROM DUAL)
SELECT *
FROM SampleData
WHERE NULLIF(col2, '') IS NOT NULL;
Both return only the row with col1 = 3, as expected.
How about
CASE WHEN column = '' THEN NULL ELSE column END IS NOT NULL
I think the key here is to differentiate between the case when the empty string is equivalent to NULL and when it isn't:
WHERE CASE WHEN '' = '' THEN -- e.g., SQL Server this is true
CASE WHEN col <> '' AND col IS NOT NULL THEN 'Y'
ELSE 'N'
END
WHEN COALESCE(col,NULL) IS NOT NULL THEN 'Y' -- Not SS, e.g., Oracle
ELSE 'N'
END = 'Y';
If the first case is true then empty string is not the same as null, and we have to test for both string being not null and string not being the empty string. Otherwise, our task is easier because empty string and null evaluate the same.
An attempt to shorten @DCookie's answer. I like his ( '' = '' ) test.
CASE WHEN ( '' = '' ) THEN ( column <> '' )
ELSE ( column = column )
END
Sadly, the above will not work. The following works in SQL Server; I can't test it in Oracle right now:
CASE WHEN '' = '' THEN CASE WHEN column <> '' THEN 1 ELSE NULL END
ELSE CASE WHEN column = column THEN 1 ELSE NULL END
END
which can also be written as:
( '' = '' AND column <> '' )
OR ( '' IS NULL AND column = column )