Replace empty string in hive- Nvl and COALESCE tried

How can I replace an empty string (length 0) with some other value? I already tried Nvl and COALESCE, but neither substitutes the replacement value because the value is not null. I can use a case statement, but I am looking for a built-in function if there is one.

Your data contains empty strings, but coalesce and nvl only act on null values. These functions do not replace empty strings.
With empty strings:
hive> select coalesce(string(""),"1");
+------+--+
| _c0 |
+------+--+
| |
+------+--+
hive> select nvl(string(""),"1");
+------+--+
| _c0 |
+------+--+
| |
+------+--+
With null values:
hive> select coalesce(string(null),"1");
+------+--+
| _c0 |
+------+--+
| 1 |
+------+--+
hive> select nvl(string(null),"1");
+------+--+
| _c0 |
+------+--+
| 1 |
+------+--+
Try altering the table to add this property:
TBLPROPERTIES('serialization.null.format'='')
If this property does not make empty strings appear as nulls, use a case or if expression to replace them.
You can use the if function:
if(boolean testCondition, T valueTrue, T valueFalseOrNull)
hive> select if(length(trim(<col_name>))=0,'<replacement_val>',<col_name>) from <db>.<tb>;
Example:
hive> select if(length(trim(string("")))=0,'1',string("col_name"));
+------+--+
| _c0 |
+------+--+
| 1 |
+------+--+
hive> select if(length(trim(string("1")))=0,'1',string("col_name"));
+-----------+--+
| _c0 |
+-----------+--+
| col_name |
+-----------+--+

In Hive, an empty string is treated as an ordinary comparable value, not as NULL. That is why there is no built-in function for this.
Using a case statement:
case when col='' or col is null then 'something' else col end
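The difference between COALESCE and the case expression above can be checked quickly outside Hive. The sketch below runs both against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here (an assumption of this example, not something from the answers), and the table name t and its three sample rows are hypothetical.

```python
import sqlite3

# SQLite is only a stand-in for illustration; Hive itself is not involved.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col TEXT)")
# Hypothetical sample rows: an empty string, a NULL, and a normal value.
conn.executemany("INSERT INTO t VALUES (?)", [("",), (None,), ("abc",)])

query = """
SELECT COALESCE(col, 'something') AS via_coalesce,
       CASE WHEN col = '' OR col IS NULL THEN 'something' ELSE col END AS via_case
FROM t
ORDER BY rowid
"""
rows = conn.execute(query).fetchall()

# COALESCE leaves the empty string alone; the CASE expression replaces it.
for via_coalesce, via_case in rows:
    print(repr(via_coalesce), repr(via_case))
```

Only the CASE column replaces the empty string in the first row; both columns replace the NULL in the second row.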

Related

Oracle regex_replace ' from values

I need help with removing "'pp'" from search results where it appears at the beginning of the text. Values in the search results contain spaces and also '. I need to remove only 'pp from the beginning.

This sounds like:
select regexp_replace(col, '^pp', '')
Or a case expression:
select (case when col like 'pp%' then substr(col, 3) else col end)

You don't need regular expressions and can use simple string functions.
If you want to use SELECT then:
SELECT value,
CASE
WHEN value LIKE 'pp%'
THEN SUBSTR( value, 3 )
ELSE value
END AS replaced_value
FROM table_name
Outputs:
VALUE | REPLACED_VALUE
:---- | :-------------
pp123 | 123
pp1pp | 1pp
123pp | 123pp
12345 | 12345
and, if you want to UPDATE the table:
UPDATE table_name
SET value = SUBSTR( value, 3 )
WHERE value LIKE 'pp%';
Then:
SELECT * FROM table_name;
Outputs:
| VALUE |
| :---- |
| 123 |
| 1pp |
| 123pp |
| 12345 |

Get a substring in hive

I am trying to get a substring of a string from Hive. I have a string as this one: 2017-06-05 09:06:32.0
What I want is to get the first two digits from hour, that is, 09.
I get the entire hour with this command:
SELECT SUBSTR(hora,11) AS subhoras FROM axmugbcn18.bbdd WHERE hora = '2017-06-05 09:06:32.0'
The result of the command is: 09:06:32.0
In order to get only 09 I try this command:
SELECT REGEXP_EXTRACT(hora,'\d\d') AS subhoras FROM axmugbcn18.bbdd WHERE hora = '2017-06-05 09:09:32.0'
but results are blank.
How can I retrieve only the two digits of hour?
Thanks

There are several ways to extract the hour from a timestamp value.
1.Using Substring function:
select substring(string("2017-06-05 09:06:32.0"),12,2);
+------+--+
| _c0 |
+------+--+
| 09 |
+------+--+
2.Using Regexp_Extract:
select regexp_Extract(string("2017-06-05 09:06:32.0"),"\\s(\\d\\d)",1);
+------+--+
| _c0 |
+------+--+
| 09 |
+------+--+
3.Using Hour:
select hour(timestamp("2017-06-05 09:06:32.0"));
+------+--+
| _c0 |
+------+--+
| 9 |
+------+--+
4.Using from_unixtime:
select from_unixtime(unix_timestamp('2017-06-05 09:06:32.0'),'HH');
+------+--+
| _c0 |
+------+--+
| 09 |
+------+--+
5.Using date_format:
-- 'HH' is the 24-hour clock; 'hh' is the 12-hour clock and would return 03 for 15:06:32
select date_format(string('2017-06-05 09:06:32.0'),'HH');
+------+--+
| _c0 |
+------+--+
| 09 |
+------+--+
6.Using Split:
select split(split(string('2017-06-05 09:06:32.0'),' ')[1],':')[0];
+------+--+
| _c0 |
+------+--+
| 09 |
+------+--+
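The regular expression in option 2 can be tried in isolation. The sketch below uses Python's re module as a stand-in for Hive's Java regex engine (an assumption; the two agree for this simple pattern). It shows why the capture group after the whitespace matters: an unanchored \d\d matches the first two digits of the year, not the hour.

```python
import re

ts = "2017-06-05 09:06:32.0"

# Unanchored pattern: the first two digits in the string belong to the year.
first_two_digits = re.search(r"\d\d", ts).group(0)

# Pattern from option 2: two digits right after whitespace, captured as group 1.
hour = re.search(r"\s(\d\d)", ts).group(1)

print(first_two_digits)
print(hour)
```

In the Hive query the backslashes are doubled ("\\s(\\d\\d)") because they sit inside a string literal; the Python raw string needs only one.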

Try the below:
select
'2017-06-05 09:06:32.0' as t,
hour('2017-06-05 09:06:32.0'), -- output: 9
from_unixtime(unix_timestamp('2017-06-05 09:06:32.0'),'HH') -- output: 09
from table_name;
You can use either hour or from_unixtime with unix_timestamp to get the desired result.
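The difference between the numeric hour (9) and the zero-padded string (09) can be illustrated with Python's datetime module. This is only a sketch of the formatting behaviour, not Hive code; the afternoon timestamp is a hypothetical extra value added to show how a 12-hour format differs from a 24-hour one.

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M:%S.%f"
morning = datetime.strptime("2017-06-05 09:06:32.0", fmt)
afternoon = datetime.strptime("2017-06-05 15:06:32.0", fmt)  # hypothetical value

hour_number = morning.hour             # integer, like hour()
hour_padded = morning.strftime("%H")   # zero-padded 24-hour string
hour_24 = afternoon.strftime("%H")     # 24-hour clock
hour_12 = afternoon.strftime("%I")     # 12-hour clock

print(hour_number, hour_padded, hour_24, hour_12)
```

The 12-hour form loses the distinction between morning and afternoon, which is why a 24-hour pattern is the safer choice when extracting the hour as text.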
Hope this helps :)

SQL - left pad with the zero after symbol '-'

I am trying to left pad with a single zero after the '-'.
I checked the other answers here, but they didn't help me.
Here is the table:
+---------+
| Job |
+---------+
| 3254-1 |
| 3254-25 |
| 3254-6 |
+---------+
I need to left pad with a single zero after the '-' if the value at the end is between 1 and 9.
I want the results to be:
+---------+
| Job |
+---------+
| 3254-01 |
| 3254-25 |
| 3254-06 |
+---------+

You can use CHARINDEX(), LEN() and REPLACE() as:
CREATE TABLE Jobs(
Job VARCHAR(45)
);
INSERT INTO Jobs VALUES
('3254-1'),
('3254-25'),
('3254-6');
SELECT CASE
WHEN CHARINDEX('-', Job, 1)+1 < LEN(Job) THEN Job
ELSE
REPLACE(Job, '-', '-0')
END AS Job
FROM Jobs;
Results:
+----+---------+
| | Job |
+----+---------+
| 1 | 3254-01 |
| 2 | 3254-25 |
| 3 | 3254-06 |
+----+---------+

If you want an update, I think this is the simplest method:
update t
set job = replace(job, '-', '-0')
where job like '%-_';
This problem is simplified greatly because you are only adding a single padding character.
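The padding rule can also be sketched in Python for a quick check. The helper name pad_job is hypothetical. Unlike replace(job, '-', '-0'), this version splits on the last '-' only, which is an assumption that makes no difference for the sample data because each value contains a single dash.

```python
def pad_job(job):
    """Insert one zero after the last '-' when exactly one character follows it."""
    head, sep, tail = job.rpartition("-")
    if sep and len(tail) == 1:
        return head + "-0" + tail
    return job


jobs = ["3254-1", "3254-25", "3254-6"]
padded = [pad_job(j) for j in jobs]

# Prints the padded job numbers, one per line.
for job in padded:
    print(job)
```

The condition len(tail) == 1 plays the same role as the like '%-_' filter in the update: only values with a single character after the dash are changed.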

If you have SQL Server 2012 or later, the format function can be used as:
select concat(nr1, '-', format( cast ( q2.nr2 as int ), '00')) as result
from
(
select substring(q1.str,1,charindex('-',q1.str,1)-1) as nr1,
substring(q1.str,charindex('-',q1.str,1)+1,len(q1.str)) as nr2
from
(
select '3254-1' as str union all
select '3254-25' as str union all
select '3254-6' as str
) q1
) q2;
result
------
3254-01
3254-25
3254-06

hive regexp_extract after second occurrence of delimiter

We have a Hive table column that holds strings separated by ';', and we need to extract the string after the second occurrence of ';'.
+-----------------+
| col1 |
+-----------------+
| a;b;c;d |
| e;f; ;h |
| i;j;k;l |
+-----------------+
Required output:
+-----------+
| col1 |
+-----------+
| c |
| <null> |
| k |
+-----------+
select regexp_extract

Split the string on ; which will return an array of values and from this you can get the element at index 2.
select split(str,';')[2]
from tbl

If you want to convert empty and space-only strings to NULLs like in your example, then this macro can be useful:
create temporary macro empty_to_null(s string) case when trim(s)!='' then s end;
select empty_to_null(split(col1,'\\;')[2]);
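The combination of splitting and the empty_to_null macro can be sketched in Python as follows. The function third_field is a hypothetical name, and returning None for a row with fewer than three fields is an assumption of this sketch, not something stated in the answers.

```python
def empty_to_null(s):
    """Return s unless it is None, empty, or whitespace only; then return None."""
    if s is not None and s.strip() != "":
        return s
    return None


def third_field(value, sep=";"):
    """Take the element at index 2 after splitting, mapped through empty_to_null."""
    parts = value.split(sep)
    # Assumption: a row with fewer than three fields yields None.
    return empty_to_null(parts[2]) if len(parts) > 2 else None


rows = ["a;b;c;d", "e;f; ;h", "i;j;k;l"]
result = [third_field(r) for r in rows]

# Prints the extracted third field for each row.
for value in result:
    print(value)
```

For the sample rows in the question, the second row yields None because its third field contains only a space.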

Why cast as timestamp give out two different result

I have a hive table with two rows like this:
0: jdbc:hive2://localhost:10000/default> select * from t2;
+-----+--------+
| id | value |
+-----+--------+
| 10 | 100 |
| 11 | 101 |
+-----+--------+
2 rows selected (1.116 seconds)
but when I issue this query:
select cast(1 as timestamp) from t2;
it gives inconsistent results. Can anyone tell me the reason?
0: jdbc:hive2://localhost:10000/default> select cast(1 as timestamp) from t2;
+--------------------------+
| _c0 |
+--------------------------+
| 1970-01-01 07:00:00.001 |
| 1970-01-01 07:00:00.001 |
+--------------------------+
2 rows selected (0.913 seconds)
0: jdbc:hive2://localhost:10000/default> select cast(1 as timestamp) from t2;
+--------------------------+
| _c0 |
+--------------------------+
| 1970-01-01 08:00:00.001 |
| 1970-01-01 07:00:00.001 |
+--------------------------+
2 rows selected (1.637 seconds)

I can't reproduce your problem. Which Hive version are you using? Hive had a bug with timestamp and bigint (see https://issues.apache.org/jira/browse/HIVE-3454), but it doesn't explain your problem. For example, Hive 0.14 gives different results for the two columns in
SELECT cast(1 as timestamp), cast(cast(1 as double) as timestamp) from my_table limit 5;
