Initcap of word - sql

I have a table x with a column resource_name that holds values like NASRI(SRI).
Applying initcap to this column returns Nasri(sri), but my expected output is Nasri(Sri).
How can I achieve the desired result?
Thank you

One possible solution is to use split() with concat_ws(). It also works correctly when the value does not contain a parenthesis. Demo with ():
hive> select concat_ws('(',initcap(split('NASRI(SRI)','\\(')[0]),
initcap(split('NASRI(SRI)','\\(')[1])
);
OK
Nasri(Sri)
Time taken: 0.974 seconds, Fetched: 1 row(s)
For a value without () it also works: split(...)[1] is NULL, initcap(NULL) is NULL, and concat_ws() skips NULL arguments:
hive> select concat_ws('(',initcap(split('NASRI','\\(')[0]),
initcap(split('NASRI','\\(')[1])
);
OK
Nasri
Time taken: 0.697 seconds, Fetched: 1 row(s)

Related

Hive String to Timestamp conversion with Milliseconds

I have a requirement to convert the mentioned input string format and produce the desired output in timestamp as shown below.
Input: 16AUG2001:23:46:32.876086
Desired Output: 2001-08-16 23:46:32.876086
Actual output from the query below: 2001-08-17 00:01:08
Query:
select '16AUG2001:23:46:32.876086' as row_ins_timestamp,
from_unixtime(unix_timestamp('16AUG2001:23:46:32.876086',
'ddMMMyyyy:HH:mm:ss.SSSSSS')) as row_ins_timestamp
from temp;
The fractional-seconds part is not converted as required. Please suggest.
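unix_timestamp() returns whole seconds, so it cannot preserve the fractional part. The output in the question also suggests that the SSSSSS pattern reads 876086 as a count of milliseconds (about 14 minutes 36 seconds), which is why the time drifts to 2001-08-17 00:01:08.
Convert the part before the dot without the fraction, then concatenate the fractional part back: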
with your_data as (
select stack(3,
'16AUG2001:23:46:32.876086',
'16AUG2001:23:46:32',
'16AUG2001:23:46:32.123'
) as ts
)
select concat_ws('.',
from_unixtime(unix_timestamp(split(ts,'\\.')[0],'ddMMMyyyy:HH:mm:ss')),
split(ts,'\\.')[1])
from your_data;
Result:
2001-08-16 23:46:32.876086
2001-08-16 23:46:32
2001-08-16 23:46:32.123
Time taken: 0.089 seconds, Fetched: 3 row(s)
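To see where the 00:01:08 in the question comes from, here is a minimal Java sketch. It assumes that unix_timestamp() in the asker's Hive version parses with java.text.SimpleDateFormat in lenient mode, which is consistent with the output in the question but is still an assumption. The class and method names are made up for illustration, and the time zone is pinned to UTC only to make the result repeatable.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class for illustration; not part of Hive.
class FractionDrift {
    // Builds a formatter pinned to UTC with English month names.
    static SimpleDateFormat utc(String pattern) {
        SimpleDateFormat f = new SimpleDateFormat(pattern, Locale.US);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f;
    }

    // Parses the whole string, fraction included, like the query in the question.
    static String naive(String ts) {
        try {
            return utc("yyyy-MM-dd HH:mm:ss")
                    .format(utc("ddMMMyyyy:HH:mm:ss.SSSSSS").parse(ts));
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Parses only the part before the dot, then appends the fraction unchanged.
    static String splitAndAppend(String ts) {
        String[] parts = ts.split("\\.", 2);
        try {
            String base = utc("yyyy-MM-dd HH:mm:ss")
                    .format(utc("ddMMMyyyy:HH:mm:ss").parse(parts[0]));
            return parts.length == 2 ? base + "." + parts[1] : base;
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String ts = "16AUG2001:23:46:32.876086";
        System.out.println(naive(ts));          // 2001-08-17 00:01:08
        System.out.println(splitAndAppend(ts)); // 2001-08-16 23:46:32.876086
    }
}
```

The second method mirrors the split() plus concat_ws() approach: the fraction never goes through the date parser, so it cannot be misread as milliseconds.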

Hive: varchar column could not return month

How can I return the month from a varchar column with values like "20180912" in Hive?
Strangely, the month() function used to work on the string type in Hive, but it returns NULL now.
And month(from_unixtime(unix_timestamp(date,'yyyymmdd'))) returns values that do not match the real month.
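In the pattern 'yyyymmdd', mm means minutes and the month is MM, so the pattern would have to be 'yyyyMMdd' to parse the month. The simplest fix is to use substr():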
hive> select substr('20180912',5,2);
OK
09
Time taken: 1.675 seconds, Fetched: 1 row(s)
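The mismatch the asker saw can be reproduced outside Hive. This is a minimal Java sketch, assuming the Hive version in question parses date patterns with java.text.SimpleDateFormat; the class and method names are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class for illustration; not part of Hive.
class MonthPattern {
    // Parses value with the given pattern and returns the 1-based month.
    static int monthOf(String value, String pattern) {
        SimpleDateFormat f = new SimpleDateFormat(pattern, Locale.US);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            Calendar c = Calendar.getInstance(TimeZone.getTimeZone("UTC"), Locale.US);
            c.setTime(f.parse(value));
            return c.get(Calendar.MONTH) + 1; // Calendar months are 0-based
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // 'mm' reads "09" as minutes, so the month falls back to January.
        System.out.println(monthOf("20180912", "yyyymmdd")); // 1
        // 'MM' reads "09" as the month.
        System.out.println(monthOf("20180912", "yyyyMMdd")); // 9
    }
}
```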

Casting the Bigint number Returns NULL

I need to convert an integer value to the largest numeric data type in Hive, as my values have up to 25 digits.
select cast(18446744073709551614 as bigint);
The select statement above returns NULL.
I am well aware that the supplied number is greater than the largest BIGINT value, but we receive such values and I have to calculate max, min, sum and avg on them.
So how can I cast values like this without getting NULLs?
Use decimal(38, 0) to store numbers bigger than BIGINT; it can hold up to 38 digits, while BIGINT can hold up to 19. See also the manual on the decimal type.
For literals, the BD postfix is required. Example:
hive> select CAST(18446744073709551614BD AS DECIMAL(38,0))+CAST(18446744073709551614BD AS DECIMAL(38,0));
OK
36893488147419103228
Time taken: 0.334 seconds, Fetched: 1 row(s)
hive> select CAST(18446744073709551614BD AS DECIMAL(38,0))*2;
OK
36893488147419103228
Time taken: 0.129 seconds, Fetched: 1 row(s)
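As a rough cross-check of the numbers above, here is a small Java sketch. It relies on BIGINT being a 64-bit signed integer, the same range as Java's long, and uses java.math.BigDecimal as a stand-in for decimal(38, 0). The class and method names are hypothetical.

```java
import java.math.BigDecimal;

// Hypothetical helper class for illustration; not part of Hive.
class BigValues {
    // True when the value lies inside the 64-bit signed range.
    static boolean fitsInBigint(String digits) {
        BigDecimal v = new BigDecimal(digits);
        return v.compareTo(BigDecimal.valueOf(Long.MIN_VALUE)) >= 0
                && v.compareTo(BigDecimal.valueOf(Long.MAX_VALUE)) <= 0;
    }

    // Doubles the value with arbitrary precision.
    static String doubled(String digits) {
        return new BigDecimal(digits).multiply(BigDecimal.valueOf(2)).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(fitsInBigint("18446744073709551614")); // false
        System.out.println(fitsInBigint("9223372036854775807"));  // true
        System.out.println(doubled("18446744073709551614"));      // 36893488147419103228
    }
}
```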

How to get the number in the field in hive?

An example of this field is "/products/106017388" in the table.
What SQL query should I write to get the number 106017388 from the field?
Many thanks.
You can try the Hive function regexp_extract, something like:
select regexp_extract(field_name, "([0-9]+)$", 1) from table_name;
The regex ([0-9]+)$ captures the run of digits at the end of the string.
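The same pattern can be tried outside Hive. This is a minimal Java sketch, assuming regexp_extract follows Java regular expression syntax; the class and method names are hypothetical, and returning null for a non-match is a choice of this sketch, not a statement about Hive's behavior.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of Hive.
class TrailingDigits {
    private static final Pattern TRAILING = Pattern.compile("([0-9]+)$");

    // Returns the digits at the end of the input, or null when there are none.
    static String extract(String input) {
        Matcher m = TRAILING.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("/products/106017388")); // 106017388
        System.out.println(extract("/products1/06017388")); // 06017388
    }
}
```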
You can also use the split function in Hive to extract the required value, like below:
select * from test_stackoverflow;
1 /products/106017388
2 /products1/06017388
Time taken: 0.66 seconds, Fetched: 2 row(s)
select split(value,'[/]')[2] from test_stackoverflow;
OK
106017388
06017388
Time taken: 0.105 seconds, Fetched: 2 row(s)
Hope this helps!
Use SUBSTR('/products/106017388',11) to get only the integer part. This assumes the prefix is always the 10-character '/products/'.

Hive FROM_UNIXTIME() with milliseconds

I have seen enough posts where we divide by 1000 or cast to convert from Milliseconds epoch time to Timestamp. I would like to know how can we retain the Milliseconds piece too in the timestamp.
1440478800123 The last 3 digits are milliseconds. How do I convert this to something like YYYYMMDDHHMMSS.sss?
I need to capture the millisecond portion also in the converted timestamp
Thanks
select cast(epoch_ms as timestamp)
actually works, because when casting to a timestamp (as opposed to using from_unixtime()), Hive seems to treat an int or bigint as milliseconds, while a floating point type is treated as seconds. That is undocumented as far as I can see, and possibly a bug.
I wanted a string that includes the timezone (which can be important, particularly if the server switches to summer/daylight saving time), and wanted to be explicit about the conversion in case the cast behavior changes. So this gives an ISO 8601 style date (adjust the format string as needed for another format):
select from_unixtime(
floor( epoch_ms / 1000 )
, printf( 'yyyy-MM-dd HH:mm:ss.%03dZ', epoch_ms % 1000 )
)
Create a Hive UDF in Java:
package com.kishore.hiveudf;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.hive.ql.udf.UDFType;
@UDFType(stateful = true)
public class TimestampToDateUDF extends UDF {
    // Formats epoch milliseconds using the JVM's default time zone.
    public String evaluate(long timestamp) {
        Date date = new Date(timestamp);
        // yyyy = calendar year, MM = month, dd = day of month
        // (YYYY and DD would mean week year and day of year).
        DateFormat formatter = new SimpleDateFormat("yyyyMMddHHmmss:SSS");
        return formatter.format(date);
    }
}
Export it as TimestampToDateUDF.jar, then register it:
hive> ADD JAR /home/kishore/TimestampToDateUDF.jar;
hive> create TEMPORARY FUNCTION toDate AS 'com.kishore.hiveudf.TimestampToDateUDF' ;
Output (the hour reflects the JVM's default time zone):
select * from tableA;
OK
1440753288123
Time taken: 0.071 seconds, Fetched: 1 row(s)
hive> select toDate(timestamp) from tableA;
OK
20150828144448:123
Time taken: 0.08 seconds, Fetched: 1 row(s)