Hive: Get latest n values from columns

I want to select the largest n values in Hive:
use mydb;
select greatest_n(10, mycol1, mycol2) from mytab;
I am using Hive 2.x. In Hive 0.13 I was able to run the query above and it worked, but now I get:
FAILED: SemanticException [Error 10011]: Invalid function greatest_n
Is there a way to do this in Hive 2.x?

Hive has had a built-in greatest() function since Hive 1.1.
Example:
hive> select greatest(1,2,3,4);
4
If your version does not have the greatest() function, try the approach mentioned in this link.
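Note that greatest() compares its arguments within a single row, so on its own it is not a top-n across rows. Below is a minimal sketch of both readings of the question, assuming the table and column names from the question. The greatest_n function is not a Hive built-in as far as the error message indicates, so it was presumably a custom UDF in the old setup.

```sql
use mydb;

-- Row-wise: the larger of the two columns for every row (built in since Hive 1.1).
select greatest(mycol1, mycol2) as row_max from mytab;

-- Across rows: the 10 largest values found in either column.
select v
from (
  select mycol1 as v from mytab
  union all
  select mycol2 as v from mytab
) u
order by v desc
limit 10;
```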

Related

Hive Substring from complex subquery

The substring expression below was written for Greenplum, and we need to migrate the equivalent operation to Hive. Kindly help us.
select substring(a.dealer_msisdn from char_length(a.dealer_msisdn)-9) as dealer_msisdn
Example of an MSISDN value with the above query in Greenplum:
select substring('9970050916' from char_length('9970050916')-9) as dealer_msisdn
Please help me migrate this operation to Hive.
In Hive the substring function syntax is different:
substr(string|binary A, int start, int len), where len is an optional parameter.
Try this:
select substr('9970050916', char_length('9970050916')-9)
See the Hive UDFs manual for details.
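For reference, here is a sketch of the same "last 10 characters" extraction applied to a column. It assumes that length() can stand in for char_length() on Hive versions that lack the latter, and that substr accepts a negative start counted back from the end of the string. The table name dealers is hypothetical.

```sql
-- Start position computed from the string length (positions are 1-based).
select substr(a.dealer_msisdn, length(a.dealer_msisdn) - 9) as dealer_msisdn
from dealers a;  -- hypothetical table name, not from the question

-- Same idea with a negative start, which counts back from the end of the string.
select substr('9970050916', -10) as dealer_msisdn;
```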

SQL in Hive SemanticException [Error 10004]: Line 3:5 Invalid table alias or column reference

I am trying to figure out Hive SQL and not having much luck with what appear to be the basics; I'm just not getting it.
I have a query:
select
from_unixtime(unix_timestamp(unixTimeStampField)+43200) as MyLocalTime,
cast(MyLocalTime as timestamp) as EventTime,
*
from mart.table
where names in ('abc','xyz')
What I am trying to do is first convert the Unix time to my local time using from_unixtime, and then cast the converted column to a date/time field so my graphs can read it as a date/time rather than a string value.
I am getting this error:
Error while compiling statement: FAILED: SemanticException [Error 10004]: Line 3:5 Invalid table alias or column reference
I tried some of the fixes suggested in other chats, but none of them gave me a result. Thanks in advance.
Can you please try this?
If you select all columns along with something else, you need to alias the table and use it to fetch all columns.
select
from_unixtime(unix_timestamp(unixTimeStampField)+43200) as MyLocalTime,
cast(MyLocalTime as timestamp) as EventTime,
t.* -- You need to call the table by alias.
from mart.table t -- alias the table.
where names in ('abc','xyz')
Thanks for that. I did try it, but unfortunately had no luck. What did work was nesting the Unix conversion directly inside the cast to timestamp:
cast(from_unixtime(unix_timestamp(tfield)+43200)as TIMESTAMP)
So the query now looks like this:
select
cast(from_unixtime(unix_timestamp(tfield)+43200) as TIMESTAMP) as MyLocalTime,
*
from
mart.table
where
names in ('abc','xyz')
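The position in the original error (Line 3:5) is the reference to MyLocalTime inside the cast: a column alias defined in a select list is generally not visible to other expressions in that same list. Nesting the functions avoids the alias altogether. If both columns are wanted, a subquery can define the alias first. This is a sketch using the names from the question, not a tested query:

```sql
select
  cast(s.MyLocalTime as timestamp) as EventTime,  -- alias is visible here
  s.*
from (
  -- Inner query defines MyLocalTime so the outer query can refer to it.
  select
    from_unixtime(unix_timestamp(unixTimeStampField) + 43200) as MyLocalTime,
    t.*
  from mart.table t
  where names in ('abc','xyz')
) s;
```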

Invalid table name with table search

In BigQuery I am using the following query:
SELECT
*
FROM
`properati-data-public:properties_mx.properties_sell_201***`
WHERE
_TABLE_SUFFIX BETWEEN '1501'
AND '1810'
Where properati-data-public:properties_mx.properties_sell_201501 is a valid table. When I use the query with multiple tables, I get the following error:
Query Failed
Error: Invalid table name: `properati-data-public:properties_mx.properties_sell_201***`
You should use:
`properati-data-public.properties_mx.properties_sell_20*`
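Note the differences:
. instead of : after the project name
20* instead of 201*** (a wildcard table name takes a single trailing *)
Also put the line below first in your query to ensure you are in Standard SQL mode: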
#standardSQL
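Putting the pieces together, the corrected query might look like the sketch below. With the prefix ending in 20, _TABLE_SUFFIX holds the remaining characters of each table name (for example 1501 for properties_sell_201501), so the BETWEEN bounds from the question can stay as they are.

```sql
#standardSQL
SELECT
  *
FROM
  `properati-data-public.properties_mx.properties_sell_20*`
WHERE
  -- Suffix is compared as a string: '1501' through '1810'.
  _TABLE_SUFFIX BETWEEN '1501' AND '1810'
```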

SELECT database.table.column in Hive

Is it possible to use
SELECT DB.TABLE.COLUMN from DB.TABLE
in Hive?
I know it's possible to alias DB.TABLE as follows:
SELECT T1.COLUMN FROM DB.TABLE AS T1
But, is there any way in Hive to select a column fully qualified by its database and table name, as shown in the first query above? I've done this before in MySQL but I don't know if there's a way to make Hive work this way.
No, that is not possible in Hive; you will get an exception:
SemanticException [Error 10004]: Line 1:7 Invalid table alias or column reference 'DB': (possible column names are: col)
Your second SELECT statement is valid.
To specify a database, either qualify the table names with database names ("db_name.table_name" starting in Hive 0.7) or issue the USE statement before the query statement (starting in Hive 0.6).
See the Hive language manual: LanguageManual+Select
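The two options described above can be sketched as follows. DB, TABLE and COLUMN are the placeholders from the question, not literal identifiers, so substitute real names before running anything.

```sql
-- Option 1: qualify the table with its database, and the column with a table alias.
SELECT T1.COLUMN FROM DB.TABLE T1;

-- Option 2: switch to the database first, then qualify the column with the table name only.
USE DB;
SELECT TABLE.COLUMN FROM TABLE;
```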

Using functions in Google BigQuery

I am trying to use a SQL function in my BigQuery query:
SELECT FORMAT("%T", NET.HOST(resolved_urls.url)) AS host, FROM [tableName] LIMIT 1000
But I get the following error:
Error: 2.15 - 2.55: Unrecognized function format
I am getting this error for every SQL function I try to use.
Any ideas? Thank you.
These newer functions are supported only in BigQuery Standard SQL.
Try the query below:
#standardSQL
SELECT FORMAT("%T", NET.HOST(resolved_urls.url)) AS host, FROM tableName LIMIT 1000
Also note: in Standard SQL you use `tableName` and not [tableName]