I am running the query below in Impala, but I get an error related to unix_timestamp:
AnalysisException: No matching function with signature: unix_timestamp(TIMESTAMP, STRING).
When I run the same query in Hue, I get an error related to add_months:
Error while compiling statement: FAILED: ParseException line 8:28 cannot recognize input near 'select' 'add_months' '(' in expression specification
If I run add_months on its own as a separate query, I get a result:
select add_months(max(period), -6) from ph_com_b_gbl_sales.it_fact_sales
Query:
select
product.brand as product_group,
sum(equivalent_units) as equivalent_units ,
sales.data_source,
from_unixtime(unix_timestamp(period,'yyyy-MM-dd'),'yyyy-MM')
from (select sales_line, territory_mapping_key from ph_com_b_gbl_sales.it_territory_mapping where sales_line = 'Linea Cardiometabolica') territory_mapping
inner join
(select territory_mapping_key,equivalent_units, sales_office, product_key, data_source, period from ph_com_b_gbl_sales.it_fact_sales where sales_office = 'ITOS' and data_source like '%DPC%' or data_source like'%Ex Factory%' or data_source like '%Sell-out%') sales
on sales.territory_mapping_key = territory_mapping.territory_mapping_key
inner join
(select brand,product_key from ph_com_b_gbl_sales.it_dim_product where brand like '%ENTRESTO%') product
on sales.product_key = product.product_key
where sales.period between (select add_months(max(period), -6) from ph_com_b_gbl_sales.it_fact_sales) and (select max(period) from ph_com_b_gbl_sales.it_fact_sales)
group by sales.data_source,sales.period,product.brand
My objective is to fetch 6 months of data. It won't necessarily be the last 6 months; I just have to provide a 6-month range.
create table db.temp
location '/user/temp' as
SELECT t1.mobile_no
FROM db.temp t1
WHERE NOT EXISTS ( SELECT NULL
FROM db.temp t2
WHERE t1.mobile_no = t2.mobile_no
AND t1.cell != t2.cell
AND t2.access_time BETWEEN t1.access_time
AND t1.access_time_5);
I need to get all the users who stayed on the same cell for the 5-hour interval from access_time to access_time_5. This code works perfectly fine in Impala, but it does not work in Hive.
It gives an error:
"Error while compiling statement: FAILED:
SemanticException [Error 10249]: line 23:25 Unsupported SubQuery
Expression"
I looked at a similar question related to this error but can't figure out the solution. Any help would be highly appreciated!
Hive does not support a correlated BETWEEN in a subquery, nor does it support non-equi joins. Try rewriting the query with a LEFT JOIN, counting the rows that match your condition, and filtering on that count:
select mobile_no from
(
SELECT t1.mobile_no,
sum(case when t1.cell != t2.cell
and t2.access_time between t1.access_time and t1.access_time_5
then 1 else 0
end) as cnt_exclude
FROM db.temp t1
LEFT JOIN db.temp t2 on t1.mobile_no = t2.mobile_no
GROUP BY t1.mobile_no
)s
where cnt_exclude=0
The problem with this solution is that the LEFT JOIN may produce a huge number of duplicated rows, which will hurt performance, though it may work if the data is not too big.
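To see what the LEFT JOIN rewrite returns, here is a minimal sketch that runs the same query shape in Python's built-in sqlite3 rather than in Hive. The table name access_log is a hypothetical stand-in for db.temp, the sample rows are invented for illustration, and the times are assumed to be integer seconds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# access_log is a hypothetical stand-in for db.temp; times are assumed to be seconds.
conn.execute(
    "CREATE TABLE access_log ("
    "mobile_no TEXT, cell TEXT, access_time INTEGER, access_time_5 INTEGER)"
)
conn.executemany(
    "INSERT INTO access_log VALUES (?, ?, ?, ?)",
    [
        ("A", "c1", 0, 18000),     # A stays on cell c1
        ("A", "c1", 3600, 21600),
        ("B", "c1", 0, 18000),     # B switches to c2 one hour later
        ("B", "c2", 3600, 21600),
        ("C", "c3", 0, 18000),     # C has a single row
    ],
)

# Same shape as the rewrite above: self-join on mobile_no, count the
# conflicting rows per number, then keep the numbers with no conflicts.
rewrite = """
SELECT mobile_no FROM (
    SELECT t1.mobile_no,
           SUM(CASE WHEN t1.cell != t2.cell
                     AND t2.access_time BETWEEN t1.access_time AND t1.access_time_5
                    THEN 1 ELSE 0 END) AS cnt_exclude
    FROM access_log t1
    LEFT JOIN access_log t2 ON t1.mobile_no = t2.mobile_no
    GROUP BY t1.mobile_no
) s
WHERE cnt_exclude = 0
ORDER BY mobile_no
"""
kept = [row[0] for row in conn.execute(rewrite)]
print(kept)
```

Because the rewrite groups by mobile_no, it returns one row per number and drops a number as soon as any of its rows has a conflicting cell in the window. The original NOT EXISTS query works row by row, so the two can differ on numbers like B in this sample.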
It seems to me that window functions would be better for both databases. Let me assume that access_time is a Unix time (i.e. measured in seconds); if it is not, you can easily convert the value to one. Five hours is 18000 seconds, hence the 17999 in the window frame:
SELECT t1.mobile_no
FROM (SELECT t1.*,
MIN(t1.cell) OVER (PARTITION BY mobile_no
ORDER BY access_time
RANGE BETWEEN 17999 preceding AND CURRENT ROW
) as min_cell,
MAX(t1.cell) OVER (PARTITION BY mobile_no
ORDER BY access_time
RANGE BETWEEN 17999 preceding AND CURRENT ROW
) as max_cell
FROM db.temp t1
) t1
WHERE min_cell = max_cell;
I have a Hive table 'Orders' with four columns (id String, name String, Order String, ts String). Sample data is shown below.
-------------------------------------------------
id   name   order       ts
-------------------------------------------------
1    abc    completed   2018-04-12 08:15:26
2    def    received    2018-04-15 06:20:17
3    ghi    processed   2018-04-16 11:36:56
4    jkl    received    2018-04-05 12:23:34
3    ghi    received    2018-03-23 16:43:46
1    abc    processed   2018-03-17 18:39:22
1    abc    received    2018-02-25 20:07:56
The Order column has three states: received -> processed -> completed. There are many orders for a single name, and each goes through these three stages. I need the latest value of order for a given 'id' and 'name'. This may seem like a novice question, but I am stuck on it.
I tried writing queries like the ones below, but they do not work, and I couldn't use the max function directly on the 'ts' column because it is in String format. Please advise on the best method.
Thanks in advance.
Queries I tried:
SELECT
ORDER
FROM Orders
WHERE id = '1'
AND name = 'ghi'
AND ts = (
SELECT max(unix_timestamp(ts, 'yyyy-MM-dd HH:mm:SS'))
FROM Orders
)
Error while compiling statement: FAILED: ParseException line 2:0 cannot recognize input near 'select' 'max' '(' in expression specification
SELECT
ORDER
FROM Orders
WHERE id = '1'
AND name = 'ghi'
AND max(unix_timestamp(ts, 'yyyy-MM-dd HH:mm:SS'))
Error while compiling statement: FAILED: SemanticException [Error 10128]: Line 1:93 Not yet supported place for UDAF 'max'
select o.order from Orders o
inner join (
select id, name, order, max(ts) as ts
from Orders
group by id, name, order
) ord on o.id = ord.id and o.name = ord.name and o.ts = ord.ts where o.id = '1' and o.name = 'abc'
This query executed, but the output is not the single latest order stage; it returns every order stage with its own latest timestamp.
Please help.
For a given order, you want one row. Hence, you can use order by and limit:
SELECT o.*
FROM Orders o
WHERE id = 1 AND -- presumably id is a number
name = 'ghi'
ORDER BY ts DESC
LIMIT 1;
This should also have the best performance.
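As a quick check of the ORDER BY ... LIMIT 1 approach, here is a minimal sketch that loads the sample rows from the question into Python's built-in sqlite3 (an assumption for illustration; the question itself is about Hive). It uses id '1' and name 'abc', since that pair exists in the sample data, and quotes the column name "order" because it is a reserved word. It also shows that timestamps stored as 'yyyy-MM-dd HH:mm:ss' strings compare in chronological order, so MAX works on the string column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "order" is quoted because ORDER is a reserved word.
conn.execute('CREATE TABLE Orders (id TEXT, name TEXT, "order" TEXT, ts TEXT)')
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [
        ("1", "abc", "completed", "2018-04-12 08:15:26"),
        ("2", "def", "received", "2018-04-15 06:20:17"),
        ("3", "ghi", "processed", "2018-04-16 11:36:56"),
        ("4", "jkl", "received", "2018-04-05 12:23:34"),
        ("3", "ghi", "received", "2018-03-23 16:43:46"),
        ("1", "abc", "processed", "2018-03-17 18:39:22"),
        ("1", "abc", "received", "2018-02-25 20:07:56"),
    ],
)

# Newest row for one (id, name) pair: sort by ts descending, keep the first row.
latest = conn.execute(
    'SELECT "order", ts FROM Orders '
    "WHERE id = ? AND name = ? ORDER BY ts DESC LIMIT 1",
    ("1", "abc"),
).fetchone()

# Zero-padded timestamps sort lexicographically in time order,
# so MAX on the string column finds the same row.
max_ts = conn.execute(
    "SELECT MAX(ts) FROM Orders WHERE id = ? AND name = ?", ("1", "abc")
).fetchone()[0]

print(latest, max_ts)
```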
You can use the RANK analytic function to solve your problem, as below. Note the descending order on ts, so that rank 1 is the latest row:
select id,name,order,ts
from (select id,name,order,ts,rank() over(partition by id,name order by ts desc) r from orders)k
where r = 1
and id = '1'
and name = 'ghi'
If you want the latest record for every id and name, simply drop the filters on id and name and you will get the desired result.
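The per-group version can be sketched as follows, again using Python's built-in sqlite3 as a stand-in for Hive (an assumption; it needs an SQLite build with window function support, 3.25 or later). The rows are the sample data from the question, and the window is ordered by ts descending so that rank 1 is the newest row in each (id, name) group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Orders (id TEXT, name TEXT, "order" TEXT, ts TEXT)')
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [
        ("1", "abc", "completed", "2018-04-12 08:15:26"),
        ("2", "def", "received", "2018-04-15 06:20:17"),
        ("3", "ghi", "processed", "2018-04-16 11:36:56"),
        ("4", "jkl", "received", "2018-04-05 12:23:34"),
        ("3", "ghi", "received", "2018-03-23 16:43:46"),
        ("1", "abc", "processed", "2018-03-17 18:39:22"),
        ("1", "abc", "received", "2018-02-25 20:07:56"),
    ],
)

# Rank rows inside each (id, name) group, newest first, and keep rank 1.
query = """
SELECT id, name, "order", ts
FROM (
    SELECT id, name, "order", ts,
           RANK() OVER (PARTITION BY id, name ORDER BY ts DESC) AS r
    FROM Orders
) k
WHERE r = 1
ORDER BY id
"""
latest_per_group = conn.execute(query).fetchall()
for row in latest_per_group:
    print(row)
```

RANK returns ties, so two rows with the same latest ts in one group would both come back; ROW_NUMBER would be the choice if exactly one row per group is required.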
All the best!!!
I have a query to select the last 5 vehicles that reported most recently. The reports are stored in the "VEHICLE_IN" table.
SELECT R.REG_NO FROM
(SELECT G.REG_NO FROM VEHICLE_IN G
INNER JOIN VEHICLE_MASTER M ON G.REG_NO=M.REG_NO
WHERE to_date((IN_DATE||' '||IN_TIME),'YYYY-MM-DD HH24:MI') >= (SYSDATE-30/1440)
AND M.VEHICLE_STAND='STAND1'
ORDER BY G.IN_DATE DESC, G.IN_TIME DESC) R
WHERE ROWNUM <= 5
GROUP BY R.REG_NO ;
[Record format (VARCHAR2(20)): IN_DATE : '2016-03-21'; IN_TIME: '18:27']
The query returns an error, but the same condition works in another query.
Can anybody please help me find the mistake?
My query is:
select count(*) as cnt,
EXTRACT(day FROM current_date - min(txdate))::int as days,
sum (Select opening from acledgerbal l
where acname='Arv'
union all
Select sum(v2.debit-v2.credit) as opening from acvoucher2 v2 where
txdate<='05/03/2014') as opening
from acduebills acb,acledger l
where (acb.opening+acb.debit-acb.credit) > 0
and acb.unitname='Sales'
and l.acname='Arv'
and l.acno=acb.acno
Here it shows the error "more than one row returned by a subquery used as an expression".
How do I use sum on the subquery?
I'm using PostgreSQL 9.1.
EDIT:
I want to get the count of rows in the acduebills table where (acb.opening+acb.debit-acb.credit) > 0 and acb.unitname='Sales'. Next, I want the difference in days between today and the minimum date under the same condition. Finally, I want the opening, which comes from two tables: acledgerbal and acvoucher2. The acvoucher2 table is filtered by the txdate condition.
How can I get these details in a single query? How can I get the same details across multiple schemas?
Something like this:
SELECT count(*) AS cnt
, current_date - min(txdate)::date AS days -- subtract dates directly
, (SELECT round(sum(opening)::numeric, 2)
FROM (
SELECT opening
FROM acledgerbal
WHERE acname = 'Arv'
UNION ALL
SELECT debit - credit
FROM acvoucher2
WHERE txdate <= '2014-05-03'
) sub
) AS opening
FROM acduebills b
JOIN acledger l USING (acno)
WHERE ((b.opening + b.debit) - b.credit) > 0
AND b.unitname ='Sales'
AND l.acname = 'Arv';
round() to decimal places only works with type numeric, so I cast the sum.
The problem is in the following statement:
sum ( Select opening from acledgerbal l
where acname='Arv'
union all
Select sum(v2.debit-v2.credit) as opening from acvoucher2 v2 where
txdate<='05/03/2014' )
You use UNION ALL, so this subquery can return more than one row: one for each matching row in acledgerbal plus one from the aggregate over acvoucher2. A subquery used as an expression must return a single row, hence the error "more than one row returned by a subquery used as an expression".
Try changing it to:
(Select SUM(opening) from acledgerbal l WHERE acname='Arv')
+
(Select SUM(v2.debit-v2.credit) as opening from acvoucher2 v2
WHERE txdate<='05/03/2014')
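Here is a minimal sketch of both fixes, run in Python's built-in sqlite3 instead of PostgreSQL (an assumption for illustration). The rows are invented, txdate is stored as ISO text, and the cutoff is written as '2014-05-03', which assumes the original '05/03/2014' means 3 May. The COALESCE calls are an addition: without them, the sum of the two subqueries is NULL whenever either table has no matching rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE acledgerbal (acname TEXT, opening INTEGER)")
conn.execute("CREATE TABLE acvoucher2 (txdate TEXT, debit INTEGER, credit INTEGER)")
# Invented sample rows, for illustration only.
conn.executemany(
    "INSERT INTO acledgerbal VALUES (?, ?)",
    [("Arv", 100), ("Arv", 50), ("Other", 999)],
)
conn.executemany(
    "INSERT INTO acvoucher2 VALUES (?, ?, ?)",
    [("2014-04-01", 30, 10), ("2014-05-02", 5, 0), ("2014-06-01", 1000, 0)],
)

# Fix 1: add two scalar subqueries, each of which returns exactly one row.
two_scalars = conn.execute(
    """
    SELECT COALESCE((SELECT SUM(opening) FROM acledgerbal WHERE acname = 'Arv'), 0)
         + COALESCE((SELECT SUM(debit - credit) FROM acvoucher2
                     WHERE txdate <= '2014-05-03'), 0)
    """
).fetchone()[0]

# Fix 2: keep the UNION ALL, but aggregate over it as a derived table.
derived = conn.execute(
    """
    SELECT SUM(opening) FROM (
        SELECT opening FROM acledgerbal WHERE acname = 'Arv'
        UNION ALL
        SELECT debit - credit FROM acvoucher2 WHERE txdate <= '2014-05-03'
    ) sub
    """
).fetchone()[0]

print(two_scalars, derived)
```

Both forms give the same total here, because each one reduces the rows to a single value before that value is used as an expression.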
Hi, I want to convert my SQL subselect query to HQL. My SQL query is shown below:
select distinct sum(goal_score) from(
select user_id,max(goal_score) goal_score from sc_student_final_results ssfr
where month=8 and year=2013 group by goal_id,user_id) ssfr group by ssfr.user_id
I converted the above native SQL command to HQL as shown below:
select distinct sum(goalScore) FROM (select userId,max(goalScore) goalScore FROM
StudentFinalResults sr where year=:year and month=:month and locationId =:siteid
group by userId,goalId) sr group by sr.userId
But I am getting this error:
org.hibernate.hql.PARSER - line 1:37: unexpected token: (
org.hibernate.hql.PARSER - line 1:52: unexpected token: max
unexpected token: ( near line 1, column 37 [select distinct sum(goalScore)
FROM (select userId,max(goalScore) goalScore FROM
net.sankhya.scorecards.model.StudentFinalResults sr where year=:year and
month=:month and locationId =:siteid group by userId,goalId) sr group by sr.userId]
Please update your query as follows; it might work:
select distinct sum(sr3.goalScore) FROM (select userId,max(goalScore) as goalScore FROM
StudentFinalResults sr2 where year=:year and month=:month and locationId =:siteid
group by userId,goalId) sr3 group by sr3.userId