SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns - hive

The query below works fine in Oracle, but it does not work in Hive.
SELECT Q.tm_mo_id,
'1380' AS mrc_cd,
NVL (R.itm_profit_ctr_cd, '99') AS profit_center_cd,
MAX(CASE R.itm_profit_ctr_cd
WHEN NULL THEN 'UNASSIGN PROFIT CNTR'
ELSE R.itm_profit_ctr_ds
END) profit_center_desc,
SUM(Q.bp_grs_quota_am) AS mth_bp_plan_gts_am_usd,
SUM(Q.grs_quota_am) AS mth_ju_plan_gts_am_usd
FROM v_l_0002_gb_gds_us_quota_v_1 Q
LEFT JOIN
(SELECT * FROM
(SELECT ph_dtl_id,
itm_profit_ctr_cd,
MIN (itm_profit_ctr_ds) AS itm_profit_ctr_ds,
ROW_NUMBER () OVER (
PARTITION BY ph_dtl_id
ORDER BY COUNT(CASE profit_ctr_cd
WHEN 'JNJDUMMY' THEN NULL
WHEN '99' THEN NULL
ELSE profit_ctr_cd
END) DESC,
itm_profit_ctr_cd ASC) rn
FROM v_l_0002_gb_gds_us_sku_to_profit_center_lookup_v_1
GROUP BY ph_dtl_id,
itm_profit_ctr_cd) E
WHERE rn = 1 ) R
ON (Q.ph_dtl_id = R.ph_dtl_id)
WHERE SUBSTR (Q.tm_mo_id, 1, 4) = '2016'
GROUP BY Q.tm_mo_id,
NVL(R.itm_profit_ctr_cd, '99')

Related

Reset ROW_NUMBER() after break

I have the following SQL Server query, and I'm having some trouble using ROW_NUMBER() in it.
Select
ROW_NUMBER() OVER (Partition by R.DriverName, CASE WHEN Sum(R.Points) > 0 THEN 1 ELSE 0 END Order By E.EventDate ASC) As 'RowID',
CASE WHEN Sum(R.Points) > 0 THEN 1 ELSE 0 END As 'PointsId',
R.DriverName,
R.EventID,
Format(E.EventDate, 'dd/MM/yyyy') as 'Event Date'
From Races R
Inner Join Events E
On E.EventID = R.EventID
Where R.SeriesID Like 'FOE' And E.EventType Like 'R' And R.DriverName Like 'Lucas di Grassi'
Group By R.DriverName, R.EventID, E.EventDate
Order By E.EventDate
And get the following result:
I want the RowID to reset to 1 after each 0 in the PointsId column and then count up again until the next 0.
Can anyone help?
Thank you,
Vítor
You need nested analytic functions:
Select
-- the outer query can only see the derived table, so reference its alias dt
ROW_NUMBER() OVER (Partition by dt.DriverName, dt.grp Order By dt.EventDate ASC) As 'RowID',
...
from
(
Select
-- assign a number to each group of rows
SUM(CASE WHEN Sum(R.Points) > 0 THEN 1 ELSE 0 END)
OVER (Partition by R.DriverName
Order By E.EventDate ASC) As grp,
...
From Races R
Inner Join Events E
On E.EventID = R.EventID
Where R.SeriesID Like 'FOE' And E.EventType Like 'R' And R.DriverName Like 'Lucas di Grassi'
Group By R.DriverName, R.EventID, E.EventDate
) as dt
Order By dt.EventDate
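To make the nested approach concrete, here is a minimal runnable sketch using Python's sqlite3 module (it needs SQLite 3.25 or later for window functions). The results table, its columns and the sample rows are made up for illustration, with one row per event already aggregated. Two assumptions are baked in: the running sum counts zero-point rows (so a new group starts at each 0, as the question describes), and the points flag stays in the partition so that the first scoring row after a 0 gets RowID 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, already-aggregated data: one row per driver and event.
conn.executescript("""
CREATE TABLE results (driver TEXT, event_date TEXT, points INTEGER);
INSERT INTO results VALUES
  ('A', '2015-01-01', 10),
  ('A', '2015-02-01', 5),
  ('A', '2015-03-01', 0),
  ('A', '2015-04-01', 8),
  ('A', '2015-05-01', 3),
  ('A', '2015-06-01', 0),
  ('A', '2015-07-01', 1);
""")

query = """
SELECT driver, event_date, points_id,
       -- outer window: number the rows inside each (driver, flag, group)
       ROW_NUMBER() OVER (PARTITION BY driver, points_id, grp
                          ORDER BY event_date) AS row_id
FROM (
    SELECT driver, event_date,
           CASE WHEN points > 0 THEN 1 ELSE 0 END AS points_id,
           -- inner window: running count of zero-point rows = group number
           SUM(CASE WHEN points > 0 THEN 0 ELSE 1 END)
               OVER (PARTITION BY driver ORDER BY event_date) AS grp
    FROM results
) AS dt
ORDER BY event_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The same two-level shape should carry over to SQL Server, where the inner window function can wrap the Sum(R.Points) aggregate directly.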

How to get multiple columns in Crosstab

I would like a cross table from the following table.
The cross table should look like this
A pivot table does not seem to solve the problem, because it can only pivot one column at a time, and in our case we are dealing with four different columns (payment, month, year and free of charge).
I solved the problem by splitting these four columns into four separate pivot tables, using temporary tables, and finally reassembling the results. But this is complicated, long and confusing; in short, not very nice.
The years and months should be shown in ascending order, exactly as in the cross table above.
I have been looking for a solution for quite a while, but I can't find the same problem anywhere.
If someone could give me a short, elegant solution I would be very grateful.
The problem definition is at http://www.sqlfiddle.com/#!18/7216f/2.
Thank you!
You can rank records by date in a subquery with row_number(), and then pivot with conditional aggregation:
select
ClientId,
max(case when rn = 1 then Payment end) Payment1,
max(case when rn = 2 then Payment end) Payment2,
max(case when rn = 3 then Payment end) Payment3,
max(case when rn = 1 then [Month] end) Month1,
max(case when rn = 2 then [Month] end) Month2,
max(case when rn = 3 then [Month] end) Month3,
max(case when rn = 1 then [Year] end) Year1,
max(case when rn = 2 then [Year] end) Year2,
max(case when rn = 3 then [Year] end) Year3,
max(case when rn = 1 then FreeOfCharge end) FreeOfCharge1,
max(case when rn = 2 then FreeOfCharge end) FreeOfCharge2,
max(case when rn = 3 then FreeOfCharge end) FreeOfCharge3
from (
select
t.*,
row_number() over(partition by ClientId order by [Year], [Month]) rn
from mytable t
) t
group by ClientId
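For anyone who wants to try the row_number() plus conditional aggregation idea without a SQL Server instance, here is a small sketch using Python's sqlite3 module (SQLite 3.25 or later). The sample rows are invented and only the Payment and Year columns are pivoted to keep it short; the other columns would follow the same pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample data; column names follow the answer above.
conn.executescript("""
CREATE TABLE mytable (ClientId INTEGER, Payment INTEGER,
                      [Month] INTEGER, [Year] INTEGER, FreeOfCharge INTEGER);
INSERT INTO mytable VALUES
  (1, 100, 1, 2020, 0),
  (1, 200, 3, 2019, 1),
  (1, 150, 2, 2020, 0),
  (2,  50, 5, 2021, 1);
""")

query = """
SELECT ClientId,
       MAX(CASE WHEN rn = 1 THEN Payment END) AS Payment1,
       MAX(CASE WHEN rn = 2 THEN Payment END) AS Payment2,
       MAX(CASE WHEN rn = 3 THEN Payment END) AS Payment3,
       MAX(CASE WHEN rn = 1 THEN [Year] END)  AS Year1,
       MAX(CASE WHEN rn = 2 THEN [Year] END)  AS Year2,
       MAX(CASE WHEN rn = 3 THEN [Year] END)  AS Year3
FROM (
    -- rank each client's rows chronologically
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY ClientId
                              ORDER BY [Year], [Month]) AS rn
    FROM mytable t
) t
GROUP BY ClientId
ORDER BY ClientId
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Clients with fewer than three rows simply get NULL (None in Python) in the unused slots, because a CASE without ELSE yields NULL and MAX ignores it.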
You can join the table with itself a few times, as in:
with p as (
select
*, row_number() over(partition by clientid order by year, month) as n
from Payment
)
select
p1.clientid,
p1.payment, p2.payment, p3.payment,
p1.month, p2.month, p3.month,
p1.year, p2.year, p3.year,
p1.freeofcharge, p2.freeofcharge, p3.freeofcharge
from p p1
left join p p2 on p2.clientid = p1.clientid and p2.n = 2
left join p p3 on p3.clientid = p1.clientid and p3.n = 3
where p1.n = 1
See Fiddle.

Hive rolling sum of data over date

I am working on Hive and am facing an issue with rolling counts. The sample data I am working on is as shown below:
and the output I am expecting is as shown below:
I tried using the following query but it is not returning the rolling count:
select event_dt, status, count(distinct account)
from (select *,
row_number() over (partition by account order by event_dt desc) as rnum
from table.A
where event_dt between '2018-05-02' and '2018-05-04') x
where rnum = 1
group by event_dt, status;
Please help me with this if someone has solved a similar issue.
You seem to just want conditional aggregation:
select event_dt,
sum(case when status = 'Registered' then 1 else 0 end) as registered,
sum(case when status = 'active_acct' then 1 else 0 end) as active_acct,
sum(case when status = 'suspended' then 1 else 0 end) as suspended,
sum(case when status = 'reactive' then 1 else 0 end) as reactive
from table.A
group by event_dt
order by event_dt;
EDIT:
This is a tricky problem. The solution I've come up with does a cross-product of dates and users and then calculates the most recent status as of each date.
So:
select a.event_dt,
sum(case when aa.status = 'Registered' then 1 else 0 end) as registered,
sum(case when aa.status = 'active_acct' then 1 else 0 end) as active_acct,
sum(case when aa.status = 'suspended' then 1 else 0 end) as suspended,
sum(case when aa.status = 'reactive' then 1 else 0 end) as reactive
from (select d.event_dt, ac.account, a.status,
max(case when a.status is not null then a.timestamp end) over (partition by ac.account order by d.event_dt) as last_status_timestamp
from (select distinct event_dt from table.A) d cross join
(select distinct account from table.A) ac left join
(select a.*,
row_number() over (partition by account, event_dt order by timestamp desc) as seqnum
from table.A a
) a
on a.event_dt = d.event_dt and
a.account = ac.account and
a.seqnum = 1 -- get the last one on the date
) a left join
table.A aa
on aa.timestamp = a.last_status_timestamp and
aa.account = a.account
group by a.event_dt
order by a.event_dt;
What this is doing is creating a derived table with rows for all accounts and dates. This has the status on certain days, but not all days.
The cumulative max for last_status_timestamp calculates the most recent timestamp that has a valid status. This is then joined back to the table to get the status on that date. Voila! This is the status used for the conditional aggregation.
The cumulative max and join is a work-around because Hive does not (yet?) support the ignore nulls option in lag().
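Here is a small runnable sketch of the cumulative max and join-back technique described above, using Python's sqlite3 module (SQLite 3.25 or later) rather than Hive. The events table, the ts column (standing in for timestamp) and the sample rows are assumptions made for the demo, and only three of the statuses are counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for table.A; ts plays the role of the timestamp column.
conn.executescript("""
CREATE TABLE events (account INTEGER, event_dt TEXT, ts TEXT, status TEXT);
INSERT INTO events VALUES
  (1, '2018-05-02', '2018-05-02 10:00', 'Registered'),
  (1, '2018-05-03', '2018-05-03 09:00', 'active_acct'),
  (2, '2018-05-02', '2018-05-02 11:00', 'Registered'),
  (2, '2018-05-04', '2018-05-04 08:00', 'suspended');
""")

query = """
SELECT g.event_dt,
       SUM(CASE WHEN e.status = 'Registered'  THEN 1 ELSE 0 END) AS registered,
       SUM(CASE WHEN e.status = 'active_acct' THEN 1 ELSE 0 END) AS active_acct,
       SUM(CASE WHEN e.status = 'suspended'   THEN 1 ELSE 0 END) AS suspended
FROM (
    -- one row per (date, account); the cumulative max carries the most
    -- recent status timestamp forward over dates with no event
    SELECT d.event_dt, ac.account,
           MAX(l.ts) OVER (PARTITION BY ac.account
                           ORDER BY d.event_dt) AS last_ts
    FROM (SELECT DISTINCT event_dt FROM events) d
    CROSS JOIN (SELECT DISTINCT account FROM events) ac
    LEFT JOIN (
        SELECT ev.*,
               ROW_NUMBER() OVER (PARTITION BY account, event_dt
                                  ORDER BY ts DESC) AS seqnum
        FROM events ev
    ) l ON l.event_dt = d.event_dt
       AND l.account = ac.account
       AND l.seqnum = 1           -- last event of that account on that date
) g
-- join back to pick up the status that belongs to the carried timestamp
LEFT JOIN events e ON e.ts = g.last_ts AND e.account = g.account
GROUP BY g.event_dt
ORDER BY g.event_dt
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Account 2 has no event on 2018-05-03, so it is still counted as Registered on that date; that carry-forward is the whole point of the cumulative max.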

SQL query min and max group by flag

I have a table as below :
How can I write a SQL SELECT statement that returns the MIN and MAX event date, grouped by FLAG (0, 1)?
So the result would be:
Just do conditional aggregation combined with a window function:
SELECT card_no, descr_reader,
max(CASE WHEN flag = 0 THEN event_date END) date_in,
max(CASE WHEN flag = 1 THEN event_date END) date_out
FROM
(
SELECT *,
COUNT(flag) OVER (PARTITION BY flag ORDER BY id) Seq
FROM test t
)t
GROUP BY card_no, descr_reader, Seq
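A quick way to see how the Seq column pairs each flag 0 row with the matching flag 1 row is to run the query on a few rows. The sketch below uses Python's sqlite3 module (SQLite 3.25 or later); the table name test and the sample rows are assumptions, since the original table was only shown as an image.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample: flag 0 = entry, flag 1 = exit, for a single card and reader.
conn.executescript("""
CREATE TABLE test (id INTEGER, card_no TEXT, descr_reader TEXT,
                   event_date TEXT, flag INTEGER);
INSERT INTO test VALUES
  (1, 'C1', 'Door', '2018-01-01 08:00', 0),
  (2, 'C1', 'Door', '2018-01-01 12:00', 1),
  (3, 'C1', 'Door', '2018-01-01 13:00', 0),
  (4, 'C1', 'Door', '2018-01-01 17:00', 1);
""")

query = """
SELECT card_no, descr_reader,
       MAX(CASE WHEN flag = 0 THEN event_date END) AS date_in,
       MAX(CASE WHEN flag = 1 THEN event_date END) AS date_out
FROM (
    -- the n-th flag 0 row and the n-th flag 1 row get the same seq
    SELECT *,
           COUNT(flag) OVER (PARTITION BY flag ORDER BY id) AS seq
    FROM test
) t
GROUP BY card_no, descr_reader, seq
ORDER BY date_in
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that the window partitions by flag only, as in the answer. With several cards in the table you would probably want card_no in the PARTITION BY as well, so that each card gets its own sequence.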
An alternative if window functions are not available:
SELECT
t1.card_no, t1.descr_reader,
t1.event_date date_in,
(select top 1 event_date from test t2
where t2.card_no = t1.card_no and
t2.reader_no = t1.reader_no and
t2.descr_reader = t1.descr_reader and
t2.event_date > t1.event_date and
t2.flag = 1
order by t2.event_date ) as date_out
FROM test t1
WHERE t1.flag = 0

Multiple Condition with Groupby

I am getting the following error with the code below:
"Msg 144, Level 15, State 1, Line 17
Cannot use an aggregate or a subquery in an expression used for the group by list of a GROUP BY clause."
SELECT [sddoc],
[Soldtopt],
[tradingname],
[DlvDate],
SUM(try_cast(Netvalue AS FLOAT)) AS Netvalue,
COUNT(DISTINCT SDDoc) AS Salesdoc ,
COUNT(DISTINCT
CASE
WHEN Netvalue = '0'
THEN 1
ELSE NULL
END) AS ZeroValue ,
COUNT(DISTINCT SDDoc) - COUNT(DISTINCT
CASE
WHEN Netvalue = '0'
THEN 1
ELSE NULL
END) AS Result
FROM d1
WHERE dlvdate='25.01.2017'
GROUP BY
CASE
WHEN SUM(try_cast(Netvalue AS FLOAT)) = 0
AND COUNT(DISTINCT SDDoc) = 1
AND COUNT(DISTINCT
CASE
WHEN Netvalue = '0'
THEN 1
ELSE NULL
END) = 1
THEN [sddoc]
END,
Soldtopt,
tradingname,
DlvDate
You can't use SUM or COUNT (aggregates) in the GROUP BY clause. Aggregate values must be calculated after groups are defined.
Also, your CASE lacks an ELSE clause.
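One common way around this restriction is to compute the aggregates in a derived table first and then group or filter on the results in an outer query. The sketch below shows that shape with Python's sqlite3 module. The intended output of the original query is not clear from the post, so the outer query (counting documents per sold-to party and how many of them total zero) is only a guess, the sample rows are invented, and CAST stands in for SQL Server's try_cast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample rows; netvalue is stored as text, as in the question.
conn.executescript("""
CREATE TABLE d1 (sddoc TEXT, soldtopt TEXT, netvalue TEXT);
INSERT INTO d1 VALUES
  ('D1', 'P1', '0'),
  ('D2', 'P1', '10.5'),
  ('D2', 'P1', '4.5'),
  ('D3', 'P2', '0'),
  ('D3', 'P2', '0');
""")

query = """
SELECT soldtopt,
       COUNT(*) AS salesdoc,
       SUM(CASE WHEN doc_total = 0 THEN 1 ELSE 0 END) AS zero_value,
       COUNT(*) - SUM(CASE WHEN doc_total = 0 THEN 1 ELSE 0 END) AS result
FROM (
    -- step 1: the aggregate is computed here, once per document
    SELECT sddoc, soldtopt,
           SUM(CAST(netvalue AS REAL)) AS doc_total
    FROM d1
    GROUP BY sddoc, soldtopt
) per_doc
-- step 2: doc_total is now an ordinary column and can be used freely
GROUP BY soldtopt
ORDER BY soldtopt
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If the goal is only to keep or drop whole groups based on an aggregate, a HAVING clause on the original GROUP BY may be enough and the derived table is not needed.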