I tried this SQL statement:
SELECT
custodycd,
SUM(mramt) mramt_6_month,
txdate,
CASE
WHEN LAG(mramt, 6) OVER (ORDER BY txdate) IS NOT NULL
THEN SUM(mramt) OVER (ORDER BY txdate ROWS BETWEEN 6 PRECEDING AND 1 PRECEDING)
ELSE NULL
END AS mramt_6_month_1
FROM
(SELECT
MAX(mramt) mramt,
t.afacctno,
t.custodycd,
t.txdate
FROM
tbl_mr3007_log t
WHERE
txdate >= '30/nov/2020'
AND mramt <> 0
GROUP BY
t.afacctno,
t.txdate,
t.custodycd)
GROUP BY
custodycd,
txdate
ORDER BY
txdate
and I get an error:
ORA-00979: not a GROUP BY expression
Thanks for your help
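The error message is accurate: LAG(mramt, 6) and the windowed SUM(mramt) refer to mramt directly, and mramt is neither in the outer GROUP BY nor wrapped in an aggregate. What you want to do is less clear, though.
I am guessing that for each custodycd you have multiple rows by date and that, starting at the seventh row, you want the sum of the previous six rows.
If so, then the code looks like: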
(CASE WHEN ROW_NUMBER() OVER (PARTITION BY custodycd ORDER BY txdate) > 6
THEN SUM(SUM(mramt)) OVER (PARTITION BY custodycd
ORDER BY txdate
ROWS BETWEEN 6 PRECEDING AND 1 PRECEDING
)
END) AS mramt_6_month_1
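To see how that expression behaves, here is a minimal sketch using Python's built-in sqlite3 module instead of Oracle. The table name log, the sample data, and the daily CTE (which does the aggregation first, so the window can use a plain SUM) are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the aggregated tbl_mr3007_log data
conn.execute("CREATE TABLE log (custodycd TEXT, txdate TEXT, mramt INTEGER)")
# One row per day for a single custodycd, with mramt = 1..8
sample = [("C1", f"2020-12-{d:02d}", d) for d in range(1, 9)]
conn.executemany("INSERT INTO log VALUES (?, ?, ?)", sample)

rolling_query = """
WITH daily AS (
    -- aggregate first, so the window functions only see grouped values
    SELECT custodycd, txdate, SUM(mramt) AS mramt
    FROM log
    GROUP BY custodycd, txdate
)
SELECT custodycd, txdate, mramt,
       CASE WHEN ROW_NUMBER() OVER (PARTITION BY custodycd ORDER BY txdate) > 6
            THEN SUM(mramt) OVER (PARTITION BY custodycd
                                  ORDER BY txdate
                                  ROWS BETWEEN 6 PRECEDING AND 1 PRECEDING)
       END AS mramt_6_month_1
FROM daily
ORDER BY custodycd, txdate
"""
rolling_rows = conn.execute(rolling_query).fetchall()
for row in rolling_rows:
    print(row)
```

The first six rows get NULL because fewer than six earlier rows exist. From the seventh row on, the column holds the sum of the six preceding rows and excludes the current one.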
I have the following query:
select b.month_date,total_signups,active_users from
(
SELECT date_trunc('month',confirmed_at) as month_date
, count(distinct id) as total_signups
FROM follower.users
WHERE confirmed_at::date >= dateadd(day,-90,getdate())::date
and (deleted_at is null or deleted_at > date_trunc('month',confirmed_at))
group by 1
) a ,
(
SELECT date_trunc('month', inv.created_at) AS month_date
,COUNT(DISTINCT em.user_id) AS active_users
FROM follower.invitees inv
INNER JOIN follower.events em
ON inv.event_id = em.event_id
where inv.created_at::date >= dateadd(day,-90,getdate())::date
GROUP BY 1
) b
where a.month_date=b.month_date
This returns three columns: month date, total signups, and active users. What I need is a fourth column with a rolling total of signups across all users. I've tried OVER and PARTITION BY with no luck. Could someone help? I'd appreciate it very much.
Try adding this column definition to your first Select:
SUM(total_signups)
OVER (ORDER BY b.month_date ASC rows between unbounded preceding and current row)
AS running_total
Here's a mini-demo
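A minimal sketch of the running total, run through Python's sqlite3 module rather than the original database. The table monthly and its three rows are made-up sample data standing in for the joined subqueries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the result of the joined subqueries a and b
conn.execute("CREATE TABLE monthly (month_date TEXT, total_signups INTEGER)")
conn.executemany(
    "INSERT INTO monthly VALUES (?, ?)",
    [("2021-01-01", 10), ("2021-02-01", 5), ("2021-03-01", 7)],
)

running_rows = conn.execute("""
SELECT month_date,
       total_signups,
       SUM(total_signups)
           OVER (ORDER BY month_date ASC
                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM monthly
ORDER BY month_date
""").fetchall()
for row in running_rows:
    print(row)
```

Each row's running_total is the sum of total_signups from the first month up to and including that row.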
I'm trying to compute a running total and reset it to 0 based on 2 conditions or if the limit is reached.
Here is an example.
As in the image above, I need to get the running total while the following conditions are met:
monthly discount = 0 and monthly ticket=1
If either discount=1 or ticket=0, the next value for the running total has to be 0.
running_total<50
If the running total is >=50, the running total has to restart from the value on the same row.
Here is what I'm trying to do now:
Is there any possibility to do this in HIVE? Thank you so much!!!
SELECT * ,
SUM(tag_flg) OVER (PARTITION BY account, flg_sum
ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
AS running_sum
FROM
( SELECT * ,
SUM(CASE
WHEN tag_flg>=50 THEN value
ELSE tag_flg
END) OVER (PARTITION BY account
ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
AS flg_sum
FROM
( SELECT * ,
CASE
WHEN month_disc =0
AND month_ticket = 1 THEN value
ELSE 0
END AS tag_flg
FROM source_table) x) y
Do the 40, 60 and 20 that aren't being accounted for matter at all in your report? For example, would you want them to be counted, and then a new row added with a total of 0 to restart?
Here is the way I managed to do it:
SELECT *,
SUM(case when month_disc=1 OR month_ticket=0 then 0 else value end) OVER (PARTITION BY account, flg_sum, band_sum ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_sum
FROM (
SELECT *,
FLOOR(SUM(case when month_disc=1 OR month_ticket=0 then 0 else value end) OVER (PARTITION BY account, flg_sum ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)/50.000001) as band_sum ---- create bands for running total
FROM (
SELECT *,
SUM(tag_flg) OVER (PARTITION BY account ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS flg_sum
FROM (
SELECT *,
CASE WHEN (month_disc=1 OR month_ticket=0) THEN 1 ELSE 0 END AS tag_flg ---- flag to count when the value is reset due to one of the conditions
FROM source_table) x ) y) z
The schema is like this: the whole dataset should be ordered by machine_id first, then by ss2k. After that, for each machine, we should find all rows that belong to a run of at least 5 consecutive rows with flag = 'census'. In this dataset, the result should be all the yellow rows.
I cannot return the last 4 rows of the yellow blocks by using this:
drop table if exists qz_panel_census_228_rank;
create table qz_panel_census_228_rank as
select t.*
from (select t.*,
count(*) filter (where flag = 'census') over (partition by machine_id, date order by ss2k rows between current row and 4 following) as census_cnt5,
count(*) filter (where flag = 'census') over (partition by machine_id, date) as count_census,
row_number() over (partition by machine_id, date order by ss2k) as seqnum,
count(*) over (partition by machine_id, date) as cnt
from qz_panel_census_228 t
) t
where census_cnt5 = 5
group by 1,2,3,4,5,6,7,8,9,10,11
DISTRIBUTED BY (machine_id);
You were close, but you need to search in both directions:
select t.*
from (select t.*,
case when count(*) filter (where flag = 'census')
over (partition by machine_id, date
order by ss2k
rows between 4 preceding and current row) = 5
or count(*) filter (where flag = 'census')
over (partition by machine_id, date
order by ss2k
rows between current row and 4 following) = 5
then 1
else 0
end as in_census_run -- separate name, so it does not clash with the existing flag column
from qz_panel_census_228 t
) t
where in_census_run = 1
Edit:
This approach will not work unless you add an extra count for each possible 5 row window, e.g. 3 preceding and 1 following, 2 preceding and 2 following, etc. This results in ugly code and is not very flexible.
The common way to solve this gaps & islands problem is to assign consecutive rows to a common group first:
select *
from
(
select t2.*,
count(*) over (partition by machine_id, date, grp) as cnt
from
(
select t1.*
from (select t.*,
-- keep the same number for 'census' rows
sum(case when flag = 'census' then 0 else 1 end)
over (partition by machine_id, date
order by ss2k
rows unbounded preceding) as grp
from qz_panel_census_228 t
) t1
where flag = 'census' -- only census rows
) as t2
) t3
where cnt >= 5 -- only groups of at least 5 census rows
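The grouping trick can be checked on a small data set. This is a sketch only: it runs in SQLite via Python's sqlite3 module, the table panel and its rows are invented, and the date column is left out of the partitioning to keep the sample short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for qz_panel_census_228 (no date column)
conn.execute("CREATE TABLE panel (machine_id TEXT, ss2k INTEGER, flag TEXT)")
m1_flags = ["census"] * 5 + ["other"] + ["census"] * 3 + ["other"]
m2_flags = ["census"] * 4
panel_data = [("m1", i + 1, f) for i, f in enumerate(m1_flags)]
panel_data += [("m2", i + 1, f) for i, f in enumerate(m2_flags)]
conn.executemany("INSERT INTO panel VALUES (?, ?, ?)", panel_data)

island_rows = conn.execute("""
SELECT machine_id, ss2k
FROM (
    SELECT t1.*,
           COUNT(*) OVER (PARTITION BY machine_id, grp) AS cnt
    FROM (
        SELECT t.*,
               -- the counter only advances on non-census rows, so every
               -- run of consecutive census rows shares one grp value
               SUM(CASE WHEN flag = 'census' THEN 0 ELSE 1 END)
                   OVER (PARTITION BY machine_id
                         ORDER BY ss2k
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS grp
        FROM panel t
    ) t1
    WHERE flag = 'census'      -- only census rows
) t2
WHERE cnt >= 5                 -- only runs of at least 5 census rows
ORDER BY machine_id, ss2k
""").fetchall()
print(island_rows)
```

Only the first five rows of m1 come back. The run of three census rows on m1 and the run of four on m2 are both too short.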
Wow, there has to be a better way of doing this, but the only way I could figure out was to create blocks of consecutive 'census' values. This looks awful but might be a catalyst to a better idea.
with q1 as (
select
machine_id, recorded, ss2k, flag, date,
case
when flag = 'census' and
lag (flag) over (order by machine_id, ss2k) != 'census'
then 1
else 0
end as block
from foo
),
q2 as (
select
machine_id, recorded, ss2k, flag, date,
sum (block) over (order by machine_id, ss2k) as group_id,
case when flag = 'census' then 1 else 0 end as census
from q1
),
q3 as (
select
machine_id, recorded, ss2k, flag, date, group_id,
sum (census) over (partition by group_id order by ss2k) as max_count
from q2
),
groups as (
select group_id
from q3
group by group_id
having max (max_count) >= 5
)
select
q2.machine_id, q2.recorded, q2.ss2k, q2.flag, q2.date
from
q2
join groups g on q2.group_id = g.group_id
where
q2.flag = 'census'
If you run each query within the with clauses in isolation, I think you will see how this evolves.
I have to draft a SQL query which does the following:
Compare current week (e.g. week 10) amount to the average amount over previous 4 weeks (Week# 9,8,7,6).
Now I need to run the query on a monthly basis so say for weeks (10,11,12,13).
As of now I am running it four times giving the week parameter on each run.
For example my current query is something like this :
select curr.account_id, curr.amount, hist.AVG_Amt
from
(
select
to_char(run_date,'IW') Week_ID,
sum(amount) Amount,
account_id
from Transaction t
where to_char(run_date,'IW') = '10'
group by account_id,to_char(run_date,'IW')
) curr,
(
select account_id,
sum(amount) / count(to_char(run_date,'IW')) as AVG_Amt
from Transaction
where to_char(run_date,'IW') in ('6','7','8','9')
group by account_id
) hist
where
hist.account_id = curr.account_id
and curr.amount > 2*hist.AVG_Amt;
As you can see, if I have to run the above query for weeks 11, 12 and 13, I have to run it three separate times. Is there a way to consolidate or restructure the query so that I run it only once and get all the comparison data together?
One additional note: I need to export the data to Excel, which I do after running the query in PL/SQL Developer.
Thanks!
-Abhi
You can use a correlated sub-query to get the sum of amounts for the last 4 weeks for a given week.
select
to_char(run_date,'IW') Week_ID,
sum(amount) curAmount,
(select sum(amount)/4.0 from transaction
where account_id = t.account_id
and to_char(run_date,'IW') between to_char(t.run_date,'IW')-4
and to_char(t.run_date,'IW')-1
) hist_amount,
account_id
from Transaction t
where to_char(run_date,'IW') in ('10','11','12','13')
group by account_id,to_char(run_date,'IW')
Edit: Based on OP's comment on the performance of the query above, this can also be accomplished using lag to get the values of the previous rows. The number of weeks actually present among the last 4 can be counted using a case expression.
with sum_amounts as
(select to_char(run_date,'IW') wk, sum(amount) amount, account_id
from Transaction
group by account_id, to_char(run_date,'IW')
)
select wk, account_id, amount,
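-- sum of the previous four weekly amounts for the same account
1.0 * (lag(amount,1,0) over (partition by account_id order by wk) +
lag(amount,2,0) over (partition by account_id order by wk) +
lag(amount,3,0) over (partition by account_id order by wk) +
lag(amount,4,0) over (partition by account_id order by wk))
-- divided by the number of non-zero weeks; nullif avoids division by zero
/ nullif(case when lag(amount,1,0) over (partition by account_id order by wk) <> 0 then 1 else 0 end +
case when lag(amount,2,0) over (partition by account_id order by wk) <> 0 then 1 else 0 end +
case when lag(amount,3,0) over (partition by account_id order by wk) <> 0 then 1 else 0 end +
case when lag(amount,4,0) over (partition by account_id order by wk) <> 0 then 1 else 0 end, 0)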
as hist_avg_amount
from sum_amounts
I think that is what you are looking for:
with lagt as (select to_char(run_date,'IW') Week_ID, sum(amount) Amount, account_id
from Transaction t
group by account_id, to_char(run_date,'IW'))
select Week_ID, account_id, amount,
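(lag(amount,1,0) over (partition by account_id order by Week_ID) + lag(amount,2,0) over (partition by account_id order by Week_ID) +
lag(amount,3,0) over (partition by account_id order by Week_ID) + lag(amount,4,0) over (partition by account_id order by Week_ID)) / 4 as average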
from lagt;
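As an alternative to chaining four lag calls, a framed AVG can express "average of the previous 4 weeks" directly. This is a different technique from the answers above, sketched in SQLite via Python's sqlite3 module. The table tx, the integer column wk, and the sample amounts are assumptions, and a ROWS frame only matches "previous 4 weeks" when no week is missing for an account.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one or more transactions per account and ISO week number
conn.execute("CREATE TABLE tx (account_id TEXT, wk INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO tx VALUES (?, ?, ?)",
    [("A", 6, 10), ("A", 7, 20), ("A", 8, 30), ("A", 9, 40), ("A", 10, 100)],
)

weekly_rows = conn.execute("""
WITH weekly AS (
    SELECT account_id, wk, SUM(amount) AS amount
    FROM tx
    GROUP BY account_id, wk
)
SELECT account_id, wk, amount,
       -- average over up to four earlier weeks, excluding the current week
       AVG(amount) OVER (PARTITION BY account_id
                         ORDER BY wk
                         ROWS BETWEEN 4 PRECEDING AND 1 PRECEDING) AS hist_avg
FROM weekly
ORDER BY account_id, wk
""").fetchall()

# Weeks whose amount is more than twice the historical average
flagged_weeks = [r for r in weekly_rows if r[3] is not None and r[2] > 2 * r[3]]
print(flagged_weeks)
```

All weeks come back from one run, so the comparison for weeks 10 to 13 can be filtered from a single result set. The first week of each account has no history and gets a NULL average.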
I've been struggling with this query for two days:
select distinct a.id,
a.amount as amount1,
(select max (a.date) from t1 a where a.id=t.id and a.cesitc='0' and a.date<t.date) as date1,
t.id, t.amount as amount2, t.date as date2
from t1 a
inner join t1 t on t.id = a.id and a.cevexp in ('0', '1' )
and exists (select t.id from t1 t
where t.id= a.id and t.amount <> a.amount and t.date > a.date)
and t.cesitc='1' and t.dafms='2015-07-31' and t.date >='2015-04-30' and '2015-07-31' >= t.daefga
and '2015-07-31' <= t.daecga and t.cevexp='1' and t.amount >'1'
Some details: the goal is to compare the difference in valuation of assets (id). Column 2 (a.amount / amount1) is the one that needs to be corrected.
I would like a.amount / amount1 to be correlated with my subquery 'date1', which is currently not the case. The same criteria have to be applied to find the correct amount1.
The outcomes of this query are currently displaying like this :
Id  Amount1  Date1       id  amount2  date2
1   100      04/03/2014  1   150      30/06/2015
1   102      04/03/2014  1   150      30/06/2015
1   170      04/03/2014  1   150      30/06/2015
Amount1 matches every Date1 < date2 instead of only max(date1) < date2, which is why I get several amount1 values.
Thanks in advance for a helping hand :)
Have a good day!
You can access the previous row's data using a windowed aggregate function. There's no LEAD/LAG in Teradata, but it's easy to rewrite.
This will return the correct data for your example:
SELECT t.*,
MIN(amount) -- previous amount
OVER (PARTITION BY Id
ORDER BY date_valuation, dafms DESC
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS prev_amount,
MIN(date_valuation) -- previous date
OVER (PARTITION BY Id
ORDER BY date_valuation, dafms DESC
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS prev_date
FROM test5 AS t
QUALIFY cesitc = '1' -- return only the current row
If it doesn't work as expected you need to add more details of the applied logic.
Btw, if a column is a DECIMAL you shouldn't add quotes, 150 instead of '150'. And there's only one recommended way to write a date, using a date literal, e.g. DATE '2015-07-31'
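To illustrate why a one-row frame behaves like LAG, here is a small sketch in SQLite via Python's sqlite3 module, not Teradata. The table valuations and its rows are invented. SQLite has a native LAG, which makes it possible to compare the two side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical valuation history for a single asset id
conn.execute("CREATE TABLE valuations (id INTEGER, date_valuation TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO valuations VALUES (?, ?, ?)",
    [(1, "2014-03-04", 100), (1, "2015-01-15", 120), (1, "2015-06-30", 150)],
)

prev_rows = conn.execute("""
SELECT id, date_valuation, amount,
       -- the frame contains exactly one row (the previous one), so MIN just returns it
       MIN(amount) OVER (PARTITION BY id
                         ORDER BY date_valuation
                         ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS prev_amount,
       -- native LAG, included only for comparison
       LAG(amount) OVER (PARTITION BY id
                         ORDER BY date_valuation) AS lag_amount
FROM valuations
ORDER BY id, date_valuation
""").fetchall()
for row in prev_rows:
    print(row)
```

For the first row of each id the frame is empty, so prev_amount is NULL, the same as LAG without a default value.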
The final query :
SELECT a.id, a.mtvbie, a.date_valuation, t.id,
MIN(t.amount) -- previous amount
OVER (PARTITION BY t.Id
ORDER BY t.date_valuation, t.dafms DESC
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS prev_amount,
MIN(t.date_valuation) -- previous date
OVER (PARTITION BY t.Id
ORDER BY t.date_valuation, t.dafms DESC
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS prev_date
FROM test5 t
inner join test5 a on a.id=t.id
where t.amount <> a.amount and a.cesitc='1' and a.date_valuation > t.date_valuation and a.dafms ='2015-07-31' and another criteria....
QUALIFY row_number () over (partition by a.id order by a.cogarc)=1