I have a GROUP BY query returning the avg and max from a set of records. I need to return a new column with the latest value of one column ("records") based on another column ("dates").
This query
with x as (select 'A' process, 10 records, sysdate-5 dates from dual union all
select 'A' process, 20 records, sysdate-4 dates from dual union all
select 'A' process, 30 records, sysdate-3 dates from dual union all
select 'B' process, 25 records, sysdate-2 dates from dual union all
select 'B' process, 15 records, sysdate-1 dates from dual)
select process,
avg(records) avgu,
max(records) maxu
from x
group by process
order by 1
returns:
PROCESS  AVG  MAX
A         20   30
B         20   25
I need a new column (LATEST) with latest value of records based on dates, keeping the old columns too:
PROCESS  MAX  LATEST
A         30      30
B         25      15
I'm playing with some window functions like RANK OVER PARTITION but I can't get the desired outcome in a single query.
Thank you in advance for any idea.
Here's one option:
Sample data:
SQL> with x as (
  select 'A' process, 10 records, sysdate-5 dates from dual union all
  select 'A', 20, sysdate-4 from dual union all
  select 'A', 30, sysdate-3 from dual union all
  select 'B', 25, sysdate-2 from dual union all
  select 'B', 15, sysdate-1 from dual),
Query begins here: first find the latest value for each process, then, in the final query, aggregate the required values.
temp as
  (select process,
          records,
          dates,
          first_value(records) over (partition by process order by dates desc) latest
   from x
  )
select process,
       avg(records) avgu,
       max(records) maxu,
       max(latest) latest
from temp
group by process
order by 1;
P AVGU MAXU LATEST
- ---------- ---------- ----------
A 20 30 30
B 20 25 15
SQL>
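As an alternative to the two-step approach above, Oracle's FIRST/LAST aggregate (KEEP ... DENSE_RANK LAST) can pick the records value that belongs to the latest dates inside the same GROUP BY. This is a sketch of a different technique from the one in the answer; it reuses the sample data from the question.

```sql
with x as (
  select 'A' process, 10 records, sysdate-5 dates from dual union all
  select 'A', 20, sysdate-4 from dual union all
  select 'A', 30, sysdate-3 from dual union all
  select 'B', 25, sysdate-2 from dual union all
  select 'B', 15, sysdate-1 from dual)
select process,
       avg(records) avgu,
       max(records) maxu,
       -- among the rows that rank last by dates (the latest ones), take records;
       -- the outer max() only matters if several rows share the latest date
       max(records) keep (dense_rank last order by dates) latest
from x
group by process
order by 1;
```

For the sample data this should return the same rows as the output above (A: 20, 30, 30 and B: 20, 25, 15), without the intermediate CTE.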
Related
I have a table like below
ID_NUMBER  SALEDATA    SALEAMOUNT
1          2020-09-07  47,000
2          2020-03-25  51,470
3          2021-06-12  32,000
4          2018-10-12  37,560
I want to select the rows with the 2 most recent dates only. So my desired output would be like below
ID_NUMBER  SALEDATA    SALEAMOUNT
1          2020-09-07  47,000
3          2021-06-12  32,000
Can someone please guide me on where I would start with this in SQL? I tried using MAX(), but it only gives me the single most recent date.
Thank you!
In Standard SQL, you would use:
select t.*
from t
order by saledata desc
offset 0 row fetch first 2 row only;
Not all databases support fetch first. It might be spelled limit or select top or something else, depending on your database.
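To make the "spelled differently" remark concrete, here is a sketch of the same top-2 query in three common spellings. The table name t comes from the answer above; which form applies depends on your database and version, so check its documentation before relying on any of them.

```sql
-- Oracle 12c and later, PostgreSQL, Db2: standard row-limiting clause
select t.*
from t
order by saledata desc
fetch first 2 rows only;

-- MySQL, PostgreSQL, SQLite: LIMIT
select t.*
from t
order by saledata desc
limit 2;

-- SQL Server: TOP
select top (2) t.*
from t
order by saledata desc;
```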
Another option, with the rank analytic function. Sample data till line #7, query begins at line #9. See comments within code.
SQL> with test (id_number, saledata, saleamount) as
2 -- sample data
3 (select 1, date '2020-09-07', 47000 from dual union all
4 select 2, date '2020-03-25', 51470 from dual union all
5 select 3, date '2021-06-12', 32000 from dual union all
6 select 4, date '2018-10-12', 37560 from dual
7 )
8 -- sort them by date in descending order, fetch the first two rows
9 select id_number, saledata, saleamount
10 from (select t.*,
11 rank() over (order by saledata desc) rn
12 from test t
13 )
14 where rn <= 2
15 order by saledata;
ID_NUMBER SALEDATA SALEAMOUNT
---------- ---------- ----------
1 2020-09-07 47000
3 2021-06-12 32000
SQL>
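One thing to watch with this approach is ties. The sketch below adds a hypothetical duplicate date to the sample data (it is not in the question) and shows how rank, dense_rank and row_number number the rows differently.

```sql
with test (id_number, saledata) as
  (select 1, date '2021-06-12' from dual union all
   select 2, date '2020-09-07' from dual union all
   select 3, date '2020-09-07' from dual union all   -- hypothetical tie
   select 4, date '2018-10-12' from dual
  )
select id_number,
       saledata,
       rank()       over (order by saledata desc) rnk,        -- 1, 2, 2, 4
       dense_rank() over (order by saledata desc) dense_rnk,  -- 1, 2, 2, 3
       row_number() over (order by saledata desc) rn          -- 1, 2, 3, 4 (order within the tie is arbitrary)
from test
order by saledata desc;
```

With this data, filtering on rnk <= 2 returns three rows (both tied rows are kept), while rn <= 2 returns exactly two. Which one is right depends on whether "the 2 most recent dates" means two rows or two distinct dates.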
Select the top 2 rows from your data set, ordered by the saledata column descending.
I have multiple rows with values like
a_b_c_d_e_f and x_y_z_m_n_o
and I need a SQL query with a result like a+x_b+y_c+z_d+m.......
Sample data as requested
What I want to do is aggregate it by Datetime. Aggregating Total is simple, but how can I do that for the last column? Thanks.
Expected Result
Here's one option; read comments within code. I didn't feel like typing too much so two dates will have to do.
Sample data (you already have that, so don't type it; the code you need begins at line #10):
SQL> with
2 -- sample data
3 test (datum, total, col) as
4 (select date '2020-07-20', 100, '10,0,20,30,0' from dual union all
5 select date '2020-07-20', 150, '15,3,40,30,2' from dual union all
6 --
7 select date '2020-07-19', 200, '50,6,50,30,8' from dual union all
8 select date '2020-07-19', 300, '20,1,40,10,2' from dual
9 ),
Split CSV values into rows. Note the RB value which will help us sum matching values
10 -- split comma-separated values into rows
11 temp as
12 (select
13 datum,
14 total,
15 to_number(regexp_substr(col, '\d+', 1, column_value)) val,
16 column_value rb
17 from test cross join
18 table(cast(multiset(select level from dual
19 connect by level <= regexp_count(col, ',') + 1
20 ) as sys.odcinumberlist))
21 ),
Computing summaries is simple; nothing special about it. We'll keep the RB value as it'll be needed in the last step:
22 -- compute summaries
23 summary as
24 (select datum,
25 sum(total) total,
26 sum(val) sumval,
27 rb
28 from temp
29 group by datum, rb
30 )
The last step. Using LISTAGG, aggregate comma-separated values back, but this time added to each other:
31 -- final result
32 select datum,
33 total,
34 listagg(sumval, ',') within group (order by rb) new_col
35 from summary
36 group by datum, total
37 order by datum desc, total;
DATUM TOTAL NEW_COL
------------------- ---------- --------------------
20.07.2020 00:00:00 250 25,3,60,60,2
19.07.2020 00:00:00 500 70,7,90,40,10
SQL>
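The splitting step is the least obvious part, so here is a minimal sketch of it on a single literal string, outside the full query. regexp_count finds the number of elements and regexp_substr picks the Nth run of digits; the string is one of the sample values from above.

```sql
select level as rb,                                               -- position within the list
       to_number(regexp_substr('10,0,20,30,0', '\d+', 1, level)) as val
from dual
connect by level <= regexp_count('10,0,20,30,0', ',') + 1         -- 4 commas, so 5 elements
order by rb;
```

This returns five rows: (1, 10), (2, 0), (3, 20), (4, 30), (5, 0). Note that '\d+' skips empty elements, so a value such as '10,,20' would shift the positions; for non-numeric, underscore-separated values like those in the question, a pattern along the lines of '[^_]+' would be needed instead, and the values could not be summed with to_number.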
I've seen many examples of rolling averages in Oracle, but none do quite what I want.
This is my raw data
DATE SCORE AREA
----------------------------
01-JUL-14 60 A
01-AUG-14 45 A
01-SEP-14 45 A
02-SEP-14 50 A
01-OCT-14 30 A
02-OCT-14 45 A
03-OCT-14 50 A
01-JUL-14 60 B
01-AUG-14 45 B
01-SEP-14 45 B
02-SEP-14 50 B
01-OCT-14 30 B
02-OCT-14 45 B
03-OCT-14 50 B
This is the desired result for my rolling average
MMYY AVG AREA
-------------------------
JUL-14 60 A
AUG-14 52.5 A
SEP-14 50 A
OCT-14 44 A
JUL-14 60 B
AUG-14 52.5 B
SEP-14 50 B
OCT-14 44 B
The way I need it to work is that for each MMYY, I need to look back 3 months (including the current one) and AVG the scores per area. So, for example:
For Area A in OCT, there were 6 studies in the 3 months up to and including October: (45+45+50+30+45+50)/6 ≈ 44.17
Normally I would write the query like so
SELECT
AREA,
TO_CHAR(T.DT,'MMYY') MMYY,
ROUND(AVG(SCORE)
OVER (PARTITION BY AREA ORDER BY TO_CHAR(T.DT,'MMYY') ROWS BETWEEN 2 PRECEDING AND CURRENT ROW),1)
AS AVG
FROM T
This looks at the last 3 entries, not the last 3 months.
One way to do this is to mix aggregation functions with analytic functions. The key idea for average is to avoid using avg() and instead do a sum() divided by a count(*).
SELECT AREA, TO_CHAR(T.DT, 'MMYY') AS MMYY,
SUM(SCORE) / COUNT(*) as AvgScore,
SUM(SUM(SCORE)) OVER (PARTITION BY AREA ORDER BY MAX(T.DT) ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
/ SUM(COUNT(*)) OVER (PARTITION BY AREA ORDER BY MAX(T.DT) ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) as RollingAvgScore
FROM t
GROUP BY AREA, TO_CHAR(T.DT, 'MMYY') ;
Note the order by clause. If your data spans years, then using the MMYY format poses problems. It is better to use a format such as YYYY-MM for months, because the alphabetical ordering is the same as the natural ordering.
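One way to get a month key that sorts chronologically is to group on TRUNC(dt, 'MM') and only format it for display. The following is a sketch under that assumption, using the Area A rows from the question as inline sample data; the column names month_key and avg_score are made up for illustration.

```sql
with t (dt, score, area) as (
  select date '2014-07-01', 60, 'A' from dual union all
  select date '2014-08-01', 45, 'A' from dual union all
  select date '2014-09-01', 45, 'A' from dual union all
  select date '2014-09-02', 50, 'A' from dual union all
  select date '2014-10-01', 30, 'A' from dual union all
  select date '2014-10-02', 45, 'A' from dual union all
  select date '2014-10-03', 50, 'A' from dual
)
select area,
       to_char(trunc(dt, 'MM'), 'YYYY-MM') as month_key,
       round(
         -- rolling sum of monthly sums divided by rolling sum of monthly counts
         sum(sum(score)) over (partition by area order by trunc(dt, 'MM')
                               rows between 2 preceding and current row)
         / sum(count(*)) over (partition by area order by trunc(dt, 'MM')
                               rows between 2 preceding and current row), 1) as avg_score
from t
group by area, trunc(dt, 'MM')
order by area, trunc(dt, 'MM');
```

For this data the result is 60, 52.5, 50 and 44.2 for July through October. Keep in mind that ROWS counts grouped month rows, so if a month has no data at all, "2 preceding" reaches further back than two calendar months.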
You can specify also ranges, not only rows.
SELECT
AREA,
TO_CHAR(T.DT,'MMYY') MMYY,
ROUND(AVG(SCORE)
OVER (PARTITION BY AREA
ORDER BY DT RANGE BETWEEN INTERVAL '3' MONTH PRECEDING AND CURRENT ROW))
AS AVG
FROM T
Since CURRENT ROW is the default, just ORDER BY DT RANGE INTERVAL '3' MONTH PRECEDING should work as well. Perhaps you have to do some fine-tuning, I did not test the behaviour regarding the 28/29/30/31 days per month issue.
Check the Oracle Windowing Clause for further details.
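A possible refinement, sketched here as an assumption rather than as part of the answer above: order the window by TRUNC(dt, 'MM') and look back INTERVAL '2' MONTH, so that every row sees the current calendar month plus the two before it, then collapse the duplicates with DISTINCT. The sample rows are the Area A data from the question; month_key and avg_score are illustrative names.

```sql
with t (dt, score, area) as (
  select date '2014-07-01', 60, 'A' from dual union all
  select date '2014-08-01', 45, 'A' from dual union all
  select date '2014-09-01', 45, 'A' from dual union all
  select date '2014-09-02', 50, 'A' from dual union all
  select date '2014-10-01', 30, 'A' from dual union all
  select date '2014-10-02', 45, 'A' from dual union all
  select date '2014-10-03', 50, 'A' from dual
)
select distinct
       area,
       to_char(trunc(dt, 'MM'), 'YYYY-MM') as month_key,
       -- RANGE works on the ordering value, so all rows of the same month are
       -- peers and the window covers whole calendar months
       round(avg(score) over (partition by area
                              order by trunc(dt, 'MM')
                              range between interval '2' month preceding
                                        and current row), 1) as avg_score
from t
order by area, month_key;
```

Because the window is defined on calendar months rather than on a row count, a month without data does not stretch the lookback. For the sample data this gives 60, 52.5, 50 and 44.2.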
SQL> WITH DATA AS(
  SELECT to_date('01-JUL-14','DD-MON-RR') dt, 60 score, 'A' area FROM dual UNION ALL
  SELECT to_date('01-AUG-14','DD-MON-RR') dt, 45 score, 'A' area FROM dual UNION ALL
  SELECT to_date('01-SEP-14','DD-MON-RR') dt, 45 score, 'A' area FROM dual UNION ALL
  SELECT to_date('02-SEP-14','DD-MON-RR') dt, 50 score, 'A' area FROM dual UNION ALL
  SELECT to_date('01-OCT-14','DD-MON-RR') dt, 30 score, 'A' area FROM dual UNION ALL
  SELECT to_date('02-OCT-14','DD-MON-RR') dt, 45 score, 'A' area FROM dual UNION ALL
  SELECT to_date('03-OCT-14','DD-MON-RR') dt, 50 score, 'A' area FROM dual UNION ALL
  SELECT to_date('01-JUL-14','DD-MON-RR') dt, 60 score, 'B' area FROM dual UNION ALL
  SELECT to_date('01-AUG-14','DD-MON-RR') dt, 45 score, 'B' area FROM dual UNION ALL
  SELECT to_date('01-SEP-14','DD-MON-RR') dt, 45 score, 'B' area FROM dual UNION ALL
  SELECT to_date('02-SEP-14','DD-MON-RR') dt, 50 score, 'B' area FROM dual UNION ALL
  SELECT to_date('01-OCT-14','DD-MON-RR') dt, 30 score, 'B' area FROM dual UNION ALL
  SELECT to_date('02-OCT-14','DD-MON-RR') dt, 45 score, 'B' area FROM dual UNION ALL
  SELECT to_date('03-OCT-14','DD-MON-RR') dt, 50 score, 'B' area FROM dual)
SELECT TO_CHAR(T.DT, 'MON-RR') AS MMYY,
       round(
         SUM(SUM(SCORE)) OVER (PARTITION BY AREA ORDER BY MAX(T.DT) ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)/
         SUM(COUNT(*)) OVER (PARTITION BY AREA ORDER BY MAX(T.DT) ROWS BETWEEN 2 PRECEDING AND CURRENT ROW),1)
       AS avg_score,
       AREA
FROM data t
GROUP BY AREA, TO_CHAR(T.DT, 'MON-RR')
/
MMYY AVG_SCORE A
------ ---------- -
JUL-14 60 A
AUG-14 52.5 A
SEP-14 50 A
OCT-14 44.2 A
JUL-14 60 B
AUG-14 52.5 B
SEP-14 50 B
OCT-14 44.2 B
8 rows selected.
SQL>
From next time, I would expect you to provide the create and insert statements so that we don't have to spend time on preparing a test case.
And, why YY format? Haven't you seen the Y2K bug? Please use YYYY format.
I have an SQL table in which I keep project information coming from Primavera.
Suppose that I have columns for Start Date, End Date, Duration, and Total Qty, as shown below.
How can I distribute Total Qty over months using this information? What additional columns or SQL queries do I need in order to get a correct monthly distribution?
Thanks in advance.
Columns in order:
itemname,quantity,startdate,duration,enddate
item1 -- 108 -- 2013-03-25 -- 720 -- 2013-07-26
item2 -- 640 -- 2013-03-25 -- 720 -- 2013-07-26
.
.
I think the key is to break the records apart by month. Here is an example of how to do it:
with months as (
select 1 as mon union all select 2 union all select 3 union all
select 4 as mon union all select 5 union all select 6 union all
select 7 as mon union all select 8 union all select 9 union all
select 10 as mon union all select 11 union all select 12
)
select t.itemname, m.mon, t.quantity / t.nummonths as monthly_qty
from (select t.*, (month(enddate) - month(startdate) + 1) as nummonths
from t
) t join
months m
on month(t.startDate) <= m.mon and
month(t.endDate) >= m.mon;
This works because all the months are within the same year -- as in your example. You are quite vague on how the split should be calculated. So, I assumed that every month from the start to the end gets an equal amount.
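If the date ranges can cross a year boundary, one option is to count and generate months from the dates themselves instead of using month numbers 1 to 12. The sketch below is written in Oracle syntax (unlike the query above) and keeps the same equal-share assumption; the item2 dates are changed to a hypothetical range that spans New Year, and the names spans, mon and monthly_qty are made up for illustration.

```sql
with t (itemname, quantity, startdate, enddate) as (
  select 'item1', 108, date '2013-03-25', date '2013-07-26' from dual union all
  select 'item2', 640, date '2013-11-25', date '2014-02-26' from dual   -- hypothetical dates
),
spans as (
  -- number of calendar months touched by each item, counted across years
  select t.*,
         months_between(trunc(enddate, 'MM'), trunc(startdate, 'MM')) + 1 as nummonths
  from t
)
select s.itemname,
       to_char(add_months(trunc(s.startdate, 'MM'), n.column_value - 1), 'YYYY-MM') as mon,
       round(s.quantity / s.nummonths, 2) as monthly_qty
from spans s
cross join table(cast(multiset(
         -- one row per month of the item: 1 .. nummonths
         select level from dual connect by level <= s.nummonths
       ) as sys.odcinumberlist)) n
order by s.itemname, mon;
```

Here item1 covers 5 months at 21.6 each, and item2 covers 2013-11 through 2014-02 at 160 each. A split weighted by the number of days in each month would need a different calculation.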
I have two sets of pricing data (A and B). Set A consists of all of my pricing data per order over a month. Set B consists of all of my competitor's pricing data over the same month. I want to compare my competitor's lowest price to each of my prices per day.
Graphically, the data appears like this:
Date:-- Set A: -- Set B:
1---------25---------31
1---------54---------47
1---------23---------56
1---------12---------23
1---------76---------40
1---------42
I want to pass only the lowest price to a CASE statement which evaluates which prices are better. I would like to process an entire month's worth of data at once, so in my example, dates 1 through 30 (or 31) would be included and crunched together, and for each day only one value from set B would be included: the lowest price in the set.
Important notes: Set B does not have a datapoint for each point in Set A
Hopefully this makes sense. Thanks in advance for any help you may be able to render.
That's a strange example you have - do you really have prices ranging from 12 to 76 within a single day?
Anyway, left joining your (grouped) data with their (grouped) data should work (untested):
with
my_min_prices as (
select price_date, min(price_value) min_price from my_prices group by price_date),
their_min_prices as (
select price_date, min(price_value) min_price from their_prices group by price_date)
select
mine.price_date,
(case
when theirs.min_price is null then mine.min_price
when theirs.min_price >= mine.min_price then mine.min_price
else theirs.min_price
end) min_price
from
my_min_prices mine
left join their_min_prices theirs on mine.price_date = theirs.price_date
I'm still not sure that I understand your requirements. My best guess is that you want something like
SQL> with your_data as (
  select 1 date_id, 25 price_a, 31 price_b from dual
  union all
  select 1, 54, 47 from dual union all
  select 1, 23, 56 from dual union all
  select 1, 12, 23 from dual union all
  select 1, 76, 40 from dual union all
  select 1, 42, null from dual)
select date_id,
       sum( case when price_a < min_price_b
                 then 1
                 else 0
            end) better,
       sum( case when price_a = min_price_b
                 then 1
                 else 0
            end) tie,
       sum( case when price_a > min_price_b
                 then 1
                 else 0
            end) worse
from( select date_id,
             price_a,
             min(price_b) over (partition by date_id) min_price_b
      from your_data )
group by date_id
/
DATE_ID BETTER TIE WORSE
---------- ---------- ---------- ----------
1 1 1 4
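If the two price sets live in separate tables, a variation is to reduce the competitor's prices to one minimum per day first and then label each of your own rows. This is only a sketch: the table and column names (my_prices, their_prices, price_date, price_value) are borrowed from the earlier answer and are assumptions, and date 2 is a hypothetical day with no competitor price.

```sql
with my_prices (price_date, price_value) as (
  select 1, 25 from dual union all
  select 1, 54 from dual union all
  select 1, 23 from dual union all
  select 1, 12 from dual union all
  select 1, 76 from dual union all
  select 1, 42 from dual union all
  select 2, 30 from dual               -- hypothetical day without competitor data
),
their_prices (price_date, price_value) as (
  select 1, 31 from dual union all
  select 1, 47 from dual union all
  select 1, 56 from dual union all
  select 1, 23 from dual union all
  select 1, 40 from dual
),
their_min as (
  -- one row per day: the competitor's lowest price
  select price_date, min(price_value) as min_price
  from their_prices
  group by price_date
)
select m.price_date,
       m.price_value,
       t.min_price,
       case
         when t.min_price is null        then 'no competitor price'
         when m.price_value < t.min_price then 'better'
         when m.price_value = t.min_price then 'tie'
         else 'worse'
       end as verdict
from my_prices m
left join their_min t on t.price_date = m.price_date
order by m.price_date, m.price_value;
```

The left join keeps your rows for days on which the competitor has no price, which matches the note that Set B does not have a data point for each point in Set A. Counting the verdicts per day for date 1 gives the same 1 better, 1 tie, 4 worse as the output above.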