Aggregate in plsql - sql

ORGANIZATION_ID  BAY_ID  CASCADE_GROUP_ID  DOWNSTEAM_VALUE
----------------------------------------------------------
1001             100012  1                 2
1001             100014  1                 4
1001             100016  1                 6
1001             100018  1                 8
I need to create a view that aggregates the DOWNSTEAM_VALUE column of the table above. For each row, taken in ascending BAY_ID order, the result should be the DOWNSTEAM_VALUE of the current row plus the DOWNSTEAM_VALUE of every following row. For example, the first row (BAY_ID 100012) should show 2+4+6+8 = 20, and the next BAY_ID should show 4+6+8 = 18. The last BAY_ID has no further DOWNSTEAM_VALUE values to add, so it should show 8. The expected result is:
ORGANIZATION_ID  BAY_ID  CASCADE_GROUP_ID  DOWNSTEAM_VALUE
----------------------------------------------------------
1001             100012  1                 20
1001             100014  1                 18
1001             100016  1                 14
1001             100018  1                 8
Any help would be really appreciated. Thanks

You can use the SUM analytic function with a windowing clause for that, as below.
select ORGANIZATION_ID
     , BAY_ID
     , CASCADE_GROUP_ID
     , sum(DOWNSTEAM_VALUE) over (
           partition by ORGANIZATION_ID, CASCADE_GROUP_ID
           order by BAY_ID asc
           ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
       ) as DOWNSTEAM_VALUE
from your_table
;
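As a rough check of the windowing clause, the same query can be run against an in-memory SQLite database from Python. This is only a sketch: the table name bays is made up for the example, SQLite stands in for Oracle, and it assumes SQLite 3.25 or newer (the first release with window functions).

```python
import sqlite3

# Hypothetical table "bays" holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bays (ORGANIZATION_ID INT, BAY_ID INT, "
    "CASCADE_GROUP_ID INT, DOWNSTEAM_VALUE INT)"
)
conn.executemany(
    "INSERT INTO bays VALUES (?, ?, ?, ?)",
    [(1001, 100012, 1, 2), (1001, 100014, 1, 4),
     (1001, 100016, 1, 6), (1001, 100018, 1, 8)],
)

# Sum the current row and every later row within the same partition.
query = """
SELECT BAY_ID,
       SUM(DOWNSTEAM_VALUE) OVER (
           PARTITION BY ORGANIZATION_ID, CASCADE_GROUP_ID
           ORDER BY BAY_ID
           ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
       ) AS DOWNSTEAM_VALUE
FROM bays
ORDER BY BAY_ID
"""
result = conn.execute(query).fetchall()
print(result)  # [(100012, 20), (100014, 18), (100016, 14), (100018, 8)]
```

The frame starts at the current row and runs to the end of the partition, which is why the totals shrink as BAY_ID increases.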

Related

SQL Server: How to retrieve all record based on recent datetime

First off, apologies if this has been asked elsewhere as I was unable to find any solution. The best I get is retrieving latest 1 record or 2-3 records. I'm more in search of all records (the number could be dynamic, could be 1 or 2 or maybe 50+) based on recent Datetime value. Well so basically here is the problem,
I have a table as follows,
APILoadDatetime          RowId  ProjectId  Value
------------------------------------------------
2021-07-13 15:09:14.620  1      Proj-1     101
2021-07-13 15:09:14.620  2      Proj-2     81
2021-07-13 15:09:14.620  3      Proj-3     111
2021-07-13 15:09:14.620  4      Proj-4     125
2021-05-05 04:46:07.913  1      Proj-1     99
2021-05-05 04:46:07.913  2      Proj-2     69
2021-05-05 04:46:07.913  3      Proj-3     105
2021-05-05 04:46:07.913  4      Proj-4     115
...                      ...    ...        ...
What I am looking to do is, write up a query which will give me all the recent data based on Datetime, so in this case, I should get the following result,
APILoadDatetime          RowId  ProjectId  Value
------------------------------------------------
2021-07-13 15:09:14.620  1      Proj-1     101
2021-07-13 15:09:14.620  2      Proj-2     81
2021-07-13 15:09:14.620  3      Proj-3     111
2021-07-13 15:09:14.620  4      Proj-4     125
The RowId (as the name suggests) numbers the rows within a particular Datetime block. There will not always be 4 rows; the count is dynamic based on the data received, so it could be 1, 2, 4 or even 50+ ...
Hope I was able to convey the question properly, Thank you all for reading and Pre-Thank you to those who provide solution to this.
You can use the window function rank() to keep every row that shares the most recent APILoadDatetime:
select * from (
select * , rank() over (order by APILoadDatetime desc) rn
from tablename
) t where rn = 1
select top 1 with ties
*
from
tablename
order by
row_number() over(
partition by RowId
order by APILoadDatetime desc
);
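TOP 1 works together with WITH TIES here.
WITH TIES means that SELECT takes the first record (because of TOP 1) and also every other record that has the same ORDER BY value (because of WITH TIES). Here the ORDER BY value is the row_number(), so every record numbered 1, that is the latest record for each RowId, is returned.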
Update #1:
If you need the most recent record by APILoadDatetime, plus any other records that share that same APILoadDatetime, then the query is simpler:
select top 1 with ties
*
from
tablename
order by
APILoadDatetime desc;
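To make the effect of TOP 1 WITH TIES ordered by APILoadDatetime concrete, here is a small plain-Python sketch of the same idea: find the newest load time, then keep every row that carries it. The tuple layout and the helper name latest_batch are assumptions made for the illustration.

```python
from datetime import datetime

# Sample rows from the question: (APILoadDatetime, RowId, ProjectId, Value).
rows = [
    ("2021-07-13 15:09:14.620", 1, "Proj-1", 101),
    ("2021-07-13 15:09:14.620", 2, "Proj-2", 81),
    ("2021-07-13 15:09:14.620", 3, "Proj-3", 111),
    ("2021-07-13 15:09:14.620", 4, "Proj-4", 125),
    ("2021-05-05 04:46:07.913", 1, "Proj-1", 99),
    ("2021-05-05 04:46:07.913", 2, "Proj-2", 69),
    ("2021-05-05 04:46:07.913", 3, "Proj-3", 105),
    ("2021-05-05 04:46:07.913", 4, "Proj-4", 115),
]

FMT = "%Y-%m-%d %H:%M:%S.%f"

def latest_batch(rows):
    """Keep every row whose load time equals the newest load time."""
    newest = max(datetime.strptime(r[0], FMT) for r in rows)
    return [r for r in rows if datetime.strptime(r[0], FMT) == newest]

latest = latest_batch(rows)
for row in latest:
    print(row)
```

However many rows share the newest timestamp, all of them come back, which matches the "dynamic number of rows" requirement in the question.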

Count first occurring record per time period

In my table trips, I have two columns: created_at and user_id
Each user takes many different trips. My goal is to count, per year-month, the users whose very first trip falls in that month. I understand that in this case the min() function should be applied.
In a previous query, all unique users per year-month were aggregated:
SELECT to_char(created_at, 'YYYY-MM') as yyyymm, COUNT(DISTINCT user_id)
FROM trips
GROUP BY yyyymm
ORDER BY yyyymm;
Where in the above query should min() be integrated? In other words, instead of counting all unique user id's per month, I only need to count the first occurrence of unique user id per month.
The sample input would look like:
> routes
user_id created_at
1 1 2015-08-07 07:18:21
2 2 2015-05-06 20:43:52
3 3 2015-05-06 20:53:54
4 1 2015-03-30 20:09:07
5 2 2015-10-01 18:28:32
6 3 2015-08-07 07:29:29
7 1 2015-08-28 13:45:44
8 2 2015-08-07 07:37:31
9 3 2015-03-30 20:14:04
10 1 2015-08-07 07:08:50
And the output would be:
count Y-m
1 0 2015-01
2 0 2015-02
3 2 2015-03
4 0 2015-04
5 1 2015-05
Because the first occurrences of user_id 1 and 3 were in March and the first occurrence of user_id 2 was in May.
You can do this with 2 levels of aggregation. Get the min time per user_id and then count.
SELECT to_char(first_time, 'YYYY-MM'),count(*)
from (
SELECT user_id,MIN(created_at) as first_time
FROM trips
GROUP BY user_id
) t
GROUP BY to_char(first_time, 'YYYY-MM')
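The two aggregation levels can be followed step by step in a short Python sketch using the sample rows from the question. The variable names are made up for the example; the inner loop plays the role of MIN(created_at) per user_id and the Counter plays the role of the outer GROUP BY.

```python
from collections import Counter
from datetime import datetime

# Sample rows from the question: (user_id, created_at).
trips = [
    (1, "2015-08-07 07:18:21"), (2, "2015-05-06 20:43:52"),
    (3, "2015-05-06 20:53:54"), (1, "2015-03-30 20:09:07"),
    (2, "2015-10-01 18:28:32"), (3, "2015-08-07 07:29:29"),
    (1, "2015-08-28 13:45:44"), (2, "2015-08-07 07:37:31"),
    (3, "2015-03-30 20:14:04"), (1, "2015-08-07 07:08:50"),
]

# Level 1: earliest trip per user (the MIN(created_at) ... GROUP BY user_id step).
first_trip = {}
for user_id, created_at in trips:
    ts = datetime.strptime(created_at, "%Y-%m-%d %H:%M:%S")
    if user_id not in first_trip or ts < first_trip[user_id]:
        first_trip[user_id] = ts

# Level 2: count those first trips per year-month.
new_users_per_month = Counter(ts.strftime("%Y-%m") for ts in first_trip.values())
print(dict(new_users_per_month))
```

Like the SQL, this only yields months that contain at least one first trip; months with a count of 0 would need to be added separately, for example by joining against a list of months.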

Find nearest next date based on first row date

I have a table in postgresql db as follows:
sl_no | valid_from |
--------------------
1 02-04-2013
2 02-09-2012
3 02-11-2015
4 02-01-2011
5 02-10-2015
I want to get all rows ordered by valid_from, along with one extra column named valid_to. The value of valid_to should be the nearest next date after each valid_from value.
Something like below:
sl_no | valid_from | valid_to |
---------------------------------
4 02-01-2011 02-09-2012
2 02-09-2012 02-04-2013
1 02-04-2013 02-10-2015
5 02-10-2015 02-11-2015
3 02-11-2015 02-11-2015
Thanks..
The lead() will do that:
select sl_no, valid_from,
lead(valid_from, 1, valid_from) over (order by valid_from) as valid_to
from the_table
order by valid_from;
lead() returns the value of the specified column from the next row (as defined by the order by). The parameters 1, valid_from specify that the database should look 1 row ahead and, if there is no such row, return the third parameter instead. lead(valid_from) is a short form of lead(valid_from, 1, null).
See the manual for details:
http://www.postgresql.org/docs/current/static/tutorial-window.html
http://www.postgresql.org/docs/current/static/functions-window.html
SQLFiddle example: http://sqlfiddle.com/#!15/61d53/1
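The lead() query can also be tried locally. The sketch below runs it in an in-memory SQLite database from Python, under a few assumptions: SQLite 3.25 or newer stands in for PostgreSQL, the dates from the question are read as day-month-year, and they are stored as ISO text (YYYY-MM-DD) so that they sort correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (sl_no INT, valid_from TEXT)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?)",
    [(1, "2013-04-02"), (2, "2012-09-02"), (3, "2015-11-02"),
     (4, "2011-01-02"), (5, "2015-10-02")],
)

# The third argument of lead() is the fallback used for the last row.
query = """
SELECT sl_no, valid_from,
       lead(valid_from, 1, valid_from) OVER (ORDER BY valid_from) AS valid_to
FROM the_table
ORDER BY valid_from
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The last row has no successor, so its valid_to falls back to its own valid_from, as in the expected output of the question.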

Sorting of Date Format in Oracle

I am using a query in Oracle which gives the result below (it's a kind of month-wise transaction report):
Month Total Submitted Approved
--------------------------------------
DEC-14 2 2 0
APR-15 17 12 5
SEP-14 1 1 0
FEB-15 7 4 3
JUL-15 1 1 0
JAN-15 18 4 14
MAR-15 2 1 1
OCT-14 2 (null) (null)
JUN-15 136 91 45
JUN-14 1 1 0
MAY-15 179 63 116
I want to get the result in a sorted format, like JUN-14,SEP-14,OCT-14,DEC-14,JAN-15....so on. Thanks in advance.
order by date_column desc where date_column is the column that holds the date. This will order by the date_column in descending order.
Use asc to order in ascending order.
If the month column is stored as a character string, you have to use
select * from table_name
order by to_char(to_date(month,'MON-YY'),'yy') asc,to_char(to_date(month,'MON-YY'),'mm') asc
If it is a date
select * from table_name
order by to_char(month,'yy') asc,to_char(month,'mm') asc
I assumed that you were using the following to display the month column.
TO_char(hiredate,'mon-yy')
If so, sorting is easy: order by the underlying date column rather than the formatted text.
select your column list from table order by source_date_column asc;
for reference use the link
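The reason the labels have to be converted before sorting is that, as plain text, 'APR-15' sorts before 'DEC-14'. A small Python sketch of the year-then-month ordering is shown here; the MONTHS lookup and the month_key helper are made up for the example, and it assumes every two-digit year belongs to the same century.

```python
# Map English month abbreviations to month numbers (no locale dependency).
MONTHS = {
    name: number
    for number, name in enumerate(
        ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
         "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"],
        start=1,
    )
}

def month_key(label):
    """Turn 'DEC-14' into a sortable (year, month) pair such as (14, 12)."""
    mon, yy = label.split("-")
    return (int(yy), MONTHS[mon.upper()])

labels = ["DEC-14", "APR-15", "SEP-14", "FEB-15", "JUL-15", "JAN-15",
          "MAR-15", "OCT-14", "JUN-15", "JUN-14", "MAY-15"]

ordered = sorted(labels, key=month_key)
print(ordered)
```

Sorting on the (year, month) pair is the same idea as ordering by the 'yy' and 'mm' parts in the Oracle queries.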

SQL Server : count types with totals by date change

I need to count a value (M_Id) at each change of a date (RS_Date) and create a column grouped by the RS_Date that has an active total from that date.
So the table is:
Ep_Id Oa_Id M_Id M_StartDate RS_Date
--------------------------------------------
1 2001 5 1/1/2014 1/1/2014
1 2001 9 1/1/2014 1/1/2014
1 2001 3 1/1/2014 1/1/2014
1 2001 11 1/1/2014 1/1/2014
1 2001 2 1/1/2014 1/1/2014
1 2067 7 1/1/2014 1/5/2014
1 2067 1 1/1/2014 1/5/2014
1 3099 12 1/1/2014 3/2/2014
1 3099 14 2/14/2014 3/2/2014
1 3099 4 2/14/2014 3/2/2014
So my goal is like
RS_Date Active
-----------------
1/1/2014 5
1/5/2014 7
3/2/2014 10
If M_StartDate = RS_Date, I need to count the M_Id values. Then, for each RS_Date that is not equal to the start date, I need to count the M_Id values and add that to the M_StartDate count, and then count the next RS_Date and add that to the last active count.
I can get the basic counts with something like
(Case when M_StartDate <= RS_Date
then [m_Id] end) as Test.
But I am stuck as how to get to the result I want.
Any help would be greatly appreciated.
Brian
-added in response to comments
I am using Server Ver 10
If you are using SQL Server 2012+, you can use ROWS with the analytic/window functions:
;with cte AS (SELECT RS_Date
,COUNT(DISTINCT M_ID) AS CT
FROM Table1
GROUP BY RS_Date
)
SELECT *,SUM(CT) OVER(ORDER BY RS_Date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Run_CT
FROM cte
Demo: SQL Fiddle
If stuck using something prior to 2012 you can use:
;with cte AS (SELECT RS_Date
,COUNT(DISTINCT M_ID) AS CT
FROM Table1
GROUP BY RS_Date
)
SELECT a.RS_Date
,SUM(b.CT)
FROM cte a
LEFT JOIN cte b
ON a.RS_DAte >= b.RS_Date
GROUP BY a.RS_Date
Demo: SQL Fiddle
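Both variants compute the same thing: a distinct count of M_Id per RS_Date, followed by a running total over the dates. A plain-Python sketch of those two steps on the sample data is given below; the variable names are assumptions made for the illustration.

```python
from collections import defaultdict
from datetime import date
from itertools import accumulate

# Sample rows from the question, reduced to (M_Id, RS_Date).
rows = [
    (5, date(2014, 1, 1)), (9, date(2014, 1, 1)), (3, date(2014, 1, 1)),
    (11, date(2014, 1, 1)), (2, date(2014, 1, 1)),
    (7, date(2014, 1, 5)), (1, date(2014, 1, 5)),
    (12, date(2014, 3, 2)), (14, date(2014, 3, 2)), (4, date(2014, 3, 2)),
]

# Step 1: COUNT(DISTINCT M_Id) per RS_Date (the CTE).
ids_per_date = defaultdict(set)
for m_id, rs_date in rows:
    ids_per_date[rs_date].add(m_id)

# Step 2: running total in RS_Date order (the windowed SUM or the self-join).
dates = sorted(ids_per_date)
counts = [len(ids_per_date[d]) for d in dates]
running = list(zip(dates, accumulate(counts)))
for rs_date, active in running:
    print(rs_date, active)
```

The per-date counts are 5, 2 and 3, which accumulate to the 5, 7 and 10 asked for in the question.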
You need a cumulative sum, which is easy in SQL Server 2012 using windowed aggregate functions. Based on your description, this will return the expected result:
SELECT Ep_Id, RS_Date,
SUM(COUNT(*))
OVER (PARTITION BY Ep_Id
ORDER BY RS_Date
ROWS UNBOUNDED PRECEDING)
FROM tab
GROUP BY Ep_Id, RS_Date
It looks like you want something like this:
SELECT
RS_Date,
SUM(c) OVER (PARTITION BY M_StartDate ORDER BY RS_Date ROWS UNBOUNDED PRECEDING)
FROM
(
SELECT M_StartDate, RS_Date, COUNT(DISTINCT M_Id) AS c
FROM my_table
GROUP BY M_StartDate, RS_Date
) counts
The inline view computes the counts of distinct M_Id values within each (M_StartDate, RS_Date) group (distinctness enforced only within the group), and the outer query uses the analytic version of SUM() to add up the counts within each M_StartDate.
Note that this particular query will not exactly reproduce your example results. It will instead produce:
RS_Date Active
-----------------
1/1/2014 5
1/5/2014 7
3/2/2014 8
3/2/2014 2
This is on account of some rows in your example data with RS_Date 3/2/2014 having a later M_StartDate than others. If this is not what you want then you need to clarify the question, which currently seems a bit inconsistent.
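To see where the extra row comes from, the grouping and the per-M_StartDate running total can be replayed in plain Python on the sample data. This is only an illustrative sketch and the variable names are made up.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (M_Id, M_StartDate, RS_Date).
rows = [
    (5, date(2014, 1, 1), date(2014, 1, 1)),
    (9, date(2014, 1, 1), date(2014, 1, 1)),
    (3, date(2014, 1, 1), date(2014, 1, 1)),
    (11, date(2014, 1, 1), date(2014, 1, 1)),
    (2, date(2014, 1, 1), date(2014, 1, 1)),
    (7, date(2014, 1, 1), date(2014, 1, 5)),
    (1, date(2014, 1, 1), date(2014, 1, 5)),
    (12, date(2014, 1, 1), date(2014, 3, 2)),
    (14, date(2014, 2, 14), date(2014, 3, 2)),
    (4, date(2014, 2, 14), date(2014, 3, 2)),
]

# Distinct M_Id values per (M_StartDate, RS_Date) group.
groups = defaultdict(set)
for m_id, start, rs in rows:
    groups[(start, rs)].add(m_id)

# Running total that restarts for every M_StartDate (the PARTITION BY).
totals = defaultdict(int)
result = []
for start, rs in sorted(groups):
    totals[start] += len(groups[(start, rs)])
    result.append((start, rs, totals[start]))

for start, rs, active in result:
    print(rs, active)
```

The 3/2/2014 rows fall into two partitions: one row continues the 1/1/2014 total (7 + 1 = 8) and two rows start a new total for 2/14/2014 (2).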
Unfortunately, ordered window aggregates such as SUM() OVER (ORDER BY ...) are not available until SQL Server 2012. In SQL Server 2008 (version 10), the job is messier. It could be done like this:
WITH gc AS (
SELECT M_StartDate, RS_Date, COUNT(DISTINCT M_Id) AS c
FROM my_table
GROUP BY M_StartDate, RS_Date
)
SELECT
RS_Date,
(
SELECT SUM(c)
FROM gc AS gc2
WHERE gc2.M_StartDate = gc.M_StartDate AND gc2.RS_Date <= gc.RS_Date
) AS Active
FROM gc
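If you are using SQL Server 2012 or newer, LAG is also available for reading the previous row's value, although a running total is expressed more directly with SUM() OVER (ORDER BY ...).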
https://msdn.microsoft.com/en-us/library/hh231256(v=sql.110).aspx