Can multiple rows within a window be referenced by an analytic function?


Given a table with:
ID VALUE
-- -----
1 1
2 2
3 3
4 4
5 5
I would like to compute something like this:
ID VALUE SUM
-- ----- ---
1 1 40 -- (2-1)*2 + (3-1)*3 + (4-1)*4 + (5-1)*5
2 2 26 -- (3-2)*3 + (4-2)*4 + (5-2)*5
3 3 14 -- (4-3)*4 + (5-3)*5
4 4 5 -- (5-4)*5
5 5 0 -- 0
Where the SUM on each row is the sum of the values of each subsequent row multiplied by the difference between the value of the subsequent row and the current row.
I could start with something like this:
CREATE TABLE x(id int, value int);
INSERT INTO x VALUES(1, 1);
INSERT INTO x VALUES(2, 2);
INSERT INTO x VALUES(3, 3);
INSERT INTO x VALUES(4, 4);
INSERT INTO x VALUES(5, 5);
SELECT id, value
,SUM(value) OVER(ORDER BY id ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) AS sum
FROM x;
id | value | sum
----+-------+-----
1 | 1 | 14
2 | 2 | 12
3 | 3 | 9
4 | 4 | 5
5 | 5 |
(5 rows)
where each row has the sum of all subsequent rows. But to take it further, I would really want something like this pseudo code:
SELECT id, value
,SUM( (value - FIRST_ROW(value)) * value )
OVER(ORDER BY id ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) AS sum
FROM x;
But this is not valid. And that is the crux of the question: is there a way to reference multiple rows in the window of an analytic function? Or is there a different way to approach this?
The example above is contrived. I was actually playing with an interesting puzzle from another post, "Rollup Query", which led me to this problem. I am trying this in PostgreSQL 9.1, but I am not bound to that.

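I'm not quite sure I've understood your requirement exactly, but the query you want is something like this:
select a.id, a.value, coalesce(sum((b.value - a.value) * b.value), 0) as sum
from x a
left join x b on a.id < b.id
group by a.id, a.value
order by a.id;
The left join keeps the last row (id 5), which has no subsequent rows, and coalesce turns its empty sum into 0.
Hope that helps.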
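On the original question of doing this with analytic functions alone: expanding the product gives sum((b - a) * b) = sum(b*b) - a * sum(b), so two ordinary window sums over the same frame are enough and no second row reference is needed. A minimal sketch, assuming Python's bundled sqlite3 module with SQLite 3.25 or later (for window functions) as a stand-in for PostgreSQL; the table x is the one from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x(id int, value int)")
conn.executemany("INSERT INTO x VALUES (?, ?)", [(i, i) for i in range(1, 6)])

# sum((b - a) * b) over the following rows
#   = sum(b * b) - a * sum(b), with a = the current row's value.
# The last row has an empty frame, so both sums are NULL; COALESCE maps that to 0.
query = """
SELECT id, value,
       COALESCE(
           SUM(value * value) OVER (ORDER BY id ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
           - value * SUM(value) OVER (ORDER BY id ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING),
           0) AS sum
FROM x
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

This only works because the expression happens to be a polynomial in the current row's value; a formula that cannot be split this way would still need the self-join.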

Related

In sequelize, how do I select records that match all values that i am searching for?

As an example, I have the following table:
T | S
------
1 | 5
1 | 6
1 | 7
2 | 6
2 | 7
3 | 6
Query: array [1,2]
I want to select all values in S that have the value 1 AND 2 in the T Column.
So in the above example I should get as a result (6,7) because only 6 and 7 have for column T the values 1 and 2.
But I do not want 5 in my results, as 5 does not have 2 in the T column.
How would I do this in sequelize?

How do I make (1,2) usable as an array?
Either insert the array into the query text as a comma-separated list of literals (variant 1), or join the array into a single string literal and pass it to the query as a parameter (variant 2).
Variant 1
SELECT s
FROM sourcetable
WHERE t IN (1,2) -- separate filter values
GROUP BY s
HAVING COUNT(DISTINCT t) = 2 -- unique values count
Variant 2 (FIND_IN_SET is MySQL-specific)
SELECT s
FROM sourcetable
WHERE FIND_IN_SET(t, '1,2') -- separate filter values
GROUP BY s
HAVING COUNT(DISTINCT t) = 2 -- unique values count
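If (s,t) is unique then the DISTINCT keyword may be removed. In both variants the number in the HAVING clause must equal the number of values being searched for.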
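The same idea can be sketched with the count derived from the array instead of hard-coded, so the query keeps working when the list of searched values changes. This is a hedged example using Python's sqlite3 module rather than Sequelize; the table name sourcetable comes from the answer above, and the helper name match_all is made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sourcetable (t int, s int)")
conn.executemany(
    "INSERT INTO sourcetable VALUES (?, ?)",
    [(1, 5), (1, 6), (1, 7), (2, 6), (2, 7), (3, 6)],
)

def match_all(conn, wanted):
    """Return every s that is paired with all of the t values in `wanted`."""
    values = sorted(set(wanted))  # duplicates would break the count comparison
    if not values:
        return []
    placeholders = ", ".join("?" for _ in values)
    query = f"""
        SELECT s
        FROM sourcetable
        WHERE t IN ({placeholders})
        GROUP BY s
        HAVING COUNT(DISTINCT t) = ?
        ORDER BY s
    """
    # Bind the values plus their count; only placeholders are interpolated.
    return [row[0] for row in conn.execute(query, (*values, len(values)))]

result = match_all(conn, [1, 2])
print(result)
```

In Sequelize the same statement could presumably be sent as a raw query with bound parameters, but that wiring is not shown here.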

Select latest available value SQL

Below is a simplified test table for what I am trying to achieve in a query. I am attempting to write a query with a running sum that shows, in column b, the last sum result that was not null. Imagine a cumulative sum of a customer's purchases per day: on some days a particular customer makes no purchases, and for those days I want to display that customer's latest sum instead of 0/null.
CREATE TABLE test (a int, b int);
insert into test values (1,null);
insert into test values (2,1);
insert into test values (3,3);
insert into test values (4,null);
insert into test values (5,5);
insert into test values (6,null);
1- select sum(coalesce(b,0)),coalesce(0,sum(b)) from test
2- select a, sum(coalesce(b,0)) from test group by a order by a asc
3- select a, sum(b) over (order by a asc rows between unbounded preceding and current row) from test group by a,b order by a asc
I'm not sure whether my understanding of how coalesce works is correct. I thought sum(coalesce(b,0)) would insert 0 where b is null and always take the latest cumulative sum of column b.
I think I may have solved it with query 3.
The result I expect will look like this:
a | sum
--------
1
2 1
3 4
4 4
5 9
6 9
Each record of a displays the last cumulative sum of column b.
Any direction would be valuable.
Thanks

In Postgres you can also use SUM as a window function to get a cumulative sum.
Example:
create table test (a int, b int);
insert into test (a,b) values (1,null),(2,1),(3,3),(4,null),(5,5),(6,null);
select a, sum(b) over (order by a, b) as "sum"
from test;
a | sum
-- | ----
1 | null
2 | 1
3 | 4
4 | 4
5 | 9
6 | 9
And if "a" isn't unique, but you want to group on a?
Then you could use a suminception:
select a, sum(sum(b)) over (order by a) as "sum"
from test
group by a
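The running total carries forward on its own because SUM ignores NULL inputs and returns NULL only when every row in the frame is NULL, so no coalesce is needed. A small check of that behaviour, assuming Python's sqlite3 module (SQLite 3.25 or later) in place of Postgres:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (a int, b int)")
conn.executemany(
    "INSERT INTO test (a, b) VALUES (?, ?)",
    [(1, None), (2, 1), (3, 3), (4, None), (5, 5), (6, None)],
)

# With ORDER BY and no explicit frame, the window runs from the first row
# up to the current row, which is exactly a cumulative sum.
running = conn.execute(
    "SELECT a, SUM(b) OVER (ORDER BY a) AS running FROM test ORDER BY a"
).fetchall()
for row in running:
    print(row)
```

Row 1 stays NULL (None in Python) because nothing has been summed yet; wrapping the window sum in coalesce(..., 0) would turn that into 0 if preferred.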

How to add two values of the same column in a table

Consider the following table:
ID COL VALUE
1 A 10
2 B 10
3 C 10
4 D 10
5 E 10
Output:
ID COL VALUE
1 A 10
2 B 20
3 C 30
4 D 40
5 E 50

Based on your (deleted) comment, "in output it is taking up the sum of the upper values", it sounds like you want a cumulative SUM().
You can do this with a windowed function:
Select Id, Col, Sum(Value) Over (Order By Id) As Value
From YourTable
Output
Id Col Value
1 A 10
2 B 20
3 C 30
4 D 40
5 E 50

Use the code below to obtain the cumulative sum. It works as expected on SQL Server 2012.
DECLARE @Table TABLE (ID int, COL CHAR(2), VALUE int)
INSERT @Table
(ID,COL,[VALUE])
VALUES
(1,'A',10),
(2,'B',10),
(3,'C',10),
(4,'D',10),
(5,'E',10)
SELECT t.ID,t.COL,SUM(VALUE) OVER (ORDER BY t.ID) AS VALUE
FROM @Table t

Not really sure what you are asking for. If my assumption is correct, you want to SUM the contents of a column and group it.
Select sum(value), col
from table
group by col

SQL - Add value with previous row only

I have a table named myvals with the following fields:
ID number
-- -------
1 7
2 3
3 4
4 0
5 9
Starting on the 2nd row, I would like to add each number to the previous row's number. So my end result would look like this:
ID number
-- ------
1 7
2 10
3 7
4 4
5 9

You could use the LAG analytic function:
SELECT Id, number + LAG(number,1,0) OVER (ORDER BY Id) AS number FROM myvals
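LAG takes three arguments here: the column, the offset (1 row back) and a default (0) used for the first row, which has no predecessor. A minimal sketch of the same query against the myvals data, assuming Python's sqlite3 module (SQLite 3.25 or later), which also implements LAG:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myvals (id int, number int)")
conn.executemany(
    "INSERT INTO myvals VALUES (?, ?)",
    [(1, 7), (2, 3), (3, 4), (4, 0), (5, 9)],
)

# LAG(number, 1, 0): value from one row earlier in id order, 0 when there is none.
with_previous = conn.execute(
    """
    SELECT id, number + LAG(number, 1, 0) OVER (ORDER BY id) AS number
    FROM myvals
    ORDER BY id
    """
).fetchall()
for row in with_previous:
    print(row)
```

Without the default of 0, the first row would come back as NULL, since adding NULL to a number yields NULL.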

First things first: you can't add to NULL, so ID 1 must have a value.

create table #temp
(
month_type datetime,
value int
)
insert into #temp
Select '2015/01/01',1
union
Select '2015/02/01',2
union
Select '2015/03/01',3
union
Select '2015/04/01',4
SELECT t.value,t1.value,(t.value+t1.value)/2 FROM #temp t1
left join #temp t on t.month_type=Dateadd(MONTH,-1,t1.month_type)

Oracle: find duplicate rows in select query

My SQL query returns results with 4 columns "A", "B", "C", "D".
Suppose the results are:
A B C D
1 1 1 1
1 1 1 2
2 2 2 1
Is it possible to get, in each row, the count of duplicate rows based on columns "A", "B", "C"?
e.g. the expected result is:
A B C D cnt
1 1 1 1 2
1 1 1 2 2
2 2 2 1 1
I tried using count(*) over, but it returns the total number of rows returned by the query.
One more piece of information: the example mentions only 3 columns on which I need to check the count, but my actual query has 8 such columns, and the number of rows in the database is huge. So I think group by will not be a feasible option here.
Any hint is appreciated.
Thanks.

It may be too late, but count(*) over as an analytic function (also known as a window function) in Oracle probably helps you. If I understand your request correctly, this should solve your problem:
create table sne_test(a number(1)
,b number(1)
,c number(1)
,d number(1)
,e number(1)
,f number(1));
insert into sne_test values(1,1,1,1,1,1);
insert into sne_test values(1,1,2,1,1,1);
insert into sne_test values(1,1,2,4,1,1);
insert into sne_test values(1,1,2,5,1,1);
insert into sne_test values(1,2,1,1,3,1);
insert into sne_test values(1,2,1,2,1,2);
insert into sne_test values(2,1,1,1,1,1);
commit;
SELECT a,b,c,d,e,f,
count(*) over (PARTITION BY a,b,c) AS amount
FROM sne_test;
A B C D E F AMOUNT
-- -- -- -- -- -- ------
1 1 1 1 1 1 1
1 1 2 4 1 1 3
1 1 2 1 1 1 3
1 1 2 5 1 1 3
1 2 1 1 3 1 2
1 2 1 2 1 2 2
2 1 1 1 1 1 1

To find duplicates you must group the data based on the key column:
select
count(*)
,empno
from
emp
group by
empno
having
count(*) > 1;
The having clause keeps only the empno values that occur more than once, together with the number of rows for each.

You have to use a subquery where you get the count of rows, grouped by A, B and C. And then you join this subquery again with your table (or with your query), like this:
select your_table.A, your_table.B, your_table.C, your_table.D, cnt
from
your_table inner join
(SELECT A, B, C, count(*) as cnt
FROM your_table
GROUP BY A, B, C) t
on t.A = your_table.A
and t.B = your_table.B
and t.C = your_table.C
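The join against a grouped subquery and count(*) over (partition by ...) should produce the same count on every row; the analytic form simply states it in one query. A quick comparison on the question's sample rows, assuming Python's sqlite3 module (SQLite 3.25 or later) instead of Oracle and a made-up table name t_rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_rows (a int, b int, c int, d int)")
conn.executemany(
    "INSERT INTO t_rows VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1), (1, 1, 1, 2), (2, 2, 2, 1)],
)

# Analytic version: every row keeps its identity and gets its group's size.
windowed = conn.execute(
    """
    SELECT a, b, c, d, COUNT(*) OVER (PARTITION BY a, b, c) AS cnt
    FROM t_rows
    ORDER BY a, b, c, d
    """
).fetchall()

# Join version: count per (a, b, c) group, then join back to the rows.
joined = conn.execute(
    """
    SELECT r.a, r.b, r.c, r.d, g.cnt
    FROM t_rows r
    JOIN (SELECT a, b, c, COUNT(*) AS cnt
          FROM t_rows
          GROUP BY a, b, c) g
      ON g.a = r.a AND g.b = r.b AND g.c = r.c
    ORDER BY r.a, r.b, r.c, r.d
    """
).fetchall()

print(windowed)
print(windowed == joined)
```

Which form is faster on a large table depends on the database and its indexes, so it is worth checking the execution plan for both.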
