Column Sum in Select Query - sql

I have a problem with a SELECT query in SQL Server. I have the following table:
| ID | Name | Quantity |
------------------------
| 1 | a | 10 |
| 2 | b | 30 |
| 3 | c | 20 |
| 4 | d | 15 |
| 5 | e | 10 |
| 6 | f | 30 |
| 7 | g | 40 |
I want to select the records, in ID order, for which the running sum of Quantity stays below a given value. For example, if I ask for the records where the Quantity sum is < 65, the output should be:
| ID | Name | Quantity |
------------------------
| 1 | a | 10 |
| 2 | b | 30 |
| 3 | c | 20 |
because including the next record would bring the sum of Quantity to 75.
I want to create this query. Please help me out.

You can use a correlated subquery for this; it works in both MySQL and SQL Server, although it is not the best-performing solution:
SELECT
    ID,
    Name,
    Quantity
FROM
(
    SELECT
        t1.ID,
        t1.Name,
        t1.Quantity,
        (SELECT SUM(t2.Quantity)
         FROM Tablename AS t2
         WHERE t2.ID <= t1.ID) AS Total
    FROM Tablename AS t1
) AS t
WHERE Total < 65;
This will give you:
| ID | NAME | QUANTITY |
------------------------
| 1 | a | 10 |
| 2 | b | 30 |
| 3 | c | 20 |
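On SQL Server 2012 or later, a windowed SUM can compute the same running total without re-summing the earlier rows for every row. The following is only a minimal sketch, not part of the original answer. It inlines the question's sample data in a CTE (named Tablename to match the query above) so that it runs on its own, and it assumes Quantity is never negative, so the rows below the limit always form a prefix in ID order.

```sql
-- Sample data from the question, inlined so the snippet is self-contained.
WITH Tablename (ID, Name, Quantity) AS
(
    SELECT * FROM (VALUES
        (1, 'a', 10), (2, 'b', 30), (3, 'c', 20), (4, 'd', 15),
        (5, 'e', 10), (6, 'f', 30), (7, 'g', 40)
    ) AS v (ID, Name, Quantity)
),
Running AS
(
    -- Running total of Quantity in ID order (requires SQL Server 2012+).
    SELECT ID, Name, Quantity,
           SUM(Quantity) OVER (ORDER BY ID ROWS UNBOUNDED PRECEDING) AS Total
    FROM Tablename
)
SELECT ID, Name, Quantity
FROM Running
WHERE Total < 65
ORDER BY ID;
```

With the sample data the running totals are 10, 40, 60, 75 and so on, so the filter keeps IDs 1 to 3.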

Performance-wise, the best solution is to use a recursive CTE:
WITH CTE_Prepare AS
(
    SELECT
        *,
        ROW_NUMBER() OVER (ORDER BY ID) AS RN
    FROM TableRT
)
, CTE_Recursive AS
(
    SELECT ID, Name, Quantity, Quantity AS SumQuantity, RN
    FROM CTE_Prepare
    WHERE RN = 1
    UNION ALL
    SELECT p.ID, p.Name, p.Quantity, r.SumQuantity + p.Quantity AS SumQuantity, r.RN + 1 AS RN
    FROM CTE_Recursive r
    INNER JOIN CTE_Prepare p ON p.RN = r.RN + 1
    WHERE r.SumQuantity + p.Quantity < 65
)
SELECT *
FROM CTE_Recursive
OPTION (MAXRECURSION 0)
The first CTE only calculates row numbers to use instead of your IDs, because the recursion relies on there being no gaps, and with an arbitrary ID column we usually cannot be sure of that.
The second CTE is a two-part recursive query that adds the Quantity of each next row to the running sum. Search for SQL Server recursive CTEs if you need more background.
I think this is better than the other approaches to running totals (that is what this concept is called) because it works with only two rows at a time, rather than re-adding all previous rows for each calculation, and it stops as soon as it reaches the wanted mark.
EDIT: Corrected a few mistakes. For this to be fast, the WHERE clause needs to be inside the recursive CTE rather than in the outer query, so that the recursion stops as soon as the limit is reached.

Related

How to find Max value in a column in SQL Server 2012

I want to find the max value in a column
ID CName Tot_Val PName
--------------------------------
1 1 100 P1
2 1 10 P2
3 2 50 P2
4 2 80 P1
Above is my table structure. I want to find, for each CName, the row with the maximum Tot_Val. Of the four rows, IDs 1 and 2 have the same CName but different Tot_Val and PName values, so from those two I expect the row with the higher Tot_Val.
Expected result:
ID CName Tot_Val PName
--------------------------------
1 1 100 P1
4 2 80 P1
I need exactly the result shown above.
select Max(Tot_Val), CName
from table1
where PName in ('P1', 'P2')
group by CName
This is the query I have tried, but my problem is that I cannot bring PName into the result. If I add PName to the select list, and therefore to the group by list, the rows multiply: the result is 100 rows without PName but 600 rows with it.
Can someone please help me resolve this?
One possible option is to use a subquery. Give each row a number within each CName group ordered by Tot_Val. Then select the rows with a row number equal to one.
select x.*
from ( select mt.ID,
mt.CName,
mt.Tot_Val,
mt.PName,
row_number() over(partition by mt.CName order by mt.Tot_Val desc) as No
from MyTable mt ) x
where x.No = 1;
An alternative would be to use a common table expression (CTE) instead of a subquery to isolate the first result set.
with x as
(
select mt.ID,
mt.CName,
mt.Tot_Val,
mt.PName,
row_number() over(partition by mt.CName order by mt.Tot_Val desc) as No
from MyTable mt
)
select x.*
from x
where x.No = 1;
Search for "top-n-per-group" to find more on this kind of query.
There are two common ways to do it. The most efficient method depends on your indexes and data distribution and whether you already have another table with the list of all CName values.
Using ROW_NUMBER
WITH
CTE
AS
(
SELECT
ID, CName, Tot_Val, PName,
ROW_NUMBER() OVER (PARTITION BY CName ORDER BY Tot_Val DESC) AS rn
FROM table1
)
SELECT
ID, CName, Tot_Val, PName
FROM CTE
WHERE rn=1
;
Using CROSS APPLY
WITH
CTE
AS
(
SELECT CName
FROM table1
GROUP BY CName
)
SELECT
A.ID
,A.CName
,A.Tot_Val
,A.PName
FROM
CTE
CROSS APPLY
(
SELECT TOP(1)
table1.ID
,table1.CName
,table1.Tot_Val
,table1.PName
FROM table1
WHERE
table1.CName = CTE.CName
ORDER BY
table1.Tot_Val DESC
) AS A
;
See the very detailed answer on dba.se, "Retrieving n rows per group", or "Get top 1 row of each group".
CROSS APPLY might be as fast as a correlated subquery, but the following often performs very well (and better than ROW_NUMBER()). Note that it returns every row tied for the maximum Tot_Val within a CName:
select t.*
from t
where t.tot_val = (select max(t2.tot_val)
from t t2
where t2.cname = t.cname
);
Note: The performance depends on having an index on (cname, tot_val).
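As a hedged illustration of that note, such an index could be created as sketched below. The index name is hypothetical; t, cname and tot_val are simply the names used in the query above, so substitute your own table and columns.

```sql
-- Hypothetical index name; t, cname and tot_val come from the query above.
-- cname leads so each group is contiguous, and tot_val follows so the
-- maximum per group can be found without scanning the whole group.
CREATE INDEX IX_t_cname_tot_val
    ON t (cname, tot_val);
```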

Is SQL row_number order guaranteed when a CTE is referenced many times?

If I have a CTE definition that uses row_number() ordered by a non-unique column, and I reference that CTE twice in my query, is the row_number() value for each row guaranteed to be the same for both references to the CTE?
Example 1:
with tab as (
select 1 as id, 'john' as name
union
select 2, 'john'
union
select 3, 'brian'
),
ordered1 as (
select ROW_NUMBER() over (order by name) as rown, id, name
from tab
)
select o1.rown, o1.id, o1.name, o1.id - o2.id as id_diff
from ordered1 o1
join ordered1 o2 on o2.rown = o1.rown
Output:
+------+----+-------+---------+
| rown | id | name | id_diff |
+------+----+-------+---------+
| 1 | 3 | brian | 0 |
| 2 | 1 | john | 0 |
| 3 | 2 | john | 0 |
+------+----+-------+---------+
Is it guaranteed that id_diff = 0 for all rows?
Example 2:
with tab as (
select 1 as id, 'john' as name
union
select 2, 'john'
union
select 3, 'brian'
),
ordered1 as (
select ROW_NUMBER() over (order by name) as rown, id, name
from tab
),
ordered2 as (
select ROW_NUMBER() over (order by name) as rown, id, name
from tab
)
select o1.rown, o1.id, o1.name, o1.id - o2.id as id_diff
from ordered1 o1
join ordered2 o2 on o2.rown = o1.rown
Same output as above when I ran it, but that doesn't prove anything.
Now that I am joining two queries ordered1 and ordered2, can any guarantee be made about the value of id_diff = 0 in the result?
Example queries on http://rextester.com/AQDXP74920
I suspect that there is no guarantee in either case. If there is no such guarantee, then all CTEs using row_number() should always order by a unique combination of columns if the CTE may be referenced more than once in the query.
I have never heard this advice before, and would like some expert opinion.
No, there is no guarantee that ROW_NUMBER over a non-unique sort list returns the same sequence each time a CTE is referenced. It is very likely to, but it is not guaranteed, as the CTE is merely a view.
So always make the sort list unique in such a case, e.g. order by name, id.
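Applied to Example 1 from the question, that advice could look like the sketch below. The only change is the added id tie-breaker in the window's ORDER BY; it assumes id is unique in tab, which holds for the sample rows.

```sql
with tab as (
    select 1 as id, 'john' as name
    union
    select 2, 'john'
    union
    select 3, 'brian'
),
ordered1 as (
    -- name alone is not unique; adding id makes the ordering total,
    -- so every evaluation of this CTE assigns the same row numbers.
    select row_number() over (order by name, id) as rown, id, name
    from tab
)
select o1.rown, o1.id, o1.name, o1.id - o2.id as id_diff
from ordered1 o1
join ordered1 o2 on o2.rown = o1.rown
order by o1.rown;
```

Because each rown now corresponds to exactly one id, id_diff is 0 for every row regardless of how often the CTE is evaluated.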
The answer that Thorsten gave is correct; I just want to add some more details.
Users of SQL Server often think of CTEs as "temporary tables" or "derived tables". However, they are nothing of the sort. Although some databases do materialize CTEs (at least some of the time), SQL Server never materializes CTEs.
In fact, the CTE logic is inserted into the query, just as if each reference to the CTE were textually replaced by its definition. This affects non-unique sorting keys. It also affects some non-deterministic functions, such as NEWID().
The advice in your case is simple: Whenever you use order by, include a unique key as the last order by key. You should do this whether order by is used in a window function or for a query. It is just a safe habit to get used to.
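The effect on NEWID() can be seen with a small experiment along these lines. This is a sketch for SQL Server only, and the CTE and column names are made up for the illustration.

```sql
-- c is referenced twice; since the CTE is inlined at each reference,
-- NEWID() can be evaluated once per reference rather than once overall.
with c as (
    select newid() as guid
)
select a.guid as first_ref,
       b.guid as second_ref
from c as a
cross join c as b;
```

The two columns may well show different GUIDs, which would not happen if c were materialized once and then read twice.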

Finding min + 1 in Postgres

I have a Postgres table with a column for event and a column for sequence. Every event may have multiple sequences. For example:
Event | Sequence
a | 1
a | 4
a | 5
b | 1
b | 2
I know that select min(sequence) group by event gives me the minimum sequence. How do I get the very next value after the minimum? I hope that makes sense. Thanks in advance.
I'm using Postgres 9.3.
You can use ROW_NUMBER(), partitioning by Event and ordering by Sequence, to get the second-lowest sequence number per Event:
SELECT Event, Sequence
FROM (
SELECT Event, Sequence,
ROW_NUMBER() OVER (PARTITION BY Event ORDER BY Sequence) rn
FROM Table1
) z
WHERE rn = 2;
EDIT A bit more complicated, but if you need a query that doesn't rely on ROW_NUMBER(), use a subquery with a self-join to exclude rows with minimum sequence for each event:
SELECT outer_query.Event, MIN(outer_query.Sequence) AS SecondMinSeq
FROM Table1 as outer_query
INNER JOIN (
SELECT Table1.Event, MIN(Sequence) AS MinSeq
FROM Table1
GROUP BY Table1.Event
) AS min_sequences
ON outer_query.Event = min_sequences.Event AND outer_query.Sequence <> min_sequences.MinSeq
GROUP BY outer_query.Event
SQL Fiddle: http://sqlfiddle.com/#!15/4438b/7
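One case worth checking is a minimum Sequence that occurs more than once for an event: ROW_NUMBER() would then return that duplicate as row 2, while the self-join version skips it. If "next value" means the next distinct value, a variant with DENSE_RANK() is one option. The sketch below is an assumption-laden illustration, not part of the original answer; it inlines the question's sample rows with VALUES so that it runs on its own.

```sql
-- Sample rows from the question, inlined so the snippet is self-contained.
WITH Table1 (Event, Sequence) AS (
    VALUES ('a', 1), ('a', 4), ('a', 5), ('b', 1), ('b', 2)
)
SELECT DISTINCT Event, Sequence
FROM (
    SELECT Event, Sequence,
           -- Equal Sequence values share a rank, so rank 2 is always
           -- the second distinct value within each Event.
           DENSE_RANK() OVER (PARTITION BY Event ORDER BY Sequence) AS rnk
    FROM Table1
) z
WHERE rnk = 2
ORDER BY Event;
```

For the sample data this gives 4 for event a and 2 for event b, the same as the two queries above.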

Creating an accumulating rollup

I have records in a table that have codes specific to a certain level and an amount attached to that level. They do not add up, and that is not the issue.
I wish to create a query that sums all the values for a level code plus those in the levels below it. I would also like the amount per level in the same query, but that is not necessary. I have created a sample table and the desired output below. Does anyone have a good way of doing this? Also, is there an established name for this kind of rollup?
CREATE TABLE LEVEL_AMOUNTS(
LEVEL_CODE char(1),
AMOUNT integer
);
INSERT INTO LEVEL_AMOUNTS VALUES
('A',1),('A',1),('A',1),('A',1),
('B',1),('B',1),('B',1),('B',1),
('C',1),('C',1),('C',1),('C',1);
Output:
A | 12
B | 8
C | 4
with cte as (
select distinct level_code
from level_amounts
)
select cte.level_code, sum(l.amount)
from cte
inner join level_amounts l on l.level_code >= cte.level_code
group by cte.level_code
or
select l1.level_code, sum(l.amount)
from level_amounts l
inner join (select distinct level_code
from level_amounts) l1
on l.level_code >= l1.level_code
group by l1.level_code;
SQL Server 2008 does not support cumulative sums as window functions (SUM() OVER with ORDER BY arrived in SQL Server 2012). You can do this with subqueries or joins:
with cte as (
select level_code, sum(amount) as amount
from level_amounts
group by level_code
)
select level_code,
(select sum(amount) from cte cte2 where cte.level_code <= cte2.level_code) as cumamount
from cte;
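On SQL Server 2012 or later, where SUM() OVER accepts ORDER BY, the same rollup can be sketched with a window function. This is a hedged alternative rather than part of the answers above. It inlines the question's sample rows so it runs on its own, and it also returns the per-level amount the question asked about.

```sql
-- Sample rows from the question, inlined so the snippet is self-contained.
WITH level_amounts (level_code, amount) AS
(
    SELECT * FROM (VALUES
        ('A', 1), ('A', 1), ('A', 1), ('A', 1),
        ('B', 1), ('B', 1), ('B', 1), ('B', 1),
        ('C', 1), ('C', 1), ('C', 1), ('C', 1)
    ) AS v (level_code, amount)
)
SELECT level_code,
       SUM(amount) AS amount,                      -- amount for this level only
       SUM(SUM(amount)) OVER (ORDER BY level_code DESC
                              ROWS UNBOUNDED PRECEDING) AS cumamount
                                                   -- this level plus all levels below it
FROM level_amounts
GROUP BY level_code
ORDER BY level_code;
```

The inner SUM is the ordinary per-group aggregate; the outer SUM ... OVER accumulates those group totals from the last level code back to the first, giving 12, 8 and 4 for A, B and C with the sample data.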

Select ranked records whose sum adds up to a certain value

+----+--------+----------+-----------+
| ID | Number | Reason   | Timestamp |
+----+--------+----------+-----------+
| 1  | 2      | Business | date      |
| 2  | 3      | Pleasure | date      |
+----+--------+----------+-----------+
I've got a table that looks like above. I'm trying to figure out how I can get the latest records (table ranked by timestamp) whose Number columns add up to a certain value.
So in the above example, if I had a value of 5, I would want these 2 records (assuming they were the most recent). I would like to get an output string like the following:
2 (Business), 3 (Pleasure)
Any ideas?
If you are looking for the most recent records that add up to no more than X, then you need a cumulative sum. In SQL Server 2008, this is handled using a correlated subquery:
select t.*
from (select t.*,
(select sum(number) from t t2 where t2.id <= t.id) as CumeSumValue
from t
) t
where CumeSumValue <= X
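To turn the selected rows into the output string from the question, one common SQL Server idiom is FOR XML PATH('') combined with STUFF. The sketch below is only an illustration under stated assumptions: it inlines the two sample rows, leaves out the Timestamp column, accumulates by ID as the query above does, and uses a variable @X for the target value.

```sql
DECLARE @X int = 5;  -- target value from the question's example

WITH t (ID, Number, Reason) AS
(
    -- Sample rows from the question, inlined so the snippet is self-contained.
    SELECT * FROM (VALUES (1, 2, 'Business'), (2, 3, 'Pleasure')) AS v (ID, Number, Reason)
),
picked AS
(
    -- Same cumulative-sum filter as the query above.
    SELECT t.ID, t.Number, t.Reason
    FROM t
    WHERE (SELECT SUM(t2.Number) FROM t AS t2 WHERE t2.ID <= t.ID) <= @X
)
SELECT STUFF(
           (SELECT ', ' + CAST(p.Number AS varchar(10)) + ' (' + p.Reason + ')'
            FROM picked AS p
            ORDER BY p.ID
            FOR XML PATH('')),   -- concatenates the pieces into one string
           1, 2, '') AS Result;  -- STUFF removes the leading ', '
```

For the two sample rows this should produce 2 (Business), 3 (Pleasure). Be aware that FOR XML PATH escapes characters such as & and < in the Reason text, so values containing them need extra handling.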
I think >= requires a bit more logic.
select t.*
from (select t.*,
(select sum(number)
from t t2
where t2.id <= t.id
) as CumeSumValue
from t
) t
where CumeSumValue <= X or -- less than or equal
((CumeSumValue - Number < X) and (CumeSumValue > X)) -- first that is greater than
Have you tried using a cursor?
This looks like the prime example for using a cursor.
http://msdn.microsoft.com/de-de/library/ms180169.aspx
First, get the latest records:
Select Max(ID) MAXID Into #t From t Group By Timestamp
Now #t holds all the latest records. You can join it with the table to access all fields:
Select t.* from t Join #t on t.ID = #t.MAXID
If you want to filter further, you can do it with a simple WHERE or HAVING clause.