Sum column but for each row - sql

How to do like
and I want to have result like
and what I already try is like this
select no, student, sum(point) from student group by no, student
But the result is not like what I expected.

You appear to want a "running sum" e.g.
select
no
, student
, sum(point) over(partition by student order by no) run_sum
from student
and it seems that the extra point might a subtraction of the 2nd last running sum from the final like this.
WITH cte
AS (
SELECT
no
, student
, SUM(point) OVER (PARTITION BY student ORDER BY no) run_sum
, ROW_NUMBER() OVER (PARTITION BY student ORDER BY no DESC) rn
FROM student)
SELECT
t.*
, coalesce(t.rum_sum,0) - coalesce(ex.run_sum,0) AS extra_point
FROM cte t
LEFT JOIN (
SELECT
*
FROM cte
WHERE rn = 2) ex ON t.student = ex.student
AND t.rn = 1
AND ex.rn = 2
;
but without more guidance on the required logic for extra_point that is just a guess.

Related

Filter out null values resulting from window function lag() in SQL query

Example query:
SELECT *,
lag(sum(sales), 1) OVER(PARTITION BY department
ORDER BY date ASC) AS end_date_sales
FROM revenue
GROUP BY department, date;
I want to show only the rows where end_date is not NULL.
Is there a clause used specifically for these cases? WHERE or HAVING does not allow aggregate or window function cases.
One method uses a subquery:
SELECT r.*
FROM (SELECT r. *,
LAG(sum(sales), 1) OVER (ORDER BY date ASC) AS end_date
FROM revenue r
) r
WHERE end_date IS NOT NULL;
That said, I don't think the query is correct as you have written it. I would assume that you want something like this:
SELECT r.*
FROM (SELECT r. *,
LEAD(end_date, 1) OVER (PARTITION BY ? ORDER BY date ASC) AS end_date
FROM revenue r
) r
WHERE end_date IS NOT NULL;
Where ? is a column such as the customer id.
Try this
select * from (select distinct *,SUM(sales) OVER (PARTITION BY dept) from test)t
where t.date in(select max(date) from test group by dept)
order by date,dept;
And one more simpler way without sub query
SELECT distinct dept,MAX(date) OVER (PARTITION BY dept),
SUM(sales) OVER (PARTITION BY dept)
FROM test;

How to select multiple max values from a sql table

I am trying to get the top performers from a table, grouped by the company but can't seem to get the grouping right.
I have tried to use subqueries but this goes beyond my knowledge
I am trying to make a query that selects the rows in green. In other words I want to include the name, the company, and what they paid but only the top performers of each company.
Here is the raw data
create table test (person varchar(50),company varchar(50),paid numeric);
insert into
test
values
('bob','a',200),
('jane','a',100),
('mark','a',350),
('susan','b',650),
('thabo','b',100),
('thembi','b',210),
('lucas','b',110),
('oscar','c',10),
('janet','c',20),
('nancy','c',30)
You can use MAX() in a subquery as
CREATE TABLE T(
Person VARCHAR(45),
Company CHAR(1),
Paid INT
);
INSERT INTO T
VALUES ('Person1', 'A', 10),
('Person2', 'A', 20),
('Person3', 'B', 10);
SELECT T.*
FROM T INNER JOIN
(
SELECT Company, MAX(Paid) Paid
FROM T
GROUP BY Company
) TT ON T.Company = TT.Company AND T.Paid = TT.Paid;
Demo
Or using a window function as
SELECT Person,
Company,
Paid
FROM
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY Company ORDER BY Paid DESC) RN
FROM T
) TT
WHERE RN = 1;
Demo
Here's your query.
select a.person, a.company, a.paid from tableA a
inner join
(select person, company, row_number() over (partition by company order by paid desc) as rn from tableA) as t1
on t1.person = a.person and t1.company = a.company
where t1.rn = 1
Maybe something like
WITH ranked AS (SELECT person, company, paid
, rank() OVER (PARTITION BY company ORDER BY paid DESC) AS rnk
FROM yourtable)
SELECT person, company, paid
FROM ranked
WHERE rnk = 1
ORDER BY company;
You can use rank() function with partition by clause.
DENSE_RANK gives you the ranking within your ordered partition, but the ranks are consecutive. No ranks are skipped if there are ranks with multiple items.
WITH cte AS (
SELECT person, company, paid
rank() OVER (PARTITION BY company ORDER BY paid desc) rn
FROM yourtable
)
SELECT
*
FROM cte

Select most recent status for each ID and department code

I have the following table:
I want to get the most recent status for each dept_code that a CL_ID has. So the desired output would be this:
I have tried the following but this give me just the most recent status for each client and not each of their dept_codes.
SELECT *
FROM [CIMSHR6_MERGED].[dbo].[C3CLSTAT] C
INNER JOIN
(SELECT CLIENT_NUMBER, MAX(STATUS_DATE) AS SDATE
FROM [CIMSHR6_MERGED].[dbo].[C3CLSTAT]
GROUP BY CLIENT_NUMBER) X
ON X.CLIENT_NUMBER = C.CLIENT_NUMBER
AND X.SDATE = C.STATUS_DATE
ORDER BY C.CLIENT_NUMBER
Any help would be much appreciated. Thanks.
A convenient method that works in SQL Server is:
select top (1) cl.*
from [CIMSHR6_MERGED].[dbo].[C3CLSTAT] cl
order by row_number() over (partition by cl_id, dept_code order by status_date desc);
A method that is efficient with the right indexes in almost any database is:
select cl.*
from [CIMSHR6_MERGED].[dbo].[C3CLSTAT] cl
where cl.status_date = (select max(cl2.status_date)
from [CIMSHR6_MERGED].[dbo].[C3CLSTAT] cl2
where cl2.cl_id = cl.cl_id and cl2.dept_code = cl.dept_code
);
The right index is on (cl_id, dept_code, status_date).
I would also use ROW_NUMBER, but with a subquery:
SELECT CL_ID, Status_date, Status, Dept_code
FROM
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY CL_ID, Dept_code ORDER BY Status_date DESC) rn
FROM CIMSHR6_MERGED].[dbo].[C3CLSTAT]
) t
WHERE rn = 1;
1) Firstly group everything on Dept_Code,CL_ID and assign rank for each row with in the group in descending order.
2) Select all the rows with rnk=1 which would display your desired result.
SELECT Z.CL_ID,
Z.Status_Date,
Z.Status,
Z.Dept_Code
FROM
(
SELECT *,
RANK() OVER( PARTITION BY Dept_Code,CL_ID, ORDER BY Status_Date DESC ) AS rnk
FROM [CIMSHR6_MERGED].[dbo].[C3CLSTAT]
) Z
WHERE Z.rnk = 1;
This would work for almost all databases
select * from c3clstat c
where exists
(select 1 from c3clstat c1
where c1.cl_id=c.cl_id
and c1.dept_code=c.dept_code
group by cl_id,dept_code
having c.status_date=max(c1.status_date)
)

SQL Server Group By with Max on Date field

I hope i can explain the issue i'm having and hopefully so can point me in the same direction.
I'm trying to do a group by (Email Address) on a subset of data, then i'm using a max() on a date field but because of different values in other fields its bring back more rows then require.
I would just like to return the max record per email address and return the fields that are on the same row that are on the max record.
Not sure how i can write this query?
This is a task for ROW_NUMBER:
select *
from
(
select t.*,
-- assign sequential number starting with 1 for the maximum date
row_number() over (partiton by email_address order by datecol desc) as rn
from tab
) as dt
where rn = 1 -- only return the latest row
You can write this query using row_number():
select t.*
from (select t.*,
row_number() over (partition by emailaddress order by date desc) as seqnum
from t
) t
where seqnum = 1;
How about something like this?
select a.*
from baseTable as a
inner join
(select Email,
Max(EmailDate) as EmailDate
from baseTable
group by Email) as b
on a.Email = b.Email
and a.EmailDate = b.EmailDate

Column is invalid error when using derived table

I'm using ROW_NUMBER() and a derived table to fetch data from the derived table result.
However, I get the error message telling me I don't have the appropriate columns in the GROUP BY clause.
Here's the error:
Column 'tblCompetition.objID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
What column am I missing? Or am I doing something else wrong? Find below the query that is not working, and the (more simple) query that is working.
SQL Server 2008.
Query that isn't working:
SELECT
objID,
objTypeID,
userID,
datAdded,
count,
sno
FROM
(
SELECT scc.objID,scc.objTypeID,scc.userID,scc.datAdded,
COUNT(sci.favID) as count,
ROW_NUMBER() OVER(PARTITION BY scc.userID ORDER BY scc.unqID DESC) as sno
FROM tblCompetition scc
LEFT JOIN tblFavourites sci
ON sci.favID = scc.objID
AND sci.datTimeStamp BETWEEN #datStart AND #datEnd
) as t
WHERE sno <= 2 AND objTypeID = #objTypeID
AND datAdded BETWEEN #datStart AND #datEnd
GROUP BY objID,objTypeID,userID,datAdded,count,sno
Simple query that is working:
SELECT objId,objTypeID,userId,datAdded FROM
(
SELECT objId,objTypeID,userId,datAdded,
ROW_NUMBER() OVER(PARTITION BY userId ORDER BY unqid DESC) as sno
FROM tblRdbCompetition
) as t
WHERE sno<=2 AND objtypeid=#objTypeID
AND datAdded BETWEEN #datStart AND #datEnd
Thank you!
you need the GROUP BY in your subquery since that's where the aggregate is:
SELECT
objID,
objTypeID,
userID,
datAdded,
count,
sno
FROM
(
SELECT scc.objID,scc.objTypeID,scc.userID,scc.datAdded,
COUNT(sci.favID) as count,
ROW_NUMBER() OVER(PARTITION BY scc.userID ORDER BY scc.unqID DESC) as sno
FROM tblCompetition scc
LEFT JOIN tblFavourites sci
ON sci.favID = scc.objID
AND sci.datTimeStamp BETWEEN #datStart AND #datEnd
GROUP BY scc.objID,scc.objTypeID,scc.userID,scc.datAdded) as t
WHERE sno <= 2 AND objTypeID = #objTypeID
AND datAdded BETWEEN #datStart AND #datEnd
You cannot have count in a group by clause. Infact the count is derived when you have other fields in group by. Remove count from your Group by.
In the innermost query you are using
COUNT(sci.favID) as count,
which is an aggregate, and you select other non-aggregating columns along with it.
I believe you wanted an analytic COUNT instead:
SELECT objID,
objTypeID,
userID,
datAdded,
count,
sno
FROM (
SELECT scc.objID,scc.objTypeID,scc.userID,scc.datAdded,
COUNT(sci.favID) OVER (PARTITION BY scc.userID ) AS count,
ROW_NUMBER() OVER (PARTITION BY scc.userID ORDER BY scc.unqID DESC) as sno
FROM tblCompetition scc
LEFT JOIN
tblFavourites sci
ON sci.favID = scc.objID
AND sci.datTimeStamp BETWEEN #datStart AND #datEnd
) as t
WHERE sno = 1
AND objTypeID = #objTypeID