Using RollUp and Group By in SQL Server? - sql

I have a table Sales in SQL Server 2012:
Use tempdb
Go
CREATE TABLE Sales (EmpId INT, Yr INT, Sales MONEY)
INSERT Sales VALUES(1, 2005, 12000)
INSERT Sales VALUES(1, 2006, 18000)
INSERT Sales VALUES(1, 2007, 25000)
INSERT Sales VALUES(2, 2005, 15000)
INSERT Sales VALUES(2, 2006, 6000)
INSERT Sales VALUES(3, 2006, 20000)
INSERT Sales VALUES(3, 2007, 24000)
I want to create a report with these results:
/*
EmpId------ Yr----- SUM(Sales) BY EmpId, Yr---------- SUM(Sales) BY EmpId ----------SUM(Sales)
1 2005 12000.00 12000.00 12000.00
1 2006 18000.00 30000.00 30000.00
1 2007 25000.00 55000.00 55000.00
1 NULL 55000.00 55000.00
2 2005 15000.00 15000.00 70000.00
2 2006 6000.00 21000.00 76000.00
2 NULL 21000.00 76000.00
3 2006 20000.00 20000.00 96000.00
3 2007 24000.00 44000.00 120000.00
3 NULL 44000.00 120000.00
NULL NULL 120000.00
*/
I wrote a query like this:
SELECT EmpId, Yr, SUM(Sales) AS Sales
FROM Sales
GROUP BY EmpId, Yr WITH ROLLUP
How can I change my query to get the additional columns shown above?

In SQL Server 2012+, you can do cumulative sums using window functions. The following basically does what you want:
SELECT EmpId, Yr, SUM(Sales) AS Sales,
SUM(case when Yr is not null then SUM(Sales) end) OVER
(PARTITION BY EmpId
Order By (case when Yr is null then 0 else 1 end) desc, Yr
),
SUM(case when yr is not null then SUM(SALES) end) OVER
(Order by EmpId, (case when Yr is null then 0 else 1 end) desc, Yr)
FROM Sales
GROUP BY EmpId, Yr WITH ROLLUP
ORDER BY (case when EmpId is null then 0 else 1 end) desc, empid,
(case when Yr is null then 0 else 1 end) desc, yr;
This is tricky because the interplay between the rollup and the window functions requires care.
Here is the SQL Fiddle.
EDIT:
To fix the very last cell on the last row, you can add a case expression:
SELECT EmpId, Yr, SUM(Sales) AS Sales,
SUM(case when Yr is not null then SUM(Sales) end) OVER
(PARTITION BY EmpId
Order By (case when Yr is null then 0 else 1 end) desc, Yr
),
(case when yr is null and empid is null
then sum(case when yr is not null and empid is not null then sum(sales) end) over ()
else SUM(case when yr is not null then SUM(SALES) end) OVER
(Order by EmpId, (case when Yr is null then 0 else 1 end) desc, Yr)
end)
FROM Sales
GROUP BY EmpId, Yr WITH ROLLUP
ORDER BY (case when EmpId is null then 0 else 1 end) desc, empid,
(case when Yr is null then 0 else 1 end) desc, yr;
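The two running-total columns can be checked in isolation with a small sketch. This is not the SQL Server query above: it uses Python's sqlite3 module as a stand-in engine (an assumption made for easy testing), and because SQLite has no ROLLUP it covers only the per-year detail rows. The aliases EmpRunning and GrandRunning are illustrative names, not from the original.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (assumption).
# SQLite has no ROLLUP, so only the detail rows are reproduced.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (EmpId INTEGER, Yr INTEGER, Sales INTEGER)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [(1, 2005, 12000), (1, 2006, 18000), (1, 2007, 25000),
     (2, 2005, 15000), (2, 2006, 6000),
     (3, 2006, 20000), (3, 2007, 24000)],
)

# EmpRunning restarts for every employee; GrandRunning accumulates
# across all employees in (EmpId, Yr) order.
running_totals = conn.execute("""
    SELECT EmpId, Yr, Sales,
           SUM(Sales) OVER (PARTITION BY EmpId ORDER BY Yr) AS EmpRunning,
           SUM(Sales) OVER (ORDER BY EmpId, Yr)             AS GrandRunning
    FROM Sales
    ORDER BY EmpId, Yr
""").fetchall()

for row in running_totals:
    print(row)
```

The detail rows match the last two columns of the requested report; the subtotal rows still need the ROLLUP handling shown in the answer.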

Related

Subtracting the revenue value for the earliest date minus the latest date in redshift sql

account name    year    revenue
-------------------------------
abc             2006    1000
abc             2007    2000
abc             2008    5000
Hello everyone,
I am trying to find a way to subtract the revenue for the earliest year from the revenue for the latest year for each account name in a dataset.
For example in the above table
the latest year for abc -> 2008
the earliest year for abc -> 2006,
I can't hardcode the years in the code, I don't know what the years would be.
So, I want to get something like this
account name    subtracted revenue
----------------------------------
abc             4000
I wish I could share some code, but I have no idea how to proceed. I was thinking of using a window function, but I don't know how to apply it in this scenario.
Here is something I tried; see whether it helps:
insert into revenue
select * from
(select 'abc' as accountname, 2006 as year, 1000 as revenue union
select 'abc' as accountname, 2007 as year, 2000 as revenue union
select 'abc' as accountname, 2008 as year, 5000 as revenue union
select 'def' as accountname, 2004 as year, 1000 as revenue union
select 'def' as accountname, 2006 as year, 3000 as revenue union
select 'xyz' as accountname, 2005 as year, 5000 as revenue
) as a
select accountname
, sum(case when yearname ='maxyear' then revenue else 0 end) - sum(case when yearname ='minyear' then revenue else 0 end) subtractedrevenue
from
(select *
, case when min(year)over(partition by accountname order by accountname) = max(year)over(partition by accountname order by accountname) then 'maxyear'
when year = min(year)over(partition by accountname order by accountname) then 'minyear'
when year= max(year)over(partition by accountname order by accountname) then 'maxyear'
else '' end yearname
from revenue
) as a
where yearname<>''
group by accountname
You can use window functions:
select account_name,
sum(case when seqnum_desc = 1 then revenue else - revenue end)
from (select t.*,
row_number() over (partition by account_name order by year) as seqnum_asc,
row_number() over (partition by account_name order by year desc) as seqnum_desc
from t
) t
where 1 in (seqnum_asc, seqnum_desc)
group by account_name;
Here is a db<>fiddle.
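A runnable sketch of the same row_number() idea follows, using Python's sqlite3 module as a stand-in for Redshift (an assumption) and the sample rows from the earlier attempt. It is a variant rather than the exact query above: it sums the latest and earliest rows separately, so an account with a single year yields 0. The column name yr and the alias subtracted_revenue are illustrative.

```python
import sqlite3

# SQLite stands in for Redshift here (assumption); table and column
# names follow the sample data, with "yr" chosen for the year column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revenue (accountname TEXT, yr INTEGER, revenue INTEGER)")
conn.executemany(
    "INSERT INTO revenue VALUES (?, ?, ?)",
    [("abc", 2006, 1000), ("abc", 2007, 2000), ("abc", 2008, 5000),
     ("def", 2004, 1000), ("def", 2006, 3000),
     ("xyz", 2005, 5000)],
)

# seqnum_asc = 1 marks the earliest year, seqnum_desc = 1 the latest.
# Summing each side separately makes a single-row account come out as 0.
diffs = conn.execute("""
    SELECT accountname,
           SUM(CASE WHEN seqnum_desc = 1 THEN revenue ELSE 0 END)
         - SUM(CASE WHEN seqnum_asc  = 1 THEN revenue ELSE 0 END) AS subtracted_revenue
    FROM (SELECT r.*,
                 ROW_NUMBER() OVER (PARTITION BY accountname ORDER BY yr)      AS seqnum_asc,
                 ROW_NUMBER() OVER (PARTITION BY accountname ORDER BY yr DESC) AS seqnum_desc
          FROM revenue r) AS ranked
    WHERE seqnum_asc = 1 OR seqnum_desc = 1
    GROUP BY accountname
    ORDER BY accountname
""").fetchall()

print(diffs)
```

Whether a single-year account should report 0 or its own revenue depends on the requirement; this sketch assumes 0.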

SQL column count

I need to get the following expected result from the data in the attached screenshot:
month empid emp_name p count wo count Totalhrs
----------------------------------------------------------------------------
FEB 00113 HUda salem al kaabi 25 4 1250
You can try a query like this:
Select
empid,
max(emp_name) as emp_name,
count(*) as count,
sum(case when dstatus='wo' then 1 else 0 end) as wo_count,
sum(total_hrs) as totalhrs
from
<your_table>
where
p_date like '2020-10%'
group by
empid
Use conditional aggregation:
select empid, emp_name,
sum(case when dstatus = 'P' then 1 else 0 end) p_count,
sum(case when dstatus = 'WO' then 1 else 0 end) wo_count,
sum(total_hrs) total_hrs
from mytable
group by empid, emp_name
Another option, using IIF:
SELECT empid, emp_name, COUNT(*) 'p count', COUNT( IIF(DSTATUS = 'WO',1,NULL )) 'wo count', SUM(Total_Hrs) 'Total Hours'
FROM tb
GROUP BY empid, emp_name
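One pitfall with conditional counting is worth a quick check: COUNT counts every non-NULL value, so a CASE with ELSE 0 counts all rows, while SUM with ELSE 0 (or COUNT with no ELSE) counts only the matches. The sketch below uses Python's sqlite3 module and a hypothetical three-row attendance table; the table name, the rows, and the aliases are assumptions made for illustration.

```python
import sqlite3

# Hypothetical attendance rows: two 'P' days and one 'WO' day.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendance (empid TEXT, dstatus TEXT)")
conn.executemany(
    "INSERT INTO attendance VALUES (?, ?)",
    [("00113", "P"), ("00113", "P"), ("00113", "WO")],
)

# COUNT ignores only NULLs, so ELSE 0 makes it count every row.
counts = conn.execute("""
    SELECT COUNT(CASE WHEN dstatus = 'WO' THEN 1 ELSE 0 END) AS count_else_zero,
           COUNT(CASE WHEN dstatus = 'WO' THEN 1 END)        AS count_else_null,
           SUM(CASE WHEN dstatus = 'WO' THEN 1 ELSE 0 END)   AS sum_else_zero
    FROM attendance
    GROUP BY empid
""").fetchone()

print(counts)
```

Only the second and third expressions give the number of 'WO' rows; the first returns the total row count.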

I am looking to find customers repurchase frequency in SQL from their first purchase date

I am trying to find customers' repurchase rates from their first order date. For example, for 2016, how many customers purchased once in days 1-365 from their initial purchase, how many purchased twice, and so on.
I have a transaction_detail table which looks like below:
txn_date Customer_ID Transaction_Number Sales
1/2/2019 1 12345 $10
4/3/2018 1 65890 $20
3/22/2019 3 64453 $30
4/3/2019 4 88567 $20
5/21/2019 4 85446 $15
1/23/2018 5 89464 $40
4/3/2019 5 99674 $30
4/3/2019 6 32224 $20
1/23/2018 6 46466 $30
1/20/2018 7 56558 $30
I am able to find the customers who shopped in 2016 and how many times they repurchased in 2016, but I need to find the customers who shopped in 2016 and how many times they came back, counted from their first purchase date.
I need a starting point for the query, I am not sure how to build this logic in my SQL code.
Any help would be appreciated.
I am using the below query:
WITH by_year
AS (SELECT
Customer_ID,
to_char(txn_date, 'YYYY') AS visit_year
FROM transaction_detail
GROUP BY Customer_ID, to_char(txn_date, 'YYYY')),
with_first_year
AS (SELECT
Customer_ID,
visit_year,
FIRST_VALUE(visit_year) OVER (PARTITION BY Customer_ID ORDER BY visit_year) AS first_year
FROM by_year),
with_year_number
AS (SELECT
Customer_ID,
visit_year,
first_year,
(visit_year - first_year) AS year_number
FROM with_first_year)
SELECT
first_year AS first_year,
SUM(CASE WHEN year_number = 0 THEN 1 ELSE 0 END) AS year_0,
SUM(CASE WHEN year_number = 1 THEN 1 ELSE 0 END) AS year_1,
SUM(CASE WHEN year_number = 2 THEN 1 ELSE 0 END) AS year_2,
SUM(CASE WHEN year_number = 3 THEN 1 ELSE 0 END) AS year_3,
SUM(CASE WHEN year_number = 4 THEN 1 ELSE 0 END) AS year_4,
SUM(CASE WHEN year_number = 5 THEN 1 ELSE 0 END) AS year_5,
SUM(CASE WHEN year_number = 6 THEN 1 ELSE 0 END) AS year_6,
SUM(CASE WHEN year_number = 7 THEN 1 ELSE 0 END) AS year_7,
SUM(CASE WHEN year_number = 8 THEN 1 ELSE 0 END) AS year_8,
SUM(CASE WHEN year_number = 9 THEN 1 ELSE 0 END) AS year_9
FROM with_year_number
GROUP BY first_year
ORDER BY first_year
Use window functions and aggregation:
select cnt, count(*), min(customer_id), max(customer_id)
from (select customer_id, count(*) as cnt
from (select td.*,
min(txn_date) over (partition by Customer_ID) as min_txn_date
from transaction_detail td
) td
where txn_date >= min_txn_date and txn_date < min_txn_date + interval '365' day
group by customer_id
) c
group by cnt
order by cnt;
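The query above can be tried against the sample rows from the question. The sketch below uses Python's sqlite3 module as a stand-in engine (an assumption), stores the dates as ISO text, and replaces the interval arithmetic with a julianday difference, since SQLite has no interval type. The alias customers is illustrative.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO text so
# that MIN() and julianday() work in SQLite (stand-in engine, assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transaction_detail (txn_date TEXT, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO transaction_detail VALUES (?, ?)",
    [("2019-01-02", 1), ("2018-04-03", 1), ("2019-03-22", 3),
     ("2019-04-03", 4), ("2019-05-21", 4), ("2018-01-23", 5),
     ("2019-04-03", 5), ("2019-04-03", 6), ("2018-01-23", 6),
     ("2018-01-20", 7)],
)

# Inner query: attach each customer's first purchase date.
# Middle query: count purchases within 365 days of that date.
# Outer query: count customers per purchase count.
frequency = conn.execute("""
    SELECT cnt, COUNT(*) AS customers
    FROM (SELECT customer_id, COUNT(*) AS cnt
          FROM (SELECT td.*,
                       MIN(txn_date) OVER (PARTITION BY customer_id) AS min_txn_date
                FROM transaction_detail td) AS with_first
          WHERE julianday(txn_date) - julianday(min_txn_date) < 365
          GROUP BY customer_id) AS per_customer
    GROUP BY cnt
    ORDER BY cnt
""").fetchall()

print(frequency)
```

Each customer's first purchase is included in cnt, so cnt = 1 means no repurchase within the first 365 days.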
As I understand it, you want the number of distinct customers who made their first purchase in 2016 and repurchased one year or more after that purchase.
Select * from
(
Select customer_id,
Floor(months_between(lead_txn_date, txn_date)/12) as num_years
From
(
Select customer_id,
txn_date,
row_number() over (partition by Customer_ID order by txn_date) as rn,
lead(txn_date) over (partition by Customer_ID order by txn_date) as lead_txn_date
From your_table
)
Where txn_date >= date '2016-01-01'
and txn_date < date '2017-01-01'
and rn = 1
And months_between(lead_txn_date, txn_date) >= 12
)
Pivot
(
Count(1) for num_years in (1,2,3,4)
)
Ultimately, we are finding the number of years between first and second purchase of the customer. And first purchase must be in 2016.
Cheers!!

Running count for each 2 rows

I am trying to calculate running count for each 2 rows like below,
CREATE TABLE sales
(
EmpId INT,
Yr INT,
Sales DECIMAL(8,2)
)
INSERT INTO sales (EmpId, Yr, Sales)
VALUES (1, 2005, 12000), (1, 2006, 18000), (1, 2007, 25000),
(1, 2008, 25000), (1, 2009, 25000),
(2, 2005, 15000), (2, 2006, 6000), (2, 2007, 6000)
SELECT
EmpId, Yr, sales,
SUM(Sales) OVER (PARTITION BY empid ORDER BY empid ROWS BETWEEN 2 PRECEDING AND CURRENT ROW ) AS TotalSales2
FROM
sales
Output:
EmpId Yr sales TotalSales2
-----------------------------------
1 2005 12000 12000
1 2006 18000 30000
1 2007 25000 55000
1 2008 25000 68000
1 2009 25000 75000
2 2005 15000 15000
2 2006 6000 21000
2 2007 6000 27000
But expected output:
EmpId Yr Sales TotalSales2
-----------------------------------
1 2005 12000 12000
1 2006 18000 30000
1 2007 25000 25000
1 2008 25000 50000
1 2009 25000 25000
2 2005 15000 15000
2 2006 6000 21000
2 2007 6000 6000
What am I doing wrong in this query?
Note: SQL Server version is 2012.
SELECT EmpId, Yr, Sales,
CASE WHEN ROW_NUMBER() OVER (PARTITION BY EmpId ORDER BY yr) % 2 = 0
THEN sales + lag(sales, 1, 0) OVER (PARTITION BY empid ORDER BY yr)
ELSE sales
END AS TotalSales2
FROM sales
Lag returns the previous row's value - when row_number() is even, add the current row's value to the previous row - otherwise, just show the sales for the current row. Partition each by EmpId, order each by yr - output matches the expected.
Also, thanks so much for adding the DDL/sample data.
The expression:
SUM(Sales) OVER (PARTITION BY empid
ORDER BY empid
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
calculates the sum over the current row and the 2 rows immediately preceding it. So it actually calculates a rolling sum, which is not what you want.
I think you are actually looking for something like the following:
;WITH CTE_Group AS (
SELECT EmpId, Yr, sales,
(ROW_NUMBER() OVER (PARTITION BY empid ORDER BY yr) + 1 ) / 2 AS grp
FROM sales
)
SELECT EmpId, Yr, sales,
SUM(sales) OVER (PARTITION BY empid, grp
ORDER BY yr) AS TotalSales2
FROM CTE_Group
The above query uses a CTE in order to calculate field grp: the value of this field is 1 for the first two records of an empid partition, 2 for the next two records, and so on.
Using grp we can calculate the running total of sales for groups of 2 as is the requirement of the OP.
Demo here
Edit:
To form larger groups of records, try using (credit goes to @Max Szczurek for pointing this out):
(ROW_NUMBER() OVER (PARTITION BY empid ORDER BY yr) - 1 ) / n AS grp
where n is the number of records each group contains.
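As a quick check of the generalized grouping expression, the sketch below runs it with n = 2 against the question's sample data. It uses Python's sqlite3 module as a stand-in for SQL Server (an assumption); the CTE name grouped and the Python variable n are illustrative. Integer division of the zero-based row number by n puts rows 1..n in group 0, rows n+1..2n in group 1, and so on.

```python
import sqlite3

# SQLite stands in for SQL Server 2012 here (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (EmpId INTEGER, Yr INTEGER, Sales INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 2005, 12000), (1, 2006, 18000), (1, 2007, 25000),
     (1, 2008, 25000), (1, 2009, 25000),
     (2, 2005, 15000), (2, 2006, 6000), (2, 2007, 6000)],
)

n = 2  # number of rows per group

# (row_number - 1) / n is integer division, giving a group id per row;
# the running sum then restarts at the start of every group.
paired = conn.execute("""
    WITH grouped AS (
        SELECT EmpId, Yr, Sales,
               (ROW_NUMBER() OVER (PARTITION BY EmpId ORDER BY Yr) - 1) / ? AS grp
        FROM sales
    )
    SELECT EmpId, Yr, Sales,
           SUM(Sales) OVER (PARTITION BY EmpId, grp ORDER BY Yr) AS TotalSales2
    FROM grouped
    ORDER BY EmpId, Yr
""", (n,)).fetchall()

for row in paired:
    print(row)
```

Changing n to 3 would restart the running total every three rows instead.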
Although an answer has already been accepted, consider the query below as well. It gives the required output:
DECLARE @sales TABLE(EmpId INT, Yr INT, Sales DECIMAL(8,2))
INSERT INTO @sales ( EmpId, Yr, Sales )
VALUES (1, 2005, 12000),
(1, 2006, 18000),
(1, 2007, 25000),
(1, 2008, 25000),
(1, 2009, 25000),
(2, 2005, 15000),
(2, 2006, 6000),
(2, 2007, 6000)
;WITH SAMPLE_DATA
AS
(
SELECT ROW_NUMBER()over(partition by empid order by Yr)SNO,* FROM @sales
)
SELECT EmpId,Yr,Sales
,CASE WHEN (SNO%2=0)
THEN SALES+
(
SELECT Sales FROM SAMPLE_DATA T2 WHERE T2.EmpId=T1.EmpId AND T2.SNO=T1.SNO-1
)
ELSE Sales END
TotalSales2
FROM SAMPLE_DATA T1
OUTPUT
--------------------------------------
--EmpId Yr Sales TotalSales2
--------------------------------------
1 2005 12000.00 12000.00
1 2006 18000.00 30000.00
1 2007 25000.00 25000.00
1 2008 25000.00 50000.00
1 2009 25000.00 25000.00
2 2005 15000.00 15000.00
2 2006 6000.00 21000.00
2 2007 6000.00 6000.00
--------------------------------------

SQL query to group by age range from date created

I want to get statistics with sql query. My table is like this:
ID MATERIAL CREATEDATE DEPARTMENT
1 M1 10.10.1980 D1
2 M2 11.02.1970 D2
2 M3 18.04.1971 D3
.....................
.....................
.....................
How can I get a range of data count like this
DEPARTMENT AGE<10 10<AGE<20 20<AGE
D1 24 123 324
D2 24 123 324
Assuming that CREATEDATE is a date column, in PostgreSQL you can use the AGE function:
select DEPARTMENT, age(CREATEDATE) as AGE
from Materials
and with date_part you can get the age in years. To show the data in the format that you want, you could use this GROUP BY query:
select
DEPARTMENT,
sum(case when date_part('year', age(CREATEDATE))<10 then 1 end) as "age<10",
sum(case when date_part('year', age(CREATEDATE))>=10 and date_part('year', age(CREATEDATE))<20 then 1 end) as "10<age<20",
sum(case when date_part('year', age(CREATEDATE))>=20 then 1 end) as "20<age"
from
Materials
group by
DEPARTMENT
which can be simplified as:
with mat_age as (
select DEPARTMENT, date_part('year', age(CREATEDATE)) as mage
from Materials
)
select
DEPARTMENT,
sum(case when mage<10 then 1 end) as "age<10",
sum(case when mage>=10 and mage<20 then 1 end) as "10<age<20",
sum(case when mage>=20 then 1 end) as "20<age"
from
mat_age
group by
DEPARTMENT;
If you are using PostgreSQL 9.4 or later, you can use FILTER:
with mat_age as (
select DEPARTMENT, date_part('year', age(CREATEDATE)) as mage
from Materials
)
select
DEPARTMENT,
count(*) filter (where mage<10) as "age<10",
count(*) filter (where mage>=10 and mage<20) as "10<age<20",
count(*) filter (where mage>=20) as "20<age"
from
mat_age
group by
DEPARTMENT;
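The bucketing logic itself can be sketched outside the database. The Python below is only an illustration under stated assumptions: the rows are hypothetical, the reference date is fixed so the result is reproducible, and full_years is a helper written for this sketch that counts whole elapsed years, which is what date_part('year', age(...)) yields.

```python
from collections import defaultdict
from datetime import date


def full_years(created, today):
    """Whole years elapsed from created to today."""
    # Subtract one if this year's anniversary has not been reached yet.
    not_reached = (today.month, today.day) < (created.month, created.day)
    return today.year - created.year - not_reached


today = date(2016, 6, 1)  # fixed reference date (assumption)

# Hypothetical (department, create date) rows.
materials = [
    ("D1", date(2010, 3, 15)),
    ("D1", date(2000, 7, 1)),
    ("D1", date(1980, 10, 10)),
    ("D2", date(1970, 2, 11)),
    ("D2", date(2005, 6, 1)),
]

# One counter triple per department: [age < 10, 10 <= age < 20, age >= 20].
buckets = defaultdict(lambda: [0, 0, 0])
for dept, created in materials:
    age = full_years(created, today)
    index = 0 if age < 10 else 1 if age < 20 else 2
    buckets[dept][index] += 1

for dept in sorted(buckets):
    print(dept, buckets[dept])
```

The three counters correspond to the "age<10", "10<age<20" and "20<age" columns of the queries above.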
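The following solution assumes that your CREATEDATE column is stored as a valid Postgres date type. If this is not the case, and it is being stored as text, you will first have to convert it to a date for the query to work. Note that DATEDIFF is not a built-in PostgreSQL function (it is available in Amazon Redshift, for example), and that DATEDIFF(year, ...) counts the year boundaries crossed rather than the full years elapsed.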
SELECT DEPARTMENT,
SUM(CASE WHEN DATEDIFF(year, CREATEDATE, now()::date) < 10 THEN 1 ELSE 0 END) AS "AGE<10",
SUM(CASE WHEN DATEDIFF(year, CREATEDATE, now()::date) >= 10 AND
DATEDIFF(year, CREATEDATE, now()::date) < 20 THEN 1 ELSE 0 END) AS "10<AGE<20",
SUM(CASE WHEN DATEDIFF(year, CREATEDATE, now()::date) >= 20 THEN 1 ELSE 0 END) AS "20<AGE"
FROM Materials
GROUP BY DEPARTMENT
You can use extract(year FROM age(createdate)) to get the age in whole years,
e.g.
select extract(year FROM age(timestamp '01-01-1989')) age
will give you
Result:
age
---
27
so you can use following select statement to get your desired output:
SELECT dept
,sum(CASE WHEN age < 10 THEN 1 END) "age<10"
,sum(CASE WHEN age >= 10 AND age < 20 THEN 1 END) "10<age<20"
,sum(CASE WHEN age >= 20 THEN 1 END) "20<age"
FROM (
SELECT dept,extract(year FROM age(crdate)) age
FROM dt
) t
GROUP BY dept
If you don't want to use a sub select use this.
SELECT dept
,sum(CASE WHEN extract(year FROM age(crdate)) < 10 THEN 1 END) "age<10"
,sum(CASE WHEN extract(year FROM age(crdate)) >= 10 AND extract(year FROM age(crdate)) < 20 THEN 1 END) "10<age<20"
,sum(CASE WHEN extract(year FROM age(crdate)) >= 20 THEN 1 END) "20<age"
FROM dt
GROUP BY dept