MSSQL Group by and Select rows from grouping - sql

I'm trying to figure out whether what I want to do is possible. Instead of running multiple queries against a table, I want to group the records by business date and id, then group by id again and select the amount from one date for one field and the amount from another date for the other field.
SELECT
*
{AMOUNT FROM DATE}
{AMOUNT FROM OTHER DATE}
FROM (
SELECT
date,
id,
SUM(amount) AS amount
FROM
table
GROUP BY id, date
) AS subquery
GROUP BY id

It seems that you're looking to do a pivot query. I usually use cross tabs for this. Based on the query you posted, it could look like:
SELECT
id,
SUM(CASE WHEN date = '20190901' THEN amount ELSE 0 END) AmountFromSept01,
SUM(CASE WHEN date = '20191001' THEN amount ELSE 0 END) AmountFromOct01
FROM (
SELECT
date,
id,
SUM(amount) AS amount
FROM
table
GROUP BY id, date
)AS subquery
GROUP BY id;
You could also use a CTE.
WITH CTE AS(
SELECT
date,
id,
SUM(amount) AS amount
FROM
table
GROUP BY id, date
)
SELECT
id,
SUM(CASE WHEN date = '20190901' THEN amount ELSE 0 END) AmountFromSept01,
SUM(CASE WHEN date = '20191001' THEN amount ELSE 0 END) AmountFromOct01
FROM CTE
GROUP BY id;
Or even be a rebel and do the operation directly.
SELECT
id,
SUM(CASE WHEN date = '20190901' THEN amount ELSE 0 END) AmountFromSept01,
SUM(CASE WHEN date = '20191001' THEN amount ELSE 0 END) AmountFromOct01
FROM table
GROUP BY id;
However, some people have tested this and found that pre-aggregating (as in the first two versions) can improve performance.

If I understand you correctly, then you're just trying to pivot, but only with two particular dates:
select id,
date1 = sum(iif(date = '2000-01-01', amount, null)),
date2 = sum(iif(date = '2000-01-02', amount, null))
from [table]
group by id
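The conditional-aggregation pivot described in the answers above can be tried out with a small, self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table name `sales`, its columns, and the sample rows are made up for illustration. The `SUM(CASE ...)` pattern is the same one the answers use.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (1, "2019-09-01", 10.0),
        (1, "2019-09-01", 5.0),
        (1, "2019-10-01", 7.0),
        (2, "2019-10-01", 3.0),
        (2, "2019-11-01", 99.0),  # matches neither date, so it adds 0 to both columns
    ],
)

# One pass over the table: each CASE routes a row's amount into the
# column for its date, and SUM collapses the rows per id.
pivot = conn.execute(
    """
    SELECT id,
           SUM(CASE WHEN date = '2019-09-01' THEN amount ELSE 0 END) AS amount_sept01,
           SUM(CASE WHEN date = '2019-10-01' THEN amount ELSE 0 END) AS amount_oct01
    FROM sales
    GROUP BY id
    ORDER BY id
    """
).fetchall()

for row in pivot:
    print(row)
```

Each id ends up on a single row with one column per date, which is what the nested GROUP BY in the question was aiming for.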

Related

Combine 2 queries together

I am struggling to work out combining a query that should give me 3 columns of Month, total_sold_products and drinks_sold_products
Query 1:
Select month(date), count(id) as total_sold_products
from Products
where date between '2022-01-01' and '2022-12-31'
Query 2
Select month(date), count(id) as drinks_sold_products
from Products where type = 'drinks' and date between '2022-01-01' and '2022-12-31'
I tried the union function but it summed count(id) twice and gave me only 2 columns
Many thanks!
Union is for attaching sets of data on top of each other. You need conditional aggregation or a join. See below.
SELECT MONTH(date) AS [Month],
COUNT(*) AS total_sold_products,
-- COUNT only skips NULLs, so the CASE must not fall back to 0
COUNT(CASE WHEN type = 'drinks' THEN 1 END) AS drinks_sold_products,
-- multiply by 1.0 to avoid integer division; every group has at least one row, so COUNT(*) is never 0
FORMAT(COUNT(CASE WHEN type = 'drinks' THEN 1 END) * 1.0 / COUNT(*), 'P') AS Percentage
FROM Products
WHERE date BETWEEN '2022-01-01' AND '2022-12-31'
GROUP BY MONTH(date)
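One detail of conditional counting is easy to get wrong: COUNT counts every non-NULL value, so a CASE that returns 0 for non-matching rows still counts them. The sketch below illustrates this, using Python's sqlite3 module as a stand-in for SQL Server. The `products` rows are invented sample data, and `strftime('%m', ...)` is SQLite's substitute for `MONTH()`.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, type TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        (1, "drinks", "2022-01-05"),
        (2, "food", "2022-01-09"),
        (3, "food", "2022-01-20"),
        (4, "drinks", "2022-02-02"),
    ],
)

rows = conn.execute(
    """
    SELECT CAST(strftime('%m', date) AS INTEGER) AS month,
           COUNT(*) AS total_sold,
           -- ELSE 0 is non-NULL, so this counts every row in the group
           COUNT(CASE WHEN type = 'drinks' THEN 1 ELSE 0 END) AS with_else_0,
           -- no ELSE means NULL for non-drinks, which COUNT skips
           COUNT(CASE WHEN type = 'drinks' THEN 1 END) AS drinks_sold,
           -- SUM over 1/0 flags gives the same conditional count
           SUM(CASE WHEN type = 'drinks' THEN 1 ELSE 0 END) AS drinks_sold_via_sum
    FROM products
    WHERE date BETWEEN '2022-01-01' AND '2022-12-31'
    GROUP BY month
    ORDER BY month
    """
).fetchall()

for row in rows:
    print(row)
```

In January the `with_else_0` column equals the total, while the other two columns report only the single drinks sale.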

CASE WHEN condition with MAX() function

There are a lot of questions on the CASE WHEN topic, but the closest to mine is "How to use CASE WHEN condition with MAX() function", which has not been resolved.
Here is some of my sample data:
date        debet
2022-07-15  57190.33
2022-07-14  815616516.00
2022-07-15  40866.67
2022-07-14  1221510.00
So, I want all records for the last two dates plus three additional columns: the sum for the previous day, the sum for the current day, and the difference between them:
SELECT
[debet],
[date] ,
SUM( CASE WHEN [date] = MAX(date) THEN [debet] ELSE 0 END ) AS sum_act,
SUM( CASE WHEN [date] = MAX(date) - 1 THEN [debet] ELSE 0 END ) AS sum_prev ,
(
SUM( CASE WHEN [date] = MAX(date) THEN [debet] ELSE 0 END )
-
SUM( CASE WHEN [date] = MAX(date) - 1 THEN [debet] ELSE 0 END )
) AS diff
FROM
Table
WHERE
[date] = ( SELECT MAX(date) FROM Table WHERE date < ( SELECT MAX(date) FROM Table) )
OR
[date] = ( SELECT MAX(date) FROM Table WHERE date = ( SELECT MAX(date) FROM Table ) )
GROUP BY
[date],
[debet]
Of course, this fails with an error saying that I can't use an aggregate function inside CASE WHEN. For now I use this combination: sum(CASE WHEN [date] = dateadd(dd,-3,cast(getdate() as date)) THEN [debet] ELSE 0 END). But then I have to adjust it every time for weekends and holidays. Is there any way to get the max date other than using getdate() in the CASE WHEN expression?
Expected result:
date        sum_act   sum_prev   diff
2022-07-15  97190.33  0.00       97190.33
2022-07-14  0.00      508769.96  -508769.96
You can use dense_rank() to filter the last 2 dates in your table. After that, you can use a conditional case expression with sum() to calculate the required values:
select [date],
sum_act = sum(case when rn = 1 then [debet] else 0 end),
sum_prev = sum(case when rn = 2 then [debet] else 0 end),
diff = sum(case when rn = 1 then [debet] else 0 end)
- sum(case when rn = 2 then [debet] else 0 end)
from
(
select *, rn = dense_rank() over (order by [date] desc)
from tbl
) t
where rn <= 2
group by [date]
Two steps:
1. Get the sums for the last three dates.
2. Show the results for the last two dates.
Well, we could also get all daily sums in step 1, but we just need the last three in order to calculate the sums for the last two days, so why aggregate more data than necessary?
Here is the query. You may have to put the date column name in brackets in SQL Server, as date is a keyword in SQL.
select top(2)
date,
sum_debit_current,
sum_debit_previous,
sum_debit_current - sum_debit_previous as diff
from
(
select
date,
sum(debet) as sum_debit_current,
lag(sum(debet)) over (order by date) as sum_debit_previous
from table
where date in (select distinct top(3) date from table order by date desc)
group by date
) daily_sums
order by date desc;
(SQL Server uses TOP(n) instead of standard SQL FETCH FIRST 3 ROWS and while SELECT DISTINCT TOP(3) date looks like "get the top 3 rows, then apply distinct on their date", it is really "apply distinct on the dates, then get the top 3" like in standard SQL.)
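The DENSE_RANK approach from the first answer can be checked against a tiny data set. The sketch below is only an illustration: it uses Python's sqlite3 module in place of SQL Server (window functions need SQLite 3.25 or newer), and the rows in `tbl` are invented, with a gap-free older date added to show that only the two latest dates survive the filter.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (date TEXT, debet REAL)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [
        ("2022-07-13", 100.0),  # third-latest date, filtered out by rn <= 2
        ("2022-07-14", 20.0),
        ("2022-07-14", 30.0),
        ("2022-07-15", 40.0),
        ("2022-07-15", 15.0),
    ],
)

# DENSE_RANK gives every row of the latest date rn = 1, the date before it rn = 2,
# regardless of calendar gaps such as weekends or holidays.
rows = conn.execute(
    """
    SELECT date,
           SUM(CASE WHEN rn = 1 THEN debet ELSE 0 END) AS sum_act,
           SUM(CASE WHEN rn = 2 THEN debet ELSE 0 END) AS sum_prev,
           SUM(CASE WHEN rn = 1 THEN debet ELSE 0 END)
             - SUM(CASE WHEN rn = 2 THEN debet ELSE 0 END) AS diff
    FROM (
        SELECT date, debet, DENSE_RANK() OVER (ORDER BY date DESC) AS rn
        FROM tbl
    ) AS t
    WHERE rn <= 2
    GROUP BY date
    ORDER BY date DESC
    """
).fetchall()

for row in rows:
    print(row)
```

Because the rank is based on the dates actually present in the table, no getdate() arithmetic is needed.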

Advanced SQL with window function

I have Table A (dimension table) and Table B (fact table), which stores shopper transaction history.
Table A: the shopper id (surrogate key) is created for each unique combination of column2, column3 and column4 (if the same combination is repeated, it gets the same shopper id).
Table B holds the transaction data.
I am trying to identify New customers and repeated customers for each week, expected output is below.
I am thinking of the following SQL statement for repeated customers:
Select COUNT(*) OVER (PARTITION BY shopperid,weekdate) as total_new_shopperid
For identifying new (i.e. unique) customers in the same join condition, I am stuck on the window function.
thanks,
Sam
You can use the DENSE_RANK analytical function along with aggregate function as follows:
SELECT WEEK_DATE,
COUNT(DISTINCT CASE WHEN DR = 1 THEN SHOPPER_ID END) AS TOTAL_NEW_CUSTOMER,
SUM(CASE WHEN DR = 1 THEN AMOUNT END) AS TOTAL_NEW_CUSTOMER_AMT,
COUNT(DISTINCT CASE WHEN DR > 1 THEN SHOPPER_ID END) AS TOTAL_REPEATED_CUSTOMER,
SUM(CASE WHEN DR > 1 THEN AMOUNT END) AS TOTAL_REPEATED_CUSTOMER_AMT
FROM
(
select T.*,
DENSE_RANK() OVER (PARTITION BY SHOPPER_ID ORDER BY WEEK_DATE) AS DR
FROM YOUR_TABLE T)
GROUP BY WEEK_DATE;
Cheers!!
Tejash's answer is fine (and I'm upvoting it).
However, Oracle is quite efficient with aggregation, so two levels of aggregation might have better performance (depending on the data):
select week_date,
sum(case when min_week_date = week_date then 1 else 0 end) as new_shoppers,
sum(case when min_week_date = week_date then amount else 0 end) as new_shopper_amount,
sum(case when min_week_date < week_date then 1 else 0 end) as returning_shoppers,
sum(case when min_week_date < week_date then amount else 0 end) as returning_amount
from (select shopper_id, week_date,
sum(amount) as amount,
min(week_date) over (partition by shopper_id) as min_week_date
from t
group by shopper_id, week_date
) sw
group by week_date
order by week_date;
Note: If this has better performance, it is probably due to the elimination of count(distinct).
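The idea behind both answers is that a shopper is new in the week that equals their first week and returning in any later week. Here is a minimal sketch of that logic. It runs on Python's sqlite3 module rather than Oracle, and it swaps the window function for a plain join against each shopper's first week so that it works on older SQLite versions. The table `t` follows the second answer, and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (shopper_id INTEGER, week_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "2020-01-06", 10.0),
        (2, "2020-01-06", 20.0),
        (1, "2020-01-13", 5.0),   # shopper 1 comes back in week 2
        (1, "2020-01-13", 2.0),   # second purchase in the same week
        (3, "2020-01-13", 8.0),   # shopper 3 is new in week 2
    ],
)

rows = conn.execute(
    """
    WITH weekly AS (            -- one row per shopper and week
        SELECT shopper_id, week_date, SUM(amount) AS amount
        FROM t
        GROUP BY shopper_id, week_date
    ),
    first_week AS (             -- each shopper's first week
        SELECT shopper_id, MIN(week_date) AS min_week_date
        FROM t
        GROUP BY shopper_id
    )
    SELECT w.week_date,
           SUM(CASE WHEN f.min_week_date = w.week_date THEN 1 ELSE 0 END) AS new_shoppers,
           SUM(CASE WHEN f.min_week_date < w.week_date THEN 1 ELSE 0 END) AS returning_shoppers
    FROM weekly AS w
    JOIN first_week AS f ON f.shopper_id = w.shopper_id
    GROUP BY w.week_date
    ORDER BY w.week_date
    """
).fetchall()

for row in rows:
    print(row)
```

Pre-aggregating to one row per shopper and week means a plain SUM of 1/0 flags is enough, so no COUNT(DISTINCT ...) is required.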

AVG duplication SQL

I'm currently having issues creating two columns with AVG for different date ranges.
I've tried the below code to try and resolve this.
WITH Tbl AS(
SELECT FORMAT(SaleDate,'MM')+'.'+FORMAT(SaleDate,'yyyy') AS SALE_MY, Employee, NewScheme
FROM Salereport
WHERE Business Area='Sales'
)
SELECT
AgentName,
(SELECT AVG(NewScheme) FROM Tbl WHERE SALE_MY='01.2019' OR SALE_MY='02.2019' OR SALE_MY='03.2019'),
(SELECT AVG(NewScheme) FROM Tbl WHERE SALE_MY='04.2019' OR SALE_MY='05.2019' OR SALE_MY='06.2019')
FROM Tbl
GROUP BY Employee
Result is just the same AVG for everyone.
Use conditional aggregation:
SELECT Employee,
AVG(CASE WHEN SaleDate >= '2019-01-01' AND SaleDate < '2019-04-01'
THEN NewScheme
END) AS Q1_Avg,
AVG(CASE WHEN SaleDate >= '2019-04-01' AND SaleDate < '2019-07-01'
THEN NewScheme
END) AS Q2_Avg
FROM Salereport
WHERE [Business Area] = 'Sales'
GROUP BY Employee;
When working with dates, you should be using date operations. The only time you normally need to convert to a string is to format dates in the result set.
Incidentally, your version is taking the average across all employees. No subquery is needed, but if you were to use one, you would want a correlated subquery.
Try this:
WITH Tbl AS(
SELECT FORMAT(SaleDate,'MM')+'.'+FORMAT(SaleDate,'yyyy') AS SALE_MY, Employee, NewScheme
FROM Salereport
WHERE [Business Area]='Sales'
)
SELECT Employee,
AVG(CASE WHEN SALE_MY IN ('01.2019', '02.2019', '03.2019') THEN NewScheme ELSE NULL END) AS Q1_Avg,
AVG(CASE WHEN SALE_MY IN ('04.2019', '05.2019', '06.2019') THEN NewScheme ELSE NULL END) AS Q2_Avg
FROM Tbl
GROUP BY Employee
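Both answers rely on the CASE expression yielding NULL for rows outside the date range, because AVG ignores NULLs. Returning 0 instead would pull the average down. The sketch below shows the difference, using Python's sqlite3 module as a stand-in for SQL Server. The `salereport` rows are invented, and the `q1_avg_else_0` column exists only to demonstrate the pitfall.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salereport (employee TEXT, saledate TEXT, newscheme REAL)")
conn.executemany(
    "INSERT INTO salereport VALUES (?, ?, ?)",
    [
        ("Ann", "2019-01-15", 10.0),
        ("Ann", "2019-02-15", 20.0),
        ("Ann", "2019-05-15", 40.0),
        ("Bob", "2019-04-10", 6.0),
    ],
)

rows = conn.execute(
    """
    SELECT employee,
           -- rows outside the range become NULL and are skipped by AVG
           AVG(CASE WHEN saledate >= '2019-01-01' AND saledate < '2019-04-01'
                    THEN newscheme END) AS q1_avg,
           AVG(CASE WHEN saledate >= '2019-04-01' AND saledate < '2019-07-01'
                    THEN newscheme END) AS q2_avg,
           -- with ELSE 0 the out-of-range rows are averaged in as zeros
           AVG(CASE WHEN saledate >= '2019-01-01' AND saledate < '2019-04-01'
                    THEN newscheme ELSE 0 END) AS q1_avg_else_0
    FROM salereport
    GROUP BY employee
    ORDER BY employee
    """
).fetchall()

for row in rows:
    print(row)
```

Ann's first-quarter average is 15 with the NULL form but drops to 10 once her May sale is counted as a zero, and Bob, who has no first-quarter sales, gets NULL rather than a misleading 0.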

How to get count of a particular row

I have a table that contains Id, Date and Status (i.e. open/close).
I just want a SQL result that contains the month-wise open count, close count and total count of Ids,
e.g. in Jan: open count 15, close count 5 and total count 20.
Use RollUp() and Group By as below:
;WITH T AS
(
SELECT
Id,
DATENAME(MONTH,[Date]) AS [MonthName],
Status
FROM #tblTest
)
SELECT
[MonthName],
[Status],
StatusCount
FROM
(
SELECT
MonthName,
CASE ISNULL(Status,'') WHEN '' THEN 'Total' ELSE Status END AS Status,
Count(Status) AS StatusCount
FROM T
GROUP BY ROLLUP([MonthName],[Status])
)X
WHERE X.MonthName IS NOT NULL
ORDER BY X.[MonthName],X.[Status]
Note: If you need the data in a single row per month, apply PIVOT.
select year(date), month(date),
sum(case when status = 'open' then 1 else 0 end) as open_count,
sum(case when status = 'closed' then 1 else 0 end) as closed_count,
count(*) as total_count
from your_table
group by year(date), month(date)