SQL query: GROUP BY with null values is returning duplicates

I have a query (below) that reads from a #dates table with the following records:
month year saledate
9 2020 2020-09-01
10 2020 2020-10-01
11 2020 2020-11-01
with monthlysalesdata as(
select month(salesdate) as salemonth, year(salesdate) as saleyear,salesrepid, salespercentage
from salesrecords r
join #dates d on d.saledate = r.salesdate
group by salesrepid, salesdate),
averagefor3months as(
select 0 as salemonth, 0 as saleyear, salesrepid, salespercentage
from monthlysalesdata
group by salesrepid),
finallist as(
select * from monthlysalesdata
union
select * from averagefor3months)
select * from finallist
This query returns the following records. The averagefor3months result set contains a duplicate row when there is a null record in the first CTE (monthlysalesdata). How can I get the 3-month average as a single record instead of duplicates?
salesrepid salemonth saleyear percentage
232 0 0 null -------------this is the duplicate record
232 0 0 90
232 9 2020 80
232 10 2020 null
232 11 2020 100
My first cte has this result:
salerepid month year percentage
---------------------------------------------
232 9 2020 80
232 10 2020 null
232 11 2020 100
My second cte has this result:
salerepid month year percentage
---------------------------------------------
232 0 0 null
232 0 0 90
How can I avoid the duplicate record in my second CTE?

I suspect that you want a summary row per sales rep based on some aggregation. Your question is not clear on what is needed for the aggregation, but something like this:
with ym as (
select r.salesrepid, d.year, d.month, sum(<something>) as whatever
from salesrecords r join
#dates d
on d.saledate = r.salesdate
group by r.salesrepid, d.year, d.month
)
select ym.*
from ym
union all
select salesrepid, null, null, avg(whatever)
from ym
group by salesrepid;
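A minimal sketch of the same idea, run against SQLite through Python's `sqlite3` module rather than SQL Server. The table contents mirror the rows in the question; using `AVG(salespercentage)` as the aggregate and naming the temp table `dates` are assumptions made only for illustration. Because `AVG` skips NULLs and the summary branch groups by `salesrepid` alone, each rep gets exactly one summary row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE salesrecords (salesrepid INTEGER, salesdate TEXT, salespercentage REAL);
    CREATE TABLE dates (month INTEGER, year INTEGER, saledate TEXT);
    INSERT INTO dates VALUES
        (9, 2020, '2020-09-01'), (10, 2020, '2020-10-01'), (11, 2020, '2020-11-01');
    INSERT INTO salesrecords VALUES
        (232, '2020-09-01', 80), (232, '2020-10-01', NULL), (232, '2020-11-01', 100);
""")

query = """
WITH ym AS (
    -- one row per rep and month; AVG is an assumed aggregate
    SELECT r.salesrepid, d.year, d.month, AVG(r.salespercentage) AS pct
    FROM salesrecords r
    JOIN dates d ON d.saledate = r.salesdate
    GROUP BY r.salesrepid, d.year, d.month
)
SELECT salesrepid, year, month, pct FROM ym
UNION ALL
-- one summary row per rep; AVG ignores the NULL month
SELECT salesrepid, NULL, NULL, AVG(pct) FROM ym GROUP BY salesrepid
"""
rows = conn.execute(query).fetchall()
summary = [r for r in rows if r[1] is None]
print(summary)
```

The duplicate in the question came from grouping on the percentage column implicitly; aggregating it instead collapses the NULL and non-NULL months into one value.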

I updated the query to select the group by directly from the table instead of from the previous CTE, and got my results. Thank you all for helping.
with ym as (
select r.salesrepid, d.year, d.month, sum(<something>) as whatever
from salesrecords r join
#dates d
on d.saledate = r.salesdate
group by r.salesrepid, d.year, d.month
),
threemonthsaverage as(
select r.salesrepid, 0 as year, 0 as month, sum(<something>) as whatever
from salesrecords as r
group by r.salesrepid)
select * from ym
union
select * from threemonthsaverage

Related

SQL : create intermediate data from date range

I have a table as shown here:
USER ROI DATE
1 5 2021-11-24
1 4 2021-11-26
1 6 2021-11-29
I want to get the ROI for the dates in between the other dates, expected result will be as below
From 2021-11-24 to 2021-11-30
USER ROI DATE
1 5 2021-11-24
1 5 2021-11-25
1 4 2021-11-26
1 4 2021-11-27
1 4 2021-11-28
1 6 2021-11-29
1 6 2021-11-30
You may use a calendar table approach here. Create a table containing all dates and then join with it. Sans an actual table, you may use an inline CTE:
WITH dates AS (
SELECT '2021-11-24' AS dt UNION ALL
SELECT '2021-11-25' UNION ALL
SELECT '2021-11-26' UNION ALL
SELECT '2021-11-27' UNION ALL
SELECT '2021-11-28' UNION ALL
SELECT '2021-11-29' UNION ALL
SELECT '2021-11-30'
),
cte AS (
SELECT USER, ROI, DATE, LEAD(DATE) OVER (ORDER BY DATE) AS NEXT_DATE
FROM yourTable
)
SELECT t.USER, t.ROI, d.dt
FROM dates d
INNER JOIN cte t
ON d.dt >= t.DATE AND (d.dt < t.NEXT_DATE OR t.NEXT_DATE IS NULL)
ORDER BY d.dt;
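As a rough, runnable variant of the calendar approach, the sketch below uses SQLite through Python's `sqlite3` module. It swaps in two things that are not in the answer above: a recursive CTE that generates the calendar, and a correlated `MAX` lookup instead of `LEAD`. The table name `roi_log` and its column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE roi_log (user_id INTEGER, roi INTEGER, dt TEXT);
    INSERT INTO roi_log VALUES
        (1, 5, '2021-11-24'), (1, 4, '2021-11-26'), (1, 6, '2021-11-29');
""")

query = """
WITH RECURSIVE calendar(dt) AS (
    SELECT '2021-11-24'
    UNION ALL
    SELECT date(dt, '+1 day') FROM calendar WHERE dt < '2021-11-30'
)
SELECT r.user_id, r.roi, c.dt
FROM calendar c
JOIN roi_log r
  -- pick, per user, the most recent logged row on or before the calendar date
  ON r.dt = (SELECT MAX(r2.dt)
             FROM roi_log r2
             WHERE r2.user_id = r.user_id AND r2.dt <= c.dt)
ORDER BY c.dt
"""
filled = conn.execute(query).fetchall()
for row in filled:
    print(row)
```

Each calendar date picks up the ROI of the latest earlier entry, which is the same carry-forward behaviour the `LEAD`-based join produces.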

In SQL, is there a way to show all dates even if the date doesn't have data points?

I have a transaction table t as follows in MS SQL Management Studio:
If I run the following SQL to summarise the transaction:
Select
Format(Transaction_Date, 'MMM-yyyy') as 'Year/Month'
,Customer
,Count(Customer) as SalesCount
From t
Group by Format(Transaction_Date, 'MMM-yyyy'), Customer
Order by Customer, Format(Transaction_Date, 'MMM-yyyy')
I'll get:
However I was asked to add all the months for the year 2019 and if there's no transaction in a certain month then return 0 for the SalesCount column:
I tried to create a month table with all the months in 2019 and left join it with the transaction table, but it still returns the same result with no showing of the months without transactions.
Time table I created:
declare @StartDate date = '2019-01-01';
declare @EndDate date = '2020-01-01';
With cte as (
Select @StartDate AS myDate
Union All
Select Dateadd(Month,1,myDate)
From cte
Where Dateadd(Month,1,myDate) < @EndDate
)
,TimeTable as(
SELECT
year(myDate)
,Datename(Month,myDate)
,Format(myDate,'MMMM-yy') as 'Month-Year'
FROM cte
)
Select
tb.'Month-Year'
t.Format(Transaction_Date, 'MMM-yyyy') as Year/Month
,t.Customer
,t.Count(Customer) as SalesCount
From TimeTable tb
Left Join Transaction t on t.'Month/Year' = tb.'Month-Year'
Group by tb.'Month-Year', Format(Transaction_Date, 'MMM-yyyy'), Customer
Order by Customer, Format(Transaction_Date, 'MMM-yyyy')
Your help will be much appreciated!
You need to generate all the rows for the months. One method uses a recursive CTE. The rest is then a left join and aggregation.
Let me assume you are using SQL Server:
with months as (
select convert(date, '2019-01-01') as mon
union all
select dateadd(month, 1, mon)
from months
where mon < '2019-12-01'
)
select Format(m.mon, 'MMM-yyyy') as year_month,
c.Customer,
count(t.customer) as SalesCount
from months m cross join
(select distinct customer from t) c left join
t
on t.transaction_date >= m.mon and
t.transaction_date < dateadd(month, 1, mon) and
t.customer = c.customer
group by m.mon, c.customer
order by c.customer, m.mon ;
Note the other changes to the query:
Year/Month is not a valid column alias.
This orders the rows chronologically. That is usually (always?) preferred over alphabetic sorting of months.
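The pattern above (generate months, cross join customers, left join transactions) can be tried out with a small sketch in SQLite via Python's `sqlite3` module. The sample rows are illustrative assumptions, and `date(mon, '+1 month')` and `strftime` stand in for SQL Server's `dateadd` and `Format`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (transaction_date TEXT, customer TEXT);
    INSERT INTO t VALUES
        ('2019-01-03', 'A'), ('2019-01-17', 'A'), ('2019-06-03', 'A'),
        ('2019-07-03', 'A'), ('2019-06-03', 'B'), ('2019-07-03', 'B');
""")

query = """
WITH RECURSIVE months(mon) AS (
    SELECT '2019-01-01'
    UNION ALL
    SELECT date(mon, '+1 month') FROM months WHERE mon < '2019-12-01'
)
SELECT strftime('%Y-%m', m.mon) AS year_month,
       c.customer,
       COUNT(t.customer) AS sales_count   -- counts only matched rows, so gaps give 0
FROM months m
CROSS JOIN (SELECT DISTINCT customer FROM t) c
LEFT JOIN t
       ON t.customer = c.customer
      AND t.transaction_date >= m.mon
      AND t.transaction_date < date(m.mon, '+1 month')
GROUP BY m.mon, c.customer
ORDER BY c.customer, m.mon
"""
counts = conn.execute(query).fetchall()
for row in counts:
    print(row)
```

Counting `t.customer` rather than `*` matters here: the left join still emits a row for an empty month, and `COUNT(*)` would report 1 for it.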
You can combine a month-year table with the distinct customer list to achieve this.
declare @table table(trandate date,customer char(1))
insert into @table values
('2019-01-03','A'),
('2019-01-17','A'),
('2019-06-03','A'),
('2019-07-03','A'),
('2019-06-03','B'),
('2019-07-03','B');
;with monthYear AS
(
select * from
(
values
('Jan-19')
,('Feb-19')
,('Mar-19')
,('Apr-19')
,('May-19')
,('Jun-19')
,('Jul-19')
,('Aug-19')
,('Sep-19')
,('Oct-19')
,('Nov-19')
,('Dec-19')
) as t(mon)
)
SELECT my.mon,c.customer, isnull(t.salescount,0) as salescount FROM monthYear as my
CROSS JOIN (SELECT distinct customer from @table) as c
OUTER APPLY
(
select format(trandate,'MMM-yy') as MonthYear, count(customer) as salescount
from @table
where customer = c.customer
group by format(trandate,'MMM-yy')
having format(trandate,'MMM-yy') = my.mon
) as t
mon customer salescount
Jan-19 A 2
Feb-19 A 0
Mar-19 A 0
Apr-19 A 0
May-19 A 0
Jun-19 A 1
Jul-19 A 1
Aug-19 A 0
Sep-19 A 0
Oct-19 A 0
Nov-19 A 0
Dec-19 A 0
Jan-19 B 0
Feb-19 B 0
Mar-19 B 0
Apr-19 B 0
May-19 B 0
Jun-19 B 1
Jul-19 B 1
Aug-19 B 0
Sep-19 B 0
Oct-19 B 0
Nov-19 B 0
Dec-19 B 0

add missing month in sales

I have a sales table with below values.
TransactionDate,CustomerID,Quantity
2020-01-01,1234,5
2020-07-01,1234,9
2020-03-01,3241,8
2020-07-01,3241,4
As you can see, the first purchase for CustomerID = 1234 was in Jan 2020, and for CustomerID = 3241 in Mar 2020.
I want an output in which every missing month is filled in with a purchase value of 0.
That is, if there are no sales in a month between a customer's first and last purchase (for example, between Jan and July), the output should be as below.
TransactionDate,CustomerID,Quantity
2020-01-01,1234,5
2020-02-01,1234,0
2020-03-01,1234,0
2020-04-01,1234,0
2020-05-01,1234,0
2020-06-01,1234,0
2020-07-01,1234,9
2020-03-01,3241,8
2020-04-01,3241,0
2020-05-01,3241,0
2020-06-01,3241,0
2020-07-01,3241,4
You can use a recursive query to create the missing dates per customer.
with recursive dates (customerid, transactiondate, max_transactiondate) as
(
select customerid, min(transactiondate), max(transactiondate)
from sales
group by customerid
union all
select customerid, dateadd(month, 1, transactiondate), max_transactiondate
from dates
where transactiondate < max_transactiondate
)
select
d.customerid,
d.transactiondate,
coalesce(s.quantity, 0) as quantity
from dates d
left join sales s on s.customerid = d.customerid and s.transactiondate = d.transactiondate
order by d.customerid, d.transactiondate;
This is a convenient place to use a recursive CTE. Assuming all your dates are on the first of the month:
with cr as (
select customerid, min(transactiondate) as mindate, max(transactiondate) as maxdate
from t
group by customerid
union all
select customerid, dateadd(month, 1, mindate), maxdate
from cr
where mindate < maxdate
)
select cr.customerid, cr.mindate as transactiondate, coalesce(t.quantity, 0) as quantity
from cr left join
t
on cr.customerid = t.customerid and
cr.mindate = t.transactiondate;
Here is a db<>fiddle.
Note that if you have more than 100 months to fill in, then you will need option (maxrecursion 0).
Also, this can easily be adapted if the dates are not all on the first of the month. But you would need to explain what the result set should look like in that case.
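For readers who want to experiment with the per-customer recursive CTE, here is a small sketch using SQLite through Python's `sqlite3` module. It assumes, as above, that every date falls on the first of the month; `date(mon, '+1 month')` is SQLite's counterpart to `dateadd(month, 1, ...)`, and the table name `sales` is taken from the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (transactiondate TEXT, customerid INTEGER, quantity INTEGER);
    INSERT INTO sales VALUES
        ('2020-01-01', 1234, 5), ('2020-07-01', 1234, 9),
        ('2020-03-01', 3241, 8), ('2020-07-01', 3241, 4);
""")

query = """
WITH RECURSIVE cr(customerid, mon, maxdate) AS (
    -- anchor: each customer's first and last month
    SELECT customerid, MIN(transactiondate), MAX(transactiondate)
    FROM sales
    GROUP BY customerid
    UNION ALL
    -- step: advance one month until the customer's last month is reached
    SELECT customerid, date(mon, '+1 month'), maxdate
    FROM cr
    WHERE mon < maxdate
)
SELECT cr.mon, cr.customerid, COALESCE(s.quantity, 0)
FROM cr
LEFT JOIN sales s
       ON s.customerid = cr.customerid AND s.transactiondate = cr.mon
ORDER BY cr.customerid, cr.mon
"""
filled = conn.execute(query).fetchall()
for row in filled:
    print(row)
```

Because the range is computed per customer, customer 3241 starts in March rather than January, which matches the output requested in the question.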
[EDIT] Based on what others posted, I updated the code.
;with
min_date_cte(MinTransactionDate, MaxTransactionDate) as (
select min(TransactionDate), max(TransactionDate) from tsales),
unq_yrs_cte(year_int) as (
select distinct year(TransactionDate) from tsales),
unq_cust_cte(CustomerID) as (
select distinct CustomerID from tsales)
select datefromparts(uyc.year_int, v.month_int, 1) TransactionDate,
ucc.CustomerID,
isnull(t.Quantity, 0) Quantity
from min_date_cte mdc
cross join unq_yrs_cte uyc
cross join unq_cust_cte ucc
cross join (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) v(month_int)
left join tsales t on datefromparts(uyc.year_int, v.month_int, 1)=t.TransactionDate
and ucc.CustomerID=t.CustomerId
where
datefromparts(uyc.year_int, v.month_int, 1)>=mdc.MinTransactionDate
and datefromparts(uyc.year_int, v.month_int, 1)<=mdc.MaxTransactionDate;
Results
TransactionDate CustomerID Quantity
2020-01-01 1234 5
2020-01-01 3241 0
2020-02-01 1234 0
2020-02-01 3241 0
2020-03-01 1234 0
2020-03-01 3241 8
2020-04-01 1234 0
2020-04-01 3241 0
2020-05-01 1234 0
2020-05-01 3241 0
2020-06-01 1234 0
2020-06-01 3241 0
2020-07-01 1234 9
2020-07-01 3241 4
You can make use of recursive query:
WITH cte1 as
(
select customerid, min([TransactionDate]) as Monthly_date, max([TransactionDate]) as end_date from calender_table
group by customerid
union all
select customerid, dateadd(month, 1, Monthly_date), end_date from cte1
where Monthly_date < end_date
)
select a.Monthly_date, a.customerid,coalesce(b.quantity, 0) from cte1 a left outer join calender_table b
on (a.Monthly_date = b.[TransactionDate] and a.customerid = b.customerid)
order by a.customerid, a.Monthly_date;

Get count of orders created monthly

I'm trying to list the total number of orders for the last 12 rolling months (not including the current month).
This is my query:
Select
Year(CreatedOn)*100+Month(CreatedOn) YearMonth,
Count(*) OrderCount
From Orders
Where DateDiff(MM,CreatedOn,GetUTCDate()) Between 1 And 12
Group By Year(CreatedOn), Month(CreatedOn)
Order By YearMonth
As expected, I am getting the results correctly. However, when there are no orders in a specific month, the month is excluded from the result completely. I would like to show that month with 0. See sample result:
201809 70
201810 8
201811 53
201812 67
201901 15
201902 13
201903 10
201905 12
201908 9
See the missing months 201904, 201906 and 201907. There should be a total of 12 rows.
The query should be executable within a sub-query using For XML Path so that I can get a comma separated list of orders in the last 12 months.
How can I accomplish this?
You need to generate the rows that you want somehow. One method uses a recursive CTE:
with dates as (
-- start at the previous month, because the current month is excluded
select year(dateadd(month, -1, getdate())) * 100 + month(dateadd(month, -1, getdate())) as yearmonth,
1 as n, dateadd(month, -1, datefromparts(year(getdate()), month(getdate()), 1)) as yyyymm
union all
select year(dateadd(month, -1, yyyymm)) * 100 + month(dateadd(month, -1, yyyymm)),
n + 1,
dateadd(month, -1, yyyymm)
from dates
where n < 12
),
q as (
<your query here>
)
select d.yearmonth, coalesce(q.OrderCount, 0) as orders
from dates d left join
q
on d.yearmonth = q.yearmonth;
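The month-generation step can be checked outside the database with a short Python sketch. The function names are hypothetical, and the reference date of 15 September 2019 is an assumption chosen so that the window lines up with the sample result in the question.

```python
from datetime import date


def last_12_yearmonths(today):
    """Return YYYYMM integers for the 12 months before today's month, oldest first."""
    year, month = today.year, today.month
    result = []
    for _ in range(12):
        month -= 1
        if month == 0:          # roll back from January to December of the prior year
            year, month = year - 1, 12
        result.append(year * 100 + month)
    return result[::-1]


def fill_missing(counts, today):
    """Pair every month in the window with its count, defaulting to 0."""
    return [(ym, counts.get(ym, 0)) for ym in last_12_yearmonths(today)]


# Counts copied from the sample result in the question.
sample = {201809: 70, 201810: 8, 201811: 53, 201812: 67, 201901: 15,
          201902: 13, 201903: 10, 201905: 12, 201908: 9}
filled = fill_missing(sample, date(2019, 9, 15))
for ym, n in filled:
    print(ym, n)
```

This is the same left-join idea expressed as a dictionary lookup: the generated month list drives the output, and the query result only supplies values where they exist.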
Try this:
WITH R(N) AS
(
SELECT 1
UNION ALL
SELECT N+1
FROM R
WHERE N < 12
)
SELECT REPLACE(LEFT(CAST (DATEADD(MONTH,DATEDIFF(MONTH,0,(DATEADD(MONTH,-N,GetUTCDate()))),0) AS DATE),7),'-','') AS [YearMonth],ISNULL(o.OrderCount,0) as OrderCount
FROM R A
LEFT JOIN
(
Select
Year(CreatedOn)*100+Month(CreatedOn) YearMonth,
Count(*) OrderCount
From Orders
Where DateDiff(MM,CreatedOn,GetUTCDate()) Between 1 And 12
Group By Year(CreatedOn), Month(CreatedOn)
) O ON O.YearMonth=REPLACE(LEFT(CAST (DATEADD(MONTH,DATEDIFF(MONTH,0,(DATEADD(MONTH,-N,GetUTCDate()))),0) AS DATE),7),'-','')
Order By REPLACE(LEFT(CAST (DATEADD(MONTH,DATEDIFF(MONTH,0,(DATEADD(MONTH,-N,GetUTCDate()))),0) AS DATE),7),'-','');

SUM from Specific Date until the end of the month SQL

I have the following table:
ID GROUPID oDate oValue
1 A 2014-06-01 100
2 A 2014-06-02 200
3 A 2014-06-03 300
4 A 2014-06-04 400
5 A 2014-06-05 500
... and so on until the end of the month
30 A 2014-06-30 600
I have 3 kinds of GROUPID, and each group will create one record per day.
I want to calculate the total of oValue from the 2nd day of each month until the end of the month. So the total of June would be from 2/Jun/2014 until 30/Jun/2014. If July, then the total would be from 2/Jul/2014 until 31/Jul/2014.
The output will be like this (sample):
GROUPID MONTH YEAR tot_oValue
A 6 2014 2000
A 7 2014 3000
B 6 2014 1500
B 7 2014 5000
Does anyone know how to solve this with sql syntax?
Thank you.
You can use a correlated subquery to get this:
SELECT T.ID,
T.GroupID,
t.oDate,
T.oValue,
ct.TotalToEndOfMonth
FROM T
OUTER APPLY
( SELECT TotalToEndOfMonth = SUM(oValue)
FROM T AS T2
WHERE T2.GroupID = T.GroupID
AND T2.oDate >= T.oDate
AND T2.oDate < DATEADD(MONTH, DATEDIFF(MONTH, 0, T.oDate) + 1, 0)
) AS ct;
For your example data this gives:
ID GROUPID ODATE OVALUE TOTALTOENDOFMONTH
1 A 2014-06-01 100 2100
2 A 2014-06-02 200 2000
3 A 2014-06-03 300 1800
4 A 2014-06-04 400 1500
5 A 2014-06-05 500 1100
30 A 2014-06-30 600 600
Example on SQL Fiddle
For future reference if you ever upgrade, in SQL Server 2012 (and later) this becomes even easier with windowed aggregate functions that allow ordering:
SELECT T.*,
TotalToEndOfMonth = SUM(oValue)
OVER (PARTITION BY GroupID,
DATEPART(YEAR, oDate),
DATEPART(MONTH, oDate)
ORDER BY oDate DESC)
FROM T
ORDER BY oDate;
Example on SQL Fiddle
EDIT
If you only want this for the 2nd of each month, but still need all the fields then you can just filter the results of the first query I posted:
SELECT T.ID,
T.GroupID,
t.oDate,
T.oValue,
ct.TotalToEndOfMonth
FROM T
OUTER APPLY
( SELECT TotalToEndOfMonth = SUM(oValue)
FROM T AS T2
WHERE T2.GroupID = T.GroupID
AND T2.oDate >= T.oDate
AND T2.oDate < DATEADD(MONTH, DATEDIFF(MONTH, 0, T.oDate) + 1, 0)
) AS ct
WHERE DATEPART(DAY, T.oDate) = 2;
Example on SQL Fiddle
If you are only concerned with the total then you can use:
SELECT T.GroupID,
[Month] = DATEPART(MONTH, oDate),
[Year] = DATEPART(YEAR, oDate),
tot_oValue = SUM(T.oValue)
FROM T
WHERE DATEPART(DAY, T.oDate) >= 2
GROUP BY T.GroupID, DATEPART(MONTH, oDate), DATEPART(YEAR, oDate);
Example on SQL Fiddle
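The totals-only query can be reproduced in a quick sketch with SQLite via Python's `sqlite3` module. Only the six sample rows shown in the question are loaded (the elided days are left out), and `strftime` replaces `DATEPART`, which SQLite does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (ID INTEGER, GroupID TEXT, oDate TEXT, oValue INTEGER);
    INSERT INTO T VALUES
        (1, 'A', '2014-06-01', 100), (2, 'A', '2014-06-02', 200),
        (3, 'A', '2014-06-03', 300), (4, 'A', '2014-06-04', 400),
        (5, 'A', '2014-06-05', 500), (30, 'A', '2014-06-30', 600);
""")

query = """
SELECT GroupID,
       CAST(strftime('%m', oDate) AS INTEGER) AS month_no,
       CAST(strftime('%Y', oDate) AS INTEGER) AS year_no,
       SUM(oValue) AS tot_oValue
FROM T
WHERE CAST(strftime('%d', oDate) AS INTEGER) >= 2   -- skip the 1st of each month
GROUP BY GroupID, strftime('%Y', oDate), strftime('%m', oDate)
ORDER BY GroupID, strftime('%Y', oDate), strftime('%m', oDate)
"""
totals = conn.execute(query).fetchall()
print(totals)
```

Filtering on the day number is enough here because the upper bound (end of month) is already implied by grouping on year and month.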
Not sure whether you have data for different years
Select YEAR(oDate),MONTH(oDate),SUM(oValue)
From #Temp
Where DAY(oDate)>1
Group By YEAR(oDate),MONTH(oDate)
If you want grouped per GROUPID, year and month this should do it:
SELECT
GROUPID,
[MONTH] = MONTH(oDate),
[YEAR] = YEAR(oDate),
tot_oValue = SUM(ovalue)
FROM your_table
WHERE DAY(odate) > 1
GROUP BY GROUPID, YEAR(oDate), MONTH(oDate)
ORDER BY GROUPID, YEAR(oDate), MONTH(oDate)
This query produces the required output:
SELECT GROUPID, MONTH(oDate) AS "Month", YEAR(oDate) AS "Year", SUM(oValue) AS tot_oValue
FROM table_name
WHERE DAY(oDate) > 1
GROUP BY GROUPID, YEAR(oDate), MONTH(oDate)
ORDER BY GROUPID, YEAR(oDate), MONTH(oDate)