SQL query that does two GROUP BYs? - sql

I'm having trouble getting the SQL for a report I need to generate. I've got the (equivalent of the) following setup:
A table articles (fields such as name, manufacturer_id, etc.).
A table sales.
FK to articles called article_id
An integer called amount
A date field
A field called type. We can assume it is a string and it can have 3 known values - 'morning', 'evening' and 'night'
I want to generate an aggregated sales report, given a start date and end date:
+--------------+---------------+--------------+-------------+
| article_name | morning_sales | evening_sales| night_sales |
+--------------+---------------+--------------+-------------+
| article 1 | 0 | 12 | 2 |
| article 2 | 80 | 3 | 0 |
... ... ... ... ...
| article n | 37 | 12 | 1 |
+--------------+---------------+--------------+-------------+
I'd like to do it as efficiently as possible. So far I've been able to generate a query that will give me one type of sale (morning, evening or night) but I'm not able to do it for multiple types simultaneously. Is it even possible?
This is what I have so far; it'll give me the article name and morning sales of all the articles in a given period - in other words, the first two columns of the report:
SELECT articles.name AS article_name,
SUM(sales.amount) AS morning_sales
FROM "sales"
INNER JOIN articles ON articles.id = sales.article_id
WHERE ( sales.date >= '2011-05-09'
AND sales.date <= '2011-05-16'
AND sales.type = 'morning'
)
GROUP BY articles.id, articles.name
I guess I could do the same for evening and night, but the articles will be different; some articles might have sales in the morning but not in the evening, for example.
If I have to do 1 request per sale type, how do I "mix and match" the different article lists I'll get?
Is there a better way to do this (maybe with subqueries of some sort)?
Similarly, I'm able to build a query that gives me all the morning, evening and night sales, grouping by type. I guess my problem is that I need to do two GROUP BYs in order to get this report. I don't know how to do that, if it's possible at all.
I'm using PostgreSQL as my DB, but I'd like to stay as standard as possible. If it helps, the app using this is a Rails app.

Fortunately, you don't need to do multiple queries with your database format. This should work for you:
SELECT
articles.name AS article_name,
SUM(IF(sales.type = 'morning', sales.amount, 0.0)) AS morning_sales,
SUM(IF(sales.type = 'evening', sales.amount, 0.0)) AS evening_sales,
SUM(IF(sales.type = 'night', sales.amount, 0.0)) AS night_sales
FROM sales
JOIN articles ON sales.article_id = articles.id
WHERE
sales.date >= "2011-01-01 00:00:00"
AND sales.date < "2011-02-01 00:00:00"
GROUP BY sales.article_id
And if there are other types, you would have to add more columns there, OR simply sum up the other types by adding this to the SELECT clause:
SUM(
IF(sales.type IS NULL OR sales.type NOT IN ('morning', 'evening', 'night'),
sales.amount, 0.0
)
) AS other_sales
The above is compatible with MySQL. To use it in Postgres, I think you'd have to change the IF expressions to CASE expressions, which should look like this (untested):
SELECT
articles.name AS article_name,
SUM(CASE WHEN sales.type = 'morning' THEN sales.amount ELSE 0.0 END) AS morning_sales,
SUM(CASE WHEN sales.type = 'evening' THEN sales.amount ELSE 0.0 END) AS evening_sales,
SUM(CASE WHEN sales.type = 'night' THEN sales.amount ELSE 0.0 END) AS night_sales
FROM sales
JOIN articles ON sales.article_id = articles.id
WHERE
sales.date >= '2011-01-01 00:00:00'
AND sales.date < '2011-02-01 00:00:00'
GROUP BY articles.id, articles.name
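To make the CASE-based pivot concrete, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. The table layout follows the question; the sample rows and the date range are made up for illustration, and SQLite stands in for PostgreSQL only because it needs no server.

```python
import sqlite3

# Hypothetical sample data shaped like the tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (article_id INTEGER, amount INTEGER, date TEXT, type TEXT);
    INSERT INTO articles VALUES (1, 'article 1'), (2, 'article 2');
    INSERT INTO sales VALUES
        (1, 10, '2011-05-10', 'evening'),
        (1,  2, '2011-05-11', 'evening'),
        (1,  2, '2011-05-12', 'night'),
        (2, 80, '2011-05-10', 'morning'),
        (2,  3, '2011-05-15', 'evening'),
        (2, 99, '2011-06-01', 'morning');  -- outside the requested range
""")

# One pass over sales: each SUM only counts the rows of its own type.
query = """
    SELECT a.name AS article_name,
           SUM(CASE WHEN s.type = 'morning' THEN s.amount ELSE 0 END) AS morning_sales,
           SUM(CASE WHEN s.type = 'evening' THEN s.amount ELSE 0 END) AS evening_sales,
           SUM(CASE WHEN s.type = 'night'   THEN s.amount ELSE 0 END) AS night_sales
    FROM sales s
    JOIN articles a ON a.id = s.article_id
    WHERE s.date >= ? AND s.date < ?
    GROUP BY a.id, a.name
    ORDER BY a.name
"""
report = conn.execute(query, ("2011-05-09", "2011-05-17")).fetchall()
for row in report:
    print(row)
```

An article with no sales of a given type still gets a row, with 0 in that column, because the grouping is per article and the type is only tested inside each SUM.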

Two options.
Option 1. A single join with computed columns for aggregation
select article_name = a.name ,
morning_sales = sum( case when s.type = 'morning' then s.amount end ) ,
evening_sales = sum( case when s.type = 'evening' then s.amount end ) ,
night_sales = sum( case when s.type = 'night' then s.amount end ) ,
other_sales = sum( case when s.type in ( 'morning','evening','night') then null else s.amount end ) ,
total_sales = sum( s.amount )
from articles a
join sales s on s.article_id = a.id
where s.date between @dtFrom and @dtThru
group by a.name
Option 2. multiple left joins
select article_name = a.name ,
morning_sales = sum( morning.amount ) ,
evening_sales = sum( evening.amount ) ,
night_sales = sum( night.amount ) ,
other_sales = sum( other.amount ) ,
total_sales = sum( total.amount )
from articles a
left join sales morning on morning.article_id = a.id
and morning.type = 'morning'
and morning.date between @dtFrom and @dtThru
left join sales evening on evening.article_id = a.id
and evening.type = 'evening'
and evening.date between @dtFrom and @dtThru
left join sales night on night.article_id = a.id
and night.type = 'night'
and night.date between @dtFrom and @dtThru
left join sales other on other.article_id = a.id
and ( other.type is null
OR other.type not in ('morning','evening','night')
)
and other.date between @dtFrom and @dtThru
left join sales total on total.article_id = a.id
and total.date between @dtFrom and @dtThru
group by a.name
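One thing worth checking before relying on several left joins to the same table: when an article has more than one matching row in two or more of the joins, the rows multiply and the sums are inflated. The sketch below shows the effect on made-up data in SQLite and compares it with joining derived tables that are aggregated first. The table layout follows the question; the rows and the aliases m and e are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (article_id INTEGER, amount INTEGER, type TEXT);
    INSERT INTO articles VALUES (1, 'article 1');
    INSERT INTO sales VALUES
        (1, 10, 'morning'), (1, 20, 'morning'),
        (1,  5, 'evening'), (1,  7, 'evening');
""")

# Joining the raw rows twice pairs every morning row with every evening row.
naive = conn.execute("""
    SELECT a.name, SUM(m.amount), SUM(e.amount)
    FROM articles a
    LEFT JOIN sales m ON m.article_id = a.id AND m.type = 'morning'
    LEFT JOIN sales e ON e.article_id = a.id AND e.type = 'evening'
    GROUP BY a.name
""").fetchone()

# Aggregating each subset first leaves at most one row per article per join.
pre_aggregated = conn.execute("""
    SELECT a.name, m.total, e.total
    FROM articles a
    LEFT JOIN (SELECT article_id, SUM(amount) AS total FROM sales
               WHERE type = 'morning' GROUP BY article_id) m
           ON m.article_id = a.id
    LEFT JOIN (SELECT article_id, SUM(amount) AS total FROM sales
               WHERE type = 'evening' GROUP BY article_id) e
           ON e.article_id = a.id
""").fetchone()

print("raw joins:     ", naive)
print("pre-aggregated:", pre_aggregated)
```

With two morning and two evening rows, the raw joins produce four combined rows, so each amount is counted twice; the pre-aggregated version returns the true totals of 30 and 12.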

Related

Can I left join twice to do multiple calculations?

I am trying to calculate if a member shops in January, what proportion shop again in February and what proportion shop again within 3 months. Ultimately to create a table similar to the image attached.
I have tried the below code. The first left join works, but when I add the second one to calculate within_3months the error: "FROM keyword not found where expected" is shown (for the separate line). Can I left join twice or must I do separate scripts for columns?
, count(distinct B.members)/count(distinct A.members) *100 as 1month_retention_rate
select
year_month_january21
, count(distinct A.members) as num_of_mems_shopped_january21
, count(distinct B.members)as retained_february21
, count(distinct B.members)/count(distinct A.members) *100 as 1month_retention_rate
, count(distinct C.members)/count(distinct A.members) *100 as within_3months
from
(select
members
, year_month as year_month_january21
from table.members t
join table.date tm on t.dt_key = tm.date_key
and year_month = 202101
group by
members
, year_month) A
left join
(select
members
, year_month as year_month_february21
from table.members t
join table.date tm on t.dt_key = tm.date_key
and year_month = 202102
group by
members
, year_month) B on A.members = B.members
left join
(select
members
, year_month as year_month_3months
from table.members t
join table.date tm on t.dt_key = tm.date_key
and year_month between 202102 and 202104
group by
members
, year_month) C on A.members = C.members
group by
year_month_january21;
I have tried creating a separate time table and left joining to it, but that does not work. Doing the calculations separately works, but I must do this for multiple time frames, so it will take a long time.
The error isn't coming from the added left join, it's from the as 1month_retention_rate part, because it's an illegal name.
You can see that more simply with:
select dummy as 1month_retention_rate
from dual;
ORA-00923: FROM keyword not found where expected
You could change the column alias so it follows the naming rules (specifically here, does not start with a digit), or if that specific name is actually required then you could make it a quoted identifier - generally not a good option, but sometimes OK in the final output of a query.
So in your code you would just change your new line
, count(distinct B.members)/count(distinct A.members) *100 as 1month_retention_rate
to something like
, count(distinct B.members)/count(distinct A.members) *100 as one_month_retention_rate
or with a quoted identifier
, count(distinct B.members)/count(distinct A.members) *100 as "1month_retention_rate"
fiddle - which still errors but now with ORA-00942 as I don't have your tables, and that is after changing your obfuscated schema/table names to something legal too.
There may be more efficient ways to perform the calculation, but that's a separate issue...
As I understand it, you want to get:
the count of all members who visited in January;
the count of all members who visited in January and visited again in February;
the count of all members who visited in January and visited again between February and April.
If my understanding is correct, you could simplify your query by using IF instead of LEFT JOIN.
Take a look at the following query, assuming that the members table has an Id field:
SELECT
mem_jan AS num_of_mems_shopped_january21,
mem_feb AS retained_february21,
mem_feb / mem_jan * 100 AS one_month_retention_rate,
mem_3m / mem_jan * 100 AS within_3months
FROM (
SELECT
SUM(IF(mm_jan > 0, 1, 0)) AS mem_jan,
SUM(IF(mm_jan > 0 AND mm_feb > 0, 1, 0)) AS mem_feb,
SUM(IF(mm_jan > 0 AND mm_3m > 0, 1, 0)) AS mem_3m
FROM
(
SELECT
t.Id,
SUM(IF(year_month = 202101, 1, 0)) AS mm_jan, /* visits by a member in Jan */
SUM(IF(year_month = 202102, 1, 0)) AS mm_feb, /* visits by a member in Feb */
SUM(IF(year_month BETWEEN 202102 AND 202104, 1, 0)) AS mm_3m /* visits by a member in Feb to Apr */
FROM
table.members t
JOIN table.date tm ON t.dt_key = tm.date_key
WHERE
year_month BETWEEN 202101 AND 202104
GROUP BY
t.Id
) AS t1
) AS t2
This is not a final, runnable query, but it illustrates the idea. Depending on your engine, you may need CASE instead of IF.
Don't use multiple joins, count the shops per member per month and then use conditional aggregation.
In Oracle, that would be:
SELECT 202101 AS year_month,
COUNT(CASE WHEN cnt_202101 > 0 THEN 1 END)
AS members_shopped_202101,
COUNT(CASE WHEN cnt_202101 > 0 AND cnt_202102 > 0 THEN 1 END)
AS members_retained_202102,
COUNT(CASE WHEN cnt_202101 > 0 AND cnt_202102 > 0 THEN 1 END)
/ COUNT(CASE WHEN cnt_202101 > 0 THEN 1 END) * 100
AS one_month_retention_rate,
COUNT(CASE WHEN cnt_202101 > 0 AND (cnt_202102 > 0 OR cnt_202103 > 0 OR cnt_202104 > 0) THEN 1 END)
/ COUNT(CASE WHEN cnt_202101 > 0 THEN 1 END) * 100
AS within_3months
FROM (
SELECT members,
year_month
FROM members m
INNER JOIN date d
ON m.dt_key = d.date_key
)
PIVOT (
COUNT(*)
FOR year_month IN (
202101 AS cnt_202101,
202102 AS cnt_202102,
202103 AS cnt_202103,
202104 AS cnt_202104
)
);
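The "flag per member, then aggregate the flags" idea from the answers above can be tried without Oracle. This is a minimal sketch in SQLite driven from Python; the visits table, its member column, and the sample rows are assumptions made for the example, and only year_month comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits (member TEXT, year_month INTEGER);
    INSERT INTO visits VALUES
        ('a', 202101), ('a', 202102),
        ('b', 202101), ('b', 202104),
        ('c', 202101),
        ('d', 202101), ('d', 202101), ('d', 202103),
        ('e', 202102);
""")

# Inner query: one row per member with 0/1 flags for each window.
# Outer query: count the flags and turn them into percentages.
query = """
    SELECT SUM(in_jan) AS shopped_jan,
           SUM(CASE WHEN in_jan = 1 AND in_feb = 1 THEN 1 ELSE 0 END) AS retained_feb,
           100.0 * SUM(CASE WHEN in_jan = 1 AND in_feb = 1 THEN 1 ELSE 0 END)
                 / SUM(in_jan) AS one_month_retention_rate,
           100.0 * SUM(CASE WHEN in_jan = 1 AND in_3m = 1 THEN 1 ELSE 0 END)
                 / SUM(in_jan) AS within_3months
    FROM (
        SELECT member,
               MAX(CASE WHEN year_month = 202101 THEN 1 ELSE 0 END) AS in_jan,
               MAX(CASE WHEN year_month = 202102 THEN 1 ELSE 0 END) AS in_feb,
               MAX(CASE WHEN year_month BETWEEN 202102 AND 202104
                        THEN 1 ELSE 0 END) AS in_3m
        FROM visits
        WHERE year_month BETWEEN 202101 AND 202104
        GROUP BY member
    )
"""
retention = conn.execute(query).fetchone()
print(retention)
```

Multiplying by 100.0 before dividing keeps the division from being done on integers, which matters in engines that truncate integer division.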

SQL Case When Slowing Down Query

What I'm looking to do is quantify the total value of purchases and the number of months in which a purchase was made within three different timeframes by account. I only want to look at accounts who made a purchase between 1-1-2020 and 4-1-2021.
I'm wondering if there is a more streamlined way to pull in the fields I'm creating using CASE WHEN below (maybe through a series of queries to create the calculations and the left joining?). This query is taking extremely long to pull back, so I'd like to enhance this code where I can. All of my code and desired output is listed below. Thank you!
Creating a temporary table to pull account numbers:
DROP TABLE IF EXISTS #accounts
SELECT DISTINCT s.account_no, c.code, c.code_desc
INTO #accounts
FROM sales AS s
LEFT JOIN customer AS c ON s.account_no = c.account_no
WHERE s.tran_date BETWEEN '2020-01-01' AND '2021-04-01'
GROUP BY s.account_no, c.code, c.code_desc;
Confirming row counts:
SELECT COUNT(*)
FROM #accounts;
Creating Sales and Sales period count columns for three timeframes:
SELECT
s.account_no, c.code, c.code_desc,
SUM(CASE
WHEN s.tran_date BETWEEN '2020-01-01' AND '2021-04-01'
THEN VALUE_USD
END) AS Total_Spend_Pre,
SUM(CASE
WHEN s.tran_date BETWEEN '2021-04-01' AND '2022-03-31'
THEN VALUE_USD
END) Total_Spend_During,
SUM(CASE
WHEN s.tran_date > '2022-04-01'
THEN VALUE_USD
END) Total_Spend_Post,
COUNT(DISTINCT CASE WHEN s.tran_date BETWEEN '2020-01-01' AND '2021-04-01' THEN CONCAT(s.bk_month, s.bk_year) END) Pre_Periods,
COUNT(DISTINCT CASE WHEN s.tran_date BETWEEN '2021-04-01' AND '2022-03-31' THEN CONCAT(s.bk_month, s.bk_year) END) During_Periods,
COUNT(DISTINCT CASE WHEN s.tran_date > '2022-04-01' THEN CONCAT(s.bk_month, s.bk_year) END) Post_Periods
FROM
sales AS s
LEFT JOIN
customer AS c ON s.account_no = c.account_no
WHERE
c.account_no IN (SELECT DISTINCT account_no
FROM #accounts)
GROUP BY
s.account_no, c.code, c.code_desc;
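Desired output:
account_no | code | code_desc | Total_Spend_Pre | Total_Spend_During | Total_Spend_Post | Pre_Periods | During_Periods | Post_Periods
==========================================================================================================================================
25         | 1234 | OTHER     | 1000            | 2005               | 500              | 2           | 14             | 5
11         | 5678 | PC        | 500             | 100                | 2220             | 5           | 11             | 2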
You can join your date ranges to the dataset and tag each row, as shown below. This returns up to three rows per group, one per date range. If you need them in a single row, PIVOT the result.
;With DateRanges AS (
SELECT CAST('2020-01-01' AS DATE) StartDate, CAST('2021-04-01' AS DATE) EndDate, 'Pre' Tag UNION
SELECT '2021-04-01', '2022-03-31', 'During' UNION
SELECT '2022-04-01', Null, 'Post'
)
SELECT s.account_no, c.code, c.code_desc, d.Tag,
SUM(VALUE_USD) AS Total_Spend,
COUNT(DISTINCT CONCAT(s.bk_month, s.bk_year)) RecordCount
FROM sales as s
LEFT JOIN customer as c
ON s.account_no = c.account_no
INNER JOIN DateRanges D ON s.tran_date BETWEEN D.StartDate AND ISNULL(D.EndDate,s.tran_date)
WHERE c.account_no IN (SELECT DISTINCT account_no FROM #accounts)
GROUP BY s.account_no, c.code, c.code_desc, d.Tag;
with [cte_accountActivityPeriods] as (
select [PeriodOrdinal] = 1, [PeriodName] = 'Total Spend Pre', [PeriodStart] = convert(date,'2020-01-01',23) , [PeriodFinish] = convert(date,'2021-03-31',23) union
select [PeriodOrdinal] = 2, [PeriodName] = 'Total Spend During', [PeriodStart] = convert(date,'2021-04-01',23) , [PeriodFinish] = convert(date,'2022-03-31',23) union
select [PeriodOrdinal] = 3, [PeriodName] = 'Total Spend Post', [PeriodStart] = convert(date,'2022-04-01',23) , [PeriodFinish] = convert(date,'9999-12-31',23)
)
, [cte_allsalesForActivityPeriod] as (
SELECT s.account_no, bk_month, bk_year, [PeriodOrdinal], s.tran_date, s.value_usd
FROM sales as s
inner join [cte_accountActivityPeriods]
on s.[tran_date] between [cte_accountActivityPeriods].[PeriodStart] and [cte_accountActivityPeriods].[PeriodFinish]
)
, [cte_uniqueAccounts] as ( /*Unique and qualifying Accounts*/
select distinct accs.[account_no] from [cte_allsalesForActivityPeriod]
inner join #accounts accs on accs.[account_no] = [cte_allsalesForActivityPeriod].[account_no]
)
, [cte_AllSalesAggregatedByPeriod] as (
select account_no, [PeriodOrdinal], bk_month, bk_year, [PeriodTotalSpend] = sum([value_usd])
from [cte_allsalesForActivityPeriod]
group by account_no, [PeriodOrdinal], bk_month, bk_year
)
, [cte_PeriodAnalysis] as (
select account_no, [PeriodOrdinal], [ActivePeriods] = count(distinct concat(bk_month, bk_year))
from [cte_AllSalesAggregatedByPeriod]
group by account_no, [PeriodOrdinal]
)
)
, [cte_pivot_clumsily] as (
/* Aggregations already done - so simple pivot */
select [cte_uniqueAccounts].[account_no]
, [Total_Spend_Pre] = case when [SaleVal].[PeriodOrdinal] in (1) then [SaleVal].[PeriodTotalSpend] else 0 end
, [Total_Spend_During] = case when [SaleVal].[PeriodOrdinal] in (2) then [SaleVal].[PeriodTotalSpend] else 0 end
, [Total_Spend_Post] = case when [SaleVal].[PeriodOrdinal] in (3) then [SaleVal].[PeriodTotalSpend] else 0 end
, [Pre_Periods] = case when [SalePrd].[PeriodOrdinal] in (1) then [SalePrd].[ActivePeriods] else 0 end
, [During_Periods] = case when [SalePrd].[PeriodOrdinal] in (2) then [SalePrd].[ActivePeriods] else 0 end
, [Post_Periods] = case when [SalePrd].[PeriodOrdinal] in (3) then [SalePrd].[ActivePeriods] else 0 end
from [cte_uniqueAccounts]
left join [cte_AllSalesAggregatedByPeriod] [SaleVal] on [SaleVal].[account_no] = [cte_uniqueAccounts].[account_no]
left join [cte_PeriodAnalysis] [SalePrd] on [SalePrd].[account_no] = [cte_uniqueAccounts].[account_no]
)
select c.code, c.code_desc, [cte_pivot_clumsily].*
from [cte_pivot_clumsily]
LEFT JOIN customer as c
ON [cte_pivot_clumsily].account_no = c.account_no
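The approach shared by both answers (tag each sale with its period through a join, aggregate once per account and period, then pivot) can be sketched end to end in SQLite. This is only an illustration under assumptions: the periods table and its non-overlapping boundaries are hypothetical, the sample rows are made up, and the active month is derived from tran_date rather than from bk_month and bk_year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (account_no INTEGER, tran_date TEXT, value_usd INTEGER);
    CREATE TABLE periods (tag TEXT, start_date TEXT, end_date TEXT);
    INSERT INTO periods VALUES
        ('Pre',    '2020-01-01', '2021-03-31'),
        ('During', '2021-04-01', '2022-03-31'),
        ('Post',   '2022-04-01', '9999-12-31');
    INSERT INTO sales VALUES
        (25, '2020-02-10',  400), (25, '2020-02-20', 100), (25, '2020-07-01', 500),
        (25, '2021-06-15', 2005),
        (25, '2022-05-01',  500),
        (11, '2020-03-03',  500);
""")

query = """
    WITH per_period AS (
        -- one row per account and period: total spend and distinct active months
        SELECT s.account_no, p.tag,
               SUM(s.value_usd) AS spend,
               COUNT(DISTINCT substr(s.tran_date, 1, 7)) AS active_months
        FROM sales s
        JOIN periods p ON s.tran_date BETWEEN p.start_date AND p.end_date
        GROUP BY s.account_no, p.tag
    )
    -- pivot the (at most three) period rows of each account into columns
    SELECT account_no,
           SUM(CASE WHEN tag = 'Pre'    THEN spend ELSE 0 END) AS total_spend_pre,
           SUM(CASE WHEN tag = 'During' THEN spend ELSE 0 END) AS total_spend_during,
           SUM(CASE WHEN tag = 'Post'   THEN spend ELSE 0 END) AS total_spend_post,
           SUM(CASE WHEN tag = 'Pre'    THEN active_months ELSE 0 END) AS pre_periods,
           SUM(CASE WHEN tag = 'During' THEN active_months ELSE 0 END) AS during_periods,
           SUM(CASE WHEN tag = 'Post'   THEN active_months ELSE 0 END) AS post_periods
    FROM per_period
    GROUP BY account_no
    ORDER BY account_no
"""
summary = conn.execute(query).fetchall()
for row in summary:
    print(row)
```

Because the pivot runs over rows that are already aggregated per period, each account contributes at most three rows to the final GROUP BY. Whether this is faster than the original query on real data depends on indexes and data volume, which this sketch does not measure.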

SQL query to use group by to get the sum of two different columns within a date range

I have two tables time track and absence for an employee.
person_number Measure start_Date end_date Time_type
73636 10 01-Jan-2020 02-Jan-2020 Double
73636 24 06-Jan-2020 08-jan-2020 Double
73636 10 15-Jan-2020 25-Jan-2020 Regular Pay
73636 11.9 06-Jan-2020 08-jan-2020 Double
73636 27 10-Jan-2020 15-Jan-2020 Regular Pay
Absence det
person_number start_Date end_date duration Absence_type
73636 05-Jan-2020 10-Jan-2020 10 Vacation
73636 06-Jan-2020 18-jan-2020 9 Paid Leave
73636 20-Jan-2020 21-jan-2020 1 Paid Leave
Now, when I pass the from and to dates as 01-Jan-2020 and 31-Jan-2020, the output should look like this:
Person_Number Double Regular Hour_code hour_amount
73636 31.9 37 Paid Leave 10
The hour_code should have only "Paid Leave" and no other absences
Now I have written the below query for this
SELECT
distinct person_number,
sum(
CASE
WHEN elements = 'Double' THEN measure
END
) AS OT_Hours,
sum(
CASE
WHEN elements LIKE 'Regular Pay%' THEN measure
END
) AS regular_measure_hours,
sum(
CASE
WHEN absence_name IN ('Paid Leave') THEN absence_duration
END
) AS hour3_amount,
max(
CASE
WHEN absence_name IN ('Paid Leave') THEN 'Paid Leave'
END
) AS hour3_code
FROM
(
select
person_number,
Time_type elements,
Absence_type absence_name,
duration,
measure
from
time_track_tab,
abs_tab,
per_all_people_F papf
where
time_track_tab.person_id = abs_tab.person_id
and abs_tab.person_id = papf.person_id
and abs_tab.Absence_type = 'Paid Leave'
)
group by
person_number
This gives me multiple rows of output, and the sums are wrong, because between the from and to dates there are different dates present for both absence and time track.
My requirement is to calculate the sum of all the duration and measure values within these parameter dates. How can I tweak my query to get the correct sums between these dates?
Is there a way to use PARTITION BY, GROUP BY, or anything else to calculate these correctly?
You probably need to group both tables first then join them together to avoid the cross join.
select person_number, TimeTrack.DoublePay, TimeTrack.Regular,
Absenses.Hour_code, Absenses.hour_amount from
per_all_people_F papf,
(select
person_id, sum(duration) as hour_amount, Absence_type as Hour_code
from
abs_tab
where
abs_tab.Absence_type = 'Paid Leave'
and
start_Date between '2020-01-01' and '2020-01-31'
group by person_id,Absence_type
) Absenses,
(select
person_id,
sum(case when Time_type = 'Double' then Measure end) as DoublePay,
sum(case when Time_type = 'Regular Pay' then Measure end) as Regular
from time_track_tab
where
start_Date between '2020-01-01' and '2020-01-31'
group by person_id
) TimeTrack
where
papf.person_id = TimeTrack.person_id
and
papf.person_id = Absenses.person_id
and
papf.person_id = 73636
I made a SqlFiddle if you want to play with it
http://sqlfiddle.com/#!9/03e460/36
Also, my 2 cents: I'd recommend left outer joining from the per_all_people_F table, or else people without absences will get filtered out.
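Both points (aggregate each table before joining, and left join from the people table) can be checked on a small dataset. The following is a sketch in SQLite from Python; the table names people, time_track, and absence and all rows are simplified stand-ins for the tables in the question, with integer hours to keep the output easy to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (person_number INTEGER PRIMARY KEY);
    CREATE TABLE time_track (person_number INTEGER, measure INTEGER,
                             start_date TEXT, time_type TEXT);
    CREATE TABLE absence (person_number INTEGER, duration INTEGER,
                          start_date TEXT, absence_type TEXT);
    INSERT INTO people VALUES (1), (2);
    INSERT INTO time_track VALUES
        (1,  10, '2020-01-01', 'Double'),
        (1,  20, '2020-01-06', 'Double'),
        (1,   8, '2020-01-15', 'Regular Pay'),
        (1, 100, '2020-02-05', 'Double'),       -- outside the date range
        (2,   7, '2020-01-10', 'Regular Pay');
    INSERT INTO absence VALUES
        (1, 3, '2020-01-06', 'Paid Leave'),
        (1, 4, '2020-01-20', 'Paid Leave'),
        (1, 5, '2020-01-05', 'Vacation');       -- person 2 has no absences
""")

# Each derived table yields at most one row per person, so joining them
# cannot multiply rows; the LEFT JOINs keep people missing from either side.
query = """
    SELECT p.person_number, t.double_pay, t.regular, a.hour_code, a.hour_amount
    FROM people p
    LEFT JOIN (SELECT person_number,
                      SUM(CASE WHEN time_type = 'Double' THEN measure END) AS double_pay,
                      SUM(CASE WHEN time_type = 'Regular Pay' THEN measure END) AS regular
               FROM time_track
               WHERE start_date BETWEEN :date_from AND :date_to
               GROUP BY person_number) t
           ON t.person_number = p.person_number
    LEFT JOIN (SELECT person_number, absence_type AS hour_code,
                      SUM(duration) AS hour_amount
               FROM absence
               WHERE absence_type = 'Paid Leave'
                 AND start_date BETWEEN :date_from AND :date_to
               GROUP BY person_number, absence_type) a
           ON a.person_number = p.person_number
    ORDER BY p.person_number
"""
params = {"date_from": "2020-01-01", "date_to": "2020-01-31"}
totals = conn.execute(query, params).fetchall()
for row in totals:
    print(row)
```

Person 2 appears in the result with NULL (None in Python) for the absence columns, which is the case an inner join would silently drop.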
See whether something like this is what you need:
select * from
(SELECT person_number,
sum(
CASE
WHEN Time_type = 'Double' THEN measure
END
) AS Double,
sum(
CASE
WHEN Time_type = ('Regular Pay') THEN measure
END
) AS regular
from time_track_tab
group by person_number
) A
inner join
(SELECT
person_number,
sum(
CASE
WHEN Absence_type = 'Vacation' THEN duration
END
) AS Vacation,
sum(
CASE
WHEN Absence_type = ('Paid Leave') THEN duration
END
) AS paidLeave
from abs_tab
group by person_number
)B on A.person_number = B.person_number
Here is the fiddle:
http://sqlfiddle.com/#!4/21253/2

Simplify complex Query

I need to simplify a complex old query so that I can filter it by date range.
I got a table with Tickets and TicketNotes.
I need
a column with the Tickets count of the day
a column with the Tickets count with a specific note of the day
the date
The old query
SELECT SUM(IFNULL(qtickets.count, 0)) j, SUM(IFNULL(mtickets.count, 0)) m FROM (
SELECT
COUNT(tickets.id) COUNT,
DATE(tickets.date) DATE
FROM
tickets
WHERE
tickets.status = 'Closed' AND tickets.did = 7
AND MONTH(tickets.date) = MONTH( CURRENT_DATE - INTERVAL 1 MONTH )
AND YEAR(tickets.date) = YEAR( CURRENT_DATE - INTERVAL 1 MONTH )
GROUP BY
DATE(tickets.date)
) AS mtickets LEFT JOIN (
SELECT
1 AS COUNT,
DATE(tickets.date) DATE
FROM
ticketnotes
INNER JOIN tickets ON tickets.id = ticketnotes.ticketid
WHERE
ticketnotes.message LIKE '%https://xxxxx.net/help/tickets/%'
AND tickets.status = 'Closed'
AND tickets.did = 7
AND MONTH(tickets.date) = MONTH( CURRENT_DATE - INTERVAL 1 MONTH )
AND YEAR(tickets.date) = YEAR( CURRENT_DATE - INTERVAL 1 MONTH )
GROUP BY
DATE(tickets.date)
) AS qtickets ON (mtickets.date = qtickets.date)
The goal is to get a result of
Date | M | Q
===================
2020-04-01 | 1 | 1
2020-04-02 | 2 | 1
2020-04-03 | 5 | 2
...
2020-04-30 | 3 | 0
Here M is the total number of closed tickets of the day for did = 7, and Q is the number of those closed tickets that have the note message.
I need the query to work with a single date filter, date BETWEEN '2020-04-01' AND '2020-04-30', and still return the correct three columns.
=======
UPDATE:
When I add AND DATE(tickets.date) BETWEEN DATE('2020-04-01') AND DATE('2020-04-30') to Gordon's answer, I get different results from my original query.
QUERY:
SELECT
DATE(t.date),
COUNT(t.id) AS num_tickets,
(CASE WHEN COUNT(tn.ticketid) = 0 THEN 0 ELSE 1 END) AS num_with_message
FROM
tickets t
LEFT JOIN ticketnotes tn ON
tn.ticketid = t.id AND tn.message LIKE '%https://xxxxx.net/help/tickets/%'
WHERE
t.status = 'Closed' AND t.did = 7
AND DATE(t.date) BETWEEN DATE('2020-04-01') AND DATE('2020-04-30')
GROUP BY
DATE(t.date)
The num_tickets values in the result are wrong: they differ from the counts I get without the JOIN.
Any suggestions?
You could try using a CASE expression, like this:
SELECT
DATE(tickets.date) DATE
, COUNT(tickets.id) M
, CASE WHEN SUM( ticketnotes.message LIKE '%https://xxxxx.net/help/tickets/%' ) > 0 THEN 1 ELSE NULL END Q
FROM
ticketnotes
INNER JOIN tickets ON tickets.id = ticketnotes.ticketid
WHERE tickets.status = 'Closed'
AND tickets.did = 7
AND MONTH(tickets.date) = MONTH( CURRENT_DATE - INTERVAL 1 MONTH )
AND YEAR(tickets.date) = YEAR( CURRENT_DATE - INTERVAL 1 MONTH )
GROUP BY DATE(tickets.date)
This answers the original version of the question.
What you are describing sounds like a group by with left join. However, it is not clear what exactly you are looking for. My best guess is:
select date(t.date), count(t.id) as num_tickets,
count(tn.ticketid) as num_with_message
from tickets t left join
ticketnotes tn
on tn.ticketid = t.id and
tn.message like '%https://xxxxx.net/help/tickets/%'
where t.status = 'Closed' and
t.did = 7
group by date(t.date)
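A likely reason for inflated ticket counts with this kind of left join is a ticket that has several matching notes: it then appears once per note, and COUNT(t.id) counts every copy. The sketch below reproduces that on made-up data in SQLite and uses COUNT(DISTINCT ...) to count tickets rather than joined rows. The example.test URL and all rows are placeholders, and the date filter is written as a half-open range on the raw column, which is an assumption about how the date column is stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tickets (id INTEGER PRIMARY KEY, date TEXT, status TEXT, did INTEGER);
    CREATE TABLE ticketnotes (ticketid INTEGER, message TEXT);
    INSERT INTO tickets VALUES
        (1, '2020-04-01 09:00:00', 'Closed', 7),
        (2, '2020-04-02 10:00:00', 'Closed', 7),
        (3, '2020-04-02 11:00:00', 'Closed', 7),
        (4, '2020-04-02 12:00:00', 'Open',   7),
        (5, '2020-05-01 12:00:00', 'Closed', 7);
    INSERT INTO ticketnotes VALUES
        (1, 'see https://example.test/help/tickets/1'),
        (1, 'again https://example.test/help/tickets/1'),
        (2, 'unrelated note'),
        (3, 'https://example.test/help/tickets/3');
""")

query = """
    SELECT DATE(t.date) AS day,
           COUNT(DISTINCT t.id)        AS num_tickets,       -- tickets, not joined rows
           COUNT(DISTINCT tn.ticketid) AS num_with_message,  -- tickets with a matching note
           COUNT(t.id)                 AS joined_rows        -- what a plain COUNT sees
    FROM tickets t
    LEFT JOIN ticketnotes tn
           ON tn.ticketid = t.id
          AND tn.message LIKE '%https://example.test/help/tickets/%'
    WHERE t.status = 'Closed' AND t.did = 7
      AND t.date >= '2020-04-01' AND t.date < '2020-05-01'
    GROUP BY DATE(t.date)
    ORDER BY day
"""
daily = conn.execute(query).fetchall()
for row in daily:
    print(row)
```

On the first day there is one closed ticket with two matching notes, so the plain count reports 2 while the distinct count reports 1.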

SQL Query in CRM Report

A "Case" in CRM has a field called "Status" with four options.
I'm trying to build a report in CRM that fills a table with every week of the year (each row is a different week), and then counts the number of cases that have each Status option (the columns would be each of the Status options).
The table would look like this
Status 1 Status 2 Status 3
Week 1 3 55 4
Week 2 5 23 5
Week 3 14 11 33
So far I have the following:
SELECT
SUM(case WHEN status = 1 then 1 else 0 end) Status1,
SUM(case WHEN status = 2 then 1 else 0 end) Status2,
SUM(case WHEN status = 3 then 1 else 0 end) Status3,
SUM(case WHEN status = 4 then 1 else 0 end) Status4,
SUM(case WHEN status = 5 then 1 else 0 end) Status5
FROM [DB].[dbo].[Contact]
Which gives me the following:
Status 1 Status 2 Status 3
2 43 53
Now I need to somehow split this into 52 rows for the past year and filter these results by date (columns in the Contact table). I'm a bit new to SQL queries and CRM - any help here would be much appreciated.
Here is a SQLFiddle with my progress and sample data: http://sqlfiddle.com/#!2/85b19/1
Sounds like you want to group by a range. The trick is to create a new field that represents each range (for you one per year) and group by that.
Since it also seems like you want an infinite range of dates, marc_s has a good summary for how to do the group by trick with dates in a generic way: SQL group by frequency within a date range
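As a small illustration of grouping by a computed range field, the sketch below derives a year-and-week label from a date and groups on it, reusing the SUM(CASE ...) columns from the question. It runs on SQLite from Python; the cases table, the created_on column, and the rows are hypothetical, and strftime with %W is SQLite's week-of-year function, so SQL Server would need its own date functions instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cases (id INTEGER PRIMARY KEY, status INTEGER, created_on TEXT);
    INSERT INTO cases (status, created_on) VALUES
        (1, '2013-01-07'), (2, '2013-01-08'), (2, '2013-01-09'),
        (1, '2013-01-14'), (3, '2013-01-15'),
        (3, '2013-01-22');
""")

# The computed label is the "new field that represents each range";
# grouping on it gives one row per week that has at least one case.
query = """
    SELECT strftime('%Y-W%W', created_on) AS year_week,
           SUM(CASE WHEN status = 1 THEN 1 ELSE 0 END) AS status1,
           SUM(CASE WHEN status = 2 THEN 1 ELSE 0 END) AS status2,
           SUM(CASE WHEN status = 3 THEN 1 ELSE 0 END) AS status3
    FROM cases
    WHERE created_on >= '2013-01-01' AND created_on < '2014-01-01'
    GROUP BY strftime('%Y-W%W', created_on)
    ORDER BY year_week
"""
weekly = conn.execute(query).fetchall()
for row in weekly:
    print(row)
```

Weeks without any cases do not appear in this result; producing a row for every week requires joining from a calendar of weeks.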
So, let's break this down:
You want to make a report that shows, for each contact, a breakdown, week by week, of the number of cases registered to that contact, which is divided into three columns, one for each StateCode.
If this is the case, then you would need to have 52 date records (or so) for each contact. For calendar like requests, it's always good to have a separate calendar table that lets you query from it. Dan Guzman has a blog entry that creates a useful calendar table which I'll use in the query.
WITH WeekNumbers AS
(
SELECT
FirstDateOfWeek,
-- order by first date of week, grouping calendar year to produce week numbers
WeekNumber = row_number() OVER (PARTITION BY CalendarYear ORDER BY FirstDateOfWeek)
FROM
master.dbo.Calendar -- created from script
GROUP BY
FirstDateOfWeek,
CalendarYear
), Calendar AS
(
SELECT
WeekNumber =
(
SELECT
WeekNumber
FROM
WeekNumbers WN
WHERE
C.FirstDateOfWeek = WN.FirstDateOfWeek
),
*
FROM
master.dbo.Calendar C
WHERE
CalendarDate BETWEEN '1/1/2012' AND getutcdate()
)
SELECT
C.FullName,
----include the below if the data is necessary
--Cl.WeekNumber,
--Cl.CalendarYear,
--Cl.FirstDateOfWeek,
--Cl.LastDateOfWeek,
'Week: ' + CAST(Cl.WeekNumber AS VARCHAR(20))
+ ', Year: ' + CAST(Cl.CalendarYear AS VARCHAR(20)) WeekNumber
FROM
CRM.dbo.Contact C
-- use a cartesian join to produce a table list
CROSS JOIN
(
SELECT
DISTINCT WeekNumber,
CalendarYear,
FirstDateOfWeek,
LastDateOfWeek
FROM
Calendar
) Cl
ORDER BY
C.FullName,
Cl.WeekNumber
This is different from the solution Ben linked to because Marc's query only returns weeks where there is a matching value, whereas you may or may not want to see even the weeks where there is no activity.
Once you have your core tables of contacts split out week by week as in the above (or altered for your specific time period), you can simply add a subquery for each StateCode to see the breakdown in columns as in the final query below.
WITH WeekNumbers AS
(
SELECT
FirstDateOfWeek,
WeekNumber = row_number() OVER (PARTITION BY CalendarYear ORDER BY FirstDateOfWeek)
FROM
master.dbo.Calendar
GROUP BY
FirstDateOfWeek,
CalendarYear
), Calendar AS
(
SELECT
WeekNumber =
(
SELECT
WeekNumber
FROM
WeekNumbers WN
WHERE
C.FirstDateOfWeek = WN.FirstDateOfWeek
),
*
FROM
master.dbo.Calendar C
WHERE
CalendarDate BETWEEN '1/1/2012' AND getutcdate()
)
SELECT
C.FullName,
--Cl.WeekNumber,
--Cl.CalendarYear,
--Cl.FirstDateOfWeek,
--Cl.LastDateOfWeek,
'Week: ' + CAST(Cl.WeekNumber AS VARCHAR(20)) +', Year: ' + CAST(Cl.CalendarYear AS VARCHAR(20)) WeekNumber,
(
SELECT
count(*)
FROM
CRM.dbo.Incident I
INNER JOIN CRM.dbo.StringMap SM ON
I.StateCode = SM.AttributeValue
INNER JOIN
(
SELECT
DISTINCT ME.Name,
ME.ObjectTypeCode
FROM
CRM.MetadataSchema.Entity ME
) E ON
SM.ObjectTypeCode = E.ObjectTypeCode
WHERE
I.ModifiedOn >= Cl.FirstDateOfWeek
AND I.ModifiedOn < dateadd(day, 1, Cl.LastDateOfWeek)
AND E.Name = 'incident'
AND SM.AttributeName = 'statecode'
AND SM.LangId = 1033
AND I.CustomerId = C.ContactId
AND SM.Value = 'Active'
) ActiveCases,
(
SELECT
count(*)
FROM
CRM.dbo.Incident I
INNER JOIN CRM.dbo.StringMap SM ON
I.StateCode = SM.AttributeValue
INNER JOIN
(
SELECT
DISTINCT ME.Name,
ME.ObjectTypeCode
FROM
CRM.MetadataSchema.Entity ME
) E ON
SM.ObjectTypeCode = E.ObjectTypeCode
WHERE
I.ModifiedOn >= Cl.FirstDateOfWeek
AND I.ModifiedOn < dateadd(day, 1, Cl.LastDateOfWeek)
AND E.Name = 'incident'
AND SM.AttributeName = 'statecode'
AND SM.LangId = 1033
AND I.CustomerId = C.ContactId
AND SM.Value = 'Resolved'
) ResolvedCases,
(
SELECT
count(*)
FROM
CRM.dbo.Incident I
INNER JOIN CRM.dbo.StringMap SM ON
I.StateCode = SM.AttributeValue
INNER JOIN
(
SELECT
DISTINCT ME.Name,
ME.ObjectTypeCode
FROM
CRM.MetadataSchema.Entity ME
) E ON
SM.ObjectTypeCode = E.ObjectTypeCode
WHERE
I.ModifiedOn >= Cl.FirstDateOfWeek
AND I.ModifiedOn < dateadd(day, 1, Cl.LastDateOfWeek)
AND E.Name = 'incident'
AND SM.AttributeName = 'statecode'
AND SM.LangId = 1033
AND I.CustomerId = C.ContactId
AND SM.Value = 'Canceled'
) CancelledCases
FROM
CRM.dbo.Contact C
CROSS JOIN
(
SELECT
DISTINCT WeekNumber,
CalendarYear,
FirstDateOfWeek,
LastDateOfWeek
FROM
Calendar
) Cl
ORDER BY
C.FullName,
Cl.WeekNumber