Multiple Sub-Queries In A SQL Query - sql

I am writing a sample query that converts rows to columns, producing output like this:
Person_Id | Total Earned Leave | Earned Leave Enjoyed | Remaining Earned Leave | Total Casual Leave | Casual Leave Enjoyed | Remaining Casual Leave
1001 | 20 | 10 | 10 | 20 | 4 | 16
That is the output I get, using multiple sub-queries in the following query:
SELECT DISTINCT m.Person_Id, (SELECT k.Leave_Allocation FROM LeaveDetails k WHERE k.Leave_Name = 'Earn Leave'
AND k.Person_Id = 1001 AND k.[Year] = '2017') AS 'Total Earned Leave',
(SELECT o.Leave_Enjoy FROM LeaveDetails o WHERE o.Leave_Name = 'Earn Leave'
AND o.Person_Id = 1001 AND o.[Year] = '2017') AS 'Earned Leave Enjoyed',
(SELECT p.Leave_Remain FROM LeaveDetails p WHERE p.Leave_Name = 'Earn Leave'
AND p.Person_Id = 1001 AND p.[Year] = '2017') AS 'Remaining Earned Leave',
(SELECT k.Leave_Allocation FROM LeaveDetails k WHERE k.Leave_Name = 'Casual Leave'
AND k.Person_Id = 1001 AND k.[Year] = '2017') AS 'Total Casual Leave',
(SELECT o.Leave_Enjoy FROM LeaveDetails o WHERE o.Leave_Name = 'Casual Leave'
AND o.Person_Id = 1001 AND o.[Year] = '2017') AS 'Casual Leave Enjoyed',
(SELECT p.Leave_Remain FROM LeaveDetails p WHERE p.Leave_Name = 'Casual Leave'
AND p.Person_Id = 1001 AND p.[Year] = '2017') AS 'Remaining Casual Leave'
FROM LeaveDetails m WHERE m.Person_Id = 1001 AND m.[Year] = '2017'
I am not sure whether this will cause performance problems, since there will be a lot of data, and I am debating whether it is better than PIVOT or creating a table at run time. I just want to make sure this is a good choice for what I am trying to accomplish. Feel free to share ideas and samples for SQL Server, MySQL or Oracle that would perform better. Thanks.
Sample Table and Data:
CREATE TABLE [dbo].[LeaveDetails](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Person_Id] [nvarchar](20) NULL,
[Leave_Name] [nvarchar](40) NULL,
[Leave_Allocation] [float] NULL,
[Leave_Enjoy] [float] NULL,
[Leave_Remain] [float] NULL,
[Details] [nvarchar](100) NULL,
[Year] [nvarchar](10) NULL,
[Status] [bit] NULL
)
INSERT [dbo].[LeaveDetails] ([Id], [Person_Id], [Leave_Name], [Leave_Allocation], [Leave_Enjoy], [Leave_Remain], [Details], [Year], [Status]) VALUES (1, N'1001', N'Earn Leave', 20, 10, 10, NULL, N'2017', 1)
INSERT [dbo].[LeaveDetails] ([Id], [Person_Id], [Leave_Name], [Leave_Allocation], [Leave_Enjoy], [Leave_Remain], [Details], [Year], [Status]) VALUES (2, N'1001', N'Casual Leave', 20, 4, 16, NULL, N'2017', 1)

Use conditional aggregation:
SELECT m.Person_Id,
MAX(CASE WHEN m.Leave_Name = 'Earn Leave' THEN m.Leave_Allocation END) as [Total Earned Leave],
MAX(CASE WHEN m.Leave_Name = 'Earn Leave' THEN m.Leave_Enjoy END) as [Earned Leave Enjoyed],
MAX(CASE WHEN m.Leave_Name = 'Earn Leave' THEN m.Leave_Remain END) as [Remaining Earned Leave],
MAX(CASE WHEN m.Leave_Name = 'Casual Leave' THEN m.Leave_Allocation END) as [Total Casual Leave],
MAX(CASE WHEN m.Leave_Name = 'Casual Leave' THEN m.Leave_Enjoy END) as [Casual Leave Enjoyed],
MAX(CASE WHEN m.Leave_Name = 'Casual Leave' THEN m.Leave_Remain END) as [Remaining Casual Leave]
FROM LeaveDetails m
WHERE m.Person_Id = 1001 AND m.[Year] = '2017'
GROUP BY m.Person_ID;
Note: I do not advocate having special characters (such as spaces) in column aliases. If you do, use the proper escape character (square brackets). Only use single quotes for string and date constants.
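To check the conditional-aggregation idea against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for SQL Server here (an assumption made only so the example is self-contained), and the column aliases are written without spaces.

```python
import sqlite3

# In-memory database holding the two sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LeaveDetails (
    Person_Id TEXT, Leave_Name TEXT,
    Leave_Allocation REAL, Leave_Enjoy REAL, Leave_Remain REAL,
    Year TEXT
);
INSERT INTO LeaveDetails VALUES ('1001', 'Earn Leave',   20, 10, 10, '2017');
INSERT INTO LeaveDetails VALUES ('1001', 'Casual Leave', 20,  4, 16, '2017');
""")

# One pass over the table: each CASE keeps a value only for its leave type,
# and MAX collapses the group to the single non-NULL value.
row = conn.execute("""
SELECT Person_Id,
       MAX(CASE WHEN Leave_Name = 'Earn Leave'   THEN Leave_Allocation END) AS TotalEarnedLeave,
       MAX(CASE WHEN Leave_Name = 'Earn Leave'   THEN Leave_Enjoy      END) AS EarnedLeaveEnjoyed,
       MAX(CASE WHEN Leave_Name = 'Earn Leave'   THEN Leave_Remain     END) AS RemainingEarnedLeave,
       MAX(CASE WHEN Leave_Name = 'Casual Leave' THEN Leave_Allocation END) AS TotalCasualLeave,
       MAX(CASE WHEN Leave_Name = 'Casual Leave' THEN Leave_Enjoy      END) AS CasualLeaveEnjoyed,
       MAX(CASE WHEN Leave_Name = 'Casual Leave' THEN Leave_Remain     END) AS RemainingCasualLeave
FROM LeaveDetails
WHERE Person_Id = '1001' AND Year = '2017'
GROUP BY Person_Id
""").fetchone()

print(row)
```

The single row that comes back matches the output shown in the question, and the table is read once rather than once per column.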

PIVOT would work, but it looks like this is simply a single row that you want pivoted to a columnar output and the column names are known explicitly. If that's the case, you could just UNION the single column results together:
SELECT 'Person_ID' as col_name, Person_Id as col_value FROM LeaveDetails WHERE Person_Id = 1001 AND [Year] = '2017'
UNION
SELECT 'Leave_Enjoy' as col_name, Leave_Enjoy as col_value FROM LeaveDetails WHERE Person_Id = 1001 AND [Year] = '2017'
UNION
...
It's a lot simpler to write, cleaner to read, and should run a little faster, although there is still one table scan for each column. Is the table indexed on Person_ID and Year?
If speed is an issue you could create a temp table of the one row:
SELECT * into #ld_temp FROM LeaveDetails WHERE Person_Id = 1001 AND [Year] = '2017'
then select from the temp table in the SELECT/UNION code:
SELECT 'Person_ID' as col_name, Person_Id as col_value FROM #ld_temp
UNION
SELECT 'Leave_Enjoy' as col_name, Leave_Enjoy as col_value FROM #ld_temp
UNION
...
Now you're down to just a single scan of the big table.
I hope this helps.

Related

SQL when sum of supplies count reach to the demands count?

I want to write a query in SQL Server 2014 over two tables that have no relationship with each other.
The first one represents the demands. And the second one represents the supplies for them.
Demands(
[DemandId] [int] NOT NULL,
[ItemCode] [nvarchar](50) NULL,
[TotalCount] [int] NULL,
[Date] [datetime] NULL)
Supplies(
[SupplyId] [int] NOT NULL,
[ItemCode] [nvarchar](50) NULL,
[Count] [int] NULL,
[Date] [datetime] NULL)
For example, we have a demand with (TotalCount = 1000, ItemCode = 1, Date = d1)
and two supplies with (Date = d2, Count = 300, ItemCode = 1) and (Date = d3, Count = 700, ItemCode = 1).
The demand is fulfilled on date d3, so I want a query that shows the date on which the supplies have met each demand.
consider the following data:
the result should be:
Item01 2020-01-07
Item02 2020-01-06
I appreciate any help.
A simple summary could be...
treat a demand as a negative amount of supply
combine the two datasets in to a single time series
use a cumulative sum to see the net availability
Such as...
WITH
NetContribution AS
(
SELECT [ItemCode], [Date], [Count] FROM Supplies
UNION ALL
SELECT [ItemCode], [Date], -[TotalCount] FROM Demands
)
SELECT
[ItemCode],
[Date],
[Count] AS NetAvailabilityChange,
SUM([Count])
OVER (PARTITION BY [ItemCode]
ORDER BY [Date],
[Count] DESC
)
AS NetAvailability
FROM
NetContribution
While the NetAvailability is negative, Supply has not yet met Demand. While it's positive, Supply has exceeded Demand.
EDIT: In response to your question edit...
Just use the above query and add a WHERE clause...
WITH
NetContribution AS
(
SELECT [ItemCode], [Date], [Count] FROM Supplies
UNION ALL
SELECT [ItemCode], [Date], -[TotalCount] FROM Demands
),
NetAvailability AS
(
SELECT
[ItemCode],
[Date],
[Count] AS Delta,
SUM([Count])
OVER (PARTITION BY [ItemCode]
ORDER BY [Date],
[Count] DESC
)
AS Amount
FROM
NetContribution
)
SELECT
*
FROM
NetAvailability
WHERE
Amount >= 0
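As a rough check of the cumulative-sum approach, the sketch below runs the same CTEs in SQLite through Python's sqlite3 module and takes the earliest date on which the running total reaches zero. The rows are invented for illustration (the question's data is not reproduced here), the aliases Day and Delta are hypothetical, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Demands  (ItemCode TEXT, TotalCount INTEGER, [Date] TEXT);
CREATE TABLE Supplies (ItemCode TEXT, [Count] INTEGER, [Date] TEXT);
-- Hypothetical sample rows, not the asker's data.
INSERT INTO Demands  VALUES ('Item01', 1000, '2020-01-01');
INSERT INTO Supplies VALUES ('Item01',  300, '2020-01-03');
INSERT INTO Supplies VALUES ('Item01',  700, '2020-01-07');
INSERT INTO Demands  VALUES ('Item02',  500, '2020-01-02');
INSERT INTO Supplies VALUES ('Item02',  500, '2020-01-06');
""")

fulfilled = conn.execute("""
WITH NetContribution AS (
    SELECT ItemCode, [Date] AS Day, [Count] AS Delta FROM Supplies
    UNION ALL
    SELECT ItemCode, [Date], -TotalCount FROM Demands
),
NetAvailability AS (
    SELECT ItemCode, Day,
           SUM(Delta) OVER (PARTITION BY ItemCode
                            ORDER BY Day, Delta DESC) AS Amount
    FROM NetContribution
)
-- The first date with a non-negative running total is the date
-- on which supply caught up with demand.
SELECT ItemCode, MIN(Day)
FROM NetAvailability
WHERE Amount >= 0
GROUP BY ItemCode
ORDER BY ItemCode
""").fetchall()

print(fulfilled)
```

Note that this simple form assumes the demand is recorded before any supply for the item; a supply dated earlier than the demand would make the running total non-negative from the start.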
This is my source data
Demand :
'1', 'A', '1000', '2020-12-01'
'4', 'B', '2000', '2020-12-01'
Supply :
'2', 'A', '700', '2020-12-05'
'3', 'A', '300', '2020-12-08'
'5', 'B', '1000', '2020-12-05'
'6', 'B', '1000','2020-12-08'
Performed the below query :
select a.itemcode, case when totaldemand - totalsupply = 0 then endsupplydate
else null end enddate from
(
select 'demand' type,itemcode,sum(quantity) totaldemand,min(demanddate) as
date from demand b group by type,itemcode ) b
inner join (
select 'supply' type,itemcode,sum(quantity) totalsupply,max(supplydate) as
endsupplydate from supply group by type,itemcode) a
on a.itemcode = b.itemcode;
Output you will get:
ItemCode,EndDate
'A', '2020-12-08'
'B', '2020-12-08'
If SUM() OVER() is not available to generate a cumulative sum, you can use a triangular join (join the current row to all preceding rows), but it is painfully slow on large data sets...
WITH
NetContribution AS
(
SELECT [ItemCode], [Date], SUM([Count]) AS [Count]
FROM (
SELECT [ItemCode], [Date], [Count] FROM Supplies
UNION ALL
SELECT [ItemCode], [Date], -[TotalCount] FROM Demands
)
combined
GROUP BY [ItemCode], [Date]
),
NetAvailability AS
(
SELECT
a.[ItemCode],
a.[Date],
a.[Count] AS Delta,
SUM(b.[Count]) AS Amount
FROM
NetContribution AS a
INNER JOIN
NetContribution AS b
ON a.[ItemCode] = b.[ItemCode]
AND a.[Date] >= b.[Date]
GROUP BY
a.[ItemCode],
a.[Date],
a.[Count]
)
SELECT
*
FROM
NetAvailability
WHERE
Amount >= 0
https://dbfiddle.uk/?rdbms=sqlserver_2014&fiddle=48660224fc63bcb2803f5a08b8b1311e

Optimizing SUM OVER PARTITION BY for several hierarchical groups

I have a table like below:
Region Country Manufacturer Brand Period Spend
R1 C1 M1 B1 2016 5
R1 C1 M1 B1 2017 10
R1 C1 M1 B1 2017 20
R1 C1 M1 B2 2016 15
R1 C1 M1 B3 2017 20
R1 C2 M1 B1 2017 5
R1 C2 M2 B4 2017 25
R1 C2 M2 B5 2017 30
R2 C3 M1 B1 2017 35
R2 C3 M2 B4 2017 40
R2 C3 M2 B5 2017 45
I need to find SUM([Spend]) over different groups as follows:
Total Spend over all the rows in the whole table
Total Spend for each Region
Total Spend for each Region and Country group
Total Spend for each Region, Country and Manufacturer group
So I wrote this query below:
SELECT
[Period]
,[Region]
,[Country]
,[Manufacturer]
,[Brand]
,SUM([Spend]) OVER (PARTITION BY [Period]) AS [SumOfSpendWorld]
,SUM([Spend]) OVER (PARTITION BY [Period], [Region]) AS [SumOfSpendRegion]
,SUM([Spend]) OVER (PARTITION BY [Period], [Region], [Country]) AS [SumOfSpendCountry]
,SUM([Spend]) OVER (PARTITION BY [Period], [Region], [Country], [Manufacturer]) AS [SumOfSpendManufacturer]
FROM myTable
But that query takes more than 15 minutes on a table of just 450K rows. I'd like to know if there is any way to improve its performance. Thank you in advance for your answers and suggestions!
Your description of the problem suggests grouping sets to me:
SELECT YEAR([Period]) AS [Period], [Region], [Country], [Manufacturer],
SUM([Spend])
FROM myTable
GROUP BY GROUPING SETS ( (YEAR([Period])),
(YEAR([Period]), [Region]),
(YEAR([Period]), [Region], [Country]),
(YEAR([Period]), [Region], [Country], [Manufacturer])
);
I don't know if this will be faster, but it certainly seems more aligned with your question.
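For intuition about what GROUPING SETS returns, the sketch below computes the same four levels as a UNION ALL of one GROUP BY per level, using the sample rows from the question. It runs on SQLite through Python's sqlite3 module, which has no GROUPING SETS, so the UNION ALL form is a stand-in rather than the SQL Server syntax. Period is treated as a plain year here, as in the sample table.

```python
import sqlite3

rows = [
    ("R1", "C1", "M1", "B1", 2016, 5),
    ("R1", "C1", "M1", "B1", 2017, 10),
    ("R1", "C1", "M1", "B1", 2017, 20),
    ("R1", "C1", "M1", "B2", 2016, 15),
    ("R1", "C1", "M1", "B3", 2017, 20),
    ("R1", "C2", "M1", "B1", 2017, 5),
    ("R1", "C2", "M2", "B4", 2017, 25),
    ("R1", "C2", "M2", "B5", 2017, 30),
    ("R2", "C3", "M1", "B1", 2017, 35),
    ("R2", "C3", "M2", "B4", 2017, 40),
    ("R2", "C3", "M2", "B5", 2017, 45),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (Region TEXT, Country TEXT, Manufacturer TEXT,"
    " Brand TEXT, Period INTEGER, Spend INTEGER)"
)
conn.executemany("INSERT INTO myTable VALUES (?, ?, ?, ?, ?, ?)", rows)

# One aggregated row per group at each level; NULL marks a rolled-up column.
totals = conn.execute("""
SELECT Period, NULL AS Region, NULL AS Country, NULL AS Manufacturer, SUM(Spend)
FROM myTable GROUP BY Period
UNION ALL
SELECT Period, Region, NULL, NULL, SUM(Spend)
FROM myTable GROUP BY Period, Region
UNION ALL
SELECT Period, Region, Country, NULL, SUM(Spend)
FROM myTable GROUP BY Period, Region, Country
UNION ALL
SELECT Period, Region, Country, Manufacturer, SUM(Spend)
FROM myTable GROUP BY Period, Region, Country, Manufacturer
""").fetchall()

# Key each total by its (Period, Region, Country, Manufacturer) tuple.
lookup = {r[:4]: r[4] for r in totals}
print(lookup[(2017, None, None, None)], lookup[(2017, "R1", None, None)])
```

The result has one row per group rather than one row per source row, so it is much smaller than the window-function output; joining it back to the detail rows is needed only if every row must carry its totals.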
Use cross apply here to speed the query up:
SELECT
periodyear
,[Region]
,[Country]
,[Manufacturer]
,[Brand]
,SUM([Spend]) OVER (PARTITION BY periodyear) AS [SumOfSpendWorld]
,SUM([Spend]) OVER (PARTITION BY periodyear, [Region]) AS [SumOfSpendRegion]
,SUM([Spend]) OVER (PARTITION BY periodyear, [Region], [Country]) AS [SumOfSpendCountry]
,SUM([Spend]) OVER (PARTITION BY periodyear, [Region], [Country], [Manufacturer]) AS [SumOfSpendManufacturer]
FROM myTable
cross apply (select YEAR([Period]) periodyear) a
The old-school equivalent of SUM() OVER(), using correlated subqueries:
SELECT
[Period]
, [Region]
, [Country]
, [Manufacturer]
, [Brand]
, (SELECT SUM([Spend]) FROM myTable t WHERE e.[Period] = t.[Period] GROUP BY [Period]) AS [SumOfSpendWorld]
, (SELECT SUM([Spend]) FROM myTable t WHERE e.[Period] = t.[Period] AND e.Region = t.Region GROUP BY [Period], [Region] ) AS [SumOfSpendRegion]
, (SELECT SUM([Spend]) FROM myTable t WHERE e.[Period] = t.[Period] AND e.Region = t.Region AND e.Country = t.Country GROUP BY [Period], [Region], [Country] ) AS [SumOfSpendCountry]
, (SELECT SUM([Spend]) FROM myTable t WHERE e.[Period] = t.[Period] AND e.Region = t.Region AND e.Country = t.Country AND e.Manufacturer = t.Manufacturer GROUP BY [Period], [Region], [Country], [Manufacturer] ) AS [SumOfSpendManufacturer]
FROM myTable e
This is not the most elegant way to do it, but it gets the job done. I would highly recommend analyzing the table to see which alternative approach suits your situation best. If you hit a dead end, I would suggest using temp tables to make things faster.
For instance, you could select the rows for a given period and bulk copy them directly into a temp table, then run your aggregation there. I've seen tables that forced me to use temp tables instead of a simple select query. Others forced me to split the table in two.
So it's not always going to be nice and clean!
I hope this gives you another angle that helps.

Remove rows where one row offsets the next using accounting rules

I have a view of materials data which contains what was purchased and reversals of some of the purchases. I need a query that removes records that have reversals of purchase transactions. NOTE: The view does not have a primary key.
In the example I need to remove the first two rows as the second row offsets the first row because it reverses the purchase, but I need to keep the third row. Any ideas?
Here is the SQL for the view:
SELECT LEFT(mi.Plnt, 3) AS SBUID ,
oth.EQUIP AS PROJECTID ,
ms.Req_No AS GI ,
ms.Req_Item AS GI_LINE ,
CONVERT(VARCHAR(11), [Doc_Date], 100) + ' 12:00 AM' AS DOC_DATE ,
mi.[SLoc] AS SLOC ,
[Material] AS MATERIAL ,
mi.[Description] AS MATERIAL_DESCRIPTION ,
[Qty] AS QUANTITY ,
mi.[UoM] AS UOM ,
CASE WHEN mi.Mvt IN ( '101', '103', '105', '123', '261' ) THEN
mi.Amount
ELSE mi.Amount * -1
END AS Cost ,
mi.Amount AS EXT_ORG_COST ,
mi.PO AS [PO] ,
mi.Batch ,
mi.Vendor AS VENDOR ,
mi.VendorName AS VENDOR_NAME ,
at.AC_Group AS AC_TYPE ,
[Mvt] AS MVT
FROM [dbo].[MatIssued] mi
INNER JOIN dbo.OrderTableHistory oth ON oth.SUB_ORDER = mi.SubOrder
INNER JOIN dbo.Aircraft_Information2 ai ON ai.Equip = oth.EQUIP
INNER JOIN dbo.RFC_AcftTypeList at ON at.ID = ai.AC_TypeID
LEFT OUTER JOIN dbo.MatStatus ms ON ms.MPN = mi.Material
AND ms.SubOrder = mi.SubOrder
WHERE mi.Plnt IN ( '9131', '9132' )
AND mi.Mvt IN ( '101', '102', '103', '104', '105', '106', '122', '123' ,
'261' ,'262' )
AND mi.Doc_Date >= DATEADD(YEAR, -1, GETDATE())
ORDER BY mi.PO ,
mi.Batch ,
PROJECTID ,
mi.Mvt;
Some assumptions, based on your screenshot:
Reversals have same DOC_DATE as purchases
Reversals have same Batch as purchases
If the above assumptions are correct, try something like this:
DELETE FROM t
FROM MyTable t
WHERE EXISTS (
SELECT 1
FROM MyTable t2
WHERE
-- Join to outer table
t2.SLOC = t.SLOC
AND t2.MATERIAL = t.MATERIAL
AND ABS(t2.QUANTITY) = ABS(t.QUANTITY) -- Assumes the reversal carries the negated quantity
AND t2.PO = t.PO
AND t2.Batch = t.Batch
AND t2.VENDOR = t.VENDOR
GROUP BY SLOC, MATERIAL, PO, Batch, VENDOR
HAVING COUNT(*) = 2 -- There are 2 matching rows
AND -MIN(QUANTITY) = MAX(QUANTITY) -- Minimum quantity negates Maximum quantity
AND MIN(COST) + MAX(COST) = 0 -- Costs cancel each other out
AND MIN(CASE WHEN Cost > 0 THEN DOC_DATE END) <= MIN(CASE WHEN Cost < 0 THEN DOC_DATE END) -- Purchase DOC_DATE less than or equal to reversal DOC_DATE
AND MAX(MVT) = MIN(MVT) + 1 -- Correlate purchase and reversal movement (e.g. 101 and 102)
AND (t.DOC_DATE = MIN(DOC_DATE) OR t.DOC_DATE = MAX(DOC_DATE)) -- Join to outer table
)

Query with subquery-count and groupby

Below is the ERD
I want to count number of gender ('Male' and 'Female') for each month irrespective of year.
What I have tried so far is that I can count number of males and females for each month separately like below
Query
Select u.Gender, datename(month, p.EntryDate) month, COUNT(p.User_Id) count
from [HospitalManagement].[dbo].[Patients] p,[HospitalManagement].[dbo].[Users] u
where u.Id = p.User_Id
group by datename(month, p.EntryDate), u.Gender
Result
I want it like below
Expected Result
Month | MaleCount | FemaleCount
June | 0 | 2
November | 1 | 1
To achieve the above, I tried the following query
Query
Select datename(month, p.EntryDate) month,
(select count(u.gender) from [HospitalManagement].[dbo].[Users] u
where u.Id = p.User_Id and u.Gender = 'Female'
group by u.Gender) female,
(select count(u.gender) from [HospitalManagement].[dbo].[Users] u
where u.Id = p.User_Id and u.Gender = 'Male'
group by u.Gender) male
from [HospitalManagement].[dbo].[Patients] p
group by datename(month, p.EntryDate)
Error
Column 'HospitalManagement.dbo.Patients.User_Id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Below are the create statements of tables (I am using MSSql)
-- Creating table 'Users'
CREATE TABLE [dbo].[Users] (
[Id] bigint IDENTITY(1,1) NOT NULL,
[Email] nvarchar(max) NULL,
[Password] nvarchar(max) NULL,
[UserName] nvarchar(max) NULL,
[Age] bigint NULL,
[Gender] nvarchar(max) NULL,
[NRIC] nvarchar(max) NULL,
[Comments] nvarchar(max) NULL,
[Address] nvarchar(max) NULL,
[ContactNo] nvarchar(max) NULL,
[FullName] nvarchar(max) NULL
);
GO
-- Creating table 'Patients'
CREATE TABLE [dbo].[Patients] (
[Id] bigint IDENTITY(1,1) NOT NULL,
[Disease] nvarchar(max) NULL,
[Occupation] nvarchar(max) NULL,
[EntryDate] datetime NULL,
[EntryTime] time NULL,
[User_Id] bigint NOT NULL
);
GO
As the error says, the subqueries in the select list reference the User_Id column directly, and it is neither aggregated nor present in the GROUP BY.
You can change the correlated subqueries to faster LEFT JOINs (assuming the User_Id is unique in the Patients table).
select
datename(month, p.EntryDate) mon,
count(case when u.Gender = 'Female' then 1 end) female_cnt,
count(case when u.Gender = 'Male' then 1 end) male_cnt
from [Patients] p left join [Users] u
on p.User_Id = u.Id
group by datename(month, p.EntryDate);
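Here is a small sketch of the same join plus conditional COUNT, run on SQLite through Python's sqlite3 module. The rows are made up to match the expected result in the question, the tables are cut down to the columns the query needs, and strftime('%m', ...) stands in for datename(month, ...), which SQLite does not have, so months come back as numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (Id INTEGER PRIMARY KEY, Gender TEXT);
CREATE TABLE Patients (
    Id INTEGER PRIMARY KEY, EntryDate TEXT, User_Id INTEGER NOT NULL
);
-- Hypothetical rows: two June entries (different years), two November entries.
INSERT INTO Users VALUES (1, 'Female'), (2, 'Female'), (3, 'Male'), (4, 'Female');
INSERT INTO Patients (EntryDate, User_Id) VALUES
    ('2016-06-03', 1), ('2017-06-20', 2), ('2016-11-05', 3), ('2016-11-09', 4);
""")

# COUNT ignores NULL, so each CASE counts only rows of one gender.
result = conn.execute("""
SELECT strftime('%m', p.EntryDate) AS mon,
       COUNT(CASE WHEN u.Gender = 'Female' THEN 1 END) AS female_cnt,
       COUNT(CASE WHEN u.Gender = 'Male'   THEN 1 END) AS male_cnt
FROM Patients p
LEFT JOIN Users u ON p.User_Id = u.Id
GROUP BY mon
ORDER BY mon
""").fetchall()

print(result)
```

Because the grouping key is the month alone, the June rows from 2016 and 2017 land in the same group, which is the "irrespective of year" behaviour the question asks for.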
EDIT:
you can use a lookup CTE to generate all months and then do LEFT JOIN with it like this:
;WITH months(mn, mon) AS
(
SELECT 1, DATENAME(MONTH,DATEADD(month,0,GETDATE())) mon
UNION ALL
SELECT mn+1, DATENAME(MONTH,DATEADD(MONTH,mn,GETDATE()))
FROM months
WHERE mn < 12
)
select
m.mon mon,
count(case when u.Gender = 'Female' then 1 end) female_cnt,
count(case when u.Gender = 'Male' then 1 end) male_cnt
from months m left join [Patients] p
on m.mon = datename(month, p.EntryDate)
left join [Users] u
on p.User_Id = u.Id
group by m.mon;
You can try the following query:
Select
datename(month, p.EntryDate) month,
COUNT(IIF(u.Gender = 'Male', 1, NULL)) AS MaleCount,
COUNT(IIF(u.Gender = 'Female', 1, NULL)) AS FemaleCount
from
[HospitalManagement].[dbo].[Patients] p,[HospitalManagement].[dbo].[Users] u
where
u.Id = p.User_Id
group by
datename(month, p.EntryDate)
In MySQL this is written with IF(). IF() is not standard SQL and SQL Server does not have it as a function; SQL Server 2012 and later offer the equivalent IIF(), used above. On older versions, use a CASE expression instead.
Select datename(month, p.EntryDate) month, SUM(CASE WHEN u.Gender = 'Male' THEN 1 ELSE 0 END) MaleCount, SUM(CASE WHEN u.Gender = 'Female' THEN 1 ELSE 0 END) FemaleCount
from [HospitalManagement].[dbo].[Patients] p,[HospitalManagement].[dbo].[Users] u
where u.Id = p.User_Id
group by datename(month, p.EntryDate)
Each CASE yields 1 for a row of the given gender and 0 otherwise, so the SUM is the count for that gender.
Note: If there are any syntax errors please let me know with the error, I can correct it.

Count entries across three tables based on month in SQL or LINQ

I would like to extract some data from three tables in a SQL Server 2005 database. While this can surely be done in code, it seems like this could be done reasonably well in SQL (bonus points for LINQ!).
Basically, I would like to know for each month how many calls and meetings each employee has held with each of our clients. Something like this:
Employee GUID Customer GUID Jan calls Jan mtgs Feb calls Feb mtgs...
[a guid] [another guid] 5 0 7 3
The data is spread across three tables. For simplicity's sake, let's just show the relevant columns:
Communications Table
[CommunicationId] (PK, uniqueidentifier)
[Type] (nvarchar(1)) ('C' for call, 'M' for meeting, etc.)
[Date] (datetime)
Person-Communication Table
[PersonId] (PK, FK, uniqueidentifier) (Can contain GUIDs for employees or clients, see Person Table below)
[CommunicationId] (PK, FK, uniqueidentifier)
Person Table
[PersonId] (PK, uniqueidentifier)
[Type] (nvarchar(1)) ('E' for employee, 'C' for customer)
So, the questions:
Can this be done in SQL without horrendous code or big performance problems?
If so, how? I'd even settle for a good high-level strategy. I'm guessing pivots will play a big role here (particularly the "Complex PIVOT Example"). DATEPART(MONTH, Date) seems like a good method for partitioning the communications by month along the lines of:
SELECT DATEPART(MONTH, Date), COUNT(*)
FROM [CommunicationTable]
WHERE DATEPART(YEAR, Date) = '2009'
GROUP BY DATEPART(MONTH, Date)
ORDER BY DATEPART(MONTH, Date)
... which gets me the number of communications in each month in 2009:
1 2871
2 2639
3 3654
4 2751
5 1773
6 2575
7 2906
8 2398
9 2621
10 2638
11 1705
12 2290
Non-PIVOT syntax, using CASE:
WITH summary AS (
SELECT emp.personid AS emp_guid,
cust.personid AS cust_guid,
DATEPART(MONTH, ct.date) AS mon,
ct.type,
COUNT(*) AS num_count
FROM COMMUNICATIONTABLE ct
LEFT JOIN PERSON_COMMUNICATION pc ON pc.communicationid = ct.communicationid
JOIN PERSON emp ON emp.personid = pc.personid
AND emp.type = 'E'
-- Second link row: the customer on the same communication
JOIN PERSON_COMMUNICATION pc2 ON pc2.communicationid = ct.communicationid
JOIN PERSON cust ON cust.personid = pc2.personid
AND cust.type = 'C'
WHERE ct.date BETWEEN '2009-01-01' AND '2009-12-31'
GROUP BY emp.personid, cust.personid, DATEPART(MONTH, ct.date), ct.type)
SELECT s.emp_guid,
s.cust_guid,
MAX(CASE WHEN s.mon = 1 AND s.type = 'C' THEN s.num_count ELSE 0 END) AS "Jan calls",
MAX(CASE WHEN s.mon = 1 AND s.type = 'M' THEN s.num_count ELSE 0 END) AS "Jan mtgs",
... --Copy/Paste two lines, update the month check... and the col alias
FROM summary s
GROUP BY s.emp_guid, s.cust_guid
Use WHERE ct.date BETWEEN '2009-01-01' AND '2009-12-31' because WHERE DATEPART(YEAR, Date) = '2009' wraps the column in a function, so it can't use an index if one exists on the date column.
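One thing to watch with a range filter on a datetime column is the upper bound: '2009-12-31' means midnight at the start of that day, so BETWEEN leaves out anything later on December 31. A half-open range (>= start of year, < start of next year) avoids that and still leaves the column bare for an index. The sketch below illustrates the difference with made-up rows on SQLite through Python's sqlite3 module; SQLite compares these values as text rather than as datetimes, but the boundary behaviour for this case is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CommunicationTable (Type TEXT, Date TEXT)")
conn.executemany(
    "INSERT INTO CommunicationTable VALUES (?, ?)",
    [
        ("C", "2009-01-01 00:00:00"),
        ("M", "2009-06-15 09:30:00"),
        ("C", "2009-12-31 15:45:00"),  # afternoon of the last day of 2009
        ("C", "2010-01-01 00:00:00"),  # belongs to 2010
    ],
)

# Closed range: the upper bound is midnight at the start of Dec 31.
between = conn.execute(
    "SELECT COUNT(*) FROM CommunicationTable"
    " WHERE Date BETWEEN '2009-01-01' AND '2009-12-31'"
).fetchone()[0]

# Half-open range: everything from Jan 1 2009 up to, not including, Jan 1 2010.
half_open = conn.execute(
    "SELECT COUNT(*) FROM CommunicationTable"
    " WHERE Date >= '2009-01-01' AND Date < '2010-01-01'"
).fetchone()[0]

print(between, half_open)
```

The BETWEEN form misses the afternoon call on December 31, while the half-open form counts all three 2009 rows.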
This should get you started. I did one month of one year for you; you can also add date range restrictions:
SELECT PE.PersonID as EmployeeID,PC2.PersonID as CustomerID,
SUM(CASE WHEN DATEPART(MONTH, C.[Date]) = 1
AND DATEPART(YEAR,C.[Date]) = 2009
AND C.[type] = 'C' THEN 1 ELSE 0 END) AS [Jan 2009 Calls]
FROM PersonTable PE
JOIN PersonCommunicationTable PC ON PE.PersonID = PC.PersonID
JOIN CommunicationsTable C ON PC.CommunicationID = C.CommunicationID
JOIN PersonCommunicationTable PC2 ON PC.CommunicationID = PC2.CommunicationID AND NOT PC2.PersonID = PC.PersonID
WHERE PE.Type = 'E'
GROUP BY PE.PersonID, PC2.PersonID
Here is a reasonably equivalent solution using Pivot.
CREATE TABLE #Comm
(
[CommunicationId] uniqueidentifier PRIMARY KEY DEFAULT NEWID(),
[Type] nvarchar(1), -- ('C' for call, 'M' for meeting, etc.)
[Date] datetime
)
CREATE TABLE #Person
(
[PersonId] uniqueidentifier PRIMARY KEY DEFAULT NEWID(),
[Type] Nvarchar(1) -- ('E' for employee, 'C' for customer)
)
CREATE TABLE #PersonComm
(
[PersonId] uniqueidentifier, -- (Can contain GUIDs for employees or clients, see Person Table below)
[CommunicationId] uniqueidentifier
)
INSERT INTO #Person(Type)
Select 'C' UNION ALL Select 'E' UNION ALL Select 'C' UNION ALL Select 'E'
INSERT INTO #Comm([Type],[Date])
Select 'C', '01/04/2010' UNION ALL Select 'C', '01/04/2010'
UNION ALL Select 'C', '04/04/2010' UNION ALL Select 'C', '05/01/2010'
UNION ALL Select 'C', '08/04/2009' UNION ALL Select 'C', '09/01/2009'
UNION ALL Select 'M', '01/04/2010' UNION ALL Select 'M', '03/20/2010'
UNION ALL Select 'M', '04/04/2010' UNION ALL Select 'M', '06/01/2010'
UNION ALL Select 'M', '04/10/2009' UNION ALL Select 'M', '04/10/2009'
INSERT INTO #PersonComm
Select E.PersonID , Comm.[CommunicationId]
FROM #Person E
,#Comm Comm
Where E.[Type] = 'E'
INSERT INTO #PersonComm
Select E.PersonID , Comm.[CommunicationId]
FROM #Person E
,#Comm Comm
Where E.[Type] = 'C'
Select EmployeeID,
ClientID,
Year,
[JanuaryC] AS [Jan Calls],
[JanuaryM] AS [Jan Meetings],
[FebruaryC],
[FebruaryM],
[MarchC],
[MarchM],
[AprilC],
[AprilM],
[MayC],
[MayM],
[JuneC],
[JuneM],
[JulyC],
[JulyM],
[AugustC],
[AugustM],
[SeptemberC] ,
[SeptemberM],
[OctoberC] ,
[OctoberM],
[NovemberC],
[NovemberM],
[DecemberC],
[DecemberM]
FROM
(
Select P.PersonId EmployeeID, Client.PersonId ClientID, YEAR(C.Date) Year, DateName(m,C.Date) Month, COUNT(*) Amount, C.Type CommType,
DateName(m,C.Date) + C.Type PivotColumn -- JanuaryC
FROM #Comm C
INNER JOIN #PersonComm PC
ON PC.CommunicationId = C.CommunicationId
INNER JOIN #Person P
ON P.PersonId = PC.PersonId
INNER JOIN #PersonComm PCC
ON PCC.CommunicationId = PC.CommunicationId
INNER JOIN #Person Client
ON Client.PersonId = PCC.PersonId AND Client.Type = 'C'
Where P.Type = 'E'
Group By P.PersonId, CLient.PersonId, YEAR(C.Date), DateName(m,C.Date), C.Type
) SourceTable
PIVOT (
MAX(Amount)
FOR PivotColumn IN
([JanuaryC], [JanuaryM],[FebruaryC], [FebruaryM],[MarchC], [MarchM], [AprilC], [AprilM], [MayC], [MayM], [JuneC], [JuneM], [JulyC], [JulyM],
[AugustC], [AugustM],[SeptemberC] , [SeptemberM],[OctoberC] ,[OctoberM],[NovemberC], [NovemberM], [DecemberC], [DecemberM]
)
)As PivotTable