Multiple joins to single table in SQL Server query throwing off counts - sql

I am writing a stored procedure in SQL Server Management Studio 2005 to return a list of states and policy counts for two different time periods, month to date and year to date. I have created a couple of views to gather the required data and a stored procedure for use in a Reporting Services report.
Below is my stored procedure:
SELECT DISTINCT
S.[State],
COUNT(HP_MTD.PolicyID) AS PolicyCount_MTD,
COUNT(HP_YTD.PolicyID) AS PolicyCount_YTD
FROM tblStates S
LEFT OUTER JOIN vwHospitalPolicies HP_MTD ON S.[State] = HP.[State]
AND HP.CreatedDate BETWEEN DATEADD(MONTH, -1, GETDATE()) AND GETDATE()
LEFT OUTER JOIN vwHospitalPolicies HP_YTD ON S.[State] = HP.[State]
AND HP.CreatedDate BETWEEN DATEADD(YEAR, -1, GETDATE()) AND GETDATE()
GROUP BY S.[State]
ORDER BY S.[State] ASC
The problem I am running into is my counts are bloating when a second LEFT OUTER JOIN is added, even the COUNT() that isn't referencing the second join. I need a left join since not all states will have policies for the given period, but they should still appear on the report.
Any suggestions would be greatly appreciated!

It sounds like you need:
COUNT(DISTINCT HP_MTD.PolicyID) AS PolicyCount_MTD,
COUNT(DISTINCT HP_YTD.PolicyID) AS PolicyCount_YTD
instead of:
COUNT(HP_MTD.PolicyID) AS PolicyCount_MTD,
COUNT(HP_YTD.PolicyID) AS PolicyCount_YTD
Your original query is including the number of matching rows in the second join. Adding a DISTINCT clause inside the COUNT limits it to unique occurrences of the PolicyID.

I prefer CTEs for this sort of work - they're basically a sort of inline view.
Your actual problem is that some policies are being counted twice - once for the month-to-date, and once for the year-to-date - in both count columns. So, if you have source tables like this:
year-to-date
state policy
==================
1 1
1 2
1 3
2 4
month-to-date
state policy
==================
1 1
1 2
The result table for the JOINs (before COUNT is assesed) looks like this:
temp
state monthPolicy yearPolicy
================================
1 1 1
1 1 2
1 1 3
1 2 1
1 2 2
1 2 3
2 - 4
Clearly not what you want. Often, the solution is to use a CTE or other table reference to present summed-up records for the final join. Something like this:
WITH PoliciesMonthToDate (state, count) as (
SELECT state, COUNT(*)
FROM vwHospitalPolicies
WHERE createdDate >= DATEADD(MONTH, -1, GETDATE())
AND createdDate < DATEADD(DAY, 1, GETDATE())
GROUP BY state),
PoliciesYearToDate (state, count) as (
SELECT state, COUNT(*)
FROM vwHospitalPolicies
WHERE createdDate >= DATEADD(YEAR, -1, GETDATE())
AND createdDate < DATEADD(DAY, 1, GETDATE())
GROUP BY state)
SELECT a.state, COALESCE(b.count, 0) as policy_count_mtd,
COALESCE(c.count, 0) as policy_count_ytd
FROM tblStates a
LEFT JOIN PoliciesMonthToDate b
ON b.state = a.state
LEFT JOIN PoliciesYearToDate c
ON c.state = a.state
ORDER BY a.state
(minor nitpick - don't use prefixes for tables and views. It's noise, and if for some reason you switch between them, it would require a code change. That, or the name would then be misleading)

Related

Two Table Join Into One Result

I have two tables where I am attempting to join the results into one. I am trying to get the INV_QPC which is the case pack size shown in the results (SEIITN and SKU) are the same product numbers.
The code below gives two results, but the goal is to get the bottom result into the main output, where I was hoping the join would be the lookup to show the case pack size in relation to SKU.
INV_QPC = case pack size
SKU = SKU/Product Number
SEIITN = SKU/Product Number
Thanks for looking.
SELECT
ORDER_QTY, SKU, INVOICE_NUMBER, CUSTOMER_NUMBER, ROUTE,
ALLOCATED_QTY, SHORTED_QTY, PRODUCTION_DATE,
DATEPART(wk, PRODUCTION_DATE) AS FISCAL_WEEK,
YEAR(PRODUCTION_DATE) AS FISCAL_YEAR,
CONCAT(SKU, CUSTOMER_NUMBER) AS SKU_STORE_WEEK
FROM
[database].[dbo].[ORDERS]
WHERE
[PRODUCTION_DATE] >= DATEADD(day, -3, GETDATE())
AND [PRODUCTION_DATE] <= GETDATE()
SELECT INV_QPC
FROM [database].[dbo].[PRODUCT_MASTER]
JOIN [database].[dbo].[ORDERS] ON ORDERS.SKU = PRODUCT_MASTER.SEIITN;
It looks like you are on the right track, but your second SQL statement is only returning the INV_QPC column, so it is not being joined to the first query. Here is an updated SQL statement that should give you the result you are looking for:
SELECT
ORD.ORDER_QTY, ORD.SKU, ORD.INVOICE_NUMBER, ORD.CUSTOMER_NUMBER, ORD.ROUTE,
ORD.ALLOCATED_QTY, ORD.SHORTED_QTY, ORD.PRODUCTION_DATE,
DATEPART(wk, ORD.PRODUCTION_DATE) AS FISCAL_WEEK,
YEAR(ORD.PRODUCTION_DATE) AS FISCAL_YEAR,
CONCAT(ORD.SKU, ORD.CUSTOMER_NUMBER) AS SKU_STORE_WEEK,
PROD.INV_QPC
FROM
[database].[dbo].[ORDERS] ORD
JOIN [database].[dbo].[PRODUCT_MASTER] PROD ON ORD.SKU = PROD.SEIITN
WHERE
ORD.PRODUCTION_DATE >= DATEADD(day, -3, GETDATE())
AND ORD.PRODUCTION_DATE <= GETDATE()
In this query, I have added the INV_QPC column to the SELECT statement, and also included the join condition in the JOIN clause. Additionally, I have given aliases to the tables in the FROM and JOIN clauses to make the query easier to read. Finally, I have updated the WHERE clause to reference the ORD alias instead of the table name directly.

How to filter Users that meet CASE criteria without nesting WHERE in SQL?

Right now I have a query that lets me know which users didn't make a purchase 12 months prior to becoming members. These users have MEM_PRE_12=0 and I want to filter off those users more natively using SQL partitions rather than always putting rudimentary WHERE criteria.
Here is the SQL I use to find the users I want/don't want.
SELECT SUM(CASE WHEN DATE <= DATEADD(month, -12, U.INSERTED_AT) THEN 1 ELSE 0 END) AS MEM_PRE_12, I.CLIENTID, I.INSTALLATIONID
FROM <<<My_Joined_Tables>>>
GROUP BY I.CLIENTID, I.INSTALLATIONID
HAVING MEM_PRE_12 != 0
ORDER BY MEM_PRE_12
After this I'm going to have to go back and say where I.CLIENTID in the above nested query and select the actual information I want from users who made purchases greater than their insertion date.
How can I do this without so much nesting of all these joined tables?
If you want the detailed rows for customers who made a purchase in the last 12 months, you can use window functions:
with q as (
<whatever your query logic is>
)
select q.*
from (select q.*,
SUM(CASE WHEN DATE <= DATEADD(month, -12, U.INSERTED_AT) THEN 1 ELSE 0 END) over (partition by CLIENTID, INSTALLATIONID) as AS MEM_PRE_12
from q
) q
where mem_pre_12 > 0;

SQL with as expression shows multiple results

I am writing a SQL query using with as expression. I always get a result in the square of what I required.
This is my query:
DECLARE #MAX_DATE AS INT
SET #MAX_DATE = (SELECT DATEPART(MONTH,FECHA) FROM ALBVENTACAB WHERE NUMALBARAN IN (SELECT DISTINCT MAX(NUMALBARAN) FROM ALBVENTACAB));
;WITH TABLE_LAST AS (
SELECT CONCAT(DATEPART(MONTH,FECHA),'-',DATEPART(YEAR,FECHA)) as LAST_YEAR_MONTH
,SUM(TOTALNETO) AS LAST_YEAR_VALUE
FROM ALBVENTACAB
WHERE DATEPART(YEAR,CURRENT_TIMESTAMP) -1 = DATEPART(YEAR,FECHA) AND NUMSERIE LIKE 'A%'
AND DATEPART(MONTH,FECHA) <= #MAX_DATE
GROUP BY CONCAT(DATEPART(MONTH,FECHA),'-',DATEPART(YEAR,FECHA))
)
,TABLE_CURRENT AS(
SELECT CONCAT(DATEPART(MONTH,FECHA),'-',DATEPART(YEAR,FECHA)) as CURR_YEAR_MONTH
,SUM(TOTALNETO) AS CURR_YEAR_VALUE
FROM ALBVENTACAB
WHERE DATEPART(YEAR,CURRENT_TIMESTAMP) <= DATEPART(YEAR,FECHA) AND NUMSERIE LIKE 'A%'
GROUP BY CONCAT(DATEPART(MONTH,FECHA),'-',DATEPART(YEAR,FECHA))
)
SELECT *
FROM TABLE_CURRENT, TABLE_LAST
When I run the query I get exactly the square of the result.
I want to compare sale monthly with last year.
2-2020 814053.3 2-2019 840295.1
1-2020 1094993.65 2-2019 840295.1
3-2020 293927.3 2-2019 840295.1
2-2020 814053.3 1-2019 1050701.68
1-2020 1094993.65 1-2019 1050701.68
3-2020 293927.3 1-2019 1050701.68
2-2020 814053.3 3-2019 887776.1
1-2020 1094993.65 3-2019 887776.1
3-2020 293927.3 3-2019 887776.1
I should get only 3 rows instead of 9 rows.
You need to properly join your two CTE - the way you're doing it now, you're getting a Cartesian product of each row in either CTE together.
Do something like:
*;WITH TABLE_LAST AS
( ....
),
TABLE_CURRENT AS
( ....
)
SELECT *
FROM TABLE_CURRENT curr
INNER JOIN TABLE_LAST last ON (some join condition here)
What that join condition is going to be - I have no idea, and cannot tell from your question - but you have to define how these two sets of data "connect" ....
It could be something like:
SELECT *
FROM TABLE_CURRENT curr
INNER JOIN TABLE_LAST last ON curr.CURR_YEAR_MONTH = last.LAST_YEAR_MONT
or whatever else makes sense in your situation - but basically, you need to somehow "tie together" these two sets of data and get only those rows that make sense - not just every row from "last" combined with every row from "curr" ....
While you already got the answer on how to join the two results, I thought I'd tell you how to typically approach such problems.
From the same table, you want two sums on different conditions (different years that is). You solve this with conditional aggregation, which does just that: aggregate (sum) based on a condition (year).
select
datepart(month, fecha) as month,
sum(case when datepart(year, fecha) = datepart(year, getdate()) then totalneto end) as this_year,
sum(case when datepart(year, fecha) = datepart(year, getdate()) -1 then totalneto end) as last_year
from albventacab
where numserie like 'A%'
and fecha > dateadd(year, -2, getdate())
group by datepart(month, fecha)
order by datepart(month, fecha);

Trouble running a complex query in sql?

I am pretty new to SQL Server and just started playing with it. I am trying to create a table that shows attendance percentage by department.
So first i run this query:
SELECT CrewDesc, COUNT(*)
FROM database.emp
INNER JOIN database.crew on sim1 = sim2
GROUP BY CrewDesc
This gives a table like this:
Accounting 10
Marketing 5
Economics 20
Engineering 5
Machinery 5
Tech Support 10
Then i run another query:
SELECT DeptDescription, COUNT(*)
FROM database.Attendee
GROUP BY DeptDescription
This gives me the result of all the people that have attended meeting something like
Accounting 8
Marketing 5
Economics 15
Engineering 10
Tech Support 8
Then I get the current week in the year by SELECT Datepart(ww, GetDate()) as CurrentWeek To make this example easy lets assume this will be week "2".
Now the way i was going to create this was a table for each step but that seems like waste. Is there a way we can combine to tables in a query? So in the end result i would like a table like this
Total# Attd Week (Total*Week) Attd/(Total*week)%
Accounting 10 8 2 20 8/20
Marketing 5 5 2 10 5/10
Economics 20 15 2 40 15/40
Engineering 5 10 2 10 10/10
Machinery 5 NULL 2 10 0/10
Tech Support 10 8 2 20 8/20
Ok, note that my recommendation below is based on your exact existing queries - there are certainly other ways to construct this that may be more performant, but functionally this should work for your requirement. Also, it illustrates the key features of different join types that happen to be relevant for your request, as well as inline views (aka nested queries), which are a super-powerful technique in the SQL language as a whole.
select t1.CrewDesc, t1.Total, t2.Attd, t3.Week,
(t1.Total*t3.Week) as Total_x_Week,
case when isnull(t1.Total*t3.Week, 0) = 0 then 0 else isnull(t2.Attd, 0) / isnull(t1.Total*t3.Week, 0) end as PercentageAttd
from (
SELECT CrewDesc, COUNT(*) AS Total
FROM database.emp INNER JOIN database.crew on sim1 = sim2
GROUP BY CrewDesc
) t1
left outer join /* left outer to keep all rows from t1 */ (
SELECT DeptDescription, COUNT(*) AS Attd
FROM database.Attendee GROUP BY DeptDescription
) t2
on t1.CrewDesc = t2.DeptDescription
cross join /* useful when adding a scalar value to all rows */ (
SELECT Datepart(ww, GetDate()) as Week
) t3
order by t1.CrewDesc
Good luck!
Try something like this
SELECT COALESCE(a.crewdesc,b.deptdescription),
a.total,
b.attd,
Datepart(ww, Getdate()) AS week,
total * Datepart(ww, Getdate()),
b.attd/(a.total*Datepart(ww, Getdate()))
FROM (query 1) a
FULL OUTER JOIN (query 2) b
ON a.crewdesc = b.deptdescription
WITH Total AS ( SELECT CrewDesc, COUNT(*) AS [Count]
FROM database.emp
INNER JOIN database.crew on sim1 = sim2
GROUP BY CrewDesc
),
Attd AS ( SELECT DeptDescription, COUNT(*) AS [Count]
FROM database.Attendee
GROUP BY DeptDescription
)
SELECT COALESCE(CrewDesc,DeptDescription) AS [Dept],
Total.[Count] AS [Total#],Attd.[Count] AS [Attd],
Total.[Count] * Datepart(ww, GetDate()) AS [(Total*Week)],
CAST(Attd.[Count] AS VARCHAR(10))+'/'+ CAST((Total.[Count] * Datepart(ww, GetDate()))AS VARCHAR(10)) AS [Attd/(Total*week)%]
FROM Total INNER JOIN Attd ON Total.CrewDesc = Attd.DeptDescription
I'm assuming your queries are correct -- you give no real information about your model so I've no way to know. They look wrong since the same data is called CrewDesc in one table and Dept in another. Also the join sim1 = sim2 seems very strange to me. In any case given the queries you posted this will work.
With TAttend as
(
SELECT CrewDesc, COUNT(*) as TotalNum
FROM database.emp
INNER JOIN database.crew on sim1 = sim2
GROUP BY CrewDesc
), Attend as
(
SELECT DeptDescription, COUNT(*) as Attd
FROM database.Attendee
GROUP BY DeptDescription
)
SELECT CrewDesc as Dept, TotalNum, ISNULL(Attd, 0) as Attd ,Datepart(ww, GetDate()) as Week,
CASE WHEN ISNULL(Attd, 0) > 0 THEN 0
ELSE ISNULL(Attd, 0) / (TotalNum * Datepart(ww, GetDate()) ) END AS Percent
FROM TAttend
LEFT JOIN Attend on CrewDesc = DeptDescription

Count records with a criteria like "within days"

I have a table as below on sql.
OrderID Account OrderMethod OrderDate DispatchDate DispatchMethod
2145 qaz 14 20/3/2011 23/3/2011 2
4156 aby 12 15/6/2011 25/6/2011 1
I want to count all records that have reordered 'within 30 days' of dispatch date where Dispatch Method is '2' and OrderMethod is '12' and it has come from the same Account.
I want to ask if this all can be achieved with one query or do I need to create different tables and do it in stages as I think I wll have to do now? Please can someone help with a code/query?
Many thanks
T
Try the following, replacing [tablename] with the name of your table.
SELECT Count(OriginalOrders.OrderID) AS [Total_Orders]
FROM [tablename] AS OriginalOrders
INNER JOIN [tablename] AS Reorders
ON OriginalOrders.Account = Reorders.Account
AND OriginalOrders.OrderDate < Reorders.OrderDate
AND DATEDIFF(day, OriginalOrders.DispatchDate, Reorders.OrderDate) <= 30
AND Reorders.DispatchMethod = '2'
AND Reorders.OrderMethod = '12';
By using an inner join you'll be sure to only grab orders that meet all the criteria.
By linking the two tables (which are essentially the same table with itself using aliases) you make sure only orders under the same account are counted.
The results from the join are further filtered based on the criteria you mentioned requiring only orders that have been placed within 30 days of the dispatch date of a previous order.
Totally possible with one query, though my SQL is a little stale..
select count(*) from table
where DispatchMethod = 2
AND OrderMethod = 12
AND DATEDIFF(day, OrderDate, DispatchDate) <= 30;
(Untested, but it's something similar)
One query can do it.
SELECT COUNT(*)FROM myTable reOrder
INNER JOIN myTable originalOrder
ON reOrder.Account = originalOrder.Account
AND reOrder.OrderID <> originalOrder.OrderID
-- all re-orders that are within 30 days or the
-- original orders dispatch date
AND DATEDIFF(d, originalOrder.DispatchDate, reOrder.OrderDate) <= 30
WHERE reOrder.DispatchMethod = 2
AND reOrder.OrderMethod = 12
You need a self-join.
The query below assumes that a given account will have either 1 or 2 records in the table - 2 if they've reordered, else 1.
If 3 records exist for a given account, 2 orders + 1 reorder then this won't work - but we'd then need more information on how to distinguish between an order and a reorder.
SELECT COUNT(*) FROM myTable new, myTable prev
WHERE new.DispatchMethod = 2
AND new.OrderMethod = 12
AND DATEDIFF(day, prev.DispatchDate, new.OrderDate) <=30
AND prev.Account == new.Account
AND prev.OrderDate < new.OrderDate
Can we use GROUP BY in this case, such as the following?
SELECT COUNT(Account)
FROM myTable
WHERE DispatchMethod = 2 AND OrderMethod = 12
AND DATEDIFF(d, DispatchDate, OrderDate) <=30
GROUP BY Account
Will the above work or am I missing something here?