SQL - Group data with same ID and Date that has been to every Machine but has a different Name - sql

I am trying to create a query that groups data by CT ID and Date and returns only the groups that contain all 3 MachineIDs (1, 10, and 20) and at least one differing Sawing Pattern Name.
This image shows a highlighted example of the data I'm trying to get back and the code I'm currently using.
I only want to show data like the highlighted rows in the image (CT ID 501573833) and exclude the surrounding rows, where the Sawing Pattern Name is the same at all 3 MachineIDs.

Your description suggests group by and having. The conditions you describe can all go in the having clause:
select ct_id, date
from t
group by ct_id, date
having sum(case when machineid = 1 then 1 else 0 end) > 0 and
sum(case when machineid = 10 then 1 else 0 end) > 0 and
sum(case when machineid = 20 then 1 else 0 end) > 0 and
min(sawing_pattern_name) <> max(sawing_pattern_name)
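
The HAVING conditions above can be tried on a few rows. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) stands in for the real database; the table name t and column names follow the answer, and the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; only CT ID 501573833 comes from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (ct_id INTEGER, date TEXT, machineid INTEGER, sawing_pattern_name TEXT)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", [
    (501573833, "2020-01-06", 1, "A"),
    (501573833, "2020-01-06", 10, "A"),
    (501573833, "2020-01-06", 20, "B"),  # one name differs: group qualifies
    (501573834, "2020-01-06", 1, "A"),
    (501573834, "2020-01-06", 10, "A"),
    (501573834, "2020-01-06", 20, "A"),  # same name on all machines: excluded
    (501573835, "2020-01-06", 1, "A"),
    (501573835, "2020-01-06", 10, "B"),  # machine 20 missing: excluded
])

query = """
SELECT ct_id, date
FROM t
GROUP BY ct_id, date
HAVING SUM(CASE WHEN machineid = 1 THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN machineid = 10 THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN machineid = 20 THEN 1 ELSE 0 END) > 0
   AND MIN(sawing_pattern_name) <> MAX(sawing_pattern_name)
"""
matching_groups = conn.execute(query).fetchall()
print(matching_groups)
```

MIN and MAX differ only when the group holds at least two distinct names, which is why that single comparison is enough to detect "at least one different" pattern.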

Seems to me that an EXISTS could be useful here.
SELECT
[CT ID],
[MachineID],
[Sawing Pattern name],
[Time],
CAST([Time] AS DATE) AS [Date]
FROM [DataCollector].[dbo].[Maxicut] t
WHERE EXISTS
(
SELECT 1
FROM [DataCollector].[dbo].[Maxicut] d
WHERE d.[CT ID] = t.[CT ID]
AND CAST(d.[Time] AS DATE) = CAST(t.[Time] AS DATE)
AND d.[MachineID] != t.[MachineID]
AND REPLACE(d.[Sawing Pattern name],',','') != REPLACE(t.[Sawing Pattern name],',','')
);

Related

A SQL query for the retrieval of result based on input month

Table 1
Table 2
My requirement is to input the Redemption month and list the tickets that have been scanned twice or more.
For example, ticket no. T1 has been scanned 2 times under Pickup, only once under PickupOutforDelivery, and 2 times under Delivery.
Result needed like this:
How can I write a query to get the result like this?
Tried:
SELECT
Ticket,
COUNT(Scantype = 0) AS Pickup,
COUNT(Scantype = 1) AS PickupOutforDelivery,
COUNT(Scantype = 2) AS Delivery
FROM
Scans
GROUP BY
Ticket, ScanType
HAVING
(Pickup > 1 OR PickupOutforDelivery > 1 OR Delivery > 1)
OR (Pickup >= 1 AND PickupOutforDelivery >= 1)
ORDER BY
Ticket
Result
Assuming that RedemptionMonth has a datatype of DATE (which is clearly required), the following query will give you the result you want, except for the "cosmetic" part (breaking the report by year and month), which you have to do in your application:
SELECT YEAR(RedemptionMonth) AS [YEAR], MONTH(RedemptionMonth) AS [MONTH], TicketNo,
COALESCE(SUM(CASE WHEN ScanName = 'Pickup' THEN 1 ELSE 0 END), 0) AS Pickup,
COALESCE(SUM(CASE WHEN ScanName = 'PickupOutForDelivery' THEN 1 ELSE 0 END), 0) AS PickupOutForDelivery ,
COALESCE(SUM(CASE WHEN ScanName = 'Delivery' THEN 1 ELSE 0 END), 0) AS Delivery
FROM [Table 1] AS T1
JOIN [Table 2] AS T2
ON T1.ScanType = T2.ScanType
GROUP BY YEAR(RedemptionMonth), MONTH(RedemptionMonth), TicketNo
Because you are using the numeric value of ScanType, no JOIN is needed. The only fix needed is to use conditional aggregation:
SELECT Ticket,
SUM(CASE WHEN Scantype = 0 THEN 1 ELSE 0 END) as Pickup,
SUM(CASE WHEN Scantype = 1 THEN 1 ELSE 0 END) as PickupOutforDelivery,
SUM(CASE WHEN Scantype = 2 THEN 1 ELSE 0 END) as Delivery
FROM Scans
WHERE redemptionMonth = 'Jan-21'
GROUP BY Ticket
HAVING Pickup > 1 OR
PickupOutforDelivery > 1 OR
Delivery > 1 OR
(Pickup >= 1 AND PickupOutforDelivery >= 1)
ORDER BY Ticket;
Note that you can add redemptionMonth to the GROUP BY (and SELECT) to get the results for each month.
If redemptionMonth is really a date and not a string, then define the time period using a range of dates:
WHERE redemptionMonth >= '2021-01-01' AND
redemptionMonth < '2021-02-01'
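
To see the conditional aggregation and the date-range filter working together, here is a small sketch, assuming SQLite via Python's sqlite3 module and invented sample rows. It filters on the counts in an outer query rather than using aliases in HAVING, because some engines (SQL Server, for one) do not accept column aliases there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Scans (Ticket TEXT, Scantype INTEGER, redemptionMonth TEXT)")
# Hypothetical rows: T1 mirrors the example in the question.
conn.executemany("INSERT INTO Scans VALUES (?, ?, ?)", [
    ("T1", 0, "2021-01-05"), ("T1", 0, "2021-01-06"),  # Pickup x2
    ("T1", 1, "2021-01-07"),                           # PickupOutforDelivery x1
    ("T1", 2, "2021-01-08"), ("T1", 2, "2021-01-09"),  # Delivery x2
    ("T2", 0, "2021-01-10"), ("T2", 2, "2021-01-11"),  # nothing scanned twice
    ("T3", 0, "2021-02-01"), ("T3", 0, "2021-02-02"),  # outside January
])

query = """
SELECT Ticket, Pickup, PickupOutforDelivery, Delivery
FROM (
    SELECT Ticket,
           SUM(CASE WHEN Scantype = 0 THEN 1 ELSE 0 END) AS Pickup,
           SUM(CASE WHEN Scantype = 1 THEN 1 ELSE 0 END) AS PickupOutforDelivery,
           SUM(CASE WHEN Scantype = 2 THEN 1 ELSE 0 END) AS Delivery
    FROM Scans
    WHERE redemptionMonth >= '2021-01-01'
      AND redemptionMonth <  '2021-02-01'
    GROUP BY Ticket
) AS per_ticket
WHERE Pickup > 1
   OR PickupOutforDelivery > 1
   OR Delivery > 1
   OR (Pickup >= 1 AND PickupOutforDelivery >= 1)
ORDER BY Ticket
"""
repeated_scans = conn.execute(query).fetchall()
print(repeated_scans)
```

The half-open range (>= first day, < first day of the next month) keeps every January row regardless of how many days the month has.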

Inline Table Join Multiplying Results

The query below joins two views and one inline table to another inline table. When I run the query without table FI, all of the SUM values return correctly; however, when I run the query with table FI, all of the SUM values from vw_Interactions are multiplied and returned incorrectly (SUM values from vw_LeadInteractions are not affected).
vw_Interactions is a transactional log and returns a 1 in each column where that measure is true (ex: a 1 is returned in I.[Call] where a phone call was logged), and vw_LeadInteractions is the same except it returns the Client's ID.
I did several hours of research and found that inline tables can cause issues when joining (the Cartesian product?), however I wasn't able to understand how those answers were relevant to this query.
Can someone explain why including table FI in this query multiplies the SUM values of everything from vw_Interactions? And how do I fix my query so this does not happen?
This query is for my employer's outbound call center to measure what's happening during each 'round' of calling.
/* Parameters */
DECLARE @StartDatetime AS Date
SET @StartDatetime = '06/01/13'
DECLARE @EndDatetime AS Date
SET @EndDatetime = '05/31/14'
/* Dataset */
SELECT R.[RoundsGoal]
,R.[RoundNumber]
,COUNT(DISTINCT R.[Client_Id]) AS 'Leads'
,ISNULL(SUM(I.[Call]), 0) AS 'Calls'
,ISNULL(COUNT(DISTINCT LI.[Call]), 0) AS 'CallLeads'
,ISNULL(SUM(FI.[FirstCall]), 0) AS 'FirstCalls'
,ISNULL(SUM(I.[DecisionMakerCall]), 0) AS 'DecisionMakerCalls'
,ISNULL(COUNT(DISTINCT LI.[DecisionMakerCall]), 0) AS 'DecisionMakerCallLeads'
,ISNULL(SUM(FI.[FirstDecisionMakerCall]), 0) AS 'FirstDecisionMakerCalls'
,ISNULL(SUM( I.[LeftMessageCall]), 0) AS 'LeftMessageCalls'
,ISNULL(COUNT(DISTINCT LI.[LeftMessageCall]), 0) AS 'LeftMessageLeads'
,ISNULL(SUM(FI.[FirstLeftMessageCall]), 0) AS 'FirstLeftMessageCalls'
,ISNULL(SUM(I.[NoAnswerCall]), 0) AS 'NoAnswerCalls'
,ISNULL(COUNT(DISTINCT LI.[NoAnswerCall]), 0) AS 'NoAnswerCallLeads'
,ISNULL(SUM(FI.[FirstNoAnswerCall]), 0) AS 'FirstNoAnswerCalls'
FROM (
SELECT RD.[Client_Id]
,ISNULL(UF1.[NumericCol], 0) AS 'RoundsGoal'
,COUNT(RD.[RoundDate]) OVER(PARTITION BY RD.[Client_Id] ORDER BY RD.[RoundDate] ASC) AS 'RoundNumber'
,RD.[RoundDate]
FROM [dbo].[vw_RoundDates] RD
LEFT JOIN [dbo].[AMGR_User_Fields] UF1 ON RD.[Client_Id] = UF1.[Client_Id] AND UF1.[Type_Id] = 140 --Rounds Goal TypeId
LEFT JOIN [dbo].[AMGR_User_Field_Defs] UFD1 ON UF1.[Type_Id] = UFD1.[Type_Id] AND UF1.[Code_Id] = UFD1.[Code_Id]
WHERE RD.[RoundDate] >= @StartDatetime AND RD.[RoundDate] <= @EndDatetime
) R
LEFT JOIN [dbo].[vw_Interactions] I ON R.[Client_Id] = I.[Client_Id] AND R.[RoundDate] = CAST(I.[Created] AS DATE)
LEFT JOIN [dbo].[vw_LeadInteractions] LI ON R.[Client_Id] = LI.[Client_Id] AND R.[RoundDate] = CAST(LI.[Created] AS DATE)
LEFT JOIN (
SELECT I.[Client_Id]
,CASE WHEN (CASE WHEN I.[Call] = 1 THEN ROW_NUMBER() OVER(PARTITION BY I.[Client_Id], I.[Call] ORDER BY I.[Created] ASC) ELSE NULL END) = 1 THEN 1 ELSE NULL END AS 'FirstCall'
,CASE WHEN (CASE WHEN I.[DecisionMakerCall] = 1 THEN ROW_NUMBER() OVER(PARTITION BY I.[Client_Id], I.[DecisionMakerCall] ORDER BY I.[Created] ASC) ELSE NULL END) = 1 THEN 1 ELSE NULL END AS 'FirstDecisionMakerCall'
,CASE WHEN (CASE WHEN I.[LeftMessageCall] = 1 THEN ROW_NUMBER() OVER(PARTITION BY I.[Client_Id], I.[LeftMessageCall] ORDER BY I.[Created] ASC) ELSE NULL END) = 1 THEN 1 ELSE NULL END AS 'FirstLeftMessageCall'
,CASE WHEN (CASE WHEN I.[NoAnswerCall] = 1 THEN ROW_NUMBER() OVER(PARTITION BY I.[Client_Id], I.[NoAnswerCall] ORDER BY I.[Created] ASC) ELSE NULL END) = 1 THEN 1 ELSE NULL END AS 'FirstNoAnswerCall'
,[Created]
FROM [dbo].[vw_Interactions] I
) FI ON R.[Client_Id] = FI.[Client_Id] AND R.[RoundDate] = CAST(FI.[Created] AS DATE)
GROUP BY R.[RoundsGoal]
,R.[RoundNumber]
ORDER BY R.[RoundsGoal] ASC
,R.[RoundNumber] ASC
Here is the correct result set, without table FI. Notice that Calls on row 23 equals 135,110.
Here is the incorrect result set, which includes table FI. Notice that Calls on row 23 is multiplied to 1,561,038.
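
One likely explanation for the multiplied sums is join fan-out: FI has one row per vw_Interactions row and is joined on the same client and date as I, so every I row is paired with every FI row for that client and day. The sketch below reproduces the effect on a toy schema and shows one possible fix (aggregating each side to one row per client and day before joining). It assumes SQLite via Python's sqlite3 module; the tables rounds and interactions and the column is_call are simplified, hypothetical stand-ins for the views in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rounds (client_id INTEGER, round_date TEXT);
CREATE TABLE interactions (client_id INTEGER, created TEXT, is_call INTEGER);
INSERT INTO rounds VALUES (1, '2013-06-03');
INSERT INTO interactions VALUES
    (1, '2013-06-03 09:00:00', 1),
    (1, '2013-06-03 11:00:00', 1),
    (1, '2013-06-03 15:00:00', 1);
""")

# Joining the same detail rows twice: 3 x 3 = 9 rows, so the SUM is inflated.
multiplied = conn.execute("""
SELECT SUM(i.is_call)
FROM rounds r
LEFT JOIN interactions i
       ON r.client_id = i.client_id AND r.round_date = DATE(i.created)
LEFT JOIN interactions fi
       ON r.client_id = fi.client_id AND r.round_date = DATE(fi.created)
""").fetchone()[0]

# Aggregating each side first leaves at most one row per client and day.
fixed = conn.execute("""
SELECT SUM(i.calls), SUM(fi.first_calls)
FROM rounds r
LEFT JOIN (SELECT client_id, DATE(created) AS day, SUM(is_call) AS calls
           FROM interactions
           GROUP BY client_id, DATE(created)) i
       ON r.client_id = i.client_id AND r.round_date = i.day
LEFT JOIN (SELECT client_id, DATE(MIN(created)) AS day, 1 AS first_calls
           FROM interactions
           WHERE is_call = 1
           GROUP BY client_id) fi
       ON r.client_id = fi.client_id AND r.round_date = fi.day
""").fetchone()

print(multiplied, fixed)
```

COUNT(DISTINCT ...) is not affected by duplicated rows, which would be consistent with the vw_LeadInteractions columns staying correct in the question.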

Display only needed results in SQL

I need a little assistance in finishing this query. Here is what I have so far:
select
(select count(fileName)
from PDFFile
where dateTime > cast(getdate() as date)
and stateId = 17) AS "Files on SFTP"
,
(select count(fileName)
from PDFFile
where dateTime > cast(getdate() as date)
and stateId = 12) AS "Files Pass"
,
((select count(fileName)
from PDFFile
where dateTime > cast(getdate() as date)
and stateId = 17)
-
(select count(fileName)
from PDFFile
where dateTime > cast(getdate() as date)
and stateId = 12)) AS "Diff"
This is going to give me 3 columns of results. First result will be a number, second will be a number and the third will be the diff. There may even be a better way to write this but I'm still a novice. Hint: There is an entry in the DB for each state:
fileName | dateTime                | stateID
---------+-------------------------+--------
abc.pdf  | 2013-12-17 12:03:14.597 | 17
abc.pdf  | 2013-12-17 12:06:23.096 | 12
xyz.pdf  | 2013-12-17 12:09:16.583 | 17
xyz.pdf  | 2013-12-17 12:10:19.823 | 12
Anyways for the finale...
I need to have a 4th column or a separate query (possible to UNION?) that pulls the fileNames based off the results in the diff.
Hypothetically, if the diff is 40, the 4th column or separate query should list the 40 names. At times the diff may be negative, so if it's -40 it should still list the 40 names.
Assistance is greatly appreciated. Thank you!
You can greatly simplify your query using conditional aggregation:
select sum(case when dateTime > cast(getdate() as date) and stateId = 17 then 1 else 0
end) as "Files on SFTP",
sum(case when dateTime > cast(getdate() as date) and stateId = 12 then 1 else 0
end) AS "Files Pass",
(sum(case when dateTime > cast(getdate() as date) and stateId = 17 then 1 else 0
end) -
sum(case when dateTime > cast(getdate() as date) and stateId = 12 then 1 else 0
end)
) as diff
from PDFFile;
To get the list of files that are in the first group but not the second requires a bit more logic. The problem is that the unit of aggregation is at the file level.
select fileName
from PDFFile
group by fileName
having sum(case when dateTime > cast(getdate() as date) and stateId = 17 then 1 else 0
end) > 0 and
sum(case when dateTime > cast(getdate() as date) and stateId = 12 then 1 else 0
end) = 0;
Each part of the having clause counts the number of rows -- for each file -- that match the two conditions. You want at least one row that matches the first condition (hence > 0) and no rows that match the second (= 0).
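
The per-file HAVING logic can be checked against the sample rows from the question. This is a minimal sketch, assuming SQLite via Python's sqlite3 module; a fixed cutoff date is passed as a parameter in place of cast(getdate() as date) so the result is repeatable, and the last xyz.pdf row is left out to create a file that never reached state 12.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PDFFile (fileName TEXT, dateTime TEXT, stateId INTEGER)")
conn.executemany("INSERT INTO PDFFile VALUES (?, ?, ?)", [
    ("abc.pdf", "2013-12-17 12:03:14.597", 17),
    ("abc.pdf", "2013-12-17 12:06:23.096", 12),
    ("xyz.pdf", "2013-12-17 12:09:16.583", 17),  # no matching state 12 row
])

cutoff = "2013-12-17"  # stands in for cast(getdate() as date)

query = """
SELECT fileName
FROM PDFFile
GROUP BY fileName
HAVING SUM(CASE WHEN dateTime > :cutoff AND stateId = 17 THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN dateTime > :cutoff AND stateId = 12 THEN 1 ELSE 0 END) = 0
"""
missing_files = conn.execute(query, {"cutoff": cutoff}).fetchall()
print(missing_files)
```

Swapping the two state IDs in the HAVING clause would list the files for the negative-diff case instead.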
This type of "combine row data into one column" question comes up quite a lot on Stack Overflow and although it has its place it's often easier and more efficient to solve the problem in another way.
For example, it's a lot easier to ask SQL to "give me all the filenames where stateid = 17", return them to your app and then get the app to display them. It may also be that your user doesn't want to see them until there is a particular summary line that is of interest to them that they need to drill down into further. Think of email as an example - you may only need to look at the 30 character subject line and know you don't need to download the 1Mb email body.
For your first question, though, there is a much easier (and more efficient) way to write your query. Note that this example is untested:
select
sum(case when stateId = 17 then 1 else 0 end) as "Files on SFTP",
sum(case when stateId = 12 then 1 else 0 end) as "Files Pass",
sum(case when stateId = 17 then 1 else 0 end) -
sum(case when stateId = 12 then 1 else 0 end) as "Diff"
from
PdfFile
where
dateTime > cast(getdate() as date)
I'm using CASE here to avoid the separate sub-queries, each of which typically has to read the table again; with CASE the table is read once. I've also placed your dateTime check at the bottom of the query as a WHERE, as it was common to each of your checks.

multi-select sql query with date range

I have this query where I get totals of different stats from an employee roster table.
SELECT A.rempid AS EmpId,
E.flname,
A.rdo_total,
B.grave_total,
C.sundays,
D.holidays
FROM (SELECT rempid,
Count(rshiftid)AS RDO_Total
FROM rtmp1
WHERE rshiftid = 2
GROUP BY rempid
HAVING Count(rshiftid) > 0) A,
(SELECT rempid,
Count(rshiftid)AS Grave_Total
FROM rtmp1
WHERE rshiftid = 6
GROUP BY rempid
HAVING Count(rshiftid) > 0)B,
(SELECT rempid,
Count(rshiftid) AS Sundays
FROM rtmp1
WHERE Datepart(dw, rdate) = 1
AND rshiftid > 2
GROUP BY rempid
HAVING Count(rshiftid) > 0)C,
(SELECT rempid,
Count(rshiftid) AS Holidays
FROM rtmp1
WHERE rdate IN (SELECT pubhdt
FROM pubhol)
AND rshiftid > 2
GROUP BY rempid
HAVING Count(rshiftid) > 0)D,
(SELECT empid,
[fname] + ' ' + [sname] AS flName
FROM remp1)E
WHERE A.rempid = B.rempid
AND A.rempid = E.empid
AND A.rempid = C.rempid
AND A.rempid = D.rempid
ORDER BY A.rempid
I would like to add a date range to it, so that I can query the database between 2 dates. The rTmp1 table has a column called rDate. What is the best way to do this? I could put it in a stored procedure and add the variables to each select query. Or is there a better way to run the query within a date range?
I think you can just add an additional where clause item to each sub-query, similar to:
AND ( rDate > somedate AND rDate < someotherdate )
Adding the date range to each query is the most direct solution.
Making it a stored procedure is something that can always be done with a query, but has nothing to do with this specific case.
If the number of records resulting from narrowing down your table to the specified date range is substantially less than the entire table, it might be an option to insert these records into a temporary table or a table variable and run your existing query on that table/resultset.
Though I do not have any data to test with, you might consider the following query, as it is easier to read and might perform better. You will have to check the results yourself and may need to make some adjustments.
DECLARE @startDate date = '12/01/2012'
DECLARE @endDate date = DATEADD(MONTH, 1, @startDate)
SELECT
[e].[empid],
[e].[fname] + ' ' + [e].[sname] AS [flName],
SUM(CASE WHEN [t].[rshiftid] = 2 THEN 1 ELSE 0 END) AS [RDO_Total],
SUM(CASE WHEN [t].[rshiftid] = 6 THEN 1 ELSE 0 END) AS [Grave_Total],
SUM(CASE WHEN [t].[rshiftid] > 2 AND DATEPART(dw, [t].[rdate]) = 1 THEN 1 ELSE 0 END) AS [Sundays],
SUM(CASE WHEN [t].[rshiftid] > 2 AND [h].[pubhdt] IS NOT NULL THEN 1 ELSE 0 END) AS [Holidays]
FROM [remp1] [e]
INNER JOIN [rtmp1] [t] ON [e].[empid] = [t].[rempid]
LEFT JOIN [pubhol] [h] ON [t].[rdate] = [h].[pubhdt]
WHERE [t].[rdate] BETWEEN @startDate AND @endDate
GROUP BY
[e].[empid],
[e].[fname],
[e].[sname]
ORDER BY [empid] ASC
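
One thing worth checking when adjusting the query above is the end of the range: BETWEEN includes both boundaries, so with an end date one month after the start, the first day of the following month is counted as well. The sketch below compares BETWEEN with a half-open range. It assumes SQLite via Python's sqlite3 module and a cut-down, hypothetical version of the rtmp1 table with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rtmp1 (rempid INTEGER, rshiftid INTEGER, rdate TEXT)")
conn.executemany("INSERT INTO rtmp1 VALUES (?, ?, ?)", [
    (1, 2, "2012-12-01"),
    (1, 2, "2012-12-15"),
    (1, 2, "2013-01-01"),  # first day of the next month
    (1, 6, "2012-12-20"),
])

start_date, end_date = "2012-12-01", "2013-01-01"

between_sql = """
SELECT rempid,
       SUM(CASE WHEN rshiftid = 2 THEN 1 ELSE 0 END) AS rdo_total,
       SUM(CASE WHEN rshiftid = 6 THEN 1 ELSE 0 END) AS grave_total
FROM rtmp1
WHERE rdate BETWEEN ? AND ?
GROUP BY rempid
"""
# Same query, but the end date is excluded.
half_open_sql = between_sql.replace(
    "rdate BETWEEN ? AND ?", "rdate >= ? AND rdate < ?"
)

inclusive = conn.execute(between_sql, (start_date, end_date)).fetchall()
half_open = conn.execute(half_open_sql, (start_date, end_date)).fetchall()
print(inclusive, half_open)
```

Which form is right depends on whether the second date is meant to be part of the report; passing the dates as parameters keeps the query itself unchanged either way.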

Multiple Queries in different table

(Also posted here.)
So I have two tables, one is invalid table and the other is valid table.
valid table:
id
status
date
invalid table:
id
status
date
I have to produce a report with this output:
date       on-time  late  total  valid  invalid1  invalid2  total  rate
---------  -------  ----  -----  -----  --------  --------  -----  ----
9/10/2011  4        10    14     3      3         3         6
date: field common to the 2 tables and the field to group by (how many records that day has)
on-time: count of all the id on the valid table
late: count of all the records(id) on the invalid table
total: total of on-time and late
valid: count of id on the valid table with the "valid" status
invalid1: count of id on the invalid table with "invalid1" status
invalid2: count of id on the invalid table with "invalid2" status
total: total of valid, invalid1, invalid2
rate: average of totals
It's basically multiple queries with different table. How can I achieve it?
Something like this?
SELECT
*,
(result.total + result._total) / 2 AS rate
FROM (
SELECT
date,
SUM(CASE WHEN data.valid = 1 THEN 1 ELSE 0 END) AS ontime,
SUM(CASE WHEN data.valid = 0 THEN 1 ELSE 0 END) AS late,
COUNT(*) AS total,
SUM(CASE WHEN data.valid = 1 AND data.status = 'valid' THEN 1 ELSE 0 END) AS valid,
SUM(CASE WHEN data.valid = 0 AND data.status = 'invalid1' THEN 1 ELSE 0 END) AS invalid1,
SUM(CASE WHEN data.valid = 0 AND data.status = 'invalid2' THEN 1 ELSE 0 END) AS invalid2,
SUM(CASE WHEN data.status IN ('valid', 'invalid1', 'invalid2') THEN 1 ELSE 0 END) AS _total
FROM (
SELECT
date,
status,
valid = 1
FROM
Valid
UNION ALL
SELECT
date,
status,
valid = 0
FROM
InValid ) AS data
GROUP BY
date) AS result
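
The UNION ALL approach above can be tried on a handful of rows. This is a minimal sketch, assuming SQLite via Python's sqlite3 module and invented sample data; the flag column is written as 1 AS is_valid because the valid = 1 alias form is specific to SQL Server, and the names is_valid and valid_count are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE valid (id INTEGER, status TEXT, date TEXT);
CREATE TABLE invalid (id INTEGER, status TEXT, date TEXT);
INSERT INTO valid VALUES (1, 'valid', '2011-09-10'), (2, 'valid', '2011-09-10');
INSERT INTO invalid VALUES
    (3, 'invalid1', '2011-09-10'),
    (4, 'invalid2', '2011-09-10'),
    (5, 'invalid2', '2011-09-10');
""")

query = """
SELECT date,
       SUM(CASE WHEN is_valid = 1 THEN 1 ELSE 0 END) AS ontime,
       SUM(CASE WHEN is_valid = 0 THEN 1 ELSE 0 END) AS late,
       COUNT(*) AS total,
       SUM(CASE WHEN is_valid = 1 AND status = 'valid' THEN 1 ELSE 0 END) AS valid_count,
       SUM(CASE WHEN is_valid = 0 AND status = 'invalid1' THEN 1 ELSE 0 END) AS invalid1,
       SUM(CASE WHEN is_valid = 0 AND status = 'invalid2' THEN 1 ELSE 0 END) AS invalid2
FROM (
    SELECT date, status, 1 AS is_valid FROM valid
    UNION ALL
    SELECT date, status, 0 AS is_valid FROM invalid
) AS data
GROUP BY date
"""
report = conn.execute(query).fetchall()
print(report)
```

Stacking the two tables first means a date that appears in only one of them still shows up in the report, which a plain inner join between per-table summaries would drop.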
SELECT date, ontime, late, ontime+late total, valid, invalid1, invalid2, valid+invalid1+invalid2 total
FROM
(SELECT date,
COUNT(*) late,
COUNT(IIF(status = 'invalid1', 1, NULL)) invalid1,
COUNT(IIF(status = 'invalid2', 1, NULL)) invalid2
FROM invalid
GROUP BY date
) JOIN (
SELECT date,
COUNT(*) ontime,
COUNT(IIF(status = 'valid', 1, NULL)) valid
FROM valid
GROUP BY date
) USING (date)
First of all, it seems that you are holding exactly the same information in 2 tables. I would recommend merging those tables and adding a boolean column called valid to record whether each row is valid.
The query on your existing DB structure might look something like this:
SELECT unioned.date, SUM(unioned.valid) AS valid, SUM(unioned.invalid1) AS invalid1, SUM(unioned.invalid2) AS invalid2 FROM (
( SELECT v.date AS date, COUNT(v.id) AS valid, 0 AS invalid1, 0 AS invalid2 FROM valid v WHERE v.status = 'valid' GROUP BY v.date)
UNION ALL
( SELECT i1.date AS date, 0 AS valid, COUNT(i1.id) AS invalid1, 0 AS invalid2 FROM invalid i1 WHERE i1.status = 'invalid1' GROUP BY i1.date)
UNION ALL
( SELECT i2.date AS date, 0 AS valid, 0 AS invalid1, COUNT(i2.id) AS invalid2 FROM invalid i2 WHERE i2.status = 'invalid2' GROUP BY i2.date)
) AS unioned GROUP BY unioned.date