SQL Puzzle - Why can't I join these two subqueries? - sql

I'm stuck on a SQL query.
Consider the following table:
Table DG_GAME_ROUNDS
RoundId int
GameId int
RoundNumber int
Value varchar(20)
Guess varchar(20)
Answer varchar(20)
Correct bit
Minutes int
Seconds int
Milliseconds int
This table holds the results of game rounds. Now sometimes you can fat finger an answer to the game and wind up with a guess time of 35 or even 0 milliseconds. These answers skew the results of my game and I want to remove them.
I want to figure out the average guess time where the guess is at least 200 milliseconds long. So if a game had five rounds with guesses of 455, 400, 340, 30, 300. I want to ignore the 30 and average out the remaining four values and get an average guess time of 374. Without dropping the 30 the average guess time would be 305.
My problem is that I'm trying to join two subqueries and I'm getting an error message that there is a problem around the "on" statement. I think joining subqueries is allowed.
select vt.gameid, vt.totalms, vt.numofguesses, vt.correctguesses
from
(select gr.gameid
, sum((gr.seconds*1000) + gr.milliseconds) as totalms
, count(gr.roundid) as numofguesses
, sum(cast(gr.correct as int)) as correctguesses
from work_tables.dbo.dg_game_rounds gr (nolock)
group by gr.gameid
) vt
inner join (
select vtIII.gameid, vtIII.avgtime
from
(
select vtII.gameid, sum(vtII.avgms)/count(vtII.avgms) as avgtime
from (
select gr.gameid, gr.seconds * 1000 + gr.milliseconds as avgms
from dg_game_rounds gr (nolock)
where gr.seconds * 1000 + gr.milliseconds > 200
) vtII
group by vtII.gameid
) vtIII
on vtIII.gameid = vt.gameid

Because you're missing an ending ) (2nd to last line)
select vt.gameid, vt.totalms, vt.numofguesses, vt.correctguesses
from
(select gr.gameid
, sum((gr.seconds*1000) + gr.milliseconds) as totalms
, count(gr.roundid) as numofguesses
, sum(cast(gr.correct as int)) as correctguesses
from work_tables.dbo.dg_game_rounds gr (nolock)
group by gr.gameid
) vt
inner join (
select vtIII.gameid, vtIII.avgtime
from
(
select vtII.gameid, sum(vtII.avgms)/count(vtII.avgms) as avgtime
from (
select gr.gameid, gr.seconds * 1000 + gr.milliseconds as avgms
from dg_game_rounds gr (nolock)
where gr.seconds * 1000 + gr.milliseconds > 200
) vtII
group by vtII.gameid
) vtIII ) vtIII
on vtIII.gameid = vt.gameid

You haven't closed all your subqueries:
select vt.gameid, vt.totalms, vt.numofguesses, vt.correctguesses
from
(select gr.gameid
, sum((gr.seconds*1000) + gr.milliseconds) as totalms
, count(gr.roundid) as numofguesses
, sum(cast(gr.correct as int)) as correctguesses
from work_tables.dbo.dg_game_rounds gr (nolock)
group by gr.gameid
) vt
inner join (
select vtIII.gameid, vtIII.avgtime
from
(
select vtII.gameid, sum(vtII.avgms)/count(vtII.avgms) as avgtime
from (
select gr.gameid, gr.seconds * 1000 + gr.milliseconds as avgms
from dg_game_rounds gr (nolock)
where gr.seconds * 1000 + gr.milliseconds > 200
) vtII
group by vtII.gameid
) vtIII ) f
on f.gameid = vt.gameid
I added this: ) vtIII ) f

Count your parentheses.
inner join (
is never closed.

Related

How add up values from multiple SQL columns based on occurrances

I need select values from a table and returns the total hours for all categories and their occurrences. The challenge is that there are different totals for each occurrence.
My query:
SELECT c.Category,
c.HrsFirstOccur,
c.HrsAddlOccur,
COUNT(*) AS Occurrences
FROM dbo.Categories sc
INNER JOIN dbo.Categories c
ON sc.CategoryID = c.CategoryID
INNER JOIN dbo.OrderHistory oh
ON sc.GONumber = oh.OrderNumber
AND sc.Item = oh.ItemNumber
WHERE sc.BusinessGroupID = 1
AND oh.OrderNumber = 500
AND oh.ItemNumber = '100'
GROUP BY c.Category, c.HrsFirstOccur, c.HrsAddlOccur
returns the following results:
Category
HrsFirstOccur
HrsAddlOccur
Occurrences
Inertia
24
16
2
Lights
1
0.5
4
Labor
10
0
1
The total is calculated based on the number of occurrences. The first one is totaled then for each additional occurrence, the HrsAddlOccur is used.
My final result should be (24 + 16) + (1 + 0.5 + 0.5 + 0.5) + 10 for a grand total of 52.5.
How do I loop and process the results to total this up?
The total is calculated based on the number of occurrences. The first one is totaled then for each additional occurrence, the HrsAddlOccur is used.
SQL databases understand arithmetic. You can perform the computation on each row. As I understand, the logic you want is:
SELECT
c.Category,
c.HrsFirstOccur,
c.HrsAddlOccur,
COUNT(*) AS Occurrences,
c.HrsFirstOccur + ( COUNT(*) - 1 ) * HrsAddlOccur As Total
FROM ... < rest of your query > ..
Later on you can aggregate the whole resultset to get the grand total:
SELECT SUM(Total) GrandTotal
FROM (
... < above query > ..
) t
you can sum them simply up
WITH CTE as(SELECT c.Category,
c.HrsFirstOccur,
c.HrsAddlOccur,
COUNT(*) AS Occurrences
FROM dbo.Categories sc
INNER JOIN dbo.Categories c ON sc.CategoryID = c.CategoryID
INNER JOIN dbo.OrderHistory oh ON sc.GONumber = oh.OrderNumber
AND sc.Item = oh.ItemNumber
WHERE sc.BusinessGroupID = 1
AND oh.OrderNumber = 500
AND oh.ItemNumber = '100')
SELECT SUM(HrsFirstOccur + (CAST((Occurrences -1) AS DECIMAL(8,2)) * HrsAddlOccur)) as total FROM CTE
it would do it like the example
CREATE TABLE CTE
([Category] varchar(7), [HrsFirstOccur] int, [HrsAddlOccur] DECIMAL(8,2), [Occurrences] int)
;
INSERT INTO CTE
([Category], [HrsFirstOccur], [HrsAddlOccur], [Occurrences])
VALUES
('Inertia', 24, 16, 2),
('Lights', 1, 0.5, 4),
('Labor', 10, 0, 1)
;
3 rows affected
SELECT SUM(HrsFirstOccur + (CAST((Occurrences -1) AS DECIMAL(8,2)) * HrsAddlOccur)) as total
FROM CTE
total
52.5000
fiddle

Bizarre Join with comma

I'm looking at someone else's code and find this bizarre join:
SELECT
SUM(
(
intUnitOverheadCost + intUnitLaborCost + intUnitMaterialCost + intUnitSubcontractCost
+ intUnitDutyCost + intUnitFreightCost + intUnitMiscCost
)
*
(
(
CASE
WHEN imtSource = 3
THEN - 1
ELSE 1
END
) * intQuantity
)
)
FROM PartTransactions --imt
INNER JOIN PartTransactionCosts --int
ON imtPartTransactionID = intPartTransactionID
LEFT JOIN Warehouses --imw
ON imtPartWarehouseLocationID = imwWarehouseID
, ProductionProperties --xap <-- weird join
WHERE imtJobID = jmpJobID
AND imtSource IN (2,3)
AND imtReceiptID = ''
AND Upper(imtTableName) <> 'RECEIPTLINES'
AND imtNonInventoryTransaction <= {?CHECKBOXGROUP_4_ShowNonInventory}
AND imtJobType IN (1, 3)
AND imtTransactionDate < DATEADD(d, 1, {?PROMPT_1_TODATE})
AND (
imtNonNettable = 0
OR (
imtNonNettable <> 0
AND ISNULL(imwDoNotIncludeInJobCosts, 0) = 0
)
)
AND intCostType = (
CASE -- Always 1
WHEN xapIMCostingMethod = 1
THEN 1
WHEN xapIMCostingMethod = 2
THEN 2
WHEN xapIMCostingMethod = 3
THEN 3
ELSE 4
END
)
There is only one record in table ProductionProperties and the result of select xapIMCostingMethod from ProductionProperties is always 1.
There are always 4 enumerated results in PartTransactionCosts, but only 1 result is allowed.
ProductionProperties.xapIMCostingMethod is implicitly joining to PartTransactionCosts.intCostType
My specific question is what is really going on with this comma join? It looks like it has to be a cross-join, later filtered in the WHERE clause with one possible result.
Agree with the previous answer. It is a cartesian join but since the rows are 1 it doesn't cause an issue.
I'm thinking if you added rows to ProductionProperties then it would serve as a multiplier for your sum. I did a little experiment to show the issue:
declare #tableMoney table (
unit int,
Product char(5),
xapIMPCostingMethod int,
Cost money
)
declare #tableProdProperties table (
xapIMPCostingMethod int
)
insert #tableMoney (unit, Product, xapIMPCostingMethod, Cost)
values
(1,'bike',1, 2.00),
(1,'car',1, 2.25),
(2,'boat',2, 4.50)
insert #tableProdProperties (xapIMPCostingMethod)
values (1),
(2)
select sum(Cost)
from #tableMoney, #tableProdProperties
I also don't like to use joins where it isn't clear what is joining to what so I always use an alias:
select sum(Cost)
from #tableMoney tbm join #tableProdProperties tpp
on tbm.xapIMPCostingMethod = tpp.xapIMPCostingMethod

Why is this query with a nested select faster when I include the where clause twice

I had a large sql query that had a nested select in the from clause.
Similar to this:
SELECT * FROM
( SELECT * FROM SOME_TABLE WHERE some_num = 20)
WHERE some_num = 20
In my sql query if I remove the outer "some_num" = 20 it takes 5 times as long . Shouldent these querys run in almost exactly the same time, if not wouldn't having the the additional where slow it down slightly?
What am I not understanding about how sql querys work?
Here is the original query in question
SELECT a.ITEMNO AS Item_No,
a.DESCRIPTION AS Item_Description,
UNITPRICE / 100 AS Retail_Price,
b.UNITSALES AS Units_Sold,
( Dollar_Sales ) AS Dollar_Sales,
( Dollar_Cost ) AS Dollar_Cost,
( Dollar_Sales ) - ( Dollar_Cost ) AS Gross_Profit,
( Percent_Page * c.PAGECOST ) AS Page_Cost,
( Dollar_Sales - Dollar_Cost - ( Percent_Page * c.PAGECOST ) ) AS Net_Profit,
Percent_Page * 100 AS Percent_Page,
( CASE
WHEN UNITPRICE = 0 THEN NULL
WHEN Percent_Page = 0 THEN NULL
WHEN ( Dollar_Sales - Dollar_Cost - ( Percent_Page * c.PAGECOST ) ) > 0 THEN 0
ELSE ( ceiling(abs(Dollar_Sales - Dollar_Cost - ( Percent_Page * c.PAGECOST )) / ( UNITPRICE / 100 )) )
END ) AS Break_Even,
b.PAGENO AS Page_Num
FROM (SELECT PAGENO,
OFFERITEM,
UNITSALES,
UNITPRICE,
( DOLLARSALES / 100 ) AS Dollar_Sales,
( DOLLARCOST / 10000 ) AS Dollar_Cost,
(( CAST(STUFF(PERCENTPAGE, 2, 0, '.') AS DECIMAL(9, 6)) )) AS Percent_Page
FROM OFFERITEMS
WHERE LEFT(OFFERITEM, 6) = 'CH1301'
AND PERCENTPAGE > 0) AS b
INNER JOIN ITEMMAST a
ON a.EDPNO = 1 * RIGHT(OFFERITEM, 8)
LEFT JOIN OFFERS c
ON c.OFFERNO = 'CH1301'
WHERE LEFT(OFFERITEM, 6) = 'CH1301'
ORDER BY Net_Profit DESC
Notice the two
WHERE left(OFFERITEM,6) = 'CH1301'
If I remove the outer Where then the query takes 5 times as long
As requested the Execution plan excuse the crappy upload
http://i.imgur.com/1PqmpVf.png
Is the column OFFERITEM in an index but PERCENTPAGE is not?
In your inner query you reference both these columns, in the outer query you only reference OFFERITEM.
Difficult to say without seeing the execution plan, but it could be that the outer query is causing the optimizer to run an 'index scan' whereas the inner query would cause a full table scan.
On a separate note, you should definitely modify:
WHERE left(OFFERITEM,6) ='CH1301'
to:
where offeritem like 'CH1301%'
As this will allow an index seek if there is an index on offeritem.

All hour of day

In my sql query, I count the number of orders in each Hour of day. My query looks something like this:
SELECT COUNT(dbo.Uputa.ID),{ fn HOUR(dbo.Orders.Date) } AS Hour
FROM Orders
WHERE dbo.Orders.Date BETWEEN '2011-05-01' AND '2011-05-26'
GROUP BY { fn HOUR(dbo.Orders.Date) }
ORDER BY Hour
My problem is that the query returns only existing Hours in dbo.Orders.Date.
For example:
Number Hour
12 3
12 5
I want to return all hours like this:
Number Hour
0 0
0 1
0 2
12 3
0 4
12 5
...
0 23
Does anybody have idea how to accomplish this?
Use a common table expression to create all hours, then left join your grouped totals to get a result.
with mycte as
(
SELECT 0 AS MyHour
UNION ALL
SELECT MyHour + 1
FROM mycte
WHERE MyHour + 1 < 24
)
SELECT mycte.MyHour, COALESCE(OrderCount,0) FROM mycte
LEFT JOIN
(
SELECT COUNT(dbo.Uputa.ID) AS OrderCount,{ fn HOUR(dbo.Orders.Date) } AS MyHour
FROM Orders
WHERE dbo.Orders.Date BETWEEN '2011-05-01' AND '2011-05-26'
GROUP BY { fn HOUR(dbo.Orders.Date) }
) h
ON
h.MyHour = mycte.MyHour;
A 'numbers table' (SQL, Auxiliary table of numbers for example) is in general quite a useful thing to have in your database; if you create one here you can select all rows between 0 and 23 from your numbers table, left join that against your results and you'll get the results you want without the need to create a custom CTE or similar purely for this query.
SELECT COUNT(dbo.Uputa.ID),n.number AS Hour
FROM (select number from numbers where number between 0 and 23) n
left join Orders o on n.number={ fn HOUR(dbo.Orders.Date) }
WHERE dbo.Orders.Date BETWEEN '2011-05-01' AND '2011-05-26'
GROUP BY n.number
ORDER BY n.number
(I've worded this as per your example for clarity but in practice I'd try and avoid putting a function in the join criteria to maximise performance.)
You can use a CTE to add the missing hours and JOIN these with your original query to fill in the blanks.
SQL Statement
;WITH q (Number, Hour) AS (
SELECT 0, 1
UNION ALL
SELECT q.Number, q.Hour + 1
FROM q
WHERE q.Hour < 23
)
SELECT COALESCE(o.Number, q.Number)
, q.Hour
FROM q
LEFT OUTER JOIN (
SELECT COUNT(dbo.Uputa.ID),{ fn HOUR(dbo.Orders.Date) } AS Hour
FROM Orders
WHERE dbo.Orders.Date BETWEEN '2011-05-01' AND '2011-05-26'
GROUP BY { fn HOUR(dbo.Orders.Date) }
) o ON o.Hour = q.Hour
ORDER BY
q.Hour
Test Script
;WITH Orders (Number, Hour) AS (
SELECT 12, 3
UNION ALL SELECT 12, 5
)
, q (Number, Hour) AS (
SELECT 0, 1
UNION ALL
SELECT q.Number, q.Hour + 1
FROM q
WHERE q.Hour < 23
)
SELECT COALESCE(o.Number, q.Number)
, q.Hour
FROM q
LEFT OUTER JOIN Orders o ON o.Hour = q.Hour

Can you create nested WITH clauses for Common Table Expressions?

WITH y AS (
WITH x AS (
SELECT * FROM MyTable
)
SELECT * FROM x
)
SELECT * FROM y
Does something like this work? I tried it earlier but I couldn't get it to work.
While not strictly nested, you can use common table expressions to reuse previous queries in subsequent ones.
To do this, the form of the statement you are looking for would be
WITH x AS
(
SELECT * FROM MyTable
),
y AS
(
SELECT * FROM x
)
SELECT * FROM y
You can do the following, which is referred to as a recursive query:
WITH y
AS
(
SELECT x, y, z
FROM MyTable
WHERE [base_condition]
UNION ALL
SELECT x, y, z
FROM MyTable M
INNER JOIN y ON M.[some_other_condition] = y.[some_other_condition]
)
SELECT *
FROM y
You may not need this functionality. I've done the following just to organize my queries better:
WITH y
AS
(
SELECT *
FROM MyTable
WHERE [base_condition]
),
x
AS
(
SELECT *
FROM y
WHERE [something_else]
)
SELECT *
FROM x
With does not work embedded, but it does work consecutive
;WITH A AS(
...
),
B AS(
...
)
SELECT *
FROM A
UNION ALL
SELECT *
FROM B
EDIT
Fixed the syntax...
Also, have a look at the following example
SQLFiddle DEMO
These answers are pretty good, but as far as getting the items to order properly, you'd be better off looking at this article
http://dataeducation.com/dr-output-or-how-i-learned-to-stop-worrying-and-love-the-merge
Here's an example of his query.
WITH paths AS (
SELECT
EmployeeID,
CONVERT(VARCHAR(900), CONCAT('.', EmployeeID, '.')) AS FullPath
FROM EmployeeHierarchyWide
WHERE ManagerID IS NULL
UNION ALL
SELECT
ehw.EmployeeID,
CONVERT(VARCHAR(900), CONCAT(p.FullPath, ehw.EmployeeID, '.')) AS FullPath
FROM paths AS p
JOIN EmployeeHierarchyWide AS ehw ON ehw.ManagerID = p.EmployeeID
)
SELECT * FROM paths order by FullPath
we can create nested cte.please see the below cte in example
;with cte_data as
(
Select * from [HumanResources].[Department]
),cte_data1 as
(
Select * from [HumanResources].[Department]
)
select * from cte_data,cte_data1
I was trying to measure the time between events with the exception of what one entry that has multiple processes between the start and end. I needed this in the context of other single line processes.
I used a select with an inner join as my select statement within the Nth cte. The second cte I needed to extract the start date on X and end date on Y and used 1 as an id value to left join to put them on a single line.
Works for me, hope this helps.
cte_extract
as
(
select ps.Process as ProcessEvent
, ps.ProcessStartDate
, ps.ProcessEndDate
-- select strt.*
from dbo.tbl_some_table ps
inner join (select max(ProcessStatusId) ProcessStatusId
from dbo.tbl_some_table
where Process = 'some_extract_tbl'
and convert(varchar(10), ProcessStartDate, 112) < '29991231'
) strt on strt.ProcessStatusId = ps.ProcessStatusID
),
cte_rls
as
(
select 'Sample' as ProcessEvent,
x.ProcessStartDate, y.ProcessEndDate from (
select 1 as Id, ps.Process as ProcessEvent
, ps.ProcessStartDate
, ps.ProcessEndDate
-- select strt.*
from dbo.tbl_some_table ps
inner join (select max(ProcessStatusId) ProcessStatusId
from dbo.tbl_some_table
where Process = 'XX Prcss'
and convert(varchar(10), ProcessStartDate, 112) < '29991231'
) strt on strt.ProcessStatusId = ps.ProcessStatusID
) x
left join (
select 1 as Id, ps.Process as ProcessEvent
, ps.ProcessStartDate
, ps.ProcessEndDate
-- select strt.*
from dbo.tbl_some_table ps
inner join (select max(ProcessStatusId) ProcessStatusId
from dbo.tbl_some_table
where Process = 'YY Prcss Cmpltd'
and convert(varchar(10), ProcessEndDate, 112) < '29991231'
) enddt on enddt.ProcessStatusId = ps.ProcessStatusID
) y on y.Id = x.Id
),
.... other ctes
Nested 'With' is not supported, but you can always use the second With as a subquery, for example:
WITH A AS (
--WITH B AS ( SELECT COUNT(1) AS _CT FROM C ) SELECT CASE _CT WHEN 1 THEN 1 ELSE 0 END FROM B --doesn't work
SELECT CASE WHEN count = 1 THEN 1 ELSE 0 END AS CT FROM (SELECT COUNT(1) AS count FROM dual)
union all
select 100 AS CT from dual
)
select CT FROM A