Select rows from query with a distinct foreign key?

I am having trouble with yet another SQL problem. I really need to take some time out to learn this properly.
Anyway, I have this query that someone else wrote, and it gets values from a few different tables.
More than one item can have the same ProductID, so there may be 3 items returned, all with the same ProductID but with different descriptions etc.
I want to select only 1 item per ProductID. I have tried using DISTINCT and GROUP BY, but I get a lot of errors. Also, this is for an Access database.
I think the logic used in the select query is messing up my grouping.
Here is the query (I have tried formatting it a little better with an online tool, but it's still a huge mess):
SELECT tblproducts.productid,
tblproducts.categorycode,
tblproducts.scaletitle,
tblproducts.picture,
tblitems.cost,
tblitems.modelnumber,
tblitems.itemid,
Iif([tblitems]![tradeapproved],Iif(([tblitems]![markup] / 100) <> 0,(Iif(([tblitems]![supplierdiscount] / 100) <> 0,
[tblitems]![cost] - ([tblitems]![cost] * ([tblitems]![supplierdiscount] / 100)),
[tblitems]![cost])) * ([tblitems]![markup] / 100),
0) + Iif(([tblitems]![supplierdiscount] / 100) <> 0,
[tblitems]![cost] - ([tblitems]![cost] * ([tblitems]![supplierdiscount] / 100)),
[tblitems]![cost]) + [tblitems]![tradeapprovedcost] + [tblitems]![shippingcost],
Iif(([tblitems]![markup] / 100) <> 0,(Iif(([tblitems]![supplierdiscount] / 100) <> 0,
[tblitems]![cost] - ([tblitems]![cost] * ([tblitems]![supplierdiscount] / 100)),
[tblitems]![cost])) * ([tblitems]![markup] / 100),
0) + Iif(([tblitems]![supplierdiscount] / 100) <> 0,
[tblitems]![cost] - ([tblitems]![cost] * ([tblitems]![supplierdiscount] / 100)),
[tblitems]![cost]) + [tblitems]![shippingcost]) AS price
FROM (tblitems
INNER JOIN tblproducts
ON tblitems.productid = tblproducts.productid)
INNER JOIN tblsuppliers
ON tblproducts.supplierid = tblsuppliers.supplierid
WHERE tblproducts.categorycode = 'BS'
AND tblitems.tradeapproved = 0
AND tblsuppliers.active = on
AND tblitems.isaccessory = false
ORDER BY Iif([tblitems]![tradeapproved],Iif(([tblitems]![markup] / 100) <> 0,(Iif(([tblitems]![supplierdiscount] / 100) <> 0,
[tblitems]![cost] - ([tblitems]![cost] * ([tblitems]![supplierdiscount] / 100)),
[tblitems]![cost])) * ([tblitems]![markup] / 100),
0) + Iif(([tblitems]![supplierdiscount] / 100) <> 0,
[tblitems]![cost] - ([tblitems]![cost] * ([tblitems]![supplierdiscount] / 100)),
[tblitems]![cost]) + [tblitems]![tradeapprovedcost] + [tblitems]![shippingcost],
Iif(([tblitems]![markup] / 100) <> 0,(Iif(([tblitems]![supplierdiscount] / 100) <> 0,
[tblitems]![cost] - ([tblitems]![cost] * ([tblitems]![supplierdiscount] / 100)),
[tblitems]![cost])) * ([tblitems]![markup] / 100),
0) + Iif(([tblitems]![supplierdiscount] / 100) <> 0,
[tblitems]![cost] - ([tblitems]![cost] * ([tblitems]![supplierdiscount] / 100)),
[tblitems]![cost]) + [tblitems]![shippingcost])
Can anyone post a quick fix for this? Thanks

Well, since you said you want to learn this stuff:
An inner join will connect Items to ProductIds but will return the full set of matching rows. So if you have 3 ProductIds and 1 Item you will get:
ProdId  ItemId  Description
1       1       Handy Dandy Randy Sandy!
2       1       Easily Accessible personal grooming comb.
3       1       This item provides a man or woman with extra re...
So what you really want to do is get all the ItemIds:
select ItemId from Itemtbl
And then loop over each result, getting a single ProductId per Item:
select top 1 ProductId from Producttbl where ItemId = 12345
Now anyone who suggests a loop with SQL gets yelled down, and (usually) rightly so. But this is a tough query to write, since it's not something people usually do.
You were along the right lines with group by. Group By says "consolidate all the rows that have the same value in column X into one row", where column X would be ItemId. So: give me one row per ItemId.
Now you have to pick a ProductId from those 3 Products with ItemId 1. The cheater way to do it is not to pick a ProductId at random but rather a ProductId that fits a particular aggregate function. The most common are min and max.
select
i.ItemId,
max(p.ProductId) as MaxProductId
from Itemtbl i
inner join Producttbl p
on i.ItemId = p.ItemId
group by i.ItemId
This will get the largest ProductId for each ItemId. You can do the same to get the minimum.
Now, what's trickier is finding a ProductId that fits a criterion, say the most recently updated. What you want to say is "select the ItemId and the max(updatedDate), and then pull the ProductId of that max updated date along", but that doesn't work in SQL (dear god I wish it did though).
This query will give bad results:
select
i.ItemId,
max(p.ProductId),
max(p.updatedDate)
from Itemtbl i
inner join Producttbl p
on i.ItemId = p.ItemId
group by i.ItemId
Because the max ProductId does not necessarily come from the row with the max updatedDate.
Instead you have to write a query that does this:
1. Selects the ItemId (e.g. 5) and the max updatedDate (e.g. 5/5/2005).
2. Goes back to Producttbl and finds the ProductId whose ItemId is 5 and whose updatedDate is 5/5/2005.
That query is left as an exercise. (But there's a bug! What if two products have the same last updated date and the same ItemId?)
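For reference, one way the exercise could be written is sketched below. It uses the hypothetical Producttbl from this answer with assumed column types and made-up rows, and it breaks ties with MIN(ProductId), which is an arbitrary choice added here rather than something the answer prescribes. The sample-data statements are generic SQL and may need adjusting for Access (date literals, one row per INSERT).

```sql
-- Hypothetical table, following the names used in this answer.
CREATE TABLE Producttbl (
    ProductId   INT PRIMARY KEY,
    ItemId      INT NOT NULL,
    updatedDate DATE NOT NULL
);

-- Illustrative rows only.
INSERT INTO Producttbl (ProductId, ItemId, updatedDate) VALUES
    (1, 5, '2005-05-01'),
    (2, 5, '2005-05-05'),
    (3, 5, '2005-05-05'),  -- ties with ProductId 2 on the latest date
    (4, 6, '2005-04-30');

-- Step 1 (derived table "latest"): the latest updatedDate per ItemId.
-- Step 2: join back to Producttbl to find the rows carrying that date.
-- MIN(ProductId) is the assumed tie-breaker, so each ItemId yields one row.
SELECT p.ItemId,
       MIN(p.ProductId) AS ProductId,
       latest.maxUpdated
FROM Producttbl AS p
INNER JOIN (
    SELECT ItemId, MAX(updatedDate) AS maxUpdated
    FROM Producttbl
    GROUP BY ItemId
) AS latest
    ON  latest.ItemId = p.ItemId
    AND latest.maxUpdated = p.updatedDate
GROUP BY p.ItemId, latest.maxUpdated;
```

With these rows, ItemId 5 resolves to ProductId 2 (the lower of the two tied products) and ItemId 6 to ProductId 4.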

The first step to increase readability is to create a view on tblItems that contains the fancy logic for Price, e.g.:
View [vwItemsWithAdjustedCost]
SELECT
ProductID,
TradeApproved,
IsAccessory,
Cost,
ModelNumber,
ItemID,
Markup,
TradeApprovedCost,
ShippingCost,
IIf(
( SupplierDiscount / 100 ) <> 0,
Cost - ( Cost * ( SupplierDiscount / 100 ) ),
Cost
) AS AdjustedCost
FROM tblItems
View [vwItemsWithPrice]
SELECT
ProductID,
TradeApproved,
IsAccessory,
Cost,
ModelNumber,
ItemID,
IIf(
( Markup / 100 ) <> 0,
AdjustedCost * ( Markup / 100 ),
0
)
+ AdjustedCost
+ IIf(
TradeApproved,
TradeApprovedCost,
0
)
+ ShippingCost AS Price
FROM vwItemsWithAdjustedCost
Next you have to decide on the criterion for picking one item out of the many that match the same ProductID: if 3 items have the same ProductID, which one do you want to show?
As stated by Tom, an easy way is to just get the first (lowest) ItemID that matches, something like this:
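SELECT
P.ProductID,
P.CategoryCode,
P.ScaleTitle,
P.Picture,
IP.Cost,
IP.ModelNumber,
IP.ItemID,
IP.Price
FROM
( tblProducts AS P
INNER JOIN (
SELECT
ProductID,
MIN( ItemID ) AS MinItemID
FROM tblItems
WHERE TradeApproved = 0
AND IsAccessory = False
GROUP BY ProductID
) AS S
ON S.ProductID = P.ProductID )
INNER JOIN vwItemsWithPrice AS IP
ON IP.ItemID = S.MinItemID
WHERE
P.CategoryCode = 'BS'
ORDER BY IP.Price
This says: for each ProductID, give me the first (lowest) ItemID among the qualifying rows of tblItems, then join that to my view. Access requires the parentheses around the first join when a query has more than one join. The item filters sit inside the subquery so that the lowest ItemID is chosen only from items that are not trade approved and not accessories; filtering afterwards could discard a product whose lowest ItemID happens to be an accessory. The tblSuppliers join and its active filter from the original query are not reproduced here; add them back if you need them.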
Hope this helps!

Related

How to add up values from multiple SQL columns based on occurrences

I need to select values from a table and return the total hours for all categories and their occurrences. The challenge is that there are different totals for each occurrence.
My query:
SELECT c.Category,
c.HrsFirstOccur,
c.HrsAddlOccur,
COUNT(*) AS Occurrences
FROM dbo.Categories sc
INNER JOIN dbo.Categories c
ON sc.CategoryID = c.CategoryID
INNER JOIN dbo.OrderHistory oh
ON sc.GONumber = oh.OrderNumber
AND sc.Item = oh.ItemNumber
WHERE sc.BusinessGroupID = 1
AND oh.OrderNumber = 500
AND oh.ItemNumber = '100'
GROUP BY c.Category, c.HrsFirstOccur, c.HrsAddlOccur
returns the following results:
Category  HrsFirstOccur  HrsAddlOccur  Occurrences
Inertia   24             16            2
Lights    1              0.5           4
Labor     10             0             1
The total is calculated based on the number of occurrences. The first one is totaled then for each additional occurrence, the HrsAddlOccur is used.
My final result should be (24 + 16) + (1 + 0.5 + 0.5 + 0.5) + 10 for a grand total of 52.5.
How do I loop and process the results to total this up?

"The total is calculated based on the number of occurrences. The first one is totaled then for each additional occurrence, the HrsAddlOccur is used."
SQL databases understand arithmetic. You can perform the computation on each row. As I understand, the logic you want is:
SELECT
c.Category,
c.HrsFirstOccur,
c.HrsAddlOccur,
COUNT(*) AS Occurrences,
c.HrsFirstOccur + ( COUNT(*) - 1 ) * c.HrsAddlOccur AS Total
FROM ... < rest of your query > ..
Later on you can aggregate the whole resultset to get the grand total:
SELECT SUM(Total) GrandTotal
FROM (
... < above query > ..
) t

You can simply sum them up:
WITH CTE as(SELECT c.Category,
c.HrsFirstOccur,
c.HrsAddlOccur,
COUNT(*) AS Occurrences
FROM dbo.Categories sc
INNER JOIN dbo.Categories c ON sc.CategoryID = c.CategoryID
INNER JOIN dbo.OrderHistory oh ON sc.GONumber = oh.OrderNumber
AND sc.Item = oh.ItemNumber
WHERE sc.BusinessGroupID = 1
AND oh.OrderNumber = 500
AND oh.ItemNumber = '100'
GROUP BY c.Category, c.HrsFirstOccur, c.HrsAddlOccur)
SELECT SUM(HrsFirstOccur + (CAST((Occurrences -1) AS DECIMAL(8,2)) * HrsAddlOccur)) as total FROM CTE
It works like this example, where a table stands in for the CTE result:
CREATE TABLE CTE
([Category] varchar(7), [HrsFirstOccur] int, [HrsAddlOccur] DECIMAL(8,2), [Occurrences] int)
;
INSERT INTO CTE
([Category], [HrsFirstOccur], [HrsAddlOccur], [Occurrences])
VALUES
('Inertia', 24, 16, 2),
('Lights', 1, 0.5, 4),
('Labor', 10, 0, 1)
;
3 rows affected
SELECT SUM(HrsFirstOccur + (CAST((Occurrences -1) AS DECIMAL(8,2)) * HrsAddlOccur)) as total
FROM CTE
total
52.5000

Out of range integer: infinity

I'm trying to work through a problem that's a bit hard to explain, and I can't expose any of the data I'm working with. What I'm trying to get my head around is the error below, which I get when running the query that follows. I've renamed some of the tables and columns for sensitivity reasons, but the structure should be the same.
"Error from Query Engine - Out of range for integer: Infinity"
WITH accounts AS (
SELECT t.user_id
FROM table_a t
WHERE t.type like '%Something%'
),
CTE AS (
SELECT
st.x_user_id,
ad.name as client_name,
sum(case when st.score_type = 'Agility' then st.score_value else 0 end) as score,
st.obs_date,
ROW_NUMBER() OVER (PARTITION BY st.x_user_id,ad.name ORDER BY st.obs_date) AS rn
FROM client_scores st
LEFT JOIN account_details ad on ad.client_id = st.x_user_id
INNER JOIN accounts on st.x_user_id = accounts.user_id
--WHERE st.x_user_id IN (101011115,101012219)
WHERE st.obs_date >= '2020-05-18'
group by 1,2,4
)
SELECT
c1.x_user_id,
c1.client_name,
c1.score,
c1.obs_date,
CAST(COALESCE (((c1.score - c2.score) * 1.0 / c2.score) * 100, 0) AS INT) AS score_diff
FROM CTE c1
LEFT JOIN CTE c2 on c1.x_user_id = c2.x_user_id and c1.client_name = c2.client_name and c1.rn = c2.rn +2
I know the query itself works, because when I get rid of the first CTE and hard-code 2 ids into the WHERE clause that I commented out, it returns the data I want. But I also need it to run based on the first CTE, which has ~5k unique ids.
Here is a sample output if I try with 2 ids:
Based on the number of rows returned per id above, I would expect it to return 5000 * 3 rows = 15000.
What could be causing the out of range for integer error?

This line is likely your problem:
CAST(COALESCE (((c1.score - c2.score) * 1.0 / c2.score) * 100, 0) AS INT) AS score_diff
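When the value of c2.score is 0, the division by c2.score yields infinity, which will not fit into the integer type that you’re trying to cast it to.
The reason it’s working for the two users in your example is that they don’t have a 0 value for c2.score.
You might be able to fix this by changing it as follows. NULLIF returns NULL when c2.score is 0, so the division yields NULL instead, and the COALESCE then replaces that NULL with 0: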
CAST(COALESCE (((c1.score - c2.score) * 1.0 / NULLIF(c2.score, 0)) * 100, 0) AS INT) AS score_diff
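To see the effect in isolation, here is a minimal sketch with two made-up rows; the column names c1_score and c2_score are placeholders for c1.score and c2.score, and the VALUES-in-FROM form is assumed to be supported by your engine.

```sql
-- Illustrative values only: the second row has a zero baseline score.
SELECT t.c1_score,
       t.c2_score,
       CAST(COALESCE((t.c1_score - t.c2_score) * 1.0
                     / NULLIF(t.c2_score, 0) * 100, 0) AS INT) AS score_diff
FROM (VALUES (10, 5), (10, 0)) AS t (c1_score, c2_score);
-- Row (10, 5): (10 - 5) * 1.0 / 5 * 100 = 100
-- Row (10, 0): NULLIF gives NULL, the division gives NULL, COALESCE gives 0
```

Note that a zero baseline then reports 0, the same as "no change". If you would rather tell the two cases apart, dropping the COALESCE leaves score_diff as NULL for those rows.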

Fetching Records from One table and making difference from same table gives irregular output

I want a single sum from this query over two tables, but I am getting the values separately:
SELECT
GrandTotal - RecPayAmount -
(
select
sum(detail.LineAmount)
From
TranPOSDetail as detail
where
detail.RefHeaderCode = TranPOSHeader.Code
and
EntryFlag = 4
)
from TranPOSHeader
where
VoucherTypeCode=2000
And
WalkInCustomerCode=200429
And
GrandTotal > RecPayAmount
My Output is Like
1) 10
2) 20
But I want it like -
1) 30
How can I modify this query to reflect the results I want?

Use a CTE and aggregate the total values:
WITH Amount
AS (
SELECT GrandTotal - RecPayAmount - (
SELECT sum(detail.LineAmount)
FROM TranPOSDetail AS detail
WHERE detail.RefHeaderCode = TranPOSHeader.Code
AND EntryFlag = 4
) TotalAmount
FROM TranPOSHeader
WHERE VoucherTypeCode = 2000
AND WalkInCustomerCode = 200429
AND GrandTotal > RecPayAmount
)
SELECT Sum(TotalAmount)
FROM Amount

Here is one simpler approach
SELECT Sum(GrandTotal - RecPayAmount - oa.Total_LineAmount)
FROM TranPOSHeader th
OUTER APPLY (SELECT Sum(d.LineAmount)
FROM TranPOSDetail AS d
WHERE d.RefHeaderCode = th.Code
AND d.EntryFlag = 4) oa (Total_LineAmount)
WHERE VoucherTypeCode = 2000
AND WalkInCustomerCode = 200429
AND GrandTotal > RecPayAmount

Without changing the existing query, you can wrap it in an outer SUM like this:
SELECT SUM(T.S) AS Total FROM
(
SELECT
(GrandTotal - RecPayAmount -
(
select
sum(detail.LineAmount)
From
TranPOSDetail as detail
where
detail.RefHeaderCode = TranPOSHeader.Code
and
EntryFlag = 4
)
) AS S
from TranPOSHeader
where
VoucherTypeCode=2000
And
WalkInCustomerCode=200429
And
GrandTotal > RecPayAmount
) T
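One caveat applies to all three queries in this thread: if a header has no detail rows with EntryFlag = 4, the inner SUM returns NULL, the subtraction becomes NULL, and the outer SUM silently skips that header. The sketch below wraps the inner sum in ISNULL to treat "no detail rows" as 0. The table variables, column types, and rows are assumptions for illustration, chosen so that the per-header values are the 10 and 20 from the question.

```sql
-- Stand-in tables with assumed types and illustrative rows.
DECLARE @TranPOSHeader TABLE (Code INT, VoucherTypeCode INT, WalkInCustomerCode INT,
                              GrandTotal DECIMAL(18, 2), RecPayAmount DECIMAL(18, 2));
DECLARE @TranPOSDetail TABLE (RefHeaderCode INT, EntryFlag INT, LineAmount DECIMAL(18, 2));

INSERT INTO @TranPOSHeader VALUES (1, 2000, 200429, 100, 80), (2, 2000, 200429, 50, 30);
INSERT INTO @TranPOSDetail VALUES (1, 4, 10);  -- header 2 has no matching detail rows

SELECT SUM(th.GrandTotal - th.RecPayAmount - ISNULL(d.Total_LineAmount, 0)) AS Total
FROM @TranPOSHeader AS th
OUTER APPLY (SELECT SUM(det.LineAmount)
             FROM @TranPOSDetail AS det
             WHERE det.RefHeaderCode = th.Code
               AND det.EntryFlag = 4) AS d (Total_LineAmount)
WHERE th.VoucherTypeCode = 2000
  AND th.WalkInCustomerCode = 200429
  AND th.GrandTotal > th.RecPayAmount;
-- Header 1: 100 - 80 - 10 = 10; header 2: 50 - 30 - 0 = 20; Total = 30
```

Without the ISNULL, header 2 would evaluate to NULL and the total would come out as 10.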

query with calculations most efficient way

I have written the query below which works fine & produces the correct results. However I feel this is probably not terribly efficient as my SQL experience is quite limited.
The main thing that sticks out is where I calculate the nominal differences & price differences, these two lines.
1. isnull(hld.Nominal, 0) - isnull(nav.Nominal, 0) NomDiff
2. isnull((hld.Price / nav.LocalPrice - 1) * 100, 0)
Because I also have to put both these lines in the where condition, so the same calculations are being calculated twice. What is a better way of writing this query?
;WITH hld AS
(
SELECT id,
name,
Nominal,
Price
FROM tblIH
),
nav AS
(
SELECT id,
name,
Nominal,
LocalPrice
FROM tblNC
)
SELECT COALESCE(hld.id, nav.id) id,
COALESCE(nav.name, hld.name) name,
ISNULL(hld.Nominal, 0) HldNom,
ISNULL(nav.Nominal, 0) NavNom,
ISNULL(hld.Nominal, 0) - ISNULL(nav.Nominal, 0) NomDiff,
ISNULL(hld.Price, 0) HldPrice,
ISNULL(nav.LocalPrice, 0) NavPrice,
ISNULL((hld.Price / nav.LocalPrice - 1) * 100, 0)
FROM hld
FULL OUTER JOIN nav ON hld.id = nav.id
WHERE ISNULL(hld.Nominal, 0) - ISNULL(nav.Nominal, 0) <> 0
OR ISNULL((hld.Price / nav.LocalPrice - 1) * 100, 0) <> 0

First select without the WHERE condition and treat the result as a derived table tmp, then apply the WHERE condition to the columns NomDiff and PriceDiff:
;WITH hld AS
(
SELECT id,
name,
Nominal,
Price
FROM tblIH
),
nav AS
(
SELECT id,
name,
Nominal,
LocalPrice
FROM tblNC
)
select *
from (SELECT COALESCE(hld.id, nav.id) id,
COALESCE(nav.name, hld.name) name,
ISNULL(hld.Nominal, 0) HldNom,
ISNULL(nav.Nominal, 0) NavNom,
ISNULL(hld.Nominal, 0) - ISNULL(nav.Nominal, 0) NomDiff,
ISNULL(hld.Price, 0) HldPrice,
ISNULL(nav.LocalPrice, 0) NavPrice,
ISNULL((hld.Price / nav.LocalPrice - 1) * 100, 0) PriceDiff
FROM hld
FULL OUTER JOIN nav ON hld.id = nav.id) tmp
where NomDiff <> 0 or PriceDiff <> 0

You can move the calculation into a further CTE; the final SELECT can then select or filter on the calculated fields like normal columns, without repeating the calculations.
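One way to read this suggestion is sketched below. Because the differences need columns from both tables, the join and the calculations go together into a single CTE, called diff here (a hypothetical name). The table variables, column types, and rows are assumptions standing in for tblIH and tblNC, whose definitions are not shown in the question.

```sql
-- Assumed column types and illustrative rows.
DECLARE @tblIH TABLE (id INT, name VARCHAR(50), Nominal DECIMAL(18, 2), Price DECIMAL(18, 4));
DECLARE @tblNC TABLE (id INT, name VARCHAR(50), Nominal DECIMAL(18, 2), LocalPrice DECIMAL(18, 4));

INSERT INTO @tblIH VALUES (1, 'A', 100, 10.0), (2, 'B', 50, 20.0);
INSERT INTO @tblNC VALUES (1, 'A', 100, 10.0), (3, 'C', 75, 5.0);

-- Each difference is written once, inside the CTE ...
;WITH diff AS
(
    SELECT COALESCE(hld.id, nav.id)     AS id,
           COALESCE(nav.name, hld.name) AS name,
           ISNULL(hld.Nominal, 0) - ISNULL(nav.Nominal, 0)   AS NomDiff,
           ISNULL((hld.Price / nav.LocalPrice - 1) * 100, 0) AS PriceDiff
    FROM @tblIH AS hld
    FULL OUTER JOIN @tblNC AS nav ON hld.id = nav.id
)
-- ... and then filtered by name in the outer query.
SELECT id, name, NomDiff, PriceDiff
FROM diff
WHERE NomDiff <> 0
   OR PriceDiff <> 0;
```

With these rows, id 1 is filtered out (both differences are 0) and ids 2 and 3 are returned. This is mainly a readability gain: the expression is written once, but the optimizer typically expands the CTE inline, so the execution plan is usually much the same as for the original query.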

Why is this query with a nested select faster when I include the where clause twice

I had a large sql query that had a nested select in the from clause.
Similar to this:
SELECT * FROM
( SELECT * FROM SOME_TABLE WHERE some_num = 20)
WHERE some_num = 20
In my SQL query, if I remove the outer WHERE some_num = 20, it takes 5 times as long. Shouldn't these queries run in almost exactly the same time? If anything, wouldn't the additional WHERE slow it down slightly?
What am I not understanding about how SQL queries work?
Here is the original query in question:
SELECT a.ITEMNO AS Item_No,
a.DESCRIPTION AS Item_Description,
UNITPRICE / 100 AS Retail_Price,
b.UNITSALES AS Units_Sold,
( Dollar_Sales ) AS Dollar_Sales,
( Dollar_Cost ) AS Dollar_Cost,
( Dollar_Sales ) - ( Dollar_Cost ) AS Gross_Profit,
( Percent_Page * c.PAGECOST ) AS Page_Cost,
( Dollar_Sales - Dollar_Cost - ( Percent_Page * c.PAGECOST ) ) AS Net_Profit,
Percent_Page * 100 AS Percent_Page,
( CASE
WHEN UNITPRICE = 0 THEN NULL
WHEN Percent_Page = 0 THEN NULL
WHEN ( Dollar_Sales - Dollar_Cost - ( Percent_Page * c.PAGECOST ) ) > 0 THEN 0
ELSE ( ceiling(abs(Dollar_Sales - Dollar_Cost - ( Percent_Page * c.PAGECOST )) / ( UNITPRICE / 100 )) )
END ) AS Break_Even,
b.PAGENO AS Page_Num
FROM (SELECT PAGENO,
OFFERITEM,
UNITSALES,
UNITPRICE,
( DOLLARSALES / 100 ) AS Dollar_Sales,
( DOLLARCOST / 10000 ) AS Dollar_Cost,
(( CAST(STUFF(PERCENTPAGE, 2, 0, '.') AS DECIMAL(9, 6)) )) AS Percent_Page
FROM OFFERITEMS
WHERE LEFT(OFFERITEM, 6) = 'CH1301'
AND PERCENTPAGE > 0) AS b
INNER JOIN ITEMMAST a
ON a.EDPNO = 1 * RIGHT(OFFERITEM, 8)
LEFT JOIN OFFERS c
ON c.OFFERNO = 'CH1301'
WHERE LEFT(OFFERITEM, 6) = 'CH1301'
ORDER BY Net_Profit DESC
Notice the two occurrences of
WHERE left(OFFERITEM,6) = 'CH1301'
If I remove the outer WHERE, the query takes 5 times as long.
As requested, here is the execution plan (excuse the crappy upload):
http://i.imgur.com/1PqmpVf.png

Is the column OFFERITEM in an index but PERCENTPAGE is not?
In your inner query you reference both these columns, in the outer query you only reference OFFERITEM.
Difficult to say without seeing the execution plan, but it could be that the outer query is causing the optimizer to run an 'index scan' whereas the inner query would cause a full table scan.
On a separate note, you should definitely modify:
WHERE left(OFFERITEM,6) ='CH1301'
to:
where offeritem like 'CH1301%'
This allows an index seek if there is an index on OFFERITEM, whereas wrapping the column in LEFT() prevents one.
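The difference can be sketched as follows. The table definition is a minimal stand-in with assumed column types, and the index name is hypothetical; whether a seek is actually chosen depends on your data and should be confirmed in the execution plan.

```sql
-- Minimal stand-in for OFFERITEMS (assumed types; the real table has more columns).
CREATE TABLE OFFERITEMS (
    PAGENO    INT,
    OFFERITEM VARCHAR(14),
    UNITSALES INT
);

-- Hypothetical index name.
CREATE INDEX IX_OFFERITEMS_OFFERITEM ON OFFERITEMS (OFFERITEM);

-- Function applied to the column: the predicate cannot be matched
-- directly against the index, so a scan is the likely outcome.
SELECT PAGENO, OFFERITEM, UNITSALES
FROM OFFERITEMS
WHERE LEFT(OFFERITEM, 6) = 'CH1301';

-- Prefix pattern with no leading wildcard: the predicate can be
-- answered with a range seek on the index.
SELECT PAGENO, OFFERITEM, UNITSALES
FROM OFFERITEMS
WHERE OFFERITEM LIKE 'CH1301%';
```

Both queries return the same rows for fixed-position prefixes like this one; only the access path differs.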
