How to fix a query that produces too many rows?

I'm designing a Firebird 3.0 database for service sales, for example for beauty salons.
The database has these tables:
Serv - the list of services;
ServRecs - service sales records;
Docs - service documents;
Calc - service calculations, i.e. which raw materials a specific service uses, in what quantity, etc.;
RecsOut - raw material output records (sales);
RecsIn - raw material input records;
Inventory - names and properties of raw materials and goods.
Their columns:
Serv: Id, name, qnt, Vat...
ServRecs: Id, serv_id, Doc_id, qnt...
Docs: doc_id, docN, DocDT, Summ, ...
Calc: Id, serv_id, RawMat_id, qnt, unit_id...
RecsOut: id, doc_id, good_id, RecsIn_id
RecsIn: id, good_id...
Inventory: id, name (the raw material's or good's name)...
Let me explain with an example:
Take service document 323. It uses two services: serv_id=7 (hair cutting) and serv_id=8 (hair washing). The qnt field of the ServRecs table shows that service 8 was used twice (two washes, before and after coloring) and service 7 only once. According to the Calc table, service 7 uses 15 ml of raw material 11446 and 15 ml of raw material 11448, and service 8 uses 10 ml of raw material 11450. The totals used are therefore: raw material 11446 - 15 ml, 11448 - 15 ml and 11450 - 20 ml (2 * 10 ml).
My query looks like this:
select
i.id,
i.name as UsedRawMaterialName,
s.name as ServiceName,
ro.doc_id as ServiceDoc_id,
ri.cost as CostofRawMaterial,
sr.qnt as ServiceQnt, --used service quantity, for example, 2 times washing
sr.qnt*c.qnt as UsedRawMaterialQnt, --used service quantity*rawmaterial's used for 1 service
i.unit_k
from Inventory I, RecsOut ro, RecsIn ri, calc c, servrecs sr, serv s, Docs d, unit u
where
d.doc_id= ro.doc_id and d.doc_id=sr.doc_id and d.doc_id=323 and
s.id=c.serv_id and sr.serv_id=c.serv_id and
c.rawmat_id=i.id and
ro.recsIn_id=ri.id and
i.unit_k=u.unit_k
My aim is to get a result like this:
However, the query returns redundant records and wrong values, like this:
What is wrong with my query?
Update 1:
I replaced the old-style join syntax with the new-style join syntax and easily found that the error was in the "Join RecsOut ro on ro.id=i.id" clause. The new-style join is much more visually informative than the old style.
select
i.id,
i.name as UsedRawMaterialName,
s.name as ServiceName,
ro.doc_id as ServiceDoc_id,
ri.cost as CostofRawMaterial,
sr.qnt as ServiceQnt, --used service quantity, for example, 2 times washing
sr.qnt*c.qnt as UsedRawMaterialQnt, --used service quantity*rawmaterial's used for 1 service
i.unit_k
from
Inventory I Join RecsOut ro on ro.id=i.id
Join RecsIn ri on ro.recsin_id=ri.id
Join calc c on c.rawmat_id=i.id
join ServRecs sr on sr.serv_id=c.serv_id
Join serv s on s.id=c.serv_id
Join doc d on d.doc_id=ro.doc_id and
d.doc_id=sr.doc_id and
d.doc_id=323
join unit u on i.unit_k=u.unit_k

@basti A major benefit of the new-style join is that each table can be brought in one at a time during development and testing. With each table joined, it is straightforward to see which relationship has generated more (or indeed fewer) records than you were expecting.
Translating your code to the new-style syntax shows me there could be breakage somewhere. Thanks for replying to the comment ...
from Inventory I
join RecsOut ro on ro.recsIn_id=ri.id
-- ??? join RecsIn ri, --- ??
join calc c on c.rawmat_id=i.id
join servrecs sr on sr.serv_id=c.serv_id
join serv s on s.id=c.serv_id
join Docs d on d.doc_id= ro.doc_id
and d.doc_id=sr.doc_id
and d.doc_id=323
join unit u on i.unit_k=u.unit_k
Don't forget to embrace inner, left and outer joins.
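The "one table at a time" advice can be sketched as follows. This is only an illustration in Python with SQLite rather than Firebird; the table contents are taken from the document 323 example, and the assumption that RecsOut.good_id points at the same raw material as Calc.RawMat_id is mine, not something the question confirms.

```python
import sqlite3

# In-memory SQLite stand-in for the Firebird schema (trimmed to three tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calc (serv_id INTEGER, rawmat_id INTEGER, qnt INTEGER);
    CREATE TABLE servrecs (doc_id INTEGER, serv_id INTEGER, qnt INTEGER);
    CREATE TABLE recsout (id INTEGER, doc_id INTEGER, good_id INTEGER);
    INSERT INTO calc VALUES (7, 11446, 15), (7, 11448, 15), (8, 11450, 10);
    INSERT INTO servrecs VALUES (323, 7, 1), (323, 8, 2);
    INSERT INTO recsout VALUES (1, 323, 11446), (2, 323, 11448), (3, 323, 11450);
""")

def count_rows(from_clause):
    """Return how many rows a FROM clause produces."""
    return conn.execute("SELECT COUNT(*) FROM " + from_clause).fetchone()[0]

# Add one join at a time and watch the row count after each step.
step1 = "servrecs sr JOIN calc c ON c.serv_id = sr.serv_id"
# Joining RecsOut on the document only: every material row meets every output row.
loose = step1 + " JOIN recsout ro ON ro.doc_id = sr.doc_id"
# Assumed missing predicate: the output row must be for the same raw material.
tight = loose + " AND ro.good_id = c.rawmat_id"

for label, clause in (("step1", step1), ("loose", loose), ("tight", tight)):
    print(label, count_rows(clause))

# Quantities used per raw material once the join is tight.
used = conn.execute(
    "SELECT c.rawmat_id, sr.qnt * c.qnt FROM " + tight + " ORDER BY c.rawmat_id"
).fetchall()
print(used)
```

The step whose row count jumps is the join that lacks a predicate; here that is the RecsOut join tied to the document only.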

Related

3 Tables into dropdown list

I have 3 tables, Clients, Products, Transactions.
When we enter a product, it is given a PID (product id) and a CID (client ID), which relate to the Clients and Transaction tables. The transaction table has a CID and Quantity.
I am trying to list all unique products and quantities. Some clients have 2 listings of the same product, so if one is 10 units and the other is 20, then that client has 30 of product A.
The transactions table lists all sales, which are subtracted from the total.
I need the query to show the product name, client name, quantity available.
Here is the code I have so far, apologies for the mess and thanks much for any help.
This is an Access database.
SELECT Min(Products.PID) exPID,
Min(Products.[Product Name]) AS exProdName,
Min(Products.[Seller Asking]) AS exAsking,
Min(Products.CID) AS exClientID,
Min(Transactions.[CID Seller]) AS exSellerID,
Sum(Products.Quantity - ((SELECT Transactions.[No Units], Clients.Name,
Transactions.[CID Seller], Products.CID
FROM Transactions, Clients, Products
WHERE Transactions[CID Seller]=Products.CID)
) AS exSumofTrans),
Min(Clients.Name) AS exClientName,
Min(Transactions.[CID Seller]) AS exSeller
FROM Transactions, Clients, Products
WHERE (((Transactions.[CID Seller])=[Products].[CID]
AND (Products.[PID])=([Transactions].[PID])));
First issue here is an error on the inner select.
The error says:
'Syntax error in query expression (Sum(Products.Quantity-((Select Transactions.[No Units], Clients.Name, Transactions.[CID Seller], Products.CID
FROM Transactions, Clients, Products
Where Transactions[CID Seller]=Products.CID)) as exSumofTrans)'.

I had to guess at what you are trying to do, and I have no way of testing this, so it may need further tweaking, but it should get you on the right path:
SELECT Products.PID AS exPID,
FIRST(Products.[Product Name]) AS exProdName,
FIRST(Products.[Seller Asking]) AS exAsking,
FIRST(Products.CID) AS exClientID,
SUM(Products.Quantity)-FIRST(T.exNoUnits) AS exSumofTrans,
T.exSellerID,
FIRST(Clients.Name) AS exClientName
FROM (Products
INNER JOIN (SELECT [CID Seller] AS exSellerID, PID, SUM([No Units]) AS exNoUnits
FROM Transactions GROUP BY [CID Seller], PID) AS T
ON T.PID=Products.PID)
INNER JOIN Clients ON T.exSellerID=Clients.CID
GROUP BY Products.PID, T.exSellerID
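The key idea in the query above is to total the transactions in a derived table before joining, so that each product row is counted once. A minimal sketch of that idea, in Python with SQLite instead of Access, might look like this; the column names (product_name, no_units, cid_seller) and all sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (cid INTEGER, name TEXT);
    CREATE TABLE products (pid INTEGER, cid INTEGER, product_name TEXT, quantity INTEGER);
    CREATE TABLE transactions (pid INTEGER, cid_seller INTEGER, no_units INTEGER);
    INSERT INTO clients VALUES (100, 'Client A'), (200, 'Client B');
    INSERT INTO products VALUES
        (1, 100, 'Widget', 10), (2, 100, 'Widget', 20), (3, 200, 'Gadget', 5);
    INSERT INTO transactions VALUES (1, 100, 3), (1, 100, 2), (2, 100, 5);
""")

# Joining the raw transactions repeats product 1 once per sale,
# so the listed quantity is overstated (40 instead of 30).
naive_listed = conn.execute("""
    SELECT SUM(p.quantity)
    FROM products p JOIN transactions t ON t.pid = p.pid
    WHERE p.cid = 100
""").fetchone()[0]

# Pre-aggregating sales per product keeps one row per product before grouping.
available = conn.execute("""
    SELECT c.name, p.product_name,
           SUM(p.quantity) - SUM(COALESCE(t.sold, 0)) AS available
    FROM products p
    JOIN clients c ON c.cid = p.cid
    LEFT JOIN (SELECT pid, SUM(no_units) AS sold
               FROM transactions GROUP BY pid) AS t ON t.pid = p.pid
    GROUP BY c.name, p.product_name
    ORDER BY c.name
""").fetchall()

print(naive_listed)
print(available)
```

The LEFT JOIN is a choice of this sketch so that products with no sales still appear; the answer's INNER JOIN would drop them.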

T-SQL JOIN Table On Self Based on Closest Date

Thank you in advance for reading!
The question I'm trying to answer is: "How much do parts really cost to make?" We manufacture by machining raw metal billets down to metal parts. Final parts are sold to a customer and scrap metal from the process is sold to the scrap yard.
For business/ERP configuration reasons our scrap vendor is listed as a customer and we ship him 'parts' like our other customers. These dummy parts are simply for each of the metal alloys we work with, so there is one dummy scrap part for each alloy we use. The scrap shipments are made whenever we fill our scrap bins so there's no defined time interval.
I'm trying to connect the ship date of a real part to a real customer to the closest scrap ship date of the same alloy. Then I can grab the scrap value per pound we were paid and include it in our revenue for the parts we make. If I can ask for the world it would be helpful to know how to grab the scrap shipment immediately before or immediately after the shipment of a real part - I'm sure management will change their minds several times debating if they want to use the 'before' or 'after' number.
I've tried other solutions and can't get them to work. I'm crying uncle, I simply can't get it to work....the web SQL interface our ERP uses claims it's T-SQL... thank you for reading this far!
What I'd like the output to look like is:
Customer  Part     Price  Alloy  Weight_Lost  Scrap_Value  Ship_Date
ABC       Widget1  99.99  C182   63           2.45         10-01-2016
Here's the simplest I can boil the tables down to:
SELECT
tbl_Regular_Sales.Customer
,tbl_Regular_Sales.Part
,tbl_Regular_Sales.Price
,tbl_Regular_Sales.Alloy
,tbl_Regular_Sales.Weight_Lost
,tbl_Scrap_Sales.Price AS 'Scrap_Value'
,tbl_Regular_Sales.Ship_Date
FROM
(SELECT P.Part
,P.Alloy
,P.Price
,S.Ship_Date
,S.Customer
FROM Part AS P
JOIN Shipper AS S
ON S.Part_Key = P.Part_Key
WHERE S.Customer = 'Scrap_Yard'
) AS tbl_Scrap_Sales
JOIN
(SELECT P.Part
,P.Weight_Lost
,P.Alloy
,P.Price
,S.Ship_Date
,S.Customer
FROM Part AS P
JOIN Shipper AS S
ON S.Part_Key = P.Part_Key
WHERE S.Customer <> 'Scrap_Yard' ) AS tbl_Regular_Sales
ON
tbl_Regular_Sales.Alloy = tbl_Scrap_Sales.Alloy
AND <Some kind of date JOIN to get the closest scrap shipment value>

Something like this may do the trick:
WITH cteScrapSales AS (
SELECT
P.Alloy
,P.Price
,S.Ship_Date
FROM Part AS P
JOIN Shipper AS S ON S.Part_Key = P.Part_Key
WHERE S.Customer = 'Scrap_Yard'
), cteRegularSales AS (
SELECT
P.Part_Key
,P.Part
,P.Weight_Lost
,P.Alloy
,P.Price
,S.Ship_Date
,S.Customer
FROM Part AS P
JOIN Shipper AS S ON S.Part_Key = P.Part_Key
WHERE S.Customer <> 'Scrap_Yard'
)
SELECT
C.Customer
,C.Part
,C.Price
,C.Alloy
,C.Weight_Lost
,C.Scrap_Value
,C.Ship_Date
FROM (
SELECT R.*, S.Price AS Scrap_Value, ROW_NUMBER() OVER (PARTITION BY R.Part_Key ORDER BY DATEDIFF(SECOND, R.Ship_Date, S.Ship_Date)) ix
FROM cteRegularSales R
JOIN cteScrapSales S ON S.Alloy = R.Alloy AND S.Ship_Date > R.Ship_Date
) AS C
WHERE C.ix = 1;
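Since the question also asks for the scrap shipment immediately before or immediately after a part shipment, here is a rough sketch of both lookups. It swaps the ROW_NUMBER approach for correlated subqueries with ORDER BY ... LIMIT 1, and runs on SQLite from Python; in T-SQL the equivalent shape would be TOP 1, for instance inside an OUTER APPLY. The flattened tables and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical flattened tables; ISO dates so text comparison sorts correctly.
    CREATE TABLE regular_sales (part TEXT, alloy TEXT, ship_date TEXT);
    CREATE TABLE scrap_sales (alloy TEXT, price REAL, ship_date TEXT);
    INSERT INTO regular_sales VALUES ('Widget1', 'C182', '2016-10-01');
    INSERT INTO scrap_sales VALUES
        ('C182', 2.30, '2016-09-20'),
        ('C182', 2.45, '2016-10-05'),
        ('C182', 2.60, '2016-11-01');
""")

rows = conn.execute("""
    SELECT r.part,
           -- latest scrap shipment of the same alloy on or before the part shipment
           (SELECT s.price FROM scrap_sales s
             WHERE s.alloy = r.alloy AND s.ship_date <= r.ship_date
             ORDER BY s.ship_date DESC LIMIT 1) AS scrap_before,
           -- earliest scrap shipment of the same alloy on or after the part shipment
           (SELECT s.price FROM scrap_sales s
             WHERE s.alloy = r.alloy AND s.ship_date >= r.ship_date
             ORDER BY s.ship_date ASC LIMIT 1) AS scrap_after
    FROM regular_sales r
""").fetchall()

print(rows)
```

Keeping both columns side by side means the choice between the "before" and "after" number can be made later without rewriting the join.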

SQL Left Outer Join on Subquery

I am attempting to build a query that contains a left join subquery - based on the principles I learned in a previous question - that should pull similar data sets from two different tables. The goal is to compare volume data by account || platform to ensure that the stored procedure that creates one table from another is doing so correctly.
The idea is this:
Account || Product || T1Vol || T2Vol
abc     || AT      || 10    || 10
def     || RT      || 20    || 25
ghi     || OB      || 30    ||
So with this example, the idea is to pull all accounts and products from T1 (the table the procedure acts on) and any accounts and products from T2 (the newly created table) where there is a match (so, Left Join on T1 = T2). (Ideally, everything will match perfectly, with no variance in T1 vs T2 vol and no nulls in T2 volume).
I wrote the following query to accomplish this, but it's not quite working. The current error I get is "not a GROUP BY expression", which I don't think is the real issue. I have been searching and iterating to no avail.
The query is below. (To keep with the example, T1 = OpStats and T2 = RegSplits.) Any help is much appreciated.
SELECT DTA.trading_code Account, OpStats.product_dwkey Platform, SUM(OpStats.risk_amount_adj)/1000000 OpStatsVol, RegSplits.Volume RegSplitsVol
FROM fact_trade_presplit_rollup OpStats
INNER JOIN dim_trading_accounts DTA ON OpStats.trading_dwkey=DTA.trading_dwkey
LEFT OUTER JOIN
( SELECT b.trading_Code Account, a.product_dwkey Platform, SUM(a.risk_amount_adj)/1000000 Volume
FROM fact_trade_rollup a
INNER JOIN dim_trading_accounts b on a.trading_dwkey=b.trading_dwkey
WHERE a.account_type IN('Customer','Taker')
AND a.date_key>='01-JAN-16'
AND a.date_key<='31-MAR-16'
AND a.daily_db_metric NOT IN ('Manual Treasury Volume ($B)', 'Manual Volume ($B)', 'HSBC-WL POMS (Internal) Volume ($B)','JPMC-WL Order Book (Internal) Volume ($B)')
AND (a.product_dwkey IN('RT','HWL') AND a.source_name<>'STP')
GROUP BY b.trading_code, a.product_dwkey ) RegSplits
ON (DTA.trading_code = RegSplits.Account) /* is it because I am trying to join DTA to the subquery */
WHERE OpStats.account_type IN('Customer','Taker')
AND OpStats.date_key>='01-JAN-16'
AND OpStats.date_key<='31-MAR-16'
AND OpStats.daily_db_metric NOT IN ('Manual Treasury Volume ($B)', 'Manual Volume ($B)', 'HSBC-WL POMS (Internal) Volume ($B)','JPMC-WL Order Book (Internal) Volume ($B)')
AND (OpStats.product_dwkey IN('RT','HWL') AND OpStats.source_name<>'STP')
GROUP BY DTA.trading_code, OpStats.product_dwkey;

The "Not group by expression" error is very easy to check.
Just compare SELECT expressions with GROUP BY expressions:
SELECT DTA.trading_code Account,
OpStats.product_dwkey Platform,
SUM(OpStats.risk_amount_adj)/1000000 OpStatsVol,
RegSplits.Volume RegSplitsVol
FROM ......
......
GROUP BY DTA.trading_code,
OpStats.product_dwkey;
There are two elements in SELECT that are not in GROUP BY:
SUM(OpStats.risk_amount_adj)/1000000 OpStatsVol
RegSplits.Volume RegSplitsVol
The first is fine: it is an aggregate function, so it cannot appear in GROUP BY.
The second causes the error: it is not an aggregate function, and it is not listed in the GROUP BY clause.
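One possible shape of the fix is sketched below with simplified, made-up tables (t1, t2 with account, platform, vol) in Python with SQLite. Note two assumptions: SQLite does not raise this error itself, so the sketch only shows the corrected form, and joining on platform as well as account is my addition, since the subquery is grouped by both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (account TEXT, platform TEXT, vol INTEGER);
    CREATE TABLE t2 (account TEXT, platform TEXT, vol INTEGER);
    INSERT INTO t1 VALUES ('abc', 'AT', 4), ('abc', 'AT', 6),
                          ('def', 'RT', 20), ('ghi', 'OB', 30);
    INSERT INTO t2 VALUES ('abc', 'AT', 10), ('def', 'RT', 25);
""")

rows = conn.execute("""
    SELECT t1.account, t1.platform, SUM(t1.vol) AS t1_vol, r.vol AS t2_vol
    FROM t1
    LEFT JOIN (SELECT account, platform, SUM(vol) AS vol
               FROM t2 GROUP BY account, platform) AS r
      ON r.account = t1.account AND r.platform = t1.platform
    -- r.vol is not an aggregate, so it is listed in GROUP BY;
    -- wrapping it as MAX(r.vol) would be the other way to satisfy the rule.
    GROUP BY t1.account, t1.platform, r.vol
    ORDER BY t1.account
""").fetchall()

print(rows)
```

Because the derived table already holds one row per account and platform, adding its volume to GROUP BY does not split any groups.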

fetching single child row based on a max value using Django ORM

I have a model, "Market" that has a one-to-many relation to another model, "Contract":
class Market(models.Model):
    name = ...
    ...
class Contract(models.Model):
    name = ...
    market = models.ForeignKey(Market, ...)
    current_price = ...
I'd like to fetch Market objects along with the contract with the maximum price of each. This is how I'd do it via raw SQL:
SELECT M.id as market_id, M.name as market_name, C.name as contract_name, C.price
as price from pm_core_market M INNER JOIN
(SELECT market_id, id, name, MAX(current_price) as price
FROM pm_core_contract GROUP BY market_id) AS C
ON M.id = C.market_id
Is there a way to implement this without using SQL? If there is, which one should be preferred in terms of performance?

Django 1.1 (currently beta) adds aggregation support to the database API. Your query can be done like this:
from django.db.models import Max, F
Contract.objects.annotate(max_price=Max('market__contract__current_price')).filter(current_price=F('max_price')).select_related()
This generates the following SQL query:
SELECT contract.id, contract.name, contract.market_id, contract.current_price, MAX(T3.current_price) AS max_price, market.id, market.name
FROM contract LEFT OUTER JOIN market ON (contract.market_id = market.id) LEFT OUTER JOIN contract T3 ON (market.id = T3.market_id)
GROUP BY contract.id, contract.name, contract.market_id, contract.current_price, market.id, market.name
HAVING contract.current_price = MAX(T3.current_price)
The API uses an extra join instead of a subquery (like your query does). It is difficult to tell which query is faster, especially without knowing the database system. I suggest that you do some benchmarks and decide.
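For comparison, the subquery form can be written so that it does not rely on selecting non-grouped columns next to MAX(). The following is a small sketch using Python's sqlite3 module rather than the Django ORM; the table names and rows are illustrative and are not the real pm_core tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE market (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contract (id INTEGER PRIMARY KEY, market_id INTEGER,
                           name TEXT, current_price REAL);
    INSERT INTO market VALUES (1, 'Election'), (2, 'Weather');
    INSERT INTO contract VALUES (1, 1, 'A', 0.4), (2, 1, 'B', 0.7),
                                (3, 2, 'Rain', 0.55);
""")

rows = conn.execute("""
    SELECT m.name, c.name, c.current_price
    FROM market m
    -- one row per market holding its highest contract price
    JOIN (SELECT market_id, MAX(current_price) AS max_price
          FROM contract GROUP BY market_id) AS top ON top.market_id = m.id
    -- join back to contract to pick up the row that has that price
    JOIN contract c ON c.market_id = m.id AND c.current_price = top.max_price
    ORDER BY m.id
""").fetchall()

print(rows)
```

If two contracts in a market tie for the highest price, this form returns both of them.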

Nested SELECT Statement

SQL is not my forte, but I'm working on it - thank you for the replies.
I am working on a report that will return the completion percentage of services for individuals in our contracts. There is a master table, "Contracts"; each individual contract can have multiple services from the "services" table, and each service has multiple standards from the "standards" table, which records the percent complete for each standard.
I've gotten as far as calculating the total percent complete for each individual service for a specific Contract_ServiceID, but how do I return all the services percentages for all the contracts? Something like this:
Contract     Service    Percent complete
abc Company  service 1  98%
abc Company  service 2  100%
xyz Company  service 1  50%
Here's what I have so far:
SELECT
Contract_ServiceId,
(SUM(CompletionPercentage)/COUNT(CompletionPercentage)) * 100 as "Percent Complete"
FROM dbo.Standard sta WITH (NOLOCK)
INNER JOIN dbo.Contract_Service conSer ON sta.ServiceId = conSer.ServiceId
LEFT OUTER JOIN dbo.StandardResponse standResp ON sta.StandardId = standResp.StandardId
AND conSer.StandardReportId = standResp.StandardReportId
WHERE Contract_ServiceId = '[an id]'
GROUP BY Contract_ServiceID
This gets me to:
Contract_serviceid  Percent Complete
[an id]             100%
EDIT: Tables didn't show up in post.

I'm not sure I understand the problem. If the result is OK for one Contract_Service, you can join the Contract table and group by the contract and service columns:
SELECT con.ContractId,
con.Contract,
conSer.Contract_ServiceID,
conSer.Service,
(SUM(CompletionPercentage)/COUNT(CompletionPercentage)) * 100 as "Percent Complete"
FROM dbo.Standard sta WITH (NOLOCK)
INNER JOIN dbo.Contract_Service conSer ON sta.ServiceId = conSer.ServiceId
INNER JOIN dbo.Contract con ON con.ContractId = conSer.ContractId
LEFT OUTER JOIN dbo.StandardResponse standResp ON sta.StandardId = standResp.StandardId
AND conSer.StandardReportId = standResp.StandardReportId
GROUP BY con.ContractId, con.Contract, conSer.Contract_ServiceID, conSer.Service
Make sure every column you select from the Contract table is also in the GROUP BY clause.

You should be able to add in your select the company name and group by that and the service id and ditch the where clause...
Perhaps like this:
SELECT
Contract,
Contract_ServiceId,
(SUM(CompletionPercentage)/COUNT(CompletionPercentage)) * 100 as "Percent Complete"
FROM dbo.Standard sta WITH (NOLOCK)
INNER JOIN dbo.Contract_Service conSer ON sta.ServiceId = conSer.ServiceId
LEFT OUTER JOIN dbo.StandardResponse standResp ON sta.StandardId = standResp.StandardId
AND conSer.StandardReportId = standResp.StandardReportId
GROUP BY Contract, Contract_ServiceID

Assuming your query works for just the one service, it looks like you're most of the way there. Leave off the WHERE clause to obtain all results; your GROUP BY will take care of one service per result.
Just join on the Contract table to show the contract related to each service, and you're done.

In addition to removing the where clause and adding more group conditions, you also will want to watch out for null records in each of your tables. This requires changing an INNER JOIN to a LEFT JOIN (unless you don't want to see those rows) and some ISNULL's to clean up data. I'm not sure where the StandardReportId concept falls in here, but it looks like a filtering mechanism that I won't toy with.
SELECT
con.ContractID,
ISNULL(conSer.Contract_ServiceId, '-1') AS Contract_ServiceId, -- or some other stand-in value
ISNULL((SUM(CompletionPercentage)/COUNT(CompletionPercentage)) * 100, 0) as "Percent Complete"
FROM
Contract AS con
LEFT OUTER JOIN dbo.Contract_Service conSer ON con.ContractID = conSer.ContractID
LEFT OUTER JOIN dbo.Standard sta WITH (NOLOCK) ON conSer.ServiceId = sta.ServiceId
LEFT OUTER JOIN dbo.StandardResponse standResp ON sta.StandardId = standResp.StandardId
AND conSer.StandardReportId = standResp.StandardReportId
GROUP BY
con.ContractID, conSer.Contract_ServiceID

Because you are grouping by the contract serviceid, I think you can just remove the WHERE clause and it should calculate the percentage for all contract serviceids.
If there are no records in dbo.Standard for that contract serviceid, you may need to left outer join instead from the contract service table to the dbo.Standard table in order to show contracts without completion records.
I hope that makes sense... My SQL is getting rusty after migrating to a data framework.

(SUM(CompletionPercentage)/COUNT(CompletionPercentage)) * 100
If CompletionPercentage is an int column, you will run into trouble with integer math. Whenever you divide one integer by another, multiply one operand by 1.0 first so the value is treated as a decimal. Otherwise 49/100 would equal 0.
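The truncation is easy to reproduce. The sketch below uses SQLite from Python, which truncates integer division the same way T-SQL does; the responses table and its 0/1 completion values are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE responses (completion INTEGER);
    INSERT INTO responses VALUES (1), (0), (1), (1);
""")

# Integer / integer truncates toward zero.
int_div = conn.execute("SELECT 49 / 100").fetchone()[0]
# Multiplying by 1.0 first forces decimal arithmetic.
dec_div = conn.execute("SELECT 49 * 1.0 / 100").fetchone()[0]

# Same effect on the percentage expression: 3 / 4 truncates to 0 before * 100.
truncated = conn.execute(
    "SELECT SUM(completion) / COUNT(completion) * 100 FROM responses"
).fetchone()[0]
exact = conn.execute(
    "SELECT SUM(completion) * 1.0 / COUNT(completion) * 100 FROM responses"
).fetchone()[0]

print(int_div, dec_div, truncated, exact)
```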
