Query optimization to yield computational results

Table Structure
It is very difficult to add tables in this post, or at least I don't know how. I tried using HTML table tags, but they don't render well, so I am posting the table structure as an image.
Taking the 3 tables seen in the image (Projects, BC, Actual Spend) as a sample, I'm looking for an optimal query that returns the Report as the result. As you can see, BC has some computation, Actual Spend has
SELECT ProjectId, Name, Budget
, (SELECT b.[BC] FROM [BC] b
WHERE b.[BC] IN
(SELECT SUM(mx.[BC]) FROM [BC] mx
WHERE ProjectId=p.ProjectId)) AS 'BC'
, (SELECT sp.[ActualSpendAmount] FROM [ActualSpend] sp
WHERE sp.[DateSpent] IN
(SELECT MAX(as.[DateSpent]) FROM [ActualSpend] as
WHERE ProjectId=p.ProjectId)) AS 'Actual Spend'
, t.[Budget] - ((SELECT b.[BC] FROM [BC] b
WHERE b.[BC] IN
(SELECT SUM(mx.[BC]) FROM [BC] mx
WHERE ProjectId=p.ProjectId))
+
(SELECT sp.[ActualSpendAmount] FROM [ActualSpend] sp
WHERE sp.[DateSpent] IN
(SELECT MAX(as.[DateSpent]) FROM [ActualSpend] as
WHERE ProjectId=p.ProjectId)))
FROM Projects p;
As you can see, the SELECTs for BC and Actual Spend are each run twice. I have several other tables like BC and Actual Spend that yield some computation. Even if I put them in a function it would be the same: the function would need to be called more than once.
Is there a way to optimize this query?
Pasting the table structure below:
Projects Table:
ProjectId  Name      Budget
----------------------------
1          DeadRock  500000
2          HardRock  300000
BC Table:
ProjectId  BCId  BC     ApprovalDate
------------------------------------
1          1     5000   2015/02/01
1          2     3000   2015/03/10
1          3     15000  2015/05/01
1          4     5000   2015/07/01
2          5     2000   2015/03/19
2          6     6000   2015/05/20
2          7     25000  2015/08/01
Actual Spend Table:
ProjectId  ActualSpendId  ActualSpendAmount  DateSpent
------------------------------------------------------
1          1              15000              2015/03/01
1          2              33000              2015/05/12
1          3              45000              2015/06/03
1          4              75000              2015/07/11
2          5              5000               2015/04/20
2          6              19000              2015/05/29
2          7              42000              2015/06/23
2          8              85000              2015/07/15
Report (ETC = Budget - (BC + ActualSpend)):
ProjectId  Name      Budget   BC      Actual Spend  ETC
---------------------------------------------------------
1          DeadRock  500,000  28,000  75,000        397,000
2          HardRock  300,000  33,000  85,000        182,000

Based on your expected result your query is way too complicated (and will not run without errors).
Assuming your DBMS supports Windowed Aggregates:
SELECT p.ProjectId, p.Name, p.Budget,
       BC.BC,
       Act.ActualSpendAmount,
       p.Budget - (BC.BC + Act.ActualSpendAmount) AS ETC
FROM Projects AS p
LEFT JOIN
 ( -- sum of BC per project
   SELECT ProjectId, SUM(BC) AS BC
   FROM BC
   GROUP BY ProjectId
 ) AS BC
  ON BC.ProjectId = p.ProjectId
JOIN
 ( -- latest amount per project
   SELECT ProjectId, ActualSpendAmount,
          ROW_NUMBER()
          OVER (PARTITION BY ProjectId
                ORDER BY DateSpent DESC) AS rn
   FROM ActualSpend
 ) AS Act
  ON Act.ProjectId = p.ProjectId
 AND Act.rn = 1

Your correlated subquery for BC does not make any sense:
, (SELECT b.[BC] FROM [BC] b
WHERE b.[BC] IN
(SELECT SUM(mx.[BC]) FROM [BC] mx
WHERE ProjectId=p.ProjectId)) AS 'BC'
If we concentrate on projectID 1, you have the data here:
ProjectId BCId BC ApprovalDate
--------------------------------------------
1 1 5000 2015/02/01
1 2 3000 2015/03/10
1 3 15000 2015/05/01
1 4 5000 2015/07/01
Therefore this part of the query:
(SELECT SUM(mx.[BC]) FROM [BC] mx
WHERE ProjectId=p.ProjectId)
Will return 28,000. Then what you essentially have is:
, (SELECT b.[BC] FROM [BC] b
WHERE b.[BC] IN (28000)) AS 'BC'
Which will return null, unless there happens to be 1 and only 1 record in the table for any project with that particular amount in BC. If there is more than one you will get an error because more than one record is returned in the subquery.
I suspect, based on the report data you simply want the sum, so you can simply use:
SELECT p.ProjectId,
       p.Name,
       p.Budget,
       bc.BC
FROM Projects p
LEFT JOIN
     ( SELECT bc.ProjectID, SUM(bc.BC) AS bc
       FROM BC
       GROUP BY bc.ProjectID
     ) AS bc
     ON bc.ProjectID = p.ProjectID;
To get the SUM of BC for each project.
I have made an assumption here that you are using SQL Server based on the use of square brackets for object names, and syntax in your previous questions. In which case I would use OUTER APPLY to get the latest actual spend row, giving a final query of:
SELECT p.ProjectId,
       p.Name,
       p.Budget,
       bc.BC,
       sp.ActualSpendAmount AS [Actual Spend],
       p.Budget - (bc.BC + sp.ActualSpendAmount) AS ETC
FROM Projects p
LEFT JOIN
     ( SELECT bc.ProjectID, SUM(bc.BC) AS bc
       FROM BC
       GROUP BY bc.ProjectID
     ) AS bc
     ON bc.ProjectID = p.ProjectID
OUTER APPLY
     ( SELECT TOP 1 sp.ActualSpendAmount
       FROM [ActualSpend] AS sp
       WHERE sp.ProjectID = p.ProjectID
       ORDER BY sp.DateSpent DESC
     ) AS sp;
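To check the idea against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for SQL Server here (an assumption), so OUTER APPLY is swapped for a join against the latest DateSpent per project, and the aliases TotalBC, LastDate, b, a and m are made up for illustration. Each aggregate is computed once in a derived table and then reused for the ETC column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Projects (ProjectId INTEGER, Name TEXT, Budget INTEGER);
CREATE TABLE BC (ProjectId INTEGER, BCId INTEGER, BC INTEGER, ApprovalDate TEXT);
CREATE TABLE ActualSpend (ProjectId INTEGER, ActualSpendId INTEGER,
                          ActualSpendAmount INTEGER, DateSpent TEXT);
INSERT INTO Projects VALUES (1, 'DeadRock', 500000), (2, 'HardRock', 300000);
INSERT INTO BC VALUES
  (1, 1, 5000, '2015/02/01'), (1, 2, 3000, '2015/03/10'),
  (1, 3, 15000, '2015/05/01'), (1, 4, 5000, '2015/07/01'),
  (2, 5, 2000, '2015/03/19'), (2, 6, 6000, '2015/05/20'),
  (2, 7, 25000, '2015/08/01');
INSERT INTO ActualSpend VALUES
  (1, 1, 15000, '2015/03/01'), (1, 2, 33000, '2015/05/12'),
  (1, 3, 45000, '2015/06/03'), (1, 4, 75000, '2015/07/11'),
  (2, 5, 5000, '2015/04/20'), (2, 6, 19000, '2015/05/29'),
  (2, 7, 42000, '2015/06/23'), (2, 8, 85000, '2015/07/15');
""")

query = """
SELECT p.ProjectId, p.Name, p.Budget,
       bc.TotalBC,
       sp.ActualSpendAmount,
       -- ETC = Budget - (BC + ActualSpend); missing values count as 0
       p.Budget - (COALESCE(bc.TotalBC, 0) + COALESCE(sp.ActualSpendAmount, 0)) AS ETC
FROM Projects AS p
LEFT JOIN (            -- sum of BC per project, computed once
    SELECT b.ProjectId, SUM(b.BC) AS TotalBC
    FROM BC AS b
    GROUP BY b.ProjectId
) AS bc ON bc.ProjectId = p.ProjectId
LEFT JOIN (            -- latest spend row per project (stand-in for OUTER APPLY)
    SELECT a.ProjectId, a.ActualSpendAmount
    FROM ActualSpend AS a
    JOIN (SELECT ProjectId, MAX(DateSpent) AS LastDate
          FROM ActualSpend
          GROUP BY ProjectId) AS m
      ON m.ProjectId = a.ProjectId AND m.LastDate = a.DateSpent
) AS sp ON sp.ProjectId = p.ProjectId
ORDER BY p.ProjectId
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The yyyy/mm/dd strings sort correctly as text, which is why MAX(DateSpent) works in this sketch without a real date type.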

Related

How to create a query using the difference of two sums in separate tables?

I'm trying to calculate the quantity left in each water container, based on the refills it gets, and how much is extracted.
At the moment I have created my tables as:
CONTAINERS
----------
ID NUMBER
1 14F
2 12A
3 55Y
REFILLS
-------
ID CONTAINERID QUANTITY
1 14F 100
2 14F 10
3 12A 65
EXTRACTIONS
-----------
ID CONTAINERID QUANTITY
1 14F 20
So I need a query that will return each container with the amount that is left in them, i.e. in this case:
CONTAINERID CURRENTQUANTITY
14F 90
12A 65
55Y 0
Where 90 is the result from the two refills and one extraction in that case (100+10-20).
I have managed to calculate the sum of all refills/extractions:
SELECT CONTAINERS.ID, SUM(REFILLS.QUANTITY) AS REFILLSQUANTITY
FROM CONTAINERS INNER JOIN REFILLS ON CONTAINERS.ID = REFILLS.CONTAINERID
GROUP BY CONTAINERS.ID;
And the same way for extractions, but I'm a bit stuck how to combine them and get the difference in one query. Any help would be much appreciated!

In MS Access, use left join and aggregations:
select c.id, c.number,
nz(refill, 0) - nz(extraction, 0) as net
from (containers as c left join
(select containerid, sum(quantity) as refill
from refills
group by containerid
) as r
on c.number = r.containerid
) left join
(select containerid, sum(quantity) as extraction
from extractions
group by containerid
) as e
on c.number = e.containerid;
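The same pattern can be tried outside Access. Below is a small sketch with Python's sqlite3 module, assuming SQLite as a stand-in engine and using COALESCE in place of nz(). The data is the sample from the question, and as in the query above the join is on containers.number because that is what CONTAINERID holds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE containers (id INTEGER, number TEXT);
CREATE TABLE refills (id INTEGER, containerid TEXT, quantity INTEGER);
CREATE TABLE extractions (id INTEGER, containerid TEXT, quantity INTEGER);
INSERT INTO containers VALUES (1, '14F'), (2, '12A'), (3, '55Y');
INSERT INTO refills VALUES (1, '14F', 100), (2, '14F', 10), (3, '12A', 65);
INSERT INTO extractions VALUES (1, '14F', 20);
""")

query = """
SELECT c.number,
       COALESCE(r.refill, 0) - COALESCE(e.extraction, 0) AS net
FROM containers AS c
LEFT JOIN (SELECT containerid, SUM(quantity) AS refill
           FROM refills
           GROUP BY containerid) AS r
       ON c.number = r.containerid
LEFT JOIN (SELECT containerid, SUM(quantity) AS extraction
           FROM extractions
           GROUP BY containerid) AS e
       ON c.number = e.containerid
ORDER BY c.id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Aggregating each table before joining matters here: joining refills and extractions directly would multiply rows for a container that has several of each.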

Join tables based on dates with check

I have two tables in PostgreSQL:
Demans_for_parts:
demandid partid demanddate quantity
40 125 01.01.17 10
41 125 05.01.17 30
42 123 20.06.17 10
Orders_for_parts:
orderid partid orderdate quantity
1 125 07.01.17 15
54 125 10.06.17 25
14 122 05.01.17 30
Basically, Demans_for_parts says what to buy and Orders_for_parts says what we bought. We can buy parts which are not listed in Demans_for_parts.
I need a report which shows all parts in Demans_for_parts and how many weeks have passed since the most recent matching row in Orders_for_parts. Note that the quantity field is irrelevant here.
The expected result is (if there is more than one row per part, show the oldest):
partid demanddate weeks_since_recent_order
125 01.01.17 2 (last order is on 10.06.17)
123 20.06.17 Unhandled

I think the tricky part is getting one row per part from each table. But that is easy using distinct on. Then you need to calculate the months. You can use age() for this purpose:
select dp.partid, dp.demanddate,
       (extract(year from age(dp.demanddate, op.orderdate))*12 +
        extract(month from age(dp.demanddate, op.orderdate))
       ) as months
from (select distinct on (dp.partid) dp.*
      from demans_for_parts dp
      order by dp.partid, dp.demanddate desc
     ) dp left join
     (select distinct on (op.partid) op.*
      from Orders_for_parts op
      order by op.partid, op.orderdate desc
     ) op
     on dp.partid = op.partid;

Something like this?
with o as (
select distinct partid, max(orderdate) over (partition by partid)
from Orders_for_parts
)
, p as (
select distinct partid, min(demanddate) over (partition by partid)
from Demans_for_parts
)
select p.partid, min as demanddate, date_part('day',o.max - p.min)/7
from p
left outer join o on (p.partid = o.partid)
;
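One possible reading of the question is "whole weeks between some reference date and the most recent order for the part". The sketch below tries that reading with Python's sqlite3 module rather than PostgreSQL. Everything about the setup is an assumption made for illustration: SQLite as the engine, dates rewritten as ISO strings, and the reference date 2017-06-24 (passed as the hypothetical as_of parameter), which is not stated anywhere in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Demans_for_parts (demandid INTEGER, partid INTEGER,
                               demanddate TEXT, quantity INTEGER);
CREATE TABLE Orders_for_parts (orderid INTEGER, partid INTEGER,
                               orderdate TEXT, quantity INTEGER);
INSERT INTO Demans_for_parts VALUES
  (40, 125, '2017-01-01', 10), (41, 125, '2017-01-05', 30),
  (42, 123, '2017-06-20', 10);
INSERT INTO Orders_for_parts VALUES
  (1, 125, '2017-01-07', 15), (54, 125, '2017-06-10', 25),
  (14, 122, '2017-01-05', 30);
""")

query = """
SELECT d.partid,
       d.demanddate,
       -- whole weeks between the reference date and the latest order;
       -- NULL when the part was never ordered
       CAST((julianday(:as_of) - julianday(o.last_order)) / 7 AS INTEGER) AS weeks
FROM (SELECT partid, MIN(demanddate) AS demanddate      -- oldest demand per part
      FROM Demans_for_parts
      GROUP BY partid) AS d
LEFT JOIN (SELECT partid, MAX(orderdate) AS last_order  -- latest order per part
           FROM Orders_for_parts
           GROUP BY partid) AS o
       ON o.partid = d.partid
ORDER BY d.partid
"""

# as_of is an assumed reference date, chosen only for this illustration
rows = conn.execute(query, {"as_of": "2017-06-24"}).fetchall()
for partid, demanddate, weeks in rows:
    print(partid, demanddate, "Unhandled" if weeks is None else weeks)
```

In PostgreSQL the week arithmetic would be written differently (for example by subtracting dates directly), so treat the julianday() call as SQLite-specific.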

SQL Grouping a Count Select With Aggregate Total

I've been working on this for far too many hours now and hit the wall. Hoping an SQL guru can help shed some light.
SELECT
CATEGORY.CategoryID, CATEGORY.Category_Name, CATEGORY_SUB.CategoryID AS Expr1,
CATEGORY_SUB.SubCategory_Name, COUNT(SELL_1.Item_SubCategory) AS Count,
(SELECT COUNT(Item_Category) AS Expr10
FROM SELL WHERE (UserName = 'me')
GROUP BY Item_Category) AS Expr20
FROM SELL AS SELL_1 LEFT OUTER JOIN
CATEGORY ON
SELL_1.Item_Category = CATEGORY.Category_Name
LEFT OUTER JOIN CATEGORY_SUB ON
CATEGORY.CategoryID = CATEGORY_SUB.CategoryID AND SELL_1.Item_SubCategory = CATEGORY_SUB.SubCategory_Name WHERE (SELL_1.Seller_UserName = 'me') AND (SELL_1.Item_Removed IS NULL) AND (SELL_1.Item_Pause IS NULL) AND (SELL_1.Item_Expires > GETDATE())
GROUP BY CATEGORY.Category_Name, CATEGORY_SUB.SubCategory_Name, CATEGORY.CategoryID, CATEGORY_SUB.CategoryID
ORDER BY Count DESC
In short, the table returned should show the following columns, where Expr20 is a "sum" or aggregate of the total counts per CategoryName. For example:
CategoryID CategoryName Expr1 SubCategory_Name Count Expr20
1 CatA 200 SubCatA1 1 1
1 CatA 201 SubCatA2 2 3
1 CatA 202 SubCatA3 4 7
2 CatB 301 SubCatB1 1 1
2 CatB 302 SubCatB2 4 5
3 CatC 401 SubCatC1 3 3
3 CatC 402 SubCatC2 2 5
3 CatC 403 SubCatC3 4 9
And So on.
My problem is no matter what I do I cannot seem to get Expr20 to work.
It seems the problem is with MS SQL wanting the alias after the (SELECT COUNT(Item_Category) so then it throws the error because 2 columns are returned.
I'm running MS SQL 2005. Grateful for any help

I really struggled with this and in the end used what is maybe a more elegant solution, but potentially a more server-intensive one. I'm not sure, as I'm no SQL expert, but I wanted to post my solution.
SELECT T2.CategoryID, T1.Expr20, etc...
FROM
(
SELECT Item_Category, COUNT(Item_Category) AS Expr20
FROM SELL WHERE (UserName = 'me')
GROUP BY Item_Category
) T1
JOIN
(
SELECT CATEGORY.CategoryID, CATEGORY.Category_Name, CATEGORY_SUB.CategoryID AS Expr1, CATEGORY_SUB.SubCategory_Name, COUNT(SELL.Item_SubCategory) AS Count...etc as shown in the question) T2
ON T1.Item_Category = T2.Category_Name
ORDER BY T1.Expr20 DESC
It worked a treat and I got the table and results I needed, grouping the category names with the correct sum total per line.
So the trick was to make a select around the 2 selects rather than trying to join them directly, as that just doesn't seem possible.
Hope this helps someone and saves them the 13 hours of hair pulling I went through last night.

It is a bit hard to see what data you are starting with. But, assuming you have all columns except Expr20, you can use outer apply or a correlated subquery:
select t.*, t2.Expr20
from sell t outer apply
(select sum(count) as Expr20
from sell t2
where t2.CategoryId = t.CategoryId and
t2.expr1 <= t.expr1
) t2;
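To see the running total on the expected output from the question, here is a minimal sketch using Python's sqlite3 module. It assumes the per-subcategory counts already sit in a table, which I have called category_counts (a made-up name, as is the column Cnt), and it uses a correlated subquery because SQLite has no OUTER APPLY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- hypothetical table holding the already-grouped counts from the question
CREATE TABLE category_counts (CategoryID INTEGER, CategoryName TEXT,
                              Expr1 INTEGER, SubCategory_Name TEXT, Cnt INTEGER);
INSERT INTO category_counts VALUES
  (1, 'CatA', 200, 'SubCatA1', 1), (1, 'CatA', 201, 'SubCatA2', 2),
  (1, 'CatA', 202, 'SubCatA3', 4), (2, 'CatB', 301, 'SubCatB1', 1),
  (2, 'CatB', 302, 'SubCatB2', 4), (3, 'CatC', 401, 'SubCatC1', 3),
  (3, 'CatC', 402, 'SubCatC2', 2), (3, 'CatC', 403, 'SubCatC3', 4);
""")

query = """
SELECT t.CategoryID, t.Expr1, t.SubCategory_Name, t.Cnt,
       -- running total of Cnt within the category, ordered by Expr1
       (SELECT SUM(t2.Cnt)
        FROM category_counts AS t2
        WHERE t2.CategoryID = t.CategoryID
          AND t2.Expr1 <= t.Expr1) AS Expr20
FROM category_counts AS t
ORDER BY t.CategoryID, t.Expr1
"""

rows = conn.execute(query).fetchall()
running_totals = [row[4] for row in rows]
print(running_totals)
```

The subquery is re-evaluated per row, so on large tables a window SUM(...) OVER (...) would likely be cheaper where the engine supports it; SQL Server 2005 does not support ordered window sums.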

Query to join tables based on two criteria

First, I'm not sure the title adequately describes what I am trying to achieve, so please amend it as you see fit.
I have a table in an SQL database which records budget allocations and transfers.
Each allocation and transfer is recorded against a combination of two details: the year_ID and program_ID. Allocations can come from nowhere or from other year_id & program_id combinations; the latter are the transfers.
For example, year_ID 1 & program_ID 2 was allocated $1000, then year_ID 1 & program_ID 2 transferred $100 to year_ID 2 & program_id 2.
This is stored in the database like
From_year_ID From_program_ID To_year_ID To_program_ID Budget
null null 1 2 1000
1 2 2 2 100
The query needs to summarise these budget allocations based on the year_id + program_id combination, so the results would display:
year_ID program_ID Budget_Allocations Budget_Transfers
1 2 1000 100
2 2 100 0
I've spent two days trying to put this query together and am officially stuck - could someone help me out or point me in the right direction? I've tried what feels like every combination of left, right, inner, union joins, with etc - but haven't got the outcome I'm looking for.
Here is a sqlfiddle with sample data and one of the queries that doesn't quite work: http://sqlfiddle.com/#!3/9c1ec/1/0

I would sum the Budget by Program_ID and Year_ID in some CTEs and join those to the Program and Year tables to avoid summing Budget values more than once.
WITH
bt AS
(SELECT
To_Year_ID AS Year_ID,
To_Program_ID AS Program_ID,
SUM(Budget) AS Budget_Allocation
FROM T_Budget
GROUP BY
To_Year_ID,
To_Program_ID),
bf AS
(SELECT
From_Year_ID AS Year_ID,
From_Program_ID AS Program_ID,
SUM(Budget) AS Budget_Transfer
FROM T_Budget
GROUP BY
From_Year_ID,
From_Program_ID)
SELECT
y.Year_ID,
p.Program_id,
bt.Budget_Allocation,
bf.Budget_Transfer,
y.Short_Name + ' ' + p.Short_Name AS Year_Program,
isnull(bt.Budget_Allocation,0) -
isnull(bf.Budget_Transfer,0)AS Budget_Balance
FROM T_Programs p
CROSS JOIN T_Years y
INNER JOIN bt
ON bt.Program_ID = p.Program_ID
AND bt.Year_ID = y.Year_ID
LEFT JOIN bf
ON bf.Program_ID = p.Program_ID
AND bf.Year_ID = y.Year_ID
ORDER BY
y.Year_ID,
p.Program_ID
http://sqlfiddle.com/#!3/9c1ec/13
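In case the fiddle is unavailable, here is a cut-down sketch of the two-CTE idea using Python's sqlite3 module and the two sample rows from the question. It is only an approximation: SQLite stands in for SQL Server, the T_Programs and T_Years tables are left out because their columns are not shown in the question, and the NULL "from" side (allocations that come from nowhere) is filtered out of the transfer CTE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T_Budget (From_Year_ID INTEGER, From_Program_ID INTEGER,
                       To_Year_ID INTEGER, To_Program_ID INTEGER, Budget INTEGER);
INSERT INTO T_Budget VALUES (NULL, NULL, 1, 2, 1000), (1, 2, 2, 2, 100);
""")

query = """
WITH bt AS (   -- money received per year/program
    SELECT To_Year_ID AS Year_ID, To_Program_ID AS Program_ID,
           SUM(Budget) AS Budget_Allocation
    FROM T_Budget
    GROUP BY To_Year_ID, To_Program_ID
),
bf AS (        -- money passed on per year/program
    SELECT From_Year_ID AS Year_ID, From_Program_ID AS Program_ID,
           SUM(Budget) AS Budget_Transfer
    FROM T_Budget
    WHERE From_Year_ID IS NOT NULL
    GROUP BY From_Year_ID, From_Program_ID
)
SELECT bt.Year_ID, bt.Program_ID,
       bt.Budget_Allocation,
       COALESCE(bf.Budget_Transfer, 0) AS Budget_Transfers
FROM bt
LEFT JOIN bf
       ON bf.Year_ID = bt.Year_ID
      AND bf.Program_ID = bt.Program_ID
ORDER BY bt.Year_ID, bt.Program_ID
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because each side is summed before the join, a program with several incoming and several outgoing rows is not double counted.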

How do I fix this SQL query returning improper values?

I am writing an SQL query which will return a list of auctions a certain user is losing, like on eBay.
This is my table:
bid_id bid_belongs_to_auction bid_from_user bid_price
6 7 1 15.00
8 7 2 19.00
13 7 1 25.00
The problematic area is this (taken from my full query, placed at the end of the question):
AND EXISTS (
SELECT 1
FROM bids x
WHERE x.bid_belongs_to_auction = bids.bid_belongs_to_auction
AND x.bid_price > bids.bid_price
AND x.bid_from_user <> bids.bid_from_user
)
The problem is that the query returns all the auctions on which there are higher bids, but ignoring the user's even higher bids.
So, an example when the above query works:
bid_id bid_belongs_to_auction bid_from_user bid_price
6 7 1 15.00
7 7 2 18.00
In this case, user 1 is returned as losing the auction, because there is another bid higher than the user's bid.
But, here is when the query doesn't work:
bid_id bid_belongs_to_auction bid_from_user bid_price
6 7 1 15.00
8 7 2 19.00
13 7 1 25.00
In this case, user 1 is incorrectly returned as losing the auction, because there is another bid higher than one of his previous bids, but the user has already placed a higher bid over that.
I don't think it is needed to solve the problem above, but in case it matters, here is my full query:
$query = "
SELECT
`bid_belongs_to_auction`,
`auction_unixtime_expiration`,
`auction_belongs_to_hotel`,
`auction_seo_title`,
`auction_title`,
`auction_description_1`
FROM (
SELECT
`bid_belongs_to_auction`,
`bid_from_user`,
MAX(`bid_price`) AS `bid_price`,
`auctions`.`auction_enabled`,
`auctions`.`auction_unixtime_expiration`,
`auctions`.`auction_belongs_to_hotel`,
`auctions`.`auction_seo_title`,
`auctions`.`auction_title`,
`auctions`.`auction_description_1`
FROM `bids`
LEFT JOIN `auctions` ON `auctions`.`auction_id`=`bids`.`bid_belongs_to_auction`
WHERE `auction_enabled`='1' AND `auction_unixtime_expiration` > '$time' AND `bid_from_user`='$userId'
AND EXISTS (
SELECT 1
FROM bids x
WHERE x.bid_belongs_to_auction = bids.bid_belongs_to_auction
AND x.bid_price > bids.bid_price
AND x.bid_from_user <> bids.bid_from_user
)
GROUP BY `bid_belongs_to_auction`
) AS X
WHERE `bid_from_user`='$userId'
";

Here's a different approach:
$query = "
SELECT
`max_bids`.`bid_belongs_to_auction`,
`auctions`.`auction_unixtime_expiration`,
`auctions`.`auction_belongs_to_hotel`,
`auctions`.`auction_seo_title`,
`auctions`.`auction_title`,
`auctions`.`auction_description_1`
FROM `auctions`
INNER JOIN (
SELECT
`bid_belongs_to_auction`,
MAX(`bid_price`) AS `auction_max_bid`,
MAX(CASE `bid_from_user` WHEN '$userId' THEN `bid_price` END) AS `user_max_bid`
FROM `bids`
GROUP BY `bid_belongs_to_auction`
) AS `max_bids` ON `auctions`.`auction_id` = `max_bids`.`bid_belongs_to_auction`
WHERE `auctions`.`auction_enabled`='1'
AND `auctions`.`auction_unixtime_expiration` > '$time'
AND `max_bids`.`user_max_bid` IS NOT NULL
AND `max_bids`.`user_max_bid` <> `max_bids`.`auction_max_bid`
";
Basically, when you are retrieving the max bids for all the auctions, you are also retrieving the specific user's max bids along. Next step is to join the obtained list to the auctions table and apply an additional filter on the user's max bid being not equal to the auction's max bid.
Note: the `max_bids`.`user_max_bid` IS NOT NULL condition might be unnecessary. It would definitely be so in SQL Server, because the non-nullness would be implied by the `max_bids`.`user_max_bid` <> `max_bids`.`auction_max_bid` condition. I'm not sure if it's the same in MySQL.
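The conditional-aggregation part can be tried in isolation. The sketch below uses Python's sqlite3 module (SQLite standing in for MySQL, which is an assumption) with the three sample bids from the question and leaves the auctions table out. The function name losing_auctions is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bids (bid_id INTEGER, bid_belongs_to_auction INTEGER,
                   bid_from_user INTEGER, bid_price REAL);
INSERT INTO bids VALUES (6, 7, 1, 15.00), (8, 7, 2, 19.00), (13, 7, 1, 25.00);
""")

def losing_auctions(user_id):
    """Return auction ids where the user's best bid is below the top bid."""
    sql = """
    SELECT bid_belongs_to_auction
    FROM bids
    GROUP BY bid_belongs_to_auction
    -- the user's own maximum is NULL when they never bid, so such
    -- auctions drop out of the comparison automatically
    HAVING MAX(CASE WHEN bid_from_user = ? THEN bid_price END) < MAX(bid_price)
    ORDER BY bid_belongs_to_auction
    """
    return [row[0] for row in conn.execute(sql, (user_id,))]

print(losing_auctions(1))  # user 1 holds the top bid of 25.00
print(losing_auctions(2))  # user 2 was outbid on auction 7
```

Passing the user id as a bound parameter, rather than interpolating '$userId' into the string, also avoids SQL injection.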

Untested, but this is how I would approach it. Ought to perform OK if there's an index on userid and also one on auctionid.
select OurUserInfo.auctionid, OurUserInfo.userid,
       OurUserInfo.OurUsersMaxBid, WinningBids.TopPrice
from
(
  select A.auctionid, A.userid, max(A.price) as OurUsersMaxBid
  from auctions A where userid = ?
  group by A.auctionid, A.userid
) as OurUserInfo
inner join
(
  -- get the current winning bids for all auctions in which our user is bidding
  select RelevantAuctions.auctionid, max(auctions.price) as TopPrice
  from auctions inner join
  (
    select distinct auctionid from auctions where userid = ? -- get our user's auctions
  ) as RelevantAuctions
  on auctions.auctionid = RelevantAuctions.auctionid
  group by RelevantAuctions.auctionid
) as WinningBids
on OurUserInfo.auctionid = WinningBids.auctionid
where WinningBids.TopPrice > OurUserInfo.OurUsersMaxBid

Instead of
SELECT 1
FROM bids x
WHERE x.bid_belongs_to_auction = bids.bid_belongs_to_auction
AND x.bid_price > bids.bid_price
AND x.bid_from_user <> bids.bid_from_user
try this:
SELECT 1
FROM (SELECT BID_ID,
BID_BELONGS_TO_AUCTION,
BID_FROM_USER,
BID_PRICE
FROM (SELECT BID_ID,
BID_BELONGS_TO_AUCTION,
BID_FROM_USER,
BID_PRICE,
RANK ()
OVER (
PARTITION BY BID_BELONGS_TO_AUCTION, BID_FROM_USER
ORDER BY BID_PRICE DESC)
MY_RANK
FROM BIDS) ranked
WHERE MY_RANK = 1) x
WHERE x.bid_belongs_to_auction = bids.bid_belongs_to_auction
AND x.bid_price > bids.bid_price
AND x.bid_from_user <> bids.bid_from_user
