MySQL GROUP BY date with multiple joins - sql

SELECT
tba.UpdatedDate AS UpdatedDate,
tsh.SupplierID,
ts.ProductCode as ProductCode,
sum(tba.AfterDiscount) as AfterDiscount,
sum(tba.Quantity) as Quantity
FROM
tblstockhistory as tsh
left join tblstock as ts
on tsh.StockID=ts.StockID
left join tblbasket as tba
on ts.ProductCode=tba.ProductCode
and tsh.SupplierID=49
AND tba.Status=3
group by
tba.UpdatedDate
ORDER BY
Quantity DESC
I have a supplier table. The supplier ID is stored in the tblstockhistory table, which also contains the StockID (a reference to the tblstock table). The tblstock table contains StockID and ProductCode.
I also have the tblbasket table, in which I maintain the ProductCode.
My idea is this:
I want to show the stats by SupplierID: when I pass a supplier ID, the query should show the sales stats for the goods that supplier supplied.
But the above query sometimes returns NULL values, and it takes too long to execute, around 50 seconds.
I want something like the following from the above query:
Date SupplierID, Amount, Quantity
2010-12-12 12 12200 20
2010-12-12 40 10252 30
2010-12-12 10 12551 50
2010-12-13 22 1900 20
2010-12-13 40 18652 30
2010-12-13 85 19681 50
2010-12-15 22 1900 20
2010-12-15 40 18652 30
2010-12-15 85 19681 50

Does a tblstockhistory row ever exist without a StockID? If it doesn't, you can convert the join to an inner join, which can help.
e.g.
tblstockhistory as tsh
INNER join tblstock as ts
on tsh.StockID=ts.StockID
Also, you might want to consider adding indexes if they don't currently exist.
At the very least I would have the following fields indexed, since they will likely be joined and queried commonly.
tblstockhistory.StockID
tblstockhistory.SupplierID
tblstock.StockID
tblstock.ProductCode
tblbasket.ProductCode
tblbasket.Status
tblbasket.UpdatedDate
Finally, if it's really important that this query be lightning fast, you can create summary tables and update them periodically.

Rewrite the GROUP BY clause as follows and try again:
group by
tba.UpdatedDate, tsh.SupplierID
You have mentioned ProductCode in your query but not in the result you wanted. If you want to display ProductCode as well, add it to the GROUP BY clause; otherwise remove it from the SELECT clause.
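The suggestions in the answers above (inner joins, grouping by both date and supplier) can be tried on toy data. The following is a minimal sketch using Python's sqlite3 module rather than MySQL; table and column names follow the question, while the rows are invented. Moving the supplier and status filters into a WHERE clause is an assumption about what the asker intended, not something the answers state.

```python
import sqlite3

# Toy data: table and column names follow the question; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblstock (StockID INTEGER PRIMARY KEY, ProductCode TEXT);
CREATE TABLE tblstockhistory (StockID INTEGER, SupplierID INTEGER);
CREATE TABLE tblbasket (ProductCode TEXT, UpdatedDate TEXT, Status INTEGER,
                        AfterDiscount REAL, Quantity INTEGER);
INSERT INTO tblstock VALUES (1, 'P1'), (2, 'P2');
INSERT INTO tblstockhistory VALUES (1, 49), (2, 12);
INSERT INTO tblbasket VALUES
  ('P1', '2010-12-12', 3, 100.0, 2),
  ('P1', '2010-12-12', 3, 50.0, 1),
  ('P1', '2010-12-13', 3, 70.0, 4),
  ('P1', '2010-12-13', 1, 999.0, 9),  -- Status is not 3, so it is filtered out
  ('P2', '2010-12-12', 3, 30.0, 5);   -- belongs to another supplier
""")

# Inner joins plus WHERE filters (assumption), grouped by date and supplier.
sales_by_day = conn.execute("""
SELECT tba.UpdatedDate, tsh.SupplierID,
       SUM(tba.AfterDiscount) AS AfterDiscount,
       SUM(tba.Quantity) AS Quantity
FROM tblstockhistory AS tsh
INNER JOIN tblstock AS ts ON tsh.StockID = ts.StockID
INNER JOIN tblbasket AS tba ON ts.ProductCode = tba.ProductCode
WHERE tsh.SupplierID = ? AND tba.Status = 3
GROUP BY tba.UpdatedDate, tsh.SupplierID
ORDER BY tba.UpdatedDate
""", (49,)).fetchall()

for row in sales_by_day:
    print(row)
```

Note that if tblstockhistory holds several rows for the same StockID, each basket row is joined once per history row and the sums are multiplied accordingly, so the real data may need de-duplication first.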

Related

Join tables based on dates with check

I have two tables in PostgreSQL:
Demans_for_parts:
demandid partid demanddate quantity
40 125 01.01.17 10
41 125 05.01.17 30
42 123 20.06.17 10
Orders_for_parts:
orderid partid orderdate quantity
1 125 07.01.17 15
54 125 10.06.17 25
14 122 05.01.17 30
Basically, Demans_for_parts says what to buy and Orders_for_parts says what we bought. We can buy parts that are not listed in Demans_for_parts.
I need a report that shows all parts in Demans_for_parts and how many weeks have passed since the most recent matching row in Orders_for_parts. Note that the quantity field is irrelevant here.
The expected result is (if there is more than one row per part, show the oldest):
partid demanddate weeks_since_recent_order
125 01.01.17 2 (last order is on 10.06.17)
123 20.06.17 Unhandled
I think the tricky part is getting one row per part from each table. But that is easy using distinct on. Then you need to calculate the months. You can use age() for this purpose:
select dp.partid, dp.demanddate,
(extract(year from age(dp.demanddate, op.orderdate))*12 +
extract(month from age(dp.demanddate, op.orderdate))
) as months
from (select distinct on (dp.partid) dp.*
from demans_for_parts dp
order by dp.partid, dp.demanddate desc
) dp left join
(select distinct on (op.partid) op.*
from Orders_for_parts op
order by op.partid, op.orderdate desc
) op
on dp.partid = op.partid;
Something like this?
with o as (
select distinct partid, max(orderdate) over (partition by partid)
from Orders_for_parts
)
, p as (
select distinct partid, min(demanddate) over (partition by partid)
from Demans_for_parts
)
select p.partid, min as demanddate, date_part('day',o.max - p.min)/7
from p
left outer join o on (p.partid = o.partid)
;
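Both answers above reduce each table to one row per part and then left join the two. As a rough check of that idea, here is a sketch in Python's sqlite3 (not PostgreSQL) on the question's sample rows. The dates are rewritten in ISO format, and REPORT_DATE is a hypothetical reference date, since the question does not say which date the weeks are counted up to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Demans_for_parts (demandid INTEGER, partid INTEGER,
                               demanddate TEXT, quantity INTEGER);
CREATE TABLE Orders_for_parts (orderid INTEGER, partid INTEGER,
                               orderdate TEXT, quantity INTEGER);
INSERT INTO Demans_for_parts VALUES
  (40, 125, '2017-01-01', 10), (41, 125, '2017-01-05', 30), (42, 123, '2017-06-20', 10);
INSERT INTO Orders_for_parts VALUES
  (1, 125, '2017-01-07', 15), (54, 125, '2017-06-10', 25), (14, 122, '2017-01-05', 30);
""")

REPORT_DATE = "2017-06-24"  # hypothetical "today"; not given in the question

# Oldest demand per part, newest order per part, then whole weeks elapsed.
report = conn.execute("""
SELECT d.partid, d.first_demand,
       CAST((julianday(?) - julianday(o.last_order)) / 7 AS INTEGER)
         AS weeks_since_recent_order
FROM (SELECT partid, MIN(demanddate) AS first_demand
      FROM Demans_for_parts GROUP BY partid) AS d
LEFT JOIN (SELECT partid, MAX(orderdate) AS last_order
           FROM Orders_for_parts GROUP BY partid) AS o
  ON d.partid = o.partid
ORDER BY d.partid DESC
""", (REPORT_DATE,)).fetchall()

for row in report:
    print(row)
```

Parts with no matching order come back with NULL (None in Python) in the last column, which corresponds to the "Unhandled" row in the expected result.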

SQL counting query

Sorry if this is a basic question.
Basically, I have a table that is as follows, below is a basic sample
store-ProdCode-result
13p I10x 5
13p I20x 7
13p I30x 8
14a K38z 23
17a K38z 23
my data set has nearly 100,000 records.
What I'm trying to do is, for every store find the top 10 prodCode.
I am unsure of how to do this but what I tried was:
select s_code as store, prod_code,count (prod_code)
from top10_secondary
where prod_code is not null
group by store,prod_code
order by count(prod_code) desc limit 10
This is giving me something completely different, and I'm unsure how to go about achieving my final result.
All help is appreciated.
Thanks
The expected output should be: for every store(s_code) display the top 10 prodcode
so:
store--prodcode--result
1a abc 5
1a abd 4
2a dgf 1
2a ldk 6
.(10 times until next store code)
You can use the table twice in the FROM clause, once for the data, and once to get a count of how many records have higher results for that store.
SELECT a.s_code, a.prod_code, a.result
FROM top10_secondary a
LEFT OUTER JOIN top10_secondary b
ON a.s_code = b.s_code
AND b.result > a.result
GROUP BY a.s_code, a.prod_code, a.result
HAVING count(b.prod_code) < 10
With this technique though, you may get more than 10 records per store if the 10th result value exists multiple times, because the limit rule is simply "include a record as long as there are fewer than 10 records with result values higher than mine".
It looks like in your case, "result" is a ranking, so they would not be duplicated per store.
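A small way to sanity-check the self-join counting idea is to run it on a handful of rows. This sketch uses Python's sqlite3 with invented data in the shape of the question's sample, keeps a row when fewer than TOP_N rows in the same store have a higher result, and uses TOP_N = 2 instead of 10 only to keep the data small. Counting a column from the joined copy (rather than every row) is an assumption made here so that rows with no higher-ranked neighbour count as zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE top10_secondary (s_code TEXT, prod_code TEXT, result INTEGER);
INSERT INTO top10_secondary VALUES
  ('13p', 'I10x', 5), ('13p', 'I20x', 7), ('13p', 'I30x', 8),
  ('14a', 'K38z', 23), ('14a', 'K40z', 2);
""")

TOP_N = 2  # the question asks for 10; 2 keeps the toy data small

# For each row "a", count rows "b" in the same store with a higher result.
top_rows = conn.execute("""
SELECT a.s_code, a.prod_code, a.result
FROM top10_secondary AS a
LEFT JOIN top10_secondary AS b
  ON a.s_code = b.s_code AND b.result > a.result
GROUP BY a.s_code, a.prod_code, a.result
HAVING COUNT(b.prod_code) < ?
ORDER BY a.s_code, a.result DESC
""", (TOP_N,)).fetchall()

for row in top_rows:
    print(row)
```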
This is a good case for Window functions.
SELECT
s_code,
prod_code,
prod_count
FROM
(
SELECT
s_code,
prod_code,
prod_count,
RANK() OVER (PARTITION BY s_code ORDER BY prod_Count DESC) as prod_rank
FROM
(SELECT s_code, prod_code, count(prod_code) AS prod_count FROM top10_secondary GROUP BY s_code, prod_code) t1
) t2
WHERE prod_rank <= 10
The innermost query gets the count of each product at each store. The middle query ranks those products within each store by that count. The outermost query then limits the results based on that rank.
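The three nested levels can be tried on a few rows. This is a sketch in Python's sqlite3, which supports window functions only from SQLite 3.25 onward; the rows are invented, and the cut-off is 2 rather than 10 to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE top10_secondary (s_code TEXT, prod_code TEXT);
INSERT INTO top10_secondary VALUES
  ('13p', 'I10x'), ('13p', 'I10x'), ('13p', 'I10x'),
  ('13p', 'I20x'), ('13p', 'I20x'),
  ('13p', 'I30x'),
  ('14a', 'K38z');
""")

# Count per store/product, rank within each store, keep the top ranks.
ranked = conn.execute("""
SELECT s_code, prod_code, prod_count
FROM (
  SELECT s_code, prod_code, prod_count,
         RANK() OVER (PARTITION BY s_code ORDER BY prod_count DESC) AS prod_rank
  FROM (SELECT s_code, prod_code, COUNT(prod_code) AS prod_count
        FROM top10_secondary
        GROUP BY s_code, prod_code) AS t1
) AS t2
WHERE prod_rank <= ?
ORDER BY s_code, prod_rank
""", (2,)).fetchall()

for row in ranked:
    print(row)
```

Because RANK() gives tied counts the same rank, a store can return more rows than the cut-off when products tie; ROW_NUMBER() would be the alternative if exactly N rows per store are required.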

Merge output as one

Select Category, Books.ISBN,
Orderitems.Quantity * (Books.Retail - Books.Cost) AS Category_Profit
From BOoks
INNER JOIN Orderitems
ON BOOKS.ISBN=ORDERITEMS.ISBN
Example output:
Category ISBN Category_Profit
Family life 1234 50
Family Life 1234 50
Family Life 1234 100
Fitness 4321 10
Fitness 4321 20
So forth and so forth,
How can I make the output calculate all the values for each category into one line?
I.e
Family Life 1234 200
Fitness 4321 30
Because you already have this as a starting point, use exactly what you have as a temp table, and pull data from that:
Select Category, ISBN, Sum(Category_Profit) From
(
select Category, Books.ISBN as ISBN,
Orderitems.Quantity * (Books.Retail - Books.Cost) AS Category_Profit
From Books
INNER JOIN Orderitems
ON BOOKS.ISBN=ORDERITEMS.ISBN
) temp
group by Category, ISBN
You organized the data really well, so implementing a sum on the Profit is easy. You group by Category and ISBN to get all unique pairs of those columns with the corresponding Profit.
If you do not want to use a sub-query you can sum in your query (but I feel it's helpful to use the existing query as a sub-query before altering it, just to make sure it works):
select Category, Books.ISBN,
SUM(Orderitems.Quantity * (Books.Retail - Books.Cost)) AS Category_Profit
From Books
INNER JOIN Orderitems
ON BOOKS.ISBN=ORDERITEMS.ISBN
group by Category, Books.ISBN
Group by can be used to solve your problem.
Note: In a GROUP BY clause, table rows are grouped based on certain columns, and the SELECT clause can show only the GROUP BY columns or aggregate functions (SUM, MIN, MAX, COUNT, etc.) on other columns.
Reference :
http://www.dofactory.com/sql/group-by
Use group by as is done below. Hope this solves the issue.
Select Category, Books.ISBN,
SUM(Orderitems.Quantity * (Books.Retail - Books.Cost)) AS Category_Profit
From Books
INNER JOIN Orderitems
ON BOOKS.ISBN=ORDERITEMS.ISBN
Group by Category, Books.ISBN
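To see the grouped sum produce the totals the question asks for, here is a minimal sketch in Python's sqlite3. The Books and Orderitems column names come from the question; the prices and quantities are invented so that the per-row profits match the question's sample (50, 50, 100 and 10, 20).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Books (ISBN TEXT PRIMARY KEY, Category TEXT, Retail REAL, Cost REAL);
CREATE TABLE Orderitems (ISBN TEXT, Quantity INTEGER);
-- Invented prices: each book earns 10 per copy sold.
INSERT INTO Books VALUES ('1234', 'Family Life', 30, 20), ('4321', 'Fitness', 15, 5);
INSERT INTO Orderitems VALUES ('1234', 5), ('1234', 5), ('1234', 10),
                              ('4321', 1), ('4321', 2);
""")

# One output row per (Category, ISBN) with the profits summed.
profit = conn.execute("""
SELECT Category, Books.ISBN,
       SUM(Orderitems.Quantity * (Books.Retail - Books.Cost)) AS Category_Profit
FROM Books
INNER JOIN Orderitems ON Books.ISBN = Orderitems.ISBN
GROUP BY Category, Books.ISBN
ORDER BY Category
""").fetchall()

for row in profit:
    print(row)
```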
Use GROUP BY and SUM. Syntax:
SELECT column_name, SUM(column_name)
FROM table_name
WHERE column_name operator value
GROUP BY column_name;
Refer: SQL GROUP BY
On your table you may run:
Select Category, ISBN, Sum(Category_Profit) From Table1
group by Category, ISBN;
Table1:
Category ISBN Category_Profit
Family life 1234 50
Family Life 1234 50
Family Life 1234 100
Fitness 4321 10
Fitness 4321 20
Output:
| Category | ISBN | Sum(Category_Profit) |
|-------------|------|----------------------|
| Family life | 1234 | 200 |
| Fitness | 4321 | 30 |

Selecting the latest per group of items [duplicate]

I have two tables, product and cost:
PRODUCT
ProdCode - PK
ProdName
COST
Effectivedate - PK
RetailCOst
Prodcode
I tried this query:
SELECT a.ProdCOde AS id, MAX(EffectiveDate) AS edate, RetailCOst AS retail
FROM cost a
INNER JOIN product b USING (ProdCode)
WHERE EffectiveDate <= '2009-10-01'
GROUP BY a.ProdCode;
It shows the right EffectiveDate, but the cost returned doesn't match the cost for that specific EffectiveDate.
So I want to select the latest date with the matching cost per item.
For example, if the date I selected is '2009-12-25' and the records for one item are these:
ProdCode |EffectiveDate| Cost
00010000 | 2009-01-05 | 50
00010000 | 2009-05-25 | 48
00010000 | 2010-07-01 | 40
then in the result I should get 00010000|2009-05-25|48, because that date is earlier than the date in my query and it is the latest for that item. I then want my query to show the latest cost for each product.
Hope to hear from you soon! Thanks!
You need to use a subquery here:
SELECT maxdates.ProdCode, maxdates.maxDate, cost.RetailCost as retail
FROM (
SELECT ProdCode, max(EffectiveDate) as maxDate
FROM cost
WHERE EffectiveDate < '2009-10-01'
GROUP BY ProdCode
) maxdates
LEFT JOIN cost ON (maxdates.ProdCode=cost.ProdCode
AND maxdates.maxDate=cost.EffectiveDate)
Explanation:
The inner SELECT gives a list of all Products and their respective maximum EffectiveDates. The join "glues" the retail cost per data entry to the result.
Alternatively, the old max-concat trick should work.
SELECT
p.ProdCode,
SUBSTRING(MAX(CONCAT(c.EffectiveDate, c.RetailCost)), 1, 10) AS date,
SUBSTRING(MAX(CONCAT(c.EffectiveDate, c.RetailCost)), 11, 100) + 0 AS cost
FROM
product p,
cost c
WHERE
p.ProdCode = c.ProdCode AND
c.EffectiveDate < '2009-10-01'
GROUP BY
p.ProdCode
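The subquery-join approach can be checked against the question's three sample rows. This is a sketch in Python's sqlite3 rather than MySQL; the composite primary key is an assumption (the question lists only Effectivedate as the key), and the cut-off date is the '2009-12-25' from the question's example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cost (ProdCode TEXT, EffectiveDate TEXT, RetailCost REAL,
                   PRIMARY KEY (ProdCode, EffectiveDate));  -- assumed key
INSERT INTO cost VALUES
  ('00010000', '2009-01-05', 50),
  ('00010000', '2009-05-25', 48),
  ('00010000', '2010-07-01', 40);
""")

# Step 1 (subquery): latest EffectiveDate per product up to the cut-off.
# Step 2 (join): fetch the cost row that carries exactly that date.
latest = conn.execute("""
SELECT maxdates.ProdCode, maxdates.maxDate, cost.RetailCost AS retail
FROM (SELECT ProdCode, MAX(EffectiveDate) AS maxDate
      FROM cost
      WHERE EffectiveDate <= ?
      GROUP BY ProdCode) AS maxdates
JOIN cost ON maxdates.ProdCode = cost.ProdCode
         AND maxdates.maxDate = cost.EffectiveDate
""", ("2009-12-25",)).fetchall()

print(latest)
```

Joining back on both ProdCode and the maximum date is what guarantees that the cost comes from the same row as the date, which is the part the original GROUP BY query could not guarantee.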

Optimizing Query With Subselect

I'm trying to generate a sales report which lists each product + total sales in a given month. It's a little tricky because the prices of products can change throughout the month. For example:
Between Jan-01 and Jan-15, my company sells 50 Widgets at a cost of $10 each
Between Jan-15 and Jan-31, my company sells 50 more Widgets at a cost of $15 each
The total sales of Widgets for January = (50 * 10) + (50 * 15) = $1250
This setup is represented in the database as follows:
Sales table
Sale_ID ProductID Sale_Date
1 1 2009-01-01
2 1 2009-01-01
3 1 2009-01-02
...
50 1 2009-01-15
51 1 2009-01-16
52 1 2009-01-17
...
100 1 2009-01-31
Prices table
Product_ID Sale_Date Price
1 2009-01-01 10.00
1 2009-01-16 15.00
When a price is defined in the prices table, it is applied to all products sold with the given ProductID from the given SaleDate going forward.
Basically, I'm looking for a query which returns data as follows:
Desired output
Sale_ID ProductID Sale_Date Price
1 1 2009-01-01 10.00
2 1 2009-01-01 10.00
3 1 2009-01-02 10.00
...
50 1 2009-01-15 10.00
51 1 2009-01-16 15.00
52 1 2009-01-17 15.00
...
100 1 2009-01-31 15.00
I have the following query:
SELECT
Sale_ID,
Product_ID,
Sale_Date,
(
SELECT TOP 1 Price
FROM Prices
WHERE
Prices.Product_ID = Sales.Product_ID
AND Prices.Sale_Date < Sales.Sale_Date
ORDER BY Prices.Sale_Date DESC
) as Price
FROM Sales
This works, but is there a more efficient query than a nested sub-select?
And before you point out that it would just be easier to include "price" in the Sales table, I should mention that the schema is maintained by another vendor and I'm unable to change it. And in case it matters, I'm using SQL Server 2000.
If you start storing start and end dates, or create a view that includes the start and end dates (you can even create an indexed view) then you can heavily simplify your query. (provided you are certain there are no range overlaps)
SELECT
Sale_ID,
Product_ID,
Sale_Date,
Price
FROM Sales
JOIN Prices on Sale_date > StartDate and Sale_Date <= EndDate
-- careful not to use BETWEEN; it includes both ends
Note:
A technique along these lines will allow you to do this with a view. Note, if you need to index the view, it will have to be juggled around quite a bit ..
create table t (d datetime)
insert t values(getdate())
insert t values(getdate()+1)
insert t values(getdate()+2)
go
create view myview
as
select start = isnull(max(t2.d), '1975-1-1'), finish = t1.d from t t1
left join t t2 on t1.d > t2.d
group by t1.d
select * from myview
start finish
----------------------- -----------------------
1975-01-01 00:00:00.000 2009-01-27 11:12:57.383
2009-01-27 11:12:57.383 2009-01-28 11:12:57.383
2009-01-28 11:12:57.383 2009-01-29 11:12:57.383
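Applied to the question's Prices table, the same self-join idea can yield a start and end date per price. The sketch below uses Python's sqlite3 instead of SQL Server 2000, and the view name PriceRanges and the half-open range convention (start included, end excluded, NULL end for the current price) are assumptions chosen for this illustration, not taken from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales (Sale_ID INTEGER PRIMARY KEY, Product_ID INTEGER, Sale_Date TEXT);
CREATE TABLE Prices (Product_ID INTEGER, Sale_Date TEXT, Price REAL);
INSERT INTO Sales VALUES (1, 1, '2009-01-01'), (50, 1, '2009-01-15'),
                         (51, 1, '2009-01-16'), (100, 1, '2009-01-31');
INSERT INTO Prices VALUES (1, '2009-01-01', 10.0), (1, '2009-01-16', 15.0);

-- Hypothetical view: each price is valid from StartDate up to (not including)
-- the next price change; EndDate is NULL for the price still in force.
CREATE VIEW PriceRanges AS
SELECT p1.Product_ID, p1.Sale_Date AS StartDate,
       MIN(p2.Sale_Date) AS EndDate, p1.Price
FROM Prices AS p1
LEFT JOIN Prices AS p2
  ON p2.Product_ID = p1.Product_ID AND p2.Sale_Date > p1.Sale_Date
GROUP BY p1.Product_ID, p1.Sale_Date, p1.Price;
""")

# With ranges available, the price lookup becomes a plain join.
priced_sales = conn.execute("""
SELECT s.Sale_ID, s.Product_ID, s.Sale_Date, r.Price
FROM Sales AS s
JOIN PriceRanges AS r
  ON r.Product_ID = s.Product_ID
 AND s.Sale_Date >= r.StartDate
 AND (r.EndDate IS NULL OR s.Sale_Date < r.EndDate)
ORDER BY s.Sale_ID
""").fetchall()

for row in priced_sales:
    print(row)
```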
It's good to avoid these types of correlated subqueries. Here's a classic technique for such cases.
SELECT
s.Sale_ID,
s.Product_ID,
s.Sale_Date,
p1.Price
FROM Sales AS s
LEFT JOIN Prices AS p1 ON s.Product_ID = p1.Product_ID
AND s.Sale_Date >= p1.Sale_Date
LEFT JOIN Prices AS p2 ON s.Product_ID = p2.Product_ID
AND s.Sale_Date >= p2.Sale_Date
AND p2.Sale_Date > p1.Sale_Date
WHERE p2.Price IS NULL -- want this one not to be found
Use a left outer join on the pricing table as p2, and look for a NULL record demonstrating that the matched product-price record found in p1 is the most recent on or before the sales date.
(I would have inner-joined the first price match, but if there is none, it's nice to have the product show up anyway so you know there's a problem.)
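The double left join can be exercised on a cut-down version of the question's data. This sketch uses Python's sqlite3 rather than SQL Server 2000; sale 200 for product 2 is an invented row with no price at all, added to show the "product still shows up" behaviour described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales (Sale_ID INTEGER PRIMARY KEY, Product_ID INTEGER, Sale_Date TEXT);
CREATE TABLE Prices (Product_ID INTEGER, Sale_Date TEXT, Price REAL);
INSERT INTO Sales VALUES (1, 1, '2009-01-01'), (50, 1, '2009-01-15'),
                         (51, 1, '2009-01-16'), (100, 1, '2009-01-31'),
                         (200, 2, '2009-01-10');  -- invented: product without a price
INSERT INTO Prices VALUES (1, '2009-01-01', 10.0), (1, '2009-01-16', 15.0);
""")

# p1: any price on or before the sale date.
# p2: a later price that is still on or before the sale date.
# Keeping only rows where no p2 exists leaves the most recent p1.
priced = conn.execute("""
SELECT s.Sale_ID, s.Product_ID, s.Sale_Date, p1.Price
FROM Sales AS s
LEFT JOIN Prices AS p1
  ON s.Product_ID = p1.Product_ID AND s.Sale_Date >= p1.Sale_Date
LEFT JOIN Prices AS p2
  ON s.Product_ID = p2.Product_ID AND s.Sale_Date >= p2.Sale_Date
 AND p2.Sale_Date > p1.Sale_Date
WHERE p2.Price IS NULL
ORDER BY s.Sale_ID
""").fetchall()

for row in priced:
    print(row)
```

Whether this beats the correlated subquery depends on the engine and the indexes, so it is worth comparing the actual query plans before switching.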
Are you actually running into performance problems or are you just anticipating them? I would implement this exactly as you have, were my hands tied from a schema-modification standpoint as yours are.
I agree with Sean. The code you have written is very clean and understandable. If you are having performance issues, then take the extra effort to make the code faster. Otherwise, you are making the code more complex for no reason. Nested sub-selects are extremely useful when used judiciously.
The combination of Product_ID and Sale_Date is your foreign key. Try a select-join on Product_ID, Sale_Date.