SQL for price difference calculation

I've got two tables that I'm trying to grab data from. The first is a 'titles' table, which represents product information (name, unique ID, etc). The second is a 'prices' table which collects price information for various currencies (each product can have multiple historic entries in the prices table).
I've written a fairly long-winded SQL statement to grab the latest price changes across products, but there are some issues that hopefully more experienced users will be able to help me out with:
SELECT
t.id,
t.name,
t.type,
p.value,
(SELECT
value
FROM
prices
WHERE
prices.id = p.id AND
prices.country='US' AND
prices.timestamp < p.timestamp
ORDER BY
prices.timestamp DESC
LIMIT 1) AS last_value
FROM
prices AS p
INNER JOIN
titles AS t
ON
t.row_id = p.id
WHERE
p.country = 'US' AND
(SELECT
value
FROM
prices
WHERE
prices.id = p.id AND
prices.country='US' AND
prices.timestamp < p.timestamp
ORDER BY
prices.timestamp DESC
LIMIT 1) IS NOT NULL
GROUP BY
t.id
ORDER BY
p.timestamp DESC,
last_value DESC
LIMIT 0, 25
The first issue I've run into is that, while this query works, titles appear multiple times in the same listing. While this is expected, I'd ideally like only the latest price change to be displayed for the title. To solve this, I tried GROUPING by the titles 'id' (note the: GROUP BY t.id above). Unfortunately, while I'd expect the GROUP to respect the ORDER BY (which orders the latest price changes in DESC order), the results seem to remove the latest changes and show the GROUP'd titles with earlier price values.
Secondly, is there a better way to grab the previous price of each item? Currently I grab the current value, and then run a subquery to grab the 'last_value', which effectively represents the value before the current price change. At the moment I run two subqueries: one to grab the second-to-last known price, and another to ensure that a previous price exists in the database (otherwise there's no point in listing the title as having a price change).
Any help would be appreciated!

How about this:
SELECT titles.id, titles.name, titles.type, prices.value, MAX(prices.timestamp)
FROM titles, prices
WHERE prices.id = titles.row_id AND prices.country='US';
Mind you, I don't have MySQL installed, so I couldn't try this query.
[Edit:] I don't think it will work: without a GROUP BY, MAX(prices.timestamp) picks the single highest timestamp in the whole prices table instead of one per item. Maybe a GROUP BY will do.
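[Edit2:] How about joining against a derived table that holds the latest timestamp per group, following this pattern:
(SELECT max(report_run_date) as maxdate, report_name
FROM report_history
GROUP BY report_name) maxresults
Applied to your tables it would look something like this (again untested):
SELECT titles.id, titles.name, titles.type, prices.value, prices.timestamp
FROM titles
INNER JOIN prices ON prices.id = titles.row_id
INNER JOIN (SELECT id, MAX(timestamp) AS maxtimestamp
FROM prices
WHERE country = 'US'
GROUP BY id) maxresults
ON maxresults.id = prices.id AND maxresults.maxtimestamp = prices.timestamp
WHERE prices.country = 'US';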
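For what it's worth, here is a minimal sketch of the "latest change plus previous value" idea using window functions, wrapped in Python's sqlite3 so it can be run as-is. It assumes SQLite 3.25+ (MySQL 8+ has the same LAG and ROW_NUMBER functions), and the table layout and sample rows are made up for illustration, with prices.id pointing at titles.row_id as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and sample data, loosely modelled on the question.
conn.executescript("""
CREATE TABLE titles (row_id INTEGER PRIMARY KEY, id TEXT, name TEXT, type TEXT);
CREATE TABLE prices (id INTEGER, country TEXT, value REAL, timestamp INTEGER);
INSERT INTO titles VALUES (1, 'T1', 'Alpha', 'game'), (2, 'T2', 'Beta', 'game'), (3, 'T3', 'Gamma', 'app');
INSERT INTO prices VALUES
  (1, 'US', 10.0, 100), (1, 'US', 8.0, 200), (1, 'US', 6.0, 300),
  (2, 'US', 5.0, 150), (2, 'US', 7.0, 250),
  (3, 'US', 3.0, 120),
  (1, 'GB', 9.0, 400);
""")

QUERY = """
WITH ranked AS (
    SELECT p.id, p.value, p.timestamp,
           -- previous US price of the same title, NULL if there is none
           LAG(p.value) OVER (PARTITION BY p.id ORDER BY p.timestamp) AS last_value,
           -- 1 marks the newest US price row of each title
           ROW_NUMBER() OVER (PARTITION BY p.id ORDER BY p.timestamp DESC) AS rn
    FROM prices AS p
    WHERE p.country = 'US'
)
SELECT t.id, t.name, t.type, r.value, r.last_value
FROM ranked AS r
INNER JOIN titles AS t ON t.row_id = r.id
WHERE r.rn = 1 AND r.last_value IS NOT NULL
ORDER BY r.timestamp DESC
LIMIT 25
"""
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Each title shows up at most once, and titles with only a single price entry (no previous value) drop out, so neither the GROUP BY nor the two correlated subqueries are needed.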

Related

SQL QUERY : Find for each year copies sold > 10000

I am practicing a bit with SQL and I came across this exercise:
Consider the following database relating to albums, singers and sales:
Album (Code, Singer, Title)
Sales (Album, Year, CopiesSold)
with a referential integrity constraint between the Album attribute of Sales and the key of the
Album relation.
Formulate the following query in SQL :
Find the code and title of the albums that have sold 10,000 copies
every year since they came out.
I had thought of solving it like this:
SELECT CODE, TITLE, COUNT (*)
FROM ALBUM JOIN SALES ON ALBUM.Code = SALES.Album
WHERE CopiesSold > 10000
HAVING COUNT(*) = /* Select difference from current year and came out year.*/
Can you help me with this? Thanks.
You can do this with an INNER JOIN, GROUP BY, and HAVING.
SELECT A.Code, A.Title
FROM ALBUM A
INNER JOIN SALES S ON S.Album = A.Code
GROUP BY A.Code, A.Title
HAVING MIN(S.CopiesSold) >= 10000
The HAVING clause will filter out albums whose minimum Copies Sold are < 10000.
EDIT
There was also a question about gaps in the Sales data. There are a number of ways to modify the above query to handle this as well. One solution would be to use an embedded query to identify the correct number of years.
SELECT A.Code, A.Title
FROM ALBUM A
INNER JOIN SALES S ON S.Album = A.Code
GROUP BY A.Code, A.Title
HAVING MIN(S.CopiesSold) >= 10000 AND
COUNT(*) = (SELECT COUNT(DISTINCT Year) FROM SALES WHERE Year >= MIN(s.Year))
This solution assumes that at least one album by some artist was sold each year (a fairly safe bet). If you had a Years table there are simpler solutions. If the data is current there are also solutions that utilize DATEDIFF.
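A quick way to try the HAVING approach is a small in-memory database. The sketch below uses Python's sqlite3 with made-up albums and sales figures. Its gap check compares the row count against MAX(Year) - MIN(Year) + 1, which is a swapped-in variant rather than the embedded query above, and it assumes at most one Sales row per album and year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: A1 sells well every year, A2 dips below 10000,
# A3 sells well but has no figures for 2018.
conn.executescript("""
CREATE TABLE Album (Code TEXT PRIMARY KEY, Singer TEXT, Title TEXT);
CREATE TABLE Sales (Album TEXT REFERENCES Album(Code), Year INTEGER, CopiesSold INTEGER);
INSERT INTO Album VALUES ('A1', 'S1', 'First'), ('A2', 'S2', 'Second'), ('A3', 'S3', 'Third');
INSERT INTO Sales VALUES
  ('A1', 2018, 12000), ('A1', 2019, 15000), ('A1', 2020, 11000),
  ('A2', 2019, 20000), ('A2', 2020, 9000),
  ('A3', 2017, 30000), ('A3', 2019, 25000);
""")

# Minimum yearly sales must reach 10000; gaps in the data are ignored.
simple = conn.execute("""
SELECT A.Code, A.Title
FROM Album A
INNER JOIN Sales S ON S.Album = A.Code
GROUP BY A.Code, A.Title
HAVING MIN(S.CopiesSold) >= 10000
ORDER BY A.Code
""").fetchall()

# Same filter, plus: the number of rows must cover every year in the range.
no_gaps = conn.execute("""
SELECT A.Code, A.Title
FROM Album A
INNER JOIN Sales S ON S.Album = A.Code
GROUP BY A.Code, A.Title
HAVING MIN(S.CopiesSold) >= 10000
   AND COUNT(*) = MAX(S.Year) - MIN(S.Year) + 1
ORDER BY A.Code
""").fetchall()

print(simple)
print(no_gaps)
```

The second query drops the album with the missing year, which the plain MIN check lets through.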
You can use correlated subqueries with EXISTS or NOT EXISTS respectively.
In one check if the maximum year minus the minimum year plus one is equal to the count of records with a defined year of an album. That way you make sure you don't get albums where there are figures missing for a year and you therefore cannot tell whether they sold 10000 or more or not. Also check that the maximum year is the current year not to miss gaps between the maximum year and the current year. (In the example code I will use the literal 2020 but there are means to get that dynamically. They depend on the DBMS however and you didn't state which one you're using.)
In the second one, check that there's no record with undefined sales figures or sales figures lower than 10000 for the album. If no such record exists, all of the existing ones must have figures of 10000 or greater.
SELECT a1.code,
a1.title
FROM album a1
WHERE EXISTS (SELECT ''
FROM sales s1
WHERE s1.album = a1.code
HAVING max(s1.year) - min(s1.year) + 1 = count(s1.year)
AND max(s1.year) = 2020)
AND NOT EXISTS (SELECT *
FROM sales s2
WHERE s2.album = a1.code
AND (s2.copiessold IS NULL
OR s2.copiessold < 10000));
I think the ALL keyword should work nicely here. Something like this:
SELECT * FROM Album
WHERE 10000 <= ALL (
SELECT CopiesSold FROM Sales
WHERE Sales.Album = Album.Code)

Get the product of two values from two different tables

If anyone can help me figure out where I am going wrong with this SQL, that would be great. Please see my attempt to answer it below. I have answered it how I think it should be answered, but I am very confused by the exam advice below, which says I should use a SUM function. I have googled this and I do not see how a SUM function can help here when I need to get the product of two values. Or am I missing something major?
Question: TotalValue is a column in Order relation that contains derived data representing total value (amount) of each order. Write a SQL SELECT statement that computes a value for this column.
My answer:
SELECT Product.ProductPrice * OrderLine.QuantityOrdered AS Total_Value
FROM Product,
OrderLine
GROUP BY Product;
Advice from exam paper:
This is a straightforward question. Tip: you need to use the SUM function. Also, note that you can take the sum of various records set using the GROUP BY clause.
OK, your question became a lot clearer once I clicked on the hyperlink (blue text).
Each order is going to be made up of a quantity of 1 or more products.
So there could be 3 Product A and 5 Product B etc.
So you have to get the total for each product which is your Price * Quantity, but then you need to add them all together which is where the SUM comes in.
Example:
3 * ProductA Price (e.g. €5) = 15
5 * ProductB Price (e.g. €4) = 20
Total Value = 35
So you need to use the Product, Order and OrderLine tables.
Something like (I haven't tested it):
SELECT SUM(Product.ProductPrice * OrderLine.QuantityOrdered) FROM Product, Order, OrderLine
WHERE Order.OrderID = OrderLine.OrderID
AND Product.ProductID = OrderLine.ProductID
GROUP BY Order.OrderID
This should return one row per order containing its totalValue: the GROUP BY clause causes the SUM to run over each group rather than over all rows.
For a single order you would need to add (before the GROUP BY) "AND Order.OrderID = XXXXX", where XXXXX is the actual order's OrderID.
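To check the arithmetic from the example above, here is a small runnable sketch using Python's sqlite3. The three tables and their columns are assumptions based on the names in the question, the sample rows are invented, and "Order" is double-quoted because ORDER is a reserved word (the quoting style depends on your DBMS).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: Product A costs 5, Product B costs 4.
conn.executescript("""
CREATE TABLE Product (ProductID TEXT PRIMARY KEY, ProductPrice INTEGER);
CREATE TABLE "Order" (OrderID INTEGER PRIMARY KEY);
CREATE TABLE OrderLine (OrderID INTEGER, ProductID TEXT, QuantityOrdered INTEGER);
INSERT INTO Product VALUES ('A', 5), ('B', 4);
INSERT INTO "Order" VALUES (1), (2);
INSERT INTO OrderLine VALUES (1, 'A', 3), (1, 'B', 5), (2, 'A', 2);
""")

# Price * quantity per order line, summed per order.
totals = conn.execute("""
SELECT o.OrderID, SUM(p.ProductPrice * ol.QuantityOrdered) AS TotalValue
FROM "Order" AS o
JOIN OrderLine AS ol ON ol.OrderID = o.OrderID
JOIN Product AS p ON p.ProductID = ol.ProductID
GROUP BY o.OrderID
ORDER BY o.OrderID
""").fetchall()

print(totals)
```

Order 1 comes out as 3 * 5 + 5 * 4 = 35, matching the worked example, and order 2 as 2 * 5 = 10.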

Multiple of same result even with group by

Alright, so say I have 'product_catalog' and 'orders' tables. Each order has the product_catalog_id as a foreign key. What I want to return as the query results is the product_code (name of the product associated with a specific product_catalog_id) plus a count of how many of each product_code have been ordered. That's easy enough with something like this (Oracle SQL):
SELECT pc.product_code,
COUNT(*) as count
FROM orders o
join product_catalog pc on pc.product_catalog_id = o.product_catalog_id
GROUP BY pc.product_code
ORDER BY count DESC;
but I also want to print various pieces of information from the order table such as total of all monthly charges for that product_code. That would seem easy enough with something like this:
(o.monthly_base_charge*count(*)) as "Monthly Fee"
but the problem is that there have been various monthly fees for the same product_code over time. If I add the above line in and add 'o.monthly_base_charge' to the group by statement, then it will print out a unique row for every variation of pricing for that product_code. How do I get it to ignore those price variations and just add together every entry with that product code?
It is a little unclear what you are asking. My best guess is that you want the sum of the monthly base charge:
SELECT pc.product_code,
COUNT(*) as count,
sum(o.monthly_base_charge) as "Monthly Fee"
FROM orders o join
product_catalog pc
on pc.product_catalog_id = o.product_catalog_id
GROUP BY pc.product_code
ORDER BY count DESC;
I'm not sure if this is exactly what you want. What happens if you have two orders in the same month for the same product?
You may need to do something like this since SQL will not be able to know which monthly base charge to multiply by the count.
SELECT pc.product_code,
COUNT(*) as count,
(min(o.monthly_base_charge)*count(*)) as "Monthly Fee"
FROM orders o
join product_catalog pc on pc.product_catalog_id = o.product_catalog_id
GROUP BY pc.product_code
ORDER BY count DESC;
Or you will need to add o.monthly_base_charge to the GROUP BY so that SQL knows which charge each count() belongs to:
GROUP BY pc.product_code, o.monthly_base_charge
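The difference between the two groupings is easy to see on a tiny data set. This is only a sketch in Python's sqlite3 (not Oracle) with invented rows, and the alias order_count is used instead of count to stay clear of the function name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: BASIC was sold at two different monthly charges.
conn.executescript("""
CREATE TABLE product_catalog (product_catalog_id INTEGER PRIMARY KEY, product_code TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, product_catalog_id INTEGER, monthly_base_charge INTEGER);
INSERT INTO product_catalog VALUES (1, 'BASIC'), (2, 'PRO');
INSERT INTO orders VALUES (1, 1, 10), (2, 1, 10), (3, 1, 12), (4, 2, 30);
""")

# One row per product code: price variations are added together by SUM.
per_product = conn.execute("""
SELECT pc.product_code, COUNT(*) AS order_count, SUM(o.monthly_base_charge) AS monthly_fee
FROM orders o
JOIN product_catalog pc ON pc.product_catalog_id = o.product_catalog_id
GROUP BY pc.product_code
ORDER BY order_count DESC
""").fetchall()

# One row per product code AND charge: every price variation gets its own row.
per_price = conn.execute("""
SELECT pc.product_code, o.monthly_base_charge, COUNT(*) AS order_count
FROM orders o
JOIN product_catalog pc ON pc.product_catalog_id = o.product_catalog_id
GROUP BY pc.product_code, o.monthly_base_charge
ORDER BY pc.product_code, o.monthly_base_charge
""").fetchall()

print(per_product)
print(per_price)
```

With SUM the three BASIC orders collapse into one row with a fee of 32, whereas grouping by the charge as well splits them into a row for 10 and a row for 12.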

How do I set ORDER BY in SQL query to a value depending by the SQL query itself?

Imagine an auction (ebay auction, for example). You create an auction, set the start bidding value, let's say, 5 dollars. This gets stored as a minimal bid value to the auctions table. At this point, the current bid value of this auction is 5 dollars.
Now, if someone bids to your auction, let's say, 10 dollars, this gets stored to the bids table. At this point, the current bid value of this auction is 10 dollars.
Now let's imagine you want to retrieve 5 cheapest auctions. You will write a query like this:
SELECT
`auction_id`,
`auction_startPrice`,
MAX(bids.bid_price) as `bid_price`
FROM
`auctions`
LEFT JOIN `bids` ON `auctions`.`auction_id`=`bids`.`bid_belongs_to_auction`
GROUP BY `auction_id`
LIMIT 5
Pretty simple, and it works! But now you need to add an ORDER BY clause to the query. The problem is, however, that we want to ORDER BY either auctions.auction_startPrice or bid_price, whichever of these is higher, as explained in the first paragraphs.
Can this be understood? I know how to do this using 2 queries, but I am hoping it can be done with 1 query.
Thanks!
EDIT: Just a further explanation to help you imagine the problem. If I set ORDER BY auction_startPrice ASC, then I will get 5 auctions with their lowest initial bid price, but what if there are already bids placed on those auctions? Then their current lowest price is equal to those bids, NOT to the start price, therefore my query is wrong.
SELECT
`auction_id`,
`auction_startPrice`,
`bid_price`
FROM
(
SELECT
`auction_id`,
`auction_startPrice`,
MAX(bids.bid_price) as `bid_price`,
IF(MAX(bids.bid_price)>`auction_startPrice`,
MAX(bids.bid_price),
`auction_startPrice`) higherPrice
FROM
`auctions`
LEFT JOIN `bids` ON `auctions`.`auction_id`=`bids`.`bid_belongs_to_auction`
GROUP BY `auction_id`
) X
order by higherPrice asc
LIMIT 5;
Note:
In the inner query, an extra column is created, named 'higherPrice'
The IF function compares the MAX(bid_price) column against the startprice, and only if the Max-bid is not null (implicitly required in comparison) and greater than start price, then the Max-bid becomes the value in the higherPrice column. Otherwise, it will contain the start price.
The outer query merely makes use of the columns from the inner query, ordering by the higherPrice
I'm not sure which database you're using but look at this example:
http://www.extremeexperts.com/sql/articles/CASEinORDER.aspx
SELECT
`auction_id`,
`auction_startPrice`,
MAX(bids.bid_price) as `bid_price`
FROM
`auctions`
LEFT JOIN `bids` ON `auctions`.`auction_id`=`bids`.`bid_belongs_to_auction`
GROUP BY `auction_id`
ORDER BY CASE WHEN `auction_startPrice` > COALESCE(MAX(bids.bid_price),0) then `auction_startPrice` else MAX(bids.bid_price) end
LIMIT 5
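Here is a small runnable sketch of the same idea, using Python's sqlite3 instead of MySQL. The sample auctions and bids are invented, the computed column current_price is a name chosen for illustration, and the limit is 2 rather than 5 to keep the data set short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: auctions 1 and 4 have low start prices but high bids.
conn.executescript("""
CREATE TABLE auctions (auction_id INTEGER PRIMARY KEY, auction_startPrice INTEGER);
CREATE TABLE bids (bid_belongs_to_auction INTEGER, bid_price INTEGER);
INSERT INTO auctions VALUES (1, 5), (2, 20), (3, 8), (4, 1);
INSERT INTO bids VALUES (1, 10), (1, 50), (4, 30);
""")

# current_price is the highest bid when it beats the start price, otherwise
# the start price. With no bids MAX() is NULL, the comparison is not true,
# and the ELSE branch applies.
cheapest = conn.execute("""
SELECT a.auction_id,
       CASE WHEN MAX(b.bid_price) > a.auction_startPrice
            THEN MAX(b.bid_price)
            ELSE a.auction_startPrice
       END AS current_price
FROM auctions a
LEFT JOIN bids b ON a.auction_id = b.bid_belongs_to_auction
GROUP BY a.auction_id, a.auction_startPrice
ORDER BY current_price ASC
LIMIT 2
""").fetchall()

print(cheapest)
```

Ordering by the start price alone would have returned auctions 4 and 1, even though their current prices (30 and 50) are the highest in the set.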

SQL conundrum, how to select latest date for part, but only 1 row per part (unique)

I am trying to wrap my head around this one this morning.
I am trying to show inventory status for parts (for our products) and this query only becomes complex if I try to return all parts.
Let me lay it out:
single table inventoryReport
I have a distinct list of X parts I wish to display, the result of which must be X # of rows (1 row per part showing latest inventory entry).
table is made up of dated entries of inventory changes (so I only need the LATEST date entry per part).
all data contained in this single table, so no joins necessary.
Currently for 1 single part, it is fairly simple and I can accomplish this by doing the following sql (to give you some idea):
SELECT TOP (1) ldDate, ptProdLine, inPart, inSite, inAbc, ptUm, inQtyOh + inQtyNonet AS in_qty_oh, inQtyAvail, inQtyNonet, ldCustConsignQty, inSuppConsignQty
FROM inventoryReport
WHERE (ldPart = 'ABC123')
ORDER BY ldDate DESC
That gets me my TOP 1 row, so it's simple per part. However, I need to show all X parts (let's say 30), so I need 30 rows with that result. Of course the simple solution would be to loop X SQL calls in my code, and that would suffice, but it would be costly. I would love to work this SQL some more to reduce the X calls to the db down to just 1 query.
From what I can see here I need to keep track of the latest date per item somehow while looking for my result set.
I would ultimately do a
WHERE ldPart in ('ABC123', 'BFD21', 'AA123', etc)
to limit the parts I need. Hopefully I made my question clear enough. Let me know if you have an idea. I cannot do a DISTINCT as the rows are not the same, the date needs to be the latest, and I need a maximum of X rows.
Thoughts? I'm stuck...
SELECT *
FROM (SELECT i.*,
ROW_NUMBER() OVER(PARTITION BY ldPart ORDER BY ldDate DESC) r
FROM inventoryReport i
WHERE ldPart in ('ABC123', 'BFD21', 'AA123', etc)
) AS ranked
WHERE r = 1
EDIT: Be sure to test the performance of each solution. As pointed out in this question, the CTE method may outperform using ROW_NUMBER.
;with cteMaxDate as (
select ldPart, max(ldDate) as MaxDate
from inventoryReport
group by ldPart
)
SELECT md.MaxDate, ir.ptProdLine, ir.inPart, ir.inSite, ir.inAbc, ir.ptUm, ir.inQtyOh + ir.inQtyNonet AS in_qty_oh, ir.inQtyAvail, ir.inQtyNonet, ir.ldCustConsignQty, ir.inSuppConsignQty
FROM cteMaxDate md
INNER JOIN inventoryReport ir
on md.ldPart = ir.ldPart
and md.MaxDate = ir.ldDate
You need to join into a Sub-query:
SELECT i.ldPart, x.LastDate, i.inAbc
FROM inventoryReport i
INNER JOIN (Select ldPart, Max(ldDate) As LastDate FROM inventoryReport GROUP BY ldPart) x
on i.ldPart = x.ldPart and i.ldDate = x.LastDate
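A minimal runnable sketch of the join-on-MAX approach, combined with the IN filter from the question, could look like this. It uses Python's sqlite3 rather than SQL Server, the rows are invented, and the table is cut down to three columns for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, trimmed-down inventoryReport with a few dated entries per part.
conn.executescript("""
CREATE TABLE inventoryReport (ldPart TEXT, ldDate TEXT, inQtyAvail INTEGER);
INSERT INTO inventoryReport VALUES
  ('ABC123', '2024-01-01', 5), ('ABC123', '2024-02-01', 7),
  ('BFD21',  '2024-01-15', 2),
  ('AA123',  '2024-01-10', 9), ('AA123',  '2024-03-01', 4),
  ('ZZZ9',   '2024-04-01', 1);
""")

parts = ["ABC123", "BFD21", "AA123"]
# One "?" placeholder per requested part, so values are bound, not concatenated.
placeholders = ", ".join("?" for _ in parts)

query = f"""
SELECT i.ldPart, i.ldDate, i.inQtyAvail
FROM inventoryReport i
INNER JOIN (SELECT ldPart, MAX(ldDate) AS LastDate
            FROM inventoryReport
            WHERE ldPart IN ({placeholders})
            GROUP BY ldPart) x
        ON i.ldPart = x.ldPart AND i.ldDate = x.LastDate
ORDER BY i.ldPart
"""
rows = conn.execute(query, parts).fetchall()
for row in rows:
    print(row)
```

One caveat: if a part has two entries with exactly the same latest ldDate, the join returns both of them, whereas the ROW_NUMBER() version keeps a single one.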