SQL select only highest date - sql

For a project I want to generate a price list.
I want to get only the latest prices from each supplier for each article.
There are just those two tables.
Table articles
ARTNR | TXT | ACTIVE | SUPPLIER
------------------------------------------
10 | APPLE | Y | 10
20 | ORANGE | Y | 10
30 | KEYBOARD | N | 20
40 | ORANGE | Y | 20
50 | BANANA | Y | 10
60 | CHERRY | Y | 10
Table prices
ARTNR | PRCGRP | PRCDAT | PRICE
--------------------------------------
10 | 10 | 01-Aug-10 | 2.1
10 | 10 | 05-Aug-11 | 2.2
10 | 10 | 21-Aug-12 | 2.5
20 | 0 | 01-Aug-10 | 2.1
20 | 10 | 09-Aug-12 | 2.3
10 | 10 | 14-Aug-13 | 2.7
This is what I have so far:
SELECT
ARTICLES.[ARTNR], ARTICLES.[TXT], ARTICLES.[ACTIVE], ARTICLES.[SUPPLIER], PRICES.PRCGRP, PRICES.PRCDAT, PRICES.PRICE
FROM
ARTICLES INNER JOIN PRICES ON ARTICLES.ARTNR = PRICES.ARTNR
WHERE
(
(ARTICLES.[ACTIVE]="Y") AND
(ARTICLES.[SUPPLIER]=10) AND
(PRICES.PRCGRP=0) AND
(PRICES.PRCDAT=(SELECT MAX(PRCDAT) FROM PRICES as art WHERE art.ARTNR = PRICES.artnr) )
)
ORDER BY ARTICLES.ARTNR
;
It is okay to choose just one supplier each time, but I want the max price.
The problem is:
Lots of articles do not show up with the query above,
but I cannot figure out what is wrong.
I can see that they should be in the resultset when I leave out the subselect on max prcdat.
What is wrong?

Your subquery to get the latest price does not take the other conditions into account. That is, when you're getting the latest price, you may get a price in another price group or one for an article that is not active. When you compare that against the filtered list, which has only active articles and only prices in a single price group, there may be no row that satisfies both.
Either you need to duplicate or, better, move your conditions inside the subquery to get the latest price under those conditions. I can't test against Access, but something like this should be possible if the SQL is not too limited:
SELECT a.artnr, a.txt, a.active, a.supplier, p.prcgrp, p.prcdat, p.price
FROM articles a INNER JOIN prices p ON a.ARTNR = p.ARTNR
JOIN (
SELECT a.artnr, MAX(p.prcdat) prcdat
FROM articles a JOIN prices p ON a.artnr = p.artnr
WHERE a.active='Y' AND a.supplier=10 AND p.prcgrp=10
GROUP BY a.artnr) z
ON a.artnr = z.artnr AND p.prcdat = z.prcdat
ORDER BY a.ARTNR
If the SQL support in Access won't allow a join with a subquery, you can just move the conditions inside your existing subquery, something like:
SELECT a.artnr, a.txt, a.active, a.supplier, p.prcgrp, p.prcdat, p.price
FROM articles a INNER JOIN prices p ON a.ARTNR = p.ARTNR
WHERE p.prcdat = (
SELECT MAX(p2.prcdat)
FROM articles a2 JOIN prices p2 ON a2.artnr = p2.artnr
WHERE a.artnr = a2.artnr AND a2.active='Y' AND a2.supplier=10 AND p2.prcgrp=10
)
ORDER BY a.ARTNR;
Note that due to limitations in identifying a unique price (no primary key in prices), the queries may give duplicates if several prices for the same article have the same prcdat. If that's a problem, you'll probably need to duplicate your conditions outside the subquery too.
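To see the effect described above, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database (an assumption: the original targets Access, and the dates are stored here as ISO strings so that MAX() orders them correctly). It compares the unfiltered subquery with one that repeats the price-group condition, using the question's filter PRCGRP=0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (artnr INTEGER, txt TEXT, active TEXT, supplier INTEGER);
CREATE TABLE prices (artnr INTEGER, prcgrp INTEGER, prcdat TEXT, price REAL);
INSERT INTO articles VALUES
  (10,'APPLE','Y',10),(20,'ORANGE','Y',10),(30,'KEYBOARD','N',20),
  (40,'ORANGE','Y',20),(50,'BANANA','Y',10),(60,'CHERRY','Y',10);
-- Assumption: ISO date strings, so string comparison matches date order.
INSERT INTO prices VALUES
  (10,10,'2010-08-01',2.1),(10,10,'2011-08-05',2.2),(10,10,'2012-08-21',2.5),
  (20,0,'2010-08-01',2.1),(20,10,'2012-08-09',2.3),(10,10,'2013-08-14',2.7);
""")

# Subquery ignores the price group: the latest date may belong to another group.
UNFILTERED = """
SELECT a.artnr, p.prcgrp, p.prcdat, p.price
FROM articles a JOIN prices p ON a.artnr = p.artnr
WHERE a.active = 'Y' AND a.supplier = 10 AND p.prcgrp = 0
  AND p.prcdat = (SELECT MAX(p2.prcdat) FROM prices p2
                  WHERE p2.artnr = p.artnr)
ORDER BY a.artnr
"""

# Subquery repeats the price-group condition: latest date within that group.
FILTERED = """
SELECT a.artnr, p.prcgrp, p.prcdat, p.price
FROM articles a JOIN prices p ON a.artnr = p.artnr
WHERE a.active = 'Y' AND a.supplier = 10 AND p.prcgrp = 0
  AND p.prcdat = (SELECT MAX(p2.prcdat) FROM prices p2
                  WHERE p2.artnr = p.artnr AND p2.prcgrp = 0)
ORDER BY a.artnr
"""

unfiltered_rows = conn.execute(UNFILTERED).fetchall()
filtered_rows = conn.execute(FILTERED).fetchall()
print(unfiltered_rows)  # article 20 is missing
print(filtered_rows)    # article 20 appears with its latest group-0 price
```

In the sample data, article 20's newest price belongs to group 10, so the unfiltered version finds nothing for group 0, while the filtered version returns the 2010 row.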

Related

Finding products that were ordered 20% more times than the average of all other products in postgresql

I have asked a similar question and have received some help from some very nice people.
How to find the average of all other products in postgresql.
That question did not cover everything; I thought I could work out the rest on my own once the hardest part was resolved, but apparently I overestimated my abilities. So I'm posting another question... :)
The question is as follows.
I have a table Products which looks like the following:
+-----------+-----------+----------+
|ProductCode|ProductType| .... |
+-----------+-----------+----------+
| ref01 | BOOKS | .... |
| ref02 | ALBUMS | .... |
| ref06 | BOOKS | .... |
| ref04 | BOOKS | .... |
| ref07 | ALBUMS | .... |
| ref10 | TOYS | .... |
| ref13 | TOYS | .... |
| ref09 | ALBUMS | .... |
| ref29 | TOYS | .... |
| ref02 | ALBUMS | .... |
| ..... | ..... | .... |
+-----------+-----------+----------+
Another table Sales which looks like the following:
+-----------+-----------+----------+
|ProductCode| qty | .... |
+-----------+-----------+----------+
| ref01 | 15 | .... |
| ref02 | 12 | .... |
| ref06 | 20 | .... |
| ref04 | 14 | .... |
| ref07 | 11 | .... |
| ref10 | 19 | .... |
| ref13 | 3 | .... |
| ref09 | 9 | .... |
| ref29 | 5 | .... |
| ref02 | 4 | .... |
| ..... | ..... | .... |
+-----------+-----------+----------+
I am trying to find the products that were ordered 20% more than the average of all other products of the same type.
A product can be ordered several times and the quantities (qty) of each order might not be the same. Such as ref02 in the sample table. I only included one example (ref02) but it is the case for all products. So to find how many times a specific product was ordered would mean to find the sum of quantities ordered from all orders of the product.
By manually calculating, the result should be something like:
+-----------+-----------+----------+
|ProductCode| qty | .... |
+-----------+-----------+----------+
| ref02 | 16 | .... |
| ref06 | 20 | .... |
| ref07 | 11 | .... |
| ref10 | 19 | .... |
| ..... | ..... | .... |
+-----------+-----------+----------+
So if looking in the type ALBUMS and product ref02, then I need to find the average of Orders of ALL OTHER ALBUMS.
In this case, it is the average of ref07 and ref09, but there are more in the actual table. So what I need to do is the following:
Since product ref02 is 'ALBUMS' and there are two orders of ref02, the total orders will be 12+4=16. And ref07 and ref09 are also 'ALBUMS'.
So their average is (11+9)/2=10 < 12+4=16.
Since product ref06 is 'BOOKS', and **ref01** and ref04 are also 'BOOKS'.
So their average is (15+14)/2=14.5 <20.
Since product ref07 is 'ALBUMS', and **ref02** and ref09 are also 'ALBUMS'.
So their average is (12+9+4)/3=8.3 <11.
Since product ref10 is 'TOYS', and ref13 and ref29 are also 'TOYS'
So their average is (3+5)/2=4<19.
The rest does not satisfy the condition thus will not be in the result.
I know how to and was able to find the average of orders for all products under the same type, but I have no idea how to find the average of orders for all other products under the same type.
I know how to find the desired products with the help I received on my previous question, How to find the average of all other products in postgresql, but that only works when there is one order for each product. I don't know how to proceed if there are multiple orders for each product. This is the "overestimated" bit I mentioned at the beginning... :(
The answers I received to my previous question have this problem:
DEMO (db<>fiddle). The tables in the demo are much more similar to the ones I'm working with, and as you see, there are many rows for one product. (The duplicated rows are by accident. The values just happened to be the same)
I am using PostgreSQL, but the exercise forbids the use of several keywords including: WITH, OVER, LIMIT, PARTITION, or LATERAL. I realize that they are commonly used in most solutions I've found and the ones provided to me, but I cannot use them because no result will be returned otherwise... :(
I know not being allowed to use these keywords can be annoying, but I honestly don't know what to do so please help! :)
I wrote a query for all combinations: Total by Product Code, Total by Product Type, etc. You can calculate the average value if you need it using (SUM values / COUNT values).
select
main1.product_code,
main1.product_type,
main1.total as "Total by Product Code",
main1.sales_count as "Count by Product Code",
main2.total as "Total by Product Type",
main2.sales_count as "Count by Product Type",
main2.total - main1.total as "Total by Other Products Types (ignore this Product Code)",
main2.sales_count - main1.sales_count as "Count by Other Products Types (ignore this Product Code)"
from
(
select
s.product_code,
p.product_type,
sum(s.qty) as total,
count(*) as sales_count
from
examples.sales s
left join
examples.products p on p.product_code = s.product_code
group by
s.product_code, p.product_type
) main1
left join
(
select t1.product_type, sum(t1.qty) as total, count(*) as sales_count from (
select * from examples.sales s
left join examples.products p on p.product_code = s.product_code
) t1
group by t1.product_type
) main2 on main1.product_type = main2.product_type
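Result (the "Other Pr.Types" columns ignore the current product code):
Pr.Code | Pr.Type | Total by Pr.Code | Count by Pr.Code | Total by Pr.Type | Count by Pr.Type | Total by Other Pr.Types | Count by Other Pr.Types
-------------------------------------------------------------------------------------------------------------------------------------------------
ref29   | TOYS    | 5                | 1                | 27               | 3                | 22                      | 2
ref06   | BOOKS   | 20               | 1                | 34               | 2                | 14                      | 1
ref13   | TOYS    | 3                | 1                | 27               | 3                | 24                      | 2
ref02   | ALBUMS  | 16               | 2                | 36               | 4                | 20                      | 2
ref10   | TOYS    | 19               | 1                | 27               | 3                | 8                       | 2
ref07   | ALBUMS  | 11               | 1                | 36               | 4                | 25                      | 3
ref04   | BOOKS   | 14               | 1                | 34               | 2                | 20                      | 1
ref09   | ALBUMS  | 9                | 1                | 36               | 4                | 27                      | 3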
Fix two errors in the setup
1.
A product can be ordered several times ...
It should still appear once in the Products table. The 2nd entry of ref02 is wrong.
2.
So to find how many times a specific product was ordered would mean to find the sum of quantities ordered from all orders of the product.
So your rationale for ref07 doesn't hold:
Since product ref07 is 'ALBUMS', and **ref02** and ref09 are also 'ALBUMS'.
So their average is (12+9+4)/3=8.3 <11.
Counting the two sales for ref02 separately is wrong in light of your definition. Operate with sums per product:
Since product ref07 is 'ALBUMS', and ref02 and ref09 are also 'ALBUMS'.
So their average is (16+9)/2 = 12.5 > 11. -- doesn't qualify!
Answer
find the products that were ordered 20% more than the average of all other products of the same type.
I am putting a proper solution first: an efficient query for Postgres 11+ using a window function with custom window frame over the aggregate sum()
SELECT product_code, orders
FROM (
SELECT product_code, sum(s.orders) AS orders
, avg(sum(s.orders)) OVER (PARTITION BY p.product_type
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
EXCLUDE CURRENT ROW) AS avg_orders
FROM product p
JOIN sales s USING (product_code)
GROUP BY product_code, p.product_type
) sub
WHERE avg_orders * 1.2 < orders
ORDER BY product_code; -- optional
Result (with the errors mentioned above fixed):
product_code | orders
---------------------
ref02        | 16
ref06        | 20
ref10        | 19
Much more efficient than the below.
Postgres can apply a window function over an aggregate in the same query level. See:
Postgres window function and group by exception
How to use a SQL window function to calculate a percentage of an aggregate
At your request, an inefficient solution working around modern SQL features:
SELECT product_code, ps.orders
FROM (
SELECT product_code, p.product_type, sum(s.orders) AS orders
FROM product p
JOIN sales s USING (product_code)
GROUP BY product_code, p.product_type
) ps
JOIN LATERAL (
SELECT avg(orders) AS avg_orders
FROM (
SELECT sum(s1.orders) AS orders
FROM product p1
JOIN sales s1 USING (product_code)
WHERE p1.product_type = ps.product_type
AND p1.product_code <> ps.product_code
GROUP BY product_code
) sub
) a ON a.avg_orders * 1.2 < ps.orders
ORDER BY product_code; -- optional
db<>fiddle here
Same result.
We have to repeat the basic aggregation for sums in the subquery, since we cannot use a CTE to materialize it. (Possible remaining workaround: use a temporary table instead.)
Basics in my answer to your previous question:
How to find the average of all other products in postgresql
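The question rules out WITH, OVER, LIMIT, PARTITION, and LATERAL. One possible way to stay inside that restriction is plain arithmetic on derived tables: the average of all other products of a type equals (type total - own total) / (number of products in the type - 1). The sketch below is not from the answers above; it runs against in-memory SQLite rather than PostgreSQL, uses the question's column name qty, and lists ref02 only once in products, as the correction above requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_code TEXT PRIMARY KEY, product_type TEXT);
CREATE TABLE sales (product_code TEXT, qty INTEGER);
INSERT INTO products VALUES
  ('ref01','BOOKS'),('ref02','ALBUMS'),('ref06','BOOKS'),('ref04','BOOKS'),
  ('ref07','ALBUMS'),('ref10','TOYS'),('ref13','TOYS'),('ref09','ALBUMS'),
  ('ref29','TOYS');
INSERT INTO sales VALUES
  ('ref01',15),('ref02',12),('ref06',20),('ref04',14),('ref07',11),
  ('ref10',19),('ref13',3),('ref09',9),('ref29',5),('ref02',4);
""")

# Total quantity per product; repeated below because a CTE is not allowed.
PER_PRODUCT = """
    SELECT p.product_code, p.product_type, SUM(s.qty) AS orders
    FROM products p JOIN sales s ON s.product_code = p.product_code
    GROUP BY p.product_code, p.product_type
"""

QUERY = f"""
SELECT ps.product_code, ps.orders
FROM ({PER_PRODUCT}) ps
JOIN (
    -- Per type: sum of the per-product totals and number of products.
    SELECT product_type, SUM(orders) AS type_orders, COUNT(*) AS type_products
    FROM ({PER_PRODUCT}) per_product
    GROUP BY product_type
) pt ON pt.product_type = ps.product_type
-- NULLIF yields NULL for a type with a single product, so that row drops out.
WHERE ps.orders > 1.2 * (pt.type_orders - ps.orders)
                  / NULLIF(pt.type_products - 1, 0)
ORDER BY ps.product_code
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [('ref02', 16), ('ref06', 20), ('ref10', 19)]
```

Counting products (rows of the per-product subquery) instead of sales rows is what keeps ref02's two sales from being averaged separately.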

Can i merge the result of two queries?

I'm trying to retrieve some statistics from my database. Concretely, I want to show how many todos are completed versus the total for a checklist.
The structure is as follows:
A category has many cards, a card has many checklists, and a checklist has many assessments.
I can get the amount of assessments or completed assessments with the following query.
SELECT count(a.id) AS completed_count, a.checklist_id, ca.category_id
FROM assessments a
JOIN checklists ch ON ch.id = a.checklist_id
JOIN cards ca ON ca.id = ch.card_id
WHERE a.complete
GROUP BY a.checklist_id, ca.category_id;
This will give me something like this.
completed_count | checklist_id | category_id
-----------------+--------------+-------------
2 | 3 | 2
1 | 2 | 2
2 | 5 | 3
I could then run a second query to get the total amount by removing the WHERE a.complete, and write some code that matches the two results.
But what I really want is a result like this.
completed_amount | total_amount | checklist_id | category_id
------------------+--------------+--------------+-------------
2 | 2 | 3 | 2
1 | 1 | 2 | 2
2 | 2 | 5 | 3
I just can't wrap my head around how I can achieve that.
I think conditional aggregation does what you want:
SELECT sum( (a.complete)::int ) AS completed_count,
count(*) as total_count,
a.checklist_id, ca.category_id
FROM assessments a JOIN
checklists ch
ON ch.id = a.checklist_id JOIN
cards ca
ON ca.id = ch.card_id
GROUP BY a.checklist_id, ca.category_id;
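As a rough illustration of conditional aggregation, the sketch below runs the same idea against in-memory SQLite. The rows are made up for the example (they are not the asker's data), and because SQLite has no boolean cast like ::int, a CASE expression stands in for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cards (id INTEGER PRIMARY KEY, category_id INTEGER);
CREATE TABLE checklists (id INTEGER PRIMARY KEY, card_id INTEGER);
CREATE TABLE assessments (
    id INTEGER PRIMARY KEY, checklist_id INTEGER, complete INTEGER);
-- Hypothetical sample rows: complete is stored as 1 (true) or 0 (false).
INSERT INTO cards VALUES (1, 2), (2, 3);
INSERT INTO checklists VALUES (2, 1), (3, 1), (5, 2);
INSERT INTO assessments (checklist_id, complete) VALUES
  (3, 1), (3, 1), (3, 0), (2, 1), (5, 1), (5, 1), (5, 0), (5, 0);
""")

QUERY = """
SELECT SUM(CASE WHEN a.complete THEN 1 ELSE 0 END) AS completed_count,
       COUNT(*) AS total_count,
       a.checklist_id, ca.category_id
FROM assessments a
JOIN checklists ch ON ch.id = a.checklist_id
JOIN cards ca ON ca.id = ch.card_id
GROUP BY a.checklist_id, ca.category_id
ORDER BY a.checklist_id
"""

rows = conn.execute(QUERY).fetchall()
for completed, total, checklist_id, category_id in rows:
    print(completed, total, checklist_id, category_id)
```

Both counts come from one pass over the joined rows, so no second query or client-side matching is needed.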

SQL 2 Left outer joins with Sum and Group By

Looking for some guidance on this. I am attempting to run a report in my complaint management system: complaints by year, location, and subcategory, showing totals for TotalCredits (child table) and TotalCwts (child table) as well as total ExternalRootCause (on the master table).
This is my SQL, but the TotalCwts and TotalCredits are not being calculated correctly. It calculates 1 time for each child record rather than the total for each master record.
SELECT
dbo.Complaints.Location,
YEAR(dbo.Complaints.ComDate) AS Year,
dbo.Complaints.ComplaintSubcategory,
COUNT(Distinct(dbo.Complaints.ComId)) AS CustomerComplaints,
SUM(DISTINCT CASE WHEN (dbo.Complaints.RootCauseSource = 'External' ) THEN 1 ELSE 0 END) as ExternalRootCause,
SUM(dbo.ComplaintProducts.Cwts) AS TotalCwts,
Coalesce(SUM(dbo.CreditDeductions.CreditAmount),0) AS TotalCredits
FROM dbo.Complaints
JOIN dbo.CustomerComplaints
ON dbo.Complaints.ComId = dbo.CustomerComplaints.ComId
LEFT OUTER JOIN dbo.CreditDeductions
ON dbo.Complaints.ComId = dbo.CreditDeductions.ComId
LEFT OUTER JOIN dbo.ComplaintProducts
ON dbo.Complaints.ComId = dbo.ComplaintProducts.ComId
WHERE
dbo.Complaints.Location = Coalesce(#Location,Location)
GROUP BY
YEAR(dbo.Complaints.ComDate),
dbo.Complaints.Location,
dbo.Complaints.ComplaintSubcategory
ORDER BY
[YEAR] desc,
dbo.Complaints.Location,
dbo.Complaints.ComplaintSubcategory
Data Results
Location | Year | Subcategory | Complaints | External RC | Total Cwts | Total Credits
---------------------------------------------------------------------------------------
Boston | 2016 | Documentation | 1 | 0 | 8 | 8.00
Data Should Read
Location | Year | Subcategory | Complaints | External RC | Total Cwts | Total Credits
---------------------------------------------------------------------------------------
Boston | 2016 | Documentation | 1 | 0 | 4 | 2.00
Above data reflects 1 complaint having 4 Product Records with 1cwt each and 2 credit records with 1.00 each.
What do I need to change in my query or should I approach this query a different way?
The problem is that the 1 complaint has 2 Deductions and 4 products. When you join in this manner then it will return every combination of Deduction/Product for the complaint which gives 8 rows as you're seeing.
One solution, which should work here, is to not query the Deduction and Product tables directly; instead, join to a subquery for each table that returns one row per complaint. In other words, replace:
LEFT OUTER JOIN dbo.CreditDeductions ON dbo.Complaints.ComId = dbo.CreditDeductions.ComId
LEFT OUTER JOIN dbo.ComplaintProducts ON dbo.Complaints.ComId = dbo.ComplaintProducts.ComId
...with this - showing the Deductions table only, you can work out the Products:
LEFT OUTER JOIN (
select ComId, count(*) CountDeductions, sum(CreditAmount) CreditAmount
from dbo.CreditDeductions
group by ComId
) d on d.ComId = Complaints.ComId
You'll have to change the references to dbo.CreditDeductions to just d (or whatever you want to call it).
Once you've done them both, you'll have one row each per complaint, which results in 1 row per complaint containing the counts and totals from the two sub-tables.
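The row multiplication and the pre-aggregation fix can be reproduced with a small sketch. It uses in-memory SQLite rather than SQL Server and only the columns needed for the totals (both simplifications are assumptions), with the case from the question: 1 complaint, 4 product rows of 1 cwt each, and 2 credit rows of 1.00 each.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Complaints (ComId INTEGER PRIMARY KEY, Location TEXT);
CREATE TABLE ComplaintProducts (ComId INTEGER, Cwts INTEGER);
CREATE TABLE CreditDeductions (ComId INTEGER, CreditAmount REAL);
INSERT INTO Complaints VALUES (1, 'Boston');
INSERT INTO ComplaintProducts VALUES (1, 1), (1, 1), (1, 1), (1, 1);
INSERT INTO CreditDeductions VALUES (1, 1.0), (1, 1.0);
""")

# Joining both child tables directly yields 4 x 2 = 8 rows per complaint.
NAIVE = """
SELECT c.Location, SUM(p.Cwts), COALESCE(SUM(d.CreditAmount), 0)
FROM Complaints c
LEFT JOIN CreditDeductions d ON d.ComId = c.ComId
LEFT JOIN ComplaintProducts p ON p.ComId = c.ComId
GROUP BY c.Location
"""

# Collapsing each child table to one row per complaint before joining.
PRE_AGGREGATED = """
SELECT c.Location, COALESCE(SUM(p.Cwts), 0), COALESCE(SUM(d.CreditAmount), 0)
FROM Complaints c
LEFT JOIN (SELECT ComId, SUM(CreditAmount) AS CreditAmount
           FROM CreditDeductions GROUP BY ComId) d ON d.ComId = c.ComId
LEFT JOIN (SELECT ComId, SUM(Cwts) AS Cwts
           FROM ComplaintProducts GROUP BY ComId) p ON p.ComId = c.ComId
GROUP BY c.Location
"""

naive_row = conn.execute(NAIVE).fetchone()
fixed_row = conn.execute(PRE_AGGREGATED).fetchone()
print(naive_row)  # totals inflated to 8 and 8.0
print(fixed_row)  # totals of 4 and 2.0
```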

join on three tables? Error in phpMyAdmin

I'm trying to use a join on three tables query I found in another post (post #5 here). When I try to use this in the SQL tab of one of my tables in phpMyAdmin, it gives me an error:
#1066 - Not unique table/alias: 'm'
The exact query I'm trying to use is:
select r.*,m.SkuAbbr, v.VoucherNbr from arrc_RedeemActivity r, arrc_Merchant m, arrc_Voucher v
LEFT OUTER JOIN arrc_Merchant m ON (r.MerchantID = m.MerchantID)
LEFT OUTER JOIN arrc_Voucher v ON (r.VoucherID = v.VoucherID)
I'm not entirely certain it will do what I need it to do or that I'm using the right kind of join (my grasp of SQL is pretty limited at this point), but I was hoping to at least see what it produced.
(What I'm trying to do, if anyone cares to assist, is get all columns from arrc_RedeemActivity, plus SkuAbbr from arrc_Merchant where the merchant IDs match in those two tables, plus VoucherNbr from arrc_Voucher where VoucherIDs match in those two tables.)
Edited to add table samples
Table arrc_RedeemActivity
RedeemID | VoucherID | MerchantID | RedeemAmt
----------------------------------------------
1 | 2 | 3 | 25
2 | 6 | 5 | 50
Table arrc_Merchant
MerchantID | SkuAbbr
---------------------
3 | abc
5 | def
Table arrc_Voucher
VoucherID | VoucherNbr
-----------------------
2 | 12345
6 | 23456
So ideally, what I'd like to get back would be:
RedeemID | VoucherID | MerchantID | RedeemAmt | SkuAbbr | VoucherNbr
-----------------------------------------------------------------------
1 | 2 | 3 | 25 | abc | 12345
2 | 6 | 5 | 50 | def | 23456
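The problem is that arrc_Merchant and arrc_Voucher are each referenced twice, once in the comma-separated FROM list and again in a LEFT OUTER JOIN, and both references use the same alias (m and v), so the alias is no longer unique.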
If you want to only see rows where there are supporting records in both tables, use:
SELECT r.*,
m.SkuAbbr,
v.VoucherNbr
FROM arrc_RedeemActivity r
JOIN arrc_Merchant m ON m.merchantid = r.merchantid
JOIN arrc_Voucher v ON v.voucherid = r.voucherid
This will show NULL for the m and v references that don't have a match based on the JOIN criteria:
SELECT r.*,
m.SkuAbbr,
v.VoucherNbr
FROM arrc_RedeemActivity r
LEFT JOIN arrc_Merchant m ON m.merchantid = r.merchantid
LEFT JOIN arrc_Voucher v ON v.voucherid = r.voucherid
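The difference between the two variants shows up only when a row has no match. The sketch below uses in-memory SQLite instead of MySQL and adds one hypothetical redemption (RedeemID 3, with MerchantID 9 that is absent from arrc_Merchant) to the question's sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE arrc_RedeemActivity (
    RedeemID INTEGER, VoucherID INTEGER, MerchantID INTEGER, RedeemAmt INTEGER);
CREATE TABLE arrc_Merchant (MerchantID INTEGER, SkuAbbr TEXT);
CREATE TABLE arrc_Voucher (VoucherID INTEGER, VoucherNbr INTEGER);
-- RedeemID 3 is a made-up row whose merchant does not exist.
INSERT INTO arrc_RedeemActivity VALUES (1,2,3,25), (2,6,5,50), (3,6,9,10);
INSERT INTO arrc_Merchant VALUES (3,'abc'), (5,'def');
INSERT INTO arrc_Voucher VALUES (2,12345), (6,23456);
""")

INNER = """
SELECT r.*, m.SkuAbbr, v.VoucherNbr
FROM arrc_RedeemActivity r
JOIN arrc_Merchant m ON m.MerchantID = r.MerchantID
JOIN arrc_Voucher v ON v.VoucherID = r.VoucherID
ORDER BY r.RedeemID
"""

LEFT = """
SELECT r.*, m.SkuAbbr, v.VoucherNbr
FROM arrc_RedeemActivity r
LEFT JOIN arrc_Merchant m ON m.MerchantID = r.MerchantID
LEFT JOIN arrc_Voucher v ON v.VoucherID = r.VoucherID
ORDER BY r.RedeemID
"""

inner_rows = conn.execute(INNER).fetchall()
left_rows = conn.execute(LEFT).fetchall()
print(inner_rows)  # RedeemID 3 is dropped
print(left_rows)   # RedeemID 3 is kept, with None for SkuAbbr
```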

Using multiple left joins to calculate averages and counts

I am trying to figure out how to use multiple left outer joins to calculate average scores and number of cards. I have the following schema and test data. Each deck has 0 or more scores and 0 or more cards. I need to calculate an average score and card count for each deck. I'm using MySQL for convenience, but I eventually want this to run on SQLite on an Android phone.
mysql> select * from deck;
+----+-------+
| id | name |
+----+-------+
| 1 | one |
| 2 | two |
| 3 | three |
+----+-------+
mysql> select * from score;
+---------+-------+---------------------+--------+
| scoreId | value | date | deckId |
+---------+-------+---------------------+--------+
| 1 | 6.58 | 2009-10-05 20:54:52 | 1 |
| 2 | 7 | 2009-10-05 20:54:58 | 1 |
| 3 | 4.67 | 2009-10-05 20:55:04 | 1 |
| 4 | 7 | 2009-10-05 20:57:38 | 2 |
| 5 | 7 | 2009-10-05 20:57:41 | 2 |
+---------+-------+---------------------+--------+
mysql> select * from card;
+--------+-------+------+--------+
| cardId | front | back | deckId |
+--------+-------+------+--------+
| 1 | fron | back | 2 |
| 2 | fron | back | 1 |
| 3 | f1 | b2 | 1 |
+--------+-------+------+--------+
I run the following query...
mysql> select deck.name, sum(score.value)/count(score.value) "Ave",
-> count(card.front) "Count"
-> from deck
-> left outer join score on deck.id=score.deckId
-> left outer join card on deck.id=card.deckId
-> group by deck.id;
+-------+-----------------+-------+
| name | Ave | Count |
+-------+-----------------+-------+
| one | 6.0833333333333 | 6 |
| two | 7 | 2 |
| three | NULL | 0 |
+-------+-----------------+-------+
... and I get the right answer for the average, but the wrong answer for the number of cards. Can someone tell me what I am doing wrong before I pull my hair out?
Thanks!
John
It's running what you're asking--it's joining card 2 and 3 to scores 1, 2, and 3--creating a count of 6 (2 * 3). In card 1's case, it joins to scores 4 and 5, creating a count of 2 (1 * 2).
If you just want a count of cards, like you're currently doing, use COUNT(DISTINCT card.cardId).
select deck.name, coalesce(x.ave,0) as ave, count(card.cardId) as count -- count a card column rather than count(*): count(*) would also count the all-NULL row that the left join produces for a deck without cards
from deck
left join -- flatten the average result rows first
(
select deckId,sum(value)/count(*) as ave -- count the number of rows, not count the column name value. intent is more clear
from score
group by deckId
) as x on x.deckId = deck.id
left outer join card on card.deckId = deck.id -- then join the flattened results to cards
group by deck.id, x.ave, deck.name
order by deck.id
[EDIT]
SQL has a built-in average function, just use this:
select deckId, avg(value) as ave
from score
group by deckId
What's going wrong is that you're creating a Cartesian product between score and card.
Here's how it works: when you join deck to score, you may have multiple rows match. Then each of these multiple rows is joined to all of the matching rows in card. There's no condition preventing that from happening, and the default join behavior when no condition restricts it is to join all rows in one table to all rows in another table.
To see it in action, try this query, without the group by:
select *
from deck
left outer join score on deck.id=score.deckId
left outer join card on deck.id=card.deckId;
You'll see a lot of repeated data in the columns that come from score and card. When you calculate the AVG() over data that has repeats in it, the redundant values magically disappear (as long as the values are repeated uniformly). But when you COUNT() or SUM() them, the totals are way off.
There may be remedies for inadvertent Cartesian products. In your case, you can use COUNT(DISTINCT) to compensate:
select deck.name, avg(score.value) "Ave", count(DISTINCT card.front) "Count"
from deck
left outer join score on deck.id=score.deckId
left outer join card on deck.id=card.deckId
group by deck.id;
This solution doesn't solve all cases of inadvertent Cartesian products. The more general-purpose solution is to break it up into two separate queries:
select deck.name, avg(score.value) "Ave"
from deck
left outer join score on deck.id=score.deckId
group by deck.id;
select deck.name, count(card.front) "Count"
from deck
left outer join card on deck.id=card.deckId
group by deck.id;
Not every task in database programming must be done in a single query. It can even be more efficient (as well as simpler, easier to modify, and less error-prone) to use individual queries when you need multiple statistics.
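Since the eventual target is SQLite, the multiplication can be checked there directly. This is a minimal sketch with the question's sample rows (the score date column is left out as a simplification); it counts distinct card IDs rather than card fronts, which is an assumption about which column identifies a card.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE deck (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE score (scoreId INTEGER PRIMARY KEY, value REAL, deckId INTEGER);
CREATE TABLE card (
    cardId INTEGER PRIMARY KEY, front TEXT, back TEXT, deckId INTEGER);
INSERT INTO deck VALUES (1,'one'), (2,'two'), (3,'three');
INSERT INTO score VALUES (1,6.58,1), (2,7,1), (3,4.67,1), (4,7,2), (5,7,2);
INSERT INTO card VALUES (1,'fron','back',2), (2,'fron','back',1), (3,'f1','b2',1);
""")

# Each score row pairs with each card row of the same deck before counting.
NAIVE = """
SELECT deck.name, COUNT(card.cardId)
FROM deck
LEFT OUTER JOIN score ON deck.id = score.deckId
LEFT OUTER JOIN card ON deck.id = card.deckId
GROUP BY deck.id
ORDER BY deck.id
"""

# DISTINCT collapses the repeated card rows back to one per card.
DISTINCT = """
SELECT deck.name, COUNT(DISTINCT card.cardId)
FROM deck
LEFT OUTER JOIN score ON deck.id = score.deckId
LEFT OUTER JOIN card ON deck.id = card.deckId
GROUP BY deck.id
ORDER BY deck.id
"""

naive_counts = conn.execute(NAIVE).fetchall()
distinct_counts = conn.execute(DISTINCT).fetchall()
print(naive_counts)     # [('one', 6), ('two', 2), ('three', 0)]
print(distinct_counts)  # [('one', 2), ('two', 1), ('three', 0)]
```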
Using left joins isn't a good approach, in my opinion. Here's a standard SQL query for the result you want.
select
name,
(select avg(value) from score where score.deckId = deck.id) as Ave,
(select count(*) from card where card.deckId = deck.id) as "Count"
from deck;