MS Access 2007 query pulls same records multiple times - sql

I have a problem: my query in MS Access 2007 pulls the same records multiple times.
There are two tables: Product and Sales.
Product Table
ID | Name | Price | Code
01 | PEN | 0.10$ | 01
02 | ITEM | 0.20$ | 2567
Sales table:
ID | Code | Amount
1 | 01 | 4
2 | 2567 | 2
And here is the query:
SELECT Product.Name, Product.Price, Sales.Amount
FROM Product, Sales
WHERE Product.Code IN (SELECT Sales.Code FROM Sales);
This is the result:
Name Price Amount
PEN $0.10 4
PEN $0.10 4
ITEM $0.20 2
ITEM $0.20 2

Change your query to
SELECT Product.Name, Product.Price, Sales.Amount
FROM Product, Sales
WHERE Product.Code = Sales.Code;

Your query is currently joining every record in Product to every record in Sales, resulting in multiples. You need to do a join between them, either in the WHERE clause like Yousaf suggested, or like this, which is more the standard way to do it:
SELECT Product.Name, Product.Price, Sales.Amount
FROM Product
INNER JOIN Sales ON Product.Code = Sales.Code
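The difference between the two forms can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Access (an assumption; Access SQL differs in details), with the sample rows from the question. The comma-separated FROM without a join condition pairs every product with every sale, while the INNER JOIN keeps only matching pairs.

```python
import sqlite3

# SQLite stands in for Access here (an assumption); table and column
# names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (ID TEXT, Name TEXT, Price REAL, Code TEXT);
    CREATE TABLE Sales   (ID INTEGER, Code TEXT, Amount INTEGER);
    INSERT INTO Product VALUES ('01', 'PEN', 0.10, '01'),
                               ('02', 'ITEM', 0.20, '2567');
    INSERT INTO Sales   VALUES (1, '01', 4), (2, '2567', 2);
""")

# Original query: no condition links Product to Sales, so this is a cross join.
cross_rows = conn.execute("""
    SELECT Product.Name, Sales.Amount
    FROM Product, Sales
    WHERE Product.Code IN (SELECT Sales.Code FROM Sales)
""").fetchall()

# Corrected query: each product is matched only to its own sales rows.
join_rows = conn.execute("""
    SELECT Product.Name, Sales.Amount
    FROM Product
    INNER JOIN Sales ON Product.Code = Sales.Code
    ORDER BY Sales.ID
""").fetchall()

print(len(cross_rows))  # 4 rows: 2 products x 2 sales
print(join_rows)        # [('PEN', 4), ('ITEM', 2)]
```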

Related

Finding products that were ordered 20% more times than the average of all other products in postgresql

I have asked a similar question and have received some help from some very nice people.
How to find the average of all other products in postgresql.
That question did not cover everything. I thought I could work out the rest on my own once the hardest part was resolved, but apparently I overestimated my abilities. So I'm posting another question... :)
The question is as follows.
I have a table Products which looks like the following:
+-----------+-----------+----------+
|ProductCode|ProductType| .... |
+-----------+-----------+----------+
| ref01 | BOOKS | .... |
| ref02 | ALBUMS | .... |
| ref06 | BOOKS | .... |
| ref04 | BOOKS | .... |
| ref07 | ALBUMS | .... |
| ref10 | TOYS | .... |
| ref13 | TOYS | .... |
| ref09 | ALBUMS | .... |
| ref29 | TOYS | .... |
| ref02 | ALBUMS | .... |
| ..... | ..... | .... |
+-----------+-----------+----------+
Another table Sales which looks like the following:
+-----------+-----------+----------+
|ProductCode| qty | .... |
+-----------+-----------+----------+
| ref01 | 15 | .... |
| ref02 | 12 | .... |
| ref06 | 20 | .... |
| ref04 | 14 | .... |
| ref07 | 11 | .... |
| ref10 | 19 | .... |
| ref13 | 3 | .... |
| ref09 | 9 | .... |
| ref29 | 5 | .... |
| ref02 | 4 | .... |
| ..... | ..... | .... |
+-----------+-----------+----------+
I am trying to find the products that were ordered 20% more than the average of all other products of the same type.
A product can be ordered several times and the quantities (qty) of each order might not be the same. Such as ref02 in the sample table. I only included one example (ref02) but it is the case for all products. So to find how many times a specific product was ordered would mean to find the sum of quantities ordered from all orders of the product.
By manually calculating, the result should be something like:
+-----------+-----------+----------+
|ProductCode| qty | .... |
+-----------+-----------+----------+
| ref02 | 16 | .... |
| ref06 | 20 | .... |
| ref07 | 11 | .... |
| ref10 | 19 | .... |
| ..... | ..... | .... |
+-----------+-----------+----------+
So, looking at product ref02 of type ALBUMS, I need to find the average of the orders of ALL OTHER ALBUMS.
In this case, it is the average of ref07 and ref09, but there are more in the actual table. So what I need to do is the following:
Since product ref02 is 'ALBUMS' and there are two orders of ref02, the total orders will be 12+4=16. And ref07 and ref09 are also 'ALBUMS'.
So their average is (11+9)/2=10 < 12+4=16.
Since product ref06 is 'BOOKS', and **ref01** and ref04 are also 'BOOKS'.
So their average is (15+14)/2=14.5 <20.
Since product ref07 is 'ALBUMS', and **ref02** and ref09 are also 'ALBUMS'.
So their average is (12+9+4)/3=8.3 <11.
Since product ref10 is 'TOYS', and ref13 and ref29 are also 'TOYS'
So their average is (3+5)/2=4<19.
The rest does not satisfy the condition thus will not be in the result.
I know how to and was able to find the average of orders for all products under the same type, but I have no idea how to find the average of orders for all other products under the same type.
I know how to find the desired products with the help I received on my previous question, How to find the average of all other products in postgresql, but only when there is a single order for each product. I don't know how to proceed if there are multiple orders for each product. This is the "overestimated" bit I mentioned at the beginning... :(
The answers I received on my previous question have this problem:
DEMO (db<>fiddle). The tables in the demo are much more similar to the ones I'm working with, and as you see, there are many rows for one product. (The duplicated rows are by accident. The values just happened to be the same)
I am using PostgreSQL, but the exercise forbids the use of several keywords including: WITH, OVER, LIMIT, PARTITION, or LATERAL. I realize that they are commonly used in most solutions I've found and the ones provided to me, but I cannot use them because no result will be returned otherwise... :(
I know not being allowed to use these keywords can be annoying, but I honestly don't know what to do so please help! :)
I wrote a query for all combinations: total by product code, total by product type, etc. If you need the average, you can calculate it as (sum of values / count of values).
select
    main1.product_code,
    main1.product_type,
    main1.total as "Total by Product Code",
    main1.sales_count as "Count by Product Code",
    main2.total as "Total by Product Type",
    main2.sales_count as "Count by Product Type",
    main2.total - main1.total as "Total by Other Products Types (ignore this Product Code)",
    main2.sales_count - main1.sales_count as "Count by Other Products Types (ignore this Product Code)"
from
(
    select
        s.product_code,
        p.product_type,
        sum(s.qty) as total,
        count(*) as sales_count
    from
        examples.sales s
    left join
        examples.products p on p.product_code = s.product_code
    group by
        s.product_code, p.product_type
) main1
left join
(
    select t1.product_type, sum(t1.qty) as total, count(*) as sales_count from (
        select * from examples.sales s
        left join examples.products p on p.product_code = s.product_code
    ) t1
    group by t1.product_type
) main2 on main1.product_type = main2.product_type
Result:
Pr.Code | Pr.Type | Total by Pr.Code | Count by Pr.Code | Total by Pr.Type | Count by Pr.Type | Total by Other Pr.Types (ignore this Product Code) | Count by Other Pr.Types (ignore this Product Code)
ref29 | TOYS | 5 | 1 | 27 | 3 | 22 | 2
ref06 | BOOKS | 20 | 1 | 34 | 2 | 14 | 1
ref13 | TOYS | 3 | 1 | 27 | 3 | 24 | 2
ref02 | ALBUMS | 16 | 2 | 36 | 4 | 20 | 2
ref10 | TOYS | 19 | 1 | 27 | 3 | 8 | 2
ref07 | ALBUMS | 11 | 1 | 36 | 4 | 25 | 3
ref04 | BOOKS | 14 | 1 | 34 | 2 | 20 | 1
ref09 | ALBUMS | 9 | 1 | 36 | 4 | 27 | 3
Fix two errors in the setup
1.
A product can be ordered several times ...
It should still appear once in the Products table. The 2nd entry of ref02 is wrong.
2.
So to find how many times a specific product was ordered would mean to find the sum of quantities ordered from all orders of the product.
So your rationale for ref07 doesn't hold:
Since product ref07 is 'ALBUMS', and **ref02** and ref09 are also 'ALBUMS'.
So their average is (12+9+4)/3=8.3 <11.
Counting the two sales for ref02 separately is wrong in light of your definition. Operate with sums per product:
Since product ref07 is 'ALBUMS', and ref02 and ref09 are also 'ALBUMS'.
So their average is (16+9)/2 = 12.5 > 11. -- doesn't qualify!
Answer
find the products that were ordered 20% more than the average of all other products of the same type.
I am putting a proper solution first: an efficient query for Postgres 11+ using a window function with custom window frame over the aggregate sum()
SELECT product_code, orders
FROM  (
   SELECT product_code, sum(s.orders) AS orders
        , avg(sum(s.orders)) OVER (PARTITION BY p.product_type
                                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
                                   EXCLUDE CURRENT ROW) AS avg_orders
   FROM   product p
   JOIN   sales s USING (product_code)
   GROUP  BY product_code, p.product_type
   ) sub
WHERE  avg_orders * 1.2 < orders
ORDER  BY product_code;  -- optional
Result (with the errors mentioned above fixed):
product_code | orders
ref02 | 16
ref06 | 20
ref10 | 19
Much more efficient than the below.
Postgres can apply a window function over an aggregate in the same query level. See:
Postgres window function and group by exception
How to use a SQL window function to calculate a percentage of an aggregate
At your request, an inefficient solution working around modern SQL features:
SELECT product_code, ps.orders
FROM  (
   SELECT product_code, p.product_type, sum(s.orders) AS orders
   FROM   product p
   JOIN   sales s USING (product_code)
   GROUP  BY product_code, p.product_type
   ) ps
JOIN   LATERAL (
   SELECT avg(orders) AS avg_orders
   FROM  (
      SELECT sum(s1.orders) AS orders
      FROM   product p1
      JOIN   sales s1 USING (product_code)
      WHERE  p1.product_type = ps.product_type
      AND    p1.product_code <> ps.product_code
      GROUP  BY product_code
      ) sub
   ) a ON a.avg_orders * 1.2 < ps.orders
ORDER  BY product_code;  -- optional
db<>fiddle here
Same result.
We have to repeat the basic aggregation for sums in the subquery, since we cannot use a CTE to materialize it. (Possible remaining workaround: use a temporary table instead.)
Basics in my answer to your previous question:
How to find the average of all other products in postgresql
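Since the exercise forbids WITH, OVER, PARTITION and LATERAL, one possible variant (a sketch, not taken from either answer) avoids all of them by using the identity "average of the others = (type total - own total) / (number of products in the type - 1)". The sketch below runs in Python's sqlite3 as a stand-in for PostgreSQL (an assumption), with the table names products and sales and the column qty chosen to follow the question, the duplicate ref02 row removed from products, and the SQL itself kept plain enough that it should carry over to PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_code TEXT PRIMARY KEY, product_type TEXT);
    CREATE TABLE sales    (product_code TEXT, qty INTEGER);
    INSERT INTO products VALUES
        ('ref01','BOOKS'), ('ref02','ALBUMS'), ('ref06','BOOKS'),
        ('ref04','BOOKS'), ('ref07','ALBUMS'), ('ref10','TOYS'),
        ('ref13','TOYS'),  ('ref09','ALBUMS'), ('ref29','TOYS');
    INSERT INTO sales VALUES
        ('ref01',15), ('ref02',12), ('ref06',20), ('ref04',14), ('ref07',11),
        ('ref10',19), ('ref13',3),  ('ref09',9),  ('ref29',5),  ('ref02',4);
""")

# Total quantity per product; repeated below because a CTE is not allowed.
PER_PRODUCT = """
    SELECT p.product_code AS product_code,
           p.product_type AS product_type,
           SUM(s.qty)     AS orders
    FROM products p
    JOIN sales s ON s.product_code = p.product_code
    GROUP BY p.product_code, p.product_type
"""

query = f"""
    SELECT ps.product_code, ps.orders
    FROM ({PER_PRODUCT}) ps
    JOIN (
        SELECT product_type,
               SUM(orders) AS type_total,   -- sum over all products of the type
               COUNT(*)    AS n_products    -- number of products of the type
        FROM ({PER_PRODUCT}) x
        GROUP BY product_type
    ) pt ON pt.product_type = ps.product_type
    WHERE pt.n_products > 1                 -- avoid dividing by zero
      AND ps.orders > 1.2 * (pt.type_total - ps.orders) / (pt.n_products - 1)
    ORDER BY ps.product_code
"""

result = conn.execute(query).fetchall()
print(result)  # [('ref02', 16), ('ref06', 20), ('ref10', 19)]
```

Multiplying by 1.2 before dividing keeps the arithmetic in floating point, so integer division does not truncate the average.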

SQL MIN() with GROUP BY select additional columns

I am trying to query a sql database table for the minimum price for products. I also want to grab an additional column with the value of the row with the minimum price. My data looks something like this.
ProductId | Price | Location
1 | 50 | florida
1 | 55 | texas
1 | 53 | california
2 | 65 | florida
2 | 64 | texas
2 | 60 | new york
I can query the minimum price for a product with this query
select ProductId, Min(Price)
from Table
group by ProductId
What I want to do is also include the Location where the Min price is being queried from in the above query. Is there a standard way to achieve this?
One method uses a correlated subquery:
select t.*
from t
where t.price = (select min(t2.price) from t t2 where t2.productid = t.productid);
In most databases, this has very good performance with an index on (productid, price).
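To see what the correlated subquery returns, here is a minimal sketch using Python's sqlite3 with the sample rows from the question. The table name t and the lowercase column names follow the answer; SQLite as the engine is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (productid INTEGER, price INTEGER, location TEXT);
    INSERT INTO t VALUES
        (1, 50, 'florida'), (1, 55, 'texas'), (1, 53, 'california'),
        (2, 65, 'florida'), (2, 64, 'texas'), (2, 60, 'new york');
""")

# Keep each row whose price equals the minimum price of its own product.
min_rows = conn.execute("""
    SELECT t.*
    FROM t
    WHERE t.price = (SELECT MIN(t2.price) FROM t t2 WHERE t2.productid = t.productid)
    ORDER BY t.productid
""").fetchall()

print(min_rows)  # [(1, 50, 'florida'), (2, 60, 'new york')]
```

If two locations tie for the minimum price of a product, this form returns both rows.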

SQL 2 Left outer joins with Sum and Group By

Looking for some guidance on this. I am attempting to run a report in my complaint management system: complaints by Year, Location and Subcategory, showing totals for TotalCredits (child table) and TotalCwts (child table), as well as the total ExternalRootCause (on the master table).
This is my SQL, but the TotalCwts and TotalCredits are not being calculated correctly. It calculates 1 time for each child record rather than the total for each master record.
SELECT
dbo.Complaints.Location,
YEAR(dbo.Complaints.ComDate) AS Year,
dbo.Complaints.ComplaintSubcategory,
COUNT(Distinct(dbo.Complaints.ComId)) AS CustomerComplaints,
SUM(DISTINCT CASE WHEN (dbo.Complaints.RootCauseSource = 'External' ) THEN 1 ELSE 0 END) as ExternalRootCause,
SUM(dbo.ComplaintProducts.Cwts) AS TotalCwts,
Coalesce(SUM(dbo.CreditDeductions.CreditAmount),0) AS TotalCredits
FROM dbo.Complaints
JOIN dbo.CustomerComplaints
ON dbo.Complaints.ComId = dbo.CustomerComplaints.ComId
LEFT OUTER JOIN dbo.CreditDeductions
ON dbo.Complaints.ComId = dbo.CreditDeductions.ComId
LEFT OUTER JOIN dbo.ComplaintProducts
ON dbo.Complaints.ComId = dbo.ComplaintProducts.ComId
WHERE
dbo.Complaints.Location = Coalesce(@Location,Location)
GROUP BY
YEAR(dbo.Complaints.ComDate),
dbo.Complaints.Location,
dbo.Complaints.ComplaintSubcategory
ORDER BY
[YEAR] desc,
dbo.Complaints.Location,
dbo.Complaints.ComplaintSubcategory
Data Results
Location | Year | Subcategory | Complaints | External RC | Total Cwts | Total Credits
---------------------------------------------------------------------------------------
Boston | 2016 | Documentation | 1 | 0 | 8 | 8.00
Data Should Read
Location | Year | Subcategory | Complaints | External RC | Total Cwts | Total Credits
---------------------------------------------------------------------------------------
Boston | 2016 | Documentation | 1 | 0 | 4 | 2.00
Above data reflects 1 complaint having 4 Product Records with 1cwt each and 2 credit records with 1.00 each.
What do I need to change in my query or should I approach this query a different way?
The problem is that the 1 complaint has 2 Deductions and 4 Products. When you join in this manner, it returns every combination of Deduction/Product for the complaint, which gives the 8 rows you're seeing.
One solution, which should work here, is to not query the Deduction and Product tables directly; instead, join to a subquery that returns one row per complaint for each child table. In other words, replace:
LEFT OUTER JOIN dbo.CreditDeductions ON dbo.Complaints.ComId = dbo.CreditDeductions.ComId
LEFT OUTER JOIN dbo.ComplaintProducts ON dbo.Complaints.ComId = dbo.ComplaintProducts.ComId
...with this - showing the Deductions table only, you can work out the Products:
LEFT OUTER JOIN (
select ComId, count(*) CountDeductions, sum(CreditAmount) CreditAmount
from dbo.CreditDeductions
group by ComId
) d on d.ComId = Complaints.ComId
You'll have to change the references to dbo.CreditDeductions to just d (or whatever you want to call it).
Once you've done them both, you'll have one row from each per complaint, which results in 1 row per complaint containing the counts and totals from the two sub-tables.
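The effect can be reproduced with a small sketch in Python's sqlite3 (a stand-in for SQL Server, which is an assumption). It uses one complaint with 4 product rows of 1 cwt each and 2 credit rows of 1.00 each, as described in the question; the tables are cut down to the columns needed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Complaints        (ComId INTEGER, Location TEXT);
    CREATE TABLE ComplaintProducts (ComId INTEGER, Cwts INTEGER);
    CREATE TABLE CreditDeductions  (ComId INTEGER, CreditAmount REAL);
    INSERT INTO Complaints        VALUES (1, 'Boston');
    INSERT INTO ComplaintProducts VALUES (1, 1), (1, 1), (1, 1), (1, 1);
    INSERT INTO CreditDeductions  VALUES (1, 1.0), (1, 1.0);
""")

# Joining both child tables directly yields 4 x 2 = 8 rows per complaint,
# so both sums are inflated.
naive = conn.execute("""
    SELECT c.Location, SUM(p.Cwts), COALESCE(SUM(d.CreditAmount), 0)
    FROM Complaints c
    LEFT JOIN CreditDeductions  d ON d.ComId = c.ComId
    LEFT JOIN ComplaintProducts p ON p.ComId = c.ComId
    GROUP BY c.Location
""").fetchone()

# Pre-aggregating each child table to one row per complaint avoids the fan-out.
fixed = conn.execute("""
    SELECT c.Location, SUM(p.Cwts), COALESCE(SUM(d.CreditAmount), 0)
    FROM Complaints c
    LEFT JOIN (SELECT ComId, SUM(CreditAmount) AS CreditAmount
               FROM CreditDeductions GROUP BY ComId) d ON d.ComId = c.ComId
    LEFT JOIN (SELECT ComId, SUM(Cwts) AS Cwts
               FROM ComplaintProducts GROUP BY ComId) p ON p.ComId = c.ComId
    GROUP BY c.Location
""").fetchone()

print(naive)  # ('Boston', 8, 8.0)
print(fixed)  # ('Boston', 4, 2.0)
```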

SQL select only highest date

For a project I want to generate a price list.
I want to get only the latest prices from each supplier for each article.
There are just those two tables.
Table articles
ARTNR | TXT | ACTIVE | SUPPLIER
------------------------------------------
10 | APPLE | Y | 10
20 | ORANGE | Y | 10
30 | KEYBOARD | N | 20
40 | ORANGE | Y | 20
50 | BANANA | Y | 10
60 | CHERRY | Y | 10
Table prices
ARTNR | PRCGRP | PRCDAT | PRICE
--------------------------------------
10 | 10 | 01-Aug-10 | 2.1
10 | 10 | 05-Aug-11 | 2.2
10 | 10 | 21-Aug-12 | 2.5
20 | 0 | 01-Aug-10 | 2.1
20 | 10 | 09-Aug-12 | 2.3
10 | 10 | 14-Aug-13 | 2.7
This is what I have so far:
SELECT
ARTICLES.[ARTNR], ARTICLES.[TXT], ARTICLES.[ACTIVE], ARTICLES.[SUPPLIER], PRICES.PRCGRP, PRICES.PRCDAT, PRICES.PRICE
FROM
ARTICLES INNER JOIN PRICES ON ARTICLES.ARTNR = PRICES.ARTNR
WHERE
(
(ARTICLES.[ACTIVE]="Y") AND
(ARTICLES.[SUPPLIER]=10) AND
(PRICES.PRCGRP=0) AND
(PRICES.PRCDAT=(SELECT MAX(PRCDAT) FROM PRICES as art WHERE art.ARTNR = PRICES.artnr) )
)
ORDER BY ARTICLES.ARTNR
;
It is okay to choose just one supplier each time, but I want only the latest price for each article.
The problem is:
Lots of articles do not show up with the query above,
but I cannot figure out what is wrong.
I can see that they should be in the resultset when I leave out the subselect on max prcdat.
What is wrong?
Your subquery to get the latest price does not take the other conditions into account. That is, when you fetch the latest price date, you may get a date from another price group or from an inactive article. When you then match that date against the filtered list, which contains only active articles and a single price group, no row satisfies both.
You need to either duplicate your conditions or, better, move them inside the subquery so that it finds the latest price under those conditions. I can't test against Access, but something like this should be possible if the SQL is not too limited:
SELECT a.artnr, a.txt, a.active, a.supplier, p.prcgrp, p.prcdat, p.price
FROM articles a INNER JOIN prices p ON a.ARTNR = p.ARTNR
JOIN (
SELECT a.artnr, MAX(p.prcdat) prcdat
FROM articles a JOIN prices p ON a.artnr = p.artnr
WHERE a.active='Y' AND a.supplier=10 AND p.prcgrp=10
GROUP BY a.artnr) z
ON a.artnr = z.artnr AND p.prcdat = z.prcdat
ORDER BY a.ARTNR
If the SQL support in Access won't allow a join with a subquery, you can just move the conditions inside your existing subquery, something like:
SELECT a.artnr, a.txt, a.active, a.supplier, p.prcgrp, p.prcdat, p.price
FROM articles a INNER JOIN prices p ON a.ARTNR = p.ARTNR
WHERE p.prcdat = (
SELECT MAX(p2.prcdat)
FROM articles a2 JOIN prices p2 ON a2.artnr = p2.artnr
WHERE a.artnr = a2.artnr AND a2.active='Y' AND a2.supplier=10 AND p2.prcgrp=10
)
ORDER BY a.ARTNR;
Note that due to limitations in identifying a unique price (no primary key in prices), the queries may give duplicates if several prices for the same article have the same prcdat. If that's a problem, you'll probably need to duplicate your conditions outside the subquery too.
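The missing rows can be reproduced with a small sketch in Python's sqlite3 (a stand-in for Access, which is an assumption). It uses the sample data from the question with the dates rewritten as ISO strings so that MAX() orders them correctly, and the asker's filter PRCGRP = 0. The corrected form is a simplified variant of the second query above: it only adds the price-group condition to the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (artnr INTEGER, txt TEXT, active TEXT, supplier INTEGER);
    CREATE TABLE prices   (artnr INTEGER, prcgrp INTEGER, prcdat TEXT, price REAL);
    INSERT INTO articles VALUES
        (10,'APPLE','Y',10), (20,'ORANGE','Y',10), (30,'KEYBOARD','N',20),
        (40,'ORANGE','Y',20), (50,'BANANA','Y',10), (60,'CHERRY','Y',10);
    INSERT INTO prices VALUES
        (10,10,'2010-08-01',2.1), (10,10,'2011-08-05',2.2), (10,10,'2012-08-21',2.5),
        (20, 0,'2010-08-01',2.1), (20,10,'2012-08-09',2.3), (10,10,'2013-08-14',2.7);
""")

BASE = """
    SELECT a.artnr, p.prcdat, p.price
    FROM articles a JOIN prices p ON a.artnr = p.artnr
    WHERE a.active = 'Y' AND a.supplier = 10 AND p.prcgrp = 0
      AND p.prcdat = (SELECT MAX(p2.prcdat) FROM prices p2
                      WHERE p2.artnr = p.artnr {extra})
"""

# Subquery ignores the price group: article 20's latest date belongs to
# group 10, so its group-0 row is filtered out.
broken = conn.execute(BASE.format(extra="")).fetchall()

# Subquery restricted to the same price group as the outer row.
fixed = conn.execute(BASE.format(extra="AND p2.prcgrp = p.prcgrp")).fetchall()

print(broken)  # []
print(fixed)   # [(20, '2010-08-01', 2.1)]
```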

Adding another column based on different criteria (SQL-server)

I do quite a bit of data analysis and use SQL on a daily basis but my queries are rather simple, usually pulling a lot of data which I thereafter manipulate in excel, where I'm a lot more experienced.
This time though I'm trying to generate some Live Charts which have as input a single SQL query. I will now have to create complex tables without the aid of the excel tools I'm so familiar with.
The problem is the following:
We have telesales agents who book appointments by answering inbound calls and making outbound calls. These generate leads that might result in a sale. The relevant tables and fields for this problem are these:
Contact Table
Agent
Sales Table
Price
OutboundCallDate
I want to know for each telesales agent their respective Total Sales amount in one column, and their outbound sales value in another.
The end result should look something like this:
+-------+------------+---------------+
| Agent | TotalSales | OutboundSales |
+-------+------------+---------------+
| Tom | 30145 | 0 |
| Sally | 16449 | 1000 |
| John | 10500 | 300 |
| Joe | 50710 | 0 |
+-------+------------+---------------+
With the below SQL I get the following result:
SELECT contact.agent, SUM(sales.price)
FROM contact, sales
WHERE contact.id = sales.id
GROUP BY contact.agent
+-------+------------+
| Agent | TotalSales |
+-------+------------+
| Tom | 30145 |
| Sally | 16449 |
| John | 10500 |
| Joe | 50710 |
+-------+------------+
I want to add the third column to this query result, in which the price is summed only for records where the OutboundCallDate field contains data. Something a bit like (where sales.OutboundCallDate is Not Null)
I hope this is clear enough. Let me know if that's not the case.
Use CASE
SELECT c.Agent,
SUM(s.price) AS TotalSales,
SUM(CASE
WHEN s.OutboundCallDate IS NOT NULL THEN s.price
ELSE 0
END) AS OutboundSales
FROM contact c, sales s
WHERE c.id = s.id
GROUP BY c.agent
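A quick way to check the conditional sum is a sketch in Python's sqlite3 (a stand-in for SQL Server, which is an assumption). The rows below are made up for illustration and are not the asker's data; the join on contact.id = sales.id follows the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact (id INTEGER, agent TEXT);
    CREATE TABLE sales   (id INTEGER, price REAL, OutboundCallDate TEXT);
    -- Hypothetical sample rows, for illustration only.
    INSERT INTO contact VALUES (1, 'Tom'), (2, 'Sally');
    INSERT INTO sales   VALUES (1, 300.0, NULL),
                               (2, 1000.0, '2016-01-05'),
                               (2, 500.0, NULL);
""")

# Every sale counts toward TotalSales; only sales with an outbound call
# date contribute their price to OutboundSales.
totals = conn.execute("""
    SELECT c.agent,
           SUM(s.price) AS TotalSales,
           SUM(CASE WHEN s.OutboundCallDate IS NOT NULL THEN s.price
                    ELSE 0 END) AS OutboundSales
    FROM contact c
    JOIN sales s ON c.id = s.id
    GROUP BY c.agent
    ORDER BY c.agent
""").fetchall()

print(totals)  # [('Sally', 1500.0, 1000.0), ('Tom', 300.0, 0)]
```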
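I think the code would look like this (it returns only the outbound total, so it would be a separate query, and agents without outbound sales would not appear):
SELECT contact.agent, SUM(sales.price) AS OutboundSales
FROM contact, sales
WHERE contact.id = sales.id AND sales.OutboundCallDate IS NOT NULL
GROUP BY contact.agent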
I'm assuming your Sales table contains something like Units and Price. If it's just a sales amount, then replace the calculation with the sales amount field name.
The key thing here is that the value summed should only be the sales amount if the OutboundCallDate exists. If the OutboundCallDate is NULL, then we're using a value of 0 for that row.
select Agent.Agent, TotalSales = sum (sales.Price*Units)
, OutboundSales = sum (
case when Outboundcalldate is not null then price*Units
else 0
end)
From Sales inner join Agent on Sales.Agent = Agent.Agent
Group by Agent.Agent