Calculating average price of items purchased by customers - sql

I have three tables: customer, order and line items. They are set up as follows:
CREATE TABLE cust_account(
cust_id DECIMAL(10) NOT NULL,
first VARCHAR(30),
last VARCHAR(30),
address VARCHAR(50),
PRIMARY KEY (cust_id));
CREATE TABLE orders(
order_num DECIMAL(10) NOT NULL,
cust_id DECIMAL(10) NOT NULL,
order_date DATE,
PRIMARY KEY (order_num));
CREATE TABLE lines(
order_num DECIMAL(10) NOT NULL,
line_id DECIMAL(10) NOT NULL,
item_num DECIMAL(10) NOT NULL,
price DECIMAL(10),
PRIMARY KEY (order_num, line_id),
FOREIGN KEY (item_num) REFERENCES products);
Using Oracle, I need to write a query that returns the average item price for customers who made 5 or more purchases. This is what I've been working with:
SELECT DISTINCT cust_account.cust_id,cust_account.first, cust_account.last, lines.AVG(price) AS average_price
FROM cust_account
JOIN orders
ON cust_account.cust_id = orders.cust_id
JOIN lines
ON lines.order_num = orders.order_num
WHERE lines.item_num IN (SELECT lines.item_num
FROM lines
JOIN orders
ON lines.order_num = orders.order_num
GROUP BY lines.order_num
HAVING COUNT(DISTINCT orders.cust_id) >= 5
);

... INNER JOIN all your tables together
... GROUP BY customer and compute the average price of each customer's lines
... use a HAVING clause to limit the results to groups having 5 or more purchases
Query:
SELECT ca.cust_id, ca.first, ca.last, AVG(l.price) avg_price
FROM cust_account ca
INNER JOIN orders o ON o.cust_id = ca.cust_id
INNER JOIN lines l ON l.order_num = o.order_num
-- group by cust_id too, so two customers with the same name stay separate
GROUP BY ca.cust_id, ca.first, ca.last
HAVING COUNT(DISTINCT l.line_id) >= 5
-- OR, maybe your requirement is ...
-- HAVING COUNT(distinct o.order_num) >= 5
-- ... the question was a bit unclear on this point
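The difference between the two HAVING variants can be checked on a small data set. The following is only a sketch: it uses SQLite through Python's sqlite3 module as a stand-in for Oracle, trims the schema to the columns the query needs, and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_num INTEGER PRIMARY KEY, cust_id INTEGER NOT NULL);
CREATE TABLE lines (order_num INTEGER NOT NULL, line_id INTEGER NOT NULL,
                    price NUMERIC, PRIMARY KEY (order_num, line_id));
""")
# Hypothetical data:
# customer 1 has five orders with one line each (prices 10..50);
# customer 2 has two orders, the first with five lines (all priced 100).
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(n, 1) for n in range(1, 6)] + [(6, 2), (7, 2)])
conn.executemany("INSERT INTO lines VALUES (?, ?, ?)",
                 [(n, 1, n * 10) for n in range(1, 6)]
                 + [(6, i, 100) for i in range(1, 6)] + [(7, 1, 100)])

QUERY = """
SELECT o.cust_id, AVG(l.price) AS average_price
FROM orders o
JOIN lines l ON l.order_num = o.order_num
GROUP BY o.cust_id
HAVING COUNT(DISTINCT {counted}) >= 5
"""
by_orders = conn.execute(QUERY.format(counted="o.order_num")).fetchall()
by_lines = conn.execute(QUERY.format(counted="l.line_id")).fetchall()
print(by_orders)  # customers with at least five orders
print(by_lines)   # customers with at least five distinct line_id values
```

With this data the two variants pick different customers, because line_id is only unique within an order. If "purchases" means orders, counting distinct order_num is probably the safer choice.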

I think this is it. It may not run as-is (I know nothing about Oracle), but you should get the idea:
SELECT orders.cust_id,
AVG(lines.price) AS average_price
FROM lines
JOIN orders ON orders.order_num = lines.order_num
WHERE orders.cust_id IN (SELECT orders.cust_id
FROM orders
GROUP BY orders.cust_id
HAVING COUNT(*) >= 5)
GROUP BY orders.cust_id;
The subquery selects customers that have 5 or more orders.
The main query then averages all lines from all orders made by these customers.
I guess you can eliminate the subquery by using HAVING COUNT(DISTINCT ...). Either way, the version with the subquery should work just fine.
UPD.
Something like this:
SELECT orders.cust_id,
AVG(lines.price) AS average_price
FROM lines
JOIN orders ON orders.order_num = lines.order_num
GROUP BY orders.cust_id
HAVING COUNT(DISTINCT orders.order_num) >= 5;

Related

How to count number of associated records, when association has composite primary key?

Given a customers table, and an associated orders table, I want to count the number of orders made per customer.
However, the orders table has a composite primary key. Here are the schemas and the test data:
CREATE TABLE customers (
name TEXT NOT NULL PRIMARY KEY
);
INSERT INTO customers VALUES ('Joe');
INSERT INTO customers VALUES ('Jane');
CREATE TABLE orders (
customer_name TEXT NOT NULL REFERENCES customers (name),
section_id INT NOT NULL,
item_id INT NOT NULL,
PRIMARY KEY (customer_name, section_id, item_id)
);
INSERT INTO orders VALUES ('Joe', 1, 100);
INSERT INTO orders VALUES ('Joe', 1, 101);
INSERT INTO orders VALUES ('Joe', 2, 110);
There are two customers: Joe (with 3 orders) and Jane (with no orders).
This query...
SELECT customers.*,
COUNT((orders.section_id, orders.item_id)) AS num_orders
FROM customers
LEFT JOIN orders
ON customers.name = orders.customer_name
GROUP BY customers.name;
...correctly counts that Joe has 3 orders, but incorrectly counts that Jane has 1 order (she actually has 0).
That's because COUNT((orders.section_id, orders.item_id)) produces 1. Even though section_id and item_id are both NULL, the tuple expression (NULL, NULL) is not considered NULL.
How do I properly query the orders count in the face of composite primary keys?
I'm not sure why you want to count both fields, section_id and item_id...
I'd suggest a simple experiment:
SELECT (NULL, NULL) result
-- produces: (,)
So, when you pass that tuple to COUNT, it has to return 1, as the tuple is NOT NULL.
SELECT COALESCE(NULL, NULL) result
-- produces: NULL
So, when you pass that expression to COUNT, it has to return 0, as it's NULL.
db<>fiddle
As for me, it's enough to use COUNT(item_id) or COUNT(section_id): both columns are declared NOT NULL, so they are NULL only when the LEFT JOIN found no matching order. So...
SELECT c.*,
COUNT(o.section_id) AS num_orders
FROM customers c
LEFT JOIN orders o
ON c.name = o.customer_name
GROUP BY c.name;
Unless... you want to count the distinct sections/items ordered by each customer.
SELECT c.*,
COUNT(DISTINCT o.section_id) AS num_sections,
COUNT(DISTINCT o.item_id) AS num_items
FROM customers c
LEFT JOIN orders o
ON c.name = o.customer_name
GROUP BY c.name;
db<>fiddle
Count orders.customer_name: it is guaranteed not to be NULL whenever there was a join partner row, because it is used in the ON clause with =.
SELECT customers.*,
count(orders.customer_name) AS num_orders
FROM customers
LEFT JOIN orders
ON customers.name = orders.customer_name
GROUP BY customers.name;
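The effect is easy to reproduce. The sketch below assumes SQLite (through Python's sqlite3 module) instead of PostgreSQL, and uses COUNT(*) as a stand-in for counting a tuple that is never NULL; the column aliases are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (name TEXT NOT NULL PRIMARY KEY);
CREATE TABLE orders (
    customer_name TEXT NOT NULL REFERENCES customers (name),
    section_id INT NOT NULL,
    item_id INT NOT NULL,
    PRIMARY KEY (customer_name, section_id, item_id)
);
INSERT INTO customers VALUES ('Joe'), ('Jane');
INSERT INTO orders VALUES ('Joe', 1, 100), ('Joe', 1, 101), ('Joe', 2, 110);
""")

rows = conn.execute("""
SELECT c.name,
       COUNT(*) AS counted_rows,               -- counts the NULL-padded row too
       COUNT(o.customer_name) AS num_orders    -- skips rows with no order
FROM customers c
LEFT JOIN orders o ON c.name = o.customer_name
GROUP BY c.name
ORDER BY c.name
""").fetchall()
print(rows)
```

Jane gets one joined row padded with NULLs, so any count over a non-NULL expression reports 1 for her, while counting a column from the orders side reports 0.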

sqlite - join tables or subquery?

I have the following tables:
create table part_category(id text primary key);
create table parts (id text primary key not null,
cat references part_category(id));
create table products (id text primary key not null);
create table product_parts (product references products(id),
part references parts(id),
qty integer);
create table locations (id text primary key not null,
stage text not null);
create table stock (part references parts(id),
cat references part_category(id),
location references locations(id),
qty integer,
date text);
create table orders (part references parts(id),
cat references parts(cat),
product references products(id),
qty integer not null default 0,
date_order text,
date_due text,
date_done text,
status boolean,
primary key(part, product, date_due));
And I'd like to have this returned from a select:
Part, Category, Product, Qty, Date Ordered, Date Due, qty of material, qty of stock, qty of wip
The columns bolded above are the ones that I can't figure out. Below is my select with the subquery where I'm trying to get the qty of stock.
The problem is the query is returning zero for everything.
orders = db.execute('''select distinct o.part, o.cat, o.product, o.qty,
o.date_order, o.date_due, o.date_done,
julianday(date_due) - julianday(date_order) as days_due,
(select stock.qty from stock, orders
where stock.part = orders.part and stock.location = 'stock' and orders.status = 1)
as qty_stock
from orders as o join stock as s on o.part = s.part
where o.status = 1
order by o.date_due asc, o.product asc, o.part asc''').fetchall()
Example output is
for item in orders:
print item['part'], item['qty'], item['qty_stock']
SOME_PART_NUMBER 3 0
But should be:
SOME_PART_NUMBER 3 22
I'm unsure about your business logic.
I guess this is what you want.
select distinct o.part, o.cat, o.product, o.qty,
o.date_order, o.date_due, o.date_done,
julianday(date_due) - julianday(date_order) as days_due,
qs.stockQuantity as qty_stock
from orders as o
join stock as s on o.part = s.part
left join (select ss.part, sum(ss.qty) stockQuantity
from stock ss
-- no join to orders here: it would multiply the sum by the number of open orders per part
where ss.location = 'stock'
group by ss.part
) qs on qs.part = o.part
where o.status = 1
order by o.date_due asc, o.product asc, o.part asc
The title says "join tables OR subquery". The SQL does both. I'm not sayin' that's the problem. But it certainly adds a level of complexity that could be error-prone. You could try removing the subquery, replacing it with s.qty, and then adding s.location = 'stock' to the WHERE clause.
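One way to keep the stock total independent of the number of matching orders is to aggregate stock per part first and join that result afterwards. This is a sketch only: the tables are trimmed to the columns involved, the rows are invented so that the total matches the 22 from the question, and it assumes the location column holds the literal value 'stock'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows row["column"] access as in the question
conn.executescript("""
CREATE TABLE stock (part TEXT, location TEXT, qty INTEGER);
CREATE TABLE orders (part TEXT, product TEXT, qty INTEGER, status BOOLEAN);
INSERT INTO stock VALUES ('SOME_PART_NUMBER', 'stock', 10),
                         ('SOME_PART_NUMBER', 'stock', 12),
                         ('SOME_PART_NUMBER', 'wip', 5);
INSERT INTO orders VALUES ('SOME_PART_NUMBER', 'prod_a', 3, 1),
                          ('SOME_PART_NUMBER', 'prod_b', 3, 1);
""")

rows = conn.execute("""
SELECT o.part, o.product, o.qty, COALESCE(qs.qty_stock, 0) AS qty_stock
FROM orders AS o
LEFT JOIN (SELECT part, SUM(qty) AS qty_stock   -- one row per part
           FROM stock
           WHERE location = 'stock'
           GROUP BY part) AS qs ON qs.part = o.part
WHERE o.status = 1
ORDER BY o.product
""").fetchall()
result = [(r["part"], r["product"], r["qty"], r["qty_stock"]) for r in rows]
print(result)
```

Because the derived table has one row per part, each order row is matched once and the sum is not inflated. The same pattern could be repeated with other location values for the material and wip columns, if that is how those quantities are stored.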

How to compose select request using many-to-many relationship in PostgreSQL?

I have these tables:
CREATE TABLE orders (
order_id serial PRIMARY KEY
, number_of_things int
);
CREATE TABLE things (
thing_id serial PRIMARY KEY
, cost int
);
CREATE TABLE orders_to_things (
order_id int REFERENCES orders (order_id)
, thing_id int REFERENCES things (thing_id)
);
How do I compose a query that selects all orders where the total cost of things is more than some number?
I tried to use:
SELECT orders.order_id
FROM orders
INNER JOIN orders_to_things ON (orders_to_things.order_id = orders.order_id)
JOIN things ON (orders_to_things.thing_id=things.thing_id)
WHERE (select SUM(things.cost) FROM things) > *some number*
but didn't get the correct result.
Try this:
SELECT O.order_id, sum(T.cost)
FROM orders O
INNER JOIN orders_to_things OTT ON OTT.order_id = O.order_id
JOIN things T ON OTT.thing_id = T.thing_id
GROUP BY O.order_id
HAVING sum(T.cost) > <number>
If you want all order ids, you don't need the orders table. The simplest way to write the query is:
SELECT ott.order_id, sum(t.cost)
FROM orders_to_things ott JOIN
things t
ON ott.thing_id = t.thing_id
GROUP BY ott.order_id
HAVING sum(t.cost) > <number>;
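A quick check of the GROUP BY / HAVING approach, sketched with SQLite through Python's sqlite3 module rather than PostgreSQL. The sample rows and the threshold of 15 are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE things (thing_id INTEGER PRIMARY KEY, cost INT);
CREATE TABLE orders_to_things (
    order_id INT,
    thing_id INT REFERENCES things (thing_id)
);
INSERT INTO things VALUES (1, 10), (2, 20), (3, 5);
-- order 1 contains things 1 and 2 (total 30); order 2 contains thing 3 (total 5)
INSERT INTO orders_to_things VALUES (1, 1), (1, 2), (2, 3);
""")

threshold = 15  # hypothetical value for "some number"
rows = conn.execute("""
SELECT ott.order_id, SUM(t.cost) AS total_cost
FROM orders_to_things ott
JOIN things t ON ott.thing_id = t.thing_id
GROUP BY ott.order_id
HAVING SUM(t.cost) > ?
""", (threshold,)).fetchall()
print(rows)
```

The filter has to sit in HAVING because it applies to the per-order sum, which only exists after grouping. A WHERE clause is evaluated row by row before any aggregation.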

Get results that have the same data in the table

I need to get the names of all customers whose preference MINPRICE and MAXPRICE are the same.
Here's my schema:
CREATE TABLE CUSTOMER (
PHONE VARCHAR(25) NOT NULL,
NAME VARCHAR(25),
CONSTRAINT CUSTOMER_PKEY PRIMARY KEY (PHONE)
);
CREATE TABLE PREFERENCE (
PHONE VARCHAR(25) NOT NULL,
ITEM VARCHAR(25) NOT NULL,
MAXPRICE NUMBER(8,2),
MINPRICE NUMBER(8,2),
CONSTRAINT PREFERENCE_PKEY PRIMARY KEY (PHONE, ITEM),
CONSTRAINT PREFERENCE_FKEY FOREIGN KEY (PHONE) REFERENCES CUSTOMER (PHONE)
);
I think I need to compare rows against other rows? Or create another view to compare? Is there an easy way to do this?
It's one-to-many: a customer can have multiple preferences, so I need to query a list of customers that have the same minprice and maxprice, comparing between rows minprice = minprice and maxprice = maxprice.
A self-join on preference would find rows with the same price preference, but a different phone number:
select distinct c1.name
, p1.minprice
, p1.maxprice
from preference p1
join preference p2
on p1.phone <> p2.phone
and p1.minprice = p2.minprice
and p1.maxprice = p2.maxprice
join customer c1
on c1.phone = p1.phone
join customer c2
on c2.phone = p2.phone
order by
p1.minprice
, p1.maxprice
, c1.name
It seems strange that you have minprice and maxprice in your preference table. Is that a table that you update after each transaction, such that each customer only has 1 active preference record? I mean, it reads like a customer could pay two different prices for the same item, which seems odd.
Assuming customer and preference are 1:1
SELECT c.*
FROM customer c INNER JOIN preference p ON c.phone = p.phone
WHERE p.minprice = p.maxprice
However, if a customer can have multiple preferences and you are looking for minprice = maxprice for ALL items ... then you could do this
SELECT c.*
FROM (SELECT phone, MIN(minprice) as allMin, MAX(maxprice) as allMax
FROM preference
GROUP BY phone) p INNER JOIN customer c on p.phone = c.phone
WHERE allMin = allMax
This will show all the customer names that have the same price preferences (GROUP_CONCAT is MySQL syntax; Oracle has LISTAGG for the same purpose).
SELECT minprice, maxprice, GROUP_CONCAT(name) names
FROM preference
JOIN customer USING (phone)
GROUP BY minprice, maxprice
HAVING COUNT(*) > 1
The HAVING clause prevents it showing preferences that have no duplicates. If you want to see those single-customer preferences, remove that line.
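If the goal is the "same price range as some other customer" reading, an EXISTS subquery is one more way to express it. This is a sketch under that assumption, run on SQLite through Python's sqlite3 module instead of Oracle, with invented customers and preferences.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (phone VARCHAR(25) PRIMARY KEY, name VARCHAR(25));
CREATE TABLE preference (
    phone VARCHAR(25) NOT NULL REFERENCES customer (phone),
    item VARCHAR(25) NOT NULL,
    maxprice NUMERIC,
    minprice NUMERIC,
    PRIMARY KEY (phone, item)
);
INSERT INTO customer VALUES ('111', 'Ann'), ('222', 'Bob'), ('333', 'Cy');
-- Ann and Bob share the range 1..5; Cy's range 2..9 is unique
INSERT INTO preference VALUES ('111', 'lamp', 5, 1),
                              ('222', 'desk', 5, 1),
                              ('333', 'sofa', 9, 2);
""")

rows = conn.execute("""
SELECT DISTINCT c.name
FROM preference p
JOIN customer c ON c.phone = p.phone
WHERE EXISTS (SELECT 1
              FROM preference p2
              WHERE p2.phone <> p.phone          -- a different customer
                AND p2.minprice = p.minprice
                AND p2.maxprice = p.maxprice)
ORDER BY c.name
""").fetchall()
print(rows)
```

Compared with the self-join, EXISTS cannot produce one output row per matching partner, so DISTINCT only has to remove customers who appear with several qualifying preferences.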

Removing Duplicate values within a Query when Distinct doesn't work? SQL

Here is the current query I am running.
select c.customer_name, c.city, c.credit_limit, sum(ol.quoted_price)
from customer c, order_line ol, (select order_num from order_line where part_num = 'AT94') t1
where ol.order_num = t1.order_num and customer_num in ( select customer_num
from orders
where order_num in (select t1.order_num
from order_line,(select order_num from order_line where part_num = 'AT94') t1
INNER JOIN
(select order_num from order_line where part_num = 'BV06') t2
on t1.order_num = t2.order_num
where t1.order_num = order_line.order_num
group by t1.order_num))
group by t1.order_num, c.customer_name, c.city, c.credit_limit
The current output I am receiving is:
Obviously I want to remove the duplication in the output, but I currently have no idea how to do so. I have tried using unique in the multiple sub-queries with no success.
Any help is great! Thanks.
Here is the database creation.
CREATE TABLE CUSTOMER
(
CUSTOMER_NUM CHAR(3) PRIMARY KEY,
CUSTOMER_NAME CHAR(35) NOT NULL,
STREET CHAR(15),
CITY CHAR(15) DEFAULT 'Ottawa',
PROVINCE CHAR(3),
ZIP CHAR(5),
BALANCE DECIMAL(8,2),
CREDIT_LIMIT DECIMAL(8,2),
REP_NUM CHAR(2),
CONSTRAINT CHK_Limit CHECK (CREDIT_LIMIT >= BALANCE)
);
CREATE TABLE ORDERS
(
ORDER_NUM CHAR(5) PRIMARY KEY,
ORDER_DATE DATE,
CUSTOMER_NUM CHAR(3)
);
CREATE TABLE PART
(
PART_NUM CHAR(4) PRIMARY KEY,
DESCRIPTION CHAR(15),
ON_HAND DECIMAL(4,0),
CLASS CHAR(2),
WAREHOUSE CHAR(1),
PRICE DECIMAL(6,2)
);
CREATE TABLE ORDER_LINE
(
ORDER_NUM CHAR(5),
PART_NUM CHAR(4),
NUM_ORDERED DECIMAL(3,0),
QUOTED_PRICE DECIMAL(6,2),
PRIMARY KEY (ORDER_NUM, PART_NUM)
);
The result I would like, where the order of the columns does not matter:
Al's.. | Barrhaven | 7500.00 | 21.95
John.. | Toronto | 10000.00 | 311.95
All the data used is quite a large amount of text, so I put it on Pastebin instead of making this question that much longer.
https://pastebin.com/ASBzqcJq
You basically need to find which customers have bought both part numbers AT94 and BV06. The issue you're facing is that your query returns duplicated rows because it already contains some redundancy. That made me go back and check the results manually to get the correct output from the sample you provided.
SELECT
ol.ORDER_NUM,
c.customer_name,
c.city,
c.credit_limit,
sum(ol.quoted_price)
FROM #order_line ol
INNER JOIN #Orders o ON o.order_num = ol.order_num
JOIN (SELECT ORDER_NUM FROM #ORDER_LINE WHERE part_num = 'AT94') t1 ON t1.ORDER_NUM = o.ORDER_NUM
JOIN (SELECT ORDER_NUM FROM #ORDER_LINE WHERE part_num = 'BV06') t2 ON t2.ORDER_NUM = o.ORDER_NUM
INNER JOIN #customer c ON c.customer_num = o.customer_num
WHERE
ol.ORDER_NUM IN(t1.ORDER_NUM)
AND ol.ORDER_NUM IN(t2.ORDER_NUM)
GROUP BY
ol.ORDER_NUM,
c.customer_name,
c.city,
c.credit_limit
SQLFiddle
Hope this will save your day
I can see that your result set already has distinct records. Your result set does not have any two rows with exactly same values. What is the output you are expecting? I am not sure what your requirement is. But I think the join between customer number in Customer table and Customer number in Orders table is missing.
Are you trying to do something like this?
select C.customer_name, C.city, C.credit_limit, sum(OL.quoted_price)
from CUSTOMER C
join ORDERS O ON C.Customer_num=O.Customer_num
join ORDER_LINE OL on O.Order_num=Ol.Order_num
WHERE part_num in('AT94','BV06')
group by C.customer_name, C.city, C.credit_limit;
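Note that filtering with IN alone matches orders containing either part. If the requirement is orders that contain both AT94 and BV06 (an assumption based on the original query, which intersects the two part lists on order_num), a HAVING COUNT(DISTINCT ...) check is one way to enforce it. The sketch below uses SQLite through Python's sqlite3 module, trimmed tables, and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_num CHAR(3) PRIMARY KEY, customer_name CHAR(35));
CREATE TABLE orders (order_num CHAR(5) PRIMARY KEY, customer_num CHAR(3));
CREATE TABLE order_line (
    order_num CHAR(5),
    part_num CHAR(4),
    quoted_price DECIMAL(6,2),
    PRIMARY KEY (order_num, part_num)
);
-- Hypothetical data: order O1 has both parts, order O2 has only AT94
INSERT INTO customer VALUES ('001', 'Customer A'), ('002', 'Customer B');
INSERT INTO orders VALUES ('O1', '001'), ('O2', '002');
INSERT INTO order_line VALUES ('O1', 'AT94', 10), ('O1', 'BV06', 20),
                              ('O2', 'AT94', 15);
""")

rows = conn.execute("""
SELECT c.customer_name, SUM(ol.quoted_price) AS total
FROM customer c
JOIN orders o ON o.customer_num = c.customer_num
JOIN order_line ol ON ol.order_num = o.order_num
WHERE ol.part_num IN ('AT94', 'BV06')
GROUP BY c.customer_num, c.customer_name, o.order_num
HAVING COUNT(DISTINCT ol.part_num) = 2   -- both parts present in the order
""").fetchall()
print(rows)
```

Each order line is read once, so no duplicate rows are produced for a qualifying order. A customer with several qualifying orders would still get one row per order, since order_num is part of the grouping.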