Excluding tuples based on maximum condition - sql

I have been trying to solve this SQL query problem, without success. The problem is the following:
PROBLEM:
Given 4 tables, PRODUCTS, REPAIRS, OWNERS and MALFUNCTIONS, for each product Brand and Model, display the type of malfunction that has been repaired the most times.
The tables have the following fields:
PRODUCTS: *Series_num, Brand, Model, Year, Code_Owner
OWNERS: *Code_Owner, Name, Surname, Street, Civic, City, (u)Phone
MALFUNCTIONS: *Malf_code, Desc
REPAIRS: *Series_num, *Malf_code, *Repair_Date, Price
* <- Primary key
(u) <- Unique attribute
The expected result, given this example of data:
| MODEL | BRAND | MALF_CODE | NUMBER OF REPAIRS|
|----------------------------------------------------|
| 1 | BRAND1 | 1 | 20 |
| 1 | BRAND1 | 2 | 10 |
| 2 | BRAND1 | 1 | 1 |
| 2 | BRAND1 | 2 | 1 |
| 1 | BRAND2 | 1 | 10 |
| 1 | BRAND2 | 2 | 11 |
Should be:
| MODEL | BRAND | MALF_CODE | NUMBER OF REPAIRS|
|----------------------------------------------------|
| 1 | BRAND1 | 1 | 20 |
| 2 | BRAND1 | 1 | 1 |
| 1 | BRAND2 | 2 | 11 |
Note that BRAND1, MODEL 2 has the same number of repairs for two different types of malfunction, so either one of the rows or both of them can be shown (it does not matter).
WHAT I'VE TRIED:
To get the first table, I used a simple JOIN query:
SELECT A.MODEL, A.BRAND, R.MALF_CODE, COUNT(*) AS N_REP
FROM REPAIRS R LEFT JOIN PRODUCTS A ON A.SERIES_NUM = R.SERIES_NUM
GROUP BY A.MODEL, A.BRAND, R.MALF_CODE;
Then I tried to get the second table using the MAX() function:
SELECT A.MODEL, A.BRAND, R.MALF_CODE, COUNT(*) AS N_REP
FROM REPAIRS R LEFT JOIN PRODUCTS A ON A.SERIES_NUM = R.SERIES_NUM
GROUP BY A.MODEL, A.BRAND, R.MALF_CODE
HAVING COUNT(*) IN(
SELECT MAX(R.MALF_CODE)
FROM REPAIRS R LEFT JOIN PRODUCTS A ON A.SERIES_NUM = R.SERIES_NUM
GROUP BY A.MODEL, A.BRAND, R.MALF_CODE
ORDER BY A.BRAND, R.MALF_CODE);
But this throws the following error:
[42000][907] ORA-00907: Missing closing Parenthesis
It seems I can't find the error.
I hope I've been clear enough. Thanks in advance.
EDIT: I forgot to mention that I'm aware of RANK functions and such, but have never heard of partitions. So a solution without them is highly appreciated but not mandatory.

If I understand correctly, you want the row with the most repairs for each model/brand combination. If so, window functions are one method:
SELECT MODEL, BRAND, MALF_CODE, N_REP
FROM (SELECT P.MODEL, P.BRAND, R.MALF_CODE, COUNT(*) AS N_REP,
ROW_NUMBER() OVER (PARTITION BY P.MODEL, P.BRAND ORDER BY COUNT(*) DESC, R.MALF_CODE) as SEQNUM
FROM REPAIRS R LEFT JOIN
PRODUCTS P
ON P.SERIES_NUM = R.SERIES_NUM
GROUP BY P.MODEL, P.BRAND, R.MALF_CODE
) MB
WHERE seqnum = 1;
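Since the question asks for a solution without partitions, here is a minimal sketch of one alternative: compute the counts once, then keep the rows whose count equals the maximum for the same model and brand. It runs through Python's sqlite3 module rather than Oracle (an assumption made only so the example is self-contained), and the repair counts are made-up, smaller numbers than in the question. Ties are kept, which the question allows. As a side note, the ORA-00907 error in the question is most likely caused by the ORDER BY inside the IN subquery, which Oracle does not accept there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (series_num INTEGER PRIMARY KEY, brand TEXT, model TEXT);
CREATE TABLE repairs (
    series_num INTEGER, malf_code INTEGER, repair_date TEXT, price REAL,
    PRIMARY KEY (series_num, malf_code, repair_date)
);
INSERT INTO products VALUES
    (101, 'BRAND1', '1'), (102, 'BRAND1', '2'), (103, 'BRAND2', '1');
""")

# (series_num, malf_code, number of repairs): hypothetical sample data
sample = [(101, 1, 3), (101, 2, 1), (102, 1, 1), (102, 2, 1), (103, 1, 1), (103, 2, 2)]
for series_num, malf_code, n in sample:
    for day in range(1, n + 1):
        # one row per repair; the date keeps the composite primary key unique
        conn.execute(
            "INSERT INTO repairs VALUES (?, ?, ?, 10.0)",
            (series_num, malf_code, f"2020-01-{day:02d}"),
        )

query = """
WITH counts AS (
    SELECT p.model, p.brand, r.malf_code, COUNT(*) AS n_rep
    FROM repairs r
    JOIN products p ON p.series_num = r.series_num
    GROUP BY p.model, p.brand, r.malf_code
)
SELECT c.model, c.brand, c.malf_code, c.n_rep
FROM counts c
WHERE c.n_rep = (SELECT MAX(c2.n_rep)      -- maximum count within the same model/brand
                 FROM counts c2
                 WHERE c2.model = c.model AND c2.brand = c.brand)
ORDER BY c.brand, c.model, c.malf_code
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The WITH clause and the correlated subquery are standard SQL, so the same query text should carry over to Oracle with little or no change.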

Related

Postgresql left join

I have two tables, cars and usages. I create a record in usages once a month for some of the cars.
Now I want to get a distinct list of cars, each with the latest usage that I saved.
First of all, please look at the tables:
cars:
| id | model | reseller_id |
|----|-------------|-------------|
| 1 | Samand Sall | 324228 |
| 2 | Saba 141 | 92933 |
usages:
| id | car_id | year | month | gas |
|----|--------|------|-------|-----|
| 1 | 2 | 2020 | 2 | 68 |
| 2 | 2 | 2020 | 3 | 94 |
| 3 | 2 | 2020 | 4 | 33 |
| 4 | 2 | 2020 | 5 | 12 |
Here is the problem:
I need only the latest usage by year and month.
I tried many approaches, but none of them is good enough, because sometimes the query returns a usage record that is not the latest one.
SELECT * FROM cars AS c
LEFT JOIN
(select *
from usages
) u on (c.id = u.car_id)
order by u.gas desc
You can do this with a DISTINCT ON in the derived table:
SELECT *
FROM cars AS c
LEFT JOIN (
select distinct on (u.car_id) *
from usages u
order by u.car_id, u.year desc, u.month desc
) lu on c.id = lu.car_id
order by lu.gas desc;
I think you need the window function row_number, partitioned by car and ordered so that the latest year and month come first:
select
id,
model,
reseller_id,
year,
month,
gas
from
(
select
c.id,
c.model,
c.reseller_id,
u.year,
u.month,
u.gas,
row_number() over (partition by c.id order by u.year desc, u.month desc) as rn
from cars c
left join usages u
on c.id = u.car_id
) subq
where rn = 1
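The row_number idea can be checked on the sample data with a small sketch. It uses Python's sqlite3 module instead of PostgreSQL (an assumption for self-containment; window functions need SQLite 3.25 or later). It partitions by c.id, so a car with no usage rows still gets its own row with NULL usage columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cars (id INTEGER PRIMARY KEY, model TEXT, reseller_id INTEGER);
CREATE TABLE usages (id INTEGER PRIMARY KEY, car_id INTEGER, year INTEGER,
                     month INTEGER, gas INTEGER);
INSERT INTO cars VALUES (1, 'Samand Sall', 324228), (2, 'Saba 141', 92933);
INSERT INTO usages VALUES
    (1, 2, 2020, 2, 68), (2, 2, 2020, 3, 94),
    (3, 2, 2020, 4, 33), (4, 2, 2020, 5, 12);
""")

query = """
SELECT id, model, year, month, gas
FROM (
    SELECT c.id, c.model, u.year, u.month, u.gas,
           -- rn = 1 marks the most recent usage row of each car
           ROW_NUMBER() OVER (PARTITION BY c.id
                              ORDER BY u.year DESC, u.month DESC) AS rn
    FROM cars c
    LEFT JOIN usages u ON u.car_id = c.id
) AS ranked
WHERE rn = 1
ORDER BY id
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

Car 1 has no usage rows, so its usage columns come back as None; car 2 returns the May 2020 record.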

Conditionally return values from LEFT JOIN between 3 tables based on CASE

First off, apologies for a long post. It's really simpler than it looks ;-)
I'm trying to do something that I think is conceptually simple, and I believe I'm most of the way there, but there's one last part that I can't implement without errors that I can't figure out how to fix.
I have three related tables.
Orders:
Each row is an Order with a unique ID, there will never be duplicates.
+---------+---------+
| OrderID | Name |
+---------+---------+
| 1 | Order 1 |
| 2 | Order 2 |
| 3 | Order 3 |
+---------+---------+
Order Details:
Relational table where each row is a product line on an order.
+---------+-----------+
| OrderID | ProductID |
+---------+-----------+
| 1 | a |
| 2 | b |
| 2 | c |
| 3 | a |
| 3 | b |
| 3 | b |
+---------+-----------+
As you can see some orders have just one product (1), some will have multiple products (2) and some will have duplicate products (3).
Products
Each row is a product with a unique ID, there will never be duplicates.
+-----------+-------------+
| ProductID | Description |
+-----------+-------------+
| a | Chicken |
| b | Fish |
| c | Beef |
+-----------+-------------+
I want to return all rows from the Orders table and conditionally return some information about the related Products in one column.
The condition is that I look at how many DISTINCT products each Order has. If it's just 1 then I want to return the Product Description value. If it's more than 1 then I want to return some placeholder text such as 'Multi'.
I think that I need to use CASE to get this working, but I can't figure it out.
I can count the unique products successfully like this:
SELECT
o.Name
,COUNT(DISTINCT d.ProductId) as 'Unique Products'
FROM Orders o
LEFT JOIN OrderDetails d ON o.OrderID = d.OrderID
LEFT JOIN Products p on d.ProductId = p.ProductId
GROUP BY o.Name
ORDER BY o.Name DESC
GO
Results are like this:
+---------+-----------------+
| Name | Unique Products |
+---------+-----------------+
| Order 1 | 1 |
| Order 2 | 2 |
| Order 3 | 2 |
+---------+-----------------+
What I want is this:
+---------+-----------------+
| Name | Unique Products |
+---------+-----------------+
| Order 1 | Chicken |
| Order 2 | Multi |
| Order 3 | Multi |
+---------+-----------------+
I have been trying to use CASE which I believe I've gotten correct:
CASE WHEN (COUNT(DISTINCT d.ProductId)) > 1 THEN 'Multi' ELSE p.Description END AS 'Products'
However unless I add p.Description to GROUP BY then I get the error (which I understand):
Column 'Product.Description' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
But if I do add it the results aren't what I want, for example:
+---------+----------+
| Name | Products |
+---------+----------+
| Order 1 | Chicken |
| Order 2 | Fish |
| Order 2 | Beef |
| Order 3 | Chicken |
| Order 3 | Fish |
| Order 3 | Fish |
+---------+----------+
When it should just say "Order 2 - Multi" on one row for example. This is the bit I don't understand.
If I can get some help on this bit alone it would solve my problem and I'd accept the answer. However...
Bonus Round
The above is fine and all, but if this bit is possible I'd accept this as an answer above the others.
Can I concatenate the product names? I've been looking at COALESCE and FOR XML PATH but I can't wrap my head around them at all so I don't even have any code to show.
Results would look something like this:
+---------+--------------+
| Name | Products |
+---------+--------------+
| Order 1 | Chicken |
| Order 2 | Fish;Beef |
| Order 3 | Chicken;Fish |
+---------+--------------+
If you've made it this far I commend you! Thanks!
You are pretty close. You just need some case logic and an aggregation function around the description:
SELECT o.Name,
(CASE WHEN COUNT(DISTINCT d.ProductId) = 1
THEN MAX(p.description)
ELSE 'Multi'
END) as Descriptions
FROM Orders o LEFT JOIN
OrderDetails d
ON o.OrderID = d.OrderID LEFT JOIN
Products p
ON d.ProductId = p.ProductId
GROUP BY o.Name
ORDER BY o.Name DESC
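A minimal sketch of the CASE-plus-aggregate query on the sample data, run through Python's sqlite3 module rather than SQL Server (an assumption for self-containment). Wrapping the description in MAX() is safe here because the aggregate is only used when the group contains exactly one distinct product, so every description in the group is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE OrderDetails (OrderID INTEGER, ProductID TEXT);
CREATE TABLE Products (ProductID TEXT PRIMARY KEY, Description TEXT);
INSERT INTO Orders VALUES (1, 'Order 1'), (2, 'Order 2'), (3, 'Order 3');
INSERT INTO OrderDetails VALUES
    (1, 'a'), (2, 'b'), (2, 'c'), (3, 'a'), (3, 'b'), (3, 'b');
INSERT INTO Products VALUES ('a', 'Chicken'), ('b', 'Fish'), ('c', 'Beef');
""")

query = """
SELECT o.Name,
       CASE WHEN COUNT(DISTINCT d.ProductID) = 1
            THEN MAX(p.Description)   -- single product: any aggregate returns it
            ELSE 'Multi'
       END AS Products
FROM Orders o
LEFT JOIN OrderDetails d ON d.OrderID = o.OrderID
LEFT JOIN Products p ON p.ProductID = d.ProductID
GROUP BY o.OrderID, o.Name
ORDER BY o.Name
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Note that an order with no detail rows has a distinct count of 0 and would fall into the ELSE branch; an extra WHEN for that case may be worth adding if such orders exist.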
The second part is a very different question. In SQL Server, you need to use an XML subquery:
select o.Name,
stuff((select distinct ',' + p.description
from OrderDetails d left join
Products p
on d.ProductId = p.ProductId
where o.OrderID = d.OrderID
for xml path (''), type
).value('.', 'nvarchar(max)'
), 1, 1, ''
) as descriptions
from Orders o
order by o.Name desc

How to get a MAX and a COUNT from a three table join?

I got an interview question where there's a Car sale modeled in a DB. Each Car represents a physical car in a Car sale which refers to a Make and a Model table. A Sale table keeps track of each Car that is sold. A Sale only consists of one Car, so there's a record in Sale per every unique Car that had been sold.
The question was to find out the name of the most sold Model in the car sale. I answered with a 3-level nested query. The interviewer specifically asked for a solution using joins, but I only managed to join the tables without adding the aggregates.
How would you join 3 tables as below (Car, Make, Sale) while using two other aggregates?
Here's a rough sketch of the schema. The most sold Model here should return 'Corolla'
Car
| carid| modid | etc...
_________________
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 2 |
Make
| mkid | name |
_________________
| 1 | Toyota |
| 2 | Nissan |
| 3 | Chevy |
| 4 | Merc |
| 5 | Ford |
Model
| modid| name | mkid |
________________________
| 1 | Corolla| 1
| 2 | Sunny | 2
| 3 | Carina | 1
| 4 | Skyline| 2
| 5 | Focus | 5
Sale
| sid | carid | etc...
_________________
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | 5 |
Edit:
Using MS SQL Server 2008
Output needed:
Model Name | Count
_____________________
Corolla | 3
i.e. The model of the Car that has been sold the most.
Notice only 3 Corollas and 2 Sunnys are in the Car table, while the Sale table has one row for each of those with other sales details. The 5 Sale records are actually Corolla, Corolla, Corolla, Sunny and Sunny.
Since you are using SQL Server 2008, make use of Common Table Expression and Window Function.
WITH recordList
AS
(
SELECT c.name, COUNT(*) [Count],
DENSE_RANK() OVER (ORDER BY COUNT(*) DESC) rn
FROM Sale a
INNER JOIN Car b
ON a.carid = b.carID
INNER JOIN Model c
ON b.modID = c.modID
GROUP BY c.Name
)
SELECT name, [Count]
FROM recordList
WHERE rn = 1
When interviewers ask for this they usually want you to say that you'd use windowed functions. You could give each sale a unique ascending number partitioned by model and the highest sale number you'd get would be the max count.
http://www.postgresql.org/docs/9.1/static/tutorial-window.html
The following query works on Oracle 11g:
SELECT name FROM (
SELECT model.name AS name FROM car , sale , model
WHERE car.carid=sale.carid
AND car.modid=model.modid
GROUP BY model.name
ORDER BY count(*) DESC )
WHERE rownum = 1;
Or
SELECT name FROM (
SELECT model.name AS name FROM car natural join sale natural join model
GROUP BY model.name
ORDER BY count(*) DESC )
WHERE rownum = 1;
OUTPUT
| NAME |
-----------
| Corolla |
Based on your newly added SQL Server 2008 tag: if you are using a different RDBMS you'll probably need to use limit instead of top and place it at the end of the top_sold_car subquery.
select Make.name as Make, Model.name as Model
from (
select top 1 Car.modid, count(*) as num_sold
from Sale
join Car
on (Sale.carid = Car.carid)
group by Car.modid
order by num_sold desc) as top_sold_car
join Model
on (top_sold_car.modid = Model.modid)
join Make
on (Model.mkid = Make.mkid)
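The join-plus-aggregate approach from the answers above can be sketched on the sample data as follows. The example runs through Python's sqlite3 module, so it uses LIMIT 1 where SQL Server would use TOP 1 (an assumption for self-containment); only the columns shown in the question are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Model (modid INTEGER PRIMARY KEY, name TEXT, mkid INTEGER);
CREATE TABLE Car (carid INTEGER PRIMARY KEY, modid INTEGER);
CREATE TABLE Sale (sid INTEGER PRIMARY KEY, carid INTEGER);
INSERT INTO Model VALUES (1, 'Corolla', 1), (2, 'Sunny', 2), (3, 'Carina', 1),
                         (4, 'Skyline', 2), (5, 'Focus', 5);
INSERT INTO Car VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2);
INSERT INTO Sale VALUES (1, 1), (2, 2), (3, 3), (4, 4), (5, 5);
""")

query = """
SELECT m.name, COUNT(*) AS sold
FROM Sale s
JOIN Car c ON c.carid = s.carid      -- each sale refers to exactly one car
JOIN Model m ON m.modid = c.modid    -- each car refers to exactly one model
GROUP BY m.modid, m.name
ORDER BY sold DESC
LIMIT 1
"""
best = conn.execute(query).fetchone()
print(best)
```

LIMIT 1 returns a single row even when two models tie for first place; the DENSE_RANK version above returns all of them.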

Issue with SQL involving JOINS

I have 2 tables with similar layout, involving INCOME and EXPENSES.
The id column is a customer ID.
I need a result of customer TOTAL AMOUNT, summing up income and expenses.
Table: Income
| id | amountIN|
+--------------+
| 1 | a |
| 2 | b |
| 3 | c |
| 4 | d |
Table: Expenses
| id | amountOUT|
+---------------+
| 1 | -x |
| 4 | -z |
My problem is that some customers only have expenses and others only income, so I cannot know in advance if I need to do a LEFT or a RIGHT JOIN.
In the example above a RIGHT JOIN could do the trick, but if the situation is inverted (more customers in the Expenses table) it doesn't work.
Expected Result
| id | TotalAmount|
+--------------+
| 1 | a - x |
| 2 | b |
| 3 | c |
| 4 | d - z |
Any help?
select id, SUM(Amount)
from
(
select id, amountin as Amount
from Income
union all
select id, amountout as Amount
from Expenses
) a
group by id
I believe a full join will solve your problem.
I would approach this as a union. Do that in your subquery then sum on it.
For instance:
select id, sum(amt) from
(
select i.id, i.amountIN as amt from Income i
union all
select e.id, e.amountOUT as amt from Expenses e
) t
group by id
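A minimal sketch of the UNION ALL approach with made-up numeric amounts, run through Python's sqlite3 module (an assumption for self-containment). Customer 5 is a hypothetical addition with expenses only, to show that the result does not depend on which table has more customers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Income (id INTEGER, amountIN INTEGER);
CREATE TABLE Expenses (id INTEGER, amountOUT INTEGER);
INSERT INTO Income VALUES (1, 100), (2, 50), (3, 70), (4, 30);
-- expenses are stored as negative numbers, as in the question
INSERT INTO Expenses VALUES (1, -40), (4, -10), (5, -20);
""")

query = """
SELECT id, SUM(amt) AS TotalAmount
FROM (
    SELECT id, amountIN AS amt FROM Income
    UNION ALL                        -- keep every row, including equal amounts
    SELECT id, amountOUT AS amt FROM Expenses
) t
GROUP BY id
ORDER BY id
"""
totals = conn.execute(query).fetchall()
for row in totals:
    print(row)
```

UNION ALL rather than UNION matters here: plain UNION would drop duplicate (id, amount) rows and undercount customers with repeated amounts.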
You should really have another table, such as Client:
Table: Client
| id |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
Then you could do something like this (assuming amountOUT is stored as a negative number, as in the sample data, so the two amounts are added):
SELECT c.id, COALESCE(i.amountIN, 0) + COALESCE(e.amountOUT, 0) AS TotalAmount
FROM Client c
LEFT JOIN Income i ON i.id = c.id
LEFT JOIN Expenses e ON e.id = c.id
This is less complicated, and I'm sure it will come in handy another time :)

Get SUM in GROUP BY with JOIN using MySQL

I have two tables in MySQL 5.1.38.
products
+----+------------+-------+------------+
| id | name | price | department |
+----+------------+-------+------------+
| 1 | Fire Truck | 15.00 | Toys |
| 2 | Bike | 75.00 | Toys |
| 3 | T-Shirt | 18.00 | Clothes |
| 4 | Skirt | 18.00 | Clothes |
| 5 | Pants | 22.00 | Clothes |
+----+------------+-------+------------+
ratings
+------------+--------+
| product_id | rating |
+------------+--------+
| 1 | 5 |
| 2 | 5 |
| 2 | 3 |
| 2 | 5 |
| 3 | 5 |
| 4 | 5 |
| 5 | 4 |
+------------+--------+
My goal is to get the total price of all products which have a 5 star rating in each department. Something like this.
+------------+-------------+
| department | total_price |
+------------+-------------+
| Clothes | 36.00 | /* T-Shirt and Skirt */
| Toys | 90.00 | /* Fire Truck and Bike */
+------------+-------------+
I would like to do this without a subquery if I can. At first I tried a join with a sum().
select department, sum(price) from products
join ratings on product_id=products.id
where rating=5 group by department;
+------------+------------+
| department | sum(price) |
+------------+------------+
| Clothes | 36.00 |
| Toys | 165.00 |
+------------+------------+
As you can see, the price for the Toys department is incorrect: the Bike has two 5-star ratings, so its price is counted twice by the join.
I then tried adding distinct to the sum.
select department, sum(distinct price) from products
join ratings on product_id=products.id where rating=5
group by department;
+------------+---------------------+
| department | sum(distinct price) |
+------------+---------------------+
| Clothes | 18.00 |
| Toys | 90.00 |
+------------+---------------------+
But then the clothes department is off because two products share the same price.
Currently my work-around involves taking something unique about the product (the id) and using that to make the price unique.
select department, sum(distinct price + id * 100000) - sum(id * 100000) as total_price
from products join ratings on product_id=products.id
where rating=5 group by department;
+------------+-------------+
| department | total_price |
+------------+-------------+
| Clothes | 36.00 |
| Toys | 90.00 |
+------------+-------------+
But this feels like such a silly hack. Is there a better way to do this without a subquery? Thanks!
Use:
SELECT p.department,
SUM(p.price) AS total_price
FROM PRODUCTS p
JOIN (SELECT DISTINCT
r.product_id,
r.rating
FROM RATINGS r) x ON x.product_id = p.id
AND x.rating = 5
GROUP BY p.department
Technically, this does not use a subquery - it uses a derived table/inline view.
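The effect of de-duplicating the ratings before joining can be sketched on the question's data. The example uses Python's sqlite3 module rather than MySQL (an assumption for self-containment) and runs both the naive join and the derived-table version for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL, department TEXT);
CREATE TABLE ratings (product_id INTEGER, rating INTEGER);
INSERT INTO products VALUES
    (1, 'Fire Truck', 15.0, 'Toys'), (2, 'Bike', 75.0, 'Toys'),
    (3, 'T-Shirt', 18.0, 'Clothes'), (4, 'Skirt', 18.0, 'Clothes'),
    (5, 'Pants', 22.0, 'Clothes');
INSERT INTO ratings VALUES (1, 5), (2, 5), (2, 3), (2, 5), (3, 5), (4, 5), (5, 4);
""")

# Naive join: the Bike matches two 5-star rows, so its price is summed twice.
naive = conn.execute("""
    SELECT p.department, SUM(p.price)
    FROM products p
    JOIN ratings r ON r.product_id = p.id
    WHERE r.rating = 5
    GROUP BY p.department
    ORDER BY p.department
""").fetchall()

# Derived table: collapse duplicate (product_id, rating) pairs before joining.
deduped = conn.execute("""
    SELECT p.department, SUM(p.price) AS total_price
    FROM products p
    JOIN (SELECT DISTINCT product_id, rating FROM ratings) x
      ON x.product_id = p.id AND x.rating = 5
    GROUP BY p.department
    ORDER BY p.department
""").fetchall()

print(naive)
print(deduped)
```

The derived table guarantees at most one matching row per product, which is what makes the plain SUM correct.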
The primary reason you are having trouble finding a solution is that the schema as presented is fundamentally flawed. You shouldn't allow a table to have two rows that are complete duplicates of each other. Every table should have a means to uniquely identify each row even if it is the combination of all columns. Now, if we change the ratings table so that it has an AUTO_INCREMENT column called Id, the problem is easier:
Select products.department, Sum(price) As total_price
From products
Join ratings As R1
On R1.product_id = products.id
And R1.rating = 5
Left Join ratings As R2
On R2.product_id = R1.product_id
And R2.rating = R1.rating
And R2.Id > R1.Id
Where R2.Id Is Null
Group By products.department
You can do two queries. First query:
SELECT DISTINCT product_id FROM ratings WHERE rating = 5;
Then, take each of those IDs and manually put them in the second query:
SELECT department, Sum(price) AS total_price
FROM products
WHERE id In (1,2,3,4)
GROUP BY department;
This is the work-around for not being able to use subqueries. Without them, there is no way to eliminate the duplicate records caused by the join.
I can't think of any way to do it without a subquery somewhere in the query. You could perhaps use a View to mask the use of a subquery.
Barring that, your best bet is probably to find the minimum data set needed to make the calculation and do that in the front end. Whether or not that's possible depends on your specific data - how many rows, etc.
The other option (actually, maybe this is the best one...) would be to get a new ORM or do without it altogether ;)
This view would allow you to bypass the subquery:
CREATE VIEW Distinct_Product_Ratings
AS
SELECT DISTINCT
product_id,
rating
FROM
Ratings