SQL group function nested too deeply - sql

I want to make a SQL query which finds the pname of parts which are the least repeated in supplier_parts_shipment.
Table 1 supplier (sno, sname, city)
1, ahmad, jeddah
2,kaled,med
3,njwa,med
Table 2 parts (pno, pname, color)
1, laptop, red
2,keybord,blue
Table 3 supplier_parts_shipment (shno, sno, pno, date)
1,1,1,2014
2,2,1,2014
3,3,2,2014
I tried something like this:
SELECT pname
, min(count(pno))
FROM parts
WHERE pno IN (SELECT pno
FROM supplier_parts_shipment
group by
pname
HAVING min(count(pno))
)
SQL> /
pno IN(SELECT pno FROM supplier_parts_shipment group by pname HAVING min(count(pno))))
*
ERROR at line 2:
ORA-00935: group function is nested too deeply

I would have gone about it in a different manner.
First create a query that shows the counts of shipments by pname ordered in ascending values. Use that as a subquery and pick the first.
SELECT * FROM (
SELECT COUNT(sps.pno), p.pname
FROM supplier_parts_shipment sps
JOIN parts p on sps.pno = p.pno
GROUP BY pname
ORDER BY COUNT(sps.pno) ASC)
WHERE ROWNUM = 1

If there are multiple parts which are least frequent and you want all of them then:
WITH pno_frequencies AS (
SELECT pno,
COUNT(1) AS pno_cnt
FROM supplier_parts_shipment
GROUP BY pno
),
least_frequent_pnos AS (
SELECT pno
FROM pno_frequencies
WHERE pno_cnt = ( SELECT MIN( pno_cnt ) FROM pno_frequencies )
)
SELECT pname
FROM parts p
WHERE EXISTS (SELECT 1
FROM least_frequent_pnos f
WHERE p.pno = f.pno
);
If you only want a single part regardless of whether there are multiple parts with the same minimum frequency then:
WITH pno_frequencies AS (
SELECT pno,
COUNT(1) AS pno_cnt
FROM supplier_parts_shipment
GROUP BY pno
ORDER BY pno_cnt ASC
),
least_frequent_pno AS (
SELECT pno
FROM pno_frequencies
WHERE ROWNUM = 1
)
SELECT pname
FROM parts p
WHERE EXISTS (SELECT 1
FROM least_frequent_pno f
WHERE p.pno = f.pno
);
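The tie-preserving variant above can be checked against the sample data from the question. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption, chosen so it runs anywhere), and it joins to the CTE instead of using EXISTS, which is equivalent here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parts (pno INTEGER, pname TEXT, color TEXT);
CREATE TABLE supplier_parts_shipment (shno INTEGER, sno INTEGER, pno INTEGER, date INTEGER);
INSERT INTO parts VALUES (1, 'laptop', 'red'), (2, 'keybord', 'blue');
INSERT INTO supplier_parts_shipment VALUES (1, 1, 1, 2014), (2, 2, 1, 2014), (3, 3, 2, 2014);
""")

# Count shipments per part once, then keep every part whose count equals the minimum.
least_frequent = conn.execute("""
WITH pno_frequencies AS (
    SELECT pno, COUNT(*) AS pno_cnt
    FROM supplier_parts_shipment
    GROUP BY pno
)
SELECT p.pname
FROM parts p
JOIN pno_frequencies f ON f.pno = p.pno
WHERE f.pno_cnt = (SELECT MIN(pno_cnt) FROM pno_frequencies)
ORDER BY p.pname
""").fetchall()

print(least_frequent)
```

With the sample rows, part 1 is shipped twice and part 2 once, so only the second part comes back.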

min(count(pno))
makes no sense here, which is why you are getting the error. Try:
select parts.pname
, subse.c
from (select pno
, dense_rank() over (order by c asc) r
, c
from (select pno
, count(pno) as c
from supplier_parts_shipment
group by
pno
)
) subse
inner join parts
on (parts.pno = subse.pno)
where subse.r = 1
The innermost select counts the rows in supplier_parts_shipment for each pno. The second level ranks each pno by that count with dense_rank, so every pno tied for the lowest count gets rank 1. The outermost select then joins parts to fetch pname and keeps only the rank 1 rows, i.e. the least repeated parts.

If you are using Oracle 12c or higher, this is quite easy with the row limiting clause:
Select p.pname, count(1) as cnt
From parts p
Join supplier_parts_shipment sps
On sps.pno = p.pno
Group by p.pname
Order by cnt
Fetch first row with ties
For more information about the row limiting clause, refer to this document.
Cheers!!

Related

How to avg and then order data in SQL if it can appear in two columns?

I have a table points with columns person1_id, person2_id and team_score. A person can appear in either the person1_id or the person2_id column, because a player can be in multiple teams. How do I get the top n players with the highest average_score, defined as the average of every team_score they participated in? The result should look like person_id, average_score.
One approach uses a union to create a single logical column of all players and their scores:
SELECT
person_id,
AVG(team_score) AS average_score
FROM
(
SELECT person1_id AS person_id, team_score FROM points
UNION ALL
SELECT person2_id, team_score FROM points
) t
GROUP BY
person_id
ORDER BY
AVG(team_score) DESC
LIMIT 10; -- e.g. for the top 10, but you may replace 10 with any value you want
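A quick way to see the UNION ALL approach at work is to run it on a few made-up rows. The sketch below uses Python's sqlite3 module; the three sample rows and the LIMIT of 2 are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE points (person1_id INTEGER, person2_id INTEGER, team_score REAL);
INSERT INTO points VALUES (1, 2, 10), (1, 3, 20), (2, 3, 60);
""")

# Stack both id columns into one logical person_id column, then average per person.
top_players = conn.execute("""
SELECT person_id, AVG(team_score) AS average_score
FROM (
    SELECT person1_id AS person_id, team_score FROM points
    UNION ALL
    SELECT person2_id, team_score FROM points
) t
GROUP BY person_id
ORDER BY average_score DESC, person_id
LIMIT 2
""").fetchall()

print(top_players)
```

Person 3 played in the teams scoring 20 and 60 (average 40) and person 2 in the teams scoring 10 and 60 (average 35), so those two lead the list.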
I think Tim's answer is a very good answer. But, assuming that you have a persons table, you can do this without union all:
select p.person_id,
(select avg(team_score)
from points po
where p.person_id in (po.person1_id, po.person2_id)
) as average_score
from persons p
order by average_score desc
limit 5; -- or whatever
A rather more complicated expression is probably the most efficient:
select p.person_id,
-- coalesce: sum over zero rows is NULL, and NULL + n is NULL
( coalesce((select sum(team_score)
from points po
where p.person_id = po.person1_id
), 0) +
coalesce((select sum(team_score)
from points po
where p.person_id = po.person2_id
), 0)
) /
nullif( (select count(*)
from points po
where p.person_id = po.person1_id
) +
(select count(*)
from points po
where p.person_id = po.person2_id
), 0
) as average_score
from persons p
order by average_score desc
limit 5;
The reason this is more efficient is that it can make use of indexes on points(person1_id, team_score) and points(person2_id, team_score).
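When adding scalar SUM subqueries like this, it is worth remembering that SUM over zero rows is NULL and that NULL plus a number is NULL, so a player who only ever appears in one of the two columns needs each SUM wrapped in COALESCE. A small sketch with Python's sqlite3 module (the sample rows are made up) shows the difference:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE points (person1_id INTEGER, person2_id INTEGER, team_score REAL);
INSERT INTO points VALUES (1, 2, 10), (1, 3, 20);
""")

# Person 1 never appears in person2_id, so the second SUM runs over zero rows.
naive_total, safe_total = conn.execute("""
SELECT
    (SELECT SUM(team_score) FROM points WHERE person1_id = 1)
      + (SELECT SUM(team_score) FROM points WHERE person2_id = 1),
    COALESCE((SELECT SUM(team_score) FROM points WHERE person1_id = 1), 0)
      + COALESCE((SELECT SUM(team_score) FROM points WHERE person2_id = 1), 0)
""").fetchone()

print(naive_total, safe_total)
```

The unguarded sum comes back as NULL (None in Python), while the guarded one gives the expected total of 30.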
You might solve your problem with a coalesce function, something like:
SELECT
COALESCE(person1_id, person2_id) AS person_id,
AVG(team_score) as average_score
FROM points
GROUP BY COALESCE(person1_id, person2_id)
ORDER BY AVG(team_score)
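What's going on here is that COALESCE(col1, col2) returns the first non-null argument in the list. You can pass as many columns as you like. Note that this attributes each row to person1_id whenever person1_id is not null, so it only fits data where exactly one of the two columns is filled in per row.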
Here are the docs: https://www.sqlite.org/lang_corefunc.html#coalesce
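The behaviour of COALESCE is easy to confirm directly. The literal values in this sqlite3 sketch are arbitrary:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each call returns its first non-NULL argument, or NULL if every argument is NULL.
coalesce_results = conn.execute("""
SELECT COALESCE(1, 2),
       COALESCE(NULL, 2),
       COALESCE(NULL, NULL, 3),
       COALESCE(NULL, NULL)
""").fetchone()

print(coalesce_results)
```

The first column is the relevant case for the points table: when both ids are present, only the first one is ever returned.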
First you need to get all the distinct players and then join them to the points table:
select
d.id person_id,
avg(p.team_score) avgscore
from (
select person1_id id from points
union
select person2_id id from points
) d inner join points p
on (p.person1_id = d.id) or (p.person2_id = d.id)
group by d.id
order by avgscore desc
limit 3
See the demo

Oracle - optimising SQL query

I have two tables - countries (id, name) and users (id, name, country_id). Each user belongs to one country. I want to select 10 random users from the same random country. However, there are countries that have fewer than 10 users, so I can't use them. I need to select only from those countries that have at least 10 users.
I can write something like this:
SELECT * FROM(
SELECT *
FROM users u
{MANY_OTHER_JOINS_AND_CONDITIONS}
WHERE u.country_id =
(
SELECT *
FROM
(
SELECT c.id
FROM countries c
JOIN
(
SELECT users.country_id, COUNT(*) as cnt
FROM users
{MANY_OTHER_JOINS_AND_CONDITIONS}
GROUP BY users.country_id
) X ON X.country_id = c.id
WHERE X.cnt >= 10
ORDER BY DBMS_RANDOM.RANDOM
) Y
WHERE ROWNUM = 1
)
ORDER BY DBMS_RANDOM.RANDOM
) Z WHERE ROWNUM <= 10
However, in my real scenario, I have more conditions and joins to other tables for determining which users are applicable. With this query, I must repeat these conditions in two places: in the query that actually selects the data and in the count subquery.
Is there any way to write a query like this without having those other conditions in two places (which is probably not good performance-wise)?
You can use a CTE for the user criteria to avoid repeating the logic and to allow the DB to cache that set once (though in my experience the DB isn't as good at that as it should be, so check your execution plan).
I'm more of a SQL Server guy than Oracle, and the syntax is subtly different, so this may need some tweaks, but try this:
WITH SafeUsers (ID, Name, country_id) As
(
--criteria for users only have to be specified here
SELECT ID, Name, country_id
FROM users
WHERE ...
),
RandomCountry (ID) As
(
SELECT ID
FROM (
SELECT u.country_id AS ID
FROM SafeUsers u -- but we reference it HERE
GROUP BY u.country_id
HAVING COUNT(u.Id) >= 10
ORDER BY DBMS_RANDOM.RANDOM
) c
WHERE ROWNUM = 1
)
SELECT u.*
FROM (
SELECT s.*
FROM SafeUsers s -- and HERE
INNER JOIN RandomCountry r ON s.country_id = r.ID
ORDER BY DBMS_RANDOM.RANDOM
) u
WHERE ROWNUM <= 10
And by removing nesting and introducing names for each intermediate step, this query is suddenly much easier to read and maintain.
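The shape of that query (filter users once, pick one qualifying country at random, then sample users from it) can be tried out without an Oracle instance. The following is a sketch only: it uses Python's sqlite3 module, so RANDOM() and LIMIT stand in for DBMS_RANDOM and ROWNUM, and the three countries with 12, 15 and 4 users are invented test data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, country_id INTEGER)")

# Hypothetical data: country 1 has 12 users, country 2 has 15, country 3 only 4.
rows = [(f"user{c}_{i}", c) for c, n in ((1, 12), (2, 15), (3, 4)) for i in range(n)]
conn.executemany("INSERT INTO users (name, country_id) VALUES (?, ?)", rows)

picked = conn.execute("""
WITH safe_users AS (
    SELECT id, name, country_id FROM users   -- the user criteria would go here, once
),
random_country AS (
    SELECT country_id
    FROM safe_users
    GROUP BY country_id
    HAVING COUNT(*) >= 10
    ORDER BY RANDOM()
    LIMIT 1
)
SELECT s.id, s.name, s.country_id
FROM safe_users s
JOIN random_country r ON r.country_id = s.country_id
ORDER BY RANDOM()
LIMIT 10
""").fetchall()

print(picked)
```

Every run should return 10 distinct users, all from country 1 or country 2; country 3 never qualifies because it has fewer than 10 users.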
You could create a view for the shared joins and conditions:
create view user_with_many_cond as
SELECT *
FROM users u
{MANY_OTHER_JOINS_AND_CONDITIONS}
Then, looking at your query, you could use HAVING instead of a WHERE outside the subquery. The ORDER BY could also be placed inside the inner query, followed by the filter for one row:
SELECT * FROM(
SELECT *
FROM user_with_many_cond u
WHERE u.country_id =
(
SELECT c.id
FROM countries c
JOIN
(
SELECT v.country_id, COUNT(*) as cnt
FROM user_with_many_cond v
GROUP BY v.country_id
HAVING COUNT(*) >= 10
ORDER BY DBMS_RANDOM.RANDOM
) X ON X.country_id = c.id
WHERE ROWNUM = 1
)
ORDER BY DBMS_RANDOM.RANDOM
) Z WHERE ROWNUM <= 10
To get countries with at least 10 users:
SELECT users.country_id
, row_number() over (order by dbms_random.value()) as rn
FROM users
GROUP BY users.country_id having count(*) >= 10
Use this as a sub-query to choose a country and grab some users:
with ctry as (
SELECT users.country_id
, row_number() over (order by dbms_random.value()) as ctry_rn
FROM users
GROUP BY users.country_id having count(*) >= 10
)
, usr as (
select users.id as user_id
, row_number() over (order by dbms_random.value()) as usr_rn
from ctry
join users
on users.country_id = ctry.country_id
where ctry.ctry_rn = 1
)
select users.*
from usr
join users
on users.id = usr.user_id
where usr.usr_rn <= 10
/
This example ignores your {MANY_OTHER_JOINS_AND_CONDITIONS}: please inject them back where you need them.

VLookup in SQL? - Joining to only pick out the top row

I am trying to get just the first row from a JOIN in SQL. Something similar to VLOOKUP in Excel.
I have the following tables
CREATE TABLE customer_lookup (
customer_product varchar(50),
supplier_product varchar(50),
customer_code varchar(10)
)
CREATE TABLE supplier (
part_number varchar(50)
)
INSERT INTO customer_lookup (
customer_product,
supplier_product,
customer_code ) VALUES ('CONTAINER', 'BOX', 'CUST01')
INSERT INTO customer_lookup (
customer_product,
supplier_product,
customer_code ) VALUES ('CONTAINER', 'BOX', 'CUST02')
INSERT INTO customer_lookup (
customer_product,
supplier_product,
customer_code ) VALUES ('FABRIC', 'MATERIAL', 'CUST01')
INSERT INTO supplier ( part_number ) VALUES ('FABRIC')
INSERT INTO supplier ( part_number ) VALUES ('CONTAINER')
INSERT INTO supplier ( part_number ) VALUES ('PAINT')
and my query is
SELECT
s.part_number, c.supplier_product, c.customer_code
FROM
supplier s
LEFT JOIN
(
SELECT * FROM customer_lookup t
) c
ON s.part_number = c.customer_product
http://sqlfiddle.com/#!6/716b5/1
The result I am trying to get is
part_number supplier_product customer_code
FABRIC MATERIAL CUST01
CONTAINER BOX CUST01
PAINT (null) (null)
but the above SQL query produces
part_number supplier_product customer_code
FABRIC MATERIAL CUST01
CONTAINER BOX CUST01
CONTAINER BOX CUST02
PAINT (null) (null)
I don't care that the row with CONTAINER is missing customer_code CUST02. I just need the top one.
I have tried
SELECT
s.part_number, c.supplier_product, c.customer_code
FROM
supplier s
LEFT JOIN
(
SELECT TOP 1 * FROM customer_lookup t
) c
ON s.part_number = c.customer_product
but this just nulls out both FABRIC and PAINT rows
Any help would be appreciated
You can use GROUP BY and MAX to achieve what you're looking for
SELECT
s.part_number, c.supplier_product, MAX(c.customer_code)
FROM
supplier s
LEFT JOIN
(
SELECT * FROM customer_lookup t
) c
ON s.part_number = c.customer_product
GROUP BY s.part_number, c.supplier_product
For every unique combination of part_number and supplier_product, this returns the highest customer_code value. For CONTAINER that is CUST02; use MIN instead if you want CUST01 as in the expected output.
If you don't care which row qualifies as the top row, as long as it returns one row at most, then you can use the row_number window function with order by (select null). SQL Server does not accept a bare constant such as null in a window order by.
SELECT s.part_number, c.supplier_product, c.customer_code
FROM supplier s
LEFT JOIN (SELECT *,
row_number() over (partition by customer_product order by (select null)) as rn
FROM customer_lookup) c
ON s.part_number = c.customer_product
AND c.rn = 1
If you do care which row gets picked, then just modify the order by clause accordingly.
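You can use OUTER APPLY to get your results; the main benefit here is that you are not using aggregation (GROUP BY). OUTER APPLY is needed rather than CROSS APPLY so that PAINT, which has no lookup row, is still returned with nulls.
SELECT
s.part_number, c.supplier_product, c.customer_code
FROM
supplier s
OUTER APPLY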
(
SELECT TOP 1 * FROM customer_lookup t
WHERE s.part_number = t.customer_product
ORDER BY t.customer_code
) c
You should also add an ORDER BY to ensure the results are ordered the way you want them to be (I have added this in for you).
You should also define columns that you are using rather than using an asterisk (*) but that's up to you (I've left this as is for now)
http://sqlfiddle.com/#!6/716b5/17
If you want the results you showed, in the order you showed them:
SELECT
s.part_number, c.supplier_product, MIN(c.customer_code)
FROM
supplier s
LEFT JOIN
(
SELECT * FROM customer_lookup t
) c
ON s.part_number = c.customer_product
GROUP BY s.part_number, c.supplier_product
ORDER BY c.supplier_product DESC
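For a portable check of the "first match only" lookup, the question's own tables can be loaded into SQLite through Python's sqlite3 module. This sketch is an assumption-laden stand-in for the SQL Server queries above: it has no TOP or APPLY, so it keeps the lowest customer_code per product with a correlated subquery in the join condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_lookup (customer_product TEXT, supplier_product TEXT, customer_code TEXT);
CREATE TABLE supplier (part_number TEXT);
INSERT INTO customer_lookup VALUES
    ('CONTAINER', 'BOX', 'CUST01'),
    ('CONTAINER', 'BOX', 'CUST02'),
    ('FABRIC', 'MATERIAL', 'CUST01');
INSERT INTO supplier VALUES ('FABRIC'), ('CONTAINER'), ('PAINT');
""")

# LEFT JOIN keeps PAINT; the subquery limits each part to its lowest customer_code.
first_match = conn.execute("""
SELECT s.part_number, c.supplier_product, c.customer_code
FROM supplier s
LEFT JOIN customer_lookup c
       ON c.customer_product = s.part_number
      AND c.customer_code = (SELECT MIN(t.customer_code)
                             FROM customer_lookup t
                             WHERE t.customer_product = s.part_number)
ORDER BY s.part_number
""").fetchall()

print(first_match)
```

Note that if two lookup rows shared the same lowest customer_code for one product, both would still be returned, so this relies on the code being unique per product.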

Get Products supporting all delivery modes in SQL

I have a table ProductDeliveryModes as:
ProductId DeliveryId
P101 D1
P101 D2
P101 D3
P102 D1
P102 D2
P102 D3
P103 D1
I need to get products which support all delivery modes (D1, D2, D3). From looking at the table the products should be: P101 and P102.
The query that I formed to get the solution is:
SELECT ProductId
FROM (SELECT DISTINCT ProductId,
DeliveryId
FROM ProductDeliveryModes) X
WHERE X.DeliveryId IN ( 'D1', 'D2', 'D3' )
GROUP BY ProductId
HAVING COUNT(*) = 3
The problem that I see in my solution is that one should know the count of the total number of delivery modes. We could make the count dynamic by getting the count from Sub-query.
Is there a better solution?
I believe you can use DISTINCT with COUNT function to get the same result:
SELECT [ProductID]
FROM ProductDeliveryModes
GROUP BY [ProductID]
HAVING COUNT(DISTINCT [DeliveryId]) = 3
Check the example.
You can simply store the distinct delivery count in a variable and use it. If you need to do this in a single query, this is one of the possible ways:
WITH CTE (DeliveryCount) AS
(
SELECT COUNT(DISTINCT [DeliveryID])
FROM ProductDeliveryModes
)
SELECT [ProductID]
FROM ProductDeliveryModes
CROSS APPLY CTE
GROUP BY [ProductID]
,CTE.DeliveryCount
HAVING COUNT(DISTINCT [DeliveryID]) = DeliveryCount
See the example.
You can use the query below, which may perform better:
;WITH CTE_Product
AS
(
SELECT DISTINCT ProductID
FROM ProductDeliveryModes
),CTE_Delivery
AS
(
SELECT DISTINCT DeliveryId
FROM ProductDeliveryModes
)
SELECT *
FROM CTE_Product C
WHERE NOT EXISTS
(
SELECT 1
FROM CTE_Delivery D
LEFT JOIN ProductDeliveryModes T ON T.DeliveryId = D.DeliveryId AND T.ProductId=C.ProductId
WHERE T.ProductID IS NULL
)
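The NOT EXISTS formulation reads as "products for which no delivery mode is missing". A compact sketch of the same idea, run on the question's data through Python's sqlite3 module (SQLite here is an assumption, standing in for SQL Server), uses two nested NOT EXISTS instead of the LEFT JOIN:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProductDeliveryModes (ProductId TEXT, DeliveryId TEXT);
INSERT INTO ProductDeliveryModes VALUES
    ('P101', 'D1'), ('P101', 'D2'), ('P101', 'D3'),
    ('P102', 'D1'), ('P102', 'D2'), ('P102', 'D3'),
    ('P103', 'D1');
""")

# Keep a product only if there is no delivery mode that the product lacks.
full_coverage = conn.execute("""
SELECT DISTINCT p.ProductId
FROM ProductDeliveryModes p
WHERE NOT EXISTS (
    SELECT 1
    FROM (SELECT DISTINCT DeliveryId FROM ProductDeliveryModes) d
    WHERE NOT EXISTS (
        SELECT 1
        FROM ProductDeliveryModes t
        WHERE t.ProductId = p.ProductId
          AND t.DeliveryId = d.DeliveryId
    )
)
ORDER BY p.ProductId
""").fetchall()

print(full_coverage)
```

P103 is excluded because D2 and D3 exist as delivery modes but have no row for it.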
You can modify your query just a bit to get the actual count of distinct delivery methods:
SELECT ProductID
FROM ProductDeliveryModes
GROUP BY ProductID
HAVING COUNT(DISTINCT DeliveryId) =
(SELECT COUNT (DISTINCT DeliveryId) FROM ProductDeliveryModes)
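If the table can hold duplicate (ProductId, DeliveryId) rows, which the DISTINCT in the question suggests, the left-hand side of the comparison matters: COUNT(*) counts the duplicates, COUNT(DISTINCT DeliveryId) does not. The sketch below illustrates this with Python's sqlite3 module and made-up rows in which P103 has the same mode three times; the helper name products_matching is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProductDeliveryModes (ProductId TEXT, DeliveryId TEXT);
INSERT INTO ProductDeliveryModes VALUES
    ('P101', 'D1'), ('P101', 'D2'), ('P101', 'D3'),
    ('P103', 'D1'), ('P103', 'D1'), ('P103', 'D1');
""")

def products_matching(count_expr):
    """Return products whose count_expr equals the number of distinct delivery modes."""
    sql = f"""
    SELECT ProductId
    FROM ProductDeliveryModes
    GROUP BY ProductId
    HAVING {count_expr} =
        (SELECT COUNT(DISTINCT DeliveryId) FROM ProductDeliveryModes)
    ORDER BY ProductId
    """
    return [row[0] for row in conn.execute(sql)]

with_count_star = products_matching("COUNT(*)")
with_count_distinct = products_matching("COUNT(DISTINCT DeliveryId)")

print(with_count_star, with_count_distinct)
```

COUNT(*) lets P103 through because its three duplicate rows happen to equal the number of modes, while COUNT(DISTINCT DeliveryId) returns only P101.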

Get Latest ID from a Duplicate Records in a table

I have two tables, RAWtable and MAINtable. I have to get the latest groupid when more than one record exists with the same name and code. For example, I have this in RAWtable:
id groupid name code
1 G09161405 Name1 Code1
2 G09161406 Name1 Code1
The two records should be treated as one, and only this row should be returned:
id groupid name code
2 G09161406 Name1 Code1
This is the only row that should be inserted into the main table, i.e. the one with the latest groupid (the groupid is a combination of date and time).
I've tried this but it's not working:
SELECT MAST.ID, MAST.code, MAST.name FROM RAWtable AS MAST INNER JOIN
(SELECT code, name, grouid,id FROM RAWtable AS DUPT GROUP BY code, name, groupid,id HAVING COUNT(*) >= 2) DUPT
ON DUPT.code =MAST.code and DUPT.name =MAST.name where dupt.groupid >mast.groupid
How can I do this? Thanks a lot.
select R.id,
R.groupid,
R.name,
R.code
from (select id,
groupid,
name,
code,
row_number() over(partition by name, code order by groupid desc) as rn
from RawTable
) as R
where R.rn = 1
Or, if you don't have row_number():
select R1.id,
R1.groupid,
R1.name,
R1.code
from RawTable as R1
inner join (
select name, code, max(groupid) as groupid
from RawTable
group by name, code
) as R2
on R1.name = R2.name and
R1.code = R2.code and
R1.groupid = R2.groupid
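The join-on-MAX variant can be verified on the question's rows with Python's sqlite3 module. This is a sketch; the third row (Name2, Code2) is an added assumption to show that groups without duplicates are kept as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE RAWtable (id INTEGER, groupid TEXT, name TEXT, code TEXT);
INSERT INTO RAWtable VALUES
    (1, 'G09161405', 'Name1', 'Code1'),
    (2, 'G09161406', 'Name1', 'Code1'),
    (3, 'G09161407', 'Name2', 'Code2');
""")

# Find the latest groupid per (name, code), then join back to recover the full row.
latest = conn.execute("""
SELECT r1.id, r1.groupid, r1.name, r1.code
FROM RAWtable r1
JOIN (SELECT name, code, MAX(groupid) AS groupid
      FROM RAWtable
      GROUP BY name, code) r2
  ON r1.name = r2.name
 AND r1.code = r2.code
 AND r1.groupid = r2.groupid
ORDER BY r1.id
""").fetchall()

print(latest)
```

Comparing groupid as text works here because all values have the same length and format, so string order matches chronological order.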
Try this; it will give you the max groupid, which is the latest:
SELECT MAX(GroupId), Name, Code
FROM RAWtable
GROUP BY Name, Code
select max(id), name, code from RAWtable
group by name, code having count(*) > 1
Will return:
id name code
2 Name1 Code1
This returns the max id for each name and code combination that has more than one record in the table.
Try this:
select max(t.groupid), t.name, t.code
from RAWtable t
group by t.name, t.code
This will basically select the max value of groupid for each name and code combination.