Select all rows that have multiple sub-relations satisfying some constraints - sql

Let’s say I have a list of baskets that can contain fruits with a certain weight:
Table baskets
(id, name)
----------------
1, 'apples, oranges and more'
2, 'apples and small oranges'
3, 'apples and bananas'
4, 'only oranges'
5, 'empty'
Table basket_fruits
(id, basket, fruit, weight)
----------------
1, 1, 'apple', 2
2, 1, 'apple', 3
3, 1, 'orange', 2
4, 1, 'banana', 2
5, 2, 'apple', 2
6, 2, 'orange', 1
7, 3, 'apple', 2
8, 3, 'banana', 2
9, 4, 'orange', 2
SQL Fiddle with this data
I’m struggling to come up with reasonably efficient queries for these two scenarios:
I want to fetch all baskets that contain at least one apple AND at least one orange, each above a given weight. So the expected result for weight >= 2 is
1, 'apples, oranges and more'
and for weight >= 1 it’s
1, 'apples, oranges and more'
2, 'apples and small oranges'
I want to fetch all baskets that contain no fruit above a given weight. So for weight >= 2 I would expect
5, 'empty'
and for weight >= 3 it should return
2, 'apples and small oranges'
3, 'apples and bananas'
4, 'only oranges'
5, 'empty'
The weight constraint is just a placeholder for "each sub-relation must meet certain constraints". In practice, we need to restrict the sub-relation by date range, status, etc. but I didn’t want to complicate the example any further.
(I’m using postgresql, in case the solution needs to be database-specific.)

I strongly recommend using group by and having for this purpose.
For your first question, this query should work:
SELECT b.name
FROM baskets b INNER JOIN
basket_fruits bf
ON b.id = bf.basket
GROUP BY b.name
HAVING SUM( (bf.fruit = 'apple' AND bf.weight >= 2)::int ) > 0 AND
SUM( (bf.fruit = 'orange' AND bf.weight >= 2)::int ) > 0 ;
The second is a little more complicated, because an empty basket has no rows in basket_fruits at all. A LEFT JOIN plus COALESCE() handles that, so you can express it in the same format:
SELECT b.name
FROM baskets b LEFT JOIN
basket_fruits bf
ON b.id = bf.basket
GROUP BY b.name
HAVING SUM( (COALESCE(bf.weight, 0) >= 2)::int ) = 0
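To try the two HAVING queries above without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. This is an assumption on my part, not part of the original setup: SQLite has no `::int` cast, but a comparison there already evaluates to 0 or 1, so SUM() can be applied to it directly. The function names are made up for illustration, and I group by id as well as name in case two baskets share a name.

```python
import sqlite3

# Sample data copied from the question.
SCHEMA_AND_DATA = """
CREATE TABLE baskets (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE basket_fruits (
    id INTEGER PRIMARY KEY, basket INTEGER, fruit TEXT, weight INTEGER
);
INSERT INTO baskets VALUES
    (1, 'apples, oranges and more'), (2, 'apples and small oranges'),
    (3, 'apples and bananas'), (4, 'only oranges'), (5, 'empty');
INSERT INTO basket_fruits VALUES
    (1, 1, 'apple', 2), (2, 1, 'apple', 3), (3, 1, 'orange', 2),
    (4, 1, 'banana', 2), (5, 2, 'apple', 2), (6, 2, 'orange', 1),
    (7, 3, 'apple', 2), (8, 3, 'banana', 2), (9, 4, 'orange', 2);
"""


def open_sample_db():
    """Create an in-memory SQLite database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_DATA)
    return conn


def baskets_with_apple_and_orange(conn, min_weight):
    """Baskets with at least one apple AND one orange of weight >= min_weight."""
    sql = """
        SELECT b.name
        FROM baskets b
        INNER JOIN basket_fruits bf ON b.id = bf.basket
        GROUP BY b.id, b.name
        HAVING SUM(bf.fruit = 'apple'  AND bf.weight >= :w) > 0
           AND SUM(bf.fruit = 'orange' AND bf.weight >= :w) > 0
        ORDER BY b.id
    """
    return [name for (name,) in conn.execute(sql, {"w": min_weight})]


def baskets_without_heavy_fruit(conn, min_weight):
    """Baskets with no fruit of weight >= min_weight (empty baskets included)."""
    sql = """
        SELECT b.name
        FROM baskets b
        LEFT JOIN basket_fruits bf ON b.id = bf.basket
        GROUP BY b.id, b.name
        HAVING SUM(COALESCE(bf.weight, 0) >= :w) = 0
        ORDER BY b.id
    """
    return [name for (name,) in conn.execute(sql, {"w": min_weight})]
```

With the sample data, both functions should reproduce the expected results listed in the question for weights 1, 2 and 3.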

Here are my solutions so far:
All baskets containing both fruits with weight >= 2 (thanks to Gordon Linoff’s suggestions):
SELECT b.* FROM baskets b
INNER JOIN (
SELECT basket FROM basket_fruits
WHERE weight >= 2
GROUP BY basket
HAVING SUM((fruit = 'apple')::int) > 0 AND SUM((fruit = 'orange')::int) > 0
) bf ON b.id = bf.basket
SQL-Fiddle
All baskets without fruits with weight >= 2:
SELECT b.* FROM baskets b
LEFT JOIN (
SELECT basket, fruit FROM basket_fruits
WHERE weight >= 2
) bf ON b.id = bf.basket
WHERE fruit IS NULL
SQL-Fiddle
If anyone has more efficient ideas, I’d love to hear them.
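One more shape worth benchmarking is EXISTS / NOT EXISTS with one correlated subquery per constraint. This is only a sketch under my own assumptions (Python's sqlite3 instead of PostgreSQL, made-up function names); whether it is actually faster depends on your data and indexes, so compare the query plans on the real tables.

```python
import sqlite3


def open_sample_db():
    """In-memory SQLite copy of the sample data from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE baskets (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE basket_fruits (
            id INTEGER PRIMARY KEY, basket INTEGER, fruit TEXT, weight INTEGER
        );
        INSERT INTO baskets VALUES
            (1, 'apples, oranges and more'), (2, 'apples and small oranges'),
            (3, 'apples and bananas'), (4, 'only oranges'), (5, 'empty');
        INSERT INTO basket_fruits VALUES
            (1, 1, 'apple', 2), (2, 1, 'apple', 3), (3, 1, 'orange', 2),
            (4, 1, 'banana', 2), (5, 2, 'apple', 2), (6, 2, 'orange', 1),
            (7, 3, 'apple', 2), (8, 3, 'banana', 2), (9, 4, 'orange', 2);
    """)
    return conn


def with_apple_and_orange(conn, min_weight):
    """One EXISTS per required fruit; each can stop at the first match."""
    sql = """
        SELECT b.id, b.name
        FROM baskets b
        WHERE EXISTS (SELECT 1 FROM basket_fruits f
                      WHERE f.basket = b.id
                        AND f.fruit = 'apple' AND f.weight >= :w)
          AND EXISTS (SELECT 1 FROM basket_fruits f
                      WHERE f.basket = b.id
                        AND f.fruit = 'orange' AND f.weight >= :w)
        ORDER BY b.id
    """
    return conn.execute(sql, {"w": min_weight}).fetchall()


def without_heavy_fruit(conn, min_weight):
    """NOT EXISTS keeps empty baskets without needing a LEFT JOIN."""
    sql = """
        SELECT b.id, b.name
        FROM baskets b
        WHERE NOT EXISTS (SELECT 1 FROM basket_fruits f
                          WHERE f.basket = b.id AND f.weight >= :w)
        ORDER BY b.id
    """
    return conn.execute(sql, {"w": min_weight}).fetchall()
```

Each extra constraint (date range, status, and so on) would go inside the matching subquery, which keeps the conditions for the different sub-relations separate.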

SELECT b.name
FROM baskets b
INNER JOIN basket_fruits f
ON b.id = f.basket
GROUP BY b.id, b.name
HAVING SUM(f.weight) >= 3
OR b.id = 5
Is this what you expected?

Related

How to filter a SQL Query using ROLLUP

This query:
SELECT 1, 2, count(*)
FROM t
GROUP BY ROLLUP (1, 2)
ORDER BY 1, 2
Shows:
1, 2
A NULL 3
A Blue 2
A Neon 1
B NULL 2
B Navy 2
C NULL 4
C Neon 2
C Blue 2
You see the sums A = 3, B = 2, and C = 4?
I want to filter to only show if the SUM is greater than 2, and all related data. So I'd see all A and all C, but not B.
If I add HAVING COUNT(*) > 2
it affects all values. I'd see lines 1 and 6.
I have also tried
HAVING grouping(count(*)) > 2
but I get the error
"Cannot perform an aggregate function on an expression containing an aggregate or a subquery." I am fairly new to SQL, so I don't know if this is related to what I am trying to do.
Thanks!
Use EXISTS, like below:
select a.* from
(
SELECT col1, col2, count(*) as cnt
FROM t
GROUP BY ROLLUP (col1, col2)
) a where
exists ( select 1 from
(
SELECT col1, col2, count(*) as cnt
FROM t
GROUP BY ROLLUP (col1, col2)
) b where a.col1=b.col1 and b.cnt>2)
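To check the EXISTS filter on the sample counts, here is a rough sketch in Python's sqlite3. Two assumptions of mine: SQLite has no ROLLUP, so the detail and subtotal rows are emulated with UNION ALL (the grand-total row is left out), and the table contents are reconstructed from the counts shown in the question.

```python
import sqlite3


def open_sample_db():
    """Rows chosen so the counts match the question: A=3, B=2, C=4."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT)")
    rows = (
        [("A", "Blue")] * 2 + [("A", "Neon")]
        + [("B", "Navy")] * 2
        + [("C", "Neon")] * 2 + [("C", "Blue")] * 2
    )
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    return conn


def rollup_groups_over(conn, threshold):
    """Detail and subtotal rows for every col1 whose subtotal exceeds threshold."""
    sql = """
        WITH rolled AS (
            -- detail rows: one per (col1, col2)
            SELECT col1, col2, COUNT(*) AS cnt FROM t GROUP BY col1, col2
            UNION ALL
            -- subtotal rows: one per col1, with col2 left NULL
            SELECT col1, NULL, COUNT(*) FROM t GROUP BY col1
        )
        SELECT a.col1, a.col2, a.cnt
        FROM rolled a
        WHERE EXISTS (SELECT 1 FROM rolled b
                      WHERE b.col1 = a.col1 AND b.cnt > :n)
        ORDER BY a.col1, a.col2
    """
    return conn.execute(sql, {"n": threshold}).fetchall()
```

With a threshold of 2 this keeps every A and C row and drops both B rows, because no B row, including the B subtotal, has a count above 2.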

SQL query to select rows with all ACL that user has

Suppose I have the following table RIGHTS with data:
ID NAME OWNER_ID ACL_ID ACL_NAME
--------------------------------------------------
100 Entity_1 1 1 g1
100 Entity_1 2 2 g2
100 Entity_1 3 3 g3
200 Entity_2 1 1 g1
200 Entity_2 2 2 g2
300 Entity_3 1 1 g1
300 Entity_3 2 2 g2
300 Entity_3 4 NULL NULL
400 Entity_4 1 1 g1
400 Entity_4 2 2 g2
400 Entity_4 3 3 g3
400 Entity_4 4 NULL NULL
500 Entity_5 4 NULL NULL
500 Entity_5 5 NULL NULL
500 Entity_5 6 NULL NULL
600 Entity_6 NULL NULL NULL
How can I select all (ID, NAME) records for which there is not a single row with ACL_ID=NULL, except for rows with OWNER_ID=NULL? In this particular example I want to select 3 rows:
(100, Entity_1) - because all 3 rows have ACL_ID != NULL (1, 2, 3)
(200, Entity_2) - because both rows have ACL_ID != NULL (1, 2)
(600, Entity_6) - because OWNER_ID=NULL
For now I use SQL Server, but I want it to work on Oracle as well, if possible.
UPDATE
I apologize, I should have mentioned that this table data is just the result of a query with joins, so that has to be taken into account:
SELECT DISTINCT
EMPLOYEE.ID
,EMPLOYEE.NAME
, OWNERS.OWNER_ID as OWNER_ID
, GROUPS.GROUP_ID as ACL_ID
, GROUPS.NAME as ACL_NAME
from EMPLOYEE
inner join ENTITIES on ENTITIES.ENTITY_ID = ID
left outer join OWNERS on (OWNERS.ENTITY_ID = ID and OWNERS.OWNER_ID != 123)
left outer join GROUPS on OWNERS.OWNER_ID = GROUPS.GROUP_ID
where
ENTITIES.STATUS != 'D'
Try this:
select s.id, s.name
from
(select id,name,max(coalesce(owner_id,-1)) owner_id, min(coalesce(acl_id,-1)) acl_id
from yourtable
group by id,name) as s
where s.owner_id = -1
or (s.owner_id > -1 and s.acl_id > -1)
We use COALESCE to default null values to -1 (assuming the columns are integers), and then get the maximum owner_id and the minimum acl_id per unique id-name combination. If the maximum value of owner_id is -1, then every owner_id for that combination is null. Likewise, if the minimum value of acl_id is -1, then at least one row with a null acl_id exists. Based on these 2 conditions, we filter the list to get the required id-name pairs. Note that in this case, I simply chose -1 as the default value because I assume you don't use negative numbers as IDs. If you do, you can choose a suitable, "impossible" value as the default for the COALESCE function.
This should work on SQL Server and Oracle.
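As a quick sanity check of the COALESCE approach, the sketch below runs it against the sample rows in an in-memory SQLite database via Python's sqlite3. SQLite is my assumption here (the question targets SQL Server and Oracle), and the table name `rights`, the column aliases and the function name are only illustrative.

```python
import sqlite3

# Sample rows from the question; None stands for NULL.
ROWS = [
    (100, "Entity_1", 1, 1, "g1"), (100, "Entity_1", 2, 2, "g2"),
    (100, "Entity_1", 3, 3, "g3"),
    (200, "Entity_2", 1, 1, "g1"), (200, "Entity_2", 2, 2, "g2"),
    (300, "Entity_3", 1, 1, "g1"), (300, "Entity_3", 2, 2, "g2"),
    (300, "Entity_3", 4, None, None),
    (400, "Entity_4", 1, 1, "g1"), (400, "Entity_4", 2, 2, "g2"),
    (400, "Entity_4", 3, 3, "g3"), (400, "Entity_4", 4, None, None),
    (500, "Entity_5", 4, None, None), (500, "Entity_5", 5, None, None),
    (500, "Entity_5", 6, None, None),
    (600, "Entity_6", None, None, None),
]


def open_sample_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rights (id INTEGER, name TEXT, owner_id INTEGER,"
        " acl_id INTEGER, acl_name TEXT)"
    )
    conn.executemany("INSERT INTO rights VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn


def entities_with_full_acl(conn):
    """Entities with no owner at all, or with an ACL on every owner row."""
    sql = """
        SELECT s.id, s.name
        FROM (SELECT id, name,
                     MAX(COALESCE(owner_id, -1)) AS max_owner,
                     MIN(COALESCE(acl_id, -1))   AS min_acl
              FROM rights
              GROUP BY id, name) AS s
        WHERE s.max_owner = -1                        -- no owner rows at all
           OR (s.max_owner > -1 AND s.min_acl > -1)   -- no NULL acl_id
        ORDER BY s.id
    """
    return conn.execute(sql).fetchall()
```

On the sample data this should return Entity_1, Entity_2 and Entity_6, matching the three rows requested in the question.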
Here's my solution on Oracle.
SELECT DISTINCT
EMPLOYEE.ID
,EMPLOYEE.NAME
, OWNERS.OWNER_ID as OWNER_ID
, GROUPS.GROUP_ID as ACL_ID
, GROUPS.NAME as ACL_NAME
from EMPLOYEE
inner join ENTITIES on ENTITIES.ENTITY_ID = ID
left outer join OWNERS on (OWNERS.ENTITY_ID = ID and OWNERS.OWNER_ID != 123)
left outer join GROUPS on OWNERS.OWNER_ID = GROUPS.GROUP_ID
where ENTITIES.STATUS != 'D'
and not exists (select 1
from OWNERS O2
left outer join GROUPS G2 on O2.OWNER_ID = G2.GROUP_ID
where O2.ENTITY_ID = EMPLOYEE.ID
and O2.OWNER_ID != 123
and G2.GROUP_ID is null);
You simply need to append the inner subquery from my earlier answer and you will get your solution.
Using simple filters in where clause:
with tab(ID,NAME,OWNER_ID,ACL_ID,ACL_NAME) as (
select 100, 'Entity_1', 1,1, 'g1' from dual union all
select 100, 'Entity_1', 2,2, 'g2' from dual union all
select 100, 'Entity_1', 3,3, 'g3' from dual union all
select 200, 'Entity_2', 1,1, 'g1' from dual union all
select 200, 'Entity_2', 2,2, 'g2' from dual union all
select 300, 'Entity_3', 1,1, 'g1' from dual union all
select 300, 'Entity_3', 2,2, 'g2' from dual union all
select 300, 'Entity_3', 4,NULL, NULL from dual union all
select 400, 'Entity_4', 1,1, 'g1' from dual union all
select 400, 'Entity_4', 2,2, 'g2' from dual union all
select 400, 'Entity_4', 3,3, 'g3' from dual union all
select 400, 'Entity_4', 4,NULL,NULL from dual union all
select 500, 'Entity_5', 4,NULL,NULL from dual union all
select 500, 'Entity_5', 5,NULL,NULL from dual union all
select 500, 'Entity_5', 6,NULL,NULL from dual union all
select 600, 'Entity_6', NULL,NULL,NULL from dual)
--------------------------------
---End of data preparation here
--------------------------------
select a.id, a.name
from tab a
where ((a.ACL_ID is not null and a.ACL_NAME is not NULL) or a.OWNER_ID is null)
and not exists (select 'x'
from tab b
where b.id = a.id
and (b.ACL_ID is null or b.ACL_NAME is null)
and b.owner_id is not null)
group by a.id, a.name;
Output:
ID NAME
------------
200 Entity_2
100 Entity_1
600 Entity_6
But I still wonder what your logic would be when there is data like:
ID NAME OWNER_ID ACL_ID ACL_NAME
--------------------------------------------------
600 Entity_1 null null null
600 Entity_1 2 null null
??????????

Left outer join on aggregate queries

So I have two payment tables that I want to compare in an Oracle SQL DB. I want to compare the total payments per location and invoice. It's more complex than this, but basically it is:
select
tbl1.location,
tbl1.invoice,
Sum(tbl1.payments),
Sum(tbl2.payments)
From
tbl1
left outer join tbl2 on
tbl1.location = tbl2.location
and tbl1.invoice = tbl2.invoice
group by
(tbl1.location,tbl1.invoice)
I want the left outer join because, in addition to comparing payment amounts, I want to see all orders in tbl1 that may not exist in tbl2.
The issue is that there are multiple records for each order (location & invoice) in both tables (not necessarily the same number of records, e.g. 2 in tbl1 to 1 in tbl2 or vice versa), but the total payments for each order (location & invoice) should match. So just doing a direct join gives me a cartesian product.
So I am thinking I could do two queries, first aggregating the total payments by store & invoice for each table, and then join those results, because in the aggregate results I would only have one record for each order (store & invoice). But I don't know how to do this. I've tried several subqueries but can't seem to shake the cartesian product. I'd like to be able to do this in one query, as opposed to creating tables and joining on those, as this will be ongoing.
Thanks in advance for any help.
You can use the WITH clause to create the two queries and then join them, as you said. Since you didn't provide full details on your tables, I will just show the syntax and guess at the rest; if you need more help, just ask.
WITH tmpTableA as (
select
tbl1.location,
tbl1.invoice,
Sum(tbl1.payments) totalTblA
From
tbl1
group by
tbl1.location,
tbl1.invoice
),
tmpTableB as (
select
tbl2.location,
tbl2.invoice,
Sum(tbl2.payments) totalTblB
From
tbl2
group by
tbl2.location,
tbl2.invoice
)
Select tmpTableA.location, tmpTableA.invoice, tmpTableA.totalTblA,
tmpTableB.location, tmpTableB.invoice, tmpTableB.totalTblB
from tmpTableA, tmpTableB
where tmpTableA.location = tmpTableB.location (+)
and tmpTableA.invoice = tmpTableB.invoice (+)
The (+) operator is Oracle's legacy outer-join syntax; placed on the tmpTableB side, it makes this a left outer join from tmpTableA. (Of course, you can use LEFT JOIN if you prefer.)
Two other options:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE tbl1 ( id, location, invoice, payments ) AS
SELECT 1, 'a', 1, 1 FROM DUAL
UNION ALL SELECT 2, 'a', 1, 1 FROM DUAL
UNION ALL SELECT 3, 'a', 1, 1 FROM DUAL
UNION ALL SELECT 4, 'a', 1, 1 FROM DUAL
UNION ALL SELECT 5, 'a', 1, 1 FROM DUAL
UNION ALL SELECT 6, 'a', 2, 1 FROM DUAL
UNION ALL SELECT 7, 'a', 2, 1 FROM DUAL
UNION ALL SELECT 8, 'a', 2, 1 FROM DUAL
UNION ALL SELECT 9, 'b', 1, 1 FROM DUAL
UNION ALL SELECT 10, 'b', 2, 1 FROM DUAL;
CREATE TABLE tbl2 ( id, location, invoice, payments ) AS
SELECT 1, 'a', 1, 1 FROM DUAL
UNION ALL SELECT 2, 'a', 1, 1 FROM DUAL
UNION ALL SELECT 3, 'a', 1, 1 FROM DUAL
UNION ALL SELECT 4, 'a', 2, 1 FROM DUAL
UNION ALL SELECT 5, 'a', 2, 1 FROM DUAL
UNION ALL SELECT 6, 'b', 1, 1 FROM DUAL
UNION ALL SELECT 7, 'b', 1, 1 FROM DUAL
UNION ALL SELECT 8, 'b', 1, 1 FROM DUAL
UNION ALL SELECT 9, 'b', 1, 1 FROM DUAL
UNION ALL SELECT 10, 'b', 1, 1 FROM DUAL;
Query 1:
This one uses a correlated sub-query to calculate the total for the second table:
SELECT location,
invoice,
SUM( payments ) AS total_payments_1,
COALESCE( (SELECT SUM( payments )
FROM tbl2 i
WHERE o.location = i.location
AND o.invoice = i.invoice),
0 ) AS total_payments_2
FROM tbl1 o
GROUP BY
location,
invoice
ORDER BY
location,
invoice
Results:
| LOCATION | INVOICE | TOTAL_PAYMENTS_1 | TOTAL_PAYMENTS_2 |
|----------|---------|------------------|------------------|
| a | 1 | 5 | 3 |
| a | 2 | 3 | 2 |
| b | 1 | 1 | 5 |
| b | 2 | 1 | 0 |
Query 2:
This one uses a named sub-query to pre-calculate the totals for table 1 then performs a LEFT OUTER JOIN with the second table and includes the total for table 1 in the group.
Without any indexes, the explain plans suggest that Query 1 is much more efficient, but with your indexes the optimizer might find a better plan.
WITH tbl1_sums AS (
SELECT location,
invoice,
SUM( payments ) AS total_payments_1
FROM tbl1
GROUP BY
location,
invoice
)
SELECT t1.location,
t1.invoice,
t1.total_payments_1,
COALESCE( SUM( t2.payments ), 0 ) AS total_payments_2
FROM tbl1_sums t1
LEFT OUTER JOIN
tbl2 t2
ON ( t1.location = t2.location
AND t1.invoice = t2.invoice)
GROUP BY
t1.location,
t1.invoice,
t1.total_payments_1
ORDER BY
t1.location,
t1.invoice
Results:
| LOCATION | INVOICE | TOTAL_PAYMENTS_1 | TOTAL_PAYMENTS_2 |
|----------|---------|------------------|------------------|
| a | 1 | 5 | 3 |
| a | 2 | 3 | 2 |
| b | 1 | 1 | 5 |
| b | 2 | 1 | 0 |
Sorry, my first answer was wrong. Thank you for providing the sqlfiddle, MT0.
The point that I missed is that you need to sum up the payments in each table first, so that only one row per order is left in each, and then join them. This is what MT0 does in his statements.
If you want a solution that looks more "symmetric", try:
select A.location, A.invoice, B.total sum1, C.total sum2
from (select distinct location, invoice from tbl1) A
left outer join (select location, invoice, sum(payments) as total from tbl1 group by location, invoice) B on A.location=B.location and A.invoice=B.invoice
left outer join (select location, invoice, sum(payments) as total from tbl2 group by location, invoice) C on A.location=C.location and A.invoice=C.invoice
which results in
LOCATION INVOICE SUM1 SUM2
a 2 3 2
a 1 5 3
b 1 1 5
b 2 1 (null)
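To see why the direct join goes wrong and why aggregating first fixes it, here is a small sketch in Python's sqlite3 that loads the same rows as the SQL Fiddle setup above. SQLite stands in for Oracle here (my assumption), and the function names are made up for the example.

```python
import sqlite3


def open_sample_db():
    """Same (location, invoice) rows as the fiddle; every payment is 1."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tbl1 (id INTEGER PRIMARY KEY, location TEXT,
                           invoice INTEGER, payments INTEGER);
        CREATE TABLE tbl2 (id INTEGER PRIMARY KEY, location TEXT,
                           invoice INTEGER, payments INTEGER);
    """)
    tbl1 = [("a", 1)] * 5 + [("a", 2)] * 3 + [("b", 1), ("b", 2)]
    tbl2 = [("a", 1)] * 3 + [("a", 2)] * 2 + [("b", 1)] * 5
    conn.executemany(
        "INSERT INTO tbl1 (location, invoice, payments) VALUES (?, ?, 1)", tbl1)
    conn.executemany(
        "INSERT INTO tbl2 (location, invoice, payments) VALUES (?, ?, 1)", tbl2)
    return conn


def naive_totals(conn):
    """Join first, aggregate second: each row is repeated once per match."""
    return conn.execute("""
        SELECT t1.location, t1.invoice, SUM(t1.payments), SUM(t2.payments)
        FROM tbl1 t1
        LEFT JOIN tbl2 t2
               ON t1.location = t2.location AND t1.invoice = t2.invoice
        GROUP BY t1.location, t1.invoice
        ORDER BY t1.location, t1.invoice
    """).fetchall()


def preaggregated_totals(conn):
    """Aggregate each table to one row per order, then join the results."""
    return conn.execute("""
        WITH sums1 AS (SELECT location, invoice, SUM(payments) AS total1
                       FROM tbl1 GROUP BY location, invoice),
             sums2 AS (SELECT location, invoice, SUM(payments) AS total2
                       FROM tbl2 GROUP BY location, invoice)
        SELECT s1.location, s1.invoice, s1.total1, s2.total2
        FROM sums1 s1
        LEFT JOIN sums2 s2
               ON s1.location = s2.location AND s1.invoice = s2.invoice
        ORDER BY s1.location, s1.invoice
    """).fetchall()
```

For order (a, 1) the naive version pairs 5 rows with 3 rows, so both sums come out as 15 instead of 5 and 3. The pre-aggregated version gives the same totals as the results shown above, with NULL for (b, 2), which has no rows in tbl2.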

Selecting rows with duplicate values grouped

I have two tables that look something like this
Table Dog:
PK, color
1, red
2, yellow
3, red
4, red
5, yellow
The dogs have toys.
Table toys
PK, FK, name
1, 2, bowser
2, 2, oscar
3, 3, greg
4, 4, alp
5, 4, hanson
6, 5, omar
7, 5, herm
I need a query that selects the count of all yellow dogs that have more than one toy.
I was thinking something like:
Select count(*)
from toys t, dogs d
where t.fk = d.pk
and d.color = 'yellow'
group by t.fk
having count(t.fk) > 1;
It should return 2, but it comes back with multiple rows.
select count(*)
from (
select FK
from Toys t
inner join Dogs d on t.FK = d.PK
where d."color" = 'yellow'
group by FK
having count(*) > 1
) multi_toy_dogs
SQL Fiddle Example
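The difference between the two queries is easy to reproduce. Below is a minimal sketch with Python's sqlite3 (my choice of engine, the question does not name one): the grouped query yields one count per qualifying dog, and wrapping it in an outer COUNT(*) collapses that to a single number.

```python
import sqlite3


def open_sample_db():
    """Sample dogs and toys from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dogs (pk INTEGER PRIMARY KEY, color TEXT);
        CREATE TABLE toys (pk INTEGER PRIMARY KEY, fk INTEGER, name TEXT);
        INSERT INTO dogs VALUES
            (1, 'red'), (2, 'yellow'), (3, 'red'), (4, 'red'), (5, 'yellow');
        INSERT INTO toys VALUES
            (1, 2, 'bowser'), (2, 2, 'oscar'), (3, 3, 'greg'), (4, 4, 'alp'),
            (5, 4, 'hanson'), (6, 5, 'omar'), (7, 5, 'herm');
    """)
    return conn


def toy_counts_per_yellow_dog(conn):
    """The question's query: one row (the toy count) per qualifying dog."""
    return conn.execute("""
        SELECT COUNT(*)
        FROM toys t
        INNER JOIN dogs d ON t.fk = d.pk
        WHERE d.color = 'yellow'
        GROUP BY t.fk
        HAVING COUNT(t.fk) > 1
        ORDER BY t.fk
    """).fetchall()


def count_yellow_dogs_with_many_toys(conn):
    """Count the groups themselves by wrapping the grouped query."""
    (count,) = conn.execute("""
        SELECT COUNT(*)
        FROM (SELECT t.fk
              FROM toys t
              INNER JOIN dogs d ON t.fk = d.pk
              WHERE d.color = 'yellow'
              GROUP BY t.fk
              HAVING COUNT(*) > 1) AS multi_toy_dogs
    """).fetchone()
    return count
```

On the sample data the first function returns two rows (one per yellow dog, each with a toy count of 2), while the second returns the single value 2.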

How to filter rows on a complex filter

I have these rows in a table
ID Name Price Delivery
== ==== ===== ========
1 apple 1 1
2 apple 3 2
3 apple 6 3
4 apple 9 4
5 orange 4 6
6 orange 5 7
I want to have the price at the third delivery (Delivery=3) or the last price if there's no third delivery.
It would give me this :
ID Name Price Delivery
== ==== ===== ========
3 apple 6 3
6 orange 5 7
I don't necessarily want a full solution, but an idea of what to look for would be greatly appreciated.
SQL> create table t (id,name,price,delivery)
as
select 1, 'apple', 1, 1 from dual union all
select 2, 'apple', 3, 2 from dual union all
select 3, 'apple', 6, 3 from dual union all
select 4, 'apple', 9, 4 from dual union all
select 5, 'orange', 4, 6 from dual union all
select 6, 'orange', 5, 7 from dual
/
Table created.
SQL> select max(id) keep (dense_rank last order by nullif(delivery,3) nulls last) id
, name
, max(price) keep (dense_rank last order by nullif(delivery,3) nulls last) price
, max(delivery) keep (dense_rank last order by nullif(delivery,3) nulls last) delivery
from t
group by name
/
ID NAME PRICE DELIVERY
---------- ------ ---------- ----------
3 apple 6 3
6 orange 5 7
2 rows selected.
EDIT: Since you want "an idea of what to look for", here is a description of why I think this solution is the best, besides being the query with the fewest lines. Your expected result set indicates that you want to group your data by fruit name ("group by name"). And from each group you want to keep the values of the record with delivery = 3 or, when that number doesn't exist, the last one ("keep (dense_rank last order by nullif(delivery,3) nulls last)"). In my opinion, the query above reads just like that. And it uses only one table access to get the result, although my query is not unique in that.
Regards,
Rob.
Use ROW_NUMBER twice: once to filter out the rows that come after the third delivery, and a second time to find the last remaining row (i.e. a typical max-per-group query).
I've implemented this using CTEs. I tested it in SQL Server but I believe that Oracle supports the same syntax.
WITH T1 AS (
SELECT
ID, Name, Price, Delivery,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Delivery) AS rn
FROM Table1
), T2 AS (
SELECT
t1.*,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Delivery DESC) AS rn2
FROM T1
WHERE rn <= 3
)
SELECT ID, Name, Price, Delivery
FROM T2
WHERE rn2 = 1
Result:
ID Name Price Delivery
3 apple 6 3
6 orange 5 7
select t3.ID, t3.Name, t3.Price, t3.Delivery
from (
select Name, max(Delivery) as MaxDelivery
from MyTable
group by Name
) t1
left outer join MyTable t2 on t1.Name = t2.Name and t2.Delivery = 3
inner join MyTable t3 on t1.Name = t3.name
and t3.Delivery = coalesce(t2.Delivery, t1.MaxDelivery)
Mark's and APC's answers work if you meant the third delivery, regardless of the Delivery number. Here's a solution using analytic functions that specifically searches for a record with Delivery = 3.
CREATE TABLE FRUITS (
ID NUMBER,
Name VARCHAR2(10),
Price INTEGER,
Delivery INTEGER);
INSERT INTO FRUITS VALUES (1, 'apple', 1, 1);
INSERT INTO FRUITS VALUES (2, 'apple', 3, 2);
INSERT INTO FRUITS VALUES (3, 'apple', 6, 3);
INSERT INTO FRUITS VALUES (4, 'apple', 9, 4);
INSERT INTO FRUITS VALUES (5, 'orange', 4, 6);
INSERT INTO FRUITS VALUES (6, 'orange', 5, 7);
INSERT INTO FRUITS VALUES (7, 'pear', 2, 5);
INSERT INTO FRUITS VALUES (8, 'pear', 4, 6);
INSERT INTO FRUITS VALUES (9, 'pear', 6, 7);
INSERT INTO FRUITS VALUES (10, 'pear', 8, 8);
SELECT ID,
Name,
Price,
Delivery
FROM (SELECT ID,
Name,
Price,
Delivery,
SUM(CASE WHEN Delivery = 3 THEN 1 ELSE 0 END)
OVER (PARTITION BY Name) AS ThreeCount,
ROW_NUMBER()
OVER (PARTITION BY Name ORDER BY Delivery DESC) AS rn
FROM FRUITS)
WHERE (ThreeCount <> 0 AND Delivery = 3) OR
(ThreeCount = 0 AND rn = 1)
ORDER BY ID;
DROP TABLE FRUITS;
And the results from Oracle XE 10g:
ID Name Price Delivery
---- ---------- ------- ----------
3 apple 6 3
6 orange 5 7
10 pear 8 8
I included a third fruit in the sample data to illustrate the effect of different interpretations of the question. The other solutions would pick ID=9 for the pear.
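For comparison, the "Delivery = 3, otherwise the last delivery" rule can also be written with a single ROW_NUMBER() by sorting on the match first. This is a sketch of a different formulation, not any of the answers above; it assumes Python's sqlite3 with SQLite 3.25 or later (needed for window functions) and relies on SQLite evaluating `delivery = 3` to 0 or 1. In Oracle you would spell that sort key as a CASE expression.

```python
import sqlite3


def open_sample_db():
    """The ten sample rows from the answer above, including the pears."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE fruits (id INTEGER PRIMARY KEY, name TEXT,
                             price INTEGER, delivery INTEGER);
        INSERT INTO fruits VALUES
            (1, 'apple', 1, 1), (2, 'apple', 3, 2), (3, 'apple', 6, 3),
            (4, 'apple', 9, 4), (5, 'orange', 4, 6), (6, 'orange', 5, 7),
            (7, 'pear', 2, 5), (8, 'pear', 4, 6), (9, 'pear', 6, 7),
            (10, 'pear', 8, 8);
    """)
    return conn


def price_at_delivery_3_or_last(conn):
    """Per fruit: the row with delivery = 3 if present, else the latest row."""
    sql = """
        SELECT id, name, price, delivery
        FROM (SELECT id, name, price, delivery,
                     ROW_NUMBER() OVER (
                         PARTITION BY name
                         -- rows with delivery = 3 sort first, then newest first
                         ORDER BY (delivery = 3) DESC, delivery DESC
                     ) AS rn
              FROM fruits) AS ranked
        WHERE rn = 1
        ORDER BY id
    """
    return conn.execute(sql).fetchall()
```

On the ten sample rows this picks IDs 3, 6 and 10, the same rows as the result listed above.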