Finding everything not in SQL query result

I have the following query:
SELECT first_name, age
FROM taba
WHERE NOT EXISTS
(
SELECT p.first_name, MAX(p.age) as age FROM taba p
GROUP BY p.first_name
);
The inner sub-query finds the largest age for a given name. I want to basically find every row that isn't in the inner subquery result. What's the best way to achieve that? This query gives me the empty set and I'm not sure why.
Max Age By First Name
All Data
I want everything in all data that isn't in the max age by first name query.

Using a correlated sub-query...
SELECT
*
FROM
taba T
WHERE
age < (
SELECT MAX(age)
FROM taba P
WHERE T.first_name = P.first_name
)
Using a sub-query and a join...
SELECT
t.*
FROM
taba t
INNER JOIN
(
SELECT first_name, MAX(age) AS max_age
FROM taba
GROUP BY first_name
)
AS age
ON age.first_name = t.first_name
AND age.max_age > t.age
Using EXCEPT...
SELECT first_name, age FROM taba
EXCEPT
SELECT first_name, MAX(age) FROM taba GROUP BY first_name
Using EXISTS() (with a correlated sub-query)...
SELECT
*
FROM
taba T
WHERE
EXISTS (
SELECT *
FROM taba P
WHERE P.first_name = T.first_name
AND P.age > T.age
)
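These approaches are not quite interchangeable when the table contains duplicate rows: EXCEPT is a set operation and returns distinct rows, while the correlated sub-query keeps every matching row. Here is a minimal sketch of that difference, using Python's built-in sqlite3 module and made-up sample rows (the data and the in-memory database are assumptions for illustration, not part of the question).

```python
import sqlite3

# In-memory database with hypothetical sample data; ("Ann", 25) appears twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE taba (first_name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO taba VALUES (?, ?)",
    [("Ann", 30), ("Ann", 25), ("Ann", 25), ("Bob", 40), ("Bob", 35)],
)

# Correlated sub-query: keeps every row whose age is below the max for its name.
correlated = conn.execute(
    """
    SELECT first_name, age
    FROM taba T
    WHERE age < (SELECT MAX(age) FROM taba P WHERE P.first_name = T.first_name)
    ORDER BY first_name, age
    """
).fetchall()

# EXCEPT: set difference, so duplicate rows collapse into one.
except_rows = conn.execute(
    """
    SELECT first_name, age FROM taba
    EXCEPT
    SELECT first_name, MAX(age) FROM taba GROUP BY first_name
    ORDER BY 1, 2
    """
).fetchall()

print(correlated)   # duplicates preserved
print(except_rows)  # duplicates removed
```

If duplicates matter for your result, prefer the correlated sub-query, the join, or the EXISTS form.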

You need to refer to the outer table in the inner query to make this query work.
SELECT first_name, age
FROM taba T
WHERE NOT EXISTS(SELECT NULL
FROM taba P
WHERE T.first_name = P.first_name
AND T.age = (SELECT MAX(age)
FROM taba P2
WHERE P.first_name = P2.first_name)
GROUP BY p.first_name
);
You may also try the shorter version below:
SELECT first_name, age
FROM (SELECT first_name, age, RANK() OVER(PARTITION BY first_name ORDER BY age DESC) RNK
FROM taba)
WHERE RNK <> 1;
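To see why the query in the question returns the empty set: its sub-query never mentions the outer row, so it produces the same non-empty result for every row, and NOT EXISTS is therefore false every time. The sketch below reproduces this with Python's sqlite3 module and made-up rows (the sample data and the alias `m` are assumptions), then shows a correlated variant that returns the expected rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE taba (first_name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO taba VALUES (?, ?)",
    [("Ann", 30), ("Ann", 25), ("Bob", 40), ("Bob", 35)],
)

# The sub-query does not reference the outer row, so it always returns rows
# and NOT EXISTS is false for every outer row.
uncorrelated = conn.execute(
    """
    SELECT first_name, age
    FROM taba
    WHERE NOT EXISTS (
        SELECT p.first_name, MAX(p.age) AS age FROM taba p GROUP BY p.first_name
    )
    """
).fetchall()

# Correlated variant: the sub-query is evaluated against each outer row T.
correlated = conn.execute(
    """
    SELECT first_name, age
    FROM taba T
    WHERE NOT EXISTS (
        SELECT 1
        FROM (SELECT first_name, MAX(age) AS max_age
              FROM taba GROUP BY first_name) m
        WHERE m.first_name = T.first_name
          AND m.max_age = T.age
    )
    ORDER BY first_name, age
    """
).fetchall()

print(uncorrelated)
print(correlated)
```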

There is one other solution to at least be aware of.
This avoids the correlated subquery and does not require a join.
A common solution:
SELECT *
FROM taba
WHERE (first_name, age) NOT IN (
SELECT first_name, MAX(age)
FROM taba
GROUP BY first_name
)
;

One version can be:
SELECT b.first_name, b.age
FROM taba b
WHERE b.age < (SELECT Max(p.age)
FROM taba p
WHERE p.first_name = b.first_name)


SQL: how to limit a join on the first found row?

How to make a join between two tables but limiting to the first row that meets the join condition ?
In this simple example, I would like to get for every row in table_A the first row from table_B that satisfies the condition :
select table_A.id, table_A.name, table_B.city
from table_A join table_B
on table_A.id = table_B.id2
where ..
table_A (id, name)
1, John
2, Marc
table_B (id2, city)
1, New York
1, Toronto
2, Boston
The output would be:
1, John, New York
2, Marc, Boston
Maybe Oracle provides such a function (performance is a concern).
The key word here is FIRST. You can use analytic function FIRST_VALUE or aggregate construct FIRST.
For FIRST or LAST, performance is never worse and frequently better than the equivalent FIRST_VALUE or LAST_VALUE construct, because there is no superfluous window sort and, as a consequence, the execution cost is lower:
select table_A.id, table_A.name, firstFromB.city
from table_A
join (
select table_B.id2, max(table_B.city) keep (dense_rank first order by table_B.city) city
from table_b
group by table_B.id2
) firstFromB on firstFromB.id2 = table_A.id
where 1=1 /* some conditions here */
;
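Because this query both orders by and returns city, `max(city) keep (dense_rank first order by city)` picks the smallest city per id2, which is what a plain MIN(city) gives; KEEP becomes more useful when you order by one column and return another. A rough, portable sketch of the same join shape, run through Python's sqlite3 module with the question's sample rows (the SQLite setup is an assumption, and SQLite has no KEEP clause, so MIN stands in for it):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_A (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE table_B (id2 INTEGER, city TEXT)")
conn.executemany("INSERT INTO table_A VALUES (?, ?)", [(1, "John"), (2, "Marc")])
conn.executemany(
    "INSERT INTO table_B VALUES (?, ?)",
    [(1, "New York"), (1, "Toronto"), (2, "Boston")],
)

# Reduce table_B to one row per id2 first, then join; each table_A row
# can match at most one row of the derived table.
first_city_rows = conn.execute(
    """
    SELECT a.id, a.name, b.city
    FROM table_A a
    JOIN (SELECT id2, MIN(city) AS city
          FROM table_B
          GROUP BY id2) b
      ON b.id2 = a.id
    ORDER BY a.id
    """
).fetchall()

print(first_city_rows)
```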
Oracle 12c introduced the LATERAL operator, as well as CROSS/OUTER APPLY joins, which make it possible to use a correlated subquery on the right side of a JOIN clause:
select table_A.id, table_A.name, firstFromB.city
from table_A
cross apply (
select max(table_B.city) keep (dense_rank first order by table_B.city) city
from table_b
where table_B.id2 = table_A.id
) firstFromB
where 1=1 /* some conditions here */
;
If you want just a single value, a scalar subquery can be used:
SELECT
id, name, (SELECT city FROM table_B WHERE id2 = table_A.id AND ROWNUM = 1) city
FROM
table_A
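Two things are worth knowing about the scalar subquery form: without an ORDER BY, which row counts as "first" is arbitrary, and rows of table_A with no match still appear (with a NULL city), as in an outer join. A small sketch using Python's sqlite3 module, where `ORDER BY ... LIMIT 1` stands in for Oracle's `ROWNUM = 1` and the extra row (3, 'Zoe') is made up to show the no-match case:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_A (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE table_B (id2 INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO table_A VALUES (?, ?)",
    [(1, "John"), (2, "Marc"), (3, "Zoe")],  # Zoe has no rows in table_B
)
conn.executemany(
    "INSERT INTO table_B VALUES (?, ?)",
    [(1, "New York"), (1, "Toronto"), (2, "Boston")],
)

# The scalar subquery returns at most one value per outer row;
# ORDER BY makes the choice of that value deterministic.
scalar_rows = conn.execute(
    """
    SELECT a.id,
           a.name,
           (SELECT b.city
            FROM table_B b
            WHERE b.id2 = a.id
            ORDER BY b.city
            LIMIT 1) AS city
    FROM table_A a
    ORDER BY a.id
    """
).fetchall()

print(scalar_rows)
```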
Query:
SELECT a.id,
a.name,
b.city
FROM table_A a
INNER JOIN
( SELECT id2,
city
FROM (
SELECT id2,
city,
ROW_NUMBER() OVER ( PARTITION BY id2 ORDER BY NULL ) rn
FROM Table_B
)
WHERE rn = 1
) b
ON ( a.id = b.id2 )
--WHERE ...
Outputs:
ID NAME CITY
---------- ---- --------
1 John New York
2 Marc Boston
select table_A.id, table_A.name,
FIRST_VALUE(table_B.city) IGNORE NULLS
OVER (PARTITION BY table_B.id2 ORDER BY table_B.city) AS "city"
from table_A join table_B
on table_A.id = table_B.id2
where ..
Oracle 12c finally has the new cross/outer apply operator, which allows what you asked for without any workaround.
The following example looks in the dictionary views for just one of the (probably) many objects owned by each user whose name starts with 'SYS':
select *
from (
select USERNAME
from ALL_USERS
where USERNAME like 'SYS%'
) U
cross apply (
select OBJECT_NAME
from ALL_OBJECTS O
where O.OWNER = U.USERNAME
and ROWNUM = 1
)
On Oracle 11g and earlier versions you are limited to workarounds, which generally full-scan the second table on its IDs to get the same results. For testing purposes, however, you can enable the lateral operator (also available on 12c without enabling anything) and use this alternative:
-- Enables some new features
alter session set events '22829 trace name context forever';
select *
from (
select USERNAME
from ALL_USERS
where USERNAME like 'SYS%'
) U,
lateral (
select OBJECT_NAME
from ALL_OBJECTS O
where O.OWNER = U.USERNAME
and ROWNUM = 1
);
This solution uses the whole table, like in a regular join, but limits to the first row. I am posting this because for me the other solutions were not sufficient because they use one field only, or they have performance issues with large tables. I am no expert at Oracle so if someone can improve this please do so, I will be happy to use your version.
select *
from tableA A
cross apply (
select *
from (
select B.*,
ROW_NUMBER() OVER (
-- replace this by your own partition/order statement
partition by B.ITEM_ID order by B.DELIVERYDATE desc
) as ROW_NUM
from tableB B
where
A.ITEM_ID=B.ITEM_ID
)
where ROW_NUM=1
) B
I use the partition to separate the id2 values and then just take the rows with r_num = 1.
SELECT A.ID, A.NAME, B.CITY
FROM TABLE_A A,
(SELECT ID2, CITY,
ROW_NUMBER() OVER (PARTITION BY ID2 ORDER BY ID2) AS R_NUM
FROM TABLE_B) B
WHERE A.ID = B.ID2
AND R_NUM = 1;
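Note that ordering the window by the partition key (or by NULL) leaves the choice of the "first" row up to the database. If a specific row is wanted, the ORDER BY inside OVER() has to say so. A sketch of the ROW_NUMBER() pattern with an explicit ordering, run through Python's sqlite3 module (this assumes the bundled SQLite is 3.25 or newer, which is when window functions were added; ordering by city is an arbitrary choice for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_A (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE table_B (id2 INTEGER, city TEXT)")
conn.executemany("INSERT INTO table_A VALUES (?, ?)", [(1, "John"), (2, "Marc")])
conn.executemany(
    "INSERT INTO table_B VALUES (?, ?)",
    [(1, "Toronto"), (1, "New York"), (2, "Boston")],
)

# Number the rows within each id2 in a well-defined order, keep only row 1.
ranked_rows = conn.execute(
    """
    SELECT a.id, a.name, b.city
    FROM table_A a
    JOIN (SELECT id2,
                 city,
                 ROW_NUMBER() OVER (PARTITION BY id2 ORDER BY city) AS r_num
          FROM table_B) b
      ON b.id2 = a.id
    WHERE b.r_num = 1
    ORDER BY a.id
    """
).fetchall()

print(ranked_rows)
```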

Subqueries with different universes

I have an Oracle DB and I need to run a select with sub selects, however, none of them share the same table universe, therefore, I would need to do something like this:
SELECT (
SELECT COUNT(*)
FROM user_table
) AS tot_user,
(
SELECT COUNT(*)
FROM cat_table
) AS tot_cat,
(
SELECT COUNT(*)
FROM course_table
) AS tot_course
I know this is possible at other databases but I need something like this for Oracle.
Can someone help?
To make this work in Oracle, add FROM dual to the end:
SELECT (SELECT COUNT(*)
FROM user_table
) AS tot_user,
(SELECT COUNT(*)
FROM cat_table
) AS tot_cat,
(SELECT COUNT(*)
FROM course_table
) AS tot_course
FROM dual;
A database-independent way of writing the query is:
select tot_user, tot_cat, tot_course
from (SELECT COUNT(*) as tot_user
FROM user_table
) u cross join
(SELECT COUNT(*) as tot_cat
FROM cat_table
) c cross join
(SELECT COUNT(*) as tot_course
FROM course_table
) ct;
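The cross join version is safe even when a table is empty, because COUNT(*) without GROUP BY always returns exactly one row, so each derived table contributes one row to the product. A quick check of both forms using Python's sqlite3 module (SQLite accepts a SELECT without a FROM clause, so no dual is needed there; the table contents are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_table (id INTEGER)")
conn.execute("CREATE TABLE cat_table (id INTEGER)")
conn.execute("CREATE TABLE course_table (id INTEGER)")
conn.executemany("INSERT INTO user_table VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO cat_table VALUES (?)", [(1,), (2,)])
# course_table is left empty on purpose.

# Scalar subqueries in the select list.
scalar_counts = conn.execute(
    """
    SELECT (SELECT COUNT(*) FROM user_table)   AS tot_user,
           (SELECT COUNT(*) FROM cat_table)    AS tot_cat,
           (SELECT COUNT(*) FROM course_table) AS tot_course
    """
).fetchone()

# Cross join of three one-row derived tables.
cross_counts = conn.execute(
    """
    SELECT tot_user, tot_cat, tot_course
    FROM (SELECT COUNT(*) AS tot_user FROM user_table) u
    CROSS JOIN (SELECT COUNT(*) AS tot_cat FROM cat_table) c
    CROSS JOIN (SELECT COUNT(*) AS tot_course FROM course_table) ct
    """
).fetchone()

print(scalar_counts, cross_counts)
```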

How to compare two rows in SQL Server

I'm used to MySQL, where you can do this with no problems. I would like to run the following statement in SQL Server, but it doesn't recognize the column C_COUNT.
SELECT
A.customers AS CUSTOMERS,
(SELECT COUNT(ID) FROM Partners_customers B WHERE A.ID = B.PIID) AS C_COUNT
FROM Partners A
WHERE CUSTOMERS <> [C_COUNT]
Is it also possible to use arithmetic on these columns in the SELECT list, like
SELECT (CUSTOMERS - C_COUNT) AS DIFFERENCE
SQL Server does not allow you to reference a SELECT-list alias in the WHERE clause of the same query. You'll have to wrap it in a derived table, something like this:
SELECT *, Customers - C_COUNT "Difference"
FROM (
SELECT
A.customers AS CUSTOMERS,
(SELECT COUNT(ID)
FROM Partners_customers B WHERE A.ID = B.PIID)
AS C_COUNT FROM Partners A
) t
WHERE CUSTOMERS <> [C_COUNT]
Or, better yet, eliminating an inline count:
select A.customers, count(b.id)
FROM Partners A
LEFT JOIN Partners_customers B ON A.ID = B.PIID
Group By A.ID, A.customers
having a.customers <> count(b.id)
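As a sanity check, the derived-table form and the LEFT JOIN / HAVING form should return the same rows. The sketch below runs both through Python's sqlite3 module on made-up data (the sample rows are assumptions; also note that SQLite is more lenient than SQL Server about aliases and GROUP BY, so this only checks that the two shapes agree, not SQL Server's syntax rules).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Partners (id INTEGER, customers INTEGER)")
conn.execute("CREATE TABLE Partners_customers (id INTEGER, piid INTEGER)")
conn.executemany("INSERT INTO Partners VALUES (?, ?)", [(1, 2), (2, 1), (3, 4)])
conn.executemany(
    "INSERT INTO Partners_customers VALUES (?, ?)",
    [(10, 1), (11, 1), (12, 2), (13, 2)],  # partner 3 has no customer rows
)

# Derived table: the alias c_count exists by the time the outer WHERE runs.
derived_rows = conn.execute(
    """
    SELECT id, customers, c_count, customers - c_count AS difference
    FROM (SELECT A.id,
                 A.customers,
                 (SELECT COUNT(B.id)
                  FROM Partners_customers B
                  WHERE B.piid = A.id) AS c_count
          FROM Partners A) t
    WHERE customers <> c_count
    ORDER BY id
    """
).fetchall()

# LEFT JOIN + GROUP BY: the filter on the aggregate goes in HAVING.
having_rows = conn.execute(
    """
    SELECT A.id, A.customers, COUNT(B.id) AS c_count,
           A.customers - COUNT(B.id) AS difference
    FROM Partners A
    LEFT JOIN Partners_customers B ON B.piid = A.id
    GROUP BY A.id, A.customers
    HAVING A.customers <> COUNT(B.id)
    ORDER BY A.id
    """
).fetchall()

print(derived_rows)
print(having_rows)
```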
WITH A AS
(
SELECT
A.customers AS CUSTOMERS,
(SELECT COUNT(ID) FROM Partners_customers B WHERE A.ID = B.PIID) AS C_COUNT
FROM Partners A
)
SELECT
*,
(CUSTOMERS - C_COUNT) AS DIFFERENCE
FROM A
-- the alias C_COUNT is only visible outside the CTE, so filter here
WHERE CUSTOMERS <> C_COUNT
Completely untested....
(select * from TabA
except -- SQL Server uses EXCEPT; MINUS is the Oracle spelling
select * from TabB) -- Rows in TabA not in TabB
union all
(
select * from TabB
except
select * from TabA
) -- rows in TabB not in TabA
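Since that query is untested, here is a small sketch of the same symmetric-difference idea that can actually be run, using Python's sqlite3 module. The table contents are made up, and because SQLite does not accept parenthesized compound selects, each half is wrapped in a derived table instead. A row that differs in only one column shows up twice, once from each side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TabA (id INTEGER, val TEXT)")
conn.execute("CREATE TABLE TabB (id INTEGER, val TEXT)")
conn.executemany("INSERT INTO TabA VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
conn.executemany("INSERT INTO TabB VALUES (?, ?)", [(2, "b"), (3, "x"), (4, "d")])

# Rows only in TabA, followed by rows only in TabB.
diff_rows = conn.execute(
    """
    SELECT * FROM (SELECT * FROM TabA EXCEPT SELECT * FROM TabB)
    UNION ALL
    SELECT * FROM (SELECT * FROM TabB EXCEPT SELECT * FROM TabA)
    """
).fetchall()

# Sort in Python so the printed order does not depend on the engine.
diff_rows = sorted(diff_rows)
print(diff_rows)
```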

SQL query to merge 2 tables with additional conditions?

I have 2 identical tables: user_id, name, age, date_added.
USER_ID column may contain multiple duplicate IDs.
Need to merge those 2 tables into 1 with the following condition.
If there are multiple records with identical 'name' for the same user then need to keep only the LATEST (by date_added) record.
This script will be used with MSSQL 2005, but would also appreciate if somebody comes up with version that does not use ROW_NUMBER(). Need this script to reload a broken table once, performance is not critical.
example:
table1:
1,'john',21,01/01/2010
1,'john',15,01/01/2005
1,'john',71,01/01/2001
table2:
1,'john',81,01/01/2007
1,'john',15,01/01/2005
1,'john',11,01/01/2008
result:
1,'john',21,01/01/2010
UPDATE:
I think that I've found my own solution. It is based on an answer for my previous question given by Larry Lustig and Joe Stefanelli.
with tmp2 as
(
SELECT * FROM table1
UNION
SELECT * FROM table2
)
SELECT * FROM tmp2 c1
WHERE (SELECT COUNT(*) FROM tmp2 c2
WHERE c2.user_id = c1.user_id AND
c2.name = c1.name AND
c2.date_added >= c1.date_added) <= 1
Could you please help me to convert this query to the one without 'WITH' clause?
Here's a variant of @Andomar's answer:
; with all_users as
(
select *
from table1 u1
union all
select *
from table2 u2
)
, ranker as (
select *,
-- latest record first within each user_id/name pair
rank() over (partition by user_id, name order by date_added desc) as [r]
from all_users
)
select * from ranker where [r] = 1
Just in the interests of giving a different approach...
WITH distinctlist
As (SELECT user_id,
name
FROM table1
UNION
SELECT user_id,
name
FROM table2)
SELECT C.*
FROM distinctlist d
CROSS APPLY (SELECT TOP 1 *
FROM (SELECT TOP 1 *
FROM table1
WHERE user_id = d.user_id
AND name = d.name
ORDER BY date_added DESC
UNION ALL
SELECT TOP 1 *
FROM table2
WHERE user_id = d.user_id
AND name = d.name
ORDER BY date_added DESC) T
ORDER BY date_added DESC) C
You could use not exists, like:
; with all_users as
(
select *
from table1 u1
union all
select *
from table2 u2
)
select *
from all_users u1
where not exists
(
select *
from all_users u2
where u1.user_id = u2.user_id
and u1.name = u2.name
and u1.date_added < u2.date_added
)
If the database doesn't support CTEs, expand all_users in the two places it is used.
P.S. If there are only three columns, and no more, you could use an even simpler solution:
select name
, MAX(date_added)
from (
select *
from table1 u1
union all
select *
from table2 u2
) sub
group by
name
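For the follow-up about dropping the WITH clause, expanding the CTE means writing the union as a derived table in both places it is referenced. A sketch of that, run through Python's sqlite3 module on the question's sample rows (the dates are stored as ISO strings so they compare correctly as text, which is an assumption of this setup):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("table1", "table2"):
    conn.execute(
        f"CREATE TABLE {table} (user_id INTEGER, name TEXT, age INTEGER, date_added TEXT)"
    )
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [(1, "john", 21, "2010-01-01"), (1, "john", 15, "2005-01-01"), (1, "john", 71, "2001-01-01")],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?, ?, ?)",
    [(1, "john", 81, "2007-01-01"), (1, "john", 15, "2005-01-01"), (1, "john", 11, "2008-01-01")],
)

# No WITH clause: the union is repeated as a derived table.
# A row survives only if no later row exists for the same user_id and name.
merged_rows = conn.execute(
    """
    SELECT u1.user_id, u1.name, u1.age, u1.date_added
    FROM (SELECT * FROM table1 UNION SELECT * FROM table2) u1
    WHERE NOT EXISTS (
        SELECT 1
        FROM (SELECT * FROM table1 UNION SELECT * FROM table2) u2
        WHERE u2.user_id = u1.user_id
          AND u2.name = u1.name
          AND u2.date_added > u1.date_added
    )
    """
).fetchall()

print(merged_rows)
```

If two different rows tie on the latest date_added, both are returned, so a tie-breaker would be needed in that case.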

SQL query get distinct records

I need help with a query to get distinct records from the table.
SELECT distinct cID, firstname, lastname,
typeId, email from tableA
typeId and email have different values in the table. I know this is why the query returns 2 records.
Is there any way I can get 1 record for each cID irrespective of typeId and email?
If you don't care about what typeId and email get selected with each cID, following is one way to do it.
SELECT DISTINCT a.cID
, a.firstname
, a.lastname
, b.typeId
, b.email
FROM TableA a
INNER JOIN (
SELECT cID, MIN(typeID) AS typeId, MIN(email) AS email
FROM TableA
GROUP BY
cID
) b ON b.cID = a.cID
If any one value for typeId and email is acceptable, then
SELECT cID, firstname, lastname,
max(typeId), max(email)
from tableA
group by cID, firstname, lastname
should do it.
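One caveat with aggregating each column separately: max(typeId) and max(email) are computed independently, so the pair that comes back may not exist together in any single row. A small sketch with Python's sqlite3 module and made-up rows (all names and addresses here are hypothetical) shows this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tableA (cID INTEGER, firstname TEXT, lastname TEXT, typeId INTEGER, email TEXT)"
)
source_rows = [
    (1, "Ann", "Lee", 1, "z@example.com"),
    (1, "Ann", "Lee", 2, "a@example.com"),
    (2, "Bob", "Ray", 1, "b@example.com"),
]
conn.executemany("INSERT INTO tableA VALUES (?, ?, ?, ?, ?)", source_rows)

# One row per cID; each aggregate picks its own maximum.
grouped_rows = conn.execute(
    """
    SELECT cID, firstname, lastname, MAX(typeId), MAX(email)
    FROM tableA
    GROUP BY cID, firstname, lastname
    ORDER BY cID
    """
).fetchall()

print(grouped_rows)
# For cID 1 the result pairs typeId 2 with z@example.com,
# a combination that is not present in the source rows.
```

If typeId and email must come from the same row, a ROW_NUMBER() or NOT EXISTS style "pick one row per group" query is the safer shape.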
Is this what you are after?
SELECT distinct a.cID, a.firstname,
a.lastname, (SELECT MIN(typeId) from tableA
WHERE cID = a.cID), (Select MIN(email) from
tableA WHERE cID = a.cID) from tableA
AS a