I am trying to figure out how the minus/except operator works.
Somehow I cannot find anything useful on the web. The "minus" operator is used to return all rows in the first statement that are not part of the second statement. But how exactly does it manage to do this?
Can someone please explain how this is done, step by step?
Note: The following answer is true for Oracle. See The UNION [ALL], INTERSECT, MINUS Operators
MySQL originally had only UNION and UNION ALL (INTERSECT and EXCEPT were added in MySQL 8.0.31). In older versions, the results of INTERSECT and MINUS can be obtained using IN and NOT IN. See Union, Difference, Intersection, and Division in MySQL [PDF]
In SQL Server, MINUS is called EXCEPT. See Set Operators (Transact-SQL)
I am surprised you are unable to find anything related to this on the Web. MINUS is a set operation in SQL; the others are UNION, UNION ALL and INTERSECT.
This is what they do:
Sample Data:
EMPLOYEE
ID NAME SALARY AGE
1 Alice 5000 23
2 Joe 1000 25
3 Raj 2000 28
4 Pam 1500 32
UNION:
Returns the results of SQL 1 combined with the results of SQL 2, after removing duplicates. A variation, UNION ALL, does not remove duplicates. UNION ALL performs better because it skips the internal sort-and-deduplicate step. UNION ALL is useful when the results of the two queries are mutually exclusive.
select * from employee where salary > 1000
union
select * from employee where age > 25
returns all employees that are older than 25 or have a salary > 1000 (satisfy either condition)
ID NAME SALARY AGE
1 Alice 5000 23
3 Raj 2000 28
4 Pam 1500 32
Using UNION ALL in the above case returns the records for Raj and Pam twice, because both rows satisfy both conditions and UNION ALL does not remove duplicates.
select * from employee where salary > 1000
union all
select * from employee where age > 25
ID NAME SALARY AGE
1 Alice 5000 23
3 Raj 2000 28
4 Pam 1500 32
3 Raj 2000 28
4 Pam 1500 32
INTERSECT:
Returns only common records between the result sets.
select * from employee where salary > 1000
intersect
select * from employee where age > 25
returns only those records that satisfy both conditions: Have salary > 1000 AND are over 25.
ID NAME SALARY AGE
3 Raj 2000 28
4 Pam 1500 32
MINUS:
Returns records from SQL 1 after removing results from SQL 2:
select * from employee where salary > 1000
minus
select * from employee where age > 25
returns all those employees that have a salary > 1000 after removing employees that are more than 25 years of age:
ID NAME SALARY AGE
1 Alice 5000 23
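To make the "step by step" part concrete, here is a minimal sketch that runs the four operators against the sample EMPLOYEE data. It assumes SQLite through Python's sqlite3 module (not Oracle), so MINUS is spelled EXCEPT; the helper names() is hypothetical. The last lines redo MINUS by hand: evaluate both queries, then keep the distinct rows of the first that do not appear in the second. This only describes the result; how a real engine gets there (sorting, hashing) varies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER, name TEXT, salary INTEGER, age INTEGER);
    INSERT INTO employee VALUES
        (1, 'Alice', 5000, 23),
        (2, 'Joe',   1000, 25),
        (3, 'Raj',   2000, 28),
        (4, 'Pam',   1500, 32);
""")

def names(operator):
    """Combine the two sample queries with the given set operator."""
    sql = f"""
        SELECT name FROM employee WHERE salary > 1000
        {operator}
        SELECT name FROM employee WHERE age > 25
    """
    return sorted(row[0] for row in conn.execute(sql))

union_names = names("UNION")          # duplicates removed
union_all_names = names("UNION ALL")  # duplicates kept
intersect_names = names("INTERSECT")  # rows present in both results
except_names = names("EXCEPT")        # SQLite's spelling of MINUS

# MINUS by hand: run both queries, then filter and deduplicate.
first = [r[0] for r in conn.execute(
    "SELECT name FROM employee WHERE salary > 1000")]
second = {r[0] for r in conn.execute(
    "SELECT name FROM employee WHERE age > 25")}
manual_minus = sorted({n for n in first if n not in second})

print(union_names, union_all_names, intersect_names, except_names, manual_minus)
```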
Assume:
create table U
( x int not null primary key );
insert into U (x) values (1),(2),(3);
create table V
( y int not null primary key );
insert into V (y) values (3),(4);
Now U - V = { 1, 2 }
U - V can be expressed as:
All tuples in U that do not exist in V, i.e.
select x
from U
where not exists (
select 1
from V
where V.y = U.x
);
In the same way V - U = { 4 }
Did that clarify?
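A quick way to check this is to run it. The sketch below assumes SQLite via Python's sqlite3 module, with V populated with (3),(4) as the V - U = { 4 } result implies; the variable names are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE U (x INT NOT NULL PRIMARY KEY);
    INSERT INTO U (x) VALUES (1), (2), (3);
    CREATE TABLE V (y INT NOT NULL PRIMARY KEY);
    INSERT INTO V (y) VALUES (3), (4);
""")

# U - V written with NOT EXISTS.
u_minus_v = [r[0] for r in conn.execute("""
    SELECT x FROM U
    WHERE NOT EXISTS (SELECT 1 FROM V WHERE V.y = U.x)
    ORDER BY x
""")]

# V - U, the same pattern with the roles swapped.
v_minus_u = [r[0] for r in conn.execute("""
    SELECT y FROM V
    WHERE NOT EXISTS (SELECT 1 FROM U WHERE U.x = V.y)
    ORDER BY y
""")]

# The set operator itself (EXCEPT is SQLite's name for MINUS).
u_except_v = [r[0] for r in conn.execute(
    "SELECT x FROM U EXCEPT SELECT y FROM V ORDER BY 1")]

print(u_minus_v, v_minus_u, u_except_v)
```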
this should give you a clue !
create table #t(id int)
insert into #t values(1),(2),(3),(4),(5)
create table #t2(id1 int)
insert into #t2 values(2),(5),(6)
select * from #t except select * from #t2
select id from #t left join #t2 on #t.id=#t2.id1 where #t2.id1 is null
In set theory,
say A = {A,B,C,D}, B = {B,X,Y,Z}
so A-B = {A,C,D} -- which in SQL is a left join that keeps only the rows with no match on the right
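One caveat is worth seeing in action: EXCEPT returns distinct rows, while the left join version keeps duplicates from the left table. A small sketch, assuming SQLite via Python's sqlite3 (so plain tables t and t2 stand in for the #t temp tables), and with one extra 1 inserted purely to expose the difference:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t  (id  INT);
    CREATE TABLE t2 (id1 INT);
    -- The second 1 is an assumption added to show the duplicate handling.
    INSERT INTO t  VALUES (1), (1), (2), (3), (4), (5);
    INSERT INTO t2 VALUES (2), (5), (6);
""")

# Set difference: duplicates are removed from the result.
except_ids = [r[0] for r in conn.execute(
    "SELECT id FROM t EXCEPT SELECT id1 FROM t2 ORDER BY 1")]

# Anti-join: every unmatched left row survives, duplicates included.
anti_join_ids = [r[0] for r in conn.execute("""
    SELECT t.id
    FROM t LEFT JOIN t2 ON t.id = t2.id1
    WHERE t2.id1 IS NULL
    ORDER BY t.id
""")]

print(except_ids, anti_join_ids)
```

Adding DISTINCT to the join query brings the two back in line when the left table can contain duplicates.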
My table contains 113 people.
48 of them are 20 years old. Now I am just selecting all people like
select * from persons
This gets me all persons, but the 20-year-olds are not the first 48 people.
I need the 20-year-olds to be the first 48 of the 113 results.
Something like:
20-year-olds (48 of them), after that ..... all the rest in the table
How can I query this using PostgreSQL?
EDIT: there are ages less than 20 too. After getting the 48 20-year-olds first, I don't care about the order of the remaining people (49 to 113).
Just use order by :
select *
from persons
order by age
You can use asc or desc but because default is asc you do not need to put it in your example.
select *
from persons
order by age desc
After the comment from the OP, here is the new code (I do not know why, but my first assumption was that 20 is the lowest possible value... bad assumption):
select *
from persons
order by case when age = 20 then 1 else 2 end
OR
select *
from persons
order by (age = 20) desc
If 20 is not your minimum age, you can use a CASE expression inside the ORDER BY clause, like this:
SELECT
*
FROM
persons
ORDER BY
CASE WHEN age = 20 THEN 0
ELSE 1
END ASC
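Here is a small runnable check of the CASE ordering. It assumes SQLite via Python's sqlite3 rather than PostgreSQL, and the five persons rows are invented for the illustration; the name column is used only as a tie-breaker so the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (name TEXT, age INTEGER);
    -- Hypothetical sample rows, including ages below 20.
    INSERT INTO persons VALUES
        ('Ann', 35), ('Bob', 20), ('Cid', 18), ('Dee', 20), ('Eve', 27);
""")

# Rows with age = 20 get sort key 0, everything else gets 1.
rows = conn.execute("""
    SELECT name, age
    FROM persons
    ORDER BY CASE WHEN age = 20 THEN 0 ELSE 1 END, name
""").fetchall()

ages = [age for _, age in rows]
print(rows)
```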
My query is below. I want to remove some rows from query1 when query2 has row data for the same org_id, but I don't know how to do it:
My query:
with query1 as(
select wm_concat(linkman_name) name,
wm_concat(phone_num) phone,
t.org_id
from (
select linkman_name, phone_num, LINK_ORG_ID, org_id
from TD_SM_LINKMAN
where STATE = '2'
and (LINK_ORG_ID is null or LINK_ORG_ID = '')) t
group by t.org_id) ,
query2 as(
select wm_concat(linkman_name) name,
wm_concat(phone_num) phone,
org_id
from (select linkman_name, phone_num, LINK_ORG_ID, org_id
from TD_SM_LINKMAN
where STATE = '2'
and (LINK_ORG_ID = '55')) t
group by org_id)
select *
from query1
union all
select *
from query2 minus
-- this doesn't work; I want to remove the rows of query1 where query1.org_id = query2.org_id (query2 is meant to act as the outer query here)
(select * from query1 where query1.ORG_ID = query2.ORG_ID)
;
sample table
name phone link_org_id org_id
lily 133 1
ming 144 1
hao 333 2
jane 1234 55 2
bob 666 3
herry 555 3
query 1 result:
name phone org_id
lily,ming 133,144 1
hao 333 2
bob,herry 666,555 3
query 2 result:
name phone org_id
jane 1234 2
For example, jane is selected by query2 and hao is selected by query1. Both are from the same org, with org_id = 2. I don't need hao; I only need jane. How can I do this?
I mean that if query2 finds a result for an org, then query1's result is not needed. But if query2 can't find any data, then I need query1's data.
As it stands, you would first have to split the names (and phones) back into rows, and only then apply set operators (UNION, MINUS) to that data.
This means you shouldn't use WM_CONCAT at all, at least not at the beginning, because
first you concatenate the data,
then you have to split it back into rows,
and only then UNION / MINUS the sets.
The first two steps are wasted work.
I'd suggest that you UNION / MINUS the data first and then aggregate it using WM_CONCAT. By the way, which database version do you use? WM_CONCAT is a) undocumented and b) no longer exists in recent Oracle database versions, so you'd be better off switching to LISTAGG, if possible.
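A rough sketch of the "filter rows first, aggregate last" idea, under several assumptions: it runs on SQLite via Python's sqlite3, group_concat stands in for WM_CONCAT / LISTAGG, every sample row is given STATE = '2' (the sample table does not show that column), and NOT EXISTS is used in place of MINUS to drop an org's unlinked rows when that org has a LINK_ORG_ID = '55' row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE td_sm_linkman (
        linkman_name TEXT, phone_num TEXT, link_org_id TEXT,
        org_id INTEGER, state TEXT);
    -- state = '2' for every row is an assumption.
    INSERT INTO td_sm_linkman VALUES
        ('lily',  '133',  NULL, 1, '2'),
        ('ming',  '144',  NULL, 1, '2'),
        ('hao',   '333',  NULL, 2, '2'),
        ('jane',  '1234', '55', 2, '2'),
        ('bob',   '666',  NULL, 3, '2'),
        ('herry', '555',  NULL, 3, '2');
""")

# Pick the rows first: linked rows always, unlinked rows only when the
# org has no linked row. Aggregate once, at the very end.
rows = conn.execute("""
    SELECT t.org_id,
           group_concat(t.linkman_name) AS name,
           group_concat(t.phone_num)    AS phone
    FROM td_sm_linkman t
    WHERE t.state = '2'
      AND (t.link_org_id = '55'
           OR ((t.link_org_id IS NULL OR t.link_org_id = '')
               AND NOT EXISTS (SELECT 1
                               FROM td_sm_linkman x
                               WHERE x.org_id = t.org_id
                                 AND x.state = '2'
                                 AND x.link_org_id = '55')))
    GROUP BY t.org_id
""").fetchall()

# The order inside group_concat is not guaranteed, so sort for display.
names_by_org = {org: sorted(names.split(",")) for org, names, _ in rows}
print(names_by_org)
```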
I have table like this:
NAME IDENTIFICATIONR SCORE
JOHN DB 10
JOHN IT NULL
KAL DB 9
HENRY KK 3
KAL DB 10
HENRY IP 9
ALI IG 10
ALI PA 9
With a select statement, I want my result to contain only those names whose scores are all 9 or above. So, for example, Henry cannot be selected, because although he has a score of 9 in one row, in the other he has a score of 3 (names with null values should also be omitted).
My newtable should look like this:
NAME
KAL
ALI
I'm using a sas program. THANK YOU!!
The COUNT of names will be <> COUNT of scores if there is a missing score. Requesting equality in the having clause will ensure no person with a missing score is in your result set.
proc sql;
create table want as
select distinct name from have
group by name
having count(name) = count(score) and min(score) >= 9;
quit;
Here is the solution:
select name
from table name where score >= 9
and score is not null;
Select NAME from YOUR_TABLE_NAME name where SCORE >= 9 and score is not null
You can do aggregation :
select name
from table t
group by name
having sum(case when (score < 9 or score is null) then 1 else 0 end) = 0;
If you want full rows then you can use not exists :
select t.*
from table t
where not exists (select 1
from table t1
where t1.name = t.name and (t1.score < 9 or t1.score is null)
);
You seem to be treating NULL scores as values less than 9. You can also just use coalesce() with min():
select name
from have
group by name
having min(coalesce(score, 0)) >= 9;
Note that select distinct is almost never useful with group by -- and SAS proc sql probably does not optimize it well.
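The difference between filtering rows and filtering groups is easy to see by running the variants side by side. This sketch assumes SQLite via Python's sqlite3 instead of SAS proc sql, and the column name identification is a guess at the truncated header.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE have (name TEXT, identification TEXT, score INTEGER);
    INSERT INTO have VALUES
        ('JOHN', 'DB', 10), ('JOHN', 'IT', NULL), ('KAL', 'DB', 9),
        ('HENRY', 'KK', 3), ('KAL', 'DB', 10), ('HENRY', 'IP', 9),
        ('ALI', 'IG', 10), ('ALI', 'PA', 9);
""")

def names(sql):
    return [r[0] for r in conn.execute(sql)]

# Group filter: no missing score and the lowest score is at least 9.
count_check = names("""
    SELECT name FROM have GROUP BY name
    HAVING count(*) = count(score) AND min(score) >= 9
    ORDER BY name
""")

# Group filter: treat NULL as 0 so it fails the >= 9 test.
coalesce_check = names("""
    SELECT name FROM have GROUP BY name
    HAVING min(coalesce(score, 0)) >= 9
    ORDER BY name
""")

# Row filter only: keeps any name that has at least one good row.
row_filter = names("""
    SELECT DISTINCT name FROM have
    WHERE score >= 9 AND score IS NOT NULL
    ORDER BY name
""")

print(count_check, coalesce_check, row_filter)
```

The plain WHERE version still returns HENRY and JOHN, because it never looks at a name's other rows.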
I have a table like this, where, suppose for the sake of an example, NAME is a unique identifier.
NAME AGE VALUE
Jack Under 65 3
Jack 66-74 5
John 66-74 7
John Over 75 9
Gill 25-35 11
Some NAMEs have more than one AGE, which is undesirable, as this is due to dirtiness of the data.
My aim is to update the duplicates only to have one AGE within each NAME. The desired output is thus:
NAME AGE VALUE
Jack Under 65 3
Jack Under 65 5
John 66-74 7
John 66-74 9
Gill 25-35 11
Something like this UPDATE statement should work, but it doesn't.
UPDATE table t1
SET t1.age=MAX(t1.age)
WHERE EXISTS (SELECT COUNT(t2.AGE)
FROM table t2
WHERE t1.NAME=t2.NAME
GROUP BY t2.NAME
HAVING COUNT(t2.AGE) > 1)
SQL Error: ORA-00934: group function is not allowed here
Second issue
Even if I got the above statement to work, there is a second issue. The idea there is to use the MAX (or MIN) function on strings to set the same value for all repeats within a group.
But unfortunately, this too would not quite work as desired. For consistency, ideally an age would default to the lowest age group. But because MAX/MIN compare alphabetic order on strings, this would give, e.g.:
"66-74" and "Under 65" => MAX="Under 65" -- Lowest
"66-74" and "Over 75" => MAX="Over 75" -- Highest
There are only four age groups, would it be possible to specify a custom order?
NB1: I am using Oracle SQL.
NB2: I do not mind if there is a way to achieve the result using a SELECT instead of an UPDATE statement.
Reproducible example
SELECT 'Jack' as NAME, 'Under 65' as AGE, 3 as VALUE from dual
UNION ALL
SELECT 'Jack' as NAME, '66-74' as AGE, 5 as VALUE from dual
UNION ALL
SELECT 'John' as NAME, '66-74' as AGE, 7 as VALUE from dual
UNION ALL
SELECT 'John' as NAME, 'Over 75' as AGE, 9 as VALUE from dual
UNION ALL
SELECT 'Gill' as NAME, '25-35' as AGE, 11 as VALUE from dual
You can define custom order with case when clause and then use analytic max(). This worked for given examples:
update t1 set age = (
select max(age) keep (dense_rank last
order by case when age = 'Over 75' then 1
when age = '66-74' then 2
when age = 'Under 65' then 3
when age = '25-35' then 4
end)
from t1 tx where tx.name = t1.name )
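For anyone who wants to try the custom-order idea outside Oracle, here is a sketch on SQLite via Python's sqlite3. SQLite has no KEEP (DENSE_RANK ...), so this swaps in a correlated subquery with ORDER BY ... LIMIT 1; the CASE ranking (lowest age group first) plays the same role as in the statement above. Table and column names follow the example data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (name TEXT, age TEXT, value INTEGER);
    INSERT INTO t1 VALUES
        ('Jack', 'Under 65', 3), ('Jack', '66-74', 5),
        ('John', '66-74', 7),    ('John', 'Over 75', 9),
        ('Gill', '25-35', 11);
""")

# For each row, take the lowest-ranked age group found for that name.
conn.execute("""
    UPDATE t1
    SET age = (
        SELECT tx.age
        FROM t1 AS tx
        WHERE tx.name = t1.name
        ORDER BY CASE tx.age
                     WHEN '25-35'    THEN 1
                     WHEN 'Under 65' THEN 2
                     WHEN '66-74'    THEN 3
                     WHEN 'Over 75'  THEN 4
                 END
        LIMIT 1)
""")

updated = conn.execute(
    "SELECT name, age, value FROM t1 ORDER BY value").fetchall()
print(updated)
```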
Trying to find the best way to write this SQL statement.
I have a customer table that has the internal credit score of that customer. Then I have another table with definitions of that credit score. I would like to join these tables together, but the second table doesn't have any way to link it easily.
The score of the customer is an integer between 1-999, and the definition table has these columns:
Score
Description
And these rows:
60 LOW
99 MED
999 HIGH
So basically if a customer has a score between 1 and 60 they are low, 61-99 they are med, and 100-999 they are high.
I can't really INNER JOIN these, because it would only join them IF the score was 60, 99, or 999, and that would exclude anyone with any other score.
I don't want to do a case statement with the static numbers, because our scores may change in the future and I don't want to have to update my initial query when/if they do. I also cannot create any tables or functions to do this- I need to create a SQL statement to do it for me.
EDIT:
A coworker said this would work, but it's a little crazy. I'm thinking there has to be a better way:
SELECT
internal_credit_score,
(
SELECT
credit_score_short_desc
FROM
cf_internal_credit_score
WHERE
internal_credit_score = (
SELECT
max(credit.internal_credit_score)
FROM
cf_internal_credit_score credit
WHERE
cs.internal_credit_score <= credit.internal_credit_score
AND credit.internal_credit_score <= (
SELECT
min(credit2.internal_credit_score)
FROM
cf_internal_credit_score credit2
WHERE
cs.internal_credit_score <= credit2.internal_credit_score
)
)
)
FROM
customer_statements cs
Try this: change your table to contain the range of the scores:
ScoreTable
-------------
LowScore int
HighScore int
ScoreDescription string
data values
LowScore HighScore ScoreDescription
-------- --------- ----------------
1 60 Low
61 99 Med
100 999 High
query:
Select
.... , Score.ScoreDescription
FROM YourTable
INNER JOIN ScoreTable Score ON YourTable.Score>=Score.LowScore
AND YourTable.Score<=Score.HighScore
WHERE ...
Assuming your table is named CreditTable, this is what you want:
select * from
(
select Description, Score
from CreditTable
where Score >= 80 /*client's credit*/
order by Score
)
where rownum = 1
The comparison is >= so that a score sitting exactly on a boundary (60, 99 or 999) maps to that boundary's own description, which matches the ranges given in the question.
Update
The above SQL gives you the credit record for a given value. If you want to join with, say, a Clients table, you'd do something like this:
select
c.Name,
c.Score,
(select Description from
(select Description from CreditTable where Score >= c.Score order by Score)
where rownum = 1)
from clients c
I know this is a sub-select that is executed for each returned row, but then again, CreditTable is ridiculously small and there will be no significant performance loss because of the sub-select.
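Here is a runnable take on the per-row lookup, as a sketch only: it assumes SQLite via Python's sqlite3, so LIMIT 1 stands in for Oracle's rownum = 1, the client rows are invented, and the comparison is >= so that the results follow the ranges stated in the question (1-60 low, 61-99 med, 100-999 high).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CreditTable (Score INTEGER, Description TEXT);
    INSERT INTO CreditTable VALUES (60, 'LOW'), (99, 'MED'), (999, 'HIGH');

    -- Hypothetical clients, chosen to sit on and around the boundaries.
    CREATE TABLE clients (Name TEXT, Score INTEGER);
    INSERT INTO clients VALUES
        ('Ann', 60), ('Ben', 61), ('Cat', 99),
        ('Dan', 100), ('Eli', 999), ('Fay', 1);
""")

# For each client, take the smallest boundary at or above the score.
labels = dict(conn.execute("""
    SELECT c.Name,
           (SELECT d.Description
            FROM CreditTable d
            WHERE d.Score >= c.Score
            ORDER BY d.Score
            LIMIT 1) AS Description
    FROM clients c
""").fetchall())

print(labels)
```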
You can use analytic functions to convert the data in your score description table to ranges (I assume that you meant that 100-999 should map to 'HIGH', not 99-999).
with x as (
select 60 score, 'Low' description from dual union all
select 99, 'Med' from dual union all
select 999, 'High' from dual
)
select description,
nvl(lag(score) over (order by score),0) + 1 low_range,
score high_range
from x
DESC LOW_RANGE HIGH_RANGE
---- ---------- ----------
Low 1 60
Med 61 99
High 100 999
You can then join this to your CUSTOMER table with something like
SELECT c.*,
sd.*
FROM customer c,
(select description,
nvl(lag(score) over (order by score),0) + 1 low_range,
score high_range
from score_description) sd
WHERE c.credit_score BETWEEN sd.low_range AND sd.high_range
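The same LAG-based range construction can be tried without an Oracle instance. The sketch below assumes SQLite 3.25 or later (for window functions) via Python's sqlite3, uses COALESCE in place of NVL, and invents a few customer rows; the table names follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE score_description (score INTEGER, description TEXT);
    INSERT INTO score_description VALUES
        (60, 'Low'), (99, 'Med'), (999, 'High');

    -- Hypothetical customers for the illustration.
    CREATE TABLE customer (name TEXT, credit_score INTEGER);
    INSERT INTO customer VALUES ('Ann', 60), ('Ben', 61), ('Dan', 100);
""")

# Each row's lower bound is the previous boundary plus one (or 1 for the first).
RANGES_SQL = """
    SELECT description,
           COALESCE(LAG(score) OVER (ORDER BY score), 0) + 1 AS low_range,
           score AS high_range
    FROM score_description
"""

ranges = conn.execute(RANGES_SQL + " ORDER BY high_range").fetchall()

# Join customers to the derived ranges.
matched = conn.execute(f"""
    SELECT c.name, sd.description
    FROM customer c
    JOIN ({RANGES_SQL}) sd
      ON c.credit_score BETWEEN sd.low_range AND sd.high_range
    ORDER BY c.name
""").fetchall()

print(ranges, matched)
```

Because the ranges are derived from the boundary rows at query time, changing a boundary in score_description changes the mapping without touching the query.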