To find subset of data in Oracle - sql

I have some records in emp1 :
SELECT distinct
substrb(emp.employee_NAME,1,50) employee_NAME
FROM employee emp , employee_sites sites , (SELECT DISTINCT employee_id ,
emp_site_number
FROM abc
) abc
where emp.employee_id = sites.employee_id
and abc.employee_id=emp.employee_id
and abc.emp_site_number = sites.emp_site_number ;
and some records in emp:
SELECT distinct emp.employee_NAME employee_NAME
FROM employee emp
WHERE 1=1 and EXISTS
(SELECT 1 FROM employee_ACCOUNTS acc WHERE acc.employee_id = emp.employee_id
)
rowcount of emp : 205001
rowcount of emp1 : 18003
I want to find out whether emp has all the records of emp1, in other words whether emp is a superset of emp1. I tried this:
select count(*) from (SELECT distinct emp.employee_NAME employee_NAME
FROM employee emp
WHERE 1=1 and EXISTS
(SELECT 1 FROM employee_ACCOUNTS acc WHERE acc.employee_id = emp.employee_id
) ) emp ,
(SELECT distinct
substrb(emp.employee_NAME,1,50) employee_NAME
FROM employee emp , employee_sites sites , (SELECT DISTINCT employee_id ,
emp_site_number
FROM abc
) abc
where emp.employee_id = sites.employee_id
and abc.employee_id=emp.employee_id
and abc.emp_site_number = sites.emp_site_number) emp1
where emp.employee_NAME = emp1.employee_NAME ;
Rowcount for the above query : 12360.
So I have concluded that emp is not a superset of emp1.
Could someone please let me know whether what I have done is fine or whether it needs some modification?
Also, please share a better way of doing this if you know one.
Thanks

You could avoid the correlated subqueries and just do a simple set MINUS operation:
select employee_name -- or whatever makes the employee the same in 2 tables
from emp1 -- the table which may have rows not in the other table
MINUS
select employee_name
from emp2 -- the table which you think may be missing some rows
You could also use a left join:
select emp1.employee_name from emp1
left join emp2 on emp1.employee_name = emp2.employee_name
where emp2.employee_name is null
Both queries return the rows of emp1 that have no match in emp2; if they return nothing, emp2 is a superset of emp1. Performance will depend on factors such as indexes and data volumes. Inspecting the query plans and benchmarking will give you a good idea of which is the better option.
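To make the superset check concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. SQLite is only a stand-in for Oracle here: it spells the set difference EXCEPT rather than MINUS, and the table contents are made up for illustration. The idea is that emp2 is a superset of emp1 exactly when "emp1 minus emp2" is empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp1 (employee_name TEXT);  -- candidate subset (hypothetical data)
    CREATE TABLE emp2 (employee_name TEXT);  -- candidate superset (hypothetical data)
    INSERT INTO emp1 VALUES ('ADAMS'), ('BLAKE'), ('CLARK');
    INSERT INTO emp2 VALUES ('ADAMS'), ('BLAKE'), ('DAVIS'), ('EVANS');
""")

# Names in emp1 that have no match in emp2.
# SQLite's EXCEPT plays the role of Oracle's MINUS.
missing = [row[0] for row in conn.execute("""
    SELECT employee_name FROM emp1
    EXCEPT
    SELECT employee_name FROM emp2
    ORDER BY employee_name
""")]

# emp2 is a superset of emp1 only when nothing is missing.
is_superset = not missing
print(missing, is_superset)
```

An inner-join count that is smaller than the row count of emp1 (12360 versus 18003 in the question) points to the same conclusion, but the set difference also lists which names are missing. In the question, one side truncates the name with substrb(..., 1, 50), so any name longer than 50 bytes could never match its untruncated counterpart; applying the same expression on both sides would avoid that.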

Related

Limit the data set of a single table within a multi-table sql select statement

I'm working in an Oracle environment.
In a 1:M table relationship I want to write a query that will bring me each row from the "1" table and only 1 matching row from the "many" table.
To give a made up example... ( * = Primary Key/Foreign Key )
EMPLOYEE
*emp_id
name
department
PHONE_NUMBER
*emp_id
num
There are many phone numbers for one employee.
Let's say I wanted to return all employees and only one of their phone numbers. (Please forgive the far-fetched example. I'm trying to simulate a workplace scenario)
I tried to run:
SELECT emp.*, phone.num
FROM EMPLOYEE emp
JOIN PHONE_NUMBER phone
ON emp.emp_id = phone.emp_id
WHERE phone.ROWNUM <= 1;
It turns out (and it makes sense to me now) that ROWNUM only exists within the context of the results returned from the entire query. There is not a "ROWNUM" for each table's data set.
I also tried:
SELECT emp.*, phone.num
FROM EMPLOYEE emp
JOIN PHONE_NUMBER phone
ON emp.emp_id = phone.emp_id
WHERE phone.num = (SELECT MAX(num)
FROM PHONE_NUMBER);
That one just returned me one row total. I wanted the inner SELECT to run once for each row in EMPLOYEE.
I'm not sure how else to think about this. I basically want my result set to be the number of rows in the EMPLOYEE table and for each row the first matching row in the PHONE_NUMBER table.
Obviously there are all sorts of ways to do this with procedures and scripts and such but I feel like there is a single-query solution in there somewhere...
Any ideas?
I'd use a rank (or dense_rank or row_number depending on how you want to handle ties)
SELECT *
FROM (SELECT emp.*,
phone.num,
rank() over (partition by emp.emp_id
order by phone.num) rnk
FROM EMPLOYEE emp
JOIN PHONE_NUMBER phone
ON emp.emp_id = phone.emp_id)
WHERE rnk = 1
This will rank the rows in phone for each emp_id by num and return the top row. If there could be two rows for the same emp_id with the same num, rank would assign both a rnk of 1, so you'd get duplicate rows. You could add additional conditions to the order by to break the tie, or you could use row_number rather than rank to break the tie arbitrarily.
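The tie behaviour described above can be seen in a small sketch. It uses Python's sqlite3 module as a stand-in for Oracle and assumes a SQLite build with window-function support (3.25 or later); the phone numbers are invented, and employee 1 deliberately has the same lowest number stored twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE phone_number (emp_id INTEGER, num TEXT);
    -- Hypothetical data: employee 1 has the same lowest number twice (a tie).
    INSERT INTO phone_number VALUES
        (1, '555-0100'), (1, '555-0100'), (1, '555-0199'),
        (2, '555-0150');
""")

# Same query shape as the answer; only the ranking function changes.
query = """
    SELECT emp_id, num
    FROM (SELECT emp_id, num,
                 {fn}() OVER (PARTITION BY emp_id ORDER BY num) AS rnk
          FROM phone_number)
    WHERE rnk = 1
    ORDER BY emp_id
"""

# rank() gives both tied rows a rank of 1, so employee 1 appears twice.
with_rank = conn.execute(query.format(fn="rank")).fetchall()

# row_number() numbers the tied rows 1 and 2, so only one of them is kept.
with_row_number = conn.execute(query.format(fn="row_number")).fetchall()

print(with_rank)
print(with_row_number)
```

With rank, the result has three rows (two for employee 1); with row_number it has exactly one row per employee.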
All above answers will work beautifully with the scenario you described.
But if some employees have no rows in the phone table, you need a left outer join like the one below. (I faced a similar scenario where I also needed the isolated parents.)
EMP
---------
emp_id Name
---------
1 AA
2 BB
3 CC
PHONE
----------
emp_id no
----------
1 7555
1 7777
2 5555
select emp.emp_id,ph.no from emp left outer join
(
select emp_id,no,
ROW_NUMBER() OVER (PARTITION BY emp_id ORDER BY no) as rnum
FROM phone) ph
on emp.emp_id = ph.emp_id
where ph.rnum = 1 or ph.rnum is null
Result
EMP_ID NO
1 7555
2 5555
3 (null)
If you want only one phone number, then use row_number():
SELECT e.*, p.num
FROM EMPLOYEE e JOIN
(SELECT p.*,
ROW_NUMBER() OVER (PARTITION BY emp_id ORDER BY emp_id) as seqnum
FROM PHONE_NUMBER p
) p
ON e.emp_id = p.emp_id and seqnum = 1;
Alternatively, you can use aggregation, to get the minimum or maximum value.
This is my solution. It is simple, but it may not scale well to a lot of columns.
select e.emp_id, e.name, e.department, min(p.num)
from
EMPLOYEE e inner join
PHONE_NUMBER p on e.emp_id = p.emp_id
group by e.emp_id, e.name, e.department
order by e.emp_id;
And this fixes the query you tried:
SELECT emp.*, phone.num
FROM EMPLOYEE emp
JOIN PHONE_NUMBER phone
ON emp.emp_id = phone.emp_id
WHERE phone.num = (SELECT MAX(num)
FROM PHONE_NUMBER p
WHERE p.emp_id = emp.emp_id );
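The difference between the original attempt and the corrected query is only whether the subquery refers to the outer row. A minimal sketch, again using Python's sqlite3 module as a stand-in for Oracle and reusing the small sample values from the answers above (the table layout is simplified for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE phone_number (emp_id INTEGER, num TEXT);
    INSERT INTO employee VALUES (1, 'AA'), (2, 'BB');
    INSERT INTO phone_number VALUES (1, '7555'), (1, '7777'), (2, '5555');
""")

base = """
    SELECT emp.emp_id, phone.num
    FROM employee emp
    JOIN phone_number phone ON emp.emp_id = phone.emp_id
    WHERE phone.num = ({subquery})
    ORDER BY emp.emp_id
"""

# Uncorrelated: one global maximum, so only the row that owns it survives.
uncorrelated = conn.execute(
    base.format(subquery="SELECT MAX(num) FROM phone_number")
).fetchall()

# Correlated: the maximum is recomputed for each outer employee row.
correlated = conn.execute(
    base.format(
        subquery="SELECT MAX(p.num) FROM phone_number p WHERE p.emp_id = emp.emp_id"
    )
).fetchall()

print(uncorrelated)
print(correlated)
```

The uncorrelated form returns a single row in total, which is the behaviour the question describes; the correlated form returns one row per employee that has at least one phone number.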

does anyone know how to list the empid and the name of all supervisors if supervisors are in employee table too?

I have a table that contains empid, name, salary, hiredate, position and supervisor (which includes empid, not the name). How do I list the empid and name of all supervisors?
The output has to have two columns: supervisor (a list of their empid values) and their names. This is the create statement used to create the Employee table:
/* Create table Employee */
IF OBJECT_ID('Employee', 'U') IS NOT NULL
DROP TABLE Employee
GO
CREATE TABLE Employee (
emp_id NCHAR(5),
name NVARCHAR(20),
position NVARCHAR(20),
hire_date DATETIME,
salary MONEY,
bcode NCHAR(3),
supervisor NCHAR(5)
)
I have tried a variety of statements using having statement and count but they don't seem to work.
select emp_id, name from employee where position='manager';
I tried this but it doesn't work. Anyone smart that knows how to do it?
You will have to join the table back on itself:
select a.name, a.position, a.hire_date, a.salary, a.supervisor,
isnull(b.name, '') as SupervisorName
from Employee a
left join Employee b
on a.supervisor = b.emp_id
The left join will make sure that the employees who do not have supervisors are returned, and isnull(b.name, '<NONE>') can be used if you would like to have something other than an empty string as the value in those cases.
SELECT e.emp_id, ISNULL(b.name, 'No supervisor') SupervisorName
FROM Employee e LEFT JOIN Employee b
ON e.supervisor = b.emp_id
An inner join will leave out the people who do not have a supervisor; use a left join to get all the employees.
If you want supervisors only, you just need to select rows whose emp_id values are found in the supervisor column:
SELECT
SupervisorID = emp_id,
SupervisorName = name
FROM dbo.Employee
WHERE emp_id IN (SELECT supervisor FROM dbo.Employee)
;
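As a quick check of the "supervisors only" query, here is a small sketch using Python's sqlite3 module in place of SQL Server. The four employees and their reporting lines are invented for illustration, and the table is reduced to the three columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id TEXT, name TEXT, supervisor TEXT);
    -- Hypothetical hierarchy: Ann supervises Bob and Cy, Bob supervises Dee.
    INSERT INTO employee VALUES
        ('E0001', 'Ann', NULL),
        ('E0002', 'Bob', 'E0001'),
        ('E0003', 'Cy',  'E0001'),
        ('E0004', 'Dee', 'E0002');
""")

# An employee is a supervisor if their emp_id appears in someone's supervisor column.
supervisors = conn.execute("""
    SELECT emp_id AS supervisor, name
    FROM employee
    WHERE emp_id IN (SELECT supervisor FROM employee)
    ORDER BY emp_id
""").fetchall()

print(supervisors)
```

Each supervisor is listed once even when several employees report to them, because IN only tests membership. The NULL in Ann's supervisor column is harmless with IN; it would matter if the condition were turned into NOT IN.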

SQL Server fetch alias name for query

Please check fiddle: myFiddle
Query:
create table Emp(empId int primary key, EmpName varchar(50),MngrID int)
insert into Emp(empId,EmpName,MngrID)values(1,'A',2)
insert into Emp(empId,EmpName,MngrID)values(2,'B',null)
A has mngr B but B has no mngr, so the query result should look like this:
EmpId EmpName MngrName(alias MngrName)
1 A B
2 B null
How to fetch the above data using a query?
For some reason it doesn't work in SQLFiddle, but I ran it in my own instance of SQL Server to verify that it does work:
SELECT e1.EmpID, e1.EmpName, e2.EmpName AS MngrName
FROM emp e1 LEFT OUTER JOIN emp e2
ON e1.MngrID = e2.EmpID
Basically, you're doing a 'self join' by declaring two instances of the table (e1 and e2), and then joining the first instance's MngrID to the second instance's EmpID.
You need to LEFT JOIN table to itself:
select A.empID, A.empName, b.empName as mgrName
from emp A left join emp B on A.mngrID = b.empID
http://sqlfiddle.com/#!3/184dc/8
select empId,EmpName,(SELECT EmpName FROM emp WHERE empId = amp1.MngrID) AS Manager from emp as amp1

Multiple column Union Query without duplicates

I'm trying to write a Union Query with multiple columns from two different tables (duh), but for some reason the second column of the second Select statement isn't showing up in the output. I don't know if that painted the picture properly, but here is my code:
Select empno, job
From EMP
Where job = 'MANAGER'
Union
Select empno, empstate
From EMPADDRESS
Where empstate = 'NY'
Order By empno
The output looks like:
EMPNO JOB
4600 NY
5300 MANAGER
5300 NY
7566 MANAGER
7698 MANAGER
7782 MANAGER
7782 NY
7934 NY
9873 NY
Instead of 5300 and 7782 appearing twice, I thought empstate would appear next to job in the output. For all other empno's I thought the values in the fields would be (null). Am I not understanding Unions correctly, or is this how they are supposed to work?
Thanks for any help in advance.
If you want the data in a separate column you will want a JOIN not a UNION:
Select e.empno, e.job, a.empstate
From EMP e
left join EMPADDRESS a
on e.empno = a.empno
Where job = 'MANAGER'
AND empstate = 'NY'
Order By e.empno
A UNION combines the two results into a single set but the data is listed in the same columns. So basically they are placed on top of one another:
select col1, col2, 'table1' as src
from table1
union all
select col1, col2, 'table2' as src
from table2
Will result in:
col1 | col2 | src
t1 | t1 | table1
t2 | t2 | table2
If you want to have the data in a separate column, which it sounds like you do, then you will use a join of the tables.
Bluefeet has the correct answer.
Think of joins as combining tables horizontally - you're adding more columns to the original query with each table you join.
Think of unions as stacking record sets vertically - you're adding extra rows to the same set of columns.
You need a JOIN for this..
Select e.empno, e.job, ea.empstate
From EMP e LEFT OUTER JOIN EMPADDRESS ea ON e.empno = ea.empno
Where e.job = 'MANAGER'
And ea.empstate = 'NY'
Order By e.empno
UNION is for taking two result sets with the same number of columns (and compatible types) and merging them into one. In your example, it's lumping column 2 (job and empstate) together and taking the name from the first select.
I think you meant to write it as a join instead,
i.e. if you wanted empstate to be null for those employees not in NY:
select e.empno, e.job, a.empstate
from emp e
left outer join empaddress a
on a.empno = e.empno
and a.empstate = 'NY'
where e.job = 'MANAGER';
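Where the state filter goes matters with an outer join. The sketch below, in Python with sqlite3 standing in for Oracle, runs the same left join twice: once with the empstate condition in the ON clause and once with it in the WHERE clause. The empno values are taken from the question's output, but the rows themselves (including the 'TX' and 'CLERK' values) are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (empno INTEGER, job TEXT);
    CREATE TABLE empaddress (empno INTEGER, empstate TEXT);
    -- Hypothetical rows for illustration.
    INSERT INTO emp VALUES (5300, 'MANAGER'), (7566, 'MANAGER'), (7934, 'CLERK');
    INSERT INTO empaddress VALUES (5300, 'NY'), (7566, 'TX'), (7934, 'NY');
""")

# State filter in the ON clause: every manager is kept, non-NY ones get NULL.
filter_in_on = conn.execute("""
    SELECT e.empno, e.job, a.empstate
    FROM emp e
    LEFT OUTER JOIN empaddress a
      ON a.empno = e.empno AND a.empstate = 'NY'
    WHERE e.job = 'MANAGER'
    ORDER BY e.empno
""").fetchall()

# State filter in the WHERE clause: rows with a NULL state are discarded,
# so the outer join behaves like an inner join.
filter_in_where = conn.execute("""
    SELECT e.empno, e.job, a.empstate
    FROM emp e
    LEFT OUTER JOIN empaddress a
      ON a.empno = e.empno
    WHERE e.job = 'MANAGER' AND a.empstate = 'NY'
    ORDER BY e.empno
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```

With the filter in ON, manager 7566 is still returned with a NULL state; with the filter in WHERE, only managers who live in NY remain.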
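This one works in Oracle using a union. The inner query fetches all the columns; the outer query then groups by empno and string-concatenates the remaining columns. Note that wm_concat is an undocumented function; LISTAGG is the documented alternative in Oracle 11g Release 2 and later.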
select EMPNO
,wm_concat(job) job
,wm_concat(EMPSTATE) EMPSTATE
from
( select EMPNO,job,'' as EMPSTATE from EMP Where job ='MANAGER'
union select EMPNO,'' as job, EMPSTATE from EMPADDRESS Where empstate ='NY'
)
group by EMPNO order by 1

selecting unique row from a table- ORACLE

I have a table employee that has employee’s benefit data.
I have a field in the table called isenrolled: if the field is 1, the employee has enrolled for the benefit, and if it is 0, the employee has not enrolled.
My problem is that I have multiple records for an employee; for example, Scott has two entries, one with isenrolled = 1 and one with isenrolled = 0.
I want to select only the one record for SCOTT where isenrolled = 1 and reject the one where isenrolled = 0, so that I get unique records for employees who have enrolled and who have not enrolled. How do I select those employees? I tried the query below and it doesn't work:
select * FROM employee e
WHERE e.empid not IN( SELECT empid FROM employee e2
WHERE e2.isenrolled =1)
First I set up some test data using:
create table t4
as select * from scott.emp;
alter table t4 add (isenrolled number(10));
update t4
set isenrolled = 0;
insert into t4
(select emp.*, 1
from scott.emp);
At this point the t4 table now has two records for each employee (one where isenrolled = 0 and one where isenrolled = 1),
so I change ALLEN's data so that he has "opted" out:
delete from t4
where ename = 'ALLEN'
and isenrolled = 1
This query then shows the data for ALLEN (1 record) and SMITH (2 records)
select *
from t4
where ename IN ('ALLEN', 'SMITH')
order by ename
Then to show just one record per employee (restricted to ALLEN and SMITH in this case) you could use:
select t.*
from t4 t
where isenrolled = (select MAX(isenrolled)
from t4
where t4.empno = t.empno)
and ename IN ('ALLEN', 'SMITH')
Hope this helps
select unique(e.empid) from employee e where e.isenrolled =1
This will give you a unique list of employees who are enrolled. Is that what you want to get?