Simple SQL Problem - sql

I have a SQL query that I can't wrap my mind around. I do not have a large amount of SQL experience, so I need some help.
I have a table XXX:
Social Security No (SSN).
Name.
organisation. (Finance/IT)
In English, what I want is:
To select all SSNs and Names in "Finance" where there is a different name for that SSN in "IT".
My not working attempt:
select ssn , name from XXX where org = "Finance" and name not in (select name from XXX where org="IT" and ssn=the_first_ssn)
Please help.
I have decided to make it a bit more difficult.
SSN can occur multiple times in "IT":
So I want to select all SSNs and Names in Finance where the SSN does not exist with the same Name in "IT"

You could use a subquery in an exists clause:
select ssn, name
from YourTable a
where organisation = 'Finance'
and exists
(
select *
from YourTable b
where organisation = 'IT'
and a.ssn = b.ssn
and a.name <> b.name
)
The subquery says there must be a row in IT with the same SSN but a different name.
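A minimal sketch of this approach, run against an in-memory SQLite database from Python. The table name `people`, the sample rows, and the use of SQLite are assumptions made for illustration; the question does not show real data. The second query is one possible reading of the revised requirement (no IT row with the same SSN and the same name), written with NOT EXISTS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (ssn TEXT, name TEXT, organisation TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("111", "Ann Lee", "Finance"),
        ("111", "Ann Lee", "IT"),      # same name in IT
        ("111", "A. Lee", "IT"),       # different name in IT
        ("222", "Bob Ray", "Finance"),
        ("222", "Bob Ray", "IT"),      # only the same name in IT
        ("333", "Cy Dow", "Finance"),  # SSN absent from IT
    ],
)

# Finance rows whose SSN appears in IT under a different name.
different_name = conn.execute(
    """
    SELECT ssn, name FROM people a
    WHERE organisation = 'Finance'
      AND EXISTS (SELECT 1 FROM people b
                  WHERE b.organisation = 'IT'
                    AND b.ssn = a.ssn
                    AND b.name <> a.name)
    ORDER BY ssn
    """
).fetchall()

# Finance rows with no IT row carrying the same SSN and the same name.
no_same_name = conn.execute(
    """
    SELECT ssn, name FROM people a
    WHERE organisation = 'Finance'
      AND NOT EXISTS (SELECT 1 FROM people b
                      WHERE b.organisation = 'IT'
                        AND b.ssn = a.ssn
                        AND b.name = a.name)
    ORDER BY ssn
    """
).fetchall()

print(different_name)
print(no_same_name)
```

On this made-up data the two readings return different rows, so it is worth deciding which one is actually wanted before picking a query.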

Assuming ssn is a unique key...
select ssn, name
from XXX XXX1
where org = 'Finance'
and ssn in
(
select ssn
from XXX XXX2
where org = 'IT'
and XXX1.name<>XXX2.name
)

SELECT distinct
t1.ssn,
t1.name
from
xxx t1
inner join xxx t2 on t1.ssn=t2.ssn and t1.name<>t2.name
where t1.org='Finance' and t2.org='IT'

I know I'm late to the party, but I'm working on learning SQL and I wanted to try my hand at a solution and compare against the existing answers. I created a table Personnel with some testing data.
My SQL Server only query uses CTEs and an INNER JOIN:
WITH
Finance AS (SELECT SSN, Name FROM Personnel WHERE Org = 'Finance'),
IT AS (SELECT SSN, Name FROM Personnel WHERE Org = 'IT')
SELECT Finance.SSN, Finance.Name
FROM Finance
INNER JOIN IT ON IT.SSN = Finance.SSN
WHERE IT.Name != Finance.Name
Alexander's solution uses a straight INNER JOIN. I rewrote it a little bit, putting the name comparison in the WHERE clause, and dropping DISTINCT because it's not required:
SELECT Finance.SSN, Finance.Name
FROM Personnel Finance
INNER JOIN Personnel IT ON Finance.SSN = IT.SSN
WHERE
(Finance.Org = 'Finance' AND IT.Org = 'IT') AND
(Finance.Name != IT.Name)
Andomar's solution using a correlated subquery inside an EXISTS clause:
SELECT SSN, Name
FROM Personnel a
WHERE
(Org = 'Finance') AND
EXISTS
(
SELECT *
FROM Personnel b
WHERE (Org = 'IT') AND (a.SSN = b.SSN) AND (a.Name != b.Name)
)
barrylloyd's solution using a correlated subquery inside an IN clause:
SELECT SSN, Name
FROM Personnel p1
WHERE
(Org = 'Finance') AND
SSN IN
(
SELECT SSN FROM Personnel p2
WHERE (Org = 'IT') AND (p1.Name != p2.Name)
)
I plugged all of these into SQL Server, and it turns out that queries 1 and 2 both generate the same query plan, and queries 3 and 4 generate the same query plan. The difference between the two groups is the former group actually does an INNER JOIN internally, while the latter group does a left semi-join instead. (See here for an explanation of the different types of joins.)
I'm assuming there is a slight performance advantage favouring the left semi-join; however, for the business case, if you want to see any data columns from the right table (for example, if you want to display both names to compare them), you would have to completely rewrite those queries to use an INNER JOIN-based solution.
So given all that, I would favour solution 2, because the performance is so similar to 3 and 4, and it's far more flexible than those as well. My solution makes the SELECT statement very easy to read, but it's more verbose than 2 and not as portable. I suppose that mine might be better for readability if you have to do additional filtering on each of the two "sub-tables," or if the results of this query are going to be used as an intermediate step to a further goal.
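One caveat on dropping DISTINCT: the question's follow-up says an SSN can occur multiple times in IT. The sketch below uses SQLite from Python with hypothetical rows (not the test data mentioned above, which is not shown) to illustrate how the join form and the EXISTS form can then return a different number of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Personnel (SSN TEXT, Name TEXT, Org TEXT)")
conn.executemany(
    "INSERT INTO Personnel VALUES (?, ?, ?)",
    [
        ("111", "Ann Lee", "Finance"),
        ("111", "A. Lee", "IT"),
        ("111", "Anne Lee", "IT"),  # second IT row for the same SSN
    ],
)

# Join form without DISTINCT: one output row per matching IT row.
join_rows = conn.execute(
    """
    SELECT Finance.SSN, Finance.Name
    FROM Personnel Finance
    INNER JOIN Personnel IT ON Finance.SSN = IT.SSN
    WHERE Finance.Org = 'Finance' AND IT.Org = 'IT'
      AND Finance.Name != IT.Name
    """
).fetchall()

# EXISTS form: each Finance row is returned at most once.
exists_rows = conn.execute(
    """
    SELECT SSN, Name
    FROM Personnel a
    WHERE Org = 'Finance'
      AND EXISTS (SELECT 1 FROM Personnel b
                  WHERE b.Org = 'IT' AND a.SSN = b.SSN AND a.Name != b.Name)
    """
).fetchall()

print(len(join_rows), len(exists_rows))
```

With two differently named IT rows for one SSN, the join returns the Finance row twice, so DISTINCT (or the EXISTS form) matters whenever IT can hold several rows per SSN.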

Related

How do I check for occurrence in two tables simultaneously

I have a SQL query (Oracle) that checks for both persons and firms. The problem is that you won't find a company in the user table, and the other way around.
As of now I write this in two queries, but I would like to make this into one query (for example, if I can get some help creating a temporary table).
I have an info table that tells me if this is a user, a company or both.
The SQL looks a bit like this:
Table1:
fk_id,
info1,
info2,
info3
Info_table:
fk_id,
<info if user, company or both>
User_table:
firstname,
lastname,
adress,
fk_id
Company_table:
Companyname,
adress,
fk_id
I would like to either:
Make a temporary table that looks like this:
Temptable:
fk_id,
firstname(if user or both, else empty),
lastname(if user, else companyname),
adress
or make a query like this:
select table1.info1, table1.info2, firstname, lastname, adress
from table1,
user_table,
company_table,
info_table
where table1.fk_id = user_table.fk_id (if user or both)
or table1.fk_id = company_table.fk_id (if company)
Any tips on how to solve this would be great. What is the best solution (making a temp table or to add this into the initial query)?
Use a left outer join (in this response I'll use the (+) operator for convenience):
select table1.info1, table1.info2,
firstname,
nvl(lastname,company_name) lastname,
nvl(user_table.adress,company_table.adress) adress
from table1,
user_table,
company_table,
info_table
where
table1.fk_id=info_table.fk_id(+)
and table1.fk_id = user_table.fk_id(+) --(if user or both)
and table1.fk_id = company_table.fk_id(+) --(if company)
You can use a union:
quick example:
select firstname
,lastname as name
,'person_table' as source_table
from person_table
union
select null
,company_name
,'company_table'
from company_table;
The result will be a list of both persons and companies.
The correct way to write this query:
select t1.info1, t1.info2, ut.firstname,
coalesce(ut.lastname, ct.company_name) as lastname,
coalesce(ut.adress, ct.adress) as address
from table1 t1 left join
info_table it
on t1.fk_id = it.fk_id left join
user_table ut
on t1.fk_id = ut.fk_id and
it.info in ('both', 'user') left join
company_table ct
on t1.fk_id = ct.fk_id and
it.info in ('both', 'company') ;
Notes:
This uses proper, explicit, standard JOIN syntax, as recommended by Oracle itself.
This only does the joins when the type specifies that the join is appropriate.
This uses COALESCE(), a standard function, rather than the bespoke NVL().
All column names are qualified.
This uses table aliases, so the query is easier to write and to read.
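A runnable sketch of the LEFT JOIN plus COALESCE pattern, using SQLite from Python in place of Oracle. The sample rows, the `info` column name and its values ('user', 'company', 'both') are assumptions: the question does not say how info_table encodes the type, and this sketch assumes that column lives in info_table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (fk_id INTEGER, info1 TEXT, info2 TEXT);
    CREATE TABLE info_table (fk_id INTEGER, info TEXT);
    CREATE TABLE user_table (firstname TEXT, lastname TEXT, adress TEXT, fk_id INTEGER);
    CREATE TABLE company_table (company_name TEXT, adress TEXT, fk_id INTEGER);

    INSERT INTO table1 VALUES (1, 'a1', 'a2'), (2, 'b1', 'b2');
    INSERT INTO info_table VALUES (1, 'user'), (2, 'company');
    INSERT INTO user_table VALUES ('Ada', 'Byron', '1 Main St', 1);
    INSERT INTO company_table VALUES ('Acme Ltd', '9 Dock Rd', 2);
    """
)

# Each optional table is joined only when the type allows it;
# COALESCE picks the user value first and falls back to the company value.
rows = conn.execute(
    """
    SELECT t1.info1, ut.firstname,
           COALESCE(ut.lastname, ct.company_name) AS lastname,
           COALESCE(ut.adress, ct.adress) AS address
    FROM table1 t1
    LEFT JOIN info_table it ON t1.fk_id = it.fk_id
    LEFT JOIN user_table ut
           ON t1.fk_id = ut.fk_id AND it.info IN ('both', 'user')
    LEFT JOIN company_table ct
           ON t1.fk_id = ct.fk_id AND it.info IN ('both', 'company')
    ORDER BY t1.fk_id
    """
).fetchall()

for row in rows:
    print(row)
```

The company row comes back with an empty firstname and the company name in the lastname column, which matches the Temptable layout sketched in the question.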

Create filter for most recent date using combined columns

I have created a filtering application in Access that references four simple tables:
Employee: Emp_ID, FirstName, LastName
Skill: Skill_ID, SkillName, SkillDescription, SkillGroup
Employee_Skill: Entry_ID, Emp_ID, Skill_ID, LevelofExperience, Dateupdated
SkillGroupName: SkillGroup_ID, SkillGroupName
Basically the idea of this database is to track employee skills and how the level of experience improves (or not!) over time. The problem I am facing is that I want the application to filter by the most recently updated combination of Skill and Employee. I have found the query that will allow for me to use the two columns as a distinct entity:
SELECT DISTINCT Emp_ID, Skill_ID FROM Employee_Skill
WHERE (SELECT MAX(DateUpdated)From Employee_Skill);
And it works perfectly on its own, but I don't know how to incorporate it into my main query, which simply joins together the necessary columns for an easier end user experience. It does not visibly show Emp_ID or Skill_ID. It also doesn't in the VBA for the application. (-1 = Include all History; 0 = Only include most updated.)
Update:
I have been able to select the distinct combination of Employee and Skill through my main query by doing this:
SELECT
Employee.FirstName,
Employee.LastName,
Max(Employee_Skill.LevelOfExperience) AS LevelOfExperience,
Skill.SkillName,
Max(Employee_Skill.DateUpdated) AS DateUpdated,
Max(SkillGroup.SkillGroupName) AS SkillGroupName
FROM
SkillGroup INNER JOIN
(Skill INNER JOIN
(Employee INNER JOIN
Employee_Skill ON
Employee.Emp_ID = Employee_Skill.Emp_ID) ON
Skill.Skill_ID = Employee_Skill.Skill_ID) ON
SkillGroup.SkillGroup_ID = Skill.SkillGroup
WHERE
Employee.Active=True
GROUP BY
Employee.FirstName,
Employee.LastName,
Skill.SkillName
ORDER BY
Max(Employee_Skill.LevelOfExperience) DESC;
However, my forms and reports built on this query are stuck with only the option of seeing the most updated version. I am really hoping to have a dynamic form that removes the constraints as desired.
Not sure what you're doing with Max(Employee_Skill.LevelOfExperience) or Max(SkillGroup.SkillGroupName) but I think you need to stick with querying for the detail rows and then include another column marking the Max(Employee_Skill.DateUpdated) filter, like:
SELECT
Employee.FirstName,
Employee.LastName,
Employee_Skill.LevelOfExperience,
Skill.SkillName,
Employee_Skill.DateUpdated,
SkillGroup.SkillGroupName,
iif(max_dateUpdated=dateupdated,1,0) as is_max_DateUpdated
FROM
SkillGroup INNER JOIN
(Skill INNER JOIN
(Employee INNER JOIN
Employee_Skill ON
Employee.Emp_ID = Employee_Skill.Emp_ID) ON
Skill.Skill_ID = Employee_Skill.Skill_ID) ON
SkillGroup.SkillGroup_ID = Skill.SkillGroup inner join
(select
Emp_ID,
max(dateupdated) as max_dateUpdated
from
Employee_Skill
group by
Emp_ID) mx on
Employee.Emp_ID = mx.Emp_ID
WHERE
Employee.Active=True
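A small sketch of the "keep the detail rows and add a flag" idea, using SQLite from Python as a stand-in for Access. Several things here are assumptions: the sample rows, ISO-formatted date strings, CASE in place of Access's IIf, and grouping by both Emp_ID and Skill_ID, which is one reading of "most recently updated combination of Skill and Employee" in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Employee_Skill (
        Emp_ID INTEGER, Skill_ID INTEGER,
        LevelOfExperience INTEGER, DateUpdated TEXT
    );
    INSERT INTO Employee_Skill VALUES
        (1, 10, 2, '2015-01-01'),
        (1, 10, 3, '2016-01-01'),
        (1, 20, 1, '2015-06-01');
    """
)

# Every history row is kept; IsMostRecent marks the newest row
# for each (employee, skill) pair.
rows = conn.execute(
    """
    SELECT es.Emp_ID, es.Skill_ID, es.LevelOfExperience, es.DateUpdated,
           CASE WHEN es.DateUpdated = mx.MaxDateUpdated THEN 1 ELSE 0 END
               AS IsMostRecent
    FROM Employee_Skill es
    INNER JOIN (
        SELECT Emp_ID, Skill_ID, MAX(DateUpdated) AS MaxDateUpdated
        FROM Employee_Skill
        GROUP BY Emp_ID, Skill_ID
    ) mx ON es.Emp_ID = mx.Emp_ID AND es.Skill_ID = mx.Skill_ID
    ORDER BY es.Emp_ID, es.Skill_ID, es.DateUpdated
    """
).fetchall()

# A form could show all rows, or filter on the flag for "latest only".
latest_only = [r for r in rows if r[4] == 1]
print(rows)
print(latest_only)
```

Because the flag is just another column, a form or report can toggle between full history and latest-only by filtering on it rather than by swapping queries.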

sql server - how to modify values in a query statement?

I have a statement like this:
select lastname,firstname,email,floorid
from employee
where locationid=1
and (statusid=1 or statusid=3)
order by floorid,lastname,firstname,email
The problem is the column floorid. The result of this query is showing the id of the floors.
There is this table called floor (has like 30 rows), which has columns id and floornumber. The floorid (in above statement) values match the id of the table floor.
I want the above query to switch the floorid values into the associated values of the floornumber column in the floor table.
Can anyone show me how to do this please?
I am using Microsoft sql server 2008 r2.
I am new to sql and I need a clear and understandable method if possible.
select lastname,
firstname,
email,
floor.floornumber
from employee
inner join floor on floor.id = employee.floorid
where locationid = 1
and (statusid = 1 or statusid = 3)
order by floorid, lastname, firstname, email
You have to do a simple join where you check, if the floorid matches the id of your floor table. Then you use the floornumber of the table floor.
select a.lastname,a.firstname,a.email,b.floornumber
from employee a
join floor b on a.floorid = b.id
where a.locationid=1 and (a.statusid=1 or a.statusid=3)
order by a.floorid,a.lastname,a.firstname,a.email
You need to use a join.
This will join the two tables on a certain field.
This way you can SELECT columns from more than one table at a time.
When you join two tables you have to specify on which column you want to join them.
In your example, you'd have to do this:
from employee join floor on employee.floorid = floor.id
Since you are new to SQL, there are a few things you should know. In the other answers to this question, people use aliases instead of repeating the table name.
from employee a join floor b
means that from now on the table employee will be known as a and the table floor as b. This is really useful when you have a lot of joins to do.
Now let's say both tables have a column called name. In your select you have to say from which table you want to pick the column name. If you only write this
SELECT name from Employee a join floor b on a.id = b.id
the database won't know which table you want the column name from. You would have to specify it like this:
SELECT Employee.name from Employee join floor on Employee.id = floor.id
or, if you prefer, with aliases:
SELECT a.name from Employee a join floor b on a.id = b.id
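The ambiguity described above can be reproduced with SQLite from Python. This is only a sketch: the question uses SQL Server, whose error text differs, and the `name` column on floor plus the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (id INTEGER, name TEXT, floorid INTEGER);
    CREATE TABLE floor (id INTEGER, name TEXT, floornumber INTEGER);
    INSERT INTO employee VALUES (1, 'Dana', 7);
    INSERT INTO floor VALUES (7, 'Mezzanine', 2);
    """
)

# Unqualified "name" exists in both tables, so the statement is rejected.
try:
    conn.execute("SELECT name FROM employee a JOIN floor b ON a.floorid = b.id")
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# Qualifying each column with its alias removes the ambiguity.
rows = conn.execute(
    "SELECT a.name, b.floornumber "
    "FROM employee a JOIN floor b ON a.floorid = b.id"
).fetchall()

print(error)
print(rows)
```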
Finally, there are many types of joins:
Inner join (what you are using, because simply typing JOIN means an inner join)
Left outer join
Right outer join
Self join
...
You should refer to this article about joins to learn how to use them correctly.
Hope this helps.

SQL Query for finding values that do not exist in one table, with WHERE clause

I'm struggling to compile a query for the following and wonder if anyone can please help (I'm a SQL newbie).
I have two tables:
(1) student_details, which contains the columns: student_id (PK), firstname, surname (and others, but not relevant to this query)
(2) membership_fee_payments, which contains details of monthly membership payments for each student and contains the columns: membership_fee_payments_id (PK), student_id (FK), payment_month, payment_year, amount_paid
I need to create the following query:
which students have not paid fees for March 2012?
The query could be for any month/year, March is just an example. I want to return in the query firstname, surname from student_details.
I can query successfully who has paid for a certain month and year, but I can't work out how to query who has not paid!
Here is my query for finding out who has paid:
SELECT student_details.firstname, student_details.surname
FROM student_details
INNER JOIN membership_fee_payments
ON student_details.student_id = membership_fee_payments.student_id
WHERE membership_fee_payments.payment_month = "March"
AND membership_fee_payments.payment_year = "2012"
ORDER BY student_details.firstname
I have tried a left join and left outer join but get the same result. I think perhaps I need to use NOT EXISTS or IS NULL but I haven't had much luck writing the right query yet.
Any help much appreciated.
I'm partial to using WHERE NOT EXISTS. Typically that would look something like this:
SELECT D.firstname, D.surname
FROM student_details D
WHERE NOT EXISTS (SELECT * FROM membership_fee_payments P
WHERE P.student_id = D.student_id
AND P.payment_year = '2012'
AND P.payment_month = 'March'
)
This is known as a correlated subquery, as it contains references to the outer query. This allows you to include your join criteria in the subquery without necessarily writing a JOIN. Also, most RDBMS query optimizers will implement NOT EXISTS as an anti semi join, which does not typically do as much 'work' as a complete join.
You could use a left join. When the payment is missing, all the columns in the left join table will be null:
SELECT student_details.firstname, student_details.surname
FROM student_details
LEFT JOIN membership_fee_payments
ON student_details.student_id = membership_fee_payments.student_id
AND membership_fee_payments.payment_month = "March"
AND membership_fee_payments.payment_year = "2012"
WHERE membership_fee_payments.student_id is null
ORDER BY student_details.firstname
You can also write the following query. This will give your expected output.
SELECT student_details.firstname,
student_details.surname
FROM student_details
Where
student_details.student_id Not in
(SELECT membership_fee_payments.student_id
from membership_fee_payments
WHERE
membership_fee_payments.payment_year = '2012'
AND membership_fee_payments.payment_month = 'March'
)
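To compare the three approaches side by side, here is a sketch using SQLite from Python. The sample students and payments are invented, and the month and year are passed as parameters so the same queries work for any period.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student_details (
        student_id INTEGER PRIMARY KEY, firstname TEXT, surname TEXT
    );
    CREATE TABLE membership_fee_payments (
        membership_fee_payments_id INTEGER PRIMARY KEY,
        student_id INTEGER, payment_month TEXT, payment_year TEXT,
        amount_paid REAL
    );
    INSERT INTO student_details VALUES
        (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Dow');
    INSERT INTO membership_fee_payments VALUES
        (1, 1, 'March', '2012', 20.0),     -- Ann paid for March 2012
        (2, 2, 'February', '2012', 20.0);  -- Bob paid, but not for March
    """
)
period = ("March", "2012")

not_exists = conn.execute(
    """
    SELECT d.firstname, d.surname FROM student_details d
    WHERE NOT EXISTS (SELECT 1 FROM membership_fee_payments p
                      WHERE p.student_id = d.student_id
                        AND p.payment_month = ? AND p.payment_year = ?)
    ORDER BY d.firstname
    """,
    period,
).fetchall()

left_join = conn.execute(
    """
    SELECT d.firstname, d.surname FROM student_details d
    LEFT JOIN membership_fee_payments p
           ON p.student_id = d.student_id
          AND p.payment_month = ? AND p.payment_year = ?
    WHERE p.student_id IS NULL
    ORDER BY d.firstname
    """,
    period,
).fetchall()

not_in = conn.execute(
    """
    SELECT d.firstname, d.surname FROM student_details d
    WHERE d.student_id NOT IN (SELECT p.student_id
                               FROM membership_fee_payments p
                               WHERE p.payment_month = ? AND p.payment_year = ?)
    ORDER BY d.firstname
    """,
    period,
).fetchall()

print(not_exists)
```

All three agree on this data. One difference worth knowing: if the subquery behind NOT IN can return a NULL student_id, NOT IN yields no rows at all, whereas NOT EXISTS is unaffected.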

Explanation of self-joins

I don't understand the need for self-joins. Can someone please explain them to me?
A simple example would be very helpful.
You can view self-join as two identical tables. But in normalization, you cannot create two copies of the table so you just simulate having two tables with self-join.
Suppose you have two tables:
Table emp1
Id Name Boss_id
1 ABC 3
2 DEF 1
3 XYZ 2
Table emp2
Id Name Boss_id
1 ABC 3
2 DEF 1
3 XYZ 2
Now, if you want to get the name of each employee along with his or her boss's name:
select c1.Name , c2.Name As Boss
from emp1 c1
inner join emp2 c2 on c1.Boss_id = c2.Id
Which will output the following table:
Name Boss
ABC XYZ
DEF ABC
XYZ DEF
It's quite common when you have a table that references itself. Example: an employee table where every employee can have a manager, and you want to list all employees and the name of their manager.
SELECT e.name, m.name
FROM employees e LEFT OUTER JOIN employees m
ON e.manager = m.id
A self join is a join of a table with itself.
A common use case is when the table stores entities (records) which have a hierarchical relationship between them. For example a table containing person information (Name, DOB, Address...) and including a column where the ID of the Father (and/or of the mother) is included. Then with a small query like
SELECT Child.ID, Child.Name, Child.PhoneNumber, Father.Name, Father.PhoneNumber
FROM myTableOfPersons As Child
LEFT OUTER JOIN myTableOfPersons As Father ON Child.FatherId = Father.ID
WHERE Child.City = 'Chicago' -- Or some other condition or none
we can get info about both child and father (and mother, with a second self join etc. and even grand parents etc...) in the same query.
Let's say you have a table users, set up like so:
user ID
user name
user's manager's ID
In this situation, if you wanted to pull out both the user's information and the manager's information in one query, you might do this:
SELECT users.user_id, users.user_name, managers.user_id AS manager_id, managers.user_name AS manager_name FROM users INNER JOIN users AS managers ON users.manager_id = managers.user_id
Imagine a table called Employee as described below. All employees have a manager which is also an employee (maybe except for the CEO, whose manager_id would be null)
Table (Employee):
int id,
varchar name,
int manager_id
You could then use the following select to find all employees and their managers:
select e1.name, e2.name as ManagerName
from Employee e1, Employee e2
where e1.manager_id = e2.id
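A quick sketch of what happens to the CEO row, using SQLite from Python with three made-up employees. The inner form (equivalent to the comma-and-WHERE query above) drops anyone whose manager_id is null; a LEFT JOIN keeps them with an empty manager name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Employee (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO Employee VALUES
        (1, 'Carol', NULL),  -- CEO, no manager
        (2, 'Dave', 1),
        (3, 'Erin', 2);
    """
)

# Inner join: only employees that have a matching manager row.
inner_rows = conn.execute(
    """
    SELECT e1.name, e2.name AS ManagerName
    FROM Employee e1
    INNER JOIN Employee e2 ON e1.manager_id = e2.id
    ORDER BY e1.id
    """
).fetchall()

# Left join: every employee, with NULL where there is no manager.
left_rows = conn.execute(
    """
    SELECT e1.name, e2.name AS ManagerName
    FROM Employee e1
    LEFT JOIN Employee e2 ON e1.manager_id = e2.id
    ORDER BY e1.id
    """
).fetchall()

print(inner_rows)
print(left_rows)
```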
They are useful if your table is self-referential. For example, for a table of pages, each page may have a next and previous link. These would be the IDs of other pages in the same table. If at some point you want to get a triple of successive pages, you'd do two self-joins on the next and previous columns with the same table's id column.
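The "triple of successive pages" idea could look roughly like this, sketched with SQLite from Python. The `pages` table, its `prev_id` and `next_id` columns, and the rows are all hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE pages (
        id INTEGER PRIMARY KEY, title TEXT, prev_id INTEGER, next_id INTEGER
    );
    INSERT INTO pages VALUES
        (1, 'Intro', NULL, 2),
        (2, 'Setup', 1, 3),
        (3, 'Usage', 2, NULL);
    """
)

# Two self-joins: one alias for the previous page, one for the next.
triples = conn.execute(
    """
    SELECT p.title, cur.title, n.title
    FROM pages cur
    JOIN pages p ON cur.prev_id = p.id
    JOIN pages n ON cur.next_id = n.id
    """
).fetchall()

print(triples)
```

Only the middle page has both neighbours, so the inner joins return a single triple; switching to LEFT JOIN would also list the first and last pages with a missing neighbour.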
Without the ability for a table to reference itself, we'd have to create as many tables for hierarchy levels as the number of layers in the hierarchy. But since that functionality is available, you join the table to itself and sql treats it as two separate tables, so everything is stored nicely in one place.
Apart from the answers mentioned above (which are very well explained), I would like to add one example so that the use of Self Join can be easily shown.
Suppose you have a table named CUSTOMERS which has the following attributes:
CustomerID, CustomerName, ContactName, City, Country.
Now you want to list all those who are from the "same city" .
You will have to think of a replica of this table so that we can join them on the basis of CITY. The query below will clearly show what it means:
SELECT A.CustomerName AS CustomerName1, B.CustomerName AS CustomerName2,
A.City
FROM Customers A, Customers B
WHERE A.CustomerID <> B.CustomerID
AND A.City = B.City
ORDER BY A.City;
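A small sketch of this query using SQLite from Python, with three invented customers. It also shows a common variation: with `<>` every pair appears twice (once in each order), while `<` on the ID lists each pair once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customers (
        CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, City TEXT
    );
    INSERT INTO Customers VALUES
        (1, 'Ana', 'Lima'), (2, 'Ben', 'Lima'), (3, 'Cal', 'Oslo');
    """
)

# "<>" pairs each customer with every other customer in the same city.
both_orders = conn.execute(
    """
    SELECT A.CustomerName, B.CustomerName, A.City
    FROM Customers A, Customers B
    WHERE A.CustomerID <> B.CustomerID AND A.City = B.City
    ORDER BY A.City, A.CustomerID
    """
).fetchall()

# "<" keeps only one ordering of each pair.
each_pair_once = conn.execute(
    """
    SELECT A.CustomerName, B.CustomerName, A.City
    FROM Customers A, Customers B
    WHERE A.CustomerID < B.CustomerID AND A.City = B.City
    ORDER BY A.City, A.CustomerID
    """
).fetchall()

print(both_orders)
print(each_pair_once)
```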
There are many correct answers here, but there is a variation that is equally correct. You can place your join conditions in the join statement instead of the WHERE clause.
SELECT e1.emp_id AS 'Emp_ID'
, e1.emp_name AS 'Emp_Name'
, e2.emp_id AS 'Manager_ID'
, e2.emp_name AS 'Manager_Name'
FROM Employee e1 RIGHT JOIN Employee e2 ON e1.manager_id = e2.emp_id
Keep in mind sometimes you want e1.manager_id > e2.id
The advantage to knowing both scenarios is sometimes you have a ton of WHERE or JOIN conditions and you want to place your self join conditions in the other clause to keep your code readable.
No one addressed what happens when an Employee does not have a manager. Huh? They are not included in the result set. What if you want to include employees that do not have managers but you don't want incorrect combinations returned?
Try this puppy;
SELECT e1.emp_id AS 'Emp_ID'
, e1.emp_name AS 'Emp_Name'
, e2.emp_id AS 'Manager_ID'
, e2.emp_name AS 'Manager_Name'
FROM Employee e1 LEFT JOIN Employee e2
ON e1.manager_id = e2.emp_id
Self-join is useful when you have to evaluate the data of the table with itself. Which means it'll correlate the rows from the same table.
Syntax: SELECT * FROM TABLE t1, TABLE t2 WHERE t1.columnName = t2.columnName
For example, suppose we want to find the names of the employees whose initial designation equals a current designation. We can solve this using a self join in the following way.
SELECT e1.NAME FROM Employee e1, Employee e2 WHERE e1.intialDesignationId = e2.currentDesignationId
One use case is checking for duplicate records in a database.
SELECT A.Id FROM My_Bookings A, My_Bookings B
WHERE A.Name = B.Name
AND A.Date = B.Date
AND A.Id != B.Id
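A sketch of the duplicate check with SQLite from Python, on invented bookings. One thing it illustrates: when a record is duplicated more than once, the plain self-join lists each Id several times, so adding DISTINCT is often useful.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE My_Bookings (Id INTEGER PRIMARY KEY, Name TEXT, Date TEXT);
    INSERT INTO My_Bookings VALUES
        (1, 'Ann', '2020-01-01'),
        (2, 'Ann', '2020-01-01'),
        (3, 'Bob', '2020-01-02'),
        (4, 'Ann', '2020-01-01');
    """
)

# Each duplicated row matches every other copy, so Ids repeat.
raw_ids = conn.execute(
    """
    SELECT A.Id FROM My_Bookings A, My_Bookings B
    WHERE A.Name = B.Name AND A.Date = B.Date AND A.Id != B.Id
    ORDER BY A.Id
    """
).fetchall()

# DISTINCT collapses the repeats to one row per duplicated booking.
distinct_ids = conn.execute(
    """
    SELECT DISTINCT A.Id FROM My_Bookings A, My_Bookings B
    WHERE A.Name = B.Name AND A.Date = B.Date AND A.Id != B.Id
    ORDER BY A.Id
    """
).fetchall()

print(raw_ids)
print(distinct_ids)
```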
SELF JOIN:
Joining a table to itself is called a self join.
It lets us perform operations on a single table.
When we use a self join we must give the table alias names; otherwise we cannot write the self join.
When we create an alias name on a table, the system internally treats each alias as a virtual copy of the table.
We can create any number of alias names on a table, but each alias name must be different.
Basic rules of self join:
CASE-I: Comparing a single column's values with each other within the table.
CASE-II: Comparing two different columns' values to each other within the table.
Example:
SELECT * from TEST;
ENAME LOC
RICHARD HYDRABAD
JOHN MUMBAI
MATHEW HYDRABAD
BENNY CHENNAI
SELECT T1.ENAME, T1.LOC FROM TEST T1, TEST T2 WHERE T1.LOC = T2.LOC AND T2.ENAME = 'RICHARD';
It's the database equivalent of a linked list/tree, where a row contains a reference in some capacity to another row.
Here is the explanation of self join in layman's terms. A self join is not a different type of join. If you have understood the other types of joins (inner, outer, and cross joins), then a self join should be straightforward. In inner, outer and cross joins, you join 2 or more different tables. In a self join, however, you join the same table with itself. Here we don't have 2 different tables, but treat the same table as a different table using table aliases. If this is still not clear, I would recommend watching the following YouTube videos.
Self Join with an example