Unexpected result of Cross join - sql

These are my tblEmp and tblDept tables (I'm using MS SQL Server 2012). When I use a cross join on these two tables, it gives me a result I didn't expect. I just want to know why the cross join gives this kind of result. Thank you.
tblEmp
ID Name Gender Salary Dept_id
1 abc male 2004 1
2 Tom female 5004 2
3 Sara female 29404 2
4 Jim male 8604 3
5 Lisan male 2078 1
6 Brad male 9804 3
7 Diana female 2095 2
8 Henry male 28204 2
9 Mark male 20821 1
10 Miley female 9456 1
11 Richie male 8604 NULL
12 Lisan female 20776 NULL
tblDept
ID Dept_Name Location
1 IT Mumbai
2 HR Delhi
3 Accounts London
4 OtherDepartment NewYork
This is the cross join query and its output:
select Name, Gender, Salary, Dept_Name
from tblEmp
CROSS JOIN tblDept
where tblEmp.Dept_id is NULL
OUTPUT
Name Gender Salary Dept_Name
Richie male 8604 IT
Richie male 8604 HR
Richie male 8604 Accounts
Richie male 8604 OtherDepartment
Lisan female 20776 IT
Lisan female 20776 HR
Lisan female 20776 Accounts
Lisan female 20776 OtherDepartment
What I expected was something like this
Name Gender Salary Dept_Name
Richie male 8604 NULL
Richie male 8604 NULL
Richie male 8604 NULL
Richie male 8604 NULL
Lisan female 20776 NULL
Lisan female 20776 NULL
Lisan female 20776 NULL
Lisan female 20776 NULL

A CROSS JOIN gives you each row of the first table joined with each row of the second table (a Cartesian product), unless you add a condition in the WHERE clause that connects the two tables (in which case it behaves like an inner join).
Here is a quick demonstration of a cross join:
CREATE TABLE #A
(
    A1 int identity(1,1),
    A2 int
)
CREATE TABLE #B
(
    B1 int identity(1,1),
    B2 int
)
INSERT INTO #A VALUES (1), (2), (NULL)
INSERT INTO #B VALUES (4), (5), (6)
SELECT *
FROM #A
CROSS JOIN #B
Results:
A1 A2 B1 B2
----------- ----------- ----------- -----------
1 1 1 4
2 2 1 4
3 NULL 1 4
1 1 2 5
2 2 2 5
3 NULL 2 5
1 1 3 6
2 2 3 6
3 NULL 3 6
As you can see, each record in table #A is joined with each record of table #B. Now add a filter on #A:
SELECT *
FROM #A
CROSS JOIN #B
WHERE A2 IS NULL
Results:
A1 A2 B1 B2
----------- ----------- ----------- -----------
3 NULL 1 4
3 NULL 2 5
3 NULL 3 6
As you can see, for each record in table #A where A2 is null, you join each record of table #B.

The result is correct: a cross join gives you all combinations of rows from the two tables, tblEmp and tblDept.
Since Dept_Name comes from tblDept, without a WHERE clause you get every possible combination of an employee and a department:
Name Gender Salary Dept_Name
abc male 2004 IT
abc male 2004 HR
abc male 2004 Accounts
abc male 2004 OtherDepartment
Tom female 5004 IT
Tom female 5004 HR
Tom female 5004 Accounts
Tom female 5004 OtherDepartment
... and so on
Richie male 8604 IT
Richie male 8604 HR
Richie male 8604 Accounts
Richie male 8604 OtherDepartment
Lisan female 20776 IT
Lisan female 20776 HR
Lisan female 20776 Accounts
Lisan female 20776 OtherDepartment
That is, by cross-joining, you actually get 12 (from tblEmp) x 4 (from tblDept) = 48 rows.
Your WHERE clause then simply removes everybody except Richie and Lisan, since those two are the only employees whose Dept_id is NULL:
Name Gender Salary Dept_Name
Richie male 8604 IT
Richie male 8604 HR
Richie male 8604 Accounts
Richie male 8604 OtherDepartment
Lisan female 20776 IT
Lisan female 20776 HR
Lisan female 20776 Accounts
Lisan female 20776 OtherDepartment
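The row counts above can be checked with a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server (an assumption made only so the example is runnable anywhere), and it keeps just the columns needed for the count:

```python
import sqlite3

# Illustrative in-memory copy of the question's tables (SQLite, not SQL Server).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblEmp (ID INTEGER, Name TEXT, Dept_id INTEGER)")
conn.execute("CREATE TABLE tblDept (ID INTEGER, Dept_Name TEXT)")

employees = [
    (1, "abc", 1), (2, "Tom", 2), (3, "Sara", 2), (4, "Jim", 3),
    (5, "Lisan", 1), (6, "Brad", 3), (7, "Diana", 2), (8, "Henry", 2),
    (9, "Mark", 1), (10, "Miley", 1), (11, "Richie", None), (12, "Lisan", None),
]
departments = [(1, "IT"), (2, "HR"), (3, "Accounts"), (4, "OtherDepartment")]
conn.executemany("INSERT INTO tblEmp VALUES (?, ?, ?)", employees)
conn.executemany("INSERT INTO tblDept VALUES (?, ?)", departments)

# Every employee paired with every department: 12 x 4 rows.
all_pairs = conn.execute(
    "SELECT COUNT(*) FROM tblEmp CROSS JOIN tblDept"
).fetchone()[0]

# The WHERE clause filters employees only; each survivor keeps all 4 departments.
filtered = conn.execute(
    "SELECT Name, Dept_Name FROM tblEmp CROSS JOIN tblDept "
    "WHERE tblEmp.Dept_id IS NULL ORDER BY tblEmp.ID, tblDept.ID"
).fetchall()

print(all_pairs)      # 48
print(len(filtered))  # 8
```

The filter never touches tblDept, which is why Dept_Name still takes all four department values for Richie and Lisan.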
If you also select the Dept_id column,
select Name, Gender, Salary, Dept_id, Dept_Name
from tblEmp
CROSS JOIN tblDept
where tblEmp.Dept_id is NULL
the result becomes clearer, because you can see that you only get the employees whose Dept_id is NULL:
Name Gender Salary Dept_id Dept_Name
Richie male 8604 NULL IT
Richie male 8604 NULL HR
Richie male 8604 NULL Accounts
Richie male 8604 NULL OtherDepartment
Lisan female 20776 NULL IT
Lisan female 20776 NULL HR
Lisan female 20776 NULL Accounts
Lisan female 20776 NULL OtherDepartment
Your Dept_Name column comes from 4 tblDept entries, not from tblEmp entries.

If you need to show all employees and their departments, you can use LEFT JOIN:
SELECT Name, Gender, Salary, Dept_Name
FROM
tblEmp AS E
LEFT JOIN
tblDept AS D
ON E.Dept_id = D.ID
Result:
Name Gender Salary Dept_Name
abc male 2004 IT
Tom female 5004 HR
Sara female 29404 HR
Jim male 8604 Accounts
Lisan male 2078 IT
Brad male 9804 Accounts
Diana female 2095 HR
Henry male 28204 HR
Mark male 20821 IT
Miley female 9456 IT
Richie male 8604 NULL
Lisan female 20776 NULL
Or, if you need to show all employees and all departments, you can use FULL JOIN:
SELECT Name, Gender, Salary, Dept_Name
FROM
tblEmp AS E
FULL JOIN
tblDept AS D
ON E.Dept_id = D.ID
Result:
Name Gender Salary Dept_Name
abc male 2004 IT
Tom female 5004 HR
Sara female 29404 HR
Jim male 8604 Accounts
Lisan male 2078 IT
Brad male 9804 Accounts
Diana female 2095 HR
Henry male 28204 HR
Mark male 20821 IT
Miley female 9456 IT
Richie male 8604 NULL
Lisan female 20776 NULL
NULL NULL NULL OtherDepartment
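A minimal sketch of the difference between the two joins, again using Python's sqlite3 as a stand-in for SQL Server and only four of the employees. Because older SQLite versions have no FULL JOIN, the full join is emulated here as the LEFT JOIN rows plus the departments that no employee references; that emulation is an assumption of this sketch, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblEmp (ID INTEGER, Name TEXT, Dept_id INTEGER)")
conn.execute("CREATE TABLE tblDept (ID INTEGER, Dept_Name TEXT)")
conn.executemany(
    "INSERT INTO tblEmp VALUES (?, ?, ?)",
    [(1, "abc", 1), (2, "Tom", 2), (4, "Jim", 3), (11, "Richie", None)],
)
conn.executemany(
    "INSERT INTO tblDept VALUES (?, ?)",
    [(1, "IT"), (2, "HR"), (3, "Accounts"), (4, "OtherDepartment")],
)

# LEFT JOIN: every employee appears once; unmatched ones get a NULL Dept_Name.
left_rows = conn.execute(
    "SELECT E.Name, D.Dept_Name FROM tblEmp AS E "
    "LEFT JOIN tblDept AS D ON E.Dept_id = D.ID ORDER BY E.ID"
).fetchall()

# Emulated FULL JOIN: add the departments that no employee points to.
unmatched_depts = conn.execute(
    "SELECT NULL, D.Dept_Name FROM tblDept AS D "
    "WHERE NOT EXISTS (SELECT 1 FROM tblEmp AS E WHERE E.Dept_id = D.ID) "
    "ORDER BY D.ID"
).fetchall()
full_rows = left_rows + unmatched_depts

print(left_rows)
print(full_rows)
```

Richie comes back with a NULL department in both results, and only the full join adds the extra row for OtherDepartment.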

If you do want a NULL row for each department per individual that has a NULL dept_id, e.g.
Name Gender Salary Dept_Name
Richie male 8604 NULL
Richie male 8604 NULL
Richie male 8604 NULL
Richie male 8604 NULL
Lisan female 20776 NULL
Lisan female 20776 NULL
Lisan female 20776 NULL
Lisan female 20776 NULL
you could run this:
select Name, Gender, Salary, NULL AS Dept_Name
from tblEmp
CROSS JOIN tblDept
where tblEmp.Dept_id is NULL

Related

Sql distinct group of rows

In SQL, I want to get the distinct sets of rows: each identical group of Characteristic and Value pairs should appear only once.
The number of Characteristic rows per name can range from one to 10.
Table:
Name Characteristic Value
Mary eyes Blu
Mary hair blonde
Mary Sex Female
Jhon eyes Black
Jhon Hair Black
Jhon Sex Male
Jhon Nation Franch
Bill eyes Blu
Bill Hair Blond
Bill Sex Male
Will eyes Green
Will Hair Blond
Will Sex Male
Will Nation Spain
Lilly eyes Blu
Lilly Hair Blonde
Lilly Sex Female
mark eyes Black
mark Hair Black
mark Sex Male
mark Nation Franch
Anna eyes Blu
Anna Hair Blonde
Anna Sex Female
Antonio eyes Black
Antonio Hair Black
Antonio Sex Male
Antonio Nation Franch
The result that I want to achieve:
Group Characteristic Value
1 eyes Blu
1 Hair Blonde
1 Sex Female
2 eyes Black
2 Hair Black
2 Sex Male
2 Nation Franch
3 eyes Blu
3 Hair Blond
3 Sex Male
4 eyes Green
4 Hair Blond
4 Sex Male
4 Nation Spain
And finally, if possible:
Name Characteristic Value Group
Mary eyes Blu 1
Mary Hair Blonde 1
Mary Sex Female 1
Jhon eyes Black 2
Jhon Hair Black 2
Jhon Sex Male 2
Jhon Nation Franch 2
Bill eyes Blu 3
Bill Hair Blond 3
Bill Sex Male 3
Will eyes Green 4
Will Hair Blond 4
Will Sex Male 4
Will Nation Spain 4
Lilly eyes Blu 1
Lilly Hair Blonde 1
Lilly Sex Female 1
mark eyes Black 2
mark Hair Black 2
mark Sex Male 2
mark Nation Franch 2
Anna eyes Blu 1
Anna Hair Blonde 1
Anna Sex Female 1
Antonio eyes Black 2
Antonio Hair Black 2
Antonio Sex Male 2
Antonio Nation Franch 2
You can use STRING_AGG to join all the characteristics together, then use ROW_NUMBER and DENSE_RANK to count them. Then you re-join that back to the base table.
For your first query, you can do it like this.
SELECT
    Groups.GroupId,
    t.Characteristic,
    t.Value
FROM YourTable t
JOIN (
    SELECT
        t.Name,
        t.GroupDefinition,
        GroupId = DENSE_RANK() OVER (ORDER BY t.GroupDefinition),
        RowId = ROW_NUMBER() OVER (PARTITION BY t.GroupDefinition ORDER BY t.Name)
    FROM (
        SELECT
            t.Name,
            GroupDefinition = STRING_AGG(Characteristic + ':' + Value, '|')
                WITHIN GROUP (ORDER BY t.Characteristic)
        FROM YourTable t
        GROUP BY
            t.Name
    ) t
) Groups ON Groups.Name = t.Name
WHERE Groups.RowId = 1;
The second query is as follows.
SELECT
    Groups.GroupId,
    t.*
FROM YourTable t
JOIN (
    SELECT
        t.Name,
        t.GroupDefinition,
        GroupId = DENSE_RANK() OVER (ORDER BY t.GroupDefinition),
        RowId = ROW_NUMBER() OVER (PARTITION BY t.GroupDefinition ORDER BY t.Name)
    FROM (
        SELECT
            t.Name,
            GroupDefinition = STRING_AGG(Characteristic + ':' + Value, '|')
                WITHIN GROUP (ORDER BY t.Characteristic)
        FROM YourTable t
        GROUP BY
            t.Name
    ) t
) Groups ON Groups.Name = t.Name;
Another option would be to aggregate it into a JSON or XML format, then shred it back out without re-joining the base table.
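The idea behind the queries above (collapse each name's characteristics into one signature, then number the distinct signatures) can be sketched outside the database. This is plain Python on a subset of the question's rows, not the STRING_AGG query itself. Two things are assumptions of the sketch: the helper name assign_groups is made up for illustration, and values are compared case-insensitively, since the sample data mixes "hair"/"Hair" and "blonde"/"Blonde" and SQL Server's default collations usually ignore case.

```python
rows = [
    ("Mary", "eyes", "Blu"), ("Mary", "hair", "blonde"), ("Mary", "Sex", "Female"),
    ("Jhon", "eyes", "Black"), ("Jhon", "Hair", "Black"),
    ("Jhon", "Sex", "Male"), ("Jhon", "Nation", "Franch"),
    ("Bill", "eyes", "Blu"), ("Bill", "Hair", "Blond"), ("Bill", "Sex", "Male"),
    ("Lilly", "eyes", "Blu"), ("Lilly", "Hair", "Blonde"), ("Lilly", "Sex", "Female"),
    ("mark", "eyes", "Black"), ("mark", "Hair", "Black"),
    ("mark", "Sex", "Male"), ("mark", "Nation", "Franch"),
]


def assign_groups(rows):
    """Map each name to a group number shared by names with identical pairs."""
    # Collect the case-folded (characteristic, value) pairs per name.
    pairs_by_name = {}
    for name, characteristic, value in rows:
        pairs_by_name.setdefault(name, set()).add(
            (characteristic.lower(), value.lower())
        )

    group_ids = {}  # signature -> group number, in order of first appearance
    result = {}
    for name, pairs in pairs_by_name.items():
        signature = tuple(sorted(pairs))  # plays the role of GroupDefinition
        if signature not in group_ids:
            group_ids[signature] = len(group_ids) + 1
        result[name] = group_ids[signature]
    return result


groups = assign_groups(rows)
print(groups)
```

Here groups are numbered in order of first appearance, which happens to match the numbering in the question; DENSE_RANK in the SQL version numbers them by the sort order of GroupDefinition instead, so the group numbers can differ while the grouping itself is the same.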

Find top 2 employees with longest work experience for each salary range

I have two tables.
Salary_Grade
GRADE Min_Salary Max_Salary
12 2100 3600
13 3601 4200
14 4201 6000
15 6001 9000
16 9001 30000
Employees
EMPLOYEE_NO NAME HIRE_DATE SALARY
1007 SMITH 2016-02-20 00:00:00.000 15000
2340 JOHNSON 2018-02-07 00:00:00.000 3300
2341 WILLIAMS 2019-10-11 00:00:00.000 3750
2345 BROWN 2018-01-01 00:00:00.000 8925
2355 JONES 2015-07-13 00:00:00.000 8550
3434 GARCIA 2011-08-11 00:00:00.000 7350
4356 MILLER 2013-10-12 00:00:00.000 3750
4455 DAVIS 2000-04-30 00:00:00.000 2850
4456 WILSON 1980-03-03 00:00:00.000 9000
4467 ANDERSON 2001-07-28 00:00:00.000 3900
5643 THOMAS 2011-03-10 00:00:00.000 4800
6538 TAYLOR 2011-08-11 00:00:00.000 9000
6578 MOORE 2020-11-27 00:00:00.000 2400
8900 LEE 2015-03-03 00:00:00.000 4500
My task is to display the two employees with the longest work experience for each GRADE (the grade follows from the salary range in SALARY_GRADE and the corresponding SALARY in the EMPLOYEE table).
Expected result:
GRADE NAME EXPERIENCE(DAYS)
12 JOHNSON 1359
12 DAVIS 7851
13 MILLER 2938
13 ANDERSON 7397
14 THOMAS 3885
14 LEE 2431
15 WILSON 15214
15 TAYLOR 3731
16 SMITH 2077
I created table EMPLOYEE_SALGRADE with employee id and salary grades connected to them
CREATE TABLE [EMPLOYEE_SALGRADE](
[GRADE_NO] [int] not null,
[EMPLOYEE_NO] [int] not null,
FOREIGN KEY (Grade_NO) REFERENCES Salary_Grade(grade),
FOREIGN KEY (Employee_NO) REFERENCES Employee(Employee_NO))
insert into EMPLOYEE_SALGRADE(GRADE_NO, EMPLOYEE_NO)
SELECT s.grade, e.EMPLOYEE_NO FROM employee as e, salary_grade as s
WHERE e.salary BETWEEN s.min_salary AND s.max_salary
order by e.salary
and added column Experience to Employee table
Alter table Employee
add Experience as DATEDIFF(dd,Hire_date,getdate())
Now I'm trying with subquery
select s.GRADE, e.NAME, e.Experience
from SALARY_GRADE as S
join EMPLOYEE_SALGRADE AS ES
ON S.GRADE=es.GRADE_NO
join EMPLOYEE as e
on es.Employee_no=e.EMPLOYEE_NO
where Experience in (select top 2(experience) from EMPLOYEE group by Experience)
But this does not give the correct result.
I did some research, and the correct answer is:
select * from ( select s.GRADE, e.NAME, e.Experience, row_number() over (partition by s.grade order by e.experience desc) as employee_rank
from SALARY_GRADE as S
join EMPLOYEE_SALGRADE AS ES
ON S.GRADE=es.GRADE_NO
join EMPLOYEE as e
on es.Employee_no=e.EMPLOYEE_NO) ranks
where employee_rank <= 2;
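The ROW_NUMBER approach can be tried on a reduced data set with Python's sqlite3 (assuming a SQLite build of 3.25 or later, which is when window functions were added). This sketch differs from the query above in three ways, all of them assumptions made for the example: it joins employees to grades directly with BETWEEN instead of going through EMPLOYEE_SALGRADE, it orders by hire date (earliest hire means longest experience) so the output does not depend on today's date, and it adds employee_no as a tie-breaker, because employees hired on the same day would otherwise be ranked in an arbitrary order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Salary_Grade (grade INTEGER, min_salary INTEGER, max_salary INTEGER)"
)
conn.execute(
    "CREATE TABLE Employee (employee_no INTEGER, name TEXT, hire_date TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Salary_Grade VALUES (?, ?, ?)",
    [(12, 2100, 3600), (13, 3601, 4200), (16, 9001, 30000)],
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [
        (1007, "SMITH", "2016-02-20", 15000),
        (2340, "JOHNSON", "2018-02-07", 3300),
        (2341, "WILLIAMS", "2019-10-11", 3750),
        (4356, "MILLER", "2013-10-12", 3750),
        (4455, "DAVIS", "2000-04-30", 2850),
        (4467, "ANDERSON", "2001-07-28", 3900),
        (6578, "MOORE", "2020-11-27", 2400),
    ],
)

# Rank employees inside each grade by hire date, then keep ranks 1 and 2.
top_two = conn.execute(
    """
    SELECT grade, name
    FROM (
        SELECT s.grade, e.name, e.hire_date,
               ROW_NUMBER() OVER (
                   PARTITION BY s.grade
                   ORDER BY e.hire_date, e.employee_no
               ) AS employee_rank
        FROM Salary_Grade AS s
        JOIN Employee AS e
          ON e.salary BETWEEN s.min_salary AND s.max_salary
    ) AS ranks
    WHERE employee_rank <= 2
    ORDER BY grade, hire_date
    """
).fetchall()

print(top_two)
```

Grade 16 has only one employee in this sample, so it contributes a single row, just as in the expected result.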

How to select data in SQL based on a filter which changes if there is no data in a specific table column?

I have tables similar to the three below. I need to join the first two tables on ID, and then join the third table on second name. However, the last table needs a filter: the city should be London, unless age is empty, in which case the city should be Manchester.
I tried the code below using a CASE expression, but it is not working. I am new to SQL, so I am not sure how to combine a WHERE clause with a condition whose filter value depends on whether a different column (not the one being filtered) contains data. I am using Toad for Oracle.
FIRST.NAME.TABLE
ID FIRST_NAME ENTRY_DATE
1 JOHN 09/09/2019
2 NICOLA 09/09/2019
3 PATRICK 05/09/2019
4 JOAN 01/09/2019
5 JAKE 09/09/2019
6 AMELIA 01/09/2019
7 CAMERON 09/09/2019
SECOND.NAME.TABLE
ID SECOND_NAME ENTRY_DATE
1 BROWN 09/09/2019
2 SMITH 09/09/2019
3 COLE 05/09/2019
4 HOUSTON 01/09/2019
5 FARRIS 09/09/2019
6 HATHAWAY 01/09/2019
7 JONES 09/09/2019
CITY.AGE.TABLE
CITY SECOND_NAME AGE
LONDON BROWN 24.00
LONDON SMITH
MANCHESTER COLE 30.00
MANCHESTER HOUSTON 66.00
LONDON FARRIS
LONDON HATHAWAY 32.00
GLASGOW JONES 28.00
MANCHESTER SMITH 32.00
LONDON FARRIS 62.00
SELECT FN.ID,
FN.FIRST_NAME,
SN.SECOND_NAME,
AC.CITY,
AC.AGE
FROM FIRST.NAME.TABLE AS FN
INNER JOIN SECOND.NAME.TABLE SN
ON FN.ID=SN.ID
INNER JOIN CITY.AGE.TABLE AS AC
ON SN.SECOND_NAME=AC.SECOND_NAME
WHERE FN.ENTRY_DATE='09-SEP-19'
AND SN.ENTRY_DATE='09-SEP-19'
AND (CASE WHEN AC.CITY='LONDON' AND AC.AGE IS NOT NULL
THEN AC.CITY='LONDON'
ELSE AC.CITY='MANCHESTER' END)
You can express this as boolean logic:
WHERE FN.ENTRY_DATE = DATE '2019-09-09' AND
SN.ENTRY_DATE = DATE '2019-09-09' AND
(AC.AGE IS NOT NULL AND AC.CITY = 'LONDON' OR
AC.AGE IS NULL AND AC.CITY = 'MANCHESTER'
)
This answers your question about how to implement the logic using SQL. However, I'm not sure that is the logic that you really want. I speculate that you really want a LEFT JOIN to the age table.
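A small sketch of how that boolean condition behaves, using Python's sqlite3 and a cut-down version of the city/age table. The table name city_age and the row for DOE are hypothetical, added only so that the Manchester branch matches something; the question's own data has no Manchester row with an empty age.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city_age (city TEXT, second_name TEXT, age REAL)")
conn.executemany(
    "INSERT INTO city_age VALUES (?, ?, ?)",
    [
        ("LONDON", "BROWN", 24.0),
        ("LONDON", "SMITH", None),
        ("MANCHESTER", "COLE", 30.0),
        ("MANCHESTER", "SMITH", 32.0),
        ("MANCHESTER", "DOE", None),  # hypothetical row, not in the question's data
    ],
)

# Keep London rows that have an age, or Manchester rows that have none.
kept = conn.execute(
    """
    SELECT city, second_name
    FROM city_age
    WHERE (age IS NOT NULL AND city = 'LONDON')
       OR (age IS NULL AND city = 'MANCHESTER')
    ORDER BY second_name
    """
).fetchall()

print(kept)
```

Note that the condition is evaluated row by row: SMITH's London row is dropped because its age is empty, and SMITH's Manchester row is dropped because its age is filled in.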

Obtain percentage of values in a column

I have a table called Director, which looks like
DirectorID FirstName FamilyName FullName DoB Gender
1 Steven Spielberg Steven Spielberg 1946-12-18 Male
2 Joel Coen Joel Coen 1954-11-29 Male
3 Ethan Coen Ethan Coen 1957-09-21 Male
4 George Lucas George Lucas 1944-05-14 Male
5 Ang Lee Ang Lee 1954-10-23 Male
6 Martin Scorsese Martin Scorsese 1942-11-17 Male
7 Mimi Leder Mimi Leder 1952-01-26 Female
I am trying to work out the percentage of female directors.
I can work out the number of male and female directors using:
SELECT count(*) as myMale from Director
where Gender = 'Male'
SELECT count(*) as myFemale from Director
where Gender = 'Female'
But I am having trouble combining them to obtain the percentage of female directors.
I am looking for the result of 14.3%, which is calculated using:
Total Female Directors / (Total Male Directors + Total Female Directors)
or
1/(6+1)
How would I do this with SQL?
A simple method uses aggregation. Assuming every director is recorded as either male or female, a conditional aggregation suffices:
select avg(case when gender = 'Female' then 1.0 else 0 end) as ratio_female
from Director;
If the column can hold other values and you want to restrict the calculation to male and female only, add where gender in ('Female', 'Male').
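The conditional average can be checked against the question's data with Python's sqlite3 (used here only as a convenient stand-in for the asker's database; the table is reduced to the columns the calculation needs):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Director (DirectorID INTEGER, FullName TEXT, Gender TEXT)")
conn.executemany(
    "INSERT INTO Director VALUES (?, ?, ?)",
    [
        (1, "Steven Spielberg", "Male"),
        (2, "Joel Coen", "Male"),
        (3, "Ethan Coen", "Male"),
        (4, "George Lucas", "Male"),
        (5, "Ang Lee", "Male"),
        (6, "Martin Scorsese", "Male"),
        (7, "Mimi Leder", "Female"),
    ],
)

# Averaging 1.0 for female rows and 0 for all others gives the female share.
ratio = conn.execute(
    "SELECT AVG(CASE WHEN Gender = 'Female' THEN 1.0 ELSE 0 END) FROM Director"
).fetchone()[0]

percent_female = round(ratio * 100, 1)
print(percent_female)  # 14.3
```

Using 1.0 rather than 1 matters in databases that do integer averaging on integer inputs (SQL Server is one of them), where the result would otherwise be truncated to 0.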

List the Id who appeared once only in Relational Algebra

Let's say there's a table called Winner, with 3 attributes: Name, Gender and Id.
Name Gender Id
Kevin Male 8
Kevin Male 8
Benny Male 31
Jenny Female 7
Louie Male 4
Peter Male 11
Kevin Male 2
Jenny Female 7
Jenny Female 7
Chris Male 23
Louie Female 14
Some entries are actually two different people with the same name, and some people share a name but have a different gender, so the Id is the unique value that identifies each person. If I want to list all the Ids that appear only once in the list, I am thinking of doing something like this:
Am I expressing it correctly?
I don't know what your formula is trying to say, but in SQL you can achieve the result you want with a GROUP BY query:
SELECT Id, COUNT(Id) AS idCount
FROM Winner
GROUP BY Id
HAVING COUNT(Id) = 1
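Run against the Winner rows from the question, the GROUP BY / HAVING query behaves as follows. This is a sketch using Python's sqlite3; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Winner (Name TEXT, Gender TEXT, Id INTEGER)")
conn.executemany(
    "INSERT INTO Winner VALUES (?, ?, ?)",
    [
        ("Kevin", "Male", 8), ("Kevin", "Male", 8), ("Benny", "Male", 31),
        ("Jenny", "Female", 7), ("Louie", "Male", 4), ("Peter", "Male", 11),
        ("Kevin", "Male", 2), ("Jenny", "Female", 7), ("Jenny", "Female", 7),
        ("Chris", "Male", 23), ("Louie", "Female", 14),
    ],
)

# Group the rows by Id and keep only the groups that contain a single row.
once_only = [
    row[0]
    for row in conn.execute(
        "SELECT Id FROM Winner GROUP BY Id HAVING COUNT(Id) = 1 ORDER BY Id"
    )
]

print(once_only)  # [2, 4, 11, 14, 23, 31]
```

Ids 7 and 8 are excluded because they occur three times and twice respectively.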