SQL Query: Gathering users from 3 different databases

I am new to SQL and stackoverflow, but I was hoping someone could help me out with the following problem.
There are three different databases for a company, each of which represents a different division within the company.
Each database contains employee information in the employee table:
id, first name, last name, division.
There are three divisions. Each employee can be in more than one division. The id is unique to the employee; an employee in multiple divisions has the same id in each table.
How can I write a query that selects each unique employee and the divisions that they work in (in one row)?
The results from the following code are incomplete: a few employees are missing and unaccounted for.
INSERT INTO #temp1 (id, [first name], [last name], division1)
SELECT id, [first name], [last name], division FROM db1.table WHERE active_flag = 1 AND termination_date IS NULL
INSERT INTO #temp2 (id, [first name], [last name], division2)
SELECT id, [first name], [last name], division FROM db2.table WHERE active_flag = 1 AND termination_date IS NULL
INSERT INTO #temp3 (id, [first name], [last name], division3)
SELECT id, [first name], [last name], division FROM db3.table WHERE active_flag = 1 AND termination_date IS NULL
INSERT INTO #uniqueids (id, [first name], [last name])
SELECT id, [first name], [last name] FROM #temp1
UNION SELECT id, [first name], [last name] FROM #temp2
UNION SELECT id, [first name], [last name] FROM #temp3
SELECT #uniqueids.id, #uniqueids.[first name], #uniqueids.[last name],
division1+division2+division3 AS divisions
FROM #uniqueids
LEFT JOIN #temp1 ON #uniqueids.id=#temp1.id
LEFT JOIN #temp2 ON #uniqueids.id=#temp2.id
LEFT JOIN #temp3 ON #uniqueids.id=#temp3.id
WHERE #uniqueids.id NOT LIKE '%default%' AND #uniqueids.id NOT LIKE 'S%'
*** I edited the code to make it more clear
I know that certain employees are unaccounted for because I was given a result set that has a list of 778 unique employees.
Example row:
[id, first name, last name, divisions]
[asd1234, Julie, Wong, 1 2 3]
Inserting into and selecting from #uniqueids displays 900 employees, because of ids that contain "default" or start with "S", which do not count. These are addressed in the last section of the code, which I have just now included.
My current result set has 770 unique employees after running the entire query, which means that I am missing 8.
I am using SQL Server 2014.

Related

How to query how many different IDs use the same column value?

I have this homework assignment where I'm attempting to query a table to find the id numbers that are all using the same column value, let's say last name in this case. I'd like to find the ids that use the same last name more than once, and have a column that tells me the total number of unique IDs that used that same last name.
SELECT id, COUNT(*) as ID_count
FROM [table]
WHERE l_name IN
(
SELECT l_name
FROM [table]
GROUP BY l_name HAVING COUNT(*)>1
)
GROUP BY id;
This is what I have so far. It grants me the ID number, but the count(*) is not what I'm going for. What I'm instead trying to get is how many unique IDs have "Smith" as their last name, instead of all the occurrences of one specific ID that has used "Smith".
I've tried different things but I feel like I'm at a roadblock. Any hints or tips are nice; I don't need this problem solved 100%, but I feel as if I can't get past the idea of using count(*).
Thanks all.
It sounds like you were already there WITHIN the inner query. Just add the count to it for the output.
SELECT
    t1.id,
    t1.l_name,
    max( PQ.UniqCount ) UniqCount,
    COUNT(*) as countForThisSingleID
FROM
    [table] t1
    JOIN
    ( SELECT
          t.l_name,
          COUNT( DISTINCT t.ID ) as UniqCount
      FROM
          [table] t
      GROUP BY
          t.l_name
      HAVING
          COUNT( DISTINCT t.ID ) > 1 ) PQ
    on t1.l_name = PQ.l_name
group by
    t1.id,
    t1.l_name
order by
    t1.l_name,
    t1.id
So by doing a COUNT( DISTINCT ) in the inner pre-query (alias PQ), you get a count of distinct IDs for each L_Name. I don't know whether your [table] has multiple entries for the same ID, so I am applying DISTINCT to be safe, and the same goes for the HAVING clause. Either way, the inner pre-query now gets the overall distinct counts for a given L_Name value.
Now, doing a JOIN to the outer table on that L_Name brings the corresponding count into the result, along with the l_name it qualified against. So if you have a table with 18 DISTINCT ID instances of John, 37 of Karen, and 11 of Mike, your inner query will get those. Joined to the outer table, the output lists EACH John ID, then all the Karen and Mike IDs.
The COUNT(*) in the outer query is the number of times that one ID (and name) appears in the table. So if the table had ID = 5, L_Name = John, and ID 5 appeared 3 times in the table, the output of his record might look like
ID   L_Name  countForThisSingleID  UniqCount
5    John    3                     18
72   John    8                     18
127  John    2                     18
etc...
Similarly the output would include all Karen's and Mike's within the table (and any others that qualify).
Again, I don't know whether your [table] has a single row per ID (such as a master customer lookup table) or many rows per ID (such as an order table, where one person's ID may appear more than once), so I'm not positive what final answer you are looking for.
But I think I have given you a bunch to chew on and run with.
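To make the two counts concrete, here is a minimal sketch that runs the same shape of query through Python's built-in sqlite3 module. SQLite is only a stand-in for whatever engine the assignment uses, and the table name people, the snake_case column aliases, and the sample rows are all made up for illustration.

```python
import sqlite3

# Hypothetical sample data: (id, l_name). An id may appear on several rows.
rows = [
    (5, "John"), (5, "John"), (5, "John"),
    (72, "John"),
    (127, "John"),
    (9, "Karen"), (9, "Karen"),  # only one distinct id, so Karen is filtered out
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, l_name TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?)", rows)

query = """
SELECT t1.id,
       t1.l_name,
       MAX(pq.uniq_count) AS uniq_count,        -- distinct ids sharing the name
       COUNT(*)           AS count_for_this_id  -- rows for this one id
FROM people t1
JOIN (SELECT l_name, COUNT(DISTINCT id) AS uniq_count
      FROM people
      GROUP BY l_name
      HAVING COUNT(DISTINCT id) > 1) pq
  ON t1.l_name = pq.l_name
GROUP BY t1.id, t1.l_name
ORDER BY t1.l_name, t1.id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With this sample data every John row reports 3 distinct ids, while the last column differs per id. Karen does not appear because only one id uses that name.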

SQL CASE Statement - Return first match (ignore other matches)

I have a simple problem - I have two tables (table A and B) with records for staff members in each. A staff member may be reflected in both tables. I'm trying to put together a case statement that returns the first match for an employee from Table A and then exits the case statement (i.e., do not try to find that same employee in Table B). Right now, my current code returns matches from both Table A and Table B for that employee. How can I stop this?
How about something like this:
with AllStaff as
(
    select 1 as Level, StaffId, Name
    from TableA
    union all
    select 2 as Level, StaffId, Name
    from TableB
),
DistinctStaff as
(
    select distinct StaffId from AllStaff
)
-- For each distinct StaffId, keep only the lowest-Level row (TableA wins over TableB).
-- No GROUP BY is needed: outer apply with top(1) already yields one row per StaffId.
select s.StaffId, sRec.*
from DistinctStaff as s
outer apply
    (select top(1) * from AllStaff as a where a.StaffId = s.StaffId order by a.Level) as sRec
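If you want to try the "first table wins" idea without a SQL Server instance, here is a rough sketch using Python's sqlite3 module. SQLite has no OUTER APPLY or TOP, so this swaps in a different technique, ROW_NUMBER() partitioned by StaffId and ordered by Level, which picks the same row. It needs an SQLite build with window functions (3.25 or later), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (StaffId INTEGER, Name TEXT);
CREATE TABLE TableB (StaffId INTEGER, Name TEXT);
-- Hypothetical data: staff member 2 exists in both tables.
INSERT INTO TableA VALUES (1, 'Ann (A)'), (2, 'Bill (A)');
INSERT INTO TableB VALUES (2, 'Bill (B)'), (3, 'Chris (B)');
""")

query = """
WITH AllStaff AS (
    SELECT 1 AS Level, StaffId, Name FROM TableA
    UNION ALL
    SELECT 2 AS Level, StaffId, Name FROM TableB
),
Ranked AS (
    -- rn = 1 marks the lowest Level per StaffId, so TableA beats TableB
    SELECT Level, StaffId, Name,
           ROW_NUMBER() OVER (PARTITION BY StaffId ORDER BY Level) AS rn
    FROM AllStaff
)
SELECT StaffId, Name, Level
FROM Ranked
WHERE rn = 1
ORDER BY StaffId
"""
first_matches = conn.execute(query).fetchall()
for row in first_matches:
    print(row)
```

Staff member 2 comes back once, from TableA. Staff member 3 exists only in TableB, so that row is used instead.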

join and group by in SQL

I have two tables that I was going to join, but I understand it's more efficient to use CREATE VIEW. This is what I have:
CREATE OR REPLACE VIEW view0_joinedTablesGrouped
AS
Select table1.*,table2.*
FROM table1
inner join table2 on table1.col =
table2.matchingcol
group by table2.matchingcol;
which causes the following error:
ERROR: column "table1.col" must appear in the GROUP BY clause or be
used in an aggregate function
LINE 3: Select table.*,table2.*
Group By cannot do what you are trying to do.
Consider a simple table:
Name Age
-------
Ann 10
Bill 10
Chris 11
If you try to group by age with:
Select * from Table group by Age
What, exactly, do you expect to appear in the Name column for Age=10? Ann, or Bill or both or neither or ....? There is no good answer.
So, when you group by, every column in the output has to be either one of the grouping columns or an aggregate, that is, a function of every row in the group.
So these are valid:
Select Age, Count(*) from Table group by Age
Select Age, Max( Length(Name)) from Table group by Age
Select Age, Max(Name) from Table group by Age
But this is impossible to do, and isn't valid:
Select Age,Name from Table group by Age
So your select * is the problem -- you can't just select column values because when you group by there's a whole group of column values for every output row, and you can't stuff all those values into one column of one row.
As for using a view, @systemjack's comment is correct.
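The valid queries on the Ann/Bill/Chris table can be run as a quick sketch with Python's sqlite3 module. The table name people is an assumption (the text just calls it Table). Note that SQLite is unusually lenient and accepts a bare non-grouped column, so it cannot be used to reproduce the error itself; only the valid forms are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (Name TEXT, Age INTEGER);
INSERT INTO people VALUES ('Ann', 10), ('Bill', 10), ('Chris', 11);
""")

# Each output column is either the grouping column (Age) or an aggregate.
counts = conn.execute(
    "SELECT Age, COUNT(*) FROM people GROUP BY Age ORDER BY Age"
).fetchall()
max_len = conn.execute(
    "SELECT Age, MAX(LENGTH(Name)) FROM people GROUP BY Age ORDER BY Age"
).fetchall()
max_names = conn.execute(
    "SELECT Age, MAX(Name) FROM people GROUP BY Age ORDER BY Age"
).fetchall()

print(counts)     # one row per Age with the group size
print(max_len)    # longest name length per Age
print(max_names)  # alphabetically last name per Age
```

For Age 10 the aggregate has to reduce Ann and Bill to a single value, which is exactly the decision a bare Name column leaves open.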

Check if tables are identical using SQL in Oracle

I was asked this question during an interview for a Junior Oracle Developer position. The interviewer admitted it was a tough one:
Write a query/queries to check if the table 'employees_hist' is an exact copy of the table 'employees'. Any ideas how to go about this?
EDIT: Consider that tables can have duplicate records so a simple MINUS will not work in this case.
EXAMPLE
EMPLOYEES
NAME
--------
Jack Crack
Jack Crack
Jill Hill
These two would not be identical.
EMPLOYEES_HIST
NAME
--------
Jack Crack
Jill Hill
Jill Hill
If the tables have the same columns, you can use this; this will return no rows if the rows in both tables are identical:
(
select * from test_data_01
minus
select * from test_data_02
)
union
(
select * from test_data_02
minus
select * from test_data_01
);
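Here is a small sketch of why the EDIT in the question matters for this approach. It uses Python's sqlite3 module, where EXCEPT plays the role of Oracle's MINUS (an assumption of this sketch, not Oracle itself); both operators discard duplicate rows, so the example tables from the question look identical to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (name TEXT);
CREATE TABLE employees_hist (name TEXT);
INSERT INTO employees VALUES ('Jack Crack'), ('Jack Crack'), ('Jill Hill');
INSERT INTO employees_hist VALUES ('Jack Crack'), ('Jill Hill'), ('Jill Hill');
""")

# Symmetric difference: rows in one table but not the other, both directions.
query = """
SELECT * FROM (SELECT name FROM employees
               EXCEPT
               SELECT name FROM employees_hist)
UNION
SELECT * FROM (SELECT name FROM employees_hist
               EXCEPT
               SELECT name FROM employees)
"""
diff = conn.execute(query).fetchall()
print(diff)  # empty, even though the duplicate counts differ
```

An empty result here only says both tables contain the same set of distinct rows, not that each row appears the same number of times.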
Identical regarding what? Metadata or the actual table data too?
Anyway, use MINUS.
select * from table_1
MINUS
select * from table_2
So, if the two tables are really identical, i.e. the metadata and the actual data, it would return no rows. Else, it would prove that the data is different.
If you receive an error, it would mean the metadata itself is different.
Update: if the data is not the same because one of the tables has duplicates,
just select the unique records from that table and apply MINUS against the other table.
One possible solution, which caters for duplicates, is to create a subquery which does a UNION on the two tables, and includes the number of duplicates contained within each table by grouping on all the columns. The outer query can then group on all the columns, including the row count column. If the tables match, there should be no rows returned:
create table employees (name varchar2(100));
create table employees_hist (name varchar2(100));
insert into employees values ('Jack Crack');
insert into employees values ('Jack Crack');
insert into employees values ('Jill Hill');
insert into employees_hist values ('Jack Crack');
insert into employees_hist values ('Jill Hill');
insert into employees_hist values ('Jill Hill');
with both_tables as
(select name, count(*) as row_count
from employees
group by name
union all
select name, count(*) as row_count
from employees_hist
group by name)
select name, row_count from both_tables
group by name, row_count having count(*) <> 2;
gives you:
Name Row_count
Jack Crack 1
Jack Crack 2
Jill Hill 1
Jill Hill 2
This tells you that both names appear once in one table and twice in the other, and therefore the tables don't match.
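The same query can be tried outside Oracle as a minimal sketch with Python's sqlite3 module. The helper name count_mismatches is made up for this example, and SQLite's TEXT type stands in for varchar2.

```python
import sqlite3

QUERY = """
WITH both_tables AS (
    SELECT name, COUNT(*) AS row_count FROM employees GROUP BY name
    UNION ALL
    SELECT name, COUNT(*) AS row_count FROM employees_hist GROUP BY name
)
SELECT name, row_count
FROM both_tables
GROUP BY name, row_count
HAVING COUNT(*) <> 2          -- a (name, count) pair must occur in both tables
ORDER BY name, row_count
"""

def count_mismatches(employees, employees_hist):
    """Return the (name, row_count) pairs that do not appear in both tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (name TEXT)")
    conn.execute("CREATE TABLE employees_hist (name TEXT)")
    conn.executemany("INSERT INTO employees VALUES (?)", [(n,) for n in employees])
    conn.executemany("INSERT INTO employees_hist VALUES (?)", [(n,) for n in employees_hist])
    return conn.execute(QUERY).fetchall()

different = count_mismatches(
    ["Jack Crack", "Jack Crack", "Jill Hill"],
    ["Jack Crack", "Jill Hill", "Jill Hill"],
)
same = count_mismatches(
    ["Jack Crack", "Jack Crack", "Jill Hill"],
    ["Jill Hill", "Jack Crack", "Jack Crack"],
)
print(different)
print(same)
```

The first call reproduces the four rows shown above. The second call, where the tables hold the same rows in a different order, returns nothing.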
select name, count(*) n from EMPLOYEES group by name
minus
select name, count(*) n from EMPLOYEES_HIST group by name
union all (
select name, count(*) n from EMPLOYEES_HIST group by name
minus
select name, count(*) n from EMPLOYEES group by name)
You could merge the two tables and then subtract one of the tables from the result. If the result of the subtraction is an empty table then you know that the tables must be the same since merge had no effect (every row and column were effectively the same).
How do I merge two tables with different column number while removing duplicates?
That link provides a good way to merge the two tables without duplicates without knowing what the columns are.
Ensure the rows are unique by adding a pseudo column
WITH t1 AS
    (SELECT <All_Columns>
          , row_number() OVER
              (PARTITION BY <All_Columns>
               ORDER BY <All_Columns>) row_num
     FROM employees)
, t2 AS
    (SELECT <All_Columns>
          , row_number() OVER
              (PARTITION BY <All_Columns>
               ORDER BY <All_Columns>) row_num
     FROM employees_hist)
-- Rows (with their duplicate number) in employees but not in employees_hist ...
(SELECT *
 FROM t1
 MINUS
 SELECT *
 FROM t2)
UNION ALL
-- ... plus rows in employees_hist but not in employees.
(SELECT *
 FROM t2
 MINUS
 SELECT *
 FROM t1)
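As a rough, runnable illustration of the pseudo-column idea, the sketch below uses Python's sqlite3 module with the single-column example tables from the question. EXCEPT stands in for MINUS, and window functions require SQLite 3.25 or later; both are assumptions of this sketch rather than anything Oracle-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (name TEXT);
CREATE TABLE employees_hist (name TEXT);
INSERT INTO employees VALUES ('Jack Crack'), ('Jack Crack'), ('Jill Hill');
INSERT INTO employees_hist VALUES ('Jack Crack'), ('Jill Hill'), ('Jill Hill');
""")

query = """
WITH t1 AS (
    -- row_num numbers the copies of each identical row: 1, 2, ...
    SELECT name, ROW_NUMBER() OVER (PARTITION BY name ORDER BY name) AS row_num
    FROM employees
),
t2 AS (
    SELECT name, ROW_NUMBER() OVER (PARTITION BY name ORDER BY name) AS row_num
    FROM employees_hist
)
SELECT * FROM (SELECT * FROM t1 EXCEPT SELECT * FROM t2)
UNION ALL
SELECT * FROM (SELECT * FROM t2 EXCEPT SELECT * FROM t1)
ORDER BY 1, 2
"""
unmatched = conn.execute(query).fetchall()
print(unmatched)
```

Each returned row is a copy that has no counterpart in the other table: here, the second Jack Crack from employees and the second Jill Hill from employees_hist.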
Use row_number to make sure there are no duplicate rows. Now you can use minus and if there are no results, the tables are identical.
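-- Oracle requires a qualified * (alias.*) when it is combined with other select expressions.
SELECT ROW_NUMBER() OVER (ORDER BY Name), t1.*
FROM tab1 t1
MINUS
SELECT ROW_NUMBER() OVER (ORDER BY Name), t2.*
FROM tab2 t2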

help with query in DB2

I would like your help with my query. I have a table employee.details with the following columns:
branch_name, firstname,lastname, age_float.
I want this query to list all the distinct values of the age_float
attribute, one in each row of the result table, and beside each in the second field show the
number of people in the details table who had ages less than or equal to that value.
Any ideas? Thank you!
You can use OLAP functions:
SELECT DISTINCT age_float,
COUNT(lastname) OVER(ORDER BY age_float) AS number
FROM employee_details
COUNT(lastname) OVER(ORDER BY age_float) AS number orders the rows by age and, for each row, returns the count of employees whose age is <= the current row's age.
or a simple join:
SELECT A.age_float, count(lastname)
FROM (SELECT DISTINCT age_float FROM employee_details) A
JOIN employee_details AS ED ON ED.age_float <= A.age_float
GROUP BY A.age_float
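To check that the two approaches agree, here is a minimal sketch using Python's sqlite3 module as a stand-in for DB2 (window functions need SQLite 3.25 or later). The sample rows and the alias number_at_or_below are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee_details (
    branch_name TEXT, firstname TEXT, lastname TEXT, age_float REAL
);
-- Hypothetical rows; two employees share the age 30.5.
INSERT INTO employee_details VALUES
    ('North', 'Ann', 'Lee', 25.0),
    ('North', 'Bob', 'Ng',  30.5),
    ('South', 'Cy',  'Roe', 30.5),
    ('South', 'Di',  'Poe', 41.0);
""")

# Window version: the default frame covers all rows up to and including
# the current row's peers, i.e. everyone with age <= the current age.
window_query = """
SELECT DISTINCT age_float,
       COUNT(lastname) OVER (ORDER BY age_float) AS number_at_or_below
FROM employee_details
ORDER BY age_float
"""

# Join version: pair each distinct age with every employee at or below it.
join_query = """
SELECT a.age_float, COUNT(ed.lastname)
FROM (SELECT DISTINCT age_float FROM employee_details) a
JOIN employee_details ed ON ed.age_float <= a.age_float
GROUP BY a.age_float
ORDER BY a.age_float
"""

window_result = conn.execute(window_query).fetchall()
join_result = conn.execute(join_query).fetchall()
print(window_result)
print(join_result)
```

Both queries return one row per distinct age. The two employees aged 30.5 are counted together, so that age reports 3.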