SQL question: How can I extract this information from these tables?

I have these 3 tables:
EMPLOYEES: Emp_id (PK), First_Name, Last_Name, Hiring_Date
DEPARTMENT: Dep_Name, emp_id
SALARIES: salary, emp_id
Two months ago, the company hired new employees.
I need to write a SQL statement that counts how many employees were hired. In the SAME statement, I need to find out the financial increase for each department after the new hirings.
For the first thing, I think this is the query:
SELECT COUNT(emp_id) FROM employees
WHERE YEAR(NOW()) - YEAR(hiring_date) = 0 AND (MONTH(NOW()) - MONTH(hiring_date) = 2 OR MONTH(NOW()) - MONTH(hiring_date) = - 2);
but I don't know how to extract the information for the 2nd thing. (I know I need to make a join, but I don't know how to extract the increases by each department.)
Once again, the 1st and 2nd must be IN THE SAME SQL STATEMENT.

This variant needs all three tables. It uses Standard SQL interval notation; not every DBMS supports it, but unlike the question's version it still works when the current date is in January (the question's year comparison rejects anyone hired in the previous calendar year):
SELECT Dep_Name, COUNT(*), SUM(Salary)
FROM Employees AS E JOIN Salaries AS S ON E.Emp_ID = S.Emp_ID
JOIN Department AS D ON E.Emp_ID = D.Emp_ID
WHERE Hiring_Date >= CURRENT_DATE - INTERVAL '2' MONTH
GROUP BY Dep_Name;
I note that the Department table is a little unusual: more usually, it would be called something like Department_Emps. As it stands, its primary key is the Emp_ID column, not the Dep_Name column.
[For the record, the query below is what I used with IBM Informix Dynamic Server.]
SELECT Dep_Name, COUNT(*), SUM(SALARY)
FROM employees AS E JOIN salaries AS S ON E.Emp_ID = S.Emp_ID
JOIN department AS D ON E.Emp_ID = D.Emp_ID
WHERE CURRENT YEAR TO DAY <= INTERVAL(2) MONTH TO MONTH + Hiring_Date
GROUP BY Dep_Name;
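The January problem can be checked with a small sketch. It uses SQLite through Python's sqlite3 module rather than either DBMS above, so the date arithmetic is written with SQLite's date() function instead of an INTERVAL literal. The sample rows and the fixed TODAY value are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, first_name TEXT,
                            last_name TEXT, hiring_date TEXT);
    CREATE TABLE department (dep_name TEXT, emp_id INTEGER);
    CREATE TABLE salaries (salary INTEGER, emp_id INTEGER);

    -- Hypothetical sample data (not from the original post)
    INSERT INTO employees VALUES
        (1, 'Ann', 'Lee', '2022-03-01'),   -- hired long ago
        (2, 'Bob', 'Ray', '2023-11-20'),   -- hired within the last two months
        (3, 'Cid', 'Moe', '2023-12-05'),   -- hired within the last two months
        (4, 'Dee', 'Poe', '2023-12-10');   -- hired within the last two months
    INSERT INTO department VALUES ('Sales', 1), ('Sales', 2), ('IT', 3), ('IT', 4);
    INSERT INTO salaries VALUES (5000, 1), (3000, 2), (4000, 3), (4500, 4);
""")

TODAY = "2024-01-15"  # assumed "current date", deliberately in January

# Date-based window: hires on or after TODAY minus two months, per department
rows = conn.execute("""
    SELECT d.dep_name, COUNT(*) AS hired, SUM(s.salary) AS added_salary
    FROM employees AS e
    JOIN salaries AS s ON s.emp_id = e.emp_id
    JOIN department AS d ON d.emp_id = e.emp_id
    WHERE e.hiring_date >= date(?, '-2 months')
    GROUP BY d.dep_name
    ORDER BY d.dep_name
""", (TODAY,)).fetchall()

# The "same year" test from the question, which excludes last year's hires
naive = conn.execute("""
    SELECT COUNT(*) FROM employees
    WHERE CAST(strftime('%Y', ?) AS INTEGER)
          - CAST(strftime('%Y', hiring_date) AS INTEGER) = 0
""", (TODAY,)).fetchone()[0]

print(rows)   # one row per department: name, number hired, salary added
print(naive)  # how many hires survive the year test alone
```

With a January date, the year test alone already filters out every recent hire, while the date-window predicate keeps them.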

SELECT COUNT(emp_id), SUM(salary)
FROM employees e JOIN salaries s ON (s.emp_id = e.emp_id)
WHERE YEAR(NOW()) - YEAR(hiring_date) = 0
AND (MONTH(NOW()) - MONTH(hiring_date) = 2
OR MONTH(NOW()) - MONTH(hiring_date) = - 2)

Related

Recursive SQL query for finding matches

I have 5 SQL Tables with the following columns:
tbl_department:
department_id, parent_id
tbl_employee
employee_id, department_id
tbl_department_manager
department_id, employee_manager_id
tbl_request_regular_employee
request_id, employee_id
tbl_request_special_employee
request_id, employee_id
As input data I have employee_id and request_id.
I need to figure out whether the employee has access to the request (whether he's a manager or not)
We cannot use ORM here since app's responsiveness is our priority and the script might be called a lot.
Here's the logic I want to implement:
1. Query tbl_department_manager by employee_id to check whether the current employee is a manager (an employee can manage several departments). If so, we get a list of department_id values; if nothing is found, just return false.
2. If p.1 found at least one id, query tbl_request_regular_employee AND tbl_request_special_employee by request_id and get employee_id from both tables (they are the same).
3. Using the employee_id values collected above, query tbl_employee to get the unique list of department_id values those employees belong to.
4. Finally, compare the unique department_id list from p.3 with the one (ones) we got in p.1.
The catch is, however, that in tbl_department there might be departments which inherit from the one (ones) that we got from p.1 (so we might need to find it recursively based on parent_id until we find at least one match with one element from p.1). If there's at least one match between one element in p.1 and one element in p.3 return true. So there's a need to look for it recursively.
Could someone give a clue how to implement it in MSSQL? Any help would be greatly appreciated.
declare @employee_id int, @request_id int;
with reqEmployees as (
select employee_id
from tbl_request_regular_employee
where request_id = @request_id
union all --concatenate the two tables
select employee_id
from tbl_request_special_employee
where request_id = @request_id
),
cte as (
select d.department_id, d.parent_id
from reqEmployees r
join tbl_employee e on e.employee_id = r.employee_id -- get these employees' departments
join tbl_department d on d.department_id = e.department_id -- pick up parent_id so the recursion can climb
union all
select d.department_id, d.parent_id
from cte -- recurse the cte
join tbl_department d on d.department_id = cte.parent_id -- and get parent departments
)
-- we only want to know if there is any manager row, so exists is enough
select case when exists (select 1
from cte --join on managers
join tbl_department_manager dm on dm.department_id = cte.department_id
where dm.employee_manager_id = @employee_id)
then 1 else 0 end;
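To try the idea without a SQL Server instance, here is a rough equivalent in SQLite through Python's sqlite3 module. SQLite spells the construct WITH RECURSIVE and takes named parameters instead of T-SQL variables; the has_access helper and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_department (department_id INTEGER, parent_id INTEGER);
    CREATE TABLE tbl_employee (employee_id INTEGER, department_id INTEGER);
    CREATE TABLE tbl_department_manager (department_id INTEGER, employee_manager_id INTEGER);
    CREATE TABLE tbl_request_regular_employee (request_id INTEGER, employee_id INTEGER);
    CREATE TABLE tbl_request_special_employee (request_id INTEGER, employee_id INTEGER);

    -- Hypothetical data: department 3 is under 2, which is under 1; 4 stands alone
    INSERT INTO tbl_department VALUES (1, NULL), (2, 1), (3, 2), (4, NULL);
    INSERT INTO tbl_employee VALUES (10, 3), (20, 1), (30, 4);
    INSERT INTO tbl_department_manager VALUES (1, 20), (4, 30);
    INSERT INTO tbl_request_regular_employee VALUES (100, 10);
""")

ACCESS_SQL = """
WITH RECURSIVE
req_employees(employee_id) AS (
    SELECT employee_id FROM tbl_request_regular_employee
    WHERE request_id = :request_id
    UNION
    SELECT employee_id FROM tbl_request_special_employee
    WHERE request_id = :request_id
),
dept_chain(department_id, parent_id) AS (
    -- anchor: departments of the employees attached to the request
    SELECT d.department_id, d.parent_id
    FROM req_employees r
    JOIN tbl_employee e ON e.employee_id = r.employee_id
    JOIN tbl_department d ON d.department_id = e.department_id
    UNION
    -- recursive step: climb to the parent department
    SELECT d.department_id, d.parent_id
    FROM dept_chain c
    JOIN tbl_department d ON d.department_id = c.parent_id
)
SELECT EXISTS (
    SELECT 1
    FROM dept_chain c
    JOIN tbl_department_manager dm ON dm.department_id = c.department_id
    WHERE dm.employee_manager_id = :employee_id
)
"""

def has_access(conn, employee_id, request_id):
    """Return True if employee_id manages a department on the request's chain."""
    row = conn.execute(
        ACCESS_SQL, {"employee_id": employee_id, "request_id": request_id}
    ).fetchone()
    return bool(row[0])

print(has_access(conn, 20, 100))  # manager of department 1, an ancestor of 3
print(has_access(conn, 30, 100))  # manager of the unrelated department 4
```

UNION (rather than UNION ALL) in the recursive part drops repeated rows, which also stops the walk if the parent links ever formed a cycle.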

How to keep a bucket using case statement even if the count for items in that bucket is 0?

Here's the data table named "Salary_table" that I've created for this question:
So I want to find the number of employees in each salary bucket in each department. The buckets are
"<$100", "$100-$200" and ">$200"
The desired output is:
Below is my code for achieving this task:
select distinct(st.department) as "Department",
sb.salary_bucket as "salary range", count(*)
from Salary_table st
Left join (
select department, employee, case
when salary < 100 then "<$100"
when salary between 100 and 200 then "$100-$200"
else ">$200"
end
as salary_bucket
from Salary_table
) sb
on sb.employee = st.employee
group by st.department, sb.salary_bucket
order by st.department, sb.salary_bucket
;
but my output falls a bit short of what I'm expecting:
There are TWO problems with my current output:
The buckets with 0 employees in the salary range are not listed; I want them listed with a value of "0".
The salary buckets are NOT in the right order, even though I added an "order by" clause; I think it's because the buckets are text, so they can't be sorted that way.
I would really appreciate some hints and pointers on how to fix these two issues. Thank you so much!
What I've tried
I tried using "left join" but the output came out the same
I tried adding the "order by" clause but it doesn't seem to work on text buckets
You are sort of on the right track, but the idea is a bit more complicated. Use a cross join to get all the rows -- the buckets and departments. Then use left join to bring in the matching information and finally group by for the aggregation:
select d.department, b.salary_bucket,
count(sb.department) as cnt
from (select '<$100' as salary_bucket union all
select '$100-$200' union all
select '>$200'
) b cross join
(select distinct department from salary_table
) d left join
(select department, employee,
(case when salary < 100 then '<$100'
when salary between 100 and 200 then '$100-$200'
else '>$200'
end) as salary_bucket
from Salary_table
) sb
on sb.department = d.department and
sb.salary_bucket = b.salary_bucket
group by d.department, b.salary_bucket;
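The query above can be tried as a sketch in SQLite through Python's sqlite3 module. The sample rows are invented, since the question's table did not survive, and the sort_key column is an addition of this sketch (not part of the answer) to address the ordering complaint: the bucket list carries a numeric key, and the result is ordered by it instead of by the bucket text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE salary_table (department TEXT, employee TEXT, salary INTEGER);
    -- Hypothetical sample data
    INSERT INTO salary_table VALUES
        ('A', 'e1', 50), ('A', 'e2', 150), ('A', 'e3', 160),
        ('B', 'e4', 250);
""")

rows = conn.execute("""
    SELECT d.department, b.salary_bucket, COUNT(sb.employee) AS cnt
    FROM (SELECT 1 AS sort_key, '<$100' AS salary_bucket UNION ALL
          SELECT 2, '$100-$200' UNION ALL
          SELECT 3, '>$200') AS b
    CROSS JOIN (SELECT DISTINCT department FROM salary_table) AS d
    LEFT JOIN (SELECT department, employee,
                      CASE WHEN salary < 100 THEN '<$100'
                           WHEN salary BETWEEN 100 AND 200 THEN '$100-$200'
                           ELSE '>$200'
                      END AS salary_bucket
               FROM salary_table) AS sb
           ON sb.department = d.department
          AND sb.salary_bucket = b.salary_bucket
    GROUP BY d.department, b.sort_key, b.salary_bucket
    ORDER BY d.department, b.sort_key   -- numeric key, not the bucket text
""").fetchall()

for row in rows:
    print(row)  # every department/bucket pair appears, empty ones with 0
```

Counting sb.employee rather than * is what turns the unmatched department/bucket pairs into zeros.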

Two Table problems with oracle

Students
ID FName Lname Status Major Code GPA Admitted Date
104 Donald Nento Sophomore 105 2.64 1-Jul-2015
Departments
Dept Code Dept Name College
105 Mathematics AS
These are the above tables... I am stuck on two questions:
List the college of the student with the highest GPA.
List Number of days elapsed since admission for each student.
Can anyone shed some light please?
List the college of the student with the highest GPA.
You can get all the college(s) with the maximum GPA without using a correlated sub-query like this:
SELECT College
FROM (
SELECT College,
RANK() OVER ( ORDER BY GPA DESC ) AS gpa_rank
FROM Students s
INNER JOIN
Departments d
ON ( s."Major Code" = d."Dept Code" )
)
WHERE gpa_rank = 1;
List Number of days elapsed since admission for each student.
SELECT ID,
FNAME,
LNAME,
FLOOR( SYSDATE - "Admitted Date" ) AS days_since_admission
FROM students;
You could use TRUNC(SYSDATE) - "Admitted Date" but if the admitted date has a time component then it will not be a round number.
(Note: it is unclear what your column names actually are. Your data shows them as case sensitive and with spaces but this is unusual as it is more usual to have case insensitive column names with underscores instead of spaces. I've used "" to match the column names used in your post but please adjust the names to whatever the actual values are.)
Here are some suggested queries. You'll need to adapt the column names if they don't match. Names in double quotes are case sensitive, and names that contain spaces must be double-quoted:
List the college of the student with the highest GPA.
SELECT d.College
FROM Departments d
INNER JOIN Students s
ON d."Dept Code" = s."Major Code"
WHERE s.GPA = (SELECT MAX(GPA) FROM Students);
Or, if you are not allowed to use INNER JOIN then:
SELECT d.College
FROM Departments d,
Students s
WHERE d."Dept Code" = s."Major Code"
AND s.GPA = (SELECT MAX(GPA) FROM Students);
List Number of days elapsed since admission for each student.
SELECT s.*,
TRUNC(SYSDATE) - s."Admitted Date" AS days_since_admission
FROM Students s;
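Both queries can be sanity-checked outside Oracle with a small SQLite sketch via Python's sqlite3 module. The column names are switched to underscore style, julianday() stands in for Oracle's date subtraction, and TODAY is a fixed assumed date. Only student 104 and department 105 come from the post; the other rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_code INTEGER, dept_name TEXT, college TEXT);
    CREATE TABLE students (id INTEGER, fname TEXT, lname TEXT, status TEXT,
                           major_code INTEGER, gpa REAL, admitted_date TEXT);

    INSERT INTO departments VALUES
        (105, 'Mathematics', 'AS'),
        (110, 'Physics', 'EN');                    -- hypothetical row
    INSERT INTO students VALUES
        (104, 'Donald', 'Nento', 'Sophomore', 105, 2.64, '2015-07-01'),
        (105, 'Hypo', 'Thetical', 'Junior', 110, 3.90, '2014-07-01');  -- hypothetical row
""")

TODAY = "2015-07-11"  # assumed "current date" so the output is reproducible

# College(s) of the student(s) holding the maximum GPA; ties return several rows
colleges = conn.execute("""
    SELECT d.college
    FROM departments AS d
    JOIN students AS s ON d.dept_code = s.major_code
    WHERE s.gpa = (SELECT MAX(gpa) FROM students)
""").fetchall()

# Whole days between the admission date and TODAY
days = conn.execute("""
    SELECT id,
           CAST(julianday(?) - julianday(admitted_date) AS INTEGER) AS days_since_admission
    FROM students
    ORDER BY id
""", (TODAY,)).fetchall()

print(colleges)
print(days)
```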

Set default value to zero when count is null

select rd.description||'('||rs.department_code||')' DepartmentName,
rd.code DepartmentCode,
rs.description||'('||emv.section_code||')' SectionName,
emv.section_code SectionCode,
rsg.description||'('||emv.staff_group_code||')' StaffGroupName,
emv.staff_group_code StaffGroupCode,
e.staff_id STAFFID,
e.surname || ', ' || e.given_name FullName,
hp.description Position,
het.description EmploymentType,
to_char(e.Join_Date, 'dd/MM/yyyy') JoinDate,
e.employee_no EmployeeNo,
edt.shift Type,
to_char(edt.timesheet_date,'Mon') Mth,
to_char(to_Date(edt.timesheet_date),'mm') MthNum,
nvl(count(*), 0) Days
FROM employee e,
employee_daily_timesheet edt,
employee_assignment_vw emv,
ref_section rs,
ref_department rd,
hris_position hp,
hris_employment_type het,
ref_staffgroup rsg
WHERE e.employee_no = emv.employee_no
AND edt.assignment_no = emv.assignment_no
AND to_char(timesheet_date,'yyyy')=TO_CHAR(TO_DATE('01/31/2015','MM/dd/yyyy'),'yyyy')
AND emv.section_code = rs.code
AND rs.department_code = rd.code
AND e.position_code = hp.code(+)
AND e.employment_type_code = het.code(+)
AND emv.staff_group_code = rsg.code
AND edt.shift='OFF'
GROUP BY rd.description||'('||rs.department_code||')',
rd.code,
rs.description||'('||emv.section_code||')',
emv.section_code,
rsg.description||'('||emv.staff_group_code||')',
emv.staff_group_code,
e.staff_id,
e.surname || ', ' || e.given_name,
hp.description ,
het.description ,
to_char(e.Join_Date, 'dd/MM/yyyy') ,
e.employee_no,
edt.shift,
to_char(edt.timesheet_date,'Mon'),
to_char(to_Date(edt.timesheet_date),'mm')
I'm trying to count the days where the shift is equal to 'OFF', but it doesn't display records when the count is zero, i.e. when a month doesn't have an 'OFF' shift. How can I set the value to zero when the count is null or when the month doesn't have an 'OFF' shift?
I can't understand your query without knowing the data model (and why in 2016 are you using the ancient (+) syntax for outer joins?!) but the general principle is simple. A group by query only returns rows for groups that have some data. For example:
select deptno, count(*)
from emp
group by deptno;
This will only return rows for departments that have at least one employee. But you say "I want to see all departments, even if they have no employees". Well, the EMP table doesn't contain all the departments, but table DEPT does. So we can use an outer join (I'll use ANSI syntax) like this:
select d.deptno, count(e.empno)
from dept d
left join emp e on e.deptno = d.deptno
group by d.deptno;
Note that the driving table has changed from EMP, which doesn't have all the deptno values, to DEPT which does. If there are no employees for deptno=60 then the query will still return a row for that DEPT, thanks to the outer join.
Note also that I have used count(e.empno) not count(*), because I am trying to count employees in the department, not rows returned by the query prior to grouping. If I used count(*) then deptno=60 would return a count of 1 (because there is 1 DEPT row with deptno=60, outer joined to 0 EMP rows, resulting in one result).
If you understand this principle and your data model then you should be able to write an analogous query for your case.
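The difference between the two counts is easy to see in a small sketch, here using SQLite through Python's sqlite3 module rather than Oracle. The dept and emp rows are made-up sample data, with deptno 60 left without employees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER PRIMARY KEY);
    CREATE TABLE emp (empno INTEGER PRIMARY KEY, deptno INTEGER);
    -- Hypothetical data: department 60 has no employees
    INSERT INTO dept VALUES (10), (20), (60);
    INSERT INTO emp VALUES (1, 10), (2, 10), (3, 20);
""")

# Grouping EMP alone: departments without employees simply do not appear
emp_only = conn.execute("""
    SELECT deptno, COUNT(*)
    FROM emp
    GROUP BY deptno
    ORDER BY deptno
""").fetchall()

# Driving from DEPT with an outer join keeps every department.
# COUNT(e.empno) ignores the NULL from the unmatched row; COUNT(*) does not.
with_outer_join = conn.execute("""
    SELECT d.deptno, COUNT(e.empno) AS employees, COUNT(*) AS joined_rows
    FROM dept AS d
    LEFT JOIN emp AS e ON e.deptno = d.deptno
    GROUP BY d.deptno
    ORDER BY d.deptno
""").fetchall()

print(emp_only)
print(with_outer_join)
```

For deptno 60 the two counts disagree: the employee count is 0, while the row count is 1 because the outer join still produces one row for that department.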

Trying to Find the Solutions to Exercise 2 Chapter 17 in Sams Teach Yourself SQL in 24 Hours 5th Edition

Unfortunately, the 5th edition of Sams Teach Yourself SQL in 24 Hours is riddled with errors, but I like the format and the method the book uses to teach you SQL from scratch. I was able to fix the errors on my own, having previous experience, but I have hit a wall in Chapter 17 at exercise 2.
The book provided incorrect solutions to the problems. I assumed it was just poor formatting (forgetting a parenthesis or something minor like that), so I played around with the formatting of the solutions until the query would go through. Alas, no success!
I'm using Microsoft SQL.
Here are the problems and the solutions the book provided for exercise 2:
Add another table called EMPLOYEE_PAYHIST_TBL that contains a large amount of pay history data. Use the table that follows to write the series of SQL statements to address the following problems.
Table
CREATE TABLE EMPLOYEE_PAYHIST_TBL
(PAYHIST_ID VARCHAR(9) NOT NULL primary key,
EMP_ID VARCHAR(9) NOT NULL,
START_DATE DATETIME NOT NULL,
END_DATE DATETIME,
PAY_RATE DECIMAL(4,2) NOT NULL,
SALARY DECIMAL(8,2) NOT NULL,
BONUS DECIMAL(8,2) NOT NULL,
CONSTRAINT EMP_FK FOREIGN KEY (EMP_ID) REFERENCES EMPLOYEE_TBL (EMP_ID));
I then inserted random values with existing employees in the database.
a. Find the SUM of the salaried versus nonsalaried employees by the year in which their pay started.
Their Solution
SELECT START_YEAR,
SUM(SALARIED) AS SALARIED,
SUM(HOURLY) AS HOURLY
FROM (SELECT YEAR(E.START_DATE) AS START_YEAR,
COUNT(E.EMP_ID) AS SALARIED,
0 AS HOURLY
FROM EMPLOYEE_PAYHIST_TBL E
INNER JOIN (SELECT MIN(START_DATE) START_DATE,
EMP_ID
FROM EMPLOYEE_PAYHIST_TBL
GROUP BY EMP_ID) F
ON E.EMP_ID = F.EMP_ID
AND E.START_DATE = F.START_DATE
WHERE E.SALARY > 0.00
GROUP BY YEAR(E.START_DATE)
UNION
SELECT YEAR(E.START_DATE) AS START_YEAR,
0 AS SALARIED,
COUNT(E.EMP_ID) AS HOURLY
FROM EMPLOYEE_PAYHIST_TBL E
INNER JOIN (SELECT MIN(START_DATE) START_DATE,
EMP_ID
FROM EMPLOYEE_PAYHIST_TBL
GROUP BY EMP_ID) F
ON E.EMP_ID = F.EMP_ID
AND E.START_DATE = F.START_DATE
WHERE E.PAY_RATE > 0.00
GROUP BY YEAR(E.START_DATE)) A
GROUP BY START_YEAR
ORDER BY START_YEAR;
I've tried the code and tried to fix it myself, but to no avail. Looking at this code only confused me more about how to come up with a solution myself. I do understand that they want columns called Start_Year, Salaried, and Hourly; that can be understood from the question and their own code.
After going through this nightmare, I don't even want to try to see if their solutions for the other problems will work. I took a peek at the other solutions and they seem to have the same problematic formatting.
b. Find the difference in the yearly pay of salaried employees versus nonsalaried employees by the year in which their pay started. Consider the nonsalaried employees to be working full time during the year ( PAY_RATE * 52 * 40).
Their Solution
SELECT START_YEAR,
SUM(SALARIED) AS SALARIED,
SUM(HOURLY) AS HOURLY,
(SUM(SALARIED) - SUM(HOURLY)) AS PAY_DIFFERENCE
FROM (SELECT YEAR(E.START_DATE) AS START_YEAR,
AVG(E.SALARY) AS SALARIED,
0 AS HOURLY
FROM EMPLOYEE_PAYHIST_TBL E
INNER JOIN (SELECT MIN(START_DATE) START_DATE,
EMP_ID
FROM EMPLOYEE_PAYHIST_TBL
GROUP BY EMP_ID) F
ON E.EMP_ID = F.EMP_ID
AND E.START_DATE = F.START_DATE
WHERE E.SALARY > 0.00
GROUP BY YEAR(E.START_DATE)
UNION
SELECT YEAR(E.START_DATE) AS START_YEAR,
0 AS SALARIED,
AVG(E.PAY_RATE * 52 * 40 ) AS HOURLY
FROM EMPLOYEE_PAYHIST_TBL E
INNER JOIN (SELECT MIN(START_DATE) START_DATE,
EMP_ID
FROM EMPLOYEE_PAYHIST_TBL
GROUP BY EMP_ID) F
ON E.EMP_ID = F.EMP_ID
AND E.START_DATE = F.START_DATE
WHERE E.PAY_RATE > 0.00
GROUP BY YEAR(E.START_DATE)) A
GROUP BY START_YEAR
ORDER BY START_YEAR;
c. Find the difference in what employees make now versus what they made when they started with the company. Again, consider the nonsalaried employees to be full-time. Also consider that the employees’ current pay is reflected in the EMPLOYEE_PAY_TBL as well as the EMPLOYEE_PAYHIST_TBL . In the pay history table, the current pay is reflected as a row with the END_DATE for pay equal to NULL.
Their Solution
SELECT CURRENTPAY.EMP_ID,
STARTING_ANNUAL_PAY,
CURRENT_ANNUAL_PAY,
CURRENT_ANNUAL_PAY - STARTING_ANNUAL_PAY AS PAY_DIFFERENCE
FROM (SELECT EMP_ID,(SALARY + (PAY_RATE * 52 * 40)) AS CURRENT_ANNUAL_PAY
FROM EMPLOYEE_PAYHIST_TBL
WHERE END_DATE IS NULL) CURRENTPAY
INNER JOIN
(SELECT E.EMP_ID,
(SALARY + (PAY_RATE * 52 * 40)) AS STARTING_ANNUAL_PAY
FROM EMPLOYEE_PAYHIST_TBL E
INNER JOIN
(SELECT MIN(START_DATE) START_DATE,
EMP_ID
FROM EMPLOYEE_PAYHIST_TBL
GROUP BY EMP_ID) F
ON E.EMP_ID = F.EMP_ID
AND E.START_DATE = F.START_DATE) STARTINGPAY
ON CURRENTPAY.EMP_ID = STARTINGPAY.EMP_ID;
I was fine with the errors in this book, easily able to fix them... But this problem is making me want to rage at the authors for leaving so many errors!
Edit:
I would really like help in solving these problems. As an added note, the solutions don't need to follow what they provided; it can be a new way of approaching the solution. I just need help understanding how I can approach the solution.
Edit:
I want to apologize! The coding works now for some reason and I believe it has to do with the database now registering the EMPLOYEE_PAYHIST_TBL. I had issues yesterday where the database acted like the table didn't exist (it had it underlined in red and said it didn't exist and I most definitely checked the spelling), but I was able to do simple queries on it despite the issue. I guess a complicated query caused it to fuss more. It was saying that fields that I clearly did have in the table did not exist.
It's now registering the EMPLOYEE_PAYHIST_TBL and its columns, after I started up my computer today, and allowed the queries for every problem to go through... Very unusual...
I'm sorry for not stating this; I had assumed these were purely syntax errors, but it was an issue with the database after all.
What could have caused the database to not register the table before? Cache issues?