How To Select data from multiple tables with grouping for duplicates - sql

I have two tables, one with employee details and another with the vacations taken by them in different years.
As you'll see in the vacation table, the same employee (same EmployeeId) can have several entries with different vacation days in the same year. For example, John Smith has two entries for 2011, one with 10 vacation days and one with 3. I want my query to return a single row with the vacation total of 13.
I tried the following query, but no luck:
SELECT Employee_Details.EmployeeId, Employee_Details.EmployeeName, Employees_Vacation.Year, Employees_Vacation.Vacation, Employee_Details.Department
FROM Employees_Vacation INNER JOIN Employee_Details ON Employees_Vacation.EmployeeId=Employee_Details.EmployeeId group by Employee_Details.EmployeeId ORDER BY Employee_Details.EmployeeName, Employees_Vacation.Year ;

If I understood you correctly, I think this may help:
SELECT SUM(ev.Vacation) AS TotalVacation, ev.Year, ed.EmployeeName
FROM Employee_Details AS ed
INNER JOIN Employees_Vacation AS ev ON ed.EmployeeId = ev.EmployeeId
GROUP BY ev.Year, ed.EmployeeName

A lot here will depend on the SQL engine you are using, but there are some things to consider that apply regardless of the engine:
Your current GROUP BY clause groups only by EmployeeId. From the question text, it seems you are instead looking for results grouped by employee AND vacation year.
Your projection (the SELECT list) currently isn't aggregating anything; it just projects a bunch of fields. Some database engines don't even allow this (SQL Server, for example, only allows grouped or aggregated columns in the projection). Again, from the question text it seems you are looking for the SUM of vacation days per employee and year.
Taking these into account, and assuming those assumptions are accurate, something like the following should work in most modern RDBMSs:
SELECT Employee_Details.EmployeeId,
Employee_Details.EmployeeName,
Employees_Vacation.Year,
SUM(Employees_Vacation.Vacation) AS TotalVacationDays,
Employee_Details.Department
FROM Employees_Vacation
INNER JOIN Employee_Details
ON Employees_Vacation.EmployeeId = Employee_Details.EmployeeId
GROUP BY
Employee_Details.EmployeeId, Employee_Details.EmployeeName,
Employees_Vacation.Year, Employee_Details.Department
ORDER BY
Employee_Details.EmployeeName,
Employee_Details.EmployeeId,
Employees_Vacation.Year;
You may be able to get away with fewer grouping columns in some engines (MySQL, for example). Additionally, I added EmployeeId to the ORDER BY clause to ensure records for the same employee stay together in the results (for example, when two employees share the same name).
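To try the grouped query without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows, the short aliases `d` and `v`, and the choice of SQLite are assumptions for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory database with made-up sample rows (assumed data, not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee_Details (
        EmployeeId INTEGER PRIMARY KEY,
        EmployeeName TEXT,
        Department TEXT
    );
    CREATE TABLE Employees_Vacation (
        EmployeeId INTEGER,
        Year INTEGER,
        Vacation INTEGER
    );
    INSERT INTO Employee_Details VALUES
        (1, 'John Smith', 'Sales'), (2, 'Jane Doe', 'IT');
    -- John Smith has two entries for 2011 (10 and 3 days).
    INSERT INTO Employees_Vacation VALUES
        (1, 2011, 10), (1, 2011, 3), (1, 2012, 7), (2, 2011, 5);
""")

# Group by employee AND year, summing the vacation days in each group.
rows = conn.execute("""
    SELECT d.EmployeeId, d.EmployeeName, v.Year,
           SUM(v.Vacation) AS TotalVacationDays, d.Department
    FROM Employees_Vacation AS v
    INNER JOIN Employee_Details AS d ON v.EmployeeId = d.EmployeeId
    GROUP BY d.EmployeeId, d.EmployeeName, v.Year, d.Department
    ORDER BY d.EmployeeName, d.EmployeeId, v.Year
""").fetchall()

for row in rows:
    print(row)
```

With this sample data, the two 2011 entries for John Smith collapse into one row with a total of 13.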

Related

Counting Unique IDs In a LEFT/RIGHT JOIN Query in Access

I am working on a database to track staff productivity. Two of the ways we do that are by monitoring the number of orders they fulfil and by tracking their error rate.
Each order they finish is recorded in a table. In one day they can complete many orders.
It is also possible for a single order to have multiple errors.
I am trying to create a query that provides a summary of their results. This query should have one column with "TotalOrders" and another with "TotalErrors".
I connect the two tables with a LEFT/RIGHT join since not all orders will have errors.
The problem comes when I want to total the number of orders. If someone made multiple mistakes on an order, that order gets counted multiple times; once for each error.
I want to modify my query so that when counting the number of orders it only counts records with distinct OrderID's; yet, in the same query, also count the total errors without losing any.
Is this possible?
Here is my SQL
SELECT Count(tblTickets.TicketID) AS TotalOrders,
Count(tblErrors.ErrorID) AS TotalErrors
FROM tblTickets
LEFT JOIN tblErrors ON tblTickets.TicketID = tblErrors.TicketID;
I have played around with SELECT DISTINCT and UNION but am struggling with the correct syntax in Access. Also, a lot of the examples I have seen are trying to total a single field rather than two fields in different ways.
To be clear when totalling the OrderCount field I want to only count records with DISTINCT TicketID's. When totalling the ErrorCount field I want to count ALL errors.
Ticket = Order.
Query Result: Order Count Too High
Ticket/Order Table: Total of 14 records
Error Table: You can see two errors for the same order on 8th
1. Do a query that counts orders by staff from the Orders table and a query that counts errors by staff from the Errors table, then join those two queries to the Staff table (the queries can be nested into one long SQL statement).
2. Use correlated subqueries:
SELECT Staff.*,
(SELECT Count(*) FROM Orders WHERE StaffID = Staff.ID) AS CntOrders,
(SELECT Count(*) FROM Errors WHERE StaffID = Staff.ID) AS CntErrors
FROM Staff;
3. Use the DCount() domain aggregate function.
Option 1 is probably the most efficient and option 3 the least.
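The overcounting described in the question, and the effect of counting each table separately, can be reproduced in a small sketch. It runs on SQLite through Python's sqlite3 module rather than Access, so the syntax is an assumption and may need adapting; the table names come from the question and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblTickets (TicketID INTEGER PRIMARY KEY);
    CREATE TABLE tblErrors (ErrorID INTEGER PRIMARY KEY, TicketID INTEGER);
    INSERT INTO tblTickets VALUES (1), (2), (3);
    -- Ticket 2 has two errors, ticket 3 has one, ticket 1 has none.
    INSERT INTO tblErrors VALUES (10, 2), (11, 2), (12, 3);
""")

# Counting over the LEFT JOIN: ticket 2 appears once per error,
# so the order count is inflated.
joined = conn.execute("""
    SELECT COUNT(t.TicketID), COUNT(e.ErrorID)
    FROM tblTickets AS t
    LEFT JOIN tblErrors AS e ON t.TicketID = e.TicketID
""").fetchone()

# Counting each table in its own subquery keeps the two totals independent.
separate = conn.execute("""
    SELECT (SELECT COUNT(*) FROM tblTickets) AS TotalOrders,
           (SELECT COUNT(*) FROM tblErrors) AS TotalErrors
""").fetchone()

print("joined:", joined)
print("separate:", separate)
```

Here the joined query reports 4 orders for 3 tickets, while the separate subqueries report 3 orders and 3 errors.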

Microsoft SQL server select statements on multiple tables?

I've been struggling with some SELECT statements on multiple tables:
Employee table
Employee_ID
First_Name
Last_Name
Assignment table
Assignment_ID
Employee_ID
Host_Country_arrival_Date
Host_Country_departure_Date
Scheduled_End_Date
I'm asked to write a query that displays the employee's full name, the number of days between the host country arrival date and the host country departure date, and the number of days between today's date and the assignment's scheduled end date, with the results sorted by host country arrival date, oldest date on top.
Also, I'm not familiar with sorting in SQL Server.
Here's my query; I've been getting syntax errors:
SELECT
First_Name
Last_Name
FROM Employee
SELECT
Host_Country_Arrival_Date
Host_Country_Departure_Date
FROM Assignment;
Basically, your code runs two different queries: the first gets all the employees' names, and the second gets the assignment dates.
What you'll want to do here is take advantage of the relationship between the tables by using a JOIN. That is basically saying "give me all employees and all of their assignments". So, for each assignment an employee has, the result will contain a row with the employee's name and the assignment info.
To get the difference in days, use DATEDIFF with three parameters: the unit in which to measure the difference (the datepart), the first date, and the second date. It subtracts the first date from the second and returns the result in the selected unit.
And finally, the sorting: just add ORDER BY followed by each column you want to sort by, and specify whether each should be ascending (ASC) or descending (DESC).
Here is how I would answer if that question were put to me in a coding challenge:
SELECT
CONCAT(E.First_Name, ' ', E.Last_Name) AS FullName,
DATEDIFF(DAY, A.Host_Country_Arrival_Date, A.Host_Country_Departure_Date) AS DaysInHostCountry,
DATEDIFF(DAY, GETDATE(), A.Scheduled_End_Date) AS DaysTillScheduledEnd
FROM Employee AS E -- aliases keep the query readable
INNER JOIN Assignment AS A
ON E.Employee_ID = A.Employee_ID -- read up on joins; there is a lot of material available, and you will need them going forward with SQL
ORDER BY A.Host_Country_Arrival_Date ASC -- put the column you want to sort by here; ASC puts the oldest date on top
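To see the join, the day differences and the sorting work together, here is a rough sketch on SQLite via Python's sqlite3 module. SQLite has no DATEDIFF, so the sketch swaps in a julianday() subtraction as a stand-in; the sample rows, the fixed "today" value and the column aliases are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (
        Employee_ID INTEGER PRIMARY KEY, First_Name TEXT, Last_Name TEXT
    );
    CREATE TABLE Assignment (
        Assignment_ID INTEGER PRIMARY KEY,
        Employee_ID INTEGER,
        Host_Country_Arrival_Date TEXT,
        Host_Country_Departure_Date TEXT,
        Scheduled_End_Date TEXT
    );
    INSERT INTO Employee VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO Assignment VALUES
        (1, 1, '2023-03-01', '2023-03-11', '2023-04-01'),
        (2, 2, '2023-01-05', '2023-01-25', '2023-02-15');
""")

# A fixed "today" keeps the output reproducible (assumed value).
today = "2023-01-31"

# julianday(end) - julianday(start) stands in for DATEDIFF(DAY, start, end).
rows = conn.execute("""
    SELECT e.First_Name || ' ' || e.Last_Name AS FullName,
           CAST(julianday(a.Host_Country_Departure_Date)
                - julianday(a.Host_Country_Arrival_Date) AS INTEGER)
               AS DaysInHostCountry,
           CAST(julianday(a.Scheduled_End_Date) - julianday(?) AS INTEGER)
               AS DaysUntilScheduledEnd
    FROM Employee AS e
    INNER JOIN Assignment AS a ON e.Employee_ID = a.Employee_ID
    ORDER BY a.Host_Country_Arrival_Date ASC  -- oldest arrival first
""", (today,)).fetchall()

for row in rows:
    print(row)
```

Note the argument order: the earlier date goes first, so a positive number means the second date is later.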
You should use a JOIN to link the tables together on Employee_ID:
SELECT
First_Name,
Last_Name,
Host_Country_Arrival_Date,
Host_Country_Departure_Date
FROM Employee
JOIN Assignment ON Assignment.Employee_ID = Employee.Employee_ID;
What this is saying basically is that for each employee, go out to the assignments table and add the assignments for that employee. If there are multiple assignments, the employee columns will be repeated on each row with the assignment columns for the assignment.
You need to read up on JOIN and GROUP BY.
For now, you may try this:
SELECT
CONCAT(Emp.First_Name, ' ', Emp.Last_Name) AS FullName,
DATEDIFF(DAY, Assign.Host_Country_Arrival_Date, Assign.Host_Country_Departure_Date) AS DaysInHostCountry,
DATEDIFF(DAY, GETDATE(), Assign.Scheduled_End_Date) AS DaysTillScheduledEnd
FROM Employee AS Emp INNER JOIN Assignment AS Assign ON Emp.Employee_ID = Assign.Employee_ID
ORDER BY Assign.Host_Country_Arrival_Date ASC

Ms-Access: counting from 2 tables

I have two tables in a database, and I need to retrieve the number of staff per manager.
I've been trying to adapt an answer to another question
SELECT bankNo AS "Bank Number",
COUNT (*) AS "Total Branches"
FROM BankBranch
GROUP BY bankNo
As
SELECT COUNT (*) AS StaffCount ,
Employee.Name AS Name
FROM Employee, Stafflink
GROUP BY Name
As I look at the Group BY I'm thinking I should be grouping by The ManID in the Stafflink Table.
My output with this query looks like this
So it is counting correctly but as you can see it's far off the output I need to get.
Any advice would be appreciated.
You need to join the Employee and Stafflink tables. It appears that your FROM clause should look like this:
FROM Employee INNER JOIN StaffLink ON Employee.ID = StaffLink.ManID
You have to join the Employee table twice to get the count of employees under each manager:
SELECT COUNT(*) AS StaffCount, Manager.Name
FROM (Employee INNER JOIN StaffLink ON Employee.Id = StaffLink.EmpId)
INNER JOIN Employee AS Manager ON StaffLink.ManId = Manager.Id
GROUP BY Manager.Name
The answers that advise you on how to join are correct, assuming that you want to learn how to use SQL in MS Access. But there is a way to accomplish the same thing using the Access GUI for designing queries, and this involves a shorter learning curve than learning SQL.
The key to using the GUI when more than one table is involved is to realize that you have to define the relationships between tables in the relationship manager. Once you do that, designing the query you are after is a piece of cake, just point and click.
The tricky thing in your case is that there are two relationships between the two tables. One relationship links EmpId to ID and the other links ManId to ID.
If, however, you want to learn SQL, then this shortcut will be a digression.
If you don't specify a join between the tables, a so-called Cartesian product is built, i.e., each record from one table is paired with every record from the other table. If you have 7 records in one table and 10 in the other, you get 70 pairs (i.e. rows) before grouping. This explains why you are getting a count of 7 per manager name.
Besides joining the tables, I would suggest grouping on the manager id instead of the manager name. The manager id is known to be unique per manager, but the name is not. This then requires you either to group on the name as well, because the name is in the select list, or to apply an aggregate function to the name. Each additional grouping column slows down the query; therefore I prefer the aggregate function.
SELECT
COUNT(*) AS StaffCount,
FIRST(Manager.Name) AS ManagerName
FROM
Stafflink
INNER JOIN Employee AS Manager
ON StaffLink.ManId = Manager.Id
GROUP BY
StaffLink.ManId
I don't know if it makes a performance difference, but I prefer to group on StaffLink.ManId rather than on Employee.Id, since StaffLink is the main table here and Employee is just used as a lookup table in this query.
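The difference between the Cartesian product and the joined, grouped query can be checked with a small sketch on SQLite via Python's sqlite3 module. The sample rows are made up, and MIN() stands in for FIRST(), which is Access-specific and which SQLite does not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE StaffLink (EmpId INTEGER, ManId INTEGER);
    INSERT INTO Employee VALUES
        (1, 'Alice'), (2, 'Bob'), (3, 'Carol'), (4, 'Dave'), (5, 'Eve');
    -- Carol and Dave report to Alice; Eve reports to Bob.
    INSERT INTO StaffLink VALUES (3, 1), (4, 1), (5, 2);
""")

# No join condition: every Employee row is paired with every StaffLink row.
cartesian_rows = conn.execute(
    "SELECT COUNT(*) FROM Employee, StaffLink"
).fetchone()[0]

# Joined on the manager id and grouped by it; MIN() replaces Access's FIRST().
per_manager = conn.execute("""
    SELECT COUNT(*) AS StaffCount, MIN(m.Name) AS ManagerName
    FROM StaffLink AS s
    INNER JOIN Employee AS m ON s.ManId = m.Id
    GROUP BY s.ManId
    ORDER BY s.ManId
""").fetchall()

print(cartesian_rows)  # 5 employees x 3 links
print(per_manager)
```

Since the name is the same for every row within one ManId group, any aggregate that picks one value from the group gives the same result.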

SQL Maximum number of doctors in a department

my problem is this:
I have a table named
Doctor(id, name, department)
and another table named
department(id, name).
A doctor is associated with a department (only one department, not more).
I have to write a query returning the department with the maximum number of doctors associated with it.
I am not sure how to proceed. I feel like I need to use a nested query, but I just started and I'm really bad at this.
I think it should be something like this, but again I'm not really sure and I can't figure out what to put in the nested query:
SELECT department.id
FROM (SELECT FROM WHERE) , department d, doctor doc
WHERE doc.id = d.id
A common approach to the "Find ABC with the maximum number of XYZs" problem in SQL is as follows:
Query for a list of ABCs that includes each ABC's count of XYZs
Order the list in descending order according to the count of XYZs
Limit the result to a single item; that would be the top item that you need.
In your case, you can do it like this (I am assuming MySQL syntax for taking the top row):
SELECT *
FROM department dp
ORDER BY (SELECT COUNT(*) FROM doctor d WHERE d.department = dp.id) DESC
LIMIT 1
You can use GROUP BY:
SELECT TOP (1) department.id, COUNT(Doctor.id) AS numberOfDocs
FROM department INNER JOIN Doctor ON Doctor.department = department.id
GROUP BY department.id
ORDER BY COUNT(Doctor.id) DESC
I generally avoid subqueries in MySQL because of a well-known bug: MySQL executes the inner query once for every row of the outer query result. So if you have 10 departments, the doctor query is executed 10 times. The bug may have been fixed in MySQL 5.6. In this particular case the number of departments may not be large, so performance may not be your main concern. However, the following solution should work in MySQL and be better optimized. The answer by dasblinkenlight is almost the same and just got ahead of me :). But note that MySQL does not support the TOP keyword.
SELECT dep.id, dep.name, COUNT(doc.id) AS dep_group_count
FROM Doctor doc
JOIN department dep ON doc.department = dep.id
GROUP BY dep.id, dep.name
ORDER BY dep_group_count DESC
LIMIT 1
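The count, sort and take-the-top-row approach can be tried out in a minimal sketch on SQLite via Python's sqlite3 module. The schema follows the question (Doctor(id, name, department) and department(id, name)); the sample rows and the `doctor_count` alias are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE doctor (id INTEGER PRIMARY KEY, name TEXT, department INTEGER);
    INSERT INTO department VALUES
        (1, 'Cardiology'), (2, 'Neurology'), (3, 'Oncology');
    -- Neurology has three doctors, the other departments one each.
    INSERT INTO doctor VALUES
        (1, 'A', 1), (2, 'B', 2), (3, 'C', 2), (4, 'D', 2), (5, 'E', 3);
""")

# Count doctors per department, sort by the count, keep the top row.
top = conn.execute("""
    SELECT dep.id, dep.name, COUNT(doc.id) AS doctor_count
    FROM department AS dep
    INNER JOIN doctor AS doc ON doc.department = dep.id
    GROUP BY dep.id, dep.name
    ORDER BY doctor_count DESC
    LIMIT 1
""").fetchone()

print(top)
```

If two departments tie for the highest count, LIMIT 1 returns just one of them, so a tie-aware query would need a different shape.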

Need help in understanding JOINS in SQL

I was asked the below SQL question in an interview. Kindly explain how it works and what join it is.
Q: There are two tables: table emp contains 10 rows and table department contains 12 rows.
Select * from emp,department;
What is the result and what join it is?
It would return the Cartesian Product of the two tables, meaning that every combination of emp and department would be included in the result.
I believe that the next question would be:
How do you show the correct department for each employee?
That is, show only the combination of emp and department where the employee belongs to the department.
This can be done by:
SELECT * FROM emp LEFT JOIN department ON emp.department_id=department.id;
This assumes that emp has a field called department_id and department has a matching id field (this is quite standard in these types of questions).
The LEFT JOIN means that all items from the left side (emp) will be included, and each employee will be matched with the corresponding department. If no matching department is found, the resulting fields from departments will remain empty. Note that exactly 10 rows will be returned.
To show only the employees with valid department IDs, use JOIN instead of LEFT JOIN. This query will return 0 to 10 rows, depending on the number of matching department ids.
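The row counts for the three variants can be checked with a small sketch on SQLite via Python's sqlite3 module. The table sizes (10 and 12 rows) come from the interview question; the column names follow the assumption above, and the sample data, where two employees point at a department id that does not exist, is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER);
""")

# 12 departments and 10 employees, as in the interview question.
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [(i, f"dept{i}") for i in range(1, 13)],
)
# Employees 1..8 reference existing departments; 9 and 10 reference id 99,
# which does not exist (assumed data to show the LEFT JOIN behaviour).
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(i, f"emp{i}", i if i <= 8 else 99) for i in range(1, 11)],
)


def count(sql: str) -> int:
    """Return the number of rows the given SELECT produces."""
    return conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]


cross_rows = count("SELECT * FROM emp, department")
left_rows = count(
    "SELECT * FROM emp LEFT JOIN department ON emp.department_id = department.id"
)
inner_rows = count(
    "SELECT * FROM emp JOIN department ON emp.department_id = department.id"
)

print(cross_rows, left_rows, inner_rows)
```

The comma join yields every pairing (10 x 12), the LEFT JOIN keeps all 10 employees, and the inner JOIN drops the two employees without a matching department.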
The join you specified is a cross join. It will produce one row for each combination of records in the tables being joined.
I'll let you do the math from there.
This will do a cross join I believe, returning 120 rows. One row for each pair-wise combination of rows from each of the two tables.
All-in-all a fairly useless join most of the time.
You will get all rows from both tables with each row joined together.
This is known as a Cartesian join and is very bad.
You will get a total of 120 rows.
This is also the old implied-join syntax (18 years out of date), and accidental cross joins are a common problem with it. One should never use it. Explicit joins are a better choice. I would also have mentioned this in the interview and explained why. I also would not have taken the job if they actually used crappy syntax like this, because its very use shows me the database is very likely to be poorly designed.