Historical Data Modeling - sql

In AdventureWorks2008R2 the Sales.SalesPerson table contains a TerritoryID which creates an easy reference to the current Territory assigned to a SalesPerson. The Sales.SalesTerritoryHistory table is also available to analyze past assignments.
I noticed the HumanResources.EmployeeDepartmentHistory follows a similar pattern; however, the HumanResources.Employee table does not have a direct reference to the current Department. In other words, there is no DepartmentID on the HumanResources.Employee table.
Is there a good reason why they wouldn't follow the same pattern?

It's most likely the employee can work in multiple departments at the same time and that's why it is set up that way

Related

Ms-Access: counting from 2 tables

I have two tables in a Database
and
I need to retrieve the number of staff per manager in the following format
I've been trying to adapt an answer to another question
SELECT bankNo AS "Bank Number",
COUNT (*) AS "Total Branches"
FROM BankBranch
GROUP BY bankNo
As
SELECT COUNT (*) AS StaffCount ,
Employee.Name AS Name
FROM Employee, Stafflink
GROUP BY Name
As I look at the Group BY I'm thinking I should be grouping by The ManID in the Stafflink Table.
My output with this query looks like this
So it is counting correctly but as you can see it's far off the output I need to get.
Any advice would be appreciated.
You need to join the Employee and Stafflink tables. It appears that your FROM clause should look like this:
FROM Employee INNER JOIN StaffLink ON Employee.ID = StaffLink.ManID
You have to join the Eployee table twice to get the summary of employees under manager
select count(*) as StaffCount,Manager.Name
from Employee join Stafflink on employee.Id = StaffLink.EmpId
join Employee as Manager on StaffLink.ManId = Manager.Id
Group by Manager.Name
The answers that advise you on how to join are correct, assuming that you want to learn how to use SQL in MS Access. But there is a way to accomplish the same thing using the ACCESS GUI for designing queries, and this involves a shorter learning curve than learning SQL.
The key to using the GUI when more than one table is involved is to realize that you have to define the relationships between tables in the relationship manager. Once you do that, designing the query you are after is a piece of cake, just point and click.
The tricky thing in your case is that there are two relationships between the two tables. One relationship links EmpId to ID and the other links ManId to ID.
If, however, you want to learn SQL, then this shortcut will be a digression.
If you don't specify a join between the tables, a so called Cartesian product will be built, i.e., each record from one table will be paired with every record from the other table. If you have 7 records in one table and 10 in the other you will get 70 pairs (i.e. rows) before grouping. This explains why you are getting a count of 7 per manager name.
Besides joining the tables, I would suggest you to group on the manager id instead of the manager name. The manager id is known to be unique per manager, but not the name. This then requires you to either group on the name in addition, because the name is in the select list or to apply an aggregate function on the name. Each additional grouping slows down the query; therefore I prefer the aggregate function.
SELECT
COUNT(*) AS StaffCount,
FIRST(Manager.Name) AS ManagerName
FROM
Stafflink
INNER JOIN Employee AS Manager
ON StaffLink.ManId = Manager.Id
GROUP BY
StaffLink.ManId
I don't know if it makes a performance difference, but I prefer to group on StaffLink.ManId than on Employee.Id, since StaffLink is the main table here and Employee is just used as lookup table in this query.

Retrieving "views" in Access 2010

My SQL Code:
SELECT EmpLastName +', '+ EmpFirstName AS ProgramSupervisorName, TeamNo
FROM Employee, Salary, ProgramSupervisor
WHERE Employee.EmpNo = Salary.EmpNo
AND Salary.EmpNo = ProgramSupervisor.EmpNo
ORDER BY TeamNo
I realize that Access doesn't support creating views. The problem I'm running into is that I'm looking to group employees by ProgramSupervisor name but program supervisors and employees are both part of the employee table. The different types of employees are differentiated by their PosNo and employees belong to ProgramSupervisors through a series of tables (Hourly --> ISL <--- ProgramSupervisor). That being said, I can't reference ProgramSupervisorName and EmpName AS a renamed field in just one SELECT Statement because they are from the same table, differentiated by their positionNo. I was hoping I could create a query or "view", like the code above, that takes care of pulling the ProgramSupervisor name from the employee table then using that view in another query. In the other query I could then use the code : "EmpLastName +', '+ EmpFirstName AS EmpName." The pages I've searched online are too vague for my limited understanding so please explain to me simply. I'll try to clarify any confusion also. I'll include my ERD so you can see where I'm coming from:
EDIT: query so far
SELECT EmpLastName +', '+ EmpFirstName AS EmpName, ProgramSupervisorName,
ProgSupName.TeamNo
FROM Employee, ISL, Hourly, Salary, ProgramSupervisor, ProgSupName
WHERE Employee.EmpNo = Hourly.EmpNo
AND Hourly.ISLNo = ISL.ISLNo
AND Employee.EmpNo = Salary.EmpNo
AND Salary.EmpNo = ProgramSupervisor.EmpNo
AND ProgramSupervisor.EmpNo = ISL.ProgramSupervisor_EmpNo
ORDER BY ProgSupName.TeamNo
Do I need to relate all of these tables if I've done them already in the ProgSupName query?
I realize that Access doesn't support creating views.
For the record, Access does support the creation of saved Queries, which are just "Views" by another name. (In fact, the OLEDB variant of Access SQL DDL does support CREATE VIEW as a way of creating a saved Query.)
The key to answering this question lies in two parts:
The first is using an Access stored query like a view. That has been answered here.
The second is referencing the same table twice in a query using different column or table aliases. That has been asnwered in the correct answer you got to your other question, asked here:
Displaying the same fields AS different names from the same table -Access 2010
Combine those two answers, and you have the solution.

How to create views against a hierarchical table

I am learning how to create view using SQL server. I am trying to add a view called Managers in the Northwind database that shows only employees that supervise other employees. This is what I have so far.
Create View Manager_vw
As
Select LastName,
FirstName,
EmployeeID
From Employees
Where
What I am stuck on is how and I going to put in supervise other employees. I am not to sure how to do this. If someone can help me understand how to do this.
In northwind.dbo.employees you would find employees who supervise other employees by looking the reportsto column. Basically you want to return employees whose id is in the reportsto column in another row. That can be done like this:
SELECT LastName,
FirstName,
EmployeeID
FROM employees E
WHERE EXISTS(SELECT * FROM Employees WHERE reportsTo = E.EmployeeID)
The EXISTS is like a JOIN but is usually implemented as a "semi-join" which will stop processing after it finds a singe match (rather than finding all the subordinate employees which would take extra work) Because it doesn't return any additional records, you also save the cost of the additional step to eliminate duplicates (a JOIN would do more work to process the join, and even more work to undo the work that wasn't necessary by doing a DISTINCT.)
You'll notice that I reference E.EmployeeID in the subquery, which relates the subquery to the outer query, this is called a Correlated Subquery.
A word of caution: Views have their place in a DB but can easily be misused. When an engineer comes to the database from an OO background, views seem like a convenient way to promote inheritance and reusability of code. Often people eventually find themselves in a position where they have nested views joined to nested views of nested views. SQL processes nested views by essentially taking the definition of each individual view and expanding that into a beast of a query that will make your DBA cry.
Also, you followed excellent practice in your example and I encourage you to continue this. You specified all your columns individually, never ever use SELECT * to specify the results of your view. It will, eventually, ruin your day. You'll see I do have a SELECT * in my EXISTS clause but EXISTS does not return a resultset and the optimizer will ignore that in that specific case.
Here's another option:
SELECT DISTINCT manager_tbl.*
FROM Employees AS staff_tbl
JOIN Employees AS manager_tbl
ON staff_tbl.ReportsTo = manager_tbl.EmployeeID
Adapted from this site. There are a number of example queries there that you might find interesting and useful.
Notes:
Using the DISTINCT keyword because a single manager could have more than one direct report. DISTINCT will omit the repetition caused by such a one-to-many relationship.
The Employees table in the Northwind database is an example of a hierarchical relationship modeled in a single table.
All together:
CREATE VIEW Manager_vw
AS
SELECT DISTINCT manager_tbl.*
FROM Employees AS staff_tbl
JOIN Employees AS manager_tbl
ON staff_tbl.ReportsTo = manager_tbl.EmployeeID

Priority in SQL queries

Three tables:
Employees {EmployeeID, EmployeeName}
Courses {CourseID, CourseName}
TrainingLog {LogID, EmployeeID, CourseID)
There's one position left in a course and the priority is to give it to someone who's never done a course before, then someone who has done courses but not this one, then anyone. The course must be filled.
Can the appropriate employee be found in one query?
Ya. It can be done.
Join Employees with TrainingLog on the courseId in question.
Some number of employees will not match yielding a NULL courseid.
order the set by DESC with NULLs first in the list.
rank them by rownum, and pick the first row.
if there is no first row, then there is noone to assign to the class.
otherwise,
if the first one in the list will be the person to assign.
base on the problem you want to resolve this could not be possible if you'll try to do it on query alone you have to work the logic with your application as well perhaps you'd love to create a stored procedure for this kind of conditions.

SQL - Updating records based on most recent date

I am having difficulty updating records within a database based on the most recent date and am looking for some guidance. By the way, I am new to SQL.
As background, I have a windows forms application with SQL Express and am using ADO.NET to interact with the database. The application is designed to enable the user to track employee attendance on various courses that must be attended on a periodic basis (e.g. every 6 months, every year etc.). For example, they can pull back data to see the last time employees attended a given course and also update attendance dates if an employee has recently completed a course.
I have three data tables:
EmployeeDetailsTable - simple list of employees names, email address etc., each with unique ID
CourseDetailsTable - simple list of courses, each with unique ID (e.g. 1, 2, 3 etc.)
AttendanceRecordsTable - has 3 columns { EmployeeID, CourseID, AttendanceDate, Comments }
For any given course, an employee will have an attendance history i.e. if the course needs to be attended each year then they will have one record for as many years as they have been at the company.
What I want to be able to do is to update the 'Comments' field for a given employee and given course based on the most recent attendance date. What is the 'correct' SQL syntax for this?
I have tried many things (like below) but cannot get it to work:
UPDATE AttendanceRecordsTable
SET Comments = #Comments
WHERE AttendanceRecordsTable.EmployeeID = (SELECT EmployeeDetailsTable.EmployeeID FROM EmployeeDetailsTable WHERE (EmployeeDetailsTable.LastName =#ParameterLastName AND EmployeeDetailsTable.FirstName =#ParameterFirstName)
AND AttendanceRecordsTable.CourseID = (SELECT CourseDetailsTable.CourseID FROM CourseDetailsTable WHERE CourseDetailsTable.CourseName =#CourseName))
GROUP BY MAX(AttendanceRecordsTable.LastDate)
After much googling, I discovered that MAX is an aggregate function and so I need to use GROUP BY. I have also tried using the HAVING keyword but without success.
Can anybody point me in the right direction? What is the 'conventional' syntax to update a database record based on the most recent date?
So you want to update the AttendantsRecordsTable, and set the comment to the comment in the most recent CourseDetailsTable for each employee?
UPDATE
dbo.AttendanceRecordsTable
SET
Comments = #Comments
FROM
CourseDetailsTable cd
INNER JOIN
Employee e ON e.EmployeeID = AttendanceRecordTable.EmployeeID
WHERE
e.LastName = #LastName
AND e.FirstName = #FirstName
AND cd.CourseName = #CourseName
AND AttendanceRecordsTable.CourseID = cd.CourseID
AND AttendanceRecordsTable.LastDate =
(SELECT MAX(LastDate)
FROM AttendanceRecordsTable a
WHERE a.EmployeeID = e.EmployeeID
AND a.CourseID = cd.CourseID)
I think something like that should work.
You basically need to do a join between the AttendanceRecordTable, which you want to update, and the Employee and CourseDetailsTable tables. For these two, you have defined certain parameters to select a single row each, and then you need to make sure to update only that last AttendanceRecordTable entry which you do by making sure it's the MAX(LastDate) of the table.
The subselect here:
(SELECT MAX(LastDate)
FROM AttendanceRecordsTable a
WHERE a.EmployeeID = e.EmployeeID AND a.CourseID = cd.CourseID)
will select the MAX (last) of the LastDate entries in AttendanceRecordsTable, based on selection of a given employee (e.EmployeeID) and a given course (cd.CourseID).
Pair that with the selects to select the single employee by first name and last name (that of course only works if you never have two John Miller in your employee table!). You also select the course by means of the course name, so that too must be unique - otherwise you'll get multiple hits in the course table.
Marc
Assuming that you primary key on the AttendanceRecordsTable is id:
UPDATE AttendanceRecordsTable SET Comments = #Comments
WHERE AttendanceRecordsTable.id = (
SELECT AttendanceRecordsTable.id
FROM EmployeeDetailsTable
JOIN AttendanceRecordsTable ON AttendanceRecordsTable.EmployeeID = EmployeeDetailsTable.EmployeeID·
JOIN CourseDetailsTable ON AttendanceRecordsTable.CourseID = CourseDetailsTable.CourseID
WHERE
EmployeeDetailsTable.LastName =#ParameterLastName AND EmployeeDetailsTable.FirstName =#ParameterFirstName AND
CourseDetailsTable.CourseName =#CourseName
ORDER BY AttendanceRecordsTable.LastDate DESC LIMIT 1)
Basically, that sub select will first join the attendence, employee and coursedetail tables, extract those rows where the employee's and course details' name match those given by your parameters and limit the output in reverted order to one line. You might want to test that sub-select statement first.
Edit: I just read your posting again, you don't have a single primary key column on AttendanceRecordsTable. Bummer.