How to 'add' a column to a query result while the query contains aggregate function? - sql

I have a table named 'Attendance' which is used to record student attendance time in courses. This table has 4 columns, say 'id', 'course_id', 'attendance_time', and 'student_name'. An example of few records in this table is:
23 100 1/1/2010 10:00:00 Tom
24 100 1/1/2010 10:20:00 Bob
25 187 1/2/2010 08:01:01 Lisa
.....
I want to create a summary of the latest attendance time for each course. I created a query below:
SELECT course_id, max(attendance_time) FROM attendance GROUP BY course_id
The result would be something like this
100 1/1/2010 10:20:00
187 1/2/2010 08:01:01
Now, all I want to do is add the 'id' column to the result above. How to do it?
I can't just change the command to something like this
SELECT id, course_id, max(attendance_time) FROM attendance GROUP BY id, course_id
because it would return all the records as if the aggregate function is not used. Please help me.

This is a typical 'greatest per group', 'greatest-n-per-group' or 'groupwise maximum' query that comes up on Stack Overflow almost every day. You can search Stack Overflow for these terms to find many different examples of how to solve this with different databases. One way to solve it is as follows:
SELECT
T2.course_id,
T2.attendance_time,
T2.id
FROM (
SELECT
course_id,
MAX(attendance_time) AS attendance_time
FROM attendance
GROUP BY course_id
) T1
JOIN attendance T2
ON T1.course_id = T2.course_id
AND T1.attendance_time = T2.attendance_time
Note that this query can in theory return multiple rows per course_id if there are multiple rows with the same attendance_time. If that cannot happen then you don't need to worry about this issue. If this is a potential problem then you can solve this by adding an extra grouping on course_id, attendance_time and selecting the minimum or maximum id.
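The tie-breaking idea above can be sketched as follows. This is a minimal illustration, not the only way to do it: it uses Python's built-in sqlite3 module purely so the SQL can be run end to end, stores the timestamps as ISO-8601 text so that MAX() orders them correctly, and adds a hypothetical fourth row (id 26) that ties with id 24.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attendance (id INTEGER PRIMARY KEY, course_id INTEGER, "
    "attendance_time TEXT, student_name TEXT)"
)
conn.executemany(
    "INSERT INTO attendance VALUES (?, ?, ?, ?)",
    [
        (23, 100, "2010-01-01 10:00:00", "Tom"),
        (24, 100, "2010-01-01 10:20:00", "Bob"),
        (25, 187, "2010-01-02 08:01:01", "Lisa"),
        (26, 100, "2010-01-01 10:20:00", "Ann"),  # hypothetical tie with id 24
    ],
)

# Same join as above, plus an outer GROUP BY that keeps one id per tie.
query = """
SELECT T2.course_id, T2.attendance_time, MAX(T2.id) AS id
FROM (
    SELECT course_id, MAX(attendance_time) AS attendance_time
    FROM attendance
    GROUP BY course_id
) T1
JOIN attendance T2
  ON T1.course_id = T2.course_id
 AND T1.attendance_time = T2.attendance_time
GROUP BY T2.course_id, T2.attendance_time
ORDER BY T2.course_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Using MIN(T2.id) instead would keep the earliest-inserted row of each tie.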

What do you need the additional column for? The result already has a course ID, which identifies the data. A synthetic ID added to the query would be useless because it would not refer to anything. If you want the max for a single course, then you can add a WHERE condition like this (note that WHERE must come before GROUP BY):
SELECT course_id, max(attendance_time) FROM attendance WHERE course_id = your_id_here GROUP BY course_id;
If you mean that the column should be named 'id', you can alias it in the query:
SELECT course_id AS id, max(attendance_time) FROM attendance GROUP BY course_id;
You could make a view out of your query to easily access the aggregate data:
CREATE VIEW max_course_times AS SELECT course_id AS id, max(attendance_time) AS max_attendance_time FROM attendance GROUP BY course_id;
SELECT * FROM max_course_times;

For SQL Server 2008 onwards, I like to use a Common Table Expression to add aggregated columns to queries:
WITH AttendanceTimes (course_id, maxTime)
AS
(
SELECT
course_id,
MAX(attendance_time)
FROM attendance
GROUP BY course_id
)
SELECT
a.course_id,
t.maxTime,
a.id
FROM attendance a
INNER JOIN AttendanceTimes t
ON a.course_id = t.course_id
AND a.attendance_time = t.maxTime

Related

Show unique ID's in a table with all extra info

SELECT Personeelsnummer, Achternaam, Voornaam, Departement, SubDep, SubSubDep, FTE, RedenUitDienst, Anciennitëitsdatum, GeldigOp, Schrapping, Ancienniteit, Positie, Nieveau, OmschrijfingStatuut
FROM tbl_Worker
GROUP BY Personeelsnummer
OR
SELECT (DISTINCT Personeelsnummer), Achternaam, Voornaam, Departement, SubDep, SubSubDep, FTE, RedenUitDienst, Anciennitëitsdatum, GeldigOp, Schrapping, Ancienniteit, Positie, Nieveau, OmschrijfingStatuut
FROM tbl_Worker
GROUP BY Personeelsnummer
I have a worker table with 49,000 records that contains a 'snapshot' of all workers EVERY month. What I need is a table with all employees the company 'ever' had, each listed only once. I tried to write the queries shown above, but they do not work.
So I need a query that shows all unique 'Personeelsnummers' with all the extra information about these persons.
What does work is this: SELECT DISTINCT Personeelsnummer FROM tbl_Worker. This gives me a table with 1200 records, but it contains only the numbers, and I need all the extra information.
Instead of GROUP BY, use WHERE to get the first or last record:
SELECT w.*
FROM tbl_Worker as w
WHERE monthcol = (SELECT MAX(w2.monthcol)
FROM tbl_Worker as w2
WHERE w2.Personeelsnummer = w.Personeelsnummer
);
You would use MIN() to get the first month's record. My Dutch is a bit weak, so I don't know which column refers to the date for the record.
For performance, you want an index on tbl_Worker(Personeelsnummer, GeldigOp):
create index idx_tbl_worker_Personeelsnummer_GeldigOp on tbl_Worker(Personeelsnummer, GeldigOp);
EDIT:
Or you can do it with a JOIN:
SELECT w.*
FROM tbl_Worker as w INNER JOIN
(SELECT Personeelsnummer, MAX(GeldigOp) as max_GeldigOp
FROM tbl_Worker
GROUP BY Personeelsnummer
) as ww
ON ww.Personeelsnummer = w.Personeelsnummer and ww.max_GeldigOp = w.GeldigOp;
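To check that the two forms above agree, here is a small sketch run through Python's built-in sqlite3 module. It assumes GeldigOp is the snapshot date (as the index suggestion does), stores it as ISO-8601 text, and uses a cut-down tbl_Worker with made-up rows and only four of the columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_Worker (Personeelsnummer INTEGER, Achternaam TEXT, "
    "Departement TEXT, GeldigOp TEXT)"
)
conn.executemany(
    "INSERT INTO tbl_Worker VALUES (?, ?, ?, ?)",
    [
        (1, "Janssens", "Sales", "2014-01-31"),    # older snapshot
        (1, "Janssens", "Finance", "2014-02-28"),  # latest snapshot
        (2, "Peeters", "IT", "2014-01-31"),
    ],
)

# Correlated subquery: keep the row whose date is the worker's latest.
correlated = """
SELECT w.* FROM tbl_Worker AS w
WHERE w.GeldigOp = (SELECT MAX(w2.GeldigOp) FROM tbl_Worker AS w2
                    WHERE w2.Personeelsnummer = w.Personeelsnummer)
ORDER BY w.Personeelsnummer
"""

# Join against a derived table holding each worker's latest date.
joined = """
SELECT w.* FROM tbl_Worker AS w
INNER JOIN (SELECT Personeelsnummer, MAX(GeldigOp) AS max_GeldigOp
            FROM tbl_Worker GROUP BY Personeelsnummer) AS ww
  ON ww.Personeelsnummer = w.Personeelsnummer
 AND ww.max_GeldigOp = w.GeldigOp
ORDER BY w.Personeelsnummer
"""
latest_correlated = conn.execute(correlated).fetchall()
latest_joined = conn.execute(joined).fetchall()
print(latest_joined)
```

Both queries return one row per Personeelsnummer as long as a worker has at most one snapshot per date.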
You're looking for a group by:
select *
from table
group by field1
In PostgreSQL, this can also be written with DISTINCT ON (the parentheses are required):
select distinct on (field1) *
from table
As seen in this topic.

SQL Server select duplicated rows

I am a newbie to SQL Server, and I want to select all those who changed their department at least once.
The table structure is:
BusinessEntityID
DepartmentID
ShiftID
StartDate
RateChangeDate
Rate
NationalIDNumber
I have the following code to generate an intermediate table
select distinct
DepartmentID, NationalIDNumber
from
Table
where
NationalIDNumber in (select NationalIDNumber
from Ben_VEmployee
group by NationalIDNumber
having count(NationalIDNumber) > 1)
Output:
DepartmentID NationalIDNumber
-----------------------------
1 112457891
2 112457891
4 24756624
4 895209680
5 24756624
5 895209680
7 259388196
My question is: how do I remove the non-duplicate records from the intermediate table above?
So record "7 - 259388196" should be removed because he did not change department.
Thanks.
Try using group by and comparing the maximum and minimum department. If it changed, then these will be different:
select NationalIDNumber
from Ben_VEmployee
group by NationalIDNumber
having min(DepartmentID) <> max(DepartmentID);
If you need the actual departments, you can join this back in to the original data.
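Joining back to the original data could look like the sketch below. It runs the SQL through Python's built-in sqlite3 module only to make it executable; the rows are reconstructed from the output in the question (with the employee in department 7 entered twice, which is an assumption that explains why the count filter let that row through), and the table is cut down to the two relevant columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Ben_VEmployee (DepartmentID INTEGER, NationalIDNumber TEXT)"
)
conn.executemany(
    "INSERT INTO Ben_VEmployee VALUES (?, ?)",
    [
        (1, "112457891"), (2, "112457891"),
        (4, "24756624"), (5, "24756624"),
        (4, "895209680"), (5, "895209680"),
        (7, "259388196"), (7, "259388196"),  # two rows, same department
    ],
)

# The derived table finds employees whose department changed;
# the join brings back the departments they were in.
query = """
SELECT DISTINCT e.DepartmentID, e.NationalIDNumber
FROM Ben_VEmployee AS e
JOIN (SELECT NationalIDNumber
      FROM Ben_VEmployee
      GROUP BY NationalIDNumber
      HAVING MIN(DepartmentID) <> MAX(DepartmentID)) AS changed
  ON changed.NationalIDNumber = e.NationalIDNumber
ORDER BY e.DepartmentID, e.NationalIDNumber
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The employee who only ever appears in department 7 is dropped, while every department of the other employees is kept.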
If you want a list of every ID number that has been in more than one department, you can use
SELECT COUNT(DISTINCT DepartmentID) AS noDepartments
, NationalIDNumber
FROM Table
GROUP BY NationalIDNumber
HAVING COUNT(DISTINCT DepartmentID) > 1
If you want to delete the records for the department the employee used to be in, but isn't any more, then you'd have to know which department that was to know which record to delete! If you do know this, then say so, and we can work it out.

Count number of cases

I have a table called Leaves which has Employee ID, Leave Type and Date. For example, if an employee with ID = 1234 applies for sick leave from 1-June-2014 to 5-June-2014, this is stored in the Leaves table day by day, meaning that the following records are added:
1234 sick leave 1-June-2014
1234 sick leave 2-June-2014
1234 sick leave 3-June-2014
1234 sick leave 4-June-2014
1234 sick leave 5-June-2014
This is considered one case. To clarify what I mean by a case: the total number of cases is the number of leave requests that were submitted… for example:
What I need is to get the following information by SQL statement (I should determine a period: 1-January-2014 to 30-December-2014, for example):
Sick leave cases: 2
Escort leave cases: 2
Study leave cases: 1
I am using PostgreSQL 9.2.
This design is a bit strange: because of the daily rows, if the same person takes several escort leaves, you have to work out which rows belong to which case.
For this particular case you can use something like this (the subquery needs an alias in PostgreSQL):
SELECT COUNT(*), leavetype
FROM (
SELECT leavetype
FROM Leaves
GROUP BY employee_id, leavetype
) t
GROUP BY leavetype;
My suggestion is to use case_start and case_end dates for one case row.
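As a rough sketch of that suggestion, the table below stores one row per leave request. The table name leave_cases, its columns, and all rows are hypothetical, chosen so the counts match the example in the question; Python's built-in sqlite3 module is used only to make the SQL runnable, with dates stored as ISO-8601 text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical redesign: one row per leave request instead of one per day.
conn.execute(
    "CREATE TABLE leave_cases (employee_id INTEGER, leave_type TEXT, "
    "case_start TEXT, case_end TEXT)"
)
conn.executemany(
    "INSERT INTO leave_cases VALUES (?, ?, ?, ?)",
    [
        (1234, "sick", "2014-06-01", "2014-06-05"),
        (1234, "sick", "2014-09-10", "2014-09-11"),  # same person, second case
        (5678, "escort", "2014-03-02", "2014-03-03"),
        (5678, "escort", "2014-07-20", "2014-07-20"),
        (9012, "study", "2014-05-05", "2014-05-09"),
    ],
)

# Counting cases is now a plain GROUP BY over the requests that start
# inside the reporting period.
query = """
SELECT leave_type, COUNT(*) AS cases
FROM leave_cases
WHERE case_start BETWEEN '2014-01-01' AND '2014-12-30'
GROUP BY leave_type
ORDER BY leave_type
"""
cases = conn.execute(query).fetchall()
print(cases)
```

With this layout the two sick leaves of employee 1234 count as two cases, which grouping the daily rows by employee and leave type cannot distinguish.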
Perhaps you can try this:
select leave_type, count(*)
from (
select employee_id, leave_type
from leaves
where date between ...
group by employee_id, leave_type) t
group by leave_type;
select LeaveType, count(EmployeeID) as TotalCases from (
select EmployeeID, LeaveType, count(LeaveType) as count_LeaveType
from Leaves
where Date BETWEEN '2014-01-01' AND '2014-12-30'
group by EmployeeID, LeaveType) as A
group by LeaveType
Please find the SQLFiddle below.
SQL FIDDLE
SELECT leave_type,COUNT(leave_type) FROM
(SELECT leave_type,count(leave_type)
FROM leaves
WHERE Date BETWEEN '2014-01-01' AND '2014-12-31'
GROUP BY leave_type,emp_id) t
GROUP BY leave_type
Hope this solves your issue.
I am not sure how this would be done in PostgreSQL, but in MySQL it can be achieved with the following command (column names containing a space must be quoted with backticks):
QUERY
SELECT `Leave type`, count(`Leave type`)
FROM Leaves
WHERE (Date BETWEEN '2014-01-01' AND '2014-12-30' )
GROUP BY `Leave type`

How to get records from both tables using ms access query

I have 2 Tables in Ms Access
tbl_Master_Employee
tbl_Emp_Salary
I want to show all the employees in the employee table, linked with the employee salary table.
The two tables are linked by the ID column coluqEmpID, which exists in both tables.
The second table has a date column. I need a query that fetches records from both tables for a particular date.
I tried the following query:
select coluqEID as EmployeeID , colEName as EmployeeName,"" as Type, "" as Amt
from tbl_Master_Employee
union Select b.coluqEID as EmployeeID, b.colEName as EmployeeName, colType as Type, colAmount as Amt
from tbl_Emp_Salary a, tbl_Master_Employee b
where a.coluqEID = b.coluqEID and a.colDate = #12/09/2013#
However, it shows duplicates.
Query4
EmployeeID EmployeeName Type Amt
1 LAKSHMANAN
1 LAKSHMANAN Advance 100
2 PONRAJ
2 PONRAJ Advance 200
3 VIJAYAN
4 THIRUPATHI
5 VIJAYAKUMAR
6 GOVINDAN
7 TAMILMANI
8 SELVAM
9 ANAMALAI
10 KUMARAN
How would I rewrite my query to avoid duplicates, or what would be a different way to not show duplicates?
The problem with your query is that you are using union when what you want is a join. The union is first going to list all employees with the first part:
select coluqEID as EmployeeID , colEName as EmployeeName,"" as Type, "" as Amt
from tbl_Master_Employee
and then adds to that list all employee records where they have a salary with a certain date.
Select b.coluqEID as EmployeeID, b.colEName as EmployeeName, colType as Type,
colAmount as Amt
from tbl_Emp_Salary a, tbl_Master_Employee b
where a.coluqEID = b.coluqEID and a.colDate = #12/09/2013#
Is your goal to get a list of all employees and only display salary information for those who have a certain date? Some sample data would be useful. Assuming the data here: SQL Fiddle this query should create what you want.
Select a.coluqEID as EmployeeID, colEName as EmployeeName,
b.colType as Type, b.colAmount as Amt
FROM tbl_Master_Employees as a
LEFT JOIN (select coluqEID, colType, colAmount FROM tbl_EMP_Salary
where colDate = '20130912') as b ON a.coluqEID = b.coluqEID;
The first step is to create a select that will get you just the salaries that you want by date. You can then perform a join on this as if you were performing a separate query. You use a LEFT JOIN because you want all of the records from one side, the employees, and only the records that match your criteria from the second side, your salaries.
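The LEFT JOIN described above can be tried out with the sketch below. It runs through Python's built-in sqlite3 module rather than MS Access, so the date literal is an assumption (ISO-8601 text instead of an Access #...# literal), the table name follows the question's tbl_Master_Employee, and the third salary row is made up to show a non-matching date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_Master_Employee (coluqEID INTEGER, colEName TEXT)")
conn.execute(
    "CREATE TABLE tbl_Emp_Salary (coluqEID INTEGER, colType TEXT, "
    "colAmount INTEGER, colDate TEXT)"
)
conn.executemany(
    "INSERT INTO tbl_Master_Employee VALUES (?, ?)",
    [(1, "LAKSHMANAN"), (2, "PONRAJ"), (3, "VIJAYAN")],
)
conn.executemany(
    "INSERT INTO tbl_Emp_Salary VALUES (?, ?, ?, ?)",
    [
        (1, "Advance", 100, "2013-09-12"),
        (2, "Advance", 200, "2013-09-12"),
        (3, "Advance", 50, "2013-10-01"),  # hypothetical row on another date
    ],
)

# Filter the salaries by date first, then LEFT JOIN so that employees
# without a matching salary row still appear (with NULLs).
query = """
SELECT a.coluqEID AS EmployeeID, a.colEName AS EmployeeName,
       b.colType AS Type, b.colAmount AS Amt
FROM tbl_Master_Employee AS a
LEFT JOIN (SELECT coluqEID, colType, colAmount
           FROM tbl_Emp_Salary
           WHERE colDate = '2013-09-12') AS b
  ON a.coluqEID = b.coluqEID
ORDER BY a.coluqEID
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Each employee appears exactly once, which is what the UNION version could not guarantee.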
I believe you will need a join. As for your question about unique names:
select DISTINCT coluqEID as EmployeeID
Adding the DISTINCT keyword returns only unique rows.

SQL query for the given table

I have 2 tables, Student and Supervisor:
STUDENT(supervisorid(pk),name,email....)
SUPERVISOR(supervisorid(pk),name,email....)
Now I need to print supervisor name, email and the # of students under the supervisor (they will have same supervisor id). Something like:
select supervisorname,
supervisoremail,
tot_stud as (select count(*)
Phd_Student s
where s.supervisor_id = r.supervisor_id)
from Phd_Supervisor r
Can you please tell me the SQL query for this.
You will want to use the group by clause for this query. Select all of the fields you want to display along with count(*), join the two tables on the supervisor ID, and then list every display field (but not count(*)) in the group by clause, since those are the fields the students are grouped by to get their count.
select supervisorname,
supervisoremail,
(select count(*)
from Phd_Student s
where s.supervisor_id = r.supervisor_id) as tot_stud
from Phd_Supervisor r
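For comparison, the GROUP BY approach described above might look like the sketch below, checked against the correlated subquery. The rows, the student_id and name columns, and the use of a LEFT JOIN (so that supervisors without students are reported with 0) are assumptions; Python's built-in sqlite3 module is used only to run the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Phd_Supervisor (supervisor_id INTEGER PRIMARY KEY, "
    "supervisorname TEXT, supervisoremail TEXT)"
)
conn.execute(
    "CREATE TABLE Phd_Student (student_id INTEGER PRIMARY KEY, name TEXT, "
    "supervisor_id INTEGER)"
)
conn.executemany(
    "INSERT INTO Phd_Supervisor VALUES (?, ?, ?)",
    [(1, "Dr. A", "a@example.org"), (2, "Dr. B", "b@example.org")],
)
conn.executemany(
    "INSERT INTO Phd_Student VALUES (?, ?, ?)",
    [(10, "S1", 1), (11, "S2", 1)],
)

# GROUP BY version: COUNT(s.supervisor_id) ignores the NULLs produced by
# the LEFT JOIN, so a supervisor with no students gets 0.
grouped = """
SELECT r.supervisorname, r.supervisoremail, COUNT(s.supervisor_id) AS tot_stud
FROM Phd_Supervisor r
LEFT JOIN Phd_Student s ON s.supervisor_id = r.supervisor_id
GROUP BY r.supervisor_id, r.supervisorname, r.supervisoremail
ORDER BY r.supervisor_id
"""

# Correlated-subquery version, as in the answer above.
correlated = """
SELECT supervisorname, supervisoremail,
       (SELECT COUNT(*) FROM Phd_Student s
        WHERE s.supervisor_id = r.supervisor_id) AS tot_stud
FROM Phd_Supervisor r
ORDER BY r.supervisor_id
"""
by_group = conn.execute(grouped).fetchall()
by_subquery = conn.execute(correlated).fetchall()
print(by_group)
```

With an inner join instead of the LEFT JOIN, supervisors without students would disappear from the grouped result.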