Slow MS Access Sub Query - sql

I have three tables in Access:
employees
----------------------------------
id (pk),name
times
----------------------
id (pk),employee_id,event_time
time_notes
----------------------
id (pk),time_id,note
For each employee, I want to get the record from the times table with the latest event_time at or before some given time. Doing that is simple enough with this:
select employees.id, employees.name,
(select top 1 times.id from times where times.employee_id=employees.id and times.event_time<=#2018-01-30 14:21:48# ORDER BY times.event_time DESC) as time_id
from employees
However, I also want to get some indication of whether there's a matching record in the time_notes table:
select employees.id, employees.name,
(select top 1 time_notes.id from time_notes where time_notes.time_id=(select top 1 times.id from times where times.employee_id=employees.id and times.event_time<=#2018-01-30 14:21:48# ORDER BY times.event_time DESC)) as time_note_present,
(select top 1 times.id from times where times.employee_id=employees.id and times.event_time<=#2018-01-30 14:21:48# ORDER BY times.event_time DESC) as last_time_id
from employees
This does work but it's SOOOOO SLOW. We're talking 10 seconds or more if there's 100 records in the employee table. The problem is peculiar to Access as I can't use the last_time_id result of the other sub-query like I can in MySQL or SQL Server.
I am looking for tips on how to speed this up. Either a different query, indexes. Something.

Not sure if something like this would work for you?
SELECT
employees.id,
employees.name,
time_notes.id AS time_note_present,
times.id AS last_time_id
FROM
(
employees LEFT JOIN
(
times INNER JOIN
(
SELECT times.employee_id AS lt_employee_id, max(times.event_time) AS lt_event_time
FROM times
WHERE times.event_time <= #2018-01-30 14:21:48#
GROUP BY times.employee_id
)
AS last_times
ON times.event_time = last_times.lt_event_time AND times.employee_id = last_times.lt_employee_id
)
ON employees.id = times.employee_id
)
LEFT JOIN time_notes ON times.id = time_notes.time_id;
(Completely untested and may contain typos)

Basically, your query runs multiple correlated subqueries, including one nested inside a WHERE clause. A correlated subquery is evaluated separately for each row of the outer query.
Similar to @LeeMac's answer, simply join your tables to an aggregate query that finds the max event_time grouped by employee_id, which runs once across all rows. Below, times is the base FROM table, joined to the aggregate query and to the employees and time_notes tables:
select e.id, e.name, t.event_time, n.note
from ((times t
inner join
(select sub.employee_id, max(sub.event_time) as max_event_time
from times sub
where sub.event_time <= #2018-01-30 14:21:48#
group by sub.employee_id
) as agg_qry
on t.employee_id = agg_qry.employee_id and t.event_time = agg_qry.max_event_time)
inner join employees e
on e.id = t.employee_id)
left join time_notes n
on n.time_id = t.id
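To see the join-to-aggregate idea run end to end, here is a minimal sketch. It uses SQLite through Python's sqlite3 module as a stand-in for Access, so the date literal becomes an ISO string parameter instead of #...#. The sample rows and the index name idx_times_emp_event are made up for illustration, and whether an index on (employee_id, event_time) helps in Access is an assumption, not something measured here.

```python
import sqlite3

# In-memory SQLite database standing in for the Access tables (sample data is hypothetical).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE times (id INTEGER PRIMARY KEY, employee_id INTEGER, event_time TEXT);
CREATE TABLE time_notes (id INTEGER PRIMARY KEY, time_id INTEGER, note TEXT);
CREATE INDEX idx_times_emp_event ON times (employee_id, event_time);
INSERT INTO employees VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO times VALUES
  (10, 1, '2018-01-30 09:00:00'),
  (11, 1, '2018-01-30 14:00:00'),
  (12, 1, '2018-01-30 15:00:00'),
  (13, 2, '2018-01-29 08:00:00');
INSERT INTO time_notes VALUES (100, 11, 'late lunch');
""")

# The aggregate runs once for all employees; the outer joins only look rows up.
query = """
SELECT e.id, e.name, t.id AS last_time_id, n.id AS time_note_id
FROM employees e
LEFT JOIN (
    SELECT employee_id, MAX(event_time) AS max_event_time
    FROM times
    WHERE event_time <= ?
    GROUP BY employee_id
) agg ON agg.employee_id = e.id
LEFT JOIN times t
    ON t.employee_id = agg.employee_id AND t.event_time = agg.max_event_time
LEFT JOIN time_notes n ON n.time_id = t.id
ORDER BY e.id
"""
rows = conn.execute(query, ("2018-01-30 14:21:48",)).fetchall()
for row in rows:
    print(row)
```

Starting from employees with LEFT JOINs keeps employees that have no qualifying times row, which matches the original correlated-subquery version.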

Related

How to add in a Count in the sub Query

I need to add a count to the subquery, but I get an error when I add this line to it:
SELECT COUNT(DISTINCT v.Department_descr), v.Patient_Number, b.patient_name FROM vwGenVouchInfo v) as pat_dept
SELECT *
FROM vwGenVouchInfo v
WHERE v.Patient_Number=b.Patient_Number
AND v.Voucher_Primary_Diagnosis_Code IN ('Z00.129', 'Z00.00') )
ORDER BY voucher_primary_diagnosis_code
I want the result to show, for each patient, the department they visited the most times. I am not sure how to add this in. I want pat_dept to hold the department the patient went to the most times, excluding the dental department.
SELECT distinct
a.voucher_primary_diagnosis_code,
a.Patient_Number
, b.Patient_Name, b.patient_home_phone, patient_age
FROM vwGenVouchInfo a
LEFT JOIN vwGenPatInfo b ON a.Patient_Number=b.Patient_Number
WHERE
a.Department_Descr = 'Dental'
and a.Voucher_Service_Date >= '2015-01-01'
AND NOT EXISTS (
-- This subquery looks at other vouchers of the same patient.
--
SELECT *
FROM vwGenVouchInfo v
WHERE v.Patient_Number=b.Patient_Number
AND v.Voucher_Primary_Diagnosis_Code IN ('Z00.129', 'Z00.00') )
ORDER BY voucher_primary_diagnosis_code
If you need this in the subquery, you will have to put all the necessary columns one by one along with the count and then group by the same columns.
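As a rough illustration of "list the columns next to the count and group by the same columns", here is a sketch using SQLite through Python's sqlite3 module rather than the asker's actual database. The table is a cut-down, hypothetical version of vwGenVouchInfo with only two columns, and the sample rows are made up.

```python
import sqlite3

# Hypothetical, simplified stand-in for vwGenVouchInfo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vwGenVouchInfo (Patient_Number INTEGER, Department_Descr TEXT);
INSERT INTO vwGenVouchInfo VALUES
  (1, 'Dental'), (1, 'Cardiology'), (1, 'Cardiology'), (1, 'Radiology'),
  (2, 'Dental'), (2, 'Radiology');
""")

# Every non-aggregated column in the select list is repeated in GROUP BY.
query = """
SELECT v.Patient_Number, v.Department_Descr, COUNT(*) AS visits
FROM vwGenVouchInfo v
WHERE v.Department_Descr <> 'Dental'
GROUP BY v.Patient_Number, v.Department_Descr
ORDER BY v.Patient_Number, visits DESC
"""
rows = conn.execute(query).fetchall()

# Rows arrive sorted by visit count per patient, so the first one seen is the top department.
pat_dept = {}
for patient, dept, visits in rows:
    pat_dept.setdefault(patient, dept)
print(pat_dept)
```

Picking the top row per patient is done in Python here only to keep the SQL short; the same grouped query could be used as a derived table in the larger statement.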

How to get the value of max() group when in subquery?

I would like to find the department name or department id (dpmid) of the group with the highest average age among all groups. This is my query:
select
MAX(avg_age) as 'Max average age' FROM (
SELECT
AVG(userage) AS avg_age FROM user_data GROUP BY
(select dpmid from department_branch where
(select dpmbid from user_department_branch where
user_data.userid = user_department_branch.userid)=department_branch.dpmbid)
) AS query1
This code shows only the maximum average age, and when I try to show the name of the group it shows the wrong group name.
So, how can I show the name of the group with the maximum when the grouping comes from a subquery on another table?
You may try this..
select MAX(avg_age) as max_avg, SUBSTRING_INDEX(MAX(avg_age_dep),'##',-1) as max_age_dep from
(
SELECT
AVG(userage) as avg_age, CONCAT( AVG(userage), CONCAT('##' ,department_name)) as avg_age_dep
FROM user_data
inner join user_department_branch
on user_data.userid = user_department_branch.userid
inner join department_branch
on department_branch.dpmbid = user_department_branch.dpmbid
inner join department
on department.dpmid = department_branch.dpmid
group by department_branch.dpmid
) tab_avg_age_by_dep
;
I changed your query on the assumption that the department name is stored in a separate "department" lookup table, which is why there is one extra join. If the department name is actually stored in the department_branch table (I don't think so), you can drop that join and take the field from there instead.
Update
In addition, if you want to handle ties (identical averages), you can make each average unique by appending a row number, like this:
select MAX(avg_age) as max_avg, SUBSTRING_INDEX(MAX(avg_age_dep),'##',-1) as max_age_dep from
(
SELECT
AVG(userage) as avg_age, CONCAT( AVG(userage), CONCAT('##', CONCAT( @rownum:=@rownum+1, CONCAT('##' ,department_name)))) as avg_age_dep
FROM user_data
inner join user_department_branch
on user_data.userid = user_department_branch.userid
inner join department_branch
on department_branch.dpmbid = user_department_branch.dpmbid
inner join department
on department.dpmid = department_branch.dpmid
,(SELECT @rownum:=0) r
group by department_branch.dpmid
) tab_avg_age_by_dep
;
I took a shot at what I think you are looking for. The following will give you the department branch with the highest average age. I assumed the department_branch table had a department_name field. You may need an additional join to get the department.
SELECT db.department_name, db.dpmid, AVG(userage) as `Average age`
FROM user_data as ud
JOIN user_department_branch as udb
ON udb.userid = ud.userid
JOIN department_branch as db
ON db.dpmbid = udb.dpmbid
GROUP BY db.dpmid
ORDER BY `Average age` DESC
LIMIT 1
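The "group, sort by the average, keep the first row" approach can be tried out with the sketch below. It runs on SQLite through Python's sqlite3 module instead of MySQL, and the table contents are invented. It assumes, as the question's own subquery suggests, that dpmid lives in department_branch; the department name is left out because its location is not known.

```python
import sqlite3

# Hypothetical sample data for the three tables named in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_data (userid INTEGER PRIMARY KEY, userage INTEGER);
CREATE TABLE user_department_branch (userid INTEGER, dpmbid INTEGER);
CREATE TABLE department_branch (dpmbid INTEGER PRIMARY KEY, dpmid INTEGER);
INSERT INTO user_data VALUES (1, 30), (2, 40), (3, 20), (4, 22);
INSERT INTO user_department_branch VALUES (1, 100), (2, 101), (3, 200), (4, 200);
INSERT INTO department_branch VALUES (100, 1), (101, 1), (200, 2);
""")

# Average age per department, highest first, keeping only the top group.
query = """
SELECT db.dpmid, AVG(ud.userage) AS avg_age
FROM user_data ud
JOIN user_department_branch udb ON udb.userid = ud.userid
JOIN department_branch db ON db.dpmbid = udb.dpmbid
GROUP BY db.dpmid
ORDER BY avg_age DESC
LIMIT 1
"""
row = conn.execute(query).fetchone()
print(row)
```

Because the group key and its average come from the same row, the id cannot drift away from the maximum the way it can when MAX() is applied on top of an unrelated column.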

Multiple grouped items

I can't seem to find out how to get the functionality I want. Here is an example of what my table looks like:
EmpID | ProjectID | hours_worked
3     | 1         | 8
3     | 1         | 8
4     | 2         | 8
4     | 2         | 8
4     | 3         | 8
5     | 4         | 8
I want to group by EmpID and ProjectID and then sum up the hours worked. I then want to inner join the Employee and Project table rows that are associated with EmpID and ProjectID, however when I do this then I get an error about the aggregate function thing, which I understand from research but I don't think this would have that problem since there will be one row per EmpID and ProjectID.
Real SQL:
SELECT
WorkHours.EmpID,
WorkHours.ProjectID,
Employees.FirstName
FROM WorkHours
INNER JOIN Projects ON WorkHours.ProjectID = Projects.ProjectID
INNER JOIN Employees ON WorkHours.EmpID = Employees.EmpID
GROUP BY WorkHours.ProjectID, WorkHours.EmpID
This gives the error:
Column 'Employees.FirstName' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
You might want to use OVER (PARTITION BY) so you won't have to use GROUP BY:
-- DISTINCT collapses the repeated rows, since a window function does not group them
Select DISTINCT W.EmpID
,W.ProjectID
,SUM(W.hours_worked) OVER (PARTITION BY W.EmpID,W.ProjectID) AS TotalHours
,E.FirstName
FROM WorkHours W
INNER JOIN Projects P ON W.ProjectID = P.ProjectID
INNER JOIN Employees E ON W.EmpID = E.EmpID
You can do a basic query to get the grouped hours and use that as a basis for the rest, either in a CTE or as a subquery. For example, as a subquery:
SELECT *
FROM
(SELECT EmpID, ProjectID, SUM(hours_worked) as HoursWorked
FROM WorkHours
GROUP BY EmpID, ProjectID) AS ProjectHours
JOIN Projects
ON Projects.ProjectID = ProjectHours.ProjectID
JOIN Employees
ON Employees.EmpID = ProjectHours.EmpID
One way is to use a CTE to first form the data you want, then join onto the other table(s)
WITH AggregatedHoursWorked
AS
(
SELECT EmpID,
ProjectID,
SUM(hours_worked) AS TotalHours
FROM WorkHours
GROUP BY EmpID, ProjectID
)
SELECT e.FirstName,
p.ProjectName,
hw.TotalHours
FROM AggregatedHoursWorked hw
INNER JOIN Employees e
ON hw.EmpID = e.EmpID
INNER JOIN Projects p
ON hw.ProjectID = p.ProjectID
If you use an aggregate function, every column in the select list must either be inside an aggregate function or be listed in the GROUP BY clause. If you want to join in the descriptions (normally unique for a given ID), you have to include the description columns in the GROUP BY clause. This will not affect the result of the query.
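A small sketch of that last point, run on SQLite through Python's sqlite3 module rather than SQL Server. The table and column names follow the question; the ProjectName column and all sample names are assumptions made for the example.

```python
import sqlite3

# Sample data mirrors the hours table from the question; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (EmpID INTEGER PRIMARY KEY, FirstName TEXT);
CREATE TABLE Projects (ProjectID INTEGER PRIMARY KEY, ProjectName TEXT);
CREATE TABLE WorkHours (EmpID INTEGER, ProjectID INTEGER, hours_worked INTEGER);
INSERT INTO Employees VALUES (3, 'Ann'), (4, 'Bob'), (5, 'Cy');
INSERT INTO Projects VALUES (1, 'P1'), (2, 'P2'), (3, 'P3'), (4, 'P4');
INSERT INTO WorkHours VALUES (3, 1, 8), (3, 1, 8), (4, 2, 8), (4, 2, 8), (4, 3, 8), (5, 4, 8);
""")

# The description columns are added to GROUP BY next to the IDs they depend on.
query = """
SELECT w.EmpID, w.ProjectID, e.FirstName, p.ProjectName,
       SUM(w.hours_worked) AS total_hours
FROM WorkHours w
INNER JOIN Projects p ON w.ProjectID = p.ProjectID
INNER JOIN Employees e ON w.EmpID = e.EmpID
GROUP BY w.EmpID, w.ProjectID, e.FirstName, p.ProjectName
ORDER BY w.EmpID, w.ProjectID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

There is still exactly one output row per EmpID and ProjectID pair, because FirstName and ProjectName are fully determined by those IDs.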

How to retrieve results for top items

I have run a query that gives me the total number of students within each school, but now I need the names of the students within each school while keeping the school with the highest total at the top. How can I add to this query to show me the names of the students?
Here is what I have to show me the total number of students at each school:
SELECT
dbo_tSchools.School,
Count(dbo_tStudent.Student) AS NumberOfStudents
FROM
dbo_tStudent
INNER JOIN dbo_tSchools ON dbo_tStudent.SchoolID=dbo_tSchools.SchoolID
GROUP BY dbo_tSchools.School
ORDER BY Count(dbo_tStudent.Student) DESC;
It's important that I keep the schools ordered from the highest number of students down while listing the students.
In this case you could use a Sub Query to achieve your resultset.
To use order by inside a subquery, you will also need a top or limit operator.
SELECT sc.schoolname
,st.columns...
FROM dbo_tStudent st
INNER JOIN (
SELECT TOP 1000 dbo_tSchools.SchoolID
,min(dbo_tSchools.School) schoolname
,Count(dbo_tStudent.Student) AS NumberOfStudents
FROM dbo_tStudent
INNER JOIN dbo_tSchools ON dbo_tStudent.SchoolID = dbo_tSchools.SchoolID
GROUP BY dbo_tSchools.SchoolID
ORDER BY Count(dbo_tStudent.Student) DESC
) sc ON st.SchoolID = sc.SchoolID
-- the outer ORDER BY is what keeps the largest schools first
ORDER BY sc.NumberOfStudents DESC, sc.schoolname
Assuming that you are using SQL Server, you can use a CTE to join the first aggregate with the details like this:
;WITH cte as (
SELECT TOP 1000 dbo_tSchools.SchoolID, Count(dbo_tStudent.Student) AS NumberOfStudents
FROM
dbo_tStudent
INNER JOIN dbo_tSchools ON dbo_tStudent.SchoolID = dbo_tSchools.SchoolID
GROUP BY dbo_tSchools.SchoolID
ORDER BY Count(dbo_tStudent.Student) DESC
)
SELECT
sc.<your school name column>,
st.<your student columns>
from
dbo_tStudent st
INNER JOIN cte ON st.SchoolID = cte.SchoolID
INNER JOIN dbo_tSchools sc on cte.SchoolID = sc.SchoolID
-- the outer ORDER BY is what keeps the largest schools first
ORDER BY cte.NumberOfStudents DESC
More generally speaking: you need a derived table (your aggregation containing the GROUP BY clause) that is joined with the select statement for the student details. In this example, the CTE is simply a convenient way to write that derived table.
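Here is a minimal sketch of that derived-table pattern, run on SQLite through Python's sqlite3 module instead of SQL Server or Access. The table names schools and students and the sample rows are hypothetical stand-ins for dbo_tSchools and dbo_tStudent.

```python
import sqlite3

# Hypothetical stand-ins for the school and student tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE schools (SchoolID INTEGER PRIMARY KEY, School TEXT);
CREATE TABLE students (StudentID INTEGER PRIMARY KEY, Student TEXT, SchoolID INTEGER);
INSERT INTO schools VALUES (1, 'North'), (2, 'South');
INSERT INTO students VALUES (1, 'Ava', 1), (2, 'Ben', 2), (3, 'Cal', 2), (4, 'Dee', 2);
""")

# The derived table cnt holds one count per school; joining it back to students
# puts that count on every detail row so the outer ORDER BY can sort by it.
query = """
SELECT sc.School, cnt.NumberOfStudents, st.Student
FROM students st
INNER JOIN (
    SELECT SchoolID, COUNT(*) AS NumberOfStudents
    FROM students
    GROUP BY SchoolID
) cnt ON cnt.SchoolID = st.SchoolID
INNER JOIN schools sc ON sc.SchoolID = st.SchoolID
ORDER BY cnt.NumberOfStudents DESC, sc.School, st.Student
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The sort lives in the outermost query on purpose: an ORDER BY inside a derived table or CTE does not by itself guarantee the order of the final result.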

Eliminate duplicate rows from query output

I have a large SELECT query with multiple JOINS and WHERE clauses. Despite specifying DISTINCT (also have tried GROUP BY) - there are duplicate rows returned. I am assuming this is because the query selects several IDs from several tables. At any rate, I would like to know if there is a way to remove duplicate rows from a result set, based on a condition.
I am looking to remove duplicates from results if x.ID appears more than once. The duplicate rows all appear grouped together with the same IDs.
Query:
SELECT e.Employee_ID, ce.CC_ID as CCID, e.Manager_ID, e.First_Name, e.Last_Name, e.Last_Login,
e.Date_Created AS Date_Created, e.Employee_Password AS Password, e.EmpLogin,
ISNULL((SELECT TOP 1 1 FROM Gift g
JOIN Type t ON g.TypeID = t.TypeID AND t.Code = 'Reb'
WHERE g.Manager_ID = e.Manager_ID),0) RebGift,
i.DateCreated as ImportDate
FROM #EmployeeTemp ct
JOIN dbo.Employee e ON ct.Employee_ID = e.Employee_ID
INNER JOIN dbo.Manager m ON e.Manager_ID = m.Manager_ID
LEFT JOIN EmployeeImp i ON e.Employee_ID = i.Employee_ID AND i.Active = 1
INNER JOIN CreditCard_Updates ce ON m.Manager_ID = ce.Manager_ID
LEFT JOIN Manager m2 ON m2.Manager_ID = ce.Modified_By
WHERE ce.CCType ='R' AND m.isT4L = 1
AND CHARINDEX(e.first_name, Selected_Emp) > 0
AND ce.Processed_Flag = @isProcessed
I don't have enough reputation to add a comment, so I'll just try to help you in an answer proper (even though this is more of a comment).
It seems like what you want to do is select distinctly on just one column.
Here are some answers which look like that:
SELECT DISTINCT on one column
How can I SELECT rows with MAX(Column value), DISTINCT by another column in SQL?
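One common way to do "distinct on one column" is to number the rows within each ID and keep only the first. The sketch below shows the idea on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or newer); the same ROW_NUMBER() syntax exists in SQL Server. The results table is a hypothetical stand-in for the output of the big query, and keeping the lowest CCID per employee is an arbitrary choice made for the example.

```python
import sqlite3

# Hypothetical stand-in for the joined result set that contains duplicates.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE results (Employee_ID INTEGER, CCID INTEGER, First_Name TEXT);
INSERT INTO results VALUES (1, 20, 'Ann'), (1, 10, 'Ann'), (2, 30, 'Bob');
""")

# Number the rows inside each Employee_ID group, then keep row 1 of each group.
query = """
SELECT Employee_ID, CCID, First_Name
FROM (
    SELECT r.*,
           ROW_NUMBER() OVER (PARTITION BY Employee_ID ORDER BY CCID) AS rn
    FROM results r
) ranked
WHERE rn = 1
ORDER BY Employee_ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The ORDER BY inside OVER (...) decides which duplicate survives, so it should name whatever column makes one row preferable to the others.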