Getting a snapshot of records where an "event" can mean several entries on the same date - sql

This is really frustrating me.
So, I'm making a database recording people joining and leaving our office, as well as changing roles, in order to keep track of headcount. This is succinctly recorded in the following table:
EmployeeID | RoleID | FTE | Date
FTE is the proportion of full-time hours the role is worth (i.e. 1 is full-time, 0.5 is part-time, etc). Leaving events are recorded as changing the role to 0 (Absent) and FTE to 0. The trouble is, people can have more than one role, which means that the number of hours they actually worked is a composite of all the events for that employee that occur on the same day. So if someone goes from full time on one project to splitting their time between two projects, a ChangeRole event is logged for each.
So I want to know the total headcount on a monthly basis. Essentially the query I would want is "Select all records from this table where, for each EmployeeID, the date is the maximum date below a specified date." From there I can sum the FTE to get the headcount.
Now I can get some of those things in isolation: I can do max(date), and I can use a criterion such as <#dd/mm/yyyy#. But for some reason I can't seem to combine it all to get what I want, and I've been staring at the problem so long that it no longer makes sense to me. Can anyone help me out? Thanks!

Something like this?
SELECT Events.*
FROM Events INNER JOIN (
SELECT EmployeeID, Max(Date) AS LatestDate
FROM Events
WHERE Events.Date < [Date entered]
GROUP BY EmployeeID) AS S
ON (Events.EmployeeID = S.EmployeeID) AND (Events.Date = S.LatestDate)
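To check the idea end to end, here is a minimal sketch that runs the same join in SQLite from Python and sums the FTE. The sample rows, the ISO date strings, and the helper names `demo_connection` and `headcount_before` are my own assumptions for illustration; the original question is about Access, where the date literal syntax differs.

```python
import sqlite3


def demo_connection():
    """Build an in-memory Events table with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Events (EmployeeID INTEGER, RoleID INTEGER, FTE REAL, Date TEXT)"
    )
    conn.executemany(
        "INSERT INTO Events VALUES (?, ?, ?, ?)",
        [
            (1, 10, 1.0, "2021-01-04"),  # employee 1 joins full time
            (1, 10, 0.5, "2021-02-01"),  # then splits time across two roles,
            (1, 11, 0.5, "2021-02-01"),  # logged as two events on the same day
            (2, 12, 1.0, "2021-01-11"),  # employee 2 joins full time
            (2, 0, 0.0, "2021-02-15"),   # and later leaves (role 0, FTE 0)
        ],
    )
    return conn


def headcount_before(conn, cutoff):
    """Sum FTE over each employee's latest event date strictly before cutoff."""
    sql = """
        SELECT COALESCE(SUM(e.FTE), 0)
        FROM Events AS e
        INNER JOIN (
            SELECT EmployeeID, MAX(Date) AS LatestDate
            FROM Events
            WHERE Date < ?
            GROUP BY EmployeeID
        ) AS s
          ON e.EmployeeID = s.EmployeeID AND e.Date = s.LatestDate
    """
    return conn.execute(sql, (cutoff,)).fetchone()[0]
```

Calling `headcount_before(conn, "2021-03-01")` on this data counts both of employee 1's half-time rows and employee 2's leaving row, because all rows sharing the latest date are kept by the join.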

TSQL query to find latest (current) record from period column when there are past present and future records

edited as requested:
My apologies. I've been dealing with this a bit and it's well and truly in my head, but not for the reader.
We have multiple records in table A which have multiple entries in the Period column. Say it's like a football schedule. Teams will have multiple dates/times in the Period column.
When we run query:
We want records selected for the most recent games only.
We don't want the earlier games.
We don't want the games "scheduled" and not yet played.
"Last game played" i.e. Period for teams are often on different days.
Table like:
Team Period
Reds 2021020508:00
Reds 2021011107:00
City 2021030507:00
Reds 2021032607:00
City 2021041607:00
Reds 2021050707:00
When I run query, I want to see the records for last game played regardless of date. So if I run the query on 27 Mar 2021, I want:
City 2021030507:00
Reds 2021032607:00
Keep in mind I used the above as an easily understandable example. In my case I have 1000s of "Teams" each of which may have 100+ different date entries in the Period column and I would like the solution to be applicable regardless of number of records, dates, or when the query is run.
What can I do?
Thanks!
So this gives you your desired output using the sample data, does it fulfil your requirement?
create table x (Team varchar(10), period varchar(20))
insert into x values
('Reds','2021020508:00'),
('Reds','2021011107:00'),
('City','2021030507:00'),
('Reds','2021032607:00'),
('City','2021041607:00'),
('Reds','2021050707:00')
select Team, Max(period) LastPeriod
from x
-- HH is the 24-hour clock, matching the stored values; hh would be 12-hour
where period <= Format(GetDate(), 'yyyyMMddHH:mm')
group by Team
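The comparison only works because the period strings are zero-padded and use a 24-hour clock, so text order equals date order. A small sketch of that, using SQLite from Python rather than SQL Server (the function name `last_played` and passing "now" in as a parameter are my assumptions, so the result can be reproduced for any run date):

```python
import sqlite3
from datetime import datetime

# Sample rows from the question.
SAMPLE = [
    ("Reds", "2021020508:00"),
    ("Reds", "2021011107:00"),
    ("City", "2021030507:00"),
    ("Reds", "2021032607:00"),
    ("City", "2021041607:00"),
    ("Reds", "2021050707:00"),
]


def last_played(rows, now):
    """Return (Team, latest period <= now) for every team, ordered by team."""
    # %H is the 24-hour hour, so the cutoff sorts the same way as the data.
    cutoff = now.strftime("%Y%m%d%H:%M")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE x (Team TEXT, period TEXT)")
    conn.executemany("INSERT INTO x VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT Team, MAX(period) AS LastPeriod
        FROM x
        WHERE period <= ?
        GROUP BY Team
        ORDER BY Team
        """,
        (cutoff,),
    ).fetchall()
    conn.close()
    return result
```

Running it with `datetime(2021, 3, 27, 12, 0)` gives the two rows asked for in the question. With a 12-hour format the cutoff for 19:30 would read as 07:30, and any game played that afternoon would be compared incorrectly.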
Because the period is stored as a fixed-width string, it sorts as text in date order, so I think this would work:
SELECT TOP 2 *
FROM tableA
WHERE period <= FORMAT( GETDATE(), 'yyyyMMddHH:mm' )
ORDER BY period DESC
Perhaps you want:
where period = (select max(t2.period) from t t2)
This returns all rows with the last period in the table.

SQL how to implement if and else by checking column value

The table below contains customer reservations. Each customer gets one record in this table when they arrive, and on their last day the record's checkout_date field is updated with the current time.
The Table
Now I need to extract all customers spending nights.
The Query
SELECT reservations.customerid, reservations.roomno, rooms.rate,
reservations.checkin_date, reservations.billed_nights, reservations.status,
DateDiff("d",reservations.checkin_date,Date())
+Abs(DateDiff("s",#12/30/1899 14:30:0#,Time())>0) AS Due_nights
FROM reservations, rooms
WHERE reservations.roomno=rooms.roomno;
What I need is this: if the customer has checkout status, due nights should be calculated from checkin_date to checkout_date instead of to the current date. Also, if the customer has a checkout date, there is no need to add the extra night for being past 14:30.
My current query output is below; my computer time is 14:39, so it adds 1 to every row.
Since you want to calculate the due nights up to the checkout date, or up to the current date if the customer is still checked in, I would suggest using an Immediate If (IIf).
The condition to check is the status of the reservation: if it is checkout, use checkout_date, otherwise use Now(). Something like:
SELECT
reservations.customerid,
reservations.roomno,
rooms.rate,
reservations.checkin_date,
reservations.billed_nights,
reservations.status,
DateDiff("d", checkin_date, IIF(status = 'checkout', checkout_date, Now())) As DueNights
FROM
reservations
INNER JOIN
rooms
ON reservations.roomno = rooms.roomno;
As you might have noticed, I used a JOIN. An explicit JOIN states the relationship between the two tables more clearly than listing both in FROM and matching them in WHERE. Hope this helps!
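For anyone who wants to try the conditional end date outside Access, here is a rough sketch in SQLite via Python. It uses CASE instead of IIf and a difference of calendar dates instead of DateDiff("d", ...). The sample rows, the 'checkin' status value, and the `due_nights` helper are assumptions of mine, and the 14:30 rule from the question is left out, as in the answer above.

```python
import sqlite3

DUE_NIGHTS_SQL = """
    SELECT r.customerid,
           r.roomno,
           rooms.rate,
           -- Whole calendar days between check-in and the chosen end date.
           CAST(julianday(date(CASE WHEN r.status = 'checkout'
                                    THEN r.checkout_date
                                    ELSE :now END))
                - julianday(date(r.checkin_date)) AS INTEGER) AS due_nights
    FROM reservations AS r
    INNER JOIN rooms ON r.roomno = rooms.roomno
    ORDER BY r.customerid
"""


def demo_connection():
    """Create made-up rooms and reservations for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rooms (roomno INTEGER, rate REAL)")
    conn.execute(
        "CREATE TABLE reservations (customerid INTEGER, roomno INTEGER, "
        "checkin_date TEXT, checkout_date TEXT, billed_nights INTEGER, status TEXT)"
    )
    conn.executemany("INSERT INTO rooms VALUES (?, ?)", [(101, 80.0), (102, 95.0)])
    conn.executemany(
        "INSERT INTO reservations VALUES (?, ?, ?, ?, ?, ?)",
        [
            # Already checked out: nights run up to checkout_date.
            (1, 101, "2021-03-01 15:00:00", "2021-03-04 10:00:00", 3, "checkout"),
            # Still in the room: nights run up to the supplied current time.
            (2, 102, "2021-03-02 16:00:00", None, 0, "checkin"),
        ],
    )
    return conn


def due_nights(conn, now):
    """Return (customerid, roomno, rate, due_nights) rows as of `now`."""
    return conn.execute(DUE_NIGHTS_SQL, {"now": now}).fetchall()
```

Passing the current time in as a parameter keeps the result reproducible; in Access the answer's Now() plays that role.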

Query Distinct on a single Column

I have a Table called SR_Audit which holds all of the updates for each ticket in our Helpdesk Ticketing system.
The table is formatted as per the below representation:
|-----------------|------------------|------------|------------|------------|
| SR_Audit_RecID | SR_Service_RecID | Audit_text | Updated_By | Last_Update|
|-----------------|------------------|------------|------------|------------|
|........PK.......|.......FK.........|
I've constructed the below query that provides me with the appropriate output that I require in the format I want it. That is to say that I'm looking to measure how many tickets each staff member completes every day for a month.
select SR_audit.updated_by, CONVERT(CHAR(10),SR_Audit.Last_Update,101) as DateOfClose, count (*) as NumberClosed
from SR_Audit
where SR_Audit.Audit_Text LIKE '%to "Completed"%' AND SR_Audit.Last_Update >= DATEADD(day, -30, GETDATE())
group by SR_audit.updated_by, CONVERT(CHAR(10),SR_Audit.Last_Update,101)
order by CONVERT(CHAR(10),SR_Audit.Last_Update,101)
However the query has one weakness which I'm looking to overcome.
A ticket can be reopened once it's completed, which means that it can be completed again. This allows a staff member to artificially inflate their score by reopening a ticket and completing it again, thus increasing their completed ticket count by one each time they do this.
The table has a field called SR_Service_RecID which is essentially the ticket number. I want to put a condition in the query so that each ticket is only counted once regardless of how many times it's completed, while still honouring the current where clause.
I've tried sub queries and a few other methods but haven't been able to get the results I'm after.
Any assistance would be appreciated.
Cheers.
Courtenay
Use:
COUNT(DISTINCT(SR_Service_RecID)) as NumberClosed
Use:
COUNT(DISTINCT SR_Service_RecID) as NumberClosed
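A quick way to see the difference between the two counts is to run them side by side. This sketch uses SQLite from Python with invented audit rows and audit text; the 30-day filter from the question is dropped so the result does not depend on today's date, and `closes_per_day` is a hypothetical helper name.

```python
import sqlite3


def demo_connection():
    """SR_Audit with one ticket (100) completed twice by the same person."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE SR_Audit (SR_Audit_RecID INTEGER PRIMARY KEY, "
        "SR_Service_RecID INTEGER, Audit_text TEXT, Updated_By TEXT, Last_Update TEXT)"
    )
    conn.executemany(
        "INSERT INTO SR_Audit VALUES (?, ?, ?, ?, ?)",
        [
            (1, 100, 'Status changed to "Completed"', "alice", "2021-03-01 09:00:00"),
            (2, 100, 'Status changed to "Open"', "alice", "2021-03-01 09:05:00"),
            (3, 100, 'Status changed to "Completed"', "alice", "2021-03-01 09:06:00"),
            (4, 101, 'Status changed to "Completed"', "alice", "2021-03-01 11:00:00"),
            (5, 102, 'Status changed to "Completed"', "bob", "2021-03-01 12:00:00"),
        ],
    )
    return conn


def closes_per_day(conn):
    """Return (Updated_By, day, raw count, distinct-ticket count) per person per day."""
    return conn.execute(
        """
        SELECT Updated_By,
               date(Last_Update) AS DateOfClose,
               COUNT(*) AS RawCloses,
               COUNT(DISTINCT SR_Service_RecID) AS NumberClosed
        FROM SR_Audit
        WHERE Audit_text LIKE '%to "Completed"%'
        GROUP BY Updated_By, date(Last_Update)
        ORDER BY DateOfClose, Updated_By
        """
    ).fetchall()
```

Note that the distinct count applies within each group, so a ticket reopened and completed again on a different day, or by a different person, would still be counted once in each of those groups.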

SQL code generation question, group by and aggregates

I'm working on a web application that lets a user design ad-hoc queries against an employee database. The queries are designed in an AJAX web-based interface where the user specifies groups of criteria that get intersected together. I'm trying to add functionality that also lets the user introduce date relationships between criteria. For example, here's some sample (problematic) generated code for a query that says "Give me all employees that had at least 3 audits 150+ days after they started on the job":
select * FROM
(
SELECT employee_id
, max(employee_start_date) employee_start_date
from employees
where employee_salary_type in (55, 66, 77)
group by employee_id having count(*) >= 1
) employee_criteria_1,
(
SELECT employee_id
,max(audit_date) audit_date
from employees
where job_audit_id in (5, 6, 7)
-- They had at least 3 audits
group by employee_id having count(*) >= 3
) employee_criteria_2
WHERE
employee_criteria_1.employee_id = employee_criteria_2.employee_id
-- The audits must have happened at least 150 days after employee's start date
and employee_criteria_2.audit_date > employee_criteria_1.employee_start_date + 150
As you can see, each criterion from the UI is generated as a SQL SELECT block, and all the blocks are intersected together. Here's my problem:
The query above checks that the employee had at least 3 audits and that the last audit (the MAX) occurred 150+ days after the start date, INSTEAD of checking that all 3 audits occurred 150+ days after the start date.
You might ask, "well, why do you have a max(audit_date) statement then?" The reason is that I need to have an aggregate function in order for the group by to work (the group here is generated out of the "occurs at least 3 times" high-level query criteria).
So, what can I add to this code (without many changes, because I'd like to keep this code generation mechanism) so that I'm checking that all 3 of those audits happen 150+ days after the start date (instead of only the max one)?
Thanks!
It sounds like you need to look into window functions, and possibly the having clause rather than the where clause.
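One possible shape for this, which is a plain join rather than the window functions suggested above: apply the 150-day condition to each audit row first, then group and count. The sketch below runs in SQLite from Python. The single-table layout mirrors the generated SQL in the question, but the sample rows and the constant and function names are assumptions of mine, and `julianday` stands in for the question's date arithmetic.

```python
import sqlite3

# Generated-style query from the question: only the latest audit is compared.
MAX_ONLY_SQL = """
    SELECT c1.employee_id
    FROM (SELECT employee_id, MAX(employee_start_date) AS employee_start_date
          FROM employees
          WHERE employee_salary_type IN (55, 66, 77)
          GROUP BY employee_id HAVING COUNT(*) >= 1) AS c1,
         (SELECT employee_id, MAX(audit_date) AS audit_date
          FROM employees
          WHERE job_audit_id IN (5, 6, 7)
          GROUP BY employee_id HAVING COUNT(*) >= 3) AS c2
    WHERE c1.employee_id = c2.employee_id
      AND julianday(c2.audit_date) > julianday(c1.employee_start_date) + 150
    ORDER BY c1.employee_id
"""

# Alternative: filter each audit row by the date rule, then count what is left.
PER_ROW_SQL = """
    SELECT a.employee_id
    FROM employees AS a
    JOIN (SELECT employee_id, MAX(employee_start_date) AS employee_start_date
          FROM employees
          WHERE employee_salary_type IN (55, 66, 77)
          GROUP BY employee_id HAVING COUNT(*) >= 1) AS c1
      ON c1.employee_id = a.employee_id
    WHERE a.job_audit_id IN (5, 6, 7)
      AND julianday(a.audit_date) > julianday(c1.employee_start_date) + 150
    GROUP BY a.employee_id
    HAVING COUNT(*) >= 3
    ORDER BY a.employee_id
"""


def demo_connection():
    """Two employees with 3 audits each; only employee 1 has 3 late audits."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees (employee_id INTEGER, employee_salary_type INTEGER, "
        "employee_start_date TEXT, job_audit_id INTEGER, audit_date TEXT)"
    )
    rows = [
        (1, 55, "2020-01-01", None, None),
        (1, None, None, 5, "2020-07-01"),
        (1, None, None, 6, "2020-08-01"),
        (1, None, None, 7, "2020-09-01"),
        (2, 66, "2020-01-01", None, None),
        (2, None, None, 5, "2020-02-01"),  # too early
        (2, None, None, 6, "2020-03-01"),  # too early
        (2, None, None, 7, "2020-09-01"),  # only this one is 150+ days later
    ]
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def matching_employees(conn, sql):
    """Run one of the queries above and return the employee ids as a list."""
    return [row[0] for row in conn.execute(sql)]
```

On this data the generated-style query returns both employees, while the per-row version returns only employee 1. Whether the code generator can emit a join like this without larger changes is something only the asker can judge.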

SQL: Calculating system load statistics

I have a table like this that stores messages coming through a system:
Message
-------
ID (bigint)
CreateDate (datetime)
Data (varchar(255))
I've been asked to calculate the messages saved per second at peak load. The only data I really have to work with is the CreateDate. The load on the system is not constant, there are times when we get a ton of traffic, and times when we get little traffic. I'm thinking there are two parts to this problem: 1. Determine ranges of time that are considered peak load, 2. Calculate the average messages per second during these times.
Is this the right approach? Are there things in SQL that can help with this? Any tips would be greatly appreciated.
I agree, you have to figure out what Peak Load is first before you can start to create reports on it.
The first thing I would do is figure out how I am going to define peak load. For example, am I going to look at an hour-by-hour breakdown?
Next I would do a group by on the CreateDate formatted to whole seconds (no milliseconds). As part of the group by I would do an avg based on the number of records.
I don't think you'd need to know the peak hours; you can generate them with SQL by wrapping the full query and selecting the top 20 entries, for example:
select top 20 *
from (
[...load query here...]
) qry
order by LoadPerSecond desc
This answer had a good lesson about averages. You can calculate the load per second by looking at the load per hour, and dividing by 3600.
To get a first glimpse of the load for the last week, you could try (SQL Server syntax):
select datepart(dy,createdate) as DayOfYear,
datepart(hh,createdate) as Hour,
count(*)/3600.0 as LoadPerSecond
from message
where CreateDate > dateadd(week,-1,getdate())
group by datepart(dy,createdate), datepart(hh,createdate)
To find the peak load per minute:
select max(MessagesPerMinute)
from (
select count(*) as MessagesPerMinute
from message
where CreateDate > dateadd(day,-7,getdate())
group by datepart(dy,createdate),datepart(hh,createdate),datepart(mi,createdate)
) as PerMinute
Grouping by datepart(dy,...) is an easy way to distinguish between days without worrying about month borders. It works until you select more than a year back, but that would be unusual for performance queries.
Warning: these will run slowly!
This will group your data into "second" buckets and list them from the most activity to the least:
SELECT
CONVERT(char(19),CreateDate,120) AS CreateDateBucket,COUNT(*) AS CountOf
FROM Message
GROUP BY CONVERT(Char(19),CreateDate,120)
ORDER BY 2 Desc
This will group your data into "minute" buckets and list them from the most activity to the least:
SELECT
LEFT(CONVERT(char(19),CreateDate,120),16) AS CreateDateBucket,COUNT(*) AS CountOf
FROM Message
GROUP BY LEFT(CONVERT(char(19),CreateDate,120),16)
ORDER BY 2 Desc
I'd take those values and calculate what they want
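As a rough illustration of the bucketing idea, here is the same approach in SQLite from Python, where `substr` on an ISO timestamp plays the role of CONVERT(char(19), CreateDate, 120). The sample messages and the `busiest_buckets` helper are made up for the example; a width of 19 characters gives second buckets and 16 gives minute buckets.

```python
import sqlite3


def demo_connection():
    """Message table with a small burst of made-up traffic."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Message (ID INTEGER PRIMARY KEY, CreateDate TEXT, Data TEXT)"
    )
    stamps = [
        "2021-03-01 10:00:00.100",
        "2021-03-01 10:00:00.450",
        "2021-03-01 10:00:00.900",
        "2021-03-01 10:00:01.200",
        "2021-03-01 10:05:30.000",
        "2021-03-01 10:05:30.700",
    ]
    conn.executemany(
        "INSERT INTO Message (CreateDate, Data) VALUES (?, ?)",
        [(stamp, "payload") for stamp in stamps],
    )
    return conn


def busiest_buckets(conn, width=19, limit=20):
    """Count messages per time bucket, busiest first.

    width=19 keeps 'YYYY-MM-DD HH:MM:SS' (per second);
    width=16 keeps 'YYYY-MM-DD HH:MM' (per minute).
    """
    return conn.execute(
        """
        SELECT substr(CreateDate, 1, ?) AS CreateDateBucket,
               COUNT(*) AS CountOf
        FROM Message
        GROUP BY CreateDateBucket
        ORDER BY CountOf DESC, CreateDateBucket
        LIMIT ?
        """,
        (width, limit),
    ).fetchall()
```

The first row of the per-second result is the observed peak in messages per second; dividing the first per-minute count by 60 gives an average rate over the busiest minute instead.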