Query Distinct on a single Column - sql

I have a Table called SR_Audit which holds all of the updates for each ticket in our Helpdesk Ticketing system.
The table is formatted as per the below representation:
|-----------------|------------------|------------|------------|------------|
| SR_Audit_RecID | SR_Service_RecID | Audit_text | Updated_By | Last_Update|
|-----------------|------------------|------------|------------|------------|
|........PK.......|.......FK.........|
I've constructed the query below, which gives me the output I need in the format I want. That is, I'm looking to measure how many tickets each staff member completes each day over a month.
select SR_audit.updated_by, CONVERT(CHAR(10),SR_Audit.Last_Update,101) as DateOfClose, count (*) as NumberClosed
from SR_Audit
where SR_Audit.Audit_Text LIKE '%to "Completed"%' AND SR_Audit.Last_Update >= DATEADD(day, -30, GETDATE())
group by SR_audit.updated_by, CONVERT(CHAR(10),SR_Audit.Last_Update,101)
order by CONVERT(CHAR(10),SR_Audit.Last_Update,101)
However the query has one weakness which I'm looking to overcome.
A ticket can be reopened once it's completed, which means that it can be completed again. This allows a staff member to artificially inflate their score by re-opening a ticket and completing it again, thus increasing their completed ticket count by one each time they do this.
The table has a field called SR_Service_RecID which is essentially the ticket number. I want to put a condition in the query so that each ticket is only counted once regardless of how many times it's completed, while still honouring the current where clause.
I've tried sub queries and a few other methods but haven't been able to get the results I'm after.
Any assistance would be appreciated.
Cheers.
Courtenay

Replace count (*) with:
COUNT(DISTINCT(SR_Service_RecID)) as NumberClosed

Use:
COUNT(DISTINCT SR_Service_RecID) as NumberClosed
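
To see the difference between the two counts, here is a minimal sketch that runs the suggested aggregate against an in-memory SQLite database from Python. SQLite stands in for the real server here, and the sample rows, staff names and ISO-formatted dates are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE SR_Audit (
        SR_Audit_RecID   INTEGER PRIMARY KEY,
        SR_Service_RecID INTEGER,
        Audit_Text       TEXT,
        Updated_By       TEXT,
        Last_Update      TEXT
    )
""")
# Hypothetical data: alice completes ticket 100, re-opens it, and completes it again.
rows = [
    (1, 100, 'Status changed to "Completed"', "alice", "2024-03-01"),
    (2, 100, 'Status changed to "Re-opened"', "alice", "2024-03-01"),
    (3, 100, 'Status changed to "Completed"', "alice", "2024-03-01"),
    (4, 101, 'Status changed to "Completed"', "alice", "2024-03-01"),
    (5, 102, 'Status changed to "Completed"', "bob", "2024-03-01"),
]
conn.executemany("INSERT INTO SR_Audit VALUES (?, ?, ?, ?, ?)", rows)

# RawCount is the original count(*); NumberClosed counts each ticket once per group.
result = conn.execute("""
    SELECT Updated_By,
           Last_Update AS DateOfClose,
           COUNT(*) AS RawCount,
           COUNT(DISTINCT SR_Service_RecID) AS NumberClosed
    FROM SR_Audit
    WHERE Audit_Text LIKE '%to "Completed"%'
    GROUP BY Updated_By, Last_Update
    ORDER BY Last_Update, Updated_By
""").fetchall()
print(result)
```

Note that DISTINCT is applied within each (staff member, day) group, so a ticket that is completed again on a different day would still be counted once on each of those days.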


Tableau combining rows with the same info

I have a dashboard in Tableau which shows different payments received: the amount, the date the payment was received, and a calculated field which shows the number of days since the payment was received.
However, a lot of payments are identical, with the same amount and received on the same day, so Tableau collapses these together and adds up the days since receipt in the final column. For example, five lots of £5.50, each received on 1st January, show as below (as of 01/02/2018):
|Column 1|Column 2  |Column 3|
|£5.50   |01/01/2018|155     |
But I need a separate row for each payment. Does anyone know how to stop Tableau doing this, or of a workaround?
Many thanks.
You could try using RANK_UNIQUE function.
First of all, in the Analysis Menu, uncheck Aggregate Measures.
Then, starting from this data:
You can get this result:
Additionally, you may want to hide Rank from the rows simply by not showing its header.
Is this something close to what you're looking for?
EDIT/UPDATE
In order to get all values and not just the top rows, move Rank to the very beginning of the shelf:

SQL how to implement if and else by checking column value

The table below contains customer reservations. Each customer gets one record in this table when they arrive, and on their last day the record's checkout_date field is updated with the current time.
The Table
Now I need to extract the number of nights each customer has spent.
The Query
SELECT reservations.customerid, reservations.roomno, rooms.rate,
       reservations.checkin_date, reservations.billed_nights, reservations.status,
       DateDiff("d", reservations.checkin_date, Date())
         + Abs(DateDiff("s", #12/30/1899 14:30:00#, Time()) > 0) AS Due_nights
FROM reservations, rooms
WHERE reservations.roomno = rooms.roomno;
What I need is: if the customer has checkout status, the due nights should be calculated by subtracting checkin_date from checkout_date instead of from the current date. Also, if the customer has a checkout date, there is no need to add the extra 14:30 adjustment.
My current query view is below. My computer time is 14:39, so it adds 1 to every row.
Since you want to calculate the due nights up to the checkout date for customers who have checked out, and up to the current date for customers who are still checked in, I would suggest using an Immediate If (IIF).
The condition to check is the status of the reservation. If it is 'checkout', use checkout_date; otherwise use Now(). Something like:
SELECT
reservations.customerid,
reservations.roomno,
rooms.rate,
reservations.checkin_date,
reservations.billed_nights,
reservations.status,
DateDiff("d", checkin_date, IIF(status = 'checkout', checkout_date, Now())) As DueNights
FROM
reservations
INNER JOIN
rooms
ON reservations.roomno = rooms.roomno;
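As you might have noticed, I used an explicit INNER JOIN rather than listing both tables in FROM and matching them in WHERE. The explicit form states the join condition more clearly and should perform at least as well. Hope this helps!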
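
As a rough, runnable illustration of the same conditional end date, here is a sketch using SQLite from Python. It is not Access SQL: CASE and julianday() stand in for IIF and DateDiff, the join to rooms is omitted, and the sample rows and the fixed "today" value are assumptions made so the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reservations (
        customerid INTEGER, roomno INTEGER,
        checkin_date TEXT, checkout_date TEXT, status TEXT
    );
    -- Hypothetical rows: one guest has checked out, one is still in house.
    INSERT INTO reservations VALUES
        (1, 101, '2024-03-01', '2024-03-04', 'checkout'),
        (2, 102, '2024-03-02', NULL,         'checkin');
""")

# "today" is passed in as a parameter so the example gives the same result
# whenever it is run; a live query would use the engine's current date.
today = "2024-03-10"
due = conn.execute("""
    SELECT customerid,
           CAST(julianday(CASE WHEN status = 'checkout'
                               THEN checkout_date
                               ELSE ? END)
                - julianday(checkin_date) AS INTEGER) AS DueNights
    FROM reservations
    ORDER BY customerid
""", (today,)).fetchall()
print(due)
```

The checked-out guest is measured up to checkout_date, while the guest still in house is measured up to the supplied current date.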

Getting a snapshot of records where an "event" can mean several entries on the same date

This is really frustrating me.
So, I'm making a database recording people joining and leaving our office, as well as changing roles, in order to keep track of headcount. This is succinctly recorded in the following table:
EmployeeID | RoleID | FTE | Date
FTE is the proportion of full-time hours the role is worth (i.e. 1 is full-time, 0.5 is part-time, etc). Leaving events are recorded as changing the role to 0 (Absent) and FTE to 0. The trouble is, people can have more than one role, which means that the number of hours they actually worked is a composite of all the events for that employee that occur on the same day. So if someone goes from full time on one project to splitting their time between two projects, a ChangeRole event is logged for each.
So I want to know the total headcount on a monthly basis. Essentially the query I would want is "Select all records from this table where, for each EmployeeID, the date is the maximum date below a specified date." From there I can sum the FTE to get the headcount.
Now I can get some of those things in isolation: I can do max(date), and I can do criteria: <#dd/mm/yyyy#. But for some reason I can't seem to combine it all to get what I want, and I'm at a point where I've been staring at the problem so long that it doesn't make sense to me. Can anyone help me out? Thanks!
Something like this?
SELECT Events.*
FROM Events INNER JOIN (
SELECT EmployeeID, Max(Date) AS LatestDate
FROM Events
WHERE Events.Date < [Date entered]
GROUP BY EmployeeID) AS S
ON (Events.EmployeeID = S.EmployeeID) AND (Events.Date = S.LatestDate)
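
A small sketch of how this join behaves, run against SQLite from Python. The sample events, the ISO date strings and the cutoff value are assumptions for illustration; the table and column names follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Events (EmployeeID INTEGER, RoleID INTEGER, FTE REAL, Date TEXT);
    INSERT INTO Events VALUES
        (1, 10, 1.0, '2024-01-05'),   -- employee 1 joins full time
        (1, 10, 0.5, '2024-02-10'),   -- later splits time across two roles,
        (1, 11, 0.5, '2024-02-10'),   -- logged as two events on the same day
        (2, 12, 1.0, '2024-01-20'),   -- employee 2 joins
        (2,  0, 0.0, '2024-02-15'),   -- employee 2 leaves (role 0, FTE 0)
        (3, 10, 1.0, '2024-03-05');   -- employee 3 joins after the cutoff
""")

cutoff = "2024-03-01"
# For each employee, keep every event dated on their latest date before the cutoff.
snapshot = conn.execute("""
    SELECT E.EmployeeID, E.RoleID, E.FTE
    FROM Events AS E
    INNER JOIN (
        SELECT EmployeeID, MAX(Date) AS LatestDate
        FROM Events
        WHERE Date < ?
        GROUP BY EmployeeID
    ) AS S
      ON E.EmployeeID = S.EmployeeID AND E.Date = S.LatestDate
    ORDER BY E.EmployeeID, E.RoleID
""", (cutoff,)).fetchall()

# Summing FTE over the snapshot gives the headcount at the cutoff.
headcount = sum(fte for _, _, fte in snapshot)
print(snapshot, headcount)
```

Because the join matches on the date rather than on a single row, both same-day events for employee 1 survive, which is what the composite FTE calculation needs.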

Fact table designing for SSAS

I'm designing a fact table for SSAS, and this is my first attempt at one. It is for a prototype system, meant to show what could be done so that someone can decide whether it is what they are after.
I've made up some data and am now trying to create the fact table. The cube will be looking at referrals and what I'm trying to show is the information over time showing the number of referrals that opened in a month, number that closed in a month and the number that were open at any point in the month (i.e. they could have opened in previous month and closed in a future month).
Where I'm stuck is how best to design these measures. Should it be three fact tables or can I get away with one? If I do three fact tables, I can link on the record number and the open date to get the number that opened in a month, and I can link on record number and closed date to get the number that closed in a month, but I have no idea how to describe a referral being open at any point in the month. For this table would I need to create a row for every day for every referral? This seems a bit intensive and so immediately I thought it was wrong.
So the questions are twofold:
Can I do the three measures in one table and if so what is the best method for this?
What is the best method for the open at any point in the month count?
Any thoughts would be most appreciated, as I truly am a beginner at this and all I have to aid me is Google, as I have a short deadline for this.
Dimensions I have:
Demographics: Record number; Gender; Ethnicity; Birth date;
Referral: Record number; Open date; End date;
Time: Date; Month; Quarter; Year;
The fact table I initially designed was:
Data:
Record number; Opened_in_month; Closed_in_month; Open_in_month;
Since creating the cube, I can see that the numbers do not match up to what I put in the test data and so I know that I have messed up the fact table and it's that table I need to re-create.
I have little experience with creating cubes in SSAS, but I would probably create views something like this:
ReferralFacts:
Id | IsOpen | DateOpened | OpenedBy | DateClosed | ClosedBy | OpenForMinutes...
CalendarDimension:
ShortDate | Week | Month | Quarter | Year | FinancialWeek...
EmployeeDimension:
Id | FirstName | LastName | LineManager | Department...
DepartmentDimension:
Id | Name | ParentDepartment | Manager | Location...
I don't really see a need for more than one fact table in this case as all of what you describe "by month", "by day" is handled by the calendar dimension.
Here is a really nice walkthrough, and also pcteach.me has some good videos on SSAS.
Have you considered an event-based approach, an event being a referral opening or closing?
First of all, you need to determine the granularity level of your fact table. If you need to know the number of open referrals at a specific date and time in a month, then your fact table must be at the lowest granularity (individual referral records):
FactReferrals: ( DateId, TimeId, EventId, RecordNumber, ReferralEventValue )
Here, ReferralEventValue is just an integer value of 1 when a Referral opens, and -1 when a Referral closes. EventId refers to a dimension with only two members: Opened and Closed.
This approach allows you to get the number of closed or opened events over any given time period. Also, by taking the sum of ReferralEventValue from the beginning of time up to a certain point in time, you get the exact number of open referrals at that specific moment. To speed up this sum in SSAS, you could design aggregations or create a separate measure that is the accumulated sum of ReferralEventValue.
Edit: Of course, if you don't need data at individual referral granularity, you could always sum up the ReferralEventValue per day or even month, before loading the fact table.
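
The arithmetic behind the event-based approach can be sketched outside SSAS. The example below uses SQLite from Python with a simplified fact table: a single EventDate column replaces the DateId and TimeId keys, the sample referrals are invented, and month_measures is a hypothetical helper. It also assumes each referral opens at most once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE FactReferrals (
        EventDate TEXT, RecordNumber INTEGER, ReferralEventValue INTEGER
    );
    -- +1 marks a referral opening, -1 marks it closing (hypothetical data).
    INSERT INTO FactReferrals VALUES
        ('2024-01-10', 1,  1),
        ('2024-01-20', 2,  1),
        ('2024-02-05', 1, -1),
        ('2024-02-12', 3,  1),
        ('2024-03-03', 2, -1);
""")

def month_measures(month_start, next_month_start):
    """Return the three measures for the half-open range [month_start, next_month_start)."""
    opened, closed, open_at_start = conn.execute("""
        SELECT
            COALESCE(SUM(CASE WHEN EventDate >= :s AND EventDate < :e
                               AND ReferralEventValue = 1 THEN 1 ELSE 0 END), 0),
            COALESCE(SUM(CASE WHEN EventDate >= :s AND EventDate < :e
                               AND ReferralEventValue = -1 THEN 1 ELSE 0 END), 0),
            COALESCE(SUM(CASE WHEN EventDate < :s
                              THEN ReferralEventValue ELSE 0 END), 0)
        FROM FactReferrals
    """, {"s": month_start, "e": next_month_start}).fetchone()
    # Open at any point = already open when the month began + opened during it.
    return {"opened": opened, "closed": closed,
            "open_at_any_point": open_at_start + opened}

print(month_measures("2024-02-01", "2024-03-01"))
```

The running sum of ReferralEventValue before the month gives the referrals still open at its start, so no per-day row for each referral is needed.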

Is there a set based solution for this problem?

We have a table set up as follows:
|ID|EmployeeID|Date |Category |Hours|
|1 |1 |1/1/2010 |Vacation Earned|2.0 |
|2 |2 |2/12/2010|Vacation Earned|3.0 |
|3 |1 |2/4/2010 |Vacation Used |1.0 |
|4 |2 |5/18/2010|Vacation Earned|2.0 |
|5 |2 |7/23/2010|Vacation Used |4.0 |
The business rules are:
Vacation balance is calculated by vacation earned minus vacation used.
Vacation used is always applied against the oldest vacation earned amount first.
We need to return the rows for Vacation Earned that have not been offset by vacation used. If vacation used has only offset part of a vacation earned record, we need to return that record showing the difference. For example, using the above table, the result set would look like:
|ID|EmployeeID|Date |Category |Hours|
|1 |1 |1/1/2010 |Vacation Earned|1.0 |
|4 |2 |5/18/2010|Vacation Earned|1.0 |
Note that record 2 was eliminated because it was completely offset by used time, but records 1 and 4 were only partially used, so they were calculated and returned as such.
The only way we have thought of to do this is to get all of the vacation earned records in a temporary table. Then, get the total vacation used and loop through the temporary table, deleting the oldest record and subtracting that value from the total vacation used until the total vacation used is zero. We could clean it up for when the remaining vacation used is only part of the oldest vacation earned record. This would leave us with just the outstanding vacation earned records.
This works, but it is very inefficient and performs poorly. Also, the performance will just degrade over time as more and more records are added.
Are there any suggestions for a better solution, preferably set-based? If not, we'll just have to go with this.
EDIT: This is a vendor database. We cannot modify the table structure in any way.
The following should do it..
(but as others mention, the best solution would be to adjust remaining vacations as they are spent..)
select
    id, employeeid, date, category,
    case
        when earned_so_far + hours - total_spent > hours then
            hours
        else
            earned_so_far + hours - total_spent
    end as hours
from
(
    select
        id, employeeid, date, category, hours,
        (
            select
                isnull(sum(hours),0)
            from
                vacations
            where
                category = 'Vacation Earned'
                and
                date < v.date
                and
                employeeid = v.employeeid
        ) as earned_so_far,
        (
            select
                isnull(sum(hours),0)
            from
                vacations
            where
                category = 'Vacation Used'
                and
                employeeid = v.employeeid
        ) as total_spent
    from
        vacations v
    where category = 'Vacation Earned'
) earned
where
    earned_so_far + hours > total_spent
The logic is:
calculate, for each earned row, the hours earned before it by the same employee (earned_so_far)
calculate the total hours used by that employee (total_spent)
select the row if earned_so_far + hours - total_spent > 0, and return the smaller of hours and that difference
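
To check this logic against the sample data from the question, here is a sketch that runs an equivalent query in SQLite from Python. The adaptations are assumptions of this sketch, not part of the answer: COALESCE replaces isnull, dates are stored as ISO strings so that they compare correctly as text, and the output column is renamed remaining_hours to avoid clashing with the source column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vacations (
        id INTEGER, employeeid INTEGER, date TEXT, category TEXT, hours REAL
    );
    INSERT INTO vacations VALUES
        (1, 1, '2010-01-01', 'Vacation Earned', 2.0),
        (2, 2, '2010-02-12', 'Vacation Earned', 3.0),
        (3, 1, '2010-02-04', 'Vacation Used',   1.0),
        (4, 2, '2010-05-18', 'Vacation Earned', 2.0),
        (5, 2, '2010-07-23', 'Vacation Used',   4.0);
""")

remaining = conn.execute("""
    SELECT id, employeeid,
           CASE WHEN earned_so_far + hours - total_spent > hours
                THEN hours
                ELSE earned_so_far + hours - total_spent
           END AS remaining_hours
    FROM (
        SELECT id, employeeid, hours,
               -- hours earned by this employee before this row
               (SELECT COALESCE(SUM(hours), 0) FROM vacations
                 WHERE category = 'Vacation Earned'
                   AND date < v.date
                   AND employeeid = v.employeeid) AS earned_so_far,
               -- all hours this employee has used
               (SELECT COALESCE(SUM(hours), 0) FROM vacations
                 WHERE category = 'Vacation Used'
                   AND employeeid = v.employeeid) AS total_spent
        FROM vacations AS v
        WHERE category = 'Vacation Earned'
    ) AS earned
    WHERE earned_so_far + hours > total_spent
    ORDER BY id
""").fetchall()
print(remaining)
```

Records 1 and 4 come back with 1.0 hour each and record 2 is dropped, matching the expected result set in the question.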
In thinking about the problem, it occurred to me that the only reason you need to care about when vacation is earned is if it expires. And if that's the case, the simplest solution is to add 'vacation expired' records to the table, such that the amount of vacation remaining for an employee is always just sum(vacation earned) - (sum(vacation expired) + sum(vacation used)). You can even show the exact records you want by using the last vacation expired record as a starting point for the query.
But I'm guessing that's not an option. To address the problem as asked, keep in mind that whenever you find yourself using a temporary table, try putting that data into a CTE (common table expression) instead. Unfortunately I have a meeting right now and so I don't have time to write the query (maybe later, it sounds like fun), but this should get you started.
I find your whole result set confusing and inaccurate, and I can see employees saying, "no I earned 2 hours on Jan 25th not 1." It is not true that they earned 1 hour on that date that was only partially offset, and you will have no end of problems if you choose to display it this way. I'd look at a different way to present the information. Typically you either present a list of all leave actions (earned, expired and used) with a total at the bottom, or you present a summary of available for use and used.
In over 30 years in the workforce, and having been under many different timekeeping systems (as well as having studied even more when I was a management analyst), I have never seen anyone want to display timekeeping information this way. I'm thinking there is a reason. If this is a requirement, I'd suggest pushing back on it and explaining how it will be confusing to read the data this way as well as being difficult to get a well-performing solution. I would not accept this as a requirement without trying to convince the client that it is a poor idea.
As time passes and records are added, performance will get worse and worse unless you do something about it, such as:
Purge old rows once they're "cancelled out" (e.g. vacation earned has had equivalent vacation used rows added and accounted for; vacation used has been applied to mark vacation earned as "expended")
Add a column that flags whether a row has been "cancelled out", and incorporate this column into your indexes
Tracking how the data changes in this fashion seems an argument to modify your table structures (have several, not just one), but that's outside the scope of your current problem.
As for the query itself, I'd build two aggregates, do some subtraction, make that a subquery, then join it on some clever use of one of the ranking functions. Smells like a correlated subquery in there somewhere, too. I may try and hash this out later (I'm short on time), but I bet someone beats me to it.
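
One possible shape for such a query, as a hedged sketch only: it uses CTEs and a running SUM() window function in place of a ranking function, and it runs in SQLite (3.25 or later) from Python rather than on the vendor's server. The ISO date strings, the remaining_hours alias and the tie-break on id are assumptions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vacations (
        id INTEGER, employeeid INTEGER, date TEXT, category TEXT, hours REAL
    );
    INSERT INTO vacations VALUES
        (1, 1, '2010-01-01', 'Vacation Earned', 2.0),
        (2, 2, '2010-02-12', 'Vacation Earned', 3.0),
        (3, 1, '2010-02-04', 'Vacation Used',   1.0),
        (4, 2, '2010-05-18', 'Vacation Earned', 2.0),
        (5, 2, '2010-07-23', 'Vacation Used',   4.0);
""")

remaining = conn.execute("""
    WITH earned AS (
        -- running total of earned hours per employee, oldest first
        SELECT id, employeeid, hours,
               SUM(hours) OVER (PARTITION BY employeeid
                                ORDER BY date, id) AS earned_to_date
        FROM vacations
        WHERE category = 'Vacation Earned'
    ),
    used AS (
        -- one aggregate row per employee with everything they have used
        SELECT employeeid, SUM(hours) AS total_used
        FROM vacations
        WHERE category = 'Vacation Used'
        GROUP BY employeeid
    )
    SELECT e.id, e.employeeid,
           MIN(e.hours, e.earned_to_date - COALESCE(u.total_used, 0)) AS remaining_hours
    FROM earned AS e
    LEFT JOIN used AS u ON u.employeeid = e.employeeid
    WHERE e.earned_to_date > COALESCE(u.total_used, 0)
    ORDER BY e.id
""").fetchall()
print(remaining)
```

An earned row survives only when the running total up to and including it exceeds the total used, and the two-argument MIN caps the result at the row's own hours for rows that are untouched.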
I'd suggest modifying the table to keep track of Balance in its own column. That way, you only need to grab the most recent record to know where the employee stands.
That way, you can satisfy the simple case ("How much vacation time do I have"), while still being able to do the awkward rollup you're looking for in your "Which bits of vacation time don't line up with other bits" report, which I'd hope is something you don't need very often.