Result of a Query in One Column on Another Query - sql

I need to create a report that shows attendance at a weekly class. We use an electronic check-in system so each person's info is saved with a date into the database.
I can easily write a query that gets the list of people who checked in to the class on a given day. The results from that query look something like this:
+-----------+------------+-----------+
| person id | first name | last name |
+-----------+------------+-----------+
| 1234      | john       | smith     |
| 1235      | jane       | smith     |
+-----------+------------+-----------+
But what I need is an additional column that says how many times the particular person has attended that class within a 12 week period. So the results I want would look something like this:
+-----------+------------+-----------+------------+
| person id | first name | last name | attendance |
+-----------+------------+-----------+------------+
| 1234      | john       | smith     | 3          |
| 1235      | jane       | smith     | 5          |
+-----------+------------+-----------+------------+
I can also get the proper results for the "attendance" column with a COUNT query, but that query only works when I hard-code a single person ID against the attendance table.
So I need a query that takes the person IDs from the first query, figures out how many times each person has attended in the last 12 weeks, and adds that as a column.
Additional information:
It needs to count how many times the person has attended that class in a 12-week period that includes the last Sunday.
It is always 12 weeks back from the current date.
I suppose it is possible they could have attended two classes in a day, but all the classes are on a Sunday and it is unlikely enough that I do not need to account for it.
It is all based on Sundays, so the answer is really 12 weeks back from last Sunday.
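Since everything hangs off "last Sunday", it's worth pinning down that date arithmetic first. A minimal Python sketch (the helper names here are made up, not from any of the queries below) of the 12-week window ending on the most recent Sunday:

```python
from datetime import date, timedelta

def last_sunday(today):
    """Most recent Sunday on or before `today` (a Sunday counts as itself)."""
    # Python weekdays: Monday=0 .. Sunday=6, so days since Sunday = (weekday+1) % 7
    return today - timedelta(days=(today.weekday() + 1) % 7)

def twelve_week_window(today):
    """(start, end) of the 12-week window ending on the last Sunday."""
    end = last_sunday(today)
    return end - timedelta(weeks=12), end

start, end = twelve_week_window(date(2019, 3, 13))  # a Wednesday
print(start, end)  # 2018-12-16 2019-03-10
```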
Here's my query that returns the results of who checked in last Sunday:
SELECT
    pb.person_id, pb.nick_name, pb.last_name, cos.occurrence_name,
    cos.occurrence_description, cos.date_created, cos.occurrence_type,
    cos.occurrence_id, sm.role_luid
FROM core_v_occurrence_service cos
JOIN core_occurrence_attendance oa ON oa.occurrence_id = cos.occurrence_id
JOIN core_v_person_basic pb ON pb.person_id = oa.person_id
JOIN smgp_member sm ON sm.person_id = oa.person_id
WHERE oa.attended = 1
  AND cos.occurrence_type = 140
  AND cos.date_created BETWEEN DATEADD(week, -1, GETDATE()) AND GETDATE()
  AND sm.role_luid IN (24, 25, 28)
  AND sm.group_id = 3
And here's the query that gives me the count of how many times someone has attended a class:
SELECT COUNT(oa.person_id)
FROM core_occurrence_attendance oa
JOIN core_v_occurrence_service cos ON cos.occurrence_id = oa.occurrence_id
WHERE oa.person_id = 27276
  AND oa.attended = 1
  AND cos.occurrence_type = 140
  AND cos.date_created BETWEEN DATEADD(week, -12, GETDATE()) AND GETDATE()
Basically what I need is a query that counts how many times a person's ID appears in the attendance table, because a person's ID is entered into that table every time they attend the class.

You should be able to add a GROUP BY to the query:
SELECT Person_ID, FIRST_NAME, LAST_NAME, COUNT(Person_ID) AS Attendance
FROM <YOURTABLE>
WHERE <DateField> BETWEEN <begindate> AND <enddate>
GROUP BY Person_ID, FIRST_NAME, LAST_NAME
The GROUP BY will count how many times each person has shown up for class and put it into one cell.
I'm not guaranteeing this code runs, because your post is missing some information; replace everything in the angle brackets.

I did it!
I created a CTE for the query that counts the attendance.
I then did a SELECT DISTINCT on that CTE and added a GROUP BY with all the column names.
I also added the 12-week date range I needed inside the CTE.
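For anyone landing here later, the shape of that solution can be sketched like this. This is a minimal SQLite/Python mock-up with simplified, hypothetical table and column names (the real schema uses core_occurrence_attendance etc.), not the production query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE attendance (person_id INTEGER, class_date TEXT, attended INTEGER);
INSERT INTO person VALUES (1234, 'john', 'smith'), (1235, 'jane', 'smith');
INSERT INTO attendance VALUES
  (1234, '2019-01-06', 1), (1234, '2019-02-03', 1), (1234, '2019-03-10', 1),
  (1235, '2019-01-06', 1), (1235, '2019-01-13', 1), (1235, '2019-02-03', 1),
  (1235, '2019-02-17', 1), (1235, '2019-03-10', 1);
""")

rows = conn.execute("""
WITH counts AS (            -- CTE: check-ins per person in the 12-week window
  SELECT person_id, COUNT(*) AS attendance
  FROM attendance
  WHERE attended = 1
    AND class_date BETWEEN '2018-12-16' AND '2019-03-10'
  GROUP BY person_id
)
SELECT p.person_id, p.first_name, p.last_name, c.attendance
FROM person p
JOIN counts c ON c.person_id = p.person_id
WHERE EXISTS (              -- only people who checked in on the last Sunday
  SELECT 1 FROM attendance a
  WHERE a.person_id = p.person_id AND a.class_date = '2019-03-10')
ORDER BY p.person_id
""").fetchall()
print(rows)  # [(1234, 'john', 'smith', 3), (1235, 'jane', 'smith', 5)]
```

The CTE does the counting once, so the outer query stays a plain join instead of a correlated COUNT per row.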

Related

SQL table truncates when using a trigger to join. Require work around

I have two tables. One contains a list of employees and their information
EmployeeID | Name  | Start Date | HoursCF | HoursTaken
-----------+-------+------------+---------+-----------
1          | Conor | 15/10/2018 | 0       | 0
2          | Joe   | 01/05/2018 | 0       | 0
3          | Tom   | 01/01/2019 | 0       | 0
The other contains holiday requests put in by employees:
EmployeeID | HoursTaken
-----------+-----------
1          | 8
2          | 16
3          | 8
2          | 8
1          | 16
I want it so that when a new holiday request is created, deleted, or updated in my holiday request table, it updates my employee table, like so:
EmployeeID | Name  | Start Date | HoursCF | HoursTaken
-----------+-------+------------+---------+-----------
1          | Conor | 15/10/2018 | 0       | 24
2          | Joe   | 01/05/2018 | 0       | 24
3          | Tom   | 01/01/2019 | 0       | 8
I have tried creating a view
CREATE VIEW vw_HoursTakenPerEmployee AS
SELECT e.[EmployeeID],
       COALESCE(SUM(hr.[HoursTaken]), 0) AS HoursTaken
FROM [dbo].[Employees] e
LEFT JOIN [dbo].[HolidayRequests] hr
    ON e.[EmployeeID] = hr.[EmployeeID]
GROUP BY e.[EmployeeID];
And then using a trigger to insert new data entered into the holiday request table into the employee table:
ALTER TRIGGER Inserttrigger ON [dbo].[HolidayRequestForm]
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    TRUNCATE TABLE [dbo].[HoursTakenPerEmployee]
    INSERT INTO [dbo].[HoursTakenPerEmployee] ([EmployeeID], [HoursTaken])
    SELECT * FROM vw_HoursTakenPerEmployee;
END
I know the problem is the truncate statement as it works fine if all the employees already have an entry in the holiday request table. If they don't, they get truncated from the employees table any time a new holiday request is made that does not belong to them.
Any Thoughts?
You've already done all the work:
CREATE VIEW vw_HoursTakenPerEmployee AS
SELECT e.[EmployeeID],
       COALESCE(SUM(hr.[HoursTaken]), 0) AS HoursTaken
FROM [dbo].[Employees] e
LEFT JOIN [dbo].[HolidayRequests] hr
    ON e.[EmployeeID] = hr.[EmployeeID]
GROUP BY e.[EmployeeID];
Drop the HoursTaken column from the Employees table. Any time you want to know how many hours an employee took for holiday, query the view. The view's query is re-run every time the view is queried, so it will always be up to date.
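To illustrate that a view recomputes on every read, here is a small SQLite/Python mock-up of the same view (bracketed names and the dbo schema dropped for SQLite; an illustration, not your exact environment):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE HolidayRequests (EmployeeID INTEGER, HoursTaken INTEGER);
INSERT INTO Employees VALUES (1, 'Conor'), (2, 'Joe'), (3, 'Tom');
INSERT INTO HolidayRequests VALUES (1, 8), (2, 16), (3, 8), (2, 8), (1, 16);

CREATE VIEW vw_HoursTakenPerEmployee AS
SELECT e.EmployeeID, COALESCE(SUM(hr.HoursTaken), 0) AS HoursTaken
FROM Employees e
LEFT JOIN HolidayRequests hr ON e.EmployeeID = hr.EmployeeID
GROUP BY e.EmployeeID;
""")

before = conn.execute(
    "SELECT * FROM vw_HoursTakenPerEmployee ORDER BY EmployeeID").fetchall()
# a new request is reflected the next time the view is read -- no trigger needed
conn.execute("INSERT INTO HolidayRequests VALUES (3, 4)")
after = conn.execute(
    "SELECT * FROM vw_HoursTakenPerEmployee ORDER BY EmployeeID").fetchall()
print(before)  # [(1, 24), (2, 24), (3, 8)]
print(after)   # [(1, 24), (2, 24), (3, 12)]
```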
Bear in mind your system will only work for a year. I recommend adding a year indicator to your HolidayRequests table so you can give employees a new allowance for each fiscal/holiday year they're at the company.
Also, if an employee starts part way through a year, you can have the view calculate how many hours they're entitled to:
CREATE VIEW vw_HoursTakenPerEmployee AS
SELECT
    e.*,
    hr.*,
    DATEDIFF(HOUR, e.HolEntitleFrom, e.HolEntitleTo) / e.HolidayEntitlementHours AS HoursEarnedSoFar,
    COALESCE(hr.HoursTaken, 0) AS HoursTaken
FROM
(
    SELECT *,
        --change the "Start Date" column name so it doesn't have a space in it!
        CASE WHEN [Start Date] < d.HolidayYearStart THEN d.HolidayYearStart ELSE [Start Date] END AS HolEntitleFrom,
        CASE WHEN TerminationDate IS NULL THEN GETDATE() ELSE TerminationDate END AS HolEntitleTo
    FROM
        [dbo].[Employees]
        --useful constants can go here, like when the holiday year starts
        CROSS JOIN
        (SELECT DATEFROMPARTS(YEAR(GETDATE()), 1, 1) AS HolidayYearStart) d
) e
LEFT JOIN
(
    SELECT EmployeeID, HolidayYear, SUM(HoursTaken) AS HoursTaken
    FROM [dbo].[HolidayRequests]
    GROUP BY EmployeeID, HolidayYear
) hr
    ON hr.EmployeeID = e.EmployeeID AND
       hr.HolidayYear = YEAR(e.HolEntitleFrom) --this year's holiday requests
;
as an example.
Put a HolidayYear column in HolidayRequests to track which holiday year the request was made in: INT, with data like 2018, 2019, ...
Put a HolidayEntitlementHours column in Employees to track how many holiday hours the employee gets this year (most places have a system where the amount of holiday increases with each full year of service). Make it a float/decimal type, or have the view calculate it from a basic allowance plus a time-in-service calculation.
Put a column for the TerminationDate of an employee, so you can get an indication of whether they overspent their holidays (if your workplace allows taking more hours than have been earned so far). Imagine the employee took 20 days in January and then has to work a year to earn it. They hand in one month's notice on 1 June. Setting a termination date of 1 July (about 6 months after the holiday entitlement start) would show them as having earned at most 10 by the time they leave, so they need to pay back 10.
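The pay-back arithmetic in that last paragraph can be sanity-checked outside SQL. A small Python sketch with hypothetical figures (a 20-day annual allowance, linear accrual, entitlement from 1 Jan, termination 1 Jul):

```python
from datetime import date

def days_earned(entitle_from, entitle_to, annual_days=20.0):
    """Pro-rated holiday earned between two dates, assuming linear accrual."""
    return round(annual_days * (entitle_to - entitle_from).days / 365.0, 1)

earned = days_earned(date(2019, 1, 1), date(2019, 7, 1))
taken = 20.0  # the employee took the full allowance in January
print(earned, taken - earned)  # roughly 10 earned, roughly 10 to pay back
```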

Select PERIOD_BEGIN and PERIOD_END dates from historical data containing only timestamp in Oracle SQL

I've run into a bit of a problem. Background: I work as a business controller in a financial institution that offers wealth management services, and it falls to me to do internal reporting on euros coming and going. As this is one of the KPIs used to evaluate managers' performance, I need to be able to report these numbers per manager. This bit is straightforward, as each customer has a manager assigned to it. Now here's the fun thing: some questionable DW design choices were made in the past, and the table containing the manager/customer relationship lacks all the relevant temporal information such as 'valid from' or 'valid until'. Basically it just stores the current state. Occasionally customers and portfolios get reassigned to other managers, and this causes all the transactions done during the old manager's reign to show up as belonging to the new manager.
E.g. manager Joe manages a customer called Blammo Ltd between January and March, and the customer subscribes funds worth $10 million. Joe leaves the company and the customer gets assigned to manager Helen. During April the customer withdraws 5 million. When I compile my reports at the end of April, Joe's KPI reads ±0 while Helen's shows +5 million, when in truth it should show that Joe brought in 10 million and Helen lost 5.
We do have an audit table that contains all the rows from the table containing the manager/customer relationships and each row has a timestamp when it was created. What I hope to achieve is to build a view that uses these timestamps to build a table that has a VALID_FROM and VALID_UNTIL dates so I can easily assign transactions to specific managers by joining the transaction between the VALID dates.
So basically what I have is...
CUSTOMERID   MANAGERID   TIMESTAMP
------------ ----------- ------------
1            A           01-01-2018
1            B           28-02-2018
1            A           31-05-2018
1            C           31-08-2018
And what I need is...
CUSTOMERID   MANAGERID   VALID_FROM   VALID_UNTIL
------------ ----------- ------------ -------------
1            A           01-01-2018   28-02-2018
1            B           28-02-2018   31-05-2018
1            A           31-05-2018   31-08-2018
1            C           31-08-2018
What I've tried is
SELECT
CUSTOMERID,
MANAGERID,
MIN(TIMESTAMP) AS VALID_FROM,
MAX(TIMESTAMP) AS VALID_UNTIL
FROM CUSMAN.CUS_MAN_AUDIT
GROUP BY
CUSTOMERID,
MANAGERID
and this would work in a case where customers are never reassigned back to a previous manager. However, due to maternity leaves etc., customers do get assigned back and forth between managers, so the solution above won't produce the correct result: joining a transaction made by customer 1 on 30-04-2018 to the customer/manager relationship data would produce two results, both managers A and B. Below is the table the query above would produce.
CUSTOMERID   MANAGERID   VALID_FROM   VALID_UNTIL
------------ ----------- ------------ -------------
1            A           01-01-2018   31-08-2018
1            B           28-02-2018   31-05-2018
1            C           31-08-2018
It feels like there's a simple way to do this but I'm stumped. Any ideas?
EDIT
Bloody 'ell, I forgot to mention that the table CUS_MAN_AUDIT also contains plenty of other columns, such as customer name, legal form etc., and now Caius's answer returns the result set shown below (CUSTOMERNAME included for the sake of clarity; it's not in the actual result set):
+------------+-----------+------------+-------------+--------------+
| CUSTOMERID | MANAGERID | VALID_FROM | VALID_UNTIL | CUSTOMERNAME |
+------------+-----------+------------+-------------+--------------+
| 1 | A | 01-01-2018 | 02-01-2018 | Blam-O Litnd |
| 1 | A | 02-01-2018 | 15-01-2018 | Blamo Litd |
| 1 | A | 15-01-2018 | 28-02-2018 | Blammo Ltd |
+------------+-----------+------------+-------------+--------------+
while it should return (or at least I'd like it to):
+------------+-----------+------------+-------------+
| CUSTOMERID | MANAGERID | VALID_FROM | VALID_UNTIL |
+------------+-----------+------------+-------------+
| 1 | A | 01-01-2018 | 28-02-2018 |
+------------+-----------+------------+-------------+
And I can't remember how I formatted my tables in the original post, sorry...
You can do it with a window function that gets the LEAD (next) value of the date, per customer, ordered by the timestamp
SELECT
    CUSTOMERID,
    MANAGERID,
    TIMESTAMP AS VALID_FROM,
    LEAD(TIMESTAMP) OVER (PARTITION BY CUSTOMERID ORDER BY TIMESTAMP) AS VALID_TIL
FROM CUSMAN.CUS_MAN_AUDIT
If it aids your understanding it's functionally similar to this:
SELECT
    cur.CUSTOMERID,
    cur.MANAGERID,
    cur.TIMESTAMP AS VALID_FROM,
    MIN(nxt.TIMESTAMP) AS VALID_TIL
FROM
    CUSMAN.CUS_MAN_AUDIT cur
LEFT OUTER JOIN
    CUSMAN.CUS_MAN_AUDIT nxt
ON
    cur.CUSTOMERID = nxt.CUSTOMERID AND
    cur.TIMESTAMP < nxt.TIMESTAMP
GROUP BY
    cur.CUSTOMERID,
    cur.MANAGERID,
    cur.TIMESTAMP
It joins the table back to itself on the same customer, associating each cur record with every record that has a later date (nxt), and then takes the MIN of those later dates.
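A runnable illustration of the LEAD approach, mocked up on SQLite via Python (the TIMESTAMP column is renamed ts here since TIMESTAMP is a keyword in some databases, and only the question's sample rows are loaded):

```python
import sqlite3  # SQLite >= 3.25 supports window functions such as LEAD

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cus_man_audit (customerid INTEGER, managerid TEXT, ts TEXT);
INSERT INTO cus_man_audit VALUES
  (1, 'A', '2018-01-01'), (1, 'B', '2018-02-28'),
  (1, 'A', '2018-05-31'), (1, 'C', '2018-08-31');
""")

rows = conn.execute("""
SELECT customerid, managerid,
       ts AS valid_from,
       -- next row's timestamp for the same customer = end of this row's validity
       LEAD(ts) OVER (PARTITION BY customerid ORDER BY ts) AS valid_until
FROM cus_man_audit
ORDER BY customerid, ts
""").fetchall()
for r in rows:
    print(r)
```

The last row per customer gets a NULL valid_until, which is exactly the "still current" marker the question asks for.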

SQL duration between dates for different persons

Hopefully someone can help me with the following task:
I have got two tables, 'Treatment' and 'Person'. Treatment contains the dates when treatments for the different persons were started; Person contains personal information, e.g. last name.
Now I have to find all persons where the duration between the first and last treatment is over 20 years.
The Tables look something like this:
Person
| PK_Person | First name | Name  |
---------------------------------
| 1         | A_Test     | Karl  |
| 2         | B_Test     | Marie |
| 3         | C_Test     | Steve |
| 4         | D_Test     | Jack  |
Treatment
| PK_Treatment | Description | Starting time | PK_Person |
----------------------------------------------------------
| 1            | A           | 01.01.1989    | 1         |
| 2            | B           | 02.11.2001    | 1         |
| 3            | A           | 05.01.2004    | 1         |
| 4            | C           | 01.09.2013    | 1         |
| 5            | B           | 01.01.1999    | 2         |
So in this example, the output should be person Karl, A_Test.
Hopefully it's understandable what the problem is and someone can help me.
Edit: There seems to be a problem with the formatting; the tables are not displayed correctly. I hope it's readable.
SELECT *
FROM person p
INNER JOIN Treatment t on t.PK_Person = p.PK_Person
WHERE DATEDIFF(year,[TREATMENT_DATE_1], [TREATMENT_DATE_2]) > 20
This should do it; it is however untested, so it will need tweaking to your schema.
Your data looks a bit suspicious, because the first name doesn't look like a first name.
But, what you want to do is aggregate the Treatment table for each person and get the minimum and maximum starting times. When the difference is greater than 20 years, then keep the person, and join back to the person table to get the names.
select p.FirstName, p.LastName
from Person p join
(select pk_person, MIN(StartingTime) as minst, MAX(StartingTime) as maxst
from Treatment t
group by pk_person
having MAX(StartingTime) - MIN(StartingTime) > 20*365.25
) t
on p.pk_person = t.pk_person;
Note that date arithmetic does vary between databases. In most databases, taking the difference of two dates counts the number of days between them, so this is a pretty general approach (although not guaranteed to work on all databases).
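A runnable version of that aggregate-then-join idea, mocked up on SQLite via Python (julianday() stands in for the date arithmetic, which, as noted, varies by database; only two of the sample persons are loaded):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (PK_Person INTEGER PRIMARY KEY, FirstName TEXT, Name TEXT);
CREATE TABLE Treatment (PK_Treatment INTEGER PRIMARY KEY, Description TEXT,
                        StartingTime TEXT, PK_Person INTEGER);
INSERT INTO Person VALUES (1, 'A_Test', 'Karl'), (2, 'B_Test', 'Marie');
INSERT INTO Treatment VALUES
  (1, 'A', '1989-01-01', 1), (2, 'B', '2001-11-02', 1),
  (3, 'A', '2004-01-05', 1), (4, 'C', '2013-09-01', 1),
  (5, 'B', '1999-01-01', 2);
""")

rows = conn.execute("""
SELECT p.FirstName, p.Name
FROM Person p
JOIN (SELECT PK_Person
      FROM Treatment
      GROUP BY PK_Person
      -- julianday() turns dates into day numbers, so the span is in days
      HAVING julianday(MAX(StartingTime)) - julianday(MIN(StartingTime)) > 20 * 365.25
     ) t ON t.PK_Person = p.PK_Person
""").fetchall()
print(rows)  # [('A_Test', 'Karl')]
```

Karl's first and last treatments span about 24.7 years, so only he clears the 20-year threshold.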
I've taken a slightly different approach and worked with SQL Fiddle to verify that the below statements work.
As mentioned previously, the data does seem a bit suspicious; nonetheless per your requirements, you would be able to do the following:
select P.PK_Person, p.FirstName, p.Name
from person P
inner join treatment T on T.pk_person = P.pk_person
where DATEDIFF((select x.startingtime from treatment x where x.pk_person = p.pk_person order by startingtime desc limit 1), T.StartingTime) > 7305
First, we inner join Treatment, which ignores any persons who are not in the treatment table. The WHERE portion then just selects based on your criteria (in this case, a difference of dates). The subquery finds the last date a person was treated; we compare that to each of your records and filter by number of days (7305 = 20 years * 365.25).
Here is the working SQL Fiddle sample.

Find last (first) instance in table but exclude most recent (oldest) date

I have a table that reflects a monthly census of a certain population. Each month on an unpredictable day early in that month, the population is polled. Any member who existed at that point is included in that month's poll, any member who didn't is not.
My task is to look through an arbitrary date range and determine which members were added or lost during that time period. Consider the sample table:
ID | Date
2 | 1/3/2010
3 | 1/3/2010
1 | 2/5/2010
2 | 2/5/2010
3 | 2/5/2010
1 | 3/3/2010
3 | 3/3/2010
In this case, member with ID "1" was added between Jan and Feb, and member with ID 2 was lost between Feb and Mar.
The problem I am having is that if I just poll to try to find the most recent entry, I will capture all the members that were dropped, but also all the members that still exist on the last date. For example, I could run this query:
SELECT
ID,
Max(Date)
FROM
tableName
WHERE
Date BETWEEN '1/1/2010' AND '3/27/2010'
GROUP BY
ID
This would return:
ID | Date
1 | 3/3/2010
2 | 2/5/2010
3 | 3/3/2010
What I actually want, however, is just:
ID | Date
2 | 2/5/2010
Of course I can manually filter out the last date, but since the start and end dates are parameters, I want to generalize that. One way would be to run sequential queries: in the first query I'd find the last date, then use that to filter in the second. It would really help, however, if I could wrap this logic into a single query.
I'm also having a related problem when I try to find when a member was first added to the population. In that case I'm using a different type of query:
SELECT
ID,
Date
FROM
tableName i
WHERE
Date BETWEEN '1/1/2010' AND '3/27/2010'
AND
NOT EXISTS(
SELECT
ID,
Date
FROM
tableName ii
WHERE
ii.ID=i.ID
AND
ii.Date < i.Date
AND
Date BETWEEN '1/1/2010' AND '3/27/2010'
)
This returns:
ID | Date
1 | 2/5/2010
2 | 1/3/2010
3 | 1/3/2010
But what I want is:
ID | Date
1 | 2/5/2010
I would like to know:
1. Which approach (the MAX() or the subquery with NOT EXISTS) is more efficient and
2. How to fix the queries so that they only return the rows I want, excluding the first (last) date.
Thanks!
You could do something like this:
SELECT
ID,
Max(Date)
FROM
tableName
WHERE
Date BETWEEN '1/1/2010' AND '3/27/2010'
GROUP BY
ID
having max(date) < '3/1/2010'
This filters out anyone still being polled in March, leaving only the members who were lost.
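The hard-coded '3/1/2010' can be derived from the same date-range parameters with a subquery, which generalizes the idea. A SQLite/Python sketch using the question's sample rows (ISO dates, and a hypothetical PollDate column name); the mirrored MIN form handles the "members added" half of the task:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE census (ID INTEGER, PollDate TEXT);
INSERT INTO census VALUES
  (2, '2010-01-03'), (3, '2010-01-03'),
  (1, '2010-02-05'), (2, '2010-02-05'), (3, '2010-02-05'),
  (1, '2010-03-03'), (3, '2010-03-03');
""")

lo, hi = '2010-01-01', '2010-03-27'   # the report's date-range parameters

# Members lost: last seen strictly before the latest poll in the range
lost = conn.execute("""
SELECT ID, MAX(PollDate)
FROM census
WHERE PollDate BETWEEN ? AND ?
GROUP BY ID
HAVING MAX(PollDate) < (SELECT MAX(PollDate) FROM census
                        WHERE PollDate BETWEEN ? AND ?)
""", (lo, hi, lo, hi)).fetchall()

# Members added: first seen strictly after the earliest poll in the range
added = conn.execute("""
SELECT ID, MIN(PollDate)
FROM census
WHERE PollDate BETWEEN ? AND ?
GROUP BY ID
HAVING MIN(PollDate) > (SELECT MIN(PollDate) FROM census
                        WHERE PollDate BETWEEN ? AND ?)
""", (lo, hi, lo, hi)).fetchall()
print(lost, added)  # [(2, '2010-02-05')] [(1, '2010-02-05')]
```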

Best way to join the two tables *including* duplicates from one table

Accounts (table)
+----+----------+----------+-------+
| id | account# | supplier | RepID |
+----+----------+----------+-------+
| 1 | 123xyz | Boston | 2 |
| 2 | 245xyz | Chicago | 2 |
| 3 | 425xyz | Chicago | 3 |
+----+----------+----------+-------+
PayOut (table)
+----+----------+----------+-------------+--------+
| id | account# | supplier | datecreated | Amount |
+----+----------+----------+-------------+--------+
| 5 | 245xyz | Chicago | 01-15-2009 | 25 |
| 6 | 123xyz | Boston | 10-15-2011 | 50 |
| 7 | 123xyz | Boston | 10-15-2011 | -50 |
| 8 | 123xyz | Boston | 10-15-2011 | 50 |
| 9 | 425xyz | Chicago | 10-15-2011 | 100 |
+----+----------+----------+-------------+--------+
I have an Accounts table and a PayOut table. The payout table comes from abroad, so we have no control over it. This leaves us unable to join the two tables on the record ID field, which is one problem we can't solve. We therefore join on Account# and Supplier (the 2nd and 3rd columns). This creates a (possibly) many-to-many relationship. But we filter our records to active ones, and we use a second filter on the payout table by when the payout was created. Payouts are created month to month. There are two problems with this in my view:
The query takes quite a bit of time to complete (could be inefficient)
There are certain duplicates that are removed which should not be removed. An example is records 6 and 8 in the payout table. What happened here is: we got a customer, then the customer cancelled, then we got them back. In this case +50, -50, and +50. All the values are valid and must show in the report for audit purposes, but currently only one +50 is shown; the other is lost. A couple of other problems crop up in the report once in a while.
Here is the query. It uses GROUP BY to remove duplicates. I would like a better query that outperforms it and that does not treat any PayOut record as a duplicate, as long as it falls in the month of the report.
Here is our current query
/* Supplied to stored procedure */
-----------------------------------
#RepID  /* the person for whom payout is calculated */
#Month  /* month of payment date */
#Year   /* year of payment date */
-----------------------------------
select distinct
    A.col1,
    A.col2,
    ...
    A.col10,
    B.col1,
    B.col2,
    B.Amount /* this is the important column, a portion of which goes to the Rep */
from records A
join payout B
    on A.Supplier = B.Supplier and A.Account# = B.Account#
where datepart(mm, B.datecreated) = #Month /* parameter to stored procedure */
  and datepart(yyyy, B.datecreated) = #Year
  and A.[rep ID] = #RepID /* parameter to SP */
group by
    col1, col2, col3, ..., col10
order by customerName
Is this query optimal? Can I improve it using CROSS APPLY or WHERE EXISTS to make it faster and remove the duplicate problem?
Note that this query is used to get the payout of a rep; hence every record has a RepID field saying who it is assigned to. Ideally I would like to use a WHERE EXISTS query.
It's difficult to understand exactly what you want because in one place you say you 'want' the duplicates but then you say that you are using the group by to remove duplicates. So the first thought would be "Why not just get rid of the group by?". But I have to believe you are smart enough to have thought of that yourself, so I assume it's got to be there for a reason.
I think someone here could help you pretty easily if you could post the actual query, but since you say you can't I will just try to give you some direction in solving the problem...
Instead of trying to do everything in one statement, use temporary tables or views to split it up. It may be easier for you to think about how to get rid of the duplicates you don't want and keep the ones you do first and put those into a temporary table, and then join the tables together and work with that.
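One concrete way to apply that advice, sketched on SQLite via Python with the question's sample tables (account# renamed to account, since # isn't a valid identifier character here; an illustration, not the tuned production query): de-duplicate the Accounts side first in a derived table (or temp table), then join once, carrying PayOut's own id through the join so the legitimate +50/-50/+50 rows all survive:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Accounts (id INTEGER, account TEXT, supplier TEXT, RepID INTEGER);
CREATE TABLE PayOut (id INTEGER, account TEXT, supplier TEXT,
                     datecreated TEXT, Amount INTEGER);
INSERT INTO Accounts VALUES (1, '123xyz', 'Boston', 2), (2, '245xyz', 'Chicago', 2),
                            (3, '425xyz', 'Chicago', 3);
INSERT INTO PayOut VALUES
  (5, '245xyz', 'Chicago', '2009-01-15',  25),
  (6, '123xyz', 'Boston',  '2011-10-15',  50),
  (7, '123xyz', 'Boston',  '2011-10-15', -50),
  (8, '123xyz', 'Boston',  '2011-10-15',  50),
  (9, '425xyz', 'Chicago', '2011-10-15', 100);
""")

rows = conn.execute("""
SELECT b.id, a.account, a.supplier, b.Amount
FROM (SELECT DISTINCT account, supplier, RepID
      FROM Accounts) a                  -- one row per account/supplier
JOIN PayOut b
  ON a.supplier = b.supplier AND a.account = b.account
WHERE a.RepID = 2
  AND strftime('%Y-%m', b.datecreated) = '2011-10'   -- the report month
ORDER BY b.id
""").fetchall()
for r in rows:
    print(r)
```

Because each payout row keeps its own id in the result, no GROUP BY or DISTINCT is needed on the payout side, so identical amounts in the same month are no longer collapsed.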