Oracle SQL: Assigning employees to groups with date values, querying current assignments by date

I have a database which consists of employees (one table) which can be assigned to groups (another table). Both are joined together by a third table, employee-to-group, which lists the group id, the employee id and the start date of the assignment.
An employee always has to be assigned to a group, but the assignments can change daily. One employee could be working in group A for a day, then change to group B and work in group C only a week later.
My task is to find out which employees are assigned to a certain group given by its name at any given date. So the input should be: group name, date and I want the output to be the data of all the employees which are part of that group at the given moment in time.
Here's an SQL fiddle with some test data:
http://sqlfiddle.com/#!9/6d0bb
I recreated the database with MySQL statements because I couldn't figure out the Oracle statements, I'm sorry.
As you can see from the test data, some employees may never change groups, while others change frequently. There are also employees who are planned to change assignments in the future. The query has to account for that.
Because the application is a legacy one, the values (especially in the date field) are questionable. They are given as "days since the 1st of January, 1990", so the entry "9131" means "1st of January, 2015". 9468 would be today (2015-12-04) and 9496 would be 2016-01-01.
What I already have is code to find out the "date value" for any given date in what I call the "legacy format" of the application I'm working with (here I've just used CURRENT_DATE):
SELECT FLOOR(CURRENT_DATE - TO_DATE('1990-01-01', 'YYYY-MM-DD')) AS diffdate
FROM dual;
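The same arithmetic can be checked outside the database. Below is a minimal Python sketch, assuming the legacy value is simply the number of whole days since 1990-01-01 as described above. The function names `to_legacy_day` and `from_legacy_day` are made up for illustration and are not part of the application.

```python
from datetime import date, timedelta

# Day 0 of the legacy format, as described in the question.
LEGACY_EPOCH = date(1990, 1, 1)


def to_legacy_day(d: date) -> int:
    """Return the number of whole days between 1990-01-01 and d."""
    return (d - LEGACY_EPOCH).days


def from_legacy_day(n: int) -> date:
    """Return the calendar date for a legacy day number."""
    return LEGACY_EPOCH + timedelta(days=n)
```

For the values quoted above, `to_legacy_day(date(2015, 1, 1))` gives 9131 and `from_legacy_day(9496)` gives 2016-01-01.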
For finding out which group a certain employee is assigned to, I tried:
SELECT * FROM history h
WHERE emp_nr = 1 AND valid_from <= 9131
ORDER BY valid_from DESC
FETCH FIRST ROW ONLY;
which should return the group an employee is assigned to on the 1st of January 2015.
What I need help with is creating a statement that joins all tables and does the same for a whole group instead of only one employee (as there are thousands of employees in the database and I only want the data of at most 10 groups).
I'm thankful for any kind of pointers in the right direction.

Use row_number to rank your history and get the latest group, just as you did with your FETCH FIRST query:
select *
from
(
select
h.*,
row_number() over (partition by emp_nr order by valid_from desc) as rn
from history h
where valid_from <= 9131
)
where rn = 1
You can then join this result with other tables.
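As a rough illustration of that join, here is a sketch run against an in-memory SQLite database from Python. The table and column names (`employee`, `grp`, `group_id`, `group_name`) and the sample rows are assumptions made up for this example, since the real schema is only in the fiddle. Only `history`, `emp_nr` and `valid_from` come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema; only history/emp_nr/valid_from appear in the question.
    CREATE TABLE employee (emp_nr INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE grp (group_id INTEGER PRIMARY KEY, group_name TEXT);
    CREATE TABLE history (emp_nr INTEGER, group_id INTEGER, valid_from INTEGER);

    INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO grp VALUES (10, 'A'), (20, 'B');
    INSERT INTO history VALUES
        (1, 10, 9000),                 -- Ann never changes group
        (2, 10, 9000), (2, 20, 9100),  -- Bob moves from A to B
        (3, 20, 9000), (3, 10, 9200);  -- Cid is planned to move to A later
""")

QUERY = """
    SELECT e.emp_nr, e.name
    FROM (
        SELECT h.*,
               row_number() OVER (PARTITION BY emp_nr
                                  ORDER BY valid_from DESC) AS rn
        FROM history h
        WHERE valid_from <= :day          -- ignore assignments in the future
    ) latest
    JOIN employee e ON e.emp_nr = latest.emp_nr
    JOIN grp g ON g.group_id = latest.group_id
    WHERE latest.rn = 1                   -- the assignment in force on :day
      AND g.group_name = :group_name
    ORDER BY e.emp_nr
"""


def members(group_name, day):
    """Employees whose latest assignment on or before `day` is `group_name`."""
    params = {"group_name": group_name, "day": day}
    return conn.execute(QUERY, params).fetchall()
```

Note that the group filter sits outside the ranked subquery. Filtering by group inside it would rank only the rows for that group, so an employee who has since moved elsewhere would still be reported.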

Related

SQL Oracle: How to show only one row when the columns diverge

I have an employees table in which most of the results show only one row per employee.
However, I have to report the number of employees by area, and 3 of the 3432 employees have previously worked in a different area.
Therefore, the results show duplicated rows for these 3 employees. It's something like this:
Notice that in Brian's case he was admitted to a different area before.
How can I show Brian only once? More specifically, how can I show only the most recent area he has worked in?
You can use ROW_NUMBER() to number the rows for each employee from newest to oldest, ordered by admission date.
Then filtering out old rows is easy. For example:
select *
from (
select t.*,
row_number() over(partition by employee order by admission desc) as rn
from t
) x
where rn = 1 -- keeps the latest row only, per employee

Query monitoring changes in the field

I need to program a query where I can see the changes that certain fields have undergone in a certain date period.
Example: From the CAM_CONCEN table, return those records where the CONTACT field of an ACCOUNT_NUMBER was modified in the 6 months before a given date.
I would be grateful if you can guide me.
You can use LAG() to peek at the previous row of a particular subset of rows (the same account in this case).
For example:
select *
from (
select c.*,
lag(contact) over(partition by account_number
order by change_date) as prev_contact
from cam_concen c
) x
where contact <> prev_contact
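To restrict the changes to a six-month period, one option is to add the date filter in the outer query, so that LAG() can still see the row just before the period. The sketch below runs this against SQLite from Python. The sample rows are invented, `change_date` is the column name assumed by the query above, and `date(:as_of, '-6 months')` is SQLite syntax (in Oracle, `ADD_MONTHS` would play the same role).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cam_concen (account_number INTEGER, contact TEXT, change_date TEXT);
    INSERT INTO cam_concen VALUES
        (1, 'Ann Lee', '2023-01-10'),
        (1, 'Ann Ray', '2023-09-05'),   -- contact changed inside the period
        (2, 'Bob Low', '2023-02-01'),
        (2, 'Bob Low', '2023-10-01'),   -- new row, but same contact
        (3, 'Cy Old',  '2022-01-01'),
        (3, 'Cy New',  '2022-06-01');   -- changed, but long before the period
""")

QUERY = """
    SELECT account_number, prev_contact, contact, change_date
    FROM (
        SELECT c.*,
               lag(contact) OVER (PARTITION BY account_number
                                  ORDER BY change_date) AS prev_contact
        FROM cam_concen c
    ) x
    WHERE contact <> prev_contact                      -- value actually changed
      AND change_date >= date(:as_of, '-6 months')     -- period filter applied
      AND change_date <= :as_of                        -- after LAG, not before
    ORDER BY account_number, change_date
"""


def changes(as_of):
    """Contact changes in the six months up to and including `as_of`."""
    return conn.execute(QUERY, {"as_of": as_of}).fetchall()
```

The first row of each account has no previous row, so `prev_contact` is NULL and the `<>` comparison leaves it out.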

SQL group by not returning row value for an aggregate column

I was using an SQL statement to get an aggregate (MAX) for one column, with the rest of the columns coming from that same row. I was using a GROUP BY clause, but for the other columns I must then also use either MAX or MIN, etc. This was a budget-constrained project, so I did not have time to do it using LINQ (where I could have used FirstOrDefault). Anyway, I believe this is a serious limitation of the SQL language.
Again, this could have been done in many ways, but not with a simple SQL GROUP BY.
Any ideas?
Your question is a bit light on details, but it sounds like you want to know, for some set of items, which item has the maximum of something and then what its other properties are.
You cannot group by all the non-max columns, because this breaks the group down into chunks too small for the max to work.
You cannot max all the other columns, because this mixes row data up.
Here is a simple example:
Name, JobRole, StartDate
John, JuniorProgrammer, 2000-01-01
John, SeniorProgrammer, 2010-01-01
John was promoted to senior programmer in 2010. We want John's most recent promotion and what he does now. If we do this:
SELECT name, jobrole, max(startdate)
FROM emp
GROUP BY name
The database will complain that jobrole is not in the group by. If we add it to the group by, John will appear twice, which is not what we want. If instead we max(jobrole), it DOES accidentally work out OK because alphabetically, SeniorProgrammer is higher than JuniorProgrammer.
If however, John then gets a promotion again in 2019:
Name, JobRole, StartDate
John, JuniorProgrammer, 2000-01-01
John, SeniorProgrammer, 2010-01-01
John, ExecutiveDirector, 2019-01-01
This time our query is wrong:
SELECT name, max(jobrole), max(startdate)
FROM emp
GROUP BY name
The row data will be mixed up: the date will be 2019 but the job will still be SeniorProgrammer because it's alphabetically the maximum value.
Instead we have to find the max for the person and then join it back to find the rest of the data:
SELECT emp.name, emp.jobrole, emp.startdate -- qualified: name exists in both emp and findmax
FROM
emp
INNER JOIN
(
SELECT name, max(startdate) d
FROM emp
GROUP BY name
)findmax
ON findmax.d = emp.startdate and findmax.name = emp.name
There are other ways of achieving the same thing without a join. The join method also has an issue: if an employee was promoted twice on the same day, two records would result. In a DB that supports analytic functions we can do:
SELECT name, jobrole, row_number() over (partition by name order by startdate desc)
FROM emp
This establishes an incrementing counter in order of descending start date. The counter restarts from 1 for every different employee. There is no group by, so there are no complaints that the extra data isn't grouped or in an aggregate function. All we need to do to choose the most recent promotion is wrap the whole thing in a select that demands the row number be 1:
SELECT * FROM
(
SELECT name, jobrole, row_number() over (partition by name order by startdate desc) r
FROM emp
) emp_with_rownum
WHERE r = 1
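To see the difference concretely, here is a small sketch that runs both queries against the three-row John example in an in-memory SQLite database from Python. SQLite is used here only for convenience, and the `startdate` column is added to the ranked query's output for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (name TEXT, jobrole TEXT, startdate TEXT);
    INSERT INTO emp VALUES
        ('John', 'JuniorProgrammer',  '2000-01-01'),
        ('John', 'SeniorProgrammer',  '2010-01-01'),
        ('John', 'ExecutiveDirector', '2019-01-01');
""")

# Aggregating every column independently mixes values from different rows.
mixed = conn.execute("""
    SELECT name, max(jobrole), max(startdate)
    FROM emp
    GROUP BY name
""").fetchall()

# Ranking rows per employee keeps each row intact.
latest = conn.execute("""
    SELECT name, jobrole, startdate
    FROM (
        SELECT name, jobrole, startdate,
               row_number() OVER (PARTITION BY name
                                  ORDER BY startdate DESC) AS r
        FROM emp
    ) emp_with_rownum
    WHERE r = 1
""").fetchall()
```

Here `mixed` pairs SeniorProgrammer with 2019-01-01, a combination that exists in no row of the table, while `latest` returns the ExecutiveDirector row as stored.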
You don't want a group by. You seem to want a window function:
select t.*, max(col) over () as overall_max
from t;

Suppress Nonadjacent Duplicates in Report

Medical records in my Crystal Report are sorted in this order:
...
Group 1: Score [Level of Risk]
Group 2: Patient Name
...
Because patients are sorted by Score before Name, the report pulls in multiple entries per patient with varying scores - and since duplicate entries are not always adjacent, I can't use Previous or Next to suppress them. To fix this, I'd like to only display the latest entry for each patient based on the Assessment Date field - while maintaining the above order.
I'm convinced this behavior can be implemented with a custom SQL command to only pull in the latest entry per patient, but have had no success creating that behavior myself. How can I accomplish this compound sort?
Current SQL Statement in use:
SELECT "EpisodeSummary"."PatientID",
"EpisodeSummary"."Patient_Name",
"EpisodeSummary"."Program_Value",
"RiskRating"."Rating_Period",
"RiskRating"."Assessment_Date",
"RiskRating"."Episode_Number",
"RiskRating"."PatientID",
"Facility"."Provider_Name"
FROM (
"SYSTEM"."EpisodeSummary"
"EpisodeSummary"
LEFT OUTER JOIN "FOOBARSYSTEM"."RiskAssessment" "RiskRating"
ON (
("EpisodeSummary"."Episode_Number"="RiskRating"."Episode_Number")
AND
("EpisodeSummary"."FacilityID"="RiskRating"."FacilityID")
)
AND
("EpisodeSummary"."PatientID"="RiskRating"."PatientID")
), "SYSTEM"."Facility" "Facility"
WHERE (
"EpisodeSummary"."FacilityID"="Facility"."FacilityID"
)
AND "RiskRating"."PatientID" IS NOT NULL
ORDER BY "EpisodeSummary"."Program_Value"
The SQL code below may not be exactly correct, depending on the structure of your tables. It assumes the duplicate risk scores come from the RiskAssessment table. If this is not correct, the code may need to be altered.
Essentially, we create a derived table and generate a row_number for each record, partitioned by PatientID and ordered by assessment date, so the most recent date gets the lowest number (1). Then, in the join, we restrict the result set to record #1 only (each patient has its own rank #1).
If this doesn't work, let me know and provide some table details. Should the Facility table be the starting point? Are there multiple entries in EpisodeSummary per patient? Thanks!
SELECT es.PatientID
,es.Patient_Name
,es.Program_Value
,rrd.Rating_Period
,rrd.Assessment_Date
,rrd.Episode_Number
,rrd.PatientID
,f.Provider_Name
FROM SYSTEM.EpisodeSummary es
LEFT JOIN (
-- Derived table ranking each patient's risk assessments, most recent first
SELECT PatientID
,Assessment_Date
,Episode_Number
,FacilityID
,Rating_Period
,ROW_NUMBER() OVER (
PARTITION BY PatientID ORDER BY Assessment_Date DESC
) AS RN -- This code generates a row number for each record. The count is restarted for every patientID and the count starts at the most recent date.
FROM RiskAssessment
) rrd
ON es.patientID = rrd.patientid
AND es.episode_number = rrd.episode_number
AND es.facilityid = rrd.facilityid
AND rrd.RN = 1 --This only retrieves one record per patient (the most recent date) from the riskassessment table
INNER JOIN SYSTEM.Facility f
ON es.facilityid = f.facilityid
WHERE rrd.PatientID IS NOT NULL
ORDER BY es.Program_Value

SQL statement - how to build a timeline graph

Hi, I have a table with the following columns:
ID
student_id
score (int)
scanned_date
close_date.
A script is run every week and collects a score for each student. Usually the scores remain the same from week to week. When the script is run for the first time, it enters ID, student_id, score, scanned_date and NULL for close_date for each student.
For each additional scan, if the score is the same as last week's score, the script does nothing.
But if a new score is found, it enters the date in the close_date field of the old row and inserts a new row containing id, student_id, score, scanned_date and NULL for close_date.
I'm trying to build an SQL statement which will help me build a timeline graph. For each distinct scanned_date, it should return the sum of all the students' scores so that I can build a graph.
Is that possible to do?
-Maria
The following query extracts all the distinct scanned dates and then joins them back to the table to find which records are active on each date. It then aggregates the results by date:
select dates.scanned_date,
count(t.id) as numids,
sum(score) as sumscore
from (select distinct scanned_date from t) as dates left outer join
t
on dates.scanned_date >= t.scanned_date and
-- a null close_date means the record is still open
(t.close_date is null or dates.scanned_date < t.close_date)
group by dates.scanned_date
order by 1
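A minimal sketch of this query run against SQLite from Python. The sample rows are invented for illustration, and the join condition here treats a NULL close_date as "still open", following the description in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (
        id INTEGER PRIMARY KEY,
        student_id INTEGER,
        score INTEGER,
        scanned_date TEXT,
        close_date TEXT
    );
    INSERT INTO t (student_id, score, scanned_date, close_date) VALUES
        (1, 10, '2024-01-01', '2024-01-15'),  -- student 1, replaced on Jan 15
        (2, 20, '2024-01-01', NULL),          -- student 2, never changed
        (1, 15, '2024-01-15', NULL);          -- student 1, new score
""")

QUERY = """
    SELECT dates.scanned_date,
           count(t.id)  AS numids,
           sum(t.score) AS sumscore
    FROM (SELECT DISTINCT scanned_date FROM t) AS dates
    LEFT OUTER JOIN t
      ON dates.scanned_date >= t.scanned_date
     AND (t.close_date IS NULL OR dates.scanned_date < t.close_date)
    GROUP BY dates.scanned_date
    ORDER BY 1
"""

# One row per scan date: (scanned_date, active records, sum of their scores)
timeline = conn.execute(QUERY).fetchall()
```

With these rows, each scan date has two active records: the scores sum to 30 on 2024-01-01 and to 35 on 2024-01-15, once student 1's new score replaces the old one.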