SQL Oracle: How to show only one row when the columns diverge

I have an employees table in which most employees appear in exactly one row.
However, I need to report the number of employees by area, and 3 of the 3432 employees previously worked in a different area.
As a result, these 3 employees show up in duplicated rows. It's something like this:
Notice that Brian was previously admitted to a different area.
How can I show Brian only once? More specifically, how can I show only the most recent area he has worked in?

You can use ROW_NUMBER() to number the rows of each employee, ordered by admission date from newest to oldest.
Then filtering out the old rows is easy. For example:
select *
from (
  select t.*,
    row_number() over(partition by employee order by admission desc) as rn
  from t
) x
where rn = 1 -- keeps the latest row only, per employee
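To see the filter at work, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module instead of Oracle. The sample rows and the `area` column are assumptions made up for illustration, and the bundled SQLite must be version 3.25 or later for window functions to be available.

```python
import sqlite3

def latest_area_per_employee(rows):
    """Return one (employee, area, admission) row per employee: the latest admission."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical layout of table t; the original post does not show the columns.
    conn.execute("create table t (employee text, area text, admission text)")
    conn.executemany("insert into t values (?, ?, ?)", rows)
    result = conn.execute(
        """
        select employee, area, admission
        from (
          select t.*,
            row_number() over(partition by employee order by admission desc) as rn
          from t
        ) x
        where rn = 1
        order by employee
        """
    ).fetchall()
    conn.close()
    return result

# Made-up data: Brian has an older row in a different area.
sample_rows = [
    ("Anna", "Sales", "2019-03-01"),
    ("Brian", "Finance", "2015-06-15"),
    ("Brian", "IT", "2020-01-10"),
]
latest = latest_area_per_employee(sample_rows)
print(latest)
```

Dates are stored as ISO strings here so that text ordering matches chronological ordering; with a real DATE column in Oracle that workaround is not needed.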

Related

How do I return only the most recent record on a date field split into two

Scenario: Person A takes test B three times in the span of two years, so there will be three entries for that person. However, I need to write a query that tells me the number of persons who have taken a test (counting just one test each, the latest one). The problem is that the date is split into two columns, Test_Month (xx) and Test_Year (xx).
What I need: I need to pull only the test with the most recent test month and year, basically the most recent test each person took. (For example (see pic below), I need the record for 2/20 only.)
I have no idea how to retrieve only one record per person, the last test they took, based on the separate columns Test_Month and Test_Year.
You can use window functions:
select *
from (
  select
    t.*,
    row_number() over(
      partition by last_name, first_name
      order by test_year desc, test_month desc
    ) rn
  from mytable t
) t
where rn = 1
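A small runnable sketch of the same idea, again using Python's sqlite3 (SQLite 3.25 or later assumed) rather than the asker's database. The sample people and the numeric column types are assumptions; the original post does not show the table definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: two-digit year and month stored as integers.
conn.execute(
    "create table mytable (last_name text, first_name text, "
    "test_year integer, test_month integer)"
)
conn.executemany(
    "insert into mytable values (?, ?, ?, ?)",
    [
        ("Doe", "Jane", 18, 11),
        ("Doe", "Jane", 19, 6),
        ("Doe", "Jane", 20, 2),   # latest test for Jane Doe: 2/20
        ("Roe", "Sam", 19, 3),
        ("Roe", "Sam", 19, 12),   # latest test for Sam Roe: 12/19
    ],
)

# Order by year first, then month, so the newest test gets rn = 1.
latest_tests = conn.execute(
    """
    select last_name, first_name, test_year, test_month
    from (
      select t.*,
        row_number() over(
          partition by last_name, first_name
          order by test_year desc, test_month desc
        ) rn
      from mytable t
    ) ranked
    where rn = 1
    order by last_name, first_name
    """
).fetchall()

person_count = len(latest_tests)
print(latest_tests, person_count)
```

If the month and year were stored as text, month '2' would sort after '11', so the ordering only works as intended when the columns are numeric (or zero-padded).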

SQL group by not returning row value for an aggregate column

I was using a SQL statement to get an aggregate (MAX) of one column, with the rest of the columns coming from the same row. I was using a GROUP BY clause, but then I must also wrap the other columns in MAX, MIN, etc. This was a budget-constrained project, so I did not have time to do it with LINQ (where I could have used FirstOrDefault). Anyway, I believe this is a serious limitation of the SQL language.
Again, this could have been done in many ways, but not with a simple SQL GROUP BY.
Any ideas?
Your question is a bit light on details, but it sounds like you want to know, for some set of items, which item has the maximum of something and then what its other properties are.
You cannot group by all the non-max columns, because this breaks the groups down into chunks too small for the max to work.
You cannot max all the other columns, because this mixes row data up.
Here is a simple example:
Name, JobRole, StartDate
John, JuniorProgrammer, 2000-01-01
John, SeniorProgrammer, 2010-01-01
John was promoted to senior programmer in 2010. We want John's most recent promotion and what he does now. If we do this:
SELECT name, jobrole, max(startdate)
FROM emp
GROUP BY name
The database will complain that jobrole is not in the group by. If we add it to the group by, John will appear twice, which is not what we want. If instead we use max(jobrole), it DOES accidentally work out OK, because alphabetically SeniorProgrammer is higher than JuniorProgrammer.
If however, John then gets a promotion again in 2019:
Name, JobRole, StartDate
John, JuniorProgrammer, 2000-01-01
John, SeniorProgrammer, 2010-01-01
John, ExecutiveDirector, 2019-01-01
This time our query is wrong:
SELECT name, max(jobrole), max(startdate)
FROM emp
GROUP BY name
The row data will be mixed up: the date will be 2019, but the job will still be SeniorProgrammer because it's alphabetically the maximum value.
Instead we have to find the max for the person and then join it back to find the rest of the data:
SELECT emp.name, emp.jobrole, emp.startdate
FROM
  emp
  INNER JOIN
  (
    SELECT name, max(startdate) d
    FROM emp
    GROUP BY name
  ) findmax
  ON findmax.d = emp.startdate and findmax.name = emp.name
This method has an issue: if an employee was promoted twice on the same day, two records would result. There are other ways of achieving the same thing without a join. In a DB that supports analytic functions we can do:
SELECT name, jobrole, row_number() over (partition by name order by startdate desc)
FROM emp
This establishes an incrementing counter in order of descending start date. The counter restarts from 1 for every different employee. There is no group by, so there are no complaints that the extra data isn't grouped or inside an aggregate function. All we need to do to choose the most recent promotion is wrap the whole thing in a select that demands the row number be 1:
SELECT * FROM
(
SELECT name, jobrole, row_number() over (partition by name order by startdate desc) r
FROM emp
) emp_with_rownum
WHERE r = 1
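The three approaches above can be compared side by side with a short sketch in Python's sqlite3 (SQLite 3.25 or later assumed). John's rows come from the example; the two "Mary" rows are a made-up tie on the same start date, and the extra `jobrole` tie-breaker in the window ordering is an assumption added so the result is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (name text, jobrole text, startdate text)")
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [
        ("John", "JuniorProgrammer", "2000-01-01"),
        ("John", "SeniorProgrammer", "2010-01-01"),
        ("John", "ExecutiveDirector", "2019-01-01"),
        ("Mary", "Analyst", "2015-05-01"),   # hypothetical: two changes
        ("Mary", "TeamLead", "2015-05-01"),  # on the same day
    ],
)

# 1) max() on every column: values come from different rows.
mixed_max = conn.execute(
    "select name, max(jobrole), max(startdate) from emp "
    "where name = 'John' group by name"
).fetchall()

# 2) Join back to the per-name maximum: correct for John, but the tie
#    on Mary's start date produces two rows for her.
join_back = conn.execute(
    """
    select emp.name, emp.jobrole, emp.startdate
    from emp
    inner join (select name, max(startdate) d from emp group by name) findmax
      on findmax.d = emp.startdate and findmax.name = emp.name
    order by emp.name, emp.jobrole
    """
).fetchall()

# 3) row_number(): exactly one row per name, even with ties.
numbered = conn.execute(
    """
    select name, jobrole, startdate
    from (
      select name, jobrole, startdate,
        row_number() over (partition by name order by startdate desc, jobrole) r
      from emp
    ) emp_with_rownum
    where r = 1
    order by name
    """
).fetchall()

print(mixed_max, join_back, numbered, sep="\n")
```

Which of Mary's two rows should win is a business decision; the sketch simply picks the alphabetically first job role so that the output is repeatable.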
You don't want a group by. You seem to want a window function:
select t.*, max(col) over () as overall_max
from t;

Include all rows in table when value in column changes

I have an employee change table which tracks every change made to an employee's work history, with no clear flag for what that change is. I am trying to track the different departments that an employee has worked for, including the first department he/she worked for. So I need all department changes plus the first department he/she worked in. An employee may come back to a department he/she once worked for, and we need to bring those rows back too. I have highlighted the rows that I would like to bring back.
Emp Change History Table
You seem to just want lag():
select t.*
from (select t.*, lag(dept_no) over (partition by emp_no order by effective_date) as prev_dept_no
from t
) t
where prev_dept_no is null or prev_dept_no <> dept_no
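As a quick check of how lag() behaves here, the sketch below runs the query in Python's sqlite3 (SQLite 3.25 or later assumed). The rows are invented for illustration: one employee who stays in department 10, moves to 20, and later returns to 10, with some non-department changes in between.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed columns, taken from the names used in the query above.
conn.execute("create table t (emp_no integer, dept_no integer, effective_date text)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [
        (1, 10, "2020-01-01"),  # first department: kept (no previous row)
        (1, 10, "2020-06-01"),  # same department: filtered out
        (1, 20, "2021-01-01"),  # moved to 20: kept
        (1, 20, "2021-03-01"),  # same department: filtered out
        (1, 10, "2022-01-01"),  # returned to 10: kept
    ],
)

department_changes = conn.execute(
    """
    select emp_no, dept_no, effective_date
    from (select t.*,
            lag(dept_no) over (partition by emp_no order by effective_date) as prev_dept_no
          from t
         ) changes
    where prev_dept_no is null or prev_dept_no <> dept_no
    order by emp_no, effective_date
    """
).fetchall()

print(department_changes)
```

Because the comparison is only against the immediately preceding row, a return to an earlier department counts as a change, which is what the question asks for.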

Suppress Nonadjacent Duplicates in Report

Medical records in my Crystal Report are sorted in this order:
...
Group 1: Score [Level of Risk]
Group 2: Patient Name
...
Because patients are sorted by Score before Name, the report pulls in multiple entries per patient with varying scores - and since duplicate entries are not always adjacent, I can't use Previous or Next to suppress them. To fix this, I'd like to only display the latest entry for each patient based on the Assessment Date field - while maintaining the above order.
I'm convinced this behavior can be implemented with a custom SQL command to only pull in the latest entry per patient, but have had no success creating that behavior myself. How can I accomplish this compound sort?
Current SQL Statement in use:
SELECT "EpisodeSummary"."PatientID",
"EpisodeSummary"."Patient_Name",
"EpisodeSummary"."Program_Value",
"RiskRating"."Rating_Period",
"RiskRating"."Assessment_Date",
"RiskRating"."Episode_Number",
"RiskRating"."PatientID",
"Facility"."Provider_Name"
FROM (
"SYSTEM"."EpisodeSummary"
"EpisodeSummary"
LEFT OUTER JOIN "FOOBARSYSTEM"."RiskAssessment" "RiskRating"
ON (
("EpisodeSummary"."Episode_Number"="RiskRating"."Episode_Number")
AND
("EpisodeSummary"."FacilityID"="RiskRating"."FacilityID")
)
AND
("EpisodeSummary"."PatientID"="RiskRating"."PatientID")
), "SYSTEM"."Facility" "Facility"
WHERE (
"EpisodeSummary"."FacilityID"="Facility"."FacilityID"
)
AND "RiskRating"."PatientID" IS NOT NULL
ORDER BY "EpisodeSummary"."Program_Value"
The SQL code below may not be exactly correct, depending on the structure of your tables. It assumes the duplicate risk scores are coming from the RiskAssessment table. If this is not correct, the code may need to be altered.
Essentially, we create a derived table and generate a row number for each record, partitioned by PatientID and ordered by the assessment date, so the most recent date gets the lowest number (1). Then, on the join, we restrict the result set to record #1 only (each patient has its own row #1).
If this doesn't work, let me know and provide some table details. Should the Facility table be the starting point? Are there multiple entries in EpisodeSummary per patient? Thanks!
SELECT es.PatientID
,es.Patient_Name
,es.Program_Value
,rrd.Rating_Period
,rrd.Assessment_Date
,rrd.Episode_Number
,rrd.PatientID
,f.Provider_Name
FROM SYSTEM.EpisodeSummary es
LEFT JOIN (
-- Derived table numbering each patient's risk assessments, most recent first
SELECT PatientID
,Assessment_Date
,Episode_Number
,FacilityID
,Rating_Period
,ROW_NUMBER() OVER (
PARTITION BY PatientID ORDER BY Assessment_Date DESC
) AS RN -- This code generates a row number for each record. The count is restarted for every patientID and the count starts at the most recent date.
FROM FOOBARSYSTEM.RiskAssessment
) rrd
ON es.patientID = rrd.patientid
AND es.episode_number = rrd.episode_number
AND es.facilityid = rrd.facilityid
AND rrd.RN = 1 --This only retrieves one record per patient (the most recent date) from the riskassessment table
INNER JOIN SYSTEM.Facility f
ON es.facilityid = f.facilityid
WHERE rrd.PatientID IS NOT NULL
ORDER BY es.Program_Value

OracleSQL: Assigning employees to groups with date values, querying current assignments by date

I have a database which consists of employees (one table) which can be assigned to groups (another table). Both are joined together by another table, employee-to-group, which lists the group id, the employee id and the start date of the assignment.
An employee always has to be assigned to a group, but the assignments can change daily. One employee could be working in group A for a day, then change to group B and work in group C only a week later.
My task is to find out which employees are assigned to a certain group given by its name at any given date. So the input should be: group name, date and I want the output to be the data of all the employees which are part of that group at the given moment in time.
Here's an SQL fiddle with some test data:
http://sqlfiddle.com/#!9/6d0bb
I recreated the database with MySQL statements because I couldn't figure out the Oracle statements, I'm sorry.
As you can see from the test data, some employees may never change groups, while others change frequently. There are also employees who are planned to change assignments in the future. The query has to account for that.
Because the application is a legacy one, the values (especially in the date field) are questionable. They are given as "days since the 1st of January, 1990", so the entry "9131" means "1st of January, 2015". 9468 would be today (2015-12-04) and 9496 would be 2016-01-01.
What I already have is code to find out the "date value" for any given date in what I call the "legacy format" of the application I'm working with (here I've just used CURRENT_DATE):
SELECT FLOOR(CURRENT_DATE - TO_DATE('1990-01-01', 'YYYY-MM-DD')) AS diffdate FROM dual
For finding out which group a certain employee is assigned to, I tried:
SELECT * FROM history h
WHERE emp_nr = 1 AND valid_from <= 9131
ORDER BY valid_from DESC
FETCH FIRST ROW ONLY;
which should return the group which an employee is assigned to on the 1st of January 2015.
What I do need help with is creating a statement that joins all tables and does the same for a whole group instead of only one employee (as there are thousands of employees in the database and I only want the data of at most 10 groups).
I'm thankful for any kind of pointers in the right direction.
Use row_number to rank your history and get the latest group, just as you did with your FETCH FIRST query:
select *
from
(
  select
    h.*,
    row_number() over (partition by emp_nr order by valid_from desc) as rn
  from history h
  where valid_from <= 9131
)
where rn = 1
You can then join this result with other tables.
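One way the join could look is sketched below with Python's sqlite3 (SQLite 3.25 or later assumed) rather than Oracle. Only `history`, `emp_nr` and `valid_from` appear in the question; the tables `employees` and `emp_groups`, the columns `group_nr` and `name`, the helper functions and all sample rows are hypothetical stand-ins for the real schema.

```python
import sqlite3
from datetime import date

LEGACY_EPOCH = date(1990, 1, 1)

def to_legacy_day(d):
    """Convert a date to the legacy 'days since 1990-01-01' value."""
    return (d - LEGACY_EPOCH).days

def employees_in_group(conn, group_name, on_date):
    """Employees whose latest assignment on or before on_date is group_name."""
    return conn.execute(
        """
        select e.emp_nr, e.name
        from (
          select h.*,
            row_number() over (partition by emp_nr order by valid_from desc) as rn
          from history h
          where valid_from <= ?
        ) latest
        join emp_groups g on g.group_nr = latest.group_nr
        join employees e on e.emp_nr = latest.emp_nr
        where latest.rn = 1 and g.name = ?
        order by e.emp_nr
        """,
        (to_legacy_day(on_date), group_name),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table employees (emp_nr integer, name text);
    create table emp_groups (group_nr integer, name text);
    create table history (emp_nr integer, group_nr integer, valid_from integer);
    insert into employees values (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    insert into emp_groups values (1, 'A'), (2, 'B');
    insert into history values
      (1, 1, 9000), (1, 2, 9200),   -- Ann: A, then B from day 9200
      (2, 2, 8000), (2, 1, 9100),   -- Bob: B, then A from day 9100
      (3, 2, 9000);                 -- Cy: always B
    """
)

group_a_2015 = employees_in_group(conn, "A", date(2015, 1, 1))
group_a_2016 = employees_in_group(conn, "A", date(2016, 1, 1))
print(group_a_2015, group_a_2016)
```

The group filter is applied outside the ranked subquery on purpose: filtering by group before numbering the rows would pick up employees whose most recent assignment at that date is actually a different group.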