T-SQL Select first instance

I have a table that contains patient locations. I'm trying to find the first patient location that is not the emergency department. I tried using MIN but since the locations have numbers in them it pulls the MIN location but not necessarily the first location. There is a datetime field associated with the location, but I'm not certain how to link the min datetime to the first location. Any help would be appreciated. My query looks something like this:
SELECT PatientName,
MRN,
CSN,
MIN (LOC) as FirstUnit,
MIN (DateTime)as FirstUnitTime
FROM Patients
WHERE LOC <> 'ED'

I presume that you want the first unit for each patient. If so, then you can use row_number():
select PatientName, MRN, CSN, LOC as FirstUnit, DateTime as FirstUnitTime
from (select p.*,
row_number() over (partition by PatientName, MRN, CSN
order by datetime asc) as seqnum
from Patients p
where loc <> 'ED'
) p
where seqnum = 1;
row_number() assigns a sequential number to a group of rows, where the group is specified by the partition by clause. The numbers are in order, as defined by the order by clause. So, the oldest (first) row in each group is assigned a value of 1. The outside query chooses this row.
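To see the difference between MIN(LOC) and the row_number() approach, here is a minimal sketch that runs the same query shape against made-up data. It assumes an in-memory SQLite database (3.25 or later, for window functions) as a stand-in for SQL Server; the sample patients, units, and timestamps are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; all rows below are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Patients (PatientName TEXT, MRN TEXT, CSN TEXT, LOC TEXT, DateTime TEXT);
INSERT INTO Patients VALUES
  ('Ann', 'M1', 'C1', 'ED',  '2024-01-01 08:00'),
  ('Ann', 'M1', 'C1', '5W',  '2024-01-01 10:00'),
  ('Ann', 'M1', 'C1', '3E',  '2024-01-02 09:00'),
  ('Bob', 'M2', 'C2', 'ED',  '2024-01-03 07:00'),
  ('Bob', 'M2', 'C2', 'ICU', '2024-01-03 12:00');
""")

# MIN(LOC) picks the alphabetically smallest unit, not the earliest one.
min_loc = conn.execute(
    "SELECT MIN(LOC) FROM Patients WHERE LOC <> 'ED' AND PatientName = 'Ann'"
).fetchone()[0]

# row_number() numbers each patient's non-ED rows by time; seqnum = 1 is the first unit.
first_units = conn.execute("""
SELECT PatientName, LOC AS FirstUnit, DateTime AS FirstUnitTime
FROM (SELECT p.*,
             ROW_NUMBER() OVER (PARTITION BY PatientName, MRN, CSN
                                ORDER BY DateTime ASC) AS seqnum
      FROM Patients p
      WHERE LOC <> 'ED') AS p
WHERE seqnum = 1
ORDER BY PatientName
""").fetchall()

print(min_loc)      # 3E
print(first_units)  # Ann's first unit is 5W, Bob's is ICU
```

In this sample Ann moves to 5W before 3E, so MIN(LOC) returns the wrong unit while the seqnum = 1 row keeps the location and its timestamp together.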

Related

PostgreSQL fill out nulls with previous value by category

I am trying to fill in some nulls where I just need them to be the previous available value for a name (sorted by date).
So, from this table:
I need the query to output this:
Now, the idea is that for Jane, on the second and third there was no score, so it should be equal to the score from the previous date on which a score was available for Jane. And the same for Jon. I am trying coalesce and range, but range is not implemented yet in Redshift. I also looked into other questions, but they don't fully apply to different categories. Any alternatives?
Thanks!
select day, name,
coalesce(score, (select score
from [your table] as t
where t.name = [your table].name and t.day < [your table].day
order by day desc limit 1)) as score
from [your table]
The query straightforwardly implements the logic you described:
if score is not null, coalesce will return its value without executing the subquery
if score is null, the subquery will return the last available score for that name before the given date
It's a "gaps and islands" problem, and a query can look like this:
SELECT
day,
name,
MAX(score) OVER (PARTITION BY name, group_id) AS score
FROM (
SELECT
*,
SUM(CASE WHEN score IS NULL THEN 0 ELSE 1 END) OVER (PARTITION BY name ORDER BY day) AS group_id
FROM data
) groups
ORDER BY name DESC, day
You can check a working demo here
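The grouping trick in that query can be checked with a small sketch. It assumes an in-memory SQLite database (3.25 or later) rather than PostgreSQL or Redshift, and the rows in data are invented for illustration. The running count of non-null scores stays flat across null rows, so each null shares a group_id with the last non-null row before it.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL/Redshift; sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (day TEXT, name TEXT, score INTEGER);
INSERT INTO data VALUES
  ('2024-01-01', 'Jane', 10),
  ('2024-01-02', 'Jane', NULL),
  ('2024-01-03', 'Jane', NULL),
  ('2024-01-04', 'Jane', 7),
  ('2024-01-01', 'Jon',  5),
  ('2024-01-02', 'Jon',  NULL),
  ('2024-01-03', 'Jon',  9);
""")

filled = conn.execute("""
SELECT day, name,
       MAX(score) OVER (PARTITION BY name, group_id) AS score
FROM (SELECT *,
             -- running count of non-null scores: nulls inherit the previous group
             SUM(CASE WHEN score IS NULL THEN 0 ELSE 1 END)
                 OVER (PARTITION BY name ORDER BY day) AS group_id
      FROM data) AS g
ORDER BY name, day
""").fetchall()

for row in filled:
    print(row)
```

Each group holds exactly one non-null score, so MAX simply picks it up. Rows that precede the first score for a name land in group 0 and stay null.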

Selecting 1 column's value in a group after grouping by another column

How would I include the name of any one of the books that belong to that particular type in the below query?
select distinct
(select sum(ob.Balance)),
ob.BookType
from orders.OrderBooks ob
group by ob.BookType
In its current state it does what I need it to and groups books by BookType and sums their balances, as seen below.
However I need the name of any book that belongs to that BookType as part of the result.
If I select the BookName column and then group by it like below, it results in more unique entries and to an extent undoes the original grouping.
select distinct
(select sum(ob.Balance)),
ob.BookType,
ob.BookName
from orders.OrderBooks ob
group by ob.BookType, ob.BookName
;WITH x AS
(
SELECT
Balance = SUM(Balance) OVER (PARTITION BY BookType),
BookType,
BookName,
rn = ROW_NUMBER() OVER (PARTITION BY BookType ORDER BY BookName DESC)
FROM orders.OrderBooks
)
SELECT Balance, BookType, BookName
FROM x
WHERE rn = 1;
ORDER BY BookName DESC was dealer's choice. If you truly don't care which title shows up in the result, you can use any ordering you like. If you want the results to be random every time, you can use ORDER BY NEWID().
In general I like this flexibility better than the TOP (1) subquery approach, in addition to a single scan instead of an additional table access per row. But you can also do it a different way; just take min/max of the bookname, too:
SELECT Balance = SUM(Balance),
BookType,
BookName = MIN(BookName) -- or MAX()
FROM orders.OrderBooks
GROUP BY BookType;
You can see these give similar results in this db<>fiddle. Plan is simpler, too; most notably: no spools. However when you use an aggregate function against that column, it makes it harder to provide arbitrary/random results, and if you intend to add other columns pulled from the right row, you'll need to go back to the row_number solution.
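A rough side-by-side of the two approaches, assuming an in-memory SQLite database (3.25 or later) in place of SQL Server. The table has no schema prefix, the alias TotalBalance replaces the T-SQL `Balance = ...` form, and the three books are made up.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OrderBooks (BookType TEXT, BookName TEXT, Balance INTEGER);
INSERT INTO OrderBooks VALUES
  ('Fiction', 'Dune', 10),
  ('Fiction', 'Emma', 5),
  ('Tech', 'SQL 101', 7);
""")

# Window version: total per type on every row, then keep one row per type.
via_row_number = conn.execute("""
WITH x AS (
  SELECT SUM(Balance) OVER (PARTITION BY BookType) AS TotalBalance,
         BookType, BookName,
         ROW_NUMBER() OVER (PARTITION BY BookType ORDER BY BookName DESC) AS rn
  FROM OrderBooks
)
SELECT TotalBalance, BookType, BookName FROM x WHERE rn = 1 ORDER BY BookType
""").fetchall()

# Aggregate version: MIN(BookName) picks one title per type.
via_min = conn.execute("""
SELECT SUM(Balance) AS TotalBalance, BookType, MIN(BookName) AS BookName
FROM OrderBooks
GROUP BY BookType
ORDER BY BookType
""").fetchall()

print(via_row_number)
print(via_min)
```

Both return one row per BookType with the same totals; only the chosen title differs (last alphabetically versus first), which is fine when any title will do.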
You can use a correlated subquery to get a single book name of that type. This assumes there's an ID field and you want to pull the most recent one:
select
Balance = sum(ob.Balance),
ob.BookType,
BookName = (SELECT TOP(1) ob2.BookName FROM orders.OrderBooks ob2 WHERE ob2.BookType = ob.BookType ORDER BY ob2.ID DESC)
from orders.OrderBooks ob
group by ob.BookType

Can SQL Compare rows in same table , and dynamic select value?

Recently I was given a table named Appointments.
The requirement is to select only one row per customer, following two rules:
If the times are the same (whether the locations are the same or different), put null in tutor and location.
If the times are different (whether the locations are the same or different), pick the row with the smallest time.
Since I'm an amateur in SQL, I searched for a self-join approach, but it doesn't seem to work in this case.
Expected result
Thanks all, have a great day...
You seem to want the minimum time for each customer, with null values if there are multiple rows and the tutor or location don't match.
You can use window functions:
select customer, starttime,
(case when min(location) = max(location) then min(location) end) as location,
(case when min(tutor) = max(tutor) then min(tutor) end) as tutor
from (select t.*, rank() over (partition by customer order by starttime) as seqnum
from t
) t
where seqnum = 1
group by customer, starttime
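As a quick check of how the rank() plus min/max comparison behaves, here is a sketch against invented appointments. It assumes an in-memory SQLite database (3.25 or later) and a table called Appointments with the columns used in the query above; none of the sample values come from the question.

```python
import sqlite3

# Assumption: SQLite stands in for the asker's database; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Appointments (customer TEXT, starttime TEXT, location TEXT, tutor TEXT);
INSERT INTO Appointments VALUES
  ('A', '09:00', 'Room1', 'Tom'),
  ('A', '09:00', 'Room2', 'Tom'),
  ('A', '11:00', 'Room1', 'Sue'),
  ('B', '10:00', 'Room3', 'Ann'),
  ('B', '12:00', 'Room1', 'Tom');
""")

rows = conn.execute("""
SELECT customer, starttime,
       CASE WHEN MIN(location) = MAX(location) THEN MIN(location) END AS location,
       CASE WHEN MIN(tutor) = MAX(tutor) THEN MIN(tutor) END AS tutor
FROM (SELECT a.*,
             -- rank() gives every row tied for the earliest time the value 1
             RANK() OVER (PARTITION BY customer ORDER BY starttime) AS seqnum
      FROM Appointments a) AS t
WHERE seqnum = 1
GROUP BY customer, starttime
ORDER BY customer
""").fetchall()

print(rows)
```

Customer A has two rows tied at 09:00 with different locations, so location comes back null while the shared tutor is kept. Customer B has a single earliest row, so it is returned unchanged.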

Oracle SQL Return First & Last Value From Different Columns By Partition

I need help with a query that will return a single record per partition in the below dataset. I used the DENSE_RANK to get the order and first/last position within each partition, but the problem is that I need to get a single record for each EMPLOYEE ITEM_ID combination which contains:
MIN(START) which is date type with time
SUM(DURATION) which is a number type signifying seconds of activity
MIN ranked value from INIT_STATUS
MAX ranked value from FIN_STATUS
Here is the initial data table, the same data table ordered with rank, and the desired result at the end (see image below):
Also, here is the code used to get the ordered table with rank values:
SELECT T.*,
DENSE_RANK() OVER (PARTITION BY T.EMPLOYEE, T.ITEM_ID ORDER BY T.START) AS D_RANK
FROM TEST_DATA T
ORDER BY T.EMPLOYEE, T.ITEM_ID, T.START;
Use first/last option to find statuses. The rest is classic aggregation:
select employee, min(start_), sum(duration),
max(init_status) keep (dense_rank first order by start_),
max(fin_status) keep (dense_rank last order by start_)
from test_data t
group by employee, item_id
order by employee, item_id;
start is a reserved word, so I used start_ for my test.

How to count rows in SQL Server 2012?

I am trying to find whether a person (id = A3) is continuously active in a program for at least five months in a given year (2013). Any suggestion would be appreciated. My data look as follows:
You simply use group by and a conditional expression:
select id,
(case when count(ActiveMonthYear) >= 5 then 'YES!' else 'NAW' end)
from table t
where ListOfTheMonths between '201301' and '201312'
group by id;
EDIT:
I suppose "continuously" doesn't just mean any five months. For that, there are various ways. I like the difference of row numbers approach:
select distinct id
from (select t.*,
(row_number() over (partition by id order by ListOfTheMonths) -
count(ActiveMonthYear) over (partition by id order by ListOfTheMonths)
) as grp
from table t
where ListOfTheMonths between '201301' and '201312'
) t
where ActiveMonthYear is not null
group by id, grp
having count(*) >= 5;
The difference in the subquery is constant for groups of consecutive active months. This is then used as a grouping key. The result is a list of all ids that meet the criterion. You can add a where for a particular id (do it in the subquery).
By the way, this is written using select distinct and group by. This is one of the rare cases where these two are appropriately used together. A single id could have two periods of five months in the same year. There is no reason to include that person twice in the result set.
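The contrast between "any five months" and "five consecutive months" can be sketched as follows. This assumes an in-memory SQLite database (3.25 or later) instead of SQL Server 2012, a table named activity (the `table` in the queries above is a placeholder), one row per id per month, and a null ActiveMonthYear for inactive months. The ids and activity patterns are invented.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; table name and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity (id TEXT, ListOfTheMonths TEXT, ActiveMonthYear TEXT)")

active = {
    "A3": {2, 3, 4, 5, 6, 9},   # five consecutive months, plus one more
    "B1": {1, 2, 4, 5, 7, 8},   # six months, but never more than two in a row
}
for pid, months in active.items():
    for m in range(1, 13):
        ym = f"2013{m:02d}"
        conn.execute("INSERT INTO activity VALUES (?, ?, ?)",
                     (pid, ym, ym if m in months else None))

# Any five active months in the year.
any_five = [r[0] for r in conn.execute("""
SELECT id FROM activity
WHERE ListOfTheMonths BETWEEN '201301' AND '201312'
GROUP BY id
HAVING COUNT(ActiveMonthYear) >= 5
ORDER BY id
""")]

# Five consecutive months: row number minus running active count is constant per streak.
continuous = [r[0] for r in conn.execute("""
SELECT DISTINCT id
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY ListOfTheMonths)
             - COUNT(ActiveMonthYear) OVER (PARTITION BY id ORDER BY ListOfTheMonths) AS grp
      FROM activity t
      WHERE ListOfTheMonths BETWEEN '201301' AND '201312') AS t
WHERE ActiveMonthYear IS NOT NULL
GROUP BY id, grp
HAVING COUNT(*) >= 5
ORDER BY id
""")]

print(any_five)    # both ids
print(continuous)  # only A3
```

B1 passes the simple count but has no streak of five, so only the second query excludes it.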