how to query the value that has changed +/- 10% of the value from first encounter of each patient in sql server? - sql

I have this query that finds users with multiple hospital visits.
Table has about 593 columns, so I don't think I can show you the structure. But let's assume these are basic patients table with following columns.
id, sex, studyDate, referringPhysician, bmi, bsa, height, weight, bloodPressure, heartRate. These are also in the real table.
The patient visits the hospital and has some work done. What we would like to find is how much the patient's BMI has changed since the first encounter. For example,
Row | ID | SEX | StudyDate | Physician | BMI | BSA | Ht | Wt | BP | HR
1 | PatientA | M | 2017-09-11 | Dr. Hale | 60 | 2.03 | 6 | 282 | 116/82 | 77
2 | PatientA | M | 2017-12-11 | Dr. Hale | 58 | 2.03 | 6 | 296 | 126/82 | 72
3 | PatientA | M | 2018-03-17 | Dr. Hale | 50 | 2.03 | 6 | 282 | 126/82 | 72
In the example above, row 1 was the first encounter and the BMI was 60. In row 2, the bmi decreased to 58, but it's not more than 10%. So, that shouldn't be displayed. However, row 3 has bmi 50 which is decreased by more than 10% of bmi in row 1. That should be displayed.
I'm sorry, I don't have the data that I can share.
with G as(
select * from Patients P
inner join (
select count(*) as counts, ID as oeID
from Patients
group by ID
Having count(*) > 2
) oe on P.ID = oe.oeID where P.BMI > 30
)
select * from G
order by StudyDate asc;
From this, what I'd like to do is find out patients whose BMI has changed by 10% from the first encounter.
How can I do this?
Can you also help me understand the concept of for-each users in SQL, and how it handles such queries?

Guessing at your data model here...I suspect you've got a heavily denormalized structure, with everything crammed into one table. You would be far better off with a patient table separate from the table that stores their visits. The with G syntax is also unnecessary, especially if you are just doing a select * from it afterwards. Heh, I'm trying to get into medical analytics, so I will give this a try.
I'll build this as I see your data model...you may have to change a step here and there to fit your column names. Let's start by getting the first and most recent (last) visit dates by id:
select id, min(StudyDate) as first_visit, max(studydate) as last_visit
from patients
group by id
having min(StudyDate) <> max(StudyDate)
A simple query at this point; the having clause ensures that these are two separate visits. But we are lacking the BMI numbers for these visits, so we will have to join back to the patients table to grab them. We will include a where clause so that only changes of 10 or more in either direction are returned:
select a.id, a.first_visit, a.last_visit, b.bmi as first_bmi, c.bmi as last_bmi, b.bmi - c.bmi as bmi_change
from
(select id, min(StudyDate) as first_visit, max(StudyDate) as last_visit
from patients
group by id
having min(StudyDate) <> max(StudyDate)) a
inner join patients b on b.id = a.id and b.StudyDate = a.first_visit
inner join patients c on c.id = a.id and c.StudyDate = a.last_visit
where b.bmi - c.bmi >= 10 or b.bmi - c.bmi <= -10
Hopefully that makes sense. You'll want to change the top select line to grab all the fields you actually want to return; I'm just returning the ones of interest to your question.
Part 2:
Let's approach this from a similar angle:
select id, min(StudyDate) as first_visit
from patients
group by id
Now we've got the first visit date. Let's join back to patients and get the bmi here.
select a.id, a.first_visit, b.bmi
from
(select id, min(StudyDate) as first_visit
from patients
group by id) a
inner join patients b on a.first_visit = b.StudyDate and a.id = b.id
This will simply be a list of each patient by ID, giving us their first_visit date and their BMI on that first visit. Now we want to compare this bmi to all subsequent visits, so let's join all of the patient's rows back to this query. Subquery a below is simply the query above in brackets:
select a.id, a.first_visit, b.StudyDate, a.bmi as first_bmi, b.bmi as visit_bmi, a.bmi - b.bmi as bmi_change
from
(select a.id, a.first_visit, b.bmi
from
(select id, min(StudyDate) as first_visit
from patients
group by id) a
inner join patients b on a.first_visit = b.StudyDate and a.id = b.id) a
inner join patients b on a.id = b.id
where a.bmi - b.bmi >= 10 or a.bmi - b.bmi <= -10
Similar idea: instead of joining on the max date to get the most recent visit, we join to all records for that patient and run the math from there. In the commented example, this will give rows 3, 5, 6.
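The queries above compare an absolute difference of 10 BMI points, while the question asks for a change of 10% relative to the first visit. A minimal sketch of the percentage version follows, using a window function instead of the self-join. It runs against an in-memory SQLite database purely so the example is executable (an assumption; the question targets SQL Server, where FIRST_VALUE ... OVER has the same shape). The three rows are the sample rows from the question. PARTITION BY id is the SQL counterpart of "for each patient": the window is evaluated separately within each id.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (assumption, for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (id TEXT, StudyDate TEXT, bmi REAL);
INSERT INTO patients VALUES
  ('PatientA', '2017-09-11', 60),
  ('PatientA', '2017-12-11', 58),
  ('PatientA', '2018-03-17', 50);
""")

pct_query = """
WITH visits AS (
  -- Attach each patient's first-visit BMI to every one of their rows.
  SELECT id, StudyDate, bmi,
         FIRST_VALUE(bmi) OVER (PARTITION BY id ORDER BY StudyDate) AS first_bmi
  FROM patients
)
SELECT id, StudyDate, bmi, first_bmi
FROM visits
-- Keep visits that differ from the first BMI by at least 10% in either direction.
WHERE ABS(bmi - first_bmi) >= 0.10 * first_bmi
ORDER BY id, StudyDate
"""
pct_rows = conn.execute(pct_query).fetchall()
for row in pct_rows:
    print(row)
```

With the three sample rows, only the 2018-03-17 visit (BMI 50 against a first BMI of 60) passes the filter.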
Part 3
A little more complex...getting rows 3,4,5,6 when row 4 shows less than a 10 change in BMI means you are now trying to pick out the first date on which the 10 change is seen and display all records from then on. Let's call the query in part 2 subquery a and go pseudo code for a moment:
select id, min(StudyDate)
from (subquerya) a
group by id
(subquerya) simply stands for the entire query used at the end of part 2. This will grab the study date of the first time a bmi change of over 10 is detected for each patient id (in our comment example, it would be visit 3). Now we can join back to patients, this time getting all records that are equal to or more recent than the min(studydate) of the first time bmi changed more than 10 since the first visit
select a.id, b.StudyDate, b.bmi
from
(select id, min(StudyDate) as min_studydate
from (subquerya) a
group by id) a
inner join patients b on a.id = b.id and a.min_studydate <= b.StudyDate
This will bring back the list of all study dates on or after the first time a bmi change of more than 10 was detected (3,4,5,6 from our comment example). Of course we've now lost the first study date's bmi value, so let's add that back in and bring the query all together.
select a.id, b.StudyDate, b.bmi, c.bmi as start_bmi, c.bmi - b.bmi as bmi_change
from
(select id, min(StudyDate) as min_studydate
from (select a.id, a.first_visit, b.StudyDate, a.bmi as first_bmi, b.bmi as visit_bmi, a.bmi - b.bmi as bmi_change
from
(select a.id, a.first_visit, b.bmi
from
(select id, min(StudyDate) as first_visit
from patients
group by id) a
inner join patients b on a.first_visit = b.StudyDate and a.id = b.id) a
inner join patients b on a.id = b.id
where a.bmi - b.bmi >= 10 or a.bmi - b.bmi <= -10) a
group by id) a
inner join patients b on a.id = b.id and a.min_studydate <= b.StudyDate
inner join (select a.id, a.first_visit, b.bmi
from
(select id, min(StudyDate) as first_visit
from patients
group by id) a
inner join patients b on a.first_visit = b.StudyDate and a.id = b.id) c on c.id = a.id
If I have everything right, this should bring back rows 3,4,5,6 and the change in BMI across each visit. I've left a few more columns in there than need be and it could be cleaned a little, but all logic should be there. I don't have
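The Part 3 idea (find the first visit that crosses the threshold, then return every visit from that date on) can also be sketched with two window functions rather than nested joins. This is only an illustration under stated assumptions: it runs on in-memory SQLite rather than SQL Server, it uses the question's 10% rule rather than a fixed 10 points, and the six rows are hypothetical data invented for the sketch, since the six-row example referred to above is not included here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (id TEXT, StudyDate TEXT, bmi REAL);
-- Hypothetical visits: row 4 is back within 10% of the first BMI.
INSERT INTO patients VALUES
  ('PatientA', '2017-09-11', 60),
  ('PatientA', '2017-12-11', 58),
  ('PatientA', '2018-03-17', 50),
  ('PatientA', '2018-06-20', 56),
  ('PatientA', '2018-09-15', 52),
  ('PatientA', '2018-12-01', 51);
""")

from_first_cross = """
WITH base AS (
  SELECT id, StudyDate, bmi,
         FIRST_VALUE(bmi) OVER (PARTITION BY id ORDER BY StudyDate) AS first_bmi
  FROM patients
),
flagged AS (
  -- Earliest date per patient on which the change reaches 10% of the first BMI.
  SELECT id, StudyDate, bmi, first_bmi,
         MIN(CASE WHEN ABS(bmi - first_bmi) >= 0.10 * first_bmi
                  THEN StudyDate END) OVER (PARTITION BY id) AS first_cross
  FROM base
)
SELECT id, StudyDate, bmi, first_bmi
FROM flagged
WHERE StudyDate >= first_cross   -- patients that never cross have NULL here and drop out
ORDER BY id, StudyDate
"""
cross_rows = conn.execute(from_first_cross).fetchall()
for row in cross_rows:
    print(row)
```

On this hypothetical data the result is visits 3, 4, 5 and 6: visit 4 is included because it falls after the first crossing, even though its own change is under 10%.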

Related

Grouping the data and showing 1 row per group in postgres

I have two tables which look like this :-
Component Table
Revision Table
I want to get the name,model_id,rev_id from this table such that the result set has the data like shown below :-
name model_id rev_id created_at
ABC 1234 2 23456
ABC 5678 2 10001
XYZ 4567
Here the data is grouped by name,model_id and only 1 data for each group is shown which has the highest value of created_at.
I am using the below query but it is giving me incorrect result.
SELECT cm.name,cm.model_id,r.created_at from dummy.component cm
left join dummy.revision r on cm.model_id=r.model_id
group by cm.name,cm.model_id,r.created_at
ORDER BY cm.name asc,
r.created_at DESC;
Result :-
Anyone's help will be highly appreciated.
Use max and a sub-query:
select T1.name,T1.model_id,r.rev_id,T1.created_at from
(
select cm.name,
cm.model_id,
MAX(r.created_at) As created_at from dummy.component cm
left join dummy.revision r on cm.model_id=r.model_id
group by cm.name,cm.model_id
) T1
left join dummy.revision r
on T1.model_id = r.model_id and T1.created_at = r.created_at
http://www.sqlfiddle.com/#!17/68cb5/4
name model_id rev_id created_at
ABC 1234 2 23456
ABC 5678 2 10001
xyz 4567
In your SELECT you're missing rev_id
Try this:
SELECT
cm.name,
cm.model_id,
MAX(r.rev_id) AS rev_id,
MAX(r.created_at) As created_at
from dummy.component cm
left join dummy.revision r on cm.model_id=r.model_id
group by 1,2
ORDER BY cm.name asc,
MAX(r.created_at) DESC;
What you were missing is a way to say that you only want the max record from the joined table. The join brings in all matching records from table r. If you group by the two columns from component and select the max of rev_id and created_at from r, only the top values of the available rows are returned for each group.
I would use distinct on:
select distinct on (cm.name, cm.model_id) cm.name, cm.model_id, r.rev_id, r.created_at
from dummy.component cm left join
dummy.revision r
on cm.model_id = r.model_id
order by cm.name, cm.model_id, r.created_at desc nulls last;
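A portable alternative to distinct on is ROW_NUMBER(), which keeps rev_id and created_at from the same revision row. The sketch below is an illustration only: it uses in-memory SQLite rather than PostgreSQL, drops the dummy schema prefix, and the table contents are hypothetical rows chosen to reproduce the expected result in the question (the original table contents are not shown).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE component (name TEXT, model_id INTEGER);
CREATE TABLE revision (model_id INTEGER, rev_id INTEGER, created_at INTEGER);
-- Hypothetical rows consistent with the expected output in the question.
INSERT INTO component VALUES ('ABC', 1234), ('ABC', 5678), ('XYZ', 4567);
INSERT INTO revision VALUES
  (1234, 1, 20000), (1234, 2, 23456),
  (5678, 1, 9000),  (5678, 2, 10001);
""")

latest_query = """
SELECT name, model_id, rev_id, created_at
FROM (
  SELECT cm.name, cm.model_id, r.rev_id, r.created_at,
         -- Number each component's revisions, newest first.
         ROW_NUMBER() OVER (PARTITION BY cm.name, cm.model_id
                            ORDER BY r.created_at DESC) AS rn
  FROM component cm
  LEFT JOIN revision r ON cm.model_id = r.model_id
) ranked
WHERE rn = 1
ORDER BY name, model_id
"""
latest_rows = conn.execute(latest_query).fetchall()
for row in latest_rows:
    print(row)
```

Because of the left join, a component without revisions (XYZ here) still appears once, with NULL rev_id and created_at.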

Left join with logic / condition

My title is a bit ambiguous, but I'll try to clarify below.
I've created a view (A) with a couple of joins. I'm trying to join that view with another view (B). Both views contain a year field, Company ID, Industry ID and a, let's call it, Product code that takes a value of I or U.
View B also contains employee ID; naturally there are multiple employee IDs for every Company ID. Any employee ID could have a Product code that is I, U or both. If it has both, there will be 2 identical employee IDs distinguished by the product code.
Now I want to join view A on Year, Customer ID, Industry ID and Product Code. BUT, since every Customer ID in view B could occur twice (if the underlying employees for that customer have both product code I and U), I only want to join once.
This is the distribution of Customer IDs for Product Codes:
I and NOT U: 165'370
U and NOT I: 45'27
U and I : 48'920
left join [raw].[ViewA] a on a.year=b.year and a.CustomerID=b.CustomerID
and a.IndustryID=b.IndustryID and a.ProductCode ='I'
This is the join I'm running with currently, though I'm excluding all records where the CustomerID only have product code U. The reason why I want to only join once per Customer ID/Year/IndustryID is because I'm later on aggregating some other value from View A. Thus, I can't have that value appear twice.
Current result
Year CustomerID IndustryID ProductCode Value
2015 A Z I 50
2015 A Z U NULL
2015 B Z I 40
2016 A Z I 20
2016 B Z U NULL
What I'd like
Year CustomerID IndustryID ProductCode Value
2015 A Z I 50
2015 A Z U NULL
2015 B Z I 40
2016 A Z I 20
2016 B Z U 30
If I understand correctly, try something as following
[pseudocode]
left join ( Select *, rank() over (partition by whatMakesItUnique Order by ProductCode) distinction From tableA ) a
on your conditions + a.distinction = 1
The idea is to assign rank 1 to the ProductCode "I" row when there are 2 rows for whatMakesItUnique, or to the single row (which may be "U") when there is only one, and then join on this assigned number.
"whatMakesItUnique" is probably the columns Year, CustomerID, IndustryID in your case.
I think you can do what you want with outer apply:
outer apply
(select top (1) a.*
from [raw].[ViewA] a
where a.year = b.year and a.CustomerID = b.CustomerID and
a.IndustryID = b.IndustryID
order by (case a.ProductCode when 'I' then 1 when 'U' then 2 else 3 end)
) a
This will return one row from ViewA, prioritized by the product code.
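Both suggestions above rest on the same idea: rank the rows within each Year/CustomerID/IndustryID group so that "I" comes before "U", and let only the top-ranked row take part in the join. A minimal sketch of that idea follows, under several assumptions: it runs on in-memory SQLite rather than SQL Server, the table and column names (view_a, view_b, customer_id, and so on) are hypothetical stand-ins for the real views, view_a is simplified to one value per group, and the rows are reconstructed from the "What I'd like" table in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-ins for the two views.
CREATE TABLE view_b (year INTEGER, customer_id TEXT, industry_id TEXT, product_code TEXT);
CREATE TABLE view_a (year INTEGER, customer_id TEXT, industry_id TEXT, value INTEGER);
INSERT INTO view_b VALUES
  (2015, 'A', 'Z', 'I'), (2015, 'A', 'Z', 'U'),
  (2015, 'B', 'Z', 'I'), (2016, 'A', 'Z', 'I'), (2016, 'B', 'Z', 'U');
INSERT INTO view_a VALUES
  (2015, 'A', 'Z', 50), (2015, 'B', 'Z', 40),
  (2016, 'A', 'Z', 20), (2016, 'B', 'Z', 30);
""")

join_once = """
WITH b_ranked AS (
  SELECT year, customer_id, industry_id, product_code,
         -- 'I' ranks first; a lone 'U' row also gets rank 1.
         ROW_NUMBER() OVER (
           PARTITION BY year, customer_id, industry_id
           ORDER BY CASE product_code WHEN 'I' THEN 1 WHEN 'U' THEN 2 ELSE 3 END
         ) AS rn
  FROM view_b
)
SELECT b.year, b.customer_id, b.industry_id, b.product_code, a.value
FROM b_ranked b
LEFT JOIN view_a a
  ON a.year = b.year
 AND a.customer_id = b.customer_id
 AND a.industry_id = b.industry_id
 AND b.rn = 1            -- only the top-ranked row per group receives the value
ORDER BY b.year, b.customer_id, b.product_code
"""
joined_rows = conn.execute(join_once).fetchall()
for row in joined_rows:
    print(row)
```

Each group's value then appears exactly once, so a later SUM over it does not double count.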

Oracle SQL: SQL join with group by (count) and having clauses

I have 3 tables
Table 1.) Sale
Table 2.) ItemsSale
Table 3.) Items
Table 1 and 2 have ID in common and table 2 and 3 have ITEMS in common.
I'm having trouble with a query that I have made so far but can't seem to get it right.
I'm trying to select all the records that have only one row and match certain criteria. Here is my query:
select *
from sales i
inner join itemssales j on i.id = j.id
inner join item u on j.item = u.item
where u.code = ANY ('TEST','HI') and
i.created_date between TO_DATE('1/4/2016 12:00:00 AM','MM/DD/YYYY HH:MI:SS AM') and
TO_DATE('1/4/2016 11:59:59 PM','MM/DD/YYYY HH:MI:SS PM')
group by i.id
having count(i.id) = 1
In the ItemSale table there are two entries but in the sale table there is only one. This is fine...but I need to construct a query that will only return to me the one record.
I believe the issue is with the "ANY" portion, the query only returns one row and that row is the record that doesn't meet the "ANY ('TEST', 'HI')" criteria.
But in reality that record with that particular ID has two records in ItemSales.
I need to only return the records that legitimately only have one record.
Any help is appreciated.
--EDIT:
COL1 | ID
-----|-----
2 | 26
3 | 85
1 | 23
1 | 88
1 | 6
1 | 85
What I also do is group them and make sure the count is equal to 1 but as you can see, the ID 85 is appearing here as one record which is a false positive because there is actually two records in the itemsales table.
I even tried changing my query to j.id after the select since j is the table with the two records but no go.
--- EDIT
Sale table contains:
ID
---
85
Itemsales table contains:
ID | Position | item_id
---|----------|---------
85 | 1 | 6
85 | 2 | 7
Items table contains:
item_id | code
--------|------
7 | HI
6 | BOOP
The record it is returning is the one with the Code of 'BOOP'
Thanks,
"I need to only return the records that legitimately only have one record."
I interpret this to mean, you only want to return SALES with only one ITEM. Furthermore you need that ITEM to meet your additional criteria.
Here's one approach, which will work fine with small(-ish) amounts of data but may not scale well. Without proper table descriptions and data profiles it's not possible to offer a performant solution.
with itmsal as
( select sales.id
from itemsales
join sales on sales.id = itemsales.id
where sales.created_date >= date '2016-01-04'
and sales.created_date < date '2016-01-05'
group by sales.id having count(*) = 1)
select sales.*
, items.*
from itmsal
join sales on sales.id = itmsal.id
join itemsales on itemsales.id = itmsal.id
join items on items.item = itemsales.item
where items.code in ('TEST','HI')
I think you are trying to restrict the results so that items MUST ONLY have the code of 'TEST' or 'HI'.
select
sales.*
from (
select
s.id
from Sales s
inner join Itemsales itss on s.id = itss.id
inner join Items i on itss.item_id = i.item_id
where s.created_date >= date '2016-01-04'
and s.created_date < date '2016-01-05'
group by
s.id
having
sum(case when i.code IN('TEST','HI') then 0 else 1 end) = 0
) x
inner join sales on x.id = sales.id
... /* more here as required */
This construct only returns sales.id that have items with ONLY those 2 codes.
Note it could be done with a common table expression (CTE) but I prefer to only use those when there is an advantage in doing so - which I do not see here.
If I get it correctly this may work (not tested):
select *
from sales s
inner join (
select i.id, count( i.id ) as cnt
from sales i
inner join itemssales j on i.id = j.id
inner join item u on j.item = u.item and u.code IN ('TEST','HI')
where i.created_date between TO_DATE('1/4/2016 12:00:00 AM','MM/DD/YYYY HH:MI:SS AM') and
TO_DATE('1/4/2016 11:59:59 PM','MM/DD/YYYY HH:MI:SS PM')
group by i.id
) sj on s.id = sj.id and sj.cnt = 1
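The false positive for ID 85 arises because the code filter in WHERE removes the 'BOOP' row before the rows are counted. One way to avoid that is to count all item rows and test the code inside HAVING. The sketch below illustrates this on in-memory SQLite rather than Oracle (an assumption for runnability), using the table layout from the question's second edit; sales 23 and 88 are hypothetical rows added for contrast, and the date filter is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (id INTEGER);
CREATE TABLE itemsales (id INTEGER, position INTEGER, item_id INTEGER);
CREATE TABLE items (item_id INTEGER, code TEXT);
INSERT INTO items VALUES (7, 'HI'), (6, 'BOOP');
-- Sale 85 is from the question; 23 and 88 are hypothetical single-item sales.
INSERT INTO sales VALUES (85), (23), (88);
INSERT INTO itemsales VALUES (85, 1, 6), (85, 2, 7), (23, 1, 7), (88, 1, 6);
""")

single_item_sales = """
SELECT s.id
FROM sales s
JOIN itemsales isl ON isl.id = s.id
JOIN items it ON it.item_id = isl.item_id
GROUP BY s.id
-- Count every item row first, then require the single row to carry a wanted code.
HAVING COUNT(*) = 1
   AND SUM(CASE WHEN it.code IN ('TEST', 'HI') THEN 1 ELSE 0 END) = 1
ORDER BY s.id
"""
sale_ids = [row[0] for row in conn.execute(single_item_sales)]
print(sale_ids)
```

Sale 85 is excluded because it has two item rows, and sale 88 is excluded because its only item has code 'BOOP'.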

I need a SQL query for comparing column values against rows in the same table

I have a table called BB_BOATBKG which holds passengers travel details with columns Z_ID, BK_KEY and PAXSUM where:
Z_ID = BookingNumber* LegNumber
BK_KEY = BookingNumber
PAXSUM = Total number passengers travelled in each leg for a particular booking
For Example:
Z_ID BK_KEY PAXSUM
001234*01 001234 2
001234*02 001234 3
001287*01 001287 5
001287*02 001287 5
002323*01 002323 7
002323*02 002323 6
I would like to get a list of all Booking Numbers BK_KEY from BB_BOATBKG where the total number of passengers PAXSUM is different in each leg for the same booking
Example, For Booking number A, A*Leg01 might have 2 Passengers, A* Leg02 might have 3 passengers
Depending on your RDBMS there might be several options available. A solution that should work for most is:
SELECT A.Z_ID, A.BK_KEY, A.PAXSUM
FROM BB_BOATBKG A
JOIN (
SELECT BK_KEY
FROM BB_BOATBKG
GROUP BY BK_KEY
HAVING COUNT( DISTINCT PAXSUM ) > 1
) B
ON A.BK_KEY = B.BK_KEY
If your DBMS support OLAP functions, have a look at RANK() OVER (...)
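The grouped HAVING COUNT(DISTINCT PAXSUM) > 1 test can be checked against the sample rows from the question. The sketch below does that on in-memory SQLite, chosen only so the example runs anywhere (an assumption; the question does not name its RDBMS), and returns just the booking numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BB_BOATBKG (Z_ID TEXT, BK_KEY TEXT, PAXSUM INTEGER);
INSERT INTO BB_BOATBKG VALUES
  ('001234*01', '001234', 2),
  ('001234*02', '001234', 3),
  ('001287*01', '001287', 5),
  ('001287*02', '001287', 5),
  ('002323*01', '002323', 7),
  ('002323*02', '002323', 6);
""")

mismatch_query = """
SELECT BK_KEY
FROM BB_BOATBKG
GROUP BY BK_KEY
-- More than one distinct passenger count means the legs disagree.
HAVING COUNT(DISTINCT PAXSUM) > 1
ORDER BY BK_KEY
"""
mismatched = [row[0] for row in conn.execute(mismatch_query)]
print(mismatched)
```

Booking 001287 is left out because both of its legs carry 5 passengers.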
It's a little counterintuitive, but you could join the table to itself on {BK_KEY, PAXSUM} and pull out only the records whose joined result is null.
I think this does it:
SELECT
a.BK_KEY
FROM
BB_BOATBKG a
LEFT OUTER JOIN BB_BOATBKG b ON a.BK_KEY = b.BK_KEY AND a.PAXSUM = b.PAXSUM
WHERE
b.Z_ID IS NULL
GROUP BY
a.BK_KEY
Edit: I think I missed anything beyond the trivial case. I think you can do it with some really nasty subselecting though, a la:
SELECT
b.BK_KEY
FROM
(
SELECT
a.BK_KEY,
Count = COUNT(*)
FROM
(
SELECT
a.BK_KEY,
a.PAXSUM
FROM
BB_BOATBKG a
GROUP BY
a.BK_KEY,
a.PAXSUM
HAVING
COUNT(*) = 1
) a
GROUP BY
a.BK_KEY
) b
INNER JOIN
(
SELECT
c.BK_KEY,
Count = COUNT(*)
FROM
BB_BOATBKG c
GROUP BY
c.BK_KEY
) c ON b.BK_KEY = c.BK_KEY AND b.Count = c.Count

Finding group maxes in SQL join result [duplicate]

Closed 10 years ago.
Possible Duplicate:
SQL: Select first row in each GROUP BY group?
Two SQL tables. One contestant has many entries:
Contestants
Id | Name
---|---------
1 | Fred
2 | Mary
3 | Irving
4 | Grizelda
Entries
Id | Contestant_Id | Score
---|---------------|------
1 | 3 | 100
2 | 3 | 22
3 | 1 | 888
4 | 4 | 123
5 | 1 | 19
6 | 3 | 50
Low score wins. Need to retrieve current best scores of all contestants ordered by score:
Best Entries Report
Name Entry_Id Score
---- -------- -----
Fred 5 19
Irving 2 22
Grizelda 4 123
I can certainly get this done with many queries. My question is whether there's a way to get the result with one, efficient SQL query. I can almost see how to do it with GROUP BY, but not quite.
In case it's relevant, the environment is Rails ActiveRecord and PostgreSQL.
Here is a PostgreSQL-specific way of doing this:
SELECT DISTINCT ON (c.id) c.name, e.id, e.score
FROM Contestants c
JOIN Entries e ON c.id = e.Contestant_id
ORDER BY c.id, e.score
Details about DISTINCT ON are here.
My SQLFiddle with example.
UPD To order the results by score:
SELECT *
FROM (SELECT DISTINCT ON (c.id) c.name, e.id, e.score
FROM Contestants c
JOIN Entries e ON c.id = e.Contestant_id
ORDER BY c.id, e.score) t
ORDER BY score
The easiest way to do this is with the ranking functions:
select name, Entry_id, score
from (select e.id as Entry_id, e.score, c.name,
row_number() over (partition by e.contestant_id order by e.score) as seqnum
from entries e join
contestants c
on e.Contestant_id = c.id
) ec
where seqnum = 1
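The ranking approach can be run against the sample data from the question. The sketch below uses in-memory SQLite instead of PostgreSQL (an assumption made so the example is self-contained) and adds the final ordering by score that the report asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contestants (id INTEGER, name TEXT);
CREATE TABLE entries (id INTEGER, contestant_id INTEGER, score INTEGER);
INSERT INTO contestants VALUES (1, 'Fred'), (2, 'Mary'), (3, 'Irving'), (4, 'Grizelda');
INSERT INTO entries VALUES
  (1, 3, 100), (2, 3, 22), (3, 1, 888),
  (4, 4, 123), (5, 1, 19), (6, 3, 50);
""")

best_query = """
SELECT name, entry_id, score
FROM (
  SELECT c.name, e.id AS entry_id, e.score,
         -- Rank each contestant's entries, lowest score first.
         ROW_NUMBER() OVER (PARTITION BY e.contestant_id ORDER BY e.score) AS seqnum
  FROM entries e
  JOIN contestants c ON e.contestant_id = c.id
) ec
WHERE seqnum = 1
ORDER BY score
"""
best_entries = conn.execute(best_query).fetchall()
for row in best_entries:
    print(row)
```

Mary does not appear because she has no entries and the join is an inner join. Unlike a plain MIN(score) with GROUP BY, this keeps the entry id of the winning row.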
I'm not familiar with PostgreSQL, but something along these lines should work:
SELECT c.*, s.Score
FROM Contestants c
JOIN (SELECT MIN(Score) Score, Contestant_Id FROM Entries GROUP BY Contestant_Id) s
ON c.Id=s.Contestant_Id
One solution is:
select min(e.score),c.name,c.id from entries e
inner join contestants c on e.contestant_id = c.id
group by e.contestant_id,c.name,c.id
Here is an example:
http://sqlfiddle.com/#!3/9e307/27
This simple query should do the trick..
Select contestants.name as name, entries.id as entry_id, MIN(entries.score) as score
FROM entries
JOIN contestants ON contestants.id = entries.contestant_id
GROUP BY name
ORDER BY score
this grabs the min score for each contestant and orders them ASC