Adding a dummy identifier to data that varies by position and value

Adding a dummy identifier to data that varies by position and value - sql

I am working on a project in SQL Server with diagnosis codes and a patient can have up to 4 codes but not necessarily more than 1 and a patient cannot repeat a code more than once. However, codes can occur in any order. My goal is to be able to count how many times a Diagnosis code appears in total, as well as how often it appears in a set position.
My data currently resembles the following:
PtKey
Order #
Order Date
Diagnosis1
Diagnosis2
Diagnosis3
Diagnosis 4
345
1527
7/12/20
J44.9
R26.2
NULL
NULL
367
1679
7/12/20
R26.2
H27.2
G47.34
NULL
325
1700
7/12/20
G47.34
NULL
NULL
NULL
327
1710
7/12/20
I26.2
J44.9
G47.34
NULL
I would think the best approach would be to create a dummy column here that would match up the diagnosis by position. For example, Diagnosis 1 with A, and Diagnosis 2 with B, etc.
My current plan is to rollup the diagnosis using an unpivot:
UNPIVOT ( Diag for ColumnALL IN (Diagnosis1, Diagnosis2, Diagnosis3, Diagnosis4)) as unpvt
However, this still doesn’t provide a way to count the diagnoses by position on a sales order.
I want it to look like this:
Diagnosis
Total Count
Diag1 Count
Diag2 Count
Diag3 Count
Diag4 Count
J44.9
2
1
1
0
0
R26.2
1
1
0
0
0
H27.2
1
0
1
0
0
I26.2
1
1
0
0
0
G47.34
3
1
0
2
0

You can unpivot using apply and aggregate:
select v.diagnosis, count(*) as cnt,
sum(case when pos = 1 then 1 else 0 end) as pos_1,
sum(case when pos = 2 then 1 else 0 end) as pos_2,
sum(case when pos = 3 then 1 else 0 end) as pos_3,
sum(case when pos = 4 then 1 else 0 end) as pos_4
from data d cross apply
(values (diagnosis1, 1),
(diagnosis2, 2),
(diagnosis3, 3),
(diagnosis4, 4)
) v(diagnosis, pos)
where diagnosis is not null;

Another way is to use UNPIVOT to transform the columns into groupable entities:
SELECT Diagnosis, [Total Count] = COUNT(*),
[Diag1 Count] = SUM(CASE WHEN DiagGroup = N'Diagnosis1' THEN 1 ELSE 0 END),
[Diag2 Count] = SUM(CASE WHEN DiagGroup = N'Diagnosis2' THEN 1 ELSE 0 END),
[Diag3 Count] = SUM(CASE WHEN DiagGroup = N'Diagnosis3' THEN 1 ELSE 0 END),
[Diag4 Count] = SUM(CASE WHEN DiagGroup = N'Diagnosis4' THEN 1 ELSE 0 END)
FROM
(
SELECT * FROM #x UNPIVOT (Diagnosis FOR DiagGroup IN
([Diagnosis1],[Diagnosis2],[Diagnosis3],[Diagnosis4])) up
) AS x GROUP BY Diagnosis;
Example db<>fiddle

You can also manually unpivot via UNION before doing the conditional aggregation:
SELECT Diagnosis, COUNT(*) As Total Count
, SUM(CASE WHEN Position = 1 THEN 1 ELSE 0 END) As [Diag1 Count]
, SUM(CASE WHEN Position = 2 THEN 1 ELSE 0 END) As [Diag2 Count]
, SUM(CASE WHEN Position = 3 THEN 1 ELSE 0 END) As [Diag3 Count]
, SUM(CASE WHEN Position = 4 THEN 1 ELSE 0 END) As [Diag4 Count]
FROM
(
SELECT PtKey, Diagnosis1 As Diagnosis, 1 As Position
FROM [MyTable]
UNION ALL
SELECT PtKey, Diagnosis2 As Diagnosis, 2 As Position
FROM [MyTable]
WHERE Diagnosis2 IS NOT NULL
UNION ALL
SELECT PtKey, Diagnosis3 As Diagnosis, 3 As Position
FROM [MyTable]
WHERE Diagnosis3 IS NOT NULL
UNION ALL
SELECT PtKey, Diagnosis4 As Diagnosis, 4 As Position
FROM [MyTable]
WHERE Diagnosis4 IS NOT NULL
) d
GROUP BY Diagnosis
Borrowing Aaron's fiddle, to avoid needing to rebuild the schema from scratch, and we get this:
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=d1f7f525e175f0f066dd1749c49cc46d

Related

Creating a view from two tables, with case and sum on one table

I'm having some troubles trying to create a view from two tables, which includes a sum + case for the first table. I've tried multiple different joins/unions, and I can get just the XTS table to come over, or just the case count scenarios to work, but I cannot get both.
here are the tables. For Table 1, UWI is non-unique. For Table 2, UWI is Unique. new_view is what I'm hoping to achieve for my view.
TABLE 1
UWI ET
1 A
1 B
1 B
2 B
2 C
2 C
TABLE 2
UWI XTS
1 10
2 20
3 10
4 30
new_view
UWI XTS B_COUNT C_COUNT
1 10 4 3
2 20 3 4
3 10 4 5
4 30 3 2
Here's what I'm currently working with.
CREATE VIEW new_view AS
SELECT t1.UWI,
sum(case when t1.ET='B' then 1 else 0 end) as B_COUNT,
sum(case when t1.ET='C' then 1 else 0 end) as C_COUNT,
sum(case when t1.ET='D' then 1 else 0 end) as D_COUNT,
sum(case when t1.ET='E' then 1 else 0 end) as E_COUNT,
sum(case when t1.ET='F' then 1 else 0 end) as F_COUNT
FROM TABLE_1 t1
INNER JOIN (SELECT t2.UWI, t2.XTS AS TSC
from TABLE_2 t2)
on t1.UWI = t2.UWI
group by t1.UWI;

Your sample select does not match your sample data, so this is a guess but I think you just need to move the aggregation into an apply()
select t2.UWI, t2.XTS, s.*
from Table_2 t2
outer apply (
select
sum(case when t1.ET='B' then 1 else 0 end) as B_COUNT,
sum(case when t1.ET='C' then 1 else 0 end) as C_COUNT,
sum(case when t1.ET='D' then 1 else 0 end) as D_COUNT,
sum(case when t1.ET='E' then 1 else 0 end) as E_COUNT,
sum(case when t1.ET='F' then 1 else 0 end) as F_COUNT
from table_1 t1
where t1.UWI = t2.UWI
group by t1.UWI
)s

how to sum count sql

How to total count?
SELECT
COUNT(CASE WHEN SHP.id = 1 then 1 ELSE NULL END) as "New",
COUNT(CASE WHEN SHP.id = 2 then 5 ELSE NULL END) as "Accepted"
from SHP
RESULT:
NEW Accepted
1 5
But I need a total count
result: 6

I'd do something like this;
SELECT
COUNT(CASE WHEN id = 1 THEN 1 END) as New,
COUNT(CASE WHEN id = 2 THEN 5 END) as Accepted,
COUNT(CASE WHEN id = 1 THEN 1
WHEN id = 2 THEN 5 END) as Total
FROM SHP
This is exactly what the CASE statement should be used for, the logic is very simple. This will avoid having to perform multiple calculations on the same fields.
As a note, the value in your THEN statement isn't used in this instance at all, it's just doing a COUNT of the number rather than performing a SUM. I've also removed the ELSE NULL because this is what the CASE will do by default anyway.
If your intention was to SUM the values then do this;
SELECT
SUM(CASE WHEN id = 1 THEN 1 END) as New,
SUM(CASE WHEN id = 2 THEN 5 END) as Accepted,
SUM(CASE WHEN id = 1 THEN 1
WHEN id = 2 THEN 5 END) as Total
FROM SHP
Example
Assuming you have only two values in your database, 1 and 2, we can create test data like this;
CREATE TABLE #SHP (id int)
INSERT INTO #SHP (id)
VALUES (1),(2)
And use this query;
SELECT
SUM(CASE WHEN id = 1 then 1 END) as New,
SUM(CASE WHEN id = 2 then 5 END) as Accepted,
SUM(CASE WHEN id = 1 THEN 1
WHEN id = 2 THEN 5 END) as Total
FROM #SHP
Gives this result;
New Accepted Total
1 5 6

Try this:
SELECT
COUNT(CASE WHEN SHP.id = 1 then 1 ELSE NULL END) +
COUNT(CASE WHEN SHP.id = 2 then 5 ELSE NULL END) as "Total"
from SHP

You could wrap your query into a subquery and do something like this:
SELECT SUM(New) as New, Sum(Accepted) as Accepted, Sum(New + Accepted) as Total FROM
(SELECT
COUNT(CASE WHEN SHP.id = 1 then 1 ELSE NULL END) as "New",
COUNT(CASE WHEN SHP.id = 2 then 5 ELSE NULL END) as "Accepted"
from SHP) as SubQuery
That's if you don't want to duplicate doing the counts and just adding the two together.

try this
with s1 as(
SELECT
COUNT(CASE WHEN SHP.id = 1 then 1 ELSE 0 END) as "New"
from SHP
),s2 as
(
SELECT
COUNT(CASE WHEN SHP.id = 2 then 5 ELSE 0 END) as "Accepted"
from SHP
)
select sum("New"+ "Accepted") from s1,s2

sql subquery that collects from 3 rows

I have a huge database with over 4 million rows that look like that:
Customer ID Shop
1 Asda
1 Sainsbury
1 Tesco
2 TEsco
2 Tesco
I need to count customers that within last 4 weeks had shopped in all 3 shops Tesco Sainsbury and Asda. Can you please advice if its possible to do it with subqueries?

This is an example of a "set-within-sets" subquery. You can solve it with aggregation:
select customer_id
from Yourtable t
where <shopping date within last four weeks>
group by customer_id
having sum(case when shop = 'Asda' then 1 else 0 end) > 0 and
sum(case when shop = 'Sainsbury' then 1 else 0 end) > 0 and
sum(case when shop = 'Tesco' then 1 else 0 end) > 0;
This structure is quite flexible. So if you wanted Asda and Tesco but not Sainsbury, then you would do:
select customer_id
from Yourtable t
where <shopping date within last four weeks>
group by customer_id
having sum(case when shop = 'Asda' then 1 else 0 end) > 0 and
sum(case when shop = 'Sainsbury' then 1 else 0 end) = 0 and
sum(case when shop = 'Tesco' then 1 else 0 end) > 0;
EDIT:
If you want a count, then use this as a subquery and count the results:
select count(*)
from (select customer_id
from Yourtable t
where <shopping date within last four weeks>
group by customer_id
having sum(case when shop = 'Asda' then 1 else 0 end) > 0 and
sum(case when shop = 'Sainsbury' then 1 else 0 end) > 0 and
sum(case when shop = 'Tesco' then 1 else 0 end) > 0
) t

Need Help on Oracle Select Query

I am finding difficulty to frame a select query.
PFB, for the table and corresponding data:
ID DLS MATCH_STATUS LAST_UPDATE_TIME BO CH FT
1 0 0 09-07-2013 00:00:00 IT TE NA
1 1 1 09-07-2013 00:01:01 IT TE NA
2 0 0 09-07-2013 10:00:00 IP TE NA
3 0 0 09-07-2013 11:00:00 IT YT NA
3 2 2 09-07-2013 11:01:00 IT YT NA
Here
Match_Status 0-->Initial Record
1-->Singel Match
2-->Multi Match
For every record there will be a initial entry with match_status 0 and subsequent matching process end other number such as 1,2 will be update.
I am trying to retrieve records such as total record , waiting match ,single match and multi match group by BO, CH and FT
Below is the expected out put:
BO CH FT TOTAL_RECORD AWAITNG_MATCH SINGLE_MATCH MULTI_MATCH
IT TE NA 1 0 1 0
IP TE NA 1 1 0 0
IT YT NA 1 0 0 2
I have tried below query :
select BO,CH,FT,sum(case when match_status=0 then 1 else 0 end) as TOTAL_RECORD,
sum(case when match_status = 0 then 1 else 0 end) as AWAITING_MATCH,
sum(case when match_status = 1 then 1 else 0 end) as SINGLE_MATCH,
sum(case when match_status = 2 then 1 else 0 end) as MULTI_MATCH from
table1 where last_update_time >= current_timestamp-1
group by BO,CH,FT;
problem with the above query is, awaiting_match is getting populated same as total record as I understand because of match_status=0
Similarly I tried with
select BO,CH,FT,sum(case when match_status=0 then 1 else 0 end) as TOTAL_RECORD,
select (sum(case when t1.ms=0 then 1 else 0 end) from
(select max(match_status) as ms from table1 where last_update_time >= current_timestamp-1 group by id)t1) )awaiting_match,
sum(case when match_status = 1 then 1 else 0 end) as SINGLE_MATCH,
sum(case when match_status = 2 then 1 else 0 end) as MULTI_MATCH from
table1 where last_update_time >= current_timestamp-1
group by BO,CH,FT;
problem with the approach is awaiting_match is getting populated with the same value for subsequent row entry.
Please help me with a suitable query for the desired format.
Thanks a lot in advance.

It seems that you want the last match status. I am guessing that this is actually the maximum of the statuses. If so, the following solves the problem by first grouping on id and then doing the grouping to summarize:
select BO, CH, FT,
count(*) as TOTAL_RECORD,
sum(case when lastms = 0 then 1 else 0 end) as AWAITING_MATCH,
sum(case when lastms = 1 then 1 else 0 end) as SINGLE_MATCH,
sum(case when lastms = 2 then 1 else 0 end) as MULTI_MATCH
from (select id, bo, ch, ft, MAX(match_status) as lastms
from table1
where last_update_time >= current_timestamp-1
group by id, bo, ch, ft
) t
group by BO, CH, FT;
If you actually want the last update to provide the status for the id, then you can use row_number() to enumerate the rows for each id, order by update time descending, and choose the first one:
select BO, CH, FT,
count(*) as TOTAL_RECORD,
sum(case when lastms = 0 then 1 else 0 end) as AWAITING_MATCH,
sum(case when lastms = 1 then 1 else 0 end) as SINGLE_MATCH,
sum(case when lastms = 2 then 1 else 0 end) as MULTI_MATCH
from (select id, bo, ch, ft, match_status,
ROW_NUMBER() over (partition by id order by last_update_time desc) as seqnum
from table1
where last_update_time >= current_timestamp-1
) t
where seqnum = 1
group by BO, CH, FT;

How do I return count and not count in a SQL query?

If I have a table
AgentID | IsNew | TeamID
1 N 1
2 Y 2
3 Y 2
4 N 2
5 Y 1
I want to return the following from a query:
Team | CountIsNew = N | CountIsNew = Y
1 1 1
2 1 2
Is there a way I can do this?
Using Oracle 10

SELECT team, SUM(DECODE(IsNew, 'N', 1, 0)), SUM(DECODE(IsNew, 'Y', 1, 0))
FROM mytable
GROUP BY
team

SELECT TeamId
, SUM(CASE WHEN IsNew = 'N' THEN 1 ELSE 0 END) AS CountIsNotNew
, SUM(CASE WHEN IsNew = 'Y' THEN 1 ELSE 0 END) AS CountIsNew
FROM Agent
GROUP BY TeamId

Yet another way - COUNT doesn't count NULLs (except for COUNT(*)):
SELECT TeamId,
COUNT(DECODE(IsNew,'N',1)) CountIsNotNew,
COUNT(DECODE(IsNew,'Y',1)) CountIsNew
FROM Agent
GROUP BY TeamId;
Or, if you prefer CASE:
SELECT TeamId,
COUNT(CASE IsNew WHEN 'N' THEN 1 END) CountIsNotNew,
COUNT(CASE IsNew WHEN 'Y' THEN 1 END) CountIsNew
FROM Agent
GROUP BY TeamId;
(note: the "1"s could be any literal value)

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Adding a dummy identifier to data that varies by position and value - sql

Related

Creating a view from two tables, with case and sum on one table

how to sum count sql

sql subquery that collects from 3 rows

Need Help on Oracle Select Query

How do I return count and not count in a SQL query?

Categories

Resources