Trying to combine multiples of a key ID into single row, but with different values in columns - sql

T-SQL - SQL Server
I'm building a report to very specific requirements. I'm trying to combine multiples of a key ID into single rows, but there are different values in some of the columns, so GROUP BY won't work.
SELECT count(tt.Person_ID) as CandCount, tt.Person_ID,
CASE e.EthnicSuperCategoryID WHEN CandCount > 1 THEN 10 ELSE e.EthnicSuperCategoryID END as EthnicSuperCategoryID,
CASE e.Ethnicity_Id WHEN 1 THEN 1 ELSE 0 END as Black ,
CASE e.Ethnicity_Id WHEN 2 THEN 1 ELSE 0 END as White ,
CASE e.Ethnicity_Id WHEN 3 THEN 1 ELSE 0 END as Asian,
etc
FROM T_1 TT
JOINS
WHERE
GROUP
Msg 102, Level 15, State 1, Line 4
Incorrect syntax near '>'.
Here are the results (without the first CASE). Note that person 3 stated multiple ethnicities.
SELECT count(tt.Person_ID) as CandCount, tt.Person_ID,
CASE e.Ethnicity_Id WHEN 1 THEN 1 ELSE 0 END as Black ,
CASE e.Ethnicity_Id WHEN 2 THEN 1 ELSE 0 END as White ,
CASE e.Ethnicity_Id WHEN 3 THEN 1 ELSE 0 END as Asian,
etc
FROM T_1 TT
JOINS
WHERE
GROUP
That’s expected, but the goal would be to assign multiple ethnicities to Ethnicity_Id of 10 (multiple). I also want them grouped on a single line.
So the end result would look like this:
So my issue is twofold. If the candidate has more than one ethnicity, assign the records to an Ethnicity_Id of 10. I also need duplicated person IDs grouped into a single row, while displaying all of the results of the columns.

This should bring your desired result:
SELECT Person_ID
, ISNULL(ID_Dummy,Ethnicity_ID) Ethnicity_ID
, MAX(Black) Black
, MAX(White) White
, MAX(Asian) Asian
FROM #T T
OUTER APPLY(SELECT MAX(10) FROM #T T2
WHERE T2.Person_ID = T.Person_ID
AND T2.Ethnicity_ID <> T.Ethnicity_ID
)EthnicityOverride(ID_Dummy)
GROUP BY Person_ID, ISNULL(ID_Dummy,Ethnicity_ID)

You want conditional aggregation. Your query is incomplete, but the idea is:
select
person_id,
sum(case when ethnicity_id = 1 then 1 else 0 end) as black,
sum(case when ethnicity_id = 2 then 1 else 0 end) as white,
sum(case when ethnicity_id = 3 then 1 else 0 end) as asian
from ...
where ...
group by person_id
You might want max() instead of sum(). Also, I did not get the logic for the second column in the desired results; maybe that's just count(*).
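As a rough illustration of the conditional-aggregation idea applied to the question, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The table name person_ethnicity and the sample rows are made up for the example. It uses max() for the flags and maps anyone with more than one row to 10, which is one reading of the requirement.

```python
import sqlite3

# Hypothetical sample data: person 3 reported two ethnicities (1 and 3).
rows = [(1, 1), (2, 2), (3, 1), (3, 3)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person_ethnicity (person_id INTEGER, ethnicity_id INTEGER)")
conn.executemany("INSERT INTO person_ethnicity VALUES (?, ?)", rows)

query = """
SELECT person_id,
       -- more than one row for a person means "multiple" (10)
       CASE WHEN COUNT(*) > 1 THEN 10 ELSE MAX(ethnicity_id) END AS ethnicity_id,
       MAX(CASE WHEN ethnicity_id = 1 THEN 1 ELSE 0 END) AS black,
       MAX(CASE WHEN ethnicity_id = 2 THEN 1 ELSE 0 END) AS white,
       MAX(CASE WHEN ethnicity_id = 3 THEN 1 ELSE 0 END) AS asian
FROM person_ethnicity
GROUP BY person_id
ORDER BY person_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Each person comes back as a single row, and person 3 keeps both flags set while the id column reads 10.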

This would be my approach
SELECT
person_id,
CASE WHEN flag = 1 THEN Ethnicity_Id ELSE 10 END AS Ethnicity_Id,
[1] as black,
[2] as white,
[3] as asian
FROM
(
SELECT
person_id,
Ethnicity_Id as columns,
1 as n,
MAX(Ethnicity_Id) over(PARTITION BY person_id) as Ethnicity_Id,
COUNT(Ethnicity_Id) over(PARTITION BY person_id) as flag
FROM
#example
) AS SourceTable
PIVOT
(
MAX(n) FOR columns IN ([1], [2], [3])
) AS PivotTable;
Pivot the Ethnicity_Id column into multiple columns, using the constant 1 to make it comply with your expected result.
Use MAX(Ethnicity_Id) with PARTITION BY to get the original Ethnicity_Id.
Use COUNT(Ethnicity_Id) to flag whether Ethnicity_Id needs to be replaced with 10 because there is more than one row for that person_id.
If you need to add more ethnicities, add their ids in ... IN ([1], [2], [3]) ... and in the select list.

Related

SQL or operator how to use with having

I have a table that I am querying with the OR operator. It has two flags, FDEL (1 or 0) and FDVE (1 or 0). What I would like to do is display whether an item has FDVE or has FDEL (along with a count), but it doesn't work: I can see either all FDVE or all FDEL.
select
lpad(purchase_id,10,0) as purchase_id,
sum(has_label_fdel) as FDEL_count,
case when LABELS like '%FDVE%' then 1 else 0 end as HAS_LABEL_FDVE,
sum(has_label_fdve) as FDVE_count
from
"SRC_ORACLEIWP"."PURCHASE_ANALYSIS_RULES"
group by
lpad(purchase_id,10,0),has_label_fdve
having FDVE_count>0 -- FDEL_count>0
You want one result row per product, so group by product only. Use SUM for counting and MAX for the aggregated yes/no.
select
lpad(purchase_id,10,0) as padded_purchase_id,
max(has_label_fdve) as has_labels_fdve,
sum(has_label_fdve) as fdve_count,
max(has_label_fdel) as has_labels_fdel,
sum(has_label_fdel) as fdel_count
from src_oracleiwp.purchase_analysis_rules
group by padded_purchase_id
having has_labels_fdve = 1
or has_labels_fdel = 1
order by padded_purchase_id;
I've changed your alias names slightly, so they are different from the columns you have (because such ambiguities can sometimes lead to problems).
The check on labels like '%FDVE%' is unnecessary, because you already have the has_label_fdve flag, which is always 0 or 1. Or so it seems. If the flags can be null, use COALESCE on them or use the LIKE expressions instead.
If you don't have has_label_fdve and has_label_fdel yet, use the labels column instead:
select
lpad(purchase_id,10,0) as padded_purchase_id,
max(case when labels like '%FDVE%' then 1 else 0 end) as has_labels_fdve,
sum(case when labels like '%FDVE%' then 1 else 0 end) as fdve_count,
max(case when labels like '%FDEL%' then 1 else 0 end) as has_labels_fdel,
sum(case when labels like '%FDEL%' then 1 else 0 end) as fdel_count
from src_oracleiwp.purchase_analysis_rules
group by padded_purchase_id
having has_labels_fdve = 1
or has_labels_fdel = 1
order by padded_purchase_id;
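To see the MAX-for-flag and SUM-for-count pattern in action, here is a small sketch using Python's built-in sqlite3 module. It is only an approximation of the original setup: the sample rows are invented, the lpad() padding is left out because SQLite has no such function, and the HAVING clause repeats the aggregates rather than relying on alias support.

```python
import sqlite3

# Hypothetical rows: (purchase_id, has_label_fdve, has_label_fdel)
rows = [
    (1, 1, 0),
    (1, 1, 0),
    (1, 0, 1),
    (2, 0, 1),
    (3, 0, 0),  # neither label, so this purchase should be filtered out
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchase_analysis_rules "
    "(purchase_id INTEGER, has_label_fdve INTEGER, has_label_fdel INTEGER)"
)
conn.executemany("INSERT INTO purchase_analysis_rules VALUES (?, ?, ?)", rows)

query = """
SELECT purchase_id,
       MAX(has_label_fdve) AS has_labels_fdve,  -- yes/no per purchase
       SUM(has_label_fdve) AS fdve_count,       -- how many rows carry the label
       MAX(has_label_fdel) AS has_labels_fdel,
       SUM(has_label_fdel) AS fdel_count
FROM purchase_analysis_rules
GROUP BY purchase_id
HAVING MAX(has_label_fdve) = 1 OR MAX(has_label_fdel) = 1
ORDER BY purchase_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Because the grouping is on the purchase alone, a purchase with both labels shows up once with both flags set, instead of once per flag value.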

Multiple Word Count in SQL

I have a list of words I need to find in a specific column, "description of what happened", which holds anything up to 500 or more characters. I have the script below, which does work.
However, how do I replace the values 1, 2, 3 in the Name column with the actual word I am looking for, with the total next to it?
I just can't get it to display; it's probably something simple.
select GROUPING_ID ( Amoxicillin ,Atorvastatin ) as Name ,count(*) as Total
from ( select case when [description_of_what_happened] like '%Amoxicillin%'
then 1 else 0 end as Amoxicillin ,
case when [description_of_what_happened] like '%Atorvastatin%'
then 1 else 0 end as Atorvastatin
FROM "NAME OF TABLE"
) t
group by grouping sets (() ,(Amoxicillin),(Atorvastatin))
having coalesce (Amoxicillin,1) != 0 and coalesce (Atorvastatin,1) != 0
order by grouping_id (Amoxicillin,Atorvastatin)
Row 3 is the total; I need rows 1 and 2 to show the name of the product.
The result is as below:
Name Total
1 7
2 9
3 4112
You can use strings instead of flags:
select coalesce(Amoxicillin, Atorvastatin, 'Total') as Name,
count(*) as Total
from (select (case when [description_of_what_happened] like '%Amoxicillin%'
then 'Amoxicillin'
end) as Amoxicillin ,
(case when [description_of_what_happened] like '%Atorvastatin%'
then 'Atorvastatin'
end
) as Atorvastatin
from "NAME OF TABLE"
) t
where Amoxicillin is not null or Atorvastatin is not null
group by grouping sets ((), (Amoxicillin), (Atorvastatin))
order by name;
Note that I also moved the logic in the having to the where.

Create a query that counts instances in a related table

I have re-written this query about 20 times today and I keep getting close but no dice... I'm sure this is easy-peasy for y'all, but my SQL (Oracle) is pretty rusty.
Here's what I need:
PersonID  Count1  Count2  Count3  Count4
1         0       0       2       1
2         1       1       1       0
3         1       1       1       2
Data is coming from several sources. I have a table People, and a table Values. People can have any number of values in that table.
PersonID  Item    Value
1         Check1  3
1         Check2  3
1         Check3  4
2         Check4  2
2         Check5  3
2         Check6  1
.. etc
So the query would, for each PersonID, count how many times the particular Value appears. The values are always 1, 2, 3, or 4. I tried to do 4 subqueries, but it wouldn't read the PersonID from the main query and just returned the count of all instances of value=1.
I was then thinking do a Group_By ... I don't know. Any help is appreciated!
ETA: I've deleted & re-written the query many times in many ways and unfortunately did not save any intermediate attempts. I didn't include it originally because I was in the middle of rearranging it again, and it's not runnable as-is. But here it is as it stands now:
/*sources are the tested requirements
values are the scores people received on the tested sources
people are those who were tested on the requirements */
WITH sub_query4 (
SELECT values.personid,
count (values.ID) as count4 --how many 4s
FROM values
INNER JOIN sources ON values.valueid = sources.sourceid
INNER JOIN people ON people.personid = values.personid
WHERE values.yearid = 2017
AND values.quarter = 'Q1'
AND instr (sources.identifier, 'TESTBANK.01', 1 ,1) > 0
AND values.value = '4'
GROUP_BY people.personid
)
SELECT p.first_name,
p.last_name,
p.position,
p.email,
p.locationid,
sub_query4.count4 as count4 --eventually this would repeat for 1, 2, & 3
FROM people p
WHERE p.locationid=406
AND p.position in (9,10);
values is a bad name for a table because it is a SQL keyword.
In any case, conditional aggregation should work:
select personid,
sum(case when value = 1 then 1 else 0 end) as cnt_1,
sum(case when value = 2 then 1 else 0 end) as cnt_2,
sum(case when value = 3 then 1 else 0 end) as cnt_3,
sum(case when value = 4 then 1 else 0 end) as cnt_4
from values
group by personid;
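As a quick check of the conditional aggregation against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module in place of Oracle. The table is called vals here (an assumed name, chosen to avoid the keyword), and only the six rows shown in the question are loaded.

```python
import sqlite3

# The six sample rows from the question: (personid, item, value)
rows = [
    (1, "Check1", 3),
    (1, "Check2", 3),
    (1, "Check3", 4),
    (2, "Check4", 2),
    (2, "Check5", 3),
    (2, "Check6", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vals (personid INTEGER, item TEXT, value INTEGER)")
conn.executemany("INSERT INTO vals VALUES (?, ?, ?)", rows)

query = """
SELECT personid,
       SUM(CASE WHEN value = 1 THEN 1 ELSE 0 END) AS cnt_1,
       SUM(CASE WHEN value = 2 THEN 1 ELSE 0 END) AS cnt_2,
       SUM(CASE WHEN value = 3 THEN 1 ELSE 0 END) AS cnt_3,
       SUM(CASE WHEN value = 4 THEN 1 ELSE 0 END) AS cnt_4
FROM vals
GROUP BY personid
ORDER BY personid
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The two rows it returns line up with persons 1 and 2 in the desired output from the question.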
I prefer to use PIVOT for this. Here is an example:
SELECT "PersonID", val1,val2,val3,val4 FROM
(
SELECT "PersonID", "Value" from VALS
)
PIVOT
(
count("Value")
FOR "Value" IN (1 as val1, 2 as val2, 3 as val3, 4 as val4)
);

Transpose rows from split_string into columns

I'm stuck trying to transpose a set of rows into a table. In my stored procedure, I take a delimited string as input, and need to transpose it.
SELECT *
FROM string_split('123,4,1,0,0,5|324,2,0,0,0,4','|')
CROSS APPLY string_split(value,',')
From which I receive:
value value
123,4,1,0,0,5 123
123,4,1,0,0,5 4
123,4,1,0,0,5 1
123,4,1,0,0,5 0
123,4,1,0,0,5 0
123,4,1,0,0,5 5
324,2,0,0,0,4 324
324,2,0,0,0,4 2
324,2,0,0,0,4 0
324,2,0,0,0,4 0
324,2,0,0,0,4 0
324,2,0,0,0,4 4
The values delimited by | are client details. And within each client, there are six attributes, delimited by ,. I would like an output table of:
ClientId  ClientTypeId  AttrA  AttrB  AttrC  AttrD
------------------------------------------------
123       4             1      0      0      5
324       2             0      0      0      4
What's the best way to go about this? I've been looking at PIVOT but can't make it work because it seems like I need row numbers, at least.
This answer assumes that the row number function will "follow the order" of the string. If it does not, you will need to write your own split that includes a row number in the resulting table. (This is asked on the official documentation page, but no official answer is given.)
SELECT
MAX(CASE WHEN col = 1 THEN item ELSE null END) as ClientId,
MAX(CASE WHEN col = 2 THEN item ELSE null END) as ClientTypeId,
MAX(CASE WHEN col = 3 THEN item ELSE null END) as AttrA,
MAX(CASE WHEN col = 4 THEN item ELSE null END) as AttrB,
MAX(CASE WHEN col = 5 THEN item ELSE null END) as AttrC,
MAX(CASE WHEN col = 6 THEN item ELSE null END) as AttrD
FROM (
SELECT A.value as org, B.value as item,
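ROW_NUMBER() OVER (PARTITION BY A.value ORDER BY (SELECT NULL)) as col -- SQL Server requires an ORDER BY here; this one does not guarantee string order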
FROM string_split('123,4,1,0,0,5|324,2,0,0,0,4','|') as A
CROSS APPLY string_split(A.value,',') as B
) X
GROUP BY org
You might get a message about nulls in aggregate function ignored. (I always forget which platforms care and which don't.) If you do, you can replace the null with 0.
Note, this is not as fast as using a CTE to find the 5 commas in the string with CHARINDEX and then using SUBSTRING to extract the values. But I'm too lazy to write up that solution, which I would need to test to get all the off-by-one issues right. Still, I suggest you do it that way if you have a big data set.
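If the ordering caveat is a concern, one way to picture "a split that includes a row number" is to do the splitting where position is explicit and only then pivot. The sketch below is an assumption-laden illustration, not the stored-procedure solution: it splits the string in Python, loads (rec, col, item) triples into an in-memory SQLite table named cells (a made-up name), and applies the same MAX(CASE ...) pivot.

```python
import sqlite3

raw = "123,4,1,0,0,5|324,2,0,0,0,4"

# Split outside SQL so every item carries an explicit position:
# rec = which client record, col = position of the item within that record.
cells = [
    (rec_no, col_no, int(item))
    for rec_no, record in enumerate(raw.split("|"), start=1)
    for col_no, item in enumerate(record.split(","), start=1)
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cells (rec INTEGER, col INTEGER, item INTEGER)")
conn.executemany("INSERT INTO cells VALUES (?, ?, ?)", cells)

query = """
SELECT MAX(CASE WHEN col = 1 THEN item END) AS ClientId,
       MAX(CASE WHEN col = 2 THEN item END) AS ClientTypeId,
       MAX(CASE WHEN col = 3 THEN item END) AS AttrA,
       MAX(CASE WHEN col = 4 THEN item END) AS AttrB,
       MAX(CASE WHEN col = 5 THEN item END) AS AttrC,
       MAX(CASE WHEN col = 6 THEN item END) AS AttrD
FROM cells
GROUP BY rec
ORDER BY rec
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Since col comes from the split itself rather than from a window function, the pivot no longer depends on how the database happens to order the split output.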
I know you already got it pretty much answered, but here is a PIVOT solution:
select [ClientID],[ClientTypeId],[AttrA],[AttrB],[AttrC],[AttrD]
FROM
(
    select case when ColumnRow = 1 then 'ClientID'
                when ColumnRow = 2 then 'ClientTypeId'
                when ColumnRow = 3 then 'AttrA'
                when ColumnRow = 4 then 'AttrB'
                when ColumnRow = 5 then 'AttrC'
                when ColumnRow = 6 then 'AttrD'
                else null
           end as ColumnRow,
           t.value,
           ColumnID
    from (
        select ColumnID,
               z.value as stringsplit,
               b.value,
               cast(Row_number() over(partition by z.value order by z.value) as varchar(50)) as ColumnRow
        from (
            SELECT cast(Row_number() over(order by a.value) as varchar(50)) as ColumnID,
                   a.value
            FROM string_split('123,4,1,0,0,5|324,2,0,0,0,4','|') a
        ) z
        CROSS APPLY string_split(value,',') b
    ) t
) AS SOURCETABLE
PIVOT
(
    MAX(value)
    FOR ColumnRow IN ([ClientID],[ClientTypeId],[AttrA],[AttrB],[AttrC],[AttrD])
) AS PivotTable

SQL Query for sorting by number of identical values in column

I am attempting to write an SQL query for PostgreSQL that looks through a table of descriptions for a subject and another table that keeps a tally of how many likes, dislikes, or other flags are attached to each description. I want it to go through the tally table, find all the flags attached to each description, and count how many identical flags each description has. It should then return a list of all descriptions, together with the number of likes, dislikes, etc., ordered by the number of likes minus the number of dislikes etc. This is an example of the code I have so far (there are more variables in the likes/dislikes section):
SELECT likes, dislikes, positive - negative AS orderCondition
FROM( SELECT d.id, d.l_id, d.user_id, d.description, a.flaggee_id,
SUM( CASE WHEN a.actions_id = 1 THEN 1 WHEN actions_id = 6 THEN 1 ELSE 0 END ) AS positive,
SUM( CASE WHEN a.actions_id <> 1 THEN 1 WHEN a.actions_id <> 6 THEN 1 ELSE 0 END ) AS negative,
SUM( CASE WHEN a.actions_id = 1 THEN 1 ELSE 0 END ) AS likes,
SUM( CASE WHEN a.actions_id = 2 THEN 1 ELSE 0 END ) AS dislikes
FROM descriptions d, description_actions a
WHERE d.id = a.flaggee_id OR d.id > 0 AND d.id <> a.flaggee_id
GROUP BY d.id, a.flaggee_id ) as result
ORDER BY orderCondition DESC;
This is not working, however: it returns an empty set without errors. The data in the tables is random test data; ids are integers, and everything else is random strings. Querying the tables individually returns accurate results, so it's not a case of the data not being in the tables. I'm having a really difficult time figuring it out. Any help would be appreciated.
I figured it out: I needed to specify a left join in order to still display results if the flags table was empty. I also added variables to the outer select. Here is the working query:
SELECT description, likes, dislikes, positive - negative AS orderCondition
FROM( SELECT d.id, d.l_id, d.user_id, d.description AS description, a.flaggee_id,
SUM( CASE WHEN a.actions_id IN (1, 6) THEN 1 ELSE 0 END ) AS positive, -- actions 1 and 6 count as positive
SUM( CASE WHEN a.actions_id NOT IN (1, 6) THEN 1 ELSE 0 END ) AS negative, -- every other action counts as negative
SUM( CASE WHEN a.actions_id = 1 THEN 1 ELSE 0 END ) AS likes,
SUM( CASE WHEN a.actions_id = 2 THEN 1 ELSE 0 END ) AS dislikes
FROM descriptions d LEFT JOIN description_actions a ON a.flaggee_id = d.id
GROUP BY d.id, a.flaggee_id ) as result
ORDER BY orderCondition DESC;
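To make the effect of the left join concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than PostgreSQL. The tables are cut down to the columns that matter, the sample rows are invented, and the ordering expression is simplified to likes minus dislikes; treat it as an illustration of the idea, not the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE descriptions (id INTEGER PRIMARY KEY, description TEXT)")
conn.execute("CREATE TABLE description_actions (flaggee_id INTEGER, actions_id INTEGER)")

# Hypothetical data: description 3 has no actions at all.
conn.executemany(
    "INSERT INTO descriptions VALUES (?, ?)",
    [(1, "first"), (2, "second"), (3, "third")],
)
conn.executemany(
    "INSERT INTO description_actions VALUES (?, ?)",
    [(1, 1), (1, 1), (1, 2), (2, 2), (2, 2)],  # 1 = like, 2 = dislike
)

query = """
SELECT d.description,
       SUM(CASE WHEN a.actions_id = 1 THEN 1 ELSE 0 END) AS likes,
       SUM(CASE WHEN a.actions_id = 2 THEN 1 ELSE 0 END) AS dislikes,
       SUM(CASE WHEN a.actions_id = 1 THEN 1 ELSE 0 END)
         - SUM(CASE WHEN a.actions_id = 2 THEN 1 ELSE 0 END) AS score
FROM descriptions d
LEFT JOIN description_actions a ON a.flaggee_id = d.id
GROUP BY d.id, d.description
ORDER BY score DESC, d.id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The description without any actions still appears, with zero counts, because the left join keeps it and the CASE expressions turn the NULL actions_id into 0.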