SQL : Create columns from the distinct values of a column [duplicate] - sql

I have a dataset [lipid] that was extracted from an electronic medical record system (EMRS). In that EMRS, the physician places an order for a laboratory blood profile for a patient. Each order has a unique order number but can cover several service types. So, if one order has 4 service types, the EMRS records the event on 4 rows (the same order number repeated in the Order_no column, but a different service type in the Service_type column), like this:
Order_no  Service_type  Result
1         TC            230
1         HDL           40
1         TG            150
1         LDL           90
Sometimes an order has fewer than 4 service types, so the data looks like this:
Order_no  Service_type  Result
1         TC            230
1         HDL           40
1         TG            150
1         LDL           90
2         TC            230
2         HDL           40
4         TC            230
4         HDL           40
4         LDL           90
5         TC            230
5         TG            150
5         LDL           90
6         TC            230
8         TC            230
8         HDL           40
8         TG            150
8         LDL           90
What I'm trying to do is write a query that keeps the Order_no column, pivots the table so that each service type becomes a column, and merges the rows that share an order number into one row, like this:
Order_no  TC   HDL  TG   LDL
1         230  40   150  90
2         250  66
4         199  39        99
5         299       45   190
6         400
8         400  40   250  290
How can I write this query in Google BigQuery?

Use the approach below:
select * from your_table
pivot (any_value(Result) for Service_type in ('TC', 'HDL', 'TG', 'LDL'))
If the service types are not known in advance, you can build the query dynamically:
execute immediate (select '''
select * from your_table
pivot (any_value(Result) for Service_type in (''' || string_agg(distinct "'" || Service_type || "'") ||
"))"
from your_table
)
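If you want to check the pivot logic without a BigQuery project, here is a minimal sketch that emulates it with conditional aggregation in SQLite through Python's sqlite3 module. It is a stand-in under stated assumptions, not BigQuery's PIVOT: the table name `lipid`, the helper `pivot_lipid`, and the sample rows are hypothetical, and `MAX(CASE ...)` plays the role of `any_value`. Like the dynamic variant above, it reads the distinct service types first and then builds one column per type.

```python
import sqlite3

# Hypothetical sample rows (Order_no, Service_type, Result); not the asker's real data.
ROWS = [
    (1, "TC", 230), (1, "HDL", 40), (1, "TG", 150), (1, "LDL", 90),
    (2, "TC", 250), (2, "HDL", 66),
    (6, "TC", 400),
]


def pivot_lipid(rows):
    """Return (header, rows) with one row per Order_no and one column per service type."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE lipid (Order_no INTEGER, Service_type TEXT, Result INTEGER)")
    con.executemany("INSERT INTO lipid VALUES (?, ?, ?)", rows)
    # Discover the service types first, as the dynamic query does with string_agg.
    services = [r[0] for r in con.execute(
        "SELECT DISTINCT Service_type FROM lipid ORDER BY Service_type")]
    # One MAX(CASE ...) per service type. Values are bound as parameters;
    # the column aliases are double-quoted identifiers.
    cols = ", ".join(
        'MAX(CASE WHEN Service_type = ? THEN Result END) AS "{}"'.format(s.replace('"', '""'))
        for s in services
    )
    sql = "SELECT Order_no, {} FROM lipid GROUP BY Order_no ORDER BY Order_no".format(cols)
    cur = con.execute(sql, services)
    header = [d[0] for d in cur.description]
    result = cur.fetchall()
    con.close()
    return header, result
```

Orders that lack a service type get NULL (None in Python) in that column, which matches the blank cells in the desired output.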

You can use PIVOT.
Example:
WITH your_table AS
(
SELECT 1 AS Order_no, 'TC' AS Service_type, 230 AS Result
UNION ALL
SELECT 1, 'HDL', 40
UNION ALL
SELECT 1, 'TG', 150
UNION ALL
SELECT 1, 'LDL', 90
)
SELECT *
FROM your_table PIVOT(SUM(Result) FOR Service_type IN ('TC', 'HDL', 'TG', 'LDL'))

Related

SQL, label user based on the similarity

Is the case below possible in SQL?
Let's say I have a table like this:
user_id  product_id
1        123
1        122
1        121
2        124
2        125
2        121
3        123
3        122
3        122
4        123
4        212
4        222
5        124
5        125
5        121
I want to give users the same label if they have the same set of product_id values, regardless of the order, so the output looks like this:
user_id  product_id  label
1        123         a
1        122         a
1        121         a
2        124         b
2        125         b
2        121         b
3        123         a
3        121         a
3        122         a
4        123         c
4        212         c
4        222         c
5        124         b
5        125         b
5        121         b
Please advise
You can use the string_agg function to get the list of product_ids for each user (as a single string), then use the dense_rank function on that string to get unique labels for each product_ids list.
select T.user_id, T.product_id, D.label
from table_name T join
(
  select user_id,
         chr(dense_rank() over (order by user_products) + 96) label
  from
  (
    select user_id,
           string_agg(cast(product_id as string), ',' order by product_id) user_products
    from table_name
    group by user_id
  ) lbl
) D
on T.user_id = D.user_id
order by T.user_id
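To see why this works, here is the same idea as a small plain-Python sketch: build one canonical key per user, dense-rank the distinct keys, and turn each rank into a letter. The function name `label_users` and the sample pairs are hypothetical. Unlike the string_agg query, this sketch uses a sorted tuple of distinct products as the key, so a repeated product_id does not change a user's label; that is a design choice of the sketch, not something the query does.

```python
# Hypothetical (user_id, product_id) pairs modelled on the question's expected output.
PAIRS = [
    (1, 123), (1, 122), (1, 121),
    (2, 124), (2, 125), (2, 121),
    (3, 123), (3, 121), (3, 122),
    (4, 123), (4, 212), (4, 222),
    (5, 124), (5, 125), (5, 121),
]


def label_users(pairs):
    """Map each user_id to a letter shared by all users with the same product set."""
    products = {}
    for user_id, product_id in pairs:
        products.setdefault(user_id, set()).add(product_id)
    # Canonical key: sorted tuple of distinct products, so row order does not matter.
    keys = {user: tuple(sorted(p)) for user, p in products.items()}
    # Dense rank over the distinct keys: rank 1 -> 'a', rank 2 -> 'b', ...
    rank = {key: i for i, key in enumerate(sorted(set(keys.values())), start=1)}
    return {user: chr(96 + rank[key]) for user, key in keys.items()}
```

Note that the `+ 96` trick, here and in the query, only produces letters for the first 26 distinct product sets.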

Keep duplicate rows in Google BigQuery

I have a dataset [lipid] that was extracted from an electronic medical record system (EMRS). In that EMRS, the physician places an order for a laboratory blood profile for a patient. Each order has a unique order number but can cover several service types. So, if one order has 4 service types, the EMRS records the event on 4 rows (the same order number repeated in the Order_no column, but a different service type in the Service_type column), like this:
Order_no  Service_type
1         TC
1         HDL
1         TG
1         LDL
Sometimes an order has fewer than 4 service types, so the data looks like this:
Order_no  Service_type
1         TC
1         HDL
1         TG
1         LDL
2         TC
2         HDL
4         TC
4         HDL
4         LDL
5         TC
5         TG
5         LDL
6         TC
8         TC
8         HDL
8         TG
8         LDL
What I'm trying to do is write a query that keeps only the orders that have four rows with the same Order_no but different Service_type values, like this:
Order_no  Service_type
1         TC
1         HDL
1         TG
1         LDL
8         TC
8         HDL
8         TG
8         LDL
How can I write this query in Google BigQuery?
Use the simple approach below:
select * from your_table
qualify count(*) over(partition by Order_no) > 3
Applied to the sample data in your question, this returns the rows for orders 1 and 8.
If you need to count only distinct services, use this instead:
select * from your_table
qualify count(distinct Service_type) over(partition by Order_no) > 3
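For engines without QUALIFY, the same filter can be written with a grouped subquery. The sketch below runs that variant in SQLite through Python's sqlite3 module so it can be checked locally; it is an emulation under assumptions, not BigQuery code. The helper name `complete_orders`, its `min_services` parameter, and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample rows (Order_no, Service_type) modelled on the question.
ORDERS = [
    (1, "TC"), (1, "HDL"), (1, "TG"), (1, "LDL"),
    (2, "TC"), (2, "HDL"),
    (4, "TC"), (4, "HDL"), (4, "LDL"),
    (5, "TC"), (5, "TG"), (5, "LDL"),
    (6, "TC"),
    (8, "TC"), (8, "HDL"), (8, "TG"), (8, "LDL"),
]


def complete_orders(rows, min_services=4):
    """Keep every row of the orders that have at least min_services distinct service types."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE your_table (Order_no INTEGER, Service_type TEXT)")
    con.executemany("INSERT INTO your_table VALUES (?, ?)", rows)
    kept = con.execute(
        """
        SELECT Order_no, Service_type
        FROM your_table
        WHERE Order_no IN (
            SELECT Order_no
            FROM your_table
            GROUP BY Order_no
            HAVING COUNT(DISTINCT Service_type) >= ?
        )
        ORDER BY Order_no, rowid  -- rowid keeps the insertion order within an order
        """,
        (min_services,),
    ).fetchall()
    con.close()
    return kept
```

Because the subquery counts distinct service types, an order with the same service repeated four times is not kept.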

Total Aggregate Sum of Each Category Retrieved in the Group By Clause

I have a table like this, in which I have used GROUP BY with the sum() function to calculate the total participants for each training type held in different districts. Now I want a total for each training type category. For example, for CMST I want the totals of male, female, and total participants to appear after the CMST rows, and likewise for all the other categories, something like this:
I am displaying this table in my view and working with MVC Entity Framework. Is it possible to do this with a SQL query, or would it be more appropriate to do it in code? Please suggest the best way to achieve it.
The query I am using is below:
select [District_Name] as DISTRICT_NAME,
[Training_Type],
sum(Male_Participants) as Male_Participants,
sum([Female_Participants]) as Female_Participants,
sum([Total_Participants]) as Total_Participants
from [TrainingsData]
where
[Training_Type] = 'CMST'
or
[Training_Type] = 'LMST'
or
[Training_Type] = 'Community Awareness Training (CAT)'
or
[Training_Type] = 'Exposure Visit'
or
[Training_Type] = 'Literacy & Numeracy'
or
[Training_Type] = 'Orientation Training Workshop (OTW)'
or
[Training_Type] = 'TVET'
group by [District_Name],[Training_Type]
You can get the desired output with the following script, but your report needs a small adjustment: wherever District_Name = 'ZZZZZ', replace both district_name and training_type with '' before display.
WITH Tab1(district_name,training_type,male_participants,female_participants,total_participants)
AS
(
SELECT 'Jhal MAgsi','CMST',10,20,30 UNION ALL
SELECT 'Khuzdar','CMST',5,5,10 UNION ALL
SELECT 'Killa Abdullah','CMST',15,15,30 UNION ALL
SELECT 'Jhal MAgsi','CAT',1,2,3 UNION ALL
SELECT 'Khuzdar','CAT',14,20,34 UNION ALL
SELECT 'Loralai','CAT',100,250,350 UNION ALL
SELECT 'Pishin','CAT',1,1,2 UNION ALL
SELECT 'Jhal MAgsi','LN',3,3,6 UNION ALL
SELECT 'Khuzdar','LN',9,100,109 UNION ALL
SELECT 'Loralai','LN',200,50,250 UNION ALL
SELECT 'Jhal MAgsi','LMST',5,8,13 UNION ALL
SELECT 'Khuzdar','LMST',9,5,14
)
SELECT district_name,training_type,male_participants,female_participants,total_participants
FROM Tab1 T1
UNION ALL
SELECT 'ZZZZZ' district_name,
training_type,
SUM(T1.male_participants) male_participants,
SUM(T1.female_participants) female_participants,
SUM(T1.total_participants) total_participants
FROM tab1 T1
GROUP BY training_type
ORDER BY 2,1
The output is:
district_name   training_type  male_participants  female_participants  total_participants
Jhal MAgsi      CAT            1                  2                    3
Khuzdar         CAT            14                 20                   34
Loralai         CAT            100                250                  350
Pishin          CAT            1                  1                    2
ZZZZZ           CAT            116                273                  389
Jhal MAgsi      CMST           10                 20                   30
Khuzdar         CMST           5                  5                    10
Killa Abdullah  CMST           15                 15                   30
ZZZZZ           CMST           30                 40                   70
Jhal MAgsi      LMST           5                  8                    13
Khuzdar         LMST           9                  5                    14
ZZZZZ           LMST           14                 13                   27
Jhal MAgsi      LN             3                  3                    6
Khuzdar         LN             9                  100                  109
Loralai         LN             200                50                   250
ZZZZZ           LN             212                153                  365
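The 'ZZZZZ' placeholder exists only so that the subtotal rows sort last within each training type. A possible variation, sketched below, carries an explicit sort flag instead, so the subtotal row can use a readable label straight away. The sketch runs in SQLite through Python's sqlite3 module so it can be checked locally; the table name `trainings`, the shortened column names, the `is_total` flag, and the 'Total' label are assumptions made for illustration, and the rows are a subset of the sample data above.

```python
import sqlite3

# Subset of the sample data above: (district, training type, male, female, total).
TRAININGS = [
    ("Jhal MAgsi", "CMST", 10, 20, 30),
    ("Khuzdar", "CMST", 5, 5, 10),
    ("Killa Abdullah", "CMST", 15, 15, 30),
    ("Jhal MAgsi", "LMST", 5, 8, 13),
    ("Khuzdar", "LMST", 9, 5, 14),
]


def with_subtotals(rows):
    """Return the detail rows followed by one 'Total' row per training type."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE trainings (district_name TEXT, training_type TEXT, "
        "male INTEGER, female INTEGER, total INTEGER)"
    )
    con.executemany("INSERT INTO trainings VALUES (?, ?, ?, ?, ?)", rows)
    out = con.execute(
        """
        SELECT district_name, training_type, male, female, total
        FROM (
            SELECT district_name, training_type, male, female, total, 0 AS is_total
            FROM trainings
            UNION ALL
            SELECT 'Total', training_type, SUM(male), SUM(female), SUM(total), 1
            FROM trainings
            GROUP BY training_type
        ) AS combined
        ORDER BY training_type, is_total, district_name  -- flag puts subtotals last
        """
    ).fetchall()
    con.close()
    return out
```

With the flag doing the sorting, no district name has to be assumed to sort before the placeholder.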

Count distinct values of a Column based on Distinct values of First Column

I am dealing with a huge volume of traffic data. I want to identify the vehicles which have changed lanes. I'm using Microsoft Access with VB.Net.
Traffic Data:
Vehicle_ID Lane_ID Frame_ID Distance
1 2 12 100
1 2 13 103
1 2 14 105
2 1 16 130
2 1 17 135
2 2 18 136
3 1 19 140
3 2 20 141
I have tried to select the distinct Vehicle_ID values and then count(distinct Lane_ID).
I can list the distinct Vehicle_ID values, but the query counts all Lane_ID rows instead of the distinct Lane_ID values.
SELECT
Distinct Vehicle_ID, count(Lane_ID)
FROM Table1
GROUP BY Vehicle_ID
Shown Result:
Vehicle_ID Lane Count
1 3
2 3
3 2
Correct Result:
Vehicle_ID Lane Count
1 1
2 2
3 2
Further to that, I would like to get all the Vehicle_IDs that have changed lane (all data, including the previous lane and the new lane). The output would be something like:
Vehicle_ID Lane_ID Frame_ID Distance
2 1 17 135
2 2 18 136
3 1 19 140
3 2 20 141
Access does not support COUNT(DISTINCT columnname) so do this:
SELECT t.Vehicle_ID, COUNT(t.Lane_ID) AS [Lane Count]
FROM (
SELECT DISTINCT Vehicle_ID, Lane_ID FROM Table1
) AS t
GROUP BY t.Vehicle_ID
So, to identify the vehicles which have changed their lanes, you need to add this to the above query:
HAVING COUNT(t.Lane_ID) > 1
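The derived-table pattern above is portable, so it can be tried outside Access. Here is a minimal sketch that runs the same query, with the HAVING clause added, in SQLite through Python's sqlite3 module. It is only a local check of the logic, not Access or VB.Net code; the helper name `lane_changers` is hypothetical and the rows are the sample traffic data from the question.

```python
import sqlite3

# Sample traffic data from the question: (Vehicle_ID, Lane_ID, Frame_ID, Distance).
TRAFFIC = [
    (1, 2, 12, 100), (1, 2, 13, 103), (1, 2, 14, 105),
    (2, 1, 16, 130), (2, 1, 17, 135), (2, 2, 18, 136),
    (3, 1, 19, 140), (3, 2, 20, 141),
]


def lane_changers(rows):
    """Return (Vehicle_ID, distinct lane count) for vehicles seen in more than one lane."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE Table1 (Vehicle_ID INTEGER, Lane_ID INTEGER, "
        "Frame_ID INTEGER, Distance INTEGER)"
    )
    con.executemany("INSERT INTO Table1 VALUES (?, ?, ?, ?)", rows)
    counts = con.execute(
        """
        SELECT t.Vehicle_ID, COUNT(t.Lane_ID) AS lane_count
        FROM (
            SELECT DISTINCT Vehicle_ID, Lane_ID FROM Table1  -- one row per vehicle and lane
        ) AS t
        GROUP BY t.Vehicle_ID
        HAVING COUNT(t.Lane_ID) > 1
        ORDER BY t.Vehicle_ID
        """
    ).fetchall()
    con.close()
    return counts
```

Vehicle 1 stays in lane 2 throughout, so it drops out; vehicles 2 and 3 each appear in two lanes.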
SELECT
Table1.Vehicle_ID,
cTable1.LANE_COUNT
FROM Table1
INNER JOIN (
SELECT Vehicle_ID, COUNT(*) AS LANE_COUNT FROM (
SELECT DISTINCT Vehicle_ID, Lane_ID FROM Table1
) AS dTable1
GROUP BY Vehicle_ID
) AS cTable1 ON cTable1.Vehicle_ID = Table1.Vehicle_ID
I think you should do it one step at a time:
Select the distinct combinations of vehicle id and lane id (dTable1).
Count the distinct combinations per vehicle (cTable1).
Join the result back to the original table.
If you want vehicles that have changed their lanes, then you can do:
SELECT Vehicle_ID,
IIF(MIN(Lane_ID) = MAX(Lane_ID), 0, 1) as change_lane_flag
FROM Table1
GROUP BY Vehicle_ID;
I think this is as good as counting the number of distinct lanes, because neither approach counts actual "lane changes". For example, a distinct-lane count returns 2 for the data below, even though the vehicle changes lanes multiple times:
2 1 16 130
2 1 17 135
2 2 18 136
2 1 16 140
2 1 17 145
2 2 18 146
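If actual lane changes are what you need, the rows have to be compared in sequence. The sketch below does that in plain Python as an illustration of the idea, not as Access code. It assumes that Distance increases along each vehicle's trajectory and can therefore be used for ordering (Frame_ID repeats in the rows above, so it cannot); the function name `count_lane_changes` is hypothetical.

```python
from itertools import groupby

# The six rows above: (Vehicle_ID, Lane_ID, Frame_ID, Distance).
SAMPLE = [
    (2, 1, 16, 130), (2, 1, 17, 135), (2, 2, 18, 136),
    (2, 1, 16, 140), (2, 1, 17, 145), (2, 2, 18, 146),
]


def count_lane_changes(rows):
    """Return {Vehicle_ID: number of transitions between different lanes}."""
    changes = {}
    # Sort by vehicle, then by Distance (assumed to grow along the trajectory).
    ordered = sorted(rows, key=lambda r: (r[0], r[3]))
    for vehicle, group in groupby(ordered, key=lambda r: r[0]):
        lanes = [r[1] for r in group]
        # A change is any pair of consecutive observations in different lanes.
        changes[vehicle] = sum(1 for prev, cur in zip(lanes, lanes[1:]) if prev != cur)
    return changes
```

For the rows above the lane sequence is 1, 1, 2, 1, 1, 2, which gives three changes, while the distinct-lane count stays at 2.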

Complex partition sum in PostgreSQL

I have tables as follows:
A: deliveries
deliveryid  clientid  deliverydate
1           10        2015-01-01
2           10        2015-02-02
3           11        2015-04-08
B: items in deliveries
itemid  deliveryid  qty  status
70      1           5    1
70      1           8    2
70      2           10   1
72      1           12   1
70      3           100  1
I need to add a column to my query that gives me the qty of each item in the other deliveries of the same client.
That means that for the given data, with client 10 and delivery id 1, I need to show:
itemid qty status qtyOther
70 5 1 10 //itemid 70 exists in delivery 2
70 8 2 10 //itemid 70 exists in delivery 2
72 12 1 0 //itemid 72 doesn't exist in any other delivery of client 10
Since I need to add qtyOther to my existing query, I'm trying to avoid GROUP BY: it's a huge query, and if I use SUM in the SELECT I will have to group by every other column in the SELECT.
This is what I have so far:
Select ....., coalesce( SUM(a.qty) OVER (PARTITION BY a.itemid) ,0) AS qtyOther
FROM B b
LEFT JOIN A a USING
LEFT JOIN (other tables)
WHERE clientid=10 ....
This query gives me the total qty per itemid for the specific clientid, regardless of which delivery it is. How do I change it so that it takes the deliveryid into account? I need something like:
coalesce( SUM(a.qty) OVER (PARTITION BY a.itemid) FROM B where deliveryid<>b.deliveryid ,0) AS qtyOther
Any suggestions how to do that?
Note: I can NOT change the condition in WHERE.
I think you just want to subtract out the total for the current delivery:
Select .....,
(coalesce( SUM(a.qty) OVER (PARTITION BY a.itemid), 0) -
coalesce( SUM(a.qty) OVER (PARTITION BY a.itemid, a.deliveryid), 0)
) as qtyOther
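To check the subtraction against the sample data, here is a minimal sketch that runs the two window sums in SQLite (3.25 or later, for window functions) through Python's sqlite3 module. It is a local stand-in for PostgreSQL, not the asker's real query: the table names `deliveries` and `items`, the helper `qty_other`, and the join are assumptions, and qty and itemid are read from the items table, which is where the sample data puts them.

```python
import sqlite3

# Sample data from the question.
DELIVERIES = [(1, 10, "2015-01-01"), (2, 10, "2015-02-02"), (3, 11, "2015-04-08")]
ITEMS = [(70, 1, 5, 1), (70, 1, 8, 2), (70, 2, 10, 1), (72, 1, 12, 1), (70, 3, 100, 1)]


def qty_other(client_id):
    """Return (itemid, deliveryid, qty, status, qtyOther) rows for one client."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE deliveries (deliveryid INTEGER, clientid INTEGER, deliverydate TEXT)")
    con.execute("CREATE TABLE items (itemid INTEGER, deliveryid INTEGER, qty INTEGER, status INTEGER)")
    con.executemany("INSERT INTO deliveries VALUES (?, ?, ?)", DELIVERIES)
    con.executemany("INSERT INTO items VALUES (?, ?, ?, ?)", ITEMS)
    rows = con.execute(
        """
        SELECT i.itemid, i.deliveryid, i.qty, i.status,
               -- total for the item across the client's deliveries ...
               SUM(i.qty) OVER (PARTITION BY i.itemid)
               -- ... minus the total for the item in the current delivery
               - SUM(i.qty) OVER (PARTITION BY i.itemid, i.deliveryid) AS qtyOther
        FROM items AS i
        JOIN deliveries AS d ON d.deliveryid = i.deliveryid
        WHERE d.clientid = ?
        ORDER BY i.deliveryid, i.itemid, i.status
        """,
        (client_id,),
    ).fetchall()
    con.close()
    return rows
```

For client 10, item 70 totals 23 across both deliveries and 13 in delivery 1, so qtyOther is 10 on both delivery 1 rows, and item 72 gets 0. The WHERE condition stays unchanged, because both window sums are computed over the rows that remain after filtering.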