I want to create a rank column from an existing rank column and a binary column. Suppose, for example, a table with ID, RISK, CONTACT, DATE. The existing rank is RISK, say 1, 2, 3, NULL, with 3 being the highest. The binary-valued column is CONTACT, with 0/1 or FAILURE/SUCCESS. I want to create a new RANK that orders by RISK only once an ID has reached a minimum number of successful contacts.
For example, suppose the constraint is a minimum of 2 successful contacts. Then the rank should be created as follows in the two instances below:
Instance 1. Three ID, all have a min of two successful contacts. In that case the rank mirrors the risk:
ID risk contact date rank
1 3 S 1 3
1 3 S 2 3
1 3 F 3 3
1 3 F 4 3
2 2 S 1 2
2 2 S 2 2
2 2 F 3 2
2 2 F 4 2
3 1 S 1 1
3 1 S 2 1
3 1 S 3 1
Instance 2. Suppose ID=1 has only one successful contact. In that case it is relegated to the lowest rank, rank=1, while ID=2 gets the highest value, rank=3, and ID=3 maps to rank=2 because it satisfies the constraint but has a lower risk value than ID=2:
ID risk contact date rank
1 3 S 1 1
1 3 F 2 1
1 3 F 3 1
1 3 F 4 1
2 2 S 1 3
2 2 S 2 3
2 2 F 3 3
2 2 F 4 3
3 1 S 1 2
3 1 S 2 2
3 1 S 3 2
This is SQL, specifically Hive. Thanks in advance.
Edit - I think Gordon Linoff's code does it correctly. In the end, I used three interim tables. The code looks like this:
First,
--numerize risk, contact
select A.* ,
case when A.risk = 'H' then 3
when A.risk = 'M' then 2
when A.risk = 'L' then 1
when A.risk is NULL then NULL
when A.risk = 'NULL' then NULL
else -999 end as RISK_RANK,
case when A.contact = 'Successful' then 1
else NULL end as success
from T as A
Second,
-- sum_successes_by_risk
select A.* ,
B.sum_successes_by_risk
from T as A
inner join
(select A.person, A.program, A.risk, sum(a.success) as sum_successes_by_risk
from T as A
group by A.person, A.program, A.risk
) as B
on A.program = B.program
and A.person = B.person
and A.risk = B.risk
Third,
--Create table that contains only max risk category
select A.* ,
B.max_risk_rank
from T as A
inner join
(select A.person, max(A.risk_rank) as max_risk_rank
from T as A
group by A.person
) as B
on A.person = B.person
and A.risk_rank = B.max_risk_rank
This is hard to follow, but I think you just want window functions:
select t.*,
(case when sum(case when contact = 'S' then 1 else 0 end) over (partition by id) >= 2
then risk
else 1
end) as new_risk
from t;
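The query above falls back to 1 for IDs under the threshold but otherwise returns the raw risk. If the IDs that pass should also be re-ranked among themselves, as in Instance 2, one possible sketch is to feed the same windowed success count into dense_rank(). This is only an illustration run through Python's sqlite3 module, not Hive: the table name t, the helper rank_rows and the min_success parameter are assumptions of mine, risk is assumed to be a positive integer, and NULL risk is not handled.

```python
import sqlite3

# Instance 2 from the question: (id, risk, contact, date)
INSTANCE_2 = [
    (1, 3, "S", 1), (1, 3, "F", 2), (1, 3, "F", 3), (1, 3, "F", 4),
    (2, 2, "S", 1), (2, 2, "S", 2), (2, 2, "F", 3), (2, 2, "F", 4),
    (3, 1, "S", 1), (3, 1, "S", 2), (3, 1, "S", 3),
]

RANK_SQL = """
select id, risk, contact, date,
       -- ids below the threshold all get sort key 0, so they tie at rank 1
       dense_rank() over (
           order by case when successes >= :min_success then risk else 0 end
       ) as new_rank
from (select t.*,
             sum(case when contact = 'S' then 1 else 0 end)
                 over (partition by id) as successes
      from t) as s
order by id, date
"""


def rank_rows(rows, min_success=2):
    """Load rows into an in-memory table and return them with new_rank appended."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table t (id integer, risk integer, contact text, date integer)"
    )
    conn.executemany("insert into t values (?, ?, ?, ?)", rows)
    result = conn.execute(RANK_SQL, {"min_success": min_success}).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in rank_rows(INSTANCE_2):
        print(row)
```

Because dense_rank() numbers the distinct sort keys consecutively, the result only mirrors RISK (as in Instance 1) when the qualifying risk values are themselves consecutive starting at 1.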
Related
I have two tables. Data in the first table is:
ID Username
1 Dan
2 Eli
3 Sean
4 John
Second Table Data:
user_id Status_id
1 2
1 3
4 1
3 2
2 3
1 1
3 3
3 3
3 3
. .
goes on goes on
These are both of my tables.
I want to find the frequency of individual users doing 'status_id'
My expected result is:
username status_id(1) status_id(2) status_id(3)
Dan 1 1 1
Eli 0 0 1
Sean 0 1 2
John 1 0 0
My current code is:
SELECT b.username , COUNT(a.status_id)
FROM masterdb.auth_user b
left outer join masterdb.xmlform_joblist a
on a.user1_id = b.id
GROUP BY b.username, b.id, a.status_id
This gives me the separate count but in a single row without mentioning which status_id each column represents
This is called pivot and it works in two steps:
extracts the data for the specific field using a CASE statement
aggregates the data on users, to make every field value lie on the same record for each user
SELECT Username,
SUM(CASE WHEN status_id = 1 THEN 1 ELSE 0 END) AS status_id_1,
SUM(CASE WHEN status_id = 2 THEN 1 ELSE 0 END) AS status_id_2,
SUM(CASE WHEN status_id = 3 THEN 1 ELSE 0 END) AS status_id_3
FROM t2
INNER JOIN t1
ON t2.user_id = t1.ID
GROUP BY Username
ORDER BY Username
Note: This solution assumes that there are exactly 3 status_id values. If you need to generalize over the number of status ids, you would require a dynamic query. In any case, it's better to avoid dynamic queries if you can.
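For the dynamic case mentioned in the note, a rough sketch is to read the distinct status ids first and generate one SUM(CASE ...) column per id. The example below uses Python's sqlite3 module; the table names users and user_status and the helper names are hypothetical, and it assumes status ids are non-negative integers (int() is applied before they are placed in the SQL text). It uses only the rows listed in the question, so Sean's count for status 3 comes out as 3.

```python
import sqlite3


def build_pivot_sql(status_ids):
    """Return a pivot query with one SUM(CASE ...) column per status id."""
    cols = ",\n       ".join(
        f"sum(case when s.status_id = {int(sid)} then 1 else 0 end) "
        f"as status_id_{int(sid)}"
        for sid in status_ids
    )
    return (
        f"select u.username,\n       {cols}\n"
        "from users u left join user_status s on s.user_id = u.id\n"
        "group by u.id, u.username\n"
        "order by u.id"
    )


def pivot_status_counts(conn):
    """Discover the status ids present, then run the generated pivot query."""
    ids = [r[0] for r in conn.execute(
        "select distinct status_id from user_status order by status_id")]
    return conn.execute(build_pivot_sql(ids)).fetchall()


def demo_connection():
    """In-memory database holding the rows shown in the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table users (id integer primary key, username text)")
    conn.execute("create table user_status (user_id integer, status_id integer)")
    conn.executemany(
        "insert into users values (?, ?)",
        [(1, "Dan"), (2, "Eli"), (3, "Sean"), (4, "John")],
    )
    conn.executemany(
        "insert into user_status values (?, ?)",
        [(1, 2), (1, 3), (4, 1), (3, 2), (2, 3), (1, 1), (3, 3), (3, 3), (3, 3)],
    )
    return conn


if __name__ == "__main__":
    for row in pivot_status_counts(demo_connection()):
        print(row)
```

The left join keeps users that have no status rows at all; they would show 0 in every column.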
I have a table that looks something like this:
id name status
2 a 1
2 a 2
2 a 3
2 a 2
2 a 1
3 b 2
3 b 1
3 b 2
3 b 1
and the result I want is:
id name total count count(status3) count(status2) count(status1)
2 a 5 1 2 2
3 b 4 0 2 2
Please help me get this result. I can get id, name, or one of the counts at a time, but I don't know how to write a clause that produces this whole table at once.
Here's a simple solution using group by and case when.
select id
,count(*) as 'total count'
,count(case status when 3 then 1 end) as 'count(status3)'
,count(case status when 2 then 1 end) as 'count(status2)'
,count(case status when 1 then 1 end) as 'count(status1)'
from t
group by id
id  total count  count(status3)  count(status2)  count(status1)
2   5            1               2               2
3   4            0               2               2
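The count(case ... end) form works because COUNT skips NULLs, and a CASE with no ELSE yields NULL for non-matching rows. A small sketch with Python's sqlite3 module shows the difference from SUM on the question's data; the function name count_vs_sum and the underscore column aliases are mine, not from the answer.

```python
import sqlite3


def count_vs_sum():
    """Compare COUNT(CASE ...) and SUM(CASE ...) when a group has no matches."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id integer, name text, status integer)")
    conn.executemany("insert into t values (?, ?, ?)", [
        (2, "a", 1), (2, "a", 2), (2, "a", 3), (2, "a", 2), (2, "a", 1),
        (3, "b", 2), (3, "b", 1), (3, "b", 2), (3, "b", 1),
    ])
    rows = conn.execute("""
        select id, name,
               count(*) as total_count,
               -- COUNT ignores the NULLs produced for non-matching rows
               count(case status when 3 then 1 end) as count_status3,
               -- SUM over only NULLs is NULL, not 0
               sum(case status when 3 then 1 end) as sum_status3
        from t
        group by id, name
        order by id
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in count_vs_sum():
        print(row)  # id 3 has no status 3: the count is 0, the sum is None
```

So with SUM you would normally add else 0 (or wrap it in coalesce) to get zeros rather than NULLs.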
Here's a way to solve it using pivot.
select *
from (select status,id, count(*) over (partition by id) as "total count" from t) tmp
pivot (count(status) for status in ([1],[2],[3])) pvt
id  total count  1  2  3
3   4            2  2  0
2   5            2  2  1
I have the following tables.
Part
id  name
1   Part 1
2   Part 2
3   Part 3
Operation
id  name  part_id  order
1   Op 1  1        10
2   Op 2  1        20
3   Op 3  1        30
4   Op 1  2        10
5   Op 2  2        20
6   Op 1  3        10
Lot
id  part_id  Operation_id
10  1        2
11  2        5
12  3        6
I am selecting the results from Lot table and I want to select a column last_Op which is based on the order value of the operation_id. If value of order for the operation_id is the highest for the respective part_id, return 1 else return 0
SELECT
id,
part_id,
operation_id,
last_Op
FROM Lot
expected result set based on the tables above.
id  part_id  operation_id  last_op
10  1        2             0
11  2        5             1
12  3        6             1
In the example above, the first row returns last_op = 0 because operation_id = 2 belongs to part_id = 1, whose highest order is 30 (operation 3). Operation 2 has order = 20, which is not the highest order value for that part, so 0 is returned.
The other two rows return 1 because operation_id 5 and 6 are associated with part_id 2 and 3 respectively and they are pointing towards the highest 'order' value.
If value of order for the operation_id is the highest for the respective part_id, return 1 else return 0
This sounds like window functions will help:
-- "order" is a reserved word, so it is quoted here (ANSI double quotes; MySQL would use backticks)
select l.*,
(case when o."order" = o.max_order then 1 else 0 end) as last_op
from lot l left join
(select o.*,
max(o."order") over (partition by o.part_id) as max_order
from operation o
) o
on l.operation_id = o.id;
Note: order is a very poor name for a column because it is a SQL keyword.
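To check the window-function approach against the sample data, here is a small sketch using Python's sqlite3 module. The lowercase table names and the helper last_op_rows are my own choices, and the order column is written with ANSI double quotes, which SQLite accepts; other engines may need different quoting.

```python
import sqlite3

LAST_OP_SQL = """
select l.id, l.part_id, l.operation_id,
       case when o."order" = o.max_order then 1 else 0 end as last_op
from lot l
left join (select op.*,
                  -- highest order value among the operations of the same part
                  max(op."order") over (partition by op.part_id) as max_order
           from operation op) o
       on l.operation_id = o.id
order by l.id
"""


def last_op_rows():
    """Build the sample tables in memory and return (id, part_id, operation_id, last_op)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table operation (id integer, name text, part_id integer, "order" integer);
        create table lot (id integer, part_id integer, operation_id integer);
        insert into operation values
            (1, 'Op 1', 1, 10), (2, 'Op 2', 1, 20), (3, 'Op 3', 1, 30),
            (4, 'Op 1', 2, 10), (5, 'Op 2', 2, 20), (6, 'Op 1', 3, 10);
        insert into lot values (10, 1, 2), (11, 2, 5), (12, 3, 6);
    """)
    rows = conn.execute(LAST_OP_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in last_op_rows():
        print(row)
```

On this data, lot 10 gets last_op = 0 and lots 11 and 12 get 1, as in the expected result set.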
I am building a report and I am stuck formulating a query. I am bringing the following data from multiple tables after a lot of joins.
ID TYPE RATING
----- ---- ------
ID_R1 A 1
ID_R1 B 3
ID_R2 A 3
ID_R2 B 1
ID_R3 A 4
ID_R3 B 4
ID_R4 A 2
ID_R4 B 3
ID_R5 A 2
ID_R5 B 3
Every ID has one rating for Type A and one for Type B, so I need to transform the above into the following:
ID Type_A_Rating Type_B_Rating
----- ------------- -------------
ID_R1 1 3
ID_R2 3 1
ID_R3 4 4
ID_R4 2 3
ID_R5 2 3
I have tried GROUP BY and several other techniques, but so far I have been unable to come up with a solution. Need help F1! F1!
p.s just for the record my end game is getting the count of (A,B) combinations
Type_A_Rating Type_B_Rating Count
------------- ------------- -----
1 1 0
1 2 0
1 3 1
1 4 0
2 1 0
2 2 0
2 3 2
2 4 0
3 1 1
3 2 0
3 3 0
3 4 0
4 1 0
4 2 0
4 3 0
4 4 1
From this you can see that a simple GROUP BY with any combination of AND/OR conditions doesn't suffice until I get the data into the shape shown above. I could use intermediate/temp tables: get Type_A_Rating with ID in one, Type_B_Rating with ID in a second, and then combine both in a third. But isn't there a better way?
This should work as SQL engine agnostic solution (provided that there is exactly one row with type A for each ID and one row with type B for each ID):
select
TA.ID,
TA.RATING as Type_A_Rating,
TB.RATING as Type_B_Rating
from
(select ID, RATING
from T where TYPE = 'A') as TA
inner join
(select ID, RATING
from T where TYPE = 'B') as TB
on TA.ID = TB.ID
Related SQL Fiddle: http://sqlfiddle.com/#!9/7e6fd9/2
Alternative (simpler) solution:
select
ID,
sum(case when TYPE = 'A' then RATING else 0 end) as Type_A_Rating,
sum(case when TYPE = 'B' then RATING else 0 end) as Type_B_Rating
from
T
group by
ID
Fiddle: http://sqlfiddle.com/#!9/7e6fd9/3
EDIT:
The above is correct but both can be simplified a bit:
select TA.ID, TA.RATING as Type_A_Rating, TB.RATING as Type_B_Rating
from T TA join
T TB
on TA.ID = TB.ID and TA.TYPE = 'A' and TB.TYPE = 'B';
And (because I prefer NULL when there are no matches):
select ID,
max(case when TYPE = 'A' then RATING end) as Type_A_Rating,
max(case when TYPE = 'B' then RATING end) as Type_B_Rating
from T
group by ID
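For the end game in the question (counting every (A, B) combination, including those that never occur), one possible sketch is to pivot with max(case ...) and then left join the result onto a cross join of the possible ratings. The example runs through Python's sqlite3 module; the rating range 1 to 4, the CTE names and the helper combo_counts are assumptions, and ID_R2's type A rating is taken as 3, as the expected output and the count table imply.

```python
import sqlite3

COMBO_SQL = """
with pivoted as (
    select id,
           max(case when type = 'A' then rating end) as a_rating,
           max(case when type = 'B' then rating end) as b_rating
    from t
    group by id
),
ratings(r) as (values (1), (2), (3), (4))
select ra.r as type_a_rating,
       rb.r as type_b_rating,
       count(p.id) as cnt          -- counts only matched ids, so 0 when none
from ratings ra
cross join ratings rb
left join pivoted p
       on p.a_rating = ra.r and p.b_rating = rb.r
group by ra.r, rb.r
order by ra.r, rb.r
"""

ROWS = [
    ("ID_R1", "A", 1), ("ID_R1", "B", 3),
    ("ID_R2", "A", 3), ("ID_R2", "B", 1),
    ("ID_R3", "A", 4), ("ID_R3", "B", 4),
    ("ID_R4", "A", 2), ("ID_R4", "B", 3),
    ("ID_R5", "A", 2), ("ID_R5", "B", 3),
]


def combo_counts(rows=ROWS):
    """Return {(type_a_rating, type_b_rating): count} for all 16 combinations."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id text, type text, rating integer)")
    conn.executemany("insert into t values (?, ?, ?)", rows)
    result = {(a, b): n for a, b, n in conn.execute(COMBO_SQL)}
    conn.close()
    return result


if __name__ == "__main__":
    for combo, n in combo_counts().items():
        print(combo, n)
```

The cross join supplies the zero rows; without it, a plain group by on the pivoted ratings would list only the combinations that actually occur.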
thanks in advance for the help and sorry for how the "table" looks. Here's my question...
Let's say I have a subquery with this table (imagine the bold as column headers) as its output -
id 1 1 2 3 3 3 3 4 5 6 6 6
action o c o c c o c o o c c c
I would like my new query to output -
id 1 1 2 3 3 3 3 4 5 6 6 6
action o c o c c o c o o c c c
ct 1 2 1 1 2 3 4 1 1 1 2 3
#c 0 1 0 1 2 2 3 0 0 1 2 3
#o 1 1 1 0 0 1 1 1 1 0 0 0
where ct stands for count. Basically, I want to count (for each id) the occurrences of consecutive id and action as they happen. Let me know if this makes sense, and if not, how I can clarify my question.
Note: I realize the lag/lead functions may be helpful in this situation, along with the row_number() function. Looking for as many creative solutions as possible!
You are looking for the row_number() analytic function:
select id, action, row_number() over (partition by id order by id) as ct
from table t;
For #c and #o, you want cumulative sum:
select id, action, row_number() over (partition by id order by id) as ct,
sum(case when action = 'c' then 1 else 0 end) over
(partition by id order by <some column here>) as "#c",
sum(case when action = 'o' then 1 else 0 end) over
(partition by id order by <some column here>) as "#o"
from table t;
The one caveat is that you need a way to specify the order of the rows -- an id or date time stamp or something. SQL result sets and tables are inherently unordered, so there is no idea that one row comes before or after another.
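As a sketch of that caveat in practice, the example below adds an explicit ordering column and uses it in every window. It runs through Python's sqlite3 module; the column seq, the aliases n_c and n_o, and the helper running_counts are hypothetical names introduced for the illustration.

```python
import sqlite3

IDS = [1, 1, 2, 3, 3, 3, 3, 4, 5, 6, 6, 6]
ACTIONS = ["o", "c", "o", "c", "c", "o", "c", "o", "o", "c", "c", "c"]

RUNNING_SQL = """
select id, action,
       row_number() over w as ct,
       sum(case when action = 'c' then 1 else 0 end) over w as n_c,
       sum(case when action = 'o' then 1 else 0 end) over w as n_o
from t
window w as (partition by id order by seq)  -- seq fixes the row order
order by seq
"""


def running_counts():
    """Return (id, action, ct, n_c, n_o) rows in the original row order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (seq integer, id integer, action text)")
    conn.executemany(
        "insert into t values (?, ?, ?)",
        [(i, id_, act) for i, (id_, act) in enumerate(zip(IDS, ACTIONS), start=1)],
    )
    rows = conn.execute(RUNNING_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in running_counts():
        print(row)
```

With seq in place, the three computed columns reproduce the ct, #c and #o rows from the question.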
SQL> select id, action,
2 row_number() over(partition by id order by rowid) ct,
3 sum(decode(action,'c',1,0)) over(partition by id order by rowid) c#,
4 sum(decode(action,'o',1,0)) over(partition by id order by rowid) o#
5 from t1
6 /
ID A CT C# O#
---------- - ---------- ---------- ----------
1 o 1 0 1
1 c 2 1 1
2 o 1 0 1
3 c 1 1 0
3 c 2 2 0
3 o 3 2 1
3 c 4 3 1
4 o 1 0 1
5 o 1 0 1
6 c 1 1 0
6 c 2 2 0
6 c 3 3 0
P.S. Sorry Gordon, didn't see your post.