Redshift- How to get max date - sql

I'm trying to get a max value without joining back to the table, and I'm wondering whether that's possible. Specifically, I want the max of the version column and a flag that marks it. The first table below shows the data returned by the script that follows, but only the row with version = 3 should have a flag value of 1; everything else should be 0.
select e.id, e.version, e.version_type, q.status,
(case when e.version = max(e.version) over (partition by e.id) then 1 else 0 end) as flag
from data.deployment_events e
join data.deployments d on e.id = d.id
id version version_type status flag
1 1 test unopened 1
2 1 test declined 1
3 1 test unopened 1
4 1 test completed 1
5 1 test completed 0
6 2 test opened 0
7 3 test declined 1
Expected result set:
id version version_type status flag
1 1 test unopened 0
2 1 test declined 0
3 1 test unopened 0
4 1 test completed 0
5 1 test completed 0
6 2 test opened 0
7 3 test declined 1

This query would determine the maximum version overall:
select e.id, e.version, e.version_type, q.status,
(case when e.version = max(e.version) over () then 1 else 0 end) as flag
from data.deployment_events e
join data.deployments d on e.id = d.id;
Or something like this if you want to partition the data based on another column (x?):
select e.id, e.version, e.version_type, q.status,
(case when e.version = max(e.version) over (partition by x) then 1 else 0 end) as flag
from data.deployment_events e
join data.deployments d on e.id = d.id;
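To see why the partition clause matters, here is a minimal sketch that runs both variants against a tiny in-memory table. It uses Python's sqlite3 module as a stand-in for Redshift (an assumption; SQLite 3.25 or later is needed for window functions), and the `events` table is a hypothetical, join-free simplification of the question's data.

```python
import sqlite3

# Hypothetical stand-in table: only the columns needed for the flag.
conn = sqlite3.connect(":memory:")
conn.execute("create table events (id integer, version integer)")
conn.executemany(
    "insert into events values (?, ?)",
    [(1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 2), (7, 3)],
)

# Partitioning by a unique id: every row is the max of its own partition.
per_id = conn.execute("""
    select id,
           case when version = max(version) over (partition by id)
                then 1 else 0 end as flag
    from events
    order by id
""").fetchall()

# Empty over(): the max is taken across the whole result set.
overall = conn.execute("""
    select id,
           case when version = max(version) over ()
                then 1 else 0 end as flag
    from events
    order by id
""").fetchall()

print(per_id)
print(overall)
```

With these rows, the per-id variant flags every row because each id forms its own partition, while the empty `over ()` flags only the row whose version is 3.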

Related

Best way to rank by one column and an aggregation on another column

I want to create a rank column from an existing rank column and a binary column. Suppose, for example, a table with ID, RISK, CONTACT, DATE. The existing rank is RISK, say 1, 2, 3, NULL, with 3 being the highest. The binary-valued column is CONTACT, with values 0/1 or FAILURE/SUCCESS. I want to create a new RANK that orders by RISK only once a certain number of successful contacts has been reached.
For example, suppose the constraint is a minimum of 2 successful contacts. Then the rank should be created as follows in the two instances below:
Instance 1. Three IDs, all with at least two successful contacts. In that case the rank mirrors the risk:
ID risk contact date rank
1 3 S 1 3
1 3 S 2 3
1 3 F 3 3
1 3 F 4 3
2 2 S 1 2
2 2 S 2 2
2 2 F 3 2
2 2 F 4 2
3 1 S 1 1
3 1 S 2 1
3 1 S 3 1
Instance 2. Suppose ID=1 has only one successful contact. In that case it is relegated to the lowest rank, rank=1, while ID=2 gets the highest value, rank=3, and ID=3 maps to rank=2 because it satisfies the constraint but has a lower risk value than ID=2:
ID risk contact date rank
1 3 S 1 1
1 3 F 2 1
1 3 F 3 1
1 3 F 4 1
2 2 S 1 3
2 2 S 2 3
2 2 F 3 3
2 2 F 4 3
3 1 S 1 2
3 1 S 2 2
3 1 S 3 2
This is SQL, specifically Hive. Thanks in advance.
Edit - I think Gordon Linoff's code does it correctly. In the end, I used three interim tables. The code looks like this:
First,
-- map risk and contact to numeric values
select A.* ,
case when A.risk = 'H' then 3
when A.risk = 'M' then 2
when A.risk = 'L' then 1
when A.risk is NULL then NULL
when A.risk = 'NULL' then NULL
else -999 end as RISK_RANK,
case when A.contact = 'Successful' then 1
else NULL end as success
-- T stands for the input table of this step, as in the steps below
from T as A
Second,
-- sum_successes_by_risk
select A.* ,
B.sum_successes_by_risk
from T as A
inner join
(select A.person, A.program, A.risk, sum(a.success) as sum_successes_by_risk
from T as A
group by A.person, A.program, A.risk
) as B
on A.program = B.program
and A.person = B.person
and A.risk = B.risk
Third,
--Create table that contains only max risk category
select A.* ,
B.max_risk_rank
from T as A
inner join
(select A.person, max(A.risk_rank) as max_risk_rank
from T as A
group by A.person
) as B
on A.person = B.person
and A.risk_rank = B.max_risk_rank
This is hard to follow, but I think you just want window functions:
select t.*,
(case when sum(case when contact = 'S' then 1 else 0 end) over (partition by id) >= 2
then risk
else 1
end) as new_risk
from t;
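For reference, here is a small sketch of what that window-function query returns on the Instance 2 data from the question. It runs through Python's sqlite3 module rather than Hive (an assumption made so the example is self-contained; SQLite 3.25+ is required), and `select distinct` is added only to collapse the output to one row per id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer, risk integer, contact text, date integer)")
# Instance 2 from the question: ID 1 has only one successful contact.
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [
        (1, 3, "S", 1), (1, 3, "F", 2), (1, 3, "F", 3), (1, 3, "F", 4),
        (2, 2, "S", 1), (2, 2, "S", 2), (2, 2, "F", 3), (2, 2, "F", 4),
        (3, 1, "S", 1), (3, 1, "S", 2), (3, 1, "S", 3),
    ],
)

# Count successes per id with a window sum, then keep or demote the risk.
new_risk = conn.execute("""
    select distinct id,
           case when sum(case when contact = 'S' then 1 else 0 end)
                     over (partition by id) >= 2
                then risk
                else 1
           end as new_risk
    from t
    order by id
""").fetchall()

print(new_risk)
```

On this data, ID 1 is demoted to 1, while IDs 2 and 3 keep their original risk values (2 and 1). Note that the query returns the original risk for qualifying IDs rather than re-ranking them, so turning those values into the 3/2 ranks shown in the question would need a further ranking step on top.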

SQL Query. limit an update per rows if condition is X and Y for the same ID number

I have the following table, tblTrans:
Trans_ID Trans_Sequence Trans_PointsEarned Trans_PointsApplied
4452 1 1 1
4452 2 1 1
4452 3 0 1
4462 1 1 1
4462 2 1 1
4462 3 1 1
4462 4 1 1
4462 5 1 1
9101 1 0 1
9101 2 0 1
9101 3 0 1
9101 4 0 1
(useless table doesnt work)
I need to set another field, Customer_OverallPoints, for every customer ID as follows:
4452 = 2 (doesn't count 0's)
4462 = 4 (I want to cap the points at 4 based on the sequence, transID and customerID)
9101 = 0 (don't count 0's).
This needs to be applied to thousands of records based on customerID and TransID, where Trans_Sequence is within the same Trans_ID, and it should only count the first 4 rows that have Trans_PointsEarned = 1.
I tried putting pseudocode together, but it just looked ridiculous and I can't even come up with the logic for this.
Thanks
Assuming that Trans_ID is really the customer id, I think the basic logic is just an aggregation:
select t.Trans_ID,
(case when sum(t.Trans_PointsEarned) > 4 then 4
else sum(t.Trans_PointsEarned)
end) as Customer_OverallPoints
from tblTrans t
group by t.Trans_ID;
You can put this into an update statement as:
update customers c
set Customer_OverallPoints = (select (case when sum(t.Trans_PointsEarned) > 4 then 4
else sum(t.Trans_PointsEarned)
end)
from tblTrans t
where t.Trans_ID = c.CustomerId
);
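As a quick check of the capped aggregation, the sketch below loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module (a stand-in chosen for convenience, not the asker's actual database) and runs only the select form. Column names follow the question's table header.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table tblTrans (
        Trans_ID integer,
        Trans_Sequence integer,
        Trans_PointsEarned integer,
        Trans_PointsApplied integer
    )
""")
conn.executemany(
    "insert into tblTrans values (?, ?, ?, ?)",
    [
        (4452, 1, 1, 1), (4452, 2, 1, 1), (4452, 3, 0, 1),
        (4462, 1, 1, 1), (4462, 2, 1, 1), (4462, 3, 1, 1),
        (4462, 4, 1, 1), (4462, 5, 1, 1),
        (9101, 1, 0, 1), (9101, 2, 0, 1), (9101, 3, 0, 1), (9101, 4, 0, 1),
    ],
)

# Sum the earned points per Trans_ID and cap the total at 4.
points = conn.execute("""
    select t.Trans_ID,
           case when sum(t.Trans_PointsEarned) > 4 then 4
                else sum(t.Trans_PointsEarned)
           end as Customer_OverallPoints
    from tblTrans t
    group by t.Trans_ID
    order by t.Trans_ID
""").fetchall()

print(points)
```

Because Trans_PointsEarned only holds 0 or 1 here, capping the sum at 4 gives the same result as counting the first four rows that earned a point.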

Formatting the results of a query

Let's say I have the following table:
first second
A 1
A 1
A 2
B 1
B 2
C 1
C 1
If I run the following query:
select first, second, count(second) from tbl group by first, second
It will produce a table with the following information:
first second count(second)
A 1 2
A 2 1
B 1 1
B 2 1
C 1 2
How can I write the query so that I am given the information with the options from the second column as columns and the values for those columns being the count like this:
first 1 2
A 2 1
B 1 1
C 2 0
You can use CASE:
SELECT "first",
SUM(CASE WHEN "second" = 1 THEN 1 ELSE 0 END) AS "1",
SUM(CASE WHEN "second" = 2 THEN 1 ELSE 0 END) AS "2"
FROM tbl
GROUP BY "first"
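Here is a minimal, self-contained run of that conditional-aggregation pivot on the question's sample rows, using Python's sqlite3 module as a convenient engine (an assumption; the question does not name a database). An `ORDER BY` is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('create table tbl ("first" text, "second" integer)')
conn.executemany(
    "insert into tbl values (?, ?)",
    [("A", 1), ("A", 1), ("A", 2), ("B", 1), ("B", 2), ("C", 1), ("C", 1)],
)

# One SUM(CASE ...) per value of "second" turns rows into columns.
pivot = conn.execute("""
    SELECT "first",
           SUM(CASE WHEN "second" = 1 THEN 1 ELSE 0 END) AS "1",
           SUM(CASE WHEN "second" = 2 THEN 1 ELSE 0 END) AS "2"
    FROM tbl
    GROUP BY "first"
    ORDER BY "first"
""").fetchall()

print(pivot)
```

The approach needs one CASE expression per distinct value, so it fits best when the set of values in the second column is small and known in advance.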

Count occurrences of field values as they are displayed in order

Thanks in advance for the help, and sorry for how the "table" looks. Here's my question...
Let's say I have a subquery with this table (imagine the bold as column headers) as its output -
id 1 1 2 3 3 3 3 4 5 6 6 6
action o c o c c o c o o c c c
I would like my new query to output -
id 1 1 2 3 3 3 3 4 5 6 6 6
action o c o c c o c o o c c c
ct 1 2 1 1 2 3 4 1 1 1 2 3
#c 0 1 0 1 2 2 3 0 0 1 2 3
#o 1 1 1 0 0 1 1 1 1 0 0 0
where ct stands for count. Basically, I want to count (for each id) the occurrences of consecutive id and action as they happen. Let me know if this makes sense, and if not, how I can clarify my question.
Note: I realize the lag/lead functions may be helpful in this situation, along with the row_number() function. Looking for as many creative solutions as possible!
You are looking for the row_number() analytic function:
select id, action, row_number() over (partition by id order by id) as ct
from table t;
For #c and #o, you want cumulative sum:
select id, action, row_number() over (partition by id order by id) as ct,
sum(case when action = 'c' then 1 else 0 end) over
(partition by id order by <some column here>) as "#c",
sum(case when action = 'o' then 1 else 0 end) over
(partition by id order by <some column here>) as "#o"
from table t;
The one caveat is that you need a way to specify the order of the rows: an id, a date/time stamp, or something similar. SQL result sets and tables are inherently unordered, so there is no built-in notion of one row coming before or after another.
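To make the ordering requirement concrete, the sketch below adds an explicit `seq` column (hypothetical, not in the question) that records the order of the rows, and uses it in every window. It runs on SQLite via Python's sqlite3 module as a stand-in engine (SQLite 3.25+), and the output aliases `num_c` and `num_o` are illustrative names for the #c and #o columns.

```python
import sqlite3

ids = [1, 1, 2, 3, 3, 3, 3, 4, 5, 6, 6, 6]
actions = list("ococcocooccc")

conn = sqlite3.connect(":memory:")
conn.execute("create table t (seq integer, id integer, action text)")
# seq is an assumed ordering column: the position of each row in the question.
conn.executemany(
    "insert into t values (?, ?, ?)",
    [(n, i, a) for n, (i, a) in enumerate(zip(ids, actions), start=1)],
)

rows = conn.execute("""
    select id, action,
           row_number() over (partition by id order by seq) as ct,
           sum(case when action = 'c' then 1 else 0 end)
               over (partition by id order by seq) as num_c,
           sum(case when action = 'o' then 1 else 0 end)
               over (partition by id order by seq) as num_o
    from t
    order by seq
""").fetchall()

for row in rows:
    print(row)
```

With the rows ordered by `seq`, the three computed columns match the ct, #c and #o rows requested in the question.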
SQL> select id, action,
2 row_number() over(partition by id order by rowid) ct,
3 sum(decode(action,'c',1,0)) over(partition by id order by rowid) c#,
4 sum(decode(action,'o',1,0)) over(partition by id order by rowid) o#
5 from t1
6 /
ID A CT C# O#
---------- - ---------- ---------- ----------
1 o 1 0 1
1 c 2 1 1
2 o 1 0 1
3 c 1 1 0
3 c 2 2 0
3 o 3 2 1
3 c 4 3 1
4 o 1 0 1
5 o 1 0 1
6 c 1 1 0
6 c 2 2 0
6 c 3 3 0
P.S. Sorry Gordon, didn't see your post.

MSSQL DB get records that don't have a specific status

I have a DB with 3 tables:
Alarm
ID Message
-------------------
1 Server01 Down
2 Switch01 Port 2 down
3 Webserver Down
ListAlarmStates
ID StateName
------------------
1 Raised
2 RaisedNotified
3 Cleared
4 ClearedNotified
5 ForceClear
AlarmStates
ID AlarmId ListAlarmStatesId
-----------------------------------------
1 1 1
2 1 2
3 1 3
4 1 4
5 2 1
6 2 5
7 3 1
Now I would like to find all alarms that have the status Cleared but do not have the status ClearedNotified (the Cleared check I could also handle in code).
Thanks in advance!
-- AS is a reserved word, so the alias S is used for AlarmStates
SELECT S.AlarmId
FROM AlarmStates S
INNER JOIN Alarm A
ON (S.AlarmId = A.ID)
INNER JOIN ListAlarmStates LA
ON (S.ListAlarmStatesId = LA.ID)
GROUP BY S.AlarmId
-- no ClearedNotified row, but at least one Cleared row
HAVING COUNT(CASE WHEN LA.StateName = 'ClearedNotified' THEN 1 ELSE NULL END) = 0
AND COUNT(CASE WHEN LA.StateName = 'Cleared' THEN 1 ELSE NULL END) > 0
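A small sketch of the GROUP BY / HAVING approach follows, run on SQLite through Python's sqlite3 module instead of MSSQL (an assumption made to keep it self-contained). It uses the alias `S` for AlarmStates and omits the join to Alarm, which is not needed for the filter. Alarm 4 is a hypothetical extra row, not in the question, added so that at least one alarm matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table ListAlarmStates (ID integer primary key, StateName text);
    create table AlarmStates (
        ID integer primary key,
        AlarmId integer,
        ListAlarmStatesId integer
    );

    insert into ListAlarmStates values
        (1, 'Raised'), (2, 'RaisedNotified'), (3, 'Cleared'),
        (4, 'ClearedNotified'), (5, 'ForceClear');

    -- rows from the question
    insert into AlarmStates values
        (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 4),
        (5, 2, 1), (6, 2, 5), (7, 3, 1);

    -- hypothetical alarm 4: Raised, then Cleared, never ClearedNotified
    insert into AlarmStates values (8, 4, 1), (9, 4, 3);
""")

# COUNT ignores NULLs, so each CASE counts only rows with the named state.
alarm_ids = [row[0] for row in conn.execute("""
    select S.AlarmId
    from AlarmStates S
    inner join ListAlarmStates LA on S.ListAlarmStatesId = LA.ID
    group by S.AlarmId
    having count(case when LA.StateName = 'ClearedNotified' then 1 end) = 0
       and count(case when LA.StateName = 'Cleared' then 1 end) > 0
    order by S.AlarmId
""")]

print(alarm_ids)
```

With only the question's rows, no alarm qualifies: alarm 1 already has ClearedNotified, and alarms 2 and 3 were never Cleared. The hypothetical alarm 4 is the only one returned.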