Update next column in table if previous column value is not null - sql

When a person receives a score, an entry is added into the table #uniqueScores:
Pid | Date | Score
I have a stored procedure returning a table #people with the score columns containing the data from #uniqueScores (that fall within the past 3 months)
Pid | S1 | S2 | S3 | S4 | S5
I have a small test dataset, however I'm having trouble getting any scores beyond the first score registered to a user to appear in Score2 or beyond.
Here is my test dataset
Pid | Date | Score
#1 | 2020/07/01 | 8
#1 | 2020/09/15 | 8
#2 | 2020/09/21 | 3
#3 | 2020/10/01 | 5
#4 | 2020/10/18 | 6
#4 | 2020/10/31 | 2
My update statement, to update the Person column with the data
BEGIN
UPDATE #people
SET [Score5] = (CASE WHEN ( [p].[Score4] is not null and [p].[Score5] is null ) THEN [us].[Score] ELSE NULL END)
,[Score4] = (CASE WHEN ( [p].[Score3] is not null and [p].[Score4] is null ) THEN [us].[Score] ELSE NULL END)
,[Score3] = (CASE WHEN ( [p].[Score2] is not null and [p].[Score3] is null ) THEN [us].[Score] ELSE NULL END)
,[Score2] = (CASE WHEN ( [p].[Score1] is not null and [p].[Score2] is null ) THEN [us].[Score] ELSE NULL END)
,[Score1] = (CASE WHEN ( [p].[Score1] is null ) THEN [us].[Score] ELSE NULL END)
FROM #people [p] inner join #uniqueScores [us]
on [p].[PersonID] = [us].[PersonID]
WHERE [Date] >= @DateLimit -- within the previous 3 months
END
However, the query isn't updating the table with any but the first eligible values. The returned table looks like this
Pid | S1 | S2 | S3 | S4 | S5
#1 | 8 | null | null | null | null
#2 | 3 | null | null | null | null
#3 | 5 | null | null | null | null
#4 | 6 | null | null | null | null
The first table entry which is ineligible to be considered for the table isn't included which is great, however Person #4's second score is also missing.
I've been looking at PIVOT, WHILE and a CURSOR but I've got no closer to making this work. I'm sure I've missed something simple however I just can't see it.

UPDATE updates each row once. Preaggregate for multiple updates:
UPDATE p
SET Score1 = us.score_1,
Score2 = us.score_2,
Score3 = us.score_3,
Score4 = us.score_4,
Score5 = us.score_5
FROM #people [p] inner join
(SELECT us.PersonID,
MAX(CASE WHEN seqnum = 1 THEN Score END) as score_1,
MAX(CASE WHEN seqnum = 2 THEN Score END) as score_2,
MAX(CASE WHEN seqnum = 3 THEN Score END) as score_3,
MAX(CASE WHEN seqnum = 4 THEN Score END) as score_4,
MAX(CASE WHEN seqnum = 5 THEN Score END) as score_5
FROM (SELECT us.*,
ROW_NUMBER() OVER (PARTITION BY PersonId ORDER BY Date) as seqnum
FROM #uniqueScores us
WHERE [Date] >= @DateLimit -- within the previous 3 months
) us
GROUP BY us.PersonID
) s
ON s.PersonID = p.PersonID;
Note: You don't specify what order you want the scores in. This puts the oldest ones first. Use ORDER BY Date DESC if you want the newer ones first.
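To see what the inner ROW_NUMBER() step hands to the conditional aggregation, here is a small repro against the question's test data. It is only a sketch: the column types and the @DateLimit value are assumptions (the question does not give them), and the table and column names follow the question's code.

```sql
-- Assumed cutoff: the question only says "within the previous 3 months"
DECLARE @DateLimit date = '2020-08-01';

-- Assumed column types; the rows are the question's test dataset
CREATE TABLE #uniqueScores (PersonID int, [Date] date, Score int);
INSERT INTO #uniqueScores VALUES
    (1, '2020-07-01', 8), (1, '2020-09-15', 8), (2, '2020-09-21', 3),
    (3, '2020-10-01', 5), (4, '2020-10-18', 6), (4, '2020-10-31', 2);

-- Number each person's eligible scores, oldest first
SELECT PersonID, [Date], Score,
       ROW_NUMBER() OVER (PARTITION BY PersonID ORDER BY [Date]) AS seqnum
FROM #uniqueScores
WHERE [Date] >= @DateLimit
ORDER BY PersonID, seqnum;
```

With that cutoff, person 1's July score is filtered out, and person 4 gets two rows (seqnum 1 with score 6, seqnum 2 with score 2). The MAX(CASE WHEN seqnum = n ...) expressions then turn those rows into Score1 and Score2 in a single pass.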

Related

Selecting all rows conditionally between 2 arbitrary column values in SQL Server

I've joined a number of tables to get to this table. From this table I need to select all of the b_id values that fall between the rows whose start and end values are not null. There could be multiple start and end values in the table. How can I write a SQL Server query to select all of the b_ids between, but not including, those rows? For this example table I would need the b_ids 99396 and 71828.
I tried to find a similar question and found something like this, but I don't believe I'm using the correct values where they need to be. Is there another way to do it? I have a solution using a cursor, but I'm trying to find a non-cursor solution. My friend told me the responses on here can be brutal if you don't word the question a certain way. Please be easy on me lol.
a_id | b_id | sequence | start | end |
---------+-------+----------+-------+-------+
3675151 | 68882 | 1 | null | null |
3675151 | 79480 | 2 | 79480 | null |
3675151 | 99396 | 3 | null | null |
3675151 | 71828 | 4 | null | null |
3675151 | 28911 | 5 | null | 28911 |
3675151 | 27960 | 6 | null | null |
3675183 | 11223 | 1 | null | null |
3675183 | 77810 | 2 | null | null |
3675183 | 11134 | 3 | null | null |
3675183 | 90909 | 4 | null | null |
Is this what you are looking for?
select a_id, b_id, sequence
from
table
where
(a_id,sequence )
in
(select a_id, sequence from table t1
where
sequence >
(select sequence from table t2 where t1.a_id = t2.a_id and start is not null)
and
sequence <
(select sequence from table t3 where t1.a_id = t3.a_id and end is not null)
);
Would it be this?
Mark as answer if yes; if not, please give an example of what should differ.
create table #table (
a_id int
,b_id int
,c_sequence int
,c_start int
,c_end int
)
insert into #table
values
(3675151 ,68882 , 1 , null , null )
,(3675151 ,79480 , 2 , 79480 , null )
,(3675151 ,99396 , 3 , null , null )
,(3675151 ,71828 , 4 , null , null )
,(3675151 ,28911 , 5 , null , 28911)
,(3675151 ,27960 , 6 , null , null )
,(3675183 ,11223 , 1 , null , null )
,(3675183 ,77810 , 2 , 4343 , null )
,(3675183 ,11134 , 3 , null , null )
,(3675183 ,90939 , 4 , null , 1231 )
select
t.*
from #table t
where
exists (select t1.b_id,t1.c_sequence
from #table t1
where t1.c_start is not null
and t.a_id =t1.a_id and t.c_sequence>t1.c_sequence )
and exists (select t1.b_id,t1.c_sequence
from #table t1
where t1.c_end is not null
and t.a_id =t1.a_id
and t.c_sequence<t1.c_sequence )
You can use window functions for this:
select t.*
from (select t.*,
max(case when c_start is not null then c_sequence end) over (partition by a_id order by c_sequence) as last_c_start,
max(case when c_end is not null then c_sequence end) over (partition by a_id order by c_sequence) as last_c_end,
min(case when c_end is not null then c_sequence end) over (partition by a_id order by c_sequence desc) as next_c_end
from t
) t
where c_sequence > last_c_start and
c_sequence < next_c_end and
(last_c_start > last_c_end or last_c_end is null);
Here is a db<>fiddle.
The subquery returns the previous start and the next end. That is pretty simple. The where clause uses this information. The last condition just checks that the most recent "start" is the one that should be considered.
Note: This does not handle more complicated scenarios like start-->start-->end-->end. If that is a possibility, you should ask another question.
EDIT:
Actually, there is an even easier way:
select t.*
from (select t.*,
count(coalesce(c_start, c_end)) over (partition by a_id order by c_sequence) as counter
from t
) t
where c_start is null and c_end is null and
counter % 2 = 1;
This returns rows where the two values are NULL (to avoid the endpoints) and there is an odd number of non-NULL c_start/c_end values up to that row.
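To make the parity idea concrete, here is a sketch that prints the running counter for the first a_id from the question. The temp table name #t and the int column types are assumptions for illustration. A windowed COUNT with ORDER BY needs SQL Server 2012 or later.

```sql
-- Hypothetical repro table holding the question's rows for a_id 3675151
CREATE TABLE #t (a_id int, b_id int, c_sequence int, c_start int, c_end int);
INSERT INTO #t VALUES
    (3675151, 68882, 1, NULL,  NULL),
    (3675151, 79480, 2, 79480, NULL),
    (3675151, 99396, 3, NULL,  NULL),
    (3675151, 71828, 4, NULL,  NULL),
    (3675151, 28911, 5, NULL,  28911),
    (3675151, 27960, 6, NULL,  NULL);

-- Running count of start/end markers seen so far within each a_id
SELECT b_id, c_sequence,
       COUNT(COALESCE(c_start, c_end))
           OVER (PARTITION BY a_id ORDER BY c_sequence) AS counter
FROM #t
ORDER BY c_sequence;
-- counter by c_sequence: 0, 1, 1, 1, 2, 2
```

Only sequences 3 and 4 have both columns NULL and an odd counter, so the filter keeps b_ids 99396 and 71828, which is what the question asked for.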

How can I check the order of column values (by date) for every unique id?

I have this table, Activity:
| ID | Date of activity | activity |
|----|---------------------|----------|
| 1 | 2016-05-01T13:45:03 | a |
| 1 | 2016-05-02T13:45:03 | b |
| 1 | 2016-05-03T13:45:03 | a |
| 1 | 2016-05-04T13:45:03 | b |
| 2 | 2016-05-01T13:45:03 | b |
| 2 | 2016-05-02T13:45:03 | b |
and this table:
| id | Right order |
|----|-------------|
| 1 | yes |
| 2 | no |
How can I check, for every ID, whether the order of the activities is similar to this order, for example?
a b a b a b ..
Of course, I'll check according to the activity date.
In SQL Server 2012+ you could use common table expression with lag(), and then the min() of a case expression that follows your logic like so:
;with cte as (
select *
, prev_activity = lag(activity) over (partition by id order by date_of_activity)
from t
)
select id
, right_order = min(case
when activity = 'a' and isnull(prev_activity,'b')<>'b' then 'no'
when activity = 'b' and isnull(prev_activity,'b')<>'a' then 'no'
else 'yes'
end)
from cte
group by id
rextester demo: http://rextester.com/NQQF78056
returns:
+----+-------------+
| id | right_order |
+----+-------------+
| 1 | yes |
| 2 | no |
+----+-------------+
Prior to SQL Server 2012 you can use outer apply() to get the previous activity instead of lag() like so:
select id
, right_order = min(case
when activity = 'a' and isnull(prev_activity,'b')<>'b' then 'no'
when activity = 'b' and isnull(prev_activity,'b')<>'a' then 'no'
else 'yes'
end)
from t
outer apply (
select top 1 prev_activity = i.activity
from t as i
where i.id = t.id
and i.date_of_activity < t.date_of_activity
order by i.date_of_activity desc
) x
group by id
EDITED - Allows for variable number of Patterns per ID
Perhaps another approach
Example
Declare @Pat varchar(max)='a b'
Declare @Cnt int = 2
Select ID
,RightOrder = case when rtrim(replicate(@Pat+' ',Hits/@Cnt)) = (Select Stuff((Select ' ' +activity From t Where id=A.id order by date_of_activity For XML Path ('')),1,1,'') ) then 'Yes' else 'No' end
From (Select ID,hits=count(*) from t group by id) A
Returns
ID RightOrder
1 Yes
2 No
select id,
case when sum(flag)=0 and cnt_per_id%2=0
and max(case when rnum=1 then activity end) = 'a'
and max(case when rnum=2 then activity end) = 'b'
and min_activity = 'a' and max_activity = 'b'
then 'yes' else 'no' end as RightOrder
from (select t.*
,row_number() over(partition by id order by activitydate) as rnum
,count(*) over(partition by id) as cnt_per_id
,min(activity) over(partition by id) as min_activity
,max(activity) over(partition by id) as max_activity
,case when lag(activity) over(partition by id order by activitydate)=activity then 1 else 0 end as flag
from tbl t
) t
group by id,cnt_per_id,max_activity,min_activity
Based on the explanation the following logic has to be implemented for rightorder.
Check if the number of rows per id are even (Remove this condition if there can be an odd number of rows like a,b,a or a,b,a,b,a and so on)
First row contains a and second b, min activity is a and max activity is b for an id.
Sum of flags (set using lag) should be 0
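For comparison, a different technique from the ones above: if the pattern is strictly a, b, a, b starting with a, the expected activity can be derived from the parity of the row number alone. This is only a sketch; it assumes the table is called t with a date_of_activity column, as in the first answer.

```sql
-- Sketch: odd positions must be 'a', even positions must be 'b'
SELECT id,
       MIN(CASE WHEN activity = CASE WHEN rn % 2 = 1 THEN 'a' ELSE 'b' END
                THEN 'yes' ELSE 'no'
           END) AS right_order   -- 'no' sorts before 'yes', so one mismatch wins
FROM (SELECT id, activity,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY date_of_activity) AS rn
      FROM t) x
GROUP BY id;
```

On the sample data this gives yes for ID 1 and no for ID 2. Unlike the query above, it does not require an even number of rows per ID, so a trailing a (a, b, a) still counts as the right order.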

SQL group by and where on each group

I have a table with columns like sourceId (guid), state (1:Deactivated, 2:Activated, 3:Dead), modifiedDate.
I am writing a query to group by sourceId and see if ALL the records in a group have the state as 2 (activated) and also get the MAX of modifiedDate of the rows which have state as 2 (activated) in each group.
result table should be something like sourceId, IsAllActivated, MaxModifiedForActivatedRecords.
I tried a lot of options like Partition By, Cross over etc. which are giving me either one of the column and not both. Options which have self joins were costly, so looking for any other efficient way of forming the query.
Data :
SourceId | State | modifiedDate
s1 | 1 | 01/01
s1 | 2 | 01/02
s2 | 3 | 02/03
s2 | 3 | 03/03
s1 | 3 | 10/10
Output:
sourceId | IsAllActivated | MaxModifiedForActivatedRecords
s1 | 0 | 02/03
s2 | 1 | 03/03
What I tried:
SELECT
[SourceID]
,CASE
WHEN COUNT(DISTINCT State) = 1 AND
SUM(DISTINCT State) = 3
THEN 1
ELSE 0
END AS IsAllActivated
FROM ThreadActivation
GROUP BY SourceID
SELECT
[SourceID]
,MAX(modifiedDate) AS MaxModifiedForActivatedRecords
FROM ThreadActivation
GROUP BY SourceID
HAVING State = 3
I am able to get them separately, but not together in a single query.
I tried ranking with row number :
WITH ThreadActivationTransaction AS (
select
*
,ROW_NUMBER() over(PARTITION BY SourceId order by modifiedDate desc) AS rk
from ThreadActivation)
select
[sourceID]
,CASE
WHEN COUNT(DISTINCT State) = 1 AND SUM(DISTINCT State) = 3
THEN 1
ELSE 0
END AS IsAllActivated
,[SourceId]
from ThreadActivation s
GROUP by SourceId --where s.rk =1
None of these gave me a breakthrough.
You can do this with aggregation and case:
select sourceId,
(case when max(state) = min(state) and max(state) = 2
then 1 else 0
end) as IsAllActivated,
max(case when state = 2 then modifiedDate end) as MaxModifiedForActivatedRecords
from t
group by sourceId;
This assumes that state is not NULL. The logic is only slightly more complicated if that is possible.
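If state can be NULL, one possible variant is to compare the total row count with the count of activated rows. This is a sketch that assumes a NULL state should count as "not activated"; the question does not say how NULLs should be treated.

```sql
SELECT sourceId,
       -- COUNT(*) counts every row; the second COUNT only counts rows with state = 2
       CASE WHEN COUNT(*) = COUNT(CASE WHEN state = 2 THEN 1 END)
            THEN 1 ELSE 0
       END AS IsAllActivated,
       MAX(CASE WHEN state = 2 THEN modifiedDate END) AS MaxModifiedForActivatedRecords
FROM t
GROUP BY sourceId;
```

The two counts are equal only when every row in the group has state = 2, so a NULL (or any other state) in the group makes IsAllActivated 0.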

Rewriting SQL query to get record id and customer number

Sample data
+-------------------+-------------+-----------------+---------------------+
| RECORD_ID | CUST_NO | IsAccntClosed | Code |
+-------------------+-------------+-----------------+---------------------+
|159045 | 2439123 | N | 13 |
+-------------------+-------------+-----------------+---------------------+
|159048 | 6376150 | Y | 13 |
+-------------------+-------------+-----------------+---------------------+
|159048 | 9513035 | N | 13 |
+-------------------+-------------+-----------------+---------------------+
|159049 | 2398524 | N | 12 |
+-------------------+-------------+-----------------+---------------------+
|159049 | 6349269 | Y | 12 |
+-------------------+-------------+-----------------+---------------------+
|159049 | 6350690 | Y | 12 |
+-------------------+-------------+-----------------+---------------------+
|159049 | 6372163 | Y | 12 |
+-------------------+-------------+-----------------+---------------------+
|159049 | 6393810 | Y | 12 |
+-------------------+-------------+-----------------+---------------------+
|159049 | 6402062 | Y | 12 |
+-------------------+-------------+-----------------+---------------------+
|159050 | 2677512 | Y | 12 |
+-------------------+-------------+-----------------+---------------------+
|159050 | 6349382 | Y | 12 |
+-------------------+-------------+-----------------+---------------------+
|159050 | 6378137 | Y | 12 |
+-------------------+-------------+-----------------+---------------------+
|159051 | 2336197 | N | 12 |
+-------------------+-------------+-----------------+---------------------+
|159051 | 6349293 | N | 12 |
+-------------------+-------------+-----------------+---------------------+
|159051 | 6350682 | N | 12 |
+-------------------+-------------+-----------------+---------------------+
|159051 | 6367895 | N | 12 |
+-------------------+-------------+-----------------+---------------------+
|159060 | yyyyyy | Y | 12 |
+-------------------+-------------+-----------------+---------------------+
The IsAccntClosed column indicates whether the account is open (N) or closed (Y).
I need to select Record_ID and cust_no for only those rows for which the Record_Id satisfies one of the conditions below:
1. Only one cust account is open , there might be one or multiple closed customers
2. No open customer and only one closed customer
Expected output :
159045 2439123
159048 9513035
159049 2398524
159060 yyyyyy
A query like this would take each row as a single group and the count will come as 1
select RECORD_ID, CUST_NO, IsAccntClosed, count(IsAccntClosed), Code
from table1
group by RECORD_ID, CUST_NO, IsAccntClosed, Code
Any suggestions on how this query could be written to get the expected output?
Since IsAccntClosed is a CHAR column, you should be able to get what you want with a query like this:
SELECT a.record_id,
(CASE
WHEN a.CountOpen=1 THEN a.CustNoOpen
ELSE a.CustNoClosed
END) AS cust_no
FROM (
SELECT b.record_id,
MAX(CASE WHEN b.IsAccntClosed='N' THEN b.cust_no ELSE NULL END) AS CustNoOpen ,
SUM(CASE WHEN b.IsAccntClosed='N' THEN 1 ELSE 0 END) AS CountOpen ,
MAX(CASE WHEN b.IsAccntClosed='Y' THEN b.cust_no ELSE NULL END) AS CustNoClosed,
SUM(CASE WHEN b.IsAccntClosed='Y' THEN 1 ELSE 0 END) AS CountClosed
FROM table1 b
GROUP BY b.record_id
) a
WHERE a.CountOpen=1 OR (a.CountOpen=0 AND a.CountClosed=1)
The inner query is grouping the table. It counts the open and closed accounts and takes one (random) cust_no of any of the closed accounts and one (random) of any of the open accounts, per group.
The outer query filters the data and cleans everything up, placing the open or closed cust_no in the output result column.
Notice that the WHERE condition of the outer query has a useful side effect: since you are only keeping records that have a single open account or a single closed account, the cust_no values picked by the inner query are guaranteed to be the right ones.
EDIT: I fixed the query and tested it on SQLFiddle.
In the below I added another record:
insert into tbl values (159060, 'zzzzzz', 'N', 12);
to illustrate what would happen if a record_id has just one open cust_no and just one closed cust_no. Note how the cust_no returned in the result is the zzzzzz one, because that account is open, which you mentioned should take precedence over closed in the event of a 1:1 tie (zzzzzz wins over yyyyyy in this case, because yyyyyy is closed whereas zzzzzz is open).
Fiddle: http://sqlfiddle.com/#!4/5cb60/1/0
with one_open as
(select record_id
from tbl
where IsAccntClosed = 'N'
group by record_id
having count(distinct cust_no) = 1),
one_closed as
(select record_id
from tbl
where IsAccntClosed = 'Y'
group by record_id
having count(distinct cust_no) = 1),
bothy as
(select record_id from one_open intersect select record_id from one_closed)
select *
from tbl
where (record_id in (select record_id from one_open) and
IsAccntClosed = 'N')
or (record_id not in (select record_id from one_open) and
record_id in (select record_id from one_closed) and
IsAccntClosed = 'Y' and
record_id not in (select record_id from bothy))
select
record_id,
cust_no
from (
select
record_id,
cust_no,
count(case when isAccntClosed='Y' then 1 else null end)
over (partition by record_id) closed_accounts,
count(case when isAccntClosed='N' then 1 else null end)
over (partition by record_id) open_accounts
from
table1
)
where (open_accounts = 1)
or (open_accounts = 0 and closed_accounts = 1)
You can do this with conditional aggregation:
select RECORD_ID,
(case when sum(case when IsAccntClosed = 'N' then 1 else 0 end) = 1
then max(case when IsAccntClosed = 'N' then CUST_NO end)
else max(CUST_NO)
end) as cust_no
from table1
group by RECORD_ID
having sum(case when IsAccntClosed = 'N' then 1 else 0 end) = 1 or
(sum(case when IsAccntClosed = 'N' then 1 else 0 end) = 0 and
sum(case when IsAccntClosed = 'Y' then 1 else 0 end) = 1
)
It is probably easier to understand the logic using subqueries:
select record_id,
(case when numOpen > 0 then OpenCustNo else ClosedCustNo end) as CustNo
from (select t1.record_id, sum(case when IsAccntClosed = 'N' then 1 else 0 end) as numOpen,
sum(case when IsAccntClosed = 'Y' then 1 else 0 end) as numClosed,
max(case when IsAccntClosed = 'N' then cust_no end) as OpenCustNo,
max(case when IsAccntClosed = 'Y' then cust_no end) as ClosedCustNo
from table1 t1
group by t1.record_id
) r
where numOpen = 1 or (numOpen = 0 and numClosed = 1);

Oracle 11g SQL: Pivot Table -- Variable number of columns

I am trying to take a table and pivot the values of two columns to their own columns. The twist is that there can be variable numbers of entries per anchor. Here is a toy table:
CREATE TABLE ATTRS
( WIDGET VARCHAR2(15),
A_NAME VARCHAR2(15),
A_VALUE VARCHAR2(15)
);
INSERT INTO ATTRS VALUES ('BOOK','PAGES','1000');
INSERT INTO ATTRS VALUES ('BOOK','COLOR','GREEN');
INSERT INTO ATTRS VALUES ('BOOK','LAST','TWAIN');
INSERT INTO ATTRS VALUES ('BOOK','FIRST','MARK');
INSERT INTO ATTRS VALUES ('CELLPHONE','BRAND','SAMSUNG');
INSERT INTO ATTRS VALUES ('LAPTOP','BRAND','LENOVO');
INSERT INTO ATTRS VALUES ('LAPTOP','COLOR','BLACK');
INSERT INTO ATTRS VALUES ('LAPTOP','BATTERY','STANDARD');
I will know the maximum number of unique A_NAME that can occur (we'll let it be 4 in this example) and want output like this:
WIDGET | A_NAME1 | A_VALUE1 | A_NAME2 | A_VALUE2 | A_NAME3 | A_VALUE3 | A_NAME4 | A_VALUE4
------------------------------------------------------------------------------------------
BOOK | PAGES | '1000' | COLOR | 'GREEN' | LAST | 'TWAIN' | FIRST | 'MARK'
CELLPHONE | BRAND | 'SAMSUNG'| (null) | (null) | (null) | (null) | (null) | (null)
LAPTOP | COLOR | 'BLACK' | BRAND | 'LENOVO' | BATTERY | 'STANDARD' | (null) | (null)
Note that order does not matter, i.e. if two A_NAME are the same, they need not be in the same column.
Thanks.
I'd have to think for a minute about how to accomplish this with the new (in 11g) pivot operator. Knowing that there are at most 4 rows per widget, though, you can do an old-school pivot along the lines of
SELECT widget,
max(CASE WHEN rn = 1 THEN a_name ELSE NULL END) a_name1,
max(CASE WHEN rn = 1 THEN a_value ELSE NULL END) a_value1,
max(CASE WHEN rn = 2 THEN a_name ELSE NULL END) a_name2,
max(CASE WHEN rn = 2 THEN a_value ELSE NULL END) a_value2,
max(CASE WHEN rn = 3 THEN a_name ELSE NULL END) a_name3,
max(CASE WHEN rn = 3 THEN a_value ELSE NULL END) a_value3,
max(CASE WHEN rn = 4 THEN a_name ELSE NULL END) a_name4,
max(CASE WHEN rn = 4 THEN a_value ELSE NULL END) a_value4
FROM( SELECT widget,
a_name,
a_value,
row_number() over (partition by widget
order by a_name, a_value) rn
FROM attrs )
GROUP BY widget
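For reference, the same numbering trick can probably be fed to the 11g PIVOT operator as well. This is an untested sketch, not a drop-in replacement: the r1..r4 aliases are made up for illustration, and the output column names depend on how Oracle combines them with the aggregate aliases.

```sql
-- Sketch: pivot on the per-widget row number instead of on a_name itself
SELECT *
  FROM (SELECT widget,
               a_name,
               a_value,
               row_number() over (partition by widget
                                  order by a_name, a_value) rn
          FROM attrs)
 PIVOT (max(a_name)  AS a_name,
        max(a_value) AS a_value
        FOR rn IN (1 AS r1, 2 AS r2, 3 AS r3, 4 AS r4))
```

The result columns should come out named along the lines of R1_A_NAME, R1_A_VALUE, and so on, so they would still need renaming to match A_NAME1/A_VALUE1. Because the IN list is fixed, this shares the old-school version's limitation that the maximum number of attributes per widget must be known up front.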