Finding rows in SQL where changes but only certain changes while keeping others


I have a scenario where I want every active row to come back in my result set. An inactive row should come back only if it is the single record for that IDENTIFIER; if an IDENTIFIER has more than one active row, all of them should be shown. I tried the ROW_NUMBER function and then, in an outer query, filtered where the row number = 1, but that returns only the first row per IDENTIFIER and I lose some of the results I want. To restate: I want all active records to come back, and inactive records only where the IDENTIFIER is unique. The row that is bold should not be shown in the results.
1 has 1 active record in the DB.
2 has 2 active and 1 inactive records.
3 has no active records.
4 has only 2 active records, no inactive.

You can use a windowed conditional count. This has the benefit of scanning the table only once.
SELECT
t.IDENTIFIER,
t.DB_ID,
t.Status
FROM (
SELECT *,
HasActive = COUNT(CASE WHEN t.Status = 'Active' THEN 1 END) OVER (PARTITION BY t.IDENTIFIER)
FROM YourTable t
) t
WHERE t.Status = 'Active' OR t.HasActive = 0;
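To see what the windowed conditional count returns, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are hypothetical, made up to match the four cases in the question (the original table is not shown), and the T-SQL `HasActive = ...` alias is written as `AS HasActive` because SQLite does not accept the former. It assumes SQLite 3.25 or later for window function support.

```python
import sqlite3

# Hypothetical sample rows matching the four cases described in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (IDENTIFIER INTEGER, DB_ID INTEGER, Status TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        (1, 10, "Active"),
        (2, 20, "Active"),
        (2, 21, "Active"),
        (2, 22, "Inactive"),  # should be hidden: identifier 2 has active rows
        (3, 30, "Inactive"),  # should be kept: identifier 3 has no active rows
        (4, 40, "Active"),
        (4, 41, "Active"),
    ],
)

query = """
SELECT t.IDENTIFIER, t.DB_ID, t.Status
FROM (
    SELECT *,
           COUNT(CASE WHEN Status = 'Active' THEN 1 END)
               OVER (PARTITION BY IDENTIFIER) AS HasActive
    FROM YourTable
) AS t
WHERE t.Status = 'Active' OR t.HasActive = 0
ORDER BY t.IDENTIFIER, t.DB_ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this data, the inactive row for identifier 2 is filtered out and the inactive row for identifier 3 is kept.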

One way to do this is with NOT EXISTS:
SELECT t1.*
FROM tablename t1
WHERE t1.Status = 'Active'
OR NOT EXISTS (
SELECT 1
FROM tablename t2
WHERE t2.identifier = t1.identifier AND t2.db_id <> t1.db_id
);
I assume that the column db_id is unique, at least for the same identifier.
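One consequence of this NOT EXISTS form is worth checking against your requirements: an inactive row is returned only when it is the sole row for its identifier. The sketch below runs the same query through Python's sqlite3 module on hypothetical data, including an identifier 5 (my addition, not from the question) that has two inactive rows and no active ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (identifier INTEGER, db_id INTEGER, Status TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [
        (1, 10, "Active"),
        (2, 20, "Active"),
        (2, 22, "Inactive"),  # another row exists for identifier 2 -> excluded
        (3, 30, "Inactive"),  # only row for identifier 3 -> included
        (5, 50, "Inactive"),  # hypothetical: two inactive rows, no active row
        (5, 51, "Inactive"),
    ],
)

query = """
SELECT t1.identifier, t1.db_id, t1.Status
FROM tablename t1
WHERE t1.Status = 'Active'
   OR NOT EXISTS (
       SELECT 1
       FROM tablename t2
       WHERE t2.identifier = t1.identifier AND t2.db_id <> t1.db_id
   )
ORDER BY t1.identifier, t1.db_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Both rows for identifier 5 are dropped here because each one has a sibling row. A query that tests "no active rows for this identifier" instead would keep them, so the choice depends on which reading of "unique" you need.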

If I understood you correctly, this is my variant.
select IDENTIFIER, [DB_ID], [Status]
from Tab
where [Status]='Active'
union
select IDENTIFIER, [DB_ID], [Status]
from Tab as t
where [Status]='Inactive' And 1=(select Count(*) from Tab where
IDENTIFIER=t.IDENTIFIER)
Order by IDENTIFIER, [DB_ID]

You can do it like this, because (rank = 1 and Status = 'Inactive') occurs only if there are no active rows for a particular identifier ('Active' sorts before 'Inactive'). Note that most databases require an alias on the derived table:
select * from (
select *,
DENSE_RANK() OVER (PARTITION BY identifier order by status) AS rank
from some_table
) t
where rank=1 or status = 'Active'

Related

SQL to query historical table that the count of the number of times in the column is 1

I'm not even sure what to call this type of query and that's why the title might be misleading. Here's what I want to do. We have a history table that goes like this
id, mod_date, is_active
1, 2022-06-22:12:00:00, 1
1, 2022-06-22:13:00:00, 0
2, 2022-06-22:12:00:00, 0
3, 2022-07-07:00:00:00, 1
is_active means that the record was made active. For example, row 1 was made active at 2022-06-22:12:00:00 and then was made inactive at 13:00:00.
What I want is to get only the row that was made inactive on a specific day and not made active again on that day. I came up with this query
select distinct(id)
from history ah
where is_active = 0
and cast(ah.mod_date as date) = '2022-06-22'
It returns 1 and 2, but I only want 2 because 1 was toggled between states. So I want to find all ids that were made inactive on a specific day and were never made active on that day, i.e. with no toggling on the same day.

You may phrase this using exists logic:
SELECT *
FROM history h1
WHERE is_active = 0 AND mod_date::date = '2022-06-22' AND
NOT EXISTS (SELECT 1
FROM history h2
WHERE h2.mod_date::date = '2022-06-22' AND
h2.id = h1.id AND h2.is_active = 1);
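As a quick check of the NOT EXISTS approach against the sample rows from the question, here is a sketch using Python's sqlite3 module. Two things are assumptions on my part: the timestamps are stored as 'YYYY-MM-DD HH:MM:SS' text, and SQLite's date() function stands in for the PostgreSQL `::date` cast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (id INTEGER, mod_date TEXT, is_active INTEGER)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?)",
    [
        (1, "2022-06-22 12:00:00", 1),
        (1, "2022-06-22 13:00:00", 0),
        (2, "2022-06-22 12:00:00", 0),
        (3, "2022-07-07 00:00:00", 1),
    ],
)

day = "2022-06-22"
query = """
SELECT h1.id, h1.mod_date
FROM history h1
WHERE h1.is_active = 0
  AND date(h1.mod_date) = ?
  AND NOT EXISTS (
      SELECT 1
      FROM history h2
      WHERE h2.id = h1.id
        AND date(h2.mod_date) = ?
        AND h2.is_active = 1
  )
"""
# The same day is bound twice: once for the outer row, once for the subquery.
rows = conn.execute(query, (day, day)).fetchall()
print(rows)
```

Only id 2 survives: id 1 has an activation on the same day, and id 3 has no row on that day at all.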

Count how many times an id has been activated and deactivated in a day. From the result select the ones that have been deactivated once and activated zero times.
with the_historical_table(id, mod_date, is_active) as
(
values
(1, '2022-06-22:12:00:00', 1),
(1, '2022-06-22:13:00:00', 0),
(2, '2022-06-22:12:00:00', 0),
(3, '2022-07-07:00:00:00', 1)
)
select id, mod_date from
(
select id, mod_date::date,
count(*) filter (where is_active = 1) activated,
count(*) filter (where is_active = 0) deactivated
from the_historical_table
group by id, mod_date::date
) t
where activated = 0 and deactivated = 1;
Result:
id, mod_date
2, 2022-06-22

Quoting the question: "What I want is to get only the row that was made inactive on a specific day and not made active again on that day."
Partition the rows with: partition by id, mod_date::date order by id, mod_date
If the ordered is_active values in a partition are 1, 0, 1, then for the middle row (is_active = 0) both lag and lead are 1. You don't want this situation in the partition.
Consider 3 cases:
The partition has only one row with is_active = 0, which means both lead and lag are NULL.
The partition has multiple rows.
The partition has multiple rows, ordered as multiple 1s followed by multiple 0s.
The following code computes a result for each of these 3 cases and then combines them with UNION ALL.
WITH cte AS (
SELECT
*,
lag(is_active, 1) OVER w,
lead(is_active, 1) OVER w,
first_value(is_active) OVER (PARTITION BY id,
mod_date::date ORDER BY id,
mod_date DESC)
FROM test1
WINDOW w AS (PARTITION BY id,
mod_date::date ORDER BY id,
mod_date)) (
SELECT
id,
mod_date,
is_active
FROM
cte
WHERE (lead = 0
OR lead IS NULL)
AND (lag = 1)
AND is_active = 0
ORDER BY
id,
mod_date)
UNION ALL (
SELECT
id,
mod_date,
is_active
FROM
cte
WHERE
lead IS NULL
AND lag IS NULL
AND is_active = 0)
UNION ALL (
SELECT
id,
mod_date,
is_active
FROM
cte
WHERE
lead = 0
AND lag IS NULL
AND is_active = 0
AND first_value != 1)
ORDER BY
id,
mod_date;

SQL Select row depending on values in different columns

I've already found so many answers here but now I can't seem to find any to my specific problem.
I can't figure out how to select a value from a row depending on the value in different columns
with the below table, I want to achieve the following results.
in case the value in column stdvpuni = 1 then return values / contents from this row for the article (column art).
in case the value in column stdvpuni = 0 then return values / contents from the row where STDUNIABG = 1 for this article (column art).

You seem to want one row per art, based on the content of other rows. That suggests using row_number():
select t.*
from (select t.*,
row_number() over (partition by art order by stdvpuni desc, STDUNIABG desc) as seqnum
from t
) t
where seqnum = 1;
You don't specify what to do if neither column is 1. You might want a where clause (where 1 in (stdvpuni, STDUNIABG)) or another condition in the order by.
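To illustrate the row_number() ordering together with the suggested filter, here is a small sketch using Python's sqlite3 module. The question's table is not shown, so the rows and the `unit` column are hypothetical. It assumes SQLite 3.25 or later for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (art TEXT, unit TEXT, stdvpuni INTEGER, STDUNIABG INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("A", "PCS", 1, 0),  # stdvpuni = 1 -> this row wins for article A
        ("A", "BOX", 0, 1),
        ("B", "PCS", 0, 0),
        ("B", "BOX", 0, 1),  # no stdvpuni = 1 for B -> STDUNIABG = 1 row wins
        ("C", "PCS", 0, 0),  # neither flag set -> removed by the filter
    ],
)

query = """
SELECT art, unit
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY art
               ORDER BY stdvpuni DESC, STDUNIABG DESC
           ) AS seqnum
    FROM t
    WHERE 1 IN (stdvpuni, STDUNIABG)
) AS ranked
WHERE seqnum = 1
ORDER BY art
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Without the inner WHERE clause, article C would still return its only row, which may or may not be what you want.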

I do not know what values / contents is, but I suppose that's easy for you to figure out. So, I will focus on the way to select this:
SELECT
CASE
WHEN current.stdvpuni = 1 THEN 'values / contents of current row'
ELSE 'values / contents of other row'
END
FROM yourtable current
JOIN yourtable other
ON other.stdvpuni = 1;

Use your conditions with NOT EXISTS in the WHERE clause:
SELECT t1.*
FROM tablename t1
WHERE t1.STDVPUNI = 1
OR (
t1.STDVPUNI = 0 AND t1.STDUNIABG = 1
AND NOT EXISTS (SELECT 1 FROM tablename t2 WHERE t2.ART = t1.ART AND t2.STDVPUNI = 1)
);

Update one row based on value in another row

I have a table with the following columns and values:
SubscriptionName, Status, Ignore
Project Plan 3 for faculty, Enabled, Null
Project Plan 3 for faculty, Suspended, Null
How can I update the Ignore column to True for the suspended record if there are 2 entries with the same SubscriptionName and the other record has the value Enabled in Status?

In SQL Server, you can do this with window functions and an updatable CTE:
with cte as (
select
t.*,
max(case when status = 'Enabled' then 1 end)
over(partition by SubscriptionName) has_enabled
from mytable t
)
update cte
set ignore = 'True'
where status = 'Suspended' and has_enabled = 1
The conditional window max() checks if another row exists with the same SubscriptionName and status 'Enabled'.
Or you can use exists:
update t
set ignore = 'True'
from mytable t
where
status = 'Suspended'
and exists (
select 1
from mytable t1
where t1.SubscriptionName = t.SubscriptionName and t1.status = 'Enabled'
)
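The EXISTS variant can be tried outside SQL Server as well. The sketch below uses Python's sqlite3 module; SQLite does not accept the T-SQL `update t ... from mytable t` form, so the correlated subquery refers to the target table by name instead. The third row ('Hypothetical Plan B') is invented to show a suspended record that has no enabled sibling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "Ignore" is quoted because IGNORE is a keyword in SQLite.
conn.execute('CREATE TABLE mytable (SubscriptionName TEXT, Status TEXT, "Ignore" TEXT)')
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("Project Plan 3 for faculty", "Enabled", None),
        ("Project Plan 3 for faculty", "Suspended", None),
        ("Hypothetical Plan B", "Suspended", None),  # no Enabled row for this name
    ],
)

conn.execute(
    """
    UPDATE mytable
    SET "Ignore" = 'True'
    WHERE Status = 'Suspended'
      AND EXISTS (
          SELECT 1
          FROM mytable AS t1
          WHERE t1.SubscriptionName = mytable.SubscriptionName
            AND t1.Status = 'Enabled'
      )
    """
)

rows = conn.execute(
    'SELECT SubscriptionName, Status, "Ignore" FROM mytable ORDER BY rowid'
).fetchall()
for row in rows:
    print(row)
```

Only the suspended row that shares its SubscriptionName with an enabled row is flagged.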

Comparing two max dates with a condition in Oracle SQL

I have the data as below
ID date state
1 24-Aug-18 Not defined
1 23-Aug-18 Incorrect
1 22-Aug-18 Incorrect
1 21-Aug-18 Incorrect
1 1-Aug-18 Correct
1 23-Jul-17 Incorrect
1 22-Jul-17 Incorrect
1 21-Jul-17 Incorrect
1 10-Jul-17 Correct
Record 1 can stay in the Incorrect state for 3 days; after that it goes to 'Not defined' (unless an update is made to the record, in which case it goes back to Correct). The 'Not defined' state has to be avoided. I need a query that identifies the earliest date of the latest period in which the record went to the Incorrect state, i.e. in this case 21-Aug-2018. A further problem is that the table doesn't have unique keys.
I have been trying the below code but it is throwing me the error
'ORA-01427: single-row subquery returns more than one row'
select id, min(date) from table where state = 'Incorrect' group by id having
((Select trunc(MAX (date)) from table where state = 'Incorrect'
group by id) >= (select trunc(Max (date)) from table where state = 'Correct'
group by id))

Hmmm, I think this does what you want:
select id, min(date) as min_latest_incorrect_date
from (select t.*,
max(case when state = 'Correct' then date end) over (partition by id) as max_date_correct
from t
) t
where (date > max_date_correct or max_date_correct is null) and
state = 'Incorrect'
group by id

Per ID, you are looking for incorrect records that are not followed by any correct record. Of these take the first.
select id, min(date)
from mytable i
where state = 'Incorrect'
and not exists
(
select *
from mytable c
where c.id = i.id
and c.state = 'Correct'
and c.date > i.date
)
group by id
order by id;
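The NOT EXISTS query can be checked against the rows from the question with Python's sqlite3 module. In this sketch the date column is renamed to `dt` and stored as ISO 'YYYY-MM-DD' text (both are my assumptions for SQLite, not part of the Oracle original), so plain string comparison orders the dates correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, dt TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "2018-08-24", "Not defined"),
        (1, "2018-08-23", "Incorrect"),
        (1, "2018-08-22", "Incorrect"),
        (1, "2018-08-21", "Incorrect"),
        (1, "2018-08-01", "Correct"),
        (1, "2017-07-23", "Incorrect"),
        (1, "2017-07-22", "Incorrect"),
        (1, "2017-07-21", "Incorrect"),
        (1, "2017-07-10", "Correct"),
    ],
)

query = """
SELECT i.id, MIN(i.dt)
FROM mytable i
WHERE i.state = 'Incorrect'
  AND NOT EXISTS (
      SELECT 1
      FROM mytable c
      WHERE c.id = i.id
        AND c.state = 'Correct'
        AND c.dt > i.dt
  )
GROUP BY i.id
ORDER BY i.id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The 2017 Incorrect rows are excluded because a later Correct row exists, so the minimum is taken over the August 2018 rows only.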

Replace NULL with values

Here is my challenge:
I have a log table which every time a record is changed adds a new record but puts a NULL value for each non-changed value in each record. In other words only the changed value is set, the rest unchanged fields in each row simply has a NULL value.
Now I would like to replace each NULL value with the value above it that is NOT a NULL value like below:
Source table: Task_log
ID Owner Status Flag
1 Bob Registrar T
2 Sue NULL NULL
3 NULL NULL F
4 Frank Admission T
5 NULL NULL F
6 NULL NULL T
Desired output table: Task_log
ID Owner Status Flag
1 Bob Registrar T
2 Sue Registrar T
3 Sue Registrar F
4 Frank Admission T
5 Frank Admission F
6 Frank Admission T
How do I write a query which will generate the desired output table?

One of the new window functions in SQL Server 2012 is FIRST_VALUE, which has a fairly self-explanatory name and can be partitioned through the OVER clause. Before using it, each column has to be divided into blocks of data; a block for a column begins where a non-NULL value is found.
With Block As (
Select ID
, Owner
, OBlockID = SUM(Case When Owner Is Null Then 0 Else 1 End)
OVER (ORDER BY ID)
, Status
, SBlockID = SUM(Case When Status Is Null Then 0 Else 1 End)
OVER (ORDER BY ID)
, Flag
, FBlockID = SUM(Case When Flag Is Null Then 0 Else 1 End)
OVER (ORDER BY ID)
From Task_log
)
Select ID
, Owner = FIRST_VALUE(Owner) OVER (PARTITION BY OBlockID ORDER BY ID)
, Status = FIRST_VALUE(Status) OVER (PARTITION BY SBlockID ORDER BY ID)
, Flag = FIRST_VALUE(Flag) OVER (PARTITION BY FBlockID ORDER BY ID)
FROM Block
The UPDATE query is easily derived
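Since the demo link is not reproduced here, the block technique can be verified with a short sketch using Python's sqlite3 module and the Task_log rows from the question. The T-SQL `Name = expression` aliases are rewritten as `expression AS Name` for SQLite, and SQLite 3.25 or later is assumed for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Task_log (ID INTEGER, Owner TEXT, Status TEXT, Flag TEXT)")
conn.executemany(
    "INSERT INTO Task_log VALUES (?, ?, ?, ?)",
    [
        (1, "Bob", "Registrar", "T"),
        (2, "Sue", None, None),
        (3, None, None, "F"),
        (4, "Frank", "Admission", "T"),
        (5, None, None, "F"),
        (6, None, None, "T"),
    ],
)

# A running count of non-NULL values assigns each row to the block that was
# opened by the most recent non-NULL value in that column.
query = """
WITH Block AS (
    SELECT ID, Owner, Status, Flag,
           SUM(CASE WHEN Owner  IS NULL THEN 0 ELSE 1 END) OVER (ORDER BY ID) AS OBlockID,
           SUM(CASE WHEN Status IS NULL THEN 0 ELSE 1 END) OVER (ORDER BY ID) AS SBlockID,
           SUM(CASE WHEN Flag   IS NULL THEN 0 ELSE 1 END) OVER (ORDER BY ID) AS FBlockID
    FROM Task_log
)
SELECT ID,
       FIRST_VALUE(Owner)  OVER (PARTITION BY OBlockID ORDER BY ID) AS Owner,
       FIRST_VALUE(Status) OVER (PARTITION BY SBlockID ORDER BY ID) AS Status,
       FIRST_VALUE(Flag)   OVER (PARTITION BY FBlockID ORDER BY ID) AS Flag
FROM Block
ORDER BY ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each column gets its own block id because the NULL runs differ per column: Status has two blocks here while Flag has five.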

As I mentioned in my comment, I would try to fix the process that is creating the records rather than fixing the junk data. If that is not an option, the code below should get you pointed in the right direction.
UPDATE t1
set t1.owner = COALESCE(t1.owner, t2.owner),
t1.Status = COALESCE(t1.status, t2.status),
t1.Flag = COALESCE(t1.flag, t2.flag)
FROM Task_log as t1
INNER JOIN Task_log as t2
ON t1.id = (t2.id + 1)
where t1.owner is null
OR t1.status is null
OR t1.flag is null

I can think of several approaches.
You could use a combination of COALESCE with an array aggregate function. Unfortunately it doesn't look like SQL Server supports array_agg natively (although some nice people have developed some workarounds).
You could also use a subselect for each column.
SELECT id,
(SELECT TOP 1 FROM (SELECT owner FROM ... WHERE id = outer_id AND owner IS NOT NULL order by ID desc )) AS owner,
-- other columns
You could probably do something with window functions, too.

A vanilla solution would be:
select id
, owner
, coalesce(owner, ( select owner from t t2
where id = (select max(id) from t t3
where id < t1.id and owner is not null))
) as new_owner
, flag
, coalesce(flag, ( select flag from t t2
where id = (select max(id) from t t3
where id < t1.id and flag is not null))
) as new_flag
from t t1
Rather inefficient, but should work on most DBMS
