SQL to query historical table that the count of the number of times in the column is 1 - sql

I'm not even sure what to call this type of query and that's why the title might be misleading. Here's what I want to do. We have a history table that goes like this
id, mod_date, is_active
1, 2022-06-22:12:00:00, 1
1, 2022-06-22:13:00:00, 0
2, 2022-06-22:12:00:00, 0
3, 2022-07-07:00:00:00, 1
is_active means that the record was made active. For example, row 1 was made active at 2022-06-22:12:00:00 and then was made inactive at 13:00:00.
What I want is to get only the row that was made inactive on a specific day and not made active again on that day. I came up with this query
select distinct(id)
from history
where is_active = 0
and cast(mod_date as date) = '2022-06-22'
It would return 1 and 2. But I only want 2, because 1 was toggled between states. So I want to find only the ids that were made inactive on a specific day and were never made active on that day, with no toggling on the same day.

You may phrase this using exists logic:
SELECT *
FROM history h1
WHERE is_active = 0 AND mod_date::date = '2022-06-22' AND
NOT EXISTS (SELECT 1
FROM history h2
WHERE h2.mod_date::date = '2022-06-22' AND
h2.id = h1.id AND h2.is_active = 1);
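To try the NOT EXISTS idea without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database, the deactivated_only helper and the date() call (in place of the ::date cast) are my assumptions, and the timestamps are rewritten with a space so SQLite can parse them.

```python
import sqlite3

# Sample rows from the question; timestamps use "YYYY-MM-DD HH:MM:SS"
# (an assumption, so that SQLite's date() function can parse them).
ROWS = [
    (1, "2022-06-22 12:00:00", 1),
    (1, "2022-06-22 13:00:00", 0),
    (2, "2022-06-22 12:00:00", 0),
    (3, "2022-07-07 00:00:00", 1),
]


def deactivated_only(day):
    """Return ids deactivated on `day` that have no activation on that day."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE history (id INTEGER, mod_date TEXT, is_active INTEGER)"
    )
    con.executemany("INSERT INTO history VALUES (?, ?, ?)", ROWS)
    sql = """
        SELECT DISTINCT h1.id
        FROM history h1
        WHERE h1.is_active = 0
          AND date(h1.mod_date) = ?
          AND NOT EXISTS (
                SELECT 1
                FROM history h2
                WHERE h2.id = h1.id
                  AND date(h2.mod_date) = ?
                  AND h2.is_active = 1)
        ORDER BY h1.id
    """
    result = [row[0] for row in con.execute(sql, (day, day))]
    con.close()
    return result


print(deactivated_only("2022-06-22"))  # [2]
```

Id 1 is dropped because it also has an is_active = 1 row on the same day, which is exactly what the correlated subquery checks.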

Count how many times an id has been activated and deactivated in a day. From the result select the ones that have been deactivated once and activated zero times.
with the_historical_table(id, mod_date, is_active) as
(
values
(1, '2022-06-22:12:00:00', 1),
(1, '2022-06-22:13:00:00', 0),
(2, '2022-06-22:12:00:00', 0),
(3, '2022-07-07:00:00:00', 1)
)
select id, mod_date from
(
select id, mod_date::date,
count(*) filter (where is_active = 1) activated,
count(*) filter (where is_active = 0) deactivated
from the_historical_table
group by id, mod_date::date
) t
where activated = 0 and deactivated = 1;
Result:
id mod_date
-- ----------
2 2022-06-22
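The same counting idea can be checked locally with Python's sqlite3 module. This is only a sketch: the history table name is assumed, SUM(CASE ...) stands in for the PostgreSQL FILTER clause, and date() replaces the ::date cast.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE history (id INTEGER, mod_date TEXT, is_active INTEGER)")
con.executemany(
    "INSERT INTO history VALUES (?, ?, ?)",
    [
        (1, "2022-06-22 12:00:00", 1),
        (1, "2022-06-22 13:00:00", 0),
        (2, "2022-06-22 12:00:00", 0),
        (3, "2022-07-07 00:00:00", 1),
    ],
)

# Per id and day: count activations and deactivations, then keep the
# groups with zero activations and exactly one deactivation.
SQL = """
SELECT id, day
FROM (
    SELECT id,
           date(mod_date) AS day,
           SUM(CASE WHEN is_active = 1 THEN 1 ELSE 0 END) AS activated,
           SUM(CASE WHEN is_active = 0 THEN 1 ELSE 0 END) AS deactivated
    FROM history
    GROUP BY id, date(mod_date)
) AS t
WHERE activated = 0 AND deactivated = 1
ORDER BY id, day
"""
once_deactivated = con.execute(SQL).fetchall()
print(once_deactivated)  # [(2, '2022-06-22')]
```

Unlike the NOT EXISTS version, this one also rejects an id that was deactivated more than once on the same day, because it requires deactivated = 1.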

Quoting the question: "What I want is to get only the row that was made inactive on a specific day and not made active again on that day."
Partition the rows like this: partition by id, mod_date::date order by id, mod_date
If the ordered is_active values in a partition are 1 0 1, then for the middle row (the 0) both lead and lag are 1. You don't want this situation in the partition.
Consider 3 cases:
1. The partition has only one row with is_active = 0, which means both lead and lag are NULL.
2. The partition has multiple rows.
3. The partition has multiple rows, and the ordered set is multiple 1s followed by multiple 0s.
The following code computes a result for each of these 3 cases and then combines them with union all.
WITH cte AS (
SELECT
*,
lag(is_active, 1) OVER w,
lead(is_active, 1) OVER w,
first_value(is_active) OVER (PARTITION BY id,
mod_date::date ORDER BY id,
mod_date DESC)
FROM test1
WINDOW w AS (PARTITION BY id,
mod_date::date ORDER BY id,
mod_date)) (
SELECT
id,
mod_date,
is_active
FROM
cte
WHERE (lead = 0
OR lead IS NULL)
AND (lag = 1)
AND is_active = 0
ORDER BY
id,
mod_date)
UNION ALL (
SELECT
id,
mod_date,
is_active
FROM
cte
WHERE
lead IS NULL
AND lag IS NULL
AND is_active = 0)
UNION ALL (
SELECT
id,
mod_date,
is_active
FROM
cte
WHERE
lead = 0
AND lag IS NULL
AND is_active = 0
AND first_value != 1)
ORDER BY
id,
mod_date;

Related

Finding rows in SQL where changes but only certain changes while keeping others

I have this scenario where I want each occurrence of an active row to bring back that row in my result set and also inactive if there is only 1 inactive record for that IDENTIFIER and also if there are more than 1 active also show those. I've used Row_Number function and then in another query show where the row = '1' but if I do that row 1s only come back and then I lose some of my desired results. To restate my issue is I want all active records to come back and only inactive where IDENTIFIER is unique. The row that is bold should not be shown in the results.
1 has 1 active record in the DB.
2 has 2 active and 1 inactive records.
3 has no active records.
4 has only 2 active records, no inactive.
You can use a windowed conditional count. This has the benefit of only scanning the table once:
SELECT
t.IDENTIFIER,
t.DB_ID,
t.Status
FROM (
SELECT *,
HasActive = COUNT(CASE WHEN t.Status = 'Active' THEN 1 END) OVER (PARTITION BY t.IDENTIFIER)
FROM YourTable t
) t
WHERE t.Status = 'Active' OR t.HasActive = 0;
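For anyone who wants to see the windowed conditional count in action, here is a small sketch using Python's sqlite3 module (window functions need SQLite 3.25 or newer). The records table and the DB_ID values are made up by me to match the four cases described in the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE records (identifier INTEGER, db_id INTEGER, status TEXT)")
con.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [
        (1, 10, "Active"),    # 1 has one active record
        (2, 20, "Active"),    # 2 has two active and one inactive record
        (2, 21, "Inactive"),
        (2, 22, "Active"),
        (3, 30, "Inactive"),  # 3 has no active records
        (4, 40, "Active"),    # 4 has two active records, no inactive
        (4, 41, "Active"),
    ],
)

# COUNT ignores NULL, so the CASE without ELSE counts only active rows
# within each identifier partition.
SQL = """
SELECT identifier, db_id, status
FROM (
    SELECT *,
           COUNT(CASE WHEN status = 'Active' THEN 1 END)
               OVER (PARTITION BY identifier) AS has_active
    FROM records
) AS t
WHERE status = 'Active' OR has_active = 0
ORDER BY identifier, db_id
"""
visible = con.execute(SQL).fetchall()
for row in visible:
    print(row)
```

Only row (2, 21, 'Inactive') is filtered out, since identifier 2 has active rows; identifier 3 keeps its inactive row because its active count is 0.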
One way to do this is with NOT EXISTS:
SELECT t1.*
FROM tablename t1
WHERE t1.Status = 'Active'
OR NOT EXISTS (
SELECT 1
FROM tablename t2
WHERE t2.identifier = t1.identifier AND t2.db_id <> t1.db_id
);
I assume that the column db_id is unique, at least for the same identifier.
If I understood you correctly, this is my variant.
select IDENTIFIER, [DB_ID], [Status]
from Tab
where [Status]='Active'
union
select IDENTIFIER, [DB_ID], [Status]
from Tab as t
where [Status]='Inactive' And 1=(select Count(*) from Tab where
IDENTIFIER=t.IDENTIFIER)
Order by IDENTIFIER, [DB_ID]
You can do it like this, because a row has rank = 1 and Status = 'Inactive' only if there are no active rows for that identifier:
select * from (
select *,
DENSE_RANK() OVER (PARTITION BY identifier order by status) AS rank
from some_table
) t
where rank=1 or status = 'Active'

Check whether an employee is present on three consecutive days

I have a table called tbl_A with the following schema:
After insert, I have the following data in tbl_A:
Now the question is how to write a query for the following scenario:
Put (1) in front of any employee who was present three days consecutively
Put (0) in front of employee who was not present three days consecutively
The output screenshot:
I think we should use a CASE statement, but I am not able to check for three consecutive days from the date. I hope someone can help me with this.
Thank you
select name, case when max(cons_days) >= 3 then 1 else 0 end as presence
from (
select name, count(*) as cons_days
from tbl_A, (values (0),(1),(2)) as a(dd)
group by name, adate + dd
)x
group by name
This uses a self-join on name, restricted to rows with available = 'Y'. For each date adate of a given name, the inner query counts that employee's entries whose date lies between adate and adate + 2, i.e. adate itself, adate + 1 and adate + 2. If all 3 entries are present, the count for that adate is 3, and the outer query sets the flag to 1 for any name that reaches a count of 3. Try the query below:
SELECT Z.NAME,
CASE WHEN MAX(Z.CONSEQ_AVAIL) >= 3 THEN 1 ELSE 0 END AS YOUR_FLAG
FROM
(
-- one row per name and start date: entries within [ADATE, ADATE + 2]
SELECT A.NAME, A.ADATE,
SUM(CASE WHEN B.ADATE >= A.ADATE AND B.ADATE <= A.ADATE + 2 THEN 1 ELSE 0 END) AS CONSEQ_AVAIL
FROM
TBL_A A INNER JOIN TBL_A B
ON A.NAME = B.NAME AND A.AVAILABLE = 'Y' AND B.AVAILABLE = 'Y'
GROUP BY A.NAME, A.ADATE
) Z
GROUP BY Z.NAME;
Due to the complexity of the problem, I have not been able to test it out. If something is really wrong, please let me know and I will be happy to take down my answer.
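Since the schema and data were only posted as images, here is a sketch with made-up rows that exercises the "count entries between adate and adate + 2" idea in SQLite through Python's sqlite3 module. The attendance table, the names and the dates are all hypothetical, and date(x, '+2 days') is SQLite's way of writing adate + 2.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE attendance (name TEXT, adate TEXT, available TEXT)")
con.executemany(
    "INSERT INTO attendance VALUES (?, ?, ?)",
    [
        # Ann: present three days in a row
        ("Ann", "2023-01-01", "Y"), ("Ann", "2023-01-02", "Y"), ("Ann", "2023-01-03", "Y"),
        # Bob: the absence on the 2nd breaks the run
        ("Bob", "2023-01-01", "Y"), ("Bob", "2023-01-02", "N"),
        ("Bob", "2023-01-03", "Y"), ("Bob", "2023-01-04", "Y"),
        # Cy: two separate two-day runs, never three in a row
        ("Cy", "2023-01-01", "Y"), ("Cy", "2023-01-02", "Y"),
        ("Cy", "2023-01-04", "Y"), ("Cy", "2023-01-05", "Y"),
    ],
)

# For every present day, count the present days in the 3-day window that
# starts there; an employee qualifies if any window is completely filled.
SQL = """
SELECT name,
       CASE WHEN MAX(days_in_window) >= 3 THEN 1 ELSE 0 END AS presence
FROM (
    SELECT a.name, a.adate, COUNT(DISTINCT b.adate) AS days_in_window
    FROM attendance a
    JOIN attendance b
      ON b.name = a.name
     AND b.available = 'Y'
     AND b.adate BETWEEN a.adate AND date(a.adate, '+2 days')
    WHERE a.available = 'Y'
    GROUP BY a.name, a.adate
) AS windows
GROUP BY name
ORDER BY name
"""
presence = con.execute(SQL).fetchall()
print(presence)  # [('Ann', 1), ('Bob', 0), ('Cy', 0)]
```

Note that an employee with no 'Y' rows at all would not appear in this output; a left join from the list of names would be needed to show them with 0.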
--Below is My Approach
select Name,
Case WHen Max_Count>=3 Then 1 else 0 end as Presence
from
(
Select Name,MAx(Coun) as Max_Count
from
(
select Name, (count(*) over (partition by Name,Ref_Date)) as Coun from
(
select Name,adate + row_number() over (partition by Name order by Adate desc) as Ref_Date
from temp
where available='Y'
)
) group by Name
);
select name as employee, case when sum(diff) >= 3 then 1 else 0 end as presence
from
(select id, name, Available, Adate,
lead(Adate, 1) over (partition by name order by Adate) as lead,
-- diff = 1 when the same employee's next present date is the following day
case when datediff(day, Adate, lead(Adate, 1) over (partition by name order by Adate)) = 1 then 1 else 0 end as diff
from table_A
where Available = 'Y') A
group by name;

How to get data based on Case condition and MAX Date

I have some data:
CREATE TABLE #table (RID VARCHAR(10),
CommType INT,
CommunicationType INT,
VALUE VARCHAR(20),
lastDate Datetime)
INSERT INTO #table (RID, CommType, CommunicationType, VALUE, lastDate)
VALUES
('00WAAS', 3, 0, 'mohan#gmail', '2012-06-15 15:23:49.653'),
('00WAAS', 3, 1, 'manasa#gmail', '2015-08-15 15:23:49.653'),
('00WAAS', 3, 2, 'mother#gmail', '2014-09-15 15:23:49.653'),
('00WAAS', 3, 2, 'father#gmail', '2016-01-15 15:23:49.653'),
('00WAAS', 3, 0, 'hello#gmail', '2013-01-15 15:23:49.653')
My query:
SELECT
TT.RID,
COALESCE(Homemail, BusinessMail, OtherMail) Mail
FROM
(SELECT
RID, MAX(Homemail) Homemail,
MAX(BusinessMail) BusinessMail,
MAX(OtherMail) OtherMail
FROM
(SELECT
RID,
CASE
WHEN CommType = 3 AND CommunicationType = 0 THEN VALUE
END AS Homemail,
CASE
WHEN CommType = 3 AND CommunicationType = 1 THEN VALUE
END AS BusinessMail,
CASE
WHEN CommType = 3 AND CommunicationType = 2 THEN VALUE
END AS OtherMail,
lastDate
FROM
#table) T
GROUP BY RID) TT
What I'm expecting
If data exists for CommType = 3 and CommunicationType = 0, I need the related value with the latest date. If no data is available for CommType = 3 and CommunicationType = 0, I need the value for CommunicationType = 1 with the latest date, and if there is no data for CommunicationType = 1 either, the value for CommunicationType = 2 with the latest date of that CommunicationType.
Here I have tried a CASE condition, MAX and COALESCE.
In short:
If data is present for CommunicationType = 0, get CommunicationType = 0 based on the latest date.
If data is not present for CommunicationType = 0, get CommunicationType = 1 based on the latest date.
If data is not present for CommunicationType = 1, get CommunicationType = 2 based on the latest date.
I'm not entirely sure I've understood the requirement. But I think you want:
One record returned for each RID.
The returned record should have a CommType of 3.
If there is more than one record with a CommType 3 you want the record with the lowest CommunicationType.
If there is still more than one record you want the one with the most recent lastDate.
This query uses the window function ROW_NUMBER to rank the available records within a subquery. PARTITION BY ensures each RID is ranked separately. The outer query returns all records with a rank of 1.
Query
SELECT
r.*
FROM
(
/* For each RID We want the lowest communication type with
* the most recent last date.
*/
SELECT
ROW_NUMBER() OVER (PARTITION BY RID ORDER BY CommunicationType, lastDate DESC) AS rn,
*
FROM
#table
WHERE
CommType = 3
) AS r
WHERE
r.rn = 1
;
Next Steps
This query is ok but could be better. For example what would happen if two records had a matching CommType, CommunicationType and lastDate? Reading up on the differences between ROW_NUMBER, RANK, DENSE_RANK and NTILE will help you figure out your options here.
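As a rough illustration of the ranking approach and of one way to handle ties, here is a sketch using Python's sqlite3 module (SQLite 3.25+ for ROW_NUMBER). The comm table name and snake_case columns are my own, the second RID (00XYZ1) is a hypothetical addition to show the fallback to CommunicationType 1, and adding value as a final sort key is just one possible tie-breaker.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE comm (rid TEXT, comm_type INTEGER, "
    "communication_type INTEGER, value TEXT, last_date TEXT)"
)
con.executemany(
    "INSERT INTO comm VALUES (?, ?, ?, ?, ?)",
    [
        # rows from the question
        ("00WAAS", 3, 0, "mohan#gmail", "2012-06-15 15:23:49.653"),
        ("00WAAS", 3, 1, "manasa#gmail", "2015-08-15 15:23:49.653"),
        ("00WAAS", 3, 2, "mother#gmail", "2014-09-15 15:23:49.653"),
        ("00WAAS", 3, 2, "father#gmail", "2016-01-15 15:23:49.653"),
        ("00WAAS", 3, 0, "hello#gmail", "2013-01-15 15:23:49.653"),
        # hypothetical rows: this RID has no CommunicationType 0 entry
        ("00XYZ1", 3, 2, "other#example", "2016-02-01 09:00:00.000"),
        ("00XYZ1", 3, 1, "work#example", "2015-03-01 09:00:00.000"),
    ],
)

# Lowest communication_type wins, then the most recent last_date;
# value is a last-resort key so that exact ties are still deterministic.
SQL = """
SELECT rid, value
FROM (
    SELECT rid, value,
           ROW_NUMBER() OVER (
               PARTITION BY rid
               ORDER BY communication_type, last_date DESC, value
           ) AS rn
    FROM comm
    WHERE comm_type = 3
) AS r
WHERE rn = 1
ORDER BY rid
"""
preferred = con.execute(SQL).fetchall()
print(preferred)  # [('00WAAS', 'hello#gmail'), ('00XYZ1', 'work#example')]
```

If you would rather return every tied row instead of picking one, swapping ROW_NUMBER for RANK with the same ORDER BY (minus the extra key) is the usual alternative.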
If I understood you correctly, use ROW_NUMBER() :
SELECT tt.RID,COALESCE(tt.Homemail,tt.businessMail,tt.OtherMail)
FROM(
select s.RID,
MAX(CASE WHEN s.CommType = 3 AND s.CommunicationType = 0 THEN s.VALUE END) AS Homemail,
MAX(CASE WHEN s.CommType = 3 AND s.CommunicationType = 1 THEN s.VALUE END) AS BusinessMail,
MAX(CASE WHEN s.CommType = 3 AND s.CommunicationType = 2 THEN s.VALUE END) AS OtherMail
from (SELECT t.*,ROW_NUMBER() OVER(PARTITION BY t.rid,t.communicationType ORDER BY t.lastDate DESC) AS rnk
FROM #table t
WHERE t.commType = 3) s
WHERE s.rnk = 1
GROUP BY s.rid) tt

Query to find ranges of consecutive rows

I have a file that contains a dump of a SQL table with 2 columns: int ID (auto increment identity field) and bit Flag. Flag = 0 means a record is good and Flag = 1 means a record is bad (contains an error). The goal is to find all blocks of consecutive bad records (with a flag value of 1) with 1,000 or more rows. The solution shouldn't use cursors or while loops; it should use set-based queries only (selects, joins etc).
We would like to see the actual queries used and the results in the following format:
StartID – EndID NumberOfErrorsInTheBlock
StartID – EndID NumberOfErrorsInTheBlock
……………………….
StartID – EndID NumberOfErrorsInTheBlock
For example if our data were only 30 records and we were looking for blocks with 5 or more records then the results would look as follows (see the screenshot below, the errors blocks that met the criteria are highlighted) :
[ID Range].....[Number of errors in the block]
11-15..... 5
19-25..... 7
sql file containing sample rows, dropbox
T-SQL Solution for SQL Server 2012 and Above
IF OBJECT_ID('tempdb..#tbl_ranges') IS NOT NULL
DROP TABLE #tbl_ranges;
CREATE TABLE #tbl_ranges
(
row_num INT PRIMARY KEY,
ID INT,
Flag BIT,
Label TINYINT
);
WITH cte_yourTable
AS
(
SELECT Id,
Flag,
CASE
--label min
WHEN Flag != LAG(flag,1) OVER (ORDER BY ID) THEN 1
--inner
WHEN Flag = LAG(flag,1) OVER (ORDER BY ID) AND Flag = LEAD(flag,1) OVER (ORDER BY ID) THEN 2
--end
WHEN Flag = LAG(flag,1) OVER (ORDER BY ID) AND Flag != LEAD(flag,1) OVER (ORDER BY ID) THEN 3
END label
FROM yourTable
)
INSERT INTO #tbl_ranges
SELECT ROW_NUMBER() OVER (ORDER BY ID) row_num,
ID,
Flag,
label
FROM cte_yourTable
WHERE label != 2;
SELECT A.ID ID_start,
B.ID ID_end,
B.ID - A.ID range_cnt
FROM #tbl_ranges A
INNER JOIN #tbl_ranges B
ON A.row_num = B.row_num - 1
AND A.Flag = B.Flag;
IF OBJECT_ID('tempdb..#tbl_ranges') IS NOT NULL
DROP TABLE #tbl_ranges;
Abbreviated Results:
ID_start ID_end range_cnt
----------- ----------- -----------
2 3 1
5 8 3
9 10 1
11 35 24
36 356 320
357 358 1
359 406 47
...
Here is an answer that does not need a temp table. It is also a good example of one CTE building on another (chained CTEs):
With Evaluation (ID,Flag,Evaluate)
as
(select ID,Flag,Evaluate = ID-row_number() over (order by Flag,ID)
from [dbo].[SqltestRecordsNew]
where Flag = 1
),
Evaluation_Final (StartingRecordID,EndRecordID,Flag,cnt)
as
(
select min(ID) as StartingRecordID,max(ID) as EndRecordID,
Flag, cnt = count(*)
from Evaluation
group by Evaluate, Flag
)
select Concat(StartingRecordID,' - ', EndRecordID) as 'StartingRecordID - EndRecordId',
cnt as GroupItemCnt from Evaluation_Final
where cnt > 999
order by Concat(StartingRecordID,' - ', EndRecordID)
-- Test results Case 1
Select ID,Flag,
Case when Flag=1 then 'Success'
else 'Defect Data'
End as TestResults
from SqltestRecordsNew
where ID between 1494363 and 1495559
-- Test results Case 2
Select ID,Flag,
Case when Flag=1 then 'Success'
else 'Defect Data'
End as TestResults from SqltestRecordsNew
where ID between 1498409 and 1503899
-- Test results Case 3
Select ID,Flag,
Case when Flag=1 then 'Success'
else 'Defect Data'
End as TestResults from SqltestRecordsNew
where ID between 1548257 and 1550489
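The ID minus ROW_NUMBER trick used above can be tried on the 30-row example from the question. This is a sketch in SQLite via Python's sqlite3 module (window functions need SQLite 3.25+); the records table, the error_blocks helper and the exact placement of the smaller bad blocks are my assumptions, chosen so that the expected blocks are 11-15 and 19-25.

```python
import sqlite3

# 30 rows; flag = 1 marks a bad record. Blocks 11-15 and 19-25 come from the
# question's example, the short blocks 3-4 and 28 are made up for contrast.
BAD_IDS = {3, 4, 28} | set(range(11, 16)) | set(range(19, 26))

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, flag INTEGER)")
con.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(i, 1 if i in BAD_IDS else 0) for i in range(1, 31)],
)


def error_blocks(min_size):
    """Return (start_id, end_id, count) for runs of bad rows of at least min_size."""
    # Within a run of consecutive ids, id - ROW_NUMBER() stays constant,
    # so it can serve as the grouping key for each block.
    sql = """
        WITH bad AS (
            SELECT id, id - ROW_NUMBER() OVER (ORDER BY id) AS grp
            FROM records
            WHERE flag = 1
        )
        SELECT MIN(id), MAX(id), COUNT(*)
        FROM bad
        GROUP BY grp
        HAVING COUNT(*) >= ?
        ORDER BY MIN(id)
    """
    return con.execute(sql, (min_size,)).fetchall()


for start, end, cnt in error_blocks(5):
    print(f"{start}-{end}..... {cnt}")
```

For the real data set the same call with min_size = 1000 would apply. Note that this relies on the IDs having no gaps; if IDs can be missing, a run of bad rows separated by a missing ID is reported as two blocks.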

Want a count but it repeats 1 with every record

I want a count but it repeats 1 with every record. Can you please suggest what to do?
SELECT Count(*),
innerTable.*
FROM (SELECT (SELECT NAME
FROM tours
WHERE tours.id = tourbooking.tourid) AS NAME,
(SELECT url
FROM tours
WHERE tours.id = tourbooking.tourid) AS Url,
(SELECT TOP 1 NAME
FROM tourimages
WHERE tourimages.tourid = tourbooking.tourid
ORDER BY id ASC) AS ImageName,
(SELECT duration + ' ' + CASE WHEN durationtype = 'd' THEN
'Day(s)' WHEN
durationtype =
'h' THEN 'Hour(s)' END
FROM tours
WHERE tours.id = tourbooking.tourid) AS Duration,
(SELECT Replace(Replace('<a> Adult(s) - <c> Children', '<a>', Sum
(CASE
WHEN [type] = 1 THEN 1
ELSE 0
END)),
'<c>',
Sum(CASE
WHEN [type] = 2 THEN 1
ELSE 0
END))
FROM tourperson
WHERE tourperson.bookingid = tourbooking.id) AS TotalPassengers
,
startdate,
createddate AS BookingDate,
id AS BookingID,
[status],
serviceprice
FROM tourbooking
WHERE memberid = 6)AS innerTable
GROUP BY innerTable.NAME,
innerTable.bookingdate,
innerTable.bookingid,
innerTable.duration,
innerTable.imagename,
innerTable.serviceprice,
innerTable.startdate,
innerTable.status,
innerTable.totalpassengers,
innerTable.url
You select records from tourbooking. One of the columns you select is id. This is probably the table's primary key and thus unique. (If not, you should hurry to change that name.)
You call this ID BookingID, and it is one of the columns you group by. So you get one result record per record in tourbooking. The number of records within such a "group" is of course 1; it is the one record you select and show.
If you built real groups, say a result record per day, then you'd get a real count, e.g. the number of bookings per day.
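A tiny sketch of the difference, using Python's sqlite3 module with a made-up bookings table (the table, column names and rows are hypothetical): grouping by the unique id always yields a count of 1, while grouping by the booking date gives a real count.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE bookings (id INTEGER PRIMARY KEY, created_date TEXT)")
con.executemany(
    "INSERT INTO bookings VALUES (?, ?)",
    [(1, "2024-01-01"), (2, "2024-01-01"), (3, "2024-01-02")],
)

# Grouping by a unique key: every group contains exactly one row.
per_booking = con.execute(
    "SELECT id, COUNT(*) FROM bookings GROUP BY id ORDER BY id"
).fetchall()

# Grouping by day: groups can contain several bookings.
per_day = con.execute(
    "SELECT created_date, COUNT(*) FROM bookings "
    "GROUP BY created_date ORDER BY created_date"
).fetchall()

print(per_booking)  # [(1, 1), (2, 1), (3, 1)]
print(per_day)      # [('2024-01-01', 2), ('2024-01-02', 1)]
```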