Counting users who never trigger a certain event - SQL

Hi, given the following table:
id event
1 unknown
1 unknown
1 unknown
2 unknown
2 X
2 Y
3 unknown
3 unknown
4 X
5 Y
I want to count the users whose rows all have the value unknown.
In this case that should be 2 ids out of 5.
My attempt was:
select
count(distinct case when event != 'unknown' then id else null end) as loggeds,
count(distinct case when event = 'unknown' then id else null end) as not_log_android,
count(distinct event) as session_long
from table
but it is completely wrong.

With NOT EXISTS:
select t.id
from tablename as t
where not exists (
select 1 from tablename where id = t.id and event <> 'unknown'
)
group by t.id
For the number of distinct ids:
select count(distinct t.id)
from tablename as t
where not exists (
select 1 from tablename where id = t.id and event <> 'unknown'
)

You can check this question: How to check if value exists in each group (after group by)
SELECT COUNT(DISTINCT t1.id)
FROM theTable t1
WHERE NOT EXISTS (SELECT 1 FROM theTable t2 WHERE t1.id = t2.id AND t2.event <> 'unknown')
OR
SELECT COUNT(*)
FROM (SELECT t.id
FROM theTable t
GROUP BY t.id
HAVING MAX(CASE event WHEN 'unknown' THEN 0 ELSE 1 END) = 0) AS all_unknown

SELECT id
FROM YourTable
GROUP BY id
HAVING COUNT(*) = COUNT ( CASE WHEN event = 'unknown' THEN 1 END )

I would use aggregation:
SELECT id
FROM table t
GROUP BY id
HAVING MIN(event) = MAX(event) AND MIN(event) = 'unknown';
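As a quick sanity check, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs two of the approaches above. The table name events and the use of Python's sqlite3 module are assumptions made for illustration only; the SQL itself should carry over to other databases.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
# The table name "events" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, event TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        (1, "unknown"), (1, "unknown"), (1, "unknown"),
        (2, "unknown"), (2, "X"), (2, "Y"),
        (3, "unknown"), (3, "unknown"),
        (4, "X"),
        (5, "Y"),
    ],
)

# NOT EXISTS: keep ids that have no row with a known event.
not_exists_count = conn.execute("""
    SELECT COUNT(DISTINCT t.id)
    FROM events AS t
    WHERE NOT EXISTS (
        SELECT 1 FROM events WHERE id = t.id AND event <> 'unknown'
    )
""").fetchone()[0]

# HAVING: keep ids whose row count equals their count of 'unknown' rows,
# then count the surviving groups in an outer query.
having_count = conn.execute("""
    SELECT COUNT(*)
    FROM (
        SELECT id
        FROM events
        GROUP BY id
        HAVING COUNT(*) = COUNT(CASE WHEN event = 'unknown' THEN 1 END)
    ) AS all_unknown
""").fetchone()[0]

print(not_exists_count, having_count)
```

Both queries select ids 1 and 3, which matches the "2 ids out of 5" expected in the question.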

Related

SQL Function for updating column with values

To those who have helped me before: I tend to use SAS 9.4 a lot for my day-to-day work, but there are times when I need to use SQL Server.
I have an output table with 2 variables (attached output.csv)
output table
ID, GROUP, DATE
The table has 830 rows:
330 have a "C" group
150 have a "A" group
50 have a "B" group
the remaining 300 have group as "TEMP"
Within SQL I do not know how to programmatically work out the total volume of A+B+C. The aim is to update the "TEMP" rows so that there is an equal number of "A" and "B", 250 of each (splitting the remainder of the total count),
so that the table totals:
330 have a "C" group
250 have a "A" group
250 have a "B" group
You want to apportion the "TEMP" rows to get equal amounts of "A" and "B".
So, the idea is to count up everything in A, B, and Temp and divide by 2. That is the final group size. Then you can use arithmetic to allocate the rows in Temp to the two groups:
select t.*,
(case when seqnum + cnt_a <= final_group_size then 'A' else 'B' end) as allocated_group
from (select t.*, row_number() over (order by newid()) as seqnum
from t
where [group] = 'TEMP'
) t cross join
(select (cnt_a + cnt_b + cnt_temp) / 2 as final_group_size,
g.*
from (select sum(case when [group] = 'A' then 1 else 0 end) as cnt_a,
sum(case when [group] = 'B' then 1 else 0 end) as cnt_b,
sum(case when [group] = 'TEMP' then 1 else 0 end) as cnt_temp
from t
) g
) g
SQL Server makes it easy to put this into an update:
with toupdate as (
select t.*,
(case when seqnum + cnt_a <= final_group_size then 'A' else 'B' end) as allocated_group
from (select t.*, row_number() over (order by newid()) as seqnum
from t
where [group] = 'TEMP'
) t cross join
(select (cnt_a + cnt_b + cnt_temp) / 2 as final_group_size,
g.*
from (select sum(case when [group] = 'A' then 1 else 0 end) as cnt_a,
sum(case when [group] = 'B' then 1 else 0 end) as cnt_b,
sum(case when [group] = 'TEMP' then 1 else 0 end) as cnt_temp
from t
) g
) g
)
update toupdate
set [group] = allocated_group;
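To see the allocation arithmetic with the numbers from the question, here is a rough sketch that rebuilds the 830 rows in an in-memory SQLite database. Several things are assumptions for the sake of a runnable example: the column is called grp (group is a reserved word), random() stands in for newid(), the update is applied row by row from Python rather than through an updatable CTE, and SQLite 3.25 or later is needed for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "grp" is a hypothetical column name standing in for the reserved word "group".
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, grp TEXT)")
rows = [("C",)] * 330 + [("A",)] * 150 + [("B",)] * 50 + [("TEMP",)] * 300
conn.executemany("INSERT INTO t (grp) VALUES (?)", rows)

# Number the TEMP rows in random order, then assign 'A' until group A
# reaches the final group size; everything after that becomes 'B'.
allocation = conn.execute("""
    SELECT s.id,
           CASE WHEN s.seqnum + g.cnt_a <= g.final_group_size
                THEN 'A' ELSE 'B' END AS allocated_group
    FROM (SELECT id, ROW_NUMBER() OVER (ORDER BY random()) AS seqnum
          FROM t
          WHERE grp = 'TEMP') s
    CROSS JOIN
         (SELECT SUM(CASE WHEN grp = 'A' THEN 1 ELSE 0 END) AS cnt_a,
                 SUM(CASE WHEN grp IN ('A', 'B', 'TEMP') THEN 1 ELSE 0 END) / 2
                     AS final_group_size
          FROM t) g
""").fetchall()

# Apply the allocation computed above.
conn.executemany(
    "UPDATE t SET grp = ? WHERE id = ?",
    [(allocated, row_id) for row_id, allocated in allocation],
)

counts = dict(conn.execute("SELECT grp, COUNT(*) FROM t GROUP BY grp"))
print(counts)
```

With 150 "A", 50 "B" and 300 "TEMP" rows the final group size is 500 / 2 = 250, so the first 100 TEMP rows go to "A" and the remaining 200 to "B".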
I'd go with an UPDATE TOP approach, taking from "TEMP" only as many rows as "A" still needs to reach its half and giving the rest to "B" (note that UPDATE TOP picks arbitrary rows, not random ones):
declare @need_a int = (select count(*) / 2 from [TableName] where [Group] in ('A', 'B', 'TEMP')) - (select count(*) from [TableName] where [Group] = 'A');
update top (@need_a) [TableName] set [Group] = 'A' where [Group] = 'TEMP';
update [TableName] set [Group] = 'B' where [Group] = 'TEMP';

SQL Get rows based on conditions

I'm currently having trouble writing the business logic to get rows from a table with id's and a flag which I have appended to it.
For example,
id   id seq num   flag   Date
A    1            N      ..
A    2            N      ..
A    3            N
A    4            Y
B    1            N
B    2            Y
B    3            N
C    1            N
C    2            N
The end result I'm trying to achieve is that:
For each unique ID I want to retrieve just one row, chosen as follows:
If the ID has a row with flag "Y", return that row.
Otherwise return the last "N" row.
Note that the 'Y' row is not necessarily the last one.
I've been trying to build a CASE condition over a partition such as
OVER (PARTITION BY A."ID" ORDER BY A."Seq num") but so far no luck.
-- EDIT:
From the table, the sample result would be:
id   id seq num   flag   date
A    4            Y      ..
B    2            Y      ..
C    2            N      ..
Using a window clause is the right idea. You should partition the results by the ID (as you've done), and order them so the Y flag rows come first, then all the N flag rows in descending date order, and pick the first for each id:
SELECT id, id_seq_num, flag, date
FROM (SELECT id, id_seq_num, flag, date,
ROW_NUMBER() OVER (PARTITION BY id
ORDER BY CASE flag WHEN 'Y' THEN 0
ELSE 1
END ASC,
date DESC) AS rk
FROM mytable) t
WHERE rk = 1
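Here is a small sketch of that ordering run against the sample rows from the question, using an in-memory SQLite database (version 3.25 or later for window functions). The sample has no real dates, so as an assumption the sequence number is used in place of the date column to decide which "N" row is last.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id TEXT, id_seq_num INTEGER, flag TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("A", 1, "N"), ("A", 2, "N"), ("A", 3, "N"), ("A", 4, "Y"),
        ("B", 1, "N"), ("B", 2, "Y"), ("B", 3, "N"),
        ("C", 1, "N"), ("C", 2, "N"),
    ],
)

# Within each id, 'Y' rows sort first; ties and 'N' rows sort by
# descending sequence number, so rk = 1 is either the 'Y' row or the last 'N'.
picked = conn.execute("""
    SELECT id, id_seq_num, flag
    FROM (SELECT id, id_seq_num, flag,
                 ROW_NUMBER() OVER (
                     PARTITION BY id
                     ORDER BY CASE flag WHEN 'Y' THEN 0 ELSE 1 END,
                              id_seq_num DESC
                 ) AS rk
          FROM mytable) t
    WHERE rk = 1
    ORDER BY id
""").fetchall()

print(picked)
```

This returns one row per id: (A, 4, Y), (B, 2, Y) and (C, 2, N), matching the sample result in the question.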
My approach is to take a UNION of two queries. The first query simply selects all Yes records, assuming that Yes only appears once per ID group. The second query targets only those ID having no Yes anywhere. For those records, we use the row number to select the most recent No record.
WITH cte1 AS (
SELECT id
FROM yourTable
GROUP BY id
HAVING SUM(CASE WHEN flag = 'Y' THEN 1 ELSE 0 END) = 0
),
cte2 AS (
SELECT t1.id, t1.id_seq_num, t1.flag, t1.date,
ROW_NUMBER() OVER (PARTITION BY t1.id ORDER BY t1.id_seq_num DESC) rn
FROM yourTable t1
INNER JOIN cte1 t2
ON t1.id = t2.id
)
SELECT id, id_seq_num, flag, date
FROM yourTable
WHERE flag = 'Y'
UNION ALL
SELECT id, id_seq_num, flag, date
FROM cte2
WHERE rn = 1
Here's one way (with quite generic SQL):
select t1.*
from Table1 as t1
where t1.id_seq_num = COALESCE(
(select max(id_seq_num) from Table1 as T2 where t1.id = t2.id and t2.flag = 'Y') ,
(select max(id_seq_num) from Table1 as T3 where t1.id = t3.id and t3.flag = 'N') )
Available in a fiddle here: http://sqlfiddle.com/#!9/5f7f9/6
SELECT DISTINCT id, flag
FROM yourTable

Count based on group with filter

I have a query that displays 2 columns: "Device_ID" and "Status". Device_ID is the name of all computers and status contains either "reboot" or "success" as values. I would like a third column that would count how many "success" there are for that specific Device_ID.
How could I go about doing this?
SELECT tgt.Device_ID, tgt.Status, src.cnt
FROM [TableName] tgt
INNER JOIN
(
Select Device_ID, count(CASE WHEN Status = 'success' THEN 1 END) cnt
from [TableName]
GROUP BY Device_ID
) src
ON tgt.Device_ID= src.Device_ID;
SELECT A.Device_ID,A.Status,B.Count_of_Success_per_Device_ID
FROM Yourtable A
INNER JOIN
(
SELECT Device_ID,
SUM( CASE WHEN Status = 'Success' THEN 1 ELSE 0 END ) AS Count_of_Success_per_Device_ID
FROM Yourtable
GROUP BY Device_ID
) B
ON A.Device_ID = B.Device_ID ;
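One detail worth checking when counting conditionally: COUNT counts every non-NULL value, so a CASE with an ELSE 0 branch is counted for every row, whereas SUM of 1/0 or COUNT without an ELSE branch counts only the matches. A minimal sketch with made-up rows (the table name devices and the device names are hypothetical), run in an in-memory SQLite database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: pc1 has two successes, pc2 has none.
conn.execute("CREATE TABLE devices (Device_ID TEXT, Status TEXT)")
conn.executemany(
    "INSERT INTO devices VALUES (?, ?)",
    [("pc1", "success"), ("pc1", "reboot"), ("pc1", "success"), ("pc2", "reboot")],
)

per_device = conn.execute("""
    SELECT Device_ID,
           -- SUM of 1/0: number of success rows
           SUM(CASE WHEN Status = 'success' THEN 1 ELSE 0 END) AS sum_cnt,
           -- COUNT without ELSE: non-matching rows yield NULL and are skipped
           COUNT(CASE WHEN Status = 'success' THEN 1 END) AS count_cnt,
           -- COUNT with ELSE 0: 0 is not NULL, so every row is counted
           COUNT(CASE WHEN Status = 'success' THEN 1 ELSE 0 END) AS count_all
    FROM devices
    GROUP BY Device_ID
    ORDER BY Device_ID
""").fetchall()

print(per_device)
```

For pc1 the first two expressions give 2 while the third gives 3, the total number of rows for that device.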

SQL Server duplicate row

I have a table with duplicate records. I want to mark whether each record is a duplicate in another column, say a column named Flag. If the record is a duplicate, mark it as 1 in the Flag column, otherwise 0.
How to do this?
I can use a query to select duplicate records.
select
o.clientid, oc.dupeCount, o.pannodesc, o.CustNo
from
CustomerMaster1 o
inner join
(SELECT clientid, COUNT(*) AS dupeCount
FROM CustomerMaster1
WHERE ISNULL(PanNoDesc, '') <> ''
GROUP BY clientid
HAVING COUNT(*) > 1) oc ON o.clientid = oc.clientid
Put simply: if there are two similar records, mark the second (duplicate) row as 1; if there are three, mark two rows as 1, leaving the original record as 0.
Just use count(*) as a window function to calculate the flag:
select o.clientid, o.pannodesc, o.CustNo,
(case when count(*) over (partition by clientId) > 1
then 1 else 0
end) as IsDuplicate
from CustomerMaster1 o;
If you only care about certain records, then you can count them instead:
select o.clientid, o.pannodesc, o.CustNo,
(case when sum(case when PanNoDesc <> '' and PanNoDesc is not null
then 1 else 0
end) over (partition by clientId) > 1
then 1 else 0
end) as IsDuplicate
from CustomerMaster1 o;
EDIT:
If you want to modify the data, assuming you have a Flag column, you can just use these statements as a CTE:
with toupdate as (
select o.clientid, o.pannodesc, o.CustNo, o.Flag,
(case when sum(case when PanNoDesc <> '' and PanNoDesc is not null
then 1 else 0
end) over (partition by clientId) > 1
then 1 else 0
end) as NewIsDuplicate
from CustomerMaster1 o
)
update toupdate
set Flag = NewIsDuplicate;
You can write it as:
CREATE TABLE CustomerMaster1 (clientid INT,PanNoDesc VARCHAR(10),DupFlag bit)
INSERT INTO CustomerMaster1 VALUES(1,'A',NULL ),(1,'B',NULL )
SELECT clientid,PanNoDesc,DupFlag FROM CustomerMaster1
;WITH CTE AS(
SELECT DupFlag,
ROW_NUMBER() OVER (PARTITION BY clientid ORDER BY clientid ASC) AS rownum
FROM CustomerMaster1
WHERE ISNULL(PanNoDesc, '') <> ''
)
-- update through the CTE so each row gets the flag for its own row number
UPDATE CTE
SET DupFlag = (CASE WHEN rownum > 1 THEN 1 ELSE 0 END)
SELECT clientid,PanNoDesc,DupFlag FROM CustomerMaster1
Edit: Demo based on sample fields provided:
http://sqlfiddle.com/#!3/4592f/1
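For the "leave the original as 0, mark the later copies as 1" requirement, a rough sketch of the ROW_NUMBER idea in an in-memory SQLite database follows. The sample rows are made up, CustNo is assumed to be a unique key that also decides which record counts as the original, and the update uses a correlated subquery because SQLite cannot update through a CTE the way SQL Server can.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data; CustNo is assumed to be unique and to define row order.
conn.execute(
    "CREATE TABLE CustomerMaster1 "
    "(CustNo INTEGER PRIMARY KEY, clientid INTEGER, PanNoDesc TEXT, Flag INTEGER)"
)
conn.executemany(
    "INSERT INTO CustomerMaster1 (CustNo, clientid, PanNoDesc) VALUES (?, ?, ?)",
    [(1, 10, "A"), (2, 10, "B"), (3, 10, "C"), (4, 20, "D"), (5, 30, ""), (6, 30, "E")],
)

# Number the rows of each client; the first keeps 0, later ones get 1.
# Rows with an empty PanNoDesc are excluded and keep Flag = NULL.
conn.execute("""
    WITH numbered AS (
        SELECT CustNo,
               ROW_NUMBER() OVER (PARTITION BY clientid ORDER BY CustNo) AS rn
        FROM CustomerMaster1
        WHERE COALESCE(PanNoDesc, '') <> ''
    )
    UPDATE CustomerMaster1
    SET Flag = (SELECT CASE WHEN rn > 1 THEN 1 ELSE 0 END
                FROM numbered
                WHERE numbered.CustNo = CustomerMaster1.CustNo)
""")

flags = conn.execute(
    "SELECT CustNo, Flag FROM CustomerMaster1 ORDER BY CustNo"
).fetchall()
print(flags)
```

Client 10 has three records, so CustNo 1 stays 0 and CustNo 2 and 3 become 1; clients 20 and 30 each have a single qualifying record and stay 0.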

Joining two different queries under one answer

I have two different queries that each produce the correct result, but I would like them to return their results together in one table. How do I do that?
Here is my code:
SELECT count(distinct ID) as NoOfEmployees
FROM Table_Name
WHERE date<= '2012-05-31';
select count(subA.ID) as EmployeesChanged from (
SELECT A.ID
FROM Table_Name A
WHERE A.date < '2012-06-01'
GROUP BY 1
HAVING COUNT(A.Service_type) > 1 ) subA
Currently I have the following output:
Number of Employees
x
Employees Changed
x
How do I make it
Number of Employees | Employees Changed | (Number of employees - number changed)
x | x | x
I don't know which database you use, but for some databases you can try:
select q1.Value, q2.Value, q1.Value - q2.Value from
(SELECT count(distinct ID) as Value FROM Table_Name
WHERE date<= '2012-05-31') q1,
(select count(subA.ID) as Value from
( SELECT A.ID FROM Table_Name A
WHERE A.date < '2012-06-01' GROUP BY 1
HAVING COUNT(A.Service_type) > 1 ) subA) q2
If date <= '2012-05-31' is equivalent to A.date < '2012-06-01' (that is, date has no time part), a single query is enough:
SELECT COUNT(1) AS NoOfEmployees,
SUM(CASE WHEN STCount > 1 then 1 else 0 end) as HasChange,
SUM(CASE WHEN STCount <= 1 then 1 else 0 end) as NoChange
FROM
(SELECT ID,
COUNT(Service_type) STCount
FROM Table_Name
WHERE date<= '2012-05-31'
GROUP BY ID) AS Data
You can use CROSS JOIN:
SELECT a.*, b.*, a.NoOfEmployees - b.EmployeesChanged
FROM
(
SELECT count(distinct ID) as NoOfEmployees
FROM Table_Name
WHERE date<= '2012-05-31'
) a
CROSS JOIN
(
SELECT count(subA.ID) as EmployeesChanged
FROM
(
SELECT A.ID
FROM Table_Name A
WHERE A.date < '2012-06-01'
GROUP BY 1
HAVING COUNT(A.Service_type) > 1
) subA
) b
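A minimal sketch of the CROSS JOIN of two single-row subqueries, run on made-up rows in an in-memory SQLite database. The sample data is hypothetical, and GROUP BY ID is written out instead of GROUP BY 1 so the intent does not depend on how a given database treats ordinals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table_Name (ID INTEGER, date TEXT, Service_type TEXT)")
# Hypothetical rows: employees 1 and 3 have two service types before June,
# employee 2 has one, and employee 4 only appears after the cut-off.
conn.executemany(
    "INSERT INTO Table_Name VALUES (?, ?, ?)",
    [
        (1, "2012-01-01", "x"), (1, "2012-03-01", "y"),
        (2, "2012-02-01", "x"),
        (3, "2012-04-01", "x"), (3, "2012-05-31", "z"),
        (4, "2012-06-15", "x"),
    ],
)

# Each subquery returns exactly one row, so the cross join yields one row
# holding both counts plus their difference.
totals = conn.execute("""
    SELECT a.NoOfEmployees,
           b.EmployeesChanged,
           a.NoOfEmployees - b.EmployeesChanged AS Unchanged
    FROM (SELECT COUNT(DISTINCT ID) AS NoOfEmployees
          FROM Table_Name
          WHERE date <= '2012-05-31') a
    CROSS JOIN
         (SELECT COUNT(*) AS EmployeesChanged
          FROM (SELECT ID
                FROM Table_Name
                WHERE date < '2012-06-01'
                GROUP BY ID
                HAVING COUNT(Service_type) > 1) subA) b
""").fetchone()

print(totals)
```

With this data the single result row holds 3 employees, 2 changed and 1 unchanged.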
Edit:
You might be able to greatly optimize your query by aggregating once per employee and then using conditional aggregation, instead of executing two separate queries (this assumes the two date filters are equivalent, i.e. date has no time part):
SELECT a.NoOfEmployees, a.EmployeesChanged, a.NoOfEmployees - a.EmployeesChanged
FROM
(
SELECT
COUNT(*) AS NoOfEmployees,
SUM(CASE WHEN ServiceTypeCount > 1 THEN 1 ELSE 0 END) AS EmployeesChanged
FROM
(
SELECT ID, COUNT(Service_type) AS ServiceTypeCount
FROM Table_Name
WHERE date < '2012-06-01'
GROUP BY ID
) per_employee
) a