SQL Server : how can I get difference between counts of total rows and those with only data - sql

I have a table with data as shown below (the table is built every day with current date, but I left off that field for ease of reading).
This table keeps track of people and the doors they enter on a daily basis.
Table entrance_t:
id entrance entered
------------------------
1 a 0
1 b 0
1 c 0
1 d 0
2 a 1
2 b 0
2 c 0
2 d 0
3 a 0
3 b 1
3 c 1
3 d 1
My goal is to report on people and count entrances not used(grouping on people), but ONLY if they entered(entered=1).
So using the above table, I would like the results of query to be...
id count
----------
2 3
3 1
(id=2 did not use 3 of the entrances and id=3 did not use 1)
I tried queries(some with inner joins on two instances of same table) and I can get the entrances not used, but it's always for everybody. Like this...
id count
----------
1 4
2 3
3 1
How do I not display results id=1 since they did not enter at all?
Thank you,

You could use conditional aggregation:
SELECT id, count(CASE WHEN entered = 0 THEN 1 END) AS cnt
FROM entrance_t
GROUP BY id
HAVING count(CASE WHEN entered = 1 THEN 1 END) > 0;
DBFiddle Demo

Related

How to check the count of each values repeating in a row

I have two tables. Data in the first table is:
ID Username
1 Dan
2 Eli
3 Sean
4 John
Second Table Data:
user_id Status_id
1 2
1 3
4 1
3 2
2 3
1 1
3 3
3 3
3 3
. .
goes on goes on
These are my both tables.
I want to find the frequency of individual users doing 'status_id'
My expected result is:
username status_id(1) status_id(2) status_id(3)
Dan 1 1 1
Eli 0 0 1
Sean 0 1 2
John 1 0 0
My current code is:
SELECT b.username , COUNT(a.status_id)
FROM masterdb.auth_user b
left outer join masterdb.xmlform_joblist a
on a.user1_id = b.id
GROUP BY b.username, b.id, a.status_id
This gives me the separate count but in a single row without mentioning which status_id each column represents
This is called pivot and it works in two steps:
extracts the data for the specific field using a CASE statement
aggregates the data on users, to make every field value lie on the same record for each user
SELECT Username,
SUM(CASE WHEN status_id = 1 THEN 1 END) AS status_id_1,
SUM(CASE WHEN status_id = 2 THEN 1 END) AS status_id_2,
SUM(CASE WHEN status_id = 3 THEN 1 END) AS status_id_3
FROM t2
INNER JOIN t1
ON t2.user_id = t1._ID
GROUP BY Username
ORDER BY Username
Check the demo here.
Note: This solution assumes that there are 3 status_id values. If you need to generalize on the amount of status ids, you would require a dynamic query. In any case, it's better to avoid dynamic queries if you can.

SQL Query to get multiple resultant on single column

I have a table that looks something like this:
id name status
2 a 1
2 a 2
2 a 3
2 a 2
2 a 1
3 b 2
3 b 1
3 b 2
3 b 1
and the resultant i want is:
id name total count count(status3) count(status2) count(status1)
2 a 5 1 2 2
3 b 4 0 2 2
please help me get this result somehow, i can just get id, name or one of them at a time, don't know how to put a clause to get this table at once.
Here's a simple solution using group by and case when.
select id
,count(*) as 'total count'
,count(case status when 3 then 1 end) as 'count(status1)'
,count(case status when 2 then 1 end) as 'count(status3)'
,count(case status when 1 then 1 end) as 'count(status2)'
from t
group by id
id
total count
count(status3)
count(status2)
count(status1)
2
5
1
2
2
3
4
0
2
2
Fiddle
Here's a way to solve it using pivot.
select *
from (select status,id, count(*) over (partition by id) as "total count" from t) tmp
pivot (count(status) for status in ([1],[2],[3])) pvt
d
total count
1
2
3
3
4
2
2
0
2
5
2
2
1
Fiddle

Best way to by column and aggregation on another column

I want to create a rank column using existing rank and binary columns. Suppose for example a table with ID, RISK, CONTACT, DATE. The existing rank is RISK, say 1,2,3,NULL, with 3 being the highest. The binary-valued is CONTACT with 0,1 or FAILURE/SUCESS. I want to create a new RANK that will order by RISK once a certain number of successful contacts has been exceeded.
For example, suppose the constraint is a minimum of 2 successful contacts. Then the rank should be created as follows in the two instances below:
Instance 1. Three ID, all have a min of two successful contacts. In that case the rank mirrors the risk:
ID risk contact date rank
1 3 S 1 3
1 3 S 2 3
1 3 F 3 3
1 3 F 4 3
2 2 S 1 2
2 2 S 2 2
2 2 F 3 2
2 2 F 4 2
3 1 S 1 1
3 1 S 2 1
3 1 S 3 1
Instance 2. Suppose ID=1 has only one successful contact. In that case it is relegated to the lowest rank, rank=1, while ID=2 gets the highest value, rank=3, and ID=3 maps to rank=2 because it satisfies the constraint but has a lower risk value than ID=2:
ID risk contact date rank
1 3 S 1 1
1 3 F 2 1
1 3 F 3 1
1 3 F 4 1
2 2 S 1 3
2 2 S 2 3
2 2 F 3 3
2 2 F 4 3
3 1 S 1 2
3 1 S 2 2
3 1 S 3 2
This is SQL, specifically Hive. Thanks in advance.
Edit - I think Gordon Linoff's code does it correctly. In the end, I used three interim tables. The code looks like that:
First,
--numerize risk, contact
select A.* ,
case when A.risk = 'H' then 3
when A.risk = 'M' then 2
when A.risk = 'L' then 1
when A.risk is NULL then NULL
when A.risk = 'NULL' then NULL
else -999 end as RISK_RANK,
case when A.contact = 'Successful' then 1
else NULL end as success
Second,
-- sum_successes_by_risk
select A.* ,
B.sum_successes_by_risk
from T as A
inner join
(select A.person, A.program, A.risk, sum(a.success) as sum_successes_by_risk
from T as A
group by A.person, A.program, A.risk
) as B
on A.program = B.program
and A.person = B.person
and A.risk = B.risk
Third,
--Create table that contains only max risk category
select A.* ,
B.max_risk_rank
from T as A
inner join
(select A.person, max(A.risk_rank) as max_risk_rank
from T as A
group by A.person
) as B
on A.person = B.person
and A.risk_rank = B.max_risk_rank
This is hard to follow, but I think you just want window functions:
select t.*,
(case when sum(case when contact = 'S' then 1 else 0 end) over (partition by id) >= 2
then risk
else 1
end) as new_risk
from t;

Conditional Row Deleting in SQL

I have a table that contains 4 columns. I need to remove some of the rows based on the Code and ID columns. A code of 1 initiates the process I'm trying to track and a code of 2 terminates it. I would like to remove all rows for a specific ID when a code of 2 comes after a code of 1 and there is not an additional code 1. For example, my current data set looks like this:
Code Deposit Date ID
1 $100 3/2/2016 5
2 $0 3/1/2016 5
1 $120 2/8/2016 5
1 $120 3/22/2016 4
2 $70 2/8/2016 3
1 $120 1/3/2016 3
2 $0 6/15/2015 2
1 $120 3/22/2016 2
1 $50 8/15/2015 1
2 $200 8/1/2015 1
After I run my script I would like it to look like this:
Code Deposit Date ID
1 $100 3/2/2016 5
2 $0 3/1/2016 5
1 $120 2/8/2016 5
1 $120 3/22/2016 4
1 $50 8/15/2015 1
2 $200 8/1/2015 1
In all I have about 150,000 ID's in my actual table but this is the general idea.
You can get the ids using logic like this:
select t.id
from t
group by t.id
having max(case when code = 2 then date end) > min(case when code = 1 then date end) and -- code 2 after code 1
max(case when code = 2 then date end) > max(case when code = 1 then date end) -- no code 1 after code2
It is then easy enough to incorporate this into a query to get the rest of the details:
select t.*
from t
where t.id not in (select t.id
from t
group by t.id
having max(case when code = 2 then date end) > min(case when code = 1 then date end) and -- code 2 after code 1
max(case when code = 2 then date end) > max(case when code = 1 then date end)
);
The approach I took was to add up the Code per each ID. If it equals 3 exactly, it should be removed.
;WITH keepID as (
Select
ID
,SUM(code) as 'sumCode'
From #testInit
Group by ID
HAVING SUM(code) <> 3
)
Select *
From #testInit
Where ID IN (Select ID from keepID)
Your post showed keeping ID = 1 which does not seem to fit the criteria ? Are you sure you would be keeping ID = 1 ? It only as 2 records with a code of 1 and a code of 2 which adds up to 3 ... thus, remove it.
I just showed the approach in logic ... let me know if you need help with the delete code.
delete from table
where table.id in
(select id from B where A.id=B.id and B.date>A.date
from
(select code,id,max(date),id where code=1 group by id) as A,
(select code ,id,max(date),id where code=2 group by id) as B)
explanation: select code,id,max(date),id where code=1 as A
will fetch data with the highest date for a specific id of code 1
select code ,id,max(date),id where code=2 group by id) as B
will fetch data with the highest date for a specific id of code 2
select id from B where A.id=B.id and B.date>A.date wil select all the ids for which the code 2 date is higher than code 1 date.

Inserting a new indicator column to tell if a given row maximizes another column in SQL

I currently have a table in SQL that looks like this
PRODUCT_ID_1 PRODUCT_ID_2 SCORE
1 2 10
1 3 100
1 10 3000
2 10 10
3 35 100
3 2 1001
That is, PRODUCT_ID_1,PRODUCT_ID_2 is a primary key for this table.
What I would like to do is use this table to add in a row to tell whether or not the current row is the one that maximizes SCORE for a value of PRODUCT_ID_1.
In other words, what I would like to get is the following table:
PRODUCT_ID_1 PRODUCT_ID_2 SCORE IS_MAX_SCORE_FOR_ID_1
1 2 10 0
1 3 100 0
1 10 3000 1
2 10 10 1
3 35 100 0
3 2 1001 1
I am wondering how I can compute the IS_MAX_SCORE_FOR_ID_1 column and insert it into the table without having to create a new table.
You can try like this...
Select PRODUCT_ID_1, PRODUCT_ID_2 ,SCORE,
(Case when b.Score=
(Select Max(a.Score) from TableName a where a.PRODUCT_ID_1=b. PRODUCT_ID_1)
then 1 else 0 End) as IS_MAX_SCORE_FOR_ID_1
from TableName b
You can use a window function for this:
select product_id_1,
product_id_2,
score,
case
when score = max(score) over (partition by product_id_1) then 1
else 0
end as is_max_score_for_id_1
from the_table
order by product_id_1;
(The above is ANSI SQL and should run on any modern DBMS)