I have a table that looks like this:
|FileID| File Info |
| ---- | ------------ |
| 1 | X |
| 1 | Y |
| 2 | Y |
| 2 | Z |
| 2 | A |
I want to aggregate by FileID and split the File Info column into 2 separate count columns. I want 1 column to have the count of the Unique File Info and the other to be a count of non-Unique file info.
The result would ideally look like this:
|FileID| Count(Unique)| Count(Non-unique) |
| ---- | ------------ | ----------------- |
| 1 | 1 | 1 |
| 2 | 2 | 1 |
where the non-unique count is the 'Y' and the unique count is from the 'X' and 'Z, A' for FileID 1 and 2 respectively.
I'm looking for ways to gauge uniqueness between files rather than within.
Use the COUNT() window function to compute, for every row, how many rows share its FileInfo value, and then use conditional aggregation to get the results that you want:
SELECT FileID,
COUNT(CASE WHEN counter = 1 THEN 1 END) count_unique,
COUNT(CASE WHEN counter > 1 THEN 1 END) count_non_unique
FROM (
SELECT t.*, COUNT(*) OVER (PARTITION BY t.FileInfo) counter
FROM tablename t
) t
GROUP BY FileID;
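To check the window-function approach against the sample data, here is a minimal sketch that runs it in an in-memory SQLite database from Python. It assumes SQLite 3.25 or newer (needed for window functions); the `ORDER BY` is added only to make the output order predictable.

```python
import sqlite3

# In-memory database with the question's sample rows.
# Table and column names follow the answer (tablename, FileInfo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (FileID INTEGER, FileInfo TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [(1, "X"), (1, "Y"), (2, "Y"), (2, "Z"), (2, "A")],
)

query = """
SELECT FileID,
       COUNT(CASE WHEN counter = 1 THEN 1 END) AS count_unique,
       COUNT(CASE WHEN counter > 1 THEN 1 END) AS count_non_unique
FROM (
    -- counter = number of rows in the whole table sharing this FileInfo
    SELECT t.*, COUNT(*) OVER (PARTITION BY t.FileInfo) AS counter
    FROM tablename t
) t
GROUP BY FileID
ORDER BY FileID
"""
window_rows = conn.execute(query).fetchall()
print(window_rows)  # [(1, 1, 1), (2, 2, 1)]
```

Each tuple is (FileID, count_unique, count_non_unique), which matches the table the question asks for.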
First you select the "Non Unique" rows from the table
SELECT FileInfo
FROM sometableyoudidnotname
GROUP BY FileInfo
HAVING COUNT(*) > 1
Now that you know which values are non-unique, you can left join to that result to get each row's "status" and count it up.
SELECT base.FileID,
SUM(CASE WHEN u.FileInfo IS NOT NULL THEN 1 ELSE 0 END) AS non_unique_count,
SUM(CASE WHEN u.FileInfo IS NULL THEN 1 ELSE 0 END) AS unique_count
FROM sometableyoudidnotname base
LEFT JOIN (
SELECT FileInfo
FROM sometableyoudidnotname
GROUP BY FileInfo
HAVING COUNT(*) > 1
) u ON base.FileInfo = u.FileInfo
GROUP BY base.FileID
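A sketch of the left-join approach on the sample rows, again using an in-memory SQLite database from Python. Two things here are my assumptions: the table is called `files`, and the NULL test is made on `u.FileInfo`, since that is the only column the derived table exposes. It also assumes FileInfo itself is never NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "files" is a hypothetical table name; the question never names the table.
conn.execute("CREATE TABLE files (FileID INTEGER, FileInfo TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?)",
    [(1, "X"), (1, "Y"), (2, "Y"), (2, "Z"), (2, "A")],
)

query = """
SELECT base.FileID,
       SUM(CASE WHEN u.FileInfo IS NOT NULL THEN 1 ELSE 0 END) AS non_unique_count,
       SUM(CASE WHEN u.FileInfo IS NULL THEN 1 ELSE 0 END) AS unique_count
FROM files base
LEFT JOIN (
    -- FileInfo values that occur more than once across all files
    SELECT FileInfo
    FROM files
    GROUP BY FileInfo
    HAVING COUNT(*) > 1
) u ON base.FileInfo = u.FileInfo
GROUP BY base.FileID
ORDER BY base.FileID
"""
join_rows = conn.execute(query).fetchall()
print(join_rows)  # [(1, 1, 1), (2, 1, 2)]
```

Note the column order: each tuple is (FileID, non_unique_count, unique_count).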
Have a derived table that counts occurrences of each FileInfo value. JOIN and GROUP BY:
select t1.FileID,
sum(case when t2.ficount = 1 then 1 else 0 end) as count_unique,
sum(case when t2.ficount > 1 then 1 else 0 end) as count_non_unique
from tablename t1
join
(
select fileinfo, count(*) ficount
from tablename
group by fileinfo
) t2
on t1.fileinfo = t2.fileinfo
group by t1.FileID
I have a table named Groups structured like so...
+--------------+
| Id GroupId |
+--------------+
| 1 3 |
| 2 3 |
| 3 2 |
| 1 2 |
| 2 2 |
| 3 2 |
+--------------+
I want to return the GroupId where Id = 1 and the other Id = 2, so the result should be 3. Here's what I've tried so far...
SELECT GroupId FROM Groups G1
WHERE G1.Id = 1 and exists ( select 1
FROM Groups G2
WHERE G2.Id = 2
and G1.GroupId = G2.GroupId)
This works fine until a group is added in which both Ids also exist (group 2). Then it fails, because the subquery returns more than one value.
I've thought about using HAVING COUNT(*) == 2 so that the subquery only returns groups with exactly 2 rows, but I'm not sure how to do that. Any ideas?
Use group by and having:
select groupid
from groups
where id in (1, 2)
group by groupid
having count(*) = 2;
This assumes that the rows are unique. If you can have duplicates, use count(distinct id) = 2.
If you want 1 & 2 and no other ids, the logic is slightly more complicated:
select groupid
from groups
group by groupid
having sum(case when id = 1 then 1 else 0 end) > 0 and
sum(case when id = 2 then 1 else 0 end) > 0 and
count(*) = 2;
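To see how the two queries differ on the question's data, here is a small sketch run through SQLite from Python. The table name `group_members` is my own placeholder (chosen to avoid any keyword clash), and the `ORDER BY` is added only for stable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "group_members" is a hypothetical stand-in for the question's Groups table.
conn.execute("CREATE TABLE group_members (Id INTEGER, GroupId INTEGER)")
conn.executemany(
    "INSERT INTO group_members VALUES (?, ?)",
    [(1, 3), (2, 3), (3, 2), (1, 2), (2, 2), (3, 2)],
)

# Groups that contain both Id 1 and Id 2 (other Ids are allowed).
contains_both = conn.execute("""
    SELECT GroupId
    FROM group_members
    WHERE Id IN (1, 2)
    GROUP BY GroupId
    HAVING COUNT(*) = 2
    ORDER BY GroupId
""").fetchall()

# Groups that contain Id 1 and Id 2 and nothing else.
only_both = conn.execute("""
    SELECT GroupId
    FROM group_members
    GROUP BY GroupId
    HAVING SUM(CASE WHEN Id = 1 THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN Id = 2 THEN 1 ELSE 0 END) > 0
       AND COUNT(*) = 2
""").fetchall()

print(contains_both)  # [(2,), (3,)]
print(only_both)      # [(3,)]
```

On this data only the second form returns just group 3, because group 2 also contains Id 3.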
I have the following table:
Check | Email | Count
Y | a | 1
Y | a | 1
Y | b | 1
N | c | 1
N | d | 1
I want to group it by 'check' and number of counts under each email. So like this:
Check | Count # | Email Addresses
Y | 1 count | 1 (refers to email b)
Y | 2+ counts | 1 (refers to email a)
N | 1 count | 2 (refers to email c & d)
N | 2+ counts | 0 (no emails meet this condition)
Every 'check' value is specific to an email
This is most easily done by putting the values in columns not rows.
But it requires two levels of aggregation:
select check, sum(case when cnt = 1 then 1 else 0 end) as cnt_1,
sum(case when cnt >= 2 then 1 else 0 end) as cnt_2plus
from (select check, email, count(*) as cnt
from t
group by check, email
) ce
group by check;
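To sanity-check the two-level aggregation, here is a minimal sketch that runs it on the question's sample rows in an in-memory SQLite database from Python. The table name `t` comes from the answer; the column is renamed `chk` because `CHECK` is a reserved word in SQLite, and that rename is my assumption rather than part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# chk stands in for the question's "Check" column (reserved word in SQLite).
conn.execute("CREATE TABLE t (chk TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("Y", "a"), ("Y", "a"), ("Y", "b"), ("N", "c"), ("N", "d")],
)

query = """
SELECT chk,
       SUM(CASE WHEN cnt = 1 THEN 1 ELSE 0 END) AS cnt_1,
       SUM(CASE WHEN cnt >= 2 THEN 1 ELSE 0 END) AS cnt_2plus
FROM (
    -- first level: one row per (chk, email) with its row count
    SELECT chk, email, COUNT(*) AS cnt
    FROM t
    GROUP BY chk, email
) ce
GROUP BY chk
ORDER BY chk
"""
email_buckets = conn.execute(query).fetchall()
print(email_buckets)  # [('N', 2, 0), ('Y', 1, 1)]
```

Because the buckets are columns, the "N, 2+ counts, 0" case shows up as a zero without needing any placeholder rows.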
This should work, though there might be a cleaner way to get there. I think you need an extra layer of aggregation to pick up the cases where no email meets the condition, assuming the source table has a record for them where the email is null. If there's no record of these cases in the source table, this won't work.
select check
,count_num
,case when email_addresses is null then 0 else email_addresses end as email_addresses
from (
select check,
case when count_sum = 1 then '1' when count_sum > 1 then '2+' else '0' end as count_num,
count(distinct email) as email_addresses
from (
select check, sum(count) as count_sum, email
from your_table -- placeholder table name
group by check, email
) per_email
group by check,
case when count_sum = 1 then '1' when count_sum > 1 then '2+' else '0' end
) per_bucket
Apologies for the very ambiguous title, but I've been working on this for the better part of a day and can't get anywhere, so my thinking is probably clouded. Let me present sample data and explain what I'm trying to do:
+------+------+
| ID | UW |
+------+------+
| 1 | I |
| 1 | I |
| 3 | I |
| 3 | I |
| 3 | C |
| 3 | C |
| 4 | C |
| 4 | C |
I'm trying to find the count of IDs that have both "I" and "C" in the UW column. In the example above the count would be 1 (for ID 3), since ID 1 has only "I" and ID 4 has only "C" values in the UW field. Thanks in advance for helping me with this, much appreciated.
Here is one way:
SELECT COUNT(DISTINCT A.ID) N
FROM dbo.YourTable A
WHERE A.UW = 'I'
AND EXISTS(SELECT 1 FROM dbo.YourTable
WHERE ID = A.ID
AND UW = 'C');
And another:
SELECT COUNT(*)
FROM ( SELECT ID
FROM dbo.YourTable
WHERE UW IN ('I','C')
GROUP BY ID
HAVING COUNT(DISTINCT UW) = 2) A;
You can use group by and having to get the ids that meet the conditions:
select id
from table t
group by id
having sum(case when uw = 'I' then 1 else 0 end) > 0 and
sum(case when uw = 'C' then 1 else 0 end) > 0;
You can then count these with a subquery:
select count(*)
from (select id
from table t
group by id
having sum(case when uw = 'I' then 1 else 0 end) > 0 and
sum(case when uw = 'C' then 1 else 0 end) > 0
) t
I like to formulate these problems this way, because the having clause is very general on the types of conditions that it can support.
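As a quick check of the HAVING formulation, the sketch below runs it on the sample rows through SQLite from Python and contrasts it with a plain `UW IN ('I','C')` filter, which counts IDs having either value. The table name `uw_rows` is a placeholder I made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "uw_rows" is a hypothetical table name for the question's sample data.
conn.execute("CREATE TABLE uw_rows (ID INTEGER, UW TEXT)")
conn.executemany(
    "INSERT INTO uw_rows VALUES (?, ?)",
    [(1, "I"), (1, "I"), (3, "I"), (3, "I"),
     (3, "C"), (3, "C"), (4, "C"), (4, "C")],
)

# IDs that have at least one 'I' row AND at least one 'C' row.
both_count = conn.execute("""
    SELECT COUNT(*)
    FROM (
        SELECT ID
        FROM uw_rows
        GROUP BY ID
        HAVING SUM(CASE WHEN UW = 'I' THEN 1 ELSE 0 END) > 0
           AND SUM(CASE WHEN UW = 'C' THEN 1 ELSE 0 END) > 0
    ) ids
""").fetchone()[0]

# For contrast: IDs that have 'I' OR 'C', which is not what was asked.
either_count = conn.execute("""
    SELECT COUNT(DISTINCT ID)
    FROM uw_rows
    WHERE UW IN ('I', 'C')
""").fetchone()[0]

print(both_count, either_count)  # 1 3
```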
I'm looking for a query that returns, on the same result row, two values computed under different conditions.
For example, let's say that I have this table:
ID |VAL
----------
0 | 1
1 | 0
2 | 0
3 | 1
4 | 0
5 | 0
In a single query, I'd like to select the number of rows having val = 1, the total number of rows, and, if possible, the ratio of one count to the other, which would give a result set like this:
nb_lines | nb_val_1 | ratio
---------------------------
6 | 2 | 0.5
I tried something like:
select count(t1.ID), (select count t2.ID
from table t2 where t2.val = 1
)
FROM table t1
But obviously, this syntax doesn't exist (and it wouldn't give me the ratio). How could I write this query?
Try this query which uses CASE to count only those rows we need.
SELECT nb_lines, nb_val_1, nb_val_0,
nb_val_1 * 1.0 / nb_val_0 AS ratio -- * 1.0 avoids truncation on engines with integer division
FROM
(SELECT COUNT(t1.ID) nb_lines,
COUNT(CASE
WHEN t1.val = 1
THEN 1
ELSE NULL
END) nb_val_1,
COUNT(CASE
WHEN t1.val = 0
THEN 1
ELSE NULL
END) nb_val_0
FROM tabless t1) counts;
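Here is a small sketch of the conditional-count idea on the sample rows, run through SQLite from Python. The table name `samples` is my placeholder. Multiplying by 1.0 and wrapping the divisor in `NULLIF` are additions of mine, to avoid integer division and division by zero; they are not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "samples" is a hypothetical table name for the question's ID/VAL data.
conn.execute("CREATE TABLE samples (ID INTEGER, VAL INTEGER)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [(0, 1), (1, 0), (2, 0), (3, 1), (4, 0), (5, 0)],
)

query = """
SELECT COUNT(*) AS nb_lines,
       COUNT(CASE WHEN VAL = 1 THEN 1 END) AS nb_val_1,
       COUNT(CASE WHEN VAL = 0 THEN 1 END) AS nb_val_0,
       -- ratio of val = 1 rows to val = 0 rows; NULL when there are no 0 rows
       COUNT(CASE WHEN VAL = 1 THEN 1 END) * 1.0
           / NULLIF(COUNT(CASE WHEN VAL = 0 THEN 1 END), 0) AS ratio
FROM samples
"""
ratio_row = conn.execute(query).fetchone()
print(ratio_row)  # (6, 2, 4, 0.5)
```

The 0.5 in the question's expected output corresponds to nb_val_1 / nb_val_0 (2 / 4), not to nb_val_1 / nb_lines.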
Is there any function to check if a column in a group contains a NULL, alternatively how would I solve this? Example below of data structure.
id | value
----------
1 | NULL
1 | 56
2 | 98
2 | 14
Result:
id | value
----------
1 | 1
2 | 0
Try:
select id,
count(*) - count(value) as null_value_count
from your_table
group by id
Another possibility, which doesn't rely on the fact that count(value) ignores NULL values:
select id,
sum(case when value is null then 1 else 0 end) as null_count
from your_table
group by id;
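Both variants can be tried on the sample rows with a short sketch using SQLite from Python. The extra `has_null` column is my own addition, in case a 0/1 flag is wanted rather than a count; on this data the two happen to coincide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(1, None), (1, 56), (2, 98), (2, 14)],  # None is stored as SQL NULL
)

query = """
SELECT id,
       COUNT(*) - COUNT(value) AS null_value_count,           -- COUNT(value) skips NULLs
       MAX(CASE WHEN value IS NULL THEN 1 ELSE 0 END) AS has_null  -- 0/1 flag (assumed extra)
FROM your_table
GROUP BY id
ORDER BY id
"""
null_rows = conn.execute(query).fetchall()
print(null_rows)  # [(1, 1, 1), (2, 0, 0)]
```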