SQL: Conditional query on several rows

I have a table like
letter | number
a | 1
a | 1
b | 2
c | 1
c | 2
c | 2
and I would like to write a SQL query that only returns rows corresponding to letter values that are associated with both a number '1' and a number '2', i.e. I want to keep only
c | 1
c | 2
c | 2
from my example above.
Can anyone help? Many thanks!

You need to use GROUP BY with a HAVING clause.
This returns the rows whose letter is associated with both number 1 and number 2:
SELECT *
FROM yourtable
WHERE letter IN (SELECT letter
FROM yourtable
WHERE number IN ( 1, 2 )
GROUP BY letter
HAVING Count(DISTINCT number) = 2)
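As a quick check, the query above can be run against the question's sample rows. This is only a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the table name yourtable comes from the answer, while the function name rows_with_both is an illustrative assumption.

```python
import sqlite3

def rows_with_both():
    """Return rows whose letter is paired with both number 1 and number 2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourtable (letter TEXT, number INTEGER)")
    # Sample rows copied from the question.
    conn.executemany(
        "INSERT INTO yourtable VALUES (?, ?)",
        [("a", 1), ("a", 1), ("b", 2), ("c", 1), ("c", 2), ("c", 2)],
    )
    query = """
        SELECT letter, number
        FROM yourtable
        WHERE letter IN (SELECT letter
                         FROM yourtable
                         WHERE number IN (1, 2)
                         GROUP BY letter
                         HAVING COUNT(DISTINCT number) = 2)
        ORDER BY letter, number
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows

print(rows_with_both())
```

COUNT(DISTINCT number) matters here: letter a has two rows but only one distinct number, so a plain COUNT(*) = 2 would wrongly keep it.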
If you want the letters that are associated with 1 and 2 and with no other number, use this:
SELECT letter
FROM test
GROUP BY letter
HAVING Count(DISTINCT CASE WHEN number = 1 THEN 1 END) = 1
AND Count(DISTINCT CASE WHEN number = 2 THEN 1 END) = 1
AND Count(DISTINCT number) = 2
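To see how this stricter query behaves, here is a small sketch with Python's sqlite3 module. The table name test is taken from the query above; the extra letter d (with numbers 1, 2 and 3) is a hypothetical row added only to show the difference, and the function name is illustrative.

```python
import sqlite3

def letters_with_only_1_and_2():
    """Return letters whose set of numbers is exactly {1, 2}."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (letter TEXT, number INTEGER)")
    conn.executemany(
        "INSERT INTO test VALUES (?, ?)",
        [("a", 1), ("a", 1), ("b", 2), ("c", 1), ("c", 2), ("c", 2),
         # Hypothetical extra rows, not in the question: d also has a 3.
         ("d", 1), ("d", 2), ("d", 3)],
    )
    query = """
        SELECT letter
        FROM test
        GROUP BY letter
        HAVING COUNT(DISTINCT CASE WHEN number = 1 THEN 1 END) = 1
           AND COUNT(DISTINCT CASE WHEN number = 2 THEN 1 END) = 1
           AND COUNT(DISTINCT number) = 2
        ORDER BY letter
    """
    letters = [row[0] for row in conn.execute(query)]
    conn.close()
    return letters

print(letters_with_only_1_and_2())
```

Letter d has both 1 and 2 but fails the last condition because it has three distinct numbers.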

Related

Improving query that counts distinct values that have particular values in another column

Say I have a table in the format:
| id | category|
|----|---------|
| 10 | A |
| 10 | B |
| 10 | C |
| 2 | C |
I want to count the number of distinct id's that have all three values A, B, and C in the category variable. In this case, the query would return 1 since only for id = 10 is this true.
My intuition is to write the following query to get this value:
SELECT
COUNT(DISTINCT id),
SUM(CASE WHEN category = 'A' THEN 1 else 0 END) AS A,
SUM(CASE WHEN category = 'B' THEN 1 else 0 END) AS B,
SUM(CASE WHEN category = 'C' THEN 1 else 0 END) AS C
FROM
table
GROUP BY
id
HAVING
A >= 1
AND
B >= 1
AND
C >= 1
This feels a bit overwrought though -- is there a simpler way to achieve the desired outcome?
You are close, but you need two levels of aggregation. Assuming no duplicate rows:
SELECT COUNT(*)
FROM (SELECT id
FROM t
WHERE Category IN ('A', 'B', 'C')
GROUP BY id
HAVING COUNT(*) = 3
) t;
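The two levels of aggregation can be tried on the question's sample data. The sketch below uses Python's sqlite3 module; the table name t follows the answer, and the function name is an assumption made for illustration.

```python
import sqlite3

def count_ids_with_all_categories():
    """Count ids that have all of the categories A, B and C."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, category TEXT)")
    # Sample rows copied from the question.
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)",
        [(10, "A"), (10, "B"), (10, "C"), (2, "C")],
    )
    query = """
        SELECT COUNT(*)
        FROM (SELECT id
              FROM t
              WHERE category IN ('A', 'B', 'C')
              GROUP BY id
              HAVING COUNT(*) = 3
             ) matched
    """
    (count,) = conn.execute(query).fetchone()
    conn.close()
    return count

print(count_ids_with_all_categories())
```

The inner query finds the qualifying ids (first level), and the outer COUNT(*) counts them (second level).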
I assume this is part of a larger table, so the same id and category can appear on several rows that differ only in other fields, and that you know how many categories you're looking for.
SELECT ID, COUNT(ID)
FROM (
SELECT DISTINCT ID, CATEGORY
FROM TABLE -- placeholder for your table name
) d -- many databases require an alias on a derived table
GROUP BY ID
HAVING COUNT(ID) = 3 --or however many categories you want
Your subquery here removes extraneous info and forces your id to show up once per category. You then count up the number of times it shows up and look up the ones that show up 3 or however many times you want.
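The effect of duplicate (id, category) pairs can be illustrated with a short sketch in Python's sqlite3 module. The table name items, the note column, and the duplicated rows are all hypothetical; they are not from the question.

```python
import sqlite3

def compare_with_duplicates():
    """Return (ids from plain COUNT(*), ids after SELECT DISTINCT)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: 'note' makes repeated (id, category) pairs possible.
    conn.execute("CREATE TABLE items (id INTEGER, category TEXT, note TEXT)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?, ?)",
        [(10, "A", "x"), (10, "A", "y"), (10, "B", "x"), (10, "C", "x"),
         (2, "C", "x"), (2, "C", "y"), (2, "C", "z")],
    )
    plain = """
        SELECT id FROM items
        WHERE category IN ('A', 'B', 'C')
        GROUP BY id
        HAVING COUNT(*) = 3
    """
    deduplicated = """
        SELECT id
        FROM (SELECT DISTINCT id, category FROM items) d
        GROUP BY id
        HAVING COUNT(id) = 3
    """
    plain_ids = [row[0] for row in conn.execute(plain)]
    dedup_ids = [row[0] for row in conn.execute(deduplicated)]
    conn.close()
    return plain_ids, dedup_ids

print(compare_with_duplicates())
```

With these rows, the plain count picks id 2 (three rows, all category C) and misses id 10 (four rows), while the deduplicated version returns only id 10.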

How to return records from a subquery where the row count of the subquery is equal to X?

I have a table named Groups structured like so...
+--------------+
| Id GroupId |
+--------------+
| 1 3 |
| 2 3 |
| 3 2 |
| 1 2 |
| 2 2 |
| 3 2 |
+--------------+
I want to return the GroupId where Id = 1 and the other Id = 2, so the result should be 3. Here's what I've tried so far...
SELECT GroupId FROM Groups G1
WHERE G1.Id = 1 and exists ( select 1
FROM Groups G2
WHERE G2.Id = 2
and G1.GroupId = G2.GroupId)
This works fine until another group that also contains both Ids is added (group 2). Then it fails, because the subquery returns more than one value.
I've thought about using HAVING COUNT(*) == 2 to make the subquery return only the group with exactly 2 rows, but I'm not sure how to do that. Any ideas?
Use group by and having:
select groupid
from groups
where id in (1, 2)
group by groupid
having count(*) = 2;
This assumes that the rows are unique. If you can have duplicates, use count(distinct id) = 2.
If you want 1 & 2 and no other ids, the logic is slightly more complicated:
select groupid
from groups
group by groupid
having sum(case when id = 1 then 1 else 0 end) > 0 and
sum(case when id = 2 then 1 else 0 end) > 0 and
count(*) = 2;
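Both variants can be compared on the question's data. This is a sketch using Python's sqlite3 module; the table is named group_members here (a hypothetical name, chosen to avoid any keyword clash with groups), and the function name is illustrative.

```python
import sqlite3

def find_groups():
    """Return (groups containing ids 1 and 2, groups containing only 1 and 2)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table name; columns follow the question's Groups table.
    conn.execute("CREATE TABLE group_members (id INTEGER, groupid INTEGER)")
    conn.executemany(
        "INSERT INTO group_members VALUES (?, ?)",
        [(1, 3), (2, 3), (3, 2), (1, 2), (2, 2), (3, 2)],
    )
    contains_both = """
        SELECT groupid
        FROM group_members
        WHERE id IN (1, 2)
        GROUP BY groupid
        HAVING COUNT(DISTINCT id) = 2
        ORDER BY groupid
    """
    only_both = """
        SELECT groupid
        FROM group_members
        GROUP BY groupid
        HAVING SUM(CASE WHEN id = 1 THEN 1 ELSE 0 END) > 0
           AND SUM(CASE WHEN id = 2 THEN 1 ELSE 0 END) > 0
           AND COUNT(*) = 2
        ORDER BY groupid
    """
    loose = [row[0] for row in conn.execute(contains_both)]
    strict = [row[0] for row in conn.execute(only_both)]
    conn.close()
    return loose, strict

print(find_groups())
```

Group 2 contains ids 1 and 2 but also id 3 (twice), so only the stricter query narrows the result to group 3.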

Unable to use lag function correctly in sql

I have created a table from multiple tables like this:
Week | Cid | CustId | L1
10 | 1 | 1 | 2
10 | 2 | 1 | 2
10 | 5 | 1 | 2
10 | 4 | 1 | 1
10 | 3 | 2 | 1
4 | 6 | 1 | 2
4 | 7 | 1 | 2
I want the output as:
Repeat
0
1
1
0
0
0
1
So, basically, what I want is this: within each week, if a person (custid) comes in again with the same L1, the value in the Repeat column should become 1, otherwise 0. (Here, in rows 2 and 3, custid 1 came with L1=2 again, so those rows get 1 in the "Repeat" column; in row 4, custid 1 came with L1=1, so that row gets the value 0.)
By the way, the table isn't ordered (as I've shown).
I'm trying to do it as follows:
select t.*,
lag(0, 1, 0) over (partition by week, custid, L1 order by cid) as repeat
from
table;
But this does not give the expected output; it returns an empty result.
I think you need a case, but I would use row_number() for this:
select t.*,
(case when row_number() over (partition by week, custid, l1 order by cid) = 1
then 0 else 1
end) as repeat
from table t;
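The row_number() approach can be checked against the expected output in the question. The sketch below uses Python's sqlite3 module and assumes the bundled SQLite is version 3.25 or later (needed for window functions); the table name visits and the function name are hypothetical.

```python
import sqlite3

def repeat_flags():
    """Map each cid to 0 for the first visit in its partition, else 1."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table name; columns and rows follow the question.
    conn.execute(
        "CREATE TABLE visits (week INTEGER, cid INTEGER, custid INTEGER, l1 INTEGER)"
    )
    conn.executemany(
        "INSERT INTO visits VALUES (?, ?, ?, ?)",
        [(10, 1, 1, 2), (10, 2, 1, 2), (10, 5, 1, 2), (10, 4, 1, 1),
         (10, 3, 2, 1), (4, 6, 1, 2), (4, 7, 1, 2)],
    )
    query = """
        SELECT cid,
               CASE WHEN ROW_NUMBER() OVER (PARTITION BY week, custid, l1
                                            ORDER BY cid) = 1
                    THEN 0 ELSE 1
               END AS repeat
        FROM visits
    """
    flags = dict(conn.execute(query).fetchall())
    conn.close()
    return flags

print(repeat_flags())
```

Read in the question's row order (cid 1, 2, 5, 4, 3, 6, 7), the flags are 0, 1, 1, 0, 0, 0, 1, matching the requested Repeat column.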
This can also be computed without window functions, using a self-join (PostgreSQL syntax, where b IS NULL tests the whole joined row):
SELECT a.week, a.cid, a.custid, a.l1,
CASE WHEN b IS NULL THEN 1 ELSE 0 END AS repeat
FROM mytable a NATURAL LEFT JOIN
(SELECT week, min(cid) AS cid, custid, l1 FROM mytable
GROUP BY week,custid,l1) b
ORDER BY week DESC, custid, l1 DESC, cid;
It can be done simply by using count(*) as an analytic function. No CASE expression or self-join is needed. The query is also portable across databases that support analytic functions and least():
SELECT cust.*, least(count(*)
OVER (PARTITION BY Week, CustId, L1 ORDER BY Cid
ROWS UNBOUNDED PRECEDING) - 1, 1) repeat
FROM cust ORDER BY Week DESC, custId, L1 DESC;
Executing the query on your data results in the following output (the last column is the repeat column):
Week | Cid | CustId | L1 | repeat
10 | 1 | 1 | 2 | 0
10 | 2 | 1 | 2 | 1
10 | 5 | 1 | 2 | 1
10 | 4 | 1 | 1 | 0
10 | 3 | 2 | 1 | 0
4 | 6 | 1 | 2 | 0
4 | 7 | 1 | 2 | 1
Tested on Oracle 11g and PostgreSQL 9.4. Note that the second ORDER BY is optional. See Oracle Language Reference, Analytic Functions for more details.
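For readers without Oracle or PostgreSQL at hand, the running-count idea can be approximated in SQLite through Python's sqlite3 module. This is a sketch under stated assumptions: SQLite 3.25 or later is required for window functions, and because SQLite has no least() function, its two-argument scalar min() is swapped in. The table name cust follows the answer; the function name is illustrative.

```python
import sqlite3

def repeat_flags_by_running_count():
    """Map each cid to min(running count within its partition - 1, 1)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cust (week INTEGER, cid INTEGER, custid INTEGER, l1 INTEGER)"
    )
    # Sample rows copied from the question.
    conn.executemany(
        "INSERT INTO cust VALUES (?, ?, ?, ?)",
        [(10, 1, 1, 2), (10, 2, 1, 2), (10, 5, 1, 2), (10, 4, 1, 1),
         (10, 3, 2, 1), (4, 6, 1, 2), (4, 7, 1, 2)],
    )
    query = """
        SELECT cid,
               min(running_count - 1, 1) AS repeat  -- stands in for least()
        FROM (SELECT cid,
                     COUNT(*) OVER (PARTITION BY week, custid, l1
                                    ORDER BY cid
                                    ROWS BETWEEN UNBOUNDED PRECEDING
                                             AND CURRENT ROW) AS running_count
              FROM cust) counted
    """
    flags = dict(conn.execute(query).fetchall())
    conn.close()
    return flags

print(repeat_flags_by_running_count())
```

The running count is 1 on the first row of each (week, custid, l1) partition, so subtracting 1 and capping at 1 gives 0 for the first visit and 1 for every later one.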

Simple group-by for SQL pull

I have the following table:
Check | Email | Count
Y | a | 1
Y | a | 1
Y | b | 1
N | c | 1
N | d | 1
I want to group it by 'check' and number of counts under each email. So like this:
Check | Count # | Email Addresses
Y | 1 count | 1 (refers to email b)
Y | 2+ counts | 1 (refers to email a)
N | 1 count | 2 (refers to email c & d)
N | 2+ counts | 0 (no emails meet this condition)
Every 'check' value is specific to an email
This is most easily done by putting the values in columns not rows.
But it requires two levels of aggregation:
select check, sum(case when cnt = 1 then 1 else 0 end) as cnt_1,
sum(case when cnt >= 2 then 1 else 0 end) as cnt_2plus
from (select check, email, count(*) as cnt
from t
group by check, email
) ce
group by check;
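The two-level aggregation above can be tried with Python's sqlite3 module. In this sketch the column is renamed to chk, since check is a reserved word in SQLite and many other databases; the table name t follows the answer, and the function name is an assumption.

```python
import sqlite3

def email_buckets():
    """Map each chk value to (emails with 1 row, emails with 2+ rows)."""
    conn = sqlite3.connect(":memory:")
    # 'chk' is a hypothetical rename of the question's 'check' column.
    conn.execute("CREATE TABLE t (chk TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)",
        [("Y", "a"), ("Y", "a"), ("Y", "b"), ("N", "c"), ("N", "d")],
    )
    query = """
        SELECT chk,
               SUM(CASE WHEN cnt = 1 THEN 1 ELSE 0 END) AS cnt_1,
               SUM(CASE WHEN cnt >= 2 THEN 1 ELSE 0 END) AS cnt_2plus
        FROM (SELECT chk, email, COUNT(*) AS cnt
              FROM t
              GROUP BY chk, email
             ) ce
        GROUP BY chk
    """
    buckets = {chk: (one, many) for chk, one, many in conn.execute(query)}
    conn.close()
    return buckets

print(email_buckets())
```

Because the buckets are columns rather than rows, the empty case (N with 2+ counts) shows up naturally as a 0.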
This should work, but there might be a cleaner way to get there. I think you need an extra layer of aggregation to pick up the cases where no email meets the condition. That only works if the source table has a record for those cases (for example, one where the email is null); if it has none, no row is returned for them.
select check
,count_num
,case when email_addresses is null then 0 else email_addresses end as email_addresses
from (
select check, count_num, count(distinct email) as email_addresses
from (
select check, email,
case when sum(count) = 1 then '1' when sum(count) > 1 then '2+' else '0' end as count_num
from table
group by check, email
) per_email
group by check, count_num
) per_bucket

SQL Server - Sum entire column AND Group By

Suppose I had the following table in SQL Server:
grp: val: criteria:
a 1 1
a 1 1
b 1 1
b 1 1
b 1 1
c 1 1
c 1 1
c 1 1
d 1 1
Now what I want is to get an output which would basically be:
Select grp, val / [sum(val) for all records] grouped by grp where criteria = 1
So, given the following is true:
Sum of all values = 9
Sum of values in grp(a) = 2
Sum of values in grp(b) = 3
Sum of values in grp(c) = 3
Sum of values in grp(d) = 1
The output would be as follows:
grp: calc:
a 2/9
b 3/9
c 3/9
d 1/9
What would my SQL have to look like??
Thanks!!
You should be able to use something like this which uses sum() over():
select distinct grp,
sum(val) over(partition by grp)
/ (sum(val) over(partition by criteria)*1.0) Total
from yourtable
where criteria = 1
The result is:
| GRP | TOTAL |
------------------------
| a | 0.222222222222 |
| b | 0.333333333333 |
| c | 0.333333333333 |
| d | 0.111111111111 |
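The same sum() over() idea can be reproduced outside SQL Server. This sketch uses Python's sqlite3 module and assumes SQLite 3.25 or later for window functions; the table name yourtable follows the answer, and the function name is illustrative.

```python
import sqlite3

def group_shares():
    """Map each grp to its share of the total val among rows with criteria = 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourtable (grp TEXT, val INTEGER, criteria INTEGER)")
    # Sample rows copied from the question: a x2, b x3, c x3, d x1.
    rows = [(g, 1, 1) for g in "aabbbcccd"]
    conn.executemany("INSERT INTO yourtable VALUES (?, ?, ?)", rows)
    query = """
        SELECT DISTINCT grp,
               SUM(val) OVER (PARTITION BY grp) * 1.0
                 / SUM(val) OVER (PARTITION BY criteria) AS total
        FROM yourtable
        WHERE criteria = 1
    """
    shares = dict(conn.execute(query).fetchall())
    conn.close()
    return shares

print(group_shares())
```

Multiplying by 1.0 forces floating-point division; without it, dividing two integer sums would truncate to 0.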
I completely agree with @bluefeet's response. This is just a slightly more database-independent approach (it should work with most RDBMS):
select distinct
grp,
sum(val)/cast(total as decimal)
from yourtable
cross join
(
select SUM(val) as total
from yourtable
) sumtable
where criteria = 1
GROUP BY grp, total
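The cross-join variant needs no window functions, so it can be sketched with any SQLite version through Python's sqlite3 module. One swap is made here: SQLite's CAST(... AS decimal) does not force non-integer division, so the sketch multiplies by 1.0 instead. The function name is an assumption.

```python
import sqlite3

def group_shares_cross_join():
    """Map each grp to SUM(val) divided by the grand total from a cross join."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourtable (grp TEXT, val INTEGER, criteria INTEGER)")
    # Sample rows copied from the question: a x2, b x3, c x3, d x1.
    rows = [(g, 1, 1) for g in "aabbbcccd"]
    conn.executemany("INSERT INTO yourtable VALUES (?, ?, ?)", rows)
    query = """
        SELECT grp,
               SUM(val) * 1.0 / total AS share  -- 1.0 replaces the decimal cast
        FROM yourtable
        CROSS JOIN (SELECT SUM(val) AS total
                    FROM yourtable) sumtable
        WHERE criteria = 1
        GROUP BY grp, total
    """
    shares = dict(conn.execute(query).fetchall())
    conn.close()
    return shares

print(group_shares_cross_join())
```

Note that the subquery's total is not filtered by criteria, so it covers every row in the table. If the denominator should count only rows with criteria = 1, the same WHERE clause would need to go inside the subquery as well.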