Select groups given a condition in a variable sql - sql

I need a query that selects whole groups (households), each identified by the combination of sample and serial, in which at least one row has bplcountry = 1.
sample serial bplcountry
1 1 2
1 1 1
1 3 2
2 1 2
2 2 2
2 3 2
3 1 2
3 3 2
3 3 1
I have done some research, but I'm a beginner at SQL. I found a hint like this:
SELECT *
FROM latinCensus
GROUP BY sample AND serial
HAVING COUNT(bplcountry NOT IN ('1') OR NULL) = 0
I also found this idea:
SELECT *
FROM latinCensus
GROUP BY CONCAT(sample,serial)
HAVING COUNT(bplcountry NOT IN ('1') OR NULL) = 0
I would expect something like this:
sample serial bplcountry
1 1 2
1 1 1
3 3 2
3 3 1
I will appreciate your help!

You want the (sample, serial) pairs where at least one row has bplcountry = 1. You can use window functions:
select lc.*
from (select lc.*,
sum(case when bplcountry = 1 then 1 else 0 end) over (partition by sample, serial) as cnt_1
from latincensus lc
) lc
where cnt_1 > 0;
Or use exists:
select lc.*
from latincensus lc
where exists (select 1
from latincensus lc2
where lc2.sample = lc.sample and lc2.serial = lc.serial and
lc2.bplcountry = 1
);
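For anyone who wants to try this locally, here is a minimal sketch that runs the EXISTS version against the sample data from the question. It uses Python's sqlite3 module as a stand-in engine (an assumption, since the question does not name a database), and the helper name households_with_country is made up for illustration.

```python
import sqlite3

# Sample rows from the question: (sample, serial, bplcountry).
ROWS = [
    (1, 1, 2), (1, 1, 1), (1, 3, 2),
    (2, 1, 2), (2, 2, 2), (2, 3, 2),
    (3, 1, 2), (3, 3, 2), (3, 3, 1),
]

def households_with_country(rows, country=1):
    """Return all rows of every (sample, serial) group that has a row with the given bplcountry."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE latincensus (sample INT, serial INT, bplcountry INT)")
    conn.executemany("INSERT INTO latincensus VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT lc.sample, lc.serial, lc.bplcountry
        FROM latincensus lc
        WHERE EXISTS (SELECT 1
                      FROM latincensus lc2
                      WHERE lc2.sample = lc.sample
                        AND lc2.serial = lc.serial
                        AND lc2.bplcountry = ?)
        ORDER BY lc.sample, lc.serial, lc.bplcountry DESC
        """,
        (country,),
    ).fetchall()
    conn.close()
    return result

print(households_with_country(ROWS))
# prints [(1, 1, 2), (1, 1, 1), (3, 3, 2), (3, 3, 1)]
```

The ORDER BY is only there to make the output deterministic; the filtering itself is done entirely by the EXISTS subquery.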

You haven't tagged your database, but something along these lines should work (it can also be expressed using joins):
select sample, serial, bplcountry
from t
where (sample,serial) in (select sample,serial
from t
where bplcountry=1);

Related

SQL Query to get multiple resultant on single column

I have a table that looks something like this:
id name status
2 a 1
2 a 2
2 a 3
2 a 2
2 a 1
3 b 2
3 b 1
3 b 2
3 b 1
and the result I want is:
id name total count count(status3) count(status2) count(status1)
2 a 5 1 2 2
3 b 4 0 2 2
Please help me get this result. I can only get id, name, or one of the counts at a time, and I don't know how to write a single query that produces this table at once.
Here's a simple solution using group by and case when.
select id
,count(*) as 'total count'
,count(case status when 3 then 1 end) as 'count(status3)'
,count(case status when 2 then 1 end) as 'count(status2)'
,count(case status when 1 then 1 end) as 'count(status1)'
from t
group by id
id total count count(status3) count(status2) count(status1)
2 5 1 2 2
3 4 0 2 2
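To check the group by / case when approach against the question's data, here is a small sketch using Python's sqlite3 module. SQLite is an assumption (the question does not say which engine is used), the column aliases are simplified to plain identifiers, name is added to the grouping so it shows up in the output, and status_counts is a hypothetical helper name.

```python
import sqlite3

# Sample rows from the question: (id, name, status).
STATUS_ROWS = [
    (2, "a", 1), (2, "a", 2), (2, "a", 3), (2, "a", 2), (2, "a", 1),
    (3, "b", 2), (3, "b", 1), (3, "b", 2), (3, "b", 1),
]

def status_counts(rows):
    """Return (id, name, total, status3, status2, status1) per id using conditional counts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INT, name TEXT, status INT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT id,
               name,
               COUNT(*) AS total_count,
               -- CASE yields NULL for non-matching rows, and COUNT skips NULLs
               COUNT(CASE status WHEN 3 THEN 1 END) AS status3,
               COUNT(CASE status WHEN 2 THEN 1 END) AS status2,
               COUNT(CASE status WHEN 1 THEN 1 END) AS status1
        FROM t
        GROUP BY id, name
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return result

print(status_counts(STATUS_ROWS))
# prints [(2, 'a', 5, 1, 2, 2), (3, 'b', 4, 0, 2, 2)]
```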
Here's a way to solve it using pivot.
select *
from (select status,id, count(*) over (partition by id) as "total count" from t) tmp
pivot (count(status) for status in ([1],[2],[3])) pvt
id total count 1 2 3
3 4 2 2 0
2 5 2 2 1

Best way to by column and aggregation on another column

I want to create a rank column using existing rank and binary columns. Suppose, for example, a table with ID, RISK, CONTACT, DATE. The existing rank is RISK, with values 1, 2, 3, NULL, 3 being the highest. The binary-valued column is CONTACT, with values 0/1 or FAILURE/SUCCESS. I want to create a new RANK that orders by RISK only once a certain number of successful contacts has been reached.
For example, suppose the constraint is a minimum of 2 successful contacts. Then the rank should be created as follows in the two instances below:
Instance 1. Three ID, all have a min of two successful contacts. In that case the rank mirrors the risk:
ID risk contact date rank
1 3 S 1 3
1 3 S 2 3
1 3 F 3 3
1 3 F 4 3
2 2 S 1 2
2 2 S 2 2
2 2 F 3 2
2 2 F 4 2
3 1 S 1 1
3 1 S 2 1
3 1 S 3 1
Instance 2. Suppose ID=1 has only one successful contact. In that case it is relegated to the lowest rank, rank=1, while ID=2 gets the highest value, rank=3, and ID=3 maps to rank=2 because it satisfies the constraint but has a lower risk value than ID=2:
ID risk contact date rank
1 3 S 1 1
1 3 F 2 1
1 3 F 3 1
1 3 F 4 1
2 2 S 1 3
2 2 S 2 3
2 2 F 3 3
2 2 F 4 3
3 1 S 1 2
3 1 S 2 2
3 1 S 3 2
This is SQL, specifically Hive. Thanks in advance.
Edit - I think Gordon Linoff's code does it correctly. In the end, I used three interim tables. The code looks like this:
First,
--numerize risk, contact
select A.* ,
case when A.risk = 'H' then 3
when A.risk = 'M' then 2
when A.risk = 'L' then 1
when A.risk is NULL then NULL
when A.risk = 'NULL' then NULL
else -999 end as RISK_RANK,
case when A.contact = 'Successful' then 1
else NULL end as success
from T as A
Second,
-- sum_successes_by_risk
select A.* ,
B.sum_successes_by_risk
from T as A
inner join
(select A.person, A.program, A.risk, sum(a.success) as sum_successes_by_risk
from T as A
group by A.person, A.program, A.risk
) as B
on A.program = B.program
and A.person = B.person
and A.risk = B.risk
Third,
--Create table that contains only max risk category
select A.* ,
B.max_risk_rank
from T as A
inner join
(select A.person, max(A.risk_rank) as max_risk_rank
from T as A
group by A.person
) as B
on A.person = B.person
and A.risk_rank = B.max_risk_rank
This is hard to follow, but I think you just want window functions:
select t.*,
(case when sum(case when contact = 'S' then 1 else 0 end) over (partition by id) >= 2
then risk
else 1
end) as new_risk
from t;
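Here is a hedged sketch of what this window-function query returns on the Instance 2 data from the question. It runs in SQLite through Python's sqlite3 module (an assumption; the question is about Hive, and window functions need SQLite 3.25 or later), and new_risk_per_id is a made-up helper name. Note that it reproduces the conditional logic of the query as written (risk when the threshold is met, otherwise 1); it does not re-rank the qualifying IDs to 3 and 2 as the Instance 2 table does.

```python
import sqlite3

# Instance 2 from the question: (id, risk, contact, date).
CONTACT_ROWS = [
    (1, 3, "S", 1), (1, 3, "F", 2), (1, 3, "F", 3), (1, 3, "F", 4),
    (2, 2, "S", 1), (2, 2, "S", 2), (2, 2, "F", 3), (2, 2, "F", 4),
    (3, 1, "S", 1), (3, 1, "S", 2), (3, 1, "S", 3),
]

def new_risk_per_id(rows, min_successes=2):
    """Return (id, new_risk): risk if the id has enough successful contacts, else 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE t (id INT, risk INT, contact TEXT, "date" INT)')
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT DISTINCT
               id,
               CASE WHEN SUM(CASE WHEN contact = 'S' THEN 1 ELSE 0 END)
                         OVER (PARTITION BY id) >= ?
                    THEN risk
                    ELSE 1
               END AS new_risk
        FROM t
        ORDER BY id
        """,
        (min_successes,),
    ).fetchall()
    conn.close()
    return result

print(new_risk_per_id(CONTACT_ROWS))
# prints [(1, 1), (2, 2), (3, 1)]
```

ID 1 has only one successful contact, so it drops to 1, while IDs 2 and 3 keep their original risk values.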

Get a percentage of all in Access SQL

I have a table containing a list of features that will be implemented by a given team for a given release, with a flag to tell me if the feature is testable or not.
Sample data can be:
feature team rel testable
1 1 1 1
2 1 1 1
3 1 1 1
4 1 2 1
5 1 2 1
6 1 2 0
7 1 3 0
8 1 3 0
9 1 3 1
10 2 1 0
11 2 1 0
12 2 1 0
13 2 2 1
14 2 2 0
15 2 2 0
16 2 3 1
17 2 3 1
18 2 3 0
What I am trying to get is, for each team and each release, the percentage of testable features (over the overall count of features for this team and release).
Ideally I would like to keep it as a single SQL query due to the way I designed the display of the result.
I went as far as this:
SELECT
MyTable.team AS team,
MyTable.rel AS rel,
(COUNT(*)*100 / (
SELECT COUNT(*)
FROM MyTable
WHERE
[MyTable].team = team
AND [MyTable].rel = rel
)
) AS result
FROM MyTable
WHERE
MyTable.team IN (1,2)
AND MyTable.rel IN (1,2,3)
AND MyTable.testable = 1
GROUP BY
MyTable.rel,
MyTable.team
ORDER BY
MyTable.team,
MyTable.rel
Here is the result I expect (I don't really care about the rounding)
team rel result
1 1 1 // all are testable for team 1 release 1
1 2 0.66 // 2 out of 3 are testable for team 1 release 2
1 3 0.33
2 1 0
2 2 0.33
2 3 0.66
My feeling is that I am not that far from the solution, but I am not able to fix it.
I would think a simple average would work here, assuming all values in the testable field are only 1 or 0: the average of a 0/1 flag is exactly the fraction of rows where it is 1.
You also need to remove testable = 1 from the where clause, otherwise the non-testable rows never reach the average.
I'm not sure whether Access will implicitly cast the Boolean, so the iif converts the value to 1 or 0 explicitly to make the avg work.
SELECT
MyTable.team AS team,
MyTable.rel AS rel,
AVG(iif(Testable,1,0)) AS result
FROM MyTable
WHERE
MyTable.team IN (1,2)
AND MyTable.rel IN (1,2,3)
GROUP BY
MyTable.rel,
MyTable.team
ORDER BY
MyTable.team,
MyTable.rel
select y.team, y.rel, x.cnt/y.tot as res
from (
select t.team, t.rel, sum(x.cnt) as tot
from (
select team, rel, testable, count(*) as cnt
from table where team in (1,2) and rel in (1,2,3)
group by team, rel, testable) x
join table t on t.team = x.team and t.rel = x.rel
group by team, rel) y
You can try this.

SQL get all IDs where Sub-IDs are exactly specified without getting other IDs where some Sub-ID's are not present

Sorry for that title, I don't know how to describe my problem in one sentence.
I have Table like this:
event | thema
-------------
1 1
1 2
2 1
2 2
2 3
3 1
3 2
3 3
3 4
4 1
4 2
4 3
What I want are the event IDs where the thema values are exactly 1, 2 and 3, not the event IDs where they are only 1 and 2, or 1, 2, 3 and 4.
SELECT event WHERE thema=1 OR thema=2 OR thema=3
returns them all
SELECT event WHERE thema=1 AND thema=2 AND thema=3
returns nothing.
I think this should be absolutely simple, but stack is overflown...
Thanks for some help!
Group by the event and keep only those events that have at least one row each for thema 1, 2 and 3, and no row for any other thema:
SELECT event
from your_table
group by event
having sum(case when thema = 1 then 1 else 0 end) > 0
and sum(case when thema = 2 then 1 else 0 end) > 0
and sum(case when thema = 3 then 1 else 0 end) > 0
and sum(case when thema not in (1,2,3) then 1 else 0 end) = 0
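The HAVING approach above can be checked against the table from the question with a short sketch like this one. It assumes SQLite via Python's sqlite3 module (the question does not name an engine), and events_with_exact_themas is a made-up helper name.

```python
import sqlite3

# Sample rows from the question: (event, thema).
EVENT_ROWS = [
    (1, 1), (1, 2),
    (2, 1), (2, 2), (2, 3),
    (3, 1), (3, 2), (3, 3), (3, 4),
    (4, 1), (4, 2), (4, 3),
]

def events_with_exact_themas(rows):
    """Return the events whose thema set is exactly {1, 2, 3}."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (event INT, thema INT)")
    conn.executemany("INSERT INTO your_table VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT event
        FROM your_table
        GROUP BY event
        HAVING SUM(CASE WHEN thema = 1 THEN 1 ELSE 0 END) > 0
           AND SUM(CASE WHEN thema = 2 THEN 1 ELSE 0 END) > 0
           AND SUM(CASE WHEN thema = 3 THEN 1 ELSE 0 END) > 0
           -- the last condition rejects events that carry any extra thema
           AND SUM(CASE WHEN thema NOT IN (1, 2, 3) THEN 1 ELSE 0 END) = 0
        ORDER BY event
        """
    ).fetchall()
    conn.close()
    return [event for (event,) in result]

print(events_with_exact_themas(EVENT_ROWS))
# prints [2, 4]
```

Event 1 is dropped because it lacks thema 3, and event 3 is dropped because it also has thema 4.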
This type of query is a "set-within-sets" query (you are looking for sets of "thema" within each event). The most general approach is aggregation with a having clause. This might be the shortest way to write the query using standard SQL; the second condition rejects events that have any thema outside 1, 2, 3:
select event
from your_table t
group by event
having count(distinct (case when thema in (1, 2, 3) then thema end)) = 3
and count(distinct thema) = 3;
Or, first create a table #themas (depending on the vendor, make this a temp table or a table variable) that contains the user-specified list of thema values, then:
Select event from your_table y
Where not exists
(Select * From #Themas t
where Not Exists
(Select * From your_table
where event = y.event
and thema = t.thema))
and not exists (Select * From your_table
where event = y.event
and thema not in
(Select thema From #Themas ))

Group By Questions

Here is a table of profile answers:
profile_id | answer_id
----------------------
1 1
1 4
1 10
Here is a table which contains a list of responses by poll respondents:
user_id | answer_id
-------------------
1 1
1 9
2 1
2 4
2 10
3 14
3 29
I want to return a list of users who gave an answer in (6,9) and also an answer in (1,10); basically, all of the answers that match profile 1.
How can I write this select query?
I tried the following, but apparently I don't quite understand how group by works:
SELECT DISTINCT [user_id]
FROM [user_question_answers] a
GROUP BY a.[user_id]
HAVING a.[answer_id] IN (6,9)
AND a.[answer_id] IN (1,10)
EDIT: Return user_id 1 only
Your query is close. The having clause is evaluated once per group, so each condition has to be wrapped in an aggregate:
SELECT [user_id]
FROM [user_question_answers] a
GROUP BY a.[user_id]
HAVING max(case when a.[answer_id] IN (6,9) then 1 else 0 end) = 1
AND max(case when a.[answer_id] IN (1,10) then 1 else 0 end) = 1
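As a rough check, the sketch below runs the same HAVING logic on the respondent table from the question. It assumes SQLite through Python's sqlite3 module (the square-bracket quoting in the question suggests SQL Server, so this is only a stand-in), and users_matching_both_sets is a hypothetical helper name.

```python
import sqlite3

# Sample rows from the question: (user_id, answer_id).
USER_ANSWERS = [
    (1, 1), (1, 9),
    (2, 1), (2, 4), (2, 10),
    (3, 14), (3, 29),
]

def users_matching_both_sets(rows):
    """Return users with at least one answer in (6, 9) and at least one in (1, 10)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user_question_answers (user_id INT, answer_id INT)")
    conn.executemany("INSERT INTO user_question_answers VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT user_id
        FROM user_question_answers
        GROUP BY user_id
        -- MAX over a 0/1 flag is 1 when any row in the group matches
        HAVING MAX(CASE WHEN answer_id IN (6, 9) THEN 1 ELSE 0 END) = 1
           AND MAX(CASE WHEN answer_id IN (1, 10) THEN 1 ELSE 0 END) = 1
        ORDER BY user_id
        """
    ).fetchall()
    conn.close()
    return [user_id for (user_id,) in result]

print(users_matching_both_sets(USER_ANSWERS))
# prints [1]
```

User 2 matches the second set but has no answer in (6, 9), so only user 1 is returned, as the question's edit expects.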