How to return values from same table? - sql

I have two tables, A and B. I want to return all records from A and only the matching records from B, which I can do with a left join. After joining, however, I want to return records based on a flag in the same table.
Table A:
| Col1 | Col2 |
|------|------|
| 123 | 12 |
| 456 | 34 |
| 789 | 56 |
Table B:
| Col1 | Col2 | Col3 | Col4 | Col5 |
|------|------|------|------|------|
| 123 | 12 | NULL | I | 1 |
| 456 | 34 | NULL | E | 1 |
| 111 | 98 | NULL | I | 1 |
| 222 | 99 | NULL | E | 1 |
| 123 | 12 | AB | NULL | 2 |
| 456 | 34 | CD | NULL | 2 |
| 123 | 12 | EF | NULL | 2 |
| 111 | 98 | GH | NULL | 2 |
| 222 | 99 | IJ | NULL | 2 |
After left joining A and B, the result looks like this:
| Col1 | Col2 | Col3 | Col4 | Col5 |
|------|------|------|------|------|
| 123 | 12 | NULL | I | 1 |
| 456 | 34 | NULL | E | 1 |
| 123 | 12 | AB | NULL | 2 |
| 456 | 34 | CD | NULL | 2 |
| 123 | 12 | EF | NULL | 2 |
| 789 | 56 | NULL | NULL | NULL |
The values 1 and 2 in Col5 indicate which column is populated: 1 means Col4, 2 means Col3.
I want to return all the records for 'I' in Col4 (excluding the record that itself has 'I'), which would look like this:
| Col1 | Col2 | Col3 | Col4 | Col5 |
|------|------|------|--------|------|
| 123 | 12 | AB | (null) | 2 |
| 123 | 12 | EF | (null) | 2 |
I also want to return records for 'E' in Col4 (again excluding the record that itself has 'E'), but with every Col3 value other than the one that belongs to that key, in this case CD. That would look like this:
| Col1 | Col2 | Col3 | Col4 | Col5 |
|------|------|------|--------|------|
| 456 | 34 | AB | (null) | 2 |
| 456 | 34 | EF | (null) | 2 |
| 456 | 34 | GH | (null) | 2 |
| 456 | 34 | IJ | (null) | 2 |
Can someone suggest how to handle this in SQL?

OK, I believe the following two queries achieve your desired results. You can see all the sample code in the following SQL Fiddle.
Existence Rule:
select A.*
, B.Col3
, B.Col4
, B.Col5
from TableA A
JOIN TableB B
on A.Col1 = B.Col1
and A.Col2 = B.Col2
and B.Col5 = 2
where exists (select 1 from TableB C
where C.col1 = B.col1 and C.col2 = B.col2
and c.col4 = 'I' AND C.col5 = 1)
Results:
| Col1 | Col2 | Col3 | Col4 | Col5 |
|------|------|------|--------|------|
| 123 | 12 | AB | (null) | 2 |
| 123 | 12 | EF | (null) | 2 |
Exclusion Rule:
select A.*
, B.Col3
, B.Col4
, B.Col5
from TableA A
CROSS JOIN TableB B
where b.col5 = 2
and exists (select 1 from TableB C
where C.col1 = a.col1 and C.col2 = a.col2
and c.col4 = 'E' AND C.col5 = 1)
and b.col3 not in (select d.col3 from TableB d
where d.col1 = a.col1 and d.col2 = a.col2 and d.col5 = 2)
Results:
| Col1 | Col2 | Col3 | Col4 | Col5 |
|------|------|------|--------|------|
| 456 | 34 | AB | (null) | 2 |
| 456 | 34 | EF | (null) | 2 |
| 456 | 34 | GH | (null) | 2 |
| 456 | 34 | IJ | (null) | 2 |
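The two rules can be checked against the question's sample rows with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for whatever database the question targets, the variable names `existence_rows` and `exclusion_rows` are made up for illustration, and the inner alias in the exclusion rule is written as `D` so it does not shadow the outer `B`.

```python
import sqlite3

# In-memory SQLite database loaded with the question's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (Col1 INT, Col2 INT);
CREATE TABLE TableB (Col1 INT, Col2 INT, Col3 TEXT, Col4 TEXT, Col5 INT);
INSERT INTO TableA VALUES (123, 12), (456, 34), (789, 56);
INSERT INTO TableB VALUES
  (123, 12, NULL, 'I', 1), (456, 34, NULL, 'E', 1),
  (111, 98, NULL, 'I', 1), (222, 99, NULL, 'E', 1),
  (123, 12, 'AB', NULL, 2), (456, 34, 'CD', NULL, 2),
  (123, 12, 'EF', NULL, 2), (111, 98, 'GH', NULL, 2),
  (222, 99, 'IJ', NULL, 2);
""")

# Existence rule: detail rows (Col5 = 2) whose key also has an 'I' flag row.
existence_rows = conn.execute("""
SELECT A.Col1, A.Col2, B.Col3, B.Col4, B.Col5
FROM TableA A
JOIN TableB B
  ON A.Col1 = B.Col1 AND A.Col2 = B.Col2 AND B.Col5 = 2
WHERE EXISTS (SELECT 1 FROM TableB C
              WHERE C.Col1 = B.Col1 AND C.Col2 = B.Col2
                AND C.Col4 = 'E' IS NOT 1 AND C.Col4 = 'I' AND C.Col5 = 1)
ORDER BY B.Col3
""").fetchall()

# Exclusion rule: for keys flagged 'E', every Col3 value in TableB
# except the ones that belong to that key.
exclusion_rows = conn.execute("""
SELECT A.Col1, A.Col2, B.Col3, B.Col4, B.Col5
FROM TableA A
CROSS JOIN TableB B
WHERE B.Col5 = 2
  AND EXISTS (SELECT 1 FROM TableB C
              WHERE C.Col1 = A.Col1 AND C.Col2 = A.Col2
                AND C.Col4 = 'E' AND C.Col5 = 1)
  AND B.Col3 NOT IN (SELECT D.Col3 FROM TableB D
                     WHERE D.Col1 = A.Col1 AND D.Col2 = A.Col2
                       AND D.Col5 = 2)
ORDER BY B.Col3
""").fetchall()

for row in existence_rows + exclusion_rows:
    print(row)
```

Note that `NOT IN` returns no rows at all if the subquery yields a NULL, so the `D.Col5 = 2` filter (which keeps only rows with a populated Col3 in this data) matters for the exclusion rule.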

Result for I:-
;with cte1 As (select a.col1, a.col2 from A a left join B b on a.col1 = b.col1 and a.col2 = b.col2 where b.col4 = 'I'),
cte2 As (select b.col3, b.col4, b.col5 from A a left join B b on a.col1 = b.col1 and a.col2 = b.col2 where b.col4 <> 'I')
select a.col1, a.col2, b.col3, b.col4, b.col5 from cte1 a cross join cte2 b
Result for E:-
;with cte1 As (select a.col1, a.col2 from A a left join B b on a.col1 = b.col1 and a.col2 = b.col2 where b.col4 = 'E'),
cte2 As (select b.col3, b.col4, b.col5 from A a left join B b on a.col1 = b.col1 and a.col2 = b.col2 where b.col4 <> 'E')
select a.col1, a.col2, b.col3, b.col4, b.col5 from cte1 a cross join cte2 b

select c.col1, c.col2
from
(select a.col1, a.col2, b.col3 from a inner join table b on a.id = b.id
where "condition" ) c
where c.col1 = "condition"
This is the script. The explanation is:
Inside the parentheses I wrote the first select. That is where you do the select with your joins and your conditions. The "c" after the closing parenthesis is the alias of the table generated by the sub-select.
Then you select some values from the generated table and filter them with a where clause that acts on the rows produced by the sub-select.
EDIT: I used your question's names to make it easier

Related

Reduce left outer join in SQL query

Updated with SQL Fiddle.
I have an Oracle query where I need to reduce the number of left outer join to perform efficiently. The current query runs for more than 2 hours and I want to reduce its complexity by reducing the number of join operations.
Without the joins, the query runs in 15 minutes. Hence I want to rewrite the logic. Is there any efficient way to do that?
WITH myquery AS
(
SELECT *
FROM TEST_FILE1
)
SELECT
A.Col3, A.Col1, A.Col2, A.Col4, A.Col5
-- D.CB,
-- NVL(D.CD, 0), NVL(D.CE, 0), NVL(D.EF, 0),
,CASE WHEN V1.Col1 IS NULL THEN 0 ELSE 1 END AS QQ1
,CASE WHEN V2.Col3 IS NULL THEN 0 ELSE 1 END AS QQ2
,CASE WHEN V3.Col1 IS NULL THEN 0 ELSE 1 END AS QQ3
,CASE WHEN V4.Col3 IS NULL THEN 0 ELSE 1 END AS QQ4
, case when V5.Col1 is NULL then 0 else 1 end as QQ5
, case when V6.Col3 is NULL then 0 else 1 end as QQ6
, case when V7.Col1 is NULL then 0 else 1 end as QQ7
, case when V8.Col3 is NULL then 0 else 1 end as QQ8
FROM (
SELECT Col3, Col1, Col2, Col4, Col5
FROM (
SELECT distinct Col3
FROM myquery
) A1
CROSS JOIN (
SELECT distinct Col1
FROM myquery
) A2
CROSS JOIN (
SELECT distinct Col2
FROM myquery
) A3
CROSS JOIN (
SELECT distinct Col4
FROM myquery
) A4
CROSS JOIN (
SELECT distinct Col5
FROM myquery
) A5
WHERE Col3 = 42
) A
LEFT JOIN myquery D on NVL(D.Col3, '-') = NVL(A.Col3, '-') AND NVL(D.Col1, '-') = NVL(A.Col1, '-')
AND NVL(D.Col2, '-') = NVL(A.Col2, '-') AND NVL(D.Col4, '-') = NVL(A.Col4, '-')
AND NVL(D.Col5, '-') = NVL(A.Col5, '-')
LEFT JOIN (
SELECT distinct Col1, Col3, Col5
FROM myquery
) V1 on V1.Col1 = A.Col1 AND V1.Col3 = A.Col3 AND V1.Col5 = A.Col5
LEFT JOIN (
SELECT distinct Col3, Col5, Col2
FROM myquery
) V2 on V2.Col3 = A.Col3 AND V2.Col5 = A.Col5 AND V2.Col2 = A.Col2
LEFT JOIN (
SELECT distinct Col3, Col5, Col1, Col2
FROM myquery
) V3 on V3.Col3 = A.Col3 AND V3.Col5 = A.Col5 AND V3.Col1 = A.Col1 AND V3.Col2 = A.Col2
LEFT JOIN (
SELECT distinct Col3, Col5, Col2
FROM myquery
WHERE Col1 in ('Bert','Myra')
) V4 on V4.Col3 = A.Col3 AND V4.Col5 = A.Col5 AND V4.Col2 = A.Col2
LEFT JOIN (
SELECT distinct Col1, Col3
FROM myquery
) V5 on V5.Col1 = A.Col1 AND V5.Col3 = A.Col3
LEFT JOIN (
SELECT distinct Col3, Col2
FROM myquery
) V6 on V6.Col3 = A.Col3 AND V6.Col2 = A.Col2
LEFT JOIN (
SELECT distinct Col3, Col1, Col2
FROM myquery
) V7 on V7.Col3 = A.Col3 AND V7.Col1 = A.Col1 AND V7.Col2 = A.Col2
LEFT JOIN (
SELECT distinct Col3, Col2
FROM myquery
WHERE Col1 in ('Bert','Myra')
) V8 on V8.Col3 = A.Col3 AND V8.Col2 = A.Col2
So far I have been thinking of using an analytic window function, but I didn't get the desired output. Any leads will be highly appreciated.
Here is the input data in my test_file table:
+------+------+------+------+------+
| COL1 | COL2 | COL3 | COL4 | COL5 |
+------+------+------+------+------+
| Bert | "M" | 42 | 68 | 166 |
| Carl | "M" | 32 | 70 | 155 |
| Dave | "M" | 39 | 72 | 167 |
| Elly | "F" | 30 | 66 | 124 |
| Fran | "F" | 33 | 66 | 115 |
| Hank | "M" | 30 | 71 | 158 |
| Jake | "M" | 32 | 69 | 143 |
| Luke | "M" | 34 | 72 | 163 |
| Neil | "M" | 36 | 75 | 160 |
| Page | "F" | 31 | 67 | 135 |
| Alex | "M" | 41 | 74 | 170 |
| Gwen | "F" | 26 | 64 | 121 |
| Ivan | "M" | 53 | 72 | 175 |
| Kate | "F" | 47 | 69 | 139 |
| Myra | "F" | 23 | 62 | 98 |
| Omar | "M" | 38 | 70 | 145 |
| Quin | "M" | 29 | 71 | 176 |
| Ruth | "F" | 28 | 65 | 131 |
+------+------+------+------+------+
From this table I want to create every possible combination by taking the distinct values of each column and applying a cross join. This produces 7776 records with my filter on col3 = 42, because I want all possible combinations for this one col3 value only.
With these combinations I want to check which column combinations do not exist in the data (the joined columns come back NULL), using several left outer joins.
Output(partial):
+------+------+------+------+------+-----+-----+-----+-----+-----+-----+-----+-----+
| COL3 | COL1 | COL2 | COL4 | COL5 | QQ1 | QQ2 | QQ3 | QQ4 | QQ5 | QQ6 | QQ7 | QQ8 |
+------+------+------+------+------+-----+-----+-----+-----+-----+-----+-----+-----+
| 42 | Page | "F" | 68 | 176 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 42 | Alex | "F" | 62 | 143 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 42 | Fran | "M" | 66 | 175 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| 42 | Omar | "F" | 70 | 176 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 42 | Elly | "M" | 72 | 124 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| 42 | Quin | "M" | 64 | 160 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| 42 | Omar | "M" | 64 | 158 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| 42 | Kate | "F" | 62 | 176 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 42 | Neil | "F" | 69 | 145 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 42 | Dave | "F" | 62 | 163 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 42 | Ruth | "M" | 70 | 115 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| 42 | Bert | "M" | 65 | 121 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| 42 | Bert | "M" | 72 | 145 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| 42 | Omar | "M" | 62 | 158 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| 42 | Ruth | "M" | 75 | 131 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
+------+------+------+------+------+-----+-----+-----+-----+-----+-----+-----+-----+
When checking whether data exists in a table we use EXISTS or IN, not JOIN (SELECT DISTINCT ...). Hence this is the query I'd probably come up with:
WITH myquery AS
(
SELECT * FROM TEST_FILE1
)
, a as
(
select col1, col2, 42 as col3, col4, col5
from
(
(select distinct col1 from myquery)
cross join
(select distinct col2 from myquery)
cross join
(select distinct col4 from myquery)
cross join
(select distinct col5 from myquery)
)
)
select
a.col1, a.col2, a.col3, a.col4, a.col5,
case when (col1, col3, col5) in (select col1, col3, col5 from myquery ) then 1 else 0 end as v1,
case when (col2, col3, col5) in (select col2, col3, col5 from myquery ) then 1 else 0 end as v2,
case when (col1, col2, col3, col5) in (select col1, col2, col3, col5 from myquery ) then 1 else 0 end as v3,
case when (col2, col3, col5) in (select col2, col3, col5 from myquery where col1 in ('Bert', 'Myra')) then 1 else 0 end as v4,
case when (col1, col3) in (select col1, col3 from myquery ) then 1 else 0 end as v5,
case when (col2, col3) in (select col2, col3 from myquery ) then 1 else 0 end as v6,
case when (col1, col2, col3) in (select col1, col2, col3 from myquery ) then 1 else 0 end as v7,
case when (col2, col3) in (select col2, col3 from myquery where col1 in ('Bert', 'Myra')) then 1 else 0 end as v8
from a
order by a.col1, a.col2, a.col3, a.col4, a.col5;
If your real WITH myquery AS (...) is more than a mere SELECT * FROM TEST_FILE1, you may want to use the /*+MATERIALIZE*/ hint in it to speed up access to it.
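To see the existence-check idea on the question's data, here is a minimal sketch. It runs on SQLite through Python's sqlite3 module rather than Oracle, so it uses correlated EXISTS instead of Oracle's tuple IN, and it computes only two of the eight flags (v5 and v6). The subquery aliases `c1`, `c2`, `c4`, `c5` and the variable name `result` are made up for illustration.

```python
import sqlite3

# The 18 sample rows from the question (col2 stored without the quotes).
rows = [
    ("Bert", "M", 42, 68, 166), ("Carl", "M", 32, 70, 155),
    ("Dave", "M", 39, 72, 167), ("Elly", "F", 30, 66, 124),
    ("Fran", "F", 33, 66, 115), ("Hank", "M", 30, 71, 158),
    ("Jake", "M", 32, 69, 143), ("Luke", "M", 34, 72, 163),
    ("Neil", "M", 36, 75, 160), ("Page", "F", 31, 67, 135),
    ("Alex", "M", 41, 74, 170), ("Gwen", "F", 26, 64, 121),
    ("Ivan", "M", 53, 72, 175), ("Kate", "F", 47, 69, 139),
    ("Myra", "F", 23, 62, 98), ("Omar", "M", 38, 70, 145),
    ("Quin", "M", 29, 71, 176), ("Ruth", "F", 28, 65, 131),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_file1 (col1 TEXT, col2 TEXT, col3 INT, col4 INT, col5 INT)"
)
conn.executemany("INSERT INTO test_file1 VALUES (?, ?, ?, ?, ?)", rows)

query = """
WITH a AS (
  -- every combination of distinct values, with col3 fixed to 42
  SELECT c1.col1, c2.col2, 42 AS col3, c4.col4, c5.col5
  FROM (SELECT DISTINCT col1 FROM test_file1) c1
  CROSS JOIN (SELECT DISTINCT col2 FROM test_file1) c2
  CROSS JOIN (SELECT DISTINCT col4 FROM test_file1) c4
  CROSS JOIN (SELECT DISTINCT col5 FROM test_file1) c5
)
SELECT a.col1, a.col2, a.col3, a.col4, a.col5,
       -- 1 if the (col1, col3) pair occurs in the data, else 0
       EXISTS (SELECT 1 FROM test_file1 t
               WHERE t.col1 = a.col1 AND t.col3 = a.col3) AS v5,
       -- 1 if the (col2, col3) pair occurs in the data, else 0
       EXISTS (SELECT 1 FROM test_file1 t
               WHERE t.col2 = a.col2 AND t.col3 = a.col3) AS v6
FROM a
"""
result = conn.execute(query).fetchall()

print(len(result))                # number of generated combinations
print(sum(r[5] for r in result))  # combinations where v5 = 1
print(sum(r[6] for r in result))  # combinations where v6 = 1
```

The combination count is the product of the distinct counts per column (18 names, 2 values of col2, 12 of col4, 18 of col5), which matches the 7776 records mentioned in the question.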

BigQuery full join two tables without a column to join on

I have two tables:
t1
col1 col2
a a
a c
b a
b d
c a
c d
t2
team game
mazs 1
mazs 2
doos 1
bahs 3
...
t2 is a very long table with many teams and games, whereas t1 is shown in its entirety - a table with 6 rows, featuring combinations of the letters a,b,c,d. Note that t1 is not a fully exhaustive list of a,b,c,d pairings, only the pairings in the 6 rows that appear.
I'd like to create a table that looks like this:
output
team game col1 col2
mazs 1 a a
mazs 1 a c
mazs 1 b a
mazs 1 b d
mazs 1 c a
mazs 1 c d
mazs 2 a a
mazs 2 a c
mazs 2 b a
mazs 2 b d
mazs 2 c a
mazs 2 c d
What's going on here is that for each row in t2, there are 6 rows in output, one row for each of the col1, col2 pairings from t1.
t1 and t2 are created on my end from the following queries:
SELECT col1, col2 FROM sometable GROUP BY col1, col2
SELECT DISTINCT team, game FROM anothertable
The first query creates t1 and the second query creates t2.
It is called CROSS JOIN, see below example:
With t1 as (
select 'a' col1, 'a' col2 union all
select 'a' col1, 'c' col2 union all
select 'b' col1, 'a' col2 union all
select 'b' col1, 'd' col2 union all
select 'c' col1, 'a' col2 union all
select 'c' col1, 'd' col2),
t2 as (
select 'mazs' team, 1 game union all
select 'mazs' team, 2 game union all
select 'doos' team, 1 game union all
select 'bahs' team, 3 game
)
SELECT * FROM t2 cross join t1;
Output:
+------+------+------+------+
| team | game | col1 | col2 |
+------+------+------+------+
| mazs | 1 | a | a |
| mazs | 1 | a | c |
| mazs | 1 | b | a |
| mazs | 1 | b | d |
| mazs | 1 | c | a |
| mazs | 1 | c | d |
| mazs | 2 | a | a |
| mazs | 2 | a | c |
| mazs | 2 | b | a |
| mazs | 2 | b | d |
| mazs | 2 | c | a |
| mazs | 2 | c | d |
| doos | 1 | a | a |
| doos | 1 | a | c |
| doos | 1 | b | a |
| doos | 1 | b | d |
| doos | 1 | c | a |
| doos | 1 | c | d |
| bahs | 3 | a | a |
| bahs | 3 | a | c |
| bahs | 3 | b | a |
| bahs | 3 | b | d |
| bahs | 3 | c | a |
| bahs | 3 | c | d |
+------+------+------+------+
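The same cross join can be tried locally without a BigQuery project. The sketch below uses SQLite through Python's sqlite3 module purely as a stand-in; the variable name `pairs` is made up for illustration. A cross join has no join condition, so the row count is the product of the two table sizes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (col1 TEXT, col2 TEXT);
CREATE TABLE t2 (team TEXT, game INT);
INSERT INTO t1 VALUES ('a','a'), ('a','c'), ('b','a'),
                      ('b','d'), ('c','a'), ('c','d');
INSERT INTO t2 VALUES ('mazs', 1), ('mazs', 2), ('doos', 1), ('bahs', 3);
""")

# Every row of t2 is paired with every row of t1.
pairs = conn.execute(
    "SELECT team, game, col1, col2 FROM t2 CROSS JOIN t1"
).fetchall()

print(len(pairs))  # 4 rows in t2 * 6 rows in t1
for row in sorted(pairs):
    print(row)
```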

Round down to nearest of Multiple of N

I have sql table as follows
+------+------+------+------+
| col1 | col2 | col3 | col4 |
+------+------+------+------+
| a    | 3    | d1   | 10   |
| a    | 6    | d2   | 15   |
| b    | 2    | d2   | 8    |
| b    | 30   | d1   | 50   |
+------+------+------+------+
I would like to transform the above table into the one below, where the transformation is
col4 = col4 - (col4 % min(col2) group by col1)
+------+------+------+------+
| col1 | col2 | col3 | col4 |
+------+------+------+------+
| a    | 3    | d1   | 9    |
| a    | 6    | d2   | 15   |
| b    | 2    | d2   | 8    |
| b    | 30   | d1   | 50   |
+------+------+------+------+
I could read the above table in application code and do the transformation manually, but I was wondering whether it is possible to offload the transformation to SQL.
Just run a simple select query for this:
select col1, col2, col3,
col4 - (col4 % min(col2) over (partition by col1)) as col4
from t;
There is no need to actually modify the table.
You can use a multi-table UPDATE to achieve your desired result, joining your table to a table of MIN(col2) values:
UPDATE table1
SET col4 = col4 - (col4 % t2.col2min)
FROM (SELECT col1, MIN(col2) AS col2min
FROM table1
GROUP BY col1) t2
WHERE table1.col1 = t2.col1
Output:
col1 col2 col3 col4
a 3 d1 9
a 6 d2 15
b 2 d2 8
b 30 d1 50
Demo on dbfiddle
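Not every database accepts UPDATE ... FROM. As a hedged alternative, the same rounding can be written with a correlated subquery, sketched here on SQLite through Python's sqlite3 module. The alias `m` and the variable name `updated` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (col1 TEXT, col2 INT, col3 TEXT, col4 INT);
INSERT INTO table1 VALUES
  ('a', 3, 'd1', 10), ('a', 6, 'd2', 15),
  ('b', 2, 'd2', 8),  ('b', 30, 'd1', 50);
""")

# Round col4 down to the nearest multiple of the smallest col2
# within the same col1 group. The subquery only reads col2, which the
# update does not change, so the order of row updates does not matter.
conn.execute("""
UPDATE table1
SET col4 = col4 - (col4 % (SELECT MIN(m.col2)
                           FROM table1 AS m
                           WHERE m.col1 = table1.col1))
""")

updated = conn.execute(
    "SELECT col1, col2, col3, col4 FROM table1 ORDER BY col1, col2"
).fetchall()
for row in updated:
    print(row)
```

For group a the divisor is 3, so 10 becomes 10 - (10 % 3) = 9 and 15 stays 15; for group b the divisor is 2, so 8 and 50 are unchanged.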

Select from two slightly similar tables with complex conditions

I have 4 tables in an oracle database with a complex relationship and they do not have useful primary keys.
TableA
+------+------+------+------+------+-----------------+
| ColA | ColX | ColY | ColZ | ColZa| A |
+------+------+------+------+------+-----------------+
| k9 | a1 | c1 | g1 | z1 | 2018-02-19 |
| k9 | a1 | c1 | g3 | z2 | 2018-02-02 |
| k10 | a2 | f3 | g1 | z3 | 2018-02-09 |
| k10 | a | b | c | d | 2018-02-03 |
| k | a | b | c1 | z2 | 2018-02-01 |
| k9 | a1 | c1 | c9 | z5 | 2018-02-04 |
| k9 | a1 | c1 | c2 | z5 | 2018-02-03 |
| k9 | a1 | c1 | g2 | z5 | 2018-02-03 |
+------+------+------+------+------+-----------------+
TableB
+------+------+------+------+------+----------------+
| ColA | ColX | ColY | ColZ | ColZa| B |
+------+------+------+------+------+----------------+
| e | a3 | f | g1 | i | 2018-02-03 |
| e3 | a1 | f1 | g3 | d2 | 2018-02-04 |
| k9 | a1 | c1 | g2 | z5 | 2018-02-08 |
| e4 | a4 | f2 | g2 | i2 | 2018-02-07 |
| e5 | a1 | f1 | g1 | d2 | 2018-02-06 |
| k9 | a1 | c1 | g1 | d2 | 2018-02-22 |
+------+------+------+------+------+----------------+
TableC
+------+------+------+----------------+
| ColA | ColX | ColY | C |
+------+------+------+----------------+
| ab | c2 | c2 | cx |
| k9 | a1 | c1 | cy |
| cd | a2 | c3 | cy |
| ef | c2 | c4 | cz |
| ef | c2 | c2 | cz |
+------+------+------+----------------+
TableD
+------+------+------+----------------+
| ColA | ColX | ColY | D |
+------+------+------+----------------+
| e | a | f | dx |
| e1 | a | a | dy |
| e2 | a1 | a1 | dz |
+------+------+------+----------------+
Some business logic requires me to select and combine data from TableA and TableB
The Problem:
Fetch records (ColA, ColX, ColY, ColZ, ColZa, A, B) from TableA and/or TableB for cases where the pseudo key ColA_ColX_ColY has a row with ColZ = 'g1', merging on ColA | ColX | ColY | ColZ | ColZa.
I use the word 'pseudo' here because it is not really a key; it is just a means to identify the records of interest in TableA and TableB.
For the key to be valid, count(ColY) must be 1 for a given ColX value in TableC and TableD (this actually holds in all four tables if you only consider distinct values, but I am supposed to use only TableC and TableD since they are more explicit).
The process:
In the result table below, I get row 1 of TableA because 'a1' has count(ColY) = 1 in TableC, but I ignore row 1 of TableB and row 3 of TableA because count(ColY) is not equal to 1 in either TableC or TableD.
Now that I have a value 'a1' from TableC.ColX which matches my criteria, I select all records in TableA and TableB where ColX = 'a1' and ColY = 'c1' and ColA = 'k9'
My desired result
+------+------+------+------+------+-----------------+----------------+
| ColA | ColX | ColY | ColZ | ColZa| A | B |
+------+------+------+------+------+-----------------+----------------|
| k9 | a1 | c1 | g1 | z1 | 2018-02-19 | [null] |
| k9 | a1 | c1 | g3 | z2 | 2018-02-02 | [null] |
| k9 | a1 | c1 | c9 | z5 | 2018-02-04 | [null] |
| k9 | a1 | c1 | c2 | z5 | 2018-02-03 | [null] |
| k9 | a1 | c1 | g2 | z5 | 2018-02-03 | 2018-02-08 |
| k9 | a1 | c1 | g4 | d2 | [null] | 2018-02-22 |
+------+------+------+------+------+-----------------+----------------+
So, I wrote a query similar to
select a.ColX, a.ColY, a.ColZ, a.ColZa, a.A, b.B from TableA a FULL OUTER JOIN TableB b ON a.ColX=b.ColX AND a.ColY=b.ColY AND a.ColZ=b.ColZ
where (
a.ColX IN
(select ColX from TableA where
ColX IN
(select ColX from TableC group by ColX HAVING count(ColY)=1) and
ColX in
(select distinct ColX from TableB where ColZ = 'g1' and B > trunc(sysdate) - 365)
group by ColX having count(distinct ColY)=1)
OR
b.ColX IN
(select ColX from TableA where
ColX IN
(select ColX from TableC group by ColX HAVING count(ColY)=1) and
ColX in
(select distinct ColX from TableB where ColZ = 'g1' and B > trunc(sysdate) - 365)
group by ColX having count(distinct ColY)=1));
I have no control over the data model here. How do I make my query work?
The data in TableA and TableB is in the range of 100,000 records, and the data in TableC and TableD runs up to a million.
SQL is not my area of expertise and I really hope I am not going too off the mark here.
I didn't understand what your query is supposed to do, but as a pure refactoring exercise I get this:
with whatever as
( select colx
from tablea
where colx in
( select colx
from tablec
group by colx having count(coly) = 1
intersect
select colx
from tableb
where colz = 'g1'
and b > trunc(sysdate) - 365 )
group by colx
having count(distinct coly) = 1 )
select a.colx, a.coly, a.colz, a.colza, a.a, b.b
from tablea a
full outer join tableb b
on a.colx = b.colx
and a.coly = b.coly
and a.colz = b.colz
join whatever w
on w.colx in (a.colx, b.colx);
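The key-selection step from the question's nested IN conditions can be tested on its own before it is joined back to TableA and TableB. The sketch below makes several assumptions: it runs on SQLite through Python's sqlite3 module instead of Oracle, it leaves out the `B > trunc(sysdate) - 365` filter (the sample dates are from 2018, so that filter would reject every sample row today), it omits TableD, and the variable name `keys` is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (ColA TEXT, ColX TEXT, ColY TEXT, ColZ TEXT, ColZa TEXT, A TEXT);
CREATE TABLE TableB (ColA TEXT, ColX TEXT, ColY TEXT, ColZ TEXT, ColZa TEXT, B TEXT);
CREATE TABLE TableC (ColA TEXT, ColX TEXT, ColY TEXT, C TEXT);
INSERT INTO TableA VALUES
  ('k9','a1','c1','g1','z1','2018-02-19'), ('k9','a1','c1','g3','z2','2018-02-02'),
  ('k10','a2','f3','g1','z3','2018-02-09'), ('k10','a','b','c','d','2018-02-03'),
  ('k','a','b','c1','z2','2018-02-01'),     ('k9','a1','c1','c9','z5','2018-02-04'),
  ('k9','a1','c1','c2','z5','2018-02-03'),  ('k9','a1','c1','g2','z5','2018-02-03');
INSERT INTO TableB VALUES
  ('e','a3','f','g1','i','2018-02-03'),    ('e3','a1','f1','g3','d2','2018-02-04'),
  ('k9','a1','c1','g2','z5','2018-02-08'), ('e4','a4','f2','g2','i2','2018-02-07'),
  ('e5','a1','f1','g1','d2','2018-02-06'), ('k9','a1','c1','g1','d2','2018-02-22');
INSERT INTO TableC VALUES
  ('ab','c2','c2','cx'), ('k9','a1','c1','cy'), ('cd','a2','c3','cy'),
  ('ef','c2','c4','cz'), ('ef','c2','c2','cz');
""")

# ColX values that (1) appear exactly once in TableC, (2) have a 'g1' row
# in TableB, and (3) map to a single distinct ColY in TableA.
# Both IN conditions must hold, as in the question's WHERE clause.
keys = [r[0] for r in conn.execute("""
SELECT ColX
FROM TableA
WHERE ColX IN (SELECT ColX FROM TableC GROUP BY ColX HAVING COUNT(ColY) = 1)
  AND ColX IN (SELECT ColX FROM TableB WHERE ColZ = 'g1')
GROUP BY ColX
HAVING COUNT(DISTINCT ColY) = 1
ORDER BY ColX
""")]

print(keys)
```

On the sample data only 'a1' passes all three conditions: 'a2' appears once in TableC but has no 'g1' row in TableB, and 'a3' is not in TableA or TableC at all.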

Sum cases that meet multiple criteria (pandas)

I need to find the number of cases in col2 where, across every set in col1 (A, B, C, ...), the col2 value has a Y in col3 100% of the time. In this case, B1 and D1 meet this criterion, so N=2. Answers in pandas or SQL are helpful (both are ideal).
| col1 | col2 | col3 | col4 | col5 |
|------|-------|-------|-------|-------|
| A | A1 | N | 1 | 256 |
| A | B1 | Y | 2 | 3 |
| A | C1 | N | 3 | 323 |
| B | F1 | N | 1 | 89 |
| B | B1 | Y | 2 | 256 |
| C | D1 | Y | 1 | 3 |
| D | A1 | N | 1 | 32 |
| D | C1 | Y | 2 | 893 |
Something like this in python pandas
df.groupby('col2').col3.apply(lambda x : sum(x=='Y')==x.count()).sum()
Out[568]: 2
More detail :
df.groupby('col2').col3.apply(lambda x : sum(x=='Y')==x.count())
Out[569]:
col2
A1 False
B1 True
C1 False
D1 True
F1 False
Name: col3, dtype: bool
I don't see what col1 has to do with this. You can do this with a SQL query:
select count(*)
from (select col2
from t
group by col2
having min(col3) = max(col3) and min(col3) = 'Y'
) t;
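The per-col2 check can be run against the sample rows as a sanity test. This is a sketch on SQLite through Python's sqlite3 module; it groups by col2 and filters the groups with HAVING (aggregates such as MIN and MAX are evaluated per group, so they belong in HAVING rather than WHERE). The derived-table alias `g` and the variable name `n` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (col1 TEXT, col2 TEXT, col3 TEXT, col4 INT, col5 INT);
INSERT INTO t VALUES
  ('A','A1','N',1,256), ('A','B1','Y',2,3),   ('A','C1','N',3,323),
  ('B','F1','N',1,89),  ('B','B1','Y',2,256), ('C','D1','Y',1,3),
  ('D','A1','N',1,32),  ('D','C1','Y',2,893);
""")

# A col2 value qualifies when its smallest and largest col3 are both 'Y',
# i.e. every row for that col2 value has col3 = 'Y'.
n = conn.execute("""
SELECT COUNT(*)
FROM (SELECT col2
      FROM t
      GROUP BY col2
      HAVING MIN(col3) = 'Y' AND MAX(col3) = 'Y') AS g
""").fetchone()[0]

print(n)
```

On the sample data the qualifying col2 values are B1 and D1, which agrees with the pandas result above.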