SQL excluding certain results

Let's say I have a data set of
A B
-- --
a 1
b 1
c 1
d 1
d 2
e 1
f 1
f 2
g 1
How would I exclude a row with a 1 in column B when column B has both 1 and 2 for the same value in column A?
I want my results to look like this:
A B
-- --
a 1
b 1
c 1
d 2
e 1
f 2
g 1

This checks explicitly for the values 1 and 2 and relies on there being exactly two of them per A value (it assumes there are no duplicate rows). You could potentially make this less cumbersome if it's safe to assume that you always want the highest value.
select
tbl.A,
tbl.B
from
Table1 tbl
left outer join (
select
A
from
Table1
where
B in (1,2)
group by
A
having
count(B) = 2
) mlt on tbl.A = mlt.A
where
(
mlt.A is not null
and tbl.B = 2
) or (
mlt.A is null
and tbl.B = 1
)
Figure out all the A values that have both 1 and 2.
Match those to the table on the A value.
If A is in the subquery, use the B = 2 record. If it isn't, use the B = 1 record.
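
The three steps above can be checked against the sample data with a small script. This is only a sketch: it uses Python's built-in sqlite3 module and an in-memory table as a stand-in for the real database, and it swaps count(B) for COUNT(DISTINCT B), which is my own assumption to guard against duplicate rows.

```python
import sqlite3

# In-memory stand-in for Table1 (assumption: the real table lives elsewhere).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (A TEXT, B INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [("a", 1), ("b", 1), ("c", 1), ("d", 1), ("d", 2),
     ("e", 1), ("f", 1), ("f", 2), ("g", 1)],
)

query = """
SELECT tbl.A, tbl.B
FROM Table1 tbl
LEFT OUTER JOIN (
    -- Step 1: A values that have both B = 1 and B = 2
    SELECT A
    FROM Table1
    WHERE B IN (1, 2)
    GROUP BY A
    HAVING COUNT(DISTINCT B) = 2
) mlt ON tbl.A = mlt.A          -- Step 2: match back on A
WHERE (mlt.A IS NOT NULL AND tbl.B = 2)   -- Step 3: keep the 2 ...
   OR (mlt.A IS NULL AND tbl.B = 1)       -- ... or the lone 1
ORDER BY tbl.A
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data this keeps one row per A value, with d and f mapped to 2.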

select
* from tbl where a IN
(
select
a from tbl
group by a
having count(*)>1
)
and b!=1
UNION ALL
select
* from tbl where a IN
(
select
a from tbl
group by a
having count(*)=1
)

For the example data and desired result, the simplest query to achieve the result would be a GROUP BY operation and an aggregate function.
SELECT d.A
, MAX(d.B) AS B
FROM my_data_set d
GROUP BY d.A
ORDER BY d.A
If we are only interested in rows that have a 1 or 2 in column B, we can add a WHERE clause
SELECT d.A
, MAX(d.B) AS B
FROM my_data_set d
WHERE d.B IN (1,2)
GROUP BY d.A
ORDER BY d.A
With the example data, the output is the same.
Both of these statements achieve the specified result. (There is only a single row returned for each distinct value in A.)
Or, for the same example data, we can return the same result set with a more literal implementation of the specification.
To exclude rows with 1 when there is a row with 2 for the same value of A, we can use a NOT EXISTS predicate and a correlated subquery.
SELECT d.A
, d.B
FROM my_data_set d
WHERE ( d.B = 2 )
OR ( d.B = 1 AND
NOT EXISTS ( SELECT 1
FROM my_data_set e
WHERE e.A = d.A
AND e.B = 2
)
)
ORDER BY d.A, d.B
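
To see that the MAX/GROUP BY form and the NOT EXISTS form agree on the example data, here is a rough check. It assumes an in-memory SQLite table named my_data_set as a stand-in; on data with other B values or duplicate rows the two queries could differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_data_set (A TEXT, B INTEGER)")
conn.executemany(
    "INSERT INTO my_data_set VALUES (?, ?)",
    [("a", 1), ("b", 1), ("c", 1), ("d", 1), ("d", 2),
     ("e", 1), ("f", 1), ("f", 2), ("g", 1)],
)

# Aggregate form: one row per A with the highest B.
max_rows = conn.execute("""
    SELECT d.A, MAX(d.B) AS B
    FROM my_data_set d
    GROUP BY d.A
    ORDER BY d.A
""").fetchall()

# Literal form: drop a 1 only when a 2 exists for the same A.
not_exists_rows = conn.execute("""
    SELECT d.A, d.B
    FROM my_data_set d
    WHERE d.B = 2
       OR (d.B = 1 AND NOT EXISTS (SELECT 1
                                   FROM my_data_set e
                                   WHERE e.A = d.A
                                     AND e.B = 2))
    ORDER BY d.A, d.B
""").fetchall()

print(max_rows == not_exists_rows)
```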

Related

How do I get the max date for records with 2 conditions?

I want to query, for each ID, the record with the maximum date_pcp that meets the following conditions:
draft_final = F and
drsIntExt is not null.
But my statement pulls the max date of the records FIRST and then removes the nulls and drafts. This results in missing data: a client that has an assessment on 8/30/19 where drsIntExt is null and an assessment on 5/1/19 with data in the drsIntExt field (both final) is not included in my final data, and I need that 5/1/19 record.
Data
ID draft_final drsIntExt date_pcp
A F (null) 8/30/2019
A F E 5/1/2019
B F I 5/20/2019
C D E 8/31/2019
C F I 5/6/2019
C F E 12/2/2018
Expected Result
ID draft_final drsIntExt date_pcp
A F E 5/1/2019
B F I 5/20/2019
C F I 5/6/2019
Actual Result
ID draft_final drsIntExt date_pcp
B F I 5/20/2019
Current Code
SELECT
l.ID,
l.draft_final,
l.drsIntExt,
l.date_pcp,
l.dsrInternalPCP_Value,
l.dsrInternalSite_Value
FROM
CWS.bhmp_pcp_form l
INNER JOIN (
SELECT ID,
max(date_pcp) AS MaxDatePCP
FROM
CWS.bhmp_pcp_form
GROUP BY
ID) lb ON
l.ID = lb.ID
AND l.date_pcp = lb.MaxDatePCP
WHERE
l.drsIntExt IS NOT NULL
AND l.draft_final = 'F'
You can try:
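select t1.ID, t1.draft_final, t1.drsIntExt, t1.date_pcp
from table1 t1
where t1.draft_final = 'F'
and t1.drsIntExt is not null
and t1.date_pcp =
(
-- latest date among this ID's final rows with drsIntExt filled in
select max(t2.date_pcp)
from table1 t2
where t2.ID = t1.ID
and t2.draft_final = 'F'
and t2.drsIntExt is not null
)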
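
A quick way to check the correlated-subquery idea against the sample data is the sketch below. It makes a few assumptions: an in-memory SQLite table named bhmp_pcp_form (no CWS schema), dates stored as ISO strings so that MAX compares them correctly, and the conditions repeated in the outer query so that a non-qualifying row sharing the same date is not returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bhmp_pcp_form "
    "(ID TEXT, draft_final TEXT, drsIntExt TEXT, date_pcp TEXT)"
)
conn.executemany(
    "INSERT INTO bhmp_pcp_form VALUES (?, ?, ?, ?)",
    [("A", "F", None, "2019-08-30"),
     ("A", "F", "E", "2019-05-01"),
     ("B", "F", "I", "2019-05-20"),
     ("C", "D", "E", "2019-08-31"),
     ("C", "F", "I", "2019-05-06"),
     ("C", "F", "E", "2018-12-02")],
)

query = """
SELECT t1.ID, t1.draft_final, t1.drsIntExt, t1.date_pcp
FROM bhmp_pcp_form t1
WHERE t1.draft_final = 'F'
  AND t1.drsIntExt IS NOT NULL
  AND t1.date_pcp = (
      -- filter first, then take the latest date per ID
      SELECT MAX(t2.date_pcp)
      FROM bhmp_pcp_form t2
      WHERE t2.ID = t1.ID
        AND t2.draft_final = 'F'
        AND t2.drsIntExt IS NOT NULL
  )
ORDER BY t1.ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The point is that the filter runs inside the subquery, before MAX is taken, so the 5/1/2019 record for A survives.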
Using the OVER clause might give you what you want.
SELECT ID,
draft_final,
drsIntExt,
date_pcp
FROM
(SELECT
l.ID,
l.draft_final,
l.drsIntExt,
l.date_pcp,
row_number() over (partition by id
order by date_pcp DESC
) as seqnum
FROM
bhmp_pcp_form l
WHERE
l.drsIntExt IS NOT NULL
AND l.draft_final = 'F') t
WHERE seqnum = 1
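You can also filter first, aggregate, and then join back to pick up the matching row:
select t.ID, t.draft_final, t.drsIntExt, t.date_pcp
from CWS.bhmp_pcp_form t
inner join (
-- latest date per ID among final rows with drsIntExt filled in
select ID, max(date_pcp) as max_date
from CWS.bhmp_pcp_form
where drsIntExt IS NOT NULL and
draft_final='F'
group by ID
) m on m.ID = t.ID and m.max_date = t.date_pcp
where t.drsIntExt IS NOT NULL and
t.draft_final='F'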

Oracle SQL - Delete Entries Based Off Unique Rows

I am pulling a single column from a DB and it looks something like this:
Group
A
A
A
B
B
B
C
D
D
D
E
F
F
F
I need to delete unique entries, so entries A, B, D and F should stay and entries C and E should be deleted.
I am getting this column from a query like this:
select Group from table where type = 'rec';
Basically, each group should appear more than once for the type, and if it doesn't, it needs to be removed.
NOTE: I need it to be automated and not just a "remove C" and "remove E" because there are thousands of rows and I'm not sure which I will need to delete unless I just find them. The number of rows that will need to be deleted will also be changing, hence why I need it to be automated based off of count.
One method is:
delete t
where "group" in (select "group" from t group by "group" having count(*) = 1);
Based on your sample code:
delete t
where type = 'rec' and
"group" in (select "group" from t where type = 'rec' group by "group" having count(*) = 1);
You could also do this as:
delete t
where type = 'rec' and
not exists (select 1
from t t2
where t2."group" = t."group" and t2.type = 'rec' and t2.rowid <> t.rowid
);
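
The count-based delete can be tried out on the sample column with the sketch below. It runs against an in-memory SQLite table rather than Oracle, so the table name t and the DELETE FROM spelling are assumptions made for the test; the subquery itself is the same idea as above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "group" is a reserved word, so it is quoted everywhere.
conn.execute('CREATE TABLE t ("group" TEXT, type TEXT)')
conn.executemany(
    "INSERT INTO t VALUES (?, 'rec')",
    [(g,) for g in "AAABBBCDDDEFFF"],
)

cur = conn.execute("""
    DELETE FROM t
    WHERE type = 'rec'
      AND "group" IN (SELECT "group"
                      FROM t
                      WHERE type = 'rec'
                      GROUP BY "group"
                      HAVING COUNT(*) = 1)
""")
deleted = cur.rowcount  # rows removed by the DELETE

remaining = [r[0] for r in conn.execute(
    'SELECT DISTINCT "group" FROM t ORDER BY "group"')]
print(deleted, remaining)
```

Nothing is hard-coded about C or E; whichever groups occur once at the time of the delete are the ones removed.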
Judging by your comments, all you need is a count per group. If an entry occurs only once, select/delete it. Analytic functions are the best and easiest way to do this, if you ask me:
SELECT * FROM
(
SELECT COUNT(grp) OVER (PARTITION BY grp ORDER BY grp) cnt -- number of occurrences --
, grp
FROM
( -- convert to multi-row - REPLACE AAABBB with your actual column --
SELECT trim(regexp_substr('A A A B B B C D D D E F F F', '[^ ]+', 1, LEVEL)) grp
FROM dual -- from your table_name --
CONNECT BY LEVEL <= regexp_count('A A A B B B C D D D E F F F', '[^ ]+')
)
)
WHERE cnt = 1 -- Select/Delete only those that appeared once --
/
Output:
cnt|grp
--------
1 C
1 E
Full output, if you comment out the WHERE clause:
cnt|grp
--------
3 A
3 A
3 A
3 B
3 B
3 B
1 C
3 D
3 D
3 D
1 E
3 F
3 F
3 F
Final edit based on your questions. This simulates your table:
WITH your_table AS
(
SELECT 'rec' grp_type FROM dual
UNION ALL
SELECT 'not_rec' grp_type FROM dual
)
SELECT grp_type FROM your_table WHERE grp_type = 'rec' -- apply all that above to this select --
/

SQL get the closest two rows within duplicate rows

I have following table
ID Name Stage
1 A 1
1 B 2
1 C 3
1 A 4
1 N 5
1 B 6
1 J 7
1 C 8
1 D 9
1 E 10
I need the output below for the parameters A and N: the pair of rows whose stages are closest to each other.
ID Name Stage
1 A 4
1 N 5
That is, I need to select the rows where the difference between the stages is smallest.
This query can make use of an index on (name, stage) efficiently:
WITH cte AS (
SELECT TOP 1
a.id AS a_id, a.name AS a_name, a.stage AS a_stage
, n.id AS n_id, n.name AS n_name, n.stage AS n_stage
FROM tbl a
CROSS APPLY (
SELECT TOP 1 *, stage - a.stage AS diff
FROM tbl
WHERE name = 'N'
AND stage >= a.stage
ORDER BY stage
UNION ALL
SELECT TOP 1 *, a.stage - stage AS diff
FROM tbl
WHERE name = 'N'
AND stage < a.stage
ORDER BY stage DESC
) n
WHERE a.name = 'A'
ORDER BY diff
)
SELECT a_id AS id, a_name AS name, a_stage AS stage FROM cte
UNION ALL
SELECT n_id, n_name, n_stage FROM cte;
SQL Server uses CROSS APPLY in place of standard-SQL LATERAL.
In case of ties (equal difference) the winner is arbitrary, unless you add more ORDER BY expressions as tiebreaker.
This solution works if you know the minimum difference is always 1:
SELECT *
FROM myTable as a
CROSS JOIN myTable as b
where a.name = 'A' and b.name = 'N'
and abs(a.stage - b.stage) = 1;
a.ID a.Name a.Stage b.ID b.Name b.Stage
1 A 4 1 N 5
Or, more generally, if you don't know the minimum:
SELECT *
FROM myTable as a
CROSS JOIN myTable as b
where a.name = 'A' and b.name = 'N'
and abs(a.stage - b.stage) = (SELECT min(abs(a2.stage - b2.stage))
FROM myTable as a2
CROSS JOIN myTable as b2
where a2.name = 'A' and b2.name = 'N')
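
For a small table, the closest pair can also be found by ordering the A/N pairs by absolute stage difference. The sketch below assumes an in-memory SQLite copy of the sample table and passes the two names as parameters; it uses LIMIT rather than TOP because it runs on SQLite, not SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER, name TEXT, stage INTEGER)")
conn.executemany(
    "INSERT INTO myTable VALUES (1, ?, ?)",
    [("A", 1), ("B", 2), ("C", 3), ("A", 4), ("N", 5),
     ("B", 6), ("J", 7), ("C", 8), ("D", 9), ("E", 10)],
)

query = """
SELECT a.id, a.name, a.stage, n.id, n.name, n.stage
FROM myTable a
CROSS JOIN myTable n
WHERE a.name = ? AND n.name = ?
ORDER BY ABS(a.stage - n.stage)   -- smallest gap first
LIMIT 1
"""
closest = conn.execute(query, ("A", "N")).fetchone()
print(closest)
```

As with the other queries, a tie on the difference is broken arbitrarily unless more ORDER BY expressions are added.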

How to find rows missing by every group

I have two tables:
Input:
A:
ID col
1 a
1 b
1 c
2 a
2 b
3 x
4 y
B
ID col
1 a
1 b
2 a
For every ID in B, I want to find the rows in A that have that ID but do not appear in B.
Output:
ID Col
1 c
2 b
What I tried:
left/right join: I am trying something like select * from a left join b on a.id = b.id where b.id is null
except: select * from a except select * from b
but I'm not sure how to modify them.
Assuming you want the values in A for which there are records in B with the same ID, but not the same col, you could do:
select
a.ID,
a.col
from A
left join B
on b.ID = a.ID and b.col = a.col
where A.ID in (select distinct ID from B) -- B contains this `ID` somewhere...
and B.ID is null -- ...but not with the same `col`
Using a combination of exists and not exists.
select *
from a
where exists (select 1 from b where a.id=b.id) --id check
and not exists (select 1 from b where a.id=b.id and a.col=b.col) -- col check
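
Here is a small check of the exists / not exists combination on the sample tables. It is a sketch using in-memory SQLite tables named a and b; the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER, col TEXT)")
conn.execute("CREATE TABLE b (id INTEGER, col TEXT)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "c"), (2, "a"), (2, "b"), (3, "x"), (4, "y")],
)
conn.executemany(
    "INSERT INTO b VALUES (?, ?)",
    [(1, "a"), (1, "b"), (2, "a")],
)

query = """
SELECT a.id, a.col
FROM a
WHERE EXISTS (SELECT 1 FROM b WHERE a.id = b.id)          -- id occurs in b
  AND NOT EXISTS (SELECT 1 FROM b
                  WHERE a.id = b.id AND a.col = b.col)    -- but this row does not
ORDER BY a.id, a.col
"""
missing = conn.execute(query).fetchall()
print(missing)
```

IDs 3 and 4 drop out because they never occur in b at all.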

Postgresql query with left join and having

Postgresql 9.1: I have a query that must return the values of a second table only if the aggregate function SUM of two columns is greater than zero.
This is the data:
Table a
id
---
1
2
3
Table b
id fk(table a)
---------------
1 1
2 null
3 3
Table c
id fk(table b) amount price
-----------------------------------
1 1 1 10 --positive
2 1 1 -10 --negative
3 3 2 5
As you can see, table b has some ids from table a, and table c can have one or more references to table b. A row from table c is a candidate to be retrieved only if sum(amount * price) > 0.
I wrote this query:
SELECT
a.id, b.id, SUM(c.amount * c.price) amount
FROM
tablea a
LEFT JOIN
tableb b ON b.fk = a.id
LEFT JOIN
tablec c ON c.fk = b.id
GROUP BY
a.id, b.id
HAVING
SUM(c.amount * c.price) > 0
But this query is not retrieving all rows from table a, just the row 1, and I need the two rows. I understand this is happening because of the HAVING clause, but I don't know how to rewrite it.
Expected result
a b sum
------------------
1 null null -- 1 * 10 + 1 * -10 (rows 1 and 2) = 0, so it's not retrieved.
2 null null -- no foreign key in second table
3 3 10 -- the sum of 2 * 5 (row 3) > 0 so it's ok.
Try this:
SELECT A.ID, B.ID, C.ResultSum
FROM TableA A
LEFT JOIN TableB B ON (B.FK = A.ID)
LEFT JOIN (
SELECT FK, SUM(Amount * Price) AS ResultSum
FROM TableC
GROUP BY FK
) C ON (C.FK = B.ID) AND (ResultSum > 0)
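
To see what the derived-table join returns on the sample data, here is a sketch that runs the same query shape on in-memory SQLite tables instead of PostgreSQL. The table and column names follow the question; the ORDER BY is my addition for stable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablea (id INTEGER)")
conn.execute("CREATE TABLE tableb (id INTEGER, fk INTEGER)")
conn.execute(
    "CREATE TABLE tablec (id INTEGER, fk INTEGER, amount INTEGER, price INTEGER)"
)
conn.executemany("INSERT INTO tablea VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO tableb VALUES (?, ?)", [(1, 1), (2, None), (3, 3)])
conn.executemany(
    "INSERT INTO tablec VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 10), (2, 1, 1, -10), (3, 3, 2, 5)],
)

query = """
SELECT a.id, b.id, c.ResultSum
FROM tablea a
LEFT JOIN tableb b ON b.fk = a.id
LEFT JOIN (
    -- aggregate per tableb row before joining
    SELECT fk, SUM(amount * price) AS ResultSum
    FROM tablec
    GROUP BY fk
) c ON c.fk = b.id AND c.ResultSum > 0   -- the old HAVING test, moved into the join
ORDER BY a.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

All three rows of table a come back because the sum test sits in the join condition rather than in HAVING. Note that for a = 1 the b id is still returned (only the sum is null), which differs slightly from the expected result in the question.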