Restricting results to only rows where one value appears only once - sql

I have a query that is more complex than the example here, but which needs to only return the rows where a certain field doesn't appear more than once in the data set.
ACTIVITY_SK STUDY_ACTIVITY_SK
100 200
101 201
102 200
100 203
In this example I don't want any records with an ACTIVITY_SK of 100 being returned because ACTIVITY_SK appears twice in the data set.
The data is a mapping table, and is used in many joins, but multiple records like this imply data quality issues and so I need to simply remove them from the results, rather than cause a bad join elsewhere.
SELECT
A.ACTIVITY_SK,
A.STATUS,
B.STUDY_ACTIVITY_SK,
B.NAME,
B.PROJECT
FROM
ACTIVITY A,
PROJECT B
WHERE
A.ACTIVITY_SK = B.STUDY_ACTIVITY_SK
I had tried something like this:
SELECT
A.ACTIVITY_SK,
A.STATUS,
B.STUDY_ACTIVITY_SK,
B.NAME,
B.PROJECT
FROM
ACTIVITY A,
PROJECT B
WHERE
A.ACTIVITY_SK = B.STUDY_ACTIVITY_SK
AND A.ACTIVITY_SK NOT IN
(
SELECT
A.ACTIVITY_SK
FROM
ACTIVITY A,
PROJECT B
WHERE
A.ACTIVITY_SK = B.STUDY_ACTIVITY_SK
GROUP BY A.ACTIVITY_SK
HAVING COUNT(*) > 1
)
But there must be a less expensive way of doing this...

Something like this could be a bit "cheaper" to run:
SELECT
A.ACTIVITY_SK,
A.STATUS,
B.STUDY_ACTIVITY_SK,
B.NAME,
B.PROJECT
FROM
PROJECT B INNER JOIN
(SELECT
ACTIVITY_SK,
MIN(STATUS) STATUS
FROM
ACTIVITY
GROUP BY ACTIVITY_SK
HAVING COUNT(ACTIVITY_SK) = 1 ) A
ON A.ACTIVITY_SK = B.STUDY_ACTIVITY_SK
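To try the GROUP BY / HAVING idea on a small data set, here is a minimal sketch using Python's built-in sqlite3 module. The table contents (statuses, names, projects) are made up for illustration, and SQLite stands in for whatever DBMS you actually run, so treat it as a demonstration of the technique rather than a drop-in query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (activity_sk INTEGER, status TEXT);
    CREATE TABLE project  (study_activity_sk INTEGER, name TEXT, project TEXT);
    -- Hypothetical rows: activity_sk 100 appears twice and should be dropped.
    INSERT INTO activity VALUES (100, 'OPEN'), (100, 'CLOSED'), (101, 'OPEN'), (102, 'DONE');
    INSERT INTO project  VALUES (100, 'alpha', 'P1'), (101, 'beta', 'P2'), (102, 'gamma', 'P3');
""")

# Keep only keys that occur exactly once in ACTIVITY, then join to PROJECT.
rows = conn.execute("""
    SELECT a.activity_sk, a.status, b.study_activity_sk, b.name, b.project
    FROM project b
    INNER JOIN (
        SELECT activity_sk, MIN(status) AS status
        FROM activity
        GROUP BY activity_sk
        HAVING COUNT(*) = 1
    ) a ON a.activity_sk = b.study_activity_sk
    ORDER BY a.activity_sk
""").fetchall()

for row in rows:
    print(row)
```

Because each surviving group holds exactly one row, MIN(status) simply returns that row's status.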

Another alternative:
select * from (
SELECT
A.ACTIVITY_SK,
A.STATUS,
B.STUDY_ACTIVITY_SK,
B.NAME,
B.PROJECT,
count(distinct a.pk) over (partition by a.activity_sk) AS c
FROM
ACTIVITY A,
PROJECT B
WHERE
A.ACTIVITY_SK = B.STUDY_ACTIVITY_SK
) where c = 1;
(where a.pk refers to a unique identifier from the ACTIVITY table)
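A rough way to check the window-function variant is the sqlite3 sketch below. Two things differ from the query above and are assumptions on my part: SQLite does not accept DISTINCT inside a window aggregate, so plain COUNT(*) is used, which only gives the same answer when each key matches a single PROJECT row; and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (pk INTEGER PRIMARY KEY, activity_sk INTEGER, status TEXT);
    CREATE TABLE project  (study_activity_sk INTEGER, name TEXT, project TEXT);
    -- Hypothetical rows: activity_sk 100 is duplicated in ACTIVITY.
    INSERT INTO activity VALUES (1, 100, 'OPEN'), (2, 100, 'CLOSED'), (3, 101, 'OPEN'), (4, 102, 'DONE');
    INSERT INTO project  VALUES (100, 'alpha', 'P1'), (101, 'beta', 'P2'), (102, 'gamma', 'P3');
""")

# Count joined rows per activity_sk and keep the keys that occur once.
unique_rows = conn.execute("""
    SELECT activity_sk, status, name, project
    FROM (
        SELECT a.activity_sk, a.status, b.name, b.project,
               COUNT(*) OVER (PARTITION BY a.activity_sk) AS c
        FROM activity a
        JOIN project b ON a.activity_sk = b.study_activity_sk
    ) AS t
    WHERE c = 1
    ORDER BY activity_sk
""").fetchall()

for row in unique_rows:
    print(row)
```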

Related

SQL Finding duplicate values in two of the three columns of each row

Let's say we have three columns: A, B, and C.
I would like to filter the results as follows:
The values of A and B are the same (duplicated) for > 1 (more than 1) row, and the value of C is always different.
In the attached image, the values that appear selected would meet the conditions mentioned above.
What I've tried:
SELECT
a.notation as A, a.gene as B, b.id as C
FROM
`db-dummy`.sgdata c
join `db-dummy`.g_info a on a.rec_id = c.gen_id
join `db-dummy`.spec_data b on b.rec_id = c.spec_id GROUP BY A, B HAVING COUNT(*) > 1;
I thought that using GROUP BY and HAVING COUNT(*) > 1 I could get the desired result, but I get the following error:
SQL Error [1055] [42000]: (conn=1632) Expression #3 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'db-dummy.b.spec_id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
If you had a single table, I would suggest just using EXISTS. But because you have a join, use window functions. If you are looking for different values of id:
SELECT A, B, C
FROM (SELECT a.notation as A, a.gene as B, b.id as C,
MIN(b.id) OVER (PARTITION BY a.notation, a.gene) as min_id,
MAX(b.id) OVER (PARTITION BY a.notation, a.gene) as max_id
FROM `db-dummy`.sgdata c JOIN
`db-dummy`.g_info a
ON a.rec_id = c.gen_id JOIN
`db-dummy`.spec_data b
ON b.rec_id = c.spec_id
) x
WHERE min_id <> max_id;
If you are just looking for multiple rows for a given A and B, then you can use:
SELECT A, B, C
FROM (SELECT a.notation as A, a.gene as B, b.id as C,
COUNT(*) OVER (PARTITION BY a.notation, a.gene) as cnt
FROM `db-dummy`.sgdata c JOIN
`db-dummy`.g_info a
ON a.rec_id = c.gen_id JOIN
`db-dummy`.spec_data b
ON b.rec_id = c.spec_id
) x
WHERE cnt > 1;
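The difference between the two window-function filters is easier to see on a tiny data set. The sketch below uses Python's sqlite3 and a single hypothetical table, pairs(a, b, c), standing in for the already-joined result. The rows are invented, so this only illustrates the behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pairs (a TEXT, b TEXT, c INTEGER);
    -- (n1, g1) repeats with different c; (n3, g3) repeats with the same c.
    INSERT INTO pairs VALUES
        ('n1', 'g1', 10), ('n1', 'g1', 11),
        ('n2', 'g2', 20),
        ('n3', 'g3', 30), ('n3', 'g3', 30);
""")

# Variant 1: groups whose c values actually differ.
differing = conn.execute("""
    SELECT a, b, c FROM (
        SELECT a, b, c,
               MIN(c) OVER (PARTITION BY a, b) AS min_c,
               MAX(c) OVER (PARTITION BY a, b) AS max_c
        FROM pairs
    ) AS x
    WHERE min_c <> max_c
    ORDER BY a, b, c
""").fetchall()

# Variant 2: any group with more than one row, whatever c holds.
repeated = conn.execute("""
    SELECT a, b, c FROM (
        SELECT a, b, c,
               COUNT(*) OVER (PARTITION BY a, b) AS cnt
        FROM pairs
    ) AS x
    WHERE cnt > 1
    ORDER BY a, b, c
""").fetchall()

print(differing)
print(repeated)
```

Only the MIN/MAX variant drops the (n3, g3) group, where A and B repeat but C never changes.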
SELECT * FROM `db-dummy`.sgdata a
LEFT JOIN
(SELECT COUNT(Id) as count, notation, gene
FROM `db-dummy`.sgdata
GROUP BY notation, gene
HAVING COUNT(id) > 1) b
on a.notation = b.notation AND a.gene = b.gene

Sql code for distinct fields

I was wondering if anyone can help me with this query.
I have two tables that I join together (DDS2ENVR.QBO and KCA0001.ORTS).
The QBO table has fields named NIIN and RIC. The KCA0001.ORTS table has fields named SERVICE and OWN_RIC.
I join the tables on QBO.RIC and ORTS.OWN_RIC. My dilemma is that the same NIIN can appear in multiple rows with different values for RIC.
Example:
NIIN RIC
123455 A
122222 B
123456 C
122222 A
I want a distinct count of NIINs per service, counting only NIINs that do not overlap between RICs. For example, a NIIN should be counted under A only if the same NIIN is not also found under B, C, D, etc.
SELECT D.SERVICE, COUNT(C.NIIN)
FROM DDS2ENVR.QBO C
JOIN KCA0001.ORTS D ON D.OWN_RIC = C.RIC
WHERE C.SITE_ID = ('HEAA')
GROUP BY D.SERVICE
HAVING COUNT(DISTINCT C.NIIN) > 1
Please ask questions if this does not make any sense.
Using Not Exists
SELECT D.SERVICE, COUNT(C.NIIN)
FROM DDS2ENVR.QBO C
JOIN KCA0001.ORTS D ON D.OWN_RIC = C.RIC
WHERE C.SITE_ID = ('HEAA')
and NOT EXISTS (Select 1 from DDS2ENVR.QBO C1 where C1.NIIN = C.NIIN and C1.RIC <> C.RIC)
GROUP BY D.SERVICE
HAVING COUNT(DISTINCT C.NIIN) > 1
Also if the table DDS2ENVR.QBO doesn't contain duplicates and your dbms supports CTE
With cte as
(Select NIIN from DDS2ENVR.QBO group by NIIN having count(*) = 1)
SELECT D.SERVICE, COUNT(C.NIIN)
FROM DDS2ENVR.QBO C
JOIN KCA0001.ORTS D ON D.OWN_RIC = C.RIC
WHERE C.SITE_ID = ('HEAA')
and C.NIIN in (Select * from cte)
GROUP BY D.SERVICE
HAVING COUNT(DISTINCT C.NIIN) > 1
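Here is a small sqlite3 sketch of the NOT EXISTS filter run against the four sample rows from the question. The service names in the ORTS stand-in are hypothetical, and the SITE_ID filter and the HAVING clause are left out so that the tiny sample still returns rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE qbo  (niin INTEGER, ric TEXT);
    CREATE TABLE orts (own_ric TEXT, service TEXT);
    -- Sample rows from the question: NIIN 122222 appears under both A and B.
    INSERT INTO qbo  VALUES (123455, 'A'), (122222, 'B'), (123456, 'C'), (122222, 'A');
    -- Hypothetical service names, one per RIC.
    INSERT INTO orts VALUES ('A', 'ARMY'), ('B', 'NAVY'), ('C', 'USAF');
""")

# Count only NIINs that never show up under a different RIC.
counts = conn.execute("""
    SELECT d.service, COUNT(DISTINCT c.niin)
    FROM qbo c
    JOIN orts d ON d.own_ric = c.ric
    WHERE NOT EXISTS (
        SELECT 1 FROM qbo c1
        WHERE c1.niin = c.niin AND c1.ric <> c.ric
    )
    GROUP BY d.service
    ORDER BY d.service
""").fetchall()

print(counts)
```

NIIN 122222 is excluded everywhere because it is shared between A and B, so the service mapped to B gets no row at all.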

SQL queries: find common parts which price has gone up from one type to another

I have a table in the database (the big orange one) that holds parts and prices for two different types. I want to produce the small orange table as a summary:
I am looking for parts common to both type R and type O where the price has gone up from type O to type R.
This is the script I tried, but the pieces are disconnected:
SELECT *FROM Table WHERE type='R'as a
SELECT * FROM Table WHERE type='O'as b
SELECT * FROM a
INNER JOIN b ON a.part = b.part
WHERE a.price < b.price
Please try this and let me know,
SELECT * FROM Table a, Table b
WHERE a.type = 'R'
and b.type = 'O'
and a.part = b.part
and a.price < b.price;
I just found the answer myself:
SELECT * INTO h
FROM table AS t
WHERE t.type='R';
SELECT * INTO b
FROM table AS tab
WHERE tab.type ='O';
SELECT h.type, h.part, h.price, b.price, (b.price - h.price) AS Gap
FROM h
INNER JOIN b ON h.part = b.part;
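The self-join approach can be tried without temporary tables, as in the sqlite3 sketch below. The table name parts, the sample rows, and the integer prices are all invented. I am also assuming that "gone up from type O to type R" means the R price is the higher one, so flip the comparison if your data means the opposite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parts (type TEXT, part TEXT, price INTEGER);
    -- Hypothetical rows: 'bolt' got more expensive in R, 'nut' got cheaper,
    -- 'washer' and 'gear' each exist in only one type.
    INSERT INTO parts VALUES
        ('O', 'bolt', 100), ('R', 'bolt', 150),
        ('O', 'nut', 200),  ('R', 'nut', 100),
        ('O', 'washer', 50),
        ('R', 'gear', 900);
""")

# Join the table to itself: alias r holds type R rows, alias o holds type O rows.
increased = conn.execute("""
    SELECT r.part, o.price AS price_o, r.price AS price_r,
           r.price - o.price AS gap
    FROM parts r
    JOIN parts o ON o.part = r.part
    WHERE r.type = 'R'
      AND o.type = 'O'
      AND r.price > o.price   -- assumption: R price higher than O price
    ORDER BY r.part
""").fetchall()

print(increased)
```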

Comparing Result Set Against Itself

I have a query that I would like to compare against itself - so far this query is working (if there is a neater/better way to write it I'd like to know!) but it's producing some "duplicate" values.
SELECT
a.GroupID,
a.MemberH,
a.ChartID as ChartIDA,
b.ChartID as ChartIDB
FROM
(Select DISTINCT
s.GroupID,
c.ChartID,
m.MemberH
From Charts c, ChartRetrieval cr, Sites s, Members m
Where
c.ChartID=cr.ChartID
and cr.ChartScanningStatusID <> 331
and s.SiteID=c.SiteID
and s.ProjectID not in (1,2,111)
and m.MemberID=c.MemberID) a
INNER JOIN
(Select DISTINCT
s.GroupID,
c.ChartID,
m.MemberH
From Charts c, ChartRetrieval cr, Sites s, Members m
Where
c.ChartID=cr.ChartID
and cr.ChartScanningStatusID <> 331
and s.SiteID=c.SiteID
and s.ProjectID not in (1,2,111)
and m.MemberID=c.MemberID) b ON a.GroupID=b.GroupID AND a.MemberH=b.MemberH
WHERE
a.GroupID=b.GroupID
and a.MemberH=b.MemberH
and a.ChartID <> b.ChartID
Order By a.GroupID
So far the results are correct but as I said it's giving me some dupes.
IE -
Group ID | MemberH | ChartIDA | ChartIDB
-----------------------------------------
471021 | 810392941 | 4810391 | 2193845
-----------------------------------------
471021 | 810392941 | 2193845 | 4810391
I know these rows are technically not duplicates but the info is the same just flipped (so for me they are dupes lol).
Is there a way I can fix this?
A simple trick: use
and a.ChartID < b.ChartID
instead of
and a.ChartID <> b.ChartID
This will show only one of the two rows.
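To see why the strict comparison removes the mirrored pair, here is a sqlite3 sketch on a hypothetical flattened table, charts(group_id, member_h, chart_id), loaded with the two chart IDs from the question plus one unrelated row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE charts (group_id INTEGER, member_h INTEGER, chart_id INTEGER);
    INSERT INTO charts VALUES
        (471021, 810392941, 4810391),
        (471021, 810392941, 2193845),
        (471021, 555, 777);  -- hypothetical member with a single chart
""")

PAIR_QUERY = """
    SELECT a.group_id, a.member_h, a.chart_id, b.chart_id
    FROM charts a
    JOIN charts b
      ON a.group_id = b.group_id AND a.member_h = b.member_h
    WHERE a.chart_id {op} b.chart_id
    ORDER BY a.chart_id
"""

# '<>' returns every pair twice, once in each order.
mirrored = conn.execute(PAIR_QUERY.format(op="<>")).fetchall()
# '<' keeps only the ordering where the smaller chart_id comes first.
deduped = conn.execute(PAIR_QUERY.format(op="<")).fetchall()

print(len(mirrored), mirrored)
print(len(deduped), deduped)
```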
This is half an answer as it does not cover the duplicate rows. You could self join a CTE in order to have a smaller query:
;WITH SomeQuery AS (
Select DISTINCT
s.GroupID,
c.ChartID,
m.MemberH
From Charts c, ChartRetrieval cr, Sites s, Members m
Where
c.ChartID=cr.ChartID
and cr.ChartScanningStatusID <> 331
and s.SiteID=c.SiteID
and s.ProjectID not in (1,2,111)
and m.MemberID=c.MemberID
)
SELECT
A.GroupID,
A.MemberH,
A.ChartID as ChartIDA,
B.ChartID as ChartIDB
FROM SomeQuery A
INNER JOIN SomeQuery B
ON A.GroupID = B.GroupID AND A.MemberH = B.MemberH
WHERE
A.ChartID <> B.ChartID
Order By A.GroupID
More about Common Table Expressions here.

Compare last to second last record for each contract

To keep it simple, my question is similar to THIS QUESTION, PART 2, only problem is, I am not running Oracle and thus can not use the rownumbers.
For those who need more information and examples:
I have a table
contractId date value
1 09/02/2011 A
1 13/02/2011 C
2 02/02/2011 D
2 08/02/2011 A
2 12/02/2011 C
3 22/01/2011 C
3 30/01/2011 B
3 12/02/2011 D
3 21/01/2011 A
EDIT: added another line for ContractID. Since I had some code myself, but that would display the following:
contractId date value value_old
1 09/02/2011 A
2 08/02/2011 A D
3 30/01/2011 B C
3 30/01/2011 B A
But that is not what I want ! The result should still be as below!
Now I want to select the last record before a given date and compare that with the previous value.
Suppose the 'given date' is 11/02/2011 in this example, the output should be like this:
contractId date value value_old
1 09/02/2011 A
2 08/02/2011 A D
3 30/01/2011 B C
I do have the query to select the last record before the given date. That is the easy part. But to select the last record before that, I am lost...
I really hope I can get some help here, have been breaking my head over this for days and looking for answers on the web and stackoverflow.
One possibility:
SELECT a.contractId, a.Date, a.Value, (SELECT Top 1 b.[Value]
FROM tbl b
WHERE b.[Date] < a.[Date] And b.ContractID=a.ContractID
ORDER BY b.[Date] Desc) AS Old_Value
FROM tbl AS a
WHERE a.Date IN
(SELECT TOP 1 b.Date
FROM tbl b
WHERE b.ContractID=a.ContractID
AND b.Date < #2011/02/11#
ORDER BY b.date DESC)
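The correlated-subquery idea can be checked against the sample rows from the question with the sqlite3 sketch below. It is an adaptation, not the Access query itself: SQLite has no TOP, so LIMIT 1 and MAX() are used instead, dates are stored as ISO text (an assumption so that string comparison orders them correctly), and the column names contract_id and dt are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl (contract_id INTEGER, dt TEXT, value TEXT);
    -- Sample rows from the question, dates rewritten as YYYY-MM-DD.
    INSERT INTO tbl VALUES
        (1, '2011-02-09', 'A'), (1, '2011-02-13', 'C'),
        (2, '2011-02-02', 'D'), (2, '2011-02-08', 'A'), (2, '2011-02-12', 'C'),
        (3, '2011-01-22', 'C'), (3, '2011-01-30', 'B'),
        (3, '2011-02-12', 'D'), (3, '2011-01-21', 'A');
""")

given_date = "2011-02-11"

# Outer query: last row per contract before the given date.
# Scalar subquery: value of the row immediately before that one.
result = conn.execute("""
    SELECT a.contract_id, a.dt, a.value,
           (SELECT b.value
            FROM tbl b
            WHERE b.contract_id = a.contract_id AND b.dt < a.dt
            ORDER BY b.dt DESC
            LIMIT 1) AS old_value
    FROM tbl a
    WHERE a.dt = (SELECT MAX(b.dt)
                  FROM tbl b
                  WHERE b.contract_id = a.contract_id AND b.dt < ?)
    ORDER BY a.contract_id
""", (given_date,)).fetchall()

for row in result:
    print(row)
```

Contract 1 has no earlier row, so its old value comes back as NULL (None in Python), which matches the blank cell in the desired output.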
As promised, I would also post my answer. Although at this point, I still think Remou's answer is better, since the code is shorter and it seems more efficient (calls the same table fewer times). But here goes:
Query1:
SELECT c.contractID, c.firstofdates, a.value, d.value, d.date
FROM (table1 AS A RIGHT JOIN (SELECT b.contractID, max(b.date) AS FirstOfdates
FROM table1 as B
where b.date < #02/11/2011#
GROUP BY b.contractID ) AS c ON (a.date = c.firstofdates) AND (a.contractID = c.contractID))
LEFT JOIN (select e.contractID, e.date, e.value
from table1 as e
) AS d ON (d.date < c.firstofdates) AND (d.contractID = c.contractID);
This query actually gives the result with the extra row for the 3rd contractID.
Query2:
SELECT b.contractID, max(a.date) AS olddate
FROM table1 AS a RIGHT JOIN (select contractID, firstofdates
from Query1) AS b ON (a.contractID= b.contractID) AND (a.date < b.firstofdates)
GROUP BY b.contractID;
And then to combine both:
Query3:
SELECT Query1.contractID, Query1.firstofdates AS [date], Query1.A.value AS [value], Query1.d.value AS [old value]
FROM Query1 RIGHT JOIN Query2
ON (Query1.date=Query2.olddate or Query2.olddate is null) AND (Query1.contractID = Query2.contractID);