I have a stored procedure in BigQuery and a resulting table where 2 rows are not exact duplicates, but I want to filter out one of the rows based on a condition.
SQL query:
WITH DupCodes AS (
SELECT AccCode
FROM Table
GROUP BY AccCode
HAVING COUNT(*) > 1
)
SELECT *
FROM table
WHERE (AccCode IN (SELECT AccCode FROM DupCodes) AND AccountName IS NOT NULL)
OR (AccCode NOT IN (SELECT AccCode FROM DupCodes))
One method uses not exists logic:
select t.*
from t
where t.accountname is not null or
not exists (select 1
from t t2
where t2.accCode = t.accCode and t2.accountname is not null
);
That is, show all rows where accountname is not empty. Then show empty rows only when there is no non-empty accountname for the same accCode.
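A minimal sketch for trying the NOT EXISTS logic locally. It uses Python's built-in sqlite3 module instead of BigQuery, and the sample rows are made up for illustration, so treat it as a check of the logic rather than of BigQuery behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (accCode TEXT, accountname TEXT)")
# Hypothetical rows: A1 has a NULL and a non-NULL name, B2 only a NULL name.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("A1", None), ("A1", "Alpha"), ("B2", None), ("C3", "Gamma")],
)

rows = conn.execute(
    """
    SELECT t.accCode, t.accountname
    FROM t
    WHERE t.accountname IS NOT NULL
       OR NOT EXISTS (SELECT 1
                      FROM t t2
                      WHERE t2.accCode = t.accCode
                        AND t2.accountname IS NOT NULL)
    ORDER BY t.accCode
    """
).fetchall()

# The NULL row for A1 is dropped; the NULL row for B2 is kept because
# B2 has no non-NULL alternative.
print(rows)
```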
My table has 2 columns containing code pairs (Parentcodes and Childcodes). The pairings are unique, but each code can be, and often is, repeated within its column. I'm trying to pull a list of each instance of each code and all of the associated values from the other column.
So basically
Select ParentCode, Childcode
from TABLE
where count(ParentCode)>1
(and vice versa)
It seems like I have to include both columns in the group by if I want them both in the select. I've tried subqueries but with no luck. I know I can set up a script in VBA to loop through each code and return the results (running a basic select where count > 1), but that seems like the least efficient approach.
Sample data:
To get the rows whose Parentcode or Childcode is repeated more than once, you can use IN:
select Parentcode, Childcode
from Table
where Parentcode in (
select Parentcode
from Table
group by Parentcode
having count(Parentcode) > 1
)
or Childcode in (
select Childcode
from Table
group by Childcode
having count(Childcode) > 1
)
You should be just about there with that.
select ParentCode, count(ParentCode) count
from TABLE
group by ParentCode
having count(Parentcode)>1
You can use EXISTS:
select t.* from tablename t
where
exists (select 1 from tablename where parentcode <> t.parentcode and childcode = t.childcode)
or
exists (select 1 from tablename where parentcode = t.parentcode and childcode <> t.childcode)
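A small sketch comparing the IN and EXISTS approaches from the answers above. It runs on Python's built-in sqlite3 module; the table name codes and the sample pairs are hypothetical. As long as the pairs are unique, as the question states, both queries should return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE codes (Parentcode TEXT, Childcode TEXT)")
# Hypothetical unique pairs: P1 repeats as a parent, C2 repeats as a child.
conn.executemany(
    "INSERT INTO codes VALUES (?, ?)",
    [("P1", "C1"), ("P1", "C2"), ("P2", "C2"), ("P3", "C3")],
)

via_in = conn.execute(
    """
    SELECT Parentcode, Childcode
    FROM codes
    WHERE Parentcode IN (SELECT Parentcode FROM codes
                         GROUP BY Parentcode HAVING COUNT(*) > 1)
       OR Childcode IN (SELECT Childcode FROM codes
                        GROUP BY Childcode HAVING COUNT(*) > 1)
    ORDER BY Parentcode, Childcode
    """
).fetchall()

via_exists = conn.execute(
    """
    SELECT t.Parentcode, t.Childcode
    FROM codes t
    WHERE EXISTS (SELECT 1 FROM codes c
                  WHERE c.Parentcode <> t.Parentcode AND c.Childcode = t.Childcode)
       OR EXISTS (SELECT 1 FROM codes c
                  WHERE c.Parentcode = t.Parentcode AND c.Childcode <> t.Childcode)
    ORDER BY t.Parentcode, t.Childcode
    """
).fetchall()

# (P3, C3) is excluded because neither of its codes repeats.
print(via_in)
print(via_exists)
```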
I'm new to DB2 and tried a few approaches based on similar posts. I have a table where I need to count the IDs that have status = P and
more than one row with primary = 1.
So my result should be 2 here: (9876, 3456)
Tried:
SELECT id, COUNT(isprimary) Counts
FROM table
GROUP BY id
HAVING COUNT(isprimary)=1;
Try the query below:
select ID as IDs,Count(isPrimary) as isPrimary
From Table
where Status = 'P'
Group by ID
Having Count(isPrimary) >1
You are close, I think all you need to do is to add a where clause like:
SELECT id, COUNT(*) as Counted
FROM table
WHERE PrimaryFlag = 1
AND [status] = 'P'
GROUP BY id
EDIT: if you need to count only the distinct IDs, then try:
SELECT COUNT(t.ID) FROM
(
SELECT id, COUNT(*) as Counted
FROM table
WHERE PrimaryFlag = 1
AND [status] = 'P'
GROUP BY id
) as t
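A hedged sketch that combines the WHERE filter with a HAVING COUNT(*) > 1 clause, since the question asks for IDs whose primary flag occurs more than once. It runs on Python's built-in sqlite3 module rather than DB2; the table name accounts and every sample row are invented, and the column names id, status and isprimary are taken from the question's attempt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, status TEXT, isprimary INTEGER)")
# Hypothetical data: 9876 and 3456 each have two 'P' rows flagged as primary.
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [
        (9876, "P", 1), (9876, "P", 1),
        (3456, "P", 1), (3456, "P", 1), (3456, "P", 0),
        (1234, "P", 1),                  # only one primary row
        (5678, "A", 1), (5678, "A", 1),  # wrong status
    ],
)

# Inner query: IDs with more than one primary row in status 'P'.
inner = """
    SELECT id
    FROM accounts
    WHERE status = 'P' AND isprimary = 1
    GROUP BY id
    HAVING COUNT(*) > 1
"""
ids = [r[0] for r in conn.execute(inner + " ORDER BY id")]

# Outer query: how many such IDs there are.
total = conn.execute(f"SELECT COUNT(*) FROM ({inner}) AS t").fetchone()[0]

print(ids, total)
```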
I'm trying to Select all the records in my database that don't exist in a subquery.
For some reason it returns nothing, even though the subquery returns 2000 or so rows on its own and the main query returns over 5000. I need all the records that aren't contained in the subquery.
SELECT ID
FROM PART
WHERE NOT ID IN
(
SELECT DOCUMENT_ID AS ID
FROM USER_DEF_FIELDS
WHERE PROGRAM_ID = 'VMPRTMNT' AND ID = 'UDF-0000029'
)
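This is better written as a correlated NOT EXISTS subquery. NOT IN returns no rows at all if the subquery yields even one NULL DOCUMENT_ID, which may be why your query comes back empty; NOT EXISTS does not have that problem.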
SELECT ID
FROM PART
WHERE NOT EXISTS
(
SELECT 1
FROM USER_DEF_FIELDS
WHERE PROGRAM_ID = 'VMPRTMNT'
AND ID = 'UDF-0000029'
AND DOCUMENT_ID = PART.ID
)
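A minimal sketch of how a single NULL in the subquery result can make NOT IN return nothing while NOT EXISTS still works. Whether the asker's data actually contains a NULL DOCUMENT_ID is an assumption; the rows below are invented, and the demo runs on Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PART (ID TEXT)")
conn.execute(
    "CREATE TABLE USER_DEF_FIELDS (DOCUMENT_ID TEXT, PROGRAM_ID TEXT, ID TEXT)"
)
conn.executemany("INSERT INTO PART VALUES (?)", [("A",), ("B",), ("C",)])
# Hypothetical rows: one matches part A, one has a NULL DOCUMENT_ID.
conn.executemany(
    "INSERT INTO USER_DEF_FIELDS VALUES (?, ?, ?)",
    [("A", "VMPRTMNT", "UDF-0000029"), (None, "VMPRTMNT", "UDF-0000029")],
)

not_in = conn.execute(
    """
    SELECT p.ID
    FROM PART p
    WHERE p.ID NOT IN (SELECT u.DOCUMENT_ID
                       FROM USER_DEF_FIELDS u
                       WHERE u.PROGRAM_ID = 'VMPRTMNT'
                         AND u.ID = 'UDF-0000029')
    ORDER BY p.ID
    """
).fetchall()

not_exists = conn.execute(
    """
    SELECT p.ID
    FROM PART p
    WHERE NOT EXISTS (SELECT 1
                      FROM USER_DEF_FIELDS u
                      WHERE u.PROGRAM_ID = 'VMPRTMNT'
                        AND u.ID = 'UDF-0000029'
                        AND u.DOCUMENT_ID = p.ID)
    ORDER BY p.ID
    """
).fetchall()

# The NULL makes every NOT IN comparison unknown, so no row passes.
print(not_in)
print(not_exists)
```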
Basically my select statement returns below:
ID Status
100 1
100 2
101 1
What I'm looking for: return an ID that has status 1, but if the same ID also has another row with status 2, exclude both rows.
In short, the results should be as below:
ID Status
101 1
Thanks in advance !
The following query returns ID values that occur only once.
SELECT ID
FROM t
GROUP BY ID
HAVING COUNT(*) = 1
It should be sufficient for the sample data you provided. If there are other cases then let me know.
You will need a subquery and NOT IN here.
The following works if the status column has an INT datatype:
SELECT *
FROM table
WHERE status = 1
AND ID NOT IN (
SELECT ID
FROM table
WHERE status = 2
);
A more generic query, which excludes every ID that has more than one row rather than targeting a particular ID:
select ID
from table where ID NOT IN
(select ID from table GROUP BY ID HAVING count(Status) > 1)
/* Subquery will fetch ID's having multiple entries*/
The CTE 'IDs' retrieves all IDs that have a single record in the table. It is then joined to the original table to return the result as a pair (ID, Status).
;with IDs as
(
select ID
from yourtable
group by ID
having count(*) = 1
)
select i.ID, y.Status
from yourtable y
inner join IDs i on y.ID = i.ID
order by i.ID
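The answers above agree on the question's three sample rows but are not equivalent in general. The sketch below, run on Python's built-in sqlite3 module, adds one hypothetical row (102 with only status 2) to show where the "ID occurs once" approach and the "status 1 without a status 2" approach diverge.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER, Status INTEGER)")
# First three rows come from the question; (102, 2) is an invented extra row.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(100, 1), (100, 2), (101, 1), (102, 2)],
)

# Approach 1: keep IDs that occur exactly once, whatever their status.
occurs_once = conn.execute(
    """
    SELECT ID, Status
    FROM t
    WHERE ID IN (SELECT ID FROM t GROUP BY ID HAVING COUNT(*) = 1)
    ORDER BY ID
    """
).fetchall()

# Approach 2: keep status-1 rows whose ID has no status-2 row.
status_one_only = conn.execute(
    """
    SELECT ID, Status
    FROM t
    WHERE Status = 1
      AND ID NOT IN (SELECT ID FROM t WHERE Status = 2)
    ORDER BY ID
    """
).fetchall()

print(occurs_once)       # includes the lone status-2 row for 102
print(status_one_only)   # only 101
```

Which one fits depends on whether a lone row with status 2 should appear in the output.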
My goal is to delete all records from my table that are NOT the MAX(recordDate) of a grouped CaseKey. So if I have 9 records in 3 sets of 3 CaseKeys, each with its own 3 dates, I'd delete the 2 lower dates of each set and end up with 3 total records, only the MAX(recordDate) of each CaseKey remaining.
I have the following SQL Query:
DELETE FROM table
WHERE tableID NOT IN (
SELECT tableID
FROM (
Select MAX(recordDate) As myDate, tableID From table
Group By CaseKey
) As foo
)
I receive the error:
Error on Line 3... Column 'table.tableID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Obviously I could add tableID to my Group By clause, but then the result of that statement is incorrect and returns all rows instead of just returning the MAX recordDate of the grouped CaseKeys.
Server is down right now, but the apparent answer is: (tiny tweak from WildPlasser's answer)
DELETE zt FROM ztable zt
WHERE EXISTS (
SELECT * FROM ztable ex
WHERE ex.CaseKey = zt.CaseKey
AND ex.recordDate > zt.recordDate
);
In other words, for each record in zt, run a query to see if the same CaseKey also has a record with a higher recordDate. If so, the WHERE EXISTS condition passes and the record is deleted; otherwise the condition fails and the record is kept, because it holds the MAX recordDate for its CaseKey.
Thank you, WildPlasser, for that simplistic methodology that I was somehow blowing up.
There is one special property of MAX: there is no record with a higher value than max. So we can delete all the records for which a record with the same CaseKey, but with a higher recordDate exists:
DELETE FROM ztable zt
WHERE EXISTS (
SELECT *
FROM ztable ex
WHERE ex.CaseKey = zt.CaseKey
AND ex.recordDate > zt.recordDate
);
BTW: The above query (as well as the MAX() version) assumes that there is only one record with the maximum date. There could be ties.
In the case of ties, you'll need to add an extra field to the WHERE clause as a tie-breaker. Assuming that TableId can function as such, the query would become:
DELETE FROM ztable zt
WHERE EXISTS (
SELECT *
FROM ztable ex
WHERE ex.CaseKey = zt.CaseKey
AND ( ex.recordDate > zt.recordDate
OR (ex.recordDate = zt.recordDate AND ex.TableId > zt.TableId)
)
);
Just express "delete all records from my table that are NOT the MAX(recordDate) of a grouped CaseKey" in SQL as
DELETE FROM table t1
WHERE t1.recordDate <>
(SELECT MAX(recordDate)
FROM table t2
WHERE t2.CaseKey = t1.CaseKey)
You can rank all records with the same caseKey by recordDate, newest first, and select those with rank > 1, which returns only the lower dates. That way you can use your tableID.
DELETE FROM [table]
WHERE [tableID] IN
(SELECT
[sub].[tableID]
FROM
(
SELECT
[tableID],
Rank() OVER (PARTITION BY [caseKey] ORDER BY [recordDate] DESC, [tableID] DESC) AS [rank]
FROM [table]
) AS [sub]
WHERE [sub].[rank] > 1)
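A minimal sketch of the EXISTS delete with the TableId tie-breaker, run on Python's built-in sqlite3 module rather than the asker's server. The nine rows are invented to match the "3 sets of 3" description, with one deliberate tie on recordDate; the table alias is dropped from the DELETE target only to keep the statement portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ztable (TableId INTEGER PRIMARY KEY, CaseKey INTEGER, recordDate TEXT)"
)
# Hypothetical data: CaseKey 2 has two rows tied on the latest date (ids 5 and 6).
conn.executemany(
    "INSERT INTO ztable VALUES (?, ?, ?)",
    [
        (1, 1, "2020-01-01"), (2, 1, "2020-02-01"), (3, 1, "2020-03-01"),
        (4, 2, "2020-01-01"), (5, 2, "2020-03-01"), (6, 2, "2020-03-01"),
        (7, 3, "2020-03-01"), (8, 3, "2020-02-01"), (9, 3, "2020-01-01"),
    ],
)

# Delete every row for which a "better" row exists in the same CaseKey:
# a later recordDate, or the same recordDate with a higher TableId.
conn.execute(
    """
    DELETE FROM ztable
    WHERE EXISTS (
        SELECT 1
        FROM ztable ex
        WHERE ex.CaseKey = ztable.CaseKey
          AND (ex.recordDate > ztable.recordDate
               OR (ex.recordDate = ztable.recordDate
                   AND ex.TableId > ztable.TableId))
    )
    """
)

remaining = conn.execute(
    "SELECT TableId, CaseKey, recordDate FROM ztable ORDER BY CaseKey"
).fetchall()
print(remaining)  # one row per CaseKey
```

Without the tie-breaker branch, both tied rows for CaseKey 2 would survive, because neither has a strictly later recordDate than the other.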