Select 2 Rows from Table when COUNT of another table - sql

Here is the code that I currently have:
SELECT `A`.*
FROM `A`
LEFT JOIN `B` ON `A`.`A_id` = `B`.`value_1`
WHERE `B`.`value_2` IS NULL
AND `B`.`userid` IS NULL
ORDER BY RAND() LIMIT 2
It is supposed to select 2 rows from A whose A_id does not appear in either value_1 or value_2 in B. The rows in B are specific to individual users, identified by userid.
I also need it to check whether there are already N rows in B matching an A_id (in either value_1 or value_2) and the userid; if there are more than N such rows, the A row should not be selected.

The following would handle your first request:
Select ...
From A
Left Join B
On ( B.value_1 = A.A_id Or B.value_2 = A.A_id )
And B.userid = @userid
Where B.<non-nullable column> Is Null
Part of the trick is moving your criteria into the ON clause of the Left Join. I'm not sure how the second part of your request fits with the first part. If there are no rows in B that match on value_1 or value_2 for the given user, then by definition that row count will be zero. Do you want to allow at most a maximum number of rows in B matching the given criteria? If so, then I'd write my query like so:
Select ...
From A
Where (
Select Count(*)
From B B2
Where ( B2.value_1 = A.A_id Or B2.value_2 = A.A_id )
And B2.userid = @userid
) <= @MaxItems
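
To make the difference between the two queries concrete, here is a minimal sketch run against an in-memory SQLite database from Python. The sample rows, the id column on B, and the function names unused_ids and ids_within_limit are assumptions made up for illustration; SQLite only stands in for the asker's MySQL setup.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for the asker's MySQL database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (A_id INTEGER PRIMARY KEY);
    CREATE TABLE B (id INTEGER PRIMARY KEY, value_1 INTEGER,
                    value_2 INTEGER, userid INTEGER);
    INSERT INTO A (A_id) VALUES (1), (2), (3), (4);
    INSERT INTO B (value_1, value_2, userid) VALUES
        (1, 2, 7), (1, 9, 7), (9, 1, 7),
        (3, 4, 8);
""")


def unused_ids(conn, userid):
    """A rows with no B row at all for this user (LEFT JOIN ... IS NULL)."""
    sql = """
        SELECT A.A_id
        FROM A
        LEFT JOIN B
          ON (B.value_1 = A.A_id OR B.value_2 = A.A_id)
         AND B.userid = ?
        WHERE B.id IS NULL
        ORDER BY A.A_id
    """
    return [row[0] for row in conn.execute(sql, (userid,))]


def ids_within_limit(conn, userid, max_items):
    """A rows that have at most max_items matching B rows for this user."""
    sql = """
        SELECT A.A_id
        FROM A
        WHERE (SELECT COUNT(*)
               FROM B B2
               WHERE (B2.value_1 = A.A_id OR B2.value_2 = A.A_id)
                 AND B2.userid = ?) <= ?
        ORDER BY A.A_id
    """
    return [row[0] for row in conn.execute(sql, (userid, max_items))]


print(unused_ids(conn, 7))           # A_id values user 7 has never used
print(ids_within_limit(conn, 7, 1))  # A_id values user 7 has used at most once
```

With a limit of 0 the counting query returns the same rows as the anti-join. In MySQL you would append ORDER BY RAND() LIMIT 2 to either query to pick the two random rows.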

Related

How to find doubles in master-child table?

I need help with a query to find doubles. Let me explain the situation by example:
tableA (the master table) has a key field keyA with these values :
keyA
1
2
3
etc
tableB (the client table) has a foreign key field keyA and a value field, fieldB
keyA fieldB
1 a
1 b
2 a
2 b
3 a
3 c
4 a
4 b
4 c
etc
So, the values for fieldB in child table tableB are:
for tableA.keyA = 1 are: a and b
for tableA.keyA = 2 are: a and b
for tableA.keyA = 3 are: a and c
for tableA.keyA = 4 are: a, b and c
Now, given a value for keyA I need to find all records in tableA that have matching records in tableB for the field fieldB.
For example, if I search with keyA = 1 then
tableA.keyA = 2 is OK because both have the same tableB.fieldB values (a and b versus a and b)
tableA.keyA = 3 is not OK because they do not have the same tableB.fieldB values (a and b versus a and c)
tableA.keyA = 4 is not OK because they do not have the same tableB.fieldB values (a and b versus a, b and c)
I need a query that can give me this result. I hope someone can help me with this or can point me into the right direction.
Try this simple query; I hope it solves your problem.
DECLARE @vkey int = 1
;WITH cte_test AS (
-- one comma-separated string of fieldB values per keyA; ORDER BY keeps the strings comparable
SELECT keyA,(SELECT ','+fieldb FROM tableB t1 WHERE t1.keyA = t.keyA ORDER BY t1.fieldb FOR XML path('')) AS rslt
from tableB t
GROUP BY t.keyA)
SELECT t2.*
FROM cte_test t1
INNER JOIN cte_test t2 ON t1.[rslt] = t2.[rslt] AND t2.[keyA] <> t1.[keyA]
WHERE t1.[keyA] = @vkey
If no other item has the same combination, the result is empty; otherwise it returns the matching items.
Assuming there are no duplicates, you can do this with a self-join and aggregation:
select c.keyA, c2.keyA
from (select c.*, count(*) over (partition by keyA) as numBs
from clientTable c
) c join
(select c.*, count(*) over (partition by keyA) as numBs
from clientTable c
) c2
on c2.fieldB = c.fieldB and
c2.keyA <> c.keyA and
c.keyA = 1 -- or whatever key you want to check
where c.numBs = c2.numBs
group by c.keyA, c2.keyA, c.numBs, c2.numBs
having count(*) = c.numBs;
The idea is to count the number of fieldB values for each keyA. These need to be equal (where c.numBs = c2.numBs) and to check that all match (having count(*) = c.numBs).
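
The same counting idea can be tried without window functions. The sketch below loads the question's sample rows into an in-memory SQLite database from Python; the function name keys_with_same_values is hypothetical, and like the query above it assumes there are no duplicate (keyA, fieldB) pairs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableB (keyA INTEGER, fieldB TEXT);
    INSERT INTO tableB (keyA, fieldB) VALUES
        (1, 'a'), (1, 'b'),
        (2, 'a'), (2, 'b'),
        (3, 'a'), (3, 'c'),
        (4, 'a'), (4, 'b'), (4, 'c');
""")


def keys_with_same_values(conn, key):
    """Other keyA values whose set of fieldB values equals that of `key`."""
    sql = """
        SELECT other.keyA
        FROM (SELECT keyA, COUNT(*) AS n FROM tableB GROUP BY keyA) AS other
        WHERE other.keyA <> :key
          -- both keys must have the same number of values ...
          AND other.n = (SELECT COUNT(*) FROM tableB WHERE keyA = :key)
          -- ... and every one of those values must be shared
          AND other.n = (SELECT COUNT(*)
                         FROM tableB x
                         JOIN tableB y ON y.fieldB = x.fieldB
                         WHERE x.keyA = :key AND y.keyA = other.keyA)
        ORDER BY other.keyA
    """
    return [row[0] for row in conn.execute(sql, {"key": key})]


print(keys_with_same_values(conn, 1))
```

Key 4 is rejected by the first count check (three values versus two) and key 3 by the second (only a is shared).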

Value present in more than one table

I have 3 tables. All of them have a column, id. I want to find out whether any value is common across the tables. Assuming the tables are named a, b and c, if id value 3 is present in a and b, there is a problem. The query can/should exit at the first such occurrence. There is no need to probe further. What I have now is something like
( select id from a intersect select id from b )
union
( select id from b intersect select id from c )
union
( select id from a intersect select id from c )
Obviously, this is not very efficient. Database is PostgreSQL, version 9.0
id is not unique in the individual tables. It is OK to have duplicates in the same table. But if a value is present in just 2 of the 3 tables, that also needs to be flagged and there is no need to check for existence in the third table, or check if there are more such values. One value, present in more than one table, and I can stop.
Although id is not unique within any given table, it should be unique across the tables; a union of distinct id should be unique, so:
select id from (
select distinct id from a
union all
select distinct id from b
union all
select distinct id from c) x
group by id
having count(*) > 1
Note the use of union all, which preserves duplicates (plain union removes duplicates).
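
A quick way to check this behaviour is to run the query on a few made-up rows. The sketch below uses Python's sqlite3 module as a stand-in for PostgreSQL; the sample ids and the function name ids_in_multiple_tables are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    CREATE TABLE c (id INTEGER);
    INSERT INTO a (id) VALUES (1), (1), (3);
    INSERT INTO b (id) VALUES (3), (4);
    INSERT INTO c (id) VALUES (4), (4), (5);
""")


def ids_in_multiple_tables(conn):
    """ids that occur in at least two of the three tables."""
    sql = """
        SELECT id FROM (
            SELECT DISTINCT id FROM a
            UNION ALL
            SELECT DISTINCT id FROM b
            UNION ALL
            SELECT DISTINCT id FROM c) x
        GROUP BY id
        HAVING COUNT(*) > 1
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql)]


print(ids_in_multiple_tables(conn))
```

The id 1 is duplicated inside a only, so the DISTINCT collapses it and it is not flagged; 3 and 4 each occur in two tables and are returned.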
I would suggest a simple join:
select a.id
from a join
b
on a.id = b.id join
c
on a.id = c.id
limit 1;
If you have a query that uses union or group by (or order by, but that is not relevant here), then you need to process all the data before returning a single row. A join can start returning rows as soon as the first values are found.
An alternative, but similar method is:
select a.id
from a
where exists (select 1 from b where a.id = b.id) and
exists (select 1 from c where a.id = c.id);
If a is the smallest table and id is indexed in b and c, then this could be quite fast.
Try this
select id from
(
select distinct id, 1 as t from a
union all
select distinct id, 2 as t from b
union all
select distinct id, 3 as t from c
) as t
group by id having count(t)=3
"It is OK to have duplicates in the same table."
"The query can/should exit at the first such occurrence."
SELECT 'OMG!' AS danger_bill_robinson
WHERE EXISTS (SELECT 1
FROM a,b,c -- maybe there is a place for old-style joins ...
WHERE a.id = b.id
OR a.id = c.id
OR c.id = b.id
);
Update: it appears the optimiser does not like Cartesian joins with 3 OR conditions. The below query is a bit faster:
SELECT 'WTF!' AS danger_bill_robinson
WHERE exists (select 1 from a JOIN b USING (id))
OR exists (select 1 from a JOIN c USING (id))
OR exists (select 1 from c JOIN b USING (id))
;
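
For anyone who wants to try the pairwise EXISTS form, here is a small sketch using Python's sqlite3 module in place of PostgreSQL. The sample rows and the function name any_shared_id are made up for illustration. Each EXISTS only needs one matching pair, so an engine is free to stop at the first hit, but whether it actually does is up to the planner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    CREATE TABLE c (id INTEGER);
    INSERT INTO a (id) VALUES (1), (1), (2);
    INSERT INTO b (id) VALUES (3), (4);
    INSERT INTO c (id) VALUES (5);
""")


def any_shared_id(conn):
    """True if some id occurs in at least two of the tables a, b and c."""
    sql = """
        SELECT EXISTS (SELECT 1 FROM a JOIN b USING (id))
            OR EXISTS (SELECT 1 FROM a JOIN c USING (id))
            OR EXISTS (SELECT 1 FROM c JOIN b USING (id))
    """
    return bool(conn.execute(sql).fetchone()[0])


before = any_shared_id(conn)  # duplicates inside a alone do not count
conn.execute("INSERT INTO c (id) VALUES (4)")  # 4 is now in both b and c
after = any_shared_id(conn)
print(before, after)
```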

How to retain a row which is foreign key in another table and remove other duplicate rows?

I have two tables:
A:
id code
1 A1
2 A1
3 B1
4 B1
5 C1
6 C1
=====================
B:
id Aid
1 1
2 4
(B doesn't contain any Aid that links to code C1)
Let me explain the overall flow:
I want each row in table A to have a distinct code (by deleting duplicates), and I want to retain the row whose id appears as Aid in table B. If no id in a group appears in table B, I retain the row with the bigger id.
So I cannot just do something like this:
DELETE FROM A
WHERE id NOT IN (SELECT MAX(id)
FROM A
GROUP BY code
)
I can get the duplicated codes (duplicate_code_groups) with the SQL statement below:
SELECT code
FROM A
GROUP BY code
HAVING COUNT(*) > 1
Is there some way in SQL to write something like
for (var ids in duplicate_code_groups){
for (var id in ids) {
if (id in B){
return id
}
}
return max(ids)
}
and put the returned id into an idtable? I just don't know how to write such code in SQL.
Then I can do
DELETE FROM A
WHERE id NOT IN idtable
Using ROW_NUMBER() inside a CTE (or subquery), you can number the rows within each code based on your ordering and then join the result set with table A to do the delete.
WITH CTE AS
(
SELECT A.*, ROW_NUMBER() OVER (PARTITION BY A.Code ORDER BY COALESCE(B.ID,0) DESC, A.ID desc) RN
FROM A
LEFT JOIN B ON A.ID = B.Aid
)
DELETE A FROM A
INNER JOIN CTE C ON A.ID = C.ID
WHERE RN > 1;
SELECT * FROM A;
The first select gives you all A.id that are in B - you don't want to delete them. The second select takes A, selects all codes without an id that appears in B, and from this subset takes the maximum id. These two sets of ids are the ones you want to keep, so the delete deletes the ones not in the sets.
DELETE from A where A.id not in
(
select aid from B
union
select MAX(A.id) from A left outer join B on B.Aid=A.id group by code having COUNT(B.id)=0
)
Actual Execution Plan on MS SQL Server 2008 R2 reveals that this solution performs quite well, it's 5-6 times faster than Nenad's solution :).
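
To see which rows survive, the sketch below runs the same keep-set delete on the question's sample data in an in-memory SQLite database from Python. SQLite is only a stand-in for SQL Server here, and the variable name remaining is made up for illustration. One assumption worth stating loudly: B.Aid must never be NULL, because a NULL inside a NOT IN list makes the condition unknown for every row and nothing gets deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, code TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, Aid INTEGER NOT NULL);
    INSERT INTO A (id, code) VALUES
        (1, 'A1'), (2, 'A1'), (3, 'B1'), (4, 'B1'), (5, 'C1'), (6, 'C1');
    INSERT INTO B (id, Aid) VALUES (1, 1), (2, 4);
""")

# Keep every id referenced from B, plus the highest id of each code
# that has no reference in B at all; delete everything else.
conn.execute("""
    DELETE FROM A
    WHERE id NOT IN (
        SELECT Aid FROM B
        UNION
        SELECT MAX(A.id)
        FROM A
        LEFT OUTER JOIN B ON B.Aid = A.id
        GROUP BY A.code
        HAVING COUNT(B.id) = 0
    )
""")

remaining = conn.execute("SELECT id, code FROM A ORDER BY id").fetchall()
print(remaining)
```

Note that if two rows with the same code are both referenced from B, both are kept, so the codes are only guaranteed distinct when at most one row per code is referenced.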
Try this Solution
DELETE FROM A
WHERE NOT id IN
(
SELECT MAX(B.AId)
FROM A INNER JOIN B ON A.id = B.aId
)

Hive SQL - Refining JOIN query to ignore Null values

I'm a little new with SQL so bear with me.
I have two tables, each with an ID column. Table A has a column titled role, Table B has a column titled outcome. I want to query these tables to find which rows based on the ID have role = 'PS' and outcome = 'DE'. Here is my code:
SELECT count(*)
FROM A JOIN B
ON (A.id = B.id
AND A.role = 'PS'
AND B.outcome = 'DE')
I've been searching the internet for a way to do this so that it doesn't include rows that have null values for either A.role or B.outcome.
The above code returns, let's say, 40,100, even though the total number of entries in B where B.outcome = 'DE' is only 40,000. So it is obviously including entries that do not fit my conditions. Is there a way to better refine my query?
Your query already excludes rows with a null value in A.role. After all, null = 'PS' is not true, and you're using an inner join.
There's an easy explanation of how you can retrieve more rows from the join than there are in B. Say you have these rows for A:
A.id A.role
1 'A'
1 'A'
And these rows for B:
B.id B.outcome
1 'A'
1 'A'
Then this query:
select *
from A
join B
on A.id = B.id and A.role = 'A' and B.outcome = 'A'
will return 4 rows. That's more than there are in table A or B!
So I'd investigate whether id is unique:
select count(*) from A group by id having count(*) > 1
select count(*) from B group by id having count(*) > 1
If either of these queries returns any rows, id is not unique in that table. Since a join repeats rows for each match, that would explain a large increase in the number of returned records.
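
The row multiplication is easy to reproduce. This is a minimal sketch with Python's sqlite3 module standing in for Hive; the two-row tables mirror the example above, and the function name duplicate_ids is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER, role TEXT);
    CREATE TABLE B (id INTEGER, outcome TEXT);
    INSERT INTO A (id, role) VALUES (1, 'A'), (1, 'A');
    INSERT INTO B (id, outcome) VALUES (1, 'A'), (1, 'A');
""")

# Every A row pairs with every B row that has the same id: 2 x 2 = 4.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM A JOIN B
      ON A.id = B.id AND A.role = 'A' AND B.outcome = 'A'
""").fetchone()[0]


def duplicate_ids(conn, table):
    """(id, count) pairs for ids that occur more than once in `table`."""
    if table not in ("A", "B"):  # table names cannot be bound as parameters
        raise ValueError("unknown table")
    sql = f"SELECT id, COUNT(*) FROM {table} GROUP BY id HAVING COUNT(*) > 1"
    return conn.execute(sql).fetchall()


print(joined_rows)
print(duplicate_ids(conn, "A"), duplicate_ids(conn, "B"))
```

Selecting the id next to the count, as done here, also tells you which ids to look at.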

Is it possible to have different conditions for each row in a query?

How can I select a set of rows where each row matches a different condition?
Example:
Supposing I have a table with a column called name, I want the result ONLY IF the first row name matches 'A', the second row name matches 'B' and the third row name matches 'C'.
Edit:
I want this to work without a fixed size, so that I can define a sequence like R,X,V,P,T and the rows match that sequence, one value per row, in order.
You can, but probably not in a way you would want:
If your table has a numeric id field that is incremented with each row, you can self-join that table 3 times (let's say as "a", "b" and "c"), use the join condition a.id + 1 = b.id and b.id + 1 = c.id, and put your filter in a where clause like: a.name = 'A' AND b.name = 'B' AND c.name = 'C'
But don't expect performance ...
Assuming that you know how to provide a row number for your rows (ROW_NUMBER() in SQL Server, for instance), you can create a lookup (match) table and join on it. See below for an explanation:
LookupTable:
RowNum Value
1 A
2 B
3 C
Your SourceTable (assuming you have already added RowNum to it; if you haven't, just introduce a subquery for it, or a CTE on SQL Server 2005 or newer):
RowNum Name
-----------
1 A
2 B
3 C
4 D
Now you need to inner join LookupTable with your SourceTable on LookupTable.RowNum = SourceTable.RowNum AND LookupTable.Value = SourceTable.Name. Then left join LookupTable to this result on RowNum only. If the source side's RowNum IS NULL for any row in the final result, then you know that at least one row has no complete match.
Here is the code for the joins:
SELECT LT2.*, T.RowNum AS Matched
FROM LookupTable LT2
LEFT JOIN
(
SELECT ST.*
FROM SourceTable ST
INNER JOIN LookupTable LT ON LT.RowNum = ST.RowNum AND LT.Value = ST.Name
) T
ON LT2.RowNum = T.RowNum
The result set of the above query will contain rows with Matched IS NULL wherever a LookupTable row has no matching SourceTable row.
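
The lookup-table idea can be tried end to end with the tables shown above. This sketch uses Python's sqlite3 module and assumes SourceTable already carries a RowNum column; the function name unmatched_positions is made up, and the inner join plus left join is collapsed into a single left join that reports the lookup positions without a match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE LookupTable (RowNum INTEGER PRIMARY KEY, Value TEXT);
    CREATE TABLE SourceTable (RowNum INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO LookupTable (RowNum, Value) VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO SourceTable (RowNum, Name) VALUES
        (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
""")


def unmatched_positions(conn):
    """Lookup positions whose expected value is not at that source row."""
    sql = """
        SELECT LT.RowNum
        FROM LookupTable LT
        LEFT JOIN SourceTable ST
          ON ST.RowNum = LT.RowNum AND ST.Name = LT.Value
        WHERE ST.RowNum IS NULL
        ORDER BY LT.RowNum
    """
    return [row[0] for row in conn.execute(sql)]


before = unmatched_positions(conn)  # sequence A, B, C is present
conn.execute("UPDATE SourceTable SET Name = 'X' WHERE RowNum = 2")
after = unmatched_positions(conn)   # position 2 no longer matches
print(before, after)
```

An empty list means the whole sequence matched. Because the sequence lives in LookupTable, it can be any length (R,X,V,P,T and so on) without changing the query.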
I suppose you could do a sub query for each row, but it wouldn't perform well or scale well at all and would be hard to maintain.
This may be close to what you're after... but I need to know where you're getting your values for A, B, C etc...
Select [insert your fields here]
FROM
(Select T1.Name, T1.Age, RowNum as t1RowNum from T T1 order by name) T1O
Full Outer JOIN
(Select T2.Name, T2.Age, RowNum as T2rowNum From T T2 order By name) T2O
ON T1O.T1RowNum+1 = T2O.T2RowNum