Query for unique values - sql

I have the following database table in Access:
Field1 | Field2
A | 1
B | 1
C | 2
D | 2
B | 3
O | 3
L | 3
I want to develop a query in Access (preferably without using SQL) that selects all rows whose Field2 value also occurs in a row where Field1 is "B". This query should yield
Field1|Field2
A | 1
B | 1
B | 3
O | 3
L | 3

Use a subquery:
select t.*
from t
where t.field2 in (select t2.field2 from t as t2 where t2.field1 = 'B');
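The subquery above can be checked against the sample data. This is a minimal sketch that assumes SQLite (through Python's sqlite3 module) as a stand-in for Access; the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

# In-memory stand-in for the Access table; the table name `t` follows the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (field1 TEXT, field2 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("A", 1), ("B", 1), ("C", 2), ("D", 2), ("B", 3), ("O", 3), ("L", 3)],
)

# Keep every row whose field2 also appears in some row with field1 = 'B'.
rows = conn.execute(
    """
    SELECT t.field1, t.field2
    FROM t
    WHERE t.field2 IN (SELECT t2.field2 FROM t AS t2 WHERE t2.field1 = 'B')
    ORDER BY t.field2, t.field1
    """
).fetchall()
print(rows)
```

The rows with Field2 = 2 (C and D) drop out because no row with Field1 = 'B' shares that value.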

Related

Bigquery: Joining 2 tables one having repeated records and one with count ()

I want to join two tables after unnesting the arrays in Table:1, but the records are duplicated after the join because of the UNNEST.
Table:1
| a | d.b | d.c |
-----------------
| 1 | 5 | 2 |
- -------------
| | 3 | 1 |
-----------------
| 2 | 2 | 1 |
Table:2
| a | c | f |
-----------------
| 1 | 12 | 13 |
-----------------
| 2 | 14 | 15 |
I want to join table 1 and 2 on a but I need also to have the output of:
| a | d.b | d.c | f | h | Sum(count(a))
---------------------------------------------
| 1 | 5 | 2 | 13 | 12 |
- ------------- - - 1
| | 3 | 1 | | |
---------------------------------------------
| 2 | 2 | 1 | 15 | 14 | 1
a can be repeated in table 2, so I need to count(a) and then select the sum after the join.
My problem is that when joining I need the nested and repeated record to stay the same as in the first table. But when I use aggregation to get the sum I can't group by structs or arrays, so I UNNEST the records first and then use the ARRAY_AGG function, and then the sum comes out wrong.
SELECT
t1.a,
t2.f,
t2.h,
ARRAY_AGG(DISTINCT(t1.db)) as db,
ARRAY_AGG(DISTINCT(t1.dc)) as dc,
SUM(t2.total) AS total
FROM (
SELECT
a,
d.b as db,
d.c as dc
FROM
`table1`,
UNNEST(d) AS d,
) AS t1
LEFT JOIN (
SELECT
a,
f,
h,
COUNT(*) AS total,
FROM
`table2`
GROUP BY
a,f,h) AS t2
ON
t1.a = t2.a
GROUP BY
1,
2,
3
Note: the error is in the total after the sum; it is much higher than expected. All other data are correct.
I guess column a in your table 2 is not unique.
Let's assume that table 2 looks like this:
| a | c | f |
-----------------
| 1 | 12 | 13 |
| 2 | 14 | 15 |
| 1 | 100 | 101 |
There are two rows where a is 1. Since their c and f values differ, the grouping in the subquery (GROUP BY a,f,h) does not collapse them, and COUNT(*) AS total is 1 for each row.
| a | c | f | total |
-------------------------
| 1 | 12 | 13 | 1 |
| 2 | 14 | 15 | 1 |
| 1 | 100 | 101 | 1 |
In the next step you join this table to your table 1. The rows of table 1 with value 1 in column a are duplicated because table 2 has two matching entries, so the sum is too high.
Instead of unnesting the tables, I recommend the following approach:
-- Creating of sample data as given:
with tbl_A as (select 1 a, [struct(5 as b,2 as c),struct(3,1)] d union all select 2,[struct(2,1)] union all select null,[struct(50,51)]),
tbl_B as (select 1 as a,12 b, 13 f union all select 2,14,15 union all select 1,100,101 union all select null,500,501)
-- Query:
select *
from tbl_A A
left join
(Select a,array_agg(struct(b,f)) as B, count(1) as counts from tbl_B group by 1) B
on ifnull(A.a,-9)=ifnull(B.a,-9)
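The fan-out described above can be reproduced on a small scale. This is only a sketch: it assumes SQLite (through Python's sqlite3 module) instead of BigQuery, so the arrays are modelled as an already-unnested table, and the names t1, t2, g and p are made up for the illustration. The "naive" query is a simplified version of the one in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (a INTEGER, b INTEGER, c INTEGER);  -- table 1 after UNNEST
    CREATE TABLE t2 (a INTEGER, c INTEGER, f INTEGER);  -- table 2, a is not unique
    INSERT INTO t1 VALUES (1, 5, 2), (1, 3, 1), (2, 2, 1);
    INSERT INTO t2 VALUES (1, 12, 13), (2, 14, 15), (1, 100, 101);
    """
)

# Naive: unnested rows joined to per-(a, c, f) counts. Every unnested row with
# a = 1 meets both table 2 rows with a = 1, so the sum is inflated.
naive = conn.execute(
    """
    SELECT t1.a, SUM(g.total)
    FROM t1
    LEFT JOIN (SELECT a, c, f, COUNT(*) AS total FROM t2 GROUP BY a, c, f) AS g
      ON t1.a = g.a
    GROUP BY t1.a
    ORDER BY t1.a
    """
).fetchall()

# Pre-aggregated: one row per a on both sides before the join, so nothing fans out.
fixed = conn.execute(
    """
    SELECT p.a, g.total
    FROM (SELECT DISTINCT a FROM t1) AS p
    LEFT JOIN (SELECT a, COUNT(*) AS total FROM t2 GROUP BY a) AS g
      ON p.a = g.a
    ORDER BY p.a
    """
).fetchall()

print(naive)
print(fixed)
```

For a = 1 the naive total is 2 unnested rows times 2 table 2 rows, while the pre-aggregated join reports the actual number of table 2 rows.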

Get Records with both matching values in SQL

Table A
id1 | id2 |
---------------
2 | 3 |
4 | 5 |
Table B
groupid | parentid | uid
----------------
4 | 2 (id1) | 1
4 | 3 (id2) | 2
6 | 2 | 3
7 | 4 (Id1) | 4
8 | 4 (Id1) | 5
8 | 5 (Id2) | 6
8 | 6 | 7
I want to fetch the records whose groupid contains both id1 and id2.
So in this case uids 1, 2 and 5, 6 should be retrieved, because groupids 4 and 8 contain both of them.
How can I achieve this in SQL? By SQL, let's say SQL Server.
This should do the job for you:
select distinct B1.uid from TableB B1
join TableB B2 on B1.groupid = B2.groupid and B1.parentid != B2.parentid
and (
(B1.parentid in (select id1 from TableA) AND B2.parentid in (select id2 from TableA))
OR
(B2.parentid in (select id1 from TableA) AND B1.parentid in (select id2 from TableA))
)
Test it here:
http://rextester.com/KZY45975
Try this query:
select b.uid as uid1, c.uid as uid2 from TableB b inner join TableA a on (a.id1 = b.parentid) inner join TableB c on (c.parentid = a.id2 and c.groupid = b.groupid)
It might return what you want.
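The requirement in the question (a group must contain both id1 and id2 of the same TableA row) can also be written with two EXISTS checks. This is a minimal sketch, assuming SQLite through Python's sqlite3 module rather than SQL Server; the aliases x and y are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE TableA (id1 INTEGER, id2 INTEGER);
    CREATE TABLE TableB (groupid INTEGER, parentid INTEGER, uid INTEGER);
    INSERT INTO TableA VALUES (2, 3), (4, 5);
    INSERT INTO TableB VALUES
        (4, 2, 1), (4, 3, 2), (6, 2, 3), (7, 4, 4),
        (8, 4, 5), (8, 5, 6), (8, 6, 7);
    """
)

# A TableB row qualifies when its parentid is id1 or id2 of some TableA row
# and its group contains a row for id1 as well as a row for id2 of that pair.
uids = [
    r[0]
    for r in conn.execute(
        """
        SELECT b.uid
        FROM TableB AS b
        JOIN TableA AS a ON b.parentid IN (a.id1, a.id2)
        WHERE EXISTS (SELECT 1 FROM TableB AS x
                      WHERE x.groupid = b.groupid AND x.parentid = a.id1)
          AND EXISTS (SELECT 1 FROM TableB AS y
                      WHERE y.groupid = b.groupid AND y.parentid = a.id2)
        ORDER BY b.uid
        """
    )
]
print(uids)
```

Unlike checking id1 and id2 against TableA independently, this ties both values to the same TableA row, so a group holding id1 of one pair and id2 of another would not qualify.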

Join table 1 to either column 1 or 2 from table 2 without duplicates

[MS SQL 2008]
I have tables (all columns are string names):
A: two columns relating some datafield to an owning entity
B: three columns defining a hierarchy of entities
I need to create a single table of the whole hierarchy (including all rows not existing in both tables), but the key column in table A (shown as Acol2) can be in either column 1 or 2 of table B...
A: B:
Acol1 | Acol2 Bcol1 | Bcol2 | Bcol3
-------+------ --------+-------+------
A | B B | X | Y
C | D Q | X | Y
E | F H | D | Z
G | H W | V | U
The output should be
Hierarchy:
Acol1 | Bcol1 | Bcol2 | Bcol3
-------+-------+-------+------
A | B | X | Y
Null | Q | X | Y
C | Null | D | Z
G | H | D | Z
E | Null | Null | Null
Null | W | V | U
Logic (also added to original):
If A has no record in B, show A with all Null
If A has record in Bcol1, show A with full row B
If A has record in Bcol2, show A with Null, Bcol2, Bcol3
If B has no record in A, show B with Null for Acol1
I have tried all sorts of UNIONs of two separate JOINs, but can't seem to get rid of extraneous rows...
B LEFT JOIN A ON Acol2=Bcol1 UNION B LEFT JOIN A ON Acol2=Bcol2;
gives duplicate rows, as the second part of the union has to set Bcol1 to NULL
(perhaps one solution is a way to remove this duplicate NULL row?)
B INNER JOIN A ON Acol2=Bcol1 UNION B INNER JOIN A ON Acol2=Bcol2;
Obviously removes all the rows from A and B that have no shared keys
(solution as to easy way to regain just those rows?)
Any idea appreciated!
To play:
[SQL removed - see fiddle in reply comments]
SELECT
Table1.ACol1,
-- hide BCol1 only when the match came through BCol2
CASE WHEN Table1.ACol2 = Table2.BCol2 THEN NULL ELSE Table2.BCol1 END AS BCol1,
Table2.BCol2,
Table2.BCol3
FROM
Table1
FULL OUTER JOIN
Table2
ON Table1.ACol2 IN (Table2.BCol1, Table2.BCol2)
When you say no duplicates, this is only possible if ACol2 only ever appears in one field of one row in Table2. If it appears in multiple places, you'll get duplication.
- If that's possible, how would you want to choose which record from Table2?
In general, however, this is a SQL anti-pattern.
This is because the join would prefer an index on Table2. But since you never know which field you're joining on, no single index will ever satisfy the join condition.
EDIT:
What would make this significantly faster is to create a normalised TableB...
B_ID | B_Col | B_Val
------+-------+-------
1 | 1 | B
1 | 2 | X
1 | 3 | Y
2 | 1 | Q
2 | 2 | X
2 | 3 | Y
3 | 1 | H
3 | 2 | D
3 | 3 | Z
4 | 1 | W
4 | 2 | V
4 | 3 | U
Then index that table with (B_ID) and on (B_Val)...
Then include the B_ID field in the non_normalised table...
ID | Bcol1 | Bcol2 | Bcol3
------+-------+-------+-------
1 | B | X | Y
2 | Q | X | Y
3 | H | D | Z
4 | W | V | U
Then use the following query...
SELECT
Table1.ACol1,
-- hide BCol1 only when the match came through BCol2
CASE WHEN Table1.ACol2 = Table2.BCol2 THEN NULL ELSE Table2.BCol1 END AS BCol1,
Table2.BCol2,
Table2.BCol3
FROM
(
Table1
LEFT JOIN
Table2Normalised
ON Table2Normalised.B_Val = Table1.ACol2
AND Table2Normalised.B_Col IN (1,2)
)
FULL OUTER JOIN
Table2
ON Table2Normalised.B_ID = Table2.ID
EDIT:
Without changing the schema, and instead having one index on BCol1 and a second index on Bcol2...
SELECT ACol1, BCol1, BCol2, BCol3 FROM Table1 a INNER JOIN Table2 b ON a.ACol2 = b.BCol1
UNION ALL
SELECT ACol1, NULL, BCol2, BCol3 FROM Table1 a INNER JOIN Table2 b ON a.ACol2 = b.BCol2
UNION ALL
SELECT ACol1, NULL, NULL, NULL FROM Table1 a WHERE NOT EXISTS (SELECT * FROM Table2 WHERE BCol1 = a.ACol2)
AND NOT EXISTS (SELECT * FROM Table2 WHERE BCol2 = a.ACol2)
UNION ALL
SELECT NULL, BCol1, BCol2, BCol3 FROM Table2 b WHERE NOT EXISTS (SELECT * FROM Table1 WHERE ACol2 = b.BCol1)
AND NOT EXISTS (SELECT * FROM Table1 WHERE ACol2 = b.BCol2)
But that's pretty messy...
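The four-part UNION ALL above can be run against the sample tables from the question. This is a sketch that assumes SQLite (through Python's sqlite3 module) rather than MS SQL 2008, with the table names Table1 and Table2 taken from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Table1 (ACol1 TEXT, ACol2 TEXT);
    CREATE TABLE Table2 (BCol1 TEXT, BCol2 TEXT, BCol3 TEXT);
    INSERT INTO Table1 VALUES ('A', 'B'), ('C', 'D'), ('E', 'F'), ('G', 'H');
    INSERT INTO Table2 VALUES
        ('B', 'X', 'Y'), ('Q', 'X', 'Y'), ('H', 'D', 'Z'), ('W', 'V', 'U');
    """
)

hierarchy = conn.execute(
    """
    -- 1. ACol2 found in BCol1: full Table2 row
    SELECT ACol1, BCol1, BCol2, BCol3
    FROM Table1 a INNER JOIN Table2 b ON a.ACol2 = b.BCol1
    UNION ALL
    -- 2. ACol2 found in BCol2: BCol1 is blanked
    SELECT ACol1, NULL, BCol2, BCol3
    FROM Table1 a INNER JOIN Table2 b ON a.ACol2 = b.BCol2
    UNION ALL
    -- 3. Table1 rows with no match at all
    SELECT ACol1, NULL, NULL, NULL
    FROM Table1 a
    WHERE NOT EXISTS (SELECT * FROM Table2 WHERE BCol1 = a.ACol2)
      AND NOT EXISTS (SELECT * FROM Table2 WHERE BCol2 = a.ACol2)
    UNION ALL
    -- 4. Table2 rows with no match at all
    SELECT NULL, BCol1, BCol2, BCol3
    FROM Table2 b
    WHERE NOT EXISTS (SELECT * FROM Table1 WHERE ACol2 = b.BCol1)
      AND NOT EXISTS (SELECT * FROM Table1 WHERE ACol2 = b.BCol2)
    """
).fetchall()

for row in hierarchy:
    print(row)
```

Each branch covers one of the four rules listed in the question, which is why no row appears twice for this sample data.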

When joining a table to itself to identify duplicate data in a column, how do you keep it from returning the inverse in the results?

I am trying to write a sql statement that returns the list of duplicate items I find in a table. For the sake of simplicity imagine a table named TEST with a rowid column and a text column called column1 with the following data:
rowid | column1
---------------
1 | A
2 | B
3 | C
4 | A
5 | B
6 | C
7 | D
The query I currently have is:
select t1.rowid, t1.column1, t2.rowid, t2.column1
from test t1
inner join test t2 on t1.column1 = t2.column1 and t1.rowid <> t2.rowid
It gives me the following results, as I would expect it to do:
rowid | column1 | rowid | column1
---------------------------------
1 | A | 4 | A
2 | B | 5 | B
3 | C | 6 | C
4 | A | 1 | A
5 | B | 2 | B
6 | C | 3 | C
What I really want is just:
rowid | column1 | rowid | column1
---------------------------------
1 | A | 4 | A
2 | B | 5 | B
3 | C | 6 | C
What black sql magic do I need to call upon in order to get my desired result?
select t1.rowid, t1.column1, t2.rowid, t2.column1
from test t1
inner join test t2 on t1.column1 = t2.column1 and t1.rowid < t2.rowid
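Replacing <> with < keeps only the pair whose first rowid is the smaller one, so the mirrored pair never appears. A minimal sketch, assuming SQLite through Python's sqlite3 module (the question does not name its database); the ORDER BY is added only for stable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (rowid INTEGER PRIMARY KEY, column1 TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "C"), (4, "A"), (5, "B"), (6, "C"), (7, "D")],
)

# t1.rowid < t2.rowid admits (1, 4) but rejects its mirror image (4, 1).
pairs = conn.execute(
    """
    SELECT t1.rowid, t1.column1, t2.rowid, t2.column1
    FROM test t1
    INNER JOIN test t2
        ON t1.column1 = t2.column1 AND t1.rowid < t2.rowid
    ORDER BY t1.rowid
    """
).fetchall()
print(pairs)
```

If a value occurred three times, this would return three pairs for it, one per unordered combination of rows.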
Another approach to produce results in the same form as the original table:
SELECT t.rowid, t.column1
FROM (SELECT column1
FROM test
GROUP BY column1
HAVING COUNT(*) > 1) q
INNER JOIN test t
ON q.column1 = t.column1
ORDER BY t.column1, t.rowid
Have you tried this?
select min(rowid), column1, max(rowid), column1
from test
group by column1
having count(*)>1
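This avoids self-joins and subqueries, so it should be faster. Note that it only pairs the lowest and highest rowid, so it fits the case where each duplicated value occurs exactly twice.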

sql query distinct on multiple columns

I have this data and I am trying to find cases where there are different ids but duplicate data in field1, field2, field3 and field4.
id field1 field2 field3 field4
==== ====== ====== ===== =======
1 A B C D
2 A B C D
3 A A C B
4 A A C B
So, in whatever way possible, in this case I want it to somehow show me:
1 & 2 are duplicates
3 & 4 are duplicates
Instead of SELECT DISTINCT, select the fields and a count of rows. Use HAVING to keep only the groups with more than one row, e.g.:
select field1
,field2
,field3
,field4
,count (*)
from foo
group by field1
,field2
,field3
,field4
having count (*) > 1
You can then join your original table back against the results of the query.
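Joining back could look like the sketch below. It assumes SQLite through Python's sqlite3 module, uses the table name foo from the query above, and adds a fifth, non-duplicated row (id 5) purely for illustration so that the filter has something to exclude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE foo (id INTEGER PRIMARY KEY, "
    "field1 TEXT, field2 TEXT, field3 TEXT, field4 TEXT)"
)
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A", "B", "C", "D"),
        (2, "A", "B", "C", "D"),
        (3, "A", "A", "C", "B"),
        (4, "A", "A", "C", "B"),
        (5, "A", "B", "C", "E"),  # hypothetical extra row with no duplicate
    ],
)

# Join the original table to the duplicated field combinations to recover ids.
rows = conn.execute(
    """
    SELECT f.id, f.field1, f.field2, f.field3, f.field4
    FROM foo AS f
    JOIN (SELECT field1, field2, field3, field4
          FROM foo
          GROUP BY field1, field2, field3, field4
          HAVING COUNT(*) > 1) AS d
      ON f.field1 = d.field1 AND f.field2 = d.field2
     AND f.field3 = d.field3 AND f.field4 = d.field4
    ORDER BY f.id
    """
).fetchall()

# Collect the ids that share the same field values.
groups = {}
for id_, *fields in rows:
    groups.setdefault(tuple(fields), []).append(id_)

for ids in groups.values():
    print(" & ".join(map(str, ids)), "are duplicates")
```

The equality join assumes the four fields are never NULL; rows with NULLs would need a NULL-safe comparison.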
One way to do this is to use having and group by
esben=# select * from test;
id | a | b | c | d
----+---+---+---+---
1 | 1 | 2 | 3 | 4
2 | 1 | 2 | 3 | 4
3 | 1 | 1 | 3 | 2
4 | 1 | 1 | 3 | 2
(4 rows)
esben=# select count(id),a,b,c,d from test group by a,b,c,d having count(id) >1;
count | a | b | c | d
-------+---+---+---+---
2 | 1 | 2 | 3 | 4
2 | 1 | 1 | 3 | 2
(2 rows)
This doesn't list the actual ids though, but without the actual output you want it is hard to tell you how to go about that.
SELECT *
FROM [TableName]
WHERE ID IN(SELECT MIN(ID)
FROM [TableName]
GROUP BY field1, field2, field3, field4
HAVING COUNT(*) > 1)
For the sample data, this will return the full row for ids 1 & 3, i.e. the first row of each duplicated group.