get name of tuples that match an entire column in another table

I have to select every name from table1 that has a tuple for every type in table2, without using grouping or aggregate functions.
table1
name|type
a | 1
a | 2
a | 3
b | 1
b | 2
b | 3
c | 2
table2
type|info
1 | .
2 | ..
3 | ...
From here, it should output
name|
a |
b |
Edit: I ended up doing something like this:
SELECT distinct outside.name
FROM table1 outside
WHERE '' NOT IN
[ (SELECT *
FROM table1 t
WHERE t.name=outside.name)
RIGHT OUTER JOIN
table2 ]
The inner select builds a table with empty values for the name wherever a type in table2 has no matching tuple for that name. So if '' isn't in the inner select, the name has a tuple for every type in table2, I think.

Here is one method:
select t1.name
from table1 t1
where exists (select 1 from table2 t2 where t2.type = t1.type)
group by t1.name
having count(distinct t1.type) = (select count(distinct t2.type) from table2);
This filters t1 down to the matches in t2. It then counts the number that match.
This uses count(distinct), which allows duplicates in the respective tables. If there are no duplicates, then just use count().
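Since the question asks for a solution without grouping or aggregates, here is a hedged sketch of the classic double NOT EXISTS form ("there is no type in table2 that this name lacks"). It runs against an in-memory SQLite database through Python's sqlite3 module purely for illustration; the table contents are the sample rows from the question, and the alias o is my own naming.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (name TEXT, type INTEGER);
    CREATE TABLE table2 (type INTEGER, info TEXT);
    INSERT INTO table1 VALUES
        ('a', 1), ('a', 2), ('a', 3),
        ('b', 1), ('b', 2), ('b', 3),
        ('c', 2);
    INSERT INTO table2 VALUES (1, '.'), (2, '..'), (3, '...');
""")

# Keep a name only if no type in table2 is missing for it.
# No GROUP BY and no aggregate functions are involved.
DIVISION_SQL = """
    SELECT DISTINCT o.name
    FROM table1 o
    WHERE NOT EXISTS (
        SELECT 1
        FROM table2 t2
        WHERE NOT EXISTS (
            SELECT 1
            FROM table1 t1
            WHERE t1.name = o.name
              AND t1.type = t2.type
        )
    )
    ORDER BY o.name
"""

names = [row[0] for row in conn.execute(DIVISION_SQL)]
print(names)  # ['a', 'b']
```

The SQL itself is standard and should carry over to other databases, though I have only reasoned it through for the sample data above.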

Related

Compare columns from 2 different tables with only last inserted values in table_2 in SQL Server

If I have two different tables in a SQL Server 2019 database as follows:
Table1
|id | name |
+-----+--------+
| 1 | rose |
| 2 | peter |
| 3 | ann |
| 4 | rose |
| 5 | ann |
Table2
| name2 |
+--------+
|rose |
|ann |
I would like to retrieve only the last two ids from table1 (in this case 4 and 5) that match name2 in table2. In other words, each name should match only once, on its most recently added row in table1; furthermore, the ids (4, 5) are to be inserted into table2.
How to do that using SQL?
Thank you

You can use row_number()
select name,id from
(
select *, row_number() over(partition by t.name order by id desc) as rn
from table1 t join table2 t1 on t.name=t1.name2
)A where rn=1
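To check what the row_number() approach returns for the sample data, here is a small sketch using Python's sqlite3 module with an in-memory database. This is only an illustration: the question is about SQL Server, SQLite needs version 3.25 or newer for window functions, and the alias ranked is my own naming.

```python
import sqlite3

# In-memory database with the rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (name2 TEXT);
    INSERT INTO table1 VALUES
        (1, 'rose'), (2, 'peter'), (3, 'ann'), (4, 'rose'), (5, 'ann');
    INSERT INTO table2 VALUES ('rose'), ('ann');
""")

# Number the matching rows per name, newest id first, and keep row 1.
LATEST_SQL = """
    SELECT name, id
    FROM (
        SELECT t.name, t.id,
               ROW_NUMBER() OVER (PARTITION BY t.name ORDER BY t.id DESC) AS rn
        FROM table1 t
        JOIN table2 t2 ON t.name = t2.name2
    ) ranked
    WHERE rn = 1
    ORDER BY id
"""

latest = conn.execute(LATEST_SQL).fetchall()
print(latest)  # [('rose', 4), ('ann', 5)]
```

peter never appears because the inner join to table2 drops him before the rows are numbered.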

Your question is vague, so there could be many answers here. My first thought is that you simply want an inner join. This will fetch ONLY the data that both tables share.
SELECT Table1.*
FROM Table1
INNER JOIN Table2 on Table1.name = Table2.name2

You seem to be describing:
select . . . -- whatever columns you want
from (select top (2) t1.*
from table1 t1
order by t1.id desc
) t1 join
table2 t2
on t2.name2 = t1.name;
This doesn't seem particularly useful for the data you have provided, but it does what you describe.
EDIT:
If you want only the most recent rows that match, use row_number():
select . . . -- whatever columns you want
from (select t1.*,
row_number() over (partition by name order by id desc) as seqnum
from table1 t1
) t1 join
table2 t2
on t2.name2 = t1.name and t1.seqnum = 1;

sql synthesis query

Good morning.
Can anyone help me with a query that summarizes the total count of each code in each table (table 1 + table 2), as in table 3? Thanks a lot.
Sorry, my English is not good.

SELECT A.code, count(A.Date) as bang_1 , B.bang_2
FROM table_1 A
LEFT JOIN (SELECT code,count(*) as bang_2
FROM table_2
GROUP BY code) B ON A.code = B.code
GROUP BY A.code, B.bang_2

I think I understand what you are trying to do:
select codes.codename, t1.c as Table1, t2.c as Table2
from (
select codename from table1
union select codename from table2
) codes
left join (select codename, count(*) c from table1 group by codename) t1 on codes.codename = t1.codename
left join (select codename, count(*) c from table2 group by codename) t2 on codes.codename = t2.codename
group by codes.codename
order by codes.codename;
Example: https://dbfiddle.uk/?rdbms=mysql_5.5&fiddle=74d3bf30af406a2cdf65185a5fdcc564
Explanation
There are two subqueries, t1 and t2, each of which counts the rows per code in one of the tables. The codes subquery combines the codenames from both tables, in case one table has codes that the other lacks.
Then the combined codenames are joined to t1 and t2 to collect the respective counts.
Result
codename | Table1 | Table2
:------- | -----: | -----:
code1 | 2 | 1
code2 | 2 | 3
code3 | 1 | 1
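Here is a hedged, self-contained sketch of the same idea, run through Python's sqlite3 module. The rows are made up by me (they are not the fiddle's data) and are chosen so that each table has a code the other lacks, which is the case the codes subquery exists for. I left out the outer GROUP BY because UNION already removes duplicate codenames.

```python
import sqlite3

# Hypothetical sample rows: code1 only in table1, code3 only in table2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (codename TEXT);
    CREATE TABLE table2 (codename TEXT);
    INSERT INTO table1 VALUES ('code1'), ('code1'), ('code2');
    INSERT INTO table2 VALUES ('code2'), ('code3');
""")

# UNION builds the full list of codes; each LEFT JOIN attaches one table's counts.
COUNTS_SQL = """
    SELECT codes.codename, t1.c AS Table1, t2.c AS Table2
    FROM (
        SELECT codename FROM table1
        UNION
        SELECT codename FROM table2
    ) codes
    LEFT JOIN (SELECT codename, COUNT(*) AS c FROM table1 GROUP BY codename) t1
           ON codes.codename = t1.codename
    LEFT JOIN (SELECT codename, COUNT(*) AS c FROM table2 GROUP BY codename) t2
           ON codes.codename = t2.codename
    ORDER BY codes.codename
"""

counts = conn.execute(COUNTS_SQL).fetchall()
print(counts)  # [('code1', 2, None), ('code2', 1, 1), ('code3', None, 1)]
```

A code missing from one table comes back as NULL (None in Python); wrapping the counts in COALESCE(..., 0) would turn that into 0 if that reads better.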

SQL Join On Columns of Different Length

I'm trying to join two tables together in SQL where the columns contain a different number of unique entries.
When I use a full join the additional entries in the column joined on are missing.
The code I'm using is (in a SAS proc SQL):
proc sql;
create table table3 as
select table1.*, table2.*
from table1 full join table2
on table1.id = table2.id;
quit;
Visual example of the problem (I can't show the actual tables as they contain sensitive data)
Table 1
id | count1
1 | 2
2 | 3
3 | 2
Table 2
id | count2
1 | 4
2 | 5
3 | 6
4 | 2
Table 3
id | counta | countb
1 | 2 | 4
2 | 3 | 5
3 | 2 | 6
- | - | 2 <----- I don't want the id column to be blank in this row
I hope I've explained my problem clearly enough, thanks in advance for your help.

The id from table 1 is blank because the row from table2 has no match in table 1. Try looking at the output from this query:
select coalesce(table1.id, table2.id) as id, table1.count1, table2.count2
from table1 full join table2
on table1.id = table2.id;
Coalesce works from left to right, returning the first non-null value (it can take more than 2 arguments). If the id in table 1 is null, it uses the id from table 2 instead.
I also recommend aliasing all tables in queries, so I'd have written this:
SELECT
COALESCE(t1.id, t2.id) as id,
t1.count1,
t2.count2
FROM
table1 t1
FULL OUTER JOIN
table2 t2
ON
t1.id = t2.id;
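A quick way to see the effect of COALESCE on the sample data is the sketch below, run through Python's sqlite3 module. One loud assumption: older SQLite versions have no FULL OUTER JOIN, so I emulate it here with a LEFT JOIN plus a UNION ALL of the unmatched table2 rows. In SAS or any database with a real full join, the query from the answer works as written.

```python
import sqlite3

# In-memory database with the rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, count1 INTEGER);
    CREATE TABLE table2 (id INTEGER, count2 INTEGER);
    INSERT INTO table1 VALUES (1, 2), (2, 3), (3, 2);
    INSERT INTO table2 VALUES (1, 4), (2, 5), (3, 6), (4, 2);
""")

# Emulated full join: all table1 rows with their matches,
# plus the table2 rows that have no partner in table1.
# COALESCE picks whichever side actually has an id.
FULL_JOIN_SQL = """
    SELECT COALESCE(t1.id, t2.id) AS id, t1.count1, t2.count2
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.id = t2.id
    UNION ALL
    SELECT COALESCE(t1.id, t2.id), t1.count1, t2.count2
    FROM table2 t2
    LEFT JOIN table1 t1 ON t1.id = t2.id
    WHERE t1.id IS NULL
    ORDER BY 1
"""

rows = conn.execute(FULL_JOIN_SQL).fetchall()
print(rows)  # [(1, 2, 4), (2, 3, 5), (3, 2, 6), (4, None, 2)]
```

The last row now carries id 4 instead of a blank, while count1 stays NULL because table1 really has no such row.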

Simply select coalesce(t1.id, t2.id); it returns the first non-null id value.

Using the same table alias twice in a query

My coworker, who is new to ANSI join syntax, recently wrote a query like this:
SELECT count(*)
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b)
JOIN table3 t3 ON
(t3.col_c = t1.col_c);
Note that table3 is joined to both table1 and table2 on different columns, but the two JOIN clauses use the same table alias for table3.
The query runs, but I'm unsure of its validity. Is this a valid way of writing this query?
I thought the join should be like this:
SELECT count(*)
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b AND
t3.col_c = t1.col_c);
Are the two versions functionally identical? I don't really have enough data in our database yet to be sure.
Thanks.

The first query is a join of 4 tables, the second one is a join of 3 tables. So I don't expect that both queries return the same numbers of rows.
SELECT *
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b)
JOIN table3 t3 ON
(t3.col_c = t1.col_c);
The alias t3 is only used in the ON clauses. In each ON clause, t3 refers to the table directly before that ON keyword. I found this out by experimenting. So the previous query is equivalent to
SELECT *
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b)
JOIN table3 t4 ON
(t4.col_c = t1.col_c);
and this can be transformed into a traditional join
SELECT *
FROM table1 t1,
table2 t2,
table3 t3,
table3 t4
where (t1.col_a = t2.col_a)
and (t2.col_b = t3.col_b)
and (t4.col_c = t1.col_c);
The second query is
SELECT *
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b AND
t3.col_c = t1.col_c);
This can also be transformed into a traditional join
SELECT *
FROM table1 t1,
table2 t2,
table3 t3
where (t1.col_a = t2.col_a)
and (t2.col_b = t3.col_b)
AND (t3.col_c = t1.col_c);
These queries seem to be different. To prove the difference, we use the following example:
create table table1(
col_a number,
col_c number
);
create table table2(
col_a number,
col_b number
);
create table table3(
col_b number,
col_c number
);
insert into table1(col_a, col_c) values(1,3);
insert into table1(col_a, col_c) values(4,3);
insert into table2(col_a, col_b) values(1,2);
insert into table2(col_a, col_b) values(4,2);
insert into table3(col_b, col_c) values(2,3);
insert into table3(col_b, col_c) values(2,5);
insert into table3(col_b, col_c) values(7,9);
commit;
We get the following output
SELECT *
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b)
JOIN table3 t3 ON
(t3.col_c = t1.col_c)
| COL_A | COL_C | COL_A | COL_B | COL_B | COL_C | COL_B | COL_C |
|-------|-------|-------|-------|-------|-------|-------|-------|
| 1 | 3 | 1 | 2 | 2 | 3 | 2 | 3 |
| 4 | 3 | 4 | 2 | 2 | 3 | 2 | 3 |
| 1 | 3 | 1 | 2 | 2 | 5 | 2 | 3 |
| 4 | 3 | 4 | 2 | 2 | 5 | 2 | 3 |
SELECT *
FROM table1 t1
JOIN table2 t2 ON
(t1.col_a = t2.col_a)
JOIN table3 t3 ON
(t2.col_b = t3.col_b AND
t3.col_c = t1.col_c)
| COL_A | COL_C | COL_A | COL_B | COL_B | COL_C |
|-------|-------|-------|-------|-------|-------|
| 4 | 3 | 4 | 2 | 2 | 3 |
| 1 | 3 | 1 | 2 | 2 | 3 |
The number of rows retrieved is different and so count(*) is different.
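For anyone who wants to reproduce the two counts without an Oracle instance, here is a hedged sketch using Python's sqlite3 module and the same sample rows. I deliberately use the rewritten form with the distinct alias t4, because the reuse of t3 is Oracle behaviour that I found by experimenting and I do not assume SQLite resolves it the same way.

```python
import sqlite3

# Same sample rows as in the Oracle example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col_a INTEGER, col_c INTEGER);
    CREATE TABLE table2 (col_a INTEGER, col_b INTEGER);
    CREATE TABLE table3 (col_b INTEGER, col_c INTEGER);
    INSERT INTO table1 VALUES (1, 3), (4, 3);
    INSERT INTO table2 VALUES (1, 2), (4, 2);
    INSERT INTO table3 VALUES (2, 3), (2, 5), (7, 9);
""")

# First query: table3 is joined twice, independently (aliases t3 and t4).
FOUR_TABLE_SQL = """
    SELECT COUNT(*)
    FROM table1 t1
    JOIN table2 t2 ON t1.col_a = t2.col_a
    JOIN table3 t3 ON t2.col_b = t3.col_b
    JOIN table3 t4 ON t4.col_c = t1.col_c
"""

# Second query: one table3 row must satisfy both conditions at once.
THREE_TABLE_SQL = """
    SELECT COUNT(*)
    FROM table1 t1
    JOIN table2 t2 ON t1.col_a = t2.col_a
    JOIN table3 t3 ON t2.col_b = t3.col_b
                  AND t3.col_c = t1.col_c
"""

count_first = conn.execute(FOUR_TABLE_SQL).fetchone()[0]
count_second = conn.execute(THREE_TABLE_SQL).fetchone()[0]
print(count_first, count_second)  # 4 2
```

The counts match the four-row and two-row outputs shown above.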
The usage of the aliases was surprising, at least for me.
The following query works because t1 in the where_clause references table2.
select *
from table1 t1 join table2 t1 on(1=1)
where t1.col_b<0;
The following query works because t1 in the where_clause references table1.
select *
from table1 t1 join table2 t1 on(1=1)
where t1.col_c<0;
The following query raises an error because both table1 and table2 contain a column col_a.
select *
from table1 t1 join table2 t1 on(1=1)
where t1.col_a<0;
The error thrown is
ORA-00918: column ambiguously defined
The following query works, the alias t1 refers to two different tables in the same where_clause.
select *
from table1 t1 join table2 t1 on(1=1)
where t1.col_b<0 and t1.col_c<0;
These and more examples can be found here: http://sqlfiddle.com/#!4/84feb/12
The smallest counter example
The smallest counter example is
table1
col_a col_c
1 2
table2
col_a col_b
1 3
table3
col_b col_c
3 5
6 2
Here the second query has an empty result set and the first query returns one row. It can be shown that the count(*) of the second query never exceeds the count(*) of the first query.
A more detailed explanation
This behaviour will become clearer if we analyze the following statement in detail.
SELECT t.col_b, t.col_c
FROM table1 t
JOIN table2 t ON
(t.col_b = t.col_c) ;
Here is the reduced syntax for this query in Backus–Naur form, derived from the syntax descriptions in the SQL Language Reference of Oracle 12.2. Note that under each syntax diagram there is a link to the Backus–Naur form of this diagram, e.g. Description of the illustration select.eps. "Reduced" means that I left out all the possibilities that were not used; e.g. the select is defined as
select::=subquery [ for_update_clause ] ;
Our query does not use the optional for_update_clause, so I reduced the rule to
select::=subquery
The only exception is the optional where_clause. I didn't remove it, so that these reduced rules can be used to analyze the above query even if we add a where_clause.
These reduced rules define only a subset of all possible select statements.
select::=subquery
subquery::=query_block
query_block::=SELECT select_list FROM join_clause [ where_clause ]
join_clause::=table_reference inner_cross_join_clause ...
table_reference::=query_table_expression t_alias
query_table_expression::=table
inner_cross_join_clause::=JOIN table_reference ON condition
So our select statement is a query_block and the join_clause is of type
table_reference inner_cross_join_clause
where table_reference is table1 t and inner_cross_join_clause is JOIN table2 t ON (t.col_b = t.col_c). The ellipsis ... means that there could be additional inner_cross_join_clauses, but we do not need this here.
In the inner_cross_join_clause the alias t refers to table2. Only if a reference cannot be satisfied there must the alias be searched for in an outer scope. So all the following expressions in the ON condition are valid:
t.col_b = t.col_c
Here t.col_b is table2.col_b because t refers to the alias of its inner_cross_join_clause, and t.col_c is table1.col_c. The t of the inner_cross_join_clause (referring to table2) has no column col_c, so the outer scope is searched and an appropriate alias is found.
If we have the clause
t.col_a = t.col_a
the alias can be found as the alias defined in the inner_cross_join_clause to which this ON condition belongs, so t will be resolved to table2.
If the select list consists of
t.col_c, t.col_b, t.col_a
instead of * then the join_clause will be searched for an alias and t.col_c will be resolved to table1.col_c (table2 does not contain a column col_c), t.col_b will be resolved to table2.col_b (table1 does not contain a col_b) but t.col_a will raise the error
ORA-00918: column ambiguously defined
because for the select_list none of the alias definitions has precedence over the other. If our query also has a where_clause, then the aliases are resolved in the same way as in the select_list.

With more data, it will produce different results.
Your colleague's query is the same as this:
select * from table3 t3 where t3.col_b = 'XX'
union
select * from table3 t3 where t3.col_c = 'YY'
or
select * from table3 t3 where t3.col_b = 'XX' or t3.col_c = 'YY'
while your query is like this:
select * from table3 t3 where t3.col_b = 'XX' and t3.col_c = 'YY'
The first one returns data where (XX or YY), while the second one returns data where (XX and YY).

PostgreSQL LEFT OUTER JOIN query syntax

Let's say I have a table1:
id name
-------------
1 "one"
2 "two"
3 "three"
And a table2 with a foreign key to the first:
id tbl1_fk option value
-------------------------------
1 1 1 1
2 2 1 1
3 1 2 1
4 3 2 1
Now I want to have as a query result:
table1.id | table1.name | option | value
-------------------------------------
1 "one" 1 1
2 "two" 1 1
3 "three"
1 "one" 2 1
2 "two"
3 "three" 2 1
How do I achieve that?
I already tried:
SELECT
table1.id,
table1.name,
table2.option,
table2.value
FROM table1 AS table1
LEFT OUTER JOIN table2 AS table2 ON table1.id = table2.tbl1_fk
but the result seems to omit the null values:
1 "one" 1 1
2 "two" 1 1
1 "one" 2 1
3 "three" 2 1
SOLVED, thanks to Mahmoud Gamal (plus the GROUP BY).
Solved with this query:
SELECT
t1.id,
t1.name,
t2.option,
t2.value
FROM
(
SELECT t1.id, t1.name, t2.option
FROM table1 AS t1
CROSS JOIN table2 AS t2
) AS t1
LEFT JOIN table2 AS t2 ON t1.id = t2.tbl1_fk
AND t1.option = t2.option
group by t1.id, t1.name, t2.option, t2.value
ORDER BY t1.id, t1.name

You have to use CROSS JOIN to get every possible combination of name from the first table with option from the second table. Then LEFT JOIN these combinations with the second table. Something like:
SELECT
t1.id,
t1.name,
t2.option,
t2.value
FROM
(
SELECT t1.id, t1.name, t2.option
FROM table1 AS t1
CROSS JOIN table2 AS t2
) AS t1
LEFT JOIN table2 AS t2 ON t1.id = t2.tbl1_fk
AND t1.option = t2.option

Simple version: option = group
It's not specified in the Q, but it seems like option is supposed to define a group somehow. In this case, the query can simply be:
SELECT t1.id, t1.name, t2.option, t2.value
FROM (SELECT generate_series(1, max(option)) AS option FROM table2) o
CROSS JOIN table1 t1
LEFT JOIN table2 t2 ON t2.option = o.option AND t2.tbl1_fk = t1.id
ORDER BY o.option, t1.id;
Or, if options are not numbered in sequence, starting with 1:
...
FROM (SELECT DISTINCT option FROM table2) o
...
Returns:
id | name | option | value
----+-------+--------+-------
1 | one | 1 | 1
2 | two | 1 | 1
3 | three | |
1 | one | 2 | 1
2 | two | |
3 | three | 2 | 1
Faster and cleaner, avoiding the big CROSS JOIN and the big GROUP BY.
You get distinct rows with a group number (grp) per set.
Requires Postgres 8.4+.
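If you want to try the simple version without a Postgres server, here is a hedged sketch that runs it through Python's sqlite3 module. It uses the SELECT DISTINCT variant, since generate_series() is not something I assume to be available in a stock SQLite build. The column names option and value are double-quoted only as a precaution against keyword clashes.

```python
import sqlite3

# In-memory database with the rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, tbl1_fk INTEGER,
                         "option" INTEGER, "value" INTEGER);
    INSERT INTO table1 VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO table2 VALUES (1, 1, 1, 1), (2, 2, 1, 1), (3, 1, 2, 1), (4, 3, 2, 1);
""")

# One row per (option, table1 row); table2 values are attached where they exist.
SIMPLE_SQL = """
    SELECT t1.id, t1.name, t2."option", t2."value"
    FROM (SELECT DISTINCT "option" FROM table2) o
    CROSS JOIN table1 t1
    LEFT JOIN table2 t2 ON t2."option" = o."option"
                       AND t2.tbl1_fk = t1.id
    ORDER BY o."option", t1.id
"""

result = conn.execute(SIMPLE_SQL).fetchall()
for row in result:
    print(row)
```

This should print six rows, with None in the option and value positions for (3, 'three') in the first group and (2, 'two') in the second, matching the result table above.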
More complex: group indicated by sequence of rows
WITH t2 AS (
SELECT *, count(step OR NULL) OVER (ORDER BY id) AS grp
FROM (
SELECT *, lag(tbl1_fk, 1, 2147483647) OVER (ORDER BY id) >= tbl1_fk AS step
FROM table2
) x
)
SELECT g.grp, t1.id, t1.name, t2.option, t2.value
FROM (SELECT generate_series(1, max(grp)) AS grp FROM t2) g
CROSS JOIN table1 t1
LEFT JOIN t2 ON t2.grp = g.grp AND t2.tbl1_fk = t1.id
ORDER BY g.grp, t1.id;
Result:
grp | id | name | option | value
-----+----+-------+--------+-------
1 | 1 | one | 1 | 1
1 | 2 | two | 1 | 1
1 | 3 | three | |
2 | 1 | one | 2 | 1
2 | 2 | two | |
2 | 3 | three | 2 | 1
How?
Explaining the complex version ...
Every set starts with a tbl1_fk <= the last one. I check for this with the window function lag(). To cover the corner case of the first row (no preceding row), I provide the biggest possible integer, 2147483647, as the default for lag().
With count() as aggregate window function I add the running count to each row, effectively forming the group number grp.
I could get a single instance for every group with:
(SELECT DISTINCT grp FROM t2) g
But it's faster to just get the maximum and employ the nifty generate_series() for the reduced CROSS JOIN.
This CROSS JOIN produces exactly the rows we need without any surplus. Avoids the need for a later GROUP BY.
LEFT JOIN t2 to that, using grp in addition to tbl1_fk to make it distinct.
Sort any way you like - which is possible now with a group number.
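The group-numbering step can also be followed outside the database. The sketch below is plain Python, not the SQL itself: it mimics lag() with the default 2147483647 and the running count over the tbl1_fk values in id order. The function name assign_groups is my own, purely for illustration.

```python
from typing import List

INT_MAX = 2147483647  # same default as passed to lag() in the query


def assign_groups(fks: List[int]) -> List[int]:
    """Return a group number for each tbl1_fk value, in id order.

    A new group starts whenever the previous value is >= the current one,
    which mirrors the 'step' column and the running count(step OR NULL).
    """
    groups = []
    previous = INT_MAX  # stands in for the missing row before the first one
    grp = 0
    for fk in fks:
        if previous >= fk:  # step is true: a new set begins here
            grp += 1
        groups.append(grp)
        previous = fk
    return groups


# tbl1_fk values of table2 from the question, ordered by id
groups = assign_groups([1, 2, 1, 3])
print(groups)  # [1, 1, 2, 2]
```

The first two rows of table2 land in group 1 and the last two in group 2, which is the grp column in the result above.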

Try this:
SELECT
table1.id, table1.name, table2.option, table2.value FROM table1 AS table1
JOIN table2 AS table2 ON table1.id = table2.tbl1_fk

This is enough:
select * from table1 left join table2 on table1.id=table2.tbl1_fk ;
