Why can't we use/refer to derived table in subquery - sql

I wonder why a subquery can't use a derived table as its own derived table (Query2), even though it can access the attributes of that derived table (Query1)?
Query1
select B.col1, col2 from dummy B
where B.col1 = (select col1 from dummy A where B.col1 = 'aa' and A.col1 = B.col1 limit 1);
Here we can use B.col1.
Query2
select B.col1, col2 from dummy B
where B.col1 = (select col1 from B where B.col1 = 'aa' limit 1);
Here we can't use B. The error says B doesn't exist.
You can find the sqlfiddle here.

Because the first reference is a correlated reference. A table alias is a reference to a particular "instance" of a table in the query, not another "table" itself.
If you want that ability, use a CTE:
with b as (
select d.*
from dummy d
)
Then you can use b multiple times in the ensuing query.
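A minimal sketch of the difference, run through Python's built-in sqlite3 module. The engine and the sample rows are assumptions for illustration; the question does not show the fiddle's data or name the database. It runs Query2 as written, then the same query with b defined as a CTE:

```python
import sqlite3

# Hypothetical sample data; the fiddle contents are not shown in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dummy (col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO dummy VALUES (?, ?)",
    [("aa", "x"), ("bb", "y"), ("aa", "z")],
)

# Query2 from the question: B is only an alias, so FROM B names no table.
try:
    conn.execute(
        "SELECT B.col1, col2 FROM dummy B "
        "WHERE B.col1 = (SELECT col1 FROM B WHERE B.col1 = 'aa' LIMIT 1)"
    )
    alias_error = None
except sqlite3.OperationalError as exc:
    alias_error = str(exc)

# With b defined as a CTE, b can appear in both FROM clauses.
cte_rows = conn.execute(
    "WITH b AS (SELECT d.* FROM dummy d) "
    "SELECT b.col1, col2 FROM b "
    "WHERE b.col1 = (SELECT col1 FROM b WHERE col1 = 'aa' LIMIT 1) "
    "ORDER BY col2"
).fetchall()

print(alias_error)
print(cte_rows)
```

The first statement fails because the alias is not a table name, while the CTE version returns the 'aa' rows.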

The problem is that your second query tries to select from a table that does not exist, while your first query refers to a column that does.
In both queries, B and A are only aliases, not tables.
With that in mind, let's remove the aliases from the second query:
select col1, col2 from dummy
where col1 = (select col1 from where col1 = 'aa' limit 1);
Do you see what is now missing from the query structure?
To conclude, your question suggests you have misidentified which B is the problem. It is the one in FROM B, not the one in WHERE B.col1.
Hope this helps...

Related

filtering for column pairs in SQL

consider two tables that have the following columns
A
1,X
2,Y
3,Z
B
1,X
1,Y
1,Z
2,X
2,Y
2,Z
3,X
3,Y
3,Z
Is it possible to select the rows in B whose column pairs also appear in A, without a join or a third column?
Something like:
select * from B where distinct columns in (select distinct columns from A)
You could use exists logic:
SELECT col1, col2
FROM TableB b
WHERE EXISTS (SELECT 1 FROM TableA a WHERE a.col1 = b.col1 AND a.col2 = b.col2);
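As a quick check of the EXISTS approach, here is a sketch that loads the question's sample rows into an in-memory SQLite database via Python's sqlite3 module. The engine and the column types are assumptions; the question names neither.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (col1 INTEGER, col2 TEXT);
    CREATE TABLE TableB (col1 INTEGER, col2 TEXT);
    INSERT INTO TableA VALUES (1, 'X'), (2, 'Y'), (3, 'Z');
""")
# TableB holds every combination of 1..3 with X, Y, Z, as in the question.
conn.executemany(
    "INSERT INTO TableB VALUES (?, ?)",
    [(n, c) for n in (1, 2, 3) for c in "XYZ"],
)

# Keep only the rows of B whose (col1, col2) pair also exists in A.
matching = conn.execute("""
    SELECT col1, col2
    FROM TableB b
    WHERE EXISTS (SELECT 1 FROM TableA a
                  WHERE a.col1 = b.col1 AND a.col2 = b.col2)
    ORDER BY col1
""").fetchall()

print(matching)
```

Of the nine rows in B, only the three pairs that also appear in A come back.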

Not able to join two tables with limit in Postgres

I have table A with col1, col2, col3 and table B with col1.
I want to join both tables using a limit.
I want something like:
select a.col1,a.col2,a.col3,b.col1
from tableA a, tableB b limit 5 and a.col1 between 1 AND 10;
So I have 10 records in table b and 10 in table a. I should get a total of 50 records by limiting table b to only 5 records.
Your description translates to a CROSS JOIN:
SELECT a.col1, a.col2, a.col3, b.b_col1 -- unique column names
FROM tablea a
CROSS JOIN ( SELECT col1 AS b_col1 FROM tableb LIMIT 5 ) b;
-- WHERE a.col1 BETWEEN 1 AND 10; -- see below
... and LIMIT for tableb like a_horse already demonstrated. LIMIT without ORDER BY returns arbitrary rows. The result can change from one execution to the next.
To select random rows from tableb:
...
CROSS JOIN ( SELECT col1 AS b_col1 FROM tableb ORDER BY random() LIMIT 5) b;
If your table is big consider:
Best way to select random rows PostgreSQL
While you ...
have 10 records in ... table a
... the added WHERE condition is either redundant or wrong to get 50 rows.
And while SQL allows it, it rarely makes sense to have multiple result columns of the same name. Some clients throw an error right away. Use a column alias to make names unique.
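The row arithmetic behind the CROSS JOIN (10 rows from tablea times 5 limited rows from tableb) can be checked with a small sketch. It uses Python's sqlite3 module rather than Postgres, invents the sample values, and adds ORDER BY col1 to the subquery purely so the result is deterministic; all three are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablea (col1 INTEGER, col2 TEXT, col3 TEXT)")
conn.execute("CREATE TABLE tableb (col1 INTEGER)")

# Hypothetical data: 10 rows in each table, as described in the question.
conn.executemany(
    "INSERT INTO tablea VALUES (?, ?, ?)",
    [(i, f"a{i}", f"c{i}") for i in range(1, 11)],
)
conn.executemany(
    "INSERT INTO tableb VALUES (?)",
    [(i,) for i in range(101, 111)],
)

# Every row of tablea is paired with each of the 5 rows kept from tableb.
rows = conn.execute("""
    SELECT a.col1, a.col2, a.col3, b.b_col1
    FROM tablea a
    CROSS JOIN (SELECT col1 AS b_col1 FROM tableb ORDER BY col1 LIMIT 5) b
""").fetchall()

print(len(rows))
print(sorted({r[3] for r in rows}))
```

The count is 50 only because no WHERE clause removes rows from tablea, which is the point made above about the extra condition.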
You need a derived table (aka "sub-query") for that. In the derived table, you can limit the number of rows.
select a.col1, a.col2, a.col3, b.col1
from tablea a
join (
select col1, some_column -- the alias b is not visible inside its own derived table
from tableb
limit 5 -- makes no sense without an ORDER BY
) b on b.some_column = a.some_column --<< you need a join condition
where a.col1 between 1 and 10;
Note that using LIMIT without an ORDER BY usually makes no sense.

Is it necessary to reduce update times even use group by statement?

There are two tables
Table A col1,col2,col3
100,200,aaa;
101,200,bbb;
102,200,ccc;
Table B col1,col2,col3
aaa,1,ok;
aaa,2,ok;
aaa,3,ok;
bbb,1,fine;
bbb,3,fine;
Assume table A is a very large table and table B is a small table. In table B, each col1 value maps to exactly one col3 value; e.g., if col1 is 'aaa', col3 must be 'ok'.
case 1:
update a set a.col2 = b.col3
from A a, B b
where a.col3 = b.col1
case 2:
update a set a.col2 = b.col3
from A a, (select col1, col3 from B group by col1,col3) b
where a.col3 = b.col1
The results of case 1 and case 2 are the same, but which statement is better? Will case 1 update table A 5 times? Will the GROUP BY in case 2 cost more computation?
You should run EXPLAIN on both these queries to see how your database is actually handling things. That being said, one thing does stand out in terms of performance. In your first query:
update a set a.col2 = b.col3
from A a, B b
where a.col3 = b.col1
you are joining table A with B via the col3 and col1 columns. If there were an index on B.col1 then the join could proceed much faster than if the database were forced to do a full table scan of B. But an index on B.col1 probably would not help in your second query:
update a set a.col2 = b.col3
from A a, (select col1, col3 from B group by col1,col3) b
where a.col3 = b.col1
Here you are joining A to a table derived from B and as such no index is likely available. So I would opt for your first query.
By the way, you are using the old pre ANSI-92 syntax for joining in your first query and you might want to update it.
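To make the row-count part of the question concrete, the sketch below counts how many joined rows each form feeds into the update, using the question's sample data. It runs on SQLite through Python's sqlite3 module, which is an assumption; it says nothing about how SQL Server physically applies the update, which only the execution plan can show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (col1 INTEGER, col2 TEXT, col3 TEXT);
    CREATE TABLE B (col1 TEXT, col2 INTEGER, col3 TEXT);
    INSERT INTO A VALUES (100, '200', 'aaa'), (101, '200', 'bbb'),
                         (102, '200', 'ccc');
    INSERT INTO B VALUES ('aaa', 1, 'ok'), ('aaa', 2, 'ok'), ('aaa', 3, 'ok'),
                         ('bbb', 1, 'fine'), ('bbb', 3, 'fine');
""")

# Case 1: plain join, one joined row per matching row of B.
case1_rows = conn.execute(
    "SELECT COUNT(*) FROM A a, B b WHERE a.col3 = b.col1"
).fetchone()[0]

# Case 2: B is deduplicated first, so each row of A matches at most once.
case2_rows = conn.execute(
    "SELECT COUNT(*) FROM A a, "
    "(SELECT col1, col3 FROM B GROUP BY col1, col3) b "
    "WHERE a.col3 = b.col1"
).fetchone()[0]

print(case1_rows, case2_rows)
```

The join in case 1 produces more candidate rows than there are rows to update, and whether that costs anything in practice depends on the plan the database chooses.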
Since these 2 statements are logically equal (result wise) they might have the same execution plan and therefore have the same performance.
If the execution plans differ, either statement might come out ahead.
I would like to emphasize one thing:
Nested loops are not the only way to implement a JOIN, and in databases that support hash joins they are rarely used for equality joins, so the whole way you are thinking about what is going on here needs to be revised.
Thank you, guys. According to the SQL execution plan, the data going into the update is deduplicated in the background, so there is no need to apply DISTINCT manually.
(Screenshot: SQL Server automatically applies sort/distinct.)

Is there a way to do a multi table query and get result just from specific tables?

I am trying to write a multi-table query, but I don't want to use subqueries such as:
SELECT column1
FROM table1
WHERE
EXISTS (SELECT column1 FROM table2 WHERE table1.column1 = table2.column1);
I thought of using a JOIN but so far my best result was this:
SELECT *
FROM table1
JOIN table2 ON table1.t1id = table2.t2id
WHERE table1.id = 5;
This would be good except for the fact that I get a duplicate column (the ids in table1 and table2 are foreign keys).
How do I remove the duplicate column if possible?
UPDATE:
Table1:
tableA_ID, TABLEB_ID
1, 1
1, 4
3, 2
4, 3
TableA: ID, COL1, COL2
1, A, B
2, A, B
3, A, B
4, A, B
TableB: ID, Col3, COL4
1, C, D
2, C, D
3, C, D
4, C, D
I want to get all or some of the columns from TableA according to a condition
Sample: Let's say the condition is tableA_ID = 1, which matches the first 2 rows in Table1. I then want to get all or some of the columns in TableA that correspond to the IDs I got from Table1.
Sample: The result from before was [{1,1}{1,4}], which means I want these results from TableA:
TableA.ID, TableA.COL1, TableA.COL2
1,A,B
4,A,B
The actual results I get are:
Table1.tableA_ID, Table1.TABLEB_ID, TableA.ID, TableA.COL1, TableA.COL2
1,1,1,A,B
1,4,4,A,B
Is this what you're looking for?
select a.id, a.column1, b.column2
from table1 a
left join table2 b on a.id = b.otherid;
You can't change the column list of a query based on the values it returns. It just isn't the way that SQL is designed to operate. At best, you can return all of the columns from the second table and ignore the ones that aren't relevant based on other values in that row.
I'm not even sure how a variable column list would work. In your scenario, you're looking for two discrete values separately. But that's not the only scenario: what if the condition is tableA_ID in (1,2)? Would you want different numbers of columns in different rows as part of a single result set?
Getting just the columns you want (just from specific tables, as you say) is the easy part (btw -- don't use '*' if you can help it -- topic for another discussion):
SELECT
A.ID,
A.COL1,
A.COL2
FROM
TABLE1 Bridge
LEFT JOIN TABLEA A
ON Bridge.TABLEA_ID = A.ID
LEFT JOIN TABLEB B
ON Bridge.TABLEB_ID = B.ID
Getting the rows you want will be the harder part (influenced by your choice of joins, among several other things).
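A sketch of the "columns from one table only" idea, using the sample data from the question's update and Python's sqlite3 module (the engine is an assumption). The join on TABLEB_ID and the inner join are also assumptions, inferred from the asker's expected output rather than stated anywhere in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (tableA_ID INTEGER, TABLEB_ID INTEGER);
    CREATE TABLE TableA (ID INTEGER, COL1 TEXT, COL2 TEXT);
    INSERT INTO Table1 VALUES (1, 1), (1, 4), (3, 2), (4, 3);
    INSERT INTO TableA VALUES (1, 'A', 'B'), (2, 'A', 'B'),
                              (3, 'A', 'B'), (4, 'A', 'B');
""")

# Name the wanted columns explicitly instead of SELECT *, so the bridge
# table's columns never reach the result set.
rows = conn.execute("""
    SELECT A.ID, A.COL1, A.COL2
    FROM Table1 Bridge
    JOIN TableA A ON Bridge.TABLEB_ID = A.ID  -- assumed join column
    WHERE Bridge.tableA_ID = 1
    ORDER BY A.ID
""").fetchall()

print(rows)
```

The bridge table still drives which rows are returned; it simply contributes no columns to the output.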
I think you'll need to select only the fields of table A and use a distinct clause. Rest of your query will remain as it is. i.e.
SELECT distinct table1.*
FROM table1
JOIN table2 ON table1.t1id = table2.t2id
WHERE table1.id = 5;

SQL query two tables with relation one-to-many

I have two tables, A and B, and the relation between them is A--->one-to-many--->B.
Normally I have one record in B for every record in A.
I am trying to write a query that lists ONLY the records of A that have more than one (multiple) records in B.
I am very confused, as I have only written basic SQL queries and this one seems complex to me.
Could someone please guide me to the correct answer or give me the solution?
edited:
ok i tried something like below and it gave me an error
SELECT SOME_COLUMN_NAME FROM A a, B b WHERE a.ID=b.ID and count(b.SOME_OTHER_COLUMN_NAME)>1;
ORA-00934: group function is not allowed here
I searched the internet and found that I am not allowed to use group functions in the WHERE clause and should use HAVING instead. I am stuck here now.
You haven't specified which database system you are using (sql-server/mysql/sqlite/oracle etc) so this is a generic answer.
In this form, list out all the columns of A explicitly in the SELECT and GROUP BY clauses. It normally generates a good straightforward plan in most DBMS. But it can also fail miserably if the type is not GROUP-able, such as TEXT columns in SQL Server.
SELECT A.Col1, A.Col2, A.Col3
FROM A
JOIN B ON A.LinkID = B.LinkID
GROUP BY A.Col1, A.Col2, A.Col3
HAVING COUNT(*) > 1
This other form using a subquery works for any column types in A and normally produces exactly the same plan.
SELECT A.Col1, A.Col2, A.Col3
FROM A
WHERE 1 < (
SELECT COUNT(*)
FROM B
WHERE A.LinkID = B.LinkID)
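Both forms can be compared side by side with a small sketch. It uses Python's sqlite3 module and invented sample rows, which are assumptions for illustration; the question targets Oracle and shows no data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (LinkID INTEGER, Col1 TEXT);
    CREATE TABLE B (LinkID INTEGER, Detail TEXT);
    INSERT INTO A VALUES (1, 'one'), (2, 'two'), (3, 'three');
    -- Hypothetical children: A.1 has two, A.2 has one, A.3 has three.
    INSERT INTO B VALUES (1, 'x'), (1, 'y'), (2, 'x'),
                         (3, 'x'), (3, 'y'), (3, 'z');
""")

# Form 1: join, group by every selected column of A, filter with HAVING.
grouped = conn.execute("""
    SELECT A.LinkID, A.Col1
    FROM A
    JOIN B ON A.LinkID = B.LinkID
    GROUP BY A.LinkID, A.Col1
    HAVING COUNT(*) > 1
    ORDER BY A.LinkID
""").fetchall()

# Form 2: correlated subquery counting the matching rows of B.
correlated = conn.execute("""
    SELECT A.LinkID, A.Col1
    FROM A
    WHERE 1 < (SELECT COUNT(*) FROM B WHERE A.LinkID = B.LinkID)
    ORDER BY A.LinkID
""").fetchall()

print(grouped)
print(correlated)
```

Both queries return the same parents here, namely those with more than one child row in B.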
You could do it with a sub-query:
select *
from A
where ( select count(*) from B where B.id = A.id ) > 1
select *
from tableA
where Id in (select IdA from tableb group by idA having COUNT(*)>1)
assuming tableB has a field called idA that links it to tableA