What does it mean to have an SQL FROM clause with no comma?

I noticed today that this query
select * from table1 table2 where column_from_table1 = ?;
works. It returns the same columns as
select * from table1 where column_from_table1 = ?;
Shouldn't the former be a syntax error? What is it interpreting table2 as?

It appears to be interpreting table2 as a new name for the table. Even though a table named table2 exists, it happily allows the rename. This also works:
select * from table1 asdf where asdf.column_from_table1 = ?;

select * from table1 table2 where column_from_table1 = ?;
table2 is working as a table alias for table1. It's not being used as the name of an object in the database at all. The fact that a table named table2 exists is wholly irrelevant to this query. Usually you'd see something like this:
select t.id, t.name from table1 t where t.column_from_table1 = ?;
Some RDBMSs require the as keyword, so you'll also see this:
SELECT t.id, t.name FROM table1 AS t WHERE t.column_from_table1 = ?;
Table aliases are useful for making queries with multiple tables easier to write, especially if they have shared column names which need to be qualified. They're also essential for self-joins where a table is joined to itself.
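To see the aliasing for yourself, here is a minimal sketch using Python's built-in sqlite3 module. SQLite and the sample rows are my assumptions, not the asker's setup; the point is only that the result is unchanged even though a real table2 exists.

```python
import sqlite3

# In-memory database with made-up sample data (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, column_from_table1 TEXT);
    CREATE TABLE table2 (id INTEGER, other TEXT);
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b');
    INSERT INTO table2 VALUES (10, 'x');
""")

# Here "table2" is only an alias for table1; the real table2 is never read.
aliased = conn.execute(
    "SELECT * FROM table1 table2 WHERE table2.column_from_table1 = ?", ("a",)
).fetchall()

# Same query without the alias.
plain = conn.execute(
    "SELECT * FROM table1 WHERE column_from_table1 = ?", ("a",)
).fetchall()

print(aliased == plain)
```

Both queries read only table1, which is why the row from the real table2 never shows up.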
Example of a join using aliases:
SELECT t1.Id,
t1.Name as t1_Name,
t2.Name as t2_Name
FROM table1 t1
JOIN table2 t2
ON t1.id = t2.id
WHERE t1.column_from_table1 = ?;
Or, for a self-join to look for duplicate Name values, for example:
SELECT t1.Name,
t1.Id,
t2.Id as Dupe_Id
FROM table1 t1
JOIN table1 t2
ON t1.Name = t2.Name
WHERE t1.Id < t2.Id;
Notice that this query is referring to table1 twice and uses the aliases of t1 and t2 to differentiate which it's referring to.
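A small runnable sketch of the duplicate-finding self-join, again assuming SQLite through Python's sqlite3 module and invented sample names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (Id INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO table1 VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Ann'), (4, 'Cy');
""")

# t1 and t2 are two independent references to the same table.
# The t1.Id < t2.Id filter reports each duplicate pair once and
# stops a row from matching itself.
dupes = conn.execute("""
    SELECT t1.Name, t1.Id, t2.Id AS Dupe_Id
    FROM table1 t1
    JOIN table1 t2 ON t1.Name = t2.Name
    WHERE t1.Id < t2.Id
    ORDER BY t1.Id, t2.Id
""").fetchall()

print(dupes)
```

Without the aliases there would be no way to say which copy of table1 each column belongs to.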
Note that a comma join, such as FROM table1, table2 WHERE table1.id = table2.id, is very old syntax that should be explicitly avoided when writing queries. The older syntax is difficult to read and maintain and doesn't support outer joins except by vendor-specific extensions. The newer syntax with the JOIN keyword was introduced in standard SQL in 1992. There's no reason to still be using comma joins.
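As a rough illustration of that point, the sketch below (SQLite via Python's sqlite3, with made-up ids) shows that the comma join and the explicit inner join return the same rows, while the outer join is expressed directly with LEFT JOIN:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (2), (3), (4);
""")

# Old style: join condition buried in the WHERE clause.
comma_join = conn.execute(
    "SELECT table1.id FROM table1, table2 "
    "WHERE table1.id = table2.id ORDER BY table1.id"
).fetchall()

# Explicit style: join condition stated at the join itself.
explicit_join = conn.execute(
    "SELECT t1.id FROM table1 t1 "
    "JOIN table2 t2 ON t1.id = t2.id ORDER BY t1.id"
).fetchall()

# Outer join: unmatched table1 rows are kept, with NULL for table2.
left_join = conn.execute(
    "SELECT t1.id, t2.id FROM table1 t1 "
    "LEFT JOIN table2 t2 ON t1.id = t2.id ORDER BY t1.id"
).fetchall()

print(comma_join, explicit_join, left_join)
```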

Related

SQL beginner question: unexpected behavior with where exists select 1

I started using SQL a week ago. I am sorry but I have a "why my code does not work" question.
Please look at the following three queries on table1 and table2.
A. Inner join (returned 2 row results)
select t1.*, t2.* from table1 t1, table2 t2
where t1.item = t2.item
and t1.something = t2.something
B. Subquery (returned 2 row results)
select t1.* from table1 t1
where exists (select 1 from table2 t2
where t1.item = t2.item
and t1.something = t2.something)
C. My code (Expected the same results as in A. "Inner join" but takes forever to return results)
select t1.*, t2.* from table1 t1, table2 t2
where exists (select 1 from table2 t2
where t1.item = t2.item
and t1.something = t2.something)
For your reference, # of rows for each table is the following.
select count(*) from table1 -- (100K)
select count(*) from table2 -- (10K)
Would somebody kindly explain why my code (C) does not work?
Thank you for your help in advance.
The problem with your (C) query is that the outer reference to table2 is completely unconstrained [1]. This means that you're effectively writing query B again but also cross joining that result to table2, meaning that you'll get not 2 results but 20000 (the 2 rows from B, each paired with all 10K rows of table2).
You should be using explicit join syntax. One of the advantages of this is that it forces you to think about the join conditions at the point of joining rather than having to remember to include them in the general where clause.
select t1.*, t2.*
from table1 t1
inner join table2 t2
on t1.item = t2.item
and t1.something = t2.something
It's an error to omit the on clause. It's never an error to forget to constrain a column in the where clause [2].
[1] Just because you refer to table2 again inside your exists subquery, and even though you assign it the same t2 alias, that doesn't mean that they are the same reference. The two references to table2 are not related in any way.
[2] Of course, it's often a logical error to do this, but what I mean in this paragraph is specifically about error messages that the system will raise.
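The row multiplication is easy to reproduce on a tiny scale. This is a sketch assuming SQLite through Python's sqlite3 module, with 5 invented rows in table1 and 3 in table2 (2 of which match) instead of the 100K and 10K from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (item TEXT, something INTEGER);
    CREATE TABLE table2 (item TEXT, something INTEGER);
    INSERT INTO table1 VALUES ('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', 5);
    INSERT INTO table2 VALUES ('a', 1), ('b', 2), ('z', 9);
""")

query_a = """
    SELECT t1.*, t2.* FROM table1 t1, table2 t2
    WHERE t1.item = t2.item AND t1.something = t2.something
"""
query_b = """
    SELECT t1.* FROM table1 t1
    WHERE EXISTS (SELECT 1 FROM table2 t2
                  WHERE t1.item = t2.item AND t1.something = t2.something)
"""
# The inner t2 hides the outer t2, so the outer table2 has no condition at all.
query_c = """
    SELECT t1.*, t2.* FROM table1 t1, table2 t2
    WHERE EXISTS (SELECT 1 FROM table2 t2
                  WHERE t1.item = t2.item AND t1.something = t2.something)
"""

rows_a = len(conn.execute(query_a).fetchall())
rows_b = len(conn.execute(query_b).fetchall())
rows_c = len(conn.execute(query_c).fetchall())
table2_rows = conn.execute("SELECT count(*) FROM table2").fetchone()[0]

print(rows_a, rows_b, rows_c, rows_b * table2_rows)
```

Query C returns the rows of B multiplied by the full row count of table2, which is the same arithmetic that turns 2 rows into 20000 in the original tables.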

How to select data from a table that does not exist in another table (SQL)

How can I select rows from one table that do not exist in another table? I've tried the NOT IN and NOT EXISTS methods, but they cause performance issues with large amounts of data. Can anyone suggest a solution for this?
Thanks in advance.
I’ve tried the following.
SELECT name
FROM table1
WHERE NOT EXISTS
(SELECT *
FROM table2
WHERE table1.name = table2.name)
I also tried the NOT IN approach, but both have performance issues with a large amount of data.
I assume table1 and table2 both have an index on their name column, so you can try this:
SELECT t1.name
FROM table1 t1 LEFT JOIN table2 t2 ON t1.name = t2.name
WHERE t2.id IS NULL
This assumes table2 has an id column; if it doesn't, use t2.name in place of t2.id.
For this query:
SELECT name
FROM table1
WHERE NOT EXISTS
(SELECT *
FROM table2
WHERE table1.name = table2.name)
You want an index on table2(name).
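Here is a minimal sketch of both anti-join forms side by side, including the suggested index. It assumes SQLite through Python's sqlite3 module; the index name idx_table2_name and the sample rows are hypothetical, and it says nothing about how the two forms will perform on your own database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (name TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, name TEXT);
    -- Hypothetical index name; it lets the lookup by name avoid a full scan.
    CREATE INDEX idx_table2_name ON table2 (name);
    INSERT INTO table1 VALUES ('a'), ('b'), ('c');
    INSERT INTO table2 (name) VALUES ('b');
""")

not_exists = conn.execute("""
    SELECT name FROM table1
    WHERE NOT EXISTS (SELECT * FROM table2 WHERE table1.name = table2.name)
    ORDER BY name
""").fetchall()

# LEFT JOIN keeps every table1 row; unmatched ones have NULL in t2.id.
left_join = conn.execute("""
    SELECT t1.name
    FROM table1 t1 LEFT JOIN table2 t2 ON t1.name = t2.name
    WHERE t2.id IS NULL
    ORDER BY t1.name
""").fetchall()

print(not_exists, left_join)
```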

SQL select rows based on 2 col criteria in a separate table

I wish to select some rows from a table based on values from another table:
Table1 (wish to select from here)
Columns Date, Name, Pay
Table2 (contains a 'list' that determines what is selected from Table1)
Columns Date, Name
The query I wish to write is to:
Select Date,Name,Pay from Table1 where Date,Name is present in Table2
I got as far as being able to do it on one value
SELECT Date,Name,Pay FROM Table1 WHERE Table1.Name IN (Select Table2.name from Table2)
but I'm stuck on how to add the date qualifier. The names in either table are not unique; what makes a row unique is the date and name combination.
If I understood your question correctly, you want to use a join:
select t1.Date,t1.Name,t1.Pay FROM Table1 t1 inner join Table2 t2
ON t1.Name = t2.Name and t1.Date = t2.Date
The generic SQL solution uses exists:
Select Date, Name, Pay
from Table1 t1
where exists (select 1 from table2 t2 where t2.date = t1.date and t2.name = t1.name);
This will not match rows whose values are NULL. For that, you would need a NULL-safe comparison operator. The ANSI standard one is IS NOT DISTINCT FROM.
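A sketch of the NULL behaviour, assuming SQLite through Python's sqlite3 module and invented rows. I use SQLite's IS operator as the NULL-safe comparison here because it is the form I am sure SQLite accepts; other databases spell it differently, IS NOT DISTINCT FROM being the standard one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Date TEXT, Name TEXT, Pay INTEGER);
    CREATE TABLE Table2 (Date TEXT, Name TEXT);
    INSERT INTO Table1 VALUES ('2020-01-01', 'Ann', 10), (NULL, 'Bob', 20);
    INSERT INTO Table2 VALUES ('2020-01-01', 'Ann'), (NULL, 'Bob');
""")

# Plain "=": NULL = NULL is not true, so Bob's row is never matched.
with_equals = conn.execute("""
    SELECT Date, Name, Pay FROM Table1 t1
    WHERE EXISTS (SELECT 1 FROM Table2 t2
                  WHERE t2.Date = t1.Date AND t2.Name = t1.Name)
    ORDER BY Name
""").fetchall()

# NULL-safe comparison (SQLite's IS): two NULLs count as a match.
null_safe = conn.execute("""
    SELECT Date, Name, Pay FROM Table1 t1
    WHERE EXISTS (SELECT 1 FROM Table2 t2
                  WHERE t2.Date IS t1.Date AND t2.Name IS t1.Name)
    ORDER BY Name
""").fetchall()

print(with_equals, null_safe)
```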
Some databases support in with tuples. In those databases, you can write:
Select Date, Name, Pay
from Table1 t1
where (t1.date, t1.name) in (select t2.date, t2.name from table2 t2);
Once again, this might have an issue with NULL values, depending on how you want to treat them.
Interestingly, you could extend your logic by using a correlated subquery:
SELECT Date, Name, Pay
FROM Table1 t1
WHERE t1.Name IN (Select t2.name from Table2 t2 where t2.date = t1.date);
Although this does what you want, I think the previous two approaches are clearer in their intent.
I should note that you could use a join for this. However, that would return duplicate values if you had duplicates in table2. For that reason, I prefer the exists or in methods, because these have no risk of duplicating values.
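The duplication risk is easy to demonstrate. In this sketch (SQLite through Python's sqlite3, invented rows) Table2 lists the same date and name twice:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Date TEXT, Name TEXT, Pay INTEGER);
    CREATE TABLE Table2 (Date TEXT, Name TEXT);
    INSERT INTO Table1 VALUES ('2020-01-01', 'Ann', 10), ('2020-01-02', 'Ann', 15);
    -- The same (Date, Name) pair appears twice in the list table.
    INSERT INTO Table2 VALUES ('2020-01-01', 'Ann'), ('2020-01-01', 'Ann');
""")

# The join emits one output row per matching Table2 row.
via_join = conn.execute("""
    SELECT t1.Date, t1.Name, t1.Pay
    FROM Table1 t1
    JOIN Table2 t2 ON t1.Name = t2.Name AND t1.Date = t2.Date
""").fetchall()

# EXISTS only asks whether a match exists, so each Table1 row appears once.
via_exists = conn.execute("""
    SELECT Date, Name, Pay FROM Table1 t1
    WHERE EXISTS (SELECT 1 FROM Table2 t2
                  WHERE t2.Date = t1.Date AND t2.Name = t1.Name)
""").fetchall()

print(len(via_join), len(via_exists))
```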
You can use aliases (and a join instead of a subquery) to make the relationship between your tables easier to see:
SELECT a.Date, a.Name, a.Pay
FROM Table1 a
inner join Table2 b on a.name = b.name and a.date = b.date
In this case Date is taken from Table1; change the alias, or add both columns, if you need more.

Inner join with where conditions: which will execute first, the join or the where conditions?

Example 1:
select T1.*, T2.*
from TABLE1 T1, TABLE2 T2
where T1.id = T2.id
and T1.name = 'foo'
and T2.name = 'bar';
Will that first join T1 and T2 together by id, then select the records that satisfy the name conditions?
Or will it select the records that satisfy the name condition in T1 or T2, then join those together?
And is there a difference in performance between example 1 and example 2 (DB2)?
Example 2:
select *
from
(
select * from TABLE1 T1 where T1.name = 'foo'
) A,
(
select * from TABLE2 T2 where T2.name = 'bar'
) B
where A.id = B.id;
How the query will be executed depends on what the query planner does with it. Depending on the available indexes and how much data is in the tables the query plan may look different. The planner tries to do the work in the order that it thinks is most efficient.
If the planner does a good job, the plan for both queries should be the same. Otherwise the first query is likely to be faster, because the second would create two intermediate results that don't have any indexes.
Example 1 is more efficient because it has no embedded queries. As for how the result set is built, I have no idea; I don't know DB2.
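If you want to check this on your own system, compare the results and then look at the plans. The sketch below does that with SQLite through Python's sqlite3 module, which is an assumption on my part: the question is about DB2, whose planner and plan output are different, so treat this only as the shape of the experiment. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (id INTEGER, name TEXT);
    CREATE TABLE TABLE2 (id INTEGER, name TEXT);
    INSERT INTO TABLE1 VALUES (1, 'foo'), (2, 'foo'), (3, 'baz');
    INSERT INTO TABLE2 VALUES (1, 'bar'), (2, 'qux'), (3, 'bar');
""")

example1 = """
    SELECT T1.*, T2.*
    FROM TABLE1 T1, TABLE2 T2
    WHERE T1.id = T2.id AND T1.name = 'foo' AND T2.name = 'bar'
"""
example2 = """
    SELECT *
    FROM (SELECT * FROM TABLE1 T1 WHERE T1.name = 'foo') A,
         (SELECT * FROM TABLE2 T2 WHERE T2.name = 'bar') B
    WHERE A.id = B.id
"""

result1 = sorted(conn.execute(example1).fetchall())
result2 = sorted(conn.execute(example2).fetchall())
print(result1 == result2)

# The plan text depends on the engine and its version, so just print it.
for label, sql in (("example 1", example1), ("example 2", example2)):
    print(label)
    for row in conn.execute("EXPLAIN QUERY PLAN " + sql):
        print("  ", row[-1])
```

Both forms are logically the same query, so any difference you see is down to what the planner chose to do with each one.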

Rewrite SQL code SELECT block to simplify logic

I am trying to rewrite this block with simpler logic, if this can be done. I am using it within a larger SELECT statement, and I think if I can simplify this block, I might be able to improve the performance of my query.
proj_catg_type_id, proj_catg_id and proj_id are all PKs in their tables.
select t1.proj_catg_name
from table1 t1, table2 t2, table3 t3
where t2.proj_catg_type_id = t1.proj_catg_type_id
and t2.proj_catg_type_id = 213
and t3.proj_id = t2.proj_id
Without knowing the referential integrity rules and the logic behind the tables, it is difficult to give a 100% correct answer. But just by looking at this statement, the most simplified logic would be
select t1.proj_catg_name
from table1 t1
where t1.proj_catg_type_id = 213;
select t1.proj_catg_name
from table1 t1 inner join table2 t2
on t2.proj_catg_type_id=t1.proj_catg_type_id
where t2.proj_catg_type_id=213
and t3.proj_id=t2.proj_i
Maybe? Is t3 used outside this subselect?
If t3 is a table outside the select you showed, then this is a correlated subquery which you should not be using at all, ever! That turns your query into a row-by-agonizing-row cursor.
Use derived tables or joins to get the results.
You don't give me enough code to write a specific solution for your problem, but let me give you an example:
SELECT
field1
, field2
, (SELECT t3.field3
FROM table2 t2
JOIN table3 t3 ON t2.id = t3.id
WHERE t4.somefield = t2.somefield)
FROM table1 t1
JOIN table4 t4 ON t1.id = t4.id
SELECT
field1
, field2
, a.field3
FROM table1 t1
JOIN table4 t4
ON t1.id = t4.id
-- the derived table must expose every column the outer query refers to
JOIN (SELECT t2.somefield, t3.field3
FROM table2 t2
JOIN table3 t3 ON t2.id = t3.id) a
ON t4.somefield = a.somefield
The first query runs one row at a time, which is extremely slow. The second should give the same results (provided each row finds exactly one match in the derived table) but runs in a set-based fashion, which is much faster. It is important to make sure the derived table has an alias. You could also use a CTE.
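To check that the two shapes agree, here is a small sketch assuming SQLite through Python's sqlite3 module. The table contents are invented so that every table4 row matches exactly one row of the derived table; the derived table selects somefield as well as field3 so the outer query can join on it through the alias a.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, field1 TEXT, field2 TEXT);
    CREATE TABLE table4 (id INTEGER, somefield TEXT);
    CREATE TABLE table2 (id INTEGER, somefield TEXT);
    CREATE TABLE table3 (id INTEGER, field3 TEXT);
    INSERT INTO table1 VALUES (1, 'a', 'b'), (2, 'c', 'd');
    INSERT INTO table4 VALUES (1, 's1'), (2, 's2');
    INSERT INTO table2 VALUES (10, 's1'), (20, 's2');
    INSERT INTO table3 VALUES (10, 'f10'), (20, 'f20');
""")

# Correlated scalar subquery in the select list.
correlated = conn.execute("""
    SELECT t1.field1, t1.field2,
           (SELECT t3.field3
            FROM table2 t2
            JOIN table3 t3 ON t2.id = t3.id
            WHERE t4.somefield = t2.somefield)
    FROM table1 t1
    JOIN table4 t4 ON t1.id = t4.id
    ORDER BY t1.id
""").fetchall()

# Same result expressed as a join to a derived table aliased "a".
derived = conn.execute("""
    SELECT t1.field1, t1.field2, a.field3
    FROM table1 t1
    JOIN table4 t4 ON t1.id = t4.id
    JOIN (SELECT t2.somefield, t3.field3
          FROM table2 t2
          JOIN table3 t3 ON t2.id = t3.id) a
      ON t4.somefield = a.somefield
    ORDER BY t1.id
""").fetchall()

print(correlated == derived)
```

If some rows had no match, the scalar subquery would return NULL for them while the inner join would drop them, so a LEFT JOIN to the derived table would be the closer equivalent in that case.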