'Implicit' JOIN based on schema's foreign keys? - sql

Hello all :) I'm wondering if there is a way to tell the database to look at the schema and infer the JOIN predicate:
+---------------+       +-----------------+
| prices        |       | products        |
+---------------+       +-----------------+
| price_id (PK) |   +-1-| product_id (PK) |
| prod_id       |*--+   | weight          |
| shop          |       +-----------------+
| unit_price    |
| qty           |
+---------------+
Is there a way (preferably in Oracle 10g) to go from:
SELECT * FROM prices JOIN products ON prices.prod_id = products.product_id
to:
SELECT * FROM prices IMPLICIT JOIN products

The closest you can get to not writing the actual join condition is a natural join.
select * from t1 natural join t2
Oracle will look for columns with identical names and join on them (your tables have none, since prod_id and product_id differ). See the documentation on the SELECT statement:
A natural join is based on all columns in the two tables that have the same name. It selects rows from the two tables that have equal values in the relevant columns. If two columns with the same name do not have compatible data types, then an error is raised
This is very poor practice, and I strongly recommend not using it in any environment.
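To see what a natural join does with the schema from the question, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Oracle. That substitution is an assumption, and the sample rows are invented. SQLite documents that NATURAL has no effect when the two tables share no column names, so the result is a cross product rather than the intended join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, weight REAL);
    CREATE TABLE prices (
        price_id   INTEGER PRIMARY KEY,
        prod_id    INTEGER REFERENCES products(product_id),
        shop       TEXT,
        unit_price REAL,
        qty        INTEGER
    );
    INSERT INTO products VALUES (1, 0.5), (2, 1.2);
    INSERT INTO prices VALUES (10, 1, 'shop_a', 3.0, 1), (11, 2, 'shop_b', 4.5, 2);
""")

# prod_id and product_id have different names, so NATURAL JOIN finds no
# common column and pairs every price with every product.
natural_rows = conn.execute(
    "SELECT price_id, product_id FROM prices NATURAL JOIN products "
    "ORDER BY price_id, product_id"
).fetchall()

# The explicit predicate follows the foreign key and matches each price once.
explicit_rows = conn.execute(
    "SELECT price_id, product_id FROM prices "
    "JOIN products ON prices.prod_id = products.product_id "
    "ORDER BY price_id"
).fetchall()

print(natural_rows)   # four rows: the cross product
print(explicit_rows)  # two rows: one per price
```

The foreign key declared on prod_id plays no part in either query. The join condition still has to be written out.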

You shouldn't do that. Some DB systems allow you to, but what if you modify the foreign keys (e.g. add new ones)? You should always state what to join on to avoid problems. Most DB systems won't even allow you to do an implicit join, though (good!).

Related

In Oracle SQL is there a way to join on a value twice?

Let's say I have two tables, with the following columns:
cars
motorcycle_id | fuel_id | secondary_fuel_id ...
fuel_types
fuel_id | fuel_label | ...
In this case fuel_id and secondary_fuel_id both refer to the fuel_types table.
Is it possible to include both labels in an inner join? I want to join on the fuel_id but I want to be able to have the fuel label twice as a new column. So the join would be something like:
motorcycle_id | fuel_id | fuel_label | secondary_fuel_id | secondary_fuel_label | ...
In this case I would have created the secondary_fuel_label column.
Is this possible to do in SQL with joins? Is there another way to accomplish this?
You would just join twice:
select c.*, f1.fuel_label, f2.fuel_label as secondary_fuel_label
from cars c left join
fuel_types f1
on c.fuel_id = f1.fuel_id left join
fuel_types f2
on c.secondary_fuel_id = f2.fuel_id;
The key point here is to use table aliases, so you can distinguish between the two table references to fuel_types.
Note that this uses left join to be sure that rows are returned even if one of the ids is missing.
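Here is a small runnable sketch of the double join, using Python's built-in sqlite3 module rather than Oracle (an assumption made only so the example is self-contained). The fuel labels and ids are invented. Each alias of fuel_types is joined on its own column of cars.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fuel_types (fuel_id INTEGER PRIMARY KEY, fuel_label TEXT);
    CREATE TABLE cars (
        motorcycle_id     INTEGER PRIMARY KEY,
        fuel_id           INTEGER,
        secondary_fuel_id INTEGER
    );
    INSERT INTO fuel_types VALUES (1, 'petrol'), (2, 'electric');
    -- Car 101 has no secondary fuel.
    INSERT INTO cars VALUES (100, 1, 2), (101, 1, NULL);
""")

rows = conn.execute("""
    SELECT c.motorcycle_id,
           f1.fuel_label AS fuel_label,
           f2.fuel_label AS secondary_fuel_label
    FROM cars c
    LEFT JOIN fuel_types f1 ON c.fuel_id = f1.fuel_id
    LEFT JOIN fuel_types f2 ON c.secondary_fuel_id = f2.fuel_id
    ORDER BY c.motorcycle_id
""").fetchall()

print(rows)
```

Because both joins are left joins, car 101 still comes back, with a NULL (None in Python) secondary label.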

Trying to avoid duplicated records from a query

I have 2 tables with the following structure:
------------------------------------
| dbo.Katigories | dbo.Products |
|-----------------|------------------|
| product_id | product_id |
| Cat_Main_ID | other data..... |
| Cat_Sub_ID | other data..... |
| Cat_Sub_Sub_ID | other data..... |
| other data..... | other data..... |
I want to retrieve all the products from the dbo.Products table, having the same Cat_Main_ID and the same Cat_Sub_ID. To do that, I have the following SELECT statement:
SELECT * FROM dbo.katigories, dbo.Products
WHERE
dbo.katigories.Cat_Main_ID = (the Cat_Main_ID – exists_in-my url - query string)
AND
dbo.katigories.Cat_Sub_ID = (the Cat_Sub_ID – exists_in-my url - query string)
AND
dbo.katigories.product_id = dbo.Products.product_id
Unfortunately, this SELECT statement is giving me duplicated product records.
I know why this is happening: some of the products belong to several categories or subcategories at the same time. What I do not know is how to get only unique records from the Products table, i.e. each product_id once, without duplicates.
Can someone please help with the correct syntax of my query?
In SQL Server, you can use this trick:
SELECT TOP (1) WITH TIES *
FROM dbo.katigories k JOIN
dbo.Products p
ON k.product_id = p.product_id
WHERE k.Cat_Main_ID = (the Cat_Main_ID – exists_in-my url - query string) AND
k.Cat_Sub_ID = (the Cat_Sub_ID – exists_in-my url - query string)
ORDER BY ROW_NUMBER() OVER (PARTITION BY p.product_id ORDER BY NEWID());
In other databases, you would do something very similar with ROW_NUMBER() in a subquery or CTE.
Notes:
SELECT * is dangerous, because you have columns with the same names.
Always use correct, proper, standard, explicit JOIN syntax. Never use commas in the FROM clause.
Table aliases make a query easier to write and to read.
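As a rough illustration of the ROW_NUMBER() in a subquery variant mentioned above, here is a sketch that runs on SQLite through Python's sqlite3 module. It assumes SQLite 3.25 or newer for window functions. The column name `name` and the sample rows are made up, and the ordering inside each partition is by cat_sub_sub_id instead of NEWID(), which SQLite does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE katigories (
        product_id     INTEGER,
        cat_main_id    INTEGER,
        cat_sub_id     INTEGER,
        cat_sub_sub_id INTEGER
    );
    INSERT INTO products VALUES (1, 'lamp'), (2, 'desk');
    -- Product 1 sits in two sub-sub-categories of the same main/sub pair.
    INSERT INTO katigories VALUES (1, 5, 7, 70), (1, 5, 7, 71), (2, 5, 7, 70);
""")

rows = conn.execute("""
    SELECT product_id, name
    FROM (
        SELECT p.product_id, p.name,
               ROW_NUMBER() OVER (
                   PARTITION BY p.product_id ORDER BY k.cat_sub_sub_id
               ) AS rn
        FROM katigories k
        JOIN products p ON k.product_id = p.product_id
        WHERE k.cat_main_id = ? AND k.cat_sub_id = ?
    ) AS ranked
    WHERE rn = 1          -- keep one row per product
    ORDER BY product_id
""", (5, 7)).fetchall()

print(rows)
```

The two category ids are passed as bound parameters, which is also the safer way to feed values taken from a URL query string into the statement.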
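I think you can add the DISTINCT keyword after SELECT. Note that this only removes the duplicates if you select just the columns from dbo.Products, because the category columns differ between the duplicated rows.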

update a single column with join lookups

I have a table adjustments with columns adjustable_id | adjustable_type | order_id
order_id is the target column to fill with values. The value should come from another table, line_items, which has an order_id column.
adjustable_id (int) and adjustable_type (varchar) reference that table.
table: adjustments
id | adjustable_id | adjustable_type | order_id
------------------------------------------------
100 | 1 | line_item | NULL
101 | 2 | line_item | NULL
table: line_items
id | order_id | other | columns
--------------------------------
1 | 10 | bla | bla
2 | 20 | bla | bla
In the case above I guess I need a join query to update adjustments.order_id first row with value 10, second row with 20 and so on for the other rows using Postgres 9.3+.
In case the lookup fails, I need to delete invalid adjustments rows, for which they have no corresponding line_items.
There are two ways to do this. The first one uses a correlated subquery:
update adjustments a
set order_id = (select l.order_id
from line_items l
where l.id = a.adjustable_id)
where a.adjustable_type = 'line_item';
This is standard ANSI SQL, as the standard does not define a join for the UPDATE statement.
The second way is using a join, which is a Postgres extension to the SQL standard (other DBMS also support that but with different semantics and syntax).
update adjustments a
set order_id = l.order_id
from line_items l
where l.id = a.adjustable_id
and a.adjustable_type = 'line_item';
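The join is probably the faster one. Note that both versions assume that the join between line_items and adjustments returns exactly one row from line_items for each adjustments row. If it returns more than one, the first version fails with an error because the subquery yields several rows, and the second silently uses one of the matching rows. If it returns none, the first version sets order_id to NULL, while the second leaves the row untouched.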
The reason why Arockia's query was "eating your RAM" is that his/her query creates a cross-join between table1 and table1 which is then joined against table2.
The Postgres manual contains a warning about that:
Note that the target table must not appear in the from_list, unless you intend a self-join
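The question also asks for adjustments without a matching line_items row to be deleted, which the answer above does not cover. Here is a minimal sketch of one way to do both steps, run on SQLite through Python's sqlite3 module instead of Postgres (an assumption, so only the portable correlated-subquery form is used). Row 102 is an invented orphan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE line_items (id INTEGER PRIMARY KEY, order_id INTEGER);
    CREATE TABLE adjustments (
        id              INTEGER PRIMARY KEY,
        adjustable_id   INTEGER,
        adjustable_type TEXT,
        order_id        INTEGER
    );
    INSERT INTO line_items VALUES (1, 10), (2, 20);
    -- Row 102 points at a line item that does not exist.
    INSERT INTO adjustments VALUES
        (100, 1, 'line_item', NULL),
        (101, 2, 'line_item', NULL),
        (102, 9, 'line_item', NULL);
""")

# Step 1: remove adjustments whose lookup would fail.
conn.execute("""
    DELETE FROM adjustments
    WHERE adjustable_type = 'line_item'
      AND NOT EXISTS (SELECT 1 FROM line_items l
                      WHERE l.id = adjustments.adjustable_id)
""")

# Step 2: fill order_id from the matching line item.
conn.execute("""
    UPDATE adjustments
    SET order_id = (SELECT l.order_id FROM line_items l
                    WHERE l.id = adjustments.adjustable_id)
    WHERE adjustable_type = 'line_item'
""")

rows = conn.execute(
    "SELECT id, order_id FROM adjustments ORDER BY id"
).fetchall()
print(rows)
```

Deleting first means the update never has to write a NULL for a missing line item. In Postgres you would probably wrap both statements in one transaction.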
update a set A.name=B.name from table1 A join table2 B on
A.id=B.id

With SQL, how can I find non-matches in a single table many-to-many relation?

I have a database that, among other things, records the results of reactions between ingredients. It's currently structured with the following three tables:
| Material |
|----------------|
| id : Integer |
| name : Varchar |
| Reaction |
|-----------------|
| id : Integer |
| <other details> |
| Ingredient |
|-----------------------|
| material_id : Integer |
| reaction_id : Integer |
| quantity : Real |
This maps the many-to-many relationship between materials and reactions.
I would like to run a query that returns every pair of materials that do not form a reaction. (i.e. every pair (x, y) such that there is no reaction that uses exactly x and y and no other materials.) In other circumstances, I would do this with a LEFT JOIN onto the intermediate table and then look for NULL reaction_ids. In this case, I'm getting the pairs by doing a CROSS JOIN on the materials table and itself, but I'm not sure how (or whether) doing two LEFT JOINs onto the two materials aliases can work.
How can this be done?
I'm most interested in a generic SQL approach, but I'm currently using SQLite3 with SQLAlchemy. I have the option of moving the database to PostgreSQL, but SQLite is strongly preferred.
Use a cross join to generate the list and then remove the pairs that are in the same reaction.
select m.id, m2.id as id2
from materials m cross join
materials m2
where not exists (select 1
from ingredient i join
ingredient i2
on i.reaction_id = i2.reaction_id and
i.material_id = m.id and
i2.material_id = m2.id
);
Although this query looks complicated, it is essentially a direct translation of your question. The where clause is saying that there are not two ingredients for the same reaction that have each of the materials.
For performance, you want an index on ingredient(reaction_id, material_id).
EDIT:
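If you like, you can do this without an exists, using left joins. A plain where ... is null check is not enough here, because a material that takes part in several reactions produces one row per reaction, so the check has to be made per pair with group by and having:
select m.id, m2.id as id2
from materials m cross join
materials m2 left join
ingredient i
on i.material_id = m.id left join
ingredient i2
on i2.material_id = m2.id and
i2.reaction_id = i.reaction_id
group by m.id, m2.id
having count(i2.reaction_id) = 0;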
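Since the asker is on SQLite, here is a small check of the NOT EXISTS approach using Python's sqlite3 module. The table names follow the question's schema (material, ingredient), the sample data is invented, and the `m.id < m2.id` filter is my own addition so that each unordered pair is reported once. For comparison it also runs a LEFT JOIN variant that filters per pair with GROUP BY and HAVING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE material (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ingredient (
        material_id INTEGER, reaction_id INTEGER, quantity REAL
    );
    INSERT INTO material VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    -- Reaction 10 uses A and B; reaction 20 uses A and C; D is never used.
    INSERT INTO ingredient VALUES
        (1, 10, 1.0), (2, 10, 1.0), (1, 20, 1.0), (3, 20, 1.0);
""")

not_exists_pairs = conn.execute("""
    SELECT m.id, m2.id
    FROM material m CROSS JOIN material m2
    WHERE m.id < m2.id   -- added here: report each unordered pair once
      AND NOT EXISTS (
          SELECT 1
          FROM ingredient i
          JOIN ingredient i2 ON i.reaction_id = i2.reaction_id
          WHERE i.material_id = m.id AND i2.material_id = m2.id
      )
    ORDER BY m.id, m2.id
""").fetchall()

left_join_pairs = conn.execute("""
    SELECT m.id, m2.id
    FROM material m
    CROSS JOIN material m2
    LEFT JOIN ingredient i ON i.material_id = m.id
    LEFT JOIN ingredient i2
           ON i2.material_id = m2.id AND i2.reaction_id = i.reaction_id
    WHERE m.id < m2.id
    GROUP BY m.id, m2.id
    HAVING COUNT(i2.reaction_id) = 0   -- no shared reaction at all
    ORDER BY m.id, m2.id
""").fetchall()

print(not_exists_pairs)
print(left_join_pairs)
```

Material A is in two reactions, which is the case where a bare "is null" test on the second left join would wrongly report A with B or A with C. Counting per pair avoids that.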

how to optimize a left join query?

I have two tables, jos_eimcart_customers_addresses and jos_eimcart_customers. I want to pull all records from the customers table, and include address information where available from the addresses table. The query does work, but on my localhost machine it took over a minute to run. On localhost, the tables are about 8000 rows each, but in production the tables could have upwards of 25,000 rows each. Is there any way to optimize this so it doesn't take as long? Both tables have an index on the id field, which is primary key. Is there some other index I need to create that would help this run faster? Should the addresses table have an index on the customer_id field, since it's a foreign key? I have other database queries that are similar and run on much larger tables, more quickly.
(EDITED TO ADD: There can be more than one address record per customer, so customer_id is not a unique value in the addresses table.)
select
c.firstname,
c.lastname,
c.email as customer_email,
a.email as address_email,
c.phone as customer_phone,
a.phone as address_phone,
a.company,
a.address1,
a.address2,
a.city,
a.state,a.zip,
c.last_signin
from jos_eimcart_customers c
left join jos_eimcart_customers_addresses a
on c.id = a.customer_id
order by c.last_signin desc
EDITED TO ADD: Explain results
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
==========================================================================================
1 | SIMPLE | c | ALL | NULL | NULL| NULL |NULL |6175 |Using temporary; Using filesort
---------------------------------------------------------------------------------------
1 | SIMPLE | a | ALL | NULL | NULL| NULL |NULL |8111 |
You should create an index on a.customer_id. It doesn't need to be a unique index, but it should definitely be indexed.
Try creating the index and see if the query gets faster. For further optimisation, you can use MySQL's EXPLAIN to check whether your query is using indexes where it should be.
Try http://www.dbtuna.com/article.asp?id=14 and http://www.devshed.com/c/a/MySQL/MySQL-Optimization-part-1/2/ for a bit of info on EXPLAIN.
Short answer: Yes, customer_id should have index.
Better answer: It would be best to find a query analyzer for MySQL and use it to determine what the actual cause of the slowdown is.
For example, you could put EXPLAIN before your select and see what the results are.
Optimizing MySQL: Queries and Indexes
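To make the "add an index, then check the plan" advice concrete, here is a hedged sketch that uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as a stand-in for MySQL's EXPLAIN. The output format differs from MySQL's. The table names are shortened, and the index name idx_addresses_customer_id is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY, email TEXT, last_signin TEXT
    );
    CREATE TABLE addresses (
        id INTEGER PRIMARY KEY, customer_id INTEGER, city TEXT
    );
""")

query = """
    SELECT c.email, a.city
    FROM customers c
    LEFT JOIN addresses a ON c.id = a.customer_id
    ORDER BY c.last_signin DESC
"""

def plan(connection):
    # The last column of each EXPLAIN QUERY PLAN row is a readable step.
    rows = connection.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    return [row[-1] for row in rows]

plan_before = plan(conn)

# Non-unique index on the foreign key column used in the join condition.
conn.execute(
    "CREATE INDEX idx_addresses_customer_id ON addresses (customer_id)"
)
plan_after = plan(conn)

print(plan_before)
print(plan_after)
```

After the index exists, the step for the addresses table should name idx_addresses_customer_id instead of scanning the table (or building a temporary index) for every customer row. In MySQL the equivalent sign would be a non-NULL key column for table a in the EXPLAIN output.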