SQL statement from two tables - sql

I would like to know if it is possible, to select certain columns from one table, and another column from a second table, which would relate to a non imported column in the first table. I have to obtain this data from access, and do not know if this is possible with Access, or SQL in general.

Assuming the following table structure:
CREATE TABLE tbl_1 (
pk_1 int,
field_1 varchar(25),
field_2 varchar(25)
);
CREATE TABLE tbl_2 (
pk_2 int,
fk_1 int,
field_3 varchar(25),
field_4 varchar(25)
);
You could use the following:
SELECT t1.field_1, t2.field_3
FROM tbl_1 t1
INNER JOIN tbl_2 t2 ON t1.pk_1 = t2.fk_1
WHERE t2.field_3 = "Some String"
In regard to Bill's post, there are two ways to create JOIN's within SQL queries:
Implicit - The join is created using
the WHERE clause of the query with multiple tables being specified in the FROM clause
Explicit - The join is created using
the appropriate type of JOIN clause
(INNER, LEFT, RIGHT, FULL)
It is always recommended that you use the explicit JOIN syntax as implicit joins can present problems once the query becomes more complex.
For example, if you later add an explicit join to a query that already uses an implicit join with multiple tables referenced in the FROM clause, the first table referenced in the FROM clause will not be visible to the explicitly joined table.

What you are looking for are JOINs:
http://en.wikipedia.org/wiki/Join_(SQL)
You need primary keys for the referenced data sets and foreign keys in the first table.

I'm not 100% sure I understand your question.
Is the following true:
Your first table is imported from somewhere else.
You are only importing some columns.
You want to build a query which references a column which you haven't imported.
If this is true, it's just not possible. As far as the Access query engine in concerned the non-imported columns don't exist.
Why not just import them as well?

But primary keys make the query more efficient

Related

Is natural join the only elision of foreign key name?

Suppose you have a department table with DepartmentID as primary key, and an employee table with DepartmentID as a foreign key. You can then use the fact that these columns have the same name, to perform a natural join that allows you to omit the column name from the query. (I'm not commenting on whether you should or not - that's a matter of opinion - just noting the fact that this shorthand is part of SQL syntax.)
There are various other cases in SQL syntax where you might refer to the column names with expressions like employee.DepartmentID = department.DepartmentID. Are there any other cases where some kind of shorthand allows you to use the fact that the columns have the same name, to omit the column name?
SQL does not know directly about foreign keys; it just has foreign key constraints, which prevent you from creating invalid data. When you have a foreign key, you would want both a constraint and to do joins on it, but the database does not automatically derive one from the other.
Anyway, when you are using a join on two columns with the same names:
SELECT ...
FROM employee
JOIN department ON employee.DepartmentID = department.DepartmentID
then you can replace the ON clause with the USING clause:
SELECT ...
FROM employee
JOIN department USING (DepartmentID)
If there is a USING clause then each of the column names specified must exist in the datasets to both the left and right of the join-operator. For each pair of named columns, the expression "lhs.X = rhs.X" is evaluated for each row of the cartesian product as a boolean expression. Only rows for which all such expressions evaluates to true are included from the result set.
[…]
For each pair of columns identified by a USING clause, the column from the right-hand dataset is omitted from the joined dataset. This is the only difference between a USING clause and its equivalent ON constraint.
(Omitting the duplicate column matters only when you are using SELECT *. (I'm not commenting on whether you should or not – that's a matter of opinion – just noting the fact that this shorthand is part of SQL syntax.))

Automatic Normalization of VARCHAR columns?

Imagine you have many tables which have a VARCHAR(50) column called Country, there is a small number of different country names which repeat millions of times, the wise thing to do here is to create a table called dbo.Country(CountryID, CountryName) and have all the tables hold CountryID with a foreign key reference.
Problem is we have to JOIN all our queries with dbo.Country every time we want do something with that column.
But all the joins seem to follow the same pattern, so my question is, can SQL Server do it automatically? For example I would specify a column called CountryName in some table which looks like a VARCHAR but is actually stored as a CountryID with foreign key, and SQL Server could implicitly add the JOIN whenever necessary.
Is there such a feature in SQL Server or any other SQL database?
You can't do this "automatically". However, you do have a couple of options.
One is to create a view on top of the table that automatically does the join:
create view v_table as
select t.*, c.CountryName
from table t join
country c
on t.countryId = c.countryId;
Alternatively, you could make Country an enumerated type. That would allow it to be accessed as a string but stored as an integer.

How to use an expression in a join between two tables?

I have two tables. In the 1st table (transaction) there are 2 columns called supplier_code and local_commodity_code. In the 2nd table (local_feed_commodity_map) there are two columns called local_commodity_code and local_commodity_desc. In 1st table, the local_commodity_code field is made by concatenating the supplier_code from 1st table and local_commodity_code from the 2nd table.
I split the concatenated column by using the following code:
SELECT
SUBSTR(T.LOCAL_COMMODITY_CODE, 1, INSTR(T.LOCAL_COMMODITY_CODE,'~')-1) LOCAL_COM_CODE
FROM OYSTER_WEB3.TRANSACTION T
So, I have the column named local_com_code after splitting.
Now I want to join these two tables using the newly generated column (local_com_code) and the local_commodity_code column from the 2nd table. How can I do this only using SELECT statement because I don't have permission for create, insert or update table.
SELECT L.*, T.*
FROM (SELECT Supplier_Code,
Local_Commodity_Code,
SUBSTR(LOCAL_COMMODITY_CODE, 1, INSTR(LOCAL_COMMODITY_CODE,'~')-1)
LOCAL_COM_CODE
FROM OYSTER_WEB3.TRANSACTION
) T
JOIN Local_Feed_Commodity_Map L
ON L.Local_Commodity_Code = T.Local_Com_Code
Oracle has an aversion to the SQL standard 'AS' keyword in some locations, so I've not used it anywhere to maximize the chances of the code working.
However, as I noted in a comment to the question, this is an appalling piece of schema design and should be fixed. It is ludicrous to pessimize all queries that have to work between these two tables by requiring the use of SUBSTR and INSTR like that. The Local_Commodity_Code in the Transaction table should be identical to the Local_Commodity_Code in the Local_Feed_Commodity_Map table so that both the primary key and the foreign key columns can be properly indexed (and referential integrity enforced).

Compare rows in different table having same columns

I have 2 tables tbl_A and tbl_A_temp. Both the table have the same schema. their primary key differ since they are identity columns. Is there a way i can compare the two rows in these two table and get to know if they differ.I will be inserting data from tbl_A_temp to tbl_A, i need this compare just to make sure that I am not inserting any duplicate data in the main tables.
Regards,
Amit
I think this should work for you. Basically, since you don't have a primary key to join on, you'll need to perform a LEFT JOIN on all your other fields. If any are different, then the NULL check will be true:
SELECT t.*
FROM tbl_A_temp t
LEFT JOIN tbl_A a ON
t.field1=a.field1 AND t.field2=a.field2 AND ...
WHERE a.field1 IS NULL
I've also seen others use CHECKSUM, but have run into issues myself with it returning false positives.

When is it required to give a table name an alias in SQL?

I noticed when doing a query with multiple JOINs that my query didn't work unless I gave one of the table names an alias.
Here's a simple example to explain the point:
This doesn't work:
SELECT subject
from items
join purchases on items.folder_id=purchases.item_id
join purchases on items.date=purchases.purchase_date
group by folder_id
This does:
SELECT subject
from items
join purchases on items.folder_id=purchases.item_id
join purchases as p on items.date=p.purchase_date
group by folder_id
Can someone explain this?
You are using the same table Purchases twice in the query. You need to differentiate them by giving a different name.
You need to give an alias:
When the same table name is referenced multiple times
Imagine two people having the exact same John Doe. If you call John, both will respond to your call. You can't give the same name to two people and assume that they will know who you are calling. Similarly, when you give the same resultset named exactly the same, SQL cannot identify which one to take values from. You need to give different names to distinguish the result sets so SQL engine doesn't get confused.
Script 1: t1 and t2 are the alias names here
SELECT t1.col2
FROM table1 t1
INNER JOIN table1 t2
ON t1.col1 = t2.col1
When there is a derived table/sub query output
If a person doesn't have a name, you call them and since you can't call that person, they won't respond to you. Similarly, when you generate a derived table output or sub query output, it is something unknown to the SQL engine and it won't what to call. So, you need to give a name to the derived output so that SQL engine can appropriately deal with that derived output.
Script 2: t1 is the alias name here.
SELECT col1
FROM
(
SELECT col1
FROM table1
) t1
The only time it is REQUIRED to provide an alias is when you reference the table multiple times and when you have derived outputs (sub-queries acting as tables) (thanks for catching that out Siva). This is so that you can get rid of ambiguities between which table reference to use in the rest of your query.
To elaborate further, in your example:
SELECT subject
from items
join purchases on items.folder_id=purchases.item_id
join purchases on items.date=purchases.purchase_date
group by folder_id
My assumption is that you feel that each join and its corresponding on will use the correlating table, however you can use whichever table reference you want. So, what happens is that when you say on items.date=purchases.purchase_date, the SQL engine gets confused as to whether you mean the first purchases table, or the second one.
By adding the alias, you now get rid of the ambiguities by being more explicit. The SQL engine can now say with 100% certainty which version of purchases that you want to use. If it has to guess between two equal choices, then it will always throw an error asking for you to be more explicit.
It is required to give them a name when the same table is used twice in a query. In your case, the query wouldn't know what table to choose purchases.purchase_date from.
In this case it's simply that you've specified purchases twice and the SQL engine needs to be able to refer to each dataset in the join in a unique way, hence the alias is needed.
As a side point, do you really need to join into purchases twice? Would this not work:
SELECT
subject
from
items
join purchases
on items.folder_id=purchases.item_id
and items.date=purchases.purchase_date
group by folder_id
The alias are necessary to disambiguate the table from which to get a column.
So, if the column's name is unique in the list of all possible columns available in the tables in the from list, then you can use the coulmn name directly.
If the column's name is repeated in several of the tables available in the from list, then the DB server has no way to guess which is the right table to get the column.
In your sample query all the columns names are duplicated because you're getting "two instances" of the same table (purchases), so the server needs to know from which of the instance to take the column. SO you must specify it.
In fact, I'd recommend you to always use an alias, unless there's a single table. This way you'll avoid lots of problems, and make the query much more clear to understand.
You can't use the same table name in the same query UNLESS it is aliased as something else to prevent an ambiguous join condition. That's why its not allowed. I should note, it's also better to use always qualify table.field or alias.field so other developers behind you don't have to guess which columns are coming from which tables.
When writing a query, YOU know what you are working with, but how about the person behind you in development. If someone is not used to what columns come from what table, it can be ambiguous to follow, especially out here at S/O. By always qualifying by using the table reference and field, or alias reference and field, its much easier to follow.
select
SomeField,
AnotherField
from
OneOfMyTables
Join SecondTable
on SomeID = SecondID
compare that to
select
T1.SomeField,
T2.AnotherField
from
OneOfMyTables T1
JOIN SecondTable T2
on T1.SomeID = T2.SecondID
In these two scenarios, which would you prefer reading... Notice, I've simplified the query using shorter aliases "T1" and "T2", but they could be anything, even an acronym or abbreviated alias of the table names... "oomt" (one of my tables) and "st" (second table). Or, as something super long as has been in other posts...
Select * from ContractPurchaseOffice_AgencyLookupTable
vs
Select * from ContractPurchaseOffice_AgencyLookupTable AgencyLkup
If you had to keep qualifying joins, or field columns, which would you prefer looking at.
Hope this clarifies your question.