SQL: Combining JOIN and WHERE

I just started with SQL today and have the following (probably quite newbie) question:
Given two tables, Data1 and Data2, that have the same number of rows and an identical first column, I want to get all the columns from Data1, but only the rows that meet a condition involving columns of Data2.
I tried something like
SELECT
column2
column3
...
FROM
Data1
INNER JOIN Data2 ON Data1.column1 = Data2.column1
WHERE 'condition1 involving columns in Data2',
'condition2 involving columns in Data2',
...
;
This does not give me the column1, though. If I include it in the select statement above it throws an error 'Column reference column1 is ambiguous'.
Any ideas what is going on?

SELECT Data1.column1
FROM Data1 INNER JOIN Data2
ON Data1.column1 = Data2.column1
WHERE 'SOME CONDITION IS MET'
AND 'ANOTHER CONDITION IS MET'
The key here is that your SELECT needs to state which table column1 is being selected from.
ON pairs the rows of the two tables on their shared key (the primary key, I assume).
WHERE filters the joined rows, and AND lets you add further conditions to it.

The problem is that you are joining two tables that both have a column with the same name. In those cases you must prefix the column name with the table name or alias: Data1.column1 in your case. It is a good idea to prefix column names even when no names are repeated, because that avoids future errors.

Basically, you should always precede column names with table names they belong to, or - a better option - use a table alias. For example:
select a.column1,
b.column2,
b.column3
from table1 a join table2 b on a.id = b.id
where b.some_column = 20
Doing so, there won't be any ambiguity there.
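To make the ambiguity concrete, here is a minimal sketch that runs the question's scenario through Python's built-in sqlite3 module. The table contents and the score column are made up for illustration, and the exact error text is SQLite's; other databases word it differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Data1 (column1 INTEGER PRIMARY KEY, column2 TEXT, column3 TEXT);
    CREATE TABLE Data2 (column1 INTEGER PRIMARY KEY, score INTEGER);  -- score is hypothetical
    INSERT INTO Data1 VALUES (1, 'a', 'x'), (2, 'b', 'y'), (3, 'c', 'z');
    INSERT INTO Data2 VALUES (1, 10), (2, 25), (3, 30);
""")

# column1 exists in both tables, so an unqualified reference is rejected.
try:
    conn.execute(
        "SELECT column1 FROM Data1 INNER JOIN Data2 "
        "ON Data1.column1 = Data2.column1"
    )
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# Qualifying every column with a table alias removes the ambiguity.
rows = conn.execute("""
    SELECT d1.column1, d1.column2, d1.column3
    FROM Data1 AS d1
    INNER JOIN Data2 AS d2 ON d1.column1 = d2.column1
    WHERE d2.score > 20
      AND d2.score < 100
    ORDER BY d1.column1
""").fetchall()

print(ambiguous_error)
print(rows)
```

Note that the conditions in WHERE are combined with AND, not separated by commas.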

Related

How to JOIN ON something that isn't an 'EQUAL' value?

I was wondering on how to JOIN on something that isn't an equal sign. For example, I have a few tables, all with IDs, and I can easily do the following (for equals):
LEFT JOIN ON ID1 = ID2
The above example works perfect when columns have an exact match.
But some columns, instead of having a single ID, have multiple IDs, and weird separator, for example:
Table A
ID
ID7523
ID8891
ID7463
ID5234
ID7562
As you can see, Table A has individual IDs only - works great for exact join matches (=). There are no "splits" in table A, all exact matches.
TableB
ID
ID5234 -- ID7562
ID7523
ID8891
ID7463
ID5234 -- ID7562
ID7562 -- ID5234
There's a space and two dashes and another space between some of these IDs, called 'splits', and to make matters worse, sometimes they list one ID first, sometimes they list it last (not sure if that matters yet).
I do not have the ability to edit any of the tables.
Is there any way to join the ones with the dashes also?
Thanks!
LEFT JOIN ID1 -- ID2
Received error: An expression of non-boolean type specified in a context where a condition is expected
At this point, I'm not worried about all of the logic, but just connecting the tables together.
Fuzzymatch thinking...
Is it possible to think outside the box and use something like a JOIN where the ON has a CONTAINS or LIKE statement?
UPDATE... it is possible.
Ref: Using JOIN Statement with CONTAINS function
That is 100% possible, although, inefficient from a querying perspective.
Firstly, a JOIN does not require equality: the ON clause accepts any boolean expression, so something like JOIN ON ColumnA LIKE ColumnB is valid, but such a join usually cannot use an index and gets slow on large tables. What you can do instead is build a new set from Table2 that includes a computed column, and use this computed key to JOIN your tables.
So for instance instead of:
SELECT TABLE1.*, TABLE2.*
FROM TABLE1
JOIN TABLE2 ON ID1 = ID2
Do something like:
SELECT TABLE1.*, TABLE2_MODIFIED.*
FROM TABLE1
JOIN (SELECT TABLE2.*, LEFT(ID2, 6) AS new_id FROM TABLE2) TABLE2_MODIFIED ON ID1 = new_id
So what this does is create a temporary subset of TABLE2 (called a derived table) with a computed field that keeps only the first 6 characters of the ID2 field. At that point you have two keys that are ready for a typical JOIN. Note that this matches only the first ID of a split value.
If the RDBMS you are using doesn't have a LEFT function, see if SUBSTRING, TRIM or even a CASE expression will work for you. But, ultimately, if you need to join two sets and your keys aren't equal, you want to redefine one of your sets to make them equal as needed.
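As a rough sketch of both approaches, the snippet below loads the sample IDs from the question into SQLite (through Python's sqlite3 module) and joins them twice: once on a derived six-character key, once with a LIKE predicate in ON. SQLite has no LEFT function, so substr stands in for it; the aliases a, b and new_id are only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (ID TEXT)")
conn.execute("CREATE TABLE TableB (ID TEXT)")
conn.executemany(
    "INSERT INTO TableA VALUES (?)",
    [("ID7523",), ("ID8891",), ("ID7463",), ("ID5234",), ("ID7562",)],
)
conn.executemany(
    "INSERT INTO TableB VALUES (?)",
    [("ID5234 -- ID7562",), ("ID7523",), ("ID8891",), ("ID7463",),
     ("ID5234 -- ID7562",), ("ID7562 -- ID5234",)],
)

# Approach 1: derived table with a computed key (first 6 characters only).
derived = conn.execute("""
    SELECT a.ID, b.ID
    FROM TableA AS a
    JOIN (SELECT ID, substr(ID, 1, 6) AS new_id FROM TableB) AS b
      ON a.ID = b.new_id
""").fetchall()

# Approach 2: non-equality join; matches an ID anywhere in the split value.
like_join = conn.execute("""
    SELECT a.ID, b.ID
    FROM TableA AS a
    JOIN TableB AS b ON b.ID LIKE '%' || a.ID || '%'
""").fetchall()

print(len(derived), len(like_join))
```

The derived key only sees the first ID of a split value, so the pair ("ID7562", "ID5234 -- ID7562") shows up in the LIKE join but not in the derived-key join. Which behaviour is wanted depends on what a split is supposed to mean.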

How to drop one join key when joining two tables

I have two tables. Both have lot of columns. Now I have a common column called ID on which I would join.
Now since this variable ID is present in both the tables if I do simply this
select a.*,b.*
from table_a as a
left join table_b as b on a.id=b.id
This will give an error as id is duplicate (present in both the tables and getting included for both).
I don't want to write down separately each column of b in the select statement. I have lots of columns and that is a pain. Can I rename the ID column of b in the join statement itself similar to SAS data merge statements?
I am using Postgres.
Postgres would not give you an error for duplicate output column names, but some clients do. (Duplicate names are also not very useful.)
Either way, use the USING clause as join condition to fold the two join columns into one:
SELECT *
FROM tbl_a a
LEFT JOIN tbl_b b USING (id);
If you join a table to itself (self-join), there will be more duplicate column names, and such a query would hardly make sense to begin with. It starts to make sense for different tables, as you stated in your question: I have two tables ...
To avoid all duplicate column names, you have to list them in the SELECT clause explicitly - possibly dealing out column aliases to get both instances with different names.
Or you can use a NATURAL join - if that fits your unexplained use case:
SELECT *
FROM tbl_a a
NATURAL LEFT JOIN tbl_b b;
This joins on all columns that share the same name and folds those automatically - exactly the same as listing all common column names in a USING clause. You need to be aware of rules for possible NULL values ...
Details in the manual.
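The folding behaviour can be checked with a small sketch. It uses SQLite through Python's sqlite3 module rather than Postgres, on the assumption that both treat USING and NATURAL the same way for SELECT *; the a_val and b_val columns are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_a (id INTEGER, a_val TEXT);
    CREATE TABLE tbl_b (id INTEGER, b_val TEXT);
    INSERT INTO tbl_a VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO tbl_b VALUES (1, 'b1');
""")

def column_names(sql):
    """Return the output column names of a query."""
    return [d[0] for d in conn.execute(sql).description]

# ON keeps both id columns; USING and NATURAL fold them into one.
on_cols = column_names(
    "SELECT * FROM tbl_a a LEFT JOIN tbl_b b ON a.id = b.id")
using_cols = column_names(
    "SELECT * FROM tbl_a a LEFT JOIN tbl_b b USING (id)")
natural_cols = column_names(
    "SELECT * FROM tbl_a a NATURAL LEFT JOIN tbl_b b")

rows = conn.execute(
    "SELECT * FROM tbl_a a LEFT JOIN tbl_b b USING (id) ORDER BY id"
).fetchall()

print(len(on_cols), using_cols, natural_cols)
print(rows)
```

NATURAL picks up every shared column name, so adding an unrelated same-named column to either table later silently changes the join condition; USING (id) stays explicit.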

SQL Join for cell content, not column name

I read up on SQL Join but as far as I understand it, you can only join tables which have a column name in common.
I have information in two different tables, but the column name is different in each. I need to pull information on something which is only in one of the tables, but also need information from the other. So was looking to join/merge them.
Here is what I mean..
TABLE1:
http://postimg.org/image/hnd63c2f5/
The cell content 18599 in column from_pin_id also pertains to content in another table:
TABLE2:
http://postimg.org/image/apmu26l5z/
My question is how do I merge the two table details so that it recognizes 18599 is referring to the same thing, so that I can pull content on it from other columns in TABLE2?
I've looked through the codes on W3 but cannot find anything to what I need, as mentioned above, it seems to be just for joining tables with a common column:
SELECT column_name(s)
FROM table1
JOIN table2
ON table1.column_name=table2.column_name;
You can write as :
select * from table1
where from_pin_id in
(
select from_pin_id
from table1
intersect
select id
from table2
)
Intersect operator selects all elements that belong to both of the sets.
Change the table names and the columns that you select as needed.
SELECT table1.id, table1.owner_user_id, table1.from_pin_id, table2.board_id
FROM table1
JOIN table2 ON table1.from_pin_id = table2.id
GROUP BY table1.id, table1.owner_user_id, table1.from_pin_id, table2.board_id
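A join never requires matching column names, only comparable values. The sketch below shows this with SQLite via Python's sqlite3 module. The column names come from the query above; every value except 18599 is invented, since the original screenshots are not available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, owner_user_id INTEGER, from_pin_id INTEGER);
    CREATE TABLE table2 (id INTEGER, board_id INTEGER);
    -- Hypothetical sample rows; only 18599 appears in the question.
    INSERT INTO table1 VALUES (1, 100, 18599), (2, 101, 20000);
    INSERT INTO table2 VALUES (18599, 7), (18600, 8);
""")

# The ON clause compares table1.from_pin_id with table2.id:
# the names differ, the values line up.
rows = conn.execute("""
    SELECT t1.id, t1.owner_user_id, t1.from_pin_id, t2.board_id
    FROM table1 AS t1
    JOIN table2 AS t2 ON t1.from_pin_id = t2.id
""").fetchall()

print(rows)
```

Only the row whose from_pin_id has a counterpart in table2.id comes back; a LEFT JOIN would keep the unmatched table1 row with board_id set to NULL.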

Natural join on in SQL

TableA              TableB
Column1  Column2    Column3  Column4
1        2          1        3
I have two tables, TableA(Column1, Column2) and TableB(Column3, Column4). I want to join the two tables using column1, column4 (like a NATURAL JOIN). Is there anything in SQL that joins two tables and returns a new table with the repeated columns removed?
I want select this:
column1 column2 column4
1 2 3
DBMSes that support NATURAL JOIN require the column names of the join keys to match, and if you do SELECT * you will get only the unique column names. It doesn't make sense to try to specify column names, because the whole thing works by the names already being the same.
You MUST have same-named columns between the two tables, as it will use every same-named column between them to perform the join. Your tables TableA and TableB are unsuitable for a natural join as they don't share any column names.
So you are relegated to doing a regular join:
SELECT
A.*, -- you can at least get all the columns from one table
B.Column4 -- but you have to specify the rest one at a time
FROM
TableA A
INNER JOIN TableB B
ON A.Column1 = B.Column3
;
You just have to bite the bullet and write the query. You may want to not have to write the column names, but that's just not possible.
Some notes: When you say "return a new table", I think I know what you mean, but technically it is a rowset since to be a table it would have to be stored in the database with a name.
It may be possible to alias the column in a view or inline derived table, but you haven't told us what specific DBMS you're using so we can answer for its exact capabilities. It might look something like this:
SELECT
*
FROM
TableA A
NATURAL JOIN (
SELECT Column3 AS Column1, Column4
FROM TableB B
) B
;
But notice that you still have to list all the other columns in TableB in order to do this. And I'm not even sure it works.
Joining two tables and querying some or all of their columns returns a record set, not a new table. To get what you want, try the query below. It uses only standard SQL and so should work on any SQL-compliant database.
SELECT ta.column1, ta.column2, tb.column4 FROM TableA ta INNER JOIN TableB tb ON (ta.column1 = tb.column3)
If you want to use a natural join, the two tables need to share column names.
The DISTINCT keyword also removes duplicate rows from the result:
SELECT Distinct
TableA.Column1,
TableA.Column2,
TableB.Column4
FROM
TableA INNER JOIN TableB ON TableA.Column1 = TableB.Column3
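For what it is worth, the rename-then-NATURAL-JOIN idea can be tried out in SQLite through Python's sqlite3 module. This is only a sketch under the assumption that the intended match is Column1 against Column3 (the sample row suggests so); the derived table renames the column with AS, which is the portable spelling of a column alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Column1 INTEGER, Column2 INTEGER);
    CREATE TABLE TableB (Column3 INTEGER, Column4 INTEGER);
    INSERT INTO TableA VALUES (1, 2);
    INSERT INTO TableB VALUES (1, 3);
""")

# Rename Column3 to Column1 in a derived table so that the two inputs
# share exactly one column name; NATURAL JOIN then joins on it and
# SELECT * lists the shared column once.
cur = conn.execute("""
    SELECT *
    FROM TableA AS A
    NATURAL JOIN (SELECT Column3 AS Column1, Column4 FROM TableB) AS B
""")
names = [d[0] for d in cur.description]
rows = cur.fetchall()

# The explicit join gives the same row without relying on column names.
explicit = conn.execute("""
    SELECT A.Column1, A.Column2, B.Column4
    FROM TableA AS A
    INNER JOIN TableB AS B ON A.Column1 = B.Column3
""").fetchall()

print(names, rows, explicit)
```

The explicit INNER JOIN is usually the safer choice, because it keeps working if either table later gains another column with a shared name.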

Update a column of a table with a column of another table in PostgreSQL

I want to copy all the values from one column val1 of a table table1 to one column val2 of another table table2. I tried this command in PostgreSQL:
update table2
set val2 = (select val1 from table1)
But I got this error:
ERROR: more than one row returned by a subquery used as an expression
Is there an alternative to do that?
Your UPDATE query should look like this:
UPDATE table2 t2
SET val2 = t1.val1
FROM table1 t1
WHERE t2.table2_id = t1.table2_id
AND t2.val2 IS DISTINCT FROM t1.val1; -- optional, see below
The way you had it, there was no link between individual rows of the two tables. Every row would be fetched from table1 for every row in table2. This made no sense (in an expensive way) and also triggered the error, because a subquery expression in this place is only allowed to return a single value.
I fixed this by joining the two tables on table2_id. Replace that with your actual join condition.
I rewrote the UPDATE to join in table1 (with the FROM clause) instead of running correlated subqueries, because that is typically faster.
It also prevents table2.val2 from being set to NULL where no matching row is found in table1. Instead, nothing happens to such rows with this form of the query.
You can add table expressions to the FROM list like you would in a plain SELECT (tables, subqueries, set-returning functions, ...). The manual:
from_item
A table expression allowing columns from other tables to appear in the WHERE condition and update expressions. This uses the same syntax as the FROM clause of a SELECT statement; for example, an alias for the table name can be specified. Do not repeat the target table as a from_item unless you intend a self-join (in which case it must appear with an alias in the from_item).
The final WHERE clause prevents updates that wouldn't change anything - at almost full cost but no gain (exotic exceptions apply). If both old and new value are guaranteed to be NOT NULL, simplify to:
AND t2.val2 <> t1.val1
See:
How do I (or can I) SELECT DISTINCT on multiple columns?
UPDATE table1 SET table1_column = table2.column FROM table2 WHERE table1_id = table2.id
Do not use an alias for table1.
The tables here are table1 and table2.
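The row-by-row link between the two tables can also be illustrated outside PostgreSQL. The sketch below uses SQLite through Python's sqlite3 module and deliberately swaps in a correlated subquery instead of UPDATE ... FROM, because older SQLite versions lack that syntax. IS NOT is SQLite's null-safe counterpart to IS DISTINCT FROM; the join column table2_id and the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (table2_id INTEGER PRIMARY KEY, val1 TEXT);
    CREATE TABLE table2 (table2_id INTEGER PRIMARY KEY, val2 TEXT);
    INSERT INTO table1 VALUES (1, 'new1'), (2, 'same');
    INSERT INTO table2 VALUES (1, 'old1'), (2, 'same'), (3, 'keep');
""")

# The subquery is correlated on table2_id, so it returns at most one
# value per target row. The EXISTS guard skips rows that have no match
# in table1 (they would otherwise become NULL) and rows whose value
# would not change.
cur = conn.execute("""
    UPDATE table2
    SET val2 = (SELECT t1.val1
                FROM table1 AS t1
                WHERE t1.table2_id = table2.table2_id)
    WHERE EXISTS (SELECT 1
                  FROM table1 AS t1
                  WHERE t1.table2_id = table2.table2_id
                    AND t1.val1 IS NOT table2.val2)
""")
updated = cur.rowcount

rows = conn.execute(
    "SELECT table2_id, val2 FROM table2 ORDER BY table2_id"
).fetchall()
print(updated, rows)
```

Only the row with table2_id 1 is rewritten; row 2 already holds the same value and row 3 has no partner in table1, so both are left untouched.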