join tables in sql where data does not exactly match up - sql

I am using SQL Server 2008 and I have two tables that I want to join. I have provided something below that shows how my data looks. I want to join the two tables on the given columns, but how can I do this with the "ID" in front of the number in table B? I was thinking of a trim on the join, but I don't know how to do that.
Something like...
Select *
From TableA AS A
Left Join TableB AS B
On A.ColumnA = B.ColumnB
But this won't work because the numbers don't completely match up.
TableA ColumnA
123
456
789
TableB ColumnB
ID123
ID456
ID789
I hope I made this clear enough. Any suggestions?

SQL Fiddle Demo
select *
from tableA a
join tableB b
on 'ID' + cast(columnA as varchar(5)) = b.columnB

Related

duplicate query result when join table

I face issue about duplicate data when join table, here my sample data table I have
-- Table A
I want to join with
-- Table B
this my query notation for join both table,
select a.trans_id, name
from tableA a
inner join tableB b
on a.ID_Trans = b.trans_id
and this the result, why I get the duplicating data which should show only two lines of data, please help me to solve this case.
Firstly, as you have been told multiple times in the comments, this is working exactly as you have written, and (more importantly) as intended. You have 2 rows in tableA and those 2 rows match 2 rows in your table tableB according to the ON clause. This means that each join operation, for the each of the rows in tableA, results in 2 rows as well; thus 4 rows (2 * 2 = 4).
Considering that your table, TableA only has one column then it seems that you should be cleaning up that data and deleting the duplicates. There are plenty of examples on how to do that already (example).
Perhaps the column you show us in TableA is one many, and thus instead you have a denormalisation issue, and instead there should be another table with the details of Id_trans and a PRIMARY KEY or UNIQUE CONSTRAINT/INDEX on it. Then you would join fron that table to TableB.
Finally, what you might be after is an EXISTS, which would look like this:
SELECT B.trans_id, B.[name]
FROM dbo.TableB B
WHERE EXISTS(SELECT 1
FROM dbo.TableA A
WHERE A.ID_Trans = B.trans_id); --Odd that it's called ID_Trans in one table, and Trans_ID in another
As the comments mentioned your query does exactly what you asked it to do but I think you wanted something like:
select a.trans_id, a.name, b.name
from tableA a
inner join tableB b on a.trans_id = b.trans_id
group by a.trans_id, a.name, b.name
Since there are two rows in both table with same ID join will make them four. You can use distinct to remove duplicates:
select distinct a.trans_id, name
from tableA a
inner join tableB b
on a.id_trans = b.trans_id
But I would suggest to use exists:
select trans_id, name
from tableB b
exists (select 1 from tableA a where a.trans_id=b.trans_id)

INTERSECT two table of size 500ml rows in vertica

I am very new to vertica db and hence looking for different efficient ways for comparing two tables of average size 500ml-800ml rows in vertica. I have a process that gets the data from vertica view and dump in to SQL server for later merge to final table in sql server. for few large tables combine it is dumping about 3bl rows daily. Instead of dumping all data I want to take daily snapshot, and compare it with previous days snapshot on vertica side only and then push changed rows only in to SQL SEREVER.
lets say previous snapshot is stored in tableA, today's snapshot stored in tableB. PK on both table is column named OrderId.
Simplest way I can think of is
Select * from tableB
Where OrderId NOT IN (
SELECT * from tableA
INTERSECT
SELECT * from tbleB
)
So my questions are:
Is there any other/better option in vertica to get only changed rows between two tables? Or should I
even consider doing this compare on vertica side?
How much doing such comparison should take?
What should I consider to improve the performance of such query?
If your columns have no NULL values, then a massive LEFT JOIN would seem to do what you want:
select b.*
from tableB b left join
tableA a
on b.OrderId = a.OrderId and
b.col1 = a.col1 and
. . . -- for all the columns you care about
However, I think you want except:
select b.*
from tableB b
except
select a.*
from tableA a;
I imagine this would have reasonable performance.
Do you have a primary key in the two tables?
Then my technique, for a complete Change Data Capture, is:
SELECT
'I' AS to_do
, newrows.*
FROM tb_today newrows
LEFT
JOIN tb_yesterday oldrows USING(id)
WHERE oldrows.id IS NULL
UNION ALL
SELECT
'U' AS to_do
, newrows.*
FROM tb_today newrows
JOIN tb_yesterday oldrows
WHERE oldrows.fname <> newrows.fname
OR oldrows.lnamd <> newrows.lname
OR oldrows.bdate <> newrwos.bdate
OR oldrows.sal <> newrows.sal
[...]
OR oldrows.lastcol <> newrows.lastcol
UNION ALL
SELECT
'D' AS to_do
, oldrows.*
FROM tb_yesterday oldrows
LEFT
JOIN tb_today oldrows USING(id)
WHERE newrows.id IS NULL
;
Just leave out the last leg of the UNION SELECT if you don't want to cater for DELETEs ('D')
Good luck
you also do it nicely using joins:
SELECT b.*
FROM tableB AS b
LEFT JOIN tableA AS a ON a.id = b.id
WHERE a.id IS NULL
so above query return only diff from TableB to TableA i.e. data which is present in both table will be skipped...

SQL select from table where column contains specific strings

I have a table which contains strings which is needed to be searched [like below image]
TableA
After that, I have a table which stores data [like below image]
TableB
And what the search function does is to select from TableB only if the record contains the string in TableA [i.e. The expected result should be like below image(TableC)]
TableC
I've tried using the SQL below, but the SQL had some error while trying to run [Incorrect syntax near 'select'] , also, the SQL is a bit complicated, is there any way to make the SQL simplier?
select * from TableB a
where exists
(select colA from TableB b
where b.colA = a.ColA and Contains (b.ColA, select searchCol from TableA))
Please try:
select
a.*
From tblA a inner join tblB b
on b.Col like '%'+a.Col+'%'
SQL Fiddle Demo
One of these should work according to me
SELECT b.colA
FROM TableB b join TableA a
WHERE colA like '%' + a.ColA + '%';
SELECT b.colA
FROM TableB b join TableA a
WHERE INSTR(b.colA, '{' + a.colA + '}') > 0;
SELECT b.* FROM tblA a,tblB b
WHERE PATINDEX('%'+a.Col+'%',b.Col)>0
SQL FIDDLE DEMO
I have searched for performance difference between PATINDEX and LIKE but not got a conclusive answer , some say PATINDEX is faster when indexing can't be used which is when pattern to be searched is enclosed within wildcard character eg '%'+a.Col+'%' while other post mention they can similar performance .
Perhaps some SO SQL Wizard can look in crystal ball and show us the light

Join SQL query to get data from two tables

I'm a newbie, just learning SQL and have this question: I have two tables with the same columns. Some registers are in the two tables but others only are in one of the tables. To illustrate, suppose table A = (1,2,3,4), table B=(3,4,5,6), numbers are registers. I need to select all registers in table B if they are not in table A, that is result=(5,6). What query should I use? Maybe a join. Thanks.
You can either use a NOT IN query like this:
SELECT col from A where col not in (select col from B)
or use an outer join:
select A.col
from A LEFT OUTER JOIN B on A.col=B.col
where B.col is NULL
The first is easier to understand, but the second is easier to use with more tables in the query.
Select register from TABLE_B b
Where not exists (Select register from TABLE_A a where a.register = b.register)
I assumed you have a column named register in TABLE_A and TABLE_B

Sql Join Query using MS Access

Hello I Have a problem in getting rows from one table after comparing both. Detail of Both Table are as follows:-
I am using Ms Access database.
TableA is having a data of numeric type (Field Name is A it is primary key)
----------
Field A
==========
1
2
3
4
5
Table B is having data of numeric type ( Field Name is A it is foreign key)
--------
Field A
========
2
4
Now I am using below query which is this
select a.a
from a a
, b b
where a.a <> b.b
I want to show all the data from Table A which is not equal to Table B. But the above query is not working as I described.
Can you help me in this regard.
Regards,
Fawad Munir
In an attempt at clarity, I've used upper case for tables and lower case for fields:
Select A.a
FROM A LEFT OUTER JOIN B ON A.a=B.b
WHERE B.b is null
This will show all the records in A that are not in B (I assume that's what you want).
Read up on Access outer joins. In the query designer you double click the join and select something like "all records from table a and only the matching records in table b".
In your question you said that the name of the field in table B is 'A'. Given that, I'd say that your query should be something like
select a.a
from a, b
where a.a <> b.a
But I'm not sure this will do what you want. I think you're trying to find rows in table A which do not have a matching row in table B, in which case you might try
SELECT A.A
FROM A
LEFT OUTER JOIN B
ON (B.A = A.A)
WHERE B.A IS NULL
Try that and see if it does what you want.
Share and enjoy.
I don't know exactly if Access would accept the syntax, but here how I would do in SQL Server.
select a.a
from TableA a
where a.a NOT IN (
select b.a
from TableB b
)
or even as above-mentioned:
select a.a
from TableA a
left outer join TableB b on b.a = a.a
where b.a IS NULL
Its not entirely clear what you are trying to achieve, but its sounds like you are attempting to solve the common problem of finding rows in Table A missing associated data in Table B. If this is the case, it appears you misunderstand the semantics of the join you tried. In which case, you have 2 problems, because the understanding the the JOIN operation is critical to working with relational databases.
In relation to the first problem, please research how to express a subquery using the IN operator. Something like
... WHERE a NOT IN (SELECT a from b)
In relation to the second problem, try your query without the WHERE restriction, and see what is returned. Once you understand what the join is doing, you will see why applying a WHERE restriction to it will not solve your problem.
If I understand you correctly, you want to see every row in A for which column a contains a value that cannot be found in any column b value of B. You can get this data in several ways.
I think using NOT IN is the clearest, personally:
SELECT * FROM tableA WHERE columnA NOT IN
(SELECT columnB FROM tableB WHERE columnB IS NOT NULL)
Many people prefer a filtered JOIN:
SELECT tableA.* FROM tableA LEFT OUTER JOIN tableB
ON tableA.columnA = tableB.columnB WHERE tableB.columnB IS NULL
There is a NOT EXISTS variant as well:
SELECT * FROM tableA WHERE columnA NOT EXISTS
(SELECT * FROM tableB WHERE columnB = tableA.columnA)