Joining two tables in SQL in which one column has to be "cleaned" - sql

I need to join two tables in SQL that have two related columns (column ID1 in Table 1 and column ID2 in Table 2). ID1 in Table 1 consists of 6 digits, whereas ID2 in Table 2 consists of 6 digits plus an additional quotation mark (") at the beginning and end of the string. I need to remove these quotation marks and join the two tables to check whether any values occur in both columns.
I know how to remove first and last character of the string in table 2:
SELECT SUBSTRING ([ID2],2,Len([ID2])-2) FROM [dbo].[table2]
I need to join this new "trimmed" column with the other column from table 1.
Any suggestions?

Assuming you are using MS SQL Server and need everything from table1 plus the matching rows from table2:
Sample:
table1 | table2
[ID1] | [ID2]
547832 | "547832"
-----------------------------
select table1.* , table2.*
from
db.tb1 table1
left join
db.tb2 table2
on
table1.[ID1] = SUBSTRING(table2.[ID2],2,Len(table2.[ID2])-2) ;
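As a minimal, runnable sketch of the trimmed join, the snippet below uses Python's built-in sqlite3 module instead of SQL Server, so SUBSTRING and LEN become substr and length. The table contents are made up for illustration. It uses an inner join, which returns only the IDs present in both tables; a left join would additionally keep unmatched table1 rows, with NULLs in the table2 columns.

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server tables (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ID1 TEXT);
    CREATE TABLE table2 (ID2 TEXT);
    INSERT INTO table1 VALUES ('547832'), ('111111');
    INSERT INTO table2 VALUES ('"547832"'), ('"999999"');
""")

# Strip the first and last character of ID2 (the quotation marks)
# directly inside the join condition.
matches = conn.execute("""
    SELECT t1.ID1
    FROM table1 AS t1
    INNER JOIN table2 AS t2
        ON t1.ID1 = substr(t2.ID2, 2, length(t2.ID2) - 2)
""").fetchall()

print(matches)
```

If the goal is only to check which values occur in both columns, the inner join is probably enough; the left join is useful when the unmatched rows of table1 should be listed as well.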

A column alias defined with AS cannot be referenced in the ON clause of the same SELECT, so first extract the trimmed column under a new name in a derived table (subquery) and then join on it.
Try something like the below.
Syntax: SELECT Substring(columnname, position, length) AS Newcolumnname FROM Tablename;
Ex: SELECT c.Newstr, t2.*
FROM (SELECT Substring(customerName,1,5) AS Newstr FROM Customer) AS c
JOIN Table2 AS t2 ON c.Newstr = t2.name;

I am using MS SQL, yes.
Thanks for the reply. However, why is it a left join and not an inner join here? Just curious.
So, essentially what I need to do is:
In the first table, I have around 10 columns; in the second table I have 5 columns. They all have different names; ID was just used as an example. Two of the columns from table 2 appear to have similar values to two of the columns from table 1 (one is an ID of 6 digits, the other is names). I want to remove the first and last character of the ID column in table 2 and join that and the names column with ID and names from table 1. I hope that makes sense.

SQL Merge tables with two different columns

I have two tables that look like this:
table1:
table2:
and I'd like to merge them, to create a table with a header like this:
variablenname | mean | stddev | ms_form | dependent | fvalue | probf
The first nine rows should be the first table with empty values in the last 3 columns followed by the three rows from the second table with empty values in the first 4 columns. I'm sorry I can't post pictures or a third link, but my reputation is too low. I hope it's understandable.
I tried to create a new table and select everything from both with union but that didn't work. Can anybody help me?
Thanks!
You can use ALTER TABLE to add the last 3 columns to table 1, then INSERT the rows of table 2 into table 1. Something like this:
ALTER TABLE table1 ADD dependent varchar(16)
ALTER TABLE table1 ADD fvalue varchar(16)
ALTER TABLE table1 ADD probf varchar(16)
INSERT INTO table1(dependent, fvalue, probf)
SELECT dependent, fvalue, probf FROM table2
You can use a combination of joins and unions.
LEFT JOIN the two tables in the first query on fields that won't have a match,
then UNION that with a second query that RIGHT JOINs the same tables.
The LEFT JOIN will only show results from the first table, and the RIGHT JOIN will only show results from the second.
Something like:
SELECT * FROM table1 as t1
LEFT JOIN table2 as t2 on t1.variablenname = t2.fValue
UNION
SELECT * FROM table1 as t1
RIGHT JOIN table2 as t2 on t1.variablenname = t2.fValue;
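A UNION requires both SELECTs to return the same number of columns, which may be why the plain union attempt failed. As an alternative sketch (not one of the approaches above), each SELECT can pad its missing columns with NULL and the two can be stacked with UNION ALL. The example runs on Python's built-in sqlite3; the column types and sample rows are assumptions, since the original tables were not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column types and sample rows are assumed for illustration.
    CREATE TABLE table1 (variablenname TEXT, mean REAL, stddev REAL, ms_form TEXT);
    CREATE TABLE table2 (dependent TEXT, fvalue REAL, probf REAL);
    INSERT INTO table1 VALUES ('x1', 1.5, 0.25, 'linear');
    INSERT INTO table2 VALUES ('y', 3.5, 0.5);
""")

# Pad each side with NULLs so both SELECTs produce the same seven columns.
rows = conn.execute("""
    SELECT variablenname, mean, stddev, ms_form,
           NULL AS dependent, NULL AS fvalue, NULL AS probf
    FROM table1
    UNION ALL
    SELECT NULL, NULL, NULL, NULL, dependent, fvalue, probf
    FROM table2
""").fetchall()

for row in rows:
    print(row)
```

UNION ALL is used rather than UNION because no duplicate removal is wanted here. The row order of the combined result is not guaranteed without an ORDER BY.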
This is how I would like the final table to look:

SQL JOIN and COUNT COMMANDS

I have two different tables, which we will call table 1 and table 2. Table 1 has a column called featureid, a numerical value that corresponds to a numerical value in table 2 called termid. In table 2, each termid also has a plain text description.
What I am attempting to do is join featureid in table 1 to termid in table 2, with the output being a two-column display of the plain text description and its number of occurrences in table 1.
I know I need to use JOIN and COUNT in SQL, but I am not sure how to write the command correctly.
Thanks!
You actually only need one column in the GROUP BY. It's also good practice to specify which values are coming from which table, like so:
SELECT
table2.textDesc,
COUNT(table1.featureid) AS OccurrenceCount
FROM
table1
INNER JOIN
table2 ON
table1.featureid = table2.termid
GROUP BY table2.textDesc
I think this is what you are looking for:
SELECT textDesc, COUNT(featureid)
FROM table1, table2
WHERE featureid=termid
GROUP BY featureid, textDesc
Alternatively, you can use a different syntax (with the same end result) like so:
SELECT textDesc, COUNT(featureid)
FROM table1 INNER JOIN table2
ON featureid=termid
GROUP BY featureid, textDesc
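To see the join-and-count query produce the two-column output, here is a small sketch on Python's built-in sqlite3. The sample descriptions and feature rows are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table2 (termid INTEGER PRIMARY KEY, textDesc TEXT);
    CREATE TABLE table1 (featureid INTEGER);
    -- Sample data is hypothetical.
    INSERT INTO table2 VALUES (1, 'promoter'), (2, 'exon');
    INSERT INTO table1 VALUES (1), (1), (2), (1);
""")

# One output row per description, with the number of table1 rows that reference it.
counts = conn.execute("""
    SELECT t2.textDesc, COUNT(t1.featureid) AS OccurrenceCount
    FROM table1 AS t1
    INNER JOIN table2 AS t2 ON t1.featureid = t2.termid
    GROUP BY t2.textDesc
    ORDER BY t2.textDesc
""").fetchall()

print(counts)
```

Grouping by textDesc alone merges any termids that share the same description; adding the id to the GROUP BY keeps them separate.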

Oracle Compare data between two different table

I have two tables: in one, every field is VARCHAR2, while the other uses different types for different data.
For Example :
Table One
==========================
Col 1 VARCHAR2 UNIQUE KEY
Col 2 VARCHAR2
Col 3 VARCHAR2
===========================
Table Two
==========================
Col One VARCHAR2 UNIQUE KEY
Col Two TIMESTAMP
Col Three NUMBER
==========================
We have a mapping table that denotes which column of Table One is compared with which column of Table Two.
For Example
Mapping Table
==============================
Table One Table Two
==============================
Col 1 Col One
Col 2 Col Three
Col 3 Col Two
==============================
Now, using the UNIQUE KEY of Table One, we have to find the same row in Table Two, compare the rows column by column, and get the changes in data.
Currently we use a Java program that compares the data row by row and column by column and reports the changes between rows with the same UNIQUE KEY. It works fine but takes too much time, as we have 100000 records in the DB.
Now my question is: is there any way I can compare the data at the SQL level and get the changes in data?
You can do it 'manually' with a query like this: It's a lot of work, but there are only three different types of checks you need to do, so it's not very complex:
select
*
from
Table1 t1
full outer join Table2 t2 on t2.ID = t1.ID
where
-- Check ID: the record is missing from one of the tables.
t1.ID is null or
t2.ID is null or
-- Not nullable field can be easily compared.
t1.NotNullableField1 <> t2.NotNullableField1 or
-- Nullable field is slightly more work.
t1.NullableField1 <> t2.NullableField1 or
(t1.NullableField1 is null and t2.NullableField1 is not null) or
(t1.NullableField1 is not null and t2.NullableField1 is null)
Another solution is to use MINUS, which is a bit like UNION, only it returns a dataset minus the records in a second dataset:
select * from Table1 t1
MINUS
select * from Table2 t2
This works only one way (which might be fine for your purpose), but you can also combine it with UNION to make it bidirectional.
select
*
from
( select * from Table1
MINUS
select * from Table2)
UNION ALL
( select * from Table2
MINUS
select * from Table1)
The output of both solutions is a bit different.
In the FULL OUTER JOIN query, the IDs will be joined and the values of the matching rows will be displayed next to each other as a single row.
In the MINUS query, the result will be presented as a single dataset. If a record exists in only one of the tables, it will be displayed. If a record (ID) exists in both tables but other fields are different, you will get both rows, so it's a bit harder to compare them.
See: http://www.techonthenet.com/oracle/minus.php
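The bidirectional variant can be tried out with the sketch below, which runs on Python's built-in sqlite3. SQLite spells Oracle's MINUS as EXCEPT, and the added `source` column (an addition for illustration) shows which table each differing row came from. The sketch assumes both tables have the same column layout; in the original question the columns differ in type and order, so a real query would presumably need to list the mapped columns explicitly and convert them to a common type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified tables with identical layouts (an assumption).
    CREATE TABLE Table1 (ID TEXT PRIMARY KEY, Val TEXT);
    CREATE TABLE Table2 (ID TEXT PRIMARY KEY, Val TEXT);
    INSERT INTO Table1 VALUES ('a', '1'), ('b', '2'), ('c', '3');
    INSERT INTO Table2 VALUES ('a', '1'), ('b', '9'), ('d', '4');
""")

# Rows only in Table1, plus rows only in Table2, each tagged with its origin.
# ORDER BY 2, 1 sorts by ID and then by source.
diff = conn.execute("""
    SELECT 'Table1' AS source, * FROM (SELECT * FROM Table1 EXCEPT SELECT * FROM Table2)
    UNION ALL
    SELECT 'Table2' AS source, * FROM (SELECT * FROM Table2 EXCEPT SELECT * FROM Table1)
    ORDER BY 2, 1
""").fetchall()

for row in diff:
    print(row)
```

A changed record (here ID 'b') shows up twice, once per table, while records that exist on one side only ('c' and 'd') show up once.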

Matching records with wild cards from two different tables

I have two tables with the following data (amongst other data).
Table 1
Value 1
'003232339639
'00264644106272
0026461226291#
I need to match the second column in the table below using column 1 as an identifier
Table 2
Value 1 Value 2
00264 1
0026485 2
0026481 3
00322889 4
00323283 5
00323288 6
So the results I need will be as follows:
Result
Table 1, Value 1 Table 2, Value 2
'003232339639......4
'00264644106272....1
0026461226291#.....1
Any help will be appreciated - very stuck here and doing it manually at the moment in excel.
I hope this format makes sense - first time I am using this forum.
Melany, the question is somewhat confusing, which is perhaps why no one is responding. I'll attempt to explain how similar selects are done.
SELECTING DATA FROM TABLE1 WHERE A MATCHING COLUMN (COL1) EXISTS IN BOTH TABLE
SELECT * FROM TABLE1
INNER JOIN TABLE2
ON TABLE1.COL1 = TABLE2.COL1
AND TABLE1.COL1 = 'XYZ'
USING A SUBSELECT FOR THE SAME
SELECT * FROM TABLE1
WHERE COL1 IN(SELECT COL1 FROM TABLE2
WHERE COL1 = 'XYZ')
In SQL, the wildcard for zero or more characters is %, and it is used with the keyword LIKE.
So I suggest the following (if your purpose is really to match rows in Table1 for which Value1 begins with a value in Table2.Value1):
SELECT Table1.Value1, Table2.Value2 FROM Table1 JOIN Table2 ON Table1.Value1 LIKE CONCAT(Table2.Value1, '%');
Edit: replace CONCAT(x, y) with x || y for some DBMSs (SQLite for instance).
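The prefix match can be tried against the sample data with the sketch below, which runs on Python's built-in sqlite3 and therefore uses `||` for concatenation. Stripping the leading apostrophe with ltrim is an assumption about how the Table 1 values should be cleaned; without it, the values that start with an apostrophe would not match any prefix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Value1 TEXT)")
conn.execute("CREATE TABLE Table2 (Value1 TEXT, Value2 INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?)",
    [("'003232339639",), ("'00264644106272",), ("0026461226291#",)],
)
conn.executemany(
    "INSERT INTO Table2 VALUES (?, ?)",
    [("00264", 1), ("0026485", 2), ("0026481", 3),
     ("00322889", 4), ("00323283", 5), ("00323288", 6)],
)

# Match each Table1 value against every Table2 prefix.
# ltrim(..., '''') removes a leading apostrophe before comparing.
rows = conn.execute("""
    SELECT t1.Value1, t2.Value2
    FROM Table1 AS t1
    JOIN Table2 AS t2
        ON ltrim(t1.Value1, '''') LIKE t2.Value1 || '%'
    ORDER BY t1.Value1
""").fetchall()

for row in rows:
    print(row)
```

With this data, a pure prefix match finds a partner for the two values starting with 00264 but none for '003232339639, so the expected result of 4 for that row would need a different matching rule. If several prefixes could match the same value, the query returns one row per match.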

How to compare string data to table data in SQL Server - I need to know if a value in a string doesn't exist in a column

I have two tables, one an import table, the other a FK constraint on the table the import table will eventually be put into. In the import table a user can provide a list of semicolon separated values that correspond to values in the 2nd table.
So we're looking at something like this:
TABLE 1
ID | Column1
1 | A; B; C; D
TABLE 2
ID | Column2
1 | A
2 | B
3 | D
4 | E
The requirement is:
Rows in TABLE 1 with a value not in TABLE 2 (C in our example) should be marked as invalid for manual cleanup by the user. Rows where all values are valid are handled by another script that already works.
In production we'll be dealing with 6 columns that need to be checked and imports of AT LEAST 100k rows at a time. As a result I'd like to do all the work in the DB, not in another app.
BTW, it's SQL2008.
I'm stuck, anyone have any ideas. Thanks!
Seems to me you could pass ID & Column1 values from Table1 to a Table-Valued function (or a temp table in-line) which would parse the ;-delimited list, returning individual values per record.
Here are a couple options:
T-SQL: Parse a delimited string
Quick T-Sql to parse a delimited string
The result (ID, value) from the function could be used to compare (unmatched query) against values in Table 2.
SELECT DISTINCT tmp.ID
FROM tmp
LEFT JOIN Table2 ON Table2.Column2 = tmp.value
WHERE Table2.Column2 IS NULL
The ID results of the comparison would then be used to flag records in Table 1.
Perhaps inserting those composite values into 'TABLE 1' may have seemed like the most convenient solution at one time. However, unless your users are using SQL Server Management Studio or something similar to enter the values directly into the table, I assume there must be a software layer between the UI and the database. If so, you're going to save yourself a lot of headaches, both now and in the long run, by investing a little time in altering your code to split the semicolon-delimited inputs into discrete values before inserting them into the database. This will result in 'TABLE 1' looking something like this:
TABLE 1
ID | Column1
1 | A
1 | B
1 | C
1 | D
It's then trivial to write the SQL to find those IDs which are invalid.
If it is possible, try putting the values in separate rows when importing (instead of storing it as ; separated).
This might help.
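Here is an easy and straightforward solution that returns the IDs of the invalid rows, although it does not perform well because of the string manipulation. It counts only matched values (count(T2.COLUMN2)), so a row with no valid value at all is also flagged, and it groups by COLUMN1 as well because the HAVING clause references it.
select T1.ID
from [TABLE 1] T1
left join [TABLE 2] T2
on ('; ' + T1.COLUMN1 + '; ') like ('%; ' + T2.COLUMN2 + '; %')
where T1.COLUMN1 is not null
group by T1.ID, T1.COLUMN1
having count(T2.COLUMN2) < len(T1.COLUMN1) - len(replace(T1.COLUMN1, ';', '')) + 1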
There are two assumptions:
The semicolon-separated list does not contain duplicates
TABLE 2 does not contain duplicates in COLUMN2.
The second assumption can easily be fixed by using (select distinct COLUMN2 from [TABLE 2]) rather than [TABLE 2].
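The LIKE-based check can be exercised with the sketch below, which runs on Python's built-in sqlite3 instead of SQL Server 2008, so `+` becomes `||` and len becomes length. Rows 2 and 3 are extra test rows added for illustration. The sketch counts matched values with COUNT(t2.Column2) so that a row with no valid value at all is still flagged, and it relies on the same two no-duplicates assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ID INTEGER, Column1 TEXT);
    CREATE TABLE table2 (ID INTEGER, Column2 TEXT);
    -- Row 1 is the example from the question; rows 2 and 3 are hypothetical.
    INSERT INTO table1 VALUES (1, 'A; B; C; D'), (2, 'A; E'), (3, 'C');
    INSERT INTO table2 VALUES (1, 'A'), (2, 'B'), (3, 'D'), (4, 'E');
""")

# A row is invalid when fewer values matched than the list contains
# (number of semicolons + 1).
invalid_ids = [row[0] for row in conn.execute("""
    SELECT t1.ID
    FROM table1 AS t1
    LEFT JOIN table2 AS t2
        ON ('; ' || t1.Column1 || '; ') LIKE ('%; ' || t2.Column2 || '; %')
    WHERE t1.Column1 IS NOT NULL
    GROUP BY t1.ID, t1.Column1
    HAVING COUNT(t2.Column2)
           < length(t1.Column1) - length(replace(t1.Column1, ';', '')) + 1
    ORDER BY t1.ID
""")]

print(invalid_ids)
```

Row 1 is flagged because C is missing from table2, row 3 because none of its values match, and row 2 passes because both A and E exist.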