merge two temp tables and add common columns as new row and add unmatch column using sql - sql

I am using MS SQL Server and have two tables, table 1 and table 2:
table 1
name    value
test    10
table 2
name    data
test1   20
expected result
name    value   data
test    10      null
test1   null    20
I want to merge table 1 and table 2 so that the output looks like the result table. Can anybody help me here?

You can combine these using a full join:
select coalesce(t1.name, t2.name) as name, t1.value, t2.data
from t1 full join
t2
on t1.name = t2.name;
If you want to use * to select all columns in the tables, SQL Server does not offer a simple way to return the shared column only once (without listing all columns). SQL Server doesn't support USING.
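As a minimal sketch of the full join on the question's sample rows: the temp table names #t1 and #t2 and the column types are assumptions, since the question does not give the real definitions.

```sql
-- Hypothetical temp tables mirroring "table 1" and "table 2" from the question
CREATE TABLE #t1 (name varchar(50) NOT NULL, value int NULL);
CREATE TABLE #t2 (name varchar(50) NOT NULL, data int NULL);

INSERT INTO #t1 (name, value) VALUES ('test', 10);
INSERT INTO #t2 (name, data) VALUES ('test1', 20);

-- FULL JOIN keeps unmatched rows from both sides;
-- COALESCE picks whichever side actually has the name
SELECT COALESCE(t1.name, t2.name) AS name, t1.value, t2.data
FROM #t1 AS t1
FULL JOIN #t2 AS t2
    ON t1.name = t2.name
ORDER BY name;
```

With these two rows the names do not match, so each row comes back on its own with NULL in the column that belongs to the other table, which is the shape of the expected result.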

Related

How do I compare whether same/intersection data is there in one and another row in same table when I have huge data in the table in sql server?

I have a table with the sample data below. I want to compare each record with all other records in the same table and return the ID of every record it collides with. The column holds comma-separated data, so if one record has 'A,C' as Name and another has 'A' (see the input below), they collide with each other because 'A' is common to both.
In the same way, one of the records has nothing in Name; it is NULL. A NULL Name should collide with all the other records. Besides this Name column, I have around 10 columns to verify in the same way.
Input
ID   Name
1    A,C
2    B
3    A
4    NULL
OUTPUT
ID   ColloidID
1    3
1    4
2    4
3    1
3    4
4    1
4    2
4    3
Problem: I have implemented the solution below, and it works as expected. It is fine when the table holds little data (<100k rows), but it takes too much time and space when dealing with millions of rows (e.g. >20M rows).
SELECT DISTINCT A.ID,B.ID AS ColloidID
FROM #Temp1 A
CROSS APPLY #Temp1 B
WHERE A.ID<>B.ID
AND master.dbo.fIntersection(COALESCE(A.Name,B.Name,''),COALESCE(B.Name,A.Name,'')) = 1
Ideally you should not store multiple pieces of info in a single column.
Be that as it may, you can use a nested EXISTS with STRING_SPLIT to compare the two columns.
SELECT t1.ID, t2.ID
FROM #Temp1 t1
JOIN #Temp1 t2 ON t2.ID <> t1.ID
AND (t1.Name IS NULL OR t2.Name IS NULL
OR EXISTS (SELECT 1
FROM STRING_SPLIT(t1.Name, ',') s1
JOIN STRING_SPLIT(t2.Name, ',') s2 ON s2.value = s1.value
)
)
ORDER BY
t1.ID,
t2.ID;
20M isn't a lot of data, provided a good database design is used, with proper indexes. This is definitely not a good design. It violates the most basic design rule - one value per field. As a result, it's impossible to index Name, forcing 4*10^14 comparisons.
The only way to get acceptable performance is to fix the design. To do that Name has to be split into separate rows. The data needs to be stored in a table whose Name column is covered by an index or primary key:
create table #Id_Names (
ID bigint not null,
Name varchar(30) null,
INDEX IX_Id_Names (Name,ID)
);
GO
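INSERT INTO #Id_Names (Id,Name)
select ID,value
from #Temp1 t
-- OUTER APPLY keeps rows whose Name is NULL: STRING_SPLIT returns no rows
-- for NULL input, so CROSS APPLY would silently drop them
OUTER APPLY STRING_SPLIT(Name,',');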
After that, the query is simplified to :
SELECT
t1.ID,t2.ID as ColloidID
FROM #Id_Names t1
INNER JOIN #Id_Names t2
ON t1.ID<>t2.ID
AND (t1.Name=t2.Name
OR t1.Name IS NULL
OR t2.Name IS NULL)
This can run a lot faster. The only real problem is the logic of treating NULL as a wildcard: a row with a NULL Name matches every other row in the table. Since the table joins itself, each NULL row in a table of N rows adds roughly 2*(N-1) result rows, once as t1 and once as t2. The same relations will also be repeated twice, e.g. (1,4) and (4,1).
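If each colliding pair is only needed once, a sketch along these lines might help. It assumes the mirrored pair can be derived later, and it uses a hypothetical demo table #Id_Names_Demo loaded with the question's sample data already split into rows.

```sql
-- Hypothetical demo table: the question's sample data, one Name per row
CREATE TABLE #Id_Names_Demo (
    ID bigint NOT NULL,
    Name varchar(30) NULL,
    INDEX IX_Id_Names_Demo (Name, ID)
);

INSERT INTO #Id_Names_Demo (ID, Name)
VALUES (1, 'A'), (1, 'C'), (2, 'B'), (3, 'A'), (4, NULL);

-- t1.ID < t2.ID returns each pair once instead of both (1,4) and (4,1);
-- DISTINCT removes repeats when two IDs share more than one Name
SELECT DISTINCT t1.ID AS ID, t2.ID AS ColloidID
FROM #Id_Names_Demo AS t1
JOIN #Id_Names_Demo AS t2
    ON t1.ID < t2.ID
   AND (t1.Name = t2.Name
        OR t1.Name IS NULL
        OR t2.Name IS NULL)
ORDER BY ID, ColloidID;
```

On the sample data this halves the output to the pairs (1,3), (1,4), (2,4) and (3,4). The NULL wildcard still pairs ID 4 with every other ID, so the cost of that rule does not go away.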
If #Temp1 was a proper table, an alternative would be to create an indexed view. Creating an index over a VIEW essentially generates, stores and updates its results automatically.
Another option is to create a Clustered Columnstore index. This provides both compression and acceleration. The data is stored per column in buckets of roughly 1M rows. In each bucket, each column value is only stored once.
create table #Id_Names (
ID bigint not null,
Name varchar(30) null,
INDEX CCI_Id_Names CLUSTERED COLUMNSTORE
);

Joining two tables in SQL in which one column has to be "cleaned"

I need to join two tables in SQL that have two related columns (column ID1 in table 1 and column ID2 in table 2). ID1 in table 1 consists of 6 digits, whereas ID2 in table 2 consists of 6 digits with an additional quotation mark (") at the beginning and end of the string. I need to remove these quotation marks and join the two tables to check whether any values occur in both columns.
I know how to remove first and last character of the string in table 2:
SELECT SUBSTRING ([ID2],2,Len([ID2])-2) FROM [dbo].[table2]
I need to join this new "trimmed" column with the other column from table 1.
Any suggestions?
Assuming you are using ms sql server db, and need everything from table1 and matched from table2 then:
sample:
table1 | table2
[ID1] | [ID2]
547832 | "547832"
-----------------------------
select table1.* , table2.*
from
db.tb1 table1
left join
db.tb2 table2
on
table1.[ID1] = SUBSTRING(table2.[ID2],2,Len(table2.[ID2])-2) ;
First extract the trimmed column under a new name using AS, then join on it. A column alias cannot be referenced in the ON clause of the same SELECT, so wrap the extraction in a derived table.
Try something like the below.
syntax: SELECT SUBSTRING(columnname, position, length) AS Newcolumnname FROM Tablename;
EX: SELECT c.Newstr, t2.*
FROM (SELECT SUBSTRING(customerName,1,5) AS Newstr FROM Customer) AS c
JOIN Table2 AS t2 ON c.Newstr = t2.name;
I am using MS SQL, yes.
Thanks for the reply. However, why is it a left join and not an inner join here? Just curious.
So, essentially what I need to do is:
In the first table I have around 10 columns; in the second table I have 5 columns. They all have different names; ID was just used as an example. Two of the columns in table 2 appear to hold values similar to two of the columns in table 1 (one is a 6-digit ID, the other is names). I want to remove the first and last character of the ID column in table 2 and join that column and the names column with ID and names from table 1. I hope that makes sense.
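On the left join versus inner join question above, a small sketch may make the difference concrete. The temp tables #table1 and #table2, their column types and the extra sample IDs are assumptions made up for illustration.

```sql
-- Hypothetical sample data: one ID present in both tables, one only in each
CREATE TABLE #table1 (ID1 varchar(6) NOT NULL);
CREATE TABLE #table2 (ID2 varchar(8) NOT NULL);

INSERT INTO #table1 (ID1) VALUES ('547832'), ('111111');
INSERT INTO #table2 (ID2) VALUES ('"547832"'), ('"999999"');

-- LEFT JOIN: every row of #table1, with NULL in ID2 where nothing matches
SELECT t1.ID1, t2.ID2
FROM #table1 AS t1
LEFT JOIN #table2 AS t2
    ON t1.ID1 = SUBSTRING(t2.ID2, 2, LEN(t2.ID2) - 2);

-- INNER JOIN: only the IDs that occur in both tables
SELECT t1.ID1, t2.ID2
FROM #table1 AS t1
INNER JOIN #table2 AS t2
    ON t1.ID1 = SUBSTRING(t2.ID2, 2, LEN(t2.ID2) - 2);
```

If the goal is only to see which values recur in both columns, the inner join is probably the better fit; the left join is useful when unmatched rows from table 1 should stay visible.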

SQL Combining two different tables

(P.S. I am still learning SQL and you can consider me a newbie)
I have 2 sample tables as follows:
Table 1
|Profile_ID| |Img_Path|
Table 2
|Profile_ID| |UName| |Default_Title|
My scenario is this: from the 2nd table, I need to fetch all the records that contain a certain word, for which I have the following query:
Select Profile_Id,UName from
Table2 Where
Contains(Default_Title, 'Test')
ORDER BY Profile_Id
OFFSET 5 ROWS
FETCH NEXT 20 ROWS ONLY
(Note that I am setting the OFFSET due to requirements.)
Now, as soon as I retrieve a record from the 2nd table, I need to fetch the matching record from the 1st table based on the Profile_Id.
So I need to return the following 2 results in one single statement:
|Profile_Id| |Img_Path|
|Profile_Id| |UName|
And i need to return the results in side-by-side columns, like :
|Profile_Id| |Img_Path| |UName|
(Note that I had to merge the 2 Profile_Id columns into one, as they both contain the same data.)
I am still learning SQL, including Union, Join etc., but I am a bit confused as to which way to go.
You can use join:
select t1.*, t2.UName
from table1 t1 join
(select Profile_Id, UName
from Table2
where Contains(Default_Title, 'Test')
order by Profile_Id
offset 5 rows fetch next 20 rows only
) t2
on t2.profile_id = t1.profile_id
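Alternatively, if an exact match on Default_Title is enough and no paging is needed, a plain inner join works:
SELECT a.Profile_Id, a.Img_Path, b.UName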
FROM table1 a INNER JOIN table2 b ON a.Profile_Id=b.Profile_Id
WHERE b.Default_Title = 'Test'

Oracle Compare data between two different table

I have two tables: in one, every field is VARCHAR2, while the other uses different types for different data.
For Example :
Table One
==========================
Col 1 VARCHAR2 UNIQUE KEY
Col 2 VARCHAR2
Col 3 VARCHAR2
===========================
Table Two
==========================
Col One VARCHAR2 UNIQUE KEY
Col Two TIMESTAMP
Col Three NUMBER
==========================
We have a mapping table. It denotes which column of Table One has to be compared with which column of Table Two.
For Example
Mapping Table
==============================
Table One Table Two
==============================
Col 1 Col One
Col 2 Col Three
Col 3 Col Two
==============================
Now, using the UNIQUE KEY of TABLE ONE, we have to find the same row in TABLE TWO, compare the rows column by column, and get the changes in the data.
Currently we use a Java program that compares the data row by row and column by column and reports the changes between rows with the same UNIQUE KEY. It works fine but takes too much time, as we have 100000 records in the DB.
Now my question is: is there any way I can compare the data at the SQL level and get the changes?
You can do it 'manually' with a query like this. It's a lot of work, but there are only three different types of checks you need to do, so it's not very complex:
select
*
from
Table1 t1
full outer join Table2 t2 on t2.ID = t1.ID
where
-- Check ID: the record is missing from one of the tables.
t1.ID is null or
t2.ID is null or
-- Not nullable field can be easily compared.
t1.NotNullableField1 <> t2.NotNullableField1 or
-- Nullable field is slightly more work.
t1.NullableField1 <> t2.NullableField1 or
(t1.NullableField1 is null and t2.NullableField1 is not null) or
(t1.NullableField1 is not null and t2.NullableField1 is null)
Another solution is to use MINUS, which is a bit like UNION, only it returns a dataset minus the records in a second dataset:
select * from Table1 t1
MINUS
select * from Table2 t2
This works only one way (which might be fine for your purpose), but you can also combine it with UNION to make it bidirectional.
select
*
from
( select * from Table1
MINUS
select * from Table2)
UNION ALL
( select * from Table2
MINUS
select * from Table1)
The output of both solutions is a bit different.
In the FULL OUTER JOIN query, the IDs will be joined and the values of the matching rows will be displayed next to each other as a single row.
In the MINUS query, the result will be presented as a single dataset. If a record exists in only one of the two tables, it will be displayed. If a record (ID) exists in both tables, but other fields are different, you will get both rows. So it's a bit harder to compare them.
See: http://www.techonthenet.com/oracle/minus.php
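For the question's mixed-type tables, a rough sketch of a column-by-column comparison could look like the following. The names table_one, table_two and their columns are placeholders for the real tables, the sample rows are invented, and the timestamp format mask is an assumption about how the VARCHAR2 side stores its values. DECODE is used because it treats two NULLs as equal, which avoids the three-part NULL check.

```sql
-- Hypothetical inline sample data; replace the two WITH blocks by the real tables
WITH table_one AS (
    SELECT 'K1' AS col_1, '10' AS col_2, '2020-01-01 00:00:00' AS col_3 FROM dual
    UNION ALL
    SELECT 'K2', '5', NULL FROM dual
),
table_two AS (
    SELECT 'K1' AS col_one, TIMESTAMP '2020-01-01 00:00:00' AS col_two, 10 AS col_three FROM dual
    UNION ALL
    SELECT 'K2', CAST(NULL AS TIMESTAMP), 7 FROM dual
)
SELECT t1.col_1,
       t1.col_2, TO_CHAR(t2.col_three) AS col_three_text,
       t1.col_3, TO_CHAR(t2.col_two, 'YYYY-MM-DD HH24:MI:SS') AS col_two_text
FROM table_one t1
JOIN table_two t2
    ON t2.col_one = t1.col_1
-- Mapping from the question: col_2 <-> col_three, col_3 <-> col_two.
-- DECODE(a, b, 0, 1) yields 1 only when a and b differ (NULL = NULL counts as equal).
WHERE DECODE(t1.col_2, TO_CHAR(t2.col_three), 0, 1) = 1
   OR DECODE(t1.col_3, TO_CHAR(t2.col_two, 'YYYY-MM-DD HH24:MI:SS'), 0, 1) = 1;
```

With this sample data only the K2 row should be reported, because its col_2 value '5' differs from 7 while both timestamp values are NULL. Comparing as text is a design choice here: it avoids conversion errors from malformed VARCHAR2 values, at the cost of depending on a consistent text format.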

MS SQL - Joining on two tables with a substringed key in one column

I have a 2 tables I need to join, however on one of the tables I need to extract a key from a varchar field in each row.
Table 1 Description (numeric 18,varchar 4000)
descriptionid description
1 Blah Blah: Queue 1Blah Blah
2 foobar:Queue 2
3 rem:Queue 2 -This is a note
4 Anotherrow: Queue 3
5 Something else
Table 2 Queue - (numeric 18, varchar 100)
queueid queue
123 Queue 1
124 Queue 2
127 Queue 3
129 Queue 4
So I need to produce the output like so
View 3 Queue-Description (numeric 18, numeric 18)
descriptionid queueid
1 123
2 124
3 124
4 127
5 null
So in table 1 row 1, I need to strip out the value Queue 1 from the description, verify it is in the queue table, and look up the queueid.
I am unable to change the structure of tables 1 and 2.
What ways can this be achieved in MSSQL?
What is the most efficient way to do this in SQL - using MSSQL 2005 here.
"most efficient way"
Well... don't know about that but it is a way.
select T1.descriptionid,
T2.queueid
from Table1 as T1
left outer join Table2 as T2
on T1.description like '%'+T2.queue+'%'
Another way
select T1.descriptionid,
T2.queueid
from Table1 as T1
left outer join Table2 as T2
on charindex(T2.queue, T1.description, 1) > 0
If there are more than one match (see comment by Ed Harper) you can use this to pick the one with the longest match.
select T1.descriptionid,
T2.queueid
from Table1 as T1
outer apply (
select top 1 T3.queueid
from Table2 as T3
where charindex(T3.queue, T1.description, 1) > 0
order by len(T3.queue) desc
) as T2(queueid)
The most efficient way to do this is to add an extra column to your table and insert the ID extracted from the string. You can do this when rows are added, and you can process the existing ones fairly easily. But trying to left join like this will be very slow.
In Sql Server 2005 you can extract your queue string using regex. The Data Extraction section on this page contains an example.
In a stored procedure you can then build an indexed temp table that contains a new column (this allows you to do this without changing the table metadata).
If you can change the table metadata you can:
Trigger the content into another column (on insert).
Or if the information is not needed immediately a daily sql job could extract the information.
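The indexed temp table idea above could be sketched as follows. The names #Description, #Queue and #DescriptionQueue are hypothetical stand-ins for the real tables, and the extraction reuses the CHARINDEX lookup from the earlier answer rather than regex. The inserts use SELECT ... UNION ALL so the script should also run on SQL Server 2005.

```sql
-- Hypothetical copies of the question's two tables
CREATE TABLE #Description (descriptionid numeric(18,0) NOT NULL PRIMARY KEY,
                           description varchar(4000) NULL);
CREATE TABLE #Queue (queueid numeric(18,0) NOT NULL PRIMARY KEY,
                     queue varchar(100) NOT NULL);

INSERT INTO #Description (descriptionid, description)
SELECT 1, 'Blah Blah: Queue 1Blah Blah' UNION ALL
SELECT 2, 'foobar:Queue 2' UNION ALL
SELECT 3, 'rem:Queue 2 -This is a note' UNION ALL
SELECT 4, 'Anotherrow: Queue 3' UNION ALL
SELECT 5, 'Something else';

INSERT INTO #Queue (queueid, queue)
SELECT 123, 'Queue 1' UNION ALL
SELECT 124, 'Queue 2' UNION ALL
SELECT 127, 'Queue 3' UNION ALL
SELECT 129, 'Queue 4';

-- Pay the string search once and store the extracted key
CREATE TABLE #DescriptionQueue (descriptionid numeric(18,0) NOT NULL PRIMARY KEY,
                                queueid numeric(18,0) NULL);

INSERT INTO #DescriptionQueue (descriptionid, queueid)
SELECT d.descriptionid, q.queueid
FROM #Description AS d
OUTER APPLY (
    SELECT TOP 1 q2.queueid
    FROM #Queue AS q2
    WHERE CHARINDEX(q2.queue, d.description) > 0
    ORDER BY LEN(q2.queue) DESC   -- prefer the longest matching queue name
) AS q;

-- Later joins on queueid can use this index instead of scanning strings
CREATE INDEX IX_DescriptionQueue_queueid ON #DescriptionQueue (queueid);

SELECT descriptionid, queueid
FROM #DescriptionQueue
ORDER BY descriptionid;
```

On the sample rows this should reproduce the mapping requested in the question, with descriptionid 5 left as NULL because no queue name appears in its description.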