Comparing data in the same table for non existent rows - sql

I have a table that has both Company name and Contacts for their respective companies. Type column has a 0 or 1 indicating whether it is a company or person. Each row has a column with a unique contact no. The 'person' row has a column called "Company no." that links the person to the company. I'm trying to return rows that show a company without any contacts in the same table. Not sure how to even start writing this query.

Try it like this:
DECLARE @tbl TABLE(ContactNo INT, Name VARCHAR(100), [Type] INT, CompanyNo INT);
INSERT INTO @tbl VALUES
(100,'ACME, Inc.',0,100)
,(200,'Bob Smith',1,100)
,(300,'John Doe',1,100)
,(400,'Widget World',0,400)
,(500,'Fishing, Inc.',0,500)
,(600,'Jane Doe',1,500);
-- Companies are the rows with Type = 0
WITH TheCompanies AS
(
SELECT *
FROM @tbl AS tbl
WHERE tbl.[Type]=0
)
-- Keep only the companies that no person row (Type = 1) points to
SELECT *
FROM TheCompanies
WHERE NOT EXISTS(SELECT 1 FROM @tbl WHERE [Type]=1 AND CompanyNo=TheCompanies.CompanyNo);
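The NOT EXISTS query above can also be written as a self join. The following is only a sketch of that alternative, using a shortened copy of the sample data; it assumes ContactNo is never NULL (the question says every row has a unique contact no.), so a NULL on the person side means no person matched. The aliases c and p are just illustrative names.

```sql
DECLARE @tbl TABLE (ContactNo INT, Name VARCHAR(100), [Type] INT, CompanyNo INT);
INSERT INTO @tbl VALUES
 (100,'ACME, Inc.',0,100)
,(200,'Bob Smith',1,100)
,(400,'Widget World',0,400);

-- c = company rows (Type = 0), p = person rows (Type = 1) linked by CompanyNo
SELECT c.*
FROM @tbl AS c
LEFT JOIN @tbl AS p
       ON p.[Type] = 1
      AND p.CompanyNo = c.CompanyNo
WHERE c.[Type] = 0
  AND p.ContactNo IS NULL;   -- no person row was found for this company
```

With this sample data only the 'Widget World' row comes back. Both forms express the same anti-join, so which one to use is mostly a matter of readability.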

Related

How to join on columns that contain strings that aren't exact matches in SQL Server?

I am trying to create a simple table join on columns from two tables that are equivalent but not exact matches. For example, the row value in table A might be "Georgia Production" and the corresponding row value in table B might be "Georgia Independent Production Co".
I first tried a wild card in the join like this:
select BOLFlatFile.*, customers.City, customers.FEIN_Registration_No, customers.ST
from BOLFlatFile
Left Join Customers on (customers.Name Like '%'+BOLFlatFile.Customer+'%');
and this works great for 90% of the data. However, if the string in table A does not appear exactly in table B, it returns null.
So back to the above example, if the value for table A were "Georgia Independent", it would work, but if it were "Georgia Production", it would not.
This might be a complicated way of still being wrong, but this works with the sample I've mocked up.
This approach relies on two assumptions. The first is that, because you are wildcard-searching a string from one table in another, every word in the first table's column also appears in the second table's column, so the second table's string is always longer than the first table's.
The second assumption is that the first table has a unique id. If it does not, you can create one with the row_number function, ordering on your string column.
The code below first creates some sample data (tablea and tableb represent your tables).
Then a dummy table is created to hold the unique id and the string column from your first table.
Next, a loop iterates over the strings in the dummy table and, on each pass, inserts the unique id and the leading word of the string (everything up to the first space) into a handler table, which is what you will use to join the two target tables.
The final query joins the first table to the handler table on the unique id, then joins the second table to the handler table on the key words longer than 3 letters (avoiding "the", "and", etc.). It also requires the string in table b to be longer than the one in table a, following the first assumption.
-- Sample data standing in for the two real tables
declare @tablea table (
id int identity(1,1),
helptext nvarchar(50)
);
declare @tableb table (
id int identity(1,1),
helptext nvarchar(50)
);
insert @tablea (helptext)
values
('Text to find'),
('Georgia Production'),
('More to find');
insert @tableb (helptext)
values
('Georgia Independent Production'),
('More Text to Find'),
('something Completely different'),
('Text to find');
-- Working copy of the strings that the loop consumes word by word
declare @stringtable table (
id int,
string nvarchar(50)
);
-- Handler table: one row per (id, word)
declare @stringmatch table (
id int,
stringmatch nvarchar(20)
);
insert @stringtable (id, string)
select id, helptext from @tablea;
-- A trailing space guarantees that charindex finds an end for the last word
update @stringtable set string = string + ' ';
while exists (select 1 from @stringtable)
begin
-- Take the first word of every remaining string
insert @stringmatch (id, stringmatch)
select id, substring(string,1,charindex(' ',string)) from @stringtable;
update @stringmatch set stringmatch = ltrim(rtrim(stringmatch));
-- Remove the words already captured from the working strings
update tb set tb.string=replace(tb.string, ma.stringmatch, '') from @stringtable tb inner join @stringmatch ma
on tb.id=ma.id and charindex(ma.stringmatch,tb.string)>0;
update @stringtable set string=LTRIM(string);
delete from @stringtable where string='' or string is null;
end
-- Join on words longer than 3 letters, as described above, to skip words such as 'to'
select a.*, b.* from @tablea a inner join @stringmatch m on a.id=m.id and len(m.stringmatch)>3
inner join @tableb b on CHARINDEX(m.stringmatch,b.helptext)>0 and len(b.helptext)>len(a.helptext);
It all depends on how complex you want to make this matching. There are various ways of matching these strings, and some may work better than others. Below is an example of how you can split the names in your BOLFlatFile and Customers tables into separate words using string_split (available from SQL Server 2016).
The example matches any row where all the words in the BOLFlatFile customer field are contained in the Customers name field (note: it does not take the order of the words into account).
With the sample data, it matches 'Georgia Production' and 'Test 1' as expected, but finds no match for the last pair of sample strings ('Test String 2' and 'Test 3').
CREATE TABLE BOLFlatFile
(
[customer] NVARCHAR(500)
)
CREATE TABLE Customers
(
[name] NVARCHAR(500)
)
INSERT INTO Customers VALUES ('Georgia Independent Production Co')
INSERT INTO BOLFlatFile VALUES ('Georgia Production')
INSERT INTO Customers VALUES ('Test String 1')
INSERT INTO BOLFlatFile VALUES ('Test 1')
INSERT INTO Customers VALUES ('Test String 2')
INSERT INTO BOLFlatFile VALUES ('Test 3')
;with BOLFlatFileSplit
as
(
SELECT *,
COUNT(*) OVER(PARTITION BY [customer]) as [WordsInName]
FROM
BOLFlatFile
CROSS APPLY
STRING_SPLIT([customer], ' ')
),
CustomerSplit as
(
SELECT *
FROM
Customers
CROSS APPLY
STRING_SPLIT([name], ' ')
)
SELECT
a.Customer,
b.name
FROM
CustomerSplit b
INNER JOIN
BOLFlatFileSplit a
ON
a.value = b.value
GROUP BY
a.Customer, b.name
HAVING
COUNT(*) = MAX([WordsInName])

How to split data in SQL Server table row

I have a transaction table with a column transactionId that holds values like |H000021|B1|.
I need to join it to a Category table, which has a column CategoryID with values like H000021.
I cannot join the two unless the values are the same.
So I want to split out or remove the unnecessary data contained in transactionId so that I can join both tables.
Kindly help me with a solution.
Create a computed column with the code only.
Initial scenario:
create table Transactions
(
transactionId varchar(12) primary key,
whatever varchar(100)
)
create table Category
(
transactionId varchar(7) primary key,
name varchar(100)
)
insert into Transactions
select '|H000021|B1|', 'Anything'
insert into Category
select 'H000021', 'A category'
Add computed column:
alter table Transactions add transactionId_code as substring(transactionid, 2, 7) persisted
Join using the new computed column:
select *
from Transactions t
inner join Category c on t.transactionId_code = c.transactionId
This gives a straightforward query plan.
You should fix your data so the columns are the same. But sometimes we are stuck with other people's bad design decisions. In particular, the transaction data should contain a column for the category -- even if the category is part of the id.
In any case:
select . . .
from transaction t join
category c
on transactionid like '|' + categoryid + '|%';
Or if the category id is always 7 characters:
select . . .
from transaction t join
category c
on categoryid = substring(transactionid, 2, 7)
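If the category id is not always 7 characters, the same idea can be sketched by cutting the text between the first two pipes. This assumes every transactionId starts with '|' and contains a second '|' (otherwise CHARINDEX returns 0 and SUBSTRING raises an error). The table variables and the second sample row are made up for illustration.

```sql
-- Hypothetical sample data; the table variables stand in for the real tables
DECLARE @Transactions TABLE (transactionId varchar(50));
DECLARE @Category TABLE (CategoryID varchar(20), name varchar(100));
INSERT INTO @Transactions VALUES ('|H000021|B1|'), ('|AB12|B2|');
INSERT INTO @Category VALUES ('H000021', 'A category'), ('AB12', 'Another category');

-- CHARINDEX('|', transactionId, 2) finds the second pipe;
-- the code is everything between position 2 and that pipe
SELECT t.transactionId, c.CategoryID, c.name
FROM @Transactions AS t
JOIN @Category AS c
  ON c.CategoryID = SUBSTRING(t.transactionId, 2, CHARINDEX('|', t.transactionId, 2) - 2);
```

Both sample rows join to their category here, regardless of the length of the code.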
You can do this with a query like the following:
CREATE TABLE #MyTable
(PrimaryKey int PRIMARY KEY,
KeyTransacFull varchar(50)
);
GO
CREATE TABLE #MyTransaction
(PrimaryKey int PRIMARY KEY,
KeyTransac varchar(50)
);
GO
INSERT INTO #MyTable
SELECT 1, '|H000021|B1|'
INSERT INTO #MyTable
SELECT 2, '|H000021|B1|'
INSERT INTO #MyTransaction
SELECT 1, 'H000021'
SELECT * FROM #MyTable
SELECT * FROM #MyTransaction
SELECT *
FROM #MyTable
JOIN #MyTransaction ON KeyTransacFull LIKE '|'+KeyTransac+'|%'
DROP TABLE #MyTable
DROP TABLE #MyTransaction

Multi column exist statement

I am trying to insert data into a table from another table, but only the rows that do not already exist.
The table I am inserting the data into:
CREATE TABLE #T(Name VARCHAR(10),Unit INT, Id INT)
INSERT INTO #T
VALUES('AAA',10,100),('AAB',11,102),('AAC',12,130)
The table I am selecting the data from
CREATE TABLE #T1(Name VARCHAR(10),TypeId INT,Unit INT, Id INT)
INSERT INTO #T1
VALUES('AAA',3,10,100),('AAA',3,10,106)
In this case I want to select ('AAA',3,10,106) from #T1, because the combination AAA,106 does not exist in #T.
Basically, what I want is to populate unique Name and Id combinations.
I tried the query below, which doesn't seem to work:
SELECT *
FROM #T1
WHERE NOT EXISTS(SELECT * FROM #T)
You have to somehow correlate the two tables:
SELECT *
FROM #T1
WHERE NOT EXISTS(SELECT *
FROM #T
WHERE #T1.Name = #T.Name AND #T1.ID = #T.ID)
The above query essentially says: get me those records of table #T1 which do not have a related record in #T having the same Name and ID values.
Your best bet is probably to use an insert statement with a subquery. Something like this:
SQL Insert Into w/Subquery - Checking If Not Exists
Edit: If you're still stuck, try this:
INSERT INTO #T (Name, Unit, Id)
SELECT Name, Unit, Id
FROM #T1
WHERE
NOT EXISTS (SELECT Name, Unit, Id FROM #T
WHERE #T.Name = #T1.Name AND #T.Unit = #T1.Unit AND #T.Id = #T1.Id)
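Note that the insert above also compares Unit. If only the Name and Id combination should decide whether a row is new, as the question describes, the correlation can be limited to those two columns. A minimal sketch using the question's sample data (the aliases s and t are illustrative; it does not deduplicate rows within #T1 itself):

```sql
CREATE TABLE #T  (Name VARCHAR(10), Unit INT, Id INT);
CREATE TABLE #T1 (Name VARCHAR(10), TypeId INT, Unit INT, Id INT);
INSERT INTO #T  VALUES ('AAA',10,100),('AAB',11,102),('AAC',12,130);
INSERT INTO #T1 VALUES ('AAA',3,10,100),('AAA',3,10,106);

-- Insert only the source rows whose (Name, Id) pair is not yet in #T
INSERT INTO #T (Name, Unit, Id)
SELECT s.Name, s.Unit, s.Id
FROM #T1 AS s
WHERE NOT EXISTS (SELECT 1
                  FROM #T AS t
                  WHERE t.Name = s.Name
                    AND t.Id   = s.Id);

SELECT * FROM #T;   -- now also contains ('AAA', 10, 106)

DROP TABLE #T1;
DROP TABLE #T;
```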

How to insert Auto-Increment using SELECT INTO Statement? SQL SERVER

This is my table1:
Name Description
john student
dom teacher
I need to use SELECT * INTO to transfer it to another table (table2) but I want it with a new column named Auto which is auto-incremented.
Which will look like this:
Name Description Auto
John Student 1
Dom Teacher 2
Current Code: SELECT * INTO table2 FROM table1
Use ROW_NUMBER to add sequential number starting from 1.
SELECT *,
Auto = ROW_NUMBER() OVER(ORDER BY(SELECT NULL))
INTO table2
FROM table1
The accepted answer is also convenient for breaking one table into several smaller ones with an exact number of rows each. The column used for the auto-increment can be dropped afterwards if it is not needed.
SELECT *,
ID = ROW_NUMBER() OVER(ORDER BY ( SELECT NULL ))
INTO #Table2
FROM Table1
DECLARE @start INT, @end INT;
SET @start = 1;
SET @end = 5000000;
SELECT *
INTO Table3
FROM #Table2
WHERE ID BETWEEN @start AND @end;
ALTER TABLE Table3 DROP COLUMN ID;
You can use an identity field for this, that's what they're for. The logic of the identity(1,1) means that it will start at the number 1 and increment by 1 each time.
Sample data:
CREATE TABLE #OriginalData (Name varchar(4), Description varchar(7))
INSERT INTO #OriginalData (Name, Description)
VALUES
('John','student')
,('Dom','teacher')
Make a new table and insert the data into it:
CREATE TABLE #NewTable (Name varchar(4), Description varchar(7), Auto int identity(1,1))
INSERT INTO #NewTable (Name, Description)
SELECT
Name
,Description
FROM #OriginalData
This gives the following results:
Name Description Auto
John student 1
Dom teacher 2
If you ran the insert a couple more times, your results would look like this:
Name Description Auto
John student 1
Dom teacher 2
John student 3
Dom teacher 4
John student 5
Dom teacher 6
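Since the question asks specifically for SELECT INTO, the identity idea can also be combined with it through the IDENTITY() function, which T-SQL allows only inside a SELECT ... INTO statement. This is a sketch on the same sample data; #CopyWithAuto is a made-up table name, and the order in which the numbers are handed out to the rows is not guaranteed.

```sql
CREATE TABLE #OriginalData (Name varchar(4), Description varchar(7));
INSERT INTO #OriginalData (Name, Description)
VALUES ('John','student'), ('Dom','teacher');

-- IDENTITY(data_type, seed, increment) creates an identity column in the new table
SELECT Name, Description, IDENTITY(int, 1, 1) AS Auto
INTO #CopyWithAuto
FROM #OriginalData;

SELECT * FROM #CopyWithAuto;

DROP TABLE #CopyWithAuto;
DROP TABLE #OriginalData;
```

Unlike the ROW_NUMBER approach, the new column here is a real identity column, so later inserts into the copy keep counting upwards.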

SQL if @name exists select ID else insert @name into table

Table looks like below:
CREATE TABLE names
(ID int,
name varchar(10) unique)
I need to achieve the following result:
if @name not exists in names then insert into names (name) values (@name)
select id from names where name=@name
It would be best to achieve it with user defined function.
You basically have the answer written in your question already:
IF (NOT EXISTS (SELECT * FROM names WHERE name = @name))
INSERT INTO names (name) values (@name);
SELECT id FROM names WHERE name = @name;
The only problem is that you haven't set up your table names to use an IDENTITY column. This means you need to assign values for id as well.
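Putting the two points together, here is one possible sketch. It assumes the table can be recreated with an IDENTITY column, and it uses a stored procedure rather than the user-defined function the question asks for, because T-SQL functions are not allowed to insert into permanent tables. The procedure name dbo.GetOrAddName is hypothetical, and GO is the batch separator understood by SSMS and sqlcmd.

```sql
-- Assumption: the table is (re)created with an auto-generated ID
CREATE TABLE names
(
    ID   int IDENTITY(1,1) PRIMARY KEY,
    name varchar(10) UNIQUE
);
GO

-- Hypothetical procedure: insert the name if it is missing, then return its ID
CREATE PROCEDURE dbo.GetOrAddName
    @name varchar(10)
AS
BEGIN
    SET NOCOUNT ON;

    IF NOT EXISTS (SELECT 1 FROM names WHERE name = @name)
        INSERT INTO names (name) VALUES (@name);

    SELECT ID FROM names WHERE name = @name;
END;
GO

EXEC dbo.GetOrAddName @name = 'alice';   -- inserts the row on the first call
EXEC dbo.GetOrAddName @name = 'alice';   -- later calls return the same ID
```

Under concurrent callers, two sessions could both pass the NOT EXISTS check; the UNIQUE constraint would then reject the second insert, so production code would need to handle that case.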