Filtering a group of records - sql

Please see the SQL structure below:
CREATE table TestTable (id int not null identity, [type] char(1), groupid int)
INSERT INTO TestTable ([type], groupid) values ('a',1)
INSERT INTO TestTable ([type], groupid) values ('a',1)
INSERT INTO TestTable ([type], groupid) values ('b',1)
INSERT INTO TestTable ([type], groupid) values ('b',1)
INSERT INTO TestTable ([type], groupid) values ('a',2)
INSERT INTO TestTable ([type], groupid) values ('a',2)
The first four records are part of group 1 and the fifth and sixth records are part of group 2.
If there is at least one b in the group then I want the query to only return b's for that group. If there are no b's then the query should return all records for that group.

Here you go
SELECT *
FROM testtable
LEFT JOIN (SELECT distinct groupid FROM TestTable WHERE type = 'b'
) blist ON blist.groupid = testtable.groupid
WHERE (blist.groupid = testtable.groupid and type = 'b') OR
(blist.groupid is null)
How it works
Join to a list of the groups that contain at least one b.
Then, in the WHERE clause, if the row's group is in that list, keep only the rows of type b. Otherwise keep everything.
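As an after-note, you could shorten the WHERE clause. The join already guarantees that blist.groupid equals testtable.groupid whenever blist.groupid is not null, so the equality check is redundant:
WHERE blist.groupid IS NULL OR [type] = 'b'
Don't collapse it further into something like ISNULL(blist.groupid, testtable.groupid) = testtable.groupid. That condition is true for every row that has a groupid, so it no longer filters out the a's.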
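To try the join-and-filter idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here (an assumption, not part of the original answer), so the table uses INTEGER PRIMARY KEY instead of identity, and the helper name filter_groups is made up for illustration.

```python
import sqlite3

def filter_groups(rows):
    """Return only the b rows for groups that contain a b, else all rows of the group."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TestTable (id INTEGER PRIMARY KEY, type TEXT, groupid INTEGER)"
    )
    conn.executemany("INSERT INTO TestTable (type, groupid) VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT t.id, t.type, t.groupid
        FROM TestTable AS t
        LEFT JOIN (SELECT DISTINCT groupid FROM TestTable WHERE type = 'b') AS blist
               ON blist.groupid = t.groupid
        WHERE (blist.groupid = t.groupid AND t.type = 'b')
           OR blist.groupid IS NULL
        ORDER BY t.id
        """
    ).fetchall()
    conn.close()
    return result

# Same sample data as the question: group 1 has a's and b's, group 2 has only a's.
sample = [("a", 1), ("a", 1), ("b", 1), ("b", 1), ("a", 2), ("a", 2)]
filtered = filter_groups(sample)
print(filtered)
```

With the question's sample data, group 1 should come back as its two b rows and group 2 as both of its a rows.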

Related

Insert into table from select only when select returns valid rows

I want to insert into table from select statement but it is required that insert only happens when select returns valid rows. If no rows return from select, then no insertion happens.
insert into products (name, type) select 'product_name', type from prototype where id = 1
However, the above sql does insertion even when select returns no rows.
It tries to insert NULL values.
I know the following sql can check if row exists
select exists (select true from prototype where id = 1)
How to write a single SQL to add the above condition to insert to exclude the case ?
You are inserting the wrong way. See the example below, which doesn't insert any row since none matches id = 1:
create table products (
name varchar(10),
type varchar(10)
);
create table prototype (
id int,
name varchar(10),
type varchar(10)
);
insert into prototype (id, name, type) values (5, 'product5', 'type5');
insert into prototype (id, name, type) values (7, 'product7', 'type7');
insert into products (name, type) select name, type from prototype where id = 1;
-- no rows were inserted.
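The same behaviour can be checked quickly with Python's sqlite3 module. This is a sketch under the assumption that SQLite treats INSERT ... SELECT the same way as the database in the question. The helper copy_prototype is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (name TEXT, type TEXT);
    CREATE TABLE prototype (id INTEGER, name TEXT, type TEXT);
    INSERT INTO prototype (id, name, type) VALUES (5, 'product5', 'type5');
    INSERT INTO prototype (id, name, type) VALUES (7, 'product7', 'type7');
    """
)

def copy_prototype(proto_id):
    """Copy matching prototype rows into products and return the new row count."""
    conn.execute(
        "INSERT INTO products (name, type) "
        "SELECT name, type FROM prototype WHERE id = ?",
        (proto_id,),
    )
    return conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]

count_after_missing = copy_prototype(1)   # no prototype has id = 1
count_after_present = copy_prototype(5)   # exactly one prototype has id = 5
product_rows = conn.execute("SELECT name, type FROM products").fetchall()
print(count_after_missing, count_after_present, product_rows)
```

The first call leaves products empty because the SELECT returns no rows. Only the second call adds a row.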

How to join on columns that contain strings that aren't exact matches in SQL Server?

I am trying to create a simple table join on columns from two tables that are equivalent but not exact matches. For example, the row value in table A might be "Georgia Production" and the corresponding row value in table B might be "Georgia Independent Production Co".
I first tried a wild card in the join like this:
select BOLFlatFile.*, customers.City, customers.FEIN_Registration_No, customers.ST
from BOLFlatFile
Left Join Customers on (customers.Name Like '%'+BOLFlatFile.Customer+'%');
and this works great for 90% of the data. However, if the string in table A does not appear as one unbroken substring in table B, the join returns null.
So back to the above example, if the value for table A were "Georgia Independent", it would work, but if it were "Georgia Production", it would not.
This might be a complicated way of still being wrong, but this works with the sample I've mocked up.
The first assumption is that, because you are "wildcard searching" a string from one table in another, all of the words in the first table's column appear in the second table's column, which means the second table's column will always hold a longer string than the first table's column.
The second assumption is that there is a unique id on the first table. If there is not, you can create one by using the row_number function and ordering on your string column.
The approach below first creates some sample data (I've used tablea and tableb to represent your tables).
Then a dummy table is created to store the unique id and the string column from your first table.
Next, a loop iterates over the strings in the dummy table. Each pass inserts the unique id and the leading word of the string (everything up to the first space) into the handler table, which is what you will use to join the two target tables together.
The final query joins the first table to the handler table on the unique id, then joins the second table to the handler table on the key words longer than 3 letters (avoiding "the", "and", etc.). It keeps only the rows where the string in table b is longer than the one in table a, which follows from the first assumption.
declare @tablea table (
id int identity(1,1),
helptext nvarchar(50)
);
declare @tableb table (
id int identity(1,1),
helptext nvarchar(50)
);
insert @tablea (helptext)
values
('Text to find'),
('Georgia Production'),
('More to find');
insert @tableb (helptext)
values
('Georgia Independent Production'),
('More Text to Find'),
('something Completely different'),
('Text to find');
declare @stringtable table (
id int,
string nvarchar(50)
);
declare @stringmatch table (
id int,
stringmatch nvarchar(20)
);
insert @stringtable (id, string)
select id, helptext from @tablea;
-- the trailing space guarantees charindex finds a delimiter after the last word
update @stringtable set string = string + ' ';
while exists (select 1 from @stringtable)
begin
insert @stringmatch (id, stringmatch)
select id, substring(string,1,charindex(' ',string)) from @stringtable;
update @stringmatch set stringmatch = ltrim(rtrim(stringmatch));
-- a table variable must be updated through its alias when joined
update tb set string=replace(tb.string, ma.stringmatch, '') from @stringtable tb inner join @stringmatch ma
on tb.id=ma.id and charindex(ma.stringmatch,tb.string)>0;
update @stringtable set string=LTRIM(string);
delete from @stringtable where string='' or string is null;
end
-- only key words longer than 3 letters take part in the match
select a.*, b.* from @tablea a inner join @stringmatch m on a.id=m.id and len(m.stringmatch)>3
inner join @tableb b on CHARINDEX(m.stringmatch,b.helptext)>0 and len(b.helptext)>len(a.helptext);
It all depends on how complex you want to make this matching. There are various ways of matching these strings, and some may work better than others. Below is an example of how you can split the names in your BOLFlatFile and Customers tables into separate words by using string_split (available from SQL Server 2016).
The example below will match anything where all the words in the BOLFlatFile customer field are contained within the customers name field (note: it won't take into account the ordering of the words).
The code below will match the first two pairs of sample strings as expected, but not the last pair ('Test 3' against 'Test String 2').
CREATE TABLE BOLFlatFile
(
[customer] NVARCHAR(500)
)
CREATE TABLE Customers
(
[name] NVARCHAR(500)
)
INSERT INTO Customers VALUES ('Georgia Independent Production Co')
INSERT INTO BOLFlatFile VALUES ('Georgia Production')
INSERT INTO Customers VALUES ('Test String 1')
INSERT INTO BOLFlatFile VALUES ('Test 1')
INSERT INTO Customers VALUES ('Test String 2')
INSERT INTO BOLFlatFile VALUES ('Test 3')
;with BOLFlatFileSplit
as
(
SELECT *,
COUNT(*) OVER(PARTITION BY [customer]) as [WordsInName]
FROM
BOLFlatFile
CROSS APPLY
STRING_SPLIT([customer], ' ')
),
CustomerSplit as
(
SELECT *
FROM
Customers
CROSS APPLY
STRING_SPLIT([name], ' ')
)
SELECT
a.Customer,
b.name
FROM
CustomerSplit b
INNER JOIN
BOLFlatFileSplit a
ON
a.value = b.value
GROUP BY
a.Customer, b.name
HAVING
COUNT(*) = MAX([WordsInName])
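The matching rule itself (every word of the short name must appear somewhere in the long name, in any order) can be sketched outside the database in a few lines of Python. This is an illustration of the rule only, not a translation of the query: the function name words_contained is made up, the comparison is case-sensitive (unlike a typical SQL Server collation), and repeated words are collapsed because sets are used.

```python
def words_contained(short_name, long_name):
    """True when every space-separated word of short_name occurs in long_name."""
    return set(short_name.split()) <= set(long_name.split())

# Sample values taken from the answer's INSERT statements.
bol_customers = ["Georgia Production", "Test 1", "Test 3"]
customer_names = ["Georgia Independent Production Co", "Test String 1", "Test String 2"]

# Pair every BOLFlatFile customer with every Customers name that contains all its words.
matches = [
    (customer, name)
    for customer in bol_customers
    for name in customer_names
    if words_contained(customer, name)
]
print(matches)
```

'Test 3' finds no partner because the word '3' does not occur in any customer name.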

Multi column exist statement

I am trying to insert data into a table from another table, but only where the data does not already exist.
The table I am inserting the data into
CREATE TABLE #T(Name VARCHAR(10),Unit INT, Id INT)
INSERT INTO #T
VALUES('AAA',10,100),('AAB',11,102),('AAC',12,130)
The table I am selecting the data from
CREATE TABLE #T1(Name VARCHAR(10),TypeId INT,Unit INT, Id INT)
INSERT INTO #T1
VALUES('AAA',3,10,100),('AAA',3,10,106)
In this case I want to select ('AAA',3,10,106) from #T1 because the AAA,106 combination does not exist in #T.
Basically, what I want is to populate unique Name and Id combinations.
I tried the query below, which doesn't seem to work:
SELECT *
FROM #T1
WHERE NOT EXISTS(SELECT * FROM #T)
You have to somehow correlate the two tables:
SELECT *
FROM #T1
WHERE NOT EXISTS(SELECT *
FROM #T
WHERE #T1.Name = #T.Name AND #T1.ID = #T.ID)
The above query essentially says: get me those records of table #T1 which do not have a related record in #T having the same Name and ID values.
Your best bet is probably to use an INSERT statement with a subquery. Something like this:
SQL Insert Into w/Subquery - Checking If Not Exists
Edit: If you're still stuck, try this--
INSERT INTO #T (Name, Unit, Id)
SELECT Name, Unit, Id
FROM #T1
WHERE
NOT EXISTS (SELECT Name, Unit, Id FROM #T
WHERE #T.Name = #T1.Name AND #T.Unit = #T1.Unit AND #T.Id = #T1.Id)
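The difference between the uncorrelated and the correlated NOT EXISTS can be seen side by side in a small sqlite3 sketch. SQLite and the plain table names T and T1 (instead of temp tables) are assumptions made so the snippet runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE T  (Name TEXT, Unit INTEGER, Id INTEGER);
    CREATE TABLE T1 (Name TEXT, TypeId INTEGER, Unit INTEGER, Id INTEGER);
    INSERT INTO T  VALUES ('AAA', 10, 100), ('AAB', 11, 102), ('AAC', 12, 130);
    INSERT INTO T1 VALUES ('AAA', 3, 10, 100), ('AAA', 3, 10, 106);
    """
)

# Uncorrelated: the subquery returns rows as long as T is not empty,
# so NOT EXISTS is false for every row of T1.
uncorrelated = conn.execute(
    "SELECT * FROM T1 WHERE NOT EXISTS (SELECT * FROM T)"
).fetchall()

# Correlated: the subquery is evaluated per T1 row, matching on Name and Id.
correlated = conn.execute(
    """
    SELECT * FROM T1
    WHERE NOT EXISTS (SELECT 1 FROM T WHERE T.Name = T1.Name AND T.Id = T1.Id)
    """
).fetchall()
print(uncorrelated, correlated)
```

Only the correlated form returns the ('AAA', 3, 10, 106) row the question asks for.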

Compare two tables' data

drop table #temp1
drop table #temp2
create table #temp1 (id int identity, A int, B int)
create table #temp2 (id int identity, A int, B int)
insert into #temp1 values (20, 1001)
insert into #temp1 values (20, 1001)
insert into #temp1 values (30, 1001)
insert into #temp2 values (20, 1001)
With the help of SQL, I need to find out that the 2nd and 3rd row in the #temp1 is not present in #temp2.
How to find it out?
You can use the EXCEPT operator:
SELECT id, a, b
FROM #temp1
EXCEPT
SELECT id, a, b
FROM #temp2 ;
There is one serious caveat, though. The id column is populated automatically, so its values carry no meaning for the comparison. In a real schema you would probably have a UNIQUE constraint on (a, b), or some other combination defined as the PRIMARY KEY, and then the example above wouldn't be valid: only two rows would be inserted into #temp1, the 1st and the 3rd, and the 2nd would be rejected as a duplicate of the 1st.
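To see how much the result depends on the auto-generated id, the sketch below runs EXCEPT twice in SQLite through Python's sqlite3 module: once with id and once on (A, B) alone. SQLite and the table names temp1 and temp2 are stand-ins chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE temp1 (id INTEGER PRIMARY KEY, A INTEGER, B INTEGER);
    CREATE TABLE temp2 (id INTEGER PRIMARY KEY, A INTEGER, B INTEGER);
    INSERT INTO temp1 (A, B) VALUES (20, 1001), (20, 1001), (30, 1001);
    INSERT INTO temp2 (A, B) VALUES (20, 1001);
    """
)

# Comparing with id: rows 2 and 3 of temp1 have no counterpart in temp2.
with_id = conn.execute(
    "SELECT id, A, B FROM temp1 EXCEPT SELECT id, A, B FROM temp2 ORDER BY id"
).fetchall()

# Comparing on (A, B) only: EXCEPT removes duplicates, so just (30, 1001) is left.
without_id = conn.execute(
    "SELECT A, B FROM temp1 EXCEPT SELECT A, B FROM temp2"
).fetchall()
print(with_id, without_id)
```

The second query shows that, once id is ignored, the duplicate (20, 1001) row no longer counts as missing.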

Select records with order of IN clause

I have
SELECT * FROM Table1 WHERE Col1 IN(4,2,6)
I want to select and return the records in the order I specify in the IN clause
(first display record with Col1=4, Col1=2, ...)
I can use
SELECT * FROM Table1 WHERE Col1 = 4
UNION ALL
SELECT * FROM Table1 WHERE Col1 = 6 , .....
but I don't want to use that, because I want to put it in a stored procedure rather than generate the SQL automatically.
I know it's a bit late but the best way would be
SELECT *
FROM Table1
WHERE Col1 IN( 4, 2, 6 )
ORDER BY CHARINDEX(',' + CAST(Col1 AS VARCHAR(10)) + ',', ',4,2,6,')
Or
SELECT CHARINDEX(',' + CAST(Col1 AS VARCHAR(10)) + ',', ',4,2,6,') AS s_order,
*
FROM Table1
WHERE Col1 IN( 4, 2, 6 )
ORDER BY s_order
You have a couple of options. Simplest may be to put the IN parameters (they are parameters, right) in a separate table in the order you receive them, and ORDER BY that table.
The solution is along this line:
SELECT * FROM Table1
WHERE Col1 IN(4,2,6)
ORDER BY
CASE Col1
WHEN 4 THEN 1
WHEN 2 THEN 2
WHEN 6 THEN 3
END
select top 0 0 'in', 0 'order' into #i
insert into #i values(4,1)
insert into #i values(2,2)
insert into #i values(6,3)
select t.* from Table1 t inner join #i i on i.[in]=t.[col1] order by i.[order]
Replace the IN values with a table, including a column for the sort order to be used in the query (and be sure to expose the sort order to the calling application):
WITH OtherTable (Col1, sort_seq)
AS
(
SELECT Col1, sort_seq
FROM (
VALUES (4, 1),
(2, 2),
(6, 3)
) AS OtherTable (Col1, sort_seq)
)
SELECT T1.Col1, O1.sort_seq
FROM Table1 AS T1
INNER JOIN OtherTable AS O1
ON T1.Col1 = O1.Col1
ORDER
BY sort_seq;
In your stored proc, rather than a CTE, split the values into a table (a scratch base table, temp table, function that returns a table, etc.) with the sort column populated as appropriate.
I have found another solution. It's similar to the answer from onedaywhen, but it's a little shorter.
SELECT sort.n, Table1.Col1
FROM (VALUES (4, 1), (2, 2), (6, 3)) AS sort(n, ord)
JOIN Table1
ON Table1.Col1 = sort.n
ORDER BY sort.ord
I am thinking about this problem in two different ways because I can't decide whether it is a programming problem or a data architecture problem. Check out the code below incorporating "famous" TV animals. Let's say that we are tracking dolphins, horses, bears, dogs and orangutans. We want to return only the horses, bears, and dogs in our query, and we want bears to sort ahead of horses and horses ahead of dogs. I have a personal preference for looking at this as an architecture problem, but I can wrap my head around looking at it as a programming problem. Let me know if you have questions.
CREATE TABLE #AnimalType (
AnimalTypeId INT NOT NULL PRIMARY KEY
, AnimalType VARCHAR(50) NOT NULL
, SortOrder INT NOT NULL)
INSERT INTO #AnimalType VALUES (1,'Dolphin',5)
INSERT INTO #AnimalType VALUES (2,'Horse',2)
INSERT INTO #AnimalType VALUES (3,'Bear',1)
INSERT INTO #AnimalType VALUES (4,'Dog',4)
INSERT INTO #AnimalType VALUES (5,'Orangutan',3)
CREATE TABLE #Actor (
ActorId INT NOT NULL PRIMARY KEY
, ActorName VARCHAR(50) NOT NULL
, AnimalTypeId INT NOT NULL)
INSERT INTO #Actor VALUES (1,'Benji',4)
INSERT INTO #Actor VALUES (2,'Lassie',4)
INSERT INTO #Actor VALUES (3,'Rin Tin Tin',4)
INSERT INTO #Actor VALUES (4,'Gentle Ben',3)
INSERT INTO #Actor VALUES (5,'Trigger',2)
INSERT INTO #Actor VALUES (6,'Flipper',1)
INSERT INTO #Actor VALUES (7,'CJ',5)
INSERT INTO #Actor VALUES (8,'Mr. Ed',2)
INSERT INTO #Actor VALUES (9,'Tiger',4)
/* If you believe this is a programming problem then this code works */
SELECT *
FROM #Actor a
WHERE a.AnimalTypeId IN (2,3,4)
ORDER BY case when a.AnimalTypeId = 3 then 1
when a.AnimalTypeId = 2 then 2
when a.AnimalTypeId = 4 then 3 end
/* If you believe that this is a data architecture problem then this code works */
SELECT *
FROM #Actor a
JOIN #AnimalType at ON a.AnimalTypeId = at.AnimalTypeId
WHERE a.AnimalTypeId IN (2,3,4)
ORDER BY at.SortOrder
DROP TABLE #Actor
DROP TABLE #AnimalType
ORDER BY CHARINDEX(','+convert(varchar,status)+',' ,
',rejected,active,submitted,approved,')
Put a comma before and after the string you are searching in (the second parameter of CHARINDEX).
Surround the first parameter of CHARINDEX with commas as well, so that only a whole list item can match, never part of a longer one.
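Why the surrounding commas matter is easiest to see with values where one is a substring of another. The sketch below uses SQLite through Python's sqlite3 module as a stand-in: instr and || play the role of CHARINDEX and +, and note that instr takes the searched string first, the reverse of CHARINDEX. The table contents and the helper name ordered are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Col1 INTEGER)")
conn.executemany("INSERT INTO Table1 (Col1) VALUES (?)", [(1,), (2,), (12,), (99,)])

wanted = [12, 2, 1]                             # desired output order
placeholders = ", ".join("?" for _ in wanted)
plain_list = ",".join(str(v) for v in wanted)   # '12,2,1'
wrapped_list = "," + plain_list + ","           # ',12,2,1,'

def ordered(order_expr, list_text):
    """Run the IN query with the given ORDER BY expression and return Col1 values."""
    sql = (
        f"SELECT Col1 FROM Table1 WHERE Col1 IN ({placeholders}) "
        f"ORDER BY {order_expr}"
    )
    return [row[0] for row in conn.execute(sql, (*wanted, list_text))]

# Without delimiters, '1' is found inside '12' at position 1, so the order breaks.
naive_order = ordered("instr(?, CAST(Col1 AS TEXT))", plain_list)

# With delimiters, ',1,' can only match the whole list item.
wrapped_order = ordered("instr(?, ',' || CAST(Col1 AS TEXT) || ',')", wrapped_list)
print(naive_order, wrapped_order)
```

In the unwrapped version 1 and 12 both resolve to position 1, so 2 ends up last. The wrapped version returns the rows in the requested order.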