Optimised way of replacing strings with each other in SQL table - sql

Problem Statement:
I am looking for an optimised way of swapping strings with each other in a SQL table that holds a large amount of data, with multiple cases to handle.
Suppose I have a table City.
I need to replace Bangalore with Delhi and Delhi with Bangalore, and similarly there might be 'n' other cases.
I know that we can use a CASE expression in the UPDATE to replace the data in the table. Is there a better way of doing it in a single UPDATE, using the REPLACE() function or anything else?

An alternative would be to create a temporary table, something like
create table #tmpReplace(OriginalValue varchar(200), NewValue varchar(200))
and make an inner join with your table. The advantage is a much simpler update. The drawback is that you still have to populate this table.
For example:
update yt
set yt.Name = tmp.NewValue
from dbo.YourTable yt inner join #tmpReplace tmp on yt.Name = tmp.OriginalValue
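The mapping-table update above can be tried end to end with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, so the syntax differs slightly (a correlated subquery instead of UPDATE ... FROM). The names city, replace_map and swap_names, and the sample rows, are assumptions made for illustration.

```python
import sqlite3

def swap_names(rows, mapping):
    """Apply every (original, new) pair in `mapping` to `rows` with one UPDATE."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT)")
    con.execute(
        "CREATE TABLE replace_map (original_value TEXT PRIMARY KEY, new_value TEXT)"
    )
    con.executemany("INSERT INTO city (id, name) VALUES (?, ?)", rows)
    con.executemany("INSERT INTO replace_map VALUES (?, ?)", mapping)
    # Each row looks up its own current name in the mapping table, so
    # Bangalore -> Delhi and Delhi -> Bangalore do not interfere with each other.
    con.execute("""
        UPDATE city
        SET name = (SELECT m.new_value
                    FROM replace_map m
                    WHERE m.original_value = city.name)
        WHERE name IN (SELECT original_value FROM replace_map)
    """)
    result = con.execute("SELECT id, name FROM city ORDER BY id").fetchall()
    con.close()
    return result

swapped = swap_names(
    [(1, "Bangalore"), (2, "Delhi"), (3, "Mumbai")],
    [("Bangalore", "Delhi"), ("Delhi", "Bangalore")],
)
```

Rows whose name is not in the mapping table (Mumbai here) are left untouched by the WHERE clause.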

Honestly, the easiest method would seem to be a CASE expression:
SELECT ID,
CASE Name WHEN 'Bangalore' THEN 'Delhi'
WHEN 'Delhi' THEN 'Bangalore'
...
ELSE Name
END AS Name
FROM dbo.YourTable;

I would suggest creating a derived table in the query with the "replacement" values:
update t
set t.name = v.newvalue
from t join
(values ('Delhi', 'Bangalore'),
('Bangalore', 'Delhi'),
. . .
) v(oldvalue, newvalue)
on v.oldvalue = t.name;

Related

SQL Server connecting several Selects

I have the following statement:
Select No, Region = 'Ohio'
FROM table
where PostCode >='0001'
AND PostCode <= '4999'
which returns the rows with the correct state in the Region field. How can I expand that statement with several other WHERE conditions in the same statement?
e.g.
Region = 'NewYork'
Where PostCode >='5000'
AND PostCode <= '7999'
My solution would be to build a separate statement for each Region, but there must be a better way that puts them all in one.
Two common ways to select/set different values based on multiple criteria in a single query are case statements and doing a join on another table with those values. I should also point out that you can take advantage of the between operator in SQL server for much of this.
CASE statements in a single query
A case statement might be useful if you have a small set of criteria, or if you just need to throw together an adhoc query. Here is an example of using a case statement:
select
No,
Region = case
when (PostCode >= '0001' and PostCode <= '4999')
then 'Ohio'
when (PostCode between '5000' and '7999')
then 'NewYork'
else
'Unknown'
end
from [...]
JOIN a table with the values and criteria
This is definitely the better method for something like evaluating 50 states - especially since this data is likely static. The idea is that you will want to have a table that contains the criteria and the value, and then join it to the table.
Here is an example using a temp table - you would likely want to use a real table for something as common as states.
-- Setup a #states table
create table #states (state varchar(20), PostCodeMin char(4), PostCodeMax char(4))
insert into #states values ('Ohio', '0001', '4999')
insert into #states values ('NewYork', '5000', '7999')
-- Now query it
select
t.No,
State = isnull(s.state, 'Unknown')
from
my_table t
left outer join #states s
on (t.PostCode between s.PostCodeMin and s.PostCodeMax)
Note that in the above query, I do a left outer join to #states, in case the state isn't set up. I also select the State using isnull, in case the outer join doesn't return anything for that particular row in my_table.
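The range join described above can be checked with a small runnable sketch. It uses Python's sqlite3 module in place of SQL Server, so COALESCE stands in for isnull and ordinary tables stand in for the temp table. The column name row_no, the function name regions_for and the sample post codes are assumptions for illustration only.

```python
import sqlite3

def regions_for(postcodes, ranges):
    """Label each (row_no, postcode) with the state whose range contains it."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE my_table (row_no INTEGER PRIMARY KEY, postcode TEXT)")
    con.execute(
        "CREATE TABLE states (state TEXT, postcode_min TEXT, postcode_max TEXT)"
    )
    con.executemany("INSERT INTO my_table VALUES (?, ?)", postcodes)
    con.executemany("INSERT INTO states VALUES (?, ?, ?)", ranges)
    # LEFT JOIN keeps rows with no matching range; COALESCE labels them 'Unknown'.
    rows = con.execute("""
        SELECT t.row_no, COALESCE(s.state, 'Unknown') AS state
        FROM my_table t
        LEFT JOIN states s
          ON t.postcode BETWEEN s.postcode_min AND s.postcode_max
        ORDER BY t.row_no
    """).fetchall()
    con.close()
    return rows

result = regions_for(
    [(1, "0001"), (2, "4999"), (3, "5000"), (4, "9000")],
    [("Ohio", "0001", "4999"), ("NewYork", "5000", "7999")],
)
```

Because the post codes are fixed-width strings, the text comparison behaves like a numeric range check. Adding a region then means inserting one row into the lookup table rather than editing the query.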
You can create a calculated field using a case statement on region. If there are going to be many "Unknown" records returned, then you may want to tweak the WHERE clause to filter out nonessential records for better performance.
SELECT
*
FROM
(
Select
No,
Region =
CASE
WHEN PostCode >='0001' AND PostCode <='4999' THEN 'Ohio'
WHEN PostCode >='5000' AND PostCode <='7999' THEN 'New York'
ELSE
'Unknown'
END
FROM table
where PostCode >='0001' AND PostCode <= '7999'
)AS X
ORDER BY
Region

'In' clause in SQL server with multiple columns

I have a component that retrieves data from the database based on the keys provided.
However, I want my Java application to get all the data for all keys in a single database hit to speed things up.
I can use an 'in' clause when I have only one key.
When working with more than one key, I can use the query below in Oracle:
SELECT * FROM <table_name>
where (value_type,CODE1) IN (('I','COMM'),('I','CORE'));
which is similar to running
SELECT * FROM <table_name>
where value_type = 'I' and CODE1 = 'COMM'
and
SELECT * FROM <table_name>
where value_type = 'I' and CODE1 = 'CORE'
together.
However, using the 'in' clause as above gives the error below in SQL Server:
ERROR:An expression of non-boolean type specified in a context where a condition is expected, near ','.
Please let me know if there is any way to achieve the same in SQL Server.
This syntax doesn't exist in SQL Server. Use a combination of And and Or.
SELECT *
FROM <table_name>
WHERE
(value_type = 'I' and CODE1 = 'COMM')
OR (value_type = 'I' and CODE1 = 'CORE')
(In this case, you could make it shorter, because value_type is compared to the same value in both combinations. I just wanted to show the pattern that works like IN in oracle with multiple fields.)
When using IN with a subquery, you need to rephrase it like this:
Oracle:
SELECT *
FROM foo
WHERE
(value_type, CODE1) IN (
SELECT type, code
FROM bar
WHERE <some conditions>)
SQL Server:
SELECT *
FROM foo
WHERE
EXISTS (
SELECT *
FROM bar
WHERE <some conditions>
AND foo.value_type = bar.type
AND foo.CODE1 = bar.code)
There are other ways to do it, depending on the case, like inner joins and the like.
If you have under 1000 tuples you want to check against and you're using SQL Server 2008+, you can use a table values constructor, and perform a join against it. You can only specify up to 1000 rows in a table values constructor, hence the 1000 tuple limitation. Here's how it would look in your situation:
SELECT <table_name>.* FROM <table_name>
JOIN ( VALUES
('I', 'COMM'),
('I', 'CORE')
) AS MyTable(a, b) ON a = value_type AND b = CODE1;
This is only a good idea if your list of values is going to be unique, otherwise you'll get duplicate values. I'm not sure how the performance of this compares to using many ANDs and ORs, but the SQL query is at least much cleaner to look at, in my opinion.
You can also write this to use EXISTS instead of JOIN. That may have different performance characteristics, and it avoids producing duplicate results if your values aren't unique. It may be worth trying both EXISTS and JOIN on your use case to see which is a better fit. Here's how EXISTS would look:
SELECT * FROM <table_name>
WHERE EXISTS (
SELECT 1
FROM (
VALUES
('I', 'COMM'),
('I', 'CORE')
) AS MyTable(a, b)
WHERE a = value_type AND b = CODE1
);
In conclusion, I think the best choice is to create a temporary table and query against that. But sometimes that's not possible, e.g. your user lacks the permission to create temporary tables, and then using a table value constructor may be your best choice. Use EXISTS or JOIN, depending on which gives you better performance on your database.
Normally you cannot do it, but you can use the following technique.
SELECT * FROM <table_name>
where (value_type+'/'+CODE1) IN (('I'+'/'+'COMM'),('I'+'/'+'CORE'));
A better solution is to avoid hardcoding your values and put them in a temporary or persistent table:
CREATE TABLE #t (ValueType VARCHAR(16), Code VARCHAR(16))
INSERT INTO #t VALUES ('I','COMM'),('I','CORE')
SELECT DT.*
FROM <table_name> DT
JOIN #t T ON T.ValueType = DT.ValueType AND T.Code = DT.Code
Thus, you avoid storing data in your code (persistent table version) and make the filters easy to modify (without changing the code).
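From the application side, the temporary-table approach can be sketched as follows. This is a minimal illustration using Python's sqlite3 module rather than Java and SQL Server. The table name data, the temp table wanted and the helper fetch_by_key_pairs are hypothetical names, and EXISTS is used instead of JOIN so that a repeated key pair does not duplicate result rows.

```python
import sqlite3

def fetch_by_key_pairs(con, pairs):
    """Return rows of `data` whose (value_type, code1) matches any pair, in one query."""
    con.execute(
        "CREATE TEMP TABLE IF NOT EXISTS wanted (value_type TEXT, code TEXT)"
    )
    con.execute("DELETE FROM wanted")  # start from an empty key list on every call
    con.executemany("INSERT INTO wanted VALUES (?, ?)", pairs)
    return con.execute("""
        SELECT d.id, d.value_type, d.code1
        FROM data d
        WHERE EXISTS (SELECT 1
                      FROM wanted w
                      WHERE w.value_type = d.value_type
                        AND w.code = d.code1)
        ORDER BY d.id
    """).fetchall()

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, value_type TEXT, code1 TEXT)")
con.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [(1, "I", "COMM"), (2, "I", "CORE"), (3, "X", "COMM"), (4, "I", "MISC")],
)
# The repeated ("I", "COMM") pair is deliberate: EXISTS still returns row 1 once.
matches = fetch_by_key_pairs(con, [("I", "COMM"), ("I", "CORE"), ("I", "COMM")])
```

The key pairs travel as bound parameters, so the SQL text stays the same no matter how many keys are requested.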
I think you can try this: combine AND and OR at the same time.
SELECT
*
FROM
<table_name>
WHERE
value_type = 'I'
AND (CODE1 = 'COMM' OR CODE1 = 'CORE')
What you can do is 'join' the columns as a string, and pass your values also combined as strings.
where (cast(column1 as text) ||','|| cast(column2 as text)) in (?1)
The other way is to do multiple ands and ors.
I had a similar problem in MS SQL, but a little different. Maybe it will help somebody in the future. In my case I found this solution (not the full code, just an example):
SELECT Table1.Campaign
,Table1.Coupon
FROM [CRM].[dbo].[Coupons] AS Table1
INNER JOIN [CRM].[dbo].[Coupons] AS Table2 ON Table1.Campaign = Table2.Campaign AND Table1.Coupon = Table2.Coupon
WHERE Table1.Coupon IN ('0000000001', '0000000002') AND Table2.Campaign IN ('XXX000000001', 'XYX000000001')
Of course, I have an index on Coupon and Campaign in the table for fast searching.
Compute it in MS Sql
SELECT * FROM <table_name>
where value_type + '|' + CODE1 IN ('I|COMM', 'I|CORE');

SQL - String Logic

I don't know if what I am wanting to achieve is possible so here is my conundrum.
Within a SQL table there are a number of fields that contain yes/no flags in a string. For example, one field may be called 'Stock', and within this one field there is a string of flags, e.g. 'YNYYY'. Let's say, for example, that the flags stand for:
Coke
Fanta
Pepsi
Lilt
Dr Pepper
In this instance I would want my returned data to contain Coke, Pepsi, Lilt, Dr Pepper, omitting the Fanta.
Now this would be possible using the CASE statement, and this may be the answer that I have to use. However, ideally, so that I don't have to write hundreds of different variables, does anyone know of a way this could be achieved?
Your help is, as always, appreciated. I've done the normal googling, and maybe I simply don't know what to search for, as it's giving me blanks.
Please point me in the right direction.
Regards
R
Why don't you use SELECT with WHERE?
Something like this.
SELECT GROUP_CONCAT(`Stock`)
FROM table_name
WHERE `flag` = 'Y'
Hope this helps.
One way you can achieve your goal is by writing a table-valued function that turns your Y/N string into a table of (Id INT, Value BIT). You could then join to a lookup table based on convention. Here's what something like this would look like:
CREATE FUNCTION udf_StringToBool(@input varchar(100))
RETURNS @table TABLE (
Id INT IDENTITY(1,1),
Value BIT
)
AS
begin
declare @temp_input varchar(100) = @input
while len(@temp_input) > 0
begin
-- peel off the first flag and store it as a bit
insert into @table (value)
SELECT CASE LEFT(@temp_input, 1) WHEN 'Y' THEN 1 ELSE 0 END
set @temp_input = right(@temp_input, len(@temp_input)-1)
END
RETURN
end
You would then join your stock (and a lookup to product) table with this function, then remove any that are not in stock in the WHERE clause:
SELECT s.*, v.Value, pl.Name
FROM
stock s
cross apply
(
select b.* from udf_StringToBool(s.Flags) b
) v
join product_lookup pl on pl.Id = v.Id
WHERE v.Value = 1
Here's how you would define the lookup table:
create table product_lookup
(
Id INT IDENTITY(1,1),
Name Varchar(50)
)
insert into product_lookup (Name) values
('Coke'),
('Fanta'),
('Pepsi'),
('Lilt'),
('Dr Pepper')
Then you could use PIVOT to generate the columns with booleans.
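As a lighter variation on the same lookup-table idea (not the function-based method above), the lookup Id can be used directly as the character position in the flag string. The sketch below is a minimal illustration using Python's sqlite3 module, where substr plays the role of SUBSTRING. The helper name in_stock_names and the column name flags are assumptions.

```python
import sqlite3

def in_stock_names(con, stock_id):
    """List product names whose flag position in the stock string holds 'Y'."""
    rows = con.execute("""
        SELECT pl.name
        FROM stock s
        JOIN product_lookup pl
          ON substr(s.flags, pl.id, 1) = 'Y'   -- position in string = lookup id
        WHERE s.id = ?
        ORDER BY pl.id
    """, (stock_id,)).fetchall()
    return [name for (name,) in rows]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE product_lookup (id INTEGER PRIMARY KEY, name TEXT)")
con.executemany(
    "INSERT INTO product_lookup VALUES (?, ?)",
    [(1, "Coke"), (2, "Fanta"), (3, "Pepsi"), (4, "Lilt"), (5, "Dr Pepper")],
)
con.execute("CREATE TABLE stock (id INTEGER PRIMARY KEY, flags TEXT)")
con.execute("INSERT INTO stock VALUES (1, 'YNYYY')")

names = in_stock_names(con, 1)
```

This relies on the lookup ids matching the flag positions exactly, which is the same convention the function-based answer depends on.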
In the end I chose to use SUBSTRING and the CASE statement so that each item appeared in its own field; this way I avoided the need to write every variable.
SELECT CASE WHEN SUBSTRING(STOCK,1,1) = 'Y' THEN 'IN STOCK' ELSE 'OUTOFSTOCK' END AS COKE
I don't know why it didn't occur to me to begin with, and without your prompting and guidance I would probably have done this the long way round. As ever, thanks to all!

Pattern matching on strings from a table in SQL

Just wanted to know if it was possible to do a pattern matching on a set of data from a table.
Like:
select * from Table where Column like any(select Pattern from PatternTable)
Note that the Pattern is always a substring of Column. Hence the use of like. Is it even possible to do this at a database level without the use of stored procedures?
If it helps, my RDBMS is MS SQL-Server
Edit:
Alright, I have a table containing a set of data like
PatternTable
____________
test1
test2
test3
test4
Now, a table Table has the following data:
Table
______
SomeDatatest4SomeData
SomeDataSomeData
Now, can I use a query as mentioned above to find a match: For the above query, this should return SomeDatatest4SomeData
You can do this using exists:
select *
from [Table] t
where exists (select 1
from PatternTable pt
where t.[Column] like '%' + pt.Pattern + '%'
);
SELECT t.*
FROM [Table] t
INNER JOIN PatternTable p ON t.[Column] LIKE '%' + p.Pattern + '%'
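The sample data from the question can be run through this kind of query in a quick sketch. It uses Python's sqlite3 module, so string concatenation is written with || instead of +, and the names data_table and col replace the reserved words Table and Column. Those names are assumptions for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data_table (col TEXT)")
con.execute("CREATE TABLE pattern_table (pattern TEXT)")
con.executemany(
    "INSERT INTO data_table VALUES (?)",
    [("SomeDatatest4SomeData",), ("SomeDataSomeData",)],
)
con.executemany(
    "INSERT INTO pattern_table VALUES (?)",
    [("test1",), ("test2",), ("test3",), ("test4",)],
)

# EXISTS returns each data row at most once, even if several patterns match it.
matched = [
    row[0]
    for row in con.execute("""
        SELECT t.col
        FROM data_table t
        WHERE EXISTS (SELECT 1
                      FROM pattern_table p
                      WHERE t.col LIKE '%' || p.pattern || '%')
    """)
]
```

The wildcards are added around the stored pattern because the patterns in the table are plain substrings. Without them LIKE would only match rows equal to the pattern.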

Writing a single UPDATE statement that prevents duplicates

I've been trying for a few hours (probably more than I needed to) to figure out the best way to write an UPDATE query that will disallow duplicates on the column I am updating.
Meaning, if TableA.ColA already has a name 'TEST1', then when I'm changing another record, I simply can't pick 'TEST1' as the value for ColA.
It's pretty easy to simply separate the query into a select, and use server-layer code that allows conditional logic:
SELECT ID, NAME FROM TABLEA WHERE NAME = 'TEST1'
IF TableA.recordcount = 0 then
UPDATE TABLEA SET NAME = 'TEST1' WHERE ID = 1234
END IF
But I'm more interested to see if these two queries can be combined into a single query.
I am using Oracle to figure things out, but I'd love to see a SQL Server query as well. I figured a MERGE statement can work, but for obvious reasons you can't have the clause:
..etc.. WHEN NOT MATCHED UPDATE SET ..etc.. WHERE ID = 1234
AND you can't update a column if it's mentioned in the join (oracle limitation but not limited to SQL Server)
ALSO, I know you can put a constraint on a column that prevents duplicate values, but I'd be interested to see if there is such a query that can do this without using constraint.
Here is an initial attempt on my end, just to see what I can come up with (an explanation of why it failed is not necessary):
ERROR: ORA-01732: data manipulation operation not legal on this view
UPDATE (
SELECT d.NAME, ch.NAME FROM (
SELECT 'test1' AS NAME, '2722' AS ID
FROM DUAL
) d
LEFT JOIN TABLEA a
ON UPPER(a.name) = UPPER(d.name)
)
SET a.name = 'test2'
WHERE a.name is null and a.id = d.id
I have tried merge, but just gave up thinking it's not possible. I've also considered not exists (but I'd have to be careful since I might accidentally update every other record that doesn't match a criteria)
It should be straightforward:
update personnel
set personnel_number = 'xyz'
where person_id = 1001
and not exists (select * from personnel where personnel_number = 'xyz');
If I understand correctly, you want to conditionally update a field, assuming the value is not found. The following query does this. It should work in both SQL Server and Oracle:
update table1
set name = 'Test1'
where (select count(*) from table1 where name = 'Test1') = 0 and
id = 1234
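The single-statement approach can be exercised with a short sketch. It uses Python's sqlite3 module rather than Oracle or SQL Server, with the NOT EXISTS form of the condition. The helper name rename_if_unused and the sample rows are assumptions for illustration.

```python
import sqlite3

def rename_if_unused(con, row_id, new_name):
    """Set the row's name only if no row already has that name; report success."""
    cur = con.execute(
        """
        UPDATE tablea
        SET name = ?
        WHERE id = ?
          AND NOT EXISTS (SELECT 1 FROM tablea WHERE name = ?)
        """,
        (new_name, row_id, new_name),
    )
    # rowcount is the number of rows the UPDATE actually changed.
    return cur.rowcount == 1

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tablea (id INTEGER PRIMARY KEY, name TEXT)")
con.executemany("INSERT INTO tablea VALUES (?, ?)", [(1, "TEST1"), (1234, "OLD")])

blocked = rename_if_unused(con, 1234, "TEST1")   # 'TEST1' is already taken
allowed = rename_if_unused(con, 1234, "TEST2")   # 'TEST2' is free
names = con.execute("SELECT id, name FROM tablea ORDER BY id").fetchall()
```

The affected-row count tells the caller whether the update went through. Note that without a unique constraint, two concurrent sessions could still both pass the check, so this pattern reduces duplicates but does not by itself guarantee uniqueness.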