Multiple replacements on a column replacement from a replacement table - sql

I have a requirement where I have a table REPLACE_Table. This table would have 2 columns: one would be Original_string and the other would be Replacement_String.
I have a cursor running on Item_master table. For each record, in the Item_description column, I need to scan for the Replace_Table/Original_string and replace it with Replace_Table/replacement_string.
For Example, if my Replace_Table has these 2 rows:
Original_string Replacement_String
--------------------------------------
LO ##
WO ()
If my first Item_Description is 'HELLO WORLD', then I should get the result as 'HEL## ()RLD'.
I cannot use recursive Replace function in SQL because I do not know the number of records in my REPLACE_Table. I cannot use XLATE because it is not character to character replacement.
Only solution I have in mind is to read the REPLACE_Table in a loop and keep replacing Item_Description column value using the REPLACE in SQL.
Is there any other good solution?

Ok, so you're dealing with outputting XML and you're concerned about special characters...
Personally, I'd look at using the CDATA section for any data which might contain special characters...
<name><![CDATA[Mike & Son's Auto]]></name>
Is handled by an XML parser just like
<name>Mike &amp; Son&apos;s Auto</name>
would be.
Also consider looking at whatever tools you might be using for web services. Scott Klement's excellent open source HTTP API includes an http_EscapeXml() procedure already.
Failing that, consider using the XMLTEXT() function built into Db2 for i
myText = 'Mike & Son''s Auto';
exec SQL
values (XMLSERIALIZE(XMLTEXT(:myText)
as varchar(50)
excluding XMLDECLARATION
)) into :myXmlText;
Although XMLTEXT() only converts & and < from what I can see...

Related

SQL Query to Split a string, format each result, then build back into a single row

I've been working on a rather convoluted process to format some data for a work project. We received a data extract and need to export it for import during a migration, but some of the data won't import due to case sensitivity (user logons with sentence case for example).
In an ideal world, I could demand the data be sanitised and formatted before it's provided for me to build the import, but I can't, so I'm stuck where I have to format it myself.
Plan:
Take string result
Split string result by pipe delimitation
Format each split results ( ) into lower case (where applicable)
Put all split results back into one string using FOR XML PATH
Example of problem:
Field 'Assigned To' can contain a pipe-delimited string of users and/or user groups, e.g.
John Smith (jsmith)|College Group of Arts|Bob Jones (BJones)
Now as you can see above, John Smith (jsmith) looks fine, as does College Group of Arts, however Bob Jones has had his logon sentence cased, so I need to use a LOWER command, chained with SUBSTRING and CHARINDEX to convert the logon to lower. Standalone, this approach works fine, but the problem I'm having is where I'm using a function found here on Stack Overflow (slightly manipulated to account for pipe delimitation) T-SQL split string.
When I retrieve the table results of the split string, I can't apply CHARINDEX against any characters in the result string, and I can't work out why.
Scenario:
The raw data extract, untouched, returns the below when queried;
|College of Science Administrators|Bob Jones (BJones)|
I then apply the below query, which calls the function queried above;
declare @assignedto nvarchar(max) = (select assigned_to from project where project_id = 1234)
SELECT SUBSTRING(Name,CHARINDEX(Name,'('),255)
FROM dbo.splitstring(@assignedto)
I then get the below results;
College of Science Administrators
Bob Jones (BJones)
What I'd expect to see is;
College of Science Administrators
(BJones)
I could then apply my LOWER logic to change it to lower case.
If that worked, then thought process was then to take those results and pass them back into a single string using a FOR XML PATH.
So I guess technically, there are 2 questions here;
Why won't my function let me manipulate the results with CHARINDEX?
And is this the best way to do what I'm trying to achieve overall?
I would strongly suggest you take that splitstring function you found and throw it away. It is horribly inefficient and doesn't even take the delimiter as a parameter. There are so many better splitter options available. One such example is the DelimitedSplit8K_LEAD which can be found here.
I noticed you also have your delimiters at the beginning and the end so you have to eliminate those but not a big deal. Here is how I would go about parsing this string. I am using a variable for your string here with the value you said is in your table.
declare @Something varchar(100) = '|College of Science Administrators|Bob Jones (BJones)|'
select MyOutput = case when charindex('(', x.Item) > 1 then substring(x.Item, charindex('(', x.Item), len(x.Item)) else x.Item end
from dbo.DelimitedSplit8K_LEAD(@Something, '|') x
where x.Item > ''
For question #1, you simply need to swap the arguments to CHARINDEX. It takes the string to search for first and the string to search in second:
CHARINDEX('(', Name)
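To illustrate the corrected argument order, here is a minimal T-SQL sketch that lower-cases only the logon in parentheses. The variable @name and the alias fixed_name are made up for the example, and it assumes each value has at most one parenthesised logon, at the end of the string.

```sql
-- Hypothetical sample value; in practice this would be one row from the splitter.
DECLARE @name nvarchar(255) = N'Bob Jones (BJones)';

SELECT CASE
         WHEN CHARINDEX('(', @name) > 0
           -- Keep everything before '(' unchanged and lower-case the rest.
           THEN LEFT(@name, CHARINDEX('(', @name) - 1)
                + LOWER(SUBSTRING(@name, CHARINDEX('(', @name), LEN(@name)))
         ELSE @name  -- no logon present (e.g. a group name)
       END AS fixed_name;
-- returns: Bob Jones (bjones)
```

The CASE guard matters because CHARINDEX returns 0 when the character is not found, and LEFT with a length of -1 would raise an error.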

SELECT middle part of a String if it exists. Postgresql

I've got a problem with transferring "real-world" data into my schema.
It's actually a "project" for my database course, and they gave us a table with dog race results. This table has a column which contains the name of the dog (which itself consists of the actual name and the name of the breeder) and information about the birth country, the current country of residence, and the birth year.
Example fields are "Lillycette [AU 2012]" or "Black Bear Lee [AU/AU 2013]" or "Lemon Ralph [IE/UK 1998]".
I've managed it to get out the first word and save it in the right column with split_part like this:
INSERT INTO tblHund (rufname)
SELECT
split_part(name, ' ', 1) AS rufname
FROM tblimport;
tblimport is a table where I dumped the data from the csv file.
That works just as it should.
Accessing the second part of the name this way fails because sometimes there isn't a second part and sometimes the second part consists of two words.
And this is where I am stuck right now.
I tried it with substring and regular expressions:
INSERT INTO tblZwinger (Name)
SELECT
substring(vatertier from E'[^ ]*\\ ( +)$')AS Name
FROM tblimport
WHERE substring(vatertier from E'[^ ]*\\ ( +)$') != '';
The above code executes without errors but actually does nothing, because the SELECT statement just returns empty strings.
It took me more than 3 hours to understand a bit of these regular expressions, but I still feel pretty stupid when I look at them.
Is there any other way of doing this? If so, just give me a hint.
If not, what is wrong with my expression above?
Thanks for your help.
Your capturing group ( +) matches only spaces. You need to use the atom ., which matches any single character, inside the capturing group:
E'[^ ]*\\ (.+)$'
SELECT
tblimport.*,
ti.parts[1] as f1,
ti.parts[2] as f2, -- It should be the "middle part"
ti.parts[3] as f3
FROM
tblimport,
regexp_matches(tblimport.vatertier, '([^\s]+)\s*(.*)\s+\[(.*)\]') as ti(parts)
WHERE
nullif(ti.parts[2], '') is not null
Something like above.
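As a rough sketch of the same idea on the sample values from the question, the query below uses regexp_match (available in PostgreSQL 10 and later) with an inline VALUES list standing in for tblimport. The pattern and the column aliases are illustrative assumptions, not a tested import script.

```sql
SELECT t.name,
       m[1]        AS rufname,   -- first word
       rtrim(m[2]) AS zwinger,   -- middle part, empty string when absent
       m[3]        AS details    -- text inside the square brackets
FROM (VALUES ('Lillycette [AU 2012]'),
             ('Black Bear Lee [AU/AU 2013]'),
             ('Lemon Ralph [IE/UK 1998]')) AS t(name)
-- [^[]* means "any run of characters other than '['";
-- [^]]* means "any run of characters other than ']'".
CROSS JOIN LATERAL regexp_match(t.name, '^(\S+)\s*([^[]*)\[([^]]*)\]$') AS m;
```

Unlike regexp_matches in the FROM clause, regexp_match returns NULL instead of dropping the row when a value does not match, so malformed names stay visible in the result.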

SQL -- SELECT statement -- concatenate strings to

I have an SQL question. Everything works fine in the below SELECT statement except the portion I have highlighted in bold. What I'm trying to do is allow the user to search for a specific Rule within the database. Unfortunately, I do not actually have a Rule column, and so I need to concatenate certain field values to create a string with which to compare to the user's searchtext.
Any idea why the part in bold does not work? In theory, I would like this statement to check for whether the string "Rule " + part_num (where part_num is the value contained in the part_num field) equals the value of searchtext (the value of searchtext is obtained from my PHP script).
I did some research on concatenating strings for SQL purposes, but none seem to fit the bill. Does someone out there have any suggestions?
SELECT id,
part_num,
part_title,
rule_num,
rule_title,
sub_heading_num,
sub_heading,
contents
FROM rules
WHERE part_title LIKE "%'.$searchtext.'%"
OR rule_title LIKE "%'.$searchtext.'%"
OR sub_heading LIKE "%'.$searchtext.'%"
OR contents LIKE "%'.$searchtext.'%"
OR "rule" + part_num LIKE "%'.$searchtext.'%" --RULE PLUS PART_NUM DOESN'T WORK
ORDER BY id;
Since you didn't specify which DB you're using, I'm going to assume SQL Server.
Strings are specified in SQL Server with single quotes ('I''m a string', with an embedded apostrophe doubled), not double quotes.
See + (String Concatenation) on MSDN for examples.
Another possibility is that part_num is numeric. If so, cast the number to a string (varchar) before concatenating.
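A minimal T-SQL sketch of both points, assuming part_num is an integer column. The inline rows and the @searchtext variable are placeholders standing in for the real rules table and the value passed from PHP.

```sql
-- Hypothetical search value; the real one would come from the application.
DECLARE @searchtext varchar(100) = 'Rule 12';

SELECT id, part_num
FROM (VALUES (1, 7), (2, 12)) AS rules(id, part_num)
-- Single-quoted literals; the number is cast before concatenating with +.
WHERE 'Rule ' + CAST(part_num AS varchar(10)) LIKE '%' + @searchtext + '%';
-- returns the row with id = 2
```

Without the CAST, SQL Server would try to convert 'Rule ' to a number, because the numeric type takes precedence over the string, and the query would fail with a conversion error.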

How to extract numerical data from SQL result

Suppose there is a table "A" with 2 columns - ID (INT), DATA (VARCHAR(100)).
Executing "SELECT DATA FROM A" results in a table looks like:
DATA
---------------------
Nowshak 7,485 m
Maja e Korabit (Golem Korab) 2,764 m
Tahat 3,003 m
Morro de Moco 2,620 m
Cerro Aconcagua 6,960 m (located in the northwestern corner of the province of Mendoza)
Mount Kosciuszko 2,229 m
Grossglockner 3,798 m
// the DATA continues...
---------------------
How can I extract only the numerical data using some kind of string processing function in the SELECT SQL query so that the result from a modified SELECT would look like this:
DATA (in INTEGER - not varchar)
---------------------
7485
2764
3003
2620
6960
2229
3798
// the DATA in INTEGER continues...
---------------------
By the way, it would be best if this could be done in a single SQL statement. (I am using IBM DB2 version 9.5)
Thanks :)
I know this thread is old, and the OP doesn't need the answer, but I had to figure this out with a few hints from this and other threads. They all seem to be missing the exact answer.
The easy way to do this is to TRANSLATE all unneeded characters to a single character, then REPLACE that single character with an empty string.
DATA = 'Nowshak 7,485 m'
# removes all characters, leaving only numbers
REPLACE(TRANSLATE(TRIM(DATA), '_____________________________________________________________________________________________', ' abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ`~!@#$%^&*()-_=+\|[]{};:",.<>/?'), '_', '')
=> '7485'
To break down the TRANSLATE command:
TRANSLATE( FIELD or String, <to characters>, <from characters> )
e.g.
DATA = 'Sample by John'
TRANSLATE(DATA, 'XYZ', 'abc')
=> a becomes X, b becomes Y, c becomes Z
=> 'SXmple Yy John'
** Note: I can't speak to performance or version compatibility. I'm on a 9.x version of DB2, and new to the technology. Hope this helps someone.
In Oracle:
SELECT TO_NUMBER(REGEXP_REPLACE(data, '[^0-9]', ''))
FROM a
In PostgreSQL:
SELECT CAST(REGEXP_REPLACE(data, '[^0-9]', '', 'g') AS INTEGER)
FROM a
In MS SQL Server and DB2, you'll need to create UDF's for regular expressions and query like this.
See links for more details.
Doing a quick search online for DB2, the best built-in function I can find is TRANSLATE. It lets you specify a list of characters you want to change to other characters. It's not ideal, but you can specify every character that you want to strip out, that is, every non-numeric character available...
(Yes, that's a long list, a very long list, which is why I say it's not ideal)
TRANSLATE('data', 'abc...XYZ,./\<>?|[and so on]', ' ')
Alternatively you need to create a user defined function to search for the number. There are a few alternatives for that.
Check each character one by one and keep it only if it's a numeric.
If you know what precedes the number and what follows the number, you can search for those and keep what is in between...
To elaborate on Dems's suggestion, the approach I've used is a scalar user-defined function (UDF) that accepts an alphanumeric string, recursively iterates through it (one byte per iteration), and suppresses the non-numeric characters from the output. The recursive expression generates a row per iteration, but only the final row is kept and returned to the calling application.

Is it possible to get the matching string from an SQL query?

If I have a query to return all matching entries in a DB that have "news" in the searchable column (i.e. SELECT * FROM table WHERE column LIKE '%news%'), and one particular row has an entry starting with "In recent World news, Somalia was invaded by ...", can I return a specific "chunk" of an SQL entry? Kind of like a teaser, if you will.
select substring(column,
CHARINDEX ('news',lower(column))-10,
20)
FROM table
WHERE column LIKE '%news%'
basically substring the column starting 10 characters before where the word 'news' is and continuing for 20.
Edit: You'll need to make sure that 'news' isn't in the first 10 characters and adjust the start position accordingly.
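One way to make that adjustment, sketched in T-SQL with made-up variable names (@txt, @pos) and the sample sentence from the question: clamp the start position to 1 whenever the match is within the first 10 characters.

```sql
DECLARE @txt varchar(200) = 'In recent World news, Somalia was invaded by ...';
DECLARE @pos int = CHARINDEX('news', LOWER(@txt));  -- 17 for this sample

SELECT SUBSTRING(@txt,
                 CASE WHEN @pos > 10 THEN @pos - 10 ELSE 1 END,  -- clamp start to 1
                 20) AS teaser;
-- returns: ent World news, Soma
```

In SQL Server, a start position below 1 does not raise an error. It silently shortens the returned piece, which is why the unclamped version can produce teasers that are cut short or empty.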
You can use substring function in a SELECT part. Something like:
SELECT SUBSTRING(column, 1, 20) FROM table WHERE column LIKE '%news%'
This will return the first 20 characters from column column
I had the same problem, I ended up loading the whole field into C#, then re-searched the text for the search string, then selected x characters either side.
This will work fine for LIKE, but not full text queries which use FORMS OF INFLECTION because that may match "women" when you search for "woman".
If you are using MSSQL, you can use all kinds of VB-like substring functions as part of your query.