SQL: LIKE query on column with chr(10) or chr(13) - sql

I am doing a
select field1
from tablename
where field1 like '%xyz zrt bla bla trew%'
field1 is a CLOB column, and between the words of 'xyz zrt bla bla trew' there might be newline characters such as chr(10) or chr(13). So the stored value might be 'xyz\r\n\rzrt bla bla\n\n trew', etc.
These are being converted to one or more spaces, so any gap between words can consist of true spaces or of one or more of those newline characters.
How do I take this into account in my LIKE?
I am using Oracle but if possible I would like to use something that works for SQL Server, etc.

The lazy solution (relying on regular expressions, which may or may not kill performance - which may or may not matter) would be something like this:
select field1
from tablename
where regexp_like(field1, 'xyz\s+zrt\s+bla\s+bla\s+trew')
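To see why \s+ copes with the mixed line breaks, here is a minimal sketch in Python rather than Oracle. It assumes Python's re module treats \s the same way for spaces, CR and LF, and the sample value is made up for illustration.

```python
import re

# Hypothetical sample value: words separated by a mix of CR, LF and spaces.
field1 = "header xyz\r\n\rzrt bla bla\n\n trew footer"

# Same pattern as in the regexp_like call: \s+ matches one or more
# whitespace characters, which includes space, CR and LF.
pattern = re.compile(r"xyz\s+zrt\s+bla\s+bla\s+trew")

found = pattern.search(field1) is not None
print(found)
```

Because \s+ requires at least one whitespace character, a value with the words run together (for example 'xyzzrt bla bla trew') would not match.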

For Oracle you can use INSTR(field1, chr(10)) > 0
For SQL Server you can use field1 LIKE '%' + CHAR(10) + '%'
or CHARINDEX(CHAR(10), field1) > 0

You can replace the \n and \r characters with spaces first, collapse the repeated spaces, and then compare. That always works and is perhaps fast enough.
To properly speed up the search, however, you can store the normalized text in a dedicated column such as "field1_trim".
Depending on your needs, storing the normalized text in a temporary table might be enough and gives a better balance between space and speed.
For example, save the result of the following query into a temporary table and then run your query against it:
select
regexp_replace(field1, '[[:space:]]+', chr(32))
field1_trim,
<some unique row id to map to original table>
from table1;
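The normalize-then-compare idea can be sketched outside the database as well. This is only an illustration in Python, under the assumption that collapsing every whitespace run to a single space is what you want; normalize_whitespace and the sample text are hypothetical names and data.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse every run of whitespace (spaces, CR, LF, tabs) into one space."""
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical stored value with mixed line breaks between the words.
raw = "abc xyz\r\n\rzrt bla bla\n\n trew def"

# This plays the role of the precomputed field1_trim column.
field1_trim = normalize_whitespace(raw)

# A plain substring test is the equivalent of LIKE '%xyz zrt bla bla trew%'.
matches = "xyz zrt bla bla trew" in field1_trim
print(field1_trim, matches)
```

The search text has to be normalized the same way, otherwise a double space in the search string would never match.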

Related

Identify and remove records surrounded in quotes

I have imported a table containing roughly 100,000 records, some of which need to be removed. I'd like to identify and remove all records where a particular field (called MyQuery) contains words surrounded by quotation marks, but only if there are exactly two quotation marks in the field.
For example I would like to remove
"This is a test"
--but not--
"This is "a" test"
Thank you kindly for any assistance
My apologies to all of you for the terrible structure and clarity of my question. I think I was able to get to the answer based on your suggestions though!
select * from <table> where LEN(myquery) - LEN(REPLACE(myquery,'"','')) = 2 and myquery like '"%"'
Thank you all so much - you're the best!
You can do it like this:
DELETE
-- SELECT * -- To test first!
FROM YourTable
WHERE LEN(MyQuery) - LEN(REPLACE(MyQuery, '"', '')) = 2
AND MyQuery LIKE '"%"';
You are basically comparing the length of the field before and after removing all quotes. This is an easy way to determine the number of occurrences of a specific character.
After your comment, I added another condition so that the quotes always surround the field.
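The same length-difference trick, sketched in Python to make the logic easy to check. The function name is hypothetical, and the startswith/endswith pair stands in for the LIKE '"%"' condition.

```python
def has_exactly_two_surrounding_quotes(value: str) -> bool:
    # Length before minus length after stripping quotes = number of quotes.
    quote_count = len(value) - len(value.replace('"', ""))
    # Exactly two quotes, and they must be the first and last characters.
    return quote_count == 2 and value.startswith('"') and value.endswith('"')


samples = ['"This is a test"', '"This is "a" test"', 'say "hi" now']
flags = [has_exactly_two_surrounding_quotes(s) for s in samples]
print(flags)
```

Only the first sample would be selected for deletion: the second has four quotes, and the third has two quotes that do not surround the whole value.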
You can use the following query to delete all records that contain '"', excluding those that start and end with '"':
DELETE From <TABLENAME> Where COL like '%"%' and (COL not like '"%' Or COL not like '%"')
Or, to delete all records that have more than 2 occurrences of ", you can use:
DELETE FROM <TABLENAME> WHERE (LEN(COL)-len(replace(COL,'"',''))) > 2

SQL - SELECT all the records limited to certain characters

I am pretty new to SQL so I desperately need support with the following matter:
one of the columns of the table I am querying contains very long and various text values: so, I would like to limit the output to some kinds only.
I would like to get as output of the query the same content, limited to [a-z][A-Z][0-9], so for example:
Original table: Hello /% World, 2019
Result query: Hello World 2019
Does anybody have any idea?
Thank you very much
Assuming regexp_replace is supported,
regexp_replace(column, '[^a-zA-Z0-9]', '')
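One thing to watch with that character class: it also removes the spaces, so the sample becomes 'HelloWorld2019' rather than 'Hello World 2019'. A rough Python sketch of the difference follows, assuming the regex engine in your database treats these simple classes the same way; the variable names are just for illustration.

```python
import re

original = "Hello /% World, 2019"

# Same class as above: everything except letters and digits is removed,
# including the spaces between the words.
letters_digits_only = re.sub(r"[^a-zA-Z0-9]", "", original)

# Adding a space to the class keeps word separation, but leaves a double
# space where "/%" used to sit between two spaces.
keep_spaces = re.sub(r"[^a-zA-Z0-9 ]", "", original)

# Collapsing runs of spaces gives the output asked for in the question.
collapsed = re.sub(r" +", " ", keep_spaces)

print(letters_digits_only, "|", keep_spaces, "|", collapsed)
```

In SQL the equivalent would be two nested regexp_replace calls, the inner one with a space added to the class and the outer one collapsing repeated spaces.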
You didn't mention which SQL engine you are using, but most engines support the string function REPLACE, which you can use to substitute any character you want with an empty string to remove it. You will have to nest several of these calls to get rid of the different characters. Check the documentation of your database engine for the syntax.
For example:
SELECT REPLACE(REPLACE('Hello /% World, 2019', '/', ''), '%','')
In general, I would argue that this is not the job of the database engine, and you can do it in your presentation layer.
HTH
$string = 'Hello /% World, 2019';
Strip all characters but letters and numbers from a PHP string:
$res = preg_replace("/[^a-zA-Z0-9]/", "", $string);
Strip all characters but letters, numbers, and whitespace:
$res = preg_replace("/[^a-zA-Z0-9\s]/", "", $string);
BEST OPTION: Remove all special characters leaving no spaces
SELECT REGEXP_REPLACE(your_column, '[^0-9A-Za-z]', '') AS newfield
FROM tablename;
Remove many characters, leaving spaces for the same number of characters removed
SELECT TRANSLATE(columnname,'!##$%^&*()',
'          ' ) AS newfield
FROM tablename;
If you just want to replace one character
SELECT REPLACE(columnname, '%', '') AS newfield
From tablename;
If you are looking for something in particular (for example, you want to find the words "Hello World 2019"):
Select 'Hello World 2019' AS newfield
From tablename
WHERE columnname like 'Hello%' AND columnname like '%World%' AND columnname like '%2019';
This assumes Hello is always at the beginning and 2019 is always at the end. If you want other fields in the output, select those as well, for example:
Select TableID, columnname, 'Hello World 2019' AS newfield
From tablename
WHERE columnname like 'Hello%' AND columnname like '%World%' AND columnname like '%2019';

How to match text ending with a text on DB2?

I've executed the query on DB2:
SELECT * FROM MYTABLE WHERE MYFIELD LIKE '%B'
Although I know there are records that end with 'B' in the database, the query returned no results. After some research, it seems that DB2 is not recognizing LIKE expressions that don't end with '%'. So the following query would work:
SELECT * FROM MYTABLE WHERE MYFIELD LIKE '%B%'
but naturally not as expected, because it returns every row where MYFIELD contains 'B', not only the rows that end with it.
How do I work around that strange, hmmm, feature? How do I match text at the end of a value in LIKE-style expressions?
DB2 can match patterns at the end of the string. The problem is probably that there are other characters after the 'B', such as the trailing blanks that pad a fixed-length CHAR column.
You can try:
WHERE rtrim(MYFIELD) LIKE '%B'
You can also look at the lengths of the field and delimit the string value to see if there are other characters:
select length(MyField), '|' || MyField || '|'
from mytable
where MyField like '%B%';
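The effect of trailing blanks can be reproduced with a small sketch. This one uses SQLite through Python's sqlite3 module purely as a stand-in, since it needs no server. It is not DB2, and the padded value is inserted by hand to imitate what a fixed-length CHAR column would store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (myfield TEXT)")
# 'AB   ' imitates a value padded with blanks, as a CHAR(5) column would hold it.
conn.executemany("INSERT INTO mytable VALUES (?)", [("AB   ",), ("AB",), ("BA",)])

# Plain LIKE '%B' misses the padded row because its last character is a blank.
plain = conn.execute(
    "SELECT myfield FROM mytable WHERE myfield LIKE '%B' ORDER BY rowid"
).fetchall()

# Trimming first makes the padded row match as well.
trimmed = conn.execute(
    "SELECT myfield FROM mytable WHERE rtrim(myfield) LIKE '%B' ORDER BY rowid"
).fetchall()

# Length plus delimiters makes the hidden blanks visible.
delimited = conn.execute(
    "SELECT length(myfield), '|' || myfield || '|' FROM mytable "
    "WHERE myfield LIKE '%B%' ORDER BY rowid"
).fetchall()

print(plain, trimmed, delimited)
```

The delimited output is the quickest diagnostic: if the closing '|' does not sit right after the 'B', something follows it in the stored value.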
If 'like' operator does not work for you, you can try regular expressions (with xQuery)

How can I output a record as a string with the fields separated by commas in SQL?

I'm using SQL Server 2000 and I want to return a string like the following from this table:
Field1 | Field2 | Field3
ABC | 123 | abc
DEF | 456 | def
Output Desired:
"ABC,123,abc"
"DEF,456,def"
I believe I can do this by simple concatenation, but it feels messy. What is the best way to do this?
NOTE: This is a simple example, the actual use has 9+ fields and may have nulls.
try:
select
'"'
+ISNULL(CONVERT(varchar(1000),Field1),'null')
+ ',' + ISNULL(CONVERT(varchar(1000),Field2),'null')
+ ',' + ISNULL(CONVERT(varchar(1000),Field3),'null')
+'"' AS ResultColumn
from MyTable
just adjust the 1000 in each CONVERT(varchar(1000), as necessary
You can use coalesce or simple concatenation as you mentioned.
Select Field1 + ','+ Field2 + ',' + Field3
Beware of null values, though: if Field1, Field2 or Field3 is null, you will get a NULL result. This behavior can be changed, but that is not recommended.
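The NULL pitfall is easy to reproduce. The sketch below uses SQLite via Python's sqlite3 module as a stand-in, so it concatenates with || instead of T-SQL's +, and the second row with a NULL in Field2 is made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Field1 TEXT, Field2 TEXT, Field3 TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [("ABC", "123", "abc"), ("DEF", None, "def")],  # second row has a NULL
)

# Plain concatenation: one NULL field turns the whole result into NULL.
naive = [row[0] for row in conn.execute(
    "SELECT Field1 || ',' || Field2 || ',' || Field3 FROM MyTable ORDER BY rowid"
)]

# COALESCE substitutes an empty string, so the other fields survive.
guarded = [row[0] for row in conn.execute(
    "SELECT COALESCE(Field1, '') || ',' || COALESCE(Field2, '') || ',' "
    "|| COALESCE(Field3, '') FROM MyTable ORDER BY rowid"
)]

print(naive, guarded)
```

Whether an empty string or a marker such as 'null' is the better substitute depends on what will read the output.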
Hmm, why are you looking for a more complex answer than concatenation? If you want something "cleaner", do the concatenation in the code that calls the DB server; that way you do not put the CPU burden of concatenating on the server.
One thing no one has mentioned that always comes back to bite people in the butt is that if your delimiter (comma) is contained in a data field, then you will have extra commas and your data will be ambiguous. (For example, does "a,b,c,d" contain three fields or four? If the second field actually contains "b,c", then your export is malformed.)
The CSV format has been applied slightly differently over time, so there is no single standard way to handle this situation. The way I personally prefer to handle it is to enclose the field values in quotes and then escape any quotes inside them by doubling them. So, for example, suppose you have a record with the following data:
Field1: abc
Field2: def"1
Field3: 12,""3,b
The CSV record will be written as:
"abc","def""1","12,""""3,b"
That format avoids all possible ambiguity in field data.
Parsing it back is simple and deterministic as well. The other option is to use a field delimiter that you think can never happen in actual field data. This is a quick way to handle it, but then leaves a time bomb in your program for another programmer to fix later :-)
EDIT: If you are not going to be reading back the data, then, of course, you should see how the program reading the data handles importing data that contains delimiter characters as part of the field data that should not be interpreted as field delimiters.
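For what it's worth, the quote-everything-and-double-the-quotes convention is what Python's csv module produces with QUOTE_ALL, so the example record can be checked with a short sketch. This only illustrates the format; it is not something SQL Server 2000 does for you.

```python
import csv
import io

# The example record from above, including the embedded quotes and comma.
record = ["abc", 'def"1', '12,""3,b']

buffer = io.StringIO()
# QUOTE_ALL encloses every field in quotes; embedded quotes are doubled.
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
writer.writerow(record)
line = buffer.getvalue().rstrip("\n")

# Reading the line back recovers the original three fields.
parsed = next(csv.reader([line]))

print(line)
print(parsed)
```

The round trip back to three fields is the point: the embedded comma and quotes no longer create ambiguity.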
The simplest is concatenation. Remember, keep it simple for the sake of understanding, readability and maintainability. No need for clever code.
select (Field1 + ',' + Field2 + ',' + Field3) as result
from MyTable
Here's a link: http://msdn.microsoft.com/en-us/library/aa276862(SQL.80).aspx
Remark
When you concatenate null values, either the concat null yields null setting of sp_dboption or SET CONCAT_NULL_YIELDS_NULL determines the behavior when one expression is NULL. With either concat null yields null or SET CONCAT_NULL_YIELDS_NULL enabled ON, 'string' + NULL returns NULL. If either concat null yields null or SET CONCAT_NULL_YIELDS_NULL is disabled, the result is 'string'.
SELECT COALESCE(Field1,'''''') +','+ COALESCE(Field2,'''''') +',' +COALESCE(Field3,'''''') FROM Table1

SQL (MySQL): Match first letter of any word in a string?

(Note: This is for MySQL's SQL, not SQL Server.)
I have a database column with values like "abc def GHI JKL". I want to write a WHERE clause that includes a case-insensitive test for any word that begins with a specific letter. For example, that example would test true for the letters a, d, g, j because there's a 'word' beginning with each of those letters. The application is for a search that offers to find records that have words beginning with the specified letter. Also note that there is not a fulltext index for this table.
You can use a LIKE operation. If your words are space-separated, append a space to the start of the string to give the first word a chance to match:
SELECT
StringCol
FROM
MyTable
WHERE
' ' + StringCol LIKE '% ' + MyLetterParam + '%'
Where MyLetterParam could be something like this:
'[acgj]'
To look for more than a space as a word separator, you can expand that technique. The following would treat TAB, CR, LF, space and NBSP as word separators.
WHERE
' ' + StringCol LIKE '%['+' '+CHAR(9)+CHAR(10)+CHAR(13)+CHAR(160)+'][acgj]%'
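Be aware that this is T-SQL (SQL Server) syntax rather than standard SQL: the + string concatenation and the [...] character class inside LIKE are not supported by MySQL, where LIKE only understands % and _. On MySQL, use a REGEXP-based approach instead.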
Using the REGEXP operator:
SELECT * FROM `articles` WHERE `body` REGEXP '[[:<:]][acgj]'
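It returns records where the column body contains words starting with a, c, g or j (case insensitive). Note that the [[:<:]] word-boundary marker belongs to the older MySQL regex library; MySQL 8.0.4 and later use ICU, where the word boundary is written \\b instead.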
Be aware though: this is not a very good idea if you expect any heavy load (not using index - it scans every row!)
Check the Pattern Matching and Regular Expressions sections of the MySQL Reference Manual.
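The word-boundary idea behind the REGEXP answer can be tried out quickly in Python. This is only a sketch of the matching logic, assuming \b in Python's re module is close enough to MySQL's word-boundary marker for plain letters; has_word_starting_with is a hypothetical helper.

```python
import re


def has_word_starting_with(text: str, letters: str) -> bool:
    """True if any word in text begins with one of the given letters."""
    # \b is a word boundary; IGNORECASE makes 'g' match 'GHI' as well.
    pattern = re.compile(r"\b[" + re.escape(letters) + r"]", re.IGNORECASE)
    return pattern.search(text) is not None


value = "abc def GHI JKL"
results = {letter: has_word_starting_with(value, letter) for letter in "adgjcx"}
print(results)
```

With this sample value the letter c does not match, because 'c' only occurs inside 'abc' and never at the start of a word.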