SQL Text Function for special characters - sql

I have a field containing text reviews, and I want to spot where people have used special characters to get offensive words past the filters: instead of typing badword they type b.a.d.w.o.r.d or b*a*d*w*o*r*d.
Is there a way to look for, say, 3 or more special characters within a word in a text review, perhaps some sort of count function for special characters?

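If you have a table with a field containing the words you don't want to allow, you could use REGEXP_REPLACE in your WHERE clause like so. The pattern strips every character except letters and apostrophes before the comparison, so b.a.d.w.o.r.d is compared as badword. Note that the comparison is against the whole field, and that NOT IN returns the rows that pass the filter; use IN instead to list the offending rows.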
SELECT yourfield
FROM yourtable
WHERE REGEXP_REPLACE(yourfield,'[^a-zA-Z'']','') NOT IN (SELECT badwords
FROM badwordstable)
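
As a rough sketch of the same idea outside the database, the Python below strips non-letters from each whitespace-separated token, compares what is left with a bad-word list, and also counts the special characters per token, as the question asks. The names `flag_tokens` and `BAD_WORDS` and the threshold of 3 are illustrative assumptions, not part of the answer above.

```python
import re

# Hypothetical bad-word list; in the answer above this lives in badwordstable.
BAD_WORDS = {"badword"}

def flag_tokens(review, min_specials=3):
    """Return tokens that hide a listed word behind special characters."""
    flagged = []
    for token in review.split():
        # Count characters that are neither letters nor digits.
        specials = len(re.findall(r"[^A-Za-z0-9]", token))
        # Same stripping idea as REGEXP_REPLACE(yourfield, '[^a-zA-Z]', '').
        letters_only = re.sub(r"[^A-Za-z]", "", token).lower()
        if specials >= min_specials and letters_only in BAD_WORDS:
            flagged.append(token)
    return flagged
```

Calling `flag_tokens("this is b.a.d.w.o.r.d indeed")` returns the single obfuscated token, while a plainly typed word is left to the ordinary filter because it contains no special characters.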

Related

How can I return Special Characters in SQL?

I've been searching and couldn't find exactly what I need.
I have SQL Server 2008.
I have some strings in a table with special characters such as "!,;:-()" and I'm trying to write a script that returns these characters, BUT ONLY THOSE CHARACTERS (do you get it?)
For example, I have "Borges, Ricardo" and I need to return only the ","
Another example: "Calle 13 Nº 34, Mercedes Bs As ( 6600)", and I only need the "º,()"
I don't want to get the letters or numbers.
I wrote this simple script:
SELECT name FROM table WHERE name LIKE '%[^A-Za-z0-9 ]%'
so I can get all the rows that contain a special character such as (,;.:-()&%$... but I need to RETURN ONLY the special characters. Get it? No letters, no numbers, only the special characters in the row.
Thank you very much for your help!
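
The requested output can be described as "delete every letter, digit and space, keep the rest in order". A minimal sketch of that rule, assuming the strings can be processed outside the database (the function name `only_specials` is hypothetical):

```python
import re

def only_specials(text):
    """Drop ASCII letters, digits and spaces; keep everything else in order."""
    return re.sub(r"[A-Za-z0-9 ]", "", text)

print(only_specials("Borges, Ricardo"))                         # ,
print(only_specials("Calle 13 Nº 34, Mercedes Bs As ( 6600)"))  # º,()
```

The character class is the complement of the one used in the LIKE pattern above, so any row matched by that query yields a non-empty result here.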

Getting the Column containing the non-english language in ORACLE

I have the above entries in my database. My requirement is to extract the fields containing non-English characters (including data that combines English and non-English characters, like the HotelName field for ID 45).
I tried the regexp_like function, looking for alphanumeric and non-alphanumeric characters, but some of my data contains a combination of both, and the condition fails there.
Thanks in Advance
Raghavan
Does this do what you want?
where regexp_like(hotelname, '[^a-zA-Z0-9 ]')
That is, where the hotel name contains any character that is not a "letter" or digit. You may need to take additional characters into account as well, such as commas, periods, and hyphens.
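
To see how that character class behaves on mixed data, here is a small sketch using the same pattern in Python. The function name and sample hotel names are made up for illustration.

```python
import re

# Same class as in the regexp_like call: anything that is not
# an ASCII letter, a digit, or a space.
NON_ALNUM = re.compile(r"[^a-zA-Z0-9 ]")

def has_non_english(name):
    """True when the name contains at least one character outside the class."""
    return NON_ALNUM.search(name) is not None
```

A name that mixes English and non-English characters is flagged because a single offending character is enough. A name such as "Hotel, Paris" is flagged too, which is why punctuation may need to be added to the class.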

Remove unnecessary Characters by using SQL query

Do you know how to remove the kinds of characters below in a single query?
Note: I'm retrieving this data from the Access app and putting only the valid data into SQL.
select DISTINCT ltrim(rtrim(a.Company)) from [Legacy].[dbo].[Attorney] as a
This is the company name column. I need to keep only rows with string characters, and remove numbers-only rows, numbers-and-characters rows, NULL, empty rows, and all others such as +, -.
Based on your extremely vague "rules" I am going to make a guess.
Maybe something like this will be somewhere close.
select DISTINCT ltrim(rtrim(a.Company))
from [Legacy].[dbo].[Attorney] as a
where LEN(ltrim(rtrim(a.Company))) > 1
and IsNumeric(a.Company) = 0
This keeps only entries that are at least 2 characters long and can't be converted to a number.
This should select the rows you want to delete:
where company not like '%[a-zA-Z]%' or -- contains no letters at all
company like '%[^ a-zA-Z0-9.&]%' -- contains a character outside the allowed list
The list of allowed characters in the second expression may not be complete.
If this works, then you can easily adapt it for a delete statement.
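
One way to try out such rules before running a delete is to express them as a small predicate. The sketch below treats a row as deletable when it is NULL or empty, has no letters, or contains a character outside the allowed list; combining the two pattern checks with "or" and the function name `should_delete` are assumptions based on the question's description.

```python
import re

def should_delete(company):
    """Apply the two pattern checks plus the NULL/empty rule."""
    if company is None or company.strip() == "":
        return True
    # Mirrors: company not like '%[a-zA-Z]%'
    no_letters = re.search(r"[a-zA-Z]", company) is None
    # Mirrors: company like '%[^ a-zA-Z0-9.&]%'
    bad_char = re.search(r"[^ a-zA-Z0-9.&]", company) is not None
    return no_letters or bad_char
```

With these rules "12345" and "A+B" are deleted while "Smith & Sons Ltd." is kept. A value that mixes letters and digits is kept as well, because digits are in the allowed list.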

SQL: insert space before numbers in string

I have a nvarchar field in my table, which contains all sorts of strings.
Where a string contains a number that follows a non-digit character, I want to insert a space before that number.
That is, if a certain entry in that field is abc123, it should be turned into abc 123, and ab12.34 should become ab 12. 34. I want this to be done throughout the entire table.
What's the best way to achieve it?
You can try something like this:
select left(col,PATINDEX('%[0-9]%',col)-1 )+space(1)+
case
when PATINDEX('%[.]%',col)<>0
-- from the first digit up to and including the first '.'
then substring(col,PATINDEX('%[0-9]%',col),PATINDEX('%[.]%',col)+1-PATINDEX('%[0-9]%',col))
+space(1)+
-- everything after the first '.'
substring(col,PATINDEX('%[.]%',col)+1,len(col)+1-PATINDEX('%[.]%',col))
else substring(col,PATINDEX('%[0-9]%',col),len(col)+1-PATINDEX('%[0-9]%',col))
end
from tab
It's not simple, but I hope it helps.
I used these functions: LEFT, PATINDEX, SPACE, SUBSTRING and LEN, together with PATINDEX wildcard patterns (a limited pattern syntax rather than full regular expressions).
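
For comparison, where a real regular-expression engine is available the rule "insert a space wherever a digit follows a non-digit" fits in one substitution. This is a sketch in Python rather than T-SQL; the function name is hypothetical, and skipping positions already preceded by whitespace is an added assumption.

```python
import re

# Zero-width position: the previous character is neither a digit nor
# whitespace, and the next character is a digit.
DIGIT_AFTER_NON_DIGIT = re.compile(r"(?<=[^0-9\s])(?=[0-9])")

def space_before_numbers(text):
    """Insert one space before every digit run that follows a non-digit."""
    return DIGIT_AFTER_NON_DIGIT.sub(" ", text)

print(space_before_numbers("abc123"))   # abc 123
print(space_before_numbers("ab12.34"))  # ab 12. 34
```

Unlike the PATINDEX version, this handles any number of digit runs in a string and leaves strings without digits unchanged.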

SQL String contains ONLY

I have a table with a field that denotes whether the data in that row is valid or not. This field contains a string of undetermined length. I need a query that will only pull out rows where all the characters in this field are N. Some possible examples of this field.
NNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNN
NNNNNEEEENNNNNNNNNNNN
NNNNNOOOOOEEEENNNNNNNNNNNN
Any suggestions on a postcard please.
Many thanks
This should do the trick:
SELECT Field
FROM YourTable
WHERE Field NOT LIKE '%[^N]%' AND Field <> ''
Broken down, here is what the wildcard search is doing:
LIKE '%[^N]%' finds records where the field contains at least one character other than N. We apply NOT to that, as we're only interested in records that do not contain characters other than N. The extra condition filters out blank values.
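
The same double negative ("no character that is not N, and not blank") is equivalent to "one or more N and nothing else". A short sketch of that equivalence in Python, with a hypothetical function name:

```python
import re

def all_n(field):
    """True only for non-empty strings made up entirely of 'N'."""
    return re.fullmatch(r"N+", field) is not None
```

Of the sample values in the question, the first two pass and the two containing E or O are rejected, as is an empty string.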
SELECT *
FROM mytable
WHERE field NOT LIKE '%[^N]%'
I don't know which SQL dialect you are using. Oracle, for example, has several functions you could use. With Oracle you could use a condition like:
WHERE LTRIM(field, 'N') IS NULL
The idea is to trim off all the N's and see whether anything is left. Oracle treats the empty string as NULL, so the test is IS NULL rather than = ''; note that this also matches rows where field itself is NULL. If you don't have LTRIM, check whether you have some kind of TRANSLATE or REPLACE function to do the same thing.
Another way to do it could be to take the length of your field and construct a comparison value by padding a single N out to that length:
WHERE field = RPAD('N', LENGTH(field), 'N')
The padding starts from 'N' rather than an empty string because Oracle's RPAD returns NULL when its first argument is NULL, and an empty string is NULL in Oracle.
I haven't tested these, but hopefully they give you some ideas for solving your problem. I suspect that many of these solutions perform badly if you have a lot of rows and no other WHERE conditions that can use an index.