Identify and remove records surrounded in quotes - sql

I have imported a table containing roughly 100,000 records, some of which need to be removed. I'd like to identify and remove all records where a particular field (called MyQuery) is entirely surrounded by quotation marks, but only if there are exactly TWO quotation marks in the field.
For example I would like to remove
"This is a test"
--but not--
"This is "a" test"
Thank you kindly for any assistance

My apologies to all of you for the terrible structure and clarity of my question. I think I was able to get to the answer based on your suggestions though!
select * from <table> where LEN(myquery) - LEN(REPLACE(myquery,'"','')) = 2 and myquery like '"%"'
Thank you all so much - you're the best!

You can do it like this:
DELETE
-- SELECT * -- To test first!
FROM YourTable
WHERE LEN(MyQuery) - LEN(REPLACE(MyQuery, '"', '')) = 2
AND MyQuery LIKE '"%"';
You are basically comparing the length of the field before and after removing all quotes. The difference is the number of quotes, which makes this an easy way to count the occurrences of a specific character.
After your comment, I added another condition so that the quotes always surround the field.
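To try the length-difference trick without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is an illustration under assumptions, not the answerer's code: SQLite's LENGTH stands in for T-SQL's LEN, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (MyQuery TEXT)")
conn.executemany(
    "INSERT INTO YourTable (MyQuery) VALUES (?)",
    [
        ('"This is a test"',),      # two quotes, surrounding the value
        ('"This is "a" test"',),    # four quotes
        ('no quotes here',),        # zero quotes
        ('say "hi" now',),          # two quotes, but not surrounding
    ],
)

# Number of quotes = length(original) - length(original with quotes removed).
# LENGTH is SQLite's counterpart of T-SQL's LEN (assumption for this sketch).
where = """
    LENGTH(MyQuery) - LENGTH(REPLACE(MyQuery, '"', '')) = 2
    AND MyQuery LIKE '"%"'
"""

# Run the SELECT first to see what the DELETE would remove.
to_delete = [r[0] for r in conn.execute(f"SELECT MyQuery FROM YourTable WHERE {where}")]
conn.execute(f"DELETE FROM YourTable WHERE {where}")
remaining = [r[0] for r in conn.execute("SELECT MyQuery FROM YourTable ORDER BY rowid")]
```

Only the first sample row is selected for deletion; the row with two quotes in the middle survives because of the LIKE condition.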

You can use the following query to delete all records that contain '"', excluding those that start and end with '"':
DELETE From <TABLENAME> Where COL like '%"%' and (COL not like '"%' Or COL not like '%"')
Or, you can use the following, which deletes all records having more than 2 occurrences of ":
DELETE FROM <TABLENAME> WHERE (LEN(COL)-len(replace(COL,'"',''))) > 2

Related

Remove unnecessary Characters by using SQL query

Do you know how to remove the kinds of characters described below in a single query?
Note: I'm retrieving this data from the Access app and putting only the valid data into SQL.
select DISTINCT ltrim(rtrim(a.Company)) from [Legacy].[dbo].[Attorney] as a
This is the company name column. I need to keep only the rows that contain letters. I need to remove rows that contain only numbers, rows that mix numbers and other characters, NULL and empty rows, and everything else such as +, - and so on.
Based on your extremely vague "rules" I am going to make a guess.
Maybe something like this will be somewhere close.
select DISTINCT ltrim(rtrim(a.Company))
from [Legacy].[dbo].[Attorney] as a
where LEN(ltrim(rtrim(a.Company))) > 1
and IsNumeric(a.Company) = 0
This keeps only entries that are at least 2 characters long and that can't be converted to a number.
This should select the rows you want to delete:
where company not like '%[a-zA-Z]%' and -- contains no letters at all
company like '%[^ a-zA-Z0-9.&]%' -- has a not-allowed character
The list of allowed characters in the second expression may not be complete.
If this works, then you can easily adapt it for a delete statement.

Select rows that have mixed characters in a single value e.g. 'Joh?n' in name column

In an oracle table:
1- a value in a VARCHAR column contains characters that are not letters.
Consider a scenario where a name in the 'last_name' column contains regular characters (A - Z, a - z) as well as characters that are not English letters (e.g. '.', '-', ' ', '_', '>' or similar).
The challenge is to select the rows that have names in 'last_name' such as '.John' or 'John.' or '-John' or 'Joh-n'
2- Is it possible to have non-date values in a Date defined column? If yes, how can such records be selected in an oracle query?
Thanks!
I believe this will do the trick:
SELECT * FROM mytable WHERE REGEXP_LIKE(last_name, '[^A-Za-z]');
As for your 2nd question, I am unsure. I would be glad if someone else could add to my answer to cover it. I have found this website, though, that might be of help: http://infolab.stanford.edu/~ullman/fcdb/oracle/or-time.html
It explains the DATE format.
If I properly understand your goal, you need to select rows with last_name column containing the name 'John', but it may also have additional characters before, after, or even inside the name. In that case, this should be helpful:
select * from tab where regexp_replace(last_name, '[^A-Za-z]+', '') = 'John'
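The two regular-expression ideas above can be tried outside Oracle with a small Python sketch. This is an illustration only: Python's re module stands in for REGEXP_LIKE and regexp_replace, and the sample names are made up.

```python
import re

names = ["John", ".John", "John.", "-John", "Joh-n", "Smith", "O'Neil"]

# Same idea as REGEXP_LIKE(last_name, '[^A-Za-z]'):
# flag any value containing at least one character that is not a letter.
has_non_letter = [n for n in names if re.search(r"[^A-Za-z]", n)]

# Same idea as regexp_replace(last_name, '[^A-Za-z]+', '') = 'John':
# strip everything that is not a letter, then compare with the target name.
john_variants = [n for n in names if re.sub(r"[^A-Za-z]+", "", n) == "John"]
```

The first list picks out every value with a stray character regardless of the name; the second finds 'John' however it is decorated, including the clean spelling.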

Matching exactly 2 characters in string - SQL

How can I query a column with names of people to get only the names that contain exactly 2 "a" characters?
I am familiar with the % symbol that's used with LIKE, but when I write %a it finds all names, even those with only 1 a. I need to find only those that have exactly 2 of that character.
Please explain - Thanks in advance
Table Name: "People"
Column Names: "Names, Age, Gender"
Assuming you're asking for exactly two a characters, search for a string with two a's but not with three.
select *
from people
where names like '%a%a%'
and names not like '%a%a%a%'
Use '_a'. '_' is a single character wildcard where '%' matches 0 or more characters.
If you need more advanced matches, use regular expressions, using REGEXP_LIKE. See Using Regular Expressions With Oracle Database.
And of course you can use other tricks as well. For instance, you can compare the length of the string with the length of the same string but with 'a's removed from it. If the difference is 2 then the string contained two 'a's. But as you can see things get ugly real soon, since length returns 'null' when a string is empty, so you have to make an exception for that, if you want to check for names that are exactly 'aa'.
select * from People
where
length(Names) - 2 = nvl(length(replace(Names, 'a', '')), 0)
Another solution is to replace everything that is not an a with nothing and check if the resulting String is exactly two characters long:
select names
from people
where length(regexp_replace(names, '[^a]', '')) = 2;
This can also be extended to deal with uppercase As:
select names
from people
where length(regexp_replace(names, '[^aA]', '')) = 2;
SQLFiddle example: http://sqlfiddle.com/#!4/09bc6
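For comparison, here is a minimal Python sketch of the two counting approaches from the answers above (length difference, and stripping everything except the target character). It is an illustration with made-up names, not Oracle code. Note that Python's len('') is 0, so the NVL guard needed in Oracle has no counterpart here.

```python
import re

names = ["Anna", "Barbara", "Sarah", "aa", "Alabama", "Bob"]

def count_by_length(name, ch="a"):
    # Mirrors length(name) - length(replace(name, ch, '')).
    return len(name) - len(name.replace(ch, ""))

def count_by_strip(name, chars="a"):
    # Mirrors length(regexp_replace(name, '[^a]', '')):
    # remove every character that is NOT in `chars`, then measure what is left.
    return len(re.sub(f"[^{re.escape(chars)}]", "", name))

# Exactly two lowercase 'a' characters.
exactly_two_lower = [n for n in names if count_by_length(n) == 2]

# Exactly two 'a' characters in either case, as with '[^aA]'.
exactly_two_any_case = [n for n in names if count_by_strip(n, "aA") == 2]
```

'Anna' only qualifies in the case-insensitive variant, because its first letter is an uppercase A.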
select * from People where names like '__'; will also work

Get rows that contain only certain characters

I want to get only those rows that contain ONLY certain characters in a column.
Let's say the column name is DATA.
I want to get all rows where in DATA are ONLY (must have all three conditions!):
Numeric characters (1 2 3 4 5 6 7 8 9 0)
Dash (-)
Comma (,)
For instance:
Value "10,20,20-30,30" IS OK
Value "10,20A,20-30,30Z" IS NOT OK
Value "30" IS NOT OK
Value "AAAA" IS NOT OK
Value "30-" IS NOT OK
Value "30," IS NOT OK
Value "-," IS NOT OK
Try patindex:
select * from(
select '10,20,20-30,30' txt union
select '10,20,20-30,40' txt union
select '10,20A,20-30,30Z' txt
)x
where patindex('%[^0-9,-]%', txt)=0
For your table, try:
select
DATA
from
YourTable
where
patindex('%[^0-9,-]%', DATA)=0
As per your new edited question, the query should be like:
select
DATA
from
YourTable
where
PATINDEX('%[^0-9,-]%', DATA)=0 and
PATINDEX('%[0-9]%', LEFT(DATA, 1))=1 and
PATINDEX('%[0-9]%', RIGHT(DATA, 1))=1 and
PATINDEX('%[,-][-,]%', DATA)=0
Edit: Your question was edited, so this answer is no longer correct. I won't bother updating it since someone else already has updated theirs. This answer does not fulfil the condition that all three character types must be found.
You can use a LIKE expression for this, although it's slightly convoluted:
where data not like '%[^0123456789,!-]%' escape '!'
Explanation:
[^...] matches any character that is not in the ... part. % matches any number (including zero) of any character. So [^0123456789-,] is the set of characters that you want to disallow.
However: - is a special character inside of [], so we must escape it, which we do by using an escape character, and I've chosen !.
So, you match rows that do not contain (not like) any character that is not in your allowed set.
Use PATINDEX together with LIKE:
SELECT *
FROM dbo.test70
WHERE PATINDEX('%[A-Z]%', DATA) = 0
AND PATINDEX('%[0-9]%', DATA) > 0
AND DATA LIKE '%-%'
AND DATA LIKE '%,%'
Demo on SQLFiddle
As already mentioned, you can use a LIKE expression, but it will only work with some minor modifications; otherwise too many rows will be filtered out.
SELECT * FROM X WHERE T NOT LIKE '%[^0-9!-,]%' ESCAPE '!'
see working example here:
http://sqlfiddle.com/#!3/474f5/6
edit:
to meet all 3 conditions:
SELECT *
FROM X
WHERE T LIKE '%[0-9]%'
AND T LIKE '%-%'
AND T LIKE '%,%'
see: http://sqlfiddle.com/#!3/86328/1
Maybe not the most beautiful but a working solution.
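To check the rules against the sample values from the question, here is a hedged Python sketch that combines the conditions from the answers above: only allowed characters, a digit at both ends, no two separators in a row, and all three character types present. The function name is hypothetical and Python's re module stands in for PATINDEX and LIKE.

```python
import re

def is_valid(data: str) -> bool:
    # No character outside digits, comma and dash (PATINDEX('%[^0-9,-]%', DATA) = 0).
    only_allowed = re.search(r"[^0-9,-]", data) is None
    # First and last character must be digits (the LEFT / RIGHT checks).
    digit_ends = bool(re.match(r"[0-9]", data)) and bool(re.search(r"[0-9]$", data))
    # No two separators next to each other (PATINDEX('%[,-][-,]%', DATA) = 0).
    no_adjacent_separators = re.search(r"[,-][,-]", data) is None
    # All three character types must occur (the three LIKE conditions).
    has_all_types = bool(re.search(r"[0-9]", data)) and "-" in data and "," in data
    return only_allowed and digit_ends and no_adjacent_separators and has_all_types

samples = ["10,20,20-30,30", "10,20A,20-30,30Z", "30", "AAAA", "30-", "30,", "-,"]
results = {s: is_valid(s) for s in samples}
```

With these four checks together, only "10,20,20-30,30" is accepted, which matches the OK / NOT OK list in the question.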

Replacing an Unknown Value with SQL Replace

I'm probably missing something really obvious here, but this has been a bear to search for on Google (Maybe I don't have the right terminology).
I want to replace an unknown value with another value from a temp table. I know the length of the value so my thought was to use underscores as you would in a LIKE statement. The following DOES NOT work however:
UPDATE MyTable
SET Name =
Replace(Name, '__SomeString', TempTable.value + ' SomeString')
FROM MyTable INNER JOIN TempTable
ON Name LIKE TempTable.Name
This is MS SQL 2000 FWIW.
EDIT: To try and clarify it looks like the underscore '_' wildcard that is used in a LIKE statement is taken literally inside of the replace function. Is there another way?
Any thoughts?
UPDATE MyTable
SET Name =
CASE WHEN (Name like '_SomeString')
THEN TempTable.value + SUBSTRING(Name,2,LEN(Name)-1)
ELSE Name END
FROM MyTable INNER JOIN TempTable
ON MyTable.Name = TempTable.Name
WHERE MyTable.Name = 'TheNameToReplace' -- I don't know if it will be for a specific name hence the where...
This will then replace the unknown leading character in front of 'SomeString' in the Name field with the value from TempTable.value
Is this what you were looking for or something else?
Perhaps you can use stuff instead of replace. You need to know the start position in the string where you want to replace the characters and you need to know the length of the expression that is to be replaced. If you don't know that perhaps you can use charindex or patindex to figure that out.
select stuff('A123', 1, 1, 'B ')
Result:
(No column name)
B 123
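To make the semantics of STUFF concrete, here is a small Python sketch of the same operation. The helper is hypothetical and only models the common case: unlike T-SQL, it does not return NULL for an out-of-range start position. The sample value 'XXSomeString' is made up.

```python
def stuff(text: str, start: int, length: int, replacement: str) -> str:
    # `start` is 1-based, as in T-SQL: delete `length` characters
    # beginning at `start`, then insert `replacement` at that position.
    i = start - 1
    return text[:i] + replacement + text[i + length:]

# Same call as the SQL example: stuff('A123', 1, 1, 'B ')
example = stuff("A123", 1, 1, "B ")

# Replacing an unknown prefix of known position: locate the fixed part first
# (str.index is 0-based, whereas CHARINDEX is 1-based), then overwrite
# everything before it.
name = "XXSomeString"
prefix_length = name.index("SomeString")
replaced = stuff(name, 1, prefix_length, "New ")
```

Here `example` is 'B 123', matching the result shown above, and `replaced` is 'New SomeString'.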
Would something like this work?
UPDATE mytable
SET field1 = 'A' + SUBSTRING(field1,2,LEN(field1))
WHERE LEFT(field1, 1) IN ('0','1','2','3','4','5','6','7','8','9')
Apparently it is not possible to use wildcards in the REPLACE function. This is the closest match on SO that I could find: MySQL Search & Replace with WILDCARDS - Query. While the link is for MySQL, I believe it is true for MS SQL as well.
The other answers here are all creative solutions to the problem, but I ended up going the brute force route.