String Search in ANSI SQL

I am using Netezza.
One of my columns (COMMENTS) in my table (TABLE1) has text.
COMMENTS column is a free text column and can have any text in it.
Here are a few samples of how the text looks:
OLD STUDENT:N(Y/N/U=UNKNOWN) Some other text could be here
Some other text could be here OLD STUDENT:Y(Y/N/U=UNKNOWN)
OLD STUDENT:Y(Y/N/U=UNKNOWN) Some other text could be here
Note: Some records have an extra character between "OLD STUDENT:" and "Y".
I want to filter records based on text search.
The idea is to isolate records having the text "STUDENT:Y"
Here is what I have so far, but it is not working:
SELECT COMMENTS
FROM
TABLE1
WHERE
LOWER(COMMENTS) LIKE '%student%'
AND LOWER(COMMENTS) NOT LIKE '%student:n%'
How do I handle this?
Thank you for your time.

Even though there are valid records, WHERE LOWER(COMMENTS) LIKE '%student:y%' was not returning any results. Some of the text fields have unexpected characters between "student:" and "y".
Adding the single-character wildcard "_" solved the issue.
SELECT COMMENTS
FROM
TABLE1
WHERE
LOWER(COMMENTS) LIKE '%student%'
AND LOWER(COMMENTS) NOT LIKE '%student:_n%'
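The effect of the two wildcards can be checked on a few sample rows. The sketch below uses Python's built-in sqlite3 module as a stand-in for Netezza (an assumption; LIKE with % and _ is standard SQL, but verify it on your own system), and a tab plays the part of the stray character (hypothetical data). It matches STUDENT:Y directly rather than excluding STUDENT:N, and combines the plain pattern with the "_" pattern so rows with and without the stray character are both caught.

```python
import sqlite3

# In-memory SQLite database standing in for Netezza (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABLE1 (COMMENTS TEXT)")
conn.executemany(
    "INSERT INTO TABLE1 (COMMENTS) VALUES (?)",
    [
        ("OLD STUDENT:N(Y/N/U=UNKNOWN) Some other text could be here",),
        ("Some other text could be here OLD STUDENT:Y(Y/N/U=UNKNOWN)",),
        # A tab plays the part of the stray character (hypothetical data).
        ("OLD STUDENT:\tY(Y/N/U=UNKNOWN) Some other text could be here",),
        ("OLD STUDENT:\tN(Y/N/U=UNKNOWN) Some other text could be here",),
    ],
)

query = """
    SELECT COMMENTS
    FROM TABLE1
    WHERE LOWER(COMMENTS) LIKE '%student:y%'    -- no stray character
       OR LOWER(COMMENTS) LIKE '%student:_y%'   -- exactly one stray character
"""
matches = [row[0] for row in conn.execute(query)]

for comment in matches:
    print(repr(comment))
```

The "_" wildcard matches exactly one character, so a row with two or more stray characters would need a further pattern such as '%student:__y%'.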

Related

Remove unnecessary Characters by using SQL query

Do you know how to remove the kinds of characters below in a single query?
Note: I'm retrieving this data from the Access app and putting only the valid data into SQL.
select DISTINCT ltrim(rtrim(a.Company)) from [Legacy].[dbo].[Attorney] as a
This is the company name column. I need to keep rows that contain only letters. I need to remove numbers-only rows, rows with numbers and characters, NULL, empty values, and everything else such as +, -.
Based on your extremely vague "rules" I am going to make a guess.
Maybe something like this will be somewhere close.
select DISTINCT ltrim(rtrim(a.Company))
from [Legacy].[dbo].[Attorney] as a
where LEN(ltrim(rtrim(a.Company))) > 1
and IsNumeric(a.Company) = 0
This will exclude entries that are shorter than 2 characters or that can be converted to a number.
This should select the rows you want to delete:
where company not like '%[a-zA-Z]%' and -- contains no letters
company like '%[^ a-zA-Z0-9.&]%' -- contains a character outside the allowed set
The list of allowed characters in the second expression may not be complete.
If this works, then you can easily adapt it for a delete statement.

Find exactly text

Text column is NVARCHAR(MAX) type.
ID Text
001 have odds and modds
002 odds>=12
003 modds
004 odds < 1
How can I find rows whose Text column contains odds, not counting modds?
I tried:
Select * from MyTable
Where text LIKE '%odds%' AND text NOT LIKE '%modds%'
But the result is not correct; it returns all rows. I want it to return:
ID Text
001 have odds and modds
002 odds>=12
004 odds < 1
Any ideas? Thanks!
WHERE (text LIKE '%odds%' AND text NOT LIKE '%modds%')
OR (text LIKE '%odds%odds%')
Some notes on how this works. First off, SQL works with "sets" of data, so we need a selector (the WHERE clause) to create our "set" (or it is the entire table "set" if none is included).
So here we created two portions of the set.
First, we select all the rows that include the value "odds" somewhere but do NOT include "modds". This excludes rows that ONLY include "modds".
Second, we include rows that contain "odds" twice. The "%" is a wildcard, so building the pattern up from the start:
'%' anything at the start
'%odds' anything at the start, followed by "odds"
'%odds%' anything at the start, then "odds", then anything
'%odds%odds' anything at the start, then "odds", then anything, then "odds"
'%odds%odds%' anything at the start, then "odds", then anything in between, then "odds", then anything at the end
This works for THIS SPECIFIC case because both words contain "odds", so the order does not matter here. If we wanted to do the same with two different words, for example to match rows containing "cats", or both "cats" and "dogs", but not rows containing JUST "dogs", we would have:
WHERE (mycolumn LIKE '%cats%' AND mycolumn NOT LIKE '%dogs%')
OR ((mycolumn LIKE '%cats%dogs%') OR (mycolumn LIKE '%dogs%cats%'))
This could also be written using AND to require both words:
WHERE (mycolumn LIKE '%cats%' AND mycolumn NOT LIKE '%dogs%')
OR (mycolumn LIKE '%cats%' AND mycolumn LIKE '%dogs%')
This would catch the values without regard to the order of the "cats" and "dogs" values in the column.
Note: keep the parentheses in these last two examples. AND already binds more tightly than OR, but the explicit grouping guards against mistakes when the conditions are edited.
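The original "odds"/"modds" query can be checked against the sample rows from the question. This is a minimal sketch using Python's built-in sqlite3 module in place of SQL Server (an assumption; the LIKE patterns used here behave the same way for this data).

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE MyTable (ID TEXT, "Text" TEXT)')
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [
        ("001", "have odds and modds"),
        ("002", "odds>=12"),
        ("003", "modds"),
        ("004", "odds < 1"),
    ],
)

query = """
    SELECT ID
    FROM MyTable
    WHERE ("Text" LIKE '%odds%' AND "Text" NOT LIKE '%modds%')  -- odds without modds
       OR ("Text" LIKE '%odds%odds%')                           -- odds appears twice
    ORDER BY ID
"""
ids = [row[0] for row in conn.execute(query)]
print(ids)  # ['001', '002', '004']
```

A row containing "modds" twice and no standalone "odds" would also satisfy the second condition, so the query is tied to data shaped like the sample.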
Select * from MyTable
Where text LIKE 'odds%'
Select * from MyTable Where text LIKE '% odds%' or text LIKE 'odds%'
The most flexible and efficient way is to use full-text search. This would create an index for each word in the specified text columns.
This feature is included with (at least some versions of) Microsoft SQL Server.

Oracle sql queries

I am working on a small relational database for school and having trouble with a simple query. I am supposed to find all records in a table that have a field with the word 'OMG' somewhere in the text.
I have tried a few things and I can't figure it out. I am getting an invalid relational operator error.
my attempts:
select * from comments where contains('omg');
select * from comments where text contains('omg');
select * from comments where about('omg');
and several more variations of the above. As you can see I am brand spanking new to SQL.
text is the name of the field in the comments table.
Thanks for any help.
Assuming that the column name is text:
select * from comments where text like '%omg%';
The percent sign % is a wildcard, which means that any text can come before or after omg.
Note that if you don't want the results to include words in which omg is a substring, you might want to consider adding whitespace:
select * from comments where text like '% omg %';
Of course, that doesn't cover cases like:
omg is at the end of the sentence (followed by a ".")
omg has "!" right afterwards
etc
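The trade-off between the two patterns can be seen on a handful of rows. This is a minimal sketch using Python's built-in sqlite3 module rather than Oracle (an assumption); the sample rows and the matching_ids helper are hypothetical. One difference to keep in mind: SQLite's LIKE ignores ASCII case by default, while Oracle's LIKE is case-sensitive, so on Oracle a search for 'OMG' in mixed-case text may need LOWER(text) on the left-hand side.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE comments (id INTEGER, "text" TEXT)')
conn.executemany(
    "INSERT INTO comments VALUES (?, ?)",
    [
        (1, "well omg that was fast"),
        (2, "omgosh really"),
        (3, "that was fast, omg!"),
        (4, "omg"),
    ],
)


def matching_ids(pattern):
    """Return the ids of rows whose text matches the given LIKE pattern."""
    sql = 'SELECT id FROM comments WHERE "text" LIKE ? ORDER BY id'
    return [row[0] for row in conn.execute(sql, (pattern,))]


substring_ids = matching_ids("%omg%")   # matches omg anywhere, even inside a word
spaced_ids = matching_ids("% omg %")    # requires a space on both sides

print(substring_ids)  # [1, 2, 3, 4]
print(spaced_ids)     # [1]
```

The plain pattern picks up "omgosh", while the spaced pattern misses "omg!" and a row that consists of "omg" alone.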
You may want to use the LIKE operator with wildcards (%):
SELECT * FROM comments WHERE text LIKE '%omg%';

SQL String contains ONLY

I have a table with a field that denotes whether the data in that row is valid or not. This field contains a string of undetermined length. I need a query that will only pull out rows where all the characters in this field are N. Some possible examples of this field.
NNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNN
NNNNNEEEENNNNNNNNNNNN
NNNNNOOOOOEEEENNNNNNNNNNNN
Any suggestions on a postcard please.
Many thanks
This should do the trick:
SELECT Field
FROM YourTable
WHERE Field NOT LIKE '%[^N]%' AND Field <> ''
What it's doing is a wildcard search. Broken down:
The LIKE pattern '%[^N]%' finds records where the field contains at least one character other than N. We apply a NOT to that, as we're only interested in records that contain no characters other than N, plus a condition to filter out blank values.
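The bracket character class inside LIKE is not available in every dialect. As a runnable sketch of the same idea, the example below uses Python's built-in sqlite3 module and SQLite's GLOB operator, which supports [^N] with * as the wildcard (this is a swapped-in analogue of the pattern above, not the same operator). The table and column names follow the answer; the sample rows come from the question.

```python
import sqlite3

# In-memory SQLite database; GLOB is SQLite-specific (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Field TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?)",
    [
        ("NNNNNNNNNNNNNNNNNNN",),
        ("NNNNNEEEENNNNNNNNNNNN",),
        ("NNNNNOOOOOEEEENNNNNNNNNNNN",),
        ("",),
    ],
)

query = """
    SELECT Field
    FROM YourTable
    WHERE Field NOT GLOB '*[^N]*'   -- no character other than N
      AND Field <> ''               -- skip blank values
"""
only_n = [row[0] for row in conn.execute(query)]
print(only_n)  # ['NNNNNNNNNNNNNNNNNNN']
```

Without the second condition the blank row would be returned too, since an empty string contains no character other than N.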
SELECT *
FROM mytable
WHERE field NOT LIKE '%[^N]%'
I don't know which SQL dialect you are using. Oracle, for example, has several functions you could use. With Oracle you could use a condition like:
WHERE LTRIM(field, 'N') IS NULL
The idea is to trim away all the N's and see if anything is left. Oracle treats an empty string as NULL, so the test is IS NULL rather than = ''. If you don't have LTRIM, check if you have some kind of TRANSLATE or REPLACE function to do the same thing.
Another way to do it could be to take the length of your field and then construct a comparison value of that many N's by padding. Perhaps something like:
WHERE field = RPAD('N', LENGTH(field), 'N')
RPAD takes the target length as its second argument, so this builds a string of N's as long as the field. The string being padded is 'N' rather than '' because Oracle treats '' as NULL, and padding NULL returns NULL.
I haven't tested those, but hopefully that gives you some ideas for how to solve your problem. I guess that many of these solutions will perform badly if you have a lot of rows and no other WHERE conditions that can use an index.

Is it possible to get the matching string from an SQL query?

If I have a query to return all matching entries in a DB that have "news" in the searchable column (i.e. SELECT * FROM table WHERE column LIKE '%news%'), and one particular row has an entry starting with "In recent World news, Somalia was invaded by ...", can I return a specific "chunk" of an SQL entry? Kind of like a teaser, if you will.
select substring(column,
CHARINDEX('news', lower(column)) - 10,
20)
FROM table
WHERE column LIKE '%news%'
Basically, take a substring of the column starting 10 characters before the word 'news' and continuing for 20 characters.
Edit: You'll need to make sure that 'news' isn't in the first 10 characters and adjust the start position accordingly.
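The start-position adjustment can be sketched as follows. This uses Python's built-in sqlite3 module in place of SQL Server (an assumption), so substr and instr stand in for SUBSTRING and CHARINDEX, and the two-argument max clamps the start position to 1. The articles table and body column are hypothetical names chosen for the example.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (body TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?)",
    [
        ("In recent World news, Somalia was invaded by ...",),
        ("news at ten",),            # 'news' within the first 10 characters
        ("nothing to see here",),    # no match, filtered out by LIKE
    ],
)

query = """
    SELECT substr(
               body,
               max(instr(lower(body), 'news') - 10, 1),  -- never start before 1
               20
           )
    FROM articles
    WHERE body LIKE '%news%'
    ORDER BY rowid
"""
teasers = [row[0] for row in conn.execute(query)]
print(teasers)  # ['ent World news, Soma', 'news at ten']
```

In SQL Server the same clamp could be written with a CASE expression around the CHARINDEX result.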
You can use substring function in a SELECT part. Something like:
SELECT SUBSTRING(column, 1, 20) FROM table WHERE column LIKE '%news%'
This will return the first 20 characters from column column
I had the same problem. I ended up loading the whole field into C#, then searching the text again for the search string and selecting x characters either side.
This will work fine for LIKE, but not for full-text queries that use FORMSOF(INFLECTIONAL, ...), because those may match "women" when you search for "woman".
If you are using MSSQL you can perform all kinds of VB-like substring functions as part of your query.