Searching for a specific text value in a column in SQLite3 - sql

Suppose I have a table named 'Customer' with many columns and I want to display all customers who's name ends with 'Thomas'(Lastname = 'Thomas'). The following query shows an empty result(no rows). Also it didn't show any error.
SELECT * FROM Customer WHERE Lastname = 'Thomas';
While executing the following query give me correct result.
SELECT * FROM Customer WHERE Lastname LIKE '%Thomas%';
I would like to know what is the problem with my first query. I am using sqlite3 with Npm. Below is the result of '.show' command(Just in case of the problem is with config).
sqlite> .show
echo: off
explain: off
headers: on
mode: column
nullvalue: ""
output: stdout
separator: "|"
stats: off
width:

Use Like instead of =
Trim to ensure that there arent spaces messing around
so the query will be
SELECT * FROM Customer WHERE trim(Lastname) LIKE 'Thomas';
depending on your types, probably you dont need point 2, since as can be read in mysql manual
All MySQL collations are of type PADSPACE. This means that all CHAR
and VARCHAR values in MySQL are compared without regard to any
trailing spaces
But the point 1 could be the solution. Actually if you want to avoid problems, you should compare strings with LIKE, instead of =.
If You still have problems, probably you will have to use collates.
SELECT *
FROM t1
WHERE k LIKE _latin1 'Müller' COLLATE latin1_german2_ci; #using your real table collation
more information here But specifically with 'Thomas' you shouldn't need it, since it hasn't got any special characters.

Related

Query to ignore rows which have non hex values within field

Initial situation
I have a relatively large table (ca. 0.7 Mio records) where an nvarchar field "MediaID" contains largely media IDs in proper hexadecimal notation (as they should).
Within my "sequential" query (each query depends on the output of the query before, this is all in pure T-SQL) I have to convert these hexadecimal values into decimal bigint values in order to do further calculations and filtering on these calculated values for the subsequent queries.
--> So far, no problem. The "sequential" query works fine.
Problem
Unfortunately, some of these Media IDs do contain non-hex characters - most probably because there was some typing errors by the people which have added them or through import errors from the previous business system.
Because of these non-hex chars, the whole query fails (of course) because the conversion hits an error.
For my current purpose, such rows must be skipped/ignored as they are clearly wrong and cannot be used (there are no medias / data carriers in use with the current business system which can have non-hex character IDs).
Manual editing of the data is not an option as there are too many errors and it is not clear with what the data must be replaced.
Challenge
To create a query which only returns records which have valid hex values within the media ID field.
(Unfortunately, my SQL skills are not enough to create the above query. Your help is highly appreciated.)
The relevant section of the larger query looks like this (xxxx is where your help comes in :-))
select
pureMediaID
, mediaID
, CUSTOMERID
,CONTRACT_CUSTOMERID
from
(
select concat('0x', Replace(Ltrim(Replace(mediaID, '0', ' ')), ' ', '0')) AS pureMediaID
--, CUSTOMERID
, *
from M_T_CONTRACT_CUSTOMERS
where mediaID is not null
and mediaID like '0%'
and xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
) as inner1
EDIT: As per request I have added here some good and some bad data:
Good:
4335463357
4335459809
1426427996
4335463509
4335515039
4335465134
4427370396
4335415661
4427369036
4335419089
004BB03433
004e7cf9c6
00BD23133
00EE13D8C1
00CCB5522C
00C46522C
00dbbe3433
Bad:
4564589+
AB6B8BFC.8
7B498DFCnm
DB218DFChb
d<tgfh8CFC
CB9E8AFCzj
B458DFCjhl
rytzju8DFC
BFCtdsjshj
DB9888FCgf
9BC08CFCyx
EB198DFCzj
4B628CFChj
7B2B8DFCgg
After I did upgrade the compatibility level of the SQL instance to SQL2016 (it was below 2012 before) I could use try_convert with same syntax as the original convert function as donPablo has pointed out. With that the query could run fully through and every MediaID which is not a correct hex value gets nicely converted into a null value - really, really nice.
Exactly what I needed.
Unfortunately, the solution of ALICE... didn't work out for me as this was also (strangely) returning records which had the "+" character within them.
Edit: The added comment of Alice... where you create a calculated field like this:
CASE WHEN "KEY" LIKE '%[^0-9A-F]%' THEN 0 ELSE 1 end as xyz
and then filter in the next query like this:
where xyz = 1
works also with SQL Instances with compatibility level < SQL 2012.
Great addition for people which still have to work with older SQL instances.
An option (although not ideal in terms of performance) is to check the characters in the MediaID through a case statement and regular expression
Hexadecimals cannot contain characters other than A-F and numbers between 0 and 9
CASE WHEN MediaID LIKE '%[0-9A-F]%' THEN 1 ELSE 0 END
I would recommend writing a function that can be used to evaluate MediaID first and checks if it is hexadecimal and then running the query for conversion

SQL full text search behavior on numeric values

I have a table with about 200 million records. One of the columns is defined as varchar(100) and it's included in a full text index. Most of the values are numeric. Only few are not numeric.
The problem is that it's not working well. For example if a row contains the value '123456789' and i look for '567', it's not returning this row. It will only return rows where the value is exactly '567'.
What am I doing wrong?
sql server 2012.
Thanks.
Full text search doesn't support leading wildcards
In my setup, these return the same
SELECT *
FROM [dbo].[somelogtable]
where CONTAINS (logmessage, N'28400')
SELECT *
FROM [dbo].[somelogtable]
where CONTAINS (logmessage, N'"2840*"')
This gives zero rows
SELECT *
FROM [dbo].[somelogtable]
where CONTAINS (logmessage, N'"*840*"')
You'll have to use LIKE or some fancy trigram approach
The problem is probably that you are using a wrong tool since Full-text queries perform linguistic searches and it seems like you want to use simple "like" condition.
If you want to get a solution to your needs then you can post DDL+DML+'desired result'
You can do this:
....your_query.... LIKE '567%' ;
This will return all the rows that have a number 567 in the beginning, end or in between somewhere.
99% You're missing % after and before the string you search in the LIKE clause.
es:
SELECT * FROM t WHERE att LIKE '66'
is the same as as using WHERE att = '66'
if you write:
SELECT * FROM t WHERE att LIKE '%66%'
will return you all the lines containing 2 'sixes' one after other

sqlite text data type comparison not working

I have a table in SQLite database with data type text, but when I do comparison its not working if I do it this way :
select * from scanned_dbs where db = 'cdd_db';
But if i change the query to:
select * from scanned_dbs where db like 'cdd_db';
It works. But as valex pointed out it will also match cddAdb, cddBdb, ...and so on, so this is not the right way.
One more method I found which is working is this:
select * from scanned_dbs where cast(db as varchar) = 'cdd_db';
So can any one tell me why this is working and not the first one which is direct comparison...
Because an underscore ("_") in the LIKE pattern matches any single character in the string.
So when you use db = 'cdd_db' the only db value that matches is exact 'cdd_db'. But when you use the LIKE operator db like 'cdd_db' then "_" symbol is a pattern so db values that match: cddAdb,cddBdb,cddcdb,cddddb,cdd1db, ....

Access And/Or exclusions

I have some sample data like:
|H.RISK|NOTE|NORMAL|
The | are actually in the data, it's a String.
I am using Access, and am trying to exclude records that contain RISK in the String.
Sample query:
SELECT * FROM someTable WHERE (UCase(someTable.Field) NOT LIKE '*RISK*') AND (UCase(someTable.Field) NOT LIKE '*Blah*') AND someTable.SomeOtherField <> 4;
The problem is, the query is returning the above sample record, even though it does contain the string RISK.
I've tried the same query but switched to OR instead of AND but get the same results.
How can I properly structure this query to exclude records which contain certain strings?
Seeing that it appears you are running using the SQL syntax, try with the correct wild card.
SELECT * FROM someTable WHERE (someTable.Field NOT LIKE '%RISK%') AND (someTable.Field NOT LIKE '%Blah%') AND someTable.SomeOtherField <> 4;

Return rows where first character is non-alpha

I'm trying to retrieve all columns that start with any non alpha characters in SQlite but can't seem to get it working. I've currently got this code, but it returns every row:
SELECT * FROM TestTable WHERE TestNames NOT LIKE '[A-z]%'
Is there a way to retrieve all rows where the first character of TestNames are not part of the alphabet?
Are you going first character only?
select * from TestTable WHERE substr(TestNames,1) NOT LIKE '%[^a-zA-Z]%'
The substr function (can also be called as left() in some SQL languages) will help isolate the first char in the string for you.
edit:
Maybe substr(TestNames,1,1) in sqllite, I don't have a ready instance to test the syntax there on.
Added:
select * from TestTable WHERE Upper(substr(TestNames,1,1)) NOT in ('A','B','C','D','E',....)
Doesn't seem optimal, but functionally will work. Unsure what char commands there are to do a range of letters in SQLlite.
I used 'upper' to make it so you don't need to do lower case letters in the not in statement...kinda hope SQLlite knows what that is.
try
SELECT * FROM TestTable WHERE TestNames NOT LIKE '[^a-zA-Z]%'
SELECT * FROM NC_CRIT_ATTACH WHERE substring(FILENAME,1,1) NOT LIKE '[A-z]%';
SHOULD be a little faster as it is
A) First getting all of the data from the first column only, then scanning it.
B) Still a full-table scan unless you index this column.