Returning postcodes (varchars) with only one numeric character in them - sql

I've been asked to run a query to return a list of UK post codes from a table full of filters for email reports which only have 1 number at the end. The problem is that UK post codes are of variable length; some are structured 'AA#' or 'AA##' and some are structured 'A#' or 'A##'. I only want those that are either 'AA#' or 'A#'.
I tried running the below SQL, using length and (attempting to) use regex to filter out all results which didn't match what I wanted, but I'm very new to using ranges and it hasn't worked.
SELECT PostCode
FROM ReportFilterTable RFT
WHERE RFT.FilterType = 'Postcode'
AND LEN(RFT.Postcode) < 4
AND RFT.PostCode LIKE '%[0-9]'
I think the way I'm approaching this is flawed, but I'm clueless as to a better way. Could anyone help me out?
Thanks!
EDIT:
Since I helpfully didn't include any example data originally, I've now done so below.
This is a sample of the kind of values in the column I'm returning, with examples of what I need to return and what I don't.
B1 -- Should be returned
B10 -- Should not be returned
B2 -- Should be returned
B20 -- Should not be returned
B3 -- Should be returned
B30 -- Should not be returned
SE1 -- Should be returned
SE10 -- Should not be returned

You could filter for one or two letters (and omit the length check, since it's implicit in the LIKE):
WHERE RFT.FilterType = 'Postcode' AND
(RFT.PostCode LIKE '[A-Z][0-9]' OR RFT.PostCode LIKE '[A-Z][A-Z][0-9]')

If the issue is that you are getting values with multiple digits and you are using SQL Server (as suggested by the syntax), then you can do:
WHERE RFT.FilterType = 'Postcode' AND
LEN(RFT.Postcode) < 4 AND
(RFT.PostCode LIKE '%[0-9]' AND RFT.PostCode NOT LIKE '%[0-9][0-9]')
Or, if you know there are at least two characters, you could use:
WHERE RFT.FilterType = 'Postcode' AND
LEN(RFT.Postcode) < 4 AND
RFT.PostCode LIKE '%[^0-9][0-9]'

Non-digit followed by 1 digit ... LIKE '%[^0-9][0-9]'
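The patterns suggested above can be checked against the sample data from the question. The following is only a sketch in Python, not SQL: the regular expressions are rough equivalents of the LIKE patterns, the function names are made up for illustration, and it assumes uppercase letters only (a case-insensitive SQL Server collation would also accept lowercase).

```python
import re

# Rough regex equivalents of the LIKE patterns discussed above (assumptions, not SQL).
# '[A-Z][0-9]' OR '[A-Z][A-Z][0-9]'
ONE_OR_TWO_LETTERS_THEN_DIGIT = re.compile(r"[A-Z]{1,2}[0-9]")
# '%[^0-9][0-9]'
NON_DIGIT_THEN_DIGIT_AT_END = re.compile(r".*[^0-9][0-9]")

SAMPLE = ["B1", "B10", "B2", "B20", "B3", "B30", "SE1", "SE10"]


def single_digit_postcodes(values):
    """Keep values shaped like 'A#' or 'AA#' (hypothetical helper)."""
    return [v for v in values if ONE_OR_TWO_LETTERS_THEN_DIGIT.fullmatch(v)]


def ends_in_exactly_one_digit(values):
    """Keep values shorter than 4 characters that end in non-digit + digit."""
    return [
        v for v in values
        if len(v) < 4 and NON_DIGIT_THEN_DIGIT_AT_END.fullmatch(v)
    ]


print(single_digit_postcodes(SAMPLE))     # ['B1', 'B2', 'B3', 'SE1']
print(ends_in_exactly_one_digit(SAMPLE))  # ['B1', 'B2', 'B3', 'SE1']
```

Both approaches keep the same four values here; they would differ on data where the leading characters are not letters, since the second pattern only requires a non-digit before the final digit.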

Related

Comparing two fields with leading zeros

I have tables A and B that share several fields with the same datatype/length. I'm trying to bring additional information into B, and for that I need to match on case_number.
The problem is that case_number in table A has a length of 10, and anything shorter is left-padded with zeros (e.g. 84534 --> 0000084534), whereas table B stores the value unpadded (84534 stays 84534). So when I attempt to match on case_number I get no results. Both fields are varchar2, this is Oracle, and I'm unable to modify table A.
I tried to use LPAD and that does not seem to help. I need a function to work in select statement.
The simplest solution seems to be to left-pad the string from the second table with zeros:
...
where a.case_number = lpad(b.case_number, 10, '0')
...
Alternatively, you could leave b.case_number unchanged and left-trim '0' from a.case_number, but this will only work if you can guarantee that b.case_number never has leading zeros (and, in particular, that b.case_number can't be zero).
...
where ltrim(a.case_number, '0') = b.case_number
...
One method is to convert to a number:
to_number(x) = to_number(y)
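The three approaches above (pad, trim, convert to a number) can be compared with a small sketch. This is Python rather than Oracle SQL, and the helper names are hypothetical; it only mirrors the string logic to show why the trim variant needs the "never zero" guarantee mentioned above.

```python
def pad_case_number(value: str, width: int = 10) -> str:
    """Mimic lpad(value, width, '0'): left-pad with zeros to a fixed width.

    Values longer than `width` are cut to `width` characters first, which is
    how Oracle's LPAD is documented to behave.
    """
    return value[:width].rjust(width, "0")


def trim_leading_zeros(value: str) -> str:
    """Mimic ltrim(value, '0'): strip every leading zero."""
    return value.lstrip("0")


a_value = "0000084534"   # stored as in table A
b_value = "84534"        # stored as in table B

print(pad_case_number(b_value) == a_value)      # True
print(trim_leading_zeros(a_value) == b_value)   # True
print(int(a_value) == int(b_value))             # True

# The caveat: an all-zero value trims down to an empty string,
# so it no longer equals '0' on the other side.
print(trim_leading_zeros("0000000000") == "0")  # False
```

In Oracle the all-zero case is harsher still, because an empty string is treated as NULL and a NULL never compares equal to anything. The numeric conversion avoids that case but fails on any value that is not purely numeric.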

Query to ignore rows which have non hex values within field

Initial situation
I have a relatively large table (ca. 0.7 million records) where an nvarchar field "MediaID" mostly contains media IDs in proper hexadecimal notation (as it should).
Within my "sequential" query (each query depends on the output of the query before, this is all in pure T-SQL) I have to convert these hexadecimal values into decimal bigint values in order to do further calculations and filtering on these calculated values for the subsequent queries.
--> So far, no problem. The "sequential" query works fine.
Problem
Unfortunately, some of these media IDs contain non-hex characters, most probably because of typing errors by the people who added them or import errors from the previous business system.
Because of these non-hex characters, the whole query fails (of course), because the conversion hits an error.
For my current purpose, such rows must be skipped/ignored, as they are clearly wrong and cannot be used (no media / data carriers in use with the current business system can have IDs with non-hex characters).
Manual editing of the data is not an option, as there are too many errors and it is not clear what the data should be replaced with.
Challenge
To create a query which only returns records which have valid hex values within the media ID field.
(Unfortunately, my SQL skills are not enough to create the above query. Your help is highly appreciated.)
The relevant section of the larger query looks like this (xxxx is where your help comes in :-))
select
pureMediaID
, mediaID
, CUSTOMERID
,CONTRACT_CUSTOMERID
from
(
select concat('0x', Replace(Ltrim(Replace(mediaID, '0', ' ')), ' ', '0')) AS pureMediaID
--, CUSTOMERID
, *
from M_T_CONTRACT_CUSTOMERS
where mediaID is not null
and mediaID like '0%'
and xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
) as inner1
EDIT: As per request I have added here some good and some bad data:
Good:
4335463357
4335459809
1426427996
4335463509
4335515039
4335465134
4427370396
4335415661
4427369036
4335419089
004BB03433
004e7cf9c6
00BD23133
00EE13D8C1
00CCB5522C
00C46522C
00dbbe3433
Bad:
4564589+
AB6B8BFC.8
7B498DFCnm
DB218DFChb
d<tgfh8CFC
CB9E8AFCzj
B458DFCjhl
rytzju8DFC
BFCtdsjshj
DB9888FCgf
9BC08CFCyx
EB198DFCzj
4B628CFChj
7B2B8DFCgg
After I upgraded the compatibility level of the SQL instance to SQL2016 (it was below 2012 before), I could use try_convert with the same syntax as the original convert function, as donPablo pointed out. With that, the query runs through completely and every MediaID that is not a valid hex value is converted to null - really, really nice.
Exactly what I needed.
Unfortunately, the solution of ALICE... didn't work out for me as this was also (strangely) returning records which had the "+" character within them.
Edit: The added comment of Alice... where you create a calculated field like this:
CASE WHEN "KEY" LIKE '%[^0-9A-F]%' THEN 0 ELSE 1 end as xyz
and then filter in the next query like this:
where xyz = 1
also works with SQL instances with a compatibility level below SQL 2012.
A great addition for people who still have to work with older SQL instances.
An option (although not ideal in terms of performance) is to check the characters in the MediaID with a CASE expression and a LIKE pattern
Hexadecimals cannot contain characters other than A-F and numbers between 0 and 9
CASE WHEN MediaID LIKE '%[0-9A-F]%' THEN 1 ELSE 0 END
I would recommend writing a function that can be used to evaluate MediaID first and checks if it is hexadecimal and then running the query for conversion
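The two LIKE patterns quoted in this thread behave very differently: '%[0-9A-F]%' is true as soon as one hex character appears anywhere, while '%[^0-9A-F]%' is true only when some non-hex character appears. That would explain why values such as "4564589+" slipped through the first one. The sketch below illustrates this in Python rather than T-SQL; the function name is hypothetical, and lowercase a-f is accepted on the assumption that the column uses a case-insensitive collation.

```python
import re

# Regex stand-ins for the two LIKE patterns (assumed equivalents, not T-SQL).
HAS_A_HEX_CHAR = re.compile(r"[0-9A-Fa-f]")       # like '%[0-9A-F]%'
HAS_A_NON_HEX_CHAR = re.compile(r"[^0-9A-Fa-f]")  # like '%[^0-9A-F]%'


def try_hex_to_int(media_id):
    """Return the integer value of a hex string, or None if it is not valid hex.

    Loosely mirrors the try_convert behaviour described above: bad input
    yields None instead of raising an error.
    """
    if not media_id or HAS_A_NON_HEX_CHAR.search(media_id):
        return None
    return int(media_id, 16)


good = ["4335463357", "004e7cf9c6", "00BD23133"]
bad = ["4564589+", "AB6B8BFC.8", "7B498DFCnm"]

# Every bad value still contains at least one hex character,
# so "contains a hex character" cannot separate good from bad.
print(all(HAS_A_HEX_CHAR.search(v) for v in bad))   # True

print([try_hex_to_int(v) is not None for v in good])  # [True, True, True]
print([try_hex_to_int(v) for v in bad])               # [None, None, None]
```

The explicit character check is used instead of relying on `int(..., 16)` alone, because Python's `int` also tolerates surrounding whitespace, underscores, and a `0x` prefix.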

Exclude rows when column contains a 1 in position 2 without using function

I have a column that will always be 5 digits long, and each digit will always be a 1 or a 0. I need to put in my where clause to exclude when the second position is equal to 1. For example 01000 is to be excluded but 10010 is to be kept. I currently have:
WHERE (SUBSTRING(field, 2, 1) <> '1') or field IS NULL
How do I do this without using the Substring function?
Edit:Also, the column is a varchar(10) in the database. Does this matter?
You could use the like operator to check that character directly:
WHERE field NOT LIKE '_1%' OR field IS NULL
Use LEFT and RIGHT to isolate the second character, then check whether it is 1, as below:
WHERE RIGHT(LEFT(field,2),1) <> '1' OR field IS NULL
No.
If 'field' is of a string type, you need to use string functions to manipulate it. SUBSTRING or some other flavor of it.
You can also convert it to binary and use bitwise AND operator but that won't solve the root issue here.
You are facing the consequences of someone ignoring 1NF.
There is a reason why Codd insisted that every "cell" must be atomic. Yours is not.
Can you separate this bitmap into atomic attribute columns?

select using wildcard to find ending in two character then numeric

I am querying to find things ending in "ST" followed by a number 1 - 999.
SELECT NUMBER WHERE NUMBER LIKE '%ST' -- works correctly to return everything ending in "ST"
SELECT NUMBER WHERE NUMBER LIKE '%[1-999]' -- works correctly to return everything ending in 1 - 999
SELECT NUMBER WHERE NUMBER LIKE '%ST[1-999]' -- doesn't work - returns nothing
Also tried:
SELECT NUMBER WHERE NUMBER LIKE '%ST%[1-999]' -- works, but also returns things like "GRASTNT3" that have extra things between the "ST" and the number
Can anyone help this struggling beginner?
Thanks!
The problem is that [1-999] doesn't mean what you think it does.
SQL Server interprets that as a set of values (1-9, 9, 9) which basically means that if there's more than 1 digit after the ST, the entry won't be returned.
So far as I can tell, your best bet is:
SELECT NUMBER WHERE
NUMBER LIKE '%ST[1-9][0-9][0-9]' OR
NUMBER LIKE '%ST[1-9][0-9]' OR
NUMBER LIKE '%ST[1-9]'
(assuming that your numbers don't have leading zeros - if they do, change each [1-9] to [0-9])
You need to do
SELECT NUMBER WHERE
NUMBER LIKE '%ST[1-9][0-9][0-9]'
OR NUMBER LIKE '%ST[1-9][0-9]'
OR NUMBER LIKE '%ST[1-9]';
The group in the [] is a set of single characters (Char/NChar), not an Int range.
Better still normalise and type your data, so you have an ST bit and an int column for the number.
If you find you need to define different filters on variable string data, consider Full Text Searching or another Lucene related technology depending on your RDBMS.
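Both answers rest on the same point: a bracket group matches exactly one character, so [1-999] is just the set 1 to 9 written redundantly. Python's regular expressions treat bracket groups the same way, which makes the behaviour easy to check. This is a sketch only; the sample values other than "GRASTNT3" are invented, and the regex is an assumed equivalent of the three OR-ed LIKE patterns.

```python
import re

# A bracket group is a set of single characters: '1-9', '9', '9'.
BRACKET_GROUP = re.compile(r"[1-999]")

# Assumed equivalent of:
#   LIKE '%ST[1-9]' OR LIKE '%ST[1-9][0-9]' OR LIKE '%ST[1-9][0-9][0-9]'
ENDS_WITH_ST_NUMBER = re.compile(r".*ST[1-9][0-9]{0,2}")

print(bool(BRACKET_GROUP.fullmatch("5")))    # True
print(bool(BRACKET_GROUP.fullmatch("12")))   # False: two characters
print(bool(BRACKET_GROUP.fullmatch("0")))    # False: 0 is not in the set

# Invented sample values, apart from GRASTNT3 which comes from the question.
for value in ["MAINST7", "MAINST42", "MAINST999", "MAINST1000", "MAINST0", "GRASTNT3"]:
    print(value, bool(ENDS_WITH_ST_NUMBER.fullmatch(value)))
```

Only the first three sample values match: "MAINST1000" has four digits, "MAINST0" starts its number with a zero, and "GRASTNT3" has other characters between "ST" and the digit.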

Problem with MySQL Select query with "IN" condition

I found a weird problem with MySQL select statement having "IN" in where clause:
I am trying this query:
SELECT ads.*
FROM advertisement_urls ads
WHERE ad_pool_id = 5
AND status = 1
AND ads.id = 23
AND 3 NOT IN (hide_from_publishers)
ORDER BY rank desc
In above SQL hide_from_publishers is a column of advertisement_urls table, with values as comma separated integers, e.g. 4,2 or 2,7,3 etc.
As a result, if hide_from_publishers contains same above two values, it should return only record for "4,2" but it returns both records
Now, if I change the value of hide_from_publishers for the second record to 3,2,7 and run the query again, it returns a single record, which is the correct output.
Instead of hide_from_publishers if I use direct values there, i.e. (2,7,3) it does recognize and returns single record.
Any thoughts about this strange problem or am I doing something wrong?
There is a difference between the tuple (1, 2, 3) and the string "1, 2, 3". The former is three values, the latter is a single string value that just happens to look like three values to human eyes. As far as the DBMS is concerned, it's still a single value.
If you want more than one value associated with a record, you shouldn't be storing it as a comma-separated value within a single field, you should store it in another table and join it. That way the data remains structured and you can use it as part of a query.
You need to treat the comma-delimited hide_from_publishers column as a string. You can use the LOCATE function to determine if your value exists in the string.
Note that I've added leading and trailing commas to both strings so that a search for "3" doesn't accidentally match "13".
select ads.*
from advertisement_urls ads
where ad_pool_id = 5
and status = 1
and ads.id = 23
and locate(',3,', concat(',', hide_from_publishers, ',')) = 0
order by rank desc
You need to split the string of values into separate values. See this SO question...
Can Mysql Split a column?
As well as the supplied example...
http://blog.fedecarg.com/2009/02/22/mysql-split-string-function/
Here is another SO question:
MySQL query finding values in a comma separated string
And the suggested solution:
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_find-in-set
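The membership tests suggested above can be sketched outside the database to show why the commas matter. This is Python, not MySQL: `find_in_set` below is a hypothetical stand-in that imitates the 1-based result of MySQL's FIND_IN_SET, and `locate_wrapped` imitates the comma-wrapping trick used with LOCATE.

```python
def find_in_set(needle: str, csv: str) -> int:
    """Return the 1-based position of needle in a comma-separated list, or 0."""
    items = csv.split(",")
    return items.index(needle) + 1 if needle in items else 0


def locate_wrapped(needle: str, csv: str) -> int:
    """Search for ',needle,' inside ',csv,' and return a 1-based position, or 0."""
    # str.find returns -1 when nothing is found, so adding 1 gives 0.
    return ("," + csv + ",").find("," + needle + ",") + 1


print(find_in_set("3", "2,7,3"))   # 3
print(find_in_set("3", "4,2"))     # 0

# A plain substring search would wrongly report 3 as present here.
print("3" in "13,30")              # True
print(find_in_set("3", "13,30"))   # 0
print(locate_wrapped("3", "13,30"))  # 0
```

Either test works on the stored string, but both still scan the text of every row, which is one more reason to move the values into a separate table as suggested above.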