Determine if zip code contains numbers only - sql

I have a field called zip, type char(5), which contains zip codes like
12345
54321
ABCDE
I'd like to check with an sql statement if a zip code contains numbers only.
The following isn't working
SELECT * FROM S1234.PERSON
WHERE ZIP NOT LIKE '%'
It can't work because even '12345' is an "array" of characters (it matches '%', right?).
I found out that the following is working:
SELECT * FROM S1234.PERSON
WHERE ZIP NOT LIKE ' %'
It has a space before %. Why is this working?

If you use SQL Server 2012 or up the following script should work.
DECLARE @t TABLE (Zip VARCHAR(10))
INSERT INTO @t VALUES ('12345')
INSERT INTO @t VALUES ('54321')
INSERT INTO @t VALUES ('ABCDE')
SELECT *
FROM @t AS t
WHERE TRY_CAST(Zip AS NUMERIC) IS NOT NULL

Using answer from here to check if all are digit
SELECT col1,col2
FROM
(
SELECT col1,col2,
CASE
WHEN LENGTH(RTRIM(TRANSLATE(ZIP , '*', ' 0123456789'))) = 0
THEN 0 ELSE 1
END as IsAllDigit
FROM S1234.PERSON
) AS Z
WHERE IsAllDigit=0
DB2 does not have a regular expression facility like MySQL's REGEXP.
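To make the TRANSLATE trick easier to follow, here is a minimal Python sketch of the same idea: map a space to '*' and every digit to a space, strip trailing spaces, and see whether anything is left. The function name is_all_digits is made up for illustration, and this only mimics the DB2 expression, it is not DB2 code.

```python
def is_all_digits(value: str) -> bool:
    """Mimic LENGTH(RTRIM(TRANSLATE(zip, '*', ' 0123456789'))) = 0."""
    # A space becomes '*', each ASCII digit becomes a space (the pad character).
    table = str.maketrans(" 0123456789", "*" + " " * 10)
    translated = value.translate(table)
    # If only digits were present, nothing but spaces remains to strip.
    return len(translated.rstrip(" ")) == 0


for zip_code in ["12345", "54321", "ABCDE", "12 45"]:
    print(repr(zip_code), is_all_digits(zip_code))
```

Turning the space into '*' first is what stops a value such as '12 45' from passing as all digits.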

Use the ISNUMERIC function.
ISNUMERIC returns 1 if the parameter is a valid numeric value and 0 if it is not.
EXAMPLE:
SELECT * FROM S1234.PERSON
WHERE ISNUMERIC(ZIP) = 1
Your statement doesn't validate against numbers; it says "get everything that doesn't start with a space".
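The LIKE behavior can be checked with a small sketch. This uses Python's built-in sqlite3 module as a stand-in for DB2 (an assumption: the wildcard semantics of '%' are the same, but CHAR(5) padding rules may differ), and the table and helper names are invented for the example.

```python
import sqlite3


def rows_matching(where_clause, zips):
    """Return the zip values selected by the given WHERE clause, in insertion order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (zip TEXT)")
    conn.executemany("INSERT INTO person VALUES (?)", [(z,) for z in zips])
    rows = conn.execute(
        f"SELECT zip FROM person WHERE {where_clause} ORDER BY rowid"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


zips = ["12345", "54321", "ABCDE", " 1234"]

# '%' matches every non-NULL string, so NOT LIKE '%' selects nothing.
print(rows_matching("zip NOT LIKE '%'", zips))

# ' %' matches only strings that start with a space, so NOT LIKE ' %'
# keeps every row that starts with something else, letters included.
print(rows_matching("zip NOT LIKE ' %'", zips))
```

Note that 'ABCDE' survives the second filter, which is why that query only looks like it is validating digits.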

Let's suppose your ZIP code is a US zip code, composed of 5 digits.
db2 "with val as (
select *
from S1234.PERSON t
where xmlcast(xmlquery('fn:matches(\$ZIP,''^\d{5}$'')') as integer) = 1
)
select * from val"
For more information about xQuery:fn:matches: http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.xml.doc/doc/xqrfnmat.html

MySQL does not have a native ISNUMERIC() function. This would be pretty straightforward in Excel with the ISNUMBER() function, or in T-SQL with ISNUMERIC(), but neither works in MySQL, so after a little searching around I came across this solution...
SELECT * FROM S1234.PERSON
WHERE ZIP REGEXP '^[0-9]+$'
Effectively we're processing a regular expression on the contents of the 'ZIP' field. It may seem like using a sledgehammer to crack a nut, and I've no idea how performance would differ from a simpler approach, but it worked and I guess that's the point.
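One thing worth checking with any regex approach is anchoring. An unanchored pattern such as '[0-9]' matches any value that merely contains a digit, while '^[0-9]+$' requires every character to be a digit. A rough sketch using Python's re module (not MySQL itself; the helper names are made up):

```python
import re

CONTAINS_DIGIT = re.compile(r"[0-9]")    # unanchored: one digit anywhere is enough
ONLY_DIGITS = re.compile(r"^[0-9]+$")    # anchored: the whole value must be digits


def contains_digit(value: str) -> bool:
    return CONTAINS_DIGIT.search(value) is not None


def only_digits(value: str) -> bool:
    return ONLY_DIGITS.search(value) is not None


for zip_code in ["12345", "A1234", "ABCDE"]:
    print(zip_code, contains_digit(zip_code), only_digits(zip_code))
```

'A1234' passes the unanchored check but fails the anchored one, which is the difference that matters for a "numbers only" test.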

I have made a less error-prone version based on the solution https://stackoverflow.com/a/36211270/565525, and added an intermediate result and some examples:
select
test_str
, TRIM(TRANSLATE(replace(trim(test_str), ' ', 'x'), 'yyyyyyyyyyy', '0123456789'))
, case when length(TRIM(TRANSLATE(replace(trim(test_str), ' ', 'x'), 'yyyyyyyyyyy', '0123456789')))=5 then '5-digit-zip' else 'not 5d-zip' end is_zip
from (VALUES
(' 123 ' )
,(' abc ' )
,(' a12 ' )
,(' 12 3 ')
,(' 99435 ')
,('99323' )
) AS X(test_str)
;
The result for this example set is:
TEST_STR 2        IS_ZIP
-------- -------- -----------
123      yyy      not 5d-zip
abc      abc      not 5d-zip
a12      ayy      not 5d-zip
12 3     yyxy     not 5d-zip
99435    yyyyy    5-digit-zip
99323    yyyyy    5-digit-zip

Try checking if there's a difference between lower case and upper case. Numerics and special chars will look the same:
SELECT *
FROM S1234.PERSON
WHERE UPPER(ZIP COLLATE Latin1_General_CS_AI ) = LOWER(ZIP COLLATE Latin1_General_CS_AI)

Here's a working example for the case where you'd want to check zip codes in a range. You could use this code for inspiration to make a simple single post code check, if you want:
if local_test_environment?
# SQLite supports GLOB which is similar to LIKE (which it only has limited support for), for matching in strings.
where("(zip_code NOT GLOB '*[^0-9]*' AND zip_code <> '') AND (CAST(zip_code AS int) >= :range_start AND CAST(zip_code AS int) <= :range_finish)", range_start: range_start, range_finish: range_finish)
else
# SQLServer supports LIKE with more advanced matching in strings than what SQLite supports.
# SQLServer supports TRY_PARSE which is non-standard SQL, but fixes the error SQLServer gives with CAST, namely: Conversion failed when converting the nvarchar value 'US-19803' to data type int.
where("(zip_code NOT LIKE '%[^0-9]%' AND zip_code <> '') AND (TRY_PARSE(zip_code AS int) >= :range_start AND TRY_PARSE(zip_code AS int) <= :range_finish)", range_start: range_start, range_finish: range_finish)
end
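The SQLite branch can be tried outside of Rails with Python's built-in sqlite3 module. This is only a sketch: the addresses table and the zips_in_range helper are invented for the example, and the WHERE clause is the one from the SQLite branch above.

```python
import sqlite3


def zips_in_range(zips, range_start, range_finish):
    """Return digit-only zip codes whose integer value lies in the given range."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE addresses (zip_code TEXT)")
    conn.executemany("INSERT INTO addresses VALUES (?)", [(z,) for z in zips])
    rows = conn.execute(
        # NOT GLOB '*[^0-9]*' rejects any value containing a non-digit character.
        "SELECT zip_code FROM addresses "
        "WHERE (zip_code NOT GLOB '*[^0-9]*' AND zip_code <> '') "
        "AND (CAST(zip_code AS int) >= :range_start "
        "AND CAST(zip_code AS int) <= :range_finish) "
        "ORDER BY rowid",
        {"range_start": range_start, "range_finish": range_finish},
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


sample = ["19803", "US-19803", "", "20001", "1980"]
print(zips_in_range(sample, 19000, 19999))
```

The empty-string check is needed because an empty value contains no non-digit character and would otherwise pass the GLOB test.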

Use regex.
SELECT * FROM S1234.PERSON
WHERE ZIP REGEXP '^[0-9]+$'

Related

How to fetch only a part of string

I have a column which has inconsistent data. The column is named ID and it can have values such as
0897546321
ABC,0876455321
ABC,XYZ,0873647773
ABC,
99756
test only
The SQL query should fetch only IDs that are 10 digits long, begin with 08, are not null, and do not consist only of letters. For values that have both digits and letters, such as ABC,XYZ,0873647773, it should fetch only the 0873647773. In these kinds of values nothing is fixed: in place of ABC, XYZ there can be anything, of any length.
The column Id is of varchar type.
My try: I tried the following query
select id
from table
where id is not null
and id not like '%[^0-9]%'
and id like '[08]%[0-9]'
and len(id)=10
I am still not sure how I should deal with values like ABC,XYZ,0873647773
P.S - I have no control over the database. I can't change its values.
SQL Server generally has poor support for regular expressions, but in this case a judicious use of PATINDEX is viable:
SELECT SUBSTRING(id, PATINDEX('%,08[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9],%', ',' + id + ','), 10) AS number
FROM yourTable
WHERE ',' + id + ',' LIKE '%,08[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9],%';
If you normalise your data, and split the delimited data into parts, you can achieve this somewhat more easily:
SELECT SS.value
FROM dbo.YourTable YT
CROSS APPLY STRING_SPLIT(YT.YourColumn,',') SS
WHERE LEN(SS.value) = 10
AND SS.value NOT LIKE '%[^0-9]%';
If you're on an older version of SQL Server, you'll have to use an alternative string splitter method (such as an XML splitter or a user-defined inline table-valued function); there are plenty of examples of these already on Stack Overflow.
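The split-then-filter idea is easy to check outside the database. A small Python sketch, assuming the parts are separated by plain commas with no surrounding spaces (extract_ids and ID_PATTERN are names made up for the example):

```python
import re

# "08" followed by exactly eight more ASCII digits: ten digits in total.
ID_PATTERN = re.compile(r"08[0-9]{8}")


def extract_ids(raw):
    """Split a comma-delimited value and keep only the 10-digit IDs starting with 08."""
    if raw is None:
        return []
    return [part for part in raw.split(",") if ID_PATTERN.fullmatch(part)]


samples = ["0897546321", "ABC,0876455321", "ABC,XYZ,0873647773",
           "ABC,", "99756", "test only", None]
for value in samples:
    print(repr(value), extract_ids(value))
```

fullmatch plays the role of the surrounding commas in the PATINDEX version: it keeps an 11-digit part from being accepted because it contains a valid 10-digit run.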

SQL query: convert

I'm trying to read a column from a database using a SQL query. The column consists of empty string or numbers as strings, such as
"7500" "4460" "" "2900" "2640" "1850" "" "2570" "9050" "8000" "9600"
I'm trying to find the right SQL query to extract all the numbers (as integers) and remove the empty ones, but I'm stuck. So far I've got
SELECT *
FROM base
WHERE CONVERT(INT, code) IS NOT NULL
Done in program R (package sqldf)
If all non-empty values are valid integers, you could use:
select * , cast(code as int) IntCode
from base
where code <> ''
To prevent cases when field code is not a valid number, use:
select *, cast(codeN as int) IntCode
from base
cross apply (select case when code <> '' and not code like '%[^0-9]%' then code else NULL end) N(codeN)
where codeN is not null
UPDATE
To find rows where code is not a valid number, use
select * from base where code like '%[^0-9]%'
select *
from base
where col like '[1-9]%'
Example: http://sqlfiddle.com/#!6/f7626/2/0
If you don't need to check that the whole value is a valid number (that is, to reject a string such as '909XY2'), then this may run marginally faster, depending on the size of the table.
Is this what you want?
SELECT (case when code not like '%[^0-9]%' then cast(code as int) end)
FROM base
WHERE code <> '' and code not like '%[^0-9]%';
The conditions are repeated in the where and case on purpose. SQL Server does not guarantee that where filters are applied before logic in the select, so you can get an error with conversions. More recent versions of SQL Server have try_convert() to fix this problem.
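The point about keeping the guard next to the conversion can be illustrated with a tiny Python sketch. The helper try_convert_int is hypothetical and only loosely imitates what try_convert() does in T-SQL: it returns None instead of raising when the value is not a plain run of digits.

```python
def try_convert_int(code):
    """Return int(code) for a non-empty, ASCII digit-only string, else None."""
    # The validity check lives in the same expression as the conversion,
    # so the conversion can never run on a value the check has not approved.
    if code and code.isascii() and code.isdigit():
        return int(code)
    return None


codes = ["7500", "4460", "", "2900", "909XY2"]
converted = [try_convert_int(c) for c in codes]
print([n for n in converted if n is not None])
```

This is the same reason the SQL above repeats the condition inside the CASE: the check and the cast are evaluated together, whatever order the engine picks for the WHERE clause.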
Using sqldf with the default sqlite database and this test data:
DF <- data.frame(a = c("7500", "4460", "", "2900", "2640", "1850", "", "2570",
"9050", "8000", "9600"), stringsAsFactors = FALSE)
try this:
library(sqldf)
sqldf("select cast(a as aint) as aint from DF where length(a) > 0")
giving:
aint
1 7500
2 4460
3 2900
4 2640
5 1850
6 2570
7 9050
8 8000
9 9600
Note In plain R one could write:
transform(subset(DF, nchar(a) > 0), a = as.integer(a))

How to find repeating numbers in a column in SQL server . Eg 11111, 33333333, 5555555555,7777777 etc

I need to identify repeated numbers (e.g. 1111, 33333333, 5555555555, 777777777 etc.) in a column.
How can I do this in SQL Server without having to hard-code every scenario? The max length of the column is 10. Any help is appreciated.
This will check if the column has all the same value in it.
SELECT *
FROM tablename
WHERE columnname = REPLICATE(LEFT(columnname,1),LEN(columnname))
As Nicholas Cary notes, if the column is numbers you'd need to cast as varchar first:
SELECT *
FROM tablename
WHERE CAST(columnname AS VARCHAR(10)) = REPLICATE(LEFT(CAST(columnname AS VARCHAR(10)),1),LEN(CAST(columnname AS VARCHAR(10))))
Riffing on @Dave.Gugg's excellent answer, here's another way, using patindex() to look for a character different from the first.
select *
from some_table t
where 0 = patindex( '%[^' + left(t.some_column,1) + ']%' , t.some_column )
Again, this only works for string types (char,varchar, etc.). Numeric types such as int will need to be converted first.
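Both queries boil down to "the value equals its first character repeated for its whole length". A minimal Python sketch of that check (the function name is made up; numbers are converted to text first, as the notes above require):

```python
def is_single_repeated_char(value) -> bool:
    """True when the text form of value is one character repeated throughout."""
    text = str(value)
    # Same idea as: columnname = REPLICATE(LEFT(columnname, 1), LEN(columnname))
    return len(text) > 0 and text == text[0] * len(text)


for candidate in [11111, "33333333", "5555555555", "12345", "7"]:
    print(candidate, is_single_repeated_char(candidate))
```

A single-character value such as '7' counts as repeated here, just as it does in the SQL versions, so add a length condition if that is not wanted.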

SQL Server - Select column that contains query string and split values into anothers 'columns'

I need to do a select in a column that contains a query string like:
user_id=300&company_id=201503&status=WAITING OPERATION&count=1
I want to perform a select and break each value in a new column, something like:
user_id | company_id | status | count
300 | 201503 | WAITING OPERATION | 1
How can I do it in SQL Server without using procs?
I've tried a function:
CREATE FUNCTION [xpto].[SplitGriswold]
(
@List NVARCHAR(MAX),
@Delim1 NCHAR(1),
@Delim2 NCHAR(1)
)
RETURNS TABLE
AS
RETURN
(
SELECT
Val1 = PARSENAME(Value,2),
Val2 = PARSENAME(Value,1)
FROM
(
SELECT REPLACE(Value, @Delim2, '&') FROM
(
SELECT LTRIM(RTRIM(SUBSTRING(@List, [Number],
CHARINDEX(@Delim1, @List + @Delim1, [Number]) - [Number])))
FROM (SELECT Number = ROW_NUMBER() OVER (ORDER BY name)
FROM sys.all_objects) AS x
WHERE Number <= LEN(@List)
AND SUBSTRING(@Delim1 + @List, [Number], LEN(@Delim1)) = @Delim1
) AS y(Value)
) AS z(Value)
);
GO
Execution:
select QueryString
from User.Log
CROSS APPLY notifier.SplitGriswold(REPLACE(QueryString, ' ', N'ŏ'), N'ŏ', '&') AS t;
But it returns me only one column with all inside:
QueryString
user_id=300&company_id=201503&status=WAITING OPERATION&count=1
Thanks in advance.
I've had to do this many times before, and you're in luck! Since you only have 3 delimiters per string, and that number is fixed, you can use SQL Server's PARSENAME function to do it. That's far less ugly than the best alternative (using the XML parsing stuff). Try this (untested) query (replace TABLE_NAME and COLUMN_NAME with the appropriate names):
SELECT
-- PARSENAME numbers the parts from the right, so part 4 is the leftmost one.
PARSENAME(REPLACE(COLUMN_NAME,'&','.'),4) AS 'User',
PARSENAME(REPLACE(COLUMN_NAME,'&','.'),3) AS 'Company_ID',
PARSENAME(REPLACE(COLUMN_NAME,'&','.'),2) AS 'Status',
PARSENAME(REPLACE(COLUMN_NAME,'&','.'),1) AS 'Count'
FROM TABLE_NAME
That'll get you the results in the form "user_id=300", which is far and away the hard part of what you want. I'll leave it to you to do the easy part (drop the stuff before the "=" sign).
NOTE: I can't remember if PARSENAME will freak out over the illegal name character (the "=" sign). If it does, simply nest another REPLACE in there to turn it into something else, like an underscore.
You need to use SQL SUBSTRING as part of your select statement. You would first need to build the first row, then use a UNION to return the second row.
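If doing the split in application code is an option, the target shape is easy to see in Python. This is a sketch using the standard urllib.parse.parse_qsl function on the sample string from the question; it does not replace the T-SQL approaches above, and split_query_string is a name invented for the example.

```python
from urllib.parse import parse_qsl


def split_query_string(query_string: str) -> dict:
    """Turn 'a=1&b=2' into {'a': '1', 'b': '2'}, keeping the original key order."""
    return dict(parse_qsl(query_string))


row = split_query_string(
    "user_id=300&company_id=201503&status=WAITING OPERATION&count=1"
)
print(row["user_id"], row["company_id"], row["status"], row["count"])
```

Each key becomes a column name and each value the cell under it, which is the layout the question asks for.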

How to check if a string is a uniqueidentifier?

Is there an equivalent to IsDate or IsNumeric for uniqueidentifier (SQL Server)?
Or is there anything equivalent to (C#) TryParse?
Otherwise I'll have to write my own function, but I want to make sure I'm not reinventing the wheel.
The scenario I'm trying to cover is the following:
SELECT something FROM table WHERE IsUniqueidentifier(column) = 1
SQL Server 2012 makes this all much easier with TRY_CONVERT(UNIQUEIDENTIFIER, expression)
SELECT something
FROM your_table
WHERE TRY_CONVERT(UNIQUEIDENTIFIER, your_column) IS NOT NULL;
For prior versions of SQL Server, the existing answers miss a few points that mean they may either not match strings that SQL Server will in fact cast to UNIQUEIDENTIFIER without complaint or may still end up causing invalid cast errors.
SQL Server accepts GUIDs either wrapped in {} or without this.
Additionally it ignores extraneous characters at the end of the string. Both SELECT CAST('{5D944516-98E6-44C5-849F-9C277833C01B}ssssssssss' as uniqueidentifier) and SELECT CAST('5D944516-98E6-44C5-849F-9C277833C01BXXXXXXXXXXXXXXXXXXXXXXXXXXXXX' as uniqueidentifier) succeed for instance.
Under most default collations the LIKE '[a-zA-Z0-9]' will end up matching characters such as À or Ë
Finally if casting rows in a result to uniqueidentifier it is important to put the cast attempt in a case expression as the cast may occur before the rows are filtered by the WHERE.
So (borrowing @r0d30b0y's idea) a slightly more robust version might be
;WITH T(C)
AS (SELECT '5D944516-98E6-44C5-849F-9C277833C01B'
UNION ALL
SELECT '{5D944516-98E6-44C5-849F-9C277833C01B}'
UNION ALL
SELECT '5D944516-98E6-44C5-849F-9C277833C01BXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
UNION ALL
SELECT '{5D944516-98E6-44C5-849F-9C277833C01B}ssssssssss'
UNION ALL
SELECT 'ÀD944516-98E6-44C5-849F-9C277833C01B'
UNION ALL
SELECT 'fish')
SELECT CASE
WHEN C LIKE expression + '%'
OR C LIKE '{' + expression + '}%' THEN CAST(C AS UNIQUEIDENTIFIER)
END
FROM T
CROSS APPLY (SELECT REPLACE('00000000-0000-0000-0000-000000000000', '0', '[0-9a-fA-F]') COLLATE Latin1_General_BIN) C2(expression)
WHERE C LIKE expression + '%'
OR C LIKE '{' + expression + '}%'
Not mine, found this online... thought I'd share.
SELECT 1 WHERE @StringToCompare LIKE
REPLACE('00000000-0000-0000-0000-000000000000', '0', '[0-9a-fA-F]');
SELECT something
FROM table1
WHERE column1 LIKE '[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]-[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]-[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]-[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]-[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]';
UPDATE:
...but I much prefer the approach in the answer by @r0d30b0y:
SELECT something
FROM table1
WHERE column1 LIKE REPLACE('00000000-0000-0000-0000-000000000000', '0', '[0-9a-fA-F]');
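The REPLACE trick builds the long pattern from a short template, and the same construction works for a regular expression. A minimal Python sketch (names are made up; unlike CAST in SQL Server, this rejects braces and trailing characters):

```python
import re

GUID_TEMPLATE = "00000000-0000-0000-0000-000000000000"
# Swap every '0' for a hex-digit class; the hyphens stay where they are.
GUID_RE = re.compile(GUID_TEMPLATE.replace("0", "[0-9a-fA-F]"))


def looks_like_guid(text: str) -> bool:
    """True when text is exactly 8-4-4-4-12 hexadecimal digits."""
    return GUID_RE.fullmatch(text) is not None


for candidate in ["5D944516-98E6-44C5-849F-9C277833C01B",
                  "ÀD944516-98E6-44C5-849F-9C277833C01B",
                  "fish"]:
    print(candidate, looks_like_guid(candidate))
```

Because the character class lists ASCII ranges explicitly, a value starting with À is rejected here without any collation concerns.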
I am not aware of anything that you could use "out of the box" - you'll have to write this on your own, I'm afraid.
If you can: try to write this inside a C# library and deploy it into SQL Server as a SQL-CLR assembly - then you could use things like Guid.TryParse() which is certainly much easier to use than anything in T-SQL....
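For comparison, this is roughly what a try-parse helper looks like when the host language already has a GUID type. The sketch uses Python's standard uuid module rather than C#, so the acceptance rules are Python's (braces are allowed, trailing junk is not) and will not match SQL Server's CAST exactly; try_parse_guid is a made-up name.

```python
import uuid


def try_parse_guid(text):
    """Return a uuid.UUID for a parseable string, or None instead of raising."""
    try:
        return uuid.UUID(text)
    except (ValueError, AttributeError, TypeError):
        return None


for candidate in ["5D944516-98E6-44C5-849F-9C277833C01B",
                  "{5D944516-98E6-44C5-849F-9C277833C01B}",
                  "fish"]:
    print(candidate, try_parse_guid(candidate))
```

Returning None rather than raising mirrors the TryParse style, and makes the helper usable in a filter.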
A variant of r0d30b0y's answer is to use PATINDEX to find a GUID within a string...
PATINDEX('%'+REPLACE('00000000-0000-0000-0000-000000000000', '0', '[0-9a-fA-F]')+'%',@StringToCompare) > 0
Had to use to find Guids within a URL string..
HTH
Dave
I like to keep it simple. A GUID has four hyphens in it, even if it is just a string.
WHERE column like '%-%-%-%-%'
Though an older post, just a thought for a quick test ...
SELECT [A].[INPUT],
CAST([A].[INPUT] AS [UNIQUEIDENTIFIER])
FROM (
SELECT '5D944516-98E6-44C5-849F-9C277833C01B' Collate Latin1_General_100_BIN AS [INPUT]
UNION ALL
SELECT '{5D944516-98E6-44C5-849F-9C277833C01B}'
UNION ALL
SELECT '5D944516-98E6-44C5-849F-9C277833C01BXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
UNION ALL
SELECT '{5D944516-98E6-44C5-849F-9C277833C01B}ssssssssss'
UNION ALL
SELECT 'ÀD944516-98E6-44C5-849F-9C277833C01B'
UNION ALL
SELECT 'fish'
) [A]
WHERE PATINDEX('[^0-9A-F-{}]%', [A].[INPUT]) = 0
This is a function based on the concept of some earlier comments. This function is very fast.
CREATE FUNCTION [dbo].[IsGuid] (@input varchar(50))
RETURNS bit AS
BEGIN
RETURN
case when @input like '[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]-[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]-[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]-[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]-[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]'
then 1 else 0 end
END
GO
/*
Usage:
select [dbo].[IsGuid]('123') -- Returns 0
select [dbo].[IsGuid]('ebd8aebd-7ea3-439d-a7bc-e009dee0eae0') -- Returns 1
select * from SomeTable where dbo.IsGuid(TableField) = 0 -- Returns table with all non convertable items!
*/
DECLARE @guid_string nvarchar(256) = 'ACE79678-61D1-46E6-93EC-893AD559CC78'
SELECT
CASE WHEN @guid_string LIKE '________-____-____-____-____________'
THEN CONVERT(uniqueidentifier, @guid_string)
ELSE NULL
END
You can write your own UDF. This is a simple approximation to avoid the use of a SQL-CLR assembly.
CREATE FUNCTION dbo.isuniqueidentifier (@ui varchar(50))
RETURNS bit AS
BEGIN
RETURN case when
substring(@ui,9,1)='-' and
substring(@ui,14,1)='-' and
substring(@ui,19,1)='-' and
substring(@ui,24,1)='-' and
len(@ui) = 36 then 1 else 0 end
END
GO
You can then improve it to check that the remaining characters are all hex digits.
I use :
ISNULL(convert(nvarchar(50), userID), 'NULL') = 'NULL'
I had some Test users that were generated with AutoFixture, which uses GUIDs by default for generated fields. My FirstName fields for the users that I need to delete are GUIDs or uniqueidentifiers. That's how I ended up here.
I was able to cobble together some of your answers into this.
SELECT UserId FROM [Membership].[UserInfo] Where TRY_CONVERT(uniqueidentifier, FirstName) is not null
Use RLIKE for MySQL
SELECT 1 WHERE @StringToCompare
RLIKE REPLACE('00000000-0000-0000-0000-000000000000', '0', '[0-9a-fA-F]');
In the simplest scenario, when you are sure that a non-GUID string can't contain 4 '-' signs:
SELECT * FROM City WHERE Name LIKE('%-%-%-%-%')
In BigQuery you can use
SELECT *
FROM table
WHERE
REGEXP_CONTAINS(uuid, REPLACE('^00000000-0000-0000-0000-000000000000$', '0', '[0-9a-fA-F]'))