How to check if a value is a number in SQLite - sql

I have a column that contains numbers and other string values (like "?", "???", etc.)
Is it possible to add an "is number" condition to the where clause in SQLite? Something like:
select * from mytable where isnumber(mycolumn)

From the documentation,
The typeof(X) function returns a string that indicates the datatype of the expression X: "null", "integer", "real", "text", or "blob".
You can use where typeof(mycolumn) = 'integer'

You could try something like this also:
select * from mytable where printf("%d", field1) = field1;
In case your column is text and contains numeric and string, this might be somewhat helpful in extracting integer data.
Example:
CREATE TABLE mytable (field1 text);
insert into mytable values (1);
insert into mytable values ('a');
select * from mytable where printf("%d", field1) = field1;
field1
----------
1

SELECT *
FROM mytable
WHERE columnNumeric GLOB '*[0-9]*'

select * from mytable where abs(mycolumn) <> 0.0 or mycolumn = '0'
http://sqlfiddle.com/#!5/f1081/2
Based on this answer

To test whether the column contains exclusively an integer with no other alphanumeric characters, use:
NOT myColumn GLOB '*[^0-9]*' AND myColumn LIKE '_%'
I.e., we test whether the column contains anything else than a digit and invert the result. Additionally we test whether it contains at least one character.
Note that GLOB '*[0-9]*' will also match values where digits are nested between other characters. The function typeof() will return 'text' for a column typed as TEXT, even if the text represents a number. As @rayzinnz mentioned, the abs() function is not reliable either.
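The difference between the two GLOB patterns and typeof() can be checked with a small script. This is only a sketch using Python's built-in sqlite3 module; the table name, column name and sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (mycolumn TEXT)")
samples = ["123", "0", "?", "12a", "a1b", "", "-5", "1.5"]  # illustrative values
conn.executemany("INSERT INTO mytable VALUES (?)", [(s,) for s in samples])

# Digits only: no non-digit character anywhere, and at least one character.
digits_only = [row[0] for row in conn.execute(
    "SELECT mycolumn FROM mytable "
    "WHERE NOT mycolumn GLOB '*[^0-9]*' AND mycolumn LIKE '_%' "
    "ORDER BY rowid"
)]

# For comparison: values that merely contain a digit somewhere.
contains_digit = [row[0] for row in conn.execute(
    "SELECT mycolumn FROM mytable WHERE mycolumn GLOB '*[0-9]*' ORDER BY rowid"
)]

# typeof() reports the storage class, not whether the text looks numeric.
storage_classes = {row[0] for row in conn.execute(
    "SELECT typeof(mycolumn) FROM mytable"
)}

print(digits_only)
print(contains_digit)
print(storage_classes)
```

Note that the digits-only test rejects negative numbers and decimals as well, which may or may not be what you want.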

Because SQLite and MySQL share much of their syntax and both use loose data typing, the query below is also possible:
SELECT
<data>
, (
LENGTH(CAST(<data> AS UNSIGNED))
)
=
CASE WHEN CAST(<data> AS UNSIGNED) = 0
THEN CAST(<data> AS UNSIGNED)
ELSE (LENGTH(<data>)
) END AS is_int;
Note that <data> is a BNF-style placeholder; you have to replace it with your own column or value.
This answer is based on my other answer
Running SQLite demo

For integer strings, test whether the roundtrip CAST matches the original string:
SELECT * FROM mytable WHERE cast(cast(mycolumn AS INTEGER) AS TEXT) = mycolumn
For consistently-formatted real strings (for example, currency):
SELECT * FROM mytable WHERE printf("%.2f", cast(mycolumn AS REAL)) = mycolumn
Input values:
Can't have leading zeroes
Must format negatives as -number rather than (number).
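A quick way to see which inputs survive the two roundtrip tests is to run them against a handful of strings. The sketch below uses Python's built-in sqlite3 module; the sample values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (mycolumn TEXT)")
samples = ["42", "-7", "0", "007", "1 frog", "3.0", "abc", "12.50", "12.5"]
conn.executemany("INSERT INTO mytable VALUES (?)", [(s,) for s in samples])

# Integer strings: text -> INTEGER -> text must reproduce the original.
integer_rows = [row[0] for row in conn.execute(
    "SELECT mycolumn FROM mytable "
    "WHERE CAST(CAST(mycolumn AS INTEGER) AS TEXT) = mycolumn "
    "ORDER BY rowid"
)]

# Two-decimal strings: text -> REAL -> '%.2f' must reproduce the original.
money_rows = [row[0] for row in conn.execute(
    "SELECT mycolumn FROM mytable "
    "WHERE printf('%.2f', CAST(mycolumn AS REAL)) = mycolumn "
    "ORDER BY rowid"
)]

print(integer_rows)
print(money_rows)
```

Values such as '007' and '12.5' are numeric but fail the tests because they are not in the canonical format, which is the restriction described above.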

You can use the result of CAST(field AS INTEGER) for numbers greater than zero, and a simple LIKE '0' condition for values equal to zero:
SELECT *
FROM tableName
WHERE CAST(fieldName AS INTEGER) > 0
UNION
SELECT *
FROM tableName
WHERE fieldName like '0';

This answer is comprehensive and eliminates the shortcomings of all other answers. The only caveat is that it isn't sql standard... but neither is SQLite. If you manage to break this code please comment below, and I will patch it.
Figured this out accidentally. You can check for equality with the CAST value.
CASE {TEXT_field}
WHEN CAST({TEXT_field} AS INTEGER) THEN 'Integer' -- 'Number'
WHEN CAST({TEXT_field} AS REAL) THEN 'Real' -- 'Number'
ELSE 'Character'
END
OR
CASE
WHEN {TEXT_field} = CAST({TEXT_field} AS INTEGER) THEN 'Integer' --'Number'
WHEN {TEXT_field} = CAST({TEXT_field} AS Real) THEN 'Real' --'Number'
ELSE 'Character'
END
(It's the same thing just different syntax.)
Note the order of execution. REAL must come after INTEGER.
Perhaps there is some implicit casting of values prior to checking for equality, so that the right side is re-CAST to TEXT before comparison with the left side.
Updated for comment: @SimonWillison
I have added a check for 'Real' values
'1 frog' evaluated to 'Character' for me; which is correct
'0' evaluated to 'Integer' for me; which is correct
I am using SQLite version 3.31.1 with python sqlite3 version 2.6.0. The python element should not affect how a query executes.
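To reproduce the results reported above, the second form of the CASE expression can be run against a TEXT column. This is a sketch with Python's built-in sqlite3 module; the table name t, the column name val and the sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (val TEXT)")
samples = ["0", "42", "1.5", "1 frog", "abc"]  # illustrative values
conn.executemany("INSERT INTO t VALUES (?)", [(s,) for s in samples])

rows = conn.execute("""
    SELECT val,
           CASE
               WHEN val = CAST(val AS INTEGER) THEN 'Integer'
               WHEN val = CAST(val AS REAL)    THEN 'Real'
               ELSE 'Character'
           END AS kind
    FROM t
    ORDER BY rowid
""").fetchall()

# Map each input string to the label SQLite assigned to it.
classified = dict(rows)
for value, kind in classified.items():
    print(repr(value), kind)
```

The comparison works because the CAST side has numeric affinity, so SQLite tries to convert the text operand to a number before comparing; text that is not a well-formed number stays text and never equals the CAST result.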

Related

How to select rows that have numbers as a value?

I have got a table with a column that is type of VARCHAR2(255 BYTE). I would like to select only these rows that have numbers as a value, so I discard any other values as for example "lala","1z". I just want to have pure numbers from 1 to ..... 999999999 (just digital numbers in other words) :P
Could you tell me how to make it?
If you're using Oracle 12c R2 or later, use the built-in validate_conversion() function:
select *
from your_table
where validate_conversion(your_column as number) = 1
validate_conversion() returns 1 when the proposed conversion would succeed and 0 when it wouldn't. It also supports date and timestamp conversions.
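Something like this is the usual option. You could use regexp, but it's usually a bit slower. The extra 'x' is needed because Oracle treats an empty string as NULL, and translate() returns NULL whenever one of its arguments is NULL.
select column1
from tableA
where translate(column1, 'x1234567890', 'x') is null;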
Here's the regexp version kfinity referred to. The regex matches a line consisting of 1 or more digits.
select column1
from tableA
where regexp_like(column1, '^\d+$');
You don't want zero to start a number. So it seems like regular expressions are the way to go:
where regexp_like(column1, '^[1-9][0-9]*$');

Finding non-numeric values in varchar column

Requirement :
Generic query/function to check if the value provided in a varchar column in a table is actually a number & the precision does not exceed the allowed precision.
Available values:
Table_Name, Column_Name, Allowed Precision, Allowed Scale
The general advice would be to create a function and use to_number() to validate the value; however, that won't validate the allowed length (precision - scale).
My solution:
Validate the number using a regexp: NOT REGEXP_LIKE(COLUMN_NAME, '^-?[0-9.]+$')
Validate the length of the left component (the part before the decimal point; I have no idea what it's actually called), because for the scale Oracle automatically rounds off if required. As the actual column is a varchar, I will use substr and instr to find the component to the left of the decimal point.
As the regexp above allows numbers like 123...123124..55, I will also validate the number of decimal points. [If > 1 then error]
Query to find invalid numbers:
Select * From Table_Name
Where
(NOT REGEXP_LIKE(COLUMN_NAME, '^-?[0-9.]+$')
OR
Function_To_Fetch_Left_Component(COLUMN_NAME) > (Precision-Scale)
/* Can use regexp_substr now but i already had a function for that */
OR
LENGTH(Column_Name) - LENGTH(REPLACE(Column_Name,'.','')) > 1
/* Can use regexp_count as well */)
I was happy & satisfied with my solution until a column with only '.' value escaped my check and I saw the limitation of my checks. Although adding another check to validate this as well will solve my problem the solution as a whole looks very inefficient to me.
I will really appreciate a better solution [in any way].
Thanks in advance.
Look for:
One-or-more digits optionally followed by a decimal point and zero-or-more digits; or
A leading decimal point (no preceding unit digit) and then one or more (decimal) digits.
Like this:
Select *
From Table_Name
Where NOT REGEXP_LIKE(COLUMN_NAME, '^[+-]?(\d+(\.\d*)?|\.\d+)$')
If you do not want zero-padded values in the number string then:
Select *
From Table_Name
Where NOT REGEXP_LIKE(COLUMN_NAME, '^[+-]?(([1-9]\d*|0)(\.\d*)?|\.\d+)$')
With precision and scale (assuming it works as per a NUMBER( precision, scale ) data type and scale < precision):
Select *
From Table_Name
Where NOT REGEXP_LIKE(COLUMN_NAME, '^[+-]?(\d{1,'||(precision-scale)||'}(\.\d{0,'||scale||'})?|\.\d{1,'||scale||'})$')
or, for non-zero-padded numbers with precision and scale:
Select *
From Table_Name
Where NOT REGEXP_LIKE(COLUMN_NAME, '^[+-]?(([1-9]\d{0,'||(precision-scale-1)||'}|0)(\.\d{0,'||scale||'})?|\.\d{1,'||scale||'})$')
or, for any precision and scale:
Select *
From Table_Name
Where NOT REGEXP_LIKE(
COLUMN_NAME,
CASE
WHEN scale <= 0
THEN '^[+-]?(\d{1,'||precision||'}0{'||(-scale)||'})$'
WHEN scale < precision
THEN '^[+-]?(\d{1,'||(precision-scale)||'}(\.\d{0,'||scale||'})?|\.\d{1,'||scale||'})$'
WHEN scale >= precision
THEN '^[+-]?(0(\.0{0,'||scale||'})?|0?\.0{'||(scale-precision)||'}\d{1,'||precision||'})$'
END
)
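The pattern construction can be tried outside the database. The sketch below rebuilds only the middle branch (0 < scale < precision) with Python's re module; the function names are hypothetical, and Python's regex engine is assumed to behave like Oracle's for these simple constructs.

```python
import re


def number_pattern(precision: int, scale: int) -> re.Pattern:
    """Build the pattern for the 0 < scale < precision case only."""
    if not 0 < scale < precision:
        raise ValueError("this sketch only covers 0 < scale < precision")
    int_digits = precision - scale
    # Same shape as the SQL string concatenation: the digit counts are spliced in.
    return re.compile(
        r"[+-]?(\d{1,%d}(\.\d{0,%d})?|\.\d{1,%d})" % (int_digits, scale, scale),
        re.ASCII,  # restrict \d to 0-9
    )


def is_valid(value: str, precision: int, scale: int) -> bool:
    # fullmatch() plays the role of the ^...$ anchors in the SQL version.
    return number_pattern(precision, scale).fullmatch(value) is not None


for candidate in ["123.45", "-12.3", ".5", "1234.5", "123.456", "12..3", "."]:
    print(candidate, is_valid(candidate, precision=5, scale=2))
```

A lone '.' is rejected because each alternative requires at least one digit, which covers the case that slipped past the checks in the question.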
The precision means that you want at most allowed_precision digits in the number (strictly speaking, not counting leading zeros, but I'll ignore that). The scale means that at most allowed_scale can be after the decimal point.
This suggests a regular expression such as:
^[-]?[0-9]{1,<before>}[.]?[0-9]{0,<after>}$
You can construct the regular expression:
NOT REGEXP_LIKE(COLUMN_NAME,
REPLACE(REPLACE('^[-]?[0-9]{1,<before>}[.]?[0-9]{0,<after>}$', '<before>', allowed_precision - allowed_scale
), '<after>', allowed_scale))
Now, variable regular expressions are highly inefficient. You can do the logic using like and other functions as well. I think the conditions are:
(column_name not like '%.%.%' and
column_name not like '_%-%' and
translate(column_name, '0123456789-.x', 'x') is null and
length(translate(column_name, '-.x', 'x')) <= allowed_precision and
length(translate(column_name, '-.x', 'x')) >= 1 and
instr(translate(column_name, '-x', 'x'), '.') <= allowed_precision - allowed_scale + 1
)

Hive - how to check if numeric columns have number/decimal?

I am trying to write a Hive query that takes multiple numeric column names and checks whether each one has numeric values. If the column has numeric values, the output should be (column name, true); if the field has NULL or some string value, the output should be (column name, false).
SELECT distinct (test_nr1,test_nr2) FROM test.abc WHERE (test_nr1,test_nr2) not like '%[^0-9]%';
SELECT distinct test_nr1,test_nr2 from test.abc limit 2;
test_nr1 test_nr2
NULL 81432269
NULL 88868060
the desired output should be :
test_nr1 false
test_nr2 true
Since test_nr1 is a decimal field and it has NULL values, it should output false.
Appreciate valuable suggestions.
You can use the cast function. It returns NULL when the value cannot be cast to a numeric type.
For example:
select case when cast('23ccc' as double) is null then false else true end as IsNumber;
You're trying to use character class pattern matching syntax here, and it doesn't work in every SQL implementation IIRC, however, regexp matching works in most, if not all, SQL implementations.
Considering you're using hive, this should do it:
SELECT ('test_nr1', test_nr1 RLIKE '\d'), ('test_nr2', test_nr2 RLIKE '\d') FROM test.abc;
You should remember that regexp matching is very slow in SQL though.

SQLite ORDER BY string containing number starting with 0

as the title states:
I have a select query which I'm trying to "order by" a field that contains numbers. The thing is, these numbers are really strings starting with 0s, so the "order by" is doing this...
...
10
11
12
01
02
03
...
Any thoughts?
EDIT: if I do this: "...ORDER BY (field+1)" I can work around this, because somehow the string is internally being converted to an integer. Is there a way to "officially" convert it, like C's atoi?
You can use CAST http://www.sqlite.org/lang_expr.html#castexpr to cast the expression to an Integer.
sqlite> CREATE TABLE T (value VARCHAR(2));
sqlite> INSERT INTO T (value) VALUES ('10');
sqlite> INSERT INTO T (value) VALUES ('11');
sqlite> INSERT INTO T (value) VALUES ('12');
sqlite> INSERT INTO T (value) VALUES ('01');
sqlite> INSERT INTO T (value) VALUES ('02');
sqlite> INSERT INTO T (value) VALUES ('03');
sqlite> SELECT * FROM T ORDER BY CAST(value AS INTEGER);
01
02
03
10
11
12
sqlite>
if I do this: "...ORDER BY (field+1)" I can work around this, because somehow the string is internally being converted to an integer. Is there a way to "officially" convert it, like C's atoi?
Well, that's interesting, though I don't know how many DBMSs support such an operation, so I don't recommend it in case you ever need to move to a system that doesn't. You are also adding an extra operation, which can affect performance. You can also write ORDER BY (field + 0). I'm going to investigate the performance.
taken from the sqlite3 docs:
A CAST expression is used to convert the value of <expr> to a different storage class in a similar way to the conversion that takes place when a column affinity is applied to a value. Application of a CAST expression is different to application of a column affinity, as with a CAST expression the storage class conversion is forced even if it is lossy and irreversible.
4.0 Operators
All mathematical operators (+, -, *, /, %, <<, >>, &, and |) cast both operands to the NUMERIC storage class prior to being carried out. The cast is carried through even if it is lossy and irreversible. A NULL operand on a mathematical operator yields a NULL result. An operand on a mathematical operator that does not look in any way numeric and is not NULL is converted to 0 or 0.0.
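The coercion described in these two passages can be observed directly. The sketch below uses Python's built-in sqlite3 module; the sample values are chosen for illustration so that text order and numeric order actually differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Arithmetic forces a numeric conversion; non-numeric text becomes 0, NULL stays NULL.
coerced = conn.execute(
    "SELECT '12' + 0, '3.5' + 0, 'abc' + 0, NULL + 0"
).fetchone()

conn.execute("CREATE TABLE T (value VARCHAR(2))")
conn.executemany(
    "INSERT INTO T (value) VALUES (?)", [("9",), ("10",), ("2",), ("01",)]
)

def ordered(order_by: str) -> list:
    # order_by is a trusted literal in this sketch, not user input.
    return [r[0] for r in conn.execute(f"SELECT value FROM T ORDER BY {order_by}")]

as_text = ordered("value")                   # plain string comparison
by_cast = ordered("CAST(value AS INTEGER)")  # explicit conversion
by_plus = ordered("value + 0")               # implicit conversion via arithmetic

print(coerced)
print(as_text)
print(by_cast)
print(by_plus)
```

Both CAST and the arithmetic trick sort by numeric value here; CAST states the intent explicitly, which is the main argument for preferring it.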
I was curious so I ran some benchmarks:
>>> setup = """
... import sqlite3
... import timeit
...
... conn = sqlite3.connect(':memory:')
... c = conn.cursor()
... c.execute('CREATE TABLE T (value int)')
... for index in range(4000000, 0, -1):
...     _ = c.execute('INSERT INTO T (value) VALUES (%i)' % index)
... conn.commit()
... """
>>>
>>> cast_conv = "result = c.execute('SELECT * FROM T ORDER BY CAST(value AS INTEGER)')"
>>> cast_affinity = "result = c.execute('SELECT * FROM T ORDER BY (value + 0)')"
>>> timeit.Timer(cast_conv, setup).timeit(number = 1)
18.145697116851807
>>> timeit.Timer(cast_affinity, setup).timeit(number = 1)
18.259973049163818
>>>
As we can see, it's a bit slower, though not by much. Interesting.
You could use CAST:
ORDER BY CAST(columnname AS INTEGER)
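In a ListView with a CursorLoader:
String[] projection = { /* the columns you need */ };
String selection = null; // your WHERE condition, or null for all rows
String[] selectionArgs = null; // values for any ? placeholders in selection
String sort = "CAST(" + YOUR_COLUMN_NAME + " AS INTEGER)";
new CursorLoader(getActivity(), Table.CONTENT_URI, projection, selection, selectionArgs, sort);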
In SQL Server, use CAST in the ORDER BY clause to sort the column by its numeric value:
SELECT * FROM Table_Name ORDER BY CAST(COLUMNNAME AS INT);
Thanks to Skinnynerd. With Kotlin, CAST worked as follows.
CAST fixes the problem of 9 being sorted ahead of 10, or 22 ahead of 206.
Define a global variable that can be changed on demand, then plug it into the query:
var SortOrder:String?=null
To change the order, use:
For descending:
SortOrder = "CAST(MyNumber AS INTEGER)" + " DESC"
(from highest to lowest)
For ascending:
SortOrder = "CAST(MyNumber AS INTEGER)" + " ASC"
(from lowest to highest)

PostgreSQL: IN A SINGLE SQL SYNTAX order by numeric value computed from a text column

A column has a string values like "1/200", "3.5" or "6". How can I convert this String to numeric value in single SQL query?
My actual SQL is more complicated, here is a simple example:
SELECT number_value_in_string FROM table
number_value_in_string's format will be one of:
##
#.##
#/###
I need to sort by the numeric value of this column. But of course postgres doesn't agree with me that 1/200 is a proper number.
Seeing your name I cannot but post a simplification of your answer:
SELECT id, number_value_in_string FROM table
ORDER BY CASE WHEN substr(number_value_in_string,1,2) = '1/'
THEN 1/substr(number_value_in_string,3)::numeric
ELSE number_value_in_string::numeric END, id;
Ignoring possible divide by zero.
I would define a stored function to convert the string to a numeric value, more or less like this:
CREATE OR REPLACE FUNCTION fraction_to_number(s CHARACTER VARYING)
RETURNS DOUBLE PRECISION AS $$
BEGIN
-- 'a/b' is evaluated as a division; anything else is cast directly
RETURN
CASE WHEN s LIKE '%/%' THEN
CAST(split_part(s, '/', 1) AS DOUBLE PRECISION)
/ CAST(split_part(s, '/', 2) AS DOUBLE PRECISION)
ELSE
CAST(s AS DOUBLE PRECISION)
END;
END;
$$ LANGUAGE plpgsql IMMUTABLE;
Then you can ORDER BY fraction_to_number(weird_column)
If possible, I would revisit the data design. Is all this complexity really necessary?
This postgres SQL does the trick:
select (parts[1] :: decimal) / (parts[2] :: decimal) as quotient
FROM (select regexp_split_to_array(number_value_in_string, '/') as parts from table) x
Here's a test of this code:
select (parts[1] :: decimal) / (parts[2] :: decimal) as quotient
FROM (select regexp_split_to_array('1/200', '/') as parts) x
Output:
0.005
Note that you would need to wrap this in a case statement to protect against divide-by-zero errors and/or array out of bounds issues etc if the column did not contain a forward slash
Note also that you could do it without the inner select, but you would have to use regexp_split_to_array twice (once for each part) and you would probably incur a performance hit. Nevertheless, it may be easier to code in-line and just accept the small performance loss.
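The rule these answers implement (split on '/', divide the parts, otherwise convert directly) can be mirrored outside the database to check the expected ordering, including the divide-by-zero and missing-slash cases mentioned above. This is only a Python sketch; it borrows the name fraction_to_number from the stored function earlier in the thread, and the sample values are invented.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def fraction_to_number(s: str) -> Optional[Decimal]:
    """Return the value of '6', '3.5' or '1/200'; None if it cannot be evaluated."""
    try:
        if "/" in s:
            numerator, denominator = s.split("/", 1)
            return Decimal(numerator) / Decimal(denominator)
        return Decimal(s)
    except (InvalidOperation, ZeroDivisionError):
        # Covers non-numeric text as well as division by zero.
        return None


values = ["6", "1/200", "3.5", "1/0", "abc"]
parsed = {v: fraction_to_number(v) for v in values}

# Sort only the values that could be evaluated, by their numeric value.
ordered = sorted((v for v in values if parsed[v] is not None), key=parsed.get)
print(ordered)
```

In SQL the same guard would be a CASE expression around the division, returning NULL for the rows that cannot be evaluated.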
I managed to solve my problem. Thanks all.
It goes something like this, in a single SQL. (I'm using POSTGRESQL)
It will sort a string coming in as either "#", "#.#" or "1/#"
SELECT id, number_value_in_string FROM table ORDER BY CASE WHEN position('1/' in number_value_in_string) = 1
THEN 1/substring(number_value_in_string from (position('1/' in number_value_in_string) + 2) )::numeric
ELSE number_value_in_string::numeric
END ASC, id
Hope this will help someone out there in the future.