SQL / Celonis - keep only rows where a field is purely numeric, remove anything else

In one field I want to keep only the values that are a number between 100000 and 999999.
The field holds a mixture of numeric, alphanumeric, and alphabetic values.
The filter below keeps a value if it contains a number anywhere, e.g. it keeps abc123, which I want removed:
where some_column NOT LIKE '%[^0-9]%'
Additional - Celonis equivalent?
Gordon has kindly provided SQL Server code that does exactly what I intended.
I was hoping this would work in Celonis, but the try_convert function is not supported there. Is there another method?
where try_convert(int, some_column) between 100000 and 999999

First of all it is good to know that Celonis uses Vertica SQL for its event collection. Vertica has some specific functions, that you can find at https://www.vertica.com/docs/.
That said, I'm not sure I understand your question fully, but I think REGEXP_ILIKE is something you could use: the expression '[0-9]' checks whether your field contains a digit anywhere, and '^[0-9]+$' checks whether it consists of digits only.
If you want to extract the number itself (for instance, to check whether it is between 100000 and 999999), use REGEXP_SUBSTR.
Your code will be something like:
WHERE CAST(REGEXP_SUBSTR(some_column, '^[0-9]{6}$') AS INT) BETWEEN 100000 AND 999999
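The rule the question asks for (the whole value is a number from 100000 to 999999) can be checked outside the database to confirm which sample values should survive the filter. This is only a sketch in Python: the function name `is_six_digit_number` and the sample values are made up for illustration, and it says nothing about which regex functions Celonis itself exposes.

```python
import re

# Exactly six ASCII digits with a non-zero first digit: 100000..999999.
SIX_DIGITS = re.compile(r"[1-9][0-9]{5}")

def is_six_digit_number(value):
    """Return True only if the whole string is a number in 100000..999999."""
    # fullmatch anchors the pattern at both ends, so 'abc123' is rejected.
    return value is not None and SIX_DIGITS.fullmatch(value) is not None

samples = ["123456", "abc123", "12345", "1234567", "099999", "12 456", None]
kept = [s for s in samples if is_six_digit_number(s)]
print(kept)  # ['123456']
```

The important part is that the pattern must cover the whole string; a pattern that only looks for a digit somewhere would keep abc123.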

Your syntax suggests you are using SQL Server. If so, try:
where try_convert(int, some_column) between 100000 and 999999

Related

SQL function to transform number with a certain pattern

I need a SQL query to transform an int with a value between 1 and 300000 into a number that always has 8 digits.
For example:
1 becomes 00000001,
123 becomes 00000123,
123456 becomes 00123456.
I have no idea how to do that... How can I do it?
In Standard SQL, you can use this trick:
select substring(cast( (num + 100000000) as varchar(255)) from 2)
Few databases actually support this syntax. Any given database can do what you want, but the method depends on the database you are using.
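To see why the add-and-strip trick works, here is a small sketch that runs the same idea through SQLite from Python. SQLite is only a stand-in chosen because it ships with Python; it uses `substr(x, 2)` rather than the `substring(x from 2)` spelling, and the helper name `pad8` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def pad8(num):
    """Left-pad an integer to 8 digits using the add-and-strip trick."""
    # Adding 100000000 always gives a 9-digit number (for 0 <= num < 10**8);
    # dropping its leading '1' leaves the value zero-padded to 8 digits.
    row = conn.execute(
        "SELECT substr(CAST(? + 100000000 AS TEXT), 2)", (num,)
    ).fetchone()
    return row[0]

for n in (1, 123, 123456):
    print(n, "->", pad8(n))
# 1 -> 00000001
# 123 -> 00000123
# 123456 -> 00123456
```

The trick only holds while the input has at most 8 digits and is not negative, which is true for the 1 to 300000 range in the question.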
For MS SQL Server
You could use FORMAT function, like this:
SELECT FORMAT(123,'00000000')
https://database.guide/how-to-format-numbers-in-sql-server/
See the "Leading Zeros" section at that link.
For MySQL/Oracle
You could use LPAD, like this:
SELECT LPAD('123',8,'0')
https://database.guide/how-to-add-leading-zeros-to-a-number-in-mysql/

SQL Server's ISNUMERIC function

I need to check whether a column is numeric or not in SQL Server 2012.
This is my CASE code:
CASE
WHEN ISNUMERIC(CUST_TELE) = 1
THEN CUST_TELE
ELSE NULL
END AS CUSTOMER_CONTACT_NO
But when the '78603D99' value is reached, it returns 1 which means SQL Server considered this string as numeric.
Why is that?
How to avoid this kind of issues?
Unfortunately, the ISNUMERIC() function in SQL Server has many quirks. It's not exactly buggy, but it rarely does what people expect it to when they first use it.
However, since you're using SQL Server 2012 you can use the TRY_PARSE() function which will do what you want.
This returns NULL:
SELECT TRY_PARSE('7860D399' AS int)
This returns 7860399
SELECT TRY_PARSE('7860399' AS int)
https://msdn.microsoft.com/en-us/library/hh213126.aspx
Obviously, this works for datatypes other than INT as well. You say you want to check that a value is numeric, but I think you mean INT.
Although try_convert() or try_parse() works for a built-in type, it might not do exactly what you want. For instance, it might allow decimal points, negative signs, and limit the length of digits.
Also, isnumeric() is going to recognize negative numbers, decimals, and exponential notation.
If you want to test a string only for digits, then you can use not like logic:
(CASE WHEN CUST_TELE NOT LIKE '%[^0-9]%'
THEN CUST_TELE
END) AS CUSTOMER_CONTACT_NO
This simply says that CUST_TELE contains no characters that are not digits.
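The digits-only test can be tried out with SQLite from Python. This is a sketch under assumptions: SQLite is used only because it is bundled with Python, its `GLOB` operator stands in for SQL Server's bracket-aware `LIKE` (with `*` in place of `%`), and the table and sample values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (cust_tele TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?)",
    [("7860399",), ("78603D99",), ("-123",), ("12.5",), ("",)],
)

# NOT GLOB '*[^0-9]*' means "contains no character outside 0-9".
# The extra <> '' test rejects the empty string, which contains no
# non-digit characters and would otherwise pass the check.
rows = conn.execute(
    """
    SELECT cust_tele,
           CASE WHEN cust_tele <> '' AND cust_tele NOT GLOB '*[^0-9]*'
                THEN cust_tele END AS customer_contact_no
    FROM customers
    ORDER BY rowid
    """
).fetchall()

for row in rows:
    print(row)
```

Only '7860399' comes back with a non-NULL contact number; values with a letter, a sign, or a decimal point are all rejected, as is the empty string.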
Nothing substantive to add but a couple warnings.
1) ISNUMERIC() won't catch blanks but they will break numeric conversions.
2) If there is a single non-numeric character in the field and you use REPLACE to get rid of it you still need to handle the blank (usually with a CASE statement).
For instance if the field contains a single '-' character and you use this:
cast(REPLACE(myField, '-', '') as decimal(20,4)) myNumField
it will fail and you'll need to use something like this:
CASE WHEN myField IN ('','-') THEN NULL ELSE cast(REPLACE(myField, '-', '') as decimal(20,4)) END myNumField
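The same "clean, then guard against blanks, then convert" order can be sketched in Python. The function name `to_decimal_or_none` is hypothetical, and unlike the CASE expression above it also treats any other unparseable input as missing rather than failing.

```python
from decimal import Decimal, InvalidOperation

def to_decimal_or_none(raw):
    """Strip '-' characters, then convert; blanks and bad input give None."""
    if raw is None:
        return None
    cleaned = raw.replace("-", "").strip()
    if cleaned == "":
        # A field that was only '-' (or blank) has nothing left to convert.
        return None
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None

for value in ["555-1234", "-", "", "12.50", "abc"]:
    print(repr(value), "->", to_decimal_or_none(value))
```

Checking for the blank after the replacement, rather than listing specific inputs such as '' and '-', also covers values like '--'.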

Search Through All Between Values SQL

I have data with the following structure:
_ID _BEGIN _END
7003 99210 99217
7003 10225 10324
7003 111111
I want to look through every _BEGIN and _END and return all rows where the input value is between the range of values including the values themselves (i.e. if 10324 is the input, row 2 would be returned)
I have tried this filter, but it does not work:
where @theInput between a._BEGIN and a._END
--THIS WORKS
where convert(char(7),'10400') >= convert(char(7),a._BEGIN)
--BUT ADDING THIS BREAKS AND RETURNS NOTHING
AND convert(char(7),'10400') < convert(char(7),a._END)
The < and > operators work on xCHAR data types without any syntax error, but the result may be semantically wrong, because strings are compared character by character. Look at these examples:
1 - SELECT 'ab' BETWEEN 'aa' AND 'ac' # returns TRUE
2 - SELECT '2' BETWEEN '1' AND '10' # returns FALSE
Stored as an xCHAR value, '2' sorts after any string that starts with '1', including '10'.
So you should CAST the values here. [Example is for MySQL; for standard compatibility change UNSIGNED to INTEGER]
WHERE CAST(@theInput AS UNSIGNED)
BETWEEN CAST(a._BEGIN AS UNSIGNED) AND CAST(a._END AS UNSIGNED)
You'd better change the types of columns to avoid ambiguity for later use.
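The difference between comparing as text and comparing as numbers is easy to reproduce. The sketch below uses SQLite from Python purely as a convenient stand-in (its cast target is INTEGER, not UNSIGNED); the helper name `scalar` is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def scalar(sql):
    """Run a one-value query and return that value."""
    return conn.execute(sql).fetchone()[0]

# Compared as text, '2' is greater than '10', so BETWEEN is false (0).
as_text = scalar("SELECT '2' BETWEEN '1' AND '10'")

# Compared as integers, 2 lies between 1 and 10, so BETWEEN is true (1).
as_int = scalar(
    "SELECT CAST('2' AS INTEGER) "
    "BETWEEN CAST('1' AS INTEGER) AND CAST('10' AS INTEGER)"
)

# Text comparison is fine when the values really are text.
letters = scalar("SELECT 'ab' BETWEEN 'aa' AND 'ac'")

print(as_text, as_int, letters)  # 0 1 1
```

The same effect explains why a range filter on character columns can silently return nothing once the bounds have different lengths.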
This would be the obvious answer...
SELECT *
FROM <YOUR_TABLE_NAME> a
WHERE @theInput between a._BEGIN and a._END
If the data is stored as strings (an assumption, as we don't know which DB this is), you could add this:
DECLARE @searchArg VARCHAR(30) = CAST(@theInput AS VARCHAR(30));
SELECT *
FROM <YOUR_TABLE_NAME> a
WHERE @searchArg between a._BEGIN and a._END
If you care about performance and you've got a lot of data and indexes, you won't want to wrap the column values in function calls. You could inline the conversion, but converting the input once up front ensures that your predicates stay sargable.
SELECT * FROM myTable a
WHERE
(CAST(@theInput AS char) >= a._BEGIN AND CAST(@theInput AS char) < a._END);
I also saw several of the same type of questions:
SQL "between" not inclusive
MySQL "between" clause not inclusive?
When I do queries like this, I usually try one side with the greater/less than on either side and work from there. Maybe that can help. I'm very slow, but I do lots of trial and error.
Or, use Tony's convert.
I suppose you can convert them to anything appropriate for your program, numeric or text.
Also, see here, http://technet.microsoft.com/en-us/library/aa226054%28v=sql.80%29.aspx.
I am not convinced you cannot do your CAST in the SELECT.
Nick, here is a MySQL version from SO, MySQL "between" clause not inclusive?

How do I sort a VARCHAR column in PostgreSQL that contains words and numbers?

I need to order a select query by a varchar column, using numerical and text order. The query will be run from a Java program, using JDBC over PostgreSQL.
If I use ORDER BY in the select clause I obtain:
1
11
2
abc
However, I need to obtain:
1
2
11
abc
The problem is that the column can also contain text.
This question is similar (but targeted for SQL Server):
How do I sort a VARCHAR column in SQL server that contains words and numbers?
However, the solution proposed did not work with PostgreSQL.
Thanks in advance, regards,
I had the same problem and the following code solves it:
SELECT ...
FROM table
order by
CASE WHEN column < 'A'
THEN lpad(column, size, '0')
ELSE column
END;
The size var is the length of the varchar column, e.g 255 for varying(255).
You can use a regular expression to do this kind of thing:
select THECOL from ...
order by
case
when substring(THECOL from '^\d+$') is null then 9999
else cast(THECOL as integer)
end,
THECOL
First you use regular expression to detect whether the content of the column is a number or not. In this case I use '^\d+$' but you can modify it to suit the situation.
If the regexp doesn't match, return a big number so this row will fall to the bottom of the order.
If the regexp matches, convert the string to number and then sort on that.
After this, sort regularly with the column.
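The ordering described by these steps can be sketched in Python to check the expected result. One deliberate difference, stated as an assumption of this sketch: instead of a sentinel such as 9999, it uses a leading flag so that text always sorts after numbers, however large the numbers get. The name `sort_key` is hypothetical.

```python
import re

DIGITS_ONLY = re.compile(r"[0-9]+")

def sort_key(value):
    """Numbers first (by numeric value), then everything else as text."""
    if DIGITS_ONLY.fullmatch(value):
        # The raw string is a tie-breaker for values like '1' and '01'.
        return (0, int(value), value)
    return (1, 0, value)

values = ["abc", "11", "2", "1"]
print(sorted(values, key=sort_key))  # ['1', '2', '11', 'abc']
```

With a fixed sentinel, a numeric value larger than the sentinel would sort after the text rows; the flag avoids having to guess an upper bound.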
I'm not aware of any database that has a built-in "natural sort" like the one PHP provides. All I've found is various functions:
Natural order sort in Postgres
Comment in the PostgreSQL ORDER BY documentation

Regular expressions inside SQL Server

I have stored values in my database that look like 5XXXXXX, where X can be any digit. In other words, I need to match incoming SQL query strings like 5349878.
Does anyone have an idea how to do it?
I have different cases like XXXX7XX for example, so it has to be generic. I don't care about representing the pattern in a different way inside the SQL Server.
I'm working with C# in .NET.
You can write queries like this in SQL Server:
--each [0-9] matches a single digit, this would match 5xx
SELECT * FROM YourTable WHERE SomeField LIKE '5[0-9][0-9]'
stored value in DB is: 5XXXXXX [where x can be any digit]
You don't mention data types - if numeric, you'll likely have to use CAST/CONVERT to change the data type to [n]varchar.
Use:
WHERE CHARINDEX('5', column) = 1
AND CHARINDEX('.', column) = 0 --to stop decimals if needed
AND ISNUMERIC(column) = 1
References:
CHARINDEX
ISNUMERIC
i have also different cases like XXXX7XX for example, so it has to be generic.
Use:
WHERE PATINDEX('%7%', column) = 5
AND CHARINDEX('.', column) = 0 --to stop decimals if needed
AND ISNUMERIC(column) = 1
References:
PATINDEX
Regex Support
SQL Server 2005+ can support regex, but the catch is that you have to create a CLR user-defined function (UDF) before you have the ability. There are numerous articles providing example code if you search for them. Once you have that in place, you can use:
5\d{6} for your first example
\d{4}7\d{2} for your second example
For more info on regular expressions, I highly recommend this website.
Try this
select * from mytable
where p1 not like '%[^0-9]%' and substring(p1,1,1)='5'
Of course, you'll need to adjust the substring value, but the rest should work...
In order to match a digit, you can use [0-9].
So you could use 5[0-9][0-9][0-9][0-9][0-9][0-9] and [0-9][0-9][0-9][0-9]7[0-9][0-9]. I do this a lot for zip codes.
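Since the question asks for something generic, the bracket pattern can be generated from a template such as 5XXXXXX instead of being typed by hand. The sketch below is in Python for brevity (the asker works in C#, where the same loop is straightforward); the function names are hypothetical, and it assumes templates contain only digits and the letter X.

```python
import re

def like_pattern(template):
    """Turn a template such as '5XXXXXX' into a bracket-style LIKE pattern."""
    return "".join("[0-9]" if ch == "X" else ch for ch in template)

def matches(template, value):
    # For templates made only of digits and X, the generated pattern is
    # also a valid regular expression, so it can be checked locally.
    return re.fullmatch(like_pattern(template), value) is not None

print(like_pattern("XXXX7XX"))          # [0-9][0-9][0-9][0-9]7[0-9][0-9]
print(matches("5XXXXXX", "5349878"))    # True
print(matches("XXXX7XX", "1234856"))    # False
```

The generated string can then be passed as a parameter to a query of the form `WHERE SomeField LIKE <pattern>` rather than concatenated into the SQL text.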
SQL Wildcards are enough for this purpose. Follow this link: http://www.w3schools.com/SQL/sql_wildcards.asp
you need to use a query like this:
select * from mytable where msisdn like '%7%'
or
select * from mytable where msisdn like '56655%'