SQL Server - Finding number patterns

I've been looking at this for the last hour and just can't seem to find a way to do it. I'm sure it's pretty simple, but my Google and reading skills have failed me.
All I need to do is to find ascending and descending numerical patterns in a field.
Like in this pseudo-SQL Code:
select * where col = '123456' or '23456' or '7654' or '987654321'
Most of the pattern methods using LIKE seem to be about the placement of characters/numbers rather than their specific ordering.
I've started trying to create a query that takes the first character and compares it to the next one, but this seems really ineffective and inefficient, as it would need to take each field in the column, run the query, and return it if it matches.
I've managed to find a way to get it if it's a repeated character, but not if it's an increase or decrease.
Any help would be greatly appreciated.

You can use a regular expression for this, although plain LIKE does not support one (see the edit below).
Ascending:
^(?=\d{4,10}$)1?2?3?4?5?6?7?8?9?0?$
Descending:
^(?=\d{4,10}$)9?8?7?6?5?4?3?2?1?0?$
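\d{4,10} here is the allowed value length: between 4 and 10 digits. Note that because every digit in the pattern is optional, these expressions also accept non-consecutive sequences such as 1357.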
Won't be fast, most likely.
You can check how it works on http://rubular.com/.
Edit: Sorry, I forgot to mention you will have to do a MS SQL Server CLR integration first. By default, MSSQL Server does not fully support RegEx.
This article describes how to create and use extensions for the LIKE (Transact-SQL) clause that supports Regular Expressions.
http://www.codeproject.com/Articles/42764/Regular-Expressions-in-MS-SQL-Server
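If only strictly consecutive runs (like the 123456 and 7654 examples in the question) should match, the check can be sketched outside the database. This is a minimal Python sketch, not the CLR approach from the article; the function name is_consecutive_run and the default minimum length of 4 are assumptions made for illustration. Unlike the regular expressions above, it does not treat 0 as following 9.

```python
def is_consecutive_run(value: str, min_len: int = 4) -> bool:
    """Return True if value is all digits, at least min_len long, and every
    digit is exactly one more (ascending) or one less (descending) than
    the digit before it."""
    if not (value.isascii() and value.isdigit()) or len(value) < min_len:
        return False
    # Collect the set of differences between neighbouring digits.
    steps = {int(b) - int(a) for a, b in zip(value, value[1:])}
    return steps == {1} or steps == {-1}


samples = ["123456", "23456", "7654", "987654321", "1357", "22222", "4141243"]
matches = [s for s in samples if is_consecutive_run(s)]
print(matches)
```

The same comparison of neighbouring digits is what a CLR function or a set-based query would have to perform per row.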

Another option could be something like this:
-- Sample data
Declare @Table table (col int)
Insert into @Table values
(4141243),(4290577),(98765432),(78635389),(4141243),(22222),(4290046),(55555555),(4141243),(6789),(77777),(45678),(4294461),(55555),(4141243),(5555)
-- Digits 0-9, used to build the 3-character patterns
Declare @Num table (Num int);Insert Into @Num values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)
Select Distinct A.*
From @Table A
Join (
-- Groups: 000, 111, ..., 999
Select Patt=replicate(Num,3) from @Num
Union All
-- Ascending runs: 012, 123, ..., 789
Select Patt=right('000'+cast((Num*100+Num*10+Num)+12 as varchar(5)),3) from @Num where Num<8
Union All
-- Descending runs: 210, 321, ..., 987
Select Patt=reverse(right('000'+cast((Num*100+Num*10+Num)+12 as varchar(5)),3)) from @Num where Num<8
) B on CharIndex(Patt,cast(col as varchar(25)))>0
Returns
Col
5555
6789
22222
45678
55555
77777
55555555
98765432
Think Rummy 500: groups or runs of 3. For example, 123, 321, or 333 would be a hit.
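The pattern-table idea in this answer can be mirrored in a few lines of Python to check which values should come back. This is only a sketch of the same logic; the names run_patterns and has_group_or_run are invented for illustration, and the width of 3 follows the "groups or runs of 3" rule described above.

```python
def run_patterns(width: int = 3) -> set[str]:
    """Build every group (e.g. 333) and every ascending or descending
    run (e.g. 123, 321) of the given width."""
    digits = "0123456789"
    patterns = {d * width for d in digits}          # 000, 111, ..., 999
    for start in range(len(digits) - width + 1):
        run = digits[start:start + width]           # 012, 123, ..., 789
        patterns.add(run)
        patterns.add(run[::-1])                     # 210, 321, ..., 987
    return patterns


def has_group_or_run(value: int, width: int = 3) -> bool:
    """True if the decimal text of value contains any of the patterns."""
    text = str(value)
    return any(p in text for p in run_patterns(width))


data = [4141243, 4290577, 98765432, 78635389, 4141243, 22222, 4290046,
        55555555, 4141243, 6789, 77777, 45678, 4294461, 55555, 4141243, 5555]
hits = sorted({v for v in data if has_group_or_run(v)})
print(hits)
```

As in the SQL version, a hit anywhere inside the value counts, so this is a looser test than requiring the whole value to be one run.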

Related

SQL rotate results from wide to vertical

I would love some help with the best way to capture some column data and rotate it so I can store the column name and numeric value in a temp table.
The results are a single row showing a value for the columns listed here:
AccountingCode ActiveCostAllocationCode1Segment1 ActiveCostAllocationCode1Segment1Description
-------------- --------------------------------- --------------------------------------------
0 71 264
I would like to take the above query and rotate the output to look more vertical.
ColName Value
--------------------------------------------- ---------
AccountingCode 0
ActiveCostAllocationCode1Segment1 71
ActiveCostAllocationCode1Segment1Description 264
I was trying to use PIVOT / UNPIVOT but could not figure how to make it work for this case.
Any ideas?
If you are working with SQL Server, you can use APPLY:
SELECT tt.ColName, tt.val
FROM table t CROSS APPLY
( VALUES ('AccountingCode', AccountingCode),
('ActiveCostAllocationCode1Segment1', ActiveCostAllocationCode1Segment1),
('ActiveCostAllocationCode1Segment1Description', ActiveCostAllocationCode1Segment1Description)
) tt(ColName, Val);
In standard SQL you can use UNION ALL to unpivot the data.
The generic way in SQL is UNION ALL:
select 'AccountingCode', AccountingCode from t
union all
select 'ActiveCostAllocationCode1Segment1', ActiveCostAllocationCode1Segment1 from t
union all
select 'ActiveCostAllocationCode1Segment1Description', ActiveCostAllocationCode1Segment1Description from t
This assumes that the types of the columns are compatible (they all look like integers, so that is probably okay).
The better method is to use a lateral join (or apply in some databases), if your database supports it.
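For comparison, the same wide-to-vertical rotation is shown here as a small Python sketch. It assumes the single result row has already been fetched into a dictionary keyed by column name; the function name unpivot is made up for this example.

```python
def unpivot(row: dict) -> list[tuple[str, int]]:
    """Turn one wide row into (column name, value) pairs,
    keeping the original column order."""
    return [(name, value) for name, value in row.items()]


row = {
    "AccountingCode": 0,
    "ActiveCostAllocationCode1Segment1": 71,
    "ActiveCostAllocationCode1Segment1Description": 264,
}
for col_name, value in unpivot(row):
    print(f"{col_name:<45} {value}")
```

Each output pair corresponds to one branch of the UNION ALL, or one row of the VALUES list in the APPLY version.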

How to get all combinations (ordered sampling without replacement) in regex

I'm trying to match a comma-separated string of numbers to a certain pattern within an sql query. I used regular expressions for similar problems in the past successfully, so I'm trying to get them working here as well. The problem is as follows:
The string may contain any number in a range (e.g. 1-4) exactly 0-1 times.
Two numbers are comma-separated
The numbers have to be in ascending order
(I think this is kind of a case of ordered sampling without replacement)
Sticking with the example of 1-4, the following entries should match:
1
1,2
1,3
1,4
1,2,3
1,2,4
1,3,4
1,2,3,4
2
2,3
2,4
3
3,4
4
and these should not:
q dawda 323123 a3 a1 1aa,1234 4321 a4,32,1a 1112222334411
1,,2,33,444, 11,12,a 234 2,2,3 33 3,3,3 3,34 34 123 1,4,4,4a 1,444
The best try I currently have is:
\b[1-4][\,]?[2-4]?[\,]?[3-4]?[\,]?[4]?\b
This still has two major drawbacks:
It delivers quite a lot of false positives. Numbers are not eliminated after they occurred once.
It will get rather long when the range of numbers increases; e.g. 1-18 is already possible as well, and larger ranges are conceivable.
I used regexpal for testing purposes.
Side notes:
As I'm using sql it would be possible to implement some algorithm in another language to generate all the possible combinations and save them in a table that can be used for joining, see e.g. How to get all possible combinations of a list’s elements?. I would like to only rely on that as a last resort, as the creation of new tables will be involved and these will contain a lot of entries.
The resulting sql statement that uses the regex should run on both Postgres and Oracle.
The set of positive examples is also referred to as "powerset".
Edit: Clarified the list of positive examples
I wouldn't use Regex for this, as e.g. the requirements "have to be unique" and "have to be in ascending order" can't really be expressed with a regular expression (at least I can't think of a way to do that).
As you also need to have an expression that is identical in Postgres and Oracle, I would create a function that checks such a list and then hide the DBMS specific implementation in that function.
For Postgres I would use its array handling features to implement that function:
create or replace function is_valid(p_input text)
returns boolean
as
$$
select coalesce(array_agg(x order by x::int) = string_to_array(p_input, ','), false) -- sort numerically so ranges beyond 9 also work
from (
select distinct x
from unnest(string_to_array(p_input,',')) as t(x)
where x ~ '^[0-9]+$' -- only numbers
) t
where x::int between 1 and 4 -- the cast is safe as the inner query only returns valid numbers
$$
language sql;
The inner query returns all (distinct) elements from the input list as individual numbers. The outer query then aggregates that back for values in the desired range and numeric order. If that result isn't the same as the input, the input isn't valid.
Then with the following sample data:
with sample_data (input) as (
values
('1'),
('1,2'),
('1,3'),
('1,4'),
('1,2,3'),
('1,2,4'),
('foo'),
('1aa,1234'),
('1,,2,33,444,')
)
select input, is_valid(input)
from sample_data;
It will return:
input | is_valid
-------------+---------
1 | true
1,2 | true
1,3 | true
1,4 | true
1,2,3 | true
1,2,4 | true
foo | false
1aa,1234 | false
1,,2,33,444, | false
If you want to use the same function in Postgres and Oracle, you probably need to use returns integer in Postgres, as Oracle still doesn't support a boolean data type in SQL.
Oracle's string processing functions are less powerful than Postgres' functions (e.g. no string_to_array or unnest), but you can probably implement similar logic in PL/SQL as well (albeit in a more complicated way).
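The rule the function enforces (only numbers, within the range, unique, ascending) can be written down independently of either DBMS. The following Python sketch is one way to express it; it is not a port of the Postgres function, the parameters low and high are assumptions added so the range is not hard-coded, and rejecting leading zeros is a choice made here rather than something the question specifies.

```python
def is_valid(text: str, low: int = 1, high: int = 4) -> bool:
    """True if text is a comma-separated list of integers in [low, high]
    in strictly ascending order (which also implies no duplicates)."""
    parts = text.split(",")
    # Every part must be a plain decimal number without leading zeros.
    for p in parts:
        if not (p.isascii() and p.isdigit()) or p != str(int(p)):
            return False
    numbers = [int(p) for p in parts]
    if any(n < low or n > high for n in numbers):
        return False
    return all(a < b for a, b in zip(numbers, numbers[1:]))


for sample in ["1", "1,2,4", "1,2,3,4", "2,2,3", "3,34", "1,,2,33,444,", "foo"]:
    print(sample, is_valid(sample))
```

Because the comparison is done on integers, the same check keeps working for wider ranges such as is_valid("2,10,18", 1, 18).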

Sort a column that contains numbers in the text

I have an nvarchar(500) column in SQL Server 2008 that contains letters and numbers, and here is what the data looks like when I use an ORDER BY clause in SQL Server...
env
guide
Seg 18 - NWS
Seg 19 - NWS
Seg 1A - ECC
Seg 1B - ECC
Seg 22 - xxx
Seg 23 - GL
Seg 3- GL
Seg 4 - GL
Utils
But I would like to get this result...
env
guide
Seg 1A - ECC
Seg 1B - ECC
Seg 3- GL
Seg 4 - GL
Seg 18 - NWS
Seg 19 - NWS
Seg 22 - xxx
Seg 23 - GL
Utils
Any suggestions?
This is called a natural sort, and it is going to be a nightmare for you if you don't own the database. You can follow the response here, which seems quite robust but involves injecting a new function into the CLR.
As one of the commenters suggests you can split and then sort on the 3 columns, but if your text isn't a fixed width, you may run into some more problems and result in an ultimate hacky solution. This is a decent T-SQL solution you can try but it relies on fixed width... to which more people suggest padding your numbers.
What else do you want? Have you tried and found those wanting?
First, I assume you have only one number in your pattern. If not, you can extend the below code assuming you have some known rule for detecting the right string.
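So, the code below (which I don't have any machine to test on currently...) finds the number's start index and length, extracts it, and converts it into an integer (I assume the string is inside a variable named @data):
DECLARE @numindex int;
-- PATINDEX needs % wildcards; this returns the position of the first digit (0 if there is none)
SELECT @numindex = PATINDEX('%[0-9]%', @data);
DECLARE @numlength int;
-- Position of the first non-digit after the number, minus 1, is the number's length.
-- The appended space guarantees a non-digit is found even if the number ends the string.
SELECT @numlength = PATINDEX('%[^0-9]%', SUBSTRING(@data, @numindex, LEN(@data)) + ' ') - 1;
-- This is the result below
SELECT CONVERT(int, SUBSTRING(@data, @numindex, @numlength))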
If all the assumptions I wrote do suit you, you could either create a scalar valued function from this, or add this directly to the query (which may make the query a bit unreadable...).
Regarding performance, this is obviously not ideal to sort like this on every query. If this is going to happen a lot and the data isn't going to change frequently, maybe creating a view that would possibly be cached would improve the performance.
Here is one way to do this assuming the only numeric values are what you posted. If there is the possibility that the suffix can also contain numbers this will need a slight tweak. I am using a super awesome inline table value function created by Dwain Camps at sql server central. I know this site requires a login but it is free and this technique is well worth signing up for.
http://www.sqlservercentral.com/articles/String+Manipulation/94365/
Using his function this is pretty simple. This is 100% set based. No loops, while or cursors at all.
declare @Table table (SomeValue varchar(25))
insert @Table
select 'Seg 1A - ECC' union all
select 'Seg 1B - ECC' union all
select 'Seg 3- GL' union all
select 'Seg 4 - GL' union all
select 'Seg 18 - NWS' union all
select 'Seg 19 - NWS' union all
select 'Seg 22 - xxx' union all
select 'Seg 23 - GL'
select t.*
from @Table t
outer apply dbo.PatternSplitCM(t.SomeValue, '[%0-9%]') x
where x.Matched = 1
-- cast so that 3 sorts before 18 (assumes Item is returned as a string)
order by x.Matched desc, cast(x.Item as int)
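To make the target ordering concrete, here is a small Python sketch of a natural sort key. It is not one of the T-SQL or CLR solutions mentioned in this thread; the function name natural_key is invented, and lower-casing the text parts is an assumption that mimics a case-insensitive collation.

```python
import re


def natural_key(text: str) -> list:
    """Split text into alternating non-digit and digit chunks so that
    digit chunks compare as integers instead of character by character."""
    parts = re.split(r"([0-9]+)", text)
    # re.split with one capture group puts digit chunks at odd indexes.
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


values = ["env", "guide", "Seg 18 - NWS", "Seg 19 - NWS", "Seg 1A - ECC",
          "Seg 1B - ECC", "Seg 22 - xxx", "Seg 23 - GL", "Seg 3- GL",
          "Seg 4 - GL", "Utils"]
for v in sorted(values, key=natural_key):
    print(v)
```

Padding the numbers to a fixed width, as suggested above, achieves the same effect inside the database because padded digit strings sort like integers.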

substring and trim in Teradata

I am working in Teradata with some descriptive data that needs to be transformed from a generic varchar(60) into different field lengths based on the type of data element and the attribute value. So I need to take whatever is in the varchar(60) and, based on field 'ABCD', act on field 'XYZ'. In this case XYZ is a varchar(3). To do this I am using CASE logic within my select. What I want to do is
eliminate all occurrences of non-alphanumeric data. All I want left are upper case alpha chars and numbers.
In this case, where abcd = 'GROUP', xyz should come out as '000', '002', 'A', or 'C'
eliminate extra padding
Shift everything Right
abcd xyz
1 GROUP NULL
2 GROUP $
3 GROUP 000000000000000000000000000000000000000000000000000000000000
4 GROUP 000000000000000000000000000000000000000000000000000000000002
5 GROUP A
6 GROUP C
7 GROUP r
To do this I have tried TRIM and SUBSTR, among several other things that did not work. I have pasted what I have working now, but I am not reliably working through the data within the select. I am really looking for some options on how to better work with strings in Teradata. I have been working out of the "SQL Functions, Operators, Expressions and Predicates" online PDF. Is there a better reference? We are on TD 13.
SELECT abcd
, CASE
-- xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
WHEN abcd= 'GROUP'
THEN(
CASE
WHEN SUBSTR(tx.abcd,60, 4) = 0
THEN (
SUBSTR(tx.abcd,60, 3)
)
ELSE
TRIM (TRAILING FROM tx.abcd)
END
)
END AS abcd
FROM db.descr tx
WHERE tx.abcd IS IN ( 'GROUP')
The end result should look like this
abcd xyz
1 GROUP 000
2 GROUP 002
3 GROUP A
4 GROUP C
I will have to deal with approx 60 different "abcd" types, but they should all conform to the type of data I am currently seeing.. ie.. mixed case, non numeric, non alphabet, padded, etc..
I know there is a better way, but I have come in several circles trying to figure this out over the weekend and need a little push in the right direction.
Thanks in advance,
Pat
The SQL below uses the CHARACTER_LENGTH function to first determine if there is a need to perform what amounts to a RIGHT(tx.xyz, 3) using the native functions in Teradata 13.x. I think this may accomplish what you are looking to do. I hope I have not misinterpreted your explanation:
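SELECT CASE WHEN tx.abcd = 'GROUP'
AND CHARACTER_LENGTH(TRIM(BOTH FROM tx.xyz)) > 3
-- start 2 characters before the end to keep the last 3, i.e. RIGHT(xyz, 3)
THEN SUBSTRING(TRIM(BOTH FROM tx.xyz) FROM (CHARACTER_LENGTH(TRIM(BOTH FROM tx.xyz)) - 2))
-- values of 3 characters or fewer are returned trimmed
ELSE TRIM(BOTH FROM tx.xyz)
END
FROM db.descr tx;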
EDIT: Fixed parenthesis in SUBSTRING
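The full set of rules from the question (keep only upper-case letters and digits, drop padding, keep the rightmost 3 characters) is easier to see in a short Python sketch. This is an illustration of the intended result, not Teradata code; the function name clean_code and the choice to return None for values with nothing left are assumptions based on the expected output in the question.

```python
import re
from typing import Optional


def clean_code(raw: Optional[str], width: int = 3) -> Optional[str]:
    """Keep only A-Z and 0-9, then keep the rightmost `width` characters.
    Returns None when the input is NULL or nothing survives the filter."""
    if raw is None:
        return None
    kept = re.sub(r"[^A-Z0-9]", "", raw)   # also removes spaces and lower case
    return kept[-width:] if kept else None


rows = [None, "$", "0" * 60, "0" * 59 + "2", "A", "C", "r"]
cleaned = [c for c in (clean_code(r) for r in rows) if c is not None]
print(cleaned)
```

Rows 1, 2 and 7 from the sample data drop out because nothing remains after filtering, which matches the four-row result the question asks for.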

How to multiply a row?

I have a Postgres 9.0 query returning results in a way similar to this:
item;qty
AAAA;2
EEEE;3
What I would like is to transform that into:
AAAA
AAAA
EEEE
EEEE
EEEE
Is there any way I can do that simply, i.e., without stored procedures, functions, etc.?
There's a function 'generate_series' which can be used to generate a table of values. These can be used to repeat a column via joining:
select item
from data,generate_series(0,1000)
where generate_series<qty order by item;
Consider the following demo:
CREATE TEMP TABLE x(item text, qty int);
INSERT INTO x VALUES
('AAAA',2)
,('EEEE',3)
,('IIII',4);
SELECT regexp_split_to_table(rtrim(repeat(item||'~#~',qty),'~#~'),'~#~') AS item
FROM x;
Produces exactly the requested result.
In my tests it performs faster by an order of magnitude than the solution with generate_series().
Additional bonus: works with any number of qty.
Weakness: you need a delimiter-string not contained in any item.
SELECT
myTable.item
FROM
myTable
INNER JOIN
(SELECT 1 AS counter UNION ALL SELECT 2 UNION ALL SELECT 3) AS multiplier
ON multiplier.counter <= myTable.qty
Increase the number of UNIONs based on your maximum value in qty.
But I'd also follow @djacobson's advice: explain why you want to do this, as there may be a completely different approach altogether. Doing this feels, ummm, odd...
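All three answers join each row against a sequence of counters and keep one copy per counter value up to qty. The Python sketch below shows that idea with the sample data from the question; the function name multiply_rows is hypothetical.

```python
def multiply_rows(rows: list[tuple[str, int]]) -> list[str]:
    """Repeat each item qty times, preserving the input order."""
    return [item for item, qty in rows for _ in range(qty)]


data = [("AAAA", 2), ("EEEE", 3)]
for item in multiply_rows(data):
    print(item)
```

A qty of 0 simply produces no rows, which is also what the join-based queries do.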