I have a task to write a stored procedure or a function to return all possible combinations of a 4 digit number.
For example, if I pass 1234 to the stored procedure or function, it should return 4 digit numbers (all possible combinations), like
1123, 1112, 1324, 1342, 2134, 2234
and so on.
It can be of 4 digits only.
I have been doing this using the LIKE operator:
select *
from Table
where mynumber like '%1%'
and mynumber like '%2%'
and mynumber like '%3%'
and mynumber like '%4%'
but the problem is that I have hardcoded the digits 1, 2, 3 and 4.
The number can be anything.
That many LIKE operators can also hurt performance on a large table.
Can anybody give me some generic query to get the combinations?
Thanks in advance.
You can use a cross join:
with digits as (
select substring(num, 1, 1) as d union all
select substring(num, 2, 1) as d union all
select substring(num, 3, 1) as d union all
select substring(num, 4, 1) as d
)
select (d1.d + d2.d + d3.d + d4.d)
from digits d1 cross join
digits d2 cross join
digits d3 cross join
digits d4;
Note: This assumes that the number is a string (based on the fact that you use like in your question).
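As a cross-check of what the four-way cross join returns, here is a minimal sketch in Python rather than SQL (the function name `combos_with_repeats` is made up for illustration). It assumes the input is a 4-character string and builds every 4-character string from its digits with repeats allowed, which gives 4^4 = 256 rows.

```python
from itertools import product

def combos_with_repeats(num: str) -> list[str]:
    """All strings of len(num) characters drawn from num's characters, repeats allowed."""
    # product(num, repeat=4) plays the role of the four-way cross join
    return ["".join(p) for p in product(num, repeat=len(num))]

rows = combos_with_repeats("1234")
print(len(rows))   # 256
print(rows[:3])    # ['1111', '1112', '1113']
```

Values such as 1123 and 1112 from the question appear in this list because nothing stops a digit from being picked more than once.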
First you need to be able to break a four-digit number into separate digits. I suggest using a table variable and the modulus operator. Assuming we have an integer input named @input, we can break it into its digits using this:
DECLARE @Digits Table(Number int)
INSERT INTO @Digits(Number)
VALUES (@input % 10),
(@input / 10 % 10),
(@input / 100 % 10),
(@input / 1000 % 10)
Now we have a table with four rows, one row per digit.
To create a combination of four digits, we need to include the table four times, meaning we need three joins. The joins have to be set up so no digit is duplicated. Thus our FROM and JOIN clauses will look like this:
FROM @Digits D1
JOIN @Digits D2 ON D2.Number <> D1.Number
JOIN @Digits D3 ON D3.Number <> D1.Number
AND D3.Number <> D2.Number
JOIN @Digits D4 ON D4.Number <> D1.Number
AND D4.Number <> D2.Number
AND D4.Number <> D3.Number
Now to take the values and make a new, four-digit integer:
SELECT Number = D1.Number * 1000
+ D2.Number * 100
+ D3.Number * 10
+ D4.Number
The complete solution:
CREATE PROC Combine(@input AS int)
AS
BEGIN
DECLARE @Digits Table(Number int)
;
INSERT INTO @Digits(Number)
VALUES (@input % 10),
(@input / 10 % 10),
(@input / 100 % 10),
(@input / 1000 % 10)
;
SELECT Number = D1.Number * 1000
+ D2.Number * 100
+ D3.Number * 10
+ D4.Number
FROM @Digits D1
JOIN @Digits D2 ON D2.Number <> D1.Number
JOIN @Digits D3 ON D3.Number <> D1.Number
AND D3.Number <> D2.Number
JOIN @Digits D4 ON D4.Number <> D1.Number
AND D4.Number <> D2.Number
AND D4.Number <> D3.Number
ORDER BY Number
;
END
Usage:
EXEC Combine 1234
Resultset:
Number
------
1234
1243
1324
1342
1423
1432
2134
2143
2314
2341
2413
2431
3124
3142
3214
3241
3412
3421
4123
4132
4213
4231
4312
4321
24 row(s)
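To see how the value-based join conditions behave, here is a small Python emulation (not the procedure itself; `combine` is a hypothetical stand-in). It splits the number with the same modulus arithmetic and keeps only the tuples whose four values all differ, which is what the `<>` predicates require.

```python
from itertools import product

def combine(number: int) -> list[int]:
    # digits in the same order as the INSERT: ones, tens, hundreds, thousands
    digits = [number % 10, number // 10 % 10, number // 100 % 10, number // 1000 % 10]
    out = []
    for d1, d2, d3, d4 in product(digits, repeat=4):
        # same predicate as the JOIN ... ON conditions: all four values differ
        if len({d1, d2, d3, d4}) == 4:
            out.append(d1 * 1000 + d2 * 100 + d3 * 10 + d4)
    return sorted(out)

print(len(combine(1234)))   # 24
print(combine(1123))        # []
```

Because the comparison is on digit values rather than positions, an input with a repeated digit such as 1123 yields no rows under this logic, so the approach fits inputs with four distinct digits.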
Improving on @GordonLinoff's answer, you can add an extra column to your CTE so that each digit is used only once:
declare @num varchar(max);
set @num = '1234';
with numCTE as (
select SUBSTRING(@num, 1,1) as col, 1 as cnt union
select SUBSTRING(@num, 2,1) as col, 3 as cnt union
select SUBSTRING(@num, 3,1) as col, 9 as cnt union
select SUBSTRING(@num, 4,1) as col, 27 as cnt
)
)
select DISTINCT (a1.col+a2.col+a3.col+a4.col)
from numCTE a1
cross join numCTE a2
cross join numCTE a3
cross join numCTE a4
where a1.cnt + a2.cnt + a3.cnt + a4.cnt = 40
Additionally, you can remove the WHERE to allow each number to be used more than once.
Don't forget the DISTINCT keyword. :)
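Why the weights 1, 3, 9, 27 and the target 40 work can be checked by brute force. The following Python sketch (an illustration only, not part of the query) enumerates every way of picking four weights with repeats and keeps those that sum to 40.

```python
from itertools import product

weights = (1, 3, 9, 27)
# every 4-tuple of weights, repeats allowed, whose sum is 40
hits = [w for w in product(weights, repeat=4) if sum(w) == 40]

print(len(hits))                                        # 24
print(all(sorted(h) == [1, 3, 9, 27] for h in hits))    # True
```

Every hit uses each weight exactly once, so the WHERE clause keeps exactly the 24 orderings in which each position of the input is used once.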
You can try this.
select * from Table where mynumber like '%[1234][1234][1234][1234]%'
If the value should be exactly 4 digits:
select * from Table where mynumber like '[1234][1234][1234][1234]'
Also, you can use [1-4] instead of [1234]
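For readers more used to regular expressions, the exact-length pattern corresponds roughly to the Python check below (a sketch; `like_four_digits` is a made-up helper, and LIKE character classes and regex classes are only assumed to agree for plain digit sets like this one).

```python
import re

def like_four_digits(value: str, digits: str = "1234") -> bool:
    # equivalent in spirit to LIKE '[1234][1234][1234][1234]' (whole value must match)
    return re.fullmatch(f"[{re.escape(digits)}]{{4}}", value) is not None

print(like_four_digits("1123"))   # True
print(like_four_digits("1235"))   # False
print(like_four_digits("1111"))   # True
```

Note that the pattern accepts repeats such as 1111, whereas the four separate LIKE conditions in the question require each of the digits to appear at least once.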
Here's a query to return all combinations of four digits (characters in general):
select A.col + B.col + C.col + D.col [Combinations] from
(values ('1'),('2'),('3'),('4')) as A(col) cross join
(values ('1'),('2'),('3'),('4')) as B(col) cross join
(values ('1'),('2'),('3'),('4')) as C(col) cross join
(values ('1'),('2'),('3'),('4')) as D(col)
Taking inspiration from this answer:
WITH n AS (
SELECT n FROM (VALUES (1), (2), (3), (4)) n (n)
) SELECT ones.n + 10*tens.n + 100*hundreds.n + 1000*thousands.n
FROM n ones, n tens, n hundreds, n thousands
You can define a table in your stored procedure with all possible combinations, using letters as placeholders:
DECLARE @Combinations TABLE
(
[value] CHAR(4)
);
INSERT INTO @Combinations ([value])
VALUES ('AAAA')
,('AAAB')
,('AAAC')
,('AAAD')
...
Then update every letter with the input number:
DECLARE @Numner1 TINYINT = 2
,@Numner2 TINYINT = 5
,@Numner3 TINYINT = 1
,@Numner4 TINYINT = 3;
UPDATE @Combinations
SET [value] = REPLACE([value], 'A', @Numner1);
UPDATE @Combinations
SET [value] = REPLACE([value], 'B', @Numner2);
UPDATE @Combinations
SET [value] = REPLACE([value], 'C', @Numner3);
UPDATE @Combinations
SET [value] = REPLACE([value], 'D', @Numner4);
Then just join the table with your table:
select *
from Table A
INNER JOIN @Combinations B
ON A.[mynumber] = B.[value];
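The placeholder idea can be sketched in Python as follows. This assumes the elided rows of the letter table cover every 4-letter string over A to D (256 rows), which the answer does not spell out; `combinations_table` is a hypothetical name.

```python
from itertools import product

def combinations_table(n1: int, n2: int, n3: int, n4: int) -> list[str]:
    # assumed content of the pre-built table: 'AAAA', 'AAAB', ..., 'DDDD'
    templates = ["".join(p) for p in product("ABCD", repeat=4)]
    # one translation step stands in for the four REPLACE updates
    mapping = str.maketrans("ABCD", f"{n1}{n2}{n3}{n4}")
    return [t.translate(mapping) for t in templates]

rows = combinations_table(2, 5, 1, 3)
print(len(rows))    # 256
print(rows[:2])     # ['2222', '2225']
```

The letters never collide with digits, so the order of the substitutions does not matter; each argument is assumed to be a single digit.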
Try This approach
DECLARE @Num INT = 5432
;WITH CTE
AS
(
SELECT
SeqNo = 1,
Original = CAST(@Num AS VARCHAR(20)),
Num = SUBSTRING(CAST(@Num AS VARCHAR(20)),1,1)
UNION ALL
SELECT
SeqNo = SeqNo+1,
Original,
Num = SUBSTRING(Original,SeqNo+1,1)
FROM CTE
WHERE SeqNo < LEN(Original)
)
SELECT
MyStr = C1.Num+C2.Num+C3.Num+C4.Num
FROM CTE C1
CROSS JOIN CTE C2
CROSS JOIN CTE C3
CROSS JOIN CTE C4
WHERE
(
C1.SeqNo <> C2.SeqNo
AND
C3.SeqNo <> C4.SeqNo
AND
C4.SeqNo <> C1.SeqNo
AND
C2.SeqNo <> C3.SeqNo
AND
C1.SeqNo <> C3.SeqNo
AND
C4.SeqNo <> C2.SeqNo
)
ORDER BY 1
My Result
MyStr
2345
2354
2435
2453
2534
2543
3245
3254
3425
3452
3524
3542
4235
4253
4325
4352
4523
4532
5234
5243
5324
5342
5423
5432
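Since this query compares positions (SeqNo) rather than digit values, it also copes with repeated digits. A rough Python equivalent, offered only as an illustration (`position_permutations` is a made-up name), uses `itertools.permutations`, which likewise permutes by position.

```python
from itertools import permutations

def position_permutations(num: int) -> list[str]:
    s = str(num)
    # permutations() works on positions, like comparing SeqNo rather than Num
    return sorted("".join(p) for p in permutations(s))

rows = position_permutations(1123)
print(len(rows))        # 24
print(len(set(rows)))   # 12
```

With a repeated digit the 24 rows contain duplicates, so a DISTINCT would be needed in the SQL if only the 12 different numbers are wanted.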
Please try this. SET BASED Approach to generate all Possible combinations of a number-
IF OBJECT_ID('Tempdb..#T') IS NOT NULL
DROP TABLE tempdb..#T
DECLARE @ AS INT = 1234
IF LEN(@) <= 7
BEGIN
DECLARE @str AS VARCHAR(100)
SET @str = CAST(@ AS VARCHAR(100))
DECLARE @cols AS VARCHAR(100) = ''
SELECT DISTINCT SUBSTRING(@str,NUMBER,1) n INTO #T FROM MASTER..spt_values WHERE number > 0 AND number <= LEN(@)
SELECT @cols = @cols + r
FROM ( SELECT DISTINCT CONCAT(', o',number,'.n') r FROM MASTER..spt_values WHERE number > 0 AND number <= (LEN(@)-1) )q
DECLARE @ExecStr AS VARCHAR(1000) = ''
SET @ExecStr = 'SELECT CAST(CONCAT( a.n' + @cols + ' ) AS INT) Combinations FROM #T a'
SELECT @ExecStr = @ExecStr + r FROM
(
SELECT DISTINCT CONCAT(' CROSS APPLY ( SELECT * FROM #T b' , number , ' WHERE ( b' , number, '.n' , ' <> a.n ) ',
CASE WHEN number = 1 then ''
WHEN number = 2 then ' AND ( b2.n <> o1.n )'
WHEN number = 3 then ' AND ( b3.n <> o1.n ) AND ( b3.n <> o2.n ) '
WHEN number = 4 then ' AND ( b4.n <> o1.n ) AND ( b4.n <> o2.n ) AND ( b4.n <> o3.n ) '
WHEN number = 5 then ' AND ( b5.n <> o1.n ) AND ( b5.n <> o2.n ) AND ( b5.n <> o3.n ) AND ( b5.n <> o4.n ) '
WHEN number = 6 then ' AND ( b6.n <> o1.n ) AND ( b6.n <> o2.n ) AND ( b6.n <> o3.n ) AND ( b6.n <> o4.n ) AND ( b6.n <> o5.n ) '
END
,') o' , number ) r FROM
MASTER..spt_values
WHERE number > 0 AND number <= (LEN(@)-1)
)p
EXEC (@ExecStr)
END
IF OBJECT_ID('tempdb..#T') IS NOT NULL
DROP TABLE tempdb..#T
OUTPUT
1432
1342
1423
1243
1324
1234
2431
2341
2413
2143
2314
2134
3421
3241
3412
3142
3214
3124
4321
4231
4312
4132
4213
4123
from - https://msbiskills.com/2016/05/20/sql-puzzle-generate-possible-combinations-of-a-number-puzzle/
You can try following alternative SQL Script as well
declare @param varchar(4) = '1234'
;with combination as (
select
distinct rn = DENSE_RANK() over (Order By num), num
from (
select substring(@param,1,1) as num
union all
select substring(@param,2,1)
union all
select substring(@param,3,1)
union all
select substring(@param,4,1)
) t
)
select
c1.num, c2.num, c3.num, c4.num,
cast(c1.num as char(1)) + cast(c2.num as char(1)) + cast(c3.num as char(1)) + cast(c4.num as char(1)) as number
from combination c1, combination c2, combination c3, combination c4
It produces 256 numbers for 4 digits
This is adapted from SQL code that returns non-repeating combinations of a given set of items, modified to allow repeated items in the output.
I have a database field of Brazilian CPF numbers and want to check for their validity. These are 11 digit strings which are 9 digits and 2 checksum digits.
I currently implemented the checksum in MS Excel (see below) but I'd like to figure out a way to do it in SQL.
Checksum works as follows: (Hold on tight, this is nuts.)
The CPF number is written in the form ABCDEFGHI / JK or directly as
ABCDEFGHIJK, where the digits can not all be the same as each other.
The J is called 1st digit check of the CPF number.
The K is called the 2nd check digit of the CPF number.
First digit (J):
Multiply each digit of the first 9 by a constant:
10*A + 9*B + 8*C + 7*D + 6*E + 5*F + 4*G + 3*H + 2*I
Divide this sum by 11 and if the remainder is 0 or 1, J will be 0. If the remainder is >=2, J will be 11 - remainder.
Second digit (K): (Same calculation but including digit J)
Multiply each digit of the first 10 by a constant:
11*A + 10*B + 9*C + 8*D + 7*E + 6*F + 5*G + 4*H + 3*I + 2*J
Divide this sum by 11 and if the remainder is 0 or 1, K will be 0. If the remainder is >=2, K will be 11 - remainder.
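The two rules above can be written compactly in Python as a reference to test an SQL version against. This is only a sketch of the stated algorithm; the function names are made up, and the input is assumed to be an 11-character digit string with leading zeros preserved.

```python
def cpf_check_digits(first9: str) -> tuple[int, int]:
    """Return (J, K) for the nine base digits ABCDEFGHI, per the rules above."""
    digits = [int(ch) for ch in first9]
    # J: weights 10..2 over A..I
    rem_j = sum(w * d for w, d in zip(range(10, 1, -1), digits)) % 11
    j = 0 if rem_j < 2 else 11 - rem_j
    # K: weights 11..2 over A..I followed by J
    rem_k = sum(w * d for w, d in zip(range(11, 1, -1), digits + [j])) % 11
    k = 0 if rem_k < 2 else 11 - rem_k
    return j, k

def cpf_is_valid(cpf: str) -> bool:
    if len(cpf) != 11 or not cpf.isdigit():
        return False
    if len(set(cpf)) == 1:   # all digits identical is not allowed
        return False
    j, k = cpf_check_digits(cpf[:9])
    return cpf[9:] == f"{j}{k}"

print(cpf_check_digits("123456789"))   # (0, 9)
print(cpf_is_valid("12345678900"))     # False
```

For the sample row 12345678901 the weighted sums are 210 and 255, giving J = 0 and K = 9, so both that value and 12345678900 fail the check.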
--Implementation in MS Excel--
Assuming the CPF is in A2.
Optimizations here are welcome but not really the point of this question.
Digit J: =IF(MOD(SUM(MID($A2,1,1)*10,MID($A2,2,1)*9,MID($A2,3,1)*8,MID($A2,4,1)*7,MID($A2,5,1)*6,MID($A2,6,1)*5,MID($A2,7,1)*4,MID($A2,8,1)*3,MID($A2,9,1)*2),11)<=1,NUMBERVALUE(LEFT(RIGHT($A2,2),1))=0,NUMBERVALUE(LEFT(RIGHT($A2,2),1))=(11-MOD(SUM(MID($A2,1,1)*10,MID($A2,2,1)*9,MID($A2,3,1)*8,MID($A2,4,1)*7,MID($A2,5,1)*6,MID($A2,6,1)*5,MID($A2,7,1)*4,MID($A2,8,1)*3,MID($A2,9,1)*2),11)))
Digit K:
=IF(MOD(SUM(MID($A2,1,1)*11,MID($A2,2,1)*10,MID($A2,3,1)*9,MID($A2,4,1)*8,MID($A2,5,1)*7,MID($A2,6,1)*6,MID($A2,7,1)*5,MID($A2,8,1)*4,MID($A2,9,1)*3,MID($A2,10,1)*2),11)<=1,NUMBERVALUE(LEFT(RIGHT($A2,1),1))=0,NUMBERVALUE(LEFT(RIGHT($A2,1),1))=(11-MOD(SUM(MID($A2,1,1)*11,MID($A2,2,1)*10,MID($A2,3,1)*9,MID($A2,4,1)*8,MID($A2,5,1)*7,MID($A2,6,1)*6,MID($A2,7,1)*5,MID($A2,8,1)*4,MID($A2,9,1)*3,MID($A2,10,1)*2),11)))
My test table:
-- Create a table called CPF
CREATE TABLE CPF(Id integer PRIMARY KEY, No integer);
-- Create few records in this table
INSERT INTO CPF VALUES(1, 12345678901);
My nested query:
SELECT No,
(CASE WHEN (J != J2) THEN 'J wrong!' ELSE 'J ok!' END) as Jchk,
(CASE WHEN (K != K2) THEN 'K wrong!' ELSE 'K ok!' END) as Kchk
FROM
(SELECT No, J, K,
(CASE WHEN MJ < 2 THEN 0 ELSE 11 - MJ END) as J2,
(CASE WHEN MK < 2 THEN 0 ELSE 11 - MK END) as K2
FROM
(SELECT No, J, K,
MOD(10*A + 9*B + 8*C + 7*D + 6*E + 5*F + 4*G + 3*H + 2*I, 11) as MJ,
MOD(11*A + 10*B + 9*C + 8*D + 7*E + 6*F + 5*G + 4*H + 3*I + 2*J, 11) as MK
FROM
(SELECT
No,
substr(to_char(No), 1, 1) as A,
substr(to_char(No), 2, 1) as B,
substr(to_char(No), 3, 1) as C,
substr(to_char(No), 4, 1) as D,
substr(to_char(No), 5, 1) as E,
substr(to_char(No), 6, 1) as F,
substr(to_char(No), 7, 1) as G,
substr(to_char(No), 8, 1) as H,
substr(to_char(No), 9, 1) as I,
substr(to_char(No), 10, 1) as J,
substr(to_char(No), 11, 1) as K
FROM CPF)))
;
Assuming you have a table with an id primary key column and a cpf column that is NUMBER(9,0) data type then something like:
WITH digits ( id, a, b, c, d, e, f, g, h, i ) AS (
SELECT id,
MOD( TRUNC( cpf / 1e8 ), 10 ),
MOD( TRUNC( cpf / 1e7 ), 10 ),
MOD( TRUNC( cpf / 1e6 ), 10 ),
MOD( TRUNC( cpf / 1e5 ), 10 ),
MOD( TRUNC( cpf / 1e4 ), 10 ),
MOD( TRUNC( cpf / 1e3 ), 10 ),
MOD( TRUNC( cpf / 1e2 ), 10 ),
MOD( TRUNC( cpf / 1e1 ), 10 ),
MOD( TRUNC( cpf / 1e0 ), 10 )
FROM your_table
),
values1 ( id, j, k ) AS (
SELECT id,
MOD( 10*A + 9*B + 8*C + 7*D + 6*E + 5*F + 4*G + 3*H + 2*I, 11 ),
11*A + 10*B + 9*C + 8*D + 7*E + 6*F + 5*G + 4*H + 3*I
FROM digits
),
values2 ( id, j, k ) AS (
SELECT id,
CASE WHEN j <= 1 THEN 0 ELSE 11 - j END,
MOD( k + 2 * CASE WHEN j <= 1 THEN 0 ELSE 11 - j END, 11 )
FROM values1
)
SELECT id,
j,
CASE WHEN k <= 1 THEN 0 ELSE 11 - k END AS k
FROM values2
@SAR622: great question and thanks for the algorithm.
Here is a T-SQL solution for SQL Server, just in case. Note that Cadastro de Pessoas Físicas (CPF) numbers have at most 11 digits (left-padded with zeros), that is, they cannot exceed 10^11-1. If you see 14-digit numbers in your dataset, these are likely Cadastro Nacional da Pessoa Jurídica (CNPJ) numbers issued to businesses (or typos or something else). Fake CPF and CNPJ numbers can be generated (in bulk) and validated (individually) here. Also this site provides more info about a business located by its CNPJ (think of it as an implicit CNPJ validation). When validating a CPF number, remember to check that the number is in the range [0, 10^11-1]. You may need to remove any punctuation symbols and other invalid characters (as users, we tend to make typos).
The first 5 rows of this input table are invalid CPF numbers and the last 4 are valid:
IF OBJECT_ID('tempdb..#x') IS NOT NULL DROP TABLE #x;
CREATE TABLE #x (CPF BIGINT default NULL);
INSERT INTO #x (CPF) VALUES (12345678900);
INSERT INTO #x (CPF) VALUES (11);
INSERT INTO #x (CPF) VALUES (1010101010101010);
INSERT INTO #x (CPF) VALUES (11111179011525590);
INSERT INTO #x (CPF) VALUES (-32081397641);
INSERT INTO #x (CPF) VALUES (00000008726210061);
INSERT INTO #x (CPF) VALUES (56000608314);
INSERT INTO #x (CPF) VALUES (73570630706);
INSERT INTO #x (CPF) VALUES (93957133564);
The following t-SQL function modularizes implementation, but will likely be slower than the raw t-SQL that follows. Alternatively, you can create a t-SQL function with a TABLE input/output or a stored procedure.
ALTER FUNCTION fnIsCPF(@n BIGINT) RETURNS INT AS
BEGIN
DECLARE @isValid BIT = 0;
IF (@n > 0 AND @n < 100000000000)
BEGIN
--Parse out numbers
DECLARE @a TINYINT = FLOOR( @n / 10000000000)% 10;
DECLARE @b TINYINT = FLOOR( @n / 1000000000)% 10;
DECLARE @c TINYINT = FLOOR( @n / 100000000)% 10;
DECLARE @d TINYINT = FLOOR( @n / 10000000)% 10;
DECLARE @e TINYINT = FLOOR( @n / 1000000)% 10;
DECLARE @f TINYINT = FLOOR( @n / 100000)% 10;
DECLARE @g TINYINT = FLOOR( @n / 10000)% 10;
DECLARE @h TINYINT = FLOOR( @n / 1000)% 10;
DECLARE @i TINYINT = FLOOR( @n / 100)% 10;
DECLARE @j TINYINT = ISNULL(NULLIF(NULLIF(11-( 10*@a + 9*@b + 8*@c + 7*@d + 6*@e + 5*@f + 4*@g + 3*@h + 2*@i) % 11, 11), 10), 0);
DECLARE @k TINYINT = ISNULL(NULLIF(NULLIF(11 - (11*@a +10*@b + 9*@c + 8*@d + 7*@e + 6*@f + 5*@g + 4*@h + 3*@i + 2 * @j)% 11, 11), 10), 0);
RETURN CASE WHEN @j=FLOOR(@n / 10)% 10 AND @k=FLOOR(@n)% 10 THEN 1 ELSE 0 END
END;
RETURN @isValid;
END;
The output is:
SELECT CPF, isValid=dbo.fnIsCPF(CPF) FROM #x
CPF isValid
12345678900 0
11 0
1010101010101010 0
11111179011525590 0
-32081397641 0
8726210061 1
56000608314 1
73570630706 1
93957133564 1
t-SQL for a table:
WITH digits ( CPF, a, b, c, d, e, f, g, h, i ) AS (
SELECT CPF,
FLOOR( CPF / 10000000000)% 10,
FLOOR( CPF / 1000000000)% 10,
FLOOR( CPF / 100000000)% 10,
FLOOR( CPF / 10000000)% 10,
FLOOR( CPF / 1000000)% 10,
FLOOR( CPF / 100000)% 10,
FLOOR( CPF / 10000)% 10,
FLOOR( CPF / 1000)% 10,
FLOOR( CPF / 100)% 10
FROM #x
),
jk ( CPF, j, k ) AS (
SELECT CPF, ISNULL(NULLIF(NULLIF(11-( 10*A + 9*B + 8*C + 7*D + 6*E + 5*F + 4*G + 3*H + 2*I) % 11, 11), 10), 0),
11*A +10*B + 9*C + 8*D + 7*E + 6*F + 5*G + 4*H + 3*I
FROM digits
),
jk2 ( CPF, j, k ) AS (
SELECT CPF, j, ISNULL(NULLIF(NULLIF(11 - (k + 2 * j)% 11, 11), 10), 0)
FROM jk
)
SELECT CPF, isValid=CASE WHEN CPF>0 AND CPF<99999999999 AND j=FLOOR( CPF / 10)% 10 AND k=FLOOR( CPF)% 10 THEN 1 ELSE 0 END
FROM jk2
yielding the same output.
I have a table which contains data like this:
MinFormat(int) MaxFormat(int) Precision(nvarchar)
-2 3 1/2
The values in precision can be 1/2, 1/4, 1/8, 1/16, 1/32, 1/64 only.
Now I want the query to return:
-2
-3/2
-1
-1/2
0
1/2
1
3/2
2
5/2
3
Is there a query that produces the result above?
The idea is to generate values from the minimum boundary (the MinFormat column, an integer) to the maximum boundary (the MaxFormat column, an integer) in steps of the precision value.
Hence, in the above example, the values should start at -2 and increase by the precision value (1/2) until they reach 3.
Note this will only work for Precision 1/1, 1/2, 1/4, 1/8, 1/16, 1/32 and 1/64
DECLARE @t table(MinFormat int, MaxFormat int, Precision varchar(4))
INSERT @t values(-2, 3, '1/2')
DECLARE @numerator INT, @denominator DECIMAL(9,7)
DECLARE @MinFormat INT, @MaxFormat INT
-- put a where clause on this to get the needed row
SELECT @numerator = 1,
@denominator = STUFF(Precision, 1, charindex('/', Precision), ''),
@MinFormat = MinFormat,
@MaxFormat = MaxFormat
FROM @t
;WITH N(N)AS
(SELECT 1 FROM(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1))M(N)),
tally(N)AS(SELECT ROW_NUMBER()OVER(ORDER BY N.N)FROM N,N a,N b,N c,N d,N e,N f)
SELECT top(cast((@MaxFormat- @MinFormat) / (@numerator/@denominator) as int) + 1)
CASE WHEN val % 1 = 0 THEN cast(cast(val as int) as varchar(10))
WHEN val*2 % 1 = 0 THEN cast(cast(val*2 as int) as varchar(10)) + '/2'
WHEN val*4 % 1 = 0 THEN cast(cast(val*4 as int) as varchar(10)) + '/4'
WHEN val*8 % 1 = 0 THEN cast(cast(val*8 as int) as varchar(10)) + '/8'
WHEN val*16 % 1 = 0 THEN cast(cast(val*16 as int) as varchar(10)) + '/16'
WHEN val*32 % 1 = 0 THEN cast(cast(val*32 as int) as varchar(10)) + '/32'
WHEN val*64 % 1 = 0 THEN cast(cast(val*64 as int) as varchar(10)) + '/64'
END
FROM tally
CROSS APPLY
(SELECT @MinFormat +(N-1) *(@numerator/@denominator) val) x
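The expected output for the sample row can be reproduced with exact rational arithmetic, which is handy for checking the query. A minimal Python sketch, assuming the precision is always written as '1/n' (`fraction_steps` is a hypothetical helper):

```python
from fractions import Fraction

def fraction_steps(min_format: int, max_format: int, precision: str) -> list[str]:
    step = Fraction(precision)   # e.g. Fraction('1/2')
    # same row count as the TOP(...) expression: (max - min) / step + 1
    count = int((max_format - min_format) / step) + 1
    # Fraction reduces automatically, so 2/2 prints as '1' and 6/4 as '3/2'
    return [str(min_format + i * step) for i in range(count)]

print(fraction_steps(-2, 3, "1/2"))
# ['-2', '-3/2', '-1', '-1/2', '0', '1/2', '1', '3/2', '2', '5/2', '3']
```

The CASE ladder in the query does by hand what `Fraction` does automatically: it picks the smallest power-of-two denominator that makes the value a whole number.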
Sorry, now I'm late, but this was my approach:
I'd wrap this in a TVF actually and call it like
SELECT * FROM dbo.FractalStepper(-2,1,'1/4');
or join it with your actual table like
SELECT *
FROM SomeTable
CROSS APPLY dbo.FractalStepper(MinFormat,MaxFormat,[Precision]) AS Steps
But anyway, this was the code:
DECLARE @tbl TABLE (ID INT, MinFormat INT,MaxFormat INT,Precision NVARCHAR(100));
--Inserting two examples
INSERT INTO @tbl VALUES(1,-2,3,'1/2')
,(2,-4,-1,'1/4');
--Test with example 1, just set it to 2 if you want to try the other example
DECLARE @ID INT=1;
--If you want to get your steps numbered, just uncomment the three occurrences of "Step"
WITH RecursiveCTE as
(
SELECT CAST(tbl.MinFormat AS FLOAT) AS RunningValue
,CAST(tbl.MaxFormat AS FLOAT) AS MaxF
,1/CAST(SUBSTRING(LTRIM(RTRIM(tbl.Precision)),3,10) AS FLOAT) AS Prec
--,1 AS Step
FROM @tbl AS tbl
WHERE tbl.ID=@ID
UNION ALL
SELECT RunningValue + Prec
,MaxF
,Prec
--,Step + 1
FROM RecursiveCTE
WHERE RunningValue + Prec <= MaxF
)
SELECT RunningValue --,Step
,CASE WHEN CAST(RunningValue AS INT)<>RunningValue
THEN CAST(RunningValue / Prec AS VARCHAR(10)) + '/' + CAST(CAST(1/Prec AS INT) AS VARCHAR(MAX))
ELSE CAST(RunningValue AS VARCHAR(10))
END AS RunningValueFractal
FROM RecursiveCTE;
The result
Value ValueFractal
-2 -2
-1,5 -3/2
-1 -1
-0,5 -1/2
0 0
0,5 1/2
1 1
1,5 3/2
2 2
2,5 5/2
3 3
Mine is a bit different from the others. I perform the addition on fractions and simplify the fraction at the end.
-- This solution uses CTE
-- it breaks the @min, @max number into fraction
-- perform the addition in terms of fraction
-- at result, it attempts to convert the fraction to simplest form
declare @min int,
@max int,
@step varchar(10),
@step_n int, -- precision step numerator portion
@step_d int -- precision step denominator portion
select @min = -2,
@max = 3,
@step = '1/16'
select @step_n = left(@step, charindex('/', @step) - 1),
@step_d = stuff(@step, 1, charindex('/', @step), '')
; with rcte as
(
-- Anchor member
select n = #min, -- numerator
d = 1, -- denominator
v = convert(decimal(10,5), #min)
union all
-- Recursive member
select n = case when ( (r.n * @step_d) + (r.d * @step_n) ) % @step_d = 0
and (r.d * @step_d) % @step_d = 0
then ( (r.n * @step_d) + (r.d * @step_n) ) / @step_d
else (r.n * @step_d) + (r.d * @step_n)
end,
d = case when ( (r.n * @step_d) + (r.d * @step_n) ) % @step_d = 0
and (r.d * @step_d) % @step_d = 0
then (r.d * @step_d) / @step_d
else (r.d * @step_d)
end,
v = convert(decimal(10,5), ((r.n * @step_d) + (r.d * @step_n)) / (r.d * @step_d * 1.0))
from rcte r
where r.v < @max
)
select *,
fraction = case when n = 0
then '0'
when coalesce(d2, d) = 1
then convert(varchar(10), coalesce(n2, n))
else convert(varchar(10), coalesce(n2, n)) + '/' + convert(varchar(10), coalesce(d2, d))
end
from rcte r
cross apply -- use to simplify the fraction result
(
select n2 = case when n % 32 = 0 and d % 32 = 0 then n / 32
when n % 16 = 0 and d % 16 = 0 then n / 16
when n % 8 = 0 and d % 8 = 0 then n / 8
when n % 4 = 0 and d % 4 = 0 then n / 4
when n % 2 = 0 and d % 2 = 0 then n / 2
end,
d2 = case when n % 32 = 0 and d % 32 = 0 then d / 32
when n % 16 = 0 and d % 16 = 0 then d / 16
when n % 8 = 0 and d % 8 = 0 then d / 8
when n % 4 = 0 and d % 4 = 0 then d / 4
when n % 2 = 0 and d % 2 = 0 then d / 2
end
) s
order by v
option (MAXRECURSION 0)
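The add-then-simplify idea can also be expressed with a greatest common divisor instead of a fixed list of divisors. The Python sketch below is an alternative illustration, not a translation of the query; `add_and_reduce` is a made-up name.

```python
from math import gcd

def add_and_reduce(n: int, d: int, step_n: int, step_d: int) -> tuple[int, int]:
    # n/d + step_n/step_d over a common denominator
    num = n * step_d + d * step_n
    den = d * step_d
    g = gcd(num, den)   # gcd(0, den) == den, so zero reduces to 0/1
    return num // g, den // g

# walk from -2 to 3 in steps of 1/16, keeping every value in lowest terms
n, d = -2, 1
values = [(n, d)]
while n < 3 * d:
    n, d = add_and_reduce(n, d, 1, 16)
    values.append((n, d))

print(len(values))   # 81
print(values[8])     # (-3, 2)
```

Reducing after every step keeps the numerator and denominator small, so no separate simplification pass is needed at the end.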
I have the following block of code that calculates the formula for a trend line using linear regression (method of least squares). It just finds the R-squared and coefficient of correlation values for the X and Y axes.
It calculates the exact value when the X and Y axes are int or float.
CREATE FUNCTION [dbo].[LinearReqression] (@Data AS XML)
RETURNS TABLE AS RETURN (
WITH Array AS (
SELECT x = n.value('@x', 'float'),
y = n.value('@y', 'float')
FROM @Data.nodes('/r/n') v(n)
),
Medians AS (
SELECT xbar = AVG(x), ybar = AVG(y)
FROM Array ),
BetaCalc AS (
SELECT Beta = SUM(xdelta * (y - ybar)) / NULLIF(SUM(xdelta * xdelta), 0)
FROM Array
CROSS JOIN Medians
CROSS APPLY ( SELECT xdelta = (x - xbar) ) xd ),
AlphaCalc AS (
SELECT Alpha = ybar - xbar * beta
FROM Medians
CROSS JOIN BetaCalc),
SSCalc AS (
SELECT SS_tot = SUM((y - ybar) * (y - ybar)),
SS_err = SUM((y - (Alpha + Beta * x)) * (y - (Alpha + Beta * x)))
FROM Array
CROSS JOIN Medians
CROSS JOIN AlphaCalc
CROSS JOIN BetaCalc )
SELECT r_squared = CASE WHEN SS_tot = 0 THEN 1.0
ELSE 1.0 - ( SS_err / SS_tot ) END,
Alpha, Beta
FROM AlphaCalc
CROSS JOIN BetaCalc
CROSS JOIN SSCalc
)
Usage:
DECLARE @DataTable TABLE (
SourceID INT,
x Date,
y FLOAT
) ;
INSERT INTO @DataTable ( SourceID, x, y )
SELECT ID = 0, x = 1.2, y = 1.0
UNION ALL SELECT 1, 1.6, 1
UNION ALL SELECT 2, 2.0, 1.5
UNION ALL SELECT 3, 2.0, 1.75
UNION ALL SELECT 4, 2.1, 1.85
UNION ALL SELECT 5, 2.1, 2
UNION ALL SELECT 6, 2.2, 3
UNION ALL SELECT 7, 2.2, 3
UNION ALL SELECT 8, 2.3, 3.5
UNION ALL SELECT 9, 2.4, 4
UNION ALL SELECT 10, 2.5, 4
UNION ALL SELECT 11, 3, 4.5 ;
-- Create and view XML data array
DECLARE @DataXML XML ;
SET @DataXML = (
SELECT -- FLOAT values are formatted in XML like "1.000000000000000e+000", increasing the character count
-- Converting them to VARCHAR first keeps the XML small without sacrificing precision
-- They are unpacked as FLOAT in the function either way
[@x] = CAST(x AS VARCHAR(20)),
[@y] = CAST(y AS VARCHAR(20))
FROM @DataTable
FOR XML PATH('n'), ROOT('r') ) ;
SELECT @DataXML ;
-- Get the results
SELECT * FROM dbo.LinearReqression (@DataXML) ;
In my case the X axis may also be a date column. How can I calculate the same regression analysis with date columns?
Short answer is: calculating trend line for dates is pretty much the same as calculating trend line for floats.
For dates you can choose some starting date and use number of days between the starting date and your dates as an X.
I didn't check your function itself and I assume that formulas there are correct.
Also, I don't understand why you generate XML out of the table and parse it back into the table inside the function. It is rather inefficient. You can simply pass the table.
I used your function to make two variants: for processing floats and for processing dates.
I'm using SQL Server 2008 for this example.
First, create a user-defined table type so that we can pass a table into the function:
CREATE TYPE [dbo].[FloatRegressionDataTableType] AS TABLE(
[x] [float] NOT NULL,
[y] [float] NOT NULL
)
GO
Then create the function that accepts such table:
CREATE FUNCTION [dbo].[LinearRegressionFloat] (@ParamData dbo.FloatRegressionDataTableType READONLY)
RETURNS TABLE AS RETURN (
WITH Array AS (
SELECT x,
y
FROM @ParamData
),
Medians AS (
SELECT xbar = AVG(x), ybar = AVG(y)
FROM Array ),
BetaCalc AS (
SELECT Beta = SUM(xdelta * (y - ybar)) / NULLIF(SUM(xdelta * xdelta), 0)
FROM Array
CROSS JOIN Medians
CROSS APPLY ( SELECT xdelta = (x - xbar) ) xd ),
AlphaCalc AS (
SELECT Alpha = ybar - xbar * beta
FROM Medians
CROSS JOIN BetaCalc),
SSCalc AS (
SELECT SS_tot = SUM((y - ybar) * (y - ybar)),
SS_err = SUM((y - (Alpha + Beta * x)) * (y - (Alpha + Beta * x)))
FROM Array
CROSS JOIN Medians
CROSS JOIN AlphaCalc
CROSS JOIN BetaCalc )
SELECT r_squared = CASE WHEN SS_tot = 0 THEN 1.0
ELSE 1.0 - ( SS_err / SS_tot ) END,
Alpha, Beta
FROM AlphaCalc
CROSS JOIN BetaCalc
CROSS JOIN SSCalc
)
GO
Very similarly, create a type for table with dates:
CREATE TYPE [dbo].[DateRegressionDataTableType] AS TABLE(
[x] [date] NOT NULL,
[y] [float] NOT NULL
)
GO
And create a function that accepts such table. For each given date it calculates the number of days between 2001-01-01 and the given date x using DATEDIFF and then casts the result to float to make sure that the rest of calculations is correct. You can try to remove the cast to float and you'll see the different result. You can choose any other starting date, it doesn't have to be 2001-01-01.
CREATE FUNCTION [dbo].[LinearRegressionDate] (@ParamData dbo.DateRegressionDataTableType READONLY)
RETURNS TABLE AS RETURN (
WITH Array AS (
SELECT CAST(DATEDIFF(day, '2001-01-01', x) AS float) AS x,
y
FROM @ParamData
),
Medians AS (
SELECT xbar = AVG(x), ybar = AVG(y)
FROM Array ),
BetaCalc AS (
SELECT Beta = SUM(xdelta * (y - ybar)) / NULLIF(SUM(xdelta * xdelta), 0)
FROM Array
CROSS JOIN Medians
CROSS APPLY ( SELECT xdelta = (x - xbar) ) xd ),
AlphaCalc AS (
SELECT Alpha = ybar - xbar * beta
FROM Medians
CROSS JOIN BetaCalc),
SSCalc AS (
SELECT SS_tot = SUM((y - ybar) * (y - ybar)),
SS_err = SUM((y - (Alpha + Beta * x)) * (y - (Alpha + Beta * x)))
FROM Array
CROSS JOIN Medians
CROSS JOIN AlphaCalc
CROSS JOIN BetaCalc )
SELECT r_squared = CASE WHEN SS_tot = 0 THEN 1.0
ELSE 1.0 - ( SS_err / SS_tot ) END,
Alpha, Beta
FROM AlphaCalc
CROSS JOIN BetaCalc
CROSS JOIN SSCalc
)
GO
This is how to test the functions:
-- test float data
DECLARE @FloatDataTable [dbo].[FloatRegressionDataTableType];
INSERT INTO @FloatDataTable (x, y)
VALUES
(1.2, 1.0)
,(1.6, 1)
,(2.0, 1.5)
,(2.0, 1.75)
,(2.1, 1.85)
,(2.1, 2)
,(2.2, 3)
,(2.2, 3)
,(2.3, 3.5)
,(2.4, 4)
,(2.5, 4)
,(3, 4.5);
SELECT * FROM dbo.LinearRegressionFloat(@FloatDataTable);
-- test date data
DECLARE @DateDataTable [dbo].[DateRegressionDataTableType];
INSERT INTO @DateDataTable (x, y)
VALUES
('2001-01-13', 1.0)
,('2001-01-17', 1)
,('2001-01-21', 1.5)
,('2001-01-21', 1.75)
,('2001-01-22', 1.85)
,('2001-01-22', 2)
,('2001-01-23', 3)
,('2001-01-23', 3)
,('2001-01-24', 3.5)
,('2001-01-25', 4)
,('2001-01-26', 4)
,('2001-01-31', 4.5);
SELECT * FROM dbo.LinearRegressionDate(@DateDataTable);
Here are two result sets:
r_squared Alpha Beta
----------------------------------------------------------
0.798224907472009 -2.66524390243902 2.46417682926829
r_squared Alpha Beta
----------------------------------------------------------
0.79822490747201 -2.66524390243902 0.246417682926829
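The date result can be cross-checked outside SQL. The Python sketch below is an independent re-implementation of the same least-squares formulas (not the T-SQL function); it converts each date to a day offset from 2001-01-01, as DATEDIFF does, and fits the same test data.

```python
from datetime import date

def linear_regression(xs: list[float], ys: list[float]) -> tuple[float, float, float]:
    """Least-squares fit y = alpha + beta * x; returns (r_squared, alpha, beta)."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    beta = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    alpha = ybar - xbar * beta
    ss_tot = sum((y - ybar) ** 2 for y in ys)
    ss_err = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_err / ss_tot
    return r_squared, alpha, beta

origin = date(2001, 1, 1)   # arbitrary starting date, as in the answer
days = [13, 17, 21, 21, 22, 22, 23, 23, 24, 25, 26, 31]
dates = [date(2001, 1, d) for d in days]
ys = [1.0, 1, 1.5, 1.75, 1.85, 2, 3, 3, 3.5, 4, 4, 4.5]
xs = [float((d - origin).days) for d in dates]   # same role as DATEDIFF(day, ...)

r2, alpha, beta = linear_regression(xs, ys)
print(r2, alpha, beta)
```

In this test data each day offset is ten times the corresponding float x, which is why Beta is one tenth of the float result while Alpha and r_squared are unchanged.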
I am looking for the SQL syntax to use HAVING in the following statement:
DECLARE @ORIG_LAT AS FLOAT = 40.4882011413574;
DECLARE @ORIG_LONG AS FLOAT = -80.1939010620117;
DECLARE @DISTANCE AS INT;
SELECT LATITUDE_DEG, LONGITUDE_DEG,SQRT(
POWER(69.1 * (LATITUDE_DEG - @ORIG_LAT), 2) +
POWER(69.1 * (@ORIG_LONG - LONGITUDE_DEG) * COS(LATITUDE_DEG / 57.3), 2)) AS DISTANCE
FROM NAVAIDS
HAVING DISTANCE < 80 --error
ORDER BY DISTANCE ASC;
Error:
Msg 207, Level 16, State 1, Line 9
Invalid column name 'distance'.
It's ok with the ORDER BY but I don't understand why it doesn't like the HAVING. Any help with direction? It is SQL Server 2008 R2
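As you have noticed, you can't reference a column alias in WHERE or HAVING, because those clauses are evaluated before the SELECT list; ORDER BY is evaluated afterwards, which is why the alias works there. The easiest solution is to wrap your statement in a subselect and apply your clause to that.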
SELECT *
FROM (
SELECT LATITUDE_DEG
, LONGITUDE_DEG
, SQRT(
POWER(69.1 * (LATITUDE_DEG - @ORIG_LAT), 2) +
POWER(69.1 * (@ORIG_LONG - LONGITUDE_DEG) * COS(LATITUDE_DEG / 57.3), 2)) AS DISTANCE
FROM NAVAIDS
) q
WHERE DISTANCE < 80
ORDER BY
DISTANCE ASC;
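The two-step shape of the subselect (compute the alias first, filter on it second) is sketched below in Python. The rows are made up for illustration, `approx_distance` is a hypothetical helper, and the 69.1 factor is assumed to be miles per degree of latitude.

```python
import math

def approx_distance(lat: float, lon: float, orig_lat: float, orig_lon: float) -> float:
    # same expression as the DISTANCE column in the query
    dy = 69.1 * (lat - orig_lat)
    dx = 69.1 * (orig_lon - lon) * math.cos(lat / 57.3)
    return math.sqrt(dy ** 2 + dx ** 2)

ORIG_LAT, ORIG_LONG = 40.4882011413574, -80.1939010620117
# made-up sample rows standing in for the NAVAIDS table
navaids = [(40.4882011413574, -80.1939010620117), (41.0, -80.0), (45.0, -90.0)]

# step 1 (inner SELECT): compute the aliased column for every row
with_distance = [(lat, lon, approx_distance(lat, lon, ORIG_LAT, ORIG_LONG)) for lat, lon in navaids]
# step 2 (outer WHERE and ORDER BY): filter and sort on the computed value
nearby = sorted((row for row in with_distance if row[2] < 80), key=lambda row: row[2])

print(len(nearby))   # 2
```

The filter can only run once the distance exists for each row, which is the same reason the SQL needs the derived table.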
select *
from (
SELECT LATITUDE_DEG, LONGITUDE_DEG, SQRT(
POWER(69.1 * (LATITUDE_DEG - @ORIG_LAT), 2) +
POWER(69.1 * (@ORIG_LONG - LONGITUDE_DEG) * COS(LATITUDE_DEG / 57.3), 2)
) AS DISTANCE
FROM NAVAIDS
) a
WHERE DISTANCE < 80
ORDER BY DISTANCE