I have an nvarchar(500) column in SQL Server 2008 that contains letters and numbers, and here is what the data looks like when I use an ORDER BY clause in SQL Server...
env
guide
Seg 18 - NWS
Seg 19 - NWS
Seg 1A - ECC
Seg 1B - ECC
Seg 22 - xxx
Seg 23 - GL
Seg 3- GL
Seg 4 - GL
Utils
But I would like to get this result...
env
guide
Seg 1A - ECC
Seg 1B - ECC
Seg 3- GL
Seg 4 - GL
Seg 18 - NWS
Seg 19 - NWS
Seg 22 - xxx
Seg 23 - GL
Utils
Any suggestions?
This is called a natural sort, and it is going to be a nightmare for you if you don't own the database. You can follow the response here, which seems quite robust but involves adding a new CLR function.
As one of the commenters suggests, you can split the value and then sort on the three resulting columns, but if your text isn't a fixed width you may run into more problems and end up with a hacky solution. This is a decent T-SQL solution you can try, but it relies on fixed width... for which most people suggest padding your numbers.
What else do you want? Have you tried and found those wanting?
First, I assume you have only one number in your pattern. If not, you can extend the below code assuming you have some known rule for detecting the right string.
So, the code below (which I don't currently have a machine to test on...) finds the number's start index and length, extracts it, and converts it to an integer (I assume the string is in a variable named @data):
DECLARE @numindex int;
SELECT @numindex = PATINDEX('%[0-9]%', @data); -- position of the first digit

DECLARE @numlength int;
-- Position of the first non-digit after the number, minus one, gives the
-- number's length; the appended sentinel handles strings ending in a digit.
SELECT @numlength = PATINDEX('%[^0-9]%', SUBSTRING(@data, @numindex, LEN(@data) - @numindex + 1) + 'x') - 1;

-- This is the result below
SELECT CONVERT(int, SUBSTRING(@data, @numindex, @numlength));
If all the assumptions I wrote do suit you, you could either create a scalar valued function from this, or add this directly to the query (which may make the query a bit unreadable...).
Regarding performance, sorting like this on every query is obviously not ideal. If this is going to happen a lot and the data isn't going to change frequently, creating a view that could be cached might improve the performance.
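For illustration, here is how the same expressions might be applied inline in an ORDER BY (a sketch only; the table name Segments and column Name are hypothetical, and this is untested for the same reason as above):
-- Sort by the text before the first digit, then by the numeric value.
-- Rows with no digits sort on their full text with a numeric key of 0.
SELECT Name
FROM dbo.Segments
ORDER BY
    LEFT(Name, PATINDEX('%[0-9]%', Name + '0') - 1),
    CASE WHEN Name LIKE '%[0-9]%'
         THEN CONVERT(int, SUBSTRING(Name,
                  PATINDEX('%[0-9]%', Name),
                  PATINDEX('%[^0-9]%',
                      SUBSTRING(Name, PATINDEX('%[0-9]%', Name), LEN(Name)) + 'x') - 1))
         ELSE 0
    END;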
Here is one way to do this, assuming the only numeric values are what you posted. If there is the possibility that the suffix can also contain numbers, this will need a slight tweak. I am using a super awesome inline table valued function created by Dwain Camps at SQL Server Central. I know the site requires a login, but it is free and this technique is well worth signing up for.
http://www.sqlservercentral.com/articles/String+Manipulation/94365/
Using his function, this is pretty simple and 100% set based. No loops, WHILE statements, or cursors at all.
declare @Table table (SomeValue varchar(25))
insert @Table
select 'Seg 1A - ECC' union all
select 'Seg 1B - ECC' union all
select 'Seg 3- GL' union all
select 'Seg 4 - GL' union all
select 'Seg 18 - NWS' union all
select 'Seg 19 - NWS' union all
select 'Seg 22 - xxx' union all
select 'Seg 23 - GL'
select t.*
from @Table t
outer apply dbo.PatternSplitCM(t.SomeValue, '[0-9]') x
where x.Matched = 1
order by cast(x.Item as int), t.SomeValue
I have the below query
SELECT decode(detl_cd, 'AAA', trunc(amt * to_number(pct) / 100, 2),
              'BBB', trunc(amt * to_number(pct) / 100, 2),
              'CCC', (amt - trunc(amt * to_number(pct) / 100, 2)))
INTO trans_amount
FROM dual;
I get the value detl_cd from a cursor and amt from an input file.
Select tb1.id, tb2.detl_cd,tb2.pct
from tb1
join tb2 on tb1.agent_code=tb2.agent_code
where tb1.id='1';
Each id has 3 detl_cd's, and each detl_cd has a different calculation. How can I avoid hardcoding in the decode? Creating a table is not an option.
Input file
ID Amount
1 1000
2 2500
3 350
Ids 1 & 2 belong to a group that is assigned 3 different detl_cd's and different percentages (pct).
output file
ID Detl_cd Amount
1 AAA1 250
1 BBB1 250
1 CCC1 750
2 AAA3 625
2 BBB3 625
2 CCC3 1875
3 350
Each ID has 3 different detl_cd's, but the calculation for AAA1 and AAA2 is the same, as it is for BBB and CCC. Creating a table is not an option.
You want a solution which stores a set of business rules without specifying the business rules in the code. But also without creating a table to store those rules.
That only leaves a user-defined function.
create or replace function calc_amount
( p_detl_cd in varchar2
,p_amt in number
,p_pct in number )
return number
as
begin
case substr(p_detl_cd, 1, 3)
when 'AAA' then return trunc(p_amt * to_number(p_pct) / 100,2);
when 'BBB' then return trunc(p_amt * to_number(p_pct) / 100,2);
when 'CCC' then return (p_amt - trunc(p_amt * to_number(p_pct) / 100, 2));
end case;
end calc_amount;
You would call this function in SQL or in PL/SQL. You are a bit vague regarding tables and files, so I'm not really clear where the data comes from, but it might look something like this in PL/SQL:
trans_amount := calc_amount(detl_cd, amt, pct);
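Or, as a sketch of calling it from SQL using the join from the question (amt comes from the input file, so I'm showing it as a bind variable, which is an assumption on my part):
select tb1.id,
       tb2.detl_cd,
       calc_amount(tb2.detl_cd, :amt, tb2.pct) as trans_amount
from tb1
join tb2 on tb1.agent_code = tb2.agent_code
where tb1.id = '1';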
I am looking to avoid hardcoding 'AAA', as these codes may change or be replaced in future and I do not want rework.
Or the codes may stay the same and the calculations change. It doesn't matter. The hard truth is, you have to hard code the codes and their associated rules somewhere. It is impossible to have an infinitely flexible, soft-coded system.
A table is the easiest thing to maintain, and that offers you the most flexibility. But you need to use dynamic SQL or a function to apply the calculation; a function would be my preference. The worst solution would be to have the codes and calculations in an external configuration file which you load at the same time as the input file.
Alternatively, try to put a value on "may change". How likely is it that the codes (or calculations) will change? How often? Do the maths and maybe you'll discover that change is unlikely or very infrequent, and the simplest option is to stick with that decode and take the hit of rework should the occasion arise.
Incidentally, are you sure you mean trunc() and not round()?
I guess you may want
WITH data AS
 (select tb1.id, tb2.detl_cd, tb2.pct
    from tb1
    join tb2 on tb1.agent_code = tb2.agent_code
   where tb1.id = '1')
SELECT decode(detl_cd, 'AAA', trunc(amt * to_number(pct) / 100, 2),
              'BBB', trunc(amt * to_number(pct) / 100, 2),
              'CCC', (amt - trunc(amt * to_number(pct) / 100, 2)))
       as trans_amount
FROM data;
I would love some help with the best way to capture some column data and rotate it so I can store the column name and numeric value in a temp table.
The results are a single row showing a value for the columns listed here:
AccountingCode ActiveCostAllocationCode1Segment1 ActiveCostAllocationCode1Segment1Description
-------------- --------------------------------- --------------------------------------------
0 71 264
I would like to take the above query and rotate the output to look more vertical.
ColName Value
--------------------------------------------- ---------
AccountingCode 0
ActiveCostAllocationCode1Segment1 71
ActiveCostAllocationCode1Segment1Description 264
I was trying to use PIVOT / UNPIVOT but could not figure how to make it work for this case.
Any ideas?
If you are working with SQL Server then you can use APPLY:
SELECT tt.ColName, tt.Val
FROM yourtable t CROSS APPLY
( VALUES ('AccountingCode', AccountingCode),
('ActiveCostAllocationCode1Segment1', ActiveCostAllocationCode1Segment1),
('ActiveCostAllocationCode1Segment1Description', ActiveCostAllocationCode1Segment1Description)
) tt(ColName, Val);
In standard SQL you can use UNION ALL to unpivot the data.
The generic way in SQL is UNION ALL:
select 'AccountingCode', AccountingCode from t
union all
select 'ActiveCostAllocationCode1Segment1', ActiveCostAllocationCode1Segment1 from t
union all
select 'ActiveCostAllocationCode1Segment1Description', ActiveCostAllocationCode1Segment1Description from t
This assumes that the types of the columns are compatible (they all look like integers, so that is probably okay).
The better method is to use a lateral join (or apply in some databases), if your database supports it.
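For reference, a lateral-join version of the same unpivot might look like this in Postgres-style SQL (CROSS JOIN LATERAL plays the role APPLY plays in SQL Server; the table name t is assumed, as in the UNION ALL example):
-- Each row of t is joined to three (name, value) rows built from its columns
SELECT v.ColName, v.Val
FROM t
CROSS JOIN LATERAL (
    VALUES ('AccountingCode', AccountingCode),
           ('ActiveCostAllocationCode1Segment1', ActiveCostAllocationCode1Segment1),
           ('ActiveCostAllocationCode1Segment1Description', ActiveCostAllocationCode1Segment1Description)
) AS v(ColName, Val);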
I've been looking at this for the last hour and just can't seem to find a way to do it. I'm sure it's pretty simple, but my Google and reading skills have failed me.
All I need to do is to find ascending and descending numerical patterns in a field.
Like in this pseudo-SQL Code:
select * where col = '123456' or '23456' or '7654' or '987654321'
Most of the pattern methods using LIKE seem to be about the placement of characters/numbers rather than their specific ordering.
I've started trying to create a query that takes the first character and compares it to the next one, but this seems really ineffective and inefficient, as it would need to take each value in the column, run the query, and return it if it matches.
I've managed to find a way to detect a repeated character, but not an increase or decrease.
Any help would be greatly appreciated.
You can put a regular expression inside your LIKE quotes.
Ascending:
^(?=\d{4,10}$)1?2?3?4?5?6?7?8?9?0?$
Descending:
^(?=\d{4,10}$)9?8?7?6?5?4?3?2?1?0?$
\d{4,10} here is the possible value length, between 4 and 10 digits.
Won't be fast, most likely.
You can check how it works on http://rubular.com/.
Edit: Sorry, I forgot to mention you will have to set up MS SQL Server CLR integration first. By default, MS SQL Server does not fully support regular expressions.
This article describes how to create and use extensions for the LIKE (Transact-SQL) clause that support regular expressions.
http://www.codeproject.com/Articles/42764/Regular-Expressions-in-MS-SQL-Server
Another option could be something like this:
Declare @Table table (col int)
Insert into @Table values
(4141243),(4290577),(98765432),(78635389),(4141243),(22222),(4290046),(55555555),
(4141243),(6789),(77777),(45678),(4294461),(55555),(4141243),(5555)

Declare @Num table (Num int);
Insert Into @Num values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)

Select Distinct A.*
From @Table A
Join (
      Select Patt=replicate(Num,3) from @Num
      Union All
      Select Patt=right('000'+cast((Num*100+Num*10+Num)+12 as varchar(5)),3) from @Num where Num<8
      Union All
      Select Patt=reverse(right('000'+cast((Num*100+Num*10+Num)+12 as varchar(5)),3)) from @Num where Num<8
     ) B on CharIndex(Patt,cast(col as varchar(25)))>0
Returns
Col
5555
6789
22222
45678
55555
77777
55555555
98765432
Think RUMMY 500: groups or runs of 3. For example, 123, 321, or 333 would be a hit.
There are many grouping sets examples on the internet like query Q1 in the example below. But query Q2 is different because A2 is a grouping column and it is used as the argument to SUM().
Which one of the following is correct for Q2 according to the SQL Standard (any version since 2003 that supports grouping sets)? If (1) is correct, please explain why with reference to the Standard.
(1) A2 is replaced by NULL unless it is in an argument to an aggregate. This interpretation would give results R1 below. This is Oracle's behaviour (which seems more useful).
(2) A2 is replaced by NULL everywhere, including where it is used in an aggregate: this means that the aggregate will return NULL. This interpretation would give results R2 below. This is how I have understood the SQL Standard (possibly incorrectly).
Example code:
-- Setup
create table A (A1 int, A2 int, A3 int);
insert into A values (1, 1, 100);
insert into A values (1, 2, 40);
insert into A values (2, 1, 70);
insert into A values (5, 1, 90);
-- Query Q1
-- Expected/Observed results:
--
-- A1 A2 SUM(A3)
-- ---------- ---------- ----------
-- 1 - 140
-- 2 - 70
-- 5 - 90
-- - 1 260
-- - 2 40
-- - - 300
select A1, A2, sum (A3)
from A
group by grouping sets ((A1), (A2), ())
order by 1, 2;
-- Query Q2
-- Results R1 (Oracle):
-- A1 A2 SUM(A2)
-- ---------- ---------- ----------
-- 1 - 3
-- 2 - 1
-- 5 - 1
-- - 1 3
-- - 2 2
-- - - 5
--
-- Results R2 (SQL Standard?):
-- A1 A2 SUM(A2)
-- ---------- ---------- ----------
-- 1 - -
-- 2 - -
-- 5 - -
-- - 1 3
-- - 2 2
-- - - - -- NULL row
select A1, A2, sum (A2)
from A
group by grouping sets ((A1), (A2), ())
order by 1, 2;
I am aware of this from SQL 2003 7.9 Syntax Rule 17, which describes how columns are replaced with NULLs. However, I might have missed or misunderstood a rule elsewhere that excludes arguments to aggregates.
m) For each GS_i:
iii) Case:
1) If GS_i is an <ordinary grouping set>, then
A) Transform SL2 to obtain SL3, and transform HC to obtain
HC3, as follows:
II) Replace each <column reference> in SL2 and HC that
references PC_k by "CAST(NULL AS DTPCk)"
As with many difficult SQL features, it can help to look at earlier versions of the standard where the phrasing might be simpler. And it turns out that grouping sets were introduced in SQL 1999 and were then revised in SQL 2003.
SQL 1999
Syntax Rule 4 states:
Let SING be the <select list> constructed by removing from SL every <select
sublist> that is not a <derived column> that contains at least one <set
function specification>.
Then Syntax Rule 11 defines PC_k as the column references contained in the group by list. It constructs a derived table projecting the union of GSQQL_i, which are query specifications projecting the PC_k or NULL as appropriate, the PCBIT_i grouping function indicators and SING.
Thus any <derived column> that contains a set function will not have its argument replaced, and its columns won't be replaced either. So answer (1) is correct.
However, in the following query the GSQQL_i corresponding to the <grand total> doesn't group by C1 so I think it will give an error rather than replacing C1 with NULL for that grouping set.
select C1 + MAX(C2) from T group by grouping sets ((C1), ());
SQL 2003 - 2011
I still don't have a definitive answer for this. It hinges on what they meant (or forgot to specify?) by "references" in the replacement rule. It would be clearer if it said one of "immediately contained", "simply contained" or "directly contained", as defined in ISO 9075-1 (SQL Part 1: Framework).
The note (number 134 in SQL 2003) at the start of the General Rules says "As a result of the syntactic transformations specified in the Syntax Rules of this Sub-clause, only primitive <group by clause>s are left to consider." So the aggregate argument either has or has not actually been replaced: we aren't expected to evaluate aggregates in a special way (whereas if General Rule 3 were in effect applied before the NULL substitution of Syntax Rule 17, then answer (1) would be correct).
I found a draft of Technical Corrigendum 5 [pdf], which is a "diff" against SQL 2003. This includes the relevant changes to the <group by clause> sub-clause on pages 80-87. Unfortunately, the bulk of the change has only the brief rationale "Provide a correct, unified treatment of CUBE and ROLLUP". General Rule 3, mentioned above, has the rationale "clarify the semantics of column references".
For certain types of SQL queries, an auxiliary table of numbers can be very useful. It may be created as a table with as many rows as you need for a particular task, or as a user-defined function that returns the number of rows required in each query.
What is the optimal way to create such a function?
Heh... sorry I'm so late responding to an old post. And, yeah, I had to respond because the most popular answer (at the time, the Recursive CTE answer with the link to 14 different methods) on this thread is, ummm... performance challenged at best.
First, the article with the 14 different solutions is fine for seeing the different methods of creating a Numbers/Tally table on the fly but as pointed out in the article and in the cited thread, there's a very important quote...
"suggestions regarding efficiency and
performance are often subjective.
Regardless of how a query is being
used, the physical implementation
determines the efficiency of a query.
Therefore, rather than relying on
biased guidelines, it is imperative
that you test the query and determine
which one performs better."
Ironically, the article itself contains many subjective statements and "biased guidelines" such as "a recursive CTE can generate a number listing pretty efficiently" and "This is an efficient method of using WHILE loop from a newsgroup posting by Itzik Ben-Gan" (which I'm sure he posted just for comparison purposes). C'mon folks... just mentioning Itzik's good name may lead some poor slob into actually using that horrible method. The author should practice what (s)he preaches and should do a little performance testing before making such ridiculously incorrect statements, especially in the face of any scalability.
With the thought of actually doing some testing before making any subjective claims about what any code does or what someone "likes", here's some code you can do your own testing with. Set up Profiler for the SPID you're running the test from and check it out for yourself... just do a "Search'n'Replace" of the number 1000000 for your "favorite" number and see...
--===== Test for 1000000 rows ==================================
GO
--===== Traditional RECURSIVE CTE method
WITH Tally (N) AS
(
SELECT 1 UNION ALL
SELECT 1 + N FROM Tally WHERE N < 1000000
)
SELECT N
INTO #Tally1
FROM Tally
OPTION (MAXRECURSION 0);
GO
--===== Traditional WHILE LOOP method
CREATE TABLE #Tally2 (N INT);
SET NOCOUNT ON;
DECLARE @Index INT;
SET @Index = 1;
WHILE @Index <= 1000000
BEGIN
INSERT #Tally2 (N)
VALUES (@Index);
SET @Index = @Index + 1;
END;
GO
--===== Traditional CROSS JOIN table method
SELECT TOP (1000000)
ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS N
INTO #Tally3
FROM Master.sys.All_Columns ac1
CROSS JOIN Master.sys.ALL_Columns ac2;
GO
--===== Itzik's CROSS JOINED CTE method
WITH E00(N) AS (SELECT 1 UNION ALL SELECT 1),
E02(N) AS (SELECT 1 FROM E00 a, E00 b),
E04(N) AS (SELECT 1 FROM E02 a, E02 b),
E08(N) AS (SELECT 1 FROM E04 a, E04 b),
E16(N) AS (SELECT 1 FROM E08 a, E08 b),
E32(N) AS (SELECT 1 FROM E16 a, E16 b),
cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY N) FROM E32)
SELECT N
INTO #Tally4
FROM cteTally
WHERE N <= 1000000;
GO
--===== Housekeeping
DROP TABLE #Tally1, #Tally2, #Tally3, #Tally4;
GO
While we're at it, here's the numbers I get from SQL Profiler for the values of 100, 1000, 10000, 100000, and 1000000...
SPID TextData Dur(ms) CPU Reads Writes
---- ---------------------------------------- ------- ----- ------- ------
51 --===== Test for 100 rows ============== 8 0 0 0
51 --===== Traditional RECURSIVE CTE method 16 0 868 0
51 --===== Traditional WHILE LOOP method CR 73 16 175 2
51 --===== Traditional CROSS JOIN table met 11 0 80 0
51 --===== Itzik's CROSS JOINED CTE method 6 0 63 0
51 --===== Housekeeping DROP TABLE #Tally 35 31 401 0
51 --===== Test for 1000 rows ============= 0 0 0 0
51 --===== Traditional RECURSIVE CTE method 47 47 8074 0
51 --===== Traditional WHILE LOOP method CR 80 78 1085 0
51 --===== Traditional CROSS JOIN table met 5 0 98 0
51 --===== Itzik's CROSS JOINED CTE method 2 0 83 0
51 --===== Housekeeping DROP TABLE #Tally 6 15 426 0
51 --===== Test for 10000 rows ============ 0 0 0 0
51 --===== Traditional RECURSIVE CTE method 434 344 80230 10
51 --===== Traditional WHILE LOOP method CR 671 563 10240 9
51 --===== Traditional CROSS JOIN table met 25 31 302 15
51 --===== Itzik's CROSS JOINED CTE method 24 0 192 15
51 --===== Housekeeping DROP TABLE #Tally 7 15 531 0
51 --===== Test for 100000 rows =========== 0 0 0 0
51 --===== Traditional RECURSIVE CTE method 4143 3813 800260 154
51 --===== Traditional WHILE LOOP method CR 5820 5547 101380 161
51 --===== Traditional CROSS JOIN table met 160 140 479 211
51 --===== Itzik's CROSS JOINED CTE method 153 141 276 204
51 --===== Housekeeping DROP TABLE #Tally 10 15 761 0
51 --===== Test for 1000000 rows ========== 0 0 0 0
51 --===== Traditional RECURSIVE CTE method 41349 37437 8001048 1601
51 --===== Traditional WHILE LOOP method CR 59138 56141 1012785 1682
51 --===== Traditional CROSS JOIN table met 1224 1219 2429 2101
51 --===== Itzik's CROSS JOINED CTE method 1448 1328 1217 2095
51 --===== Housekeeping DROP TABLE #Tally 8 0 415 0
As you can see, the Recursive CTE method is the second worst, behind only the While Loop for Duration and CPU, and it has 8 times the memory pressure in the form of logical reads compared to the While Loop. It's RBAR on steroids and should be avoided, at all cost, for any single-row calculations, just as a While Loop should be avoided. There are places where recursion is quite valuable, but this ISN'T one of them.
As a side bar, Mr. Denny is absolutely spot on... a correctly sized permanent Numbers or Tally table is the way to go for most things. What does correctly sized mean? Well, most people use a Tally table to generate dates or to do splits on VARCHAR(8000). If you create an 11,000 row Tally table with the correct clustered index on "N", you'll have enough rows to create more than 30 years' worth of dates (I work with mortgages a fair bit, so 30 years is a key number for me) and certainly enough to handle a VARCHAR(8000) split. Why is "right sizing" so important? If the Tally table is used a lot, it easily fits in cache, which makes it blazingly fast without much pressure on memory at all.
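For concreteness, here's a common way such a permanent table might be built (a sketch; the table and constraint names are my own, not from the original post):
--===== Build an 11,000 row permanent Tally table with a clustered index on N
SELECT TOP 11000
       IDENTITY(INT, 1, 1) AS N
INTO   dbo.Tally
FROM   master.sys.all_columns ac1
CROSS JOIN master.sys.all_columns ac2;

ALTER TABLE dbo.Tally
  ADD CONSTRAINT PK_Tally_N PRIMARY KEY CLUSTERED (N);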
Last but not least, everyone knows that if you create a permanent Tally table, it doesn't much matter which method you use to build it because 1) it's only going to be made once and 2) if it's something like an 11,000 row table, all of the methods are going to run "good enough". So why all the indignation on my part about which method to use???
The answer is that some poor guy or gal who doesn't know any better, and just needs to get his or her job done, might see something like the Recursive CTE method and decide to use it for something much larger and much more frequently used than building a permanent Tally table, and I'm trying to protect those people, the servers their code runs on, and the company that owns the data on those servers. Yeah... it's that big a deal. It should be for everyone else, as well. Teach the right way to do things instead of "good enough". Do some testing before posting or using something from a post or book... the life you save may, in fact, be your own, especially if you think a recursive CTE is the way to go for something like this. ;-)
Thanks for listening...
The optimal approach is to use a table instead of a function. Using a function causes extra CPU load to create the values for the data being returned, especially if the values being returned cover a very large range.
This article gives 14 different possible solutions with discussion of each. The important point is that:
suggestions regarding efficiency and
performance are often subjective.
Regardless of how a query is being
used, the physical implementation
determines the efficiency of a query.
Therefore, rather than relying on
biased guidelines, it is imperative
that you test the query and determine
which one performs better.
I personally liked:
WITH Nbrs ( n ) AS (
SELECT 1 UNION ALL
SELECT 1 + n FROM Nbrs WHERE n < 500 )
SELECT n FROM Nbrs
OPTION ( MAXRECURSION 500 )
This view is super fast and contains all positive int values.
CREATE VIEW dbo.Numbers
WITH SCHEMABINDING
AS
WITH Int1(z) AS (SELECT 0 UNION ALL SELECT 0)
, Int2(z) AS (SELECT 0 FROM Int1 a CROSS JOIN Int1 b)
, Int4(z) AS (SELECT 0 FROM Int2 a CROSS JOIN Int2 b)
, Int8(z) AS (SELECT 0 FROM Int4 a CROSS JOIN Int4 b)
, Int16(z) AS (SELECT 0 FROM Int8 a CROSS JOIN Int8 b)
, Int32(z) AS (SELECT TOP 2147483647 0 FROM Int16 a CROSS JOIN Int16 b)
SELECT ROW_NUMBER() OVER (ORDER BY z) AS n
FROM Int32
GO
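A usage sketch (the results are correct regardless; how early the optimizer stops generating rows is worth checking in the execution plan):
-- Numbers 1 through 100 from the view
SELECT n
FROM dbo.Numbers
WHERE n <= 100;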
From SQL Server 2022 you will be able to do
SELECT Value
FROM GENERATE_SERIES(START = 1, STOP = 100, STEP=1)
In the public preview of SQL Server 2022 (CTP2.0) there are some very promising elements and other less so. Hopefully the negative aspects can be addressed before the actual release.
✅ Execution time for number generation
The below generates 10,000,000 numbers in 700 ms in my test VM (the assigning to a variable removes any overhead from sending results to the client)
DECLARE @Value INT
SELECT @Value = [value]
FROM GENERATE_SERIES(START=1, STOP=10000000)
✅ Cardinality estimates
It is simple to calculate how many numbers will be returned from the operator, and SQL Server takes advantage of this to produce accurate cardinality estimates.
❌ Unnecessary Halloween Protection
The plan for the below insert has a completely unnecessary spool - presumably as SQL Server does not currently have logic to determine that the source of the rows cannot be the destination.
CREATE TABLE dbo.NumberHeap(Number INT);

INSERT INTO dbo.NumberHeap
SELECT [value]
FROM GENERATE_SERIES(START=1, STOP=10);
When inserting into a table with a clustered index on Number the spool may be replaced by a sort instead (that also provides the phase separation)
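For comparison, a sketch of the clustered-index case described above (the table name is my own, using the same CTP 2.0 syntax):
CREATE TABLE dbo.NumberClustered(Number INT PRIMARY KEY CLUSTERED);

INSERT INTO dbo.NumberClustered
SELECT [value]
FROM GENERATE_SERIES(START=1, STOP=10);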
❌ Unnecessary sorts
The below will return the rows in order anyway but SQL Server apparently does not yet have the properties set to guarantee this and take advantage of it in the execution plan.
SELECT [value]
FROM GENERATE_SERIES(START=1, STOP=10)
ORDER BY [value]
Re: this last point, Aaron Bertrand indicates that this is not a box currently ticked, but that it may be forthcoming.
Using SQL Server 2016+, to generate a numbers table you could use OPENJSON:
-- range from 0 to @max - 1
DECLARE @max INT = 40000;

SELECT rn = CAST([key] AS INT)
FROM OPENJSON(CONCAT('[1', REPLICATE(CAST(',1' AS VARCHAR(MAX)),@max-1),']'));
Idea taken from How can we use OPENJSON to generate series of numbers?
Jeff Moden's answer is great... but I find on Postgres that the Itzik method fails unless you remove the E32 row.
Slightly faster on Postgres (40ms vs 100ms) is another method I found on here, adapted for Postgres:
WITH
E00 (N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 ),
E01 (N) AS (SELECT a.N FROM E00 a CROSS JOIN E00 b),
E02 (N) AS (SELECT a.N FROM E01 a CROSS JOIN E01 b ),
E03 (N) AS (SELECT a.N FROM E02 a CROSS JOIN E02 b
LIMIT 11000 -- end record 11,000 good for 30 yrs dates
), -- max is 100,000,000, starts slowing e.g. 1 million 1.5 secs, 2 mil 2.5 secs, 3 mill 4 secs
Tally (N) as (SELECT row_number() OVER (ORDER BY a.N) FROM E03 a)
SELECT N
FROM Tally
As I am moving from the SQL Server world to Postgres, I may have missed a better way to do tally tables on that platform... INTEGER()? SEQUENCE()?
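For what it's worth, Postgres does ship a set-returning function for exactly this, which makes the CTE stack optional there:
-- Postgres built-in: produces the numbers directly, no CTE stack needed
SELECT n
FROM generate_series(1, 11000) AS t(n);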
Still much later, I'd like to contribute a slightly different 'traditional' CTE (does not touch base tables to get the volume of rows):
--===== Hans CROSS JOINED CTE method
WITH Numbers_CTE (Digit)
AS
(
    SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
    UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
)
SELECT HundredThousand.Digit * 100000
     + TenThousand.Digit * 10000
     + Thousand.Digit * 1000
     + Hundred.Digit * 100
     + Ten.Digit * 10
     + One.Digit AS Number
INTO #Tally5
FROM Numbers_CTE AS One
CROSS JOIN Numbers_CTE AS Ten
CROSS JOIN Numbers_CTE AS Hundred
CROSS JOIN Numbers_CTE AS Thousand
CROSS JOIN Numbers_CTE AS TenThousand
CROSS JOIN Numbers_CTE AS HundredThousand;
This CTE performs more READs than Itzik's CTE, but fewer than the Traditional CTE.
However, it consistently performs fewer WRITES than the other queries.
As you know, writes are consistently much more expensive than reads.
The duration depends heavily on the number of cores (MAXDOP) but, on my 8-core, it consistently runs quicker (less duration in ms) than the other queries.
I am using:
Microsoft SQL Server 2012 - 11.0.5058.0 (X64)
May 14 2014 18:34:29
Copyright (c) Microsoft Corporation
Enterprise Edition (64-bit) on Windows NT 6.3 <X64> (Build 9600: )
on Windows Server 2012 R2, 32 GB, Xeon X3450 @ 2.67GHz, 4 cores, HT enabled.