We have a large database on which we have DB side pagination. This is quick, returning a page of 50 rows from millions of records in a small fraction of a second.
Users can define their own sort, basically choosing what column to sort by. Columns are dynamic - some have numeric values, some dates and some text.
While most sort as expected, text sorts in a dumb way. Well, I say dumb; it makes sense to computers, but it frustrates users.
For instance, sorting by a string record id gives something like:
rec1
rec10
rec14
rec2
rec20
rec3
rec4
...and so on.
I want this to take account of the number, so:
rec1
rec2
rec3
rec4
rec10
rec14
rec20
I can't control the input (otherwise I'd just format in leading 000s) and I can't rely on a single format - some are things like "{alpha code}-{dept code}-{rec id}".
I know a few ways to do this in C#, but can't pull down all the records to sort them, as that would be too slow.
Does anyone know a way to quickly apply a natural sort in SQL Server?
We're using:
ROW_NUMBER() over (order by {field name} asc)
And then we're paging by that.
We can add triggers, although we wouldn't. All their input is parametrised and the like, but I can't change the format - if they put in "rec2" and "rec10" they expect them to be returned just like that, and in natural order.
We have valid user input that follows different formats for different clients.
One might go rec1, rec2, rec3, ... rec100, rec101
While another might go: grp1rec1, grp1rec2, ... grp20rec300, grp20rec301
When I say we can't control the input I mean that we can't force users to change these standards - they have a value like grp1rec1 and I can't reformat it as grp01rec001, as that would be changing something used for lookups and linking to external systems.
These formats vary a lot, but are often mixtures of letters and numbers.
Sorting these in C# is easy - just break it up into { "grp", 20, "rec", 301 } and then compare sequence values in turn.
However, there may be millions of records and the data is paged, so I need the sort to be done on the SQL server.
SQL server sorts by value, not comparison - in C# I can split the values out to compare, but in SQL I need some logic that (very quickly) gets a single value that consistently sorts.
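To make the "single value that consistently sorts" idea concrete, here is a minimal sketch in Python (used only for illustration, not a SQL Server implementation). It assumes no run of digits is longer than PAD_WIDTH; the names natural_key and PAD_WIDTH are hypothetical.

```python
import re

PAD_WIDTH = 10  # assumption: no numeric run in the data is longer than this


def natural_key(value: str) -> str:
    """Left-pad every run of digits with zeros so a plain string sort is natural."""
    return re.sub(r"[0-9]+", lambda m: m.group().zfill(PAD_WIDTH), value)


ids = ["rec1", "rec10", "rec14", "rec2", "rec20", "rec3", "rec4"]
print(sorted(ids, key=natural_key))
# ['rec1', 'rec2', 'rec3', 'rec4', 'rec10', 'rec14', 'rec20']
```

The original value is never changed; only the derived key is compared, so lookups against external systems keep working.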
@moebius - your answer might work, but it does feel like an ugly compromise to add a sort-key for all these text values.
order by LEN(value), value
Not perfect, but works well in a lot of cases.
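A small Python sketch of the same ordering rule shows where it holds and where it breaks. This only mimics ORDER BY LEN(value), value with plain code-point comparison; SQL Server collations and LEN's handling of trailing spaces are not modelled, and the sample values are made up.

```python
def len_then_value(value: str):
    """Sort key equivalent to ORDER BY LEN(value), value."""
    return (len(value), value)


# Works when every value shares the same prefix.
same_prefix = ["rec10", "rec2", "rec1", "rec20"]
print(sorted(same_prefix, key=len_then_value))
# ['rec1', 'rec2', 'rec10', 'rec20']

# Breaks down with mixed prefixes: length wins over the alphabetic part.
mixed = ["grp1rec2", "rec10", "abc100", "rec2"]
print(sorted(mixed, key=len_then_value))
# ['rec2', 'rec10', 'abc100', 'grp1rec2']
```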
Most of the SQL-based solutions I have seen break when the data gets complex enough (e.g. more than one or two numbers in it). Initially I tried implementing a NaturalSort function in T-SQL that met my requirements (among other things, handles an arbitrary number of numbers within the string), but the performance was way too slow.
Ultimately, I wrote a scalar CLR function in C# to allow for a natural sort, and even with unoptimized code the performance calling it from SQL Server is blindingly fast. It has the following characteristics:
will sort the first 1,000 characters or so correctly (easily modified in code or made into a parameter)
properly sorts decimals, so 123.333 comes before 123.45
because of above, will likely NOT sort things like IP addresses correctly; if you wish different behaviour, modify the code
supports sorting a string with an arbitrary number of numbers within it
will correctly sort numbers up to 25 digits long (easily modified in code or made into a parameter)
The code is here:
using System;
using System.Data.SqlTypes;
using System.Text;
using Microsoft.SqlServer.Server;
public class UDF
{
[SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic=true)]
public static SqlString Naturalize(string val)
{
if (String.IsNullOrEmpty(val))
return val;
// Collapse runs of spaces into a single space.
while (val.Contains("  "))
val = val.Replace("  ", " ");
const int maxLength = 1000;
const int padLength = 25;
bool inNumber = false;
bool isDecimal = false;
int numStart = 0;
int numLength = 0;
int length = val.Length < maxLength ? val.Length : maxLength;
//TODO: optimize this so that we exit for loop once sb.ToString() >= maxLength
var sb = new StringBuilder();
for (var i = 0; i < length; i++)
{
int charCode = (int)val[i];
if (charCode >= 48 && charCode <= 57)
{
if (!inNumber)
{
numStart = i;
numLength = 1;
inNumber = true;
continue;
}
numLength++;
continue;
}
if (inNumber)
{
sb.Append(PadNumber(val.Substring(numStart, numLength), isDecimal, padLength));
inNumber = false;
}
isDecimal = (charCode == 46);
sb.Append(val[i]);
}
if (inNumber)
sb.Append(PadNumber(val.Substring(numStart, numLength), isDecimal, padLength));
var ret = sb.ToString();
if (ret.Length > maxLength)
return ret.Substring(0, maxLength);
return ret;
}
static string PadNumber(string num, bool isDecimal, int padLength)
{
return isDecimal ? num.PadRight(padLength, '0') : num.PadLeft(padLength, '0');
}
}
To register this so that you can call it from SQL Server, run the following commands in Query Analyzer:
CREATE ASSEMBLY SqlServerClr FROM 'SqlServerClr.dll' --put the full path to DLL here
go
CREATE FUNCTION Naturalize(@val as nvarchar(max)) RETURNS nvarchar(1000)
EXTERNAL NAME SqlServerClr.UDF.Naturalize
go
Then, you can use it like so:
select *
from MyTable
order by dbo.Naturalize(MyTextField)
Note: If you get an error in SQL Server along the lines of Execution of user code in the .NET Framework is disabled. Enable "clr enabled" configuration option., follow the instructions here to enable it. Make sure you consider the security implications before doing so. If you are not the db admin, make sure you discuss this with your admin before making any changes to the server configuration.
Note2: This code does not properly support internationalization (e.g., it assumes the decimal marker is "."), is not optimized for speed, etc. Suggestions on improving it are welcome!
Edit: Renamed the function to Naturalize instead of NaturalSort, since it does not do any actual sorting.
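For readers who want to experiment with the padding rules without deploying a CLR assembly, here is a rough Python port of the core loop. It is a sketch, not the author's code: it leaves out the duplicate-space collapsing, and the names naturalize and pad are only illustrative. It shows both the decimal behaviour and the IP-address caveat mentioned above.

```python
def pad(num: str, is_decimal: bool, pad_length: int) -> str:
    # Digits after a '.' are padded on the right, all others on the left.
    return num.ljust(pad_length, "0") if is_decimal else num.rjust(pad_length, "0")


def naturalize(val: str, max_length: int = 1000, pad_length: int = 25) -> str:
    out = []
    digits = ""
    is_decimal = False
    for ch in val[:max_length]:
        if "0" <= ch <= "9":
            digits += ch
            continue
        if digits:  # a numeric run just ended: emit it padded
            out.append(pad(digits, is_decimal, pad_length))
            digits = ""
        is_decimal = ch == "."
        out.append(ch)
    if digits:
        out.append(pad(digits, is_decimal, pad_length))
    return "".join(out)[:max_length]


print(sorted(["rec10", "rec2", "rec1"], key=naturalize))  # ['rec1', 'rec2', 'rec10']
print(naturalize("123.333") < naturalize("123.45"))        # True
# Caveat: the last octet is treated as a decimal fraction, so this is True too.
print(naturalize("10.0.0.10") < naturalize("10.0.0.2"))    # True
```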
I know this is an old question, but I just came across it, and since it hasn't got an accepted answer, here is mine.
I have always used ways similar to this:
SELECT [Column] FROM [Table]
ORDER BY RIGHT(REPLICATE('0', 1000) + LTRIM(RTRIM(CAST([Column] AS VARCHAR(MAX)))), 1000)
The only common times that this has issues is if your column won't cast to a VARCHAR(MAX), or if LEN([Column]) > 1000 (but you can change that 1000 to something else if you want), but you can use this rough idea for what you need.
Also this is much worse performance than normal ORDER BY [Column], but it does give you the result asked for in the OP.
Edit: Just to further clarify, the above will not work if you have decimal values such as 1, 1.15 and 1.5 (they will sort as {1, 1.5, 1.15}). That is not what is asked for in the OP, but it can easily be done by:
SELECT [Column] FROM [Table]
ORDER BY REPLACE(RIGHT(REPLICATE('0', 1000) + LTRIM(RTRIM(CAST([Column] AS VARCHAR(MAX)))) + REPLICATE('0', 100 - CHARINDEX('.', REVERSE(LTRIM(RTRIM(CAST([Column] AS VARCHAR(MAX))))), 1)), 1000), '.', '0')
Result: {1, 1.15, 1.5}
And still all entirely within SQL. This will not sort IP addresses because you're now getting into very specific number combinations as opposed to simple text + number.
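The first query's padding trick can be checked quickly outside the database. The Python sketch below mimics RIGHT(REPLICATE('0', 1000) + LTRIM(RTRIM(value)), 1000) for values up to 1000 characters; left_pad_key is an illustrative name and plain code-point comparison stands in for the column's collation.

```python
def left_pad_key(value: str, width: int = 1000) -> str:
    """Trim the value and left-pad it with zeros to a fixed width."""
    return value.strip().rjust(width, "0")


print(sorted(["rec10", "rec1", "rec2"], key=left_pad_key))
# ['rec1', 'rec2', 'rec10']

# The decimal limitation described above: 1.5 lands before 1.15.
print(sorted(["1.15", "1", "1.5"], key=left_pad_key))
# ['1', '1.5', '1.15']
```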
RedFilter's answer is great for reasonably sized datasets where indexing is not critical, however if you want an index, several tweaks are required.
First, mark the function as not doing any data access and being deterministic and precise:
[SqlFunction(DataAccess = DataAccessKind.None,
SystemDataAccess = SystemDataAccessKind.None,
IsDeterministic = true, IsPrecise = true)]
Next, MSSQL has a 900 byte limit on the index key size, so if the naturalized value is the only value in the index, it must be at most 450 characters long. If the index includes multiple columns, the return value must be even smaller. Two changes:
CREATE FUNCTION Naturalize(@str AS nvarchar(max)) RETURNS nvarchar(450)
EXTERNAL NAME ClrExtensions.Util.Naturalize
and in the C# code:
const int maxLength = 450;
Finally, you will need to add a computed column to your table, and it must be persisted (because MSSQL cannot prove that Naturalize is deterministic and precise), which means the naturalized value is actually stored in the table but is still maintained automatically:
ALTER TABLE YourTable ADD nameNaturalized AS dbo.Naturalize(name) PERSISTED
You can now create the index!
CREATE INDEX idx_YourTable_n ON YourTable (nameNaturalized)
I've also made a couple of changes to RedFilter's code: using chars for clarity, incorporating duplicate space removal into the main loop, exiting once the result is longer than the limit, setting maximum length without substring etc. Here's the result:
using System.Data.SqlTypes;
using System.Text;
using Microsoft.SqlServer.Server;
public static class Util
{
[SqlFunction(DataAccess = DataAccessKind.None, SystemDataAccess = SystemDataAccessKind.None, IsDeterministic = true, IsPrecise = true)]
public static SqlString Naturalize(string str)
{
if (string.IsNullOrEmpty(str))
return str;
const int maxLength = 450;
const int padLength = 15;
bool isDecimal = false;
bool wasSpace = false;
int numStart = 0;
int numLength = 0;
var sb = new StringBuilder();
for (var i = 0; i < str.Length; i++)
{
char c = str[i];
if (c >= '0' && c <= '9')
{
if (numLength == 0)
numStart = i;
numLength++;
}
else
{
if (numLength > 0)
{
sb.Append(pad(str.Substring(numStart, numLength), isDecimal, padLength));
numLength = 0;
}
if (c != ' ' || !wasSpace)
sb.Append(c);
isDecimal = c == '.';
if (sb.Length > maxLength)
break;
}
wasSpace = c == ' ';
}
if (numLength > 0)
sb.Append(pad(str.Substring(numStart, numLength), isDecimal, padLength));
if (sb.Length > maxLength)
sb.Length = maxLength;
return sb.ToString();
}
private static string pad(string num, bool isDecimal, int padLength)
{
return isDecimal ? num.PadRight(padLength, '0') : num.PadLeft(padLength, '0');
}
}
Here's a solution written for SQL 2000. It can probably be improved for newer SQL versions.
/**
* Returns a string formatted for natural sorting. This function is very useful when having to sort alpha-numeric strings.
*
* @author Alexandre Potvin Latreille (plalx)
* @param {nvarchar(4000)} string The string to format.
* @param {int} numberLength The length each number should have (including padding). This should be the length of the longest number. Defaults to 10.
* @param {char(50)} sameOrderChars A list of characters that should have the same order. Ex: '.-/'. Defaults to empty string.
*
* @return {nvarchar(4000)} A string for natural sorting.
* Example of use:
*
* SELECT Name FROM TableA ORDER BY Name
* TableA (unordered) TableA (ordered)
* ------------ ------------
* ID Name ID Name
* 1. A1. 1. A1-1.
* 2. A1-1. 2. A1.
* 3. R1 --> 3. R1
* 4. R11 4. R11
* 5. R2 5. R2
*
*
* As we can see, humans would expect A1., A1-1., R1, R2, R11 but that's not how SQL is sorting it.
* We can use this function to fix this.
*
* SELECT Name FROM TableA ORDER BY dbo.udf_NaturalSortFormat(Name, default, '.-')
* TableA (unordered) TableA (ordered)
* ------------ ------------
* ID Name ID Name
* 1. A1. 1. A1.
* 2. A1-1. 2. A1-1.
* 3. R1 --> 3. R1
* 4. R11 4. R2
* 5. R2 5. R11
*/
ALTER FUNCTION [dbo].[udf_NaturalSortFormat](
    @string nvarchar(4000),
    @numberLength int = 10,
    @sameOrderChars char(50) = ''
)
RETURNS varchar(4000)
AS
BEGIN
    DECLARE @sortString varchar(4000),
        @numStartIndex int,
        @numEndIndex int,
        @padLength int,
        @totalPadLength int,
        @i int,
        @sameOrderCharsLen int;
    SELECT
        @totalPadLength = 0,
        @string = RTRIM(LTRIM(@string)),
        @sortString = @string,
        @numStartIndex = PATINDEX('%[0-9]%', @string),
        @numEndIndex = 0,
        @i = 1,
        @sameOrderCharsLen = LEN(@sameOrderChars);
    -- Replace all char that have the same order by a space.
    WHILE (@i <= @sameOrderCharsLen)
    BEGIN
        SET @sortString = REPLACE(@sortString, SUBSTRING(@sameOrderChars, @i, 1), ' ');
        SET @i = @i + 1;
    END
    -- Pad numbers with zeros.
    WHILE (@numStartIndex <> 0)
    BEGIN
        SET @numStartIndex = @numStartIndex + @numEndIndex;
        SET @numEndIndex = @numStartIndex;
        WHILE(PATINDEX('[0-9]', SUBSTRING(@string, @numEndIndex, 1)) = 1)
        BEGIN
            SET @numEndIndex = @numEndIndex + 1;
        END
        SET @numEndIndex = @numEndIndex - 1;
        SET @padLength = @numberLength - (@numEndIndex + 1 - @numStartIndex);
        IF @padLength < 0
        BEGIN
            SET @padLength = 0;
        END
        SET @sortString = STUFF(
            @sortString,
            @numStartIndex + @totalPadLength,
            0,
            REPLICATE('0', @padLength)
        );
        SET @totalPadLength = @totalPadLength + @padLength;
        SET @numStartIndex = PATINDEX('%[0-9]%', RIGHT(@string, LEN(@string) - @numEndIndex));
    END
    RETURN @sortString;
END
I know this is a bit old at this point, but in my search for a better solution, I came across this question. I'm currently using a function to order by. It works fine for my purpose of sorting records which are named with mixed alpha numeric ('item 1', 'item 10', 'item 2', etc)
CREATE FUNCTION [dbo].[fnMixSort]
(
    @ColValue NVARCHAR(255)
)
RETURNS NVARCHAR(1000)
AS
BEGIN
    DECLARE @p1 NVARCHAR(255),
        @p2 NVARCHAR(255),
        @p3 NVARCHAR(255),
        @p4 NVARCHAR(255),
        @Index TINYINT
    IF @ColValue LIKE '[a-z]%'
        SELECT @Index = PATINDEX('%[0-9]%', @ColValue),
            @p1 = LEFT(CASE WHEN @Index = 0 THEN @ColValue ELSE LEFT(@ColValue, @Index - 1) END + REPLICATE(' ', 255), 255),
            @ColValue = CASE WHEN @Index = 0 THEN '' ELSE SUBSTRING(@ColValue, @Index, 255) END
    ELSE
        SELECT @p1 = REPLICATE(' ', 255)
    SELECT @Index = PATINDEX('%[^0-9]%', @ColValue)
    IF @Index = 0
        SELECT @p2 = RIGHT(REPLICATE(' ', 255) + @ColValue, 255),
            @ColValue = ''
    ELSE
        SELECT @p2 = RIGHT(REPLICATE(' ', 255) + LEFT(@ColValue, @Index - 1), 255),
            @ColValue = SUBSTRING(@ColValue, @Index, 255)
    SELECT @Index = PATINDEX('%[0-9,a-z]%', @ColValue)
    IF @Index = 0
        SELECT @p3 = REPLICATE(' ', 255)
    ELSE
        SELECT @p3 = LEFT(REPLICATE(' ', 255) + LEFT(@ColValue, @Index - 1), 255),
            @ColValue = SUBSTRING(@ColValue, @Index, 255)
    IF PATINDEX('%[^0-9]%', @ColValue) = 0
        SELECT @p4 = RIGHT(REPLICATE(' ', 255) + @ColValue, 255)
    ELSE
        SELECT @p4 = LEFT(@ColValue + REPLICATE(' ', 255), 255)
    RETURN @p1 + @p2 + @p3 + @p4
END
Then call
select item_name from my_table order by dbo.fnMixSort(item_name)
It easily triples the processing time for a simple data read, so it may not be the perfect solution.
Here is another solution that I like:
http://www.dreamchain.com/sql-and-alpha-numeric-sort-order/
It's not Microsoft SQL, but since I ended up here when I was searching for a solution for Postgres, I thought adding this here would help others.
EDIT: Here is the code, in case the link goes away.
CREATE or REPLACE FUNCTION pad_numbers(text) RETURNS text AS $$
SELECT regexp_replace(regexp_replace(regexp_replace(regexp_replace(($1 collate "C"),
E'(^|\\D)(\\d{1,3}($|\\D))', E'\\1000\\2', 'g'),
E'(^|\\D)(\\d{4,6}($|\\D))', E'\\1000\\2', 'g'),
E'(^|\\D)(\\d{7}($|\\D))', E'\\100\\2', 'g'),
E'(^|\\D)(\\d{8}($|\\D))', E'\\10\\2', 'g');
$$ LANGUAGE SQL;
"C" is the default collation in postgresql; you may specify any collation you desire, or remove the collation statement if you can be certain your table columns will never have a nondeterministic collation assigned.
usage:
SELECT * FROM wtf w
WHERE TRUE
ORDER BY pad_numbers(w.my_alphanumeric_field)
For the following varchar data:
BR1
BR2
External Location
IR1
IR2
IR3
IR4
IR5
IR6
IR7
IR8
IR9
IR10
IR11
IR12
IR13
IR14
IR16
IR17
IR15
VCR
This worked best for me:
ORDER BY substring(fieldName, 1, 1), LEN(fieldName)
If you're having trouble loading the data from the DB to sort in C#, then I'm sure you'll be disappointed with any approach at doing it programmatically in the DB. When the server is going to sort, it's got to calculate the "perceived" order just as you would have -- every time.
I'd suggest that you add an additional column to store the preprocessed sortable string, using some C# method, when the data is first inserted. You might try to convert the numerics into fixed-width ranges, for example, so "xyz1" would turn into "xyz00000001". Then you could use normal SQL Server sorting.
At the risk of tooting my own horn, I wrote a CodeProject article implementing the problem as posed in the CodingHorror article. Feel free to steal from my code.
Simply sort by:
ORDER BY
cast (substring(name,(PATINDEX('%[0-9]%',name)),len(name))as int)
I've just read an article somewhere about this topic. The key point is: you only need the integer value to sort the data, while the 'rec' string belongs to the UI. You could split the information into two fields, say alpha and num, sort by alpha and num (separately), and then show a string composed of alpha + num. You could use a computed column or a view to compose the string.
Hope it helps
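As a rough illustration of the alpha/num split (in Python rather than SQL, with the hypothetical helper name split_alpha_num), the value is cut at its trailing run of digits and the two parts are compared as a (text, integer) pair. It assumes the number sits at the end of the value, as in the rec1, rec10 examples.

```python
import re


def split_alpha_num(value: str):
    """Split 'rec10' into ('rec', 10); values without trailing digits get 0."""
    match = re.fullmatch(r"(.*?)([0-9]*)", value, flags=re.DOTALL)
    alpha, digits = match.group(1), match.group(2)
    return (alpha, int(digits) if digits else 0)


values = ["rec10", "rec2", "abc3", "rec1"]
print(sorted(values, key=split_alpha_num))
# ['abc3', 'rec1', 'rec2', 'rec10']
```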
You can use the following code to resolve the problem:
Select *,
substring(Cote,1,len(Cote) - Len(RIGHT(Cote, LEN(Cote) - PATINDEX('%[0-9]%', Cote)+1)))alpha,
CAST(RIGHT(Cote, LEN(Cote) - PATINDEX('%[0-9]%', Cote)+1) AS INT)intv
FROM Documents
left outer join Sites ON Sites.IDSite = Documents.IDSite
Order BY alpha, intv
regards,
rabihkahaleh#hotmail.com
I'm fashionably late to the party as usual. Nevertheless, here is my attempt at an answer that seems to work well (I would say that). It assumes text with digits at the end, like in the original example data.
First a function that won't end up winning a "pretty SQL" competition anytime soon.
CREATE FUNCTION udfAlphaNumericSortHelper (
    @string varchar(max)
)
RETURNS @results TABLE (
    txt varchar(max),
    num float
)
AS
BEGIN
    DECLARE @txt varchar(max) = @string
    DECLARE @numStr varchar(max) = ''
    DECLARE @num float = 0
    DECLARE @lastChar varchar(1) = ''
    set @lastChar = RIGHT(@txt, 1)
    WHILE @lastChar <> '' and @lastChar is not null
    BEGIN
        IF ISNUMERIC(@lastChar) = 1
        BEGIN
            set @numStr = @lastChar + @numStr
            set @txt = Substring(@txt, 0, len(@txt))
            set @lastChar = RIGHT(@txt, 1)
        END
        ELSE
        BEGIN
            set @lastChar = null
        END
    END
    SET @num = CAST(@numStr as float)
    INSERT INTO @results select @txt, @num
    RETURN;
END
Then call it like below:
declare @str nvarchar(250) = 'sox,fox,jen1,Jen0,jen15,jen02,jen0004,fox00,rec1,rec10,jen3,rec14,rec2,rec20,rec3,rec4,zip1,zip1.32,zip1.33,zip1.3,TT0001,TT01,TT002'
SELECT tbl.value --, sorter.txt, sorter.num
FROM STRING_SPLIT(@str, ',') as tbl
CROSS APPLY dbo.udfAlphaNumericSortHelper(value) as sorter
ORDER BY sorter.txt, sorter.num, len(tbl.value)
With results:
fox
fox00
Jen0
jen1
jen02
jen3
jen0004
jen15
rec1
rec2
rec3
rec4
rec10
rec14
rec20
sox
TT01
TT0001
TT002
zip1
zip1.3
zip1.32
zip1.33
I still don't understand (probably because of my poor English).
You could try:
ROW_NUMBER() OVER (ORDER BY dbo.human_sort(field_name) ASC)
But it won't work for millions of records.
That's why I suggested using a trigger that fills a separate column with the human value.
Moreover:
built-in T-SQL functions are really slow, and Microsoft suggests using .NET functions instead.
the human value is constant, so there is no point calculating it each time the query runs.
I would like to generate some UNIQUE random numbers in Snowflake with a specific starting/ending point. I would like for the numbers to start at 1,000 and end at 1,000,000.
Another requirement is joining a string at the beginning of these numbers.
So far I have been using this statement:
SELECT CONCAT('TEST-' , uniform(10000, 99000, RANDOM()));
Which works as expected and gives me the output of e.g. 'TEST-31633'.
However the problem is I am generating these for a large amount of rows, and I need for them to be completely unique.
I have heard of the 'SEQ1' functions however not sure how I could specify a starting point as well as adding a 'TEST-' with the CONCAT function at the beginning. Ideally they won't be in a strict sequence but differ from each other.
Thank You
It's not possible to guarantee uniqueness from a random number generator function unless you use some kind of calculated part. For example:
create or replace table testX as
select CONCAT('TEST-' , to_char(seq4(),'0000'),
uniform(100000, 990000, random())) c
from table(generator(rowcount => 1000));
Any time it's required to assign unique random numbers inside a range, that's best handled through a shuffle rather than generation of new random numbers. Simply generate the sequence, and then shuffle its order.
Depending on the use case, you could simply generate the rows and use order by random() with a limit on the number selected:
select * from (
select 'TEST-' || (seq4() + 10000)::string as MY_TEST_ID
from table(generator(rowcount => 89000))
) order by random() limit 10000
;
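The "generate the sequence, then pick from it at random" idea can be sketched outside Snowflake in a few lines of Python. This is only an illustration of the principle; unique_test_ids and its parameters are made-up names, and the bounds are taken from the question.

```python
import random


def unique_test_ids(lower: int, upper: int, count: int, prefix: str = "TEST-"):
    """Pick `count` distinct integers from [lower, upper] and prefix each one."""
    if count > upper - lower + 1:
        raise ValueError("range is too small for the requested number of ids")
    # random.sample draws without replacement, so no value can repeat.
    return [f"{prefix}{n}" for n in random.sample(range(lower, upper + 1), count)]


ids = unique_test_ids(1000, 1_000_000, 10_000)
print(len(ids), len(set(ids)))  # both counts are 10000
```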
A general-purpose solution to this problem is more complex but certainly possible. A JavaScript UDTF can populate an array with all values in the range and shuffle it. Using a window function on the JavaScript UDTF to ensure that the rows are distributed in a single block, it will allow creation of unique random integers in a range in any SQL statement.
First, create the table functions:
create or replace function UNIQUE_RANDOM_INTEGERS(LBOUND float, UBOUND float)
returns table (UNIQUE_RAND_INT float)
language javascript
strict volatile
as
$$
{
initialize: function (argumentInfo, context) {
this.lBound = argumentInfo.LBOUND.constValue;
this.uBound = argumentInfo.UBOUND.constValue;
this.rSpace = this.uBound - this.lBound + 1;
if (this.lBound >= this.uBound) throw new Error(">>> LBOUND and UBOUND must be constants and UBOUND must be greater than LBOUND <<<");
if (this.rSpace > 25000000) throw new Error (">>> The difference between LBOUND and UBOUND must be 25,000,000 or less.");
this.rands = new Array(this.rSpace);
this.currentRow = 0;
for (let i = 0; i < this.rands.length; i++) {
this.rands[i] = this.lBound + i;
}
this.rands = shuffle(this.rands);
function shuffle(array) {
let currentIndex = array.length, randomIndex;
while (currentIndex != 0) {
randomIndex = Math.floor(Math.random() * currentIndex);
currentIndex--;
[array[currentIndex], array[randomIndex]] = [array[randomIndex], array[currentIndex]];
}
return array;
}
},
processRow: function (row, rowWriter, context) {
//
},
finalize: function (rowWriter, context) {
for (let i = 0; i < this.rSpace; i++) {
rowWriter.writeRow({UNIQUE_RAND_INT:this.rands[i]});
}
},
}
$$;
create or replace function UNIQUE_RANDOM_INTEGERS(LBOUND int, UBOUND int)
returns table(UNIQUE_RAND_INT int)
language sql
as
$$
select UNIQUE_RAND_INT::int from table(UNIQUE_RANDOM_INTEGERS(LBOUND::float, UBOUND::float) over (partition by 1))
$$;
You can then generate unique random numbers in a range using this SQL:
select 'TEST-' || UNIQUE_RAND_INT as MY_RAND_VAL
from table(unique_random_integers(10000, 99000));
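The shuffle inside the UDTF is a Fisher-Yates shuffle. A minimal Python sketch of the same idea follows, for readers who want to check the logic in isolation; fisher_yates is an illustrative name and nothing here is Snowflake-specific.

```python
import random


def fisher_yates(values):
    """Return a shuffled copy; every input value appears exactly once."""
    items = list(values)
    # Walk from the last slot down, swapping it with a random slot at or before it.
    for current in range(len(items) - 1, 0, -1):
        pick = random.randint(0, current)  # randint is inclusive on both ends
        items[current], items[pick] = items[pick], items[current]
    return items


shuffled = fisher_yates(range(10000, 10100))
print(len(shuffled), len(set(shuffled)))  # 100 100
```

Because the shuffle only reorders an existing sequence, uniqueness comes from the sequence itself rather than from the random number generator.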
Another shuffle answer:
some setup:
set range_start = 1;
set range_end = 10;
set range_len = $range_end - $range_start;
set name_width = 6;
select
'row_' ||lpad(row_number()over(order by true),$name_width,0) as row_num
from table(generator(ROWCOUNT => $range_len));
gives:
ROW_NUM
row_000001
row_000002
row_000003
row_000004
row_000005
row_000006
row_000007
row_000008
row_000009
So there are no repeats. Now, to shuffle them, just give each one a random value and order by that:
select
'row_' ||lpad(row_number()over(order by true),$name_width,0) as row_num
,random() as rnd_order
from table(generator(ROWCOUNT => $range_len))
order by rnd_order;
ROW_NUM     RND_ORDER
row_000002  -7,719,769,195,714,581,512
row_000007  -7,227,137,977,152,948,936
row_000004  -4,651,924,808,142,841,706
row_000005  -4,571,360,566,799,746,506
row_000003  -1,059,648,113,216,115,246
row_000008  1,056,363,703,661,911,623
row_000001  2,281,704,740,829,606,855
row_000006  5,648,204,845,936,012,521
row_000009  8,998,464,501,571,068,967
Change the setup to be in the realm of what you want:
set range_start = 1000;
set range_end = 1000000;
set range_len = $range_end - $range_start;
Now if you only wanted 500K answers instead of 999K, slap a limit 500000 on the end.
Did you try UNIFORM? Maybe this will work for you.
https://docs.snowflake.com/en/sql-reference/functions/uniform.html
select CONCAT('TEST-' , uniform(10000, 99000, random())) from table(generator(rowcount => 100));
I'm trying to make a function in PostgreSQL that takes a pageSize and a page Number as params and from that, calculates the offset. This makes paging a lot easier for me.
This is what I have so far:
CREATE OR REPLACE FUNCTION calculate_paging(page_size INT, page_number INT)
RETURNS INT AS $calculate_paging$
DECLARE
offset INT;
max_int INT = 2147483647;
BEGIN
offset = CASE WHEN CAST(page_size AS BIGINT) * (page_number - 1) > max_int
THEN max_int
ELSE page_size * (page_number - 1)
RETURN offset
END;
$calculate_paging$ LANGUAGE plpgsql;
This does not work; I keep getting syntax errors. But the idea is to calculate the offset and return the offset value.
Without considering the real purpose of the function and just focusing on the syntax errors:
CREATE OR REPLACE FUNCTION calculate_paging(page_size INT, page_number INT)
RETURNS INT AS $$
DECLARE max_int INT = 2147483647;
BEGIN
IF page_size::bigint * (page_number - 1) > max_int THEN
RETURN max_int;
ELSE
RETURN page_size * (page_number - 1);
END IF;
END;
$$ LANGUAGE plpgsql;
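The arithmetic itself is small enough to check by hand. Here is a Python sketch of the same clamped offset calculation, using the same function name purely for illustration; Python integers do not overflow, so the bigint cast has no equivalent here.

```python
MAX_INT = 2_147_483_647  # largest 32-bit signed integer, as in the function above


def calculate_paging(page_size: int, page_number: int) -> int:
    """Offset of the first row on a 1-based page, clamped to MAX_INT."""
    offset = page_size * (page_number - 1)
    return min(offset, MAX_INT)


print(calculate_paging(50, 1))            # 0
print(calculate_paging(50, 3))            # 100
print(calculate_paging(2_000_000, 2000))  # 2147483647
```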
When trying to store in an Oracle SQL table a string of more than 100 chars, while the field limitation is 1000 bytes which I understood is ~1000 English chars, I'm getting out of bounds exception:
StringIndexOutOfBoundsException: String index out of range: -3
What might be the cause for this low limitation?
Thanks!
EDIT :
The code where the error occurs is (see chat):
// Commenting the existing code, because for sensitive information
// toString returns masked data
int nullSize = 5;
int i = 0;
// removing '[' and ']', signs and fields with 'null' value and also add
// ';' as delimiter.
while (i != -1) {
int index1 = str.indexOf('[', i);
int index2 = str.indexOf(']', i + 1);
i = index2;
if (index2 != -1 && index1 != -1) {
int index3 = str.indexOf('=', index1);
if (index3 + nullSize > str.length() || !str.substring(index3 + 1, index3 + nullSize).equals("null")) {
String str1 = str.substring(index1 + 1, index2);
concatStrings = concatStrings.append(str1);
concatStrings = concatStrings.append(";");
}
}
}
Generally, when the string to store in a varchar field is too long, it is cropped silently. Anyway, when there is an error message, it is generally specific. The error seems to be related to an operation on a string (String.substring()?).
Furthermore, even when the string is encoded in UTF-8, the ratio characters/bytes shouldn't be that low.
You really should put the code sample where your error occurs in your question, along with the string causing it, and also have a closer look at the stack trace to see where the error occurs.
From the code you posted in your chat, I can see this line of code :
String str1 = str.substring(index1 + 1, index2);
You check that index1 and index2 are different from -1, but you don't check whether (index1 + 1) > index2, which is what makes substring() throw.
Try this with str = "*]ab=null[" (whose length is well under 100 characters); you can also get the error with a longer string such as "osh]] [ = null ]Clipers: RRR was removed by user and customer called in to have it since it was an RRT".
Once again the size of the string doesn't matter, only the content!!!
You can reproduce your problem with a closing square bracket (]) before an opening one ([) and, between them, an equals sign (=) followed (directly or not) by the "null" string.
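To show the missing check in context, here is a Python port of the loop with the extra guard added. It is a sketch: extract_fields is a made-up name, and since Python slices do not raise on inverted bounds, the index1 < index2 test stands in for the check the Java code would need before calling substring().

```python
def extract_fields(text: str) -> str:
    """Collect '[...]' contents whose value is not 'null', separated by ';'."""
    null_size = 5
    parts = []
    i = 0
    while i != -1:
        index1 = text.find("[", i)
        index2 = text.find("]", i + 1)
        i = index2
        # Guard: only slice when the '[' really comes before the ']'.
        if index1 != -1 and index2 != -1 and index1 < index2:
            index3 = text.find("=", index1)
            if text[index3 + 1:index3 + null_size] != "null":
                parts.append(text[index1 + 1:index2] + ";")
    return "".join(parts)


print(extract_fields("[a=1][b=null][c=3]"))  # a=1;c=3;
print(repr(extract_fields("*]ab=null[")))    # '' (the crashing input is skipped)
```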
I agree with Jonathon Ogden "limitations of 1000 bytes does not necessarily mean 1000 characters as it depends on character encoding".
I recommend you to Alter column in your Oracle table from VARCHAR2(1000 Byte) to VARCHAR2(1000 Char).
****UPDATED****
How to convert an exponent and coefficient to an integer? Is there a built-in method in SQL?
This is the value in scientific notation 6,1057747657e+011
DECLARE @s VARCHAR(25);
DECLARE @i BIGINT;
SET @s = '6.1057747657e+011';
SET @i = CAST(@s as FLOAT(53));
SELECT @i;
Results 610577476570
You need to store the result as a BIGINT because the number is too large for a 32-bit INT. Note that an implicit conversion is being done from FLOAT(53) to BIGINT.
If you want to control the rounding, you can use the ROUND(), FLOOR() or CEILING() functions. For example:
SET @i = ROUND(CAST(@s as FLOAT(53)), -2);
If it is possible that the input string might contain an invalid number, you would need to add error handling.
DECLARE @s VARCHAR(25);
DECLARE @i BIGINT;
SET @s = 'rubbish';
BEGIN TRY
    SET @i = CAST(@s as FLOAT(53));
    SELECT @i;
END TRY
BEGIN CATCH
    -- error handling goes here
END CATCH
(Tested using T-SQL on SQL Server 2012.)
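For comparison, the same conversion with explicit error handling can be sketched in Python. Going through Decimal instead of a binary float avoids rounding, which is a deliberate difference from the FLOAT(53) cast above; sci_to_int is an illustrative name.

```python
from decimal import Decimal, InvalidOperation


def sci_to_int(text: str) -> int:
    """Convert scientific notation such as '6.1057747657e+011' to an exact int."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        raise ValueError(f"not a number: {text!r}")
    if not value.is_finite() or value != value.to_integral_value():
        raise ValueError(f"not a whole number: {text!r}")
    return int(value)


print(sci_to_int("6.1057747657e+011"))  # 610577476570
```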
Something like this, I think, but to turn it into an integer I guess you would use BigInteger? Let me research more.
string value = "6.1057747657e+011";
try {
Console.WriteLine(Double.Parse(value));
}
catch (OverflowException) {
Console.WriteLine("{0} is outside the range of the Double type.",
value);
}
Different answer below.
http://msdn.microsoft.com/en-us/library/dd268285(v=vs.110).aspx
The "e" or "E" character, which indicates that the value is
represented in exponential (scientific) notation. The value parameter
can represent a number in exponential notation if style includes the
NumberStyles.AllowExponent flag.
BigInteger value = BigInteger.Parse("-903145792771643190182");
string[] specifiers = { "C", "D", "D25", "E", "E4", "e8", "F0",
"G", "N0", "P", "R", "X", "0,0.000",
"#,#.00#;(#,#.00#)" };
foreach (string specifier in specifiers)
Console.WriteLine("{0}: {1}", specifier, value.ToString(specifier));
// The example displays the following output:
// C: ($903,145,792,771,643,190,182.00)
// D: -903145792771643190182
// D25: -0000903145792771643190182
// E: -9.031458E+020
// E4: -9.0315E+020
// e8: -9.03145793e+020
// F0: -903145792771643190182
// G: -903145792771643190182
// N0: -903,145,792,771,643,190,182
// P: -90,314,579,277,164,319,018,200.00 %
// R: -903145792771643190182
// X: CF0A55968BB1A7545A
// 0,0.000: -903,145,792,771,643,190,182.000
// #,#.00#;(#,#.00#): (903,145,792,771,643,190,182.00)
In your case you might do
BigInteger value = System.Numerics.BigInteger.Parse("6.1057747657e+011", NumberStyles.Float, CultureInfo.InvariantCulture);
You must have .NET Framework 4.5 or higher for this to work.
I have an SQL file that looks like this (clearly the real thing is a bit longer and actually does stuff :))
DECLARE @Mandatory int = 0
DECLARE @Fish int = 3
DECLARE @InitialPriceID int
if @Mandatory = 0
begin
    select @InitialPriceID = priceID from Fishes where FishID = @Fish
end
I have a file of 'Mandatory' and 'Fish' values
Mandatory,Fish
1,3
0,4
1,4
1,3
1,7
I need to write a program that will produce an SQL file (or files) for our DBO to run against the database. but I am not quite sure how to approach the problem...
Cheers
You should generally prefer set based solutions. I've no idea what the full solution would look like, but from the start you've given:
declare @Values table (Mandatory int, Fish int)
insert into @Values(Mandatory, Fish) values
(1,3),
(0,4),
(1,4),
(1,3),
(1,7)
;with Prices as (
    select
        v.Mandatory,
        v.Fish,
        CASE
            WHEN v.Mandatory = 0 THEN f.PriceID
            ELSE 55 /* Calculation for Mandatory = 1? */
        END as InitialPriceID
    from
        @Values v
        left join /* Or inner join? */
        Fishes f
        on
            v.Fish = f.FishID
) select * from Prices
You should aim to compute all of the results in one go, rather than trying to "loop through" each calculation. SQL works better this way.
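If the values arrive as a CSV file, the table-variable insert above can be generated rather than typed. Below is a minimal Python sketch under that assumption; build_values_sql is a hypothetical helper, and the column names come from the sample file in the question. Converting each field with int() also rejects anything that is not a number before it reaches the SQL text.

```python
import csv
import io


def build_values_sql(csv_text: str) -> str:
    """Turn 'Mandatory,Fish' CSV rows into one set-based INSERT statement."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [f"({int(r['Mandatory'])},{int(r['Fish'])})" for r in reader]
    if not rows:
        raise ValueError("no data rows found")
    return (
        "declare @Values table (Mandatory int, Fish int)\n"
        "insert into @Values(Mandatory, Fish) values\n"
        + ",\n".join(rows)
        + ";"
    )


sample = "Mandatory,Fish\n1,3\n0,4\n1,7\n"
print(build_values_sql(sample))
```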
At the risk of over-simplifying things in C# or similar you could use a string processing approach:
using System;
using System.IO;
using System.Linq;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        var sb = new StringBuilder();
        // Skip(1) skips the "Mandatory,Fish" header row.
        foreach (var line in File.ReadLines(@"c:\myfile.csv").Skip(1))
        {
            string[] values = line.Split(',');
            int mandatory = Int32.Parse(values[0]);
            int fish = Int32.Parse(values[1]);
            sb.AppendLine(new Foo(mandatory, fish).ToString());
        }
        File.WriteAllText(@"c:\myfile.sql", sb.ToString());
    }

    private sealed class Foo
    {
        public Foo(int mandatory, int fish)
        {
            this.Mandatory = mandatory;
            this.Fish = fish;
        }
        public int Mandatory { get; private set; }
        public int Fish { get; set; }
        // Each block ends with GO so the variables can be declared again in the next batch.
        public override string ToString()
        {
            return String.Format(@"DECLARE @Mandatory int = {0}
DECLARE @Fish int = {1}
DECLARE @InitialPriceID int
if @Mandatory = 0
begin
select @InitialPriceID = priceID from Fishes where FishID = @Fish
end
GO
", this.Mandatory, this.Fish);
        }
    }
}
There are many articles on how to read from a text file through T-SQL; check "Stored Procedure to Open and Read a text file" on SO. If you can change the format of your input files to XML, you can also check "SQL SERVER – Simple Example of Reading XML File Using T-SQL".