Essentially solving a linear equation in SQL

I hope someone can help.
I have a table that contains fixed-size data bundles such as 50 GB, 100 GB, 250 GB and 1000 GB. There are more bundles than these; they are just examples.
I want to create something where I pass in a number such as 1250 GB and get back a list of the bundle sizes from that table that add up to 1250 GB.
How would I go about doing something like this?

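There is a restriction to this subset sum problem, namely that we are looking for the minimal number of bundles. For a bundle set like the sample one, this allows us to work from the biggest to the smallest bundle that fits under the target size (just like handing back cash change: you start from the largest note and work your way down to the smallest coin). Note that this greedy approach is not guaranteed to find the minimum, or even any exact combination, for arbitrary bundle sizes.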
Sample data
Added an extra bundle with size 2000 for demonstration purposes.
create table bundles
(
sizeGB int
);
insert into bundles (sizeGB) values
(50),
(100),
(250),
(1000),
(2000);
Solution
Set target size @targetGB = 1550 to have an example with repeated bundles
and excluding bundles that are too big.
Define an initial value @sumGB = 0 to increment as we go.
Select the biggest bundle @sumPartGB that we can add to @sumGB
and stay within the @targetGB size limit.
Store that part of the sum in a result table @result.
Increment @sumGB with the selected bundle.
Repeat as long as @sumGB < @targetGB.
In code:
declare @targetGB int = 1550;
declare @sumGB int = 0;
declare @sumPartGB int;
declare @result table
(
sizeGB int
);
while @sumGB < @targetGB
begin
-- Reset first: a select that returns no rows leaves the variable unchanged.
set @sumPartGB = null;
select top 1 @sumPartGB = b.sizeGB
from bundles b
where b.sizeGB + @sumGB <= @targetGB
order by b.sizeGB desc;
-- Stop when no bundle fits in the remainder.
if @sumPartGB is null break;
insert into @result (sizeGB) values (@sumPartGB);
set @sumGB += @sumPartGB;
end
select r.sizeGB as sumParts
from @result r
order by r.sizeGB desc;
Result
sumParts
--------
1000
250
250
50
Calling this algorithm could be done through a table-valued function (= function that returns a table). How you store or wrap this algorithm ultimately depends on your application.
Define function
create function getBundles(@targetGB int)
returns @result table (sizeGB int)
as
begin
declare @sumGB int = 0;
declare @sumPartGB int;
while @sumGB < @targetGB
begin
-- Reset first: a select that returns no rows leaves the variable unchanged.
set @sumPartGB = null;
select top 1 @sumPartGB = b.sizeGB
from bundles b
where b.sizeGB + @sumGB <= @targetGB
order by b.sizeGB desc;
-- Stop when no bundle fits in the remainder.
if @sumPartGB is null break;
insert into @result (sizeGB) values (@sumPartGB);
set @sumGB += @sumPartGB;
end
return;
end;
Call function
select r.sizeGB as sumParts
from getBundles(1550) r
order by r.sizeGB desc;
Fiddle to see everything in action.
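The greedy loop above can also be sketched outside the database. The Python version below is only an illustration and not part of the original answer: the function name pick_bundles is hypothetical, and it assumes a bundle set (like the sample one) where taking the largest fitting bundle first gives the fewest bundles. It returns None when the target cannot be reached exactly this way.

```python
def pick_bundles(target_gb, sizes):
    """Greedy selection: repeatedly take the largest bundle that still fits."""
    remaining = target_gb
    picked = []
    for size in sorted(sizes, reverse=True):
        # Take as many copies of this size as fit in what is left.
        count, remaining = divmod(remaining, size)
        picked.extend([size] * count)
    # A non-zero remainder means no exact combination was found greedily.
    return picked if remaining == 0 else None


print(pick_bundles(1550, [50, 100, 250, 1000, 2000]))
```

For 1550 this yields the same parts as the SQL result above (1000, 250, 250, 50), while a target such as 1275 yields None instead of a wrong answer.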

Related

How to process SQL string char-by-char to build a match weight?

The problem: I need to display fields for user entry on a form, dynamic to some lookup criteria.
My current solution: I've created a SQL table with some field entry criteria, based on a relatively simple matching criteria. The match criteria basically is such that Lookup Value starts with Match Code, and the most precise match is found by doing a LEN comparison.
select
f.[IS_REQUIRED]
, f.[MASK]
, f.[MAX_LENGTH]
, f.[MIN_LENGTH]
, f.[RESOURCE_KEY]
, f.[SEQUENCE]
from [dbo].[MY_RECORD] r with(nolock)
inner join [dbo].[ENTRY_FORMAT] f with(nolock)
on r.[LOOKUP_VALUE] like f.[MATCH_CODE]
-- Logic to filter by single, most-precise record match.
cross apply (
select f1.[SEQUENCE]
from [dbo].[ENTRY_FORMAT] f1 with(nolock)
where f.[SEQUENCE] = f1.[SEQUENCE]
and r.[LOOKUP_VALUE] like f1.[MATCH_CODE]
group by f1.[SEQUENCE]
having len(f.[MATCH_CODE]) = max(len(f1.[MATCH_CODE]))
) tFilter
where r.[ID] = @RecordId
The current issue with this is that the most precise match has to be calculated on each and every call, against each and every match. Additionally, I'm currently only able to support % in the MATCH_CODE. (E.g., '%' is the default for all LOOKUP_VALUE, while an entry of '12%' would be the more precise match for a LOOKUP_VALUE of '12345', and a MATCH_CODE of '12345' should obviously be the most precise match.) However, I would like to add support for [4-7], etc. wildcards. Going just off of LEN, this would definitely be wrong, because '[4-7]' adds a lot to the length, but, for example, '12345' is still the desired match over '123[4-7]'.
My desired update: To add a MATCH_WEIGHT column to ENTRY_FORMAT, which I can update via a trigger on insert/update. For my initial implementation, I'm just looking for something that can go through MATCH_CODE, character by character, increasing MATCH_WEIGHT, but treating [..] as just a single character when doing so. Is there a good mechanism (UDF - either SQL or CLR? CURSOR?) for iterating through characters of a varchar field to calculate a value in this way? Something like increasing MATCH_WEIGHT by two per non-wildcard, and perhaps by one on a wildcard; with details to be further thought out and worked out...
The goal being to use a query more like:
select
f.[IS_REQUIRED]
, f.[MASK]
, f.[MAX_LENGTH]
, f.[MIN_LENGTH]
, f.[RESOURCE_KEY]
, f.[SEQUENCE]
from [dbo].[MY_RECORD] r with(nolock)
-- Logic to filter by single, most-precise record match.
cross apply (
select top 1
f1.[MATCH_CODE]
, f1.[SEQUENCE]
from [dbo].[ENTRY_FORMAT] f1 with(nolock)
where r.[LOOKUP_VALUE] like f1.[MATCH_CODE]
group by f1.[SEQUENCE]
order by f1.[MATCH_WEIGHT] desc
) tFilter
inner join [dbo].[ENTRY_FORMAT] f with(nolock)
on f.[MATCH_CODE] = tFilter.[MATCH_CODE]
and f.[SEQUENCE] = tFilter.[SEQUENCE]
where r.[ID] = @RecordId
Note: I realize this is a relatively fragile setup. The ENTRY_FORMAT records are only entered by developers, who are aware of the restrictions, so for now assume that valid data is entered, and which does not cause match collisions.
With some help, I've come up with one implementation (answer below), but am still unsure as to my total design, so welcoming better answers or any criticism.
From Steve's answer on another question, I've used much of the body to create a function to accomplish support for the [..] wildcard at the end of a match code.
CREATE FUNCTION CalculateMatchWeight
(
-- Add the parameters for the function here
@MatchCode varchar(100)
)
RETURNS smallint
AS
BEGIN
-- Declare the return variable here
DECLARE @Result smallint = 0;
-- Add the T-SQL statements to compute the return value here
DECLARE @Pos int = 1, @N0 int = ascii('0'), @N9 int = ascii('9'), @AA int = ascii('A'), @AZ int = ascii('Z'), @Wild int = ascii('%'), @Range int = ascii('[');
DECLARE @Asc int;
DECLARE @WorkingString varchar(100) = upper(@MatchCode)
WHILE @Pos <= LEN(@WorkingString)
BEGIN
SET @Asc = ascii(substring(@WorkingString, @Pos, 1));
If ((@Asc between @N0 and @N9) or (@Asc between @AA and @AZ))
SET @Result = @Result + 2;
ELSE
BEGIN
-- Check wildcard matching, update value according to match strength, and stop calculating further.
-- TODO: In the future we may wish to have match codes with wildcards not just at the end; try to figure out a mechanism to calculating that case.
IF (@Asc = @Range)
BEGIN
SET @Result = @Result + 2;
SET @Pos = 100;
END
IF (@Asc = @Wild)
BEGIN
SET @Result = @Result + 1;
SET @Pos = 100;
END
END
SET @Pos = @Pos + 1
END
-- Return the result of the function
RETURN @Result
END
I've checked that this can generate desired output for the current cases that I'm trying to cover:
SELECT [dbo].[CalculateMatchWeight] ('12345'); -- Most precise (10)
SELECT [dbo].[CalculateMatchWeight] ('123[4-5]'); -- Middle (8)
SELECT [dbo].[CalculateMatchWeight] ('123%'); -- Least (7)
Now I can call this function in a trigger on INSERT/UPDATE to update the MATCH_WEIGHT:
CREATE TRIGGER TRG_ENTRY_FORMAT_CalcMatchWeight
ON ENTRY_FORMAT
AFTER INSERT,UPDATE
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Set-based update so that multi-row inserts/updates are handled too.
-- The IS NULL check covers new rows whose MATCH_WEIGHT has not been set yet.
UPDATE ef
SET MATCH_WEIGHT = dbo.CalculateMatchWeight(i.MATCH_CODE)
FROM ENTRY_FORMAT ef
INNER JOIN inserted i
ON ef.[MATCH_CODE] = i.[MATCH_CODE]
AND ef.[SEQUENCE] = i.[SEQUENCE]
WHERE ef.MATCH_WEIGHT IS NULL
OR ef.MATCH_WEIGHT <> dbo.CalculateMatchWeight(i.MATCH_CODE)
END
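The weighting rule can also be checked outside SQL. The Python sketch below is an illustration only: the name match_weight is hypothetical, and unlike the T-SQL function it does not stop at the first wildcard but treats a whole [..] expression as one character wherever it appears, which is one possible way to approach the TODO. It keeps the same scores: 2 per letter or digit, 2 per bracketed range, 1 per % wildcard.

```python
def match_weight(match_code):
    """Score a LIKE pattern: 2 per literal or [..] range, 1 per % wildcard."""
    weight = 0
    pos = 0
    code = match_code.upper()
    while pos < len(code):
        ch = code[pos]
        if ch.isascii() and ch.isalnum():
            weight += 2
        elif ch == "[":
            # Treat the whole bracket expression as one character.
            close = code.find("]", pos)
            if close == -1:
                raise ValueError("unterminated [ in match code")
            weight += 2
            pos = close
        elif ch == "%":
            weight += 1
        pos += 1
    return weight


for code in ("12345", "123[4-5]", "123%"):
    print(code, match_weight(code))
```

For the three test codes this gives the same weights as the T-SQL function (10, 8 and 7).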

SQL Server WHILE

I am trying to generate a unique card number with the following function. I put my query inside a WHILE loop to prevent duplicate card numbers, but I am still getting duplicates.
Can anyone help me?
Create FUNCTION GetCardNumber ()
RETURNS varchar(20)
AS
BEGIN
Declare @NewID varchar(20);
Declare @NewID1 varchar(36) ;
Declare @Counter int = 0;
While(1=1)
Begin
Set @NewID1 = (SELECT [MyNewId] FROM Get_NewID);
Set @NewID = '2662464' + '823' + '001' +right(YEAR(GETUTCDATE()),2) +(left(convert(varchar,ABS(CAST(CAST(@NewID1 AS VARBINARY(5)) AS bigint))),5));
Set @Counter = (Select count(*) from ContactTBL where ContactMembershipID = @NewID);
If @Counter = 0
BEGIN
BREAK;
END
End
return @NewID
END
Go
Update : I am getting MyNewID from View:
CREATE VIEW Get_NewID
AS
SELECT NEWID() AS MyNewID
GO
Many thanks in advance.
Won't this just return the same value every time you run it? I can't see anywhere where you're incrementing anything, or getting any kind of value that would give you unique values each time. You need to do something that changes the value each time, for example using the current exact date and time.
You're returning varchar(20) in line 2. To get your 'unique' NewId, you're doing this:
Set @NewId = (13 digit constant value) + (last 2 digits of current year) +
left(
convert(varchar,
ABS(CAST
(CAST(@NewID1 AS VARBINARY(5)) AS bigint)
)
)
,5)
which leaves you only 5 characters of uniqueness! This is almost certainly the issue. An easy fix may be to keep more of those digits and increase the size you return on line 2 accordingly, e.g. RETURNS varchar(30)
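To get a feel for how little room 5 digits leave, the following Python sketch estimates the chance of at least one duplicate. It is an illustration under a simplifying assumption that the 5-digit suffixes are drawn uniformly from 100000 possibilities (the real suffixes are not exactly uniform); the function name collision_probability is hypothetical.

```python
def collision_probability(draws, space_size):
    """Chance that at least two of `draws` uniform picks coincide."""
    p_all_distinct = 1.0
    for k in range(draws):
        p_all_distinct *= (space_size - k) / space_size
    return 1.0 - p_all_distinct


# Five decimal digits give at most 100000 distinct suffixes.
for n in (100, 373, 1000):
    print(n, round(collision_probability(n, 100_000), 3))
```

Under that assumption a duplicate becomes about as likely as not after only a few hundred generated numbers, which is why an existence check that cannot yet see the other new rows is not enough.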
What you're doing is unnecessarily complicated, and I think there is an element of overprotecting against potential duplicate values. This line is very suspect:
Set @NewID = '2662464' + '823' + '001' +right(YEAR(GETUTCDATE()),2) +(left(convert(varchar,ABS(CAST(CAST(@NewID1 AS VARBINARY(5)) AS bigint))),5));
The maximum for bigint is 2^63-1, whereas a 5-byte VARBINARY holds at most 2^40-1, so the cast to bigint itself cannot overflow; the real limitation is that only the first 5 digits of the result are kept.
I'm not sure exactly what you're trying to achieve, but you need to simplify things and make sure you have more scope for unique values!
Set @NewID1 = (SELECT [MyNewId] FROM Get_NewID);
always returns the same result (if nothing else changes)
Set @NewID = '2662464' + '823' + '001' +right(YEAR(GETUTCDATE()),2) +(left(convert(varchar,ABS(CAST(CAST(@NewID1 AS VARBINARY(5)) AS bigint))),5));
as a result, @NewID will be the same as well

**Occasional** Arithmetic overflow error converting expression to data type int

I'm running an update script to obfuscate data and am occasionally experiencing the arithmetic overflow error message, as in the title. The table being updated has 260k records and yet the update script will need to be run several times to produce the error. Although it's so rare I can't rely on the code until it's fixed as it's a pain to debug.
Looking at other similar questions, this is often resolved by changing the data type e.g from INT to BIGINT either in the table or in a calculation. However, I can't see where this could be required. I've reduced the script to the below as I've managed to pin point it to the update of one column.
A function is being called by the update and I've included this below. I suspect that, due to the randomness of the error, the use of the NEWID function could be causing it, but I haven't been able to re-create the error when just running this part of the function multiple times. NEWID can't be used directly inside a function, so it's being called from a view, also included below.
Update script:
UPDATE dbo.Addresses
SET HouseNumber = CASE WHEN LEN(HouseNumber) > 0
THEN dbo.fn_GenerateRandomString (LEN(HouseNumber), 1, 1, 1)
ELSE HouseNumber
END
NEW_ID view and random string function
CREATE VIEW dbo.vw_GetNewID
AS
SELECT NEWID() AS New_ID
GO
CREATE FUNCTION dbo.fn_GenerateRandomString (
@stringLength int,
@upperCaseBit bit,
@lowerCaseBit bit,
@numberBit bit
)
RETURNS nvarchar(100)
AS
BEGIN
-- Sanitise string length values.
IF ISNULL(@stringLength, -1) < 0
SET @stringLength = 0
-- Generate a random string from the specified character sets.
DECLARE @string nvarchar(100) = ''
SELECT
@string += c2
FROM
(
SELECT TOP (@stringLength) c2 FROM (
SELECT c1 FROM
(
VALUES ('A'),('B'),('C')
) AS T1(c1)
WHERE @upperCaseBit = 1
UNION ALL
SELECT c1 FROM
(
VALUES ('a'),('b'),('c')
) AS T1(c1)
WHERE @lowerCaseBit = 1
UNION ALL
SELECT c1 FROM
(
VALUES ('0'),('1'),('2'),('3'),('4'),('5'),('6'),('7'),('8'),('9')
) AS T1(c1)
WHERE @numberBit = 1
)
AS T2(c2)
ORDER BY (SELECT ABS(CHECKSUM(New_ID)) from vw_GetNewID)
) AS T2
RETURN @string
END
Addresses table (for testing):
CREATE TABLE dbo.Addresses(HouseNumber nchar(32) NULL)
INSERT Addresses(HouseNumber)
VALUES ('DSjkmf jkghjsh35hjk h2jkhj3h jhf'),
('SDjfksj3548 ksjk'),
(NULL),
(''),
('2a'),
('1234567890'),
('An2b')
Note: only 7k of the rows in the addresses table have a value entered i.e. LEN(HouseNumber) > 0.
An arithmetic overflow in what is otherwise string-based code is confounding. But there is one thing that could be causing the arithmetic overflow. That is your ORDER BY clause:
ORDER BY (SELECT ABS(CHECKSUM(New_ID)) from vw_GetNewID)
CHECKSUM() returns an integer, whose range is -2,147,483,648 to 2,147,483,647. Note the absolute value of the smallest number is 2,147,483,648, and that is just outside the range. You can verify that SELECT ABS(CAST('-2147483648' as int)) generates the arithmetic overflow error.
You don't need the checksum(). Alas, you do need the view because this logic is in a function and NEWID() is side-effecting. But, you can use:
ORDER BY (SELECT New_ID from vw_GetNewID)
I suspect that the reason you are seeing this every million or so rows rather than every 4 billion rows or so is because the ORDER BY value is being evaluated multiple times for each row as part of the sorting process. Eventually, it is going to hit the lower limit.
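The asymmetry of the int range can be illustrated with a small Python sketch. Python integers do not overflow, so the 32-bit limit is checked explicitly here; the function name abs_int32 is hypothetical and only mimics the behaviour described above.

```python
INT_MIN = -2**31
INT_MAX = 2**31 - 1


def abs_int32(value):
    """ABS over a 32-bit signed range; raises where the result cannot fit."""
    if not INT_MIN <= value <= INT_MAX:
        raise ValueError("input is not a 32-bit signed integer")
    result = abs(value)
    if result > INT_MAX:
        raise OverflowError("arithmetic overflow converting to int")
    return result


print(abs_int32(-2147483647))  # fits
try:
    abs_int32(-2147483648)
except OverflowError as exc:
    print("overflow:", exc)
```

Exactly one input out of the 2^32 possible values triggers the error, which fits an error that shows up only occasionally.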
EDIT:
If you care about efficiency, it is probably faster to do this using string operations rather than tables. I might suggest this version of the function:
CREATE VIEW vw_rand AS SELECT rand() as rand;
GO
CREATE FUNCTION dbo.fn_GenerateRandomString (
@stringLength int,
@upperCaseBit bit,
@lowerCaseBit bit,
@numberBit bit
)
RETURNS nvarchar(100)
AS
BEGIN
DECLARE @string NVARCHAR(255) = '';
-- Sanitise string length values.
IF ISNULL(@stringLength, -1) < 0
SET @stringLength = 0;
DECLARE @lets VARCHAR(255) = '';
IF (@upperCaseBit = 1) SET @lets = @lets + 'ABC';
IF (@lowerCaseBit = 1) SET @lets = @lets + 'abc';
IF (@numberBit = 1) SET @lets = @lets + '0123456789';
DECLARE @len int = len(@lets);
WHILE @stringLength > 0 BEGIN
SELECT @string += SUBSTRING(@lets, 1 + CAST(rand * @len as INT), 1)
FROM vw_rand;
SET @stringLength = @stringLength - 1;
END;
RETURN @string
END;
As a note: rand() is documented as being exclusive of the end of its range, so you don't have to worry about it returning exactly 1.
Also, this version is subtly different from your version because it can pull the same letter more than once (and as a consequence can also handle longer strings). I think this is actually a benefit.

Unique 6 digit number but not sequential for Customer ID (SQL)

Our Customer table has an int Identity column for ID. This was going to be given out to customers, so when they phone they could just give their ID.
It is now obvious that our competitors would easily be able to register twice on our site, say a month apart and find out exactly how many people have registered.
Therefore, is there a nice simple way to create a "Customer ID" (in SQL or c#) which we could give to customers that is:
(a) 6 digits long
(b) is unique
(c) is not sequential
Thanks in advance
If you choose any increment that shares no common factor with 1000000 (that is, one not divisible by 2 or 5), then you could take the last 6 digits of that number to provide the ID; i.e. (IDENTITY (1,7)) % 1000000.
But your competitors could still find the increment by a few sequential registrations, so this would not completely solve the issue.
So it would seem you want a number that is completely random - so for that, you'll have to check whether it already exists when you generate it, or pre-generate a list of numbers, sort them randomly, and pick the next when creating a new customer.
Another option to consider is some form of encryption, if you can find or create an appropriate algorithm that creates a short enough output.
If you take the large increment route (with an increment that shares no common factor with the modulus), you could then subsequently re-arrange the order of the digits to create a more random-looking number, e.g.:
declare @inc int , @loop int
declare @t table (i int, cn int, code varchar(4))
select @inc = 5173, @loop = 1
while @loop<=10000
begin
insert @t (i, cn)
select @loop, (@inc*@loop)%10000
select @loop = @loop + 1
end
update @t
set code = substring(convert(varchar(4),cn),2,1)
+ substring(convert(varchar(4),cn),4,1)
+ substring(convert(varchar(4),cn),3,1)
+ substring(convert(varchar(4),cn),1,1)
select code, count(*) from @t group by code having count(*)>1
select top 20 * from @t order by i
Depending on the number you choose, some sequential items will have the same difference between them, but this number will vary. So it's not cryptographically secure, but probably enough to thwart all but the most determined of competitors.
You could convert the above to a function to run off a standard IDENTITY(1,1) id field
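Whether an increment produces unique values depends on it having no common factor with the modulus. The Python sketch below checks this by brute force for the modulus 10000 used in the example; the function name distinct_ids is hypothetical and the extra increments 6 and 25 are chosen only for contrast.

```python
from math import gcd


def distinct_ids(increment, modulus):
    """Count distinct values of (increment * i) % modulus for i = 1..modulus."""
    return len({(increment * i) % modulus for i in range(1, modulus + 1)})


for inc in (5173, 6, 25):
    print(inc, gcd(inc, 10_000), distinct_ids(inc, 10_000))
```

With 5173 (greatest common divisor 1) all 10000 values are distinct, while 6 and 25 repeat after only part of the range has been used, even though 6 is not a factor of 10000.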
Maybe this is insane, but here is my way of generating the Customer Numbers up front.
This will generate however many UNIQUE keys you want very quickly.
You could obviously save these into a real table.
Here is a SQLFiddle of the below: http://www.sqlfiddle.com/#!3/d41d8/3884
DECLARE @tbl TABLE
(
ID INT IDENTITY(1,1),
CustNo INT UNIQUE
)
DECLARE @Upper INT
DECLARE @Lower INT
DECLARE @NumberRequired INT
SET @Lower = 100000 ---- The lowest random number allowed
SET @Upper = 999999 ---- The highest random number allowed
SET @NumberRequired = 1000 -- How many IDs do we want?
WHILE (SELECT COUNT(*) FROM @tbl) < @NumberRequired
BEGIN
BEGIN TRY
INSERT INTO @tbl SELECT (ROUND(((@Upper - @Lower -1) * RAND() + @Lower), 0))
END TRY
BEGIN CATCH
-- If it goes wrong go round the loop again
END CATCH
END
SELECT *
FROM @tbl
EDIT: Actually this is probably faster. It generates all 900000 possible keys in around 30 seconds on my dev machine, which is okay for a one-off job.
DECLARE @tbl TABLE
(
ID INT
)
DECLARE @Upper INT
DECLARE @Lower INT
DECLARE @i INT;
SET @Lower = 100000 ---- The lowest random number allowed
SET @Upper = 999999 ---- The highest random number allowed
SET @i = @Lower
WHILE @i <= @Upper
BEGIN
INSERT INTO @tbl SELECT @i
SET @i = @i + 1
END
SELECT ID
FROM @tbl ORDER BY NEWID()
You can add a calculated column that is derived from the Identity column and produces the unique value you expect.
For example, a calculated column like the one below:
100000 + Identity_Column * 7 + 3
What if you just use the user registration timestamp? It doesn't reveal the user count and it is unique (as long as you don't register users every second, for example). For instance, if you use 10000 in this query you can register users each minute and get a unique 9-digit number:
select cast(cast(current_timestamp as float)*10000 as int)
You could make a table with 2 columns, one with the values 100000 to 999999 and one with a marker showing whether the number has been given out. When creating a new client, assign an unassigned number from this table at random and mark it as assigned.
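The pool idea can be sketched in a few lines of Python. This is an in-memory illustration only, with a hypothetical class name IdPool; in practice the pool would live in the table described above and the assignment would need to be done inside a transaction.

```python
import random


class IdPool:
    """Hands out each number in [lower, upper] at most once, in random order."""

    def __init__(self, lower=100_000, upper=999_999, seed=None):
        self._free = list(range(lower, upper + 1))
        random.Random(seed).shuffle(self._free)

    def assign(self):
        # Popping from a pre-shuffled list is equivalent to a random pick.
        if not self._free:
            raise RuntimeError("no unassigned customer IDs left")
        return self._free.pop()


pool = IdPool(seed=1)
print([pool.assign() for _ in range(5)])
```

Because every number is removed from the pool when it is handed out, uniqueness is guaranteed without any retry loop.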

Selecting only numbers from a string

We have a program that can pull information from a database that we use for shipping. The way it works is it uses an ODBC driver to pull from our database, so that when we type in "order number 5" into the shipping program it will also pull the matching address, phone number, etc.
The problem is that the database contains only numbers for the orders, however the program that contains the database which we use for inventory management prints our labels with the order number in the format TK123456. I need to figure out how to make SQL interpret the order number as just numbers when inputted, so basically cut the TK off the start.
SELECT RXFILL.RXFILL_ID, RXMAIN.RX_NUMBER, PATIENT.FIRSTNAME, PATIENT.LASTNAME,
SHIPADDRESS1, SHIPADDRESS2, SHIPCITY, SHIPSTATE, SHIPZIP, EMAIL
FROM RXFILL
LEFT JOIN RXMAIN ON RXFILL.RXMAIN_ID = RXMAIN.RXMAIN_ID
LEFT JOIN PATIENT ON RXMAIN.PATIENT_ID = PATIENT.PATIENT_ID
WHERE RXFILL_ID=$ORDERNUMBER
If I am understanding it correctly the $ORDERNUMBER is what needs to be adjusted to not include letters. However the program does specify the final line must be in the format WHERE [field name]=$ORDERNUMBER.
How can this be done?
If you only want this to be solved in SQL and not in the calling application, and you know that the first two characters of $ORDERNUMBER will always be 'TK', then you can easily solve it by taking a substring of $ORDERNUMBER starting at the third character, i.e.
WHERE RXFILL_ID=SUBSTRING($ORDERNUMBER, 3)
That syntax might not be exact, since you haven't divulged your DBMS type and each DBMS implements SUBSTRING in whatever way they want.
If you share more info about the calling application which sets $ORDERNUMBER, I'm sure it would be better to make the change there.
I have written a function for this.
CREATE Function SelectNumbersFromString(@str varchar(max))
Returns varchar(max) as
BEGIN
Declare @cchk char(5);
Declare @len int ;
Declare @aschr int;
SET @len = ( SElect len(@str) );
Declare @count int
SET @count = 1
DECLARE @ans varchar(max)
SET @ans = ''
While @count <= @len
BEGIN
SET @cchk = ( select Substring(@str,@count,1) );
SET @aschr = ( select ASCII(@cchk) );
-- ASCII codes 48 to 57 are the digits '0' through '9'.
IF @aschr between 48 and 57
BEGIN
SET @ans = @ans + CHAR(@aschr)
END
SET @count = @count + 1;
END
RETURN @ans;
END
TESTED
SELECT dbo.SelectNumbersFromString('abc3deef5ff6') will return 356
From http://wfjanjua.blogspot.com/2012/07/add-numbers-from-stringvarchar-in-tsql.html
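If the change can be made in the calling application instead, the same filtering is a one-liner. The Python sketch below is an illustration with a hypothetical function name digits_only; it keeps every digit 0 through 9 and drops everything else, so it also works when the prefix is not exactly 'TK'.

```python
def digits_only(text):
    """Keep only the ASCII digits 0-9, in their original order."""
    return "".join(ch for ch in text if ch in "0123456789")


print(digits_only("TK123456"))
print(digits_only("abc3deef5ff6"))
```

The result is still a string, so it would need converting to a number if RXFILL_ID is a numeric column.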