Convert a letter into a number - SQL

I am building the back end of a web application which processes a significant amount of data, and the front-end developers are looking for a stable integer code to use when joining data.
The current integer values they are trying to use are surrogate keys, which will change going forward, leading to a number of problems.
Each table has an alphanumeric code, and I am looking for a way to convert this into a stable int.
E.g. convert a code 'AAAA' into 1111, or 'MMMM' into 13131313.
Could anyone tell me if this is at all possible?
Thanks,

McNets' comment seems to be a very good approach...
If you can be sure that you have
- plain ASCII characters
- not more than 4 letters
you might cast the string to VARBINARY(4) and cast this to INT:
DECLARE @dummy TABLE(StrangeCode VARCHAR(10));
INSERT INTO @dummy VALUES
('AAAA'),('MMMM'),('ACAC'),('CDEF'),('ABCD');
SELECT CAST(CAST(StrangeCode AS VARBINARY(4)) AS INT)
FROM @dummy;
The result:
1094795585
1296911693
1094926659
1128547654
1094861636
If you need bigger numbers, you might go up to BIGINT (cast to VARBINARY(8) first).
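For what it's worth, the INT values above are simply the four ASCII bytes of the code read as a big-endian integer; a quick Python sketch (illustrative only, not part of the answer) confirms the figures:

```python
# Interpret a short ASCII code as a big-endian integer, mirroring
# CAST(CAST(code AS VARBINARY(4)) AS INT) in the answer above.
def code_to_int(code: str) -> int:
    raw = code.encode("ascii")
    assert len(raw) <= 4, "more than 4 letters needs VARBINARY(8)/BIGINT"
    return int.from_bytes(raw, byteorder="big")

print(code_to_int("AAAA"))  # 1094795585
print(code_to_int("MMMM"))  # 1296911693
```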

One way is using a recursive CTE like this:
;with tt(i, c1, c2) as (
select 1, c, replace(c, char(65), '1')
from yourTable
union all
select i + 1, c1, replace(c2, char(65 + i), cast(i + 1 as varchar(2)))
from tt
where i < 26
)
select c1, cast(c2 as bigint) num
from tt
where i = 26;
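The chain of REPLACEs maps each letter A-Z to its 1-based alphabet position and concatenates the digits, which is exactly the 'AAAA' → 1111 encoding the question asks for. A Python sketch of the same mapping (illustrative only; note the encoding is not unique, e.g. 'K' and 'AA' both yield 11):

```python
# Map each letter to its 1-based alphabet position and concatenate,
# as the chain of REPLACEs in the recursive CTE does.
def letters_to_num(code: str) -> int:
    return int("".join(str(ord(ch) - ord("A") + 1) for ch in code.upper()))

print(letters_to_num("AAAA"))  # 1111
print(letters_to_num("MMMM"))  # 13131313
```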

As McNets suggests, create a second table:
create table IntCodes (id INT IDENTITY(1,1) PRIMARY KEY, UserCode VARCHAR(50) NOT NULL);
insert into IntCodes (UserCode)
select distinct UserCode
from MyTable;
You'll need a trigger:
create trigger Trg_UserCode
on MyTable
after insert as
begin
insert into IntCodes (UserCode)
select i1.UserCode
from INSERTED i1
where i1.UserCode not in (select ic.Usercode from IntCodes ic)
end
Now, as part of the query:
select t1.*, t2.id as IntCode
from MyTable t1
inner join IntCodes t2
on t1.UserCode = t2.UserCode
This means that you won't need to worry about updating the existing columns.

Related

How do I join on a column that contains a string that I'm trying to search through using a substring?

I'm trying to join a table onto another table. The gimmick here is that the column from the table contains a long string. Something like this:
PageNumber-190-ChapterTitle-HelloThere
PageNumber-19-ChapterTitle-NotToday
I have another table that has a list of page numbers and whether or not I want to keep those pages, for example:
Page Number | Keep Flag
190         | Y
19          | N
I want to be able to return a query that contains the long string but only if the page number exists somewhere in the string. The problem I have is that, when using a LIKE statement to join:
JOIN t2 ON t1.string LIKE '%' + t2.page_number + '%' WHERE keep_flag = 'Y'
It will still return both results for whatever reason. The "Keep Flag" column in the results will change to "Y" for page 19 even though it shouldn't be in the results.
I obviously don't think LIKE is the best way to JOIN, given that '190' is LIKE '%19%'. What else can I do here?
If Page_number is a number, you must cast it as varchar so that the types fit.
You can read more about CAST and CONVERT at https://learn.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql?view=sql-server-ver16
CREATE TABLE tab1
([str] varchar(38))
;
INSERT INTO tab1
([str])
VALUES
('PageNumber-190-ChapterTitle-HelloThere'),
('PageNumber-19-ChapterTitle-NotToday')
;
CREATE TABLE tab2
([Page Number] int, [Keep Flag] varchar(1))
;
INSERT INTO tab2
([Page Number], [Keep Flag])
VALUES
(190, 'Y'),
(19, 'N')
;
SELECT str
FROM tab1 JOIN tab2 ON tab1.str LIKE '%' + CAST(tab2.[Page Number] AS varchar(10)) + '%' AND tab2.[Keep Flag] = 'Y'
str
PageNumber-190-ChapterTitle-HelloThere
Please try the following solution. It does the JOIN on an exact match.
-- DDL and sample data population, start
DECLARE @tbl1 TABLE (ID INT IDENTITY PRIMARY KEY, tokens VARCHAR(100));
INSERT @tbl1 (tokens) VALUES
('PageNumber-190-ChapterTitle-HelloThere'),
('PageNumber-19-ChapterTitle-NotToday');
DECLARE @tbl2 TABLE (ID INT IDENTITY PRIMARY KEY, Page_Number INT, Keep_Flag CHAR(1));
INSERT @tbl2 (Page_Number, Keep_Flag) VALUES
(190, 'Y'),
(19 , 'N');
-- DDL and sample data population, end
SELECT *
FROM @tbl1 AS t1 INNER JOIN @tbl2 AS t2
ON PARSENAME(REPLACE(t1.tokens,'-','.'),3) = t2.Page_Number
WHERE t2.Keep_Flag = 'Y';
Output:
ID | tokens                                 | ID | Page_Number | Keep_Flag
1  | PageNumber-190-ChapterTitle-HelloThere | 1  | 190         | Y
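The PARSENAME trick works because REPLACE turns the string into a four-part dotted name and PARSENAME(..., 3) picks the third part counting from the right, i.e. the page number. The same extraction, sketched in Python for illustration:

```python
# Extract the page number: third part counting from the right,
# mirroring PARSENAME(REPLACE(tokens, '-', '.'), 3).
def page_number(tokens: str) -> str:
    parts = tokens.split("-")
    return parts[-3]

print(page_number("PageNumber-190-ChapterTitle-HelloThere"))  # 190
```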

STRING_SPLIT skyrockets execution time of sql query

For a SQL query I need to split an input string into its integer components and select values from a table according to the provided integers.
My problem: if there are a lot of integers (>300), the query gets slower and slower; for around 600 integers it takes more than a minute!
Here a small example of the executed query:
DECLARE @inputStr VARCHAR(MAX) = '234,2344,12,523,5667,9825,345';
SELECT
surname,
firstname
FROM Addresses
WHERE id IN (SELECT CAST(value AS INTEGER) FROM STRING_SPLIT(@inputStr, ','))
Is there a known problem with this, or any improvements I could make?
I'm grateful for any help!
The problem is that any explicit list of values in an IN operator is translated into multiple ORs in the WHERE clause by the algebrizer, before the query is optimized...
A great number of values in the IN operator will always cause poor performance, whichever way you do it!
By creating a temporary table, you will get another query execution plan that will boost your performance.
So try this way:
SELECT DISTINCT CAST(value AS INTEGER) AS [value]
INTO #T
FROM STRING_SPLIT(@inputStr, ',');
SELECT surname,
firstname
FROM Addresses
WHERE id IN (SELECT value
FROM #T);
Optionally you can add a UNIQUE index to the temp table to increase performance:
SELECT DISTINCT CAST([value] AS INTEGER) AS [value]
INTO #T
FROM STRING_SPLIT(@inputStr, ',');
CREATE UNIQUE INDEX X123456789 ON #T ([value]);
SELECT surname,
firstname
FROM Addresses
WHERE id IN (SELECT value
FROM #T);
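The performance argument above (a long IN list expands into many ORs, while a temp table gives the optimizer a proper set to probe) can be caricatured outside SQL with a quick Python sketch; the names here are illustrative only:

```python
# Caricature: a long IN list behaves like repeated equality tests (O(n*m)),
# while a temp table with a unique index behaves like a keyed lookup (O(n)).
ids = list(range(600))                 # stand-in for Addresses.id
wanted_list = list(range(0, 1200, 2))  # stand-in for the split input string
wanted_set = set(wanted_list)          # stand-in for the indexed temp table

slow = [i for i in ids if any(i == w for w in wanted_list)]
fast = [i for i in ids if i in wanted_set]
assert slow == fast  # same rows, very different cost profiles
```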
Personally, I would move STRING_SPLIT to a JOIN, as it could well be that SQL Server is running STRING_SPLIT once for every row:
SELECT surname,
firstname
FROM dbo.Addresses A
JOIN STRING_SPLIT(@InputStr,',') SS ON A.id = SS.[value];
If that is still slow, I would suggest you are missing an index on id. Considering it is the id, then you'll likely want it to be clustered, and probably the primary key:
ALTER TABLE dbo.Addresses ADD CONSTRAINT PK_Addresses PRIMARY KEY (id) CLUSTERED;
Could you try this?
DECLARE @inputStr VARCHAR(MAX) = '234,2344,12,523,5667,9825,345';
DROP TABLE IF EXISTS #TEST;
CREATE TABLE #TEST
(
[value] INT
);
INSERT INTO #TEST ([value])
SELECT value
FROM STRING_SPLIT(@inputStr, ',');
SELECT
surname,
firstname
FROM Addresses A
INNER JOIN #TEST B
ON A.id = b.value

Function to multiple tables

I have this function, but I want to pass a table name so I can use the same function for multiple tables, e.g. table1 and table2; currently it only works for table1. I tried dynamic SQL, in vain; it doesn't pass the selected parameter.
Can someone help? Please give me guidance on how to pass a table as a parameter.
Sample data, table1
CREATE TABLE table1 (id int identity (1,1), name varchar(60))
INSERT INTO table1
VALUES ('a1, a2, a9, a8')
Sample data, table2
CREATE TABLE table2 (id int identity (1,1), name varchar(60))
INSERT INTO table2
VALUES ('a1, a2, a9, a8')
The function:
CREATE FUNCTION f_split
(@id INT)
RETURNS @ab
TABLE (name VARCHAR(20),
ab1 VARCHAR(5)
)
AS
BEGIN
DECLARE @temp TABLE (rn INT, name VARCHAR(5))
INSERT INTO @temp(rn, name)
SELECT ROW_NUMBER() OVER(ORDER BY LTRIM(RTRIM(Split.a.value('.', 'NVARCHAR(MAX)'))) ASC) rn, LTRIM(RTRIM(Split.a.value('.', 'NVARCHAR(MAX)'))) Result
FROM
(
SELECT CAST('<X>'+REPLACE([name], ',', '</X><X>')+'</X>' AS XML) AS String
FROM table1 where id = @id
) AS A
CROSS APPLY String.nodes('/X') AS Split(a)
ORDER BY 1
INSERT INTO @ab
SELECT * FROM @temp
RETURN
END
This gives the result from table1.
SELECT * FROM F_SPLIT(1)
But I want the same function to work for table2 as well.
Any help is appreciated.
Use a partitioned view, which will allow you to specify the table name as a parameter in the where clause.
Start by creating a view that unions the two tables, plus an additional column to indicate which table the row comes from.
CREATE VIEW BothTables AS
SELECT 'Table1' TableName, * FROM Table1
UNION ALL
SELECT 'Table2' TableName, * FROM Table2
Then modify your function. When you pass the table name, use it to select a subset of rows from the view. So instead of
SELECT CAST('<X>'+REPLACE([name], ',', '</X><X>')+'</X>' AS XML) AS String
FROM table1
WHERE id = @id
Use
SELECT CAST('<X>'+REPLACE([name], ',', '</X><X>')+'</X>' AS XML) AS String
FROM BothTables
WHERE TableName = @TableName
AND id = @id

What is the best way to update a table by removing only one leading zero from numeric values in a column in SQL Server database

Currently I have a column with 10 digits and a leading '0' on all the records, and I want to remove that '0' and keep 9 digits only, even if the result starts with a zero. I later want to join that table with another table on that column; the other table has 9 digits on all records, and some start with 0.
If only the first digit is a 0 and the column contains only digits, you can convert it to int at the time of joining; this will remove every 0 at the beginning:
FROM table1 inner join table2 on CAST(table1.col as int) = table2.col
You can also substring it:
FROM table1 inner join table2 on substring(table1.col, 2, 9) = table2.col
Those two options are applied at the moment of joining; however, you can also alter the table to update every value in the column by converting the column's data type to int, but that will remove every 0 from the beginning, much like the first option:
ALTER TABLE table1
ALTER COLUMN col int;
If you are using it for a join, then casting to int is probably more efficient:
FROM table1
inner join table2
on CAST(table1.col as int) = CAST(table2.col as int)
If the column is a varchar and always has 10 digits that start with a 0?
Then RIGHT might be right for it.
declare @T1 table (id1 int identity(1,1) primary key, var_10 varchar(10));
insert into @T1 (var_10) values ('0123456789');
declare @T2 table (id2 int identity(1,1) primary key, var_09 varchar(9));
insert into @T2 (var_09) values ('123456789');
select *
from @T1 t1
join @T2 t2 on t2.var_09 = right(t1.var_10, 9);
But if you're not too sure about that, then I would suggest using TRY_CONVERT (or TRY_CAST) to convert them both to an INT.
select *
from @T1 t1
join @T2 t2 on try_convert(int, t2.var_09) = try_convert(int, t1.var_10);
TRY_CONVERT has the benefit over CONVERT that the SQL won't fail if some of the data doesn't contain numbers.
For example, CONVERT(int, '123FOOBAR') would make the SQL crash,
while TRY_CONVERT(int, '123FOOBAR') would just return a NULL.
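Outside T-SQL, TRY_CONVERT's return-NULL-instead-of-failing behaviour can be sketched like this (a Python illustration, not T-SQL):

```python
# Mimic TRY_CONVERT(int, s): return None instead of raising on bad input.
def try_convert_int(s: str):
    try:
        return int(s.strip())
    except ValueError:
        return None

print(try_convert_int("0123456789"))  # 123456789
print(try_convert_int("123FOOBAR"))   # None
```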

Tally Table in SQL

I want to create a bunch of data with a tally table in SQL (SQL 2008) and definitely need help.
First of all, I have this table which contains 2 columns:
{
AcctNum (nchar(30), null),
DataInfo (nchar(745), null)
}
While I don't care about the data in the DataInfo column, I do want to add about 10k rows into the table with a unique AcctNum on each row.
The problem though is that I need to keep the length of the data in both columns. For example, the AcctNum column looks like "400000000000001 ". How do I increment the number while keeping the "blank space"?
Not sure if I make much sense here, but please let me know and I will try to explain more, thanks!
Using a recursive common table expression:
-- set up a table variable for demo purposes
declare @t table (AcctNum nchar(30) null, DataInfo nchar(745) null);
-- insert the starting value
insert @t values ('400000000000001', null);
-- run the cte to generate the sequence
with cte (acctnum, num) as (
select acctnum, cast(acctnum as bigint) + 1 num -- starting value
from @t
union all
select acctnum, num+1 from cte
where num < cast(acctnum as bigint) + 10000 -- stopping value
)
-- insert data sequence into the table
insert @t (AcctNum, DataInfo)
select num, null from cte
option (maxrecursion 10000);
select * from @t;
The table variable @t will now contain AcctNum 400000000000001 -> 400000000010001 as a contiguous sequence.
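Note that because AcctNum is nchar(30), SQL Server right-pads each inserted number with blanks to the full column width, which is what preserves the trailing "blank space" the question asks about. A quick Python sketch of that fixed-width behaviour (names are illustrative):

```python
# Generate a contiguous account-number sequence, right-padded to a
# fixed width of 30 characters, as an nchar(30) column would store it.
def make_accts(start: int, count: int, width: int = 30):
    return [str(start + i).ljust(width) for i in range(count)]

accts = make_accts(400000000000001, 3)
assert all(len(a) == 30 for a in accts)  # fixed width preserved
print(repr(accts[0]))  # '400000000000001' plus 15 trailing spaces
```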