Simulate row number using numbers table - sql

How would I simulate a row number for a table using a numbers table, WITHOUT using the ROW_NUMBER() function?
Sample table:
create table accounts
(
account_num VARCHAR(25),
primary key (account_num)
)
The numbers table has 1 million rows.

If you mean the case where ROW_NUMBER() is not available (e.g. MySQL before 8.0), try something like this:
select @rownum := @rownum + 1 rownum,
t.*
from (select * from table t order by col) t,
(select @rownum := 0) r
It'll yield the same as:
select row_number() over (order by col)
from table
order by col

A Numbers table does not help you here because you have no means to associate a value in your table with a number in the Numbers table. However, if you are asking whether it is possible to create a sequence without using ROW_NUMBER() or a variable, you can do it like so:
Select A1.Account_Num, Count( A2.Account_Num ) + 1 As Num
From Accounts As A1
Left Join Accounts As A2
On A2.Account_Num < A1.Account_Num
Group By A1.Account_Num
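The self-join above can be tried end to end with a minimal sketch. It assumes SQLite through Python's sqlite3 module purely so the example is runnable; the helper name number_rows and the sample account numbers are made up for illustration.

```python
import sqlite3

def number_rows(account_nums):
    """Number rows via a self-join and COUNT, without ROW_NUMBER()."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table accounts (account_num varchar(25) primary key)")
    conn.executemany("insert into accounts values (?)", [(a,) for a in account_nums])
    # Each row's number is 1 + the count of rows with a smaller key.
    rows = conn.execute(
        """
        select a1.account_num, count(a2.account_num) + 1 as num
        from accounts as a1
        left join accounts as a2 on a2.account_num < a1.account_num
        group by a1.account_num
        order by num
        """
    ).fetchall()
    conn.close()
    return rows

numbered = number_rows(["C300", "A100", "B200"])  # hypothetical sample data
print(numbered)  # [('A100', 1), ('B200', 2), ('C300', 3)]
```

Note that the join compares every row with every smaller row, so the work grows roughly quadratically with the table size.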

Related

Unique ID using function with every record in Insert statement

I have a statement in a stored procedure:
INSERT into table(ID, name, age)
SELECT fnGetLowestFreeID(), name, age
FROM #tempdata
The function fnGetLowestFreeID() gets the lowest free ID of the table table.
I want to insert a unique ID with every record in the table. I have tried iteration and transactions, but they don't fit the scenario.
I cannot use an identity column. I am restricted to IDs between 0 and 4, and must assign the lowest free ID using that function. If the returned ID would be greater than 4, the function returns an error. Suppose IDs 1 and 2 are already in the table. The function will return 0, which I have to assign to the first new record, then 3 to the next record, and so on, depending on the number of records in #tempdata.
Try this:
CREATE TABLE dbo.Tmp_City(Id int NOT NULL IDENTITY(1, 1),
Name varchar(50), Country varchar(50))
OR
-- SQL Server cannot add IDENTITY to an existing column, so add a new identity column instead
ALTER TABLE dbo.Tmp_City
ADD Id int NOT NULL IDENTITY(1, 1)
OR
Create a sequence and assign Sequence.NEXTVAL as the ID in the insert statement.
You can make use of a ranking function such as row_number and do something like this:
INSERT into table(ID, name, age)
SELECT row_number() over (order by id) + fnGetLowestFreeID(), name, age
FROM #tempdata
Here are three points:
1) Show the function you are using.
2) It doesn't make sense to use a function and expect it to produce unique values.
Still, you can use a ranking function:
INSERT into table(ID, name, age)
SELECT row_number() over (order by id) + fnGetLowestFreeID(), name, age
FROM #tempdata
3) Otherwise, get rid of the function and use max(id) + 1, since you don't want to use an identity column.
You could use a Numbers table to join the query doing your insert. You can google the concept for more info, but essentially you create a table (for example with the name "Numbers") with a single column "nums" of some integer type, and then you add some amount of rows, starting with 0 or 1, and going as far as you need. For example, you could end with this table content:
nums
----
0
1
2
3
4
5
6
Then you can use such a table to modify your insert so that you don't need the function anymore:
INSERT into table(ID, name, age)
SELECT t2.nums, t.name, t.age
FROM (
SELECT name, age, row_number() over (order by name) as seq
FROM #tempdata
) t
INNER JOIN (
SELECT n.nums, row_number() over (order by n.nums) as seq
FROM Numbers n
WHERE n.nums < 5 AND NOT EXISTS (
SELECT * FROM table WHERE table.ID = n.nums
)
) t2 ON t.seq = t2.seq
Now, this query leaves out one of your requirements, raising an error when no slots are available, but that is easy to fix. Beforehand, run a query that tests whether the count of records in table plus the count of records in #tempdata is higher than 5. If so, raise the error, since you know there would not be enough free slots for the records in #tempdata.
Side note: table looks like a terrible name for a table, I hope that in your real code you have a meaningful name :)
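As a rough illustration of the Numbers-table insert together with the slot check, here is a small sketch. It assumes SQLite (3.25 or later, for window functions) via Python's sqlite3 module instead of SQL Server, and the names items, tempdata, MAX_SLOTS and insert_with_free_ids are hypothetical stand-ins for the real objects.

```python
import sqlite3

MAX_SLOTS = 5  # IDs 0-4, as in the question

def insert_with_free_ids(existing_ids, new_rows):
    """Give each new (name, age) row the lowest free ID, using a numbers table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        create table items (id integer primary key, name text, age integer);
        create table numbers (nums integer primary key);
        create table tempdata (name text, age integer);
        """
    )
    conn.executemany("insert into numbers values (?)", [(n,) for n in range(MAX_SLOTS)])
    conn.executemany("insert into items values (?, 'existing', 0)", [(i,) for i in existing_ids])
    conn.executemany("insert into tempdata values (?, ?)", new_rows)

    # Check up front that there are enough free slots for the new rows.
    free = conn.execute(
        "select count(*) from numbers n "
        "where not exists (select 1 from items i where i.id = n.nums)"
    ).fetchone()[0]
    if len(new_rows) > free:
        conn.close()
        raise ValueError("not enough free IDs")

    # Pair the k-th new row with the k-th free number.
    conn.execute(
        """
        insert into items (id, name, age)
        select t2.nums, t.name, t.age
        from (select name, age, row_number() over (order by name) as seq
              from tempdata) t
        join (select n.nums, row_number() over (order by n.nums) as seq
              from numbers n
              where not exists (select 1 from items i where i.id = n.nums)) t2
          on t.seq = t2.seq
        """
    )
    rows = conn.execute("select id, name from items order by id").fetchall()
    conn.close()
    return rows

result = insert_with_free_ids([1, 2], [("ann", 30), ("bob", 41)])
print(result)  # [(0, 'ann'), (1, 'existing'), (2, 'existing'), (3, 'bob')]
```

With IDs 1 and 2 taken, the new rows receive 0 and 3, which matches the behaviour described in the question.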

PostgreSQL - How to select the first consecutive group having same value

I have a table with pk and dept columns:
pk dept
-------
27 A
29 A
30 B
31 B
33 A
I need to select the first consecutive group, that is the first successive set of rows all having the same dept value when the table is ordered by pk, i.e. the expected result is:
pk dept
-------
27 A
29 A
In my example there are 3 consecutive groups (AA, BB and A). The size of a group is unlimited (can be more than 2).
The following query should do what you want (I named your table tx):
SELECT *
FROM tx t1
WHERE NOT EXISTS (
SELECT *
FROM tx t2
WHERE t2.dept <> t1.dept
AND t2.pk < t1.pk);
The idea is to look for tuples such that no tuple with a lesser pk and a different department exists.
The first two A tuples are kept;
The B tuples are dropped because of the first two A tuples;
The last A tuple is dropped because of the B tuples.
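The NOT EXISTS idea can be checked against the sample data with a short sketch. It assumes SQLite via Python's sqlite3 module rather than PostgreSQL (the query itself is plain SQL), and the helper name first_group is made up for the example.

```python
import sqlite3

def first_group(rows):
    """Return the first consecutive run of rows sharing a dept, ordered by pk."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table tx (pk integer primary key, dept text)")
    conn.executemany("insert into tx values (?, ?)", rows)
    # Keep a row only if no earlier row (smaller pk) has a different dept.
    result = conn.execute(
        """
        select pk, dept
        from tx t1
        where not exists (
            select 1 from tx t2
            where t2.dept <> t1.dept and t2.pk < t1.pk)
        order by pk
        """
    ).fetchall()
    conn.close()
    return result

sample = [(27, "A"), (29, "A"), (30, "B"), (31, "B"), (33, "A")]
group = first_group(sample)
print(group)  # [(27, 'A'), (29, 'A')]
```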
Consider stored functions as well. Unlike window functions, this approach avoids reading the whole table:
--drop function if exists foo();
--drop table if exists t;
create table t(pk int, dep text);
insert into t values(27,'A'),(29,'A'),(30,'B'),(31,'B'),(33,'A');
create function foo() returns setof t language plpgsql as $$
declare
r t;
p t;
begin
for r in (select * from t order by pk) loop
if p is null then
p := r;
end if;
exit when p.dep is distinct from r.dep;
return next r;
end loop;
return;
end $$;
select * from foo();
It's a little complex, and the performance is probably poor, but you can achieve what you want with the code below. There are four operations:
The first operation obtains the base order and base group ids for the next operation.
The second operation does the trick of computing a unique group id for each group.
The third operation spreads the unique group id over the rows of each group.
Finally, we compute a consecutive group id for each group to allow the discretionary selection of groups, so we only have to filter by the group number we want to obtain.
Hope this helps.
SELECT fourthOperation.pk,
fourthOperation.dept
FROM (SELECT thirdOperation.pk,
thirdOperation.dept,
DENSE_RANK() OVER (ORDER BY thirdOperation.spreadedIdGroup) denseIdGroup
FROM (SELECT secondOperation.*,
NVL(idGroup, LAG(secondOperation.idGroup IGNORE NULLS) OVER (ORDER BY secondOperation.numRow)) spreadedIdGroup
FROM (SELECT firstOperation.*,
CASE WHEN LAG(firstOperation.rankRow) OVER (ORDER BY firstOperation.numRow) = firstOperation.rankRow
THEN NULL
ELSE firstOperation.numRow
END idGroup
FROM (SELECT yourTable.*,
ROW_NUMBER() OVER (ORDER BY PK) AS numRow,
DENSE_RANK() OVER (ORDER BY DEPT) AS rankRow
FROM ABORRAR yourTable) firstOperation) secondOperation ) thirdOperation) fourthOperation
WHERE fourthOperation.denseIdGroup = 1
I'm not sure if I understand your question, but for the first pk of each dept you can try this:
select min(pk) as pk,
dept
from your_table
group by dept

The number of differences in a column

I would like to compute a column showing how many letters differ between each row and the next. For instance,
if one row has the value "test" and another row has the value "testing ", then the difference between "test" and "testing " is 4 letters, so the column would hold the value 4.
I have thought about it and I don't know where to begin.
id || value || category || differences
--------------------------------------------------
1 || test || 1 || 4
2 || testing || 1 || null
11 || candy || 2 || -3
12 || ca || 2 || null
In this scenario and context there is no difference between "Test" and "rest".
I think what you are looking for is a measure of edit difference, rather than just counting prefix similarity, for which there are a few common algorithms. Levenshtein's method is one that I've used before and I've seen it implemented as TSQL functions. The answers to this SO question suggest a couple of implementations in TSQL that you might just be able to take and use as-is.
(though take time to test the code and understand the method rather than just copying the code and using it, so that you can understand the output if something seems to go wrong - otherwise you could be creating some technical debt you'll have to pay back later)
Exactly which distance calculation method you want will depend on how you want to count certain things, for instance do you count a substitution as one change or a delete and an insert, and if your strings are long enough for it to matter do you want to consider substring moves, and so forth.
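For reference, Levenshtein's method can be sketched in a few lines. This is a plain Python version of the standard dynamic-programming formulation, not one of the T-SQL implementations mentioned above; it counts a substitution as a single edit, and the function name levenshtein is just an illustrative choice.

```python
def levenshtein(a, b):
    """Minimum number of single-character inserts, deletes and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]

print(levenshtein("test", "testing "))  # 4
print(levenshtein("candy", "ca"))       # 3
print(levenshtein("Test", "rest"))      # 1
```

Unlike a simple length difference, this measure is never negative and treats "Test" and "rest" as one edit apart, so whether it fits depends on how the differences column is meant to be counted.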
I think you just want len() and lead():
select t.id, t.value, t.category,
(len(lead(t.value) over (partition by t.category order by t.id)) -
len(t.value)
) as difference
from t;
You read the next record with LEAD. Then compare the strings with LIKE or other string functions:
select
id, value, category,
case when value like next_value + '%' or next_value like value + '%'
then len(next_value) - len(value)
end as differences
from
(
select id, value, category, lead(value) over (order by id) as next_value
from mytable
) this_and_next;
If you only want to compare values within the same category use a partition clause:
lead(value) over (partition by category order by id)
UPDATE: Please see DhruvJoshi's answer on SQL Server's LEN. This function doesn't count trailing blanks, as I assumed, so you need his trick in case you want to have them counted. Here is the doc on LEN confirming this behaviour: https://technet.microsoft.com/en-us/library/ms190329(v=sql.105).aspx
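The LEAD-plus-prefix-check approach can be tried on the question's data with the sketch below. It assumes SQLite (3.25 or later) through Python's sqlite3 module, so length() and || stand in for SQL Server's LEN and +. Unlike LEN, SQLite's length() does count trailing blanks. The helper name prefix_differences is hypothetical.

```python
import sqlite3

def prefix_differences(rows):
    """Length difference to the next row in the category when one value prefixes the other."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table mytable (id integer primary key, value text, category integer)")
    conn.executemany("insert into mytable values (?, ?, ?)", rows)
    result = conn.execute(
        """
        select id, value, category,
               case when next_value like value || '%'
                      or value like next_value || '%'
                    then length(next_value) - length(value)
               end as differences
        from (select id, value, category,
                     lead(value) over (partition by category order by id) as next_value
              from mytable) this_and_next
        order by id
        """
    ).fetchall()
    conn.close()
    return result

data = [(1, "test", 1), (2, "testing ", 1), (11, "candy", 2), (12, "ca", 2)]
diffs = prefix_differences(data)
print(diffs)
# [(1, 'test', 1, 4), (2, 'testing ', 1, None), (11, 'candy', 2, -3), (12, 'ca', 2, None)]
```

The last row of each category has no next value, so its difference is NULL, as in the expected output of the question.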
create table #temp
(
id int,
value varchar(30),
category int
)
insert into #temp
select 1,'test',1
union all
select 2,'testing',1
union all
select 1,'Candy',2
union all
select 2,'Ca',2
;with cte
as
(
select id,value,category,lead(value) over (partition by category order by id) as nxtvalue
from #temp
)
select id,value,category,len(replace(nxtvalue,value,'')) as differences
from cte
You can also use a self-join query like the one below:
--create table tbl (id int, value nvarchar(100), category int);
--insert into tbl values
--(1,N'test',1)
--,(2,N' testing',1)
--,(11,N'candy',2)
--,(12,N'ca',2);
select A.*, LEN(B.value)-LEN(A.value) as difference
from tbl A LEFT JOIN tbl B on A.id +1 =B.id and A.category=B.category
--drop table tbl
Update: I noticed that you have oddly positioned a space at the end. SQL Server's LEN does not count trailing spaces when calculating length, so here is a hack on the above query:
select A.*, LEN(B.value+'>')-LEN(A.value+'>') as difference
from tbl A LEFT JOIN tbl B on A.id +1 =B.id and A.category=B.category
As pointed out in the comments, the ids may not be consecutive. In that case, try this:
create table #temp ( rownum int PRIMARY KEY IDENTITY(1,1), id int, value nvarchar(100), category int)
insert into #temp (id, value, category)
select id, value, category from tbl order by id asc
select A.id, A.value, A.category, LEN(B.value+'>')-LEN(A.value+'>') as difference
from #temp A LEFT JOIN #temp B on A.rownum +1 =B.rownum and A.category=B.category

How to add row number column in SQL Server 2012

I'm trying to add a new column to an existing table, where the value is the row number/rank. I need a way to generate the row number/rank value, and I also need to limit the rows affected--in this case, the presence of a substring within a string.
Right now I have:
UPDATE table
SET row_id=ROW_NUMBER() OVER (ORDER BY col1 desc) FROM table
WHERE CHARINDEX('2009',col2) > 0
And I get this error:
Windowed functions can only appear in the SELECT or ORDER BY clauses.
(Same error for RANK())
Is there any way to create/update a column with the ROW_NUMBER() function? FYI, this is meant to replace an incorrect, already-existing "rank" column.
You can do this with a CTE, something like:
with cte as
(
select *
, new_row_id=ROW_NUMBER() OVER (ORDER BY col1 desc)
from MyTable
where charindex('2009',col2) > 0
)
update cte
set row_id = new_row_id
SQL Fiddle with demo.
If you are only updating a few thousand rows, you could try something like this:
select 'UPDATE MyTable SET ID = ' + CAST(RowID as varchar) + ' WHERE ID = ' + CAST(ID as varchar)
From (
select ID, ROW_NUMBER() OVER (ORDER BY SortColumn) RowID from MyTable
where SomeClause and SomeOtherClause
) tbl
Copy and paste the query results into the query editor and run them. It's a bit sluggish and yucky, but it works.
A simple workaround would be to create a temp table that looks like
CREATE TABLE #temp (id int, rank int)
where id is the same type as the primary key in your main table.
Just use SELECT INTO to first fill the temp table and then update from the temp table...

Add an incremental number in a field in INSERT INTO SELECT query in SQL Server

I have an INSERT INTO SELECT query. In the SELECT statement I have a subquery in which I want to add an incremental number in a field.
This query works fine if my SELECT query returns only one record, but if it returns multiple rows, it inserts the same number in the incremental field for all of those rows.
Is there any way to restrict it to add an incremental number every time?
INSERT INTO PM_Ingrediants_Arrangements_Temp
    (AdminID, ArrangementID, IngrediantID, Sequence)
(SELECT
    @AdminID, @ArrangementID, PM_Ingrediants.ID,
    (SELECT
        MAX(ISNULL(sequence,0)) + 1
     FROM
        PM_Ingrediants_Arrangements_Temp
     WHERE
        ArrangementID=@ArrangementID)
 FROM
    PM_Ingrediants
 WHERE
    PM_Ingrediants.ID IN (SELECT
                              ID
                          FROM
                              GetIDsTableFromIDsList(@IngrediantsIDs))
)
You can use the row_number() function for this.
INSERT INTO PM_Ingrediants_Arrangements_Temp(AdminID, ArrangementID, IngrediantID, Sequence)
SELECT @AdminID, @ArrangementID, PM_Ingrediants.ID,
row_number() over (order by (select NULL))
FROM PM_Ingrediants
WHERE PM_Ingrediants.ID IN (SELECT ID FROM GetIDsTableFromIDsList(@IngrediantsIDs)
)
If you want to start with the maximum already in the table then do:
INSERT INTO PM_Ingrediants_Arrangements_Temp(AdminID, ArrangementID, IngrediantID, Sequence)
SELECT @AdminID, @ArrangementID, PM_Ingrediants.ID,
coalesce(const.maxs, 0) + row_number() over (order by (select NULL))
FROM PM_Ingrediants cross join
(select max(sequence) as maxs from PM_Ingrediants_Arrangements_Temp) const
WHERE PM_Ingrediants.ID IN (SELECT ID FROM GetIDsTableFromIDsList(@IngrediantsIDs)
)
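The "current maximum plus row_number()" pattern can be exercised with the sketch below. It assumes SQLite (3.25 or later) via Python's sqlite3 module in place of SQL Server, uses simplified hypothetical table and column names, and, as an assumption taken from the question rather than from the query above, restricts the maximum to the arrangement being inserted.

```python
import sqlite3

def append_with_sequence(existing, ingredient_ids, arrangement_id):
    """Insert ingredients with sequence numbers continuing after the current maximum."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        create table ingredients (id integer primary key);
        create table arrangements_temp (
            arrangement_id integer, ingredient_id integer, sequence integer);
        """
    )
    conn.executemany("insert into arrangements_temp values (?, ?, ?)", existing)
    conn.executemany("insert into ingredients values (?)", [(i,) for i in ingredient_ids])
    # The one-row subquery supplies the current maximum (NULL if there is none);
    # row_number() then adds 1, 2, 3, ... on top of it.
    conn.execute(
        """
        insert into arrangements_temp (arrangement_id, ingredient_id, sequence)
        select ?, i.id,
               coalesce(const.maxs, 0) + row_number() over (order by i.id)
        from ingredients i
        cross join (select max(sequence) as maxs
                    from arrangements_temp
                    where arrangement_id = ?) const
        """,
        (arrangement_id, arrangement_id),
    )
    rows = conn.execute(
        "select ingredient_id, sequence from arrangements_temp "
        "where arrangement_id = ? order by sequence",
        (arrangement_id,),
    ).fetchall()
    conn.close()
    return rows

sequenced = append_with_sequence([(7, 100, 1), (7, 101, 2)], [205, 203], 7)
print(sequenced)  # [(100, 1), (101, 2), (203, 3), (205, 4)]
```

Ordering the window by the ingredient id makes the numbering deterministic. The original uses order by (select NULL), which leaves the order up to the engine.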
Finally, you can just make the sequence column an auto-incrementing identity column. This saves the need to increment it each time:
create table PM_Ingrediants_Arrangements_Temp ( . . .
sequence int identity(1, 1) -- and might consider making this a primary key too
. . .
)