SQL Server RowNumbering - sql

SrNo TextCol
--------------
NULL ABC
NULL ABC
NULL ASC
NULL qwe
I want to update the SrNo column with numbers 1,2,3,4 without changing sequence of other columns.

It only makes sense to speak of using row number if there exist a column which can provide ordering. Assuming the ordering is specified by the TextCol column, then we can try the following:
WITH cte AS (
SELECT SrNo, TextCol, ROW_NUMBER() OVER (ORDER BY TextCol) rn
FROM yourTable
)
UPDATE cte
SET SrNo = rn;

Tables are unordered, so you cannot rely on "existing sequence". However a "trick" is to use select null which in effect does nothing to the row order. While it works you should not rely on it as a permanent solution.
WITH cte AS (
SELECT SrNo, TextCol
, ROW_NUMBER() OVER (ORDER BY (select NULL)) rn
FROM yourTable
)
UPDATE cte
SET SrNo = rn;

Related

How do I exclude null duplicates without excluding valid nulls - SQL Server

I'm using a SQL Server and am trying to pull certain data. However, some rows pop up with duplicate order #s, where one line # is null while the other has a correct value.
The tricky thing is that there also are other line #s that are null, and if I were to do a filter that line_id is not null then I would exclude all the valid order #s with null values. Would I use a case statement? A subquery? I'm at a loss.
Here's an abridged version of my code and what I mean:
select
order_number
line_number
from table_1
With NOT EXISTS:
select t.*
from tablename t
where t.line_number is not null
or not exists (
select 1 from tablename
where order_number = t.order_number and line_number is not null
)
or with ROW_NUMBER() window function:
select t.order_number, t.line_number
from (
select *,
row_number() over (partition by order_number order by case when order_number is not null then 1 else 2 end) rn
from tablename
) t
where t.rn = 1

Find duplicate ID and add new sequence ID

I have a table where ID must be unique. There are some IDs that are not unique. How do I generate a new column which adds a sequence to this ID? I want to generate ID_new_generated in the table below
ID Company Name ID_new_generated
1 A 1
1 B 1_2
2 C 2
You can use a windowing function (e.g. Rank) to to generate an secondary ID, over each window defined by rows that have the same ID number, then just concatenate it to create the new one.
something like:
select
ID
, companyName
, rank() over(partition by ID ORDER BY companyName)
, concat(ID, '_', rank() over(partition by ID ORDER BY companyName)) as new_id
from test;
See this demo: https://www.db-fiddle.com/f/bd6aQKnZ7gcZCQjFpZicrp/0
Syntax will be different depending on which sql you are using.
Assumed you are looking for a solution in SQL Server:
First you will need to add a nullable column ID_Generated like below:
ALTER TABLE tablename
ADD COLUMN ID_Generated varchar(25) null
GO
Then, use row_number like below in a cte structure (you can use temp table if you are using mysql):
;with cte as (
SELECT DISTINCT t.ID,
(ROW_NUMBER() over (partition by t.ID order by t.ID)) as RowNumber
FROM tablename t
INNER JOIN (select ID, Count(*) RecCount
From tablename
group by ID
having Count(*) > 1) tt on t.ID = t.ID
ORDER BY id ASC
)
Update t
set t.ID_Generated = cte.RowNumber
from tablename t
inner join cte on t.ID = cte.ID
I think you want:
select ID, companyName,
(case when row_number() over (partition by id order by companyname) = 1
then cast(id as varchar(255))
else id || '_' || row_number() over (partition by id order by companyname)
end) as new_id
from test;
|| is the ANSI/ISO standard concatenation operator in SQL. Not all databases support it, so you might need to replace the operator with the one appropriate for your database.

How to get the first not null value from a column of values in Big Query?

I am trying to extract the first not null value from a column of values based on timestamp. Can somebody share your thoughts on this. Thank you.
What have i tried so far?
FIRST_VALUE( column ) OVER ( PARTITION BY id ORDER BY timestamp)
Input :-
id,column,timestamp
1,NULL,10:30 am
1,NULL,10:31 am
1,'xyz',10:32 am
1,'def',10:33 am
2,NULL,11:30 am
2,'abc',11:31 am
Output(expected) :-
1,'xyz',10:30 am
1,'xyz',10:31 am
1,'xyz',10:32 am
1,'xyz',10:33 am
2,'abc',11:30 am
2,'abc',11:31 am
You can modify your sql like this to get the data you want.
FIRST_VALUE( column )
OVER (
PARTITION BY id
ORDER BY
CASE WHEN column IS NULL then 0 ELSE 1 END DESC,
timestamp
)
Try this old trick of string manipulation:
Select
ID,
Column,
ttimestamp,
LTRIM(Right(CColumn,20)) as CColumn,
FROM
(SELECT
ID,
Column,
ttimestamp,
MIN(Concat(RPAD(IF(Column is null, '9999999999999999',STRING(ttimestamp)),20,'0'),LPAD(Column,20,' '))) OVER (Partition by ID) CColumn
FROM (
SELECT
*
FROM (Select 1 as ID, STRING(NULL) as Column, 0.4375 as ttimestamp),
(Select 1 as ID, STRING(NULL) as Column, 0.438194444444444 as ttimestamp),
(Select 1 as ID, 'xyz' as Column, 0.438888888888889 as ttimestamp),
(Select 1 as ID, 'def' as Column, 0.439583333333333 as ttimestamp),
(Select 2 as ID, STRING(NULL) as Column, 0.479166666666667 as ttimestamp),
(Select 2 as ID, 'abc' as Column, 0.479861111111111 as ttimestamp)
))
As far as I know, Big Query has no options like 'IGNORE NULLS' or 'NULLS LAST'. Given that, this is the simplest solution I could come up with. I would like to see even simpler solutions.
Assuming the input data is in table "original_data",
select w2.id, w1.column, w2.timestamp
from
(select id,column,timestamp
from
(select id,column,timestamp, row_number()
over (partition BY id ORDER BY timestamp) position
FROM original_data
where column is not null
)
where position=1
) w1
right outer join
original_data as w2
on w1.id = w2.id
SELECT id,
(SELECT top(1) column FROM test1 where id=1 and column is not null order by autoID desc) as name
,timestamp
FROM yourTable
Output :-
1,'xyz',10:30 am
1,'xyz',10:31 am
1,'xyz',10:32 am
1,'xyz',10:33 am
2,'abc',11:30 am
2,'abc',11:31 am

ORA-00907 Missing right parenthesis issue - select with order by inside insert query

I am trying to do insert to a table and it uses one select statement for one column. Below is the illustration of my query.
INSERT INTO MY_TBL (MY_COL1, MY_COL2)
VALUES (
(SELECT DATA FROM FIR_TABL WHERE ID = 1 AND ROWNUM = 1 ORDER BY CREATED_ON DESC),
1
);
It throws ORA-00907 Missing right Parenthesis. If I remove ORDER BY from this, it works as expected. But I need order it. How can I fix it?
Both the current answers ignore the fact that using order by and rownum in the same query is inherently dangerous. There is absolutely no guarantee that you will get the data you want. If you want the first row from an ordered query you must use a sub-query:
insert into my_tbl ( col1, col2 )
select data, 'more data'
from ( select data
from fir_tabl
where id = 1
order by created_on desc )
where rownum = 1
;
You can also use a function like rank to order the data in the method you want, though if you had two created_on dates that were identical you would end up with 2 values with rnk = 1.
insert into my_tbl ( col1, col2 )
select data, 'more data'
from ( select data
, rank() over ( order by created_on desc ) as rnk
from fir_tabl
where id = 1)
where rnk = 1
;
You don't use a SELECT when using the VALUES keyword. Use this instead:
INSERT INTO MY_TBL (MY_COL)
SELECT DATA FROM FIR_TABL WHERE ID = 1 ORDER BY CREATED_ON DESC
;
Your edited query would look like:
INSERT INTO MY_TBL (MY_COL1, MY_COL2)
SELECT DATA, 1 FROM FIR_TABL WHERE ID = 1 AND ROWNUM = 1 ORDER BY CREATED_ON DESC
;
I agree that ordering should be performed when extracting data, not when inserting it.
However, as a workaround, you could isolate the ORDER BY clause from the INSERT incapsulating your whole SELECT into another SELECT.
This will avoid the error:
INSERT INTO MY_TABLE (
SELECT * FROM (
SELECT columns
FROM table
ORDER BY clause
)
)

Select top and bottom rows

I'm using SQL Server 2005 and I'm trying to achieve something like this:
I want to get the first x rows and the last x rows in the same select statement.
SELECT TOP(5) BOTTOM(5)
Of course BOTTOM does not exist, so I need another solution. I believe there is an easy and elegant solution that I'm not getting. Doing the select again with GROUP BY DESC is not an option.
Using a union is the only thing I can think of to accomplish this
select * from (select top(5) * from logins order by USERNAME ASC) a
union
select * from (select top(5) * from logins order by USERNAME DESC) b
Check the link
SQL SERVER – How to Retrieve TOP and BOTTOM Rows Together using T-SQL
Did you try to using rownumber?
SELECT *
FROM
(SELECT *, ROW_NUMBER() OVER (Order BY columnName) as TopFive
,ROW_NUMBER() OVER (Order BY columnName Desc) as BottomFive
FROM Table
)
WHERE TopFive <=5 or BottomFive <=5
http://www.sqlservercurry.com/2009/02/select-top-n-and-bottom-n-rows-using.html
I think you've two main options:
SELECT TOP 5 ...
FROM ...
ORDER BY ... ASC
UNION
SELECT TOP 5 ...
FROM ...
ORDER BY ... DESC
Or, if you know how many items there are in the table:
SELECT ...
FROM (
SELECT ..., ROW_NUMBER() OVER (ORDER BY ... ASC) AS intRow
FROM ...
) AS T
WHERE intRow BETWEEN 1 AND 5 OR intRow BETWEEN #Number - 5 AND #Number
Is it an option for you to use a union?
E.g.
select top 5 ... order by {specify columns asc}
union
select top 5 ... order by {specify columns desc}
i guess you have to do it using subquery only
select * from table where id in (
(SELECT id ORDER BY columnName LIMIT 5) OR
(SELECT id ORDER BY columnName DESC LIMIT 5)
)
select * from table where id in (
(SELECT TOP(5) id ORDER BY columnName) OR
(SELECT TOP(5) id ORDER BY columnName DESC)
)
EDITED
select * from table where id in (
(SELECT TOP 5 id ORDER BY columnName) OR
(SELECT TOP 5 id ORDER BY columnName DESC)
)
No real difference between this and the union that I'm aware of, but technically it is a single query.
select t.*
from table t
where t.id in (select top 5 t2.id from table t2 order by MyColumn)
or
t.id in (select top 5 t2.id from table t2 order by MyColumn desc);
SELECT *
FROM (
SELECT x, rank() over (order by x asc) as rown
FROM table
) temp
where temp.rown = 1
or temp.rown = (select count(x) from table)
Then you are out - doing the select again IS the only option, unless you want to pull in the complete result set and then throwing away everything in between.
ANY sql I cna think of is the same way - for the bottom you need to know first either how many items you have (materialize everything or use count(*)) or a reverse sort order.
Sorry if that does not suit you, but at the end.... reality does not care, and I do not see any other way to do that.
I had to do this recently for a very large stored procedure; if your query is quite large, and you want to minimize the amount of queries you could declare a #tempTable, insert into that #tempTable then query from that #tempTable,
DECLARE #tempTable TABLE ( columns.. )
INSERT INTO #tempTable
VALUES ( SELECT.. your query here ..)
SELECT TOP(5) columns FROM #tempTable ORDER BY column ASC -- returns first to last
SELECT TOP(5) columns FROM #tempTable ORDER BY column DESC -- returns last to first