How can I query Row number in this way - sql

I'm struggling to write a query that assigns row numbers in this way. Could anyone show me how to produce the Row_Number column below?
ProjectID|RevisionYear|Row_Number|
1 |2016 |1 |
1 |2017 |2 |
1 |2017 |2 |
2 |2019 |1 |
2 |2019 |1 |
2 |2020 |2 |

You need to use DENSE_RANK() instead of ROW_NUMBER(). As the documentation explains, DENSE_RANK() returns the rank of each row within a result set partition, with no gaps in the ranking values. The rank of a specific row is one plus the number of distinct rank values that come before it:
Statement:
SELECT
ProjectID,
StartYear,
DENSE_RANK() OVER (PARTITION BY ProjectID ORDER BY StartYear) AS Row_Number
FROM (VALUES
(1, 2016),
(1, 2017),
(1, 2017),
(2, 2019),
(2, 2019),
(2, 2020)
) v (ProjectID, StartYear)
Result:
ProjectID StartYear Row_Number
1 2016 1
1 2017 2
1 2017 2
2 2019 1
2 2019 1
2 2020 2
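To see why ROW_NUMBER() does not give the requested numbering, it can help to put the three ranking functions side by side. The following is only a sketch: it assumes Python's built-in sqlite3 module with an SQLite build of 3.25 or newer (needed for window functions), and the table name revisions and the output aliases are made up for illustration.

```python
import sqlite3

# In-memory database with the sample data from the question.
# The table name "revisions" is an assumption for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revisions (ProjectID INTEGER, RevisionYear INTEGER)")
conn.executemany(
    "INSERT INTO revisions VALUES (?, ?)",
    [(1, 2016), (1, 2017), (1, 2017), (2, 2019), (2, 2019), (2, 2020)],
)

# Compare the three ranking functions over the same window.
rows = conn.execute(
    """
    SELECT ProjectID,
           RevisionYear,
           ROW_NUMBER() OVER w AS row_num,   -- always unique within a partition
           RANK()       OVER w AS rnk,       -- ties share a rank, gaps follow
           DENSE_RANK() OVER w AS dense_rnk  -- ties share a rank, no gaps
    FROM revisions
    WINDOW w AS (PARTITION BY ProjectID ORDER BY RevisionYear)
    ORDER BY ProjectID, RevisionYear, row_num
    """
).fetchall()

for row in rows:
    print(row)
```

Only the dense_rnk column matches the Row_Number column asked for in the question; row_num keeps counting through ties, and rnk skips a value after each tie.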

SQL query to split and keep only the top N values

I have the following table data:
| name |items |
--------------------
| Bob |1, 2, 3 |
| Rick |5, 3, 8, 4|
| Bill |2, 4 |
I need to create a table with a split items column, but with the limitation to have at most N items per name. E.g. for N = 3 the table should look like this:
|name |item|
-----------
|Bob |1 |
|Bob |2 |
|Bob |3 |
|Rick |5 |
|Rick |3 |
|Rick |8 |
|Bill |2 |
|Bill |4 |

I have the following query that splits items correctly, but doesn't account for the maximum number N. What should I modify in the query (standard SQL, BigQuery) to account for N?
WITH data_split AS (
SELECT name, SPLIT(items,',') AS item
FROM (
SELECT name, items
-- A lot of additional logic here
FROM data
)
)
SELECT name, item
FROM data_split
CROSS JOIN UNNEST(data_split.item) AS item
You can try a more or less standard approach that works practically everywhere:
WITH
-- your input ...
indata(id,nam,items) AS ( -- need a sorting column "id" to keep the sort order
SELECT 1, 'Bob' ,'1,2,3' -- blanks after comma can irritate
UNION ALL SELECT 2, 'Rick','5,3,8,4' -- the splitting function below ...
UNION ALL SELECT 3, 'Bill','2,4'
)
-- real query starts here, replace comma below with "WITH" ...
,
-- exactly 3 integers
i(i) AS (
SELECT 1 -- need to add FROM DUAL , in Oracle, for example ...
UNION ALL SELECT 2
UNION ALL SELECT 3
)
SELECT
id
, nam
, SPLIT_PART(items,',',i) AS item -- the name of this function varies by DBMS
FROM indata CROSS JOIN i
WHERE SPLIT_PART(items,',',i) <> ''
ORDER BY 1, 3
;
-- out id | nam | item
-- out ----+------+------
-- out 1 | Bob | 1
-- out 1 | Bob | 2
-- out 1 | Bob | 3
-- out 2 | Rick | 3
-- out 2 | Rick | 5
-- out 2 | Rick | 8
-- out 3 | Bill | 2
-- out 3 | Bill | 4
Consider the approach below (BigQuery):
select name, trim(item) item
from your_table, unnest(split(items)) item with offset
where offset < 3
Applied to the sample data in your question, this keeps at most the first three items per name.
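For anyone who wants to try the "keep only the first N positions" idea without a BigQuery account, here is a rough sketch in Python's built-in sqlite3. SQLite has neither SPLIT nor UNNEST, so the splitting is swapped for a recursive CTE that peels off one item per step and stops after N steps; this is not the BigQuery query above, only the same idea. The CTE name split and the variable N are assumptions for illustration.

```python
import sqlite3

N = 3  # maximum number of items to keep per name

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (name TEXT, items TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [("Bob", "1, 2, 3"), ("Rick", "5, 3, 8, 4"), ("Bill", "2, 4")],
)

# Each recursive step takes the text before the first comma as the next item
# and keeps the remainder in "rest"; recursion stops when nothing is left
# or when N items have been produced for that name.
split_rows = conn.execute(
    """
    WITH RECURSIVE split(name, pos, item, rest) AS (
        SELECT name, 0, '', items || ',' FROM data
        UNION ALL
        SELECT name,
               pos + 1,
               TRIM(SUBSTR(rest, 1, INSTR(rest, ',') - 1)),
               SUBSTR(rest, INSTR(rest, ',') + 1)
        FROM split
        WHERE rest <> '' AND pos < :n
    )
    SELECT name, item
    FROM split
    WHERE pos > 0          -- drop the seed rows
    ORDER BY name, pos
    """,
    {"n": N},
).fetchall()

for row in split_rows:
    print(row)
```

The pos column plays the same role as the offset in the BigQuery answer: it records the position of each item inside the original string, so limiting it limits the number of items per name.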

how to remove the duplicate fields from SQL

I am trying to write a SQL query over records like the ones below; the expected outcome follows. Basically, I want to fetch each Project with the topmost FundSrc in its list.
Can someone please suggest a query for this?
E.g. table name: Proj
| Project | FundSrc |
|---------|---------|
| 1001 | ABC |
| 1001 | XYZ |
| 1001 | TYS |
| 1002 | XYZ |
| 1002 | TYS |
| 1003 | ABC |
| 1003 | TYS |
| 1003 | TYS |
Expected outcome-
Result
| Project | FundSrc |
|--------- |--------- |
| 1001 | ABC |
| 1002 | XYZ |
| 1003 | ABC |
Find duplicate rows using the GROUP BY clause or ROW_NUMBER() function.
Use the DELETE statement to remove duplicate rows.
SELECT [Project],
[FundSrc],
COUNT(*) AS CNT
FROM [SampleDB].[dbo].[dbname]
GROUP BY [Project],
[FundSrc]
HAVING COUNT(*) > 1;
First, the CTE uses the ROW_NUMBER() function to find the duplicate rows specified by values in the Project and FundSrc columns.
Then, the DELETE statement deletes all the duplicate rows but keeps only one occurrence of each duplicate group.
SQL tables represent unordered sets. There is no "topmost" value unless a column specifies what "topmost" means. Your data doesn't have such a column.
If it did, then you would have different options. One simple way uses row_number():
select p.*
from (select p.*,
row_number() over (partition by project order by <ordering col> desc) as seqnum
from proj p
) p
where seqnum = 1;
You need an ordering column to avoid a random result:
with source (key, value, o) as (values
(1001, 'ABC', 1),
(1001, 'XYZ', 2),
(1001, 'TYS', 3),
(1002, 'XYZ', 4),
(1002, 'TYS', 5),
(1003, 'ABC', 6),
(1003, 'TYS', 7),
(1003, 'TYS', 8)
)
select distinct key, first_value (value) over (partition by key order by o) from source
;
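Putting the two answers above together, a runnable sketch could look like the following. It assumes the table has been given an explicit ordering column, here called seq, which does not exist in the original data and is purely hypothetical; the script uses Python's built-in sqlite3 (SQLite 3.25 or newer for window functions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "seq" is a hypothetical ordering column; the question's table has none.
conn.execute("CREATE TABLE proj (project INTEGER, fund_src TEXT, seq INTEGER)")
conn.executemany(
    "INSERT INTO proj VALUES (?, ?, ?)",
    [
        (1001, "ABC", 1), (1001, "XYZ", 2), (1001, "TYS", 3),
        (1002, "XYZ", 4), (1002, "TYS", 5),
        (1003, "ABC", 6), (1003, "TYS", 7), (1003, "TYS", 8),
    ],
)

# Number the rows of each project by seq and keep only the first one.
top_rows = conn.execute(
    """
    SELECT project, fund_src
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (PARTITION BY project ORDER BY seq) AS seqnum
        FROM proj AS p
    ) AS numbered
    WHERE seqnum = 1
    ORDER BY project
    """
).fetchall()

print(top_rows)
```

Without something like seq, the database is free to return any FundSrc for a project, which is why the ordering column matters more than the choice between row_number() and first_value().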

Get count of all instances before a certain date

I have a table like this:
--------------------------------------
RecID|name |date
--------------------------------------
1 |John | 05/09/2016
2 |John | 05/02/2016
3 |Mary | 05/09/2016
4 |Mary | 05/08/2016
5 |Mary | 03/02/2016
and I want to get the count for name for each instance in which that name has appeared on or before that date in the row. So I want the output to look like this:
--------------------------------------
RecID|name |date |count
--------------------------------------
1 |John | 05/09/2016 | 2
2 |John | 05/02/2016 | 1
3 |Mary | 05/09/2016 | 3
4 |Mary | 05/08/2016 | 2
5 |Mary | 03/02/2016 | 1
Any ideas on how I should go about doing this?
You can use the count function with a window specification.
select t.*, count(*) over(partition by name order by date) as cnt
from tablename t
This will produce incorrect results if there are multiple rows on a given date for a name. One way to avoid this is using a correlated sub-query.
select t.*,
(select count(distinct t2.date)
from tablename t2
where t2.name=t.name and t2.date<=t.date) as cnt
from tablename t
Or use row_number.
select t.*, row_number() over(partition by name order by date) as cnt
from tablename t
Or use dense_rank if there can be multiple rows for the same name on a given date.
select t.*, dense_rank() over(partition by name order by date) as cnt
from tablename t
The easiest solution of all would be to use dense_rank.
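The difference between these variants only shows up when a name has two rows on the same date, so here is a small sketch that adds one such row. Assumptions: Python's built-in sqlite3 (SQLite 3.25 or newer), a made-up table name visits, dates rewritten as ISO strings so that they sort correctly as text, and an extra row with RecID 6 that is not in the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (rec_id INTEGER, name TEXT, visit_date TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        (1, "John", "2016-05-09"),
        (2, "John", "2016-05-02"),
        (3, "Mary", "2016-05-09"),
        (4, "Mary", "2016-05-08"),
        (5, "Mary", "2016-03-02"),
        (6, "Mary", "2016-05-08"),  # extra row (assumption) to create a tie
    ],
)

tie_rows = conn.execute(
    """
    SELECT rec_id,
           name,
           -- counts every row up to and including all rows on the same date
           COUNT(*)     OVER (PARTITION BY name ORDER BY visit_date)         AS cnt,
           -- rec_id is added as a tie-breaker so the numbering is deterministic
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY visit_date, rec_id) AS rn,
           -- counts distinct dates up to and including the current one
           DENSE_RANK() OVER (PARTITION BY name ORDER BY visit_date)         AS dr
    FROM visits
    ORDER BY rec_id
    """
).fetchall()

for row in tie_rows:
    print(row)
```

For Mary's two rows on 2016-05-08, cnt is 3 for both, rn is 2 and 3, and dr is 2 for both. Which of these is "correct" depends on whether you want to count rows or distinct dates.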
Use
count(*) count
and
group by date
if your date is already a string (i.e. without hour/minute information).

CTE for all Child for Multiple parent

I have a parent child relation table as shown below:
ContractID ContractIdRef
---------- -------------
1 null
2 1
3 1
4 2
5 4
10 null
11 10
12 11
15 null
16 12
I want result like below:
ContractID ContractIdRef rw
----------- -------------- ---
1 null 1
2 1 1
3 1 1
4 2 1
5 4 1
10 null 10
11 10 10
12 11 10
15 null 15
16 12 10
In the above result I want to show each row's top-level parent in the rw column.
Thanks
As you mentioned in the tags, a Common Table Expression is the way to go:
;WITH REC_CTE
AS (SELECT [contractid],
[ContractIdRef],
[contractid] AS rw
FROM Yourtable
WHERE [contractidref] IS NULL
UNION ALL
SELECT T.[contractid],
T.[contractidref],
c.rw
FROM Yourtable AS T
INNER JOIN REC_CTE C
ON T.[contractidref] = c.[contractid]
WHERE T.[contractid] <> T.[contractidref])
SELECT [contractid],
[contractidref],
rw
FROM REC_CTE
ORDER BY [contractid]
Demo
Schema Setup
If object_id('tempdb.dbo.#Yourtable') is not null
DROP table #Yourtable
CREATE TABLE #Yourtable
([ContractID] INT, [ContractIdRef] INT);
Sample data
INSERT INTO #Yourtable
([ContractID], [ContractIdRef])
VALUES
('1', NULL),
('2', '1'),
('3', '1'),
('4', '2'),
('5', '4'),
('10', NULL),
('11', '10'),
('12', '11'),
('15', NULL),
('16', '12');
Query
;WITH REC_CTE
AS (SELECT [ContractID],
[ContractIdRef] as [ContractIdRef],
[ContractID] AS rw
FROM #Yourtable where [ContractIdRef] is null
UNION ALL
SELECT T.[ContractID],
T.[ContractIdRef],
c.rw
FROM #Yourtable AS T
INNER JOIN REC_CTE c
ON T.[ContractIdRef] = c.[ContractID]
WHERE T.[ContractID] <> T.[ContractIdRef])
SELECT [ContractID],
[ContractIdRef],
rw
FROM REC_CTE
ORDER BY [ContractID]
Result
+-----------+-------------+----+
|ContractID |ContractIdRef| rw |
+-----------+-------------+----+
|1 |NULL | 1 |
|2 |1 | 1 |
|3 |1 | 1 |
|4 |2 | 1 |
|5 |4 | 1 |
|10 |NULL | 10 |
|11 |10 | 10 |
|12 |11 | 10 |
|15 |NULL | 15 |
|16 |12 | 10 |
+-----------+-------------+----+
with Q as(
select ContractID, ContractIdRef, ContractID as root
from childs
where ContractIdRef is null
union all
select C.ContractID, C.ContractIdRef, Q.root
from Q, childs C
where C.ContractIdRef=Q.ContractID
)
select * from Q
order by ContractID
Tested on MS SQL 2014.
For PostgreSQL you need to add the word 'recursive' after 'with'. Tested on sqlfiddle.com.
For Oracle, the first line is written as with Q(ContractID,ContractIdRef,root).
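If you want to try the recursive CTE without a SQL Server instance, the same idea can be sketched with Python's built-in sqlite3. Like PostgreSQL, SQLite spells it WITH RECURSIVE. The table name contracts and the CTE name tree are assumptions for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contracts (ContractID INTEGER, ContractIdRef INTEGER)")
conn.executemany(
    "INSERT INTO contracts VALUES (?, ?)",
    [
        (1, None), (2, 1), (3, 1), (4, 2), (5, 4),
        (10, None), (11, 10), (12, 11), (15, None), (16, 12),
    ],
)

# Anchor member: rows without a parent are their own root.
# Recursive member: each child inherits the root (rw) of its parent.
root_rows = conn.execute(
    """
    WITH RECURSIVE tree(ContractID, ContractIdRef, rw) AS (
        SELECT ContractID, ContractIdRef, ContractID
        FROM contracts
        WHERE ContractIdRef IS NULL
        UNION ALL
        SELECT c.ContractID, c.ContractIdRef, t.rw
        FROM contracts AS c
        JOIN tree AS t ON c.ContractIdRef = t.ContractID
    )
    SELECT ContractID, ContractIdRef, rw
    FROM tree
    ORDER BY ContractID
    """
).fetchall()

for row in root_rows:
    print(row)
```

The root is carried down one level per recursion step, which is why contract 16 (child of 12, child of 11, child of 10) ends up with rw = 10.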

Max value for each group in subquery

I have a table with 10 columns and I am interested in 3 of those.
Say tableA with id, name, url, ranking.
id |name |url |ranking
--------------------------------
1 |apple |a1.com |1
2 |apple |a1.com |2
3 |apple |a1.com |3
4 |orange |o1.com |1
5 |orange |o1.com |2
6 |apple |a1.com |4
So what I want is all the columns for the rows with id 5 and 6, that is, the row with the maximum ranking for each group (apple, orange).
Use row_number to number the rows within each name group by ranking in descending order, then select the first row of each group.
select id,name,url,ranking
from
(select t.*, row_number() over(partition by name order by ranking desc) as rn
from tablename t) t
where rn =1
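A quick way to check the query against the sample data is a throwaway in-memory database. This sketch assumes Python's built-in sqlite3 (SQLite 3.25 or newer) and uses the table name tableA from the question, with only the four columns that were listed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (id INTEGER, name TEXT, url TEXT, ranking INTEGER)")
conn.executemany(
    "INSERT INTO tableA VALUES (?, ?, ?, ?)",
    [
        (1, "apple", "a1.com", 1),
        (2, "apple", "a1.com", 2),
        (3, "apple", "a1.com", 3),
        (4, "orange", "o1.com", 1),
        (5, "orange", "o1.com", 2),
        (6, "apple", "a1.com", 4),
    ],
)

# rn = 1 marks the row with the highest ranking inside each name group.
max_rows = conn.execute(
    """
    SELECT id, name, url, ranking
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY ranking DESC) AS rn
        FROM tableA AS t
    ) AS numbered
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()

print(max_rows)
```

If two rows in a group could share the maximum ranking, row_number() returns just one of them; replacing it with rank() would return all of the tied rows.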