How to select just the third or fourth row in SQL Server

I am having trouble figuring out how to select just the third or fourth row in a query I am writing. Any help would be greatly appreciated.
This is an example of the code I came up with; however, it only selects the first row.
Left Outer Join (select ap_attachments.ap_table_key, ap_description, ap_creation_date, ap_creation_time, ap_file_name, ap_attach_id
from ap_attachments
inner join (select Min(ap_attachment_id) ap_attach_id, ap_table_key
from ap_attachments
where ap_file_name like '%jpg%'
group by ap_table_key) C
On ap_attachments.ap_attachment_id = C.ap_attach_id) apImgThree_attach
On apImgTwo_attach.ap_table_key = order_link.to_order_id

You can do this with the ROW_NUMBER() function:
select ap_attachment_id, ap_table_key,ROW_NUMBER() OVER(PARTITION BY ap_table_key ORDER BY ap_attachment_id) AS RN
from ap_attachments
where ap_file_name like '%jpg%'
Then you can specify which row you'd like to return by filtering on the RN value in an outer query (a window function cannot be referenced directly in the WHERE clause of the same SELECT). This may require some adapting depending on your source data; the DENSE_RANK() function may be more appropriate.
The ROW_NUMBER() function assigns a number to each row. PARTITION BY is optional, but is used to start the numbering over for each value in that group, i.e. if you PARTITION BY Some_Date, then for each unique date value the numbering starts over at 1. ORDER BY defines the order in which rows are numbered, and is required in the ROW_NUMBER() function.
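To make the "filter on RN" step concrete, here is a minimal sketch. It is written in Python against an in-memory SQLite database only so that it can be run without a SQL Server instance; the sample rows are made up, and the SQL inside the string uses the same ROW_NUMBER() syntax that SQL Server accepts.

```python
import sqlite3

# Hypothetical sample data: (ap_attachment_id, ap_table_key, ap_file_name).
rows = [
    (1, 100, "a.jpg"), (2, 100, "b.jpg"), (3, 100, "c.jpg"), (4, 100, "d.jpg"),
    (5, 200, "e.jpg"), (6, 200, "f.pdf"), (7, 200, "g.jpg"), (8, 200, "h.jpg"),
    (9, 300, "i.jpg"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ap_attachments ("
    "ap_attachment_id INTEGER, ap_table_key INTEGER, ap_file_name TEXT)"
)
conn.executemany("INSERT INTO ap_attachments VALUES (?, ?, ?)", rows)

# Number the rows inside a derived table, then filter on RN outside it.
third_rows = conn.execute(
    """
    SELECT ap_table_key, ap_attachment_id
    FROM (
        SELECT ap_attachment_id,
               ap_table_key,
               ROW_NUMBER() OVER (PARTITION BY ap_table_key
                                  ORDER BY ap_attachment_id) AS RN
        FROM ap_attachments
        WHERE ap_file_name LIKE '%jpg%'
    ) AS numbered
    WHERE RN = 3
    ORDER BY ap_table_key
    """
).fetchall()

print(third_rows)  # keys with fewer than three jpg rows simply drop out
```

Changing RN = 3 to RN = 4 picks the fourth row instead. In the original query the derived table would take the place of the Min(ap_attachment_id) subquery.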

Look up the docs on LEAD and LAG. You can also use the PARTITION BY clause to create the window within a specific date. For example:
declare @table table(
[flower] [sysname]);
insert into @table
([flower])
values (N'rose'),
(N'tulip'),
(N'chamomile'),
(N'lily');
select [flower] from @table order by [flower];
select [flower]
, lag ([flower]
, 1
, 0)
over (
order by [flower] desc) as [previous_flower]
, lead ([flower]
, 1
, 0)
over (
order by [flower] desc) as [next_flower]
from @table;

Related

How to get the last record from the duplicate records in SQL?

I want to get the last record from each group of duplicate records, and I want the non-duplicate records as well.
As depicted in the image below, I want to get rows 4, 5, 7 and 9 in my output.
The image below shows the main table. I have to concatenate the first two columns and then, based on that new column, get the last row of each set of duplicates along with the non-duplicate rows.
I have tried the SQL code given below.
DECLARE @dense_rank_demo AS TABLE (
Bid INT,
cid INT,
BCode NVARCHAR(10)
);
INSERT INTO @dense_rank_demo(Bid,cid,BCode)
VALUES(2393,1,'LAX'),(2394,54,'BRK'),(2395,57,'ONT'),(2393,1,'SAN'),(2393,1,'LAX'),(2393,1,'BRK'),(2394,54,'ONT'),(2395,57,'SAN'),(2394,1,'ONT');
SELECT * FROM @dense_rank_demo;
SELECT
CONCAT([Bid],'_',[cid]) as [Key],BCode,DENSE_RANK() over( order by CONCAT([Bid],'_',[cid]))
from @dense_rank_demo
From this SQL code I found that there is no column we can apply ORDER BY to in order to get the expected result.
So I added a column named ID and made some other changes to get the expected output.
Here is the code with those changes.
DECLARE @dense_rank_demo AS TABLE (
ID INT IDENTITY(1,1),
Bid INT,
cid INT,
BCode NVARCHAR(10));
DECLARE @tableGroupKey TABLE
(
dr bigint,
[Key] VARCHAR(50)
)
INSERT INTO @dense_rank_demo(Bid,cid,BCode)
VALUES(2393,1,'LAX'),
(2394,54,'BRK'),
(2395,57,'ONT'),
(2393,1,'SAN'),
(2393,1,'LAX'),
(2393,1,'BRK'),
(2394,54,'ONT'),
(2395,57,'SAN'),
(2394,1,'ONT');
with [drd] as
(
select
concat([Bid],'_',[cid]) as [Key],
BCode,
dense_rank() over(partition by concat([Bid],'_',[cid]) order by ID) as
[dr]
from @dense_rank_demo
)
INSERT INTO @tableGroupKey(dr,[Key])
select MAX(dr) dr,[Key]
from [drd]
GROUP BY [Key]
SELECT *,CONCAT(Bid,'_',cid) AS [Key] FROM @dense_rank_demo [drd]
select Result.* FROM
(
SELECT *,CONCAT(Bid,'_',cid) AS [Key] ,
dense_rank() over(partition by concat([Bid],'_',[cid]) order by ID) as
[dr]
FROM @dense_rank_demo [drd]
) as [Result]
INNER JOIN @tableGroupKey [gk] ON
[Result].[Key] = [gk].[Key] AND [gk].dr = [Result].dr
ORDER BY [Result].ID
The Expected output is as below:
[Output]
The issue here is the ordering of the values within the result set. If you had a specific order to use, this would be fairly straightforward; however, you are relying on dense_rank() to consistently and reliably return the same values for the rows in the table. If we could use, for example, the alphabetical sort on the BCode column, then it would be simple to use a CTE and get the last/first one:
with [drd] as
(
select
concat([Bid],'_',[cid]) as [Key],
BCode,
dense_rank() over(partition by concat([Bid],'_',[cid]) order by Bcode desc) as [dr]
from @dense_rank_demo
)
select *
from [drd]
where dr = 1
As the order of dense_rank() is not guaranteed in your code, I'm not sure that this is feasible in a scalable way.
See this for more information about reliably sorted results: how does SELECT TOP works when no order by is specified?
You need one row per Bid, i.e. the latest one, but you have not specified the logic that defines the last row. Usually the last row is the most recently added one, so there is usually a timestamp that can be used to pick the latest row where there are duplicates.
The code below uses BCode in the ORDER BY clause, which means it will pick the row that comes first alphabetically. That may not be the row you expect unless that is how you define the most recent row. In general you would need to adjust the ORDER BY clause to your needs, but a timestamp makes the most sense.
row_number() generates the values 1 to n within each partition. If there is a tie and you need both rows, use dense_rank() instead. Adjust that based on your needs.
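with main as (
select
-- [Key] is bracketed because KEY is a reserved word in T-SQL;
-- the '_' separator keeps e.g. (23, 1) and (2, 31) from colliding
concat(Bid, '_', cid) as [Key],
BCode,
row_number() over(partition by concat(Bid, '_', cid) order by BCode) as rank_
from <table_name>
)
select * from main where rank_ = 1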
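If "last" is taken to mean "highest ID within each Bid/cid key" (an assumption; the question never pins this down), numbering each partition by ID descending and keeping row 1 is enough. The sketch below runs the idea in Python on an in-memory SQLite database purely so it can be executed without SQL Server; the AUTOINCREMENT column stands in for IDENTITY(1,1) and the || operator stands in for CONCAT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dense_rank_demo ("
    "ID INTEGER PRIMARY KEY AUTOINCREMENT, Bid INTEGER, cid INTEGER, BCode TEXT)"
)
# Same sample rows as in the question, inserted in the same order.
conn.executemany(
    "INSERT INTO dense_rank_demo (Bid, cid, BCode) VALUES (?, ?, ?)",
    [(2393, 1, "LAX"), (2394, 54, "BRK"), (2395, 57, "ONT"),
     (2393, 1, "SAN"), (2393, 1, "LAX"), (2393, 1, "BRK"),
     (2394, 54, "ONT"), (2395, 57, "SAN"), (2394, 1, "ONT")],
)

# rn = 1 marks the row with the highest ID inside each (Bid, cid) group.
last_rows = conn.execute(
    """
    SELECT ID, Bid || '_' || cid AS "Key", BCode
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY Bid, cid
                                  ORDER BY ID DESC) AS rn
        FROM dense_rank_demo
    ) AS numbered
    WHERE rn = 1
    ORDER BY ID
    """
).fetchall()

for row in last_rows:
    print(row)
```

A key that occurs only once (2394_1 here) is its own last row, so non-duplicates come through without any special handling.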

Selecting max value from table with two variable columns (Microsoft SQL)

I'm working with a table that looks like this:
Start
https://i.stack.imgur.com/uibc3.png
My desired result would look like this:
Result
https://i.stack.imgur.com/v0sic.png
So I'm trying to select the max value from two "combined" columns. If the amounts are the same (Part C), the outcome doesn't matter.
I tried to order the table by max value and then use DISTINCT, but the result didn't turn out as expected.
Could you please offer a solution or some insight to this? Thanks in advance!
Use row_number():
select *
from (
select t.*, row_number() over(partition by part order by amount desc, zone) rn
from mytable t
) t
where rn = 1
For each part, this gives you the row with the highest amount; if there are top ties, column zone is used to break them.
If you want to allow ties, then use rank() instead, like:
rank() over(partition by part order by amount desc) rn
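The difference between the two functions only shows up when amounts tie. As a rough illustration, the sketch below runs both variants in Python on an in-memory SQLite database (chosen only so the example is runnable without SQL Server). The table name mytable comes from the query above; the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (part TEXT, zone TEXT, amount INTEGER)")
# Hypothetical rows; PartC has two zones tied on the top amount.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("PartA", "71H", 1), ("PartA", "75H", 2),
     ("PartB", "98D", 1), ("PartB", "98A", 3),
     ("PartC", "75H", 1), ("PartC", "52H", 1)],
)


def top_rows(window_function):
    """Return the rows numbered 1 by the given window function expression."""
    sql = f"""
        SELECT part, zone, amount
        FROM (
            SELECT t.*, {window_function} AS rn
            FROM mytable t
        ) AS ranked
        WHERE rn = 1
        ORDER BY part, zone
    """
    return conn.execute(sql).fetchall()


# ROW_NUMBER with a tie-breaker: exactly one row per part.
one_per_part = top_rows(
    "ROW_NUMBER() OVER (PARTITION BY part ORDER BY amount DESC, zone)"
)
# RANK without a tie-breaker: every row tied for the top amount is kept.
with_ties = top_rows("RANK() OVER (PARTITION BY part ORDER BY amount DESC)")

print(one_per_part)
print(with_ties)
```

With ROW_NUMBER the tie in PartC is broken by zone, so only 52H survives; with RANK both PartC rows are returned.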
You can also achieve this with a subquery:
DECLARE @T TABLE(
PART VARCHAR(50),
ZONE VARCHAR(10),
Amt INT)
Insert Into @T Values ('PartA','71H',1),('PartA','75H',2),('PartB','98D',1),('PartB','98A',3),('PartC','75H',1),('PartC','52H',1)
SELECT M.PART,MIN(M.Zone) AS ZONE,S.AMOUNT
FROM @T M
INNER JOIN (
SELECT Part,MAX(Amt) as AMOUNT From @T
GROUP BY PART) S ON S.AMOUNT=M.Amt AND S.PART=M.PART
GROUP BY M.PART,S.AMOUNT
ORDER BY M.PART

Duration between 2 dates based on another column

I currently have a table of data that shows different steps in a process, with a date/time each step was carried out.
What I'm looking to do is add a column that calculates the time in minutes between each step, however it has to relate to the claimID, so in the image shown I would be looking for difference between each step for the top 4 results (as they share the same claimID), then the following 6 results, etc.
Can anyone help? I'm using SQL Server
Depending on what version of SQL Server you are using you can either use a self join or the lag window function (this should work in SQL Server 2012+):
select
claimid
, statusid
, statussetdate
, coalesce(datediff(minute,
lag(statussetdate) over (partition by claimid order by statussetdate),
statussetdate
),0) as diff_in_minutes
from
your_table
order by
ClaimID
, StatusSetDate;
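For readers without a SQL Server instance at hand, the same LAG pattern can be tried in Python against an in-memory SQLite database. This is only a sketch under a few assumptions: the table name claim_status and the sample rows are invented, and because SQLite has no DATEDIFF, the minute difference is computed from Unix seconds via strftime('%s', ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claim_status (claimid INTEGER, statusid INTEGER, statussetdate TEXT)"
)
# Hypothetical steps for two claims, timestamps on whole minutes.
conn.executemany(
    "INSERT INTO claim_status VALUES (?, ?, ?)",
    [(1, 10, "2020-01-01 09:00:00"),
     (1, 20, "2020-01-01 09:30:00"),
     (1, 30, "2020-01-01 11:00:00"),
     (2, 10, "2020-01-02 08:00:00"),
     (2, 20, "2020-01-02 08:05:00")],
)

# LAG fetches the previous step's timestamp within the same claim;
# the first step of each claim has no predecessor, so COALESCE yields 0.
diffs = conn.execute(
    """
    WITH stepped AS (
        SELECT claimid, statusid, statussetdate,
               LAG(statussetdate) OVER (PARTITION BY claimid
                                        ORDER BY statussetdate) AS prev_date
        FROM claim_status
    )
    SELECT claimid, statusid,
           COALESCE(
               (CAST(strftime('%s', statussetdate) AS INTEGER)
                - CAST(strftime('%s', prev_date) AS INTEGER)) / 60,
               0) AS diff_in_minutes
    FROM stepped
    ORDER BY claimid, statussetdate
    """
).fetchall()

for row in diffs:
    print(row)
```

One difference to keep in mind: DATEDIFF(minute, ...) in SQL Server counts minute boundaries crossed, while this sketch counts whole elapsed minutes, so the two can differ by one when the timestamps carry seconds.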
You can self-join the table using ROW_NUMBER() to get the previous date, then do a DATEDIFF on the two values:
;WITH cte AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY ClaimID ORDER BY StatusSetDate) Rn
FROM ClaimStatus
)
SELECT curr.*,
ISNULL(DATEDIFF(minute, prev.StatusSetDate, curr.StatusSetDate),0) AS diff_in_minutes
FROM cte curr
LEFT JOIN cte prev ON curr.ClaimID = prev.ClaimID AND curr.Rn = prev.Rn + 1
ORDER BY curr.ClaimID, curr.StatusSetDate

Retrieving most recent data in SQL

Total disclosure: I'm a SQL beginner.
I have a data set of certain accounting and governance metrics for US companies. It has about 15 columns and roughly 18 million rows. Each row is a unique combination of company, date and metric being measured. The columns include certain identifiers like isin number, ticker symbol, etc, the date the metric was released, the metric description, and the metric itself.
What I'm trying to do is write a query that will yield the NEWEST values for a certain metric for all companies. In my hopeless search over the past few days I've come to think that the GROUP BY clause may be what I'm looking for. However, it doesn't seem to do exactly what I need. I've got it working with just 2 columns: isin number (company identifier), and date. In other words, I can spit out a list that shows the most recent date for each company, but I'm not sure how to add more columns to this, how to specify what metric to look at.
Any guidance would be appreciated, even if it's just pointing me in the right direction towards what kind of commands I should be looking into.
Thanks!
EDIT: Wow. Thanks for the quick and thorough replies. And point taken on the clarity and example data sets/starting query. Update: I think I have it working. Here's what I used:
SELECT a1.["id_isin_number"], a1.["metric_description"], a1.["date_period_ends"], a1.["company_metric_value"], a2.maxdate
FROM [AGR Metrics].[dbo].[Audit_Integrity_Metric_Data_File_NA Original_0] a1
INNER JOIN (
SELECT a2.["id_isin_number"], MAX(a2.["date_period_ends"]) AS maxdate
FROM [AGR Metrics].[dbo].[Audit_Integrity_Metric_Data_File_NA Original_0] a2
GROUP BY a2.["id_isin_number"]
) a2
ON a1.["date_period_ends"] = a2.maxdate
AND a1.["id_isin_number"] = a2.["id_isin_number"]
WHERE a1.["metric_description"] = '"Litigation: Class Action"'
I'm looking over the responses now to make sure I'm doing this as efficiently as possible.
You can use the ROW_NUMBER() function for this (if using SQL Server 2005 or newer):
SELECT *
FROM (SELECT *,ROW_NUMBER() OVER(PARTITION BY isin ORDER BY [date] DESC) AS RowRank
FROM YourTable
)sub
WHERE RowRank = 1
Just list out the fields you want in place of * if you don't want them all returned.
The ROW_NUMBER() function adds a number to each row, PARTITION BY is optional and is used to define a group for which numbering will start over at 1, in this case, you want the most recent for each value of isin so we PARTITION BY that. ORDER BY is required and defines the order of the numbering, in this case by date.
Your current query can also be used, but the ROW_NUMBER() method is simpler and often more efficient:
SELECT a.*
FROM YourTable a
JOIN (SELECT isin, MAX([date]) AS [date] -- alias needed so b.[date] resolves in the join
FROM YourTable
GROUP BY isin
)b
ON a.isin = b.isin
AND a.[date] = b.[date]
Since you mention the date the metric was released, you can use it to sort your table with ORDER BY.
This is a very basic example that simply sorts the data and selects the top 1 value.
CREATE TABLE trialOne (
Id INT NULL,
NAME VARCHAR(50) NULL,
[Date] DATETIME NULL
)
INSERT INTO trialone VALUES(1,'john','2009-01-06 11:39:51.827')
INSERT INTO trialone VALUES(2,'joseph','2010-01-06' )
INSERT INTO trialone VALUES(3,'Ajay','2009-05-06' )
INSERT INTO trialone VALUES(4,'Dave','2009-11-06' )
INSERT INTO trialone VALUES(5,'jonny','2004-01-06')
INSERT INTO trialone VALUES(6,'sunny','2005-01-06')
INSERT INTO trialone VALUES(7,'elle','2013-01-06' )
INSERT INTO trialone VALUES(8,'mac','2012-01-06' )
INSERT INTO trialone VALUES(8,'Sam','2008-01-06' )
INSERT INTO trialone VALUES(10,'xxxxx','2013-08-06')
SELECT TOP(1)name FROM trialone ORDER BY Date DESC

SQL "over" partition WHERE date between two values

I have a query that partitions and ranks "Note" records, grouping them by ID_Task (users add notes for each task). I want to rank the notes by date, but I also want to restrict it so they're ranked between two dates.
I'm using SQL Server 2008. So far my SELECT looks like this:
SELECT Note.ID,
Note.ID_Task,
Note.[Days],
Note.[Date],
ROW_NUMBER() OVER (PARTITION BY ID_Task ORDER BY CAST([Date] AS DATE), Edited ASC) AS Rank
FROM
Note
WHERE
Note.Locked = 1 AND Note.Deleted = 0
Now, I assume that if I put the WHERE clause at the bottom, although they'll still have ranks, I might or might not get item with rank 1, as it might get filtered out. So is there a way I can only partition records WHERE , ignoring all of the others? I could partition a sub-query I guess.
The intention is to use the rank number to find the most recent note for each task, in another query. So in that query I'll join with this result WHERE rank = 1.
row_number() is evaluated after the WHERE clause has filtered the rows, so every partition that still has rows will always get a row 1.
For example:
declare @t table (id int)
insert @t values (3), (1), (4)
select row_number() over (order by id)
from @t
where id > 1
This prints:
1
2
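Applied to the question's case, this means a date-range filter can sit in the same query as the ROW_NUMBER(). The sketch below checks that in Python on an in-memory SQLite database (used only so it can be run without SQL Server 2008). The column name note_date and the sample rows are hypothetical stand-ins for Note.[Date] and the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Note (ID INTEGER, ID_Task INTEGER, note_date TEXT)")
# Hypothetical notes; note 1 falls outside the date range used below.
conn.executemany(
    "INSERT INTO Note VALUES (?, ?, ?)",
    [(1, 1, "2013-01-05"), (2, 1, "2013-02-10"), (3, 1, "2013-03-15"),
     (4, 2, "2013-02-01"), (5, 2, "2013-02-20")],
)

# The WHERE clause removes rows first; numbering then starts at 1
# among the rows that remain in each task.
ranked = conn.execute(
    """
    SELECT ID, ID_Task,
           ROW_NUMBER() OVER (PARTITION BY ID_Task ORDER BY note_date) AS rn
    FROM Note
    WHERE note_date BETWEEN '2013-02-01' AND '2013-03-31'
    ORDER BY ID_Task, rn
    """
).fetchall()

print(ranked)
```

Note 1 is filtered out, and note 2 becomes row 1 for task 1 rather than leaving a gap. To find the most recent note per task, as the question intends, the ORDER BY inside OVER would be descending.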