I'm working with a table that looks like this:
Start
https://i.stack.imgur.com/uibc3.png
My desired result would look like this:
Result
https://i.stack.imgur.com/v0sic.png
So I'm trying to select the row with the max value for each value of the two "combined" columns. If the amounts are tied (Part C), the outcome doesn't matter.
I tried ordering the table by max value and then using DISTINCT, but the result didn't turn out as expected.
Could you please offer a solution or some insight to this? Thanks in advance!
Use row_number():
select *
from (
select t.*, row_number() over(partition by part order by amount desc, zone) rn
from mytable t
) t
where rn = 1
For each part, this gives you the row with the highest amount; if there are top ties, column zone is used to break them.
If you want to allow ties, then use rank() instead, like:
rank() over(partition by part order by amount desc) rn
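To make the difference between row_number() and rank() concrete, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module, SQLite 3.25 or newer) purely so that it can be run anywhere, and it uses the sample rows that appear elsewhere in this thread; the table name mytable comes from the query above, while the helper name top_per_part is made up for illustration.

```python
import sqlite3

# Sample rows used elsewhere in this thread: (part, zone, amount).
ROWS = [
    ("PartA", "71H", 1), ("PartA", "75H", 2),
    ("PartB", "98D", 1), ("PartB", "98A", 3),
    ("PartC", "75H", 1), ("PartC", "52H", 1),
]


def top_per_part(ranking_function):
    """Return the (part, zone, amount) rows ranked first within their part."""
    if ranking_function not in ("row_number", "rank"):
        raise ValueError("unsupported ranking function")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (part TEXT, zone TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", ROWS)
    # row_number() gets the zone tie-breaker so the pick is deterministic;
    # rank() omits it so that tied rows all share rank 1.
    order_by = "amount DESC, zone" if ranking_function == "row_number" else "amount DESC"
    query = f"""
        SELECT part, zone, amount
        FROM (
            SELECT t.*,
                   {ranking_function}() OVER (PARTITION BY part ORDER BY {order_by}) AS rn
            FROM mytable t
        )
        WHERE rn = 1
        ORDER BY part, zone
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


one_per_part = top_per_part("row_number")  # exactly one row per part
with_ties = top_per_part("rank")           # PartC contributes both tied rows
print(one_per_part)
print(with_ties)
```

With row_number() PartC yields a single row (zone 52H, because it sorts first), whereas rank() keeps both PartC rows.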
You can achieve this by using a subquery:
DECLARE @T TABLE(
PART VARCHAR(50),
ZONE VARCHAR(10),
Amt INT)
INSERT INTO @T VALUES ('PartA','71H',1),('PartA','75H',2),('PartB','98D',1),('PartB','98A',3),('PartC','75H',1),('PartC','52H',1)
-- The subquery finds the max amount per part; MIN(Zone) breaks ties
SELECT M.PART,MIN(M.Zone) AS ZONE,S.AMOUNT
FROM @T M
INNER JOIN (
SELECT Part,MAX(Amt) as AMOUNT From @T
GROUP BY PART) S ON S.AMOUNT=M.Amt AND S.PART=M.PART
GROUP BY M.PART,S.AMOUNT
ORDER BY M.PART
Related
I want to get the last record from each group of duplicate records, and I want the non-duplicate records as well.
As depicted in the image below, I want rows 4, 5, 7 and 9 in my output.
The image below shows the main table. I need to concatenate the first two columns and then, based on that new column, keep the last row of each set of duplicates along with the non-duplicate rows.
I have tried the SQL code below.
DECLARE @dense_rank_demo AS TABLE (
Bid INT,
cid INT,
BCode NVARCHAR(10)
);
INSERT INTO @dense_rank_demo(Bid,cid,BCode)
VALUES(2393,1,'LAX'),(2394,54,'BRK'),(2395,57,'ONT'),(2393,1,'SAN'),(2393,1,'LAX'),(2393,1,'BRK'),(2394,54,'ONT'),(2395,57,'SAN'),(2394,1,'ONT');
SELECT * FROM @dense_rank_demo;
SELECT
CONCAT([Bid],'_',[cid]) as [Key],BCode,DENSE_RANK() over( order by CONCAT([Bid],'_',[cid]))
from @dense_rank_demo
From this SQL code I found that there is no column I can order by to get the expected result.
So I added a column named ID and made some other changes to get the expected output.
Here is the code with those changes.
DECLARE @dense_rank_demo AS TABLE (
ID INT IDENTITY(1,1),
Bid INT,
cid INT,
BCode NVARCHAR(10));
DECLARE @tableGroupKey TABLE
(
dr bigint,
[Key] VARCHAR(50)
)
INSERT INTO @dense_rank_demo(Bid,cid,BCode)
VALUES(2393,1,'LAX'),
(2394,54,'BRK'),
(2395,57,'ONT'),
(2393,1,'SAN'),
(2393,1,'LAX'),
(2393,1,'BRK'),
(2394,54,'ONT'),
(2395,57,'SAN'),
(2394,1,'ONT');
with [drd] as
(
select
concat([Bid],'_',[cid]) as [Key],
BCode,
dense_rank() over(partition by concat([Bid],'_',[cid]) order by ID) as
[dr]
from @dense_rank_demo
)
INSERT INTO @tableGroupKey(dr,[Key])
select MAX(dr) dr,[Key]
from [drd]
GROUP BY [Key]
SELECT *,CONCAT(Bid,'_',cid) AS [Key] FROM @dense_rank_demo [drd]
select Result.* FROM
(
SELECT *,CONCAT(Bid,'_',cid) AS [Key] ,
dense_rank() over(partition by concat([Bid],'_',[cid]) order by ID) as
[dr]
FROM @dense_rank_demo [drd]
) as [Result]
INNER JOIN @tableGroupKey [gk] ON
[Result].[Key] = [gk].[Key] AND [gk].dr = [Result].dr
ORDER BY [Result].ID
The Expected output is as below:
[Output]
The issue here is the ordering of the values within the result set. If you had a specific order to use, this would be fairly straightforward. However, you are relying on dense_rank() to consistently and reliably return the same values for the rows in the table. If we could use, for example, the alphabetical sort on the BCode column, then it would be simple to use a CTE and get the last/first one:
with [drd] as
(
select
concat([Bid],'_',[cid]) as [Key],
BCode,
dense_rank() over(partition by concat([Bid],'_',[cid]) order by Bcode desc) as [dr]
from @dense_rank_demo
)
select *
from [drd]
where dr = 1
As the order of dense_rank() is not guaranteed in your code, I'm not sure that this is feasible in a scalable way.
See this for more information about reliably sorted results: how does SELECT TOP works when no order by is specified?
You need one row per Bid, i.e. the latest one, but you have not specified the logic that defines the last row. Usually the last row is the most recently added one, so there is usually a timestamp that can be used to pick the latest row where there are duplicates.
The code below uses BCode in the ORDER BY clause, which means it picks the row whose BCode sorts first alphabetically; that may not be the row you expect unless that is how you define the most recent row. In general you will need to adjust the ORDER BY clause to your needs, but a timestamp makes the most sense.
row_number() generates the values 1 to n within each partition. In case there is a tie and you need both rows, use dense_rank instead. Adjust that based on your needs.
with main as (
select
concat(Bid, cid) as [key],
row_number() over(partition by concat(Bid, cid) order by Bcode) as rank_
from <table_name>
)
select * from main where rank_ = 1
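The point about needing an explicit ordering column can be sketched as follows. This is only an illustration under stated assumptions: it runs on SQLite through Python's sqlite3 module (3.25 or newer), and it treats insertion order as the definition of "last" by relying on an auto-incrementing ID, much like the IDENTITY column added in the question's second attempt. Whether insertion order is the right definition of "last" is for the asker to decide.

```python
import sqlite3

# Rows from the question, in insertion order. The auto-incremented ID
# stands in for the IDENTITY column the question's second attempt adds.
ROWS = [
    (2393, 1, "LAX"), (2394, 54, "BRK"), (2395, 57, "ONT"),
    (2393, 1, "SAN"), (2393, 1, "LAX"), (2393, 1, "BRK"),
    (2394, 54, "ONT"), (2395, 57, "SAN"), (2394, 1, "ONT"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dense_rank_demo ("
    "ID INTEGER PRIMARY KEY AUTOINCREMENT, Bid INTEGER, cid INTEGER, BCode TEXT)"
)
conn.executemany(
    "INSERT INTO dense_rank_demo (Bid, cid, BCode) VALUES (?, ?, ?)", ROWS
)

# Number the rows of each (Bid, cid) group from newest to oldest by ID,
# then keep the newest one. Groups with a single row are kept as well.
last_rows = conn.execute("""
    SELECT ID, Bid || '_' || cid AS "Key", BCode
    FROM (
        SELECT d.*,
               row_number() OVER (PARTITION BY Bid, cid ORDER BY ID DESC) AS rn
        FROM dense_rank_demo d
    )
    WHERE rn = 1
    ORDER BY ID
""").fetchall()
conn.close()

for row in last_rows:
    print(row)
```

Because ID is unique, row_number() never has to break a tie here, so the result is the same on every run.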
I have a table like below:
I want the results to look like the table below, which shows the start and end date of each balance. A plain GROUP BY will not work, because the balance should be grouped only over consecutive rows. Can you please help me with this?
There is most certainly a duplicate of this question, however, it is easier to crank out an answer than to search. These types of problems, data in the order inserted or shown with no order indicator, can simply be solved by two derivative queries. The first to use LAG or LEAD to check for gaps and the second to sum up the changes which are represented by a value of 1 as opposed to 0. The key here, using MSSQL Server, is SUM(x) OVER (ORDER BY Date ROWS UNBOUNDED PRECEDING).
DECLARE @T TABLE(balance INT, date DATETIME)
INSERT @T VALUES
(36,'1/1/2020'),
(36,'1/2/2020'),
(36,'1/3/2020'),
(24,'1/4/2020'),
(24,'1/5/2020'),
(36,'1/6/2020'),
(36,'1/7/2020'),
(37,'1/8/2020'),
(38,'1/9/2020')
;WITH GapsMarked AS
(
--If the previous value by date (the natural order of the data above) does not equal this value, mark it as a boundary
SELECT *,
IsBoundry = CASE WHEN ISNULL(LAG(balance) OVER (ORDER BY date),balance) = balance THEN 0 ELSE 1 END
FROM @T
)
,VirtualGroup AS
(
SELECT
*,
--This serializes the marked groups into sequential clusters
IslandsMarked = SUM(IsBoundry) OVER (ORDER BY Date ROWS UNBOUNDED PRECEDING)
FROM
GapsMarked
)
SELECT
MAX(balance) AS balance,
MIN(date) AS start,
MAX(date) AS [end]
FROM
VirtualGroup
GROUP BY
IslandsMarked
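For readers who want to step through the two stages (mark boundaries with LAG, then number the islands with a running SUM), here is a runnable sketch of the same idea. It assumes SQLite through Python's sqlite3 module (3.25 or newer) rather than SQL Server, stores the dates as ISO strings so that they sort correctly as text, and spells the flag column IsBoundary; the table name t is hypothetical.

```python
import sqlite3

# Same sample data as above, with dates written as ISO strings.
ROWS = [
    (36, "2020-01-01"), (36, "2020-01-02"), (36, "2020-01-03"),
    (24, "2020-01-04"), (24, "2020-01-05"),
    (36, "2020-01-06"), (36, "2020-01-07"),
    (37, "2020-01-08"), (38, "2020-01-09"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (balance INTEGER, date TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)

islands = conn.execute("""
    WITH GapsMarked AS (
        -- 1 when the balance differs from the previous row by date, else 0
        SELECT balance, date,
               CASE WHEN COALESCE(LAG(balance) OVER (ORDER BY date), balance) = balance
                    THEN 0 ELSE 1 END AS IsBoundary
        FROM t
    ),
    VirtualGroup AS (
        -- the running total of boundaries gives each consecutive run its own number
        SELECT balance, date,
               SUM(IsBoundary) OVER (ORDER BY date ROWS UNBOUNDED PRECEDING) AS IslandsMarked
        FROM GapsMarked
    )
    SELECT MAX(balance) AS balance, MIN(date) AS start_date, MAX(date) AS end_date
    FROM VirtualGroup
    GROUP BY IslandsMarked
    ORDER BY start_date
""").fetchall()
conn.close()

for island in islands:
    print(island)
```

Note that balance 36 appears twice in the output, once per consecutive run, which is exactly what a plain GROUP BY balance cannot produce.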
select balance, min(start), max(end) from table where balance is in (
select balance from table
group by balance)
Hope it will help you
I am having a little trouble figuring out how to select just the third or fourth row in a query I am writing. Any help would be greatly appreciated.
This is an example of the code I came up with, this however only selects the first row.
Left Outer Join (select ap_attachments.ap_table_key, ap_description, ap_creation_date, ap_creation_time, ap_file_name, ap_attach_id
from ap_attachments
inner join (select Min(ap_attachment_id) ap_attach_id, ap_table_key
from ap_attachments
where ap_file_name like '%jpg%'
group by ap_table_key) C
On ap_attachments.ap_attachment_id = C.ap_attach_id) apImgThree_attach
On apImgTwo_attach.ap_table_key = order_link.to_order_id
You can do this with the ROW_NUMBER() function:
select ap_attachment_id, ap_table_key,ROW_NUMBER() OVER(PARTITION BY ap_table_key ORDER BY ap_attachment_id) AS RN
from ap_attachments
where ap_file_name like '%jpg%'
Then you can specify which row you'd like to return using the RN value. This may require some adapting depending on your source data, the DENSE_RANK() function may be more appropriate.
The ROW_NUMBER() function assigns a number to each row. PARTITION BY is optional, and is used to start the numbering over for each value in that group, i.e. if you PARTITION BY Some_Date, then the numbering starts over at 1 for each unique date value. ORDER BY defines the order in which rows are numbered, and is required in the ROW_NUMBER() function.
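As a rough sketch of "specify which row you'd like using the RN value", the snippet below wraps the numbered query in an outer query and filters on RN. It assumes SQLite through Python's sqlite3 module (3.25 or newer); the sample rows and the helper name nth_jpg_per_key are invented for illustration, and only the column names come from the question.

```python
import sqlite3

# Hypothetical attachment rows: (ap_attachment_id, ap_table_key, ap_file_name).
ROWS = [
    (1, 100, "a.jpg"), (2, 100, "b.pdf"), (3, 100, "c.jpg"),
    (4, 100, "d.jpg"), (5, 100, "e.jpg"),
    (6, 200, "f.jpg"), (7, 200, "g.jpg"),
]


def nth_jpg_per_key(n):
    """Return (ap_attachment_id, ap_table_key) of the n-th jpg for each key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ap_attachments ("
        "ap_attachment_id INTEGER, ap_table_key INTEGER, ap_file_name TEXT)"
    )
    conn.executemany("INSERT INTO ap_attachments VALUES (?, ?, ?)", ROWS)
    rows = conn.execute("""
        SELECT ap_attachment_id, ap_table_key
        FROM (
            SELECT ap_attachment_id, ap_table_key,
                   ROW_NUMBER() OVER (PARTITION BY ap_table_key
                                      ORDER BY ap_attachment_id) AS RN
            FROM ap_attachments
            WHERE ap_file_name LIKE '%jpg%'
        )
        WHERE RN = ?
        ORDER BY ap_table_key
    """, (n,)).fetchall()
    conn.close()
    return rows


third = nth_jpg_per_key(3)  # key 200 has only two jpgs, so it drops out
first = nth_jpg_per_key(1)
print(third)
print(first)
```

A key with fewer than n matching rows simply produces no row, which fits the Left Outer Join in the question: the joined columns come back as NULL for such keys.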
Look up the docs on LEAD and LAG. You can also use the PARTITION BY clause to restrict the window to a specific date. For example:
declare @table table(
[flower] [sysname]);
insert into @table
([flower])
values (N'rose'),
(N'tulip'),
(N'chamomile'),
(N'lily');
select [flower] from @table order by [flower];
select [flower]
, lag ([flower]
, 1
, 0)
over (
order by [flower] desc) as [previous_flower]
, lead ([flower]
, 1
, 0)
over (
order by [flower] desc) as [next_flower]
from @table;
I need to get the latest price of an item (as part of a larger select statement) and I can't quite figure it out.
Table:
ITEMID DATE SALEPRICE
1 1/1/2014 10
1 2/2/2014 20
2 3/3/2014 15
2 4/4/2014 13
I need the output of the select to be '20' when looking for item 1 and '13' when looking for item 2 as per the above example.
I am using Oracle SQL
The most readable/understandable SQL (in my opinion) would be this:
select saleprice from `table` t
where t.date =
(
select max(date) from `table` t2 where t2.itemid = t.itemid
)
and t.itemid = 1; -- change item id here
assuming your table's name is table and you only have one price per day and item (else the where condition would match more than one row per item). Alternatively, the subselect could be written as a self-join (should not make a difference in performance).
I'm not sure about the OVER/PARTITION used by the other answers. Maybe they could be optimized to better performance depending on the DBMS.
Maybe something like this:
Test data
DECLARE @tbl TABLE(ITEMID int,DATE DATETIME,SALEPRICE INT)
INSERT INTO @tbl
VALUES
(1,'1/1/2014',10),
(1,'2/2/2014',20),
(2,'3/3/2014',15),
(2,'4/4/2014',13)
Query
;WITH CTE
AS
(
SELECT
ROW_NUMBER() OVER(PARTITION BY ITEMID ORDER BY [DATE] DESC) AS rowNbr,
tbl.*
FROM
@tbl AS tbl
)
SELECT
*
FROM
CTE
WHERE CTE.rowNbr=1
Try this!
It works in SQL Server and may also work in Oracle SQL:
select * from
(
select *,rn=row_number()over(partition by ITEMID order by DATE desc) from table
)x
where x.rn=1
You need ROW_NUMBER() to assign a number to every record, partitioned by ITEMID so that the numbering restarts for each item. Because the rows are ordered by date descending, the row numbered 1 is the latest record for that item.
SEE DEMO
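To check that the ROW_NUMBER() approach and the correlated MAX(date) subquery from the earlier answer agree on the question's sample data, here is a small sketch. It assumes SQLite through Python's sqlite3 module (3.25 or newer) rather than Oracle, a hypothetical table name prices, and ISO-formatted dates so that they sort correctly as text.

```python
import sqlite3

# Sample data from the question: (ITEMID, DATE, SALEPRICE).
ROWS = [
    (1, "2014-01-01", 10), (1, "2014-02-02", 20),
    (2, "2014-03-03", 15), (2, "2014-04-04", 13),
]


def latest_prices():
    """Return the latest price per item, computed two different ways."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE prices (ITEMID INTEGER, "DATE" TEXT, SALEPRICE INTEGER)')
    conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", ROWS)
    # Approach 1: number each item's rows from newest to oldest, keep row 1.
    by_row_number = conn.execute("""
        SELECT ITEMID, SALEPRICE
        FROM (
            SELECT p.*,
                   ROW_NUMBER() OVER (PARTITION BY ITEMID ORDER BY "DATE" DESC) AS rn
            FROM prices p
        )
        WHERE rn = 1
        ORDER BY ITEMID
    """).fetchall()
    # Approach 2: keep the rows whose date equals that item's maximum date.
    by_subquery = conn.execute("""
        SELECT t.ITEMID, t.SALEPRICE
        FROM prices t
        WHERE t."DATE" = (SELECT MAX(t2."DATE") FROM prices t2
                          WHERE t2.ITEMID = t.ITEMID)
        ORDER BY t.ITEMID
    """).fetchall()
    conn.close()
    return by_row_number, by_subquery


by_row_number, by_subquery = latest_prices()
print(by_row_number)
print(by_subquery)
```

The two differ only when an item has several rows on its latest date: the subquery version returns all of them, while ROW_NUMBER() returns exactly one.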
Total disclosure: I'm a SQL beginner.
I have a data set of certain accounting and governance metrics for US companies. It has about 15 columns and roughly 18 million rows. Each row is a unique combination of company, date and metric being measured. The columns include certain identifiers like isin number, ticker symbol, etc, the date the metric was released, the metric description, and the metric itself.
What I'm trying to do is write a query that will yield the NEWEST values for a certain metric for all companies. In my hopeless search over the past few days I've come to think that the GROUP BY clause may be what I'm looking for. However, it doesn't seem to do exactly what I need. I've got it working with just 2 columns: isin number (company identifier), and date. In other words, I can spit out a list that shows the most recent date for each company, but I'm not sure how to add more columns to this, how to specify what metric to look at.
Any guidance would be appreciated, even if it's just pointing me in the right direction towards what kind of commands I should be looking into.
Thanks!
EDIT: Wow. Thanks for the quick and thorough replies. And point taken on the clarity and example data sets/starting query. Update: I think I have it working. Here's what I used:
SELECT a1.["id_isin_number"], a1.["metric_description"], a1.["date_period_ends"], a1.["company_metric_value"], a2.maxdate
FROM [AGR Metrics].[dbo].[Audit_Integrity_Metric_Data_File_NA Original_0] a1
INNER JOIN (
SELECT a2.["id_isin_number"], MAX(a2.["date_period_ends"]) AS maxdate
FROM [AGR Metrics].[dbo].[Audit_Integrity_Metric_Data_File_NA Original_0] a2
GROUP BY a2.["id_isin_number"]
) a2
ON a1.["date_period_ends"] = a2.maxdate
AND a1.["id_isin_number"] = a2.["id_isin_number"]
WHERE a1.["metric_description"] = '"Litigation: Class Action"'
I'm looking over the responses now to make sure I'm doing this as efficiently as possible.
You can use the ROW_NUMBER() function for this (if using SQL Server 2005 or newer):
SELECT *
FROM (SELECT *,ROW_NUMBER() OVER(PARTITION BY isin ORDER BY [date] DESC) AS RowRank
FROM YourTable
)sub
WHERE RowRank = 1
Just list out the fields you want in place of * if you don't want them all returned.
The ROW_NUMBER() function adds a number to each row, PARTITION BY is optional and is used to define a group for which numbering will start over at 1, in this case, you want the most recent for each value of isin so we PARTITION BY that. ORDER BY is required and defines the order of the numbering, in this case by date.
Your current query can also be used, but the ROW_NUMBER() method is simpler and more efficient:
SELECT a.*
FROM YourTable a
JOIN (SELECT isin, MAX([date]) AS [date]
FROM YourTable
GROUP BY isin
)b
ON a.isin = b.isin
AND a.[date] = b.[date]
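One detail worth sketching is where the metric filter goes. If the filter is applied inside the subquery, before the rows are numbered, each company's newest row for that metric is returned even when the company has newer rows for other metrics. The example below assumes SQLite through Python's sqlite3 module (3.25 or newer); the table name metrics, the column names metric and value, and all sample rows are made up for illustration, while isin and date follow the answer above.

```python
import sqlite3

# Hypothetical rows: (isin, metric, date, value).
ROWS = [
    ("US0001", "Litigation: Class Action", "2012-12-31", 1.0),
    ("US0001", "Litigation: Class Action", "2013-12-31", 2.0),
    ("US0001", "Other Metric", "2014-03-31", 9.0),
    ("US0002", "Litigation: Class Action", "2013-06-30", 5.0),
    ("US0002", "Other Metric", "2013-06-30", 7.0),
]


def newest_metric_rows(metric):
    """Return each company's most recent (isin, date, value) for one metric."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE metrics (isin TEXT, metric TEXT, date TEXT, value REAL)")
    conn.executemany("INSERT INTO metrics VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute("""
        SELECT isin, date, value
        FROM (
            SELECT m.*,
                   ROW_NUMBER() OVER (PARTITION BY isin ORDER BY date DESC) AS RowRank
            FROM metrics m
            WHERE metric = ?      -- filter first, then number the rows
        )
        WHERE RowRank = 1
        ORDER BY isin
    """, (metric,)).fetchall()
    conn.close()
    return rows


newest = newest_metric_rows("Litigation: Class Action")
print(newest)
```

By contrast, a join against MAX(date) computed over all metrics, combined with a metric filter in the outer WHERE, would drop US0001 in this data, because that company's overall latest date belongs to a different metric.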
Since you mention the date the metric was released, you can use it to sort your table with ORDER BY.
This is a very basic example that simply sorts the data and selects the top 1 value.
Please refer to this:
CREATE TABLE trialOne (
Id INT NULL,
NAME VARCHAR(50) NULL,
[Date] DATETIME NULL
)
SELECT * FROM dbo.ETProgram
INSERT INTO trialone VALUES(1,'john','2009-01-06 11:39:51.827')
INSERT INTO trialone VALUES(2,'joseph','2010-01-06' )
INSERT INTO trialone VALUES(3,'Ajay','2009-05-06' )
INSERT INTO trialone VALUES(4,'Dave','2009-11-06' )
INSERT INTO trialone VALUES(5,'jonny','2004-01-06')
INSERT INTO trialone VALUES(6,'sunny','2005-01-06')
INSERT INTO trialone VALUES(7,'elle','2013-01-06' )
INSERT INTO trialone VALUES(8,'mac','2012-01-06' )
INSERT INTO trialone VALUES(8,'Sam','2008-01-06' )
INSERT INTO trialone VALUES(10,'xxxxx','2013-08-06')
SELECT TOP(1)name FROM trialone ORDER BY Date DESC