Distinct on three columns in sql server 2008 - sql

I have a 20 Columns Table which have duplicate value but not in all columns. that what normal distinct clause not working on it..
so i want to apply distinct on three columns (name,fname,dob) , but how? . Please give me any solution .

You could use ROW_NUMBER with a common-table-expression(CTE):
WITH CTE AS
(
SELECT t.*, RN = ROW_NUMBER() OVER (PARTITION BY name,fname,dob ORDER BY name,fname,dob)
FROM dbo.TableName t
)
SELECT * FROM CTE WHERE RN = 1
This takes one per group. Change ORDER BY name,fname,dob according to your logic.

Related

Listing multiple columns in a single row in SQL

(select ID,EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE,ROW_NUMBER() OVER(PARTITION BY EXTERNAL_TRANSACTION_ID ORDER BY ID ) AS SEQNUM
from AC_POS_TRANSACTION_TRK aptt WHERE [RESULT] ='Success'
GROUP BY ID, EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE )
Hello,
On above query, I want to get rows of transaction id's which has seqnum=1 and seqnum=2
But if that transaction id has no second row (seqnum=2), I dont want to get any row for that transaction id.
Thanks!!
Something like this
Not 100% sure if this is correct without you table definition, but my understanding is that you want to EXCLUDE records if that record has an entry with seqnum=2 -- you can't use a where clause alone because that would still return seqnum = 1.
You can use an exists /not exists or in/not in clause like this
(select ID,EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE,ROW_NUMBER() OVER(PARTITION BY EXTERNAL_TRANSACTION_ID ORDER BY ID ) AS SEQNUM
from AC_POS_TRANSACTION_TRK aptt WHERE [RESULT] ='Success'
and not exists ( select 1 from AC_POS_TRANSACTION_TRK a where a.id = aptt.id
and a.seqnum = 2)
GROUP BY ID, EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE )
basically what this does is it excludes records if a record exists as specified in the NOT EXISTS query.
One option you can try is to add a count of rows per group using the same partioning critera and then filter accordingly. Not entirely sure about your query without seeing it in context and with sample data - there's no aggregation so why use group by?
However can you try something along these lines
select * from (
select ID,EXTERNAL_TRANSACTION_ID,EXTERNAL_TRANSACTION_TYPE,
Row_Number() over(partition by EXTERNAL_TRANSACTION_ID order by ID) as SEQNUM,
Count(*) over(partition by EXTERNAL_TRANSACTION_ID) Qty
from AC_POS_TRANSACTION_TRK
where [RESULT] ='Success'
)x
where SEQNUM in (1,2) and Qty>1
This should do the job.
With Qry As (
-- Your original query goes here
),
Select Qry.*
From Qry
Where Exists (
Select *
From Qry Qry1
Where Qry1.EXTERNAL_TRANSACTION_ID = Qry.EXTERNAL_TRANSACTION_ID
And Qry1.SEQNUM = 1
)
And Exists (
Select *
From Qry Qry2
Where Qry2.EXTERNAL_TRANSACTION_ID = Qry.EXTERNAL_TRANSACTION_ID
And Qry2.SEQNUM = 2
)
BTW, your original query looks problematic to me, specifically I think that instead of a GROUP BY columns those columns should be in the PARTITION BY clause of the OVER statement, but without knowing more about the table structures and what you're trying to achieve, I could not say for sure.

Find the second largest value with Groupings

In SQL Server, I am attempting to pull the second latest NOTE_ENTRY_DT_TIME (items highlighted in screenshot). With the query written below it still pulls the latest date (I believe it's because of the grouping but the grouping is required to join later). What is the best method to achieve this?
SELECT
hop.ACCOUNT_ID,
MAX(hop.NOTE_ENTRY_DT_TIME) AS latest_noteid
FROM
NOTES hop
WHERE
hop.GEN_YN IS NULL
AND hop.NOTE_ENTRY_DT_TIME < (SELECT MAX(hope.NOTE_ENTRY_DT_TIME)
FROM NOTES hope
WHERE hop.GEN_YN IS NULL)
GROUP BY
hop.ACCOUNT_ID
Data sample in the table:
One of the "easier" ways to get the Nth row in a group is to use a CTE and ROW_NUMBER:
WITH CTE AS(
SELECT Account_ID,
Note_Entry_Dt_Time,
ROW_NUMBER() OVER (PARTITION BY AccountID ORDER BY Note_Entry_Dt_Time DESC) AS RN
FROM dbo.YourTable)
SELECT Account_ID,
Note_Entry_Dt_Time
FROM CTE
WHERE RN = 2;
Of course, if an ACCOUNT_ID only has 1 row, then it will not be returned in the result set.
The OP's statement "The row will not always be 2." from the comments conflicts with their statement "I am attempting to pull the second latest NOTE_ENTRY_DT_TIME" in the question. At a best guess, this means that the OP has rows with the same date, that could be the "latest" date. If so, then would simply need to replace ROW_NUMBER with DENSE_RANK. Their sampple data, however, doesn't suggest this is the case.
You can use window functions:
select *
from (
select
n.*,
row_number() over(partition by account_id order by note_entry_dt_time desc) rn
from notes n
) t
where rn = 2

Select Single and Duplicate Row and Return Multiple Columns

I'm currently working with my database in SQL Server. I have a table with 23 fields and it has single and duplicate rows. How can I select both of them without having any duplicate data.
I have try this query:
SELECT
Code, Stuff, and other fields....
FROM
(
SELECT
*,ROW_NUMBER() OVER (PARTITION BY Code ORDER BY Code) AS RN
FROM
my_table
)t
WHERE RN = 1
The above code just return the data from the duplicate rows. But, I want the "single rows" also returned.
This is the illustration.
Thank you for the help.
Could it be as simple as:
SELECT DISTINCT Code, Stuff FROM MyTable
Or, just add stuff to the partition by clause:
PARTITION BY Code,Stuff ORDER BY Code
Try This
You may need to add Stuff and more fields in Partition BY
SELECT
Code, Stuff
FROM
(
SELECT
*,ROW_NUMBER() OVER (PARTITION BY Code,Stuff ORDER BY Code) AS RN
FROM
my_table
)t
WHERE RN = 1

How to retrieve specific rows from SQL Server table?

I was wondering is there a way to retrieve, for example, 2nd and 5th row from SQL table that contains 100 rows?
I saw some solutions with WHERE clause but they all assume that the column on which WHERE clause is applied is linear, starting at 1.
Is there other way to query a SQL Server table for a specific rows in case table doesn't have a column whose values start at 1?
P.S. - I know for a solution with temporary tables, where you copy your select statement output and add a linear column to the table. I am using T-SQL
Try this,
SELECT * FROM (
SELECT
ROW_NUMBER() OVER (ORDER BY ColumnName ASC) AS rownumber
FROM TableName
) as temptablename
WHERE rownumber IN (2,5)
With SQL Server:
; WITH Base AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY id) RN FROM YourTable
)
SELECT *
FROM Base WHERE RN IN (2, 5)
The id that you'll have to replace with your primary key or your ordering, YourTable that is your table.
It's a CTE (Common Table Expression) so it isn't a temporary table. It's something that will be expanded together with your query.
There is no 2nd or 5th row in the table.
There is only the 2nd or 5th result in a resultset that you return, as determined by the order you specify in that query.
If you are on SQL Server 2005 or above, you could use Row_Number() function. Ex:
;With CTE as (
select col1, ..., row_number() over (order by yourOrderingCol) rn
from yourTable
)
select col1,...
from cte
where rn in (2,5)
Please note that yourOrderingCol will decide the value of row number (i.e. rn).

Return rows between a specific range, with one select statement

I'm looking to some expresion like this (using SQL Server 2008)
SELECT TOP 10 columName FROM tableName
But instead of that I need the values between 10 and 20. And I wonder if there is a way of doing it using only one SELECT statement.
For example this is useless:
SELECT columName FROM
(SELECT ROW_NUMBER() OVER(ORDER BY someId) AS RowNum, * FROM tableName) AS alias
WHERE RowNum BETWEEN 10 AND 20
Because the select inside brackets is already returning all the results, and I'm looking to avoid that, due to performance.
Use SQL Server 2012 to fetch/skip!
SELECT SalesOrderID, SalesOrderDetailID, ProductID, OrderQty, UnitPrice, LineTotal
FROM AdventureWorks2012.Sales.SalesOrderDetail
OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY;
There's nothing better than you're describing for older versions of sql server. Maybe use CTE, but unlikely to make a difference.
WITH NumberedMyTable AS
(
SELECT
Id,
Value,
ROW_NUMBER() OVER (ORDER BY Id) AS RowNumber
FROM
MyTable
)
SELECT
Id,
Value
FROM
NumberedMyTable
WHERE
RowNumber BETWEEN #From AND #To
or, you can remove top 10 rows and then get next 10 rows, but I double anyone would want to do that.
There is a trick with row_number that does not involve sorting all the rows.
Try this:
SELECT columName
FROM (SELECT ROW_NUMBER() OVER(ORDER BY (select NULL as noorder)) AS RowNum, *
FROM tableName
) as alias
WHERE RowNum BETWEEN 10 AND 20
You cannot use a constant in the order by. However, you can use an expression that evaluates to a constant. SQL Server recognizes this and just returns the rows as encountered, properly enumerated.
Why do you think SQL Server would evaluate the entire inner query? Assuming your sort column is indexed, it'll just read the first 20 values. If you're really nervous you could do this:
Select
Id
From (
Select Top 20 -- note top 20
Row_Number() Over(Order By Id) As RowNum,
Id
From
dbo.Test
Order By
Id
) As alias
Where
RowNum Between 10 And 20
Order By
Id
but I'm pretty sure the query plan is the same either way.
(Really) Fixed as per Aaron's comment.
http://sqlfiddle.com/#!3/db162/6
One more option
SELECT TOP(11) columName
FROM dbo.tableName
ORDER BY
CASE WHEN ROW_NUMBER() OVER (ORDER BY someId) BETWEEN 10 AND 20
THEN ROW_NUMBER() OVER (ORDER BY someId) ELSE NULL END DESC
You could create a temp table that is ordered the way you want like:
SELECT ROW_NUMBER() OVER(ORDER BY someId) AS RowNum, * FROM tableName
into ##tempTable
...
That way you have an ordered list of rows.
and can just query by row number the subsequent times instead of doing the inner query multiple times.