Select unique records [duplicate] - sql

This question already has answers here:
Fetch the rows which have the Max value for a column for each distinct value of another column
(35 answers)
Closed 2 years ago.
I'm working with a table that has about 50 columns and 100,000 rows.
One column, call it TypeID, has 10 possible values:
1 through 10.
There can be 10,000 records of TypeID = 1, 10,000 records of TypeID = 2, and so on.
I want to run a SELECT statement that will return 1 record for each distinct TypeID.
So something like
TypeID JobID Language BillingDt etc
------------------------------------------------
1 123 EN 20130103 etc
2 541 FR 20120228 etc
3 133 FR 20110916 etc
4 532 SP 20130822 etc
5 980 EN 20120714 etc
6 189 EN 20131009 etc
7 980 SP 20131227 etc
8 855 EN 20111228 etc
9 035 JP 20130615 etc
10 103 EN 20100218 etc
I've tried:
SELECT DISTINCT TypeID, JobID, Language, BillingDt, etc
But that produces multiple TypeID rows of the same value. I get a whole bunch of '4', '10', and so on.
This is an ORACLE Database that I'm working with.
Any advice would be greatly appreciated; thanks!

You can use ROW_NUMBER() to get the top n per group:
SELECT TypeID,
JobID,
Language,
BillingDt,
etc
FROM ( SELECT TypeID,
JobID,
Language,
BillingDt,
etc,
ROW_NUMBER() OVER(PARTITION BY TypeID ORDER BY JobID) RowNumber
FROM T
) T
WHERE RowNumber = 1;
You may need to change the ORDER BY clause to fit your requirements. Since you've not said how to pick one row per TypeID, I had to guess.

WITH RankedQuery AS
(
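SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY TypeID ORDER BY [ordercolumn] DESC) AS rn
FROM [table] t
)
SELECT *
FROM RankedQuery
WHERE rn = 1;
This returns the top row for each TypeID. Replace [ordercolumn] with the column that decides which row counts as the top one, rather than settling for an arbitrary row. The alias in t.* is needed because Oracle does not accept a bare * followed by other expressions in the select list.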
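To see how the ORDER BY inside OVER decides which row survives, here is a small sketch that runs the same ROW_NUMBER() pattern against SQLite from Python. SQLite (3.25 or later, for window functions) stands in for Oracle here, and the sample rows and the helper name one_row_per_type are made up for illustration.

```python
import sqlite3

# Hypothetical miniature of the asker's table; the name T follows the first answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (TypeID INTEGER, JobID INTEGER, Language TEXT, BillingDt TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?, ?)",
    [
        (1, 123, "EN", "20130103"),
        (1, 456, "FR", "20120511"),
        (2, 541, "FR", "20120228"),
        (2, 999, "EN", "20130701"),
        (2, 777, "SP", "20110101"),
    ],
)


def one_row_per_type(order_by):
    """Return one row per TypeID; order_by picks which row ranks first.

    order_by is interpolated into the SQL text, so pass trusted literals only.
    """
    sql = f"""
        SELECT TypeID, JobID, Language, BillingDt
        FROM (SELECT T.*,
                     ROW_NUMBER() OVER (PARTITION BY TypeID ORDER BY {order_by}) AS rn
              FROM T) AS ranked
        WHERE rn = 1
        ORDER BY TypeID
    """
    return conn.execute(sql).fetchall()


lowest_job = one_row_per_type("JobID")             # smallest JobID per TypeID
latest_bill = one_row_per_type("BillingDt DESC")   # most recent BillingDt per TypeID
print(lowest_job)
print(latest_bill)
```

Both calls return exactly one row per TypeID; only the choice of row within each group changes.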

Related

Snowflake: Repeating rows based on column value

How can I repeat rows based on a column value in Snowflake using SQL?
I tried a few methods, such as dual and connect by, but they did not work.
I have two columns: Id and Quantity.
For each ID, there are different values of Quantity.
So if you have a count, you can use a generator:
with ten_rows as (
select row_number() over (order by null) as rn
from table(generator(ROWCOUNT=>10))
), data(id, count) as (
select * from values
(1,2),
(2,4)
)
SELECT
d.*
,r.rn
from data as d
join ten_rows as r
on d.count >= r.rn
order by 1,3;
ID   COUNT   RN
---------------
1    2       1
1    2       2
2    4       1
2    4       2
2    4       3
2    4       4
OK, let's start by generating some data. We will create 10 rows, each with a QTY that is randomly chosen as 1 or 2.
Next we want to duplicate the rows with a QTY of 2 and leave the rows with QTY = 1 as they are.
Simply stack SPLIT_TO_TABLE() and REPEAT() with a LATERAL join and voila.
You can change all of the parameters to suit your needs. This runs fast and, in my opinion, works better than joining against a generated table.
WITH TEN_ROWS AS (
SELECT ROW_NUMBER() OVER (ORDER BY NULL) AS SOME_ID, UNIFORM(1, 2, RANDOM()) AS QTY
FROM TABLE(GENERATOR(ROWCOUNT => 10))
)
SELECT
TEN_ROWS.*
FROM
TEN_ROWS,
-- REPEAT emits QTY-1 delimiters, so splitting on that delimiter yields QTY rows
LATERAL SPLIT_TO_TABLE(REPEAT(',', QTY - 1), ',') ALTERNATIVE_APPROACH;
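The join idea from the first answer (match each row against the numbers 1..N and keep the numbers up to its quantity) is not Snowflake-specific. As a rough sketch, here it is in SQLite driven from Python, with a recursive CTE swapped in for Snowflake's GENERATOR; the table name data and the upper bound of 10 are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, quantity INTEGER)")
conn.executemany("INSERT INTO data VALUES (?, ?)", [(1, 2), (2, 4)])

# numbers(rn) plays the role of the ten_rows CTE: the integers 1..10.
# The bound must be at least the largest quantity, or rows will be dropped.
rows = conn.execute("""
    WITH RECURSIVE numbers(rn) AS (
        SELECT 1
        UNION ALL
        SELECT rn + 1 FROM numbers WHERE rn < 10
    )
    SELECT d.id, d.quantity, n.rn
    FROM data AS d
    JOIN numbers AS n ON d.quantity >= n.rn
    ORDER BY d.id, n.rn
""").fetchall()

for row in rows:
    print(row)
```

Each source row appears quantity times, with rn counting the copies, which matches the ID/COUNT/RN output shown for the generator answer.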

pivot function in Bigquery for transpose, wrong value falling in

I'm trying to use pivot function to transpose rows however action_type=4 keeps falling to the wrong column after I ran my query. Below is the sample data:
SessionId   action_type   products
----------------------------------
122         3             5
122         4             1
127         3             2
189         4             1
Ideal output will look like below:
SessionId   action_type_1   products_1   action_type_2   products_2
-------------------------------------------------------------------
122         3               5            4               1
127         3               2
189                                      4               1
I have written below query trying to do the transpose:
select * from
(select * except (SessionId),
max(SessionId) over win SessionId,
row_number() over (win order by SessionId, action_type, products) tab
from
`xxx.sample.xxx`
window win as (partition by SessionId)
)
pivot (
any_value (action_type) as action_type ,
any_value(products) as products for tab in (1,2))
However, this query returns some strange results. For example, I see the value 4 under action_type_1, which is not what I expected: action_type_1 should only have the value 3, because I want action_type_1 = 3 and action_type_2 = 4. Can anyone help look at my query? Any advice is appreciated!
I think the query below is what you are looking for:
select * from your_table
pivot (
any_value (action_type) as action_type ,
any_value(products) as products
for action_type in (3,4)
)
So instead of relying on the row offset, you simply pivot directly on action_type.
If for some reason you need the output columns named _1 and _2, use the trick below:
select * from your_table
pivot (
any_value (action_type) as action_type ,
any_value(products) as products
for case action_type when 3 then 1 when 4 then 2 end in (1,2)
)
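For anyone who wants to check the mapping without a BigQuery project, the same "action_type 3 goes to slot 1, action_type 4 goes to slot 2" idea can be written as plain conditional aggregation. This is only a sketch: it runs in SQLite from Python, uses MAX in place of ANY_VALUE, and the table name sessions is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (SessionId INTEGER, action_type INTEGER, products INTEGER)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?)",
    [(122, 3, 5), (122, 4, 1), (127, 3, 2), (189, 4, 1)],
)

# Each output column only looks at rows with one specific action_type,
# so a session without that action_type gets NULL (None) in that slot.
pivoted = conn.execute("""
    SELECT SessionId,
           MAX(CASE WHEN action_type = 3 THEN action_type END) AS action_type_1,
           MAX(CASE WHEN action_type = 3 THEN products END)    AS products_1,
           MAX(CASE WHEN action_type = 4 THEN action_type END) AS action_type_2,
           MAX(CASE WHEN action_type = 4 THEN products END)    AS products_2
    FROM sessions
    GROUP BY SessionId
    ORDER BY SessionId
""").fetchall()

for row in pivoted:
    print(row)
```

Session 189 ends up with empty _1 columns and its values in the _2 columns, which is the behaviour the question asks for and the row-offset approach cannot give.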

Complex SQL query or queries

I looked at other examples, but I don't know enough about SQL to adapt it to my needs. I have a table that looks like this:
ID Month NAME COUNT First LAST TOTAL
------------------------------------------------------
1 JAN2013 fred 4
2 MAR2013 fred 5
3 APR2014 fred 1
4 JAN2013 Tom 6
5 MAR2014 Tom 1
6 APR2014 Tom 1
This could be in separate queries, but I need 'First' to equal the first month in which a particular name is used, so every row with fred would have JAN2013 in the 'First' field, for example. I need the 'Last' column to equal the month of the last record for each name. Finally, I need the 'Total' column to be the sum of all the counts for each name, so every row with fred would have a total of 10 in this sample data. This is over my head. Can one of you assist?
This is crude but should do the trick. I renamed your fields a bit because you are using a bunch of "RESERVED" sql words and that is bad form.
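;WITH cte as
(
Select
[NAME]
,[txtMONTH]
,[nmCOUNT]
-- ORDER BY txtMONTH is chronological only if txtMONTH is a date; text such as 'JAN2013' sorts alphabetically
,ROW_NUMBER() over (partition by [NAME] order by [txtMONTH] ASC) as FirstMonth
,ROW_NUMBER() over (partition by [NAME] order by [txtMONTH] DESC) as LastMonth
-- a windowed SUM puts the per-name total on every row, so no GROUP BY is needed
,SUM([nmCOUNT]) over (partition by [NAME]) as TotNameCount
From [Table]
)
,cteFirst as
(
Select
[NAME]
,[txtMONTH] as ansFirst
From cte
Where FirstMonth = 1
)
,cteLast as
(
Select
[NAME]
,[txtMONTH] as ansLast
From cte
Where LastMonth = 1
)
-- one output row per source row, each carrying its name's first month, last month, and total
Select c.[NAME], c.[txtMONTH], c.[nmCOUNT], f.ansFirst, l.ansLast, c.TotNameCount
From cte c
JOIN cteFirst f on c.[NAME] = f.[NAME]
JOIN cteLast l on c.[NAME] = l.[NAME]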
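A shorter way to get the same three values is to compute them all with window functions in a single pass. The sketch below does that in SQLite from Python. It assumes that ID order matches chronological order (true for the sample data, not guaranteed in general), and the table and column names (monthly, month, cnt) are made up to avoid the reserved words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly (id INTEGER, month TEXT, name TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO monthly VALUES (?, ?, ?, ?)",
    [
        (1, "JAN2013", "fred", 4),
        (2, "MAR2013", "fred", 5),
        (3, "APR2014", "fred", 1),
        (4, "JAN2013", "Tom", 6),
        (5, "MAR2014", "Tom", 1),
        (6, "APR2014", "Tom", 1),
    ],
)

# Assumption: within a name, a lower id means an earlier month.
# Sorting the month text itself would put APR2014 before JAN2013.
rows = conn.execute("""
    SELECT id, month, name, cnt,
           FIRST_VALUE(month) OVER (PARTITION BY name ORDER BY id)      AS first_month,
           FIRST_VALUE(month) OVER (PARTITION BY name ORDER BY id DESC) AS last_month,
           SUM(cnt)           OVER (PARTITION BY name)                  AS total
    FROM monthly
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

Every fred row carries JAN2013, APR2014 and 10, and every Tom row carries JAN2013, APR2014 and 8, without any joins.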

SQL - Set field value based on count of previous rows values

I have the following table structure in Microsoft SQL:
ID Name Number
1 John
2 John
3 John
4 Mark
5 Mark
6 Anne
7 Anne
8 Luke
9 Rachael
10 Rachael
I am looking to set the 'Number' field to the number of times the 'Name' field has appeared previously, using SQL.
Desired output as follows:
ID Name Number
1 John 1
2 John 2
3 John 3
4 Mark 1
5 Mark 2
6 Anne 1
7 Anne 2
8 Luke 1
9 Rachael 1
10 Rachael 2
The table is ordered by 'Name', so there is no worry of 'John' appearing under ID 11 again, using my example.
Any help would be appreciated. I'm not sure if I can do this with a simple SELECT statement, or whether I will need an UPDATE statement, or something more advanced.
Use ROW_NUMBER:
SELECT ID, Name,
ROW_NUMBER() OVER (PARTITION BY Name
ORDER BY ID) AS Number
FROM mytable
There is no need to add a field for this, as the value can be easily calculated using window functions.
You should be able to use the ROW_NUMBER() function within SQL Server to partition the rows by Name and number the rows within each partition:
SELECT ID,
Name,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY ID) AS Number
FROM YourTable
ORDER BY ID
If your system doesn't support OVER (PARTITION BY ...), you can use the following code:
SELECT
ID,
Name,
(
SELECT
SUM(counterTable.nameCount)
FROM
mytable innerTable
JOIN (SELECT 1 as nameCount) as counterTable
WHERE
innerTable.ID <= outerTable.ID
AND outerTable.Name = innerTable.Name
) AS cumulative_sum
FROM
mytable outerTable
ORDER BY outerTable.ID
This is the CREATE TABLE statement I used before filling in your data:
CREATE TABLE `mytable` (
`ID` INT(11) NULL DEFAULT NULL,
`Name` VARCHAR(50) NULL DEFAULT NULL
);
This should work with database systems that do not support OVER (PARTITION BY ...), such as MySQL before 8.0 and MariaDB before 10.2.
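If you want to convince yourself that the window-function version and the correlated-subquery fallback agree, a quick check along these lines may help. It is a sketch run against SQLite from Python, using COUNT(*) in the subquery instead of the SUM over a joined constant; the table name mytable follows the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ID INTEGER, Name TEXT)")
names = ["John", "John", "John", "Mark", "Mark", "Anne", "Anne", "Luke", "Rachael", "Rachael"]
conn.executemany("INSERT INTO mytable VALUES (?, ?)", list(enumerate(names, start=1)))

# Window-function version: number the rows inside each Name partition.
windowed = conn.execute("""
    SELECT ID, Name,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY ID) AS Number
    FROM mytable
    ORDER BY ID
""").fetchall()

# Fallback: for each row, count the rows with the same Name and an ID no larger.
correlated = conn.execute("""
    SELECT o.ID, o.Name,
           (SELECT COUNT(*)
            FROM mytable AS i
            WHERE i.Name = o.Name AND i.ID <= o.ID) AS Number
    FROM mytable AS o
    ORDER BY o.ID
""").fetchall()

print(windowed == correlated)
print(windowed)
```

The two results match as long as ID is unique; with duplicate IDs the subquery would count ties and the numbers would differ.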

How to delete all duplicate records from SQL Table?

Hello, I have a table named FriendsData that contains duplicate records, as shown below:
fID UserID FriendsID IsSpecial CreatedBy
-----------------------------------------------------------------
1 10 11 FALSE 1
2 11 5 FALSE 1
3 10 11 FALSE 1
4 5 25 FALSE 1
5 10 11 FALSE 1
6 12 11 FALSE 1
7 11 5 FALSE 1
8 10 11 FALSE 1
9 12 11 FALSE 1
I want to remove the rows with duplicate combinations using MS SQL, deleting the later duplicates and keeping the first row of each combination.
How can I remove all duplicate combinations from the table?
Try this
DELETE
FROM FriendsData
WHERE fID NOT IN
(
SELECT MIN(fID)
FROM FriendsData
GROUP BY UserID, FriendsID)
It seems counter-intuitive, but you can delete from a common table expression (under certain circumstances). So, I'd do it like so:
with cte as (
select *,
row_number() over (partition by userid, friendsid order by fid) as [rn]
from FriendsData
)
delete cte where [rn] <> 1
This will keep the record with the lowest fid. If you want something else, change the order by clause in the over clause.
If it's an option, put a uniqueness constraint on the table so you don't have to keep doing this. It doesn't help to bail out a boat if you still have a leak!
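I don't know if the syntax is correct for MS-SQL, but in MySQL, the query would look like:
DELETE FROM FriendsData WHERE fID
NOT IN ( SELECT keep_id FROM (
SELECT MIN(fID) AS keep_id FROM FriendsData
GROUP BY UserID, FriendsID, IsSpecial, CreatedBy) AS keepers )
In the GROUP BY clause you put the columns that need to be identical for two records to count as duplicates. MIN(fID) keeps the first row of each group, and the extra derived table (keepers) is there because MySQL does not let a DELETE select directly from the table it is deleting from.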
Try this query:
select f1.* from FriendsData f1, FriendsData f2
Where f1.fID > f2.fID and f1.UserID = f2.UserID and f1.FriendsID = f2.FriendsID
If it returns the duplicate rows (every row that has an earlier row with the same UserID and FriendsID), replace "select f1.*" with "delete f1".
That will solve your problem.
Works in Postgres:
DELETE from "FriendsData" where "fID" in
(SELECT "fID" from
(SELECT *, ROW_NUMBER() OVER(PARTITION BY "UserID", "FriendsID" ORDER BY "fID") as rn
FROM "FriendsData") as inner1
WHERE rn > 1);
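Before running any of these against real data, it can be worth trying the delete on a throwaway copy. Here is a sketch using SQLite from Python with the sample rows from the question: it applies the MIN(fID) / NOT IN delete and then adds the uniqueness constraint suggested above. The index name ux_friends is made up, and treating (UserID, FriendsID) as the duplicate key is an assumption taken from the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE FriendsData (
        fID INTEGER PRIMARY KEY,
        UserID INTEGER, FriendsID INTEGER, IsSpecial TEXT, CreatedBy INTEGER
    )
""")
conn.executemany(
    "INSERT INTO FriendsData VALUES (?, ?, ?, 'FALSE', 1)",
    [(1, 10, 11), (2, 11, 5), (3, 10, 11), (4, 5, 25), (5, 10, 11),
     (6, 12, 11), (7, 11, 5), (8, 10, 11), (9, 12, 11)],
)

# Keep the lowest fID of every (UserID, FriendsID) pair and delete the rest.
deleted = conn.execute("""
    DELETE FROM FriendsData
    WHERE fID NOT IN (SELECT MIN(fID)
                      FROM FriendsData
                      GROUP BY UserID, FriendsID)
""").rowcount
remaining = conn.execute(
    "SELECT fID, UserID, FriendsID FROM FriendsData ORDER BY fID"
).fetchall()

# With the table clean, a unique index stops the duplicates from coming back.
conn.execute("CREATE UNIQUE INDEX ux_friends ON FriendsData (UserID, FriendsID)")
try:
    conn.execute(
        "INSERT INTO FriendsData (UserID, FriendsID, IsSpecial, CreatedBy) "
        "VALUES (10, 11, 'FALSE', 1)"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(deleted, remaining, duplicate_rejected)
```

Five rows are removed, one row per combination survives, and the repeated (10, 11) insert is refused once the index exists.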