Select only 20 rows of every distinct name - sql

I have a table with more than 1,000 rows and a column "AnaId" whose values repeat many times: for example, 003912 appears 85 times and 003156 appears 70 times. I want to select at most 20 rows for each distinct AnaId. I have no idea how to do it. Here is my query:
SELECT dbo.Analysis.AnaId, Analysis.CasNo, MoleculeId,
SUM(dbo.AnalysisSummary.Area) as TotalArea
FROM dbo.Analysis LEFT JOIN dbo.AnalysisSummary
ON dbo.AnalysisSummary.AnaId = dbo.Analysis.AnaId
WHERE dbo.Analysis.Sample like '%Oil%'
GROUP BY dbo.Analysis.AnaId,Analysis.CasNo, MoleculeId ORDER BY
TotalArea DESC

You can use row_number():
select t.*
from (select t.*, row_number() over (partition by name order by name) as seqnum
from t
) t
where seqnum <= 20;
With the edits to your question, you can do:
with t as (
<your query here without order by>
)
select t.*
from (select t.*, row_number() over (partition by name order by name) as seqnum
from t
) t
where seqnum <= 20;
If you have another table of names, you can also use cross apply:
select t.*
from names n cross apply
(select top 20 t.*
from t
where t.name = n.name
) t;
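To see the row_number() approach on concrete data, here is a minimal sketch using Python's built-in sqlite3 module (assuming SQLite 3.25 or later for window functions). The table layout, the row counts, and the choice to order by Area are made up for illustration; cross apply is SQL Server syntax and is not shown here.

```python
import sqlite3

# Hypothetical sample data: three AnaId values with 85, 70 and 5 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (AnaId TEXT, Area REAL)")
for ana_id, n in [("003912", 85), ("003156", 70), ("000001", 5)]:
    conn.executemany(
        "INSERT INTO t (AnaId, Area) VALUES (?, ?)",
        [(ana_id, float(i)) for i in range(n)],
    )

# Number the rows inside each AnaId (largest Area first), then keep at most 20.
query = """
SELECT AnaId, Area
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY AnaId ORDER BY Area DESC) AS seqnum
      FROM t) AS numbered
WHERE seqnum <= 20
"""
rows = conn.execute(query).fetchall()

# Count how many rows survived per AnaId.
counts = {}
for ana_id, _area in rows:
    counts[ana_id] = counts.get(ana_id, 0) + 1
print(counts)
```

Groups with fewer than 20 rows come back whole; larger groups are cut to their top 20 by whatever the window's order by says.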

Using Rank()
select t.*
from (select t.*, rank() over (partition by name order by name) as seqnum
from t
) t
where seqnum <= 20;
Using Dense_Rank()
select t.*
from (select t.*, Dense_Rank() over (partition by name order by name) as seqnum
from t
) t
where seqnum <= 20;
Using Row_Number
select t.*
from (select t.*, row_number() over (partition by name order by name) as seqnum
from t
) t
where seqnum <= 20;
This should help you understand how each of these window functions is used.
Base code credit: @gordon
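One thing worth noting about the three queries above: they order the window by the same column they partition by, so every row in a partition ties. In that case rank() and dense_rank() give every row the value 1 and the `seqnum <= 20` filter keeps everything; only row_number() caps the group. A small sketch with Python's sqlite3 (SQLite 3.25+ assumed; the `score` column and the data are hypothetical) shows how the three functions differ when there are ties:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 10), ("a", 10), ("a", 7), ("a", 5)],
)

# Ordering by a different column: ties on score are numbered differently.
ranked = conn.execute("""
SELECT score,
       ROW_NUMBER() OVER (PARTITION BY name ORDER BY score DESC) AS rn,
       RANK()       OVER (PARTITION BY name ORDER BY score DESC) AS rk,
       DENSE_RANK() OVER (PARTITION BY name ORDER BY score DESC) AS drk
FROM t
ORDER BY score DESC, rn
""").fetchall()
print(ranked)

# Ordering by the partition column itself: every row ties with every other.
tied = conn.execute("""
SELECT RANK()       OVER (PARTITION BY name ORDER BY name) AS rk,
       DENSE_RANK() OVER (PARTITION BY name ORDER BY name) AS drk
FROM t
""").fetchall()
print(tied)
```

row_number() always counts 1, 2, 3, 4; rank() leaves a gap after a tie (1, 1, 3, 4); dense_rank() does not (1, 1, 2, 3).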

Related

Selecting the latest order

I need to select the data of all my customers with the records displayed in the image. But I need to get the most recent record only, for example I need to get the order # E987 for John and E888 for Adam. As you can see from the example, when I do the select statement, I get all the order records.
You don't mention the specific database, so I'll answer with a generic solution.
You can do:
select *
from (
select t.*,
row_number() over(partition by name order by order_date desc) as rn
from t
) x
where rn = 1
You can use analytical function row_number.
Select * from
(Select t.*,
Row_number() over (partition by customer_id order by order_date desc) as rn
From your_table t) t
Where rn = 1
Or you can use not exists as follows:
Select *
From your_table t
Where not exists
(Select 1 from your_table tt
Where t.customer_id = tt.customer_id
And tt.order_date > t.order_date)
You can do it with a subquery that finds the last order date.
SELECT t.*
FROM your_table t
JOIN (SELECT tt.customer_id,
MAX(tt.order_date) MaxOrderDate
FROM your_table tt
GROUP BY tt.customer_id) AS tt
ON t.customer_id = tt.customer_id
AND t.order_date = tt.MaxOrderDate
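The three approaches in this thread (row_number, not exists, join on MAX) can be compared side by side. This is only a sketch using Python's sqlite3 (SQLite 3.25+ assumed); the table name `orders`, its columns, and the dates are invented, with only the order numbers E987 and E888 taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id TEXT, order_no TEXT, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("John", "E123", "2020-01-01"), ("John", "E987", "2020-03-01"),
     ("Adam", "E555", "2020-02-01"), ("Adam", "E888", "2020-04-01")],
)

# 1. Window function: number each customer's orders, newest first.
by_row_number = conn.execute("""
SELECT customer_id, order_no
FROM (SELECT o.*,
             ROW_NUMBER() OVER (PARTITION BY customer_id
                                ORDER BY order_date DESC) AS rn
      FROM orders o) x
WHERE rn = 1
ORDER BY customer_id
""").fetchall()

# 2. NOT EXISTS: keep an order if the customer has no later one.
by_not_exists = conn.execute("""
SELECT customer_id, order_no
FROM orders o
WHERE NOT EXISTS (SELECT 1 FROM orders oo
                  WHERE oo.customer_id = o.customer_id
                    AND oo.order_date > o.order_date)
ORDER BY customer_id
""").fetchall()

# 3. Join against each customer's latest order date.
by_max_join = conn.execute("""
SELECT o.customer_id, o.order_no
FROM orders o
JOIN (SELECT customer_id, MAX(order_date) AS max_order_date
      FROM orders
      GROUP BY customer_id) m
  ON o.customer_id = m.customer_id
 AND o.order_date = m.max_order_date
ORDER BY o.customer_id
""").fetchall()

print(by_row_number, by_not_exists, by_max_join)
```

On this data all three agree. They differ when a customer has two orders on the same latest date: row_number() returns exactly one of them, while the not exists and MAX join versions return both.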

Filtering for MAX Beginning Date

I am currently getting the output shown in the image below. I want to retrieve the latest Turn Time, essentially the MAX beginning date and MAX end date. How should I structure my query?
I think you just want order by:
select top (1) t.*
from t
order by enddate desc, beginning_date desc;
If you want this per id, then you can use window functions or top (1) with ties:
select t.*
from (select t.*,
row_number() over (partition by id order by enddate desc, beginning_date desc) as seqnum
from t
) t
where seqnum = 1;
You can use row_number()
select * from
(
select *,row_number() over(partition by id order by beginningdate desc) as rn
from tablename
)A where rn=1
For the largest turn time:
select * from
(
select *,row_number() over(partition by id order by turntime desc) as rn
from tablename
)A where rn=1

how to update all rows except first?

I have this sql query :
update CCUSTOMERINFO set VALIDTO=sysdate where (
select * from (
select row_number() over (order by created desc) rn, customer_id, CCUSTOMERINFO.VALIDTO
from CCUSTOMERINFO
where customer_id=100309772 order by created DESC) where rn > 1);
But it reports an error.
This query returns all the rows I want to update:
select * from (
select row_number() over (order by created desc) rn, customer_id, CCUSTOMERINFO.VALIDTO
from CCUSTOMERINFO
where customer_id=100309772 order by created DESC) where rn > 1;
Any suggestions on how I can do that?
Select the rowid in the subquery and filter on it, like this:
update CCUSTOMERINFO set VALIDTO=sysdate where rowid in (
select row_id from (
select row_number() over (order by created desc) rn, customer_id, rowid row_id,
CCUSTOMERINFO.VALIDTO
from CCUSTOMERINFO
where customer_id=100309772 order by created DESC) where rn > 1);
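SQLite also exposes a rowid, so the same idea can be tried out locally. The following is a rough sketch with Python's sqlite3 (SQLite 3.25+ assumed), not Oracle itself: the sample rows are made up, and `DATE('now')` stands in for Oracle's `sysdate`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ccustomerinfo (customer_id INTEGER, created TEXT, validto TEXT)"
)
conn.executemany(
    "INSERT INTO ccustomerinfo (customer_id, created) VALUES (?, ?)",
    [(100309772, "2020-01-01"), (100309772, "2020-02-01"),
     (100309772, "2020-03-01"), (42, "2020-01-15")],
)

# Number this customer's rows newest first and update every row except rn = 1.
conn.execute("""
UPDATE ccustomerinfo
SET validto = DATE('now')
WHERE rowid IN (
    SELECT row_id
    FROM (SELECT rowid AS row_id,
                 ROW_NUMBER() OVER (ORDER BY created DESC) AS rn
          FROM ccustomerinfo
          WHERE customer_id = 100309772)
    WHERE rn > 1)
""")

# 1 means validto was set, 0 means the row was left alone.
rows = conn.execute("""
SELECT customer_id, created, validto IS NOT NULL
FROM ccustomerinfo
ORDER BY customer_id, created
""").fetchall()
print(rows)
```

The newest row for customer 100309772 and the row belonging to the other customer are left untouched.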

Select Random Values for Grouped Dataset

I'm no whizz at SQL. However I'm using the following query:
select count(*) as countis, avclassfamily
from malwarehashesandstrings
where behaviouralbinary IS true and
avclassfamily != 'SINGLETON'
group by avclassfamily
ORDER BY countis desc
LIMIT 50;
I would like to select 3 random hashes from the malwarehashsha256 column grouped by the avclassfamily column.
The following query works, so the question is resolved:
select m.avclassfamily, m.cnt,
array_agg(malwarehashsha256)
from (select malwarehashesandstrings.*,
count(*) over (partition by avclassfamily) as cnt,
row_number() over (partition by avclassfamily order by random()) as seqnum
from malwarehashesandstrings
where behaviouralbinary and
avclassfamily <> 'SINGLETON'
) as m
where seqnum <= 3
group by m.avclassfamily, m.cnt ORDER BY m.cnt DESC LIMIT 50;
If I understand correctly, you can use row_number():
select m.*
from (select m.*,
row_number() over (partition by m.avclassfamily order by random()) as seqnum
from malwarehashesandstrings m
where m.behaviouralbinary and
m.avclassfamily <> 'SINGLETON'
) m
where seqnum <= 3;
If you want this in a column in your existing query, one method is:
select m.avclassfamily, m.cnt,
array_agg(m.malwarehashsha256)
from (select m.*,
count(*) over (partition by m.avclassfamily) as cnt,
row_number() over (partition by m.avclassfamily order by random()) as seqnum
from malwarehashesandstrings m
where m.behaviouralbinary and
m.avclassfamily <> 'SINGLETON'
) m
where seqnum <= 3
group by m.avclassfamily, m.cnt;
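For anyone who wants to try the random-sample-per-group idea without a PostgreSQL instance, here is a sketch using Python's sqlite3 (SQLite 3.25+ assumed). SQLite has no array_agg, so GROUP_CONCAT is swapped in; the family names and hash strings are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE malwarehashesandstrings (
    avclassfamily TEXT, malwarehashsha256 TEXT, behaviouralbinary INTEGER)
""")
# Hypothetical data: 10 'zeus' rows, 2 'emotet' rows, 5 'SINGLETON' rows.
data = [
    (family, f"{family}-hash-{i}", 1)
    for family, n in [("zeus", 10), ("emotet", 2), ("SINGLETON", 5)]
    for i in range(n)
]
conn.executemany("INSERT INTO malwarehashesandstrings VALUES (?, ?, ?)", data)

# Shuffle rows inside each family with RANDOM(), keep the first 3,
# and carry the full family size along as cnt.
result = conn.execute("""
SELECT m.avclassfamily, m.cnt, GROUP_CONCAT(m.malwarehashsha256) AS hashes
FROM (SELECT s.*,
             COUNT(*) OVER (PARTITION BY s.avclassfamily) AS cnt,
             ROW_NUMBER() OVER (PARTITION BY s.avclassfamily
                                ORDER BY RANDOM()) AS seqnum
      FROM malwarehashesandstrings s
      WHERE s.behaviouralbinary
        AND s.avclassfamily <> 'SINGLETON') AS m
WHERE m.seqnum <= 3
GROUP BY m.avclassfamily, m.cnt
ORDER BY m.cnt DESC
""").fetchall()

sample = {family: (cnt, hashes.split(",")) for family, cnt, hashes in result}
print(sample)
```

Because cnt is computed before the `seqnum <= 3` filter, it still reports the full size of each family, not the size of the sample.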

I want to generate a continuous number by 2 columns, in batches

I want to generate a continuous number for each combination of 2 columns, in batches of 5. Can anybody help me solve this?
An adaptation of @GordonLinoff's answer:
SELECT
name,
rank,
DENSE_RANK() OVER (ORDER BY name DESC, Rank, ((seqnum - 1) / 5)) AS rno
FROM
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY name, rank ORDER BY (SELECT null)) AS seqnum
FROM
yourTable
)
sequenced
ORDER BY
3
You can use row_number() and arithmetic:
select name, rank,
((seqnum - 1) / 5) + 1 as rno
from (select t.*,
row_number() over (partition by name, rank order by (select null)) as seqnum
from t
) t
order by seqnum;
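Both variants can be checked on a small data set. This is a sketch with Python's sqlite3 (SQLite 3.25+ assumed) under a few assumptions of mine: the second column is called `rnk` here to avoid any clash with the RANK() function name, an `id` column is added so the row order inside a group is deterministic (the answers above use `order by (select null)`, which leaves it arbitrary), and the data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT, rnk INTEGER)")
# Hypothetical data: 7 rows for ('A', 1) and 3 rows for ('B', 1).
conn.executemany(
    "INSERT INTO t (name, rnk) VALUES (?, ?)",
    [("A", 1)] * 7 + [("B", 1)] * 3,
)

rows = conn.execute("""
SELECT name, rnk,
       (seqnum - 1) / 5 + 1 AS batch_in_group,  -- integer division: 1 per 5 rows
       DENSE_RANK() OVER (ORDER BY name, rnk, (seqnum - 1) / 5) AS rno
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY name, rnk ORDER BY id) AS seqnum
      FROM t) AS sequenced
ORDER BY rno, seqnum
""").fetchall()

for row in rows:
    print(row)
```

`batch_in_group` restarts at 1 for every (name, rnk) combination, while `rno` keeps counting across combinations: the first five 'A' rows get 1, the remaining two get 2, and the 'B' rows get 3.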