Max value for each group in subquery - sql

I have a table with 10 columns and I am interested in 3 of those.
Say tableA with id, name, url, ranking.
id |name |url |ranking
--------------------------------
1 |apple |a1.com |1
2 |apple |a1.com |2
3 |apple |a1.com |3
4 |orange |o1.com |1
5 |orange |o1.com |2
6 |apple |a1.com |4
What I want is all the columns for the rows with id 5 and 6, that is, the row with the maximum ranking in each name group (apple, orange).

Use row_number to number the rows in each name group by ranking in descending order, then select the first row of each group.
select id,name,url,ranking
from
(select t.*, row_number() over(partition by name order by ranking desc) as rn
from tablename t) t
where rn =1
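To try this without a database server, here is a minimal sketch that runs the same ROW_NUMBER pattern against the sample rows, using Python's built-in sqlite3 module. The table name tableA comes from the question; the in-memory database and the final ORDER BY id are assumptions added for the demo, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (id INTEGER, name TEXT, url TEXT, ranking INTEGER)")
conn.executemany(
    "INSERT INTO tableA VALUES (?, ?, ?, ?)",
    [
        (1, "apple", "a1.com", 1),
        (2, "apple", "a1.com", 2),
        (3, "apple", "a1.com", 3),
        (4, "orange", "o1.com", 1),
        (5, "orange", "o1.com", 2),
        (6, "apple", "a1.com", 4),
    ],
)

# Number the rows of each name group from highest to lowest ranking,
# then keep only the first row of every group.
top_rows = conn.execute(
    """
    SELECT id, name, url, ranking
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY ranking DESC) AS rn
        FROM tableA AS t
    ) AS ranked
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()

print(top_rows)  # [(5, 'orange', 'o1.com', 2), (6, 'apple', 'a1.com', 4)]
```

If two rows in a group share the maximum ranking, ROW_NUMBER picks one of them arbitrarily; RANK() would keep both.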

Related

How to sum duplicates SQL/PSQL

I am having trouble building a query that sums all duplicates. With the query below I can count the occurrences per name.
SELECT COUNT (*) occurrences
FROM back.submission s
GROUP BY s.name
HAVING COUNT(*) > 1
----------
|# |occurrences|
|1 | 9 |
|2 | 6 |
|3 | 5 |
|4 | 4 |
|5 | 4 |
|6 | 3 |
....
I would like to know how to sum all the occurrences. I tried putting COUNT inside SUM, but that doesn't work.
Do you want another level of aggregation?
SELECT COUNT(occurrences) AS count_of_duplicates, SUM(occurrences) AS sum_of_duplicates
FROM (
SELECT COUNT (*) occurrences
FROM back.submission s
GROUP BY s.name
HAVING COUNT(*) > 1
) t
SELECT name, COUNT(*) AS Total FROM back.submission GROUP BY name HAVING COUNT(*) > 1;
With CTE
As
(
Select name, Count(*) As Total From back.submission Group By name
)
Select name, Total From CTE Where Total > 1
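The two-level aggregation can be checked with a small sketch like the one below, written against Python's built-in sqlite3 module. The table is called submission here without the back schema, and the six sample names are made up for illustration; neither comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submission (name TEXT)")
# Hypothetical data: 'a' appears 3 times, 'b' twice, 'c' once.
conn.executemany(
    "INSERT INTO submission VALUES (?)",
    [("a",), ("a",), ("a",), ("b",), ("b",), ("c",)],
)

# Inner query: one row per duplicated name with its occurrence count.
# Outer query: how many duplicated names there are, and the total of their counts.
count_of_duplicates, sum_of_duplicates = conn.execute(
    """
    SELECT COUNT(occurrences), SUM(occurrences)
    FROM (
        SELECT COUNT(*) AS occurrences
        FROM submission
        GROUP BY name
        HAVING COUNT(*) > 1
    ) AS t
    """
).fetchone()

print(count_of_duplicates, sum_of_duplicates)  # 2 5
```

The aggregate cannot be nested directly as SUM(COUNT(*)) in the same SELECT, which is why the counts are computed in a derived table first.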

how many rows have different value

Given a table events
sensor_id | event_type | value | time
----------+------------+--------+------------
2 |2 | 3.45 | 2014-02 (...)
2 |4 | (...) | (...)
2 |2 | (...) | (...)
3 |2 | (...) | (...)
2 |3 | (...) | (...)
Write an SQL query that returns the set of all sensor_id values with the number of different event_types registered by each of them, ordered by sensor_id ASC.
Query should return the following rowset
sensor_id | type
----------+------------
2 |3
3 |1
The names of the columns in the rowset don't matter, but their order does.
My query:
SELECT
sensor_id, COUNT(*) AS `types`
FROM
`events`
GROUP BY
sensor_id
ORDER BY
sensor_id ASC
And result:
sensor_id | types
----------+------------
2 |4 <= error
3 |1
Use DISTINCT event_type inside COUNT:
SELECT
sensor_id, COUNT(distinct event_type) AS `types`
FROM
`events`
GROUP BY
sensor_id
ORDER BY
sensor_id ASC
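You can also use a window function, but note that COUNT(DISTINCT ...) OVER (...) is not accepted by every DBMS (SQL Server and PostgreSQL, for example, reject DISTINCT in a window aggregate), so check your engine first: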
select distinct sensor_id, types from (
SELECT
sensor_id, COUNT(distinct event_type) over(partition by sensor_id) AS `types`
FROM
`events` ) X
ORDER BY
sensor_id ASC;
Try this:
SELECT sensor_id, COUNT(DISTINCT event_type) as type
FROM #tbltemp
GROUP BY sensor_id
ORDER BY sensor_id
Without DISTINCT, event type 2 is counted twice for sensor 2 (2, 2, 3, 4).
With DISTINCT, only the distinct values (2, 3, 4) are counted.
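The difference between COUNT(*) and COUNT(DISTINCT event_type) can be seen side by side in this small sketch, which uses Python's built-in sqlite3 module. Only the sensor_id and event_type columns from the question are loaded; the value and time columns are left out because the question elides them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (sensor_id INTEGER, event_type INTEGER)")
# (sensor_id, event_type) pairs in the order shown in the question.
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(2, 2), (2, 4), (2, 2), (3, 2), (2, 3)],
)

# all_rows counts every event; types counts each event_type once per sensor.
rows = conn.execute(
    """
    SELECT sensor_id,
           COUNT(*)                   AS all_rows,
           COUNT(DISTINCT event_type) AS types
    FROM events
    GROUP BY sensor_id
    ORDER BY sensor_id ASC
    """
).fetchall()

print(rows)  # [(2, 4, 3), (3, 1, 1)]
```

Sensor 2 has four events but only three distinct event types, which matches the expected rowset.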

How can I query Row number in this way

I'm struggling to write a query that produces a row number like the one below. Could anyone show me how?
ProjectID|RevisionYear|Row_Number|
1 |2016 |1 |
1 |2017 |2 |
1 |2017 |2 |
2 |2019 |1 |
2 |2019 |1 |
2 |2020 |2 |
You need to use DENSE_RANK() instead of ROW_NUMBER(). As the documentation explains, DENSE_RANK() returns the rank of each row within a result set partition, with no gaps in the ranking values: the rank of a row is one plus the number of distinct rank values that come before it.
Statement:
SELECT
ProjectID,
StartYear,
DENSE_RANK() OVER (PARTITION BY ProjectID ORDER BY StartYear) AS Row_Number
FROM (VALUES
(1, 2016),
(1, 2017),
(1, 2017),
(2, 2019),
(2, 2019),
(2, 2020)
) v (ProjectID, StartYear)
Result:
ProjectID StartYear Row_Number
1 2016 1
1 2017 2
1 2017 2
2 2019 1
2 2019 1
2 2020 2
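To see why ROW_NUMBER() does not give the requested numbering, the sketch below computes both functions over the same rows using Python's built-in sqlite3 module. The table name revisions is made up for the demo, the column is called RevisionYear as in the question, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revisions (ProjectID INTEGER, RevisionYear INTEGER)")
conn.executemany(
    "INSERT INTO revisions VALUES (?, ?)",
    [(1, 2016), (1, 2017), (1, 2017), (2, 2019), (2, 2019), (2, 2020)],
)

# rn gives every row its own number; dr gives tied years the same number
# and leaves no gaps after a tie.
numbered = conn.execute(
    """
    SELECT ProjectID,
           RevisionYear,
           ROW_NUMBER() OVER (PARTITION BY ProjectID ORDER BY RevisionYear) AS rn,
           DENSE_RANK() OVER (PARTITION BY ProjectID ORDER BY RevisionYear) AS dr
    FROM revisions
    ORDER BY ProjectID, RevisionYear, rn
    """
).fetchall()

for row in numbered:
    print(row)
# (1, 2016, 1, 1)
# (1, 2017, 2, 2)
# (1, 2017, 3, 2)
# (2, 2019, 1, 1)
# (2, 2019, 2, 1)
# (2, 2020, 3, 2)
```

The dr column reproduces the numbering asked for in the question, while rn keeps counting through the duplicate years.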

T-SQL get the last date time record

My table looks like this:
+---------+------------------------+-------+---------+---------+
|channel |date |code |comment |order_id |
+---------+------------------------+-------+---------+---------+
|1 |2017-10-27 12:04:45.397 |2 |comm1 |1 |
|1 |2017-10-27 12:14:20.997 |1 |comm2 |1 |
|2 |2017-10-27 12:20:59.407 |3 |comm3 |1 |
|2 |2017-10-27 13:14:20.997 |1 |comm4 |1 |
|3 |2017-10-27 12:20:59.407 |2 |comm5 |1 |
|3 |2017-10-27 14:20:59.407 |1 |comm6 |1 |
+---------+------------------------+-------+---------+---------+
And I expect result like this:
+---------+------------------------+-------+---------+
|channel |date |code |comment |
+---------+------------------------+-------+---------+
|1 |2017-10-27 12:14:20.997 |1 |comm2 |
|2 |2017-10-27 13:14:20.997 |1 |comm4 |
|3 |2017-10-27 14:20:59.407 |1 |comm6 |
+---------+------------------------+-------+---------+
Always 1 record with order_id = x and max date for each channel. Total number of channels is constant.
My query works but I'm worried about performance as the table grows. Doing three almost identical queries doesn't seem smart.
select
*
from
(select top(1)
channel,
date,
code,
comment
from
status
where
channel = 1 and
order_id = 1 and
cast(date as date) = '2017-10-27'
order by
date desc) channel1
union
select
*
from
(select top(1)
channel,
date,
code,
comment
from
status
where
channel = 2 and
order_id = 1 and
cast(date as date) = '2017-10-27'
order by
date desc) channel2
union
select
*
from
(select top(1)
channel,
date,
code,
comment
from
status
where
channel = 3 and
order_id = 1 and
cast(date as date) = '2017-10-27'
order by
date desc) channel3
How can I improve this?
Another option is using the WITH TIES clause. No sub-query or extra field.
Select top 1 with ties *
From YourTable
Order By Row_Number() over (Partition By channel order by date desc)
Try using the ROW_NUMBER() function and a derived table. It will save you a lot of headaches. Try:
select channel
,date
,code
,comment
from
(select *
,row_number() over(partition by channel order by date desc) rn --latest row of each channel gets rn = 1
from mytable) t
where t.rn = 1
Assuming you want the latest row for each channel, this would work.
SELECT *
FROM (
SELECT
ROW_NUMBER() OVER (PARTITION BY s.channel ORDER BY [date] DESC) AS rn,
*
FROM [status] AS s
) AS t
WHERE t.rn = 1
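As a rough check of the ROW_NUMBER approach, here is a sketch that loads the sample rows into SQLite through Python's built-in sqlite3 module and returns the latest row per channel in a single query. The timestamps are stored as ISO-formatted text (an assumption that makes them sort chronologically in SQLite), the order_id filter from the question is added to the inner query, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE status (channel INTEGER, "date" TEXT, code INTEGER, '
    "comment TEXT, order_id INTEGER)"
)
conn.executemany(
    "INSERT INTO status VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2017-10-27 12:04:45.397", 2, "comm1", 1),
        (1, "2017-10-27 12:14:20.997", 1, "comm2", 1),
        (2, "2017-10-27 12:20:59.407", 3, "comm3", 1),
        (2, "2017-10-27 13:14:20.997", 1, "comm4", 1),
        (3, "2017-10-27 12:20:59.407", 2, "comm5", 1),
        (3, "2017-10-27 14:20:59.407", 1, "comm6", 1),
    ],
)

# Rank the rows of each channel from newest to oldest and keep the newest one.
latest = conn.execute(
    """
    SELECT channel, "date", code, comment
    FROM (
        SELECT s.*,
               ROW_NUMBER() OVER (PARTITION BY s.channel ORDER BY s."date" DESC) AS rn
        FROM status AS s
        WHERE s.order_id = 1
    ) AS t
    WHERE t.rn = 1
    ORDER BY channel
    """
).fetchall()

for row in latest:
    print(row)
# (1, '2017-10-27 12:14:20.997', 1, 'comm2')
# (2, '2017-10-27 13:14:20.997', 1, 'comm4')
# (3, '2017-10-27 14:20:59.407', 1, 'comm6')
```

This replaces the three UNIONed TOP(1) queries with one pass over the table, and it keeps working if more channels are added later.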

Get count of all instances before a certain date

I have a table like this:
--------------------------------------
RecID|name |date
--------------------------------------
1 |John | 05/09/2016
2 |John | 05/02/2016
3 |Mary | 05/09/2016
4 |Mary | 05/08/2016
5 |Mary | 03/02/2016
and I want, for each row, the number of times that name has appeared on or before that row's date. So I want the output to look like this:
--------------------------------------
RecID|name |date |count
--------------------------------------
1 |John | 05/09/2016 | 2
2 |John | 05/02/2016 | 1
3 |Mary | 05/09/2016 | 3
4 |Mary | 05/08/2016 | 2
5 |Mary | 03/02/2016 | 1
Any ideas on how I should go about doing this?
You can use the count function with a window specification.
select t.*, count(*) over(partition by name order by date) as cnt
from tablename t
This will produce incorrect results if there are multiple rows on a given date for a name, since each of those rows is counted. One way to avoid this is to use a correlated subquery that counts distinct dates.
select t.*,
(select count(distinct t2.date)
from tablename t2
where t2.name=t.name and t2.date<=t.date) as cnt
from tablename t
Or use row_number.
select t.*, row_number() over(partition by name order by date) as cnt
from tablename t
Or use dense_rank if there can be multiple rows for the same name on a given date.
select t.*, dense_rank() over(partition by name order by date) as cnt
from tablename t
The easiest solution of all would be to use dense_rank.
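The effect of duplicate dates is easier to see with data. The sketch below uses Python's built-in sqlite3 module and compares the windowed COUNT(*) with DENSE_RANK(). The table name visits is made up, the dates are rewritten in ISO format (assuming the originals are MM/DD/YYYY) so they sort correctly as text, and row 6 is a hypothetical extra row added to create a tie; window functions require SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE visits (RecID INTEGER, name TEXT, "date" TEXT)')
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        (1, "John", "2016-05-09"),
        (2, "John", "2016-05-02"),
        (3, "Mary", "2016-05-09"),
        (4, "Mary", "2016-05-08"),
        (5, "Mary", "2016-03-02"),
        (6, "John", "2016-05-09"),  # hypothetical duplicate date, not in the question
    ],
)

# cnt counts every row up to and including the current date (ties included);
# drank counts distinct dates up to and including the current date.
counts = conn.execute(
    """
    SELECT RecID, name, "date",
           COUNT(*)     OVER (PARTITION BY name ORDER BY "date") AS cnt,
           DENSE_RANK() OVER (PARTITION BY name ORDER BY "date") AS drank
    FROM visits
    ORDER BY RecID
    """
).fetchall()

for row in counts:
    print(row)
# (1, 'John', '2016-05-09', 3, 2)
# (2, 'John', '2016-05-02', 1, 1)
# (3, 'Mary', '2016-05-09', 3, 3)
# (4, 'Mary', '2016-05-08', 2, 2)
# (5, 'Mary', '2016-03-02', 1, 1)
# (6, 'John', '2016-05-09', 3, 2)
```

For Mary, who has no duplicate dates, both columns agree with the expected output. For John, the two rows on the same date get 3 from COUNT(*) but 2 from DENSE_RANK(), so which one is right depends on whether duplicate dates should count once or twice.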
Use count(*) AS count and GROUP BY date, if your date is already a string (i.e. without hour/minute information).