How do I select rows with maximum value? - sql

Given this table I want to retrieve for each different url the row with the maximum count. For this table the output should be: 'dell.html' 3, 'lenovo.html' 4, 'toshiba.html' 5
+----------------+-------+
| url | count |
+----------------+-------+
| 'dell.html' | 1 |
| 'dell.html' | 2 |
| 'dell.html' | 3 |
| 'lenovo.html' | 1 |
| 'lenovo.html' | 2 |
| 'lenovo.html' | 3 |
| 'lenovo.html' | 4 |
| 'toshiba.html' | 1 |
| 'toshiba.html' | 2 |
| 'toshiba.html' | 3 |
| 'toshiba.html' | 4 |
| 'toshiba.html' | 5 |
+----------------+-------+
What SQL query do I need to write to do this?

Try to use this query:
select url, max(count) as count
from table_name
group by url;

Use an aggregate function:
select max(count) ,url from table_name group by url
From your comments, it seems you need a correlated subquery:
select t1.* from table_name t1
where t1.count = (select max(count) from table_name t2 where t2.url=t1.url
)
If your SQLite version supports row_number() (window functions were added in SQLite 3.25),
you can write the query like this:
select * from
(
select *,row_number() over(partition by url order by count desc) rn
from table_name
) a where a.rn=1
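As a quick check of the GROUP BY answer, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. The ORDER BY url is an addition of mine to make the output order deterministic; everything else follows the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (url TEXT, count INTEGER)")

# Sample data from the question: counts 1..3, 1..4 and 1..5 per url.
rows = [("dell.html", n) for n in range(1, 4)]
rows += [("lenovo.html", n) for n in range(1, 5)]
rows += [("toshiba.html", n) for n in range(1, 6)]
conn.executemany("INSERT INTO table_name VALUES (?, ?)", rows)

# One output row per url, carrying that url's highest count.
# ORDER BY is added here only to make the result order predictable.
max_per_url = conn.execute(
    "SELECT url, MAX(count) AS count FROM table_name GROUP BY url ORDER BY url"
).fetchall()

print(max_per_url)
# [('dell.html', 3), ('lenovo.html', 4), ('toshiba.html', 5)]
```

If the table had further columns that you also need in the result, the correlated subquery or row_number() variants are the ones to reach for, since GROUP BY only returns the grouped and aggregated columns.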

Related

Oracle SQL ANY comparison with subquery raises right parenthesis missing error

The query works fine when the ANY operator is compared against a list of values.
SELECT Name, ID
from tblABC
where ID = ANY (1,2,3,4,5 )
But when a subquery is used in the ANY comparison, a "missing right parenthesis" error occurs:
SELECT Name, ID
from tblABC
where ID = ANY (select ID from tblXYZ where ROWNUM <= 10 order by ID desc )
The subquery just gives the top 10 recent id entries from the selected table. Should there be a conversion to number or missing condition in this query?
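The cause is the order by, which is not accepted in this subquery. It would not do what you expect anyway, because it is evaluated after the count stopkey step (the rownum < <constant> filter): the rows are limited first and only then sorted, as the plan below shows.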
select *
from table(dbms_xplan.display_cursor(format => 'BASIC +PREDICATE'));
| PLAN_TABLE_OUTPUT |
| :----------------------------------------------------------------------- |
| EXPLAINED SQL STATEMENT: |
| ------------------------ |
| select /*+ gather_plan_statistics */ * from t where rownum < 5 order by |
| 1 asc |
| |
| Plan hash value: 846588679 |
| |
| ------------------------------------ |
| | Id | Operation | Name | |
| ------------------------------------ |
| | 0 | SELECT STATEMENT | | |
| | 1 | SORT ORDER BY | | |
| |* 2 | COUNT STOPKEY | | |
| | 3 | TABLE ACCESS FULL| T | |
| ------------------------------------ |
| |
| Predicate Information (identified by operation id): |
| --------------------------------------------------- |
| |
| 2 - filter(ROWNUM<5) |
| |
If you are on Oracle 12C+, then you may use fetch first:
select *
from dual
where 1 = any(select l from t order by 1 asc fetch first 4 rows only)
| DUMMY |
| :---- |
| X |
Or row_number() for older versions:
select *
from dual
where 1 = any (
select l
from (
select l, row_number() over(order by l asc) as rn
from t
)
where rn < 5
)
| DUMMY |
| :---- |
| X |
The problem is the order by clause. It is not supported within subqueries like this.
Just remove it. It does not change the comparison anyway, since ROWNUM <= 10 is applied before the sort.
SELECT Name, ID
from tblABC
where ID = ANY (select ID from tblXYZ where ROWNUM <= 10 )
On Oracle 12c and later, you can use FETCH FIRST <n> ROWS ONLY in the subquery instead of the old ROWNUM.
For example:
SELECT Name, ID
from tblABC
where ID = ANY (select ID
from tblXYZ
order by ID desc
fetch first 10 rows only)

SQL SERVER How to select the latest record in each group? [duplicate]

| ID | TimeStamp | Item |
|----|-----------|------|
| 1 | 0:00:20 | 0 |
| 1 | 0:00:40 | 1 |
| 1 | 0:01:00 | 1 |
| 2 | 0:01:20 | 1 |
| 2 | 0:01:40 | 0 |
| 2 | 0:02:00 | 1 |
| 3 | 0:02:20 | 1 |
| 3 | 0:02:40 | 1 |
| 3 | 0:03:00 | 0 |
I have this and I would like to turn it into
| ID | TimeStamp | Item |
|----|-----------|------|
| 1 | 0:01:00 | 1 |
| 2 | 0:02:00 | 1 |
| 3 | 0:03:00 | 0 |
Please advise, thank you!
A correlated subquery is often the fastest method:
select t.*
from t
where t.timestamp = (select max(t2.timestamp)
from t t2
where t2.id = t.id
);
For this, you want an index on (id, timestamp).
You can also use row_number():
select t.*
from (select t.*,
row_number() over (partition by id order by timestamp desc) as seqnum
from t
) t
where seqnum = 1;
This is typically a wee bit slower because it needs to assign the row number to every row, even those not being returned.
You need to partition by ID and rank the rows by TimeStamp in descending order with an analytic function in a subquery, then keep only the rows ranked first (value 1):
SELECT *
FROM
(
SELECT *,
DENSE_RANK() OVER (PARTITION BY ID ORDER BY TimeStamp DESC) AS dr
FROM t
) t
WHERE t.dr = 1
The DENSE_RANK() analytic function is used here so that tied records are included as well.
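To try the correlated-subquery approach on the question's data, here is a small sketch using Python's sqlite3 module as a stand-in for SQL Server. Assumptions of mine: the timestamps are stored as fixed-width text (so string comparison matches time order for this data), the index name idx_t_id_timestamp is made up, and ORDER BY t.id is added only to fix the output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, timestamp TEXT, item INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "0:00:20", 0), (1, "0:00:40", 1), (1, "0:01:00", 1),
        (2, "0:01:20", 1), (2, "0:01:40", 0), (2, "0:02:00", 1),
        (3, "0:02:20", 1), (3, "0:02:40", 1), (3, "0:03:00", 0),
    ],
)

# Composite index on (id, timestamp), as recommended for the subquery.
# The index name is hypothetical.
conn.execute("CREATE INDEX idx_t_id_timestamp ON t (id, timestamp)")

# Keep a row only if its timestamp equals the maximum timestamp of its id.
latest = conn.execute(
    """
    SELECT t.*
    FROM t
    WHERE t.timestamp = (SELECT MAX(t2.timestamp) FROM t t2 WHERE t2.id = t.id)
    ORDER BY t.id
    """
).fetchall()

print(latest)
# [(1, '0:01:00', 1), (2, '0:02:00', 1), (3, '0:03:00', 0)]
```

Like the DENSE_RANK() version, this form returns every row that ties for the latest timestamp within an id, whereas row_number() returns exactly one row per id.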

Impala - Does impala allow multi GROUP_CONCAT in one query

For example, I have a table below
+-----------+-------+------------+
| Id | a | b |
+-----------+-------+------------+
| 1 | 6 | 20 |
| 1 | 4 | 55 |
| 1 | 9 | 56 |
| 1 | 2 | 67 |
| 1 | 7 | 80 |
| 1 | 5 | 66 |
| 1 | 3 | 33 |
| 1 | 8 | 34 |
| 1 | 1 | 52 |
+-----------+-------+------------+
I want the output to look like this, using Impala:
+-----------+-------------------+-----------------------------+
| Id | a | b |
+-----------+-------------------+-----------------------------+
| 1 | 6,4,9,2,7,5,3,8,1 | 20,55,56,67,80,66,33,34,52 |
+-----------+-------------------+-----------------------------+
In Impala, I have used
SELECT Id,
group_concat(DISTINCT a) AS a,
group_concat(DISTINCT b) AS b
FROM table GROUP BY Id
This always fails with a syntax error. Is it not allowed to use multiple group_concat calls in one Impala query, or is it multiple DISTINCTs in one query that are not allowed?
From the documentation for GROUP_CONCAT:
You cannot apply the DISTINCT operator to the argument of this function.
As a workaround, we can use two separate subqueries to find the distinct values:
WITH cte1 AS (
SELECT Id, GROUP_CONCAT(a) AS a
FROM (SELECT DISTINCT Id, a FROM yourTable) t
GROUP BY Id
),
cte2 AS (
SELECT Id, GROUP_CONCAT(b) AS b
FROM (SELECT DISTINCT Id, b FROM yourTable) t
GROUP BY Id
)
SELECT
t1.Id,
t1.a,
t2.b
FROM cte1 t1
INNER JOIN cte2 t2
ON t1.Id = t2.Id;
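The shape of this workaround can be tried outside Impala. The sketch below runs the same two-CTE query in SQLite through Python's sqlite3 module; SQLite is only a stand-in here and its GROUP_CONCAT rules differ from Impala's, so this shows what the query returns, not Impala's behaviour. The three rows are hypothetical and contain repeated values so that the de-duplication is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (Id INTEGER, a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    # Hypothetical rows: a repeats the value 6, b repeats the value 20.
    [(1, 6, 20), (1, 6, 55), (1, 4, 20)],
)

row = conn.execute(
    """
    WITH cte1 AS (
        SELECT Id, GROUP_CONCAT(a) AS a
        FROM (SELECT DISTINCT Id, a FROM yourTable) t
        GROUP BY Id
    ),
    cte2 AS (
        SELECT Id, GROUP_CONCAT(b) AS b
        FROM (SELECT DISTINCT Id, b FROM yourTable) t
        GROUP BY Id
    )
    SELECT t1.Id, t1.a, t2.b
    FROM cte1 t1
    INNER JOIN cte2 t2 ON t1.Id = t2.Id
    """
).fetchone()

# The concatenation order is not guaranteed, so sort before comparing.
a_values = sorted(row[1].split(","))
b_values = sorted(row[2].split(","))
print(row[0], a_values, b_values)
# 1 ['4', '6'] ['20', '55']
```

Each CTE de-duplicates one column on its own, which is why the two lists can end up with different lengths for the same Id.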

Getting the last updated name

I have a table with records like this:
+------+------+
| ID | name |
+------+------+
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | A |
| 5 | B |
| 6 | A |
| 7 | A |
| 8 | A |
+------+------+
I need to get the row where the name last changed to A from a different value. In this example, that is the row with ID 6.
Try this query (MySQL syntax):
select min(ID)
from records
where name = 'A'
and ID >=
(
select max(ID)
from records
where name <> 'A'
);
Illustration:
select * from records;
+------+------+
| ID | name |
+------+------+
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | A |
| 5 | B |
| 6 | A |
| 7 | A |
| 8 | A |
+------+------+
-- run query:
+---------+
| min(ID) |
+---------+
| 6 |
+---------+
Using the Lag function...
SELECT Max([ID])
FROM (SELECT [name], [ID],
Lag([name]) OVER (ORDER BY [ID]) AS PrvVal
FROM tablename) tbl
WHERE [name] = 'A'
AND prvval <> 'A'
Online Demo: http://www.sqlfiddle.com/#!18/a55eb/2/0
If you want to get the whole row, you can do this...
SELECT Top 1 *
FROM (SELECT [name], [ID],
Lag([name]) OVER (ORDER BY [ID]) AS PrvVal
FROM tablename) tbl
WHERE [name] = 'A' AND prvval <> 'A'
ORDER BY [ID] DESC
Online Demo: http://www.sqlfiddle.com/#!18/a55eb/22/0
The ANSI SQL below uses a self-join on the previous ID.
The where clause keeps the rows whose name differs from the previous row's name.
select max(t1.ID) as ID
from YourTable as t1
left join YourTable as t2 on t1.ID = t2.ID+1
where (t1.name <> t2.name or t2.name is null)
and t1.name = 'A';
It should work on most RDBMS, including MS SQL Server.
Note that the ID+1 join assumes there are no gaps between the IDs.
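The no-gaps assumption is easy to see in practice. The sketch below runs the MIN/MAX query and the self-join query side by side in SQLite (via Python's sqlite3 module), first on the question's data and then on a hypothetical variation where row 7 has been deleted. The helper run_both and the table name records for both queries are my own choices for illustration.

```python
import sqlite3

MIN_MAX_QUERY = """
    SELECT MIN(ID) FROM records
    WHERE name = 'A'
      AND ID >= (SELECT MAX(ID) FROM records WHERE name <> 'A')
"""

SELF_JOIN_QUERY = """
    SELECT MAX(t1.ID)
    FROM records AS t1
    LEFT JOIN records AS t2 ON t1.ID = t2.ID + 1
    WHERE (t1.name <> t2.name OR t2.name IS NULL)
      AND t1.name = 'A'
"""


def run_both(rows):
    """Load rows into a fresh in-memory table and run both queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (ID INTEGER, name TEXT)")
    conn.executemany("INSERT INTO records VALUES (?, ?)", rows)
    min_max = conn.execute(MIN_MAX_QUERY).fetchone()[0]
    self_join = conn.execute(SELF_JOIN_QUERY).fetchone()[0]
    conn.close()
    return min_max, self_join


sample = [(1, "A"), (2, "B"), (3, "C"), (4, "A"),
          (5, "B"), (6, "A"), (7, "A"), (8, "A")]

contiguous = run_both(sample)
# Hypothetical variation: row 7 deleted, leaving a gap in the IDs.
with_gap = run_both([r for r in sample if r[0] != 7])

print(contiguous, with_gap)
# (6, 6) (6, 8)
```

With the gap, row 8 has no row at ID 7 to join to, so the self-join treats it as a change and reports 8, while the MIN/MAX query still reports 6.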

How to apply a SUM operation without grouping the results in SQL?

I have a table like this one:
+----+---------+----------+
| id | group | value |
+----+---------+----------+
| 1 | GROUP A | 0.641028 |
| 2 | GROUP B | 0.946927 |
| 3 | GROUP A | 0.811552 |
| 4 | GROUP C | 0.216978 |
| 5 | GROUP A | 0.650232 |
+----+---------+----------+
If I perform the following query:
SELECT `id`, SUM(`value`) AS `sum` FROM `test` GROUP BY `group`;
I, obviously, get:
+----+-------------------+
| id | sum |
+----+-------------------+
| 1 | 2.10281205177307 |
| 2 | 0.946927309036255 |
| 4 | 0.216977506875992 |
+----+-------------------+
But I need a table like this one:
+----+-------------------+
| id | sum |
+----+-------------------+
| 1 | 2.10281205177307 |
| 2 | 0.946927309036255 |
| 3 | 2.10281205177307 |
| 4 | 0.216977506875992 |
| 5 | 2.10281205177307 |
+----+-------------------+
Where summed rows are explicitly repeated.
Is there a way to obtain this result without using multiple (nested) queries?
It depends on your database server. In Postgres or Oracle I'd use window functions. In MySQL versions before 8.0 they are not available, AFAIK.
Perhaps you can fake it like this:
SELECT a.id, SUM(b.value) AS `sum`
FROM test AS a
JOIN test AS b ON a.`group` = b.`group`
GROUP BY a.id, b.`group`;
No, there isn't, AFAIK. You will have to use a join like this:
SELECT t.`id`, tsum.`sum` AS `sum`
FROM `test` AS t
JOIN (SELECT `group`, SUM(`value`) AS `sum` FROM `test` GROUP BY `group`) AS tsum
ON tsum.`group` = t.`group`
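To see the repeated sums on the question's data, here is a minimal sketch that joins each row to its per-group total, run in SQLite through Python's sqlite3 module rather than MySQL. Assumptions of mine: the column alias total, double quotes around "group" (a reserved word, quoted in SQLite's style instead of backticks), and the ORDER BY that fixes the output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE test (id INTEGER, "group" TEXT, value REAL)')
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (1, "GROUP A", 0.641028),
        (2, "GROUP B", 0.946927),
        (3, "GROUP A", 0.811552),
        (4, "GROUP C", 0.216978),
        (5, "GROUP A", 0.650232),
    ],
)

# Compute one total per group, then join it back to every row of that group.
sums = conn.execute(
    """
    SELECT t.id, tsum.total
    FROM test AS t
    JOIN (SELECT "group", SUM(value) AS total
          FROM test
          GROUP BY "group") AS tsum
      ON tsum."group" = t."group"
    ORDER BY t.id
    """
).fetchall()

for row_id, total in sums:
    print(row_id, round(total, 6))
```

Rows 1, 3 and 5 all carry the GROUP A total, which is the repeated-sum layout the question asks for. On a server with window functions, SUM(value) OVER (PARTITION BY "group") would express the same thing without the join.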