SQL - How to extract the minimum of all the maximums

Supposing I have a table "main":
CAT YEAR
1 2010
1 2015
2 2012
2 2010
I can extract the maximum year per category with:
SELECT CAT, MAX(YEAR) FROM main GROUP BY CAT
And I would like to get the minimum of the maximum year values, namely 2012 (the third row).
Something like this:
SELECT MIN(SELECT MAX(YEAR) FROM main GROUP BY CAT)
Could someone help me?

Using your query as a subquery, you can select the minimum of the extracted years as follows:
select min(year) from -- Select the minimum year from
(
SELECT CAT, MAX(YEAR) as year FROM main GROUP BY CAT
) as t -- Your query as subquery (many databases require an alias here)

SELECT MIN(yr)
FROM (SELECT CAT, MAX(YEAR) as yr FROM main GROUP BY CAT) t
This is done using a subquery.
A subquery is a query that is nested inside a SELECT, INSERT, UPDATE, or DELETE statement, or inside another subquery. A subquery can be used anywhere an expression is allowed.
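To try the derived-table approach end to end, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and the alias per_cat are assumptions made for illustration; the data is the sample from the question.

```python
import sqlite3

# Hypothetical in-memory copy of the "main" table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main (cat INTEGER, year INTEGER)")
conn.executemany(
    "INSERT INTO main (cat, year) VALUES (?, ?)",
    [(1, 2010), (1, 2015), (2, 2012), (2, 2010)],
)

# Inner query: one row per category with its latest year.
# Outer query: the smallest of those per-category maximums.
min_of_max = conn.execute(
    """
    SELECT MIN(yr)
    FROM (SELECT cat, MAX(year) AS yr FROM main GROUP BY cat) AS per_cat
    """
).fetchone()[0]

print(min_of_max)  # 2012
```

The inner query yields 2015 for category 1 and 2012 for category 2, so the outer MIN picks 2012.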

One method uses a subquery. I prefer ordering the per-category maximums and returning a single row:
select cat, max(year)
from main
group by cat
order by max(year) asc
fetch first 1 row only;
This allows you to get the cat and the year in a single query.
Note: not all databases support the ANSI standard fetch first 1 row only. Some use limit, top or other methods instead.
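A sketch of the single-row approach, again with Python's sqlite3 on the question's sample data. SQLite spells the row limit as LIMIT 1 rather than fetch first 1 row only, and the alias max_year is an assumption. The sort has to be ascending so that the smallest of the maximums comes first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main (cat INTEGER, year INTEGER)")
conn.executemany(
    "INSERT INTO main (cat, year) VALUES (?, ?)",
    [(1, 2010), (1, 2015), (2, 2012), (2, 2010)],
)

# Sort the per-category maximums ascending and keep only the first row,
# which returns the category together with its year.
row = conn.execute(
    """
    SELECT cat, MAX(year) AS max_year
    FROM main
    GROUP BY cat
    ORDER BY max_year ASC
    LIMIT 1
    """
).fetchone()

print(row)  # (2, 2012)
```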

You can try using a SELECT inside a SELECT. It's a good way to solve a problem like this.
SELECT min(yr)
FROM (SELECT cat, max(YEAR) as yr
FROM main
GROUP BY cat) t

You're already halfway there and can select the minimum value from the table you just created (by using it in the FROM part):
SELECT MIN(maxYearPerCat) FROM
(SELECT CAT, MAX(YEAR) as maxYearPerCat FROM main GROUP BY CAT) as tmpTable
See this SqlFiddle

SELECT MAX(YEAR) FROM main GROUP BY CAT ORDER BY 1 LIMIT 1;

Related

Oracle SQL how to find count less than avg

My code looks like this:
SELECT
number,
name,
count(*) as "the number of correct answer"
FROM
table1 NATURAL JOIN table2
WHERE
answer = 'T'
GROUP BY
number,
name
HAVING
count(*) < avg(count(*))
ORDER BY
count(*);
Here I want to find the groups whose count is less than the average count across all groups, but I could not get this to work with HAVING or WHERE. Could anyone help me?
For the data below, how can I select only the row 1 name1 2, since the average of the counts is (2+6+7)/3 = 5 and only 2 is less than that?
number name count
1 name1 2
2 name2 6
3 name3 7
I would advise you to never use natural joins. They obfuscate the query and make it a maintenance nightmare.
You can use window functions:
SELECT t.*
FROM (SELECT number, name,
COUNT(*) as num_correct,
AVG(COUNT(*)) OVER () as avg_num_correct
FROM table1 JOIN
table2
USING (?) -- be explicit about the column name
WHERE answer = 'T'
GROUP BY number, name
) t
WHERE num_correct < avg_num_correct;
As with your version of the query, this filters out all groups that have no correct answers.
I would place your current query logic into a CTE, and then tag on the average count in the process:
WITH cte AS (
SELECT number, name, COUNT(*) AS cnt,
AVG(COUNT(*)) OVER () AS avg_cnt
FROM table1
NATURAL JOIN table2
WHERE answer = 'T'
GROUP BY number, name
)
SELECT number, name, cnt AS count
FROM cte
WHERE cnt < avg_cnt;
Here we are using the AVG() function as an analytic function, with the window being the entire aggregated table. This means it will find the average of the counts per group, across all groups (after aggregation). Window functions (almost) always evaluate last.
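To see the "aggregate first, then average over the groups" idea on the sample numbers, here is a sketch using Python's sqlite3. The single table answers is a hypothetical stand-in for the joined table1 and table2, and the aggregation and the window function are split across two CTEs to keep each step visible. It assumes SQLite 3.25 or later, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single table standing in for the join of table1 and table2.
conn.execute("CREATE TABLE answers (number INTEGER, name TEXT, answer TEXT)")
rows = (
    [(1, "name1", "T")] * 2
    + [(2, "name2", "T")] * 6
    + [(3, "name3", "T")] * 7
    + [(1, "name1", "F")]  # wrong answers are filtered out by the WHERE clause
)
conn.executemany("INSERT INTO answers VALUES (?, ?, ?)", rows)

below_average = conn.execute(
    """
    WITH counts AS (
        -- Step 1: count correct answers per group.
        SELECT number, name, COUNT(*) AS cnt
        FROM answers
        WHERE answer = 'T'
        GROUP BY number, name
    ),
    with_avg AS (
        -- Step 2: attach the average of those counts to every group.
        SELECT number, name, cnt, AVG(cnt) OVER () AS avg_cnt
        FROM counts
    )
    SELECT number, name, cnt
    FROM with_avg
    WHERE cnt < avg_cnt
    """
).fetchall()

print(below_average)  # [(1, 'name1', 2)]
```

The counts are 2, 6 and 7, their average is 5, and only the first group falls below it.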

SQL (BigQuery): How do i use a single value, derived with another query?

This is my query:
WITH last_transaction AS (
SELECT
month
FROM db.transactions
ORDER BY date DESC
LIMIT 1
)
SELECT
*
FROM db.transactions
-- WHERE month = last_transaction.month
WHERE month = 11
GROUP BY
id
The commented-out line doesn't work, but I assume the intention is clear: I need to select the transactions for the latest month. The business logic might not make sense, because I've extracted it from a bigger query. The main question is: how do I use a single value derived from another query?
You have only one row, so you can use a scalar subquery:
SELECT t.*
FROM db.transactions t
WHERE month = (SELECT last_transaction.month FROM last_transaction);
I removed the GROUP BY id because it would be a syntax error in BigQuery and it logically does not make sense. Why would a column called id be duplicated in the table?
However, this query would often be written as:
SELECT t.*
FROM (SELECT t.*, MAX(month) OVER () as max_month
FROM db.transactions t
) t
WHERE month = max_month;
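The scalar-subquery form can be tried locally with Python's sqlite3. This is only a sketch: the table is called transactions without the db. prefix, and the three sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; the real table lives in BigQuery as db.transactions.
conn.execute("CREATE TABLE transactions (id INTEGER, month INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [(1, 10, "2023-10-05"), (2, 11, "2023-11-02"), (3, 11, "2023-11-20")],
)

latest_month_rows = conn.execute(
    """
    WITH last_transaction AS (
        SELECT month
        FROM transactions
        ORDER BY date DESC
        LIMIT 1
    )
    SELECT t.id, t.month, t.date
    FROM transactions AS t
    -- The CTE returns exactly one row and one column, so it can be used
    -- as a scalar value in the comparison.
    WHERE t.month = (SELECT month FROM last_transaction)
    ORDER BY t.id
    """
).fetchall()

print(latest_month_rows)  # [(2, 11, '2023-11-02'), (3, 11, '2023-11-20')]
```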
Try joining to the last_transaction CTE. It only exposes month, so that is the join column.
Something like this:
SELECT t.*
FROM db.transactions t
JOIN last_transaction lt
ON t.month = lt.month

Find max over multiple columns

I am trying to query a list of meetings from the most recent semester, where semester is determined by two fields (year, semester). Here's a basic outline of the schema:
Otherfields Year Semester
meeting1 2014 1
meeting2 2014 1
meeting3 2013 2
... etc ...
As the max should be considered for the Year first, and then the Semester, my results should look like this:
Otherfields Year Semester
meeting1 2014 1
meeting2 2014 1
Unfortunately simply using the MAX() function on each column separately will try to find Year=2014, Semester=2, which is incorrect. I tried a couple approaches using nested subqueries and inner joins but couldn't quite get something to work. What is the most straightforward approach to solving this?
Using a window function:
SELECT Year, Semester, RANK() OVER(ORDER BY Year DESC, Semester DESC) R
FROM your_table;
R will be a column containing the "rank" of the pair (Year, Semester). You can then use this column as a filter, for instance:
WITH TT AS (
SELECT Year, Semester, RANK() OVER(ORDER BY Year DESC, Semester DESC) R
FROM your_table
)
SELECT ...
FROM TT
WHERE R = 1;
If you don't want gaps between ranks, you can use dense_rank instead of rank.
This answer assumes you use an RDBMS that supports window functions (MySQL, for example, only added them in version 8.0).
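Here is a runnable sketch of the RANK() filter using Python's sqlite3 and the sample rows from the question. The table name meeting and the alias r are assumptions, and SQLite 3.25 or later is needed for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meeting (otherfields TEXT, year INTEGER, semester INTEGER)")
conn.executemany(
    "INSERT INTO meeting VALUES (?, ?, ?)",
    [("meeting1", 2014, 1), ("meeting2", 2014, 1), ("meeting3", 2013, 2)],
)

latest = conn.execute(
    """
    WITH ranked AS (
        -- Every row of the most recent (year, semester) pair gets rank 1.
        SELECT otherfields, year, semester,
               RANK() OVER (ORDER BY year DESC, semester DESC) AS r
        FROM meeting
    )
    SELECT otherfields, year, semester
    FROM ranked
    WHERE r = 1
    ORDER BY otherfields
    """
).fetchall()

print(latest)  # [('meeting1', 2014, 1), ('meeting2', 2014, 1)]
```

Because RANK() gives tied rows the same rank, both 2014/1 meetings come back, while meeting3 gets rank 3 and is filtered out.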
I wouldn't be surprised if there's a more efficient way to do this (and avoid the duplicate subquery), but this will get you the answer you want:
SELECT * FROM table WHERE Year =
(SELECT MAX(Year) FROM table)
AND Semester =
(SELECT MAX(Semester) FROM table WHERE Year =
(SELECT MAX(Year) FROM table))
Here's Postgres:
with table2 as /*virtual temporary table*/
(
select *, year::text || semester as yearsemester
from table
)
select Otherfields, year, semester
from table2
where (Otherfields, yearsemester) in
(
select Otherfields, max(yearsemester)
from table2
group by Otherfields
)
I've been overthinking this, there's a much simpler way to get this:
SELECT Meeting.year, Meeting.semester, Meeting.otherFields
FROM Meeting
JOIN (SELECT year, semester
FROM (SELECT year, semester
FROM Meeting
ORDER BY year DESC, semester DESC)
WHERE ROWNUM = 1) MostRecent -- ROWNUM is assigned before ORDER BY, so sort in an inner query first
ON MostRecent.year = Meeting.year
AND MostRecent.semester = Meeting.semester
(and working Fiddle)
Note that variations of this should work for pretty much all dbs (anything that supports a limiting clause in a subquery); here's the MySQL version, for example:
SELECT Meeting.year, Meeting.semester, Meeting.otherFields
FROM Meeting
JOIN (SELECT year, semester
FROM Meeting
ORDER BY year DESC, semester DESC
LIMIT 1) MostRecent
ON MostRecent.year = Meeting.year
AND MostRecent.semester = Meeting.semester
(...and working fiddle)
Given some of the data in this answer this should be performant for Oracle, and I suspect other dbs as well (given the shortcuts the optimizer is allowed to take). This should be able to replace the use of things like ROW_NUMBER() in most instances where no partitioning clause is provided (no window).
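The join against a one-row "most recent" subquery can be checked with Python's sqlite3, which uses the same LIMIT syntax as the MySQL version. This is a sketch on the question's sample rows; the lowercase names and the naive comparison query are added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meeting (otherfields TEXT, year INTEGER, semester INTEGER)")
conn.executemany(
    "INSERT INTO meeting VALUES (?, ?, ?)",
    [("meeting1", 2014, 1), ("meeting2", 2014, 1), ("meeting3", 2013, 2)],
)

# Naive attempt: MAX() on each column separately looks for 2014 / semester 2,
# a combination that does not exist in the data.
naive = conn.execute(
    """
    SELECT otherfields FROM meeting
    WHERE year = (SELECT MAX(year) FROM meeting)
      AND semester = (SELECT MAX(semester) FROM meeting)
    """
).fetchall()

# Join against the single most recent (year, semester) pair instead.
joined = conn.execute(
    """
    SELECT m.otherfields, m.year, m.semester
    FROM meeting AS m
    JOIN (SELECT year, semester
          FROM meeting
          ORDER BY year DESC, semester DESC
          LIMIT 1) AS most_recent
      ON most_recent.year = m.year
     AND most_recent.semester = m.semester
    ORDER BY m.otherfields
    """
).fetchall()

print(naive)   # []
print(joined)  # [('meeting1', 2014, 1), ('meeting2', 2014, 1)]
```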
Why not simply use ORDER BY?
That way it is easier to handle and less messy.
SELECT * FROM table
Where Year = (Select Max(Year) from table) /* optional clause to select only 2014*/
Order by Semester ASC, Year DESC, Otherfields; /* numerically lowest semester first; on ties, sort by descending year */
EDIT
If you need a limited number of results from 2014, use the LIMIT clause (for MySQL):
SELECT * FROM table
Where Year = (Select Max(Year) from table)
Order by Semester ASC, Year DESC, Otherfields
LIMIT 10;
It orders the rows first and then applies the limit of 10, so you get your limited result set.
This will fetch output like :
Otherfields Year Semester
meeting1 2014 1
meeting2 2014 1
meeting1 2013 1
meeting2 2013 2
Answering my own question here:
This query was run in a stored procedure, so I went ahead and found the maximum year/semester in separate queries before the rest of the query. This is most likely inefficient and inelegant, but it is also the most understandable method; I don't need to worry about other members of my team getting confused by it. I'll leave this question here since it's generally applicable to many other situations, and there appear to be some good answers providing alternative approaches.
-- Find the most recent year.
SELECT MAX(year) INTO max_year FROM meeting;
-- Find the most recent semester in the year.
SELECT MAX(semester) INTO max_semester FROM meeting WHERE year = max_year;
-- Open a ref cursor for meetings in most recent year/semester.
OPEN meeting_list FOR
SELECT otherfields, year, semester
FROM meeting
WHERE year = max_year
AND semester = max_semester;

SELECT a variable as condition and display another one

My problem is the following.
I want to SELECT the minimum of a list of years and display another column of my table for that year.
An example:
SELECT MIN(Year)
FROM table -- searching for the lowest year
and then I want it to display the Winners of the first year.
Is there a way to do this in just one line?
select winner from table where year in (select min(year) from table)
You need to do a self-JOIN between table and itself (I suppose you want to do it in a single statement, not a single line):
SELECT A.*
FROM table AS A
JOIN ( SELECT MIN(Year) AS Year FROM table ) AS B
ON (A.Year = B.Year);
This assumes that there is only one record per minimum-year in every group of interest.
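Both variants can be compared with a quick sketch in Python's sqlite3. The table name results and its rows are invented for illustration (table itself is a reserved word), and the earliest year deliberately has two winners to show what each query returns in that case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: the earliest year (1990) has two winners.
conn.execute("CREATE TABLE results (year INTEGER, winner TEXT)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?)",
    [(1990, "Alice"), (1990, "Bob"), (1995, "Carol")],
)

# Variant 1: filter on the minimum year with a subquery.
by_subquery = [
    r[0]
    for r in conn.execute(
        """
        SELECT winner FROM results
        WHERE year IN (SELECT MIN(year) FROM results)
        ORDER BY winner
        """
    )
]

# Variant 2: join against a one-row derived table holding the minimum year.
by_join = [
    r[0]
    for r in conn.execute(
        """
        SELECT a.winner
        FROM results AS a
        JOIN (SELECT MIN(year) AS year FROM results) AS b
          ON a.year = b.year
        ORDER BY a.winner
        """
    )
]

print(by_subquery)  # ['Alice', 'Bob']
print(by_join)      # ['Alice', 'Bob']
```

In this sketch both forms return every row that shares the minimum year, so they behave the same when the first year has several winners.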
Try it
select t.columntoshow, min(t.year) from table t group by t.columntoshow

adding count( ) column on each row

I'm not sure if this is even a good question or not.
I have a complex query with lots of unions that searches multiple tables for a certain keyword (user input). All the tables that are searched are related to the table book.
There is paging on the result set using LIMIT, so at most 10 results are returned at a time.
However, I want an extra column in the result set displaying the total number of results found. I do not want to do this using a separate query. Is it possible to add a count() column to the result set that counts every result found?
the output would look like this:
ID Title Author Count(...)
1 book_1 auth_1 23
2 book_2 auth_2 23
4 book_4 auth_.. 23
...
Thanks!
This won't add the count to each row, but one way to get the total count without running a second query is to run your first query using the SQL_CALC_FOUND_ROWS option and then select FOUND_ROWS(). This is sometimes useful if you want to know how many total results there are so you can calculate the page count.
Example:
select SQL_CALC_FOUND_ROWS ID, Title, Author
from yourtable
limit 0, 10;
SELECT FOUND_ROWS();
From the manual:
http://dev.mysql.com/doc/refman/5.1/en/information-functions.html#function_found-rows
The usual way of counting in a query is to group on the fields that are returned:
select ID, Title, Author, count(*) as Cnt
from ...
group by ID, Title, Author
order by Title
limit 1, 10
The Cnt column will contain the number of records in each group, i.e. for each title.
Regarding second query:
select tbl.id, tbl.title, tbl.author, x.cnt
from tbl
cross join (select count(*) as cnt from tbl) as x
If you will not join to other table(s):
select tbl.id, tbl.title, tbl.author, x.cnt
from tbl, (select count(*) as cnt from tbl) as x
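A sketch of the cross-join idea combined with paging, using Python's sqlite3. The book table and its 23 generated rows are assumptions chosen to mirror the expected output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [(i, f"book_{i}", f"auth_{i}") for i in range(1, 24)],  # 23 hypothetical rows
)

# The derived table x has exactly one row (the total), so the cross join
# simply appends that total to every book row before LIMIT trims the page.
page = conn.execute(
    """
    SELECT b.id, b.title, b.author, x.cnt
    FROM book AS b
    CROSS JOIN (SELECT COUNT(*) AS cnt FROM book) AS x
    ORDER BY b.id
    LIMIT 10
    """
).fetchall()

print(len(page))  # 10
print(page[0])    # (1, 'book_1', 'auth_1', 23)
```

Note that the count subquery is still evaluated separately inside the statement, so with a complex search query the filter has to be repeated in both places.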
My Solution:
SELECT COUNT(1) over(partition BY text) totalRecordNumber
FROM (SELECT 'a' text, id_consult_req
FROM consult_req cr);
If your problem is simply the speed or cost of running a second (complex) query, I would suggest selecting the result set into a hash table and counting the rows from there while returning them. Even more efficiently, use the row count of the previous result set, so you do not have to recount at all.
This will add the total count on each row:
select count(*) over (order by (select 1)) as Cnt,*
from yourtable
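The window-function variant can be sketched with Python's sqlite3 as well. This version uses an empty OVER () clause, which counts all rows of the result, and it assumes SQLite 3.25 or later; the book table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [(i, f"book_{i}", f"auth_{i}") for i in range(1, 24)],  # 23 hypothetical rows
)

# COUNT(*) OVER () is computed over the full result before LIMIT is applied,
# so each of the 10 returned rows carries the total of 23.
page_with_total = conn.execute(
    """
    SELECT id, title, author, COUNT(*) OVER () AS cnt
    FROM book
    ORDER BY id
    LIMIT 10
    """
).fetchall()

print(len(page_with_total))    # 10
print(page_with_total[0][3])   # 23
```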
Here is your answer:
SELECT *, @cnt count_rows FROM (
SELECT *, (@cnt := @cnt + 1) row_number FROM your_table
CROSS JOIN (SELECT @cnt := 0 AS variable) t
) t;
You simply cannot do this, you'll have to use a second query.