Limiting output with different criteria - sql

I have the following SQL statement:
select
row_number() over(),
car, group, yearout
from (select..... )inner
where year(inner.yearout) between '2010' and '2030'
order by inner.group)temp
The output looks like this:
1 test1 1 2010
2 test2 1 2010
3 test3 1 2012
4 test1 2 2010
5 test1 3 2011
and so on.
There is another table called outerno, which is filled like this:
no yearo amnt
1 2010 10
2 2010 15
3 2010 5
4 2010 10
5 2010 15
6 2010 8
1 2011 4
2 2011 15
and so on.
There are 6 groups in the table for each year.
Now the problem is that I need to limit the output of the query to the row counts given in the outerno table.
So I need the first 10 rows of 2010 for group 1, the first 15 rows of 2010 for group 2, and so on. For each year and group there is a value in outerno.
I tried to use row_number, but I don't know how to limit the output in this way, since I would need, for example, rows 1-10, 50-65, 83-88 and so on.
Any idea on how to do this?
Thanks in advance for all your help.
TheVagabond

You'd use ROW_NUMBER() to give you record numbers per group. Then add a WHERE clause to only get row numbers up to the desired number. In ROW_NUMBER's ORDER BY you can specify which records to prefer.
select row_number() over (), car, group, yearout
from
(
select
row_number() over (partition by inner.group, inner.yearout order by inner.car) as rn,
inner.car, inner.group, inner.yearout
from (select..... ) inner
where inner.yearout between '2010' and '2030'
order by inner.group
) all_records
where all_records.rn <=
(
select amnt
from outerno
where outerno.yearo = all_records.yearout
and outerno.no = all_records.group
);
BTW: I wouldn't choose group for a column name, as it is a reserved word in SQL.
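To make the idea concrete, here is a minimal runnable sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The cars table, its sample rows and the limits are made up for illustration, and the columns group and no are renamed to grp and group_no to stay clear of reserved words. It uses a join against outerno instead of the correlated subquery, which should give the same result as long as outerno holds exactly one row per group and year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical stand-in for the asker's inner query
    CREATE TABLE cars (car TEXT, grp INTEGER, yearout INTEGER);
    CREATE TABLE outerno (group_no INTEGER, yearo INTEGER, amnt INTEGER);
    INSERT INTO cars VALUES
        ('test1', 1, 2010), ('test2', 1, 2010), ('test3', 1, 2010),
        ('test4', 2, 2010), ('test5', 2, 2010),
        ('test6', 1, 2011);
    -- made-up limits: 2 rows for group 1 in 2010, 1 row for the others
    INSERT INTO outerno VALUES (1, 2010, 2), (2, 2010, 1), (1, 2011, 1);
""")

rows = conn.execute("""
    SELECT r.car, r.grp, r.yearout
    FROM (
        -- number the rows inside each (group, year) partition
        SELECT car, grp, yearout,
               ROW_NUMBER() OVER (PARTITION BY grp, yearout ORDER BY car) AS rn
        FROM cars
    ) AS r
    JOIN outerno AS o
      ON o.group_no = r.grp AND o.yearo = r.yearout
    WHERE r.rn <= o.amnt          -- keep only as many rows as outerno allows
    ORDER BY r.yearout, r.grp, r.car
""").fetchall()

for row in rows:
    print(row)
```

Note that a group/year combination without a row in outerno is dropped entirely by the join, which matches the behaviour of the correlated subquery returning NULL.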

Related

SQL query to sort data based on two columns

I have the following data in table RATING and I want to sort it by Rating and Year.
Database: MS SQL Server
Unsorted data in the table:
ID PlayerName Rating Year
1 A 8 2022
2 B 8 2022
3 C 0 2022
4 A 7 2020
5 B 6 2020
6 C 6 2020
7 E 5 2020
8 D 5 2020
9 D 5 2022
The data should be displayed as follows:
ID PlayerName Rating Year
1 A 8 2022
2 B 8 2022
3 D 5 2022
9 C 0 2022
4 A 7 2020
5 B 6 2020
6 C 6 2020
7 E 5 2020
8 D 5 2020
I am not able to get it right. I used the following query:
SELECT ID, PlayerName, Rating, Year
FROM RATING
Where Year IN (SELECT Year from Rating)
order by year DESC
But it doesn't return the correct order, and I cannot use an ORDER BY clause in the subquery because it raises this error: The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified.
Even a two-column sort is not working properly:
SELECT ID, PlayerName, Rating, Year
FROM RATING
order by Rating, Year
You can try ORDER BY [Year] DESC first, then Rating DESC:
SELECT ID, PlayerName, Rating, [Year]
FROM RATING
order by [Year] DESC,Rating DESC
Year is a keyword in SQL Server, so I would enclose it in brackets.
Your attempt is not bad so far, but there are two things you need to change.
First: the column with the higher priority for your sorting is "year", so it has to come first, followed by the column "rating".
Second: you must add the keyword DESC to both columns so that the result begins with the newest year and the highest rating.
So your query should be this one:
SELECT * FROM rating ORDER BY year DESC, rating DESC;
You can see this is working here: db<>fiddle
If you can rename the columns in your DB, I recommend not using SQL keywords as column names (in your example, the column "year") and not using column names that are identical to the table name (in your case, you could rename the table "rating" to "ratings" or similar).
Both are of course allowed, but they can make queries harder to read and increase the risk of issues.
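As a quick check of the two-column sort, here is a small sketch that loads the asker's unsorted rows into an in-memory SQLite database via Python's sqlite3 module. SQLite stands in for SQL Server here, the column is named player_name for convenience, and the extra id sort key is my own addition so that ties come back in a predictable order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rating (id INTEGER, player_name TEXT, rating INTEGER, year INTEGER)"
)
conn.executemany(
    "INSERT INTO rating VALUES (?, ?, ?, ?)",
    [
        (1, "A", 8, 2022), (2, "B", 8, 2022), (3, "C", 0, 2022),
        (4, "A", 7, 2020), (5, "B", 6, 2020), (6, "C", 6, 2020),
        (7, "E", 5, 2020), (8, "D", 5, 2020), (9, "D", 5, 2022),
    ],
)

# Newest year first, then highest rating; id only breaks ties (an assumption).
rows = conn.execute(
    "SELECT id, player_name, rating, year "
    "FROM rating ORDER BY year DESC, rating DESC, id"
).fetchall()

ids = [r[0] for r in rows]
print(ids)  # [1, 2, 9, 3, 4, 5, 6, 7, 8]
```

All 2022 rows come first, ordered by rating from 8 down to 0, followed by the 2020 rows from 7 down to 5.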

SQL Union with fixed number of rows as result

This is a simple representation of my student table.
Student Year Class
1 2010 A
2 2010 A
3 2010 C
4 2010 B
5 2011 B
6 2012 B
I want to compose a random group of 5 students.
In the group there will be 2 students of year 2010 and 3 students of class B. These values are user specified.
A simple union doesn't always work, because of the duplicate values. Sometimes there are 4 students, sometimes there are 5 students. I always want 5 records.
SELECT * FROM (SELECT TOP 2 * FROM [Student] WHERE YEAR = 2010 ORDER BY NEWID()) A
UNION
SELECT * FROM (SELECT TOP 3 * FROM [Student] WHERE Class = 'B' ORDER BY NEWID()) B
Is this possible with a SQL query?

Extract column from SQL table based on another column of the same table

I'm using PostgreSQL.
The PURCHASES table looks like this:
ID | CUSTOMER_ID | YEAR
1 1 2011
2 2 2012
3 2 2012
4 1 2013
5 3 2014
6 3 2014
7 3 2015
I need to extract the 'ID' of the purchase with the latest 'date/year' for each CUSTOMER.
For example, for CUSTOMER_ID 1 the latest year is 2013, which corresponds to id '4'.
I need to get ONE column as the returned data structure.
PS. I'm stuck on this seemingly simple task.
If you want one row per customer, you can use distinct on:
select distinct on (customer_id) id
from purchases
order by customer_id, year desc;
This returns one column which is an id from the most recent year for that customer.
This should work, but doesn't look too pretty...
SELECT DISTINCT ON(CUSTOMER_ID) ID FROM PURCHASES P
WHERE (CUSTOMER_ID,YEAR) =
(SELECT CUSTOMER_ID,MAX(YEAR) FROM PURCHASES WHERE CUSTOMER_ID = P.CUSTOMER_ID
GROUP BY CUSTOMER_ID);
So for input
ID | CUSTOMER_ID | YEAR
1 1 2011
2 2 2012
3 2 2012
4 1 2013
5 3 2014
6 3 2014
7 3 2015
It will return
id
4
2
7
Meaning:
For the lowest CUSTOMER_ID (1) the id is 4 (year 2013).
For the next CUSTOMER_ID (2) the id is 2 (year 2012).
For the last CUSTOMER_ID (3) the id is 7 (year 2015).
The idea behind this:
1. Group by CUSTOMER_ID.
2. For each group, select max(year).
3. While looping over all records, if CUSTOMER_ID and YEAR equal the values from step 2, select the ID from that record.
Without DISTINCT ON(CUSTOMER_ID) it would return 2 records
for CUSTOMER_ID = 2, because both of its records have the year 2012 and both would match while looping.
If you write in the beginning instead of:
SELECT DISTINCT ON(CUSTOMER_ID) ID FROM PURCHASES P
this code:
SELECT DISTINCT ON(CUSTOMER_ID) * FROM PURCHASES P
then you will see everything clearly.
Use the ROW_NUMBER() analytic function with PARTITION BY customer_id and order each customer's rows by year descending. If ties occur for the year value, adding ID as a second sort key makes the query return the smallest ID for each customer_id (here 4, 2 and 7, respectively):
WITH P2 AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY CUSTOMER_ID ORDER BY YEAR DESC, ID) AS RN,
*
FROM PURCHASES
)
SELECT ID FROM P2 WHERE RN = 1
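The ROW_NUMBER() approach can be tried out without a PostgreSQL server. The following is a minimal sketch using Python's sqlite3 module (SQLite 3.25 or later for window functions) with the PURCHASES rows from the question. The id tiebreaker in the window's ORDER BY is an explicit assumption about which row should win when two purchases share the latest year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (id INTEGER, customer_id INTEGER, year INTEGER)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [(1, 1, 2011), (2, 2, 2012), (3, 2, 2012), (4, 1, 2013),
     (5, 3, 2014), (6, 3, 2014), (7, 3, 2015)],
)

rows = conn.execute("""
    SELECT id
    FROM (
        SELECT id, customer_id,
               -- latest year first; the lowest id wins a tie (assumed rule)
               ROW_NUMBER() OVER (PARTITION BY customer_id
                                  ORDER BY year DESC, id) AS rn
        FROM purchases
    ) AS p2
    WHERE rn = 1
    ORDER BY customer_id
""").fetchall()

latest_ids = [r[0] for r in rows]
print(latest_ids)  # [4, 2, 7]
```

Customer 2 has two purchases in 2012 (ids 2 and 3), so the tiebreaker is what makes the result deterministic.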

How do I "dedup" rows based on most recently updated

Let's say I have a table whose content looks like
ID Name Last Update
============================
1 A 1 JAN 2018
1 A 2 JAN 2018
1 A 3 JAN 2018
2 B 3 JAN 2018
2 B 6 JAN 2018
I want to get the result
ID Name Last Update
============================
1 A 3 JAN 2018
2 B 6 JAN 2018
How can I do it?
I tried to group by ID, but how do I get the most recent row?
@Nik's solution works when there are no ties for the MAX(date) value, or when it doesn't matter that a tie produces multiple output rows. An alternative approach is to partition the records by ID, sort the records within each partition by date in descending order, and then pick the first row of each partition.
This can be achieved with the SQL standard window function ROW_NUMBER() like this:
SELECT ID, NAME, DATE
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY ID
ORDER BY DATE DESC) RN
, ID
, NAME
, DATE
FROM <TABLE_NAME>
) t
WHERE RN = 1;
You could use a query like this to get the results that you need:
SELECT *
FROM <TABLE_NAME>
WHERE (ID, "Last Update") IN (SELECT
ID, MAX("Last Update")
FROM <TABLE_NAME>
GROUP BY ID)
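Both approaches can be compared side by side. This is a small sketch with Python's sqlite3 module (SQLite 3.25 or later); the table name updates, the column name last_update and the ISO date strings are my own choices for illustration, picked so that plain string comparison sorts the dates correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical table; dates stored as ISO strings so they sort correctly
    CREATE TABLE updates (id INTEGER, name TEXT, last_update TEXT);
    INSERT INTO updates VALUES
        (1, 'A', '2018-01-01'), (1, 'A', '2018-01-02'), (1, 'A', '2018-01-03'),
        (2, 'B', '2018-01-03'), (2, 'B', '2018-01-06');
""")

# Window-function variant: exactly one row per id, even if dates tie.
by_row_number = conn.execute("""
    SELECT id, name, last_update
    FROM (
        SELECT id, name, last_update,
               ROW_NUMBER() OVER (PARTITION BY id
                                  ORDER BY last_update DESC) AS rn
        FROM updates
    ) AS t
    WHERE rn = 1
    ORDER BY id
""").fetchall()

# MAX/IN variant: may return several rows per id when the latest date ties.
by_max = conn.execute("""
    SELECT id, name, last_update
    FROM updates
    WHERE (id, last_update) IN (SELECT id, MAX(last_update)
                                FROM updates GROUP BY id)
    ORDER BY id
""").fetchall()

print(by_row_number)
print(by_max)
```

With the sample data there are no ties, so both queries return the same two rows.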

Derby DB last x row average

I have the following table structure.
ITEM
ID | TITLE
1 A
2 B
3 C
4 D
5 E
6 F
TOTAL
ID | ITEMID | VALUE
1 2 6
2 1 4
3 3 3
4 3 8
5 1 2
6 5 4
7 4 5
8 2 8
9 2 7
10 1 3
11 2 2
12 3 6
12 3 6
I am using Apache Derby DB. I need to perform an average calculation in SQL: I need to show the list of item IDs together with the average of the last 3 TOTAL records for each item.
That is, for ITEM.ID 1, I go to the TOTAL table, select the last 3 rows associated with ITEMID 1, and take their average. In Derby, I am able to do this for a given item ID, but I cannot do it without specifying an ID. Let me show you what I've done.
SELECT ITEM.ID, AVG(VALUE) FROM ITEM, TOTAL WHERE TOTAL.ITEMID = ITEM.ID GROUP BY ITEM.ID
This SQL lists the average for every item, but it is calculated over all of the item's rows in the TOTAL table. I need the last 3 records only, so I changed the SQL to this:
SELECT AVG(VALUE) FROM (SELECT ROW_NUMBER() OVER() AS ROWNUM, TOTAL.* FROM TOTAL WHERE ITEMID = 1) AS TR WHERE ROWNUM > (SELECT COUNT(ID) FROM TOTAL WHERE ITEMID = 1) - 3
This works if I supply an item ID such as 1 or 2, but I cannot do this for all items without giving an item ID.
I tried the same thing in Oracle using PARTITION BY and it worked, but Derby does not support partitioning. There is WINDOW, but I could not make use of it.
The Oracle version:
SELECT ITEMID, AVG(VALUE) FROM(SELECT ITEMID, VALUE, COUNT(*) OVER (PARTITION BY ITEMID) QTY, ROW_NUMBER() OVER (PARTITION BY ITEMID ORDER BY ID) IDX FROM TOTAL ORDER BY ITEMID, ID) WHERE IDX > QTY -3 GROUP BY ITEMID ORDER BY ITEMID
I need to use Derby because of its portability.
The desired output is this:
RESULT
-----------------
ITEMID | AVERAGE
1 (9/3)
2 (17/3)
3 (17/3)
4 (5/1)
5 (4/1)
6 NULL
As you have noticed, Derby's support for the SQL:2003 "OLAP Operations" is incomplete.
There was some initial work (see https://wiki.apache.org/db-derby/OLAPOperations), but it was only partially completed.
I don't believe anyone is currently working on adding more functionality to Derby in this area.
So yes, Derby has a row_number function, but no, Derby does not (currently) support PARTITION BY.
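One possible way around the missing PARTITION BY is a correlated subquery that counts, for each TOTAL row, how many newer rows exist for the same item, and keeps the row only if that count is below 3. The sketch below runs this idea on SQLite through Python's sqlite3 module with the data from the question. I have not verified it on Derby, so treat the exact syntax there as an assumption; it also assumes that "last" means the highest TOTAL.ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER, title TEXT);
    CREATE TABLE total (id INTEGER, itemid INTEGER, value INTEGER);
    INSERT INTO item VALUES (1,'A'),(2,'B'),(3,'C'),(4,'D'),(5,'E'),(6,'F');
    INSERT INTO total VALUES
        (1,2,6),(2,1,4),(3,3,3),(4,3,8),(5,1,2),(6,5,4),
        (7,4,5),(8,2,8),(9,2,7),(10,1,3),(11,2,2),(12,3,6);
""")

rows = conn.execute("""
    SELECT i.id, AVG(last3.value)
    FROM item AS i
    LEFT JOIN (
        SELECT t.itemid, t.value
        FROM total AS t
        -- keep a row only if fewer than 3 newer rows exist for the same item
        WHERE (SELECT COUNT(*)
               FROM total AS t2
               WHERE t2.itemid = t.itemid AND t2.id > t.id) < 3
    ) AS last3 ON last3.itemid = i.id
    GROUP BY i.id
    ORDER BY i.id
""").fetchall()

for item_id, average in rows:
    print(item_id, average)
```

The LEFT JOIN keeps item 6, which has no TOTAL rows, with a NULL average, matching the desired output. The correlated count makes this quadratic per item, which is fine for small tables but may be slow on large ones.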