Finding the most recent date in SQL for a range of rows - sql

I have a table of coursework marks with the following columns:
Module code, coursework number, student, date submitted, mark
Sample data, in column order:
Maths, 1, Parry, 12-JUN-92, 20
Maths, 2, Parry, 13-JUN-92, 20
Maths, 2, Parry, 15-JUN-92, 25
Expected data after query
Maths, 1, Parry, 12-JUN-92, 20
Maths, 2, Parry, 15-JUN-92, 25
Sometimes a student retakes an exam and therefore has an additional row for a piece of coursework.
I need to get only the latest submission for each piece of coursework. The following works when I isolate a particular student:
SELECT *
FROM TABLE
WHERE NAME = 'NAME'
AND DATE IN (SELECT MAX(DATE)
FROM TABLE
WHERE NAME = 'NAME'
GROUP BY MODULE_CODE, COURSEWORK_NUMBER, STUDENT)
This provides the correct solution for that person, giving me the most recent dates for each row (each coursework) in the table. However, this:
SELECT *
FROM TABLE
WHERE DATE IN (SELECT MAX(DATE)
FROM TABLE
GROUP BY MODULE_CODE, COURSEWORK_NUMBER, STUDENT)
does not give me the equivalent result for every student who has attempted the coursework. Where am I going wrong? Sorry if the details are a bit sparse, but I'm worried about plagiarism.
I am working with SQL*Plus.

This is a good spot to use Oracle's KEEP (DENSE_RANK FIRST) syntax:
select
module_code,
course_work_number,
student,
max(date_submitted) date_submitted,
max(mark) keep(dense_rank first order by date_submitted desc) mark
from mytable
group by module_code, course_work_number, student
Demo on DB Fiddle:
MODULE_CODE | COURSE_WORK_NUMBER | STUDENT | DATE_SUBMITTED | MARK
:---------- | -----------------: | :------ | :------------- | ---:
Maths | 1 | Parry | 12-JUN-92 | 20
Maths | 2 | Parry | 15-JUN-92 | 25

You are looking for a groupwise maximum. See this article from MySQL:
https://dev.mysql.com/doc/refman/8.0/en/example-maximum-column-group-row.html
I'm not sure about the correct syntax for Oracle, but it should be similar. At least the query structure should put you on the right path.

You could use the row_number function to solve this:
select x.*
from (SELECT a.*, row_number() over(partition by module_code, coursework_number, student order by date desc) as row1
FROM TABLE a) x
where x.row1 = 1
The idea is to number the rows within each module, coursework and student by date, newest first, and then keep only the rows numbered 1. Hope this helps.
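On the original "where am I going wrong?": an uncorrelated IN only checks whether a row's date equals the latest date of any group, not of its own group. The sketch below illustrates this with Python's built-in sqlite3 module rather than Oracle. The table name marks, the column names, the ISO date strings and the second student (Jones) are all assumptions made for the illustration.

```python
import sqlite3

# Hypothetical table and column names; the Jones row is invented purely
# to trigger the problem and is not part of the question's sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE marks (module_code TEXT, coursework_number INTEGER, "
    "student TEXT, date_submitted TEXT, mark INTEGER)"
)
conn.executemany(
    "INSERT INTO marks VALUES (?, ?, ?, ?, ?)",
    [
        ("Maths", 1, "Parry", "1992-06-12", 20),
        ("Maths", 2, "Parry", "1992-06-13", 20),
        ("Maths", 2, "Parry", "1992-06-15", 25),
        ("Maths", 1, "Jones", "1992-06-13", 30),
    ],
)

# Uncorrelated IN: a row passes if its date is the maximum of ANY group,
# so Parry's old 13 June row passes because 13 June is Jones's latest date.
uncorrelated = conn.execute(
    """
    SELECT student, coursework_number, date_submitted
    FROM marks
    WHERE date_submitted IN (
        SELECT MAX(date_submitted)
        FROM marks
        GROUP BY module_code, coursework_number, student)
    ORDER BY student, coursework_number, date_submitted
    """
).fetchall()

# Correlated subquery: each row is compared with the maximum of its OWN group.
correlated = conn.execute(
    """
    SELECT m.student, m.coursework_number, m.date_submitted
    FROM marks m
    WHERE m.date_submitted = (
        SELECT MAX(m2.date_submitted)
        FROM marks m2
        WHERE m2.module_code = m.module_code
          AND m2.coursework_number = m.coursework_number
          AND m2.student = m.student)
    ORDER BY m.student, m.coursework_number
    """
).fetchall()

print(uncorrelated)  # includes the stale ('Parry', 2, '1992-06-13') row
print(correlated)    # one row per student and coursework
```

With a single student filtered in both the outer query and the subquery, the two forms are much more likely to agree, which is probably why the first query in the question looked correct.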

Related

How can I delete completely duplicate rows from a query, without having a unique value for it?

I'm having an issue getting information from an MS Access database table. I need a count per name, but rows whose code is duplicated must not be counted at all, which means that I first need to discard every row that has a duplicated code.
Here's an example to illustrate what I need:
Code | Name
12 | George
20 | John
12 | George
33 | John
First I need to delete both rows that share the same code, and then I need a count per name over the rest of the table data. For example, this is the result I'm expecting:
Name | Count
John | 2
I already have a query that does this, but it takes around 1 hour to return around 5000 rows, and I need something more efficient. My query:
select name, count(*) from Table
where name = '" + input_name + "'
and code in (select code from Table group by code
having count(code) = 1)
group by name
order by count(name) desc;
I would appreciate any suggestion.
Rather than using in, I might suggest filtering the original dataset in a subquery, e.g.:
select u.name, count(*)
from (select t.code, t.name from yourtable t group by t.code, t.name having count(*) = 1) u
group by u.name
Here, change yourtable to the name of your table.
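As a quick check of the subquery approach against the sample data, here is a minimal sketch using Python's built-in sqlite3 module instead of MS Access, so timings will not carry over. The helper count_for is hypothetical; it also passes the name as a bound parameter rather than concatenating it into the SQL string as the question's query does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (code INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [(12, "George"), (20, "John"), (12, "George"), (33, "John")],
)

# Keep only (code, name) pairs that occur exactly once, then count per name.
QUERY = """
    SELECT u.name, COUNT(*)
    FROM (SELECT t.code, t.name
          FROM yourtable t
          GROUP BY t.code, t.name
          HAVING COUNT(*) = 1) u
    {where}
    GROUP BY u.name
"""

counts = conn.execute(QUERY.format(where="")).fetchall()


def count_for(name):
    """Hypothetical helper: the same query restricted to one name via a bound parameter."""
    row = conn.execute(QUERY.format(where="WHERE u.name = ?"), (name,)).fetchone()
    return row[1] if row else 0


print(counts)              # [('John', 2)]
print(count_for("George"))  # 0, both George rows share code 12
```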

How to filter out conditions based on a group by in JPA?

I have a table like
| customer | profile | status | date |
| 1 | 1 | DONE | mmddyy |
| 1 | 1 | DONE | mmddyy |
In this case, I want to group by the profile ID and keep the max date. Profiles can be repeated. I've ruled out Java 8 streams because I have many conditions here.
I want to convert the following SQL into JPQL:
select customer, profile, status, max(date)
from tbl
group by profile, customer,status, date, column-k
having count(profile)>0 and status='DONE';
Can someone tell me how to write this query in JPQL, assuming it is correct in SQL? If I declare columns in the select, they are needed in the group by as well, and then the query results are different.
I am guessing that you want the most recent customer/profile combination that is done.
If so, the correct SQL is:
select t.*
from t
where t.date = (select max(t2.date)
from t t2
where t2.customer = t.customer and t2.profile = t.profile
) and
t.status = 'DONE';
I don't know how to convert this to JPQL, but you might as well start with working SQL code.
In your query, the date column is not needed in the group by, and status='DONE' should go in the where clause:
select customer, profile, status, max(date)
from tbl
where status='DONE'
group by profile, customer, status
having count(profile)>0
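The two SQL answers above are not interchangeable: one keeps a customer/profile pair only if its latest row is DONE, the other returns the latest DONE row even when a newer row with another status exists. The sketch below shows the difference using Python's sqlite3 module. The rows, the ISO dates and the PENDING status are invented for the illustration, since the question's sample rows are identical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (customer INTEGER, profile INTEGER, status TEXT, date TEXT)"
)
# Invented rows: profile 1 has a newer non-DONE row, profile 2 does not.
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        (1, 1, "DONE", "2020-01-01"),
        (1, 1, "PENDING", "2020-02-01"),
        (1, 2, "DONE", "2020-03-01"),
    ],
)

# Correlated form: take the latest row per customer/profile, keep it only if DONE.
latest_row_if_done = conn.execute(
    """
    SELECT t.customer, t.profile, t.status, t.date
    FROM tbl t
    WHERE t.date = (SELECT MAX(t2.date)
                    FROM tbl t2
                    WHERE t2.customer = t.customer AND t2.profile = t.profile)
      AND t.status = 'DONE'
    ORDER BY t.customer, t.profile
    """
).fetchall()

# Filter-then-aggregate form: latest date among the DONE rows only.
latest_done_row = conn.execute(
    """
    SELECT customer, profile, status, MAX(date)
    FROM tbl
    WHERE status = 'DONE'
    GROUP BY profile, customer, status
    ORDER BY customer, profile
    """
).fetchall()

print(latest_row_if_done)  # [(1, 2, 'DONE', '2020-03-01')]
print(latest_done_row)     # [(1, 1, 'DONE', '2020-01-01'), (1, 2, 'DONE', '2020-03-01')]
```

Which of the two is wanted depends on the requirement, which the question does not fully pin down.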

How to get dates based on months that appear more than once?

I'm trying to get the birthdays of employees whose birth month is found in at least 2 rows.
I've tried to union the birthday information table with itself, supposing that I could iterate through the rows and get the months that appear multiple times.
The question is: how do I get the birthdays whose month occurs more than once?
SELECT DISTINCT e.EmployeeID, e.City, e.BirthDate
FROM Employees e
GROUP BY e.BirthDate, e.City, e.EmployeeID
HAVING COUNT(MONTH(b.BirthDate))=COUNT(MONTH(e.BirthDate))
UNION
SELECT DISTINCT b.EmployeeID, b.City, b.BirthDate
FROM Employees b
GROUP BY b.EmployeeID, b.BirthDate, b.City
HAVING ...
Given table:
| 1 | City1 | 1972-03-26|
| 2 | City2 | 1979-12-13|
| 3 | City3 | 1974-12-16|
| 4 | City3 | 1979-09-11|
Expected result :
| 2 | City2 |1979-12-13|
| 3 | City3 |1974-12-16|
Think of it in steps.
First, we'll find the months that have more than one birthday in them. That's the sub-query, below, which I'm aliasing as i for "inner query". (Substitute MONTH(i.Birthdate) into the SELECT list for the 1 if you want to see which months qualify.)
Then, in the outer query (o), you want all the fields, so I'm cheating and using SELECT *. Theoretically, a WHERE IN would work here, but IN can have unfortunate side effects if a NULL comes back, so I never use it. Instead, there's a correlated sub-query, which is to say we look for any results where the month from the outer query is equal to one of the months that make the cut in the inner (correlated sub-) query.
When using a correlated sub-query in the WHERE clause, the SELECT list doesn't matter. You could put 1/0 and it won't throw an error. But I always use SELECT 1 to show that the inner query isn't actually returning any results to the outer query. It's just there to look for, well, the correlation between the two data sets.
SELECT
*
FROM
#table AS o
WHERE
EXISTS
(
SELECT
1
FROM
#table AS i
WHERE
MONTH(i.Birthdate) = MONTH(o.Birthdate)
GROUP BY
MONTH(i.Birthdate)
HAVING
COUNT(*) > 1
);
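To try the EXISTS approach against the question's sample rows, here is a minimal sketch using Python's sqlite3 module. SQLite has no MONTH() function, so strftime('%m', ...) stands in for it; the table name employees and its column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER, city TEXT, birthdate TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        (1, "City1", "1972-03-26"),
        (2, "City2", "1979-12-13"),
        (3, "City3", "1974-12-16"),
        (4, "City3", "1979-09-11"),
    ],
)

# Outer row o is kept when its birth month occurs in more than one row.
# strftime('%m', ...) plays the role of MONTH() from the answer above.
shared_month = conn.execute(
    """
    SELECT o.employee_id, o.city, o.birthdate
    FROM employees AS o
    WHERE EXISTS (
        SELECT 1
        FROM employees AS i
        WHERE strftime('%m', i.birthdate) = strftime('%m', o.birthdate)
        GROUP BY strftime('%m', i.birthdate)
        HAVING COUNT(*) > 1)
    ORDER BY o.employee_id
    """
).fetchall()

print(shared_month)  # [(2, 'City2', '1979-12-13'), (3, 'City3', '1974-12-16')]
```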
Seems to be an odd requirement.
This might help with some tweaks. Works in Oracle.
SELECT DATE FROM TABLE WHERE EXTRACT(MONTH FROM DATE)=EXTRACT(MONTH FROM SOMEDATE);
Give this a try and you may be able to dispense with your UNION:
SELECT
EmployeeId
, City
, BirthDate
FROM Employees
GROUP BY
EmployeeId
, City
, BirthDate
HAVING COUNT(Month(BirthDate)) > 2
Here is another approach using GROUP_CONCAT. It's not exactly what you're looking for but it might do the job. Eric's approach is better though. (Note: This is for MySQL)
SELECT GROUP_CONCAT(EmployeeID) EmployeeID, BirthDate, COUNT(*) DupeCount
FROM Employees
GROUP BY MONTH(BirthDate)
HAVING DupeCount> 1;

find maximum with specific data in sql server

I'm still confused about SQL Server. For example, I have a student table and I'm trying to find the maximum mark among the java students.
STUDENT
| id | name | mark | subject |
| 1 | jenny | 67 | db |
| 2 | mark | 74 | java |
| 3 | nala | 90 | java |
I'm trying to get output like this:
| 3 | nala | 90 |
I wrote this in SQL, but the output is empty:
SELECT id,name,mark
FROM student
WHERE subject='Java'
AND mark=
(SELECT max(mark) FROM student);
How am I supposed to correct it?
There are many ways to get what you want in SQL. However, you should understand the problem with your approach:
SELECT id, name, mark
FROM student
WHERE subject = 'Java' AND
mark = (SELECT max(mark) FROM student);
The problem is that the maximum value of mark may not be for 'Java'. Hence, no rows can pass both where conditions.
You need to repeat the filter in the subquery, either explicitly:
SELECT id, name, mark
FROM student
WHERE subject = 'Java' AND
mark = (SELECT max(mark) FROM student WHERE subject = 'Java');
Or using a correlated subquery:
SELECT s.id, s.name, s.mark
FROM student s
WHERE s.subject = 'Java' AND
s.mark = (SELECT max(mark) FROM student s2 WHERE s2.subject = s.subject);
Notice that the last query uses table aliases. You should learn to use these in your queries; sometimes they are necessary and they generally make queries easier to write, read, and understand.
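The effect of the missing filter can be reproduced with Python's sqlite3 module. Note that jenny's mark is raised to 95 here, which is an assumption made so that the overall maximum belongs to a non-java row; SQLite also compares strings case-sensitively, so the literal is written as 'java'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER, name TEXT, mark INTEGER, subject TEXT)"
)
# jenny's mark is changed from 67 to 95 for this illustration only.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "jenny", 95, "db"), (2, "mark", 74, "java"), (3, "nala", 90, "java")],
)

# Unfiltered subquery: MAX(mark) is 95, which no java row has.
unfiltered = conn.execute(
    """
    SELECT id, name, mark
    FROM student
    WHERE subject = 'java'
      AND mark = (SELECT MAX(mark) FROM student)
    """
).fetchall()

# Correlated subquery: the maximum is taken within the outer row's subject.
filtered = conn.execute(
    """
    SELECT s.id, s.name, s.mark
    FROM student s
    WHERE s.subject = 'java'
      AND s.mark = (SELECT MAX(s2.mark) FROM student s2
                    WHERE s2.subject = s.subject)
    """
).fetchall()

print(unfiltered)  # []
print(filtered)    # [(3, 'nala', 90)]
```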
You don't need a subquery. Use TOP 1 WITH TIES to get the student with the maximum mark, including the case where the maximum mark is shared by more than one student.
The WHERE condition filters the result down to subject = 'Java'; after that, TOP 1 with ORDER BY fetches the maximum mark in Java.
SELECT TOP 1 with ties id, name, mark
FROM student
WHERE subject = 'Java'
ORDER BY mark DESC
Use TOP 1 with an ORDER BY clause to fetch the row with the highest mark.
Try this:
SELECT TOP 1 id, name, mark
FROM student
WHERE subject = 'Java'
ORDER BY mark DESC;
OR
SELECT id, name, mark
FROM (SELECT id, name, mark, ROW_NUMBER() OVER (ORDER BY mark DESC) AS RowNum
FROM student
WHERE subject = 'Java'
) AS A
WHERE RowNum = 1;
With what you are trying (with subquery), you could do:
SELECT id,name,mark
FROM student
WHERE subject='Java'
AND mark = (SELECT MAX(mark)
FROM student
WHERE subject='Java');
Your original subquery fetches the maximum over all the records, irrespective of subject.
Try this simple query to get the result:
SELECT TOP 1 ID,NAME,mark FROM STUDENT
WHERE SUBJECT ='JAVA'
ORDER BY MARK DESC

How to Select and Order By columns not in Group By SQL statement - Oracle

I have the following statement:
SELECT
IMPORTID,Region,RefObligor,SUM(NOTIONAL) AS SUM_NOTIONAL
From
Positions
Where
ID = :importID
GROUP BY
IMPORTID, Region,RefObligor
Order BY
IMPORTID, Region,RefObligor
There are some extra columns in the Positions table that I want in the output as "display data" but don't want in the GROUP BY clause.
These are Site and Desk.
Final output would have the following columns:
IMPORTID,Region,Site,Desk,RefObligor,SUM(NOTIONAL) AS SUM_NOTIONAL
Ideally I'd want the data sorted like:
Order BY
IMPORTID,Region,Site,Desk,RefObligor
How to achieve this?
It does not make sense to include columns that are not part of the GROUP BY clause. Consider if you have a MIN(X), MAX(Y) in the SELECT clause, which row should other columns (not grouped) come from?
If your Oracle version is recent enough, you can use the analytic form SUM(...) OVER (...) to show the grouped SUM against every data row.
SELECT
IMPORTID,Site,Desk,Region,RefObligor,
SUM(NOTIONAL) OVER(PARTITION BY IMPORTID, Region,RefObligor) AS SUM_NOTIONAL
From
Positions
Where
ID = :importID
Order BY
IMPORTID,Region,Site,Desk,RefObligor
Alternatively, you need to make an aggregate out of the Site, Desk columns
SELECT
IMPORTID,Region,Min(Site) Site, Min(Desk) Desk,RefObligor,SUM(NOTIONAL) AS SUM_NOTIONAL
From
Positions
Where
ID = :importID
GROUP BY
IMPORTID, Region,RefObligor
Order BY
IMPORTID, Region,Min(Site),Min(Desk),RefObligor
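The two alternatives above return differently shaped results, which the following sketch makes visible. It uses Python's sqlite3 module (window functions need SQLite 3.25 or later) rather than Oracle, the rows are invented, and it filters on importid because the question's separate ID column is not described.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE positions (importid INTEGER, region TEXT, site TEXT, "
    "desk TEXT, refobligor TEXT, notional INTEGER)"
)
# Invented rows for illustration only.
conn.executemany(
    "INSERT INTO positions VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "EU", "London", "Rates", "ACME", 100),
        (1, "EU", "Paris", "Credit", "ACME", 50),
        (1, "US", "NY", "Rates", "Globex", 70),
    ],
)

# Analytic SUM: every detail row survives and carries its group's total.
windowed = conn.execute(
    """
    SELECT importid, region, site, desk, refobligor,
           SUM(notional) OVER (PARTITION BY importid, region, refobligor)
    FROM positions
    WHERE importid = ?
    ORDER BY importid, region, site, desk, refobligor
    """,
    (1,),
).fetchall()

# GROUP BY with MIN(): one row per group, but MIN(site) and MIN(desk) are
# computed independently and may pair values that never shared a row.
grouped = conn.execute(
    """
    SELECT importid, region, MIN(site), MIN(desk), refobligor, SUM(notional)
    FROM positions
    WHERE importid = ?
    GROUP BY importid, region, refobligor
    ORDER BY importid, region, MIN(site), MIN(desk), refobligor
    """,
    (1,),
).fetchall()

for row in windowed:
    print(row)
print(grouped)  # the EU row pairs 'London' with 'Credit'
```

In the grouped result the EU row shows London together with Credit, a combination that exists in no source row, so the MIN() variant is only suitable when Site and Desk are constant within each group.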
I believe this is
select
IMPORTID,
Region,
Site,
Desk,
RefObligor,
Sum(Sum(Notional)) over (partition by IMPORTID, Region, RefObligor)
from
Positions
group by
IMPORTID, Region, Site, Desk, RefObligor
order by
IMPORTID, Region, RefObligor, Site, Desk;
... but it's hard to tell without further information and/or test data.
A great blog post that covers this dilemma in detail is here:
http://bernardoamc.github.io/sql/2015/05/04/group-by-non-aggregate-columns/
Here are some snippets of it:
Given:
CREATE TABLE games (
game_id serial PRIMARY KEY,
name VARCHAR,
price BIGINT,
released_at DATE,
publisher TEXT
);
INSERT INTO games (name, price, released_at, publisher) VALUES
('Metal Slug Defense', 30, '2015-05-01', 'SNK Playmore'),
('Project Druid', 20, '2015-05-01', 'shortcircuit'),
('Chroma Squad', 40, '2015-04-30', 'Behold Studios'),
('Soul Locus', 30, '2015-04-30', 'Fat Loot Games'),
('Subterrain', 40, '2015-04-30', 'Pixellore');
SELECT * FROM games;
game_id | name | price | released_at | publisher
---------+--------------------+-------+-------------+----------------
1 | Metal Slug Defense | 30 | 2015-05-01 | SNK Playmore
2 | Project Druid | 20 | 2015-05-01 | shortcircuit
3 | Chroma Squad | 40 | 2015-04-30 | Behold Studios
4 | Soul Locus | 30 | 2015-04-30 | Fat Loot Games
5 | Subterrain | 40 | 2015-04-30 | Pixellore
(5 rows)
Trying to get something like this:
SELECT released_at, name, publisher, MAX(price) as most_expensive
FROM games
GROUP BY released_at;
But name and publisher are not added due to being ambiguous when aggregating...
Let’s make this clear:
Selecting the MAX(price) does not select the entire row.
The database can’t know, and when it can’t give the right answer every time for a given query it should give us an error, and that’s what it does!
Ok… Ok… It’s not so simple, what can we do?
Use an inner join to get the additional columns
SELECT g1.name, g1.publisher, g1.price, g1.released_at
FROM games AS g1
INNER JOIN (
SELECT released_at, MAX(price) as price
FROM games
GROUP BY released_at
) AS g2
ON g2.released_at = g1.released_at AND g2.price = g1.price;
Or use a left outer join to get the additional columns, and then filter on the NULL of the joined copy's column:
SELECT g1.name, g1.publisher, g1.price, g2.price, g1.released_at
FROM games AS g1
LEFT OUTER JOIN games AS g2
ON g1.released_at = g2.released_at AND g1.price < g2.price
WHERE g2.price IS NULL;
Hope that helps.
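Both join variants from the blog snippets can be run against the games data with Python's sqlite3 module, as sketched below. The column types are simplified for SQLite (the original DDL reads like PostgreSQL), and the ORDER BY clauses are added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE games (game_id INTEGER PRIMARY KEY, name TEXT, "
    "price INTEGER, released_at TEXT, publisher TEXT)"
)
conn.executemany(
    "INSERT INTO games (name, price, released_at, publisher) VALUES (?, ?, ?, ?)",
    [
        ("Metal Slug Defense", 30, "2015-05-01", "SNK Playmore"),
        ("Project Druid", 20, "2015-05-01", "shortcircuit"),
        ("Chroma Squad", 40, "2015-04-30", "Behold Studios"),
        ("Soul Locus", 30, "2015-04-30", "Fat Loot Games"),
        ("Subterrain", 40, "2015-04-30", "Pixellore"),
    ],
)

# Inner join against the per-date maximum price.
inner_join = conn.execute(
    """
    SELECT g1.name, g1.publisher, g1.price, g1.released_at
    FROM games AS g1
    INNER JOIN (SELECT released_at, MAX(price) AS price
                FROM games
                GROUP BY released_at) AS g2
      ON g2.released_at = g1.released_at AND g2.price = g1.price
    ORDER BY g1.released_at, g1.name
    """
).fetchall()

# Anti-join: keep rows for which no pricier game exists on the same date.
anti_join = conn.execute(
    """
    SELECT g1.name, g1.publisher, g1.price, g1.released_at
    FROM games AS g1
    LEFT OUTER JOIN games AS g2
      ON g1.released_at = g2.released_at AND g1.price < g2.price
    WHERE g2.price IS NULL
    ORDER BY g1.released_at, g1.name
    """
).fetchall()

for row in inner_join:
    print(row)
```

Both queries return ties, so 2015-04-30 yields two rows (Chroma Squad and Subterrain, both priced 40).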