SQL only select rows with max date within each user - sql

SQL beginner here. I've got a simple test that users take, and each row is the answer to one of their questions. They're allowed to take the exam once per day, so some people take it a second time on another day, and thus will have many rows with different test dates. What I'm basically trying to do is get each user's most recent score.
Here is what my data looks like (table name is dumdum):
+----------+----------------+----------+------------------+
| USERNAME | CORRECT_ANSWER | RESPONSE | DATE_TAKEN |
+----------+----------------+----------+------------------+
| matt | 1 | 1 | 3/23/15 1:04:26 |
| matt | 2 | 2 | 3/23/15 1:04:28 |
| matt | 3 | 3 | 3/23/15 1:04:23 |
| david | 1 | 3 | 3/20/15 1:04:25 |
| david | 2 | 2 | 3/20/15 1:04:28 |
| david | 3 | 1 | 3/20/15 1:04:30 |
| david | 1 | 1 | 3/21/15 11:03:14 |
| david | 2 | 3 | 3/21/15 11:03:17 |
| david | 3 | 2 | 3/21/15 11:03:19 |
| chris | 1 | 2 | 3/17/15 12:45:52 |
| chris | 2 | 2 | 3/17/15 12:45:56 |
| chris | 3 | 3 | 3/17/15 12:45:59 |
| peter | 1 | 1 | 3/19/15 2:45:33 |
| peter | 2 | 3 | 3/19/15 2:45:35 |
| peter | 3 | 2 | 3/19/15 2:45:38 |
| peter | 1 | 1 | 3/20/15 12:32:04 |
| peter | 2 | 2 | 3/20/15 12:32:05 |
| peter | 3 | 3 | 3/20/15 12:32:05 |
+----------+----------------+----------+------------------+
and what I'm trying to get in the end...
+----------+------------------+-------+
| USERNAME | MOST_RECENT_TEST | SCORE |
+----------+------------------+-------+
| matt | 3/23/2015 | 100 |
| david | 3/21/2015 | 33 |
| chris | 3/17/2015 | 67 |
| peter | 3/20/2015 | 100 |
+----------+------------------+-------+
I ran into some trouble because I need to go by day, and not by day/time, so I had to do a weird maneuver where I went to character and back to date... This is what I have so far, but I can't figure out how to use only the scores from the most recent test (right now it's factoring in all scores from every test ever taken)...
SELECT username, to_date(substr(max(test_date),1,9),'dd-MON-yy') as most_recent_test, round((sum(case when response=correct_answer then 1 end)/3)*100,0) as score
FROM dumdum group by username
Any help would be appreciated! Thanks!

There are several solutions to this problem. This one uses the WITH clause and the RANK function.
It also uses TRUNC rather than to_date(substr(...)) to strip the time portion from the date:
WITH mxDate AS
  (SELECT USERNAME,
          TRUNC(DATE_TAKEN) AS MOST_RECENT_TEST,
          CASE WHEN CORRECT_ANSWER = RESPONSE THEN 1 ELSE 0 END AS SCORE,
          RANK() OVER (PARTITION BY USERNAME
                       ORDER BY TRUNC(DATE_TAKEN) DESC) AS Rk
   FROM dumdum)
SELECT
  USERNAME,
  MOST_RECENT_TEST,
  ROUND(SUM(SCORE) / 3 * 100, 0) AS SCORE
FROM
  mxDate
WHERE
  Rk = 1
GROUP BY
  USERNAME,
  MOST_RECENT_TEST
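To try the idea without an Oracle instance, here is a minimal sketch that runs the same rank-then-aggregate pattern against an in-memory SQLite database from Python. It is an approximation under a few assumptions: SQLite 3.25 or later (for window functions), timestamps stored as ISO text, and `date()` standing in for Oracle's `TRUNC`. The score divides by `COUNT(*)` instead of a hard-coded 3, which is a small variation on the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dumdum (username TEXT, correct_answer INTEGER, "
    "response INTEGER, date_taken TEXT)"
)
# Sample data from the question, with dates rewritten as ISO text (assumption).
sample_rows = [
    ("matt", 1, 1, "2015-03-23 01:04:26"),
    ("matt", 2, 2, "2015-03-23 01:04:28"),
    ("matt", 3, 3, "2015-03-23 01:04:23"),
    ("david", 1, 3, "2015-03-20 01:04:25"),
    ("david", 2, 2, "2015-03-20 01:04:28"),
    ("david", 3, 1, "2015-03-20 01:04:30"),
    ("david", 1, 1, "2015-03-21 11:03:14"),
    ("david", 2, 3, "2015-03-21 11:03:17"),
    ("david", 3, 2, "2015-03-21 11:03:19"),
    ("chris", 1, 2, "2015-03-17 12:45:52"),
    ("chris", 2, 2, "2015-03-17 12:45:56"),
    ("chris", 3, 3, "2015-03-17 12:45:59"),
    ("peter", 1, 1, "2015-03-19 02:45:33"),
    ("peter", 2, 3, "2015-03-19 02:45:35"),
    ("peter", 3, 2, "2015-03-19 02:45:38"),
    ("peter", 1, 1, "2015-03-20 12:32:04"),
    ("peter", 2, 2, "2015-03-20 12:32:05"),
    ("peter", 3, 3, "2015-03-20 12:32:05"),
]
conn.executemany("INSERT INTO dumdum VALUES (?, ?, ?, ?)", sample_rows)

latest_score_sql = """
WITH ranked AS (
    SELECT username,
           date(date_taken) AS most_recent_test,   -- SQLite stand-in for TRUNC
           CASE WHEN correct_answer = response THEN 1 ELSE 0 END AS is_correct,
           RANK() OVER (PARTITION BY username
                        ORDER BY date(date_taken) DESC) AS rk
    FROM dumdum
)
SELECT username,
       most_recent_test,
       CAST(ROUND(SUM(is_correct) * 100.0 / COUNT(*)) AS INTEGER) AS score
FROM ranked
WHERE rk = 1            -- keep only rows from each user's latest test day
GROUP BY username, most_recent_test
ORDER BY username
"""
latest_scores = conn.execute(latest_score_sql).fetchall()
for row in latest_scores:
    print(row)
```

Because every row taken on the latest day ties for rank 1, RANK keeps all answers from that day, which is what the score needs.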

Related

SQL, query to check and list distinct entries that occur in another table within a specific time frame

I'm using Oracle.
I have two tables. One contains users and the other is an access log of sorts. I need to list all users whose latest log entry appears in the log within a specified time frame including the timestamp of the latest entry. A single user can have several entries in the log.
Here are simplified versions of the tables:
Users
|----------------------------------|
| userid| username | name |
|----------------------------------|
| 1 | josm | John Smith |
| 2 | lajo | Laura Jones |
| 3 | miwi | Mike Williams |
| 4 | subo | Susan Brown |
| 5 | peda | Peter Davis |
| 6 | jami | Jane Miller |
|----------------------------------|
Log
|----------------------------------|
| userid| action | timestamp |
|----------------------------------|
| 3 | a | 20-01-2020 |
| 2 | v | 19-11-2019 |
| 2 | y | 02-11-2019 |
| 4 | b | 15-09-2019 |
| 1 | a | 23-05-2019 |
| 6 | y | 22-05-2019 |
| 3 | b | 16-04-2019 |
| 2 | a | 07-01-2019 |
| 5 | v | 18-11-2018 |
| 6 | a | 12-09-2018 |
|----------------------------------|
Desired result if the time frame is set to last six months:
|---------------------------------------|
| username | name | timestamp |
|--------------------------|------------|
| miwi | Mike Williams | 20-01-2020 |
| lajo | Laura Jones | 19-11-2019 |
| subo | Susan Brown | 15-09-2019 |
|---------------------------------------|
Any help will be greatly appreciated.
You can use aggregation:
select u.username, u.name, max(l.timestamp) as latest_entry
from log l join
     users u
     on l.userid = u.userid
group by u.userid, u.username, u.name
having max(l.timestamp) >= add_months(sysdate, -6)
order by latest_entry desc
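As a rough check of the aggregation approach, the sketch below runs it on the sample data in an in-memory SQLite database. Several things here are assumptions made for the demo: the log column is renamed `ts`, dates are stored as ISO text, `date(:as_of, '-6 months')` replaces Oracle's `add_months(sysdate, -6)`, and the reference date is pinned to 2020-02-01 so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (userid INTEGER PRIMARY KEY, username TEXT, name TEXT)")
conn.execute("CREATE TABLE log (userid INTEGER, action TEXT, ts TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", [
    (1, "josm", "John Smith"), (2, "lajo", "Laura Jones"),
    (3, "miwi", "Mike Williams"), (4, "subo", "Susan Brown"),
    (5, "peda", "Peter Davis"), (6, "jami", "Jane Miller"),
])
conn.executemany("INSERT INTO log VALUES (?, ?, ?)", [
    (3, "a", "2020-01-20"), (2, "v", "2019-11-19"), (2, "y", "2019-11-02"),
    (4, "b", "2019-09-15"), (1, "a", "2019-05-23"), (6, "y", "2019-05-22"),
    (3, "b", "2019-04-16"), (2, "a", "2019-01-07"), (5, "v", "2018-11-18"),
    (6, "a", "2018-09-12"),
])

recent_sql = """
SELECT u.username, u.name, MAX(l.ts) AS latest_entry
FROM log l
JOIN users u ON u.userid = l.userid
GROUP BY u.userid, u.username, u.name
HAVING MAX(l.ts) >= date(:as_of, '-6 months')   -- filter on the latest entry only
ORDER BY latest_entry DESC
"""
# as_of is a fixed, hypothetical "today" used instead of the system clock.
recent = conn.execute(recent_sql, {"as_of": "2020-02-01"}).fetchall()
for row in recent:
    print(row)
```

The filter sits in HAVING rather than WHERE because it applies to each user's latest entry, not to individual log rows.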

SQL Server 2016 count similar rows as a column without duplicating query

I have a SQL query that returns data similar to this pseudo-table:
| Name | Id1 | Id2 | Guid |
|------+-----+-----+------|
| Joe | 1 | 1 | 1123 |
| Joe | 2 | 1 | 1123 |
| Joe | 3 | 1 | 1120 |
| Jeff | 1 | 1 | 1123 |
| Moe | 3 | 42 | 1120 |
I would like to display an additional column on the output, listing the total number of records that have matching GUIDs to a given row, like this:
| Name | Id1 | Id2 | Guid | # Matching |
+------+-----+-----+------+------------+
| Joe | 1 | 1 | 1123 | 3 |
| Joe | 2 | 1 | 1123 | 3 |
| Joe | 3 | 1 | 1120 | 2 |
| Jeff | 1 | 1 | 1123 | 3 |
| Moe | 3 | 42 | 1120 | 2 |
I was able to accomplish this by joining the query with itself and doing a count. However, the query is rather large and takes a while to complete. Is there any way I can accomplish this without joining the query with itself?
You want a window function:
select t.*, count(*) over (partition by guid) as num_matching
from t;
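For a quick local check of the window-function approach, here is a small sketch using Python's sqlite3 module. SQLite (3.25 or later) is only a stand-in for SQL Server here, and the table name `t` follows the answer; the `ORDER BY t.rowid` is added just to keep the output in insertion order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, id1 INTEGER, id2 INTEGER, guid INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", [
    ("Joe", 1, 1, 1123),
    ("Joe", 2, 1, 1123),
    ("Joe", 3, 1, 1120),
    ("Jeff", 1, 1, 1123),
    ("Moe", 3, 42, 1120),
])

# COUNT(*) OVER (PARTITION BY guid) attaches the per-guid row count to every
# row without collapsing rows, so no self-join is needed.
counted = conn.execute("""
    SELECT t.*, COUNT(*) OVER (PARTITION BY guid) AS num_matching
    FROM t
    ORDER BY t.rowid
""").fetchall()
for row in counted:
    print(row)
```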

SQL Get cases related to a user and the number of files attached to that case

Hi everyone, I've got a little stuck on an SQL query. I have four tables:
users
+----+------------+-----------+--------+
| id | first_name | last_name | active |
+----+------------+-----------+--------+
| 1 | Joe | Bloggs | 1 |
| 2 | John | Doe | 1 |
| 3 | Dave | Smith | 1 |
+----+------------+-----------+--------+
cases
+----+-----------+-------------+
| id | case_code | case_name |
+----+-----------+-------------+
| 1 | THEC12C | Test Case 1 |
| 2 | ABCD23A | Test Case 2 |
+----+-----------+-------------+
case_creditors
+----+---------+-------------+
| id | case_id | creditor_id |
+----+---------+-------------+
| 1 | 1 | 3 |
| 2 | 2 | 1 |
+----+---------+-------------+
case_files
+----+---------+----------+-----------+
| id | case_id | filename | file type |
+----+---------+----------+-----------+
| 1 | 1 | test.pdf | pdf |
| 2 | 2 | file.txt | txt |
| 3 | 2 | word.doc | doc |
+----+---------+----------+-----------+
When a user logs in, I need to show a table of that user's associated cases and the number of files attached to each case. So if Joe Bloggs logged in, he'd see the following table:
+-----------+-------------+-------+
| Case Code | Case Name | Files |
+-----------+-------------+-------+
| ABCD23A | Test Case 2 | 2 |
+-----------+-------------+-------+
I've been trying to write the SQL statement to do this but am getting stuck on the query, and wondered if someone could give me some pointers. The SQL I've got so far:
SELECT * FROM cases
(SELECT COUNT(*) FROM case_files WHERE case_files.case_id = cases.id) as Files
JOIN case_creditors ON cases.id = case_creditors.case_id
WHERE case_creditors.creditor_id = 1
Managed to sort this with the following (note that the table and column names differ from the simplified tables above):
SELECT
ips_case.*,
COUNT(case_files.file_id) AS Files
FROM
ips_case
LEFT JOIN case_files ON ips_case.id = case_files.case_id
JOIN case_creditors ON ips_case.id = case_creditors.case_id
WHERE
case_creditors.creditors_id = 4
GROUP BY
ips_case.id
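Since the accepted query uses different names from the tables in the question, here is a hedged sketch of the same join-and-count idea written against the question's simplified schema, run in an in-memory SQLite database. The helper `cases_for_creditor` and the column name `file_type` are hypothetical names introduced for this demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cases (id INTEGER PRIMARY KEY, case_code TEXT, case_name TEXT);
CREATE TABLE case_creditors (id INTEGER PRIMARY KEY, case_id INTEGER, creditor_id INTEGER);
CREATE TABLE case_files (id INTEGER PRIMARY KEY, case_id INTEGER, filename TEXT, file_type TEXT);
INSERT INTO cases VALUES (1, 'THEC12C', 'Test Case 1'), (2, 'ABCD23A', 'Test Case 2');
INSERT INTO case_creditors VALUES (1, 1, 3), (2, 2, 1);
INSERT INTO case_files VALUES
    (1, 1, 'test.pdf', 'pdf'), (2, 2, 'file.txt', 'txt'), (3, 2, 'word.doc', 'doc');
""")


def cases_for_creditor(connection, creditor_id):
    """Return (case_code, case_name, file_count) rows for one creditor."""
    return connection.execute(
        """
        SELECT c.case_code, c.case_name, COUNT(f.id) AS files
        FROM cases c
        JOIN case_creditors cc ON cc.case_id = c.id
        LEFT JOIN case_files f ON f.case_id = c.id  -- keeps cases with no files
        WHERE cc.creditor_id = ?
        GROUP BY c.id, c.case_code, c.case_name
        """,
        (creditor_id,),
    ).fetchall()


joe_cases = cases_for_creditor(conn, 1)   # Joe Bloggs has user id 1
print(joe_cases)
```

Counting `f.id` rather than `*` matters with the LEFT JOIN: a case with no files would still produce one joined row, and `COUNT(f.id)` reports 0 for it because the file columns are NULL.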

SQL compare multiple rows or partitions to find matches

The database I'm working on is DB2 and I have a problem similar to the following scenario:
Table Structure
-------------------------------
| Teacher Seating Arrangement |
-------------------------------
| PK | seat_argmt_id |
| | teacher_id |
-------------------------------
-----------------------------
| Seating Arrangement |
-----------------------------
|PK FK | seat_argmt_id |
|PK | Row_num |
|PK | seat_num |
|PK | child_name |
-----------------------------
Table Data
------------------------------
| Teacher Seating Arrangement|
------------------------------
| seat_argmt_id | teacher_id |
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
| 5 | 2 |
------------------------------
---------------------------------------------------
| Seating Arrangement |
---------------------------------------------------
| seat_argmt_id | row_num | seat_num | child_name |
| 1 | 1 | 1 | Abe |
| 1 | 1 | 2 | Bob |
| 1 | 1 | 3 | Cat |
| | | | |
| 2 | 1 | 1 | Abe |
| 2 | 1 | 2 | Bob |
| 2 | 1 | 3 | Cat |
| | | | |
| 3 | 1 | 1 | Abe |
| 3 | 1 | 2 | Cat |
| 3 | 1 | 3 | Bob |
| | | | |
| 4 | 1 | 1 | Abe |
| 4 | 1 | 2 | Bob |
| 4 | 1 | 3 | Cat |
| 4 | 2 | 2 | Dan |
---------------------------------------------------
I want to see where there are duplicate seating arrangements for a teacher. And by duplicates I mean where the row_num, seat_num, and child_name are the same among different seat_argmt_id for one teacher_id. So with the data provided above, only seat id 1 and 2 are what I would want to pull back, as they are duplicates on everything but the seat id. If all the children on the 2nd table are exact (sans the primary & foreign key, which is seat_argmt_id in this case), I want to see that.
My initial thought was to do a count(*) group by row#, seat#, and child. Everything with a count of > 1 would mean it's a dupe and = 1 would mean it's unique. That logic only works if you are comparing single rows though. I need to compare multiple rows. I cannot figure out a way to do it via SQL. The solution I have involves going outside of SQL and works (probably). I'm just wondering if there is a way to do it in DB2.
Does this do what you want?
select d.teacher_id, sa.row_num, sa.seat_num, sa.child_name
from seatingarrangement sa join
data d
on sa.seat_argmt_id = d.seat_argmt_id
group by d.teacher_id, sa.row_num, sa.seat_num, sa.child_name
having count(*) > 1;
EDIT:
If you want to find two arrangements that are the same:
select sa1.seat_argmt_id, sa2.seat_argmt_id
from seatingarrangement sa1 join
seatingarrangement sa2
on sa1.seat_argmt_id < sa2.seat_argmt_id and
sa1.row_num = sa2.row_num and
sa1.seat_num = sa2.seat_num and
sa1.child_name = sa2.child_name
group by sa1.seat_argmt_id, sa2.seat_argmt_id
having count(*) = (select count(*) from seatingarrangement sa where sa.seat_argmt_id = sa1.seat_argmt_id) and
count(*) = (select count(*) from seatingarrangement sa where sa.seat_argmt_id = sa2.seat_argmt_id);
This finds the matches between two arrangements and then verifies that the counts are correct.
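The pairwise comparison can be tried on the sample data with the sketch below, which uses an in-memory SQLite database as a stand-in for DB2. The table names `teacher_seating` and `seating` are made up for the demo, and the two extra joins that restrict pairs to the same teacher are an addition that is not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teacher_seating (seat_argmt_id INTEGER PRIMARY KEY, teacher_id INTEGER);
CREATE TABLE seating (seat_argmt_id INTEGER, row_num INTEGER,
                      seat_num INTEGER, child_name TEXT);
INSERT INTO teacher_seating VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 2);
INSERT INTO seating VALUES
    (1, 1, 1, 'Abe'), (1, 1, 2, 'Bob'), (1, 1, 3, 'Cat'),
    (2, 1, 1, 'Abe'), (2, 1, 2, 'Bob'), (2, 1, 3, 'Cat'),
    (3, 1, 1, 'Abe'), (3, 1, 2, 'Cat'), (3, 1, 3, 'Bob'),
    (4, 1, 1, 'Abe'), (4, 1, 2, 'Bob'), (4, 1, 3, 'Cat'), (4, 2, 2, 'Dan');
""")

duplicate_sql = """
SELECT t1.teacher_id, sa1.seat_argmt_id AS first_id, sa2.seat_argmt_id AS second_id
FROM seating sa1
JOIN seating sa2
  ON sa1.seat_argmt_id < sa2.seat_argmt_id      -- each pair once, no self-pairs
 AND sa1.row_num = sa2.row_num
 AND sa1.seat_num = sa2.seat_num
 AND sa1.child_name = sa2.child_name
JOIN teacher_seating t1 ON t1.seat_argmt_id = sa1.seat_argmt_id
JOIN teacher_seating t2 ON t2.seat_argmt_id = sa2.seat_argmt_id
                       AND t2.teacher_id = t1.teacher_id   -- same teacher only
GROUP BY t1.teacher_id, sa1.seat_argmt_id, sa2.seat_argmt_id
-- The number of matching seats must equal the size of BOTH arrangements.
HAVING COUNT(*) = (SELECT COUNT(*) FROM seating s
                   WHERE s.seat_argmt_id = sa1.seat_argmt_id)
   AND COUNT(*) = (SELECT COUNT(*) FROM seating s
                   WHERE s.seat_argmt_id = sa2.seat_argmt_id)
"""
duplicates = conn.execute(duplicate_sql).fetchall()
print(duplicates)
```

Arrangement 4 shares three seats with arrangements 1 and 2 but has a fourth seat of its own, so the second count check rejects it, leaving only the pair (1, 2).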

SQL Query - Grouping Data

So every morning at work we have a stand-up meeting. We throw the nearest object to hand around the room as a method of deciding who speaks in what order. Being slightly odd I decided it could be fun to get some data on these throws. So, every morning I memorise the order of throws (as well as other relevant things like who dropped the ball/strange sponge object that was probably once a ball too and who threw to someone who'd already been or just gave an atrocious throw), and record this data in a table:
+---------+-----+------------+----------+---------+----------+--------+--------------+
| throwid | day | date | thrownum | thrower | receiver | caught | correctthrow |
+---------+-----+------------+----------+---------+----------+--------+--------------+
| 1 | 1 | 10/01/2012 | 1 | dan | steve | 1 | 1 |
| 2 | 1 | 10/01/2012 | 2 | steve | alice | 1 | 1 |
| 3 | 1 | 10/01/2012 | 3 | alice | matt | 1 | 1 |
| 4 | 1 | 10/01/2012 | 4 | matt | justin | 1 | 1 |
| 5 | 1 | 10/01/2012 | 5 | justin | arif | 1 | 1 |
| 6 | 1 | 10/01/2012 | 6 | arif | pete | 1 | 1 |
| 7 | 1 | 10/01/2012 | 7 | pete | greg | 0 | 1 |
| 8 | 1 | 10/01/2012 | 8 | greg | alan | 1 | 1 |
| 9 | 1 | 10/01/2012 | 9 | alan | david | 1 | 1 |
| 10 | 1 | 10/01/2012 | 10 | david | dan | 1 | 1 |
| 11 | 2 | 11/01/2012 | 1 | dan | david | 1 | 1 |
| 12 | 2 | 11/01/2012 | 2 | david | alice | 1 | 1 |
| 13 | 2 | 11/01/2012 | 3 | alice | steve | 1 | 1 |
| 14 | 2 | 11/01/2012 | 4 | steve | arif | 1 | 1 |
| 15 | 2 | 11/01/2012 | 5 | arif | pete | 0 | 1 |
| 16 | 2 | 11/01/2012 | 6 | pete | justin | 1 | 1 |
| 17 | 2 | 11/01/2012 | 7 | justin | alan | 1 | 1 |
| 18 | 2 | 11/01/2012 | 8 | alan | dan | 1 | 1 |
| 19 | 2 | 11/01/2012 | 9 | dan | greg | 1 | 1 |
+---------+-----+------------+----------+---------+----------+--------+--------------+
I've now got quite a few days' worth of data for this, and I'm starting to run some queries on it for my own purposes (I've not told the rest of the team yet; I wouldn't like to influence the results). I've done a few with no issues, but I'm stuck trying to get a certain result out.
What I'm looking for is the number of times each person has been the last team member to receive the ball. Now, as you can see on the table, due to absences etc the number of throws per day is not always constant, so I can't simply select the receiver by thrownum.
In the case for the data above, it would return:
+--------+-------------------+
| person | LastReceiverTotal |
+--------+-------------------+
| dan | 1 |
| greg | 1 |
+--------+-------------------+
I've got this far:
SELECT MAX(thrownum) AS LastThrowNum, day FROM Throws GROUP BY day
Now, this returns some useful data. I get the highest thrownum for each and every day. It would seem like all I need to do is get the receiver for this value, and then get a count grouped by receiver to get my answer. This doesn't work, though, because the resultset isn't what it seems due to the above query using aggregate functions.
I suspect there's a much better way of designing tables to store the data to be honest, but equally I'm also sure there's a way to get this information with the tables as they are - some kind of inner query? I can't figure out how it would work. Can anyone shed some light on how this would be done?
The query that you have gives you the biggest thrownum for each day.
With that, you just do an inner join with your table to get the receiver and the number of times they appear.
select t.receiver as person, count(t.day) as LastReceiverTotal from Throws t
inner join (SELECT MAX(thrownum) AS LastThrowNum, day FROM Throws GROUP BY day) a on a.LastThrowNum = t.thrownum and a.day = t.day
group by t.receiver
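To see the join at work on the sample data, here is a small sketch using Python's sqlite3 module with an in-memory database. Only the columns the query needs are loaded (an assumption made to keep the demo short), and an `ORDER BY` is added so the output order is stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Throws (day INTEGER, thrownum INTEGER, thrower TEXT, receiver TEXT)")

day1 = ["dan", "steve", "alice", "matt", "justin", "arif", "pete", "greg", "alan", "david", "dan"]
day2 = ["dan", "david", "alice", "steve", "arif", "pete", "justin", "alan", "dan", "greg"]
for day, chain in ((1, day1), (2, day2)):
    # Each consecutive pair in the chain is one throw: thrower -> receiver.
    for thrownum, (thrower, receiver) in enumerate(zip(chain, chain[1:]), start=1):
        conn.execute("INSERT INTO Throws VALUES (?, ?, ?, ?)",
                     (day, thrownum, thrower, receiver))

last_receivers = conn.execute("""
    SELECT t.receiver AS person, COUNT(t.day) AS LastReceiverTotal
    FROM Throws t
    INNER JOIN (SELECT MAX(thrownum) AS LastThrowNum, day
                FROM Throws
                GROUP BY day) a
            ON a.LastThrowNum = t.thrownum AND a.day = t.day
    GROUP BY t.receiver
    ORDER BY person
""").fetchall()
print(last_receivers)
```

Joining on both `day` and `thrownum` is what pins each day's maximum back to a single row, so the receiver comes from the actual last throw and not from an arbitrary row in the group.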