Query for max to_date for one user id? - sql

I am getting some unexpected results from a SQL query.
Table data:
users:
id username
1 admin
2 x1
3 y1
4 z1
my_connections:
id user_id friend_user_id status
1 1 2 201
2 2 1 201
3 2 4 201
4 1 3 200
5 2 3 201
6 3 2 201
7 4 2 201
8 4 1 200
jobs:
id user_id company_name designation from_date to_date
1 1 A 1 2011-06-01 2011-07-30
2 1 B 11 2011-08-02 2014-01-20
3 2 c 12 2012-05-02 2014-01-20
4 3 D 13 2010-05-02 2014-01-20
5 4 E 11 2009-05-25 2014-01-01
Here is my query:
SELECT users.id,users.username,my_connections.user_id,my_connections.friend_user_id,my_connections.status,jobs.user_id,jobs.company_name,
jobs.designation,jobs.from_date,MAX(jobs.to_date)
FROM users
LEFT JOIN jobs ON jobs.user_id = users.id
LEFT JOIN my_connections ON my_connections.friend_user_id = users.id
WHERE my_connections.status = 201 AND users.id IN (1,3,4)
GROUP BY jobs.company_name
ORDER BY jobs.to_date DESC
And the results:
id username user_id friend_user_id status user_id company_name designs from_date to_date
3 .. 2 3 201 3 D .. 2010-05-02 2014-01-20
4 .. 2 4 201 4 E .. 2009-05-25 2014-01-01
1 .. 2 1 201 1 A .. 2011-08-02 2014-01-20
1 .. 2 1 201 1 B .. 2011-06-01 2011-07-30
In the result, I wanted one row per friend_user_id, with the maximum value of to_date. Instead I am getting multiple rows (if there are multiple rows in the jobs table).
How can I fix this query?

If you want one row per friend_user_id, you must group by friend_user_id; that guarantees unique values in that column. But I'm fairly sure that is not what you want on its own, because the other columns may then show incorrect data. I am not sure how your query runs at all with a single column in the GROUP BY; some engines (MySQL with ONLY_FULL_GROUP_BY disabled, for example) accept it and return values from an arbitrary row of each group for the ungrouped columns. The safe rule is to group by every non-aggregated column in the SELECT list and apply aggregate functions to the columns that are not in the GROUP BY clause, for example:
SELECT users.id,users.username,my_connections.user_id,my_connections.friend_user_id,my_connections.status,jobs.user_id,jobs.company_name,
jobs.designation,jobs.from_date,MAX(jobs.to_date)
FROM users
LEFT JOIN jobs ON jobs.user_id = users.id
LEFT JOIN my_connections ON my_connections.friend_user_id = users.id
WHERE my_connections.status = 201 AND users.id IN (1,3,4)
GROUP BY users.id,users.username,my_connections.user_id,my_connections.friend_user_id,my_connections.status,jobs.user_id,jobs.company_name,
jobs.designation,jobs.from_date
ORDER BY MAX(jobs.to_date) DESC
In this query every non-aggregated column in the SELECT list also appears in the GROUP BY clause. Any column that is not in the GROUP BY clause has to be wrapped in an aggregate function such as MAX(), AVG(), or SUM().
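Grouping by every job column still returns one row per job, so it does not by itself give one row per user. A minimal sketch of one common alternative, assuming the goal is each user's job with the latest to_date: compute MAX(to_date) per user_id in a derived table and join it back to jobs. It is run here through Python's sqlite3 module against the sample jobs rows; the alias `latest` and the in-memory database are illustrative only, and a user with two jobs tied on to_date would still get two rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jobs (
    id INTEGER PRIMARY KEY, user_id INTEGER, company_name TEXT,
    designation INTEGER, from_date TEXT, to_date TEXT
);
INSERT INTO jobs VALUES
    (1, 1, 'A', 1,  '2011-06-01', '2011-07-30'),
    (2, 1, 'B', 11, '2011-08-02', '2014-01-20'),
    (3, 2, 'c', 12, '2012-05-02', '2014-01-20'),
    (4, 3, 'D', 13, '2010-05-02', '2014-01-20'),
    (5, 4, 'E', 11, '2009-05-25', '2014-01-01');
""")

# Derived table "latest" holds one row per user with that user's newest to_date;
# joining it back to jobs keeps only the matching job row.
latest_jobs = conn.execute("""
    SELECT j.user_id, j.company_name, j.from_date, j.to_date
    FROM jobs AS j
    JOIN (
        SELECT user_id, MAX(to_date) AS max_to_date
        FROM jobs
        GROUP BY user_id
    ) AS latest
      ON latest.user_id = j.user_id
     AND latest.max_to_date = j.to_date
    WHERE j.user_id IN (1, 3, 4)
    ORDER BY j.user_id
""").fetchall()

for row in latest_jobs:
    print(row)
```

The users and my_connections joins from the question can then be attached to this one-row-per-user result without multiplying rows.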

Get max record for each group of records, link multiple tables

I seek to find the maximum timestamp (ob.create_ts) for each group of marketid's (ob.marketid), joining tables obe (ob.orderbookid = obe.orderbookid) and market (ob.marketid = m.marketid). Although there are a number of solutions posted like this for a single table, when I join multiple tables, I get redundant results. Sample table and desired results below:
table: ob
orderbookid  marketid  create_ts
1            1         1664635255298
2            1         1664635255299
3            1         1664635255300
4            2         1664635255301
5            2         1664635255302
6            2         1664635255303
table: obe
orderbookentryid  orderbookid  entryname
1                 1            'entry-1'
2                 1            'entry-2'
3                 1            'entry-3'
4                 2            'entry-4'
5                 2            'entry-5'
6                 3            'entry-6'
7                 3            'entry-7'
8                 4            'entry-8'
9                 5            'entry-9'
10                6            'entry-10'
table: m
marketid  marketname
1         'market-1'
2         'market-2'
desired results
ob.orderbookid  ob.marketid  obe.orderbookentryid  obe.entryname  m.marketname
3               1            6                     'entry-6'      'market-1'
3               1            7                     'entry-7'      'market-1'
6               2            10                    'entry-10'     'market-2'
Use ROW_NUMBER() to get a properly filtered ob table. Then JOIN the other tables onto that!
WITH
  ob_filtered AS (
    SELECT
      orderbookid,
      marketid
    FROM
    (
      SELECT
        *,
        ROW_NUMBER() OVER (
          PARTITION BY
            marketid
          ORDER BY
            create_ts DESC
        ) AS create_ts_rownumber
      FROM
        ob
    ) ob_with_rownumber
    WHERE
      create_ts_rownumber = 1
  )
SELECT
  ob_filtered.orderbookid,
  ob_filtered.marketid,
  obe.orderbookentryid,
  obe.entryname,
  m.marketname
FROM
  ob_filtered
JOIN m
  ON m.marketid = ob_filtered.marketid
JOIN obe
  ON ob_filtered.orderbookid = obe.orderbookid
;
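As a quick check of the filter-then-join approach, the sketch below loads the sample rows into an in-memory SQLite database from Python and runs the same query shape. It assumes a SQLite build with window-function support (3.25 or later); the ORDER BY on the final SELECT is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ob  (orderbookid INTEGER, marketid INTEGER, create_ts INTEGER);
CREATE TABLE obe (orderbookentryid INTEGER, orderbookid INTEGER, entryname TEXT);
CREATE TABLE m   (marketid INTEGER, marketname TEXT);
INSERT INTO ob VALUES
    (1, 1, 1664635255298), (2, 1, 1664635255299), (3, 1, 1664635255300),
    (4, 2, 1664635255301), (5, 2, 1664635255302), (6, 2, 1664635255303);
INSERT INTO obe VALUES
    (1, 1, 'entry-1'), (2, 1, 'entry-2'), (3, 1, 'entry-3'), (4, 2, 'entry-4'),
    (5, 2, 'entry-5'), (6, 3, 'entry-6'), (7, 3, 'entry-7'), (8, 4, 'entry-8'),
    (9, 5, 'entry-9'), (10, 6, 'entry-10');
INSERT INTO m VALUES (1, 'market-1'), (2, 'market-2');
""")

# Rank order books within each market by newest create_ts, keep rank 1,
# and only then join the entry and market tables onto the survivors.
rows = conn.execute("""
    WITH ob_filtered AS (
        SELECT orderbookid, marketid
        FROM (
            SELECT *,
                   ROW_NUMBER() OVER (
                       PARTITION BY marketid
                       ORDER BY create_ts DESC
                   ) AS create_ts_rownumber
            FROM ob
        ) AS ob_with_rownumber
        WHERE create_ts_rownumber = 1
    )
    SELECT ob_filtered.orderbookid, ob_filtered.marketid,
           obe.orderbookentryid, obe.entryname, m.marketname
    FROM ob_filtered
    JOIN m   ON m.marketid = ob_filtered.marketid
    JOIN obe ON obe.orderbookid = ob_filtered.orderbookid
    ORDER BY obe.orderbookentryid
""").fetchall()

for row in rows:
    print(row)
```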

How to add a condition to count function in PostgreSQL

I have these tables: Course, Subscription, Subscription_Course (a table that relates Course and Subscription), and Student. I want to select all the id_Course values that have a subscription count higher than 1, counting only subscriptions from different students. For example, if a student subscribes to the same course twice, the count should include that student only once.
These are my tables:
Student:
idStudent(pk)  cc        nif
1              30348507  232928185
2              30338507  231428185
3              30438507  233528185
4              30323231  3232132
Subscription
idsubscription(pk)  Student(fk)  value_subscription  vouchercurso  date
1                   1            100                 null          2021-11-01
2                   2            150                 null          2021-12-11
3                   3            160                 null          2021-01-03
4                   4            500                 null          1996-11-07
5                   1            900                 null          2001-07-05
6                   2            432                 null          2021-05-09
Subscription_Course
idsubscription(PK/fk)  id_Course(pk/fk)  Grade
1                      3                 9
2                      4                 15
3                      5                 12
6                      3                 9
5                      4                 16
2                      6                 20
6                      5                 4
For example, when counting within my table Subscription_Course only the id_course:5 would have a count higher than 1 because 3 and 4 have a subscription from the same student.
I have this query for now:
Select id_Course
From Subscription_Course
Group by id_Course
Having Count (id_Course)>1
I don't know what to do to add this condition to the count.
It seems you need to join to Subscription and count unique student ids (the Student column, going by the table layout in the question):
select id_Course
from Subscription_Course sc
join Subscription s
on s.idsubscription = sc.idsubscription
group by id_Course
having Count(distinct s.Student)>1
You can join the Subscription_Course table with the Subscription table in order to access the Student column. Then just count the distinct Student values for each id_Course value. (Column names below follow the table layout in the question.)
SELECT
Subscription_Course.id_Course,
COUNT(DISTINCT Subscription.Student) AS student_count
FROM Subscription_Course
INNER JOIN Subscription
ON Subscription_Course.idsubscription = Subscription.idsubscription
GROUP BY Subscription_Course.id_Course
HAVING COUNT(DISTINCT Subscription.Student) > 1
ORDER BY student_count DESC;
With result:
id_course | student_count
-----------+---------------
3 | 2
4 | 2
5 | 2

How to count on join a table with 2 conditions?

I have an items table
id  name
1   Nganu
2   Kae
3   Lho
Also I have an item_usages table:
id  item_id  user_id  usage_time
1   1        99       2021-10-07 00:00:00
2   2        99       2021-10-07 00:00:00
3   1        99       2021-10-08 00:00:00
4   1        22       2021-10-08 00:00:00
5   3        22       2021-10-08 00:00:00
6   1        99       2021-10-08 00:00:00
I want to find each item's total usage and one user's usage in a single query. For example, for user_id 99 the expected result is:
id  name   total_usage  user_usage
2   Kae    1            1
1   Nganu  4            3
3   Lho    1            0
I tried:
select
"items".*,
count(total_usage.id) as total_usage,
count(user_usage.id) as user_usage
from
"items"
left join
"item_usages" as "total_usage" on "items"."id" = "total_usage"."item_id"
left join
"item_usages" as "user_usage" on "user_usage"."item_id" = "items"."id"
and "user_usage"."user_id" = 99
group by
"items"."id";
but it returns:
id  name   total_usage  user_usage
2   Kae    1            1
1   Nganu  12           12
3   Lho    1            0
item_usages only has 6 rows, so why does Nganu show 12 for both counts? How can I fix my query?
I tried on PostgreSQL 12.8 and 13.4, and also tested on SQLFiddle (PostgreSQL 9.6). Here is the link:
http://sqlfiddle.com/#!17/f1aac/5
I got the query that returned the correct result:
select
"items".*,
min(total_usage.total_count) as total_usage,
count(user_usage.id) as user_usage
from "items"
left join
(select item_id,count(item_id) as total_count from item_usages group by item_id) as total_usage
on "items"."id" = "total_usage"."item_id"
left join "item_usages" as "user_usage"
on "user_usage"."item_id" = "items"."id" and "user_usage"."user_id" = 99
group by "items"."id";
But I don't know how it performs, so I am still looking for a faster query if possible, and I am still wondering:
Why does my first query give the wrong result?
The reason your query returns high numbers is that you join item_usages twice.
For Nganu, the first join produces 4 rows, and the second matches each of those 4 rows with the 3 rows for user 99, giving 4 × 3 = 12 rows before aggregation.
You can solve this problem with only 1 join:
select "items".id,
count(total_usage.id) as total_usage,
sum(case when total_usage.user_id = 99 then 1 else 0 end) as user_usage
from "items"
left join "item_usages" as "total_usage" on "items"."id" = "total_usage"."item_id"
group by "items".id
It should also run faster, although the difference will not be visible on a small dataset.
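The fan-out described above can be reproduced with the sample rows. This sketch uses Python's sqlite3 module as a stand-in for PostgreSQL (an assumption; the queries use only syntax common to both) and runs the double-join query next to the single-join query with conditional aggregation. items.name is added to the SELECT and GROUP BY so the output matches the expected result in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE item_usages (
    id INTEGER PRIMARY KEY, item_id INTEGER, user_id INTEGER, usage_time TEXT
);
INSERT INTO items VALUES (1, 'Nganu'), (2, 'Kae'), (3, 'Lho');
INSERT INTO item_usages VALUES
    (1, 1, 99, '2021-10-07 00:00:00'), (2, 2, 99, '2021-10-07 00:00:00'),
    (3, 1, 99, '2021-10-08 00:00:00'), (4, 1, 22, '2021-10-08 00:00:00'),
    (5, 3, 22, '2021-10-08 00:00:00'), (6, 1, 99, '2021-10-08 00:00:00');
""")

# Two joins on the same table: each usage row is paired with every
# user-99 row for the same item before COUNT runs.
double_join = conn.execute("""
    SELECT items.id, items.name,
           COUNT(total_usage.id), COUNT(user_usage.id)
    FROM items
    LEFT JOIN item_usages AS total_usage
           ON items.id = total_usage.item_id
    LEFT JOIN item_usages AS user_usage
           ON user_usage.item_id = items.id AND user_usage.user_id = 99
    GROUP BY items.id, items.name
    ORDER BY items.id
""").fetchall()

# One join: each usage row appears once, and the CASE picks out user 99.
single_join = conn.execute("""
    SELECT items.id, items.name,
           COUNT(total_usage.id),
           SUM(CASE WHEN total_usage.user_id = 99 THEN 1 ELSE 0 END)
    FROM items
    LEFT JOIN item_usages AS total_usage
           ON items.id = total_usage.item_id
    GROUP BY items.id, items.name
    ORDER BY items.id
""").fetchall()

print(double_join)
print(single_join)
```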

Selecting rows and filler (null data)

I have a table that looks like this:
ReportID | TeamID | Inning | Runs
1 A 1 3
1 A 2 3
1 A 5 7
1 B 1 3
1 B 3 2
1 B 6 1
I need to select all of that data, plus NULL rows for the missing innings. It also needs to stop at the highest inning across both teams (i.e. team B's highest inning is 6, so I would get 6 rows each for team A and team B, 12 rows in total).
For a visual, I need the output of the query to look like this:
ReportID | TeamID | Inning | Runs
1 A 1 3
1 A 2 3
1 A 3 NULL
1 A 4 NULL
1 A 5 7
1 A 6 NULL
1 B 1 3
1 B 2 NULL
1 B 3 2
1 B 4 NULL
1 B 5 NULL
1 B 6 1
Is there any way to do this with just a query? Modifying the original table to add the null values is not an option.
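Self join to generate the combinations of innings and teams within each report.
Left self join to pick up the runs, which are NULL where a team has no row for that inning.
Note that this only produces innings that appear for at least one team in the report, so an inning that neither team has (inning 4 in the sample) would need a numbers table or a generated series instead.
This is probably a lot more efficient if it's done outside of SQL.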
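-- "games" is an assumed name for the table shown in the question
SELECT ins.ReportID, teams.TeamID, ins.inning, MAX(score.Runs) AS Runs
FROM games AS ins
JOIN games AS teams
ON ins.ReportID = teams.ReportID
LEFT JOIN games AS score
ON ins.ReportID = score.ReportID
AND teams.TeamID = score.TeamID
AND ins.inning = score.inning
-- the self joins repeat each (report, team, inning) combination, so collapse them;
-- Runs is aggregated because it is not part of the GROUP BY
GROUP BY ins.ReportID, teams.TeamID, ins.inning;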
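The self join can only return innings that already exist in the table, and inning 4 in the sample exists for neither team. A sketch of a different technique that covers it: generate innings 1 through the report's highest inning with a recursive CTE, cross them with the report's teams, and left join the scores. It is run here against SQLite from Python; the table name `games` comes from the answer above, and the CTE names `innings` and `teams` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (ReportID INTEGER, TeamID TEXT, Inning INTEGER, Runs INTEGER);
INSERT INTO games VALUES
    (1, 'A', 1, 3), (1, 'A', 2, 3), (1, 'A', 5, 7),
    (1, 'B', 1, 3), (1, 'B', 3, 2), (1, 'B', 6, 1);
""")

filled = conn.execute("""
    WITH RECURSIVE innings(ReportID, Inning, MaxInning) AS (
        -- start every report at inning 1 and remember its highest inning
        SELECT ReportID, 1, MAX(Inning) FROM games GROUP BY ReportID
        UNION ALL
        -- keep adding innings until the highest one is reached
        SELECT ReportID, Inning + 1, MaxInning
        FROM innings
        WHERE Inning < MaxInning
    ),
    teams AS (
        SELECT DISTINCT ReportID, TeamID FROM games
    )
    SELECT t.ReportID, t.TeamID, i.Inning, g.Runs
    FROM teams AS t
    JOIN innings AS i ON i.ReportID = t.ReportID
    LEFT JOIN games AS g
           ON g.ReportID = t.ReportID
          AND g.TeamID = t.TeamID
          AND g.Inning = i.Inning
    ORDER BY t.ReportID, t.TeamID, i.Inning
""").fetchall()

for row in filled:
    print(row)  # Runs is None (NULL) for innings a team has no row for
```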

How to do grouping by a date span?

Consider this table structure.
Key ID VISITDATE
1 1 2011-01-07
2 1 2011-01-09
3 2 2011-01-10
4 1 2011-01-12
5 3 2011-01-12
6 1 2011-01-15
7 2 2011-01-21
9 1 2011-02-28
10 2 2011-03-21
11 1 2011-01-06
I need to get the ID, Key, and min(VisitDate) for each group of visits that fall within a 10-day span. If an ID has two visits within 10 days, only one row should appear in the result.
Result
KEY ID VISITDATE
11 1 2011-01-06
3 2 2011-01-10
5 3 2011-01-12
7 2 2011-01-21
9 1 2011-02-28
10 2 2011-03-21
Can this be done without a self join? I have a query that self-joins the table on ID and checks the DATEDIFF. Is there a better solution? Can we use a recursive CTE here?
EDIT
Prefer a solution which can use the index on date column
Yes a CTE would work nicely for this (everything with me is CTEs lately)...
;WITH TenDayVisits
AS (
SELECT
ID
,MIN(VisitDate) AS VisitDate
FROM Visits
GROUP BY ID
UNION ALL
SELECT
t.ID
,v.VisitDate
FROM Visits AS v
JOIN TenDayVisits AS t ON v.ID = t.ID
AND DATEDIFF(dd,t.Visitdate,v.VisitDate) > 10
)
SELECT
DISTINCT
v.[key]
,t.id
,t.VisitDate
FROM TenDayVisits as T
JOIN Visits AS v ON t.id = v.id
AND t.VisitDate = v.VisitDate
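For cross-checking a SQL result, the intended grouping can be written as a single ordered pass outside the database. This is a Python sketch of one reading of the requirement, not a translation of the CTE above: a visit opens a new span for its ID when it falls more than 10 days after the visit that opened that ID's current span. The function name and the `span_days` parameter are hypothetical.

```python
from datetime import date

# (key, id, visit_date) rows from the question
visits = [
    (1, 1, date(2011, 1, 7)),   (2, 1, date(2011, 1, 9)),
    (3, 2, date(2011, 1, 10)),  (4, 1, date(2011, 1, 12)),
    (5, 3, date(2011, 1, 12)),  (6, 1, date(2011, 1, 15)),
    (7, 2, date(2011, 1, 21)),  (9, 1, date(2011, 2, 28)),
    (10, 2, date(2011, 3, 21)), (11, 1, date(2011, 1, 6)),
]

def first_visit_per_span(rows, span_days=10):
    """Return the visit that opens each span, scanning in date order."""
    span_start = {}  # id -> date of the visit that opened the current span
    result = []
    for key, visitor_id, visit_date in sorted(rows, key=lambda r: (r[2], r[0])):
        start = span_start.get(visitor_id)
        # A new span begins on the first visit for an ID, or once the
        # current span's opening visit is more than span_days in the past.
        if start is None or (visit_date - start).days > span_days:
            span_start[visitor_id] = visit_date
            result.append((key, visitor_id, visit_date))
    return result

first_visits = first_visit_per_span(visits)
for row in first_visits:
    print(row)
```

On the sample data this yields the same six rows as the expected result in the question.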