PostgreSQL query to filter latest data based on 2 columns - sql

Table Structure First
users table
id
1
2
3
sites table
id
1
2
site_memberships table
site_id  user_id  created_on
1        1        1
1        1        2
1        1        3
2        1        1
2        1        2
1        2        2
1        2        3
Assume that the higher the created_on number, the more recent the record.
Expected Output
site_id  user_id  created_on
1        1        3
2        1        2
1        2        3
Expected output: I need the latest record for each user in each site.
I tried the following query, but it does not seem to work.
select * from users inner join
(
SELECT ROW_NUMBER () OVER (
PARTITION BY sm.user_id,
sm.created_on
), sm.*
from site_memberships sm
inner join sites s on sm.site_id=s.id
) site_memberships
ON site_memberships.user_id = users.user_id where row_number=1

I think you have overcomplicated the problem you want to solve.
You seem to want aggregation:
select site_id, user_id, max(created_on) as created_on
from site_memberships sm
group by site_id, user_id;
If you had additional columns that you wanted, you could use distinct on instead:
select distinct on (site_id, user_id) sm.*
from site_memberships sm
order by site_id, user_id, created_on desc;
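As a rough check of the aggregation above, here is a minimal sketch that loads the sample rows into an in-memory SQLite database (an assumption made only so the example runs anywhere; SQLite has no distinct on, and window functions need SQLite 3.25 or newer). It also shows a window-function variant of the attempted query: partitioning by user_id and created_on with no ORDER BY puts every row in its own tiny partition, so the sketch partitions by site_id and user_id and orders by created_on descending instead.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE site_memberships (site_id INTEGER, user_id INTEGER, created_on INTEGER)"
)
conn.executemany(
    "INSERT INTO site_memberships VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 1, 2), (1, 1, 3), (2, 1, 1), (2, 1, 2), (1, 2, 2), (1, 2, 3)],
)

# Aggregation: one row per (site_id, user_id) carrying the highest created_on.
agg_rows = conn.execute("""
    SELECT site_id, user_id, MAX(created_on) AS created_on
    FROM site_memberships
    GROUP BY site_id, user_id
    ORDER BY site_id, user_id
""").fetchall()

# Window variant: row number 1 is the latest row of each (site_id, user_id) group.
win_rows = conn.execute("""
    SELECT site_id, user_id, created_on
    FROM (
        SELECT sm.*,
               ROW_NUMBER() OVER (
                   PARTITION BY sm.site_id, sm.user_id
                   ORDER BY sm.created_on DESC
               ) AS rn
        FROM site_memberships sm
    ) ranked
    WHERE rn = 1
    ORDER BY site_id, user_id
""").fetchall()

print(agg_rows)
print(win_rows)
```

Both queries return the three rows from the expected output, sorted here by site_id and user_id.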

Related

Select rows with max date from table

I have a table like the one below and need the result shown in the second table. I am trying to select the rows with the max date grouped by profile_id and ordered by id. The result table must have an id column. I tried this query:
SELECT MAX(charges.id) as id,
"charges"."profile_id", MAX(failed_at) AS failed_at
FROM "charges"
GROUP BY "charges"."profile_id"
ORDER BY "charges"."id" ASC
And got this error:
ERROR: column "charges.id" must appear in the GROUP BY clause or be used in an aggregate function
Example table
id  profile_id  failed_at
1   1           01.01.2021
2   1           01.02.2021
3   1           01.03.2021
4   2           01.06.2021
5   2           01.05.2021
6   2           01.04.2021
Needed result
id  profile_id  failed_at
3   1           01.03.2021
4   2           01.06.2021
SELECT charges.*
FROM charges
INNER JOIN
(
SELECT
profile_id,
MAX(charges.failed_at) AS MaxFailed_at
FROM charges
GROUP BY profile_id
) AS xQ ON charges.profile_id = xQ.profile_id AND charges.failed_at = xQ.MaxFailed_at
ORDER BY charges.id;
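The join-to-maximum pattern can be tried out with a small sketch like the one below. It uses an in-memory SQLite database as a stand-in, and it assumes the example dates are in DD.MM.YYYY format, storing them as ISO strings so that MAX compares them chronologically; both choices are assumptions of this sketch, not something stated in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE charges (id INTEGER PRIMARY KEY, profile_id INTEGER, failed_at TEXT)"
)
# Example dates rewritten as ISO strings (assumed to be DD.MM.YYYY in the question).
conn.executemany(
    "INSERT INTO charges VALUES (?, ?, ?)",
    [
        (1, 1, "2021-01-01"), (2, 1, "2021-02-01"), (3, 1, "2021-03-01"),
        (4, 2, "2021-06-01"), (5, 2, "2021-05-01"), (6, 2, "2021-04-01"),
    ],
)

# Find the latest failed_at per profile, then join back to recover the full row.
latest = conn.execute("""
    SELECT c.id, c.profile_id, c.failed_at
    FROM charges c
    INNER JOIN (
        SELECT profile_id, MAX(failed_at) AS max_failed_at
        FROM charges
        GROUP BY profile_id
    ) x ON c.profile_id = x.profile_id AND c.failed_at = x.max_failed_at
    ORDER BY c.id
""").fetchall()

print(latest)
```

Note that if two rows of the same profile share the maximum failed_at, this pattern returns both of them.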

How to get rank of a user from all users

I have a table called summary_coins, and I am trying to get a user's rank based on their total coins.
I tried the following:
SELECT
user_id,
sum(get_count),
rank() over (order by sum(get_count) asc) as rank
FROM summary_coins
WHERE user_id = 2
GROUP BY user_id
Sample data: without user_id = 2 in the WHERE clause, I get the list below.
user_id sum rank
44 2 1
13 4 2
57 4 2
47 4 2
11 5 5
2 5 5
My desired output:
2 5 5
Here I always get rank 1 for user ID 2, but within the full list of users it should be rank 5.
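You want to apply WHERE user_id = 2 late. The WHERE clause is evaluated before GROUP BY and before window functions such as RANK() OVER, so in your query only user 2 is left by the time the ranks are computed, and that single row always gets rank 1. To rank all users first and filter afterwards, make your query a subquery that you select from: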
SELECT user_id, sum_count, rank
FROM
(
SELECT
user_id,
sum(get_count) AS sum_count,
rank() over (order by sum(get_count) asc) as rank
FROM summary_coins
GROUP BY user_id
) all_users
WHERE user_id = 2;
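The effect of filtering before versus after ranking can be seen in a small sketch. The individual get_count rows below are hypothetical; they were chosen only so that the per-user sums match the list in the question. The sketch uses an in-memory SQLite database and splits the aggregation into a CTE for readability, which is a restructuring of the query above rather than the exact same statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE summary_coins (user_id INTEGER, get_count INTEGER)")
# Hypothetical rows whose sums per user are 44->2, 13->4, 57->4, 47->4, 11->5, 2->5.
conn.executemany(
    "INSERT INTO summary_coins VALUES (?, ?)",
    [(44, 2), (13, 1), (13, 3), (57, 4), (47, 2), (47, 2), (11, 5), (2, 2), (2, 3)],
)

# Rank every user first, then pick out user 2.
ranked_then_filtered = conn.execute("""
    WITH totals AS (
        SELECT user_id, SUM(get_count) AS sum_count
        FROM summary_coins
        GROUP BY user_id
    ),
    ranked AS (
        SELECT user_id, sum_count, RANK() OVER (ORDER BY sum_count ASC) AS rnk
        FROM totals
    )
    SELECT user_id, sum_count, rnk FROM ranked WHERE user_id = 2
""").fetchall()

# Filter first: only user 2 reaches the ranking step, so the rank is always 1.
filtered_then_ranked = conn.execute("""
    WITH totals AS (
        SELECT user_id, SUM(get_count) AS sum_count
        FROM summary_coins
        WHERE user_id = 2
        GROUP BY user_id
    )
    SELECT user_id, sum_count, RANK() OVER (ORDER BY sum_count ASC) AS rnk
    FROM totals
""").fetchall()

print(ranked_then_filtered)
print(filtered_then_ranked)
```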

How to select IDs that have at least two specific instances in a given column

I'm working with a medical claim table in pyspark and I want to return only userid's that have at least 2 claim_ids. My table looks something like this:
claim_id | userid | diagnosis_type | claim_type
__________________________________________________
1 1 C100 M
2 1 C100a M
3 2 D50 F
5 3 G200 M
6 3 C100 M
7 4 C100a M
8 4 D50 F
9 4 A25 F
From this example, I would want to return userid's 1, 3, and 4 only. Currently I'm building a temp table to count all of the distinct instances of the claim_ids
create table temp.claim_count as
select distinct userid, count(distinct claim_id) as claims
from medical_claims
group by userid
and then pulling from this table when the number of claim_id >1
select distinct userid
from medical_claims
where userid in (
select distinct userid
from temp.claim_count
where claims>1)
Is there a better / more efficient way of doing this?
If you want only the ids, then use group by:
select userid, count(*) as claims
from medical_claims
group by userid
having count(*) > 1;
If you want the original rows, then use window functions:
select mc.*
from (select mc.*, count(*) over (partition by userid) as num_claims
from medical_claims mc
) mc
where num_claims > 1;
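Both variants can be checked against the sample table with a short sketch. It runs the SQL through an in-memory SQLite database rather than PySpark, which is an assumption made to keep the example self-contained; the queries themselves are the same shape as the ones above, with an ORDER BY added so the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE medical_claims "
    "(claim_id INTEGER, userid INTEGER, diagnosis_type TEXT, claim_type TEXT)"
)
conn.executemany(
    "INSERT INTO medical_claims VALUES (?, ?, ?, ?)",
    [
        (1, 1, "C100", "M"), (2, 1, "C100a", "M"), (3, 2, "D50", "F"),
        (5, 3, "G200", "M"), (6, 3, "C100", "M"), (7, 4, "C100a", "M"),
        (8, 4, "D50", "F"), (9, 4, "A25", "F"),
    ],
)

# Only the ids: group, then keep groups with more than one claim.
ids_with_counts = conn.execute("""
    SELECT userid, COUNT(*) AS claims
    FROM medical_claims
    GROUP BY userid
    HAVING COUNT(*) > 1
    ORDER BY userid
""").fetchall()

# The original rows: attach the per-user count to each row, then filter on it.
full_rows = conn.execute("""
    SELECT claim_id, userid, diagnosis_type, claim_type
    FROM (
        SELECT mc.*, COUNT(*) OVER (PARTITION BY userid) AS num_claims
        FROM medical_claims mc
    ) counted
    WHERE num_claims > 1
    ORDER BY claim_id
""").fetchall()

print(ids_with_counts)
print(full_rows)
```

Neither query needs the intermediate temp table.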

SQL - Order by amount of occurrences

It's my first question here so I hope I can explain it well enough,
I want to order my data by amount of occurrences in the table.
My table is like this:
id Daynr
1 2
1 4
2 4
2 5
2 6
3 1
4 2
4 5
And I want it to sort it like this:
id Daynr
3 1
1 2
1 4
4 2
4 5
2 4
2 5
2 6
Player #3 has one day in the table, and Player #1 has 2.
My table is named "dayid"
Both id and Daynr are foreign keys, and together they form the primary key.
I hope this explains my problem well enough. Please ask if you need more information; it's my first time here.
Thanks in advance
You can do this by counting the number of times that things occur for each id. Most databases support window functions, so you can do this as:
select id, daynr
from (select t.*, count(*) over (partition by id) as cnt
from dayid t
) t
order by cnt, id;
You can also express this as a join:
select t.id, t.daynr
from dayid as t inner join
(select id, count(*) as cnt
from dayid
group by id
) as tg
on t.id = tg.id
order by tg.cnt, t.id;
Note that both of these include the id in the order by. That way, if two ids have the same count, all rows for the id will appear together.
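Here is a minimal sketch of the window-function version run against the sample data, using an in-memory SQLite database as a stand-in. It adds daynr as a final sort key, which is an addition of this sketch, so that the order of rows within each id is also deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dayid (id INTEGER, daynr INTEGER, PRIMARY KEY (id, daynr))")
conn.executemany(
    "INSERT INTO dayid VALUES (?, ?)",
    [(1, 2), (1, 4), (2, 4), (2, 5), (2, 6), (3, 1), (4, 2), (4, 5)],
)

# Attach the number of rows per id to every row, then sort by that count.
# id keeps rows of the same player together; daynr fixes the order inside an id.
ordered = conn.execute("""
    SELECT id, daynr
    FROM (
        SELECT t.*, COUNT(*) OVER (PARTITION BY id) AS cnt
        FROM dayid t
    ) counted
    ORDER BY cnt, id, daynr
""").fetchall()

for row in ordered:
    print(row)
```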

Active Record select 15 records order by date with different field value using

Here I have some articles:
id text group_id source_id
1 t1 1 1
2 t2 1 1
3 t3 2 2
4 t4 3 4
So I want the records in the result ordered by the created_at column (it exists, but I didn't show it in the table) and having distinct group ids, like this:
id text group_id source_id
1 t1 1 1
3 t3 2 2
4 t4 3 4
Also, I should be able to filter result with source_id.
I've been stuck on this question for two days and don't even know how to start solving the problem.
Assuming you want the minimum values of the non-duplicated columns, try:
select min(id) as id,
min(text) as text,
group_id,
source_id,
min(created_at) as created_at
from articles
where source_id = #your_parameter_value
group by group_id,
source_id
order by 5
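Alternatively, in MySQL with ONLY_FULL_GROUP_BY disabled you can write the following. It is not valid in PostgreSQL, and MySQL does not guarantee which row of each group is returned:
Select * from
(Select * from articles
Order by group_id, id) x
Group by group_id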
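A portable way to get one whole row per group_id is a ROW_NUMBER window, sketched below with an in-memory SQLite database. The created_at values are made up, since the question does not show them, and the helper name first_per_group, the choice of the earliest row per group, and the limit of 15 (taken from the title) are all assumptions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles "
    "(id INTEGER PRIMARY KEY, text TEXT, group_id INTEGER, source_id INTEGER, created_at TEXT)"
)
# created_at values are hypothetical; the question does not list them.
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?, ?, ?)",
    [
        (1, "t1", 1, 1, "2021-01-01"), (2, "t2", 1, 1, "2021-01-02"),
        (3, "t3", 2, 2, "2021-01-03"), (4, "t4", 3, 4, "2021-01-04"),
    ],
)

def first_per_group(conn, source_id=None, max_rows=15):
    """Return the earliest article of each group_id, optionally for one source_id."""
    sql = """
        SELECT id, text, group_id, source_id
        FROM (
            SELECT a.*,
                   ROW_NUMBER() OVER (
                       PARTITION BY group_id ORDER BY created_at, id
                   ) AS rn
            FROM articles a
            WHERE (:source_id IS NULL OR source_id = :source_id)
        ) firsts
        WHERE rn = 1
        ORDER BY created_at
        LIMIT :max_rows
    """
    return conn.execute(sql, {"source_id": source_id, "max_rows": max_rows}).fetchall()

all_groups = first_per_group(conn)
only_source_1 = first_per_group(conn, source_id=1)
print(all_groups)
print(only_source_1)
```

To keep the most recent row of each group instead, order the window by created_at DESC.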