How to get the first record from each group - sql

I have the following table:
id group name
1 2 dodo
2 1 sdf
3 2 sd
4 3 dfs
5 3 fda
....
and I want to get the first record (the one with the lowest id) from each group, like the following:
id group name
... 1 sdf
2 dodo
3 dfs
...

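-- "group" is a reserved word, so it is quoted here (MySQL uses backticks instead).
-- Selecting name next to MIN(id) without grouping by it is rejected by most
-- databases, so the minimum ids are joined back to the table instead.
SELECT t.id, t."group", t.name
FROM TABLE1 t
JOIN (SELECT MIN(id) AS id FROM TABLE1 GROUP BY "group") m ON m.id = t.id
ORDER BY t."group"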

select * from table_name where id in (select min(id) from table_name group by "group")
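
To check the subquery approach against the sample rows, here is a minimal sketch using Python's built-in sqlite3 module. The table name table1 and the choice of SQLite are assumptions made for illustration, and "group" is double-quoted because it is a reserved word.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE table1 (id INTEGER PRIMARY KEY, "group" INTEGER, name TEXT)')
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, 2, "dodo"), (2, 1, "sdf"), (3, 2, "sd"), (4, 3, "dfs"), (5, 3, "fda")],
)

# Keep only the rows whose id is the smallest id within its group.
first_rows = conn.execute(
    """
    SELECT id, "group", name
    FROM table1
    WHERE id IN (SELECT MIN(id) FROM table1 GROUP BY "group")
    ORDER BY "group"
    """
).fetchall()

print(first_rows)  # [(2, 1, 'sdf'), (1, 2, 'dodo'), (4, 3, 'dfs')]
```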

Related

Postgresql query to filter latest data based on 2 columns

Table Structure First
users table
id
1
2
3
sites table
id
1
2
site_memberships table
site_id | user_id | created_on
1 1 1
1 1 2
1 1 3
2 1 1
2 1 2
1 2 2
1 2 3
Assume that the higher the created_on number, the more recent the record.
Expected Output
site_id | user_id | created_on
1 1 3
2 1 2
1 2 3
I need the latest record for each user for each site membership.
Tried the following query, but this does not seem to work.
select * from users inner join
(
  SELECT ROW_NUMBER () OVER (
    PARTITION BY sm.user_id,
    sm.created_on
  ), sm.*
  from site_memberships sm
  inner join sites s on sm.site_id=s.id
) site_memberships
ON site_memberships.user_id = users.user_id where row_number=1
I think you have overcomplicated the problem you want to solve.
You seem to want aggregation:
select site_id, user_id, max(created_on)
from site_memberships sm
group by site_id, user_id;
If you had additional columns that you wanted, you could use distinct on instead:
select distinct on (site_id, user_id) sm.*
from site_memberships sm
order by site_id, user_id, created_on desc;
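
The attempt in the question partitions by user_id and created_on, which numbers rows within each (user, timestamp) pair rather than ranking a user's memberships by recency. As a rough sketch of the window-function route, the version below partitions by (site_id, user_id) and orders by created_on descending. It runs against SQLite through Python's sqlite3 module, which is an assumption made for illustration (window functions need SQLite 3.25 or newer); distinct on itself is specific to PostgreSQL.

```python
import sqlite3

# In-memory copy of the site_memberships sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE site_memberships (site_id INTEGER, user_id INTEGER, created_on INTEGER)"
)
conn.executemany(
    "INSERT INTO site_memberships VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 1, 2), (1, 1, 3), (2, 1, 1), (2, 1, 2), (1, 2, 2), (1, 2, 3)],
)

# Rank each membership within its (site, user) pair, newest first, and keep rank 1.
latest = conn.execute(
    """
    SELECT site_id, user_id, created_on
    FROM (
        SELECT sm.*,
               ROW_NUMBER() OVER (
                   PARTITION BY site_id, user_id
                   ORDER BY created_on DESC
               ) AS rn
        FROM site_memberships sm
    ) AS ranked
    WHERE rn = 1
    ORDER BY user_id, site_id
    """
).fetchall()

print(latest)  # [(1, 1, 3), (2, 1, 2), (1, 2, 3)]
```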

How to delete records with a lower version in BigQuery?

Let's say my table contains the following data:
id | name | version
1 Rahul 1
1 Rahul 2
2 John 1
3 Mike 1
2 John 2
4 Rubel 1
5 David 1
1 Rahul 3
I need to filter out the duplicate records that have a lower version. How can this be done?
The output essentially should be
id | name | version
1 Rahul 3
2 John 2
3 Mike 1
4 Rubel 1
5 David 1
For this dataset, aggregation seems sufficient:
select id, name, max(version) as max_version
from mytable
group by id, name
You can use not exists as follows:
select id, name, version
from your_table t
where not exists
  (select 1 from your_table tt
   where tt.id = t.id and tt.version > t.version)
Or you can use the analytic function row_number as follows:
select id, name, version from
  (select t.*,
          row_number() over (partition by id order by version desc) as rn
   from your_table t) t
where rn = 1
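
The queries above select the surviving rows; since the title asks about deleting, the same "a newer version exists" condition can also drive a DELETE. The sketch below is run against SQLite through Python's sqlite3 module purely for illustration. The table name mytable and the alias newer are assumptions, and the statement has not been checked against BigQuery, whose DML rules may differ.

```python
import sqlite3

# In-memory copy of the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, name TEXT, version INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "Rahul", 1), (1, "Rahul", 2), (2, "John", 1), (3, "Mike", 1),
        (2, "John", 2), (4, "Rubel", 1), (5, "David", 1), (1, "Rahul", 3),
    ],
)

# Delete every row for which a row with the same id and a higher version exists.
conn.execute(
    """
    DELETE FROM mytable
    WHERE EXISTS (
        SELECT 1
        FROM mytable AS newer
        WHERE newer.id = mytable.id
          AND newer.version > mytable.version
    )
    """
)

remaining = conn.execute(
    "SELECT id, name, version FROM mytable ORDER BY id"
).fetchall()
print(remaining)
# [(1, 'Rahul', 3), (2, 'John', 2), (3, 'Mike', 1), (4, 'Rubel', 1), (5, 'David', 1)]
```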

How to select IDs that have at least two specific instances in a given column

I'm working with a medical claim table in pyspark and I want to return only userid's that have at least 2 claim_ids. My table looks something like this:
claim_id | userid | diagnosis_type | claim_type
__________________________________________________
1 1 C100 M
2 1 C100a M
3 2 D50 F
5 3 G200 M
6 3 C100 M
7 4 C100a M
8 4 D50 F
9 4 A25 F
From this example, I would want to return userid's 1, 3, and 4 only. Currently I'm building a temp table to count all of the distinct instances of the claim_ids
create table temp.claim_count as
select distinct userid, count(distinct claim_id) as claims
from medical_claims
group by userid
and then pulling from this table when the number of claim_id >1
select distinct userid
from medical_claims
where userid in (
  select distinct userid
  from temp.claim_count
  where claims>1)
Is there a better / more efficient way of doing this?
If you want only the ids, then use group by:
select userid, count(*) as claims
from medical_claims
group by userid
having count(*) > 1;
If you want the original rows, then use window functions:
select mc.*
from (select mc.*, count(*) over (partition by userid) as num_claims
from medical_claims mc
) mc
where num_claims > 1;
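
As a quick check of the group by / having version against the sample claims, here is a minimal sketch run in SQLite through Python's sqlite3 module. SQLite stands in for the Spark SQL engine here, which is an assumption made only to keep the example self-contained.

```python
import sqlite3

# In-memory copy of the medical_claims sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE medical_claims "
    "(claim_id INTEGER, userid INTEGER, diagnosis_type TEXT, claim_type TEXT)"
)
conn.executemany(
    "INSERT INTO medical_claims VALUES (?, ?, ?, ?)",
    [
        (1, 1, "C100", "M"), (2, 1, "C100a", "M"), (3, 2, "D50", "F"),
        (5, 3, "G200", "M"), (6, 3, "C100", "M"), (7, 4, "C100a", "M"),
        (8, 4, "D50", "F"), (9, 4, "A25", "F"),
    ],
)

# One pass over the table: count claims per user, keep users with more than one.
multi_claim_users = conn.execute(
    """
    SELECT userid, COUNT(*) AS claims
    FROM medical_claims
    GROUP BY userid
    HAVING COUNT(*) > 1
    ORDER BY userid
    """
).fetchall()

print(multi_claim_users)  # [(1, 2), (3, 2), (4, 3)]
```

No temporary table is needed, because having filters on the aggregate directly.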

SQL COUNT(DISTINCT(field1)) GROUP BY MAX(field2)

I have a table like
name num_try
John 2
John 1
Mike 3
Mike 2
Linda 2
I want to count the distinct names, grouped by each name's MAX(num_try).
Desired result should look like
MAX(num_try) COUNT(DISTINCT(names))
2 2
3 1
Can you help me with this query?
select max_num_try, count(*) from
(
select name, max(num_try) as max_num_try
from table1
group by name
) a
group by max_num_try
order by max_num_try desc
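
The two-level aggregation can be verified on the sample rows with a short sketch, again using SQLite through Python's sqlite3 module as an illustrative assumption. The inner query reduces each name to its highest num_try, and the outer query counts how many names share each maximum.

```python
import sqlite3

# In-memory copy of the sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (name TEXT, num_try INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("John", 2), ("John", 1), ("Mike", 3), ("Mike", 2), ("Linda", 2)],
)

# Inner query: one row per name with its maximum; outer query: names per maximum.
counts = conn.execute(
    """
    SELECT max_num_try, COUNT(*) AS names
    FROM (
        SELECT name, MAX(num_try) AS max_num_try
        FROM table1
        GROUP BY name
    ) AS a
    GROUP BY max_num_try
    ORDER BY max_num_try DESC
    """
).fetchall()

print(counts)  # [(3, 1), (2, 2)]
```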

Counting and grouping 2 columns in a table

This is my table and data
id | owner | buyer
1 1 3
2 2 2
3 1 2
I want the result to be like this
user | totals
2 3
1 2
3 1
The user field covers both owner and buyer, so each appearance in either column counts.
I hope that makes sense.
Thanks!
You can do this using union all and group by:
select user, count(*)
from ((select owner as user from t
) union all
(select buyer from t
)
) ob
group by user
order by user;
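
The union all approach can be tried on the sample rows with the sketch below, run in SQLite through Python's sqlite3 module as an illustrative assumption. Two details differ from the query above and are also assumptions: the alias is spelled user_id because user is a reserved word in some databases, and the result is ordered by the count, descending, to match the ordering shown in the desired output.

```python
import sqlite3

# In-memory copy of the sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, owner INTEGER, buyer INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [(1, 1, 3), (2, 2, 2), (3, 1, 2)])

# Stack the owner and buyer columns into one column, then count per user.
totals = conn.execute(
    """
    SELECT user_id, COUNT(*) AS totals
    FROM (
        SELECT owner AS user_id FROM t
        UNION ALL
        SELECT buyer AS user_id FROM t
    ) AS ob
    GROUP BY user_id
    ORDER BY totals DESC, user_id
    """
).fetchall()

print(totals)  # [(2, 3), (1, 2), (3, 1)]
```

union all is used rather than union so that duplicates are kept; row 2, where the same user is both owner and buyer, contributes two to that user's total.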