Counting in sql and subas

I have the following code
select ID, count(*) from
( select ID, service type from database
group by 1,2) suba
group by 1
having count (*) > 1
And I get a table where I see the IDs and a count of changes, similar to this:
ID | Count(*)
5675 | 2
5695 | 3
5855 | 2
5625 | 4
5725 | 3
Can someone explain to me how to group the count(*) values themselves, so that I get a table showing how many IDs have each count, similar to...
count (*) | number
2 | 2
3 | 2
4 | 1
and so forth. Can someone also explain to me what suba means?
MY NEWEST CODE:
select suba.id, count(*) from
( select id, service_type from table_name
group by 1,2) as suba
group by 1
having count (*) > 1

Haven't tried it, but I think this should work
select NoOfChanges, count (*) from
(
select suba.id, count(*) as NoOfChanges from
( select id, service_type from table_name
group by 1,2) as suba
group by 1
having count (*) > 1
)
subtableb
group by NoOfChanges
You can think of that as
select NoOfChanges, count (*) from subtableb
group by NoOfChanges
except that subtableb isn't a real table; it is the result set of your previous query, used as a derived table.
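Since the query above was posted untested, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. The rows are made up so that the inner counts match the table in the question, and the extra id 5000 is hypothetical, added only to show that HAVING filters it out. Other databases may treat GROUP BY 1, 2 differently, so take this as an illustration rather than a guarantee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (id INTEGER, service_type TEXT)")

# Made-up data: each id gets as many distinct service types as the
# "count of changes" shown in the question. 5000 is a hypothetical id
# with a single service type, so HAVING COUNT(*) > 1 drops it.
changes = {5675: 2, 5695: 3, 5855: 2, 5625: 4, 5725: 3, 5000: 1}
rows = [(id_, f"type_{n}") for id_, k in changes.items() for n in range(k)]
rows = rows + rows  # duplicate every row; the inner GROUP BY collapses them
conn.executemany("INSERT INTO table_name VALUES (?, ?)", rows)

query = """
SELECT NoOfChanges, COUNT(*) AS number
FROM (
    SELECT suba.id, COUNT(*) AS NoOfChanges
    FROM (
        SELECT id, service_type
        FROM table_name
        GROUP BY 1, 2
    ) AS suba
    GROUP BY 1
    HAVING COUNT(*) > 1
) AS subtableb
GROUP BY NoOfChanges
ORDER BY NoOfChanges
"""
result = conn.execute(query).fetchall()
print(result)  # [(2, 2), (3, 2), (4, 1)]
```

With this data the result has the shape asked for in the question: two ids with 2 changes, two with 3, and one with 4.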

suba is the alias of the subquery. Every table or subquery needs a unique name or an alias so you can refer to it in other parts of the query (and disambiguate). Note that the AS keyword is optional: in your first query it is simply omitted (implicit) between the closing parenthesis and "suba".

Related

Postgresql query to filter latest data based on 2 columns

Table structure first:
users table
id
1
2
3
sites table
id
1
2
site_memberships table
site_id | user_id | created_on
1 | 1 | 1
1 | 1 | 2
1 | 1 | 3
2 | 1 | 1
2 | 1 | 2
1 | 2 | 2
1 | 2 | 3
Assume that the higher the created_on number, the more recent the record.
Expected output:
site_id | user_id | created_on
1 | 1 | 3
2 | 1 | 2
1 | 2 | 3
I need the latest record for each user for each site membership.
Tried the following query, but this does not seem to work.
select * from users inner join
(
SELECT ROW_NUMBER () OVER (
PARTITION BY sm.user_id,
sm.created_on
), sm.*
from site_memberships sm
inner join sites s on sm.site_id=s.id
) site_memberships
ON site_memberships.user_id = users.user_id where row_number=1

I think you have overcomplicated the problem you want to solve.
You seem to want aggregation:
select site_id, user_id, max(created_on)
from site_memberships sm
group by site_id, user_id;
If you had additional columns that you wanted, you could use distinct on instead:
select distinct on (site_id, user_id) sm.*
from site_memberships sm
order by site_id, user_id, created_on desc;
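To check the aggregation against the sample data, here is a small sketch using Python's sqlite3 module with an in-memory database. DISTINCT ON is a PostgreSQL extension that SQLite does not have, so the second query swaps in ROW_NUMBER() partitioned by site_id and user_id and ordered by created_on descending as a stand-in for it. That second query needs an SQLite build with window functions (3.25 or later), which is an assumption about your environment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE site_memberships "
    "(site_id INTEGER, user_id INTEGER, created_on INTEGER)"
)
# Rows copied from the site_memberships table in the question.
conn.executemany(
    "INSERT INTO site_memberships VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 1, 2), (1, 1, 3), (2, 1, 1), (2, 1, 2), (1, 2, 2), (1, 2, 3)],
)

# Plain aggregation: one row per (site_id, user_id) with the highest created_on.
latest = conn.execute("""
    SELECT site_id, user_id, MAX(created_on) AS created_on
    FROM site_memberships
    GROUP BY site_id, user_id
    ORDER BY site_id, user_id
""").fetchall()

# Stand-in for DISTINCT ON: number the rows of each (site_id, user_id)
# group from newest to oldest and keep only the first one.
latest_rows = conn.execute("""
    SELECT site_id, user_id, created_on
    FROM (
        SELECT sm.*,
               ROW_NUMBER() OVER (
                   PARTITION BY site_id, user_id
                   ORDER BY created_on DESC
               ) AS rn
        FROM site_memberships sm
    ) AS ranked
    WHERE rn = 1
    ORDER BY site_id, user_id
""").fetchall()

print(latest)
print(latest_rows)
```

Both queries return the three rows from the expected output. The ROW_NUMBER form is the one to reach for when you need columns beyond the grouping keys.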

How to select IDs that have at least two specific instances in a given column

I'm working with a medical claim table in pyspark and I want to return only userid's that have at least 2 claim_ids. My table looks something like this:
claim_id | userid | diagnosis_type | claim_type
__________________________________________________
1 1 C100 M
2 1 C100a M
3 2 D50 F
5 3 G200 M
6 3 C100 M
7 4 C100a M
8 4 D50 F
9 4 A25 F
From this example, I would want to return userid's 1, 3, and 4 only. Currently I'm building a temp table to count all of the distinct instances of the claim_ids
create table temp.claim_count as
select distinct userid, count(distinct claim_id) as claims
from medical_claims
group by userid
and then pulling from this table when the number of claim_id >1
select distinct userid
from medical_claims
where userid in (
select distinct userid
from temp.claim_count
where claims>1)
Is there a better / more efficient way of doing this?

If you want only the ids, then use group by:
select userid, count(*) as claims
from medical_claims
group by userid
having count(*) > 1;
If you want the original rows, then use window functions:
select mc.*
from (select mc.*, count(*) over (partition by userid) as num_claims
from medical_claims mc
) mc
where num_claims > 1;
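Here is a rough check of both queries on the sample rows from the question. It uses Python's sqlite3 module with an in-memory database rather than Spark, so it only illustrates the SQL itself. The window-function query assumes SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE medical_claims "
    "(claim_id INTEGER, userid INTEGER, diagnosis_type TEXT, claim_type TEXT)"
)
# Rows copied from the example table in the question.
conn.executemany(
    "INSERT INTO medical_claims VALUES (?, ?, ?, ?)",
    [
        (1, 1, "C100", "M"), (2, 1, "C100a", "M"), (3, 2, "D50", "F"),
        (5, 3, "G200", "M"), (6, 3, "C100", "M"), (7, 4, "C100a", "M"),
        (8, 4, "D50", "F"), (9, 4, "A25", "F"),
    ],
)

# Only the ids: group, then filter the groups with HAVING.
ids = conn.execute("""
    SELECT userid, COUNT(*) AS claims
    FROM medical_claims
    GROUP BY userid
    HAVING COUNT(*) > 1
    ORDER BY userid
""").fetchall()

# The original rows: attach the per-user count to every row, then filter.
claim_ids = [
    row[0]
    for row in conn.execute("""
        SELECT mc.claim_id
        FROM (
            SELECT mc.*, COUNT(*) OVER (PARTITION BY userid) AS num_claims
            FROM medical_claims mc
        ) AS mc
        WHERE num_claims > 1
        ORDER BY mc.claim_id
    """)
]

print(ids)        # userids 1, 3 and 4 with their claim counts
print(claim_ids)  # every claim except the single claim of userid 2
```

Neither query needs the temp table. If claim_id could repeat within a user, COUNT(DISTINCT claim_id) in the first query would be the safer choice.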

How to get MAX Hike in Min month?

Below is the table:
Name | Hike% | Month
------------------------
A 7 1
A 6 2
A 8 3
b 4 1
b 7 2
b 7 3
Result should be:
Name | Hike% | Month
------------------------
A 8 3
b 7 2

Here is one way of doing this:
SELECT Name, [Hike%], Month
FROM
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY [Hike%] DESC, Month) rn
FROM yourTable
) t
WHERE rn = 1
ORDER BY Name;
If you instead want to return multiple records per name, in the case where two or more records might be tied for having the greatest hike%, then replace ROW_NUMBER with RANK and drop Month from the ORDER BY (with Month as a tie-breaker, RANK never produces ties).
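A quick way to see the difference between ROW_NUMBER and RANK is to run both on the sample data. The sketch below uses Python's sqlite3 module (window functions need SQLite 3.25 or later). The table name hikes and the column name hike_pct are placeholders of mine, standing in for yourTable and [Hike%]. Note that the RANK query orders by the hike alone, since a Month tie-breaker would leave no ties to return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "hikes" and "hike_pct" are placeholder names for this sketch.
conn.execute("CREATE TABLE hikes (name TEXT, hike_pct INTEGER, month INTEGER)")
conn.executemany(
    "INSERT INTO hikes VALUES (?, ?, ?)",
    [("A", 7, 1), ("A", 6, 2), ("A", 8, 3), ("b", 4, 1), ("b", 7, 2), ("b", 7, 3)],
)

# ROW_NUMBER: highest hike first, earliest month as tie-breaker, keep one row.
top_one = conn.execute("""
    SELECT name, hike_pct, month
    FROM (
        SELECT *, ROW_NUMBER() OVER (
            PARTITION BY name ORDER BY hike_pct DESC, month
        ) AS rn
        FROM hikes
    ) AS t
    WHERE rn = 1
    ORDER BY name
""").fetchall()

# RANK ordered by the hike only: tied rows share rank 1 and are all kept.
top_with_ties = conn.execute("""
    SELECT name, hike_pct, month
    FROM (
        SELECT *, RANK() OVER (
            PARTITION BY name ORDER BY hike_pct DESC
        ) AS rk
        FROM hikes
    ) AS t
    WHERE rk = 1
    ORDER BY name, month
""").fetchall()

print(top_one)        # [('A', 8, 3), ('b', 7, 2)]
print(top_with_ties)  # [('A', 8, 3), ('b', 7, 2), ('b', 7, 3)]
```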

Use a correlated subquery:
select Name,min(Hike) as Hike,min(Month) as Month
from
(
select * from tablename a
where Hike in (select max(Hike) from tablename b where a.name=b.name)
)A group by Name

You can use something similar to the below:
SELECT Name, MAX(Hike), Month
FROM table
GROUP BY Name, Month
Hope this helps :)

Can the result set of an inner query be displayed with the final result set?

I have a table that contains some data say
====================
Record | Record_Count
1 | 12
3 | 87
5 | 43
6 | 54
1 | 43
3 | 32
5 | 65
6 | 43
I have a query that returns Record Count sum grouped by Record
select record,sum(record_count)
FROM table_name
WHERE <conditions>
GROUP BY record
ORDER BY sum(record_count)
The result is something like this
====================
Record | Record_Count
1 | 55
3 | 119
5 | 108
6 | 97
Now I also want a grand total of record_count (Sum of all record Count).
The thing is I want the above result set along with the grand total also.
I had tried this
select sum(subquery.record_count)
from (
select record,sum(record_count)
FROM table_name
WHERE <conditions>
GROUP BY record
ORDER BY sum(record_count) ) as subquery
But by using this I am losing the individual record_count sum.
So my question is can I achieve result set that contains record_count sum for each record and grand total of record_count in a single query?

You may use union to achieve what you need:
(select cast(record as varchar(16)) record,sum(record_count) from schema.table
group by 1)
union
(select 'Grand_Total' as record,sum(record_count) from schema.table
group by 1);
If your DB supports group by ... with rollup, you may also use:
select ifnull(record,'Grand Total')record,sum(record_count) total_count
from schema.table
group by record asc with rollup

To save some typing, a common table expression (cte) can be used:
with cte as
(
select record, sum(record_count) rsum
FROM table_name
WHERE <conditions>
GROUP BY record
)
select record, rsum from cte
union all
select 'Grand_Total', sum(rsum) from cte
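As a sanity check, the CTE approach can be run on the sample data with Python's sqlite3 module. This is only a sketch: the WHERE <conditions> placeholder is left out because the actual conditions are not given, and record is cast to text so that both branches of the UNION ALL have the same column type, which stricter databases require.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (record INTEGER, record_count INTEGER)")
# Rows copied from the table in the question.
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [(1, 12), (3, 87), (5, 43), (6, 54), (1, 43), (3, 32), (5, 65), (6, 43)],
)

query = """
WITH cte AS (
    SELECT record, SUM(record_count) AS rsum
    FROM table_name
    GROUP BY record
)
SELECT CAST(record AS TEXT) AS record, rsum FROM cte
UNION ALL
SELECT 'Grand_Total', SUM(rsum) FROM cte
"""
result = conn.execute(query).fetchall()

# UNION ALL does not promise any row order, so look rows up by label.
totals = dict(result)
print(totals)
```

The per-record sums match the result shown in the question (55, 119, 108, 97), and the Grand_Total row carries their sum, 379.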

You should use the window functions of PostgreSQL.
For this query:
SELECT record, record_count, sum(record_count) OVER()
FROM (
SELECT record, sum(record_count) record_count
FROM table_name
WHERE <conditions>
GROUP BY record
ORDER BY sum(record_count)
) as subquery
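The window-function variant puts the grand total in an extra column on every row instead of adding a separate row. Here is a minimal sketch of that idea on the sample data, run through Python's sqlite3 module instead of PostgreSQL (SQLite 3.25 or later is assumed for OVER ()). The WHERE <conditions> placeholder is omitted, the alias grand_total is my own, and the ORDER BY is moved to the outer query so the final order is well defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (record INTEGER, record_count INTEGER)")
# Rows copied from the table in the question.
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [(1, 12), (3, 87), (5, 43), (6, 54), (1, 43), (3, 32), (5, 65), (6, 43)],
)

# SUM(...) OVER () has an empty window definition, so it sums over all rows
# of the subquery and repeats that total next to each per-record sum.
rows = conn.execute("""
    SELECT record, record_count, SUM(record_count) OVER () AS grand_total
    FROM (
        SELECT record, SUM(record_count) AS record_count
        FROM table_name
        GROUP BY record
    ) AS subquery
    ORDER BY record_count
""").fetchall()

print(rows)  # [(1, 55, 379), (6, 97, 379), (5, 108, 379), (3, 119, 379)]
```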

How to find records that are associated to the same group more than once?

I need to find if a record id is on the same group id more than once.
group id | record id | comments
--------------------------------
3 1
3 1
4 2
In this case, the record with id 1 is being associated with group id 3 twice.
Is there a query that can provide all records with this behaviour?
Thanks

This query should do the job:
select t.group_id, t.record_id
from [your_table] t
group by t.group_id, t.record_id
having count(*) > 1
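For a quick test of that query, here is a sketch on the three sample rows using Python's sqlite3 module. The table name group_records is a placeholder for [your_table], and the extra occurrences column is my addition to show how many times each pair appears.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "group_records" is a placeholder name for the real table.
conn.execute(
    "CREATE TABLE group_records (group_id INTEGER, record_id INTEGER, comments TEXT)"
)
# Rows copied from the question; comments are empty there.
conn.executemany(
    "INSERT INTO group_records VALUES (?, ?, ?)",
    [(3, 1, None), (3, 1, None), (4, 2, None)],
)

# Group by the pair and keep only pairs that occur more than once.
duplicates = conn.execute("""
    SELECT t.group_id, t.record_id, COUNT(*) AS occurrences
    FROM group_records t
    GROUP BY t.group_id, t.record_id
    HAVING COUNT(*) > 1
""").fetchall()

print(duplicates)  # [(3, 1, 2)]
```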
