How to select max of count in PostgreSQL - sql

I have a table in PostgreSQL with the following schema:
Category | Type
------------+---------
A | 0
C | 11
B | 5
D | 1
D | 0
F | 2
E | 11
E | 9
. | .
. | .
How can I select, for each category, the type that occurs most often? The following gives me all of them:
SELECT
category,
type,
COUNT(*)
FROM
table
GROUP BY
category,
type
ORDER BY
category,
count
DESC
My expected result is something like this:
Cat |Type |Count
--------+-------+------
A |0 |5
B |5 |30
C |2 |20
D |3 |10
That is the type with max occurrence in each category with count of that type.

You can use the following query:
SELECT category, type, cnt
FROM (
SELECT category, type, cnt,
RANK() OVER (PARTITION BY category
ORDER BY cnt DESC) AS rn
FROM (
SELECT category, type, COUNT(type) AS cnt
FROM mytable
GROUP BY category, type ) t
) s
WHERE s.rn = 1
The above query takes your own query from the question and applies the RANK() window function to it. RANK() lets us pick out, for each category, the records from the initial query with the greatest COUNT(type) value.
Note: If more than one type has the maximum number of occurrences for a specific category, all of them will be returned by the above query, as a consequence of using RANK().
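To make the tie behaviour concrete, here is a minimal sketch that runs the same nested query through Python's built-in sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later); the sample rows and the helper name top_types are made up for illustration, and the extra "type" tiebreaker in the ROW_NUMBER() variant is an assumption about how you might want ties resolved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (category TEXT, type INTEGER)")
# Hypothetical sample data: category B has two types tied on count.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("A", 0), ("A", 0), ("A", 1),
     ("B", 5), ("B", 5), ("B", 7), ("B", 7)],
)

def top_types(window_fn, order_by):
    """Count rows per (category, type), then keep the top-ranked row(s)."""
    sql = f"""
    SELECT category, type, cnt
    FROM (
        SELECT category, type, cnt,
               {window_fn}() OVER (PARTITION BY category
                                   ORDER BY {order_by}) AS rn
        FROM (
            SELECT category, type, COUNT(*) AS cnt
            FROM mytable
            GROUP BY category, type
        ) t
    ) s
    WHERE rn = 1
    ORDER BY category, type
    """
    return conn.execute(sql).fetchall()

# RANK() keeps every tied type; ROW_NUMBER() keeps exactly one per category.
with_ties = top_types("RANK", "cnt DESC")
one_per_category = top_types("ROW_NUMBER", "cnt DESC, type")
print(with_ties)         # [('A', 0, 2), ('B', 5, 2), ('B', 7, 2)]
print(one_per_category)  # [('A', 0, 2), ('B', 5, 2)]
```

If you need exactly one row per category, ROW_NUMBER() with an explicit tiebreaker is the safer choice; RANK() is preferable when tied types should all be reported.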

If I understand correctly, you can use window functions:
SELECT category, type, cnt
FROM (SELECT category, type, COUNT(*) as cnt,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY COUNT(*) DESC) as seqnum
FROM mytable
GROUP BY category, type
) ct
WHERE seqnum = 1;

SELECT
category,
type,
COUNT(*)
FROM
mytable t1
GROUP BY
category,
type
HAVING
COUNT(*) = (SELECT MAX(c) FROM (SELECT COUNT(*) AS c FROM mytable t2 WHERE t2.category = t1.category GROUP BY t2.type) AS q)
EDITED:
I apologize to readers,
COUNT(*) = (SELECT MAX(COUNT(*)) FROM mytable t2 WHERE t2.category = t1.category GROUP BY t2.type)
is the Oracle version; the PostgreSQL version is:
COUNT(*) = (SELECT MAX(c) FROM (SELECT COUNT(*) AS c FROM mytable t2 WHERE t2.category = t1.category GROUP BY t2.type) AS q)

SELECT category, MAX(occurrence) AS max_occurrence
FROM (SELECT t.category AS category, t.type, COUNT(*) AS occurrence
FROM mytable t GROUP BY t.category, t.type) s
GROUP BY category;

SELECT
category,
type,
COUNT(*) AS count
FROM
table
GROUP BY
category,
type
ORDER BY
category ASC

Related

Distinct particular field in select query

I have a table with the sample values below.
|Id|Keyword|insertedon|
|:-|:------|:---------|
|1 | abcd | 13/12/20 |
|2 | cdef | 14/12/20 |
|3 | abcd | 14/12/20 |
|4 | defg | 14/12/20 |
From the above table I need the distinct keywords, ordered by insertedon descending.
I need the 5 most recent results.
Expected Result:
defg
abcd
cdef
Please let me know how to achieve this.
You get the top 5 results with TOP(5) in SQL Server. You'd order the keywords by their last insertedon date:
select top(5) keyword
from mytable
group by keyword
order by max(insertedon) desc;
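As a quick check of the grouping approach, the sketch below runs an equivalent query in SQLite through Python's sqlite3 module. Several things here are assumptions rather than part of the original answer: SQLite uses LIMIT instead of TOP, the dates are stored as ISO strings so that they sort correctly as text, and MAX(id) is added as a tiebreaker because three keywords share the same latest date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, keyword TEXT, insertedon TEXT)")
# Sample rows from the question, with dates rewritten as ISO yyyy-mm-dd
# (an assumption) so plain string ordering matches date ordering.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "abcd", "2020-12-13"),
     (2, "cdef", "2020-12-14"),
     (3, "abcd", "2020-12-14"),
     (4, "defg", "2020-12-14")],
)

# One row per keyword, newest first; LIMIT 5 plays the role of TOP(5).
rows = conn.execute("""
    SELECT keyword
    FROM mytable
    GROUP BY keyword
    ORDER BY MAX(insertedon) DESC, MAX(id) DESC
    LIMIT 5
""").fetchall()

keywords = [keyword for (keyword,) in rows]
print(keywords)  # ['defg', 'abcd', 'cdef']
```

Without a tiebreaker the order among keywords sharing the same latest date is not guaranteed, so the expected order in the question is only reproducible with something like the id column.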
If you are looking for latest entries based on insertedon column, you can find using the group by clause, something like this:
select keyword, max(insertedon)
from table
group by keyword
order by 2 desc
You can just use select distinct:
select distinct keyword
from t;
If you wanted a full row, you could use row_number():
select t.*
from (select t.*,
row_number() over (partition by keyword order by newid()) as seqnum
from t
) t
where seqnum = 1;
EDIT:
For the edited version, you can use:
select distinct keyword
from (select top (5) keyword
from t
order by insertedon desc
) k
Assign a row number based on the descending order of the date column, and then select the rows with row number 1.
Query
;with cte as(
select [keyword], [rn] = row_number() over(
partition by [keyword]
order by [insertedon] desc, [id] desc
)
from mytable
)
select [keyword] from cte
where [rn] = 1;
You can use the analytical functions as follows:
select t.* from
(select t.*,
row_number() over (partition by keyword order by insertedon desc) as rn,
Dense_rank() over (order by insertedon desc) as dr
from t ) t where rn = 1 and dr <= 5;

Select SUM and column with max

I'm looking for the best or simplest way to SELECT type, user_with_max_value, SUM(value) GROUP BY type. The table looks similar to this:
type | user | value
type1 | 1 | 100
type1 | 2 | 200
type2 | 1 | 50
type2 | 2 | 10
And the result should look like this:
type1 | 2 | 300
type2 | 1 | 60
Use window functions:
select type, max(case when seqnum = 1 then user end), sum(value)
from (select t.*,
row_number() over (partition by type order by value desc) as seqnum
from t
) t
group by type;
Some databases have functionality for an aggregation function that returns the first value. One method without a subquery using standard SQL is:
select distinct type,
first_value(user) over (partition by type order by value desc) as user,
sum(value) over (partition by type)
from t;
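The conditional-aggregation idea can be checked against the sample data with a small sqlite3 sketch. It assumes an SQLite build with window functions (3.25+), and the column is named usr rather than user purely to avoid clashing with reserved words in other databases; both choices are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (type TEXT, usr INTEGER, value INTEGER)")
# Sample rows from the question.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("type1", 1, 100), ("type1", 2, 200),
     ("type2", 1, 50), ("type2", 2, 10)],
)

# Rank rows inside each type by value, then aggregate per type:
# the CASE picks the user from the top-ranked row, SUM covers all rows.
result = conn.execute("""
    SELECT type,
           MAX(CASE WHEN seqnum = 1 THEN usr END) AS top_user,
           SUM(value) AS total
    FROM (
        SELECT type, usr, value,
               ROW_NUMBER() OVER (PARTITION BY type
                                  ORDER BY value DESC) AS seqnum
        FROM t
    ) ranked
    GROUP BY type
    ORDER BY type
""").fetchall()

print(result)  # [('type1', 2, 300), ('type2', 1, 60)]
```

Note that the outer query must not filter on seqnum = 1 before aggregating, otherwise SUM(value) would only see the largest row of each type.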
You can use window function :
select t.*
from (select t.type, t.user,
row_number() over (partition by type order by value desc) as seq,
sum(value) over (partition by type) as value
from mytable t
) t
where seq = 1;
Try the query below:
SELECT type, max(user), SUM(value) from table1 GROUP BY type
Use analytic functions:
create table poo2
(
thetype varchar(5),
theuser int,
thevalue int
)
insert into poo2
select 'type1',1,100 union all
select 'type1',2,200 union all
select 'type2',1,50 union all
select 'type2',2,10
select thetype,theuser,mysum
from
(
select thetype ,theuser
,row_number() over (partition by thetype order by thevalue desc) r
,sum(thevalue) over (partition by thetype) mysum from poo2
) ilv
where r=1

several conditions in the same sql query (having and min)

I have a table with these variables and I want to select the row where sum of the VALUE is greater than 1000 and at the same time the MONTH is the minimum
|CLIENT |MONTH |VALUE
|1 |1 |500
|1 |2 |1050
|1 |3 |1100
the result should be this:
|CLIENT|MONTH|VALUE
|1 |2 |1050
Is it possible to do it in only one query?
My attempt:
SELECT
client,
SUM(value) AS SUM_of_value,
MIN(month) AS MIN_of_month,
FROM mytable
GROUP BY 1
having SUM_of_value>1000;
You could try a join between a query that finds each client's first qualifying month and a query grouped by client and month:
select t2.client, t2.month, t2.SUM_of_value
from (
SELECT
client,
MIN(month) AS MIN_of_month
FROM (
SELECT client, month
FROM mytable
GROUP BY client, month
HAVING SUM(value) > 1000
) m
GROUP BY client
) t1
inner join (
SELECT
client,
month,
SUM(value) AS SUM_of_value
FROM mytable
GROUP BY client, month
) t2 ON t2.client = t1.client
AND t2.month = t1.MIN_of_month
If you want one row, then you can use order by:
select t.*
from mytable t
where value > 1000
order by month
fetch first 1 row only;
I'm not sure why your query refers to "sum of value", because there is only one value per month for the data in the question.
EDIT:
Based on the comment, you do seem to want an aggregation query:
select client, month, sum(value)
from mytable
group by client, month
having sum(value) > 1000
order by month
fetch first 1 row only;
Borrowing from Gordon and Forpas. Your sample data and output seems too simplistic. I think you would want one row per client.
select t.client, t.month, t.value
from (
select client, month, sum(value) as value, row_number() over (partition by client order by month) rn
from my_table
group by client, month
having sum(value) > 1000
) t
where t.rn = 1
If there is only 1 VALUE per each MONTH then use ROW_NUMBER() window function:
select t.CLIENT, t.MONTH, t.VALUE
from (
select *, row_number() over (partition by client order by month) rn
from mytable
where value > 1000
) t
where t.rn = 1
Results:
| CLIENT | MONTH | VALUE |
| ------ | ----- | ----- |
| 1 | 2 | 1050 |
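For the case where a month can have several rows, the "aggregate first, then number the months" variant can be sketched with sqlite3 as follows. The rows for client 2 are hypothetical and were added only to show the per-client behaviour; SQLite 3.25+ is assumed for ROW_NUMBER().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (client INTEGER, month INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, 1, 500), (1, 2, 1050), (1, 3, 1100),   # rows from the question
     (2, 1, 300), (2, 2, 400), (2, 2, 700),     # hypothetical second client
     (2, 3, 2000)],
)

# Step 1: total per (client, month), keeping only totals above 1000.
# Step 2: number the surviving months per client and keep the earliest.
first_month = conn.execute("""
    SELECT client, month, total
    FROM (
        SELECT client, month, total,
               ROW_NUMBER() OVER (PARTITION BY client
                                  ORDER BY month) AS rn
        FROM (
            SELECT client, month, SUM(value) AS total
            FROM mytable
            GROUP BY client, month
            HAVING SUM(value) > 1000
        ) monthly
    ) numbered
    WHERE rn = 1
    ORDER BY client
""").fetchall()

print(first_month)  # [(1, 2, 1050), (2, 2, 1100)]
```

Client 2 only qualifies in month 2 because its two rows for that month add up to 1100, which is why the filter has to be applied to the monthly sum rather than to individual rows.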

Min Date from one column multiple rows

My apologies, I should have added every column and complete problem not just portion.
I have a table A which stores all invoices issued (type 1) and payments received (type 4) from clients. Sometimes a client pays in 2-3 installments. I want to find the date difference between the invoice being issued and the last payment collected for that invoice. My data looks like this:
a.cltid|A.Invnum|A.Cash|A.Date |a.type|a.status
70 |112 |-200 |2012-03-01|4 |P
70 |112 |-500 |2012-03-12|4 |P
90 |124 |-550 |2012-01-20|4 |P
70 |112 |700 |2012-02-20|1 |p
55 |101 |50 |2012-01-15|1 |d
90 |124 |550 |2012-01-15|1 |P
I am running
Select *, Datediff(dd,T.date,P.date)
from (select a.cltid, a.invnumber,a.cash, min(a.date)date
from table.A as A
where a.status<>'d' and a.type=1
group by a.cltid, a.invnumber,a.cash)T
join
Select *
from (select a.cltid, a.invnumber,a.cash, min(a.date)date
from table.A as A
where a.status<>'d' and a.type=4
group by a.cltid, a.invnumber,a.cash)P
on
T.invnumb=P.invnumber and T.cltid=P.cltid
How can I make it work? So it shows me
70|112|-500|2012-03-12|4|P 70|112|700|2012-02-20|1|p|22
90|124|-550|2012-01-20|4|P 90|124|550|2012-01-15|1|P|5
Edited***
You can use row_number to assign sequence number within each cltid in the order of decreasing date and then filter to get the first row for each cltid which will be the row with latest date for that cltid:
select *
from (
select A.*,
row_number() over (
partition by a.cltid order by a.date desc
) rn
from table.A as A
) t
where rn = 1;
It will return one row (with latest date) for each client. If you want to return all the rows which have latest date, use rank() instead.
Use a ranking function to get all the columns:
select a.*
from (select a.*,
row_number() over (partition by cltid order by date desc) as seqnum
from a
) a
where seqnum = 1;
Use aggregation if you only want the date. The issue with your query is that the group by clause has too many columns:
select a.cltid, max(a.date) as date
from table.A as A
group by a.cltid;
And the fact that min() returns the first date not the last date.
There are many ways to do this. Here are some of them:
test setup: http://rextester.com/VGUY60367
common table expression (with cte as ()) version using row_number():
with cte as (
select *
, rn = row_number() over (
partition by cltid, Invnum
order by [date] desc
)
from a
)
select cltid, Invnum, Cash, [date]
from cte
where rn = 1
cross apply version:
select distinct
a.cltid
, a.Invnum
, x.Cash
, x.[date]
from a
cross apply (
select top 1
cltid, Invnum
, [date]
, Cash
from a as i
where i.cltid =a.cltid
and i.Invnum=a.Invnum
order by i.[date] desc
) as x;
top with ties version:
select top 1 with ties
*
from a
order by
row_number() over (
partition by cltid, Invnum
order by [date] desc
)
all return:
+-------+--------+---------------------+------+
| cltid | Invnum | date | Cash |
+-------+--------+---------------------+------+
| 70 | 112 | 12.03.2012 00:00:00 | -500 |
| 90 | 124 | 20.01.2012 00:00:00 | -550 |
+-------+--------+---------------------+------+
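The answers above return the latest payment row; to also get the day difference the question asks for, one option is to aggregate invoices and payments separately and join them. The sketch below does this in SQLite via Python's sqlite3 module. It is only an illustration under assumptions: the date column is named dt, julianday() stands in for SQL Server's DATEDIFF, and the CTE names invoices and last_payments are invented here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE a (cltid INTEGER, invnum INTEGER, cash INTEGER,
                                dt TEXT, type INTEGER, status TEXT)""")
# Sample rows from the question (type 1 = invoice, type 4 = payment).
conn.executemany(
    "INSERT INTO a VALUES (?, ?, ?, ?, ?, ?)",
    [(70, 112, -200, "2012-03-01", 4, "P"),
     (70, 112, -500, "2012-03-12", 4, "P"),
     (90, 124, -550, "2012-01-20", 4, "P"),
     (70, 112, 700, "2012-02-20", 1, "p"),
     (55, 101, 50, "2012-01-15", 1, "d"),
     (90, 124, 550, "2012-01-15", 1, "P")],
)

# Earliest invoice date and latest payment date per (client, invoice),
# joined so the day difference can be computed on one row.
days_to_pay = conn.execute("""
    WITH invoices AS (
        SELECT cltid, invnum, MIN(dt) AS issued
        FROM a
        WHERE type = 1 AND status <> 'd'
        GROUP BY cltid, invnum
    ),
    last_payments AS (
        SELECT cltid, invnum, MAX(dt) AS paid
        FROM a
        WHERE type = 4 AND status <> 'd'
        GROUP BY cltid, invnum
    )
    SELECT i.cltid, i.invnum, i.issued, p.paid,
           CAST(julianday(p.paid) - julianday(i.issued) AS INTEGER) AS days
    FROM invoices i
    JOIN last_payments p
      ON p.cltid = i.cltid AND p.invnum = i.invnum
    ORDER BY i.cltid
""").fetchall()

for row in days_to_pay:
    print(row)
```

Grouping only by client and invoice number, and using MAX() for the payment side, addresses the two problems pointed out above: too many columns in GROUP BY and min() picking the first payment instead of the last.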
You can get the latest date per invoice with this:
Select
a.cltid, a.invnumber, max(a.date) [date]
from
YourTable a
group by
a.cltid, a.invnumber

how to query the percentage of aggregate in vertica

Table product
productId type
1 A
2 A
3 A
4 B
5 B
6 C
What I want:
type perc
A 0.5
B 0.33
C 0.17
We can write a simple query like this:
Select type, cnt/(select count(*) from product) AS perc
FROM (
select type, count(*) as cnt
from product
group by type
) nested
But Vertica doesn't support subselects that are not correlated.
Need someone's help!
Vertica does support both correlated and non-correlated subqueries, although there may be restrictions on the joining predicate.
So your query above just works. And, guess what, it continues to work even if you use indentation:
SQL> SELECT
type
, cnt/( select count (*) FROM product ) AS perc
FROM
( SELECT type, count (*) as cnt
FROM product
GROUP BY type
) nested ;
type | perc
------+----------------------
C | 0.166666666666666667
A | 0.500000000000000000
B | 0.333333333333333333
(3 rows)
Of course you can re-write it in a different way. For example:
SQL> SELECT
a.type
, a.cnt/b.tot as perc
FROM
( SELECT type , count (*) as cnt
FROM product
GROUP BY type ) a
CROSS JOIN
( SELECT count (*) AS tot
FROM product ) b
ORDER BY 1
;
type | perc
------+----------------------
A | 0.500000000000000000
B | 0.333333333333333333
C | 0.166666666666666667
(3 rows)
You could also use analytic functions, which are messy in this application, but work:
WITH product AS (
select 1 as productId, 'A' as type
union all select 2, 'A'
union all select 3, 'A'
union all select 4, 'B'
union all select 5, 'B'
union all select 6, 'C'
)
SELECT distinct /* distinct because analytic functions don't reduce row count like aggregate functions */
type, count(*) over (partition by type) / count(*) over ()
FROM product;
type | perc
------+----------------------
A | 0.500000000000000000
B | 0.333333333333333333
C | 0.166666666666666667
count(*) over (partition by type) counts each type;
count(*) over () counts over everything, so gets the total count
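The same two window counts can be tried outside Vertica. The sketch below uses SQLite through Python's sqlite3 module, which is an assumption made only so the example is runnable anywhere; unlike the Vertica output above, SQLite divides integers as integers, so the count is multiplied by 1.0 first, and ROUND() is added to match the two-decimal result in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (productId INTEGER, type TEXT)")
# Sample rows from the question.
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [(1, "A"), (2, "A"), (3, "A"), (4, "B"), (5, "B"), (6, "C")],
)

# Per-type count divided by the overall count; DISTINCT collapses the
# repeated rows that window functions leave behind.
shares = conn.execute("""
    SELECT DISTINCT
           type,
           ROUND(COUNT(*) OVER (PARTITION BY type) * 1.0
                 / COUNT(*) OVER (), 2) AS perc
    FROM product
    ORDER BY type
""").fetchall()

for type_name, perc in shares:
    print(type_name, perc)
```

With the six sample rows this yields one row each for A, B and C with shares of roughly 0.5, 0.33 and 0.17.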