Limit number of occurrences in output group-by sql query

I have this query:
select repid as rep, companyname, count(companyname) as [count], Commission from customers
group by repid, companyname, Commission
It returns, let's say:
rep companyname count commission
1 ABC 1 10%
2 XYZ 2 10%
2 XYZ 1 20%
3 JKL 4 10%
3 JKL 1 30%
Desired output is:
rep companyname count commission
2 XYZ 2 10%
2 XYZ 1 20%
3 JKL 4 10%
3 JKL 1 30%
I would like the output to show only those companies that appear twice or more in the result. How do I modify the above query? I simplified the query by removing the where clause.

I would use a subquery to get the non-unique company names, like this:
select repid as rep, companyname, count(companyname) as [count], Commission from customers
where companyname in (
select c1.companyname from customers c1
group by c1.companyname having count(*) >= 2
)
group by repid,companyname,Commission

I think this will match your requirements. I couldn't think of a way of doing it without some sort of subquery or CTE:
select
rep, companyname, [count], commission
from (
select
repid as rep, companyname, count(companyname) as [count], Commission,
count(1) over (PARTITION by companyname) as [companycount]
from customers
group by repid,companyname,Commission
) sub
where companycount > 1
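The windowed subquery above can be tried out with a minimal sketch like the following. It uses Python's built-in sqlite3 module rather than SQL Server, so the bracketed alias is replaced with `cnt`, and the sample rows are made up to reproduce the grouped output from the question. It assumes an SQLite build with window function support (3.25 or later).

```python
import sqlite3

# Hypothetical sample rows, chosen so that grouping reproduces the question's table.
rows = (
    [(1, "ABC", "10%")]
    + [(2, "XYZ", "10%")] * 2
    + [(2, "XYZ", "20%")]
    + [(3, "JKL", "10%")] * 4
    + [(3, "JKL", "30%")]
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (repid INTEGER, companyname TEXT, commission TEXT)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)

query = """
SELECT rep, companyname, cnt, commission
FROM (
    SELECT repid AS rep, companyname, commission,
           COUNT(*) AS cnt,
           -- The window is evaluated after GROUP BY, so this counts
           -- how many groups (result rows) each company has.
           COUNT(*) OVER (PARTITION BY companyname) AS companycount
    FROM customers
    GROUP BY repid, companyname, commission
) AS sub
WHERE companycount > 1
ORDER BY rep, commission
"""
result = conn.execute(query).fetchall()
print(result)
```

ABC has only one group, so it is dropped, while both XYZ rows and both JKL rows are kept.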

select repid as rep
, companyname
, count(*) as [count] --- equivalent to count(companyname)
, Commission
from customers c
where exists
( select *
from customers c2
where c2.companyname = c.companyname
and ( c2.repid <> c.repid
or c2.Commission <> c.Commission
)
and ( extra-conditions )
)
and ( extra-conditions )
group by repid, companyname, Commission

Add a HAVING clause after your group by, e.g. HAVING count(companyName) > 1

You're looking for the HAVING keyword, which is essentially a WHERE condition for your GROUP BY:
select repid as rep, companyname, count(companyname) as [count], Commission from customers
group by repid,companyname,Commission
having count(companyname) > 1
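To see what the HAVING filter does with the question's sample data, here is a small sketch using Python's sqlite3 module. The rows are hypothetical and the bracketed alias is replaced with `cnt` for SQLite. HAVING tests each group's own count, so it behaves differently from the subquery and window approaches.

```python
import sqlite3

# Hypothetical sample rows, chosen so that grouping reproduces the question's table.
rows = (
    [(1, "ABC", "10%")]
    + [(2, "XYZ", "10%")] * 2
    + [(2, "XYZ", "20%")]
    + [(3, "JKL", "10%")] * 4
    + [(3, "JKL", "30%")]
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (repid INTEGER, companyname TEXT, commission TEXT)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)

query = """
SELECT repid AS rep, companyname, COUNT(companyname) AS cnt, commission
FROM customers
GROUP BY repid, companyname, commission
-- Keeps only the groups whose own row count exceeds 1.
HAVING COUNT(companyname) > 1
ORDER BY rep
"""
having_result = conn.execute(query).fetchall()
print(having_result)
```

On this data only the (XYZ, 10%) and (JKL, 10%) groups survive. The groups with a count of 1 are filtered out, including (XYZ, 20%) and (JKL, 30%).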

Related

Count distinct of multiple columns

I've been trying to figure a query out.
Let's say a table looks like this:
cus_id prod_category agreement_id type_id
111 10 123456 1
111 10 123456 1
111 10 123456 2
111 20 123456 2
123 20 987654 6
999 0 135790 99
999 0 246810 99
and so on...
I would like to get, for each cus_id and prod_category, the count of distinct (agreement_id, type_id) combinations,
so I would like to get a result like this:
cus_id prod_category count
111 10 2
111 20 1
123 20 1
999 0 2

We can use the following two-level aggregation query:
SELECT cus_id, prod_category, COUNT(*) AS count
FROM
(
SELECT DISTINCT cus_id, prod_category, agreement_id, type_id
FROM yourTable
) t
GROUP BY cus_id, prod_category;
The inner DISTINCT query de-duplicates the tuples, and the outer aggregation query counts the number of distinct tuples per customer and product category.
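As a quick check, the two-level aggregation can be run against the question's sample rows with Python's sqlite3 module. This is only a sketch: the table name `yourTable` comes from the answer, and the integer column types are an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable "
    "(cus_id INTEGER, prod_category INTEGER, agreement_id INTEGER, type_id INTEGER)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [
        (111, 10, 123456, 1),
        (111, 10, 123456, 1),  # exact duplicate, removed by DISTINCT
        (111, 10, 123456, 2),
        (111, 20, 123456, 2),
        (123, 20, 987654, 6),
        (999, 0, 135790, 99),
        (999, 0, 246810, 99),
    ],
)

query = """
SELECT cus_id, prod_category, COUNT(*) AS cnt
FROM (
    -- Level 1: one row per distinct combination of all four columns.
    SELECT DISTINCT cus_id, prod_category, agreement_id, type_id
    FROM yourTable
) AS t
-- Level 2: count the surviving rows per customer and category.
GROUP BY cus_id, prod_category
ORDER BY cus_id, prod_category
"""
distinct_counts = conn.execute(query).fetchall()
print(distinct_counts)
```

The counts match the result table the question asks for.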

You want to count distinct (agreement_id, type_id) tuples per (cus_id, prod_category) tuple.
"Per (cus_id, prod_category) tuple" translates to GROUP BY cus_id, prod_category in SQL.
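And we count distinct (agreement_id, type_id) tuples with COUNT(DISTINCT agreement_id, type_id). This multi-column form is MySQL syntax; databases that do not accept it can count the rows of a DISTINCT subquery instead.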
SELECT cus_id, prod_category, COUNT(DISTINCT agreement_id, type_id) AS distinct_count
FROM mytable
GROUP BY cus_id, prod_category
ORDER BY cus_id, prod_category;

Number of users buying the exact same product from the same shop more than 2 times in 1 year

I have data like this:
date user prod shop cat1 cat2
2022-02-01 1 a a ah g
2022-02-02 1 a1 b ah g
2022-04-03 1 a a ah g
2022-04-19 1 a a ah g
2022-05-01 2 b c bg g
I want to know how many users bought the same product in the same shop more than 2 times within a period of 1 year. The result I want looks like:
table 1
cat1 number_of_user
ah 1
table 2
cat2 number_of_user
g 1
For the total number of users, my query looks like:
WITH data_product AS(
SELECT DATE(payment_time) date,
user,
CONCAT(prod, "_", shop) product_shop,
cat1,
cat2
FROM
a
WHERE
DATE(payment_time) BETWEEN "2022-01-01" AND DATE_SUB(current_date, INTERVAL 1 day)
ORDER BY 1,2,3),
purchased AS (
SELECT user, product_shop, count(product_shop) tot_purchased
FROM data_product
GROUP BY 1,2
HAVING COUNT(product_shop) > 2
)
SELECT COUNT(user) number_of_user FROM purchased
Please help me get the number of users who bought the same product in the same shop more than 2 times in the period, broken down by cat1 and by cat2.

Try this:
create temporary table table1 as(
select *,extract(YEAR from date) as year from `projectid.dataset.table`
);
create temporary table table2 as(
select * except(date,cat2) ,count(user) over(partition by cat1,year,user,prod,shop) tcount from table1
);
create temporary table table4 as(
select * except(date,cat1) ,count(user) over(partition by cat2,year,user,prod,shop) tcount from table1
);
select distinct year,cat1 ,count(distinct user) number_of_user from table2 where tcount>2 group by YEAR,cat1;
select distinct year,cat2 ,count(distinct user) number_of_user from table4 where tcount>2 group by YEAR,cat2;
If you want a single result set you can union both the select statements.

I think this query might work. The first part counts the users who purchased the same product from the same shop more than twice during one year, per cat1. The second part does the same for cat2, and the UNION operation concatenates the two sets:
with cte as
(select distinct
PDate, userID, prod, shop, cat1, cat2,
count(userID) over (partition by userID, prod, shop, year(PDate), cat1) as cat1_count,
count(PDate) over (partition by userID, prod, shop, year(PDate), cat2) as cat2_count
from tbl1)
select
cte.cat1 as c1, '0' as c2, count(distinct cte.userID) as Num
from cte
where cte.cat1_count > 2
group by cte.cat1
union
select
'0', cte.cat2, count(distinct cte.userID)
from cte
where cte.cat2_count > 2
group by cte.cat2
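Another possible route is to stay close to the question's own CTE with HAVING and simply carry the category column through. The sketch below does that with Python's sqlite3 module on the question's sample rows. The table name `purchases`, the column names `pdate` and `user_id`, and the helper `users_per_category` are all hypothetical, and the one-year period is assumed to be calendar year 2022.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases "
    "(pdate TEXT, user_id INTEGER, prod TEXT, shop TEXT, cat1 TEXT, cat2 TEXT)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("2022-02-01", 1, "a", "a", "ah", "g"),
        ("2022-02-02", 1, "a1", "b", "ah", "g"),
        ("2022-04-03", 1, "a", "a", "ah", "g"),
        ("2022-04-19", 1, "a", "a", "ah", "g"),
        ("2022-05-01", 2, "b", "c", "bg", "g"),
    ],
)


def users_per_category(category_column):
    """Count users with more than 2 purchases of one product in one shop."""
    # The column name is interpolated into the SQL, so only allow known names.
    if category_column not in ("cat1", "cat2"):
        raise ValueError("unexpected category column")
    query = f"""
    WITH purchased AS (
        SELECT user_id, prod, shop, {category_column} AS cat
        FROM purchases
        WHERE pdate BETWEEN '2022-01-01' AND '2022-12-31'
        GROUP BY user_id, prod, shop, {category_column}
        HAVING COUNT(*) > 2
    )
    SELECT cat, COUNT(DISTINCT user_id) AS number_of_user
    FROM purchased
    GROUP BY cat
    ORDER BY cat
    """
    return conn.execute(query).fetchall()


by_cat1 = users_per_category("cat1")
by_cat2 = users_per_category("cat2")
print(by_cat1, by_cat2)
```

Only user 1 bought the same product (a) in the same shop (a) three times, so each category table contains a single row with a count of 1.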

How to get the most sold Product in PostgreSQL?

Given a table products
pid name
123 Milk
456 Tea
789 Cake
... ...
and a table sales
stamp pid units
14:54 123 3
15:02 123 9
15:09 456 1
15:14 456 1
15:39 456 2
15:48 789 12
... ... ...
How would I be able to get the product(s) with the most sold units?
My goal is to run a SELECT statement that results in, for this example,
pid name
123 Milk
789 Cake
because the sum of sold units of both those products is 12, the maximum value (greater than 4 for Tea, despite there being more sales for Tea).
I have the following query:
SELECT DISTINCT products.pid, products.name
FROM sales
INNER JOIN products ON sales.pid = products.pid
INNER JOIN (
SELECT pid, SUM(units) as sum_units
FROM sales
GROUP BY pid
) AS total_units ON total_units.pid = sales.pid
WHERE total_units.sum_units IN (
SELECT MAX(sum_units) as max_units
FROM (
SELECT pid, SUM(units) as sum_units
FROM sales
GROUP BY pid
) AS total_units
);
However, this seems very long, confusing, and inefficient, even repeating the sub-query to obtain total_units, so I was wondering if there was a better way to accomplish this.
How can I simplify this? Note that I can't use ORDER BY SUM(units) LIMIT 1 in case there are multiple (i.e., >1) products with the most units sold.
Thank you in advance.

PostgreSQL has supported WITH TIES since version 13, so your query can be simply this:
select p.pId, p.name
from sales s
join products p on p.pid = s.pid
group by p.pId, p.name
order by Sum(units) desc
fetch first 1 rows with ties;

Solution for your problem:
WITH cte1 AS
(
SELECT s.pid, p.name,
SUM(units) as total_units
FROM sales s
INNER JOIN products p
ON s.pid = p.pid
GROUP BY s.pid, p.name
),
cte2 AS
(
SELECT *,
DENSE_RANK() OVER(ORDER BY total_units DESC) as rn
FROM cte1
)
SELECT pid,name
FROM cte2
WHERE rn = 1
ORDER BY pid;
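The DENSE_RANK approach can be checked against the question's sample data with a sketch like this one. It runs the same two CTEs through Python's sqlite3 module instead of PostgreSQL, which assumes an SQLite build with window function support (3.25 or later). The column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (pid INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE sales (stamp TEXT, pid INTEGER, units INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(123, "Milk"), (456, "Tea"), (789, "Cake")],
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("14:54", 123, 3),
        ("15:02", 123, 9),
        ("15:09", 456, 1),
        ("15:14", 456, 1),
        ("15:39", 456, 2),
        ("15:48", 789, 12),
    ],
)

query = """
WITH cte1 AS (
    -- Total units per product.
    SELECT s.pid, p.name, SUM(s.units) AS total_units
    FROM sales s
    INNER JOIN products p ON s.pid = p.pid
    GROUP BY s.pid, p.name
),
cte2 AS (
    -- Products tied on total_units share the same rank.
    SELECT pid, name, DENSE_RANK() OVER (ORDER BY total_units DESC) AS rn
    FROM cte1
)
SELECT pid, name
FROM cte2
WHERE rn = 1
ORDER BY pid
"""
top_products = conn.execute(query).fetchall()
print(top_products)
```

Milk (3 + 9) and Cake (12) tie at 12 units, so both are returned, while Tea (4 units) is not.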

SQL sum grouped by field with all rows

I have this table:
id sale_id price
-------------------
1 1 100
2 1 200
3 2 50
4 3 50
I want this result:
id sale_id price sum(price by sale_id)
------------------------------------------
1 1 100 300
2 1 200 300
3 2 50 50
4 3 50 50
I tried this:
SELECT id, sale_id, price,
(SELECT sum(price) FROM sale_lines GROUP BY sale_id)
FROM sale_lines
But I get an error because the subquery returns more than one row.
How can I do it?
I want all the rows of the sale_lines table, with all fields, plus the sum(price) grouped by sale_id.

You can use a window function:
sum(price) over (partition by sale_id) as sum
If you want a subquery, then you need to correlate it:
SELECT sl.id, sl.sale_id, sl.price,
(SELECT sum(sll.price)
FROM sale_lines sll
WHERE sl.sale_id = sll.sale_id
)
FROM sale_lines sl;
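For a runnable check of the window function variant, here is a minimal sketch using Python's sqlite3 module with the question's four rows. It assumes an SQLite build with window function support (3.25 or later), and the alias `sale_total` is just an illustrative name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale_lines (id INTEGER, sale_id INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO sale_lines VALUES (?, ?, ?)",
    [(1, 1, 100), (2, 1, 200), (3, 2, 50), (4, 3, 50)],
)

query = """
SELECT id, sale_id, price,
       -- Every row keeps its own values and gets the total of its sale_id.
       SUM(price) OVER (PARTITION BY sale_id) AS sale_total
FROM sale_lines
ORDER BY id
"""
lines_with_total = conn.execute(query).fetchall()
print(lines_with_total)
```

Unlike GROUP BY, the window function does not collapse rows, which is why all four lines appear with the per-sale total attached.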

Don't use GROUP BY in the subquery; make it a correlated subquery:
SELECT sl1.id, sl1.sale_id, sl1.price,
(SELECT sum(sl2.price) FROM sale_lines sl2 where sl2.sale_id = sl1.sale_id) as total
FROM sale_lines sl1

In addition to the other approaches, you can use CROSS APPLY, where it is supported (e.g. SQL Server), to get the sum.
SELECT id, sale_id, price, Price_Sum
FROM YourTable AS ot
CROSS APPLY
(SELECT SUM(price) AS Price_Sum
FROM YourTable
WHERE sale_id = ot.sale_id) AS s;

SELECT t1.*,
total_price
FROM `sale_lines` AS t1
JOIN(SELECT Sum(price) AS total_price,
sale_id
FROM sale_lines
GROUP BY sale_id) AS t2
ON t1.sale_id = t2.sale_id

SQL query to get data grouped by customer type, with a default value if a customer type is not found

I have a table "Customers" with columns CustomerID, MainCountry and CustomerTypeID.
I have 5 customer types: 1, 2, 3, 4, 5.
I want to count the number of customers in each country by customer type. I am using the following query:
select count(CustomerID) as CustomerCount,MainCountry,CustomerTypeID
from Customers
group by CustomerTypeID,MainCountry
But some countries do not have any customers under one or more of the types 1, 2, 3, 4 or 5.
So I want a default value of 0 when a customer type does not exist for that country.
Currently it returns data as follows:
CustomerCount MainCountry CustomerTypeID
5695 AU 1
525 AU 2
12268 AU 3
169 AU 5
18658 CA 1
1039 CA 2
24496 CA 3
2259 CA 5
2669 CO 1
10 CO 2
463 CO 3
22 CO 4
39 CO 5
As "AU" does not have type 4, I want a default value for it.

You should JOIN your table with a table of type IDs. In this case:
select count(CustomerID) as CustomerCount,TypeTable.MainCountry,TypeTable.TId
from
Customers
RIGHT JOIN (
select MainCountry,TId from
(
select Distinct MainCountry from Customers
) as T1,
(
select 1 as Tid
union all
select 2 as Tid
union all
select 3 as Tid
union all
select 4 as Tid
union all
select 5 as Tid
) as T2
) as TypeTable on Customers.CustomerTypeID=TypeTable.TId
and Customers.MainCountry=TypeTable.MainCountry
group by TypeTable.TId,TypeTable.MainCountry

Select Country.MainCountry, CustomerType.CustomerTypeId, Count(T.CustomerID) As CustomerCount
From (Select Distinct MainCountry From Customers) As Country
Cross Join (Select Distinct CustomerTypeId From Customers) As CustomerType
Left Join Customers T
On Country.MainCountry = T.MainCountry
And CustomerType.CustomerTypeId = T.CustomerTypeId
-- Edit here
And T.CreatedDate > Convert(DateTime, '1/1/2013')
-- End Edit
Group By Country.MainCountry, CustomerType.CustomerTypeId
Order By MainCountry, CustomerTypeId
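The cross join idea can be illustrated with a small sketch using Python's sqlite3 module. The rows below are hypothetical (not the question's real data), and the CreatedDate filter is left out because the question's table does not list that column. Every country is paired with every type, and COUNT over the left-joined CustomerID yields 0 where no customer matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers "
    "(CustomerID INTEGER, MainCountry TEXT, CustomerTypeID INTEGER)"
)
# Hypothetical rows: AU has no type 3 customers, CA has no type 1 customers.
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [(1, "AU", 1), (2, "AU", 1), (3, "AU", 2), (4, "CA", 2), (5, "CA", 3)],
)

query = """
SELECT country.MainCountry, ctype.CustomerTypeID,
       -- COUNT ignores the NULLs produced by unmatched left-join rows.
       COUNT(c.CustomerID) AS CustomerCount
FROM (SELECT DISTINCT MainCountry FROM Customers) AS country
CROSS JOIN (SELECT DISTINCT CustomerTypeID FROM Customers) AS ctype
LEFT JOIN Customers AS c
       ON c.MainCountry = country.MainCountry
      AND c.CustomerTypeID = ctype.CustomerTypeID
GROUP BY country.MainCountry, ctype.CustomerTypeID
ORDER BY country.MainCountry, ctype.CustomerTypeID
"""
counts = conn.execute(query).fetchall()
print(counts)
```

One caveat of deriving the types from Customers itself: a type that no country uses at all will not appear, in which case an explicit list of type IDs may be the safer source.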

Try this:
with country as (
Select Distinct MainCountry From Customers
),
CustomerType as (
Select Distinct CustomerTypeId From Customers
),
map as (
select MainCountry, CustomerTypeId from country, CustomerType
)
select count(b.CustomerID) as CustomerCount, a.MainCountry, a.CustomerTypeID
from
map a left join Customers b on a.MainCountry=b.MainCountry and a.CustomerTypeID=b.CustomerTypeID
group by a.MainCountry, a.CustomerTypeID
