Count distinct of multiple columns

I've been trying to figure out a query.
Let's say a table looks like this:
cus_id  prod_category  agreement_id  type_id
111     10             123456        1
111     10             123456        1
111     10             123456        2
111     20             123456        2
123     20             987654        6
999     0              135790        99
999     0              246810        99
and so on...
For each cus_id and prod_category, I would like to count the distinct combinations of agreement_id and type_id,
so I would like to get a result like this:
cus_id  prod_category  count
111     10             2
111     20             1
123     20             1
999     0              2

We can use the following two-level aggregation query:
SELECT cus_id, prod_category, COUNT(*) AS count
FROM
(
    SELECT DISTINCT cus_id, prod_category, agreement_id, type_id
    FROM yourTable
) t
GROUP BY cus_id, prod_category;
The inner DISTINCT query de-duplicates the tuples, and the outer aggregation query counts the number of distinct tuples per customer and product category.
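
To check the two-level query against the sample rows from the question, here is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for the real database. The column types, the distinct_count alias and the ORDER BY are my additions, used only to keep the output stable.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable "
    "(cus_id INT, prod_category INT, agreement_id INT, type_id INT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [
        (111, 10, 123456, 1),
        (111, 10, 123456, 1),  # exact duplicate, removed by the inner DISTINCT
        (111, 10, 123456, 2),
        (111, 20, 123456, 2),
        (123, 20, 987654, 6),
        (999, 0, 135790, 99),
        (999, 0, 246810, 99),
    ],
)

query = """
SELECT cus_id, prod_category, COUNT(*) AS distinct_count
FROM (
    SELECT DISTINCT cus_id, prod_category, agreement_id, type_id
    FROM yourTable
) t
GROUP BY cus_id, prod_category
ORDER BY cus_id, prod_category  -- added only for a stable output order
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data this prints one row per (cus_id, prod_category) pair, matching the result table in the question.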

You want to count distinct (agreement_id, type_id) tuples per (cus_id, prod_category) tuple.
"Per (cus_id, prod_category) tuple" translates to GROUP BY cus_id, prod_category in SQL.
And we count distinct (agreement_id, type_id) tuples with COUNT(DISTINCT agreement_id, type_id). Note that this multi-column form is MySQL syntax; many other databases accept only a single expression inside COUNT(DISTINCT ...), where a subquery with SELECT DISTINCT is the usual alternative.
SELECT cus_id, prod_category, COUNT(DISTINCT agreement_id, type_id) AS distinct_count
FROM mytable
GROUP BY cus_id, prod_category
ORDER BY cus_id, prod_category;

Related

Partition Over issue in SQL

I have an Order shipment table like below -
Order_ID  shipment_id  pkg_weight
1         101          5
1         101          5
1         101          5
1         102          3
1         102          3
I want the output table to look like below -
Order_ID  Distinct_shipment_id  total_pkg_weight
1         2                     8
I tried this query:
select
    order_id
    , count(distinct shipment_id)
    , avg(pkg_weight) over (partition by shipment_id)
from table1
group by order_id
but I am getting the error below -
column "pkg_weight" must appear in the GROUP BY clause or be used in an aggregate function
Please help.

Use a distinct select first, then aggregate:
SELECT Order_ID,
       COUNT(DISTINCT shipment_id) AS Distinct_shipment_id,
       SUM(pkg_weight) AS total_pkg_weight
FROM
(
    SELECT DISTINCT Order_ID, shipment_id, pkg_weight
    FROM table1
) t
GROUP BY Order_ID;
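
As a quick check of the distinct-then-aggregate idea, here is a small sketch that assumes Python's built-in sqlite3 module in place of the asker's database (the error message suggests PostgreSQL, but that is a guess). The column types are assumed too.

```python
import sqlite3

# In-memory SQLite stand-in for the asker's table1; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (Order_ID INT, shipment_id INT, pkg_weight INT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, 101, 5), (1, 101, 5), (1, 101, 5), (1, 102, 3), (1, 102, 3)],
)

query = """
SELECT Order_ID,
       COUNT(DISTINCT shipment_id) AS Distinct_shipment_id,
       SUM(pkg_weight) AS total_pkg_weight
FROM (
    -- collapse the repeated rows so each shipment's weight is summed once
    SELECT DISTINCT Order_ID, shipment_id, pkg_weight
    FROM table1
) t
GROUP BY Order_ID
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The inner DISTINCT only gives the intended total when every row of a shipment repeats the same pkg_weight, as in the sample data.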

How to build SQL to capture most unique value?

I am trying to build a query result with SQL. Here is my table:
CUST_ID ORDER_ID STORE_FREQUENCY
---------- ----------- ---------------
100 20122 500
100 20100 500
100 20100 737
200 20119 287
300 20130 434
300 20150 434
300 20130 434
300 20120 120
The expected output is:
CUST_ID UNIQUE_ORDERS TOP_STORE
--------- ----------------- ---------
100 2 737
200 1 287
300 3 434
The requirements for the output are:
TOP_STORE = Per CUST_ID, sort the STORE_FREQUENCY column in descending order and take the greatest store frequency
UNIQUE_ORDERS = Per CUST_ID, the number of unique ORDER_IDs in the column
I have started this SELECT statement, but I am having difficulty completing it so that it includes the 2 columns correctly:
Select cust_id, Count(order_id) as unique_orders
From ORDERS_TABLE
Group By Order_ID
Can you help me complete the 2 columns?

Use aggregate functions such as COUNT(DISTINCT ...) and MAX():
SELECT CUST_ID, COUNT(DISTINCT ORDER_ID) AS UNIQUE_ORDERS, MAX(STORE_FREQUENCY) AS TOP_STORE
FROM TableName
GROUP BY CUST_ID
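
To see the single aggregation at work on the question's rows, here is a minimal sketch that assumes Python's built-in sqlite3 module as the database. The table name ORDERS_TABLE comes from the question; the column types and the ORDER BY are my assumptions.

```python
import sqlite3

# In-memory SQLite stand-in; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ORDERS_TABLE (CUST_ID INT, ORDER_ID INT, STORE_FREQUENCY INT)"
)
conn.executemany(
    "INSERT INTO ORDERS_TABLE VALUES (?, ?, ?)",
    [
        (100, 20122, 500), (100, 20100, 500), (100, 20100, 737),
        (200, 20119, 287),
        (300, 20130, 434), (300, 20150, 434), (300, 20130, 434), (300, 20120, 120),
    ],
)

query = """
SELECT CUST_ID,
       COUNT(DISTINCT ORDER_ID) AS UNIQUE_ORDERS,  -- repeated ORDER_IDs count once
       MAX(STORE_FREQUENCY)     AS TOP_STORE       -- greatest value per customer
FROM ORDERS_TABLE
GROUP BY CUST_ID
ORDER BY CUST_ID  -- added only for a stable output order
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```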

It seems to me that the top store should be the store with the greatest number of orders. If so, then CUST_ID 100 should have store 500 as the top store, not 737. In other words, I would expect the following output:
CUST_ID   UNIQUE_ORDERS     TOP_STORE
--------- ----------------- ---------
100       2                 500
200       1                 287
300       3                 434
This requirement changes the query strategy, because we can no longer do a single simple aggregation over the entire table. One approach is to do a separate calculation to find the top store for each customer, then join that result to a query similar to the other answers.
WITH cte AS (
    SELECT CUST_ID, STORE_FREQUENCY, cnt,
           ROW_NUMBER() OVER (PARTITION BY CUST_ID ORDER BY cnt DESC) rn
    FROM
    (
        SELECT CUST_ID, STORE_FREQUENCY,
               COUNT(*) OVER (PARTITION BY CUST_ID, STORE_FREQUENCY) cnt
        FROM yourTable
    ) t
)
SELECT
    t1.CUST_ID,
    t1.UNIQUE_ORDERS,
    t2.TOP_STORE
FROM
(
    SELECT CUST_ID, COUNT(DISTINCT ORDER_ID) AS UNIQUE_ORDERS
    FROM yourTable
    GROUP BY CUST_ID
) t1
INNER JOIN
(
    SELECT CUST_ID, STORE_FREQUENCY AS TOP_STORE
    FROM cte
    WHERE rn = 1
) t2
    ON t1.CUST_ID = t2.CUST_ID;
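
Here is a sketch of the same CTE run against the question's rows, assuming Python's built-in sqlite3 module linked against SQLite 3.25 or later (the first version with window functions). The column types and the final ORDER BY are my additions.

```python
import sqlite3

# In-memory SQLite stand-in; needs SQLite >= 3.25 for window functions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (CUST_ID INT, ORDER_ID INT, STORE_FREQUENCY INT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        (100, 20122, 500), (100, 20100, 500), (100, 20100, 737),
        (200, 20119, 287),
        (300, 20130, 434), (300, 20150, 434), (300, 20130, 434), (300, 20120, 120),
    ],
)

query = """
WITH cte AS (
    SELECT CUST_ID, STORE_FREQUENCY, cnt,
           ROW_NUMBER() OVER (PARTITION BY CUST_ID ORDER BY cnt DESC) rn
    FROM (
        -- how many rows each store has for each customer
        SELECT CUST_ID, STORE_FREQUENCY,
               COUNT(*) OVER (PARTITION BY CUST_ID, STORE_FREQUENCY) cnt
        FROM yourTable
    ) t
)
SELECT t1.CUST_ID, t1.UNIQUE_ORDERS, t2.TOP_STORE
FROM (
    SELECT CUST_ID, COUNT(DISTINCT ORDER_ID) AS UNIQUE_ORDERS
    FROM yourTable
    GROUP BY CUST_ID
) t1
INNER JOIN (
    SELECT CUST_ID, STORE_FREQUENCY AS TOP_STORE
    FROM cte
    WHERE rn = 1  -- keep the most frequent store per customer
) t2 ON t1.CUST_ID = t2.CUST_ID
ORDER BY t1.CUST_ID  -- added only for a stable output order
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If two stores tie for a customer, ROW_NUMBER() picks one of them arbitrarily; the sample data has no ties.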

SQL Server : how to select the rows in a table with the same value on a column but some exact values on another column for the grouped rows

I have this table with some sample data:
Supplier_ID Product_ID Stock
-----------------------------
1 272 1
1 123 5
1 567 3
1 564 3
2 272 4
2 12 3
2 532 1
3 123 4
3 272 5
I want to find the suppliers that have both products 272 and 123, so the result would be:
Supplier_ID
-----------
1
3

You can use GROUP BY and HAVING:
SELECT Supplier_ID
FROM your_tab
WHERE Product_ID IN (272, 123)
GROUP BY Supplier_ID
HAVING COUNT(DISTINCT Product_ID) = 2;
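
A minimal sketch of the GROUP BY / HAVING approach on the question's sample rows, assuming Python's built-in sqlite3 module instead of SQL Server. The table name your_tab follows the answer; the column types and the ORDER BY are assumed.

```python
import sqlite3

# In-memory SQLite stand-in for the SQL Server table; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_tab (Supplier_ID INT, Product_ID INT, Stock INT)")
conn.executemany(
    "INSERT INTO your_tab VALUES (?, ?, ?)",
    [
        (1, 272, 1), (1, 123, 5), (1, 567, 3), (1, 564, 3),
        (2, 272, 4), (2, 12, 3), (2, 532, 1),
        (3, 123, 4), (3, 272, 5),
    ],
)

query = """
SELECT Supplier_ID
FROM your_tab
WHERE Product_ID IN (272, 123)         -- keep only the two products of interest
GROUP BY Supplier_ID
HAVING COUNT(DISTINCT Product_ID) = 2  -- both products must be present
ORDER BY Supplier_ID                   -- added only for a stable output order
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Supplier 2 stocks product 272 but not 123, so its distinct count is 1 and it is filtered out.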

Try this code:
SELECT A.Supplier_ID FROM
(SELECT Supplier_ID FROM Your_Table WHERE Product_ID = 272) AS A
INNER JOIN
(SELECT Supplier_ID FROM Your_Table WHERE Product_ID = 123) AS B
ON A.Supplier_ID = B.Supplier_ID

This is how it works using set operations. IMHO they are an underused feature of databases.
select Supplier_ID from table1 where product_id=272
intersect
select Supplier_ID from table1 where product_id=123
It also produces
Supplier_ID
1
3
By the way, DISTINCT is not needed because INTERSECT returns distinct rows.
http://sqlfiddle.com/#!6/13b11/3
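
To illustrate the point about INTERSECT returning distinct rows, here is a small sketch assuming Python's built-in sqlite3 module as the database. The duplicated (1, 272) row is not in the question; I added it only to show that duplicates do not leak into the result.

```python
import sqlite3

# In-memory SQLite stand-in; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (Supplier_ID INT, Product_ID INT, Stock INT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, 272, 1), (1, 123, 5), (1, 567, 3), (1, 564, 3),
        (2, 272, 4), (2, 12, 3), (2, 532, 1),
        (3, 123, 4), (3, 272, 5),
        (1, 272, 7),  # hypothetical duplicate, added for illustration only
    ],
)

query = """
SELECT Supplier_ID FROM table1 WHERE Product_ID = 272
INTERSECT
SELECT Supplier_ID FROM table1 WHERE Product_ID = 123
"""
# Sort in Python so the printed order does not depend on the engine.
rows = sorted(conn.execute(query).fetchall())
print(rows)
```

Supplier 1 appears once in the output even though it has two rows for product 272.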

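Try this code, which uses ROW_NUMBER(). It assumes each product appears at most once per supplier, because a duplicated product row would also produce rn > 1: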
;WITH cte
AS (SELECT *,
Row_number()
OVER(
partition BY [Supplier_ID]
ORDER BY [Supplier_ID]) AS rn
FROM #Your_Table
WHERE Product_ID IN ( 272, 123 ))
SELECT DISTINCT Supplier_ID
FROM cte
WHERE rn > 1
OUTPUT:
Supplier_ID
1
3

Sum function in SQL Server 2008

I have a table:
PropertyID Amount
--------------------------
1 40
1 20
1 10
2 10
2 90
I would like to achieve:
PropertyId Amount Total_Amount
---------------------------------------
1 40 70
1 20 70
1 10 70
2 10 100
2 90 100
using the query below:
SELECT
PropertyID,
SUM(Amount),
SUM(TotalAmount)
FROM
yourTable
WHERE
EndDate IS NULL
GROUP BY
PropertyID
Output:
PropertyId Amount TotalAmount
-------------------------------------
1 70 70
2 100 100
Please let me know how I can get my desired output.

You can do this using window functions:
select PropertyID, Amount,
sum(Amount) over (partition by PropertyId) as TotalAmount
from yourtable;
Used as a window function, sum() calculates the total of Amount over a group of rows while still returning every individual row. The group is defined by the partition by clause, so rows with the same value of PropertyId are in the same group.
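
Here is a minimal sketch of the window-function query on the question's rows, assuming Python's built-in sqlite3 module (SQLite 3.25 or later) rather than SQL Server 2008. The column types and the ORDER BY are my additions.

```python
import sqlite3

# In-memory SQLite stand-in; needs SQLite >= 3.25 for window functions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (PropertyID INT, Amount INT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [(1, 40), (1, 20), (1, 10), (2, 10), (2, 90)],
)

query = """
SELECT PropertyID, Amount,
       SUM(Amount) OVER (PARTITION BY PropertyID) AS TotalAmount
FROM yourtable
ORDER BY PropertyID, Amount DESC  -- added only for a stable output order
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Unlike GROUP BY, the window keeps all five input rows and repeats the per-property total on each of them.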

A correlated subquery also works:
SELECT PropertyID,
       Amount,
       (select sum(yt.Amount)
        from yourTable yt where yt.PropertyID = y.PropertyID and yt.EndDate IS NULL)
       as TotalAmount
FROM yourTable y
WHERE y.EndDate IS NULL

Limit number of occurrences in output group-by sql query

I have this query:
select rep, companyname,count(companyname) as [count], Commission from customers
group by repid,companyname,Commission
It returns, let's say:
rep  companyname  count  commission
1    ABC          1      10%
2    XYZ          2      10%
2    XYZ          1      20%
3    JKL          4      10%
3    JKL          1      30%
The desired output is:
rep  companyname  count  commission
2    XYZ          2      10%
2    XYZ          1      20%
3    JKL          4      10%
3    JKL          1      30%
I would like the output to show only those companies that appear twice or more in the result. How do I modify the above query? I have simplified the query (removed the where clause).

I would use a subquery to get the non-unique company names like this.
select rep, companyname,count(companyname) as [count], Commission from customers
where companyname in (
select c1.companyname from customers c1
group by c1.companyname having count(*) >= 2
)
group by repid,companyname,Commission
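
The question does not show the underlying customers rows, so the sketch below makes up rows that reproduce the counts in the question's result. It assumes Python's built-in sqlite3 module, uses rep for the column the question calls both rep and repid, and renames the [count] alias to cnt; all of that is my assumption, not the asker's schema.

```python
import sqlite3

# In-memory SQLite stand-in; schema and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (rep INT, companyname TEXT, Commission TEXT)")
hypothetical_rows = (
    [(1, "ABC", "10%")]
    + [(2, "XYZ", "10%")] * 2 + [(2, "XYZ", "20%")]
    + [(3, "JKL", "10%")] * 4 + [(3, "JKL", "30%")]
)
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", hypothetical_rows)

query = """
SELECT rep, companyname, COUNT(companyname) AS cnt, Commission
FROM customers
WHERE companyname IN (
    -- companies that have at least two rows in the table
    SELECT c1.companyname
    FROM customers c1
    GROUP BY c1.companyname
    HAVING COUNT(*) >= 2
)
GROUP BY rep, companyname, Commission
ORDER BY rep, Commission  -- added only for a stable output order
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that the subquery counts table rows per company, not result groups, so a company with two rows under a single commission would also pass the filter.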

I think this will match your requirements. I couldn't think of a way of doing it without some sort of sub query or CTE:
select
rep, companyname, [count], commission
from (
select
rep, companyname,count(companyname) as [count], Commission,
count(1) over (PARTITION by companyname) as [companycount]
from customers
group by repid,companyname,Commission
) sub
where companycount > 1

select rep
, companyname
, count(*) as [count] --- equivalent to count(companyname)
, Commission
from customers c
where exists
( select *
from customers c2
where c2.companyname = c.companyname
and ( c2.repid <> c.repid
or c2.Commission <> c.Commission
)
and ( extra-conditions )
)
and ( extra-conditions )
group by repid, companyname, Commission

Add a HAVING clause after your group by, e.g. HAVING count(companyName) > 1

You're looking for the HAVING keyword, which is essentially a WHERE condition applied to the groups produced by your GROUP BY:
select rep, companyname,count(companyname) as [count], Commission from customers
group by repid,companyname,Commission
having count(companyname) > 1
