How to count over rows and avoid duplicates?

For a university project I have to calculate a KPI based on the data in one table. The table stores data about supermarket baskets, the line items bought, and their product categories. I have to count all product categories of the products that were bought in a specific store. The table looks like this:
StoreId BasketID CategoryId
1 1 1
1 1 2
1 1 3
1 2 1
1 2 3
1 2 4
2 3 1
2 3 2
2 3 3
2 4 1
As the result of the query I want a table that counts the distinct product categories over all baskets associated with a store.
Something like this:
StoreId Count(CategoryId)
1 4
2 3
If I write a non-dynamic statement with hard-coded values, it works.
select basket_hash, store_id, count(DISTINCT retailer_category_id)
from promo.checkout_item
where store_id = 2
and basket_hash = 123
GROUP BY basket_hash, store_id;
But when I try to write it dynamically, the SQL calculates the count per basket and adds the individual counts together.
select store_id, Count(DISTINCT retailer_category_id)
from promo.checkout_item
group by store_id;
But like this it isn't comparing the categories over all baskets associated with a store, and I'm getting duplicates because a category can be in basket 1 and in basket 2.
Can somebody please help?
Thanks!

Given your expected result, do you want the following statement?
SELECT StoreId, COUNT(*)
FROM (
SELECT DISTINCT StoreId, CategoryId
FROM table_name
) AS t
GROUP BY StoreId;
Please replace "table_name" in the statement with your table's name. The alias on the subquery ("t") is required by several databases.
I'm not sure what "dynamic way" means.
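To check the idea against the sample rows from the question, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in database (an assumption; the question does not name its database). It runs the de-duplicating subquery and, for comparison, a plain COUNT(DISTINCT ...) grouped by store only, using the column names from the question's own queries.

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE checkout_item "
    "(store_id INT, basket_hash INT, retailer_category_id INT)"
)
rows = [
    (1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 2, 1), (1, 2, 3),
    (1, 2, 4), (2, 3, 1), (2, 3, 2), (2, 3, 3), (2, 4, 1),
]
conn.executemany("INSERT INTO checkout_item VALUES (?, ?, ?)", rows)

# Variant 1: de-duplicate (store, category) pairs first, then count per store.
via_subquery = conn.execute("""
    SELECT store_id, COUNT(*)
    FROM (
        SELECT DISTINCT store_id, retailer_category_id
        FROM checkout_item
    ) AS t
    GROUP BY store_id
    ORDER BY store_id
""").fetchall()

# Variant 2: COUNT(DISTINCT ...) with the basket left out of GROUP BY.
via_count_distinct = conn.execute("""
    SELECT store_id, COUNT(DISTINCT retailer_category_id)
    FROM checkout_item
    GROUP BY store_id
    ORDER BY store_id
""").fetchall()

print(via_subquery)        # [(1, 4), (2, 3)]
print(via_count_distinct)  # [(1, 4), (2, 3)]
```

On this data both variants return 4 categories for store 1 and 3 for store 2, which is the result the question asks for.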

I'm confused by your requirements. This is what I suppose you mean:
with checkout_item (store_id, basket_hash, retailer_category_id) as (
values
(1,1,1),(1,1,2),(1,1,3),(1,2,1),(1,2,3),
(1,2,4),(2,3,1),(2,3,2),(2,3,3),(2,4,1)
)
select distinct store_id, basket_hash, store_cats, basket_cats
from (
select store_id, basket_hash,
max(store_cats) over (partition by store_id) as store_cats,
max(basket_cats) over (partition by basket_hash) as basket_cats
from (
select store_id, basket_hash,
dense_rank() over (
partition by store_id
order by retailer_category_id
) as store_cats,
dense_rank() over (
partition by basket_hash
order by retailer_category_id
) as basket_cats
from checkout_item
) s
) s
order by 1, 2
;
store_id | basket_hash | store_cats | basket_cats
----------+-------------+------------+-------------
1 | 1 | 4 | 3
1 | 2 | 4 | 3
2 | 3 | 3 | 3
2 | 4 | 3 | 1

Related

SUM a column in SQL, based on DISTINCT values in another column, GROUP BY a third column

I'd appreciate some help on the following SQL problem:
I have a table of 3 columns:
ID Group Value
1 1 5
1 1 5
1 2 10
1 2 10
1 3 20
2 1 5
2 1 5
2 1 5
2 2 10
2 2 10
3 1 5
3 2 10
3 2 10
3 2 10
3 4 50
I need to group by ID, and I would like to SUM the values based on DISTINCT values in Group. So the value for a group is only counted once, even though it may appear multiple times for a particular ID.
So for IDs 1, 2 and 3, it should return 35, 15 and 65, respectively.
ID SUM
1 35
2 15
3 65
Note that each Group doesn't necessarily have a unique value
Thanks

The CTE removes all duplicate rows, so each different combination of ID and Group is counted once.
The outer SELECT then groups by ID.
For Postgres you would get:
WITH CTE as
(SELECT DISTINCT "ID", "Group", "Value" FROM tablA
)
SELECT "ID", SUM("Value") FROM CTE GROUP BY "ID"
ORDER BY "ID"
ID | sum
-: | --:
1 | 35
2 | 15
3 | 65

Given what we know at the moment this is what I'm thinking...
The CTE/inline view eliminates duplicates before the sum occurs.
WITH CTE AS (SELECT DISTINCT ID, "Group", Value FROM TableName)
SELECT ID, Sum(Value)
FROM CTE
GROUP BY ID
or
SELECT ID, Sum(Value)
FROM (SELECT DISTINCT * FROM TableName) CTE
GROUP BY ID
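As a quick check of the DISTINCT-then-SUM approach, here is a minimal sketch on the question's sample data, run through Python's built-in sqlite3 module. The table name tbl is hypothetical, and "Group" is quoted because GROUP is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "tbl" is a hypothetical table name; "Group" must be quoted (reserved word).
conn.execute('CREATE TABLE tbl (ID INT, "Group" INT, Value INT)')
rows = [
    (1, 1, 5), (1, 1, 5), (1, 2, 10), (1, 2, 10), (1, 3, 20),
    (2, 1, 5), (2, 1, 5), (2, 1, 5), (2, 2, 10), (2, 2, 10),
    (3, 1, 5), (3, 2, 10), (3, 2, 10), (3, 2, 10), (3, 4, 50),
]
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)

# Remove duplicate (ID, Group, Value) rows first, then sum per ID.
sums = conn.execute("""
    WITH cte AS (
        SELECT DISTINCT ID, "Group", Value FROM tbl
    )
    SELECT ID, SUM(Value)
    FROM cte
    GROUP BY ID
    ORDER BY ID
""").fetchall()

print(sums)  # [(1, 35), (2, 15), (3, 65)]
```

Note that DISTINCT here works on whole (ID, Group, Value) rows, so a Group that appeared with two different values for the same ID would contribute both values to the sum.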

SQL Group by Sales Rep - Select 2 counts

I would like to query a table that has leads assigned by sales rep to return the unique number of leads grouped by agent and also the number sold. There can be multiple leads from one buyer, so I would like to select distinct so that each buyer is counted only once. Here is the layout of the data:
AgentId BuyerEmail Product Category
1 lisa#gmail.com Jeans 1
1 lisa#gmail.com Hat 1
1 ryan#gmail.com Shoes 3
2 mark#gmail.com Jeans 1
2 mark#gmail.com Socks 1
2 mark#gmail.com Hat 1
4 john#gmail.com Shirt 3
5 lou#gmail.com Hat 3
5 tim#gmail.com Shirt 3
I would like to return a dataset like the following:
AgentId UniqueLeads QtySold
1 2 1
2 1 0
4 1 1
5 2 2
I can query this individually but I can't get it to return in one result set. Here are the 2 separate queries:
SELECT COUNT(DISTINCT BuyerEmail) FROM SalesLeads GROUP BY InitialAgent
SELECT COUNT(DISTINCT BuyerEmail) FROM SalesLeads WHERE Category = 3 GROUP BY InitialAgent
How can I query the table and have both data points return in one result set? Please note, a category = 3 means it is sold.

You can use conditional aggregation to calculate QtySold in the same statement:
select AgentId,
count(distinct BuyerEmail) as UniqueLeads,
count(distinct case when Category = 3 then BuyerEmail end) as QtySold
from SalesLeads
group by AgentId
When Category is anything other than 3, the case expression returns null, so that record isn't counted in the QtySold calculation. Counting distinct BuyerEmail values inside the case expression keeps a buyer with several sold leads from being counted more than once.
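Here is a minimal sketch of conditional aggregation on the question's sample rows, run with Python's built-in sqlite3 module as a stand-in database (an assumption). It counts distinct BuyerEmail values inside the CASE, which mirrors the question's second query (COUNT(DISTINCT BuyerEmail) with Category = 3), so a buyer with several sold leads would still count once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table and column names follow the question; the rows are its sample data.
conn.execute(
    "CREATE TABLE SalesLeads "
    "(AgentId INT, BuyerEmail TEXT, Product TEXT, Category INT)"
)
conn.executemany("INSERT INTO SalesLeads VALUES (?, ?, ?, ?)", [
    (1, "lisa#gmail.com", "Jeans", 1),
    (1, "lisa#gmail.com", "Hat", 1),
    (1, "ryan#gmail.com", "Shoes", 3),
    (2, "mark#gmail.com", "Jeans", 1),
    (2, "mark#gmail.com", "Socks", 1),
    (2, "mark#gmail.com", "Hat", 1),
    (4, "john#gmail.com", "Shirt", 3),
    (5, "lou#gmail.com", "Hat", 3),
    (5, "tim#gmail.com", "Shirt", 3),
])

# CASE yields NULL for unsold leads, and COUNT ignores NULLs, so only
# buyers with at least one Category = 3 row are counted in QtySold.
result = conn.execute("""
    SELECT AgentId,
           COUNT(DISTINCT BuyerEmail) AS UniqueLeads,
           COUNT(DISTINCT CASE WHEN Category = 3 THEN BuyerEmail END) AS QtySold
    FROM SalesLeads
    GROUP BY AgentId
    ORDER BY AgentId
""").fetchall()

print(result)  # [(1, 2, 1), (2, 1, 0), (4, 1, 1), (5, 2, 2)]
```

The output matches the dataset the question wants to get back in one result set.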

How to combine some rows into a single row, and delete other rows?

I have a table like this:
id | invoice_id | product_id | quantity | total
1 5 10 2 100
2 5 10 1 50
3 5 11 1 200
4 5 11 1 200
I want to combine the rows that have the same product_id within an invoice by adding their quantities and totals to one of the rows, and then delete the other rows from the table. So the output should look like this:
id | invoice_id | product_id | quantity | total
1 5 10 3 150
3 5 11 2 400
How can I do this? I was thinking of using an SQL function that returns a list of ids having the same invoice and product, and then using aggregate functions on quantity and total. Is there a simpler way to do this?

First, you need an UPDATE statement that, for each combination of invoice_id and product_id, sets the row with the minimum id to the summed quantity and total:
UPDATE tablename t
SET quantity = s.quantity,
total = s.total
FROM (
SELECT MIN(id) id, SUM(quantity) quantity, SUM(total) total
FROM tablename
GROUP BY invoice_id, product_id
) s
WHERE s.id = t.id;
Then a DELETE statement to delete all the other ids:
DELETE FROM tablename t1
WHERE t1.id > (
SELECT MIN(t2.id)
FROM tablename t2
WHERE t2.invoice_id = t1.invoice_id AND t2.product_id = t1.product_id
);
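The two steps can be tried end to end with Python's built-in sqlite3 module. This is only a sketch under stated assumptions: the table name invoice_item is hypothetical, and the UPDATE is written with correlated subqueries instead of UPDATE ... FROM so that it also runs on SQLite versions without that syntax. The DELETE uses NOT IN over the per-group minimum ids, which removes the same rows as the correlated form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "invoice_item" is a hypothetical table name; rows are the question's sample.
conn.execute(
    "CREATE TABLE invoice_item "
    "(id INT, invoice_id INT, product_id INT, quantity INT, total INT)"
)
conn.executemany("INSERT INTO invoice_item VALUES (?, ?, ?, ?, ?)", [
    (1, 5, 10, 2, 100),
    (2, 5, 10, 1, 50),
    (3, 5, 11, 1, 200),
    (4, 5, 11, 1, 200),
])

# Step 1: put the group totals on the row with the smallest id per group.
conn.execute("""
    UPDATE invoice_item
    SET quantity = (SELECT SUM(s.quantity) FROM invoice_item s
                    WHERE s.invoice_id = invoice_item.invoice_id
                      AND s.product_id = invoice_item.product_id),
        total = (SELECT SUM(s.total) FROM invoice_item s
                 WHERE s.invoice_id = invoice_item.invoice_id
                   AND s.product_id = invoice_item.product_id)
    WHERE id IN (SELECT MIN(id) FROM invoice_item
                 GROUP BY invoice_id, product_id)
""")

# Step 2: delete every row that is not the smallest id of its group.
conn.execute("""
    DELETE FROM invoice_item
    WHERE id NOT IN (SELECT MIN(id) FROM invoice_item
                     GROUP BY invoice_id, product_id)
""")
conn.commit()

merged = conn.execute("SELECT * FROM invoice_item ORDER BY id").fetchall()
print(merged)  # [(1, 5, 10, 3, 150), (3, 5, 11, 2, 400)]
```

The order matters: the UPDATE has to run before the DELETE, otherwise the quantities of the removed rows are lost. On a real table it is worth running both statements in one transaction.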

This looks like an aggregation query:
select min(id) as id, invoice_id, product_id,
sum(quantity) as quantity, sum(total) as total
from t
group by invoice_id, product_id;

sum of 2 different columns in MSSQL

I have 2 tables, A and B. The column names are the same in both tables. The columns are
1. fees
2. user_id
I want to get the sum of fees from both tables where user_id = 1
For example:
Table A:
id user_id fees
1 1 10
2 2 11
3 1 5
Table B:
id user_id fees
1 1 15
2 2 10
3 1 20
I need the result as below:
user_id fees
1 50
2 21
Please help me with the query

Try this:
select user_id, sum(fees) from (
select user_id, fees from Table_A
union all
select user_id, fees from Table_B
) as A
group by user_id
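For reference, a minimal sketch of the UNION ALL approach on the question's sample data, run with Python's built-in sqlite3 module as a stand-in for MSSQL (an assumption; the query itself uses only standard SQL). The table names Table_A and Table_B follow the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two tables with the same layout, filled with the question's sample rows.
for name in ("Table_A", "Table_B"):
    conn.execute(f"CREATE TABLE {name} (id INT, user_id INT, fees INT)")
conn.executemany("INSERT INTO Table_A VALUES (?, ?, ?)",
                 [(1, 1, 10), (2, 2, 11), (3, 1, 5)])
conn.executemany("INSERT INTO Table_B VALUES (?, ?, ?)",
                 [(1, 1, 15), (2, 2, 10), (3, 1, 20)])

# Stack the rows of both tables, then sum the fees per user.
totals = conn.execute("""
    SELECT user_id, SUM(fees)
    FROM (
        SELECT user_id, fees FROM Table_A
        UNION ALL
        SELECT user_id, fees FROM Table_B
    ) AS combined
    GROUP BY user_id
    ORDER BY user_id
""").fetchall()

print(totals)  # [(1, 50), (2, 21)]
```

UNION ALL is the important part: a plain UNION removes duplicate (user_id, fees) rows before the sum, which would undercount whenever the same user has the same fee more than once.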

SQL: SELECT value for all rows based on a value in one of the rows and a condition

I have a list of total store visits for a customer for a month. The customer has a home store but can visit other stores. Like the table below:
MemberId | HomeStoreId | VisitedStoreId | Month | Visits
1 5 5 1 5
1 5 3 1 2
1 5 2 1 1
1 5 4 1 7
I want my select statement to give the number of visits to the home store against each store for that member for that month. Like the below:
MemberId | HomeStoreId | VisitedStoreId | Month | Visits | HomeStoreVisits
1 5 5 1 5 5
1 5 3 1 2 5
1 5 2 1 1 5
1 5 4 1 7 5
I've looked at a SUM with CASE statements inside and OVER with PARTITION but I can't seem to work it out.
Thanks

I would use window functions:
select t.*,
sum(case when homestoreid = visitedstoreid then visits end) over
(partition by memberid, month) as homestorevisits
from t;
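A minimal sketch of the window-function approach on the question's sample rows, using Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The table name store_visits is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "store_visits" is a hypothetical table name; rows are the question's sample.
conn.execute(
    "CREATE TABLE store_visits "
    "(memberid INT, homestoreid INT, visitedstoreid INT, month INT, visits INT)"
)
conn.executemany("INSERT INTO store_visits VALUES (?, ?, ?, ?, ?)", [
    (1, 5, 5, 1, 5),
    (1, 5, 3, 1, 2),
    (1, 5, 2, 1, 1),
    (1, 5, 4, 1, 7),
])

# The CASE keeps only the home-store row's visits; SUM ... OVER then
# repeats that value on every row of the same member and month.
rows = conn.execute("""
    SELECT memberid, visitedstoreid, visits,
           SUM(CASE WHEN homestoreid = visitedstoreid THEN visits END)
               OVER (PARTITION BY memberid, month) AS homestorevisits
    FROM store_visits
    ORDER BY visitedstoreid
""").fetchall()

print(rows)  # [(1, 2, 1, 5), (1, 3, 2, 5), (1, 4, 7, 5), (1, 5, 5, 5)]
```

Every row carries 5 in the last column, the number of visits to home store 5. If a member had no home-store row in some month, the sum would be NULL for that partition.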

SELECT t.MemberId, t.HomeStoreId, t.VisitedStoreId, t.Month, t.Visits, h.HomeStoreVisits
FROM tablename t
LEFT OUTER JOIN (
SELECT MemberId, Month, Visits AS HomeStoreVisits
FROM tablename
WHERE HomeStoreId = VisitedStoreId
) h ON h.MemberId = t.MemberId AND h.Month = t.Month

You can achieve this using a simple subquery.
SELECT MemberId, HomeStoreID, VisitedStoreID, Month, Visits,
(SELECT Visits FROM table t2
WHERE t2.MemberId = t1.MemberId
AND t2.HomeStoreId = t1.HomeStoreId
AND t2.Month = t1.Month
AND t2.VisitedStoreId = t2.HomeStoreId) AS HomeStoreVisits
FROM table t1
