Identifying the most purchased combination of items in SQL - sql

First of all, I hope everyone's staying safe out there.
So here's my question.
Currently I'm trying to figure out how I can identify the most purchased combination of items.
Most purchased combination of items must appear at the top (descending order is crucial).
Let's say I have a sales table that looks like this:
Cust_ID Item_ID
100 A
100 A
100 B
100 C
200 A
200 C
200 C
300 B
400 A
400 B
and the expected output looks something like this:
Comb_of_Item Count_of_Cust
A, B 10
A, C 7
B, C 4
A, B, C 2
Note that Customer 100 had purchased item "A" twice, which for the purpose of this exercise will be ignored (dups to be removed).
This means that Customer 100 would be counted as "A, B, C" NOT "A, A, B, C"
Any help/suggestion would be much appreciated.
Many thanks in advance!

I believe this can be modified into a better query, but for now this will get the job done.
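-- "within group (order by ...)" is SQL Server syntax; it keeps 'A,B' and 'B,A' from being counted as different combinations
select combo_of_item, count(*) Count_of_Cust from (
select Cust_ID, string_agg(Item_ID, ',') within group (order by Item_ID) combo_of_item from (
select distinct * from [table] ) a
group by Cust_ID) b
group by combo_of_item
order by Count_of_Cust desc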
db<>fiddle
By the way, since the OP didn't specify a DBMS, string_agg might have to be changed depending on which database the OP is using.
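If what you want is how many customers bought each pair of items (which is what the expected output seems to list for "A, B", "A, C" and "B, C"), a self-join on the de-duplicated rows is another option. This is only a sketch: it runs the SQL through Python's built-in sqlite3 module so it can be tried without a server, the table name sales is taken from the question's wording, and it counts pairs only, not triples such as "A, B, C".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (Cust_ID INTEGER, Item_ID TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(100, "A"), (100, "A"), (100, "B"), (100, "C"),
     (200, "A"), (200, "C"), (200, "C"),
     (300, "B"),
     (400, "A"), (400, "B")],
)

pair_counts = conn.execute("""
    WITH d AS (
        -- remove repeat purchases of the same item by the same customer
        SELECT DISTINCT Cust_ID, Item_ID FROM sales
    )
    SELECT x.Item_ID || ', ' || y.Item_ID AS Comb_of_Item,
           COUNT(*)                       AS Count_of_Cust
    FROM d AS x
    JOIN d AS y
      ON y.Cust_ID = x.Cust_ID
     AND y.Item_ID > x.Item_ID   -- each unordered pair once, never (A, A)
    GROUP BY x.Item_ID, y.Item_ID
    ORDER BY Count_of_Cust DESC, Comb_of_Item
""").fetchall()

for combo, customers in pair_counts:
    print(combo, customers)
```

The y.Item_ID > x.Item_ID condition is what stops "B, A" from showing up as a separate row from "A, B".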

Related

How to use SQL to label each row based on certain criteria?

I need to correctly label each row based on certain criteria. For example the data I have is like this:
Table Product
Group product_id product_name category
1 123 Egg A
1 456 Egg A
1 456 Milk A
1 789 Milk A
2 135 Apple B
2 137 Orange B
2 137 Banana B
2 139 Strawberry B
3 235 Egg A
3 237 Apple B
3 237 Egg B
3 239 Orange B
3 239 Egg B
Since product egg can be found in more than 1 product IDs and milk can be found in more than 1 product IDs, 123,456 and 789 should be marked as A. Basically if a product name appears more than once in a group, then it is marked as A, otherwise B.
I was trying to use array functions and compare them, but it doesn't work for this scenario. For example,
select product_id, array_agg(product_name) as p1 from product group by product_id
Then compare p1 with another array (p2) from the self inner join.
Any hints or help would be greatly appreciated!
Have you considered using a Case When statement?
Case
when product_name = 'egg' and category = 'a' then 'egg1'
when product_name = 'egg' and category = 'b' then 'egg2'
else 'no label'
End as label
I am referencing this post https://dba.stackexchange.com/questions/82487/case-with-multiple-conditions for clarity. - J
I am confused with your requirement. You state product name appears only once in a group, then it is marked as A, otherwise B. However, the data you show contains the exact opposite. The following produces what you said you wanted, not the values you posted. It will be correct or exactly the reverse. (See demo)
-- if a product name appears only once in a group, then it is marked as A, otherwise B.
with prod_group (group_id, product_name, cnt) as
( select group_id, product_name, count(*)
from products
group by group_id, product_name
) -- select * from prod_group ;
update products p
set category = case when grp.cnt = 1 then 'A' else 'B' end
from prod_group grp
where ( p.group_id, p.product_name) = ( grp.group_id, grp.product_name);
How it works: The prod_group CTE simply counts the number of times a product name appears in a group. The main "query" then uses this result to update category. Contrary to your statement that CASE isn't really going to help, the CASE expression is exactly what you need.
Note: GROUP is an extremely poor choice for a column name as it is both a Postgres (conditional) and a SQL Standard reserved word.
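To check the labels before running an UPDATE, the same count-then-CASE idea can be written as a plain SELECT. The sketch below uses Python's built-in sqlite3 module and only groups 1 and 2 from the question. It follows the rule as worded in the question (a name that appears more than once in a group gets 'A'), which is an assumption on my part; swap the two CASE branches if you want the reverse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (group_id INTEGER, product_id INTEGER, product_name TEXT)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, 123, "Egg"), (1, 456, "Egg"), (1, 456, "Milk"), (1, 789, "Milk"),
     (2, 135, "Apple"), (2, 137, "Orange"), (2, 137, "Banana"),
     (2, 139, "Strawberry")],
)

labels = conn.execute("""
    WITH prod_group AS (
        -- how many rows carry each product name within each group
        SELECT group_id, product_name, COUNT(*) AS cnt
        FROM products
        GROUP BY group_id, product_name
    )
    SELECT p.group_id, p.product_id, p.product_name,
           CASE WHEN g.cnt > 1 THEN 'A' ELSE 'B' END AS category
    FROM products AS p
    JOIN prod_group AS g
      ON g.group_id = p.group_id
     AND g.product_name = p.product_name
    ORDER BY p.group_id, p.product_id, p.product_name
""").fetchall()

for row in labels:
    print(row)
```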

Count blanks in multiple columns, grouped by another value

Ok so this gets me the count of how many Records of type A are blank in column B
SELECT A, Count(B)
FROM `table1`
where
B = ""
group by A
It gives me a table:
A | B
-------+------
First | 564
Second | 1985
And that is great. But I want this to summarize by counting blanks in multiple columns, not just blanks in column B, like this:
A | B | C
-------+------+------
First | 564 | 9001
Second | 1985 | 223
I have an intuition that this is done by creating another table first that would look like this
A | Column | Value
-------+--------+---------
First | "B" | B value
First | "C" | C value
Second | "B" | B value
Second | "C" | C value
for every document, so you can count blanks, but I'm not sure how to get there. Is this the right approach? or is there a much simpler version using pivot tables or similar?
You could try using a conditional sum:
select A,
Sum(case when b='' then 1 else 0 end) B,
Sum(case when c='' then 1 else 0 end) C
from t
group by A
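Here is a small runnable sketch of the conditional sum, using Python's built-in sqlite3 module. The rows are made up for illustration (they are not the asker's data), and the table name table1 comes from the question. The "Third" group has no blanks in C, which shows why an ELSE 0 branch is useful: without it that cell would come back as NULL rather than 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (A TEXT, B TEXT, C TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [("First", "", "x"), ("First", "", ""), ("First", "y", "z"),
     ("Second", "", "q"), ("Second", "w", ""),
     ("Third", "", "v")],   # hypothetical rows, for illustration only
)

blank_counts = conn.execute("""
    SELECT A,
           SUM(CASE WHEN B = '' THEN 1 ELSE 0 END) AS B,
           SUM(CASE WHEN C = '' THEN 1 ELSE 0 END) AS C
    FROM table1
    GROUP BY A
    ORDER BY A
""").fetchall()

for row in blank_counts:
    print(row)
```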

Calculating values from two tables where one has key in header and one has it in column values

I have a simple problem that I don't know how to solve in SQL.
I have two tables,
cost :
a | b | c
-------+-------+---------------
31.99 | 14.12 | 133.1
second table: income
Party | sum
------+--------
A | 90
B | 12
C | 70
Now I want a result that subtracts the cost from the income for each party A, B, C to find the net value. I cannot compare the column header to a column value. I am quite new to this, so I am struggling quite a lot. There should be a really easy way of doing this.
I created the 'cost' table by
SELECT sum(A) as A, sum(B) as B, sum(C) as C FROM mytable;
Maybe there is a clever way of creating this table in the same format as the income table that would make it easier to compare? I would appreciate any suggestion on either of the two fronts. Thanks a lot!
You can compare, using case:
select i.party,
i.sum - (case when i.party = 'A' then c.a
when i.party = 'B' then c.b
when i.party = 'C' then c.c
else 0 end) as net
from cost c cross join
income i
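On the other front raised in the question (getting cost into the same shape as income), one possible approach is to unpivot the three columns with UNION ALL and then do an ordinary join. The sketch below runs through Python's built-in sqlite3 module; the names cost_long and net are made up for the example, and the column sum is quoted only because it collides with the function name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cost (a REAL, b REAL, c REAL)")
conn.execute("INSERT INTO cost VALUES (31.99, 14.12, 133.1)")
conn.execute('CREATE TABLE income (party TEXT, "sum" REAL)')
conn.executemany(
    "INSERT INTO income VALUES (?, ?)",
    [("A", 90), ("B", 12), ("C", 70)],
)

net_rows = conn.execute("""
    WITH cost_long AS (
        -- turn the one-row, three-column table into (party, cost) rows
        SELECT 'A' AS party, a AS cost FROM cost
        UNION ALL
        SELECT 'B', b FROM cost
        UNION ALL
        SELECT 'C', c FROM cost
    )
    SELECT i.party, ROUND(i."sum" - cl.cost, 2) AS net
    FROM income AS i
    JOIN cost_long AS cl ON cl.party = i.party
    ORDER BY i.party
""").fetchall()

net = dict(net_rows)
print(net)
```

If mytable is still available, the same UNION ALL can select sum(A), sum(B) and sum(C) from it directly, so the wide cost table never has to be built.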

Trying to group items into different categories based on a specific field's value

I can't quite figure out how to put this into a simple question, so I'll explain what I have and what I need to do.
Table A
|..ItemNum..|..ItemUse..|..SubC..|..MainC..|
|..123..|..B..|..AAA..|..QQQ..|
|..456..|..J..|..BBB..|..QQQ..|
|..123..|..D..|..DDD..|..RRR..|
|..789..|..C..|..CCC..|..WWW..|
|..345..|..W..|..EEE..|..TTT..|
|..678..|..B..|..FFF..|..YYY..|
I need to make a list of ItemNum and MainC that are grouped into 3 categories:
B / C / D = 1
<anything else> = 2
B / C / D & <anything else> = 3
So my results would be:
|..MainC..|..Group..|
|..QQQ..|..3..|
|..RRR..|..1..|
|..WWW..|..1..|
|..TTT..|..2..|
|..YYY..|..1..|
I've got an IIf set up that takes care of groups 1 and 2, but I can't figure out how to get the values in MainC to come out with Group 3.
Any ideas?
I don't understand your explanation (especially the mapping to 3), but here's a shot:
select MainC, case when ItemUse in ('B','C','D') then 1
else 2
end as [Group]
from A
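If group 3 means "this MainC has at least one B/C/D row and at least one other row", which is how I read the expected results, the decision has to be made per MainC and not per row, so it needs a GROUP BY. A sketch using Python's built-in sqlite3 module; the names table_a and grp are placeholders chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_a (ItemNum INTEGER, ItemUse TEXT, SubC TEXT, MainC TEXT)"
)
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?, ?, ?)",
    [(123, "B", "AAA", "QQQ"), (456, "J", "BBB", "QQQ"),
     (123, "D", "DDD", "RRR"), (789, "C", "CCC", "WWW"),
     (345, "W", "EEE", "TTT"), (678, "B", "FFF", "YYY")],
)

group_rows = conn.execute("""
    SELECT MainC,
           CASE
               -- both kinds of ItemUse occur under this MainC
               WHEN MAX(CASE WHEN ItemUse IN ('B','C','D') THEN 1 ELSE 0 END) = 1
                AND MAX(CASE WHEN ItemUse NOT IN ('B','C','D') THEN 1 ELSE 0 END) = 1
                   THEN 3
               -- only B / C / D occur
               WHEN MAX(CASE WHEN ItemUse IN ('B','C','D') THEN 1 ELSE 0 END) = 1
                   THEN 1
               -- only other values occur
               ELSE 2
           END AS grp
    FROM table_a
    GROUP BY MainC
""").fetchall()

groups = dict(group_rows)
print(groups)
```

In a database with IIf, the inner CASE expressions can be written as IIf(...) calls in the same positions.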

SQL Server: Subtraction between 2 queries

I have 2 queries. The first returns a table showing the amount for each car, such as
Amount_Table Cars
800 Car A
900 Car B
2100 Car C
The second table shows the discounts for Car A, B & C respectively.
Discount_table
40
10
80
I wish to have a final query in which the Amount - Discount values are displayed.
The amount table comes from one query and the discount table from another, hence I wish to do
(amount-query) - (discount query)
I did
Select ( (amount-query) - (discount-query))
but that threw the error
Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.
Please help!
Try something like this:
Select AmountTable.Amount - isnull(DiscountTable.Discount, 0)
from AmountTable left join DiscountTable
on AmountTable.Car = DiscountTable.Car
You cannot "subtract" queries. You have to do a join between tables (or subqueries), and make expressions using columns' names.
You need to join:
SELECT *
,cars_table.amount - discounts_table.discount
FROM cars_table
INNER JOIN discounts_table
ON cars_table.some_key = discounts_table.some_key
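A runnable sketch of the join approach, using Python's built-in sqlite3 module. It assumes the discount query can also return the car as a key (the question only shows the discount values, so the Car column in DiscountTable is an assumption), and it uses COALESCE where SQL Server would also accept ISNULL. "Car D" is a hypothetical extra row added to show what the LEFT JOIN does when a car has no discount.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AmountTable (Car TEXT, Amount INTEGER)")
conn.execute("CREATE TABLE DiscountTable (Car TEXT, Discount INTEGER)")
conn.executemany(
    "INSERT INTO AmountTable VALUES (?, ?)",
    [("Car A", 800), ("Car B", 900), ("Car C", 2100),
     ("Car D", 500)],   # hypothetical car with no discount row
)
conn.executemany(
    "INSERT INTO DiscountTable VALUES (?, ?)",
    [("Car A", 40), ("Car B", 10), ("Car C", 80)],
)

final_rows = conn.execute("""
    SELECT a.Car,
           a.Amount - COALESCE(d.Discount, 0) AS Final_Amount
    FROM AmountTable AS a
    LEFT JOIN DiscountTable AS d
      ON d.Car = a.Car
    ORDER BY a.Car
""").fetchall()

for row in final_rows:
    print(row)
```

In the real case, each of the two existing queries can take the place of its table as a derived table: (amount-query) AS a LEFT JOIN (discount-query) AS d.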