Assign a category to a product without repeating - sql

I have a function that produces a table like below. The sequence is important here.
I want each product to be assigned to a separate category, but if the product changes back over time, e.g. from Product D to Product C (row 11), then a new category should be created. The result I want is in the Result column.
Order  Number  Product    Result
1      106893  Product A  1
2      108468  Product B  2
3      108468  Product B  2
4      107011  Product C  3
5      107011  Product C  3
6      107011  Product C  3
7      107011  Product D  4
8      107011  Product D  4
9      107011  Product D  4
10     107011  Product D  4
11     107011  Product C  5
12     107011  Product E  6
13     107011  Product E  6
I tried to do it with rank(), but in row 11 it gives me a result of 3 again instead of 5.
I did manage it with a CTE, but it takes a long time to calculate even on a small sample. There must be a simpler and faster way.

Use lag() to see where there is a switch. Then use a cumulative sum:
select t.*,
       sum(case when prev_product = product then 0 else 1 end) over (order by order)
from (select t.*,
             lag(product) over (order by order) as prev_product
      from t
     ) t;
order is, of course, a SQL keyword. It is a bad name for a column and may need to be escaped if that is the actual name.
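To check the lag() plus cumulative sum idea against the sample data, here is a minimal sketch that runs the same query in SQLite through Python's sqlite3 module. The table name t, the quoting of "order", and the result alias are assumptions made for the sketch, and it needs an SQLite build with window functions (3.25 or later).

```python
import sqlite3

# Sample rows from the question: (order, number, product).
rows = [
    (1, 106893, "Product A"), (2, 108468, "Product B"), (3, 108468, "Product B"),
    (4, 107011, "Product C"), (5, 107011, "Product C"), (6, 107011, "Product C"),
    (7, 107011, "Product D"), (8, 107011, "Product D"), (9, 107011, "Product D"),
    (10, 107011, "Product D"), (11, 107011, "Product C"), (12, 107011, "Product E"),
    (13, 107011, "Product E"),
]

conn = sqlite3.connect(":memory:")
# "order" is a reserved word, so it is quoted wherever it is used as a column.
conn.execute('CREATE TABLE t ("order" INTEGER, number INTEGER, product TEXT)')
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

query = '''
SELECT s."order", s.product,
       -- Add 1 whenever the product differs from the previous row's product.
       -- On the first row prev_product is NULL, so the ELSE branch applies.
       SUM(CASE WHEN s.prev_product = s.product THEN 0 ELSE 1 END)
           OVER (ORDER BY s."order") AS result
FROM (SELECT t.*,
             LAG(product) OVER (ORDER BY "order") AS prev_product
      FROM t) AS s
ORDER BY s."order"
'''
results = [row[2] for row in conn.execute(query)]
print(results)
```

Row 11 gets a new category because it is compared only with row 10 (Product D), not with the earlier run of Product C.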

Related

Merge row values based on other column value

I'm trying to merge the values of two rows based on the value of another column. Below is my base table:
Customer ID  Property ID  Bookings per customer  Cancellations per customer
A            1            0                      1
B            2            10                     1
C            3            100                    1
C            4            100                    1
D            5            20                     1
Here is the SQL query I used
select customer_id, property_id, bookings_per_customer, cancellations_per_customer
from table
And this is what I want to see. Any ideas what the query to get this would be? We use Presto SQL.
Thanks!
Customer ID  Property ID  Bookings per customer  Cancellations per customer
A            1            0                      1
B            2            10                     1
C            3 , 4        100                    1
D            5            20                     1
We can try:
SELECT
    customer_id,
    ARRAY_JOIN(ARRAY_AGG(property_id), ',') AS properties,
    bookings_per_customer,
    cancellations_per_customer
FROM yourTable
GROUP BY
    customer_id,
    bookings_per_customer,
    cancellations_per_customer;
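The grouping logic can be tried without a Presto cluster. The sketch below is only an analogue: SQLite has no ARRAY_AGG or ARRAY_JOIN, so GROUP_CONCAT stands in for the pair, and the table name bookings is made up for the example. Without an explicit ordering, the order of the concatenated ids is not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookings ("
    "customer_id TEXT, property_id INTEGER, "
    "bookings_per_customer INTEGER, cancellations_per_customer INTEGER)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?, ?, ?)",
    [("A", 1, 0, 1), ("B", 2, 10, 1), ("C", 3, 100, 1), ("C", 4, 100, 1), ("D", 5, 20, 1)],
)

query = """
SELECT customer_id,
       -- GROUP_CONCAT is SQLite's stand-in for ARRAY_JOIN(ARRAY_AGG(...), ',').
       GROUP_CONCAT(property_id, ',') AS properties,
       bookings_per_customer,
       cancellations_per_customer
FROM bookings
GROUP BY customer_id, bookings_per_customer, cancellations_per_customer
ORDER BY customer_id
"""
merged = conn.execute(query).fetchall()
for row in merged:
    print(row)
```

Customer C collapses into a single row because all three grouping columns match on both of its source rows.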

How do I stop duplication on a SQL join on order_id when an order has more than one item (multiple product_ids), so that discounts are calculated correctly?

So my problem is my discount number is blowing up because an order has a discount for the entire order, but I am making a dataset where there are multiple lines for each order to represent each product in the order. Instead of the discount only applying once to the order, it adds the discount for every line.
What is happening:
order_id  product_id  quantity  amount  discount
1         a           1         5       0
2         a           1         5       7
2         b           1         10      7
3         a           1         5       5
3         b           1         10      5
3         c           1         15      5
What I want:
order_id  product_id  quantity  amount  discount
1         a           1         5       0
2         a           1         5       7
2         b           1         10      0
3         a           1         5       5
3         b           1         10      0
3         c           1         15      0
I just want the discount to be applied once per order, and my join is using order_id so that is why the discount is applying multiple times. I would attach my code, but it's a decent sized CTE
Figured it out. I did need to use row_number() over (partition by order_id), but I was also losing records if the order had more than one item. The solution was to use a CASE WHEN expression on the row number:
CASE WHEN ORDER_ROW_COUNT = 1 THEN DISCOUNT ELSE 0 END
This allowed me to keep the records without duplicating the discounts.
You're joining on a field that isn't unique, so the join returns every record for that order_id and the discount is applied to each of them. You need some sort of differentiator field, something that is unique within each order's set of rows.
Example:
select *, row_number() over (partition by order_id order by order_id) as rownumber
into #temp
from table
This gives every row within an order its own number, starting at 1.
Then join on order_id = order_id and rownumber = 1, so that only the first record for each order is updated.
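Putting the row-number idea together with the CASE expression, a minimal sketch on the sample data could look like this, run in SQLite through Python's sqlite3 module. The table name order_lines and the ORDER BY product_id inside the window are assumptions; ordering by product_id just makes it deterministic which line keeps the discount.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_lines ("
    "order_id INTEGER, product_id TEXT, quantity INTEGER, amount INTEGER, discount INTEGER)"
)
# The "what is happening" rows: the order-level discount is repeated on every line.
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?, ?, ?)",
    [(1, "a", 1, 5, 0), (2, "a", 1, 5, 7), (2, "b", 1, 10, 7),
     (3, "a", 1, 5, 5), (3, "b", 1, 10, 5), (3, "c", 1, 15, 5)],
)

query = """
SELECT order_id, product_id, quantity, amount,
       -- Keep the discount on the first line of each order, zero it elsewhere.
       CASE WHEN ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY product_id) = 1
            THEN discount ELSE 0 END AS discount
FROM order_lines
ORDER BY order_id, product_id
"""
fixed = conn.execute(query).fetchall()
discounts = [row[4] for row in fixed]
print(discounts)
```

All six lines are kept, and summing the discount column now counts each order's discount once.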

Update new foreign key column of existing table with ids from another table in SQL Server

I have an existing table to which I have added a new column which is supposed to hold the Id of a record in another (new) table.
Simplified structure is sort of like this:
Customer table
[CustomerId] [GroupId] [LicenceId] <-- new column
Licence table <-- new table
[LicenceId] [GroupId]
The Licence table has a certain number of licences per group that can be assigned to customers in that same group. There are multiple groups, and each group has a variable number of customers and licences.
So say there are 100 licences available for group 1 and there are 50 customers in group 1, so each can get a license. There are never more customers than there are licences.
Sample
Customer
[CustomerId] [GroupId] [LicenceId]
1 1 NULL
2 1 NULL
3 1 NULL
4 1 NULL
5 2 NULL
6 2 NULL
7 2 NULL
8 3 NULL
9 3 NULL
Licence
[LicenceId] [GroupId]
1 1
2 1
3 1
4 1
5 1
6 1
7 2
8 2
9 2
10 2
11 2
12 3
13 3
14 3
15 3
16 3
17 3
Desired outcome
Customer
[CustomerId] [GroupId] [LicenceId]
1 1 1
2 1 2
3 1 3
4 1 4
5 2 7
6 2 8
7 2 9
8 3 12
9 3 13
So now I have to do this one time update to give every customer a licence and I have no idea how to go about it.
I'm not allowed to use a cursor. I can't seem to do a MERGE UPDATE, because joining the Customer to the Licence table by GroupId will result in multiple hits.
How do I assign each customer the next available LicenceId within their group in one query?
Is this even possible?
You can use window functions:
with c as (
      select c.*, row_number() over (partition by groupid order by newid()) as seqnum
      from Customer c
     ),
     l as (
      select l.*, row_number() over (partition by groupid order by newid()) as seqnum
      from Licence l
     )
update c
    set c.licenceid = l.licenceid
    from c join
         l
         on c.seqnum = l.seqnum and c.groupid = l.groupid;
This assigns the licenses randomly. That is really just for fun. The most efficient method is to use:
row_number() over (partition by groupid order by (select null)) as seqnum
SQL Server often avoids an additional sort operation in this case.
But you might want to order them by something else -- for instance by the ordering of the customer ids, or by some date column, or something else.
Gordon has put it very well in his answer.
Let me break it down into simpler steps for you.
Step 1. Use the ROW_NUMBER() function to assign a SeqNum to the customers. Use PARTITION BY GroupId so that the number starts from 1 in every group. I would ORDER BY CustomerId.
Step 2. Use the ROW_NUMBER() function to assign a SeqNum to the licences. Use PARTITION BY GroupId so that the number starts from 1 in every group. ORDER BY LicenceId, because your ask is to "assign each customer the next available LicenceId within their group".
Now join these two queries on GroupId and SeqNum to update LicenceId in the Customer table.
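The two steps can be sketched end to end on the sample data. This version runs in SQLite through Python's sqlite3 module rather than SQL Server, so the UPDATE is written with a correlated subquery instead of updating through the CTE; that rewrite, and ordering by CustomerId and LicenceId, are choices made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY, GroupId INTEGER, LicenceId INTEGER)")
conn.execute("CREATE TABLE Licence (LicenceId INTEGER PRIMARY KEY, GroupId INTEGER)")

# Sample data from the question: 9 customers and 17 licences across 3 groups.
customers = [(1, 1), (2, 1), (3, 1), (4, 1), (5, 2), (6, 2), (7, 2), (8, 3), (9, 3)]
licences = ([(i, 1) for i in range(1, 7)]
            + [(i, 2) for i in range(7, 12)]
            + [(i, 3) for i in range(12, 18)])
conn.executemany("INSERT INTO Customer (CustomerId, GroupId) VALUES (?, ?)", customers)
conn.executemany("INSERT INTO Licence (LicenceId, GroupId) VALUES (?, ?)", licences)

update = """
WITH c AS (  -- Step 1: number the customers within each group
    SELECT CustomerId, GroupId,
           ROW_NUMBER() OVER (PARTITION BY GroupId ORDER BY CustomerId) AS seqnum
    FROM Customer
),
l AS (       -- Step 2: number the licences within each group
    SELECT LicenceId, GroupId,
           ROW_NUMBER() OVER (PARTITION BY GroupId ORDER BY LicenceId) AS seqnum
    FROM Licence
)
UPDATE Customer
SET LicenceId = (
    -- Pair the n-th customer of a group with the n-th licence of that group.
    SELECT l.LicenceId
    FROM c JOIN l ON l.GroupId = c.GroupId AND l.seqnum = c.seqnum
    WHERE c.CustomerId = Customer.CustomerId
)
"""
conn.execute(update)
assigned = conn.execute(
    "SELECT CustomerId, LicenceId FROM Customer ORDER BY CustomerId"
).fetchall()
print(assigned)
```

Because the join is on both GroupId and the sequence number, each customer matches exactly one licence, which avoids the multiple hits that a join on GroupId alone produces.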

Oracle SQL Count grouped rows in table

I was wondering if it is possible, preferably using a select statement on PL/SQL V11, to get the following results from this table:
Area Store Product
10 1 A
10 1 B
11 1 E
11 1 D
10 2 C
10 2 B
10 2 A
10 3 B
10 3 A
13 1 B
13 1 A
and return the result below. It groups by Area and Store, and then looks for other Area and Store combinations with the same set of products. Area 10 Store 1 has products A and B, so it looks through the list for other stores that have only A and B and counts them. In this example it counts Area 10 Store 1, Area 10 Store 3 and Area 13 Store 1.
Product Count of groups
AB 3
ABC 1
DE 1
Thanks in advance for the help.
Yes, you can use listagg() and then another group by:
select products, count(*)
from (select listagg(product) within group (order by product) as products
      from t
      group by area, store
     ) p
group by products;
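For anyone without an Oracle instance at hand, the two-level grouping can be approximated in SQLite through Python's sqlite3 module. SQLite has no listagg(), so this sketch swaps in GROUP_CONCAT as a window function with an explicit ORDER BY to get a deterministic product string per area and store; that substitution and the table name t are assumptions, and it needs SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (area INTEGER, store INTEGER, product TEXT)")
# Sample rows from the question.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(10, 1, "A"), (10, 1, "B"), (11, 1, "E"), (11, 1, "D"),
     (10, 2, "C"), (10, 2, "B"), (10, 2, "A"), (10, 3, "B"),
     (10, 3, "A"), (13, 1, "B"), (13, 1, "A")],
)

query = """
SELECT products, COUNT(*) AS group_count
FROM (
    -- One row per (area, store) carrying its products concatenated in sorted order.
    SELECT DISTINCT area, store,
           GROUP_CONCAT(product, '') OVER (
               PARTITION BY area, store
               ORDER BY product
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS products
    FROM t
) AS p
GROUP BY products
ORDER BY products
"""
counts = conn.execute(query).fetchall()
print(counts)
```

Sorting the products before concatenating is what makes E, D and D, E count as the same set.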

How to update a column with incrementally sequenced values that change depending on other column value

I am trying to update a column in a table so that the Index column (which is currently arbitrary numbers) is renumbered sequentially starting at 1000 with increments of 10, and this sequence restarts every time the Group changes.
I have tried ROW_NUMBER() with PARTITION BY and also tried defining a SEQUENCE, but I can't seem to get the result I'm looking for.
Table 1
ID Group Index
1 A 1
2 A 2
3 B 3
4 B 4
5 B 5
6 C 6
7 D 7
What I want:
Table 1
ID Group Index
1 A 1000
2 A 1010
3 B 1000
4 B 1010
5 B 1020
6 C 1000
7 D 1000
You can use row_number() with some arithmetic:
select t.*,
       990 + 10 * row_number() over (partition by group order by id) as index
from t;
Note that group and index are SQL reserved words, so they are really bad column names.
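A quick way to check the arithmetic on the sample rows is the sketch below, run in SQLite through Python's sqlite3 module. The reserved names are quoted as "group" and "index", and the output alias new_index is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "group" and "index" are reserved words, so they are quoted as identifiers.
conn.execute('CREATE TABLE t (id INTEGER, "group" TEXT, "index" INTEGER)')
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "A", 1), (2, "A", 2), (3, "B", 3), (4, "B", 4),
     (5, "B", 5), (6, "C", 6), (7, "D", 7)],
)

query = '''
SELECT id, "group",
       -- row_number() restarts at 1 per group, so 990 + 10 * n gives 1000, 1010, ...
       990 + 10 * ROW_NUMBER() OVER (PARTITION BY "group" ORDER BY id) AS new_index
FROM t
ORDER BY id
'''
new_indexes = [row[2] for row in conn.execute(query)]
print(new_indexes)
```

The first row of every group gets 990 + 10 * 1 = 1000, and each following row in the same group adds 10.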