Select column's occurrence order without group by - sql

I currently have two tables, users and coupons.
users:
id  first_name
1   Roberta
2   Oliver
3   Shayna
4   Fechin
coupons:
id  discount  user_id
1   20%       1
2   40%       2
3   15%       3
4   30%       1
5   10%       1
6   70%       4
What I want to do is select from the coupons table until I've selected X users.
For example, if I chose X = 2, the resulting table would be:
id  discount  user_id
1   20%       1
2   40%       2
4   30%       1
5   10%       1
I've tried using both dense_rank and row_number, but they return the count of occurrences of each user_id, not its order of occurrence.
SELECT id,
discount,
user_id,
dense_rank() OVER (PARTITION BY user_id)
FROM coupons
I'm guessing I need to do it in multiple subqueries (which is fine) where the first subquery would return something like
id  discount  user_id  order_of_occurence
1   20%       1        1
2   40%       2        2
3   15%       3        3
4   30%       1        1
5   10%       1        1
6   70%       4        4
which I can then use to filter by what I need.
PS: I'm using postgresql.

You've said you want to parameterize the query so that it retrieves X users. I'm reading that as: all coupons for the first X distinct user_ids, in order of the coupon id column.
Your attempt was close, and dense_rank() is the right idea. The ranking has to run over the entire table, so it can't be partitioned by user_id, and it needs an ORDER BY to define the ranking. Ordering by id directly would give every row its own rank because id is unique, so rank each user by the id of its first coupon instead:
with firsts as (
    select *,
           min(id) over (partition by user_id) as first_id
    from coupons
), data as (
    select *,
           dense_rank() over (order by first_id) as dr
    from firsts
)
select id, discount, user_id from data where dr <= <X> order by id;
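To check the idea against the sample data, here is a minimal sketch that ranks each user by the id of its first coupon and keeps the first X users. It runs on SQLite through Python's sqlite3 module purely for convenience (the question is about PostgreSQL, where the same SQL should apply); the names first_id, dr and coupons_for_first_users are made up for this illustration.

```python
import sqlite3

# Sample data from the question; SQLite stands in for PostgreSQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE coupons (id INTEGER PRIMARY KEY, discount TEXT, user_id INTEGER)"
)
conn.executemany(
    "INSERT INTO coupons VALUES (?, ?, ?)",
    [(1, "20%", 1), (2, "40%", 2), (3, "15%", 3),
     (4, "30%", 1), (5, "10%", 1), (6, "70%", 4)],
)

QUERY = """
WITH firsts AS (
    -- id of each user's first coupon, repeated on every row of that user
    SELECT *, MIN(id) OVER (PARTITION BY user_id) AS first_id
    FROM coupons
), ranked AS (
    -- users numbered 1, 2, 3, ... in order of first appearance
    SELECT *, DENSE_RANK() OVER (ORDER BY first_id) AS dr
    FROM firsts
)
SELECT id, discount, user_id
FROM ranked
WHERE dr <= ?
ORDER BY id
"""


def coupons_for_first_users(x):
    """Return every coupon belonging to the first x users to appear (hypothetical helper)."""
    return conn.execute(QUERY, (x,)).fetchall()


print(coupons_for_first_users(2))
```

With X = 2 this returns coupons 1, 2, 4 and 5, which is the result table asked for in the question.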

Related

How do I stop duplication on a SQL join where I have order_ids and people order more than 1 item (so multiple product_ids) when calculating discounts?

So my problem is my discount number is blowing up because an order has a discount for the entire order, but I am making a dataset where there are multiple lines for each order to represent each product in the order. Instead of the discount only applying once to the order, it adds the discount for every line.
What is happening:
order_id  product_id  quantity  amount  discount
1         a           1         5       0
2         a           1         5       7
2         b           1         10      7
3         a           1         5       5
3         b           1         10      5
3         c           1         15      5
What I want:
order_id  product_id  quantity  amount  discount
1         a           1         5       0
2         a           1         5       7
2         b           1         10      0
3         a           1         5       5
3         b           1         10      0
3         c           1         15      0
I just want the discount to be applied once per order, and my join is using order_id so that is why the discount is applying multiple times. I would attach my code, but it's a decent sized CTE
Figured it out. I did need to use row_number() OVER (PARTITION BY order_id), but I was also losing records when an order had more than 1 item. The solution was to use a CASE WHEN expression:
CASE WHEN ORDER_ROW_COUNT = 1 THEN DISCOUNT ELSE 0 END
This allowed me to keep all the records without duplicating the discounts.
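For anyone who wants to see the whole pattern end to end, here is a small sketch of the row_number() plus CASE WHEN idea. The original CTE was not posted, so the table names order_lines and order_discounts and the column order_discount are assumptions made for this example; it runs on SQLite through Python only so that it is easy to try.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema: one row per product line, one discount per order.
CREATE TABLE order_lines (order_id INTEGER, product_id TEXT, quantity INTEGER, amount INTEGER);
CREATE TABLE order_discounts (order_id INTEGER, discount INTEGER);
INSERT INTO order_lines VALUES
    (1, 'a', 1, 5), (2, 'a', 1, 5), (2, 'b', 1, 10),
    (3, 'a', 1, 5), (3, 'b', 1, 10), (3, 'c', 1, 15);
INSERT INTO order_discounts VALUES (1, 0), (2, 7), (3, 5);
""")

QUERY = """
WITH numbered AS (
    -- The join repeats the order-level discount on every line of the order,
    -- so number the lines within each order to tell them apart.
    SELECT l.*, d.discount AS order_discount,
           ROW_NUMBER() OVER (PARTITION BY l.order_id ORDER BY l.product_id) AS order_row_count
    FROM order_lines l
    JOIN order_discounts d ON d.order_id = l.order_id
)
SELECT order_id, product_id, quantity, amount,
       -- Keep every line, but only the first line of an order carries the discount.
       CASE WHEN order_row_count = 1 THEN order_discount ELSE 0 END AS discount
FROM numbered
ORDER BY order_id, product_id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Because no rows are filtered out, summing amount still covers every line, while summing discount counts each order's discount once.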
You're joining on a field that isn't unique, so the join returns every record for that order id and the discount is applied to each of them. You need some sort of differentiator field: something that is unique within each order's rows.
Example:
SELECT *, ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY order_id) AS rownumber INTO #temp FROM table
This should give you something like the picture.
[image: rownumber table]
Then join on order_id = order_id AND rownumber = 1, so that only the first record of each order is updated.

How to get the values for every group of the top 3 types

I've got this table ratings:
id  user_id  type  value
0   0        Rest  4
1   0        Bar   3
2   0        Cine  2
3   0        Cafe  1
4   1        Rest  4
5   1        Bar   3
6   1        Cine  2
7   1        Cafe  5
8   2        Rest  4
9   2        Bar   3
10  3        Cine  2
11  3        Cafe  5
I want to have a table with a row for every pair (user_id, type) for the top 3 rated types through all users (ranked by sum(value) across the whole table).
Desired result:
user_id  type  value
0        Rest  4
0        Cafe  1
0        Bar   3
1        Rest  4
1        Cafe  5
1        Bar   3
2        Rest  4
3        Cafe  5
2        Bar   3
I was able to do this with two queries, one to get the top 3 and then another to get the rows where the type matches the top 3 types.
Does someone know how to fit this into a single query?
Get rows per user for the 3 highest ranking types, where types are ranked by the total sum of their value across the whole table.
So it's not exactly about the top 3 types per user, but about the top 3 types overall. Not all users will have rows for the top 3 types, even if there would be 3 or more types for the user.
Strategy:
1. Aggregate to get summed values per type (type_rnk).
2. Take only the top 3. (Break ties ...)
3. Join back to main table, eliminating any other types.
4. Order result by user_id, type_rnk DESC.
SELECT r.user_id, r.type, r.value
FROM   ratings r
JOIN  (
   SELECT type, sum(value) AS type_rnk
   FROM   ratings
   GROUP  BY 1
   ORDER  BY type_rnk DESC, type  -- tiebreaker
   LIMIT  3                       -- strictly the top 3
   ) v USING (type)
ORDER  BY user_id, type_rnk DESC;
Since multiple types can have the same ranking, I added type to the sort order to break ties alphabetically by their name (as you did not specify otherwise).
As it turns out, this doesn't need window functions (the ones with OVER and, optionally, PARTITION BY), since you asked about them in a comment.
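As a quick check of the aggregate-and-join-back strategy, here is a sketch that loads the ratings rows from the question and runs essentially the same query. It uses SQLite via Python's sqlite3 only for convenience, and GROUP BY type is spelled out instead of GROUP BY 1; nothing else is changed on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ratings (id INTEGER, user_id INTEGER, type TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?, ?)",
    [(0, 0, "Rest", 4), (1, 0, "Bar", 3), (2, 0, "Cine", 2), (3, 0, "Cafe", 1),
     (4, 1, "Rest", 4), (5, 1, "Bar", 3), (6, 1, "Cine", 2), (7, 1, "Cafe", 5),
     (8, 2, "Rest", 4), (9, 2, "Bar", 3), (10, 3, "Cine", 2), (11, 3, "Cafe", 5)],
)

QUERY = """
SELECT r.user_id, r.type, r.value
FROM ratings r
JOIN (
    -- total value per type, keeping only the three highest totals
    SELECT type, SUM(value) AS type_rnk
    FROM ratings
    GROUP BY type
    ORDER BY type_rnk DESC, type   -- type breaks ties alphabetically
    LIMIT 3
) v USING (type)
ORDER BY r.user_id, v.type_rnk DESC
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

On this data the totals are Rest 12, Cafe 11, Bar 9 and Cine 6, so Cine is dropped and users 2 and 3 end up with fewer than three rows each.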
I think you just want row_number(). Based on your results, you seem to want three rows per type, with the highest value:
select t.*
from (select t.*,
             row_number() over (partition by type order by value desc) as seqnum
      from ratings t
     ) t
where seqnum <= 3;
Your description suggests that you might just want this per user, which is a slight tweak (note that the column is user_id; a bare user is a reserved word in PostgreSQL):
select t.*
from (select t.*,
             row_number() over (partition by user_id order by value desc) as seqnum
      from ratings t
     ) t
where seqnum <= 3;

Find count of occurrence of a row

I have a table with duplicate rows and I need to extract these duplicate rows alone. Below is an example of the table I have:
my_table:
ID Offer
1 10
2 10
1 12
1 10
2 20
2 10
What I want next is to count the occurrence of the offer for each ID. i.e, my final result should be:
ID Offer Count
1 10 1
2 10 1
1 12 1
1 10 2
2 20 1
2 10 2
As you can see, the count should increase based on the number of times the offer shows up per ID.
I tried something like:
select id, offer, count(offer) over (partition by id) from my_table;
But this just gives the total count of that particular offer for that ID and is not the result I am looking for.
Any help is much appreciated!
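You could use ROW_NUMBER. The rownum in the ORDER BY is Oracle's row pseudocolumn; on other databases, order by whichever column defines the original row order:
select id, offer,
       ROW_NUMBER() over (partition by id, offer order by rownum) as "Count"
from tab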
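Here is a small sketch of the same idea that can be run as-is. It assumes an explicit ordering column, called seq here (a made-up name), to record the order in which the rows appear, since a table has no inherent row order; SQLite via Python's sqlite3 is used only to make it easy to try.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# seq is a hypothetical column that records the original row order.
conn.execute("CREATE TABLE my_table (seq INTEGER PRIMARY KEY, id INTEGER, offer INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 2, 10), (3, 1, 12), (4, 1, 10), (5, 2, 20), (6, 2, 10)],
)

QUERY = """
SELECT id, offer,
       -- number the repeats of each (id, offer) pair in order of appearance
       ROW_NUMBER() OVER (PARTITION BY id, offer ORDER BY seq) AS occurrence
FROM my_table
ORDER BY seq
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Partitioning by both id and offer is what makes the counter restart for each distinct pair, so the second (1, 10) row gets 2 while (1, 12) stays at 1.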

Derby DB last x row average

I have the following table structure.
ITEM
-----------
ID | TITLE
1    A
2    B
3    C
4    D
5    E
6    F

TOTAL
-------------------
ID | ITEMID | VALUE
1    2        6
2    1        4
3    3        3
4    3        8
5    1        2
6    5        4
7    4        5
8    2        8
9    2        7
10   1        3
11   2        2
12   3        6
I am using Apache Derby DB. I need to perform the average calculation in SQL. I need to show the list of item IDs and their average total of the last 3 records.
That is, for ITEM.ID 1, I go to the TOTAL table, select the last 3 rows associated with ITEMID 1, and take their average. In Derby I am able to do this for a given item ID, but I cannot do it without supplying a specific ID. Here is what I've done so far.
SELECT ITEM.ID, AVG(VALUE) FROM ITEM, TOTAL WHERE TOTAL.ITEMID = ITEM.ID GROUP BY ITEM.ID
This SQL lists the average for every item, but it averages over all of the item's rows in the TOTAL table. I need the last 3 records only, so I changed the SQL to this:
SELECT AVG(VALUE) FROM (SELECT ROW_NUMBER() OVER() AS ROWNUM, TOTAL.* FROM TOTAL WHERE ITEMID = 1) AS TR WHERE ROWNUM > (SELECT COUNT(ID) FROM TOTAL WHERE ITEMID = 1) - 3
This works if I supply the item ID 1 or 2 etc. But I cannot do this for all items without giving an item ID.
I tried to do the same thing in Oracle using PARTITION BY and it worked. But Derby does not support partitioning. There is WINDOW, but I could not make use of it.
Oracle version:
SELECT ITEMID, AVG(VALUE)
FROM (SELECT ITEMID, VALUE,
             COUNT(*) OVER (PARTITION BY ITEMID) QTY,
             ROW_NUMBER() OVER (PARTITION BY ITEMID ORDER BY ID) IDX
      FROM TOTAL
      ORDER BY ITEMID, ID)
WHERE IDX > QTY - 3
GROUP BY ITEMID
ORDER BY ITEMID
I need to use Derby for its portability.
The desired output is this
RESULT
-----------------
ITEMID | AVERAGE
1 (9/3)
2 (17/3)
3 (17/3)
4 (5/1)
5 (4/1)
6 NULL
As you have noticed, Derby's support for the SQL 2003 "OLAP Operations" is incomplete.
There was some initial work (see https://wiki.apache.org/db-derby/OLAPOperations), but that work was only partially completed.
I don't believe anyone is currently working on adding more functionality to Derby in this area.
So yes, Derby has a row_number function, but no, Derby does not (currently) have partition by.
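One possible workaround that avoids PARTITION BY altogether is a correlated subquery: keep a TOTAL row only if fewer than 3 rows for the same item have a higher ID, then average what is left. The sketch below is an assumption on my part rather than something verified on Derby; it is run here on SQLite through Python's sqlite3 just to check the logic against the desired output. As far as I know, Derby's AVG on an INTEGER column returns an integer, so a CAST to DOUBLE may be needed there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER, title TEXT);
CREATE TABLE total (id INTEGER, itemid INTEGER, value INTEGER);
INSERT INTO item VALUES (1,'A'), (2,'B'), (3,'C'), (4,'D'), (5,'E'), (6,'F');
INSERT INTO total VALUES
    (1,2,6), (2,1,4), (3,3,3), (4,3,8), (5,1,2), (6,5,4),
    (7,4,5), (8,2,8), (9,2,7), (10,1,3), (11,2,2), (12,3,6);
""")

QUERY = """
SELECT i.id, AVG(r.value) AS average
FROM item i
LEFT JOIN (
    -- a row is among the last 3 of its item if fewer than 3 newer rows exist
    SELECT t.itemid, t.value
    FROM total t
    WHERE (SELECT COUNT(*)
           FROM total t2
           WHERE t2.itemid = t.itemid AND t2.id > t.id) < 3
) r ON r.itemid = i.id
GROUP BY i.id
ORDER BY i.id
"""

averages = dict(conn.execute(QUERY).fetchall())
print(averages)
```

The LEFT JOIN keeps item 6 in the result with a NULL average, matching the desired output. The correlated count is quadratic per item, so this is only a sketch for modest table sizes.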

Retrieve Result from comparing multiple columns in a single table

FID RP Area Count
1 100 0.780 1
2 100 0.906 2
2 500 0.094 2
3 100 1.000 1
4 100 1.000 1
5 100 0.784 2
5 500 0.916 2
6 100 0.332 3
6 500 0.780 3
6 555 0.643 3
In the above table, I want to retrieve the rows where Area > 0.4. This will retrieve 8 rows, but I want the answer in a different form.
Look at the case where FID = 5. Here the areas for both RP = 100 and RP = 500 satisfy the criterion, but the output should give higher weightage to RP = 100. For the case where FID = 6, RP = 100 does not satisfy the criterion, but RP = 500 and RP = 555 do. There I want the weightage to be given to RP = 500.
Required Result:
FID RP Area Count
1 100 0.78007 1
2 100 0.90626 2
3 100 1 1
4 100 1 1
5 100 0.7835 2
6 500 0.78 3
So, you want the first row for each FID where the value of Area exceeds 0.4, where "first" means the lowest RP.
Window functions provide the mechanism to do this. Most databases support row_number():
select FID, RP, Area, "Count"
from (select t.*,
             row_number() over (partition by fid order by rp) as seqnum
      from t
      where Area > 0.4
     ) t
where seqnum = 1;
The subquery filters the rows so only rows with valid values of Area are included. The row_number() function assigns sequential values to the rows within an fid (because of the partition by clause). The values are assigned in order by rp (due to the order by clause).
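To see the filter-then-number behavior on the sample rows, here is a sketch that loads the table from the question and runs the query. The table name t comes from the answer; the subquery alias s is my own choice to keep it distinct from the table, and SQLite via Python's sqlite3 is used only so the example is easy to run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t (FID INTEGER, RP INTEGER, Area REAL, "Count" INTEGER)')
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 100, 0.780, 1), (2, 100, 0.906, 2), (2, 500, 0.094, 2),
     (3, 100, 1.000, 1), (4, 100, 1.000, 1), (5, 100, 0.784, 2),
     (5, 500, 0.916, 2), (6, 100, 0.332, 3), (6, 500, 0.780, 3),
     (6, 555, 0.643, 3)],
)

QUERY = """
SELECT FID, RP, Area, "Count"
FROM (
    -- rows failing the Area test are removed before numbering,
    -- so seqnum = 1 is the lowest qualifying RP for each FID
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY FID ORDER BY RP) AS seqnum
    FROM t
    WHERE Area > 0.4
) AS s
WHERE seqnum = 1
ORDER BY FID
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

For FID 6 the RP = 100 row is filtered out first, so RP = 500 becomes row number 1, which is the weighting the question asks for.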