SQL - Counting over several groups

I have a list of transactions in which the item IDs repeat, along with the quantity of items bought in each transaction. I need to count the number of times a particular quantity was purchased at once.
Row  ItmNBR  TQTY
1    123     5
2    123     5
3    123     5
3    456     25
4    456     19
I need to produce an output like this...
ItmNBR  QTY  Occurrence
123     5    3
456     19   1
456     25   1
I can get the first two columns of my result, but when I attempt to count over a partition I end up with repeating numbers. Since I'm only looking up 9 items, I just want to count the number of rows in which the quantity is the same.
SELECT ITM_NBR
      ,TOT_IVO_ITM_QTY
      ,COUNT(*) OVER (PARTITION BY QTY) AS CNT
FROM dataset
WHERE YEAR(bus_dt) = 2021
  AND ITM_NBR IN (12639,12940,12949,12955,13485,13666,43950,631343,1103731)
  AND QTY BETWEEN 5 AND 25
GROUP BY ITM_NBR, TOT_IVO_ITM_QTY
ORDER BY ITM_NBR
        ,QTY

I think you just want group by:
select ItmNBR, QTY, count(*)
from t
group by ItmNBR, QTY
order by count(*) desc;
This assumes that you want the count by item and quantity, which seems to be the gist of the question.
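For reference, a sketch of that idea mapped onto the question's own column names (assuming ITM_NBR is the item and TOT_IVO_ITM_QTY the per-transaction quantity in dataset, both taken from the asker's attempt):

-- Sketch only: plain GROUP BY, no window function needed.
SELECT ITM_NBR,
       TOT_IVO_ITM_QTY AS QTY,
       COUNT(*)        AS OCCURRENCE
FROM dataset
WHERE YEAR(bus_dt) = 2021
  AND ITM_NBR IN (12639,12940,12949,12955,13485,13666,43950,631343,1103731)
  AND TOT_IVO_ITM_QTY BETWEEN 5 AND 25
GROUP BY ITM_NBR, TOT_IVO_ITM_QTY
ORDER BY ITM_NBR, TOT_IVO_ITM_QTY;

Ignoring the date and item filters and running just the GROUP BY against the sample rows at the top would give (123, 5, 3), (456, 19, 1) and (456, 25, 1).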

How do I stop duplication on a SQL join where I have order_ids and people order more than 1 item (so multiple product_ids), when calculating discounts?

My problem is that my discount number is blowing up: an order has one discount for the entire order, but I am building a dataset with multiple lines per order, one per product. Instead of the discount applying only once to the order, it is added on every line.
What is happening:

order_id  product_id  quantity  amount  discount
1         a           1         5       0
2         a           1         5       7
2         b           1         10      7
3         a           1         5       5
3         b           1         10      5
3         c           1         15      5
What I want:

order_id  product_id  quantity  amount  discount
1         a           1         5       0
2         a           1         5       7
2         b           1         10      0
3         a           1         5       5
3         b           1         10      0
3         c           1         15      0
I just want the discount to be applied once per order; my join uses order_id, which is why the discount is applied multiple times. I would attach my code, but it's a decent-sized CTE.
Figured it out. I did need to use ROW_NUMBER() OVER (PARTITION BY order_id), but I was also losing records if the order had more than one item. The solution was to use a CASE WHEN statement:
CASE WHEN ORDER_ROW_COUNT = 1 THEN DISCOUNT ELSE 0 END
This allowed me to keep the records without duplicating the discounts.
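A minimal sketch of how those two pieces fit together (the table name order_lines and the ordering column are illustrative placeholders, not the asker's actual CTE):

WITH numbered AS (
    SELECT order_id,
           product_id,
           quantity,
           amount,
           discount,
           -- number the rows within each order; the ORDER BY column is an assumption
           ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY product_id) AS order_row_count
    FROM order_lines
)
SELECT order_id,
       product_id,
       quantity,
       amount,
       -- keep the order-level discount on the first row of each order only
       CASE WHEN order_row_count = 1 THEN discount ELSE 0 END AS discount
FROM numbered;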
You're joining on a field that isn't unique, so the join returns all the records for that order_id and the discount is therefore applied to every one of them. You need some sort of differentiator field, something that is unique within each order's set of rows.
Example:
SELECT *,
       ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY order_id) AS rownumber
INTO #temp
FROM table
This should give you something like the picture below, with each order's rows numbered 1, 2, 3, and so on.
[image: #temp table with the added rownumber column]
Then join on order_id = order_id and rownumber = 1, and this will only update the first record for each order.
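A sketch of that last step (untested, SQL Server-style to match the #temp example above; order_discounts is a hypothetical table holding one discount row per order):

-- Sketch only: apply the order-level discount to the first row of each order.
UPDATE t
SET    t.discount = d.discount
FROM   #temp t
JOIN   order_discounts d
  ON   d.order_id = t.order_id
WHERE  t.rownumber = 1;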

Select column's occurrence order without group by

I currently have two tables, users and coupons
users:
id  first_name
1   Roberta
2   Oliver
3   Shayna
4   Fechin

coupons:
id  discount  user_id
1   20%       1
2   40%       2
3   15%       3
4   30%       1
5   10%       1
6   70%       4
What I want to do is select from the coupons table until I've selected X distinct users. So if I chose X = 2, the resulting table would be:
id  discount  user_id
1   20%       1
2   40%       2
4   30%       1
5   10%       1
I've tried using both dense_rank and row_number, but they return the count of occurrences of each user_id, not its order.
SELECT id,
       discount,
       user_id,
       dense_rank() OVER (PARTITION BY user_id)
FROM coupons
I'm guessing I need to do it in multiple subqueries (which is fine) where the first subquery would return something like
id
discount
user_id
order_of_occurence
1
20%
1
1
2
40%
2
2
3
15%
3
3
4
30%
1
1
5
10%
1
1
6
70%
4
4
which I can then use to filter by what I need.
PS: I'm using postgresql.
You've stated that you want to parameterize the query so that you can retrieve X users. I'm reading that as all coupons for the first X distinct user_ids in coupon id column order.
It appears your attempt was close; dense_rank() is the right idea. Since you want to look over the entire table, you can't use PARTITION BY, and a sorting column is required to determine the ranking.
with data as (
    select *,
           dense_rank() over (order by user_id) as dr
    from coupons
)
select * from data where dr <= <X>;
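As a concrete usage check (added here, not part of the original answer; it ranks by user_id as above), substituting X = 2:

with data as (
    select *,
           dense_rank() over (order by user_id) as dr
    from coupons
)
select id, discount, user_id
from data
where dr <= 2;
-- returns coupon rows 1, 2, 4 and 5, i.e. the expected table above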

How do I make a query that selects where the SUM equals a fixed value

I've spent the last couple of days searching for a way to write a SQL query that searches the database and returns records where the SUM for the same ID is equal to or greater than the value provided.
For this I've been using the W3Schools sample database, testing against the products table.
More specifically, what I've been trying to do:
SELECT * FROM products
WHERE supplierid=? and SUM(price) > 50
Here the "where supplierid" part would loop through each supplier, and if the sum of their prices is higher than 50 (in this case), return those records.
In this case it would read supplier ID 1 and add up all of that supplier's prices: 18+19+10 = 47; since 47 < 50, it would not print those records. Next, supplier ID 2: 22+21.35 = 43.35, and again it would not print those records. Only when the sum of the prices is higher than 50 would it print them.
I'm working with a DB2 database.
SAMPLE data:
ProductID  ProductName  SupplierID  CategoryID  Price
1          Chais        1           1           18
2          Chang        1           1           19
3          Aniseed      1           2           10
4          Chef Anton   2           2           22
5          Chef Anton   2           2           21.35
6          Grandma's    3           2           25
7          Uncle Bob    3           7           30
8          Northwoods   3           2           40
9          Mishi        4           6           97
10         Ikura        4           8           31
11         Queso        5           4           21
12         Queso        5           4           38
13         Konbu        6           8           6
14         Tofu         6           7           23.25
How about:
select *
from products
where supplierid in (
    select supplierid
    from products
    group by supplierid
    having sum(price) > 50
);
The subquery finds out all the supplierid values that match your condition. The main (external) query retrieves all rows that match the list of supplierids.
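Worked against the sample data above (a check added here, not part of the original answer), the inner GROUP BY sees these per-supplier sums:

-- supplier 1: 18 + 19 + 10 = 47     -> excluded
-- supplier 2: 22 + 21.35   = 43.35  -> excluded
-- supplier 3: 25 + 30 + 40 = 95     -> kept
-- supplier 4: 97 + 31      = 128    -> kept
-- supplier 5: 21 + 38      = 59     -> kept
-- supplier 6: 6 + 23.25    = 29.25  -> excluded

So the outer query returns every product row for suppliers 3, 4 and 5.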
Not tested, but I would expect DB2 to have analytic functions and CTEs, so perhaps:
with basedata as (
    select t.*,
           sum(t.price) over (partition by t.supplierid) as sum_price
    from products t
)
select *
from basedata
where supplierid = ?
  and sum_price > 50
The analytic function aggregates the price information but does not group the resultset, so you get the rows from your initial result, but restricted to those with an aggregated price value > 50.
The difference from a solution with a subquery is that the analytic function should be more efficient, since it only has to read the table once to produce the result.

Derby DB last x row average

I have the following table structure.
ITEM                 TOTAL
-----------          -----------------
ID | TITLE           ID | ITEMID | VALUE
1  | A               1  | 2      | 6
2  | B               2  | 1      | 4
3  | C               3  | 3      | 3
4  | D               4  | 3      | 8
5  | E               5  | 1      | 2
6  | F               6  | 5      | 4
                     7  | 4      | 5
                     8  | 2      | 8
                     9  | 2      | 7
                     10 | 1      | 3
                     11 | 2      | 2
                     12 | 3      | 6
I am using Apache Derby DB and I need to perform the average calculation in SQL: show the list of item IDs and the average of each item's last 3 TOTAL records.
That is, for ITEM.ID 1, I go to the TOTAL table, select the last 3 rows associated with ITEMID 1, and take their average. In Derby I am able to do this for a given item ID, but I cannot do it without supplying a specific ID. Let me show you what I've done.
SELECT ITEM.ID, AVG(VALUE) FROM ITEM, TOTAL WHERE TOTAL.ITEMID = ITEM.ID GROUP BY ITEM.ID
This SQL gives the average for each item in a list, but it calculates over all values in the TOTAL table and I need the last 3 records only. So I changed the SQL to this:
SELECT AVG(VALUE)
FROM (SELECT ROW_NUMBER() OVER () AS ROWNUM, TOTAL.*
      FROM TOTAL
      WHERE ITEMID = 1) AS TR
WHERE ROWNUM > (SELECT COUNT(ID) FROM TOTAL WHERE ITEMID = 1) - 3
This works if I supply the item ID 1 or 2 etc. But I cannot do this for all items without giving an item ID.
I tried to do the same thing in Oracle using PARTITION BY and it worked, but Derby does not support partitioning. There is WINDOW, but I could not make use of it.
The Oracle one:
SELECT ITEMID, AVG(VALUE)
FROM (SELECT ITEMID, VALUE,
             COUNT(*) OVER (PARTITION BY ITEMID) QTY,
             ROW_NUMBER() OVER (PARTITION BY ITEMID ORDER BY ID) IDX
      FROM TOTAL
      ORDER BY ITEMID, ID)
WHERE IDX > QTY - 3
GROUP BY ITEMID
ORDER BY ITEMID
I need to use Derby DB for its portability.
The desired output is this
RESULT
-----------------
ITEMID | AVERAGE
1      | (9/3)
2      | (17/3)
3      | (17/3)
4      | (5/1)
5      | (4/1)
6      | NULL
As you have noticed, Derby's support for the SQL:2003 "OLAP Operations" is incomplete.
There was some initial work (see https://wiki.apache.org/db-derby/OLAPOperations), but that work was only partially completed.
I don't believe anyone is currently working on adding more functionality to Derby in this area.
So yes, Derby has a row_number function, but no, Derby does not (currently) have partition by.
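Since PARTITION BY isn't available, one possible workaround is to emulate the per-item "last 3 rows" with correlated subqueries. An untested sketch under that assumption (not part of the original answer):

-- Untested sketch: for each item, average the TOTAL rows that have fewer than
-- 3 later rows (higher ID) for the same ITEMID; the CAST avoids Derby's
-- integer AVG truncating the result. Items with no TOTAL rows stay NULL.
SELECT i.ID AS ITEMID,
       (SELECT AVG(CAST(t.VALUE AS DOUBLE))
          FROM TOTAL t
         WHERE t.ITEMID = i.ID
           AND (SELECT COUNT(*)
                  FROM TOTAL t2
                 WHERE t2.ITEMID = t.ITEMID
                   AND t2.ID > t.ID) < 3) AS AVERAGE
FROM ITEM i
ORDER BY i.ID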

Oracle SQL find row crossing limit

I have a table which has four columns as below
ID
SUB_ID (one ID will have multiple SUB_IDs)
REVENUE
PAY (PAY is always less than or equal to REVENUE)
select * from Table A order by ID, SUB_ID returns data as below:
ID   SUB_ID  REVENUE  PAY
100  1       10       8
100  2       12       9
100  3       9        7
100  4       11       11
101  1       6        5
101  2       4        4
101  3       3        2
101  4       8        7
101  5       4        3
101  6       3        3
I have a constant LIMIT value of 20. Now, for each ID, I need to find the SUB_ID at which the running sum of REVENUE (in increasing SUB_ID order) crosses the LIMIT, and then the total PAY up to that row. In this example:
for ID 100 the limit is crossed at SUB_ID 2 (10+12), so the total PAY is 17 (8+9)
for ID 101 the limit is crossed at SUB_ID 4 (6+4+3+8), so the total PAY is 18 (5+4+2+7)
Basically I need to find the row which crosses the Limit.
Fiddle: http://sqlfiddle.com/#!4/4f12a/4/0
with sub as (
    select x.*,
           sum(revenue) over (partition by id order by sub_id) as run_rev,
           sum(pay) over (partition by id order by sub_id) as run_pay
    from tbl x
)
select *
from sub s
where s.run_rev = (select min(x.run_rev)
                   from sub x
                   where x.id = s.id
                     and x.run_rev > 20);
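As a worked check against the sample data (added here, not part of the original answer):

-- ID 100: run_rev = 10, 22, 31, 42; the first value above 20 is 22 at SUB_ID 2,
--         where run_pay = 8 + 9 = 17.
-- ID 101: run_rev = 6, 10, 13, 21, 25, 28; the first value above 20 is 21 at
--         SUB_ID 4, where run_pay = 5 + 4 + 2 + 7 = 18.
-- So the query returns the SUB_ID 2 row for ID 100 and the SUB_ID 4 row for ID 101.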