Group certain rows by a value range - sql

I am trying to group certain records by price range. Let's say Customer A bought Product B multiple times, as shown in the figure below, and I want to group those purchases together. The customer bought the product at different price points, such as 800, 810, 830, 850, etc. I want to compare each price point against the other price points in the table and see whether they can be grouped together.
Let's say there are ten price points:
800,800,850,820,830,1200,1220,1200,1250,1230.
I want to group numbers that fall within a 10% range of each other. The first 5 numbers 800,800,850,820,830 are in one group and the other numbers are in a different group. How can I achieve this in SQL Server?

If I understand correctly, you want one group per customer containing the prices up to:
min + 0.1 * (max - min)
Then you want the rest in another group. The condition below uses the equivalent form 0.1 * max + 0.9 * min. You can use window functions and arithmetic for this:
select t.*,
       (case when price <= 0.1 * max(price) over (partition by customer) + 0.9 * min(price) over (partition by customer)
             then 1 else 2
        end) as the_group
from t;
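
To check the threshold arithmetic, here is a minimal sketch that runs the same query from Python against an in-memory SQLite database. SQLite (3.25 or later, for window functions) stands in for SQL Server, and the table t with columns customer and price is assumed from the query above. With the ten sample prices the cutoff works out to 800 + 0.1 * (1250 - 800) = 845, so 850 lands in group 2 rather than with the other low prices.

```python
import sqlite3

# Assumed schema: table t(customer, price), matching the names in the query above.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (customer text, price integer)")
prices = [800, 800, 850, 820, 830, 1200, 1220, 1200, 1250, 1230]
conn.executemany("insert into t values ('A', ?)", [(p,) for p in prices])

query = """
select price,
       (case when price <= 0.1 * max(price) over (partition by customer)
                          + 0.9 * min(price) over (partition by customer)
             then 1 else 2
        end) as the_group
from t
order by price
"""
groups = conn.execute(query).fetchall()
for price, the_group in groups:
    print(price, the_group)
```

If 850 has to stay with the 800s, the cutoff would need to be defined differently (for example relative to the previous price), which this query does not attempt.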

Related

How to split a list of number into ranges with a fixed interval with SQL?

Let's say I have a table like this
I want to calculate the frequency (how many times each product appears in a price range), in intervals of 50.
So eventually it will give me a table like
Let's assume the interval for each range is fixed at 50.
We don't know the highest and lowest prices of each of these products.
So I will run the query and it will give a table as shown above.
You can use arithmetic and aggregation:
select product,
       count(*) as frequency,
       floor(price / 50) * 50 as range_start,
       floor(price / 50) * 50 + 50 as range_end
from t
group by product, floor(price / 50)
order by product, min(price)
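
A small runnable sketch of the same bucketing idea, using Python's sqlite3 module. The sample rows are made up (the original table was not preserved), and integer division stands in for floor(price / 50), which assumes non-negative integer prices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (product text, price integer)")
# Hypothetical sample rows, for illustration only.
rows = [("A", 10), ("A", 40), ("A", 60), ("A", 120), ("B", 55), ("B", 95)]
conn.executemany("insert into t values (?, ?)", rows)

# price / 50 is integer division here, so it plays the role of floor(price / 50).
query = """
select product,
       count(*) as frequency,
       (price / 50) * 50 as range_start,
       (price / 50) * 50 + 50 as range_end
from t
group by product, price / 50
order by product, min(price)
"""
buckets = conn.execute(query).fetchall()
for bucket in buckets:
    print(bucket)
```

Only ranges that actually contain rows show up; empty intervals would need a separate list of ranges to join against.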

Summing values based on values in other column (Variable)

I would like to sum all items within the query based on their ITEM. Keep in mind this query is a daily report that will pick up different ITEMs depending on which items were purchased that day. Therefore, a basic CASE won't work.
For example:
ITEM_TABLE (SUM is the expected result):
Item Type   Amount   SUM
----------------------------------
SCARF       10       10
T-Shirt     20       45
T-Shirt     25       45
Current Query:
select SUM(AMOUNT)
from EDSREP.V_COGNOS_WSSTOR_SETTLE_RECON a
having CCY_CODE = a.CCY_CODE
Nothing is showing up, please help.
You can use window functions:
select
    item_type,
    amount,
    sum(amount) over(partition by item_type) sum_amount
from item_table
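
Here is a minimal sketch of that query run against the sample rows, using Python's sqlite3 module (SQLite 3.25+ for window functions). The table and column names item_table, item_type, and amount are taken from the answer above, not from the asker's real view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table item_table (item_type text, amount integer)")
conn.executemany(
    "insert into item_table values (?, ?)",
    [("SCARF", 10), ("T-Shirt", 20), ("T-Shirt", 25)],
)

# Every row keeps its own amount and also carries the total for its item type.
query = """
select item_type,
       amount,
       sum(amount) over (partition by item_type) as sum_amount
from item_table
order by item_type, amount
"""
totals = conn.execute(query).fetchall()
for row in totals:
    print(row)
```

Because the partition is computed from whatever item types are present, no per-item CASE is needed when the set of items changes from day to day.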

Redshift - Find % as compared to total value

I have a table with count by product. I am trying to add a new column that would find % as compared to sum of all rows in that column.
prod_name,count
prod_a,100
prod_b,50
prod_c,150
For example, I want to find % of prod_a as compared to the total count and so on.
Expected output:
prod_name,count,%
prod_a,100,0.33
prod_b,50,0.167
prod_c,150,0.5
Edit on SQL:
select count(*),ratio_to_report(prod_name)
over (partition by count(*))
from sales
group by prod_name;
Using window functions.
select t.*,100.0*cnt_by_prod/sum(cnt_by_prod) over() as pct
from tbl t
Edit: Based on the OP's change to the question, to compute the counts and then the percentage, use
select prod_name, 100.0 * count(*) / sum(count(*)) over () as pct
from tbl
group by prod_name
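
For a quick check of the first form (counts already stored per product), here is a sketch using Python's sqlite3 module in place of Redshift. The column name cnt_by_prod follows the answer above; the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tbl (prod_name text, cnt_by_prod integer)")
conn.executemany(
    "insert into tbl values (?, ?)",
    [("prod_a", 100), ("prod_b", 50), ("prod_c", 150)],
)

# sum(...) over () is the grand total across all rows, repeated on each row.
query = """
select prod_name,
       100.0 * cnt_by_prod / sum(cnt_by_prod) over () as pct
from tbl
order by prod_name
"""
shares = {name: round(pct, 2) for name, pct in conn.execute(query)}
print(shares)
```

This yields percentages on a 0 to 100 scale. To get fractions like those in the expected output, multiply by 1.0 instead of 100.0.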

STDEVP for calculated fields

I have a table that looks like this:
ID      CHANNEL   VENDOR  num_PERIOD  SALES.A  SALES.B
000001  Business  Shop    1           40       30
000001  Business  Shop    2           60       20
000001  Business  Shop    3           NULL     30
There are many combinations of ID, CHANNEL and VENDOR, with sales records for each of them over time (num_PERIOD).
I want to get the average standard deviation of a new field, which is the sum of SALES.A + SALES.B: sum(ISNULL(SALES.A,0) + ISNULL(SALES.B,0)).
The problem I have is that STDEVP seems to fail with calculated fields, and the result it returns is invalid.
I have been trying with:
select ID, CHANNEL, VENDOR, stdevp(sum(isnull(SALES.A,0) + ISNULL(QSALES.B,0))) OVER (PARTITION BY ID, CHANNEL, VENDOR) as STDEV_SALES
FROM TABLE
GROUP BY ID, CHANNEL, VENDOR
However, the results I'm obtaining are always 0 or NULL.
What I want to obtain is the Average Standard Deviation of each ID, CHANNEL and VENDOR over time (num_PERIOD).
Can someone find an approximation for this please?
Your query doesn't match the sample data.
I can see the problem, though. The SUM() calculates a single value for each group, and then you are taking the standard deviation of that one value, which is always 0. Because you cannot nest aggregation functions, you have turned it into a window function.
Get rid of the sum(). The following should work in SQL Server:
SELECT ID, CHANNEL, VENDOR,
STDEVP(COALESCE(SALES.A, 0) + COALESCE(QSALES.B, 0)) as STDEV_SALES
FROM SALES . . .
QSALES
GROUP BY ID, CHANNEL, VENDOR;
I would also return the COUNT(*) . . . the standard deviation doesn't make sense if you have fewer than 3 rows. (Okay, it is defined for two values, but not very useful.)
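
To see why the original query keeps returning 0, here is a small sketch in plain Python using statistics.pstdev (the population standard deviation, which is what STDEVP computes). The three rows are the ones from the sample table, with NULL treated as 0.

```python
from statistics import pstdev

# (SALES.A, SALES.B) per period; None stands for NULL.
rows = [(40, 30), (60, 20), (None, 30)]

# Per-row totals, treating NULL as 0 the way COALESCE(x, 0) would.
row_totals = [(a or 0) + (b or 0) for a, b in rows]

# The original query: the group is collapsed to one summed value first,
# so the deviation is taken over a single number.
stdev_of_group_sum = pstdev([sum(row_totals)])

# The corrected query: the deviation is taken across the per-row totals.
stdev_of_rows = pstdev(row_totals)

print(row_totals, stdev_of_group_sum, stdev_of_rows)
```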

understanding group by statements in rails

Given an invoices table like this:
invoice_date  customer  total
2012/01/01    A         780
2013/05/01    A         3800
2013/12/01    A         1500
2012/07/01    B         15
2013/03/01    B         21
Say that I want all of the following:
the count of invoices of each customer of each year
the sum of the amounts of all the invoices of each customer of each year
the max amount among all the invoices of each customer of each year
In SQL, that is very easy:
SELECT CUSTOMER, YEAR(invoice_date) AS INVOICE_YEAR, MAX(total) AS MAX_TOTAL, SUM(total) AS SUM_AMOUNTS, COUNT(*) AS INVOICES_NUM FROM invoices GROUP BY YEAR(invoice_date), CUSTOMER;
(the function to extract the year of a date may be YEAR(date) or something else depending on the database server; on SQLite it is strftime('%Y', invoice_date))
OK, I've tried to translate this into Rails/ActiveRecord:
Invoice.count(:group => 'customer')
This works, but how can I get count, sum, and max together?
The idea I'm familiar with is that (in SQL) a GROUP BY generates the rows (well, to be correct, it determines which rows should exist in the result table), and then you pass an arbitrary number of aggregation functions that are applied to the underlying set of rows behind each result row. E.g., group by customer means one row for customer A and one row for customer B; then I can pass as many aggregation functions as I want: count(*), max(total), max(date), min(total), just to list the most common.
Looking at the Rails ActiveRecord API, it seems that you're supposed to use just one function at a time, because the group is an argument of count. What if I want multiple aggregation functions, say max, sum, etc.?
Second attempt
irb> i = Invoice.select('customer, sum(total)').group('customer')
Invoice Load (0.3ms) SELECT customer, sum(total) AS TOTAL_GROUP FROM "invoices" GROUP BY customer
=> [#, #]
That is: it doesn't give back the field with the sum...
Well, it does; it just doesn't get printed out.
Say your query is i = Invoice.select('customer, sum(total) as sum_total').group('customer')
Then i is an array (technically it's not an array, but that's not important here) containing all the results. So i[0].sum_total will give you the sum for the first customer, but of course you should iterate over it to get everything you want.
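
The same idea, several aliased aggregates in one grouped query and then read back by name, can be sketched outside Rails. This is not ActiveRecord; it is Python's sqlite3 module, used only to show the SQL-side behaviour. The dates are rewritten as ISO strings (2012-01-01) so that SQLite's strftime can parse them, which is an assumption about how the data would be stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets columns be read by their alias
conn.execute("create table invoices (invoice_date text, customer text, total integer)")
conn.executemany(
    "insert into invoices values (?, ?, ?)",
    [
        ("2012-01-01", "A", 780),
        ("2013-05-01", "A", 3800),
        ("2013-12-01", "A", 1500),
        ("2012-07-01", "B", 15),
        ("2013-03-01", "B", 21),
    ],
)

# One GROUP BY, three aggregates, each with an alias.
query = """
select customer,
       strftime('%Y', invoice_date) as invoice_year,
       count(*) as invoices_num,
       sum(total) as sum_total,
       max(total) as max_total
from invoices
group by customer, strftime('%Y', invoice_date)
order by customer, invoice_year
"""
rows = conn.execute(query).fetchall()
summary = [tuple(row) for row in rows]
for row in rows:
    print(row["customer"], row["invoice_year"], row["invoices_num"],
          row["sum_total"], row["max_total"])
```

As in the answer above, the aggregate columns only become easy to reach once they have an alias; here that is row["sum_total"], in the Rails example it is sum_total on each returned object.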