Redshift - Find % as compared to total value - sql

I have a table with count by product. I am trying to add a new column that would find % as compared to sum of all rows in that column.
prod_name,count
prod_a,100
prod_b,50
prod_c,150
For example, I want to find % of prod_a as compared to the total count and so on.
Expected output:
prod_name,count,%
prod_a,100,0.33
prod_b,50,0.167
prod_c,150,0.5
Edit on SQL:
select count(*),ratio_to_report(prod_name)
over (partition by count(*))
from sales
group by prod_name;

Using window functions.
select t.*,100.0*cnt_by_prod/sum(cnt_by_prod) over() as pct
from tbl t
Edit: Based on the OP's change to the question, to compute the counts and then the percentage, use
select prod_name,100.0*count(*)/sum(count(*)) over()
from tbl
group by prod_name
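To check the first query against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Redshift. It assumes SQLite 3.25 or later (for window functions), and the table name tbl and column cnt_by_prod are the ones the answer assumes, not names confirmed by the question.

```python
import sqlite3

# In-memory table mirroring the question's sample data; cnt_by_prod is the
# column name assumed by the answer's query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (prod_name TEXT, cnt_by_prod INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [("prod_a", 100), ("prod_b", 50), ("prod_c", 150)],
)

# SUM(...) OVER () is the grand total, repeated on every row.
query = """
SELECT prod_name,
       100.0 * cnt_by_prod / SUM(cnt_by_prod) OVER () AS pct
FROM tbl
ORDER BY prod_name
"""
pct = dict(conn.execute(query).fetchall())
for name, value in pct.items():
    print(name, round(value, 1))
```

Replacing 100.0 with 1.0 returns fractions (0.33, 0.167, 0.5) as in the expected output, rather than percentages.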

Related

Find sum based on another column and use aggregate function in SQL

I am working in SQL and want to produce the following output using Hive. Here is the problem statement.
I have the columns Item, Cost and Price.
The output I expect has the following columns:
25% cost: 25% of the cost value.
total_price: the sum of the price values per item.
cumulative_price: the cumulative sum of price per item.
I want to flag the records whose cumulative_price is greater than the 25% cost value.
This should work in SQL, so give it a try:
with cte as
(select *,(cost*0.25) as costx,
sum(price) over (PARTITION BY item order by item) AS total_price,
sum(price) over (PARTITION BY item order by item,price) cumulative_price from dept)
select *,case when cumulative_price>costx then 1 else 0 end as flag from cte;
It works for me.
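As a rough check of the running-total idea, the sketch below runs an equivalent query in SQLite through Python's sqlite3 module (SQLite 3.25 or later is assumed, and Hive may differ in details). The rows in dept are made up for illustration, and the prices within each item are distinct so the default window frame gives a row-by-row running sum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (item TEXT, cost REAL, price REAL)")
# Hypothetical sample rows: (item, cost, price).
conn.executemany(
    "INSERT INTO dept VALUES (?, ?, ?)",
    [("A", 100, 10), ("A", 100, 20), ("A", 100, 30), ("B", 40, 5), ("B", 40, 6)],
)

query = """
WITH cte AS (
    SELECT item, cost, price,
           cost * 0.25 AS costx,
           SUM(price) OVER (PARTITION BY item) AS total_price,
           SUM(price) OVER (PARTITION BY item ORDER BY price) AS cumulative_price
    FROM dept
)
SELECT item, price, costx, total_price, cumulative_price,
       CASE WHEN cumulative_price > costx THEN 1 ELSE 0 END AS flag
FROM cte
ORDER BY item, price
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

For item A, 25% of the cost is 25, so the first row (running total 10) is not flagged and the next two (30 and 60) are.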

Get count of records of the top 3 rows and compare the counts

In SQL Server 2016, I have a query as such:
SELECT [Report_date], count(distinct indv_id)
FROM
[dbo].[STG_TABLE] group by report_date order by report_date desc
I get the results as below:
Report_date (No column name)
2020-08-21 47918
2020-08-12 968065
2020-07-31 977804
Now I want to compare the difference between the counts in each row. If the difference is more than 10%, then I need to send an email out in the SSIS package.
How can I go through each row and calculate the difference? I want to look at the first row and compare it with the second row.
Your question seems to be about calculating the ratios between rows. For that, use lag(). To get the ratio:
SELECT [Report_date], COUNT(DISTINCT indv_id),
(COUNT(DISTINCT indv_id) * 1.0 / LAG(COUNT(DISTINCT indv_id)) OVER (ORDER BY report_date))
FROM [dbo].[STG_TABLE]
GROUP BY report_date
ORDER BY report_date DESC;
I'm not sure what results you want, but this is the basic information.
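To illustrate how the ratio could drive the 10% check, here is a small sketch using Python's sqlite3 module in place of SQL Server (SQLite 3.25 or later assumed). The table daily_counts and the THRESHOLD constant are hypothetical; the counts are the three values shown in the question, already aggregated per date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the per-date distinct counts from the question.
conn.execute("CREATE TABLE daily_counts (report_date TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO daily_counts VALUES (?, ?)",
    [("2020-08-21", 47918), ("2020-08-12", 968065), ("2020-07-31", 977804)],
)

# LAG() fetches the count from the previous report date; it is NULL for
# the earliest row, so that row has no ratio.
query = """
SELECT report_date,
       cnt,
       cnt * 1.0 / LAG(cnt) OVER (ORDER BY report_date) AS ratio
FROM daily_counts
ORDER BY report_date DESC
"""

THRESHOLD = 0.10  # flag changes of more than 10% in either direction
alerts = [
    report_date
    for report_date, cnt, ratio in conn.execute(query)
    if ratio is not None and abs(ratio - 1.0) > THRESHOLD
]
print(alerts)
```

With these numbers only 2020-08-21 is flagged, since its count is roughly 5% of the previous one. In an SSIS package the same comparison could feed the condition that triggers the email.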

Percentage calculation of specific number of row over total

I'm looking for a SELECT statement that calculates the percentage of a specific set of rows over the total number of rows.
For example, let's say I have a FRUIT table like this:
I want to calculate the percentage of rows whose name is not peach, over the total number of rows. I tried this statement:
SELECT CAST((select count(name) from fruit WHERE name !='peach')
as FLOAT) /
(select count(name)from fruit)*100.0 as percentage ;
but it doesn't give me the correct number. I also need a statement that calculates the percentage of each fruit by grouping them with GROUP BY.
I'm very new to SQL and I keep trying but can't find the right syntax. Please help me.
I think the easiest way to do this is using conditional aggregation with average:
select avg(case when name <> 'peach' then 100.0 else 0.0 end)
from fruit;
In Postgres, you can use the shorthand:
select 100*avg((name <> 'peach')::int)
from fruit;
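Here is a quick sketch of the conditional-aggregation approach, run in SQLite through Python's sqlite3 module and using the question's table and column names (fruit, name). The four sample rows are invented, since the original table is not shown. The second query is one possible way to get the per-fruit percentage the question also asks for; it is an addition, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruit (name TEXT)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO fruit VALUES (?)",
    [("peach",), ("peach",), ("apple",), ("banana",)],
)

# Averaging 100.0 for matching rows and 0.0 otherwise yields the percentage.
not_peach = conn.execute(
    "SELECT AVG(CASE WHEN name <> 'peach' THEN 100.0 ELSE 0.0 END) FROM fruit"
).fetchone()[0]

# Per-fruit share: each group's row count divided by the overall row count.
per_fruit = dict(conn.execute("""
    SELECT name, 100.0 * COUNT(*) / (SELECT COUNT(*) FROM fruit) AS pct
    FROM fruit
    GROUP BY name
""").fetchall())

print(not_peach)
print(per_fruit)
```

Two of the four rows are not peach, so the first query gives 50.0; the grouped query gives 50.0 for peach and 25.0 each for apple and banana.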

T-SQL average calculation

I want to incorporate two average calculations for a bunch of value columns in my select statement.
see this link for my simplified table structure including the desired output calculation: Pastebin
1) moving average:
Month1 = value of the value1-column for that month, Month2 = if sum == 0 then write 0, else avg(Month1 and Month2) and so on.
So for each product, I want the moving average for each month within one year.
I have this set up in my Excel but I can't transfer the expression to sql.
2) overall average:
for each product, calculate the average over all years and duplicate the calculated value to all rows for that product.
I hope you can help me out with this. It looks like I need a procedure but maybe it is just a simple statement.
SQL-Server 2012 supports the analytic functions required to do this:
SELECT Product,
Month,
Year,
Value,
AVG_YTD = AVG(Value) OVER(PARTITION BY Product, Year ORDER BY Month),
AVG_Year = AVG(Value) OVER(PARTITION BY Product, Year),
AVG_Overall = AVG(Value) OVER(PARTITION BY Product)
FROM T;
Simplified Example on SQL Fiddle
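The three windowed averages can be tried out with the sketch below, which uses Python's sqlite3 module rather than SQL Server (SQLite 3.25 or later assumed). SQLite does not accept the alias = expression form, so the columns use AS instead. The sample rows are made up, and the running average is partitioned by Product as well as Year, which is an assumption based on the question asking for a per-product result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Product TEXT, Month INTEGER, Year INTEGER, Value REAL)")
# Hypothetical sample rows: (Product, Month, Year, Value).
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?, ?)",
    [
        ("P1", 1, 2020, 10), ("P1", 2, 2020, 20), ("P1", 3, 2020, 30),
        ("P1", 1, 2021, 40),
        ("P2", 1, 2020, 100),
    ],
)

query = """
SELECT Product, Year, Month,
       -- running average within each product and year, in month order
       AVG(Value) OVER (PARTITION BY Product, Year ORDER BY Month) AS AVG_YTD,
       -- average per product and year, repeated on every row
       AVG(Value) OVER (PARTITION BY Product, Year) AS AVG_Year,
       -- average per product over all years, repeated on every row
       AVG(Value) OVER (PARTITION BY Product) AS AVG_Overall
FROM T
ORDER BY Product, Year, Month
"""
averages = conn.execute(query).fetchall()
for row in averages:
    print(row)
```

For P1 in 2020 the running average goes 10, 15, 20, while the overall average for P1 stays at 25 on every row.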

Query return rows whose sum of column value match given sum

I have tables with:
id desc total
1 baskets 25
2 baskets 15
3 baskets 75
4 noodles 10
I would like a query that returns the rows whose total values sum to 40.
The output would be like:
id desc total
1 baskets 25
2 baskets 15
I believe this will get you a list of the results you're looking for, but not with your example dataset because nothing in your example dataset can provide a total sum of 40.
SELECT id, desc, total
FROM mytable
WHERE desc IN (
SELECT desc
FROM mytable
GROUP BY desc
HAVING SUM(total) = 40
)
Select Desc,SUM(Total) as SumTotal
from Table
group by desc
having SUM(Total) >= 40
Not quite sure what you want, but this may get you started
SELECT `desc`, SUM(Total) Total
FROM TableName
GROUP BY `desc`
HAVING SUM(Total) = 40
From reading your question, it sounds like you want a query that returns any subset of rows whose totals add up to a certain target value and that share the same description.
There is no simple way to do this. This migrates into algorithmic territory.
Assuming I am correct about what you are after, GROUP BY and aggregate functions will not solve your problem. SQL cannot indicate that a query should be performed on subsets of the data until it exhausts all possible combinations and finds the sums that match your requirements.
You will have to mix an algorithm into your SQL, i.e., a stored procedure.
Or simply get all the data from the database that fits the desc then perform your algorithm on it in code.
I recall from a CS algorithms class I took that this is a known problem, the subset sum problem.
I believe you could adapt a working version of that algorithm to solve your problem:
http://en.wikipedia.org/wiki/Subset_sum_problem
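As a sketch of the "do it in code" route, the brute-force search below takes the rows from the question, groups them by desc, and tries every combination within a group. The function name subsets_matching is made up for illustration, and the approach is exponential in the group size, so it only suits small groups.

```python
from collections import defaultdict
from itertools import combinations

# Rows from the question: (id, desc, total).
rows = [(1, "baskets", 25), (2, "baskets", 15), (3, "baskets", 75), (4, "noodles", 10)]


def subsets_matching(rows, target):
    """Return (desc, ids) for every subset of same-desc rows whose totals sum to target."""
    by_desc = defaultdict(list)
    for row_id, desc, total in rows:
        by_desc[desc].append((row_id, total))

    matches = []
    for desc, items in by_desc.items():
        # Try every subset size, from single rows up to the whole group.
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                if sum(total for _, total in combo) == target:
                    matches.append((desc, [row_id for row_id, _ in combo]))
    return matches


matches = subsets_matching(rows, 40)
print(matches)
```

For a target of 40 this finds the single match from the question, ids 1 and 2 under baskets.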
select desc
from (select desc, sum(total) as ct group by desc)