SQL Percentage Queries

I'm a beginner in SQL. My question is about calculating a percentage of the overall disclosed total from a table called merged: I want the amounts with 'SUPPORT' in committee_position (a column in the table merged) as a percentage of the total.
How do I calculate the percentage in that case?
I started with:
SELECT SUM(amount) * 100
FROM merged
WHERE merged.committee_position = 'SUPPORT';
Help me continue it. Thank you

If I followed you correctly, you can use conditional aggregation:
select
    100.0 * sum(case when committee_position = 'SUPPORT' then amount else 0 end) / sum(amount)
from merged
This gives you the percentage of the total amount in the table that has committee_position = 'SUPPORT'.
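One thing to watch: if the table is empty or every amount is zero, sum(amount) is 0 or NULL and the division fails or returns NULL. A minimal guard, assuming your dialect supports NULLIF:
select
    -- NULLIF turns a zero total into NULL, so a zero denominator yields NULL
    -- instead of a division-by-zero error
    100.0 * sum(case when committee_position = 'SUPPORT' then amount else 0 end)
        / nullif(sum(amount), 0) as support_pct
from merged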

Here you go.
SELECT a.Support_Amount / b.Total_Amount * 100
FROM (SELECT SUM(amount) AS Support_Amount
      FROM merged
      WHERE merged.committee_position = 'SUPPORT') AS a
CROSS JOIN
     (SELECT SUM(amount) AS Total_Amount
      FROM merged) AS b
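One caveat: if amount is an integer column, some databases evaluate Support_Amount / Total_Amount with integer division and truncate the result to 0. A hedged variant that forces decimal arithmetic by multiplying by 100.0 first:
SELECT 100.0 * a.Support_Amount / b.Total_Amount AS support_pct
FROM (SELECT SUM(amount) AS Support_Amount
      FROM merged
      WHERE committee_position = 'SUPPORT') AS a
CROSS JOIN
     (SELECT SUM(amount) AS Total_Amount
      FROM merged) AS b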

Related

Avoid Cartesian join for the Spark SQL query

I am trying to calculate processRate from the total counts of two temp tables, but I'm getting the error "Detected implicit cartesian product for INNER join between logical plans" even though I am not explicitly performing any joins. I am sure this error can be resolved by restructuring the query in the correct format, and I need your help with it. Below is the query:
spark.sql("""
CREATE OR REPLACE TEMPORARY VIEW final_processRate AS
SELECT
((a.total - b.total) / a.total) * 100 AS processRate
FROM
(select count(*) as total from sales) a,
(select count(*) as total from sales where status = 'PENDING') b
""")
I'm getting this error while trying to view the data using:
spark.sql("select * from final_processRate limit 10").show(false)
Can you please help me reformat the above query to resolve this issue and view the data of final_processRate?
You don't need a subquery for this. Just use conditional aggregation:
spark.sql("""
CREATE OR REPLACE TEMPORARY VIEW final_processRate AS
SELECT
((count(*) - count(case when status='PENDING' then 1 end)) / count(*)) * 100 AS processRate
FROM sales
""")
Then you can query the temp view using:
spark.sql("select * from final_processRate")
which should give you the single number/percentage calculated above.
I would write this as:
select avg(case when status = 'PENDING' then 0.0 else 1 end)
from sales;
This returns the proportion of rows that are not pending.
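A quick way to sanity-check the avg trick with an inline table (the sample rows are hypothetical; this VALUES form works in PostgreSQL, and Spark SQL accepts a similar FROM VALUES ... AS t(status) syntax):
-- 3 of the 4 sample rows are not PENDING, so this returns 0.75
select avg(case when status = 'PENDING' then 0.0 else 1 end) as proportion
from (values ('DONE'), ('DONE'), ('PENDING'), ('DONE')) as t(status);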

query the percentage of occurrences in an SQL table

I have a table of names, where each row has the columns name and occurrences.
I'd like to calculate the percentage of a certain name from the table.
How can I do that in one query?
You can get it by using SUM(occurrences):
select
name,
100.0 * sum(occurrences) / (select sum(occurrences) from users) as percentage
from
users
where name = 'Bob'
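An alternative that avoids the scalar subquery, assuming your database supports window functions (users and 'Bob' are carried over from the answer above):
select name, percentage
from (
    select name,
           -- the window total is computed before the outer filter,
           -- so the denominator is the grand total across all names
           100.0 * sum(occurrences) / sum(sum(occurrences)) over () as percentage
    from users
    group by name
) t
where name = 'Bob'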
Try this:
SELECT name, CAST(SUM(occurrences) AS float) /
       (SELECT SUM(occurrences) FROM test) * 100 AS percentage
FROM test
WHERE name LIKE '%dog%'
It is not very elegant due to the subquery in the field list, but this will do the job if you want it in one query:
SELECT
`name`,
(CAST(SUM(`occurrences`) AS DOUBLE) / CAST((SELECT SUM(`occurrences`) FROM `user`) AS DOUBLE)) as `percent`
FROM
`user`
WHERE
`name` = 'miroslav';
Hope this helps,
I think conditional aggregation is the best approach:
select sum(case when name = #name then occurrences else 0 end) / sum(occurrences) as ratio
from t;
If you want an actual percentage between 0 and 100, multiply by 100.
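The same query scaled to a percentage (keeping the #name placeholder from the answer above; the leading 100.0 also forces decimal division):
select 100.0 * sum(case when name = #name then occurrences else 0 end) / sum(occurrences) as percentage
from t;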

Percentage of items higher than a specific value ($500) for example?

I have a table with items, which contains price, item number, and personalisation number (which is not needed in this case, I think).
How can I write a query that shows what percentage of the rows have a price above $500 (for example)?
I've tried this:
SELECT price, (COUNT(price) * 100 / (SELECT COUNT(*) FROM items)) AS Score
FROM items
GROUP BY price
but it did not work the way I intended it to.
Select sum(case when price > 500 then 1 else 0 end) * 100.0 / count(*)
From items
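The same result via the avg trick used earlier in this thread, as a sketch against the same items table:
-- averaging 100/0 flags yields the percentage directly
select avg(case when price > 500 then 100.0 else 0 end) as pct_over_500
from items;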

Combining Count and MIN functions

I have a part of my query as:
SUM(POReceiptQuantity) as Received,
MIN(ItemLocalStandardCost) as Low,
MAX(ItemLocalStandardCost) as High,
Received returns the total number of items we sold this year. Low is the lowest price we paid, and High is the highest price we paid.
I'm trying to incorporate a new column showing how many of the item we sold at the Low price. I tried to use COUNT along with the MIN function, but it returns "cannot perform an aggregate function on an expression containing an aggregate or a subquery".
Does anyone have any ideas how I could go about this?
Thank you
You need to create a subquery from your current GROUP BY query and join it with your original table. Then you can use a conditional COUNT:
SELECT T2.Received,
       T2.Low,
       COUNT(CASE WHEN T1.ItemLocalStandardCost = T2.Low THEN 1 END) as Total_Low,
       T2.High,
       COUNT(CASE WHEN T1.ItemLocalStandardCost = T2.High THEN 1 END) as Total_High
FROM YourTable T1
CROSS JOIN ( SELECT SUM(Y.POReceiptQuantity) as Received,
                    MIN(Y.ItemLocalStandardCost) as Low,
                    MAX(Y.ItemLocalStandardCost) as High
             FROM YourTable Y
             GROUP BY .... ) as T2
GROUP BY T2.Received, T2.Low, T2.High
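A window-function alternative that avoids the join, assuming the whole table is one group (swap the empty OVER () for OVER (PARTITION BY ...) to match whatever your GROUP BY uses):
SELECT SUM(POReceiptQuantity) AS Received,
       MIN(ItemLocalStandardCost) AS Low,
       -- count the rows whose cost equals the overall minimum / maximum
       COUNT(CASE WHEN ItemLocalStandardCost = min_cost THEN 1 END) AS Total_Low,
       MAX(ItemLocalStandardCost) AS High,
       COUNT(CASE WHEN ItemLocalStandardCost = max_cost THEN 1 END) AS Total_High
FROM (SELECT T.*,
             MIN(ItemLocalStandardCost) OVER () AS min_cost,
             MAX(ItemLocalStandardCost) OVER () AS max_cost
      FROM YourTable T) X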

SQLite ROLLUP query

I am trying to get a summary of the balance per month from my database. The table has the following fields:
tran_date
type (Income or Expense)
amount
I can get as far as retrieving the sum for each type for every month, but I want the sum for the whole month. This is my current query:
SELECT DISTINCT strftime('%m%Y', tran_date), type, SUM(amount) FROM tran WHERE exclude = 0 GROUP BY tran_date, type
This returns
032013 Income 100
032013 Expense 200
I would like the summary on one row, in this example 032013 -100.
Just use the right group by. This uses conditional aggregation, assuming that you want "income - expense", and it groups by the month rather than the full date:
SELECT strftime('%m%Y', tran_date),
       SUM(case when type = 'Income' then amount when type = 'Expense' then - amount end)
FROM tran WHERE exclude = 0
GROUP BY strftime('%m%Y', tran_date);
If you want just the full sum, then this is easier:
SELECT strftime('%m%Y', tran_date),
       SUM(amount)
FROM tran WHERE exclude = 0
GROUP BY strftime('%m%Y', tran_date);
Your original query returned a row per type because type was in the GROUP BY clause.
Also, DISTINCT is (almost) never needed with GROUP BY.
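A quick check in SQLite with the figures from the question (the table definition and dates here are hypothetical):
CREATE TABLE tran (tran_date TEXT, type TEXT, amount REAL, exclude INTEGER);
INSERT INTO tran VALUES ('2013-03-05', 'Income', 100, 0),
                        ('2013-03-12', 'Expense', 200, 0);

-- income minus expense, one row per month
SELECT strftime('%m%Y', tran_date) AS month,
       SUM(case when type = 'Income' then amount
                when type = 'Expense' then -amount end) AS balance
FROM tran
WHERE exclude = 0
GROUP BY strftime('%m%Y', tran_date);
-- returns: 032013 | -100.0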