SQL calculate percentage from calculated column - sql

I have a table with multiple columns, but I need to calculate a total percentage based on two of them.
Column 1 has a unique identifier (a number, e.g. 15211, 36521, 45987, etc.)
Column 2 has a "Y" or is blank (the criteria are built into the DWH)
What I want is the percentage of rows where Column 2 is "Y", using the count of Column 1 as the denominator.
Column 1   Column 2
25638      y
69857      n
78561      n
23149      y
Based on the example above, I'm expecting 2/4 = 0.50, or 50%.

You can divide the result of a conditional aggregation on Column2 = 'y' by the overall count.
SELECT COUNT(CASE WHEN Column2 = 'y' THEN 1 END) / COUNT(*) AS perc_y
FROM tab
Output:
perc_y
0.5000
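If you want a percentage, multiply by 100, round and concatenate with '%'.
Here's a demo in MySQL. The same query runs on most common DBMSs, but where dividing two integers yields an integer (e.g. SQL Server, PostgreSQL, SQLite), multiply the numerator by 1.0 first, otherwise the result is truncated to 0.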
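As a rough, runnable sketch of the conditional aggregation above, the query can be tried against an in-memory SQLite database from Python. The table name tab and the sample rows come from the question; the use of SQLite, the 1.0 * factor (needed because SQLite divides integers as integers) and the Python formatting of the percentage are assumptions made for illustration only.

```python
import sqlite3

# In-memory database holding the four sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (Column1 INTEGER, Column2 TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?)",
    [(25638, "y"), (69857, "n"), (78561, "n"), (23149, "y")],
)

# COUNT ignores NULLs, so the CASE without ELSE counts only the 'y' rows.
# 1.0 * forces real division; SQLite would otherwise truncate 2 / 4 to 0.
ratio_sql = """
SELECT 1.0 * COUNT(CASE WHEN Column2 = 'y' THEN 1 END) / COUNT(*) AS perc_y
FROM tab
"""
perc_y = conn.execute(ratio_sql).fetchone()[0]

# Turn the ratio into a percentage label on the client side.
label = f"{round(perc_y * 100)}%"
print(perc_y, label)
```

Formatting the label in the application rather than in SQL avoids the differences in string concatenation syntax between DBMSs.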

Related

Return 0 in Sheets Query if there is no data

I need some advice on the Google Query Language.
I want to count rows depending on a date and a condition. If the condition is not met, the count should be 0.
What I'm trying to achieve:
Date Starts
05.09.2018 0
06.09.2018 3
07.09.2018 0
What I get:
Date Starts
06.09.2018 3
The query looks like =Query(Test!$A2:P; "select P, count(B) where (B contains 'starts') group by P label count(B) 'Starts'")
P contains ascending datevalues and B an event (like start in this case).
How can I force the output of a 0 for the dates with no entry containing "start"?
The main point is to get all the needed data into one table in ascending order. This only works if every day has an entry. If a day has no entry, the results for "start" no longer line up with the date value in column A: the 3 in column D would then end up in the first row of the table.
I need it like this:
A B C D
Date Logins Sessions Starts
05.09.2018 1 2 0
06.09.2018 3 4 3
07.09.2018 4 5 0
Maybe this is easy to fix, but I don't see it.
Thanks in advance!
You can do some pre-processing before the query. For example, check whether column B contains 'start' with regexmatch and use a double unary (--) to turn the boolean values into 1s and 0s. Then use query to sum.
=Query(Arrayformula({--regexmatch(Test!$B2:B; "start")\ Test!$A2:P}); "select Col17, sum(Col1) where Col17 is not null group by Col17 label sum(Col1) 'Starts'")
Change ranges to suit.
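The underlying idea (turn the match into 1 or 0 for every row and sum it, instead of filtering the rows out before counting) is not specific to Sheets. Below is a small sketch of the same idea in SQL, run through SQLite from Python. The events table, its column names and the sample rows are hypothetical and only mirror the shape of the data in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (day TEXT, event TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("2018-09-05", "login"),
        ("2018-09-06", "starts"),
        ("2018-09-06", "starts"),
        ("2018-09-06", "starts"),
        ("2018-09-06", "login"),
        ("2018-09-07", "session"),
    ],
)

# No WHERE clause: every day stays in the result, and rows that do not
# match contribute 0 to the sum instead of being dropped.
starts_per_day = conn.execute(
    """
    SELECT day,
           SUM(CASE WHEN event LIKE '%start%' THEN 1 ELSE 0 END) AS starts
    FROM events
    GROUP BY day
    ORDER BY day
    """
).fetchall()

for day, starts in starts_per_day:
    print(day, starts)
```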

SQL grouping sets and roundup

I am trying to calculate the difference between a certain sum of values and the same sum using specific roundup rules (columns 5 and 6):
select
A,
B,
C,
sum(D),
sum(D)/300,
case when sum(D)/300 < 1.5 then 0 else round(sum(D/300), 0) end
from table
group by grouping sets ((A,B,C), ())
The SQL works, but the final row is wrong. The totals in the final row seem correct for columns 4 and 5, but column 6 doesn't add up the rounded values of the column; it shows the rounded value of the total of column 5 instead...
What am I doing wrong? (Goal: compare the totals of column 5 and 6)
Any help is welcome!
EDIT:
the result right now is something like this (only column 5 and 6):
1,2 0
1,5 2
3,1 3
5,8 6
The total of the second column should say 5 in this example, but it says 6, using the unrounded values...
You are missing the outer SUM:
select
A,
B,
C,
sum(D),
sum(D)/300,
SUM(case when sum(D)/300 < 1.5 then 0 else round(sum(D/300), 0) end) as result
from table
group by grouping sets ((A,B,C), ())
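Many engines reject an aggregate nested directly inside another aggregate, so one way to sketch the same goal is to round per group in a first step and only then add up the rounded values for the total row. The example below is an illustration under assumptions, not the original query: it uses SQLite (which has no GROUPING SETS, so the total row is appended with UNION ALL), a hypothetical table t, and sample values chosen to reproduce the 1.2 / 1.5 / 3.1 ratios from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (A TEXT, B TEXT, C TEXT, D INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("a1", "b1", "c1", 160),
        ("a1", "b1", "c1", 200),  # group sum 360 -> ratio 1.2
        ("a2", "b2", "c2", 450),  # ratio 1.5
        ("a3", "b3", "c3", 930),  # ratio 3.1
    ],
)

sql = """
WITH per_group AS (
    -- Step 1: one row per (A, B, C), rounding applied per group.
    SELECT A, B, C,
           SUM(D)         AS total_d,
           SUM(D) / 300.0 AS ratio,
           CASE WHEN SUM(D) / 300.0 < 1.5 THEN 0
                ELSE ROUND(SUM(D) / 300.0, 0) END AS rounded
    FROM t
    GROUP BY A, B, C
)
SELECT A, B, C, total_d, ratio, rounded FROM per_group
UNION ALL
-- Step 2: the total row sums the already-rounded values.
SELECT NULL, NULL, NULL, SUM(total_d), SUM(ratio), SUM(rounded) FROM per_group
"""
rows = conn.execute(sql).fetchall()

# The total row is the one whose grouping columns are NULL.
total_row = next(r for r in rows if r[0] is None)
print(total_row)
```

With these sample values the rounded column holds 0, 2 and 3, so the total row reports 5 for the rounded column next to roughly 5.8 for the unrounded ratio. On a DBMS that supports GROUPING SETS, the second step could presumably group the per_group rows with grouping sets ((A,B,C), ()) instead of using UNION ALL.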

SQL: Computing sum of all values *and* a sum of only values matching condition

Suppose I fetch a set of rows from several tables. I want to know the total sum of values in column x in these rows, as well as sum of only those values in x where the row satisfies some additional condition.
For example, let's say I fetched:
X Y
1 0
10 0
20 100
35 100
I want to have a sum of all x (66) and x in those rows where x > y (11). So, I'd need something like:
SELECT sum(x) all_x, sum(x /* but only where x > y */) some_x FROM ...
Is there a way to formulate that in SQL? (Note that the condition is not a separate column in some table, I cannot group over it, or at least don't know how to do that.)
EDIT: I use Oracle Database, so depending on Oracle extensions is OK.
You could use a case expression inside the sum:
SELECT SUM (x) all_x,
SUM (CASE WHEN x > y THEN x ELSE 0 END) some_x
FROM my_table
You're looking for the CASE expression:
SELECT sum(X) all_x,
sum(CASE WHEN X>Y THEN X ELSE 0 END) some_x
FROM Table1
In this case (no pun) you would get 11 for some_x
You can use whatever condition you want instead of X>Y after the WHEN, and select whatever value instead of X.
SQL fiddle to test this query.
The query below will give you what you want:
select SUM(x) as x, (select SUM(x) from test5 where x > y) as "X>Y"
from test5
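To check the CASE-based and subquery-based answers against the sample rows from the question, both can be run on an in-memory SQLite database. This is only a sketch: the table name my_table is taken from the first answer, while the use of SQLite (rather than the asker's Oracle) and the variable names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [(1, 0), (10, 0), (20, 100), (35, 100)],
)

# Variant 1: conditional aggregation, a single pass over the table.
all_x, some_x = conn.execute(
    """
    SELECT SUM(x) AS all_x,
           SUM(CASE WHEN x > y THEN x ELSE 0 END) AS some_x
    FROM my_table
    """
).fetchone()

# Variant 2: scalar subquery with its own WHERE clause.
all_x_sub, some_x_sub = conn.execute(
    """
    SELECT SUM(x),
           (SELECT SUM(x) FROM my_table WHERE x > y)
    FROM my_table
    """
).fetchone()

print(all_x, some_x, all_x_sub, some_x_sub)
```

One difference worth noting: if no row satisfies the condition, the CASE variant with ELSE 0 returns 0, whereas the subquery variant returns NULL.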

SQL query to return matrix

I have a set of rows with one column of actual data. The goal is to display this data in matrix format. The number of columns will remain the same; the number of rows may vary.
For example:
I have 20 records. With 5 columns, the number of rows would be 4.
I have 24 records. With 5 columns, the number of rows would be 5, and the 5th column in the 5th row would be empty.
I have 18 records. With 5 columns, the number of rows would be 4, and the 4th and 5th columns in the 4th row would be empty.
I was thinking of generating a column value for each row. This column value would be repeated every 5 rows. But I cannot do that; the issue is the error "A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operations".
Not sure how it can be achieved.
Any advice will be helpful.
Further Addition - I have managed to generate the name value association with column name and value. Example -
Name1 Col01
Name2 Col02
Name3 Col03
Name4 Col01
Name5 Col02
You can use ROW_NUMBER to assign a sequential integer from 0 up. Then group by the result of integer division whilst pivoting on the remainder.
WITH T AS
(
SELECT number,
ROW_NUMBER() OVER (ORDER BY number) -1 AS RN
FROM master..spt_values
)
SELECT MAX(CASE WHEN RN%5 = 0 THEN number END) AS Col1,
MAX(CASE WHEN RN%5 = 1 THEN number END) AS Col2,
MAX(CASE WHEN RN%5 = 2 THEN number END) AS Col3,
MAX(CASE WHEN RN%5 = 3 THEN number END) AS Col4,
MAX(CASE WHEN RN%5 = 4 THEN number END) AS Col5
FROM T
GROUP BY RN/5
ORDER BY RN/5
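To see the integer-division and remainder trick on a small input, the sketch below runs the same query shape on 7 records with 5 columns. It is an illustration under assumptions: SQLite 3.25 or newer (for ROW_NUMBER) stands in for SQL Server, and the hypothetical table items replaces master..spt_values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (number INTEGER)")
# Seven records, so the second matrix row is only partly filled.
conn.executemany("INSERT INTO items VALUES (?)", [(n,) for n in range(1, 8)])

matrix = conn.execute(
    """
    WITH T AS (
        -- RN is a zero-based position: RN / 5 is the matrix row,
        -- RN % 5 is the matrix column.
        SELECT number,
               ROW_NUMBER() OVER (ORDER BY number) - 1 AS RN
        FROM items
    )
    SELECT MAX(CASE WHEN RN % 5 = 0 THEN number END) AS Col1,
           MAX(CASE WHEN RN % 5 = 1 THEN number END) AS Col2,
           MAX(CASE WHEN RN % 5 = 2 THEN number END) AS Col3,
           MAX(CASE WHEN RN % 5 = 3 THEN number END) AS Col4,
           MAX(CASE WHEN RN % 5 = 4 THEN number END) AS Col5
    FROM T
    GROUP BY RN / 5
    ORDER BY RN / 5
    """
).fetchall()

for row in matrix:
    print(row)
```

Cells with no record come back as NULL (None in Python), which matches the "empty" cells described in the question.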
In general:
SQL is for retrieving data, that is all your X records in one column
Making a nice display of your data is usually the job of the software that queries SQL, e.g. your web/desktop application.
However, if you really want to build the display output in SQL, you could use a WHILE loop in connection with LIMIT and PIVOT. You would just select the first 5 records, then the next ones, until finished.
Here is an example of how to use WHILE: http://msdn.microsoft.com/de-de/library/ms178642.aspx

Split a query result based on the result count

I have a query based on basic criteria that will return X number of records on any given day.
I'm trying to check the result of the basic query then apply a percentage split to it based on the total of X and split it in 2 buckets. Each bucket will be a percentage of the total query result returned in X.
For example:
Query A returns 3500 records.
If the number of records returned from Query A is <= 3000, then split the 3500 records into a 40% / 60% split (1,400 / 2,100).
If the number of records returned from Query A is >= 3001 and <= 50,000, then split the records into a 10% / 90% split. Etc.
I want the actual records returned, and not just the math acting on the records that returns one row with a number in it (in the column).
I'm not sure how you want to display the different parts of the result set, so I've just added an additional column (part) to it: the value 1 indicates that a row belongs to the first part, and 2 that it belongs to the second part.
select z.*
, case
when cnt_all <= 3000 and cnt <= 40
then 1
when (cnt_all between 3001 and 50000) and (cnt <= 10)
then 1
else 2
end part
from (select t.*
, 100*(count(col1) over(order by col1) / count(col1) over() )cnt
, count(col1) over() cnt_all
from split_rowset t
order by col1
) z
Demo #1 number of rows 3000.
Demo #2 number of rows 3500.
For better usability you can create a view using the query above and then query that view filtering by part column.
Demo #3 using a view.
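The running-percentage approach above can be checked on generated data. The sketch below is an adaptation under assumptions rather than the original demo: it uses SQLite 3.25 or newer instead of the answer's DBMS, multiplies by 100.0 before dividing because SQLite truncates integer division, and wraps everything in a hypothetical helper split_counts that reports how many rows land in each part.

```python
import sqlite3

def split_counts(n_rows):
    """Return {part: row_count} for a table holding n_rows unique values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE split_rowset (col1 INTEGER)")
    conn.executemany(
        "INSERT INTO split_rowset VALUES (?)",
        [(i,) for i in range(1, n_rows + 1)],
    )
    sql = """
    SELECT part, COUNT(*)
    FROM (
        SELECT z.*,
               CASE WHEN cnt_all <= 3000 AND cnt <= 40 THEN 1
                    WHEN (cnt_all BETWEEN 3001 AND 50000) AND cnt <= 10 THEN 1
                    ELSE 2
               END AS part
        FROM (
            -- cnt: running share of rows (in percent) up to this row;
            -- cnt_all: total number of rows in the result.
            SELECT col1,
                   100.0 * COUNT(col1) OVER (ORDER BY col1)
                         / COUNT(col1) OVER () AS cnt,
                   COUNT(col1) OVER () AS cnt_all
            FROM split_rowset
        ) z
    )
    GROUP BY part
    ORDER BY part
    """
    return dict(conn.execute(sql).fetchall())

print(split_counts(3000))
print(split_counts(3500))
```

With 3000 rows the 40% / 60% rule applies, and with 3500 rows the 10% / 90% rule applies. To get the actual records rather than the counts, select from the inner query and filter on part.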