How to get count(percentage) for columns after each groupby item? - sql

I have the following table in a SQLite database:
Item | Result
A    | Pass
B    | Pass
A    | Fail
B    | Fail
I want to produce the table below from it with a query:
Item | Total | Accept | Reject
A    | 2     | 1(50%) | 1(50%)
B    | 2     | 1(50%) | 1(50%)
How should I construct this query?

You can try PIVOT if your DBMS supports it. Then use CONCAT or the || operator, depending on the DBMS.
Query:
SELECT
    item,
    total,
    SUM(Pass) || '(' || CAST((SUM(Pass) * 1.0 / total * 1.0) * 100.0 AS DECIMAL) || '%)' AS Accept,
    SUM(Fail) || '(' || CAST((SUM(Fail) * 1.0 / total * 1.0) * 100.0 AS DECIMAL) || '%)' AS Reject
FROM
(
    SELECT
        Item,
        result,
        COUNT(result) OVER (PARTITION BY item ORDER BY result ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS total,
        CASE
            WHEN Result = 'Pass' THEN 1
            ELSE 0
        END AS Pass,
        CASE
            WHEN Result = 'Fail' THEN 1
            ELSE 0
        END AS Fail
    FROM t
) AS j
GROUP BY item, total
Query explanation:
Since SQLite does not support PIVOT, we create the Pass and Fail flags manually using CASE expressions.
To calculate total, COUNT is used as an analytic (window) function. It is basically a shortcut that computes the count per item and places it on every row of that item.
Then, in the outer query, we calculate the percentages and use || as the concatenation operator to join each sum with its percentage of the total.
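As a rough sketch of the same idea, the snippet below builds the sample table in an in-memory SQLite database from Python and produces the Accept/Reject columns. It is a simplified variant, not the query above: it assumes a plain GROUP BY with COUNT(*) is acceptable in place of the window function, and it rounds the percentage to a whole number. The table name t comes from the answer; the variable names are only illustrative.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (Item TEXT, Result TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("A", "Pass"), ("B", "Pass"), ("A", "Fail"), ("B", "Fail")],
)

# Assumption: GROUP BY with COUNT(*) replaces the window function;
# percentages are rounded to whole numbers before concatenation.
query = """
SELECT Item,
       COUNT(*) AS Total,
       SUM(CASE WHEN Result = 'Pass' THEN 1 ELSE 0 END) || '(' ||
           CAST(ROUND(100.0 * SUM(CASE WHEN Result = 'Pass' THEN 1 ELSE 0 END) / COUNT(*)) AS INTEGER)
           || '%)' AS Accept,
       SUM(CASE WHEN Result = 'Fail' THEN 1 ELSE 0 END) || '(' ||
           CAST(ROUND(100.0 * SUM(CASE WHEN Result = 'Fail' THEN 1 ELSE 0 END) / COUNT(*)) AS INTEGER)
           || '%)' AS Reject
FROM t
GROUP BY Item
ORDER BY Item
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Grouping once by Item keeps the total and the per-result sums in the same query block, so no derived table is needed.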

Related

SQL selecting with conditioning from a subquery

I am trying to perform two sum functions from a query. However, I want to only perform one of the sum functions if it meets a certain condition without affecting the other sum function.
What I was thinking is to use something similar to select x where condition = 1 from AC which is however not possible.
Here is the sample query where I want the second [sum(t.match)] selection to only calculate if the result in the subquery: match = 1 while still getting the total sum of all qqty.
select
sum(t.qqty), sum(t.qqty)
from
(select
car, cqty, qqty,
case when cqty = qqty then 1 else 0 end as match,
location, state) t
Use conditional aggregation -- that is case as the argument to the sum():
select sum(t.qqty), sum(case when condition = 1 then t.qqty else 0 end)
from t;
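A minimal sketch of conditional aggregation, run against SQLite from Python. The table name cars and its rows are made up for illustration (the question does not show its source table); the condition cqty = qqty is the match rule from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data; only the column names come from the question.
conn.execute("CREATE TABLE cars (car TEXT, cqty INTEGER, qqty INTEGER)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?, ?)",
    [("a", 5, 5), ("b", 3, 4), ("c", 2, 2)],
)

# First SUM covers every row; second SUM only adds qqty where cqty = qqty.
total_qqty, matched_qqty = conn.execute(
    """
    SELECT SUM(qqty),
           SUM(CASE WHEN cqty = qqty THEN qqty ELSE 0 END)
    FROM cars
    """
).fetchone()
print(total_qqty, matched_qqty)
```

Because the condition lives inside the second SUM rather than in a WHERE clause, the first SUM still sees all rows.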

How to get 0 if no row found from sql query in sql server

I am getting a blank value from this query in SQL Server:
SELECT TOP 1 Amount from PaymentDetails WHERE Id = '5678'
No row matches, which is why it returns a blank. If there is no row, I want it to return 0 instead.
I already tried COALESCE, but it does not work.
How can I solve this?
You are selecting an arbitrary amount, so one method is aggregation:
SELECT COALESCE(MAX(Amount), 0)
FROM PaymentDetails
WHERE Id = '5678';
Note that if id is a number, then don't use single quotes for the comparison.
To be honest, I would expect SUM() to be more useful than an arbitrary value:
SELECT COALESCE(SUM(Amount), 0)
FROM PaymentDetails
WHERE Id = '5678';
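The behaviour can be sketched with SQLite from Python, under the assumption that it mirrors SQL Server here: an aggregate without GROUP BY always yields exactly one row, so COALESCE has a NULL to replace. SQLite has no TOP, so LIMIT 1 stands in for it; the stored row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PaymentDetails (Id TEXT, Amount REAL)")
# Hypothetical row with a different Id, so '5678' matches nothing.
conn.execute("INSERT INTO PaymentDetails VALUES ('1234', 10.0)")

# COALESCE on a plain column cannot help: with no matching row there is
# no row at all for it to operate on, so fetchone() returns None.
plain = conn.execute(
    "SELECT COALESCE(Amount, 0) FROM PaymentDetails WHERE Id = '5678' LIMIT 1"
).fetchone()

# An aggregate always returns one row (NULL over an empty set),
# which COALESCE then turns into 0.
max_amount = conn.execute(
    "SELECT COALESCE(MAX(Amount), 0) FROM PaymentDetails WHERE Id = '5678'"
).fetchone()[0]
sum_amount = conn.execute(
    "SELECT COALESCE(SUM(Amount), 0) FROM PaymentDetails WHERE Id = '5678'"
).fetchone()[0]
print(plain, max_amount, sum_amount)
```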
You can wrap the subquery in an ISNULL:
SELECT ISNULL((SELECT TOP 1 Amount from PaymentDetails WHERE Id = '5678' ORDER BY ????),0) AS Amount;
Don't forget to add a column (or columns) to your ORDER BY as otherwise you will get inconsistent results when more than one row has the same value for Id. If Id is unique, however, then remove both the TOP and ORDER BY as they aren't needed.
You should never, however, use TOP without an ORDER BY unless you are "happy" with inconsistent results.

Using the total of a column of the queried table in a case when (Hive)

Simplified example:
In hive, I have a table t with two columns:
Name, Value
Bob, 2
Betty, 4
Robb, 3
I want to do a case when that uses the total of the Value column:
Select
Name
, CASE
When value>0.5*sum(value) over () THEN '0'
When value>0.9*sum(value) over () THEN '1'
ELSE '2'
END as var
From table
I don’t like the fact that sum(value) over () is computed twice. Is there a way to compute it only once? As an added twist, I want to do this in one query, without declaring user variables.
I was thinking of scalar queries:
With total as
(Select sum(value) from table)
Select
Name
, CASE
When value>0.5*(select * from total) THEN '0'
When value>0.9*(select * from total) THEN '1'
ELSE '2'
END as var
From table;
But this doesn’t work.
TLDR: Is there a way to simplify the first query without user variables ?
Don't worry about that. Let the optimizer worry about it. But, you can use a subquery or CTE if you don't want to repeat the expression:
select Name,
(case when value > 0.5 * total then '0'
when value > 0.9 * total then '1'
else '2'
end) as var
From (select t.*, sum(value) over () as total
from table t
) t;
Cross join a subquery that fetches the sum to the table:
Select
t.Name
, CASE
When t.value>0.9*tt.value THEN '1'
When t.value>0.5*tt.value THEN '0'
ELSE '2'
END as var
From table t cross join (select sum(value) value from table) tt
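Also note that the order of the WHEN clauses is swapped relative to the question: as originally written, any value greater than 0.9 times the total is also greater than 0.5 times the total (for a positive total), so the first branch always matches and the second can never succeed.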
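The effect of the WHEN order can be checked with a small sketch. It runs on SQLite from Python rather than Hive (an assumption made only so the example is self-contained), uses a table named t, and uses made-up values chosen so that one row exceeds 90% of the total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, value REAL)")
# Hypothetical data: total is 100, so 'Big' exceeds both thresholds.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("Big", 95.0), ("Mid", 4.0), ("Small", 1.0)],
)

# Template: the sum is fetched once by the cross-joined subquery.
template = """
SELECT t.name,
       CASE {whens} ELSE '2' END AS var
FROM t CROSS JOIN (SELECT SUM(value) AS total FROM t) tt
ORDER BY t.name
"""

# Order from the question: the 0.5 test shadows the 0.9 test.
question_order = conn.execute(template.format(
    whens="WHEN t.value > 0.5 * tt.total THEN '0' "
          "WHEN t.value > 0.9 * tt.total THEN '1'"
)).fetchall()

# Swapped order: the stricter test is tried first.
swapped_order = conn.execute(template.format(
    whens="WHEN t.value > 0.9 * tt.total THEN '1' "
          "WHEN t.value > 0.5 * tt.total THEN '0'"
)).fetchall()

print(question_order)
print(swapped_order)
```

With the question's order the 'Big' row is labelled '0'; with the stricter test first it is labelled '1'.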
Since I/O is the major factor that slows down Hive queries, we should strive to reduce the number of stages to get better performance.
So it's better not to use a sub-query or CTE here.
Try this SQL with a global window clause:
select
name,
case
when value > 0.5*sum(value) over w then '0'
when value > 0.9*sum(value) over w then '1'
else '2'
end as var
from my_table
window w as (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
In this case window clause is the recommended way to reduce repetition of code.
Both the windowing and the sum aggregation will be computed only once. You can run explain select..., confirming that only ONE meaningful MR stage will be launched.
Edit:
1. A simple select clause on a subquery is not something to worry about. It can be pushed down to the last phase of the subquery, so as to avoid an additional MR stage.
2. Two identical aggregations residing in the same query block will only be evaluated once. So don’t worry about potential repeated calculation.

Specifying a column value in an aggregate function vs using a WHERE clause

I have a database people that looks like this:
I wanted to count the occurrences of state='CA'.
My first attempt was:
SELECT COUNT(state='CA')
FROM people
;
This returned 1 row with a value of 1000, so I thought that there were 1000 people from CA in the database.
This turns out to be incorrect. I know that there are 127, which I can verify with the query
SELECT COUNT(*)
FROM people
WHERE state='CA'
;
which returns 1 row with a value of 127.
I understand how the second query works. However, I do not understand what is wrong with the first one. What is it returning?
If you want to see what's going on, run the query:
select state='CA' from people;
You will see that you will get one result for each row in people, with the value 0 or 1 (or True/False). What you've selected is whether state='CA' for each row, and there will be just as many of those results as there are rows.
You can't constrain a COUNT statement within the statement, you have to do that via the WHERE clause as in your second example.
COUNT is not SUM. Your first query is wrong because it does not return the number of rows where the condition is true; it returns the number of rows where the expression is not null, whether it is true or false.
If you want a filtered count, use a WHERE condition (as in your second query); otherwise, use an IF or a CASE expression inside the SUM() function, e.g.:
Select sum(case
when state='CA' then 1 else 0
end) as my_result from People;
Or, if you want to use COUNT, return NULL rather than 0 in the ELSE branch, because COUNT only skips nulls:
Select count(case
when state='CA' then 1 else null
end) as my_result from People;
Try this:
Select count(case when state='CA' then 1 else null end) as xyz from People;
The first query will work if you put a CASE WHEN expression inside the aggregate.
For example, the query below returns the count of CA rows:
SELECT sum( case when state='CA' then 1 else 0 end)
FROM people
In the first query, state='CA' is evaluated as an expression for each of the 1000 rows instead of being used to filter them. That is what the SELECT list does: it does not reduce the number of rows returned, it only computes values from them.
In the WHERE clause, by contrast, the rows are filtered first, and then the SELECT clause runs the COUNT function on what remains.
A query is evaluated in a logical sequence: it starts with FROM, then WHERE, GROUP BY, and HAVING, then SELECT, and ORDER BY runs at the end.
To answer the actual question - why do you get 1000? I'm guessing that there are 1000 rows in your database, or at least 1000 where state is not null. Count will return the number of rows where the thing inside the () is not null and as one of your comments says, the part inside your () will return either true or false, neither of which is null, so will count them all. Your second example is of course the right way to do it.
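The difference between the variants can be seen in a small sketch using SQLite from Python. The five rows are made up (the question's people table is not shown), and one row has a NULL state to show which rows COUNT actually skips.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, state TEXT)")
# Hypothetical rows: two from CA, two from elsewhere, one with no state.
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ann", "CA"), ("Bo", "CA"), ("Cy", "NY"), ("Di", "TX"), ("Ed", None)],
)

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

# Counts every row where the comparison is non-null (true OR false).
count_expr = scalar("SELECT COUNT(state = 'CA') FROM people")
# Adds 1 only for CA rows.
sum_case = scalar("SELECT SUM(CASE WHEN state = 'CA' THEN 1 ELSE 0 END) FROM people")
# CASE without ELSE yields NULL for non-CA rows, which COUNT skips.
count_case = scalar("SELECT COUNT(CASE WHEN state = 'CA' THEN 1 END) FROM people")
# Filters rows first, then counts what is left.
count_where = scalar("SELECT COUNT(*) FROM people WHERE state = 'CA'")

print(count_expr, sum_case, count_case, count_where)
```

Only the row with a NULL state is excluded from COUNT(state = 'CA'), since comparing NULL with 'CA' yields NULL.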

Changing position of a row in sql

In the above T-SQL table I would like the Total row to appear at the bottom. I have been racking my brain over this. In all of my other queries, simply using ORDER BY Status works, because Total is alphabetically much farther down the list than most of our row values.
This is not the case here, and I just can't figure out how to change it.
I'm pretty new to SQL and I've been having a lot of difficulty even determining how to phrase a Google search. So far I've only gotten results pertaining to ORDER BY.
The results of a select query, unless an order is explicitly specified via an 'order by' clause, can be returned in any order. Moreover, the order in which they are returned is not even deterministic. Running the exact same query 3 times in succession might return the exact same result set in 3 different orderings.
So if you want a particular order to your table, you need to order it. An order by clause like
select *
from myTable t
where ...
order by case Status when 'Total' then 1 else 0 end ,
Status
would do you. The 'Total' row will float to the bottom, the other rows will be ordered in collating sequence. You can also order things arbitrarily with this technique:
select *
from myTable t
where ...
order by case Status
when 'Deceased' then 1
when 'Total' then 2
when 'Active' then 3
when 'Withdrawn' then 4
else 5
end
will list the row(s) with a status of 'Deceased' first, followed by the row(s) with a status of 'Total', then 'Active' and 'Withdrawn', and finally anything that didn't match up to an item in the list.
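A quick way to try the first technique is the sketch below, which uses SQLite from Python instead of SQL Server (an assumption for self-containment; the CASE expression in ORDER BY is the same). The table name statuses, the column n, and the rows are made up, with 'Withdrawn' included because it sorts after 'Total' alphabetically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per status plus a Total row.
conn.execute("CREATE TABLE statuses (Status TEXT, n INTEGER)")
conn.executemany(
    "INSERT INTO statuses VALUES (?, ?)",
    [("Withdrawn", 3), ("Total", 10), ("Active", 5), ("Deceased", 2)],
)

# Plain alphabetical order leaves 'Total' in the middle.
alphabetical = [r[0] for r in conn.execute(
    "SELECT Status FROM statuses ORDER BY Status"
)]

# Sort key 1 for 'Total' and 0 for everything else pushes it to the end;
# the second key keeps the remaining rows alphabetical.
total_last = [r[0] for r in conn.execute(
    """
    SELECT Status FROM statuses
    ORDER BY CASE Status WHEN 'Total' THEN 1 ELSE 0 END, Status
    """
)]

print(alphabetical)
print(total_last)
```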
ORDER BY CASE WHEN STATUS = 'Total' THEN 'zzz' ELSE STATUS END
In SQL Server (and most other databases), you can use case to sort certain statuses above others:
order by
case Status
when 'Total' then 2
else 1
end
, Status
In MS Access, you can use iif:
order by
iif(Status = 'Total', 2, 1)
, Status
You can use conditional expressions in order by:
order by (case when status = 'Total' then 1 else 0 end),
status