Count instances of value (say, '4') in several columns/ rows - sql

I have survey responses in a SQL database. Scores are 1-5.
Current format of the data table is this:
Survey_id, Question_1, Question_2, Question_3
383838, 1,1,1
392384, 1,5,4
393894, 4,3,5
I'm running a new query where I need % 4's, % 5's ... question doesn't matter, just overall.
At first glance I'm thinking
sum(iif(Question_1 =5,1,0)) + sum(iif(Question_2=5,1,0)) .... as total5s
sum(iif(Question_1=4,1,0)) + sum(iif(Question_2=4,1,0)) .... as total4s
But I am unsure if this is the quickest or most elegant way to achieve this.
EDIT: Hmm on first test this query already appears not to work correctly
EDIT2: I think I need sum instead of count in my example, will edit.

You have to unpivot the data and calculate the % responses thereafter. Because there are a limited number of questions, you can use union all to unpivot the data.
select 100.0*count(case when question=4 then 1 end)/count(*) as pct_4s
from (select survey_id,question_1 as question from tablename
union all
select survey_id,question_2 from tablename
union all
select survey_id,question_3 from tablename
) responses
Another way to do this could be
select 100.0*(count(case when question_1=4 then 1 end)
+count(case when question_2=4 then 1 end)
+count(case when question_3=4 then 1 end))
/(3*count(*))
from tablename
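The 3*count(*) denominator above assumes every question was answered on every survey. If unanswered questions are stored as NULL (an assumption, the question does not say), counting the non-NULL values is one way to get the denominator. A minimal sketch; the table variable @tablename and its two rows are made up for illustration:

```sql
-- Hypothetical sample: two surveys, each with one unanswered (NULL) question.
DECLARE @tablename TABLE (survey_id int, question_1 int, question_2 int, question_3 int);
INSERT INTO @tablename VALUES (1, 4, NULL, 4), (2, 1, 5, NULL);

-- COUNT(column) skips NULLs, so the denominator is the number of actual answers.
-- NULLIF guards against division by zero when nothing was answered at all.
SELECT 100.0 * (COUNT(CASE WHEN question_1 = 4 THEN 1 END)
              + COUNT(CASE WHEN question_2 = 4 THEN 1 END)
              + COUNT(CASE WHEN question_3 = 4 THEN 1 END))
       / NULLIF(COUNT(question_1) + COUNT(question_2) + COUNT(question_3), 0) AS pct_4s
FROM @tablename;
```

Here two of the four actual answers are 4s, so pct_4s comes out as 50 percent, whereas dividing by 3*count(*) would give about 33.3.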
With unpivot, as @Dudu suggested:
with unpivoted as (select *
from tablename
unpivot (response for question in (question_1,question_2,question_3)) u
)
select 100.0*count(case when response=4 then 1 end)/count(*)
from unpivoted
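To get the share of every score in one pass, rather than one query per score, the unpivoted rows can be grouped by response. A minimal sketch, assuming SQL Server 2008 or later; the table variable @responses and the alias u are hypothetical stand-ins, loaded with the three sample rows from the question:

```sql
-- Sample rows copied from the question into a hypothetical table variable.
DECLARE @responses TABLE (Survey_id int, Question_1 int, Question_2 int, Question_3 int);
INSERT INTO @responses VALUES (383838, 1, 1, 1), (392384, 1, 5, 4), (393894, 4, 3, 5);

-- CROSS APPLY (VALUES ...) turns the three question columns into three rows per survey.
-- SUM(COUNT(*)) OVER () is the grand total of responses across all groups.
SELECT u.response,
       COUNT(*) AS n,
       100.0 * COUNT(*) / SUM(COUNT(*)) OVER () AS pct
FROM @responses AS r
CROSS APPLY (VALUES (r.Question_1), (r.Question_2), (r.Question_3)) AS u(response)
GROUP BY u.response
ORDER BY u.response;
```

With these three rows, the nine responses contain four 1s, one 3, two 4s and two 5s, so pct is roughly 44.4, 11.1, 22.2 and 22.2. A score that never occurs (2 here) produces no row.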

Related

Transposing Column Data into Row Data SQL

I have some data that looks like this in an SQL table.
[ID],[SettleDate],[Curr1],[Curr2],[Quantity1],[Quantity2],[CashAmount1],[CashAmount2]
The issue is that I need to create two records from each row: one with all the "1" columns and one with all the "2" columns. Example below.
[ID],[SettleDate],[Curr1],[Quantity1],[CashAmount1]
[ID],[SettleDate],[Curr2],[Quantity2],[CashAmount2]
Does anyone have any ideas how to do so?
Thanks
A standard (ie cross-RDBMS) solution for this is to use union:
select ID, SettleDate, Curr1, Quantity1, CashAmount1 from mytable
union all select ID, SettleDate, Curr2, Quantity2, CashAmount2 from mytable
Depending on your RDBMS, neater solutions might be available.
Just another option. ItemNbr (1 or 2) only records which set of columns each output row came from.
Select A.[ID]
,A.[SettleDate]
,B.*
From YourTable A
Cross Apply ( values (1,[Curr1],[Quantity1],[CashAmount1])
,(2,[Curr2],[Quantity2],[CashAmount2])
) B(ItemNbr,Curr,Quantity,CashAmount)

Stored procedure to count then divide a column

I don't know exactly how to title this question, but I am looking to create a stored procedure (or procedures) that builds a new table of averages. I have collected survey data from 19 sites. I want to count each column twice, with two different conditions.
E.g.
SELECT COUNT(ColumnName)
FROM TableName
WHERE ColumnName = 3
SELECT COUNT(ColumnName)
FROM TableName
WHERE ColumnName = 4
From there I would like to add those two numbers together then divide by another count for another column in the table.
Basically I want to know how many surveys have the answer 3 and 4 then divide them by how many surveys were answered. Also keep in mind I want numbers based on each site.
Use group by:
select columnname, count(*) as cnt
from tablename
where columnname in (3, 4)
group by columnname;
You seem to want:
select 1.0 * sum(case when col in (3,4) then 1 else 0 end) / count(*) -- 1.0 * avoids integer division
from tablename t
I have gotten a bit closer to what I am trying to achieve, but it still does not do what I want. This is what I have come up with, but I don't know how to divide by the sum:
SELECT (SELECT COUNT(*) FROM Resident_Survey WHERE CanbealonewhenIwish = 3 and Village = 'WP' and Setting = 'LTC')
+ (SELECT COUNT(*) FROM Resident_Survey WHERE CanbealonewhenIwish = 4 and Village = 'WP' and Setting = 'LTC')
/ (SELECT COUNT(*) FROM Resident_Survey WHERE Village = 'WP' and Setting = 'LTC') AS ICanbealonewhenIwish
I figured it out. I was looking to create this query.
SELECT 100.0 *
COUNT(CASE WHEN Privacyisrespected IN (3,4)
THEN 1
ELSE NULL END) /
COUNT(*) AS Myprivacyisrespected
FROM Resident_Survey
WHERE Village = 'WP'
and Setting = 'LTC'
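Since the question asks for numbers for each site, the same expression can be grouped by site instead of filtered to a single one. A minimal sketch: the table variable @Resident_Survey, the village code 'XY' and all the rows are hypothetical; only the column names come from the query above.

```sql
-- Hypothetical sample data for two sites.
DECLARE @Resident_Survey TABLE (Village varchar(10), Setting varchar(10), Privacyisrespected int);
INSERT INTO @Resident_Survey VALUES
    ('WP', 'LTC', 3), ('WP', 'LTC', 4), ('WP', 'LTC', 1), ('WP', 'LTC', 2),
    ('XY', 'LTC', 4), ('XY', 'LTC', 2), ('XY', 'LTC', 1), ('XY', 'LTC', 1);

-- One output row per Village/Setting pair instead of one query per site.
SELECT Village,
       Setting,
       100.0 * COUNT(CASE WHEN Privacyisrespected IN (3, 4) THEN 1 END)
             / COUNT(*) AS Myprivacyisrespected
FROM @Resident_Survey
GROUP BY Village, Setting
ORDER BY Village, Setting;
```

With this sample, WP comes out at 50 percent (2 of 4) and XY at 25 percent (1 of 4). COUNT(*) counts every survey row; if unanswered questions are NULL and should be excluded, COUNT(Privacyisrespected) would be the denominator instead.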

Fetch data from table using SQL

I have a table named "Orders" with 1-1000 rows and 3 columns (S.no, Order and Status). I need to fetch Order for rows 50-1000 whose Status is "Cancelled". How can I do this in SQL Server?
Logic operator:
SELECT [Order]
FROM Orders
WHERE Status = 'Cancelled'
AND ([S.no] >= 50 AND [S.no] <= 1000)
BETWEEN:
SELECT [Order]
FROM Orders
WHERE Status = 'Cancelled'
AND ([S.no] BETWEEN 50 AND 1000)
select *
from orders
where no between 50 and 1000
and status = 'Cancelled'
This assumes the column is actually named "no". S.no is not a valid unquoted column name; it would have to be written as [S.no].
You can try something like this:
SELECT *
FROM Orders
WHERE ([S.no] BETWEEN 50 AND 1000) AND (Status = 'Cancelled')
Hope this helps
If you're using SQL Server, you don't have access to LIMIT, and OFFSET ... FETCH only exists from SQL Server 2012 onward.
There's a really nice generalizable solution discussed here: Equivalent of LIMIT and OFFSET for SQL Server?
I'd definitely take a look at that. If your S.no values really do range from 1 to 1000, the solution above by Notulysses should work just fine. If they don't (and can't be filtered in some other easy way), use the generalizable solution linked above, which I've also copied below for reference:
;WITH Results_CTE AS
(
SELECT
Col1, Col2, ...,
ROW_NUMBER() OVER (ORDER BY SortCol1, SortCol2, ...) AS RowNum
FROM Table
WHERE <whatever>
)
SELECT *
FROM Results_CTE
WHERE RowNum >= @Offset
AND RowNum < @Offset + @Limit
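On SQL Server 2012 and later, the same kind of paging can also be written with OFFSET ... FETCH. A minimal sketch, assuming a hypothetical table variable @Orders with made-up rows and column names. Note that OFFSET is the number of rows to skip, so it is zero-based, unlike the one-based RowNum filter above.

```sql
-- Hypothetical sample data; SNo and OrderName stand in for the real column names.
DECLARE @Orders TABLE (SNo int, OrderName varchar(20), Status varchar(20));
INSERT INTO @Orders VALUES
    (1, 'A', 'Cancelled'), (2, 'B', 'Open'), (3, 'C', 'Cancelled'),
    (4, 'D', 'Cancelled'), (5, 'E', 'Open');

DECLARE @Offset int = 1, @Limit int = 2;

-- OFFSET ... FETCH requires an ORDER BY; the filter is applied before paging.
SELECT SNo, OrderName, Status
FROM @Orders
WHERE Status = 'Cancelled'
ORDER BY SNo
OFFSET @Offset ROWS FETCH NEXT @Limit ROWS ONLY;
```

The cancelled rows are SNo 1, 3 and 4; skipping one and taking two returns SNo 3 and 4.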

filtering rows by checking a condition for group in one statement only

I have the following statement:
SELECT
(CONVERT(VARCHAR(10), f1, 120)) AS ff1,
CONVERT(VARCHAR(10), f2, 103) AS ff2,
...,
Bonus,
Malus,
ClientID
FROM
my_table
WHERE
<my_conditions>
ORDER BY
f1 ASC
This select returns several rows for each ClientID. I have to filter out all the rows with the Clients that don't have any row with non-empty Bonus or Malus.
How can I do it by changing this select by one statement only and without duplicating all this select?
I could store the result in a #temp_table, then group the data and use the result of the grouping to filter the temp table. - BUT I should do it by one statement only.
I could perform this select twice - one time grouping it and then I can filter the rows based on grouping result. BUT I don't want to select it twice.
Maybe a CTE (Common Table Expression) could be useful here: perform the select only once, then use its result both for the grouping and for selecting the desired rows based on the grouping result.
Any more elegant solution for this problem?
Thank you in advance!
Just to clarify what the SQL should do I add an example:
ClientID Bonus Malus
1 1
1
1 1
2
2
3 4
3 5
3 1
So in this case I don't want the ClientID=2 rows to appear (they are not interesting). The result should be:
ClientID Bonus Malus
1 1
1
1 1
3 4
3 5
3 1
SELECT Bonus,
Malus,
ClientID
FROM my_table
WHERE ClientID not in
(
select ClientID
from my_table
group by ClientID
having count(Bonus) = 0 and count(Malus) = 0
)
A CTE will work fine, but in effect its contents will be executed twice because they are being cloned into all the places where the CTE is being used. This can be a net performance win or loss compared to using a temp table. If the query is very expensive it might come out as a loss. If it is cheap or if many rows are being returned the temp table will lose the comparison.
Which solution is better? Look at the execution plans and measure the performance.
The CTE is the easier, more maintainable and less redundant alternative.
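A minimal sketch of the CTE approach, assuming "empty" means NULL. The table variable @my_table, the CTE name base and the sample rows are hypothetical; in the real query the CTE body would be the original SELECT with its conditions, written only once.

```sql
-- Hypothetical sample data: client 2 has no Bonus or Malus on any row.
DECLARE @my_table TABLE (ClientID int, Bonus int NULL, Malus int NULL);
INSERT INTO @my_table VALUES
    (1, 1, NULL), (1, NULL, NULL), (1, NULL, 1),
    (2, NULL, NULL), (2, NULL, NULL),
    (3, 4, NULL), (3, 5, NULL), (3, NULL, 1);

-- The CTE text appears once, even though it is referenced twice below.
WITH base AS (
    SELECT ClientID, Bonus, Malus
    FROM @my_table
)
SELECT b.ClientID, b.Bonus, b.Malus
FROM base AS b
WHERE EXISTS (
    SELECT 1
    FROM base AS x
    WHERE x.ClientID = b.ClientID
      AND (x.Bonus IS NOT NULL OR x.Malus IS NOT NULL)
)
ORDER BY b.ClientID;
```

This returns the six rows for clients 1 and 3 and drops both rows for client 2. As noted above, the engine may still evaluate the CTE body twice, so the gain is in maintainability rather than guaranteed performance.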
You haven't specified the data types of the Bonus and Malus columns. If they are integers (or can be converted to integers), the query below should help. It calculates the sum of both columns for each ClientID. These sums are the same on every detail row of the same client, so they can be used in a WHERE condition. SUM() OVER() is a window function, and window functions can't be used directly in a WHERE clause, so I had to wrap your select in an outer query.
SELECT *
FROM (
SELECT
CONVERT(VARCHAR(10), f1, 120) AS ff1,
CONVERT(VARCHAR(10), f2, 103) AS ff2,
...,
Bonus,
Malus,
ClientID,
SUM(Bonus) OVER (PARTITION BY ClientID) AS ClientBonusTotal,
SUM(Malus) OVER (PARTITION BY ClientID) AS ClientMalusTotal
FROM
my_table
WHERE
<my_conditions>
) a
WHERE ISNULL(a.ClientBonusTotal, 0) <> 0 OR ISNULL(a.ClientMalusTotal, 0) <> 0
ORDER BY a.ff1 ASC -- f1 is not exposed by the derived table; ff1 (style 120, yyyy-mm-dd) sorts in date order

cleaner way to write this sql

Sometimes when I write SQL I encounter the following situation:
select A = (
select sum(A)
--... big query using abc
),
B = (
select sum(B)
--... same big query using abc
)
from abc
Maybe it doesn't look very well, but it's the only way I can think of in some situations. So the question is: big query is repeated, perhaps there is a cleaner way to write same thing?
Clarifications: abc is a bunch of joins. using abc means using current abc row's data. big query is not the same as abc.
Outer apply will help here:
select *
from abc
outer apply (
select sum(a) as sumA, sum(b) as sumB
-- big query using abc
) sums
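A self-contained sketch of the outer apply pattern, in which the correlated "big query" is written once and both sums come from it. The table variables @abc and @detail and their rows are hypothetical stand-ins for the real joins:

```sql
-- Hypothetical sample data: id 3 has no matching detail rows.
DECLARE @abc TABLE (id int);
DECLARE @detail TABLE (abc_id int, a int, b int);
INSERT INTO @abc VALUES (1), (2), (3);
INSERT INTO @detail VALUES (1, 10, 1), (1, 20, 2), (2, 5, 7);

-- The subquery runs once per outer row and can reference that row's columns.
SELECT abc.id, sums.sumA, sums.sumB
FROM @abc AS abc
OUTER APPLY (
    SELECT SUM(d.a) AS sumA, SUM(d.b) AS sumB
    FROM @detail AS d
    WHERE d.abc_id = abc.id
) AS sums
ORDER BY abc.id;
```

This yields (1, 30, 3), (2, 5, 7) and (3, NULL, NULL). An aggregate without GROUP BY always returns one row, so every outer row is kept and unmatched ones get NULL sums.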
If the 'big query' is the same in all the subselects, can't you just do:
select sum(a), sum(b)
from abc
where ...big query
I can't be more helpful without a decent set of example data and the corresponding query.
Your query could be simplified to
SELECT sum(a) as A, sum(b) as B
FROM abc
although I suspect you've oversimplified your situation.
It's hard to say what to do without seeing the actual query and what you are trying to achieve. Here are some approaches that might be useful.
1. Use CTE or derived table for your big query
2. In some cases it can be replaced with a number of SUM(CASE WHEN [condition] THEN field END)
If A and B are fields, you can just put both sums in the query:
select sum(a), sum(b) from abc
If what you want to do is to aggregate the same rows depending on different conditions, you can often use case. Imagine you have a table TASKS with fields STATUS and EFFORT, and you want to count both ACTIVE and PASSIVE tasks, and get the total effort of each aggregate. You could do:
select
count(case when status = 'ACTIVE' then 1 end) active_nr,
sum(case when status = 'ACTIVE' then effort else 0 end) active_effort,
count(case when status = 'PASSIVE' then 1 end) passive_nr,
sum(case when status = 'PASSIVE' then effort else 0 end) passive_effort
from tasks;
This is a simple example; the predicates tested by case can be as complex as you need, involving multiple fields, etc. As a bonus, this approach will usually be nicer to the database.
select sum(A),sum(B)
--... big query using abc
from abc
No need to split it up.