How to count and group by - sql

As you can see in this image below, I need to count how many number '1' is on every column, the number '1' means that the person interviewed feels secure at Home(AP_4_01),Workplace(AP4_4_02) and so on..
Number 2 = Insecure
Number 3 = Doesn't Apply
Number 9 = Didn't Answer
+----------+----------------------+
| Columns | Numbers of persons |
+----------+----------------------+
| AP4_4_01 | 312 |
| AP4_4_02 | 232 |
| AP4_4_03 | 345 |
| AP4_4_0X | XXX |
+----------+----------------------+

You just need to use the SUM function on some case statements
SELECT
SUM(CASE WHEN AP_4_01 = 1 THEN 1 ELSE 0 END)
,SUM(CASE WHEN AP_4_02 = 1 THEN 1 ELSE 0 END)
...etc
FROM Table
To get a result set like the one in your question, you will need to use the UNPIVOT function, or you can transpose it in excel.

Related

Return the mean of a grouped value along with the mean of the top n% of that value in the same query?

I need to write one query that returns both the average value of fields in a group as well as the average of the top 33% of the values of those fields in a group.
UserId | Sequence | Value | Value2
-------|----------|-------|-------
1 | 1 | 5 | 0
1 | 2 | 10 | 15
1 | 3 | 15 | 20
1 | 4 | NULL | 25
1 | 5 | NULL | 30
1 | 6 | NULL | 60
The return needs to also contain the denominators used to calculate the means, I want to group by user and return something like this:
UserId | ValueMean | ValueDenom | ValueTopNMean | ValueTopNDenom | Value2Mean | Value2Denom | Value2TopNMean | Value2TopNDenom
-------|-----------|------------|---------------|----------------|------------|-------------|----------------|----------------
1 | 10 | 3 | 15 | 1 | 25 | 6 | 45 | 2
I've tried various window functions (NTILE, PERCENT_RANK, etc.), but what is tricky is I have multiple fields of values that will need to undergo this same operation, and the denominators for each Value field will vary (n% will stay the same, however). Please let me know if I've been unclear or you need more information.
The overall average and top value, as well as the count of non-null values, can easily be computed with aggregate functions.
As for the average and count of top N values: you can use ntile() in a subquery to identify the relevant rows first, then use that information in conditional expressions within aggregate functions in the outer query.
select
userid,
avg(value) avg_value,
count(value) cnt_value,
max(value) top_value,
avg(case when ntile_value = 1 then value end) avg_topn_value,
sum(case when ntile_value = 1 then 1 else 0 end) cnt_topn_value
from (select t.*, ntile(3) over(order by value) ntile_value from mytable t) t
group by userid

Transposing lines containing Text to columns

I have a table just like this one:
+----+---------+-------------+------------+
| ID | Period | Total Units | Source |
+----+---------+-------------+------------+
| 1 | Past | 400 | Competitor |
| 1 | Present | 250 | PAWS |
| 2 | Past | 3 | BP |
| 2 | Present | 15 | BP |
+----+---------+-------------+------------+
And I'm trying to transpose the lines into columns, so that for each ID, I have one unique line that compares past and present numbers and attributes. Like following :
+----+------------------+---------------------+-------------+----------------+
| ID | Total Units Past | Total Units Present | Source Past | Source Present |
+----+------------------+---------------------+-------------+----------------+
| 1 | 400 | 250 | Competitor | PAWS
|
| 2 | 3 | 15 | BP | BP |
+----+------------------+---------------------+-------------+----------------+
Transposing the total units is not a problem, as I use a SUM(CASE WHEN Period = Past THEN Total_Units ELSE 0 END) AS Total_Units.
However I don't know how to proceed with text columns. I've seen some pivot and unpivot clause used but they all use an aggregate function at some point.
You can do conditional aggregation :
select id,
sum(case when period = 'past' then units else 0 end) as unitspast,
sum(case when period = 'present' then units else 0 end) as unitpresent,
max(case when period = 'past' then source end) as sourcepast,
max(case when period = 'present' then source end) as sourcepresent
from table t
group by id;
Assuming you only have two rows per ID, you could also join:
Select a.ID, a.units as UnitsPast, a.source as SourcePast
, b.units as UnitsPresent, b.source as SourcePresent
from MyTable a
left join MyTable b
on a.ID = b.ID
and b.period = 'Present'
where a.period = 'Past'

SQL Grouping entries with a different value

Let's assume I have a report that displays an ID and VALUE from different tables
| ID | VALUE |
|----|-------|
1 | 1 | 1 |
2 | 1 | 0 |
3 | 1 | 1 |
4 | 2 | 0 |
5 | 2 | 0 |
My goal is to display this table with grouped IDs and VALUEs. My rule to grouping VALUEs would be "If VALUE contains atleast one '1' then display '1' otherwise display '0'".
My current SQL is (simplified)
SELECT
TABLE_A.ID,
CASE
WHEN TABLE_B.VALUE = 1 OR TABLE_C.VALUE NOT IN (0,1,2,3)
THEN 1
ELSE 0
END AS VALUE
FROM TABLE_A, TABLE_B, TABLE_C
GROUP BY
TABLE_A.ID
(CASE
WHEN TABLE_B.VALUE = 1 OR TABLE_C.VALUE NOT IN (0,1,2,3)
THEN 1
ELSE 0
END)
The output is following
| ID | VALUE |
|----|-------|
1 | 1 | 1 |
2 | 1 | 0 |
3 | 2 | 0 |
Which is half way to the output I want
| ID | VALUE |
|----|-------|
1 | 1 | 1 |
2 | 2 | 0 |
So my Question is: How do I extend my current SQL (or change it completely) to get my desired output?
If you are having only 0 and 1 as distinct values in FOREIGN_VALUE column then using max() function as mentioned by HoneyBadger in the comment will fulfill your requirement.
SELECT
ID,
MAX(FOREIGN_VALUE) AS VALUE
FROM (SELECT
ID,
CASE WHEN FOREIGN_VALUE = 1
THEN 1
ELSE 0
END AS FOREIGN_VALUE
FROM TABLE,
FOREIGN_TABLE)
GROUP BY
ID;
Assuming value is always 0 or 1, you can do:
select id, max(value) as value
from t
group by id;
If value can take on other values:
select id,
max(case when value = 1 then 1 else 0 end) as value
from t
group by id;

SQL Query: Binary Fields on Dates for Time Series

I have a table that looks like the following:
ID | Date
1 | 2010-01-01
2 | 2010-02-01
3 | 2010-02-15
2 | 2010-02-15
4 | 2010-03-01
I am having trouble creating a table with the IDs as rows, and Time periods as columns. For a given time period, I would like to place a 1 if that ID has an associated date in that period, and a 0 if not.
So the output might look like:
ID | Jan-10 | Feb-10 | Mar-10
1 | 1 | 0 | 0
2 | 0 | 1 | 0
3 | 0 | 1 | 0
4 | 0 | 0 | 1
What is the best way to accomplish this?
I have created a new table containing all distinct IDs from the table above in anticipation of a join---but I'm not sure how to handle the 1/0s.
You can do this with a simple group by and conditional aggregation:
select id,
max(case when month(t.date) = 1 and year(t.date) = 2010 then 1 else 0 end) as "Jan-10",
max(case when month(t.date) = 2 and year(t.date) = 2010 then 1 else 0 end) as "Feb-10",
max(case when month(t.date) = 3 and year(t.date) = 2010 then 1 else 0 end) as "Mar-10"
from "table" t
group by id;
Because you have so many time periods, I would generate the code for the column using formulas in Excel (or your favorite spreadsheet).

Conditionally Summing the same Column multiple times in a single select statement?

I have a single table that shows employee deployments, for various types of deployment, in a given location for each month:
ID | Location_ID | Date | NumEmployees | DeploymentType_ID
As an example, a few records might be:
1 | L1 | 12/2010 | 7 | 1 (=Permanent)
2 | L1 | 12/2010 | 2 | 2 (=Temp)
3 | L1 | 12/2010 | 1 | 3 (=Support)
4 | L1 | 01/2011 | 4 | 1
5 | L1 | 01/2011 | 2 | 2
6 | L1 | 01/2011 | 1 | 3
7 | L2 | 12/2010 | 6 | 1
8 | L2 | 01/2011 | 6 | 1
9 | L2 | 12/2010 | 3 | 2
What I need to do is sum the various types of people by date, such that the results look something like this:
Date | Total Perm | Total Temp | Total Supp
12/2010 | 13 | 5 | 1
01/2011 | 10 | 2 | 1
Currently, I've created a separate query for each deployment type that looks like this:
SELECT Date, SUM(NumEmployees) AS "Total Permanent"
FROM tblDeployment
WHERE DeploymentType_ID=1
GROUP BY Date;
We'll call that query qSumPermDeployments. Then, I'm using a couple of joins to combine the queries:
SELECT qSumPermDeployments.Date, qSumPermDeployments.["Total Permanent"] AS "Permanent"
qSumTempDeployments.["Total Temp"] AS "Temp"
qSumSupportDeployments.["Total Support"] AS Support
FROM (qSumPermDeployments LEFT JOIN qSumTempDeployments
ON qSumPermDeployments.Date = qSumTempDeployments.Date)
LEFT JOIN qSumSupportDeployments
ON qSumPermDeployments.Date = qSumSupportDeployments.Date;
Note that I'm currently constructing that final query under the assumption that a location will only have temp or support employees if they also have permanent employees. Thus, I can create the joins using the permanent employee results as the base table. Given all of the data I currently have, that assumption holds up, but ideally I'd like to move away from that assumption.
So finally, my question. Is there a way to simplify this down to a single query or is it best to separate it out into multiple queries - if for no other reason that readability.
SELECT Date,
SUM(case when DeploymentType_ID = 1 then NumEmployees else null end) AS "Total Permanent",
SUM(case when DeploymentType_ID = 2 then NumEmployees else null end) AS "Total Temp",
SUM(case when DeploymentType_ID = 3 then NumEmployees else null end) AS "Total Supp"
FROM tblDeployment
GROUP BY Date
Try this:
SELECT
Date,
SUM(CASE WHEN DeploymentType_ID=1 THEN NumEmployees ELSE 0 END) AS "Total Permanent",
SUM(CASE WHEN DeploymentType_ID=2 THEN NumEmployees ELSE 0 END) AS "Total Temporary",
SUM(CASE WHEN DeploymentType_ID=3 THEN NumEmployees ELSE 0 END) AS "Total Support"
FROM tblDeployment
GROUP BY Date;