Transposing lines containing Text to columns - sql

I have a table just like this one:
+----+---------+-------------+------------+
| ID | Period | Total Units | Source |
+----+---------+-------------+------------+
| 1 | Past | 400 | Competitor |
| 1 | Present | 250 | PAWS |
| 2 | Past | 3 | BP |
| 2 | Present | 15 | BP |
+----+---------+-------------+------------+
And I'm trying to transpose the lines into columns, so that for each ID I have one unique line that compares past and present numbers and attributes, like the following:
+----+------------------+---------------------+-------------+----------------+
| ID | Total Units Past | Total Units Present | Source Past | Source Present |
+----+------------------+---------------------+-------------+----------------+
| 1 | 400 | 250 | Competitor | PAWS |
| 2 | 3 | 15 | BP | BP |
+----+------------------+---------------------+-------------+----------------+
Transposing the total units is not a problem, as I use SUM(CASE WHEN Period = 'Past' THEN Total_Units ELSE 0 END) AS Total_Units_Past.
However, I don't know how to proceed with text columns. I've seen pivot and unpivot clauses used, but they all rely on an aggregate function at some point.

You can do conditional aggregation. MAX() works on text, so it picks the single non-NULL source in each group:
select id,
sum(case when period = 'Past' then total_units else 0 end) as unitspast,
sum(case when period = 'Present' then total_units else 0 end) as unitspresent,
max(case when period = 'Past' then source end) as sourcepast,
max(case when period = 'Present' then source end) as sourcepresent
from MyTable t
group by id;
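As a quick check, here is a minimal sketch that runs this query against the sample rows in an in-memory SQLite database through Python's sqlite3 module. The table name MyTable and the column name Total_Units are assumptions, since the question does not name the table.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
# MyTable and Total_Units are assumed names.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER, Period TEXT, Total_Units INTEGER, Source TEXT)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [
        (1, "Past", 400, "Competitor"),
        (1, "Present", 250, "PAWS"),
        (2, "Past", 3, "BP"),
        (2, "Present", 15, "BP"),
    ],
)

# SUM handles the numeric column; MAX handles the text column because the
# CASE yields NULL for the other period and MAX ignores NULLs.
rows = conn.execute(
    """
    SELECT ID,
           SUM(CASE WHEN Period = 'Past' THEN Total_Units ELSE 0 END) AS UnitsPast,
           SUM(CASE WHEN Period = 'Present' THEN Total_Units ELSE 0 END) AS UnitsPresent,
           MAX(CASE WHEN Period = 'Past' THEN Source END) AS SourcePast,
           MAX(CASE WHEN Period = 'Present' THEN Source END) AS SourcePresent
    FROM MyTable
    GROUP BY ID
    ORDER BY ID
    """
).fetchall()

for row in rows:
    print(row)
```

If an ID could have several rows for the same period, MAX would return the alphabetically last source, so this relies on one row per ID and period.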

Assuming you only have two rows per ID, you could also join:
Select a.ID, a.total_units as UnitsPast, a.source as SourcePast
, b.total_units as UnitsPresent, b.source as SourcePresent
from MyTable a
left join MyTable b
on a.ID = b.ID
and b.period = 'Present'
where a.period = 'Past'

Related

Calculating percentage by group using Teradata

I'm trying to create a table that displays the percentage of counts per state dependent on the indicator.
Here's an example of the dataset I'm using to create my new table.
+-------------+-------+-------+
| Indicator | State | Count |
+-------------+-------+-------+
| Registered | CA | 25 |
| Registered | FL | 12 |
| Total | CA | 50 |
| Total | FL | 36 |
+-------------+-------+-------+
I'm trying to create a new table that would have a Percentage for each corresponding row like this:
+-------------+-------+-------+------------+
| Indicator | State | Count | Percentage |
+-------------+-------+-------+------------+
| Registered | CA | 25 | 50 |
| Registered | FL | 12 | 33.3 |
| Total | CA | 50 | . |
| Total | FL | 36 | . |
+-------------+-------+-------+------------+
So far, I've tried the query below:
select indicator, state, count
, case when (select count from table where indicator='Registered') * 100 / (select count from table where indicator='Total')
when indicator = 'Total' then . end as Percentage
from table;
This doesn't work because I get an error: "Subquery evaluated more than one row." I'm guessing it's because I'm not taking the state into account in the case when statement, but I'm not sure how I would go about that.
What would be the best way to do this?
Just join the table back with itself.
select a.indicator, a.state, a.count
, case when a.indicator = 'Total' then null
else 100.0 * a.count / b.count
end as Percentage
from table a
inner join (select state, count from table where indicator = 'Total') b
on a.state = b.state
;
You can use window functions:
select t.*,
(case when indicator <> 'Total'
then count * 100.0 / sum(case when indicator = 'Total' then count end) over (partition by state)
end) as percentage
from t;
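To see the window-function version produce the percentages from the question, here is a small sketch using SQLite through Python's sqlite3 module rather than Teradata. It assumes the table is called t, quotes the count column to keep it distinct from the COUNT function, needs SQLite 3.25 or later for window functions, and adds ROUND only to match the one-decimal output shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "count" is quoted so it reads as a column name, not the aggregate function.
conn.execute('CREATE TABLE t (indicator TEXT, state TEXT, "count" INTEGER)')
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("Registered", "CA", 25),
        ("Registered", "FL", 12),
        ("Total", "CA", 50),
        ("Total", "FL", 36),
    ],
)

# For every row, the windowed SUM picks up the 'Total' count of the same state.
# Rows whose indicator is 'Total' fall through the CASE and get NULL.
rows = conn.execute(
    """
    SELECT indicator, state, "count",
           CASE WHEN indicator <> 'Total'
                THEN ROUND("count" * 100.0
                           / SUM(CASE WHEN indicator = 'Total' THEN "count" END)
                             OVER (PARTITION BY state), 1)
           END AS percentage
    FROM t
    ORDER BY indicator, state
    """
).fetchall()

for row in rows:
    print(row)
```

Multiplying by 100.0 rather than 100 matters here: with integer arithmetic, 12 * 100 / 36 would be truncated to 33.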

One SQL query with multiple conditions

I am running an Oracle database and have two tables below.
#account
+--------+------------+------------+
| acc_id | date | acc_type |
+--------+------------+------------+
| 1 | 11-07-2018 | customer |
| 2 | 01-11-2018 | customer |
| 3 | 02-09-2018 | employee |
| 4 | 01-09-2018 | customer |
+--------+------------+------------+
#credit_request
+------------+-------------+-------------+--------+---------------+
| credit_id | date | credit_type | acc_id | credit_amount |
+------------+-------------+-------------+--------+---------------+
| 1112 | 01-08-2018 | failed | 1 | 2200 |
| 1214 | 02-12-2018 | success | 2 | 1500 |
| 1312 | 03-11-2018 | success | 4 | 8750 |
| 1468 | 01-12-2018 | failed | 2 | 3500 |
+------------+-------------+-------------+--------+---------------+
I want the following for each customer:
the last successful credit_request
the sum of credit_amount over all failed credit_requests
Here is one method:
select a.acc_id, acr.num_fails, acr.failed_amount,
acr.num_successes / nullif(acr.num_fails, 0) as ratio, -- seems weird. Why not just the failure rate?
last_cr.credit_id, last_cr.date, last_cr.credit_amount
from account a left join
(select acc_id,
sum(case when credit_type = 'failed' then 1 else 0 end) as num_fails,
sum(case when credit_type = 'failed' then credit_amount else 0 end) as failed_amount,
sum(case when credit_type = 'success' then 1 else 0 end) as num_successes,
max(case when credit_type = 'success' then date end) as max_success_date
from credit_request
group by acc_id
) acr
on acr.acc_id = a.acc_id left join
credit_request last_cr
on last_cr.acc_id = acr.acc_id and last_cr.date = acr.max_success_date
and last_cr.credit_type = 'success';
The following query should do the trick.
SELECT
acc_id,
MAX(CASE WHEN credit_type = 'success' AND rn = 1 THEN credit_id END) as last_successfull_credit_id,
MAX(CASE WHEN credit_type = 'success' AND rn = 1 THEN cdate END) as last_successfull_credit_date,
MAX(CASE WHEN credit_type = 'success' AND rn = 1 THEN credit_amount END) as last_successfull_credit_amount,
SUM(CASE WHEN credit_type = 'failed' THEN credit_amount ELSE 0 END) total_amount_of_failed_credit,
SUM(CASE WHEN credit_type = 'failed' THEN 1 ELSE 0 END) / COUNT(*) ratio_success_request
FROM (
SELECT
a.acc_id,
a.cdate adate,
a.acc_type,
c.credit_id,
c.cdate,
c.credit_type,
c.credit_amount,
ROW_NUMBER() OVER(PARTITION BY c.acc_id, c.credit_type ORDER BY c.cdate DESC) rn
FROM
account a
LEFT JOIN credit_request c ON c.acc_id = a.acc_id
) x
GROUP BY acc_id
ORDER BY acc_id
The subquery assigns a sequence number to each record, within groups of accounts and credit types, using ROW_NUMBER(). The outer query then uses conditional aggregation to compute the values you asked for.
This Db Fiddle demo with your test data returns:
ACC_ID | LAST_SUCCESSFULL_CREDIT_ID | LAST_SUCCESSFULL_CREDIT_DATE | LAST_SUCCESSFULL_CREDIT_AMOUNT | TOTAL_AMOUNT_OF_FAILED_CREDIT | RATIO_SUCCESS_REQUEST
-----: | -------------------------: | :--------------------------- | -----------------------------: | ----------------------------: | --------------------:
1 | null | null | null | 2200 | 1
2 | 1214 | 02-DEC-18 | 1500 | 3500 | .5
3 | null | null | null | 0 | 0
4 | 1312 | 03-NOV-18 | 8750 | 0 | 0
This might be what you are looking for... Since you did not show expected results, this might not be 100% accurate, feel free to adapt this.
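For anyone who wants to try the ROW_NUMBER() approach without an Oracle instance, here is a minimal sketch in SQLite via Python's sqlite3 module (SQLite 3.25 or later). It is an adaptation under assumptions: the date column is named cdate as in the query above, and dates are stored as ISO strings (reading the sample as DD-MM-YYYY) so that they sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_id INTEGER, cdate TEXT, acc_type TEXT)")
conn.execute(
    "CREATE TABLE credit_request "
    "(credit_id INTEGER, cdate TEXT, credit_type TEXT, acc_id INTEGER, credit_amount INTEGER)"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [
        (1, "2018-07-11", "customer"),
        (2, "2018-11-01", "customer"),
        (3, "2018-09-02", "employee"),
        (4, "2018-09-01", "customer"),
    ],
)
conn.executemany(
    "INSERT INTO credit_request VALUES (?, ?, ?, ?, ?)",
    [
        (1112, "2018-08-01", "failed", 1, 2200),
        (1214, "2018-12-02", "success", 2, 1500),
        (1312, "2018-11-03", "success", 4, 8750),
        (1468, "2018-12-01", "failed", 2, 3500),
    ],
)

# rn = 1 marks the most recent request per account and credit type;
# the outer query keeps only that row for 'success' and sums all 'failed' rows.
rows = conn.execute(
    """
    SELECT acc_id,
           MAX(CASE WHEN credit_type = 'success' AND rn = 1 THEN credit_id END) AS last_success_id,
           MAX(CASE WHEN credit_type = 'success' AND rn = 1 THEN cdate END) AS last_success_date,
           SUM(CASE WHEN credit_type = 'failed' THEN credit_amount ELSE 0 END) AS failed_total
    FROM (
        SELECT a.acc_id, c.credit_id, c.cdate, c.credit_type, c.credit_amount,
               ROW_NUMBER() OVER (PARTITION BY c.acc_id, c.credit_type
                                  ORDER BY c.cdate DESC) AS rn
        FROM account a
        LEFT JOIN credit_request c ON c.acc_id = a.acc_id
    ) x
    GROUP BY acc_id
    ORDER BY acc_id
    """
).fetchall()

for row in rows:
    print(row)
```

Accounts without any request still appear, because the LEFT JOIN keeps them and the ELSE 0 branch turns their missing amounts into a zero total.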
I think the query below is easy to understand and implement. To keep the CASE expressions from growing, you can also move the repeated logic into a WITH clause and reference it from there.
SELECT a.acc_id,
MAX(CASE WHEN c.credit_type = 'success' THEN c.date END) AS last_success_date,
SUM(CASE WHEN c.credit_type = 'failed' THEN c.credit_amount ELSE 0 END) AS failed_amount,
COUNT(CASE WHEN c.credit_type = 'success' THEN 1 END)
/ NULLIF(COUNT(CASE WHEN c.credit_type = 'failed' THEN 1 END), 0) AS success_to_failed_ratio
FROM account a LEFT JOIN
credit_request c ON
a.acc_id = c.acc_id
WHERE a.acc_type = 'customer'
GROUP BY a.acc_id

SQL Group rows for every ID using left outer join

I have a table with almost a million claim records for 6 different conditions such as Diabetes, Hypertension, Heart Failure, etc. Every member has a number of claims, each with a condition such as Diabetes or Hypertension. My goal is to get one row per member showing the number of claims for each condition.
Existing table
+--------------+---------------+------+------------+
| Conditions | ConditionCode | ID | Member_Key |
+--------------+---------------+------+------------+
| DM | 3001 | 1212 | A1528 |
| HTN | 5001 | 1213 | A1528 |
| COPD | 6001 | 1214 | A1528 |
| DM | 3001 | 1215 | A1528 |
| CAD | 8001 | 1823 | B4354 |
| HTN | 5001 | 3458 | B4354 |
+--------------+---------------+------+------------+
Desired Result
+------------+------+-----+----+----+-----+-----+
| Member_Key | COPD | CAD | DM | HF | CHF | HTN |
+------------+------+-----+----+----+-----+-----+
| A1528 | 1 | | 2 | | | 1 |
| B4354 | | 1 | | | | 1 |
+------------+------+-----+----+----+-----+-----+
Query
select distinct tr.Member_Key,C.COPD,D.CAD,DM.DM,HF.HF,CHF.CHF,HTN.HTN
FROM myTable tr
--COPD
left outer join (select Member_Key,'X' as COPD
FROM myTable
where ConditionCode=6001) C
on C.Member_Key=tr.Member_Key
--CAD
left outer join ( ....
For now I'm just using 'X', but I'm trying to get the number of claims in place of X for each condition. I don't think using a left outer join is efficient when you are searching 1 million rows and doing a distinct. Do you have any other approach to solving this?
You don't want so many sub-queries; this is easy with GROUP BY and CASE expressions:
SELECT Member_Key,
SUM(CASE WHEN ConditionCode=6001 THEN 1 ELSE 0 END) AS COPD,
SUM(CASE WHEN ConditionCode=3001 THEN 1 ELSE 0 END) AS DM,
SUM(CASE WHEN ConditionCode=5001 THEN 1 ELSE 0 END) AS HTN,
SUM(CASE WHEN ConditionCode=8001 THEN 1 ELSE 0 END) AS CAD
FROM myTable
GROUP BY Member_Key
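Since every column follows the same pattern, the query can also be generated from a code-to-name mapping. The sketch below does that in Python and runs the result against the sample rows in SQLite. The helper build_query and the CODES dictionary are hypothetical names, the column ConditionCode is taken from the example table, and only the four codes visible in the question are included because the codes for HF and CHF are not shown.

```python
import sqlite3

# Codes visible in the example table; HF and CHF are omitted because
# the question does not show their codes.
CODES = {"COPD": 6001, "CAD": 8001, "DM": 3001, "HTN": 5001}


def build_query(codes):
    """Build one SUM(CASE ...) column per condition (hypothetical helper).

    The names come from a trusted, hard-coded dictionary; never interpolate
    user input into SQL this way.
    """
    columns = ",\n       ".join(
        f"SUM(CASE WHEN ConditionCode = {int(code)} THEN 1 ELSE 0 END) AS {name}"
        for name, code in codes.items()
    )
    return (
        f"SELECT Member_Key,\n       {columns}\n"
        "FROM myTable\nGROUP BY Member_Key\nORDER BY Member_Key"
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (Conditions TEXT, ConditionCode INTEGER, ID INTEGER, Member_Key TEXT)"
)
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?)",
    [
        ("DM", 3001, 1212, "A1528"),
        ("HTN", 5001, 1213, "A1528"),
        ("COPD", 6001, 1214, "A1528"),
        ("DM", 3001, 1215, "A1528"),
        ("CAD", 8001, 1823, "B4354"),
        ("HTN", 5001, 3458, "B4354"),
    ],
)

query = build_query(CODES)
rows = conn.execute(query).fetchall()
print(query)
for row in rows:
    print(row)  # columns: Member_Key, COPD, CAD, DM, HTN
```

The table is scanned once regardless of how many conditions are listed, which is the main advantage over one left join per condition.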
This is an ideal situation for CASE expressions:
SELECT tr.Member_Key,
SUM(CASE WHEN ConditionCode=6001 THEN 1 ELSE 0 END) as COPD,
SUM(CASE WHEN ConditionCode=6002 THEN 1 ELSE 0 END) as OtherIssue,
SUM(CASE etc.)
FROM myTable tr
GROUP BY tr.Member_Key
This should be done with a PIVOT, like:
SELECT *
FROM
(SELECT conditions, member_key
FROM t) src
PIVOT
(COUNT (conditions)
for conditions in ([COPD], [CAD], [DM], [HF], [CHF], [HTN])) pvt

Combine two rows with just 2 different column values into one

I have a table that keeps records of Workplace Accidents. Each IncidentNo is supposed to be unique, but the organization provided me the data in such a way that each IncidentNo has two entries, one for each gender. The format of the table is something like this. (The table has around 110 columns; I only show the ones related to the question. All of the columns are varchar.)
+------------+--------+----------------+
| IncidentNo | Gender | PersonnelCount |
+------------+--------+----------------+
| 123456 | M | 150 |
| 123456 | F | 100 |
| 789012 | M | 31 |
| 789012 | F | 42 |
+------------+--------+----------------+
What I need is to combine these rows so that the table looks something like this (it doesn't matter whether the result stays in the same table or is inserted into a new one):
+------------+----------------------+--------------------+
| IncidentNo | FemalePersonnelCount | MalePersonnelCount |
+------------+----------------------+--------------------+
| 123456 | 100 | 150 |
| 789012 | 42 | 31 |
+------------+----------------------+--------------------+
I thought to use Left Outer Join to insert data to a new table but couldn't figure out how.
Just use conditional aggregation:
select incidentno, sum(PersonnelCount) as total_PersonnelCount,
sum(case when gender = 'M' then PersonnelCount else 0 end) as nummales,
sum(case when gender = 'F' then PersonnelCount else 0 end) as numfemales
from t
group by incidentno;
I would calculate the total as well, just to be sure that the total matches the sum of 'M' and 'F'.
In its rawest form, you can self join. The base set has to hold one row per incident, otherwise each count is summed once per base row:
select x.IncidentNo,
sum(a.PersonnelCount) as Female,
sum(b.PersonnelCount) as Male
from (select distinct IncidentNo from Accidents) x
left join Accidents a
on a.IncidentNo = x.IncidentNo
and a.Gender = 'F'
left join Accidents b
on b.IncidentNo = x.IncidentNo
and b.Gender = 'M'
group by x.IncidentNo -- I left this out originally because I'm an idiot
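One thing worth checking with the self-join approach is how many rows the base alias x contributes per incident. The sketch below, run in SQLite through Python's sqlite3 module, compares a base of the full table with a base of distinct incident numbers. It stores PersonnelCount as an integer for simplicity, although the question says all columns are varchar, and the table name Accidents follows the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# PersonnelCount is INTEGER here for simplicity (varchar in the question).
conn.execute("CREATE TABLE Accidents (IncidentNo TEXT, Gender TEXT, PersonnelCount INTEGER)")
conn.executemany(
    "INSERT INTO Accidents VALUES (?, ?, ?)",
    [
        ("123456", "M", 150),
        ("123456", "F", 100),
        ("789012", "M", 31),
        ("789012", "F", 42),
    ],
)

JOINS = """
    LEFT JOIN Accidents a ON a.IncidentNo = x.IncidentNo AND a.Gender = 'F'
    LEFT JOIN Accidents b ON b.IncidentNo = x.IncidentNo AND b.Gender = 'M'
    GROUP BY x.IncidentNo
    ORDER BY x.IncidentNo
"""
SELECT = "SELECT x.IncidentNo, SUM(a.PersonnelCount), SUM(b.PersonnelCount) "

# Base is the full table: two rows per incident, so every sum is doubled.
doubled = conn.execute(SELECT + "FROM Accidents x" + JOINS).fetchall()

# Base is one row per incident: the sums match the source data.
correct = conn.execute(
    SELECT + "FROM (SELECT DISTINCT IncidentNo FROM Accidents) x" + JOINS
).fetchall()

print("full table as base:      ", doubled)
print("distinct incidents as base:", correct)
```

The conditional aggregation query avoids this issue entirely because it never joins the table to itself.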

Conditionally Summing the same Column multiple times in a single select statement?

I have a single table that shows employee deployments, for various types of deployment, in a given location for each month:
ID | Location_ID | Date | NumEmployees | DeploymentType_ID
As an example, a few records might be:
1 | L1 | 12/2010 | 7 | 1 (=Permanent)
2 | L1 | 12/2010 | 2 | 2 (=Temp)
3 | L1 | 12/2010 | 1 | 3 (=Support)
4 | L1 | 01/2011 | 4 | 1
5 | L1 | 01/2011 | 2 | 2
6 | L1 | 01/2011 | 1 | 3
7 | L2 | 12/2010 | 6 | 1
8 | L2 | 01/2011 | 6 | 1
9 | L2 | 12/2010 | 3 | 2
What I need to do is sum the various types of people by date, such that the results look something like this:
Date | Total Perm | Total Temp | Total Supp
12/2010 | 13 | 5 | 1
01/2011 | 10 | 2 | 1
Currently, I've created a separate query for each deployment type that looks like this:
SELECT Date, SUM(NumEmployees) AS "Total Permanent"
FROM tblDeployment
WHERE DeploymentType_ID=1
GROUP BY Date;
We'll call that query qSumPermDeployments. Then, I'm using a couple of joins to combine the queries:
SELECT qSumPermDeployments.Date, qSumPermDeployments.["Total Permanent"] AS "Permanent",
qSumTempDeployments.["Total Temp"] AS "Temp",
qSumSupportDeployments.["Total Support"] AS Support
FROM (qSumPermDeployments LEFT JOIN qSumTempDeployments
ON qSumPermDeployments.Date = qSumTempDeployments.Date)
LEFT JOIN qSumSupportDeployments
ON qSumPermDeployments.Date = qSumSupportDeployments.Date;
Note that I'm currently constructing that final query under the assumption that a location will only have temp or support employees if they also have permanent employees. Thus, I can create the joins using the permanent employee results as the base table. Given all of the data I currently have, that assumption holds up, but ideally I'd like to move away from that assumption.
So finally, my question: is there a way to simplify this down to a single query, or is it best to keep it separated into multiple queries, if for no other reason than readability?
SELECT Date,
SUM(case when DeploymentType_ID = 1 then NumEmployees else null end) AS "Total Permanent",
SUM(case when DeploymentType_ID = 2 then NumEmployees else null end) AS "Total Temp",
SUM(case when DeploymentType_ID = 3 then NumEmployees else null end) AS "Total Supp"
FROM tblDeployment
GROUP BY Date
Try this:
SELECT
Date,
SUM(CASE WHEN DeploymentType_ID=1 THEN NumEmployees ELSE 0 END) AS "Total Permanent",
SUM(CASE WHEN DeploymentType_ID=2 THEN NumEmployees ELSE 0 END) AS "Total Temporary",
SUM(CASE WHEN DeploymentType_ID=3 THEN NumEmployees ELSE 0 END) AS "Total Support"
FROM tblDeployment
GROUP BY Date;
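The two answers differ only in the ELSE branch, and that shows up when a group has no rows of a given deployment type. The sketch below runs both variants in SQLite through Python's sqlite3 module on the sample rows. The WHERE Location_ID = 'L2' filter is an addition for illustration, chosen because L2 has no support staff in the sample data, and the helper name totals is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblDeployment "
    "(ID INTEGER, Location_ID TEXT, Date TEXT, NumEmployees INTEGER, DeploymentType_ID INTEGER)"
)
conn.executemany(
    "INSERT INTO tblDeployment VALUES (?, ?, ?, ?, ?)",
    [
        (1, "L1", "12/2010", 7, 1),
        (2, "L1", "12/2010", 2, 2),
        (3, "L1", "12/2010", 1, 3),
        (4, "L1", "01/2011", 4, 1),
        (5, "L1", "01/2011", 2, 2),
        (6, "L1", "01/2011", 1, 3),
        (7, "L2", "12/2010", 6, 1),
        (8, "L2", "01/2011", 6, 1),
        (9, "L2", "12/2010", 3, 2),
    ],
)


def totals(else_value):
    """Run the conditional sums for location L2 with the given ELSE literal.

    else_value must be the literal "NULL" or "0"; it is never user input.
    """
    sql = f"""
        SELECT Date,
               SUM(CASE WHEN DeploymentType_ID = 1 THEN NumEmployees ELSE {else_value} END),
               SUM(CASE WHEN DeploymentType_ID = 2 THEN NumEmployees ELSE {else_value} END),
               SUM(CASE WHEN DeploymentType_ID = 3 THEN NumEmployees ELSE {else_value} END)
        FROM tblDeployment
        WHERE Location_ID = 'L2'
        GROUP BY Date
        ORDER BY Date
    """
    return conn.execute(sql).fetchall()


with_null = totals("NULL")  # missing types come back as NULL (None in Python)
with_zero = totals("0")     # missing types come back as 0

print(with_null)
print(with_zero)
```

Which variant is preferable depends on whether a report should distinguish "no rows of this type" from "zero employees".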