I need help building a pivot query from a data set that looks like this:
1 indicates the employee spoke with the contact or the location, and 0 indicates they haven't spoken with them.
I want to calculate the % of contacts spoken to and the % of locations spoken to, by employee and then by manager, so the table would look like this:
Any ideas on how to pivot this so it calculates the percentage for each employee's assigned contacts and then the percentage for each manager's employee's assigned contacts?
You don't need a pivot if you only have two categories. Try case statements:
Select employee as [Name]
,count(case when ContactType = 'Individual' and SpokeTo = 1 then LocationName end) * 100.0/ NULLIF(count(case when ContactType = 'Individual' then LocationName end), 0) as IndividualContacts
,count(case when ContactType = 'Location' and SpokeTo = 1 then LocationName end) * 100.0/ NULLIF(count(case when ContactType = 'Location' then LocationName end), 0) as LocationContacts
from MyTable
group by Employee
Note the *100.0 is important to avoid integer division (or you can cast explicitly to decimal). The NULLIF is optional unless you have some employees that were not assigned any contacts of one type or another - then you must include it to avoid division by 0 errors.
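To see the integer-division and NULLIF points in action, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and the sample rows are assumptions made up for illustration; only the column names come from the query above. Note that SQLite returns NULL on division by zero instead of raising an error, so the NULLIF only changes behaviour on engines such as SQL Server, but the resulting NULL looks the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one row per assigned contact.
conn.execute(
    "CREATE TABLE MyTable (Employee TEXT, ContactType TEXT, "
    "LocationName TEXT, SpokeTo INTEGER)"
)
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?)", [
    ("Ann", "Individual", "A1", 1),
    ("Ann", "Individual", "A2", 0),
    ("Ann", "Location", "A3", 1),
    ("Bob", "Individual", "B1", 1),  # Bob has no Location contacts at all
])

# Integer division truncates; multiplying by 100.0 forces a decimal result.
int_div = conn.execute("SELECT 1 / 2, 1 * 100.0 / 2").fetchone()

query = """
SELECT Employee,
       COUNT(CASE WHEN ContactType = 'Individual' AND SpokeTo = 1 THEN LocationName END) * 100.0
         / NULLIF(COUNT(CASE WHEN ContactType = 'Individual' THEN LocationName END), 0)
         AS IndividualContacts,
       COUNT(CASE WHEN ContactType = 'Location' AND SpokeTo = 1 THEN LocationName END) * 100.0
         / NULLIF(COUNT(CASE WHEN ContactType = 'Location' THEN LocationName END), 0)
         AS LocationContacts
FROM MyTable
GROUP BY Employee
ORDER BY Employee
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Bob's LocationContacts comes back as NULL (None in Python) because the denominator is turned into NULL rather than 0.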
I want to produce a report that lists all the observation dates and, for each date, lists the total number of people fully vaccinated in each one of the 4 countries used in this assignment.
people_fully_vaccinated needs to be inner joined with location.
The first image is the required format.
The second image contains the data:
You can use:
select date,
sum(case when country = 'country 1' then vacinated_count else 0 end) as country1count,
sum(case when country = 'country 2' then vacinated_count else 0 end) as country2count,
...
...
from table
group by date
This assumes you already have the vaccinated people count for each country in a table.
I have a table in PostgreSQL that contains demographic data for each province of my country.
Columns are: Province_name, professions, Number_of_people.
As you can see, Province_names are repeated for each profession.
How then can I get the province names not repeated and instead get the professions in separate columns?
It sounds like you want to pivot your table (Really: It is better to show data and expected output in your question!)
This is the PostgreSQL way (since 9.4) to do that, using the FILTER clause:
SELECT
province,
SUM(people) FILTER (WHERE profession = 'teacher') AS teacher,
SUM(people) FILTER (WHERE profession = 'banker') AS banker,
SUM(people) FILTER (WHERE profession = 'supervillian') AS supervillian
FROM mytable
GROUP BY province
If you want a more portable approach, you can use a CASE expression:
SELECT
province,
SUM(CASE WHEN profession = 'teacher' THEN people ELSE 0 END) AS teacher,
SUM(CASE WHEN profession = 'banker' THEN people ELSE 0 END) AS banker,
SUM(CASE WHEN profession = 'supervillian' THEN people ELSE 0 END) AS supervillian
FROM mytable
GROUP BY province
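The two queries above differ in one detail: when a province has no row for a profession, SUM with FILTER aggregates nothing and yields NULL, while CASE ... ELSE 0 yields 0. The sketch below shows this with Python's sqlite3 module; the sample rows are invented, and SQLite is used only as a stand-in for PostgreSQL (aggregate FILTER needs SQLite 3.30 or later, so the FILTER query is skipped on older builds).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (province TEXT, profession TEXT, people INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", [
    ("North", "teacher", 10),
    ("North", "banker", 5),
    ("South", "teacher", 7),  # South has no banker row
])

case_sql = """
SELECT province,
       SUM(CASE WHEN profession = 'teacher' THEN people ELSE 0 END) AS teacher,
       SUM(CASE WHEN profession = 'banker' THEN people ELSE 0 END) AS banker
FROM mytable
GROUP BY province
ORDER BY province
"""
case_rows = conn.execute(case_sql).fetchall()

filter_sql = """
SELECT province,
       SUM(people) FILTER (WHERE profession = 'teacher') AS teacher,
       SUM(people) FILTER (WHERE profession = 'banker') AS banker
FROM mytable
GROUP BY province
ORDER BY province
"""
# Aggregate FILTER is only available in SQLite 3.30+.
supports_filter = sqlite3.sqlite_version_info >= (3, 30, 0)
filter_rows = conn.execute(filter_sql).fetchall() if supports_filter else None

print(case_rows)
print(filter_rows)
```

If zeros are preferred in the FILTER version, wrapping each aggregate in COALESCE(..., 0) should give the same output as the CASE version.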
What you want is a pivot, which is a little more complicated in PostgreSQL than in other RDBMSs. You can use the crosstab function, which is provided by the tablefunc extension. Find an introduction here: https://www.vertabelo.com/blog/technical-articles/creating-pivot-tables-in-postgresql-using-the-crosstab-function
For you it would look something like this:
SELECT *
FROM crosstab( 'select Province_name, professions, Number_of_people from table1 order by 1,2')
AS final_result(Province_name TEXT, data_scientist NUMERIC,data_engineer NUMERIC,data_architect NUMERIC,student NUMERIC);
I'm fairly new to Oracle SQL and have been tasked with creating a crosstab report using SQL. I have a single source table with a simple structure similar to this:
I'm trying to crosstab the results to show the total staff per state as follows:
Ideally I'd like to dynamically cater for offices opening in new states.
I've been looking into PIVOTS but can't seem to get my head around it. Any guidance would be gratefully received.
Thank you.
In a PIVOT you can start from a source sub-query.
Then you define what field to aggregate for which titles in another field.
SELECT *
FROM
(
    SELECT Company, State, Staff
    FROM YourCompanyStaffTable
    WHERE State IN ('Illinois', 'Texas', 'Tennessee', 'Missouri', 'Kansas', 'Indiana')
) src
PIVOT (
    SUM(Staff)
    FOR State IN (
        'Illinois' as Illinois,
        'Texas' as Texas,
        'Tennessee' as Tennessee,
        'Missouri' as Missouri,
        'Kansas' as Kansas,
        'Indiana' as Indiana
    )
) pvt
ORDER BY Company
In this query, the new column names are generated from the "State" column.
Note that in the source query there's also a limit on those names.
That's just for efficiency reasons. (less data to pull from the table)
And it'll group the results by the source fields that aren't used in the PIVOT declaration.
In this case it automatically groups on the "Company" column.
So it sums the total "Staff" for each "State" per "Company".
Use CASE WHEN:
select company, sum(case when state='Illinois' then staff else 0 end) as Illinois,
sum(case when state='Texas' then staff else 0 end) as Texas,
sum(case when state='Tennessee' then staff else 0 end) as Tennessee,
sum(case when state='Missouri' then staff else 0 end) as Missouri,
sum(case when state='Kansas' then staff else 0 end) as kansas,
sum(case when state='Indiana' then staff else 0 end) as Indiana from t
group by company
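Neither query above handles offices opening in new states, because the column list is fixed in the SQL text. One possible workaround, sketched here in Python with sqlite3 rather than Oracle, is to read the distinct states first and generate the CASE columns from them. The table name t and the columns come from the query above; the sample rows and the helper variable names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (company TEXT, state TEXT, staff INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    ("Acme", "Texas", 10),
    ("Acme", "Texas", 5),
    ("Acme", "Kansas", 3),
    ("Globex", "Ohio", 8),
])

# Step 1: discover the states that currently exist in the data.
states = [r[0] for r in conn.execute("SELECT DISTINCT state FROM t ORDER BY state")]

# Step 2: build one conditional SUM per state. Values are bound as
# parameters; the alias is a quoted identifier with embedded quotes doubled.
columns = ", ".join(
    'SUM(CASE WHEN state = ? THEN staff ELSE 0 END) AS "{}"'.format(s.replace('"', '""'))
    for s in states
)
sql = f"SELECT company, {columns} FROM t GROUP BY company ORDER BY company"

cur = conn.execute(sql, states)
header = [d[0] for d in cur.description]
rows = cur.fetchall()

print(header)
for row in rows:
    print(row)
```

The same two-step idea (query the distinct values, then assemble the statement) would apply to dynamic SQL in Oracle, though the details differ.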
Running Oracle 12.1. I have a Line Items table. Its structure is fixed, and I cannot change it. I need to build a dashboard style page of information of the Line items table for a person to look at their sales territory. This person might be a GVP, who owns a large territory, or a Manager, or an individual rep. The Line Items table is pretty de-normalized, as this copy is part of a DW. This ‘copy’ of the table is only updated every 2 weeks, and it looks like this.
Line_Item_ID // PK
Account_ID //
Company_Name // The legal name of the Headquarters
LOB_Name // Line of business, aka Division within the Company_Name
Account_Type // One of 2 values, 'NAMED' or 'GENERAL'
ADG_STATUS // 3 possible values, 'A', 'D' or 'G'
Industry // One of 15 values, for this example assume it is ONLY 'MFG', 'GOV', 'HEALTHCARE'
// Now have the sales hierarchy of the rep who sold this
GVP // Group Vice President
SVP // Sales Vice President
RVP // Regional Vice President
RM // Regional Manager
REP // Sales Rep
// Now have information about the product sold
ProductName
ProductPrice
VariousOtherFields….
I need to make an aggregated table which will be used for quick access of the dashboard. It will have counts of various combinations, and there will be one row per PERSON, not account. A person is every UNIQUE person listed in any of the GVP, SVP, RVP, RM or REP fields. Here is what the end result table will look like. Other than PERSON, every column is based on a DISTINCT count, and it is an integer value.
PERSON
TOTAL_COMPANIES // For this person, count of DISTINCT COMPANY_NAME
TOTAL_LOBS // For this person, count of DISTINCT LOBS
TOTAL_COMPANIES_NAMED // count of DISTINCT COMPANY_NAME with ACCOUNT_TYPE='NAMED'
TOTAL_COMPANIES_GENERAL // count of DISTINCT COMPANY_NAME with ACCOUNT_TYPE='GENERAL'
TOTAL_LOBS_NAMED // count of DISTINCT LOB_NAME with ACCOUNT_TYPE='NAMED'
TOTAL_LOBS_GENERAL // count of DISTINCT LOB_NAME with ACCOUNT_TYPE='GENERAL'
TOTAL_COMPANIES_STATUS_A // count of DISTINCT COMPANY_NAME with ADG_STATUS='A'
TOTAL_COMPANIES_STATUS_D // count of DISTINCT COMPANY_NAME with ADG_STATUS='D'
TOTAL_COMPANIES_STATUS_G // count of DISTINCT COMPANY_NAME with ADG_STATUS='G'
TOTAL_LOB_STATUS_A // count of DISTINCT LOB_NAME with ADG_STATUS='A'
TOTAL_LOB_STATUS_D // count of DISTINCT LOB_NAME with ADG_STATUS='D'
TOTAL_LOB_STATUS_G // count of DISTINCT LOB_NAME with ADG_STATUS='G'
// Now various Industry permutations. I have 15 different industries, but only showing 2. This will only be at the COMPANY_NAME level, not the LOB_NAME level
MFG_COMPANIES_STATUS_A // count of DISTINCT COMPANY_NAME with ADG_STATUS='A' and Industry = 'MFG'
MFG_COMPANIES_STATUS_D // count of DISTINCT COMPANY_NAME with ADG_STATUS='D' and Industry = 'MFG'
MFG_COMPANIES_STATUS_G // count of DISTINCT COMPANY_NAME with ADG_STATUS='G' and Industry = 'MFG'
GOV_COMPANIES_STATUS_A // count of DISTINCT COMPANY_NAME with ADG_STATUS='A' and Industry = 'GOV'
GOV_COMPANIES_STATUS_D // count of DISTINCT COMPANY_NAME with ADG_STATUS='D' and Industry = 'GOV'
GOV_COMPANIES_STATUS_G // count of DISTINCT COMPANY_NAME with ADG_STATUS='G' and Industry = 'GOV'
There are approx. 400 people, 35000 unique accounts, and 200,000 entries in the line items table.
So what is my strategy? I have thought about making another table of unique PERSON values, and using it as a driving table. Let’s call this table PERSON_LIST.
Pseudo-code…
For each entry in PERSON_LIST
For all LINE_ITEMS where person_list in ANY(GVP, SVP, RVP, RM, REP) do
Calculations…
This would be an incredibly long running process…
How can I do this more effectively (set based as opposed to row by row)? I believe I would have to use the PIVOT operator for the INDUSTRY list, but can I use PIVOT with additional criteria? Aka count of distinct COMPANY with a specific industry and a specific ADG_STATUS?
Any ideas or SQL code most appreciated.
You could unpivot the original data to get the data from the original GVP etc. columns into one 'person' column:
select * from line_items
unpivot (person for role in (gvp as 'GVP', svp as 'SVP', rvp as 'RVP',
rm as 'RM', rep as 'REP'))
And then use that as a CTE or inline view, with pretty much what you showed; conditional aggregation using case expressions, something like:
select person,
count(distinct company_name) as total_companies,
count(distinct lob_name) as total_lobs,
count(distinct case when account_type='NAMED' then company_name end)
as total_companies_named,
count(distinct case when account_type='GENERAL' then company_name end)
as total_companies_general,
count(distinct case when account_type='NAMED' then lob_name end)
as total_lobs_named,
count(distinct case when account_type='GENERAL' then lob_name end)
as total_lobs_general,
count(distinct case when adg_status='A' then company_name end)
as total_companies_status_a,
count(distinct case when adg_status='D' then company_name end)
as total_companies_status_d,
count(distinct case when adg_status='G' then company_name end)
as total_companies_status_g,
count(distinct case when adg_status='A' then lob_name end)
as total_lob_status_a,
count(distinct case when adg_status='D' then lob_name end)
as total_lob_status_d,
count(distinct case when adg_status='G' then lob_name end)
as total_lob_status_g,
count(distinct case when adg_status='A' and industry = 'MFG' then company_name end)
as mfg_companies_status_a,
count(distinct case when adg_status='D' and industry = 'MFG' then company_name end)
as mfg_companies_status_d,
count(distinct case when adg_status='G' and industry = 'MFG' then company_name end)
as mfg_companies_status_g,
count(distinct case when adg_status='A' and industry = 'GOV' then company_name end)
as gov_companies_status_a,
count(distinct case when adg_status='D' and industry = 'GOV' then company_name end)
as gov_companies_status_d,
count(distinct case when adg_status='G' and industry = 'GOV' then company_name end)
as gov_companies_status_g
from (
select * from line_items
unpivot (person for role in (gvp as 'GVP', svp as 'SVP', rvp as 'RVP',
rm as 'RM', rep as 'REP'))
)
group by person;
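The unpivot-then-aggregate idea can be tried on a small data set. The sketch below uses Python's sqlite3 module, which has no UNPIVOT operator, so a UNION ALL over the role columns is swapped in as a stand-in; the IS NOT NULL filters mimic Oracle's default of excluding nulls when unpivoting. Only two role columns and a handful of the output columns are shown, and all sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down, hypothetical version of the line items table (GVP and REP only).
conn.execute(
    "CREATE TABLE line_items (line_item_id INTEGER PRIMARY KEY, company_name TEXT, "
    "lob_name TEXT, account_type TEXT, adg_status TEXT, industry TEXT, "
    "gvp TEXT, rep TEXT)"
)
conn.executemany("INSERT INTO line_items VALUES (?, ?, ?, ?, ?, ?, ?, ?)", [
    (1, "Acme", "Tools", "NAMED", "A", "MFG", "Gina", "Raj"),
    (2, "Acme", "Tools", "NAMED", "A", "MFG", "Gina", "Raj"),  # repeat sale
    (3, "Acme", "Toys", "NAMED", "D", "MFG", "Gina", "Sam"),
    (4, "City", "Roads", "GENERAL", "A", "GOV", "Gina", "Sam"),
])

query = """
WITH unpivoted AS (
    -- One output row per (line item, role); stands in for Oracle's UNPIVOT.
    SELECT gvp AS person, company_name, lob_name, account_type, adg_status, industry
    FROM line_items WHERE gvp IS NOT NULL
    UNION ALL
    SELECT rep AS person, company_name, lob_name, account_type, adg_status, industry
    FROM line_items WHERE rep IS NOT NULL
)
SELECT person,
       COUNT(DISTINCT company_name) AS total_companies,
       COUNT(DISTINCT lob_name) AS total_lobs,
       COUNT(DISTINCT CASE WHEN account_type = 'NAMED' THEN company_name END)
           AS total_companies_named,
       COUNT(DISTINCT CASE WHEN adg_status = 'A' AND industry = 'MFG' THEN company_name END)
           AS mfg_companies_status_a
FROM unpivoted
GROUP BY person
ORDER BY person
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because every count is a COUNT(DISTINCT ...), repeated line items for the same company, or a person appearing in more than one role column on the same row, do not inflate the totals.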
I have searched and think I have found part of my answer, but I still can't quite figure it out. I have a database with 4 tables and I'm trying to return, for each employee, their name, the total number of vacation days they have (which is based on their job title), and the number of vacation days they have taken, which is found by adding up all of the instances where the ReasonID column of the Leave table equals 2 for that employee.
This is what I have, and if I take out the line where I'm trying to get VacationDaysTaken, I can return the correct EmployeeName and TotalVacationDays. If I just try to return VacationDaysTaken, then I get the number of vacation days used by all employees. If I try to run it as I have it listed below, I get "Column 'Employee.Last' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."
SELECT
Employee.Last + ', ' + Employee.First AS EmployeeName,
Title.Vacation AS TotalVacationDays,
SUM(CASE WHEN Leave.ReasonID=2 THEN 1 ELSE 0 END) AS VacationDaysTaken
FROM Employee, Title, Leave, LeaveType
WHERE Employee.EmpID = Leave.EmpID
AND Leave.ReasonID = LeaveType.ReasonID
AND Employee.TitleID = Title.TitleID
ORDER BY EmployeeName
Never use commas in the FROM clause. Always use proper, explicit JOIN syntax.
You need a GROUP BY:
SELECT e.Last + ', ' + e.First AS EmployeeName,
t.Vacation AS TotalVacationDays,
SUM(CASE WHEN l.ReasonID = 2 THEN 1 ELSE 0 END) AS VacationDaysTaken
FROM Employee e JOIN
Title t
ON e.TitleID = t.TitleID JOIN
Leave l
ON e.EmpID = l.EmpID
GROUP BY e.Last, e.First, t.Vacation
ORDER BY EmployeeName;
Note: Because you are using ReasonID for the comparison, there is no need to join to the LeaveType table.
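Here is a small runnable sketch of the grouped query, using Python's sqlite3 module with invented sample rows. Two things differ from SQL Server and are assumptions of this sketch: SQLite concatenates strings with || instead of +, and the First and Last column names are quoted because they are keywords in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Title (TitleID INTEGER PRIMARY KEY, Vacation INTEGER);
CREATE TABLE Employee (EmpID INTEGER PRIMARY KEY, "First" TEXT, "Last" TEXT, TitleID INTEGER);
CREATE TABLE Leave (EmpID INTEGER, ReasonID INTEGER);

INSERT INTO Title VALUES (1, 15), (2, 20);
INSERT INTO Employee VALUES (1, 'Ann', 'Lee', 1), (2, 'Bob', 'Kim', 2);
-- Ann: two vacation days (ReasonID 2) and one other day; Bob: one other day.
INSERT INTO Leave VALUES (1, 2), (1, 2), (1, 1), (2, 1);
""")

query = """
SELECT e."Last" || ', ' || e."First" AS EmployeeName,
       t.Vacation AS TotalVacationDays,
       SUM(CASE WHEN l.ReasonID = 2 THEN 1 ELSE 0 END) AS VacationDaysTaken
FROM Employee e
JOIN Title t ON e.TitleID = t.TitleID
JOIN Leave l ON e.EmpID = l.EmpID
GROUP BY e."Last", e."First", t.Vacation
ORDER BY EmployeeName
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each non-aggregated column in the SELECT list appears in the GROUP BY, so the SUM is computed per employee instead of across the whole table.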