PIVOT with calculated aggregate function - sql

I am trying to generate a few reports that display a calculated value for each month where the month values are the columns.
The base query works well to report the months as rows:
SELECT ROUND(SUM(REVENUE)/SUM(HEADCOUNT), 2), MONTH FROM TABLE
GROUP BY MONTH
But if I try to pivot the table, I consistently get error ORA-56902, 'expect aggregate function inside pivot operation':
SELECT * FROM (
SELECT REVENUE, HEADCOUNT, MONTH FROM TABLE
)
PIVOT (ROUND(SUM(REVENUE)/SUM(HEADCOUNT), 2) FOR MONTH IN ('APR', 'MAY', 'JUN', 'JUL', 'AUG', 'SEP', 'OCT', 'NOV', 'DEC', 'JAN', 'FEB', 'MAR'))
Is there any way to get PIVOT to accept ROUND(SUM(REVENUE)/SUM(HEADCOUNT), 2) as an aggregate function, or is there some other function I should be using?

I would advise just using conditional aggregation:
SELECT ROUND(SUM(CASE WHEN MONTH = 'APR' THEN REVENUE END)/
SUM(CASE WHEN MONTH = 'APR' THEN HEADCOUNT END
), 2) as APR,
ROUND(SUM(CASE WHEN MONTH = 'MAY' THEN REVENUE END)/
SUM(CASE WHEN MONTH = 'MAY' THEN HEADCOUNT END
), 2) as MAY,
. . .
FROM TABLE;
I should point out that you can use pivot. Just calculate the summaries in the subquery and then pivot:
SELECT *
FROM (SELECT MONTH, ROUND(SUM(REVENUE) / SUM(HEADCOUNT), 2) as val
FROM TABLE
GROUP BY MONTH
) m
PIVOT (MAX(val) FOR MONTH IN ('APR', 'MAY', 'JUN', 'JUL', 'AUG', 'SEP', 'OCT', 'NOV', 'DEC', 'JAN', 'FEB', 'MAR'))

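PIVOT requires an aggregate function at the top level of its expression, unconditionally; an expression that merely wraps aggregates, such as ROUND(SUM(...)/SUM(...), 2), does not qualify.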
One way is to calculate your value in the subquery and then do the pivoting.
select *
from (
select
round(sum(revenue)/sum(headcount), 2) val,
month
from table
group by month
)
pivot (
max(val)
for month in ('APR', 'MAY', 'JUN',
'JUL', 'AUG', 'SEP', 'OCT',
'NOV', 'DEC', 'JAN',
'FEB', 'MAR')
)
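
As a rough illustration of the conditional-aggregation approach, here is a minimal Python sketch that runs the same CASE-based query against an in-memory SQLite database. SQLite stands in for Oracle only so that the example is self-contained; the table name t, the sample figures, and the helper monthly_ratio_pivot are assumptions made for the demo, and only two of the twelve month columns are shown.

```python
import sqlite3


def monthly_ratio_pivot(rows):
    """Return one row holding revenue/headcount per month as columns (APR, MAY).

    rows -- (month, revenue, headcount) tuples; hypothetical sample data.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (month TEXT, revenue REAL, headcount REAL)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    # Each CASE only lets rows of one month through, so every SUM is
    # effectively a per-month total and the ratio is computed per column.
    sql = """
        SELECT ROUND(SUM(CASE WHEN month = 'APR' THEN revenue END) /
                     SUM(CASE WHEN month = 'APR' THEN headcount END), 2) AS APR,
               ROUND(SUM(CASE WHEN month = 'MAY' THEN revenue END) /
                     SUM(CASE WHEN month = 'MAY' THEN headcount END), 2) AS MAY
        FROM t
    """
    result = con.execute(sql).fetchone()
    con.close()
    return result


sample = [("APR", 100.0, 4.0), ("APR", 50.0, 2.0), ("MAY", 90.0, 4.0)]
print(monthly_ratio_pivot(sample))
```

Because the ratio is taken after both sums are restricted to the same month, the result matches what the GROUP BY MONTH query would report for that month.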

Related

Making a pivot table group by users

I want to see user statistics, so I wrote this query:
SELECT l.partner AS Partner ,
bu.meno||' '||decode(substr(bu.priezvisko, 1, 2), 'Sz',
substr(bu.priezvisko, 1, 2), 'Gy',
substr(bu.priezvisko, 1, 2), 'Ny',
substr(bu.priezvisko, 1, 2), 'Zs',
substr(bu.priezvisko, 1, 2), 'Cs',
substr(bu.priezvisko, 1, 2),
substr(bu.priezvisko, 1, 1))
||'.' AS prod_man -- some Hungarian surnames start with a two-letter digraph (Sz, Gy, Ny, Zs, Cs)
, SUM(CASE
WHEN o.pocet!=0 THEN 1
ELSE 0
END) AS obj_pocet -- counting items
, SUM(CASE
WHEN o.pocet=0 OR o.p_del+o.p_del_dod>=o.pocet THEN 1
ELSE 0
END) AS nedod_pocet -- counting items2
, ROUND(SUM(CASE
WHEN o.pocet=0 OR o.p_del+o.p_del_dod>=o.pocet THEN 1
ELSE 0
END)/count(*), 3) * 100 AS "%" --percentage
FROM obj_odb_o o
JOIN obj_odb_l l ON o.rid_o=l.rid
JOIN sklad_karta sk ON sk.id=o.kod_id
JOIN bartex_users bu ON bu.id=sk.id.prod_man
WHERE l.partner in (325,
326)
GROUP BY l.partner
, bu.meno||' '||decode(substr(bu.priezvisko, 1, 2), 'Sz',
substr(bu.priezvisko, 1, 2), 'Gy',
substr(bu.priezvisko, 1, 2), 'Ny',
substr(bu.priezvisko, 1, 2), 'Zs',
substr(bu.priezvisko, 1, 2), 'Cs',
substr(bu.priezvisko, 1, 2),
substr(bu.priezvisko, 1, 1))
||'.'
It's working. Here is the result:
But I want to make a pivot by Months (last 6 months)...
WITH MONTHS AS
(
SELECT ADD_MONTHS(TRUNC(SYSDATE,'MONTH'),-LEVEL+1) AS MONTH,
DECODE(LEVEL,1,'Akt_mesiac','minuly_mesiac'||(LEVEL-1)) AS MONTH_NAME FROM DUAL CONNECT BY LEVEL <=7)
SELECT
partner,
prod_man,
'%',
NVL(Akt_mesiac,0) AS Akt_mesiac,
NVL(minuly_mesiac1,0) AS minuly_mesiac1,
NVL(minuly_mesiac2,0) AS minuly_mesiac2,
NVL(minuly_mesiac3,0) AS minuly_mesiac3,
NVL(minuly_mesiac4,0) AS minuly_mesiac4,
NVL(minuly_mesiac5,0) AS minuly_mesiac5,
NVL(minuly_mesiac6,0) AS minuly_mesiac6
FROM (
SELECT
-- my query - HERE I HAVE PROBLEM HERE
FROM MONTHS M
JOIN obj_odb_l l ON M.MONTH=TRUNC(l.datum_p,'MONTH')
) PIVOT
( SUM(CNT)
FOR MONTH_NAME IN
('Akt_mesiac' AS Akt_mesiac,
'minuly_mesiac1' AS minuly_mesiac1,
'minuly_mesiac2' AS minuly_mesiac2,
'minuly_mesiac3' AS minuly_mesiac3,
'minuly_mesiac4' AS minuly_mesiac4,
'minuly_mesiac5' AS minuly_mesiac5,
'minuly_mesiac6' AS minuly_mesiac6)
);
The date column is l.datum_p in table obj_odb_l (alias l), so the month is TRUNC(l.datum_p,'MONTH').
How can I make this pivot table?
Consider adding the month expression, TRUNC(l.datum_p,'MONTH'), to the SELECT and GROUP BY of the aggregate query above. Then include that query as a second CTE and join it to MONTHS inside the pivot's data source.
WITH MONTHS AS (
SELECT ADD_MONTHS(TRUNC(SYSDATE,'MONTH'),-LEVEL+1) AS MONTH
, DECODE(LEVEL,1,'Akt_mesiac','minuly_mesiac'||(LEVEL-1)) AS MONTH_NAME
FROM DUAL CONNECT BY LEVEL <=7
)
, AGG AS (
-- SAME AGGREGATE QUERY WITH TRUNC(l.datum_p,'MONTH') ADDED TO SELECT AND GROUP BY
-- POSSIBLY ADD WHERE CONDITION FOR LAST SIX MONTHS (IF DATA GOES BACK YEARS)
)
SELECT *
FROM (
SELECT AGG.partner
, AGG.prod_man
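, AGG.obj_pocet -- NOTE: NON-PIVOTED COLUMNS ACT AS IMPLICIT GROUPING KEYS,
, AGG.nedod_pocet -- SO KEEPING THESE TWO CAN SPLIT ONE PARTNER/PROD_MAN INTO SEVERAL ROWS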
, AGG."%" AS PCT -- AVOID SPECIAL CHARS AS NAME
, M.MONTH_NAME
FROM MONTHS M
INNER JOIN AGG
ON M.MONTH = AGG.MONTH -- NEW FIELD USED FOR JOIN
)
PIVOT
( SUM(PCT) -- ONE AGGREGATE SHOWN; SEVERAL ARE ALLOWED IF EACH HAS AN ALIAS
FOR MONTH_NAME IN
('Akt_mesiac' AS Akt_mesiac,
'minuly_mesiac1' AS minuly_mesiac1,
'minuly_mesiac2' AS minuly_mesiac2,
'minuly_mesiac3' AS minuly_mesiac3,
'minuly_mesiac4' AS minuly_mesiac4,
'minuly_mesiac5' AS minuly_mesiac5,
'minuly_mesiac6' AS minuly_mesiac6)
);
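
The month bucketing can also be sketched without PIVOT at all. The following minimal Python example uses an in-memory SQLite database and conditional aggregation as a stand-in for the Oracle query; the orders table, its val column, and the helper pivot_recent_months are hypothetical, and only the current month and the two before it are shown.

```python
import sqlite3


def pivot_recent_months(rows, today):
    """Pivot summed values per partner into current / previous-month columns.

    rows  -- (partner, iso_date, value) tuples; hypothetical sample data
    today -- ISO date string that defines the "current" month
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE orders (partner INTEGER, order_date TEXT, val REAL)")
    con.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    sql = """
        WITH dated AS (
            -- Month offset from the reference month: 0 = current, 1 = previous, ...
            SELECT partner,
                   (CAST(strftime('%Y', :today) AS INTEGER) * 12
                    + CAST(strftime('%m', :today) AS INTEGER))
                 - (CAST(strftime('%Y', order_date) AS INTEGER) * 12
                    + CAST(strftime('%m', order_date) AS INTEGER)) AS months_ago,
                   val
            FROM orders
        )
        SELECT partner,
               COALESCE(SUM(CASE WHEN months_ago = 0 THEN val END), 0) AS akt_mesiac,
               COALESCE(SUM(CASE WHEN months_ago = 1 THEN val END), 0) AS minuly_mesiac1,
               COALESCE(SUM(CASE WHEN months_ago = 2 THEN val END), 0) AS minuly_mesiac2
        FROM dated
        WHERE months_ago BETWEEN 0 AND 2
        GROUP BY partner
        ORDER BY partner
    """
    result = con.execute(sql, {"today": today}).fetchall()
    con.close()
    return result


sample_rows = [
    (325, "2024-03-02", 10.0),
    (325, "2024-03-20", 5.0),
    (325, "2024-02-10", 7.0),
    (326, "2024-01-05", 3.0),
    (326, "2023-06-01", 99.0),  # older than the window, so it is ignored
]
print(pivot_recent_months(sample_rows, "2024-03-15"))
```

Only the partner column is kept next to the month columns here, so each partner collapses to a single output row.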

I would like to display current month, year to date and previous year to date on ssrs

Data Presentation
I would like to calculate the current month, year to date, and previous year to date in a SQL view, to simplify data presentation in SSRS and make the report run faster. Is there a way to write a view that can do this?
Ignoring the fact that I think you have some errors in your Previous YTD summary numbers, I recreated the data as per your example:
CREATE TABLE #t (TransDate date, Customer varchar(30), Amount float)
INSERT INTO #t VALUES
('2020-09-21', 'Customer 1', 200),
('2020-09-22', 'Customer 2', 300),
('2020-08-03', 'Customer 2', 450),
('2020-08-04', 'Customer 1', 1200),
('2019-09-14', 'Customer 1', 859),
('2019-02-05', 'Customer 2', 230),
('2019-07-26', 'Customer 2', 910),
('2019-11-17', 'Customer 1', 820)
Then the following statement will produce what you need. It is NOT the most elegant way of doing this but it will convert to a view easily and was all I could come up with in the time I had.
SELECT
m.Customer
, m.MTD as [Current Month]
, y.YTD as [Current YTD]
, p.YTD as [Previous YTD]
FROM (
SELECT Customer, Yr = Year(TransDate), Mn = MONTH(TransDate), MTD = SUM(Amount) FROM #t t WHERE MONTH(TransDate) = MONTH(GetDate()) and YEAR(TransDate) = YEAR(GetDate())
GROUP by Customer, YEAR(TransDate), MONTH(TransDate)
) m
JOIN (SELECT Customer, Yr = YEAR(TransDate), YTD = SUM(Amount) FROM #t t GROUP by Customer, YEAR(TransDate)) y on m.Customer =y.Customer and m.Yr = y.Yr
JOIN (SELECT Customer, Yr = YEAR(TransDate), YTD = SUM(Amount) FROM #t t GROUP by Customer, YEAR(TransDate)) p on y.Customer =p.Customer and y.Yr = p.Yr + 1
This gives the following results (which don't match your example, but I think your sample is incorrect):
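
Note that the y and p subqueries above sum whole calendar years, with no date cutoff. If "previous YTD" is meant as the same calendar window one year earlier, one possible alternative (not the answer's method) is conditional aggregation with explicit date ranges. The sketch below runs that idea in Python against an in-memory SQLite table loaded with the sample rows; the helper names and the fixed as-of date of 2020-09-22 are assumptions for the demo.

```python
import sqlite3
from datetime import date

ROWS = [
    ("2020-09-21", "Customer 1", 200.0),
    ("2020-09-22", "Customer 2", 300.0),
    ("2020-08-03", "Customer 2", 450.0),
    ("2020-08-04", "Customer 1", 1200.0),
    ("2019-09-14", "Customer 1", 859.0),
    ("2019-02-05", "Customer 2", 230.0),
    ("2019-07-26", "Customer 2", 910.0),
    ("2019-11-17", "Customer 1", 820.0),
]


def same_day_previous_year(d):
    """Return the same calendar day one year earlier (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:
        return d.replace(year=d.year - 1, day=28)


def customer_summary(rows, as_of):
    """Return (customer, month to date, year to date, previous year to date)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (trans_date TEXT, customer TEXT, amount REAL)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    prev_as_of = same_day_previous_year(as_of)
    params = {
        "as_of": as_of.isoformat(),
        "month_start": as_of.replace(day=1).isoformat(),
        "year_start": as_of.replace(month=1, day=1).isoformat(),
        "prev_as_of": prev_as_of.isoformat(),
        "prev_year_start": prev_as_of.replace(month=1, day=1).isoformat(),
    }
    # ISO dates stored as text compare correctly with BETWEEN.
    sql = """
        SELECT customer,
               SUM(CASE WHEN trans_date BETWEEN :month_start AND :as_of
                        THEN amount ELSE 0 END) AS mtd,
               SUM(CASE WHEN trans_date BETWEEN :year_start AND :as_of
                        THEN amount ELSE 0 END) AS ytd,
               SUM(CASE WHEN trans_date BETWEEN :prev_year_start AND :prev_as_of
                        THEN amount ELSE 0 END) AS prev_ytd
        FROM t
        GROUP BY customer
        ORDER BY customer
    """
    result = con.execute(sql, params).fetchall()
    con.close()
    return result


print(customer_summary(ROWS, date(2020, 9, 22)))
```

A single pass with GROUP BY customer also keeps customers that have no transactions in the current month, which the inner joins in the query above would drop.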

LEAST(STRING) and GREATEST(STRING) for long STRINGS in Legacy BigQuery SQL

I'd like to run the following SQL query in a BigQuery table:
SELECT
LEAST(origin, destination) AS point_1,
GREATEST(origin, destination) AS point_2,
COUNT(*) AS journey_count,
FROM route
GROUP BY point_1, point_2
ORDER BY point_1, point_2;
on a table like:
INSERT INTO route
( route_id, origin, destination, dur)
VALUES
( 1, 'AA', 'BB', 2),
( 2, 'CC', 'DD', 4),
( 3, 'BB', 'AA', 6),
( 4, 'CC', 'AA', 2),
( 5, 'DD', 'CC', 12);
But BigQuery tells me that, although the query is syntactically correct, STRING is not a valid argument type for the LEAST function when the strings are longer than one character. I tried casting to numeric, as in LEAST(cast(origin as numeric), cast(destination as numeric)) AS point_1, but then it tells me LEAST cannot handle bytes.
How do I make LEAST and GREATEST compare long strings in BigQuery?
#legacySQL
SELECT
IF(origin < destination, CONCAT(origin, ' - ', destination), CONCAT(destination, ' - ', origin)) route,
COUNT(1) journey_count
FROM [project:dataset.table]
GROUP BY route
ORDER BY route
Applied to the sample data from your example, the result is:
Row route journey_count
1 AA - BB 2
2 AA - CC 1
3 CC - DD 2
See this, which keeps the two points as separate columns:
with t as (
(select 1 as route_id, 'AA' as origin, 'BB' as destination, 2 as dur)
union all
(select 2, 'CC', 'DD', 4)
union all
(select 3, 'BB', 'AA', 6)
union all
(select 4, 'CC', 'AA', 2)
union all
(select 5, 'DD', 'CC', 12))
select
if(origin<destination,origin,destination) as point_1,
if(origin<destination,destination,origin) as point_2,
count(1) as journey_count
from t
GROUP BY point_1, point_2
ORDER BY point_1, point_2;
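
To check the comparison-based replacement for LEAST and GREATEST, here is a minimal Python sketch that runs the equivalent CASE expressions on the question's sample rows in an in-memory SQLite database. SQLite is only a stand-in for BigQuery, and the helper journey_counts is a name made up for the demo.

```python
import sqlite3

ROUTES = [
    (1, "AA", "BB", 2),
    (2, "CC", "DD", 4),
    (3, "BB", "AA", 6),
    (4, "CC", "AA", 2),
    (5, "DD", "CC", 12),
]


def journey_counts(routes):
    """Count journeys per unordered (origin, destination) pair."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE route (route_id INTEGER, origin TEXT, destination TEXT, dur INTEGER)"
    )
    con.executemany("INSERT INTO route VALUES (?, ?, ?, ?)", routes)
    # The inner query orders each pair alphabetically, so AA->BB and BB->AA
    # end up in the same group of the outer query.
    sql = """
        SELECT point_1, point_2, COUNT(*) AS journey_count
        FROM (
            SELECT CASE WHEN origin < destination THEN origin ELSE destination END AS point_1,
                   CASE WHEN origin < destination THEN destination ELSE origin END AS point_2
            FROM route
        )
        GROUP BY point_1, point_2
        ORDER BY point_1, point_2
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


print(journey_counts(ROUTES))
```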

Hive windowing query

I have a base Hive table with following schema:
And I want the below output:
So basically, I want to group on all columns and count the distinct encounters in that month and in the last 3 months (including that month).
For example, for DischargeMonthYear Jan-2018, num_discharges_last_30_days would be patients discharged in Jan-2018 (3) and num_discharges_last_90_days would be patients discharged in Nov-17, Dec-17 and Jan-18. Since there is no data before Jan-18 in this case, both counts would be the same.
Similarly for Mar-18, num_discharges_last_90_days should include counts for Jan, Feb and Mar-18 months (3+2+2 = 7).
For Jun-18, since we have no data for Apr and May-18, it should include counts only for Jun-18 and NOT go to the previous group/partition.
I have the below query that gives me the correct total for num_discharges_last_90_days till Jun-18 but does not follow the grouping of earlier columns and for Jul-18 it also includes Jun-18 totals which should not be the case since the region is different.
If I add a PARTITION BY region (and others) clause for it, num_discharges_last_90_days is correct for Jul-18 now, but incorrect for Jun-18 since it includes the Feb and Mar-18 totals.
`
DROP TABLE IF EXISTS Encounter;
CREATE TEMPORARY TABLE Encounter
(
Encounter_no int,
Admit_date date,
discharge_date date,
region varchar(50),
Facilityname varchar(50),
Payertype varchar(10),
Payernamme varchar(20),
patient_type varchar(10)
);
INSERT INTO Encounter
select 12345, '2018-01-01', '2018-01-05', 'Midwest', 'ABC', 'MCR', 'MCR123', 'IP' union all
select 12346, '2018-01-02', '2018-01-06', 'Midwest', 'ABC', 'MCR', 'MCR123', 'IP' union all
select 12347, '2018-01-03', '2018-01-07', 'Midwest', 'ABC', 'MCR', 'MCR123', 'IP' union all
select 12348, '2018-02-04', '2018-02-08', 'Midwest', 'ABC', 'MCR', 'MCR123', 'IP' union all
select 12349, '2018-02-05', '2018-02-09', 'Midwest', 'ABC', 'MCR', 'MCR123', 'IP' union all
select 12350, '2018-03-06', '2018-03-10', 'Midwest', 'ABC', 'MCR', 'MCR123', 'IP' union all
select 12351, '2018-03-07', '2018-03-11', 'Midwest', 'ABC', 'MCR', 'MCR123', 'IP' union all
select 12352, '2018-06-08', '2018-06-12', 'Midwest', 'ABC', 'MCR', 'MCR123', 'IP' union all
select 12353, '2018-06-09', '2018-06-13', 'Midwest', 'ABC', 'MCR', 'MCR123', 'IP' union all
select 12354, '2018-07-10', '2018-07-14', 'NorthEast', 'ABC', 'MCR', 'MCR123', 'IP'
;
--SELECT from_unixtime(unix_timestamp(e.discharge_date, 'yyyy-MM-dd'),'MM') AS `Discharge_Month` FROM Encounter e
--Below CTE is used to get all month numbers
WITH R AS
(
SELECT '01' AS MonthNum
UNION ALL SELECT '02'
UNION ALL SELECT '03'
UNION ALL SELECT '04'
UNION ALL SELECT '05'
UNION ALL SELECT '06'
UNION ALL SELECT '07'
UNION ALL SELECT '08'
UNION ALL SELECT '09'
UNION ALL SELECT '10'
UNION ALL SELECT '11'
UNION ALL SELECT '12'
)
SELECT * FROM
(
--Perform a left join on CTE with your query to get all months
SELECT
R.MonthNum,
e.region,
e.facilityname,
from_unixtime(unix_timestamp(e.discharge_date, 'yyyy-MM-dd'),'MMM-yyyy') AS Discharge_Month,
e.Payertype,
e.Payernamme,
e.patient_type,
CASE WHEN COALESCE(e.region, '') <> ''
THEN COUNT(1)
ELSE 0
END
as num_discharges_last_30_days,
SUM(
CASE WHEN COALESCE(e.region, '') <> ''
THEN COUNT(1)
ELSE 0
END
)
OVER (ORDER BY R.MonthNum
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) as num_discharges_last_90_days
FROM R
LEFT JOIN Encounter e
ON R.MonthNum = from_unixtime(unix_timestamp(e.discharge_date, 'yyyy-MM-dd'),'MM')
GROUP BY
R.MonthNum,
e.region,
e.facilityname,
from_unixtime(unix_timestamp(e.discharge_date, 'yyyy-MM-dd'),'MMM-yyyy'),
e.Payertype,
e.Payernamme,
e.patient_type
) A
WHERE A.region IS NOT NULL
;
`
My colleague solved this with the query below. It uses a self-join, with IF and WHERE conditions so that only the last 3 months are included in the calculation.
WITH CTE AS (
SELECT a.region,a.facilityname,a.payertype,a.payernamme,a.patient_type, LAST_DAY(a.discharge_date) AS month_year, COUNT(encounter_no) AS measure_1
FROM Encounter AS a
GROUP BY a.region,a.facilityname,a.payertype,a.payernamme,a.patient_type, LAST_DAY(a.discharge_date)
)
-- SELECT * FROM CTE AS a;
SELECT a.region,a.facilityname,a.payertype,a.payernamme,a.patient_type, a.month_year, MAX(a.measure_1) AS measure_1,
SUM(IF(b.month_year IS NULL, a.measure_1, b.measure_1)) AS measure_2
FROM CTE AS a
LEFT JOIN CTE AS b
ON a.region = b.region
AND a.facilityname = b.facilityname
AND a.payertype = b.payertype
AND a.payernamme = b.payernamme
AND a.patient_type = b.patient_type
WHERE ( b.month_year BETWEEN add_months(a.month_year, -2) AND a.month_year
OR b.month_year IS NULL)
GROUP BY a.region,a.facilityname,a.payertype,a.payernamme,a.patient_type, a.month_year;
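
The self-join idea can be tried out on the question's discharge dates with a small Python sketch against an in-memory SQLite database. This is a simplified stand-in for the Hive query, not a translation of it: it groups by region only, replaces LAST_DAY and add_months with an integer month index (year * 12 + month), and the helper name rolling_discharges is hypothetical.

```python
import sqlite3

# (encounter_no, discharge_date, region) taken from the question's sample data
ENCOUNTERS = [
    (12345, "2018-01-05", "Midwest"),
    (12346, "2018-01-06", "Midwest"),
    (12347, "2018-01-07", "Midwest"),
    (12348, "2018-02-08", "Midwest"),
    (12349, "2018-02-09", "Midwest"),
    (12350, "2018-03-10", "Midwest"),
    (12351, "2018-03-11", "Midwest"),
    (12352, "2018-06-12", "Midwest"),
    (12353, "2018-06-13", "Midwest"),
    (12354, "2018-07-14", "NorthEast"),
]


def rolling_discharges(encounters):
    """Return (region, month, count in month, count in month plus 2 prior months)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE encounter (encounter_no INTEGER, discharge_date TEXT, region TEXT)"
    )
    con.executemany("INSERT INTO encounter VALUES (?, ?, ?)", encounters)
    sql = """
        WITH dated AS (
            SELECT region,
                   strftime('%Y-%m', discharge_date) AS month_label,
                   CAST(strftime('%Y', discharge_date) AS INTEGER) * 12
                     + CAST(strftime('%m', discharge_date) AS INTEGER) AS month_idx
            FROM encounter
        ),
        monthly AS (
            SELECT region, month_label, month_idx, COUNT(*) AS n
            FROM dated
            GROUP BY region, month_label, month_idx
        )
        -- Join each month to the months of the same region that fall inside
        -- its 3-month window; missing months simply contribute nothing.
        SELECT a.region, a.month_label, a.n AS last_30_days, SUM(b.n) AS last_90_days
        FROM monthly AS a
        JOIN monthly AS b
          ON b.region = a.region
         AND b.month_idx BETWEEN a.month_idx - 2 AND a.month_idx
        GROUP BY a.region, a.month_label, a.n
        ORDER BY a.region, a.month_label
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


for row in rolling_discharges(ENCOUNTERS):
    print(row)
```

Joining on a calendar range rather than using ROWS BETWEEN 2 PRECEDING is what stops Jun-18 from picking up Feb and Mar, because a row-based window counts the previous rows regardless of how many months apart they are.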

How to choose table column dynamically?

I have a data table in which I have month columns(fields).
How do I write a query that would select current month's column dynamically?
I tried doing this:
SELECT MonthName(month(date())) FROM my_table
That didn't work, so I tried several different ways that would return the month name for a query to use as field name, but so far nothing worked.
Can anybody point me to a solution?
You have a data source with a column for each month. You could use Switch() to retrieve the values from the column whose name matches the current month.
SELECT
Switch
(
Month(Date())= 1, [Jan],
Month(Date())= 2, [Feb],
Month(Date())= 3, [Mar],
Month(Date())= 4, [Apr],
Month(Date())= 5, [May],
Month(Date())= 6, [Jun],
Month(Date())= 7, [Jul],
Month(Date())= 8, [Aug],
Month(Date())= 9, [Sep],
Month(Date())=10, [Oct],
Month(Date())=11, [Nov],
Month(Date())=12, [Dec]
) AS current_month_column
FROM my_table;
However, I would try to transform the data source instead, converting columns to rows.
SELECT 'Jan' AS month_name, [Jan] As month_value FROM my_table
UNION ALL
SELECT 'Feb' AS month_name, [Feb] As month_value FROM my_table
UNION ALL
...
You could store the union result set in another table and query that, or query the union query.
SELECT month_name, month_value
FROM YourTable
WHERE month_name = MonthName(Month(Date()), True);
Dim db As Database
Dim strSQL As String
Set db = CurrentDb
strSQL = "SELECT " & left(MonthName(Month(Date)), 3) & " INTO new_table FROM my_table"
db.Execute strSQL
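
The same build-the-SQL-string idea can be sketched in Python. This is a minimal example against an in-memory SQLite table rather than Access; the demo table contents and the helpers make_demo_table and current_month_values are assumptions. The column name is picked from a fixed tuple, so no external input ever reaches the SQL text.

```python
import sqlite3
from datetime import date

MONTH_COLUMNS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
                 "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def make_demo_table():
    """Create my_table with one column per month and a single demo row (1..12)."""
    con = sqlite3.connect(":memory:")
    cols = ", ".join(f'"{name}" INTEGER' for name in MONTH_COLUMNS)
    con.execute(f"CREATE TABLE my_table ({cols})")
    placeholders = ", ".join("?" for _ in MONTH_COLUMNS)
    con.execute(f"INSERT INTO my_table VALUES ({placeholders})", tuple(range(1, 13)))
    return con


def current_month_values(con, today):
    """Select the column whose name matches the month of `today`."""
    # Column names cannot be bound as query parameters, so the name is
    # looked up in a fixed whitelist and then quoted into the statement.
    column = MONTH_COLUMNS[today.month - 1]
    sql = f'SELECT "{column}" FROM my_table'
    return [row[0] for row in con.execute(sql)]


con = make_demo_table()
print(current_month_values(con, date.today()))
```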