What's the best way to split my results by day? - sql

Not sure where to go with this one. I know I need to split the date and time apart in 'createdon', but then I'm stumped.
I can bring back values with the query but I have to manually enter each day.
`SELECT
sum(CASE WHEN title LIKE '%Environmental%' THEN 1 ELSE 0 END)As Environmental
,sum(CASE WHEN title LIKE '%Let%' THEN 1 ELSE 0 END)As Let
,sum(CASE WHEN title LIKE '%Lease%' THEN 1 ELSE 0 END)As Lease
,sum(CASE WHEN title LIKE '%Pay%' THEN 1 ELSE 0 END)As Paym
,sum(CASE WHEN title LIKE '%Manage%' THEN 1 ELSE 0 END)As Manage
,sum(CASE WHEN title LIKE '%Rent%' THEN 1 ELSE 0 END)As Rent
,sum(CASE WHEN title LIKE '%Works%' THEN 1 ELSE 0 END)As Works
FROM
incident
WHERE
(createdon > ('01/09/2021 00:01')
AND
createdon < ('01/09/2021 23:59'))`
Ideally, this is what I'm trying to bring back

SELECT
sum(CASE WHEN title LIKE '%Environmental%' THEN 1 ELSE 0 END)As Environmental
,sum(CASE WHEN title LIKE '%Let%' THEN 1 ELSE 0 END)As Let
,sum(CASE WHEN title LIKE '%Lease%' THEN 1 ELSE 0 END)As Lease
,sum(CASE WHEN title LIKE '%Pay%' THEN 1 ELSE 0 END)As Paym
,sum(CASE WHEN title LIKE '%Manage%' THEN 1 ELSE 0 END)As Manage
,sum(CASE WHEN title LIKE '%Rent%' THEN 1 ELSE 0 END)As Rent
,sum(CASE WHEN title LIKE '%Works%' THEN 1 ELSE 0 END)As Works
,createdon
FROM incident WHERE createdon between <your_start_date> and <your_end_date>
GROUP BY createdon

You need to aggregate by the date. One method looks like this:
SELECT CAST(createdon as DATE),
SUM(CASE WHEN title LIKE '%Environmental%' THEN 1 ELSE 0 END)As Environmental,
SUM(CASE WHEN title LIKE '%Let%' THEN 1 ELSE 0 END) As Let,
SUM(CASE WHEN title LIKE '%Lease%' THEN 1 ELSE 0 END) As Lease,
SUM(CASE WHEN title LIKE '%Pay%' THEN 1 ELSE 0 END) As Paym,
SUM(CASE WHEN title LIKE '%Manage%' THEN 1 ELSE 0 END) As Manage,
SUM(CASE WHEN title LIKE '%Rent%' THEN 1 ELSE 0 END) As Rent,
SUM(CASE WHEN title LIKE '%Works%' THEN 1 ELSE 0 END) AS Works
FROM incident
GROUP BY CAST(createdon as DATE)
ORDER BY CAST(createdon as DATE);
The problem is that date/time functions are notoriously database dependent. Here are some syntaxes:
CAST(createdon as DATE) should work in SQL Server, MySQL, and Postgres.
TRUNC(createdon) works in Oracle.
DATE() in SQLite.
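To try the grouping without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the created_day alias are made up for illustration, and it uses SQLite's DATE() only because that is the dialect sqlite3 speaks; swap in CAST(createdon AS DATE) or TRUNC(createdon) for your own database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE incident (title TEXT, createdon TEXT)")
# Hypothetical sample rows: two incidents on 1 Sept, one on 2 Sept.
conn.executemany(
    "INSERT INTO incident VALUES (?, ?)",
    [
        ("Rent query", "2021-09-01 09:15:00"),
        ("Lease renewal", "2021-09-01 17:40:00"),
        ("Rent arrears", "2021-09-02 08:05:00"),
    ],
)

# DATE(createdon) drops the time part, so all rows from one day share a group.
daily_counts = conn.execute(
    """
    SELECT DATE(createdon) AS created_day,
           SUM(CASE WHEN title LIKE '%Rent%'  THEN 1 ELSE 0 END) AS Rent,
           SUM(CASE WHEN title LIKE '%Lease%' THEN 1 ELSE 0 END) AS Lease
    FROM incident
    GROUP BY DATE(createdon)
    ORDER BY created_day
    """
).fetchall()

for row in daily_counts:
    print(row)
```

Each output row is one calendar day, so there is no need to type in each date by hand.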

Related

SQL group by within sum()

I have the following sql statement which produces the following output (filtered result for 7/8 DueDate)
SELECT
JobType.BillingCategory,
Jobs.DueDate,
Sum(Impressions.PRINTtot) AS SumOfPRINTtot,
Sum(Impressions.PRINTrem) AS SumOfPRINTrem,
Sum(Impressions.CARDtot) AS SumOfCARDtot,
Sum(Impressions.CARDrem) AS SumOfCARDrem,
Sum(Impressions.BOOKtot) AS SumOfBOOKtot,
Sum(Impressions.BOOKrem) AS SumOfBOOKrem
FROM
(
Impressions
INNER JOIN Jobs ON Impressions.JobNo = Jobs.JobNo
)
INNER JOIN JobType ON (Jobs.AccountName = JobType.AccountName)
AND (Jobs.Product = JobType.Product)
GROUP BY
Jobs.DueDate,
JobType.BillingCategory;
I am trying to get all of these results on one line: the identifier would be the DueDate and the sums of the values in the Impressions table would be summed for each BillingCategory. Example below (omitting CARD & BOOK sums just for visual purposes w/ too many columns)
You could use a CASE expression inside each SUM to summarize your data. Modify your query so that each sum covers only one billing category; the example below uses CARD to summarize Impressions.PRINTtot and Impressions.PRINTrem:
SELECT
Jobs.DueDate,
Sum(CASE WHEN JobType.BillingCategory='CARD' THEN Impressions.PRINTtot ELSE 0 END) AS SumOfPRINTtotCard,
Sum(CASE WHEN JobType.BillingCategory='CARD' THEN Impressions.PRINTrem ELSE 0 END) AS SumOfPRINTremCard,
....<repeat>
FROM
(
Impressions
INNER JOIN Jobs ON Impressions.JobNo = Jobs.JobNo
)
INNER JOIN JobType ON (Jobs.AccountName = JobType.AccountName)
AND (Jobs.Product = JobType.Product)
GROUP BY
Jobs.DueDate
Edit 1:
Based on the Billing Categories listed in your question
A complete example may look like:
SELECT
Jobs.DueDate,
SUM(CASE WHEN JobType.BillingCategory='CARD' THEN Impressions.PRINTtot ELSE 0 END) AS SumOfPRINTtotCARD,
SUM(CASE WHEN JobType.BillingCategory='CARD' THEN Impressions.PRINTrem ELSE 0 END) AS SumOfPRINTremCARD,
SUM(CASE WHEN JobType.BillingCategory='CARD' THEN Impressions.CARDtot ELSE 0 END) AS SumOfCARDtotCARD,
SUM(CASE WHEN JobType.BillingCategory='CARD' THEN Impressions.CARDrem ELSE 0 END) AS SumOfCARDremCARD,
SUM(CASE WHEN JobType.BillingCategory='CARD' THEN Impressions.BOOKtot ELSE 0 END) AS SumOfBOOKtotCARD,
SUM(CASE WHEN JobType.BillingCategory='CARD' THEN Impressions.BOOKrem ELSE 0 END) AS SumOfBOOKremCARD,
SUM(CASE WHEN JobType.BillingCategory='CARDTIPON' THEN Impressions.PRINTtot ELSE 0 END) AS SumOfPRINTtotCARDTIPON,
SUM(CASE WHEN JobType.BillingCategory='CARDTIPON' THEN Impressions.PRINTrem ELSE 0 END) AS SumOfPRINTremCARDTIPON,
SUM(CASE WHEN JobType.BillingCategory='CARDTIPON' THEN Impressions.CARDtot ELSE 0 END) AS SumOfCARDtotCARDTIPON,
SUM(CASE WHEN JobType.BillingCategory='CARDTIPON' THEN Impressions.CARDrem ELSE 0 END) AS SumOfCARDremCARDTIPON,
SUM(CASE WHEN JobType.BillingCategory='CARDTIPON' THEN Impressions.BOOKtot ELSE 0 END) AS SumOfBOOKtotCARDTIPON,
SUM(CASE WHEN JobType.BillingCategory='CARDTIPON' THEN Impressions.BOOKrem ELSE 0 END) AS SumOfBOOKremCARDTIPON,
SUM(CASE WHEN JobType.BillingCategory='EOB' THEN Impressions.PRINTtot ELSE 0 END) AS SumOfPRINTtotEOB,
SUM(CASE WHEN JobType.BillingCategory='EOB' THEN Impressions.PRINTrem ELSE 0 END) AS SumOfPRINTremEOB,
SUM(CASE WHEN JobType.BillingCategory='EOB' THEN Impressions.CARDtot ELSE 0 END) AS SumOfCARDtotEOB,
SUM(CASE WHEN JobType.BillingCategory='EOB' THEN Impressions.CARDrem ELSE 0 END) AS SumOfCARDremEOB,
SUM(CASE WHEN JobType.BillingCategory='EOB' THEN Impressions.BOOKtot ELSE 0 END) AS SumOfBOOKtotEOB,
SUM(CASE WHEN JobType.BillingCategory='EOB' THEN Impressions.BOOKrem ELSE 0 END) AS SumOfBOOKremEOB,
SUM(CASE WHEN JobType.BillingCategory='MEMBERGUIDE' THEN Impressions.PRINTtot ELSE 0 END) AS SumOfPRINTtotMEMBERGUIDE,
SUM(CASE WHEN JobType.BillingCategory='MEMBERGUIDE' THEN Impressions.PRINTrem ELSE 0 END) AS SumOfPRINTremMEMBERGUIDE,
SUM(CASE WHEN JobType.BillingCategory='MEMBERGUIDE' THEN Impressions.CARDtot ELSE 0 END) AS SumOfCARDtotMEMBERGUIDE,
SUM(CASE WHEN JobType.BillingCategory='MEMBERGUIDE' THEN Impressions.CARDrem ELSE 0 END) AS SumOfCARDremMEMBERGUIDE,
SUM(CASE WHEN JobType.BillingCategory='MEMBERGUIDE' THEN Impressions.BOOKtot ELSE 0 END) AS SumOfBOOKtotMEMBERGUIDE,
SUM(CASE WHEN JobType.BillingCategory='MEMBERGUIDE' THEN Impressions.BOOKrem ELSE 0 END) AS SumOfBOOKremMEMBERGUIDE,
SUM(CASE WHEN JobType.BillingCategory='MEMBERGUIDEHD' THEN Impressions.PRINTtot ELSE 0 END) AS SumOfPRINTtotMEMBERGUIDEHD,
SUM(CASE WHEN JobType.BillingCategory='MEMBERGUIDEHD' THEN Impressions.PRINTrem ELSE 0 END) AS SumOfPRINTremMEMBERGUIDEHD,
SUM(CASE WHEN JobType.BillingCategory='MEMBERGUIDEHD' THEN Impressions.CARDtot ELSE 0 END) AS SumOfCARDtotMEMBERGUIDEHD,
SUM(CASE WHEN JobType.BillingCategory='MEMBERGUIDEHD' THEN Impressions.CARDrem ELSE 0 END) AS SumOfCARDremMEMBERGUIDEHD,
SUM(CASE WHEN JobType.BillingCategory='MEMBERGUIDEHD' THEN Impressions.BOOKtot ELSE 0 END) AS SumOfBOOKtotMEMBERGUIDEHD,
SUM(CASE WHEN JobType.BillingCategory='MEMBERGUIDEHD' THEN Impressions.BOOKrem ELSE 0 END) AS SumOfBOOKremMEMBERGUIDEHD
FROM
(
Impressions
INNER JOIN Jobs ON Impressions.JobNo = Jobs.JobNo
)
INNER JOIN JobType ON (Jobs.AccountName = JobType.AccountName)
AND (Jobs.Product = JobType.Product)
GROUP BY
Jobs.DueDate
Specific databases and tools may offer functions that help here. However, since your billing categories may change over time, I find that a small script you can run anywhere to generate the SQL is often helpful. Below is the script I used to generate the code above:
var types='CARD,CARDTIPON,EOB,MEMBERGUIDE,MEMBERGUIDEHD'.split(',');
var metrics = 'PRINTtot,PRINTrem,CARDtot,CARDrem,BOOKtot,BOOKrem'.split(',');
var metricTemplate="SUM(CASE WHEN JobType.BillingCategory='[TYPE]' THEN Impressions.[METRICNAME] ELSE 0 END) AS SumOf[METRICNAME][TYPE]";
var summary_lines = []
for(var i=0;i < types.length;i++){
for(var j=0;j<metrics.length;j++){
summary_lines.push(metricTemplate.replaceAll('[TYPE]',types[i]).replaceAll('[METRICNAME]',metrics[j]))
}
}
var complete_metrics = summary_lines.join(",\n");
console.log(complete_metrics)
The simplest option is to use your query as a CTE (Common Table Expression) and then you can use it as a base for another query.
For example:
with
q as (
-- your query here
)
select
max(DueDate) as DueDate,
sum(case when BillingCategory = 'CARD' then SumOfPRINTtot else 0 end) as SumOfPRINTtotC,
sum(case when BillingCategory = 'CARD' then SumOfPRINTrem else 0 end) as SumOfPRINTremC,
sum(case when BillingCategory = 'CARDTIPON' then SumOfPRINTtot else 0 end) as SumOfPRINTtotCT,
sum(case when BillingCategory = 'CARDTIPON' then SumOfPRINTrem else 0 end) as SumOfPRINTremCT,
sum(case when BillingCategory = 'EOB' then SumOfPRINTtot else 0 end) as SumOfPRINTtotE,
sum(case when BillingCategory = 'EOB' then SumOfPRINTrem else 0 end) as SumOfPRINTremE,
sum(case when BillingCategory = 'MEMBERGUIDE' then SumOfPRINTtot else 0 end) as SumOfPRINTtotMG,
sum(case when BillingCategory = 'MEMBERGUIDE' then SumOfPRINTrem else 0 end) as SumOfPRINTremMG,
sum(case when BillingCategory = 'MEMBERGUIDEHD' then SumOfPRINTtot else 0 end) as SumOfPRINTtotMGH,
sum(case when BillingCategory = 'MEMBERGUIDEHD' then SumOfPRINTrem else 0 end) as SumOfPRINTremMGH
from q
group by DueDate
In some databases you can use the FILTER clause as well. You don't mention which specific database, so this solution will work on virtually all databases.
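As a rough illustration of the CTE approach, here is a small sketch run through Python's sqlite3 module. The grouped table is a hypothetical stand-in for the output of the original grouped query, the rows are invented, and only two categories and one metric are shown to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the rows the original GROUP BY query returns.
conn.execute(
    "CREATE TABLE grouped (BillingCategory TEXT, DueDate TEXT, SumOfPRINTtot INTEGER)"
)
conn.executemany(
    "INSERT INTO grouped VALUES (?, ?, ?)",
    [
        ("CARD", "2021-07-08", 10),
        ("EOB", "2021-07-08", 5),
        ("CARD", "2021-07-09", 7),
    ],
)

# The CTE q wraps the base query; the outer query folds categories into columns.
pivoted = conn.execute(
    """
    WITH q AS (
        SELECT BillingCategory, DueDate, SumOfPRINTtot FROM grouped
    )
    SELECT DueDate,
           SUM(CASE WHEN BillingCategory = 'CARD' THEN SumOfPRINTtot ELSE 0 END) AS SumOfPRINTtotC,
           SUM(CASE WHEN BillingCategory = 'EOB'  THEN SumOfPRINTtot ELSE 0 END) AS SumOfPRINTtotE
    FROM q
    GROUP BY DueDate
    ORDER BY DueDate
    """
).fetchall()

for row in pivoted:
    print(row)
```

Grouping the outer query by DueDate gives one line per due date even when the base query is not filtered to a single day.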

How to make column and row as header in HiveQL

I've a question here about setting column and row as header in HiveQL. So, my expected output is like this (Table 1):
But what I could do so far is just like this (Table 2):
with this query:
SELECT
sum(case when grade ='A' and class = 'I' then 1 else 0 end) as 'Grade_A_I',
sum(case when grade ='B' and class = 'I' then 1 else 0 end) as 'Grade_B_I',
sum(case when grade ='C' and class = 'I' then 1 else 0 end) as 'Grade_C_I',
sum(case when grade ='A' and class = 'II' then 1 else 0 end) as 'Grade_A_II', etc...
FROM
grade_table
where
.......
I didn't find any references on the Internet for this, so I'm wondering whether there is any way to achieve Table 1 instead of Table 2?
Really appreciate any inputs from you all, thanks in advance!
Something like this should work:
select
case when class = 'I' then 'Class I'
when class = 'II' then 'Class II' end as class,
sum(case when grade ='A' then 1 else 0 end) as Grade_A,
sum(case when grade ='B' then 1 else 0 end) as Grade_B,
sum(case when grade ='C' then 1 else 0 end) as Grade_C
from
grade_table
group by 1
union all
select 'Total' as class,
sum(case when grade ='A' then 1 else 0 end) as Grade_A,
sum(case when grade ='B' then 1 else 0 end) as Grade_B,
sum(case when grade ='C' then 1 else 0 end) as Grade_C
from
grade_table
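Here is a small sketch of the same idea (one row per class plus a Total row) that you can run locally. It uses Python's sqlite3 module rather than Hive, so treat it as an illustration of the shape of the result only; the sample rows and the class_label alias are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grade_table (class TEXT, grade TEXT)")
# Hypothetical sample data: three students in class I, two in class II.
conn.executemany(
    "INSERT INTO grade_table VALUES (?, ?)",
    [("I", "A"), ("I", "A"), ("I", "B"), ("II", "A"), ("II", "C")],
)

rows = conn.execute(
    """
    SELECT 'Class ' || class AS class_label,
           SUM(CASE WHEN grade = 'A' THEN 1 ELSE 0 END) AS Grade_A,
           SUM(CASE WHEN grade = 'B' THEN 1 ELSE 0 END) AS Grade_B,
           SUM(CASE WHEN grade = 'C' THEN 1 ELSE 0 END) AS Grade_C
    FROM grade_table
    GROUP BY class
    UNION ALL
    SELECT 'Total',
           SUM(CASE WHEN grade = 'A' THEN 1 ELSE 0 END),
           SUM(CASE WHEN grade = 'B' THEN 1 ELSE 0 END),
           SUM(CASE WHEN grade = 'C' THEN 1 ELSE 0 END)
    FROM grade_table
    """
).fetchall()

# UNION ALL does not promise an order, so index the rows by their label.
by_class = {row[0]: row[1:] for row in rows}
print(by_class)
```

The classes become row headers and the grades stay as column headers, which is the layout asked for in Table 1.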

Combinations of Products as single count

I need to count combinations of products within transactions differently to other products, and I'm struggling with how to do this within a single select statement in SQL Server 2008. This would then become a data set to manipulate in Reporting Services.
The raw data looks like this:
txn, prod, units
1, a, 2
1, c, 1
2, a, 1
2, b, 1
2, c, 1
3, a, 2
3, b, 1
4, a, 3
4, c, 2
So a+b should count as one if they are in the same transaction, whereas a or b on its own should count as one if not paired. So a=1 and b=1, but a+b=1, a+b+a=2, and a+b+a+b=2. Given the example data, here is my desired result with an explanation of why:
txn 1 is 3 units -- 2a + c
txn 2 is 2 units -- (a+b) + c
txn 3 is 2 units -- (a+b) + a
txn 4 is 5 units -- 3a + 2c
My query is more complex than this and includes other aggregates so I would like to group by transaction which I can't do as I need to manipulate at a lower grain
Update Progress :
Possible solution, I've generated columns based on the products I'm measuring. This allows me to group on Txn as I am now aggregating that field. Unsure if there's a better way to do it as it does take a little while
CASE WHEN SUM(CASE WHEN Prod='a' then 1 else 0 end)-
SUM(CASE WHEN Prod='b' then 1 else 0 end)=0
THEN SUM(CASE WHEN Prod='a' then 1 else 0 end)
ELSE 0 END AS MixProd
, CASE WHEN SUM(CASE WHEN Prod='a' then 1 else 0 end)-
SUM(CASE WHEN Prod='b' then 1 else 0 end)!=0
THEN ABS(SUM(CASE WHEN Prod='a' then 1 else 0 end)-
SUM(CASE WHEN Prod='b' then 1 else 0 end))
ELSE 0 END AS NotMixProd
I will then need to sort out the current unit aggregate to remove the extras but this certainly gives me a start
Update Progress 2 :
This failed to handle 0 correctly: where a or b was 0, it would still give a value for mix because a-b was not zero. I reverted to an earlier draft that I had lost and expanded it as per below:
, CASE WHEN SUM(CASE WHEN Prod='a' then 1 else 0 end) = 0 THEN 0
WHEN SUM(CASE WHEN Prod='b' then 1 else 0 end) = 0 THEN 0
WHEN SUM(CASE WHEN Prod='a' then 1 else 0 end)-
SUM(CASE WHEN Prod='b' then 1 else 0 end)=0
THEN SUM(CASE WHEN Prod='a' then 1 else 0 end)
ELSE ABS(SUM(CASE WHEN Prod='a' then 1 else 0 end)-
SUM(CASE WHEN Prod='b' then 1 else 0 end))
END AS MixProd
, CASE WHEN SUM(CASE WHEN Prod='a' then 1 else 0 end)-
SUM(CASE WHEN Prod='b' then 1 else 0 end)!=0
THEN ABS(SUM(CASE WHEN Prod='a' then 1 else 0 end)-
SUM(CASE WHEN Prod='b' then 1 else 0 end))
ELSE 0 END AS NotMixProd
UPDATE: This should work in SQL Server 2008 (based on LAG solution from here).
Here is the demo: http://rextester.com/GNI23706
WITH CTE AS
(
select txn, prod, units,
row_number() over (partition by txn order by prod) rn,
(row_number() over (partition by txn order by prod))/2 rndiv2,
(row_number() over (partition by txn order by prod)+1)/2 rnplus1div2,
count(*) over (partition by txn) partitioncount
from test_data
)
select
txn,
sum(case when prev_prod = 'a' and prod = 'b' and prev_units >= units then 0
when prev_prod = 'a' and prod = 'b' and prev_units < units then units - prev_units
else units
end) units
from
(
select
txn,
prod,
units,
CASE WHEN rn%2=1
THEN MAX(CASE WHEN rn%2=0 THEN prod END) OVER (PARTITION BY txn,rndiv2)
ELSE MAX(CASE WHEN rn%2=1 THEN prod END) OVER (PARTITION BY txn,rnplus1div2)
END AS prev_prod,
CASE WHEN rn%2=1
THEN MAX(CASE WHEN rn%2=0 THEN units END) OVER (PARTITION BY txn,rndiv2)
ELSE MAX(CASE WHEN rn%2=1 THEN units END) OVER (PARTITION BY txn,rnplus1div2)
END AS prev_units
from cte
) temp
group by txn
For SQL Server 2012+, use LAG:
select
txn,
sum(
case when prev_prod = 'a' and prod = 'b' and prev_units >= units then 0
when prev_prod = 'a' and prod = 'b' and prev_units < units then units - prev_units
else units
end) units
from
(
select
txn,
prod,
units,
lag(prod) over (partition by txn order by prod) prev_prod,
lag(units) over (partition by txn order by prod) prev_units
from test_data
) temp
group by txn
I decided in the end that a temp table was the best way to go, because I couldn't group on a collation. So I eventually tweaked the code above as it was failing to pick up the spare items correctly
SUM(Units) AS OldUnits,
SUM(Units) -
(CASE WHEN
SUM(CASE WHEN Prod='a' THEN 1 ELSE 0 END) = 0 THEN 0 WHEN
SUM(CASE WHEN Prod='b' THEN 1 ELSE 0 END) = 0 THEN 0 WHEN
SUM(CASE WHEN Prod='a' THEN 1 ELSE 0 END) -
SUM(CASE WHEN Prod='b' THEN 1 ELSE 0 END) = 0 THEN
SUM(CASE WHEN Prod='a' THEN 1 ELSE 0 END) WHEN
(SUM(CASE WHEN Prod='a' THEN 1 ELSE 0 END) -
SUM(CASE WHEN Prod='b' THEN 1 ELSE 0 END)) < 0 THEN
SUM(CASE WHEN Prod='a' THEN 1 ELSE 0 END) ELSE
SUM(CASE WHEN Prod='b' THEN 1 ELSE 0 END) END) AS NewUnits
This was stored in a temptable that I could then collate on Trans as the next step. Works fine for my purposes and helped me overcome a mild irrational fear I have of temptables
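For comparison, one way to read the pairing rule in the question is "total units minus the number of a+b pairs", where the number of pairs is the smaller of the a units and the b units. That reading is my assumption, not something the posts above state, and the sketch below is not the approach any of the posters used. It runs on Python's sqlite3 module with the question's sample data; SQLite's two-argument MIN() is a scalar function, so SQL Server 2008 would need a CASE expression in its place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_data (txn INTEGER, prod TEXT, units INTEGER)")
# Sample data copied from the question.
conn.executemany(
    "INSERT INTO test_data VALUES (?, ?, ?)",
    [
        (1, "a", 2), (1, "c", 1),
        (2, "a", 1), (2, "b", 1), (2, "c", 1),
        (3, "a", 2), (3, "b", 1),
        (4, "a", 3), (4, "c", 2),
    ],
)

# Each a+b pair counts once, so subtract one unit per pair from the total.
units_per_txn = conn.execute(
    """
    SELECT txn, total_units - MIN(a_units, b_units) AS units
    FROM (
        SELECT txn,
               SUM(units) AS total_units,
               SUM(CASE WHEN prod = 'a' THEN units ELSE 0 END) AS a_units,
               SUM(CASE WHEN prod = 'b' THEN units ELSE 0 END) AS b_units
        FROM test_data
        GROUP BY txn
    ) per_txn
    ORDER BY txn
    """
).fetchall()

for row in units_per_txn:
    print(row)
```

On the sample data this gives 3, 2, 2 and 5 units for transactions 1 to 4, matching the desired result listed in the question.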

How do you create a conditional count across multiple fields?

I have a huge table of over 100 million rows, joined to another reference table, that I want to create a conditional count for.
The first table is the large one: an audit log that lists data on countries and contains the date of each audit.
The second table is a smaller table which contains relational data to the audit log.
The first part is the easy bit which is to identify which audit data I want to see. I have the following code to identify this:
select aud.*
from audit_log aud
join database db on db.id=aud.release_id
where aud.event_description like '% opted in'
and db.creation_source = 'system_a'
This gives me the data in the following format:
Country Event Description Audit Date
Czech Republic Czech Republic has been automatically opted in 11-AUG-14 07.01.52.606000000
Denmark Denmark has been automatically opted in 12-AUG-15 07.01.53.239000000
Denmark Denmark has been automatically opted in 11-SEP-15 07.01.53.902000000
Dominican Republic Dominican Republic has been automatically opted in 11-SEP-15 07.01.54.187000000
Ecuador Ecuador has been automatically opted in 11-DEC-14 07.01.54.427000000
Ecuador Ecuador has been automatically opted in 11-NOV-14 07.01.54.679000000
The number of results from this query still returns over 5 million rows so I cannot export the data to Excel to create a count.
My two main issues are the number of rows and the date format of the 'Audit Date' field.
Ideally I want to create a count which shows the data as:
Country |Aug-14|Nov-14|Dec-14|Aug-15|Sep-15
Czech Republic | 1 | | | |
Denmark | | | | 1 | 1
Dominican Republic | | | | | 1
Ecuador | | 1 | 1 | |
Any ideas on how I can extract the month and year and drop the figures into columns by country?
Thanks
Edit - Thank you xQbert for your solution, it worked perfectly!
The problem now is that I have run into a new problem.
I need to constrain the count by another query, but there is no unique identifier between the tables involved.
For example, I amended your query to fit my db:
select cty.country_name,
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='AUG-2014' then 1 else 0 end) as "AUG-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='SEP-2014' then 1 else 0 end) as "SEP-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='OCT-2014' then 1 else 0 end) as "OCT-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='NOV-2014' then 1 else 0 end) as "NOV-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='DEC-2014' then 1 else 0 end) as "DEC-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='JAN-2015' then 1 else 0 end) as "JAN-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='FEB-2015' then 1 else 0 end) as "FEB-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='MAR-2015' then 1 else 0 end) as "MAR-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='APR-2015' then 1 else 0 end) as "APR-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='MAY-2015' then 1 else 0 end) as "MAY-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='JUN-2015' then 1 else 0 end) as "JUN-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='JUL-2015' then 1 else 0 end) as "JUL-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='AUG-2015' then 1 else 0 end) as "AUG-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='SEP-2015' then 1 else 0 end) as "SEP-15"
from dschd.audit_trail aud
join dschd.release r on r.id=aud.release_id
join dschd.country cty on aud.EVENT_COUNTRY_ID=cty.id
where aud.event_description like '% opted in'
and r.creation_source = 'DSCHED'
GROUP BY cty.COUNTRY_name
My second query is:
select *
from DSCHD.RELEASE_COUNTRY_RIGHT rcr
join dschd.release r on rcr.RELEASE_ID=r.ID
join dschd.country cty on rcr.COUNTRY_ID=cty.id
where r.release_status in ('DRAFT', 'SCHEDULED', 'FINAL', 'DELIVERED')
and r.is_active = 'Y'
and rcr.MARKETING_RIGHT = 'Y'
and rcr.OPT_OUT = 'N'
and r.creation_source = 'DSCHED'
The problem is that I have many countries which can relate to one ID (Release_ID) but there is no unique identifier between the tables on a country level. Each country has an ID though.
So for query 1, to identify each unique row I would need the 'aud.Release_ID' and the 'aud.Event_country_id' and for query 2 to achieve the same I would need to use the 'rcr.Release_ID' and 'rcr.country_id'.
select cty.country_name,
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='AUG-2014' then 1 else 0 end) as "AUG-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='SEP-2014' then 1 else 0 end) as "SEP-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='OCT-2014' then 1 else 0 end) as "OCT-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='NOV-2014' then 1 else 0 end) as "NOV-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='DEC-2014' then 1 else 0 end) as "DEC-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='JAN-2015' then 1 else 0 end) as "JAN-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='FEB-2015' then 1 else 0 end) as "FEB-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='MAR-2015' then 1 else 0 end) as "MAR-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='APR-2015' then 1 else 0 end) as "APR-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='MAY-2015' then 1 else 0 end) as "MAY-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='JUN-2015' then 1 else 0 end) as "JUN-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='JUL-2015' then 1 else 0 end) as "JUL-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='AUG-2015' then 1 else 0 end) as "AUG-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='SEP-2015' then 1 else 0 end) as "SEP-15"
from dschd.audit_trail aud
join dschd.release r on r.id=aud.release_id
join dschd.country cty on aud.EVENT_COUNTRY_ID=cty.id
where aud.event_description like '% opted in'
and ***** in (select ******
from DSCHD.RELEASE_COUNTRY_RIGHT rcr
join dschd.release r on rcr.RELEASE_ID=r.ID
join dschd.country cty on rcr.COUNTRY_ID=cty.id
where r.release_status in ('DRAFT', 'SCHEDULED', 'FINAL', 'DELIVERED')
and r.is_active = 'Y'
and rcr.MARKETING_RIGHT = 'Y'
and rcr.OPT_OUT = 'N'
and r.creation_source = 'DSCHED')
GROUP BY cty.COUNTRY_name
The bits I am stuck on are the two parts marked with '*****', as the join criteria span two fields.
Any ideas?
Quick and dirty, not dynamic or floating based on a 12-month cycle or anything...
select country,
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='AUG-2014' then 1 else 0 end) as "AUG-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='SEP-2014' then 1 else 0 end) as "SEP-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='OCT-2014' then 1 else 0 end) as "OCT-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='NOV-2014' then 1 else 0 end) as "NOV-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='DEC-2014' then 1 else 0 end) as "DEC-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='JAN-2015' then 1 else 0 end) as "JAN-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='FEB-2015' then 1 else 0 end) as "FEB-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='MAR-2015' then 1 else 0 end) as "MAR-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='APR-2015' then 1 else 0 end) as "APR-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='MAY-2015' then 1 else 0 end) as "MAY-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='JUN-2015' then 1 else 0 end) as "JUN-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='JUL-2015' then 1 else 0 end) as "JUL-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='AUG-2015' then 1 else 0 end) as "AUG-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='SEP-2015' then 1 else 0 end) as "SEP-15"
from audit_log aud
join database db on db.id=aud.release_id
where aud.event_description like '% opted in'
and db.creation_source = 'system_a'
GROUP BY COUNTRY
Ideally we'd simply use a PIVOT statement, or base the columns on the earliest date in the range and go on from there, as in this prior Stack Overflow question: Dynamic pivot in oracle sql
Update based on changing requirements: you do know you can join on multiple criteria, right? :P
Note that we create an inline view from your second query, alias it as Z, and select the two columns we need to match on. Then we join to it as if it were a table!
select cty.country_name,
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='AUG-2014' then 1 else 0 end) as "AUG-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='SEP-2014' then 1 else 0 end) as "SEP-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='OCT-2014' then 1 else 0 end) as "OCT-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='NOV-2014' then 1 else 0 end) as "NOV-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='DEC-2014' then 1 else 0 end) as "DEC-14",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='JAN-2015' then 1 else 0 end) as "JAN-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='FEB-2015' then 1 else 0 end) as "FEB-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='MAR-2015' then 1 else 0 end) as "MAR-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='APR-2015' then 1 else 0 end) as "APR-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='MAY-2015' then 1 else 0 end) as "MAY-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='JUN-2015' then 1 else 0 end) as "JUN-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='JUL-2015' then 1 else 0 end) as "JUL-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='AUG-2015' then 1 else 0 end) as "AUG-15",
SUM(CASE WHEN to_char(Audit_Date,'MON-YYYY') ='SEP-2015' then 1 else 0 end) as "SEP-15"
from dschd.audit_trail aud
join dschd.release r on r.id=aud.release_id
join dschd.country cty on aud.EVENT_COUNTRY_ID=cty.id
join (select rcr.Release_ID, rcr.country_id
from DSCHD.RELEASE_COUNTRY_RIGHT rcr
join dschd.release r on rcr.RELEASE_ID=r.ID
join dschd.country cty on rcr.COUNTRY_ID=cty.id
where r.release_status in ('DRAFT', 'SCHEDULED', 'FINAL', 'DELIVERED')
and r.is_active = 'Y'
and rcr.MARKETING_RIGHT = 'Y'
and rcr.OPT_OUT = 'N'
and r.creation_source = 'DSCHED') Z
ON aud.Release_ID = z.Release_ID and
aud.Event_country_id = z.country_id
where aud.event_description like '% opted in'
GROUP BY cty.COUNTRY_name
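To make the two-column join concrete, here is a cut-down sketch run through Python's sqlite3 module. The table layouts and rows are simplified, hypothetical versions of the ones in the question, and SQLite's strftime() stands in for Oracle's to_char() because that is what sqlite3 supports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical versions of the question's tables.
conn.executescript(
    """
    CREATE TABLE audit_trail (release_id INTEGER, event_country_id INTEGER, audit_date TEXT);
    CREATE TABLE release_country_right (release_id INTEGER, country_id INTEGER, opt_out TEXT);
    INSERT INTO audit_trail VALUES
        (1, 10, '2014-08-11'), (1, 20, '2014-08-12'), (2, 10, '2014-09-11');
    INSERT INTO release_country_right VALUES
        (1, 10, 'N'), (1, 20, 'Y'), (2, 10, 'N');
    """
)

# The inline view z exposes both key columns; the ON clause matches on the pair.
counts = conn.execute(
    """
    SELECT aud.event_country_id,
           SUM(CASE WHEN strftime('%Y-%m', aud.audit_date) = '2014-08' THEN 1 ELSE 0 END) AS aug_14,
           SUM(CASE WHEN strftime('%Y-%m', aud.audit_date) = '2014-09' THEN 1 ELSE 0 END) AS sep_14
    FROM audit_trail aud
    JOIN (SELECT release_id, country_id
          FROM release_country_right
          WHERE opt_out = 'N') z
      ON aud.release_id = z.release_id
     AND aud.event_country_id = z.country_id
    GROUP BY aud.event_country_id
    ORDER BY aud.event_country_id
    """
).fetchall()

for row in counts:
    print(row)
```

The audit row for release 1 / country 20 drops out because its matching rights row is opted out, while both rows for country 10 are counted.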

Proper way to create a pivot table with crosstab

How do I convert the following query into a pivot table using crosstab?
select (SUM(CASE WHEN added_customer=false
THEN 1
ELSE 0
END)) AS CUSTOMERS_NOT_ADDED, (SUM(CASE WHEN added_customer=true
THEN 1
ELSE 0
END)) AS CUSTOMERS_ADDED,
(select (SUM(CASE WHEN added_sales_order=false
THEN 1
ELSE 0
END))
FROM shipments_data
) AS SALES_ORDER_NOT_ADDED,
(select (SUM(CASE WHEN added_sales_order=true
THEN 1
ELSE 0
END))
FROM shipments_data
) AS SALES_ORDER_ADDED,
(select (SUM(CASE WHEN added_fulfillment=false
THEN 1
ELSE 0
END))
FROM shipments_data
) AS ITEM_FULFILLMENT_NOT_ADDED,
(select (SUM(CASE WHEN added_fulfillment=true
THEN 1
ELSE 0
END))
FROM shipments_data
) AS ITEM_FULFILLMENT_ADDED,
(select (SUM(CASE WHEN added_invoice=false
THEN 1
ELSE 0
END))
FROM shipments_data
) AS INVOICE_NOT_ADDED,
(select (SUM(CASE WHEN added_invoice=true
THEN 1
ELSE 0
END))
FROM shipments_data
) AS INVOICE_ADDED,
(select (SUM(CASE WHEN added_ra=false
THEN 1
ELSE 0
END))
FROM shipments_data
) AS RA_NOT_ADDED,
(select (SUM(CASE WHEN added_ra=true
THEN 1
ELSE 0
END))
FROM shipments_data
) AS RA_ADDED,
(select (SUM(CASE WHEN added_credit_memo=false
THEN 1
ELSE 0
END))
FROM shipments_data
) AS CREDIT_MEMO_NOT_ADDED,
(select (SUM(CASE WHEN added_credit_memo=true
THEN 1
ELSE 0
END))
FROM shipments_data
) AS CREDIT_MEMO_ADDED
FROM shipments_data;
This query gives me data in a standard row format however I would like to show this as a pivot table in the following format:
                   Added   Not_Added
Customers          100     0
Sales Orders       50      50
Item Fulfillments  0       100
Invoices           0       100
...
I am using Heroku PostgreSQL, which is running v9.1.6
Also, I'm not sure if my above query can be optimized or if this is poor form. If it can be optimized/improved I would love to learn how.
The tablefunc module that supplies crosstab() is available for 9.1 (like for any other version this side of the millennium). Doesn't Heroku let you install additional modules? Have you tried:
CREATE EXTENSION tablefunc;
For examples how to use it, refer to the manual or this related question:
PostgreSQL Crosstab Query
OR try this search - there are a couple of good answers with examples on SO.
To get you started (like most of the way ..) use this largely simplified and re-organized query as base for the crosstab() call:
SELECT 'added'::text AS col
,SUM(CASE WHEN added_customer THEN 1 ELSE 0 END) AS customers
,SUM(CASE WHEN added_sales_order THEN 1 ELSE 0 END) AS sales_order
,SUM(CASE WHEN added_fulfillment THEN 1 ELSE 0 END) AS item_fulfillment
,SUM(CASE WHEN added_invoice THEN 1 ELSE 0 END) AS invoice
,SUM(CASE WHEN added_ra THEN 1 ELSE 0 END) AS ra
,SUM(CASE WHEN added_credit_memo THEN 1 ELSE 0 END) AS credit_memo
FROM shipments_data
UNION ALL
SELECT 'not_added' AS col
,SUM(CASE WHEN NOT added_customer THEN 1 ELSE 0 END) AS customers
,SUM(CASE WHEN NOT added_sales_order THEN 1 ELSE 0 END) AS sales_order
,SUM(CASE WHEN NOT added_fulfillment THEN 1 ELSE 0 END) AS item_fulfillment
,SUM(CASE WHEN NOT added_invoice THEN 1 ELSE 0 END) AS invoice
,SUM(CASE WHEN NOT added_ra THEN 1 ELSE 0 END) AS ra
,SUM(CASE WHEN NOT added_credit_memo THEN 1 ELSE 0 END) AS credit_memo
FROM shipments_data;
If your columns are defined NOT NULL, you can further simplify the CASE expressions.
If performance is crucial, you can get all aggregates in a single scan in a CTE and split values into two rows in the next step.
WITH x AS (
SELECT count(NULLIF(added_customer, FALSE)) AS customers
,sum(added_sales_order::int) AS sales_order
...
,count(NULLIF(added_customer, TRUE)) AS not_customers
,sum((NOT added_sales_order)::int) AS not_sales_order
...
FROM shipments_data
)
SELECT 'added'::text AS col, customers, sales_order, ... FROM x
UNION ALL
SELECT 'not_added', not_customers, not_sales_order, ... FROM x;
I also demonstrate two alternative ways to build your aggregates - both built on the assumption that all columns are boolean NOT NULL. Both alternatives are syntactically shorter, but not faster. In previous tests all three methods performed about the same.
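If you only need the Added / Not_Added layout from the question and cannot install tablefunc, a plain UNION ALL with one SELECT per row label is another option. This is not the crosstab() approach described above, just a minimal sketch of an alternative. It runs on Python's sqlite3 module, so the boolean columns are stored as 0/1 integers (an assumption; in PostgreSQL you would cast with ::int), and the sample rows and the item alias are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shipments_data ("
    "added_customer INTEGER NOT NULL, added_sales_order INTEGER NOT NULL)"
)
# Hypothetical sample: 4 shipments, 3 with a customer added, 1 with a sales order.
conn.executemany(
    "INSERT INTO shipments_data VALUES (?, ?)",
    [(1, 1), (1, 0), (1, 0), (0, 0)],
)

# One SELECT per output row; each computes the added and not-added counts.
pivot = conn.execute(
    """
    SELECT 'Customers' AS item,
           SUM(added_customer) AS added,
           SUM(1 - added_customer) AS not_added
    FROM shipments_data
    UNION ALL
    SELECT 'Sales Orders',
           SUM(added_sales_order),
           SUM(1 - added_sales_order)
    FROM shipments_data
    """
).fetchall()

# UNION ALL does not promise an order, so index the rows by their label.
by_item = {row[0]: row[1:] for row in pivot}
print(by_item)
```

This scans the table once per row label, so on a large table the single-scan CTE above is likely the better starting point.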