Postgres Crosstab query Dynamic pivot - sql

Does anyone know how to create the following crosstab in Postgres?
For example I have the following table:
Store Month Sales
A Mar-2020 100
A Feb-2020 200
B Mar-2020 400
B Feb-2020 500
A Jan-2020 400
C Apr-2020 600
I would like the query to return the following crosstab. The column headings should not be hardcoded values but should reflect the values in the "month" column of the first table:
Store Jan-2020 Feb-2020 Mar-2020 Apr-2020
A 400 200 100 -
B - 500 400 -
C - - - 600
Is this possible?

Postgres does have a crosstab function, but I think using the built-in filtering functionality is simpler in this case:
select store,
sum(sales) filter (where month = 'Jan-2020') as Jan_2020,
sum(sales) filter (where month = 'Feb-2020') as Feb_2020,
sum(sales) filter (where month = 'Mar-2020') as Mar_2020,
sum(sales) filter (where month = 'Apr-2020') as Apr_2020
from t
group by store
order by store;
Note: This puts NULL values in the columns with no corresponding value, rather than -. If you really wanted a hyphen, you would need to convert the value to a string, and that seems needlessly complicated.

Try this with a CASE expression inside SUM(); here is the db-fiddle.
select
store,
sum(case when month = 'Jan-2020' then sales end) as "Jan-2020",
sum(case when month = 'Feb-2020' then sales end) as "Feb-2020",
sum(case when month = 'Mar-2020' then sales end) as "Mar-2020",
sum(case when month = 'Apr-2020' then sales end) as "Apr-2020"
from myTable
group by
store
order by
store
Output:
+---------------------------------------------------+
|store Jan-2020 Feb-2020 Mar-2020 Apr-2020|
+---------------------------------------------------+
| A 400 200 100 null |
| B null 500 400 null |
| C null null null 600 |
+---------------------------------------------------+
If you want to replace null values with 0 in the output then use coalesce()
e.g.
coalesce(sum(case when month = 'Jan-2020' then sales end), 0)
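Neither query above covers the "not hardcoded" part of the question. A SQL statement needs its column list fixed before it runs, so one common workaround is to read the distinct months first and generate the conditional-aggregation statement from them. Here is a minimal sketch of that idea in Python. It uses the standard-library sqlite3 module as a stand-in for a Postgres connection so that it runs on its own; the helper names build_pivot_sql and quote_ident and the table name t are assumptions made for illustration, and a Postgres driver would use its own placeholder style instead of ?.

```python
import sqlite3
from datetime import datetime


def quote_ident(name):
    # Double-quote an identifier, doubling any embedded double quotes.
    return '"' + name.replace('"', '""') + '"'


def build_pivot_sql(conn, table="t"):
    # Read the distinct month labels and sort them chronologically.
    # "%b-%Y" parses labels such as "Mar-2020" (English month names assumed).
    months = [r[0] for r in conn.execute(
        f"SELECT DISTINCT month FROM {quote_ident(table)}")]
    months.sort(key=lambda m: datetime.strptime(m, "%b-%Y"))
    # One SUM(CASE ...) column per month; the month value is passed as a
    # parameter, the column alias is a quoted identifier.
    cols = ",\n".join(
        f"  SUM(CASE WHEN month = ? THEN sales END) AS {quote_ident(m)}"
        for m in months
    )
    sql = (f"SELECT store,\n{cols}\nFROM {quote_ident(table)}\n"
           "GROUP BY store\nORDER BY store")
    return sql, months


# Sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (store TEXT, month TEXT, sales INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    ("A", "Mar-2020", 100), ("A", "Feb-2020", 200), ("B", "Mar-2020", 400),
    ("B", "Feb-2020", 500), ("A", "Jan-2020", 400), ("C", "Apr-2020", 600),
])

sql, months = build_pivot_sql(conn)
cur = conn.execute(sql, months)
headers = [d[0] for d in cur.description]
rows = cur.fetchall()
print(headers)
for row in rows:
    print(row)
```

The column set is decided by the first query, so a month added to the table later shows up as a new column the next time the statement is generated.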

Related

SQL Create Column Headers by Month ID

I am trying to extract itemised sales data for the past 12 months and build a dynamic table with column headers for each month ID. Extracting the data as below works, however when I get to the point of creating a SUM column for each month ID, I get stuck. I have tried to find similar questions but I'm not sure of the best approach.
Select Item, Qty, format([Transaction Date],'MMM-yy')
from Transactions
Data Extract:
Item  Qty  Month ID
A123  50   Apr-22
A123  30   May-22
A123  50   Jun-22
A321  50   Apr-22
A999  25   May-22
A321  10   Jun-22
Desired Output:
Item  Apr-22  May-22  Jun-22
A123  50      30      50
A321  50      Null    10
A999  Null    25      Null
Any advice would be greatly appreciated.
This is a typical pivot operation, where you:
first filter every value according to its "Month_ID" value,
then aggregate on the common "Item".
WITH cte AS (
SELECT Item, Qty, FORMAT([Transaction Date],'MMM-yy') AS Month_ID
FROM Transactions
)
SELECT Item,
MAX(CASE WHEN Month_ID = 'Apr-22' THEN Qty END) AS [Apr-22],
MAX(CASE WHEN Month_ID = 'May-22' THEN Qty END) AS [May-22],
MAX(CASE WHEN Month_ID = 'Jun-22' THEN Qty END) AS [Jun-22]
FROM cte
GROUP BY Item
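Note: you don't need SUM as long as there's only one value for each <"Item", "Month-Year"> pair. If an item can have several rows in the same month, use SUM instead of MAX so the quantities are added up.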
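To make the MAX versus SUM point concrete, here is a small sketch that runs both aggregates over the same rows. It uses Python's sqlite3 module rather than SQL Server so that it is self-contained; the table name sales and the second Apr-22 row are hypothetical and only there to create a duplicate <item, month> pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, qty INTEGER, month_id TEXT)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("A123", 50, "Apr-22"),
    ("A123", 20, "Apr-22"),  # hypothetical second sale in the same month
    ("A123", 30, "May-22"),
])

# Same CASE filter, two different aggregates.
row = conn.execute("""
    SELECT item,
           MAX(CASE WHEN month_id = 'Apr-22' THEN qty END) AS apr_max,
           SUM(CASE WHEN month_id = 'Apr-22' THEN qty END) AS apr_sum
    FROM sales
    GROUP BY item
""").fetchone()
print(row)  # ('A123', 50, 70)
```

With one row per pair the two columns would be identical; with duplicates only SUM gives the monthly total.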

dynamic grouping by month in SQL

I am using CASE statements to group data into Month columns, like this:
SUM(CASE WHEN MONTH(date) = 1 THEN ROUND(value) END) as jan,
SUM(CASE WHEN MONTH(date) = 2 THEN ROUND(value) END) as feb,
SUM(CASE WHEN MONTH(date) = 3 THEN ROUND(value) END) as mar
Is it possible to NOT have to define the different CASE groupings?
I want to define the data range in the WHERE statement, and then have the report group by month, for whatever range I define. For example, maybe my report starts with July'20, and not Jan.
Is this possible in an SQL query?
Thanks
edit - example output:
+-------+-------+------+-------+-------+
| | July | Aug | Sep | etc |
+-------+-------+------+-------+-------+
| value | 435 € | 24 € | 234 € | 453 € |
+-------+-------+------+-------+-------+
edit - possible solution/workaround:
If I do the following, it can be considered "semi-dynamic". I still need to define the month "buckets", but they can be triggered by the starting date (the month({ d '2021-01-01' }) part can later be replaced with a variable, so that is taken care of as well).
SUM (CASE WHEN MONTH(date) = month({ d '2021-01-01' }) THEN value END) as month_1,
SUM (CASE WHEN MONTH(date) = month({ d '2021-01-01' })+1 THEN value END) as month_2,
SUM (CASE WHEN MONTH(date) = month({ d '2021-01-01' })+2 THEN value END) as month_3,
etc
The main downside is that I have to hard-code the number of month groupings, so I'd be happy to hear of a better solution!
You can create a dynamic sql statement to generate the result into a temporary table and then return the result from the temporary table. Here is an example:
declare sqlStr string;
sqlStr = 'SELECT 1 AS column1, 2 AS column2 INTO #temp FROM system.iota';
execute immediate sqlStr;
select * from #temp;
The logic to generate the required grouping is up to you. You will need to use a WHILE loop to construct the statement text.
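The loop that builds the statement text can also live in the client instead of in a SQL script. As a rough sketch, assuming the YEAR() and MONTH() functions and the date and value columns from the question, the following Python function emits one SUM(CASE ...) line per month for any start month and bucket count. The function name month_buckets and the m2020_11 style aliases are made up for the example. Unlike month(...)+1, it also rolls over from December into the next year.

```python
def month_buckets(start_year, start_month, count):
    """Return the SELECT-list text for `count` consecutive month buckets."""
    lines = []
    year, month = start_year, start_month
    for _ in range(count):
        lines.append(
            f"SUM(CASE WHEN YEAR(date) = {year} AND MONTH(date) = {month} "
            f"THEN value END) AS m{year}_{month:02d}"
        )
        # Advance one month, rolling over into the next year after December.
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return ",\n".join(lines)


select_list = month_buckets(2020, 11, 3)
print(select_list)
```

The returned text is meant to be pasted between SELECT and FROM of the generated statement before it is executed.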

SQL query: get total values for each month

I have a table that stores the number of fruits sold on each date.
CREATE TABLE data
(
code VARCHAR2(50) NOT NULL,
amount NUMBER(5) NOT NULL,
DATE VARCHAR2(50) NOT NULL
);
Sample data
code |amount| date
------+------+------------
aple | 1 | 01/01/2010
aple | 2 | 02/02/2010
orange| 3 | 03/03/2010
orange| 4 | 04/04/2010
I need to write a query to list how many apples and oranges were sold in January and February.
--total apple for jan
select sum(amount) from mg.drum d where date >='01/01/2010' and cdate < '01/02/2020' and code = 'aple';
--total apple for feb
select sum(amount) from mg.drum d where date >='01/02/2010' and cdate < '01/03/2020' and code = 'aple';
--total orange for jan
select sum(amount) from mg.drum d where date >='01/01/2010' and cdate < '01/02/2020' and code = 'orange';
--total orange for feb
select sum(amount) from mg.drum d where date >='01/02/2010' and cdate < '01/03/2020' and code = 'orange';
If I need to calculate this for more months and more fruits, it's tedious. Is there a shorter query to write?
Can I combine at least for the months into 1 query? So 1 query to get total for each month for 1 fruit?
You can use conditional aggregation such as
SELECT TO_CHAR("date",'MM/YYYY') AS "Month/Year",
SUM( CASE WHEN code = 'apple' THEN amount END ) AS apple_sold,
SUM( CASE WHEN code = 'orange' THEN amount END ) AS orange_sold
FROM data
WHERE "date" BETWEEN date'2020-01-01' AND date'2020-02-29'
GROUP BY TO_CHAR("date",'MM/YYYY')
Note that date is a reserved keyword, so it cannot be used as a column name unless it is quoted.
select sum(amount), //date.month
from mg.drum
group by //date.month
In place of //date.month you can give an expression that returns the month number or name.
If you are dealing with months, then you should include the year as well. I would recommend:
SELECT TRUNC("date", 'MON') AS yyyymm, code,
SUM(amount)
FROM t
GROUP BY TRUNC("date", 'MON'), code;
You can add a WHERE clause if you want only some dates or codes.
This will return a separate row for each month/code combination that has data. That is pretty close to the results from your four queries, but it does not return rows with 0 for combinations that have no sales.
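If the zero rows matter, one way to get them is to build every month/code combination first and left join the data onto it. The sketch below shows the idea with Python's sqlite3 module so that it is runnable as is. It is an illustration under assumptions, not Oracle syntax: the column is called sold_on, the dates are stored as ISO strings, and strftime is SQLite's function (in Oracle the month expression would be something like the TRUNC or TO_CHAR calls shown in the answers).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (code TEXT, amount INTEGER, sold_on TEXT)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", [
    ("aple", 1, "2010-01-01"),
    ("aple", 2, "2010-02-02"),
    ("orange", 3, "2010-03-03"),
    ("orange", 4, "2010-04-04"),
])

# Every month crossed with every code, then the real rows joined on.
# COALESCE turns the NULL sum of an empty combination into 0.
totals = conn.execute("""
    SELECT m.ym, c.code, COALESCE(SUM(d.amount), 0) AS total
    FROM (SELECT DISTINCT strftime('%Y-%m', sold_on) AS ym FROM data) m
    CROSS JOIN (SELECT DISTINCT code FROM data) c
    LEFT JOIN data d
           ON strftime('%Y-%m', d.sold_on) = m.ym
          AND d.code = c.code
    GROUP BY m.ym, c.code
    ORDER BY m.ym, c.code
""").fetchall()

for row in totals:
    print(row)
```

This only fills in months that appear somewhere in the table; a month with no sales at all would need a separate calendar of months to join against.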
select to_char(date_col,'MONTH') as month, code, sum(amount)
from mg.drum
group by to_char(date_col,'MONTH'), code

SQL Query to get sums among multiple payments which are greater than or less than 10k

I am trying to write a query to get sums of payments from accounts for a month. I have been able to get most of the way there, but I have hit a roadblock. My challenge is that I need a count of the accounts whose total payments are either < 10000 or >= 10000. The business rules are that a single payment may not exceed 10000, but there can be multiple payments that total more than 10000. A simple mock table might look like this:
ID | AccountNo | Payment
1 | 1 | 5000
2 | 1 | 6000
3 | 2 | 5000
4 | 3 | 9000
5 | 3 | 5000
So the results I would expect would be something like
NumberOfPaymentsBelow10K | NumberOfPayments10K+
1 | 2
I would like to avoid doing a function or stored procedure and would prefer a sub query.
Any help with this query would be greatly appreciated!
I suggest avoiding subqueries where possible because they can hurt performance, especially if you have a large amount of data. You can use a Common Table Expression instead:
;WITH CTE
AS
(
SELECT AccountNo, SUM(Payment) AS TotalPayment
FROM Payments
GROUP BY AccountNo
)
SELECT
SUM(CASE WHEN TotalPayment < 10000 THEN 1 ELSE 0 END) AS 'NumberOfPaymentsBelow10K',
SUM(CASE WHEN TotalPayment >= 10000 THEN 1 ELSE 0 END) AS 'NumberOfPayments10K+'
FROM CTE
You can get the totals per account using SUM and GROUP BY...
SELECT AccountNo, SUM(Payment) AS TotPay
FROM payments
GROUP BY AccountNo
You can use that result to count the number of accounts at 10000 or more (most databases require an alias on the derived table):
SELECT COUNT(*)
FROM (
SELECT AccountNo, SUM(Payment) AS TotPay
FROM payments
GROUP BY AccountNo
) AS t
WHERE TotPay >= 10000
You can get the number under and the number at or over 10000 in a single query if you want, but that's a bit more complicated:
SELECT
COUNT(CASE WHEN TotPay <  10000 THEN 1 END) AS Below10K,
COUNT(CASE WHEN TotPay >= 10000 THEN 1 END) AS AtLeast10K
FROM (
SELECT AccountNo, SUM(Payment) AS TotPay
FROM payments
GROUP BY AccountNo
) AS t
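As a quick check of the single-query approach against the mock data in the question, here is a sketch using Python's sqlite3 module in place of the real database. The boundaries are < 10000 and >= 10000 as stated in the question, and the alias totals on the derived table is an arbitrary name chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (ID INTEGER, AccountNo INTEGER, Payment INTEGER)")
conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", [
    (1, 1, 5000), (2, 1, 6000), (3, 2, 5000), (4, 3, 9000), (5, 3, 5000),
])

# Inner query: total per account. Outer query: count accounts per bucket.
counts = conn.execute("""
    SELECT COUNT(CASE WHEN TotPay <  10000 THEN 1 END) AS Below10K,
           COUNT(CASE WHEN TotPay >= 10000 THEN 1 END) AS AtLeast10K
    FROM (
        SELECT AccountNo, SUM(Payment) AS TotPay
        FROM payments
        GROUP BY AccountNo
    ) AS totals
""").fetchone()
print(counts)  # (1, 2)
```

Account 2 totals 5000, while accounts 1 and 3 total 11000 and 14000, which gives the 1 | 2 result expected in the question.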

Select info from table where row has max date

My table looks something like this:
group date cash checks
1 1/1/2013 0 0
2 1/1/2013 0 800
1 1/3/2013 0 700
3 1/1/2013 0 600
1 1/2/2013 0 400
3 1/5/2013 0 200
-- Do not need cash just demonstrating that table has more information in it
I want to get, for each unique group, the row where date is max and checks is greater than 0. So the return would look something like:
group date checks
2 1/1/2013 800
1 1/3/2013 700
3 1/5/2013 200
attempted code:
SELECT group,MAX(date),checks
FROM table
WHERE checks>0
GROUP BY group
ORDER BY group DESC
The problem with that, though, is that it gives me all the dates and checks rather than just the max-date row.
Using MS SQL Server 2005.
SELECT [group], MAX([date]) AS max_date
FROM [table]
WHERE checks > 0
GROUP BY [group]
That works to get the max date. Join it back to your data to get the other columns:
SELECT t.[group], a.max_date, t.checks
FROM [table] t
INNER JOIN
(SELECT [group], MAX([date]) AS max_date
FROM [table]
WHERE checks > 0
GROUP BY [group]) a
ON a.[group] = t.[group] AND a.max_date = t.[date]
The inner join functions as the filter to get the max record only.
FYI, your column names are horrid; don't use reserved words (group, date, table) for columns or tables. If you keep them, they have to be bracketed as above.
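Here is a small runnable sketch of the join-back approach on the sample data, using Python's sqlite3 module instead of SQL Server. The names cash_log, grp and dt are hypothetical replacements for the reserved words, and the dates are stored as ISO strings (an assumption) so that MAX compares them chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cash_log (grp INTEGER, dt TEXT, cash INTEGER, checks INTEGER)")
conn.executemany("INSERT INTO cash_log VALUES (?, ?, ?, ?)", [
    (1, "2013-01-01", 0, 0),
    (2, "2013-01-01", 0, 800),
    (1, "2013-01-03", 0, 700),
    (3, "2013-01-01", 0, 600),
    (1, "2013-01-02", 0, 400),
    (3, "2013-01-05", 0, 200),
])

# The derived table finds the latest date per group among rows with
# checks > 0; joining back on (grp, dt) keeps only that row per group.
latest = conn.execute("""
    SELECT t.grp, t.dt, t.checks
    FROM cash_log t
    JOIN (SELECT grp, MAX(dt) AS max_dt
          FROM cash_log
          WHERE checks > 0
          GROUP BY grp) a
      ON a.grp = t.grp AND a.max_dt = t.dt
    ORDER BY t.grp
""").fetchall()

for row in latest:
    print(row)
```

If two rows in a group share the latest date, the join returns both of them, so a tie-breaker is needed when exactly one row per group is required.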
You can use a window MAX() like this:
SELECT
*,
max_date = MAX(date) OVER (PARTITION BY group)
FROM table
to get max dates per group alongside other data:
group date cash checks max_date
----- -------- ---- ------ --------
1 1/1/2013 0 0 1/3/2013
2 1/1/2013 0 800 1/1/2013
1 1/3/2013 0 700 1/3/2013
3 1/1/2013 0 600 1/5/2013
1 1/2/2013 0 400 1/3/2013
3 1/5/2013 0 200 1/5/2013
Using the above output as a derived table, you can then get only rows where date matches max_date:
SELECT
group,
date,
checks
FROM (
SELECT
*,
max_date = MAX(date) OVER (PARTITION BY group)
FROM table
) AS s
WHERE date = max_date
;
to get the desired result.
Basically, this is similar to @Twelfth's suggestion but avoids a join and may thus be more efficient.
You can try the method at SQL Fiddle.
Using IN can have a performance impact. Joining two subqueries will not have the same performance impact and can be accomplished like this:
SELECT *
FROM (SELECT msisdn
,callid
,Change_color
,play_file_name
,date_played
FROM insert_log
WHERE play_file_name NOT IN('Prompt1','Conclusion_Prompt_1','silent')
ORDER BY callid ASC) t1
JOIN (SELECT MAX(date_played) AS date_played
FROM insert_log GROUP BY callid) t2
ON t1.date_played = t2.date_played
SELECT distinct
group,
max_date = MAX(date) OVER (PARTITION BY group), checks
FROM table
Should work.