How to add summary rows to income statements in PostgreSQL

The income statement table has this structure:
sgroup char(30),
account char(10),
jan numeric(12,2),
feb numeric(12,2)
and contains these values:
SGroup Account Jan Feb
Sales 311 100 200
Sales 312 20 30
..
Other 410 3333 44
Other 411 333 344
...
How can I convert this table to have a header and subtotals for each group:
Caption Jan Feb
Sales
311 100 200
312 20 30
Sales Total 120 230
Other
410 3333 44
411 333 344
Other total 3666 388
... ... ...
Grand Total ... ...
The Caption column should contain the group header, the account numbers, and the group total for each group.
After each total there should be an empty row.
After that, the next group should follow, and so on.
At the end there should be a "Grand Total" row containing the sum of all rows.
I am using Postgres 9.1.2 on Debian.
The application is a Mono C# ASP.NET MVC application running on Debian. If it is more reasonable, this conversion can also be done in the MVC controller.

I would calculate sums per group in a CTE to use it three times in the main query:
WITH total AS (
   SELECT sgroup, sgroup::text || ' Total' AS c  -- label each group's own total
        , sum(jan) AS j, sum(feb) AS f
   FROM   income_statement
   GROUP  BY 1
   )
(  -- parens required
SELECT caption, jan, feb
FROM  (
   SELECT 1 AS rnk, sgroup, account::text AS caption, jan, feb
   FROM   income_statement
   UNION  ALL
   SELECT 0 AS rnk, sgroup, sgroup::text, NULL, NULL FROM total
   UNION  ALL
   SELECT 2 AS rnk, * FROM total
   ) sub
ORDER  BY sgroup, rnk
)
UNION ALL
SELECT 'Grand Total', sum(j), sum(f) FROM total;
The extra set of parentheses is required so that ORDER BY attaches to the first SELECT rather than to the result of the UNION ALL.
You probably don't want to use the data type char(30). See:
Any downsides of using data type "text" for storing strings?
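A minimal runnable sketch of the same header / detail / subtotal / grand total layout, using Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example is self-contained). The sort columns `grand` and `rnk` are illustrative names. Unlike the query above, the grand total is given its own sort key and the ORDER BY sits on the outermost query, so the row order does not depend on how UNION ALL preserves the order of its branches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE income_statement (sgroup TEXT, account TEXT, jan NUMERIC, feb NUMERIC);
    INSERT INTO income_statement VALUES
        ('Sales', '311', 100, 200),
        ('Sales', '312', 20, 30),
        ('Other', '410', 3333, 44),
        ('Other', '411', 333, 344);
""")

query = """
WITH total AS (
    SELECT sgroup, sgroup || ' Total' AS c, sum(jan) AS j, sum(feb) AS f
    FROM income_statement
    GROUP BY sgroup
)
SELECT caption, jan, feb
FROM (
    -- rnk 1: detail rows, one per account
    SELECT 0 AS grand, sgroup, 1 AS rnk, account AS caption, jan, feb
    FROM income_statement
    UNION ALL
    -- rnk 0: group header with empty amounts
    SELECT 0, sgroup, 0, sgroup, NULL, NULL FROM total
    UNION ALL
    -- rnk 2: group subtotal
    SELECT 0, sgroup, 2, c, j, f FROM total
    UNION ALL
    -- grand = 1 sorts the grand total after every group
    SELECT 1, NULL, 3, 'Grand Total', sum(j), sum(f) FROM total
) AS sub
ORDER BY grand, sgroup, rnk, caption
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Groups come out in alphabetical order of sgroup (Other before Sales), as with ORDER BY sgroup in the query above. A separate ordering column would be needed to list Sales first.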

Related

How to drop a field from a running count

I have been asked to come up with a total count of the clients we have had in any given year. I can produce a running total of clients, but I want to drop clients from the running total once they offboard (i.e. at @EndDate).
DECLARE @EndDate Date
SET @EndDate = (SELECT DISTINCT LOAEndDate FROM tblCompany)
SELECT DISTINCT Year(DateBecameClient) AS [Year],
       Count(CompanyId) OVER (ORDER BY Year(DateBecameClient)) AS NumberofClients
FROM [tblCompany] AS Company
ORDER BY [Year]
Here is the output that I get without including @EndDate.
--------------------
Year NumberofClients
2001 3
2002 6
2003 9
2004 10
2005 13
2006 15
2007 16
2008 26
2009 36
2010 78
2011 135
2012 204
2013 314
2014 385
2015 456
2016 471
2017 496
2018 507
2019 513
2020 514
2021 516
I presume that you have a separate date that indicates when the client left. You'll want to counterbalance with a -1 via a union. If a client was added and lost within the same year it'll never be counted:
with data as (
    select year(DateBecameClient) as yr, 1 as num
    from tblCompany
    union all
    select year(DateLostClient), -1
    from tblCompany
    where DateLostClient is not null  -- active clients have no end date; skip them
)
select yr as "Year", sum(sum(num)) over (order by yr) as NumberOfClients
from data
group by yr
order by "Year";
I'm using grouping with a sum of sums to get around needing distinct. This is basically the same as your query except for the addition of the negative counters.
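The +1 / -1 counterbalance can be tried out with a small sketch using Python's sqlite3 module (a stand-in for SQL Server, so `strftime` replaces `year()`). The four sample companies and the `DateLostClient` column are made-up assumptions, not data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblCompany (CompanyId INTEGER, DateBecameClient TEXT, DateLostClient TEXT);
    INSERT INTO tblCompany VALUES
        (1, '2001-03-01', NULL),
        (2, '2001-06-15', '2003-01-10'),
        (3, '2002-02-02', NULL),
        (4, '2003-05-05', '2003-11-30');
""")

query = """
WITH data AS (
    -- +1 in the year a company became a client
    SELECT CAST(strftime('%Y', DateBecameClient) AS INTEGER) AS yr, 1 AS num
    FROM tblCompany
    UNION ALL
    -- -1 in the year a company left (only for companies that have left)
    SELECT CAST(strftime('%Y', DateLostClient) AS INTEGER), -1
    FROM tblCompany
    WHERE DateLostClient IS NOT NULL
)
-- inner sum: net change per year; outer windowed sum: running total
SELECT yr, sum(sum(num)) OVER (ORDER BY yr) AS NumberOfClients
FROM data
GROUP BY yr
ORDER BY yr
"""

running = conn.execute(query).fetchall()
for yr, clients in running:
    print(yr, clients)
```

Company 4 joins and leaves in 2003, so it contributes +1 and -1 in the same year and never shows up in the running total.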

SQL Querying of Data by grouping with only one main variable(Store) and finding the percentage of customers in other variable

Tables - Store
Stores  Date        Customer_ID
A       01/01/2020  1111
C       01/01/2020  1111
F       02/01/2020  1234
A       02/01/2020  1111
A       02/01/2020  2222
Tables - Customer
Customer_ID  Age_Group     Income_Level
1111         26-30         Low
1234         25 and below  Mid
2222         31-60         High
I want to know how I can get this output.
Stores  Age_Group     Percentage_by_Age  Income_Level  Percentage_By_Income
A       25 and below  10                 Low           80
A       25 and below  10                 Mid           10
A       25 and below  10                 High          10
A       26 - 30       42                 Low           15
A       26 - 30       42                 Mid           65
A       26 - 30       42                 High          20
A       31 - 60       48                 Low           30
A       31 - 60       48                 Mid           50
A       31 - 60       48                 High          20
I am using SQL to query from different tables.
First I need to aggregate the number of customers by store. Then, within each store (for example Store A), I want to find out what share of its customers fall into each age group (for example 25 and below), and, within each age group, what share fall into each income level.
May I know how I can go about solving this query?
Thanks.
My current solution/thought process
SELECT
    stores AS Stores,
    Age_Group AS Age,
    Income_Level AS Income,
    COUNT(DISTINCT(Customer_ID)) AS Number_of_Customers
FROM tables JOIN tables....
GROUP BY Stores, Age, Income;
And then manually calculating the percentages.
But it doesn't seem right.
Is there a way to produce an example output table using just SQL?
For this requirement, common table expressions (CTEs) can be used. The code below produces the expected output.
WITH
  data_for_percent_by_income AS (
    SELECT
      COUNT(customer_id) AS cus_count_in_per_income_level_and_agegrp,
      Age_group AS age_g,
      income_level AS inc_lvl
    FROM
      `project.dataset.Customer2`
    WHERE
      customer_id IN (
        SELECT customer_id
        FROM `project.dataset.Store5`
        WHERE stores='A')
    GROUP BY
      Age_group, income_level),
  tot_cus_in_defined_income_level AS (
    SELECT
      COUNT(customer_id) AS cus_count_in_per_income_level,
      Age_group AS ag
    FROM
      `project.dataset.Customer2`
    WHERE
      customer_id IN (
        SELECT customer_id
        FROM `project.dataset.Store5`
        WHERE stores='A')
    GROUP BY
      Age_group),
  tot_cus_storeA AS (
    SELECT
      COUNT(*) AS tot_cus_in_A
    FROM
      `project.dataset.Customer2`
    WHERE
      customer_id IN (
        SELECT customer_id
        FROM `project.dataset.Store5`
        WHERE stores='A')),
  final_view AS (
    SELECT
      ROUND(cus_count_in_per_income_level_and_agegrp*100/cus_count_in_per_income_level) AS p_by_inc,
      age_g,
      inc_lvl
    FROM
      data_for_percent_by_income
    INNER JOIN
      tot_cus_in_defined_income_level
    ON
      data_for_percent_by_income.age_g=tot_cus_in_defined_income_level.ag)
SELECT
  stores,
  tot_cus_in_defined_income_level.ag AS age_group,
  income_level,
  ROUND(cus_count_in_per_income_level*100/tot_cus_in_A) AS percentage_by_age,
  p_by_inc AS percentage_by_income
FROM
  tot_cus_in_defined_income_level, tot_cus_storeA, `project.dataset.Customer2`, `project.dataset.Store5`
INNER JOIN
  final_view
ON
  age_group=final_view.age_g AND income_level=final_view.inc_lvl
WHERE
  tot_cus_in_defined_income_level.ag = Age_group AND stores='A'
GROUP BY
  stores, percentage_by_age, age_group, income_level, percentage_by_income
ORDER BY Age_group
I have attached screenshots of the input tables (Customer table, Store table) and of the output table.
SELECT
s.Stores AS Stores,
c.age_group AS Age,
a.income_level AS Affluence,
CAST(COUNT(DISTINCT c.Customer_ID) AS numeric)*100/SUM(CAST(COUNT(DISTINCT c.Customer_ID) AS numeric)) OVER(PARTITION BY s.Stores ) AS Perc_of_Members
This is what I did in the end.
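A small sketch of the "share within each store" calculation, run through Python's sqlite3 module against the sample rows from the question. SQLite stands in for the actual engine, and the aliases `counts`, `n`, and `pct` are illustrative. The distinct-customer count is computed in a CTE first and the window SUM is applied afterwards, which should be equivalent to nesting COUNT inside SUM(...) OVER as in the fragment above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Store (Stores TEXT, Date TEXT, Customer_ID INTEGER);
    CREATE TABLE Customer (Customer_ID INTEGER, Age_Group TEXT, Income_Level TEXT);
    INSERT INTO Store VALUES
        ('A', '01/01/2020', 1111),
        ('C', '01/01/2020', 1111),
        ('F', '02/01/2020', 1234),
        ('A', '02/01/2020', 1111),
        ('A', '02/01/2020', 2222);
    INSERT INTO Customer VALUES
        (1111, '26-30', 'Low'),
        (1234, '25 and below', 'Mid'),
        (2222, '31-60', 'High');
""")

query = """
WITH counts AS (
    -- distinct customers per store and age group
    SELECT s.Stores AS store, c.Age_Group AS age_group,
           COUNT(DISTINCT c.Customer_ID) AS n
    FROM Store AS s
    JOIN Customer AS c ON c.Customer_ID = s.Customer_ID
    GROUP BY s.Stores, c.Age_Group
)
-- divide each group's count by the total for its store
SELECT store, age_group,
       n * 100.0 / SUM(n) OVER (PARTITION BY store) AS pct
FROM counts
ORDER BY store, age_group
"""

pct_rows = conn.execute(query).fetchall()
for row in pct_rows:
    print(row)
```

The same pattern with PARTITION BY store, age_group over counts grouped by income level would give the percentage by income within each age group.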

T-SQL Force Select results to have a Primary Key

I have a large set of imperfect data, and from this data I am reverse engineering a table for the coding used.
For this particular task, it is known that all records with a specific division code should have the same group ID and plan ID (which are not included in the data). From another source I have been able to add a close but imperfect (and incomplete) mapping of the group ID and plan ID. Now I want to work backwards and build a division mapping table. I have gotten the data down to a format like this:
Division Year Group Plan Cnt
52 2019 30 101 9031
52 2020 30 101 9562
54 2019 60 602 3510
54 2020 60 602 3385
56 2019 76 904 1113
56 2020 76 905 1125
56 2020 76 001 6
The Division and Year columns should form a primary key. As you can see, (56, 2020) is not unique, but by looking at the Cnt column it is easy to see that the record with a count of 6 is a bad record and should be dropped.
What I need is a method to return each division and year pair once with the group and plan IDs that have the largest count.
Thank You
I found the answer using the Rank() function and WHERE clause:
SELECT *
FROM (
    SELECT Division, Year, [Group], Plan_Cd  -- GROUP is a reserved word, so it must be bracketed
         , RANK() OVER (PARTITION BY Division, Year ORDER BY Cnt DESC) AS rk
    FROM DivisionMap ) R
WHERE rk = 1
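The rank-and-filter approach can be checked against the sample rows from the question with a short sketch using Python's sqlite3 module (a stand-in for SQL Server). The column names `GroupId` and `PlanCd` are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DivisionMap (Division INTEGER, Year INTEGER,
                              GroupId INTEGER, PlanCd TEXT, Cnt INTEGER);
    INSERT INTO DivisionMap VALUES
        (52, 2019, 30, '101', 9031),
        (52, 2020, 30, '101', 9562),
        (54, 2019, 60, '602', 3510),
        (54, 2020, 60, '602', 3385),
        (56, 2019, 76, '904', 1113),
        (56, 2020, 76, '905', 1125),
        (56, 2020, 76, '001', 6);
""")

query = """
SELECT Division, Year, GroupId, PlanCd
FROM (
    -- rank rows within each (Division, Year) by descending count
    SELECT Division, Year, GroupId, PlanCd,
           RANK() OVER (PARTITION BY Division, Year ORDER BY Cnt DESC) AS rk
    FROM DivisionMap
) AS ranked
WHERE rk = 1
ORDER BY Division, Year
"""

best = conn.execute(query).fetchall()
for row in best:
    print(row)
```

RANK() gives tied rows the same rank, so two mappings with an identical top count would both be returned. If (Division, Year) must be strictly unique, ROW_NUMBER() with an extra tie-breaking column in the ORDER BY is one option.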

How to use aggregate functions in my criteria in SQL Server?

I have table called VoucherEntry
These are my records,
ID VoucherOnlineID TransactionNumber Store Amount
-------------------------------------------------------------
120 137 26 1001 100
126 137 22 2000 -56
128 137 30 3000 -20
133 137 11 2000 -5
Now I want to add two columns: a carry amount and a balance amount. For the first row (where VoucherEntry.Amount = 100) the carry should be 0; for every other row it should be the previous row's balance, as shown below.
Expected output
ID VoucherOnlineID TransactionNumber Store Carry Amount Balance
---------------------------------------------------------------------------------
120 137 26 1001 0 100 100
126 137 22 2000 100 -56 44
128 137 30 3000 44 -20 24
133 137 11 2000 24 -5 19
Update
We can sort the records by the ID column or the Date column; once sorted, the records appear in the order shown above.
You need two variations of a Cumulative Sum:
SELECT
    VoucherOnlineID
   ,TransactionNumber
   ,Store
   ,Coalesce(Sum(Amount) -- Cumulative Sum of previous rows
             Over (PARTITION BY VoucherOnlineID
                   ORDER BY DATE -- or whatever determines correct order
                   ROWS BETWEEN Unbounded Preceding AND 1 Preceding), 0) AS Carry
   ,Amount
   ,Sum(Amount) -- Cumulative Sum including current row
    Over (PARTITION BY VoucherOnlineID
          ORDER BY DATE -- or whatever determines correct order
          ROWS Unbounded Preceding) AS Balance
FROM VoucherEntry
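A runnable sketch of the two window frames, using Python's sqlite3 module with the rows from the question. SQLite stands in for SQL Server here, and ordering by ID is an assumption based on the update in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VoucherEntry (ID INTEGER, VoucherOnlineID INTEGER,
                               TransactionNumber INTEGER, Store INTEGER, Amount INTEGER);
    INSERT INTO VoucherEntry VALUES
        (120, 137, 26, 1001, 100),
        (126, 137, 22, 2000, -56),
        (128, 137, 30, 3000, -20),
        (133, 137, 11, 2000, -5);
""")

query = """
SELECT ID,
       -- sum of all earlier rows; the frame is empty for the first row, hence COALESCE
       COALESCE(SUM(Amount) OVER (PARTITION BY VoucherOnlineID ORDER BY ID
                                  ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS Carry,
       Amount,
       -- sum of all rows up to and including the current one
       SUM(Amount) OVER (PARTITION BY VoucherOnlineID ORDER BY ID
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Balance
FROM VoucherEntry
ORDER BY ID
"""

ledger = conn.execute(query).fetchall()
for row in ledger:
    print(row)
```

Each row's Carry equals the previous row's Balance, which matches the expected output in the question.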
For SQL Server 2008 and below:
declare @t table(ID int, VoucherOnlineID int, TransactionNumber int, Store int, Amount int)
insert into @t VALUES
 (120,137,26,1001,100)
,(126,137,22,2000,-56)
,(128,137,30,3000,-20)
,(133,137,11,2000,-5 )
select *
    -- sum of all earlier rows of the same voucher
    ,isnull((select sum(Amount) from @t t1
             where t1.VoucherOnlineID = t.VoucherOnlineID
               and t1.id < t.id), 0) Carry
    -- sum of all rows up to and including the current one
    ,isnull((select sum(Amount) from @t t1
             where t1.VoucherOnlineID = t.VoucherOnlineID
               and t1.id <= t.id), 0) Balance
from @t t

SQL script to partition data on a column and return the max value [duplicate]

I have a requirement to compute bonus payout based on spread goal and date achieved as follows:
Spread Goal | Date Achieved | Bonus Payout
----------------------------------------------
$3,500 | < 27 wks | $2,000
$3,500 | 27 wks to 34 wks | $1,000
$3,500 | > 34 wks | $0
I have a table in SQL Server 2014 where the subset of the data is as follows:
EMP_ID WK_NUM NET_SPRD_LCL
123 10 0
123 11 1500
123 15 3600
123 18 3800
123 19 4000
Based on the requirement, I need to look for records where NET_SPRD_LCL is greater than or equal to 3500 during 2 continuous wk_num.
So, in my example, WK_NUM 15 and 18 (which in my case are continuous because I have a calendar table that I join to to exclude the holiday weeks) are less than 27 wks and have NET_SPRD_LCL > 3500.
For this case, I want to output the MAX(WK_NUM), its associated NET_SPRD_LCL, and BONUSPAYOUT = 2000. So, the output should be as follows:
EMP_ID WK_NUM NET_SPRD_LCL BONUSPAYOUT
123 18 3800 2000
If this meets the first requirement, the script should output and quit. If not, then I will look for the second requirement where Date Achieved is between 27 wks to 34 wks.
I hope I was able to explain my requirement clearly :-)
Thanks for the help.
Nice question! I racked my brain over cases such as four rows in a row with 3500 or more, and came up with this.
You can use a CTE, a recursive CTE, and ROW_NUMBER():
;WITH cte AS (
    SELECT  EMP_ID,
            WK_NUM,
            NET_SPRD_LCL,
            ROW_NUMBER() OVER (PARTITION BY EMP_ID ORDER BY WK_NUM) rn
    FROM YourTable
)
, recur AS (
    SELECT  EMP_ID,
            WK_NUM,
            NET_SPRD_LCL,
            rn,
            1 as lev
    FROM cte
    WHERE rn = 1
    UNION ALL
    SELECT  c.EMP_ID,
            c.WK_NUM,
            c.NET_SPRD_LCL,
            c.rn,
            CASE WHEN c.NET_SPRD_LCL < 3500 THEN Lev+1 ELSE Lev END
    FROM cte c
    INNER JOIN recur r
        ON r.rn+1 = c.rn
)
SELECT TOP 1 WITH TIES
        EMP_ID,
        WK_NUM,
        NET_SPRD_LCL,
        CASE WHEN WK_NUM < 27 THEN $2000
             WHEN WK_NUM between 27 and 34 THEN $1000
             ELSE $0 END as Bonus
FROM recur
WHERE NET_SPRD_LCL >= 3500
ORDER BY ROW_NUMBER() OVER(PARTITION BY EMP_ID,lev ORDER BY WK_NUM)%2
Output for data you provided:
EMP_ID WK_NUM NET_SPRD_LCL Bonus
123 18 3800 2000,00
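As an alternative to the recursive CTE above (a different technique, not the answerer's method), the first pair of consecutive qualifying weeks can also be found with LAG(). The sketch below uses Python's sqlite3 module as a stand-in for SQL Server. The table name `spread` is illustrative, and the sketch assumes holiday weeks have already been filtered out, so adjacent rows count as continuous weeks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE spread (EMP_ID INTEGER, WK_NUM INTEGER, NET_SPRD_LCL INTEGER);
    INSERT INTO spread VALUES
        (123, 10, 0),
        (123, 11, 1500),
        (123, 15, 3600),
        (123, 18, 3800),
        (123, 19, 4000);
""")

query = """
WITH flagged AS (
    -- attach the previous week's spread to every row
    SELECT EMP_ID, WK_NUM, NET_SPRD_LCL,
           LAG(NET_SPRD_LCL) OVER (PARTITION BY EMP_ID ORDER BY WK_NUM) AS prev_sprd
    FROM spread
),
hits AS (
    -- keep weeks where this week and the previous one both reach the goal
    SELECT EMP_ID, WK_NUM, NET_SPRD_LCL,
           ROW_NUMBER() OVER (PARTITION BY EMP_ID ORDER BY WK_NUM) AS rn
    FROM flagged
    WHERE NET_SPRD_LCL >= 3500 AND prev_sprd >= 3500
)
-- the earliest such week per employee decides the bonus tier
SELECT EMP_ID, WK_NUM, NET_SPRD_LCL,
       CASE WHEN WK_NUM < 27 THEN 2000
            WHEN WK_NUM <= 34 THEN 1000
            ELSE 0 END AS BONUSPAYOUT
FROM hits
WHERE rn = 1
"""

bonus = conn.execute(query).fetchall()
for row in bonus:
    print(row)
```

Week 15 is skipped because the week before it (11) is below 3500. Week 18 is the first week whose predecessor also qualifies, so it is the row returned.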