I am wondering if what I am trying to do is possible. I believe it is using the PIVOT function in TSQL but don't have enough experience with the PIVOT function to know where to start.
Basically I'm trying to take the following # table called #tmpbudgetdata (truncated for simplicity):
Account Description BudgetAmount Period
-------------------- ---------------------------------------------------------------------------------------------------- --------------------- --------------------
4001 Mood Embedded Account 0.00 1
4001 Mood Embedded Account 0.00 2
4001 Mood Embedded Account 0.00 3
4001 Mood Embedded Account 0.00 4
4001 Mood Embedded Account 0.00 5
4001 Mood Embedded Account 0.00 6
4001 Mood Embedded Account 0.00 7
4001 Mood Embedded Account 0.00 8
4001 Mood Embedded Account 0.00 9
4001 Mood Embedded Account 0.00 10
4001 Mood Embedded Account 0.00 11
4001 Mood Embedded Account 0.00 12
4003 DBS Music 0.00 1
4003 DBS Music 0.00 2
4003 DBS Music 0.00 3
4003 DBS Music 0.00 4
4003 DBS Music 0.00 5
4003 DBS Music 0.00 6
4003 DBS Music 0.00 7
4003 DBS Music 0.00 8
4003 DBS Music 0.00 9
4003 DBS Music 0.00 10
4003 DBS Music 0.00 11
4003 DBS Music 0.00 12
4010 Sales - Software 5040.00 1
4010 Sales - Software 0.00 2
4010 Sales - Software 6280.56 3
4010 Sales - Software 6947.93 4
4010 Sales - Software 4800.00 5
4010 Sales - Software 0.00 6
4010 Sales - Software 2400.00 7
4010 Sales - Software 2550.00 8
4010 Sales - Software 4800.00 9
4010 Sales - Software 2400.00 10
4010 Sales - Software 0.00 11
4010 Sales - Software 2400.00 12
4015 New Install Revenue 0.00 1
4015 New Install Revenue 0.00 2
4015 New Install Revenue 0.00 3
4015 New Install Revenue 3844.79 4
4015 New Install Revenue 0.00 5
4015 New Install Revenue 0.00 6
4015 New Install Revenue 0.00 7
4015 New Install Revenue 0.00 8
4015 New Install Revenue 0.00 9
4015 New Install Revenue 0.00 10
4015 New Install Revenue 0.00 11
4015 New Install Revenue 0.00 12
and turning it into something like this:
Account Description Period1 Period2 Period3 Period4 Period5 Period6 Period7 Period8 Period9 Period10 Period11 Period12
------- --------------- -------- ------- -------- ------ ------- ------- -------- ------ ------- -------- -------- --------
4001 Mood Enabled... 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
4003 Dbs Music 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
4010 Sales - Software 5040.00 0.00 6280.56 6947.93 4800.00 0.00 2400.00 2550.00 4800.00 2400.00 0.00 2400.00
...etc...
Basically just grouping via the Account column (the description is the same per account) and then taking the period values and pivoting them horizontally.
I know I could do it with a cursor and loop through but wondering if this is possible with a pivot or by other means.
Thanks in advance
I simple PIVOT should do the trick
Example
Select *
From (
Select [Account]
,[Description]
,Period = concat('Period',Period)
,[BudgetAmount]
From YourTable
) src
Pivot (sum([BudgetAmount]) for Period in ( [Period1],[Period2],[Period3],[Period4],[Period5],[Period6],[Period7],[Period8],[Period9],[Period10],[Period11],[Period12] ) ) pvt
Returns
Related
I have the following output file. Please note that this data is dynamic, so there could be more or less years and many more categories A,B,C,D...:
2015 2016 2017
EX
FE
B 0.00 -2.00 -1.00
D 0.00 -1.00 0.00
sumFE 0.00 -3.00 -1.00
VE
B 0.00 -3.00 0.00
C -4.00 0.00 0.00
D 0.00 -5.00 0.00
sumVE -4.00 -8.00 0.00
sumE -4.00 -11.00 -1.00
IN
FI
A 8.00 0.00 0.00
C 0.00 0.00 8.00
sumFI 8.00 0.00 8.00
VI
A 0.00 0.00 5.00
B 4.00 0.00 0.00
sumVI 4.00 0.00 5.00
sumI 12.00 0.00 13.00
net 8.00 -11.00 12.00
I am trying to format it like this.
2015 2016 2017
IN
VI
A 0.00 0.00 5.00
B 4.00 0.00 0.00
sumVI 4.00 0.00 5.00
FI
A 8.00 0.00 0.00
C 0.00 0.00 8.00
sumFI 8.00 0.00 8.00
sumI 12.00 0.00 13.00
EX
VE
B 0.00 -3.00 0.00
C -4.00 0.00 0.00
D 0.00 -5.00 0.00
sumVE -4.00 -8.00 0.00
FE
B 0.00 -2.00 -1.00
D 0.00 -1.00 0.00
sumFE 0.00 -3.00 -1.00
sumE -4.00 -11.00 -1.00
net 8.00 -11.00 12.00
I have tried the following script as a start:
#!/usr/bin/env bash
awk '
BEGIN{FS="\t"}
3>NR {print "D" $0}
$1 ~ /^I$/,$1 ~ /^sumI$/ {
print
}
$1 ~ /^E$/,$1 ~ /^sumE$/{
print
}
$1 ~ /net/ {print ORS $0}
' "${#:--}"
The script would go a long way in replacing all I data for E data however the execution order is not preserved and the I block is printed out last. Can someone please help with this.
This will probably be easier to modify the originating code to use GNU awk's predefined array scanning orders. The key objective is to switch the scanning order (PROCINFO["sorted_in"]) just prior to the associated for (index in array) loop.
Adding four lines of code (see # comments) to what I'm guessing is the originating code:
...
END {
for (year = minYear; year <= maxYear; year++) {
printf "%s%s", OFS, year
}
print ORS
PROCINFO["sorted_in"]="#ind_str_desc" # sort cat == { I | E } in descending order
for (cat in ctiys2amounts) {
printf "%s\n\n",(cat=="I") ? "IN" : "EX" # print { IN | EX }
delete catSum
PROCINFO["sorted_in"]="#ind_str_desc" # sort type == { VI | FI } || { VE | FE } in descending order
for (type in ctiys2amounts[cat]) {
print type
delete typeSum
PROCINFO["sorted_in"]="#ind_str_asc" # sort item == { A | B | C | D } in ascending order
for (item in ctiys2amounts[cat][type]) {
printf "%s", item
for (year = minYear; year <= maxYear; year++) {
amount = ctiys2amounts[cat][type][item][year]
printf "%s%0.2f", OFS, amount
typeSum[year] += amount
}
print ""
}
....
This generates:
2015 2016 2017
IN
VI
A 0.00 0.00 5.00
B 4.00 0.00 0.00
sumVI 4.00 0.00 5.00
FI
A 8.00 0.00 0.00
C 0.00 0.00 8.00
sumFI 8.00 0.00 8.00
sumI 12.00 0.00 13.00
EX
VE
B 0.00 -3.00 0.00
C -4.00 0.00 0.00
D 0.00 -5.00 0.00
sumVE -4.00 -8.00 0.00
FE
B 0.00 -2.00 -1.00
D 0.00 -1.00 0.00
sumFE 0.00 -3.00 -1.00
sumE -4.00 -11.00 -1.00
net 8.00 -11.00 12.00
I have multi index dataframe and I want to convert two columns' value into percentage values.
Capacity\nMWh Day-Ahead\nMWh Intraday\nMWh UEVM\nMWh ... Cost Per. MW\n(with Imp.)\n$/MWh Cost Per. MW\n(w/o Imp.)\n$/MWh Intraday\nMape Day-Ahead\nMape
Power Plants Date ...
powerplant1 2020 January 3.6 446.40 492.70 482.50 ... 0.05 0.32 0.04 0.10
2020 February 0.0 0.00 0.00 0.00 ... 0.00 0.00 0.00 0.00
2020 March 0.0 0.00 0.00 0.00 ... 0.00 0.00 0.00 0.00
2020 April 0.0 0.00 0.00 0.00 ... 0.00 0.00 0.00 0.00
I used apply('{:0%}'.format):
nested_df[['Intraday\nMape', 'Day-Ahead\nMape']] = \
nested_df[['Intraday\nMape', 'Day-Ahead\nMape']].apply('{:.0%}'.format)
But I got this error:
TypeError: ('unsupported format string passed to Series.__format__', 'occurred at index Intraday\nMape')
How can I solve that?
Use DataFrame.applymap:
nested_df[['Intraday\nMape', 'Day-Ahead\nMape']] = \
nested_df[['Intraday\nMape', 'Day-Ahead\nMape']].applymap('{:.0%}'.format)
I'm trying to generate a summary row using ROLLUP grouping,
Here is my query
SELECT nic as NIC,branch_id,SUM(as_share),SUM(as_deposit) as as_deposit,SUM(as_credits) as as_credits,SUM(as_fixed) as as_fixed,SUM(as_ira) as as_ira,SUM(as_saviya) as as_saviya
FROM As_Member_Account_Details
GROUP BY nic,branch_id
WITH ROLLUP
But it give me this output,
112233 1 30.00 0.00 0.00 50.00 0.00 0.00
112233 2 20.00 0.00 0.00 0.00 0.00 0.00
112233 3 0.00 0.00 0.00 0.00 0.00 0.00
112233 NULL 50.00 0.00 0.00 50.00 0.00 0.00
NULL NULL 50.00 0.00 0.00 50.00 0.00 0.00
The row before the last row is unnecessary. Because there should be only 3 data rows + a summary row. How can I eliminate that row
Grouping sets allows more granular control when cubeing data.
SELECT nic as NIC
, branch_id,SUM(as_share)
, SUM(as_deposit) as as_deposit
, SUM(as_credits) as as_credits
, SUM(as_fixed) as as_fixed
, SUM(as_ira) as as_ira
, SUM(as_saviya) as as_saviya
FROM As_Member_Account_Details
GROUP BY GROUPING SETS ((nic,branch_id),())
WITH CTE_YourQuery AS
(
SELECT nic as NIC,branch_id,SUM(as_share),SUM(as_deposit) as as_deposit,SUM(as_credits) as as_credits,SUM(as_fixed) as as_fixed,SUM(as_ira) as as_ira,SUM(as_saviya) as as_saviya
FROM As_Member_Account_Details
GROUP BY nic,branch_id
WITH ROLLUP
)
SELECT *
FROM CTE_YourQuery
WHERE NOT (nic IS NOT NULL AND branch_id IS NULL)
I have a text file in the form below. Could someone help me as to how I could delete columns 2, 3, 4, 5, 6 and 7? I want to keep only 1,8 and 9.
37.55 6.00 24.98 0.00 -2.80 -3.90 26.675 './gold_soln_CB_FragLib_Controls_m1_9.mol2' 'ethyl'
38.45 1.39 27.36 0.00 -0.56 -2.48 22.724 './gold_soln_CB_FragLib_Controls_m2_6.mol2' 'pyridin-2-yl(pyridin-3-yl)methanone'
38.47 0.00 28.44 0.00 -0.64 -2.42 20.387 './gold_soln_CB_FragLib_Controls_m3_3.mol2' 'pyridin-2-yl(pyridin-4-yl)methanone'
42.49 0.07 30.87 0.00 -0.03 -3.24 22.903 './gold_soln_CB_FragLib_Controls_m4_5.mol2' '(3-chlorophenyl)(pyridin-3-yl)methanone'
38.20 1.47 27.53 0.00 -1.13 -3.28 22.858 './gold_soln_CB_FragLib_Controls_m5_2.mol2' 'dipyridin-4-ylmethanone'
41.87 0.57 30.53 0.00 -0.67 -3.16 22.829 './gold_soln_CB_FragLib_Controls_m6_9.mol2' '(3-chlorophenyl)(pyridin-4-yl)methanone'
38.18 1.49 27.09 0.00 -0.56 -1.63 7.782 './gold_soln_CB_FragLib_Controls_m7_1.mol2' '3-hydrazino-6-phenylpyridazine'
39.45 1.50 27.71 0.00 -0.15 -4.17 17.130 './gold_soln_CB_FragLib_Controls_m8_6.mol2' '3-hydrazino-6-phenylpyridazine'
41.54 4.10 27.71 0.00 -0.65 -4.44 9.702 './gold_soln_CB_FragLib_Controls_m9_4.mol2' '3-hydrazino-6-phenylpyridazine'
41.05 1.08 29.30 0.00 -0.31 -2.44 28.590 './gold_soln_CB_FragLib_Controls_m10_3.mol2' '3-hydrazino-6-(4-methylphenyl)pyridazine'
Try:
awk '{print $1"\t"$8"\t"$9}' yourfile.tsv > only189.tsv
I have a Table (parts) where I store when an item was requested and when it was issued. With this, I can easily compute each items turn-around-time ("TAT"). What I'd like to do is have another column ("Computed") where any overlapping request-to-issue dates are properly computed.
RecID Requested Issued TAT Computed
MD0001 11/28/2012 12/04/2012 6.00 0.00
MD0002 11/28/2012 11/28/2012 0.00 0.00
MD0003 11/28/2012 12/04/2012 6.00 0.00
MD0004 11/28/2012 11/28/2012 0.00 0.00
MD0005 11/28/2012 12/10/2012 12.00 0.00
MD0006 11/28/2012 01/21/2013 54.00 54.00
MD0007 11/28/2012 11/28/2012 0.00 0.00
MD0008 11/28/2012 12/04/2012 6.00 0.00
MD0009 01/29/2013 01/30/2013 1.00 1.00
MD0010 01/29/2013 01/30/2013 1.00 0.00
MD0011 02/05/2013 02/06/2013 1.00 1.00
MD0012 02/07/2013 03/04/2013 25.00 25.00
MD0013 03/07/2013 03/14/2013 7.00 7.00
MD0014 03/07/2013 03/08/2013 1.00 0.00
MD0015 03/13/2013 03/25/2013 12.00 11.00
MD0016 03/20/2013 03/21/2013 1.00 0.00
Totals 133.00 99.00 <- waiting for parts TAT summary
In the above, I manually filled in the ("Computed") column so that there is an example of what I'm trying to accomplish.
NOTE: Notice how MD0013 affects the computed time for MD0015 as MD0013 was "computed" first. This could have been where MD0015 was computed first, then MD0013 would be affected accordingly - the net result is there is -1 day.