SQL Server 2012 Days in Drawdown

I have a table of stock prices:
CREATE TABLE #Table (ClosingDate DATE, Ticker VARCHAR(6), Price DECIMAL(6,2))
INSERT INTO #Table
VALUES ('1/1/13' , 'ABC' , '100.00')
,('1/2/13' , 'ABC' , '101.50')
,('1/3/13' , 'ABC' , '99.80')
,('1/4/13' , 'ABC' , '95.50')
,('1/5/13' , 'ABC' , '78.00')
,('1/1/13' , 'JKL' , '34.57')
,('1/2/13' , 'JKL' , '33.99')
,('1/3/13' , 'JKL' , '31.85')
,('1/4/13' , 'JKL' , '30.11')
,('1/5/13' , 'JKL' , '45.00')
,('1/1/13' , 'XYZ' , '11.50')
,('1/2/13' , 'XYZ' , '12.10')
,('1/3/13' , 'XYZ' , '17.15')
,('1/4/13' , 'XYZ' , '14.10')
,('1/5/13' , 'XYZ' , '15.55')
I calculate drawdowns (the % decline from the running maximum price) for each ticker:
SELECT Ticker,
t.ClosingDate,
t.Price,
MAX(t.[Price]) OVER (PARTITION BY Ticker ORDER BY ClosingDate) AS max_price,
(t.[Price] / MAX(t.[Price]) OVER (PARTITION BY Ticker ORDER BY ClosingDate)) - 1 AS Drawdown
FROM
#Table t;
Output:
Ticker ClosingDate Price max_price Drawdown
-----------------------------------------------------
ABC 2013-01-01 100.00 100.00 0.000000000
ABC 2013-01-02 101.50 101.50 0.000000000
ABC 2013-01-03 99.80 101.50 -0.016748769
ABC 2013-01-04 95.50 101.50 -0.059113301
ABC 2013-01-05 78.00 101.50 -0.231527094
JKL 2013-01-01 34.57 34.57 0.000000000
JKL 2013-01-02 33.99 34.57 -0.016777553
JKL 2013-01-03 31.85 34.57 -0.078680938
JKL 2013-01-04 30.11 34.57 -0.129013596
JKL 2013-01-05 45.00 45.00 0.000000000
XYZ 2013-01-01 11.50 11.50 0.000000000
XYZ 2013-01-02 12.10 12.10 0.000000000
XYZ 2013-01-03 17.15 17.15 0.000000000
XYZ 2013-01-04 14.10 17.15 -0.177842566
XYZ 2013-01-05 15.55 17.15 -0.093294461
A new high price counts as a drawdown of 0.
How can I add a "days in drawdown" column?
Any date where the drawdown = 0 resets the counter to 0; the counter then increments for each consecutive day the ticker remains in drawdown (price < max price).
Here is my expected output:
Ticker ClosingDate Price max_price Drawdown Days in DD
--------------------------------------------------------------------
ABC 1/1/2013 100.00 100.00 0.0000 0
ABC 1/2/2013 101.50 101.50 0.0000 0
ABC 1/3/2013 99.80 101.50 -0.0167 1
ABC 1/4/2013 95.50 101.50 -0.0591 2
ABC 1/5/2013 78.00 101.50 -0.2315 3
JKL 1/1/2013 34.57 34.57 0.0000 0
JKL 1/2/2013 33.99 34.57 -0.0168 1
JKL 1/3/2013 31.85 34.57 -0.0787 2
JKL 1/4/2013 30.11 34.57 -0.1290 3
JKL 1/5/2013 45.00 45.00 0.0000 0
XYZ 1/1/2013 11.50 11.50 0.0000 0
XYZ 1/2/2013 12.10 12.10 0.0000 0
XYZ 1/3/2013 17.15 17.15 0.0000 0
XYZ 1/4/2013 14.10 17.15 -0.1778 1
XYZ 1/5/2013 15.55 17.15 -0.0933 2
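The counter described here is a gaps-and-islands problem: every zero-drawdown row starts a new island, and the rows inside each island are numbered from 0. As a rough sketch of that logic (in pandas, with illustrative column names such as DaysInDD; this is not a T-SQL answer from the thread):

```python
import pandas as pd

# Sample data from the question (ticker ABC only, for brevity).
df = pd.DataFrame({
    "ClosingDate": pd.to_datetime(["2013-01-01", "2013-01-02", "2013-01-03",
                                   "2013-01-04", "2013-01-05"]),
    "Ticker": ["ABC"] * 5,
    "Price": [100.00, 101.50, 99.80, 95.50, 78.00],
}).sort_values(["Ticker", "ClosingDate"])

# Running max per ticker, mirroring MAX(Price) OVER (PARTITION BY ... ORDER BY ...).
df["max_price"] = df.groupby("Ticker")["Price"].cummax()
df["Drawdown"] = df["Price"] / df["max_price"] - 1

# Gaps and islands: every zero-drawdown row opens a new island, and
# cumcount numbers the rows inside each island starting from 0.
island = df["Drawdown"].eq(0).groupby(df["Ticker"]).cumsum()
df["DaysInDD"] = df.groupby([df["Ticker"], island]).cumcount()
print(df[["Ticker", "ClosingDate", "Drawdown", "DaysInDD"]])
```

One way to translate the island id to T-SQL is a running COUNT of the zero-drawdown rows, e.g. COUNT(CASE WHEN Price = max_price THEN 1 END) OVER (PARTITION BY Ticker ORDER BY ClosingDate), followed by a ROW_NUMBER() per (Ticker, island) minus 1.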

Related

Divide individual grouped items in pandas

I have the following df:
appid tag totalvalue
0 1234 B 50.00
1 1234 BA 10.00
2 2345 B 100.00
3 2345 BA 25.00
4 2345 BCS 15.00
What we want is to group the df by appid and, within each group, divide each tag's totalvalue by the totalvalue of the tag='B' row, like so:
appid tag total %tage(B)
0 1234 B 50.00 1
1 1234 BA 10.00 0.2
2 2345 B 100.00 1
3 2345 BA 25.00 0.4
4 2345 BCS 15.00 0.15
You can use groupby:
gmax = df['totalvalue'].where(df['tag'] == 'B').groupby(df['appid']).transform('max')
df['%tage(B)'] = 1 / (gmax / df['totalvalue'])
print(df)
# Output
appid tag totalvalue %tage(B)
0 1234 B 50.0 1.00
1 1234 BA 10.0 0.20
2 2345 B 100.0 1.00
3 2345 BA 25.0 0.25
4 2345 BCS 15.0 0.15

Calculation over pandas groupby object with condition within groups

I have a df as follows:
appid month tag totalvalue
0 1234 02-'22 B 50.00
1 1234 02-'22 BA 10.00
2 1234 01-'22 B 100.00
3 2345 03-'22 BA 25.00
4 2345 03-'22 B 100.00
5 2345 04-'22 BB 100.00
The output I want is as follows:
appid month tag totalvalue %tage
0 1234 02-'22 B 50.00 1.0
1 1234 02-'22 BA 10.00 0.2
2 1234 01-'22 B 100.00 1.0
3 2345 03-'22 BA 25.00 0.25
4 2345 03-'22 B 100.00 1.0
5 2345 04-'22 BB 100.00 inf
I want to group by appid & month. Moreover, I want to check whether a tag='B' row is available in each group: if so, divide the other tags' totalvalue by it; if not, show inf.
I have tried df.groupby(['appid', 'month'])['totalvalue'] but am unable to use the tag='B' row as the denominator over the groupby object.
IIUC, you can use a groupby.transform('first') on the masked totalvalue, then use it as a divisor:
m = df['tag'].eq('B')
df['%tage'] = (df['totalvalue']
               .div(df['totalvalue'].where(m)
                    .groupby([df['appid'], df['month']])
                    .transform('first')
                    .fillna(0))
               )
output:
appid month tag totalvalue %tage
0 1234 02-'22 B 50.0 1.00
1 1234 02-'22 BA 10.0 0.20
2 1234 01-'22 B 100.0 1.00
3 2345 03-'22 BA 25.0 0.25
4 2345 03-'22 B 100.0 1.00
5 2345 04-'22 BB 100.0 inf
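For reference, here is the snippet above as a self-contained reproduction of the question's sample frame; the NaN-skipping behaviour of transform('first') is what makes the masked denominator work:

```python
import pandas as pd

# Minimal reproduction of the question's frame; column names are the question's.
df = pd.DataFrame({
    "appid": [1234, 1234, 1234, 2345, 2345, 2345],
    "month": ["02-'22", "02-'22", "01-'22", "03-'22", "03-'22", "04-'22"],
    "tag": ["B", "BA", "B", "BA", "B", "BB"],
    "totalvalue": [50.0, 10.0, 100.0, 25.0, 100.0, 100.0],
})

m = df["tag"].eq("B")
# transform('first') skips NaN, so each group picks up its B row's totalvalue.
denom = df["totalvalue"].where(m).groupby([df["appid"], df["month"]]).transform("first")
# Groups without a B row keep NaN; fillna(0) turns those divisions into inf.
df["%tage"] = df["totalvalue"].div(denom.fillna(0))
print(df)
```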

GroupBy, Transpose, and flatten rows in Pandas

   as_of_date  industry  sector  deal  year  quarter  stage  amount  yield
0  2022-01-01  Mortgage  RMBS    XYZ   2022  NaN      A      111     0.1
1  2022-01-01  Mortgage  RMBS    XYZ   2022  1        A      222     0.2
2  2022-01-01  Mortgage  RMBS    XYZ   2022  2        A      333     0.3
3  2022-01-01  Mortgage  RMBS    XYZ   2022  3        A      444     0.4
4  2022-01-01  Mortgage  RMBS    XYZ   2022  4        A      555     0.5
5  2022-01-01  Mortgage  RMBS    XYZ   2022  Nan      B      123     0.6
6  2022-01-01  Mortgage  RMBS    XYZ   2022  1        B      234     0.7
7  2022-01-01  Mortgage  RMBS    XYZ   2022  2        B      345     0.8
8  2022-01-01  Mortgage  RMBS    XYZ   2022  3        B      456     0.9
9  2022-01-01  Mortgage  RMBS    XYZ   2022  4        B      567     1.0
For each group (as_of_date, industry, sector, deal, year, stage), I need to display all the amounts and yields in one line
I have tried this -
df.groupby(['as_of_date', 'industry', 'sector', 'deal', 'year', 'stage'])['amount', 'yield'].apply(lambda df: df.reset_index(drop=True)).unstack().reset_index()
but this is not working correctly.
Basically, I need this as output rows -
2022-01-01 Mortgage RMBS XYZ 2022 A 111 222 333 444 555 0.1 0.2 0.3 0.4 0.5
2022-01-01 Mortgage RMBS XYZ 2022 B 123 234 345 456 567 0.6 0.7 0.8 0.9 1.0
What would be the correct way to achieve this with Pandas? Thank you
This can be calculated by creating a list for each column, concatenating them (using +), converting the result to a string, and stripping the [, ], and , characters:
df1 = df.groupby(['as_of_date', 'industry', 'sector', 'deal', 'year', 'stage']).apply(
    lambda x: str(list(x['amount']) + list(x['yield']))[1:-1].replace(",", ""))
df1
#Out:
#as_of_date industry sector deal year stage
#2022-01-01 Mortgage RMBS XYZ 2022 A 111 222 333 444 555 0.1 0.2 0.3 0.4 0.5
# B 123 234 345 456 567 0.6 0.7 0.8 0.9 1.0
Maybe this?
df.groupby(['as_of_date', 'industry', 'sector', 'deal', 'year', 'stage']).agg(' '.join).reset_index()
Does this answer your question?
df2 = df.pivot(index=['as_of_date','industry','sector','deal','year', 'stage'], columns=['quarter']).reset_index()
To flatten the column names:
df2.columns = df2.columns.to_series().str.join('_')
df2
as_of_date_ industry_ sector_ deal_ year_ stage_ amount_1 amount_2 amount_3 amount_4 amount_NaN amount_Nan yield_1 yield_2 yield_3 yield_4 yield_NaN yield_Nan
0 2022-01-01 Mortgage RMBS XYZ 2022 A 222.0 333.0 444.0 555.0 111.0 NaN 0.2 0.3 0.4 0.5 0.1 NaN
1 2022-01-01 Mortgage RMBS XYZ 2022 B 234.0 345.0 456.0 567.0 NaN 123.0 0.7 0.8 0.9 1.0 NaN 0.6
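As a further sketch (not from the answers above), groupby(...).agg(list) keeps the amounts and yields as ordered numeric lists per group instead of collapsing them into a string, which is easier to post-process:

```python
import pandas as pd

# The question's sample data, rebuilt inline.
df = pd.DataFrame({
    "as_of_date": ["2022-01-01"] * 10,
    "industry": ["Mortgage"] * 10,
    "sector": ["RMBS"] * 10,
    "deal": ["XYZ"] * 10,
    "year": [2022] * 10,
    "quarter": [None, 1, 2, 3, 4, None, 1, 2, 3, 4],
    "stage": ["A"] * 5 + ["B"] * 5,
    "amount": [111, 222, 333, 444, 555, 123, 234, 345, 456, 567],
    "yield": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
})

keys = ["as_of_date", "industry", "sector", "deal", "year", "stage"]
# One row per group; amount/yield become ordered lists instead of strings.
flat = df.groupby(keys)[["amount", "yield"]].agg(list).reset_index()
print(flat)
```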

Calculate day's difference between successive pandas dataframe rows with condition

I have a dataframe as following:
Company Date relTweet GaplastRel
XYZ 3/2/2020 1
XYZ 3/3/2020 1
XYZ 3/4/2020 1
XYZ 3/5/2020 1
XYZ 3/5/2020 0
XYZ 3/6/2020 1
XYZ 3/8/2020 1
ABC 3/9/2020 0
ABC 3/10/2020 1
ABC 3/11/2020 0
ABC 3/12/2020 1
The relTweet displays whether the tweet is relevant (1) or not (0).
I need to find the days difference (GaplastRel) between each row and the most recent earlier relevant tweet (relTweet = 1) for each company. E.g. for the first record GaplastRel should be 0. For the 2nd record, GaplastRel should be 1, as the last relevant tweet was made one day ago.
Below is the example of needed output:
Company Date relTweet GaplastRel
XYZ 3/2/2020 1 0
XYZ 3/3/2020 1 1
XYZ 3/4/2020 1 1
XYZ 3/5/2020 1 1
XYZ 3/5/2020 0 1
XYZ 3/6/2020 1 1
XYZ 3/8/2020 1 2
ABC 3/9/2020 0 0
ABC 3/10/2020 1 0
ABC 3/11/2020 0 1
ABC 3/12/2020 1 2
Following is my code:
dataDf['Date'] = pd.to_datetime(dataDf['Date'], format='%m/%d/%Y')
dataDf['GaplastRel'] = (dataDf.groupby('Company', group_keys=False)
                        .apply(lambda g: g['Date'].diff().replace(0, np.nan).ffill()))
This code gives the days difference between successive rows for each company without considering the relTweet = 1 condition. I am not sure how to apply the condition.
Following is the output of the above code:
Company Date relTweet GaplastRel
XYZ 3/2/2020 1 NaT
XYZ 3/3/2020 1 1 days
XYZ 3/4/2020 1 1 days
XYZ 3/5/2020 1 1 days
XYZ 3/5/2020 0 0 days
XYZ 3/6/2020 1 1 days
XYZ 3/8/2020 1 2 days
ABC 3/9/2020 0 NaT
ABC 3/10/2020 1 1 days
ABC 3/11/2020 0 1 days
ABC 3/12/2020 1 1 days
Sometimes we need merge_asof rather than groupby:
df1=df.loc[df['relTweet']==1,['Company','Date']]
df=pd.merge_asof(df,df1.assign(Date1=df1.Date),by='Company',on='Date', allow_exact_matches=False)
df['GaplastRel']=(df.Date-df.Date1).dt.days.fillna(0)
df
Out[31]:
Company Date relTweet Date1 GaplastRel
0 XYZ 2020-03-02 1 NaT 0.0
1 XYZ 2020-03-03 1 2020-03-02 1.0
2 XYZ 2020-03-04 1 2020-03-03 1.0
3 XYZ 2020-03-05 1 2020-03-04 1.0
4 XYZ 2020-03-05 0 2020-03-04 1.0
5 XYZ 2020-03-06 1 2020-03-05 1.0
6 XYZ 2020-03-08 1 2020-03-06 2.0
7 ABC 2020-03-09 0 NaT 0.0
8 ABC 2020-03-10 1 NaT 0.0
9 ABC 2020-03-11 0 2020-03-10 1.0
10 ABC 2020-03-12 1 2020-03-10 2.0
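The merge_asof answer runs end-to-end; here it is as a self-contained reproduction of the sample data plus the three lines above (merge_asof needs both frames sorted on the Date key, which the sample already is):

```python
import pandas as pd

df = pd.DataFrame({
    "Company": ["XYZ"] * 7 + ["ABC"] * 4,
    "Date": pd.to_datetime(["3/2/2020", "3/3/2020", "3/4/2020", "3/5/2020",
                            "3/5/2020", "3/6/2020", "3/8/2020", "3/9/2020",
                            "3/10/2020", "3/11/2020", "3/12/2020"],
                           format="%m/%d/%Y"),
    "relTweet": [1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1],
})

# Only relevant tweets can serve as the "last relevant" row.
df1 = df.loc[df["relTweet"] == 1, ["Company", "Date"]]
# For each row, pick the latest strictly-earlier relevant date per company.
df = pd.merge_asof(df, df1.assign(Date1=df1.Date), by="Company", on="Date",
                   allow_exact_matches=False)
df["GaplastRel"] = (df.Date - df.Date1).dt.days.fillna(0)
print(df)
```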

Average True Range in SQL Server 2012

I have a table t with columns date, ticker, open, high, low, close.
create table #t
(
    [Datecol] date,
    Ticker varchar(10),
    [open] decimal (10,2),
    [high] decimal (10,2),
    [low] decimal (10,2),
    [close] decimal (10,2)
)
insert into #t values
('20180215', 'ABC', '122.01', '125.76', '118.79' , '123.29')
,('20180216', 'ABC', '123.02', '130.62', '119.94' , '128.85')
,('20180217', 'ABC', '131.03', '139.80', '129.42' , '136.75')
,('20180218', 'ABC', '136.40', '137.95', '124.32' , '127.38')
,('20180219', 'ABC', '127.24', '138.52', '126.70' , '137.47')
,('20180220', 'ABC', '137.95', '142.01', '127.86' , '128.36')
,('20180215', 'JKL', '9.94', '10.30', '9.77' , '10.17')
,('20180216', 'JKL', '10.15', '10.24', '9.70' , '10.02')
,('20180217', 'JKL', '10.01', '10.18', '9.93' , '10.15')
,('20180218', 'JKL', '10.16', '10.20', '9.23' , '9.38')
,('20180219', 'JKL', '9.37', '9.79', '9.36' , '9.68')
,('20180220', 'JKL', '9.69', '10.01', '9.26' , '9.28')
I'm interested in calculating the daily True Range (the per-day quantity that gets averaged to form the Average True Range, ATR) for each ticker:
TR = MAX(Today's High, Yesterday's Close) - MIN(Today's Low, Yesterday's Close)
Using LAG function, I can get yesterday's close:
SELECT
    *,
    LAG([close], 1) OVER (PARTITION BY Ticker ORDER BY [Datecol]) AS yest_close
FROM
    #t t
Datecol Ticker open high low close yest_close
--------------------------------------------------------------
2018-02-15 ABC 122.01 125.76 118.79 123.29 NULL
2018-02-16 ABC 123.02 130.62 119.94 128.85 123.29
2018-02-17 ABC 131.03 139.80 129.42 136.75 128.85
2018-02-18 ABC 136.40 137.95 124.32 127.38 136.75
2018-02-19 ABC 127.24 138.52 126.70 137.47 127.38
2018-02-20 ABC 137.95 142.01 127.86 128.36 137.47
2018-02-15 JKL 9.94 10.30 9.77 10.17 NULL
2018-02-16 JKL 10.15 10.24 9.70 10.02 10.17
2018-02-17 JKL 10.01 10.18 9.93 10.15 10.02
2018-02-18 JKL 10.16 10.20 9.23 9.38 10.15
2018-02-19 JKL 9.37 9.79 9.36 9.68 9.38
2018-02-20 JKL 9.69 10.01 9.26 9.28 9.68
How do I get max (Today's High, Yesterday's close)?
You can use a CASE expression (or IIF, available from SQL Server 2012) to take the max or min of two values.
Here's a sample:
select
    *, ATR = iif([high] > yest_close, [high], yest_close) - iif([low] > yest_close, yest_close, [low])
from (
    select
        *, yest_close = lag([close]) over (partition by Ticker order by [Datecol])
    from #t
) t
Output:
Datecol Ticker open high low close yest_close ATR
------------------------------------------------------------------------
2018-02-15 ABC 122.01 125.76 118.79 123.29 NULL NULL
2018-02-16 ABC 123.02 130.62 119.94 128.85 123.29 10.68
2018-02-17 ABC 131.03 139.80 129.42 136.75 128.85 10.95
2018-02-18 ABC 136.40 137.95 124.32 127.38 136.75 13.63
2018-02-19 ABC 127.24 138.52 126.70 137.47 127.38 11.82
2018-02-20 ABC 137.95 142.01 127.86 128.36 137.47 14.15
2018-02-15 JKL 9.94 10.30 9.77 10.17 NULL NULL
2018-02-16 JKL 10.15 10.24 9.70 10.02 10.17 0.54
2018-02-17 JKL 10.01 10.18 9.93 10.15 10.02 0.25
2018-02-18 JKL 10.16 10.20 9.23 9.38 10.15 0.97
2018-02-19 JKL 9.37 9.79 9.36 9.68 9.38 0.43
2018-02-20 JKL 9.69 10.01 9.26 9.28 9.68 0.75
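For comparison, the same pairwise max/min logic sketched in pandas (np.maximum/np.minimum stand in for the IIF expressions; a minimal illustration on a few JKL rows, not part of the thread):

```python
import pandas as pd
import numpy as np

# A few JKL rows from the question, for illustration.
df = pd.DataFrame({
    "Datecol": pd.to_datetime(["2018-02-15", "2018-02-16", "2018-02-17"]),
    "Ticker": ["JKL"] * 3,
    "high": [10.30, 10.24, 10.18],
    "low": [9.77, 9.70, 9.93],
    "close": [10.17, 10.02, 10.15],
}).sort_values(["Ticker", "Datecol"])

# LAG([close]) OVER (PARTITION BY Ticker ORDER BY Datecol) == groupby + shift.
df["yest_close"] = df.groupby("Ticker")["close"].shift(1)
# Element-wise two-column max/min, mirroring the IIF expressions.
df["TR"] = (np.maximum(df["high"], df["yest_close"])
            - np.minimum(df["low"], df["yest_close"]))
print(df)
```

np.maximum propagates the NaN from the first row's missing yest_close, matching the NULL in the SQL output.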