Why does my calculation not match the MetaTrader 4 backtest loss?

I am writing an expert advisor and ran a test, and I cannot understand, or find information on, why the loss amount does not match my calculations. For backtesting to be of any use I need correct results.
In the image we see a 0.79-lot position opened at 109.87 and closed at 109.939.
Here is the position size calculator showing the swap.
But when backtesting I get an even lower swap. Maybe the calculator indicator is showing it wrong. With a lower swap the gap between expected loss and actual loss is even bigger.
So the daily swap eats 6.15 USD. The account is in USD. From Jan 22 to Feb 6 there were 17 swaps. The commission, as you will see in the log below, is 3.16.
So my math is 6.15 * 17 + 52.5 + 3.16 = 160.21,
but the position loss is 175.48.
I have also printed the swap each day:
2020.11.30 20:55:10.728 2020.02.06 00:00:00 adx USDJPY,H1: swap: -98.7114401
2020.11.30 20:55:10.728 2020.02.05 00:00:00 adx USDJPY,H1: swap: -81.29177420000001
2020.11.30 20:55:10.727 2020.02.04 00:00:00 adx USDJPY,H1: swap: -75.48521890000001
2020.11.30 20:55:10.727 2020.02.03 00:00:00 adx USDJPY,H1: swap: -69.67866360000001
2020.11.30 20:55:10.727 2020.01.31 00:00:00 adx USDJPY,H1: swap: -63.87210830000001
2020.11.30 20:55:10.721 2020.01.30 00:00:00 adx USDJPY,H1: swap: -58.06555300000001
2020.11.30 20:55:10.721 2020.01.29 00:00:00 adx USDJPY,H1: swap: -40.6458871
2020.11.30 20:55:10.721 2020.01.28 00:00:00 adx USDJPY,H1: swap: -34.8393318
2020.11.30 20:55:10.720 2020.01.27 00:00:00 adx USDJPY,H1: swap: -29.0327765
2020.11.30 20:55:10.720 2020.01.24 00:00:00 adx USDJPY,H1: swap: -23.2262212
2020.11.30 20:55:10.720 2020.01.23 00:00:00 adx USDJPY,H1: swap: -17.41966590000001
2020.11.30 20:55:10.719 2020.01.22 00:00:00 adx USDJPY,H1: commission -3.16
It shows about 5.8 for one day, which would make the gap even bigger if I used it in my calculation.
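One reconciliation worth noting from the log itself (a sketch, not broker documentation): the nightly increments are about 5.81, and they triple on the Wednesday rollover (the usual FX convention for covering the weekend), which reproduces the final cumulative swap of -98.71 rather than the calculator's 6.15 * 17 = 104.55. This does not by itself close the remaining gap to 175.48, but it shows the backtest's swap is internally consistent:
# Sketch: rebuild the backtest's cumulative swap from the log above.
# Assumptions: a flat -5.8066 USD per night for 0.79 lots (implied by the
# log deltas) and the Wednesday rollover charged at triple rate.
from datetime import date, timedelta

NIGHTLY_SWAP = -5.8066
start, end = date(2020, 1, 22), date(2020, 2, 6)

total, night = 0.0, start
while night < end:
    if night.weekday() < 5:                      # no rollover on Sat/Sun
        mult = 3 if night.weekday() == 2 else 1  # triple swap on Wednesdays
        total += NIGHTLY_SWAP * mult
    night += timedelta(days=1)

print(round(total, 2))                           # -98.71, matching the log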

Related

Calculate weekly retention in Google BigQuery

I have a big table in Google BigQuery with two columns on which I want to compute retention:
Date user
2021-02-03 08:35:07 UTC foo#abc.com
2021-02-03 08:35:07 UTC foo1#abc.com
2021-02-04 08:35:07 UTC foo2#abc.com
2021-02-05 08:35:07 UTC foo#abc.com
2021-02-03 08:35:07 UTC foo1#abc.com
2021-02-10 08:35:07 UTC foo#abc.com
2021-02-13 08:35:07 UTC foo1#abc.com
2021-02-18 08:35:07 UTC foo3#abc.com
2021-02-21 08:35:07 UTC foo2#abc.com
2021-02-23 08:35:07 UTC foo2#abc.com
2021-02-24 08:35:07 UTC foo5#abc.com
2021-02-24 08:35:07 UTC foo2#abc.com
I want to calculate retention under the following condition:
percentage of unique users from week1 present in week2,
percentage of unique users from week2 present in week3, and so on.
The desired output format would be:
week2 week3 week4
23% 56% 33%
I want to run this over a time frame like one month or six months, and whatever timeframe I choose, the output should be in the above format.
I want a solution for BigQuery, but even a MySQL solution would help me.
Here is a possible solution:
WITH leads AS (
  SELECT
    user,
    EXTRACT(ISOWEEK FROM `Date`) AS visit_week,
    -- Here you look at the user's next visit and take its week. If the user
    -- is there the following week, next_visit_week = visit_week + 1.
    EXTRACT(ISOWEEK FROM LEAD(`Date`) OVER (PARTITION BY user ORDER BY `Date`)) AS next_visit_week
  FROM
    `your_project`.`your_dataset`.`your_table`
)
SELECT
  visit_week + 1 AS `week`,
  SUM(CASE WHEN visit_week = next_visit_week - 1 THEN 1 ELSE 0 END)
    / COUNT(DISTINCT user) * 100 AS retention_pct
FROM leads
GROUP BY visit_week
For each week, you count the number of times a user's next visit occurs in the week following the current one (NB: this can occur only once per user). You divide that total by the number of distinct users.
You therefore obtain the retention rate for the following week (hence the '+1' in visit_week + 1 AS week).
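As a cross-check of the logic (not part of the original answer), here is a small pandas sketch that applies the definition directly, intersecting each ISO week's distinct users with the next week's, on a cut-down version of the sample data:
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2021-02-03", "2021-02-03", "2021-02-04", "2021-02-05",
        "2021-02-10", "2021-02-13", "2021-02-18", "2021-02-21",
    ]),
    "user": ["foo", "foo1", "foo2", "foo", "foo", "foo1", "foo3", "foo2"],
})

df["week"] = df["Date"].dt.isocalendar().week
weekly = df.groupby("week")["user"].apply(set)   # distinct users per ISO week

# Share of week N's users who appear again in week N+1.
for wk in weekly.index[:-1]:
    if wk + 1 in weekly.index:
        kept = len(weekly[wk] & weekly[wk + 1]) / len(weekly[wk])
        print(f"week{wk + 1}: {kept:.0%}")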

Multiindex Value

I have the following MultiIndex and I am trying to get the last time entry for a slice of it.
df.loc['AUDCAD'][-1]
would return 2019-04-30 00:00:00
and
df.loc['USDCHF'][-1]
would return 2021-03-05 23:55:00
open high low close
AUDCAD 2018-12-31 00:00:00 0.95708 0.96276 0.95649 0.95979
2019-01-31 00:00:00 0.96039 0.96309 0.92200 0.94895
2019-02-28 00:00:00 0.94849 0.95800 0.93185 0.93655
2019-03-31 00:00:00 0.93718 0.95632 0.93160 0.94745
2019-04-30 00:00:00 0.94998 0.96147 0.94150 0.94750
USDCHF 2021-03-05 23:35:00 0.93109 0.93119 0.93108 0.93116
2021-03-05 23:40:00 0.93116 0.93150 0.93116 0.93143
2021-03-05 23:45:00 0.93143 0.93147 0.93127 0.93128
2021-03-05 23:50:00 0.93129 0.93134 0.93117 0.93126
2021-03-05 23:55:00 0.93126 0.93141 0.93114 0.93118
I guess you're looking for:
df.loc[block_name].index[-1]
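For example, on a toy frame built from a few of the rows above (only the close column kept, for brevity):
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("AUDCAD", pd.Timestamp("2019-03-31")),
     ("AUDCAD", pd.Timestamp("2019-04-30")),
     ("USDCHF", pd.Timestamp("2021-03-05 23:55:00"))],
)
df = pd.DataFrame({"close": [0.94745, 0.94750, 0.93118]}, index=idx)

print(df.loc["AUDCAD"].index[-1])   # 2019-04-30 00:00:00
print(df.loc["USDCHF"].index[-1])   # 2021-03-05 23:55:00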

Find highest and lowest bar number from resample

My dataframe contains 30-minute OHLC data. I need to find out which bar had the highest value and which had the lowest value for each day. So for example:
28/05/2018: the highest value was 1.16329 and it occurred on bar 6 for that day.
29/05/2018: the highest value was 1.15900, occurring on bar 2.
I have used the following, which resamples into daily data, but then I lose the information on which bar of the day the high and low were achieved.
d3 = df.resample('D').agg({'Open':'first', 'High':'max', 'Low':'min', 'Close':'last'})
Date Time Open High Low Last
28/05/2018 14:30:00 1.16167 1.16252 1.1613 1.16166
28/05/2018 15:00:00 1.16166 1.16287 1.16159 1.16276
28/05/2018 15:30:00 1.16277 1.16293 1.16177 1.16212
28/05/2018 16:00:00 1.16213 1.16318 1.16198 1.16262
28/05/2018 16:30:00 1.16262 1.16298 1.16258 1.16284
28/05/2018 17:00:00 1.16285 1.16329 1.16264 1.16265
28/05/2018 17:30:00 1.16266 1.163 1.16243 1.16289
28/05/2018 18:00:00 1.16288 1.1629 1.16228 1.16269
28/05/2018 18:30:00 1.16269 1.16278 1.16264 1.16274
28/05/2018 19:00:00 1.16275 1.16277 1.1627 1.16275
28/05/2018 19:30:00 1.16276 1.16284 1.1627 1.1628
28/05/2018 20:00:00 1.16279 1.16288 1.16264 1.16278
28/05/2018 20:30:00 1.16278 1.16289 1.1626 1.16265
28/05/2018 21:00:00 1.16267 1.1627 1.16251 1.16262
29/05/2018 14:30:00 1.15793 1.15827 1.15714 1.15786
29/05/2018 15:00:00 1.15785 1.159 1.15741 1.15814
29/05/2018 15:30:00 1.15813 1.15813 1.15601 1.15647
29/05/2018 16:00:00 1.15647 1.15658 1.15451 1.15539
29/05/2018 16:30:00 1.15539 1.15601 1.15418 1.1551
29/05/2018 17:00:00 1.15508 1.15599 1.15463 1.15527
29/05/2018 17:30:00 1.15528 1.15587 1.15442 1.15465
29/05/2018 18:00:00 1.15465 1.15469 1.15196 1.15261
29/05/2018 18:30:00 1.15261 1.15441 1.15261 1.15349
29/05/2018 19:00:00 1.15348 1.15399 1.15262 1.15399
29/05/2018 19:30:00 1.154 1.15412 1.15239 1.15322
29/05/2018 20:00:00 1.15322 1.15373 1.15262 1.15367
29/05/2018 20:30:00 1.15367 1.15419 1.15351 1.15367
29/05/2018 21:00:00 1.15366 1.15438 1.15352 1.15354
29/05/2018 21:30:00 1.15355 1.15355 1.15354 1.15354
30/05/2018 14:30:00 1.16235 1.16323 1.16133 1.16161
30/05/2018 15:00:00 1.16162 1.16193 1.1602 1.16059
Any ideas on how to achieve this?
You could groupby and apply some sorting logic to retain the Time column, such as:
highs = df.groupby(df.index).apply(lambda x: x.sort_values(by='High').iloc[-1])
lows = df.groupby(df.index).apply(lambda x: x.sort_values(by='Low').iloc[0])
Output:
# Highs
Time Open High Low Last
Date
2018-05-28 17:00:00 1.16285 1.16329 1.16264 1.16265
2018-05-29 15:00:00 1.15785 1.15900 1.15741 1.15814
2018-05-30 14:30:00 1.16235 1.16323 1.16133 1.16161
# Lows
Time Open High Low Last
Date
2018-05-28 14:30:00 1.16167 1.16252 1.16130 1.16166
2018-05-29 18:00:00 1.15465 1.15469 1.15196 1.15261
2018-05-30 15:00:00 1.16162 1.16193 1.16020 1.16059
EDIT
To join them, something like this should do it:
new_df = pd.concat([highs.Time.rename('time_of_high'), lows.Time.rename('time_of_low')], axis=1)
Output:
time_of_high time_of_low
Date
28/05/2018 17:00:00 14:30:00
29/05/2018 15:00:00 18:00:00
30/05/2018 14:30:00 15:00:00
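As an aside (not from the answer above), an equivalent and usually faster idiom is groupby().idxmax()/idxmin() on a unique index; a sketch on a toy frame in the same shape (Date index, Time column, a few values copied from the sample):
import pandas as pd

df = pd.DataFrame(
    {"Time": ["14:30:00", "17:00:00", "15:00:00", "18:00:00"],
     "High": [1.16252, 1.16329, 1.15900, 1.15469],
     "Low":  [1.16130, 1.16264, 1.15741, 1.15196]},
    index=pd.to_datetime(["2018-05-28", "2018-05-28",
                          "2018-05-29", "2018-05-29"]).rename("Date"),
)

tmp = df.reset_index()  # a unique RangeIndex, so the idxmax labels are usable
highs = tmp.loc[tmp.groupby("Date")["High"].idxmax()]
lows = tmp.loc[tmp.groupby("Date")["Low"].idxmin()]
print(highs[["Date", "Time", "High"]])
print(lows[["Date", "Time", "Low"]])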

Pandas resample only when it makes sense

I have a time series that is very irregular. The difference in time between two records can be 1 s or 10 days.
I want to resample the data every 1 h, but only when the sequential records are less than 1 h apart.
How do I approach this without making too many loops?
In the example below, I would like to resample only rows 5-6 (the delta is 10 s) and rows 6-7 (the delta is 50 s).
The others should remain as they are.
tmp=vals[['datumtijd','filter data']]
datumtijd filter data
0 1970-11-01 00:00:00 129.0
1 1970-12-01 00:00:00 143.0
2 1971-01-05 00:00:00 151.0
3 1971-02-01 00:00:00 151.0
4 1971-03-01 00:00:00 163.0
5 1971-03-01 00:00:10 163.0
6 1971-03-01 00:00:20 163.0
7 1971-03-01 00:01:10 163.0
8 1971-03-01 00:04:10 163.0
.. ... ...
244 1981-08-19 00:00:00 102.0
245 1981-09-02 00:00:00 98.0
246 1981-09-17 00:00:00 92.0
247 1981-10-01 00:00:00 89.0
248 1981-10-19 00:00:00 92.0
You can be a little explicit about this by using groupby on the hour-floor of the time stamps:
grouped = df.groupby(df['datumtijd'].dt.floor('1H')).mean()
This is explicitly looking for the hour of each existing data point and grouping the matching ones.
But you can also just do the resample and then filter out the empty data, as pandas can still do this pretty quickly:
resampled = df.resample('1H', on='datumtijd').mean().dropna()
In either case, you get the following (note that I changed the last time stamp just so that the console would show the hours):
filter data
datumtijd
1970-11-01 00:00:00 129.0
1970-12-01 00:00:00 143.0
1971-01-05 00:00:00 151.0
1971-02-01 00:00:00 151.0
1971-03-01 00:00:00 163.0
1981-08-19 00:00:00 102.0
1981-09-02 00:00:00 98.0
1981-09-17 00:00:00 92.0
1981-10-01 00:00:00 89.0
1981-10-19 03:00:00 92.0
One quick clarification: in your example, rows 5-8 all occur within the same hour, so they all get grouped together.
Also, see this related post.
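For reference, a minimal runnable sketch of both approaches on a few of the rows above (last timestamp shifted to 03:00, as in the answer):
import pandas as pd

df = pd.DataFrame({
    "datumtijd": pd.to_datetime([
        "1971-03-01 00:00:00", "1971-03-01 00:00:10",
        "1971-03-01 00:00:20", "1971-03-01 00:01:10",
        "1981-10-19 03:00:00",
    ]),
    "filter data": [163.0, 163.0, 163.0, 163.0, 92.0],
})

# Group by the hour each stamp falls in; isolated records pass through.
grouped = df.groupby(df["datumtijd"].dt.floor("1H"))["filter data"].mean()

# Or resample and drop the empty hourly bins that the long gaps create.
resampled = df.resample("1H", on="datumtijd")["filter data"].mean().dropna()

print(grouped)     # one row for the dense 1971 hour, one for the 1981 record
print(resampled)   # same result via resample + dropna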

How To Reset Time In DateTime Column Pandas Dataframe

I'm trying to convert a datetime field to date only, but when I do that the dtype changes to object, so I'm thinking of resetting the time in all rows to zeros:
from
Price Volume
2011-08-14 14:14:40 10.4 0.779
2011-08-14 15:15:17 10.4 0.101
2011-08-14 15:15:17 10.4 0.316
2011-08-14 16:45:09 10.5 0.150
2011-08-14 16:45:09 10.5 1.800
to
Price Volume
2011-08-14 00:00:00 10.4 0.779
2011-08-14 00:00:00 10.4 0.101
2011-08-14 00:00:00 10.4 0.316
2011-08-14 00:00:00 10.5 0.150
2011-08-14 00:00:00 10.5 1.800
How can I do that?
Many thanks.
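A minimal sketch of one way to do it, assuming the timestamps are the DataFrame's index as shown above: DatetimeIndex.normalize() zeroes the time component while keeping the datetime64 dtype (converting via .date is what produces the object dtype):
import pandas as pd

# Toy frame shaped like the sample above (timestamps as the index).
df = pd.DataFrame(
    {"Price": [10.4, 10.4, 10.5], "Volume": [0.779, 0.101, 0.150]},
    index=pd.to_datetime(["2011-08-14 14:14:40",
                          "2011-08-14 15:15:17",
                          "2011-08-14 16:45:09"]),
)

df.index = df.index.normalize()   # all times become 00:00:00
print(df)
print(df.index.dtype)             # datetime64[ns], not object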