perigee and apogee calculations off by a few minutes - skyfield

I'm trying to calculate perigee and apogee (or apsides in general, given a second body such as the Sun, a planet, etc.):
from skyfield import api, almanac
from scipy.signal import argrelextrema
import numpy as np

ts = api.load.timescale()
e = api.load('de430t.bsp')

def apsis(year=2019, body='moon'):
    apogees = dict()
    perigees = dict()
    planets = e
    earth, moon = planets['earth'], planets[body]

    t = ts.utc(year, 1, range(1, 367))
    dt = t.utc_datetime()
    astrometric = earth.at(t).observe(moon)
    _, _, distance = astrometric.radec()

    # find perigees, at day precision
    localmins = argrelextrema(distance.km, np.less)[0]
    for i in localmins:
        # get minute precision
        t2 = ts.utc(dt[i].year, dt[i].month, dt[i].day - 1, 0, range(2881))
        dt2 = t2.utc_datetime()  # _and_leap_second()
        astrometric2 = earth.at(t2).observe(moon)
        _, _, distance2 = astrometric2.radec()
        m = min(distance2.km)
        daindex = list(distance2.km).index(m)
        perigees[dt2[daindex]] = m

    # find apogees, at day precision
    localmaxes = argrelextrema(distance.km, np.greater)[0]
    for i in localmaxes:
        # get minute precision
        t2 = ts.utc(dt[i].year, dt[i].month, dt[i].day - 1, 0, range(2881))
        dt2 = t2.utc_datetime()
        astrometric2 = earth.at(t2).observe(moon)
        _, _, distance2 = astrometric2.radec()
        m = max(distance2.km)
        daindex = list(distance2.km).index(m)
        apogees[dt2[daindex]] = m

    return apogees, perigees
When I run this for 2019, the next apogee works out to 2019-09-13 13:16. This differs by several minutes from tables such as John Walker's (13:33), Fred Espenak's (13:32), and Time and Date dot com's (13:32).
I'd expect a difference of a minute, as seen above between those other sources, for reasons such as rounding vs. truncation of seconds, but a difference of more than 15 minutes seems unusual. I've tried this with the de431t and de421 ephemerides with similar results.
What's the difference here? I'm calculating the distance between the centers of each body, right? What am I screwing up?

After a bit more research and comparing Skyfield's output to the output of JPL's HORIZONS, it appears that Skyfield is correct in its calculations, at least against the JPL ephemeris (no surprise there).
I switched the above code snippet to use the same (massive) de432t SPICE kernel used by HORIZONS. This lines up with the HORIZONS output (see below, with the apogees reported by the various sources marked): the apogee is the point where the Moon begins moving away, i.e. where deldot, the range-rate between the observer (geocentric Earth) and the target body (geocentric Moon), goes negative.
Ephemeris / WWW_USER Fri Sep 13 17:05:39 2019 Pasadena, USA / Horizons
*******************************************************************************
Target body name: Moon (301) {source: DE431mx}
Center body name: Earth (399) {source: DE431mx}
Center-site name: GEOCENTRIC
*******************************************************************************
Start time : A.D. 2019-Sep-13 13:10:00.0000 UT
Stop time : A.D. 2019-Sep-13 13:35:00.0000 UT
Step-size : 1 minutes
*******************************************************************************
Target pole/equ : IAU_MOON {East-longitude positive}
Target radii : 1737.4 x 1737.4 x 1737.4 km {Equator, meridian, pole}
Center geodetic : 0.00000000,0.00000000,0.0000000 {E-lon(deg),Lat(deg),Alt(km)}
Center cylindric: 0.00000000,0.00000000,0.0000000 {E-lon(deg),Dxy(km),Dz(km)}
Center pole/equ : High-precision EOP model {East-longitude positive}
Center radii : 6378.1 x 6378.1 x 6356.8 km {Equator, meridian, pole}
Target primary : Earth
Vis. interferer : MOON (R_eq= 1737.400) km {source: DE431mx}
Rel. light bend : Sun, EARTH {source: DE431mx}
Rel. lght bnd GM: 1.3271E+11, 3.9860E+05 km^3/s^2
Atmos refraction: NO (AIRLESS)
RA format : HMS
Time format : CAL
EOP file : eop.190912.p191204
EOP coverage : DATA-BASED 1962-JAN-20 TO 2019-SEP-12. PREDICTS-> 2019-DEC-03
Units conversion: 1 au= 149597870.700 km, c= 299792.458 km/s, 1 day= 86400.0 s
Table cut-offs 1: Elevation (-90.0deg=NO ),Airmass (>38.000=NO), Daylight (NO )
Table cut-offs 2: Solar elongation ( 0.0,180.0=NO ),Local Hour Angle( 0.0=NO )
Table cut-offs 3: RA/DEC angular rate ( 0.0=NO )
*******************************************************************************
Date__(UT)__HR:MN delta deldot
***************************************************
$$SOE
2019-Sep-13 13:10 0.00271650099697 0.0000340
2019-Sep-13 13:11 0.00271650100952 0.0000286
2019-Sep-13 13:12 0.00271650101990 0.0000232
2019-Sep-13 13:13 0.00271650102812 0.0000178
2019-Sep-13 13:14 0.00271650103417 0.0000124
2019-Sep-13 13:15 0.00271650103805 0.0000070
2019-Sep-13 13:16 0.00271650103977 0.0000016 <----- Skyfield, HORIZONS
2019-Sep-13 13:17 0.00271650103932 -0.0000038
2019-Sep-13 13:18 0.00271650103670 -0.0000092
2019-Sep-13 13:19 0.00271650103191 -0.0000146
2019-Sep-13 13:20 0.00271650102496 -0.0000200
2019-Sep-13 13:21 0.00271650101585 -0.0000254
2019-Sep-13 13:22 0.00271650100456 -0.0000308
2019-Sep-13 13:23 0.00271650099112 -0.0000362
2019-Sep-13 13:24 0.00271650097550 -0.0000416
2019-Sep-13 13:25 0.00271650095772 -0.0000470
2019-Sep-13 13:26 0.00271650093778 -0.0000524
2019-Sep-13 13:27 0.00271650091566 -0.0000578
2019-Sep-13 13:28 0.00271650089139 -0.0000632
2019-Sep-13 13:29 0.00271650086494 -0.0000686
2019-Sep-13 13:30 0.00271650083633 -0.0000740
2019-Sep-13 13:31 0.00271650080556 -0.0000794
2019-Sep-13 13:32 0.00271650077262 -0.0000848 <------ Espenak, T&D.com
2019-Sep-13 13:33 0.00271650073751 -0.0000902
2019-Sep-13 13:34 0.00271650070024 -0.0000956
2019-Sep-13 13:35 0.00271650066081 -0.0001010
$$EOE
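For reference, here is a minimal sketch of that minute-resolution check over the same window HORIZONS was queried for. The de432t.bsp filename and the exact window are assumptions; everything else follows the question's code.

from skyfield import api
import numpy as np

ts = api.load.timescale()
planets = api.load('de432t.bsp')   # same kernel family HORIZONS uses
earth, moon = planets['earth'], planets['moon']

# One sample per minute, 13:10 through 13:35 UT on 2019-09-13.
t = ts.utc(2019, 9, 13, 13, range(10, 36))
_, _, distance = earth.at(t).observe(moon).radec()

i = int(np.argmax(distance.km))
print(t[i].utc_strftime('%Y-%m-%d %H:%M'), distance.km[i])
# Lands on 2019-09-13 13:16, where the HORIZONS deldot changes sign.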
Looking at Espenak's page a bit more, his calculations are based on Jean Meeus' Astronomical Algorithms book (a must-have for anyone who plays with this stuff). The lunar ephemeris in that book comes from Jean Chapront's ELP2000/82. While that theory has been fitted to DE430 (among others), it remains a distinct, analytical model.
Sure enough, when using that ELP2000 model to find the maximum lunar distance today, Sept 13 2019, you get 2019-09-13 13:34. See the code below.
Meeus based his formulae on the 1982 version of the Ephemeride Lunaire Parisienne; the source code below leverages the 2002 update by Chapront, but it is pretty much what those other sources are coming up with.
So I think my answer is: they are different answers because they are using different models. Skyfield leverages the models represented as numerical integrations in the JPL Development Ephemeris, while ELP is a more analytical approach.
In the end I realize it's a nit-pick; I just wanted to better understand the tools I'm using. But it raises the question: which approach is more accurate?
From what I've read, DE430 and its variants have been fit to observational data, namely Lunar Laser Ranging (LLR) measurements. If only for that LLR consideration, I think I'll stick with Skyfield for calculating lunar distance.
from elp_mpp02 import mpp02 as mpp
import julian
import pytz
import datetime

def main():
    mpp.dataDir = 'ELPmpp02'
    mode = 1  # Historical mode

    maxdist = 0
    apogee = None
    # Scan 2019-09-13 13:10 through 13:40 UT at one-minute steps.
    for x in range(10, 41):
        dt = datetime.datetime(2019, 9, 13, 13, x, tzinfo=pytz.timezone("UTC"))
        jd = julian.to_jd(dt, fmt='jd')
        lon, lat, dist = mpp.compute_lbr(jd, mode)
        if dist > maxdist:
            maxdist = dist
            apogee = dt
    print(f"{maxdist:.2f} {apogee}")

main()

Related

Suitable Clustering Approach

I've got a total of 9 sensors in the ground, which measure the water content of the soil. Sensors 1-3 are at a depth of 1 m, 4-6 at a depth of 2 m, and 7-9 at a depth of 3 m.
My dataset also contains the precipitation at the location. It is hourly data:
Time              Sensor-ID  Precipitation  Soil Water Content
2022-01-01 11:00  1          74             120
2022-01-01 11:00  2          74             100
2022-01-01 11:00  3          74             110
...               ...        ...            ...
2022-01-01 11:00  9          74             30
The goal now is to find out whether the different soil depths behave differently with regard to water content after rain (over time).
I thought about a clustering method to find out if the sensors can be clustered based on the data and to confirm this. Since I'm not very experienced in data science, would that be the right approach, and is it even possible to analyse this with clustering?
For clustering, you can add a new column with three classes to your data (sensors 1-3: Class 1, sensors 4-6: Class 2, sensors 7-9: Class 3) and perform your analysis using the new classes. This can be done using Python, Power BI or Excel.
You should start by analyzing the different variables with respect to the sensors at different ground depths: use univariate, bivariate and multivariate plots to work toward your goal.
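A minimal pandas sketch of that labelling step, assuming the data sits in a CSV with the column names from the sample table (the filename is made up):

import pandas as pd

df = pd.read_csv('soil_sensors.csv')  # columns: Time, Sensor-ID, Precipitation, Soil Water Content

# Map sensors 1-3 -> depth class 1, 4-6 -> class 2, 7-9 -> class 3.
df['Depth class'] = pd.cut(df['Sensor-ID'], bins=[0, 3, 6, 9], labels=[1, 2, 3])

# Average water content per depth class over time, to compare the response after rain.
by_depth = df.groupby(['Depth class', 'Time'])['Soil Water Content'].mean().unstack(0)
print(by_depth.head())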

Loop over pandas dataframe to create multiple networks

I have data on countries' trade with one another. I have split the main file by month and got 12 CSV files for the year 2019. A sample of the January CSV data is provided below:
reporter partner year month trade
0 Albania Argentina 2019 01 515256
1 Albania Australia 2019 01 398336
2 Albania Austria 2019 01 7664503
3 Albania Bahrain 2019 01 400
4 Albania Bangladesh 2019 01 653907
5 Zimbabwe Zambia 2019 01 79569855
I want to make a complex network for every month and print the number of nodes of every network. Right now I can do it the hard (stupid) way, like so:
import pandas as pd
import networkx as nx

df01 = pd.read_csv('012019.csv')
df02 = pd.read_csv('022019.csv')
df03 = pd.read_csv('032019.csv')

df1 = df01[['reporter', 'partner', 'trade']]
df2 = df02[['reporter', 'partner', 'trade']]
df3 = df03[['reporter', 'partner', 'trade']]

G1 = nx.from_pandas_edgelist(df1, 'reporter', 'partner', edge_attr='trade')
G1.number_of_nodes()
and so on for the next networks.
My question is: how can I use a for loop to read the files, convert them to networks from the dataframes, and report the number of nodes of each network?
I tried this, but nothing is reported.
for f in glob.glob('.csv'):
    df = pd.read_csv(f)
    df1 = df[['reporter', 'partner', 'trade']]
    G = nx.from_pandas_edgelist(df1, 'reporter', 'partner', edge_attr='trade')
    G.number_of_nodes()
Thanks.
Edit:
OK, so I managed to do the above using code similar to the below:
for files in glob.glob('/home/user/VMShared/network/2nd/*.csv'):
    df = pd.read_csv(files)
    df1 = df[['reporter', 'partner', 'import']]
    G = nx.from_pandas_edgelist(df1, 'reporter', 'partner', edge_attr='import')
    nx.write_graphml_lxml(G, "/home/user/VMShared/network/2nd/*.graphml")
The problem that I now face is how to write separate files. All I get from this is one file titled *.graphml. How can I get a graphml file for every input file? Also, getting the same graphml output name as the input file would be a plus.
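One way to do both (print the node count per month and name each output after its input file) is sketched below; the directory path is taken from the snippet above, and os.path.splitext is used to swap the .csv extension for .graphml:

import glob
import os
import networkx as nx
import pandas as pd

for path in glob.glob('/home/user/VMShared/network/2nd/*.csv'):
    df = pd.read_csv(path)
    G = nx.from_pandas_edgelist(df, 'reporter', 'partner', edge_attr='import')
    print(path, G.number_of_nodes())

    # Reuse the input filename, swapping the extension, so each month gets its own file.
    out = os.path.splitext(path)[0] + '.graphml'
    nx.write_graphml_lxml(G, out)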

Is there an equivalent to numpy.digitize that works on a pandas.IntervalIndex?

I need to match each hour of the month to a monthly total for the month that the hour falls in.
I am passed a DataFrame (monthly_totals) with a time-based pandas.IntervalIndex, and a second DataFrame (hours) with a pandas.DatetimeIndex. More generally, I need to match the index of one DataFrame, to the interval of another DataFrame that each entry falls into.
I have a working solution, using pandas.Series.apply, but it is quite slow. I see that numpy.digitize exists, and it taunts me, because the bins parameter must be an array, not an IntervalIndex.
My first attempt, which works, but takes about 1 second to process a DataFrame of length 8760, is as follows:
def get_mock_montly_totals(self):
    start = '2018-07-01'
    end = '2019-07-01'
    hourly_rng = pd.date_range(start, end, freq='H')
    monthly_rng = pd.date_range(start, end, freq='MS')
    mock_series = pd.Series(1, index=hourly_rng)
    bins = (monthly_rng + pd.offsets.Day(pd.Timestamp(start).day - 1))
    cuts = pd.cut(mock_series.index, bins, right=False)
    groups = mock_series.groupby(cuts)
    monthly_totals = groups.sum()
    return monthly_totals

def get_interval_value(self, frame, key):
    try:
        return frame.iloc[frame.index.get_loc(key)]
    except KeyError:
        return np.nan

result = api.get_secret_data().resample('H').asfreq()
hours = result.index.to_series()
monthly_totals = self.get_mock_montly_totals()

# This line takes over a second to run, which is too slow.
result['monthly_totals'] = hours.apply(
    lambda h: self.get_interval_value(monthly_totals, h))
Where monthly_totals looks like:
[2018-07-01, 2018-08-01) 744
[2018-08-01, 2018-09-01) 744
[2018-09-01, 2018-10-01) 720
[2018-10-01, 2018-11-01) 744
[2018-11-01, 2018-12-01) 720
[2018-12-01, 2019-01-01) 744
[2019-01-01, 2019-02-01) 744
[2019-02-01, 2019-03-01) 672
[2019-03-01, 2019-04-01) 744
[2019-04-01, 2019-05-01) 720
[2019-05-01, 2019-06-01) 744
[2019-06-01, 2019-07-01) 720
dtype: int64
hours looks like:
time
2018-06-27 00:00:00-10:00 2018-06-27 10:00:00
...
2019-06-24 21:00:00-10:00 2019-06-25 07:00:00
And the output, result['monthly_totals'] should look like:
time
2018-06-27 00:00:00-10:00 NaN
...
2019-06-24 20:00:00-10:00 720
2019-06-24 21:00:00-10:00 720
Again, my solution works, but the call to apply makes it very slow, so I really want help getting to a cleaner solution that ditches it. Thank you!
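One vectorised possibility, sketched under the assumption that monthly_totals has a proper non-overlapping pandas.IntervalIndex (as described above) and that hours holds the timestamps to look up: IntervalIndex.get_indexer returns, for each timestamp, the position of the interval containing it, or -1 if none does.

import numpy as np

idx = monthly_totals.index.get_indexer(hours)        # -1 where no interval matches
vals = monthly_totals.to_numpy().astype(float)
result['monthly_totals'] = np.where(idx >= 0, vals[idx], np.nan)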

Solar energy conversion W/m^2 to MJ/m^2

I am new here. I am using MERRA monthly solar radiation data, and I want to convert W/m^2 to MJ/m^2.
I am a bit confused about how to convert the monthly-average solar radiation data from W/m^2 to MJ/m^2.
From what I have understood so far by reading different sources:
First I have to convert W/m^2 to kW/m^2,
then kW/m^2 to MJ/m^2.
Am I doing this correctly?
Just taking one instance:
For the month of May I have the value 294 W/m^2.
So 294 * 0.001 = 0.294 kW/m^2
0.294 * 24 (kW to kWh per m^2 per day) = 7.056 kWh/m^2/day
7.056 * 3.6 (kWh to MJ) = 25.40 MJ/m^2/day
I am confused whether I am doing this right or wrong.
Not sure why you would take the kWh step in between.
Your panels do 294 Watt per m², i.e. 294 Joule per sec per m². So that's 24*60*60 * 294 = 25401600 Joule per m² per day, or 25.4016 MJ per m² per day.
So if:
1 W/m2 = 1 J/m2 s
then:
294 W/m2 = 294 J/m2 s
If you want it per day, then:
1 day = 60 s * 60 min * 24 h = 86400 s
294 J/m2 s x 86400 s/day = 25401600 J/m2 day
25401600 J/m2 day x 1 MJ/1000000 J = 25.4016 MJ/m2 day
All together:
294 W/m2 = 294 / (1000000/86400) = 25.4016 MJ/m2 day
A watt is a unit of power and a joule is a unit of energy; they are related by time. 1 watt is 1 joule per second: 1 W = 1 J/s. The extension of that equation is that 1 J = 1 W x 1 s, i.e. 1 J = 1 Ws. A loose analogy: a litre is a unit of volume and L/s is a unit of flow. So your calculation needs to consider how long you are gathering the solar energy. If the sunlight shines at 90 degrees onto the solar panel for 1 hour, the number of joules is 294 W/m2 x 3600 s, which gives ~1 x 10^6 joules per square metre. Of course, as the inclination (the angle of the light) moves away from 90 degrees, the effective power, and hence the energy absorbed, drops as a function of the sine of the angle to the sun; 90 degrees gives a sine of 1 and full power.
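A tiny sketch of the conversion as a function, just to make the arithmetic above concrete (the function name is made up):

SECONDS_PER_DAY = 24 * 60 * 60  # 86400

def w_per_m2_to_mj_per_m2_day(w_per_m2):
    # W/m^2 is J/(m^2 s); multiply by seconds per day, then divide by 1e6 for MJ.
    return w_per_m2 * SECONDS_PER_DAY / 1e6

print(w_per_m2_to_mj_per_m2_day(294))  # 25.4016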

How to pull EOD stock data from Yahoo Finance for exactly the last 20 WORKING days using Pandas in Python 2.7

Right now what I am doing is to pull data for the last 30 days, store this in a dataframe, and then pick the data for the last 20 days to use. However, if one of the days in the last 20 days is a holiday, then Yahoo shows the Volume for that day as 0 and fills the OHLC (Open, High, Low, Close, Adj Close) with the Adj Close of the previous day. In the example shown below, the data for 2016-01-26 is invalid and I don't want to retrieve it.
So how do I pull data from Yahoo for exactly the last 20 working days?
My present code is below:
from datetime import date, datetime, timedelta
import pandas_datareader.data as web
todays_date = date.today()
n = 30
date_n_days_ago = date.today() - timedelta(days=n)
yahoo_data = web.DataReader('ACC.NS', 'yahoo', date_n_days_ago, todays_date)
yahoo_data_20_day = yahoo_data.tail(20)
IIUC you can add a filter to keep only rows where the Volume column is not 0:
from datetime import date, datetime, timedelta
import pandas_datareader.data as web
todays_date = date.today()
n = 30
date_n_days_ago = date.today() - timedelta(days=n)
yahoo_data = web.DataReader('ACC.NS', 'yahoo', date_n_days_ago, todays_date)
#add filter - get data, where column Volume is not 0
yahoo_data = yahoo_data[yahoo_data.Volume != 0]
yahoo_data_20_day = yahoo_data.tail(20)
print yahoo_data_20_day
Open High Low Close Volume Adj Close
Date
2016-01-20 1218.90 1229.00 1205.00 1212.25 156300 1206.32
2016-01-21 1225.00 1236.95 1211.25 1228.45 209200 1222.44
2016-01-22 1239.95 1256.65 1230.05 1241.00 123200 1234.93
2016-01-25 1250.00 1263.50 1241.05 1245.00 124500 1238.91
2016-01-27 1249.00 1250.00 1228.00 1230.35 112800 1224.33
2016-01-28 1232.40 1234.90 1208.00 1214.95 134500 1209.00
2016-01-29 1220.10 1253.50 1216.05 1240.05 254400 1233.98
2016-02-01 1245.00 1278.90 1240.30 1271.85 210900 1265.63
2016-02-02 1266.80 1283.00 1253.05 1261.35 204600 1255.18
2016-02-03 1244.00 1279.00 1241.45 1248.95 191000 1242.84
2016-02-04 1255.25 1277.40 1253.20 1270.40 205900 1264.18
2016-02-05 1267.05 1286.00 1259.05 1271.40 231300 1265.18
2016-02-08 1271.00 1309.75 1270.15 1280.60 218500 1274.33
2016-02-09 1271.00 1292.85 1270.00 1279.10 148600 1272.84
2016-02-10 1270.00 1278.25 1250.05 1265.85 256800 1259.66
2016-02-11 1250.00 1264.70 1225.50 1234.00 231500 1227.96
2016-02-12 1234.20 1242.65 1199.10 1221.05 212000 1215.07
2016-02-15 1230.00 1268.70 1228.35 1256.55 130800 1250.40
2016-02-16 1265.00 1273.10 1225.00 1227.80 144700 1221.79
2016-02-17 1222.80 1233.50 1204.00 1226.05 165000 1220.05