How to add +/- 1% to a condition? - pandas

ID Open Close SMA EMA
30 UNITY 11.50 11.53 12.576 12.715570
31 UNITY 11.44 11.34 12.399 12.626823
32 UNITY 11.26 11.74 12.273 12.569609
33 UNITY 11.72 11.61 12.150 12.507699
34 UNITY 11.51 11.43 11.994 12.438170
35 UNITY 11.85 11.17 11.844 12.356352
How can I write a condition that is true when Close is within +/- 1% of the EMA, or exactly equal to it?
Thanks!

numpy.isclose(a, b, rtol=1e-02)
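rtol is the relative tolerance; 0.01 corresponds to +/- 1%. The check is |a - b| <= atol + rtol * |b|, so the tolerance is measured relative to b, and atol defaults to 1e-08.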
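Applied to a DataFrame, the call above might look like the following minimal sketch. The three rows are invented for illustration (they are not the UNITY rows from the question), and the column name near_ema is hypothetical. Setting atol=0.0 makes the test purely relative.

```python
import numpy as np
import pandas as pd

# Illustrative values only; not taken from the question's table.
df = pd.DataFrame({
    "Close": [11.53, 12.60, 12.90],
    "EMA": [12.715570, 12.626823, 12.569609],
})

# True where |Close - EMA| <= 0.01 * |EMA| (EMA is the reference value b).
df["near_ema"] = np.isclose(df["Close"], df["EMA"], rtol=0.01, atol=0.0)

# Boolean mask usable for filtering.
matches = df[df["near_ema"]]
```

Exact equality is covered automatically, because a difference of zero always satisfies the inequality.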


Plotting Charts for monthly counts per company

I want to create a program that produces bar charts or CSV files of monthly counts per company. For example, the graph for January should have all the companies on the x axis and the counts on the y axis.
I can split my date into month and year, and I want that to be the heading. My df table currently looks like this:
Date Modified Company
2019-01 Apple 113 0.0
Blackberry 66 0.0
LG 73 0.0
Linux 115 0.0
Microsoft 187 0.0
Panasonic 336 0.0
Samsung 151 0.0
2019-02 Apple 151 0.0
Blackberry 163 0.0
LG 301 0.0
Linux 108 0.0
Microsoft 199 0.0
Panasonic 142 0.0
Samsung 304 0.0
2019-03 Apple 358 0.0
Blackberry 230 0.0
LG 288 0.0
Linux 464 0.0
Microsoft 53 0.0
Panasonic 113 0.0
Samsung 177 0.0
import pandas as pd

df = pd.read_csv("Sample_Data.csv")
df['Date Modified'] = pd.to_datetime(df['Date']).dt.to_period('M')
df = df.groupby(["Date Modified", "Company"]).sum()
print(df)
There is currently nothing wrong with this program. I want to create one graph per month, with every company listed on the x axis, the count on the y axis, and a title containing the month and year, e.g. 2019-03 or 2019-02.
import matplotlib.pyplot as plt

months = df.index.levels[0]
for month in months:
    data = df.loc[month]  # rows for this month, indexed by Company
    data.plot(kind='bar', align='center', title=str(month), legend=True)
plt.show()
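The question also mentions CSV output. A minimal sketch of writing one file per month follows; the column name Count, the sample rows, and the temporary output directory are all assumptions made for illustration, since the source does not name the count column.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample rows standing in for Sample_Data.csv.
raw = pd.DataFrame({
    "Date": ["2019-01-03", "2019-01-17", "2019-02-05", "2019-02-20"],
    "Company": ["Apple", "LG", "Apple", "Apple"],
    "Count": [2, 3, 4, 1],
})
raw["Date Modified"] = pd.to_datetime(raw["Date"]).dt.to_period("M")
monthly = raw.groupby(["Date Modified", "Company"])["Count"].sum()

out_dir = tempfile.mkdtemp()
written = []
for month, counts in monthly.groupby(level="Date Modified"):
    # File name is the period, e.g. 2019-01.csv
    path = os.path.join(out_dir, f"{month}.csv")
    counts.droplevel("Date Modified").to_csv(path)
    written.append(path)
```

Each file then holds one row per company for that month, which is the same slice the plotting loop draws.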

Normalize time variable for recurrent LSTM Neural Network using Keras

I am using Keras to create an LSTM neural-network that can predict the concentration in the blood of a certain drug. I have a dataset with time stamps on which a drug dosage was administered and when the concentration in the blood was measured. These dosage and measurement time stamps are disjoint. Furthermore several other variables are measured at all time steps (both dosage and measurements). These variables are the input for my model along with the dosages (0 when no dosage was given at time t). The observed concentration in the blood is the response variable.
I have normalized all input features using the MinMaxScaler().
Q1:
Now I am wondering, do I need to normalize the time variable that corresponds with all rows as well and give it as input to the model? Or can I leave this variable out since the time steps are equally spaced?
The data looks like:
PatientID Time Dosage DosageRate DrugConcentration
1 0 100 12 NA
1 309 100 12 NA
1 650 100 12 NA
1 1030 100 12 NA
1 1320 NA NA 12
1 1405 100 12 NA
1 1812 90 8 NA
1 2078 90 8 NA
1 2400 NA NA 8
2 0 120 13.5 NA
2 800 120 13.5 NA
2 920 NA NA 16
2 1515 120 13.5 NA
2 1832 120 13.5 NA
2 2378 120 13.5 NA
2 2600 120 13.5 NA
2 3000 120 13.5 NA
2 4400 NA NA 2
As you can see, the time between two consecutive dosages and measurements differs for a patient and between patients, which makes the problem difficult.
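The irregular spacing can be made explicit by computing, per patient, the time elapsed since the previous row. The sketch below uses the first three rows of each patient from the table above; the column name TimeDelta is hypothetical, and feeding such a gap to the network as an extra input is only one possible option, not something the source prescribes.

```python
import pandas as pd

# First three rows per patient, copied from the table above.
df = pd.DataFrame({
    "PatientID": [1, 1, 1, 2, 2, 2],
    "Time": [0, 309, 650, 0, 800, 920],
})

# Gap to the previous row within the same patient; the first row of each
# patient has no predecessor, so its gap is set to 0.
df["TimeDelta"] = df.groupby("PatientID")["Time"].diff().fillna(0)
```

Like any other input feature, this column could then be scaled with the same scaler as the rest.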
Q2:
One approach I can think of is aggregating over the intervals between measurements and taking the average dosage and SD between two measurements. We would then only predict at time stamps for which we know the observed drug concentration. Would this work, or would we lose too much information?
Q3:
A second approach I can think of is to create new data points so that all intervals between rows are the same, setting the dosage and dosage rate at those added time points to zero. The disadvantage is that we can only calculate the error at the time stamps for which we know the observed drug concentration. How should we tackle this?

Fuel Consumption and mileage from OBD2 port parameters.

I am computing my fuel consumption from OBD2 parameters, MAF to be specific, and I receive data once per second. Here is a section of my data.
TS RS EngS MAF R MAP EL TD Travel
14:41:22 31 932 1056 98 23978 12130
14:41:23 29 2084 2639 107 23210 12130
14:41:24 32 2154 3867 149 38826 12130
14:41:25 36 2426 4683 184 36266 12130
14:41:26 39 2391 3031 133 682 12130
14:41:27 40 1784 2794 132 30634 12130
14:41:28 42 1864 2853 140 30378 12130
14:41:29 43 1953 2900 132 29098 12130
14:41:30 46 2031 3017 135 29098 12130
14:41:31 45 2027 2969 126 20138 12130
14:41:32 47 2122 4253 174 42154 12130
14:41:33 51 2220 4722 183 20906 12130
Where
TS : Time Stamp,
RS : Road Speed,
EngS : Engine Speed,
MAF R : Mass Air Flow Rate,
MAP : Manifold Absolute Pressure,
EL : Engine Load,
TD Travel : Total Distance Traveled
From this data I am trying to compute my instantaneous fuel consumption and the mileage in KMPL.
Since the data arrives once per second, I take the MAF of each row and use this formula:
Fuel Consumption = MAF/(14.7*710),
where 14.7 is the ideal air/fuel ratio
and 710 is the density of gasoline in grams/L.
This should give my consumption in litres for that second. I calculate the distance (in km) for that second as RS/3600, then divide the distance by the fuel consumption to get mileage. However, the results are badly off: the mileage of my car is around 14 KMPL. Here are my results.
TS Distance (inKM) Fuel Consum(L) Mileage(KMPL)
14:41:22 0.0086111111 0.1008355216 0.0853975957
14:41:23 0.0080555556 0.2519933158 0.0319673382
14:41:24 0.0088888889 0.369252805 0.0240726374
14:41:25 0.01 0.4471711626 0.0223628016
14:41:26 0.0108333333 0.2894246837 0.0374305785
14:41:27 0.0111111111 0.2667939842 0.0416467828
14:41:28 0.0116666667 0.2724277871 0.0428248043
14:41:29 0.0119444444 0.2769157317 0.0431338602
14:41:30 0.0127777778 0.2880878491 0.0443537546
14:41:31 0.0125 0.2835044163 0.0440910239
14:41:32 0.0130555556 0.4061112437 0.0321477323
14:41:33 0.0141666667 0.4508952017 0.0314189785
Can someone tell me what I am doing wrong that makes the computation so far off? The formulas are simple, so there isn't much room for error. Thank you.
MAF is in g/s.
MAF (g/s) * 1/14.7 * 1 L/710 g = fuel consumption FC in L/s.
Speed V is in KPH (km/h), so V (km/h) * (1 h/3600 s) = v in km/s.
FC (L/s) / v (km/s) gives L/km.
You want km/L, which is v/FC, so your final formula is
KmPL = V * 1/3600 * 1/MAF * 14.7 * 710
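The unit chain above can be written as a short sketch. The function and constant names are hypothetical. The standard OBD-II MAF PID (0x10) encodes flow as a raw integer in units of 0.01 g/s; whether the MAF R column in the question holds such raw values is an assumption, so the example computes the first row both ways.

```python
AIR_FUEL_RATIO = 14.7             # stoichiometric ratio used in the thread
GASOLINE_DENSITY_G_PER_L = 710.0  # density figure used in the thread


def fuel_rate_l_per_s(maf_g_per_s):
    """Fuel flow in litres per second from air flow in grams per second."""
    return maf_g_per_s / (AIR_FUEL_RATIO * GASOLINE_DENSITY_G_PER_L)


def mileage_km_per_l(speed_kph, maf_g_per_s):
    """Instantaneous km/L: (km/s) divided by (L/s)."""
    if maf_g_per_s <= 0:
        raise ValueError("MAF must be positive")
    return (speed_kph / 3600.0) / fuel_rate_l_per_s(maf_g_per_s)


# First row of the question's data: RS = 31, MAF R = 1056.
raw_maf = 1056
as_is = mileage_km_per_l(31, raw_maf)          # treating 1056 as g/s
scaled = mileage_km_per_l(31, raw_maf / 100)   # assuming raw units of 0.01 g/s
```

A factor of 100 in the MAF value translates directly into a factor of 100 in km/L, so checking the unit of the reported MAF is worth doing before questioning the formula itself.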
Divide the MAF by 14.7 to get grams of fuel per second.
Next, divide by 454 to get lbs of fuel per second.
Next, divide by 6.701 to get gallons of fuel per second.
Multiply by 3600 to get gallons per hour.
Alternatively, in one step: GPH = MAF * 0.0805, then MPG = MPH / GPH.
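As a quick check of the shortcut constant, the imperial steps can be chained in a small sketch. The names are hypothetical, and the conversion figures (454 g per lb, 6.701 lb per gallon) are the ones quoted in the answer, taken as given.

```python
AIR_FUEL_RATIO = 14.7
GRAMS_PER_POUND = 454.0
POUNDS_PER_GALLON = 6.701  # figure quoted in the answer


def maf_to_gph(maf_g_per_s):
    """Gallons of fuel per hour from air flow in grams per second."""
    fuel_g_per_s = maf_g_per_s / AIR_FUEL_RATIO
    gallons_per_s = fuel_g_per_s / GRAMS_PER_POUND / POUNDS_PER_GALLON
    return gallons_per_s * 3600.0


def mpg(speed_mph, maf_g_per_s):
    """Miles per gallon from speed in mph and MAF in g/s."""
    return speed_mph / maf_to_gph(maf_g_per_s)


# Combined factor for 1 g/s of air; should be close to 0.0805.
factor = maf_to_gph(1.0)
```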

Generate Seaborn Countplot using column value as count

For the following table
count_value
CPUCore Offline_RetentionAge
i7 183 4184
7 1981
30 471
i5 183 2327
7 831
30 250
Pentium 183 333
7 125
30 43
2 183 575
7 236
31 96
Is it possible to generate a seaborn countplot (or an ordinary count plot) like the one produced by sns.countplot(x='CPUCore', hue="Offline_BackupSchemaIncrementType", data=dfCombined_df)?
The problem is that I need to use count_value as the count, rather than actually counting the Offline_RetentionAge rows.
I think you need seaborn.barplot:
sns.barplot(x="CPUCore", y="count_value", hue="Offline_RetentionAge", data=df.reset_index())
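A minimal sketch of the reshaping step, assuming the table is a DataFrame with a two-level index named CPUCore and Offline_RetentionAge (the i7 and i5 rows are copied from the question). After reset_index() the index levels become ordinary columns, so CPUCore can go on the x axis and count_value supplies the bar height. The helper plot_counts is hypothetical and imports seaborn lazily so that building the frame does not require a plotting backend.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("i7", 183), ("i7", 7), ("i7", 30), ("i5", 183), ("i5", 7), ("i5", 30)],
    names=["CPUCore", "Offline_RetentionAge"],
)
df = pd.DataFrame({"count_value": [4184, 1981, 471, 2327, 831, 250]}, index=index)

# Long form: one row per (CPUCore, Offline_RetentionAge) pair.
long_df = df.reset_index()


def plot_counts(frame):
    """Grouped bars whose heights are the precomputed counts."""
    import seaborn as sns  # lazy import; only needed when plotting

    return sns.barplot(
        x="CPUCore", y="count_value", hue="Offline_RetentionAge", data=frame
    )
```

Because each (CPUCore, Offline_RetentionAge) pair occurs once, barplot's default mean aggregation leaves the values unchanged.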

SQL Convert Int to Time

I'm trying to import data from a text file into an SQL database. The file uses TAB rather than a comma to separate fields. My issue is that the time, which is given as an int, gets completely mangled on import.
Part of the text file:
North Felix 2011-07-01 0422 0.47 0012 0.69 2109 0.55 1311 1.44
North Felix 2011-07-02 0459 0.43 0048 0.72 2140 0.55 1342 1.47
North Felix 2011-07-03 0533 0.41 0123 0.75 2213 0.57 1412 1.46
North Felix 2011-07-04 0605 0.41 0158 0.79 2244 0.59 1441 1.41
My query result:
INSERT INTO `dbc`.`history_long` (`Region`,`Location`,`Date`,`LT1-time`,`LT1-height`,`HT1-time`,`HT1-height`,`LT2-time`,`LT2-height`,`HT2-time`,`HT2-height`)
values ('North','Felix','2011:00:00','422:00:00','0.47','12:00:00','0.69','2109:00:00','0.55','1311:00:00','1.44'),
('North','Felix','2011:00:00','459:00:00','0.43','48:00:00','0.72','2140:00:00','0.55','1342:00:00','1.47'),
('North','Felix','2011:00:00','533:00:00','0.41','123:00:00','0.75','2213:00:00','0.57','1412:00:00','1.46'),
('North','Felix','2011:00:00','605:00:00','0.41','158:00:00','0.79','2244:00:00','0.59','1441:00:00','1.41'),
The issue is that, for example, LT2-time becomes 2109:00:00 in the time column. Is there a way to convert this from int to time?
Here's how you could convert an int like 0422 (read as HHMM) to a time value:
SELECT CAST('00:00' AS time)
+ INTERVAL (@IntValue DIV 100) HOUR
+ INTERVAL (@IntValue % 100) MINUTE
Without seeing your import query it's impossible to say how you could actually apply it.
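If the file is preprocessed before the INSERT, the same conversion can be done on the client side. This is a sketch under the assumption that each value is a 24-hour HHMM reading such as 0422 or 2109; the function name hhmm_to_time is hypothetical.

```python
def hhmm_to_time(value):
    """Convert an HHMM value such as '0422' or 422 to 'HH:MM:SS'."""
    digits = str(value).strip().zfill(4)  # restore leading zeros lost by int parsing
    if len(digits) != 4 or not digits.isdigit():
        raise ValueError(f"not an HHMM value: {value!r}")
    hours, minutes = int(digits[:2]), int(digits[2:])
    if hours > 23 or minutes > 59:
        raise ValueError(f"out-of-range time: {value!r}")
    return f"{hours:02d}:{minutes:02d}:00"


# Values from the first line of the sample file.
converted = [hhmm_to_time(v) for v in ["0422", "0012", "2109", "1311"]]
```

The resulting strings are in the 'HH:MM:SS' form that a TIME column accepts directly.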