Format time in R - posixct

I'm new to R, but need help formatting just one variable in my data frame:
> str(g.2015.1990$START.TIME)
int [1:60464] 712 606 1238 708 709 707 407 707 612 1237 ...
Although R interprets this as an integer, it is a time variable stored as 3-4 digits (HMM or HHMM depending on the time, where the hour ranges from 1 to 12, not 1 to 24) with no date attached (although there is a separate date variable that corresponds with each time). I've read all the related posts, googled everything I can think of, watched every YouTube video I could find, and even went out and bought "R for Dummies", but nothing has worked. Here is a fraction of the code I've tried:
timex <- mapply(function(x, y) paste0(rep(x, y), collapse = ""), 0, 4 - nchar(g.2015.1990$START.TIME))
timexx <- paste0(timex, g.2015.1990$START.TIME)
timexxx = format(strptime(timexx, format="%H%M"), format = "%H:%M")
timex = format(as.POSIXct(sprintf("%004.0f",g.2015.1990$START.TIME),format="%H%M"),"%H:%M")
timex = format(strptime(g.2015.1990$START.TIME,"%H%M"),"%H:%M")
timex <- gsub("([0-9]{3})","0\\1",g.2015.1990$START.TIME)
timexx = format(strptime("g.2015.1990$START.TIME","%H%M%S"),"%H:%M:%S")
timex = as.numeric(g.2015.1990$START.TIME)
timex = if(g.2015.1990$START.TIME) < 4 nchar(g.2015.1990$START.TIME) + 1
timex = as.Date(origin = g.2015.1990$START.TIME, format = "%H%M")
timex = as.POSIXct(as.character(g.2015.1990$START.TIME))
timex = strptime("g.2015.1990$DATE, g.2015.1990$START.TIME", "%y%m%d, %H%M")
timex = strftime("g.2015.1990$START.TIME", "%H%M")
timex = as.POSIXct(g.2015.1990$START.TIME, format="%H:%M")
timex = times(paste0("00", as.character(g.2015.1990$START.TIME)))
timex = strptime(g.2015.1990$START.TIME, "%I%M")
timex = as.POSIXlt(g.2015.1990$START.TIME, format = "%I%M", origin = "g.2015.1990$START.TIME")
timex = as.POSIXlt(as.character(g.2015.1990$START.TIME), format = "%I%M")
(Where "timex" is a stand in for the variable in question, so as not to mess up my data)
Some of these are sequences and some are single-lines of code. I can't even keep it straight anymore. Your help is VERY much appreciated.

Can't get dimensions of arrays equal to plot with MatPlotLib

I am trying to create a plot of arrays where one is calculated based on my x-axis calculated in a for loop. I've gone through my code multiple times and tested in between what exactly the lengths are for my arrays, but I can't seem to think of a solution that makes them equal length.
This is the code I have started with:
import numpy as np
import matplotlib.pyplot as plt
a = 1 ;b = 2 ;c = 3; d = 1; e = 2
t0 = 0
t_end = 10
dt = 0.05
t = np.arange(t0, t_end, dt)
n = len(t)
fout = 1
M = 1
Ca = np.zeros(n)
Ca[0] = a; Cb[0] = b
Cc[0] = 0;
k1 = 1
def rA(Ca, Cb, Cc, t):
    -k1 * Ca**a * Cb**b * dt
    return -k1 * Ca**a * Cb**b * dt
while e > 1e-3:
    t = np.arange(t0, t_end, dt)
    n = len(t)
    for i in range(1,n-1):
        Ca[i+1] = Ca[i] + rA(Ca[i], Cb[i], Cc[i], t[i])
    e = abs((M-Ca[n-1])/M)
    M = Ca[n-1]
    dt = dt/2
plt.plot(t, Ca)
plt.grid()
plt.show()
Afterwards, I try to calculate a second function for different y-values. Within the for loop I added:
Cb[i+1] = Cb[i] + rB(Ca[i], Cb[i], Cc[i], t[i])
While also defining rB in a similar manner as rA. The error code I received at this point is:
IndexError: index 200 is out of bounds for axis 0 with size 200
I feel like it has to do with the way I'm initializing the array for Ca. In MatLab, which I'm more familiar with, the equivalent would be:
Ca = zeros(1,n)
I have recreated the code I have written here in MatLab and I do receive a plot. So I'm wondering where I am going wrong here?
So I thought my best course of action was to fix n as an int by changing it in the while loop. But after changing n = len(t) to n = 100, I received the following error message:
ValueError: x and y must have same first dimension, but have shapes (200,) and (400,)
As my previous question turned out to be something trivial that I kept missing, I feel like this is the same. But I have spent over an hour looking and trying fixes without success.
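One reading of the IndexError, sketched under assumptions: Ca is allocated once with the original n (200), but each pass of the while loop halves dt, so t and n double while Ca keeps its old length. Re-allocating the arrays whenever t is rebuilt keeps them the same length. In the sketch below, integrate is a hypothetical helper, the exponents 1 and 2 mirror a and b from the question, and reusing the same rate for Cb is an assumption, since rB is not shown.

```python
import numpy as np

def integrate(dt, t0=0.0, t_end=10.0, ca0=1.0, cb0=2.0, k1=1.0):
    """Explicit Euler on a fresh grid; arrays are sized from t on every call."""
    t = np.arange(t0, t_end, dt)
    n = len(t)
    ca = np.zeros(n)  # re-allocated for each dt, so len(ca) == len(t)
    cb = np.zeros(n)
    ca[0], cb[0] = ca0, cb0
    for i in range(n - 1):  # i + 1 never exceeds n - 1
        rate = k1 * ca[i] ** 1 * cb[i] ** 2  # exponents a = 1, b = 2
        ca[i + 1] = ca[i] - rate * dt
        cb[i + 1] = cb[i] - rate * dt  # assumption: rB equals rA
    return t, ca, cb

dt = 0.05
previous = None
for _ in range(12):  # cap on the number of refinements (arbitrary choice)
    t, ca, cb = integrate(dt)
    if previous is not None and abs((previous - ca[-1]) / previous) < 1e-3:
        break
    previous = ca[-1]
    dt /= 2

print(len(t), len(ca), len(cb))  # the three lengths match
```

With this layout, plt.plot(t, ca) receives arrays of equal length. Note also that the loop starts at index 0; range(1, n-1) as in the question never writes Ca[1].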

Code or Logic to find number of char appearances in a string composed of consecutive numbers

I am struggling with this exercise where I have to find a number (y) so that when counting the times (nr) the value "1" appears in a string (x) composed of all the consecutive numbers starting from 1 to y, the following conditions are met: nr=y and nr is divisible by 10.
example:
x (string with consecutive from 1 to 12)= 123456789101112
y (the number) = 12
nr (times of "1" appearances) = 5
So I need to find the situation where nr = y and y mod 10 = 0.
I've tried creating a VBA sub to do this, but it takes forever and cannot seem to find a suitable result:
Sub abc2()
    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual
    Dim i As Double
    Dim y As Double
    Dim nr As Double
    Dim x As String
    x = 1
    y = 1
    For i = 1 To 500001
        x = x & (y + 1)
        y = y + 1
        nr = Len(x) - Len(Replace(x, "1", ""))
        If nr = y And nr Mod 10 = 0 Then
            Range("E1") = y
            GoTo out
        End If
    Next i
out:
    Range("A1") = x
    Range("B1") = y
    Range("C1") = nr
    Application.ScreenUpdating = True
    Application.Calculation = xlCalculationAutomatic
End Sub
I'd really appreciate some suggestions. Maybe it can be solved in some other ingenious way.
Thank you!
Python:
x = ''
y = 0
####################################
# BRUTE FORCE, FINDS ANSWER 199990 #
####################################
#for iteration in range(100000):
#    for index in range(10):
#        y += 1
#        x += str(y)
#    if (y == x.count('1')):
#        print('Found: ' + str(y) + ': ' + x)
####################################
# More elegantly and efficiently, just track how many '1's we've added in each step
ones = 0
for iteration in range(100000):
    x = ''  # only the ten numbers added in this step
    for index in range(10):
        y += 1
        x += str(y)
    ones += x.count('1')
    if (y == ones):
        print('Found: ' + str(y))
The commented-out solution takes about 2 minutes to execute. The second solution finishes in .46 seconds.
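A sketch of another route, assuming only the problem statement above: the number of "1" digits in 1..y can be computed one digit position at a time, so the string never has to be built. The name count_ones_upto is chosen here for illustration.

```python
def count_ones_upto(n: int) -> int:
    """Count how many times the digit 1 appears when writing out 1..n."""
    total = 0
    power = 1  # current digit position: 1, 10, 100, ...
    while power <= n:
        higher, rest = divmod(n, power * 10)
        digit, lower = divmod(rest, power)
        total += higher * power  # full cycles of this position
        if digit > 1:
            total += power       # a complete run of 1s in the partial cycle
        elif digit == 1:
            total += lower + 1   # a partial run of 1s
        power *= 10
    return total

# First multiple of 10 where the count equals the number itself.
first = next(y for y in range(10, 1_000_001, 10) if count_ones_upto(y) == y)
print(first)
```

For the example in the question, count_ones_upto(12) gives 5, and the search lands on the same value as the brute-force version.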

How to add a column of simple moving average of another column to a Julia data frame

I have a Julia data frame where one column is called 'close' and I want to add another column to the data frame called 'sma' which is a simple moving average of 'close'. Thanks to anyone who can help!
I noticed a problem in amrod's code. It doesn't account for the first SMA-1 entries, which don't have enough previous data points for a full SMA, and it averages over roughly double the window that is asked for (wind points on each side of the current one). I changed it to fill with NaN up to that point, and I also changed the variable names while I was figuring out how it works.
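using Statistics  # provides mean on Julia 1.x
function makeSMA(data, SMA)
    len = length(data)
    y = Vector{Float64}(undef, len)  # Julia 1.x needs undef here
    for i in 1:SMA-1
        y[i] = NaN  # not enough history yet
    end
    for i in SMA:len
        y[i] = mean(data[i-(SMA-1):i])
    end
    return y
end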
Check this:
using Statistics  # provides mean on Julia 1.x
function ma(x::Vector{T}, wind::Int) where {T <: Real}
    len = length(x)
    y = Vector{Float64}(undef, len)
    for i in 1:len
        lo = max(1, i - wind)
        hi = min(len, i + wind)
        y[i] = mean(x[lo:hi])  # centered window, up to 2*wind + 1 points
    end
    return y
end
x = collect(1:100)
y = ma(x, 4)
Then you can hcat(x, y).
EDIT:
If you want a backwards-looking MA you can use something like
function ma(x::Vector{T}, wind::Int) where {T <: Real}
    len = length(x)
    y = Vector{Float64}(undef, len)
    for i in 1:len
        if i < wind
            y[i] = NaN
        else
            y[i] = mean(x[(i - wind + 1):i])  # the last wind points
        end
    end
    return y
end
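To make the difference between the two windows concrete, here is a small language-neutral sketch in Python (not Julia), using 0-based indexing. The names centered_ma and trailing_ma are chosen here for illustration; they mirror the centered and backwards-looking versions above.

```python
import math
from statistics import fmean

def centered_ma(x, wind):
    """Average over up to wind points on each side: 2*wind + 1 points in total."""
    n = len(x)
    return [fmean(x[max(0, i - wind):min(n, i + wind + 1)]) for i in range(n)]

def trailing_ma(x, wind):
    """Average over the last wind points; NaN until enough history exists."""
    return [
        math.nan if i + 1 < wind else fmean(x[i - wind + 1:i + 1])
        for i in range(len(x))
    ]

close = list(range(1, 11))
print(centered_ma(close, 2))
print(trailing_ma(close, 4))
```

With wind = 4, the trailing version averages exactly 4 values per entry, while the centered version averages up to 9, which is the "double" effect mentioned above.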

how to get the sum of alphabetical characters Shannon entropy

I am trying to add up the Shannon entropy of all the alphabetical characters in a Word document.
Instead of adding up the values, it gives me what I put for characters(27) as the answer.
Dim characters(1 To 27) As Double
Dim x As Integer 'for looping
Dim tot As Double 'The final value
characters(1) = 0.1859 'Space
characters(2) = 0.0856 'A
characters(3) = 0.0139 'B
characters(4) = 0.0279 'C
characters(5) = 0.0378 'D
characters(6) = 0.1304 'E
characters(7) = 0.0289 'F
characters(8) = 0.0199 'G
characters(9) = 0.0528 'H
characters(10) = 0.0627 'I
characters(11) = 0.0013 'J
characters(12) = 0.0042 'K
characters(13) = 0.0339 'L
characters(14) = 0.0249 'M
characters(15) = 0.0707 'N
characters(16) = 0.0797 'O
characters(17) = 0.0199 'P
characters(18) = 0.0012 'Q
characters(19) = 0.0677 'R
characters(20) = 0.0607 'S
characters(21) = 0.1045 'T
characters(22) = 0.0249 'U
characters(23) = 0.0092 'V
characters(24) = 0.0149 'W
characters(25) = 0.0017 'X
characters(26) = 0.0199 'Y
characters(27) = 0.0008 'Z
For x = 1 To 27
tot = tol + characters(x)
Next
MsgBox "The Shannon entropy of the alphabetic characters in this document is " & tot
What I am getting
Today was a good day
The Shannon entropy of the characters in this document is 0.0008
What I am trying to get
Today was a good day
The Shannon entropy of the characters in this document is 1.2798542258337
I don't know if you have noticed that you wrote this:
For x = 1 To 27
tot = tol + characters(x)
Next
...while you might have wanted to write this:
For x = 1 To 27
tot = tot + characters(x)
Next
In fact, what you want is for tot to accumulate each new value on top of its previous value. But if you write
tot = tol + characters(x)
...tol is always 0 (it is never declared or assigned, so it keeps its default value of 0), and so tot simply ends up equal to 0 + the last element, because each iteration overwrites the previous result. It's a typo: just replace tot = tol + characters(x) with tot = tot + characters(x) and the code will work. Putting Option Explicit at the top of the module would make VBA flag the undeclared tol.
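The effect of the typo can be traced in a few lines. This is a Python sketch rather than VBA, with tol set to 0.0 explicitly to mimic VBA's default for an undeclared variable; the weights are the values from the question.

```python
weights = [
    0.1859, 0.0856, 0.0139, 0.0279, 0.0378, 0.1304, 0.0289, 0.0199, 0.0528,
    0.0627, 0.0013, 0.0042, 0.0339, 0.0249, 0.0707, 0.0797, 0.0199, 0.0012,
    0.0677, 0.0607, 0.1045, 0.0249, 0.0092, 0.0149, 0.0017, 0.0199, 0.0008,
]
tol = 0.0  # stands in for VBA's implicitly created, never-assigned variable

buggy_tot = 0.0
for w in weights:
    buggy_tot = tol + w  # overwritten on every pass; only the last weight survives

fixed_tot = 0.0
for w in weights:
    fixed_tot = fixed_tot + w  # accumulates

print(buggy_tot)
print(round(fixed_tot, 4))
```

The buggy loop ends on the last table entry, which matches the 0.0008 the asker reports. The corrected loop sums the table itself (about 1.1859), so the 1.2798... figure the asker expects presumably comes from a different calculation than the one shown.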

Which points are co-linear & in sequence (i.e. which two points I am between)

I have the longitude and latitude of my position, and I have a list of positions, ordered as a list of points (long/lat) along a road. How do I find out which two points I am between?
If I just search for the two nearest points, I will end up with P2 and P3 in the case shown in my picture.
I want to know how to find out that I'm between points P1 and P2.
The list of points I will search will be database rows containing latitude and longitude, so pointers on how to build the SQL query, LINQ query or pseudocode are welcome; anything that points me to the best solution helps. I'm new to geolocation and the math around it, so treat me as a newbie. ;)
(The list of points will be ordered, so P1 will have an id of 1, P2 will have an id of 2, and so on.)
Bear in mind that what you propose might become really complex (many points under equivalent conditions) and thus delivering an accurate algorithm would require (much) more work. Taking care of simpler situations (like the one in your picture) is not so difficult; you have to include the following:
Convert latitude/longitude values into cartesian coordinates (for ease of calculations; although you might even skip this step). In this link you can get some inspiration on this front; it is in C#, but the ideas are clear anyway.
Iterate through all the available points "by couples" and check whether the point to be analysed (Mypos) falls on the line formed by them, in an intermediate position. As shown in the code below, this calculation is pretty simple and thus you don't need to do any pre-filtering (looking for closer points first).
Dim point1() As Double = New Double() {0, 0} 'x,y
Dim point2() As Double = New Double() {0, 3}
Dim pointToCheck() As Double = New Double() {0.05, 2}
Dim similarityRatio As Double = 0.9
Dim minValSimilarDistance As Double = 0.001
Dim similarityDistance As Double = 0.5
Dim eq1 As Double = (point2(0) - point1(0)) * (pointToCheck(1) - point1(1))
Dim eq2 As Double = (point2(1) - point1(1)) * (pointToCheck(0) - point1(0))
Dim maxVal As Double = eq1
If (eq2 > eq1) Then maxVal = eq2
Dim inLine = False
Dim isInBetween As Boolean = False
If (eq1 = eq2 OrElse (maxVal > 0 AndAlso Math.Abs(eq1 - eq2) / maxVal <= (1 - similarityRatio))) Then
    inLine = True
ElseIf (eq1 <= minValSimilarDistance AndAlso eq2 <= similarityDistance) Then
    inLine = True
ElseIf (eq2 <= minValSimilarDistance AndAlso eq1 <= similarityDistance) Then
    inLine = True
End If
If (inLine) Then
    'pointToCheck part of the line formed by point1 and point2, but not necessarily between them
    Dim insideX As Boolean = False
    If (pointToCheck(0) >= point1(0) AndAlso pointToCheck(0) <= point2(0)) Then
        insideX = True
    ElseIf (pointToCheck(0) >= point2(0) AndAlso pointToCheck(0) <= point1(0)) Then
        insideX = True
    End If
    If (insideX) Then
        If (pointToCheck(1) >= point1(1) AndAlso pointToCheck(1) <= point2(1)) Then
            isInBetween = True
        ElseIf (pointToCheck(1) >= point2(1) AndAlso pointToCheck(1) <= point1(1)) Then
            isInBetween = True
        End If
    End If
End If
If (isInBetween) Then
    'pointToCheck is between point1 and point2
End If
As you can see, I have included various ratios allowing you to tweak the exact conditions (the points will, most likely, not fall exactly on the line). similarityRatio accounts for the "equations" being more or less similar (that is, X and Y values not fitting the line exactly but being close enough). similarityRatio cannot deal properly with cases involving zeroes (e.g., same X or Y); this is what minValSimilarDistance and similarityDistance are for. You can tune these values or just re-define the ratios (with respect to X/Y variations between points, instead of with respect to the "equations").
An equivalent solution in Scala for clarity:
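// Assumed shape of Point; it is not defined in the original snippet.
case class Point(lat: Double, lng: Double)
// b is the candidate middle point, a and c are the two road points.
def colinearAndInOrder(a: Point, b: Point, c: Point): Boolean = {
  lazy val colinear: Boolean =
    math.abs((a.lng - b.lng) * (a.lat - c.lat) -
      (a.lng - c.lng) * (a.lat - b.lat)) <= 1e-9
  // Strict inequalities: segments with constant lat or lng are not matched here.
  lazy val bounded: Boolean =
    ((a.lat < b.lat && b.lat < c.lat) || (a.lat > b.lat && b.lat > c.lat)) &&
      ((a.lng < b.lng && b.lng < c.lng) || (a.lng > b.lng && b.lng > c.lng))
  close(a, b) || close(b, c) || (colinear && bounded)
}
def close(a: Point, b: Point): Boolean = {
  // Compare lat with lat and lng with lng.
  math.abs(a.lat - b.lat) <= 1e-4 && math.abs(a.lng - b.lng) <= 1e-4
}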
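The "iterate by couples" step is described above but not shown. Here is a minimal Python sketch of it, assuming the points have already been converted to planar x/y coordinates. Note that it uses a cross-product distance test and a dot-product projection test rather than the ratio thresholds from the code above; is_between, find_segment and the tol value are hypothetical names and settings chosen for illustration.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y), assumed already in planar coordinates

def is_between(p1: Point, p2: Point, q: Point, tol: float = 0.1) -> bool:
    """True if q is within tol of the line through p1-p2 and projects onto the segment."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return False  # degenerate segment
    # |cross| / length is the perpendicular distance from q to the line.
    cross = dx * (q[1] - p1[1]) - dy * (q[0] - p1[0])
    if cross * cross > tol * tol * length_sq:
        return False
    # The projection of q must fall between the two endpoints.
    dot = dx * (q[0] - p1[0]) + dy * (q[1] - p1[1])
    return 0.0 <= dot <= length_sq

def find_segment(points: List[Point], q: Point, tol: float = 0.1) -> Optional[Tuple[int, int]]:
    """Return the 1-based ids of the first consecutive pair that q lies between."""
    for i in range(len(points) - 1):
        if is_between(points[i], points[i + 1], q, tol):
            return i + 1, i + 2
    return None

road = [(0.0, 0.0), (0.0, 3.0), (2.0, 3.0), (2.0, 0.0)]
print(find_segment(road, (0.05, 2.0)))
print(find_segment(road, (1.0, 5.0)))
```

The first query reuses the sample values from the code above (a point slightly off the line from (0, 0) to (0, 3)); the second is far from every segment. Because the pairs are visited in id order, the sketch reports the first matching segment and does not rank several candidates.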