Given an array, verify from the first element how many steps are needed to reach the end.
Example: arr = [1, 3, 5, 8, 4, 2, 6, 7, 0, 7, 9]
1 -> 3 -> 8 (this is the shortest path)
3 steps.
So far, i have this code from geeks for geeks:
def jumpCount(x, n):
jumps = [0 for i in range(n)]
if (n == 0) or (x[0] == 0):
return float('inf')
jumps[0] = 0
for i in range(1, n):
jumps[i] = float('inf')
for j in range(i):
if (i <= j + x[j]) and (jumps[j] != float('inf')):
jumps[i] = min(jumps[i], jumps[j] + 1)
break
return jumps[n-1]
def jumps(x):
n = len(x)
return jumpCount(x,n)
x = [1, 3, 5, 8, 4, 2, 6, 7, 0, 7, 9]
print(jumps(x))
But I want to print out what numbers made the shortest path (1-3-8). How can I adapt the code to do it?
I tried to create a list of j's but since 5 is tested in the loop, it's appended too.
Link to the problem:
https://www.geeksforgeeks.org/minimum-number-of-jumps-to-reach-end-of-a-given-array/
The essential idea is that you need an auxiliary structure to help you keep track of the minimum path. Those type of structures are usually called "backpointers" (you could call them in our case "forwardpointers" since we are going forward, duh). My code solves the problem recursively, but the same could be done iteratively. The strategy is as follows:
jumps_vector = [ 1, 3, 5, 8, 4, 2, 6, 7, 0, 7, 9 ]
"""
fwdpointers holds the relative jump size to reach the minimum number of jumps
for every component of the original vector
"""
fwdpointers = {}
def jumps( start ):
if start == len( jumps_vector ) - 1:
# Reached the end
return 0
if start > len( jumps_vector ) - 1:
# Cannot go through that path
return math.inf
if jumps_vector[ start ] == 0:
# Cannot go through that path (infinite loop with itself)
return math.inf
# Get the minimum in a traditional way
current_min = jumps( start + 1 )
fwdpointers[ start ] = start + 1
for i in range( 2, jumps_vector[ start ] + 1 ):
aux_min = jumps( start + i )
if current_min > aux_min:
# Better path. Update minimum and fwdpointers
current_min = aux_min
# Store the (relative!) index of where I jump to
fwdpointers[ start ] = i
return 1 + current_min
In this case, the variable fwdpointers stores the relative indexes of where I jump to. For instance, fwdpointers[ 0 ] = 1, since I will jump to the adjacent number, but fwdpointers[ 1 ] = 2 since I will jump two numbers the next jump.
Having done that, then it's only a matter of postprocessing things a bit on the main() function:
if __name__ == "__main__":
min_jumps = jumps( 0 )
print( min_jumps )
# Holds the index of the jump given such that
# the sequence of jumps are the minimum
i = 0
# Remember that the contents of fwdpointers[ i ] are the relative indexes
# of the jump, not the absolute ones
print( fwdpointers[ 0 ] )
while i in fwdpointers and i + fwdpointers[ i ] < len( jumps_vector ):
print( jumps_vector[ i + fwdpointers[ i ] ] )
# Get the index of where I jump to
i += fwdpointers[ i ]
jumped_to = jumps_vector[ i ]
I hope this answered your question.
EDIT: I think the iterative version is more readable:
results = {}
backpointers = {}
def jumps_iter():
results[ 0 ] = 0
backpointers[ 0 ] = -1
for i in range( len( jumps_vector ) ):
for j in range( 1, jumps_vector[ i ] + 1 ):
if ( i + j ) in results:
results[ i + j ] = min( results[ i ] + 1, results[ i + j ] )
if results[ i + j ] == results[ i ] + 1:
# Update where I come from
backpointers[ i + j ] = i
elif i + j < len( jumps_vector ):
results[ i + j ] = results[ i ] + 1
# Set where I come from
backpointers[ i + j ] = i
return results[ len( jumps_vector ) - 1 ]
And the postprocessing:
i = len( jumps_vector ) - 1
print( jumps_vector[ len( jumps_vector ) - 1 ], end = " " )
while backpointers[ i ] >= 0:
print( jumps_vector[ backpointers[ i ] ], end = " " )
i = backpointers[ i ]
print()
Related
Consider the following lists short_list and long_list
short_list = list('aaabaaacaaadaaac')
np.random.seed([3,1415])
long_list = pd.DataFrame(
np.random.choice(list(ascii_letters),
(10000, 2))
).sum(1).tolist()
How do I calculate the cumulative count by unique value?
I want to use numpy and do it in linear time. I want this to compare timings with my other methods. It may be easiest to illustrate with my first proposed solution
def pir1(l):
s = pd.Series(l)
return s.groupby(s).cumcount().tolist()
print(np.array(short_list))
print(pir1(short_list))
['a' 'a' 'a' 'b' 'a' 'a' 'a' 'c' 'a' 'a' 'a' 'd' 'a' 'a' 'a' 'c']
[0, 1, 2, 0, 3, 4, 5, 0, 6, 7, 8, 0, 9, 10, 11, 1]
I've tortured myself trying to use np.unique because it returns a counts array, an inverse array, and an index array. I was sure I could these to get at a solution. The best I got is in pir4 below which scales in quadratic time. Also note that I don't care if counts start at 1 or zero as we can simply add or subtract 1.
Below are some of my attempts (none of which answer my question)
%%cython
from collections import defaultdict
def get_generator(l):
counter = defaultdict(lambda: -1)
for i in l:
counter[i] += 1
yield counter[i]
def pir2(l):
return [i for i in get_generator(l)]
def pir3(l):
return [i for i in get_generator(l)]
def pir4(l):
unq, inv = np.unique(l, 0, 1, 0)
a = np.arange(len(unq))
matches = a[:, None] == inv
return (matches * matches.cumsum(1)).sum(0).tolist()
setup
short_list = np.array(list('aaabaaacaaadaaac'))
functions
dfill takes an array and returns the positions where the array changes and repeats that index position until the next change.
# dfill
#
# Example with short_list
#
# 0 0 0 3 4 4 4 7 8 8 8 11 12 12 12 15
# [ a a a b a a a c a a a d a a a c]
#
# Example with short_list after sorting
#
# 0 0 0 0 0 0 0 0 0 0 0 0 12 13 13 15
# [ a a a a a a a a a a a a b c c d]
argunsort returns the permutation necessary to undo a sort given the argsort array. The existence of this method became know to me via this post.. With this, I can get the argsort array and sort my array with it. Then I can undo the sort without the overhead of sorting again.
cumcount will take an array sort it, find the dfill array. An np.arange less dfill will give me cumulative count. Then I un-sort
# cumcount
#
# Example with short_list
#
# short_list:
# [ a a a b a a a c a a a d a a a c]
#
# short_list.argsort():
# [ 0 1 2 4 5 6 8 9 10 12 13 14 3 7 15 11]
#
# Example with short_list after sorting
#
# short_list[short_list.argsort()]:
# [ a a a a a a a a a a a a b c c d]
#
# dfill(short_list[short_list.argsort()]):
# [ 0 0 0 0 0 0 0 0 0 0 0 0 12 13 13 15]
#
# np.range(short_list.size):
# [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]
#
# np.range(short_list.size) -
# dfill(short_list[short_list.argsort()]):
# [ 0 1 2 3 4 5 6 7 8 9 10 11 0 0 1 0]
#
# unsorted:
# [ 0 1 2 0 3 4 5 0 6 7 8 0 9 10 11 1]
foo function recommended by #hpaulj using defaultdict
div function recommended by #Divakar (old, I'm sure he'd update it)
code
def dfill(a):
n = a.size
b = np.concatenate([[0], np.where(a[:-1] != a[1:])[0] + 1, [n]])
return np.arange(n)[b[:-1]].repeat(np.diff(b))
def argunsort(s):
n = s.size
u = np.empty(n, dtype=np.int64)
u[s] = np.arange(n)
return u
def cumcount(a):
n = a.size
s = a.argsort(kind='mergesort')
i = argunsort(s)
b = a[s]
return (np.arange(n) - dfill(b))[i]
def foo(l):
n = len(l)
r = np.empty(n, dtype=np.int64)
counter = defaultdict(int)
for i in range(n):
counter[l[i]] += 1
r[i] = counter[l[i]]
return r - 1
def div(l):
a = np.unique(l, return_counts=1)[1]
idx = a.cumsum()
id_arr = np.ones(idx[-1],dtype=int)
id_arr[0] = 0
id_arr[idx[:-1]] = -a[:-1]+1
rng = id_arr.cumsum()
return rng[argunsort(np.argsort(l))]
demonstration
cumcount(short_list)
array([ 0, 1, 2, 0, 3, 4, 5, 0, 6, 7, 8, 0, 9, 10, 11, 1])
time testing
code
functions = pd.Index(['cumcount', 'foo', 'foo2', 'div'], name='function')
lengths = pd.RangeIndex(100, 1100, 100, 'array length')
results = pd.DataFrame(index=lengths, columns=functions)
from string import ascii_letters
for i in lengths:
a = np.random.choice(list(ascii_letters), i)
for j in functions:
results.set_value(
i, j,
timeit(
'{}(a)'.format(j),
'from __main__ import a, {}'.format(j),
number=1000
)
)
results.plot()
Here's a vectorized approach using custom grouped range creating function and np.unique for getting the counts -
def grp_range(a):
idx = a.cumsum()
id_arr = np.ones(idx[-1],dtype=int)
id_arr[0] = 0
id_arr[idx[:-1]] = -a[:-1]+1
return id_arr.cumsum()
count = np.unique(A,return_counts=1)[1]
out = grp_range(count)[np.argsort(A).argsort()]
Sample run -
In [117]: A = list('aaabaaacaaadaaac')
In [118]: count = np.unique(A,return_counts=1)[1]
...: out = grp_range(count)[np.argsort(A).argsort()]
...:
In [119]: out
Out[119]: array([ 0, 1, 2, 0, 3, 4, 5, 0, 6, 7, 8, 0, 9, 10, 11, 1])
For getting the count, few other alternatives could be proposed with focus on performance -
np.bincount(np.unique(A,return_inverse=1)[1])
np.bincount(np.fromstring('aaabaaacaaadaaac',dtype=np.uint8)-97)
Additionally, with A containing single-letter characters, we could get the count simply with -
np.bincount(np.array(A).view('uint8')-97)
Besides defaultdict there are a couple of other counters. Testing a slightly simpler case:
In [298]: from collections import defaultdict
In [299]: from collections import defaultdict, Counter
In [300]: def foo(l):
...: counter = defaultdict(int)
...: for i in l:
...: counter[i] += 1
...: return counter
...:
In [301]: short_list = list('aaabaaacaaadaaac')
In [302]: foo(short_list)
Out[302]: defaultdict(int, {'a': 12, 'b': 1, 'c': 2, 'd': 1})
In [303]: Counter(short_list)
Out[303]: Counter({'a': 12, 'b': 1, 'c': 2, 'd': 1})
In [304]: arr=[ord(i)-ord('a') for i in short_list]
In [305]: np.bincount(arr)
Out[305]: array([12, 1, 2, 1], dtype=int32)
I constructed arr because bincount only works with ints.
In [306]: timeit np.bincount(arr)
The slowest run took 82.46 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.63 µs per loop
In [307]: timeit Counter(arr)
100000 loops, best of 3: 13.6 µs per loop
In [308]: timeit foo(arr)
100000 loops, best of 3: 6.49 µs per loop
I'm guessing it would hard to improve on pir2 based on default_dict.
Searching and counting like this are not a strong area for numpy.
As the title says,
I want to solve a problem similar to the summation of multiple schemes into a fixed constant, However, when I suggest the constrained optimization model, I can't get all the basic schemes well. Part of the opinion is to add a constraint when I get a solution. However, the added constraint leads to incomplete solution and no addition leads to a dead cycle.
Here is my problem description
I have a list of benchmark data detail_list ,My goal is to select several numbers from the benchmark data list(detail_list), but not all of them, so that the sum of these data can reach the sum of the number(plan_amount) I want.
For Examle
detail_list = [50, 100, 80, 40, 120, 25],
plan_amount = 20,
The feasible schemes are:
detail_list[2]=20 can be satisfied, detail_list[1](noly 10) + detail_list[3](only 10) = plan_amount(20) , detail_list[1](only 5) + detail_list[3](only 15) = plan_amount(20) also can be satisfied, and detail_list1 + detail_list2 + detail_list3 = plan_amount(20). But you can't take four elements in the detail_list are combined, because number = 3, indicating that a maximum of three elements are allowed to be combined.
from pulp import *
num = 6 # the list max length
number_max = 3 # How many combinations can there be at most
plan_amount = 20
detail_list = [50, 100, 80, 40, 120, 25] # Basic data
plan_model = LpProblem("plan_model")
alpha = [LpVariable("alpha_{0}".format(i+1), cat="Binary") for i in range(num)]
upBound_num = [int(detail_list_money) for detail_list_money in detail_list]
num_channel = [
LpVariable("fin_money_{0}".format(i+1), lowBound=0, upBound=upBound_num[i], cat="Integer") for i
in range(num)]
plan_model += lpSum(num_channel) == plan_amount
plan_model += lpSum(alpha) <= number_max
for i in range(num):
plan_model += num_channel[i] >= alpha[i] * 5
plan_model += num_channel[i] <= alpha[i] * detail_list[i]
plan_model.writeLP("2222.lp")
test_dd = open("2222.txt", "w", encoding="utf-8")
i = 0
while True:
plan_model.solve()
if LpStatus[plan_model.status] == "Optimal":
test_dd.write(str(i + 1) + "times result\n")
for v in plan_model.variables():
test_dd.write(v.name + "=" + str(v.varValue))
test_dd.write("\n")
test_dd.write("============================\n\n")
alpha_0_num = 0
alpha_1_num = 0
for alpha_value in alpha:
if value(alpha_value) == 0:
alpha_0_num += 1
if value(alpha_value) == 1:
alpha_1_num += 1
plan_model += (lpSum(
alpha[k] for k in range(num) if value(alpha[k]) == 1)) <= alpha_1_num - 1
plan_model.writeLP("2222.lp")
i += 1
else:
break
test_dd.close()
I don't know how to change my constraints to achieve this goal. Can you help me
I am trying to implement Earley's algorithm for parsing a grammar, however I must have done something wrong because after the first entry in the chart it doesn't go through the rest of the input string. My test grammar is the following:
S -> aXbX | bXaX
X -> aXbX | bXaX | epsilon
S and X are non-terminals; a and b are terminals.
The string I want to check if it is accepted or not by the grammar is: 'abba'.
Here is my code:
rules = {
"S": [
['aXbX'],
['bXaX'],
],
"X" : [
['aXbX'],
['bXaX'],
['']
]
}
def predictor(rule, state):
if rule["right"][rule["dot"]].isupper(): # NON-TERMINAL
return [{
"left": rule["right"][rule["dot"]],
"right": right,
"dot": 0,
"op": "PREDICTOR",
"completor": []
} for right in rules[rule["right"][rule["dot"]]]]
else:
return []
def scanner(rule, next_input):
# TERMINAL
if rule["right"][rule["dot"]].islower() and next_input in rules[rule["right"][rule["dot"]]]:
print('scanner')
return [{
"left": rule["right"][rule["dot"]],
"right": [next_input],
"dot": 1,
"op": "SCANNER",
"completor": []
}]
else:
return []
def completor(rule, charts):
if rule["dot"] == len(rule["right"]):
print('completor')
return list(map(
lambda filter_rule: {
"left": filter_rule["left"],
"right": filter_rule["right"],
"dot": filter_rule["dot"] + 1,
"op": "COMPLETOR",
"completor": [rule] + filter_rule["completor"]
},
filter(
lambda p_rule: p_rule["dot"] < len(p_rule["right"]) and rule["left"] == p_rule["right"][p_rule["dot"]],
charts[rule["state"]]
)
))
else:
return []
input_string = 'abba'
input_arr = [char for char in input_string] + ['']
charts = [[{
"left": "S'",
"right": ["S"],
"dot": 0,
"op": "-",
"completor": []
}]]
for curr_state in range(len(input_arr)):
curr_chart = charts[curr_state]
next_chart = []
for curr_rule in curr_chart:
if curr_rule["dot"] < len(curr_rule["right"]): # not finished
curr_chart += [i for i in predictor(curr_rule, curr_state) if i not in curr_chart]
next_chart += [i for i in scanner(curr_rule, input_arr[curr_state]) if i not in next_chart]
else:
print('else')
curr_chart += [i for i in completor(curr_rule, charts) if i not in curr_chart]
charts.append(next_chart)
def print_charts(charts, inp):
for chart_no, chart in zip(range(len(charts)), charts):
print("\t{}".format("S" + str(chart_no)))
print("\t\n".join(map(
lambda x: "\t{} --> {}, {} {}".format(
x["left"],
"".join(x["right"][:x["dot"]] + ["."] + x["right"][x["dot"]:]),
str(chart_no) + ',',
x["op"]
),
chart
)))
print()
print_charts(charts[:-1], input_arr)
And this is the output I get (for states 1 to 4 I should get 5 to 9 entries):
S0
S' --> .S, 0, -
S --> .aXbX, 0, PREDICTOR
S --> .bXaX, 0, PREDICTOR
S1
S2
S3
S4
In my naive implementation of edit-distance finder, I have to check whether the last characters of two strings match:
ulong editDistance(const string a, const string b) {
if (a.length == 0)
return b.length;
if (b.length == 0)
return a.length;
const auto delt = a[$ - 1] == b[$ - 1] ? 0 : 1;
import std.algorithm : min;
return min(
editDistance(a[0 .. $ - 1], b[0 .. $ - 1]) + delt,
editDistance(a, b[0 .. $ - 1]) + 1,
editDistance(a[0 .. $ - 1], b) + 1
);
}
This yields the expected results but if I replace delt with its definition it always returns 1 on non-empty strings:
ulong editDistance(const string a, const string b) {
if (a.length == 0)
return b.length;
if (b.length == 0)
return a.length;
//const auto delt = a[$ - 1] == b[$ - 1] ? 0 : 1;
import std.algorithm : min;
return min(
editDistance(a[0 .. $ - 1], b[0 .. $ - 1]) + a[$ - 1] == b[$ - 1] ? 0 : 1, //delt,
editDistance(a, b[0 .. $ - 1]) + 1,
editDistance(a[0 .. $ - 1], b) + 1
);
}
Why does this result change?
The operators have different precedence from what you expect. In const auto delt = a[$ - 1] == b[$ - 1] ? 0 : 1; there is no ambiguity, but in editDistance(a[0 .. $ - 1], b[0 .. $ - 1]) + a[$ - 1] == b[$ - 1] ? 0 : 1, there is (seemingly).
Simplifying:
auto tmp = editDistance2(a[0..$-1], b[0..$-1]);
return min(tmp + a[$-1] == b[$-1] ? 0 : 1),
//...
);
The interesting part here is parsed as (tmp + a[$-1]) == b[$-1] ? 0 : 1, and tmp + a[$-1] is not equal to b[$-1]. The solution is to wrap things in parentheses:
editDistance(a[0 .. $ - 1], b[0 .. $ - 1]) + (a[$ - 1] == b[$ - 1] ? 0 : 1)
This question already has answers here:
Generate all possible n-character passwords
(4 answers)
Closed 1 year ago.
I have a list of integers, a = [0, ..., n]. I want to generate all possible combinations of k elements from a; i.e., the cartesian product of the a with itself k times. Note that n and k are both changeable at runtime, so this needs to be at least a somewhat adjustable function.
So if n was 3, and k was 2:
a = [0, 1, 2, 3]
k = 2
desired = [(0,0), (0, 1), (0, 2), ..., (2,3), (3,0), ..., (3,3)]
In python I would use the itertools.product() function:
for p in itertools.product(a, repeat=2):
print p
What's an idiomatic way to do this in Go?
Initial guess is a closure that returns a slice of integers, but it doesn't feel very clean.
For example,
package main
import "fmt"
func nextProduct(a []int, r int) func() []int {
p := make([]int, r)
x := make([]int, len(p))
return func() []int {
p := p[:len(x)]
for i, xi := range x {
p[i] = a[xi]
}
for i := len(x) - 1; i >= 0; i-- {
x[i]++
if x[i] < len(a) {
break
}
x[i] = 0
if i <= 0 {
x = x[0:0]
break
}
}
return p
}
}
func main() {
a := []int{0, 1, 2, 3}
k := 2
np := nextProduct(a, k)
for {
product := np()
if len(product) == 0 {
break
}
fmt.Println(product)
}
}
Output:
[0 0]
[0 1]
[0 2]
[0 3]
[1 0]
[1 1]
[1 2]
[1 3]
[2 0]
[2 1]
[2 2]
[2 3]
[3 0]
[3 1]
[3 2]
[3 3]
The code to find the next product in lexicographic order is simple: starting from the right, find the first value that won't roll over when you increment it, increment that and zero the values to the right.
package main
import "fmt"
func main() {
n, k := 5, 2
ix := make([]int, k)
for {
fmt.Println(ix)
j := k - 1
for ; j >= 0 && ix[j] == n-1; j-- {
ix[j] = 0
}
if j < 0 {
return
}
ix[j]++
}
}
I've changed "n" to mean the set is [0, 1, ..., n-1] rather than [0, 1, ..., n] as given in the question, since the latter is confusing since it has n+1 elements.
Just follow the answer Implement Ruby style Cartesian product in Go, play it on http://play.golang.org/p/NR1_3Fsq8F
package main
import "fmt"
// NextIndex sets ix to the lexicographically next value,
// such that for each i>0, 0 <= ix[i] < lens.
func NextIndex(ix []int, lens int) {
for j := len(ix) - 1; j >= 0; j-- {
ix[j]++
if j == 0 || ix[j] < lens {
return
}
ix[j] = 0
}
}
func main() {
a := []int{0, 1, 2, 3}
k := 2
lens := len(a)
r := make([]int, k)
for ix := make([]int, k); ix[0] < lens; NextIndex(ix, lens) {
for i, j := range ix {
r[i] = a[j]
}
fmt.Println(r)
}
}