How to .each_char.map? - iteration

In Crystal, if I try this:
numbers = [1, 2, 3, 4, 5]
a = numbers.map { 0 }
p a
The output will be nice like this: [0, 0, 0, 0, 0]
However, if I have a string and try to manipulate each char of that string individually with each_char, it gets messier:
word = "NUMBER"
b = word.each_char.map { 'x' }
p b
The output will be like this:
Iterator::Map(String::CharIterator, Char, Char)(#iterator=#<String::CharIterator:0x7f0040951f50 #reader=Char::Reader(#string="NUMBER", #current_char='N', #current_char_width=1, #pos=0, #error=nil, #end=false), #end=false>, #func=#<Proc(Char, Char):0x453190>)
In contrast, Ruby with the same code outputs:
["x", "x", "x", "x", "x", "x"]
Is there a way to do this to get the same or similar output as Ruby gives in Crystal?

You can collect the iterator's elements into an array using Iterator#to_a, which it inherits from Enumerable:
p "NUMBER".each_char.map { 'x' }.to_a # => ['x', 'x', 'x', 'x', 'x', 'x']
Alternatively you can start out with an array by using String#chars and then calling Array#map on it:
p "NUMBER".chars.map { 'x' } # => ['x', 'x', 'x', 'x', 'x', 'x']
This pattern of each_foo returning an Iterator and foos returning an Array can be found throughout most of the standard library.

Related

Pandas Dataframe GroupBy Agg - LAMBDA - single values go to preexisting or new lists and preexisting lists fusion

I have this DataFrame to groupby key:
import pandas as pd
df = pd.DataFrame({
    'key': ['1', '1', '1', '2', '2', '3', '3', '4', '4', '5'],
    'data1': [['A', 'B', 'C'], 'D', 'P', 'E', ['F', 'G', 'H'], ['I', 'J'], ['K', 'L'], 'M', 'N', 'O'],
    'data2': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
})
df
I want to make the groupby key and sum data2, it's ok for this part.
But concerning data1, I want to :
If a list doesn't exist yet:
Single values don't change when key was not duplicated
Single values assigned to a key are combined into a new list
If a list already exist:
Other single values are append to it
Other lists values are append to it
The resulting DataFrame should then be :
dfgood = pd.DataFrame({
    'key': ['1', '2', '3', '4', '5'],
    'data1': [['A', 'B', 'C', 'D', 'P'], ['F', 'G', 'H', 'E'], ['I', 'J', 'K', 'L'], ['M', 'N'], 'O'],
    'data2': [6, 9, 13, 17, 10]
})
dfgood
In fact, I don't really care about the order of data1 values into the lists, it could also be any structure that keep them together, even a string with separators or a set, if it's easier to make it go the way you think best to do this.
I thought about two solutions :
Going that way :
dfgood = df.groupby('key', as_index=False).agg({
    'data1' : lambda x: x.iloc[0].append(x.iloc[1]) if type(x.iloc[0])==list else list(x),
    'data2' : sum,
})
dfgood
It doesn't work because of index out of range in x.iloc[1].
I also tried, because data1 was organized like this in another groupby from the question on this link:
dfgood = df.groupby('key', as_index=False).agg({
    'data1' : lambda g: g.iloc[0] if len(g) == 1 else list(g),
    'data2' : sum,
})
dfgood
But it's creating new lists from preexisting lists or values and not appending data to already existing lists.
Another way to do it, but I think it's more complicated and there should be a better or faster solution :
Turning data1 lists and single values into individual series with apply,
use wide_to_long to keep single values for each key,
Then groupby applying :
dfgood = df.groupby('key', as_index=False).agg({
    'data1' : lambda g: g.iloc[0] if len(g) == 1 else list(g),
    'data2' : sum,
})
dfgood
I think my problem is that I don't know how to use lambdas correctly and I try stupid things like x.iloc[1] in the previous example. I've looked at a lot of tutorial about lambdas, but it's still fuzzy in my mind.
The problem is combining lists with scalars. One possible solution is to first wrap the scalars in lists and then flatten the lists in groupby.agg:
dfgood = (df.assign(data1=df['data1'].apply(lambda y: y if isinstance(y, list) else [y]))
            .groupby('key', as_index=False).agg({
                'data1': lambda x: [z for y in x for z in y],
                'data2': sum,
            })
          )
print(dfgood)
  key            data1  data2
0   1  [A, B, C, D, P]      6
1   2     [E, F, G, H]      9
2   3     [I, J, K, L]     13
3   4           [M, N]     17
4   5              [O]     10
Another idea is to use a flatten function that flattens only lists, not strings:
# https://stackoverflow.com/a/5286571/2901002
def flatten(foo):
    for x in foo:
        # recurse into any iterable except strings, which are treated as leaves
        if hasattr(x, '__iter__') and not isinstance(x, str):
            for y in flatten(x):
                yield y
        else:
            yield x
dfgood = (df.groupby('key', as_index=False).agg({
    'data1' : lambda x: list(flatten(x)),
    'data2' : sum}))
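As a rough illustration of the flatten idea without pandas, the sketch below runs the same kind of generator over a plain Python list shaped like one group of data1. The group variable is a hypothetical sample, and yield from is used here in place of the inner loop; treat it as a sketch rather than the answer's exact code.

```python
def flatten(items):
    """Yield the leaves of nested iterables, treating strings as leaves."""
    for x in items:
        if hasattr(x, "__iter__") and not isinstance(x, str):
            yield from flatten(x)  # recurse into lists, tuples, etc.
        else:
            yield x


# Hypothetical sample: one list and two scalars, like the rows of key '1'.
group = [["A", "B", "C"], "D", "P"]
flat = list(flatten(group))
print(flat)  # ['A', 'B', 'C', 'D', 'P']
```

Because strings are excluded from the recursion, a multi-character value such as "AB" stays intact instead of being split into characters.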
You could explode to get individual rows, then aggregate again with groupby+agg after taking care of masking the duplicated values in data2 (to avoid summing duplicates):
(df.explode('data1')
.assign(data2=lambda d: d['data2'].mask(d.duplicated(['key', 'data2']), 0))
.groupby('key')
.agg({'data1': list, 'data2': 'sum'})
)
output:
               data1  data2
key
1    [A, B, C, D, P]      6
2       [E, F, G, H]      9
3       [I, J, K, L]     13
4             [M, N]     17
5                [O]     10

Dictionary Unique Keys Rename and Replace

I have a dictionary format structure like this
df = pd.DataFrame({'ID' : ['A', 'B', 'C'],
'CODES' : [{"1407273790":5,"1801032636":20,"1174813554":1,"1215470448":2,"1053754655":4,"1891751228":1},
{"1497066526":19,"1801032636":16,"1215470448":11,"1891751228":18},
{"1215470448":8,"1407273790":4},]})
Now I want to create a unique list of keys and create names for them like this -
np_code np_rename
1407273790 np_1
1801032636 np_2
1174813554 np_3
1215470448 np_4
1053754655 np_5
1891751228 np_6
1497066526 np_7
And finally replace the new names in main dataframe df -
df = pd.DataFrame({'ID' : ['A', 'B', 'C'],
'CODES' : [{"np_1":5,"np_2":20,"np_3":1,"np_4":2,"np_5":4,"np_6":1},
{"np_7":19,"np_2":16,"np_4":11,"np_6":18},
{"np_4":8,"np_1":4},]})
You can use apply here. First build a mapping d from the unique keys to their new names, then rename the keys of each dictionary:
u = df['CODES'].map(lambda x: [*x.keys()]).explode().unique()
d = dict(zip(u,'np_'+pd.Index((pd.factorize(u)[0]+1).astype(str))))
f = lambda x: {d.get(k,k): v for k,v in x.items()}
df['CODES'] = df['CODES'].apply(f)
print(df)
ID CODES
0 A {'np_1': 5, 'np_2': 20, 'np_3': 1, 'np_4': 2, ...
1 B {'np_7': 19, 'np_2': 16, 'np_4': 11, 'np_6': 18}
2 C {'np_4': 8, 'np_1': 4}
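The same renaming can be sketched in plain Python, assuming the np_ number follows the order in which each key is first seen. Here rows is a shortened, hypothetical stand-in for the CODES column, and rename and renamed are illustrative names.

```python
# Hypothetical, shortened stand-in for the CODES column.
rows = [
    {"1407273790": 5, "1801032636": 20},
    {"1497066526": 19, "1801032636": 16},
]

# dict.fromkeys removes repeats while keeping first-seen order.
unique_keys = list(dict.fromkeys(k for row in rows for k in row))

# Number the unique keys starting from 1.
rename = {k: f"np_{i}" for i, k in enumerate(unique_keys, start=1)}

# Rebuild every dictionary with the new key names.
renamed = [{rename[k]: v for k, v in row.items()} for row in rows]
print(rename)
print(renamed)
```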

How to find the index in a list of numbers where there are repeating numbers [duplicate]

Does anyone know how I can get the index position of duplicate items in a python list?
I have tried doing this and it keeps giving me only the index of the first occurrence of the item in the list.
List = ['A', 'B', 'A', 'C', 'E']
I want it to give me:
index 0: A
index 2: A
You want to pass in the optional second parameter to index, the location where you want index to start looking. After you find each match, reset this parameter to the location just after the match that was found.
def list_duplicates_of(seq, item):
    start_at = -1
    locs = []
    while True:
        try:
            loc = seq.index(item, start_at + 1)
        except ValueError:
            break
        else:
            locs.append(loc)
            start_at = loc
    return locs
source = "ABABDBAAEDSBQEWBAFLSAFB"
print(list_duplicates_of(source, 'B'))
Prints:
[1, 3, 5, 11, 15, 22]
You can find all the duplicates at once in a single pass through source, by using a defaultdict to keep a list of all seen locations for any item, and returning those items that were seen more than once.
from collections import defaultdict
def list_duplicates(seq):
    tally = defaultdict(list)
    for i, item in enumerate(seq):
        tally[item].append(i)
    return ((key, locs) for key, locs in tally.items()
            if len(locs) > 1)
for dup in sorted(list_duplicates(source)):
    print(dup)
Prints:
('A', [0, 2, 6, 7, 16, 20])
('B', [1, 3, 5, 11, 15, 22])
('D', [4, 9])
('E', [8, 13])
('F', [17, 21])
('S', [10, 19])
If you want to do repeated testing for various keys against the same source, you can use functools.partial to create a new function variable, using a "partially complete" argument list, that is, specifying the seq, but omitting the item to search for:
from functools import partial
dups_in_source = partial(list_duplicates_of, source)
for c in "ABDEFS":
    print(c, dups_in_source(c))
Prints:
A [0, 2, 6, 7, 16, 20]
B [1, 3, 5, 11, 15, 22]
D [4, 9]
E [8, 13]
F [17, 21]
S [10, 19]
>>> def indices(lst, item):
...     return [i for i, x in enumerate(lst) if x == item]
...
>>> indices(List, "A")
[0, 2]
To get all duplicates, you can use the below method, but it is not very efficient. If efficiency is important you should consider Ignacio's solution instead.
>>> dict((x, indices(List, x)) for x in set(List) if List.count(x) > 1)
{'A': [0, 2]}
As for solving it using the index method of list instead, that method takes a second optional argument indicating where to start, so you could just repeatedly call it with the previous index plus 1.
>>> List.index("A")
0
>>> List.index("A", 1)
2
I made a benchmark of all solutions suggested here and also added another solution to this problem (described in the end of the answer).
Benchmarks
First, the benchmarks. I initialize a list of n random ints within the range [1, n/2] and then call timeit on all the algorithms.
The solutions of @Paul McGuire and @Ignacio Vazquez-Abrams run about twice as fast as the rest on a list of 100 ints:
Testing algorithm on the list of 100 items using 10000 loops
Algorithm: dupl_eat
Timing: 1.46247477189
####################
Algorithm: dupl_utdemir
Timing: 2.93324529055
####################
Algorithm: dupl_lthaulow
Timing: 3.89198786645
####################
Algorithm: dupl_pmcguire
Timing: 0.583058259784
####################
Algorithm: dupl_ivazques_abrams
Timing: 0.645062989076
####################
Algorithm: dupl_rbespal
Timing: 1.06523873786
####################
If you change the number of items to 1000, the difference becomes much bigger (BTW, I'll be happy if someone could explain why) :
Testing algorithm on the list of 1000 items using 1000 loops
Algorithm: dupl_eat
Timing: 5.46171654555
####################
Algorithm: dupl_utdemir
Timing: 25.5582547323
####################
Algorithm: dupl_lthaulow
Timing: 39.284285326
####################
Algorithm: dupl_pmcguire
Timing: 0.56558489513
####################
Algorithm: dupl_ivazques_abrams
Timing: 0.615980005148
####################
Algorithm: dupl_rbespal
Timing: 1.21610942322
####################
On the bigger lists, the solution of @Paul McGuire continues to be the most efficient, and my algorithm begins to have problems.
Testing algorithm on the list of 1000000 items using 1 loops
Algorithm: dupl_pmcguire
Timing: 1.5019953958
####################
Algorithm: dupl_ivazques_abrams
Timing: 1.70856155898
####################
Algorithm: dupl_rbespal
Timing: 3.95820421595
####################
The full code of the benchmark is here
Another algorithm
Here is my solution to the same problem:
def dupl_rbespal(c):
    alreadyAdded = False
    dupl_c = dict()
    # sort the incoming list but keep the indexes of the sorted items
    sorted_ind_c = sorted(range(len(c)), key=lambda x: c[x])
    for i in range(len(c) - 1):  # loop over indexes of sorted items (xrange in Python 2)
        # if two consecutive indexes point to the same value, add it to the duplicates
        if c[sorted_ind_c[i]] == c[sorted_ind_c[i+1]]:
            if not alreadyAdded:
                dupl_c[c[sorted_ind_c[i]]] = [sorted_ind_c[i], sorted_ind_c[i+1]]
                alreadyAdded = True
            else:
                dupl_c[c[sorted_ind_c[i]]].append(sorted_ind_c[i+1])
        else:
            alreadyAdded = False
    return dupl_c
Although it's not the best, it allowed me to generate a slightly different structure that I needed for my problem (something like a linked list of indexes of the same value).
import collections
dups = collections.defaultdict(list)
for i, e in enumerate(L):  # L is the input list
    dups[e].append(i)
for k, v in sorted(dups.items()):
    if len(v) >= 2:
        print('%s: %r' % (k, v))
And extrapolate from there.
I think I found a simple solution after a lot of irritation :
if elem in string_list:
    counter = 0
    elem_pos = []
    for i in string_list:
        if i == elem:
            elem_pos.append(counter)
        counter = counter + 1
    print(elem_pos)
This prints a list giving you the indexes of a specific element ("elem")
Using new "Counter" class in collections module, based on lazyr's answer:
>>> import collections
>>> def duplicates(n): #n="123123123"
...     counter=collections.Counter(n) #{'1': 3, '3': 3, '2': 3}
...     dups=[i for i in counter if counter[i]!=1] #['1','3','2']
...     result={}
...     for item in dups:
...         result[item]=[i for i,j in enumerate(n) if j==item]
...     return result
...
>>> duplicates("123123123")
{'1': [0, 3, 6], '3': [2, 5, 8], '2': [1, 4, 7]}
from collections import Counter, defaultdict
def duplicates(lst):
    cnt = Counter(lst)
    return [key for key in cnt.keys() if cnt[key] > 1]
def duplicates_indices(lst):
    dup, ind = duplicates(lst), defaultdict(list)
    for i, v in enumerate(lst):
        if v in dup: ind[v].append(i)
    return ind
lst = ['a', 'b', 'a', 'c', 'b', 'a', 'e']
print(duplicates(lst)) # ['a', 'b']
print(duplicates_indices(lst)) # ..., {'a': [0, 2, 5], 'b': [1, 4]})
A slightly more orthogonal (and thus more useful) implementation would be:
from collections import Counter, defaultdict
def duplicates(lst):
    cnt = Counter(lst)
    return [key for key in cnt.keys() if cnt[key] > 1]
def indices(lst, items=None):
    items, ind = set(lst) if items is None else items, defaultdict(list)
    for i, v in enumerate(lst):
        if v in items: ind[v].append(i)
    return ind
lst = ['a', 'b', 'a', 'c', 'b', 'a', 'e']
print(indices(lst, duplicates(lst))) # ..., {'a': [0, 2, 5], 'b': [1, 4]})
Wow, everyone's answer is so long. I simply used a pandas dataframe, masking, and the duplicated function (keep=False marks all duplicates as True, not just the first or last):
import pandas as pd
import numpy as np
np.random.seed(42) # make results reproducible
int_df = pd.DataFrame({'int_list': np.random.randint(1, 20, size=10)})
dupes = int_df['int_list'].duplicated(keep=False)
print(int_df['int_list'][dupes].index)
This should return Int64Index([0, 2, 3, 4, 6, 7, 9], dtype='int64').
def index(arr, num):
    for i, x in enumerate(arr):
        if x == num:
            print(x, i)
#index(List, 'A')
In a single line with pandas 1.2.2 and numpy:
import numpy as np
import pandas as pd
idx = np.where(pd.DataFrame(List).duplicated(keep=False))
The argument keep=False will mark every duplicate as True and np.where() will return an array with the indices where the element in the array was True.
string_list = ['A', 'B', 'C', 'B', 'D', 'B']
pos_list = []
for i in range(len(string_list)):
    if string_list[i] == 'B':
        pos_list.append(i)
print(pos_list)
def find_duplicate(list_):
    duplicate_list = [""]
    for k in range(len(list_)):
        if duplicate_list.__contains__(list_[k]):
            continue
        for j in range(len(list_)):
            if k == j:
                continue
            if list_[k] == list_[j]:
                duplicate_list.append(list_[j])
                print("duplicate "+str(list_.index(list_[j]))+str(list_.index(list_[k])))
Here is one that works for multiple duplicates and you don't need to specify any values:
List = ['A', 'B', 'A', 'C', 'E', 'B'] # duplicates: two 'A's and two 'B's
ix_list = []
for i in range(len(List)):
    try:
        dup_ix = List[(i+1):].index(List[i]) + (i + 1) # search from i + 1 onwards, then shift back by (i + 1)
        ix_list.extend([i, dup_ix]) # no error means a duplicate was found, so add i as well
    except ValueError: # raised by index() when there is no later duplicate
        pass
ix_list.sort()
print(ix_list)
[0, 1, 2, 5]
def dup_list(my_list, value):
    '''
    dup_list(list, value)
    This function finds the indices of values in a list, including duplicated values.
    list: the list you are working on
    value: the item of the list you want to find the index of
    NB: if a value is duplicated, its indices are stored in a list.
    If there is only one occurrence of the value, the index is stored as an integer.
    Therefore use isinstance to know how to handle the returned value.
    '''
    value_list = []
    index_list = []
    index_of_duped = []
    if my_list.count(value) == 1:
        return my_list.index(value)
    elif my_list.count(value) < 1:
        return 'Your argument is not in the list'
    else:
        for item in my_list:
            value_list.append(item)
            length = len(value_list)
            index = length - 1
            index_list.append(index)
            if item == value:
                index_of_duped.append(max(index_list))
        return index_of_duped
# function call eg dup_list(my_list, 'john')
If you want to get index of all duplicate elements of different types you can try this solution:
# note: below list has more than one kind of duplicates
List = ['A', 'B', 'A', 'C', 'E', 'E', 'A', 'B', 'A', 'A', 'C']
d1 = {item:List.count(item) for item in List} # item and their counts
elems = list(filter(lambda x: d1[x] > 1, d1)) # get duplicate elements
d2 = dict(zip(range(0, len(List)), List)) # each item and their indices
# item and their list of duplicate indices
res = {item: list(filter(lambda x: d2[x] == item, d2)) for item in elems}
Now, if you print(res) you'll get to see this:
{'A': [0, 2, 6, 8, 9], 'B': [1, 7], 'C': [3, 10], 'E': [4, 5]}
def duplicates(list,dup):
    a=[list.index(dup)]
    for i in list:
        try:
            a.append(list.index(dup,a[-1]+1))
        except:
            for i in a:
                print(f'index {i}: '+dup)
            break
duplicates(['A', 'B', 'A', 'C', 'E'],'A')
Output:
index 0: A
index 2: A
This is a good question and there are a lot of ways to do it.
The code below is one of them:
letters = ["a", "b", "c", "d", "e", "a", "a", "b"]
lettersIndexes = [i for i in range(len(letters))] # a list that contains the indexes of the previous list
counter = 0
for item in letters:
    if item == "a":
        print(item, lettersIndexes[counter])
    counter += 1 # the counter is increased for each item, so it always equals the current index
Another way to get the indexes, this time stored in a list:
letters = ["a", "b", "c", "d", "e", "a", "a", "b"]
lettersIndexes = [i for i in range(len(letters)) if letters[i] == "a" ]
print(lettersIndexes) # as you can see we get a list of the indexes that we want.
Good day
Using a dictionary approach based on setdefault instance method.
List = ['A', 'B', 'A', 'C', 'B', 'E', 'B']
# keep track of all indices of every term
duplicates = {}
for i, key in enumerate(List):
    duplicates.setdefault(key, []).append(i)
# print only those terms with more than one index
template = 'index {}: {}'
for k, v in duplicates.items():
    if len(v) > 1:
        print(template.format(k, str(v).strip('][')))
Remark: Counter, defaultdict and OrderedDict from collections are subclasses of dict and hence share the setdefault method as well.
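A quick way to check the remark, as a minimal sketch: the snippet below verifies the subclass relationship for three classes from collections and then uses setdefault on a defaultdict. The sample list and the positions name are illustrative only.

```python
from collections import Counter, OrderedDict, defaultdict

# Each of these classes derives from dict, so setdefault is inherited.
for cls in (Counter, defaultdict, OrderedDict):
    assert issubclass(cls, dict)

# setdefault behaves the same way on a defaultdict as on a plain dict.
positions = defaultdict(list)
for i, key in enumerate(["A", "B", "A"]):
    positions.setdefault(key, []).append(i)

print(dict(positions))  # {'A': [0, 2], 'B': [1]}
```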
I'll mention the more obvious way of dealing with duplicates in lists. In terms of complexity, dictionaries are the way to go because each lookup is O(1). You can be more clever if you're only interested in duplicates...
my_list = [1,1,2,3,4,5,5]
my_dict = {}
for (ind, elem) in enumerate(my_list):
    if elem in my_dict:
        my_dict[elem].append(ind)
    else:
        my_dict.update({elem: [ind]})
for key, value in my_dict.items():
    if len(value) > 1:
        print("key(%s) has indices (%s)" % (key, value))
which prints the following:
key(1) has indices ([0, 1])
key(5) has indices ([5, 6])
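a = [2,3,4,5,6,2,3,2,4,2]
search = 2
pos = 0
positions = []
while (search in a):
    pos += a.index(search)       # offset of the next match in the remaining slice
    positions.append(pos)
    a = a[a.index(search)+1:]    # drop everything up to and including the match
    pos += 1
print("search found at:", positions)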
I just make it simple:
i = [1,2,1,3]
k = 0
for ii in i:
    if ii == 1:
        print("index of 1 = ", k)
    k = k+1
output:
index of 1 = 0
index of 1 = 2

What is the equivalent of Python list, set, and map comprehensions in Kotlin?

In Python, there are list comprehensions and similar constructs for maps and sets. In Kotlin there is nothing at all in any of the documentation with a similar name.
What are the equivalents of these comprehensions? For example, those found in Python 3 Patterns, Recipes and Idioms. Which includes comprehensions for:
list
set
dictionary
Note: this question is intentionally written and answered by the author (Self-Answered Questions), so that the idiomatic answers to commonly asked Kotlin topics are present in SO.
Taking examples from Python 3 Patterns, Recipes and Idioms we can convert each one to Kotlin using a simple pattern. The Python version of a list comprehension has 3 parts:
output expression
input list/sequence and variable
optional predicate
These directly correlate to Kotlin functional extensions to collection classes. The input sequence, followed by the optional predicate in a filter lambda, followed by the output expression in a map lambda. So for this Python example:
# === PYTHON
a_list = [1, 2, 3, 4, 5, 6]
# output | var | input | filter/predicate
even_ints_squared = [ e*e for e in a_list if e % 2 == 0 ]
print(even_ints_squared)
# output: [ 4, 16, 36 ]
Becomes
// === KOTLIN
var aList = listOf(1, 2, 3, 4, 5, 6)
// input | filter | output
val evenIntsSquared = aList.filter { it % 2 == 0 }.map { it * it }
println(evenIntsSquared)
// output: [ 4, 16, 36 ]
Notice that the variable is not needed in the Kotlin version since the implied it variable is used within each lambda. In Python you can turn these into a lazy generator by using the () instead of square brackets:
# === PYTHON
even_ints_squared = ( e**2 for e in a_list if e % 2 == 0 )
And in Kotlin it is more obviously converted to a lazy sequence by changing the input via a function call asSequence():
// === KOTLIN
val evenIntsSquared = aList.asSequence().filter { it % 2 == 0 }.map { it * it }
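To make the laziness of the Python generator expression concrete, here is a small sketch. The seen list and the square helper are illustrative names added for this demonstration; they record when each element is actually processed.

```python
a_list = [1, 2, 3, 4, 5, 6]
seen = []  # records which elements have been processed so far


def square(e):
    seen.append(e)
    return e * e


# Building the generator does no work yet.
lazy = (square(e) for e in a_list if e % 2 == 0)
seen_before = list(seen)   # []

first = next(lazy)         # processes only the first even element
seen_after_first = list(seen)

rest = list(lazy)          # consumes the remaining elements
print(seen_before, first, seen_after_first, rest)
# [] 4 [2] [16, 36]
```

A list comprehension with square brackets would instead call square for every even element immediately.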
Nested comprehensions in Kotlin are created by just nesting one within the other's map lambda. For example, take this sample from PythonCourse.eu in Python changed slightly to use both a set and a list comprehension:
# === PYTHON
noprimes = {j for i in range(2, 8) for j in range(i*2, 100, i)}
primes = [x for x in range(2, 100) if x not in noprimes]
print(primes)
# output: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
Becomes:
// === KOTLIN
val nonprimes = (2..7).flatMap { (it*2..99).step(it).toList() }.toSet()
val primes = (2..99).filterNot { it in nonprimes }
print(primes)
// output: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
Notice that the nested comprehension produces a list of lists which is converted to a flat list using flatMap() and then converted to a set using toSet(). Also, Kotlin ranges are inclusive, whereas a Python range is exclusive so you will see the numbers are slightly different in the ranges.
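The range difference can be checked quickly on the Python side. This is only a sketch; the variable names outer and multiples are illustrative.

```python
# Python's range excludes its stop value, so range(2, 8) covers 2..7,
# the same numbers as Kotlin's inclusive 2..7.
outer = list(range(2, 8))
print(outer)  # [2, 3, 4, 5, 6, 7]

# range(i * 2, 100, i) stops before 100, matching (it*2..99).step(it) in Kotlin.
i = 7
multiples = list(range(i * 2, 100, i))
print(multiples[0], multiples[-1])  # 14 98
```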
You can also use a sequence generator with co-routines in Kotlin to yield the values without needing the call to flatMap() or flatten():
// === KOTLIN
val nonprimes = sequence {
(2..7).forEach { (it*2..99).step(it).forEach { value -> yield(value) } }
}.toSet()
val primes = (2..99).filterNot { it in nonprimes }
Another example from the referenced Python page is generating a matrix:
# === PYTHON
matrix = [ [ 1 if item_idx == row_idx else 0 for item_idx in range(0, 3) ] for row_idx in range(0, 3) ]
print(matrix)
# [[1, 0, 0],
# [0, 1, 0],
# [0, 0, 1]]
And in Kotlin:
// === KOTLIN
val matrix = (0..2).map { row -> (0..2).map { col -> if (col == row) 1 else 0 }}
println(matrix)
// [[1, 0, 0],
// [0, 1, 0],
// [0, 0, 1]]
Or in Kotlin instead of lists, you could also generate arrays:
// === KOTLIN
val matrix2 = Array(3) { row ->
IntArray(3) { col -> if (col == row) 1 else 0 }
}
Another of the examples for set comprehensions is to generate a unique set of properly cased names:
# === PYTHON
names = [ 'Bob', 'JOHN', 'alice', 'bob', 'ALICE', 'J', 'Bob' ]
fixedNames = { name[0].upper() + name[1:].lower() for name in names if len(name) > 1 }
print(fixedNames)
# output: {'Bob', 'Alice', 'John'}
Is translated to Kotlin:
// === KOTLIN
val names = listOf( "Bob", "JOHN", "alice", "bob", "ALICE", "J", "Bob" )
val fixedNames = names.filter { it.length > 1 }
.map { it.take(1).toUpperCase() + it.drop(1).toLowerCase() }
.toSet()
println(fixedNames)
// output: [Bob, John, Alice]
And the example for map comprehension is a bit odd, but can also be implemented in Kotlin. The original:
# === PYTHON
mcase = {'a':10, 'b': 34, 'A': 7, 'Z':3}
mcase_frequency = { k.lower() : mcase.get(k.lower(), 0) + mcase.get(k.upper(), 0) for k in mcase.keys() }
print(mcase_frequency)
# output: {'a': 17, 'z': 3, 'b': 34}
And the converted, which is written to be a bit more "wordy" here to make it clearer what is happening:
// === KOTLIN
val mcase = mapOf("a" to 10, "b" to 34, "A" to 7, "Z" to 3)
val mcaseFrequency = mcase.map { (key, _) ->
val newKey = key.toLowerCase()
val newValue = mcase.getOrDefault(key.toLowerCase(), 0) +
mcase.getOrDefault(key.toUpperCase(), 0)
newKey to newValue
}.toMap()
print(mcaseFrequency)
// output: {a=17, b=34, z=3}
Further reading:
Kotlin adds more power than list/set/map comprehensions because of its extensive functional transforms that you can make to these collection types. See What Java 8 Stream.collect equivalents are available in the standard Kotlin library? for more examples.
See Get Factors of Numbers in Kotlin, which shows another example of a Python comprehension versus Kotlin.
See Kotlin Extensions Functions for Collections in the API reference guide.
Just as an exercise, the closest you can get to the Python syntax is:
infix fun <I, O> ((I) -> O).`in`(range: Iterable<I>): List<O> = range.map(this).toList()
infix fun <I> Iterable<I>.`if`(cond: (I) -> Boolean): List<I> = this.filter(cond)
fun main() {
{ it: Int -> it + 1 } `in` 1..2 `if` {it > 0}
}
val newls = (1..100).filter({it % 7 == 0})
in Kotlin is equivalent to the following Python code:
newls = [i for i in range(1, 101) if i % 7 == 0]
Map comprehension
import kotlin.math.sqrt
val numbers = "1,2,3,4".split(",")
val roots = numbers.associate { n -> n.toInt() to sqrt(n.toFloat()) }
println(roots) // prints {1=1.0, 2=1.4142135, 3=1.7320508, 4=2.0}
If keys are untransformed elements of source list, even simpler:
val roots = numbers.associateWith { n -> sqrt(n.toFloat()) }
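For comparison, Python dict comprehensions that mirror the two Kotlin snippets might look like the sketch below. The name roots_by_str is illustrative, and Python's float is double precision, so the digits differ from Kotlin's Float output.

```python
import math

numbers = "1,2,3,4".split(",")

# Counterpart of associate: transform both the key and the value.
roots = {int(n): math.sqrt(float(n)) for n in numbers}

# Counterpart of associateWith: keep each element as the key.
roots_by_str = {n: math.sqrt(float(n)) for n in numbers}

print(roots[4], roots_by_str["4"])  # 2.0 2.0
```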

tensorflow string_split on batch data

The official TensorFlow documentation says:
For example: N = 2, source[0] is 'hello world' and source[1] is 'a b c', then the output will be st.indices = [0, 0; 0, 1; 1, 0; 1, 1; 1, 2] st.shape = [2, 3] st.values = ['hello', 'world', 'a', 'b', 'c']
What if I want something like [['hello', 'world'], ['a','b','c']], how can I get this?
Thanks.
Use tf.map_fn to map your batch onto the function tf.string_split.
https://www.tensorflow.org/api_docs/python/tf/map_fn
The map function will split your batch along the first dimension (your batch size, N as referenced by the documentation in your question), then it will pass each of the samples to tf.string_split individually, each of which will return ['hello', 'world'] and ['a', 'b', 'c'] respectively. Then the map function will recombine the individual results into an array which will result in [['hello', 'world'], ['a', 'b', 'c']] as desired.