What does [:] mean in groovy? - variables

While reading some groovy code of another developer I encountered the following definition:
def foo=[:]
What does it mean?

[:] is shorthand notation for creating a Map.
You can also add keys and values to it:
def foo = [bar: 'baz']

[:] creates an empty Map. The colon is there to distinguish it from [], which creates an empty List.
This Groovy code:
def foo = [:]
is roughly equivalent to this Java code:
Object foo = new java.util.LinkedHashMap();

Quoting the doc:
Notice that [:] is the empty map expression.
... which is the only Map with size() returning 0. By itself, it's rarely useful, but you can add values into this Map, of course:
def emptyMap = [:]
assert emptyMap.size() == 0
emptyMap.foo = 5
assert emptyMap.size() == 1
assert emptyMap.foo == 5

Related

Swapping characters in Strings Python

I have been trying to make something like an encoder:
here is my idea
dict = {
1: "!",
2: "#"
}
num = 21 # Input number ("in" is a reserved word in Python, so it cannot be a variable name)
out = ?
print(out) # Returns "#!"
Is there any way I could perform this?
What you want is exactly the translate function of str:
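x = "12"
y = "!#"
num = 12  # "in" is a reserved word in Python, so use another name
txt = str(num)
mapping = str.maketrans(x, y)  # maps "1" -> "!" and "2" -> "#"
out = txt.translate(mapping)  # "!#"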
The complete reference is in the Python documentation for str.maketrans and str.translate.
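If the mapping already lives in a dict, as in the question, a minimal sketch along the same lines could build the translation table directly from it. The names encode and SYMBOLS are made up for illustration, and digits without an entry in the dict are assumed to pass through unchanged:

```python
# Illustrative names only: SYMBOLS and encode() do not come from the original posts.
SYMBOLS = {1: "!", 2: "#"}


def encode(number, symbols=SYMBOLS):
    """Replace each digit of `number` with its symbol; unmapped digits stay as they are."""
    # str.maketrans accepts a dict whose keys are single-character strings,
    # so the integer keys are converted to their string form first.
    table = str.maketrans({str(digit): symbol for digit, symbol in symbols.items()})
    return str(number).translate(table)


print(encode(21))   # "#!"
print(encode(123))  # "!#3" (3 has no mapping, so it is kept)
```

Keeping the mapping in a dict means new digit/symbol pairs can be added in one place without keeping two parallel strings in sync.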

How can I get the value of a variable named after another one in groovy?

I have a variable that contains the name of another variable whose value I want to retrieve, e.g.:
def variable = "finalVariableValue"
def variableName = "variable"
How can I get variable.value as I only know variableName?
I've seen that a Binding could be used, but I have a lot of variables that I would need to put into this Binding object to make it work. Is this the only way?
NB: this behaviour is really similar to the ant property extension mechanism.
Thanks,
Michele.
By prefixing it with def you are not registering it in an object you can inspect, like a map; one could argue it is registered in the AST, but that is a rocky road.
My 0.03: work with a map, with a binding, or with dynamic properties. Drop the def part and choose one of the solutions:
Map
Simply declare the variable as a key in a map:
def map = [:]
map.variable = "finalVariableValue"
def variableName = "variable"
assert map[variableName] == "finalVariableValue"
Binding (with script)
Use the script built-in binding. Note this only works with scripts:
variable = "finalVariableValue"
variableName = "variable"
assert binding[variableName] == "finalVariableValue"
Dynamic properties
Use some dynamic properties mechanism, like an Expando (also, you could use getProperty with setProperty and others):
class Container extends Expando {
def declare() {
variable = "finalVariableValue"
variableName = "variable"
}
}
c = new Container()
c.declare()
assert c[c.variableName] == "finalVariableValue"
You can use the script's scope, simply dropping the Type definition:
variable = 'value'
name = 'variable'
assert 'variable' == this.name
assert 'value' == this[this.name]
or using the @Field annotation:
import groovy.transform.Field
@Field def variable = 'value'
@Field def name = 'variable'
assert 'variable' == this.name
assert 'value' == this[this.name]

generating DataFrames in for loop in Scala Spark cause out of memory

I'm generating small DataFrames in a for loop. At each iteration, I pass the generated DataFrame to a function which returns a double. This simple process (which I thought could be easily taken care of by the garbage collector) blows up my memory. When I look at the Spark UI, each iteration adds a new "SQL{1-500}" entry (my loop runs 500 times). My question is how to drop this SQL object before generating a new one?
my code is something like this:
Seq.fill(500){
val data = (1 to 1000).map(_=>Random.nextInt(1000))
val dataframe = createDataFrame(data)
myFunction(dataframe)
dataframe.unpersist()
}
def myFunction(df: DataFrame)={
df.count()
}
I tried to solve this problem with dataframe.unpersist() and sqlContext.clearCache(), but neither of them worked.
You have two places where I suspect something fishy is happening:
in the definition of myFunction : you really need to put the = before the body of the definition. I had typos like that compile, but produce really weird errors (note I changed your myFunction for debugging purposes)
it is better to fill your Seq with something you know and then apply foreach or some such
(You also need to replace random.nexInt with Random.nextInt, and also, you can only create a DataFrame from a Seq of a type that is a subtype of Product, such as tuple, and need to use sqlContext to use createDataFrame)
This code works with no memory issues:
Seq.fill(500)(0).foreach{ i =>
val data = {1 to 1000}.map(_.toDouble).toList.zipWithIndex
val dataframe = sqlContext.createDataFrame(data)
myFunction(dataframe)
}
def myFunction(df: DataFrame) = {
println(df.count())
}
Edit: parallelizing the computation (across 10 cores) and returning the RDD of counts:
sc.parallelize(Seq.fill(500)(0), 10).map{ i =>
val data = {1 to 1000}.map(_.toDouble).toList.zipWithIndex
val dataframe = sqlContext.createDataFrame(data)
myFunction(dataframe)
}
def myFunction(df: DataFrame) = {
df.count()
}
Edit 2: the difference between declaring myFunction with = and without = is that the first is a usual function definition, while the second is a procedure definition, which is only used for methods that return Unit, so the result of df.count() is discarded. Here is this point illustrated in spark-shell:
scala> def myf(df:DataFrame) = df.count()
myf: (df: org.apache.spark.sql.DataFrame)Long
scala> def myf2(df:DataFrame) { df.count() }
myf2: (df: org.apache.spark.sql.DataFrame)Unit

why both transform and map methods in scala?

I'm having trouble understanding the difference between / reason for, for example, immutable.Map.transform and immutable.Map.map. It looks like transform won't change the key, but that just seems like a trivial variation of the map method. Am I missing something?
I was expecting to find a method that applied a function to the (key,value) of the map when/if that element was accessed (rather than having to iterate through the map eagerly with the map function). Does such a method exist?
You can do exactly that with mapValues. Here is the explanation from the docs:
def mapValues[C](f: (B) ⇒ C): Map[A, C]
Transforms this map by applying a function to every retrieved value.
f - the function used to transform values of this map.
returns - a map view which maps every key of this map to f(this(key)). The resulting map wraps the original map without copying any elements.
edit:
Although extending classes of the collection API is not often a good idea, it could work like this:
class LazilyModifiedMap[A,B,C](underlying: Map[A,B])(f: (A,B) => C) extends Map[A,C] {
def get(key: A) = underlying.get(key).map( x => f(key, x))
def iterator = underlying.iterator.map { case (k,v) => (k, f(k,v)) }
def -(key: A) = iterator.toMap - key
def +[C1 >: C](kv: (A,C1)) = iterator.toMap + kv
}
If you only need the interface of PartialFunction, you can exploit the fact that Map inherits from PartialFunction:
val m = Map(1 -> "foo", 2 -> "bar")
val n = m.andThen(_.reverse)
n(1) // --> oof

Serializing groovy map to string with quotes

I'm trying to persist a groovy map to a file. My current attempt is to write the string representation out and then read it back in and call evaluate on it to recreate the map when I'm ready to use it again.
The problem I'm having is that the toString() method of the map removes vital quotes from the values of the elements. When my code calls evaluate, it complains about an unknown identifier.
This code demonstrates the problem:
m = [a: 123, b: 'test']
print "orig: $m\n"
s = m.toString()
print " str: $s\n"
m2 = evaluate(s)
print " new: ${m2}\n"
The first two print statements almost work -- but the quotes around the value for the key b are gone. Instead of showing [a: 123, b: 'test'], it shows [a: 123, b: test].
At this point the damage is done. The evaluate call chokes when it tries to evaluate test as an identifier and not a string.
So, my specific questions:
Is there a better way to serialize/de-serialize maps in Groovy?
Is there a way to produce a string representation of a map with proper quotes?
Groovy provides the inspect() method, which returns an object as a parseable string:
// serialize
def m = [a: 123, b: 'test']
def str = m.inspect()
// deserialize
m = Eval.me(str)
Another way to serialize a groovy map as a readable string is with JSON:
// serialize
import groovy.json.JsonBuilder
def m = [a: 123, b: 'test']
def builder = new JsonBuilder()
builder(m)
println builder.toString()
// deserialize
import groovy.json.JsonSlurper
def slurper = new JsonSlurper()
m = slurper.parseText('{"a": 123, "b": "test"}')
You can use myMap.toMapString()