How to quickly serialize Clojure data structures to a string with indentation?

clojure.pprint seems to do what I want:
user=> (def a {"q" {:q 1 :w 2 :e 3 :r 4 :t 4}, "w" {:q 1 :w 2 :e 3 :r 4 :t 4} "e" {:q 1 :w 2 :e 3 :r 4 :t 4}, "r" {:q 1 :w 2 :e 3 :r 4 :t 4}})
user=> (pprint a)
{"e" {:r 4, :e 3, :w 2, :t 4, :q 1},
"q" {:r 4, :e 3, :w 2, :t 4, :q 1},
"r" {:r 4, :e 3, :w 2, :t 4, :q 1},
"w" {:r 4, :e 3, :w 2, :t 4, :q 1}}
nil
but
user=> (time (with-out-str (pprint a)))
"Elapsed time: 174.621945 msecs"
...
user=> (time (do (with-out-str (pprint (repeat 1000 {:r 4 :t 6 :q 3 :u 5 :d 3}))) nil))
"Elapsed time: 32902.028436 msecs"
it is too slow.
Are there any printers in Clojure that do indentation (maybe not as accurately as pprint) but are also fast and suitable for big structures?

You may want to look into Brandon Bloom's Fipp (GitHub repo summary: "Fast Idiomatic Pretty Printer for Clojure"). Note that the README states that it's edn-only for now.


How can I group elements of a list in Raku?

Is there some method in Raku which, when you pass it a "getter", groups together items from the original list for which the getter is returning the same value?
I am looking for something like groupBy in Scala:
# (1 until 10).groupBy(_ % 3)
res0: Map[Int, IndexedSeq[Int]] = HashMap(0 -> Vector(3, 6, 9), 1 -> Vector(1, 4, 7), 2 -> Vector(2, 5, 8))
Or groupBy from Lodash (JavaScript):
> groupBy(range(1, 10), x => x % 3)
{"0": [3,6,9], "1": [1,4,7], "2": [2,5,8]}
It's called classify in Raku:
$ raku -e 'say (1..10).classify(* % 3)'
{0 => [3 6 9], 1 => [1 4 7 10], 2 => [2 5 8]}

pandas: strings in a list, how to change their type

My data:
df = pd.DataFrame({"id":['1,2,3,4','1,2,3,6'], "sum": [6,7]})
My code:
df['id']=df['id'].str.split(',')
df['nf']=df.apply(lambda x: set(range(1,x['sum']+1))-set(x['id']) , axis=1)
print(df)
I want this output:
id sum nf
0 [1, 2, 3, 4] 6 {5, 6}
1 [1, 2, 3, 6] 7 {4, 5, 7}
but it outputs:
id sum nf
0 [1, 2, 3, 4] 6 {1, 2, 3, 4, 5, 6}
1 [1, 2, 3, 6] 7 {1, 2, 3, 4, 5, 6, 7}
I think the numbers in the list are actually str,
but I don't know how to easily convert them with pandas.
Use map to convert the values to integers:
df['nf']=df.apply(lambda x: set(range(1,x['sum']+1))-set(map(int, x['id'])) , axis=1)
print(df)
id sum nf
0 [1, 2, 3, 4] 6 {5, 6}
1 [1, 2, 3, 6] 7 {4, 5, 7}
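
The reason the original subtraction removed nothing can be seen without pandas at all. This is only a minimal sketch using one row's values from the question; the variable names are made up for illustration:

```python
# str.split(',') yields strings, and '1' is never equal to the int 1,
# so subtracting a set of strings from a set of ints removes nothing.
ids = "1,2,3,4".split(",")            # ['1', '2', '3', '4']
full = set(range(1, 6 + 1))           # {1, 2, 3, 4, 5, 6}

unchanged = full - set(ids)           # still {1, 2, 3, 4, 5, 6}
missing = full - set(map(int, ids))   # {5, 6} once the ids are ints

print(unchanged)
print(missing)
```

Converting with map(int, ...) inside the lambda leaves the id column itself as lists of strings; only the set used for the subtraction is converted.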

series.where on a series containing lists

I have this series called hours_by_analysis_date, where the index holds datetimes and each value is a list of ints. For example:
Index |
01-01-2000 | [1, 2, 3, 4, 5]
01-02-2000 | [2, 3, 4, 5, 6]
01-03-2000 | [1, 2, 3, 4, 5]
I want to return all the indices where the value is [1, 2, 3, 4, 5], so it should return 01-01-2000 and 01-03-2000.
I tried hours_by_analysis_date.where(fh_by_analysis_date==[1, 2, 3, 4, 5]), but it gives me the error:
{ValueError} lengths must match to compare
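pandas treats the list on the right-hand side as an array-like to compare element-wise against the whole series, rather than testing each element for equality with the list, so the lengths do not match.
You can use apply to test each element individually, which gives a boolean mask: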
hours_by_analysis_date.apply(lambda elem: elem == [1,2,3,4,5])
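
The apply call returns a boolean series rather than the indices themselves, so one more indexing step is needed to get the dates. A small sketch, assuming the sample data from the question (the dates are read here as month-day-year, which is a guess; target, mask and matching_dates are illustrative names):

```python
import pandas as pd

# Sample series rebuilt from the question: datetime index, list values.
hours_by_analysis_date = pd.Series(
    [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [1, 2, 3, 4, 5]],
    index=pd.to_datetime(["2000-01-01", "2000-01-02", "2000-01-03"]),
)

target = [1, 2, 3, 4, 5]

# Compare each element (a list) with the target list, one row at a time.
mask = hours_by_analysis_date.apply(lambda elem: elem == target)

# Keep the rows where the mask is True and read off their index.
matching_dates = hours_by_analysis_date[mask].index

print(list(matching_dates.strftime("%m-%d-%Y")))  # ['01-01-2000', '01-03-2000']
```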

TensorFlow: tf.gather_nd with an unknown dimension

Is there a simple way to implement the code below,
especially one that handles the unknown dimension? I want to add this code to a loss function. Thanks.
result = []
for i in range(0, x.shape[0]):
    tmp2 = tf.gather_nd(x[i], y[i])
    result.append(tmp2)
finalResult = tf.stack(result)
example
x shape=(?,3,2)
y shape= (?,1)
x :
[[[ 0 1]
[ 2 3]
[ 4 5]]
[[ 6 7]
[ 8 9]
[10 11]]
[[12 13]
[14 15]
[16 17]]...]
y :
[[1]
[0]
[2]...]
finalResult :
[[ 2 3]
[ 6 7]
[16 17]...]
jdehesa's reply was helpful, thanks so much.
The indices of the first dimension have to be added to the query.
(By the way, I made a mistake in the loss function: it has to be differentiable,
but that's another issue.) Anyway, thanks again.
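
Adding the indices of the first dimension to the query could look roughly like this. It is only a sketch, assuming TensorFlow 2.x with eager execution; gather_per_example is a hypothetical helper name, not something from the referenced reply:

```python
import tensorflow as tf


def gather_per_example(x, y):
    """For each batch element i, pick x[i, y[i, 0]] without a Python loop."""
    y = tf.cast(y, tf.int32)
    batch_size = tf.shape(x)[0]            # dynamic size, usable when the static one is unknown
    batch_idx = tf.range(batch_size)       # [0, 1, ..., batch_size - 1]
    # Pair every row index with its batch index: shape (batch, 2).
    indices = tf.stack([batch_idx, y[:, 0]], axis=1)
    return tf.gather_nd(x, indices)


# Same example values as in the question.
x = tf.reshape(tf.range(18), (3, 3, 2))
y = tf.constant([[1], [0], [2]])
final_result = gather_per_example(x, y)
print(final_result.numpy())
```

In TensorFlow versions where tf.gather_nd accepts the batch_dims argument, tf.gather_nd(x, y, batch_dims=1) should give the same result without building the batch indices by hand.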

Mathematica: dynamic number of menus

I am trying to make a dynamic number of drop-down menus in a plot, to plot a variable number of curves.
I have previously requested help to plot this data, and it worked well.
First thing
Needs["PlotLegends`"]
Here is an example of the data (not actual numbers, as they are waaay too long).
data={{year, H, He, Li, C, O, Si, S},
{0, .5, .1, .01, 0.01, 0.01, 0.001, 0.001},
{100, .45, .1, .01, 0.01, 0.01, 0.001, 0.001},
{200, .40, .1, .01, 0.01, 0.01, 0.001, 0.001},
{300, .35, .1, .01, 0.01, 0.01, 0.001, 0.001}}
The compounds variable is the number of compounds+1
compounds=8
For now, my code is this one
Manipulate[
ListLogLogPlot[
{data[[All, {1, i}]],
data[[All, {1, j}]],
data[[All, {1, k}]]},
PlotLegend -> {data[[1, i]],
data[[1, j]],
data[[1, k]]}
],
{{i, 2, "Compound 1"},Thread[Range[2, compounds] -> Drop[data[[1]], 1]]},
{{j, 3, "Compound 2"},Thread[Range[2, compounds] -> Drop[data[[1]], 1]]},
{{k, 4, "Compound 3"},Thread[Range[2, compounds] -> Drop[data[[1]], 1]]},
ContinuousAction -> False
]
As you can see, I can easily add a compound by duplicating each of the 3 lines (data, legend and menu descriptor), but it's lame and inefficient. Plotting a set takes about 20 seconds, so it's about 1 minute here (and I use a pretty efficient cluster).
Is there a solution to add a little menu or field where I can add the number of compounds to plot, so the right number of menus will display? I don't need more than 7 plots, but efficiency...
The numbers 2, 4, 16 are the default values to plot. I can make a list with the default values (2, 14, 16, and some others I may pick), or they could all be set to 2.
Thanks
You could do something like this
Manipulate[
ListLogLogPlot[data[[All, {1, #}]] & /@ i],
{{n, 3, "# compounds"}, Range[7],
Dynamic[If[Length[i] != n, i = PadRight[{2, 4, 16}, n, 2]];
PopupMenu[#, Range[7]]] &},
{{i, {2, 4, 16}}, ControlType -> None},
Dynamic[Column[
Labeled[PopupMenu[Dynamic[i[[#]]],
Thread[Range[2, compounds] -> Drop[data[[1]], 1]]],
Row[{"Compound ", #}], Left] & /@ Range[n]]
]
]
Without PlotLegend, this runs quite fast for a random data set of about 1000x1000 elements. If I include the PlotLegend option in ListLogLogPlot, it slows down quite a lot so that might be the reason why your code was so slow.
I thought I'd add a DynamicModule (DM) version. If you're like me you may find that easier than using Manipulate. It is essentially a DM version of Heike's answer.
DynamicModule[{data,compounds,n=1,c={2},labels},
data=yourData;
compounds=Length[data[[1]]];
labels=Rule@@@Transpose[{Range[7],data[[1,2;;]]}];
Column[{
Dynamic[
Grid[
Join[
{{"no. of compounds",PopupMenu[Dynamic[n],Range[7]]}},
Table[
With[{i=i},
c=PadRight[c,n,2];
{"compound"<>ToString[i], PopupMenu[Dynamic[c[[i]]],labels]}
],
{i,n}
]
],
Alignment->{{Right,Left},Center}
],
TrackedSymbols:>{n}
],
Dynamic@ListLogLogPlot[data[[All,{1,#}]]&/@c]
}]
]
I've used Grid because it allows you to easily keep all the controllers and their labels aligned. PadRight[c,n,2] allows you to keep current settings if you change the value of n. I'd avoid plot legends and always make your own.
How about something like:
Manipulate[
Manipulate[ ListLogLogPlot[Table[Subscript[x, n], {n, 1, numCompounds}]],
Evaluate@Apply[Sequence,Table[{{Subscript[x, n], n + 1, "Compound " <> ToString@n},
Thread[Range[2, compounds] -> Drop[data[[1]], 1]]}, {n, 1,
numCompounds}]], ContinuousAction -> False],
{{numCompounds, 3}, 1, compounds - 1, 1}]