I am new to Python and Adafruit IO.
I have created a feed in Adafruit IO and can periodically send a value (e.g. temperature) to it, and I can see a chart of historical values.
aio.send_data('my-feed', 123) # this works
But I am unable to send a complex value (e.g. temperature and humidity).
data = dict()
data['temperature'] = 20
data['humidity'] = 50
aio.send_data('my-feed', data) # this fails :-(
Is it possible to send a complex value to a single Adafruit IO feed?
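One workaround that is sometimes used is to pack the readings into a single JSON string and send that string as the feed value. The sketch below only shows the packing step and the call to send_data. It assumes the client accepts a string value; whether Adafruit IO will chart such a value is not something this sketch establishes. The helper names encode_reading and send_reading are made up for illustration. The other obvious option is one feed per measurement.

```python
import json


def encode_reading(reading):
    """Serialise a dict of measurements into one compact JSON string."""
    return json.dumps(reading, sort_keys=True, separators=(",", ":"))


def send_reading(aio, feed_key, reading):
    """Send all measurements to a single feed as one JSON string.

    `aio` is any object with a send_data(feed, value) method, such as
    the client used in the question (assumption: it accepts strings).
    """
    payload = encode_reading(reading)
    aio.send_data(feed_key, payload)
    return payload
```

A consumer of the feed would then call json.loads() on the stored value to get the individual readings back.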


What does executorRunTime consist of in Spark?

Currently working on Spark, I collected some performance metrics through the custom Spark listener API for analysis purposes. I tried to make a stacked bar plot that shows the fraction of time the executor spends executing the task, shuffling, or in garbage collection pauses for three different machine learning algorithms.
Here is a screenshot of what I found:
What caught my attention as soon as the plot appeared is that the rates do not add up. The total goes above 1 for the kmeans algorithm and stays below 0.8 for the perceptron.
Here is how I computed the rates:
execution['cpuRate'] = execution['executorCpuTime'] / execution['executorRunTime']
execution['serRate'] = execution['resultSerializationTime'] / execution['executorRunTime']
execution['gcRate'] = execution['jvmGCTime'] / execution['executorRunTime']
execution['shuffleFetchRate'] = execution['shuffleFetchWaitTime'] / execution['executorRunTime']
execution['shuffleWriteRate'] = execution['shuffleWriteTime'] / execution['executorRunTime']
execution = execution[['cpuRate', 'serRate', 'gcRate', 'shuffleFetchRate', 'shuffleWriteRate']]
I use the pandas library, and execution is the DataFrame containing the averaged metrics.
My assumption was that executorRunTime is the sum of the other metrics, but that turns out to be false.
What is the meaning of those times, and how are they related? In other words, what does executorRunTime consist of, if not all the other metrics listed above?
According to TaskMetrics.scala:
* Time the executor spends actually running the task (including fetching shuffle data).
def executorRunTime: Long = _executorRunTime.sum
Measured in milliseconds.
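One thing that may be worth ruling out is a unit mismatch. As far as I can tell from the Spark source, executorCpuTime and the shuffle write time are reported in nanoseconds, while executorRunTime, jvmGCTime, resultSerializationTime and the shuffle fetch wait time are in milliseconds; please check this against your Spark version. The sketch below assumes the DataFrame holds the raw listener values under the column names used in the question. The helper name time_shares and the extra total column are illustrative only.

```python
import pandas as pd

NANOS_PER_MILLI = 1_000_000


def time_shares(execution):
    """Express each metric as a share of executorRunTime (milliseconds).

    Assumption: executorCpuTime and shuffleWriteTime are raw values in
    nanoseconds and are converted to milliseconds before dividing.
    """
    run_ms = execution["executorRunTime"]
    shares = pd.DataFrame({
        "cpuRate": execution["executorCpuTime"] / NANOS_PER_MILLI / run_ms,
        "serRate": execution["resultSerializationTime"] / run_ms,
        "gcRate": execution["jvmGCTime"] / run_ms,
        "shuffleFetchRate": execution["shuffleFetchWaitTime"] / run_ms,
        "shuffleWriteRate": execution["shuffleWriteTime"] / NANOS_PER_MILLI / run_ms,
    })
    # Row-wise sum, to see how far the shares are from 1.
    shares["total"] = shares.sum(axis=1)
    return shares
```

Even with consistent units the total need not equal 1: the metrics are measured independently, so some can overlap (GC pauses can occur while the task is running) and the run time also includes work that none of the listed metrics cover.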

Java Tensorflow Serving Prediction Client Too Slow

I have created a tensorflow object detection model and served it using tensorflow serving. I have created a python client to test the serving model and that takes around 40ms time to receive all the prediction.
t1 =
result = stub.Predict(request, 60.0) # 60 secs timeout
t2 =
print((t2 - t1).total_seconds() * 1000)  # elapsed time in ms; .microseconds alone drops whole seconds
Now, my problem is that when I do the same in Java, it takes about 10 times longer: 450 to 500 ms.
ManagedChannel channel = ManagedChannelBuilder.forAddress("localhost", 9000)
PredictionServiceGrpc.PredictionServiceBlockingStub stub = PredictionServiceGrpc.newBlockingStub(channel);
Instant pre =;
Predict.PredictResponse response = stub.predict(request);
Instant curr =;
System.out.println("time " + ChronoUnit.MILLIS.between(pre,curr));
The actual issue was that I was sending all the image pixels over the network (which was a bad idea). Changing the input to an encoded image made it faster.
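To get a feel for why raw pixels are expensive, here is a rough size estimate. The image dimensions and value widths are hypothetical, not taken from the setup above, and the real request size also depends on how the values are packed into the request message. The helper name raw_pixel_bytes is illustrative.

```python
def raw_pixel_bytes(height, width, channels=3, bytes_per_value=4):
    """Bytes needed to carry every pixel value uncompressed."""
    return height * width * channels * bytes_per_value


# Hypothetical 640x480 RGB image.
size_float32 = raw_pixel_bytes(480, 640)                   # 4-byte values
size_uint8 = raw_pixel_bytes(480, 640, bytes_per_value=1)  # 1-byte values
```

An encoded (e.g. JPEG or PNG) version of the same image is usually far smaller than either figure, which is consistent with the speed-up described above.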

feed data into a Dataset like a queue

About usage of the Dataset API (from TensorFlow 1.2, see here and here):
The way a Dataset obtains its data doesn't really fit how I usually receive data. In my case, I have a thread in which I receive data. I don't know in advance when the data will end, but I can see when it does. I then wait until all the buffers have been processed, and at that point one epoch is finished. How can I get this logic with the Dataset?
Note that I prefer the Dataset interface over the QueueBase interface because it gives me the iterator interface which I can reinitialize and even reset to a different Dataset. This is more powerful compared to queues which cannot be reopened currently after they are closed (see here and here).
Maybe a similar question, or the same question: How can I wrap a Dataset around a queue? I have a thread which reads data from somewhere and which can feed it into a queue somehow. How do I get the data into the Dataset? I could repeat some dummy tensor infinitely and then use map to just return my queue.dequeue(), but that only gets me back to all the original problems with the queue, i.e. how to reopen the queue.
The new Dataset.from_generator() method allows you to define a Dataset that is fed by a Python generator. (To use this feature at present, you must download a nightly build of TensorFlow or build it yourself from source. It will be part of TensorFlow 1.4.)
The easiest way to implement your example would be to replace your receiving thread with a generator, with pseudocode as follows:
def receiver():
    while True:
        next_element = ...  # Receive next element from external source.
        # Note that this method may block.
        end_of_epoch = ...  # Decide whether or not to stop based on next_element.
        if not end_of_epoch:
            yield next_element  # Note: you may need to convert this to an array.
        else:
            return  # Returning will signal OutOfRangeError on downstream iterators.
dataset = Dataset.from_generator(receiver, output_types=...)
# You can chain other `Dataset` methods after the generator. For example:
dataset = dataset.prefetch(...) # This will start a background thread
# to prefetch elements from `receiver()`.
dataset = dataset.repeat(...) # Note that each repetition will call
# `receiver()` again, and start from
# a fresh state.
dataset = dataset.batch(...)
More complicated topologies are possible. For example, you can use Dataset.interleave() to create many receivers in parallel.
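If you would rather keep the receiving thread, one possible variation is to have the thread push into a standard queue.Queue and let the generator drain it until it sees an end-of-epoch marker. The sketch below is plain Python with no TensorFlow dependency. The names make_receiver, producer and _END_OF_EPOCH are made up for illustration, and the list [1, 2, 3] stands in for data arriving from the external source.

```python
import queue
import threading

_END_OF_EPOCH = object()  # sentinel pushed by the thread when the data ends


def make_receiver(buffer):
    """Return a generator function that drains `buffer` for one epoch."""
    def receiver():
        while True:
            item = buffer.get()  # blocks until the thread delivers something
            if item is _END_OF_EPOCH:
                return  # ends the generator, i.e. the epoch
            yield item
    return receiver


def producer(buffer, items):
    """Stand-in for the receiving thread: push items, then the sentinel."""
    for item in items:
        buffer.put(item)
    buffer.put(_END_OF_EPOCH)


buffer = queue.Queue(maxsize=8)
thread = threading.Thread(target=producer, args=(buffer, [1, 2, 3]))
thread.start()
epoch = list(make_receiver(buffer)())
thread.join()
```

The function returned by make_receiver(buffer) takes no arguments, so it has the same shape as receiver above and could be handed to from_generator in its place. Because a queue.Queue is never closed, a new epoch only needs the thread to start pushing again.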

Scipy, Numpy: Audio classifier, Voice/Speech Activity Detection

I am writing a program to automatically classify recorded phone call audio files (wav files) according to whether they contain at least some human voice or not (only DTMF, dial tones, ringtones, noise).
My first approach was implementing a simple VAD (voice activity detector) using ZCR (zero crossing rate) and energy, but both of these parameters confuse DTMF and dial tones with voice. This technique failed, so I implemented a trivial method that calculates the variance of the FFT between 200 Hz and 300 Hz. My numpy code is as follows
wavefft = np.abs(fft(frame))
n = len(frame)
fx = np.arange(0,fs,float(fs)/float(n))
stx = np.where(fx>=200)
stx = stx[0][0]
endx = np.where(fx>=300)
endx = endx[0][0]
return np.sqrt(np.var(wavefft[stx:endx]))/1000
This resulted in 60% accuracy.
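For reference, the band-limited measure above can be written as a self-contained function. This is a sketch rather than the exact original: it uses numpy's rfft and rfftfreq to build the frequency axis instead of np.arange and np.where, and the function name band_spectrum_std and its low_hz/high_hz parameters are made up here. The division by 1000 is kept from the snippet above.

```python
import numpy as np


def band_spectrum_std(frame, fs, low_hz=200.0, high_hz=300.0):
    """Standard deviation of the FFT magnitude within [low_hz, high_hz)."""
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    band = spectrum[(freqs >= low_hz) & (freqs < high_hz)]
    # sqrt(var) is the standard deviation; /1000 is the original scaling.
    return float(np.std(band)) / 1000
```

A pure tone inside the band produces one large bin and therefore a large value, while a tone outside the band leaves the band nearly flat, which is the behaviour the measure relies on.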
Next, I tried a machine learning based approach using an SVM (Support Vector Machine) and MFCCs (Mel-frequency cepstral coefficients). The results were totally incorrect: almost all samples were marked incorrectly. How should one train an SVM with MFCC feature vectors? My rough code using scikit-learn is as follows
[samplerate, sample] = ('profiles/noise.wav')
noiseProfile = MFCC(samplerate, sample)
[samplerate, sample] = ('profiles/ring.wav')
ringProfile = MFCC(samplerate, sample)
[samplerate, sample] = ('profiles/voice.wav')
voiceProfile = MFCC(samplerate, sample)
machineData = []
for noise in noiseProfile:
    machineData.append(noise)
for voice in voiceProfile:
    machineData.append(voice)
dataLabel = []
for i in range(0, len(noiseProfile)):
    dataLabel.append(0)
for i in range(0, len(voiceProfile)):
    dataLabel.append(1)
clf = svm.SVC()
clf.fit(machineData, dataLabel)
I want to know what alternative approach I could implement?
If you don't have to use scipy/numpy, you might check out webrtcvad, which is a Python wrapper around Google's excellent WebRTC Voice Activity Detection code. WebRTC uses Gaussian Mixture Models (GMMs), works well, and is very fast.
Here's an example of how you might use it:
import webrtcvad

# audio must be 16 bit mono PCM, at 8 kHz, 16 kHz, 32 kHz or 48 kHz.
def audio_contains_voice(audio, sample_rate, aggressiveness=0, threshold=0.5):
    # Frames must be 10, 20 or 30 ms.
    frame_duration_ms = 30
    # Assuming split_audio is a function that will split audio into
    # frames of the correct size.
    frames = split_audio(audio, sample_rate, frame_duration_ms)
    # aggressiveness tells the VAD how aggressively to filter out non-speech.
    # 0 will have the most false-positives for speech, 3 the least.
    vad = webrtcvad.Vad(aggressiveness)
    num_voiced = len([f for f in frames if vad.is_speech(f, sample_rate)])
    return float(num_voiced) / len(frames) > threshold
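The split_audio helper is only assumed in the example above. A minimal sketch of what it might look like is below, assuming audio is a bytes object of 16 bit mono PCM (2 bytes per sample) and dropping any incomplete frame at the end, since the VAD only accepts frames of an exact length.

```python
def split_audio(audio, sample_rate, frame_duration_ms):
    """Split raw 16 bit mono PCM bytes into equal-length frames.

    Assumption: 2 bytes per sample; a trailing partial frame is dropped.
    """
    bytes_per_frame = int(sample_rate * frame_duration_ms / 1000) * 2
    last_start = len(audio) - bytes_per_frame
    return [audio[i:i + bytes_per_frame]
            for i in range(0, last_start + 1, bytes_per_frame)]
```

Note that very short recordings can yield an empty list, in which case the ratio in audio_contains_voice would divide by zero, so the caller may want to check for that first.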

beats per minute algorithm using only one data variable of type byte stored in an array of byte

I'm getting real-time data as bytes and plotting the data using timer variable on a graph using
I want to calculate beats per minute from the graph. I've read about many peak detection algorithms, but they all assume two variables.
The waveform looks like the one seen on an ECG (cardiac waveform).
I've tried the algorithm mentioned by Jean-Paul in the following post but it doesn't work for me (gives me a 0 bpm at all times)
Peak signal detection in realtime timeseries data
Note: I'm storing the byte data in an array which can hold at most 2000 data points (bytes).
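One way to work with a single variable is to treat the array index as the time axis: if the samples arrive at a fixed, known rate, sample i was taken at i / rate seconds. The sketch below is a simple threshold-crossing detector under that assumption, not the algorithm from the linked post. The function name estimate_bpm, the 60% threshold and the 0.25 s refractory period are illustrative choices, and the sample rate has to come from your timer.

```python
def estimate_bpm(samples, sample_rate_hz, threshold=None, refractory_s=0.25):
    """Estimate beats per minute from amplitude samples taken at a fixed rate."""
    if len(samples) == 0:
        return 0.0
    if threshold is None:
        # Assumed default: 60% of the way from the minimum to the maximum.
        lo, hi = min(samples), max(samples)
        threshold = lo + 0.6 * (hi - lo)
    min_gap = int(refractory_s * sample_rate_hz)  # ignore re-triggers this close
    beats = []      # sample indices where a beat starts
    above = False   # whether the signal is currently above the threshold
    for i, value in enumerate(samples):
        if value >= threshold and not above:
            if not beats or i - beats[-1] >= min_gap:
                beats.append(i)
            above = True
        elif value < threshold:
            above = False
    if len(beats) < 2:
        return 0.0  # not enough beats to measure an interval
    mean_interval = (beats[-1] - beats[0]) / (len(beats) - 1)
    return 60.0 * sample_rate_hz / mean_interval
```

With a 2000-point buffer the estimate only covers 2000 / sample_rate_hz seconds of signal, so the buffer needs to span at least two beats for the result to be non-zero.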