Spark Streaming: Inputs are received but not processed

I am running a simple Spark Streaming application that consists of sending messages through a socket server to the Spark Streaming context and printing them.
This is my code, which I am running in IntelliJ IDEA:
SparkConf sparkConfiguration = new SparkConf().setAppName("DataAnalysis").setMaster("spark://IP:7077");
JavaStreamingContext sparkStrContext = new JavaStreamingContext(sparkConfiguration, Durations.seconds(1));
JavaReceiverInputDStream<String> receiveData = sparkStrContext.socketTextStream("localhost", 5554);
I am running this application in a standalone cluster mode, with one worker (an Ubuntu VM) and a master (my Windows host).
This is the problem: when I run the application, I see that it successfully connects to the master, but it doesn't print any lines; it just stays that way permanently.
If I go to the Spark UI, I find that the streaming context is receiving inputs, but they are not being processed.
Can someone help me please? Thank you so much.

You need to add the calls below.
sparkStrContext.start(); // Start the computation
sparkStrContext.awaitTermination(); // Wait for the computation to terminate
Once you do this, you need to post messages to port 5554. For this you will first need to run Netcat (a small utility found in most Unix-like systems) as a data server and start pushing the stream.
For example, it looks like this:
TERMINAL 1:
# Running Netcat
$ nc -lk 5554
hello world
TERMINAL 2: RUNNING Your streaming program
-------------------------------------------
Time: 1357008430000 ms
-------------------------------------------
hello world
...
...
You can check a similar example here.
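For reference, here is a minimal sketch of the complete flow. It uses PySpark's (legacy) DStream API rather than the Java API from the question, and the master URL and port are just placeholders mirroring the question:

from pyspark import SparkConf, SparkContext
from pyspark.streaming import StreamingContext

# Minimal sketch: same flow as the question, using PySpark's DStream API.
# The master URL and port are placeholders.
conf = SparkConf().setAppName("DataAnalysis").setMaster("spark://IP:7077")
sc = SparkContext(conf=conf)
ssc = StreamingContext(sc, 1)                     # 1-second batch interval

lines = ssc.socketTextStream("localhost", 5554)   # connect to the data server
lines.pprint()                                    # an output operation is required

ssc.start()                                       # start the computation
ssc.awaitTermination()                            # wait for it to terminate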

Related

How do I run multiple configuration commands in Dell EMC OS10 with Paramiko?

I am trying to run a series of commands to configure a VLAN on a Dell EMC OS10 switch using Paramiko. However, I am running into a rather frustrating problem.
I want to run the following:
# configure terminal
(config)# interface vlan 3
(conf-if-vl-3)# description VLAN-TEST
(conf-if-vl-3)# end
However, I can't seem to figure out how to achieve this with paramiko.SSHClient().
When I try to use sshclient.exec_command("show vlan") it works great: it runs the command and exits. However, I don't know how to run more than one command with a single exec_command.
If I run sshclient.exec_command("configure") to access the configuration shell, the command completes and I believe the channel is closed, because my next command sshclient.exec_command("interface vlan ...") fails since the switch is no longer in configure mode.
If there is a way to establish a persistent channel with exec_command that would be ideal.
Instead, I have resorted to a function as follows:
chan = sshClient.invoke_shell()
chan.send("configure\n")
chan.send("interface vlan 3\n")
chan.send("description VLAN_TEST\n")
chan.send("end\n")
Oddly, this works when I run it from a Python terminal one command at a time.
However, when I call this function from my Python main, it fails. Perhaps the channel is closed too soon when it goes out of scope from the function call?
Please advise if there is a more reasonable way to do this.
Regarding sending commands to the configure mode started with SSHClient.exec_command, see:
Execute (sub)commands in secondary shell/command on SSH server in Python Paramiko
Though it's quite common that "devices" do not support the "exec" channel at all:
Executing command using Paramiko exec_command on device is not working
Regarding your problem with invoke_shell, it's quite possible that the server needs some time to get ready for the next command.
A quick-and-dirty solution is to "sleep" briefly between the individual send calls.
A better solution is to wait for the command prompt before sending the next command.
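As a minimal sketch of that approach (the host, credentials, helper name, and the fixed pause are placeholders/assumptions; a more robust version would read output until the prompt appears):

import time
import paramiko

# Hypothetical helper: sends configuration commands over one interactive shell,
# pausing and draining output between sends so the device can catch up.
def send_config_commands(host, username, password, commands, pause=1.0):
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=username, password=password)
    chan = client.invoke_shell()
    output = ""
    for cmd in commands:
        chan.send(cmd + "\n")
        time.sleep(pause)              # quick-and-dirty wait; better: wait for the prompt
        while chan.recv_ready():
            output += chan.recv(4096).decode("utf-8", errors="replace")
    client.close()
    return output

print(send_config_commands("192.0.2.10", "admin", "secret",
                           ["configure terminal",
                            "interface vlan 3",
                            "description VLAN-TEST",
                            "end"]))

Keeping the whole exchange on one invoke_shell channel is what preserves the configure-mode context between commands.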

How to send/receive command between VMware

I am using a VMware VM based on Ubuntu (named OS-1).
While operating another VMware VM (OS-2, also based on Ubuntu) from OS-1,
I would like to send a command from OS-1 to OS-2 to execute a specific script file, and also receive the stdout from OS-2.
Is it possible?
OS-1:
Receives the command to execute test.py from the web server.
Sends a command such as "python test.py" to OS-2.
OS-2:
Receives the command from OS-1.
Returns the stdout result to OS-1, such as "test script".
*** WebServer (in OS-1) ---> OS-1 ---> OS-2
test.py
print("===========");
print("test script");
The most obvious solution is to create an internal network between those two virtual machines.
When the machines are connected, it is relatively simple to execute commands; for example, you may use SSH (hint: https://stackoverflow.com/a/3586168/3188346).
It is worth noting that this solution will also work if you decide to use another VM provider or dedicated servers.
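As a minimal sketch of that idea, assuming OS-2 runs an SSH server and is reachable from OS-1 (the IP address and credentials below are placeholders), Paramiko can run the script on OS-2 and collect its stdout on OS-1:

import paramiko

# Placeholders: adjust the address and credentials for your OS-2 guest.
client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect("192.168.56.102", username="ubuntu", password="password")

# Run the script on OS-2 and read its stdout back on OS-1.
stdin, stdout, stderr = client.exec_command("python test.py")
print(stdout.read().decode())          # expected to contain "test script"
client.close()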

OpenSliceDDS across a network

I am completely new to the DDS world. I understand basic concepts like publish and subscribe, and the stuff that can be gained from the documentation. I am attempting to use OpenSplice DDS, and am able to get through the tutorial without much difficulty. However, I want to get two different computers on the same network to talk to each other, which seems like a relatively simple task, but I can find no documentation on it.
For example, in the chat room tutorial... how would I get the message board running on one machine, and the chatter on another machine?
Thanks!
Found it! http://opensplice.org/pipermail/developer/2009-July/000094.html.
To summarize from the link:
Set up your environment on node 1 by running the release file in the OSPL_HOME directory (release.bat)
Start the OpenSplice daemon on node 1 (ospl start)
Run the messageboard application on node 1
Set up your environment on node 2 by running the release file in the OSPL_HOME directory (release.bat)
Start the OpenSplice daemon on node 2 (ospl start)
Run the chatter application on node 2

jmeter hangs up and won't return

I am running 340 concurrent users to load test a server using JMeter.
But in most cases JMeter hangs up and won't return; even if I try to close the connection it just hangs, and eventually I have to close the application.
Any idea how to check what is holding the requests, how to inspect the requests sent by JMeter, and how to find the bottleneck?
Got the following message on closing the threads:
Shutting down thread, please be patient
I've hit this several times over the past few years. In each of my cases (it may not be so in yours) the issue was with the load balancer (F5) I was sending my traffic through. Basically, a property called OneConnect was holding the connections in a time-wait state and never killing the connection.
Run a packet capture tool like Wireshark and see what's happening with the requests.
Try distributed testing; 340 concurrent users is not a big deal, but you can still try it and see if that decreases your pain. Also take a look at the following link:
http://jmeter.apache.org/usermanual/best-practices.html#lean_mean
First check your script is OK with one user.
Ensure you use assertions.
Then run your test following JMeter best practices:
no gui
no costly listeners
You should then be able to see the longest requests in the CSV output and fix your issue.
I also encountered this problem before: when I ran JMeter on my laptop (Core 2 Duo 1.5 GHz) it always hung in the middle of processing. I tried running it on another PC which is more powerful than my laptop, and it now works smoothly. Therefore, JMeter will run effectively if your PC or laptop has better specs.
Note: It is also advisable to run your JMeter tests in non-GUI mode.
Example of running JMeter on a Linux box:
$ ./jmeter -t test.jmx -n -l /Users/home/test.jtl
I had the
one or more test threads won't exit
message because a firewall was blocking some requests. So I had to wait out the firewall's timeout for all blocked requests... then it returned.
You are probably getting this error because the JVM is not capable of running so many threads. If you take a look at your terminal, you will see the exception:
Uncaught Exception java.lang.OutOfMemoryError: unable to create new native thread. See log file for details.
You can solve this by doing remote (distributed) testing and having multiple machines generate the load, instead of one.

Brisk TaskTracker not starting in a multi-node Brisk setup

I have a 3 node Brisk cluster (Briskv1.0_beta2). Cassandra is working fine (all three nodes see each other and data is balanced across the ring). I started the nodes with the brisk cassandra -t command. I cannot, however, run any Hive or Pig jobs. When I do, I get an exception saying that it cannot connect to the task tracker.
During the startup process, I see the following in the log:
TaskTracker.java (line 695) TaskTracker up at: localhost.localdomain/127.0.0.1:34928
A few lines later, however, I see this:
Retrying connect to server: localhost.localdomain/127.0.0.1:8012. Already tried 9 time(s).
INFO [TASK-TRACKER-INIT] RPC.java (line 321) Server at localhost.localdomain/127.0.0.1:8012 not available yet, Zzzzz...
Those lines are repeated non-stop as long as my cluster is running.
My cassandra.yaml file specifies the box IP (not 0.0.0.0 or localhost) as the listen_address, and the rpc_address is set to 0.0.0.0.
Why is the client attempting to connect to a different port than the log shows the task tracker as using? Is there anywhere these addresses/ports can be specified?
I figured this out. In case anyone else has the same issues, here's what was going on:
Brisk uses the first entry in the Cassandra cluster's seed list to pick the initial jobtracker. One of my nodes had 127.0.0.1 in the seed list. This worked for the Cassandra setup, since all the other nodes in the cluster connected to that box to get the cluster topology, but it didn't work for the jobtracker selection.
Looks like your jobtracker isn't running. What do you see when you run "brisktool jobtracker"?