Pyinstaller, Multiprocessing, and Pandas - No such file/directory [duplicate] - pandas

Python v3.5, Windows 10
I'm using multiple processes and trying to captures user input. Searching everything I see there are odd things that happen when using input() with multiple processes. After 8 hours+ of trying, nothing I implement worked, I'm positive I am doing it wrong but I can't for the life of me figure it out.
The following is a very stripped down program that demonstrates the issue. Now it works fine when I run this program within PyCharm, but when I use pyinstaller to create a single executable it fails. The program constantly is stuck in a loop asking the user to enter something as shown below:.
I am pretty sure it has to do with how Windows takes in standard input from things I've read. I've also tried passing the user input variables as Queue() items to the functions but the same issue. I read you should put input() in the main python process so I did that under if __name__ = '__main__':
from multiprocessing import Process
import time
def func_1(duration_1):
while duration_1 >= 0:
time.sleep(1)
print('Duration_1: %d %s' % (duration_1, 's'))
duration_1 -= 1
def func_2(duration_2):
while duration_2 >= 0:
time.sleep(1)
print('Duration_2: %d %s' % (duration_2, 's'))
duration_2 -= 1
if __name__ == '__main__':
# func_1 user input
while True:
duration_1 = input('Enter a positive integer.')
if duration_1.isdigit():
duration_1 = int(duration_1)
break
else:
print('**Only positive integers accepted**')
continue
# func_2 user input
while True:
duration_2 = input('Enter a positive integer.')
if duration_2.isdigit():
duration_2 = int(duration_2)
break
else:
print('**Only positive integers accepted**')
continue
p1 = Process(target=func_1, args=(duration_1,))
p2 = Process(target=func_2, args=(duration_2,))
p1.start()
p2.start()
p1.join()
p2.join()

You need to use multiprocessing.freeze_support() when you produce a Windows executable with PyInstaller.
Straight out from the docs:
multiprocessing.freeze_support()
Add support for when a program which uses multiprocessing has been frozen to produce a Windows executable. (Has been tested with py2exe, PyInstaller and cx_Freeze.)
One needs to call this function straight after the if name == 'main' line of the main module. For example:
from multiprocessing import Process, freeze_support
def f():
print('hello world!')
if __name__ == '__main__':
freeze_support()
Process(target=f).start()
If the freeze_support() line is omitted then trying to run the frozen executable will raise RuntimeError.
Calling freeze_support() has no effect when invoked on any operating system other than Windows. In addition, if the module is being run normally by the Python interpreter on Windows (the program has not been frozen), then freeze_support() has no effect.
In your example you also have unnecessary code duplication you should tackle.

Related

How to run multiple functions synchronous in Jupyter Notebook?

I try to run multiple functions at the same time in Jupiter Notebook.
I have two web scraping functions that use Selenium and run for an infinite amount of time, both always creating an updated DataFrame. Another function merges the two DataFrames and does some calculations.
As the data always changes and the calculations from the different DataFrames need to be calculated within the same second (The two DataFrames update every 5 seconds), I wonder how I can run all functions at the same time.
As my code is mainly WebScraping I used this more to describe my goal and hopefully make it more readable. I already tried using 'multiprocessing' but it just does not do anything in the notebook.
def FirstWebScraping():
while True:
time.sleep(5).
#getting all data for DataFrame
def SecondtWebScraping():
while True:
time.sleep(5).
#getting all data for DataFrame
def Calculations():
while True:
#merging DataFrame from First- and SecondWebScraping
#doing calculations
#running this function infinite and looking for specific values
#Goal
def run_all_at_the_same_time()
FirstWebScraping()
SecondWebScraping()
Calculations()
Even though threading does not show the same benefits as multiprocessing it worked for me and with selenium. I put a waiting time at the beginning for the Calculations function and from there they were all looped infinitely.
from threading import Thread
if __name__ == '__main__':
Thread(target = FirstWebScraping).start()
Thread(target = SecondWebscraping).start()
Thread(target = Calculations).start()
You can run multiprocessing in Jupyter, if you follow two rules:
Put the worker functions in a separate module.
Protect the main process-only code with if __name__ == '__main__':
Assuming your three functions are moved to worker.py:
import multiprocessing as mp
import worker
if __name__ == '__main__':
mp.Process(target=worker.FirstWebScraping).start()
mp.Process(target=worker.SecondWebscraping).start()
mp.Process(target=worker.Calculations).start()

Why is my multiprocessing program spawning processes infinitely?

import time
from multiprocessing import Pool, RawArray, sharedctypes
from ctypes import c_int
def init_worker(X):
print(f"{X}")
def worker_func(i):
print(f"{X}")
time.sleep(i) # Some heavy computations
return
# We need this check for Windows to prevent infinitely spawning new child
# processes.
if __name__ == '__main__':
X =sharedctypes.RawValue(c_int)
X=3
with Pool(processes=4, initializer=init_worker, initargs=(X)) as pool:
pool.map(worker_func, [1,2,3,4])
print(X)
--- I am simply trying to print the value of X in each subprocess. This is a toy program in order to check whether I can share a value
and update it by using multiple processes.
This program spawns infinite number of processes because it there is no ,(comma) after the X in initargs=(X) ; it should be initargs=(X,) . That is because, if the comma is left-out that causes an error.

Why a single process can achieve multiple CPU usage of 100% on Windows Subsystem for Linux(WSL), but it can't on Ubuntu on server?

I want to achieve parallel computing by Python multiprocessing module, so I implement a simulated calculation to test whether I can use multiple CPU cores. I found a very strange thing that a single process can achieve 8 CPU usage of 100% on Windows Subsystem for Linux(WSL) on my desktop rather than only one CPU usage of 100% on Ubuntu on Lab's server.
Like this:
And this is the contrast:
Furthermore, I found that using multiple processes does not reduce the time cost on WSL on my desktop, but which indeed largely reduce the time cost on Ubuntu on Lab's server.
Like this:
(Here I run 6 processes and running a single process on Lab's server needs about 440s.)
And this is the contrast:
(Here I run 3 processes and running a single process on my desktop needs about 29s.)
Here is my Python source codes:
import numpy as np
import time
import os
import multiprocessing as mp
PROCESS_MAX = 1
LOOPS = 1
process_list = []
def simulated_calculation():
x = np.random.rand(100, 100)
y = np.random.rand(100, 100)
z = np.outer(x, y)
determinant = np.linalg.det(z)
def child_process(name):
for i in range(LOOPS):
print("The child process[%s] starts at %s and its PID is %s" % (str(name), time.ctime(), os.getpid()))
simulated_calculation()
print("The child process[%s] stops at %s and its PID is %s" %(str(name), time.ctime(), os.getpid()))
def main():
print("All start at %s" % time.ctime())
print("The parent process stars at %s and its PID is %s" % (time.ctime(), os.getpid()))
start_wall_time = time.time()
for i in range(PROCESS_MAX):
p = mp.Process(target = child_process, args = (i + 1, ))
process_list.append(p)
p.daemon = True
p.start()
for i in process_list:
i.join()
stop_wall_time = time.time()
print("All stop at %s" % time.ctime())
print("The whole runtime is %ss" % str(stop_wall_time - start_wall_time))
if __name__ == "__main__":
main()
I hope someone can help me. Thanks!
WSL1 has a virtual layer through which the Windows device drivers are being passed. WSL2 on the other hand, has more access due to a Linux kernel in place. However direct access to the hardware is inaccessible to WSL1 except USB. Hardware such as USB and GPU are currently not available to WSL2 but is being worked.

multiprocessing code gets stuck

I am using python 2.7 on windows 7 and I am currently trying to learn parallel processing.
I downloaded the multiprocessing 2.6.2.1 python package and installed it using pip.
When I try to run the foolowing very simple code, the program seems to get stuck, even after one hour it doesn't exit the execution despite the code to be super simple.
What am I missing?? thank you very much
from multiprocessing import Pool
def f(x):
return x*x
array =[1,2,3,4,5]
p=Pool()
result = p.map(f, array)
p.close()
p.join()
print result
The issue here is the way multiprocessing works. Think of it as python opening a new instance and importing all the modules all over again. You'll want to use the if __name__ == '__main__' convention. The following works fine:
import multiprocessing
def f(x):
return x * x
def main():
p = multiprocessing.Pool(multiprocessing.cpu_count())
result = p.imap(f, xrange(1, 6))
print list(result)
if __name__ == '__main__':
main()
I have changed a few other parts of the code too so you can see other ways to achieve the same thing, but ultimately you only need to stop the code executing over and over as python re-imports the code you are running.

QThread doesn't appear to start; PyQt5, Python 2.7.9

SUMMARY
PyQt5 doesn't appear to be creating a new thread corresponding to QThread object, or I haven't established Slot/Signal linkage correctly. Please help me to isolate my problem.
I'm a relatively casual user of Python, but I've been asked to create a utility for another team that wraps some of their Python libraries (which themselves wrap C++) in a GUI. Because this utility is for another team, I can't change versions of compilers etc, or at least, not without providing a decent reason.
The utility is intended to provide an interface for debugging into some hardware that my colleagues are developing.
After examining the options, I decided to use Qt and the PyQt bindings. The steps I followed were:
Install Visual Studio 2010 SP1 (required because other team's libraries are compiled using this version of the MS compiler).
Install Python 2.7.9 (their version of Python)
Install qt-opensource-windows-x86-msvc2010-5.2.1.exe
Get source for SIP-4.18.zip and compile and install
Get source for PyQt-gpl-5.2.1.zip, compile and install
Try to build a PyQt application that wraps the other team's comms and translation libraries. Those libraries aren't asynchronous as far as I can tell, so I think that I need to separate that part of the application from the GUI.
The code that I've written produces the UI and is responsive in the sense that if I put break points in the methods that are called from the QAction objects, then those break points are appropriately triggered. My problem is that the Worker object that I create doesn't appear to move to a separate thread, (despite the call to moveToThread) because if I make the connection of type BlockingQueuedConnection instead of QueuedConnection then I get a deadlock. Breakpoints that I put on the slots in the Worker type are never triggered.
Here's the code::
import os
import sys
import time
from PyQt5.QtWidgets import QMainWindow, QTextEdit, QAction, QApplication, QStatusBar, QLabel, QWidget, QDesktopWidget, QInputDialog
from PyQt5.QtGui import QIcon
from PyQt5.QtCore import Qt, QThread, QObject, pyqtSignal, pyqtSlot
class Worker(QObject):
def __init__(self):
super(Worker, self).__init__()
self._isRunning = True
self._connectionId = ""
self._terminate = False
#pyqtSlot()
def cmd_start_running(self):
"""This slot is used to send a command to the HW asking for it to enter Running mode.
It will actually work by putting a command in a queue for the main_loop to get to
in its own serialised good time. All the other commands will work in a similar fashion
Up until such time as it is implemented, I will fake it."""
self._isRunning = True
pass
#pyqtSlot()
def cmd_stop_running(self):
"""This slot is used to send a command to the HW asking for it to enter Standby mode.
Up until such time as it is implemented, I will fake it."""
self._isRunning = False
#pyqtSlot()
def cmd_get_version(self):
"""This slot is used to send a command to the HW asking for its version string"""
pass
#pyqtSlot()
def cmd_terminate(self):
"""This slot is used to notify this object that it has to join the main thread."""
pass
#pyqtSlot()
def main_loop(self):
"""This slot is the main loop that is attached to the QThread object. It has sleep periods
that allow the messages on the other slots to be processed."""
while not self._terminate:
self.thread().sleep(1)
# While there is stuff on the wire, get it off, translate it, then
# signal it
# For the mean while, pretend that _isRunning corresponds to when
# RT streams will be
# being received from the HW.
if self._isRunning:
pass
# Search queue for commands, if any found, translate, then put on
# the wire
class DemoMainWindow(QMainWindow):
sgnl_get_version = pyqtSignal()
sgnl_start_running = pyqtSignal()
sgnl_stop_running = pyqtSignal()
sgnl_terminate = pyqtSignal()
def __init__(self):
super(DemoMainWindow, self).__init__()
self.initUI()
self._workerObject = Worker()
self._workerThread = QThread()
self._workerObject.moveToThread(self._workerThread)
self._workerThread.started.connect(self._workerObject.main_loop, type=Qt.QueuedConnection)
# I changed the following connection to type BlockingQueuedConnection,
# and got a Deadlock error
# reported, so I assume that there is already a problem before I get to
# this point.
# I understand that the default for 'type' (Qt.AutoConnection) is
# supposed to correctly infer that a QueuedConnection is required.
# I was getting desperate.
self.sgnl_get_version.connect(self._workerObject.cmd_get_version, type=Qt.QueuedConnection)
self.sgnl_start_running.connect(self._workerObject.cmd_start_running, type=Qt.QueuedConnection)
self.sgnl_stop_running.connect(self._workerObject.cmd_stop_running, type=Qt.QueuedConnection)
self.sgnl_terminate.connect(self._workerObject.cmd_terminate, type=Qt.QueuedConnection)
def initUI(self):
textEdit = QTextEdit()
self.setCentralWidget(textEdit)
lbl = QLabel(self.statusBar())
lbl.setText("HW Version: ")
self.statusBar().addPermanentWidget(lbl)
exitAction = QAction(QIcon('exit24.png'), 'Exit', self)
exitAction.setShortcut('Ctrl+Q')
exitAction.setStatusTip('Exit application')
exitAction.triggered.connect(self.close)
connectAction = QAction(QIcon('connect24.png'), 'Connect', self)
connectAction.setStatusTip('Connect to HW')
connectAction.triggered.connect(self.establishCanConnection)
enterRunningAction = QAction(QIcon('start24.png'), 'Start Running', self)
enterRunningAction.setStatusTip('Start Running')
enterRunningAction.triggered.connect(self.enterRunning)
enterStandbyAction = QAction(QIcon('stop24.png'), 'Stop Running', self)
enterStandbyAction.setStatusTip('Stop Running')
enterStandbyAction.triggered.connect(self.enterStandby)
self.statusBar()
menubar = self.menuBar()
fileMenu = menubar.addMenu('&File')
fileMenu.addAction(exitAction)
hwMenu = menubar.addMenu('&Hardware')
hwMenu.addAction(connectAction)
hwMenu.addAction(enterRunningAction)
hwMenu.addAction(enterStandbyAction)
toolbar = self.addToolBar('Exit')
toolbar.addAction(exitAction)
toolbar.addAction(connectAction)
toolbar.addAction(enterRunningAction)
toolbar.addAction(enterStandbyAction)
self.setGeometry(300, 300, 400, 350) # x, y, width, height
self.setWindowTitle('Demo Prog')
self.show()
def establishCanConnection(self):
iDlg = QInputDialog(self)
iDlg.setInputMode(QInputDialog.IntInput)
idInt, ok = iDlg.getInt(self, 'CAN ID Selection', 'HW ID:')
canID = '%s%d' % ('HW', idInt)
if ok:
self._workerThread.start()
pass
# this would be where the channel is established
def enterRunning(self):
self.sgnl_start_running.emit()
# this would be where the command to start running is sent from
def enterStandby(self):
self.sgnl_stop_running.emit()
# send the command to stop running
if __name__ == '__main__':
app = QApplication(sys.argv)
mainWindow = DemoMainWindow()
sys.exit(app.exec_())
Note that the call to start the _workerThread is in the establishCanConnection method, but that shouldn't be a problem, should it?
I used the procmon utility to check if more threads are created if establishCanConnection is run, and it appears that there are more threads, but I found it hard to relate which thread (if any of them) related to the QThread object.
Don't use BlockingQueuedConnection unless you really need it. If you don't know whether you need it or not, then you don't need it.
Cross-thread signals are queued in the event-loop of the receiving thread. If that thread is running code that blocks, it won't be able to process any events. Thus, if you send a signal with BlockingQueuedConnection to a thread that is blocked, you'll get a deadlock.
Your example uses a worker object that runs a blocking while loop, so it is subject to the deadlock problem outlined above. If you want to send signals to a thread that is blocked, you will need to arrange for the blocking code to periodically allow the thread to process its events, like this:
while not self._terminate:
self.thread().sleep(1)
QApplication.processEvents()
PS:
If you want to check that the worker is running in a different thread, you can print the return value of QThread.currentThread() or QThread.currentThreadId() (these functions are static, so you don't need an instance of QThread to call them).