Shared dict of multiprocessing.Lock in python3 - python-multiprocessing

Is it possible to have a shared dict of Locks in python3? I need multiple locks because I want to protect a dict of shared resources. Each resource gets a lock:
manager = multiprocessing.Manager()
locks = manager.dict({key: manager.Lock() for key in range(100)})
shared_resource = manager.dict({key: SomeClass() for key in range(100)})
# later in a multi-processed function
def foo(key):
    # ...
    locks[key].acquire()
    shared_resource[key] = ...
    locks[key].release()
    # ...
This toy example would fail with:
multiprocessing.managers.RemoteError:
---------------------------------------------------------------------------
Unserializable message: ('#RETURN', <unlocked _thread.lock object at 0x7f9a4c9dc468>)
Any idea how to get around this problem? Could I use condition variables instead? Or how else would you protect a list of resources?

OK, it seems this is a bug in Python 3.5.
With Python 3.6 it works like a charm.

I'm sure it's possible. I don't get an error when I run this code...I just substituted SomeClass with 'x'. So maybe there's an issue there.
Also, using a context manager to acquire and release the lock is a nice little abstraction...
import multiprocessing
from multiprocessing import Process
manager = multiprocessing.Manager()
locks = {key: manager.Lock() for key in range(100)}
# Note: a plain dict is copied into each child process, so writes made there are not visible to the parent
shared_resource = {key: 'x' for key in range(100)}
# later in a multi-processed function
def foo(key):
    # ...
    with locks[key]:
        shared_resource[key] = 'xoyo'
if __name__ == '__main__':
    processes = [Process(target=foo, args=(1,)) for _ in range(3)]
    for p in processes:
        p.start()
    # wait for every worker, not only the last one started
    for p in processes:
        p.join()
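To tie the two posts together, here is a minimal sketch of the per-key locking pattern: a plain dict of manager locks guarding a manager dict, both passed to the workers explicitly. The helper names `add_one` and `worker`, the number of keys, and the repeat counts are illustrative assumptions, not part of the original code.

```python
import multiprocessing


def add_one(shared, locks, key):
    """Increment shared[key] while holding the lock that belongs to that key."""
    with locks[key]:
        # The read-modify-write happens under the lock so no update is lost.
        shared[key] = shared[key] + 1


def worker(shared, locks, key, repeats):
    """Apply add_one to a single key several times."""
    for _ in range(repeats):
        add_one(shared, locks, key)


if __name__ == "__main__":
    manager = multiprocessing.Manager()
    keys = range(4)
    # Plain dict whose values are manager lock proxies (one lock per resource).
    locks = {key: manager.Lock() for key in keys}
    # Manager dict, so that writes made in the workers reach the parent.
    shared = manager.dict({key: 0 for key in keys})

    processes = [
        multiprocessing.Process(target=worker, args=(shared, locks, n % 4, 50))
        for n in range(8)
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print(dict(shared))
```

Passing `shared` and `locks` as arguments, rather than relying on module-level globals, keeps the sketch independent of how child processes are started.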

Related

How to create a fake StringSession for unit tests

I've got some code which uses StringSession to talk to the Telegram API using telethon.
In my unit tests, I'm trying to instantiate a mocked TelegramClient, passing it a StringSession(myvalue) object as the first parameter. The real code works fine, but I need a fake session string for 'myvalue', to use in my unit tests (where I have a mocked telegram client).
How can I create a dummy value for 'myvalue' which will successfully execute StringSession(myvalue)?
Currently, my tests are dying here:
self = <telethon.sessions.string.StringSession object at 0x7f0777492ad0>
string = 'dummyxxx'
    def __init__(self, string: str = None):
        super().__init__()
        if string:
            if string[0] != CURRENT_VERSION:
                raise ValueError('Not a valid string')
            string = string[1:]
            ip_len = 4 if len(string) == 352 else 16
>           self._dc_id, ip, self._port, key = struct.unpack(
                _STRUCT_PREFORMAT.format(ip_len), StringSession.decode(string))
E           struct.error: unpack requires a buffer of 275 bytes
If you don't need a valid session to start with, you can also use MemorySession instead:
from telethon.sessions import MemorySession
session = MemorySession()
# use session variable when creating the client
Someone posted an answer which helped point me in the right direction, but they later deleted it for some reason.
In case it helps anyone else, here is the code that worked for me:
import struct
import base64
from telethon.sessions import StringSession
_STRUCT_PREFORMAT = '>B{}sH256s'
CURRENT_VERSION = '1'
dc_id = 1
ip = b'\x7f\x00\x00\x01'  # 127.0.0.1
port = 80
key = b'\x00' * 256
string = StringSession.encode(struct.pack(
    _STRUCT_PREFORMAT.format(len(ip)),
    dc_id,
    ip,
    port,
    key
))
myvalue = CURRENT_VERSION + string
# Create the StringSession object using the dummy value to confirm it works
session = StringSession(myvalue)
print(myvalue)
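The numbers in the traceback (352 characters, 275 bytes) follow directly from the struct format. The sketch below uses only the standard library to check them; it assumes the session string is base64-encoded, and the helper names `packed_size` and `encoded_length` are made up for illustration.

```python
import base64
import struct

# Format string copied from the code above: dc_id, ip, port, 256-byte key.
_STRUCT_PREFORMAT = '>B{}sH256s'


def packed_size(ip_len):
    """Number of bytes struct.unpack expects for an address of ip_len bytes."""
    return struct.calcsize(_STRUCT_PREFORMAT.format(ip_len))


def encoded_length(n_bytes):
    """Length of the base64 text that encodes n_bytes of data."""
    return len(base64.urlsafe_b64encode(bytes(n_bytes)))


print(packed_size(4))                  # size with a 4-byte (IPv4) address
print(packed_size(16))                 # size with a 16-byte (IPv6) address
print(encoded_length(packed_size(4)))  # encoded length of the IPv4 variant
```

According to the `__init__` shown in the traceback, any string whose length is not 352 is treated as carrying a 16-byte address, so a short dummy value can never decode to the buffer size that `struct.unpack` requires.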

insert_many in pymongo not persisting

I'm having some issues with persisting documents with pymongo when using insert_many.
I'm handing over a list of dicts to insert_many and it works fine from inside the same script that does the inserting. Less so once the script has finished.
import numpy as np
from pymongo import MongoClient
def row_to_doc(row):
    rowdict = row.to_dict()
    for key in rowdict:
        val = rowdict[key]
        # np.float64 subclasses float, so one isinstance check covers both
        if isinstance(val, float) and np.isnan(val):
            # If we want a SQL style document collection
            rowdict[key] = None
            # If we want a NoSQL style document collection
            # del rowdict[key]
    return rowdict
def dataframe_to_collection(df):
    n = len(df)
    doc_list = []
    for k in range(n):
        doc_list.append(row_to_doc(df.iloc[k]))
    return doc_list
def get_mongodb_client(host="localhost", port=27017):
    return MongoClient(host, port)
def create_collection(client):
    db = client["material"]
    return db["master-data"]
def add_docs_to_mongo(collection, doc_list):
    collection.insert_many(doc_list)
def main():
    client = get_mongodb_client()
    csv_fname = "some_csv_fname.csv"
    df = get_clean_csv(csv_fname)
    doc_list = dataframe_to_collection(df)
    collection = create_collection(client)
    add_docs_to_mongo(collection, doc_list)
    test_doc = collection.find_one({"MATERIAL": "000000000000000001"})
When I open up another python REPL and start looking through the client.material.master_data collection with collection.find_one({"MATERIAL": "000000000000000001"}) or collection.count_documents({}) I get None for the find_one and 0 for the count_documents.
Is there a step where I need to call some method to persist the data to disk? db.collection.save() in the mongo client API sounds like what I need but it's just another way of inserting documents from what I have read. Any help would be greatly appreciated.
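The problem was that I was getting my collection via client.db_name.collection_name, and that was not the collection my code had created. My code creates "master-data" (with a hyphen), but a hyphen cannot appear in a Python attribute name, so in the REPL I was looking at client.material.master_data, which is a different and empty collection called "master_data". Using client.db_name["collection-name"] solved my issue.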
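The mismatch comes down to which names can be spelled as Python attributes at all. A small standard-library sketch (the helper name `reachable_by_attribute` is hypothetical and not part of pymongo) makes the difference visible:

```python
import keyword


def reachable_by_attribute(name):
    """True if `name` can be written as obj.name in Python source code."""
    return name.isidentifier() and not keyword.iskeyword(name)


for name in ("master_data", "master-data"):
    if reachable_by_attribute(name):
        style = "attribute or item access"
    else:
        style = "item access only"
    print(f"{name!r}: {style}")
```

A collection whose name fails this check can only be reached with the item form, such as db["master-data"].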

Advice needed on setting up an (Objective C?) Mac-based web service

I have developed numerous iOS apps over the years so know Objective C reasonably well.
I'd like to build my first web service to offload some of the most processor intensive functions.
I'm leaning towards using my Mac as the server, which comes with Apache. I have configured this and it appears to be working as it should (I can type the Mac's IP address and receive a confirmation).
Now I'm trying to decide on how to build the server-side web service, which is totally new to me. I'd like to leverage my Objective C knowledge if possible. I think I'm looking for an Objective C-compatible web service engine and some examples how to connect it to browsers and mobile interfaces. I was leaning towards using Amazon's SimpleDB as the database.
BTW: I see Apple have Lion Server, but I cannot work out if this is an option.
Any thoughts/recommendations are appreciated.
There are examples of simple web servers out there written in ObjC such as this and this.
That said, there are probably "better" ways of doing this if you don't mind using other technologies. This is a matter of preference, but I've used Python, MySQL, and the excellent web.py framework for these sorts of backends.
For example, here's an example web service (some redundancies omitted...) using the combination of technologies described. I just run this on my server, and it takes care of url redirection and serves JSON from the db.
import web
import json
import MySQLdb
urls = (
    "/equip/gruppo", "gruppo",  # GET = get all gruppos, POST = save gruppo
    "/equip/frame", "frame"
)
class StatusCode:
    (Success, SuccessNoRows, FailConnect, FailQuery, FailMissingParam, FailOther) = range(6)
# top-level class that handles db interaction
class APIObject:
    def __init__(self):
        self.object_dict = {}  # top-level dictionary to be turned into JSON
        self.rows = []
        self.cursor = ""
        self.conn = ""
    def dbConnect(self):
        try:
            self.conn = MySQLdb.connect(host='localhost', user='my_api_user', passwd='api_user_pw', db='my_db')
            self.cursor = self.conn.cursor(MySQLdb.cursors.DictCursor)
        except Exception:
            self.object_dict['api_status'] = StatusCode.FailConnect
            return False
        else:
            return True
    def queryExecute(self, query):
        try:
            self.cursor.execute(query)
            self.rows = self.cursor.fetchall()
        except Exception:
            self.object_dict['api_status'] = StatusCode.FailQuery
            return False
        else:
            return True
class gruppo(APIObject):
    def GET(self):
        web.header('Content-Type', 'application/json')
        if not self.dbConnect():
            return json.dumps(self.object_dict, sort_keys=True, indent=4)
        else:
            if not self.queryExecute("SELECT * FROM gruppos"):
                return json.dumps(self.object_dict, sort_keys=True, indent=4)
            else:
                # len() is needed here: rows.count is a method, so comparing it to 0 is always False
                self.object_dict['api_status'] = StatusCode.SuccessNoRows if len(self.rows) == 0 else StatusCode.Success
                data_list = []
                for row in self.rows:
                    # create a dictionary with the required elements
                    d = {}
                    d['id'] = row['id']
                    d['maker'] = row['maker_name']
                    d['type'] = row['type_name']
                    # append to the object list
                    data_list.append(d)
                self.object_dict['data'] = data_list
                # return to the client
                return json.dumps(self.object_dict, sort_keys=True, indent=4)
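The JSON envelope that the GET handler builds can be looked at in isolation, without web.py or MySQL. This is only a sketch under assumptions: `build_response` is a hypothetical helper, the status codes are restated as an `IntEnum`, and the rows are plain dicts standing in for what a DictCursor would return.

```python
import json
from enum import IntEnum


class StatusCode(IntEnum):
    """Same numbering as range(6) in the service above."""
    SUCCESS = 0
    SUCCESS_NO_ROWS = 1
    FAIL_CONNECT = 2
    FAIL_QUERY = 3
    FAIL_MISSING_PARAM = 4
    FAIL_OTHER = 5


def build_response(rows):
    """Turn database rows (dicts) into the JSON text returned to the client."""
    status = StatusCode.SUCCESS if rows else StatusCode.SUCCESS_NO_ROWS
    data = [
        {"id": row["id"], "maker": row["maker_name"], "type": row["type_name"]}
        for row in rows
    ]
    envelope = {"api_status": int(status), "data": data}
    return json.dumps(envelope, sort_keys=True, indent=4)


# Example rows are invented purely for illustration.
print(build_response([{"id": 1, "maker_name": "ExampleMaker", "type_name": "road"}]))
```

Keeping the envelope construction in one function means a mobile client can always rely on finding `api_status`, whether or not any rows came back.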

redis move all keys

Is it possible to use Redis's MOVE command to move all keys from one database to another? The MOVE command only moves a single key, but I need to move all the keys in the database.
I would recommend taking a look at the following alpha-version app for backing up and restoring Redis databases (you can install it via gem install redis-dump). You could redis-dump your database and then redis-load into another database via the --database argument.
redis-dump project
If this doesn't fit your purposes, you may need to make use of a scripting language's redis bindings (or alternatively throw something together using bash / redis-cli / xargs, etc). If you need assistance along these lines then we probably need more details first.
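As a rough sketch of the bindings approach, the function below moves every key of the currently selected database to another database on the same server. It assumes a redis-py style client object that exposes `scan_iter()` and `move(name, db)`; the function name `move_all_keys` is made up for illustration, and MOVE leaves a key in place when the same key already exists in the target database.

```python
def move_all_keys(client, target_db):
    """Move every key of the client's current database into target_db.

    Returns a (moved, skipped) pair; a key is skipped when MOVE reports
    that it could not be moved (for example, it already exists in target_db).
    """
    # Collect the key names first so the loop does not scan while deleting.
    keys = list(client.scan_iter())
    moved = 0
    skipped = 0
    for key in keys:
        if client.move(key, target_db):
            moved += 1
        else:
            skipped += 1
    return moved, skipped
```

With redis-py installed, a call could look like move_all_keys(redis.Redis(host="localhost", db=0), target_db=1). Because MOVE works inside a single server only, copying between two servers still needs a dump-and-load or a copy script.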
I've written a small Python script to move data between two Redis servers (it only supports list and string types, and you must install the Python redis client):
'''
Created on 2011-11-9
@author: wuyi
'''
import redis
from optparse import OptionParser
import time
def mv_str(r_source, r_dest, quiet):
    # copy every string key that does not already exist on the destination
    keys = r_source.keys("*")
    for k in keys:
        if r_dest.exists(k):
            print("skipping %s" % k)
            continue
        else:
            print("copying %s" % k)
            r_dest.set(k, r_source.get(k))
def mv_list(r_source, r_dest, quiet):
    # copy every list element by element, preserving order
    keys = r_source.keys("*")
    for k in keys:
        length = r_source.llen(k)
        i = 0
        while i < length:
            print("add queue no.:%d" % i)
            v = r_source.lindex(k, i)
            r_dest.rpush(k, v)
            i += 1
if __name__ == "__main__":
    usage = """usage: %prog [options] source dest"""
    parser = OptionParser(usage=usage)
    parser.add_option("-q", "--quiet", dest="quiet",
                      default=False, action="store_true",
                      help="quiet mode")
    parser.add_option("-p", "--port", dest="port",
                      default=6380,
                      help="port for both source and dest")
    parser.add_option("", "--dbs", dest="dbs",
                      default="0",
                      help="db list: 0 1 120 220...")
    parser.add_option("-t", "--type", dest="type",
                      default="normal",
                      help="available types: normal, lpoplist")
    parser.add_option("", "--tmpdb", dest="tmpdb",
                      default=0,
                      help="tmp db number to store tmp data")
    (options, args) = parser.parse_args()
    if not len(args) == 2:
        print(usage)
        exit(1)
    source = args[0]
    dest = args[1]
    if source == dest:
        print("dest must not be the same as source!")
        exit(2)
    dbs = options.dbs.split(' ')
    for db in dbs:
        r_source = redis.Redis(host=source, db=int(db), password="", port=int(options.port))
        r_dest = redis.Redis(host=dest, db=int(db), password="", port=int(options.port))
        print("______________db____________:%s" % db)
        time.sleep(2)
        if options.type == "normal":
            mv_str(r_source, r_dest, options.quiet)
        elif options.type == "lpoplist":
            mv_list(r_source, r_dest, options.quiet)
        del r_source
        del r_dest
You can try my own tool, rdd.
It's a command-line utility that can dump a database to a file, work on it (filter, match, merge, ...), and load it back into a Redis instance.
Take care, it is at alpha stage: https://github.com/r043v/rdd/
Now that Redis has scripting using Lua, you can easily write a command that loops through all the keys, checks their type, and moves them accordingly to a new database.
I suggest trying the following:
1. copy the rdb file to another directory;
2. rename the rdb file;
3. adapt the Redis configuration file to the new db file.

Which OO programming style in R will be readable to a Python programmer?

I'm the author of the logging package on CRAN. I don't see myself as an R programmer, so I tried to make it as code-compatible with the Python standard logging package as I could. Now I have a question, and I hope it will give me the chance to learn some more R!
It's about hierarchical loggers. In Python I would create a logger and send it logging records:
l = logging.getLogger("some.lower.name")
l.debug("test")
l.info("some")
l.warn("say no")
In my R package, by contrast, you do not create a logger to which you send messages; you invoke a function where one of the arguments is the name of the logger. Something like:
logdebug("test", logger="some.lower.name")
loginfo("some", logger="some.lower.name")
logwarn("say no", logger="some.lower.name")
The problem is that you have to repeat the name of the logger each time you want to send it a logging message. I was thinking I might create a partially applied function object and invoke that instead, something like:
logdebug <- curry(logging::logdebug, logger="some.lower.logger")
but then I would need to do so for all the logging functions...
How would you R users approach this?
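For readers coming from the Python side, the partial-application idea can be sketched so that the binding happens once for all levels rather than once per function. Everything here is hypothetical: `log_message` merely stands in for a function that takes the logger name on every call, and `bind_logger` is an illustrative helper, not part of either logging package.

```python
import functools


def log_message(message, level, logger):
    """Stand-in for a logging function that needs the logger name each time."""
    return f"{level}:{logger}:{message}"


def bind_logger(logger, levels=("DEBUG", "INFO", "WARN", "ERROR")):
    """Return one partially applied function per level, all bound to `logger`."""
    return {
        level.lower(): functools.partial(log_message, level=level, logger=logger)
        for level in levels
    }


log = bind_logger("some.lower.name")
print(log["debug"]("test"))
print(log["warn"]("say no"))
```

The answers that follow reach the same goal in R by storing the logger name inside an object instead of inside a set of closures.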
Sounds like a job for a reference class; see ?setRefClass and ?ReferenceClasses.
Logger <- setRefClass("Logger",
    fields = list(name = "character"),
    methods = list(
        log = function(level, ...) {
            levellog(level, ..., logger = name)
        },
        debug = function(...) { log("DEBUG", ...) },
        info = function(...) { log("INFO", ...) },
        warn = function(...) { log("WARN", ...) },
        error = function(...) { log("ERROR", ...) }
    ))
and then
> basicConfig()
> l <- Logger$new(name="hierarchic.logger.name")
> l$debug("oops")
> l$info("oops")
2011-02-11 11:54:05 NumericLevel(INFO):hierarchic.logger.name:oops
> l$warn("oops")
2011-02-11 11:54:11 NumericLevel(WARN):hierarchic.logger.name:oops
>
This could be done with the proto package. It supports older versions of R (it's been around for years), so you would not have a problem of old vs. new versions of R.
library(proto)
library(logging)
Logger. <- proto(
    new = function(this, name)
        this$proto(name = name),
    log = function(this, ...)
        levellog(..., logger = this$name),
    setLevel = function(this, newLevel)
        logging::setLevel(newLevel, container = this$name),
    addHandler = function(this, ...)
        logging::addHandler(this, ..., logger = this$name),
    warn = function(this, ...)
        this$log(loglevels["WARN"], ...),
    error = function(this, ...)
        this$log(loglevels["ERROR"], ...)
)
basicConfig()
l <- Logger.$new(name = "hierarchic.logger.name")
l$warn("this may be bad")
l$error("this definitely is bad")
This gives the output:
> basicConfig()
> l <- Logger.$new(name = "hierarchic.logger.name")
> l$warn("this may be bad")
2011-02-28 10:17:54 WARNING:hierarchic.logger.name:this may be bad
> l$error("this definitely is bad")
2011-02-28 10:17:54 ERROR:hierarchic.logger.name:this definitely is bad
In the above we have merely layered proto on top of logging but it would be possible to turn each logging object into a proto object, i.e. it would be both, since both logging objects and proto objects are R environments. That would get rid of the extra layer.
See http://r-proto.googlecode.com for more info.
Why would you repeat the name? It would be more convenient to pass the log-object directly to the function, ie
logdebug("test",logger=l)
# or
logdebug("test",l)
That is a bit like the way one would pass a connection to a number of functions. It seems more like the R way of doing it, I guess.