Changing how object appears in interpreter - interpreter

Is there a way to change how an object appears when displayed at the Python interpreter? For example:
>>> test = myobject(2)
>>> test
'I am 2'
OR
>>> test = myobject(2)
>>> test
myobject(2)

Yes, you can provide a definition for the special __repr__ method:
class Test:
def __repr__(self):
return "I am a Test"
>>> a = Test()
>>> a
I am a Test
In a real example, of course, you would print out some values from object data members.
The __repr__ method is described in the Python documentation here.

Related

What exactly to test (unittest) in a larger function containing several dataframe manipulations

Perhaps this is a constraint of my understanding of unittests, but I get quite confused as to what should be tested, patched, etc in a method that has several pandas dataframe manipulations. Many of the unittest examples out there focus on classes and methods that are typically small. For larger methods, I get a bit lost on the typical unittest paradigm. For example:
myscript.py
class Pivot:
def prepare_dfs(self):
df = pd.read_csv(self.file, sep=self.delimiter)
g = df.groupby("Other_Location")
df1 = g.apply(lambda x: x[x["PRN"] == "Free"].count())
locations = ["O12-03-01", "O12-03-02"]
cp = df1["PRN"]
cp = cp[locations].tolist()
data = [locations, cp]
new_df = pd.DataFrame({"Other_Location": data[0], "Free": data[1]})
return new_df, df
test_myscript.py
class TestPivot(unittest.TestCase):
def setUp(self):
args = parse_args(["-f", "test1", "-d", ","])
self.pivot = Pivot(args)
self.pivot.path = "Pivot/path"
#mock.patch("myscript.cp[locations].tolist()", return_value=None)
#mock.patch("myscript.pd.read_csv", return_value=df)
def test_prepare_dfs_1(self, mock_read_csv, mock_cp):
new_df, df = self.pivot.prepare_dfs()
# Here I get a bit lost
For example here I try to circumvent the following error message:
ModuleNotFoundError: No module named 'myscript.cp[locations]'; 'myscript' is not a package
I managed to mock correctly the pd.read_csv in my method, however further down in the code there are groupy, apply, tolist etc. The error message is thrown at the following line:
cp = cp[locations].tolist()
What is the best way to approach unittesting when your method involves several manipulations on a dataframe? Is refactoring the code always advised (into smaller chunks)? In this case, how can I mock correctly the tolist ?

What does the # mean in this equation? [duplicate]

What does the # symbol do in Python?
An # symbol at the beginning of a line is used for class and function decorators:
PEP 318: Decorators
Python Decorators
The most common Python decorators are:
#property
#classmethod
#staticmethod
An # in the middle of a line is probably matrix multiplication:
# as a binary operator.
Example
class Pizza(object):
def __init__(self):
self.toppings = []
def __call__(self, topping):
# When using '#instance_of_pizza' before a function definition
# the function gets passed onto 'topping'.
self.toppings.append(topping())
def __repr__(self):
return str(self.toppings)
pizza = Pizza()
#pizza
def cheese():
return 'cheese'
#pizza
def sauce():
return 'sauce'
print pizza
# ['cheese', 'sauce']
This shows that the function/method/class you're defining after a decorator is just basically passed on as an argument to the function/method immediately after the # sign.
First sighting
The microframework Flask introduces decorators from the very beginning in the following format:
from flask import Flask
app = Flask(__name__)
#app.route("/")
def hello():
return "Hello World!"
This in turn translates to:
rule = "/"
view_func = hello
# They go as arguments here in 'flask/app.py'
def add_url_rule(self, rule, endpoint=None, view_func=None, **options):
pass
Realizing this finally allowed me to feel at peace with Flask.
In Python 3.5 you can overload # as an operator. It is named as __matmul__, because it is designed to do matrix multiplication, but it can be anything you want. See PEP465 for details.
This is a simple implementation of matrix multiplication.
class Mat(list):
def __matmul__(self, B):
A = self
return Mat([[sum(A[i][k]*B[k][j] for k in range(len(B)))
for j in range(len(B[0])) ] for i in range(len(A))])
A = Mat([[1,3],[7,5]])
B = Mat([[6,8],[4,2]])
print(A # B)
This code yields:
[[18, 14], [62, 66]]
This code snippet:
def decorator(func):
return func
#decorator
def some_func():
pass
Is equivalent to this code:
def decorator(func):
return func
def some_func():
pass
some_func = decorator(some_func)
In the definition of a decorator you can add some modified things that wouldn't be returned by a function normally.
What does the “at” (#) symbol do in Python?
In short, it is used in decorator syntax and for matrix multiplication.
In the context of decorators, this syntax:
#decorator
def decorated_function():
"""this function is decorated"""
is equivalent to this:
def decorated_function():
"""this function is decorated"""
decorated_function = decorator(decorated_function)
In the context of matrix multiplication, a # b invokes a.__matmul__(b) - making this syntax:
a # b
equivalent to
dot(a, b)
and
a #= b
equivalent to
a = dot(a, b)
where dot is, for example, the numpy matrix multiplication function and a and b are matrices.
How could you discover this on your own?
I also do not know what to search for as searching Python docs or Google does not return relevant results when the # symbol is included.
If you want to have a rather complete view of what a particular piece of python syntax does, look directly at the grammar file. For the Python 3 branch:
~$ grep -C 1 "#" cpython/Grammar/Grammar
decorator: '#' dotted_name [ '(' [arglist] ')' ] NEWLINE
decorators: decorator+
--
testlist_star_expr: (test|star_expr) (',' (test|star_expr))* [',']
augassign: ('+=' | '-=' | '*=' | '#=' | '/=' | '%=' | '&=' | '|=' | '^=' |
'<<=' | '>>=' | '**=' | '//=')
--
arith_expr: term (('+'|'-') term)*
term: factor (('*'|'#'|'/'|'%'|'//') factor)*
factor: ('+'|'-'|'~') factor | power
We can see here that # is used in three contexts:
decorators
an operator between factors
an augmented assignment operator
Decorator Syntax:
A google search for "decorator python docs" gives as one of the top results, the "Compound Statements" section of the "Python Language Reference." Scrolling down to the section on function definitions, which we can find by searching for the word, "decorator", we see that... there's a lot to read. But the word, "decorator" is a link to the glossary, which tells us:
decorator
A function returning another function, usually applied as a function transformation using the #wrapper syntax. Common
examples for decorators are classmethod() and staticmethod().
The decorator syntax is merely syntactic sugar, the following two
function definitions are semantically equivalent:
def f(...):
...
f = staticmethod(f)
#staticmethod
def f(...):
...
The same concept exists for classes, but is less commonly used there.
See the documentation for function definitions and class definitions
for more about decorators.
So, we see that
#foo
def bar():
pass
is semantically the same as:
def bar():
pass
bar = foo(bar)
They are not exactly the same because Python evaluates the foo expression (which could be a dotted lookup and a function call) before bar with the decorator (#) syntax, but evaluates the foo expression after bar in the other case.
(If this difference makes a difference in the meaning of your code, you should reconsider what you're doing with your life, because that would be pathological.)
Stacked Decorators
If we go back to the function definition syntax documentation, we see:
#f1(arg)
#f2
def func(): pass
is roughly equivalent to
def func(): pass
func = f1(arg)(f2(func))
This is a demonstration that we can call a function that's a decorator first, as well as stack decorators. Functions, in Python, are first class objects - which means you can pass a function as an argument to another function, and return functions. Decorators do both of these things.
If we stack decorators, the function, as defined, gets passed first to the decorator immediately above it, then the next, and so on.
That about sums up the usage for # in the context of decorators.
The Operator, #
In the lexical analysis section of the language reference, we have a section on operators, which includes #, which makes it also an operator:
The following tokens are operators:
+ - * ** / // % #
<< >> & | ^ ~
< > <= >= == !=
and in the next page, the Data Model, we have the section Emulating Numeric Types,
object.__add__(self, other)
object.__sub__(self, other)
object.__mul__(self, other)
object.__matmul__(self, other)
object.__truediv__(self, other)
object.__floordiv__(self, other)
[...]
These methods are called to implement the binary arithmetic operations (+, -, *, #, /, //, [...]
And we see that __matmul__ corresponds to #. If we search the documentation for "matmul" we get a link to What's new in Python 3.5 with "matmul" under a heading "PEP 465 - A dedicated infix operator for matrix multiplication".
it can be implemented by defining __matmul__(), __rmatmul__(), and
__imatmul__() for regular, reflected, and in-place matrix multiplication.
(So now we learn that #= is the in-place version). It further explains:
Matrix multiplication is a notably common operation in many fields of
mathematics, science, engineering, and the addition of # allows
writing cleaner code:
S = (H # beta - r).T # inv(H # V # H.T) # (H # beta - r)
instead of:
S = dot((dot(H, beta) - r).T,
dot(inv(dot(dot(H, V), H.T)), dot(H, beta) - r))
While this operator can be overloaded to do almost anything, in numpy, for example, we would use this syntax to calculate the inner and outer product of arrays and matrices:
>>> from numpy import array, matrix
>>> array([[1,2,3]]).T # array([[1,2,3]])
array([[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
>>> array([[1,2,3]]) # array([[1,2,3]]).T
array([[14]])
>>> matrix([1,2,3]).T # matrix([1,2,3])
matrix([[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
>>> matrix([1,2,3]) # matrix([1,2,3]).T
matrix([[14]])
Inplace matrix multiplication: #=
While researching the prior usage, we learn that there is also the inplace matrix multiplication. If we attempt to use it, we may find it is not yet implemented for numpy:
>>> m = matrix([1,2,3])
>>> m #= m.T
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: In-place matrix multiplication is not (yet) supported. Use 'a = a # b' instead of 'a #= b'.
When it is implemented, I would expect the result to look like this:
>>> m = matrix([1,2,3])
>>> m #= m.T
>>> m
matrix([[14]])
What does the “at” (#) symbol do in Python?
# symbol is a syntactic sugar python provides to utilize decorator,
to paraphrase the question, It's exactly about what does decorator do in Python?
Put it simple decorator allow you to modify a given function's definition without touch its innermost (it's closure).
It's the most case when you import wonderful package from third party. You can visualize it, you can use it, but you cannot touch its innermost and its heart.
Here is a quick example,
suppose I define a read_a_book function on Ipython
In [9]: def read_a_book():
...: return "I am reading the book: "
...:
In [10]: read_a_book()
Out[10]: 'I am reading the book: '
You see, I forgot to add a name to it.
How to solve such a problem? Of course, I could re-define the function as:
def read_a_book():
return "I am reading the book: 'Python Cookbook'"
Nevertheless, what if I'm not allowed to manipulate the original function, or if there are thousands of such function to be handled.
Solve the problem by thinking different and define a new_function
def add_a_book(func):
def wrapper():
return func() + "Python Cookbook"
return wrapper
Then employ it.
In [14]: read_a_book = add_a_book(read_a_book)
In [15]: read_a_book()
Out[15]: 'I am reading the book: Python Cookbook'
Tada, you see, I amended read_a_book without touching it inner closure. Nothing stops me equipped with decorator.
What's about #
#add_a_book
def read_a_book():
return "I am reading the book: "
In [17]: read_a_book()
Out[17]: 'I am reading the book: Python Cookbook'
#add_a_book is a fancy and handy way to say read_a_book = add_a_book(read_a_book), it's a syntactic sugar, there's nothing more fancier about it.
If you are referring to some code in a python notebook which is using Numpy library, then # operator means Matrix Multiplication. For example:
import numpy as np
def forward(xi, W1, b1, W2, b2):
z1 = W1 # xi + b1
a1 = sigma(z1)
z2 = W2 # a1 + b2
return z2, a1
Decorators were added in Python to make function and method wrapping (a function that receives a function and returns an enhanced one) easier to read and understand. The original use case was to be able to define the methods as class methods or static methods on the head of their definition. Without the decorator syntax, it would require a rather sparse and repetitive definition:
class WithoutDecorators:
def some_static_method():
print("this is static method")
some_static_method = staticmethod(some_static_method)
def some_class_method(cls):
print("this is class method")
some_class_method = classmethod(some_class_method)
If the decorator syntax is used for the same purpose, the code is shorter and easier to understand:
class WithDecorators:
#staticmethod
def some_static_method():
print("this is static method")
#classmethod
def some_class_method(cls):
print("this is class method")
General syntax and possible implementations
The decorator is generally a named object ( lambda expressions are not allowed) that accepts a single argument when called (it will be the decorated function) and returns another callable object. "Callable" is used here instead of "function" with premeditation. While decorators are often discussed in the scope of methods and functions, they are not limited to them. In fact, anything that is callable (any object that implements the _call__ method is considered callable), can be used as a decorator and often objects returned by them are not simple functions but more instances of more complex classes implementing their own __call_ method.
The decorator syntax is simply only a syntactic sugar. Consider the following decorator usage:
#some_decorator
def decorated_function():
pass
This can always be replaced by an explicit decorator call and function reassignment:
def decorated_function():
pass
decorated_function = some_decorator(decorated_function)
However, the latter is less readable and also very hard to understand if multiple decorators are used on a single function.
Decorators can be used in multiple different ways as shown below:
As a function
There are many ways to write custom decorators, but the simplest way is to write a function that returns a subfunction that wraps the original function call.
The generic patterns is as follows:
def mydecorator(function):
def wrapped(*args, **kwargs):
# do some stuff before the original
# function gets called
result = function(*args, **kwargs)
# do some stuff after function call and
# return the result
return result
# return wrapper as a decorated function
return wrapped
As a class
While decorators almost always can be implemented using functions, there are some situations when using user-defined classes is a better option. This is often true when the decorator needs complex parametrization or it depends on a specific state.
The generic pattern for a nonparametrized decorator as a class is as follows:
class DecoratorAsClass:
def __init__(self, function):
self.function = function
def __call__(self, *args, **kwargs):
# do some stuff before the original
# function gets called
result = self.function(*args, **kwargs)
# do some stuff after function call and
# return the result
return result
Parametrizing decorators
In real code, there is often a need to use decorators that can be parametrized. When the function is used as a decorator, then the solution is simple—a second level of wrapping has to be used. Here is a simple example of the decorator that repeats the execution of a decorated function the specified number of times every time it is called:
def repeat(number=3):
"""Cause decorated function to be repeated a number of times.
Last value of original function call is returned as a result
:param number: number of repetitions, 3 if not specified
"""
def actual_decorator(function):
def wrapper(*args, **kwargs):
result = None
for _ in range(number):
result = function(*args, **kwargs)
return result
return wrapper
return actual_decorator
The decorator defined this way can accept parameters:
>>> #repeat(2)
... def foo():
... print("foo")
...
>>> foo()
foo
foo
Note that even if the parametrized decorator has default values for its arguments, the parentheses after its name is required. The correct way to use the preceding decorator with default arguments is as follows:
>>> #repeat()
... def bar():
... print("bar")
...
>>> bar()
bar
bar
bar
Finally lets see decorators with Properties.
Properties
The properties provide a built-in descriptor type that knows how to link an attribute to a set of methods. A property takes four optional arguments: fget , fset , fdel , and doc . The last one can be provided to define a docstring that is linked to the attribute as if it were a method. Here is an example of a Rectangle class that can be controlled either by direct access to attributes that store two corner points or by using the width , and height properties:
class Rectangle:
def __init__(self, x1, y1, x2, y2):
self.x1, self.y1 = x1, y1
self.x2, self.y2 = x2, y2
def _width_get(self):
return self.x2 - self.x1
def _width_set(self, value):
self.x2 = self.x1 + value
def _height_get(self):
return self.y2 - self.y1
def _height_set(self, value):
self.y2 = self.y1 + value
width = property(
_width_get, _width_set,
doc="rectangle width measured from left"
)
height = property(
_height_get, _height_set,
doc="rectangle height measured from top"
)
def __repr__(self):
return "{}({}, {}, {}, {})".format(
self.__class__.__name__,
self.x1, self.y1, self.x2, self.y2
)
The best syntax for creating properties is using property as a decorator. This will reduce the number of method signatures inside of the class
and make code more readable and maintainable. With decorators the above class becomes:
class Rectangle:
def __init__(self, x1, y1, x2, y2):
self.x1, self.y1 = x1, y1
self.x2, self.y2 = x2, y2
#property
def width(self):
"""rectangle height measured from top"""
return self.x2 - self.x1
#width.setter
def width(self, value):
self.x2 = self.x1 + value
#property
def height(self):
"""rectangle height measured from top"""
return self.y2 - self.y1
#height.setter
def height(self, value):
self.y2 = self.y1 + value
Starting with Python 3.5, the '#' is used as a dedicated infix symbol for MATRIX MULTIPLICATION (PEP 0465 -- see https://www.python.org/dev/peps/pep-0465/)
# can be a math operator or a DECORATOR but what you mean is a decorator.
This code:
def func(f):
return f
func(lambda :"HelloWorld")()
using decorators can be written like:
def func(f):
return f
#func
def name():
return "Hello World"
name()
Decorators can have arguments.
You can see this GeeksforGeeks post: https://www.geeksforgeeks.org/decorators-in-python/
It indicates that you are using a decorator. Here is Bruce Eckel's example from 2008.
Python decorator is like a wrapper of a function or a class. It’s still too conceptual.
def function_decorator(func):
def wrapped_func():
# Do something before the function is executed
func()
# Do something after the function has been executed
return wrapped_func
The above code is a definition of a decorator that decorates a function.
function_decorator is the name of the decorator.
wrapped_func is the name of the inner function, which is actually only used in this decorator definition. func is the function that is being decorated.
In the inner function wrapped_func, we can do whatever before and after the func is called. After the decorator is defined, we simply use it as follows.
#function_decorator
def func():
pass
Then, whenever we call the function func, the behaviours we’ve defined in the decorator will also be executed.
EXAMPLE :
from functools import wraps
def mydecorator(f):
#wraps(f)
def wrapped(*args, **kwargs):
print "Before decorated function"
r = f(*args, **kwargs)
print "After decorated function"
return r
return wrapped
#mydecorator
def myfunc(myarg):
print "my function", myarg
return "return value"
r = myfunc('asdf')
print r
Output :
Before decorated function
my function asdf
After decorated function
return value
To say what others have in a different way: yes, it is a decorator.
In Python, it's like:
Creating a function (follows under the # call)
Calling another function to operate on your created function. This returns a new function. The function that you call is the argument of the #.
Replacing the function defined with the new function returned.
This can be used for all kinds of useful things, made possible because functions are objects and just necessary just instructions.
# symbol is also used to access variables inside a plydata / pandas dataframe query, pandas.DataFrame.query.
Example:
df = pandas.DataFrame({'foo': [1,2,15,17]})
y = 10
df >> query('foo > #y') # plydata
df.query('foo > #y') # pandas

return a list from class object

I am using multiprocessing module to generate 35 dataframes. I guess this will save my time. But the problem is that the class does not return anything. I expect the list of dataframes to be returned from self.dflist
Here is how to create dfnames list.
urls=[]
fnames=[]
dfnames=[]
for x in xrange(100,3600,100):
y = str(x)
i = y.zfill(4)
filename='DCHB_Town_Release_'+i+'.xlsx'
url = "http://www.censusindia.gov.in/2011census/dchb/"+filename
urls.append(url)
fnames.append(filename)
dfnames.append((filename, 'DCHB_Town_Release_'+i))
This is the class that uses the dfnames generated by above code.
import pandas as pd
import multiprocessing
class mydf1():
def __init__(self, dflist, jobs, dfnames):
self.dflist=list()
self.jobs=list()
self.dfnames=dfnames
def dframe_create(self, filename, dfname):
print 'abc', filename, dfname
dfname=pd.read_excel(filename)
self.dflist.append(dfname)
print self.dflist
return self.dflist
def mp(self):
for f,d in self.dfnames:
p = multiprocessing.Process(target=self.dframe_create, args=(f,d))
self.jobs.append(p)
p.start()
#return self.dflist
for j in self.jobs:
j.join()
print '%s.exitcode = %s' % (j.name, j.exitcode)
This class when called like this...
dflist=[]
jobs=[]
x=mydf1(dflist, jobs, dfnames)
y=x.mp()
Prints the self.dflist correctly. But does not return anything.
I can collect all datafarmes sequentially. But in order to save time, I need to use multiple processes simultaneously to generate and add dataframes to a list.
In your case I prefer to write as less code as possible and use Pool:
import pandas as pd
import logging
import multiprocessing
def dframe_create(filename):
try:
return pd.read_excel(filename)
except Exception as e:
logging.error("Something went wrong: %s", e, exc_info=1)
return None
p = multiprocessing.Pool()
excel_files = p.map(dframe_create, dfnames)
for f in excel_files:
if f is not None:
print 'Ready to work'
else:
print ':('
Prints the self.dflist correctly. But does not return anything.
That's because you don't have a return statement in the mp method, e.g.
def mp(self):
...
return self.dflist
It's not entirely clear what you're issue is, however, you have to take some care here in that you can't just pass objects/lists across processes. That's why you have special objects (which lock while they make modifications to a list), that way you don't get tripped up when two processes try to make a change at the same time (and you only get one update).
That is, you have to use multiprocessing's list.
class mydf1():
def __init__(self, dflist, jobs, dfnames):
self.dflist = multiprocessing.list() # perhaps should be multiprocessing.list(dflist or ())
self.jobs = list()
self.dfnames = dfnames
However you have a bigger problem: the whole point of multiprocessing is that they may run/finish out of order, so keeping two lists like this is doomed to fail. You should use a multiprocessing.dict that way the DataFrame is saved unambiguously with the filename.
class mydf1():
def __init__(self, dflist, jobs, dfnames):
self.dfdict = multiprocessing.dict()
...
def dframe_create(self, filename, dfname):
print 'abc', filename, dfname
df = pd.read_excel(filename)
self.dfdict[dfname] = df

Two Class instances in Python not different

I'm working on another data acquisition project, which has turned into an object oriented programming question. In “main” at the bottom of my code I make two instances of the Object DAQInput. When I wrote this, I thought my method .getData would refer to the taskHandle of the particular instance, but it does not. When I run, the code does the getData task with the first handle twice, so clearly I don’t really understand object oriented programming in Python. I’m sorry this code will not run without PyDAQmx and a National Instruments board attached.
from PyDAQmx import *
import numpy
class DAQInput:
# Declare variables passed by reference
taskHandle = TaskHandle()
read = int32()
data = numpy.zeros((10000,),dtype=numpy.float64)
sumi = [0,0,0,0,0,0,0,0,0,0]
def __init__(self, num_data, num_chan, channel, high, low):
""" This is init function that opens the channel"""
#Get the passed variables
self.num_data = num_data
self.channel = channel
self.high = high
self.low = low
self.num_chan = num_chan
# Create a task and configure a channel
DAQmxCreateTask(b"",byref(self.taskHandle))
DAQmxCreateAIThrmcplChan(self.taskHandle, self.channel, b"",
self.low, self.high,
DAQmx_Val_DegC,
DAQmx_Val_J_Type_TC,
DAQmx_Val_BuiltIn, 0, None)
# Start the task
DAQmxStartTask(self.taskHandle)
def getData(self):
""" This function gets the data from the board and calculates the average"""
print(self.taskHandle)
DAQmxReadAnalogF64(self.taskHandle, self.num_data, 10,
DAQmx_Val_GroupByChannel, self.data, 10000,
byref(self.read), None)
# Calculate the average of the values in data (could be several channels)
i = self.read.value
for j in range(self.num_chan):
self.sumi[j] = numpy.sum(self.data[j*i:(j+1)*i])/self.read.value
return self.sumi
def killTask(self):
""" This function kills the tasks"""
# If the task is still alive kill it
if self.taskHandle != 0:
DAQmxStopTask(self.taskHandle)
DAQmxClearTask(self.taskHandle)
if __name__ == '__main__':
myDaq1 = DAQInput(1, 4, b"cDAQ1Mod1/ai0:3", 200.0, 10.0)
myDaq2 = DAQInput(1, 4, b"cDAQ1Mod2/ai0:3", 200.0, 10.0)
result = myDaq1.getData()
print (result[0:4])
result2 = myDaq2.getData()
print (result2[0:4])
myDaq1.killTask()
myDaq2.killTask()
These variables:
class DAQInput:
# Declare variables passed by reference
taskHandle = TaskHandle()
read = int32()
data = numpy.zeros((10000,),dtype=numpy.float64)
sumi = [0,0,0,0,0,0,0,0,0,0]
Are class variables. They belong to the class itself and are shared among instances of the class (i.e. if you modify self.data in Instance1, Instace2's self.data is modified as well).
If you want them to be instance variables, define them in __init__.

"Pythonic" way to "reset" an object's variables?

("variables" here refers to "names", I think, not completely sure about the definition pythonistas use)
I have an object and some methods. These methods all need and all change the object's variables. How can I, in the most pythonic and in the best, respecting the techniques of OOP, way achieve to have the object variables used by the methods but also keep their original values for the other methods?
Should I copy the object everytime a method is called? Should I save the original values and have a reset() method to reset them everytime a method needs them? Or is there an even better way?
EDIT: I was asked for pseudocode. Since I am more interested in understanding the concept rather than just specifically solving the problem I am encountering I am going to try give an example:
class Player():
games = 0
points = 0
fouls = 0
rebounds = 0
assists = 0
turnovers = 0
steals = 0
def playCupGame(self):
# simulates a game and then assigns values to the variables, accordingly
self.points = K #just an example
def playLeagueGame(self):
# simulates a game and then assigns values to the variables, accordingly
self.points = Z #just an example
self.rebounds = W #example again
def playTrainingGame(self):
# simulates a game and then assigns values to the variables, accordingly
self.points = X #just an example
self.rebounds = Y #example again
The above is my class for a Player object (for the example assume he is a basketball one). This object has three different methods that all assign values to the players' statistics.
So, let's say the team has two league games and then a cup game. I'd have to make these calls:
p.playLeagueGame()
p.playLeagueGame()
p.playCupGame()
It's obvious that when the second and the third calls are made, the previously changed statistics of the player need to be reset. For that, I can either write a reset method that sets all the variables back to 0, or copy the object for every call I make. Or do something completely different.
That's where my question lays, what's the best approach, python and oop wise?
UPDATE: I am suspicious that I have superovercomplicated this and I can easily solve my problem by using local variables in the functions. However, what happens if I have a function inside another function, can I use locals of the outer one inside the inner one?
Not sure if it's "Pythonic" enough, but you can define a "resettable" decorator
for the __init__ method that creates a copy the object's __dict__ and adds a reset() method that switches the current __dict__ to the original one.
Edit - Here's an example implementation:
def resettable(f):
import copy
def __init_and_copy__(self, *args, **kwargs):
f(self, *args)
self.__original_dict__ = copy.deepcopy(self.__dict__)
def reset(o = self):
o.__dict__ = o.__original_dict__
self.reset = reset
return __init_and_copy__
class Point(object):
#resettable
def __init__(self, x, y):
self.x = x
self.y = y
def __str__(self):
return "%d %d" % (self.x, self.y)
class LabeledPoint(Point):
#resettable
def __init__(self, x, y, label):
self.x = x
self.y = y
self.label = label
def __str__(self):
return "%d %d (%s)" % (self.x, self.y, self.label)
p = Point(1, 2)
print p # 1 2
p.x = 15
p.y = 25
print p # 15 25
p.reset()
print p # 1 2
p2 = LabeledPoint(1, 2, "Test")
print p2 # 1 2 (Test)
p2.x = 3
p2.label = "Test2"
print p2 # 3 2 (Test2)
p2.reset()
print p2 # 1 2 (Test)
Edit2: Added a test with inheritance
I'm not sure about "pythonic", but why not just create a reset method in your object that does whatever resetting is required? Call this method as part of your __init__ so you're not duplicating the data (ie: always (re)initialize it in one place -- the reset method)
I would create a default dict as a data member with all of the default values, then do __dict__.update(self.default) during __init__ and then again at some later point to pull all the values back.
More generally, you can use a __setattr__ hook to keep track of every variable that has been changed and later use that data to reset them.
Sounds like you want to know if your class should be an immutable object. The idea is that, once created, an immutable object can't/should't/would't be changed.
On Python, built-in types like int or tuple instances are immutable, enforced by the language:
>>> a=(1, 2, 3, 1, 2, 3)
>>> a[0] = 9
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
As another example, every time you add two integers a new instance is created:
>>> a=5000
>>> b=7000
>>> d=a+b
>>> d
12000
>>> id(d)
42882584
>>> d=a+b
>>> id(d)
42215680
The id() function returns the address of the int object 12000. And every time we add a+b a new 12000 object instance is created.
User defined immutable classes must be enforced manually, or simply done as a convention with a source code comment:
class X(object):
"""Immutable class. Don't change instance variables values!"""
def __init__(self, *args):
self._some_internal_value = ...
def some_operation(self, arg0):
new_instance = X(arg0 + ...)
new_instance._some_internal_operation(self._some_internal_value, 42)
return new_instance
def _some_internal_operation(self, a, b):
"""..."""
Either way, it's OK to create a new instance for every operation.
See the Memento Design Pattern if you want to restore previous state, or the Proxy Design Pattern if you want the object to seem pristine, as if just created. In any case, you need to put something between what's referenced, and it's state.
Please comment if you need some code, though I'm sure you'll find plenty on the web if you use the design pattern names as keywords.
# The Memento design pattern
class Scores(object):
...
class Player(object):
def __init__(self,...):
...
self.scores = None
self.history = []
self.reset()
def reset(self):
if (self.scores):
self.history.append(self.scores)
self.scores = Scores()
It sounds like overall your design needs some reworking. What about a PlayerGameStatistics class that would keep track of all that, and either a Player or a Game would hold a collection of these objects?
Also the code you show is a good start, but could you show more code that interacts with the Player class? I'm just having a hard time seeing why a single Player object should have PlayXGame methods -- does a single Player not interact with other Players when playing a game, or why does a specific Player play the game?
A simple reset method (called in __init__ and re-called when necessary) makes a lot of sense. But here's a solution that I think is interesting, if a bit over-engineered: create a context manager. I'm curious what people think about this...
from contextlib import contextmanager
#contextmanager
def resetting(resettable):
try:
resettable.setdef()
yield resettable
finally:
resettable.reset()
class Resetter(object):
def __init__(self, foo=5, bar=6):
self.foo = foo
self.bar = bar
def setdef(self):
self._foo = self.foo
self._bar = self.bar
def reset(self):
self.foo = self._foo
self.bar = self._bar
def method(self):
with resetting(self):
self.foo += self.bar
print self.foo
r = Resetter()
r.method() # prints 11
r.method() # still prints 11
To over-over-engineer, you could then create a #resetme decorator
def resetme(f):
def rf(self, *args, **kwargs):
with resetting(self):
f(self, *args, **kwargs)
return rf
So that instead of having to explicitly use with you could just use the decorator:
#resetme
def method(self):
self.foo += self.bar
print self.foo
I liked (and tried) the top answer from PaoloVictor. However, I found that it "reset" itself, i.e., if you called reset() a 2nd time it would throw an exception.
I found that it worked repeatably with the following implementation
def resettable(f):
import copy
def __init_and_copy__(self, *args, **kwargs):
f(self, *args, **kwargs)
def reset(o = self):
o.__dict__ = o.__original_dict__
o.__original_dict__ = copy.deepcopy(self.__dict__)
self.reset = reset
self.__original_dict__ = copy.deepcopy(self.__dict__)
return __init_and_copy__
It sounds to me like you need to rework your model to at least include a separate "PlayerGameStats" class.
Something along the lines of:
PlayerGameStats = collections.namedtuple("points fouls rebounds assists turnovers steals")
class Player():
def __init__(self):
self.cup_games = []
self.league_games = []
self.training_games = []
def playCupGame(self):
# simulates a game and then assigns values to the variables, accordingly
stats = PlayerGameStats(points, fouls, rebounds, assists, turnovers, steals)
self.cup_games.append(stats)
def playLeagueGame(self):
# simulates a game and then assigns values to the variables, accordingly
stats = PlayerGameStats(points, fouls, rebounds, assists, turnovers, steals)
self.league_games.append(stats)
def playTrainingGame(self):
# simulates a game and then assigns values to the variables, accordingly
stats = PlayerGameStats(points, fouls, rebounds, assists, turnovers, steals)
self.training_games.append(stats)
And to answer the question in your edit, yes nested functions can see variables stored in outer scopes. You can read more about that in the tutorial: http://docs.python.org/tutorial/classes.html#python-scopes-and-namespaces
thanks for the nice input, as I had kind of a similar problem. I'm solving it with a hook on the init method, since I'd like to be able to reset to whatever initial state an object had. Here's my code:
import copy
_tool_init_states = {}
def wrap_init(init_func):
def init_hook(inst, *args, **kws):
if inst not in _tool_init_states:
# if there is a class hierarchy, only the outer scope does work
_tool_init_states[inst] = None
res = init_func(inst, *args, **kws)
_tool_init_states[inst] = copy.deepcopy(inst.__dict__)
return res
else:
return init_func(inst, *args, **kws)
return init_hook
def reset(inst):
inst.__dict__.clear()
inst.__dict__.update(
copy.deepcopy(_tool_init_states[inst])
)
class _Resettable(type):
"""Wraps __init__ to store object _after_ init."""
def __new__(mcs, *more):
mcs = super(_Resetable, mcs).__new__(mcs, *more)
mcs.__init__ = wrap_init(mcs.__init__)
mcs.reset = reset
return mcs
class MyResettableClass(object):
__metaclass__ = Resettable
def __init__(self):
self.do_whatever = "you want,"
self.it_will_be = "resetted by calling reset()"
To update the initial state, you could build some method like reset(...) that writes data into _tool_init_states. I hope this helps somebody. If this is possible without a metaclass, please let me know.