Is it possible to add two lists using a reference of each list instead of a copy?
For example -
first_list = [1,2,3]
second_list = [5,6]
new_list = first_list + second_list
print(new_list) # Will print [1,2,3,5,6]
first_list.append(4)
print(new_list) # Should print [1,2,3,4,5,6]
Is there a way to do this in Python? Or is code re-write my only option?
Edit: I removed confusing comments I made about using C++ to do this.
You can't directly do this in Python any more than you can in C++.
But you can indirectly do it in Python the exact same way you can in C++: by writing an object that holds onto both lists and dispatches appropriately.
For example:
import collections.abc

class TwoLists(collections.abc.Sequence):
    def __init__(self, a, b):
        self.a, self.b = a, b
    def __len__(self):
        return len(self.a) + len(self.b)
    def __getitem__(self, idx):
        # cheating a bit, not handling slices or negative indexing
        if idx < len(self.a):
            return self.a[idx]
        else:
            return self.b[idx - len(self.a)]
Now:
>>> first_list = [1,2,3]
>>> second_list = [5,6]
>>> two_lists = TwoLists(first_list, second_list)
>>> print(*two_lists)
1 2 3 5 6
>>> first_list.append(4)
>>> print(*two_lists)
1 2 3 4 5 6
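If you do need the slices and negative indexes that the class above skips, a minimal sketch could look like the following. TwoListsView is a hypothetical name, and returning a plain list snapshot for slices (rather than another live view) is my own assumption about what is wanted:

```python
from collections.abc import Sequence


class TwoListsView(Sequence):
    """Read-only live view over two lists (hypothetical extension of TwoLists)."""

    def __init__(self, a, b):
        self.a, self.b = a, b

    def __len__(self):
        return len(self.a) + len(self.b)

    def __getitem__(self, idx):
        if isinstance(idx, slice):
            # Assumption: a slice returns a list snapshot, not a live view.
            return [self[i] for i in range(*idx.indices(len(self)))]
        if idx < 0:
            idx += len(self)  # translate a negative index
        if not 0 <= idx < len(self):
            raise IndexError("index out of range")
        if idx < len(self.a):
            return self.a[idx]
        return self.b[idx - len(self.a)]


first_list = [1, 2, 3]
second_list = [5, 6]
view = TwoListsView(first_list, second_list)
first_list.append(4)
print(view[-1], view[1:5])  # 6 [2, 3, 4, 5]
```

The explicit IndexError matters: the iteration that Sequence provides stops only when __getitem__ raises it.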
What I think you were missing here is a fundamental distinction between Python and C++ in how variables work. Briefly, every Python variable (and attribute and list position and so on) is, in C++ terms, a reference variable.
Less misleadingly:
C++ variables (and attributes, etc.) are memory locations—they're where values live. If you want a to be a reference to the value in b, you have to make a reference-to-b value, and store that in a. (C++ has a bit of magic that lets you define a reference variable like int& a = b, but you can't later reassign a to refer to c; if you want that, you have to explicitly use pointers, C-style.)
Python variables (and etc.) are names for values, while the values live wherever they want to. If you want a to be a reference to the value in b, you just bind a to the same value b is bound to: a = b. (And, unlike C++, you can reassign a = c at any time.)
Of course the cost is performance: there's an extra indirection to reach any value from its name in Python, while in C++, that only happens when you use pointer variables. But that cost is pretty much always invisible compared to the other overhead of Python (interpreting bytecode and dynamically looking names up in dictionaries and so on), so it makes sense for a high-level language to just not give you the choice.
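As a rough illustration of the Python half of that comparison (the variable names are just for the example), binding a second name to a list copies nothing, and rebinding that name later leaves the original object untouched:

```python
b = [1, 2, 3]
a = b              # bind a to the same list object that b names
a.append(4)        # mutate the shared object through a
print(b)           # [1, 2, 3, 4]
print(a is b)      # True

c = [9]
a = c              # rebind a to another object; b is unaffected
print(b)           # [1, 2, 3, 4]
print(a is c)      # True
```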
All that being said, there's usually not a good reason to do this in either language. Both Python, and the C++ standard library, are designed around (similar, but different notions of) iteration.
In Python, you usually don't actually need a sequence, just an iterable. And to chain two iterables together is trivial:
>>> from itertools import chain
>>> first_list = [1,2,3]
>>> second_list = [5,6]
>>> print(*chain(first_list, second_list))
1 2 3 5 6
>>> first_list.append(4)
>>> print(*chain(first_list, second_list))
1 2 3 4 5 6
Yes, I can only iterate over the chain once, but usually that's all you need. (Just as in C++ you usually only need to loop from begin(c) to end(c), not to build a new persistent object that holds onto them.)
And if you think that's cheating because I'm using itertools, we can define it ourselves:
def chain(*its):
    for it in its:
        yield from it
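If iterating only once turns out to be a problem, one possible workaround is a small wrapper whose __iter__ builds a fresh chain each time. ChainView is a hypothetical name, and this is only a sketch of the idea, not something itertools provides:

```python
from itertools import chain


class ChainView:
    """Hypothetical helper: a re-iterable wrapper around itertools.chain."""

    def __init__(self, *iterables):
        self.iterables = iterables

    def __iter__(self):
        # A new chain per iteration, so the view can be reused and always
        # reflects the current contents of the underlying lists.
        return chain(*self.iterables)


first_list = [1, 2, 3]
second_list = [5, 6]
both = ChainView(first_list, second_list)
print(*both)   # 1 2 3 5 6
first_list.append(4)
print(*both)   # 1 2 3 4 5 6
```

This only helps when the wrapped objects are themselves re-iterable (lists, tuples); wrapping a generator would still exhaust it after one pass.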
How would I print a list of strings as their individual variable values?
For example, take this code:
a=1
b=2
c=3
text="abc"
splittext = text.split(text)
print(splittext)
How would I get this to output 123?
You could do this using eval, but it is very dangerous:
>>> ''.join(map(lambda x : str(eval(x)),text))
'123'
eval (perhaps it should be renamed evil; no hard feelings, just take it as a warning) evaluates a string as if you had written that code there yourself. So eval('a') fetches the value of a. The problem is that an attacker could find a way to inject arbitrary code through it and thus compromise your server, program, etc. It can also change the state of your program by accident. So the advice is "Do not use it, unless you have absolutely no other choice" (which is not the case here).
Or a less dangerous variant:
>>> ''.join(map(lambda x : str(globals()[x]),text))
'123'
in case these are global variables (you can use locals() for local variables).
This is ugly and dangerous, because you do not know in advance what a, b and c are, nor do you have much control over which part of the program can set these variables. So it can perhaps allow code injection. As suggested in the comments on your question, you are better off using a dictionary for that.
Dictionary approach
A better way to do this is using a dictionary (as @Ignacio Vazquez-Abrams was saying):
>>> dic = {'a':1,'b': 2,'c':3}
>>> ''.join(map(lambda x : str(dic[x]),text))
'123'
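To make the difference between eval and a dictionary lookup concrete, here is a small sketch. The untrusted string is a made-up stand-in for attacker-supplied input, and log stands in for any program state you would not want changed:

```python
log = []                                 # stands in for program state
values = {'a': 1, 'b': 2, 'c': 3}

untrusted = "log.append('injected')"     # hypothetical attacker-supplied text

# eval runs the text as code, so it can change program state.
eval(untrusted, {"log": log})
print(log)                               # ['injected']

# The dictionary only knows its own keys, so the same text is rejected.
try:
    values[untrusted]
except KeyError:
    print("rejected")                    # rejected
```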
List instead of string
In the above we converted the content to a string using str in the lambda-expression and used ''.join to concatenate these strings. If you are however interested in an array of "results", you can drop these constructs. For instance:
>>> list(map(lambda x : dic[x],text))
[1, 2, 3]
The same works for all the above examples.
EDIT
I later noticed that you want to print the values; this is easily done with a plain for loop:
for x in text:
    print(dic[x])
Again, you can use the same technique for the above cases.
In case you want to print out the value of the variables named in the string you can use locals (or globals, depending on what/where you want them)
>>> a=1
>>> b=2
>>> c=3
>>> s='abc'
>>> for v in s:
...     print(locals()[v])
...
1
2
3
or, if you use separators in the string
>>> s='a,b,c'
>>> for v in s.split(','):
...     print(locals()[v])
...
1
2
3
I have been a Perl user for many years and started Python recently.
I learned that there is always "one obvious way" to do certain things. I wish to check the "one" way to translate my coding style in Perl below into Python. Thanks!
The objective is to:
Detect the existence of the pattern
If found, extract certain portion of the pattern
Perl:
if ($str =~ /my(pat1)and(pat2)/) {
my ($var1, $var2) = ($1, $2);
}
As far as I have learned Python, below is how I am coding it now. It seems to take more steps than Perl, which is why I have doubts about my Python code.
mySearch = re.search(r'my(pat1)and(pat2)', str)
if mySearch:
    var1 = mySearch.group(1)
    var2 = mySearch.group(2)
Python doesn't prioritize pattern matching and string manipulation like Perl does. These are analogous patterns, and yeah, Python's is longer (it's also got a lot of great things going for it, like the fact it's OOP and doesn't use weird magical global variables).
For the record though, you can use tuple unpacking to make this more succinct:
var1, var2 = mySearch.groups()
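On Python 3.8 or later, an assignment expression can also fold the search and the test into one line, which gets a little closer to the Perl shape. A minimal sketch, with a made-up sample string:

```python
import re

text = "mypat1andpat2"   # hypothetical sample input

# Python 3.8+: bind the match object and test it in the same expression.
if (m := re.search(r'my(pat1)and(pat2)', text)):
    var1, var2 = m.groups()
    print(var1, var2)    # pat1 pat2
```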
Update:
Tuple unpacking
Tuple unpacking is a useful feature in Python. To understand it, let's first ask, what is a tuple. A tuple is, at its heart, an immutable sequence -- unlike with a list, you cannot append or pop or any of that stuff. Syntactically, it's very simple to declare a tuple -- it's just a few values separated by commas.
my_tuple = "I", "am", "awesome"
my_tuple[0] # "I"
my_tuple[1] # "am"
my_tuple[2] # "awesome"
People often think a tuple is in fact defined by surrounding parentheses -- my_tuple = ("I", "am", "awesome") -- but this is wrong; the parentheses are only useful insofar as they clarify or enforce a certain order of operations.
Tuple unpacking is one of the sweetest features in Python. You define a tuple data structure containing undefined names on the left, and you unpack the iterable on the right into it. The right side can contain any kind of iterable, but the shape of its contained data must exactly match the tuple structure of names on the left.
# some_var and other_var are both undefined
print(some_var) # NameError: some_var is undefined
print(other_var) # NameError: other_var is undefined
my_iterable = ["so", "cool"]
# note that 'some_var, other_var' looks a whole lot like a tuple
some_var, other_var = my_iterable
print(some_var) # "so"
print(other_var) # "cool"
Again, we don't need a list on the right but any kind of iterable -- for example, a generator:
def some_generator():
    yield 1
    yield 2
    yield 3
a, b, c = some_generator()
print(a) # 1
print(b) # 2
print(c) # 3
You can even do tuple unpacking with nested data structures.
nested_list = [1, [2, 3], 4]
# note that parentheses are necessary here to delimit tuples
a, (b, c), d = nested_list
If the iterable on the right doesn't match the pattern on the left, things blow up:
# THESE EXAMPLES DON'T WORK
a, b = [1, 2, 3] # ValueError: too many values to unpack
a, b = [] # ValueError: need more than 0 values to unpack
Actually, this noisy failure makes tuple unpacking my favorite way to get an item from an iterable when I think that iterable should only have one item in it and I want my code to fail if it has more than one.
# note that the left side below is how you define a tuple of one
bank_statement, = bank_statements # we def want to blow up if too many statements
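Wrapped in a function, that single-item trick might look like the sketch below. The helper name only and the file names are invented for the example:

```python
def only(items):
    """Hypothetical helper: return the single item, or raise ValueError."""
    (item,) = items       # fails unless items yields exactly one value
    return item


print(only(["march.pdf"]))          # march.pdf

try:
    only(["march.pdf", "april.pdf"])
except ValueError as exc:
    print("refused:", exc)
```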
Multiple Assignment
What people think of as multiple assignment is actually just plain tuple unpacking.
a, b = 1, 2
print(a) # 1
print(b) # 2
This is nothing special. The interpreter evaluates the right hand side of the assignment as a tuple -- remember, a tuple is just values (literals, variables, evaluated function calls, whatever) separated by commas -- and then the interpreter matches it against the left side, just like it did with all the examples above.
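The familiar swap idiom is a quick way to see this in action, since the right side has to be built as a tuple before anything on the left is rebound:

```python
a, b = 1, 2
a, b = b, a                  # the right side (2, 1) is built first, then unpacked
print(a, b)                  # 2 1

pair = 1, 2                  # the same kind of right-hand side, kept as a tuple
print(type(pair).__name__)   # tuple
```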
Bringing it on home
I wrote this to explain the two different answers you were getting for this problem:
var1, var2 = mySearch.group(1), mySearch.group(2)
and
var1, var2 = mySearch.groups()
First, recognize that these two statements, for your situation -- where mySearch is a MatchObject resulting from a regex with two matching groups -- are entirely functionally equivalent.
They differ only very slightly in terms of the nature of the tuple unpacking. The first one declares a tuple on the right while the second uses the tuple returned by MatchObject.groups.
This does not really apply to your situation, but it might be useful to understand that MatchObject.group and MatchObject.groups have slightly different behavior. MatchObject.groups returns all the 'subgroups' -- i.e. capturing groups -- that the regex encounters while MatchObject.group returns an individual group and counts the entire pattern as a group accessible at 0.
In reality, for this situation, you should use whichever of these two you think is most expressive or clearest. I personally think mentioning groups 1 and 2 on the right side is redundant and I am constantly annoyed by the fact that MatchObject.group(0) returns the string matched by the entire pattern, thus offsetting all the 'subgroups' to one-indexing.
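A short session-style sketch of that difference, using the pattern from the question and a made-up input string:

```python
import re

# The sample input is invented for the example.
m = re.search(r'my(pat1)and(pat2)', 'xx mypat1andpat2 yy')

print(m.group(0))     # mypat1andpat2  (the whole match)
print(m.group(1))     # pat1
print(m.group(1, 2))  # ('pat1', 'pat2')
print(m.groups())     # ('pat1', 'pat2')  (capturing groups only)
```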
You can extract all groups at once and assign them to variables:
var1, var2 = mySearch.groups()
In Python you can assign several variables on a single line, using commas as delimiters.
var1, var2 = mySearch.group(1), mySearch.group(2)
Other answers mention tuple unpacking with groups(). That is the better choice if you want to extract the contents of all the captured groups into variables. If you want only particular groups, or want them in a different order, you need the method I mentioned.
var1, var2, var3 = mySearch.group(2), mySearch.group(3), mySearch.group(1)
Example:
>>> import re
>>> x = "foobarbuzzlorium"
>>> m = re.search(r'(foo)(bar.*)(lorium)', x)
>>> if m:
...     x, y = m.group(1), m.group(3)
...     print(x, y)
...
foo lorium
Consider:
operator.add(a, b)
I'm having trouble understanding what this does. An operator is something like +-*/, so what does operator.add(a, b) do and how would you use it in a program?
Operator functions let you pick operations dynamically.
They do the same thing as the operator, so operator.add(a, b) does the exact same thing as a + b, but you can now use these operators in abstract.
Take for example:
import operator, random
ops = [operator.add, operator.sub]
print(random.choice(ops)(10, 5))
The above code will randomly either add up or subtract the two numbers. Because the operators can be applied in function form, you can also store these functions in variables (lists, dictionaries, etc.) and use them indirectly, based on your code. You can pass them to map() or reduce() or partial, etc. etc. etc.
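For instance, here is a small sketch of the reduce, partial and map uses mentioned above (the numbers are arbitrary):

```python
import operator
from functools import partial, reduce

# Fold a list with an operator function instead of writing a lambda.
product = reduce(operator.mul, [1, 2, 3, 4])     # ((1 * 2) * 3) * 4
print(product)                                   # 24

# partial fixes the first argument, giving a one-argument "add 10" function.
add_ten = partial(operator.add, 10)
print(list(map(add_ten, [1, 2, 3])))             # [11, 12, 13]
```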
Because operator.add is a function that takes arguments, it is meant for situations where you cannot write a statement like a + b, such as passing it to map (or itertools.imap in Python 2). For better understanding, see the following example:
>>> import operator
>>> list(map(operator.add, [1, 3], [5, 5]))
[6, 8]
It does the same; it's just a function version of the operator, from the Python operator module. It returns the result, so you would just use it like this:
result = operator.add(a, b)
This is functionally equivalent to
result = a + b
It literally is how the + operator is defined. Look at the following example
class foo():
    def __init__(self, a):
        self.a = a
    def __add__(self, b):
        return self.a + b
>>> x = foo(5)
>>> x + 3
8
The + operator actually just calls the __add__ method of the class
The same thing happens for native Python types,
>>> 5 + 3
8
>>> operator.add(5,3)
8
Note that since I defined my __add__ method, I can also do
>>> operator.add(x, 3)
8
For the first part of your question, check out the source for operator.add. It does exactly what you'd expect: it adds two values together.
The answer to part two of your question is a little tricky.
They can be good for when you don't know what operator you'll need until run time. Like when the data file you're reading contains the operation as well as the values:
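import operator

# warning: not safe for untrusted input (getattr runs whatever name the file supplies)
total = 0
with open('./foo.dat') as fp:
    for line in fp:
        operation, first_val, second_val = line.split()
        total += getattr(operator, operation)(float(first_val), float(second_val))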
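If the data file is not fully trusted, one possible variation is to look the operation up in an explicit table instead of using getattr. In this sketch the ALLOWED table and the sample data are my own assumptions, and io.StringIO stands in for the file:

```python
import io
import operator

# Only these operation names are accepted; anything else raises KeyError.
ALLOWED = {'add': operator.add, 'sub': operator.sub, 'mul': operator.mul}

# io.StringIO stands in for the data file; the contents are made up.
data = io.StringIO("add 1 2\nmul 3 4\nsub 10 4\n")

total = 0.0
for line in data:
    operation, first_val, second_val = line.split()
    total += ALLOWED[operation](float(first_val), float(second_val))

print(total)   # 21.0
```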
Also, you might want to make your code cleaner or more efficient (subjective) by using the operator functions with the map built-in as the example shows in the Python docs:
orig_values = [1,2,3,4,5]
new_values = [5,4,3,2,1]
total = sum(map(operator.add, orig_values, new_values))
Those are both contrived examples, which usually means that you probably won't use these functions except in unusual situations. You should really know that you need them before you use them.
In Python, you can have a function return multiple values. Here's a contrived example:
def divide(x, y):
    quotient = x // y
    remainder = x % y
    return quotient, remainder
(q, r) = divide(22, 7)
This seems very useful, but it looks like it can also be abused ("Well..function X already computes what we need as an intermediate value. Let's have X return that value also").
When should you draw the line and define a different method?
Absolutely (for the example you provided).
Tuples are first class citizens in Python
There is a builtin function divmod() that does exactly that.
q, r = divmod(x, y) # ((x - x%y)/y, x%y) Invariant: div*y + mod == x
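A quick check of that invariant, including negative operands where the floor-division behavior is easy to misremember (the sample pairs are arbitrary):

```python
for x, y in [(22, 7), (-22, 7), (22, -7)]:
    q, r = divmod(x, y)
    assert q * y + r == x      # the invariant from the comment above
    print(x, y, q, r)

# Output:
# 22 7 3 1
# -22 7 -4 6
# 22 -7 -4 -6
```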
There are other examples: zip, enumerate, dict.items.
for i, e in enumerate([1, 3, 3]):
    print("index=%d, element=%s" % (i, e))
# reverse keys and values in a dictionary
d = dict((v, k) for k, v in adict.items()) # or
d = dict(zip(adict.values(), adict.keys()))
BTW, parentheses are not necessary most of the time.
Citation from Python Library Reference:
Tuples may be constructed in a number of ways:
Using a pair of parentheses to denote the empty tuple: ()
Using a trailing comma for a singleton tuple: a, or (a,)
Separating items with commas: a, b, c or (a, b, c)
Using the tuple() built-in: tuple() or tuple(iterable)
Functions should serve single purpose
Therefore they should return a single object. In your case this object is a tuple. Consider tuple as an ad-hoc compound data structure. There are languages where almost every single function returns multiple values (list in Lisp).
Sometimes it is sufficient to return (x, y) instead of Point(x, y).
Named tuples
With the introduction of named tuples in Python 2.6 it is preferable in many cases to return named tuples instead of plain tuples.
>>> import collections
>>> Point = collections.namedtuple('Point', 'x y')
>>> x, y = Point(0, 1)
>>> p = Point(x, y)
>>> x, y, p
(0, 1, Point(x=0, y=1))
>>> p.x, p.y, p[0], p[1]
(0, 1, 0, 1)
>>> for i in p:
... print(i)
...
0
1
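Applied to the divide example from the question, a named tuple might be used like this. DivResult is a hypothetical name chosen for the sketch:

```python
from collections import namedtuple

# DivResult is a hypothetical name chosen for this sketch.
DivResult = namedtuple('DivResult', 'quotient remainder')


def divide(x, y):
    return DivResult(x // y, x % y)


result = divide(22, 7)
q, r = result                               # still unpacks like a plain tuple
print(result.quotient, result.remainder)    # 3 1
```

Callers that only want positional unpacking keep working, while new callers can use the field names.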
Firstly, note that Python allows for the following (no need for the parentheses):
q, r = divide(22, 7)
Regarding your question, there's no hard and fast rule either way. For simple (and usually contrived) examples, it may seem that it's always possible for a given function to have a single purpose, resulting in a single value. However, when using Python for real-world applications, you quickly run into many cases where returning multiple values is necessary, and results in cleaner code.
So, I'd say do whatever makes sense, and don't try to conform to an artificial convention. Python supports multiple return values, so use it when appropriate.
The example you give is actually a python builtin function, called divmod. So someone, at some point in time, thought that it was pythonic enough to include in the core functionality.
To me, if it makes the code cleaner, it is pythonic. Compare these two code blocks:
seconds = 1234
minutes, seconds = divmod(seconds, 60)
hours, minutes = divmod(minutes, 60)

seconds = 1234
minutes = seconds // 60
seconds = seconds % 60
hours = minutes // 60
minutes = minutes % 60
Yes, returning multiple values (i.e., a tuple) is definitely pythonic. As others have pointed out, there are plenty of examples in the Python standard library, as well as in well-respected Python projects. Two additional comments:
Returning multiple values is sometimes very, very useful. Take, for example, a method that optionally handles an event (returning some value in doing so) and also returns success or failure. This might arise in a chain of responsibility pattern. In other cases, you want to return multiple, closely linked pieces of data---as in the example given. In this setting, returning multiple values is akin to returning a single instance of an anonymous class with several member variables.
Python's handling of method arguments necessitates the ability to directly return multiple values. In C++, for example, method arguments can be passed by reference, so you can assign output values to them, in addition to the formal return value. In Python, arguments are passed "by reference" (but in the sense of Java, not C++). You can't assign new values to method arguments and have it reflected outside method scope. For example:
// C++
void test(int& arg)
{
    arg = 1;
}
int foo = 0;
test(foo); // foo is now 1!
Compare with:
# Python
def test(arg):
    arg = 1
foo = 0
test(foo) # foo is still 0
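A slightly longer sketch may help separate the three cases: rebinding a parameter, mutating the object it refers to, and returning several values instead. The function names here are invented for the example:

```python
def rebind(arg):
    arg = 1                   # rebinds the local name only


def mutate(items):
    items.append(1)           # changes the object the caller also refers to


def next_value(arg):
    return arg + 1, arg > 0   # hand results back as a tuple instead


foo = 0
rebind(foo)
print(foo)                    # 0

numbers = []
mutate(numbers)
print(numbers)                # [1]

foo, was_positive = next_value(foo)
print(foo, was_positive)      # 1 False
```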
It's definitely pythonic. The fact that you can return multiple values from a function saves you the boilerplate you would have in a language like C, where you need to define a struct for every combination of types you return somewhere.
However, if you reach the point where you are returning something crazy like 10 values from a single function, you should seriously consider bundling them in a class because at that point it gets unwieldy.
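One way to do that bundling on Python 3.7 or later is a dataclass. The Stats type and summarize function below are hypothetical, just to show the shape:

```python
from dataclasses import dataclass


@dataclass
class Stats:                 # hypothetical result type for illustration
    count: int
    total: float
    mean: float


def summarize(values):
    total = sum(values)
    return Stats(count=len(values), total=total, mean=total / len(values))


stats = summarize([2, 4, 6])
print(stats.mean)            # 4.0
```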
Returning a tuple is cool. Also note the new namedtuple, which was added in Python 2.6 and may make this more palatable for you:
http://docs.python.org/dev/library/collections.html#collections.namedtuple
OT: RSRE's Algol68 has the curious "/:=" operator. eg.
INT quotient:=355, remainder;
remainder := (quotient /:= 113);
Giving a quotient of 3, and a remainder of 16.
Note: typically the value of "(x/:=y)" is discarded as quotient "x" is assigned by reference, but in RSRE's case the returned value is the remainder.
c.f. Integer Arithmetic - Algol68
It's fine to return multiple values using a tuple for simple functions such as divmod. If it makes the code readable, it's Pythonic.
If the return value starts to become confusing, check whether the function is doing too much and split it if it is. If a big tuple is being used like an object, make it an object. Also, consider using named tuples, which will be part of the standard library in Python 2.6.
I'm fairly new to Python, but the tuple technique seems very pythonic to me. However, I've had another idea that may enhance readability. Using a dictionary allows access to the different values by name rather than position. For example:
def divide(x, y):
    return {'quotient': x // y, 'remainder': x % y}
answer = divide(22, 7)
print(answer['quotient'])
print(answer['remainder'])