Add string to another string - python

I recently ran into a problem:
I want to handle adding strings to other strings very efficiently, so I looked up many methods and techniques and found what is supposedly the "fastest" method.
But I can't quite understand how it actually works:
def method6():
    return ''.join([`num` for num in xrange(loop_count)])
From source (Method 6)
The ([`num` for num in xrange(loop_count)]) part in particular confused me completely.

It's a list comprehension that uses backticks for repr conversion. Don't do this. Backticks are deprecated and were removed in py3k, and the more Pythonic way is not to build an intermediate list at all but to use a generator expression (whether that is also faster is worth measuring):
''.join(str(num) for num in xrange(loop_count)) # use range in py3k

xrange() is the lazy counterpart of range() in Python 2: it yields the numbers one at a time instead of building the whole list in memory first.
Backtick notation, `num`, converts a value to a string and is the same as repr(num); for integers that gives the same result as str(num).
[x for x in y] is called a list comprehension, and is basically a one-line for loop that returns a list as its result. So all together, your code is semantically equivalent to the following, but usually faster, because the list comprehension avoids the repeated append calls and xrange avoids building a temporary list:
def method6():
    z = []
    for i in range(loop_count):
        z.append(str(i))
    return "".join(z)

That bit in the brackets is a list comprehension, arguably one of the most powerful elements of Python. It produces a list from an iteration. You may want to look up its documentation. Using backticks to convert num to a string is not recommended; use str(num) instead.
join() is a method of the string class. It takes an iterable of strings and returns a single string consisting of each component string separated by "self" (that is, the string the method is called on). The trick here is that join() is being called directly on the string literal '', which is allowed in Python. What this code will do is produce a string consisting of the string form of each element of xrange(loop_count) with no separation.
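To make the separator behaviour concrete, here is a minimal sketch. It is written for Python 3, so range stands in for xrange, and the variable names are only illustrative:

```python
# The string that join() is called on is placed between the elements.
words = ["a", "b", "c"]
comma_separated = ", ".join(words)   # 'a, b, c'
no_separator = "".join(words)        # 'abc'

# join() only accepts strings, so numbers must be converted first.
digits = "".join(str(num) for num in range(5))  # '01234'
print(comma_separated, no_separator, digits)
```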

First of all: while this code is still correct in the 2.x series of Python, it is a bit confusing and can be written differently:
def method6a():
    return ''.join(str(num) for num in xrange(loop_count))
In Python 2.x, the backticks can be used instead of the repr function. The expression within the square brackets [] is a list comprehension. In case you are new to list comprehensions: they work like a combination of a loop and a list append-statement, only that you don't have to invent a name for a variable:
These two are equivalent:
a = [repr(num) for num in xrange(loop_count)]
# <=>
a = []
for num in xrange(loop_count):
    a.append(repr(num))
As a result, the list comprehension will produce a list containing the string form of every number from 0 to loop_count (exclusive).
Finally, string.join(iterable) concatenates all of the strings in iterable, using string as the separator between each element. If you use the empty string as the separator, then all elements are concatenated without anything between them. This is exactly what you wanted: a concatenation of all of the numbers from 0 up to loop_count.
As for my modifications:
I used str instead of repr because the result is the same for all ints and it is easier to read.
I am using a generator expression instead of a list comprehension because the list built by the list comprehension is unnecessary and gets garbage collected anyway. Generator expressions are iterable, but they don't need to store all elements of the list. Of course, if you already have a list of strings, then simply pass the list to the join.
Generally, the ''.join(iterable) idiom is well understood by most Python programmers to mean "string concatenation of any list of strings", so understandability shouldn't be an issue.
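As a quick check of the two modifications described above, the following sketch compares the list-comprehension/repr form with the generator-expression/str form. It assumes Python 3 (range instead of xrange), and loop_count is given an arbitrary small value for illustration:

```python
loop_count = 5  # illustrative value; the original benchmark defines its own

from_list = "".join([repr(num) for num in range(loop_count)])
from_generator = "".join(str(num) for num in range(loop_count))
print(from_list == from_generator)  # True

# str and repr agree for ints but not for every type, e.g. strings:
print(str("a"), repr("a"))  # a 'a'
```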

Related

Why don't sequence for loops seem to work well in Python?

I'm trying to convert every string in a list to its lowercase form using this function:
def lower_list(strings):
    for string in strings:
        string = string.lower()
    return strings
But this implementation does not work. However, when I use the range function and iterate using an index:
def lower_list(strings):
    for i in range(len(strings)):
        strings[i] = strings[i].lower()
    return strings
I do get every element on my list converted to lowercase:
> print(lower_list(mylist))
['oh brother where art thou', 'hello dolly', 'monsters inc', 'fargo']
But with the first implementation I get the original list with the uppercase values. Am I missing something important in how the for loop works?
In the first case, all you are doing is storing the lowercase value in a variable, but the list is untouched.
In the second case, you are actually updating the value in the list at that index.
You can also use map with a lambda function here:
def lower_list(strings):
    return list(map(lambda x: x.lower(), strings))
A list comprehension is the easiest and the best:
def lower_list(strings):
    return [string.lower() for string in strings]
The reason the first one does not work is that it does not modify the value inside the list; the assignment only rebinds the loop variable string to a new lowercase string. When you use the index-based function, it modifies the list itself.
def lower_list(strings):
    for string in strings:
        index_of_string = strings.index(string)
        string = string.lower()
        strings[index_of_string] = string
    return strings
If you want the first one to work, you could try something like this. It is a bad way of doing it (every index call searches the list again); I am only showing it as an example so you can understand the problem better. You need the index of the string so you can replace it in the list. In your first attempt, you do not replace anything in the list.
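If an in-place update is really wanted, enumerate supplies the index alongside the value, so no lookup is needed. The following is a small sketch; lower_list_in_place and titles are names made up for this illustration:

```python
def lower_list_in_place(strings):
    # enumerate yields (index, value) pairs, so no index lookup is needed.
    for i, s in enumerate(strings):
        strings[i] = s.lower()
    return strings

titles = ["Hello Dolly", "FARGO"]
for title in titles:
    title = title.lower()   # rebinds the name only; the list is unchanged
print(titles)                       # ['Hello Dolly', 'FARGO']
print(lower_list_in_place(titles))  # ['hello dolly', 'fargo']
```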

python string to list (special list)

I'm trying to turn this string into a list. How can I do that, please?
My string :
x = "[(['xyz1'], 'COM95'), (['xyz2'], 'COM96'), (['xyz3'], 'COM97'), (['xyz4'], 'COM98'), (['xyz5'], 'COM99'), (['xyz6'], 'COM100')]"
I want to convert it to a list, so that:
print(list[0])
Output : (['xyz1'], 'COM95')
If you have this string instead of a list, that presumes it is coming from somewhere outside your control (otherwise you'd just make a proper list). If the string is coming from a source outside your program, eval() is dangerous: it will gladly run any code passed to it. In this case you can use ast.literal_eval(), which is safer (but make sure you understand the warning in the docs):
import ast
x = "[(['xyz1'], 'COM95'), (['xyz2'], 'COM96'), (['xyz3'], 'COM97'), (['xyz4'], 'COM98'), (['xyz5'], 'COM99'), (['xyz6'], 'COM100')]"
l = ast.literal_eval(x)
Which gives an l of:
[(['xyz1'], 'COM95'),
(['xyz2'], 'COM96'),
(['xyz3'], 'COM97'),
(['xyz4'], 'COM98'),
(['xyz5'], 'COM99'),
(['xyz6'], 'COM100')]
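To illustrate why literal_eval is the safer choice, this short sketch parses a shortened version of the string and then feeds it an expression that is not a literal. The malicious-looking input is invented for the example:

```python
import ast

# A plain literal is parsed into the corresponding Python object.
parsed = ast.literal_eval("[(['xyz1'], 'COM95')]")
print(parsed[0])  # (['xyz1'], 'COM95')

# Anything that is not a literal (here, a function call) is rejected
# instead of being executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```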
If the structure is uniformly a list of tuples with a one-element list of strings and an individual string, you can manually parse it using the single quote as a separator. This will give you one string value every other component of the split (which you can access using a striding subscript). You can then build the actual tuple from pairing of two values:
tuples = [([a],s) for a,s in zip(*[iter(x.split("'")[1::2])]*2)]
print(tuples[0])
(['xyz1'], 'COM95')
Note that this does not cover the case where an individual string contains a single quote that needed escaping.
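The one-liner above packs several steps together. Unrolled, and using a shortened sample string, it can be sketched as follows (the intermediate names values, it and pairs are introduced only for this illustration):

```python
x = "[(['xyz1'], 'COM95'), (['xyz2'], 'COM96')]"  # shortened sample

# Splitting on the quote puts the quoted values at the odd positions.
values = x.split("'")[1::2]      # ['xyz1', 'COM95', 'xyz2', 'COM96']

# Zipping one iterator with itself consumes the values two at a time.
it = iter(values)
pairs = list(zip(it, it))        # [('xyz1', 'COM95'), ('xyz2', 'COM96')]

tuples = [([a], s) for a, s in pairs]
print(tuples[0])                 # (['xyz1'], 'COM95')
```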
Do you mean converting a list-like string into a list? You can use eval(), but only if you fully trust where the string comes from.
For example:
a = "[1,2,3,4]"
a = eval(a)
Then a becomes a list.
To convert it to a list, use x = eval(x).
print(list[0]) will give you an error because list is the name of a Python built-in type, not your variable.
You should use print(x[0]) to get what you want.

Harshad number in python without statements (if, else, while, for)

I need to find a way to split a string of multiple numbers into separate strings, one per number, and then split those again into individual digits. That would allow me to test whether each of the numbers originally entered is a Harshad number, without using for, else, while or if.
So far I'm able to split the input string:
a = input("Multiple numbers separated by a ,: ")
a.split(",")
Then I need to split again, and I think I need to use the map function. Any idea how to go further?
The python builtin functions map, filter, and reduce are going to be your friend when you are working in a more functional style.
map
The map function lets you transform each item in an iterable (list, tuple, etc.) by passing it to a function and using the return value as a new value in a new iterable*.
The non-functional approach would use a for ... in construct:
numbers_as_strings = ["1", "12", "13"]
numbers_as_ints = []
for number in numbers_as_strings:
    numbers_as_ints.append(int(number))
or, more concisely, a list comprehension:
numbers_as_ints = [int(number) for number in numbers_as_strings]
Since you are eschewing for, there is another way:
numbers_as_ints = map(int, numbers_as_strings)
But you don't just want your strings mapped to integers, you want to test them for harshadiness. Since we're doing the functional thing let's create a function to do this for us.
def is_harshad(number_as_string):
    return # do your harshad test here
Then you can map your numbers through this function:
list(map(is_harshad, numbers_as_strings)) # wrap in list() to resolve the returned map object.
>>> [True, True, False]
But maybe you want the results as a sequence of harshady number strings? Well, check out filter.
filter
The filter function lets you choose which items from an iterable you want to keep in a new iterable. You give it a function that operates on a single item and returns True for a keeper or False for a rejection. You also give it an iterable of items to test.
A non-functional way to do this is with a for loop:
harshady_numbers = []
for number in numbers_as_strings:
    if is_harshad(number):
        harshady_numbers.append(number)
Or, more concisely and nicely, with a list comprehension:
harshady_numbers = [number for number in numbers_as_strings if is_harshad(number)]
But since we're getting functional, we'll use filter:
harshady_numbers = filter(is_harshad, numbers_as_strings)
That's about it. Apply the same functional thinking to complete the is_harshad function and you're done.
*map() can take more than one iterable argument, and it returns an iterator, not a list.
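One possible way to fill in is_harshad without for, while, if or else is sketched below. It is only an illustration, not necessarily what the answer had in mind, and it assumes every entry is a positive integer written in decimal (an input of "0" would raise ZeroDivisionError). The raw string stands in for the value returned by input():

```python
def is_harshad(number_as_string):
    # A Harshad number is divisible by the sum of its own digits.
    digits = number_as_string.strip()
    return int(digits) % sum(map(int, digits)) == 0

raw = "1, 12, 13"  # stands in for the value returned by input()
numbers_as_strings = raw.split(",")
results = list(map(is_harshad, numbers_as_strings))
harshad_numbers = list(map(str.strip, filter(is_harshad, numbers_as_strings)))
print(results)          # [True, True, False]
print(harshad_numbers)  # ['1', '12']
```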

Doing the wrong list comprehension

I've been trying every iteration of a list comprehension that I can in the context.
I am getting a call from a database, converting it to a list of [['item', long integer]].
I want to convert the long integer to a regular one, because the rest of my math is in regular integers.
I'm trying this:
catnum = c.fetchall()
catnum = [list(x) for x in catnum]
for x in catnum:
    [int(y) for y in x]
I've also tried x[1], and a few other things (it is always in position 1 inside the list)
No luck. How do I convert only the second value in the list to a regular integer?
Does this work?
catnum = [[x, int(y)] for x, y in catnum]
But, I think it's worth asking why you need to do this conversion. Python should handle long integers just fine anywhere a regular integer would work. There's a slight performance penalty to leaving them as long ints, but in most cases I don't think that would justify the extra work to convert to regular integers.
EDIT for the people reading the comments, my first answer was incorrect and did not involve a list comprehension. It relied on mutating the elements in catnum, but since those elements are in tuples, they can't be mutated.
[[x[0],int(x[1])] for x in catnum]
This will return a list of lists, where the first entry is the name and the second is the value cast down to a normal integer.
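The attempt in the question fails because a comprehension builds a new list, and that result is thrown away unless it is assigned. A minimal sketch, using made-up rows in place of c.fetchall() and written for Python 3 (which has a single int type, so the int() call is shown only to mirror the answers):

```python
# Stand-in for c.fetchall(): a DB-API cursor typically returns a list of tuples.
rows = [("apples", 3), ("pears", 12)]

# A comprehension builds a new list; its result must be assigned to be kept.
catnum = [[name, int(count)] for name, count in rows]
print(catnum)  # [['apples', 3], ['pears', 12]]
```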

How to delete an item in a list if it exists?

I am getting new_tag from a form text field with self.response.get("new_tag") and selected_tags from checkbox fields with
self.response.get_all("selected_tags")
I combine them like this:
tag_string = new_tag
new_tag_list = f1.striplist(tag_string.split(",") + selected_tags)
(f1.striplist is a function that strips white spaces inside the strings in the list.)
But in the case that tag_list is empty (no new tags are entered) but there are some selected_tags, new_tag_list contains an empty string "".
For example, from logging.info:
new_tag
selected_tags[u'Hello', u'Cool', u'Glam']
new_tag_list[u'', u'Hello', u'Cool', u'Glam']
How do I get rid of the empty string?
If there is an empty string in the list:
>>> s = [u'', u'Hello', u'Cool', u'Glam']
>>> i = s.index("")
>>> del s[i]
>>> s
[u'Hello', u'Cool', u'Glam']
But if there is no empty string:
>>> s = [u'Hello', u'Cool', u'Glam']
>>> if s.index(""):
        i = s.index("")
        del s[i]
    else:
        print "new_tag_list has no empty string"
But this gives:
Traceback (most recent call last):
File "<pyshell#30>", line 1, in <module>
if new_tag_list.index(""):
ValueError: list.index(x): x not in list
Why does this happen, and how do I work around it?
1) Almost-English style:
Test for presence using the in operator, then apply the remove method.
if thing in some_list: some_list.remove(thing)
The remove method will remove only the first occurrence of thing; in order to remove all occurrences you can use while instead of if.
while thing in some_list: some_list.remove(thing)
Simple enough, and probably my choice for small lists (can't resist one-liners).
2) Duck-typed, EAFP style:
This shoot-first-ask-questions-last attitude is common in Python. Instead of testing in advance if the object is suitable, just carry out the operation and catch relevant Exceptions:
try:
    some_list.remove(thing)
except ValueError:
    pass # or scream: thing not in some_list!
except AttributeError:
    call_security("some_list not quacking like a list!")
Of course the second except clause in the example above is not only of questionable humor but totally unnecessary (the point was to illustrate duck-typing for people not familiar with the concept).
If you expect multiple occurrences of thing:
while True:
    try:
        some_list.remove(thing)
    except ValueError:
        break
A little verbose for this specific use case, but very idiomatic in Python.
This performs better than #1, which scans the list twice for every removal (once for in, once for remove).
PEP 463 proposed a shorter syntax for simple try/except usage that would be handy here, but it was rejected.
However, with contextlib's suppress() context manager (introduced in Python 3.4) the above code can be simplified to this:
from contextlib import suppress
with suppress(ValueError, AttributeError):
    some_list.remove(thing)
Again, if you expect multiple occurrences of thing:
with suppress(ValueError):
    while True:
        some_list.remove(thing)
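Wrapped up as a self-contained, runnable sketch (Python 3.4 or later; the helper name remove_all and the sample tags are invented for this illustration):

```python
from contextlib import suppress

def remove_all(some_list, thing):
    # list.remove raises ValueError once no match is left; suppress()
    # swallows it, which also ends the loop.
    with suppress(ValueError):
        while True:
            some_list.remove(thing)
    return some_list

tags = ["", "Hello", "", "Cool"]
print(remove_all(tags, ""))  # ['Hello', 'Cool']
```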
3) Functional style:
Around 1993, Python got lambda, reduce(), filter() and map(), courtesy of a Lisp hacker who missed them and submitted working patches*. You can use filter to remove elements from the list:
is_not_thing = lambda x: x is not thing
cleaned_list = filter(is_not_thing, some_list)
There is a shortcut that may be useful for your case: if you want to filter out empty items (in fact items where bool(item) == False, like None, zero, empty strings or other empty collections), you can pass None as the first argument:
cleaned_list = filter(None, some_list)
[update]: in Python 2.x, filter(function, iterable) used to be equivalent to [item for item in iterable if function(item)] (or [item for item in iterable if item] if the first argument is None); in Python 3.x, it is now equivalent to (item for item in iterable if function(item)). The subtle difference is that filter used to return a list, now it works like a generator expression - this is OK if you are only iterating over the cleaned list and discarding it, but if you really need a list, you have to enclose the filter() call with the list() constructor.
*These Lispy-flavored constructs are considered a little alien in Python. Around 2005, Guido was even talking about dropping filter, along with its companions map and reduce (they are not gone yet, but reduce was moved into the functools module, which is worth a look if you like higher-order functions).
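The Python 3 behaviour described in the update can be seen in a short sketch. The sample list is made up; it mixes several falsy values to show what filter(None, ...) drops:

```python
some_list = ["", "Hello", None, "Cool", 0]

lazy = filter(None, some_list)   # a filter object, not a list, in Python 3
cleaned_list = list(lazy)        # materialise it when a real list is needed
print(cleaned_list)              # ['Hello', 'Cool']

# The iterator is now exhausted, so a second pass yields nothing.
print(list(lazy))                # []
```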
4) Mathematical style:
List comprehensions have been the preferred style for list manipulation in Python since they were introduced in version 2.0 by PEP 202. The rationale behind it is that list comprehensions provide a more concise way to create lists in situations where map() and filter() and/or nested loops would currently be used.
cleaned_list = [ x for x in some_list if x is not thing ]
Generator expressions were introduced in version 2.4 by PEP 289. A generator expression is better for situations where you don't really need (or want) to have a full list created in memory - like when you just want to iterate over the elements one at a time. If you are only iterating over the list, you can think of a generator expression as a lazy evaluated list comprehension:
for item in (x for x in some_list if x is not thing):
    do_your_thing_with(item)
See this Python history blog post by GvR.
This syntax is inspired by the set-builder notation in math.
Python 3 also has set and dict comprehensions.
Notes
you may want to use the inequality operator != instead of is not (the difference is important)
for critics of methods implying a list copy: contrary to popular belief, generator expressions are not always more efficient than list comprehensions - please profile before complaining
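The difference between is not and != mentioned in the notes shows up as soon as two objects are equal without being the same object. A small sketch with invented sample data:

```python
thing = [1]
some_list = [[1], [2]]   # the first element equals thing but is a separate object

by_identity = [x for x in some_list if x is not thing]
by_equality = [x for x in some_list if x != thing]
print(by_identity)  # [[1], [2]]  (nothing removed)
print(by_equality)  # [[2]]
```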
try:
    s.remove("")
except ValueError:
    print("new_tag_list has no empty string")
Note that this will only remove one instance of the empty string from your list (as your code would have, too). Can your list contain more than one?
As a one liner:
>>> s = [u'', u'Hello', u'Cool', u'Glam']
>>> s.remove('') if '' in s else None # Does nothing if '' not in s
>>> s
['Hello', 'Cool', 'Glam']
>>>
If index doesn't find the searched string, it raises the ValueError you're seeing. Either
catch the ValueError:
try:
    i = s.index("")
    del s[i]
except ValueError:
    print("new_tag_list has no empty string")
or test for membership with in first (lists, unlike strings, have no find method that returns -1):
if "" in s:
    s.remove("")
else:
    print("new_tag_list has no empty string")
Adding this answer for completeness, though it's only usable under certain conditions.
If you have very large lists and can re-order them, removing from the end of the list gives a performance gain, because CPython then does not have to memmove every item after the one you're removing back one step (1).
For one-off removals the performance difference may be acceptable, but if you have a large list and need to remove many items, you will likely notice a performance hit.
Admittedly, in these cases doing a full list search is likely to be a performance bottleneck too, unless the items are mostly at the front of the list.
This method can be used for more efficient removal, as long as re-ordering the list is acceptable. (2)
def remove_unordered(ls, item):
    i = ls.index(item)
    ls[-1], ls[i] = ls[i], ls[-1]
    ls.pop()
You may want to avoid raising an error when the item isn't in the list.
def remove_unordered_test(ls, item):
    try:
        i = ls.index(item)
    except ValueError:
        return False
    ls[-1], ls[i] = ls[i], ls[-1]
    ls.pop()
    return True
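To see the re-ordering side effect, here is the swap-and-pop function from above repeated with comments and run on a small made-up list:

```python
def remove_unordered(ls, item):
    i = ls.index(item)                 # still a linear search
    ls[-1], ls[i] = ls[i], ls[-1]      # move the item to the end
    ls.pop()                           # removing the last element is cheap

letters = ["a", "b", "c", "d"]
remove_unordered(letters, "b")
print(letters)  # ['a', 'd', 'c']
```

Note that 'd' has taken the place of the removed 'b', so the original order is not preserved.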
While I tested this with CPython, it's quite likely that most or all other Python implementations use an array to store lists internally. So unless they use a sophisticated data structure designed for efficient list re-sizing, they likely have the same performance characteristics.
A simple way to test this is to compare the speed of removing from the front of the list with removing the last element:
python -m timeit 'a = [0] * 100000' 'while a: a.remove(0)'
With:
python -m timeit 'a = [0] * 100000' 'while a: a.pop()'
(gives an order of magnitude speed difference where the second example is faster with CPython and PyPy).
In this case you might consider using a set, especially if the list isn't meant to store duplicates. In practice, though, you may need to store mutable data which can't be added to a set. Also look into B-trees if the data can be ordered.
Eek, don't do anything that complicated : )
Just filter() your tags. bool() returns False for empty strings, so instead of
new_tag_list = f1.striplist(tag_string.split(",") + selected_tags)
you should write
new_tag_list = filter(bool, f1.striplist(tag_string.split(",") + selected_tags))
or better yet, put this logic inside striplist() so that it doesn't return empty strings in the first place.
Here's another one-liner approach to throw out there:
next((some_list.pop(i) for i, l in enumerate(some_list) if l == thing), None)
It doesn't create a list copy, doesn't make multiple passes through the list, doesn't require additional exception handling, and returns the matched object or None if there isn't a match. Only issue is that it makes for a long statement.
In general, when looking for a one-liner solution that doesn't throw exceptions, next() is the way to go, since it's one of the few built-in functions that accepts a default value to return instead of raising.
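A quick run of this one-liner on made-up data shows both outcomes, a match and a miss. The variable names removed and missing are only for illustration:

```python
some_list = ["Hello", "", "Cool"]
thing = ""

# next() pulls a single value from the generator, so at most one pop happens.
removed = next((some_list.pop(i) for i, l in enumerate(some_list) if l == thing), None)
print(repr(removed), some_list)   # '' ['Hello', 'Cool']

missing = next((some_list.pop(i) for i, l in enumerate(some_list) if l == "absent"), None)
print(missing)                    # None
```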
1. Use the filter option:
new_tag_list = [u'', u'Hello', u'Cool', u'Glam']
new_tag_list = list(filter(None, new_tag_list))
2. A list comprehension works as well, and lets you target specific values rather than every falsy element:
new_tag_list = [u'', u'Hello', u'Cool', u'Glam']
new_tag_list = [element for element in new_tag_list if element not in ['']]
