python removing whitespace from string in a list - python

I have a list of lists. I want to remove the leading and trailing spaces from them. The strip() method returns a copy of the string without leading and trailing spaces. Calling that method alone does not make the change. With this implementation, I am getting an 'array index out of bounds error'. It seems to me like there would be "an x" for exactly every list within the list (0-len(networks)-1) and "a y" for every string within those lists (0-len(networks[x]) aka i and j should map exactly to legal, indexes and not go out of bounds?
i = 0
j = 0
for x in networks:
for y in x:
networks[i][j] = y.strip()
j = j + 1
i = i + 1

You're forgetting to reset j to zero after iterating through the first list.
Which is one reason why you usually don't use explicit iteration in Python - let Python handle the iterating for you:
>>> networks = [[" kjhk ", "kjhk "], ["kjhkj ", " jkh"]]
>>> result = [[s.strip() for s in inner] for inner in networks]
>>> result
[['kjhk', 'kjhk'], ['kjhkj', 'jkh']]

You don't need to count i, j yourself, just enumerate, also looks like you do not increment i, as it is out of loop and j is not in inner most loop, that is why you have an error
for x in networks:
for i, y in enumerate(x):
x[i] = y.strip()
Also note you don't need to access networks but accessing 'x' and replacing value would work, as x already points to networks[index]

This generates a new list:
>>> x = ['a', 'b ', ' c ']
>>> map(str.strip, x)
['a', 'b', 'c']
>>>
Edit: No need to import string when you use the built-in type (str) instead.

So you have something like: [['a ', 'b', ' c'], [' d', 'e ']], and you want to generate [['a', 'b',' c'], ['d', 'e']]. You could do:
mylist = [['a ', 'b', ' c'], [' d', 'e ']]
mylist = [[x.strip() for x in y] for y in mylist]
The use of indexes with lists is generally not necessary, and changing a list while iterating though it can have multiple bad side effects.

c=[]
for i in networks:
d=[]
for v in i:
d.append(v.strip())
c.append(d)

A much cleaner version of cleaning list could be implemented using recursion. This will allow you to have a infinite amount of list inside of list all while keeping a very low complexity to your code.
Side note: This also puts in place safety checks to avoid data type issues with strip. This allows your list to contain ints, floats, and much more.
def clean_list(list_item):
if isinstance(list_item, list):
for index in range(len(list_item)):
if isinstance(list_item[index], list):
list_item[index] = clean_list(list_item[index])
if not isinstance(list_item[index], (int, tuple, float, list)):
list_item[index] = list_item[index].strip()
return list_item
Then just call the function with your list. All of the values will be cleaned inside of the list of list.
clean_list(networks)

Related

For Loop incrementing by 2s instead of 1 [duplicate]

Now I know that it is not safe to modify the list during an iterative looping. However, suppose I have a list of strings, and I want to strip the strings themselves. Does replacement of mutable values count as modification?
See Scope of python variable in for loop for a related problem: assigning to the iteration variable does not modify the underlying sequence, and also does not impact future iteration.
Since the loop below only modifies elements already seen, it would be considered acceptable:
a = ['a',' b', 'c ', ' d ']
for i, s in enumerate(a):
a[i] = s.strip()
print(a) # -> ['a', 'b', 'c', 'd']
Which is different from:
a[:] = [s.strip() for s in a]
in that it doesn't require the creation of a temporary list and an assignment of it to replace the original, although it does require more indexing operations.
Caution: Although you can modify entries this way, you can't change the number of items in the list without risking the chance of encountering problems.
Here's an example of what I mean—deleting an entry messes-up the indexing from that point on:
b = ['a', ' b', 'c ', ' d ']
for i, s in enumerate(b):
if s.strip() != b[i]: # leading or trailing whitespace?
del b[i]
print(b) # -> ['a', 'c '] # WRONG!
(The result is wrong because it didn't delete all the items it should have.)
Update
Since this is a fairly popular answer, here's how to effectively delete entries "in-place" (even though that's not exactly the question):
b = ['a',' b', 'c ', ' d ']
b[:] = [entry for entry in b if entry.strip() == entry]
print(b) # -> ['a'] # CORRECT
See How to remove items from a list while iterating?.
It's considered poor form. Use a list comprehension instead, with slice assignment if you need to retain existing references to the list.
a = [1, 3, 5]
b = a
a[:] = [x + 2 for x in a]
print(b)
One more for loop variant, looks cleaner to me than one with enumerate():
for idx in range(len(list)):
list[idx]=... # set a new value
# some other code which doesn't let you use a list comprehension
Modifying each element while iterating a list is fine, as long as you do not change add/remove elements to list.
You can use list comprehension:
l = ['a', ' list', 'of ', ' string ']
l = [item.strip() for item in l]
or just do the C-style for loop:
for index, item in enumerate(l):
l[index] = item.strip()
The answer given by Ignacio Vazquez-Abrams is really good. It can be further illustrated by this example. Imagine that:
A list with two vectors is given to you.
You would like to traverse the list and reverse the order of each one of the arrays.
Let's say you have:
v = np.array([1,2,3,4])
b = np.array([3,4,6])
for i in [v, b]:
i = i[::-1] # This command does not reverse the string.
print([v,b])
You will get:
[array([1, 2, 3, 4]), array([3, 4, 6])]
On the other hand, if you do:
v = np.array([1,2,3,4])
b = np.array([3,4,6])
for i in [v, b]:
i[:] = i[::-1] # This command reverses the string.
print([v,b])
The result is:
[array([4, 3, 2, 1]), array([6, 4, 3])]
No you wouldn't alter the "content" of the list, if you could mutate strings that way. But in Python they are not mutable. Any string operation returns a new string.
If you had a list of objects you knew were mutable, you could do this as long as you don't change the actual contents of the list.
Thus you will need to do a map of some sort. If you use a generator expression it [the operation] will be done as you iterate and you will save memory.
You can do something like this:
a = [1,2,3,4,5]
b = [i**2 for i in a]
It's called a list comprehension, to make it easier for you to loop inside a list.
It is not clear from your question what the criteria for deciding what strings to remove is, but if you have or can make a list of the strings that you want to remove , you could do the following:
my_strings = ['a','b','c','d','e']
undesirable_strings = ['b','d']
for undesirable_string in undesirable_strings:
for i in range(my_strings.count(undesirable_string)):
my_strings.remove(undesirable_string)
which changes my_strings to ['a', 'c', 'e']
In short, to do modification on the list while iterating the same list.
list[:] = ["Modify the list" for each_element in list "Condition Check"]
example:
list[:] = [list.remove(each_element) for each_element in list if each_element in ["data1", "data2"]]
Something I just discovered - when looping over a list of mutable types (such as dictionaries) you can just use a normal for loop like this:
l = [{"n": 1}, {"n": 2}]
for d in l:
d["n"] += 1
print(l)
# prints [{"n": 2}, {"n": 1}]

Understanding Python strip() with escape characters [duplicate]

Now I know that it is not safe to modify the list during an iterative looping. However, suppose I have a list of strings, and I want to strip the strings themselves. Does replacement of mutable values count as modification?
See Scope of python variable in for loop for a related problem: assigning to the iteration variable does not modify the underlying sequence, and also does not impact future iteration.
Since the loop below only modifies elements already seen, it would be considered acceptable:
a = ['a',' b', 'c ', ' d ']
for i, s in enumerate(a):
a[i] = s.strip()
print(a) # -> ['a', 'b', 'c', 'd']
Which is different from:
a[:] = [s.strip() for s in a]
in that it doesn't require the creation of a temporary list and an assignment of it to replace the original, although it does require more indexing operations.
Caution: Although you can modify entries this way, you can't change the number of items in the list without risking the chance of encountering problems.
Here's an example of what I mean—deleting an entry messes-up the indexing from that point on:
b = ['a', ' b', 'c ', ' d ']
for i, s in enumerate(b):
if s.strip() != b[i]: # leading or trailing whitespace?
del b[i]
print(b) # -> ['a', 'c '] # WRONG!
(The result is wrong because it didn't delete all the items it should have.)
Update
Since this is a fairly popular answer, here's how to effectively delete entries "in-place" (even though that's not exactly the question):
b = ['a',' b', 'c ', ' d ']
b[:] = [entry for entry in b if entry.strip() == entry]
print(b) # -> ['a'] # CORRECT
See How to remove items from a list while iterating?.
It's considered poor form. Use a list comprehension instead, with slice assignment if you need to retain existing references to the list.
a = [1, 3, 5]
b = a
a[:] = [x + 2 for x in a]
print(b)
One more for loop variant, looks cleaner to me than one with enumerate():
for idx in range(len(list)):
list[idx]=... # set a new value
# some other code which doesn't let you use a list comprehension
Modifying each element while iterating a list is fine, as long as you do not change add/remove elements to list.
You can use list comprehension:
l = ['a', ' list', 'of ', ' string ']
l = [item.strip() for item in l]
or just do the C-style for loop:
for index, item in enumerate(l):
l[index] = item.strip()
The answer given by Ignacio Vazquez-Abrams is really good. It can be further illustrated by this example. Imagine that:
A list with two vectors is given to you.
You would like to traverse the list and reverse the order of each one of the arrays.
Let's say you have:
v = np.array([1,2,3,4])
b = np.array([3,4,6])
for i in [v, b]:
i = i[::-1] # This command does not reverse the string.
print([v,b])
You will get:
[array([1, 2, 3, 4]), array([3, 4, 6])]
On the other hand, if you do:
v = np.array([1,2,3,4])
b = np.array([3,4,6])
for i in [v, b]:
i[:] = i[::-1] # This command reverses the string.
print([v,b])
The result is:
[array([4, 3, 2, 1]), array([6, 4, 3])]
No you wouldn't alter the "content" of the list, if you could mutate strings that way. But in Python they are not mutable. Any string operation returns a new string.
If you had a list of objects you knew were mutable, you could do this as long as you don't change the actual contents of the list.
Thus you will need to do a map of some sort. If you use a generator expression it [the operation] will be done as you iterate and you will save memory.
You can do something like this:
a = [1,2,3,4,5]
b = [i**2 for i in a]
It's called a list comprehension, to make it easier for you to loop inside a list.
It is not clear from your question what the criteria for deciding what strings to remove is, but if you have or can make a list of the strings that you want to remove , you could do the following:
my_strings = ['a','b','c','d','e']
undesirable_strings = ['b','d']
for undesirable_string in undesirable_strings:
for i in range(my_strings.count(undesirable_string)):
my_strings.remove(undesirable_string)
which changes my_strings to ['a', 'c', 'e']
In short, to do modification on the list while iterating the same list.
list[:] = ["Modify the list" for each_element in list "Condition Check"]
example:
list[:] = [list.remove(each_element) for each_element in list if each_element in ["data1", "data2"]]
Something I just discovered - when looping over a list of mutable types (such as dictionaries) you can just use a normal for loop like this:
l = [{"n": 1}, {"n": 2}]
for d in l:
d["n"] += 1
print(l)
# prints [{"n": 2}, {"n": 1}]

list comprehension without if but with else

My question aims to use the else condition of a for-loop in a list comprehension.
example:
empty_list = []
def example_func(text):
for a in text.split():
for b in a.split(","):
empty_list.append(b)
else:
empty_list.append(" ")
I would like to make it cleaner by using a list comprehension with both for-loops.
But how can I do this by including an escape-clause for one of the loops (in this case the 2nd).
I know I can use if with and without else in a list comprehension. But how about using else without an if statement.
Is there a way, so the interpreter will understand it as escape-clause of a for loop?
Any help is much appreciated!
EDIT:
Thanks for the answers! In fact im trying to translate morse code.
The input is a string, containing morse codes.
Each word is separated by 3 spaces. Each letter of each word is separated by 1 space.
def decoder(code):
str_list = []
for i in code.split(" "):
for e in i.split():
str_list.append(morse_code_dic[e])
else:
str_list.append(" ")
return "".join(str_list[:-1]).capitalize()
print(decoder(".. - .-- .- ... .- --. --- --- -.. -.. .- -.--"))
I want to break down the whole sentence into words, then translate each word.
After the inner loop (translation of one word) is finished, it will launch its escape-clause else, adding a space, so that the structure of the whole sentence will be preserved. That way, the 3 Spaces will be translated to one space.
As noted in comments, that else does not really make all that much sense, since the purpose of an else after a for loop is actually to hold code for conditional execution if the loop terminates normally (i.e. not via break), which your loop always does, thus it is always executed.
So this is not really an answer to the question how to do that in a list comprehension, but more of an alternative. Instead of adding spaces after all words, then removing the last space and joining everything together, you could just use two nested join generator expressions, one for the sentence and one for the words:
def decoder(code):
return " ".join("".join(morse_code_dic[e] for e in i.split())
for i in code.split(" ")).capitalize()
As mentioned in the comments, the else clause in your particular example is pointless because it always runs. Let's contrive an example that would let us investigate the possibility of simulating a break and else.
Take the following string:
s = 'a,b,c b,c,d c,d,e, d,e,f'
Let's say you wanted to split the string by spaces and commas as before, but you only wanted to preserve the elements of the inner split up to the first occurrence of c:
out = []
for i in s.split():
for e in i.split(','):
if e == 'c':
break
out.append(e)
else:
out.append('-')
The break can be simulated using the arcane two-arg form of iter, which accepts a callable and a termination value:
>>> x = list('abcd')
>>> list(iter(iter(x).__next__, 'c'))
['a', 'b']
You can implement the else by chaining the inner iterable with ['-'].
>>> from itertools import chain
>>> x = list('abcd')
>>> list(iter(chain(x, ['-'])
.__next__, 'c'))
['a', 'b']
>>> y = list('def')
>>> list(iter(chain(y, ['-'])
.__next__, 'c'))
['d', 'e', 'f', '-']
Notice that the placement of chain is crucial here. If you were to chain the dash to the outer iterator, it would always be appended, not only when c is not encountered:
>>> list(chain(iter(iter(x).__next__, 'c'), ['-']))
['a', 'b', '-']
You can now simulate the entire nested loop with a single expression:
from itertools import chain
out = [e for i in s.split() for e in iter(chain(i.split(','), ['-']).__next__, 'c')]

Python integers with trailing 'L' in a list [duplicate]

So I have a list:
['x', 3, 'b']
And I want the output to be:
[x, 3, b]
How can I do this in python?
If I do str(['x', 3, 'b']), I get one with quotes, but I don't want quotes.
In Python 2:
mylist = ['x', 3, 'b']
print '[%s]' % ', '.join(map(str, mylist))
In Python 3 (where print is a builtin function and not a syntax feature anymore):
mylist = ['x', 3, 'b']
print('[%s]' % ', '.join(map(str, mylist)))
Both return:
[x, 3, b]
This is using the map() function to call str for each element of mylist, creating a new list of strings that is then joined into one string with str.join(). Then, the % string formatting operator substitutes the string in instead of %s in "[%s]".
This is simple code, so if you are new you should understand it easily enough.
mylist = ["x", 3, "b"]
for items in mylist:
print(items)
It prints all of them without quotes, like you wanted.
Using only print:
>>> l = ['x', 3, 'b']
>>> print(*l, sep='\n')
x
3
b
>>> print(*l, sep=', ')
x, 3, b
If you are using Python3:
print('[',end='');print(*L, sep=', ', end='');print(']')
Instead of using map, I'd recommend using a generator expression with the capability of join to accept an iterator:
def get_nice_string(list_or_iterator):
return "[" + ", ".join( str(x) for x in list_or_iterator) + "]"
Here, join is a member function of the string class str. It takes one argument: a list (or iterator) of strings, then returns a new string with all of the elements concatenated by, in this case, ,.
You can delete all unwanted characters from a string using its translate() method with None for the table argument followed by a string containing the character(s) you want removed for its deletechars argument.
lst = ['x', 3, 'b']
print str(lst).translate(None, "'")
# [x, 3, b]
If you're using a version of Python before 2.6, you'll need to use the string module's translate() function instead because the ability to pass None as the table argument wasn't added until Python 2.6. Using it looks like this:
import string
print string.translate(str(lst), None, "'")
Using the string.translate() function will also work in 2.6+, so using it might be preferable.
Here's an interactive session showing some of the steps in #TokenMacGuy's one-liner. First he uses the map function to convert each item in the list to a string (actually, he's making a new list, not converting the items in the old list). Then he's using the string method join to combine those strings with ', ' between them. The rest is just string formatting, which is pretty straightforward. (Edit: this instance is straightforward; string formatting in general can be somewhat complex.)
Note that using join is a simple and efficient way to build up a string from several substrings, much more efficient than doing it by successively adding strings to strings, which involves a lot of copying behind the scenes.
>>> mylist = ['x', 3, 'b']
>>> m = map(str, mylist)
>>> m
['x', '3', 'b']
>>> j = ', '.join(m)
>>> j
'x, 3, b'
Using .format for string formatting,
mylist = ['x', 3, 'b']
print("[{0}]".format(', '.join(map(str, mylist))))
Output:
[x, 3, b]
Explanation:
map is used to map each element of the list to string type.
The elements are joined together into a string with , as separator.
We use [ and ] in the print statement to show the list braces.
Reference:
.format for string formatting PEP-3101
I was inspired by #AniMenon to write a pythonic more general solution.
mylist = ['x', 3, 'b']
print('[{}]'.format(', '.join(map('{}'.format, mylist))))
It only uses the format method. No trace of str, and it allows for the fine tuning of the elements format.
For example, if you have float numbers as elements of the list, you can adjust their format, by adding a conversion specifier, in this case :.2f
mylist = [1.8493849, -6.329323, 4000.21222111]
print("[{}]".format(', '.join(map('{:.2f}'.format, mylist))))
The output is quite decent:
[1.85, -6.33, 4000.21]

list of None's is returned after list comprehension - python

here all_sentences_2d is a list of lists, contains the list of sentences in each tweet:
all_sentences_2d = [['tweet_1_sentence_1', 'tweet_1_sentence_2'],['tweet_2_sentence_1', 'tweet_2_sentence_2']]
I want to append an empty sentence after the sentences of each tweet.
I mean I want all_sentences_2d to be like that:
all_sentences_2d = [['tweet_1_sentence_1', 'tweet_1_sentence_2', ''],['tweet_2_sentence_1', 'tweet_2_sentence_2', '']]
I used this list comprehension to do that:
all_sentences_2d = [tweet_sentences.append('') for tweet_sentences in all_sentences_2d]
but I got that:
all_sentences_2d = [None, None, None, None, None, None]
While debugging, I've seen the appending operation done correctly in the all elements but after that becomes all None. Any ideas?
list.append() updates the list in place and returns None. You can try this, though:
[tweet_sentences + [''] for tweet_sentences in all_sentences_2d]
Although I would prefer a plain for-loop here:
for v in all_sentences_2d:
v.append('')
You can use map:
>>> all_sentences_2d = [['tweet_1_sentence_1', 'tweet_1_sentence_2'],['tweet_2_sentence_1', 'tweet_2_sentence_2']]
>>> map(lambda l: l+[' '], all_sentences_2d )
[['tweet_1_sentence_1', 'tweet_1_sentence_2', ' '], ['tweet_2_sentence_1', 'tweet_2_sentence_2', ' ']]
Yes it can be done but you need to be careful of your syntax, you have your if statement in the wrong place for a start. Look at the example below, note that the list comprehension is NOT assigned to a variabe.
words = ['1', '12', '123', '1234']
shortest = []
longest = []
shortest_len = 3
[shortest.append(i) if len(i) <= shortest_len else longest.append(i) for i in words]

Categories