Given a list of strings as such,
xs = ['1\n','2\n','3\n','4\n','5\n']
sum up the integers to return the sum as a string and append the sum to the list so that the returned list is
xs = ['1\n','2\n','3\n','4\n','5\n','Sum:15\n']
I understand the process of going through the list and iterating over it; I just don't understand how to get rid of the \n character so that I can use only the integer to find the sum.
def my_fun(x):
    return x + ["Sum: %s\n" % sum(map(int, x))]
This uses a generator expression:
>>> xs + ['Sum:{0}\n'.format(str(sum(int(s) for s in xs)))]
['1\n', '2\n', '3\n', '4\n', '5\n', 'Sum:15\n']
To answer your question a bit more directly (leaving out the iteration since you said that's not the problem):
Believe it or not, int actually ignores the trailing newline when you use it to parse:
>>> int('1\n')
1
Once you have an int, you can do arithmetic as normal.
This is a documented feature in Python 2 and 3:
Optionally, the literal can be preceded by + or - (with no space in between) and surrounded by whitespace.
-Python int documentation (emphasis mine)
If you're interested in more streamlined ways of doing the iteration, see Joran's answer and the comments on it, but if this is some kind of assignment, I wouldn't use them if I were you. It benefits you more to work through the problem yourself; the more advanced features are of course fine for professional work.
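For completeness, here is a minimal sketch of the plain-loop approach (variable names are illustrative, not from the original question):
total = 0
for s in xs:
    total += int(s)          # int() tolerates the surrounding whitespace, including '\n'
xs.append('Sum:%d\n' % total)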
Related
What is the point of '/segment/segment/'.split('/') returning ['', 'segment', 'segment', '']?
Notice the empty elements. If you're splitting on a delimiter that happens to be at position one and at the very end of a string, what extra value does it give you to have the empty string returned from each end?
str.split complements str.join, so
"/".join(['', 'segment', 'segment', ''])
gets you back the original string.
If the empty strings were not there, the first and last '/' would be missing after the join().
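For example, the round trip reproduces the original string exactly:
>>> "/".join('/segment/segment/'.split('/'))
'/segment/segment/'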
More generally, to remove empty strings returned in split() results, you may want to look at the filter function.
Example:
f = filter(None, '/segment/segment/'.split('/'))
s_all = list(f)
returns
['segment', 'segment']
There are two main points to consider here:
Expecting the result of '/segment/segment/'.split('/') to be equal to ['segment', 'segment'] is reasonable, but then this loses information. If split() worked the way you wanted, if I tell you that a.split('/') == ['segment', 'segment'], you can't tell me what a was.
What should the result of 'a//b'.split('/') be? ['a', 'b'], or ['a', '', 'b']? That is, should split() merge adjacent delimiters? If it did, it would be very hard to parse data that's delimited by a character where some of the fields can be empty. I am fairly sure there are many people who do want the empty values in the result for the above case!
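For reference, this is what Python actually returns in that case:
>>> 'a//b'.split('/')
['a', '', 'b']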
In the end, it boils down to two things:
Consistency: if I have n delimiters in a, I get n+1 values back after the split().
It should be possible to do complex things, and easy to do simple things: if you want to ignore empty strings as a result of the split(), you can always do:
def mysplit(s, delim=None):
    return [x for x in s.split(delim) if x]
but if one doesn't want to ignore the empty values, one should be able to.
The language has to pick one definition of split()—there are too many different use cases to satisfy everyone's requirement as a default. I think that Python's choice is a good one, and is the most logical. (As an aside, one of the reasons I don't like C's strtok() is because it merges adjacent delimiters, making it extremely hard to do serious parsing/tokenization with it.)
There is one exception: a.split() without an argument squeezes consecutive whitespace, but one can argue that this is the right thing to do in that case. If you don't want that behavior, you can always do a.split(' ').
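A quick comparison of the two behaviours:
>>> 'a  b'.split()
['a', 'b']
>>> 'a  b'.split(' ')
['a', '', 'b']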
I'm not sure what kind of answer you're looking for. You get four pieces because you have three delimiters, and two of the pieces are empty. If you don't want the empty ones, just use:
'/segment/segment/'.strip('/').split('/')
Having x.split(y) always return a list of 1 + x.count(y) items is a precious regularity -- as @gnibbler's already pointed out it makes split and join exact inverses of each other (as they obviously should be), it also precisely maps the semantics of all kinds of delimiter-joined records (such as csv file lines [[net of quoting issues]], lines from /etc/group in Unix, and so on), it allows (as @Roman's answer mentioned) easy checks for (e.g.) absolute vs relative paths (in file paths and URLs), and so forth.
Another way to look at it is that you shouldn't wantonly toss information out of the window for no gain. What would be gained in making x.split(y) equivalent to x.strip(y).split(y)? Nothing, of course -- it's easy to use the second form when that's what you mean, but if the first form was arbitrarily deemed to mean the second one, you'd have a lot of work to do when you do want the first one (which is far from rare, as the previous paragraph points out).
But really, thinking in terms of mathematical regularity is the simplest and most general way you can teach yourself to design passable APIs. To take a different example, it's very important that for any valid x and y, x == x[:y] + x[y:] -- which immediately indicates why one extreme of a slicing should be excluded. The simpler the invariant assertion you can formulate, the likelier it is that the resulting semantics are what you need in real life uses -- part of the mystical fact that maths is very useful in dealing with the universe.
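For example, the slicing invariant is easy to check directly:
>>> x = "abcdef"
>>> all(x == x[:y] + x[y:] for y in range(len(x) + 1))
True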
Try formulating the invariant for a split dialect in which leading and trailing delimiters are special-cased... counter-example: string methods such as isspace are not maximally simple -- x.isspace() is equivalent to x and all(c in string.whitespace for c in x) -- that silly leading x and is why you so often find yourself coding not x or x.isspace(), to get back to the simplicity which should have been designed into the is... string methods (whereby an empty string "is" anything you want -- contrary to man-in-the-street horse-sense, maybe [[empty sets, like zero &c, have always confused most people;-)]], but fully conforming to obvious well-refined mathematical common-sense!-).
Well, it lets you know there was a delimiter there. So, seeing 4 results lets you know you had 3 delimiters. This gives you the power to do whatever you want with this information, rather than having Python drop the empty elements, and then making you manually check for starting or ending delimiters if you need to know it.
Simple example: Say you want to check for absolute vs. relative filenames. This way you can do it all with the split, without also having to check what the first character of your filename is.
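As a small illustration of that idea (not part of the original answer):
>>> '/usr/bin'.split('/')[0] == ''   # absolute path: the leading '/' yields an empty first element
True
>>> 'usr/bin'.split('/')[0] == ''    # relative path
False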
Consider this minimal example:
>>> '/'.split('/')
['', '']
split must give you what's before and after the delimiter '/', but there are no other characters. So it has to give you empty strings, which technically precede and follow the '/', because '' + '/' + '' == '/'.
If you don't want empty strings to be returned by split, use it without arguments:
>>> " this is a sentence ".split()
['this', 'is', 'a', 'sentence']
>>> " this is a sentence ".split(" ")
['', '', 'this', '', '', 'is', '', 'a', 'sentence', '']
Always use the strip function before split if you want to ignore leading and trailing blank lines.
youroutput.strip().split('splitter')
Example:
yourstring = ' \nhey\njohn\nhow\n\nare\nyou'
yourstring.strip().split('\n')
My program seems to be indexing the wrong character, or not indexing at all.
I wrote a basic calculator that allows expressions to be used. It works by having the user enter the expression, turning it into a list, indexing the first number at position 0, and then using try/except statements to index number2 and the operator. All this is in a while loop that finishes when the user enters done at the prompt.
The program seems to work fine if I type the expression like this: "1+1", but if I add spaces ("1 + 1") it cannot index it, or it ends up indexing the operator if I do "1+1" followed by "1 + 1".
I have asked in a group chat before and someone told me to use tokenization instead of my method, but I want to understand why my program is not running properly before moving on to something else.
Here is my code:
https://hastebin.com/umabukotab.py
Thank you!
Strings are basically sequences of characters. "1+1" contains three characters, whereas "1 + 1" contains five, because of the two added spaces. Thus, when you access the third character in this longer string, you're actually accessing the middle element, the operator, rather than the second number.
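You can see the shift directly:
>>> "1+1"[2]      # the third character is the second operand
'1'
>>> "1 + 1"[2]    # the third character is now the operator
'+'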
Parsing input is often not easy, and certainly parsing arithmetic expressions can get tricky quite quickly. Removing spaces from the input, as suggested by @Sethroph, is a viable solution, but it will only go so far. If you all of a sudden need to support stuff like 1+2+3, it will still break.
Another solution would be to split your input on the operator. For example:
expr = '1 + 2'                   # renamed to avoid shadowing the built-in input()
terms = expr.split('+')          # ['1 ', ' 2'] note the spaces
terms = list(map(int, terms))    # [1, 2] since int() can handle leading/trailing whitespace
output = terms[0] + terms[1]     # 3
Still, although this can handle situations like 1 + 2 + 3, it will break when multiple different operators are involved or there are parentheses (but that might be something you need not worry about, depending on how complex you want your calculator to be).
IMO, a better approach would indeed be to use tokenization. Personally, I'd use parser combinators, but that may be a bit overkill. For reference, here's an example calculator whose input is parsed using parsy, a parser combinator library for Python.
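If you want to experiment with tokenization yourself without a library, a rough regex-based sketch (not the parsy example linked above) could look like this:
import re

# split an expression into number and operator tokens, ignoring whitespace
tokens = re.findall(r'\d+|[+\-*/()]', '1 + 23 * 4')
# tokens == ['1', '+', '23', '*', '4']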
You could remove the spaces before processing the string by using replace().
Try adding in:
clean_input = hold_input.replace(" ", "")
just after you create hold_input.
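For example:
>>> "1 + 1".replace(" ", "")
'1+1'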
I am using Python 3.4.1. The purpose of this program is to take a string that the user types in and print it reversed. For example, "Hello" becomes "olleH". I've been trying to get this right for a while now and it's been completely stumping me for quite some time.
word=input("Type any word:")
word_l=len(word)
word2=None
for word in range(word_l):
    word2+=word[word_l]-i
print(word2)
Sorry if this seems like a serious noob question. Thanks.
This is a standard use of slicing. There's an optional third argument when you're slicing that sets the step. By setting that step to -1 you reverse the sequence:
>>> word = "hello"
>>> print(word[::-1])
olleh
Rule of thumb: if you are using string or list indexes in a loop in Python to access each element, look at the problem again, because there is probably a better way to do it.
You can simply add each character to the front of a new string in turn, which reverses the string:
word = 'Hello'
word2 = ''
for c in word:
    word2 = c + word2
print(word2)
You can also just use reversed:
>>> ''.join(reversed('Hello'))
'olleH'
word = input("Type any word:")
word_l = len(word)
word2 = ''
for i in range(word_l):
    word2 += word[word_l - i - 1]
print(word2)
From an S.O. answer:
"Don't modify strings.
Work with them as lists; turn them into strings only when needed.
... code sample ...
Python strings are immutable (i.e. they can't be modified). There are a lot of reasons for this. Use lists until you have no choice, only then turn them into strings."
Is this considered best practice?
I find it a bit odd that Python has methods that return new modified strings (such as upper(), title(), replace() etc.) but doesn't have an insert method that returns a new string. Have I missed such a method?
Edit: I'm trying to rename files by inserting a character:
import os
for i in os.listdir('.'):
    i.insert(3, '_')
Which doesn't work due to immutability. Adding to the beginning of a string works fine though:
for i in os.listdir('.'):
    os.rename(i, 'some_random_string' + i)
Edit2: the solution:
>>> for i in os.listdir('.'):
...     os.rename(i, i[:4] + '_' + i[4:])
Slicing certainly is nice and solves my problem, but is there a logical explanation why there is no insert() method that returns a new string?
Thanks for the help.
If you want to insert at a particular spot, you can use slices and +. For example:
a = "hello"
b = a[:2] + '_S1M0N_' + a[2:]
then b will be equal to he_S1M0N_llo.
It's at least arguably a best practice if you are doing a very large number of modifications to a string. It is not a general purpose best practice. It's simply a useful technique for solving performance problems when doing heavy string manipulation.
My advice is, don't do it until performance becomes an issue.
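As a rough sketch of the list-then-join pattern the quoted answer describes (names are illustrative):
pieces = []
for ch in "hello":
    pieces.append(ch.upper())    # collect the modified parts in a list...
result = "".join(pieces)         # ...and build the final string once
# result == 'HELLO'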
You can define a generic function that works on any sequence (strings, lists, tuples, etc.) using the slice syntax:
def insert(s, c, p):
    return s[:p] + c + s[p:]

insert('FILE1', '_', 4)
# 'FILE_1'