I have a string
s = 'texttexttextblahblah",".'
and I want to cut off some of the rightmost characters by indexing and assign the result back to s, so that s will be equal to texttexttextblahblah".
I've looked around and found how to print by indexing, but not how to reassign that actual variable to be trimmed.
Just reassign what you printed to the variable.
>>> s = 'texttexttextblahblah",".'
>>> s = s[:-3]
>>> s
'texttexttextblahblah"'
>>>
Unless you know exactly how many text and blah's you'll have, use .find() as Brent suggested (or .index(x), which is like find, except that it raises ValueError when it doesn't find x).
If you want that trailing ", just add one to the value it kicks out. (or just find the value you actually want to split at, ,)
s = s[:s.find('"') + 1]
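One caveat with find: it returns -1 when the character is missing, so the slice above becomes s[:0] and silently yields an empty string. A minimal sketch that guards against that case; trim_after is a hypothetical helper name, not something from the answers:

```python
def trim_after(s, marker):
    """Return s cut just after the first occurrence of marker.

    If marker is absent (find returns -1), s is returned unchanged.
    """
    pos = s.find(marker)
    if pos == -1:
        return s
    return s[:pos + len(marker)]


s = 'texttexttextblahblah",".'
s = trim_after(s, '"')
print(s)  # texttexttextblahblah"
```

Returning the string unchanged when the marker is missing is only one possible policy; raising an error, as .index() does, may suit other cases better.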
If you need something that works like a string, but is mutable you can use a bytearray:
>>> s = bytearray('texttexttextblahblah",".')
>>> s[20:] = ''
>>> print s
texttexttextblahblah
bytearray has all the usual string methods.
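The session above is Python 2. As a sketch of the same idea under Python 3, where bytearray holds bytes rather than text: this assumes the data is plain ASCII and decodes back to a string at the end.

```python
buf = bytearray(b'texttexttextblahblah",".')
del buf[21:]            # drop everything after the double quote, in place
buf[0:4] = b'TEXT'      # slices can also be overwritten in place
result = buf.decode('ascii')
print(result)  # TEXTtexttextblahblah"
```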
Strings are immutable so you can't really change the string in-place. You'll need to slice out the part you want and then reassign it back over the original variable.
Is something like this what you wanted? (note I left out storing the index in a variable because I'm not sure how you're using this):
>>> s = 'texttexttextblahblah",".'
>>> s.index('"')
20
>>> s = s[:20]
>>> s
'texttexttextblahblah'
I prefer to do it without indexing. (partition, my favorite, was named the winner for speed and clarity in the comments, so I updated the original code.)
s = 'texttexttextblahblah",".'
s,_,_ = s.partition(',')
print s
Result
texttexttextblahblah"
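For reference, partition always returns a 3-tuple (head, separator, tail). When the separator is absent it returns the whole string followed by two empty strings, so the three-way unpacking never fails. A short sketch:

```python
s = 'texttexttextblahblah",".'

head, sep, tail = s.partition(',')
print(head)  # texttexttextblahblah"
print(tail)  # ".

# Separator not present: the original string comes back as the head.
missing = s.partition('#')
print(missing == (s, '', ''))  # True
```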
Related
I am new in python, and I just find something strange:
>>> test="acdefg"
>>> test.replace('a','h')
'hcdefg'
>>> test
'acdefg'
>>> test=[1,2,3]
>>> test.reverse()
>>> test
[3, 2, 1]
As you can see in the code, the first time the variable "test" is a string, and when I call the method "replace", the value of "test" doesn't change. The second time it is a list, and the list changed after I called the method reverse().
Why was that? Is it because of something different between the methods or something different between the objects or something else?
Strings are immutable. So you aren't actually changing test. You are actually getting the return of the replace string method. To use this modified string, you have to create a new string, or simply replace the existing string with the new value.
>>> some_string = "abcd"
>>> new_string = some_string.replace('a', 'x')
>>> new_string
'xbcd'
>>> some_string = "abcd"
>>> some_string = some_string.replace('a', 'x')
>>> some_string
'xbcd'
In the second example, the list is mutable, and you are performing an in-place manipulation of the list. If you actually do this:
res = your_list.reverse()
res will be None, because reverse() doesn't return anything. It reverses the list in place, which is why the test list holds the result of the manipulation you performed.
Read this on immutable vs mutable types in Python.
Also, refer to the documentation here on the Data Model to further your understanding as well.
It depends entirely on the implementation of the method. Some methods modify the objects they're called on, some do not.
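A common convention among the built-in types, shown here as a small sketch: methods that mutate in place (list.sort, list.reverse) return None, while functions and methods that build a new object (sorted, str.replace) return it and leave the original alone.

```python
numbers = [3, 1, 2]

new_list = sorted(numbers)     # builds a new list; numbers is untouched
print(numbers)                 # [3, 1, 2]
print(new_list)                # [1, 2, 3]

result = numbers.sort()        # sorts in place and returns None
print(numbers)                 # [1, 2, 3]
print(result)                  # None

text = "acdefg"
replaced = text.replace('a', 'h')  # str is immutable: a new string is returned
print(text)                    # acdefg
print(replaced)                # hcdefg
```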
I have various strings
123_dog
2_fish
56_cat
45_cat_fish
There is always one number. Always a '_' after the number.
I need to remove the number and the underscore. I can use regex, but I wonder if there is some pythonic way that uses builtin methods?
(I'm an experienced coder - but new to Python.)
Assuming that there is always an underscore after the number, and that there is always exactly a single number, you can do this:
s = '45_cat_fish'
print s.split('_', 1)[1]
# >>> cat_fish
The argument to split specifies the maximum number of splits to perform.
Using split and join:
>>> a="45_cat_fish"
>>> '_'.join(a.split('_')[1:])
'cat_fish'
Edit: split can take a maxsplit argument (see YS-L answer), so '_'.join is unnecessary, a.split('_',1)[1]…
Using find
>>> a[a.find('_')+1:]
'cat_fish'
Another way is:
s = "45_cat_fish"
print ''.join(c for c in s if c.isalpha() or c == '_')[1:]
gives cat_fish
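All of the approaches above rely on the "number, then underscore" assumption. A hedged sketch that checks the assumption before stripping; strip_number_prefix is a made-up name for illustration:

```python
def strip_number_prefix(s):
    """Remove a leading '<digits>_' prefix; return s unchanged otherwise."""
    head, sep, tail = s.partition('_')
    if sep and head.isdigit():
        return tail
    return s


for item in ['123_dog', '2_fish', '56_cat', '45_cat_fish', 'no_number']:
    print(strip_number_prefix(item))
# dog
# fish
# cat
# cat_fish
# no_number
```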
I have a bit of a weird question here.
I am using iperf to test performance between a device and a server. I get the results of this test over SSH, which I then want to parse into values using a parser that has already been made. However, there are several lines at the top of the results (which I read into an object of lines) that I don't want to go into the parser. I know exactly how many lines I need to remove from the top each time, though. Is there any way to drop specific entries out of a list? Something like this in pseudo-Python:
print list
["line1","line2","line3","line4"]
list = list.drop([0 - 1])
print list
["line3","line4"]
If anyone knows anything I could use I would really appreciate you helping me out. The only thing I can think of is writing a loop to iterate through and make a new list, only putting in what I need. Anyway, thanks!
Michael
Slices:
l = ["line1","line2","line3","line4"]
print l[2:] # print from index 2 (inclusive) onwards, i.e. from the 3rd element
['line3', 'line4']
Slice syntax is [from(included):to(excluded):step]. Each part is optional. So you can write [:] to get a copy of the whole list (or of any sequence, for that matter; strings and tuples are examples from the built-ins). You can also use negative indexes, so [:-2] means from the beginning up to, but not including, the second-to-last element. You can also step backwards: [::-1] means get all, but in reversed order.
Also, don't use list as a variable name. It overrides the built-in list class.
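The slice forms described above, collected in one short sketch (the variable name lines is just for illustration). Each slice produces a new list and leaves the original untouched:

```python
lines = ["line1", "line2", "line3", "line4"]

print(lines[2:])    # ['line3', 'line4']   index 2 to the end
print(lines[:-2])   # ['line1', 'line2']   everything except the last two
print(lines[::2])   # ['line1', 'line3']   every second item
print(lines[::-1])  # ['line4', 'line3', 'line2', 'line1']   reversed copy
print(lines)        # ['line1', 'line2', 'line3', 'line4']   original unchanged
```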
This is what the slice operator is for:
>>> before = [1,2,3,4]
>>> after = before[2:]
>>> print after
[3, 4]
In this instance, before[2:] says 'give me the elements of the list before, starting at element 2 and all the way until the end.'
(also, don't use built-in names like list or dict as variable names; they are not reserved words, so Python lets you shadow them, and doing that can lead to confusing bugs)
You can use slices for that:
>>> l = ["line1","line2","line3","line4"] # don't use "list" as variable name, it's a built-in.
>>> print l[2:] # to discard items up to some point, specify a starting index and no stop point.
['line3', 'line4']
>>> print l[:1] + l[3:] # to drop items "in the middle", join two slices.
['line1', 'line4']
why not use a basic list slice? something like:
list = list[3:] #everything from the 3 position to the end
You want del for that
del list[:2]
You can use the "del" statement to remove specific entries:
del(list[0]) # remove entry 0
del(list[0:2]) # remove entries 0 and 1
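The difference between del and slice-and-reassign matters when another name refers to the same list. A small sketch, with illustrative variable names:

```python
lines = ["line1", "line2", "line3", "line4"]
alias = lines            # second name for the same list object

lines = lines[2:]        # builds a new list and rebinds only 'lines'
print(lines)             # ['line3', 'line4']
print(alias)             # ['line1', 'line2', 'line3', 'line4']

del alias[:2]            # removes the first two entries in place
print(alias)             # ['line3', 'line4']
```

If the parser already holds a reference to the list, del is the form that it will see; otherwise the two are interchangeable here.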
I'm relatively new to Python and its libraries, and I was wondering how I might create a string array with a preset size. It's easy in Java, but I was wondering how I might do this in Python.
So far all I can think of is
strs = ['']*size
And somehow when I try to call string methods on it, the debugger gives me an error: X operation does not exist in object tuple.
And if it was in java this is what I would want to do.
String[] ar = new String[size];
Arrays.fill(ar,"");
Please help.
Error code
strs[sum-1] = strs[sum-1].strip('\(\)')
AttributeError: 'tuple' object has no attribute 'strip'
Question: How might I do what I can normally do in Java in Python while still keeping the code clean.
In python, you wouldn't normally do what you are trying to do. But, the below code will do it:
strs = ["" for x in range(size)]
In Python, the tendency is usually to use a non-fixed-size list (that is to say, items can be appended to or removed from it dynamically). If you followed this, there would be no need to allocate a fixed-size collection ahead of time and fill it in with empty values. Rather, as you get or create strings, you simply add them to the list. When it comes time to remove values, you simply remove the appropriate value from the list. I would imagine you can probably use this technique for this. For example (in Python 2.x syntax):
>>> temp_list = []
>>> print temp_list
[]
>>>
>>> temp_list.append("one")
>>> temp_list.append("two")
>>> print temp_list
['one', 'two']
>>>
>>> temp_list.append("three")
>>> print temp_list
['one', 'two', 'three']
>>>
Of course, some situations might call for something more specific. In your case, a good idea may be to use a deque. Check out the post here: Python, forcing a list to a fixed size. With this, you can create a deque which has a fixed size. If a new value is appended to the end, the first element (head of the deque) is removed and the new item is appended onto the deque. This may work for what you need, but I don't believe this is considered the "norm" for Python.
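A minimal sketch of the fixed-size deque described above, using the maxlen argument of collections.deque (the variable names are illustrative):

```python
from collections import deque

recent = deque(maxlen=3)       # holds at most 3 items
for word in ["one", "two", "three", "four"]:
    recent.append(word)        # once full, the oldest item is discarded

print(list(recent))            # ['two', 'three', 'four']
print(len(recent))             # 3
```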
The simple answer is, "You don't." At the point where you need something to be of fixed length, you're either stuck on old habits or writing for a very specific problem with its own unique set of constraints.
One convenient way to create a string array in Python is with the help of the NumPy library.
Example:
import numpy as np
arr = np.chararray((rows, columns))
This will create an array of one-character strings (the default item size is 1). The contents are not guaranteed to be initialized, so fill the array using either indexing or slicing before reading from it.
Are you trying to do something like this?
>>> strs = [s.strip('\(\)') for s in ['some\\', '(list)', 'of', 'strings']]
>>> strs
['some', 'list', 'of', 'strings']
But what is the reason to use a fixed size? There is no actual need in Python for fixed-size arrays (lists): you can always grow a list using append or extend, shrink it using pop, or at least use slicing.
x = ['' for x in xrange(10)]
strlist =[{}]*10
strlist[0] = set()
strlist[0].add("Beef")
strlist[0].add("Fish")
strlist[1] = {"Apple", "Banana"}
strlist[1].add("Cherry")
print(strlist[0])
print(strlist[1])
print(strlist[2])
print("Array size:", len(strlist))
print(strlist)
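A caveat with the multiplication form used above: [{}]*10 puts ten references to the same dict in the list. That is harmless for immutable items such as '', and the snippet above sidesteps it by reassigning slots, but mutating a shared item shows up in every slot. A sketch of the difference:

```python
shared = [[]] * 3                   # three references to ONE list
shared[0].append("x")
print(shared)                       # [['x'], ['x'], ['x']]

separate = [[] for _ in range(3)]   # three distinct lists
separate[0].append("x")
print(separate)                     # [['x'], [], []]

strs = [''] * 3                     # fine: strings are immutable
strs[0] = 'a'                       # rebinding one slot leaves the others alone
print(strs)                         # ['a', '', '']
```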
The error message says it all: strs[sum-1] is a tuple, not a string. If you show more of your code someone will probably be able to help you. Without that we can only guess.
Sometimes I need an empty char array. You cannot use "np.empty(size)" because an error will be reported if you fill in chars later. So I usually do something quite clumsy, but it is still one way to do it:
# Suppose you want a size N char array
charlist = [' ']*N # other preset character is fine as well, like 'x'
chararray = np.array(charlist)
# Then you change the content of the array
chararray[somecondition1] = 'a'
chararray[somecondition2] = 'b'
The bad part of this is that your array has default values (if you forget to change them).
import re

def _remove_regex(input_text, regex_pattern):
    # Find every match, then remove each matched word from the text.
    findregs = re.finditer(regex_pattern, input_text)
    for i in findregs:
        input_text = re.sub(i.group().strip(), '', input_text)
    return input_text

regex_pattern = r"\buntil\b|\bcan\b|\bboat\b"
_remove_regex("row and row and row your boat until you can row no more", regex_pattern)
\w means that it matches word characters, a|b means match either a or b, \b represents a word boundary
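Since re.sub already replaces every match, the finditer loop is not strictly needed. A minimal alternative sketch; remove_words is a hypothetical name, and collapsing the leftover double spaces is an extra step that the original function does not perform:

```python
import re

def remove_words(text, pattern):
    """Delete every match of pattern, then squeeze repeated whitespace."""
    stripped = re.sub(pattern, '', text)
    return ' '.join(stripped.split())


pattern = r"\buntil\b|\bcan\b|\bboat\b"
cleaned = remove_words("row and row and row your boat until you can row no more", pattern)
print(cleaned)  # row and row and row your you row no more
```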
If you want to take input from user here is the code
If each string is given in new line:
strs = [input() for i in range(size)]
If the strings are separated by spaces:
strs = list(input().split())
I have a string like this that I need to parse into a 2D array:
str = "'813702104[813702106]','813702141[813702143]','813702172[813702174]'"
the array equiv would be:
arr[0][0] = 813702104
arr[0][1] = 813702106
arr[1][0] = 813702141
arr[1][1] = 813702143
#... etc ...
I'm trying to do this by REGEX. The string above is buried in an HTML page but I can be certain it's the only string in that pattern on the page. I'm not sure if this is the best way, but it's all I've got right now.
imgRegex = re.compile(r"(?:'(?P<main>\d+)\[(?P<thumb>\d+)\]',?)+")
If I run imgRegex.match(str).groups() I only get one result (the first couplet). How do I either get multiple matches back or a 2d match object (if such a thing exists!)?
Note: Contrary to how it might look, this is not homework
Note part deux: The real string is embedded in a large HTML file and therefore splitting does not appear to be an option.
I'm still getting answers for this, so I thought I better edit it to show why I'm not changing the accepted answer. Splitting, though more efficient on this test string, isn't going to extract the parts from a whole HTML file. I could combine a regex and splitting but that seems silly.
If you do have a better way to find the parts from a load of HTML (the pattern \d+\[\d+\] is unique to this string in the source), I'll happily change accepted answers. Anything else is academic.
I would try findall or finditer instead of match.
Edit by Oli: Yeah, findall works brilliantly, but I had to simplify the regex to:
r"'(?P<main>\d+)\[(?P<thumb>\d+)\]',?"
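A sketch of how the simplified pattern can be applied straight to a page with finditer, converting the named groups to integers along the way. The html string here is a made-up stand-in for the real page, and the optional trailing comma is dropped from the pattern because finditer does not need it:

```python
import re

# Made-up HTML fragment standing in for the real page.
html = """<script>var imgs = ['813702104[813702106]','813702141[813702143]','813702172[813702174]'];</script>"""

img_regex = re.compile(r"'(?P<main>\d+)\[(?P<thumb>\d+)\]'")

# One [main, thumb] pair of integers per match, in document order.
arr = [[int(m.group('main')), int(m.group('thumb'))]
       for m in img_regex.finditer(html)]
print(arr)
# [[813702104, 813702106], [813702141, 813702143], [813702172, 813702174]]
```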
I think I will not go for regex for this task. Python list comprehension is quite powerful for this
In [27]: s = "'813702104[813702106]','813702141[813702143]','813702172[813702174]'"
In [28]: d=[[int(each1.strip(']\'')) for each1 in each.split('[')] for each in s.split(',')]
In [29]: d[0][1]
Out[29]: 813702106
In [30]: d[1][0]
Out[30]: 813702141
In [31]: d
Out[31]: [[813702104, 813702106], [813702141, 813702143], [813702172, 813702174]]
Modifying your regexp a little,
>>> str = "'813702104[813702106]','813702141[813702143]','813702172[813702174]'"
>>> imgRegex = re.compile(r"'(?P<main>\d+)\[(?P<thumb>\d+)\]',?")
>>> print imgRegex.findall(str)
[('813702104', '813702106'), ('813702141', '813702143'), ('813702172', '813702174')]
Which is a "2 dimensional array" - in Python, "a list of 2-tuples".
I've got something that seems to work on your data set:
In [19]: str = "'813702104[813702106]','813702141[813702143]','813702172[813702174]'"
In [20]: ptr = re.compile( r"'(?P<one>\d+)\[(?P<two>\d+)\]'" )
In [21]: ptr.findall( str )
Out[21]:
[('813702104', '813702106'),
('813702141', '813702143'),
('813702172', '813702174')]
Alternatively, you could use Python's [statement for item in list] syntax (a list comprehension) for building lists. You should find this to be considerably faster than a regex, particularly for small data sets. Larger data sets will show a less marked difference (it only has to load the regular expressions engine once no matter the size), but the listmaker should always be faster.
Start by splitting the string on commas:
>>> str = "'813702104[813702106]','813702141[813702143]','813702172[813702174]'"
>>> arr = [pair for pair in str.split(",")]
>>> arr
["'813702104[813702106]'", "'813702141[813702143]'", "'813702172[813702174]'"]
Right now, this returns the same thing as just str.split(","), so isn't very useful, but you should be able to see how the listmaker works — it iterates through list, assigning each value to item, executing statement, and appending the resulting value to the newly-built list.
In order to get something useful accomplished, we need to put a real statement in, so we get a slice of each pair which removes the single quotes and closing square bracket, then further split on that conveniently-placed opening square bracket:
>>> arr = [pair[1:-2].split("[") for pair in str.split(",")]
>>> arr
[['813702104', '813702106'], ['813702141', '813702143'], ['813702172', '813702174']]
This returns a two-dimensional array like you describe, but the items are all strings rather than integers. If you're simply going to use them as strings, that's far enough. If you need them to be actual integers, you simply use an "inner" listmaker as the statement for the "outer" listmaker:
>>> arr = [[int(x) for x in pair[1:-2].split("[")] for pair in str.split(",")]
>>> arr
[[813702104, 813702106], [813702141, 813702143], [813702172, 813702174]]
This returns a two-dimensional array of the integers represented in a string like the one you provided, without ever needing to load the regular expressions engine.