remove specific whitespaces from list items - python

I have a huge list of lines, each of which looks as follows
1 01 01 some random text
The 1 01 01 part is a reference number that changes from line to line. I want to remove the two whitespaces between the three reference numbers, so that the lines look as follows.
10101 some random text
Obviously, this calls for a for loop. The question is what I should write inside the loop. I can't use strip,
for i in my_list:
    i.strip()
because that, if anything, would remove all whitespaces, giving me
10101somerandomtext
which I don't want. But if I write
for i in my_list:
    i.remove(4)
    i.remove(1)
I get the error message 'str' object has no attribute 'remove'. What is the proper solution in this case?
Thanks in advance.

If the number is always at the beginning, you can use the fact that the str.replace function takes an optional count argument:
for l in mylist:
    print(l.replace(' ', '', 2))
Note that I'm doing print here for a reason: you can't change the strings in-place, because strings are immutable (this is also why they don't have a remove method, and replace returns a modified string, but leaves the initial string intact). So if you need them in a list, it's cleaner to create another list like this:
newlist = [l.replace(' ', '', 2) for l in mylist]
It's also safe to overwrite the list like this:
mylist = [l.replace(' ', '', 2) for l in mylist]
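As a small sketch of the immutability point, the snippet below uses a made-up sample list (the name lines and its contents are assumptions, not from the question). It shows that replace returns a new string, leaves the original untouched, and only removes the first two spaces.

```python
# Hypothetical sample data; the real list would come from your own source.
lines = ["1 01 01 some random text", "2 13 07 another line of text"]

original = lines[0]
cleaned = original.replace(" ", "", 2)  # remove only the first two spaces

print(original)  # unchanged, because str objects are immutable
print(cleaned)

# Build a new list instead of trying to modify the strings in place.
new_lines = [line.replace(" ", "", 2) for line in lines]
print(new_lines)
```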

Use the count argument of replace to replace only the first 2 spaces.
>>> a = "1 01 01 some random text"
>>> a.replace(" ", "", 2)
'10101 some random text'

split takes a second argument: the maximum number of splits to make.
for i in my_list:
    components = i.split(" ", 3)
    refnum = ''.join(components[:3])
    text = components[3]
Or in Python 3:
for i in my_list:
    *components, text = i.split(" ", 3)
    refnum = ''.join(components)
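The approaches above assume every line really starts with the three-part reference number. If some lines might not, a regular expression is one possible alternative. This is only a sketch: the helper name join_reference and the assumed layout (three digit groups separated by single spaces at the start of the line) are mine, not from the question.

```python
import re

# Assumed layout: three digit groups separated by single spaces at the start.
REF_PATTERN = re.compile(r"^(\d+) (\d+) (\d+)")

def join_reference(line):
    """Join the three leading digit groups; leave non-matching lines alone."""
    return REF_PATTERN.sub(r"\1\2\3", line)

print(join_reference("1 01 01 some random text"))
print(join_reference("no reference number here"))
```

Lines that do not match the pattern come back unchanged, which the plain replace(' ', '', 2) call would not guarantee.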

Related

Extract data from a list Python

I have a list of strings and I want to take the last "word" of each one.
Here's my code:
myList = ["code 53 value 281", "code 53 value 25", ....]
And I want to keep only the number at the end:
myList = ["281", "25", ....]
Thank you.
Let's break down your problem.
So first off, you've got a list of strings. You know that each string will end with some kind of numeric value, you want to pull that out and store it in the list. Basically, you want to get rid of everything except for that last numeric value.
To write it in code terms, we need to iterate on that list, split each string by a space character ' ', then grab the last word from that collection, and store it in the list.
There are quite a few ways you could do this, but the simplest would be list comprehension.
myList = ["Hey 123", "Hello 456", "Bye 789"] # we want 123, 456, 789
myNumericList = [x.split(' ')[-1] for x in myList]
# for x in myList is pretty obvious, looks like a normal for loop
# x.split(' ') will split the string by the space, as an example, "Hey 123" would become ["Hey", "123"]
# [-1] gets the last element from the collection
print(myNumericList) # "123", "456", "789"
I don't know why you would want to check if there are integers in your text, extract them and then convert them back to a string and add to a list. Anyhow, you can use .split() to split the text on whitespace and then take the last of the resulting strings, like so:
myList = ["code 53 value 281", "code 53 value 25"]
result = []  # not named "list", which would shadow the built-in
for var in myList:
    result.append(var.split()[-1])
print(result)
Loop through the list and, for each string, split it and pick the last word.
See the code below:
ans = []
for i in myList:
    ans.append(i.split(" ")[-1])
print(ans)
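If you actually need numbers rather than strings, here is a hedged variation on the answers above. The function name last_numbers is made up for illustration, and it assumes every string is non-empty and ends in an integer.

```python
def last_numbers(items):
    """Return the trailing whitespace-separated token of each string as an int."""
    numbers = []
    for item in items:
        # rsplit(None, 1) splits at most once, from the right, on whitespace.
        last = item.rsplit(None, 1)[-1]
        numbers.append(int(last))  # raises ValueError if the token is not numeric
    return numbers

print(last_numbers(["code 53 value 281", "code 53 value 25"]))
```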

Split in Python on a special character

I am looping over an array of strings and splitting each one. The split must follow this rule:
Split the string into two parts at a special character, and select the first part as the result.
SCRIPT
array = [
'srv1 #s',
'srv2;192.168.9.1'
]
result = []
for x in array:
    outfinally = [line.split(';')[0] and line.split()[0] for line in x.splitlines() if line and line[0].isalpha()]
    for srv in outfinally:
        if srv != None:
            result.append(srv)
for i in result:
    print(i)
OUTPUT
srv1
srv2;192.168.9.1
DESIRED OUTPUT
srv1
srv2
This should split on any special character and append the first part of the split to a new list:
import re
array = [
'srv1 #s',
'srv2;192.168.9.1'
]
sep = r'[`\-=~!##$%^&*()_+\[\]{};\'\\:"|<,./<>?]'
new_array = []
for i in array:
    new_array.append(re.split(sep, i)[0])
Output:
['srv1 ', 'srv2']
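Note the trailing space in 'srv1 ' above. One possible way to avoid it is to treat whitespace as a separator too. The sketch below uses a narrower separator set (semicolon, '#', whitespace), which is my assumption about what counts as "special" here.

```python
import re

array = ['srv1 #s', 'srv2;192.168.9.1']

# Split once on the first run of ';', '#', or whitespace characters.
first_parts = [re.split(r'[;#\s]+', item, maxsplit=1)[0] for item in array]
print(first_parts)
```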
You can split twice with the two different separators instead:
result = [s.split()[0].split(';')[0] for s in array]
result becomes:
['srv1', 'srv2']
The problem is here: line.split(';')[0] and line.split()[0]
Your second condition splits on whitespace. As a result, it'll always return the whitespace-split version unless there's a semicolon at the start of the input (in which case you get empty string).
You probably want to chain the two splits instead:
line.split(';')[0].split()[0]
To see what the code in your question is doing, take a look at what your conditional expression does in a few different cases:
>>> array = ['srv1 s', 'srv2;192.168.9.1', ';192.168.1.1', 'srv1;srv2 192.168.1.1']
>>> for item in array:
...     print("Original: {}\n\tSplit: {}".format(item, item.split(';')[0] and item.split()[0]))
...
Original: srv1 s
    Split: srv1 # split on whitespace
Original: srv2;192.168.9.1
    Split: srv2;192.168.9.1 # split on whitespace!
Original: ;192.168.1.1
    Split: # split on special char, returned empty which is falsey, returns empty str
Original: srv1;srv2 192.168.1.1
    Split: srv1;srv2 # split only on whitespace
Change
outfinally = [line.split(';')[0] and line.split()[0] for line in x.splitlines() if line and line[0].isalpha()]
To
outfinally = [line.replace(';', ' ').split()[0] for line in x.splitlines() if line and line[0].isalpha()]
When you use and like that, it returns the first result only if that result is falsy; otherwise it returns the second. line.split(';')[0] is a non-empty (truthy) string unless the line starts with a semicolon, so you almost always get the second result, line.split()[0], which only splits on whitespace. (If you use or, like I first tried to do, you get the first result whenever it is truthy and never reach the second.) Note also that split returns the full string in a one-element list when the separator is not found. Instead of having 2 conditions, combine them into one, with something like line.replace(';', ' ').split()[0]; blhsing's solution is even better.
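A minimal sketch of how Python evaluates and / or between two strings, using one of the sample lines from the question (the variable names are just for illustration):

```python
line = 'srv2;192.168.9.1'

first = line.split(';')[0]   # 'srv2'
second = line.split()[0]     # 'srv2;192.168.9.1' (no whitespace to split on)

and_result = first and second  # first is truthy, so `and` yields second
or_result = first or second    # first is truthy, so `or` yields first
chained = line.split(';')[0].split()[0]  # apply both splits in sequence

print(and_result)
print(or_result)
print(chained)
```

Neither operator applies both splits; chaining them (or replacing the semicolon first) does.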

AttributeError: 'str' object has no attribute 'remove' [duplicate]

There is a string, for example, EXAMPLE.
How can I remove the middle character, i.e., M from it? I don't need the code. I want to know:
Do strings in Python end in any special character?
Which is a better way - shifting everything right to left starting from the middle character OR creation of a new string and not copying the middle character?
In Python, strings are immutable, so you have to create a new string. You have a few options of how to create the new string. If you want to remove the 'M' wherever it appears:
newstr = oldstr.replace("M", "")
If you want to remove the central character:
midlen = len(oldstr) // 2
newstr = oldstr[:midlen] + oldstr[midlen+1:]
You asked if strings end with a special character. No, you are thinking like a C programmer. In Python, strings are stored with their length, so any byte value, including \0, can appear in a string.
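To make both points concrete, here is a short sketch. The helper name remove_middle is made up for illustration. It builds a new string without the middle character and then shows that a NUL character is just ordinary data in a Python string.

```python
def remove_middle(text):
    """Return a new string without the character at index len(text) // 2."""
    mid = len(text) // 2
    return text[:mid] + text[mid + 1:]

print(remove_middle("EXAMPLE"))

# A NUL character is ordinary data in a Python str, not a terminator.
with_nul = "ab\x00cd"
print(len(with_nul))
```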
To remove the character at a specific position:
s = s[:pos] + s[(pos+1):]
To remove a specific character:
s = s.replace('M','')
This is probably the best way:
original = "EXAMPLE"
removed = original.replace("M", "")
Don't worry about shifting characters and such. Most Python code takes place on a much higher level of abstraction.
Strings are immutable. But you can convert them to a list, which is mutable, and then convert the list back to a string after you've changed it.
s = "this is a string"
l = list(s)       # convert to list
# Three alternative ways to drop the letter h at index 1 (use only one):
l[1] = ""         # "delete" letter h (the item actually still exists but is empty)
# l[1:2] = []     # really delete letter h (the item is actually removed from the list)
# del l[1]        # another way to delete it
p = l.index("a")  # find position of the letter "a"
del l[p]          # delete it
s = "".join(l)    # convert back to string
You can also create a new string, as others have shown, by taking everything except the character you want from the existing string.
How can I remove the middle character, i.e., M from it?
You can't, because strings in Python are immutable.
Do strings in Python end in any special character?
No. They are similar to lists of characters; the length of the list defines the length of the string, and no character acts as a terminator.
Which is a better way - shifting everything right to left starting from the middle character OR creation of a new string and not copying the middle character?
You cannot modify the existing string, so you must create a new one containing everything except the middle character.
Use the translate() method (this two-argument form only works on a Python 2 str):
>>> s = 'EXAMPLE'
>>> s.translate(None, 'M')
'EXAPLE'
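On Python 3, str.translate takes a single table argument, so the call above raises a TypeError there. A sketch of two equivalent Python 3 forms, using the same sample string:

```python
s = 'EXAMPLE'

# With three arguments, str.maketrans maps every character in the third
# argument to None, which translate() treats as "delete this character".
without_m = s.translate(str.maketrans('', '', 'M'))
print(without_m)

# The table can also be written by hand as {code point: None}.
also_without_m = s.translate({ord('M'): None})
print(also_without_m)
```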
def kill_char(string, n):  # n = position of the character you want to remove
    begin = string[:n]     # from beginning to n (n not included)
    end = string[n+1:]     # n+1 through end of string
    return begin + end
print(kill_char("EXAMPLE", 3))  # "M" removed
I have seen this somewhere here.
card = random.choice(cards)
cardsLeft = cards.replace(card, '', 1)
How to remove one character from a string:
Here is an example where there is a stack of cards represented as characters in a string.
One of them is drawn (import random module for the random.choice() function, that picks a random character in the string).
A new string, cardsLeft, is created to hold the remaining cards given by the string function replace() where the last parameter indicates that only one "card" is to be replaced by the empty string...
On Python 2, you can use UserString.MutableString to do it in a mutable way:
>>> import UserString
>>> s = UserString.MutableString("EXAMPLE")
>>> type(s)
<class 'UserString.MutableString'>
>>> del s[3] # Delete 'M'
>>> s = str(s) # Turn it into an immutable value
>>> s
'EXAPLE'
MutableString was removed in Python 3.
Another way is with a function.
Below is a way to remove all vowels from a string, just by calling the function:
def disemvowel(s):
    return s.translate(None, "aeiouAEIOU")  # Python 2 only; Python 3's str.translate takes a single table
Here's what I did to slice out the "M":
s = 'EXAMPLE'
s1 = s[:s.index('M')] + s[s.index('M')+1:]
To delete a char or a sub-string once (only the first occurrence):
main_string = main_string.replace(sub_str, replace_with, 1)
NOTE: Here 1 can be replaced with any int, giving the number of occurrences you want to replace.
You can simply use list comprehension.
Assume that you have the string my name is and you want to remove the character m. Use the following code:
"".join([x for x in "my name is" if x != 'm'])
If you want to delete/ignore characters in a string, and, for instance, you have this string,
"[11:L:0]"
from a web API response or something like that, like a CSV file, let's say you are using requests
import requests
udid = 123456
url = 'http://webservices.yourserver.com/action/id-' + str(udid)  # udid is an int, so convert it before concatenating
s = requests.Session()
s.verify = False
resp = s.get(url, stream=True)
content = resp.content
Loop and get rid of unwanted chars:
for line in resp.iter_lines():
    line = line.replace("[", "")
    line = line.replace("]", "")
    line = line.replace('"', "")
Optional split, and you will be able to read values individually:
listofvalues = line.split(':')
Now accessing each value is easier:
print listofvalues[0]
print listofvalues[1]
print listofvalues[2]
This will print
11
L
0
Two new string removal methods were introduced in Python 3.9:
# str.removeprefix("prefix_to_be_removed")
# str.removesuffix("suffix_to_be_removed")
s = 'EXAMPLE'
# In this case the position of 'M' is 3
s = s[:3] + s[3:].removeprefix('M')
# OR
s = s[:4].removesuffix('M') + s[4:]
# output: 'EXAPLE'
from random import randint
def shuffle_word(word):
    newWord = ""
    for i in range(0, len(word)):
        pos = randint(0, len(word) - 1)
        newWord += word[pos]
        word = word[:pos] + word[pos+1:]
    return newWord
word = "Sarajevo"
print(shuffle_word(word))
Strings are immutable in Python so both your options mean the same thing basically.

Python get the x first words in a string

I'm looking for code that takes the first 4 (or 5) words of a string.
I tried this:
import re
my_string = "the cat and this dog are in the garden"
a = my_string.split(' ', 1)[0]
b = my_string.split(' ', 1)[1]
But I can't take more than 2 strings:
a = the
b = cat and this dog are in the garden
I would like to have:
a = the
b = cat
c = and
d = this
...
You can use slice notation on the list created by split:
my_string.split()[:4] # first 4 words
my_string.split()[:5] # first 5 words
N.B. these are example commands. You should use one or the other, not both in a row.
The second argument of the split() method is the limit (the maximum number of splits). Don't use it and you will get all words.
Use it like this:
my_string = "the cat and this dog are in the garden"
splitted = my_string.split()
first = splitted[0]
second = splitted[1]
...
Also, don't call split() every time you want a word; it is expensive. Do it once and then reuse the result, as in my example.
As you can see, there is no need to add the ' ' delimiter since the default delimiter for the split() function (None) matches all whitespace. You can use it however if you don't want to split on Tab for example.
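A quick sketch of the difference between the default delimiter and an explicit ' ', using a made-up string that contains a double space and a tab:

```python
text = "the  cat\tand this"  # two spaces after "the", a tab after "cat"

default_split = text.split()   # splits on runs of any whitespace
space_split = text.split(' ')  # splits on each single space only

print(default_split)
print(space_split)
```

With ' ' as the delimiter, the double space produces an empty string and the tab stays inside a word.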
You can split a string on whitespace easily enough, but if your string doesn't happen to have enough words in it, the assignment will fail because the list has fewer items than there are names to unpack into.
a, b, c, d, e = my_string.split()[:5] # Raises ValueError if there are fewer than 5 words
You'd be better off keeping the list as is instead of assigning each member to an individual name.
words = my_string.split()
at_most_five_words = words[:5] # terrible variable name
That's a terrible variable name, but I used it to illustrate the fact that you're not guaranteed to get five words – you're only guaranteed to get at most five words.
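If you really do want separate names, one option is to pad the list so the unpacking cannot fail. This is only a sketch; the helper first_words and its fill parameter are made up for illustration.

```python
def first_words(text, n, fill=None):
    """Return exactly n items: the first n words of text, padded with fill."""
    words = text.split()[:n]
    return words + [fill] * (n - len(words))

# Only three words are available, so d and e receive the fill value.
a, b, c, d, e = first_words("the cat and", 5)
print(a, b, c, d, e)
```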

how to get the last part of a string before a certain character?

I am trying to print the last part of a string before a certain character.
I'm not quite sure whether to use the string .split() method or string slicing or maybe something else.
Here is some code that doesn't work but I think shows the logic:
x = 'http://test.com/lalala-134'
print x['-':0] # beginning at the end of the string, return everything before '-'
Note that the number at the end will vary in size so I can't set an exact count from the end of the string.
You are looking for str.rsplit(), with a limit:
print x.rsplit('-', 1)[0]
.rsplit() searches for the splitting string from the end of input string, and the second argument limits how many times it'll split to just once.
Another option is to use str.rpartition(), which will only ever split just once:
print x.rpartition('-')[0]
For splitting just once, str.rpartition() is the faster method as well; if you need to split more than once you can only use str.rsplit().
Demo:
>>> x = 'http://test.com/lalala-134'
>>> print x.rsplit('-', 1)[0]
http://test.com/lalala
>>> 'something-with-a-lot-of-dashes'.rsplit('-', 1)[0]
'something-with-a-lot-of'
and the same with str.rpartition()
>>> print x.rpartition('-')[0]
http://test.com/lalala
>>> 'something-with-a-lot-of-dashes'.rpartition('-')[0]
'something-with-a-lot-of'
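One difference worth knowing shows up when the separator is missing altogether. A short sketch, using a hypothetical URL with no '-' in it:

```python
x = 'http://test.com/lalala'  # hypothetical input with no '-' at all

via_rsplit = x.rsplit('-', 1)[0]        # the whole string comes back unchanged
via_rpartition = x.rpartition('-')[0]   # empty: the whole string lands in the last slot

print(repr(via_rsplit))
print(repr(via_rpartition))
```

So if some inputs may lack the separator, pick whichever fallback suits you, or check for the separator first.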
The difference between split and partition is that split returns a list without the delimiter and splits wherever it finds the delimiter in the string, i.e.
x = 'http://test.com/lalala-134-431'
a,b,c = x.split('-')
print(a)
"http://test.com/lalala"
print(b)
"134"
print(c)
"431"
and partition divides the string at only the first delimiter and always returns exactly 3 values, as a tuple:
x = 'http://test.com/lalala-134-431'
a,b,c = x.partition('-')
print(a)
"http://test.com/lalala"
print(b)
"-"
print(c)
"134-431"
So, as you want the last value, you can use rpartition. It works the same way, but it looks for the delimiter from the end of the string:
x = 'http://test.com/lalala-134-431'
a,b,c = x.rpartition('-')
print(a)
"http://test.com/lalala-134"
print(b)
"-"
print(c)
"431"
