Replace specific instance in string - Python - python

I know that the following is how to replace a string with another string i
line.replace(x, y)
But I only want to replace the second instance of x in the line. How do you do that?
Thanks
EDIT
I thought I would be able to ask this question without going into specifics but unfortunately none of the answers worked in my situation. I'm writing into a text file and I'm using the following piece of code to change the file.
with fileinput.FileInput("Player Stats.txt", inplace=True, backup='.bak') as file:
for line in file:
print(line.replace(chosenTeam, teamName), end='')
But if chosenTeam occurs multiple times then all of them are replaced.
How can I replace only the nth instance in this situation.

That's actually a little tricky. First use str.find to get an index beyond the first occurrence. Then slice and apply the replace (with count 1, so as to replace only one occurrence).
>>> x = 'na'
>>> y = 'banana'
>>> pos = y.find(x) + 1
>>> y[:pos] + y[pos:].replace(x, 'other', 1)
'banaother'

Bonus, this is a method to replace "NTH" occurrence in a string
def nth_replace(str,search,repl,index):
split = str.split(search,index+1)
if len(split)<=index+1:
return str
return search.join(split[:-1])+repl+split[-1]
example:
nth_replace("Played a piano a a house", "a", "in", 1) # gives "Played a piano in a house"

You can try this:
import itertools
line = "hello, hi hello how are you hello"
x = "hello"
y = "something"
new_data = line.split(x)
new_data = ''.join(itertools.chain.from_iterable([(x, a) if i != 2 else (y, a) for i, a in enumerate(new_data)]))

Related

Break the string (udacity: Intro to python ) [duplicate]

I wrote a quick script to remove the 'http://' substring from a list of website addresses saved on an excel column. The function replace though, doesn't work and I don't understand why.
from openpyxl import load_workbook
def rem(string):
print string.startswith("http://") #it yields "True"
string.replace("http://","")
print string, type(string) #checking if it works (it doesn't though, the output is the same as the input)
wb = load_workbook("prova.xlsx")
ws = wb["Sheet"]
for n in xrange(2,698):
c = "B"+str(n)
print ws[c].value, type(ws[c].value) #just to check value and type (unicode)
rem(str(ws[c].value)) #transformed to string in order to make replace() work
wb.save("prova.xlsx") #nothing has changed
String.replace(substr)
does not happen in place, change it to:
string = string.replace("http://","")
string.replace(old, new[, max]) only returns a value—it does not modify string. For example,
>>> a = "123"
>>> a.replace("1", "4")
'423'
>>> a
'123'
You must re-assign the string to its modified value, like so:
>>> a = a.replace("1", "4")
>>> a
'423'
So in your case, you would want to instead write
string = string.replace("http://", "")

python string split slice and into a list

I have a string for example "streemlocalbbv"
and I have my_function that takes this string and a string that I want to find ("loc") in the original string. And what I want to get returned is this;
my_function("streemlocalbbv", "loc")
output = ["streem","loc","albbv"]
what I did so far is
def find_split(string,find_word):
length = len(string)
find_word_start_index = string.find(find_word)
find_word_end_index = find_word_start_index + len(find_word)
string[find_word_start_index:find_word_end_index]
a = string[0:find_word_start_index]
b = string[find_word_start_index:find_word_end_index]
c = string[find_word_end_index:length]
return [a,b,c]
Trying to find the index of the string I am looking for in the original string, and then split the original string. But from here I am not sure how should I do it.
You can use str.partition which does exactly what you want:
>>> "streemlocalbbv".partition("loc")
('streem', 'loc', 'albbv')
Use the split function:
def find_split(string,find_word):
ends = string.split(find_word)
return [ends[0], find_word, ends[1]]
Use the split, index and insert function to solve this
def my_function(word,split_by):
l = word.split(split_by)
l.insert(l.index(word[:word.find(split_by)])+1,split_by)
return l
print(my_function("streemlocalbbv", "loc"))
#['str', 'eem', 'localbbv']

split string every nth character and append ':'

I've read some switch MAC address table into a file and for some reason the MAC address if formatted as such:
'aabb.eeff.hhii'
This is not what a MAC address should be, it should follow: 'aa:bb:cc:dd:ee:ff'
I've had a look at the top rated suggestions while writing this and found an answer that may fit my needs but it doesn't work
satomacoto's answer
The MACs are in a list, so when I run for loop I can see them all as such:
Current Output
['8424.aa21.4er9','fa2']
['94f1.3002.c43a','fa1']
I just want to append ':' at every 2nd nth character, I can just remove the '.' with a simple replace so don't worry about that
Desired output
['84:24:aa:21:4e:r9','fa2']
['94:f1:30:02:c4:3a','fa1']
My code
info = []
newinfo = []
file = open('switchoutput')
newfile = file.read().split('switch')
macaddtable = newfile[3].split('\\r')
for x in macaddtable:
if '\\n' in x:
x = x.replace('\\n', '')
if carriage in x:
x = x.replace(carriage, '')
if '_#' in x:
x = x.replace('_#', '')
x.split('/r')
info.append(x)
for x in info:
if "Dynamic" in x:
x = x.replace('Dynamic', '')
if 'SVL' in x:
x = x.replace('SVL', '')
newinfo.append(x.split(' '))
for x in newinfo:
for x in x[:1]:
if '.' in x:
x = x.replace('.', '')
print(x)
Borrowing from the solution that you linked, you can achieve this as follows:
macs = [['8424.aa21.4er9','fa2'], ['94f1.3002.c43a','fa1']]
macs_fixed = [(":".join(map(''.join, zip(*[iter(m[0].replace(".", ""))]*2))), m[1]) for m in macs]
Which yields:
[('84:24:aa:21:4e:r9', 'fa2'), ('94:f1:30:02:c4:3a', 'fa1')]
If you like regular expressions:
import re
dotted = '1234.3456.5678'
re.sub('(..)\.?(?!$)', '\\1:', dotted)
# '12:34:34:56:56:78'
The template string looks for two arbitrary characters '(..)' and assigns them to group 1. It then allows for 0 or 1 dots to follow '\.?' and makes sure that at the very end there is no match '(?!$)'. Every match is then replaced with its group 1 plus a colon.
This uses the fact that re.sub operates on nonoverlapping matches.
x = '8424.aa21.4er9'.replace('.','')
print(':'.join(x[y:y+2] for y in range(0, len(x) - 1, 2)))
>> 84:24:aa:21:4e:r9
Just iterate through the string once you've cleaned it, and grab 2 string each time you loop through the string. Using range() third optional argument you can loop through every second elements. Using join() to add the : in between the two elements you are iterating.
You can use re module to achieve your desired output.
import re
s = '8424.aa21.4er9'
s = s.replace('.','')
groups = re.findall(r'([a-zA-Z0-9]{2})', s)
mac = ":".join(groups)
#'84:24:aa:21:4e:r9'
Regex Explanation
[a-zA-Z0-9]: Match any alphabets or number
{2}: Match at most 2 characters.
This way you can get groups of two and then join them on : to achieve your desired mac address format
wrong_mac = '8424.aa21.4er9'
correct_mac = ''.join(wrong_mac.split('.'))
correct_mac = ':'.join(correct_mac[i:i+2] for i in range(0, len(correct_mac), 2))
print(correct_mac)

Regular Expression in Python

I'm trying to build a list of domain names from an Enom API call. I get back a lot of information and need to locate the domain name related lines, and then join them together.
The string that comes back from Enom looks somewhat like this:
SLD1=domain1
TLD1=com
SLD2=domain2
TLD2=org
TLDOverride=1
SLD3=domain3
TLD4=co.uk
SLD5=domain4
TLD5=net
TLDOverride=1
I'd like to build a list from that which looks like this:
[domain1.com, domain2.org, domain3.co.uk, domain4.net]
To find the different domain name components I've tried the following (where "enom" is the string above) but have only been able to get the SLD and TLD matches.
re.findall("^.*(SLD|TLD).*$", enom, re.M)
Edit:
Every time I see a question asking for regular expression solution I have this bizarre urge to try and solve it without regular expressions. Most of the times it's more efficient than the use of regex, I encourage the OP to test which of the solutions is most efficient.
Here is the naive approach:
a = """SLD1=domain1
TLD1=com
SLD2=domain2
TLD2=org
TLDOverride=1
SLD3=domain3
TLD4=co.uk
SLD5=domain4
TLD5=net
TLDOverride=1"""
b = a.split("\n")
c = [x.split("=")[1] for x in b if x != 'TLDOverride=1']
for x in range(0,len(c),2):
print ".".join(c[x:x+2])
>> domain1.com
>> domain2.org
>> domain3.co.uk
>> domain4.net
You have a capturing group in your expression. re.findall documentation says:
If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group.
That's why only the conent of the capturing group is returned.
try:
re.findall("^.*((?:SLD|TLD)\d*)=(.*)$", enom, re.M)
This would return a list of tuples:
[('SLD1', 'domain1'), ('TLD1', 'com'), ('SLD2', 'domain2'), ('TLD2', 'org'), ('SLD3', 'domain3'), ('TLD4', 'co.uk'), ('SLD5', 'domain4'), ('TLD5', 'net')]
Combining SLDs and TLDs is then up to you.
this works for you example,
>>> sld_list = re.findall("^.*SLD[0-9]*?=(.*?)$", enom, re.M)
>>> tld_list = re.findall("^.*TLD[0-9]*?=(.*?)$", enom, re.M)
>>> map(lambda x: x[0] + '.' + x[1], zip(sld_list, tld_list))
['domain1.com', 'domain2.org', 'domain3.co.uk', 'domain4.net']
I'm not sure why are you talking about regular expressions. I mean, why don't you just run a for loop?
A famous quote seems to be appropriate here:
Some people, when confronted with a problem, think “I know, I'll use
regular expressions.” Now they have two problems.
domains = []
components = []
for line in enom.split('\n'):
k,v = line.split('=')
if k == 'TLDOverride':
continue
components.append(v)
if k.startswith('TLD'):
domains.append('.'.join(components))
components = []
P.S. I'm not sure what's this TLDOverride so the code just ignores it.
Here's one way:
import re
print map('.'.join, zip(*[iter(re.findall(r'^(?:S|T)LD\d+=(.*)$', text, re.M))]*2))
# ['domain1.com', 'domain2.org', 'domain3.co.uk', 'domain4.net']
Just for fun, map -> filter -> map:
input = """
SLD1=domain1
TLD1=com
SLD2=domain2
TLD2=org
TLDOverride=1
SLD3=domain3
TLD4=co.uk
SLD5=domain4
TLD5=net
"""
splited = map(lambda x: x.split("="), input.split())
slds = filter(lambda x: x[1][0].startswith('SLD'), enumerate(splited))
print map(lambda x: '.'.join([x[1][1], splited[x[0] + 1][1], ]), slds)
>>> ['domain1.com', 'domain2.org', 'domain3.co.uk', 'domain4.net']
This appears to do what you want:
domains = re.findall('SLD\d+=(.+)', re.sub(r'\nTLD\d+=', '.', enom))
It assumes that the lines are sorted and SLD always comes before its TLD. If that can be not the case, try this slightly more verbose code without regexes:
d = dict(x.split('=') for x in enom.strip().splitlines())
domains = [
d[key] + '.' + d.get('T' + key[1:], '')
for key in d if key.startswith('SLD')
]
You need to use multiline regex for this. This is similar to this post.
data = """SLD1=domain1
TLD1=com
SLD2=domain2
TLD2=org
TLDOverride=1
SLD3=domain3
TLD4=co.uk
SLD5=domain4
TLD5=net
TLDOverride=1"""
domain_seq = re.compile(r"SLD\d=(\w+)\nTLD\d=(\w+)", re.M)
for item in domain_seq.finditer(data):
domain, tld = item.group(1), item.group(2)
print "%s.%s" % (domain,tld)
As some other answers already said, there's no need to use a regular expression here. A simple split and some filtering will do nicely:
lines = data.split("\n") #assuming data contains your input string
sld, tld = [[x.split("=")[1] for x in lines if x[:3] == t] for t in ("SLD", "TLD")]
result = [x+y for x, y in zip(sld, tld)]

(permutation/Anagrm) words find in python 2.72 (need help to find what's wrong with my code)

i hope this request is legit.
i'm taking a programming course in python for engineers, so i'm kinda new at this business.
anyway, in my homework i was requested to write a function with receive two strings and check if one is a (permutation/Anagrm) of the other. (which means if they both have exactly the same letters and same number of appearances for each letter)
iv'e found some great codes here while searching, but i still don't get what's wrong with my code (and it's important for me to know for my studying process).
we got a tests file which suppose to check our functions, and it gave me that error:
Traceback (most recent call last):
File "C:\Users\Or\Desktop\תכנות\4\hw4\123456789_a4.py", line 110, in <module>
test_hw4()
File "C:\Users\Or\Desktop\תכנות\4\hw4\123456789_a4.py", line 97, in test_hw4
test(is_anagram('Tom Marvolo Riddle','I Am Lord Voldemort'), True)
File "C:\Users\Or\Desktop\תכנות\4\hw4\123456789_a4.py", line 31, in is_anagram
s2_list.sort()
NameError: global name 's2_list' is not defined
this is my code:
def is_anagram(string1, string2):
string1 = string1.lower() #turns Capital letter to small ones
string2 = string2.lower()
string1 = string1.replace(" ","") #turns the words inside the string to one word
string2 = string2.replace(" ","")
if len(string1)!= len(string2):
return False
s1_list = [string1[i] for i in range(len(string1))] #creates a list of string 1 letters
a2_list = [string1[k] for k in range(len(string1))]
s1_list.sort() #sorting the list
s2_list.sort()
booli=False
k=0
for i in s1_list: #for loop which compares each letter in the two lists
if s1_list[k]==s2_list[k]:
booli = True
k=k+1
else:
booli=False
break
return booli
any one know how to fix it ?
Thanks!
It looks like you have a typo with a2_list. That section should read:
s1_list = [string1[i] for i in range(len(string1))] #creates a list of string 1 letters
s2_list = [string2[k] for k in range(len(string2))]
s1_list.sort() #sorting the list
s2_list.sort()
FWIW, here is an interactive prompt example of how to tell if two strings are anagrams of one another:
>>> string1 = 'Logarithm'
>>> string2 = 'algorithm'
>>> sorted(string1.lower()) == sorted(string2.lower()) # see if they are anagrams
True
If you make a listify_string function and use that to set your s1_list and s2_list, it might be easier to see that there are multiple things that look to be wrong with your code, unless you intended both s1_list and s2_list to be populated from the same string.
def listify(string):
return [c for c in string]
Then you can simply do s1_list = listify(string1) and s2_list = ... to set the values.
I would probably turn at least the 'check if the two lists are the same' into a function, so I could use an early return to indicate falseness (so instead of starting with booli as true, setting it on each iteration through the loop and breaking out of the loop if false).
If you look at the join method of Python strings, you might find inspiration for another way to check if s1_list and s2_list are the same.
Try this one-liner instead:
sorted(s1.lower().replace(' ', '')) == sorted(s2.lower().replace(' ', ''))
Python strings are essentially lists, so they can be sorted. We just need to take care of uppercase and whitespace first. The python equals operator then takes care of the actual comparison.

Categories