How to extract string in Python

I want to extract a specific substring from a string, but I get an error. Why can't I use the result of find as an index to extract it?
Here is my code:
string = 'ABP'
p_index = string.find('P')
s = string[0, p_index]
print(s)
TypeError: string indices must be integers

s = string[0, p_index] indexes the string with the tuple (0, p_index), which strings do not support. Slices are written with a colon, so you should rather do:
s = string[0:p_index]
Since an omitted first index defaults to zero, this returns the same result:
s = string[:p_index]
I'd recommend reading up on Python string slicing and its syntax in general.
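As a small illustration of the slicing rules described above, here is a minimal sketch. The variable names (text, before_p, from_p, only_p) are made up for this example; only str.find and built-in slicing are used.

```python
# Hypothetical example values; only str.find and slicing are used.
text = 'ABP'
p_index = text.find('P')      # index of the first 'P', here 2

before_p = text[:p_index]     # characters from 0 up to, but not including, p_index
from_p = text[p_index:]       # characters from p_index through the end
only_p = text[p_index]        # a single integer index returns one character

print(before_p, from_p, only_p)
```

The end index of a slice is exclusive, which is why text[:p_index] stops just before 'P'.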

If you want just the letter 'P', change this line:
s = string[0, p_index]
to
s = string[p_index]
You don't need anything other than the index of the letter to get 'P', and you already found that index with string.find('P').
If you mean removing the letter 'P' from 'ABP', then use:
new_string = 'ABP'.replace('P','')

I'm pretty sure you slice strings like this: s = string[0:2]

You could try either of these two:
string = 'ABP'
p_index = string.index('P')
s = string[p_index]
print(s)
or
string = 'ABP'
p_index = string.find('P')
s = string[p_index]
print(s)
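The two calls behave the same when the character is present, but differ when it is missing. The sketch below, with made-up variable names, shows that difference; it is an illustration, not part of the original answer.

```python
text = 'ABP'

# str.find returns -1 when the substring is absent ...
missing = text.find('Z')

# ... and -1 is also a valid index (the last character), so check before using it.
if missing == -1:
    found_char = None
else:
    found_char = text[missing]

# str.index raises ValueError instead of returning -1.
try:
    text.index('Z')
    index_raised = False
except ValueError:
    index_raised = True

print(missing, found_char, index_raised)
```

Which one to prefer depends on whether a missing character should be an error or an expected case.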

Related

How to get integer for two characters in python

a = "a26lsdm3684"
How can I get an integer with the value 26 (from a[1] and a[2])? If I write int(a[1]) or int(a[2]), I only get the integer for one character. What should I write to get the integer 26 and store it in a variable b?

Slice out the two characters, then convert:
b = int(a[1:3]) # Slices are exclusive on the end index, so you need to go to 3 to get 1 and 2

You can take a substring out of the string and convert it to int, as long as you know the exact indexes:
a = "a26lsdm3684"
substring_of_a = a[1:3]
number = int(substring_of_a)
print(number, type(number))

There is more than one way to do it.
Use Slicing, as pointed out by jasonharper and ShadowRanger.
Or use re.findall to find the first stretch of digits.
Or use re.split to split on non-digits and find the 2nd element (the first one is an empty string at the beginning).
import re
a = "a26lsdm3684"
print(int(a[1:3]))
print(int((re.findall(r'\d+', a))[0]))
print(int((re.split(r'\D+', a))[1]))
# 26
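Both regex variants above index into a list, which fails with IndexError if the string contains no digits. A minimal sketch of a safer variant, assuming you want None in that case (first_int is a hypothetical helper name):

```python
import re

def first_int(text):
    """Return the first run of digits in text as an int, or None if there is none."""
    match = re.search(r'\d+', text)   # re.search returns None when nothing matches
    return int(match.group(0)) if match else None

print(first_int("a26lsdm3684"))
print(first_int("no digits here"))
```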

A little more sustainable if you want multiple numbers from the same string:
def get_numbers(input_string):
    i = 0
    buffer = ""
    out_list = []
    while i < len(input_string):
        if input_string[i].isdigit():
            buffer = buffer + input_string[i]
        else:
            if buffer:
                out_list.append(int(buffer))
            buffer = ""
        i = i + 1
    if buffer:
        out_list.append(int(buffer))
    return out_list
a = "a26lsdm3684"
print(get_numbers(a))
output:
[26, 3684]

If you want to convert all the numeric parts in your string, and say put them in a list, you may do something like:
from re import finditer
a = "a26lsdm3684"
s = [int(m.group(0)) for m in finditer(r'\d+', a)]  # [26, 3684]

Replacing substring but skipping previous occurrence

I have a long string that may contain the same substring multiple times. I would like to extract certain substrings with a regex. Then, for each extracted substring, I want to append [i] and replace the original one.
Using a regex, I extracted ['df.Libor3m','df.Libor3m_lag1','df.Libor3m_lag1']. However, when I tried to add [i] to each item, the first 'df.Libor3m_lag1' in the string was replaced twice.
function_text_MD='0.11*(np.maximum(df.Libor3m,0.9)-np.maximum(df.Libor3m_lag1,0.9))+0.7*np.maximum(df.Libor3m_lag1,0.9)'
read_var = re.findall(r"df.[\w+][^\W]+",function_text_MD)
for var_name in read_var:
    function_text_MD.find(var_name)
    new_var_name = var_name+'[i]'
    function_text_MD=function_text_MD.replace(var_name,new_var_name,1)
So I got '0.11*(np.maximum(df.Libor3m[i],0.9)-np.maximum(df.Libor3m_lag1[i][i],0.9))+0.7*np.maximum(df.Libor3m_lag1,0.9)'.
The first df.Libor3m_lag1 had [i] appended twice, giving df.Libor3m_lag1[i][i].
What I want to get:
'0.11*(np.maximum(df.Libor3m[i],0.9)-np.maximum(df.Libor3m_lag1[i],0.9))+0.7*np.maximum(df.Libor3m_lag1[i],0.9)'
Thanks in advance!

Here is the code. Calling str.replace in a loop would also hit the df.Libor3m inside df.Libor3m_lag1, so this uses a single re.sub pass, which visits each match once and appends [i] exactly once per variable:
import re
function_text_MD='0.11*(np.maximum(df.Libor3m,0.9)-np.maximum(df.Libor3m_lag1,0.9))+0.7*np.maximum(df.Libor3m_lag1,0.9)'
function_text_MD = re.sub(r"df\.\w+", lambda m: m.group(0) + '[i]', function_text_MD)
print(function_text_MD)

import re
t = "0.11*(np.maximum(df.Libor3m,0.9)-np.maximum(df.Libor3m_lag1,0.9))+0.7*np.maximum(df.Libor3m_lag1,0.9)"
p = re.split(r"(?<=df\.)[a-zA-Z_0-9]+", t)
s = re.findall(r"(?<=df\.)[a-zA-Z_0-9]+", t)
s = [x+"[i]" for x in s]
result = "".join([p[0],s[0],p[1],s[1],p[2],s[2],p[3]])
Use the regular expression to split the string first.
Use the same regular expression to find the matches that were split out.
Change the matches to what you want.
Interleave the two lists and join them. Three matches leave four pieces, so the last piece, p[3], must be included.
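The join above lists each piece by index, which only works for exactly three matches. A sketch of the same split-and-interleave idea for any number of matches, using itertools.zip_longest from the standard library (the names pattern, pieces and names are chosen for this example):

```python
import re
from itertools import zip_longest

t = "0.11*(np.maximum(df.Libor3m,0.9)-np.maximum(df.Libor3m_lag1,0.9))+0.7*np.maximum(df.Libor3m_lag1,0.9)"
pattern = r"(?<=df\.)[a-zA-Z_0-9]+"

pieces = re.split(pattern, t)                               # text between the matches
names = [name + "[i]" for name in re.findall(pattern, t)]   # the matches, with [i] added

# split() yields one more piece than there are matches, so pad the shorter list with ''.
result = "".join(piece + name for piece, name in zip_longest(pieces, names, fillvalue=""))
print(result)
```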

Python: Replace character RANGE in a string with new string

Given the following string:
mystring = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
The goal is to swap out a character position range with other characters.
For example, swap out characters 20-24 with ABCDE.
The result would look like:
XXXXXXXXXXXXXXXXXXXABCDEXXXXXXXXXXXXXXX
Testing:
mystring = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
mystring[20:24] = 'ABCDE'
I get the error: TypeError: 'str' object does not support item assignment
The end goal is a reusable function such as:
def replace_chars(some_string, start_char, end_char, replace_string):
    if len(replace_string) == (end_char_pos - start_char_pos + 1):
        some_string[start_char:end_char] = replace_string
    else:
        print "replace string invalid length"
        sys.exit(1)
    return mystring
new_string = replace_chars('XYZXYZ', 2, 4, 'AAA')
I realize that it's possible to pad out the unchanged range into a new string:
mystring = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
mystring = mystring[0:19] + 'ABCDE' + mystring[25:38]
However, that forces extra calculation, and this will happen thousands of times against lines in a file. The lines will have different lengths and different character positions to swap. This seems like a long workaround when I should be able to insert directly into the character positions in place.
Appreciate any help, thanks!

Strings are immutable (unchangeable), but you can slice them and join the pieces.
mystring = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
mystring = 'ABCDE'.join([mystring[:20],mystring[24:]])
'XXXXXXXXXXXXXXXXXXXXABCDEXXXXXXXXXXXXXX'
Be careful: if the result must keep the original length, the replacement has to be as long as the slice you leave out. Here mystring[20:24] drops four characters while "ABCDE" has five, so the result is one character longer than the original.

Strings are immutable in python! You'll have to split the string into three pieces and concatenate them together :)
mystring = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
new_str = "ABCDE"
first_piece = mystring[0:20]
third_piece = mystring[24:len(mystring)]
final_string = first_piece + new_str + third_piece
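The three-piece approach can be wrapped in a reusable function like the one the question asks for. This is a minimal sketch under stated assumptions: start and end are 0-based slice bounds with end exclusive (not the 1-based inclusive positions used in the question), and a length mismatch raises ValueError rather than exiting the program.

```python
def replace_chars(some_string, start, end, replacement):
    """Return some_string with some_string[start:end] swapped for replacement.

    Assumption: start and end are 0-based slice bounds (end exclusive),
    and the replacement must be exactly as long as the slice it replaces.
    """
    if not 0 <= start <= end <= len(some_string):
        raise ValueError("start/end out of range")
    if len(replacement) != end - start:
        raise ValueError("replacement has the wrong length")
    # Build a new string; the original is never modified.
    return some_string[:start] + replacement + some_string[end:]

print(replace_chars('XYZXYZ', 2, 5, 'AAA'))
```

Slicing and concatenation each copy the string once, which is usually cheap enough for line-sized strings.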

Strings cannot be modified in place in Python, but consider using bytearray, a structure similar to a string whose key difference is that it is mutable. In Python 3 it holds bytes, so it is built from a bytes literal and decoded for printing:
In [52]: my_stuff = bytearray(b'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX')
In [53]: my_stuff = my_stuff[0:19] + b"abcd" + my_stuff[25:38]
In [54]: print(my_stuff.decode())
XXXXXXXXXXXXXXXXXXXabcdXXXXXXXXXXXXX
There are some key things you should know when using a bytearray; for one, it stores bytes rather than text characters.
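Because a bytearray is mutable, it also accepts slice assignment, which is the in-place operation the question was looking for. A minimal Python 3 sketch, assuming the text is plain ASCII so that one character is one byte:

```python
# Assumes ASCII-only text, so character positions and byte positions coincide.
my_stuff = bytearray(b'X' * 38)

my_stuff[19:24] = b'ABCDE'          # overwrite five bytes in place
result = my_stuff.decode('ascii')   # convert back to str when done

print(result)
```

If the right-hand side has a different length than the slice, the bytearray grows or shrinks instead of raising an error, so a length check is still needed when the total length must stay fixed.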

As much as you think you should be able to assign to individual characters of a string, 'str' object does not support item assignment says you can't.

Insert a string before a substring of a string

I want to insert some text before a substring in a string.
For example:
str = "thisissometextthatiwrote"
substr = "text"
inserttxt = "XX"
I want:
str = "thisissomeXXtextthatiwrote"
Assuming substr can only appear once in str, how can I achieve this result? Is there some simple way to do this?

my_str = "thisissometextthatiwrote"
substr = "text"
inserttxt = "XX"
idx = my_str.index(substr)
my_str = my_str[:idx] + inserttxt + my_str[idx:]
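Note that str.index raises ValueError when the substring is missing. A sketch of a variant built on str.partition that returns the text unchanged in that case (insert_before is a hypothetical helper name, and returning the input unchanged is an assumption about the desired behaviour):

```python
def insert_before(text, substr, insert_txt):
    """Insert insert_txt before the first occurrence of substr.

    Returns text unchanged if substr does not occur in it.
    """
    head, sep, tail = text.partition(substr)
    if not sep:                      # partition found no match
        return text
    return head + insert_txt + sep + tail

print(insert_before("thisissometextthatiwrote", "text", "XX"))
```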
P.S. Avoid using built-in names (str in your case) as variable names, since doing so shadows the built-in.

Why not use replace?
my_str = "thisissometextthatiwrote"
substr = "text"
inserttxt = "XX"
my_str.replace(substr, inserttxt + substr)
# 'thisissomeXXtextthatiwrote'

Use str.split(substr) to split the string into ['thisissome', 'thatiwrote']. Since you want to insert text before the substring, join the pieces with "XXtext" (inserttxt + substr).
So the final solution is:
>>> (inserttxt+substr).join(str.split(substr))
'thisissomeXXtextthatiwrote'
If you want to append text after the substring instead, join with substr + appendtxt (here appendtxt = "XX"):
>>> (substr+appendtxt).join(str.split(substr))
'thisissometextXXthatiwrote'

With respect to the answer above (where my_str is the variable), the correct expression is:
(inserttxt+substr).join(my_str.split(substr))
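The question assumes the substring appears only once. If it could appear more often, the split/join form changes every occurrence, while str.replace with a count of 1 changes only the first. A small sketch with a made-up input string:

```python
text = "text and more text"   # hypothetical input with two occurrences
substr = "text"
inserttxt = "XX"

# split/join inserts before every occurrence
every = (inserttxt + substr).join(text.split(substr))

# the optional third argument of str.replace limits the number of replacements
first_only = text.replace(substr, inserttxt + substr, 1)

print(every)
print(first_only)
```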

how to append a smaller string in between a larger string in python?

I am new to Python and I want to insert a smaller string into a bigger string at a position that I define. For example, I have the string aaccddee. Now I want to insert the string bb at a position that makes it aabbccddee. How can I do it? Thanks in advance!

Strings are immutable, so you need to build a new one:
strA = 'aaccddee'
strB = 'bb'
pos = 2
strC = strA[:pos]+strB+strA[pos:] # get aabbccddee

You can slice strings up as if they were lists, like this:
firststring = "aaccddee"
secondstring = "bb"
combinedstring = firststring[:2] + secondstring + firststring[2:]
print(combinedstring)
There is excellent documentation on the interwebs.

There are various ways to do this, but it's a tiny bit fiddlier than in some other languages, as strings in Python are immutable. This is quite an easy one to follow.
First up, find out where in the string c starts (with my_string = "aaccddee"):
add_at = my_string.find("c")
Then split the string there and add your new part in:
new_string = my_string[0:add_at] + "bb" + my_string[add_at:]
That takes the string up to the split point, then appends the new snippet, then the remainder of the string.

try these in a python shell:
string = "aaccddee"
string_to_append = "bb"
new_string = string[:2] + string_to_append + string[2:]
you can also use a more printf style like this:
string = "aaccddee"
string_to_append = "bb"
new_string = "%s%s%s" % ( string[:2], string_to_append, string[2:] )

Let's say your string object is s = 'aaccddee'. Then you can do it as:
s = s[:2] + 'bb' + s[2:]
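The answers above all use the same slice-and-concatenate pattern. A minimal sketch that wraps it in a function (insert_at is a hypothetical name, and rejecting out-of-range positions is an assumption; plain slicing would silently clamp them):

```python
def insert_at(text, pos, piece):
    """Return a new string with piece inserted into text at index pos."""
    if not 0 <= pos <= len(text):
        raise ValueError("pos out of range")
    return text[:pos] + piece + text[pos:]

print(insert_at('aaccddee', 2, 'bb'))
```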
