python find specific pattern at end of string - python

I need to find a timestamp pattern at the end of a string, for example blahblahblah_tsYYYYMMDDHHMMSS. I need to extract the time string and replace it with the current timestamp.
What is the regex for such pattern?

You might try writing a function:
def tsExtractor(SomeString):  # your string
    LenSomeString = len(SomeString)  # get length of string
    TargetString = SomeString[(LenSomeString - 14):LenSomeString]  # last 14 characters, where the digits should be
    try:
        int(TargetString)  # check that the slice is numeric
    except ValueError:
        TargetString = 99998877665544  # not numeric, put in some dummy data
    return str(TargetString)  # always return a string
# examples
exampleString = 'blahblahblahyeahblahblahts20160616140003'
badString = 'Blahblahblahaldskfjlakdflaksdflkjasl500566'
exampleString = tsExtractor(exampleString)
print(exampleString)
exampleString = tsExtractor(badString)
print(exampleString)

Thanks #rock321987, I did this:
import re
# search, not match: match only looks at the start of the string
match = re.search(r"_ts\d{14}$", pathstring)
if match:
    pass  # do something
It works pretty well.
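To go one step further and actually replace the timestamp, as the question asks, a minimal sketch could combine re with datetime. The names replace_timestamp and TS_SUFFIX and the optional now parameter are made up for illustration, and the suffix is assumed to be "_ts" followed by YYYYMMDDHHMMSS.

```python
import re
from datetime import datetime

# "_ts" followed by exactly 14 digits at the end of the string
# (note that $ also matches just before a trailing newline).
TS_SUFFIX = re.compile(r"_ts(\d{14})$")

def replace_timestamp(path, now=None):
    """Return (new_path, old_timestamp); old_timestamp is None if no suffix was found."""
    match = TS_SUFFIX.search(path)
    if match is None:
        return path, None
    now = now or datetime.now()
    # Keep everything before the suffix and append a fresh timestamp.
    new_path = path[:match.start()] + "_ts" + now.strftime("%Y%m%d%H%M%S")
    return new_path, match.group(1)

new_path, old = replace_timestamp("blahblahblah_ts20160616140003")
print(old)       # the extracted timestamp
print(new_path)  # same prefix with the current timestamp
```

Passing now explicitly is only there to make the function easy to test; in normal use it can be left out.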

Related

How do I check whether an input string has alphabets in another string in Python?

As the title says, how do I check whether an input string has alphabets in another string in Python?
The string specifies that only the letters A to G ('ABCDEFG') can be used in the input string. However, my attempt did not give the results I want: input strings with letters in order, such as 'ABC' and 'ABCD', work, while those not in order, such as 'BADD' and 'EFEG', do not.
Please refer to my attempt below.
ID = 'ABCDEFG'
addcode = input('Enter new product code: ')
Code = []
if addcode in ID:
    Code.append(addcode)
    print(Code)
else:
    print("Product code is invalid")
Ideally, as long as the input string contains only letters from A to G, it should be appended to 'Code' regardless of the order. How do I modify my code so that I get the results I want? Thank you.
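You can use a regex:
re.search('[a-zA-Z]', string)
Note that this only tells you whether the string contains at least one ASCII letter; it does not restrict the letters to A-G.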
You can try converting your input string (addcode) to a set and then see if it is a subset of ID. I am not converting ID into a set as it contains unique elements as per your code:
ID = 'ABCDEFG'
addcode = input('Enter new product code: ')
Code = []
if set(addcode).issubset(ID):
    Code.append(addcode)
    print(Code)
else:
    print("Product code is invalid")
If you want to use a RegEx based approach, you can do this:
import re
pattern = re.compile("^[A-G]+$")
addcode = input('Enter new product code: ')
Code = []
if pattern.findall(addcode):
    Code.append(addcode)
    print(Code)
else:
    print("Product code is invalid")
Here we check whether the input string contains only characters in the range A-G, i.e. A, B, C, D, E, F, G. If there is a match, we append the input string and print it.
Strings are immutable. You should check whether each letter of the product code is present in ID.
To achieve this, you can store ID as a tuple instead of a single string.
ID = ('A','B','C','D','E','F','G')
addcode = input('Enter new product code: ')
Code = []
for l in range(0, len(addcode)):
    if addcode[l] in ID:
        Code.append(addcode[l])
    else:
        print("Product code is invalid")
print(Code)
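The original attempt fails because, for two strings, the in operator tests for a contiguous substring, so 'BADD' is rejected even though each of its letters appears in ID. A minimal sketch of a per-character check follows; the helper name is_valid_code and the choice to reject empty input are assumptions made for illustration.

```python
ALLOWED = "ABCDEFG"

def is_valid_code(code, allowed=ALLOWED):
    """Return True if code is non-empty and every character is in allowed."""
    return bool(code) and all(ch in allowed for ch in code)

# Substring test versus per-character test
print("ABC" in ALLOWED)       # True: 'ABC' is a contiguous substring
print("BADD" in ALLOWED)      # False: not a substring, although every letter is allowed
print(is_valid_code("BADD"))  # True
print(is_valid_code("ABH"))   # False: 'H' is not allowed
```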

python string split slice and into a list

I have a string, for example "streemlocalbbv",
and a function my_function that takes this string and a substring that I want to find in it ("loc"). What I want returned is this:
my_function("streemlocalbbv", "loc")
output = ["streem","loc","albbv"]
What I have done so far is:
def find_split(string, find_word):
    length = len(string)
    find_word_start_index = string.find(find_word)
    find_word_end_index = find_word_start_index + len(find_word)
    a = string[0:find_word_start_index]
    b = string[find_word_start_index:find_word_end_index]
    c = string[find_word_end_index:length]
    return [a, b, c]
I am trying to find the index of the string I am looking for in the original string and then split the original string, but from here I am not sure how I should do it.
You can use str.partition which does exactly what you want:
>>> "streemlocalbbv".partition("loc")
('streem', 'loc', 'albbv')
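Note that str.partition returns a tuple, and when the separator is not found it returns the whole string followed by two empty strings. A small sketch that wraps it to return a list, as the question asks, could look like this; returning a one-element list when find_word is absent is an assumption about the desired behaviour.

```python
def find_split(string, find_word):
    """Split string around the first occurrence of find_word."""
    before, sep, after = string.partition(find_word)
    if not sep:
        # partition returns (string, '', '') when find_word is absent
        return [string]
    return [before, sep, after]

print(find_split("streemlocalbbv", "loc"))  # ['streem', 'loc', 'albbv']
print(find_split("streemlocalbbv", "xyz"))  # ['streemlocalbbv']
```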
Use the split function:
def find_split(string, find_word):
    ends = string.split(find_word, 1)  # split only on the first occurrence
    return [ends[0], find_word, ends[1]]
Use the split, index and insert functions to solve this:
def my_function(word, split_by):
    l = word.split(split_by)
    l.insert(l.index(word[:word.find(split_by)]) + 1, split_by)
    return l
print(my_function("streemlocalbbv", "loc"))
# ['streem', 'loc', 'albbv']

Update last character of string with a value

I have two strings:
input = "12.34.45.362"
output = "2"
I want to be able to replace the 362 in input by 2 from output.
Thus the final result should be 12.34.45.2. I am unsure on how to do it. Any help is appreciated.
You can use a simple regex for this:
import re
input_ = "12.34.45.362"
output = "2"
input_ = re.sub(r"\.\d+$", f".{output}", input_)
print(input_)
Output:
12.34.45.2
Notice that I also changed input to input_, so we're not shadowing the built-in input() function.
You can also use a simpler but slightly less robust pattern, which ignores the period entirely and just replaces the run of digits at the end:
import re
input_ = "12.34.45.362"
output = "2"
input_ = re.sub(r"\d+$", output, input_)
print(input_)
Output:
12.34.45.2
Just in case you want to do this for any string of form X.Y.Z.W where X, Y, Z, and W may be of non-constant length:
new_result = ".".join(your_input.split(".")[:-1]) + "." + output
s.join joins a collection into a single string, placing the string s between each pair of elements. s.split turns a string into a list of the pieces found between occurrences of the given separator (here "."). Slicing the list (l[:-1]) gives you all but the last element, and finally string concatenation (if you are sure output is a str) gives you your result.
Breaking it down step-by-step:
your_input = "12.34.45.362"
your_input.split(".") # == ["12", "34", "45", "362"]
your_input.split(".")[:-1] # == ["12", "34", "45"]
".".join(your_input.split(".")[:-1]) # == "12.34.45"
".".join(your_input.split(".")[:-1]) + "." + output # == "12.34.45.2"
If you are trying to split at the last ".", just do a right split (rsplit), take everything before it, and use string formatting:
i = "12.34.45.362"
r = "{}.2".format(i.rsplit(".",1)[0])
output
'12.34.45.2'
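The same idea can be sketched with str.rpartition, which splits once at the last separator. The function name replace_last_field and the sep parameter are hypothetical, and returning the new value unchanged when there is no separator is an assumption.

```python
def replace_last_field(value, new_last, sep="."):
    """Replace everything after the last sep in value with new_last."""
    head, found, _ = value.rpartition(sep)
    if not found:
        # No separator at all: the whole value is the last field.
        return new_last
    return head + found + new_last

print(replace_last_field("12.34.45.362", "2"))  # 12.34.45.2
```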

Extract numeric values from a string for python

I have a string which contains numeric values inside quotes. I need to extract the numeric values from it and also remove the [ and ].
sample string: texts = ['13007807', '13007779']
texts = ['13007807', '13007779']
texts.replace("'", "")
texts.strip("'")
print(texts)
# this will return ['13007807', '13007779']
So what I need to extract from the string is:
13007807
13007779
If your texts variable is a string as I understood from your reply, then you can use Regular expressions:
import re
text = "['13007807', '13007779']"
regex = r"\['(\d+)', '(\d+)'\]"
values = re.search(regex, text)
if values:
    value1 = int(values.group(1))
    value2 = int(values.group(2))
output:
value1=13007807
value2=13007779
You can use the * unpacking operator:
texts = ['13007807', '13007779']
print (*texts)
output:
13007807 13007779
if you have :
data = "['13007807', '13007779']"
print (*eval(data))
output:
13007807 13007779
The easiest way is to use map and wrap the result in list:
list(map(int,texts))
Output
[13007807, 13007779]
If your input data is of the format data = "['13007807', '13007779']", then:
import re
data = "['13007807', '13007779']"
list(map(int, re.findall(r'(\d+)', data)))
or
list(map(int, eval(data)))
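Since eval executes whatever expression it is given, a safer variant for data that is just a Python literal is ast.literal_eval from the standard library. A minimal sketch, assuming the input has exactly the list-of-strings form shown above:

```python
import ast

data = "['13007807', '13007779']"

# literal_eval only accepts Python literals (strings, numbers, lists, ...),
# so it cannot run arbitrary code the way eval can.
texts = ast.literal_eval(data)
numbers = [int(t) for t in texts]

print(*texts)   # 13007807 13007779
print(numbers)  # [13007807, 13007779]
```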

Can't get rid of hex characters

This program makes an array of verbs which come from a text file.
file = open("Verbs.txt", "r")
data = str(file.read())
table = eval(data)
num_table = len(table)
new_table = []
for x in range(0, num_table):
    newstr = table[x].replace(")", "")
    split = newstr.rsplit("(")
    numx = len(split)
    for y in range(0, numx):
        split[y] = split[y].split(",", 1)[0]
        new_table.append(split[y])
num_new_table = len(new_table)
for z in range(0, num_new_table):
    print(new_table[z])
However the text itself contains hex characters such as in
('a\\xc4\\x9fr\\xc4\\xb1[Verb]+[Pos]+[Imp]+[A2sg]', ':', 17.6044921875)('A\\xc4\\x9fr\\xc4\\xb1[Noun]+[Prop]+[A3sg]+[Pnon]+[Nom]', ':', 11.5615234375)
I'm trying to get rid of those. How am I supposed to do that?
I've looked up pretty much everywhere and decode() returns an error (even after importing codecs).
You could use parse, a Python module that lets you search a string for regularly formatted components. From the components returned, you could extract the corresponding integers and substitute them into the original string.
For example (untested alert!):
import parse
# Parse all hex-like items
list_of_findings = parse.findall("\\x{:w}", your_string)
# For each item
for hex_item in list_of_findings:
    # Replace the item in the string
    your_string = your_string.replace(
        # Retrieve the value from the Parse Data Format
        hex_item[0],
        # Interpret the parsed value as a base-16 number,
        # then convert the resulting int to a string again
        str(int(hex_item[0], 16))
    )
Obs: instead of converting with str, you could convert the found hex-like values to characters using chr, as in:
chr(int(hex_item[0], 16))
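If the goal is to recover the original characters rather than delete them, note that sequences such as \xc4\x9f in the sample look like escaped UTF-8 bytes. Here is a standard-library sketch, assuming each byte appears in the text as a literal backslash, an x and two hex digits, and that the bytes are UTF-8. Both points are assumptions about the data, and the helper name decode_hex_escapes is made up for this example.

```python
import re

# A literal backslash, 'x', and two hex digits, matched in a bytes object.
HEX_ESCAPE = re.compile(rb"\\x([0-9a-fA-F]{2})")

def decode_hex_escapes(text):
    """Turn literal '\\xNN' sequences into the bytes they name, then decode as UTF-8."""
    raw = text.encode("utf-8")
    # Replace each escape with the single byte whose value is the hex number.
    raw = HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)
    return raw.decode("utf-8")

sample = "a\\xc4\\x9fr\\xc4\\xb1[Verb]+[Pos]+[Imp]+[A2sg]"
print(decode_hex_escapes(sample))  # ağrı[Verb]+[Pos]+[Imp]+[A2sg]
```

If the escaped bytes do not form valid UTF-8, the final decode raises UnicodeDecodeError, which is a useful signal that the encoding assumption is wrong.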
