Python replace particular substring in string - python

I need my program to replace a certain - with a variable. I've tried this, but it replaces the whole string with the variable.
jelenleg = 8 * "-"
x = 0
guess = "c"
jelenleg = jelenleg[x].replace("-", guess)
print(jelenleg)
So I need this to happen:
before: --------
after: c-------
But instead what I get is this: c

You can specify how many occurrences to replace:
jelenleg.replace("-", guess, 1)
This replaces only the first -.
To replace at a particular position, I can't think of anything easier than converting the string to a list, replacing the item, and joining it back into a string, like this:
jelenleg_list = list(jelenleg) # str --> list
jelenleg_list[x] = guess # replace at pos x
jelenleg = "".join(jelenleg_list) # list --> str
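Slicing is another way to get the same result without the intermediate list. The following is only a sketch; the helper name replace_at is made up for illustration and is not part of the original answer.

```python
def replace_at(text, index, new_char):
    """Return a copy of text with the character at index replaced by new_char."""
    # Strings are immutable, so build a new one from the parts around index.
    return text[:index] + new_char + text[index + 1:]


jelenleg = 8 * "-"
jelenleg = replace_at(jelenleg, 0, "c")
print(jelenleg)  # c-------
```

Because the original string is never modified in place, the result has to be assigned back to jelenleg, just as with the list approach.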

Related

How do I read a set of characters from a specific position in a string in python

So I am writing my own cypher and decoder for a bit of fun but I've gotten stuck on my decipher program where I need to move up by 1 character in my string in order to decipher the next "block" of characters.
cyinput2 = ",#96c8a2: ,#808000: ,#96c8a2: ,#e1a95f: ,#808000: ,#6f00ff:"
#989E86
Above is the string that I am trying to move up in. Essentially, I am trying to move on from my first base 6 string to the second, which in this case would be ",#808000", and so on in increments of 10 characters from each comma.
I've tried to incorporate some ideas, such as just adding 1 character to count as the space in between each base 6 string, but I wasn't able to figure it out. I have also started to work on using a variable set to integer 0 that would increment by 10 after each string is converted back into hexadecimal.
But I haven't yielded any results from that yet, so I'm hoping someone on here may be able to stop me from going overboard when I could instead use a much simpler solution.
while str(cyinput2) != "":
    S = 0
    S1 = 9
    hexstart = ','
    hexend = ':'
    start_index = cyinput2.find(hexstart) + len(hexstart)
    end_index = cyinput2.find(hexend)
    block = cyinput2[start_index:end_index]
    print(block)
You could simply convert your string to a list by using split.
If you want clean blocks, you can use replace first:
# Removing useless chars in order to have a clean split
cyinput2 = cyinput2.replace(',', '')
cyinput2 = cyinput2.replace(':', '')
# From : ",#96c8a2: ,#808000: ,#96c8a2: ,#e1a95f: ,#808000: ,#6f00ff:"
# to : "#96c8a2 #808000 #96c8a2 #e1a95f #808000 #6f00ff"
# Split
cyinput2 = cyinput2.split(' ')
# From : "#96c8a2 #808000 #96c8a2 #e1a95f #808000 #6f00ff"
# to : ["#96c8a2", "#808000", "#96c8a2", "#e1a95f", "#808000", "#6f00ff"]
for block in cyinput2:
    print(block)
You can use this while loop:
cyinput2 = ",#96c8a2: ,#808000: ,#96c8a2: ,#e1a95f: ,#808000: ,#6f00ff: "
while len(cyinput2) > 9:
    print(cyinput2[1:10])
    # Drop the first 10 characters; str.replace would also delete identical later blocks
    cyinput2 = cyinput2[10:]
Output
#96c8a2:
#808000:
#96c8a2:
#e1a95f:
#808000:
#6f00ff:
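Another option, sketched here with the standard re module, is to pull out every block directly instead of stepping through the string by position. This assumes each block is always a # followed by exactly six hexadecimal digits, which holds for the sample string but is not stated in the question.

```python
import re

cyinput2 = ",#96c8a2: ,#808000: ,#96c8a2: ,#e1a95f: ,#808000: ,#6f00ff:"

# findall returns every non-overlapping match, in order, duplicates included.
blocks = re.findall(r"#[0-9a-fA-F]{6}", cyinput2)
for block in blocks:
    print(block)
```

This sidesteps the index arithmetic entirely, so the separators between blocks can vary in length without breaking the loop.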

How do I find and replace after performing math on the string

I have a file with the following strings
input complex_data_BITWIDTH;
output complex_data_(2*BITWIDTH+1);
Let's say BITWIDTH = 8
I want the following output
input complex_data_8;
output complex_data_17;
How can I achieve this in Python with find and replace plus some mathematical operation?
I would recommend looking into the re RegEx library for string replacement and string search, and the eval() function for performing mathematical operations on strings.
Example (assuming that there are always parentheses around what you want to evaluate) :
import re
BITWIDTH_VAL = 8
string_initial = "something_(BITWIDTH+3)"
string_with_replacement = re.sub("BITWIDTH", str(BITWIDTH_VAL), string_initial)
# note: string_with_replacement is "something_(8+3)"
expression = re.search(r"(\(.*\))", string_with_replacement).group(1)
# note: expression is "(8+3)"
string_evaluated = string_with_replacement.replace(expression, str(eval(expression)))
# note: string_evaluated is "something_11"
If you know the value to change, you can use two variables: one for the value to search for and one for the new value.
BITWIDTH = 8
NEW_BITWIDTH = 2 * BITWIDTH + 1
string_input = 'complex_data_8;'
string_output = string_input.replace(str(BITWIDTH), str(NEW_BITWIDTH))
If you don't know the value, you need to extract it first and then operate on it:
string_input = 'complex_data_8;'
bitwidth = string_input.split('_')[-1].replace(';', '')
new_bitwidth = 2 * int(bitwidth) + 1
string_output = string_input.replace(bitwidth, str(new_bitwidth))
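Applied to the exact lines from the question, the substitute-then-evaluate idea could be sketched as below. Since eval() will run arbitrary code, this sketch swaps in a small evaluator built on the standard ast module that only accepts integers and basic arithmetic. The function names safe_eval and substitute are made up for illustration, and the sketch assumes expressions are wrapped in a single, non-nested pair of parentheses.

```python
import ast
import operator
import re

# Only these arithmetic operators are allowed in an expression.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
}


def safe_eval(expr):
    """Evaluate an integer arithmetic expression without calling eval()."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError("unsupported expression: %r" % expr)
    return _eval(ast.parse(expr, mode="eval"))


def substitute(line, name, value):
    """Replace name with value, then evaluate any parenthesised expression."""
    line = line.replace(name, str(value))
    return re.sub(r"\(([^()]*)\)", lambda m: str(safe_eval(m.group(1))), line)


for line in ["input complex_data_BITWIDTH;", "output complex_data_(2*BITWIDTH+1);"]:
    print(substitute(line, "BITWIDTH", 8))
```

Plain str.replace is used for the name because the underscore before BITWIDTH counts as a word character, so a regex word boundary would not match there.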

How to overcome a ValueError when working with multiple if conditions?

I'm trying to make a script that can identify whether a folder name is a project or not. To do this I want to use multiple if conditions. But I struggle with the ValueError that results from checking, for example, whether the first character of a folder name is a number. If it's a letter, I want to skip the folder and check the next one. Thank you all in advance for your help.
Cheers, Benn
I tried While and except ValueError: but haven't been successful with it.
# Correct name to a project "YYMM_ProjectName" = "1908_Sample_Project"
projectnames = ['190511_Waldfee', 'Mountain_Shooting_Test', '1806_Coffe_Prime_Now', '180410_Fotos', '191110', '1901_Rollercoaster_Vision_Ride', 'Musicvideo_LA', '1_Project_Win', '19_Wrong_Project', '1903_di_2', '1907_DL_2', '3401_CAR_Wagon']
# Check conditions
for projectname in projectnames:
    if int(str(projectname[0])) < 3 and int(projectname[1]) > 5 and ((int(projectname[2]) * 10) + int(projectname[3])) <= 12 and str(projectname[4]) == "_" and projectname[5].isupper():
        print('Real Project')
        print('%s is a real Project' % projectname)
    # print("Skipped Folders")
ValueError: invalid literal for int() with base 10: 'E'
From what I understand from all the ifs...you may actually be better off using a regex match. You're parsing through each character, and expecting each individual one to be within a very limited character range.
I haven't tested this pattern string, so it may be incorrect or need to be tweaked for your needs.
import re
projectnames = ['1911_Waldfee', "1908_Project_Test", "1912_WinterProject", "1702_Stockfootage", "1805_Branded_Content"]
p = ''.join(["^",  # Start of string being matched
             "[0-2]",  # First character a number 0 through 2 (less than 3)
             "[6-9]",  # Second character a number 6 through 9 (single digit greater than 5)
             "(0(?=[0-9])|1(?=[0-2]))",  # (lookahead) A 0 followed only by any number 0 through 9 **OR** A 1 followed only by any number 0 through 2
             "((?<=0)[1-9]|(?<=1)[0-2])",  # (lookbehind) Match 1-9 if the preceding character was a 0, match 0-2 if the preceding was a 1
             "_",  # Next char is a "_"
             "[A-Z]",  # Next char (only) is an upper A through Z
             ".*$"  # Match anything until end of string
             ])
for projectname in projectnames:
    if re.match(p, projectname):
        # print('Real Project')
        print('%s is a real Project' % projectname)
    # print("Skipped Folders")
EDIT: ========================
You can step-by-step test the pattern using the following...
projectname = "2612_UPPER"
p = "^[0-2].*$" # The first character is between 0 through 2, and anything else afterwards
if re.match(p, projectname): print(projectname)
# If you get a print, the first character match is right.
# Now do the next
p = "^[0-2][6-9].*$" # The first character is between 0 through 2, the second between 6 and 9, and anything else afterwards
if re.match(p, projectname): print(projectname)
# If you get a print, the first and second character match is right.
# continue with the third, fourth, etc.
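The lookahead and lookbehind groups above together accept a month of 01 to 09 or 10 to 12, so the pattern can probably be written more compactly as a plain alternation. The sketch below is meant to be equivalent for single-line names; the constant name PROJECT_RE is illustrative. Note that, like the pattern above, it rejects month 00, which the if statement in the question would accept.

```python
import re

# YY: first digit 0-2, second digit 6-9; MM: 01-12; then "_" and an uppercase letter.
PROJECT_RE = re.compile(r"[0-2][6-9](0[1-9]|1[0-2])_[A-Z]")

projectnames = ['190511_Waldfee', 'Mountain_Shooting_Test', '1806_Coffe_Prime_Now',
                '180410_Fotos', '191110', '1901_Rollercoaster_Vision_Ride',
                'Musicvideo_LA', '1_Project_Win', '19_Wrong_Project', '1903_di_2',
                '1907_DL_2', '3401_CAR_Wagon']

# re.match anchors at the start of the string, so no leading "^" is needed.
valid = [name for name in projectnames if PROJECT_RE.match(name)]
for name in valid:
    print('%s is a real Project' % name)
```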
This is something that just gets the work done and may not be the most efficient way to do this.
Given your list of projects,
projectnames = [
'190511_Waldfee',
'Mountain_Shooting_Test',
'1806_Coffe_Prime_Now',
'180410_Fotos',
'191110',
'1901_Rollercoaster_Vision_Ride',
'Musicvideo_LA',
'1_Project_Win',
'19_Wrong_Project',
'1903_di_2',
'1907_DL_2',
'3401_CAR_Wagon'
]
I see that there are a limited number of valid YYMM strings (24 of them to be more precise). So I first create a list of these 24 valid YYMMs.
nineteen = list(range(1901, 1913))  # 1901..1912; month 00 is not a valid month
eighteen = list(range(1801, 1813))  # 1801..1812
YYMM = nineteen + eighteen # A list of all 24 valid dates
Then I modify your for loop a little bit using a try-except-else block,
for projectname in projectnames:
    try:
        first_4_digits = int(projectname[:4])  # Make sure the first 4 are digits.
    except ValueError:
        pass  # Pass silently
    else:
        if (first_4_digits in YYMM
                and projectname[4] == "_"
                and projectname[5].isupper()):
            # if all conditions are true
            print("%s is a real project." % projectname)
One solution would be to make a quick function that catches an error and returns false:
def isInt(elem):
    try:
        int(elem)
        return True
    except ValueError:
        return False
...
# Check the first four characters, since the condition indexes projectname[0] to projectname[3]
if all(isInt(e) for e in projectname[:4]) and int(str(projectname[0])) < 3 and ...:
...
Or you could use something like str.isdigit() as a check first, rather than writing your own function, to avoid triggering the ValueError in the first place.
Though if I were you I would reconsider why you need such a long and verbose if statement in the first place. There might be a more efficient way to implement this feature.
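As a rough sketch of the str.isdigit() idea, the checks from the question can be moved into a small function that tests the prefix before any int() call. The function name is_project is made up for illustration; the conditions are copied from the question's if statement, and str.isascii() requires Python 3.7 or later.

```python
def is_project(name):
    """Hypothetical helper mirroring the checks in the question's if statement."""
    prefix = name[:4]
    # Guard first: int() is only reached once the prefix is four ASCII digits
    # and the name is long enough to index positions 4 and 5.
    if len(name) < 6 or not (prefix.isascii() and prefix.isdigit()):
        return False
    return (
        int(name[0]) < 3
        and int(name[1]) > 5
        and int(name[2:4]) <= 12
        and name[4] == "_"
        and name[5].isupper()
    )


projectnames = ['190511_Waldfee', 'Mountain_Shooting_Test', '1806_Coffe_Prime_Now',
                '180410_Fotos', '191110', '1901_Rollercoaster_Vision_Ride',
                'Musicvideo_LA', '1_Project_Win', '19_Wrong_Project', '1903_di_2',
                '1907_DL_2', '3401_CAR_Wagon']

real_projects = [name for name in projectnames if is_project(name)]
for name in real_projects:
    print('%s is a real Project' % name)
```

Folders that fail the guard are simply skipped, so no ValueError or IndexError is raised for short or non-numeric names.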
You could write a small parser (might be a bit over the top, admittedly):
from parsimonious.grammar import Grammar
from parsimonious.nodes import NodeVisitor
from parsimonious.exceptions import ParseError
projectnames = ['190511_Waldfee', 'Mountain_Shooting_Test', '1806_Coffe_Prime_Now', '180410_Fotos', '191110', '1901_Rollercoaster_Vision_Ride', 'Musicvideo_LA', '1_Project_Win', '19_Wrong_Project', '1903_di_2', '1907_DL_2', '3401_CAR_Wagon']
class ProjectVisitor(NodeVisitor):
    grammar = Grammar(
        r"""
        expr = first second third fourth fifth rest
        first = ~"[0-2]"
        second = ~"[6-9]"
        third = ~"\d{2}"
        fourth = "_"
        fifth = ~"[A-Z]"
        rest = ~".*"
        """
    )

    def generic_visit(self, node, visited_children):
        return visited_children or node

    def visit_third(self, node, visited_children):
        x, y = int(node.text[0]), int(node.text[1])
        if not (x * 10 + y) <= 12:
            raise ParseError

# loop over them
pv = ProjectVisitor()
for projectname in projectnames:
    try:
        pv.parse(projectname)
        print("Valid project name: {}".format(projectname))
    except ParseError:
        pass
This yields
Valid project name: 1806_Coffe_Prime_Now
Valid project name: 1901_Rollercoaster_Vision_Ride
Valid project name: 1907_DL_2

How to replace letters with numbers and re-convert at anytime (Caesar cipher)?

I've been coding this for almost 2 days now but can't get it. I've coded two different bits trying to find it.
Code #1
So this one will list the letters but won't change them to numbers (a->1, b->2, etc.)
import re
text = input('Write Something- ')
word = '{}'.format(text)
for letter in word:
    print(letter)
#lists down
Outcome-
Write something- test
t
e
s
t
Then I have this code that changes the letters into numbers, but I haven't been able to convert it back into letters.
Code #2
u = input('Write Something')
a = ord(u[-1])
print(a)
#converts to number and prints ^^
print('')
print(????)
#need to convert from numbers back to letters.
Outcome:
Write Something- test
116
How can I send a text through (test) and make it convert it to either set numbers (a->1, b->2) or random numbers, save it to a .txt file and be able to go back and read it at any time?
What you're trying to achieve here is called a "Caesar cipher".
For example, say you normally have: A=1, a=2, B=3, b=4, etc...
Then you have a "key" which "shifts" the letters. Let's say the key is "3", so you shift all letters 3 numbers up and end up with: A=4, a=5, B=6, b=7, etc...
This is of course only ONE way of doing a Caesar cipher, and it is the most basic example. You could also say your key is "G", which would give you:
A=G, a=g, B=H, b=h, etc.. or
A=G, a=H, B=I, b=J, etc...
I hope you understand what I'm talking about. Again, this is only one very simple example.
Now, for your program/script you need to define this key. If the key should be variable, you need to save it somewhere (write it down). Put your words in a string, then check and convert each letter and write it into a new string.
You then could say (pseudo code!):
var key = READKEYFROMFILE;
string old = READKEYFROMFILE_OR_JUST_A_NORMAL_STRING_:)
string new = "";
for (int i=0, i<old.length, i++){
get the string at i;
compare with your "key";
shift it;
write it in new;
}
Hope i could help you.
edit:
You could also use a dictionary (like the other answer says), but this is a very static (but easy) way.
Also, maybe watch some guides/tutorials on programming, since you don't seem to be that experienced yet. And search for "Caesar cipher" to understand this topic better (it's very interesting).
edit2:
Ok, so basically:
You have a variable called "key", in which you store your key (as explained above).
You then have a string variable called "old", and another one called "new".
In old, you write the string that you want to convert.
New will be empty for now.
You then write a "for loop" that runs as many times as the ".length" of your "old" string. (That means if your sentence has 15 letters, the loop runs 15 times and counts the little "i" variable of the for loop up each time.)
Inside the loop, you get the current letter from "old" (and store it in another variable, for example char temp = "").
After this, you compare the current letter with your key and decide how to shift it.
Once that's done, just add the converted letter to the "new" string.
Here is some more precise pseudo code (it's not Python code, I don't know Python well). By the way, char stands for "character" (letter):
var key = g;
string old = "teststring";
string new = "";
char oldchar = "";
char newchar = "";
for (int i=0; i<old.length; i++){
oldchar = old.charAt[i];
newchar = oldchar //shift here!!!
new.addChar(newchar);
}
Hope i could help you ;)
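The loop from the pseudo code above might look roughly like this in Python. This is only a sketch under a few assumptions that the answer leaves open: the key is a whole number of positions (key "G" corresponds to a shift of 6), only ASCII letters are shifted, and the alphabet wraps around after z. The function name caesar_shift is made up for illustration.

```python
import string


def caesar_shift(text, key):
    """Shift every ASCII letter in text by key positions, wrapping around."""
    new = []
    for oldchar in text:
        if oldchar in string.ascii_lowercase:
            base = ord("a")
        elif oldchar in string.ascii_uppercase:
            base = ord("A")
        else:
            new.append(oldchar)  # leave digits, spaces and punctuation untouched
            continue
        # Map the letter to 0..25, shift it, wrap with modulo, and map it back.
        new.append(chr((ord(oldchar) - base + key) % 26 + base))
    return "".join(new)


encrypted = caesar_shift("teststring", 6)
decrypted = caesar_shift(encrypted, -6)
print(encrypted)  # zkyzyzxotm
print(decrypted)  # teststring
```

Decrypting is the same operation with the negated key, so only the key has to be stored to read the text back later.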
edit3:
maybe also take a look at this:
https://inventwithpython.com/chapter14.html
Caesar Cipher Function in Python
https://www.youtube.com/watch?v=WXIHuQU6Vrs
Just use dictionary:
letters = {'a': 1, 'b': 2, ... }
And in the loop:
for letter in word:
    print(letters[letter])
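Rather than typing out all 26 entries by hand, the dictionary could be generated. A minimal sketch, assuming only lowercase ASCII letters need a number; the names letter_to_num and num_to_letter are illustrative and not from the answer.

```python
import string

# a -> 1, b -> 2, ..., z -> 26
letter_to_num = {ch: i for i, ch in enumerate(string.ascii_lowercase, start=1)}
# Reverse mapping, so that numbers can be turned back into letters.
num_to_letter = {i: ch for ch, i in letter_to_num.items()}

word = "test"
numbers = [letter_to_num[ch] for ch in word]
restored = "".join(num_to_letter[n] for n in numbers)
print(numbers)   # [20, 5, 19, 20]
print(restored)  # test
```

Keeping the numbers in a list avoids the ambiguity of a joined string such as "20519", which could be split in more than one way.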
To convert to symbol codes and back to characters:
text = input('Write Something')
for t in text:
    d = ord(t)
    n = chr(d)
    print(t,d,n)
To write into file:
with open("a.txt", "w") as f:  # the file is closed automatically
    f.write("someline\n")
To read lines from file:
with open("a.txt", "r") as f:
    for line in f:
        print(line, end='')  # lines already end with a newline character
Please see documentation for Python 3: https://docs.python.org/3/
Here are a couple of examples. My method involves mapping the character to the string representation of an integer padded with zeros so it's 3 characters long using str.zfill.
Eg 0 -> '000', 42 -> '042', 125 -> '125'
This makes it much easier to convert a string of numbers back to characters since it will be in lots of 3
Examples
from string import printable
#'0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'
from random import sample
# Option 1
char_to_num_dict = {key : str(val).zfill(3) for key, val in zip(printable, sample(range(1000), len(printable))) }
# Option 2
char_to_num_dict = {key : str(val).zfill(3) for key, val in zip(printable, range(len(printable))) }
# Reverse mapping - applies to both options
num_to_char_dict = {char_to_num_dict[key] : key for key in char_to_num_dict }
Here are two sets of dictionaries to map a character to a number. The first option uses random numbers, e.g. 'a' = '042', 'b' = '756', 'c' = '000'. The problem with this is that the mapping only lasts for one run: once you close the program, the next run will almost certainly generate a different mapping. If you want to use random values, you will need to save the dictionary to a file so you can load it later as the key.
The second option creates a dictionary mapping a character to a number and maintains order, so it follows the same sequence every time, e.g. 'a' = '010', 'b' = '011', 'c' = '012'.
Now that I've explained the mapping choices, here are the functions to convert between text and numbers:
def text_to_num(s):
    return ''.join( char_to_num_dict.get(char, '') for char in s )
def num_to_text(s):
    slices = [ s[ i : i + 3 ] for i in range(0, len(s), 3) ]
    return ''.join( num_to_char_dict.get(char, '') for char in slices )
Example of use ( with option 2 dictionary )
>>> text_to_num('Hello World!')
'043014021021024094058024027021013062'
>>> num_to_text('043014021021024094058024027021013062')
'Hello World!'
And finally if you don't want to use a dictionary then you can use ord and chr still keeping with padding out the number with zeros method
def text_to_num2(s):
    return ''.join( str(ord(char)).zfill(3) for char in s )
def num_to_text2(s):
    slices = [ s[ i : i + 3] for i in range(0, len(s), 3) ]
    return ''.join( chr(int(val)) for val in slices )
Example of use
>>> text_to_num2('Hello World!')
'072101108108111032087111114108100033'
>>> num_to_text2('072101108108111032087111114108100033')
'Hello World!'
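For the random mapping (option 1), the dictionary has to be stored so the text can be decoded in a later run. One way this might be done is with the standard json module, as sketched below. The function names and the file name cipher_key.json are made up for illustration, and the file is written to the system temp directory only to keep the example self-contained.

```python
import json
import os
import tempfile
from random import sample
from string import printable


def make_random_mapping():
    """Build a random character -> three-digit code mapping (option 1)."""
    codes = sample(range(1000), len(printable))
    return {ch: str(code).zfill(3) for ch, code in zip(printable, codes)}


def save_mapping(mapping, path):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(mapping, f)


def load_mapping(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Hypothetical location for the key file.
path = os.path.join(tempfile.gettempdir(), "cipher_key.json")
mapping = make_random_mapping()
save_mapping(mapping, path)
restored = load_mapping(path)
print(restored == mapping)  # True
```

The encoded text itself can be written to a separate .txt file in the same way; as long as the key file is kept, the reverse dictionary can be rebuilt from the loaded mapping.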

Returning every instance of whatever's between two strings in a file [Python 3]

What I'm trying to do is open a file, then find every instance of '[\x06I"' and '\x06;', then return whatever is between the two.
Since this is not a standard text file (it's map data from RPG maker) readline() will not work for my purposes, as the file is not at all formatted in such a way that the data I want is always neatly within one line by itself.
What I'm doing right now is loading the file into a list with read(), then simply deleting characters from the very beginning until I hit the string '[\x06I'. Then I scan ahead to find '\x06;', store what's between them as a string, append said string to a list, then resume at the character after the semicolon I found.
It works, and I ended up with pretty much exactly what I wanted, but I feel like that's the worst possible way to go about it. Is there a more efficient way?
My relevant code:
while eofget == 0:
    savor = 0
    while savor == 0 or eofget == 0:
        if line[0:4] == '[\x06I"':
            x = 4
            spork = 0
            while spork == 0:
                x += 1
                if line[x] == '\x06':
                    if line[x+1] == ';':
                        spork = x
                        savor = line[5:spork] + "\n"
                        line = line[x+1:]
                        linefinal[lineinc] = savor
                        lineinc += 1
                elif line[x:x+7] == '#widthi':
                    print("eof reached")
                    spork = 1
                    eofget = 1
                    savor = 0
        elif line[x:x+7] == '#widthi':
            print("finished map " + mapname)
            eofget = 1
            savor = 0
            break
        else:
            line = line[1:]
You can just ignore the variable names. I just name things the first thing that comes to mind when I'm doing one-offs like this. And yes, I am aware a few things in there don't make any sense, but I'm saving cleanup for when I finalize the code.
When eofget gets flipped on this subroutine terminates and the next map is loaded. Then it repeats. The '#widthi' check is basically there to save time, since it's present in every map and indicates the beginning of the map data, AKA data I don't care about.
I feel this is a natural case to use regular expressions. Using the findall method:
>>> s = 'testing[\x06I"text in between 1\x06;filler text[\x06I"text in between 2\x06;more filler[\x06I"text in between \n with some line breaks \n included in the text\x06;ending'
>>> import re
>>> p = re.compile(r'\[\x06I"(.+?)\x06;', re.DOTALL)
>>> print(p.findall(s))
['text in between 1', 'text in between 2', 'text in between \n with some line breaks \n included in the text']
The regex string r'\[\x06I"(.+?)\x06;' can be interpreted as follows:
Match as little as possible (denoted by ?) of an undetermined number of unspecified characters (denoted by .+) surrounded by '[\x06I"' and '\x06;', and only return the enclosed text (denoted by the parentheses around .+?)
Adding re.DOTALL in the compile call makes . match line breaks as well, allowing multi-line text to be captured.
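Since the question says this is not a standard text file, it may be safer to read it in binary mode and search with a bytes pattern. The following is a sketch under that assumption; the function name extract_strings and the sample data are made up, and decoding as UTF-8 is a guess about how the strings are stored.

```python
import re

# Same pattern as above, but compiled for bytes so it works on data read with open(path, "rb").
PATTERN = re.compile(rb'\[\x06I"(.+?)\x06;', re.DOTALL)


def extract_strings(data):
    """Return every chunk found between the two markers, decoded to str."""
    return [chunk.decode("utf-8", errors="replace") for chunk in PATTERN.findall(data)]


# Made-up sample standing in for the contents of a map file.
sample = b'junk[\x06I"first\x06;more[\x06I"second\nline\x06;#widthi'
print(extract_strings(sample))  # ['first', 'second\nline']
```

With a real file, the data would come from something like open(path, "rb").read(), where path is whatever map file is being processed.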
I would use split():
fulltext = 'adsfasgaseg[\x06I"thisiswhatyouneed\x06;sdfaesgaegegaadsf[\x06I"this is the second what you need \x06;asdfeagaeef'
parts = fulltext.split('[\x06I"') # split by first label
results = []
for part in parts:
    if '\x06;' in part: # if second label exists in part
        results.append(part.split('\x06;')[0]) # get the part until the second label
print(results)
