how to separate characters only if they have spaces between them? - python

I'm making a Morse code translator in Python. I successfully created a program that translates words into Morse code, but now I want to add an option to translate Morse code into words. While doing so, I realized that whenever I tried to translate a letter whose code uses 2 or more characters, it printed out the letters E and T instead. I deduced that this was caused by adding every character to a list and translating each one separately. Is there a way I can check whether there is a space between characters and separate them only if there is?
Here is my code so far:
codes = { ' ':' ', 'A':'.-', 'B':'-...',
'C':'-.-.', 'D':'-..', 'E':'.',
'F':'..-.', 'G':'--.', 'H':'....',
'I':'..', 'J':'.---', 'K':'-.-',
'L':'.-..', 'M':'--', 'N':'-.',
'O':'---', 'P':'.--.', 'Q':'--.-',
'R':'.-.', 'S':'...', 'T':'-',
'U':'..-', 'V':'...-', 'W':'.--',
'X':'-..-', 'Y':'-.--', 'Z':'--..',
'1':'.----', '2':'..---', '3':'...--',
'4':'....-', '5':'.....', '6':'-....',
'7':'--...', '8':'---..', '9':'----.',
'0':'-----', ', ':'--..--', '.':'.-.-.-',
'?':'..--..', '/':'-..-.', '-':'-....-',
'(':'-.--.', ')':'-.--.-'}
ask = input("A: translate english to code \nB: translate code to english").upper()
if ask == "A":
    i = input("")
    mylist = list(i)
    for i in mylist:
        if i == " ":
            print(codes[i], end="", flush=True)
        else:
            print(codes[i.upper()] + " ", end="", flush=True)
elif ask == "B":
    print("Make sure to add 1 space between letters and 2 spaces between words!")
    i = input("")
    mylist = list(i)
    key_list = list(codes.keys())
    val_list = list(codes.values())
    for i in mylist:
        position = val_list.index(i)
        print(key_list[position], end="", flush=True)

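The str.split() method without an argument splits on runs of whitespace, so each Morse code between spaces stays in one piece instead of being broken into single dots and dashes.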
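A minimal sketch of the difference between list() and split(), assuming the convention from the question (1 space between letters, 2 spaces between words). The `decode` function and the tiny `MORSE_TO_CHAR` table are illustrative names, not part of the original program.

```python
# Tiny illustrative table; the real one would hold the whole alphabet.
MORSE_TO_CHAR = {'.-': 'A', '-...': 'B', '.': 'E', '-': 'T'}

def decode(message):
    """Decode Morse with 1 space between letters and 2 spaces between words."""
    words = []
    for word in message.split('  '):   # two spaces separate words
        # split() with no argument splits on runs of whitespace and never
        # yields empty strings, so stray spaces are harmless here.
        words.append(''.join(MORSE_TO_CHAR[code] for code in word.split()))
    return ' '.join(words)

print(list('.-'))                  # ['.', '-']  (each symbol alone: E and T)
print('.- -...'.split())           # ['.-', '-...']
print(decode('.- -...  - . .-'))   # AB TEA
```

Splitting on the two-space word gap first and on whitespace second keeps the word boundaries, which a single split() over the whole message would lose.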

Here's a simplification of your code.
# Modified to make ' ' entry be just single space
# This allows us to add a space after every character rather than treating space as special when generating Morse code
codes = { ' ':' ', 'A':'.-', 'B':'-...',
'C':'-.-.', 'D':'-..', 'E':'.',
'F':'..-.', 'G':'--.', 'H':'....',
'I':'..', 'J':'.---', 'K':'-.-',
'L':'.-..', 'M':'--', 'N':'-.',
'O':'---', 'P':'.--.', 'Q':'--.-',
'R':'.-.', 'S':'...', 'T':'-',
'U':'..-', 'V':'...-', 'W':'.--',
'X':'-..-', 'Y':'-.--', 'Z':'--..',
'1':'.----', '2':'..---', '3':'...--',
'4':'....-', '5':'.....', '6':'-....',
'7':'--...', '8':'---..', '9':'----.',
'0':'-----', ', ':'--..--', '.':'.-.-.-',
'?':'..--..', '/':'-..-.', '-':'-....-',
'(':'-.--.', ')':'-.--.-'}
# Generate reverse code table (i.e. to go from Morse code to English)
codes_rev = {v:k for k, v in codes.items()}
ask = input("A: translate english to code \nB: translate code to english").upper()
if ask == "A":
    for letter in input("enter text: ").upper(): # upper() can be applied to the whole text (spaces are unchanged)
        # Every character is followed by a space,
        # so a space in the input becomes two spaces in the output
        print(codes[letter] + ' ', end="")
elif ask == "B":
    print("Make sure to add 1 space between letters and 2 spaces between words!")
    for word in input("enter morse code: ").split('  '): # words are separated by double spaces
        for letter in word.split(' '): # letters are separated by single spaces
            if letter: # skips empty strings produced by extra or trailing spaces
                print(codes_rev[letter], end="")
        print(' ', end = "") # space between words
else:
    print('A or B should be entered')
Usage
Encoding
A: translate english to code
B: translate code to englisha
enter text: A journey of a thousand miles begins with a single step.
.- .--- --- ..- .-. -. . -.-- --- ..-. .- - .... --- ..- ... .- -. -.. -- .. .-.. . ... -... . --. .. -. ... .-- .. - .... .- ... .. -. --. .-.. . ... - . .--. .-.-.-
Decoding
A: translate english to code
B: translate code to englishb
Make sure to add 1 space between letters and 2 spaces between words!
enter morse code: .- .--- --- ..- .-. -. . -.-- --- ..-. .- - .... --- ..- ... .- -. -.. -- .. .-.. . ... -... . --. .. -. ... .-- .. - .... .- ... .. -. --. .-.. . ... - . .--.
A JOURNEY OF A THOUSAND MILES BEGINS WITH A SINGLE STEP

Related

why are the spaces in between the words not showing up

I have a Morse program, but the spaces between the words are not showing up. Does anyone have any ideas? I'd prefer the simplest way to fix it.
sample input:
APRIL FOOLS DAY
output for encode_Morse function:
' .- .--. .-. .. .-.. ..-. --- --- .-.. ... -.. .- -.-- '
output for the decode_Morse function:
APRILFOOLSDAY
MORSE_CODES={'A':' .- ','B':' -... ','C':' -.-. ',
'D':' -.. ','E':' . ','F':' ..-. ','G':' --. ',
'H':' .... ','I':' .. ','J':' .--- ','K':' -.- ',
'L':' .-.. ','M':' -- ','N':' -. ','O':' --- ',
'P':' .--. ','Q':' --.- ','R':' .-. ',
'S':' ... ','T':' - ','U':' ..- ','V':' ...- ',
'W':' .-- ','X':' -..- ','Y':' -.-- ','Z':' --.. '}
##Define functions here
def encode_Morse(my_msg):
    #my_msg=my_msg.upper()
    my_msg_Morse=""
    for letter in my_msg:
        if letter!=" " and letter not in MORSE_CODES:
            my_msg_Morse+="*"
        elif letter!=" ":
            my_msg_Morse+= MORSE_CODES[letter]
        else:
            my_msg_Morse+=" "
    return my_msg_Morse+""
def decode_Morse(my_msg):
    string=""
    for word in my_msg.split(" "):
        for ch in word.split():
            if ch!=" " and ch!="*":
                string=string + list(MORSE_CODES.keys())[list(MORSE_CODES.values()).index(" "+ch+" ")]
            elif ch==" ":
                string+=" "
    string=string+""
    return string
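The split function absorbs your delimiter, so the spaces between words never reach the inner loop and are never added back.
I propose:
def decode_Morse(my_msg):
    words = []
    # Word gap is three spaces: each code is padded with one space on both sides
    # and encode_Morse adds one more space for a space in the message
    for word in my_msg.split("   "):
        string = ""
        for ch in word.split():
            string=string + list(MORSE_CODES.keys())[list(MORSE_CODES.values()).index(" "+ch+" ")]
        words.append(string)
    return " ".join(words)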
I propose this solution:
MORSE_CODES={
'A':'.-','B':'-...','C':'-.-.',
'D':'-..','E':'.','F':'..-.','G':'--.',
'H':'....','I':'..','J':'.---','K':'-.-',
'L':'.-..','M':'--','N':'-.','O':'---',
'P':'.--.','Q':'--.-','R':'.-.',
'S':'...','T':'-','U':'..-','V':'...-',
'W':'.--','X':'-..-','Y':'-.--','Z':'--..'
}
R_MORSE_CODES = {v:k for k,v in MORSE_CODES.items()}
def encode_morse(msg):
    words = msg.split()
    # one space between codes, two spaces between words
    return "  ".join(" ".join(MORSE_CODES.get(c, '*') for c in w) for w in words)
def decode_morse(msg):
    words = msg.split("  ")  # split on the two-space word gap
    return " ".join("".join(R_MORSE_CODES.get(c, '?') for c in w.split()) for w in words)
# Original message
msg = "APRIL FOOLS DAY"
enc_msg = encode_morse(msg)
print(enc_msg)
# .- .--. .-. .. .-..  ..-. --- --- .-.. ...  -.. .- -.--
dec_msg = decode_morse(enc_msg)
print(dec_msg)
# APRIL FOOLS DAY
Deviating from your solution, I
do not use spaces in the translation table between characters and Morse codes.
use one space character to separate single Morse codes and two spaces to mark word separation.
For the back translation I reverse the dictionary keys and values into another translation table called R_MORSE_CODES for better readability.
Using one and two spaces is sufficient to decode a Morse message back to its original text, as long as no unknown characters appear.
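To illustrate the round-trip claim, here is a small sketch under the same convention (one space between codes, two between words). The table is deliberately cut down to three letters, and the names `MORSE`, `REVERSE`, `encode` and `decode` are chosen for this example only.

```python
MORSE = {'A': '.-', 'D': '-..', 'Y': '-.--'}   # deliberately tiny table
REVERSE = {code: char for char, code in MORSE.items()}

def encode(msg):
    # One space between codes, two spaces between words;
    # characters missing from the table become '*'.
    return '  '.join(' '.join(MORSE.get(c, '*') for c in word)
                     for word in msg.split())

def decode(msg):
    # Codes missing from the reverse table (such as '*') become '?'.
    return ' '.join(''.join(REVERSE.get(code, '?') for code in word.split())
                    for word in msg.split('  '))

print(decode(encode('A DAY')))   # A DAY
print(decode(encode('A D4Y')))   # A D?Y
```

The second call shows the limit of the scheme: an unknown character is encoded as a placeholder, so the original text cannot be recovered for that position.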

I get a message "local variable 'cache' referenced before assignment"

I'm trying to make a Morse encrypter, and I don't see why my code isn't working. I have a working one that I made using a tutorial, and this one has mostly the same guts.
My code:
cheat_sheet = { 'A':'.-', 'B':'-...',
'C':'-.-.', 'D':'-..', 'E':'.',
'F':'..-.', 'G':'--.', 'H':'....',
'I':'..', 'J':'.---', 'K':'-.-',
'L':'.-..', 'M':'--', 'N':'-.',
'O':'---', 'P':'.--.', 'Q':'--.-',
'R':'.-.', 'S':'...', 'T':'-',
'U':'..-', 'V':'...-', 'W':'.--',
'X':'-..-', 'Y':'-.--', 'Z':'--..',
'1':'.----', '2':'..---', '3':'...--',
'4':'....-', '5':'.....', '6':'-....',
'7':'--...', '8':'---..', '9':'----.',
'0':'-----', ', ':'--..--', '.':'.-.-.-',
'?':'..--..', '/':'-..-.', '-':'-....-',
'(':'-.--.', ')':'-.--.-'}
cache = 'beans'
def morse_encrypter (placeholder):
    for letter in placeholder:
        cache += cheat_sheet [letter]
    return cache
b = input ('bruh')
def DO_THE_THING():
    placeholder = b
    the_answer = morse_encrypter(placeholder.upper())
    print (the_answer)
DO_THE_THING ()
The problem is that your code reads the current value of cache via the += operation, but inside the function no such variable has been assigned yet. Any variable that is assigned inside a function is local to it by default, so the module-level cache is not the one being used. You could fix that by adding cache = '' at the very beginning of morse_encrypter().
However, you don't really need the variable at all if you use what is called a generator expression instead, as shown below:
def morse_encrypter(placeholder):
    return ''.join(cheat_sheet[letter] for letter in placeholder)
There's additional information on generator expressions and the related list comprehensions in the Python documentation if you're interested.
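A minimal sketch that reproduces the error and the local-variable fix outside the Morse program. The names `counter`, `broken` and `fixed` are made up for this illustration.

```python
counter = 'beans'

def broken():
    # The augmented assignment makes `counter` local to this function,
    # so reading it on the right-hand side fails before any value exists.
    counter += '!'
    return counter

def fixed(text):
    result = ''            # local accumulator, initialised before use
    for ch in text:
        result += ch.upper()
    return result

try:
    broken()
except UnboundLocalError as exc:
    print('broken():', exc)

print(fixed('abc'))        # ABC
```

The exact wording of the error message varies between Python versions, but the exception type is UnboundLocalError in each case.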
Use "global cache" in morse_encrypter().
This should work:
cheat_sheet = { 'A':'.-', 'B':'-...',
'C':'-.-.', 'D':'-..', 'E':'.',
'F':'..-.', 'G':'--.', 'H':'....',
'I':'..', 'J':'.---', 'K':'-.-',
'L':'.-..', 'M':'--', 'N':'-.',
'O':'---', 'P':'.--.', 'Q':'--.-',
'R':'.-.', 'S':'...', 'T':'-',
'U':'..-', 'V':'...-', 'W':'.--',
'X':'-..-', 'Y':'-.--', 'Z':'--..',
'1':'.----', '2':'..---', '3':'...--',
'4':'....-', '5':'.....', '6':'-....',
'7':'--...', '8':'---..', '9':'----.',
'0':'-----', ', ':'--..--', '.':'.-.-.-',
'?':'..--..', '/':'-..-.', '-':'-....-',
'(':'-.--.', ')':'-.--.-'}
cache = 'beans'
def morse_encrypter (placeholder):
    global cache
    for letter in placeholder:
        cache += cheat_sheet [letter]
    return cache
b = input ('bruh')
def DO_THE_THING():
    placeholder = b
    the_answer = morse_encrypter(placeholder.upper())
    print (the_answer)
DO_THE_THING ()
Or you can initialize a local 'cache' variable instead, depending on your program's requirements.
Put cache = '' at the start of the function morse_encrypter().
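Which of the two fixes is appropriate depends on whether the result should persist between calls. The sketch below contrasts them, assuming a two-letter subset of the table; `CODES`, `encode_global` and `encode_local` are illustrative names.

```python
CODES = {'S': '...', 'O': '---'}   # illustrative subset of the table

cache = ''

def encode_global(text):
    global cache               # reuse the module-level variable
    for letter in text:
        cache += CODES[letter]
    return cache

def encode_local(text):
    cache = ''                 # fresh local variable on every call
    for letter in text:
        cache += CODES[letter]
    return cache

print(encode_global('SO'))  # ...---
print(encode_global('SO'))  # ...---...---  (the previous result is still in cache)
print(encode_local('SO'))   # ...---
print(encode_local('SO'))   # ...---
```

With the global version every call appends to whatever earlier calls left behind, which is rarely what an encoder wants.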

Convert a list of tab prefixed strings to a dictionary

I'm attempting some text mining and would like to turn the list below:
a=['Colors.of.the universe:\n',
' Black: 111\n',
' Grey: 222\n',
' White: 11\n'
'Movies of the week:\n',
' Mission Impossible: 121\n',
' Die_Hard: 123\n',
' Jurassic Park: 33\n',
'Lands.categories.said:\n',
' Desert: 33212\n',
' forest: 4532\n',
' grassland : 431\n',
' tundra : 243451\n']
to this:
{'Colors.of.the universe':{Black:111,Grey:222,White:11},
'Movies of the week':{Mission Impossible:121,Die_Hard:123,Jurassic Park:33},
'Lands.categories.said': {Desert:33212,forest:4532,grassland:431,tundra:243451}}
Tried this code below but it was not good:
{words[1]:words[1:] for words in a}
which gives
{'o': 'olors.of.the universe:\n',
' ': ' tundra : 243451\n',
'a': 'ands.categories.said:\n'}
It only takes the first word as the key which is not what's needed.
A dict comprehension is an interesting approach.
a = ['Colors.of.the universe:\n',
' Black: 111\n',
' Grey: 222\n',
' White: 11\n',
'Movies of the week:\n',
' Mission Impossible: 121\n',
' Die_Hard: 123\n',
' Jurassic Park: 33\n',
'Lands.categories.said:\n',
' Desert: 33212\n',
' forest: 4532\n',
' grassland : 431\n',
' tundra : 243451\n']
result = dict()
current_key = None
for w in a:
    # If the line starts with whitespace (a tab in the original data), it's an item under the current category
    if w.startswith((' ', '\t')):
        # Split the item, e.g. ' Desert: 33212\n' -> [' Desert', ' 33212\n']
        splitted = w.split(':')
        # Set the key and the value of the item:
        # remove redundant whitespace and '\n',
        # and convert the value to a number
        k, v = splitted[0].strip(), int(splitted[1].replace('\n', ''))
        result[current_key][k] = v
    # Otherwise, it's a category
    else:
        # Remove ':' and '\n' from the category name
        current_key = w.replace(':', '').replace('\n', '')
        # If the category does not exist yet, create a dictionary for it
        if current_key not in result:
            result[current_key] = {}
# {'Colors.of.the universe': {'Black': 111, 'Grey': 222, 'White': 11}, 'Movies of the week': {'Mission Impossible': 121, 'Die_Hard': 123, 'Jurassic Park': 33}, 'Lands.categories.said': {'Desert': 33212, 'forest': 4532, 'grassland': 431, 'tundra': 243451}}
print(result)
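The same loop can be wrapped in a function, which makes it easier to reuse and test. This is only a sketch: `parse_categories` and the shortened `sample` list are made-up names and data, and it assumes every item line is indented and comes after a category line.

```python
def parse_categories(lines):
    """Group indented 'name: number' lines under the preceding heading."""
    result = {}
    current = None
    for line in lines:
        # partition() splits on the first ':' only.
        name, _, value = line.partition(':')
        if line[:1].isspace():               # indented line: an item
            # int() ignores surrounding whitespace, including '\n'.
            result[current][name.strip()] = int(value)
        else:                                # unindented line: a category
            current = name.strip()
            result.setdefault(current, {})
    return result

sample = ['Colors:\n', '\tBlack: 111\n', ' Grey : 222\n',
          'Lands:\n', '\tDesert: 33212\n']
print(parse_categories(sample))
# {'Colors': {'Black': 111, 'Grey': 222}, 'Lands': {'Desert': 33212}}
```

An item that appears before any category would raise a KeyError here, which may or may not be the behavior you want for malformed input.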
That's really close to valid YAML already. You could just quote the property labels and parse. And parsing a known format is MUCH superior to dealing with and/or inventing your own. Even if you're just exploring base python, exploring good practices is just as (probably more) important.
import re
import yaml
raw = ['Colors.of.the universe:\n',
' Black: 111\n',
' Grey: 222\n',
' White: 11\n',
'Movies of the week:\n',
' Mission Impossible: 121\n',
' Die_Hard: 123\n',
' Jurassic Park: 33\n',
'Lands.categories.said:\n',
' Desert: 33212\n',
' forest: 4532\n',
' grassland : 431\n',
' tundra : 243451\n']
# Fix spaces in property names
fixed = []
for line in raw:
    match = re.match(r'^( *)(\S.*?): ?(\S*)\s*', line)
    if match:
        fixed.append('{indent}{safe_label}:{value}'.format(
            indent = match.group(1),
            safe_label = "'{}'".format(match.group(2)),
            value = ' ' + match.group(3) if match.group(3) else ''
        ))
    else:
        raise Exception("regex failed")
parsed = yaml.load('\n'.join(fixed), Loader=yaml.FullLoader)
print(parsed)

How do I remove the last character of a variable until there is only one left in python

I am working on a program that prints every possible combination of a word. Everything works, but I wanted to take it a step further so it doesn't only print all combinations of a word. It should remove the last character of a word until there is only one letter left.
Here is what I have written:
# Enter Word
print("Enter your word:")
print()
s = input(colorGreen)
# If s is a string, print all combinations
if all(s.isalpha() or s.isspace() for s in s):
    t=list(itertools.permutations(s,len(s)))
    for i in range(0,len(t)):
        print(colorGreen + "".join(t[i]))
    while len(s) != 1:
        t=list(itertools.permutations(s,len(s)))
        for i in range(0,len(t)):
            print(colorGreen + "".join(t[i]))
    print()
    print("Finished!")
    print()
    input("Press anything to quit...")
# If s is not a string, print error
if not all(s.isalpha() or s.isspace() for s in s):
    print(colorRed + "You did not enter a correct word")
    print("Only use a word for Word combinations")
    print("Please restart the program.")
    print()
    input("Press anything to quit...")
Thanks in advance :D
You can do it this way:
from itertools import permutations
string = 'hello'
c = reversed([''.join(s) for i in range(1,len(string)+1) for s in permutations(string,i)])
print('\n'.join(c))
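The snippet above prints permutations of every length drawn from all the letters. If the goal is literally to drop the last character each time, as the question describes, a sketch along these lines may be closer. `shrinking_permutations` is a made-up name, and removing duplicates with set() is an assumption about what is wanted for words with repeated letters.

```python
from itertools import permutations

def shrinking_permutations(word):
    """Yield permutations of word, then of word minus its last character, and so on."""
    while word:
        # set() drops duplicates caused by repeated letters;
        # sorted() makes the order deterministic.
        for perm in sorted(set(permutations(word))):
            yield ''.join(perm)
        word = word[:-1]          # remove the last character

print(list(shrinking_permutations('abc')))
# ['abc', 'acb', 'bac', 'bca', 'cab', 'cba', 'ab', 'ba', 'a']
```

The loop stops once the word is empty, so the last thing produced is the single remaining letter.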
Output:
olleh
ollhe
olelh
olehl
olhle
olhel
olleh
ollhe
olelh
olehl
olhle
olhel
oellh
oelhl
oellh
oelhl
oehll
oehll
ohlle
ohlel
ohlle
ohlel
ohell
ohell
loleh
lolhe
loelh
loehl
lohle
lohel
lloeh
llohe
lleoh
lleho
llhoe
llheo
leolh
leohl
leloh
lelho
lehol
lehlo
lhole
lhoel
lhloe
lhleo
lheol
lhelo
loleh
lolhe
loelh
loehl
lohle
lohel
lloeh
llohe
lleoh
lleho
llhoe
llheo
leolh
leohl
leloh
lelho
lehol
lehlo
lhole
lhoel
lhloe
lhleo
lheol
lhelo
eollh
eolhl
eollh
eolhl
eohll
eohll
elolh
elohl
elloh
ellho
elhol
elhlo
elolh
elohl
elloh
ellho
elhol
elhlo
eholl
eholl
ehlol
ehllo
ehlol
ehllo
holle
holel
holle
holel
hoell
hoell
hlole
hloel
hlloe
hlleo
hleol
hlelo
hlole
hloel
hlloe
hlleo
hleol
hlelo
heoll
heoll
helol
hello
helol
hello
olle
ollh
olel
oleh
olhl
olhe
olle
ollh
olel
oleh
olhl
olhe
oell
oelh
oell
oelh
oehl
oehl
ohll
ohle
ohll
ohle
ohel
ohel
lole
lolh
loel
loeh
lohl
lohe
lloe
lloh
lleo
lleh
llho
llhe
leol
leoh
lelo
lelh
leho
lehl
lhol
lhoe
lhlo
lhle
lheo
lhel
lole
lolh
loel
loeh
lohl
lohe
lloe
lloh
lleo
lleh
llho
llhe
leol
leoh
lelo
lelh
leho
lehl
lhol
lhoe
lhlo
lhle
lheo
lhel
eoll
eolh
eoll
eolh
eohl
eohl
elol
eloh
ello
ellh
elho
elhl
elol
eloh
ello
ellh
elho
elhl
ehol
ehol
ehlo
ehll
ehlo
ehll
holl
hole
holl
hole
hoel
hoel
hlol
hloe
hllo
hlle
hleo
hlel
hlol
hloe
hllo
hlle
hleo
hlel
heol
heol
helo
hell
helo
hell
oll
ole
olh
oll
ole
olh
oel
oel
oeh
ohl
ohl
ohe
lol
loe
loh
llo
lle
llh
leo
lel
leh
lho
lhl
lhe
lol
loe
loh
llo
lle
llh
leo
lel
leh
lho
lhl
lhe
eol
eol
eoh
elo
ell
elh
elo
ell
elh
eho
ehl
ehl
hol
hol
hoe
hlo
hll
hle
hlo
hll
hle
heo
hel
hel
ol
ol
oe
oh
lo
ll
le
lh
lo
ll
le
lh
eo
el
el
eh
ho
hl
hl
he
o
l
l
e
h

Traceback for regular expression

Let's say I have a regular expression:
match = re.search(pattern, content)
if not match:
    raise Exception('regex traceback') # I want to include the regex matching process here.
If the regular expression fails to match, I want the exception to show how the matching proceeded and where it failed against the pattern, at what stage, etc. Is it even possible to achieve this?
I have something that helps me debug complex regex patterns in my code.
Does this help you?
import re
li = ('ksjdhfqsd\n'
'5 12478 abdefgcd ocean__12 ty--\t\t ghtr789\n'
'qfgqrgqrg',
'6 48788 bcfgdebc atlantic__7899 %fg#\t\t ghtu12340\n',
'2 47890 bbcedefg arctic__124 **juyf\t\t ghtr89877',
'9 54879 bbdecddf antarctic__13 18:13pomodoro\t\t ghtr6798',
'ksjdhfqsd\n'
'5 12478 abdefgcd ocean__1247101247887 ty--\t\t ghtr789\n'
'qfgqrgqrg',
'6 48788 bcfgdebc atlantic__7899 %fg#\t\t ghtu12940\n',
'25 47890 bbcedefg arctic__124 **juyf\t\t ghtr89877',
'9 54879 bbdeYddf antarctic__13 18:13pomodoro\t\t ghtr6798')
tupleRE = ('^\d',
' ',
'\d{5}',
' ',
'[abcdefghi]+',
' ',
'(?=[a-z\d_ ]{14} [^ ]+\t\t ght)',
'[a-z]+',
'__',
'[\d]+',
' +',
'[^\t]+',
'\t\t',
' ',
'ght',
'(r[5-9]+|u[0-4]+)',
'$')
def REtest(ch, tupleRE, flags = re.MULTILINE):
    # Try ever longer prefixes of the pattern until one stops matching
    for n in range(len(tupleRE)):
        regx = re.compile(''.join(tupleRE[:n+1]), flags)
        testmatch = regx.search(ch)
        if not testmatch:
            print('\n -*- tupleRE :\n')
            print('\n'.join(str(i).zfill(2)+' '+repr(u)
                            for i,u in enumerate(tupleRE[:n])))
            print(' --------------------------------')
            # tupleRE stops matching because of element n
            print(str(n).zfill(2)+' '+repr(tupleRE[n])
                  +" doesn't match anymore from this ligne "
                  +str(n)+' of tupleRE')
            print('\n'.join(str(n+1+j).zfill(2)+' '+repr(u)
                            for j,u in enumerate(tupleRE[n+1:
                                                         min(n+2,len(tupleRE))])))
            # Find the longest prefix of the pattern that still matches
            match = None
            for i in range(n):
                match = re.search(''.join(tupleRE[:n-i]),ch, flags)
                if match:
                    break
            # No prefix matches at all when the very first element fails
            matching_portion = match.group() if match else ''
            matching_li = '\n'.join(map(repr,
                                        matching_portion.splitlines(True)[-5:]))
            fin_matching_portion = match.end() if match else 0
            print('\n\n -*- Part of the tested string which is concerned :\n\n'
                  '######### matching_portion ########\n'+matching_li + '\n'
                  '##### end of matching_portion #####\n'
                  '-----------------------------------\n'
                  '######## unmatching_portion #######')
            print('\n'.join(map(repr,
                                ch[fin_matching_portion:
                                   fin_matching_portion+300].splitlines(True))))
            break
    else:
        print('\n SUCCESS . The regex integrally matches.')
for x in li:
    print(' -*- Analyzed string :\n%r' % x)
    REtest(x,tupleRE)
    print('\nmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm')
result
-*- Analyzed string :
'ksjdhfqsd\n5 12478 abdefgcd ocean__12 ty--\t\t ghtr789\nqfgqrgqrg'
SUCCESS . The regex integrally matches.
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
-*- Analyzed string :
'6 48788 bcfgdebc atlantic__7899 %fg#\t\t ghtu12340\n'
SUCCESS . The regex integrally matches.
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
-*- Analyzed string :
'2 47890 bbcedefg arctic__124 **juyf\t\t ghtr89877'
SUCCESS . The regex integrally matches.
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
-*- Analyzed string :
'9 54879 bbdecddf antarctic__13 18:13pomodoro\t\t ghtr6798'
SUCCESS . The regex integrally matches.
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
-*- Analyzed string :
'ksjdhfqsd\n5 12478 abdefgcd ocean__1247101247887 ty--\t\t ghtr789\nqfgqrgqrg'
-*- tupleRE :
00 '^\\d'
01 ' '
02 '\\d{5}'
03 ' '
04 '[abcdefghi]+'
05 ' '
--------------------------------
06 '(?=[a-z\\d_ ]{14} [^ ]+\t\t ght)' doesn't match anymore from this ligne 6 of tupleRE
07 '[a-z]+'
-*- Part of the tested string which is concerned :
######### matching_portion ########
'5 12478 abdefgcd '
##### end of matching_portion #####
-----------------------------------
######## unmatching_portion #######
'ocean__1247101247887 ty--\t\t ghtr789\n'
'qfgqrgqrg'
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
-*- Analyzed string :
'6 48788 bcfgdebc atlantic__7899 %fg#\t\t ghtu12940\n'
-*- tupleRE :
00 '^\\d'
01 ' '
02 '\\d{5}'
03 ' '
04 '[abcdefghi]+'
05 ' '
06 '(?=[a-z\\d_ ]{14} [^ ]+\t\t ght)'
07 '[a-z]+'
08 '__'
09 '[\\d]+'
10 ' +'
11 '[^\t]+'
12 '\t\t'
13 ' '
14 'ght'
15 '(r[5-9]+|u[0-4]+)'
--------------------------------
16 '$' doesn't match anymore from this ligne 16 of tupleRE
-*- Part of the tested string which is concerned :
######### matching_portion ########
'6 48788 bcfgdebc atlantic__7899 %fg#\t\t ghtu12'
##### end of matching_portion #####
-----------------------------------
######## unmatching_portion #######
'940\n'
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
-*- Analyzed string :
'25 47890 bbcedefg arctic__124 **juyf\t\t ghtr89877'
-*- tupleRE :
00 '^\\d'
--------------------------------
01 ' ' doesn't match anymore from this ligne 1 of tupleRE
02 '\\d{5}'
-*- Part of the tested string which is concerned :
######### matching_portion ########
'2'
##### end of matching_portion #####
-----------------------------------
######## unmatching_portion #######
'5 47890 bbcedefg arctic__124 **juyf\t\t ghtr89877'
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
-*- Analyzed string :
'9 54879 bbdeYddf antarctic__13 18:13pomodoro\t\t ghtr6798'
-*- tupleRE :
00 '^\\d'
01 ' '
02 '\\d{5}'
03 ' '
04 '[abcdefghi]+'
--------------------------------
05 ' ' doesn't match anymore from this ligne 5 of tupleRE
06 '(?=[a-z\\d_ ]{14} [^ ]+\t\t ght)'
-*- Part of the tested string which is concerned :
######### matching_portion ########
'9 54879 bbde'
##### end of matching_portion #####
-----------------------------------
######## unmatching_portion #######
'Yddf antarctic__13 18:13pomodoro\t\t ghtr6798'
mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwm
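The core of the approach above, stripped of the reporting, is to test ever longer prefixes of the pattern and report the first piece that breaks the match. A compact sketch of that idea follows; `first_failing_part` and the shortened `parts` tuple are invented for this example, and it assumes each prefix of the pieces is itself a valid regex (so a group must not be split across pieces).

```python
import re

def first_failing_part(parts, text, flags=re.MULTILINE):
    """Return the index of the first part that breaks the match, or None."""
    for n in range(1, len(parts) + 1):
        if not re.search(''.join(parts[:n]), text, flags):
            return n - 1
    return None

parts = (r'^\d', r' ', r'\d{5}', r' ', r'[a-i]+', r'$')
print(first_failing_part(parts, '5 12478 abdefgcd'))   # None
print(first_failing_part(parts, '25 47890 bbcedefg'))  # 1
print(first_failing_part(parts, '9 54879 bbdeYddf'))   # 5
```

The returned index could then be placed in the message of the exception that is raised when the full pattern does not match.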
I've used Kodos (http://kodos.sourceforge.net/about.html) in the past to perform RegEx debugging. It's not the ideal solution since you want something for run-time, but it may be helpful to you.
If you need to test the regex, you can probably wrap each part in a group followed by *, as in (sometext)*.
Use this alongside your desired regex, and you should be able to pick out where the match fails.
You can then make use of the following match object attributes, as documented on python.org:
pos
The value of pos which was passed to the search() or match() method of the RegexObject. This is the index into the string at which the RE engine started looking for a match.
endpos
The value of endpos which was passed to the search() or match() method of the RegexObject. This is the index into the string beyond which the RE engine will not go.
lastindex
The integer index of the last matched capturing group, or None if no group was matched at all. For example, the expressions (a)b, ((a)(b)), and ((ab)) will have lastindex == 1 if applied to the string 'ab', while the expression (a)(b) will have lastindex == 2, if applied to the same string.
lastgroup
The name of the last matched capturing group, or None if the group didn’t have a name, or if no group was matched at all.
re
The regular expression object whose match() or search() method produced this MatchObject instance.
string
The string passed to match() or search().
So, for a very simple example (lastindex is printed here because the groups are unnamed, so lastgroup would be None):
>>> m1 = re.compile(r'the real thing')
>>> m2 = re.compile(r'(the)* (real)* (thing)*')
>>> if not m1.search(mytextvar):
...     res = m2.search(mytextvar)
...     print(res.lastindex)
...     # raise my exception
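To make the difference between lastindex and lastgroup concrete, here is a small sketch with named groups. The group names `first`, `second` and `third` and the sample text are assumptions made for the illustration.

```python
import re

# Same idea as the pattern above, but with named groups so that
# lastgroup has something to report.
pattern = re.compile(r'(?P<first>the)* (?P<second>real)* (?P<third>thing)*')

m = pattern.search('the real stuff')
print(m.lastindex)        # 2
print(m.lastgroup)        # second
print(repr(m.group(0)))   # 'the real '
```

The third group never takes part in the match, so the last matched group is the second one, which hints that the text stopped agreeing with the pattern at "thing". Note that search() returns None when the two separating spaces are not found at all, so real code would need to check for that before reading the attributes.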
