Finding the length of longest string

Finding the length of longest string - python

I have just started learning Python. Part of my exercise is to find the length of the longest string (line) in a text, passed in as 'box' in the following case:
def file(box):
    maxlen=0
    f=box.splitlines()
    for i in f:
        if len(i)>=maxlen:
            maxlen=len(i)
        return maxlen
print file("""abcd efgh ijkl
on different lines
I""")
In this case I get 14 instead of 18, which is the correct answer. Can somebody please help me solve this problem?

You've indented your return statement too much:
    for i in f:
        if len(i)>=maxlen:
            maxlen=len(i)
        return maxlen
At the moment, the return sits inside the loop, so the function returns at the end of the first iteration and only the length of the first line (14) is ever reported. Move the return statement outside the loop:
    for i in f:
        if len(i)>=maxlen:
            maxlen=len(i)
    return maxlen
...and it should work.
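For comparison, the same result can be computed without an explicit loop. This is only a minimal sketch, assuming Python 3; the name longest_line_length is made up for illustration (the original name, file, shadows a Python 2 built-in).

```python
def longest_line_length(text):
    """Return the length of the longest line in text (0 for empty text)."""
    # default=0 covers the empty string, where splitlines() returns [].
    return max((len(line) for line in text.splitlines()), default=0)


sample = """abcd efgh ijkl
on different lines
I"""
print(longest_line_length(sample))  # 18
```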

Related

How do I solve my program's counting problem?

(Apologies this is gonna be a long question)
I have a bug in my code that I have not been able to resolve for a very long time. I would really appreciate it if someone could help me find out what the problem is.
Context:
I have a long string of letters - let's call this subject - containing the letters A, G, T and C (like DNA), and the whole point of my algorithm is to correctly count how many of each of the following STRs are found within subject. The STRs are:
AGATC
TTTTTTCT
AATG
TCTAG
GATA
TATC
GAAA
TCTG
I must count how many of each are within subject. Counting works by going sequentially, letter by letter, until the start of one of the above STRs is found. If the rest of the STR follows, the program should update the counter of the respective STR, then boost the searching index to account for the length of the STR, and then keep going. It should stop when it reaches the end of subject.
(Hope it makes sense).
My Code:
STRs = ['AGATC','TTTTTTCT','AATG','TCTAG','GATA','TATC','GAAA','TCTG']
subject = "GCTAAATTTGTTCAGCCAGATGTAGGCTTACAAATCAAGCTGTCCGCTCGGCACGGCCTACACACGTCGTGTAACTACAACAGCTAGTTAATCTGGATATCACCATGACCGAATCATAGATTTCGCCTTAAGGAGCTTTACCATGGCTTGGGATCCAATACTAAGGGCTCGACCTAGGCGAATGAGTTTCAGGTTGGCAATCAGCAACGCTCGCCATCCGGACGACGGCTTACAGTTAGTAGCATAGTACGCGATTTTCGGGAAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGAATGTATCTATCTATCTATCTATCTATCTATCTATCTATCTATCTATCTATCTATCTATCTATCTATCTATCTATCCCGTCAACTCATTCACACCGCATCCTTTCCTGCCACTGTAACTAGTCGACTGGGGAACCTCATCATCCATACTCTCCCACATTATGCCTCCCAACCTTGTTAAGCGTGGCATGCTTGGGATTGCATTGATGCTTCTTGGAGAGGACGCTTTCGTTTTGGAGATTACAGGGATCCAATTTTATCATCGGTTCGACTCCCGTAACGACTTAGCAGTAAGGGTGCTAGTTCCTGGTTAGAATCTTAATAAATCACGTCGCTTGGAGCAAGACAAAGATCGTCGTAATGCCAAGTGCACGACCACCTTCAGACTTGCAGGACCCGTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTTTTTTTCTCGATAGCTATGCGGTTCAATACAATCTTAACGCAATGCAGCGATGTGGTTTCGTACACTTAGCATAAAACCCCCCACATTAAATCGATGTACCCGCCCTCTTAGACGCCAATTTCAATGCCGAACCTCCGGCGGGTATCTCTGCACTAGGAGAAGTAGCACGTCGCTGTAGCGAACTCCTATCGTGAGATAATTTGTAGAGCTGCTCTTATAATACAATAGCTCAGATGGATTATTCCATGGACATCCCCGTGCGTTGTTTCGAGGATGGTAGGTGGAAATTTTGCCAGACCTCTAGTCTTAAACATGGTTGACGTTATAGGCGCTATCTCTTGCGTCTGGAAGTGTTAATCCGTGAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAAGAAACACGCAACTCTGGAGGAGGGCACTGCACTGCAAACTTGCGTAATATCCTTCACCCACACTTGCCTGGCCTCCTTGCTTAAAGCTCTGGCGATGCGATTTTTCGGCCCAGTAGCTGAATAGGTCATGAAATGGGCACCGAACTGGAAAGACCCATATATTCGATACTCACAACTTAATGATAGCGCGATTAAGAGCGACACCAAAAACCAAATTACGTTCACGAACCTTTGAGAGTCAAGGAGACTTAGACCGAATTGAATGATCACTGATGCGCCCGCTGATACTGAGCCTCACCATTAATCGCCGACCAATACGGCGTGTACCGGGCGCGGCCTTGCCGCATAACGATAGA
TAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATATCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTCTGTACACAGCCCCGTCCTCATTGCTAAGTGCACTGGCAACTGGACCTAAAGATTTTTCGAGTATGGCCCTCGAATCAAGCGCCCACCCAGAAACCTACGAGCCAGTAACCCCAGTAAACAAGCATTAGTGCTATATGCTTGCTGCCCACTAGGACCCTTATGGTTCATACCAGGGTGACGTGTCTTGCGGGCCAAGGATGAACCAGAAGCAAGATCCTTAGATGGACGACTGTCTCATTGCTTAAACTCCACATACCAAAGGGCGCGGTAAACGATAGTTTTAGGTAATGTTAGTCGGATGGTTGTCTGCAGCTACCAATACAGCCTGGCACCCAGGGTCTGAACAATAACGCGTGAGAGCAGCTCTCCCGCGTGTGGTGGATTTGCCGTCTATGAAATTGAGGCTCTTGCAACTATTCGCACTCGGAATGCCCTCATATCTGGTGCCTAGCGGCCTTTGCCCCGTGCCGGTAGGACTAAACTCTACGGATCGTTGACGGATCTCGATGTGGAAGATGGTTATGAAAGATAACAACGCGTGTGCTAATTGATTTAGACAAGTATTGCGGCAGTAAAAGATAATCGGCTGCAGAGTTACGAAAGACTTCCATGCATGGATTCCATTCCTTCTAGTATAGGACCCACTCTGAATACACGTCTTGCGGGCCGATCATCTCCACCGCTGCGGAAGAAAGCAATTAAGAATCTATGCTCATTAAGAGTGCGACTATAATGCGGATCTTACAGTGCTAATGATCAGGACGTCGTCCAAGCAGGCTGCATGCCGAATTTAGCTTACGTCAGGATCAGGCGTTATAGCCTGGGAATCGGACTATGAGGACGCCACGACCTCTGGGAGAAAGCTATATACATTGAGGATCGCGCCATCTTTATGAGACTCAAATGAATCTAGATAGGTAGCATTGCGGACTTGAGTTAGCACATCGGTATTGGAAGGTGAGGGTCCTGCCGCTCGTTCTATGTTCGGTTTATAGTATACAAATAGGTCATCCCGAACGTTGAAGTTAAACTCATGACACGTTGTCGTAATGAAACGGGCCTGTTATTAGGGATACAGACAAAAGGCACAAGCTGGCTTGCACATTAAGGCGCACTAGAGATCCTCACAACCGTTGCCCGCACGGAGGTCGTGTCTAACAGACAGTGAACCAGCCGTATTGGGGTGGATGACCTGAGCTTCTTGGGGCCTGTTGTACACCGCGTGTGGTTCAACTGGTACACATACTACGAATATTCGAAATCATTGTACTGTGCTCTTCGGTGCTACTGACTGTGAGCGAATGCATCCCAATCCCAAACAATGCTTGTGGTAGGAGAATTGAAACTCTCGAAGCCTGGCCCAATGTCATCTACTTTTAACATGTCGGGCCAGGAGTTACGGGCATTGCTTACTTACTTTGCCCCCTTACACCACAGCAGCGCGATTCTTGTTGTAGTAGATTTTATACGACTCGCGAATTAAATGGAACTTGTCTGTCCCATATCGATCGTGTCCATCGTAAGATGAGATTGTAGGAGCATTCGGAAGTCTATGCGGCCCAGGGACTACTACGTTAAATCTGGTCAGACGTGGTTTACAAGGCGTCCCGATCTTCTCAGAACATATGGGAAAGCACTACCGTTCCTTCACGCATACAGTTGTTCGTGCCGAACGAGTAAGCTTGCGACCAGCCCACCCGCTAGGGCTATGCAGCGGGTCATGGCTGGCGCCATACTGTG
CGGACAACCCACGCTCTGGCAGAAAGCGTCTTGTGTTTTGTAGTAGCTCCAACGGTTAGACCTTCGATATCTATTCAGAGCGCGAGCGACCACTATTAGACGGCATGTAAACAATGTGTATTTGTTCGGCCCAACCGGTATATGGGTAAGACCGCGAAGGGCCTGCGCGAATACCAGCGTCCAAAAATTCCTCACCCGAGATATGCGGTTAGTACCCCTTGGGTAACGGTCCGCTACGGGTAGCGACGCGAGCCGGCCGCATCGGTTGGAGCCGAGTTGTCGGGCAGGCGAGTAACGTGTGCAATTTGATGGGCCCAAGCCTCCGGCACTATCCACCTCATACATCGACAAAAGCACCAAATATGGGGAAAAGCTGAGCGTCGATATGTACATCTACCCAGGAACCGGCCCGAACATTAGGCGGACGTGAATTTCCGACCTAGGTTCGGCTACATTTCTACGATCCAAGCACACGTGAAGGAGGAGGGGTGTTCCGACCGTAAATGAACGAGGTGCGCAGTGACCCGATGGCGTTTAGCGGATAGCCTTCCTATGCCGGCCTATGCTGTATGGTAGTTGGTTGGTGCCTCCAGAGCCACTGCACCCAATCATAGGGTCTACAGCAGCGTACTTATAAAATTGTACGGGTGACCCATATCCATTACGGGTTGCGACCAGTATAGGAGAGTATAACTGCGTGAACTAATGCGTTATGACGCTTCAGAGTTTGCTCGGGCCCGAGTTCTAGGGCTATAATGTGTTAGGGCGCAAGTATGCCAAGCTAAGATGTGGCGTGCACACTAGGAGTTGTGTTCCTCTGCAAGCAGACACGAGCACTCTGGCAGTAGTTTGACCACACCCGGGTATCACTGCTACTCCATTTCGAACAAGCTATTGGAGCGGACAAAATATGCTACTCAAGAGCATTAGTTATAGGTCTACGAGACAGAAGCAGTTACTGAGTCTGAATATTCGATATAAGTAGGCATGGAGGCGGAGCAAAACAACGTCTGCGATCAATCGTGTTGATGACGTATGGCGACTGGAAGGTAAGGACTATGGCCGGACGGAATGATTCATGTTCTGTTCAAAGCTATATTTCGAAGGGGTATATTAGCGGTCCTACACTTGGTTAGCACCCTCCCCCCTCTGGATCCTGCACTAATTCGAGCTGGCCTCCATCGGTATCAGTCCGGAAGCTCCACTCTCTATCGTAGTCCTAATCAACAGGGTGCCAGTTTGCTCACGTGGAAGTTTGAGGCCCTTTGTGCTCCATAGCCAATCACTAACCATGCACGCGCGACCCACTCTACGTCCAGATCGGCTATAATAGTTGCGCCCGGGACTGGCAGAGTAGACATGTAAGCTAGATAGAGCCCCGACATCGGCCAAGAGATCCTACGCTGCTTCCAGATAATGAGAGACATTCTAGCATTAGACATGCAAGTCGGCAGGGACTCCCCTTATCTAGTAATTTCGATGAATTGGTTTTTCGGCTAGCATCTAGTCTAGTCTAGTCTAGTCTAGTCTAGTCTAGTCTAGTCTAGTCTAGTCTAGTCTAGACCATGCCGACCTCATCATAGAAGGAATGCTCTAAACTTAGAGTGCTACTAGGAAAACTATTAATCAATGATCGTCCTGCTTACATAGCTGGACGGCGAAAGTTCTTATACTGCGGAGGTTGCTGACGTAGAGTGCGCTGGGTACAGCGGATAAGTTGATCAGGGTGGGGATAGGGTGGCTCACCGTTTATACTCATATAGATTCCTGGCGTCGACGCTGTGACAGGGTCGAGATCGAGGGGGAGATCAGATCAGATCAGATCAGATCAGATCAGATCAGATCAGATCAGATCAGATCAGATCAGATCAGATCAGATCAGATCAGATCAGATCAGATCAGATCAGATCAGATCAGCGGAGCGGAGGGAAAATTATCACCAGAGGGTAGGGGCTCGCGACATTCTATTCAATGCATTTCAAGCTACTTACGTATTTCGG
CACAGTGACTACTGCCTGCGCGGCAGCCGTAAGGTTTCCCGTCAATAGGTGGCACGTATCATTGATGAAAGTGTCAGCTAATCATTCAGGCCTTA"
x = 0 # Searching index.
dataSTR = { # All the STRs to search for.
    "AGATC":0,
    "TTTTTTCT":0,
    "AATG":0,
    "TCTAG":0,
    "GATA":0,
    "TATC":0,
    "GAAA":0,
    "TCTG":0,
}
# This dict will hold all the count values of STR's in the text-file.
# Scanning STR's from the txt file.
total = len(subject)
limit = 8
while x < total:
    currentString = subject[x:x+limit] # A temporary variable to hold the next few letters from the text-file at index x.
    for STR in STRs:
        if STR in currentString: # The STR is found within this set of letters?
            lSTR = len(STR) - 1
            if STR[0:lSTR] == currentString[0:lSTR]: # In order to minimise the risk of duplication...
                dataSTR[STR] += 1 # ...the STR must be at the start of currentString.
                #print(currentString, STR, x, dataSTR[STR])
                x += lSTR # The index must be boosted each time a new STR is read. In the event that an STR is at the end of a stand...
    x += 1 # The index counts up by 1 by default. (From above) ...so that no duplicates are added.
print(dataSTR.items())
print("The correct result is: AGATC - 22, TTTTTTCT - 33, AATG - 43, TCTAG - 12, GATA - 26, TATC - 18, GAAA - 47, TCTG - 41")
(Sorry it's very long; it might be helpful to copy it into a separate Python file.)
As you will see from running it, the result my program brings up from counting is incorrect. The correct results are in the final print statement of the program, but the program does not match this (yes I know that these results are 100% correct since this is part of a problem set from an online computer science course).
However, I cannot seem to find the bug or logic error that seems to be causing my program to count wrong and I have been trying for quite a while now. Does anyone know what the solution is?
Please feel free to ask me anything about the program, thank you all.

Your problem statement doesn't agree with the "correct results" given in your example code. Either you've misunderstood the problem, or you've taken the correct results from a different problem. (The "correct results" appear to be for the problem of finding the maximum number of consecutive repeats of each query string.) [The latter possibility is the point that Chris Charley makes in a comment on the original post.]
You can convince yourself by doing the problem "by hand": look at the subject string in a text editor, pick a query string, do a search on it, and step through the occurrences.
E.g., for the query string "GAAA", you'll count ~67 occurrences, but most of them are in a block of 47 repeats in subject[1449:1637]. (This is more obvious if you use a text editor that highlights all occurrences of the search string, as 188 characters of consecutive highlighting should jump out at you.) And 47 agrees with the "correct result" for GAAA.
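If the intended task really is the longest run of consecutive repeats, it could be sketched as below. This is an illustration under that assumption, not the course's reference solution; longest_run and the short sample sequence are made up for the example.

```python
def longest_run(sequence, pattern):
    """Return the largest number of back-to-back copies of pattern in sequence."""
    best = 0
    step = len(pattern)
    for start in range(len(sequence)):
        count = 0
        pos = start
        # Keep stepping forward as long as another copy follows immediately.
        while sequence.startswith(pattern, pos):
            count += 1
            pos += step
        best = max(best, count)
    return best


sample = "AATGAATGCCGAAAGAAAGAAAT"
for query in ("AATG", "GAAA", "TCTG"):
    print(query, longest_run(sample, query))
```

On the short sample this reports 2 for AATG, 3 for GAAA and 0 for TCTG. Applied to the full subject string it should reproduce the block of 47 GAAA repeats described above.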

Does this help?
count_results = dict()
STRs = ['AGATC','TTTTTTCT','AATG','TCTAG','GATA','TATC','GAAA','TCTG']
subject = "loooong string..."
for search_string in STRs:
    count_results[search_string] = subject.count(search_string)
print(count_results)
{'AGATC': 28, 'TTTTTTCT': 33, 'AATG': 69, 'TCTAG': 18, 'GATA': 46, 'TATC': 36, 'GAAA': 67, 'TCTG': 60}
I realize the results sometimes differ from your expected counts, but I didn't go through the intricacies of your search algorithm, and I wonder whether the expected output might be wrong. If not, check the docs for the str.count() function to see how and why it produces different output, and adapt what it does to your needs.
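One detail of str.count() worth knowing when comparing numbers: it counts non-overlapping occurrences. The sketch below contrasts that with an overlapping count built on a regular-expression lookahead; both helper names are hypothetical.

```python
import re


def count_non_overlapping(text, pattern):
    """Same as str.count: after a match, scanning resumes past its end."""
    return text.count(pattern)


def count_overlapping(text, pattern):
    """Count every start position at which pattern occurs."""
    # A zero-width lookahead matches at a position without consuming text,
    # so matches that share characters are all found.
    return len(re.findall("(?=" + re.escape(pattern) + ")", text))


print(count_non_overlapping("AAAA", "AA"))  # 2
print(count_overlapping("AAAA", "AA"))      # 3
```

For a query that cannot overlap itself, such as GAAA, both functions return the same number.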

Try it like this:
import re
# Define STRs and subject here
dic = {}
for x in STRs:
    tv = len([m.start() for m in re.finditer(x,subject)])
    tv += 1 # note: this makes every stored count one higher than the number of matches
    dic[x] = tv
for y in dic.keys():
    print(y,dic[y])

The results in the last print statement are incorrect. I checked them with Python's built-in .count() method; if you are allowed to use that method, just use it. If not, I would recommend the following:
total = len(subject)
while x < total:
    for STR in STRs:
        limit = len(STR)
        currentString = subject[x:x+limit]
        if STR == currentString:
            dataSTR[STR] += 1
    x += 1
That way, the limit is set to each STR's length, so the slice either is exactly the STR or it isn't, and you don't have to check for duplicates. I don't know why your code didn't work, but I hope this helps.

Reading data from a text file in Python according to the parameters provided

I have a text file something like this
Mqtt_allowed=true
Mqtt_host=192.168.0.1
Mqtt_port=2223
<=============>
cloud_allowed=true
cloud_host=m12.abc.com
cloud_port=1232
<=============>
local_storage=true
local_path=abcd
I need to get the value for whichever parameter the user provides.
What I am doing right now is:
def search(param):
    try:
        with open('config.txt') as configuration:
            for line in configuration:
                if not line:
                    continue
                function, f_input=line.split("=")
                if function == param:
                    result=f_input.split()
                    break
                else:
                    result="0"
    except FileNotFoundError:
        print("File not found: ")
    return result
mqttIsAllowed=search("Mqtt_allowed")
print mqttIsAllowed
Now when I look up only the Mqtt parameters it works fine, but when I look up cloud or anything after the "<=============>" separator it throws an error. Thanks

Just skip all the lines starting with <:
if not line or line.lstrip().startswith("<"):
    continue
Or, if you really, really want to match the separator exactly:
if line.strip() == "<=============>":
    continue
I think the first variant is better because if someone slightly modified the separator by accident, the second piece of code won't work at all.

Because you are trying to split on the = character in a style that seems to be standard INI format, it is safe to assume that your pairs will be at max size 2. I'm not a fan of using methods that rely on character checking (unless specifically called for), so give this a whirl:
def search(param):
    result = '0' # declare here
    try:
        with open('config.txt') as configuration:
            for line in configuration:
                if not line:
                    continue
                f_pair = line.strip().split("=") # remove \r\n, \n
                if len(f_pair) > 2: # your separator will be much longer
                    continue
                elif f_pair[0] == param:
                    result = f_pair[1]
                    # result = f_input.split() # why the 'split()' here?
                    break
    except FileNotFoundError:
        print("File not found: ")
    return result
mqttIsAllowed=search("Mqtt_allowed")
I'm pretty sure the error you were getting was a ValueError: too many values to unpack.
Here is how I know that:
When you call this function for any of the Mqtt_* values, the loop never encounters the separator string <=============>. As soon as you try to look up anything below that first separator (for example a cloud_* key), the loop eventually reaches the first separator and tries to execute:
function, f_input = line.split('=')
But that won't work; in fact it will tell you:
ValueError: too many values to unpack (expected 2)
That is because you are forcing the result of the split() call into only 2 variables, but split('=') on your separator string returns a list with many more than 2 elements (the '<', the '>' and an empty string between each pair of adjacent '=' characters). Thus, doing what I have posted above ensures that your split('=') still goes off, but checks whether you hit a separator or not.
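As an alternative to checking the length of the split result, str.partition always returns exactly three parts, so the unpacking cannot fail. The sketch below is only an illustration: search_lines is a hypothetical helper that takes any iterable of lines (an open file works too) rather than opening config.txt itself.

```python
def search_lines(lines, param, default="0"):
    """Return the value stored for param, or default if it is not present."""
    for line in lines:
        # partition() splits on the first "=" only and always gives 3 parts.
        key, sep, value = line.strip().partition("=")
        if sep and key == param:
            return value
    return default


sample = [
    "Mqtt_allowed=true\n",
    "Mqtt_port=2223\n",
    "<=============>\n",
    "cloud_host=m12.abc.com\n",
    "cloud_port=1232\n",
]
print(search_lines(sample, "cloud_port"))    # 1232
print(search_lines(sample, "local_path"))    # 0
```

The separator line is partitioned into the key '<' and a value made of the remaining characters, so it never matches a real parameter name and needs no special case.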

Functional approach to file parsing in Python

I have a text file describing an electronic circuit and a few other things done with it. I've built a simple Python code that splits the file into different units which can then be further analyzed if needed.
The syntax of the simulation language defines these units as contained within the following lines:
subckt xxx .....
...
...
ends xxx ...
There are a few of these 'text blocks', plus other stuff I'm parsing or leaving out, like comment lines.
To accomplish this, I use the following core:
with open('input') as f:
    for l in iter(f):
        if 'subckt' not in l:
            pass
        else:
            with open('output', 'a') as o: # append mode, so every block ends up in the file
                o.write(l)
                for l in iter(f):
                    if 'ends' in l:
                        o.write(l)
                        break
                    else:
                        o.write(l)
(can't easily paste the real code, there might be oversights)
The nice thing about it is the fact that iter(f) keeps scanning the file so when I break out of the inner loop as I reached the ends line of a subckt, the outer loop keeps going from that point onward, searching for new occurrences of the token subckt in subsequent lines.
I am looking for suggestions and/or guidance on how to transform the forest of if/then clauses into something more functional, i.e. based on 'pure' functions which just yield values (the file rows or lines) and are then composed to produce the final result.
Specifically, I am not sure how to approach the fact that the generator/map/filter should actually yield a different row depending on whether it has found the subckt token or not.
I can think of a filter of the form:
line = filter(lambda x: 'subckt' in x, iter(f))
but this of course only gives me the lines where that string is present, whereas I would like - from that moment on - yield all lines, until the ends token is found.
Is this something I'd have to handle with recursion? Or maybe itertools.tee?
Seems to me that what I want is to have some form of state, i.e. "you have reached a subckt", but without resorting to a true state variable, which would be against the functional paradigm.

Not sure if this is what you are looking for. blocks(f) is a generator producing the blocks in your file f. Each block is an iterator over the lines between 'subckt' and 'ends'. If you want to include those two lines in the block, you'd have to do some more work in __block. But I hope this gives you an idea:
def __block(f):
    while 'subckt' not in next(f): pass # raises StopIteration at EOF
    return iter(next(iter([])) if 'ends' in l else l.strip() for l in f)
def blocks(f):
    while 1: yield __block(f) # StopIteration from __block will stop the generator
f = open('data.txt')
for block in blocks(f):
    # process block
    for line in block:
        pass # process line
next(iter([])) is a little hack to terminate a comprehension/generator: it raises StopIteration. Note that since Python 3.7 (PEP 479), a StopIteration raised inside a generator is converted to a RuntimeError, so this trick only works on older versions.
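On Python 3.7 and later, the same idea can be written so that the generator ends by simply running off the end of its loop rather than by letting StopIteration escape. A rough sketch, assuming each block is small enough to collect into a list (the version above yields lazy iterators instead):

```python
def blocks(lines):
    """Yield the stripped lines found between each 'subckt' and 'ends' line."""
    lines = iter(lines)  # one shared iterator for both loops
    for line in lines:
        if 'subckt' not in line:
            continue
        block = []
        # The inner loop resumes where the outer loop stopped.
        for inner in lines:
            if 'ends' in inner:
                break
            block.append(inner.strip())
        yield block


sample = [
    "* comment\n",
    "subckt a 1 2\n",
    "r1 1 2 10\n",
    "ends a\n",
    "junk\n",
    "subckt b 1\n",
    "ends b\n",
]
for block in blocks(sample):
    print(block)
```

With the sample lines this prints ['r1 1 2 10'] and then an empty list for the second, empty block.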

This answer also works; I'm still very keen on hearing comments:
from itertools import takewhile, dropwhile
def start(l): return 'subckt' not in l
def stop(l): return 'ends' not in l
def sub(iter):
    while True:
        a = list(dropwhile(start,takewhile(stop,iter)))
        if len(a):
            yield a
        else:
            return
f = open('file.txt')
for b in sub(f):
    pass #process b
f.close()
Something I couldn't work out yet: how to include the last line (the one containing the ends keyword) in the output.
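One possible way to keep the closing line is to replace takewhile with a small helper that also yields the item that ends the run. The sketch below is an assumption-laden illustration: take_through is a hypothetical helper (it is not part of itertools), and sub is rewritten around it.

```python
from itertools import dropwhile


def take_through(predicate, iterable):
    """Yield items up to and including the first one satisfying predicate."""
    for item in iterable:
        yield item
        if predicate(item):
            return


def sub(lines):
    """Yield each block as a list, from its 'subckt' line through its 'ends' line."""
    lines = iter(lines)  # shared, so each pass continues after the previous block
    while True:
        from_start = dropwhile(lambda l: 'subckt' not in l, lines)
        block = list(take_through(lambda l: 'ends' in l, from_start))
        if not block:
            return
        yield block


sample = [
    "* comment\n",
    "subckt a 1 2\n",
    "r1 1 2 10\n",
    "ends a\n",
    "junk\n",
    "subckt b 1\n",
    "ends b\n",
]
for b in sub(sample):
    print(b)
```

Because dropwhile is applied first and take_through second, nothing beyond the ends line is consumed, so the next pass starts cleanly on the following line.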

Transposition Cipher in Python

I'm currently trying to code a transposition cipher in Python, but I have reached a point where I'm stuck.
My code:
key = "german"
length = len(key)
plaintext = "if your happy and you know it clap your hands, clap your hands"
def split_text(formatted,length):
    return [formatted[i:i + length] for i in range(0, len(formatted), length)]
formatted = "".join(plaintext.split()).replace(",","")
split = split_text(formatted,length)
def encrypt():
    pass # this is the part I'm stuck on
I use that to get the length of the key, and then use the length to determine how many columns to create within the program. So it would create this:
GERMAN
IFYOUR
HAPPYA
NDYOUK
NOWITC
LAPYOU
RHANDS
CLAPYO
URHAND
S
This is now where I'm stuck, as I want the program to create a string by combining the columns together. It would read down each column in turn to create:
IHNNLRCUSFADOAHLRYPYWPAAH .....
I know I would need a loop of some sort, but I'm unsure how to tell the program to create such a string.
Thanks

You can use slices of the string to get every sixth letter (in steps of length):
print(formatted[0::length])
#output:
ihnnlrcus
Then just loop through all the possible start indices in range(length) and join the slices together:
def encrypt(formatted,length):
    return "".join([formatted[i::length] for i in range(length)])
Note that this doesn't actually use split_text; it takes formatted directly:
print(encrypt(formatted,length))
The problem with using split_text is that you then cannot make use of tools like zip, since they stop when the shortest iterator is exhausted (because the last group has only one character in it, you get only one group from zip(*split)):
for i in zip("stuff that is important","a"):
    print(i)
#output:
('s', 'a')
#nothing else, since one of the iterators finished.
In order to use something like that, you would have to redefine the way zip works so that it lets some of the iterators finish and continues until all of them are done:
def myzip(*iterators):
    iterators = tuple(iter(it) for it in iterators)
    while True: #broken when none of iterators still have items in them
        group = []
        for it in iterators:
            try:
                group.append(next(it))
            except StopIteration:
                pass
        if group:
            yield group
        else:
            return #none of the iterators still had items in them
Then you can use this to process the split-up data like this:
encrypted_data = ''.join(''.join(x) for x in myzip(*split))
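The standard library already offers a zip variant that keeps going until the longest input is exhausted: itertools.zip_longest. A short sketch of the column read-out built on it, assuming Python 3; encrypt_columns is a made-up name, and like the code above it reads the columns left to right without reordering them by the key.

```python
from itertools import zip_longest


def split_text(text, length):
    """Cut text into rows of at most length characters."""
    return [text[i:i + length] for i in range(0, len(text), length)]


def encrypt_columns(text, length):
    """Read the rows column by column and join the result into one string."""
    rows = split_text(text, length)
    # fillvalue="" pads the short last row, so no column is cut off
    # and no extra characters are introduced.
    columns = zip_longest(*rows, fillvalue="")
    return "".join("".join(column) for column in columns)


formatted = "ifyourhappyandyouknowitclapyourhandsclapyourhands"
print(encrypt_columns(formatted, 6))
```

The output begins with ihnnlrcus, the first column, and matches what the slicing version produces.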

Really weird python error

...
def splitMunipulation(p,threshold=5000):
    runs=[];i=0
    while i<len(p):
        l=[];i+=1
        print i,p[i]
        while p[i]!=press(0,1,0):
            l.append(p[i]);i+=1
        else:
            runs.append(l)#here i points to another (0,1,0)
    return runs
...
record=splitMunipulation(record)
'''
Output:
1 <__main__.press instance at 0x046690A8>
File "H:\mutate.py", line 28, in splitMunipulation
while p[i]!=press(0,1,0):
IndexError: list index out of range
press is a class,
and since print p[i] works fine, why is p[i] considered out of range?
I really don't get what's going on.
'''

So, a few things.
Firstly, your code is very... unpythonic. This isn't C, so you don't need to use while loops for iteration, and don't use semicolons to separate multiple commands on one line in Python. Ever. Also, the while...else format is confusing and should be avoided.
If you look at the first few 'lines' of your while loop,
while i<len(p):
    l=[];i+=1
you keep i below the length of p, but you immediately increase i's value by one. As such, when i=len(p) - 1, you will make i one larger, i.e. len(p). So when you try to access p[i], you are trying to access a value that doesn't exist.
Fixing those issues, you would get:
...
def splitMunipulation(p,threshold=5000):
    runs=[]
    for i in p:
        l=[]
        print i
        if i != press(0,1,0):
            runs.append(i)
    return runs
...
record=splitMunipulation(record)

while p[i]!=press(0,1,0):
    l.append(p[i]);i+=1
The variable i gets incremented in this loop until p[i]==press(0,1,0). Since nothing makes p longer, and nothing checks that i stays below the length of p, it is easy to see how the index could get out of range.

len returns the length, not the last index. If l=[1,2,3], then len(l) returns 3, but l[3] is out of range.
So you should use
while i<len(p)-1:
or better yet:
for i in range(len(p)):
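If the goal is to collect the runs of items that sit between delimiter entries, index arithmetic can be avoided altogether. The following is a sketch under assumptions, written for Python 3: split_runs is a hypothetical name, plain integers stand in for the press objects, and the delimiter is assumed to compare equal with == (a class like press would need to define __eq__ for that). Unlike the original loop, it skips empty runs between adjacent delimiters.

```python
from itertools import groupby


def split_runs(items, delimiter):
    """Return the runs of consecutive items that are not equal to delimiter."""
    runs = []
    # groupby clusters neighbouring items by whether they are the delimiter.
    for is_delimiter, group in groupby(items, key=lambda item: item == delimiter):
        if not is_delimiter:
            runs.append(list(group))
    return runs


print(split_runs([0, 1, 2, 0, 3, 0, 0, 4], 0))  # [[1, 2], [3], [4]]
```

Since no index is ever computed, there is no way to step past the end of the list.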
