Write a function file_in_english(filename, character_limit) that takes a filename (as a str) and a character_limit (as an int). The filename is the name of the file to convert from Cat Latin to English and the character limit is the maximum number of characters that can be converted. The limit is on the total number of characters in the output (including newline characters).
The function should return a string that contains all the converted lines in the same order as the file - remember the newline character at the end of each line (that is make sure you include a newline character at the end of each converted line so that it is included in the line's length).
If the limit is exceeded (ie, a converted sentence would take the output over the limit) then the sentence that takes the character count over the limit shouldn't be added to the output. A line with "<>" should be added at the end of the output instead. The processing of lines should then stop.
The lines in the file will each be a sentence in Weird Latin and your program should print out the English version of each sentence
The function should keep adding sentences until it runs out of input from the file or the total number of characters printed (including spaces) exceeds the limit.
The answer must include your definition of english_sentence and its helper(s) functions - that I should have called english_word or similar.
You MUST use while in your file_in_english function.
You can only use one return statement per function.
The test file used in the examples (test1.txt) has the following data:
impleseeoow estteeoow aseceeoow
impleseeoow estteeoow aseceeoow ineleeoow 2meeoow
impleseeoow estteeoow aseceeoow ineleeoow 3meeoow
impleseeoow estteeoow aseceeoow ineleeoow 4meeoow
I program works fine except that sometimes it returns None.
def english_sentence(sentence):
"""Reverse Translation"""
consonants = 'bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ'
eng_sentence = []
for coded_word in sentence.split():
if coded_word.endswith("eeoow") and (coded_word[-6] in consonants):
english_word = coded_word[-6] + coded_word[:-6]
if (coded_word[-6] == 'm') and (coded_word[0] not in consonants):
english_word = '(' + english_word + ' or ' + coded_word[:-6] + ')'
eng_sentence.append(english_word)
return " ".join(eng_sentence)
def file_in_english(filename, character_limit):
"""English File"""
newone = open(filename)
nowline = newone.readline()
characters = 0
while characters < character_limit and nowline != "":
process = nowline[0:-1]
print(english_sentence(process))
characters += len(nowline)
nowline = newone.readline()
if characters > character_limit:
return("<<Output limit exceeded>>")
ans = file_in_english('test1.txt', 20)
print(ans)
Output is:
simple test case
simple test case line (m2 or 2)
simple test case line (m3 or 3)
simple test case line (m4 or 4)
None
But I must use only one return statement in each function. How can I do that for the second function and avoid the "None" in output?
You're doing the same thing as:
def f():
print('hello')
print(f())
So basically narrows down to:
print(print('hello world'))
Also btw:
>>> type(print('hello'))
hello
<class 'NoneType'>
>>>
To solve your code do:
def file_in_english(filename, character_limit):
s=""
"""English File"""
newone = open(filename)
nowline = newone.readline()
characters = 0
while characters < character_limit and nowline != "":
process = nowline[0:-1]
s+=english_sentence(process)+'\n'
characters += len(nowline)
nowline = newone.readline()
if characters > character_limit:
s+="<<Output limit exceeded>>"
return s
ans = file_in_english('test1.txt', 20)
print(ans)
You have to make sure, that any function that should return something, does this for ALL ways that your function can end.
Your function file_in_english only returns something for the case if characters > character_limit:
If charachter == or charachter < character_limit this is not the case, the function returns nothing explicitly.
Any function that does not return something from it on end, implicitly returns None when it returns to its caller.
def something(boolean):
"""Function that only returns something meaninfull if boolean is True."""
if boolean:
return "Wow"
print(something(True)) # returns Wow
print(something(False)) # implicitly returns/prints None
You can find this fact f.e. in the python tutorial:
Coming from other languages, you might object that fib is not a
function but a procedure since it doesn’t return a value. In fact,
even functions without a return statement do return a value, albeit a
rather boring one. This value is called None (it’s a built-in name).
Writing the value None is normally suppressed by the interpreter if it
would be the only value written. You can see it if you really want to
using print():
Source:https://docs.python.org/3.7/tutorial/controlflow.html#defining-functions - just short after the 2nd green example box
Related
I need this program to create a sheet as a list of strings of ' ' chars and distribute text strings (from a list) into it. I have already coded return statements in python 3 but this one keeps giving
return(riplns)
^
SyntaxError: invalid syntax
It's the return(riplns) on line 39. I want the function to create a number of random numbers (randint) inside a range built around another randint, coming from the function ripimg() that calls this one.
I see clearly where the program declares the list I want this return() to give me. I know its type. I see where I feed variables (of the int type) to it, through .append(). I know from internet research that SyntaxErrors on python's return() functions usually come from mistype but it doesn't seem the case.
#loads the asciified image ("/home/userX/Documents/Programmazione/Python projects/imgascii/myascify/ascimg4")
#creates a sheet "foglio1", same number of lines as the asciified image, and distributes text on it on a randomised line
#create the sheet foglio1
def create():
ref = open("/home/userX/Documents/Programmazione/Python projects/imgascii/myascify/ascimg4")
charcount = ""
field = []
for line in ref:
for c in line:
if c != '\n':
charcount += ' '
if c == '\n':
charcount += '*' #<--- YOU GONNA NEED TO MAKE THIS A SPACE IN A FOLLOWING FUNCTION IN THE WRITER.PY PROGRAM
for i in range(50):#<------- VALUE ADJUSTMENT FROM WRITER.PY GOES HERE(default : 50):
charcount += ' '
charcount += '\n'
break
for line in ref:
field.append(charcount)
return(field)
#turn text in a list of lines and trasforms the lines in a list of strings
def poemln():
txt = open("/home/gcg/Documents/Programmazione/Python projects/imgascii/writer/poem")
arrays = []
for line in txt:
arrays.append(line)
txt.close()
return(arrays)
#rander is to be called in ripimg()
def rander(rando, fldepth):
riplns = []
for i in range(fldepth):
riplns.append(randint((rando)-1,(rando)+1)
return(riplns) #<---- THIS RETURN GIVES SyntaxError upon execution
#opens a rip on the side of the image.
def ripimg():
upmost = randint(160, 168)
positions = []
fldepth = 52 #<-----value is manually input as in DISTRIB function.
positions = rander(upmost,fldepth)
return(positions)
I omitted the rest of the program, I believe these functions are enough to get the idea, please tell me if I need to add more.
You have incomplete set of previous line's parenthesis .
In this line:-
riplns.append(randint((rando)-1,(rando)+1)
You have to add one more brace at the end. This was causing error because python was reading things continuously and thought return statement to be a part of previous uncompleted line.
Start by setting t to be the local time 1, 500, 000, 000 seconds from the start of January 1, 1970 UTC:
import time
t = time.localtime(1500000000)
Construct the next strings by using the string time format function strftime(): (a) 'Thursday, July 13 2017'
(b) '09:40 PM Central Daylight Time on 07/13/2017'
(c) 'I will meet you on Thu July 13 at 09:40 PM.'
A couple things, Stack Overflow is not the place for code reviews, for that: try this.
Regardless, you have indentation problems, Python is based off of indents, you need to have the code in your function indented one ahead of your def, like so:
def filesStringSearch():
infile = open('example.txt')
a = input('Search for a word: ')
result = infile.read().find(a)
#result = a.find:
#for a in infile:
if a.find:
print("True")
elif a < 3:
print("-1")
else:
print("False")
return
Second, you're not taking an input with the function, and hard-coding the file to open; this is a simple fix however,
def filesStringSearch(filename):
infile = open(filename)
Third, you're not going to accomplish your goal with your if statements, if the length of the input is less than 3, you shouldn't even try to search for anything, so you need to reorder and change your boolean expressions a bit; to this:
if len(a) < 3:
print("-1")
elif a.find:
print("True")
else:
print("False")
Finally, a.find will not work, rather you can check to see the value of result, so you can replace elif: a.find with:
elif result != -1:
print("True")
Since result will be -1 if it cannot find anything.
Also, the return is useless at the end.
According to your questions the right implementation is:
def filesStringSearch(filename, pattern):
with open(filename, 'r') as f:
text = f.read()
if len(pattern) >= 3:
return text.find(pattern) > -1 or False
else:
return -1
filename = 'example.txt'
pattern_to_find = input('Search for a word: ')
out = filesStringSearch(filename, pattern_to_find)
print(out)
If you are asked to write a function that accepts two arguments, then your function must accept two arguments as here:
def filesStringSearch(filename, pattern):
Then you must read the file, I did it using with statement. with statement will close our file for us, so you don't have to do it manually (and yes, you forgot to close an opened file, it is not a big problem for now, but avoid such things in big projects). You can read more about with statement there: Reading and writing files
What about find method. It is a string method, that will return index of found substring in your string, for instance my_string.find('h') is going to return the index of first substring (which is 'h') in my_string string. If find method can't find your substring it will return -1, that's why we do this:
return text.find(pattern) > -1 or False
As if we will find our pattern in text, then the index certainly is going to be greater that -1. Otherwise we return False or -1 if pattern string's length is less than 3, according to your question
And at the end we take input from user and pass that input to our function with the name of file example.txt. We store the return value of our function in out variable and then print it
I'm trying to read a null terminated string but i'm having issues when unpacking a char and putting it together with a string.
This is the code:
def readString(f):
str = ''
while True:
char = readChar(f)
str = str.join(char)
if (hex(ord(char))) == '0x0':
break
return str
def readChar(f):
char = unpack('c',f.read(1))[0]
return char
Now this is giving me this error:
TypeError: sequence item 0: expected str instance, int found
I'm also trying the following:
char = unpack('c',f.read(1)).decode("ascii")
But it throws me:
AttributeError: 'tuple' object has no attribute 'decode'
I don't even know how to read the chars and add it to the string, Is there any proper way to do this?
Here's a version that (ab)uses __iter__'s lesser-known "sentinel" argument:
with open('file.txt', 'rb') as f:
val = ''.join(iter(lambda: f.read(1).decode('ascii'), '\x00'))
How about:
myString = myNullTerminatedString.split("\x00")[0]
For example:
myNullTerminatedString = "hello world\x00\x00\x00\x00\x00\x00"
myString = myNullTerminatedString.split("\x00")[0]
print(myString) # "hello world"
This works by splitting the string on the null character. Since the string should terminate at the first null character, we simply grab the first item in the list after splitting. split will return a list of one item if the delimiter doesn't exist, so it still works even if there's no null terminator at all.
It also will work with byte strings:
myByteString = b'hello world\x00'
myStr = myByteString.split(b'\x00')[0].decode('ascii') # "hello world" as normal string
If you're reading from a file, you can do a relatively larger read - estimate how much you'll need to read to find your null string. This is a lot faster than reading byte-by-byte. For example:
resultingStr = ''
while True:
buf = f.read(512)
resultingStr += buf
if len(buf)==0: break
if (b"\x00" in resultingStr):
extraBytes = resultingStr.index(b"\x00")
resultingStr = resultingStr.split(b"\x00")[0]
break
# now "resultingStr" contains the string
f.seek(0 - extraBytes,1) # seek backwards by the number of bytes, now the pointer will be on the null byte in the file
# or f.seek(1 - extraBytes,1) to skip the null byte in the file
(edit version 2, added extra way at the end)
Maybe there are some libraries out there that can help you with this, but as I don't know about them lets attack the problem at hand with what we know.
In python 2 bytes and string are basically the same thing, that change in python 3 where string is what in py2 is unicode and bytes is its own separate type, which mean that you don't need to define a read char if you are in py2 as no extra work is required, so I don't think you need that unpack function for this particular case, with that in mind lets define the new readString
def readString(myfile):
chars = []
while True:
c = myfile.read(1)
if c == chr(0):
return "".join(chars)
chars.append(c)
just like with your code I read a character one at the time but I instead save them in a list, the reason is that string are immutable so doing str+=char result in unnecessary copies; and when I find the null character return the join string. And chr is the inverse of ord, it will give you the character given its ascii value. This will exclude the null character, if its needed just move the appending...
Now lets test it with your sample file
for instance lets try to read "Sword_Wea_Dummy" from it
with open("sword.blendscn","rb") as archi:
#lets simulate that some prior processing was made by
#moving the pointer of the file
archi.seek(6)
string=readString(archi)
print "string repr:", repr(string)
print "string:", string
print ""
#and the rest of the file is there waiting to be processed
print "rest of the file: ", repr(archi.read())
and this is the output
string repr: 'Sword_Wea_Dummy'
string: Sword_Wea_Dummy
rest of the file: '\xcd\xcc\xcc=p=\x8a4:\xa66\xbfJ\x15\xc6=\x00\x00\x00\x00\xeaQ8?\x9e\x8d\x874$-i\xb3\x00\x00\x00\x00\x9b\xc6\xaa2K\x15\xc6=;\xa66?\x00\x00\x00\x00\xb8\x88\xbf#\x0e\xf3\xb1#ITuB\x00\x00\x80?\xcd\xcc\xcc=\x00\x00\x00\x00\xcd\xccL>'
other tests
>>> with open("sword.blendscn","rb") as archi:
print readString(archi)
print readString(archi)
print readString(archi)
sword
Sword_Wea_Dummy
ÍÌÌ=p=Š4:¦6¿JÆ=
>>> with open("sword.blendscn","rb") as archi:
print repr(readString(archi))
print repr(readString(archi))
print repr(readString(archi))
'sword'
'Sword_Wea_Dummy'
'\xcd\xcc\xcc=p=\x8a4:\xa66\xbfJ\x15\xc6='
>>>
Now that I think about it, you mention that the data portion is of fixed size, if that is true for all files and the structure on all of them is as follow
[unknow size data][know size data]
then that is a pattern we can exploit, we only need to know the size of the file and we can get both part smoothly as follow
import os
def getDataPair(filename,knowSize):
size = os.path.getsize(filename)
with open(filename, "rb") as archi:
unknown = archi.read(size-knowSize)
know = archi.read()
return unknown, know
and by knowing the size of the data portion, its use is simple (which I get by playing with the prior example)
>>> strins_data, data = getDataPair("sword.blendscn", 80)
>>> string_data, data = getDataPair("sword.blendscn", 80)
>>> string_data
'sword\x00Sword_Wea_Dummy\x00'
>>> data
'\xcd\xcc\xcc=p=\x8a4:\xa66\xbfJ\x15\xc6=\x00\x00\x00\x00\xeaQ8?\x9e\x8d\x874$-i\xb3\x00\x00\x00\x00\x9b\xc6\xaa2K\x15\xc6=;\xa66?\x00\x00\x00\x00\xb8\x88\xbf#\x0e\xf3\xb1#ITuB\x00\x00\x80?\xcd\xcc\xcc=\x00\x00\x00\x00\xcd\xccL>'
>>> string_data.split(chr(0))
['sword', 'Sword_Wea_Dummy', '']
>>>
Now to get each string a simple split will suffice and you can pass the rest of the file contained in data to the appropriated function to be processed
Doing file I/O one character at a time is horribly slow.
Instead use readline0, now on pypi: https://pypi.org/project/readline0/ . Or something like it.
In 3.x, there's a "newline" argument to open, but it doesn't appear to be as flexible as readline0.
Here is my implementation:
import struct
def read_null_str(f):
r_str = ""
while 1:
back_offset = f.tell()
try:
r_char = struct.unpack("c", f.read(1))[0].decode("utf8")
except:
f.seek(back_offset)
temp_char = struct.unpack("<H", f.read(2))[0]
r_char = chr(temp_char)
if ord(r_char) == 0:
return r_str
else:
r_str += r_char
What I'm trying to do is open a file, then find every instance of '[\x06I"' and '\x06;', then return whatever is between the two.
Since this is not a standard text file (it's map data from RPG maker) readline() will not work for my purposes, as the file is not at all formatted in such a way that the data I want is always neatly within one line by itself.
What I'm doing right now is loading the file into a list with read(), then simply deleting characters from the very beginning until I hit the string '[\x06I'. Then I scan ahead to find '\x06;', store what's between them as a string, append said string to a list, then resume at the character after the semicolon I found.
It works, and I ended up with pretty much exactly what I wanted, but I feel like that's the worst possible way to go about it. Is there a more efficient way?
My relevant code:
while eofget == 0:
savor = 0
while savor == 0 or eofget == 0:
if line[0:4] == '[\x06I"':
x = 4
spork = 0
while spork == 0:
x += 1
if line[x] == '\x06':
if line[x+1] == ';':
spork = x
savor = line[5:spork] + "\n"
line = line[x+1:]
linefinal[lineinc] = savor
lineinc += 1
elif line[x:x+7] == '#widthi':
print("eof reached")
spork = 1
eofget = 1
savor = 0
elif line[x:x+7] == '#widthi':
print("finished map " + mapname)
eofget = 1
savor = 0
break
else:
line = line[1:]
You can just ignore the variable names. I just name things the first thing that comes to mind when I'm doing one-offs like this. And yes, I am aware a few things in there don't make any sense, but I'm saving cleanup for when I finalize the code.
When eofget gets flipped on this subroutine terminates and the next map is loaded. Then it repeats. The '#widthi' check is basically there to save time, since it's present in every map and indicates the beginning of the map data, AKA data I don't care about.
I feel this is a natural case to use regular expressions. Using the findall method:
>>> s = 'testing[\x06I"text in between 1\x06;filler text[\x06I"text in between 2\x06;more filler[\x06I"text in between \n with some line breaks \n included in the text\x06;ending'
>>> import re
>>> p = re.compile('\[\x06I"(.+?)\x06;', re.DOTALL)
>>> print(p.findall(s))
['text in between 1', 'text in between 2', 'text in between \n with some line breaks \n included in the text']
The regex string '\[\x06I"(.+?)\x06;'can be interpreted as follows:
Match as little as possible (denoted by ?) of an undetermined number of unspecified characters (denoted by .+) surrounded by '[\x06I"' and '\x06;', and only return the enclosed text (denoted by the parentheses around .+?)
Adding re.DOTALL in the compile makes the .? match line breaks as well, allowing multi-line text to be captured.
I would use split():
fulltext = 'adsfasgaseg[\x06I"thisiswhatyouneed\x06;sdfaesgaegegaadsf[\x06I"this is the second what you need \x06;asdfeagaeef'
parts = fulltext.split('[\x06I"') # split by first label
results = []
for part in parts:
if '\x06;' in part: # if second label exists in part
results.append(part.split('\x06;')[0]) # get the part until the second label
print results
This seems fairly trivial but I can't seem to work it out
I have a text file with the contents:
B>F
I am reading this with the code below, stripping the '>' and trying to convert the strings into their corresponding ASCII value, minus 65 to give me a value that will correspond to another list index
def readRoute():
routeFile = open('route.txt', 'r')
for line in routeFile.readlines():
route = line.strip('\n' '\r')
route = line.split('>')
#startNode, endNode = route
startNode = ord(route[0])-65
endNode = ord(route[1])-65
# Debug (this comment was for my use to explain below the print values)
print 'Route Entered:'
print line
print startNode, ',', endNode, '\n'
return[startNode, endNode]
However I am having slight trouble doing the conversion nicely, because the text file only contains one line at the moment but ideally I need it to be able to support more than one line and run an amount of code for each line.
For example it could contain:
B>F
A>D
C>F
E>D
So I would want to run the same code outside this function 4 times with the different inputs
Anyone able to give me a hand
Edit:
Not sure I made my issue that clear, sorry
What I need it do it parse the text file (possibly containing one line or multiple lines like above. I am able to do it for one line with the lines
startNode = ord(route[0])-65
endNode = ord(route[1])-65
But I get errors when trying to do more than one line because the ord() is expecting different inputs
If I have (below) in the route.txt
B>F
A>D
This is the error it gives me:
line 43, in readRoute endNode = ord(route[1])-65
TypeError: ord() expected a character, but string of length 2 found
My code above should read the route.txt file and see that B>F is the first route, strip the '>' - convert the B & F to ASCII, so 66 & 70 respectively then minus 65 from both to give 1 & 5 (in this example)
The 1 & 5 are corresponding indexes for another "array" (list of lists) to do computations and other things on
Once the other code has completed it can then go to the next line in route.txt which could be A>D and perform the above again
Perhaps this will work for you. I turned the fileread into a generator so you can do as you please with the parsed results in the for-i loop.
def readRoute(file_name):
with open(file_name, 'r') as r:
for line in r:
yield (ord(line[0])-65, ord(line[2])-65)
filename = 'route.txt'
for startnode, endnode in readRoute(filename):
print startnode, endnode
If you can't change readRoute, change the contents of the file before each call. Better yet, make readRoute take the filename as a parameter (default it to 'route.txt' to preserve the current behavior) so you can have it process other files.
What about something like this? It takes the routes defined in your file and turns them into path objects with start and end member variables. As an added bonus PathManager.readFile() allows you to load multiple route files without overwriting the existing paths.
import re
class Path:
def __init__(self, start, end):
self.start = ord(start) - 65 # Scale the values as desired
self.end = ord(end) - 65 # Scale the values as desired
class PathManager:
def __init__(self):
self.expr = re.compile("^([A-Za-z])[>]([A-Za-z])$") # looks for string "C>C"
# where C is a char
self.paths = []
def do_logic_routine(self, start, end):
# Do custom logic here that will execute before the next line is read
# Return True for 'continue reading' or False to stop parsing file
return True
def readFile(self, path):
file = open(path,"r")
for line in file:
item = self.expr.match(line.strip()) # strip whitespaces before parsing
if item:
'''
item.group(0) is *not* used here; it matches the whole expression
item.group(1) matches the first parenthesis in the regular expression
item.group(2) matches the second
'''
self.paths.append(Path(item.group(1), item.group(2)))
if not do_logic_routine(self.paths[-1].start, self.paths[-1].end):
break
# Running the example
MyManager = PathManager()
MyManager.readFile('route.txt')
for path in MyManager.paths:
print "Start: %s End: %s" % (path.start, path.end)
Output is:
Start: 1 End: 5
Start: 0 End: 3
Start: 2 End: 5
Start: 4 End: 3