Finding word context with regular expressions - python
I have created a function to search for the contexts of a given word (w) in a text, with left and right as parameters that set how many words to record on each side.
import re
def get_context (text, w, left, right):
    text.insert (0, "*START*")
    text.append ("*END*")
    all_contexts = []
    for i in range(len(text)):
        if re.match(w,text[i], 0):
            if i < left:
                context_left = text[:i]
            else:
                context_left = text[i-left:i]
            if len(text) < (i+right):
                context_right = text[i:]
            else:
                context_right = text[i:(i+right+1)]
            context = context_left + context_right
            all_contexts.append(context)
    return all_contexts
So for example, if I have a text in the form of a list like this:
text = ['Python', 'is', 'dynamically', 'typed', 'language', 'Python',
'functions', 'really', 'care', 'about', 'what', 'you', 'pass', 'to',
'them', 'but', 'you', 'got', 'it', 'the', 'wrong', 'way', 'if', 'you',
'want', 'to', 'pass', 'one', 'thousand', 'arguments', 'to', 'your',
'function', 'then', 'you', 'can', 'explicitly', 'define', 'every',
'parameter', 'in', 'your', 'function', 'definition', 'and', 'your',
'function', 'will', 'be', 'automagically', 'able', 'to', 'handle',
'all', 'the', 'arguments', 'you', 'pass', 'to', 'them', 'for', 'you']
The function works fine, for example:
get_context(text, "function",2,2)
[['language', 'python', 'functions', 'really', 'care'], ['to', 'your', 'function', 'then', 'you'], ['in', 'your', 'function', 'definition', 'and'], ['and', 'your', 'function', 'will', 'be']]
Now I am trying to build a dictionary with the contexts of every word in the text doing the following:
d = {}
for w in set(text):
    d[w] = get_context(text,w,2,2)
But I am getting this error.
Traceback (most recent call last):
File "<pyshell#32>", line 2, in <module>
d[w] = get_context(text,w,2,2)
File "<pyshell#20>", line 9, in get_context
if re.match(w,text[i], 0):
File "/usr/lib/python3.4/re.py", line 160, in match
return _compile(pattern, flags).match(string)
File "/usr/lib/python3.4/re.py", line 294, in _compile
p = sre_compile.compile(pattern, flags)
File "/usr/lib/python3.4/sre_compile.py", line 568, in compile
p = sre_parse.parse(p, flags)
File "/usr/lib/python3.4/sre_parse.py", line 760, in parse
p = _parse_sub(source, pattern, 0)
File "/usr/lib/python3.4/sre_parse.py", line 370, in _parse_sub
itemsappend(_parse(source, state))
File "/usr/lib/python3.4/sre_parse.py", line 579, in _parse
raise error("nothing to repeat")
sre_constants.error: nothing to repeat
I don't understand this error. Can anyone help me with this?
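The problem is that "*START*" and "*END*" are being interpreted as regular expressions: * is a quantifier, and at the very start of a pattern it has nothing to repeat, hence the error. Also note that inserting "*START*" and "*END*" into text at the beginning of the function causes a second problem: the markers are added again on every call. You should insert them just once, outside the function.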
Here is a complete version of the working code:
import re
def get_context(text, w, left, right):
    all_contexts = []
    for i in range(len(text)):
        if re.match(w,text[i], 0):
            if i < left:
                context_left = text[:i]
            else:
                context_left = text[i-left:i]
            if len(text) < (i+right):
                context_right = text[i:]
            else:
                context_right = text[i:(i+right+1)]
            context = context_left + context_right
            all_contexts.append(context)
    return all_contexts
text = ['Python', 'is', 'dynamically', 'typed', 'language',
'Python', 'functions', 'really', 'care', 'about', 'what',
'you', 'pass', 'to', 'them', 'but', 'you', 'got', 'it', 'the',
'wrong', 'way', 'if', 'you', 'want', 'to', 'pass', 'one',
'thousand', 'arguments', 'to', 'your', 'function', 'then',
'you', 'can', 'explicitly', 'define', 'every', 'parameter',
'in', 'your', 'function', 'definition', 'and', 'your',
'function', 'will', 'be', 'automagically', 'able', 'to', 'handle',
'all', 'the', 'arguments', 'you', 'pass', 'to', 'them', 'for', 'you']
text.insert(0, "START")
text.append("END")
d = {}
for w in set(text):
    d[w] = get_context(text,w,2,2)
Maybe you can replace re.match(w,text[i], 0) with w == text[i].
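A minimal sketch of that suggestion, assuming exact whole-word matches are what is wanted. The name get_context_exact is made up for illustration. Unlike the original, it leaves the input list untouched and lets slicing handle the edges; that is a choice of this sketch, not something the answers above prescribe.

```python
def get_context_exact(text, w, left, right):
    """Return a window of words around every element of text equal to w."""
    all_contexts = []
    for i, token in enumerate(text):
        if token == w:  # plain equality: '*' and other regex characters are harmless
            start = max(0, i - left)  # clamp so a negative index never wraps around
            all_contexts.append(text[start:i + right + 1])
    return all_contexts


words = ['*START*', 'your', 'function', 'will', 'be', 'able', '*END*']
contexts = {w: get_context_exact(words, w, 2, 2) for w in set(words)}
print(contexts['function'])  # [['*START*', 'your', 'function', 'will', 'be']]
print(contexts['*START*'])   # [['*START*', 'your', 'function']]
```

Note that with equality, "function" no longer matches "functions", which the re.match version did.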
The whole thing can be rewritten very succinctly as follows:
text = 'Python is dynamically typed language Python functions really care about what you pass to them but you got it the wrong way if you want to pass one thousand arguments to your function then you can explicitly define every parameter in your function definition and your function will be automagically able to handle all the arguments you pass to them for you'
Keeping it a str, and assuming the word of interest is 'function' (import re first):
pat = re.compile(r'(\w+\s\w+\s)functions?(?=(\s\w+\s\w+))')
pat.findall(text)
[('language Python ', ' really care'),
('to your ', ' then you'),
('in your ', ' definition and'),
('and your ', ' will be')]
Minor customization of the regex will be needed to also allow words such as functional or functioning, not only function or functions. But the important idea is to do away with indexing and take a more functional approach.
Please comment if this doesn't work out for you, when you apply it in bulk.
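One way the customization mentioned above might look, as a rough sketch: the pattern is built from the word and the window sizes instead of being hard-coded. The function name regex_contexts and its parameters are hypothetical; re.escape guards against words that contain regex metacharacters.

```python
import re


def regex_contexts(text, word, left=2, right=2):
    """Return (before, match, after) for every word in text that starts with `word`."""
    pattern = re.compile(
        r'\b((?:\w+\s+){%d})(%s\w*)(?=((?:\s+\w+){%d}))'
        % (left, re.escape(word), right)
    )
    return [m.groups() for m in pattern.finditer(text)]


sample = 'in your function definition and your functional style will be'
print(regex_contexts(sample, 'function'))
# [('in your ', 'function', ' definition and'), ('and your ', 'functional', ' style will')]
```

As with the hard-coded pattern, occurrences with fewer than `left` words before them or fewer than `right` words after them are skipped, and because the left context is consumed, two matches whose windows overlap on the left may not both be reported.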
At least one of the elements in text contains characters that are special in a regular expression. If you're just trying to check whether an element starts with the word, use str.startswith, i.e.
if text[i].startswith(w): # instead of re.match(w,text[i], 0):
But I don't understand why you are checking for that anyway, rather than for equality.
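To make the failure concrete, here is a small sketch that reproduces the error with one of the marker strings and shows two ways around it. The variable names are illustrative only.

```python
import re

marker = "*START*"

# Used as a pattern, the leading '*' is a quantifier with nothing before it.
rejected = False
try:
    re.match(marker, marker)
except re.error as exc:
    rejected = True
    print("pattern rejected:", exc)

# re.escape turns the string into a pattern that matches it literally.
print(bool(re.match(re.escape(marker), marker)))  # True

# str.startswith never treats its argument as a pattern.
print(marker.startswith(marker))  # True
```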
Related
Output will not become a string from a list
import os
import random

file = open('getty.txt')
filetext = file.read()

def getline(words,length):
    ans=[]
    total=0
    while (length>total) and 0 != len(words):
        word=words.pop(0)
        total += len(word)+1 #add 1 for the space
        ans.append(word)
    #now we are one word too long
    if total > length:
        words.insert(0,ans.pop())
    return ans

def printPara(words,length):
    line = []
    spaces = []
    while len(words) != 0:
        line.append(getline(words, length))
    for z in range(0,len(line)):
        for i in range(0,len(line[z])):
            spaces = [[1] * len(line[i]) for i in range (len(line))]
    for p in range (0,len(spaces)):
        spaces[p][len(spaces[p])-1] = 0
    if len(words) + len(spaces) != 0:
        addSpace(line,spaces,length)
        printLine(line,spaces)
    else:
        printLine(line,spaces)

def addSpace(line,spaces,length):
    totalInt = 0
    for i in range (0, len(line)):
        totalInt = (len(spaces[i])-2) + len(line[i])
        while length < totalInt:
            num = random.randint(0, len(spaces) - 2)
            spaces[num] += 1
    return spaces

def printLine(line, spaces):
    for i in range (len(line)):
        print(str(line[i]) + (' ' * len(spaces[i])))

def main():
    length = 75
    textparagraph = filetext.split("\n\n")
    para = [0] * len(textparagraph)
    for i in range (0, len(textparagraph)):
        para[i] = textparagraph[i]
    words = [[0] * len(textparagraph) for i in range(len(para))]
    for b in range (0,len(para)):
        words[b] = para[b].split()
    for z in range (0, len(para)):
        printPara(words[z],length)

main()

My code outputs only lists of the separate lines and will not concatenate the two lists of words and spaces. How would I get it to output correctly? Some examples of output:

['Four', 'score', 'and', 'seven', 'years', 'ago', 'our', 'fathers', 'brought', 'forth', 'on', 'this']
['continent,', 'a', 'new', 'nation,', 'conceived', 'in', 'Liberty,', 'and', 'dedicated', 'to', 'the']
['proposition', 'that', 'all', 'men', 'are', 'created', 'equal.']
['Now', 'we', 'are', 'engaged', 'in', 'a', 'great', 'civil', 'war,', 'testing', 'whether', 'that', 'nation,', 'or']
['any', 'nation', 'so', 'conceived', 'and', 'so', 'dedicated,', 'can', 'long', 'endure.', 'We', 'are', 'met', 'on', 'a']

Expected output: "Four score and seven years ago..."
You can use " ".join(["hello", "world"]) and you'll get "hello world"
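Applied to the lines from the question, that might look like the following sketch. format_line is a hypothetical helper, and the sample lines are shortened from the question's output.

```python
def format_line(words):
    """Join a list of words into one space-separated string."""
    return " ".join(words)


lines = [
    ['Four', 'score', 'and', 'seven', 'years', 'ago'],
    ['our', 'fathers', 'brought', 'forth'],
]
for line in lines:
    print(format_line(line))
# Four score and seven years ago
# our fathers brought forth
```

In the question's printLine, printing str(line[i]) is what produces the bracketed list representation; joining the words first avoids that.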
find words that can be made from a string in python
I'm fairly new to Python and I'm not sure how to tackle my problem. I'm trying to make a program that can take a string of 15 characters from a .txt file, find the words that can be made from those characters using a dictionary file, and then output those words to another text file. This is what I have tried:
attempting to find words that don't contain the characters and removing them from the list
various anagram-solver type programs from GitHub
I tried sudo pip3 install anagram-solver, but it has been 3 hours on 15 characters and it is still running.
I'm new, so please tell me if I'm forgetting something.
If you're looking for "perfect" anagrams, i.e. those that contain exactly the same number of characters, not a subset, it's pretty easy:
1. take your word-to-find, sort it by its letters
2. take your dictionary, sort each word by its letters
3. if the sorted versions match, they're anagrams

def find_anagrams(seek_word):
    sorted_seek_word = sorted(seek_word.lower())
    for word in open("/usr/share/dict/words"):
        word = word.strip()  # remove trailing newline
        sorted_word = sorted(word.lower())
        if sorted_word == sorted_seek_word and word != seek_word:
            print(seek_word, word)

if __name__ == "__main__":
    find_anagrams("begin")
    find_anagrams("nicer")
    find_anagrams("decor")

prints (on my macOS machine; Windows machines won't have /usr/share/dict/words by default, and some Linux distributions need it installed separately)

begin being
begin binge
nicer cerin
nicer crine
decor coder
decor cored
decor Credo

EDIT: A second variation that finds all words that are assemblable from the letters in the original word, using collections.Counter:

import collections

def find_all_anagrams(seek_word):
    seek_word_counter = collections.Counter(seek_word.lower())
    for word in open("/usr/share/dict/words"):
        word = word.strip()  # remove trailing newline
        word_counter = collections.Counter(word.strip())
        if word != seek_word and all(
            n <= seek_word_counter[l] for l, n in word_counter.items()
        ):
            yield word

if __name__ == "__main__":
    print("decoration", set(find_all_anagrams("decoration")))

Outputs e.g.
decoration {'carte', 'drona', 'roit', 'oat', 'cantred', 'rond', 'rid', 'centroid', 'trine', 't', 'tenai', 'cond', 'toroid', 'recon', 'contra', 'dain', 'cootie', 'iao', 'arctoid', 'oner', 'indart', 'tine', 'nace', 'rident', 'cerotin', 'cran', 'eta', 'eoan', 'cardoon', 'tone', 'trend', 'trinode', 'coaid', 'ranid', 'rein', 'end', 'actine', 'ide', 'cero', 'iodate', 'corn', 'oer', 'retia', 'nidor', 'diter', 'drat', 'tec', 'tic', 'creat', 'arent', 'coon', 'doater', 'ornoite', 'terna', 'docent', 'tined', 'edit', 'octroi', 'eric', 'read', 'toned', 'c', 'tera', 'can', 'rocta', 'cortina', 'adonite', 'iced', 'no', 'natr', 'net', 'oe', 'rodeo', 'actor', 'otarine', 'on', 'cretin', 'ericad', 'dance', 'tornade', 'tinea', 'coontie', 'anerotic', 'acrite', 'ra', 'danio', 'inroad', 'inde', 'tied', 'tar', 'coronae', 'tid', 'rad', 'doc', 'derat', 'tea', 'acerin', 'ronde', 'recti', 'areito', 'drain', 'odontic', 'octoad', 'rio', 'actin', 'tread', 'rect', 'ariot', 'road', 'doctrine', 'enactor', 'indoor', 'toco', 'ton', 'trice', 'norite', 'nea', 'coda', 'noria', 'rot', 'trona', 'rice', 'arite', 'eria', 'orad', 'rate', 'toed', 'enact', 'crinet', 'cento', 'arid', 'coot', 'nat', 'nar', 'cain', 'at', 'antired', 'ear', 'triode', 'doter', 'cedarn', 'orna', 'rand', 'tari', 'crea', 'tiar', 'retan', 'tire', 'cora', 'aroid', 'iron', 'tenio', 'enroot', 'd', 'oaric', 'acetin', 'tain', 'neat', 'noter', 'tien', 'aortic', 'tode', 'dicer', 'irate', 'tie', 'canid', 'ado', 'noticer', 'arn', 'nacre', 'ceration', 'ratine', 'denaro', 'cotoin', 'aint', 'canto', 'cinter', 'decani', 'roon', 'donor', 'acnode', 'aide', 'doer', 'tacnode', 'oread', 'acetoin', 'rine', 'acton', 'conoid', 'a', 'otocrane', 'norate', 'care', 'ticer', 'io', 'detain', 'cedar', 'ta', 'toadier', 'atone', 'cornet', 'dacoit', 'toric', 'orate', 'arni', 'adroit', 'rend', 'tanier', 'rooted', 'doit', 'dier', 'odorate', 'trica', 'rated', 'cotonier', 'dine', 'roid', 'cairned', 'cat', 'i', 'coin', 'octine', 'trod', 'orc', 'cardo', 'eniac', 'arenoid', 
'erd', 'creant', 'oda', 'ratio', 'ceria', 'ad', 'acorn', 'dorn', 'deric', 'credit', 'door', 'cinder', 'cantor', 'er', 'doon', 'coner', 'donate', 'roe', 'tora', 'antic', 'racoon', 'ooid', 'noa', 'tae', 'coroa', 'earn', 'retain', 'canted', 'norie', 'rota', 'tao', 'redan', 'rondo', 'entia', 'ctenoid', 'cent', 'daroo', 'inrooted', 'roed', 'adore', 'coat', 'e', 'rat', 'deair', 'arend', 'coir', 'acid', 'coronate', 'rodent', 'acider', 'iota', 'codo', 'redaction', 'cot', 'aeric', 'tonic', 'candier', 'decart', 'dicta', 'dot', 'recoat', 'caroon', 'rone', 'tarie', 'tarin', 'teca', 'oar', 'ocrea', 'ante', 'creation', 'tore', 'conto', 'tairn', 'roc', 'conter', 'coeditor', 'certain', 'roncet', 'decator', 'not', 'coatie', 'toran', 'caid', 'redia', 'root', 'cad', 'cartoon', 'n', 'coed', 'cand', 'neo', 'coronadite', 'dare', 'dartoic', 'acoin', 'detar', 'dite', 'trade', 'train', 'ordinate', 'racon', 'citron', 'dan', 'doat', 'nito', 'tercia', 'rote', 'cooer', 'acone', 'rita', 'caret', 'dern', 'enatic', 'too', 'cried', 'tade', 'dit', 'orient', 'ria', 'torn', 'coati', 'cnida', 'note', 'tried', 'acrid', 'nitro', 'acron', 'tern', 'one', 'it', 'naio', 'dor', 'ea', 'ca', 'ire', 'inert', 'orcanet', 'cine', 'coe', 'nardoo', 'deota', 'den', 'toi', 'adion', 'to', 'rite', 'nectar', 'rane', 'riant', 'cod', 'de', 'adit', 'airt', 'ie', 'retin', 'toon', 'cane', 'aeon', 'are', 'cointer', 'actioner', 'crin', 'detrain', 'art', 'cant', 'ort', 'tored', 'antoeci', 'tier', 'cite', 'onto', 'coater', 'tranced', 'atonic', 'roi', 'in', 'roan', 'decoat', 'rain', 'cronet', 'ronco', 'dont', 'citer', 'redact', 'cider', 'nor', 'octan', 'ration', 'doina', 'rie', 'aero', 'noted', 'crate', 'crain', 'cadet', 'condite', 'ran', 'odeon', 'date', 'eat', 'intoed', 'cation', 'carone', 'ratoon', 'retina', 'tiao', 'nice', 'nodi', 'codon', 'coo', 'torc', 'dent', 'entad', 'ne', 'toe', 'dae', 'decant', 'redcoat', 'coiner', 'irade', 'air', 'oint', 'coronet', 'radon', 'ce', 'octonare', 'oaten', 'citrean', 'dice', 'dancer', 
'carotid', 'cretion', 'don', 'cion', 'nei', 'tead', 'nori', 'nacrite', 'ootid', 'rancid', 'dornic', 'orenda', 'cairn', 'aroon', 'coardent', 'aider', 'notice', 'cored', 'adorn', 'tad', 'carid', 'otic', 'dian', 'od', 'dint', 'tercio', 'die', 'conred', 'tice', 'rant', 'candor', 'anti', 'dar', 'antre', 'cornea', 'ordain', 'corona', 'recta', 'redo', 'tare', 'coranto', 'action', 'caird', 'creta', 'naid', 'tri', 'acre', 'crane', 'coated', 'citronade', 'anoetic', 'tenor', 'anode', 'triad', 'ceratoid', 'rod', 'idea', 'carton', 'cortin', 'endaortic', 'dicot', 'tend', 'da', 'tod', 'erotica', 'cord', 'coreid', 'toader', 'dace', 'tan', 'editor', 'rection', 'toner', 'cone', 'ni', 'tide', 'coder', 'din', 'ocote', 'ore', 'daer', 'octane', 'darn', 'do', 'reit', 'na', 'catenoid', 'tron', 'condor', 'crinated', 'cordon', 'crone', 'toad', 'noir', 'into', 'tirade', 'nadir', 'ant', 'ade', 'droit', 'icon', 'drone', 'ared', 'cardin', 'nid', 'dire', 'orcin', 'donator', 'rani', 'tane', 'ace', 'iodo', 'doria', 'ride', 'eon', 'ornate', 'cedrat', 'aire', 'carotin', 'dation', 'tear', 'onca', 'cote', 'taroc', 'con', 'nod', 'dinero', 'ecad', 'recant', 'ae', 'octad', 'cor', 'doctor', 'acridone', 'neti', 'cordite', 'crotin', 'aneroid', 'diota', 'coorie', 'dita', 'aconite', 'nard', 'cadent', 'ectad', 'rance', 'rea', 'tai', 'denat', 'rood', 'acne', 'decan', 'ani', 'rit', 'cit', 'cetin', 'odor', 'acorned', 'iceroot', 'inro', 'crood', 'daric', 'dacite', 'trone', 'acier', 'reina', 'oncia', 'drant', 'acrodont', 'nacred', 'cotrine', 'dinar', 'tean', 'atoner', 'toorie', 'nadorite', 'cardon', 'taen', 'tin', 'conte', 'acoine', 'dater', 'diact', 'aid', 'anodic', 'coronated', 'direct', 're', 'era', 'anticor', 'triace', 'octoid', 'dao', 'corta', 'edict', 'trode', 'ode', 'orant', 'niter', 'centrad', 'cater', 'tronc', 'coronad', 'r', 'toro', 'ar', 'once', 'ora', 'trace', 'creodont', 'erotic', 'ai', 'troca', 'ion', 'tecon', 'tra', 'acor', 'radio', 'acred', 'croon', 'tricae', 'recto', 'riden', 'andorite', 'taro', 
'red', 'dear', 'ate', 'tinder', 'trin', 'deacon', 'ardent', 'aer', 'arc', 'crine', 'dart', 'diet', 'riot', 'tanrec', 'tor', 'noetic', 'ret', 'trance', 'ona', 'rind', 'coto', 'daoine', 'teind', 'toa', 'inter', 'code', 'cart', 'aion', 'detin', 'core', 'oont', 'rent', 'cedrin', 'card', 'trained', 'o', 'recoin', 'cro', 'and', 'diner', 'id', 'cordant', 'cedron', 'ditone', 'odic', 'cadi', 'cerin', 'nit', 'ecoid', 'nide', 'ean', 'andric', 'tind', 'raid', 'crena', 'oroide', 'roadite', 'canter', 'idant', 'cade', 'race', 'ten', 'caner', 'tarn', 'cooter', 'etna', 'tornadic', 'irone', 'ice', 'en', 'oord', 'oared', 'draine', 'cordate', 'react', 'reaction', 'tornado', 'troco', 'niota', 'carotenoid', 'an', 'cader', 'naric', 'car', 'centiar', 'ti', 'cearin', 'aroint', 'crined', 'iter', 'di', 'or', 'trio', 'dari', 'oration', 'orcein', 'coned', 'odorant', 'dean', 'coadore', 'cate', 'drate', 'dirten', 'ted', 'done', 'cadre', 'ocean', 'tired', 'adet', 'dirt', 'te', 'nae', 'ceti', 'cern', 'rotan', 'doe', 'roto', 'dote', 'node', 'ait', 'act', 'canoe', 'rode'}
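For the task as the question describes it (letters read from one text file, matches written to another), the Counter idea above could be wrapped roughly as follows. This is only a sketch: words_from_letters, solve, and the three path parameters are hypothetical names, and unlike the code above it lowercases the dictionary words before comparing.

```python
import collections


def words_from_letters(letters, dictionary_words):
    """Return the dictionary words that can be spelled with the given letters."""
    available = collections.Counter(letters.lower())
    found = []
    for word in dictionary_words:
        word = word.strip().lower()
        if not word:
            continue  # skip blank lines
        needed = collections.Counter(word)
        # Every letter must be available at least as often as the word needs it.
        if all(count <= available[ch] for ch, count in needed.items()):
            found.append(word)
    return found


def solve(letters_path, dictionary_path, output_path):
    """Read the letters, scan the dictionary once, and write one match per line."""
    with open(letters_path) as f:
        letters = f.read().strip()
    with open(dictionary_path) as f:
        matches = words_from_letters(letters, f)
    with open(output_path, "w") as f:
        f.write("\n".join(matches) + "\n")


print(words_from_letters("tca", ["cat\n", "act\n", "tact\n", "dog\n"]))  # ['cat', 'act']
```

Each dictionary word is checked once, so the running time grows with the size of the dictionary rather than with the number of permutations of the 15 letters.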
Why does the console say "Process finished with exit code 0" instead of printing the 'sen' variable? [duplicate]
import random
import sys

def v1_debug(v1, subject):
    if v1 != str and subject != str:
        sys.exit()
    else:
        if subject == 'He' or 'She' or 'It':
            for i in v1:
                if i == [len(v1)+1]:
                    if i == 's' or 'z' or 'x' or 'o':
                        v1 = v1 + 'es'
                    elif i == 'y':
                        v1 = v1 - 'y' + 'ies'
                    elif v1[len(v1)] == 's' and v1[len(v1)+1] == 'h':
                        v1 = v1 + 'es'
                    elif v1[len(v1)] == 'c' and v1[len(v1)+1] == 'h':
                        v1 = v1 + 'es'
        if subject == 'I' or 'You' or 'We' or 'They':
            for i in v1:
                if i == v1[len(v1)+1]:
                    v1 = v1 + 'ing'
    return ''

def default_positive_form():
    try:
        sbj = ['He', 'She', 'It', 'I', 'You', 'We', 'They']
        v1 = ['be', 'beat', 'become', 'begin', 'bend', 'bet', 'bid', 'bite', 'blow', 'break',
              'bring', 'build', 'burn', 'buy', 'catch', 'choose', 'come', 'cost', 'cut', 'dig',
              'dive', 'do', 'draw', 'dream', 'drive', 'drink', 'eat', 'fall', 'feel', 'fight',
              'find', 'fly', 'forget', 'forgive', 'freeze', 'get', 'give', 'go', 'grow', 'hang',
              'have', 'hear', 'hide', 'hit', 'hold', 'hurt', 'keep', 'know', 'lay', 'lead',
              'leave', 'lend', 'let', 'lie', 'lose', 'make', 'mean', 'meet', 'pay', 'put',
              'read', 'ride', 'ring', 'rise', 'run', 'say', 'see', 'sell', 'send', 'show',
              'shut', 'sing', 'sit', 'sleep', 'speak', 'spend', 'stand', 'swim', 'take', 'teach',
              'tear', 'tell', 'think', 'throw', 'understand', 'wake', 'wear', 'win', 'write']
        sbj = random.choice(sbj)
        v1 = random.choice(v1)
        verb_debug = v1_debug(v1, sbj)
        sen = ''
        if sbj == 'I':
            sen = sbj + 'am' + verb_debug
        elif sbj == 'He' or 'She' or 'It':
            sen = sbj + 'is' + verb_debug
        elif sbj == 'You' or 'We' or 'They':
            sen = sbj + 'are' + verb_debug
        print(f'{sen}')
    except NameError:
        print('this is bullshit')
    return default_positive_form()

This is Python 3.8.
sen will only consist of an empty string if none of the conditions of your if/elif/elif blocks are met. Change the print line to

print(f"sen is: {sen}")

But that's not the real problem. obj != str does not check if obj is a string; it checks to see if the object is pointing to the type constant str (thanks to Charles Duffy for the comment). Instead, use the builtin function isinstance() like so:

if not isinstance(v1, str) and not isinstance(subject, str):
    print("Variables are the wrong type!")
    sys.exit()
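A short sketch of the difference. The helper is_third_person is hypothetical. The last two print lines go slightly beyond the answer above: they illustrate a second pitfall that appears in the question's code, where `x == 'a' or 'b'` is always truthy, and a membership test is one common way to write what was probably intended.

```python
def is_third_person(subject):
    """True for the singular third-person pronouns used in the question."""
    if not isinstance(subject, str):
        raise TypeError("subject must be a string")
    return subject in ('He', 'She', 'It')  # membership test instead of chained 'or'


print('He' != str)                 # True: a string instance is never the type object str
print(isinstance('He', str))       # True
print(bool('I' == 'He' or 'She'))  # True, because the non-empty string 'She' is truthy
print(is_third_person('I'))        # False
```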
How to open and search large txt files in Python/ flask
So I am currently making a Flask app that is a part-of-speech tagger, and part of the app uses a couple of txt files to check if a word is a noun or a verb, by seeing if that word is in the file. For example, here is the object I use for that:

class Word_Ref (object):
    #used for part of speech tagging, and word look up.
    def __init__(self, selection):
        if selection == 'Verbs':
            wordfile = open('Verbs.txt', 'r')
            wordstring = wordfile.read()
            self.reference = wordstring.split()
            wordfile.close()
        elif selection == 'Nouns':
            wordfile = open('Nouns.txt', 'r')
            wordstring = wordfile.read()
            self.reference = wordstring.split()
            wordfile.close()
        elif selection == 'Adjectives':
            wordfile = open('Adjectives.txt', 'r')
            wordstring = wordfile.read()
            self.reference = wordstring.split()
            wordfile.close()
        elif selection == 'Adverbs':
            wordfile = open('Adverbs.txt', 'r')
            wordstring = wordfile.read()
            self.reference = wordstring.split()
            wordfile.close()
        elif selection == 'Pronouns':
            self.reference = ['i', 'me', 'my', 'mine', 'myself', 'you', 'your', 'yours', 'yourself',
                              'he', 'she', 'it', 'him', 'her', 'his', 'hers', 'its', 'himself',
                              'herself', 'itself', 'we', 'us', 'our', 'ours', 'ourselves', 'they',
                              'them', 'their', 'theirs', 'themselves', 'that', 'this']
        elif selection == 'Coord_Conjunc':
            self.reference = ['for', 'and', 'nor', 'but', 'or', 'yet', 'so']
        elif selection == 'Be_Verbs':
            self.reference = ['is', 'was', 'are', 'were', 'could', 'should', 'would', 'be', 'can',
                              'cant', 'cannot', 'does', 'do', 'did', 'am', 'been']
        elif selection == 'Subord_Conjunc':
            self.reference = ['as', 'after', 'although', 'if', 'how', 'till', 'unless', 'until',
                              'since', 'where', 'when', 'whenever', 'where', 'wherever', 'while',
                              'though', 'who', 'because', 'once', 'whereas', 'before']
        elif selection =='Prepositions':
            self.reference = ['on', 'at', 'in']
        else:
            raise ReferenceError('Must choose a valid reference library.')

    def __contains__(self, other):
        if other[-1] == ',':
            return other[:-1] in self.reference
        else:
            return other in self.reference

And then here is my Flask app py document:

from flask import Flask, render_template, request
from POS_tagger import *

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def index(result=None):
    if request.args.get('mail', None):
        retrieved_text = request.args['mail']
        result = process_text(retrieved_text)
    return render_template('index.html', result=result)

def process_text(text):
    elem = Sentence(text)
    tag = tag_pronouns(elem)
    tag = tag_preposition(tag)
    tag = tag_be_verbs(tag)
    tag = tag_coord_conj(tag)
    tag = tag_subord_conj(tag)
    tagged = package_sentence(tag)
    new = str(tagged)
    return new

if __name__ == '__main__':
    app.run()

So, whenever the process_text function in the Flask app uses any function that calls open() and then .read(), it causes an internal server error, whether or not I go through the Word_Ref object. I also tested this with a txt file with 3 lines, and it still caused the same internal server error. All the other functions of my POS_tagger work within the Flask app, and all of these functions, even the ones using open(), work in the interpreter. Is there any alternative to the open() way of looking in txt files for this purpose?

EDIT: here are the tracebacks:

File "/Users/Josh/PycharmProjects/Informineer/POS_tagger.py", line 174, in tag_avna
    adverbs = Word_Ref('Adverbs')
File "/Users/Josh/PycharmProjects/Informineer/POS_tagger.py", line 91, in __init__
    wordfile = open('Adverbs.txt', 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'Adverbs.txt'

The txt files are in the same directory as the Flask app, though.
Maybe try something like this in your Flask app.py program:

import os
_dir = os.path.abspath(os.path.dirname(__file__))
adverb_file = os.path.join(_dir, 'Adverbs.txt')

You may need to modify this depending on where you want _dir to point, but it will be a bit more dynamic. Also consider using a context manager for file I/O. It will condense the code a bit and also guarantees that the file is closed in case of exceptions, etc. For example:

with open(adverb_file, 'r') as wordfile:
    wordstring = wordfile.read()
    self.reference = wordstring.split()
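The two suggestions above can be combined into one small loader. This is a sketch under assumptions: load_words and its base_dir parameter are hypothetical names, and returning a set (rather than the list used in the question) is a choice made here so that membership tests on large word files stay fast.

```python
from pathlib import Path


def load_words(filename, base_dir):
    """Read a whitespace-separated word file located in base_dir into a set."""
    path = Path(base_dir) / filename
    # The with-block closes the file even if reading raises an exception.
    with path.open("r", encoding="utf-8") as wordfile:
        return set(wordfile.read().split())


# Inside the tagger module, the directory of the module itself would typically be used:
# BASE_DIR = Path(__file__).resolve().parent
# adverbs = load_words("Adverbs.txt", BASE_DIR)
```

Resolving the path relative to the module means the lookup no longer depends on the working directory the server happens to be started from.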
Python #define equivalent
I'm developing a Hebrew Python library for my toddler, who doesn't speak English yet. So far I've managed to make it work (function names and variables work fine). The problem is with 'if', 'while', 'for', etc. statements. If this were C++, for example, I'd use

#define if אם

Are there any alternatives to #define in Python?

EDIT: For now a quick and dirty solution works for me; instead of running the program I run this code:

def RunReady(Path):
    source = open(Path, 'rb')
    program = source.read().decode()
    output = open('curr.py', 'wb')
    program = program.replace('כל_עוד', 'while')
    program = program.replace('עבור', 'for')
    program = program.replace('אם', 'if')
    program = program.replace(' ב ', ' in ')
    program = program.replace('הגדר', 'def')
    program = program.replace('אחרת', 'else')
    program = program.replace('או', 'or')
    program = program.replace('וגם', 'and')
    output.write(program.encode('utf-8'))
    output.close()
    source.close()
    import curr

current_file = 'Sapir_1.py'
RunReady(current_file)
Python 3 has 33 keywords (35 since Python 3.7, which added async and await), of which only a few are used by beginners: ['False', 'None', 'True', 'and', 'as', 'assert', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'nonlocal', 'not', 'or', 'pass', 'raise', 'return', 'try', 'while', 'with', 'yield'] Given that Python doesn't support renaming keywords, it's probably easier to teach a few of these keywords along with teaching programming.
How about adding the #define lines and then running the C preprocessor (but not the compiler)? That will give you Python source.
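The str.replace approach in the question also rewrites the Hebrew words when they occur inside longer identifiers or inside string literals. A token-based variant avoids that. The following is only a sketch, assuming the standard tokenize module accepts the Hebrew source (Hebrew letters are valid in Python 3 identifiers); the KEYWORDS mapping is taken from the question, and translate is a hypothetical name.

```python
import io
import tokenize

# Mapping from the question; extend it with whatever keywords are being taught.
KEYWORDS = {
    'כל_עוד': 'while', 'עבור': 'for', 'אם': 'if', 'ב': 'in',
    'הגדר': 'def', 'אחרת': 'else', 'או': 'or', 'וגם': 'and',
}


def translate(source):
    """Replace whole NAME tokens found in KEYWORDS; leave strings and comments alone."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        text = tok.string
        if tok.type == tokenize.NAME and text in KEYWORDS:
            text = KEYWORDS[text]
        out.append((tok.type, text))
    # With (type, string) pairs, untokenize rebuilds valid but loosely spaced source.
    return tokenize.untokenize(out)


sample = "x = 5\nאם x > 1:\n    y = 'אם'\nאחרת:\n    y = 'no'\n"
namespace = {}
exec(translate(sample), namespace)
print(namespace["y"])  # the string literal 'אם' is left untouched
```

The translated source could be written to a file and imported as in the question, or compiled and executed directly as shown here.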