Python #define equivalent - python

I'm developing a Hebrew Python library for my toddler, who doesn't speak English yet. So far I've managed to make it work (function names and variables work fine). The problem is with statements such as 'if', 'while' and 'for'. If this were C++, for example, I'd use
#define אם if
Are there any alternatives to #define in Python?
EDIT:
For now a quick and dirty solution works for me; instead of running the program I run this code:
def RunReady(Path):
    source = open(Path, 'rb')
    program = source.read().decode()
    output = open('curr.py', 'wb')
    # translate the Hebrew keywords into Python keywords
    program = program.replace('כל_עוד', 'while')
    program = program.replace('עבור', 'for')
    program = program.replace('אם', 'if')
    program = program.replace(' ב ', ' in ')
    program = program.replace('הגדר', 'def')
    program = program.replace('אחרת', 'else')
    program = program.replace('או', 'or')
    program = program.replace('וגם', 'and')
    output.write(program.encode('utf-8'))
    output.close()
    source.close()
    import curr  # importing the translated module runs it

current_file = 'Sapir_1.py'
RunReady(current_file)

Python 3 has 33 keywords (35 since Python 3.7, which added async and await), of which only a few are used by beginners:
['False', 'None', 'True', 'and', 'as', 'assert', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'nonlocal', 'not', 'or', 'pass', 'raise', 'return', 'try', 'while', 'with', 'yield']
Given that Python doesn't support renaming keywords it's probably easier to teach a few of these keywords along with teaching programming.

How about adding the #define lines and then running only the C preprocessor (not the compiler)? Its output would be plain Python source.
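Another option is to translate at the token level rather than with plain string replacement, so that string literals, comments and longer identifiers are not touched. The following is only a minimal sketch using the standard tokenize module; the HEBREW_KEYWORDS table and the translate() helper are hypothetical names made up for this illustration, not part of any library.

```python
import io
import tokenize

# Hypothetical translation table; extend it with whatever words you teach.
HEBREW_KEYWORDS = {
    'כל_עוד': 'while',
    'עבור': 'for',
    'אם': 'if',
    'ב': 'in',
    'הגדר': 'def',
    'אחרת': 'else',
    'או': 'or',
    'וגם': 'and',
}

def translate(source):
    """Return Python source with Hebrew keyword tokens replaced."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        text = tok.string
        # Only whole NAME tokens are translated, so string literals,
        # comments and longer identifiers are left alone.
        if tok.type == tokenize.NAME and text in HEBREW_KEYWORDS:
            text = HEBREW_KEYWORDS[text]
        result.append((tok.type, text))
    # Given (type, string) pairs, untokenize() rebuilds valid source,
    # although the spacing may differ from the original.
    return tokenize.untokenize(result)

# A tiny Hebrew program: the keyword is translated, the string literal is not.
hebrew_program = "אם 1 < 2:\n    x = 'אם'\n"
namespace = {}
exec(translate(hebrew_program), namespace)
```

A side effect of this design is that the Hebrew words in the table become reserved: a variable named ב, for instance, would also be turned into in.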

Related

find words that can be made from a string in python

I'm fairly new to Python and I'm not sure how to tackle my problem.
I'm trying to make a program that takes a string of 15 characters from a .txt file, finds the words that can be made from those characters using a dictionary file, and then writes those words to another text file.
This is what I have tried:
attempting to find words that don't contain the characters and removing them from the list
various anagram-solver programs from GitHub
I tried sudo pip3 install anagram-solver, but it has been running for 3 hours on 15 characters and still hasn't finished
I'm new, so please tell me if I'm forgetting something.
If you're looking for "perfect" anagrams, i.e. words that use exactly the same letters the same number of times, not a subset of them, it's pretty easy:
take your word-to-find, sort it by its letters
take your dictionary, sort each word by its letters
if the sorted versions match, they're anagrams
def find_anagrams(seek_word):
    sorted_seek_word = sorted(seek_word.lower())
    for word in open("/usr/share/dict/words"):
        word = word.strip()  # remove trailing newline
        sorted_word = sorted(word.lower())
        if sorted_word == sorted_seek_word and word != seek_word:
            print(seek_word, word)

if __name__ == "__main__":
    find_anagrams("begin")
    find_anagrams("nicer")
    find_anagrams("decor")
prints (on my macOS machine – Windows machines won't have /usr/share/dict/words by default, and some Linux distributions need it installed separately)
begin being
begin binge
nicer cerin
nicer crine
decor coder
decor cored
decor Credo
EDIT
A second variation that finds all words that are assemblable from the letters in the original word, using collections.Counter:
import collections

def find_all_anagrams(seek_word):
    seek_word_counter = collections.Counter(seek_word.lower())
    for word in open("/usr/share/dict/words"):
        word = word.strip()  # remove trailing newline
        word_counter = collections.Counter(word.strip())
        if word != seek_word and all(
            n <= seek_word_counter[l] for l, n in word_counter.items()
        ):
            yield word

if __name__ == "__main__":
    print("decoration", set(find_all_anagrams("decoration")))
Outputs e.g.
decoration {'carte', 'drona', 'roit', 'oat', 'cantred', 'rond', 'rid', 'centroid', 'trine', 't', 'tenai', 'cond', 'toroid', 'recon', 'contra', 'dain', 'cootie', 'iao', 'arctoid', 'oner', 'indart', 'tine', 'nace', 'rident', 'cerotin', 'cran', 'eta', 'eoan', 'cardoon', 'tone', 'trend', 'trinode', 'coaid', 'ranid', 'rein', 'end', 'actine', 'ide', 'cero', 'iodate', 'corn', 'oer', 'retia', 'nidor', 'diter', 'drat', 'tec', 'tic', 'creat', 'arent', 'coon', 'doater', 'ornoite', 'terna', 'docent', 'tined', 'edit', 'octroi', 'eric', 'read', 'toned', 'c', 'tera', 'can', 'rocta', 'cortina', 'adonite', 'iced', 'no', 'natr', 'net', 'oe', 'rodeo', 'actor', 'otarine', 'on', 'cretin', 'ericad', 'dance', 'tornade', 'tinea', 'coontie', 'anerotic', 'acrite', 'ra', 'danio', 'inroad', 'inde', 'tied', 'tar', 'coronae', 'tid', 'rad', 'doc', 'derat', 'tea', 'acerin', 'ronde', 'recti', 'areito', 'drain', 'odontic', 'octoad', 'rio', 'actin', 'tread', 'rect', 'ariot', 'road', 'doctrine', 'enactor', 'indoor', 'toco', 'ton', 'trice', 'norite', 'nea', 'coda', 'noria', 'rot', 'trona', 'rice', 'arite', 'eria', 'orad', 'rate', 'toed', 'enact', 'crinet', 'cento', 'arid', 'coot', 'nat', 'nar', 'cain', 'at', 'antired', 'ear', 'triode', 'doter', 'cedarn', 'orna', 'rand', 'tari', 'crea', 'tiar', 'retan', 'tire', 'cora', 'aroid', 'iron', 'tenio', 'enroot', 'd', 'oaric', 'acetin', 'tain', 'neat', 'noter', 'tien', 'aortic', 'tode', 'dicer', 'irate', 'tie', 'canid', 'ado', 'noticer', 'arn', 'nacre', 'ceration', 'ratine', 'denaro', 'cotoin', 'aint', 'canto', 'cinter', 'decani', 'roon', 'donor', 'acnode', 'aide', 'doer', 'tacnode', 'oread', 'acetoin', 'rine', 'acton', 'conoid', 'a', 'otocrane', 'norate', 'care', 'ticer', 'io', 'detain', 'cedar', 'ta', 'toadier', 'atone', 'cornet', 'dacoit', 'toric', 'orate', 'arni', 'adroit', 'rend', 'tanier', 'rooted', 'doit', 'dier', 'odorate', 'trica', 'rated', 'cotonier', 'dine', 'roid', 'cairned', 'cat', 'i', 'coin', 'octine', 'trod', 'orc', 'cardo', 'eniac', 'arenoid', 
'erd', 'creant', 'oda', 'ratio', 'ceria', 'ad', 'acorn', 'dorn', 'deric', 'credit', 'door', 'cinder', 'cantor', 'er', 'doon', 'coner', 'donate', 'roe', 'tora', 'antic', 'racoon', 'ooid', 'noa', 'tae', 'coroa', 'earn', 'retain', 'canted', 'norie', 'rota', 'tao', 'redan', 'rondo', 'entia', 'ctenoid', 'cent', 'daroo', 'inrooted', 'roed', 'adore', 'coat', 'e', 'rat', 'deair', 'arend', 'coir', 'acid', 'coronate', 'rodent', 'acider', 'iota', 'codo', 'redaction', 'cot', 'aeric', 'tonic', 'candier', 'decart', 'dicta', 'dot', 'recoat', 'caroon', 'rone', 'tarie', 'tarin', 'teca', 'oar', 'ocrea', 'ante', 'creation', 'tore', 'conto', 'tairn', 'roc', 'conter', 'coeditor', 'certain', 'roncet', 'decator', 'not', 'coatie', 'toran', 'caid', 'redia', 'root', 'cad', 'cartoon', 'n', 'coed', 'cand', 'neo', 'coronadite', 'dare', 'dartoic', 'acoin', 'detar', 'dite', 'trade', 'train', 'ordinate', 'racon', 'citron', 'dan', 'doat', 'nito', 'tercia', 'rote', 'cooer', 'acone', 'rita', 'caret', 'dern', 'enatic', 'too', 'cried', 'tade', 'dit', 'orient', 'ria', 'torn', 'coati', 'cnida', 'note', 'tried', 'acrid', 'nitro', 'acron', 'tern', 'one', 'it', 'naio', 'dor', 'ea', 'ca', 'ire', 'inert', 'orcanet', 'cine', 'coe', 'nardoo', 'deota', 'den', 'toi', 'adion', 'to', 'rite', 'nectar', 'rane', 'riant', 'cod', 'de', 'adit', 'airt', 'ie', 'retin', 'toon', 'cane', 'aeon', 'are', 'cointer', 'actioner', 'crin', 'detrain', 'art', 'cant', 'ort', 'tored', 'antoeci', 'tier', 'cite', 'onto', 'coater', 'tranced', 'atonic', 'roi', 'in', 'roan', 'decoat', 'rain', 'cronet', 'ronco', 'dont', 'citer', 'redact', 'cider', 'nor', 'octan', 'ration', 'doina', 'rie', 'aero', 'noted', 'crate', 'crain', 'cadet', 'condite', 'ran', 'odeon', 'date', 'eat', 'intoed', 'cation', 'carone', 'ratoon', 'retina', 'tiao', 'nice', 'nodi', 'codon', 'coo', 'torc', 'dent', 'entad', 'ne', 'toe', 'dae', 'decant', 'redcoat', 'coiner', 'irade', 'air', 'oint', 'coronet', 'radon', 'ce', 'octonare', 'oaten', 'citrean', 'dice', 'dancer', 
'carotid', 'cretion', 'don', 'cion', 'nei', 'tead', 'nori', 'nacrite', 'ootid', 'rancid', 'dornic', 'orenda', 'cairn', 'aroon', 'coardent', 'aider', 'notice', 'cored', 'adorn', 'tad', 'carid', 'otic', 'dian', 'od', 'dint', 'tercio', 'die', 'conred', 'tice', 'rant', 'candor', 'anti', 'dar', 'antre', 'cornea', 'ordain', 'corona', 'recta', 'redo', 'tare', 'coranto', 'action', 'caird', 'creta', 'naid', 'tri', 'acre', 'crane', 'coated', 'citronade', 'anoetic', 'tenor', 'anode', 'triad', 'ceratoid', 'rod', 'idea', 'carton', 'cortin', 'endaortic', 'dicot', 'tend', 'da', 'tod', 'erotica', 'cord', 'coreid', 'toader', 'dace', 'tan', 'editor', 'rection', 'toner', 'cone', 'ni', 'tide', 'coder', 'din', 'ocote', 'ore', 'daer', 'octane', 'darn', 'do', 'reit', 'na', 'catenoid', 'tron', 'condor', 'crinated', 'cordon', 'crone', 'toad', 'noir', 'into', 'tirade', 'nadir', 'ant', 'ade', 'droit', 'icon', 'drone', 'ared', 'cardin', 'nid', 'dire', 'orcin', 'donator', 'rani', 'tane', 'ace', 'iodo', 'doria', 'ride', 'eon', 'ornate', 'cedrat', 'aire', 'carotin', 'dation', 'tear', 'onca', 'cote', 'taroc', 'con', 'nod', 'dinero', 'ecad', 'recant', 'ae', 'octad', 'cor', 'doctor', 'acridone', 'neti', 'cordite', 'crotin', 'aneroid', 'diota', 'coorie', 'dita', 'aconite', 'nard', 'cadent', 'ectad', 'rance', 'rea', 'tai', 'denat', 'rood', 'acne', 'decan', 'ani', 'rit', 'cit', 'cetin', 'odor', 'acorned', 'iceroot', 'inro', 'crood', 'daric', 'dacite', 'trone', 'acier', 'reina', 'oncia', 'drant', 'acrodont', 'nacred', 'cotrine', 'dinar', 'tean', 'atoner', 'toorie', 'nadorite', 'cardon', 'taen', 'tin', 'conte', 'acoine', 'dater', 'diact', 'aid', 'anodic', 'coronated', 'direct', 're', 'era', 'anticor', 'triace', 'octoid', 'dao', 'corta', 'edict', 'trode', 'ode', 'orant', 'niter', 'centrad', 'cater', 'tronc', 'coronad', 'r', 'toro', 'ar', 'once', 'ora', 'trace', 'creodont', 'erotic', 'ai', 'troca', 'ion', 'tecon', 'tra', 'acor', 'radio', 'acred', 'croon', 'tricae', 'recto', 'riden', 'andorite', 'taro', 
'red', 'dear', 'ate', 'tinder', 'trin', 'deacon', 'ardent', 'aer', 'arc', 'crine', 'dart', 'diet', 'riot', 'tanrec', 'tor', 'noetic', 'ret', 'trance', 'ona', 'rind', 'coto', 'daoine', 'teind', 'toa', 'inter', 'code', 'cart', 'aion', 'detin', 'core', 'oont', 'rent', 'cedrin', 'card', 'trained', 'o', 'recoin', 'cro', 'and', 'diner', 'id', 'cordant', 'cedron', 'ditone', 'odic', 'cadi', 'cerin', 'nit', 'ecoid', 'nide', 'ean', 'andric', 'tind', 'raid', 'crena', 'oroide', 'roadite', 'canter', 'idant', 'cade', 'race', 'ten', 'caner', 'tarn', 'cooter', 'etna', 'tornadic', 'irone', 'ice', 'en', 'oord', 'oared', 'draine', 'cordate', 'react', 'reaction', 'tornado', 'troco', 'niota', 'carotenoid', 'an', 'cader', 'naric', 'car', 'centiar', 'ti', 'cearin', 'aroint', 'crined', 'iter', 'di', 'or', 'trio', 'dari', 'oration', 'orcein', 'coned', 'odorant', 'dean', 'coadore', 'cate', 'drate', 'dirten', 'ted', 'done', 'cadre', 'ocean', 'tired', 'adet', 'dirt', 'te', 'nae', 'ceti', 'cern', 'rotan', 'doe', 'roto', 'dote', 'node', 'ait', 'act', 'canoe', 'rode'}
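To connect this back to the original task (letters read from one text file, matches written to another), here is a rough sketch. It uses Counter subtraction as an equivalent form of the per-letter comparison above; the function names words_from_letters and solve, and the three file paths they take, are assumptions made for this example only.

```python
from collections import Counter

def words_from_letters(letters, candidates):
    """Return the candidates that can be spelled with the given letters."""
    available = Counter(letters.lower())
    found = []
    for word in candidates:
        word = word.strip().lower()
        # Counter subtraction drops non-positive counts, so the result is
        # empty exactly when the word needs no letter more often than available.
        if word and not Counter(word) - available:
            found.append(word)
    return found

def solve(letters_path, dictionary_path, output_path):
    """Read the letters, filter the dictionary file, write one word per line."""
    with open(letters_path) as letters_file:
        letters = letters_file.read().strip()
    with open(dictionary_path) as dictionary_file:
        words = words_from_letters(letters, dictionary_file)
    with open(output_path, 'w') as output_file:
        output_file.write('\n'.join(words) + '\n')

# "bee" is rejected because "begin" contains only one "e".
sample = words_from_letters("begin", ["being", "binge", "big", "bee", "gin"])
```

Each dictionary word is checked once, so 15 letters against a typical word list should take well under a second, not hours.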

Why does the console say "Process finished with exit code 0" instead of printing the 'sen' variable? [duplicate]

import random
import sys
def v1_debug(v1, subject):
    if v1 != str and subject != str:
        sys.exit()
    else:
        if subject == 'He' or 'She' or 'It':
            for i in v1:
                if i == [len(v1)+1]:
                    if i == 's' or 'z' or 'x' or 'o':
                        v1 = v1 + 'es'
                    elif i == 'y':
                        v1 = v1 - 'y' + 'ies'
                    elif v1[len(v1)] == 's' and v1[len(v1)+1] == 'h':
                        v1 = v1 + 'es'
                    elif v1[len(v1)] == 'c' and v1[len(v1)+1] == 'h':
                        v1 = v1 + 'es'
        if subject == 'I' or 'You' or 'We' or 'They':
            for i in v1:
                if i == v1[len(v1)+1]:
                    v1 = v1 + 'ing'
        return ''

def default_positive_form():
    try:
        sbj = ['He', 'She', 'It', 'I', 'You', 'We', 'They']
        v1 = ['be',
'beat',
'become',
'begin',
'bend',
'bet',
'bid',
'bite',
'blow',
'break',
'bring',
'build',
'burn',
'buy',
'catch',
'choose',
'come',
'cost',
'cut',
'dig',
'dive',
'do',
'draw',
'dream',
'drive',
'drink',
'eat',
'fall',
'feel',
'fight',
'find',
'fly',
'forget',
'forgive',
'freeze',
'get',
'give',
'go',
'grow',
'hang',
'have',
'hear',
'hide',
'hit',
'hold',
'hurt',
'keep',
'know',
'lay',
'lead',
'leave',
'lend',
'let',
'lie',
'lose',
'make',
'mean',
'meet',
'pay',
'put',
'read',
'ride',
'ring',
'rise',
'run',
'say',
'see',
'sell',
'send',
'show',
'shut',
'sing',
'sit',
'sleep',
'speak',
'spend',
'stand',
'swim',
'take',
'teach',
'tear',
'tell',
'think',
'throw',
'understand',
'wake',
'wear',
'win',
'write']
        sbj = random.choice(sbj)
        v1 = random.choice(v1)
        verb_debug = v1_debug(v1, sbj)
        sen = ''
        if sbj == 'I':
            sen = sbj + 'am' + verb_debug
        elif sbj == 'He' or 'She' or 'It':
            sen = sbj + 'is' + verb_debug
        elif sbj == 'You' or 'We' or 'They':
            sen = sbj + 'are' + verb_debug
        print(f'{sen}')
    except NameError:
        print('this is bullshit')
    return

default_positive_form()
This is Python 3.8.
sen stays an empty string only if none of the conditions in your if/elif/elif chain is met. To see that, change the print line to
print(f"sen is: {sen}")
But that's not the real problem. obj != str does not check whether obj is a string; it compares the object with the type object str itself (thanks to Charles Duffy for the comment). A string value is never equal to the type str, so v1 != str and subject != str is always true, sys.exit() is called, and the program ends with exit code 0 before anything is printed. Instead, use the built-in function isinstance(), like so:
if not isinstance(v1, str) and not isinstance(subject, str):
    print("Variables are the wrong type!")
    sys.exit()
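The question's code has a second pitfall of the same kind: subject == 'He' or 'She' or 'It' is always truthy, because the non-empty string 'She' counts as true on its own. A small sketch of both checks follows; the helper names is_third_person and check_types are made up for this illustration.

```python
def is_third_person(subject):
    # A membership test is what `subject == 'He' or 'She' or 'It'` was
    # meant to express.
    return subject in ('He', 'She', 'It')

def check_types(v1, subject):
    # isinstance() tests the type of the value; `v1 != str` compares the
    # value with the type object itself and is True for every string.
    return isinstance(v1, str) and isinstance(subject, str)

always_true = ('go' != str)           # a string value never equals the type str
buggy = bool('We' == 'He' or 'She')   # evaluates to 'She', which is truthy
fixed = is_third_person('We')         # the membership test gives the intended answer
```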

Is there a tool to check what names I have used from a "wildly" imported module?

I've been using python to do computations for my research. In an effort to clean up my terrible code, I've been reading Code Like a Pythonista: Idiomatic Python by David Goodger.
In this article, Goodger advises against "wild-card" imports of the form
from module import *
My code uses a lot of these. I'd like to clean my code up, but I'm not sure how. I'm curious if there is a way to check what names from module I have used in my code. This way I could either explicitly import these names or replace the names with module.name. Is there a tool designed to accomplish such a task?
Use a tool like pyflakes (which you should use anyway) to note which names in your code become undefined after you replace from module import * with import module. Once you've positively identified every instance of a name imported from module, you can assess whether to
Always use import module and module.x for x imported from module.
Use import module as m and m.x if the module name is long.
Selectively import some names from module into the module namespace with from module import x, y, z
The above three are not mutually exclusive; as an extreme example, you can use all three in the same module:
import really.long.module.name
import really.long.module.name as rlmn
from really.long.module.name import obvious_name
really.long.module.name.foo() # full module name
rlmn.bar() # module alias
obvious_name() # imported name
(I don't recommend using all three in the same module. Stick with either the full module name or the alias throughout a single module, but there is no harm in importing common, obvious names directly and using the fully qualified name for more obscure module attributes.)
Overview
One approach is to:
Manually (or even automatically) identify each of the modules you've imported from using the * approach,
Import them in a separate file,
And then do a search-and-replace if they appear in sys.modules[<module>].__dict__, which keeps track of which objects from a Python module have been loaded.
See for yourself sys.modules in action:
from numpy import *
import sys
sys.modules['numpy'].__dict__.keys() # displays every name in numpy's namespace, which includes everything you just imported from `numpy`
>>> ['disp', 'union1d', 'all', 'issubsctype', 'savez', 'atleast_2d', 'restoredot', 'ptp', 'PackageLoader', 'ix_', 'mirr', 'blackman', 'FLOATING_POINT_SUPPORT', 'division', 'busdaycalendar', 'pkgload', 'void', 'ubyte', 'moveaxis', 'ERR_RAISE', 'void0', 'tri', 'diag_indices', 'array_equal', 'fmod', 'True_', 'indices', 'loads', 'round', 'set_numeric_ops', 'pmt', 'nanstd', '_mat', 'cosh', 'object0', 'argpartition', 'FPE_OVERFLOW', 'index_exp', 'append', 'compat', 'nanargmax', 'hstack', 'typename', 'diag', 'rollaxis', 'ERR_WARN', 'polyfit', 'version', 'memmap', 'nan_to_num', 'complex64', 'fmax', 'spacing', 'sinh', '__git_revision__', 'unicode_', 'sinc', 'trunc', 'vstack', 'ERR_PRINT', 'asscalar', 'copysign', 'less_equal', 'BUFSIZE', 'object_', 'divide', 'csingle', 'dtype', 'unsignedinteger', 'fastCopyAndTranspose', 'bitwise_and', 'uintc', 'select', 'deg2rad', 'nditer', 'eye', 'kron', 'newbuffer', 'negative', 'busday_offset', 'mintypecode', 'MAXDIMS', 'sort', 'einsum', 'uint0', 'zeros_like', 'int_asbuffer', 'uint8', 'chararray', 'linspace', 'resize', 'uint64', 'ma', 'true_divide', 'Inf', 'finfo', 'triu_indices', 'complex256', 'add_newdoc', 'seterrcall', 'logical_or', 'minimum', 'WRAP', 'tan', 'absolute', 'MAY_SHARE_EXACT', 'numarray', 'array_repr', 'get_array_wrap', 'polymul', 'tile', 'array_str', 'setdiff1d', 'sin', 'longlong', 'product', 'int16', 'str_', 'mat', 'fv', 'max', 'asanyarray', 'uint', 'npv', 'logaddexp', 'flatnonzero', 'amin', 'correlate', 'fromstring', 'left_shift', 'searchsorted', 'int64', 'may_share_memory', 'dsplit', 'intersect1d', 'can_cast', 'ppmt', 'show_config', 'cumsum', 'roots', 'outer', 'CLIP', 'fix', 'busday_count', 'timedelta64', 'degrees', 'choose', 'FPE_INVALID', 'recfromcsv', 'fill_diagonal', 'empty_like', 'logaddexp2', 'greater', 'histogram2d', 'polyint', 'arctan2', 'datetime64', 'complexfloating', 'ndindex', 'ctypeslib', 'PZERO', 'isfortran', 'asfarray', 'nanmedian', 'radians', 'fliplr', 'alen', 'recarray', 'modf', 'mean', 'square', 
'ogrid', 'MAY_SHARE_BOUNDS', 'nanargmin', 'r_', 'diag_indices_from', 'hanning', 's_', 'allclose', 'extract', 'float16', 'ulonglong', 'matrix', 'asarray', 'poly1d', 'promote_types', 'rec', 'datetime_as_string', 'uint32', 'math', 'log2', '__builtins__', 'cumproduct', 'diagonal', 'atleast_1d', 'meshgrid', 'column_stack', 'put', 'byte', 'remainder', 'row_stack', 'expm1', 'nper', 'ndfromtxt', 'matmul', 'place', 'DataSource', 'newaxis', 'arccos', 'signedinteger', 'ndim', 'rint', 'number', 'rank', 'little_endian', 'ldexp', 'lookfor', 'array', 'vsplit', 'common_type', 'size', 'logical_xor', 'geterrcall', 'sometrue', 'exp2', 'bool8', 'msort', 'alltrue', 'zeros', 'False_', '__NUMPY_SETUP__', 'nansum', 'bool_', 'inexact', 'nanpercentile', 'broadcast', 'copyto', 'short', 'arctanh', 'typecodes', 'rot90', 'savetxt', 'sign', 'int_', 'std', 'not_equal', 'fromfunction', 'tril_indices_from', '__config__', 'double', 'require', 'rate', 'typeNA', 'str', 'getbuffer', 'abs', 'clip', 'savez_compressed', 'frompyfunc', 'triu_indices_from', 'conjugate', 'alterdot', 'asfortranarray', 'binary_repr', 'angle', 'lib', 'min', 'unwrap', 'apply_over_axes', 'ERR_LOG', 'right_shift', 'take', 'broadcast_to', 'byte_bounds', 'trace', 'warnings', 'any', 'shares_memory', 'compress', 'histogramdd', 'issubclass_', 'multiply', 'mask_indices', 'amax', 'logical_not', 'average', 'partition', 'nbytes', 'exp', 'sum', 'dot', 'int0', 'nanprod', 'longfloat', 'random', 'setxor1d', 'copy', 'FPE_UNDERFLOW', 'frexp', 'errstate', 'nanmin', 'swapaxes', 'SHIFT_OVERFLOW', 'infty', 'fft', 'ModuleDeprecationWarning', 'digitize', '__file__', 'NZERO', 'ceil', 'ones', 'add_newdoc_ufunc', '_NoValue', 'deprecate', 'median', 'geterr', 'convolve', 'isreal', 'where', 'isfinite', 'SHIFT_UNDERFLOW', 'MachAr', 'argmax', 'testing', 'deprecate_with_doc', 'full', 'polyder', 'rad2deg', 'isnan', '__all__', 'irr', 'sctypeDict', 'NINF', 'min_scalar_type', 'count_nonzero', 'sort_complex', 'nested_iters', 'concatenate', 'vdot', 'bincount', 
'transpose', 'array2string', 'corrcoef', 'fromregex', 'vectorize', 'set_printoptions', 'isrealobj', 'trim_zeros', 'unravel_index', 'cos', 'float64', 'log1p', 'ushort', 'equal', 'cumprod', 'float_', 'vander', 'geterrobj', 'load', 'fromiter', 'poly', 'bitwise_or', 'polynomial', 'diff', 'iterable', 'array_split', 'get_include', 'pv', 'tensordot', 'piecewise', 'invert', 'UFUNC_PYVALS_NAME', 'SHIFT_INVALID', 'c_', 'flexible', 'pi', '__doc__', 'empty', 'VisibleDeprecationWarning', 'find_common_type', 'isposinf', 'arcsin', 'sctypeNA', 'imag', 'sctype2char', 'singlecomplex', 'SHIFT_DIVIDEBYZERO', 'matrixlib', 'apply_along_axis', 'reciprocal', 'tanh', 'dstack', 'cov', 'cast', 'logspace', 'packbits', 'issctype', 'mgrid', 'longdouble', 'signbit', 'conj', 'asmatrix', 'inf', 'flatiter', 'bitwise_xor', 'fabs', 'generic', 'reshape', 'NaN', 'cross', 'sqrt', '__package__', 'longcomplex', 'complex', 'pad', 'split', 'floor_divide', '__version__', 'format_parser', 'nextafter', 'polyval', 'flipud', 'i0', 'iscomplexobj', 'sys', 'mafromtxt', 'bartlett', 'polydiv', 'stack', 'identity', 'safe_eval', 'greater_equal', 'Tester', 'trapz', 'PINF', 'object', 'recfromtxt', 'oldnumeric', 'add_newdocs', 'RankWarning', 'ascontiguousarray', 'less', 'putmask', 'UFUNC_BUFSIZE_DEFAULT', 'unicode', 'half', 'NAN', 'absolute_import', 'typeDict', '__path__', 'shape', 'setbufsize', 'cfloat', 'RAISE', 'isscalar', 'character', 'bench', 'source', 'add', 'uint16', 'cbrt', 'bool', 'ufunc', 'save', 'ravel', 'float32', 'real', 'int32', 'tril_indices', 'around', 'lexsort', 'complex_', 'ComplexWarning', 'unicode0', 'ipmt', '_import_tools', 'atleast_3d', 'isneginf', 'integer', 'unique', 'mod', 'insert', 'bitwise_not', 'getbufsize', 'array_equiv', 'arange', 'asarray_chkfinite', 'in1d', 'interp', 'hypot', 'logical_and', 'get_printoptions', 'diagflat', 'float128', 'nonzero', 'kaiser', 'ERR_IGNORE', 'polysub', 'fromfile', 'prod', 'nanmax', 'core', 'who', 'seterrobj', 'power', 'bytes_', 'percentile', 'FPE_DIVIDEBYZERO', 
'__name__', 'subtract', 'print_function', 'nanmean', 'frombuffer', 'iscomplex', 'add_docstring', 'argsort', 'fmin', 'ones_like', 'is_busday', 'arcsinh', 'intc', 'float', 'ndenumerate', 'intp', 'unpackbits', 'Infinity', 'log', 'cdouble', 'complex128', 'long', 'round_', 'broadcast_arrays', 'inner', 'var', 'sctypes', 'log10', 'uintp', 'linalg', 'histogram', 'issubdtype', 'maximum_sctype', 'squeeze', 'int8', 'info', 'seterr', 'argmin', 'genfromtxt', 'maximum', 'record', 'obj2sctype', 'clongdouble', 'euler_gamma', 'arccosh', 'delete', 'tril', 'int', 'ediff1d', 'char', 'single', 'loadtxt', 'hsplit', 'ScalarType', 'triu', 'floating', 'expand_dims', 'floor', 'polyadd', 'nan', 'TooHardError', 'emath', 'arctan', 'bmat', 'isclose', 'ERR_DEFAULT', 'test', 'roll', 'string0', 'compare_chararrays', 'iinfo', 'real_if_close', 'repeat', 'nanvar', 'hamming', 'ALLOW_THREADS', 'ravel_multi_index', 'string_', 'isinf', 'ndarray', 'e', 'ERR_CALL', 'datetime_data', 'clongfloat', 'full_like', 'result_type', 'gradient', 'base_repr', 'argwhere', 'set_string_function']
You can either manually check if a function name you're not sure of appears here using if <function_name> in sys.modules[<module>].__dict__, or you can write a neat automated script that goes through each entry in sys.modules[<module>].
I would favour the latter for anything too sophisticated, and the former for diagnostic purposes.
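For the manual check, note that a module's __dict__ holds more than the wildcard import actually binds: from module import * takes the names listed in __all__ if the module defines it, and otherwise every name that does not start with an underscore. A small sketch of that rule, using the standard math module as a stand-in and a hypothetical helper star_import_names:

```python
import importlib

def star_import_names(module_name):
    """Names that `from module_name import *` would bind."""
    module = importlib.import_module(module_name)
    # If the module defines __all__, the wildcard import uses exactly that
    # list; otherwise it takes every name without a leading underscore.
    if hasattr(module, '__all__'):
        return set(module.__all__)
    return {name for name in vars(module) if not name.startswith('_')}

math_names = star_import_names('math')
print('sqrt' in math_names)    # sqrt is provided by math
print('array' in math_names)   # array is not, so it must come from elsewhere
```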
Rough Implementation of Automatic Tool
A very, very, very quick-and-dirty example of how to write such an automated script:
import re
import sys
with open('file_I_want_to_change.py', 'r+') as f:
    file_contents = f.read() # get the entire file as a string
    search_string = r"from ([a-zA-Z]+) import \*" # regex to find all wildcard-imported module names
    module_names = re.findall(search_string, file_contents)
    for module in module_names:
        __import__(module) # import each module (map() is lazy in Python 3 and would import nothing)
    for module in module_names:
        for function_name in sys.modules[module].__dict__:
            # do a very quick-and-dirty replace-all
            file_contents = file_contents.replace(function_name, "{0}.{1}".format(module, function_name))
    f.seek(0) # move to start of file
    f.write(file_contents)
This is not very robust, and you shouldn't use it as-is! You may find yourself overwriting names not from the module but that are defined anyway.
It's probably best to allow some form of user interaction to confirm you want to apply a change for each function name found. But it gets the gist across.
This has been tested with the following simple example file:
from numpy import *
array([1])
becomes
from numpy import *
numpy.array([1])
EDIT: I have since created a much more robust and useful command line utility here
Here's a simpler solution. It uses the ast module to strip the code out of a file and then compares it to the list of available functions found by the inspect module. Just replace the yourfilename and the yourmodulename before running.
import ast, inspect, yourmodulename as mymodule

filename='yourfilename.py'
tab = ' '*4
# names of the functions and classes defined in the module itself
funcs = {m[0] for m in inspect.getmembers(mymodule)
         if str(m[1])[1:].split(' ')[0] in ('function', 'class') and
         inspect.getmodule(m[1]) == mymodule}
with open(filename) as f:
    code = ast.parse(f.read())
# every bare name used anywhere in the file
words = {node.id for node in ast.walk(code) if isinstance(node, ast.Name)}
print('from', mymodule.__name__, 'import (')
out = tab
for word in (', '.join((sorted(words & funcs)))+')').split():
    if len(out + word) > 80:
        print(out.rstrip())
        out = tab + word + ' '
    else:
        out += word + ' '
if out:
    print(out.rstrip())
Edit: I checked it against my 3k+ line script with dozens of functions and it works.
Edit: Added a check to make sure that it's only listing functions that are actually part of the desired module, not its submodules. Also modified it to list both classes and functions to import.

Finding word context with regular expressions

I have created a function to search for the contexts of a given word(w) in a text, with left and right as parameters for flexibility in the number of words to record.
import re
def get_context (text, w, left, right):
    text.insert (0, "*START*")
    text.append ("*END*")
    all_contexts = []
    for i in range(len(text)):
        if re.match(w,text[i], 0):
            if i < left:
                context_left = text[:i]
            else:
                context_left = text[i-left:i]
            if len(text) < (i+right):
                context_right = text[i:]
            else:
                context_right = text[i:(i+right+1)]
            context = context_left + context_right
            all_contexts.append(context)
    return all_contexts
So, for example, if I have a text in the form of a list like this:
text = ['Python', 'is', 'dynamically', 'typed', 'language', 'Python',
'functions', 'really', 'care', 'about', 'what', 'you', 'pass', 'to',
'them', 'but', 'you', 'got', 'it', 'the', 'wrong', 'way', 'if', 'you',
'want', 'to', 'pass', 'one', 'thousand', 'arguments', 'to', 'your',
'function', 'then', 'you', 'can', 'explicitly', 'define', 'every',
'parameter', 'in', 'your', 'function', 'definition', 'and', 'your',
'function', 'will', 'be', 'automagically', 'able', 'to', 'handle',
'all', 'the', 'arguments', 'you', 'pass', 'to', 'them', 'for', 'you']
The function works fine, for example:
get_context(text, "function",2,2)
[['language', 'python', 'functions', 'really', 'care'], ['to', 'your', 'function', 'then', 'you'], ['in', 'your', 'function', 'definition', 'and'], ['and', 'your', 'function', 'will', 'be']]
Now I am trying to build a dictionary with the contexts of every word in the text doing the following:
d = {}
for w in set(text):
    d[w] = get_context(text,w,2,2)
But I am getting this error.
Traceback (most recent call last):
File "<pyshell#32>", line 2, in <module>
d[w] = get_context(text,w,2,2)
File "<pyshell#20>", line 9, in get_context
if re.match(w,text[i], 0):
File "/usr/lib/python3.4/re.py", line 160, in match
return _compile(pattern, flags).match(string)
File "/usr/lib/python3.4/re.py", line 294, in _compile
p = sre_compile.compile(pattern, flags)
File "/usr/lib/python3.4/sre_compile.py", line 568, in compile
p = sre_parse.parse(p, flags)
File "/usr/lib/python3.4/sre_parse.py", line 760, in parse
p = _parse_sub(source, pattern, 0)
File "/usr/lib/python3.4/sre_parse.py", line 370, in _parse_sub
itemsappend(_parse(source, state))
File "/usr/lib/python3.4/sre_parse.py", line 579, in _parse
raise error("nothing to repeat")
sre_constants.error: nothing to repeat
I don't understand this error. Can anyone help me with this?
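The problem is that "*START*" and "*END*" are being interpreted as regular expressions: once the markers are in text, the loop over set(text) passes them to re.match as patterns, and a pattern that begins with * has nothing to repeat. Also note that inserting "*START*" and "*END*" at the beginning of the function adds another pair of markers to text on every call. You should do it just once, outside the function.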
Here is a complete version of the working code:
import re
def get_context(text, w, left, right):
    all_contexts = []
    for i in range(len(text)):
        if re.match(w,text[i], 0):
            if i < left:
                context_left = text[:i]
            else:
                context_left = text[i-left:i]
            if len(text) < (i+right):
                context_right = text[i:]
            else:
                context_right = text[i:(i+right+1)]
            context = context_left + context_right
            all_contexts.append(context)
    return all_contexts
text = ['Python', 'is', 'dynamically', 'typed', 'language',
'Python', 'functions', 'really', 'care', 'about', 'what',
'you', 'pass', 'to', 'them', 'but', 'you', 'got', 'it', 'the',
'wrong', 'way', 'if', 'you', 'want', 'to', 'pass', 'one',
'thousand', 'arguments', 'to', 'your', 'function', 'then',
'you', 'can', 'explicitly', 'define', 'every', 'parameter',
'in', 'your', 'function', 'definition', 'and', 'your',
'function', 'will', 'be', 'automagically', 'able', 'to', 'handle',
'all', 'the', 'arguments', 'you', 'pass', 'to', 'them', 'for', 'you']
text.insert(0, "START")
text.append("END")
d = {}
for w in set(text):
    d[w] = get_context(text,w,2,2)
Maybe you can replace re.match(w,text[i], 0) with w == text[i].
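A minimal sketch of that equality-based variant, assuming exact word matches are what is wanted (so "functions" no longer matches "function"). It also leaves the input list unmodified; the parameter names tokens and word and the sample list are chosen for this example only.

```python
def get_context(tokens, word, left, right):
    """Return every window of `left` words before and `right` words after word."""
    contexts = []
    for i, token in enumerate(tokens):
        if token == word:  # plain comparison, so no character is special
            start = max(0, i - left)
            # Slicing past the end of a list is safe, so no length check is needed.
            contexts.append(tokens[start:i + right + 1])
    return contexts

sample = ['a', '*b*', 'c', 'a', 'd']
# One entry per distinct word, built without touching `sample`.
contexts_by_word = {w: get_context(sample, w, 1, 1) for w in set(sample)}
```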
The whole thing can be rewritten very succinctly as follows:
text = 'Python is dynamically typed language Python functions really care about what you pass to them but you got it the wrong way if you want to pass one thousand arguments to your function then you can explicitly define every parameter in your function definition and your function will be automagically able to handle all the arguments you pass to them for you'
Keeping it a str, assuming context = 'function',
pat = re.compile(r'(\w+\s\w+\s)functions?(?=(\s\w+\s\w+))')
pat.findall(text)
[('language Python ', ' really care'),
('to your ', ' then you'),
('in your ', ' definition and'),
('and your ', ' will be')]
Minor customization of the regex will be needed to also allow words such as functional or functioning, not only function or functions. But the important idea is to do away with indexing and take a more functional approach.
Please comment if this doesn't work out for you, when you apply it in bulk.
At least one of the elements in text contains characters that are special in a regular expression. If you're just trying to find whether the word is in the string, just use str.startswith, i.e.
if text[i].startswith(w): # instead of re.match(w,text[i], 0):
But I don't understand why you are checking for that anyways, and not for equality.
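If the prefix-matching behaviour of re.match is actually wanted, another option is to escape the word first with re.escape, which is part of the standard re module. The sketch below only demonstrates the difference; the helper name compiles is made up for this example.

```python
import re

def compiles(pattern):
    """Return True if pattern is a valid regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

marker = "*START*"
raw_ok = compiles(marker)                 # the leading * has nothing to repeat
escaped_ok = compiles(re.escape(marker))  # every special character is escaped
# Escaping keeps the original prefix match: "function" still matches "functions".
prefix_match = re.match(re.escape("function"), "functions") is not None
```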

How to open and search large txt files in Python/Flask

I am currently making a Flask app that is a part-of-speech tagger. Part of the app uses a couple of txt files to check whether a word is a noun or a verb, by seeing if that word is in the file. For example, here is the object I use for that:
class Word_Ref(object):
    #used for part of speech tagging, and word look up.
    def __init__(self, selection):
        if selection == 'Verbs':
            wordfile = open('Verbs.txt', 'r')
            wordstring = wordfile.read()
            self.reference = wordstring.split()
            wordfile.close()
        elif selection == 'Nouns':
            wordfile = open('Nouns.txt', 'r')
            wordstring = wordfile.read()
            self.reference = wordstring.split()
            wordfile.close()
        elif selection == 'Adjectives':
            wordfile = open('Adjectives.txt', 'r')
            wordstring = wordfile.read()
            self.reference = wordstring.split()
            wordfile.close()
        elif selection == 'Adverbs':
            wordfile = open('Adverbs.txt', 'r')
            wordstring = wordfile.read()
            self.reference = wordstring.split()
            wordfile.close()
        elif selection == 'Pronouns':
            self.reference = ['i', 'me', 'my', 'mine', 'myself', 'you', 'your', 'yours', 'yourself', 'he', 'she', 'it', 'him', 'her',
                              'his', 'hers', 'its', 'himself', 'herself', 'itself', 'we', 'us', 'our', 'ours', 'ourselves',
                              'they', 'them', 'their', 'theirs', 'themselves', 'that', 'this']
        elif selection == 'Coord_Conjunc':
            self.reference = ['for', 'and', 'nor', 'but', 'or', 'yet', 'so']
        elif selection == 'Be_Verbs':
            self.reference = ['is', 'was', 'are', 'were', 'could', 'should', 'would', 'be', 'can', 'cant', 'cannot',
                              'does', 'do', 'did', 'am', 'been']
        elif selection == 'Subord_Conjunc':
            self.reference = ['as', 'after', 'although', 'if', 'how', 'till', 'unless', 'until', 'since', 'where', 'when',
                              'whenever', 'where', 'wherever', 'while', 'though', 'who', 'because', 'once', 'whereas',
                              'before']
        elif selection =='Prepositions':
            self.reference = ['on', 'at', 'in']
        else:
            raise ReferenceError('Must choose a valid reference library.')

    def __contains__(self, other):
        if other[-1] == ',':
            return other[:-1] in self.reference
        else:
            return other in self.reference
And then here is my flask app py document:
from flask import Flask, render_template, request
from POS_tagger import *

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def index(result=None):
    if request.args.get('mail', None):
        retrieved_text = request.args['mail']
        result = process_text(retrieved_text)
    return render_template('index.html', result=result)

def process_text(text):
    elem = Sentence(text)
    tag = tag_pronouns(elem)
    tag = tag_preposition(tag)
    tag = tag_be_verbs(tag)
    tag = tag_coord_conj(tag)
    tag = tag_subord_conj(tag)
    tagged = package_sentence(tag)
    new = str(tagged)
    return new

if __name__ == '__main__':
    app.run()
Whenever the process_text function in the Flask app calls any function that uses open() and then .read(), it causes an internal server error, whether or not I go through the Word_Ref object. I also tested this with a txt file of 3 lines, and it still caused the same internal server error. All the other functions of my POS_tagger work within the Flask app, and all of these functions, even the ones using open(), work in the interpreter.
Any alternate solution to the open() way of looking in txt files for this purpose?
EDIT: here are the tracebacks:
File "/Users/Josh/PycharmProjects/Informineer/POS_tagger.py", line 174, in tag_avna
adverbs = Word_Ref('Adverbs')
File "/Users/Josh/PycharmProjects/Informineer/POS_tagger.py", line 91, in __init__
wordfile = open('Adverbs.txt', 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'Adverbs.txt'
The txt files are in the same directory as the Flask app, though.
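A relative file name is resolved against the current working directory of the process, which is not necessarily the directory that contains your Python files. Maybe try something like this in your Flask app.py program: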
import os
_dir = os.path.abspath(os.path.dirname(__file__))
adverb_file = os.path.join(_dir, 'Adverbs.txt')
You may need to modify depending on where you want _dir to point to but it will be a bit more dynamic.
Also consider using a Context Manager for File IO. It will condense the code a bit and also guarantees that the file is closed in case of Exceptions, etc.
For example:
with open(adverb_file, 'r') as wordfile:
    wordstring = wordfile.read()
    self.reference = wordstring.split()
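Putting the two suggestions together, a rough sketch might look like the following. The helper name load_words and its base_dir parameter are assumptions for illustration, and a temporary folder with a made-up two-word Adverbs.txt stands in for the real project directory so the snippet runs anywhere. Returning a set instead of a list also keeps the word in reference lookup fast for large files.

```python
import tempfile
from pathlib import Path

def load_words(filename, base_dir):
    """Read a whitespace-separated word file located in base_dir into a set."""
    path = Path(base_dir) / filename
    # The context manager closes the file even if reading raises an exception.
    with path.open('r', encoding='utf-8') as wordfile:
        return set(wordfile.read().split())

# In POS_tagger.py, base_dir would typically be the module's own folder:
#     BASE_DIR = Path(__file__).resolve().parent
# Here a temporary folder stands in for it.
with tempfile.TemporaryDirectory() as base_dir:
    sample_file = Path(base_dir) / 'Adverbs.txt'
    sample_file.write_text('quickly\nslowly\n', encoding='utf-8')
    adverbs = load_words('Adverbs.txt', base_dir)
```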
