I am trying to decrypt a file in Python that I encrypted with another program. Some letters are decrypted correctly while others are not, and I am not sure what is going on. All I essentially did was reverse the mapping from my encryption program. I think it has to do with the way it iterates through the text, but I am not sure how to fix it.
Here is my decryption code:
decryption_library = {'%':'A','9':'a','#':'B','#':'b','1':'C','2':'c','3':'D','4':'d',
'5':'E','6':'e','7':'F','8':'f','0':'G','}':'g','{':'H',']':'h','[':'I',',':'i',
'.':'J','>':'j','<':'K','/':'k','0':'L','\-':'l','\"':'M',':':'m',';':'N',
'+':'n','$':'O','-':'o','$':'Q','%':'q','^':'R','&':'r','*':'S',
'(':'s',')':'T','~':'t','`':'U','5':'u','\\':'V','+':'v','=':'W','7':'w',
'~':'X',')':'x','2':'Y','*':'y',']':'Z','8':'z'}
orig_file = open('ENCRYPTED_Plain_Text_File.txt','r')
file_read = orig_file.read()
orig_file.close()
encrypt_file = open('DECRYPTED_Plain_Text_File.txt','w')
for ch in file_read:
    if ch in decryption_library:
        encrypt_file.write(decryption_library[ch])
    else:
        encrypt_file.write(ch)
encrypt_file.close()
encrypt_file = open('ENCRYPTED_Plain_Text_File.txt','r')
file_read = encrypt_file.read()
encrypt_file.close()
codes_items = decryption_library.items()
for ch in file_read:
    if not ch in decryption_library.values() or ch == '.' or ch == ',' or ch == '!':
        print(ch)
    else:
        for k, v in codes_items:
            if ch == v and ch != '.':
                print(k, end='')
Here is the encrypted text:
)]6 ^-94 ;-~ )9/6+
#2 ^$#5^) 7^$*)
)7- &-94( 4,+6&}64 ,+ 9 *6\-\--7 7--4,
%+4 (-&&* [ 2-5\-4 +-~ ~&9+6\- #-~]
%+4 #6 -+6 ~&9+6\-6&, \--+} [ (~--4
%+4 \---/64 4-7+ -+6 9( 89& 9( [ 2-5\-4
)- 7]6&6 ,~ #6+~ ,+ ~]6 5+46&}&-7~];
Here is what it should be:
The Road Not Taken
BY ROBERT FROST
Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;
Here is what it decrypts to:
xZe Road NoX xakev
BY RQBuRx wRQyx
xwo roads diverged iv a yeVoVoow woodi
qvd sorry I YouVod voX XraveVo boXZ
qvd be ove XraveVoeri Voovg I sXood
qvd Voooked dowv ove as zar as I YouVod
xo wZere iX bevX iv XZe uvdergrowXZN
Your decryption_library is not correct. For example, you assign the key ')' the value 'T' and also the value 'x'. A Python dict can hold only one value per key, so each duplicate key silently overwrites the earlier entry, and those letters can no longer be decrypted correctly.
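One way to avoid this class of bug is to write the encryption table once (plain -> cipher) and build the decryption table by inverting it, failing loudly on collisions. A minimal sketch; the symbols below are hypothetical stand-ins, not your full table:

# Hypothetical one-to-one encryption table (plain -> cipher).
encryption_library = {'A': '%', 'a': '9', 'T': ')', 'x': '?'}

decryption_library = {}
for plain, cipher in encryption_library.items():
    if cipher in decryption_library:
        # Two plaintext letters mapped to the same symbol: refuse to continue.
        raise ValueError(f'cipher symbol {cipher!r} is used twice')
    decryption_library[cipher] = plain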
The following code is what I've tried so far:
import json
uids = {'483775843796': '"jared trav"','483843796': '"azu jared"', '483843996': '"hello azu"', '44384376': '"bitten virgo"', '48384326': '"bitten hello"', '61063868': '"charm feline voxela derp virgo"', '11136664': '"jessica"', '11485423': '"yukkixxtsuki"', '10401438': '"howen"', '29176667': '"zaku ramba char"', '36976082': '"bulma zelda dame prince"', '99661300': '"voxela"', '76923817': '"juniperrose"', '16179876': '"gnollfighter"', '45012369': '"pianist fuzz t travis blunt trav ttttttttttttttttttyt whole ryann lol tiper cuz"', '62797501': '"asriel"', '73647929': '"voxela"', '95019796': '"dao daoisms"', '70094978': '"mort"', '16233382': '"purrs"', '89270209': '"apocalevie waify"', '42873540': '"tear slash peaches attitude maso lyra juvia innocent"', '61284894': '"pup"', '68487075': '"ninja"', '66451758': '"az"', '23492247': '"vegeta"', '77980169': '"virus"'}
def _whois(string):
    a = []
    for i in uids:
        i = json.loads(uids[i])
        i = i.split()
        if string in i:
            a += i
    for i in uids:
        i = json.loads(uids[i])
        i = i.split()
        if set(i) & set(a):
            a += i
    return list(set(a))

def whois(string):
    a = []
    ret = _whois(string)
    for i in ret:
        a += _whois(i)
    return list(set(a))

print(whois("charm"))
I am trying to match a search term with accounts that share an id with it, then match each of those accounts to accounts that appear under other ids, and so on, so that I can see all of the linked accounts that start from a single term.
For example, if I searched "charm" it would return: "charm feline voxela derp virgo bitten hello" from the example uids above.
After a certain depth in the chain of connected accounts it stops matching. How can I make it follow the chain of matches to any depth?
I think I got it to work:
import json
terms = {'4837759863453450996': '"mamma riyoken"','4833480984509580996': '"mamma heika"','483775980980996': '"nemo heika"','4867568843796': '"control nemo"','4956775843796': '"t control"','483775843796': '"jared trav"','483843796': '"azu jared"', '483843996': '"hello azu"', '44384376': '"bitten virgo"', '48384326': '"bitten hello"', '61063868': '"charm feline voxela derp virgo"', '11136664': '"jessica"', '11485423': '"yukkixxtsuki"', '10401438': '"howen"', '29176667': '"zaku ramba char"', '36976082': '"bulma zelda dame prince"', '99661300': '"voxela"', '76923817': '"juniperrose"', '16179876': '"gnollfighter"', '45012369': '"pianist fuzz t travis blunt trav ttttttttttttttttttyt whole ryann lol tiper cuz"', '62797501': '"asriel"', '73647929': '"voxela"', '95019796': '"dao daoisms"', '70094978': '"mort"', '16233382': '"purrs"', '89270209': '"apocalevie waify"', '42873540': '"tear slash peaches attitude maso lyra juvia innocent"', '61284894': '"pup"', '68487075': '"ninja"', '66451758': '"az"', '23492247': '"vegeta"', '77980169': '"virus"'}
def _search(string):
    a = []
    for i in terms:
        i = json.loads(terms[i])
        i = i.split()
        if string in i:
            a += i
    return list(set(a))

def search(string):
    a = [string]
    while True:
        l = len(a)
        for n in a:
            a += _search(n)
        a = list(set(a))
        if l == len(a):
            break
    return a

print(search("charm"))
Try this:
ids = {'483775843796': '"jared trav"','483843796': '"azu jared"', '483843996': '"hello azu"', '44384376': '"bitten virgo"', '48384326': '"bitten hello"', '61063868': '"charm feline voxela derp virgo"', '11136664': '"jessica"', '11485423': '"yukkixxtsuki"', '10401438': '"howen"', '29176667': '"zaku ramba char"', '36976082': '"bulma zelda dame prince"', '99661300': '"voxela"', '76923817': '"juniperrose"', '16179876': '"gnollfighter"', '45012369': '"pianist fuzz t travis blunt trav ttttttttttttttttttyt whole ryann lol tiper cuz"', '62797501': '"asriel"', '73647929': '"voxela"', '95019796': '"dao daoisms"', '70094978': '"mort"', '16233382': '"purrs"', '89270209': '"apocalevie waify"', '42873540': '"tear slash peaches attitude maso lyra juvia innocent"', '61284894': '"pup"', '68487075': '"ninja"', '66451758': '"az"', '23492247': '"vegeta"', '77980169': '"virus"'}
def find_word(word, id_map):
    for i, j in id_map.items():
        if word.lower() in j.lower():
            print(i, j)

find_word('jared', ids)
Result:
483775843796 "jared trav"
483843796 "azu jared"
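Note that the substring test also matches partial words, so 'trav' would match 'travis'. If you want whole-word matches only, split the value into tokens first; a small variation (find_word_exact is my name, not a library function):

def find_word_exact(word, id_map):
    # Strip the embedded quotes, then compare whole tokens only.
    for uid, names in id_map.items():
        if word.lower() in names.strip('"').lower().split():
            print(uid, names)

find_word_exact('trav', ids)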
I am new to Python and I want to convert a text file into a JSON file.
Here's what it looks like:
#Q Three of these animals hibernate. Which one does not?
^ Sloth
A Mouse
B Sloth
C Frog
D Snake
#Q What is the literal translation of the Greek word Embioptera, which denotes an order of insects, also known as webspinners?
^ Lively wings
A Small wings
B None of these
C Yarn knitter
D Lively wings
#Q There is a separate species of scorpions which have two tails, with a venomous sting on each tail.
^ False
A True
B False
Contd
.
.
.
.
^ means Answer.
I want it in json format as shown below.
Example:
{
    "questionBank": [
        {
            "question": "Grand Central Terminal, Park Avenue, New York is the worlds",
            "a": "largest railway station",
            "b": "Longest railway station",
            "c": "highest railway station",
            "d": "busiest railway station",
            "answer": "largest railway station"
        },
        {
            "question": "Eritrea, which became the 182nd member of the UN in 1993, is in the continent of",
            "a": "Asia",
            "b": "Africa",
            "c": "Europe",
            "d": "Oceania",
            "answer": "Africa"
        }, Contd.....
    ]
}
I came across a few similar posts and here's what I have tried:
dataset = "file.txt"
data = []
with open(dataset) as ds:
for line in ds:
line = line.strip().split(",")
print(line)
To which the output is:
['']
['#Q What part of their body do the insects from order Archaeognatha use to spring up into the air?']
['^ Tail']
['A Antennae']
['B Front legs']
['C Hind legs']
['D Tail']
['']
['#Q What is the literal translation of the Greek word Embioptera', ' which denotes an order of insects', ' also known as webspinners?']
['^ Lively wings']
['A Small wings']
['B None of these']
['C Yarn knitter']
['D Lively wings']
['']
Contd....
The sentences containing commas get split across multiple list elements. I tried to use .join but didn't get the results I was expecting.
Please let me know how to approach this.
dataset = "text.txt"
question_bank = []
with open(dataset) as ds:
for i, line in enumerate(ds):
line = line.strip("\n")
if len(line) == 0:
question_bank.append(question)
question = {}
elif line.startswith("#Q"):
question = {"question": line}
elif line.startswith("^"):
question['answer'] = line.split(" ")[1]
else:
key, val = line.split(" ", 1)
question[key] = val
question_bank.append(question)
print({"questionBank":question_bank})
#for storing json file to local directory
final_output = {"questionBank":question_bank}
with open("output.json", "w") as outfile:
outfile.write(json.dumps(final_output, indent=4))
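To sanity-check the result, you can read the file back in; a quick check, assuming the sample questions above are in text.txt:

with open("output.json") as f:
    bank = json.load(f)["questionBank"]
print(len(bank), "questions parsed")
print(bank[0]["answer"])  # e.g. 'Sloth' for the hibernation question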
Rather than handling the lines one at a time, I went with a regex pattern approach.
This is also more reliable, as it will error out if the input data is in a bad format, rather than silently ignoring a grouping which is missing a field.
PATTERN = r"""[#]Q (?P<question>.+)\n\^ (?P<answer>.+)\nA (?P<option_a>.+)\nB (?P<option_b>.+)\n(?:C (?P<option_c>.+)\n)?(?:D (?P<option_d>.+))?"""
def parse_qa_group(qa_group):
"""
Extact question, answer and 2 to 4 options from input string and return as a dict.
"""
# "group" here is a set of question, answer and options.
matches = PATTERN.search(qa_group)
# "group" here is a regex group.
question = matches.group('question')
answer = matches.group('answer')
try:
c = matches.group('option_c')
except IndexError:
c = None
try:
d = matches.group('option_d')
except IndexError:
d = None
results = {
"question": question,
"answer": answer,
"a": matches.group('option_a'),
"b": matches.group('option_b')
}
if c:
results['c'] = c
if d:
results['d'] = d
return results
# Split into groups using the blank line.
qa_groups = question_answer_str.split('\n\n')
# Process each group, building up a list of all results.
all_results = [parse_qa_group(qa_group) for qa_group in qa_groups]
print(json.dumps(all_results, indent=4))
Further details in my gist. Read more on regex Grouping
I left out reading the text and writing a JSON file.
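For completeness, a minimal sketch of those two missing steps; the file names here are assumptions:

# Hypothetical file names; adjust to your setup.
with open("file.txt") as f:
    question_answer_str = f.read()

qa_groups = question_answer_str.split('\n\n')
all_results = [parse_qa_group(g) for g in qa_groups if g.strip()]

with open("output.json", "w") as f:
    json.dump({"questionBank": all_results}, f, indent=4)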
I want to find stand-alone or successively connected nouns in a text. I put together the code below, but it is neither efficient nor pythonic. Does anybody have a more pythonic way of finding these nouns with spaCy?
The code below builds a dict per token, then runs through the tokens to find stand-alone or connected PROPN or NOUN tokens until the for loop runs out of range. It returns a list of the collected items.
def extract_unnamed_ents(doc):
    """Takes a string and returns a list of all successively connected nouns or proper nouns"""
    nlp_doc = nlp(doc)
    token_list = []
    for token in nlp_doc:
        token_dict = {}
        token_dict['lemma'] = token.lemma_
        token_dict['pos'] = token.pos_
        token_dict['tag'] = token.tag_
        token_list.append(token_dict)
    ents = []
    k = 0
    for i in range(len(token_list)):
        try:
            if token_list[k]['pos'] == 'PROPN' or token_list[k]['pos'] == 'NOUN':
                ent = token_list[k]['lemma']
                if token_list[k+1]['pos'] == 'PROPN' or token_list[k+1]['pos'] == 'NOUN':
                    ent = ent + ' ' + token_list[k+1]['lemma']
                    k += 1
                if token_list[k+1]['pos'] == 'PROPN' or token_list[k+1]['pos'] == 'NOUN':
                    ent = ent + ' ' + token_list[k+1]['lemma']
                    k += 1
                if token_list[k+1]['pos'] == 'PROPN' or token_list[k+1]['pos'] == 'NOUN':
                    ent = ent + ' ' + token_list[k+1]['lemma']
                    k += 1
                if token_list[k+1]['pos'] == 'PROPN' or token_list[k+1]['pos'] == 'NOUN':
                    ent = ent + ' ' + token_list[k+1]['lemma']
                    k += 1
                if ent not in ents:
                    ents.append(ent)
        except IndexError:
            pass
        k += 1
    return ents
Test:
extract_unnamed_ents('Chancellor Angela Merkel and some of her ministers will discuss at a cabinet '
"retreat next week ways to avert driving bans in major cities after Germany's "
'top administrative court in February allowed local authorities to bar '
'heavily polluting diesel cars.')
Out:
['Chancellor Angela Merkel',
'minister',
'cabinet retreat',
'week way',
'ban',
'city',
'Germany',
'court',
'February',
'authority',
'diesel car']
spaCy has a built-in way of doing this (doc.noun_chunks), but I'm not sure it gives you exactly what you are after:
import spacy
text = """Chancellor Angela Merkel and some of her ministers will discuss
at a cabinet retreat next week ways to avert driving bans in
major cities after Germany's top administrative court
in February allowed local authorities to bar heavily
polluting diesel cars.
""".replace('\n', ' ')
nlp = spacy.load("en_core_web_sm")
doc = nlp(text)
print([i.text for i in doc.noun_chunks])
gives
['Chancellor Angela Merkel', 'her ministers', 'a cabinet retreat', 'ways', 'driving bans', 'major cities', "Germany's top administrative court", 'February', 'local authorities', 'heavily polluting diesel cars']
Here, however, the i.lemma_ attribute doesn't really give you what you want (I think this might be fixed by this recent PR).
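If lemmas are all you're missing, one workaround is to join the per-token lemmas inside each chunk yourself (exact output will depend on your spaCy version):

# Join the lemma of each token inside every noun chunk.
print([' '.join(tok.lemma_ for tok in chunk) for chunk in doc.noun_chunks])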
Since it isn't quite what you are after, you could use itertools.groupby like so:
import itertools

out = []
for i, j in itertools.groupby(doc, key=lambda i: i.pos_):
    if i not in ("PROPN", "NOUN"):
        continue
    out.append(' '.join(k.lemma_ for k in j))
print(out)
gives
['Chancellor Angela Merkel', 'minister', 'cabinet retreat', 'week way', 'ban', 'city', 'Germany', 'court', 'February', 'authority', 'diesel car']
This should give you exactly the same output as your function (the output is slightly different here but I believe this is due to different spacy versions).
If you are feeling really adventurous you could use a list comprehension
out = [' '.join(k.lemma_ for k in j)
       for i, j in itertools.groupby(doc, key=lambda i: i.pos_)
       if i in ("PROPN", "NOUN")]
Note I see slightly different results with different spacy versions. The output above is from spacy-2.1.8
(Excuse me for my English, I'm also new at this, so be gentle, thank you)
I'm trying to extract a logical statement (SWRL) from any possible sentence that contains actions and conditions.
This is the kind of logical statement I'd like to obtain:
IF (CONDITION) THEN (ACTION | NOT ACTION | ACTION OR NOT ACTION)
I've been trying to apply some NLP techniques with spaCy and the Stanford NLP library, but my lack of knowledge about English grammatical structures makes it almost impossible for me.
I'd like to know if someone could help me with this research, either with ideas or with libraries I may not know about.
For example:
import nltk
import spacy
nlp = spacy.load('en_core_web_sm')
sent="The speed limit is 90 kilometres per hour on roads outside built-up areas."
doc=nlp(sent)
Obtaining the root:
def sent_root(sent):
    for index, token in enumerate(sent):
        if token.head == token:
            return token, index
Out: (is, 3)
Obtaining the subject:
def sent_subj(sent):
    for index, token in enumerate(sent):
        if token.dep_ == 'nsubj':
            return token, index
Out: (limit, 2)
Obtaining the children (dependencies of the word):
def sent_child(token):
    complete_subj = ''
    for child in token.children:
        if child.is_punct == False:
            if child.dep_ == 'compound':
                complete_subj += child.text + ' ' + token.text + ' '
            else:
                complete_subj += child.text + ' '
            for child_token in child.children:
                if child.is_punct == False:
                    complete_subj += child_token.text + ' '
    return complete_subj
Out: 'The speed limit '
Doc ents + root:
def doc_ents_root(sent, root):
    ents_root = root.text + ' '
    for token in sent.ents:
        ents_root += token.text + ' '
    return ents_root
Out: 'is 90 kilometres per hour '
Extracting the action:
def action(sent):
    # Obtaining the sent root
    root, root_idx = sent_root(sent)
    # Obtaining the subject
    subj, subj_idx = sent_subj(sent)
    # Obtaining the whole subject (subj + comps)
    complete_subj = sent_child(subj)
    complete_ents = doc_ents_root(sent, root)
    return complete_subj + complete_ents
Applying all the functions:
action(doc)
Out: 'A traffic light with signal indicates '
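For the IF-part I've also been experimenting with spaCy's dependency labels: subordinate clauses are tagged advcl and their subordinating conjunction ("if") is tagged mark, so a candidate condition clause can be pulled out of the subtree. A rough sketch, not a full SWRL extractor:

def find_condition(doc):
    # Look for a subordinate clause introduced by "if": the conjunction
    # carries dep_ == 'mark' and hangs off the verb heading the clause.
    for token in doc:
        if token.dep_ == 'mark' and token.text.lower() == 'if':
            clause = sorted(token.head.subtree, key=lambda t: t.i)
            return ' '.join(t.text for t in clause)
    return None

cond_doc = nlp("If the light is red, you must stop.")
print(find_condition(cond_doc))  # expected something like: If the light is red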
I've been working on a function which will update two dictionaries (similar authors, and awards they've won) from an open text file. The text file looks something like this:
Brabudy, Ray
Hugo Award
Nebula Award
Saturn Award
Ellison, Harlan
Heinlein, Robert
Asimov, Isaac
Clarke, Arthur
Ellison, Harlan
Nebula Award
Hugo Award
Locus Award
Stephenson, Neil
Vonnegut, Kurt
Morgan, Richard
Adams, Douglas
And so on. The first line of each block is an author's name (last name first, first name last), followed by awards they may have won, and then authors who are similar to them. This is what I've got so far:
def load_author_dicts(text_file, similar_authors, awards_authors):
    name_of_author = True
    awards = False
    similar = False
    for line in text_file:
        if name_of_author:
            author = line.split(', ')
            nameA = author[1].strip() + ' ' + author[0].strip()
            name_of_author = False
            awards = True
            continue
        if awards:
            if ',' in line:
                awards = False
                similar = True
            else:
                if nameA in awards_authors:
                    listawards = awards_authors[nameA]
                    listawards.append(line.strip())
                else:
                    listawards = []
                    listawards.append(line.strip())
                    awards_authors[nameA] = listawards
        if similar:
            if line == '\n':
                similar = False
                name_of_author = True
            else:
                sim_author = line.split(', ')
                nameS = sim_author[1].strip() + ' ' + sim_author[0].strip()
                if nameA in similar_authors:
                    similar_list = similar_authors[nameA]
                    similar_list.append(nameS)
                else:
                    similar_list = []
                    similar_list.append(nameS)
                    similar_authors[nameA] = similar_list
            continue
This works great! However, if the text file contains an entry with just a name (i.e. no awards and no similar authors), it screws the whole thing up, generating an IndexError: list index out of range at this part: nameS = sim_author[1].strip() + ' ' + sim_author[0].strip()
How can I fix this? Maybe with a try/except block in that area?
Also, I wouldn't mind getting rid of those continue statements; I wasn't sure how else to keep it going. I'm still pretty new to this, so any help would be much appreciated! I keep trying stuff and it changes another section I didn't want changed, so I figured I'd ask the experts.
How about doing it this way, just to get the data in; then you can manipulate the dictionary any way you want.
test.txt contains your data
Brabudy, Ray
Hugo Award
Nebula Award
Saturn Award
Ellison, Harlan
Heinlein, Robert
Asimov, Isaac
Clarke, Arthur
Ellison, Harlan
Nebula Award
Hugo Award
Locus Award
Stephenson, Neil
Vonnegut, Kurt
Morgan, Richard
Adams, Douglas
And my code to parse it.
award_parse.py
data = {}
name = ""
awards = []
f = open("test.txt")
for l in f:
    # make sure the line is not blank; don't process blank lines
    if not l.strip() == "":
        # if this is a name and we're not already working on an author, then set the author;
        # otherwise treat this as a new author and store the existing author as a key in the dictionary
        if "," in l and len(name) == 0:
            name = l.strip()
        elif "," in l and len(name) > 0:
            # check to see if the recipient is already in the dict; add to the end of the
            # existing list if he/she already exists.
            if not name.strip() in data:
                data[name] = awards
            else:
                data[name].extend(awards)
            name = l.strip()
            awards = []
        # process any lines that are not blank and do not have a ","
        else:
            awards.append(l.strip())
# store the final author collected when the file ends
if name:
    if name not in data:
        data[name] = awards
    else:
        data[name].extend(awards)
f.close()
for k, v in data.items():
    print("%s got the following awards: %s" % (k, v))