I'm trying to write a book cipher decoder, and the following is what I've got so far.
code = open("code.txt", "r").read()
my_book = open("book.txt", "r").read()
book = my_book  # the book text is already a plain string
code_line = 0
while code_line < 6:
    sl = code.split('\n')[code_line] + '\n'
    paragraph_num = sl.split(' ')[0]
    line_num = sl.split(' ')[1]
    word_num = sl.split(' ')[2]
    code_line = code_line + 1  # move on to the next row of the code file
The loop changes the following variables:
paragraph
line
word
Everything is working just fine.
What I need now is a way to pick out the paragraph, then the line, then the word.
A for loop inside the while loop would work perfectly.
So from paragraph number "paragraph_num" and line number "line_num", I want to get word number "word_num".
This is my code file, which I'm trying to convert into words:
"paragraph number","line number","word number"
70 1 3
50 2 2
21 2 9
28 1 6
71 2 2
27 1 4
Then I want my output to look something like this:
word1
word2
word3
word4
word5
word6
By the way, my book (the file that I need to get the words from) looks something like this:
word1 word2 word3
word4 word5 word6...
...word.. word.. last word
(The words are not identical)
Related: How to count paragraphs?
You already know how to read in the book file, break it into lines, and break each of those into words.
If paragraphs are defined as being separated by "\n\n", you can split the contents of the book file on that, and break each paragraph into lines. Or, after you break the book into lines, any empty line signals a change of paragraph.
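The approach described above can be sketched as follows. This is a minimal sketch, assuming paragraphs are separated by exactly one blank line and that the paragraph, line and word numbers in the code file are 1-based; the function names `lookup_word` and `decode` are made up for illustration.

```python
def lookup_word(book_text, paragraph_num, line_num, word_num):
    # Paragraphs are assumed to be separated by one blank line ("\n\n").
    paragraphs = [p for p in book_text.split("\n\n") if p.strip()]
    # All three numbers are assumed to be 1-based, hence the "- 1".
    lines = paragraphs[paragraph_num - 1].splitlines()
    words = lines[line_num - 1].split()
    return words[word_num - 1]


def decode(book_text, code_text):
    decoded = []
    for row in code_text.splitlines():
        if not row.strip():
            continue  # skip blank rows in the code file
        paragraph_num, line_num, word_num = (int(n) for n in row.split())
        decoded.append(lookup_word(book_text, paragraph_num, line_num, word_num))
    return decoded


# Tiny stand-in for book.txt and code.txt
book_text = "alpha beta gamma\ndelta epsilon zeta\n\neta theta iota\nkappa lambda mu"
code_text = "1 2 3\n2 1 1\n2 2 2"
print(decode(book_text, code_text))  # ['zeta', 'eta', 'lambda']
```

With real files, `book_text` and `code_text` would come from `open("book.txt").read()` and `open("code.txt").read()`. An out-of-range number raises `IndexError`, which is probably what you want while debugging a code file.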
This may be quite a late answer, but better late than never, I guess.
I completed a book cipher implementation that I believe does exactly what you are asking for.
It takes a book file (in my example test run at the bottom, "Shakespeare.txt") and a message (words).
It finds the index of each typed-in word within the book and prints out those indexes.
Given the indexes, it gets the same words back from the book.
Give it a look? Hope it helps!
I worked like crazy on this one. It took me literally years to complete!
Have a great day!
I believe an answer should come with working code, or at least an attempt in that direction. That's why I'm providing this code too.
I really hope it helps both you and future viewers!
MAJOR EDIT:
Edit 1: Providing the code here; easier for future viewers, and hopefully for you:
Main Program:
Originally from my GitHub: https://github.com/loneicewolf/Book-Cipher-Python
I shortened it and removed things that weren't needed in this case, to make it more 'elegant' (hopefully it worked).
# Replace "document1.txt" with whatever your book / document's name is.
BOOK = "document1.txt"  # contains your "Word Word Word Word ...." (I assumed from the start that the words are not all the same)

# Read the book and return it as a list of words.
def GetBookContent(BOOK):
    with open(BOOK, "r") as ReadBook:
        return ReadBook.read().split()

boktxt = GetBookContent(BOOK)

# Encode: print the index of the first occurrence of each typed word.
words = input("input text: ").split()
print("\nyou entered these words:\n", words)
for word in words:
    print(boktxt.index(word))  # raises ValueError if the word is not in the book

# Decode: print the book word stored at each index.
klist = input("input key-sequence sep. With spaces: ").split()
for key in klist:
    print(boktxt[int(key)])
TEST ADDED:
EDIT: I think I should provide a sample run with a book, to show it in action at least. Sorry for not providing this sooner:
I executed the Python script and used Shakespeare.txt as my 'book' file.
input text: King of dragon, lord of gold, queen of time has a secret, which 3 or three, can hold if two of them are dead
(I added a digit too, to show that it works with digits as well, in case somebody wonders in the future.)
and it outputs your book code:
27978 130 17479 2974 130 23081 24481 130 726 202 61 64760 278 106853 1204 38417 256 8204 97 6394 130 147 16 17084
For example:
27978 means the word at index 27978 in Shakespeare.txt (counting from 0)
To decrypt it, you feed in the book code and it outputs the plain text! (the words you originally typed in)
input key-sequence sep. With spaces: 27978 130 17479 2974 130 23081 24481 130 726 202 61 64760 278 106853 1204 38417 256 8204 97 6394 130 147 16 17084
-> it outputs ->
King of dragon, lord of gold, queen of time has a secret, which 3 or three, can hold if two of them are dead
//Wishes
William.
Related
So I have to take the numbers from a certain file
containing:
1 5
2 300
3 3
9 155
7 73
7 0
Multiply them and add them to a new file
I used the script below, but for some reason it now gives a syntax error.
f=open('multiply.txt')
f2=open('resulted.txt','w')
while True:
    line=f.readline()
    if len(line)==0:
        break
    line=line.strip()
    result=line.split(" ")
    multiply=int(result[0])*int(result[1])
    multiply=str(multiply)
    answer=print(result[0],"*",result[1],"=",multiply)
    f2.write(str(multiply))
f.close()
f2.close()
I found out that f2.write(multiply) works,
but I get all the answers as one string (5600913955110).
How do I get a proper text file with the right calculation on each line?
Update:
f=open('multiply.txt')
f2=open('result.txt','w')
while True:
    line=f.readline()
    if len(line)==0:
        break
    line=line.strip()
    result=line.split(" ")
    multiply=int(result[0])*int(result[1])
    multiply=str(multiply)
    answer=print(result[0],"*",result[1],"=",multiply)
    answer=str(answer)
    f2.write(str(answer))
    f2.write(str(multiply))
f.close()
f2.close()
output:
None5None600None9None1395None511None0
At the end of the code you have this line:
f2.write(str(answer)
Notice that there is only one ) at the end, but there are two ( in the line.
Try this:
f2.write(str(answer))
Also, the title of the post sounds like it's inviting opinion-based responses. Try to change it so it mentions the problem at hand rather than your friend.
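In most programming languages there are escape sequences, which let you put special characters inside a string. In your case you need to add the newline escape sequence
"\n"
This will end the line after each thing you write to the file.
Also build the string yourself instead of storing the result of print(), which always returns None (that is where the "None" in your output comes from), like this: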
answer=str(result[0])+"*"+str(result[1])+"="+str(multiply)
print(answer)
f2.write(str(answer)+"\n")
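Putting the pieces together, the whole task can be sketched like this. It is a minimal sketch, assuming each input line holds exactly two whitespace-separated integers; the helper names `multiply_lines` and `multiply_file` are made up for illustration.

```python
def multiply_lines(lines):
    """Turn lines like '2 300' into strings like '2 * 300 = 600'."""
    results = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        a, b = int(parts[0]), int(parts[1])
        results.append(f"{a} * {b} = {a * b}")
    return results


def multiply_file(src_path, dst_path):
    # "with" closes both files automatically, even if an error occurs.
    with open(src_path) as src, open(dst_path, "w") as dst:
        for row in multiply_lines(src):
            dst.write(row + "\n")  # one calculation per line


print(multiply_lines(["1 5\n", "2 300\n", "7 0\n"]))
# ['1 * 5 = 5', '2 * 300 = 600', '7 * 0 = 0']
```

Calling `multiply_file('multiply.txt', 'result.txt')` would then write one calculation per line to the result file.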
I'm trying to have my program read a single line formed by words separated by commas. For example if we have:
hello,cat,man,hey,dog,boy,Hello,man,cat,woman,dog,Cat,hey,boy
in the input file, the program would need to separate each word on a single line and ditch the commas. After that the program would count frequencies of the words in the input file.
f = open('input1.csv') # create file object
userInput = f.read()
seperated = userInput.split(',')
for word in seperated:
    freq = seperated.count(word)
    print(word, freq)
The problem with this code is it prints the initial count for the same word that's counted twice. The output for this program would be:
hello 1
cat 2
man 2
hey 2
dog 2
boy 1
Hello 1
man 2
cat 2
woman 1
dog 2
Cat 1
hey 2
boy
1
The correct output would be:
hello 1
cat 2
man 2
hey 2
dog 2
boy 2
Hello 1
woman 1
Cat 1
Question is how do I make my output look more polished by having the final count instead of the initial one?
This is a common pattern and a core programming skill. You should try collecting and counting words in a dictionary each time you encounter them. I'll give you the idea, but it's best you practise the exact implementation yourself. Happy hacking!
(I also recommend the "pretty print" module, pprint, from Python's standard library.)
import pprint
from collections import defaultdict

word_dict = defaultdict(int)  # missing keys start at 0
for word in seperated:  # the list of words split from the file
    word_dict[word] += 1
pprint.pprint(dict(word_dict))
A couple of extra tips: you may want to f.close() your file when you're finished, (E: I misread so disregard the rest...) and it looks like you want to convert your words to lower case so that different capitalisations aren't counted separately. There are Python built-in methods to do this, which you can find by searching.
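As an alternative to a hand-rolled dictionary, the standard library's `collections.Counter` does the counting in one step. This is a minimal sketch; the function name `word_frequencies` is made up for illustration, and stripping whitespace from each word is an assumption made so that a trailing newline in the file does not turn the last "boy" into a separate word.

```python
from collections import Counter


def word_frequencies(text):
    # Strip each word so that "boy\n" at the end of the file counts as "boy".
    words = [w.strip() for w in text.split(",") if w.strip()]
    return Counter(words)


text = "hello,cat,man,hey,dog,boy,Hello,man,cat,woman,dog,Cat,hey,boy\n"
counts = word_frequencies(text)
# A Counter keeps keys in first-seen order (Python 3.7+),
# so each word is printed once, with its final count.
for word, freq in counts.items():
    print(word, freq)
```

Capitalisation is left alone here, so "hello" and "Hello" stay separate, as in the expected output of the question.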
Try using a dictionary:
f = open('input1.csv') # create file object
userInput = f.read()
seperated = userInput.split(',')
wordsDict = {}
for word in seperated:
    if word not in wordsDict:
        wordsDict[word] = 1
    else:
        wordsDict[word] = wordsDict[word] + 1
for i in wordsDict:
    print(i, wordsDict[i])
Create a new dictionary, and add each word as a key with its count as the value:
count_dict={}
for w in seperated:
    count_dict[w]=seperated.count(w)
for key,value in count_dict.items():
    print(key,value)
I don't really know how to word the question, but I have this file with a number and a decimal next to it, like so (the file name is num.txt):
33 0.239
78 0.298
85 1.993
96 0.985
107 1.323
108 1.000
I have this list of numbers. I want to find each of them in the file, take the decimal number next to it, and append that to a list:
['78','85','108']
Here is my code so far:
chosen_number = ['78','85','108']
memory_list = []
for line in open('path/to/num.txt'):
    checker = line[0:2]
    if not checker in chosen_number: continue
    dec = line.split()[-1]
    memory_list.append(float(dec))
The error they give to me is that it is not in a list and they only account for the 3 digit numbers. I don't really understand why this is happening and would like some tips to know how to fix it. Thanks.
As for the error, there is no actual error. The only problem is that they ignore the two digit numbers and only get the three digit numbers. I want them to get both the 2 and 3 digit numbers. For example, the script would pass 78 and 85, going to the line with '108'.
Your checker is undefined. The below code works.
N.B. I have used startswith because, the number might appear elsewhere in the line.
chosen_number = ['78','85','108']
memory_list = []
with open('path/to/num.txt') as f:
    for line in f:
        if any(line.startswith(i) for i in chosen_number):
            memory_list.append(float(line.split()[1]))
print(memory_list)
Output:
[0.298, 1.993, 1.0]
The following should work:
chosen_number = ['78','85','108']
memory_list = []
with open('num.txt') as f_input:
    for line in f_input:
        v1, v2 = line.split()
        if v1 in chosen_number:
            memory_list.append(float(v2))
print(memory_list)
Giving you:
[0.298, 1.993, 1.0]
Also, it is better to use a with statement when dealing with files so that the file is automatically closed afterwards.
Try to use this code:
chosen_number = ['78 ', '85 ', '108 ']
memory_list = []
for line in open("num.txt"):
    for num in chosen_number:
        if num in line:
            dec = line.split()[-1]
            memory_list.append(float(dec))
In chosen_number, I declared the numbers with a trailing space: '85 '. Otherwise, when 0.985 is found, the if condition would be true for '85', because the values are compared as strings. I hope that's clear enough.
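The answers above differ mainly in how they match a line against the chosen numbers. Prefix or substring matching can pick up unintended lines (for example, a line starting with 785 also starts with '78'), so comparing the whole first field is the safest option. A minimal sketch, assuming every data line has the number in the first column and the decimal in the second; the function name `pick_values` is made up for illustration.

```python
def pick_values(lines, chosen):
    wanted = set(chosen)  # set membership tests are fast and exact
    values = []
    for line in lines:
        fields = line.split()
        # Compare the whole first field, not a prefix or a substring.
        if len(fields) >= 2 and fields[0] in wanted:
            values.append(float(fields[1]))
    return values


sample = [
    "33 0.239\n", "78 0.298\n", "85 1.993\n", "96 0.985\n",
    "107 1.323\n", "108 1.000\n",
    "785 9.900\n",  # extra line (not in the original file) to show it is NOT matched
]
print(pick_values(sample, ['78', '85', '108']))  # [0.298, 1.993, 1.0]
```

With the real file, pass the open file object instead of `sample`, e.g. `with open('num.txt') as f: memory_list = pick_values(f, chosen_number)`.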
I'm not that experienced with code and have a question pertaining to my GCSE Computer Science controlled assessment. I have got pretty far, it's just this last hurdle is holding me up.
This task requires me to use a previously made simple file compression system, and to "Develop a program that builds [upon it] to compress a text file with several sentences, including punctation. The program should be able to compress a file into a list of words and list of positions to recreate the original file. It should also be able to take a compressed file and recreate the full text, including punctuation and capitalisation, of the original file".
So far, I have made it possible to store everything as a text file with my first program:
sentence = input("Enter a sentence: ")
sentence = sentence.split()
uniquewords = []
for word in sentence:
    if word not in uniquewords:
        uniquewords.append(word)
positions = [uniquewords.index(word) for word in sentence]
recreated = " ".join([uniquewords[i] for i in positions])
print (uniquewords)
print (recreated)
positions=str(positions)
uniquewords=str(uniquewords)
# raw strings (r"...") keep the backslashes in the Windows paths literal
positionlist= open(r"H:\Python\ControlledAssessment3\PositionList.txt","w")
positionlist.write(positions)
positionlist.close()  # close needs parentheses to actually be called
wordlist=open(r"H:\Python\ControlledAssessment3\WordList.txt","w")
wordlist.write(uniquewords)
wordlist.close()
This makes everything into lists, and converts them into a string so that it is possible to write into a text document. Now, program number 2 is where the issue lies:
uniquewords=open("H:\Python\ControlledAssessment3\WordList.txt","r")
uniquewords= uniquewords.read()
positions=open("H:\Python\ControlledAssessment3\PositionList.txt","r")
positions=positions.read()
positions= [int(i) for i in positions]
print(uniquewords)
print (positions)
recreated = " ".join([uniquewords[i] for i in positions])
FinalSentence=open("H:\Python\ControlledAssessment3\ReconstructedSentence.txt","w")
FinalSentence.write(recreated)
FinalSentence.write('\n')
FinalSentence.close()
When I try and run this code, this error appears:
Traceback (most recent call last):
File "H:\Python\Task 3 Test 1.py", line 7, in <module>
positions= [int(i) for i in positions]
File "H:\Python\Task 3 Test 1.py", line 7, in <listcomp>
positions= [int(i) for i in positions]
ValueError: invalid literal for int() with base 10: '['
So, how do you suppose I get the second program to recompile the text into the sentence? Thanks, and I'm sorry if this was a lengthy post, I've spent forever trying to get this working.
I'm assuming this is something to do with the list that has been converted into a string including brackets, commas, and spaces etc. so is there a way to revert both strings back into their original state so I can recreate the sentence? Thanks.
So firstly, it is a bit strange to save positions as the string form of a list; you should save each element separately (same with uniquewords). With this in mind, something like:
program1.py:
sentence = input("Type sentence: ")
# this is a test this is a test this is a hello goodbye yes 1 2 3 123
sentence = sentence.split()
uniquewords = []
for word in sentence:
    if word not in uniquewords:
        uniquewords.append(word)
positions = [uniquewords.index(word) for word in sentence]
with open("PositionList.txt","w") as f:
    for i in positions:
        f.write(str(i)+' ')
with open("WordList.txt","w") as f:
    for i in uniquewords:
        f.write(str(i)+' ')
program2.py:
with open("PositionList.txt","r") as f:
    data = f.read().split(' ')
    positions = [int(i) for i in data if i!='']
with open("WordList.txt","r") as f:
    uniquewords = f.read().split(' ')
sentence = " ".join([uniquewords[i] for i in positions])
print(sentence)
PositionList.txt
0 1 2 3 0 1 2 3 0 1 2 4 5 6 7 8 9 10
WordList.txt
this is a test hello goodbye yes 1 2 3 123
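If you would rather keep the two lists as lists, another option is to serialise them with the standard `json` module, which turns a list into text and back without any manual parsing of brackets and commas. This is a minimal sketch and not part of the answer above: the function names and the "words"/"positions" keys are made up for illustration, and the JSON text is kept in a string here (writing it to a file with `f.write(...)` and reading it back with `f.read()` works the same way).

```python
import json


def compress(text):
    # Splitting on whitespace keeps punctuation and capitalisation attached
    # to each word, so "you," and "you" are stored as different words.
    words = text.split()
    unique_words = []
    for word in words:
        if word not in unique_words:
            unique_words.append(word)
    positions = [unique_words.index(word) for word in words]
    return unique_words, positions


def to_json(unique_words, positions):
    return json.dumps({"words": unique_words, "positions": positions})


def from_json(serialized):
    data = json.loads(serialized)  # gives back real lists, not strings
    return data["words"], data["positions"]


def recreate(unique_words, positions):
    return " ".join(unique_words[i] for i in positions)


original = "Ask not what your country can do for you, ask what you can do"
serialized = to_json(*compress(original))
print(recreate(*from_json(serialized)) == original)  # True
```

Note that, like the original programs, this only preserves single spaces between words; line breaks and repeated spaces in the input are not recorded.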
Hey there, I have a rather large file that I want to process using Python and I'm kind of stuck as to how to do it.
The format of my file is like this:
0 xxx xxxx xxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
1 xxx xxxx xxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
So I basically want to read in the chunk up from 0-1, do my processing on it, then move on to the chunk between 1 and 2.
So far I've tried using a regex to match the number and then keep iterating, but I'm sure there has to be a better way of going about this. Any suggestion/info would be greatly appreciated.
If they are all within the same line, that is, there are no line breaks between "1." and "2.", then you can iterate over the lines of the file like this:
for line in open("myfile.txt"):
    pass  # do stuff
The line will be disposed of and overwritten at each iteration, meaning you can handle large file sizes with ease. If they're not on the same line:
import re
parsed_line = ""
for line in open("myfile.txt"):
    if re.match(r"\d+ ", line):  # regex to match the start of a new chunk
        # parsed_line now holds the previous chunk; process it here
        parsed_line = line
    else:
        parsed_line += line
and the rest of your code.
Why don't you just read the file char by char using file.read(1)?
Then, you could - in each iteration - check whether you arrived at the char 1. Then you have to make sure that storing the string is fast.
If the "N " can only start a line, then why not use the "simple" solution? (It sounds like this is already being done; I am trying to reinforce/support it ;-))
That is, just read a line at a time, and build up the data representing the current N object. After, say, N=0 and N=1 are loaded, process them together, then move on to the next pair (N=2, N=3). The only thing that is even remotely tricky is making sure not to throw out a read line. (The line read that determined the end condition, e.g. "N ", also contains the data for the next N.)
Unless seeking is required (or IO caching is disabled or there is an absurd amount of data per item), there is really no reason not to use readline AFAIK.
Happy coding.
Here is some off-the-cuff code, so treat it as a sketch. In any case, it shows the general idea using a minimized side-effect approach.
import re

# given an input and previous item data, return either
# [item_number, data, next_overflow] if another item is read
# or None if there are no more items
def read_item(inp, overflow):
    data = overflow or ""

    # this can be replaced with any method to "read the header"
    # the regex is just "the easiest". the contract is just:
    # given "N ....", return N. given anything else, return None
    def get_num(d):
        m = re.match(r"(\d+) ", d)
        return int(m.group(1)) if m else None

    for line in inp:
        if data and get_num(line) is not None:
            # already in an item (have data); current line "overflows".
            # item number is still at start of current data
            return [get_num(data), data, line]
        # not in item, or new item not found yet
        data += line

    # end of input, with data. only returns above
    # if a "new" item was encountered; this covers case of
    # no more items (or no items at all)
    if data:
        return [get_num(data), data, None]
    else:
        return None
And usage might be akin to the following, where f represents an open file:
# check for error conditions (e.g. None returned)
# note feed-through of "overflow"
num1, data1, overflow = read_item(f, None)
num2, data2, overflow = read_item(f, overflow)
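The same line-at-a-time idea can also be written as a generator, which hides the "overflow" bookkeeping from the caller. This is a minimal sketch rather than the code above: the name `iter_chunks` is made up for illustration, and it assumes a chunk starts at any line that begins with digits followed by a space, so a continuation line that happens to start that way would be treated as a new chunk.

```python
import re

HEADER = re.compile(r"\d+ ")  # assumed chunk header: digits, then a space


def iter_chunks(lines):
    """Yield one chunk (header line plus continuation lines) at a time."""
    chunk = []
    for line in lines:
        if HEADER.match(line) and chunk:
            yield "".join(chunk)  # the previous chunk is complete
            chunk = []
        chunk.append(line)
    if chunk:
        yield "".join(chunk)  # last chunk, ended by end of input


sample = ["0 aaa\n", "bbb\n", "1 ccc\n", "2 ddd\n", "eee\n"]
for piece in iter_chunks(sample):
    print(repr(piece))
```

Because an open file is itself an iterator over lines, `for piece in iter_chunks(open("myfile.txt")):` would process a large file while holding only one chunk in memory.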
If the format is fixed, why not just read 3 lines at a time with readline()
If the file is small, you could read the whole file in and split() on number digits (might want to use strip() to get rid of whitespace and newlines), then fold over the list to process each string in the list. You'll probably have to check that the resultant string you are processing on is not initially empty in case two digits were next to each other.
If the file's content can be loaded in memory, and that's what you answered, then the following code (needs to have filename defined) may be a solution.
import re

# one match = one line-start number plus everything up to the next line that starts with a digit
regx = re.compile(r'^((\d+).*?)(?=^\d|\Z)', re.DOTALL|re.MULTILINE)

with open(filename) as f:
    text = f.read()

def treat(inp, regx=regx):
    m1 = regx.search(inp)
    numb, chunk = m1.group(2, 1)
    li = [chunk]
    for mat in regx.finditer(inp, m1.end()):
        n, ch = mat.group(2, 1)
        # a new chunk starts only when the number is the previous one plus 1
        if int(n) == int(numb) + 1:
            yield ''.join(li)
            numb = n
            li = []
        li.append(ch)
        chunk = ch
    yield ''.join(li)

for y in treat(text):
    print(repr(y))
This code, run on a file containing:
1 mountain
orange 2
apple
produce
2 gas
solemn
enlightment
protectorate
3 grimace
song
4 snow
wheat
51 guludururu
kelemekinonoto
52asabi dabada
5 yellow
6 pink
music
air
7 guitar
blank 8
8 Canada
9 Rimini
produces:
'1 mountain\norange 2\napple\nproduce\n'
'2 gas\nsolemn\nenlightment\nprotectorate\n'
'3 grimace\nsong\n'
'4 snow\nwheat\n51 guludururu\nkelemekinonoto\n52asabi dabada\n'
'5 yellow\n'
'6 pink \nmusic\nair\n'
'7 guitar\nblank 8\n'
'8 Canada\n'
'9 Rimini'