Python - find pattern in file in max 4 lines of code - python

I have the following task: I have to find a specific pattern (a word) in my file.txt (a song, centered on the page) and print the row number plus each row that contains the pattern, with the leading spaces removed.
You can see the correct output here:
 92 Meant in croaking "Nevermore."
 99 She shall press, ah, nevermore!
107 Quoth the Raven, "Nevermore."
115 Quoth the Raven, "Nevermore."
Without the extra leading space added by my_str += ' '+str(count)+ ' ' + line.lstrip(), it prints:
92 Meant in croaking "Nevermore."
99 She shall press, ah, nevermore!
107 Quoth the Raven, "Nevermore."
115 Quoth the Raven, "Nevermore."
This is my code, but I want to have only 4 lines of code:
```python
def find_in_file(pattern,filename):
    my_str = ''
    with open(filename, 'r') as file:
        for count,line in enumerate(file):
            if pattern in line.lower():
                if count >= 10 and count <= 99:
                    my_str += ' '+str(count)+ ' ' + line.lstrip()
                else:
                    my_str += str(count)+ ' ' + line.lstrip()
    print(my_str)
```

In fact, it can be done in one line:
''.join(f' {count} {line.lstrip()}' if 10 <= count <= 99 else f'{count} {line.lstrip()}' for count, line in enumerate(file) if pattern in line.lower())
However, this seems a little too long...
As suggested in the comments, it can be simplified:
''.join(f'{count:3} {line.lstrip()}' for count, line in enumerate(file) if pattern in line.lower())

def find_in_file(pattern,filename):
    with open(filename, 'r') as file:
        # 0 based line numbering, for 1 based use enumerate(file,1)
        for count,line in enumerate(file):
            if pattern in line.lower():
                print(f"{count:>3} {line.strip()}")
would be 4 lines of code (inside the function) and should be equivalent to what you got.
Possible as a single expression as well (once the file is open):
def find_in_file(pattern,filename):
    # 1 based line numbering
    with open(filename) as file:
        return '\n'.join(f'{count:>3} {line.strip()}' for count, line in enumerate(file,1) if pattern in line.lower())
See Python's format specification mini-language.
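As a small illustration of the width and alignment part of that format specification, here is a sketch that applies {count:>3} to a few in-memory lines instead of a real file. The helper name format_matches and the sample lines are made up for this example.

```python
def format_matches(pattern, lines, start=0):
    """Return 'NNN text' strings for each line containing pattern (case-insensitive)."""
    return [
        # ">3" right-aligns the number in a field that is 3 characters wide
        f"{count:>3} {line.strip()}"
        for count, line in enumerate(lines, start)
        if pattern in line.lower()
    ]

# Hypothetical sample data: centered lines, as in the question.
sample = [
    '      Quoth the Raven, "Nevermore."\n',
    '      something else\n',
    '           ah, nevermore!\n',
]

for row in format_matches("nevermore", sample, start=9):
    print(row)
```

With start=9 the two matching lines are numbered 9 and 11, so the one-digit number is padded with two spaces and the two-digit number with one.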

You can use formatted strings to make sure the numbers always use three characters, even when they have only 1 or 2 digits.
I also prefer to use str.strip rather than str.lstrip, to get rid of trailing whitespace; in particular, lines read from the file will typically end with a linebreak, and then print will add a second linebreak, and we end up with too many linebreaks if we don't strip them away.
def find_in_file(pattern,filename):
    with open(filename, 'r') as file:
        for count,line in enumerate(file):
            if pattern in line.lower():
                print('{:3d} {}'.format(count, line.strip()))
find_in_file('nevermore','theraven.txt')
# 55 Quoth the Raven "Nevermore."
# 62 With such name as "Nevermore."
# 69 Then the bird said "Nevermore."
# 76 Of 'Never—nevermore'."
# 83 Meant in croaking "Nevermore."
# 90 She shall press, ah, nevermore!
# 97 Quoth the Raven "Nevermore."
# 104 Quoth the Raven "Nevermore."
# 111 Quoth the Raven "Nevermore."
# 118 Quoth the Raven "Nevermore."
# 125 Shall be lifted—nevermore!

Related

Looking at a list of numbers and getting that number from another file

I don't really know how to word the question, but I have this file with a number and a decimal next to it, like so (the file name is num.txt):
33 0.239
78 0.298
85 1.993
96 0.985
107 1.323
108 1.000
I have a list of numbers that I want to look up in the file; for each one found, I want to take the decimal next to it and append it to a list:
['78','85','108']
Here is my code so far:
chosen_number = ['78','85','108']
memory_list = []
for line in open('path/to/num.txt'):
    checker = line[0:2]
    if not checker in chosen_number: continue
    dec = line.split()[-1]
    memory_list.append(float(dec))
The problem I get is that the numbers are reported as not being in the list, and only the 3 digit numbers are accounted for. I don't really understand why this is happening and would like some tips on how to fix it. Thanks.
To be clear, there is no actual error message. The only problem is that the script ignores the two digit numbers and only gets the three digit numbers. I want it to get both the 2 and 3 digit numbers. For example, the script would skip 78 and 85 and go straight to the line with '108'.
Your checker is undefined. The below code works.
N.B. I have used startswith because the number might appear elsewhere in the line.
chosen_number = ['78','85','108']
memory_list = []
with open('path/to/num.txt') as f:
    for line in f:
        if any(line.startswith(i) for i in chosen_number):
            memory_list.append(float(line.split()[1]))
print(memory_list)
Output:
[0.298, 1.993, 1.0]
The following should work:
chosen_number = ['78','85','108']
memory_list = []
with open('num.txt') as f_input:
    for line in f_input:
        v1, v2 = line.split()
        if v1 in chosen_number:
            memory_list.append(float(v2))
print(memory_list)
Giving you:
[0.298, 1.993, 1.0]
Also, it is better to use a with statement when dealing with files so that the file is automatically closed afterwards.
Try this code:
chosen_number = ['78 ', '85 ', '108 ']
memory_list = []
for line in open("num.txt"):
    for num in chosen_number:
        if num in line:
            dec = line.split()[-1]
            memory_list.append(float(dec))
In chosen_number, I declared the numbers with a trailing space: '85 '. Otherwise, when 0.985 is found, the if condition would be true, because the values are compared as strings. I hope that's clear enough.
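To make the substring pitfall described above concrete, here is a minimal sketch that compares a substring test with a whole-field test on the sample data from the question. The rows are held in a list instead of being read from num.txt, and the names loose and strict are only for illustration.

```python
rows = ["33 0.239", "78 0.298", "85 1.993", "96 0.985", "107 1.323", "108 1.000"]
chosen_number = {"78", "85", "108"}

# Substring test: "85" also occurs inside "0.985", so the row for 96 is picked up too.
loose = [float(r.split()[-1]) for r in rows if any(n in r for n in chosen_number)]

# Whole-field test: compare only the first whitespace-separated field.
strict = [float(r.split()[1]) for r in rows if r.split()[0] in chosen_number]

print(loose)   # [0.298, 1.993, 0.985, 1.0]
print(strict)  # [0.298, 1.993, 1.0]
```

Comparing the whole first field avoids both the accidental match in the decimal column and the need for trailing spaces in chosen_number.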

arnold/book cipher with python

I'm trying to write a book cipher decoder, and the following is what I have so far.
code = open("code.txt", "r").read()
my_book = open("book.txt", "r").read()
book = my_book
code_line = 0
while code_line < 6 :
    sl = code.split('\n')[code_line]+'\n'
    paragraph_num = sl.split(' ')[0]
    line_num = sl.split(' ')[1]
    word_num = sl.split(' ')[2]
    code_line = code_line+1
The loop changes the following variables:
paragraph
line
word
and everything is working just fine.
What I need now is a way to select the paragraph, then the line, then the word;
a for loop inside the while loop would work perfectly.
So I want to get word number "word_num" from line number "line_num" of paragraph number "paragraph_num".
This is my code file, which I'm trying to convert into words:
"paragraph number","line number","word number"
70 1 3
50 2 2
21 2 9
28 1 6
71 2 2
27 1 4
and then I want my output to look something like this:
word1
word2
word3
word4
word5
word6
By the way, my book (the file I need to get the words from) looks something like this:
word1 word2 word3
word4 word5 word6...
...word.. word.. last word
(The words are not identical)
Related: How to count paragraphs?
You already know how to read in the book file, break it into lines, and break each of those into words.
If paragraphs are defined as being separated by "\n\n", you can split the contents of the book file on that, and break each paragraph into lines. Or, after you break the book into lines, any empty line signals a change of paragraph.
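The approach described above could be sketched as follows. This is only an illustration under stated assumptions: paragraphs are separated by a blank line ("\n\n"), all three numbers are 1-based, and the function names lookup and decode as well as the tiny sample book are invented for the example.

```python
def lookup(book_text, paragraph_num, line_num, word_num):
    """Return the word at the given 1-based paragraph, line and word position."""
    # Assumption: paragraphs are separated by exactly one blank line.
    paragraphs = [p for p in book_text.split("\n\n") if p.strip()]
    lines = paragraphs[paragraph_num - 1].splitlines()
    words = lines[line_num - 1].split()
    return words[word_num - 1]

def decode(book_text, code_text):
    """Decode lines of 'paragraph line word' triples into a list of words."""
    result = []
    for entry in code_text.splitlines():
        if not entry.strip():
            continue  # ignore empty lines in the code file
        p, l, w = (int(n) for n in entry.split())
        result.append(lookup(book_text, p, l, w))
    return result

# Hypothetical two-paragraph book and three code entries.
book = "alpha beta gamma\ndelta epsilon\n\nzeta eta\ntheta iota kappa"
code = "1 1 3\n2 2 1\n1 2 2"
print(decode(book, code))  # ['gamma', 'theta', 'epsilon']
```

With real files, book and code would come from open("book.txt").read() and open("code.txt").read() as in the question; an out-of-range number raises IndexError, which a fuller version might want to catch.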
This may be quite a late answer, but better late than never, I guess.
I completed a book cipher implementation
that I believe does exactly what you are asking for.
It takes a book file (in my test run at the bottom, "Shakespeare.txt")
and a message (words),
finds the index of each word you typed in, and can later retrieve the same words from the book using those indexes.
It prints out the indexes of the words in the book.
Give it a look; I hope it helps!
I worked like crazy on this one. It literally took me years to complete!
Have a great day!
I believe an answer should come with working code, or at least an attempt in that direction, which is why I'm providing the code too.
I really hope it helps both you and future viewers!
MAJOR EDIT:
Edit 1: Providing the code here, which is easier for future viewers, and hopefully for you:
Main Program:
Originally from my GitHub: https://github.com/loneicewolf/Book-Cipher-Python
I shortened it and removed parts that weren't needed in this case, to make it more elegant (hopefully it worked).
# Replace "document1.txt" with whatever your book / document's name is.
BOOK="document1.txt" # This contains your "Word Word Word Word ...." I believed from the very start that you meant, they are not the same - (obviously)
# Read book into "boktxt"
def GetBookContent(BOOK):
    ReadBook = open(BOOK, "r")
    txtContent_splitted = ReadBook.read();
    ReadBook.close()
    Words=txtContent_splitted
    return(txtContent_splitted.split())
boktxt = GetBookContent(BOOK)
words=input("input text: ").split()
print("\nyou entered these words:\n",words)
i=0
words_len=len(words)
for word in boktxt:
    while i < words_len:
        print(boktxt.index(words[i]))
        i=i+1
x=0
klist=input("input key-sequence sep. With spaces: ").split()
for keys in klist:
    print(boktxt[int(klist[x])])
    x=x+1
TEST ADDED:
EDIT: I think I could provide a sample run with a book, to show it in action, at least.. Sorry for not providing this sooner:
I executed the python script: and I used Shakespeare.txt as my 'book' file.
input text: King of dragon, lord of gold, queen of time has a secret, which 3 or three, can hold if two of them are dead
(I added a digit in it too, so it would prove that it works with digits too, if somebody in the future wonders)
and it outputs your book code:
27978 130 17479 2974 130 23081 24481 130 726 202 61 64760 278 106853 1204 38417 256 8204 97 6394 130 147 16 17084
For example:
27978 means the 27978'th word in Shakespeare.txt
To decrypt it, you feed in the book code and it outputs the plain text! (the words you originally typed in)
input key-sequence sep. With spaces: 27978 130 17479 2974 130 23081 24481 130 726 202 61 64760 278 106853 1204 38417 256 8204 97 6394 130 147 16 17084
-> it outputs ->
King of dragon, lord of gold, queen of time has a secret, which 3 or three, can hold if two of them are dead
//Wishes
William.

Total from text file

I have a text file with the numbers of good and bad products for each gender:
Male 100 120
Female 110 150
How can I calculate the total from this text file for both genders, so that it prints out 480?
Here is my attempt at the code:
def total():
    myFile = open("product.txt", "r")
    for result in myFile:
        r = result.split()
        print(r[1]+r[2])
total()
It prints out what the columns contain, but it doesn't add them.
The result of split is a sequence of strings, not of integers.
"Adding" two strings with + concatenates the strings.
Example interaction with enough clues for you to solve it:
>>> s = "123 456"
>>> ss = s.split()
>>> ss
['123', '456']
>>> ss[0] + ss[1]
'123456'
>>> int(ss[0])
123
>>> int(ss[1])
456
>>> int(ss[0]) + int(ss[1])
579
When you get unexpected results, opening your interpreter and looking at things interactively usually provides plenty of clues.
You need to convert each of the split text entries into an integer, and keep a running total as follows:
def total():
    the_total = 0
    with open("product.txt", "r") as myFile:
        for result in myFile:
            r = result.split()
            the_total += int(r[1]) + int(r[2])
    return the_total
print(total())
This would display:
480
Using with will automatically close the file for you.
Yet another one
def total():
    with open('product.txt') as f:
        nums = (int(el) for r in f for el in r.split()[1:])
        return sum(nums)
print(total())
It works for any number of columns you may have in each row
e.g. with four columns
Male 111 222 333 444
Female 666 777 888 999
produces
4440
As mentioned in the comments by jonrsharpe, you aren't adding the previous values.
Since you want to add everything, keep track of the previous values and add the new lines (all converted to integer). Change your code to:
def total():
    t = 0
    with open("product.txt", "r") as myFile:
        for result in myFile:
            r = result.split()
            t += int(r[1]) + int(r[2])
    return t
print(total()) # 480
Since this got chosen, I'm editing to include file closing.
Mentioned by Martin Evans:
Using with will automatically close the file for you.
def total():
    myfile = open("/home/prashant/Desktop/product.txt" , "r")
    for res in myfile:
        r = res.split()
        # r[0] is the label (Male/Female); the numbers are r[1] and r[2]
        print (int(r[1])+int(r[2]))
The strings aren't converted to int; that's your problem.

printing column numbers in python

How can I print the first 52 numbers in one column, the next 52 in the next column, and so on, with 6 columns in total, after which the layout repeats? I have lots of float numbers, and each column should hold 52 consecutive numbers before a new column starts. The numbers are listed in lines, separated by one space, in a file.txt document. So in the end I want to have:
1 53 105 157 209 261
2
...
52 104 156 208 260 312
313 ... ... ... ... ...
...(another 52 numbers and so on)
I have tried this:
with open('file.txt') as f:
    line = f.read().split()
    line1 = "\n".join("%-20s %s"%(line[i+len(line)/52],line[i+len(line)/6]) for i in range(len(line)/6))
    print(line1)
However, this of course only prints 2 columns of numbers. I have tried adding line[i+len(line)/52] six times, but the code is still not working.
for row in range(52):
    for col in range(6):
        print line[row + 52*col], # Dangling comma to stay on this line
    print # Now go to the next line
Granted, you can do this in more Pythonic ways, but this will show you the algorithm structure and let you tighten the code as you wish.
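One possible tightening, written for Python 3 and extended so that the layout repeats after every 52 x 6 numbers as the question asks, is sketched below. The generator name column_blocks and its parameters are assumptions made for this example, and the demo uses a small 3 x 2 layout so the result is easy to check by eye.

```python
def column_blocks(values, rows=52, cols=6):
    """Yield text rows; each block of rows*cols values is laid out column by column."""
    block_size = rows * cols
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        for row in range(rows):
            # Items at row, row+rows, row+2*rows, ... belong on the same output row.
            cells = block[row::rows]
            if cells:  # the last block may be incomplete
                yield " ".join(str(v) for v in cells)

# Small demo: 14 values, 3 rows per column, 2 columns per block.
for text in column_blocks(list(range(1, 15)), rows=3, cols=2):
    print(text)
```

For the real data, values would be the list produced by f.read().split() and the defaults rows=52, cols=6 would apply; fixed-width formatting could replace the plain join if aligned columns are needed.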

Using Python to extract segment of numbers from a file?

I know there are questions on how to extract numbers from a text file, which have helped partially. Here is my problem. I have a text file that looks like:
Some crap here: 3434
A couple more lines
of crap.
34 56 56
34 55 55
A bunch more crap here
More crap here: 23
And more: 33
54 545 54
4555 55 55
I am trying to write a script that extracts the lines with the three numbers and put them into separate text files. For example, I'd have one file:
34 56 56
34 55 55
And another file:
54 545 54
4555 55 55
Right now I have:
for line in file_in:
    try:
        float(line[1])
        file_out.write(line)
    except ValueError:
        print "Just using this as placeholder"
This successfully puts both chunks of numbers into a single file. But I need it to put one chunk in one file, and another chunk in another file, and I'm lost on how to accomplish this.
You didn't specify what version of Python you were using, but you might approach it this way in Python 2.7.
string.translate takes a string, a translation table (which can be None), and a group of characters to delete.
You can set the characters to delete to everything but 0-9 and space by slicing string.printable correctly:
>>> import string
>>> remove_chars = string.printable[10:-6] + string.printable[-4:]
>>> string.translate('Some crap 3434', None, remove_chars)
'  3434'
>>> string.translate('34 45 56', None, remove_chars)
'34 45 56'
Adding a strip to trim white space on the left and right and iterating over a testfile containing the data from your question:
>>> with open('testfile.txt') as testfile:
...     for line in testfile:
...         trans = line.translate(None, remove_chars).strip()
...         if trans:
...             print trans
...
3434
34 56 56
34 55 55
23
33
54 545 54
4555 55 55
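In Python 3, str.translate takes only a table, so the same idea needs str.maketrans with a third argument listing the characters to delete. The following is a rough Python 3 equivalent, not part of the original answer; the helper name digits_and_spaces is made up, and the set of characters to delete is built explicitly rather than by slicing.

```python
import string

# Delete every printable character except the digits and the space character.
remove_chars = "".join(c for c in string.printable if c not in string.digits + " ")
table = str.maketrans("", "", remove_chars)

def digits_and_spaces(line):
    """Strip everything but digits and spaces, then trim the ends."""
    return line.translate(table).strip()

print(digits_and_spaces("Some crap here: 3434\n"))  # 3434
print(digits_and_spaces("34 56 56\n"))              # 34 56 56
```

As in the Python 2 version, lines that contain no digits come back as empty strings and can be skipped with a simple truth test.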
You can use a regex here, but this requires reading the whole file into a variable with file.read() or similar (fine if the file is not huge).
((?:(?:\d+ ){2}\d+(?:\n|$))+)
See demo.
https://regex101.com/r/tX2bH4/20
import re
p = re.compile(r'((?:(?:\d+ ){2}\d+(?:\n|$))+)', re.IGNORECASE)
test_str = "Some crap here: 3434\nA couple more lines\nof crap.\n34 56 56\n34 55 55\nA bunch more crap here\nMore crap here: 23\nAnd more: 33\n54 545 54\n4555 55 55"
re.findall(p, test_str)
re.findall returns a list. You can easily write each item of the list to a new file.
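Writing each match to its own file could look roughly like this. The function name write_chunks, the chunkN.txt naming scheme and the use of a temporary directory for the demo are all assumptions for the sake of a self-contained example; the pattern is the one from the answer.

```python
import os
import re
import tempfile

# Same pattern as above: one or more consecutive lines of three numbers.
CHUNK = re.compile(r'((?:(?:\d+ ){2}\d+(?:\n|$))+)')

def write_chunks(text, out_dir, prefix="chunk"):
    """Write each run of three-number lines to its own file; return the paths."""
    paths = []
    for number, chunk in enumerate(CHUNK.findall(text), 1):
        path = os.path.join(out_dir, f"{prefix}{number}.txt")
        with open(path, "w") as f:
            f.write(chunk)
        paths.append(path)
    return paths

text = "Some text: 3434\n34 56 56\n34 55 55\nMore text: 23\n54 545 54\n4555 55 55"
with tempfile.TemporaryDirectory() as out_dir:
    for path in write_chunks(text, out_dir):
        with open(path) as f:
            print(os.path.basename(path), repr(f.read()))
```

Note that the pattern is not anchored to the start of a line, so a text line that happens to end in three space-separated numbers would also match; adding re.MULTILINE with a leading ^ is one way to tighten it.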
To check whether a string is a number, you can use str.isdigit:
split = 0            # index of the current output file
should_split = True  # start a new file at the first numeric line
for line in file_in:
    # split line to parts
    parts = line.strip().split()
    # check the line is not empty and all parts are numbers
    if parts and all(str.isdigit(part) for part in parts):
        if should_split:
            split += 1
            with open('split%d' % split, 'a') as f:
                f.write(line)
            # don't split until we skip a line
            should_split = False
        else:
            with open('split%d' % split, 'a') as f:
                f.write(line)
    elif not should_split:
        # skipped line means we should split
        should_split = True
