improving a word combination script - python

Is there any way to make this better or simpler? I know it generates a very large number of combinations, and when you try to combine more than 4 words into one sentence the output doesn't look the way it should.
infile = open('Wordlist.txt.txt','r')
wordlist = []
for line in infile:
    wordlist.append(line.strip())
infile.close()
outfile = open('output.txt','w')
for word1 in wordlist:
    for word2 in wordlist:
        out = '%s %s' %(word1,word2)
        #feel free to #comment one of these two lines to not output to file or screen
        print out
        outfile.write(out + '\n')
outfile.close()

Use itertools.product:
import itertools
with open('Wordlist.txt.txt') as infile:
    words = [line.strip() for line in infile]
with open('output.txt', 'w') as outfile:
    for word1, word2 in itertools.product(words, repeat=2):
        outfile.write("%s %s\n" %(word1, word2))
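The same call extends to combinations of more than two words by changing repeat. The following is a minimal sketch, assuming Python 3; the helper name combine_words and the sample list are illustrative and not part of the original answer:

```python
from itertools import product


def combine_words(words, n=2):
    """Yield every ordered combination of n words, joined by single spaces."""
    for combo in product(words, repeat=n):
        yield " ".join(combo)


# Hypothetical sample data standing in for the contents of the word list file.
sample = ["red", "green", "blue"]
pairs = list(combine_words(sample, 2))
print(len(pairs))
print(pairs[:3])
```

The number of results grows as len(words) ** n, so for long word lists it is probably better to write each combination to the file as it is generated rather than collect them in a list first.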

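If each line in your infile contains exactly 2 words you may consider:
from itertools import product
with open('Wordlist.txt.txt','r') as infile:
    wordlist=infile.readlines()
with open('output','w') as ofile:
    # product() yields tuples, so each one is joined into a string before writing
    # (assumption: the words of each line are combined with each other)
    pairs = [line.strip().split() for line in wordlist]
    ofile.write('\n'.join(' '.join(combo) for pair in pairs for combo in product(pair, repeat=2)))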

Related

Returning a line of txt.-file that has a word with more than 6 characters and starts with "A" in Python

I have a task to accomplish in Python in a single statement:
I need to return lines of my txt-file that include words which have more than 6 characters and start with the letter "A".
My code is the following:
[line for line in open('test.txt') if line.split().count('A') > 6]
I am not sure how to implement another command in order to say that my word starts with "A" and has to have more than 6 characters. That is the furthest I could do. I thank you for your time.
Greetings
I would split up your for loop so that it's not a list comprehension, to make it easier to understand what's going on. Once you do that, it should be clearer what you're missing so you can assemble it back into a list comprehension.
lines = []
with open('test.txt', 'r') as f:
    for line in f: # this line reads each line in the file
        add_line = False
        for word in line.split():
            if (word.startswith('A') and len(word) > 6):
                add_line = True
                break
        if (add_line):
            lines.append(line)
This roughly translates to
[line for line in open('test.txt', 'r') if any(len(word) > 6 and word.startswith('A') for word in line.split())]
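You should split each line into words and check each word separately. Using line.split() without an argument avoids empty strings and trailing newlines in the words:
[line for line in open('test.txt') if len([word for word in line.split() if word.lower().startswith('a') and len(word) > 6]) > 0]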
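As a rough illustration of the any() approach, here is a small sketch that works on any iterable of lines. The function name, the min_len parameter and the sample lines are assumptions made for the example, and it assumes Python 3:

```python
def lines_with_long_a_words(lines, min_len=6):
    """Return lines containing a word that starts with 'A' and is longer than min_len."""
    return [
        line
        for line in lines
        if any(word.startswith("A") and len(word) > min_len for word in line.split())
    ]


# Hypothetical sample lines; with a real file, pass the open file object instead.
sample = ["Apples are red\n", "Amazing Alphabet soup\n", "An apple\n"]
print(lines_with_long_a_words(sample))
```

Only the second sample line qualifies here, because "Apples" has exactly 6 characters and the condition asks for more than 6.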

how can i sort file elements in python?

I have to sort some elements in a text file that contains the names with the schedules of some teachers. Searching on google, I found this program:
def sorting(filename):
    infile = open(filename)
    words = []
    for line in infile:
        temp = line.split()
        for i in temp:
            words.append(i)
    infile.close()
    words.sort()
    outfile = open("result.txt", "w")
    for i in words:
        outfile.writelines(i)
        outfile.writelines(" ")
    outfile.close()
sorting("file.txt")
The code works, but it writes the sorted elements on a single line, while I want them written one per line as in the original file, in alphabetical and numerical order. To illustrate, let's say I have this file:
c
a
b
What I want is to sort it like this:
a
b
c
but it sorts it like this:
a b c
I use python 3.10. Can anyone please help me? Thanks.
def sorting(filename):
    infile = open(filename)
    words = []
    for line in infile:
        temp = line.split()
        for i in temp:
            words.append(i)
    infile.close()
    words.sort()
    outfile = open("result.txt", "w")
    for i in words:
        outfile.writelines(i)
        outfile.writelines("\n") # edited Instead of ' ' write '\n'
    outfile.close()
sorting("test.txt")
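A more compact variant is sketched below, assuming Python 3. The helper names sorted_words and write_sorted are made up for the example, and an in-memory io.StringIO stands in for result.txt so the snippet can run without touching the disk:

```python
import io


def sorted_words(lines):
    """Collect every whitespace-separated word from lines and return them sorted."""
    return sorted(word for line in lines for word in line.split())


def write_sorted(lines, outfile):
    """Write one sorted word per line to an already open, writable file object."""
    for word in sorted_words(lines):
        outfile.write(word + "\n")


# Hypothetical input matching the example in the question.
buffer = io.StringIO()
write_sorted(["c\n", "a\n", "b\n"], buffer)
print(buffer.getvalue())
```

Note that this is a plain string sort, so an entry such as "10" is ordered before "9"; sorting numeric parts by value would need a custom key.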

how to add numbering inside the for loop?

I have a problem printing each line with its number. How do I add a number before each line? I marked the relevant part with a comment.
f = open('filename', "r")
lines = f.readlines()
for line in lines:
    synonyms = []
    print(line) # I want my interface to be, 1. word 2. word, and so on
    answer = input("Answer: ").lower()
    for syn in wordnet.synsets(line.strip()):
        for l in syn.lemmas():
            synonyms.append(l.name())
My code is just printing
word1
Answer:
word2
Answer:
My ideal code is:
1.word1
Answer:
2.word2
Answer:
Replace your loop with:
for i, line in enumerate(lines):
    print(str(i + 1) + '. ' + str(line))
Here i + 1 is the line number you want.
If you are on Python 3.6 or later, you could use an f-string instead:
    print(f'{i + 1}. {line}')
Instead of iterating over the lines themselves, iterate over their indexes and print index + 1, since list indexes start at 0 but the numbering should start at 1. So, instead of printing only the line, we print line_no: line
f = open('filename', "r")
lines = f.readlines()
for line_no in range(len(lines)):
    synonyms = []
    print(f"{line_no+1}: {lines[line_no]}") # I want my interface to be, 1. word 2. word, and so on
    answer = input("Answer: ").lower()
    # the loop variable is the index, so look the line up before stripping it
    for syn in wordnet.synsets(lines[line_no].strip()):
        for l in syn.lemmas():
            synonyms.append(l.name())
Hope my answer helped :D
Try this:
f = open('filename', "r")
lines = f.readlines()
for line_no,line in enumerate(lines):
    synonyms = []
    print(str(line_no+1)+'.'+line) # I want my interface to be, 1. word 2. word, and so on
    answer = input("Answer: ").lower()
    for syn in wordnet.synsets(line.strip()):
        for l in syn.lemmas():
            synonyms.append(l.name())
Add a variable called count and increment it with each iteration of the for loop.
Code:
count = 1
f = open('filename', "r")
lines = f.readlines()
for line in lines:
    synonyms = []
    print(str(count) + ". " + line)
    count += 1
    answer = input("Answer: ").lower()
    for syn in wordnet.synsets(line.strip()):
        for l in syn.lemmas():
            synonyms.append(l.name())
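The answers above all add 1 to a zero-based index or keep a manual counter. enumerate also accepts a start argument, which removes that arithmetic. Here is a minimal sketch, assuming Python 3.6 or later; the function name number_lines and the sample data are illustrative only, and the wordnet lookup is left out so the snippet runs on its own:

```python
def number_lines(lines):
    """Return strings such as '1. word', numbering from 1."""
    return [
        f"{number}. {line.strip()}"
        for number, line in enumerate(lines, start=1)
    ]


# Hypothetical sample standing in for f.readlines().
sample = ["word1\n", "word2\n"]
for text in number_lines(sample):
    print(text)
```

Stripping each line also drops the trailing newline that readlines() keeps, which otherwise shows up as an extra blank line after every printed word.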

How to skip n number of lines after finding a specific line in Python 3?

Let's say I have a large text file, and I want to skip a line containing some keyword, as well as 2 lines after that line.
Original_file:
line1 some words
line2 some words
line3 keyword
line4 some words
line5 some words
line6 some words
line7 some words
line8 some words
New_file:
line1 some words
line2 some words
line6 some words
line7 some words
line8 some words
A simplified snippet of my code:
with open('Original_file','r') as f:
    lines = f.readlines()
nf = open('New_file', 'w')
for line in lines:
    if 'keyword' in line:
        for i in range(3): continue
    else:
        nf.write(line + "\n")
The loop "for i in range(3): continue" doesn't skip lines (I assume because it just continues within that nested for loop instead of the "for line in lines" loop). I also tried "next(f)" instead of "continue" and got a StopIteration error message.
Of course, if I try,
with open('Original_file','r') as f:
    lines = f.readlines()
nf = open('New_file', 'w')
for line in lines:
    if 'keyword' in line:
        continue
    else:
        nf.write(line + "\n")
it succeeds in skipping a line, but only the line with the keyword, whereas I want to skip the next two lines as well (a total of 3 lines).
Any suggestions are appreciated. Thank you for your help.
You can try iterating over indexes instead of elements:
with open('Original_file','r') as f:
    lines = f.readlines()
nf = open('New_file', 'w')
i = 0
while i< len(lines):
    line = lines[i]
    if 'keyword' in line:
        i+=3
    else:
        nf.write(line + "\n")
        i+=1
Make a counter and skip lines based on that.
with open('Original_file','r') as f:
    lines = f.readlines()
nf = open('New_file', 'w')
skip_lines = 0
for line in lines:
    if skip_lines > 0:
        skip_lines -= 1
    elif 'keyword' in line:
        skip_lines = 2 # the keyword line itself is dropped here, so 2 more remain
    else:
        nf.write(line + "\n")
You can achieve this with a flag/counter:
with open('Original_file','r') as f:
    lines = f.readlines()
skip = 0
nf = open('New_file', 'w')
for line in lines:
    if skip:
        skip -=1
    elif 'keyword' in line:
        skip = 2 # the keyword line is already skipped by this branch
    else:
        nf.write(line + "\n")
The problem is that you extract lines directly from a list and because of that, you have no possible interaction with the underlying iterator used by for line in lines.
You should simply use the file object as an iterator:
with open('Original_file','r') as f, open('New_file', 'w') as nf:
    for line in f:
        if 'keyword' in line:
            for i in range(2): next(f)
        else:
            nf.write(line)
The only assumption here is that any line containing keyword is followed with at least 2 other lines.
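If that assumption might not hold, passing a default to next() avoids the StopIteration when the keyword sits near the end of the input. The following is a sketch only, assuming Python 3; the generator name drop_keyword_blocks and its extra parameter are invented for the example, and it takes any iterable of lines, including an open file object:

```python
def drop_keyword_blocks(lines, keyword, extra=2):
    """Yield lines, omitting each line containing keyword and the `extra` lines after it."""
    iterator = iter(lines)
    for line in iterator:
        if keyword in line:
            # Consume up to `extra` following lines; the default value stops
            # next() from raising StopIteration at the end of the input.
            for _ in range(extra):
                next(iterator, None)
        else:
            yield line


# Hypothetical sample mirroring Original_file from the question.
sample = [f"line{i} some words\n" for i in range(1, 9)]
sample[2] = "line3 keyword\n"
kept = list(drop_keyword_blocks(sample, "keyword"))
print("".join(kept))
```

With a real file this could be used as nf.writelines(drop_keyword_blocks(f, 'keyword')) inside the with block.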
You can use next(), i.e.:
with open("old.txt") as f, open("new.txt", "w") as w:
    for line in f:
        if "keyword" in line:
            next(f), next(f)
            continue
        w.write(line)
If you prefer a list comprehension, you can also use:
with open("old.txt") as f, open("new.txt", "w") as w:
    [w.write(line) if not "keyword" in line else [next(f) for _ in range(2)] for line in f]

Split string within list into words in Python

I'm a newbie in Python, and I need to write code that reads a text file, splits it into words, sorts them and prints them out.
Here is the code I wrote:
fname = raw_input("Enter file name: ")
fh = open(fname)
lst = list()
words = list()
for line in fh:
    line = line.strip()
    line.split()
    lst.append(line)
lst.sort()
print lst
That's my output:
['Arise fair sun and kill the envious moon', 'But soft what light through yonder window breaks', 'It is the east and Juliet is the sun', 'Who is already sick and pale with grief']
However, when I try lst.split(), it says:
List object has no attribute split
Please help!
You should extend the new list with the split line, rather than attempt to split the strings after appending:
for line in fh:
    line = line.strip()
    lst.extend(line.split())
The issue is that split() does not magically mutate the string into a list. You have to do something with the return value.
for line in fh:
    # line.split()  # expression has no effect
    line = line.split()  # assignment keeps the result
    # lst += line  # shortcut for the loop underneath
    for token in line:
        lst = lst + [token]
        # lst += [token]  # equivalent in-place form; use one or the other
The above is a solution that uses a nested loop and avoids append and extend. The whole line by line splitting and sorting can be done very concisely, however, with a nested generator expression:
print sorted(word for line in fh for word in line.strip().split())
You can do:
fname = raw_input("Enter file name: ")
fh = open(fname, "r")
lines = list()
words = list()
for line in fh:
    # get an array of words for this line
    words = line.split()
    for w in words:
        lines.append(w)
lines.sort()
print lines
To avoid duplicates:
no_dups_list = list()
for w in lines:
    if w not in no_dups_list:
        no_dups_list.append(w)
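The membership test above rescans the list for every word, which gets slow for large files. Since the result is sorted anyway, a set can remove the duplicates first. Here is a minimal sketch, assuming Python 3; the function name unique_sorted_words and the sample lines are illustrative:

```python
def unique_sorted_words(lines):
    """Return every distinct whitespace-separated word in lines, sorted."""
    return sorted({word for line in lines for word in line.split()})


# Hypothetical sample taken from two lines of the text in the question.
sample = [
    "But soft what light\n",
    "It is the east and Juliet is the sun\n",
]
print(unique_sorted_words(sample))
```

The default sort compares code points, so capitalized words come before lowercase ones; passing key=str.lower to sorted would give a case-insensitive order instead.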
