Is there a way to make this Python program count the prefixes in the array? I keep getting a prefixcount result of 0.
Any help would be appreciated :)
The code I have is below
file = input("What is the csv file's name?")+".csv"
openfile = open(file)
text = sorted(openfile)
print("Sorted list")
print(text)
dictfile = open("DICT.txt")
prefix = ['de', 'dys', 'fore', 'wh']
prefixcount = 0
for word in text:
    for i in range(0, len(prefix)):
        if word>prefix[i]:
            break
        if word[0:len(prefix[i])] == prefix[i]:
            prefixcount+=1
            break
print(prefixcount)
Firstly, when you meet a condition where you want to skip an iteration, use continue; break ends the loop entirely.
Secondly, word>prefix[i] does not do what you want; it compares the strings lexicographically (see e.g. the Python docs), whereas what you really want to know is whether len(word) < len(prefix[i]).
I think what you want is:
prefixcount = 0
for word in text:
    for pref in prefix:
        if word.startswith(pref):
            prefixcount += 1
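To make the lexicographic-comparison point concrete, here is a minimal sketch. The function name count_prefixed and the sample word list are made up for illustration; they stand in for the sorted lines of the CSV file. It relies on str.startswith accepting a tuple of prefixes.

```python
def count_prefixed(words, prefixes):
    """Count the words that start with at least one of the given prefixes."""
    # str.startswith accepts a tuple, so one call tests every prefix
    # and a word matching several prefixes is still counted once.
    prefixes = tuple(prefixes)
    return sum(1 for word in words if word.strip().startswith(prefixes))


# Hypothetical sample data standing in for the lines of the CSV file.
sample_words = ["define\n", "dyslexia\n", "forecast\n", "apple\n", "what\n"]
prefix = ["de", "dys", "fore", "wh"]

# Lexicographic comparison is why the original test misfired:
# "define" sorts after "de", so `word > prefix[i]` was True and hit `break`.
print("define" > "de")
print(count_prefixed(sample_words, prefix))
```

Passing the words in as a parameter keeps the counting logic separate from the file handling, so it can be tried out without the CSV file.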
I'm trying to create a simple program that opens a file, splits it into single word lines (for ease of use) and creates a dictionary with the words, the key being the word and the value being the number of times the word is repeated. This is what I have so far:
infile = open('paragraph.txt', 'r')
word_dictionary = {}
string_split = infile.read().split()
for word in string_split:
    if word not in word_dictionary:
        word_dictionary[word] = 1
    else:
        word_dictionary[word] =+1
infile.close()
word_dictionary
The line word_dictionary prints nothing, meaning that the lines are not being put into a dictionary. Any help?
The paragraph.txt file contains this:
This is a sample text file to be used for a program. It should have nothing important in here or be used for anything else because it is useless. Use at your own will, or don't because there's no point in using it.
I want the dictionary to do something like this, but I don't care too much about the formatting.
Two things. First of all the shorter version of
num = num + 1
is
num += 1
not
num =+ 1
Corrected code:
infile = open('paragraph.txt', 'r')
word_dictionary = {}
string_split = infile.read().split()
for word in string_split:
    if word not in word_dictionary:
        word_dictionary[word] = 1
    else:
        word_dictionary[word] +=1
infile.close()
print(word_dictionary)
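Secondly, you need to print word_dictionary explicitly. A bare expression on its own line is only echoed in the interactive interpreter, not when the file is run as a script.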
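As a small sketch of both points: the first part shows what `=+` actually does, and the second is one possible way to write the counting loop with dict.get. The function name count_words and the sample string are assumptions for illustration, not part of the original code.

```python
def count_words(text):
    """Return a dict mapping each whitespace-separated word to its count."""
    counts = {}
    for word in text.split():
        # dict.get supplies 0 for unseen words, so no if/else is needed.
        counts[word] = counts.get(word, 0) + 1
    return counts


# `x =+ 1` parses as `x = (+1)`, so it resets the value instead of adding.
x = 5
x =+ 1
print(x)

# Hypothetical sample text standing in for the contents of paragraph.txt.
sample = "be used for a program or be used for anything"
print(count_words(sample))
```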
I am trying to create a function that opens a .txt file and counts the words that have the same length as the number specified by the user.
The .txt file is:
This is a random text document. How many words have a length of one?
How many words have the length three? We have the power to figure it out!
Is a function capable of doing this?
I'm able to open and read the file, but I am unable to exclude punctuation and find the length of each word.
def samplePractice(number):
    fin = open('sample.txt', 'r')
    lstLines = fin.readlines()
    fin.close
    count = 0
    for words in lstLines:
        words = words.split()
    for i in words:
        if len(i) == number:
            count += 1
    return count
You can try calling replace() on the string, passing in each punctuation character you want to remove and replacing it with an empty string ("").
It would look something like this:
puncstr = "Hello!"
nopuncstr = puncstr.replace(".", "").replace("?", "").replace("!", "")
I have written some sample code that removes punctuation and counts the number of words. Modify it according to your requirements.
import re
fin = """This is a random text document. How many words have a length of one? How many words have the length three? We have the power to figure it out! Is a function capable of doing this?"""
fin = re.sub(r'[^\w\s]','',fin)
print(len(fin.split()))
The above code prints the number of words. Hope this helps!!
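Instead of chaining replace() calls, just use a single strip() call. Note that strip() only removes the given characters from the start and end of each word, which is enough for sentence punctuation.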
Edit: a cleaner version
pl = '?!."\'' # punctuation list
def samplePractice(number):
    with open('sample.txt', 'r') as fin:
        words = fin.read().split()
    # clean words
    words = [w.strip(pl) for w in words]
    count = 0
    for word in words:
        if len(word) == number:
            print(word, end=', ')
            count += 1
    return count
result = samplePractice(4)
print('\nResult:', result)
output:
This, text, many, have, many, have, have, this,
Result: 8
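Your code is almost OK; the second for block is just in the wrong position. It needs to be nested inside the first loop so that every line is examined, not only the last one.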
pl = '?!."\'' # punctuation list
def samplePractice(number):
    fin = open('sample.txt', 'r')
    lstLines = fin.readlines()
    fin.close()  # the parentheses are needed to actually call close
    count = 0
    for words in lstLines:
        words = words.split()
        for i in words:
            i = i.strip(pl) # clean the word by strip
            if len(i) == number:
                count += 1
    return count
result = samplePractice(4)
print(result)
output:
8
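The same idea can be sketched without a file, which makes it easier to try out. This version assumes the text is passed in as a string and uses string.punctuation from the standard library in place of a hand-written punctuation list; the function name count_words_of_length is made up for illustration.

```python
import string


def count_words_of_length(text, number):
    """Count words whose length, ignoring surrounding punctuation, equals number."""
    count = 0
    for word in text.split():
        # strip() only trims the ends, so an apostrophe inside "don't" survives.
        cleaned = word.strip(string.punctuation)
        if len(cleaned) == number:
            count += 1
    return count


# The sample text from the question, inlined instead of read from sample.txt.
sample = ("This is a random text document. How many words have a length of one? "
          "How many words have the length three? We have the power to figure it out! "
          "Is a function capable of doing this?")
print(count_words_of_length(sample, 4))
```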
I'd like to know how to achieve the same result as the code I listed below without using any collections, or for someone to explain what goes on inside the Counter collection (in code or in a way that isn't confusing) since I can't seem to find it anywhere. This code is meant to read a text file called juliet.txt. I am trying to make it count the amount of letters and spaces inside the document and then print it as a result.
Code:
from collections import Counter
text = open('juliet.txt', 'r').read()
letters = 0
counter = Counter(text)
spacesAndNewlines = counter[' '] + counter['\n']
while letters < len(text):
    print (text[letters])
    letters += 1
while letters == len(text):
    print (letters)
    letters += 1
print (spacesAndNewlines)
Sounds like a homework question to me, in which case you won't get any benefit from me answering you.
letters = {}
with open('juliet.txt') as fh:
    data = fh.read()
for char in data:
    if char in letters:
        letters[char] += 1  # seen before: increment the existing count
    else:
        letters[char] = 1   # first occurrence: start the count at 1
print(letters)
This uses a standard dictionary - normally I would use a defaultdict but for some weird reason you don't like collections. With the defaultdict you wouldn't need to do the laborious test to see if the char is already in the dictionary.
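For comparison, here is a rough sketch of the defaultdict variant mentioned above. The function name count_chars and the sample string are assumptions for illustration; the string stands in for the contents of juliet.txt.

```python
from collections import defaultdict


def count_chars(data):
    """Return a mapping from each character in data to how often it occurs."""
    # defaultdict(int) creates a missing key with value 0 on first access,
    # which removes the explicit membership test.
    letters = defaultdict(int)
    for char in data:
        letters[char] += 1
    return letters


# Hypothetical sample text standing in for the contents of juliet.txt.
sample = "to be or not\nto be"
counts = count_chars(sample)
spaces_and_newlines = counts[' '] + counts['\n']
print(dict(counts))
print(spaces_and_newlines)
```

Conceptually, Counter does the same tallying: it is a dict subclass that adds one to the entry for each item it iterates over and returns 0 for keys it has not seen.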
I'm relatively new to python and coding and I'm trying to write a code that counts the number of times each different character comes out in a text file while disregarding the case of the characters.
What I have so far is
letters = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o',
'p','q','r','s','t','u','v','w','x','y','z']
prompt = "Enter filename: "
titles = "char count\n---- -----"
itemfmt = "{0:5s}{1:10d}"
totalfmt = "total{0:10d}"
whiteSpace = {' ':'space', '\t':'tab', '\n':'nline', '\r':'crtn'}
filename = input(prompt)
fname = filename
numberCharacters = 0
fname = open(filename, 'r')
for line in fname:
    linecount +=1
    word = line.split()
    word += words
    for word in words:
        for char in word:
            numberCharacters += 1
return numberCharacters
Something seems wrong about this. Is there a more efficient way to do my desired task?
Thanks!
from collections import Counter
frequency_per_character = Counter(open(filename).read().lower())
Then you can display them as you wish.
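One possible way to display the result, sketched with the format strings from the question. The function name letter_frequencies and the sample string are assumptions for illustration; unlike the one-liner above, this sketch keeps only alphabetic characters.

```python
from collections import Counter


def letter_frequencies(text):
    """Count alphabetic characters in text, ignoring case."""
    return Counter(c for c in text.lower() if c.isalpha())


# Hypothetical sample text standing in for the file contents.
sample = "Hello, World!"
freq = letter_frequencies(sample)

print("char count")
print("---- -----")
# most_common() yields (character, count) pairs, highest count first.
for char, count in freq.most_common():
    print("{0:5s}{1:10d}".format(char, count))
print("total{0:10d}".format(sum(freq.values())))
```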
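A better way would be to use str methods such as isalpha().
chars = {}
for l in open('filename', 'r'):  # the 'U' mode flag was removed in Python 3.11
    for c in l.lower():  # lower-case so that 'A' and 'a' share one count
        if not c.isalpha(): continue
        chars[c] = chars.get(c, 0) + 1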
And then use the chars dict to write the final histogram.
You are over-complicating it. If all you need is the number of different characters, you can just convert the file content to a set to eliminate duplicate characters:
number_diff_value = len(set(open("file_path").read()))
I have a problem in a Python script: I need the index used with the split function to increase automatically with every iteration of the loop. I do this:
tag = "\'"
while loop<=302:
    for line in f1.readlines():
        if tag in line:
            word = line.split(tag)[num] #num is the index I need to increase
    text = "Word: "+word+"."
    f.write(text)
    num = num + 1
    loop = loop + 1
But...the "num" variable on index doesn't change...it simply stays the same. The num index indicates the word I need to take. So this is why "num = num + 1" would have to increase...
What is the problem in the loop?
Thanks!
Your question is confusing. But I think you want to move num = num + 1 into the for loop and if statement.
tag = "\'"
while loop<=302:
    for line in f1.readlines():
        if tag in line:
            word = line.split(tag)[num] #num is the index I need to increase
            num = num + 1
    text = "Word: "+word+"."
    f.write(text)
    loop = loop + 1
Based on Benyi's comment in the question - do you just want this for the individual sentences? You might not need to index.
>>> mystring = 'hello i am a string'
>>> for word in mystring.split():
...     print('Word:', word)
...
Word: hello
Word: i
Word: am
Word: a
Word: string
There seems to be a lot of things wrong with this.
First
while loop <= 302:
for line in f1.readlines():
f1.readlines() is going to be [] for every iteration past the first, because the first call already reads the file to the end.
Second
for line in f1.readlines():
    word = line.split(tag)[num]
    ...
text = "Word: "+word+"."
Even if you made the for loop work, text will always be using the last iteration of the word. Maybe this is desired behavior, but it seems strange.
Third
while loop<=302:
    ...
    loop = loop + 1
Seems like it would be better written as
for _ in range(302):  # xrange in Python 2
Since loop isn't used at all inside that scope. This is assuming loop starts at 0, if it doesn't then you just adjust 302 to however many iterations you wanted.
Lastly
num = num + 1
This is outside your inner loop, so num will always be the same for the first iteration, and then it won't matter later because of the empty f1.readlines(), as stated before.
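The first point is easy to check in isolation. In this minimal sketch, io.StringIO stands in for the open file f1 (an assumption for illustration; a real file object opened with open() behaves the same way here).

```python
import io

# io.StringIO stands in for the open file f1.
f1 = io.StringIO("it's line one\nit's line two\n")

first = f1.readlines()   # reads to the end of the stream
second = f1.readlines()  # nothing left to read, so this is []
print(len(first), second)

f1.seek(0)               # rewind to read the data again
third = f1.readlines()
print(third == first)
```

Reading the lines once into a list before the outer loop, or calling seek(0) before each pass, avoids the empty second read.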
I have a different approach to your problem as mentioned by you in the comment. Consider input.txt has the following entry:
this is a an input file.
Then the following code will give you the desired output:
lines = []
with open (r'C:\temp\input.txt' , 'r') as fh:
    lines = fh.read()
with open (r'C:\temp\outputfile.txt' , 'w') as fh1:
    for words in lines.split():
        fh1.write("Words:"+ words+ "\n" )