I've been trying to develop a program that will take a list of unique words and a list of their indexes from a file to create the original text but I can't get the integers converted back from str.
file=open("compressed_file_words.txt", "r")
listofwords = file.read()
file=open("compressed_file_word_positions.txt", "r")
positions = file.read()
for i in positions:
    reconstructed_text = reconstructed_text + listofwords[i] + " "
This fails with the following error:
TypeError: string indices must be integers
How do I get the str converted back to int? I have tried various methods but none seem to work.
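Try this:
words = listofwords.split()
for i in positions.split():
    reconstructed_text = reconstructed_text + words[int(i)] + " "
Your problem is that positions is a string, not a list of integers, so i won't be an integer either. for will literally iterate through every element of positions and name that element i for the purposes of the for block; for a string, each element is a single character. Since your block needs an integer index but is getting a string, split positions into its separate numbers and convert each one with int(). The same goes for listofwords: read() returns the whole file as one string, so it has to be split into a list before you can index it by word. (split() with no arguments assumes the items are separated by whitespace.)
Note that you are probably misusing read and are actually looking for readlines, which returns a list with one string per line.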
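A minimal sketch of the whole round trip, assuming both files hold whitespace-separated items and the positions are zero-based indexes into the word list. The function name reconstruct_text and the sample strings are hypothetical, not taken from the question:

```python
def reconstruct_text(words_text, positions_text):
    """Rebuild the original text from the raw contents of the two files."""
    words = words_text.split()                           # list of unique words
    indexes = [int(p) for p in positions_text.split()]   # str -> int
    return " ".join(words[i] for i in indexes)


# Hypothetical file contents, for illustration only.
print(reconstruct_text("ask not what", "0 1 2 0"))  # ask not what ask
```

Using " ".join also avoids the trailing space that repeated string concatenation leaves at the end.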
load_datafile() takes a single string parameter representing the filename of a datafile.
This function must read the content of the file, convert all letters to their lowercase, and store
the result in a string, and finally return that string. I will refer to this string as data throughout
this specification, you may rename it. You must also handle all exceptions in case the datafile
is not available.
Sample output:
data = load_datafile('harry.txt')
print(data)
the hottest day of the summer so far was drawing to a close and a drowsy silence
lay over the large, square houses of privet drive.
load_wordfile() takes a single string argument representing the filename of a wordfile.
This function must read the content of the wordfile and store all words in a one-dimensional
list and return the list. Make sure that the words do not have any additional whitespace or newline character in them. You must also handle all exceptions in case the files are not
available.
Sample outputs:
pos_words = load_wordfile("positivewords.txt")
print(pos_words[2:9])
['abundance', 'abundant', 'accessable', 'accessible', 'acclaim', 'acclaimed',
'acclamation']
neg_words = load_wordfile("negativewords.txt")
print(neg_words[10:19])
['aborts', 'abrade', 'abrasive', 'abrupt', 'abruptly', 'abscond', 'absence',
'absent-minded', 'absentee']
MY CODE BELOW
def load_datafile('harryPotter.txt'):
    data = ""
    with open('harryPotter.txt') as file:
        lines = file.readlines()
        temp = lines[-1].lower()
    return data
Your code has two main problems. The first one is that you are assigning an empty string to the variable data and returning it, so no matter what you do with the contents of the file you always return an empty string. The second one is that file.readlines() returns a list of strings, where each line in the file is an element of the list, and you are only converting the last element, lines[-1], to lowercase.
To fix your code, make sure that you store the contents of the file in the data variable and that you apply lower() to every line in the file, not just the last one. Something like this:
def load_datafile(file_name):
    data = ''
    with open(file_name) as file:
        lines = file.readlines()
        for line in lines:
            # each line already ends with '\n', so don't add another one
            data = data + line.lower()
    return data
The previous example is not the best way of doing this but it's very easy to understand what is happening and I think that is more important when you are starting. To make it more efficient you might want to change it to:
def load_datafile(file_name):
    with open(file_name) as file:
        # each line already ends with its own newline, so join on ''
        return ''.join(line.lower() for line in file)
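Neither version above handles a missing file, which the assignment asks for, and load_wordfile() is not covered at all. Here is a hedged sketch of both, assuming the word file holds one word per line and that returning an empty result on failure is acceptable (the specification does not say what to return in that case):

```python
def load_datafile(file_name):
    """Return the file's content in lowercase, or '' if it cannot be read."""
    try:
        with open(file_name) as file:
            return file.read().lower()
    except OSError as error:  # covers FileNotFoundError, PermissionError, ...
        print(f"Could not read {file_name}: {error}")
        return ""


def load_wordfile(file_name):
    """Return the words in the file as a list, or [] if it cannot be read."""
    try:
        with open(file_name) as file:
            # strip() removes the newline and stray spaces; blank lines are skipped
            return [line.strip() for line in file if line.strip()]
    except OSError as error:
        print(f"Could not read {file_name}: {error}")
        return []
```

Catching OSError rather than a bare except keeps unrelated bugs visible; if the assignment really means every possible exception, the except clause would have to be widened.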
I'm looking for advice on how to create a script that will search a file for a key word.
My text file looks like this:
1,1467800,968.00,957.00,8850,1005,963,546,950,8.00,
0.00,202149.00,12,
1,146928,1005,97995.00,979.00,967.000,824,955,826,
1,147,957.00,883.00,
It's from Bluetooth devices that I was having trouble with because they were talking over each other. My solution was to make one device send floats and the other send ints. I'm now trying to separate the numbers and place them in two separate text documents. Are there any functions that would make this project easier?
This is my current code that just takes in my text file
f = open("file.txt","r")
f1 = open("output.txt","w")
text = ""
for line in f:
    text = line
    text = text.rstrip("\n")
    print(text)
f1.close()
f.close()
my_list = text.split(",")
ints, floats = [], []
for item in my_list:
    if not item:  # skip empty strings left by trailing commas
        continue
    if '.' in item:  # float
        floats.append(float(item))
    else:
        ints.append(int(item))
Explanation:
The split function converts text into a list by splitting it into elements at the given separator (a comma in this case).
Then you can write them into two different documents. For the sake of simplicity I divided them into two lists, which you can then write to new files.
If none of your floats has a whole-number value, you can use the is_integer method of float and list comprehensions (note that a value such as 968.00 from your sample would be counted as an integer by this test):
with open('your_file') as fd:
    # drop the empty strings left by trailing commas and line breaks
    numbers = [num for num in fd.read().split(',') if num.strip()]
floats = [float(num) for num in numbers if not float(num).is_integer()]
integers = [int(float(num)) for num in numbers if float(num).is_integer()]
You can also convert the numbers to a set after getting the floats and subtract the floats from it to get the integers.
Otherwise, test for the decimal point:
with open('your_file') as fd:
    numbers = [num for num in fd.read().split(',') if num.strip()]
floats = [float(num) for num in numbers if '.' in num]
integers = [int(num) for num in numbers if '.' not in num]
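Putting the pieces together, here is one possible sketch that separates the values and can write them to two files. It assumes, as in the question, that a token containing a '.' came from the float device and anything else from the int device; the function names are made up for illustration:

```python
def split_numbers(text):
    """Split comma-separated text into (ints, floats), ignoring empty tokens."""
    ints, floats = [], []
    for token in text.replace("\n", ",").split(","):
        token = token.strip()
        if not token:          # produced by trailing commas and line breaks
            continue
        if "." in token:
            floats.append(float(token))
        else:
            ints.append(int(token))
    return ints, floats


def write_numbers(numbers, path):
    """Write one number per line to path."""
    with open(path, "w") as out:
        out.writelines(f"{n}\n" for n in numbers)


sample = "1,1467800,968.00,957.00,8850,\n0.00,202149.00,12,\n"
ints, floats = split_numbers(sample)
print(ints)    # [1, 1467800, 8850, 12]
print(floats)  # [968.0, 957.0, 0.0, 202149.0]
```

Each list can then be saved with a call such as write_numbers(ints, "ints.txt"), where the file name is whatever suits your project.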
I have a number of text files with a single long hex number inside each file. I want to find out the length of each hex number, i.e. ['FFFF0F'] =6, ['A23000000000000FD'] =17.
I read the file in:
file_to_open = open(myFile , 'r')
filey = file_to_open.readlines()
print(type(filey))
a = hex(int(filey, 16))
print(type(a))
n = len(filey)
print('length = ', n)
And my error is:
TypeError: int() cannot convert non-string with explicit base
if I remove the base 16 I get the error:
TypeError : int() argument must be a string, a bytes-like object or a number, not 'list'
Any ideas on how to just read in the number and find how many hex digits it contains?
readlines returns a list of strs (lines); for a one-line file it is a list with one element. Use read to get the whole text as a single str, strip leading and trailing whitespace, then just take its len:
with open(myFile, 'r') as f:
    filey = f.read()
filey = filey.strip()
n = len(filey)
Note also that I used with so I do not have to care about closing that file handle myself. I assume all your files are single-line and contain some hex number. Note that if your number has any leading 0s, they will be counted too, so for example length of 000F is 4.
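If you also want to confirm that the text really is a hex number before counting, you can check every character against string.hexdigits. A small sketch; the function name hex_digit_count is hypothetical, and rejecting a 0x prefix or a sign is my own choice rather than something the question asks for:

```python
import string


def hex_digit_count(text):
    """Return the number of hex digits in text, ignoring surrounding whitespace."""
    digits = text.strip()
    if not digits or any(ch not in string.hexdigits for ch in digits):
        raise ValueError(f"not a plain hex number: {digits!r}")
    return len(digits)


print(hex_digit_count("FFFF0F\n"))  # 6
```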
I am working on writing a function that returns the highest integer number in a specified file. The files only contain numbers. I came up with the following code:
def max_num_in_file(filename):
    """DOCSTRING"""
    with open(filename, 'r') as file:
        return max(file.read())
When I test this with a text file that I created, it returns the highest digit in any of the lines in the file. I need it to return the overall highest number rather than a single digit.
Assuming your file contains one number on each line:
with open(path, 'r') as file:
    m = max(file.readlines(), key=lambda x: int(x))
Then m holds as a string the greatest number of the file, and int(m) is the value you are looking for.
file.readlines() gives you a list whose elements are the lines of the file.
The max built-in function takes an iterable (here, that list of lines), and an optional key argument.
The key argument is how you want the elements to be compared.
The elements of my iterable are strings which I know represent integers.
Therefore, I want them to be compared as integers.
So my key is lambda x: int(x), which is an anonymous function that returns int(x) when fed x.
Now, why did max(file.read()) not work?
file.read() gives you the string corresponding to the whole content of the file.
Then again, max compares the elements of the iterable it is passed, and returns the greatest one, according to the order relation defined on the elements' type(s).
For strings (str instances), it is the lexicographical order.
So if your file contains only numbers, all characters are digits, and the greatest element is the character corresponding to the greatest digit.
So max(file.read()) will most likely return '9'.
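The difference is easy to see on a small in-memory example (the sample content is made up):

```python
content = "12\n9\n100\n"   # hypothetical file content, one number per line

# max over a string compares single characters, so the largest digit wins
print(max(content))                          # 9

# max over the lines, compared as integers
print(max(content.splitlines(), key=int))    # 100
```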
As long as your file is clean and has no empty/non number lines:
def max_num_in_file(filename):
    """DOCSTRING"""
    with open(filename, 'r') as file:
        return max([int(_x.strip()) for _x in file.readlines()])
You need to iterate over the file object and convert each line with int(). If the file is very large, I would advise against using readlines(), as it will allocate a huge list in memory. It's better to use an iterator that reads one line at a time:
def max_num_in_a_file(filename):
    def line_iterator(filename):
        with open(filename) as f:
            for line in f:
                yield int(line)
    return max(line_iterator(filename))
Beware that the script will throw an exception if any line in your file is not convertible to an int. You can guard the iterator against that case and just skip the line, as follows:
def max_num_in_a_file(filename):
    def line_iterator(filename):
        with open(filename) as f:
            for line in f:
                try:
                    num = int(line)
                except ValueError:
                    continue
                yield num
    return max(line_iterator(filename))
This function will work for a file with numbers and other data, and will just skip lines that are not convertible to int.
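One remaining edge case: if no line at all can be converted, max() receives an empty iterable and raises ValueError. The built-in max accepts a default argument for exactly this situation. A sketch, where returning None for a file without numbers is my assumption about what the caller wants:

```python
def max_num_in_a_file(filename, default=None):
    """Return the largest integer in filename, or default if there is none."""
    def line_iterator():
        with open(filename) as f:
            for line in f:
                try:
                    yield int(line)
                except ValueError:
                    continue  # skip blank or non-numeric lines
    return max(line_iterator(), default=default)
```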
d=f.read()
max(map(int,d.split())) #given that file contains only numbers separated by ' '
# if file has other characters as well
max(map(int,[i for i in d.split() if i.isdigit()]))
You may also try this:
def max_num_in_file(filename):
    """DOCSTRING"""
    with open(filename, 'r') as file:
        # read every line and split it into a list of number strings
        ls = [x.strip().split() for x in file.readlines()]
        # sum(ls, []) flattens the list of lists into a single list
        # map() converts each string to int
        return max(map(int, sum(ls, [])))
I've been stuck on this Python homework problem for a while now: "Write a complete python program that reads 20 real numbers from a file inner.txt and outputs them in sorted order to a file outter.txt."
Alright, so what I do is:
f=open('inner.txt','r')
n=f.readlines()
n.replace('\n',' ')
n.sort()
x=open('outter.txt','w')
x.write(print(n))
So my thought process is: Open the text file, n is the list of read lines in it, I replace all the newline prompts in it so it can be properly sorted, then I open the text file I want to write to and print the list to it. First problem is it won't let me replace the new line functions, and the second problem is I can't write a list to a file.
I just tried this:
>>> x= "34\n"
>>> print(int(x))
34
So, you shouldn't have to filter out the "\n" like that, but can just put it into int() to convert it into an integer. This is assuming you have one number per line and they're all integers.
You then need to store each value into a list. A list has a .sort() method you can use to then sort the list.
EDIT:
Forgot to mention: as others have already said, you need to iterate over the values in n, as it's a list, not a single item.
Here's a step by step solution that fixes the issues you have :)
Opening the file, nothing wrong here.
f=open('inner.txt','r')
Don't forget to close the file, but only after you have finished reading from it:
f.close()
n is now a list of each line:
n=f.readlines()
Lists have no replace method, so I suggest changing the above line to n = f.read(). Then, this will work (don't forget to reassign n, as strings are immutable):
n = n.replace('\n','')
You still only have a string full of numbers. However, instead of replacing the newline character, I suggest splitting the string on whitespace, which also avoids an empty string at the end if the file ends with a newline:
n = n.split()
Then, convert these strings to numbers (float rather than int, since the task says real numbers):
n = [float(x) for x in n]
Now, these two will work:
n.sort()
x=open('outter.txt','w')
You want to write the numbers themselves, so use this:
x.write('\n'.join(str(i) for i in n))
Finally, close the file:
x.close()
Using a context manager (the with statement) is good practice as well, when handling files:
with open('inner.txt', 'r') as f:
    # do stuff with f
# automatically closed at the end
I guess real means float. So you have to convert your lines to float to sort properly:
raw_lines = f.readlines()
floats = map(float, raw_lines)
Then you have to sort them. To write the result back, you have to convert each number to a string and join them with line endings:
sorted_floats = sorted(floats)
sorted_as_string = map(str, sorted_floats)
result = '\n'.join(sorted_as_string)
Finally, you have to write result to the destination file.
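Put together, the whole task might look like the sketch below. It assumes one number per line, as the other answers do; the function name sort_numbers_file is hypothetical:

```python
def sort_numbers_file(source, destination):
    """Read one real number per line from source, write them sorted to destination."""
    with open(source) as infile:
        # float() tolerates the trailing newline; blank lines are skipped
        numbers = [float(line) for line in infile if line.strip()]
    numbers.sort()
    with open(destination, "w") as outfile:
        outfile.write("\n".join(str(x) for x in numbers) + "\n")
    return numbers
```

Calling sort_numbers_file('inner.txt', 'outter.txt') would then cover the assignment, with both files closed automatically by the with blocks.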
OK, let's go step by step through what you want to do.
First: Read some integers out of a text file.
Pythonic Version:
fileNumbers = [int(line) for line in open(r'inner.txt', 'r').readlines()]
Easy to get version:
fileNumbers = list()
with open(r'inner.txt', 'r') as fh:
    for singleLine in fh.readlines():
        fileNumbers.append(int(singleLine))
What it does:
Open the file
Read each line, convert it to int (because readlines return string values) and append it to the list fileNumbers
Second: Sort the list
fileNumbers.sort()
What it does:
The sort method sorts the list in place by its values, e.g. [5,3,2,4,1] -> [1,2,3,4,5]
Third: Write it to a new textfile
with open(r'outter.txt', 'w') as fh:  # 'w' overwrites; 'a' would append to old results on every run
    for entry in fileNumbers:
        fh.write('{0}\n'.format(entry))