List Comprehension; compacting code to two lines - python

The basic outline of this problem is to read the file, find the integers with re.findall() and the regular expression '[0-9]+', convert the extracted strings to integers, and sum them.
I have finished the problem, but I would like to go a step further and condense the code down to two lines.
This is my original code:
import re
fh = raw_input("Enter filename: ")
#returns regex_sum_241882.txt as default when nothing is entered
if len(fh)<1 : fh = "regex_sum_241882.txt"
file = open(fh)
sums = list()
#goes through each line in the file
for line in file:
    #finds the numbers in each line and puts them in a list
    nums = re.findall('[0-9]+',line)
    #adds the numbers to an existing list
    for num in nums:
        sums.append(int(num))
#sums the list
print sum(sums)
Now here's my current compact code:
import re
lst = list()
print sum(for num in re.findall('[0-9]+',open("regex_sum_241882.txt").read())): int(num))
It doesn't work and gives me SyntaxError: invalid syntax.
Can anyone point me in the right direction?
I feel I'm doing the same thing, but I'm not sure what the SyntaxError is about.

Try this way:
print sum(int(num) for num in re.findall('[0-9]+', open("regex_sum_241882.txt").read()))
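The SyntaxError comes from the order of the pieces: in a generator expression (and in a list comprehension) the value expression comes first, followed by the for clause, and there is no colon. A minimal sketch in Python 3 syntax, where the string text is a made-up stand-in for the file contents:

```python
import re

# Hypothetical sample text standing in for the contents of the file.
text = "abc 12 def 7\nghi 100"

# Value expression first, then the `for` clause; no colon anywhere.
total = sum(int(num) for num in re.findall('[0-9]+', text))
print(total)  # 119
```

The unused lst = list() line in the compact attempt can simply be dropped, since nothing is appended to it.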

Related

List Comprehension with Regular Expressions in a Text File Python

I'm doing a Python course and want to find all numbers in a text file with regular expression and sum them up.
Now I want to try to do it with list comprehension.
import re
try:
    fh = open(input('Enter a file Name: ')) #input
except:
    print('Enter an existing file name') #error
    quit()
he = list() #store numbers
for lines in fh:
    lines.rstrip()
    stuff = re.findall('[0-9]+', lines)
    if len(stuff) == 0: #skip lines with no number
        continue
    else:
        for i in stuff:
            he.append(int(i)) #add numbers to storage
print(sum(he)) #print sum of stored numbers
This is my current code. The instructor said it's possible to write the code in 2 lines or so.
import re
print( sum( [ ****** *** * in **********('[0-9]+',**************************.read()) ] ) )
The "*" characters should be replaced.
This text should be used to practice:
Why should you learn to write programs? 7746
12 1929 8827
Writing programs (or programming) is a very creative
7 and rewarding activity. You can write programs for
many reasons, ranging from making your living to solving
8837 a difficult data analysis problem to having fun to helping 128
someone else solve a problem. This book assumes that
everyone needs to know how to program ...
I know the general concept of list comprehension but I have no idea what to do.
A solution using a generator expression (a list comprehension without the square brackets) is:
import re
with open(input('Enter a file name: '), 'r') as fh:
    print(sum(int(i) for i in re.findall('[0-9]+', fh.read())))
Explanation:
• The with statement is used to open the file and automatically close it after the indented block is executed.
• re.findall('[0-9]+', fh.read()) returns a list of all the numbers in the file as strings.
• The generator expression int(i) for i in re.findall('[0-9]+', fh.read()) converts each string to an integer. Wrapped in square brackets, the same expression would be a list comprehension.
• Finally, sum() calculates the sum of all those integers.
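The steps in this explanation can be checked against the practice text from the question. A minimal sketch, assuming the text is held in a string (the name sample is made up) rather than read from a file:

```python
import re

# The practice text from the question, held in a string instead of a file.
sample = """Why should you learn to write programs? 7746
12 1929 8827
Writing programs (or programming) is a very creative
7 and rewarding activity. You can write programs for
many reasons, ranging from making your living to solving
8837 a difficult data analysis problem to having fun to helping 128
someone else solve a problem. This book assumes that
everyone needs to know how to program ...
"""

found = re.findall('[0-9]+', sample)   # list of digit strings
numbers = [int(i) for i in found]      # list comprehension: strings -> ints
print(found)
print(sum(numbers))  # 27486
```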
I think your instructor meant something like this:
import re
print(sum([int(i) for i in re.findall('[0-9]+', open(input('Enter a file Name: ')).read())]))
I spread it out into more lines so we can read it more easily:
print(
    sum([
        int(i) for i in re.findall(
            '[0-9]+', open(input('Enter a file Name: ')).read()
        )
    ])
)
To explain what is going on here, let's replace the parts of your code step by step.
You can create the stuff variable in the same way as your original code in only one line:
stuff = re.findall('[0-9]+', open(input('Enter a file Name: ')).read())
All I did there was move the file opening, open(input('Enter a file Name: ')) into the re.findall(), and not bother doing for lines in fh.
Then, instead of doing a for loop, for i in stuff and adding int(i) into the he list one-by-one, we can use our first list comprehension:
he = [int(i) for i in stuff]
Or, if we replace stuff with what we wrote before,
he = [int(i) for i in re.findall('[0-9]+', open(input('Enter a file Name: ')).read())]
Finally, we put a sum around that to get the sum of all items in the list he that we have created.
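Reading the whole file at once finds the same numbers as going line by line, because '[0-9]+' matches only digits and therefore never spans a line break. A small sketch that compares the two approaches; the contents of data are made up, and io.StringIO stands in for a real open file:

```python
import io
import re

# Hypothetical file contents; io.StringIO stands in for a real open file.
data = "one 1 two 22\nno digits here\n333 and 4\n"

# Original approach: loop over the lines and collect the numbers.
he = []
for line in io.StringIO(data):
    for i in re.findall('[0-9]+', line):
        he.append(int(i))

# Condensed approach: one findall over the whole text.
he_short = [int(i) for i in re.findall('[0-9]+', io.StringIO(data).read())]

print(he == he_short)
print(sum(he_short))
```

One trade-off of the condensed form is that open(...).read() never closes the file explicitly, which is why a with block is often preferred outside of short exercises.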
Hope this solution helps you.
File text:
import re
with open('./sum_numbers.txt', 'r') as f:
    # this line sums all the numbers in the file
    print(sum([int(no) for no in re.findall(r'\d+', f.read())]))  # 91

Troubles averaging all the grades from a "CSV" file

So I am new to Python and I am having a hard time figuring out this code. I am trying to take a CSV file called exam_grades.csv and write a function that reads in all the values in the file, using the string class's split() method to split the long string into a list of strings. Each string represents a grade. Then my function should return the average of all the grades.
So far this is what I have; I can open the .csv file just fine but I'm having trouble averaging all the grades. I have some lines commented out because I am not sure where to go from what I have been doing :(
def fileSearch():
    'Problem 4'
    readfile = open('exam_grades.csv', "r")
    for line in readfile:
        l = line.split(str(","))
        #num_grades = len(l)
        #averageAllGrades = l * 500
        #return num_grades
        print(l)
fileSearch()
Any advice?
Thanks!
Most CSV files have a header row at the top that you'll want to skip, but for simplicity's sake let's ignore that here.
Here's some code that works:
def fileSearch():
    'Problem 4'
    readfile = open('exam_grades.csv', "r")
    grade_sum = 0
    grade_count = 0
    for line in readfile:
        l = line.split(str(","))
        for grade in l:
            grade_sum += int(grade)
            grade_count += 1
    print(grade_sum/grade_count)
fileSearch()
This assumes you have multiple lines with grades and multiple grades per line.
We're keeping track of two variables here, the sum of all grades and the number of all grades we've added to the list (we're also casting to integers, since you're going to be reading strings).
When you add all the grades up and divide by the number of grades, you get an average.
Hope this helped.
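If the file does have a header row, or contains empty cells, the same sum-and-count idea can be sketched with the standard csv module. This is only an illustration under assumptions: the function name average_grades, the has_header flag, and the sample contents are all made up, and grades are parsed as floats rather than ints.

```python
import csv
import io

def average_grades(csv_file, has_header=False):
    """Return the mean of every non-empty cell in an open CSV file."""
    reader = csv.reader(csv_file)
    if has_header:
        next(reader, None)          # skip the header row if there is one
    grades = [
        float(cell)
        for row in reader
        for cell in row
        if cell.strip()             # ignore empty cells and blank lines
    ]
    if not grades:
        raise ValueError("no grades found")
    return sum(grades) / len(grades)

# Hypothetical contents standing in for exam_grades.csv.
sample = io.StringIO("exam1,exam2,exam3\n90,80,70\n100,60,\n")
print(average_grades(sample, has_header=True))  # 80.0
```

With a real file, the call would look like average_grades(open('exam_grades.csv', newline='')), ideally inside a with block.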

Python reading file and analysing lines with substring

In Python, I'm reading a large file with many many lines. Each line contains a number and then a string such as:
[37273738] Hello world!
[83847273747] Hey my name is James!
And so on...
After I read the txt file and put it into a list, I was wondering how I would be able to extract the number and then sort the whole lines based on that number?
file = open("info.txt","r")
myList = []
for line in file:
    line = line.split()
    myList.append(line)
What I would like to do:
Since the number in the first message falls between 37273700 and 38000000, I'd like to put that line (along with all other lines that follow that rule) into a separate list.
This does exactly what you need (for the sorting part):
my_sorted_list = sorted(myList, key=lambda line: int(line[0][1:-1]))
Use a tuple as the key value:
for line in file:
    line = line.split()
    # convert the number to int so the sort is numeric, not alphabetical
    keyval = (int(line[0].replace('[','').replace(']','')), line[1:])
    print(keyval)
    myList.append(keyval)
Then sort:
my_sorted_list = sorted(myList, key=lambda line: line[0])
How about:
# ---
# Function which gets a number from a line like so:
# - searches for the pattern: start_of_line, [, sequence of digits
# - if that's not found (e.g. empty line) return 0
# - if it is found, try to convert it to a number type
# - return the number, or 0 if that conversion fails
def extract_number(line):
    import re
    search_result = re.findall(r'^\[(\d+)\]', line)
    if not search_result:
        num = 0
    else:
        try:
            num = int(search_result[0])
        except ValueError:
            num = 0
    return num
# ---
# Read all the lines into a list
with open("info.txt") as f:
    lines = f.readlines()
# Sort them using the number function above, and print them
lines = sorted(lines, key=extract_number)
print(''.join(lines))
It's more resilient in the case of lines without numbers, it's more adjustable if the numbers might appear in different places (e.g. spaces at the start of the line).
(Obligatory suggestion not to use file as a variable name because it's a builtin function name already, and that's confusing).
Now that there's an extract_number() function, it's easier to filter:
lines2 = [L for L in lines if 37273700 < extract_number(L) < 38000000]
print(''.join(lines2))
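The sort and the range filter can be tried together on a few in-memory lines. A minimal Python 3 sketch; the sample lines are made up, and re.match is swapped in for findall with '^' (both only look at the start of the line here):

```python
import re

def extract_number(line):
    """Return the integer inside leading [brackets], or 0 if there is none."""
    match = re.match(r'\[(\d+)\]', line)
    return int(match.group(1)) if match else 0

# Hypothetical lines in the format described in the question.
lines = [
    "[83847273747] Hey my name is James!\n",
    "[37273738] Hello world!\n",
    "[37999999] Another message\n",
    "no number on this line\n",
]

# Sort by the bracketed number, then keep only the lines inside the range.
by_number = sorted(lines, key=extract_number)
in_range = [L for L in by_number if 37273700 < extract_number(L) < 38000000]
print(''.join(in_range), end='')
```

Lines without a bracketed number get the key 0, so they sort to the front and fall outside the range filter.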

how to create a list and sum up

I am relatively new to python and got stuck on the below:
Below is the code I am working with
import re
handle = open ('RegExWeek2.txt')
for line in handle:
    line = line.rstrip()
    x = re.findall('[0-9]+', line)
    if len(x) > 0:
        print x
The return from this code looks like this:
['7430']
['9401', '9431']
['2248', '2047']
['5517']
['3184', '1241']
['9939']
['2185', '9450', '8428']
['369']
['3683', '6442', '7654']
Question: how do I combine this to one list and sum up the numbers?
Please help
You may change your code like this,
handle = open ('RegExWeek2.txt')
num = []
for line in handle:
    num.extend(re.findall('[0-9]+', line))
print(sum(int(i) for i in num))
Since you're using re.findall, the line.rstrip() call is not necessary.
The len(x) > 0 check is not needed either: when a line contains no digits, re.findall returns an empty list, and extending num with an empty list adds nothing.
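The difference between extend and append is what turns the per-line lists into one flat list. A small sketch using the first three rows of the output shown in the question (the name rows is made up):

```python
# First three rows of the output shown in the question.
rows = [['7430'], ['9401', '9431'], ['2248', '2047']]

num = []
for x in rows:
    num.extend(x)        # extend adds each item; append would add the list itself

print(num)
print(sum(int(i) for i in num))  # 30557
```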
There's no need to rstrip, and you should open files using with:
import re
all_numbers = []
with open('RegExWeek2.txt') as file:
    for line in file:
        numbers = re.findall('[0-9]+', line)
        for number in numbers:
            all_numbers.append(int(number))
print(sum(all_numbers))
This is really beginner code, and a direct translation of yours. Here's how I would write it:
with open('RegExWeek2.txt') as file:
    all_numbers = [int(num) for num in re.findall('[0-9]+', file.read())]
print(sum(all_numbers))

Adding Up Numbers From A List In Python

Basically I'm trying to complete a "read" file. I have made the "make" file that generates 10 random numbers and writes them to a text file. Here's what I have so far for the "read" file...
def main():
    infile = open('mynumbers.txt', 'r')
    nums = []
    line = infile.readline()
    print ('The random numbers were:')
    while line:
        nums.append(int(line))
        print (line)
        line = infile.readline()
    total = sum(line)
    print ('The total of the random numbers is:', total)
main()
I know it's incomplete; I'm still a beginner and this is my first introduction to computer programming and Python. Basically I have to use a loop to gather up the sum of all the numbers listed in mynumbers.txt. Any help would be GREATLY appreciated. This has been driving me up a wall.
You don't need to iterate manually in Python (this isn't C, after all):
nums = []
with open("mynumbers.txt") as infile:
    for line in infile:
        nums.append(int(line))
Now you just have to take the sum, but of nums, of course, not of line:
total = sum(nums)
The usual one-liner:
total = sum(map(int, open("mynumbers.txt")))
In Python 2 this does generate a list of integers (albeit very temporarily); in Python 3, map() is lazy and no intermediate list is built.
Although I would go with Tim's answer above, here's another way if you want to use the readlines method:
# Open a file
infile = open('mynumbers.txt', 'r')
total = 0  # named total so the built-in sum() is not shadowed
lines = infile.readlines()
for num in lines:
    total += int(num)
infile.close()
print(total)
Just another solution... :-)
with open("x.txt") as file:
    total = sum(int(line) for line in file)
This solution sums the "results" of a generator object so it isn't memory intensive yet short and elegant (pythonic).
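One caveat with the generator version: int() accepts the trailing newline on each line but raises ValueError on an empty line. A hedged sketch that skips blank lines; the temporary file and its numbers are made up so the example is self-contained:

```python
import os
import tempfile

# Write a throwaway file; the numbers and the trailing blank line are made up.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("3\n14\n15\n\n")
    path = tmp.name

with open(path) as infile:
    # int() tolerates the trailing newline but not an empty line, so skip those.
    total = sum(int(line) for line in infile if line.strip())

os.remove(path)
print(total)  # 32
```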
