Sum of everything inside a list [duplicate] - python

I am extracting numbers from sentences within a large file. Example sentence (in file finalsum.txt):
the cat in the 42 hat and the cow 1772 jumps over the moon.
I then use regular expressions to build a list of the numbers, and I want to add them together (1772 + 42, and so on for the entire file) to get a final sum of all the numbers.
Here is my current code :
import re
import string
fname = raw_input('Enter file name: ')
try:
    if len(fname) < 1: fname = "finalsum.txt"
    handle = open(fname, 'r')
except:
    print 'Cannot open file:', fname
counts = dict()
numlist = list()
for line in handle:
    line = line.rstrip()
    x = re.findall('([0-9]+)', line)
    if len(x) > 0:
        result = map(int, x)
        numlist.append(result)
print numlist
I know this code can be written in two lines using a list comprehension, but I am just learning. Thank you!

You should not append each result list to numlist; instead, join the lists using the + operator (here, +=).
for line in handle:
    line = line.rstrip()
    x = re.findall('([0-9]+)', line)
    if len(x) > 0:
        result = map(int, x)
        numlist+=result
        #numlist.append(result)
print numlist
print sum(numlist)
Illustration:
>>> a = [1,2]
>>> b = [2,3]
>>> c = [4,5]
>>> a.append(b)
>>> a
[1, 2, [2, 3]]
>>> print b + c
[2, 3, 4, 5]
As you can see, append() adds the list object itself as a single element at the end, whereas + joins the two lists.
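For reference, here is a minimal Python 3 sketch of the same idea, assuming the lines are already available as strings. The function name sum_numbers is hypothetical and not from the question. It uses list.extend(), which behaves like += here, so the list stays flat and sum() works on it:

```python
import re

def sum_numbers(lines):
    """Collect every run of digits from the given lines and return their sum."""
    numlist = []
    for line in lines:
        # extend() adds each integer individually; append() would nest a list
        numlist.extend(int(n) for n in re.findall(r'[0-9]+', line))
    return sum(numlist)

sample = ['the cat in the 42 hat and the cow 1772 jumps over the moon.']
print(sum_numbers(sample))
```

Passing an open file object instead of the sample list should work the same way, since iterating a file yields its lines.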

Check this:
import re
line = 'the cat in the 42 hat and the cow 1772 jumps over the moon 11 11 11'
y = re.findall('([0-9]+)', line)
print sum([int(z) for z in y])
This does not create nested lists; instead, it converts every value in the list to an integer and adds them all up. You can keep a temporary variable that stores the final sum, and keep adding the sum of the integers on the current line to it.
Like this:
answer = 0
for line in handle:
    line = line.rstrip()
    x = re.findall('([0-9]+)', line)
    if len(x) > 0:
        answer += sum([int(z) for z in x])
print answer

Related

How to put a group of integers in a row in a text file into a list?

I have a text file composed mostly of numbers something like this:
3 011236547892X
9 02321489764 Q
4 031246547873B
I would like to extract each of the following (spaces 5 to 14 (counting from zero)) into a list:
1236547892
321489764
1246547873
(Please note: each "number" is 10 "characters" long - the second row has a space at the end.)
and then perform analysis on the contents of each list.
I have umpteen versions, however I think I am closest with:
with open('k_d_m.txt') as f:
    for line in f:
        range = line.split()
        num_lst = [x for x in range(3,10)]
        print(num_lst)
However, I get: TypeError: 'list' object is not callable
What is the best way forward?
What I want to do with num_lst is, amongst other things, as follows:
num_lst = list(map(int, str(num)))
print(num_lst)
nth = 2
odd_total = sum(num_lst[0::nth])
even_total = sum(num_lst[1::nth])
print(odd_total)
print(even_total)
if odd_total - even_total == 0 or odd_total - even_total == 11:
    print("The number is ok")
else:
    print("The number is not ok")
Use a simple slice:
with open('k_d_m.txt') as f:
    num_lst = [x[5:15] for x in f]
Response to comment:
with open('k_d_m.txt') as f:
    for line in f:
        num_lst = list(line[5:15])
        print(num_lst)
First of all, you shouldn't name your variable range, because that shadows the built-in range() function; once range refers to a list, calling range(3,10) fails with "'list' object is not callable". You can get characters 5 through 14 of a string using string[5:15]. Try this:
num_lst = []
with open('k_d_m.txt') as f:
    for line in f:
        num_lst.append(line[5:15])
print(num_lst)
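To connect the slice with the digit check described in the question, here is a rough Python 3 sketch. It assumes the field really occupies columns 5 to 14; the sample rows are built by hand for illustration, because the exact spacing of the original file is not visible in the post. The helper names extract_field and alternating_sums are hypothetical.

```python
def extract_field(line, start=5, stop=15):
    """Return the fixed-width field at columns start..stop-1, without padding."""
    return line[start:stop].strip()

def alternating_sums(number_text):
    """Sum the digits at even indices and at odd indices separately."""
    digits = [int(ch) for ch in number_text]
    return sum(digits[0::2]), sum(digits[1::2])

# Hypothetical sample rows: a 5-character prefix, the 10-character field, a suffix.
rows = ['3  01' + '1236547892' + 'X',
        '9  02' + '321489764 ' + 'Q']

for row in rows:
    number = extract_field(row)
    odd_total, even_total = alternating_sums(number)
    diff = odd_total - even_total
    verdict = 'ok' if diff in (0, 11) else 'not ok'
    print(number, odd_total, even_total, verdict)
```

Stripping the field removes the trailing space on the second row, so int() never sees a blank character.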

Splitting an array into two arrays in Python

I have a list of numbers like so;
7072624 through 7072631
7072672 through 7072687
7072752 through 7072759
7072768 through 7072783
The code below is what I have so far. I've removed the word "through", and it now prints a list of numbers.
import os
def file_read(fname):
    content_array = []
    with open (fname) as f:
        for line in f:
            content_array.append(line)
    #print(content_array[33])
    #new_content_array = [word for line in content_array[33:175] for word in line.split()]
    new_content_array = [word for line in content_array[33:37] for word in line.split()]
    while 'through' in new_content_array: new_content_array.remove('through')
    print(new_content_array)
file_read('numbersfile.txt')
This gives me the following output.
['7072624', '7072631', '7072672', '7072687', '7072752', '7072759', '7072768', '7072783']
What I want to do, but am struggling to work out, is how to split new_content_array into two arrays so that the output is as follows.
array1 = [7072624, 7072672, 7072752, 7072768]
array2 = [7072631, 7072687, 7072759, 7072783]
I then want to subtract each value in array 1 from the corresponding value in array 2:
7072631 - 7072624
7072687 - 7072672
7072759 - 7072752
7072783 - 7072768
I've been having a search but can't find anything similar to my situation.
Thanks in advance!
Try this below:
list_data = ['7072624', '7072631', '7072672', '7072687', '7072752', '7072759', '7072768', '7072783']
array1 = [int(list_data[i]) for i in range(len(list_data)) if i % 2 == 0]
array2 = [int(list_data[i]) for i in range(len(list_data)) if i % 2 != 0]
l = ['7072624', '7072631', '7072672', '7072687', '7072752', '7072759','7072768', '7072783']
l1 = [l[i] for i in range(len(l)) if i % 2 == 0]
l2 = [l[i] for i in range(len(l)) if i % 2 == 1]
print(l1) # ['7072624', '7072672', '7072752', '7072768']
print(l2) # ['7072631', '7072687', '7072759', '7072783']
result = list(zip(l1,l2))
As a result you will get:
[('7072624', '7072631'),
('7072672', '7072687'),
('7072752', '7072759'),
('7072768', '7072783')]
I think a list comprehension is the clearest way to do this, but you could also use filter.
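As a shorter variant of the index-based comprehensions, a slice with a step of 2 picks every other item. This Python 3 sketch also computes the differences the question asks for; the name differences is an assumption introduced here.

```python
list_data = ['7072624', '7072631', '7072672', '7072687',
             '7072752', '7072759', '7072768', '7072783']

numbers = [int(s) for s in list_data]
array1 = numbers[0::2]   # items at even indices: start of each range
array2 = numbers[1::2]   # items at odd indices: end of each range

# Subtract each start from its matching end.
differences = [end - start for start, end in zip(array1, array2)]
print(differences)
```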
You could split each line on the keyword through, then remove all non-numeric characters (such as newlines or spaces) using a lambda function and a regex inside a list comprehension:
import os
import re
def file_read(fname):
    new_content_array = []
    with open (fname) as f:
        for line in f:
            line_array = line.split('through')
            new_content_array.append([(lambda x: re.sub(r'[^0-9]', "", x))(element) for element in line_array])
    print(new_content_array)
file_read('numbersfile.txt')
Output looks like this:
[['7072624', '7072631'], ['7072672', '7072687'], ['7072752', '7072759'], ['7072768', '7072783']]
Then you can extract the first element of each nested list into one variable and the second element of each into another.
Good luck
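One possible way to do that last extraction step in Python 3 is to transpose the nested list with zip(*...). This is only a sketch; the variable names firsts and seconds are made up for illustration.

```python
pairs = [['7072624', '7072631'], ['7072672', '7072687'],
         ['7072752', '7072759'], ['7072768', '7072783']]

# zip(*pairs) transposes the list of pairs into two tuples of strings.
firsts, seconds = zip(*pairs)
array1 = [int(s) for s in firsts]
array2 = [int(s) for s in seconds]
print(array1)
print(array2)
```

Note that unpacking into two names assumes every nested list has exactly two elements and that pairs is not empty.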

How can I merge each two lines of a large text file into a Python list?

I have a .txt file that is split into multiple lines, but each two of these lines I would like to merge into a single line of a list. How do I do that?
Thanks a lot!
What I have is organized like this:
[1 2 3 4
5 6]
[1 2 3 4
5 6 ]
while what I need would be:
[1 2 3 4 5 6]
[1 2 3 4 5 6]
data =[]
with open(r'<add file path here >','r') as file:
    x = file.readlines()
for i in range(0,len(x),2):
    data.append(x[i:i+2])
new =[' '.join(i) for i in data]
for i in range(len(new)):
    new[i]=new[i].replace('\n','')
new_file_name = r'' #give new file path here
with open(new_file_name,'w+') as file:
    for i in new:
        file.write(i+'\n')
Try This
final_data = []
with open('file.txt') as a:
    fdata= a.readlines()
for ln in range(0,len(fdata),2):
    final_data.append(" ".join([fdata[ln].strip('\n'), fdata[ln+1].strip('\n')]))
print (final_data)
I feel you can use a regex for solving this scenario :
#! /usr/bin/env python2.7
import re
with open("textfilename.txt") as r:
    text_data = r.read()
# search the file contents (text_data), not the file object r
independent_lists = re.findall(r"\[(.+?)\]", text_data, re.DOTALL)
#now that we have got each independent_list we can next work on
#turning it into a list
final_list_of_objects = [each_string.replace("\n"," ").split() for each_string in independent_lists]
print final_list_of_objects
However, if you do not want list objects and just want the text without the newline characters inside each bracketed list, then:
#! /usr/bin/env python2.7
import re
with open("textfilename.txt") as r:
    text_data = r.read()
new_txt = ""
bool_char = False  # True while we are between "[" and "]"
for each_char in text_data:
    if each_char == "[":
        bool_char = True
    elif each_char == "]":
        bool_char = False
    elif each_char == "\n" and bool_char:
        each_char = " "
    new_txt += each_char
# collapse runs of spaces only; \s+ would also swallow the newlines between lists
new_txt = re.sub(r" +", " ", new_txt)
You can do two things here:
1) If the text file was created by writing using numpy's savetxt function, you can simply use numpy.loadtxt function with appropriate delimiter.
2) Read file in a string and use a combination of replace and split functions.
file = open(filename,'r')
dataset = file.read()
dataset = dataset.replace('\n',' ').replace('] ',']\n').split('\n')
dataset = [x.replace('[','').replace(']','').split(' ') for x in dataset]
with open('test.txt') as file:
    new_data = (" ".join(line.strip() for line in file).replace('] ',']\n').split('\n')) # ['[1 2 3 4 5 6]', ' [1 2 3 4 5 6 ]']
with open('test.txt','w+') as file:
    for data in new_data:
        file.write(data+'\n')
line.rstrip() only removes trailing whitespace (including the newline '\n') from the line.
You need to pass all of the read and stripped lines to ' '.join(), not
each line by itself. Strings in Python are sequences too, so the string
contained in line is interpreted as separate characters when passed on
its own to ' '.join().
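The answers above that index with ln+1 assume an even number of lines. A hedged Python 3 sketch that also tolerates a trailing unpaired line could use itertools.zip_longest; the function name merge_pairs is hypothetical.

```python
from itertools import zip_longest

def merge_pairs(lines):
    """Join every two consecutive lines into one string, separated by a space."""
    stripped = [line.strip() for line in lines]
    it = iter(stripped)
    # Passing the same iterator twice makes zip_longest consume lines two at a time;
    # fillvalue='' covers a trailing unpaired line.
    return [' '.join(part for part in pair if part)
            for pair in zip_longest(it, it, fillvalue='')]

sample = ['[1 2 3 4\n', '5 6]\n', '[1 2 3 4\n', '5 6 ]\n']
print(merge_pairs(sample))
```

An open file can be passed in place of the sample list, since the function only iterates over its argument once.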

No output from sum of Python regex list

The problem is to read the file, look for integers using the re.findall(), looking for a regular expression of '[0-9]+' and then converting the extracted strings to integers and summing up the integers.
MY CODE: in which sample.txt is my text file
import re
hand = open('sample.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall('[0-9]+',line)
print x
x = [int(i) for i in x]
add = sum(x)
print add
OUTPUT:
You need to add the findall results to another list, so that the numbers found on the current line are kept when the loop moves on to the next line.
import re
hand = open('sample.txt')
l = []
for line in hand:
    x = re.findall('[0-9]+',line)
    l.extend(x)
j = [int(i) for i in l]
add = sum(j)
print add
or
with open('sample.txt') as f:
    print sum(map(int, re.findall(r'\d+', f.read())))
try this
import re
hand = open("a.txt")
x=list()
for line in hand:
    y = re.findall('[0-9]+',line)
    x = x+y
total=0  # named total so the built-in sum() is not shadowed
for z in x:
    total = total + int(z)
print(total)
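For anyone on Python 3, the whole-file approach can be sketched as below. The function name total_from_file is an assumption, and io.StringIO is used only so the example runs without a real sample.txt.

```python
import io
import re

def total_from_file(handle):
    """Sum every run of digits found anywhere in an open text file."""
    return sum(int(n) for n in re.findall(r'[0-9]+', handle.read()))

# io.StringIO stands in for open('sample.txt') so the sketch runs on its own.
fake_file = io.StringIO('the 42 hat\nno numbers here\nthe cow 1772 and 11\n')
print(total_from_file(fake_file))
```

Reading the whole file at once is simple but holds the full text in memory; for very large files, the line-by-line loops above may be the safer choice.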

Python3: read from a file and sort the values

I have a txt file that contains data in the following fashion:
13
56
9
32
99
74
2
each value on a different line. I created three functions:
the first one is to swap the values
def swap(lst,x,y):
    temp = lst[x]
    lst[x] = lst[y]
    lst[y] = temp
and the second function is to sort the values:
def selection_sort(lst):
    for x in range(0,len(lst)-1):
        print(lst)
        swap(lst,x,findMinFrom(lst[x:])+ x)
the third function is to find the minimum value from the list:
def findMinFrom(lst):
    minIndex = -1
    for m in range(0,len(lst)):
        if minIndex == -1:
            minIndex = m
        elif lst[m] < lst[minIndex]:
            minIndex = m
    return minIndex
Now, how can I read from the file that contains the numbers and print them sorted?
Thanks in advance!
I used:
def main():
    f = []
    filename = input("Enter the file name: ")
    for line in open(filename):
        for eachElement in line:
            f += eachElement
    print(f)
    selectionSort(f)
    print(f)
main()
but still not working! any help?
Good programmers don't reinvent the wheel and use sorting routines that are standard in most modern languages. You can do:
with open('input.txt') as fp:
    for line in sorted(fp):
        print(line, end='')
to print the lines sorted alphabetically (as strings). And
with open('input.txt') as fp:
    for val in sorted(map(int, fp)):
        print(val)
to sort numerically.
To read all the lines in a file:
f = open('test.txt')
your_listname = list(f)
To sort and print
selection_sort(your_listname)
print(your_listname)
You may need to strip newline characters before sorting/printing
stripped_listname=[]
for i in your_listname:
    i = i.strip('\n')
    stripped_listname.append(i)
You probably also want to take the print statement out of your sort function so it doesn't print the list many times while sorting it.
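If the goal is to keep a hand-written selection sort, a Python 3 sketch that reads one integer per line and sorts in place might look like this. The names find_min_from and read_numbers are illustrative, and io.StringIO stands in for the real file. Converting to int matters: left as strings, the values would be compared alphabetically, so '13' would sort before '2'.

```python
import io

def find_min_from(lst, start):
    """Return the index of the smallest item in lst[start:]."""
    min_index = start
    for i in range(start + 1, len(lst)):
        if lst[i] < lst[min_index]:
            min_index = i
    return min_index

def selection_sort(lst):
    """Sort lst in place by repeatedly swapping the minimum into position."""
    for x in range(len(lst) - 1):
        m = find_min_from(lst, x)
        lst[x], lst[m] = lst[m], lst[x]

def read_numbers(handle):
    """Read one integer per non-empty line."""
    return [int(line) for line in handle if line.strip()]

# io.StringIO stands in for the real file so the sketch runs on its own.
numbers = read_numbers(io.StringIO('13\n56\n9\n32\n99\n74\n2\n'))
selection_sort(numbers)
print(numbers)
```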
