sum up objects of a list - python

I'm writing code that should take in a filename and create an initial list. Then, I'm trying to sum up each item in the list. The code I've written so far looks something like this...
filename = input('Enter filename: ')
Lists = []
for line in open(filename):
    line = line.strip().split()
    Lists = line
print(Lists)
total = 0
for i in Lists:
    total = sum(int(Lists[i]))
print(total)
I take in a filename and assign the items on each line to the list. Then I create a variable total, which should hold the sum of the items in the list. For instance, if Lists = [1,2,3] then the total will be 6. However, is it possible to append integer objects to a list? The error I'm receiving is...
File "/Users/sps329/Desktop/testss copy 2.py", line 10, in main
total = sum(int(Lists[i]))
TypeError: list indices must be integers, not str
Something like this also doesn't work, because the items in the list are strings and not numbers. Would I have to use isdigit even though I know the input file will always contain integers?...
total = sum(i)

Instead of
Lists = line
you need
Lists.append(line)
You can get the total sum like this
total = sum(sum(map(int, item)) for item in Lists)
If you don't want to create a list of lists, you can use the extend function
Lists.extend(line)
...
total = sum(map(int, Lists))
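For reference, here is a minimal self-contained sketch of the extend approach. The original TypeError arises because `for i in Lists` yields the string items themselves, not integer indices, so `Lists[i]` indexes the list with a string. The helper name `sum_tokens` is made up for illustration, and `io.StringIO` stands in for the opened file (an assumption, so that the snippet runs without a real file on disk).

```python
import io

def sum_tokens(lines):
    """Sum every whitespace-separated integer token found in `lines`."""
    tokens = []
    for line in lines:
        # split() returns strings; extend keeps one flat list of tokens
        tokens.extend(line.split())
    # convert each string token to int before summing
    return sum(map(int, tokens))

# io.StringIO stands in for open(filename); both yield lines when iterated
fake_file = io.StringIO("1 2 3\n4 5\n")
total = sum_tokens(fake_file)
print(total)
```

No isdigit check is needed if the file is known to contain only integers; int() raises a ValueError on anything else.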

# creates a list of the lines in the file and closes the file
with open(filename) as f:
    Lists = f.readlines()
# just in case, perhaps not necessary
Lists = [i.strip() for i in Lists]
# convert all elements of Lists to ints
int_list = [int(i) for i in Lists]
# sum elements of Lists
total = sum(int_list)

print(sum(float(x.strip()) for x in open(filename)))

Related

Python: how to dynamically add a number to the end of the name of a list

In python I am appending elements to a list. The element name I am getting from an input file, so the element_name is an unknown. I would like to create a new list by adding a number to the end of the original list (dyn_list1) or even by using one of the elements in the list (element_name). How can this be achieved?
dyn_list = []
for i in range(4):
    dyn_list.append(element_name)
I thought it would look something like:
dyn_list = []
for i in range(4):
    dyn_list%i.append(element_name)
print(dyn_list)
But that obviously doesn't work.
To try to be clearer, what I am ultimately trying to do is create a dynamically named list, with a list name that I didn't have to explicitly define in advance.
Update:
This is not perfect, but it gets close to accomplishing what I want it to:
for i in range(4):
    dyn_list.append(element_name)
    globals()[f"dyn_list{i}"] = dyn_list
print("dyn_list %s" %dyn_list)
print("Dyn_list1 %s" %dyn_list1)
my_list = ["something"]
g = globals()
for i in range(1, 5):
    g['dynamiclist_{0}'.format(i)] = my_list
print(dynamiclist_1)
print(dynamiclist_2)
print(dynamiclist_3)
print(dynamiclist_4)
Something like that?
I'm not sure I understand correctly, but I think you want to append data read from a text file (or something similar) to the elements of your list. I'm answering on that assumption. I hope it helps.
list_ = ["ethereum", "is", "future"]  # your list
new_list = []  # declare your new list
list_len = len(list_)  # get list len
with open('dynamic.txt') as f:  # read your file for dynamic string
    lines = f.readlines()
while list_len:  # so that the dynamic data does not exceed the length of the list
    for i in range(list_len):
        new_element = list_[i] + "." + str(lines[i])
        new_list.append(new_element)
        list_len -= 1  # decrease to finish the while loop
for p in new_list: print(p)
You may want to use a dictionary instead of a list. You could use something like
dyn_list = {}
for i in range(4):
    element_name = ...  # some code to get element_name
    dyn_list[i] = element_name
Then if you want to retrieve the ith element, you can use
dyn_list[i]
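The same idea also works with generated names as keys, which avoids writing to globals(). The sketch below is only an illustration: the element names are hypothetical placeholders for whatever the input file provides, and `dyn_lists` is a made-up variable name.

```python
# Hypothetical element names; in the question they come from an input file
element_names = ["alpha", "beta", "gamma", "delta"]

# Map each generated name to its own list instead of creating global variables
dyn_lists = {}
for i, element_name in enumerate(element_names):
    dyn_lists[f"dyn_list{i}"] = [element_name]

# Look a list up by its generated name
print(dyn_lists["dyn_list1"])
```

Each key here gets its own list object, whereas binding the same list to several names in globals() makes every name refer to one shared list.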

How to read and create a new list without duplicate words in Python?

I am new to Python and I have the following problem to solve:
"Open the file sample.txt and read it line by line. For each line, split the line into a list of words using the split() method. The program should build a list of words. For each word on each line check to see if the word is already in the list and if not append it to the list. When the program completes, sort and print the resulting words in alphabetical order."
I have written the following code, with some good results, but I can't understand why my output appears as multiple lists. I just need to have the words in one list.
Thanks in advance!
fname = input("Enter file name: ")
fh = open(fname)
lst = list()
lst=fh.read().split()
final_list=list()
for line in lst:
    if line in lst not in final_list:
        final_list.append(line)
        final_list.sort()
        print(final_list)
Your code is largely correct; the major problem is the conditional on your if statement:
if line in lst not in final_list:
Python treats in and not in as comparison operators and chains them, so this condition is evaluated as:
if line in lst and lst not in final_list:
Both parts are always true here (every line comes from lst, and final_list holds strings, never the list lst itself), so every word is appended, duplicates included. What you want is simply:
if line not in final_list:
Right now, you're sorting and printing your list inside the loop, but it would be better to do that once at the end, making your code look like this:
fname = input("Enter file name: ")
fh = open(fname)
lst = list()
lst=fh.read().split()
final_list=list()
for line in lst:
    if line not in final_list:
        final_list.append(line)
final_list.sort()
print(final_list)
I have a few additional comments on your code:
You don't need to explicitly initialize a variable (as in lst = list()) if you're going to immediately assign something to it. You can just write:
fh = open(fname)
lst=fh.read().split()
On the other hand, you do need to initialize final_list because
you're going to try to call the .append method on it, although it
would be more common to write:
final_list = []
In practice, it would be more common to use a set to
collect the words, since a set will de-duplicate things
automatically:
final_list = set()
for line in lst:
    final_list.add(line)
print(sorted(final_list))
Lastly, if I were to write this code, it might look like this:
fname = input("Enter file name: ")
with open(fname) as fh:
    lst = fh.read().split()
final_list = set(word.lower() for word in lst)
print(sorted(final_list))
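The set-based approach can also be wrapped in a small function and tried without a file. This is only a sketch: the function name `unique_sorted_words` and the sample text are made up for illustration, and unlike the version above it does not lowercase the words.

```python
def unique_sorted_words(text):
    """Return the distinct whitespace-separated words of `text`, sorted."""
    # set() drops duplicates; sorted() returns a new alphabetically ordered list
    return sorted(set(text.split()))

# Sample text standing in for the contents of the input file
sample = "the quick fox\njumps over the lazy fox\n"
words = unique_sorted_words(sample)
print(words)
```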
Your code has the following problems as written:
if line in lst not in final_list - it's not clear what you are trying to do here. I think you expect this to go over all the words in the line and check each one against final_list.
Your code also has some indentation issues.
The call to the close() method is missing.
You need to read the whole file into a list of words, iterate over that list, and add each word to the final list only if it is not already there:
fname = input("Enter file name: ")
fh = open(fname)
lst = list()
lst = fh.read().split()
final_list=list()
for word in lst:
    if word not in final_list:
        final_list.append(word)
final_list.sort()
print(final_list)
fh.close()

Changing data types within a list of lists

I have a list of 50 lists; each sub-list has 5 elements. What I want to do is take the 2nd element of each sub-list and change it from a string into an integer, for 49 of the lists.
for i in range(len(Data)):
    Data[i] = Data[i].strip()
    Data[i] = Data[i].split(',')
    Data[i] = int(x) for x in Data[[1:][1]]
In my mind this should start with element 2 of the main list and then change element 1 into an integer for all the lists. Hence the range 1:
But obviously this is not working.
Each list inside the list has a state name as element 0 and the state population as element 1. I want to sum the state populations, but first I need to convert the population figure from a string to an integer.
First element of the list is:
[['Alabama', '4802982']
I want to change '4802982' to an integer so I can use the sum function and add up the populations of the other 49 states (each following list) as well.
Since you only want to convert the second element of each inner list from a string to an int, you can try it this way.
# From your question I assume your data will be like
Data = [["list-1", "1", "something"], ["list-2", "2", "something else"]]
updated_data = []
for d in Data:
    d[1] = int(d[1])
    updated_data.append(d)
print(updated_data)
Output: [['list-1', 1, 'something'], ['list-2', 2, 'something else']]
How about converting only the second element (calling int on the state name would raise a ValueError):
Data[i][1] = int(Data[i][1])
It's a good idea to make the data conversion a function of its own that works on a single "row" or "line":
def map_line(line):
    line = line.strip().split(',')
    # wrap line[0] in a list so it can be concatenated with the converted values
    return [line[0]] + [int(x) for x in line[1:]]

for i in range(len(Data)):
    Data[i] = map_line(Data[i])
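Putting the conversion and the sum together might look like the sketch below. It assumes each raw row is a "name,population" string; the first row uses the figure from the question, the second row is a made-up placeholder, and `parse_row` is a hypothetical helper name.

```python
# Rows in the "name,population" layout described in the question.
# "Exampleland,1000" is a made-up placeholder row, not real data.
raw_rows = ["Alabama,4802982", "Exampleland,1000"]

def parse_row(row):
    """Split one comma-separated row and convert its second field to int."""
    fields = row.strip().split(",")
    fields[1] = int(fields[1])
    return fields

data = [parse_row(row) for row in raw_rows]

# Sum only the population column (element 1 of each inner list)
total_population = sum(row[1] for row in data)
print(total_population)
```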

Several list comprehensions - one after each other

I have written some code, and to try and grasp the concept of list comprehensions, I am trying to convert some of the code into list comprehensions.
I have a nested for loop:
with (Input) as searchfile:
    for line in searchfile:
        if '*' in line:
            ID = line[2:13]
            IDstr = ID.strip()
            print IDstr
            hit = line
            for i, x in enumerate(hit):
                if x=='*':
                    position.append(i)
            print position
I have made the first part of the code into a list comprehension as such:
ID = [line[2:13].strip() for line in Input if '*' in line]
print ID
This works fine. I have tried to convert the next part too, but it is not working as intended. How do I run several list comprehensions one after another? The "Hit = …" part below works fine if it is the first list comprehension, but not if it is the second. The same goes for the one above: it seems to work only if it is the first. Why is this?
Hit = [line for line in Input if '*' in line]
print Hit
Positions = [(i, x) for i, x in enumerate(Hit) if x == '*']
print Positions
it seems to work only, if it is the first. Why is this?
This is because file objects (Input in your case) are iterators, i.e. they are exhausted once you have iterated over them. In your for loop this is not a problem, because you iterate over the file just once for both ID and position. If you want to use two list comprehensions like this, you either have to open the file anew for the second one, or read the lines from the file into a list and use that list in the list comprehensions.
Also note that your positions list comprehension is wrong, as it enumerates the Hit list, and not each of the elements in the list, as was the case in your loop.
You could try like this (not tested):
# first, get the lines with '*' just once, cached as a list
star_lines = [line for line in Input if '*' in line]
# now get the IDs using those cached lines
ids = [line[2:13].strip() for line in star_lines]
# for the positions we need a nested list comprehension
positions = [i for line in star_lines for i, x in enumerate(line) if x == '*']
That nested list comprehension is about equivalent to this nested loop:
positions = []
for line in star_lines:
    for i, x in enumerate(line):
        if x == '*':
            positions.append(i)
Basically, you just "flatten" that block of code and put the thing to be appended to the front.
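The exhaustion effect is easy to reproduce without a real file. In this sketch `io.StringIO` stands in for the opened file (an assumption for the sake of a runnable example), and the sample lines are made up.

```python
import io

# io.StringIO behaves like an open text file: it can be iterated only once
handle = io.StringIO("  AB*CD\nno star here\n  EF**G\n")

first_pass = [line for line in handle if '*' in line]
second_pass = [line for line in handle if '*' in line]  # handle is already exhausted

print(len(first_pass), len(second_pass))

# Nested comprehension over the cached lines: one index per '*' character
positions = [i for line in first_pass for i, x in enumerate(line) if x == '*']
print(positions)
```

The second comprehension comes back empty because the first one consumed every line; caching the lines in a list, as suggested above, avoids the problem.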

finding sum of values in a nested dictionary in python

I have around 20000 text files, numbered 5.txt, 10.txt and so on.
I am storing the file paths of these files in a list "list2" that I have created.
I also have a text file "temp.txt" with a list of 500 words
vs
mln
money
and so on..
I am storing these words in another list "list" that I have created.
Now I create a nested dictionary d2[file][word] = frequency count of "word" in "file".
Now I need to iterate through these words for each text file, as I am trying to get the following output:
filename.txt- sum(d[filename][word]*log(prob))
Here, filename.txt is of the form 5.txt, 10.txt and so on.
"prob" is a value that I have already obtained.
I basically need to find the sum of the inner keys' (words') values (the frequencies of the words) for every outer key (file).
Say:
d['5.txt']['the']=6
Here "the" is my word and "5.txt" is the file. 6 is the number of times "the" occurs in "5.txt".
Similarly:
d['5.txt']['as']=2.
I need to find the sum of the dictionary values.
So, for 5.txt, I need my answer to be:
6*log(prob('the'))+2*log(prob('as'))+... (for all the words in list)
I need this to be done for all the files.
My problem lies in the part where I am supposed to iterate through the nested dictionary
import collections, sys, os, re
sys.stdout=open('4.txt','w')
from collections import Counter, defaultdict
from glob import glob
folderpath='d:/individual-articles'
folderpaths='d:/individual-articles/'
counter=Counter()
filepaths = glob(os.path.join(folderpath,'*.txt'))
#test contains: d:/individual-articles/5.txt,d:/individual-articles/10.txt,d:/individual-articles/15.txt and so on...
with open('test.txt', 'r') as fi:
    list2= [line.strip() for line in fi]
#temp contains the list of words
with open('temp.txt', 'r') as fi:
    list= [line.strip() for line in fi]
#the dictionary that contains d2[file][word]
d2 =defaultdict(dict)
for fil in list2:
    with open(fil) as f:
        path, name = os.path.split(fil)
        words_c = Counter([word for line in f for word in line.split()])
        for word in list:
            d2[name][word] = words_c[word]
#this portion is also for the generation of dictionary "prob", which is generated from file 2.txt; it can be overlooked!
with open('2.txt', 'r+') as istream:
    for line in istream.readlines():
        try:
            k,r = line.strip().split(':')
            answer_ca[k.strip()].append(r.strip())
        except ValueError:
            print('Ignoring: malformed line: "{}"'.format(line))
#my problem lies here
items = d2.items()
small_d2 = dict(next(items) for _ in range(10))
for fil in list2:
    total=0
    for k,v in small_d2[fil].items():
        total=total+(v*answer_ca[k])
    print("Total of {} is {}".format(fil,total))
from math import log

for fil in list2:  # list2 contains the filenames
    total = 0
    for k, v in d[fil].items():
        total += v * log(prob[k])  # where prob is a dict
    print("Total of {} is {}".format(fil, total))
with open(f) as fil binds fil to the open file object for the path f, not to the file name. When you later access the entries in your dictionary as
total=sum(math.log(prob)*d2[fil][word].values())
I believe you mean
total = sum(math.log(prob)*d2[f][word])
though, this doesn't seem to quite match up with the order you were expecting, so I would instead suggest something more like this:
word_list = [...]  # list of words
file_list = [...]  # list of files
dictionary = {...}  # your dictionary
summation = lambda file_name,prob: sum([(math.log(prob)*dictionary[word][file_name]) for word in word_list])
return_value = []
for file_name in file_list:
    prob = ...  # something
    return_value.append(summation(file_name, prob))
The summation line there is defining an anonymous function within python. These are called lambda functions. Essentially, what that line in particular means is:
summation = lambda file_name,prob:
is almost the same as:
def summation(file_name, prob):
and then
sum([(math.log(prob)*dictionary[word][file_name]) for word in word_list])
is almost the same as:
result = []
for word in word_list:
    result.append(math.log(prob)*dictionary[word][file_name])
return sum(result)
so in total you have:
summation = lambda file_name,prob: sum([(math.log(prob)*dictionary[word][file_name]) for word in word_list])
instead of:
def summation(file_name, prob):
    result = []
    for word in word_list:
        result.append(math.log(prob)*dictionary[word][file_name])
    return sum(result)
The list comprehension is usually somewhat faster than the explicit loop with append, although writing the function as a lambda rather than a def makes no difference to speed. A comprehension is often the more idiomatic choice when building a list, but there are certainly cases where a plain for loop is clearer.
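For the layout described in the question, d[file][word] with a separate probability per word, the per-file total could be computed as in the sketch below. The counts and probabilities are made-up sample values, `prob` is assumed to be a dict keyed by word, and `log_score` is a hypothetical helper name.

```python
import math

# Made-up sample data in the d[file][word] layout described in the question
d = {
    "5.txt": {"the": 6, "as": 2},
    "10.txt": {"the": 1, "as": 0},
}
# Made-up per-word probabilities (assumed to be a dict keyed by word)
prob = {"the": 0.5, "as": 0.25}

def log_score(counts, prob):
    """Sum count * log(prob[word]) over the words of one file."""
    return sum(count * math.log(prob[word]) for word, count in counts.items())

# One total per outer key (file)
totals = {name: log_score(counts, prob) for name, counts in d.items()}
for name, total in totals.items():
    print("Total of {} is {}".format(name, total))
```

Iterating over counts.items() visits each inner key once, so no separate word list is needed as long as every word in the inner dictionaries has an entry in prob.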
