python add specific lists within a list - python

For this problem I am dealing with a big list that was imported from a CSV file, but let's say
I have a list like this:
[['name','score1','score2','score3','score4'],
['Mike','5','1','6','2'],
['Mike','1','1','1','1'],
['Mike','3','0','3','0'],
['jose','0','1','2','3'],
['jose','2','3','4','5'],
['lisa','4','4','4','4']]
and I want to have another list in this form (the sum of all scores for each student):
[['Mike','9','2','10','3'],
['jose','2','4','6','8'],
['lisa','4','4','4','4']]
Any ideas how this can be done?
I've been trying many ways, and I could not make it work.
I got stuck when there were more than 2 rows with the same name; my solution only kept the last 2 lines to add.
I am new to Python and programming in general.

If you are just learning Python, I always recommend trying to implement things without relying on external libraries. A good first step is to break the problem up into smaller components:
1. Remove the first entry (the column titles) from the input list. You don't need it for your result.
2. For each remaining entry:
   a. Convert every value except the first to an integer (so you can add them).
   b. Determine whether you have already encountered an entry with the same name (first column value). If not, add the entry to the output list. Otherwise, merge the entry with the one already in the output list (by adding the values in each column).
One possible implementation follows (untested):
input_list = [['name','score1','score2','score3','score4'],
              ['Mike','5','1','6','2'],
              ['Mike','1','1','1','1'],
              ['Mike','3','0','3','0'],
              ['jose','0','1','2','3'],
              ['jose','2','3','4','5'],
              ['lisa','4','4','4','4']]
print(input_list)
# Remove the first element (the column titles)
input_list = input_list[1:]
# Initialize an empty output list
output_list = []
# Iterate through each entry in the input
for val in input_list:
    # Determine if key is already in output list
    for ent in output_list:
        if ent[0] == val[0]:
            # The value is already in the output list (so merge them)
            for i in range(1, len(ent)):
                # We convert to int and back to str
                # This could be done elsewhere (or not at all...)
                ent[i] = str(int(ent[i]) + int(val[i]))
            break
    else:
        # The value wasn't in the output list (so add it)
        # This is a useful feature of the for loop, the following
        # is only executed if the break command wasn't reached above
        output_list.append(val)
#print(input_list)
print(output_list)
The above is not as efficient as using a dictionary or importing a library that can perform the same operation in a couple of lines; however, it demonstrates a few features of the language. Be careful when working with lists, though: the above modifies the rows of the input list, because the output list holds references to the same row objects (try un-commenting the print statement for the input list at the end).
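For comparison, here is a minimal sketch of the dictionary approach mentioned above. The function name sum_scores is made up for illustration, and the sketch assumes Python 3.7+ (so that dicts keep insertion order) and that the first row is a header. Unlike the loop above, it builds new lists and leaves the input untouched.

```python
def sum_scores(rows):
    """Sum the score columns per name; rows[0] is assumed to be the header."""
    totals = {}  # name -> list of integer column sums
    for name, *scores in rows[1:]:
        values = [int(s) for s in scores]
        if name in totals:
            # Add column by column to the running totals
            totals[name] = [a + b for a, b in zip(totals[name], values)]
        else:
            totals[name] = values
    # Convert back to the list-of-strings layout used in the question
    return [[name] + [str(t) for t in sums] for name, sums in totals.items()]

input_list = [['name', 'score1', 'score2', 'score3', 'score4'],
              ['Mike', '5', '1', '6', '2'],
              ['Mike', '1', '1', '1', '1'],
              ['Mike', '3', '0', '3', '0'],
              ['jose', '0', '1', '2', '3'],
              ['jose', '2', '3', '4', '5'],
              ['lisa', '4', '4', '4', '4']]
output_list = sum_scores(input_list)
print(output_list)
```

Each name is looked up once per row in the dict, instead of scanning the whole output list every time.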

Let us say you have
In [45]: temp
Out[45]:
[['Mike', '5', '1', '6', '2'],
['Mike', '1', '1', '1', '1'],
['Mike', '3', '0', '3', '0'],
['jose', '0', '1', '2', '3'],
['jose', '2', '3', '4', '5'],
['lisa', '4', '4', '4', '4']]
Then you can use Pandas:
import pandas as pd
temp = pd.DataFrame(temp)
def test(m):
    # Convert numeric strings to int; leave names unchanged
    try:
        return int(m)
    except ValueError:
        return m
temp = temp.applymap(test)
print(temp.groupby(0).agg(sum))
If you are importing the data from a CSV file, you can read the file directly using pd.read_csv.

You could use one of the better solutions suggested, but if you'd like to implement it yourself and learn, you can follow along; I explain each step in the comments:
# utilities for iteration. groupby makes groups from a collection
from itertools import groupby
# implementation of common, simple operations such as
# multiplication, getting an item from a list
from operator import itemgetter
def my_sum(groups):
    return [
        ls[0] if i == 0 else str(sum(map(int, ls)))  # keep first one since it's name, sum otherwise
        for i, ls in enumerate(zip(*groups))  # transpose elements and give number to each
    ]
# list comprehension to make a list from another list
# group lists according to first element and apply our function on grouped elements
# groupby reveals group key and elements but key isn't needed so it's set to underscore
result = [my_sum(g) for _, g in groupby(ls, key=itemgetter(0))]
To understand this code, you need to know about list comprehensions, the * operator, the built-ins int, enumerate, map, str and zip, and two handy modules, itertools and operator.
You edited the question to add a header, which will break this code, so the header has to be removed: pass ls[1:] to groupby instead of ls. Hope it helps.
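One detail worth knowing: itertools.groupby only groups consecutive items with the same key, so the rows must already be ordered by name. The sketch below is a variant of the code above (not the answerer's exact code) that skips the header with ls[1:] and sorts the rows first. The shuffled sample data is made up to show the effect.

```python
from itertools import groupby
from operator import itemgetter

def my_sum(group):
    # zip(*group) transposes the rows of one group into columns
    columns = list(zip(*group))
    name = columns[0][0]
    return [name] + [str(sum(map(int, col))) for col in columns[1:]]

# Sample data with a header and with the rows deliberately out of order
ls = [['name', 'score1', 'score2', 'score3', 'score4'],
      ['Mike', '5', '1', '6', '2'],
      ['jose', '0', '1', '2', '3'],
      ['Mike', '1', '1', '1', '1'],
      ['lisa', '4', '4', '4', '4'],
      ['jose', '2', '3', '4', '5'],
      ['Mike', '3', '0', '3', '0']]

# Drop the header, then sort by name so equal names are adjacent
rows = sorted(ls[1:], key=itemgetter(0))
result = [my_sum(g) for _, g in groupby(rows, key=itemgetter(0))]
print(result)
```

Without the sorted() call, the shuffled input would produce several separate groups for Mike and jose.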

As a beginner, I would consider turning your data into a simpler structure such as a dictionary, so that you are just summing a list of lists. Assuming you get rid of the header row, you can turn the data into a dictionary:
>>> data_dict = {}
>>> for row in data:
...     data_dict.setdefault(row[0], []).append([int(i) for i in row[1:]])
...
>>> data_dict
{'Mike': [[5, 1, 6, 2], [1, 1, 1, 1], [3, 0, 3, 0]],
'jose': [[0, 1, 2, 3], [2, 3, 4, 5]],
'lisa': [[4, 4, 4, 4]]}
Now it should be relatively easy to loop over the dict and sum up the lists (you may want to look at sum and zip as a way to do that).
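The remaining step could look roughly like this. It is only a sketch: data_dict is copied from the output above, and totals is a name introduced here for illustration. zip(*rows) turns the rows for one student into columns, and sum adds up each column.

```python
data_dict = {'Mike': [[5, 1, 6, 2], [1, 1, 1, 1], [3, 0, 3, 0]],
             'jose': [[0, 1, 2, 3], [2, 3, 4, 5]],
             'lisa': [[4, 4, 4, 4]]}

# For each student, transpose the rows into columns and sum each column
totals = {name: [sum(col) for col in zip(*rows)]
          for name, rows in data_dict.items()}
print(totals)
```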

This is well suited for collections.Counter
from collections import Counter, defaultdict
csvdata = [['name','score1','score2','score3','score4'],
['Mike','5','1','6','2'],
['Mike','1','1','1','1'],
['Mike','3','0','3','0'],
['jose','0','1','2','3'],
['jose','2','3','4','5'],
['lisa','4','4','4','4']]
student_scores = defaultdict(Counter)
score_titles = csvdata[0][1:]
for row in csvdata[1:]:
    student = row[0]
    scores = dict(zip(score_titles, map(int, row[1:])))
    student_scores[student] += Counter(scores)
print(student_scores["Mike"])
# Counter({'score3': 10, 'score1': 9, 'score4': 3, 'score2': 2})
See also: collections.defaultdict
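One caveat with this approach: adding Counters with + or += discards entries whose count is zero or negative, so a score that totals 0 disappears from the Counter. Looking up a missing key in a Counter returns 0, though, so rebuilding the rows still works. The sketch below uses a smaller, made-up data set to show this. The variable rows is introduced here for illustration.

```python
from collections import Counter, defaultdict

# Made-up data where Mike's score2 adds up to 0
csvdata = [['name', 'score1', 'score2'],
           ['Mike', '5', '0'],
           ['Mike', '1', '0'],
           ['lisa', '4', '4']]
score_titles = csvdata[0][1:]

student_scores = defaultdict(Counter)
for name, *raw in csvdata[1:]:
    student_scores[name] += Counter(dict(zip(score_titles, map(int, raw))))

# The zero total was dropped by += ...
print('score2' in student_scores['Mike'])

# ... but indexing a Counter with a missing key gives 0,
# so the rows can still be rebuilt in the original column order
rows = [[name] + [str(counts[title]) for title in score_titles]
        for name, counts in student_scores.items()]
print(rows)
```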

Related

How to call for information from a list?

I'm quite new to Python and I would like some help. I'm currently trying to store information from the first line of a txt file in a tuple and I am having some trouble. The txt file's second line is:
Water: 0 0 4 2 1 3
I want to store the numbers only so my current code is:
water = []
with open(file_name) as f:
    lines = f.readlines()
water_values = lines[1].strip()
splitted = water_values.split(" ")
splitted.remove("Water:")
water.append(splitted)
However, when I call for water[1], expecting to receive 0, I find that the index is out of range and that the len(water) is only 1. When I print it, it says:
[['0', '0', '4', '2', '1', '3']]
How can I change it so that I can call for each element?
When you call water.append(splitted), you are adding a single new element to the end of the list. Since splitted is itself a list, you get a list of lists.
If you want to combine two lists, you should instead call water += splitted. The += operator adds whatever is on the right side to the value on the left side, and is analogous to water = water + splitted.
You should use .extend rather than .append, i.e. instead of
water.append(splitted)
do
water.extend(splitted)
Simple example to show difference:
a = []
b = []
a.append([1,2,3])
b.extend([1,2,3])
print(a)
print(b)
output:
[[1, 2, 3]]
[1, 2, 3]
If you want to know more about handling lists in Python, read "More on Lists" in the docs.
Your code water.append(splitted) just adds splitted (which is a list) as the first element of the water list. To add the values from splitted, you could do the following:
water += splitted
instead of
water.append(splitted)
Doing so, you will get water = ['0', '0', '4', '2', '1', '3'].
You can read more in "How do I concatenate two lists in Python?"
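Putting the answers together, a minimal sketch might look like this. The string line stands in for lines[1] from the file, and converting to int is an extra step (an assumption about what the asker ultimately wants, since water[1] was expected to be 0).

```python
line = "Water: 0 0 4 2 1 3"  # stand-in for lines[1] read from the file

# split() with no argument splits on any run of whitespace;
# the first field is the label, the rest are the numbers
label, *values = line.split()

water = []
water.extend(int(v) for v in values)  # extend adds each item individually

print(water)
print(water[1])
```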

How Do I create a new list from distinct elements of existing list?

I have a list
current_list = ['#','1','2','3','4','5','6','7','8','9']
I want to create a new list by indexing current_list such that
new_list = ['1','5','9']
I have tried
new_list = current_list[1] + current_list[5] + current_list[9]
but I get
>>> 159
and not
>>> ['1','5','9']
How do I create new_list from current_list such that
new_list = ['1','5','9'] ?
New to programming and appreciate your patience.
You are concatenating the list items (which are strings) by using the + sign. Try:
new_list = [current_list[1], current_list[5], current_list[9]]
Your list must contain at least 10 items, otherwise you will get an IndexError (list index out of range).
You can do this to get the result you want. For the new list, the elements need to be arranged as a list. The '+' sign is used for string concatenation or for simple addition. So,
current_list = ['#','1','2','3','4','5','6','7','8','9']
new_list=[current_list[1],current_list[5],current_list[9]]
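To see why the original attempt produced 159, here is a small sketch contrasting + on strings with + on lists. The names joined and as_list are introduced here for illustration.

```python
current_list = ['#', '1', '2', '3', '4', '5', '6', '7', '8', '9']

# + on strings concatenates them into one string
joined = current_list[1] + current_list[5] + current_list[9]
print(repr(joined))

# + on lists concatenates the lists, keeping the items separate
as_list = [current_list[1]] + [current_list[5]] + [current_list[9]]
print(as_list)
```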
Instead of hard-coding the indexes (e.g. [current_list[1],current_list[5],current_list[9]]), I would recommend supplying the indexes programmatically, so that the code is easy to modify in the future and you can easily generate the indexes you want from a function:
indexes = [1, 5, 9]
current_list = ['#','1','2','3','4','5','6','7','8','9']
new_list = [current_list[i] for i in indexes]
## gives ['1','5','9']
Now, if you need to change the indexes, you can just modify the indexes line.
Or, if down the road a user needs to specify the indexes from a file, you can read those numbers from a file. Either way, the way you generate new_list from current_list stays the same. (As a new programmer, it is important that you learn early the importance of writing code so that it is easy to modify in the future.)
from operator import itemgetter
def make_list_from(*indices, lst):
    # Create a function that will get values from given indices
    values = itemgetter(*indices)
    # Get those values as a tuple and convert them into a list
    return list(values(lst))
current_list = ['0', '1','2','3','4','5','6','7','8','9']
print(make_list_from(1, 5, 9, lst=current_list))
# ['1', '5', '9']
You can use itemgetter:
from operator import itemgetter
my_indices = [1, 5, 9]
new_list = list(itemgetter(*my_indices)(current_list))
Or you can pick the elements by their indices using a list comprehension:
new_list = [current_list[i] for i in my_indices]
This is similar to:
new_list = []
for index in my_indices:
    new_list.append(current_list[index])
print(new_list)
output:
['1', '5', '9']

Python build one dictionary from a list of keys, and a list of lists of values

So I have a list of keys:
keys = ['id','name', 'date', 'size', 'actions']
and I also have a list of lists of values:
values = [
    ['1','John','23-04-2015','0','action1'],
    ['2','Jane','23-04-2015','1','action2']
]
How can I build a dictionary with those keys matched to the values?
The output should be:
{
'id':['1','2'],
'name':['John','Jane'],
'date':['23-04-2015','23-04-2015'],
'size':['0','1'],
'actions':['action1','action2']
}
EDIT:
I tried to use zip() and dict(), but that would only work if the list of values had 1 list, i.e. values = [['1','John','23-04-2015','0','action1']]
for list in values:
    dic = dict(zip(keys,list))
I also thought about initialising a dic with the keys, then building the list of values on my own, but I felt that there had to be an easier way to do it.
dic = dict.fromkeys(keys)
for list in values:
    ids = list[0]
    names = list[1]
    dates = list[2]
    sizes = list[3]
    actions = list[4]
and then finally
dic['id'] = ids
dic['name'] = names
dic['date'] = dates
dic['size'] = sizes
dic['action'] = actions
This seemed really silly and I was wondering what a better way of doing it would be.
>>> keys = ['id','name', 'date', 'size', 'actions']
>>> values = [['1','John','23-04-2015','0','action1'], ['2','Jane','23-04-2015','1','action2']]
>>> c = {x:list(y) for x,y in zip(keys, zip(*values))}
>>> c
{'id': ['1', '2'], 'size': ['0', '1'], 'actions': ['action1', 'action2'], 'date': ['23-04-2015', '23-04-2015'], 'name': ['John', 'Jane']}
>>> print(*(': '.join([item, ', '.join(c.get(item))]) for item in sorted(c, key=lambda x: keys.index(x))), sep='\n')
id: 1, 2
name: John, Jane
date: 23-04-2015, 23-04-2015
size: 0, 1
actions: action1, action2
This uses several tools:
c is created with a dictionary comprehension. Comprehensions are a different way of expressing an iterable like a dictionary or a list. Instead of initializing an empty iterable and then using a loop to add elements to it, a comprehension moves these syntactical structures around.
result = [2*num for num in range(10) if num%2]
is equivalent to
result = []
for num in range(10):
    if num%2: # shorthand for "if num%2 results in non-zero", or "if num is not divisible by 2"
        result.append(2*num)
and we get [2, 6, 10, 14, 18].
zip() returns an iterator of tuples, where each element of each tuple is the corresponding element of one of the arguments you passed to zip().
>>> list(zip(['a','b'], ['c','d']))
[('a', 'c'), ('b', 'd')]
zip() takes multiple arguments - if you pass it one large list containing smaller sublists, the result is different:
>>> list(zip([['a','b'], ['c','d']]))
[(['a', 'b'],), (['c', 'd'],)]
and generally not what we want. However, our values list is just such a list: a large list containing sublists. We want to zip() those sublists. This is a great time to use the * operator.
The * operator represents an "unpacked" iterable.
>>> print(*[1,2,3])
1 2 3
>>> print(1, 2, 3)
1 2 3
It is also used in function definitions:
>>> def func(*args):
...     return args
...
>>> func('a', 'b', [])
('a', 'b', [])
So, to create the dictionary, we zip() the lists of values together, then zip() that with the keys. Then we iterate through each of those tuples and create a dictionary out of them, with each tuple's first item being the key and the second item being the value (cast as a list instead of a tuple).
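The same steps can be spelled out one at a time. This is only an expanded sketch of the one-line comprehension, with the intermediate names columns and pairs introduced here for illustration.

```python
keys = ['id', 'name', 'date', 'size', 'actions']
values = [['1', 'John', '23-04-2015', '0', 'action1'],
          ['2', 'Jane', '23-04-2015', '1', 'action2']]

# Step 1: zip the sublists together, turning rows into columns
columns = list(zip(*values))
print(columns[0])

# Step 2: pair each key with its column
pairs = list(zip(keys, columns))

# Step 3: build the dictionary, casting each tuple to a list
c = {}
for key, column in pairs:
    c[key] = list(column)
print(c)
```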
To print this, we could make a large looping structure, or we can make generators (quicker to assemble and process than full data structures like a list) and iterate through them, making heavy use of * to unpack things. Remember, in Python 3, print can accept multiple arguments, as seen above.
We will first sort the dictionary, using each element's position in keys as the key. If we use something like key=len, that sends each element to the len() function and uses the returned length as the key. We use lambda to define an inline, unnamed function, that takes an argument x and returns x's index in the list of keys. Note that the dictionary isn't actually sorted; we're just setting it up so we can iterate through it according to a sort order.
Then we can go through this sorted dictionary and assemble its elements into printable strings. At the top level, we join() a key with its value separated by ': '. Each value has its elements join()ed with ', '. Note that if the elements weren't strings, we would have to turn them into strings for join() to work.
>>> list(map(str, [1,2,3]))
['1', '2', '3']
>>> print(*map(str, [1,2,3]))
1 2 3
The generator that yields each of these join()ed lines is then unpacked with the * operator, and each element is sent as an argument to print(), specifying a separator of '\n' (new line) instead of the default ' ' (space).
It's perfectly fine to use loops instead of comprehensions and *, and then rearrange them into such structures after your logic is functional, if you want. It's not particularly necessary most of the time. Comprehensions sometimes execute slightly faster than equivalent loops, and with practice you may come to prefer the syntax of comprehensions. Do learn the * operator, though - it's an enormously versatile tool for defining functions. Also look into ** (often referred to with "double star" or "kwargs"), which unpacks dictionaries into keyword arguments and can also be used to define functions.
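As a brief illustration of ** in both roles, here is a sketch with a made-up function describe and a made-up dictionary record. In the call, ** unpacks the dictionary into keyword arguments. In the definition, **extra collects any keyword arguments that have no matching parameter.

```python
def describe(name, size=0, **extra):
    # extra receives keyword arguments that match no named parameter
    return name, size, extra

record = {'name': 'John', 'size': 1, 'date': '23-04-2015'}

# Equivalent to describe(name='John', size=1, date='23-04-2015')
result = describe(**record)
print(result)
```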

In Python how do I split a string into multiple integers?

I'm reading a string which is always five numbers separated by spaces. I want to split this up into five individual integers so that I can process them separately.
So far I have:
import csv
reader = csv.reader([data.data], skipinitialspace=True)
for r in reader:
    print(r)
which allows me to print the values out. How do I store them as individual integers?
You could do it like this. Assuming s is your line from reader.
>>> s='2 3 4 5 6'
>>> s.split(' ')
['2', '3', '4', '5', '6'] #just split will give strings
>>> [int(i) for i in s.split(' ')] #typecasting to ints
[2, 3, 4, 5, 6] #now you have ints
A word of caution, though: I am assuming there is no other data type in the line from reader. Otherwise int() will raise a ValueError and the code will crash. You can of course wrap the conversion in try/except to handle that, or use more fine-tuned parsing techniques.
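A minimal sketch of the try/except idea follows. The helper name parse_ints and the choice to return None for a bad line are assumptions made for illustration. Depending on your data, you might prefer to skip the bad field or re-raise the error.

```python
def parse_ints(line):
    """Return the integers in a whitespace-separated line,
    or None if any field is not a valid integer."""
    try:
        return [int(field) for field in line.split()]
    except ValueError:
        return None

print(parse_ints('2 3 4 5 6'))
print(parse_ints('2 3 x 5 6'))
```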
UPDATE 0: Brilliant one-liner from @Pavel Annosov: map(int, s.split()). map is a nifty built-in Python function which lets you apply a function to every item of any iterable.
UPDATE 1: It's fairly simple to see how this will work, but to make it even clearer for @user2152165:
ints_list = []
for r in reader:
    # assumes each r is a string such as '2 3 4 5 6'
    ints_list = list(map(int, r.strip().split(' ')))
ints_list now holds a list of your ints. Do what you want with it...

Python - assign lists within nest to variable

I am new to Python and would appreciate a little help.
How does one do the following:
1. Having converted each line within a file to a nested list, e.g. [['line 1', 'a'], ['line 2','b']], how do I flatten the list so that each line is associated with a variable? Assume that the first member in each list, i.e. i[:][0], is known.
2. Is it possible to associate more than one list with one variable, i.e. can x = [list1], [list2]?
3. Having used a for loop on a list, how does one associate aspects of that list with a variable? See the example below.
Example:
for i in list_1:
    if i[:][0] == 'm':
        i[2] = a
        i[3] = b
        i[4] = c
The above raises a NameError: a, b, c are not defined. How does one define variables resulting from iterations in a for loop, or loops in general?
Hope I was clear and succinct as I am perplexed!
Update:
To clarify:
I have a nested list, where each list within the nest holds strings. These strings are actually numbers. I wish to convert the strings to integers in order to perform arithmetic operations.
Example:
[['1', '2', '3'], ['4', '5', '6'], ['7', '8', '9']]
Now, to convert each string to an integer, is abs() appropriate? How should this be implemented?
Also, how do I sum the third item of each list within the nest and assign the total to a variable? Should I define a function for this?
Any suggestions on how to deal with this are much appreciated!
Also, the earlier suggestions, made me realise that my thinking was creating the problem! Thanks!
# Answer to question 1 - just use the built-in functionality of lists.
#
# There is no need to use variables when lists let you do so much more
# in a quick and organised fashion.
lines = []
for line in open_file:
    lines.append(line)
Since Li0liQ already answered questions 2 and 3, I'd just like to add a recommendation regarding question 3. You really don't need to make a copy of the list via i[:] since you're just testing a value in the list.
No. 2: I can't see how that would be possible - surely you can only assign one value to a variable?
Why do you want to associate each item in a list with a variable? You cannot tell the number of list entries beforehand, so you do not know the exact number of variables to use.
You can use a tuple: x = ([list1], [list2])
You should write the assignment the other way around:
for i in list_1:
    if i[:][0] == 'm':
        a = i[2]
        b = i[3]
        c = i[4]
Do you want:
a, b, c = i[2:5]
If I understand correctly, you have a list of lists, each of which can have length 2 or 1 (when the variable name is not known).
You would probably want to use a dict to store the lines.
Also worth mentioning: i[:][0] does not mean what you intended; it is the same as i[0] (i[:] is just a copy of the list i).
list_1 = [['line 1', 'a'], ['line 2','b'], ['line 3']]
d = {}
for i in list_1:
    if len(i) != 2:
        continue
    key = i[1]
    value = i[0]
    d[key] = value
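Then for a, you would use d['a'].
If you eventually want to convert them to variables, you can call locals().update(d), although this is only dependable at module level (inside a function, updating locals() does not create local variables), so keeping the dict is usually the better choice.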
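For the update in the question (converting the strings to integers and summing the third item of each list), none of the answers above shows code, so here is a minimal sketch. It uses int() rather than abs(): abs() returns the absolute value of a number and does not accept strings. The names nested, numbers and third_total are introduced here for illustration.

```python
nested = [['1', '2', '3'], ['4', '5', '6'], ['7', '8', '9']]

# Convert every string to an integer, keeping the nested structure
numbers = [[int(s) for s in row] for row in nested]
print(numbers)

# Sum the third item (index 2) of each inner list
third_total = sum(row[2] for row in numbers)
print(third_total)
```

A separate function is not required for something this small, although wrapping it in one would make sense if the column index needs to vary.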
