I have a CSV file with names and scores in it. I've turned each line into a separate list, but when I append a value to that list it doesn't end up inside it.
My code is:
import csv
f = open('1scores.csv')
csv_f = csv.reader(f)
newlist = []
for row in csv_f:
    newlist.append(row[0:4])
    minimum = min(row[1:4])
    newlist.append(minimum)
print(newlist)
With the data in the file being
Person One,4,7,4
Person Two,1,4,2
Person Three,3,4,1
Person Four,2
I expected the output to start with ['Person One', '4', '7', '4', '4'], since the minimum is 4 and I'm appending it to the list. Instead I get this: ['Person One', '4', '7', '4'], '4',
What am I doing wrong? I want the minimum to be inside the list, instead of outside but don't understand.
You are appending the sliced list to newlist first and then appending the minimum to newlist as well, not to the sliced list. Append the minimum to each row, then append the row itself:
for row in csv_f:
    row.append(min(row[1:], key=int))
    newlist.append(row)
You could also use a list comp:
new_list = [row + [min(row[1:], key=int)] for row in csv_f]
You also need key=int; otherwise the scores are compared as strings (lexicographically) and you may get surprising results:
In [1]: l = ["100" , "2"]
In [2]: min(l)
Out[2]: '100'
In [3]: min(l,key=int)
Out[3]: '2'
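As a minimal Python 3 sketch of the approach above, the snippet below appends the numeric minimum to each row. It reads from an in-memory string (the SAMPLE constant and the rows_with_minimum helper are made up for illustration) rather than from 1scores.csv, so it can be run without the file.

```python
import csv
import io

# Hypothetical in-memory stand-in for '1scores.csv', using the rows from the question.
SAMPLE = "Person One,4,7,4\nPerson Two,1,4,2\nPerson Three,3,4,1\nPerson Four,2\n"

def rows_with_minimum(lines):
    """Return each CSV row with its smallest score appended (as a string)."""
    result = []
    for row in csv.reader(lines):
        scores = row[1:]
        # key=int compares the scores numerically rather than as strings.
        result.append(row + [min(scores, key=int)])
    return result

rows = rows_with_minimum(io.StringIO(SAMPLE))
for row in rows:
    print(row)
```

Slicing with row[1:] rather than row[1:4] means a short row such as "Person Four,2" is handled the same way as the others.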
I have an example:
list = [['2 a', 'nnn', 'xxxx','last'], ['next, next'], ['3', '4', 'next']]
for i in range(len(list)):
    if list[i][-1] == "last":
        del(list[i+1])
        del(list[i])
I'd like to delete every sublist whose last item is "last", together with the sublist that follows it.
This example fails every time. I tried different configurations, including replacing the list with a numpy array, but nothing helps.
Traceback:
IndexError: list index out of range
I want the final result of this list to be ['3', '4', 'next']
Any tips on how to solve this would be appreciated.
Try this:
l = [['2 a', 'nnn', 'xxxx','last'], ['next, next'], ['3', '4', 'next']]
delete_next = False
to_ret = []
for x in l:
    if x[-1] == 'last':
        delete_next = True
    elif delete_next:
        delete_next = False
    else:
        to_ret.append(x)
This uses a flag variable to remember whether the next item needs to be deleted.
Loop over the list and skip an item if its last element is 'last'; otherwise append it to a new list. Note that this version only drops the sublist ending in 'last', not the one after it.
Also, it is not recommended to modify a list while iterating over it, because strange things can happen, such as the indexes changing (as mentioned in the comments above).
l = [['2 a', 'nnn', 'xxxx','last'], ['next, next'], ['3', '4', 'next']]
newlist = []
for i in l:
    if i[-1] == 'last':
        continue
    else:
        newlist.append(i)
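One more way to drop both the sublist ending in 'last' and the one right after it is to walk an explicit iterator and consume the following item with next(). This is only a sketch: the helper name drop_last_and_next is made up, and it assumes that the item after a 'last' sublist is always discarded, even if it also ends in 'last'.

```python
def drop_last_and_next(groups):
    """Return a new list without any sublist ending in 'last' or the sublist right after it."""
    kept = []
    it = iter(groups)
    for group in it:
        if group and group[-1] == "last":
            # Consume (and discard) the following sublist, if there is one.
            next(it, None)
            continue
        kept.append(group)
    return kept

data = [['2 a', 'nnn', 'xxxx', 'last'], ['next, next'], ['3', '4', 'next']]
result = drop_last_and_next(data)
print(result)
```

Because the function builds a new list instead of deleting from the one being iterated, the indexes never shift underneath the loop.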
All,
I've recently picked up Python and am currently learning to work with lists. I'm using a test file containing several lines of characters separated by tabs and passing it into my Python program.
The aim of my script is to insert each line into a list using its length as the index, which I expected would keep the list sorted automatically. I am considering only the most basic case and am not concerned about any complex cases.
My Python code is below:
import sys

newList = []
for line in sys.stdin:
    data = line.strip().split('\t')
    size = len(data)
    newList.insert(size, data)
for i in range(len(newList)):
    print ( newList[i])
My 'test' file below;
2 2 2 2
1
3 2
2 3 3 3 3
3 3 3
My expectation of the output of the python script is to print the contents of the list in the following order sorted by length;
['1']
['3', '2']
['3', '3', '3']
['2', '2', '2', '2']
['2', '3', '3', '3', '3']
However, when I pass in my test file to my python script, I get the following;
cat test | ./listSort.py
['2', '2', '2', '2']
['1']
['3', '2']
['3', '3', '3']
['2', '3', '3', '3', '3']
The first line of the output ['2', '2', '2', '2'] is incorrect. I'm trying to figure out why it isn't being printed at the 4th line (because of length 4 which would mean that it would have been inserted into the 4th index of the list). Could someone please provide some insight into why this is? My understanding is that I am inserting each 'data' into the list using 'size' as the index which means when I print out the contents of the list, they would be printed in sorted order.
Thanks in advance!
Inserting into a list works quite differently from what you think:
>>> newList = []
>>> newList.insert(4, 4)
>>> newList
[4]
>>> newList.insert(1, 1)
>>> newList
[4, 1]
>>> newList.insert(2, 2)
>>> newList
[4, 1, 2]
>>> newList.insert(5, 5)
>>> newList
[4, 1, 2, 5]
>>> newList.insert(3, 3)
>>> newList
[4, 1, 2, 3, 5]
>>> newList.insert(0, 0)
>>> newList
[0, 4, 1, 2, 3, 5]
Hopefully you can see two things from this example:
The list indices are 0-based. That is to say, the first entry has index 0, the second has index 1, etc.
list.insert(idx, val) inserts things into the position which currently has index idx, and bumps everything after that down a position. If idx is larger than the current length of the list, the new item is silently added in the last position.
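If you do want to keep using insert, one option is to look for the right position first instead of passing the length as the index. The following is a rough sketch under that assumption; insert_by_length is a made-up helper, and the sample lines stand in for the test file.

```python
def insert_by_length(items, new_item):
    """Insert new_item before the first element that is longer than it."""
    for index, existing in enumerate(items):
        if len(existing) > len(new_item):
            items.insert(index, new_item)
            return
    # Nothing longer was found, so the new item belongs at the end.
    items.append(new_item)

# Stand-in for the tab-separated lines of the test file.
lines = ["2\t2\t2\t2", "1", "3\t2", "2\t3\t3\t3\t3", "3\t3\t3"]
by_length = []
for line in lines:
    insert_by_length(by_length, line.split('\t'))

for item in by_length:
    print(item)
```

Each insertion scans the list, so for large inputs appending everything and sorting once with key=len is likely to be simpler and faster.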
There are several ways to implement the functionality you want:
If you can predict the maximum number of fields in a line, you can allocate the list beforehand and simply assign to its elements instead of inserting:
newList = [None] * 5
for line in sys.stdin:
    data = line.strip().split('\t')
    size = len(data)
    newList[size - 1] = data
for i in range(len(newList)):
    print ( newList[i])
If you can only predict a reasonable upper bound on that number, you can still do this, but you need some way to remove the None entries afterwards.
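Removing the None entries afterwards can be done with a list comprehension. The sketch below assumes an upper bound of 8 fields per line purely for illustration; place_by_length and the sample lines are hypothetical names, not part of the original script.

```python
def place_by_length(lines, max_fields):
    """Store each tab-separated line at index len(fields) - 1, then drop unused slots."""
    slots = [None] * max_fields  # max_fields is an assumed upper bound
    for line in lines:
        fields = line.strip().split('\t')
        slots[len(fields) - 1] = fields
    # Filter out the slots that were never assigned.
    return [fields for fields in slots if fields is not None]

sample = ["2\t2\t2\t2\n", "1\n", "3\t2\n", "2\t3\t3\t3\t3\n", "3\t3\t3\n"]
ordered = place_by_length(sample, max_fields=8)
for fields in ordered:
    print(fields)
```

Keep in mind that two lines with the same number of fields land in the same slot, so the later one overwrites the earlier one.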
Use a dictionary:
newList = {}
for line in sys.stdin:
    data = line.strip().split('\t')
    size = len(data)
    newList[size - 1] = data
for key in sorted(newList):
    print ( newList[key])
Add elements to the list as necessary, which is probably a little bit more involved:
newList = []
for line in sys.stdin:
    data = line.strip().split('\t')
    size = len(data)
    if len(newList) < size:
        newList.extend([None] * (size - len(newList)))
    newList[size - 1] = data
for i in range(len(newList)):
    print ( newList[i])
I believe I've figured out the answer to my question, thanks to mkrieger1. I append to the list and then sort it using the length as the key:
newList = []
for line in sys.stdin:
    data = line.strip().split('\t')
    newList.append(data)
newList.sort(key=len)
for i in range(len(newList)):
    print (newList[i])
I got the output I wanted:
./listSort.py < test
['1']
['3', '2']
['3', '3', '3']
['2', '2', '2', '2']
['2', '3', '3', '3', '3']
I am trying to import a CSV file while removing the '$' signs from the first column.
Is there any way I can omit the '$' sign with csv.reader?
If not, how can I modify aList to remove the $ signs?
>>> import csv
>>> with open('test.csv', 'rb') as csvfile:
...     reader = csv.reader(csvfile, delimiter=',')
...     for a in reader:
...         print a
...
['$135.20 ', '2']
['$137.20 ', '3']
['$139.20 ', '4']
['$141.20 ', '5']
['$143.20 ', '8']
>>> print(aList)
[['$135.20 ', '2'], ['$137.20 ', '3'], ['$139.20 ', '4'], ['$141.20 ', '5'], ['$143.20 ', '8']]
Ultimately, I would like to prep aList for Numpy functions.
You can modify the first column and then accumulate the results somewhere else:
results = []
for col_a, col_b in reader:
    results.append([col_a[1:], col_b])
That will remove the first character from the first column and append both columns to another list, results.
You can do it like this:
for a in reader:
    print a[0][1:], a[1]
a[0] is the first entry in the row, and a[0][1:] is that entry starting from its second character.
For example:
a="$123"
print a[1:]
# prints 123
If you want to modify the list itself, note that a csv.reader object cannot be indexed, so work on the list of rows (aList in your example) instead:
for x in xrange(len(aList)):
    aList[x] = [aList[x][0][1:], aList[x][1]]
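Since the stated goal is to prepare the data for NumPy, it may help to convert the cleaned strings to numbers in the same pass. Here is a minimal Python 3 sketch; the SAMPLE string stands in for test.csv and load_prices is a hypothetical helper, and it assumes every row has exactly two columns.

```python
import csv
import io

# Hypothetical stand-in for 'test.csv', mirroring the rows shown in the question.
SAMPLE = "$135.20 ,2\n$137.20 ,3\n$139.20 ,4\n"

def load_prices(lines):
    """Parse rows of ('$price ', count) into (float, int) pairs."""
    cleaned = []
    for price, count in csv.reader(lines):
        # strip() removes the trailing space, lstrip('$') removes the currency sign.
        cleaned.append((float(price.strip().lstrip('$')), int(count)))
    return cleaned

pairs = load_prices(io.StringIO(SAMPLE))
print(pairs)
```

Using lstrip('$') instead of slicing with [1:] leaves a value untouched if it happens to have no dollar sign.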
Here is my code (Sorry, I know it's messy):
import csv
with open('test.csv', 'w', newline='') as fp:
    a = csv.writer(fp, delimiter=',')
    data = [['Bob', '4', '6', '3'],
            ['Dave', '7', '9', '10'],
            ['Barry', '5', '2', '3']]
    a.writerows(data)
print("1 for alphabetical order with each student's highest score\n2 for highest score, highest to lowest\n3 for average score, highest to lowest")
cho_two=int(input())
class_a = open("test.csv")
csv_a = csv.reader(class_a)
a_list = []
for row in csv_a:
    row[1] = int(row[1])
    row[2] = int(row[2])
    row[3] = int(row[3])
    minimum = min(row[1:3])
    row.append(minimum)
    maximum = max(row[1:3])
    row.append(maximum)
    average = sum(row[1:3])//3
    row.append(average)
    a_list.append(row[0:9])
if cho_two == 1:
    alphabetical = [[x[0],x[6]] for x in a_list]
    print("\nCLASS A\nEach students highest by alphabetical order \n")
    for alpha_order in sorted(alphabetical):
        print(alpha_order)
class_a.close()
Here is what it should output:
['Barry', 5]
['Bob', 6]
['Dave', 10]
I want it to output the highest score that the person got in alphabetical order.
It actually outputs this:
['Barry', 2]
['Bob', 3]
['Dave', 5]
Thanks.
You have made a few mistakes here.
First, you store the maximum at index 5 of the row but then read index 6, which holds the average.
Second, you don't slice the scores from the row correctly:
minimum = min(row[1:3])
This only gets the first two scores, because the end index of a slice is exclusive. You need all of them, so I'd suggest slicing from index 1 onwards with row[1:]. Take that slice before you append anything to the row; otherwise the appended minimum and maximum end up in the later slices.
Then you need to fix the code that computes the average:
average = sum(row[1:3])//3
This does not do what you think it does. Besides summing only two of the scores, // is integer (floor) division, which discards the fractional part; use / if you want the real average.
Finally, when you append the row, you should probably append all of it with a_list.append(row). If you need a copy, slice all of it with row[:].
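Putting those fixes together might look roughly like the sketch below. It skips the CSV round trip and the menu prompt and works directly on the data list from the question; summarize is a made-up helper name, and the average uses true division over however many scores a row has.

```python
def summarize(rows):
    """Return [name, minimum, maximum, average] for each [name, score, ...] row."""
    summary = []
    for name, *raw_scores in rows:
        scores = [int(s) for s in raw_scores]
        average = sum(scores) / len(scores)  # true division, not floor division
        summary.append([name, min(scores), max(scores), average])
    return summary

data = [['Bob', '4', '6', '3'],
        ['Dave', '7', '9', '10'],
        ['Barry', '5', '2', '3']]

# Highest score per student, in alphabetical order of name.
highest = sorted([name, maximum] for name, _, maximum, _ in summarize(data))
for entry in highest:
    print(entry)
```

Computing the statistics from a separate scores list avoids the problem of the appended minimum and maximum leaking into later slices of the row.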
Really quick question here, some other people helped me on another problem but I can't get any of their code to work because I don't understand something very fundamental here.
8000.5 16745 0.1257
8001.0 16745 0.1242
8001.5 16745 0.1565
8002.0 16745 0.1595
8002.5 16745 0.1093
8003.0 16745 0.1644
I have a data file as such, and when I type
f1 = open(sys.argv[1], 'rt')
for line in f1:
    fields = line.split()
    print list(fields [0])
I get the output
['1', '6', '8', '2', '5', '.', '5']
['1', '6', '8', '2', '6', '.', '0']
['1', '6', '8', '2', '6', '.', '5']
['1', '6', '8', '2', '7', '.', '0']
['1', '6', '8', '2', '7', '.', '5']
['1', '6', '8', '2', '8', '.', '0']
['1', '6', '8', '2', '8', '.', '5']
['1', '6', '8', '2', '9', '.', '0']
Whereas I would have expected from trialling stuff like print list(fields) to get something like
[16825.5, 162826.0 ....]
What obvious thing am I missing here?
thanks!
Remove the list; .split() already returns a list.
You are turning the first element of the fields into a list:
>>> fields = ['8000.5', '16745', '0.1257']
>>> fields[0]
'8000.5'
>>> list(fields[0])
['8', '0', '0', '0', '.', '5']
If you want to have the first column as a list, you can build a list as you go:
myfirstcolumn = []
for line in f1:
    fields = line.split()
    myfirstcolumn.append(fields[0])
This can be simplified into a list comprehension:
myfirstcolumn = [line.split()[0] for line in f1]
The last command is the problem.
print list(fields[0]) takes the zero'th item from your split list, then takes it and converts it into a list.
Since you have a list of strings already ['8000.5','16745','0.1257'], the zero'th item is a string, which converts into a list of individual elements when list() is applied to it.
Your first problem is that you apply list to a string:
list("123") == ["1", "2", "3"]
Secondly, you print once per line in the file, but it seems you want to collect the first item of each line and print them all at once.
Third, in Python 2, there's no 't' mode in the call to open (text mode is the default).
I think what you want is:
with open(sys.argv[1], 'r') as f:
    print [ line.split()[0] for line in f ]
The problem was that you were converting the first field, which you had correctly extracted, into a list of characters.
Here's a solution to print the first column:
with open(sys.argv[1]) as f1:
    first_col = []
    for line in f1:
        fields = line.split()
        first_col.append(fields[0])
print first_col
gives:
['8000.5', '8001.0', '8001.5', '8002.0', '8002.5', '8003.0']
Rather than doing f1 = open(sys.argv[1], 'rt'), consider using with, which will close the file when you are done or if an exception occurs. Also, I left off rt since open() defaults to read and text mode.
Finally, this could also be written using list comprehension:
with open(sys.argv[1]) as f1:
    first_col = [line.split()[0] for line in f1]
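The question seemed to expect numbers rather than strings in the result, so here is a hedged Python 3 sketch that also converts the first column to float. The SAMPLE text replaces the file named by sys.argv[1], and first_column_as_floats is a made-up helper.

```python
import io

# Hypothetical stand-in for the data file passed on the command line.
SAMPLE = "8000.5 16745 0.1257\n8001.0 16745 0.1242\n8001.5 16745 0.1565\n"

def first_column_as_floats(lines):
    """Collect the first whitespace-separated field of each non-blank line as a float."""
    return [float(line.split()[0]) for line in lines if line.strip()]

values = first_column_as_floats(io.StringIO(SAMPLE))
print(values)
```

The `if line.strip()` guard skips blank lines, which would otherwise raise an IndexError on `line.split()[0]`.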
Others have already done a great job answering this question. The behavior that you're seeing is because you're using list on a string. list will take any object that you can iterate over and turn it into a list, one element at a time. This isn't really surprising, except that the object doesn't even have to have an __iter__ method (which is the case with strings in Python 2). There are a number of posts on SO about __iter__, so I won't focus on that part.
In any event, try the following code and see what it prints out:
>>> def enlighten_me(obj):
...     print (list(obj))
...     print (hasattr(obj, '__iter__'))
...
>>> enlighten_me("Hello World")
>>> enlighten_me( (1,2,3,4) )
>>> enlighten_me( {'red':'wagon',1:5} )
Of course, you can try the example with sets, lists, generators ... Anything you can iterate over.
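For readers on Python 3, where strings do define __iter__, the same point can still be illustrated with a class that only implements __getitem__. This is a sketch with made-up names (describe, Countdown), not code from the original answer.

```python
def describe(obj):
    """Return (list(obj), whether obj defines __iter__)."""
    return list(obj), hasattr(obj, '__iter__')

class Countdown:
    """Iterable only through the old sequence protocol: __getitem__ with 0, 1, 2, ..."""
    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return 3 - index

print(describe("Hello"))                  # each character becomes an element
print(describe((1, 2, 3, 4)))             # tuple elements
print(describe({'red': 'wagon', 1: 5}))   # iterating a dict yields its keys
print(describe(Countdown()))              # works even without __iter__
```

list() falls back to calling __getitem__ with increasing integers until IndexError is raised, which is why Countdown can be turned into a list at all.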
Levon posted a nice answer about how to create a column while reading your file. I will demonstrate the same thing using the built-in zip function.
rows=[]
for row in myfile:
    rows.append(row.split())
#now rows is stored as [ [col1,col2,...] , [col1,col2,...], ... ]
At this point we could get the first column by (Levon's answer):
column1=[]
for row in rows:
    column1.append(row[0])
or more succinctly:
column1=[row[0] for row in rows] #<-- This is called a list comprehension
But what if you want all the columns? (and what if you don't know how many columns there are?). This is a job for zip.
zip takes iterables as input and matches them up. In other words:
zip(iter1,iter2)
will take iter1[0] and match it with iter2[0], and match iter1[1] with iter2[1] and so on -- kind of like a zipper if you think about it. But, zip can take more than just 2 arguments ...
zip(iter1,iter2,iter3) #results in [ (iter1[0],iter2[0],iter3[0]) , (iter1[1],iter2[1],iter3[1]), ... ]
Now, the last piece of the puzzle that we need is argument unpacking with the star operator.
If I have a function:
def foo(a,b,c):
    print a
    print b
    print c
I can call that function like this:
A=[1,2,3]
foo(A[0],A[1],A[2])
Or, I can call it like this:
foo(*A)
Hopefully this makes sense -- the star takes each element in the list and "unpacks" it before passing it to foo.
So, putting the pieces together (remember back to the list of rows), we can unpack the list of rows and pass it to zip which will match corresponding indices in each row (i.e. columns).
columns=zip(*rows)
Now to get the first column, we just do:
columns[0] #first column
For lists of lists, I like to think of zip(*list_of_lists) as a sort of poor-man's transpose.
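One caveat if you run this on Python 3: zip returns a lazy iterator there rather than a list, so it has to be materialized before it can be indexed. A small sketch, using a few rows of the sample data as stand-in input:

```python
# A few lines from the data file in the question, used as stand-in input.
rows = [line.split() for line in [
    "8000.5 16745 0.1257",
    "8001.0 16745 0.1242",
    "8001.5 16745 0.1565",
]]

# zip() is lazy in Python 3, so wrap it in list() to allow indexing.
columns = list(zip(*rows))
first_column = columns[0]
print(first_column)
```

Each column comes back as a tuple of strings; wrap it in list() or convert the values with float() if that suits the later processing better.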
Hopefully this has been helpful.