Python and Indexing
I've been working on a vanilla Python script to separate data from a CSV file. My goal is to recreate this code using multiple strategies in order to better my understanding of Python. Improvements to this code will come later. The code I have works, but there are a couple of things that I don't understand. Here it is:
with open('C:\My Super Secret Path\primary_debates_cleaned.csv') as primaryData:
    headers = primaryData.readline().strip('\n').split(',')
    flag = 0
    for lines in primaryData:
        sepInit = lines.strip('\n').split('"')
        if flag == 1:
            sep1 = [item for item in sepInit[0].split(',') if item is not '']
            sep2 = sepInit[1]
            sep3 = [item for item in sepInit[2].split(',') if item is not '']
            #sep4 = sepInit[3]
            sep4 = sepInit[-2]
            #sep5 = sepInit[4].strip(',')
            sep5 = sepInit[-1].strip(',')
            #sepFinal = [sep1[0], sep1[1], sep2, sep3[0], sep3[1], sep4, sep5]
            sepFinal = [sep1[0], sep1[1], sep2, sep3[0:1], sep3[1:2], sep4, sep5]
        if flag == 0:
            sepFinal = headers
            flag = 1
        print sepFinal
My first question concerns this snippet, specifically indexing:
#sep4 = sepInit[3]
sep4 = sepInit[-2]
#sep5 = sepInit[4].strip(',')
sep5 = sepInit[-1].strip(',')
The commented part is what I want to do, and the uncommented part is what works. It seems like I have to reverse the index in order to grab the proper information. The "type" seems to be the same, both being lists. Is there something I'm doing incorrectly at the start, or am I missing something simple here?
My next question has a similar flavor, from the snippet below:
#sepFinal = [sep1[0], sep1[1], sep2, sep3[0], sep3[1], sep4, sep5]
sepFinal = [sep1[0], sep1[1], sep2, sep3[0:1], sep3[1:2], sep4, sep5]
Why is it that I can get the information I need from sep1 simply using 0 and 1, but I cannot do the same for sep3?
Finally, when printing out the list sepFinal, the elements for sep4 and sep5 appear as lists. Everything else is just an element of the list sepFinal, but sep4 and sep5 are lists within the list. If this needs clarification, let me know. So, why do sep4 and sep5 appear as lists within my list?
EDIT0: There are no inputs to this. I am going into the PowerShell, and typing python mySecretProgramName.py to run it. The print sepFinal shows the following, parenthesized:
>>> [element 1, element 2, element 3, [element 4], [element 5]]
From the start, I'd like it to be:
>>> [element 1, element 2, element 3, element 4, element 5]
EDIT1: The negative indexing was needed because the data was being split improperly. The length of sepInit changed from row to row, so the positive indexes did not always point at the same fields. Thank you to @martineau for pointing out this possibility. I tested this by simply putting print(len(sepInit)) after sepInit in the loop.
Negative indexing information: someList[-1] grabs the last item in a list, someList[-2] grabs the second to last item in a list, etc.
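EDIT2: This concerns sep3[0:1] and the like. This takes a slice of the list: sep3[0:1] returns a new list containing the elements from index 0 up to, but not including, index 1. Because a slice is always a list, the sliced values show up as nested lists inside sepFinal.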
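To make the difference between indexing and slicing concrete, here is a minimal sketch in Python 3 syntax. The contents of sep3 and short are made-up sample values, not data from the CSV. It also shows one possible reason sep3[0] and sep3[1] failed where the slices did not: indexing past the end of a short list raises IndexError, while slicing quietly returns an empty list.

```python
# Hypothetical sample values standing in for one parsed row.
sep3 = ['Republican', 'Ohio']

single = sep3[0]     # indexing returns the element itself
sliced = sep3[0:1]   # slicing returns a new list holding that element
last = sep3[-1]      # negative indexes count from the end of the list

print(single)        # Republican
print(sliced)        # ['Republican']
print(last)          # Ohio

# A row that was split into fewer pieces than expected.
short = ['only']
print(short[1:2])    # [] (no error, just an empty list)

try:
    short[1]
except IndexError as exc:
    print('indexing failed:', exc)
```

If the nested lists in the output are unwanted, indexing (sep3[0]) is the form to use, once every row is known to split into enough pieces.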
Related
Is there a Pythonic way to add to a list at a known index that is past the end of the list? I cannot use append, as I'm looking to add at an index that is more than 1 past the end. (For example, I want to put a value at x[6] when len(x) == 3.)
I have a code that performs actions for sequential steps, and each step has a set of inputs. The users create an input file with these inputs. I store those inputs as a dictionary for each step, then a list of dictionaries to keep the order of the steps. I had just been reading the inputs for each step, then appending the dictionary to the list. I want to harden the code against the steps being out of order in the input files. If the user puts step 6 before step 3, I can't just append. I do not know the total number of steps until after the file has been read. I have a method worked out, but it seems clunky and involves multiple copies.
My kludgy attempt. In this case InputSpam and CurrentStep would actually be read from the user file
import copy

AllInputs = []
InputSpam = {'Key': 999}  # a dictionary of inputs for one step
for i in xrange(0, 3):
    AllInputs.append(InputSpam.copy())
CurrentStep = 7
if CurrentStep - 1 == len(AllInputs):
    AllInputs.append(InputSpam.copy())
elif CurrentStep - 1 < len(AllInputs):
    AllInputs[CurrentStep-1] = InputSpam.copy()
elif CurrentStep - 1 > len(AllInputs):
    Spam = [{}]*CurrentStep
    Spam[:len(AllInputs)] = copy.deepcopy(AllInputs)
    AllInputs = copy.deepcopy(Spam)
    AllInputs[CurrentStep-1] = InputSpam.copy()
    del Spam
Only after writing this answer did I notice that you use Python 2. Python 2 has been unsupported for a long time now, so you should switch to Python 3. (The following solution is only valid for Python 3.)
You can use collections.UserList to create your own variation of a list like this:
from collections import UserList

class GappedList(UserList):
    PAD_VALUE = object()  # You may use None instead

    def __setitem__(self, index, value):
        self.data.extend(self.PAD_VALUE for _ in range(len(self.data), index+1))
        self.data[index] = value
Inheriting from UserList makes the whole structure behave mostly like a regular list, unless specified otherwise. The data attribute gives access to the raw underlying list. The only thing we need to redefine here is the __setitem__ method, which handles assignments like my_list[idx] = val. We redefine it to first fill the gap between the end of the current list and the index you want to write to. (Actually, it pads the list up to and including that index and then overwrites the value there, which keeps the code a bit simpler.)
You might also need to redefine the __getitem__ method if you want to handle access to an index in the gaps differently.
Usage:
my_list = GappedList([0,1,2])
my_list.append(3)
my_list[6] = 6
my_list.append(7)
my_list[5] = 5
print(my_list)
# output:
[0, 1, 2, 3, <object object at 0x7f42cbd5ec80>, 5, 6, 7]
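If a subclass feels like more than the job needs, the same padding idea can be sketched as a plain helper function. This is an alternative under stated assumptions, not part of the answer above: set_at is a hypothetical name, None is assumed to be an acceptable placeholder for missing steps, and the sample dictionaries are made up.

```python
def set_at(items, index, value, pad=None):
    """Assign items[index] = value, padding with `pad` if the list is too short."""
    if index < 0:
        raise IndexError('negative indexes are not supported in this sketch')
    if index >= len(items):
        # Repeating None is safe because None is immutable; a mutable pad
        # such as {} would be shared between all of the padded slots.
        items.extend([pad] * (index + 1 - len(items)))
    items[index] = value


# Hypothetical inputs for steps 1 to 3, followed by an out-of-order step 7.
all_inputs = [{'Key': 1}, {'Key': 2}, {'Key': 3}]
set_at(all_inputs, 6, {'Key': 7})
print(all_inputs)
# [{'Key': 1}, {'Key': 2}, {'Key': 3}, None, None, None, {'Key': 7}]
```

Once the whole file has been read, any slot that is still None marks a step the user never supplied.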
I am trying to get the sum of x in this type of list: myList = [[y, x], [y, x], [y, x]]
Here is the code I have been trying:
myLists = [['0.9999', '2423.99000000'], ['0.9998', '900.00000000'], ['0.9997', '4741.23000000'], ['0.9995', '6516.16000000'], ['0.9991', '10.01000000'], ['0.9990', '9800.00000000']]
if chckList(myLists):
    floatList = []
    listLength = len(acceptibleBids)
    acceptibleBids0 = list(map(float, acceptibleBids[0]))
    acceptibleBids1 = list(map(float, acceptibleBids[1]))
    floatList.append(acceptibleBids0)
    floatList.append(acceptibleBids1)
    sumAmounts = sum(amount[1] for amount in floatList)
    print(sumAmounts)
    print(acceptibleBids)
I have run into many problems, but my current problems are listed below:
1. This is how I receive the list. Every value is a string, so I have been trying to convert them to floats so that I can take the sum of the second element (index 1) of each list inside myList.
2. The list can contain anywhere from 1 to 100 inner lists.
You can use list comprehension:
total = sum([float(x[1]) for x in myLists])
print(total) # 24391.39
This should do:
total = 0  # named total so the built-in sum() is not shadowed
for pair in myLists:
    total += float(pair[1])
    # of course, if there is something that can't
    # be a float there, it'll raise an error, so
    # do make all the checks you need to make
I'm unsure where acceptibleBids comes from in that code, but I'll assume it is a copy of myLists, or something similar to it. The problem with your code is that acceptibleBids[0] is just ['0.9999', '2423.99000000']. Similarly, acceptibleBids[1] is just ['0.9998', '900.00000000']. So you end up with acceptibleBids0 as [0.9999, 2423.99] and acceptibleBids1 as [0.9998, 900.0], which means floatList only ever holds the first two pairs and the sum covers just those two amounts instead of the whole list.
Edit: list comprehension works too, but I kinda like this way of looking at it. Either way, with list comprehension this would be sum_floats = sum([float(pair[1]) for pair in myLists]).
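The checks mentioned in the comments above could look something like the following sketch (Python 3 syntax). It assumes that rows which are too short or hold a non-numeric value should simply be skipped; sum_second_column and the sample rows are hypothetical and only there for illustration.

```python
def sum_second_column(pairs):
    """Sum float(pair[1]) over pairs, skipping entries that cannot be converted."""
    total = 0.0
    for pair in pairs:
        try:
            total += float(pair[1])
        except (IndexError, TypeError, ValueError):
            # Skip rows that are too short or hold a non-numeric value.
            continue
    return total


# Hypothetical input: two good rows, one bad value, one row that is too short.
sample = [['0.9999', '2423.99000000'],
          ['0.9998', '900.00000000'],
          ['0.9997', 'n/a'],
          ['0.9995']]
print(sum_second_column(sample))
```

Whether silently skipping bad rows is acceptable depends on the data; raising an error or logging the row may be the safer choice when every bid matters.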
The following will do (note that x[0] sums the first value of each pair; use x[1] to sum the second):
>>> sum([float(x[0]) for x in myLists])
5.997
I need to find the row index of the value max_sw in the list sw_col.
I tried this and some other variation, but nothing worked.
print(i for i in range(len(sw_col)) if sw_col[i]== max_sw)
The line you have is almost there. If you put the generator into a list and use only index position zero, this will give you the correct answer:
sw_col = ['a','b','c']
max_sw = 'c'
print([i for i in range(len(sw_col)) if sw_col[i]== max_sw][0]) # prints 2
A more concise solution would be to look up the item directly in the list, like so:
sw_col = ['a','b','c']
max_sw = 'c'
print(sw_col.index(max_sw)) # prints 2
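Two details may be worth keeping in mind with either approach: list.index only returns the first match, and it raises ValueError when the value is absent. The sketch below (Python 3 syntax, with a made-up sw_col that contains a repeated value) shows one way to handle both cases using enumerate and next.

```python
# Hypothetical data with a repeated value.
sw_col = ['a', 'b', 'c', 'b']
max_sw = 'b'

# list.index returns only the first match and raises ValueError when absent.
try:
    first = sw_col.index(max_sw)
except ValueError:
    first = None

# enumerate yields (index, value) pairs, so every match can be collected.
all_matches = [i for i, value in enumerate(sw_col) if value == max_sw]

# next() pulls the first hit from a generator without building a list;
# the second argument is the default returned when nothing matches.
first_lazy = next((i for i, value in enumerate(sw_col) if value == max_sw), None)

print(first, all_matches, first_lazy)   # 1 [1, 3] 1
```

The original print(...) call showed a generator object rather than a number because a generator expression is lazy; it has to be consumed (by list(), next(), or a loop) before it yields any indexes.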
I've been trying to write some code that takes data from an Excel file and adds it to a list, but only once; every later occurrence should be ignored. I've managed to get all the data I need and just need to know how to remove unwanted duplicates. I'm also wondering whether I should do this with a dictionary, and how that would be done.
for cellObj in rows:
    Lat = str(cellObj[5].value)
    if 'S' in Lat:
        majorCity.append(str(cellObj[3].value))
        print(majorCity)
    elif majorCity == majorCity:
        majorCity.pop(str(cellObj[3].value))
You can use set(), it will remove duplicates from a sequence.
a= set()
a.add("1")
a.add("1")
print a
Output:
set(['1'])
set is indeed a good way to do this:
>>> my_list = [1,1,2,2]
>>> my_list_no_dups = list(set(my_list))
>>> my_list_no_dups
[1, 2]
but it will not necessarily preserve the order of the list. If you do care about the order, you can do it like this:
my_list_no_dups = []
for item in my_list:
    if item not in my_list_no_dups:
        my_list_no_dups.append(item)
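The loop above checks membership in a list, which gets slow for long inputs. A common variation, sketched here in Python 3 syntax, keeps a set of values already seen alongside the ordered result. It assumes the items are hashable (strings such as city names are); unique_in_order and the sample cities are made up for illustration.

```python
def unique_in_order(items):
    """Return the items with duplicates removed, keeping first-seen order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:   # set membership is a fast hash lookup
            seen.add(item)
            result.append(item)
    return result


# Hypothetical city names, as they might come out of the spreadsheet.
cities = ['Sydney', 'Perth', 'Sydney', 'Lima', 'Perth']
print(unique_in_order(cities))        # ['Sydney', 'Perth', 'Lima']

# On Python 3.7+, dictionaries keep insertion order, so this also works.
print(list(dict.fromkeys(cities)))    # ['Sydney', 'Perth', 'Lima']
```

The dict.fromkeys form is also one answer to the dictionary part of the question: the keys of a dictionary are unique by construction.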
Am I able to slice a list of strings? If it is possible could anyone please tell me how to do it so that I am able to print out a particular string instead of the five that make up the list.
Cheers.
eg.
mylist = ['apples' 'oranges' 'lemons' 'cucumbers' 'bananas']
print 'orange'
The programming language I am using is Python. Every time I write mylist[2] I get an error. The list is built by extracting strings from an HTML RSS feed; each string is a news headline. Even though the feed updates constantly, there are always 5 strings in the list, yet Python reports that the list index is out of range. If I print the entire list, it works fine.
#URLS for RSS Feeds
url_national = 'http://feeds.news.com.au/public/rss/2.0/news_national_3354.xml'
url_sport = 'http://feeds.news.com.au/public/rss/2.0/news_sport_3168.xml'
url_world = 'http://feeds.news.com.au/public/rss/2.0/news_theworld_3356.xml'
url_technology = 'http://feeds.news.com.au/public/rss/2.0/news_tech_506.xml'

def headlines (url):
    web_page = urlopen(url)
    html_code = web_page.read()
    web_page.close()
    return findall(r'<item><title>([^<]*)</title>', html_code)

#headlines list
list_national = [headlines(url_national)]
list_sport = [headlines(url_sport)]
list_world = [headlines(url_world)]
list_technology = [headlines(url_technology)]

def change_category():
    if label_colour.get() == 'n':
        changeable_label['text'] = list_national #here I would slice it but it doesn't work
    elif label_colour.get() == 's':
        changeable_label['text'] = list_sport
    elif label_colour.get() =='w':
        changeable_label['text'] = list_world
    else:
        changeable_label['text'] = list_technology
The reason I need to slice it into individual headings is so that, when a radio button is pressed in my GUI, the headlines are printed as a numbered list on the label rather than all running together on one line. Sorry, I hope that makes sense.
What language are you using here? Usually you can use an index to access a particular entry in a list. For example:
print myList[1]
Commas are missing in your list creation. You have to do it like this:
mylist = ['apples', 'oranges', 'lemons', 'cucumbers', 'bananas']
And you will be able to work with your list
mylist[0] # 'apples'
mylist[-1] # 'bananas'
mylist[2] # 'lemons'
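To see why the commas matter, note that Python joins adjacent string literals into a single string, so the version without commas is a list with exactly one element. A short sketch in Python 3 syntax:

```python
# Adjacent string literals are concatenated at compile time.
no_commas = ['apples' 'oranges' 'lemons' 'cucumbers' 'bananas']
with_commas = ['apples', 'oranges', 'lemons', 'cucumbers', 'bananas']

print(len(no_commas))     # 1
print(no_commas[0])       # applesorangeslemonscucumbersbananas
print(len(with_commas))   # 5
print(with_commas[1])     # oranges
```

With only one element, any index other than 0 or -1 raises IndexError, which matches the error described in the question.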
I think the error you are getting is something like this:
mylist = ['apples', 'oranges', 'lemons', 'cucumbers', 'bananas']
print mylist[5]
IndexError: list index out of range
The reason is that the elements of a list are indexed from 0, not 1.
mylist has 5 elements, indexed from 0 to 4. So when you call print mylist[5], it will raise an error because there is no 6th element in the list.
Here is the official documentation on lists; please have a look.
I hope it was helpful!
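One further possibility, going by the code in the question: headlines(url) already returns a list (that is what findall produces), so writing [headlines(url_national)] wraps it in a second list with a single element, and index 2 of that outer list does not exist. The sketch below uses Python 3 syntax and a hypothetical fake_headlines function in place of the real feed; it also shows one way to build the numbered text for the label.

```python
def fake_headlines():
    # Hypothetical stand-in for headlines(url); returns a list of strings,
    # just as findall does.
    return ['Headline A', 'Headline B', 'Headline C', 'Headline D', 'Headline E']


wrapped = [fake_headlines()]   # a list containing one list
flat = fake_headlines()        # the list of five strings itself

print(len(wrapped))            # 1
print(len(flat))               # 5
print(flat[2])                 # Headline C

# Build one string with a numbered headline on each line.
numbered = '\n'.join('{}. {}'.format(n, title)
                     for n, title in enumerate(flat, start=1))
print(numbered)
```

Assigning a string like numbered to the label text, instead of the list itself, should put each headline on its own line.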