Copy portion of list - python

How do I copy only a portion of a list to another list? For example, the list has 105 elements, but only 30 randomly selected elements need to be copied to a new list.
This is the code that I have written:
for x in range(104):
    if len(trainingSet1) > 30:
        break
    trainingSet1[x] = (trainingSet[random.randint(1,103)])
But it keeps giving this error:
Traceback (most recent call last):
  File "Q1_2.py", line 82, in <module>
    main()
  File "Q1_2.py", line 72, in main
    trainingSet1[x]= (trainingSet[random.randint(1,103)])
IndexError: list index out of range

The bug is probably here:
trainingSet1[x] = ...
Unless you already populated trainingSet1, you’re trying to assign to
an element that doesn’t exist yet. Use trainingSet1.append(...)
instead.

Initialize trainingSet1 as trainingSet1 = [] and then append values to it instead of using trainingSet1[x] = value. If you really want to assign by index as in your code, first initialize the list as trainingSet1 = [0] * 30. This creates a list of 30 zeros, which are then replaced by your randomly selected values. In that case the loop has to run over range(30) so that x stays within the list.
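As a rough sketch of both suggestions, plus a standard-library shortcut: the list contents below are made-up stand-ins for the asker's data, and the names trainingSet2 and trainingSet3 are only illustrative. Note that random.sample picks without replacement, which may or may not be what the original code intended (random.randint can pick the same index twice).

```python
import random

# Stand-in data: the real trainingSet is not shown in the question.
trainingSet = list(range(105))

# Option 1: start with an empty list and append (elements may repeat).
trainingSet1 = []
while len(trainingSet1) < 30:
    trainingSet1.append(trainingSet[random.randrange(len(trainingSet))])

# Option 2: preallocate 30 slots, then assign by index (elements may repeat).
trainingSet2 = [0] * 30
for x in range(30):
    trainingSet2[x] = random.choice(trainingSet)

# Option 3: 30 distinct elements in a single call.
trainingSet3 = random.sample(trainingSet, 30)
```

If duplicates in the copied portion are not acceptable, option 3 is probably the closest fit.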

Related

Creating a Tuple made of variable strings

The following code works:
image_files=['imgpy1.png','imgpy2.png']
clip = moviepy.video.io.ImageSequenceClip.ImageSequenceClip(image_files, fps=20)
But I have 200 filenames: 'imgpy1.png', 'imgpy2.png' and so on. How do I create an object called image_files that contains these file names? I tried to create an array of strings:
flist=[]
flist=list(map(str,flist))
flist.append['imagepy1.png']
flist.append['imagepy2.png']
but that does not work. I get the error message
Traceback (most recent call last):
  File "C:\Users\vince\image.py", line 53, in <module>
    flist.append['imagepy1.png']
TypeError: 'builtin_function_or_method' object is not subscriptable
I am new to Python and can't figure out how to get an array or tuple of strings. I've tried for 2-3 hours to no avail, at least 40 different ways, including things I read online that don't work on my laptop. In the end, I am only interested in passing an argument to the video function that (1) does not crash the Python interpreter and (2) can be digested by the video function. So far, I am stuck on (1), but ultimately I need to solve (2). I don't care if it's an array, a list, a tuple or anything else, as long as it works.
I think the call to append is not correct: it needs parentheses, not square brackets.
flist.append('imagepy1.png') # parentheses, not square brackets
flist.append('imagepy2.png')
flist = []
for i in range(200):
    flist.append(f"imagepy{i + 1}.png")  # call append with round brackets, not square ones
print(flist)
This gets you:
['imagepy1.png', 'imagepy2.png', 'imagepy3.png', 'imagepy4.png', 'imagepy5.png', 'imagepy6.png', 'imagepy7.png', 'imagepy8.png', 'imagepy9.png', 'imagepy10.png', 'imagepy11.png', 'imagepy12.png', 'imagepy13.png', 'imagepy14.png', 'imagepy15.png', 'imagepy16.png', 'imagepy17.png', 'imagepy18.png', 'imagepy19.png', 'imagepy20.png', 'imagepy21.png', 'imagepy22.png', 'imagepy23.png', 'imagepy24.png', 'imagepy25.png', 'imagepy26.png', 'imagepy27.png', 'imagepy28.png', 'imagepy29.png', 'imagepy30.png', 'imagepy31.png', 'imagepy32.png', 'imagepy33.png', 'imagepy34.png', 'imagepy35.png', 'imagepy36.png', 'imagepy37.png', 'imagepy38.png', 'imagepy39.png', 'imagepy40.png', 'imagepy41.png', 'imagepy42.png', 'imagepy43.png', 'imagepy44.png', 'imagepy45.png', 'imagepy46.png', 'imagepy47.png', 'imagepy48.png', 'imagepy49.png', 'imagepy50.png', 'imagepy51.png', 'imagepy52.png', 'imagepy53.png', 'imagepy54.png', 'imagepy55.png', 'imagepy56.png', 'imagepy57.png', 'imagepy58.png', 'imagepy59.png', 'imagepy60.png', 'imagepy61.png', 'imagepy62.png', 'imagepy63.png', 'imagepy64.png', 'imagepy65.png', 'imagepy66.png', 'imagepy67.png', 'imagepy68.png', 'imagepy69.png', 'imagepy70.png', 'imagepy71.png', 'imagepy72.png', 'imagepy73.png', 'imagepy74.png', 'imagepy75.png', 'imagepy76.png', 'imagepy77.png', 'imagepy78.png', 'imagepy79.png', 'imagepy80.png', 'imagepy81.png', 'imagepy82.png', 'imagepy83.png', 'imagepy84.png', 'imagepy85.png', 'imagepy86.png', 'imagepy87.png', 'imagepy88.png', 'imagepy89.png', 'imagepy90.png', 'imagepy91.png', 'imagepy92.png', 'imagepy93.png', 'imagepy94.png', 'imagepy95.png', 'imagepy96.png', 'imagepy97.png', 'imagepy98.png', 'imagepy99.png', 'imagepy100.png', 'imagepy101.png', 'imagepy102.png', 'imagepy103.png', 'imagepy104.png', 'imagepy105.png', 'imagepy106.png', 'imagepy107.png', 'imagepy108.png', 'imagepy109.png', 'imagepy110.png', 'imagepy111.png', 'imagepy112.png', 'imagepy113.png', 'imagepy114.png', 'imagepy115.png', 'imagepy116.png', 'imagepy117.png', 
'imagepy118.png', 'imagepy119.png', 'imagepy120.png', 'imagepy121.png', 'imagepy122.png', 'imagepy123.png', 'imagepy124.png', 'imagepy125.png', 'imagepy126.png', 'imagepy127.png', 'imagepy128.png', 'imagepy129.png', 'imagepy130.png', 'imagepy131.png', 'imagepy132.png', 'imagepy133.png', 'imagepy134.png', 'imagepy135.png', 'imagepy136.png', 'imagepy137.png', 'imagepy138.png', 'imagepy139.png', 'imagepy140.png', 'imagepy141.png', 'imagepy142.png', 'imagepy143.png', 'imagepy144.png', 'imagepy145.png', 'imagepy146.png', 'imagepy147.png', 'imagepy148.png', 'imagepy149.png', 'imagepy150.png', 'imagepy151.png', 'imagepy152.png', 'imagepy153.png', 'imagepy154.png', 'imagepy155.png', 'imagepy156.png', 'imagepy157.png', 'imagepy158.png', 'imagepy159.png', 'imagepy160.png', 'imagepy161.png', 'imagepy162.png', 'imagepy163.png', 'imagepy164.png', 'imagepy165.png', 'imagepy166.png', 'imagepy167.png', 'imagepy168.png', 'imagepy169.png', 'imagepy170.png', 'imagepy171.png', 'imagepy172.png', 'imagepy173.png', 'imagepy174.png', 'imagepy175.png', 'imagepy176.png', 'imagepy177.png', 'imagepy178.png', 'imagepy179.png', 'imagepy180.png', 'imagepy181.png', 'imagepy182.png', 'imagepy183.png', 'imagepy184.png', 'imagepy185.png', 'imagepy186.png', 'imagepy187.png', 'imagepy188.png', 'imagepy189.png', 'imagepy190.png', 'imagepy191.png', 'imagepy192.png', 'imagepy193.png', 'imagepy194.png', 'imagepy195.png', 'imagepy196.png', 'imagepy197.png', 'imagepy198.png', 'imagepy199.png', 'imagepy200.png']
Is this what you were looking for?
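The same list can also be built in one line with a list comprehension. This is only a sketch: it assumes the files really are named imgpy1.png through imgpy200.png as in the question (the answers spell the prefix imagepy), so adjust the prefix to match the files on disk.

```python
# Build ['imgpy1.png', 'imgpy2.png', ..., 'imgpy200.png'] in one expression.
image_files = [f"imgpy{i}.png" for i in range(1, 201)]

print(len(image_files))   # 200
print(image_files[:2])    # ['imgpy1.png', 'imgpy2.png']
```

The resulting list can then be passed wherever the hand-written two-element image_files list was used.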

Indexing error after removing line from 2D array

I am facing a 'List index out of range' error when trying to iterate a for-loop over a table I've created from a CSV extract, but cannot figure out why, even after trying many different methods.
Here is a step-by-step description of how the error happens:
I'm removing the first line of an imported CSV file, as this line contains the columns' names but no data. The CSV has the following structure.
columnName1, columnName2, columnName3, columnName4
This, is, some, data
I, have, in, this
very, interesting, CSV, file
After storing the CSV in a first array called oldArray, I want to populate a newArray that gets all values from oldArray except the first line, which is the column name line, as previously mentioned. My newArray should then look like this.
This, is, some, data
I, have, in, this
very, interesting, CSV, file
To create this newArray, I'm using the following code with the append() function.
tempList = []
newArray = []
for i in range(len(oldArray)):
    if i > 0: #my ugly way of skipping line 0...
        for j in range(len(oldArray[0])):
            tempList.append(oldArray[i][j])
        newArray.append(tempList)
        tempList = []
I also stored the columns in their own separate list.
i = 0
for i in range(len(oldArray[0])):
    my_columnList[i] = oldArray[0][i]
The error comes up next: I now want to populate a treeview table from this newArray, using a for-loop and insert (in a function). But I always get the 'List index out of range' error and I cannot figure out why.
def populateTable(my_tree, newArray, my_columnList):
    i = 0
    for i in range(len(newArray)):
        my_tree.insert('','end', text=newArray[i][0], values = (newArray[i][1:len(newArray[0])]))
        #(im using the text option to bypass treeview's column 0 problem)
    return my_tree
Error message --> " File "(...my working directory...)", line 301, in populateTable
my_tree.insert(parent='', index='end', text=data[i][0], values=(data[i][1:len(data[0])]))
IndexError: list index out of range "
Using that same function with different datasets and columns worked fine, but not with this newArray.
I'm fairly certain that the error comes strictly from this 'newArray' and is not linked to another parameter.
I've tested the validity of the columns list, of the CSV import in oldArray through some print() functions, and everything seems normal - values, row dimension, column dimension.
This is a great mystery to me...
Thank you all very much for your help and time.
Your error message points at the problem: File "(...my working directory...)", line 301, in populateTable my_tree.insert(parent='', index='end', text=data[i][0], values=(data[i][1:len(data[0])])) IndexError: list index out of range
It means an index is out of range on line 301. Slicing never raises IndexError (an out-of-range slice just returns a shorter or empty list), so the failure has to come from one of the plain index lookups: data[i], data[i][0] or data[0].
(Either i is not below len(data), or data[i] is empty.)
My guess is that there is an empty list somewhere in data (maybe data[-1]?).
If data[i] is [], then data[i][0] tries to access a first item that does not exist.
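One way to test that guess is to look for rows that are shorter than expected before filling the table. The sketch below uses made-up rows based on the sample CSV in the question; the empty list is an assumption about what might be hiding in the real data.

```python
# Made-up rows; the empty list mimics what a blank CSV line can become.
data = [
    ["This", "is", "some", "data"],
    ["I", "have", "in", "this"],
    [],
    ["very", "interesting", "CSV", "file"],
]

# Slicing past the end is harmless; plain indexing is what raises.
print(data[2][1:4])  # []
try:
    data[2][0]
except IndexError as exc:
    print("row 2:", exc)  # row 2: list index out of range

# Report the positions of rows that are shorter than the first one.
expected_width = len(data[0])
short_rows = [i for i, row in enumerate(data) if len(row) < expected_width]
print(short_rows)  # [2]

# Drop empty rows before populating the table.
clean_data = [row for row in data if row]
```

If short_rows comes back empty on the real data, the cause lies elsewhere and the empty-row guess can be ruled out.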
There is no problem with your "ugly" way of skipping line 0, but I recommend having a look at this way:
new_array = old_array.copy()
new_array.remove(new_array[0])
Now, for fixing your issue: it looks like you have a problem with the indexing.
When you loop over the range of the length of an array, keep in mind that the length is counted from one, while your i variable starts at zero.
To make it simple,
len(oldArray[0])
is equal to 4, so using it in the for loop is just like saying
for i in range(4):
To fix this, you can either subtract 1 from the length of the old array or set the i variable to 1 at the start:
i = 1
for i in range(len(oldArray[0])):
    my_columnList[i] = oldArray[0][i]
or
i = 0
for i in range(len(oldArray[0])-1):
    my_columnList[i] = oldArray[0][i]
This mistake is also repeated in your populateTable function, so in the same way your code would be:
def populateTable(my_tree, newArray, my_columnList):
    i = 0
    for i in range(len(newArray)-1):
        my_tree.insert('','end', text=newArray[i][0], values = (newArray[i][1:len(newArray[0])]))
        #(im using the text option to bypass treeview's column 0 problem)
    return my_tree
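For what it is worth, range(n) already stops at n - 1, so range(len(newArray)) never produces an index past the end of the list, and subtracting 1 from the length would skip the last row. A small sketch with the sample data from the question (stored here as a hypothetical list of lists, with snake_case names that are not from the original code) shows this, together with slicing as a shorter way to drop the header:

```python
# Sample data from the question, stored as a list of rows.
old_array = [
    ["columnName1", "columnName2", "columnName3", "columnName4"],
    ["This", "is", "some", "data"],
    ["I", "have", "in", "this"],
    ["very", "interesting", "CSV", "file"],
]

# The header row becomes the column list; slicing skips it for the data rows.
column_names = list(old_array[0])
new_array = [list(row) for row in old_array[1:]]

# range(len(...)) yields 0 .. len - 1, so every index is valid.
indices = list(range(len(new_array)))
print(indices)                  # [0, 1, 2]
print(new_array[indices[-1]])   # ['very', 'interesting', 'CSV', 'file']

# With len(...) - 1 the last row would never be visited.
print(list(range(len(new_array) - 1)))  # [0, 1]
```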

select and make new list with specific information

EDIT2: Nevermind this, someone pointed my error. Thanks
First of all, this is an example of the results I have:
(172, 'Nucleus')
(172, 'Nucleus')
(472, 'Cytoplasm')
(472, 'Cytoplasm')
(472, 'Nucleus')
What I'm trying to do is match on the first number (position 0) and then check whether the word contains part of "nucleus" (here, it would be "nuc"). It can happen that for a given number the only word is one that contains nucleus.
I'm trying to make two lists: the first list would contain only the numbers whose words all contain "nuc"; the second list would contain those with "nuc" and other things (like cytoplasm in my example).
That is only a small part of my results.
I don't have an example of code, because I really have no clue how to include only one value of my query in the list (as in the example, where I would otherwise enter the number 172 twice). (Oops, I now have an example of code.)
EDIT: Oops, I wrote that before I wrote the code I tried...
Right now, my code looks like this:
Here is how I got the example shown above:
def number1(self, position):
    self.position = position
    List = [self.name()]
    for item in List:
        for i in range(position, self.c.rowcount):
            self.number(i)

def separate_list(self, list_signal):
    nuc_list = []
    not_nuc_list = []
    for i in list_signal:
        print(list_signal(i))
        if list_signal(i)(0) == list_signal(i+1)(0):
            if list_signal(i)(1) and list_signal(i+1)(1) == re.search("nuc"):
                nuc_list.append(list_signal(i))
            else:not_nuc_list.append(list_signal(i))
    return nuc_list and not_nuc_list

dc = connection()
dc.separate_list(dc.number1(0))
error:
Traceback (most recent call last):
  File "class vincent.py", line 91, in <module>
    dc.separate_list(dc.number1(0))
  File "class vincent.py", line 61, in separate_list
    for i in list_signal:
TypeError: 'NoneType' object is not iterable
I know this is not pretty; I tried to do it the best way I can (I'm new to Python and to programming itself).
A few things: if you are trying to get the element at index (position) 0 of a list, as you say, you would use list_name[0], with square brackets rather than round ones. If you are using position to sort, use a different method.
Are (172, 'Nucleus') ... (172, 'Nucleus') tuples, or are they lists of their own? Either kind can be indexed with [0], and either can be unpacked into two variables to work with the data, as in number, cell_type = (172, 'Nucleus').
Also, at the moment dc.number1 doesn't return anything, so it can't be used as input to another function. Add a return of some sort, or change what you are using as the input to whatever self.number is modifying.
You may want to make a list of all your results, e.g. [(172, 'Nucleus'), ... (172, 'Nucleus')]; then you can iterate through it with
for number, cell_type in results_list:
    # Each item is unpacked directly into its two parts.
    print(number, cell_type)
    # For the first item this should give you "172 Nucleus"
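Building on that, here is one possible sketch of the two-list split the question describes. It assumes the results are already available as a list of (number, label) tuples; the function name separate_by_nucleus and the variable names are made up for illustration, and the database access from the original class is left out entirely.

```python
from collections import defaultdict

# Example results copied from the question.
results = [
    (172, "Nucleus"),
    (172, "Nucleus"),
    (472, "Cytoplasm"),
    (472, "Cytoplasm"),
    (472, "Nucleus"),
]

def separate_by_nucleus(results):
    """Split numbers into 'only nuc labels' and 'nuc plus other labels'."""
    # Collect the distinct labels seen for each number.
    labels_by_number = defaultdict(set)
    for number, label in results:
        labels_by_number[number].add(label.lower())

    nuc_only = []
    nuc_and_other = []
    for number, labels in labels_by_number.items():
        nuc_flags = ["nuc" in label for label in labels]
        if all(nuc_flags):
            nuc_only.append(number)
        elif any(nuc_flags):
            nuc_and_other.append(number)
    # Return both lists as a tuple; "a and b" would return only one of them.
    return nuc_only, nuc_and_other

nuc_only, nuc_and_other = separate_by_nucleus(results)
print(nuc_only)        # [172]
print(nuc_and_other)   # [472]
```

Grouping by number first means each number appears once in the output, which sidesteps the problem of entering 172 twice.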

ValueError: y contains new labels: ['#']

I have a list of lists with every list containing 1 up to 5 tags. I have constructed a list containing the top 50 tags. My goal is to construct a new list of lists where every list contains only the top 50 tags. My approach went like this:
First I constructed a new list of lists with only the top 50 tags:
top_50 = list(np.array(pd.read_csv(os.path.join(dir,"Tags.csv")))[:,1])
train = pd.read_csv(os.path.join(dir,"Train.csv"),iterator = True)
top_50 = top_50[:51]
tags = list(np.array(train.get_chunk(50000))[:,3])
top_50_tags = [[tag for tag in list if tag in top_50] for list in tags]
Then I tried to encode the tags:
coder = preprocessing.LabelEncoder()
coder = coder.fit(top_50)
tags = [coder.transform(tag) for tag in list for list in top_50_tags]
This however gave me this error:
Traceback (most recent call last):
  File "C:\Users\Ano\workspace\final_submission\src\rf_test.py", line 69, in <module>
    main()
  File "C:\Users\Ano\workspace\final_submission\src\rf_test.py", line 33, in main
    labels = [coder.transform(tag) for tag in list for list in top_50_tags]
  File "C:\Python27\lib\site-packages\sklearn\preprocessing\label.py", line 120, in transform
    raise ValueError("y contains new labels: %s" % str(diff))
ValueError: y contains new labels: ['#']
I think this error arises because some of my lists are empty, since there were no top 50 tags in them. But the error specifically states that ["#"] is the newly seen label. Am I right with my hypothesis? And what should I do about the error message?
Edit:
For the people wondering why I am using list as a variable in list comprehension, I actually use a different word as a variable in my real program.
Update
I checked for differences in my top_50 and the tags:
print(len(top_50.difference(tags)))
which gave me a length of 0. This should mean that my empty lists are the problem?
Maybe you can check this issue: https://github.com/scikit-learn/scikit-learn/issues/3123
This bug has been fixed in scikit-learn version 0.17.
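Independently of the library version, it may help to check the filtered tag lists before encoding them. The sketch below is pure Python with made-up tags: a plain dict stands in for the label encoder, and none of the names come from the original program. It filters each list down to the top tags, sets empty lists aside, and confirms that no unseen label is left.

```python
# Made-up stand-ins for the real top-50 list and the tag lists.
top_tags = ["python", "java", "c#"]
tag_lists = [["python", "numpy"], ["haskell"], ["c#", "java"]]

top_set = set(top_tags)

# Keep only the top tags in every list.
filtered = [[tag for tag in tags if tag in top_set] for tags in tag_lists]
print(filtered)  # [['python'], [], ['c#', 'java']]

# Set empty lists aside instead of passing them to the encoder.
non_empty = [tags for tags in filtered if tags]

# Any label left here would be "new" to an encoder fitted on top_tags.
unseen = {tag for tags in non_empty for tag in tags} - top_set
print(unseen)  # set()

# Stand-in encoder: map each top tag to an integer, in sorted order.
index_of = {tag: i for i, tag in enumerate(sorted(top_set))}
encoded = [[index_of[tag] for tag in tags] for tags in non_empty]
print(encoded)  # [[2], [0, 1]]
```

If unseen is not empty on the real data, the printed labels show exactly which values the encoder has never seen.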

Accessing array items

I've got a one-dimensional list whose items I need separated by spaces. I am running this script in Spyder on Ubuntu, running in Parallels on Mac OS 10.8.
What I'm getting is this:
print poly
Output: [array([ 0.01322341, 0.07460202, 0.00832512])]
print poly[0]
Output: [array([ 0.01322341, 0.07460202, 0.00832512])]
print poly[1]
Output:
Traceback (most recent call last):
  File "/home/parallels/.../RampEst.py", line 38, in <module>
    print poly[1]
IndexError: list index out of range
The "..." is the rest of the file directory.
What I need is:
print poly[0]
Output: 0.01322341
print poly[1]
Output: 0.07460202
print poly[2]
Output: 0.00832512
You are constructing your list objects in the wrong way. If you post where you're doing that it might be clear as to the actual problem.
You'll notice that the following:
mylist = [ 0.01322341, 0.07460202, 0.00832512]
mylist[0] # 0.01322341
mylist[1] # 0.07460202
mylist[2] # 0.00832512
Works fine. From what you've posted, you have a list of array types. When you access element 0, you are retrieving the only array object in the list. If you can't change the structure of your list, this will work fine:
poly[0][0] # 0.01322341
poly[0][1] # 0.07460202
poly[0][2] # 0.00832512
How do you first initialize poly?
It seems that the array is within a list, which would make it the zeroth element of the list. Therefore, when you request poly[1] it fails.
To check if what I am saying is correct, do poly[0][0], poly[0][1] and poly[0][2]. If these return the numbers you want, then your poly is inside a list.
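If that check confirms it, one option is to unwrap the outer list once and work with the inner sequence from then on. In this sketch a plain nested list stands in for the list-wrapped array from the question, and the name coefficients is made up for illustration.

```python
# A nested list standing in for [array([...])] from the question.
poly = [[0.01322341, 0.07460202, 0.00832512]]

# Unwrap the single inner sequence once.
coefficients = poly[0]

print(coefficients[0])  # 0.01322341
print(coefficients[1])  # 0.07460202
print(coefficients[2])  # 0.00832512

# Join the values with spaces for output.
print(" ".join(str(value) for value in coefficients))
# 0.01322341 0.07460202 0.00832512
```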
