Python - list index out of range in if statement

I have a list that usually contains items but is sometimes empty.
Three items from the list are added to the database but I run into errors if it's empty, even though I'm using an if statement.
if item_list[0]:
    one = item_list[0]
else:
    one = "Unknown"
if item_list[1]:
    two = item_list[1]
else:
    two = "Unknown"
if item_list[2]:
    three = item_list[2]
else:
    three = "Unknown"
This still raises the list index out of range error if the list is empty. I can't find any other way to do it, but there must be a better way. (I've also read that you should avoid using else statements?)

If a list is empty, it has no valid indexes, and trying to access any index raises an IndexError.
The error actually occurs in the condition of the if statement, because item_list[0] has to be evaluated before the if can test it.
You could obtain the result you expect by doing this:
one, two, three = item_list + ["unknown"] * (3 - len(item_list))
This line of code creates a temporary list by concatenating item_list with a list of (3 minus the size of item_list) "unknown" strings, which is always a 3-item list as long as item_list has at most 3 items. It then unpacks this list into the variables one, two and three.
Details:
You can multiply a list to obtain a bigger list with duplicate items: ['a', 1, None] * 2 gives ['a', 1, None, 'a', 1, None]. This is used to create a list of "unknown" strings. Note that multiplying a list by 0 results in an empty list (as expected).
You can use the addition operator to concatenate 2 (or more) lists: ['a', 'b'] + [1, 2] gives ['a', 'b', 1, 2]. This is used to create a 3-item list from item_list and the "unknown" list created by multiplication.
You can unpack a list into several variables with the assignment operator: a, b = [1, 2] gives a = 1 and b = 2. It is even possible to use extended unpacking: a, *b = [1, 2, 3] gives a = 1 and b = [2, 3].
Example:
>>> item_list = [42, 77]
>>> one, two, three = item_list + ["unknown"] * (3 - len(item_list))
>>> one, two, three
(42, 77, 'unknown')
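One caveat: if item_list holds more than three items, 3 - len(item_list) is negative, the multiplication yields an empty list, and the unpacking fails with a ValueError. Here is a minimal sketch that copes with both short and long lists, assuming only the first three items matter. The helper name first_three is made up for this example.

```python
def first_three(item_list, default="unknown"):
    """Return exactly three values: the first items of item_list, padded with default."""
    # Slicing never raises, even when the list has fewer than 3 items.
    head = list(item_list[:3])
    # Pad with as many defaults as are still missing (between 0 and 3).
    return head + [default] * (3 - len(head))


one, two, three = first_three([42, 77])
print(one, two, three)  # 42 77 unknown

one, two, three = first_three([1, 2, 3, 4, 5])
print(one, two, three)  # 1 2 3
```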

Python will raise this error if you try to access an element of a list that doesn't exist. An empty list has no index 0.
if item_list: # an empty list will be evaluated as False
    one = item_list[0]
else:
    one = "Unknown"
if 1 < len(item_list):
    two = item_list[1]
else:
    two = "Unknown"
if 2 < len(item_list):
    three = item_list[2]
else:
    three = "Unknown"

item_list[1] will immediately raise an error if there aren't 2 elements in the list; the behavior isn't like that of languages like Clojure, where a null value is instead returned.
Use len(item_list) > 1 instead.
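If the same length check is needed for several positions, it could be wrapped in a small helper. This is only a sketch: the name get_or_default and its signature are assumptions made for illustration, not something built into Python lists.

```python
def get_or_default(items, index, default="Unknown"):
    """Return items[index] if that position exists, otherwise default."""
    # Only non-negative indexes are considered in this sketch.
    if 0 <= index < len(items):
        return items[index]
    return default


item_list = ["first"]
one = get_or_default(item_list, 0)
two = get_or_default(item_list, 1)
three = get_or_default(item_list, 2)
print(one, two, three)  # first Unknown Unknown
```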

You need to check if your list is long enough to have a value in the index position you are trying to retrieve from. If you are also trying to avoid using else in your condition statement, you can pre-assign your variables with default values.
count = len(item_list)
one, two, three = "Unknown", "Unknown", "Unknown"
if count > 0:
    one = item_list[0]
if count > 1:
    two = item_list[1]
if count > 2:
    three = item_list[2]

Related

Python: Get item(s) at index list

I am trying to write something that takes a list, and gets the item(s) at an index, either one or multiple.
The example below, which I found in another post here, works great when I have more than one index. It doesn't work if b is a single index.
a = [-2,1,5,3,8,5,6]
b = [1,2,5]
c = [ a[i] for i in b]
How do I get this to work with both a single index and multiple indexes?
Example:
a = [-2,1,5,3,8,5,6]
b = 2
c = [ a[i] for i in b]  # doesn't work in this case
You can check whether the object you're using to fetch the indices is a list (or a tuple, etc.). Here it is, wrapped into a function:
def find_values(in_list, ind):
    # ind is a list of numbers
    if isinstance(ind, list):
        return [in_list[i] for i in ind]
    else:
        # ind is a single number
        return [in_list[ind]]
in_list = [-2,1,5,3,8,5,6]
list_of_indices = [1,2,5]
one_index = 3
print(find_values(in_list, list_of_indices))
print(find_values(in_list, one_index))
The function takes the input list and the indices (renamed for clarity - it's best to avoid single letter names). The indices can either be a list or a single number. If isinstance determines your input is a list, it proceeds with a list comprehension. If it's a number - it just treats it as an index. If it is anything else, the program crashes.
This post gives you more details on isinstance and recognizing other iterables, like tuples, or lists and tuples together.
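As a rough sketch of recognising other iterables, one option is to test against collections.abc.Iterable from the standard library instead of list. The function name find_values_any is made up for this example, and the sketch assumes the indices are never passed as a string (strings are iterable too).

```python
from collections.abc import Iterable


def find_values_any(in_list, ind):
    """Return the items of in_list at ind, where ind is one index or an iterable of indexes."""
    if isinstance(ind, Iterable):
        # Covers lists, tuples, ranges, sets and generators of indexes.
        return [in_list[i] for i in ind]
    # Anything else is treated as a single index.
    return [in_list[ind]]


in_list = [-2, 1, 5, 3, 8, 5, 6]
print(find_values_any(in_list, [1, 2, 5]))  # [1, 5, 5]
print(find_values_any(in_list, (0, 6)))     # [-2, 6]
print(find_values_any(in_list, 3))          # [3]
```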
a = [-2, 1, 5, 3, 8, 5, 6]
a2 = [-2]
b = [1, 2, 5]
b2 = [1]
c = [a[i] for i in b]
c2 = [a2[i-1] for i in b2]
The first index of a list is 0, so a list with one item is perfectly valid.
Instead of building a list by manually looking up the values of list b in list a, you could write a separate three-line loop that prints the intersection of lists a and b:
a = [-2,1,5,3,8,5,6]
b = [3,4,6]
for i in range(0,len(b)):
    if b[i] in a:
        print(b[i])
By doing so, you would be able to print out the intersection even if there were only one value, or no value at all, stored in list b.

Identify a single difference in a python list

I would like some help with a part of my code.
I have some Python lists, for example:
list1 = (1,1,1,1,1,1,5,1,1,1)
list2 = (6,7,4,4,4,1,6,7,6)
list3 = (8,8,8,8,9)
For each list, I would like to know whether there is a single value that differs from all the other values, if and only if all of those other values are the same. For example, in list1 it would identify "5" as the different value, in list2 it would identify nothing as there are more than 2 different values, and in list3 it would identify "9".
What I already did is:
for i in list1:
    if list1.count(i) == len(list1) - 1:
        print("One value identified")
The problem is that I get "One value identified" as many times as "1" is present in my list...
But what I would like to have is an output like this:
The most represented value, whose count equals len(list1)-1 (here "1")
The value that is present only once (here "5")
The position in the list of the "5"
You could use something like this:
def odd_one_out(lst):
    s = set(lst)
    if len(s)!=2: # see comment (1)
        return False
    else:
        return any(lst.count(x)==1 for x in s) # see comment (2)
which for the examples you provided, yields:
print(odd_one_out(list1)) # True
print(odd_one_out(list2)) # False
print(odd_one_out(list3)) # True
To explain the code I would use the first example list you provided [1,1,1,1,1,1,5,1,1,1].
(1) Converting to a set removes all the duplicate values from your list, leaving you with {1, 5} (in no specific order). If the length of this set is anything other than 2, your list does not fulfill your requirements, so False is returned.
(2) Assuming the set does have a length of 2, what we need to check next is that at least one of the values it contains appears only once in the original list. That is what the any call does.
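You can use Counter from the standard library's collections module (documented under "High-performance container datatypes"):
from collections import Counter
def is_single_diff(iterable):
    c = Counter(iterable)
    # values that occur more than once
    non_single_items = [x for x in c if c[x] > 1]
    # exactly two distinct values, and only one of them repeats
    return len(c) == 2 and len(non_single_items) == 1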
Tests
list1 = (1,1,1,1,1,1,5,1,1,1)
list2 = (6,7,4,4,4,1,6,7,6)
list3 = (8,8,8,8,9)
In: is_single_diff(list1)
Out: True
In: is_single_diff(list2)
Out: False
In: is_single_diff(list3)
Out: True
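Building on Counter, here is a sketch that also reports the three things the question asks for: the most represented value, the value present only once, and its position. The function name describe_single_diff and the choice to return None when there is no single odd value are assumptions made for this example.

```python
from collections import Counter


def describe_single_diff(values):
    """Return (common_value, odd_value, odd_index), or None if there is no single odd value."""
    counts = Counter(values)
    # Exactly two distinct values are required.
    if len(counts) != 2:
        return None
    # most_common() lists (value, count) pairs from most to least frequent.
    (common, common_count), (odd, odd_count) = counts.most_common()
    # The rarer value must occur exactly once and the other one must repeat.
    if odd_count != 1 or common_count < 2:
        return None
    return common, odd, list(values).index(odd)


list1 = (1, 1, 1, 1, 1, 1, 5, 1, 1, 1)
list2 = (6, 7, 4, 4, 4, 1, 6, 7, 6)
list3 = (8, 8, 8, 8, 9)
print(describe_single_diff(list1))  # (1, 5, 6)
print(describe_single_diff(list2))  # None
print(describe_single_diff(list3))  # (8, 9, 4)
```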
Use numpy.unique; it will give you all the information you need.
import numpy as np
myarray = np.array([1,1,1,1,1,1,5,1,1,1])
vals_unique,vals_counts = np.unique(myarray,return_counts=True)
You can first check for the most common value. After that, go through the list to see if there is a different value, and keep track of it.
If you later find another value that isn't the same as the most common one, the list does not have a single difference.
list1 = [1,1,1,1,1,1,5,1,1,1]
def single_difference(lst):
    most_common = max(set(lst), key=lst.count)
    diff_idx = None
    diff_val = None
    for idx, i in enumerate(lst):
        if i != most_common:
            if diff_val is not None:
                return "No unique single difference"
            diff_idx = idx
            diff_val = i
    return (most_common, diff_val, diff_idx)
print(single_difference(list1))

Unknown error on PySpark map + broadcast

I have a large collection of tuples where tuple[0] is an integer and tuple[1] is a list of integers (resulting from a groupBy). I call the value tuple[0] the key for simplicity.
The values inside the lists tuple[1] can themselves be other keys.
If key = n, all elements of its list are greater than n, sorted, and distinct.
In the problem I am working on, I need to find the number of common elements in the following way:
0, [1,2]
1, [3,4,5]
2, [3,7,8]
.....
list of values of key 0:
1: [3,4,5]
2: [3,7,8]
common_elements between list of 1 and list of 2: 3 -> len(common_elements) = 1
Then I apply the same for keys 1, 2 etc, so:
list of values of 1:
3: ....
4: ....
5: ....
The sequential script I wrote is based on a pandas DataFrame df, with the first column v as the list of 'keys' (used as the index) and the second column n as the list of lists of values:
for i in df.v: #iterate each value
    for j in df.n[i]: #iterate within the list
        common_values = set(df.n[i]).intersection(df.n[j])
        if len(common_values) > 0:
            return len(common_values)
Since it is a big dataset, I'm trying to write a parallelized version with PySpark.
df.A #column of integers
df.B #column of integers
val_colA = sc.parallelize(df.A)
val_colB = sc.parallelize(df.B)
n_values = val_colA.zip(val_colB).groupByKey().mapValues(sorted) # RDD -> n_values[0] will be the key, n_values[1] is the list of values
n_values_broadcast = sc.broadcast(n_values.collectAsMap()) #read only dictionary
def f(element):
    for i in element[1]: #iterating the values of "key" element[0]
        common_values = set(element[1]).intersection(n_values_broadcast.value[i])
        if len(common_values) > 0:
            return len(common_values)
collection = n_values.map(f).collect()
The program fails after a few seconds with an error like KeyError: 665, but does not provide any specific failure reason.
I'm a Spark beginner, so I'm not sure whether this is the correct approach (should I consider foreach or mapPartitions instead?) and especially where the error is.
Thanks for the help.
The error is actually pretty clear and not Spark specific. You are accessing a Python dict with __getitem__ ([]):
n_values_broadcast.value[i]
and if the key is missing from the dictionary you'll get a KeyError. Use the get method instead:
n_values_broadcast.value.get(i, [])
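The difference can be seen without Spark. In this small sketch the plain dict lookup is a stand-in for n_values_broadcast.value and is filled with the sample data from the question. The name lookup is only for illustration.

```python
# Stand-in for n_values_broadcast.value: key -> sorted list of values.
lookup = {0: [1, 2], 1: [3, 4, 5], 2: [3, 7, 8]}

# 3 appears as a value under keys 1 and 2 but has no entry of its own.
try:
    lookup[3]
except KeyError as err:
    print("KeyError:", err)  # KeyError: 3

# dict.get returns the supplied default instead of raising.
print(lookup.get(3, []))  # []

# With an empty default the intersection is simply empty.
common = set(lookup[1]).intersection(lookup.get(3, []))
print(len(common))  # 0
```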

Optimize search to find next matching value in a list

I have a program that goes through a list and, for each object, finds the next instance that has a matching value. When it does, it prints out the locations of both objects. The program runs perfectly fine, but when I run it with a large volume of data (~6,000,000 objects in the list) it takes much too long. If anyone could provide insight into how I can make the process more efficient, I would greatly appreciate it.
def search(list):
    original = list
    matchedvalues = []
    count = 0
    for x in original:
        targetValue = x.getValue()
        count = count + 1
        copy = original[count:]
        for y in copy:
            if targetValue == y.getValue():
                print(str(x.getLocation()) + "," + str(y.getLocation()))
                break
Perhaps you can make a dictionary that contains a list of indexes that correspond to each item, something like this:
values = [1,2,3,1,2,3,4]
from collections import defaultdict
def get_matches(x):
    my_dict = defaultdict(list)
    for ind, ele in enumerate(x):
        my_dict[ele].append(ind)
    return my_dict
Result:
>>> get_matches(values)
defaultdict(<type 'list'>, {1: [0, 3], 2: [1, 4], 3: [2, 5], 4: [6]})
Edit:
I added this part, in case it helps:
values = [1,1,1,1,2,2,3,4,5,3]
def get_next_item_ind(x, ind):
    my_dict = get_matches(x)
    indexes = my_dict[x[ind]]
    temp_ind = indexes.index(ind)
    if len(indexes) > temp_ind + 1:
        return indexes[temp_ind + 1]
    return None
Result:
>>> get_next_item_ind(values, 0)
1
>>> get_next_item_ind(values, 1)
2
>>> get_next_item_ind(values, 2)
3
>>> get_next_item_ind(values, 3)
>>> get_next_item_ind(values, 4)
5
>>> get_next_item_ind(values, 5)
>>> get_next_item_ind(values, 6)
9
>>> get_next_item_ind(values, 7)
>>> get_next_item_ind(values, 8)
There are a few ways you could increase the efficiency of this search by minimising additional memory use (particularly when your data is big).
You can operate directly on the list you are passing in and don't need the extra names; that way you won't need original = list or copy = original[count:].
You can use slices of the original list to test against, and enumerate(p) to iterate through the list. You won't need the extra variable count, and enumerate(p) is efficient in Python. (Keep in mind that each slice is still a temporary copy of part of the list.)
Re-implemented, this would become:
def search(p):
    # iterate over p
    for i, value in enumerate(p):
        # if value occurs more than once, print locations
        # do not re-test values that have already been tested (if value not in p[:i])
        if value not in p[:i] and value in p[(i + 1):]:
            # the last number printed is the offset of the next match within p[(i + 1):]
            print(value, ':', i, p[(i + 1):].index(value))
v = [1,2,3,1,2,3,4]
search(v)
1 : 0 2
2 : 1 2
3 : 2 2
Implementing it this way will only print out the values / locations where a value is repeated (which I think is what you intended in your original implementation).
Other considerations:
More than 2 occurrences of a value: If the value repeats many times in the list, then you might want to implement a function to walk recursively through the list. As it is, the question doesn't address this, and it may be that it doesn't need to in your situation.
Using a dictionary: I completely agree with Akavall above; dictionaries are a great way of looking up values in Python, especially if you need to look up values again later in the program. This will work best if you construct a dictionary instead of a list when you originally create the data. But if you are only doing this once, it is going to cost you more time to construct the dictionary and query it than simply iterating over the list as described above.
Hope this helps!
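For a list of around 6,000,000 items, a single pass with a dictionary avoids both the nested loop and the repeated slicing. What follows is only a sketch under stated assumptions: the values must be hashable, and the name next_match_indexes is made up for this example. For the objects in the question you would key the dictionary on x.getValue() rather than on the item itself.

```python
def next_match_indexes(values):
    """For each position, return the index of the next equal value, or None."""
    result = [None] * len(values)
    next_seen = {}  # value -> index of its nearest occurrence to the right
    # Walk from the end so next_seen always holds the next occurrence.
    for i in range(len(values) - 1, -1, -1):
        v = values[i]
        result[i] = next_seen.get(v)
        next_seen[v] = i
    return result


values = [1, 2, 3, 1, 2, 3, 4]
print(next_match_indexes(values))  # [3, 4, 5, None, None, None, None]
```

Each item is visited once and every dictionary operation takes roughly constant time, so the work grows linearly with the length of the list.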

Python Values in Lists

I am using Python 3.0 to write a program. In this program I deal a lot with lists which I haven't used very much in Python.
I am trying to write several if statements about these lists, and I would like to know how to look at just a specific value in the list. I also would like to be informed of how one would find the placement of a value in the list and input that in an if statement.
Here is some code to better explain that:
count = list.count(1)
if count > 1:
    # (This is where I would like to have it look at where the 1 is that the count is finding)
Thank You!
Check out the documentation on sequence types and list methods.
To look at a specific element in the list you use its index:
>>> x = [4, 2, 1, 0, 1, 2]
>>> x[3]
0
To find the index of a specific value, use list.index():
>>> x.index(1)
2
Some more information about exactly what you are trying to do would be helpful, but it might be helpful to use a list comprehension to get the indices of all elements you are interested in, for example:
>>> [i for i, v in enumerate(x) if v == 1]
[2, 4]
You could then do something like this:
ones = [i for i, v in enumerate(your_list) if v == 1]
if len(ones) > 1:
    # each element in ones is an index in your_list where the value is 1
    pass
Also, naming a variable list is a bad idea because it conflicts with the built-in list type.
edit: In your example you use your_list.count(1) > 1, this will only be true if there are two or more occurrences of 1 in the list. If you just want to see if 1 is in the list you should use 1 in your_list instead of using list.count().
You can use list.index() to find elements in the list besides the first one, but you would need to take a slice of the list starting from one element after the previous match, for example:
your_list = [4, 2, 1, 0, 1, 2]
i = -1
while True:
    try:
        i = your_list[i+1:].index(1) + i + 1
        print("Found 1 at index", i)
    except ValueError:
        break
This should give the following output:
Found 1 at index 2
Found 1 at index 4
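A variation that avoids slicing: list.index() accepts an optional start position as its second argument, so the search can resume just past the previous match. A minimal sketch, with the helper name all_indexes made up for this example:

```python
def all_indexes(items, value):
    """Return every index at which value occurs in items."""
    found = []
    start = 0
    while True:
        try:
            # The second argument tells index() where to begin searching.
            i = items.index(value, start)
        except ValueError:
            # No further occurrence from start onwards.
            return found
        found.append(i)
        start = i + 1


your_list = [4, 2, 1, 0, 1, 2]
print(all_indexes(your_list, 1))  # [2, 4]
```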
First off, I would strongly suggest reading through a beginner’s tutorial on lists and other data structures in Python: I would recommend starting with Chapter 3 of Dive Into Python, which goes through the native data structures in a good amount of detail.
To find the position of an item in a list, you have two main options, both using the index method. First off, checking beforehand:
numbers = [2, 3, 17, 1, 42]
if 1 in numbers:
    index = numbers.index(1)
    # Do something interesting
Your other option is to catch the ValueError thrown by index:
numbers = [2, 3, 17, 1, 42]
try:
    index = numbers.index(1)
except ValueError:
    # The number isn't here
    pass
else:
    # Do something interesting
    pass
One word of caution: avoid naming your lists list: quite aside from not being very informative, it’ll shadow Python’s native definition of list as a type, and probably cause you some very painful headaches later on.
You can find the index of the element like this:
idx = lst.index(1)
And then access the element like this:
e = lst[idx]
If what you want is the next element:
n = lst[idx+1]
Now, you have to be careful: what happens if the element is not in the list? One way to handle that case would be:
try:
    idx = lst.index(1)
    n = lst[idx+1]
except ValueError:
    # do something if the element is not in the list
    pass
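There is a second edge case: if the element is the last one in the list, lst[idx+1] raises an IndexError, which an except ValueError clause does not catch. Here is a sketch that covers both situations. The helper name next_after and its default parameter are assumptions made for this example.

```python
def next_after(lst, value, default=None):
    """Return the element following the first occurrence of value, or default."""
    try:
        idx = lst.index(value)
    except ValueError:
        # value is not in the list at all
        return default
    if idx + 1 < len(lst):
        return lst[idx + 1]
    # value is the last element, so nothing follows it
    return default


print(next_after([4, 2, 1, 0], 1))  # 0
print(next_after([4, 2, 1], 1))     # None
print(next_after([4, 2], 1))        # None
```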
list.index(x)
Return the index in the list of the first item whose value is x. It is an error if there is no such item.
--
In the docs you can find some more useful functions on lists: http://docs.python.org/tutorial/datastructures.html#more-on-lists
--
Added suggestion after your comment: Perhaps this is more helpful:
for idx, value in enumerate(your_list):
    # `idx` will contain the index of the item and `value` will contain the value at index `idx`
    pass
