I'd like to slice a numpy array to get all but the first item, unless there's only one element, in which case, I only want to select that element (i.e. don't slice).
Is there a way to do this without using an if-statement?
x = np.array([1,2,3,4,5])
y = np.array([1])
print(x[1:]) # works
print(y[1 or None:]) # doesn't work
I tried the above, but it didn't work.
A way to write that without a conditional is to use negative indexing with -len(arr) + 1:
>>> x = np.array([1,2,3,4,5])
>>> y = np.array([1])
>>> x[-len(x)+1:]
array([2, 3, 4, 5])
>>> y[-len(y)+1:]
array([1])
If the array has N elements where N > 1, the slice becomes -N+1:. Since -N+1 < 0, it is effectively (N + (-N + 1)):, which is 1:, i.e., from index 1 onwards.
Otherwise, when N == 1, the slice is 0:, i.e., from the first element onwards, which is the only element.
Because of how slicing works, an empty array (i.e., N = 0 case) will result in an empty array too.
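To check the three cases above in one place, here is a minimal sketch. It uses plain Python lists, which follow the same rule for a negative slice start as 1-D NumPy arrays; the helper name all_but_first is made up for illustration.

```python
def all_but_first(seq):
    """Drop the first item unless it is the only one (hypothetical helper)."""
    # For len(seq) == N > 1 the start is -N + 1, which resolves to index 1.
    # For N == 1 the start is 0, so the single item is kept.
    return seq[-len(seq) + 1:]


print(all_but_first([1, 2, 3, 4, 5]))  # [2, 3, 4, 5]
print(all_but_first([1]))              # [1]
print(all_but_first([]))               # []
```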
You can just write an if / else:
x[1 if len(x) > 1 else 0:]
array([2, 3, 4, 5])
y[1 if len(y) > 1 else 0:]
array([1])
Or:
y[int(len(y) > 1):]
array([1])
x[int(len(x) > 1):]
array([2, 3, 4, 5])
Just move None out of the brackets
x = [1,2,3,4,5]
y = [1]
print(x[1:])
print(y[1:] or None)
Use a ternary expression to make the logic clear but still be able to use it as a function argument or inside other expressions.
The ternary expression x if len(x) == 1 else x[1:] works and is very clear. And you can use it as a parameter in a function call, or in a larger expression.
E.g.:
>>> x = np.array([1,2,3,4,5])
>>> y = np.array([1])
>>> print(x if len(x) == 1 else x[1:])
[2 3 4 5]
>>> print(y if len(y) == 1 else y[1:])
[1]
Musings on other solutions
I'm not sure if you're looking for the most concise code possible to do this, or just for the ability to have the logic inside a single expression.
I don't recommend fancy slicing solutions with negative indexing, for the sake of legibility of your code. Think about future readers of your code, even yourself in a year or two.
Using this in a larger expression
In the comments, you mention you need a solution that can be incorporated into something like a comprehension. The ternary expression can be used as is within a comprehension. For example, this code works:
l = [np.array(range(i)) for i in range(5)]
l2 = [
    x if len(x) == 1 else x[1:]
    for x in l
]
I've added spacing to make the code easier to read, but it would also work as a one liner:
l2 = [x if len(x) == 1 else x[1:] for x in l]
EDIT note
Earlier, I thought you wanted the first element extracted from the list in the single-element case, i.e., x[0], but I believe you actually want that single-element list unsliced, i.e., x, so I've updated my answer accordingly.
Check whether the array size is greater than 1; if it is, you can delete the first element with np.delete, which returns a new array without it.
print(np.delete(x, 0))
The result is a new array containing all items except the first.
Related
Hi I'm really new to programming.
I want to add 1 (+1) to each item in a list, except for the items that have value = 3.
I tried with a for loop:
p = [1,2,3]
p= [x+1 for x in p if x != 3]
print (p)
#[2,3]
But the output is [2,3], so it adds 1 to the first two items but doesn't output the last one. That's not what I want; I want it to show all 3 items but not add anything to those that are equal to 3.
Then I tried this but still doesn't work:
p = [1,2,3]
p= [x+1 for x!=3 in p]
print (p)
#SyntaxError: invalid syntax
As you have found the guard expression [<expr> for x in p if <guard>] will filter the list to only those that meet the guard expression.
As you are looking to work on every value then you should not use guard but look at the ternary operator (aka Conditional Expressions):
[x+1 if x != 3 else x for x in p]
Since booleans are also integers you can just add the result of the condition to the value:
[x + (x!=3) for x in p]
And if you want to get fancy, you can use numpy (p must be a NumPy array here, not a list):
import numpy as np
p = np.array(p)
p[p!=3] += 1
Your code is only selecting those elements of p not equal to 3. Instead, you need the conditional to apply to the outcome:
[x+1 if x != 3 else x for x in p]
Just another way, using that bools act like 0 and 1:
[x + (x != 3) for x in p]
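To put the variants from these answers side by side, here is a small sketch using the list from the question. The variable names filtered, mapped and added are only for illustration.

```python
p = [1, 2, 3]

# A trailing "if" is a filter: items that fail the test are dropped.
filtered = [x + 1 for x in p if x != 3]

# A conditional expression before "for" chooses a value for every item.
mapped = [x + 1 if x != 3 else x for x in p]

# bool is a subclass of int, so (x != 3) contributes 1 or 0.
added = [x + (x != 3) for x in p]

print(filtered)  # [2, 3]
print(mapped)    # [2, 3, 3]
print(added)     # [2, 3, 3]
```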
I haven't coded in Python for a long time and sometimes I'm quite confused.
When I have an array, like eg.: arr = [0, 1, 0, 3, 12]
and say:
for i in arr:
    if i == 0:
        arr.remove(i)
it is removing the two zeroes the way I want.
But if the array is something like: arr = [0, 0, 1], with the above method, one 0 will remain.
So could someone explain to me why it behaves like that? I can't find an explanation for this.
Better try this:
arr = [n for n in arr if n != 0]
This uses a list comprehension, it's a lot safer than what you're doing: removing the elements at the same time you're iterating is a bad idea, and you'll experience problems such as elements not being removed.
This is because the list shrinks, so the iterator traversing it finds fewer elements than it expected when the iteration began.
I think I found why your method doesn't work. The problem comes from the way you iterate.
In your example, your function seems to work for arr = [0,1,0,3,12] but not on your second array arr2 = [0,0,2] and returns [0,2]. One interesting thing to investigate then, is the fact that in your second example, you have two consecutive zeros.
Take a look at this code and try to execute it :
for i in arr:
    print('i = '+str(i))
    if(i == 0):
        arr.remove(i)
With your first array, you noticed that your output is the one you expected but that was lucky. As a matter of fact, if you run the code above, you would see that it prints in your console :
> i = 0
> i = 0
> i = 12
So, actually, this means that your remove statement changes the array you iterate on. After a deletion, you skip an element in your array.
This means you should prefer another way, like the ones suggested in comments.
Hope this helps
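The skipping can be made visible by recording which items the loop actually visits. This is only a sketch: the function names are invented for illustration, and iterating over a copy (arr[:]) is one common workaround rather than something proposed in the answer above.

```python
def remove_zeros_while_iterating(arr):
    """Mutates arr during iteration; returns the items the loop visited."""
    visited = []
    for i in arr:
        visited.append(i)
        if i == 0:
            arr.remove(i)  # shifts later items left, so the next one is skipped
    return visited


def remove_zeros_over_copy(arr):
    """Iterates over a copy, so removals from arr cannot skip anything."""
    for i in arr[:]:
        if i == 0:
            arr.remove(i)


first = [0, 0, 1]
print(remove_zeros_while_iterating(first))  # [0, 1]
print(first)                                # [0, 1]

second = [0, 0, 1]
remove_zeros_over_copy(second)
print(second)                               # [1]
```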
you can filter out your zeros with the built-in function filter:
arr = list(filter(None, arr))
Be careful when using filter with None as the first argument: it applies bool to each item, so elements such as None, 0, or the empty string '' all evaluate to False and are filtered out. To be safe, you may use:
arr = list(filter(lambda x: x != 0 , arr))
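The difference between the two calls shows up on a list that mixes zeros with other falsy values. The sample data below is made up for illustration.

```python
items = [0, 1, None, '', 2, 0.0, 'a']

# filter(None, ...) keeps only truthy items, so None and '' are dropped too.
truthy_only = list(filter(None, items))

# An explicit test removes only values equal to 0 (which includes 0.0).
non_zero = list(filter(lambda x: x != 0, items))

print(truthy_only)  # [1, 2, 'a']
print(non_zero)     # [1, None, '', 2, 'a']
```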
So I'm writing a function that is going to multiply each number at an odd index in a list by 2. I'm stuck though, as I really don't know how to approach it.
This is my code.
def produkt(pnr):
    for i in pnr:
        if i % 2 != 0:
            i = i * 2
    return pnr
If I, for example, type produkt([1,2,3]) I get [1,2,3] back but I would want it to be [2,2,6].
note that modifying i in your example does not change the value from the input list (integers are immutable). And you're also mixing up the values with their position.
Also, since indices start at 0 in python, you got it the wrong way.
In those cases, a simple list comprehension with a ternary expression will do, using enumerate to be able to get hold of the indices (making it start at 1 to match your case, you can adjust at will):
[p*2 if i%2 else p for i,p in enumerate(pnr,1)]
(note that if i%2 is shorter than if i%2 != 0)
using list comprehensions:
multiply odd numbers by 2:
[x*2 if x%2 else x for x in pnr]
After clarification of question wording:
multiply numbers at odd indices by 2:
[x*2 if i%2 else x for i,x in enumerate(pnr)]
Consider using list comprehensions:
def produkt(pnr):
    return [k * 2 if k % 2 else k for k in pnr]
Doing i = i * 2 you just override a local variable.
UPDATE (question was changed):
def produkt(pnr):
    return [k * 2 if i % 2 else k for i, k in enumerate(pnr, 1)]
You can get the indices using enumerate, however that starts by default with index 0 (not 1) but it accepts a start argument to override that default.
The problem with your approach is that you don't change the actual list contents, you just assign a different value to the name i (which represented a list element until you assigned a different value to it with i = i*2). If you want it to work in-place you would need to modify the list itself: e.g. pnr[idx] *= 2 or pnr[idx] = pnr[idx] * 2.
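A minimal sketch of the in-place variant described above might look like this. The name produkt_in_place is hypothetical, and positions are counted from 1 as in the rest of this answer.

```python
def produkt_in_place(pnr):
    """Double the items at odd 1-based positions, modifying pnr itself."""
    # 0-based indices 0, 2, 4, ... correspond to 1-based positions 1, 3, 5, ...
    for idx in range(0, len(pnr), 2):
        pnr[idx] *= 2  # assigns back into the list, unlike "i = i * 2"


numbers = [1, 2, 3]
produkt_in_place(numbers)
print(numbers)  # [2, 2, 6]
```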
However, it's generally easier to just create a new list instead of modifying an existing one.
For example:
def produkt(pnr):
    newpnr = [] # create a new list
    for idx, value in enumerate(pnr, 1):
        # If you're testing for not-zero you can omit the "!=0" because every
        # non-zero number is "truthy".
        if idx % 2:
            newpnr.append(value * 2) # append to the new list
        else:
            newpnr.append(value) # append to the new list
    return newpnr # return the new list
>>> produkt([1,2,3])
[2, 2, 6]
Or even better: use a generator function instead of using all these appends:
def produkt(pnr):
    for idx, value in enumerate(pnr, 1):
        if idx % 2:
            yield value * 2
        else:
            yield value
>>> list(produkt([1,2,3])) # generators should be consumed, for example by "list"
[2, 2, 6]
Of course you could also just use a list comprehension:
def produkt(pnr):
    return [value * 2 if idx % 2 else value for idx, value in enumerate(pnr, 1)]
>>> produkt([1,2,3])
[2, 2, 6]
Try this:
def produkt(pnr):
    return [ 2*x if i % 2 == 0 else x for i, x in enumerate(pnr)]
It will double every element at an odd position in your list, counting positions from 1 (that is, at even 0-based indices).
>>> produkt([1,2,3])
[2, 2, 6]
Your code does not work, as i is no reference to the value inside the list, but just its value.
You have to store the new value in the list again.
def produkt(pnr):
    for i in range(len(pnr)):
        if pnr[i] % 2 != 0:
            pnr[i] *= 2
    return pnr
or use this more convenient solution:
def produkt(pnr):
    return [x * 2 if x % 2 != 0 else x for x in pnr]
Edit: As the question has been changed (completely) you should use this code:
def produkt(pnr):
    return [x * 2 if ind % 2 else x for ind, x in enumerate(pnr)]
The first examples multiply each odd number by 2, and the latter code multiplies the numbers at odd indices by 2.
Your problem is that i is a copy of the values in the pnr list, not the value in the list itself. So, you are not changing the list when doing i = i * 2.
The existing answers are already good and show the idiomatic way to achieve your goal. However, here is the minimum change to make it work as expected for learning purpose.
def produkt(pnr):
    new_pnr = list(pnr)
    for ix in range(len(new_pnr)):
        if new_pnr[ix] % 2 != 0:
            new_pnr[ix] *= 2
    return new_pnr
Without new_pnr you'd be changing the list in place and then you wouldn't need to return it.
I've seen loops like this a lot on hackerrank but I still don't understand how they work. Why does it have a constant integer '1' in it? Shouldn't it be 'i' instead of '1'? Can anyone please explain this to me.
sum (1 for i in l if i >= a and i <= b)
Credit where credit is due. I copied this loop from a very elegant solution to a problem by Shashwat. The problem was 'Sherlock and Squares' in HackerRank algorithms, for the curious ones.
I don't know your values so let's assume:
>>> l = list(range(10))
>>> a = 4
>>> b = 7
If you break down your line of code into a couple of steps and print the intermediate results it's clearer:
>>> [1 for i in l if i >= a and i <= b]
[1, 1, 1, 1]
This is what gets passed to sum. (When you leave off the square brackets it implicitly becomes a generator but this is what it looks like as a list.)
In case you don't understand the comprehension, it's equivalent to this:
>>> result = []
>>> for i in l:
...     if i >= a and i <= b:
...         result.append(1)
...
>>> result
[1, 1, 1, 1]
The summation would be equivalent to changing result = [] to result = 0 and result.append(1) to result += 1.
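Written out, the counter version described in the previous sentence could look like the sketch below. It reuses the assumed values of l, a and b from this answer; the chained comparison a <= i <= b is just a shorter spelling of the same condition.

```python
l = list(range(10))
a = 4
b = 7

# Explicit counter: add 1 for every item that falls inside [a, b].
result = 0
for i in l:
    if a <= i <= b:  # same test as "i >= a and i <= b"
        result += 1

# The one-liner from the question does the same counting.
one_liner = sum(1 for i in l if i >= a and i <= b)

print(result)     # 4
print(one_liner)  # 4
```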
In your example they're basically adding 1 to a variable for every item in l if the item is larger than or equal to a and the item is smaller than or equal to b.
This is basically equal to this code:
x = []
for i in l:
    if i >=a and i <= b:
        x.append(1)
sum(x)
sum (1 for i in l if i >= a and i <= b)
This creates a generator expression that yields a 1 for each element of l that satisfies the condition i >= a and i <= b, with i as the loop variable.
Then, sum will add all the 1s together.
I am looking for the fastest way to output the index of the first difference of two arrays in Python. For example, let's take the following two arrays:
test1 = [1, 3, 5, 8]
test2 = [1]
test3 = [1, 3]
Comparing test1 and test2, I would like to output 1, while the comparison of test1 and test3 should output 2.
In other words I look for an equivalent to the statement:
import numpy as np
np.where(np.where(test1 == test2, test1, 0) == '0')[0][0]
with varying array lengths.
Any help is appreciated.
For lists this works:
from itertools import zip_longest
def find_first_diff(list1, list2):
    for index, (x, y) in enumerate(zip_longest(list1, list2,
                                               fillvalue=object())):
        if x != y:
            return index
zip_longest pads the shorter list with None or with a provided fill value. The standard zip does not work if the difference is caused by different list lengths rather than actual different values in the lists.
On Python 2 use izip_longest.
Updated: Created unique fill value to avoid potential problems with None as list value. object() is unique:
>>> o1 = object()
>>> o2 = object()
>>> o1 == o2
False
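To illustrate why the unique fill value matters, here is a sketch comparing the default None padding with an object() sentinel on a list that really contains None. Both function names are invented for this comparison.

```python
from itertools import zip_longest


def first_diff_none_fill(list1, list2):
    """Pads with the default None, which can collide with real None items."""
    for index, (x, y) in enumerate(zip_longest(list1, list2)):
        if x != y:
            return index


def first_diff_sentinel(list1, list2):
    """Pads with a fresh object() that compares unequal to every list item."""
    sentinel = object()
    for index, (x, y) in enumerate(zip_longest(list1, list2, fillvalue=sentinel)):
        if x != y:
            return index


print(first_diff_none_fill([1, None], [1]))  # None (the difference is missed)
print(first_diff_sentinel([1, None], [1]))   # 1
```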
This pure Python approach might be faster than a NumPy solution. This depends on the actual data and other circumstances.
Converting a list into a NumPy array also takes time. This might actually
take longer than finding the index with the function above. If you are not
going to use the NumPy array for other calculations, the conversion
might cause considerable overhead.
NumPy always searches the full array. If the difference comes early,
you do a lot more work than you need to.
NumPy creates a bunch of intermediate arrays. This costs memory and time.
NumPy needs to construct intermediate arrays with the maximum length.
Comparing many small with very large arrays is unfavorable here.
In general, in many cases NumPy is faster than a pure Python solution.
But each case is a bit different and there are situations where pure
Python is faster.
With NumPy arrays (which will be faster for big arrays), you could compare the overlapping parts, slicing the longer array to the length of the shorter, and then fall back to comparing the lengths. Something like the following (test1 and test2 must already be NumPy arrays):
import numpy as np
n = min(len(test1), len(test2))
x = np.where(test1[:n] != test2[:n])[0]
if len(x) > 0:
    ans = x[0]
elif len(test1) != len(test2):
    ans = n
else:
    ans = None
EDIT - despite this being voted down I will leave my answer up here in case someone else needs to do something similar.
If the starting arrays are large and numpy then this is the fastest method. Also I had to modify Andy's code to get it to work. In the order: 1. my suggestion, 2. Paidric's (now removed but the most elegant), 3. Andy's accepted answer, 4. zip - non numpy, 5. vanilla python without zip as per #leekaiinthesky
0.1ms, 9.6ms, 0.6ms, 2.8ms, 2.3ms
If the conversion to ndarray is included in timeit, then the non-NumPy, non-zip method is fastest:
7.1ms, 17.1ms, 7.7ms, 2.8ms, 2.3ms
and even more so if the difference between the two lists is at around index 1,000 rather than 10,000
7.1ms, 17.1ms, 7.7ms, 0.3ms, 0.2ms
import timeit
setup = """
import numpy as np
from itertools import zip_longest
list1 = [1 for i in range(10000)] + [4, 5, 7]
list2 = [1 for i in range(10000)] + [4, 4]
test1 = np.array(list1)
test2 = np.array(list2)
def find_first_diff(l1, l2):
    for index, (x, y) in enumerate(zip_longest(l1, l2, fillvalue=object())):
        if x != y:
            return index
def findFirstDifference(list1, list2):
    minLength = min(len(list1), len(list2))
    for index in range(minLength):
        if list1[index] != list2[index]:
            return index
    return minLength
"""
fn = ["""
n = min(len(test1), len(test2))
x = np.where(test1[:n] != test2[:n])[0]
if len(x) > 0:
    ans = x[0]
elif len(test1) != len(test2):
    ans = n
else:
    ans = None""",
"""
x = np.where(np.in1d(list1, list2) == False)[0]
if len(x) > 0:
    ans = x[0]
else:
    ans = None""",
"""
x = test1
y = np.resize(test2, x.shape)
x = np.where(np.where(x == y, x, 0) == 0)[0]
if len(x) > 0:
    ans = x[0]
else:
    ans = None""",
"""
ans = find_first_diff(list1, list2)""",
"""
ans = findFirstDifference(list1, list2)"""]
for f in fn:
    print(timeit.timeit(f, setup, number = 1000))
Here is one way to do it:
def compare_lists(lista, listb):
    """
    Compare two lists and return the first index where they differ. If
    they are equal, return the list length.
    """
    for position, (a, b) in enumerate(zip(lista, listb)):
        if a != b:
            return position
    return min([len(lista), len(listb)])
The algorithm is simple: zip the two lists (on Python 2, the more efficient itertools.izip can be used instead), then compare them element by element.
The enumerate function gives the index position, which we can return if a discrepancy is found.
If we exit the for loop without returning, one of two possibilities has occurred:
The two lists are identical. In this case, we want to return the length of either list.
The lists are of different lengths and are equal up to the length of the shorter list. In this case, we want to return the length of the shorter list.
In either case, the min(...) expression is what we want.
This function has a bug: if you compare two empty lists, it returns 0, which seems wrong. I'll leave it to you to fix it as an exercise.
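One possible take on that exercise is sketched below. It assumes that "no difference at all" should be reported as None; the answer does not say what the intended result is, so treat both that choice and the name compare_lists_or_none as assumptions.

```python
def compare_lists_or_none(lista, listb):
    """Return the first differing index, or None if the lists are identical.

    Returning None for identical lists is an assumption of this sketch.
    """
    for position, (a, b) in enumerate(zip(lista, listb)):
        if a != b:
            return position
    if len(lista) == len(listb):
        return None  # covers two empty lists as well
    # Equal up to the shorter length: the first difference is right after it.
    return min(len(lista), len(listb))


print(compare_lists_or_none([], []))               # None
print(compare_lists_or_none([1, 3, 5, 8], [1]))    # 1
print(compare_lists_or_none([1, 3, 5, 8], [1, 3])) # 2
```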
The fastest algorithm would compare every element up to the first difference and no more. So iterating through the two lists pairwise like that would give you this:
def findFirstDifference(list1, list2):
    minLength = min(len(list1), len(list2))
    for index in range(minLength):
        if list1[index] != list2[index]:
            return index
    return minLength # the two lists agree where they both have values, so return the next index
Which gives the output you want:
print(findFirstDifference(test1, test3))
> 2
Thanks for all of your suggestions, I just found a much simpler way for my problem which is:
x = np.array(test1)
y = np.resize(np.array(test2), x.shape)
np.where(np.where(x == y, x, 0) == 0)[0][0]
Here's an admittedly not very pythonic, numpy-free stab:
b = list(zip(test1, test2))
c = 0
# Count leading pairs that match; stop at the first mismatch or when pairs run out.
while b and b[0][0] == b[0][1]:
    b = b[1:]
    c = c + 1
print(c)
For Python 3.x:
def first_diff_index(ls1, ls2):
    l = min(len(ls1), len(ls2))
    return next((i for i in range(l) if ls1[i] != ls2[i]), l)
(On Python 2.7, substitute xrange for range.)