Replacing values less than the threshold in Python [duplicate]


I am trying to do the following in Python and am seeing strange behavior. Say I have the following list:
x = [5, 4, 3, 2, 1]
Now, I am doing something like:
x[x >= 3] = 3
This gives:
x = [5, 3, 3, 2, 1]
Why does only the second element get changed? I was expecting:
[3, 3, 3, 2, 1]

Because Python (2.x) evaluates x >= 3 as True, and since True is equal to 1, the second element of x (index 1) is set to 3.
For this purpose you need to use a list comprehension:
>>> [3 if i >=3 else i for i in x]
[3, 3, 3, 2, 1]
If you want to know why x >= 3 evaluates to True, see the following note from the documentation:
CPython implementation detail: Objects of different types except numbers are ordered by their type names; objects of the same types that don’t support proper comparison are ordered by their address.
In Python 2.x (in the CPython implementation), a list always compares greater than an integer, just as a string compares greater than a list:
>>> ''>[]
True
In Python 3.x, however, you can't compare unorderable types, and you get a TypeError instead:
In [17]: '' > []
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-17-052e7eb2f6e9> in <module>()
----> 1 '' > []
TypeError: unorderable types: str() > list()
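To make the comprehension reusable, it could be wrapped in a small helper. This is only a sketch: the name cap_values is made up for illustration and is not part of the original answer. It also shows slice assignment, in case the existing list object has to be updated in place rather than replaced.

```python
def cap_values(values, threshold):
    """Return a new list with every item >= threshold replaced by threshold."""
    return [threshold if v >= threshold else v for v in values]


x = [5, 4, 3, 2, 1]
capped = cap_values(x, 3)   # new list, x is unchanged
print(capped)               # [3, 3, 3, 2, 1]

alias = x                   # second name for the same list object
x[:] = cap_values(x, 3)     # slice assignment updates that object in place
print(alias)                # [3, 3, 3, 2, 1]
```

For a plain upper cap, min(v, threshold) inside the comprehension would give the same result.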

You can use this syntax with Numpy:
>>> import numpy as np
>>> x = np.array([5, 4, 3, 2, 1])
>>> x[x>3]=3
>>> x
array([3, 3, 3, 2, 1])
You can also do this with Pandas:
>>> import pandas as pd
>>> x = pd.Series([5, 4, 3, 2, 1])
>>> x
0 5
1 4
2 3
3 2
4 1
dtype: int64
>>> x[x>3]=3
>>> x
0 3
1 3
2 3
3 2
4 1
dtype: int64
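If the goal is just to cap values at a threshold, NumPy's np.minimum and np.clip may be worth a look as alternatives to the mask assignment. A minimal sketch, with variable names chosen for illustration; note that both calls return a new array and leave x untouched:

```python
import numpy as np

x = np.array([5, 4, 3, 2, 1])

# Element-wise minimum of each value and the threshold
capped = np.minimum(x, 3)
print(capped)    # [3 3 3 2 1]

# np.clip with no lower bound (None) and an upper bound of 3
clipped = np.clip(x, None, 3)
print(clipped)   # [3 3 3 2 1]

print(x)         # [5 4 3 2 1], the original is unchanged
```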

You're using Python lists. In Python 2.x, comparing a list with an int compares the types, not the values. So your comparison results in True, which is equivalent to 1. In other words, your expression is equivalent to:
x[1] = 3 # x[x >= 3] is x[True], which is x[1]
Note that Python 3.x disallows this type of comparison (because it's almost certainly not what you meant). If you want this sort of operation, you probably came across it in the NumPy documentation, since the NumPy API is designed specifically to support it:
import numpy as np
array = np.arange(5)
array[array > 3] = 3
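The "True is equivalent to 1" step can be checked on its own, without the list-to-int comparison. A small sketch that should behave the same way on Python 3, since bool is a subclass of int there as well:

```python
x = [5, 4, 3, 2, 1]

print(isinstance(True, int))  # True
print(True == 1)              # True

x[True] = 3                   # indexes position 1, same as x[1] = 3
print(x)                      # [5, 3, 3, 2, 1]
```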

Related

R's which() and which.min() Equivalent in Python

I read the similar topic here. I think the question is different or at least .index() could not solve my problem.
This is a simple code in R and its answer:
x <- c(1:4, 0:5, 11)
x
#[1] 1 2 3 4 0 1 2 3 4 5 11
which(x==2)
# [1] 2 7
min(which(x==2))
# [1] 2
which.min(x)
#[1] 5
which() simply returns the indices of the items that meet the condition.
If x is the input in Python, how can I get the indices of the elements that meet the criterion x==2, and the index of the smallest element in the array (which.min)?
x = [1,2,3,4,0,1,2,3,4,11]
x=np.array(x)
x[x>2].index()
##'numpy.ndarray' object has no attribute 'index'

NumPy has built-in functions for this:
import numpy as np
x = [1,2,3,4,0,1,2,3,4,11]
x = np.array(x)
np.where(x == 2)
Out[9]: (array([1, 6], dtype=int64),)
np.min(np.where(x==2))
Out[10]: 1
np.argmin(x)
Out[11]: 4
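Since np.where with a single argument returns a tuple of arrays, np.flatnonzero may be a slightly closer match to R's which() for 1-D input. A sketch under that assumption; keep in mind that NumPy indices are 0-based while R's are 1-based:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 0, 1, 2, 3, 4, 11])

hits = np.flatnonzero(x == 2)    # plain index array instead of a tuple
print(hits)                      # [1 6]
print(hits + 1)                  # [2 7], the 1-based positions R would report

# Guard against "no match" before taking the first hit
first_hit = hits[0] if hits.size else None
print(first_hit)                 # 1

print(np.argmin(x))              # 4
```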

A simple loop will do:
res = []
x = [1,2,3,4,0,1,2,3,4,11]
for i in range(len(x)):
    if check_condition(x[i]):
        res.append(i)
One liner with comprehension:
res = [i for i, v in enumerate(x) if check_condition(v)]
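check_condition is left undefined in the snippets above. One way to fill it in is to pass the condition as a function. The helper name which below is hypothetical, borrowed from R only for readability:

```python
def which(values, condition):
    """Return the 0-based indices of the items for which condition(item) is true."""
    return [i for i, v in enumerate(values) if condition(v)]


x = [1, 2, 3, 4, 0, 1, 2, 3, 4, 11]

print(which(x, lambda v: v == 2))   # [1, 6]
print(which(x, lambda v: v > 3))    # [3, 8, 9]
print(which(x, lambda v: v > 100))  # []
```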

NumPy for R provides you with a bunch of R functionalities in Python.
As to your specific question:
import numpy as np
x = [1,2,3,4,0,1,2,3,4,11]
arr = np.array(x)
print(arr)
# [ 1 2 3 4 0 1 2 3 4 11]
print(arr.argmin(0)) # R's which.min()
# 4
print((arr==2).nonzero()) # R's which()
# (array([1, 6]),)

A method based on positional indexing and NumPy, which returns the value of one column in the row where another column has its minimum (use np.argmax for the maximum):
df.iloc[np.argmin(df['column1'].values)]['column2']

The built-in list method index can be used for this purpose:
x = [1,2,3,4,0,1,2,3,4,11]
print(x.index(min(x)))
#4
print(x.index(max(x)))
#9
However, for indices based on a condition, np.where or a manual loop with enumerate may work:
index_greater_than_two1 = [idx for idx, val in enumerate(x) if val>2]
print(index_greater_than_two1)
# [2, 3, 7, 8, 9]
# OR
index_greater_than_two2 = np.where(np.array(x)>2)
print(index_greater_than_two2)
# (array([2, 3, 7, 8, 9], dtype=int64),)

You could also use heapq to find the index of the smallest element. You can then choose to find several (for example, the indices of the 2 smallest).
import heapq
import numpy as np
x = np.array([1,2,3,4,0,1,2,3,4,11])
heapq.nsmallest(2, range(len(x)), key=x.take)
Returns
[4, 0]
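If NumPy is already in use, np.argsort could do the same job without heapq. A sketch, assuming ties should keep their original order (hence kind="stable"):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 0, 1, 2, 3, 4, 11])

# Indices that would sort x; the first two belong to the two smallest values
two_smallest = np.argsort(x, kind="stable")[:2]
print(two_smallest)   # [4 0]
```

This sorts the whole array, so for very large arrays and a small count the heapq approach may still be preferable.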

Comparing scalars to Numpy arrays [duplicate]

What I am trying to do is make a table based on a piece-wise function in Python. For example, say I wrote this code:
import numpy as np
from astropy.table import Table, Column
from astropy.io import ascii
x = np.array([1, 2, 3, 4, 5])
y = x * 2
data = Table([x, y], names = ['x', 'y'])
ascii.write(data, "xytable.dat")
xytable = ascii.read("xytable.dat")
print(xytable)
This works as expected: it prints a table that has x values 1 through 5 and y values 2, 4, 6, 8, 10.
But, what if I instead want y to be x * 2 only if x is 3 or less, and y to be x + 2 otherwise?
If I add:
if x > 3:
    y = x + 2
it says:
The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
How do I code my table so that it works as a piece-wise function? How do I compare scalars to Numpy arrays?

You can possibly use numpy.where():
In [196]: y = np.where(x > 3, x + 2, y)
In [197]: y
Out[197]: array([2, 4, 6, 6, 7])
The code above gets the job done in a fully vectorized manner. This approach is generally more efficient (and arguably more elegant) than using list comprehensions and type conversions.
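For reference, the whole piece-wise rule from the question (x * 2 where x is 3 or less, x + 2 otherwise) can be written without a separate first pass. A sketch with two equivalent variants; the names y_masked and mask are only illustrative:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])

# Variant 1: choose element-wise between the two branches
y = np.where(x <= 3, x * 2, x + 2)
print(y)          # [2 4 6 6 7]

# Variant 2: compute one branch everywhere, then overwrite via a boolean mask
y_masked = x * 2
mask = x > 3
y_masked[mask] = x[mask] + 2
print(y_masked)   # [2 4 6 6 7]
```

The comparison x > 3 yields a boolean array rather than a single True or False, which is why it works as a mask but raises the "truth value is ambiguous" error inside a plain if statement.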

Start off without NumPy (or maybe you can use it, I don't know NumPy) and just do it using regular Python lists.
x = [ 1, 2, 3, 4, 5 ]
y = [ i * 2 if i <= 3 else i + 2 for i in x ]
print(y)
Outputs:
[2, 4, 6, 6, 7]
Then you can make it a numpy array:
x = np.array(x)
y = np.array(y)

How to program python variable to point different reference location but same value [duplicate]

After thinking for a while, I came back to a concept I ran into a few days ago.
In Python, if I do x=y, then x and y automatically point to the same object. Is there any way to make y refer to a different object that has the same value as x?
For example:
x=100
y=x
Now x and y refer to the same object holding the value 100, but I want y to have a different location.
Edit: What I am trying to do
l1=[1,2,3,4]
l2=l1
i=0
j=len(l1)-1
while j >= 0 :
    l1[i]=l2[j]
    i=i+1
    j=j-1
    print("L1=",l1,"L2",l2,"i=",i,"j=",j)
What I am getting as output
L1= [4, 2, 3, 4] L2 [4, 2, 3, 4] i= 1 j= 2
L1= [4, 3, 3, 4] L2 [4, 3, 3, 4] i= 2 j= 1
L1= [4, 3, 3, 4] L2 [4, 3, 3, 4] i= 3 j= 0
L1= [4, 3, 3, 4] L2 [4, 3, 3, 4] i= 4 j= -1
Thank you.

This is unlikely to occur with small numbers because numbers are immutable in Python and the runtime is almost assuredly going to make x and y "point to" the same object in the object store:
>>> x = 100
>>> id(x)
4534457488
>>> y = 100
>>> id(y)
4534457488
Note that even by making y reference a new copy of 100, I still got the same object.
We can try:
>>> y = 25 * (5 - 1)
>>> id(y)
4534457488
Same.
If we used something mutable like lists, we could do:
>>> x = [1, 2, 3]
>>> id(x)
4539581000
>>> y = [1, 2] + [3]
>>> id(y)
4539578440
>>> x == y
True
>>> id(x) == id(y)
False
And now you have two variables, each referencing a different object in the object store, but the object values are equal.
As wRAR points out, copy.copy can do this much more cleanly:
>>> from copy import copy
>>> x = [1, 2, 3]
>>> id(x)
4539578440
>>> y = copy(x)
>>> id(y)
4539580232
Tada. This is more in line with what you want, but notice that it might not work for immutables like numbers!
>>> from copy import copy
>>> x = 100
>>> id(x)
4534457488
>>> y = copy(x)
>>> y
100
>>> id(y)
4534457488
My Python interpreter is just going to make one int object with value 100 in the object store.
For larger numbers, this may not be the case. I tried:
>>> x = 393
>>> id(x)
4537495408
>>> y = 393
>>> id(y)
4539235760
I found information in this S.O. question where an answerer experimented with integers on the object pool. The number of pre-cached integer objects is probably implementation dependent in Python, whereas the JVM does define caching behavior.

You cannot do it with int. Moreover:
>>> x = 100
>>> y = 100
>>> x is y
True
This is because of the pool of small integers: each of them has only a single representation in memory.
With mutable objects you can use the copy() operation:
>>> from copy import copy
>>> a = [1]
>>> b = copy(a)
>>> a is b
False

You can use copy.copy() for this.
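Applied to the loop from the question's edit, the only change needed is how l2 is created. A sketch assuming the intent was to reverse l1 while keeping l2 as the untouched original:

```python
from copy import copy

l1 = [1, 2, 3, 4]
l2 = copy(l1)          # separate list object with equal contents (l1[:] also works)

i = 0
j = len(l1) - 1
while j >= 0:
    l1[i] = l2[j]      # read from the copy, write to the original
    i = i + 1
    j = j - 1

print(l1)              # [4, 3, 2, 1]
print(l2)              # [1, 2, 3, 4]
print(l1 is l2)        # False
```

With l2 = l1, both names refer to one list, so every write through l1 is immediately visible through l2, which explains the output shown in the question.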

Replace elements in numpy array using list of old and new values

I want to replace elements in a numpy array using a list of old values and a list of new values. See below for a code example (replace_old is the requested method). The method must work for int, float and string elements. How do I do that?
import numpy as np
dat = np.hstack((np.arange(1,9), np.arange(1,4)))
print(dat) # [1 2 3 4 5 6 7 8 1 2 3]
old_val = [2, 5]
new_val = [11, 57]
new_dat = replace_old(dat, old_val, new_val)
print(new_dat) # [1 11 3 4 57 6 7 8 1 11 3]

You can use np.place :
>>> np.place(dat,np.in1d(dat,old_val),new_val)
>>> dat
array([ 1, 11, 3, 4, 57, 6, 7, 8, 1, 11, 3])
To create the mask array you can use np.in1d(ar1, ar2), which will give you:
a boolean array the same length as ar1 that is True where an element of ar1 is in ar2 and False otherwise
Edit: Note that the preceding recipe fills the masked positions with the values of new_val in order (repeating them as needed) rather than mapping each old value to its own new value, so, as @ajcr mentioned, it won't work for other arrays. As a general approach, for now I suggest the following loop (which I don't think is the best way):
>>> dat2 = np.array([1, 2, 1, 2])
>>> old_val = [1, 2]
>>> new_val = [33, 66]
>>> z=np.array((old_val,new_val)).T
>>> for i,j in z:
...     np.place(dat2, dat2 == i, j)
...
>>> dat2
array([33, 66, 33, 66])
In this case you create a new array (z) that contains the corresponding pairs from old_val and new_val, and then pass each pair to np.place to do the replacement.
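The loop can also be packaged as the replace_old function the question asks for. This is a sketch rather than a definitive implementation: it assumes the new values fit the array's dtype (for string arrays, longer replacements may be truncated), and it builds each mask from the original array so that one replacement cannot trigger another.

```python
import numpy as np


def replace_old(arr, old_vals, new_vals):
    """Return a copy of arr with each old value replaced by its paired new value."""
    result = arr.copy()
    for old, new in zip(old_vals, new_vals):
        # Mask is computed on the untouched input, not on the partial result
        result[arr == old] = new
    return result


dat = np.hstack((np.arange(1, 9), np.arange(1, 4)))
print(replace_old(dat, [2, 5], [11, 57]).tolist())
# [1, 11, 3, 4, 57, 6, 7, 8, 1, 11, 3]

dat2 = np.array([1, 2, 1, 2])
print(replace_old(dat2, [1, 2], [33, 66]).tolist())   # [33, 66, 33, 66]
print(replace_old(dat2, [1, 2], [2, 3]).tolist())     # [2, 3, 2, 3]
```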

Assigning the same value to different positions in a list in python

Assume that I have the following list in python:
x = [1,2,3,4,5,6,7,8,9,10]
I would like to assign the value 0 to specific positions on the list, for example positions 0, 7 and 9. Could I do something like the following in python without resorting to a loop?
x[0,7,9] = 0

There you go:
x[0] = x[7] = x[9] = 0
Also you can do this with numpy arrays in a more general and flexible fashion:
>>> import numpy as np
>>> x = np.array([1,2,3,4,5,6,7,8,9,10])
>>> indices = [0,7,9]
>>> x[indices] = 0 # or just x[[0,7,9]] = 0
>>> x
array([0, 2, 3, 4, 5, 6, 7, 0, 9, 0])
but this is probably not what you are looking for, as numpy is a slightly more advanced thing.
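For a plain list with an arbitrary set of positions, a comprehension over enumerate is one way to avoid writing the loop statement by hand (there is still iteration underneath). A minimal sketch; the name positions is only illustrative:

```python
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
positions = {0, 7, 9}          # indices to overwrite

# Keep each value unless its index is one of the chosen positions
x = [0 if i in positions else v for i, v in enumerate(x)]
print(x)                       # [0, 2, 3, 4, 5, 6, 7, 0, 9, 0]
```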
