I have a data structure
my_list = [ [a, b], [c,d], [e,f], [g,h], [i, j], [k, l] ....]
where the letters are floats.
I need to find the ratios of c and e to a (c/a, e/a),
then the ratios of d and f to b (d/b, f/b),
and continue this way for all 12 elements in the list, so 8 ratios are calculated in total.
Is there a function that can do this efficiently since we are going between list elements? Without having to extract the data in the arrays individually first and then do the math.
ex_array = [[5.0, 2.5], [10.0, 5.0], [20.0, 13.0]]  # nested list of [x, y] pairs; an ndarray would make the division easier
for i in range(len(ex_array)):
    print("\n" + str(ex_array[i][0]) + " ratios for x values:\n")
    for j in range(len(ex_array)):
        # ratio of each nested index-0 value against the others
        print(str(ex_array[i][0] / ex_array[j][0]) + "\t|{} / {}".format(ex_array[i][0], ex_array[j][0]))
for i in range(len(ex_array)):
    print("\n" + str(ex_array[i][1]) + " ratios for y values:\n")
    for j in range(len(ex_array)):
        # ratio of each nested index-1 value against the others
        print(str(ex_array[i][1] / ex_array[j][1]) + "\t|{} / {}".format(ex_array[i][1], ex_array[j][1]))
output formatted as such:
The required operations must be specified anyway.
def get(l):
    return [l[i+k+1][j]/float(l[i][j]) for i in range(0, len(l)-2, 3) for j in range(2) for k in range(2)]
print(get([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]]))
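The same grouping can be spelled out with explicit loops, which may be easier to adapt. This is only a sketch: it assumes the list is meant to be read in groups of three pairs, with the first pair of each group as the divisor, and the names group_ratios and group_size are made up for illustration.

```python
def group_ratios(pairs, group_size=3):
    """Divide the later pairs of each group by the group's first pair, column by column."""
    ratios = []
    # Walk over the start index of every complete group.
    for start in range(0, len(pairs) - group_size + 1, group_size):
        base = pairs[start]
        for col in range(len(base)):
            for row in pairs[start + 1:start + group_size]:
                ratios.append(row[col] / float(base[col]))
    return ratios


sample = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]]
print(group_ratios(sample))
```

For the 6-pair sample this yields 8 ratios, in the same order as the one-line comprehension.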
Using list comprehension,
# sample list
a = [ [1.0, 2.0], [4.0, 8.0], [3.0, 9.0] ]
print('List:', a, '\n\nMatching:')
# divide every element by each of the other elements (equal values are skipped)
xs = [ x[0] / x1[0] for x1 in a for x in a if x[0] != x1[0] ]
ys = [ y[1] / y1[1] for y1 in a for y in a if y[1] != y1[1] ]
print("Quo of x's:", xs)
print("Quo of y's:", ys)
Outputs to
List: [[1.0, 2.0], [4.0, 8.0], [3.0, 9.0]]
Matching:
Quo of x's: [4.0, 3.0, 0.25, 0.75, 0.3333333333333333, 1.3333333333333333]
Quo of y's: [4.0, 4.5, 0.25, 1.125, 0.2222222222222222, 0.8888888888888888]
Or you can have some fun with difflib (builtin to stdlib since Python 2.1):
In [1]: from difflib import SequenceMatcher
In [2]: SequenceMatcher(None, (5,6), (6,)).ratio()
Out[2]: 0.6666666666666666
I have a simple dictionary (associative array) of map objects. I want to loop and print arr['b'] 5 times.
number = 0
arr = {}
arr['a'] = map(float, [1, 2, 3])
arr['b'] = map(float, [4, 5, 6])
arr['c'] = map(float, [7, 8, 9])
arr['d'] = map(float, [10, 11, 12])
while number < 5:
    print(list(arr['b']))
    number = number + 1
Why is the output as such, instead of [4.0, 5.0, 6.0] repeating 5 times? How can I loop to get arr['b'] result 5 times?
Output:
[4.0, 5.0, 6.0]
[]
[]
[]
[]
This is the output I really want.
Intended Output:
[4.0, 5.0, 6.0]
[4.0, 5.0, 6.0]
[4.0, 5.0, 6.0]
[4.0, 5.0, 6.0]
[4.0, 5.0, 6.0]
In Python 3, map returns a lazy iterator (a map object) that is consumed the first time you iterate over it. Therefore, the first time you convert it to a list, it gives you the expected results, but the second time the resulting list is empty. Simple example:
a = map(float, [1, 2, 3])
print(list(a))
# out: [1.0, 2.0, 3.0]
print(list(a))
# out: []
Convert the map object to a list once (outside the loop!) and you can print it as often as you need: arr['a'] = list(map(float, [1, 2, 3])) etc.
Another improvement: in Python you don't need a manual counter for a loop like this. To do something 5 times, use range instead (the _ by convention denotes a value we are not interested in):
for _ in range(5):
    print(list(arr['b']))
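Putting both points together, a minimal sketch (Python 3 assumed; the names lazy, first_pass, second_pass and printed exist only for this illustration):

```python
# map() returns a one-shot iterator in Python 3.
lazy = map(float, [4, 5, 6])
first_pass = list(lazy)    # consumes the iterator
second_pass = list(lazy)   # nothing left to yield

# Materialise the values once, then reuse the list as often as needed.
arr = {'b': list(map(float, [4, 5, 6]))}
printed = []
for _ in range(5):
    printed.append(arr['b'])
    print(arr['b'])
```

Because arr['b'] is now a real list, every pass through the loop sees the same three floats.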
I have a numpy array of numpy arrays like the following example:
data = [[0.4, 1.5, 2.6],
[3.4, 0.2, 0.0],
[null, 3.2, 1.0],
[1.0, 4.6, null]]
I would like an efficient way of returning the row index, column index and value if the value meets a condition.
I need the row and column values because I feed them into func_which_returns_lat_long_based_on_row_and_column(column, row) which is applied if the value meets a condition.
Finally I would like to append the value, and outputs of the function to my_list.
I have solved my problem with the nested for loop solution shown below but it is slow. I believe I should be using np.where() however I cannot figure that out.
my_list = []
for ii, array in enumerate(data):
    for jj, value in enumerate(array):
        if value > 1:
            lon, lat = func_which_returns_lat_long_based_on_row_and_column(jj, ii)
            my_list.append([value, lon, lat])
I'm hoping there is a more efficient solution than the one I'm using above.
import numpy as np
import warnings
warnings.filterwarnings('ignore')
data = [[0.4, 1.5, 2.6],
[3.4, 0.2, 0.0],
[np.nan, 3.2, 1.0],
[1.0, 4.6, np.nan]]
x = np.array(data)
i, j = np.where(x > 1 )
for a, b in zip(i, j):
    print('lon: {} lat: {} value: {}'.format(a, b, x[a,b]))
Output is
lon: 0 lat: 1 value: 1.5
lon: 0 lat: 2 value: 2.6
lon: 1 lat: 0 value: 3.4
lon: 2 lat: 1 value: 3.2
lon: 3 lat: 1 value: 4.6
Because the comparison involves np.nan, NumPy may emit a RuntimeWarning, which is why warnings are filtered above.
You can use
result = np.where(arr == 15)
It returns a tuple of index arrays, one per dimension, giving the positions where the condition holds in arr.
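Applied to the question's data, the index arrays from np.where can be used to pull out the matching values and to feed the lookup function. The sketch below assumes NumPy is installed; row_col_to_lon_lat is a hypothetical stand-in for func_which_returns_lat_long_based_on_row_and_column, whose real behaviour is not shown in the question.

```python
import numpy as np


def row_col_to_lon_lat(col, row):
    # Hypothetical stand-in for the asker's lookup function.
    return col * 0.5, row * 0.25


data = np.array([[0.4, 1.5, 2.6],
                 [3.4, 0.2, 0.0],
                 [np.nan, 3.2, 1.0],
                 [1.0, 4.6, np.nan]])

# NaN > 1 evaluates to False, so missing cells are skipped automatically.
rows, cols = np.where(data > 1)
values = data[rows, cols]

my_list = []
for value, row, col in zip(values.tolist(), rows.tolist(), cols.tolist()):
    lon, lat = row_col_to_lon_lat(col, row)
    my_list.append([value, lon, lat])

print(my_list)
```

The loop now runs only over the matching cells. If the lookup function can accept whole arrays, the remaining loop could be dropped as well.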
Try to build a function that works on whole arrays. For instance, a function that adds the corresponding row and column index to every element of the data could look like:
import numpy as np
def func_which_returns_lat_long_based_on_row_and_column(data, indices):
    # returns each element of data plus its row and column index
    return data + indices[:, :, 0] + indices[:, :, 1]
data = np.array([[0.4, 1.5, 2.6],
                 [3.4, 0.2, 0.0],
                 [np.nan, 3.2, 1.0],
                 [1.0, 4.6, np.nan]])
# create a matrix of the same shape as data (plus an additional dim because they are two indices)
# with the corresponding indices of the element in it
x_range = np.arange(0,data.shape[0])
y_range = np.arange(0,data.shape[1])
grid = np.meshgrid(x_range,y_range, indexing = 'ij')
indice_matrix = np.concatenate((grid[0][:,:,None],grid[1][:,:,None]),axis=2)
# for instance:
# indice_matrix[0,0] = np.array([0,0])
# indice_matrix[1,0] = np.array([1,0])
# indice_matrix[3,1] = np.array([3,1])
# calculate the output
out = func_which_returns_lat_long_based_on_row_and_column(data,indice_matrix)
data.shape
>> (4, 3)
indice_matrix.shape
>> (4, 3, 2)
indice_matrix
>> array([[[0, 0],
           [0, 1],
           [0, 2]],
          [[1, 0],
           [1, 1],
           [1, 2]],
          [[2, 0],
           [2, 1],
           [2, 2]],
          [[3, 0],
           [3, 1],
           [3, 2]]])
indice_matrix[2,1]
>> array([2, 1])
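As a side note, the same index matrix can probably be built more compactly. The sketch below (NumPy assumed; the names via_meshgrid and via_indices are illustrative) compares the meshgrid construction with np.indices, whose leading axis is moved to the end so that each cell holds its own [row, column] pair.

```python
import numpy as np

shape = (4, 3)

# Construction used above: meshgrid with 'ij' indexing, stacked on a new last axis.
grid = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]), indexing='ij')
via_meshgrid = np.stack(grid, axis=-1)

# np.indices returns shape (2, 4, 3); move the first axis to the end to get (4, 3, 2).
via_indices = np.moveaxis(np.indices(shape), 0, -1)

print(via_indices.shape)
print(via_indices[2, 1])
```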
I have a nested list that contains both None elements and integers. It looks pretty much like this:
aList = [[None, 8.0, 1.0], [2.0, 3.0], [9.0], [5.0, None, 4.0]]
None elements don't follow any particular pattern and therefore can be found at any position inside the list. I'd like to obtain two things:
The minimum value (minimum) among all integers.
The indexes that define completely the position of this minimum value. In other words, those two numbers ( i, j ) that satisfy:
aList[i][j] = minimum
You can use this:
import sys
aList = [[None, 8.0, 1.0], [2.0, 3.0], [9.0], [5.0, None, 4.0]]
minimum = sys.maxsize
i_min, j_min = 0, 0
for i, a in enumerate(aList):
    for j, b in enumerate(a):
        # compare against None explicitly so that a value of 0 is not skipped
        if b is not None and b < minimum:
            i_min, j_min, minimum = i, j, b
print(minimum, i_min, j_min)
# 1.0 0 2
print(aList[i_min][j_min] == minimum)
# True
This is a possible solution:
import sys
aList = [[None, 8.0, 1.0], [2.0, 3.0], [9.0], [5.0, None, 4.0]]
minimum = sys.maxsize
for j, ele in enumerate(aList):
    cur_min = min(float(i) for i in ele if i is not None)
    if cur_min < minimum:
        minimum = cur_min
        pos_index = ele.index(minimum)
        list_index = j
print(aList[list_index][pos_index])
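A more compact variant, offered as a sketch rather than a replacement for the answers above: generate (value, i, j) tuples for every non-None entry and let min pick the smallest one. The names a_list and candidates are illustrative.

```python
a_list = [[None, 8.0, 1.0], [2.0, 3.0], [9.0], [5.0, None, 4.0]]

# Tuples compare element by element, so min() orders by value first,
# then by row index and column index when values tie.
candidates = (
    (value, i, j)
    for i, row in enumerate(a_list)
    for j, value in enumerate(row)
    if value is not None
)
minimum, i_min, j_min = min(candidates)
print(minimum, i_min, j_min)
```

Note that min raises ValueError if the nested list contains no numbers at all, so guard for that case if it can occur.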
I have two pandas dataframes A,B with identical shape, index and column. Each element of A is a np.ndarray with shape (n,1), and each element of B is a float value. Now I want to efficiently append B elementwise to A. A minimal example:
index = ['fst', 'scd']
column = ['a','b']
A
Out[23]:
          a       b
fst  [1, 2]  [1, 4]
scd  [3, 4]  [3, 2]
B
Out[24]:
            a         b
fst  0.392414  0.641136
scd  0.264117  1.644251
resulting_df = pd.DataFrame([[np.append(A.loc[i,j], B.loc[i,j]) for i in index] for j in column], columns=column, index=index)
resulting_df
Out[27]:
                              a                           b
fst  [1.0, 2.0, 0.392414377685]  [3.0, 4.0, 0.264117463613]
scd  [1.0, 4.0, 0.641136433253]   [3.0, 2.0, 1.64425062851]
Is there something similar to pd.DataFrame.applymap that can operate elementwise between two instead of just one pandas dataframe?
You can wrap each element of B in a list using applymap and then combine the two frames with ordinary addition, i.e.
import numpy as np
import pandas as pd
index = ['fst', 'scd']
column = ['a','b']
A = pd.DataFrame([[[1, 2],[1, 4]],[[3, 4],[3, 2]]],index,column)
B = pd.DataFrame([[0.392414, 0.264117],[0.641136, 1.644251]],index,column)
Option 1 :
n = B.applymap(lambda y: [y])
ndf = A.apply(lambda x : x+n[x.name])
Option 2 :
Using pd.concat and groupby, i.e.
pd.concat([A,B]).groupby(level=0).apply(lambda g: pd.Series({i: np.hstack(g[i].values) for i in A.columns}))
To make your current method give the correct output, swap the loops, i.e.
pd.DataFrame([[np.append(A.loc[i,j], B.loc[i,j]) for j in A.columns] for i in A.index], columns=A.columns, index=A.index)
Output:
                        a                     b
fst  [1.0, 2.0, 0.392414]  [1.0, 4.0, 0.264117]
scd  [3.0, 4.0, 0.641136]  [3.0, 2.0, 1.644251]
You can simply do this:
>>> A + B.applymap(lambda x : [x])
                    a                 b
fst  [1, 2, 0.392414]  [1, 4, 0.264117]
scd  [3, 4, 0.641136]  [3, 2, 1.644251]
I have a list of values: [0,2,3,5,6,7,9] and want to get a list of the numbers in the middle in between each number: [1, 2.5, 4, 5.5, 6.5, 8]. Is there a neat way in python to do that?
It's a simple list comprehension (note I'm assuming you want all your values as floats rather than a mixture of ints and floats):
>>> lst = [0,2,3,5,6,7,9]
>>> [(a + b) / 2.0 for a,b in zip(lst, lst[1:])]
[1.0, 2.5, 4.0, 5.5, 6.5, 8.0]
(Dividing by 2.0 ensures that integer (floor) division is not applied in Python 2.)
Use an index-based loop in a list comprehension:
>>> a = [0,2,3,5,6,7,9]
>>> [(a[x] + a[x + 1])/2 for x in range(len(a)-1)]
[1.0, 2.5, 4.0, 5.5, 6.5, 8.0]
However, using zip as @Chris_Rands said is better... (and more readable ¬¬)
Obligatory itertools solution:
>>> import itertools
>>> values = [0,2,3,5,6,7,9]
>>> [(a+b)/2.0 for a,b in itertools.izip(values, itertools.islice(values, 1, None))]
[1.0, 2.5, 4.0, 5.5, 6.5, 8.0]
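Note that itertools.izip only exists in Python 2. A rough Python 3 equivalent, as a sketch, uses the built-in zip (which is already lazy there) together with itertools.islice. The function name midpoints is illustrative, and values is assumed to be a re-iterable sequence such as a list, not a one-shot iterator.

```python
from itertools import islice


def midpoints(values):
    """Average each pair of neighbouring items in a sequence."""
    # islice(values, 1, None) walks the same sequence starting at the second item.
    return [(a + b) / 2 for a, b in zip(values, islice(values, 1, None))]


print(midpoints([0, 2, 3, 5, 6, 7, 9]))
```

In Python 3 the / operator always performs true division, so dividing by 2 is enough.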
values = [0,2,3,5,6,7,9]
middle_values = [(values[i] + values[i + 1]) / 2.0 for i in range(len(values) - 1)]
Dividing by 2.0 rather than 2 is unnecessary in Python 3, or if you use from __future__ import division to change the integer division behavior.
The zip or itertools.izip answers are more idiomatic.
Simple for loop:
nums = [0,2,3,5,6,7,9]
betweens = []
for i in range(1, len(nums)):
    if nums[i] - nums[i-1] > 1:
        betweens.extend(range(nums[i-1]+1, nums[i]))
    else:
        betweens.append((nums[i] + nums[i-1]) / 2)
Output is as desired, which doesn't need further conversion (in Python3.x):
[1, 2.5, 4, 5.5, 6.5, 8]
[(l[i]+l[i+1])/2 for i in range(len(l)-1)]