I have a nested list named value, and I need to convert every element in it to a string and join each row together.
This is how I currently do it:
value = [['2014-11-20 10:51:50', 7.36, 7.63, 0.4487, 12.37, 10.4, 39.85, 52.27, 0.41, 0.78, 6],
         ['2014-11-20 11:22:07', 7.41, 7.67, 0.4489, 12.44, 6.6, 40.39, 53.98, 0.41, 0.754, 6]]
for i, n in enumerate(value):
    for j, m in enumerate(value[i]):
        value[i][j] = str(value[i][j])
    ",".join(value[i])
As I am new to Python, I would like to know whether there is a better or faster way to do it. Or is there a built-in function that could do the job?
value = [ ",".join(map(str,i)) for i in value ]
map converts every element (floats included) to str, and join then concatenates them with commas.
If map is unfamiliar, here is the equivalent generator expression:
value = [ ",".join(str(x) for x in i) for i in value ]
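As a quick check of what the one-liner produces, here is a small sketch on a shortened copy of the sample data. The names rows and text are illustrative and not part of the original code, and joining the rows with newlines is only an assumption about what "join them together" should mean for the whole list.

```python
# Shortened copy of the sample data from the question.
value = [
    ['2014-11-20 10:51:50', 7.36, 7.63, 0.4487],
    ['2014-11-20 11:22:07', 7.41, 7.67, 0.4489],
]

# One comma-separated string per row; map(str, row) stringifies each element.
rows = [",".join(map(str, row)) for row in value]

# Assumption: the rows should end up in one block of text, one row per line.
text = "\n".join(rows)
print(text)
```

Building a new list (rows) instead of overwriting value keeps the original numbers available for later use.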
So, I'm sure similar questions have been asked before but I couldn't find quite what I need.
I have a program that outputs a 2D array like the one below:
arr = [[0.2, 3], [0.3, "End"], ...]
There may be more or less elements, but each is a 2-element array, where the first value is a float and the second can be a float or a string.
Both of those values may repeat. In each of those arrays, the second element takes on only a few possible values.
What I want to do is sum the first elements' value within the arrays that have the same value of the second element and output a similar array that does not have those duplicated values.
For example:
input = [[0.4, 1.5], [0.1, 1.5], [0.8, "End"], [0.05, "End"], [0.2, 3.5], [0.2, 3.5]]
output = [[0.5, 1.5], [0.4, 3.5], [0.85, "End"]]
I'd appreciate if the output array was sorted by this second element (floats ascending, strings at the end), although it's not necessary.
EDIT: Thanks for both answers; I've decided to use the one by Chris, because the code was more comprehensible to me, although groupby seems like a function designed to solve this very problem, so I'll try to read up on that, too.
UPDATE: The values of floats were always positive, by nature of the task at hand, so I used negative values to stop the usage of any strings - now I have a few if statements that check for those "encoded" negative values and replace them with strings again just before they're printed out, so sorting is now easier.
You could use a dictionary to accumulate the sum of the first value in the list keyed by the second item.
To put the string items at the end of the list, the sort key can map them to positive infinity, float('inf').
input_ = [[0.4, 1.5], [0.1, 1.5], [0.8, "End"], [0.05, "End"], [0.2, 3.5], [0.2, 3.5]]
d = dict()
for pair in input_:
    d[pair[1]] = d.get(pair[1], 0) + pair[0]
L = []
for k, v in d.items():
    L.append([v, k])
L.sort(key=lambda x: x[1] if type(x[1]) == float else float('inf'))
print(L)
This prints:
[[0.5, 1.5], [0.4, 3.5], [0.8500000000000001, 'End']]
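The same accumulate-then-sort idea can be sketched with collections.defaultdict, which removes the need for d.get. The names pairs, totals, label_order and result are illustrative, and the isinstance check is an assumption that integer labels should be sorted as numbers too (the type(...) == float test above would send them to the end).

```python
from collections import defaultdict

pairs = [[0.4, 1.5], [0.1, 1.5], [0.8, "End"], [0.05, "End"], [0.2, 3.5], [0.2, 3.5]]

# Accumulate the first element of each pair under its second element.
totals = defaultdict(float)
for amount, label in pairs:
    totals[label] += amount

def label_order(item):
    """Sort numeric labels by value and push every other label to the end."""
    label = item[1]
    return label if isinstance(label, (int, float)) else float('inf')

# Rebuild [sum, label] pairs and sort them by label.
result = sorted(([total, label] for label, total in totals.items()), key=label_order)
print(result)
```

Because sorted is stable, several different string labels would keep their first-seen order at the end of the list.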
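You can try itertools.groupby (here a is the input list; groupby only merges adjacent items, so equal keys must already sit next to each other, as they do in the example):
import itertools
out = [[sum(elt[0] for elt in val), key] for key, val in itertools.groupby(a, key=lambda elt: elt[1])]
>>> [[0.5, 1.5], [0.8500000000000001, 'End'], [0.4, 3.5]]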
Explanation:
Group the 2D list by the 2nd element of each sublist using itertools.groupby and its key parameter. The lambda key=lambda elt: elt[1] groups on the 2nd element:
for key, val in itertools.groupby(a, key=lambda elt: elt[1]):
    print(key, val)
# 1.5 <itertools._grouper object at 0x0000026AD1F6E160>
# End <itertools._grouper object at 0x0000026AD2104EF0>
# 3.5 <itertools._grouper object at 0x0000026AD1F6E160>
For each group, compute the sum using the built-in function sum:
for key, val in itertools.groupby(a, key=lambda elt: elt[1]):
    print(sum([elt[0] for elt in val]))
# 0.5
# 0.8500000000000001
# 0.4
Compute the desired output:
out = []
for key, val in itertools.groupby(a, key=lambda elt: elt[1]):
    out.append([sum([elt[0] for elt in val]), key])
print(out)
# [[0.5, 1.5], [0.8500000000000001, 'End'], [0.4, 3.5]]
You also asked about sorting on the 2nd value, but it mixes strings and numbers, which is a problem: Python 3 cannot order a number against a string (the comparison raises TypeError). Objects must be mutually comparable to be sorted.
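One possible way around the comparison problem is a sort key that never compares a number with a string directly. The sketch below sorts with a (is_string, label) tuple first, which also makes equal labels adjacent as groupby requires. The names data, sort_key, ordered and out are illustrative, and the input is deliberately shuffled to show that pre-sorting matters.

```python
import itertools

# Same pairs as in the question, but with equal labels no longer adjacent.
data = [[0.4, 1.5], [0.8, "End"], [0.1, 1.5], [0.2, 3.5], [0.05, "End"], [0.2, 3.5]]

def sort_key(pair):
    """Numbers first (ascending), strings afterwards (alphabetical)."""
    label = pair[1]
    return (isinstance(label, str), label)

# Sorting makes equal labels adjacent, which groupby needs.
ordered = sorted(data, key=sort_key)

out = [[sum(p[0] for p in group), label]
       for label, group in itertools.groupby(ordered, key=lambda p: p[1])]
print(out)
```

The tuple works because False sorts before True, so the second tuple element is only ever compared between two numbers or between two strings.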
I have these two variables:
instance = [0.45,6.54,19.0,3.34,2.34]
distance_tolerance = [5.00,10.00,20.00]
I would like to go through the values in instance and categorize them by which thresholds in distance_tolerance they fall below, saving each group in a variable.
For example, 0.45 < 5.00, so 0.45 goes into the variable for 5.00, and so on for every value.
Expected result:
data5 = [0.45,3.34,2.34]
data10 = [0.45,6.54,3.34,2.34]
data20 = [0.45,6.54,19.0,3.34,2.34]
I need to do this in a loop since the real data is large.
What is the best way to perform this task? Thanks.
You can simply iterate through instance and append the values that are lower than the distance_tolerance value you're currently on.
So for each element in distance_tolerance you could call a function itemsLowerThan(value), which returns a list of the elements in instance that are lower than the value you pass.
For example:
instance = [0.45,6.54,19.0,3.34,2.34]
distance_tolerance = [5.00,10.00,20.00]
def itemsLowerThan(value):
    arr = []
    for item in instance:
        if (item < value):
            arr.append(item)
    return arr
for tolerance in distance_tolerance:
    print(itemsLowerThan(tolerance))
Would give the output:
[0.45, 3.34, 2.34]
[0.45, 6.54, 3.34, 2.34]
[0.45, 6.54, 19.0, 3.34, 2.34]
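Since the question mentions that the real data is large, one option worth considering is to sort instance once and locate each threshold with the standard bisect module, instead of rescanning the whole list per tolerance. This is only a sketch: the names ordered and below are illustrative, and unlike the loop above it returns each group in sorted order rather than in the original order.

```python
from bisect import bisect_left

instance = [0.45, 6.54, 19.0, 3.34, 2.34]
distance_tolerance = [5.00, 10.00, 20.00]

# Sort once; every later lookup is a binary search.
ordered = sorted(instance)

# bisect_left returns the index of the first element >= tol,
# so the slice before it holds exactly the values strictly below tol.
below = {tol: ordered[:bisect_left(ordered, tol)] for tol in distance_tolerance}

for tol, items in below.items():
    print(tol, items)
```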
You can use a nested list comprehension.
[[i for i in instance if i < tolerance] for tolerance in distance_tolerance]
Which evaluates to:
[
    [0.45, 3.34, 2.34],
    [0.45, 6.54, 3.34, 2.34],
    [0.45, 6.54, 19.0, 3.34, 2.34],
]
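The question asks for separate variables such as data5 and data10. A dictionary keyed by those names is usually easier to handle than dynamically created variables, so here is a sketch of that idea. The name groups and the "data%d" key format are assumptions chosen to mirror the names in the question.

```python
instance = [0.45, 6.54, 19.0, 3.34, 2.34]
distance_tolerance = [5.00, 10.00, 20.00]

# One entry per tolerance, e.g. groups["data5"]; values keep their original order.
groups = {
    "data%d" % tolerance: [i for i in instance if i < tolerance]
    for tolerance in distance_tolerance
}

print(groups["data5"])
print(groups["data10"])
```

Note that "%d" truncates the tolerance to an integer, so this naming only makes sense while the tolerances are whole numbers.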
values = [[3.5689651969162908, 4.664618442892583, 3.338666695570425],
          [6.293153787450157, 1.1285723419142026, 10.923859694586376],
          [2.052506259736077, 3.5496423448584924, 9.995488620338277],
          [9.41858935127928, 10.034233496516803, 7.070345442417161]]
def flatten(values):
    new_values = []
    for i in range(len(values)):
        for v in range(len(values[0])):
            new_values.append(values[i][v])
    return new_values
v = flatten(values)
print("A 2D list contains:")
print("{}".format(values))
print("The flattened version of the list is:")
print("{}".format(v))
I am flattening the 2D list to 1D, but I can't format it. I know that v is a list, and I tried to print it with a for loop, but I still can't get the result I want. Is there any way to format the list? I want to print v with two decimal places, like this:
[3.57, 4.66, 3.34, 6.29, 1.13, 10.92, 2.05, 3.55, 10.00, 9.42, 10.03, 7.07]
I am using Eclipse and Python 3.0+.
You could use:
print(["{:.2f}".format(val) for val in v])
Note that you can flatten your list using itertools.chain:
import itertools
v = list(itertools.chain(*values))
I would use the built-in function round(), and while I was about it I would simplify your for loops:
def flatten(values):
    new_values = []
    for i in values:
        for v in i:
            new_values.append(round(v, 2))
    return new_values
How to flatten and round the list in one line:
[round(x, 2) for b in values for x in b]
It returns a flat list of floats rounded to two decimal places.
Once you have v, you can use a list comprehension like:
formattedList = ["%.2f" % member for member in v]
output was as follows:
['3.57', '4.66', '3.34', '6.29', '1.13', '10.92', '2.05', '3.55', '10.00', '9.42', '10.03', '7.07']
Hope that helps!
You can first flatten the list (as described here) and then use round to solve this (l is the 2D list, i.e. values in the question):
flat_list = [number for sublist in l for number in sublist]
# All numbers are in the same list now
print(flat_list)
[3.5689651969162908, 4.664618442892583, 3.338666695570425, 6.293153787450157, ..., 7.070345442417161]
rounded_list = [round(number, 2) for number in flat_list]
# The numbers are rounded to two decimals (but still floats)
print(rounded_list)
[3.57, 4.66, 3.34, 6.29, 1.13, 10.92, 2.05, 3.55, 10.0, 9.42, 10.03, 7.07]
This can be written shorter if we put the rounding directly into the list comprehension:
print([round(number, 2) for sublist in l for number in sublist])
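The answers above give either rounded floats (where 10.00 prints as 10.0) or a list of quoted strings. If the goal is the exact unquoted layout shown in the question, one possible approach is to build the output string by hand. This is a sketch; the names flat and formatted are illustrative.

```python
import itertools

values = [[3.5689651969162908, 4.664618442892583, 3.338666695570425],
          [6.293153787450157, 1.1285723419142026, 10.923859694586376],
          [2.052506259736077, 3.5496423448584924, 9.995488620338277],
          [9.41858935127928, 10.034233496516803, 7.070345442417161]]

# Flatten without assuming that every row has the same length.
flat = list(itertools.chain.from_iterable(values))

# Format each number with two decimals, then add the brackets manually
# so that no quotes appear around the formatted values.
formatted = "[" + ", ".join("{:.2f}".format(x) for x in flat) + "]"
print(formatted)
# [3.57, 4.66, 3.34, 6.29, 1.13, 10.92, 2.05, 3.55, 10.00, 9.42, 10.03, 7.07]
```

The result is a single string meant for display; keep flat (or a rounded copy of it) if the numbers are needed for further calculation.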
I have a collection of key value pairs like this:
{
    'key1': [value1_1, value2_1, value3_1, ...],
    'key2': [value1_2, value2_2, value3_2, ...],
    ...
}
and also a list which is in the same order as the values list, which contains the weight each variable should have applied. So it looks like [weight_1, weight_2, weight_3, ...].
My goal is to end up with an ordered list of keys in accordance to which has the highest overall score of values. Note that the values aren't all standardized / normalized, so value1_x could range from 1 - 10 but value 2_x could range from 1 - 100000. This has been the tricky part for me as I have to normalize the data somehow.
I'm trying to make this algorithm run to scale for many different values, so it would take the same amount of time for 1 or for 100 (or at least logarithmically more time). Is that possible? Is there any really efficient way I can go about this?
You can't get linear-time, but you can do it faster; this looks like a matrix-multiply to me, so I suggest you use numpy:
import numpy as np
keys = ['key1', 'key2', 'key3']
values = np.matrix([
    [1.1, 1.2, 1.3, 1.4],
    [2.1, 2.2, 2.3, 2.4],
    [3.1, 3.2, 3.3, 3.4]
])
weights = np.matrix([[10., 20., 30., 40.]]).transpose()
res = (values * weights).transpose().tolist()[0]
items = sorted(zip(res, keys), reverse=True)  # zip() returns an iterator in Python 3, so use sorted()
which gives
[(330.0, 'key3'), (230.0, 'key2'), (130.0, 'key1')]
Edit: with thanks to @Ondro for np.dot and to @unutbu for np.argsort, here is an improved version entirely in numpy:
import numpy as np
# set up values
keys = np.array(['key1', 'key2', 'key3'])
values = np.array([
    [1.1, 1.2, 1.3, 1.4],  # values1_x
    [2.1, 2.2, 2.3, 2.4],  # values2_x
    [3.1, 3.2, 3.3, 3.4]   # values3_x
])
weights = np.array([10., 20., 30., 40.])
# crunch the numbers
res = np.dot(values, -weights)  # negative of weights!
order = res.argsort(axis=0)     # sorting on negative value gives
                                # same order as reverse-sort; there does
                                # not seem to be any way to reverse-sort
                                # directly
sortedkeys = keys[order].tolist()
which results in ['key3', 'key2', 'key1'].
Here's a normalization function that linearly maps a value from the range [ilow, ihigh] to the range [olow, ohigh]; with olow=0 and ohigh=1 it transforms your values into [0,1]:
def normalize(val, ilow, ihigh, olow, ohigh):
    return ((val-ilow) * (ohigh-olow) / (ihigh - ilow)) + olow
Now, use normalize to compute a new dictionary with normalized values. Then, sort by the weighted sum:
def sort(d, weights, ranges):
    # ranges is a list of tuples containing the lower and upper bounds of the corresponding value
    newD = {k: [normalize(v, ilow, ihigh, 0, 1) for v, (ilow, ihigh) in zip(vals, ranges)]
            for k, vals in d.items()}  # d.iteritems() in Python 2
    # ascending by weighted sum; pass reverse=True to sorted() to get the highest score first
    return sorted(newD, key=lambda k: sum(v*w for v, w in zip(newD[k], weights)))
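To see normalization and weighting working together, here is a self-contained sketch in plain Python. It makes two assumptions that the answer above does not: the bounds of each value position are taken from the data itself (minimum and maximum per column), and keys are returned with the highest score first. The name rank_keys and the sample numbers are made up for illustration.

```python
def normalize(val, ilow, ihigh, olow=0.0, ohigh=1.0):
    """Linearly map val from [ilow, ihigh] to [olow, ohigh]."""
    return (val - ilow) * (ohigh - olow) / (ihigh - ilow) + olow

def rank_keys(d, weights):
    """Return the keys of d ordered by weighted sum of normalized values, best first."""
    # Assumption: each column's range is its observed min and max.
    # A column where all values are equal would divide by zero here.
    columns = list(zip(*d.values()))
    ranges = [(min(col), max(col)) for col in columns]
    scores = {
        key: sum(w * normalize(v, lo, hi)
                 for v, w, (lo, hi) in zip(vals, weights, ranges))
        for key, vals in d.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical data: the first value ranges over 1-10, the second over 1-100000.
d = {'key1': [1, 50000], 'key2': [10, 1], 'key3': [5, 100000]}
weights = [2.0, 1.0]
print(rank_keys(d, weights))
```

Every value has to be read at least once, so the running time grows linearly with the number of values, plus the cost of the final sort over the keys.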
This is an offshoot of a previous question which started to snowball. If I have a matrix A and I want to use the mean/average of each row [1:] values to create another matrix B, but keep the row headings intact, how would I do this? I've included matrix A, my attempt at cobbling together a list comprehension, and the expected result.
from operator import sum,len
# matrix A with row headings and values
A = [('Apple',0.95,0.99,0.89,0.87,0.93),
     ('Bear',0.33,0.25,0.85,0.44,0.33),
     ('Crab',0.55,0.55,0.10,0.43,0.22)]
#List Comprehension
B = [(A[0],sum,A[1:]/len,A[1:]) for A in A]
Expected outcome
B = [('Apple', 0.926), ('Bear', 0.44), ('Crab', 0.37)]
Your list comprehension looks a little weird. You are using the same variable for the iterable and the item.
This approach seems to work:
def average(lst):
    return sum(lst) / len(lst)
B = [(a[0], average(a[1:])) for a in A]
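On Python 3 the standard library already provides statistics.mean, so a hand-written average is optional. The sketch below uses it together with starred unpacking to separate the heading from the values. Rounding to three decimals is an assumption made to match the precision of the expected outcome in the question.

```python
from statistics import mean

# Matrix A from the question: a row heading followed by its values.
A = [('Apple', 0.95, 0.99, 0.89, 0.87, 0.93),
     ('Bear', 0.33, 0.25, 0.85, 0.44, 0.33),
     ('Crab', 0.55, 0.55, 0.10, 0.43, 0.22)]

# name takes the heading, scores collects the remaining values of the row.
B = [(name, round(mean(scores), 3)) for name, *scores in A]
print(B)
```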
I've created a function average for readability. It matches your expected values, so I think that's what you want, although your use of mul suggests that I may be missing something.
Taking from @recursive and @Steven Rumbalski:
>>> def average(lst):
...     return sum(lst) / len(lst)
...
>>> A = {
... 'Apple': (0.95, 0.99, 0.89, 0.87, 0.93),
... 'Bear': (0.33, 0.25, 0.85, 0.44, 0.33),
... 'Crab': (0.55, 0.55, 0.10, 0.43, 0.22),
... }
>>>
>>> B = [{key: average(values)} for key, values in A.iteritems()]
>>> B
[{'Apple': 0.92599999999999993}, {'Bear': 0.44000000000000006}, {'Crab': 0.37}]
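The session above is from Python 2, where dict.iteritems() exists. A Python 3 sketch of the same idea follows; it also collects the averages into a single dictionary instead of a list of one-entry dictionaries, which is a design assumption made here so that lookups and ranking are simpler. The name ranked is illustrative.

```python
def average(lst):
    return sum(lst) / len(lst)

A = {
    'Apple': (0.95, 0.99, 0.89, 0.87, 0.93),
    'Bear': (0.33, 0.25, 0.85, 0.44, 0.33),
    'Crab': (0.55, 0.55, 0.10, 0.43, 0.22),
}

# dict.items() replaces the Python 2 iteritems() call.
B = {key: average(values) for key, values in A.items()}

# Row headings ordered by average, highest first.
ranked = sorted(B, key=B.get, reverse=True)
print(B)
print(ranked)
```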