I have the following list:
[[50.954818803035948, 55.49664787231189, 8007927.0, 0.0],
[50.630482185654436, 55.133473852776916, 8547795.0, 0.0],
[51.32738085400576, 55.118344981379266, 6600841.0, 0.0],
[49.425931642638567, 55.312890225131163, 7400096.0, 0.0],
[48.593467836476407, 55.073137270550006, 6001334.0, 0.0]]
I want to print the third element from every list. The desired result is:
8007927.0
8547795.0
6600841.0
7400096.0
6001334.0
I tried:
print data[:][2]
but it is not outputting the desired result.
There are many ways to do this. Here is a simple one using a list comprehension, without an explicit for loop.
tt = [[50.954818803035948, 55.49664787231189, 8007927.0, 0.0], [50.630482185654436, 55.133473852776916, 8547795.0, 0.0], [51.32738085400576, 55.118344981379266, 6600841.0, 0.0], [49.425931642638567, 55.312890225131163, 7400096.0, 0.0], [48.593467836476407, 55.073137270550006, 6001334.0, 0.0]]
print [x[2] for x in tt]
> [8007927.0, 8547795.0, 6600841.0, 7400096.0, 6001334.0]
And making it safe for potentially shorter lists:
print [x[2] for x in tt if len(x) > 2]
More sophisticated output (Python 2.7), printing the values separated by newlines (\n):
print '\n'.join([str(x[2]) for x in tt])
> 8007927.0
> 8547795.0
> 6600841.0
> 7400096.0
> 6001334.0
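The snippets above use the Python 2 print statement. A minimal Python 3 sketch of the same idea, with the length guard applied to each row, might look like this (the helper name column and the shortened sample data are made up for illustration):

```python
def column(rows, index):
    """Return rows[i][index] for every row long enough to have that index."""
    return [row[index] for row in rows if len(row) > index]


# Shortened sample data; the second row is deliberately too short.
tt = [[1.0, 2.0, 3.0, 0.0], [4.0, 5.0], [6.0, 7.0, 8.0, 0.0]]

third = column(tt, 2)

# One value per line, as in the desired output of the question.
print("\n".join(str(value) for value in third))
```

The guard tests len(row), not len(rows), so a short row is skipped instead of raising an IndexError.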
Try this:
for item in data:
    if len(item) >= 3: # to prevent list out of bound exception.
        print(int(item[2]))
map and list comprehension approaches have already been given; I would like to offer two more. Say d is your list:
With zip:
zip(*d)[2]
With numpy:
>>> import numpy
>>> nd = numpy.array(d)
>>> print(nd[:,2])
[8007927. 8547795. 6600841. 7400096. 6001334.]
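A note on the zip variant: in Python 3, zip returns an iterator, so it cannot be indexed directly. A small sketch, assuming Python 3 and using made-up sample data:

```python
d = [[1.0, 2.0, 3.0, 0.0], [4.0, 5.0, 6.0, 0.0]]

# zip(*d) transposes the rows into columns; list() materialises the iterator
# so that it can be indexed.
columns = list(zip(*d))
third_column = columns[2]

print(third_column)
```

Each column comes back as a tuple, and zip stops at the shortest row, so ragged input silently loses trailing columns.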
You could also try the map function.
In python 3:
list(map(lambda l: l[2], z))
In python 2:
map(lambda l: l[2], z)
In order to print the nth element of every list from a list of lists, you need to first access each list, and then access the nth element in that list.
In practice, it would look something like this:
def print_nth_element(listset, n):
    for listitem in listset:
        print(int(listitem[n])) # Since you want them to be ints
Which could then be called in the form print_nth_element(data, 2) for your case.
The reason your data[:][2] does not yield the desired result is that data[:] returns a copy of the entire list of lists, so taking index 2 of that copy just gives the third element (the third sublist) of the original list. In other words, data[:][2] is practically equivalent to data[2].
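The difference can be checked directly. This is only an illustrative sketch with shortened sample values, not the asker's full data:

```python
data = [
    [50.95, 55.49, 8007927.0, 0.0],
    [50.63, 55.13, 8547795.0, 0.0],
    [51.32, 55.11, 6600841.0, 0.0],
]

# data[:] is a shallow copy of the outer list: same rows, new container.
whole_copy = data[:]

# Indexing that copy with [2] therefore selects the third ROW.
third_row = data[:][2]

# To get the third COLUMN, index inside each row instead.
third_column = [row[2] for row in data]

print(third_row)
print(third_column)
```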
I have an array of digits: array = [1.0, 1.0, 2.0, 4.0, 1.0]
I would like to create a function that extracts sequences of digits from the input array and appends to one of two lists depending on defined conditions being met
The first condition f specifies the number of places to look ahead from index i and check if a valid index exists. If true, append array[i] to list1. If false, append to list2.
I have implemented it as follows:
def somefunc(array, f):
    list1, list2 = [], []
    for i in range(len(array)):
        if i + f < len(array):
            list1.append(array[i])
        else:
            list2.append(array[i])
    return list1, list2
This functions correctly as follows:
somefunc(array,f=1) returns ([1.0, 1.0, 2.0, 4.0], [1.0])
somefunc(array,f=2) returns ([1.0, 1.0, 2.0], [4.0, 1.0])
somefunc(array,f=3) returns ([1.0, 1.0], [2.0, 4.0, 1.0])
However, I would like to add a second condition to this function, b, that specifies the window length for previous digits to be summed and then appended to the lists according to the f condition above.
The logic is this:
iterate through array and at each index i check if i+f is a valid index.
If true, append the sum of the previous b digits to list1
If false, append the sum of the previous b digits to list2
If the length of window b isn't possible (i.e. b=2 when i=0) continue to next index.
With both the f and b conditions implemented, I would expect:
somefunc(array,f=1, b=1) returns ([1.0, 1.0, 2.0, 4.0], [1.0])
somefunc(array,f=1, b=2) returns ([2.0, 3.0, 6.0], [5.0])
somefunc(array,f=2, b=2) returns ([2.0, 3.0], [6.0, 5.0])
My first challenge is implementing the b condition. I cannot seem to figure out how. (See the edit below.)
I also wonder if there is a more efficient approach than the iterative method I have begun?
Given only the f condition, I know that the following functions correctly and would bypass the need for iteration:
def somefunc(array, f):
    return array[:-f], array[-f:]
However, I again don't know how to implement the b condition in this approach.
Edit
I have managed an iterative solution which implements the f and b conditions:
def somefunc(array, f, b):
    list1, list2 = [], []
    for i in range(len(array)):
        if i >= (b-1):
            if i + f < len(array):
                list1.append(sum(array[i+1-b: i+1]))
            else:
                list2.append(sum(array[i+1-b: i+1]))
    return list1, list2
However, the indexing syntax feels horrible, so I am certain there must be a more elegant solution. Also, anything with improved runtime would be preferable.
I can see two minor improvements you could implement in your code:
def somefunc(array, f, b):
    list1, list2 = [], []
    size = len(array) # Will only measure the length of the array once
    for i in range(b-1, size): # By starting from b-1 you can remove an if statement
        if i + f < size: # We use the size here
            list1.append(sum(array[i+1-b: i+1]))
        else:
            list2.append(sum(array[i+1-b: i+1]))
    return list1, list2
Edit:
An even better solution would be to add the new digit and subtract the oldest one at each iteration. This way you don't need to redo the whole sum each iteration:
def somefunc(array, f, b):
    list1, list2 = [], []
    value = 0
    size = len(array)
    for i in range(b-1, size):
        if i == b-1:
            value = sum(array[i+1-b: i+1]) # First window: compute the full sum once
        else:
            value = value - array[i-b] + array[i] # Take the last value, remove the value at index i-b and add the value at index i
        if i + f < size:
            list1.append(value)
        else:
            list2.append(value)
    return list1, list2
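Another way to avoid re-summing each window is to use prefix sums and then split the result with a single slice. The following is a sketch rather than a drop-in replacement: the name window_sums is made up, it assumes b >= 1 and f >= 0, and for general floats the subtraction of prefix sums can differ from a direct sum by rounding error.

```python
from itertools import accumulate


def window_sums(array, f, b):
    """Sum each window of b consecutive items, then split on the f condition."""
    # prefix[k] holds sum(array[:k]), so a window ending at index i
    # sums to prefix[i + 1] - prefix[i + 1 - b].
    prefix = [0.0] + list(accumulate(array))
    sums = [prefix[i + 1] - prefix[i + 1 - b] for i in range(b - 1, len(array))]

    # Windows ending at i with i + f < len(array) belong in the first list.
    # There are len(array) - f such indices, minus the b - 1 skipped at the start.
    split = max(len(array) - f - (b - 1), 0)
    return sums[:split], sums[split:]


array = [1.0, 1.0, 2.0, 4.0, 1.0]
print(window_sums(array, f=1, b=2))
print(window_sums(array, f=2, b=2))
```

The two calls reproduce the expected results stated in the question for (f=1, b=2) and (f=2, b=2).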
I am attempting to index a list to pull the first (0) and second (1) items for further calculations. My code currently looks like this:
def calculate_scores(list):
    sat = list[0]
    gpa = list[1]
    weighted_sat = (sat / 160)
    weighted_gpa = (gpa * 2)
This is in the function that I want to use to do the calculations. The part where this function is called in my main looks like this:
testscores = []
semestergrades = []
testscores.append(floatlist[0:4])
semestergrades.append(floatlist[4:])
calculate_scores(testscores)
The list that the testscores list is pulling from is 8 items long, all of them floats - however, when I try to run this code it gives me a 'list index out of range' error for the part where I try to set the variable 'gpa' equal to list[1]. However, it seems to be able to run the first part, setting the variable 'sat' equal to list[0]. Any idea why this is happening?
You're using .append() when you should either be using .extend() or just using the result of the slice:
# floatlist = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5]
testscores = []
testscores.append(floatlist[0:4])
# testscores = [[0.5, 1.5, 2.5, 3.5]]
So, how you're currently doing it, testscores is a list with one element, that element being floatlist[0:4]. When you try to use the second element (index 1), you get an IndexError.
You can use .extend() instead of .append() to add all the items in the given iterable to the list. Or, you could just do
testscores = floatlist[0:4]
since list slicing produces a copy of the original anyway.
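To make the difference concrete, here is a short sketch comparing the three options. The contents of floatlist are made up; only its length of 8 comes from the question.

```python
floatlist = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5]

# append adds the slice as ONE element, producing a nested list.
appended = []
appended.append(floatlist[0:4])

# extend adds each item of the slice individually.
extended = []
extended.extend(floatlist[0:4])

# Plain slicing already returns a new, independent list.
sliced = floatlist[0:4]

print(appended)
print(extended)
print(sliced)
```

With the appended version, index 0 is the whole inner list and index 1 does not exist, which matches the error described in the question.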
In the following code I want to find the unique values in the list, which can be done in a for loop. Once I know the unique values, I want to count how many times each one appears in a. Can someone please guide me on how to do that? The list contains floating-point numbers. Would it help to convert it to a NumPy array and then look for equal values?
import numpy as np
a = [1.0, 1.0, 1.0, 1.0, 1.5, 1.5, 1.5, 3.0, 3.0]
list = []
for i in a:
    if i not in list:
        list.append(i)
print(list)
for j in range(len(list)):
    g = np.argwhere(a==list[j])
    print(g)
You can use np.unique to get it done:
np.unique(np.array(a),return_counts=True)
You can also do it using Counter from collections:
from collections import Counter
Var=dict(Counter(a))
print(Var)
The more basic way is to count each unique value yourself, here with a list comprehension:
[[x,a.count(x)] for x in set(a)]
If you are not familiar with list comprehensions, this is the equivalent explicit loop:
ls=[]
for x in set(a):
    ls.append([x,a.count(x)])
print(ls)
If you want it using if/else:
counter = dict()
for k in a:
    if k not in counter:
        counter[k] = 1
    else:
        counter[k] += 1
print(counter)
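Putting the Counter approach together for the list in the question, a minimal sketch might look like this. Note that it relies on exact float equality, so values that differ only by rounding error are counted separately.

```python
from collections import Counter

a = [1.0, 1.0, 1.0, 1.0, 1.5, 1.5, 1.5, 3.0, 3.0]

counts = Counter(a)

# Number of distinct values in the list.
n_unique = len(counts)
print(n_unique)

# Each distinct value with the number of times it occurs, in ascending order.
for value, n in sorted(counts.items()):
    print(value, n)
```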
I have a list of around 131000 arrays, each of length 300. I am using Python.
I want to check which of the arrays are repeated in this list. I am trying to do this by comparing each array with all the others, like this:
import numpy as np
wordEmbeddings = [[0.8,0.4....upto 300 elements]....upto 131000 arrays]
count = 0
for i in range(0,len(wordEmbeddings)):
    for j in range(0,len(wordEmbeddings)):
        if i != j:
            if np.array_equal(wordEmbeddings[i],wordEmbeddings[j]):
                count += 1
This runs very slowly and might take hours to finish. How can I do this efficiently?
You can use collections.Counter to count the frequency of each sub list
>>> from collections import Counter
>>> Counter(list(map(tuple, wordEmbeddings)))
We need to cast the sublist to tuples since list is unhashable i.e. it cannot be used as a key in dict.
This will give you a result like this:
Counter({(...4, 5, 6...): 1, (...1, 2, 3...): 1})
Each key of the Counter object is a sublist converted to a tuple, and its value is the number of times that sublist occurs. Next you can filter the resulting Counter object to yield only the elements whose count is > 1:
>>> items = Counter(list(map(tuple, wordEmbeddings)))
>>> list(filter(lambda x: items[x] > 1,items))
Timeit results:
$ python -m timeit -s "a = [range(300) for _ in range(131000)]" -s "from collections import Counter" "Counter(list(map(tuple, a)))"
10 loops, best of 3: 1.18 sec per loop
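If you also need to know where the repeats are, the same hashing idea can record positions instead of bare counts. This is a sketch under the assumption that the rows are plain sequences of numbers; find_duplicates and the tiny sample data are made up for illustration.

```python
from collections import defaultdict


def find_duplicates(rows):
    """Map each repeated row (as a tuple) to the list of indices where it occurs."""
    positions = defaultdict(list)
    for index, row in enumerate(rows):
        # Tuples are hashable, so they can serve as dictionary keys.
        positions[tuple(row)].append(index)
    return {key: idxs for key, idxs in positions.items() if len(idxs) > 1}


word_embeddings = [[0.8, 0.4], [0.2, 0.3], [0.8, 0.4], [1.0, 3.0]]
duplicates = find_duplicates(word_embeddings)
print(duplicates)
```

This makes a single pass over the rows, so the work grows linearly with the number of arrays instead of quadratically.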
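You can remove duplicate comparisons by starting the inner loop just after i, so each pair is compared only once and the i != j check is no longer needed:
for i in range(0,len(wordEmbeddings)):
    for j in range(i+1,len(wordEmbeddings)):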
You could look in to pypy for general purpose speed ups.
It might also be worth looking into hashing the arrays somehow.
Here's a question on speeding up np array comparison. Does the order of the elements matter to you?
You can use set and tuple to find duplicated arrays inside another array. Create a new list containing tuples (we use tuples because lists are an unhashable type), and then filter the new list using a set.
tuples = list(map(tuple, wordEmbeddings))
duplications = set([t for t in tuples if tuples.count(t) > 1])
print(duplications)
Maybe you can reduce the initial list to unique hashes, or non-unique sums,
and go over the hashes first, which may be a faster way to compare elements.
I suggest you first sort the list (might also be helpful for further processing) and then compare. The advantage is that you only need to compare every array element to the previous one:
import numpy as np
from functools import cmp_to_key
wordEmbeddings = [[0.8, 0.4, 0.3, 0.2], [0.2,0.3,0.7], [0.8, 0.4, 0.3, 0.2], [ 1.0, 3.0, 4.0, 5.0]]
def smaller (x,y):
    for i in range(min(len(x), len(y))):
        if x[i] < y[i]:
            return 1
        elif y[i] < x[i]:
            return -1
    if len(x) > len(y):
        return 1
    elif len(x) < len(y):
        return -1
    return 0 # identical arrays compare as equal
wordEmbeddings = sorted(wordEmbeddings, key=cmp_to_key(smaller))
print(wordEmbeddings)
# output: [[1.0, 3.0, 4.0, 5.0], [0.8, 0.4, 0.3, 0.2], [0.8, 0.4, 0.3, 0.2], [0.2, 0.3, 0.7]]
count = 0
for i in range(1, len(wordEmbeddings)):
    if (np.array_equal(wordEmbeddings[i], wordEmbeddings[i-1])):
        count += 1
print(count)
# output: 1
If N is the length of wordEmbeddings and n is the length of each inner array, then your approach does O(N*N*n) comparisons. When reducing the comparisons as in con--'s answer, you still have O(N*N*n/2) comparisons.
Sorting takes O(N*log(N)*n) time and the subsequent counting step takes only O(N*n) time, which all in all is faster than O(N*N*n/2).
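If the inner arrays are plain Python lists, the custom comparator is optional, because lists already compare lexicographically. A sketch of the same sort-then-compare-neighbours idea under that assumption (it sorts ascending, unlike the comparator above, which does not affect the count):

```python
word_embeddings = [
    [0.8, 0.4, 0.3, 0.2],
    [0.2, 0.3, 0.7],
    [0.8, 0.4, 0.3, 0.2],
    [1.0, 3.0, 4.0, 5.0],
]

# Built-in list ordering is lexicographic, so equal rows end up adjacent.
ordered = sorted(word_embeddings)

# Count neighbouring pairs that are identical.
count = sum(1 for prev, cur in zip(ordered, ordered[1:]) if prev == cur)
print(count)
```

This does not work directly on NumPy arrays, whose comparison operators return arrays; those would need converting first, for example with tolist().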
def main():
    my_list = [[float(i) for i in line.split(',')] for line in open("Alpha.txt")]
    print(my_list)
    for elem in my_list:
        listA=[]
        listA = elem
        print(listA)
main()
This code prints the data I am looking for; however, I need to assign each row printed in the for loop to its own object. Any help as to how I would go about doing that?
[1.2, 4.3, 7.0, 0.0]
[3.0, 5.0, 8.2, 9.0]
[4.0, 3.0, 8.0, 5.6]
[8.0, 4.0, 3.0, 7.4]
What you're thinking of/trying to do is to dynamically name variables.
Don't.
Either leave your data in the list and access it via index
my_list[0] #what you were trying to assign to 'a'
my_list[0][0] #the first element in that sub-list
Or, if you have meaningful identifiers that you want to assign to each, you can use a dict to assign "keys" to "values".
d = {}
for sublist, meaningful_identifier in zip(my_list, my_meaningful_identifiers):
    d[meaningful_identifier] = sublist
Either way, leverage python data structures to do what they were supposed to do.
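As a self-contained sketch of the dict approach: the file is simulated with io.StringIO so the example runs without Alpha.txt, and the key names are hypothetical placeholders for whatever identifiers are meaningful in your case.

```python
import io

# Stand-in for open("Alpha.txt"); the two rows are taken from the question.
fake_file = io.StringIO("1.2,4.3,7.0,0.0\n3.0,5.0,8.2,9.0\n")

my_list = [[float(i) for i in line.split(",")] for line in fake_file]

# Hypothetical identifiers, one per row.
names = ["first_row", "second_row"]

# Pair each name with its row instead of creating one variable per row.
rows_by_name = dict(zip(names, my_list))

print(rows_by_name["second_row"])
```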
This is not a good idea, let me warn you, and you should never use this in production code (it is prone to code injection), and screws up your global namespace, but it does what you asked.
You would use exec() for this, which is a function that dynamically executes statements.
def main():
    my_list = [[float(i) for i in line.split(',')] for line in open("Alpha.txt", "r")]
    print(my_list)
    for elem in my_list:
        exec("%s = %s" % ("abcdefghijklmnopqrstuvwxyz"[my_list.index(elem)], elem), globals())
main()
Now, your global namespace is filled with variables a, b, c, etc. corresponding to the elements.
It is also prone to exceptions, if you have more than 26 elements, you will get an IndexError, although you could work around that.
Try:
myList = [list(map(float, line.split(','))) for line in open("Alpha.txt")]
Now you can get each line in a different variable if you want:
a = myList[0]
b = myList[1]
and so on. But since you have a list, it's better to use it and access elements using indices. Are you sure you have a correct understanding of arrays?
As the other answers point out, it is dangerous and doesn't make sense to dynamically create variables.