How to sum value of integers based on position? - python

The situation is as follows. I want to sum specific values based on their positions and eventually calculate their average. So far I have tried many different things and have come up with the following code, but I can't figure out how to match the different positions with their corresponding values.
count_pos = 0
for character in score:
    asci = ord(character)
    count_pos += 1
    print(count_pos,asci)
    if asci == 10 :
        count_pos = 0
print asci generates the following output:
1 35
2 52
3 61
4 68
5 70
6 70
1 35
2 49
3 61
4 68
5 68
6 70
The numbers 1-6 are the positions and the other integers are the values belonging to each position. What I am trying to do is sum the values per position: position 1 (35+35) should give me 70, position 2 (52+49) should give me 101, and so on for all positions.
The only thing so far I thought about was comparing the counter like this:
if count_pos == count_pos:
    #Do calculation
NOTE: This is just a part of the data. The real data continues like this with more than 1000 of these groups, not just the 2 displayed here.

Solution
This would work:
from collections import defaultdict
score = '#4=DFF\n#1=DDF\n'
res = defaultdict(int)
for entry in score.splitlines():
    for pos, char in enumerate(entry, 1):
        res[pos] += ord(char)
Now:
>>> res
defaultdict(int, {1: 70, 2: 101, 3: 122, 4: 136, 5: 138, 6: 140})
>>> res[1]
70
>>> res[2]
101
In Steps
Your score string looks like this (extracted from your asci numbers):
score = '#4=DFF\n#1=DDF\n'
Instead of looking for asci == 10, just split at new line characters with
the string method splitlines().
The defaultdict from the collections module gives you a dictionary that
you can initialize with a factory function. We use int here. The dictionary calls int() whenever we access a key that does not exist. So, if you do:
res[pos] += ord(char)
and the key pos does not exist yet, it will call int(), which gives a 0
that you can add your number to. The next time around, if pos is
already a key in your dictionary, you will get the stored value and add
to it, summing up the values for each position.
The enumerate here:
for pos, char in enumerate(entry, 1):
gives you the position in each row named pos, starting with 1.
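Since the question also asks for the average per position, the same loop can track how many values were seen at each position. This is a minimal sketch, assuming the score string shown above; position_averages is a hypothetical helper name, not part of the original answer:

```python
from collections import defaultdict


def position_averages(score):
    """Return {position: mean of ord(char)} over all lines of score."""
    sums = defaultdict(int)    # running total per position
    counts = defaultdict(int)  # number of values seen per position
    for entry in score.splitlines():
        for pos, char in enumerate(entry, 1):
            sums[pos] += ord(char)
            counts[pos] += 1
    # float() keeps true division on Python 2 as well
    return {pos: sums[pos] / float(counts[pos]) for pos in sums}


averages = position_averages('#4=DFF\n#1=DDF\n')
print(averages[1], averages[2])  # 35.0 50.5
```

Keeping a separate count per position means lines of different lengths are still averaged correctly.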

If the values to be added are in two lists, you can do this:
Using zip:
[x + y for x, y in zip(List1, List2)]
or
zipped_list = zip(List1,List2)
print([sum(item) for item in zipped_list])
Eg: If the lists were,
List1=[1, 2, 3]
List2=[4, 5, 6]
Output would be : [5, 7, 9]
Using Numpy:
import numpy as np
all = [list1,list2,list3 ...]
result = sum(map(np.array, all))
Eg:
>>> li=[1,3]
>>> li1=[1,3]
>>> li2=[1,3]
>>> li3=[1,3]
>>> import numpy as np
>>> all=[li,li1,li2,li3]
>>> mylist = sum(map(np.array, all))
>>> mylist
array([ 4, 12])
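The zip approach also extends to any number of lists without NumPy by unpacking a list of rows. A small sketch, assuming all rows have the same length (zip silently truncates to the shortest one); sum_by_position is a hypothetical name:

```python
def sum_by_position(rows):
    """Sum the values found at the same index across all rows."""
    # zip(*rows) regroups the data column by column
    return [sum(column) for column in zip(*rows)]


rows = [[1, 3], [1, 3], [1, 3], [1, 3]]
totals = sum_by_position(rows)
print(totals)  # [4, 12]
```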

Related

How to get inverse of integer?

I am not sure of inverse is the proper name, but I think it is.
This example will clarify what I need:
I have a max height, 5 for example, and so height can range from 0 to 4. In this case we're talking integers, so the options are: 0, 1, 2, 3, 4.
What I need, given an input ranging from 0 up to (and including) 4, is to get the inverse number.
Example:
input: 3
output: 1
visual:
0 1 2 3 4
4 3 2 1 0
I know I can do it like this:
position_list = list(range(5))
index_list = position_list[::-1]
index = index_list[3]
But this will probably use unnecessary memory and CPU time creating two lists. The lists are deleted after these lines of code and recreated every time the code is run (it lives inside a method). I'd rather find a way that doesn't need the lists at all.
What is an efficient way to achieve the same? (while still keeping the code readable for someone new to the code)
Isn't it just max - in...?
>>> MAX=4
>>> def calc(in_val):
...     out_val = MAX - in_val
...     print('%s -> %s' % ( in_val, out_val ))
...
>>> calc(3)
3 -> 1
>>> calc(1)
1 -> 3
You just need to subtract from the max:
def return_inverse(n, mx):
    return mx - n
For the proposed example:
position_list = list(range(5))
mx = max(position_list)
[return_inverse(i, mx) for i in position_list]
# [4, 3, 2, 1, 0]
You have a maximum height, let's call it max_h.
Your numbers are counted from 0, so they are in [0; max_h - 1].
You want the complement of the input number, i.e. the number that sums with it to max_h - 1.
It is max_h - 1 - your_number:
max_height = 5
for input_number in range(max_height):
    print('IN:', input_number, 'OUT:', max_height - input_number - 1)
IN: 0 OUT: 4
IN: 1 OUT: 3
IN: 2 OUT: 2
IN: 3 OUT: 1
IN: 4 OUT: 0
Simply compute the reverse index and then directly access the corresponding element.
n = 5
inp = 3
position_list = list(range(n))
position_list[n-1-inp]
# 1
You can just derive the index from the list's length and the desired position, to arrive at the "inverse":
position_list = list(range(5))
position = 3
inverse = position_list[len(position_list)-1-position]
And:
for i in position_list:
    print(i, position_list[len(position_list)-1-i])
In this case, the output is just 4-input. If the values go up in increments of 1, a simple operation like that is enough. For example, if the max was 10 (exclusive) and the min was 5, you could do 9-input+5. The 9 is max-1 and the 5 is the min.
So in general: max-1-input+min
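The general formula can be wrapped in a small function. This is only a sketch, assuming integer values in the half-open range [lo, hi); the name invert and the range check are illustrative additions:

```python
def invert(value, lo, hi):
    """Mirror value within the half-open integer range [lo, hi)."""
    if not lo <= value < hi:
        raise ValueError("value must satisfy lo <= value < hi")
    return hi - 1 - value + lo


print(invert(3, 0, 5))   # 1
print(invert(5, 5, 10))  # 9
```

No list is created, so the cost is constant regardless of the size of the range.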

How to append randomized float values into array within loop

I generate randomized float values in a loop that produces 67 of them, and I want to collect each batch into its own array at the end of the loop. There are 64 loops in total.
As an example, if I had 4 values per loop and 3 total loops of integers, I would like it to be like this:
values = [[0, 4, 5, 1],[6, 6, 5, 3],[0,0,0,7]]
such that I could identify them as separate arrays. I know how to return the values, but I am unsure of the best way to append them after they are created. Forgive me, as I am unskilled with the logic.
import math
import random
funcs = []
coord = []
pi = math.pi
funcAmt = 0
coordAmt = 0
repeatAmt = 0
coordPass = 0
while funcAmt < 64:
    while coordAmt < 67:
        coordAmt += 1
        uniform = round(random.uniform(-pi, pi), 2)
        print("Coord [",coordAmt,"] {",uniform,"} Func:", funcAmt + 1)
        if uniform in coord:
            repeatAmt += 1
            print("Repeat Found!")
            coordAmt -= 1
            print("Repeat [",repeatAmt,"] Resolved")
            pass
        else:
            coordPass += 1
            coord.append(uniform)
    #<<<Append Here>>>
    funcAmt += 1
    coord.clear()
    coordAmt = 0
In my given code above, it would be similar to:
func = [
[<67 items>],
...63 more times
]
Your "append here" logic should append the coordinate list and then clear that list for the next iteration of the outer loop:
funcs.append(coord[:]) # The slice notation makes a copy of the list
coord.clear() # or simply coord = []
You should learn to use a for loop. This will simplify your looping: you don't have to maintain the counts yourself. For instance:
for funcAmt in range(64):
    for coordAmt in range(67):
        ...
You might also look up how to make a "list comprehension", which can reduce your process to a single line of code -- a long, involved line, but readable with proper white space.
Does that get you moving?
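One way the list-comprehension suggestion could look is to move the repeat check into a helper and build the outer list in a single expression. This is a sketch only; unique_coords is a hypothetical helper, and it assumes the same rounding to 2 decimals as the question:

```python
import math
import random


def unique_coords(count):
    """Return `count` distinct rounded values from [-pi, pi], in draw order."""
    seen = set()   # fast membership test for repeats
    coords = []    # preserves the order in which values were drawn
    while len(coords) < count:
        value = round(random.uniform(-math.pi, math.pi), 2)
        if value not in seen:
            seen.add(value)
            coords.append(value)
    return coords


# 64 independent lists of 67 unique values each
funcs = [unique_coords(67) for _ in range(64)]
print(len(funcs), len(funcs[0]))  # 64 67
```

Each call creates a fresh list, so nothing needs to be copied or cleared between iterations.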
There are a couple of ways around this. Instead of using while loops and counters, you could just use for loops. Or at least do that for the outer loop, since it looks like you still want to check for repeats. Here's an example using the dimensions of 3 and 4 from your small example:
from math import pi
import random
coord_sets = 3
coords = 4
biglist = []
for i in range(coord_sets):
    coords_set = []
    non_repeating_coords = 0
    while non_repeating_coords < coords:
        new_coord = round(random.uniform(-1.0*pi, pi), 2)
        if new_coord not in coords_set:
            coords_set.append(new_coord)
            non_repeating_coords += 1
    biglist.append(coords_set)
print(biglist)
You can use sets because they don't allow duplicate values:
from math import pi
import random
funcs = []
funcAmt = 0
while funcAmt < 64: # This is the number of loops
    myset = set()
    while len(myset) < 67: # This is the length of each set
        uniform = round(random.uniform(-pi, pi), 2)
        myset.add(uniform)
    funcs.append(list(myset)) # Append randomly generated set as a list
    funcAmt += 1
print(funcs)
Maybe you can benefit from arrays in numpy:
import numpy as np
funcs = np.random.uniform(-np.pi, np.pi, [64, 67])
This creates an array of shape (64, 67) of uniform random values between -pi and pi. Note that it neither rounds the values nor checks for repeats.

Memory efficient way to read an array of integers from single line of input in python2.7

I want to read a single line of input containing integers separated by spaces.
Currently I use the following.
A = map(int, raw_input().split())
But now the N is around 10^5 and I don't need the whole array of integers, I just need to read them 1 at a time, in the same sequence as the input.
Can you suggest an efficient way to do this in Python2.7
Use generators:
numbers = '1 2 5 18 10 12 16 17 22 50'
gen = (int(x) for x in numbers.split())
for g in gen:
    print g
1
2
5
18
10
12
16
17
22
50
The generator converts one item at a time, so it never builds the whole list of integers. Note that numbers.split() still creates a list of all the string tokens.
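To avoid the intermediate token list as well, the tokens themselves can be produced lazily with re.finditer, which yields one match at a time. A minimal sketch, assuming whitespace-separated integers; iter_ints is a hypothetical name, and the input line itself still has to fit in memory:

```python
import re


def iter_ints(line):
    """Yield the integers in a whitespace-separated line, one at a time."""
    # finditer returns an iterator, so no list of tokens is ever built
    for match in re.finditer(r'\S+', line):
        yield int(match.group())


for number in iter_ints('1 2 5 18 10'):
    print(number)
```

The same function works unchanged on Python 2.7, for example as iter_ints(raw_input()).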
You could parse the data a character at a time, this would reduce memory usage:
data = "1 50 30 1000 20 4 1 2"
number = []
numbers = []
for c in data:
    if c == ' ':
        if number:
            numbers.append(int(''.join(number)))
            number = []
    else:
        number.append(c)
if number:
    numbers.append(int(''.join(number)))
print numbers
Giving you:
[1, 50, 30, 1000, 20, 4, 1, 2]
Probably quite a bit slower though.
Alternatively, you could use itertools.groupby() to read groups of digits as follows:
from itertools import groupby
data = "1 50 30 1000 20 4 1 2"
numbers = []
for k, g in groupby(data, lambda c: c.isdigit()):
    if k:
        numbers.append(int(''.join(g)))
print numbers
If you're able to destroy the original string, split accepts a parameter for the maximum number of breaks.
See docs for more details and examples.
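The maxsplit idea could look roughly like the sketch below, which peels one token off the front of the string per step. The function name iter_ints_by_split is hypothetical. Each split copies the remaining text, so this trades speed for not holding a token list, and it is likely slow for very long lines:

```python
def iter_ints_by_split(data):
    """Yield integers by repeatedly splitting off the first token."""
    while True:
        # maxsplit=1: at most one break, giving [first_token, remainder]
        parts = data.split(None, 1)
        if not parts:
            return
        yield int(parts[0])
        data = parts[1] if len(parts) == 2 else ''


print(list(iter_ints_by_split('1 50 30 1000')))  # [1, 50, 30, 1000]
```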

Count how many times a given combination occurs in a nested list

I have a nested list called huge_list, as the name says it is pretty large. I need to know how I can get how many times a given combination of 2 elements of the sublists occur, for example:
huge_list = [[6,10,5,4,40,99],[1,10,3,6,40,71],[2,10,3,4,40,98]]
count = 0
for x in huge_list:
    #print amount of times position 1 and 4 have the same combination
    count = count + 1
and the output would be:
3
3
3
I tried something like :
sum(x.count(huge_list[count][1]) for x in huge_list)
But it works for just one of the items, not both of them. Any ideas?
If you're looking for a count of all the combinations of indexes 1 and 4 in a list of lists, it's hard to do better than:
import collections
huge_list = [[6,10,5,4,40,99],[1,10,3,6,40,71],[2,10,3,4,40,98]]
count = collections.Counter(((sublst[1], sublst[4]) for sublst in huge_list))
Which will give you:
In [3]: count
Out[3]: Counter({(10,40): 3})
You can get your exact requested output after this with:
for sublst in huge_list:
    print(count.get((sublst[1], sublst[4]), 0))
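The Counter approach generalizes to any set of positions by using operator.itemgetter. A short sketch under the assumption that at least two indices are given (with a single index, itemgetter returns a bare value instead of a tuple); count_combinations is a hypothetical name:

```python
from collections import Counter
from operator import itemgetter


def count_combinations(rows, indices):
    """Count how often each combination of values at `indices` occurs."""
    pick = itemgetter(*indices)  # returns a tuple for two or more indices
    return Counter(pick(row) for row in rows)


huge_list = [[6, 10, 5, 4, 40, 99], [1, 10, 3, 6, 40, 71], [2, 10, 3, 4, 40, 98]]
combo_counts = count_combinations(huge_list, (1, 4))
print(combo_counts[(10, 40)])  # 3
```

The list is traversed once, so the cost grows linearly with the number of sublists.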
If you are given two numbers to check, you can sum the matches:
huge_list = [[6,10,5,4,40,99],[1,10,3,6,40,71],[2,10,3,4,40,98]]
given = (10, 40)
print(sum((sub[1], sub[4]) == given for sub in huge_list))
I tried to reproduce your expected output, though I am not sure exactly what you are expecting:
huge_list = [[6,10,5,4,40,99],[1,10,3,6,40,71],[2,10,3,4,40,98]]
for i in huge_list:
    c = 0
    for j in huge_list:
        if i[1]==j[1] and i[4]==j[4]:
            c+=1
    print c
#output
3
3
3

Counting number of values between interval

Is there an efficient way in Python to count how many numbers in an array fall within certain intervals? The number of intervals I will be using may get quite large.
Like:
mylist = [4,4,1,18,2,15,6,14,2,16,2,17,12,3,12,4,15,5,17]
some function(mylist, startpoints):
    # startpoints = [0,10,20]
    count values in range [0,9]
    count values in range [10,19]
    output = [10,9]
You will have to iterate the list at least once.
The solution below works with any sequence/interval that implements comparison (<, >, etc.) and uses the bisect algorithm to find the correct point in the interval, so it is very fast.
It will work with floats, text, or whatever. Just pass a sequence and a list of the intervals.
from collections import defaultdict
from bisect import bisect_left
def count_intervals(sequence, intervals):
    count = defaultdict(int)
    intervals.sort()
    for item in sequence:
        pos = bisect_left(intervals, item)
        if pos == len(intervals):
            count[None] += 1
        else:
            count[intervals[pos]] += 1
    return count
data = [4,4,1,18,2,15,6,14,2,16,2,17,12,3,12,4,15,5,17]
print count_intervals(data, [10, 20])
Will print
defaultdict(<type 'int'>, {10: 10, 20: 9})
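Meaning that you have 10 values up to 10 and 9 values above 10 and up to 20. Because bisect_left is used, a value equal to a boundary is counted in that boundary's bucket, and values above the last boundary are counted under None.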
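If the buckets should follow the question's start points exactly ([0,9] and [10,19] for start points [0, 10, 20]), bisect_right can be used in place of bisect_left. A rough sketch, assuming the last start point acts as an exclusive upper bound and that values outside the range are ignored; count_by_startpoints is a hypothetical name:

```python
from bisect import bisect_right


def count_by_startpoints(values, startpoints):
    """Count values in [start_i, start_{i+1}) for consecutive start points."""
    starts = sorted(startpoints)
    counts = [0] * (len(starts) - 1)
    for value in values:
        # index of the last start point that is <= value
        idx = bisect_right(starts, value) - 1
        if 0 <= idx < len(counts):  # skip values outside all intervals
            counts[idx] += 1
    return counts


mylist = [4, 4, 1, 18, 2, 15, 6, 14, 2, 16, 2, 17, 12, 3, 12, 4, 15, 5, 17]
print(count_by_startpoints(mylist, [0, 10, 20]))  # [10, 9]
```

Each lookup is a binary search, so the total cost is O(n log k) for n values and k start points.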
I don't know how large your list will get but here's another approach.
import numpy as np
mylist = [4,4,1,18,2,15,6,14,2,16,2,17,12,3,12,4,15,5,17]
np.histogram(mylist, bins=[0,10,20])
You can also use a combination of value_counts() and pd.cut() to help you get the job done.
import pandas as pd
mylist = [4,4,1,18,2,15,6,14,2,16,2,17,12,3,12,4,15,5,17]
split_mylist = pd.cut(pd.Series(mylist), [0, 10, 20]).value_counts(sort=False)
print(split_mylist)
This piece of code will return this:
(0, 10]     10
(10, 20]     9
dtype: int64
Note that these bins are closed on the right by default, so a value of exactly 10 is counted in the first bin.
Then you can use the tolist() method to get what you want:
split_mylist = split_mylist.tolist()
print(split_mylist)
Output: [10, 9]
If the numbers are integers, as in your example, representing the intervals as frozensets can perhaps be fastest (worth trying). Not sure if the intervals are guaranteed to be mutually exclusive -- if not, then
intervals = [frozenset(range(10)), frozenset(range(10, 20))]
counts = [0] * len(intervals)
for n in mylist:
    for i, inter in enumerate(intervals):
        if n in inter:
            counts[i] += 1
If the intervals are mutually exclusive, this code could be sped up a bit by breaking out of the inner loop right after the increment. However, for mutually exclusive intervals of integers >= 0, there's an even more attractive option: first, prepare an auxiliary index, e.g. given your startpoints data structure that could be
indices = [sum(i >= x for x in startpoints) - 1 for i in range(max(startpoints))]
and then
counts = [0] * len(intervals)
for n in mylist:
    if 0 <= n < len(indices):
        counts[indices[n]] += 1
This can be adjusted if the intervals can be < 0 (everything needs to be offset by -min(startpoints) in that case).
If the "numbers" can be arbitrary floats (or decimal.Decimals, etc), not just integer, the possibilities for optimization are more restricted. Is that the case...?
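The lookup-table idea can be sketched end to end as follows. This is an illustrative version under the assumption that startpoints holds non-negative integers and that its largest entry is the exclusive upper bound; count_with_lookup is a hypothetical name. It compares with >= so that a value equal to a start point lands in the interval that start point opens:

```python
def count_with_lookup(values, startpoints):
    """Count integer values per interval using a precomputed index table."""
    # indices[i] is the interval number that integer i belongs to
    indices = [sum(i >= x for x in startpoints) - 1
               for i in range(max(startpoints))]
    counts = [0] * (len(startpoints) - 1)
    for n in values:
        if 0 <= n < len(indices):  # ignore values outside the table
            counts[indices[n]] += 1
    return counts


mylist = [4, 4, 1, 18, 2, 15, 6, 14, 2, 16, 2, 17, 12, 3, 12, 4, 15, 5, 17]
lookup_counts = count_with_lookup(mylist, [0, 10, 20])
print(lookup_counts)  # [10, 9]
```

Building the table costs time proportional to the value range times the number of start points, after which every value is classified with a single list lookup.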
