How to count pair of peaks indexes including last entry? - python

I have hit a roadblock trying to include the last entry of an index to the output.
A pair of peaks is defined as a set of neighbouring values in a list that are higher than 3.
How can I include the index of the last entry into the output?
data_series_1 = [6,4,5,2,2,0,5,4,4,2,0,2,2,1,4,2,2,5,4,6]
def paired_peaks(data_series,threshold):
    peaks =[]
    for k in range(0,len(data_series)-1):
        y_b = data_series[k-1]
        y= data_series[k]
        y_a = data_series[k+1]
        if y>threshold:
            if y_b>threshold or y_a>threshold:
                peaks.append(k)
    return peaks
print(paired_peaks(data_series_1,3))
I expected it to be [0, 1, 2, 6, 7, 8, 17, 18, 19], however the actual output is [0, 1, 2, 6, 7, 8, 17, 18].

Your problem occurs at the end of the for loop. Because the range stops at len(data_series)-1, the last element is never visited; if the loop did reach it, data_series[k+1] would refer to an element that does not exist and raise an IndexError.
What may have tricked you is that data_series[-1] (used for y_b when k is 0) actually reads the last element instead of raising an error.
Now, I don't know what you intend with your program: do you want to read the first element in place of the nonexistent one? I assumed you do, because the first element is already compared with both the last element and the second one.
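To see the indexing behaviour described above in isolation, here is a small sketch; the list data and the flag raised are made-up names for illustration only:

```python
data = [6, 4, 5]

# A negative index counts from the end of the list, so this does not raise.
print(data[-1])  # reads the last element

# An index equal to len(data) is out of range and raises IndexError.
raised = False
try:
    data[len(data)]
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```

This is why reading k-1 at k == 0 silently wraps around, while reading k+1 at the last index cannot.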
To fix your problem I did:
data_series_1 = [6,4,5,2,2,0,5,4,4,2,0,2,2,1,4,2,2,5,4,6]
def paired_peaks(data_series,threshold):
    peaks =[]
    l = len(data_series)
    for k in range(l):
        y_b = data_series[k-1]
        y= data_series[k]
        y_a = data_series[(k+1)%l]
        if y > threshold:
            if y_b > threshold or y_a > threshold:
                peaks.append(k)
    return peaks
print(paired_peaks(data_series_1,3))
I stored the length of data_series in the variable l and then read the next element with data_series[(k+1)%l]; the modulo wraps the index around, so the first element is read instead of a nonexistent one.
This works as intended; however, I advise you to check whether you really want the first element to be compared with the last element, and the last element with the first.

This would solve your problem:
data_series_1 = [6,4,5,2,2,0,5,4,4,2,0,2,2,1,4,2,2,5,4,6]
def paired_peaks(data_series,threshold):
    peaks =[]
    for k in range(len(data_series)):
        y_b = data_series[k-1] if k - 1 in range(len(data_series)) else 0
        y= data_series[k]
        y_a = data_series[k+1] if k + 1 in range(len(data_series)) else 0
        if y>threshold:
            if y_b>threshold or y_a>threshold:
                peaks.append(k)
    return peaks
print(paired_peaks(data_series_1,3))
# returns: [0, 1, 2, 6, 7, 8, 17, 18, 19]
The reason your calculation stopped too early was range(0,len(data_series)-1): you exited the loop too early. I also added if k +/- 1 in range(len(data_series)) else 0 to your code because the first and last items of your list each have only one neighbour, so I assume the missing neighbour should count as zero. For the last item, reading k+1 would otherwise raise an IndexError because it is out of bounds. For the first item, no error was raised because data_series_1[-1] returns the last item of your list, but I don't think that was intended in your code.
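As a variation on the same idea, here is a rough sketch that checks the boundaries explicitly instead of substituting 0 for a missing neighbour (which matters if the threshold could ever be negative). The helper names left_high and right_high are my own and not from the question:

```python
def paired_peaks(data_series, threshold):
    peaks = []
    last = len(data_series) - 1
    for k, y in enumerate(data_series):
        if y <= threshold:
            continue
        # A neighbour only counts if it exists; no wrap-around, no sentinel value.
        left_high = k > 0 and data_series[k - 1] > threshold
        right_high = k < last and data_series[k + 1] > threshold
        if left_high or right_high:
            peaks.append(k)
    return peaks

data_series_1 = [6, 4, 5, 2, 2, 0, 5, 4, 4, 2, 0, 2, 2, 1, 4, 2, 2, 5, 4, 6]
print(paired_peaks(data_series_1, 3))
```

For the list in the question this should give the expected [0, 1, 2, 6, 7, 8, 17, 18, 19].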

Related

how do I reference first argument of an enumeration?

I am writing code with enumerate() in Python, and I am having issues with referencing the first argument in enumerate:
For example, let nums be temperatures of different days:
nums = [1,5,20,9,3,10,50,7]
array = []
for j, distance in enumerate(nums):
    for k, distance2 in enumerate(nums[1:],1):
        if nums[j] < nums[k]:
            array.append(distance2[j]-distance[k])
So, the challenge I have is: how do I reference the 'distance' and 'distance2' of each element respectively in my enumerations?
The aim of the problem is to determine, for each day, how many days you'll have to wait for a warmer day, so for the example above the output would be [1,1,4,2,1,1,0,0]; where there are no warmer days ahead, return 0.
Thanks
You need to calculate the distance based on the indexes, not the values at those indexes.
You should not restart the slice and the inner index at 1 each time, but rather at i on each iteration of the outer loop.
nums = [1, 5, 20, 9, 3, 10, 50, 7]
array = []
for i, curr_temp in enumerate(nums):
    days = 0
    for j, future_temp in enumerate(nums[i:], i):
        if curr_temp < future_temp:
            # Set Days to Distance between Indexes
            days = j - i
            # Stop Looking Once Higher Value Found
            break
    array.append(days)
print(array)
Output:
[1, 1, 4, 2, 1, 1, 0, 0]
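If the nested loop ever becomes too slow for long lists, one alternative (a different technique from the answer above, not a correction of it) is a stack of indexes that are still waiting for a warmer day. This is only a sketch; the function name days_until_warmer and the variable pending are hypothetical:

```python
def days_until_warmer(temps):
    waits = [0] * len(temps)
    pending = []  # indexes of days that have not yet seen a warmer day
    for today, temp in enumerate(temps):
        # Every pending day colder than today has found its answer.
        while pending and temps[pending[-1]] < temp:
            earlier = pending.pop()
            waits[earlier] = today - earlier
        pending.append(today)
    return waits

nums = [1, 5, 20, 9, 3, 10, 50, 7]
print(days_until_warmer(nums))
```

Each index is pushed and popped at most once, so the work grows linearly with the length of the list.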

How to pull elements from a list that are nested between particular elements, and add them to new lists?

I have a daypart column (str), which has 1s or 0s for each hour of the day, depending if we choose to run a campaign during that hour.
Example:
daypart = '110011100111111100011110'
I want to convert this to the following string format:
'0-1, 4-6, 9-15, 19-22'
The above format is more readable, and shows during which hours the campaign ran.
Here's what I'm doing:
hours_list = []
ind = 0
for x in daypart:
    if int(x) == 1:
        hours_list.append(ind)
    else:
        hours_list.append('exclude')
    ind += 1
The above gives me a list like this:
[0, 1, 'exclude', 'exclude', 4, 5, 6, 'exclude', 'exclude', 9, 10, 11, 12, 13, 14, 15, 'exclude', 'exclude', 'exclude', 19, 20, 21, 22, 'exclude']
Now I want to find a way to make the above into my desired output. What I am thinking of doing is finding which elements exist between 'exclude', and start adding them to new lists. I can then take the smallest and largest element from each list, join them with a '-', and append all such lists together.
Any ideas how I can do this, or a simpler way to do all of this?
Here's simple, readable code to get all intervals:
daypart = '1111111111111111111111'
hours= []
start, end = -1, -1
for i in range(len(daypart)):
    if daypart[i] == "1":
        if end != -1:
            end += 1
        else:
            start = i
            end = i
    else:
        if end!=-1:
            hours.append([start, end])
            start, end = -1,-1
if end!=-1:
    hours.append([start, end])
    start, end = -1,-1
print(hours)
I suggest that you convert directly to your desired format rather than using an intermediate representation that has the exact same information as the original input. Let's think about how we can do this in words:
1. Look for the first 1 in the input string.
2. Add the index to a list.
3. Look for the next 0 in the string.
4. Append one less than the found index to a list. (Or maybe append the indexes from steps 2 and 4 as a pair?)
5. Continue by looking for the next 1 and repeat steps 2-4.
I leave translating this into code as an exercise for the reader.
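For anyone who wants to check their own attempt, the steps above could be translated roughly as follows. This is a sketch using str.find; the function name daypart_to_ranges is my own invention, and a single isolated hour is rendered as e.g. 5-5, which may or may not be what you want:

```python
def daypart_to_ranges(daypart):
    ranges = []
    start = daypart.find("1")              # first "1" in the string
    while start != -1:
        stop = daypart.find("0", start)    # next "0" after the run starts
        if stop == -1:
            stop = len(daypart)            # the run continues to the end
        ranges.append(f"{start}-{stop - 1}")
        start = daypart.find("1", stop)    # look for the next run
    return ", ".join(ranges)

print(daypart_to_ranges('110011100111111100011110'))
```

str.find returns -1 when nothing is found, which is what ends the loop.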
This can also be done with itertools.groupby, operator.itemgetter and enumerate in a single generator expression:
from itertools import groupby
from operator import itemgetter
daypart = '110011100111111100011110'
get_ends, get_one = itemgetter(0,-1), itemgetter(1)
output = ', '.join('{0[0]}-{1[0]}'.format(*get_ends(list(g))) for k,g in groupby(enumerate(daypart), get_one) if k=='1')
print(output)
0-1, 4-6, 9-15, 19-22
get_ends gets the first and last elements of each group, and get_one just gets element 1 (the character) so that it can be used as the grouping key.

Accessing elements from a list?

I am trying to calculate the distance between two lists so I can find the shortest distance between all coordinates.
Here is my code:
import random
import math
import copy
def calculate_distance(starting_x, starting_y, destination_x, destination_y):
    distance = math.hypot(destination_x - starting_x, destination_y - starting_y) # calculates Euclidean distance (straight-line) distance between two points
    return distance
def nearest_neighbour_algorithm(selected_map):
    temp_map = copy.deepcopy(selected_map)
    optermised_map = [] # we setup an empty optimised list to fill up
    # get last element of temp_map to set as starting point, also removes it from temp_list
    optermised_map.append(temp_map.pop()) # we set the first element of the temp_map and put it in optimised_map as the starting point and remove this element from the temp_map
    for x in range(len(temp_map)):
        nearest_value = 1000
        neares_index = 0
        for i in range(len(temp_map[x])):
            current_value = calculate_distance(*optermised_map[x], *temp_map[x])
I get an error at this part and I'm not sure why:
for i in range(len(temp_map[x])):
    current_value = calculate_distance(*optermised_map[x], *temp_map[x])
I am trying to find the distance between the points in these two lists, and the error I get is that the list index is out of range at the for loop.
On the first iteration optermised_map has length 1, so optermised_map[x] raises an IndexError as soon as x reaches 1, because x iterates over range(len(temp_map)), which is likely longer than 1. I think you may have wanted:
for i in range(len(optermised_map)):
    current_value = calculate_distance(*optermised_map[i], *temp_map[x])
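For what it's worth, here is a minimal sketch of how the whole nearest-neighbour loop might look once the indexing is sorted out. I am assuming the map is a list of (x, y) tuples; nearest_neighbour_route and distance_from_last are hypothetical names, not taken from the question:

```python
import math

def nearest_neighbour_route(points):
    remaining = list(points)         # shallow copy so the input is untouched
    route = [remaining.pop()]        # start from the last point, as in the question
    while remaining:
        last_x, last_y = route[-1]

        def distance_from_last(index):
            x, y = remaining[index]
            return math.hypot(x - last_x, y - last_y)

        # Pick the closest unvisited point and move it into the route.
        nearest_index = min(range(len(remaining)), key=distance_from_last)
        route.append(remaining.pop(nearest_index))
    return route

print(nearest_neighbour_route([(0, 0), (5, 5), (1, 1), (2, 2)]))
```

The point is that each step compares only against the most recently added point (route[-1]) and never indexes the two lists with the same counter.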
Are the lengths of the lists the same? I could be wrong, but this sounds like a cosine similarity exercise to me. Check out this very simple exercise.
from scipy import spatial
dataSetI = [3, 45, 7, 2]
dataSetII = [2, 54, 13, 15]
result = 1 - spatial.distance.cosine(dataSetI, dataSetII)
result
# 0.97228425171235
dataSetI = [1, 2, 3, 10]
dataSetII = [2, 4, 6, 20]
result = 1 - spatial.distance.cosine(dataSetI, dataSetII)
result
# 1.0
dataSetI = [10, 200, 234, 500]
dataSetII = [45, 3, 19, 20]
result = 1 - spatial.distance.cosine(dataSetI, dataSetII)
result
# 0.4991255575740505
In the second example, we can see that the ratios of the numbers in the two lists are exactly the same even though the numbers themselves differ. Cosine similarity focuses on the ratios of the numbers.

How to improve time complexity of remove all multiplicands from array or list?

I am trying to find the elements of an integer array (or list) that are unique and are not divisible by any other element of the same array or list.
You can answer in any language like python, java, c, c++ etc.
I have tried this code in Python3 and it works perfectly but I am looking for better and optimum solution in terms of time complexity.
# assuming the list A is already sorted and has unique elements
A = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
i = 0
j = 1
while i<len(A)-1:
    while j<len(A):
        if A[j]%A[i]==0:
            A.pop(j)
        else:
            j+=1
    i+=1
    j=i+1
For the given array A=[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16] the answer would be ans=[2,3,5,7,11,13]
Another example: if A=[4,5,15,16,17,23,39], then ans=[4,5,17,23,39]
ans contains only unique numbers
an element i of the array is kept only if (i%j)!=0 for every other element j (i!=j)
I think it's more natural to do it in reverse, by building a new list containing the answer instead of removing elements from the original list. If I'm thinking correctly, both approaches do the same number of mod operations, but you avoid the issue of removing an element from a list.
A = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
ans = []
for x in A:
    for y in ans:
        if x % y == 0:
            break
    else: ans.append(x)
Edit: Promoting the completion else.
This algorithm will perform much faster:
A = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
if (A[-1]-A[0])/A[0] > len(A)*2:
    # widely spread data: build the result by checking divisibility
    result = list()
    for v in A:
        for f in result:
            d,m = divmod(v,f)
            if m == 0: v=0;break
            if d<f: break
        if v: result.append(v)
else:
    # dense data: remove the multiples of each retained number
    retain = set(A)
    minMult = 1
    maxVal = A[-1]
    for v in A:
        if v not in retain : continue
        minMult = v*2
        if minMult > maxVal: break
        if v*len(A)<maxVal:
            retain.difference_update([m for m in retain if m >= minMult and m%v==0])
        else:
            # maxVal+1 so that maxVal itself is removed when it is a multiple of v
            retain.difference_update(range(minMult,maxVal+1,v))
        if maxVal%v == 0:
            maxVal = max(retain)
    result = sorted(retain) # a set has no guaranteed order
print(result) # [2, 3, 5, 7, 11, 13]
In the spirit of the sieve of Eratosthenes, each number that is retained removes its multiples from the remaining eligible numbers. Depending on the magnitude of the highest value, it is sometimes more efficient to exclude multiples than to check for divisibility. The divisibility check takes several times longer for an equivalent number of factors to check.
At some point, when the data is widely spread out, assembling the result instead of removing multiples becomes faster (this last addition was inspired by Imperishable Night's post).
TEST RESULTS
A = [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16] (100000 repetitions)
Original: 0.55 sec
New: 0.29 sec
A = list(range(2,5000))+[9697] (100 repetitions)
Original: 3.77 sec
New: 0.12 sec
A = list(range(1001,2000))+list(range(4000,6000))+[9697**2] (10 repetitions)
Original: 3.54 sec
New: 0.02 sec
I know that this is totally insane, but I want to know what you think about this:
A = [4,5,15,16,17,23,39]
prova=[[x for x in A if x!=y and y%x==0] for y in A]
print([A[idx] for idx,x in enumerate(prova) if len(prova[idx])==0])
And I think it's still O(n^2).
If you care about speed more than algorithmic efficiency, numpy would be the package to use here in python:
import numpy as np
# Note: doesn't have to be sorted
a = [2, 2, 3, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 16, 29, 29]
a = np.unique(a)
result = a[np.all((a % a[:, None] + np.diag(a)), axis=0)]
# array([2, 3, 5, 7, 11, 13, 29])
This divides all elements by all other elements and stores the remainder in a matrix, checks which columns contain only non-0 values (other than the diagonal), and selects all elements corresponding to those columns.
This is O(n*M), where M is the largest integer in your list. The integers are all assumed to be positive (a 0 in the list would cause a division by zero in i % e). This also assumes your input list is sorted (I came to that assumption since all the lists you provided are sorted).
a = [4, 7, 7, 8]
# a = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
# a = [4, 5, 15, 16, 17, 23, 39]
M = max(a)
used = set()
final_list = []
for e in a:
    if e in used:
        continue
    else:
        used.add(e)
        for i in range(e, M + 1):
            if not (i % e):
                used.add(i)
        final_list.append(e)
print(final_list)
Maybe this can be optimized even further...
If the list is not sorted, then it must be sorted first for the above method to work. The time complexity will then be O(nlogn + Mn); the Mn term dominates unless M is no larger than roughly log n.
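One possible optimization, sketched here under the assumption that all values are positive integers, is to step through the multiples of each kept value directly instead of testing every integer up to M with the modulo operator. The function name remove_multiples and the set name crossed are hypothetical:

```python
def remove_multiples(values):
    ordered = sorted(set(values))    # sort and drop duplicates up front
    if not ordered:
        return []
    limit = ordered[-1]
    crossed = set()                  # numbers known to be multiples of a kept value
    kept = []
    for v in ordered:
        if v in crossed:
            continue
        kept.append(v)
        # Visit only 2v, 3v, ... up to the maximum, as in a sieve.
        crossed.update(range(2 * v, limit + 1, v))
    return kept

print(remove_multiples([2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]))
print(remove_multiples([4, 5, 15, 16, 17, 23, 39]))
```

The inner loop then costs about M/v steps per kept value rather than M, which helps most when the kept values are large.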

Print a line if conditions have not been met

Hello fellow stackoverflowers, I am practising my Python with an example question given to me (actually a Google interview practice question) and ran into a problem I did not know how to a) pose properly (hence vague title), b) overcome.
The question is: For an array of numbers (given or random) find unique pairs of numbers within the array which when summed give a given number. E.G: find the pairs of numbers in the array below which add to 6.
[1 2 4 5 11]
So in the above case:
[1,5] and [2,4]
The code I have written is:
from secrets import *
i = 10
x = randbelow(10)
number = randbelow(100) #Generate a random number to be the sum that we are after#
if number == 0:
    pass
else:
    number = number
array = []
while i>0: #Generate a random array to use#
    array.append(x)
    x = x + randbelow(10)
    i -= 1
print("The following is a randomly generated array:\n" + str(array))
print("Within this array we are looking for a pair of numbers which sum to " + str(number))
for i in range(0,10):
    for j in range(0,10):
        if i == j or i>j:
            pass
        else:
            elem_sum = array[i] + array[j]
            if elem_sum == number:
                number_one = array[i]
                number_two = array[j]
                print("A pair of numbers within the array which satisfy that condition is: " + str(number_one) + " and " + str(number_two))
            else:
                pass
If no pairs are found, I want the line "No pairs were found". I was thinking a try/except, but wasn't sure if it was correct or how to implement it. Also, I'm unsure on how to stop repeated pairs appearing (unique pairs only), so for example if I wanted 22 as a sum and had the array:
[7, 9, 9, 13, 13, 14, 23, 32, 41, 45]
[9,13] would appear twice
Finally forgive me if there are redundancies/the code isn't written very efficiently, I'm slowly learning so any other tips would be greatly appreciated!
Thanks for reading :)
You can simply add a Boolean holding the answer to "was at least one pair found?".
Initialize it as found = False at the beginning of your code.
Then, whenever you find a pair (the condition block that holds your current print command), just add found = True.
After all of your search (the double for loop), add this:
if not found:
    print("No pairs were found")
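Put together, the flag approach might look something like this. It is only a sketch: I wrapped the search in a hypothetical function report_pairs so it can be tried on a fixed list, and I start the inner loop at i + 1 instead of skipping i >= j with pass:

```python
def report_pairs(array, number):
    found = False  # becomes True as soon as one pair matches
    for i in range(len(array)):
        for j in range(i + 1, len(array)):
            if array[i] + array[j] == number:
                found = True
                print("A pair of numbers within the array which satisfy that condition is: "
                      + str(array[i]) + " and " + str(array[j]))
    if not found:
        print("No pairs were found")
    return found

report_pairs([1, 2, 4, 5, 11], 6)
report_pairs([1, 2, 4, 5, 11], 100)
```

Note that this still prints duplicated pairs when the list contains repeated values.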
Instead of actually comparing each pair of numbers, you can just iterate the list once, subtract the current number from the target number, and see if the remainder is in the list. If you convert the list to a set first, that lookup can be done in O(1), reducing the overall complexity from O(n²) to just O(n). Also, the whole thing can be done in a single line with a list comprehension:
>>> nums = [1, 2, 4, 5, 11]
>>> target = 6
>>> nums_set = set(nums)
>>> pairs = [(n, target-n) for n in nums_set if target-n in nums_set and n <= target/2]
>>> print(pairs)
[(1, 5), (2, 4)]
For printing the pairs or some message, you can use the or keyword. x or y is interpreted as x if x else y, so if the result set is empty, the message is printed, otherwise the result set itself.
>>> pairs = []
>>> print(pairs or "No pairs found")
No pairs found
Update: The above can fail if a number added to itself equals the target but occurs only once in the list. In that case, you can use a collections.Counter instead of a set and check the multiplicity of that number first.
>>> import collections
>>> nums = [1, 2, 4, 5, 11, 3]
>>> nums_set = set(nums)
>>> [(n, target-n) for n in nums_set if target-n in nums_set and n <= target/2]
[(1, 5), (2, 4), (3, 3)]
>>> nums_counts = collections.Counter(nums)
>>> [(n, target-n) for n in nums_counts if target-n in nums_counts and n <= target/2 and (n != target-n or nums_counts[n] > 1)]
[(1, 5), (2, 4)]
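The same Counter idea, written out as a function, could look like the sketch below. The name unique_pairs is hypothetical; the function returns each pair once, and only pairs a number with itself when it occurs at least twice:

```python
from collections import Counter

def unique_pairs(nums, target):
    counts = Counter(nums)
    pairs = []
    for n in sorted(counts):
        partner = target - n
        if n > partner:
            break  # every later n is larger still, so its pair was already seen
        if n == partner:
            if counts[n] > 1:        # need two copies to pair a number with itself
                pairs.append((n, n))
        elif partner in counts:
            pairs.append((n, partner))
    return pairs

print(unique_pairs([1, 2, 4, 5, 11, 3], 6) or "No pairs found")
print(unique_pairs([7, 9, 9, 13, 13, 14, 23, 32, 41, 45], 22) or "No pairs found")
```

For the array from the question with a target of 22, [9, 13] is reported only once.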
List your constraints first!
numbers added must be unique
only 2 numbers can be added
the length of the array can be arbitrary
the number to be summed to can be arbitrary
& Don't skip preprocessing! Reduce your problem-space.
2 things off the bat:
Starting after your 2 print statements, I would do array = list(set(array)) to reduce the problem-space to [7, 9, 13, 14, 23, 32, 41, 45] (a set does not guarantee order, so sort the result if the order matters).
Assuming that all the numbers in question will be positive, I would discard numbers greater than or equal to number:
array = [x for x in array if x < number]
giving [7, 9, 9, 13, 13, 14]
Combine the last 2 steps into a list comprehension and then use that as array:
array = [x for x in sorted(set(array)) if x < number]
which gives array == [7, 9, 13, 14]
After these two steps, you can do a bunch of stuff. I'm fully aware that I haven't answered your question, but from here you got this. ^this is the kind of stuff I'd assume google wants to see.
