Checking for overlap between time spans - python

I have a list of time entries (HHMM format), each with a start time and a stop time. I'm having trouble figuring out how to write Python code that returns whether or not there's an overlap in the list.
Example
Entry 1: 1030, 1245;
Entry 2: 1115, 1300
== True
Entry 1: 0900, 1030;
Entry 2: 1215, 1400
== False

First we sort the list by the start time.
Then we loop over it, checking whether each start time is lower than the previous end time.
This only checks whether x+1 overlaps with x (not whether x+2 overlaps with x, etc.).
intervals = [[100,200],[150,250],[300,400]]
intervalsSorted = sorted(intervals, key=lambda x: x[0]) # sort by start time
for x in range(1,len(intervalsSorted)):
    # compare each interval with the one just before it in sorted order
    if intervalsSorted[x-1][1] > intervalsSorted[x][0]:
        print("{0} overlaps with {1}".format( intervalsSorted[x-1], intervalsSorted[x] ))
# result: [100, 200] overlaps with [150, 250]
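For the True/False answer the question asks for, the sorted neighbour check above can be wrapped in a function. This is a minimal sketch, assuming the entries are (start, stop) pairs of 'HHMM' strings and that entries which merely touch (one stops exactly when the next starts) do not count as overlapping; the names to_minutes and entries_overlap are made up for illustration.

```python
def to_minutes(hhmm):
    """Convert an 'HHMM' string such as '0900' to minutes since midnight."""
    return int(hhmm[:2]) * 60 + int(hhmm[2:])

def entries_overlap(entries):
    """Return True if any two (start, stop) entries overlap."""
    spans = sorted((to_minutes(start), to_minutes(stop)) for start, stop in entries)
    # Once sorted by start time, if any two spans overlap then some pair of
    # neighbours overlaps too, so comparing neighbours is enough for a yes/no answer.
    return any(prev[1] > cur[0] for prev, cur in zip(spans, spans[1:]))

print(entries_overlap([("1030", "1245"), ("1115", "1300")]))  # True
print(entries_overlap([("0900", "1030"), ("1215", "1400")]))  # False
```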
The following should give you all overlaps in the whole list.
intervals = [[100,200],[150,250],[300,400],[250,500]]
overlapping = [ [x,y] for x in intervals for y in intervals if x is not y and x[1]>y[0] and x[0]<y[0] ]
for x in overlapping:
    print('{0} overlaps with {1}'.format(x[0],x[1]))
# results:
# [100, 200] overlaps with [150, 250]
# [250, 500] overlaps with [300, 400]
Note that this is an O(n*n) lookup. (anyone correct me here if I'm wrong!)
This is likely slower than the first method (I didn't test it, but I assume it is) because it iterates over the whole list for every single element. It should be similar to arbarnert's nested for loops example. On the other hand, it gives you all the overlapping pairs, whereas the first method only checks for overlaps between neighbouring intervals (sorted by start time).
Extended test gives:
intervals = [[100,200],[150,250],[300,400],[250,500],[10,900],[1000,12300],[-151,32131],["a","c"],["b","d"],["foo","kung"]]
# Note: mixing numbers and strings like this only works in Python 2;
# in Python 3, comparing an int with a str raises TypeError.
overlapping = [ [x,y] for x in intervals for y in intervals if x is not y and x[1]>y[0] and x[0]<y[0] ]
for x in overlapping:
    print('{0} overlaps with {1}'.format(x[0],x[1]))
# results:
# [100, 200] overlaps with [150, 250]
# [250, 500] overlaps with [300, 400]
# [10, 900] overlaps with [100, 200]
# [10, 900] overlaps with [150, 250]
# [10, 900] overlaps with [300, 400]
# [10, 900] overlaps with [250, 500]
# [-151, 32131] overlaps with [100, 200]
# [-151, 32131] overlaps with [150, 250]
# [-151, 32131] overlaps with [300, 400]
# [-151, 32131] overlaps with [250, 500]
# [-151, 32131] overlaps with [10, 900]
# [-151, 32131] overlaps with [1000, 12300]
# ['a', 'c'] overlaps with ['b', 'd']

For future reference, @Roy's solution doesn't work for intervals that have the same start or end times. The following solution does:
intervals = [[100, 200], [150, 250], [300, 400], [250, 500], [100, 150], [175, 250]]
intervals.sort()
l = len(intervals)
overlaps = []
for i in range(l):
    for j in range(i+1, l):
        x = intervals[i]
        y = intervals[j]
        if x[0] == y[0]:
            # same start time
            overlaps.append([x, y])
        elif x[1] == y[1]:
            # same end time
            overlaps.append([x, y])
        elif (x[1]>y[0] and x[0]<y[0]):
            # y starts strictly inside x
            overlaps.append([x, y])
Also, an Interval Tree could be used for these kinds of problems.
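Short of a full interval tree, a sorted scan with an early exit already avoids most of the pairwise comparisons. The sketch below is not an interval tree; it is a simpler alternative, and the name overlapping_pairs is made up for illustration. It assumes each interval is a [start, end] pair with start < end and that intervals which only touch at an endpoint do not overlap. The worst case is still quadratic when most intervals overlap each other.

```python
def overlapping_pairs(intervals):
    """Return every overlapping pair [x, y], with x sorted before y."""
    ordered = sorted(intervals)
    pairs = []
    for i in range(len(ordered)):
        x = ordered[i]
        for j in range(i + 1, len(ordered)):
            y = ordered[j]
            # ordered is sorted by start time, so once y starts at or after
            # the end of x, no later interval can overlap x either.
            if y[0] >= x[1]:
                break
            pairs.append([x, y])
    return pairs

intervals = [[100, 200], [150, 250], [300, 400], [250, 500], [100, 150], [175, 250]]
for x, y in overlapping_pairs(intervals):
    print("{0} overlaps with {1}".format(x, y))
```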

To expand on @Roy's answer to cover situations where several entries share the same time slot and you need to distinguish between them:
intervals = [[100,200, "math"],[100,200, "calc"], [150,250, "eng"],[300,400, "design"],[250,500, "lit"],[10,900, "english"],[1000,12300, "prog"],[-151,32131, "hist"]]
overlapping = [ [x,y] for x in intervals for y in intervals if (x is not y and x[1]>y[0] and x[0]<y[0]) or (x[0]==y[0] and x[1]==y[1] and x is not y) ]
for x in overlapping:
    print('{0} overlaps with {1}'.format(x[0],x[1]))
# Prints
#[100, 200, 'math'] overlaps with [100, 200, 'calc']
#[100, 200, 'math'] overlaps with [150, 250, 'eng']
#[100, 200, 'calc'] overlaps with [100, 200, 'math']
#[100, 200, 'calc'] overlaps with [150, 250, 'eng']
#[250, 500, 'lit'] overlaps with [300, 400, 'design']
#[10, 900, 'english'] overlaps with [100, 200, 'math']
#[10, 900, 'english'] overlaps with [100, 200, 'calc']
#[10, 900, 'english'] overlaps with [150, 250, 'eng']
#[10, 900, 'english'] overlaps with [300, 400, 'design']
#[10, 900, 'english'] overlaps with [250, 500, 'lit']
#[-151, 32131, 'hist'] overlaps with [100, 200, 'math']
#[-151, 32131, 'hist'] overlaps with [100, 200, 'calc']
#[-151, 32131, 'hist'] overlaps with [150, 250, 'eng']
#[-151, 32131, 'hist'] overlaps with [300, 400, 'design']
#[-151, 32131, 'hist'] overlaps with [250, 500, 'lit']
#[-151, 32131, 'hist'] overlaps with [10, 900, 'english']
#[-151, 32131, 'hist'] overlaps with [1000, 12300, 'prog']

Assuming you have an intervals_overlap(interval1, interval2) function…
The first idea is a naive iteration over every pair of intervals in the list:
def has_overlap(intervals):
    for interval1 in intervals:
        for interval2 in intervals:
            if interval1 is not interval2:
                if intervals_overlap(interval1, interval2):
                    return True
    return False
But you should be able to figure out smarter ways of doing this.
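The answer above leaves intervals_overlap undefined. One possible definition is sketched here, assuming each interval is a (start, end) pair of comparable values and that intervals which only touch at an endpoint do not overlap. The helper any_overlap is a made-up name; it uses itertools.combinations so that each pair is tested once rather than twice.

```python
from itertools import combinations

def intervals_overlap(interval1, interval2):
    """Two intervals overlap when each one starts before the other ends."""
    start1, end1 = interval1
    start2, end2 = interval2
    return start1 < end2 and start2 < end1

def any_overlap(intervals):
    """Return True if at least one pair of intervals overlaps."""
    return any(intervals_overlap(a, b) for a, b in combinations(intervals, 2))

print(any_overlap([(1030, 1245), (1115, 1300)]))  # True
print(any_overlap([(900, 1030), (1215, 1400)]))   # False
```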

Simple way to do it:
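I changed the numbers into strings because entry 3 contains 0900, which is not a valid integer literal in Python.
import itertools
entry01 = ('1030', '1245')
entry02 = ('1115', '1300')
entry03 = ('0900', '1030')
entry04 = ('1215', '1400')
def check(entry01, entry02):
    # Flatten both entries into [start1, stop1, start2, stop2].
    input_time_series = list(itertools.chain.from_iterable([entry01, entry02]))
    # The flattened list is already sorted only if the first entry stops
    # before the second one starts, so True here means "no overlap".
    # This assumes the first argument is the earlier entry.
    return input_time_series == sorted(input_time_series)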
>>> check(entry01, entry02)
False
>>> check(entry03, entry04)
True

Related

How to add Element to a List in Python inside another list

I'm trying to add the element 7000 to this list: [10, 20, [300, 400, [5000, 6000, ], 500], 30, 40]
I want to add 7000 after 6000 in this list … I have already tried some methods to add this element
Use the append() method to update a list in place
>>> x = [10, 20, [300, 400, [5000, 6000], 500], 30, 40]
>>> x[2][2].append(7000)
>>> x
[10, 20, [300, 400, [5000, 6000, 7000], 500], 30, 40]
You can use the append() method on any list, no matter how nested it is.
Your list: lst = [10, 20, [300, 400, [5000, 6000, ], 500], 30, 40]
At index 2 of your list there is another list:
[300, 400, [5000, 6000, ], 500]
That list in turn contains another list at index 2, i.e. [5000, 6000, ]
So use the append() method like this:
lst[2][2].append(7000)
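Both answers rely on the fact that indexing returns the inner list object itself rather than a copy, so changing it is visible through the outer list. A small sketch of that, with insert() shown as the counterpart for adding at a specific position (the variable name inner is only for illustration):

```python
x = [10, 20, [300, 400, [5000, 6000], 500], 30, 40]

inner = x[2][2]        # the same list object that is stored inside x, not a copy
inner.append(7000)     # add to the end of the innermost list
inner.insert(0, 4000)  # insert() adds at a given index instead of at the end

print(x)
# [10, 20, [300, 400, [4000, 5000, 6000, 7000], 500], 30, 40]
```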

Splitting ranges according to list of excluded ranges

I have two arrays:
ranges = [[50, 60], [100, 100], [5000, 6000]]
exclude = [[1000, 1000], [5060, 5060]]
How can I get the result as a sorted list, i.e.
[[50, 60], [100, 100], [5000, 5059], [5061, 6000]]
Basically, remove the ranges of the second list from the ranges of the first list, creating new ranges where needed.
More examples:
ranges = [[2, 124235]]
exclude = [[2000, 3000], [400, 2500]]
which should give the output
[[2, 399], [3001, 124235]]
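One way to approach this is sketched below, assuming all ranges are inclusive integer [start, end] pairs, as the expected outputs suggest. The function name subtract_ranges is made up for illustration. Each range is walked from left to right with a cursor, and every excluded range that intersects it pushes the cursor past its end.

```python
def subtract_ranges(ranges, exclude):
    """Remove the inclusive integer ranges in `exclude` from those in `ranges`."""
    excluded = sorted(exclude)
    result = []
    for start, end in sorted(ranges):
        cursor = start  # first value not yet emitted or excluded
        for ex_start, ex_end in excluded:
            if ex_end < cursor:
                continue  # lies entirely before the part still to be handled
            if ex_start > end:
                break     # sorted by start, so nothing later can intersect
            if ex_start > cursor:
                result.append([cursor, ex_start - 1])  # keep the gap before it
            cursor = max(cursor, ex_end + 1)
            if cursor > end:
                break
        if cursor <= end:
            result.append([cursor, end])  # keep whatever is left at the end
    return result

print(subtract_ranges([[50, 60], [100, 100], [5000, 6000]], [[1000, 1000], [5060, 5060]]))
# [[50, 60], [100, 100], [5000, 5059], [5061, 6000]]
print(subtract_ranges([[2, 124235]], [[2000, 3000], [400, 2500]]))
# [[2, 399], [3001, 124235]]
```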

Clamping by image shape: replace X,Y coordinates by different rules in Numpy array

I have a picture of shape (225, 400, 3) and a set of x,y coordinates:
polygon = np.array([[150, 80], [350, 80], [420, 280], [350, 250], [150, 250]], np.int32)
I want to clamp those values to dimensions of my image, so any given pair of coordinates won't be outside of my image. This means that I need to replace all x-coordinates [X > 400, ...] with 400 and all y-coordinates [..., Y > 225] with 225.
I tried to replace all Y coordinates greater than 225, without success: it also clamps the X coordinates.
polygon[(polygon > 225).all(axis=1)] = 225
What's the correct way of clamping Numpy array values by different rules in this case?
IIUC, each row contains an (x, y) coordinate (note that this is not the same as an axis), so you can index each column and use clip:
polygon[:,0] = polygon[:,0].clip(max=400)
polygon[:,1] = polygon[:,1].clip(max=225)
print(polygon)
array([[150, 80],
[350, 80],
[400, 225],
[350, 225],
[150, 225]])
You can use np.where:
polygon[:,0] = np.where(polygon[:,0] > 400, 400, polygon[:,0])
polygon[:,1] = np.where(polygon[:,1] > 225, 225, polygon[:,1])
You will obtain the following output:
array([[150, 80],
[350, 80],
[400, 225],
[350, 225],
[150, 225]])
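Both columns can also be clamped in a single call by broadcasting a per-column upper bound against the (N, 2) array. A minimal sketch using np.minimum, assuming only an upper limit is needed as in the question; the names upper and clamped are made up for illustration:

```python
import numpy as np

polygon = np.array([[150, 80], [350, 80], [420, 280], [350, 250], [150, 250]], np.int32)

# One upper bound per column: x <= 400, y <= 225.
upper = np.array([400, 225], dtype=polygon.dtype)

# upper has shape (2,), so it is broadcast across every row of polygon.
clamped = np.minimum(polygon, upper)
print(clamped)
```

This returns a new array and leaves polygon unchanged.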

Subtracting columns from a numpy array

This question is a follow-up of a previous post of mine:
Multiply each column of a numpy array with each value of another array.
Suppose I have the following arrays:
In [252]: k
Out[252]:
array([[200, 800, 400, 1600],
[400, 1000, 800, 2000],
[600, 1200, 1200,2400]])
In [271]: f = np.array([[100,50],[200,100],[300,200]])
In [272]: f
Out[272]:
array([[100, 50],
[200, 100],
[300, 200]])
How can I subtract f from k to obtain the following?
In [252]: g
Out[252]:
array([[100, 750, 300, 1550],
[200, 900, 600, 1900],
[300, 1000, 900,2200]])
Ideally, I would like to do the subtraction in as few steps as possible and in concert with the solution provided in my other post, but any solution is welcome.
You can use np.tile, like this:
In [1]: k - np.tile(f, (1, 2))
Out[1]:
array([[ 100, 750, 300, 1550],
[ 200, 900, 600, 1900],
[ 300, 1000, 900, 2200]])
Also, if you happen to know for sure that each dimension of f evenly divides the corresponding dimension of k (which I assume you must, for your desired subtraction to make sense), then you could generalize it slightly:
In [2]: k - np.tile(f, np.array(k.shape) // np.array(f.shape))
Out[2]:
array([[ 100, 750, 300, 1550],
[ 200, 900, 600, 1900],
[ 300, 1000, 900, 2200]])
You can reshape k, to fit f in two dimensions, and use broadcasting:
>>> g = (k.reshape(f.shape[0], -1, f.shape[1]) - f[:, None, :]).reshape(k.shape)
>>> g
array([[ 100, 750, 300, 1550],
[ 200, 900, 600, 1900],
[ 300, 1000, 900, 2200]])

NumPy replace values with arrays based on conditions

I am trying to produce a sorter that, for each weekly total (for multiple different products), places them in the right buckets based on the max cumulative allowable less what has already been sorted.
maxes=np.array([[100,200,300],[100,400,900]])
weeklytotals=np.array([[100,150,200,250],[200,400,600,800]])
The desired output would be:
result=np.array([[[100,0,0],[100,50,0],[100,100,0],[100,150,0]],[[100,100,0],[100,300,0],[100,400,100],[100,400,300]]])
I do not want to use loops but I am racking my mind on how to avoid that method. Thanks in advance, still a Python beginner. I want to use NumPy because the end implementation will need to be extremely fast.
One vectorized approach could be:
result = np.minimum(weeklytotals[:,:,None], maxes.cumsum(1)[:,None,:])
result[...,1:] -= result[...,:-1]
result
#array([[[100, 0, 0],
# [100, 50, 0],
# [100, 100, 0],
# [100, 150, 0]],
# [[100, 100, 0],
# [100, 300, 0],
# [100, 400, 100],
# [100, 400, 300]]])
First, calculate the cumulative capacity of the buckets:
maxes.cumsum(1)
#array([[ 100, 300, 600],
# [ 100, 500, 1400]])
Then calculate the cumulative amount in the buckets by taking the minimum of the weekly total and the cumulative capacity:
result = np.minimum(weeklytotals[:,:,None], maxes.cumsum(1)[:,None,:])
#array([[[100, 100, 100],
# [100, 150, 150],
# [100, 200, 200],
# [100, 250, 250]],
# [[100, 200, 200],
# [100, 400, 400],
# [100, 500, 600],
# [100, 500, 800]]])
Finally, take the difference between consecutive buckets and assign it back (for every bucket except the first):
result[...,1:] -= result[...,:-1]
result
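To check the vectorized result, the same bucket-filling rule can be written as a plain loop for a single weekly total. This is only a reference sketch for comparison, not a replacement for the NumPy version, and the name fill_buckets is made up for illustration.

```python
def fill_buckets(total, capacities):
    """Fill buckets left to right, each up to its capacity, until total runs out."""
    filled = []
    remaining = total
    for capacity in capacities:
        placed = min(remaining, capacity)  # a bucket never exceeds its own max
        filled.append(placed)
        remaining -= placed
    return filled

print(fill_buckets(250, [100, 200, 300]))  # [100, 150, 0]
print(fill_buckets(600, [100, 400, 900]))  # [100, 400, 100]
```

Each returned list should match the corresponding row of result above.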
