Calculate difference between two values (python) - python

If I have a variable x that returns a bunch of numbers (floats), how can I calculate the difference between all the adjacent numbers (e.g. (x - x-1), (x-1 - x-2) until the last term?).

Look at what you've written down in your question. The answer is there staring at you.
[x[i+1]-x[i] for i in range(len(x)-1)]
One of the nicest things about Python is that it has declarative features. You can often get what you want by just describing it; you don't always have to explicitly give the recipe.
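For what it's worth, the same idea can be written without indices by pairing each element with its successor. This is only a sketch: the function name adjacent_differences and the sample values are made up for illustration, and it assumes x is a list (or another sliceable sequence). If x is a numpy array, numpy.diff(x) computes the same thing.

```python
def adjacent_differences(values):
    """Return [values[1] - values[0], values[2] - values[1], ...]."""
    # zip stops at the shorter input, so the last element is never paired
    # with a missing successor.
    return [b - a for a, b in zip(values, values[1:])]


x = [1.0, 4.0, 9.0, 16.0]
diffs = adjacent_differences(x)
print(diffs)  # [3.0, 5.0, 7.0]
```

A list with fewer than two elements simply gives an empty result.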

Related

creating a simple function using lists of operators and integers

EDIT: When I say function in the title, I mean mathematical function not programming function. Sorry for any confusion caused.
I'm trying to create a function from randomly generated integers and operators. The approach I am currently taking is as follows:
STEP 1: Generate random list of operators and integers as a list.
STEP 2: Apply a set of rules to the list so I always end up with an integer, operator, integer, operator... etc list.
STEP 3: Use the modified list to create a single answer once the operations have been applied to the integers.
For example:
STEP 1 RESULT: [1,2,+,-,2,/,3,8,*]
STEP 2 RESULT: [1,+,2,-,2,/,3,*,8] - Note that I am using the operator command to generate the operators within the list.
STEP 3 RESULT: The output is intended to be a left to right read function rather than applying BODMAS, so in this case I'd expect the output to be 8/3 (the output doesn't have to be an integer).
So my question is: What function (and within what module) is available to help me combine the list as defined above. OR should I be combining the list in a different way to allow me to use a particular function?
I am considering changing the way I generate the list in the first place so that I do the sort on the fly, but I think I'll end up in the same situation that I wouldn't know how to combine the integers and operators after going through the sort process.
I feel like there is a simple solution here and I am tying myself up in knots unnecessarily!
Any help is greatly appreciated,
Dom
Why not create one list for the ints and one for the operators and then append from each list step by step?
edit: you can first convert your ints to strings, then build a single string with string = ''.join(list); after that you can just eval(string)
edit2: you can also take a look at the Sympy module that allows you to use symbolic math in python
I don't know if it's easier, but an elegant way would be to use a binary tree, where leaves are operands and other nodes are operators. You can directly generate it (without lists) by doing something like this (quick and dirty, probably wrong, but you get the idea) :
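def generate(end_depth, depth=0):
    # random_operator() and random_operand() stand for your own generators
    root = random_operator()
    right_child = random_operand()
    if depth == end_depth:
        left_child = random_operand()
    else:
        left_child = generate(end_depth, depth + 1)
    # a node is (operator, left subtree, right operand)
    return (root, left_child, right_child)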
Your example would look like this:
            *
           / \
         div  8
         / \
        -   3
       / \
      +   2
     / \
    1   2
It's "backwards" because when you evaluate, you need to start from the bottom, where the 2 operands are known.
So, for those that are interested. I achieved what I was after by using the eval() function. Although not the most robust, within the particular loop I have written the inputs are closely controlled so I am happy with this approach for now.
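For readers who would rather avoid eval(), here is a minimal sketch of a left-to-right evaluation over the STEP 2 list. It assumes the operators are stored as functions from the operator module, as the question suggests; the name evaluate_left_to_right is made up for illustration.

```python
import operator


def evaluate_left_to_right(tokens):
    """Fold an [int, op, int, op, int, ...] list strictly from left to right."""
    result = tokens[0]
    # tokens[1::2] are the operators, tokens[2::2] the operands that follow them.
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        result = op(result, operand)
    return result


# The STEP 2 list from the question: 1 + 2 - 2 / 3 * 8, read left to right.
tokens = [1, operator.add, 2, operator.sub, 2,
          operator.truediv, 3, operator.mul, 8]
answer = evaluate_left_to_right(tokens)
print(answer)  # roughly 2.667, i.e. 8/3
```

Because nothing is parsed from a string, operator precedence never comes into play and no untrusted text is executed.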

How to find a float number in a list?

I am trying to create a list of points from a road network. Here, I try to put their coordinates in a list of [x,y] whose items are floats. As a new point from the network is picked, it should be checked against the existing points in the list. If it exists, then the same index will be given to the feature of the network; otherwise a new point will be added to the list and the new index will be given to the feature.
I know that a float is stored differently from an integer, but even for exactly the same float numbers, I still cannot use:
if new_point in list_of_points:
    # do something
and I have to use:
for point in list_of_points:
    if abs(point.x-new_point.x)<0.01 and abs(point.y-new_point.y)<0.01:
        # do something
The points are supposed to be exactly the same as I snap them using the ArcGIS software, and when I check the coordinates in the software they are exactly the same.
I asked this question for:
1- I think using "in" can make my code tidy and also faster while using for-loop is kind of clumsy way of coding for this situation.
2- I want to know: does that mean even exactly the same float numbers are stored differently?
It's never a good idea to check for equality between two floating point numbers. However, there are built in functions to do a comparison like that. From numpy you can use allclose. For example,
>>> np.allclose( (1.0,2.0), (1.00000001,2.0000001) )
True
This checks if the two array like inputs are element-wise equal within a certain tolerance. You can adjust the relative and absolute tolerances with keyword arguments.
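If pulling in numpy just for this feels heavy, the standard library offers a similar tolerance test as math.isclose (Python 3.5 and later). The sketch below assumes points are plain (x, y) tuples; the helper name same_point and the default tolerance are illustrative choices, not anything from the question.

```python
import math


def same_point(p, q, tol=1e-6):
    """True if both coordinates of p and q agree within an absolute tolerance."""
    # rel_tol=0.0 makes this a purely absolute comparison, which suits
    # coordinates that share the same units and scale.
    return (math.isclose(p[0], q[0], rel_tol=0.0, abs_tol=tol)
            and math.isclose(p[1], q[1], rel_tol=0.0, abs_tol=tol))


print(0.1 + 0.2 == 0.3)              # False: exact comparison fails
print(math.isclose(0.1 + 0.2, 0.3))  # True: comparison within a tolerance
print(same_point((1.0, 2.0), (1.00000001, 2.0000001)))  # True
```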
Any given Python implementation should always store a given floating point number in the same, deterministic, non-random way within itself. I do not believe you can take the same floating point number, input it twice, and have it stored in two different ways. But I'm also reluctant to believe that you're going to be getting exact duplicates of coordinates out of a geographic program like ArcGIS, especially if the resolution is very small. There are many ways that floating point math can mess with your expectations, so you shouldn't ever expect that you'll have identical floats. And between different machines and different versions, you get even more possibilities for error.
If you're worried about the elegance of your code, you can just create a function to abstract out the for loop.
def coord_in(coord, coord_list):
    for other_coord in coord_list:
        if abs(coord.x-other_coord.x)<0.00001 and abs(coord.y-other_coord.y)<0.00001:
            return True
    return False
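Since the question actually needs the index of the matching point (or a new index if there is no match), the loop can return that directly. A rough sketch, assuming points are stored as (x, y) tuples rather than objects with .x and .y; the name find_or_add and the tolerance are hypothetical.

```python
def find_or_add(point, points, tol=1e-5):
    """Return the index of a stored point within tol of `point`.

    If no stored point is close enough, append `point` and return its new index.
    """
    x, y = point
    for index, (px, py) in enumerate(points):
        if abs(x - px) < tol and abs(y - py) < tol:
            return index
    points.append((x, y))
    return len(points) - 1


points = []
first = find_or_add((0.0, 0.0), points)         # new point, index 0
second = find_or_add((1.0, 1.0), points)        # new point, index 1
again = find_or_add((1.0000001, 1.0), points)   # matches index 1
```

This is still a linear scan per lookup, so it trades speed for simplicity.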
For a large number of points, numpy will usually be faster (and perhaps more elegant). If you have separated the x and y coords into (float) arrays arrx and arry:
numpy.any((arrx-point.x)**2+(arry-point.y)**2<tol**2)
will return True if point is within distance tol of an existing point. (Older code used numpy.sometrue here; it was an alias of numpy.any and has been removed from recent numpy versions.)
2: exactly the same literal (e.g., "2.3") will be stored as exactly the same float representation for a given platform and data-type, but in general it depends on the bit-ness, endian-ness and perhaps the compiler used to make python.
To be certain when comparing numbers, you should at least round to the precision of the least precise number, or (better) do the kind of thing you are doing here.
>>> 1==1.00000000000000000000000000000000001
True
Old thread but helped me develop my own solution using list comprehension. Because of course it's not a good idea to compare two floats using ==. The following returns list of indices of all elements of the input list that are reasonably close to the value we're looking for.
def findFloats(listOfFloats, value):
    return [i for i, number in enumerate(listOfFloats)
            if abs(number-value) < 0.00001]

Minimizing an array and value in Python

I have a vector of floats (coming from an operation on an array) and a float value (which is actually an element of the array, but that's unimportant), and I need to find the smallest float out of them all.
I'd love to be able to find the minimum between them in one line in a 'Pythony' way.
MinVec = N[i,:] + N[:,j]
Answer = min(min(MinVec),N[i,j])
Clearly I'm performing two minimisation calls, and I'd love to be able to replace this with one call. Perhaps I could eliminate the vector MinVec as well.
As an aside, this is for a short program in Dynamic Programming.
TIA.
EDIT: My apologies, I didn't specify I was using numpy. The variable N is an array.
You can append the value, then minimize. I'm not sure what the relative time considerations of the two approaches are, though - I wouldn't necessarily assume this is faster:
Answer = min(np.append(MinVec, N[i, j]))
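This is the same idea as the answer above but without using numpy's append. Note that list.append returns None, so build the combined list first and then take the minimum:
Answer = min(list(MinVec) + [N[i, j]])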
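A single min call is also possible without copying the vector, by chaining the extra value onto it lazily. This is just a sketch with a made-up helper name (min_with_extra); it works on any iterable of numbers, which should include a 1-D numpy array, though for large arrays the built-in min is likely slower than numpy's own reduction.

```python
from itertools import chain


def min_with_extra(values, extra):
    """Smallest element among `values` and the single value `extra`."""
    # chain yields the items of `values` followed by `extra`,
    # so min sees them all in one pass.
    return min(chain(values, [extra]))


print(min_with_extra([3.5, 2.0, 7.1], 1.5))  # 1.5
print(min_with_extra([3.5, 2.0, 7.1], 4.0))  # 2.0
```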

Python: create a polynomial of degree n

I have a feature set
[x1,x2....xm]
Now I want to create polynomial feature set
What that means is that if degree is two, then I have the feature set
[x1, ..., xm, x1^2, x2^2, ..., xm^2, x1x2, x1x3, ..., x1xm, ..., xm-1xm]
So it contains terms of order at most 2.
The same goes if the order is three: then you will have cubic terms as well.
How to do this?
Edit 1: I am working on a machine learning project where I have close to 7 features, and a non-linear regression on these linear features is giving an OK result. Hence I thought that, to get more features, I could map these features to a higher dimension.
So one way is to consider polynomial order of the feature vector...
Also generating x1*x1 is easy.. :) but getting the rest of the combinations are a bit tricky..
Can combinations give me x1x2x3 result if the order is 3?
Use
itertools.combinations(list, r)
where list is the feature set, and r is the order of desired polynomial features. Then multiply elements of the sublists given by the above. That should give you {x1*x2, x1*x3, ...}. You'll need to construct other ones, then union all parts.
[Edit]
Better: itertools.combinations_with_replacement(list, r) will nicely give sorted length-r tuples with repeated elements allowed.
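As a rough sketch of how that could look in practice, the function below multiplies out every combination for each order from 1 up to the requested degree. The name polynomial_features is made up here, the features are assumed to be plain numbers, and math.prod requires Python 3.8 or later.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_features(features, degree):
    """Products of all multisets of `features` with size 1..degree."""
    result = []
    for r in range(1, degree + 1):
        # Each combo is a sorted tuple such as (x1, x1) or (x1, x2),
        # so every monomial appears exactly once.
        for combo in combinations_with_replacement(features, r):
            result.append(prod(combo))
    return result


print(polynomial_features([2, 3], 2))  # [2, 3, 4, 6, 9]
```

With three features and degree 3, the tuple (x1, x2, x3) is among the combinations, so the x1x2x3 term asked about in the question is included.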
You could use itertools.product to create all the possible sets of n values that are chosen from the original set; but keep in mind that this will generate (x2, x1) as well as (x1, x2).
Similarly, itertools.combinations will produce sets without repetition or re-ordering, but that means you won't get (x1, x1) for example.
What exactly are you trying to do? What do you need these result values for? Are you sure you do want those x1^2 type terms (what does it mean to have the same feature more than once)? What exactly is a "feature" in this context anyway?
Using Karl's answer as inspiration, try using product and then taking advantage of the set object. Something like,
set([frozenset(comb) for comb in itertools.product(range(5),range(5))])
This will get rid of recurring pairs. Then you can turn the set back into a list and sort it or iterate over it as you please.
EDIT:
This will actually kill the x_m^2 terms, so build sorted tuples instead of sets. This keeps the terms hashable and non-repeating.
set([tuple(sorted(comb)) for comb in itertools.product(range(5),range(5))])

Unexpected behaviour in python random number generation

I have the following code:
import random
rand1 = random.Random()
rand2 = random.Random()
rand1.seed(0)
rand2.seed(0)
rand1.jumpahead(1)
rand2.jumpahead(2)
x = [rand1.random() for _ in range(0,5)]
y = [rand2.random() for _ in range(0,5)]
According to the documentation of jumpahead() function I expected x and y to be (pseudo)independent sequences. But the output that I get is:
x: [0.038378463064751012, 0.79353887395667977, 0.13619161852307016, 0.82978789012683285, 0.44296031215986331]
y: [0.98374801970498793, 0.79353887395667977, 0.13619161852307016, 0.82978789012683285, 0.44296031215986331]
If you notice, the 2nd-5th numbers are same. This happens each time I run the code.
Am I missing something here?
rand1.seed(0)
rand2.seed(0)
You initialize them with the same values so you get the same (non-)randomness. Use some value like the current unix timestamp to seed it and you will get better values. But note that if you initialize two RNGs at the same time with the current time though, you will get the same "random" values from them of course.
Update: Just noticed the jumpahead() stuff: Have a look at How should I use random.jumpahead in Python - it seems to answer your question.
I think there is a bug: Python's documentation does not make this as clear as it should.
The difference between your two parameters to jumpahead is 1, which means you are only guaranteed to get 1 unique value (which is what happens). If you want more values, you need larger parameters.
EDIT: Further Explanation
Originally, as the name suggests, jumpahead merely jumped ahead in the sequence. It's clear in that case that jumping 1 or 2 places ahead in the sequence would not produce independent results. As it turns out, jumping ahead in most random number generators is inefficient. For that reason, Python only approximates jumping ahead; because it's only approximate, Python can use a more efficient algorithm. However, since the method is only "pretending" to jump ahead, passing two similar integers will not result in very different sequences.
To get different sequences you need the integers passed in to be far apart. In particular, if you want to read a million random integers, you need to separate your jumpaheads by a million.
As a final note, if you have two random number generators, you only need to jumpahead on one of them. You can (and should) leave the other in its original state.
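Note that jumpahead() only exists in Python 2; it was removed from the random module in Python 3. A minimal sketch of the usual alternative there is to give each generator its own seed. The seed values 1 and 2 are arbitrary, and different seeds give different streams in practice rather than a formal guarantee of statistical independence.

```python
import random

# Two generators with different seeds produce different sequences.
rand1 = random.Random(1)
rand2 = random.Random(2)
x = [rand1.random() for _ in range(5)]
y = [rand2.random() for _ in range(5)]

# Re-creating a generator with the same seed reproduces its sequence.
replay = random.Random(1)
x_again = [replay.random() for _ in range(5)]

print(x != y)        # True
print(x == x_again)  # True
```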
