q table with gym (using box observation space) - python

I'm trying to run a q-learning algorithm with this observation space:
self.observation_space = spaces.Box(low=np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), high=np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1]), dtype=np.flo
When I'm trying to access the Q-table, like this:
q_value = q_table[state][action]
I'm getting this error:
IndexError: arrays used as indices must be of integer (or boolean) type
So my question is: how am I supposed to access the Q-table when my observation space is defined using spaces.Box?
If that's needed, this is how the q_table is defined (it's code I took from the internet that I'm trying to adjust to my project):
num_box = tuple((env.observation_space.high + np.ones(env.observation_space.shape)).astype(int))
q_table = np.zeros(num_box + (env.action_space.n,))

You haven't said what type q_table is. I will assume it's a numpy array defined as in OpenAI Gym and Python set up for Q-learning:
action_space_size = env.action_space.n
state_space_size = env.observation_space.n
q_table = np.zeros((state_space_size, action_space_size))
You're getting this error because you're not indexing the elements of the numpy array with an integer. Again, I haven't seen your code, but I believe you are trying to get a specific row of the Q table using a tuple.
Regardless, you should not use a Box observation space when using Q-learning, but rather a Discrete one. When using Q-learning, you need to know the number of states in advance, to initialize the Q-table.
Box spaces are for real values, and the number of dimensions of the space does not define the number of states. For example, if you create a Box space like this:
spaces.Box(low=0, high=1, shape=(2, 2), dtype=np.float16)
you won't have 4 states, but potentially infinitely many states. The parameters low=0 and high=1 indicate the minimum and maximum value of the four variables in the Box space, but there can be many values between 0 and 1 (0.1, 0.2, etc.). For this reason, you cannot estimate the number of states beforehand.
If you use np.uint8 (or any integer type) as dtype, you could potentially count the number of states, but it would still be a stretch to use Box spaces instead of Discrete spaces. Moreover, even using integer values the following will not work:
num_box = tuple((env.observation_space.high + np.ones(env.observation_space.shape)).astype(int))
q_table = np.zeros(num_box + (env.action_space.n,))
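A common workaround (a sketch of my own, not part of the original answer) is to discretize each dimension of the Box into a small number of bins, so that a state becomes a tuple of integers that can index the Q-table. Note that with 10 dimensions the table grows as n_bins**10, so the bin count has to stay small (or you switch to function approximation):
import numpy as np

n_bins = 4  # arbitrary choice; the table then has n_bins**10 * n_actions entries
obs_low = env.observation_space.low
obs_high = env.observation_space.high

def discretize(obs):
    # map each continuous value in [low, high] to an integer bin in [0, n_bins - 1]
    ratios = (np.asarray(obs) - obs_low) / (obs_high - obs_low)
    return tuple(np.clip((ratios * n_bins).astype(int), 0, n_bins - 1))

q_table = np.zeros((n_bins,) * env.observation_space.shape[0] + (env.action_space.n,))

state = discretize(env.reset())          # older gym API; newer gymnasium returns (obs, info)
action = int(np.argmax(q_table[state]))  # greedy action for the discretized state
q_value = q_table[state][action]         # integer indices, so no IndexError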

Related

Using dataloader with replacement based on certain criteria in pytorch

I am working with the following torch_geometric dataset object. It consists of a multitude of graphs, each representing a molecule. To give an idea, this is the data inside the Dataset.
Data(x=[1118918, 43], edge_attr=[2161762, 6], edge_index=[2, 2161762], y=[54613, 1], mol_code=[54613])
There is x (atom or node features), edge_attr (bond or edge features), edge_index (adjacency matrix), and y (target). There are 54613 graphs in this dataset: each graph is a molecule. In fact, to be clear, the dimension of x is [average_number_of_atoms * 54613, n_atom_features] and for edge_attr [average_number_of_bonds * 54613, n_bond_features].
What is mol_code then? It pairs the molecule with an id. In fact, even if I have 54613 graphs == 54613 molecules, the molecules repeat themselves. For example, the first 20 elements of mol_code could be all 0 (the 0th molecule), while the next 21 all 1, and so on without a fixed dimension. For splitting this dataset, I do the following:
# Let's say I want the training to be all entries corresponding to molecule 0, 1, and 2
molecules = [0, 1, 2]
training_dataset = dataset[torch.isin(dataset.data.mol_code, torch.tensor(molecules))]
However, now I want to have a training_dataset with replacement. For example, molecules would be [0, 1, 1].
The easiest thing I tried was to do:
# some function generates randomly molecules with repetitions
# imagine molecules is [0, 1, 1]
training_dataset = dataset[torch.isin(dataset.data.mol_code, torch.tensor(molecules))]
The problem is that now the training set obviously shrank, because it's taking all the entries that have mol_code either 0 or 1, but it's not taking the entries with 1 twice.
There is a sampler method I read about, but from what I understood it will sample individual entries, rather than group them by mol_code. Any ideas?
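One possible approach (a rough sketch of my own, not something from this thread): instead of a boolean mask, build an integer index list per molecule and concatenate it, so that a molecule drawn twice contributes its graphs twice. This assumes the dataset accepts a tensor of indices, as torch_geometric in-memory datasets do:
import torch

# molecules sampled with replacement, e.g. by some random generator
molecules = [0, 1, 1]
mol_code = dataset.data.mol_code

# indices of all graphs belonging to each sampled molecule, repeated as often
# as that molecule was drawn
idx = torch.cat([torch.nonzero(mol_code == m, as_tuple=True)[0] for m in molecules])
training_dataset = dataset[idx]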

shapely interpolate in three dimensions returns Point Z but invalid results

I am trying to use interpolate along a three dimensional line. However, any changes in the Z axis are not taken into account by .interpolate.
LineString([(0, 0, 0), (0, 0, 1), (0, 0, 2)]).interpolate(1, normalized=True).wkt
'POINT Z (0 0 0)'
vs
LineString([(0, 0, 0), (0, 1, 0), (0, 2, 0)]).interpolate(1, normalized=True).wkt
'POINT Z (0 2 0)'
I read the documentation, and it is either silent on 3D lines or the restriction is documented at a higher level than the interpolate documentation.
Is this a bug? I can't believe I'm the first person to try this.
Assuming that there is no direct way to accomplish this, any suggestions for doing my own interpolation?
That does indeed seem like a bug from shapely. I looked into the source code a little bit and I'm willing to bet it's an upstream issue with PyGEOS.
Anyways, here's a little implementation I put together:
import numpy as np
import shapely
import geopandas as gpd # Only necessary for the examples, not the actual function
def my_interpolate(input_line, input_dist, normalized=False):
    '''
    Function that interpolates the coordinates of a shapely LineString.

    Note: If you use this function on a MultiLineString geometry, it will
    "flatten" the geometry and consider all the points in it to be
    consecutively connected. For example, consider the following shape:
        MultiLineString(((0,0),(0,2)),((0,4),(0,6)))
    In this case, this function will assume that there is no gap between
    (0,2) and (0,4). Instead, the function will assume that these points
    are all connected. Explicitly, the MultiLineString above will be
    interpreted instead as the following shape:
        LineString((0,0),(0,2),(0,4),(0,6))

    Parameters
    ----------
    input_line : shapely.geometry.LineString or shapely.geometry.MultiLineString
        (Multi)LineString whose coordinates you want to interpolate
    input_dist : float
        Distance used to calculate the interpolation point
    normalized : boolean
        Flag that indicates whether the `input_dist` argument should be
        interpreted as an absolute distance or as a fraction of the
        geometry's total length.
        When this flag is set to False, the `input_dist` argument is assumed
        to be an actual absolute distance from the starting point of the
        geometry. When this flag is set to True, the `input_dist` argument
        is assumed to represent the relative distance with respect to the
        geometry's full length.
        The default is False.

    Returns
    -------
    shapely.geometry.Point
        The shapely geometry of the interpolated Point.
    '''
    # Making sure the input value is a LineString or MultiLineString
    if ((input_line.type.lower() != 'linestring') and
            (input_line.type.lower() != 'multilinestring')):
        return None

    # Extracting the coordinates from the geometry
    if input_line.type.lower()[:len('multi')] == 'multi':
        # In case it's a MultiLineString, this step "flattens" the points
        coords = [item for sub_list in
                  [list(this_geom.coords) for this_geom in input_line.geoms]
                  for item in sub_list]
    else:
        coords = [tuple(coord) for coord in list(input_line.coords)]

    # Transforming the list of coordinates into a numpy array for
    # ease of manipulation
    coords = np.array(coords)

    # Calculating the distances between consecutive points
    dists = ((coords[:-1] - coords[1:])**2).sum(axis=1)**0.5
    # Calculating the cumulative distances
    dists_cum = np.append(0, dists.cumsum())
    # Finding the total distance
    dist_total = dists_cum[-1]

    # Working out both the absolute and the relative version of `input_dist`
    if normalized == False:
        input_dist_abs = input_dist
        input_dist_rel = input_dist / dist_total
    else:
        input_dist_abs = input_dist * dist_total
        input_dist_rel = input_dist

    # Taking care of some edge cases
    if ((input_dist_rel < 0) or
            (input_dist_rel > 1) or
            (input_dist_abs < 0) or
            (input_dist_abs > dist_total)):
        return None
    elif ((input_dist_rel == 0) or (input_dist_abs == 0)):
        return shapely.geometry.Point(coords[0])
    elif ((input_dist_rel == 1) or (input_dist_abs == dist_total)):
        return shapely.geometry.Point(coords[-1])

    # Finding which point is immediately before and after the input distance
    pt_before_idx = np.arange(dists_cum.shape[0])[(dists_cum <= input_dist_abs)].max()
    pt_after_idx = np.arange(dists_cum.shape[0])[(dists_cum >= input_dist_abs)].min()
    pt_before = coords[pt_before_idx]
    pt_after = coords[pt_after_idx]
    seg_full_dist = dists[pt_before_idx]
    dist_left = input_dist_abs - dists_cum[pt_before_idx]

    # Calculating the interpolated coordinates
    interpolated_coords = ((dist_left / seg_full_dist) * (pt_after - pt_before)) + pt_before

    # Creating a shapely geometry
    interpolated_point = shapely.geometry.Point(interpolated_coords)
    return interpolated_point
The function above can be used on Shapely (Multi)LineStrings. Here's an example of it being applied to a simple LineString.
input_line = shapely.geometry.LineString([(0, 0, 0),
                                          (1, 2, 3),
                                          (4, 5, 6)])
interpolated_point = my_interpolate(input_line, 2.5, normalized=False)
print(interpolated_point.wkt)
> POINT Z (0.6681531047810609 1.336306209562122 2.004459314343183)
And here's an example of using the apply method to perform the interpolation on a whole GeoDataFrame of LineStrings:
line_df = gpd.GeoDataFrame({'id': [1, 2, 3],
                            'geometry': [input_line, input_line, input_line],
                            'interpolate_dist': [0.5, 2.5, 6.5],
                            'interpolate_dist_normalized': [True, False, False]})
interpolated_points = line_df.apply(
    lambda row: my_interpolate(input_line=row['geometry'],
                               input_dist=row['interpolate_dist'],
                               normalized=row['interpolate_dist_normalized']),
    axis=1)
print(interpolated_points.apply(lambda point: point.wkt))
> 0 POINT Z (1.419876550265357 2.419876550265356 3...
> 1 POINT Z (0.6681531047810609 1.336306209562122 ...
> 2 POINT Z (2.592529850263281 3.592529850263281 4...
> dtype: object
Important notes
Corner cases and error handling
Please note that the function I developed doesn't do error handling very well. In many cases, it just silently returns a None object. Depending on your use case, you might want to adjust that behavior.
MultiLineStrings
The function above can be used on MultiLineStrings, but it makes some simplifications and assumptions. If you use this function on a MultiLineString geometry, it will "flatten" the geometry and consider all the points in it to be consecutively connected. For example, consider the following shape:
MultiLineString(((0,0),(0,2)),((0,4),(0,6)))
In this case, the function will assume that there is no gap between (0,2) and (0,4). Instead, the function will assume that these points are all connected. Explicitly, the MultiLineString above will be interpreted instead as the following shape:
LineString((0,0),(0,2),(0,4),(0,6))
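For instance (an illustration I added, not part of the original answer), interpolating halfway along the MultiLineString from the note above lands inside the gap, because the gap is treated as a regular segment:
multi_line = shapely.geometry.MultiLineString([[(0, 0), (0, 2)],
                                               [(0, 4), (0, 6)]])
print(my_interpolate(multi_line, 3).wkt)
> POINT (0 3)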
Someone asked me "Can you interpolate along each axis instead of doing all three together?" I think the answer is yes, and here is the approach I used.
# Upsample to 1S intervals rather than our desired interval because resample throws
# out rows that do not fall on the desired interval, including the rows we want to keep.
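# (Assumed context, not shown in this snippet: `df` is a DataFrame indexed by a
# DatetimeIndex with Latitude/Longitude/AGL columns, and `order` is the polynomial
# order chosen for the interpolation, e.g. 2.)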
int_df = df.resample('1S', origin='start').asfreq()
# For each axis, interpolate to fill in NAN values.
int_df['Latitude'] = int_df['Latitude'].interpolate(method='polynomial', order=order)
int_df['Longitude'] = int_df['Longitude'].interpolate(method='polynomial', order=order)
int_df['AGL'] = int_df['AGL'].interpolate(method='polynomial', order=order)
# Now downsample to our desired frequency
int_df = int_df.resample('5S', origin='start').asfreq()
I initially resampled at 5S intervals, but that caused any existing points that were not on the interval boundaries to get dropped in favor of new ones that were on the interval boundaries. For my use case this is important. If you want regular intervals then you don't need to upsample and then downsample.
After that, just interpolate each of the three axes.
So, if I started with: [table of the original, irregularly spaced samples]
I now have: [table of the resampled data with interpolated Latitude/Longitude/AGL values]
To answer the question of why the shapely manipulation functions are not operating on 3D / Z:
From the shapely docs (written when version 1.8.x was current):
A third z coordinate value may be used when constructing instances,
but has no effect on geometric analysis. All operations are performed
in the x-y plane.
I also need Z for my purposes, so I was searching for this information to see whether using geopandas (which uses shapely) was an option, rather than osgeo.ogr.

Neural Network Data Sparsity

I am using PyBrain to train a network on music. The input is two notes, and the output is the next two notes.
Each note is represented by an integer mapped to a note (e.g. C# = 11, F = 7), the octave, and the duration. So I was using a dataset as such:
ds = SupervisedDataSet(6, 6)
Which would look like ([note1, octave1, duration1, note2, octave2, duration2], [note1, octave1, duration1, note2, octave2, duration2])
However, I ran into a problem with chords (i.e. more than one note played at once). To solve this, I got rid of the first integer representing a note and replaced it with 22 integers, set to either one or zero, to indicate which notes are being played. I still have this followed by octave and duration.
So for example, the following
[0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 4, 0.5]
represents a chord of C#, E and A, with an octave of 4 and duration of 0.5.
PyBrain always gives me an output of all zeros after training and testing. I understand why it's doing this but I don't know how to fix it.
Is there a better way to represent the notes/chords so that PyBrain won't have this problem?
EDIT: I have since converted the bit vector into a decimal number, and while the network isn't just giving zeros anymore it's still pretty clear it's not learning the patterns correctly.
I am using a network like this:
net = buildNetwork(6, 24, 6, bias=True, hiddenclass=LSTMLayer, recurrent=True)
and a trainer like this:
trainer = BackpropTrainer(net, ds, verbose = True)
when I train I am getting a huge error, something like ten or a hundred thousand.
Your problem is not entirely clear to me and I think it needs a more detailed explanation, but based on what I understood, I suppose you don't need recurrence in your network. Also try another activation function in the hidden layer, for example Softmax. I tested it on a data set of samples with 6 input nodes and 6 output nodes and it trains properly, so here is my version:
from pybrain.tools.shortcuts import buildNetwork
from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.structure.modules import SoftmaxLayer
ds = SupervisedDataSet(6, 6)
#
# fill dataset
#
net = buildNetwork(6, 24, 6, bias=True, hiddenclass=SoftmaxLayer)
trainer = BackpropTrainer(net, ds)
# train:
error = 10
while error > 0.00001:  # choose whatever error threshold you want
    error = trainer.train()
    print error  # just for logging

# and activate
print net.activate([*,*,*,*,*,*])
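For completeness, here is a minimal sketch (my addition, with made-up note/octave/duration values) of how the dataset could be filled with addSample before training:
# each sample is ([note1, octave1, duration1, note2, octave2, duration2], target of the same shape)
samples = [
    ([11, 4, 0.5, 7, 4, 1.0], [7, 4, 1.0, 0, 4, 0.5]),
    ([7, 4, 1.0, 0, 4, 0.5], [0, 4, 0.5, 11, 5, 0.25]),
]
for inp, target in samples:
    ds.addSample(inp, target)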

How to Collapse a List of Labels | Python

So I realize this is both a theoretical question and a coding question, but say if I have a list of 10 labels (x1, x2,...,x10) and their corresponding "location" vectors (v1, v2, ..., v10).
I want to collapse them based on their L2-norm distance from each other. For example, if v1 is close to v10, then relabel all x10's as x1's and so on.
So the end result could hypothetically look like the new labels: (x1, x3, x7, x8). Is there a way to smartly just turn this into (x1', x2', x3', x4'), so that people don't get confused and assume the new labels are the same as the old ones?
Given:
labels = vector of Nx1 that has all the labels (1,2,3...,10)
Example Code:
epsilon = 0.2  # defines distance
change = []    # initialize vector of labels to change
# distancematrix is an NxN matrix of the pairwise distances between all our vectors (v1,..,v10)
for i in range(0, len(distancematrix)):
    for j in range(0, len(distancematrix)):
        # add all pairs of labels that are "close", so that we may relabel
        if i != j and distancematrix[i, j] < epsilon:
            change.append((i, j))
This will produce a list of pairs that I want to relabel. Is there a smart way of rewriting 'labels' so that it merges all the pairs I want to merge AND keeps the labels that were not part of any merge, and then reorganizes everything to go from (1,2,3,4) if I merge 6 pairs of numbers (10 - 6 = 4)?
Thank you. I realize this is somewhat of a weird problem, so if you have questions please let me know!
This actually does the job for me.
# creates a list of numbers from 0 to the length of your newlabels vector
changeto = [i for i in range(0, len(np.unique(newlabels)))]
# get the unique values of your newlabels (e.g. 0, 3, 4, 5, 10)
currentlabels = np.unique(newlabels)
# change all your labels to your new mapping (e.g. 0 -> 0, 3 -> 1, 4 -> 2, etc.)
for i in range(0, len(changeto)):
    if currentlabels[i] != changeto[i]:
        # change the 'states' in newlabels to the new label
        newlabels = [changeto[i] if x == currentlabels[i] else x for x in newlabels]
Maybe it's not pretty, but you map your new labels onto the line 0, 1, 2,...x, where x is the length of your new condensed label vector.
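A more compact alternative (my addition, not part of the original answer) is to let numpy do the remapping: np.unique with return_inverse=True maps whatever label values survive the merge onto 0, 1, 2, ... in a single call.
import numpy as np

# relabel the merged labels so they form a contiguous range starting at 0
_, newlabels = np.unique(newlabels, return_inverse=True)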
What if a label is not involved in any merge? Do you want to keep the original label? If so, what if that label is outside the new range?
Overall, I think that this is simply generating new labels given only the quantity of labels:
new_label_list = ["x"+str(n+1)+"'" for n in range(len(change))]
For change of length 4, this gives you
["x1'", "x2'", "x3'", "x4'"]
Do you see how the new label is built?
leading "x"
string version of the index, 1 .. length
trailing prime character

Why NUMPY correlate and corrcoef return different values and how to "normalize" a correlate in "full" mode?

I'm trying to use some Time Series Analysis in Python, using Numpy.
I have two somewhat medium-sized series, with 20k values each and I want to check the sliding correlation.
The corrcoef gives me as output a Matrix of auto-correlation/correlation coefficients. Nothing useful by itself in my case, as one of the series contains a lag.
The correlate function (in mode="full") returns a 40k-element array that DOES look like the kind of result I'm aiming for (the peak value is as far from the center of the list as the lag would indicate), but the values are all weird - up to 500, when I was expecting something from -1 to 1.
I can't just divide it all by the max value; I know the max correlation isn't 1.
How could I normalize the "cross-correlation" (correlation in "full" mode) so the return values would be the correlation on each lag step instead those very large, strange values?
You are looking for normalized cross-correlation. This option isn't available yet in Numpy, but a patch is waiting for review that does just what you want. It shouldn't be too hard to apply it I would think. Most of the patch is just doc string stuff. The only lines of code that it adds are
if normalize:
    a = (a - mean(a)) / (std(a) * len(a))
    v = (v - mean(v)) / std(v)
where a and v are the inputted numpy arrays of which you are finding the cross-correlation. It shouldn't be hard to either add them into your own distribution of Numpy or just make a copy of the correlate function and add the lines there. I would do the latter personally if I chose to go this route.
Another, quite possibly better, alternative is to just do the normalization to the input vectors before you send it to correlate. It's up to you which way you would like to do it.
By the way, this does appear to be the correct normalization as per the Wikipedia page on cross-correlation, except for dividing by len(a) rather than (len(a) - 1). I feel that the discrepancy is akin to the difference between the population standard deviation and the sample standard deviation, and it really won't make much of a difference in my opinion.
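As a small sketch of that second route (my addition; `a` and `v` stand in for your two series, and the lagged copy below is synthetic, purely for illustration):
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(20000)
v = np.roll(a, 500) + 0.1 * rng.standard_normal(20000)  # lagged, noisy copy of a

# normalize the inputs yourself, then run the usual full-mode correlation
a_n = (a - a.mean()) / (a.std() * len(a))
v_n = (v - v.mean()) / v.std()
xcorr = np.correlate(a_n, v_n, mode='full')  # values now lie roughly in [-1, 1]
lag = xcorr.argmax() - (len(v) - 1)          # offset of the peak from zero lag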
Based on these slides, I would suggest doing it this way:
def cross_correlation(a1, a2):
    lags = range(-len(a1) + 1, len(a2))
    cs = []
    for lag in lags:
        idx_lower_a1 = max(lag, 0)
        idx_lower_a2 = max(-lag, 0)
        idx_upper_a1 = min(len(a1), len(a1) + lag)
        idx_upper_a2 = min(len(a2), len(a2) - lag)
        b1 = a1[idx_lower_a1:idx_upper_a1]
        b2 = a2[idx_lower_a2:idx_upper_a2]
        c = np.correlate(b1, b2)[0]
        c = c / np.sqrt((b1**2).sum() * (b2**2).sum())
        cs.append(c)
    return cs
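A quick usage example (my addition): the second series here is roughly the first shifted right by two samples. Note that the very short overlaps at the extreme lags can normalize to +/-1 (or produce NaN when an overlap is all zeros), so in practice you may want to restrict attention to lags with a reasonable overlap.
import numpy as np

a1 = np.array([1.0, 2.0, 3.0, 2.0, 1.0, 1.0])
a2 = np.array([1.0, 1.0, 1.0, 2.0, 3.0, 2.0])
cs = cross_correlation(a1, a2)
lags = list(range(-len(a1) + 1, len(a2)))
print(dict(zip(lags, np.round(cs, 2))))  # the coefficient at lag -2 is 1.0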
For full mode, would it make sense to compute corrcoef directly on the lagged signal/feature? Code:
from dataclasses import dataclass
from typing import Any, Optional, Sequence

import numpy as np

ArrayLike = Any


@dataclass
class XCorr:
    cross_correlation: np.ndarray
    lags: np.ndarray


def cross_correlation(
    signal: ArrayLike, feature: ArrayLike, lags: Optional[Sequence[int]] = None
) -> XCorr:
    """
    Computes normalized cross correlation between the `signal` and the `feature`.

    Current implementation assumes the `feature` can't be longer than the `signal`.
    You can optionally provide specific lags; if not provided, `signal` is padded
    with the length of the `feature` - 1, and the `feature` is slid/padded (creating lags)
    with 0 padding to match the length of the new signal. Pearson product-moment
    correlation coefficients are computed for each lag.

    See: https://en.wikipedia.org/wiki/Cross-correlation

    :param signal: observed signal
    :param feature: feature you are looking for
    :param lags: optional lags, if not provided equals to (-len(feature), len(signal))
    """
    signal_ar = np.asarray(signal)
    feature_ar = np.asarray(feature)
    if np.count_nonzero(feature_ar) == 0:
        raise ValueError("Unsupported - feature contains only zeros")
    assert (
        signal_ar.ndim == feature_ar.ndim == 1
    ), "Unsupported - only 1d signal/feature supported"
    assert len(feature_ar) <= len(
        signal
    ), "Unsupported - signal should be at least as long as the feature"

    padding_sz = len(feature_ar) - 1
    padded_signal = np.pad(
        signal_ar, (padding_sz, padding_sz), "constant", constant_values=0
    )

    lags = lags if lags is not None else range(-padding_sz, len(signal_ar), 1)
    if np.max(lags) >= len(signal_ar):
        raise ValueError("max positive lag must be shorter than the signal")
    if np.min(lags) <= -len(feature_ar):
        raise ValueError("max negative lag can't be longer than the feature")
    assert np.max(lags) < len(signal_ar), ""

    lagged_patterns = np.asarray(
        [
            np.pad(
                feature_ar,
                (padding_sz + lag, len(signal_ar) - lag - 1),
                "constant",
                constant_values=0,
            )
            for lag in lags
        ]
    )
    return XCorr(
        cross_correlation=np.corrcoef(padded_signal, lagged_patterns)[0, 1:],
        lags=np.asarray(lags),
    )
Example:
signal = [0, 0, 1, 0.5, 1, 0, 0, 1]
feature = [1, 0, 0, 1]
xcorr = cross_correlation(signal, feature)
assert xcorr.lags[xcorr.cross_correlation.argmax()] == 4
