I'm trying to understand how Shapely works.
I can draw a simple line with the following code:
import matplotlib.pyplot as plt
from shapely.geometry import Point, LineString
A = Point(0,0)
B = Point(1,1)
AB = LineString([A,B])
plt.plot(AB)
However when I alter the coordinates:
A = Point(1,0)
B = Point(3,4)
AB = LineString([A,B])
plt.plot(AB)
Shapely decides to plot two lines, which is behaviour I don't understand.
Using Shapely 1.7.0
You are using plt.plot() incorrectly.
The documentation describes plt.plot() as: "Plot y versus x as lines and/or markers."
As the docs explain, since the call plot(AB) has only 1 argument, AB is passed as the Y values.
The X values, in this case, are the indices of the elements in the array of Y values.
It is the same as calling plt.plot([(1,0),(3,4)]). Matplotlib treats each column of a 2D input as a separate data set, so the 2 columns of Y values give you 2 different lines: [(0,1),(1,3)] and [(0,0),(1,4)]. (Notice the x values are 0 and 1, the row index of the corresponding Y value.)
You can see in the screenshot of the output, that in the first case you also plot 2 lines. But in the case of these specific values, plt.plot([(0,0),(1,1)]) will plot the same line twice.
If you just want to graph a line from point A to point B, you can use:
A = Point(1,0)
B = Point(3,4)
AB = LineString([A,B])
plt.plot(*AB.xy)
plt.show()
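The column behaviour described above can be checked without Shapely. The following is only a sketch: it uses a plain list of coordinate tuples as a stand-in for the LineString, and the names coords, split_lines and single_line are made up for the illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt

coords = [(1, 0), (3, 4)]  # stand-in for the coordinates of LineString([A, B])

fig, (left, right) = plt.subplots(1, 2)

# One argument: the data is taken as Y, each column becomes its own line,
# and X is the row index (0, 1).
split_lines = left.plot(coords)

# Two arguments: unpack into x values and y values to get one line from A to B.
xs, ys = zip(*coords)
single_line = right.plot(xs, ys)

print(len(split_lines), len(single_line))
```

Unpacking the coordinates into separate x and y sequences is what plt.plot(*AB.xy) does for the Shapely object.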
I have dataframes with columns containing x,y coordinates for multiple points. One row can consist of several points.
I'm trying to find out an easy way to be able to plot lines between each point generating a curve for each row of data.
Here is a simplified example where two lines are represented by two points each.
line1 = {'p1_x':1, 'p1_y':10, 'p2_x':2, 'p2_y':11 }
line2 = {'p1_x':2, 'p1_y':9, 'p2_x':3, 'p2_y':12 }
df = pd.DataFrame([line1,line2])
df.plot(y=['p1_y','p2_y'], x=['p1_x','p2_x'])
When I plot them, I expect line 1 to start at x=1 and line 2 to start at x=2.
Instead, the x axis contains the two value pairs (1,2) and (2,3), and both lines have the same start and end point on the x axis.
How do I get around this problem?
Edit:
If using matplotlib, the following hardcoded values generate the plot I'm interested in:
plt.plot([[1,2],[2,3]],[[10,9],[11,12]])
While I'm sure that there should be a more succinct way using pure pandas, here's a simple approach using matplotlib and some derivatives of the original df. (I hope I understood the question correctly.)
Assumption: In df, you place x values in even columns and y values in odd columns
Obtain x values
x = df.loc[:, df.columns[::2]]
x
p1_x p2_x
0 1 2
1 2 3
Obtain y values
y = df.loc[:, df.columns[1::2]]
y
p1_y p2_y
0 10 11
1 9 12
Then plot using a for loop
for i in range(len(df)):
    plt.plot(x.iloc[i,:], y.iloc[i,:])
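If the columns are not guaranteed to alternate between x and y, a variation is to select them by name instead of by position. This is only a sketch under the assumption that every x column ends in "_x" and every y column ends in "_y", as in the example data; it is not something the question states as a general rule.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame([
    {'p1_x': 1, 'p1_y': 10, 'p2_x': 2, 'p2_y': 11},
    {'p1_x': 2, 'p1_y': 9, 'p2_x': 3, 'p2_y': 12},
])

# Assumption: the column suffix tells us whether a column holds x or y values.
x = df.filter(regex='_x$')
y = df.filter(regex='_y$')

fig, ax = plt.subplots()
# One line per row of df; ax.plot returns a list, so keep its single Line2D.
lines = [ax.plot(x.iloc[i], y.iloc[i])[0] for i in range(len(df))]
```

The selection keeps the original column order, so the k-th x column is paired with the k-th y column.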
One does not need to create additional data frames. One can loop through the rows to plot these lines:
line1 = {'p1_x':1, 'p1_y':10, 'p2_x':2, 'p2_y':11 }
line2 = {'p1_x':2, 'p1_y':9, 'p2_x':3, 'p2_y':12 }
df = pd.DataFrame([line1,line2])
for i in range(len(df)): # for each row:
    # plt.plot([list of Xs], [list of Ys])
    plt.plot([df.iloc[i,0],df.iloc[i,2]],[df.iloc[i,1],df.iloc[i,3]])
plt.show()
The lines will be drawn in different colors. To get lines of same color, one can add option c='k' or whatever color one wants.
plt.plot([df.iloc[i,0],df.iloc[i,2]],[df.iloc[i,1],df.iloc[i,3]], c='k')
I generally don't use the pandas plotting because I think it is rather limited. If using matplotlib is not an issue, the following code works:
from matplotlib import pyplot as plt
# x and y arrays of shape (points per line, number of lines): one line is drawn per column, i.e. per row of df
plt.plot(df[['p1_x','p2_x']].to_numpy().T, df[['p1_y','p2_y']].to_numpy().T)
plt.show()
If you have more points per line, add their columns to the two lists.
I have a data which looks like (example)
x y d
0 0 -2
1 0 0
0 1 1
1 1 3
And I want to turn this into a colormap plot which looks like one of these:
where x and y are in the table and the color is given by 'd'. However, I want a predetermined color for each number, for example:
-2 - orange
0 - blue
1 - red
3 - yellow
Not necessarily these colours, but I need to assign a colour to each number, and the numbers are not in order or sequence; they are just a set of five or six random numbers which repeat themselves across the entire array.
Any ideas? I haven't got any code for this as I don't know where to start. I have, however, looked at examples on here such as:
Matplotlib python change single color in colormap
However, they only show how to define colours and not how to link those colours to a specific value.
It turns out this is harder than I thought, so maybe someone has an easier way of doing this.
Since we need to create an image of the data, we will store them in a 2D array. We can then map the data to the integers 0 .. number of different data values and assign a color to each of them. The reason is that we want the final colormap to be equally spaced. So
value -2 --> integer 0 --> color orange
value 0 --> integer 1 --> color blue
and so on.
Having nicely spaced integers, we can use a ListedColormap on the image of newly created integer values.
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.colors
# define the image as a 2D array
d = np.array([[-2,0],[1,3]])
# create a sorted list of all unique values from d
ticks = np.unique(d.flatten()).tolist()
# create a new array of same shape as d
# we will later use this to store values from 0 to number of unique values
dc = np.zeros(d.shape)
#fill the array dc
for i in range(d.shape[0]):
    for j in range(d.shape[1]):
        dc[i,j] = ticks.index(d[i,j])
# now we need n (= number of unique values) different colors
colors= ["orange", "blue", "red", "yellow"]
# and put them to a listed colormap
colormap = matplotlib.colors.ListedColormap(colors)
plt.figure(figsize=(5,3))
#plot the newly created array, shift the colorlimits,
# such that later the ticks are in the middle
im = plt.imshow(dc, cmap=colormap, interpolation="none", vmin=-0.5, vmax=len(colors)-0.5)
# create a colorbar with n different ticks
cbar = plt.colorbar(im, ticks=range(len(colors)) )
#set the ticklabels to the unique values from d
cbar.ax.set_yticklabels(ticks)
#set nice tickmarks on image
plt.gca().set_xticks(range(d.shape[1]))
plt.gca().set_yticks(range(d.shape[0]))
plt.show()
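The value-to-integer mapping built by the double loop above can probably also be obtained in one step with np.unique(..., return_inverse=True). The sketch below additionally ties each value to a colour through a dictionary; the name color_of and the extra array d_repeated are assumptions made for the illustration, not part of the answer.

```python
import numpy as np

d = np.array([[-2, 0], [1, 3]])

# ticks: sorted unique values; inverse: position of each entry of d within ticks
ticks, inverse = np.unique(d, return_inverse=True)
dc = inverse.reshape(d.shape)  # same role as the dc array filled by the loop

# Hypothetical lookup table linking each data value to a fixed colour.
color_of = {-2: "orange", 0: "blue", 1: "red", 3: "yellow"}
colors = [color_of[v] for v in ticks.tolist()]

# Repeated, unordered values work the same way.
d_repeated = np.array([[3, -2], [0, 3]])
ticks_rep, inverse_rep = np.unique(d_repeated, return_inverse=True)
dc_repeated = inverse_rep.reshape(d_repeated.shape)
```

Because colors is built from the dictionary in the order of ticks, it can be passed to ListedColormap as before, and each value keeps its assigned colour even if some values are missing from a particular data set.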
As it may not be intuitively clear how to get the array d in the shape needed for plotting with imshow, i.e. as 2D array, here are two ways of converting the input data columns:
import numpy as np
x = np.array([0,1,0,1])
y = np.array([ 0,0,1,1])
d_original = np.array([-2,0,1,3])
#### Method 1 ####
# Intuitive method.
# Assumption:
# * Indexing in x and y start at 0
# * every index pair occurs exactly once.
# Create an empty array of shape (n+1,m+1)
# where n is the maximum index in y and
# m is the maximum index in x
d = np.zeros((y.max()+1 , x.max()+1), dtype=int)
for k in range(len(d_original)) :
    d[y[k],x[k]] = d_original[k]
print(d)
#### Method 2 ####
# Fast method
# Additional assumption:
# indices in x and y are ordered exactly such
# that y is sorted ascendingly first,
# and for each index in y, x is sorted.
# In this case the original d array can be simply reshaped
d2 = d_original.reshape((y.max()+1 , x.max()+1))
print(d2)
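A third option sits between the two methods: NumPy's integer-array indexing fills the grid in a single assignment and does not need the rows to be sorted. This is a sketch under the same assumptions as Method 1 (indices start at 0 and every index pair occurs exactly once); the rows are deliberately shuffled here, and the names values and grid are made up for the example.

```python
import numpy as np

# Same data as above, but with the rows in arbitrary order.
x = np.array([1, 0, 1, 0])
y = np.array([1, 0, 0, 1])
values = np.array([3, -2, 0, 1])

grid = np.zeros((y.max() + 1, x.max() + 1), dtype=int)
grid[y, x] = values  # row index from y, column index from x

print(grid)
```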
I have Python code which solves Gauss equations and plots a function graph. I have a problem plotting my function. When I try to plot a function, for example "2sin(2πx)", I see lines connecting the points, and that is not what I expect to see.
import numpy as np
import math
import random
import matplotlib.pyplot as plt
import pylab
from matplotlib import mlab
print('case1=2sin(2πx)')
print('case2=cos(2πx)')
print('case3=5x^3 + x^2 + 5')
Your_function=input("Enter your choise of your function: ")
def Choising_of_function(x, Your_function):
    if Your_function=='case1':
        return 2*math.sin(2*x*math.pi)
    elif Your_function=='case2':
        return math.cos(2*x*math.pi)
    elif Your_function=='case3':
        return 5*x**3 + x**2 + 5
Dimension_of_pol=int(input("Enter your degree of polynom: "))
Points=int(input("Enter number of points: "))# I just need only limited numbers of points to plot a function graph
Interval=int(input("Enter interval of your points: "))
dx=float(input("Enter interval your dx: "))
X_val=[]
Y_val=[]
for i in range(Points):# First, i generate my values of x
    x = random.uniform(-Interval, Interval)
    X_val.append(x)
for x in X_val:
    y=Choising_of_function(x, Your_function)
    Y_val.append(y)
print(X_val, Y_val)
Arr_Xo=[[x**i for i in range(Dimension_of_pol)] for x in X_val]
print(Arr_Xo)
D_mod={}
D={}
for y, x in zip(Y_val, X_val):
    D_mod[y]=x
Arr_X_o=np.array(Arr_Xo)
print(Arr_X_o)
Arr_X=np.array(X_val) #My array of x-values
print(Arr_X)
Arr_Y=np.array(Y_val) #My array of y-values
print(Arr_Y)
m = np.linalg.lstsq(Arr_X_o, Arr_Y)[0]
print(m)
pylab.plot(Arr_X, Arr_Y, 'go')
line=plt.plot(Arr_X, Arr_Y)
line.show()
How can I plot my function without using frange?
My array of x:
[-15.9836388 13.78848867 -3.39805316 12.04429943 -12.34344464
-19.66512508 6.8480724 -5.58674018 7.59985149 11.46357551
-4.96507337 -2.40178658 -1.71320151 -12.87164233 -3.26385184
-7.44683254 5.52525074 -9.16879057 3.70939966 -4.80486815
-10.35409227 6.72283255 2.00436008 8.68484529 -17.81750773]
My array of y:
[ 0.20523902 -1.941802 -1.19527441 0.54952271 -1.66506802 1.72228361
-1.63215286 1.03684409 -1.17406016 0.45373838 0.43538662 -1.15733373
1.94677887 1.44373207 -1.99242991 -0.65576448 -0.31598064 -1.74524107
-1.9352764 1.88232214 -1.58727561 -1.97093284 0.05478352 -1.83473627
1.8227666 ]
I pass all of it to:
line=plt.plot(Arr_X, Arr_Y)
plt.show()
And my function graph doesn't look like 2*sin(2πx).
The problem is that your x values are not in order. plt.plot joins the points in the order they appear in the array, not in order along the x axis, which gives a graph like the one in the question. You can check this by using plt.scatter instead of plt.plot:
This shows that the points you generate have the correct shape, as seen in the leftmost image; the only problem is the order in which the x values are generated.
To get a nice-looking graph, change the way you generate the x values. This can be done with np.linspace (see its documentation).
# for i in range(Points): # First, i generate my values of x
# x = random.uniform(-Interval, Interval)
# X_val.append(x)
# replace the above 3 lines with the one below
X_val = np.linspace(-Interval,Interval,Points)
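If the random sampling has to stay, an alternative is to sort the points by x before drawing the line. The sketch below is only an illustration: the fixed seed, the interval of 2 and the 25 points are arbitrary values chosen for the example, not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed so the example is repeatable
x = rng.uniform(-2, 2, size=25)          # unordered x values, as in the question
y = 2 * np.sin(2 * np.pi * x)

order = np.argsort(x)                    # indices that put x in ascending order
x_sorted, y_sorted = x[order], y[order]  # reorder both arrays the same way

# plt.plot(x_sorted, y_sorted) now joins neighbouring points along the x axis
```

Sorting only fixes the order in which points are joined. With few points over a wide interval the sine is still sampled too coarsely to look smooth, which is another reason to prefer evenly spaced points.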
In addition, there is no need to assign plt.plot to a variable, therefore the last 3 lines of your code should be replaced with:
# pylab.plot(Arr_X, Arr_Y, 'go')
# line=plt.plot(Arr_X, Arr_Y)
# line.show()
# replace the above 3 lines with the two below
pylab.plot(Arr_X, Arr_Y)
plt.show()
This produces the following graph:
I do not see the reason for calling
pylab.plot(Arr_X, Arr_Y, 'go')
as well as
line=plt.plot(Arr_X, Arr_Y)
Why do you need pylab to plot instead of just using pyplot?
Your line.show() in line 63 gives me an attribute error:
"list" object has no attribute "show"
plt.plot returns a list of lines, and only plt has show(), as you can see with print(dir(plt)).
As I am too lazy to go through your full code, stick to this general plotting example:
import matplotlib.pyplot as plt
figure, axis = plt.subplots(figsize=(7.6, 6.1))
for x in range(0, 500):
    axis.plot(x, x*2, 'o-')
plt.show()
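The attribute error mentioned above follows from what plot returns. A small sketch, with made-up data and the hypothetical names returned and line, shows the return value and how to keep a handle to the drawn line when one is needed:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
returned = ax.plot([0, 1, 2], [0, 2, 4], 'go-')  # markers and line in one call

(line,) = returned       # plot returns a list; unpack its single Line2D
line.set_linewidth(2)    # the handle is for styling, not for showing

print(type(returned).__name__, hasattr(returned, "show"))
# In an interactive session, plt.show() displays the figure.
```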
I am new to sympy but I already get a nice output when I plot the implicit function (actually the formula for Cassini's ovals) using sympy:
from sympy import plot_implicit, symbols, Eq, solve
x, y = symbols('x y')
k=2.7
a=3
eq = Eq((x**2 + y**2)**2-2*a**2*(x**2-y**2), k**4-a**4)
plot_implicit(eq)
Now is it actually possible to somehow get the x and y values corresponding to the plot? or alternatively solve the implicit equation without plotting at all?
thanks! :-)
This is an answer addressing your
is it actually possible to somehow get the x and y values corresponding to the plot?
and I say "addressing" because it's not possible to get the x and y values used to draw the curves: the curves are not drawn using a sequence of 2D points. More on this later.
TL;DR
pli = plot_implicit(...)
series = pli[0]
data, action = series.get_points()
data = np.array([(x_int.mid, y_int.mid) for x_int, y_int in data])
Let's start with your code
from sympy import plot_implicit, symbols, Eq, solve
x, y = symbols('x y')
k=2.7
a=3
eq = Eq((x**2 + y**2)**2-2*a**2*(x**2-y**2), k**4-a**4)
and plot it, with a twist: we save the Plot object and print it
pli = plot_implicit(eq)
print(pli)
to get
Plot object containing:
[0]: Implicit equation: Eq(-18*x**2 + 18*y**2 + (x**2 + y**2)**2, -27.8559000000000) for x over (-5.0, 5.0) and y over (-5.0, 5.0)
We are interested in this object indexed by 0,
ob = pli[0]
print(dir(ob))
that gives (ellipsis are mine)
['__class__', …, 'get_points', …, 'var_y']
The name get_points sounds full of promise, doesn't it?
print(ob.get_points())
that gives (edited for clarity and with a big cut)
([
[interval(-3.759774, -3.750008), interval(-0.791016, -0.781250)],
[interval(-3.876961, -3.867195), interval(-0.634768, -0.625003)],
[interval(-3.837898, -3.828133), interval(-0.693361, -0.683596)],
[interval(-3.847664, -3.837898), interval(-0.673830, -0.664065)],
...
[interval(3.837895, 3.847661), interval(0.664064, 0.673830)],
[interval(3.828130, 3.837895), interval(0.683596, 0.693362)],
[interval(3.867192, 3.876958), interval(0.625001, 0.634766)],
[interval(3.750005, 3.759770), interval(0.781255, 0.791021)]
], 'fill')
What is this? The documentation of plot_implicit says
plot_implicit, by default, uses interval arithmetic to plot functions.
Following the source code of plot_implicit.py and plot.py, one realizes that, in this case, the actual plotting (speaking of the matplotlib backend) is just one line of code
self.ax.fill(x, y, facecolor=s.line_color, edgecolor='None')
where x and y are constructed from the list of intervals, as returned from .get_points(), as follows
x, y = [], []
for intervals in interval_list:
    intervalx = intervals[0]
    intervaly = intervals[1]
    x.extend([intervalx.start, intervalx.start,
              intervalx.end, intervalx.end, None])
    y.extend([intervaly.start, intervaly.end,
              intervaly.end, intervaly.start, None])
so that for each couple of intervals matplotlib is directed to draw a filled rectangle, small enough that the eye sees a continuous line (note the use of None to have disjoint rectangles).
We can conclude that the list of couples of intervals
l_xy_intervals = ((pli[0]).get_points())[0]
represents rectangular areas where the implicit expression you are plotting is
"true enough"
You can do this, even with interval math, by taking the midpoint of each interval. Starting from your code, changed slightly to save the plot_implicit object in a variable called g, we have:
from sympy import plot_implicit, symbols, Eq, solve
x, y = symbols('x y')
k=2.7
a=3
eq = Eq((x**2 + y**2)**2-2*a**2*(x**2-y**2), k**4-a**4)
g = plot_implicit(eq)
Now let's save in a variable named ptos the intervals that were used to draw the plot.
ptos = g[0].get_points()[0]
This way ptos[0][0] will be the first interval in the x axis and ptos[0][1] will be its pair in the y axis. The intervals have a property called mid which gives the middle point of the interval. So you can suppose that ptos[0][0].mid, ptos[0][1].mid will be a pair x,y "true enough" to be one of our numerical solutions.
This way, a data frame composed of this middle point pairs can be generated with:
import numpy as np
intervs = np.array(ptos, dtype='object')
meio = lambda x0:x0.mid
px = list(map(meio, intervs[:,0]))
py = list(map(meio, intervs[:,1]))
import pandas as pd
dados = pd.DataFrame({'x':px, 'y':py})
dados.head()
Which in this example would give us:
x y
0 -1.177733 0.598826
1 -1.175389 0.596483
2 -1.175389 0.598826
3 -1.173045 0.596483
4 -1.173045 0.598826
This idea of taking the midpoints of the intervals can be used whenever one needs to move from "interval math" to "standard" point-level math. Hope this helps. Regards.
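On the second part of the question (solving without plotting at all): for this particular equation no plotting machinery seems to be needed, because substituting u = y**2 turns it into a quadratic in u, u**2 + 2*(x**2 + a**2)*u + x**4 - 2*a**2*x**2 - (k**4 - a**4) = 0, whose only root that can be non-negative is u = -(x**2 + a**2) + sqrt(4*a**2*x**2 + k**4). The following is a sketch with NumPy rather than sympy; the function name cassini_y and the grid of 2001 x values are choices made for the illustration.

```python
import numpy as np

a, k = 3.0, 2.7  # same constants as in the question

def cassini_y(x):
    """Return the non-negative branch y(x), NaN where the curve has no point."""
    # Root of the quadratic in u = y**2 that can be non-negative.
    u = -(x**2 + a**2) + np.sqrt(4 * a**2 * x**2 + k**4)
    return np.sqrt(np.where(u >= 0, u, np.nan))

xs = np.linspace(-5, 5, 2001)
ys = cassini_y(xs)

on_curve = ~np.isnan(ys)
# Points on the upper half of the curve; negate y to get the lower half.
points = np.column_stack([xs[on_curve], ys[on_curve]])
```

Unlike the interval midpoints, these points satisfy the equation up to floating-point rounding, but the approach relies on this equation being solvable for y in closed form.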
I have a dataset of three columns and n number of rows. column 1 contains name, column 2 value1, and column 3 value2 (rank2).
I want to plot a scatter plot with the outlier values displaying names.
The R commands I am using are:
tiff('scatterplot.tiff')
data<-read.table("scatterplot_data", header=T)
attach(data)
reg1<-lm(A~B)
plot(A,B,col="red")
abline(reg1)
outliers<-data[which(2^(data[,2]-data[,3]) >= 4 | 2^(data[,2]-data[,3]) <=0.25),]
text(outliers[,2], outliers[,3],labels=outliers[,1],cex=0.50)
dev.off()
and I get a figure like this:
What I want is for the labels in the lower half to be one colour and the labels in the upper half another colour, say green and red respectively.
Any suggestions, or adjustment in the commands?
You already have a logical test that works to your satisfaction. Just use it in the color spec to text:
text(outliers[,2], outliers[,3],labels=outliers[,1],cex=0.50,
     col=ifelse(2^(outliers[,2]-outliers[,3]) >= 4, "blue", "green"))
It's untested of course because you offered no test case, but my reasoning is that every row of 'outliers' satisfies one of the two conditions, so ifelse() should return "blue" for the ratios >= 4 and "green" for the ones <= 0.25, and this should give you the proper alignment of color choices with the rows of 'outliers'.
Using Python, with matplotlib to plot and numpy to fit the data. The trick with numpy is to create an index or mask to filter out the results that you want.
EDIT: Want to selectively color the top and bottom outliers? It's a simple combination of both masks that we created:
import numpy as np
import matplotlib.pyplot as plt
# Create some data
N = 1000
X = np.random.normal(5,1,size=N)
Y = X + np.random.normal(0,5.5,size=N)/np.random.normal(5,.1)
NAMES = ["foo"]*N # Customize names here
# Fit a polynomial
(a,b)=np.polyfit(X,Y,1)
# Find all points above the line
idx = (X*a + b) < Y
# Scatter according to that index
plt.scatter(X[idx],Y[idx], color='r')
plt.scatter(X[~idx],Y[~idx], color='g')
# Find top 10 outliers
err = ((X*a+b) - Y) ** 2
idx_L = np.argsort(err)[-10:]
for i in idx_L:
    plt.text(X[i], Y[i], NAMES[i])
# Color the outliers purple or black
top = idx_L[idx[idx_L]]
bot = idx_L[~idx[idx_L]]
plt.scatter(X[top],Y[top], color='purple')
plt.scatter(X[bot],Y[bot], color='black')
XF = np.linspace(0,10,1000)
plt.plot(XF, XF*a + b, 'k--')
plt.axis('tight')
plt.show()
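The answer above picks outliers by their distance from the fitted line. The rule from the question itself (a ratio 2^(value1 - value2) of at least 4 or at most 0.25) can be written as two masks in the same style. This is a sketch with made-up numbers; the array names and the assignment of red and green to the two groups are assumptions.

```python
import numpy as np

# Made-up example values standing in for columns 2 and 3 of the data set.
value1 = np.array([5.0, 2.0, 3.0, 6.5, 1.0])
value2 = np.array([2.5, 4.5, 3.2, 4.0, 3.0])

ratio = 2.0 ** (value1 - value2)
upper = ratio >= 4       # first group of outliers
lower = ratio <= 0.25    # second group of outliers
is_outlier = upper | lower

# One colour per outlier, in the same order as the outlier rows.
label_colors = np.where(upper[is_outlier], "red", "green")
print(label_colors)
```

Each entry of label_colors can then be passed as the color argument of plt.text when the outlier labels are drawn in a loop.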