I'm trying to make labels for some 3d objects, with an icon/triangle to show you where the object is, plus some text describing what the object is.
Essentially, I want to 1. display text using pyopengl and 2. have the text + icon stay at a constant size on the screen that can 3. still move around the screen.
(I looked around a bit and looked at orthographic projections, but I'm not sure that's what I should be using...)
I have not used opengl very much at all, so this might be a dumb question!
Any help is much appreciated.
A nice start would be to store your icon + text inside a quad.
There are plenty of good tutorials on "font rendering" which will help you to create the desired quad.
You can load your icon into an OpenGL texture. Once that is done, you will have to create your quad and associate a vertex and fragment shader with it. The tricky part will be giving your quad a fixed on-screen position that is not linked to your 3D scene.
In video games, when you want to draw the UI or the HUD (health bar, minimap), you draw them at the end, on top of everything else. Those elements don't need the model-view-projection matrices you might be familiar with. All the magic happens in the vertex shader, which is responsible for setting the position of all your elements; the output of the vertex shader should be in the [-1, 1] range for all coordinates:
-1 -> left, bottom, near
1 -> right, top, far
We call this space NDC (normalized device coordinates); see the diagram at https://antongerdelan.net/opengl/raycasting.html
Your job will be to output values in this range. If you want your quad to be half the width and a quarter of the height of the screen, centered in the middle, you can store this information directly in the vertices you send to your shader.
GLfloat vertices[] = {-0.5, -0.25, 0,   // bottom left corner
                      -0.5,  0.25, 0,   // top left corner
                       0.5,  0.25, 0,   // top right corner
                       0.5, -0.25, 0};  // bottom right corner
Vertex shader (people might instead use pixel-sized quads with an orthographic projection):
in vec3 pos;
in vec2 uv;
out vec2 fuv;

void main()
{
    // a translation uniform (e.g. "uniform vec2 uTranslation") could move the quad on screen at runtime
    gl_Position = vec4(pos.x, pos.y, -1.0, 1.0); // set it to the near plane (-1)
    fuv = uv;
}
Fragment shader:
in vec2 fuv;
out vec4 color;
uniform sampler2D renderedTexture;

void main()
{
    color = texture(renderedTexture, fuv);
}
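A minimal PyOpenGL sketch of that drawing order, assuming a compiled shader program label_program (the two shaders above), a quad VAO label_vao holding the NDC positions and UVs, an icon texture icon_tex, and a draw_scene() helper for the normal 3D pass (all of these names are placeholders, not from the original post):

from OpenGL.GL import *

def draw_frame():
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT)

    draw_scene()                    # regular 3D pass, with your usual MVP matrices

    # label/HUD pass: drawn last, on top of everything
    glDisable(GL_DEPTH_TEST)        # so the label is never hidden by scene geometry
    glUseProgram(label_program)
    glActiveTexture(GL_TEXTURE0)
    glBindTexture(GL_TEXTURE_2D, icon_tex)
    glBindVertexArray(label_vao)    # quad with NDC positions + UVs
    glDrawArrays(GL_TRIANGLE_FAN, 0, 4)
    glBindVertexArray(0)
    glUseProgram(0)
    glEnable(GL_DEPTH_TEST)         # restore state for the next frame

To make the label follow a 3D object while keeping a constant size, project the object's world position to NDC on the CPU each frame and pass it to the vertex shader as a translation uniform (the uTranslation idea mentioned in the comment above).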
Related
I have an object with 2 codes (text) printed on it. The text is curved: half of the text is on the top side and the other half is on the bottom side of the object. Here is my sample image.
I am using OpenCV, deep learning approaches and Tesseract to OCR its code.
In the logical (non-deep) approach I first used HoughCircles() and logPolar() to align the text in a line, and then used Tesseract as in this example sample code. But because of the distortion in the aligned text, Tesseract fails to OCR it.
In the deep approach I can't find an optimal solution for curved-text OCR in TensorFlow or Torch. There are many sources for text detection, but not for recognition.
Regards, John
Why not transform the circular text to linear? It is similar to De-skew characters in binary image, just a bit more complicated. So detect (or manually select) the center of the circle and convert the image to an unrotated one ...
Create a new image with dimensions 6.28*max_radius x 2*max_radius and copy the pixels using polar unwrapping ... simply convert each target pixel position into polar coordinates and convert those into the Cartesian source pixel position.
I do not code in Python nor OpenCV but here is a simple C++ example of this:
//---------------------------------------------------------------------------
picture pic0,pic1;  // pic0 - original input image, pic1 - output
//---------------------------------------------------------------------------
void ExtractCircularText(int x0,int y0) // pic0 -> pic1, center = (x0,y0)
{
    int x,y,xx,yy,RR;
    float fx,fy,r,a,R;
    // resize target image
    x=       -x0; y=       -y0; a=sqrt((x*x)+(y*y)); R=a;
    x=pic0.xs-x0; y=       -y0; a=sqrt((x*x)+(y*y)); if (R<a) R=a;
    x=       -x0; y=pic0.ys-y0; a=sqrt((x*x)+(y*y)); if (R<a) R=a;
    x=pic0.xs-x0; y=pic0.ys-y0; a=sqrt((x*x)+(y*y)); if (R<a) R=a;
    R=ceil(R); RR=R;
    pic1.resize((628*RR)/100,RR<<1);
    for (yy=0;yy<pic1.ys;yy++)
     for (xx=0;xx<pic1.xs;xx++)
     {
        // pic1 position xx,yy -> polar coordinates a,r
        a=xx; a/=R; r=yy;
        // a,r -> pic0 position
        fx=r*cos(a); x=x0+fx;
        fy=r*sin(a); y=y0+fy;
        // copy pixel
        if ((x>=0)&&(x<pic0.xs))
         if ((y>=0)&&(y<pic0.ys))
         {
            pic1.p[          yy][pic1.xs-1-xx]=pic0.p[y][x]; // 2 mirrors as the text is not uniformly oriented
            pic1.p[pic1.ys-1-yy][          xx]=pic0.p[y][x];
         }
     }
    pic1.save("out.png");
}
//---------------------------------------------------------------------------
I use my own picture class for images so some members are:
xs,ys is size of image in pixels
p[y][x].dd is pixel at (x,y) position as 32 bit integer type
clear(color) clears entire image with color
resize(xs,ys) resizes image to new resolution
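Since the question itself uses Python/OpenCV, roughly the same unwrapping can be done with cv2.warpPolar (available in OpenCV 3.4+) instead of hand-written loops. A minimal sketch, with a made-up file name and a manually picked center:

import cv2
import numpy as np

img = cv2.imread("disc.jpg")              # hypothetical input image
x0, y0 = 470, 445                         # manually picked circle center (example values)

# radius large enough to reach the farthest image border from the center
h, w = img.shape[:2]
max_radius = int(np.ceil(np.hypot(max(x0, w - x0), max(y0, h - y0))))

# warpPolar puts radius along x and angle along y, so make the height ~ 2*pi*R
dsize = (max_radius, int(2 * np.pi * max_radius))
polar = cv2.warpPolar(img, dsize, (x0, y0), max_radius,
                      cv2.INTER_LINEAR + cv2.WARP_POLAR_LINEAR)

# rotate so the text runs horizontally, and add a 180-degree flipped copy
# because the two texts on the disc are oriented opposite to each other
unwrapped = cv2.rotate(polar, cv2.ROTATE_90_COUNTERCLOCKWISE)
flipped = cv2.flip(unwrapped, -1)
cv2.imwrite("out.png", np.vstack([unwrapped, flipped]))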
And finally the resulting image:
I made 2 copies of the unrotated image (hence the 2*max_radius height) so I can copy the image in 2 modes, making both orientations of the text readable (as they are mirrored to each other).
The text will be straighter if you choose the center (x0,y0) more precisely. I just clicked it with the mouse on the center of the circle, but I doubt the center of the text has the same center as that circle/disc. After some clicking this is the best center I could find:
The result suggests that neither of the two texts nor the disc share the same center ...
The quality of the input image is not good; you should improve it before doing this (maybe even binarization is a good idea). Also, storing it as JPG is not a good idea, as its lossy compression adds more noise to it. Take a look at these:
Enhancing dynamic range and normalizing illumination
OCR and character similarity
PS. The center could be computed geometrically from the selected text (arc): simply find the most distant points on it (its ends) and the point in the middle between them on the arc. From those you can compute the arc center and radius ... or even fit it ...
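For the "compute it from the arc" idea, a small Python sketch: take the two end points of the text arc and the point in the middle of the arc (the coordinates below are made up) and compute the circumcenter of the triangle they form:

import numpy as np

def circle_from_3_points(p1, p2, p3):
    # center and radius of the circle passing through three 2D points
    ax, ay = p1
    bx, by = p2
    cx, cy = p3
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return (ux, uy), float(np.hypot(ax - ux, ay - uy))

# two end points of the text arc and its middle point (made-up pixel coordinates)
center, radius = circle_from_3_points((120, 300), (400, 80), (680, 310))
print(center, radius)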
The black dot is a perfect feature for centering, and the polar unwarping seems to work fine; the deformation of the characters is negligible.
The failure of Tesseract might be explained by the low image quality (blur).
I am using VTK in Python to import .stl files. What I then want to do is scale down the mesh, making it smaller, without changing the orientation matrix.
I tried vtkTransform with a scale tuple, but the problem is that the scaled polydata is getting rotated.
Here is the code:
import vtk

def scaleSTL(filenameSTL, opacity=0.75, scale=(1, 1, 1), mesh_color="gold"):
    colors = vtk.vtkNamedColors()

    reader = vtk.vtkSTLReader()
    reader.SetFileName(filenameSTL)
    reader.Update()

    transform = vtk.vtkTransform()
    transform.Scale(scale)

    transformFilter = vtk.vtkTransformPolyDataFilter()
    transformFilter.SetInputConnection(reader.GetOutputPort())
    transformFilter.SetTransform(transform)
    transformFilter.Update()

    mapper = vtk.vtkPolyDataMapper()
    mapper.SetInputConnection(transformFilter.GetOutputPort())

    actor = vtk.vtkActor()
    actor.SetMapper(mapper)
    actor.GetProperty().SetColor(colors.GetColor3d(mesh_color))
    actor.GetProperty().SetOpacity(opacity)
    return actor

def render_scene(my_actor_list):
    renderer = vtk.vtkRenderer()
    for arg in my_actor_list:
        renderer.AddActor(arg)
    namedColors = vtk.vtkNamedColors()
    renderer.SetBackground(namedColors.GetColor3d("SlateGray"))

    window = vtk.vtkRenderWindow()
    window.SetWindowName("Oriented Cylinder")
    window.AddRenderer(renderer)

    interactor = vtk.vtkRenderWindowInteractor()
    interactor.SetRenderWindow(window)

    # Visualize
    window.Render()
    interactor.Start()

if __name__ == "__main__":
    filename = "400_tri.stl"
    scale01 = (1, 1, 1)
    scale02 = (0.5, 0.5, 0.5)
    my_list = []
    my_list.append(scaleSTL(filename, 0.75, scale01, "Gold"))
    my_list.append(scaleSTL(filename, 0.75, scale02, "DarkGreen"))
    render_scene(my_list)
I used my mesh file kidney.stl (the yellow one), but what I get is a scaled and translated mesh. I set the opacity to 0.75 to see both meshes. In the picture below you can see that the green one is completely moved, but I want to scale it so that the green one sits completely inside the original yellow mesh.
Simple answer (no explanation) can be found here: Scaling 3D models, finding the origin
That is because the scaling transformation is defined simply as multiplying the coordinates by a given factor (see e.g. https://www.tutorialspoint.com/computer_graphics/3d_transformation.htm). This intrinsically means that it is done with respect to a certain reference point. Your transform.Scale() call will use the origin (0,0,0) as this reference point and since your object is apparently not centered around origin, you get the translation (not rotation as you claim btw).
To get a locally centered scaling, you need to choose a reference point R on your object around which you want to scale (in your case, since you want the scaled object to be inside the original, you want some kind of center - since the object is "almost convex", centroid - average of all points - could be good enough). Translate the object by -R to align it with the coordinate system, scale and then translate back by +R.
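In VTK terms, a minimal sketch of that idea applied to the question's pipeline, using the bounding-box center from GetCenter() as the reference point R (with vtkTransform's default premultiply mode, the call written first is applied to the points last):

import vtk

reader = vtk.vtkSTLReader()
reader.SetFileName("400_tri.stl")
reader.Update()

cx, cy, cz = reader.GetOutput().GetCenter()   # reference point R (bounding-box center)

transform = vtk.vtkTransform()
transform.Translate(cx, cy, cz)      # 3. move the object back to its original place
transform.Scale(0.5, 0.5, 0.5)       # 2. scale about the origin
transform.Translate(-cx, -cy, -cz)   # 1. move the reference point to the origin

transformFilter = vtk.vtkTransformPolyDataFilter()
transformFilter.SetInputConnection(reader.GetOutputPort())
transformFilter.SetTransform(transform)
transformFilter.Update()

With a transform like this in scaleSTL, the half-size green mesh should end up inside the yellow one instead of being shifted away.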
Try a little exercise to visualize this: simple 2D example - draw yourself a square made of points with coordinates (2,2), (2,3), (3,3), (3,2) and "scale it by 2" - you get (4,4), (4,6),(6,6), (6,4) - draw it as well. Now try the alternative - first translate by the square's center (2.5,2.5), you get (-0.5,-0.5), (-0.5,0.5), (0.5,0.5), (0.5,-0.5) (draw it), scale by two, you get (-1,-1), (-1, 1), (1,1), (1,-1) (draw) and finally translate back by 2.5: (1.5, 1.5), (1.5,3.5), (3.5,3.5), (3.5, 1.5) and draw - see the difference?
I am troubleshooting a problem with my code: if the depth value of any primitive is not zero, it will not render on the screen. I suspect that it gets clipped away.
Is there an easy, pythonic way to set my clipping planes in pyglet?
This is my code so far:
import pyglet
from pyglet.gl import *
import pywavefront
from camera import FirstPersonCamera

def drawloop(win, camera):
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT)
    #glClearColor(255,255,255,255)
    glLoadIdentity()
    camera.draw()
    pyglet.graphics.draw(2, pyglet.gl.GL_POINTS,
        ('v3f', (10.0, 15.0, 0.0, 30.0, 35.0, 150.0))
    )
    glPointSize(20.)
    return pyglet.event.EVENT_HANDLED

def main():
    win = pyglet.window.Window()
    win.set_exclusive_mouse(True)
    win.clear()
    camera = FirstPersonCamera(win)

    @win.event
    def on_draw():
        drawloop(win, camera)

    def on_update(delta_time):
        camera.update(delta_time)

    pyglet.clock.schedule(on_update)
    pyglet.app.run()

if __name__ == '__main__':
    main()
I am using the FirstPersonCamera snippet from here:
https://gist.github.com/mr-linch/f6dacd2a069887a47fbc
I am troubleshooting a problem with my code: if the depth value of any primitive is not zero, it will not render on the screen. I suspect that it gets clipped away.
You have to set up a projection matrix to solve the issue. Either set up an orthographic projection matrix or a perspective projection matrix.
The projection matrix describes the mapping from the 3D points of a scene to 2D points on the viewport. It transforms from eye space to clip space, and the coordinates in clip space are transformed to normalized device coordinates (NDC) by dividing by the w component of the clip coordinates. The NDC are in the range (-1,-1,-1) to (1,1,1). Any geometry which is outside the clip space is clipped.
With orthographic projection, the coordinates in view space are linearly mapped to clip space coordinates, and the clip space coordinates are equal to the normalized device coordinates, because the w component is 1 (for a Cartesian input coordinate).
The values for left, right, bottom, top, near and far define a box. All the geometry which is inside the volume of the box is "visible" on the viewport.
With perspective projection, the projection matrix describes the mapping from 3D points in the world, as they are seen from a pinhole camera, to 2D points on the viewport. The eye space coordinates in the camera frustum (a truncated pyramid) are mapped to a cube (the normalized device coordinates).
To set a projection matrix the projection matrix stack has to be selected by glMatrixMode.
An orthographic projection can be set by glOrtho:
w, h = 640, 480 # default pyglet window size
glMatrixMode(GL_PROJECTION)
glLoadIdentity()
glOrtho( -w/2, w/2, -h/2, h/2, -1000.0, 1000.0) # [near, far] = [-1000, 1000]
glMatrixMode(GL_MODELVIEW)
....
A perspective projection can be set by gluPerspective:
w, h = 640, 480 # default pyglet window size
glMatrixMode(GL_PROJECTION)
glLoadIdentity()
gluPerspective( 90.0, 640.0/480, 0.1, 1000.0) # fov = 90 degrees; [near, far] = [0.1, 1000]
glMatrixMode(GL_MODELVIEW)
....
I recommend using the following coordinates to "see" the points in both of the above cases:
e.g.:
pyglet.graphics.draw(2, pyglet.gl.GL_POINTS,
('v3f', (-50.0, -20.0, -200.0, 40.0, 20.0, -250.0)))
glPointSize(20.0)
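For the question's code, one way to wire this in is to set the projection once right after the window is created (assuming the FirstPersonCamera from the gist only touches the modelview matrix) — a sketch using the perspective variant:

def setup_projection(win):
    # map the camera frustum to clip space; geometry outside [0.1, 1000] is clipped
    glMatrixMode(GL_PROJECTION)
    glLoadIdentity()
    gluPerspective(90.0, win.width / float(win.height), 0.1, 1000.0)
    glMatrixMode(GL_MODELVIEW)

def main():
    win = pyglet.window.Window()
    win.set_exclusive_mouse(True)
    setup_projection(win)        # projection set once, before the draw loop starts
    camera = FirstPersonCamera(win)
    ...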
I'm developing a 2D game engine, using PyOpenGL.
For consistency with previous versions of my engine, which used SDL, the graphic elements are first stored in a VBO with a 2D coordinate system where (0, 0) is top-left and (640, 448) is bottom-right (so the y axis is reversed). Let's call these SDL coordinates.
Since my graphics use palette effects, I rendered them with shaders. My vertex shader simply converts my 2D coordinate system to the [-1, 1] cube.
Now, I need to clip the display. My first idea was to do it via the shader, by sending all vertices outside the clipping zone to a point outside the [-1, 1] cube (I took (2.0, 0.0, 0.0, 1.0)), but it went wrong: it deformed square tiles which had some of their edges outside the clipping zone but not all.
So I am considering glFrustum, but I don't understand in which coordinate system I must specify the parameters.
In fact, I tried more or less anything as parameters without noticing any change when running the code. What am I doing wrong?
For the moment, my drawing routine looks like this:
def draw(self):
    glClearColor(1.0, 1.0, 0.0, 1.0)
    glClear(GL_COLOR_BUFFER_BIT)

    glEnable(GL_TEXTURE_2D)
    glActiveTexture(GL_TEXTURE0)
    glBindTexture(GL_TEXTURE_2D, self.v_texture)

    glEnable(GL_TEXTURE_1D)
    glActiveTexture(GL_TEXTURE1)

    shaders.glUseProgram(self.shaders_program)
    shaders.glUniform1i(self.texture_uniform_loc, 0)
    shaders.glUniform1i(self.palette_uniform_loc, 1)
    shaders.glUniform2f(self.offset_uniform_loc, 0, 0)
    shaders.glUniform4f(self.color_uniform_loc, 1, 1, 1, 1)

    # Draw layers
    for layer in self.layers:  # [0:1]:
        layer.draw()

    shaders.glUseProgram(0)
    pygame.display.flip()
In class Layer:
def draw(self):
    glFrustum(0.0, 0.5, 0.0, 0.5, 0.1, 1.0)  # I tried anything here...

    # offset is an offset to add to coordinates (in SDL coordinates)
    shaders.glUniform2f(self.vdp.offset_uniform_loc, self.x, self.y)
    # color is likely irrelevant here
    shaders.glUniform4f(self.vdp.color_uniform_loc, *self.color_modifier)

    glBindTexture(GL_TEXTURE_1D, self.palette.get_id())

    self.vbo.bind()
    glEnableClientState(GL_VERTEX_ARRAY)
    glEnableClientState(GL_TEXTURE_COORD_ARRAY)
    glVertexPointer(3, GL_FLOAT, 20, self.vbo)
    glTexCoordPointer(2, GL_FLOAT, 20, self.vbo + 12)
    glDrawArrays(GL_QUADS, 0, len(self.vbo))
    self.vbo.unbind()
    glDisableClientState(GL_TEXTURE_COORD_ARRAY)
    glDisableClientState(GL_VERTEX_ARRAY)
Note: I must say that I'm new to OpenGL. I learnt by reading tutorials and was quite confused by the 'old' and 'new' OpenGL.
I felt like glFrustum was more 'old' OpenGL, like much of the transformation matrix manipulation (most of it can be handled by vertex shaders). I may be totally wrong about that, and glFrustum (or something else) may be unavoidable in my case. I'd like to read an article about what can be totally forgotten in 'old' OpenGL.
Unless you are using the built-in matrices (gl_ModelViewProjectionMatrix etc.) in your shaders, glFrustum won't do anything. If you aren't using those matrices, don't start using them now (you are correct that this is part of 'old' OpenGL).
It sounds like you want to use glScissor which defines a clipping rectangle in window coordinates (note that the origin of these is at the lower-left of the window). You have to enable it with glEnable(GL_SCISSOR_TEST).
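A minimal sketch of how that could look in the engine (assuming the 640x448 window from the question and a clip rectangle given in the engine's top-left SDL coordinates, so y has to be flipped for glScissor's lower-left origin):

def set_clip_rect(x, y, w, h, window_height=448):
    # restrict rendering to a rectangle given in top-left (SDL-style) coordinates
    glEnable(GL_SCISSOR_TEST)
    glScissor(x, window_height - y - h, w, h)   # glScissor expects a lower-left origin

def clear_clip_rect():
    glDisable(GL_SCISSOR_TEST)

# hypothetical usage inside Layer.draw(), around the draw call:
#     set_clip_rect(64, 32, 256, 192)
#     glDrawArrays(GL_QUADS, 0, len(self.vbo))
#     clear_clip_rect()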
As far as articles about what can be totally forgotten in 'old' OpenGL: Googling "Modern OpenGL", or "OpenGL deprecated" should give you a few starting points.
Overview:
I am trying to create a 3D application similar to this:
www.youtube.com/watch?v=h9kPI7_vhAU.
I am using OpenCV2.2, Python2.7 and pyOpenGL.
This can be achieved by this background maths and code snippet, where x, y, z are the positions of the viewer's eye (as grabbed from a webcam!).
Issue:
When I do this, the object (a cube) that I have rendered becomes stretched along the z axis (into the screen), and I'm not too sure why. It looks like looking down at a very tall skyscraper from above (as opposed to a cube). The cube's position changes very rapidly in the z direction as the z position of the eye changes. This is a frame of the result; it has been stretched!
Code (with bigD's edit):
def DrawGLScene():
    # get some parameters for calculating the FRUSTUM
    NEAR_CLIPPING_PLANE = 0.01
    FAR_CLIPPING_PLANE = 2
    window = glGetIntegerv(GL_VIEWPORT)
    WINDOW_WIDTH = window[2]
    WINDOW_HEIGHT = window[3]

    # do facial detection and get eye co-ordinates
    eye = getEye()

    # clear window
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT)

    # before any projection transformation command come these 2 lines:
    glMatrixMode(GL_PROJECTION)
    glLoadIdentity()

    # transform projection to that of our eye
    glFrustum(NEAR_CLIPPING_PLANE*(-WINDOW_WIDTH /2 - eye[0])/eye[2],
              NEAR_CLIPPING_PLANE*( WINDOW_WIDTH /2 - eye[0])/eye[2],
              NEAR_CLIPPING_PLANE*(-WINDOW_HEIGHT/2 - eye[1])/eye[2],
              NEAR_CLIPPING_PLANE*( WINDOW_HEIGHT/2 - eye[1])/eye[2],
              NEAR_CLIPPING_PLANE, FAR_CLIPPING_PLANE)

    glMatrixMode(GL_MODELVIEW)
    glLoadIdentity()
    glTranslatef(-eye[0], -eye[1], -eye[2])

    drawCube()
    glutSwapBuffers()
an example of the data getEye() returns is:
[0.25, 0.37, 1] if the viewer has their face near the lower left of the screen and is 1 m away
[-0.5, -0.1, 0.5] if the viewer has their face near the upper right of the screen and is 0.5 m away
The cube when drawn has height, width, depth of 2 and its centre at (0,0,0).
I will provide the full code if anyone wants to do a similar project and wants a kickstart or thinks that the issue lies somewhere else than code provided.
The reason why you're getting strange results is because of this:
glTranslatef(-eye[0],-eye[1],-eye[2])
This call should be made after
glMatrixMode(GL_MODELVIEW)
glLoadIdentity()
The projection matrix is complete as it is after your glFrustum call; if you multiply it by a translation matrix, it won't be a pure perspective projection matrix anymore. The modelview matrix has to describe all world AND camera transformations.
Also bear in mind that if the only transformation you do on your modelview matrix is a translation, then you will always be staring down the negative-Z axis.
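Schematically, with that fix the per-frame matrix setup separates into the following (a reduced sketch of the question's DrawGLScene; left/right/bottom/top stand for the four eye-dependent glFrustum arguments shown above):

# projection matrix: only the eye-dependent off-axis frustum
glMatrixMode(GL_PROJECTION)
glLoadIdentity()
glFrustum(left, right, bottom, top, NEAR_CLIPPING_PLANE, FAR_CLIPPING_PLANE)

# modelview matrix: all camera AND world transformations, including the eye translation
glMatrixMode(GL_MODELVIEW)
glLoadIdentity()
glTranslatef(-eye[0], -eye[1], -eye[2])
drawCube()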