I have created an RTSP client in Python that receives an H.264 stream and returns single raw H.264 frames as binary strings. I am trying to process each H.264 frame on the fly.
I have unsuccessfully tried several ways to convert such a frame into a numpy array for processing.
So far I know that cv2.VideoCapture only accepts a file name as its argument, not a frame nor a StringIO object (a file-like pointer to a buffer), but I need to pass it my string.
I have also tried something like:
nparr = np.fromstring(frame_bin_str, np.uint8)
img_np = cv2.imdecode(nparr, cv2.CV_LOAD_IMAGE_COLOR)
I tried different flags, but that also failed.
After many other failed attempts, I ran out of ideas.
To summarize what I need to do: I have a raw H.264 frame in a variable, and I need to create a valid OpenCV numpy array from it, or somehow end up with a VideoCapture object containing that single frame, so that I can process it.
Any pointers would be much appreciated.
Hope this all makes sense.
Thank you in advance
As Micka suggested, there is no support for the raw H.264 format in OpenCV, so we have to convert it ourselves.
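If the frames really are encoded H.264 NAL units, one workaround (a sketch under my own assumptions, not something covered in the original answer) is to hand the bytes to a standalone decoder such as PyAV and get numpy arrays back:

import av
import cv2

# Sketch only: assumes frame_bin_str holds raw Annex-B H.264 NAL units,
# as described in the question.
codec = av.CodecContext.create("h264", "r")

def decode_h264(data):
    # Yield BGR numpy arrays (OpenCV's channel order) decoded from raw bytes
    for packet in codec.parse(data):
        for frame in codec.decode(packet):
            yield frame.to_ndarray(format="bgr24")

for img_np in decode_h264(frame_bin_str):
    cv2.imshow("decoded", img_np)
    cv2.waitKey(1)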
If instead the frames are uncompressed raw pixel data, you should reshape nparr to the shape of the incoming image; imdecode is not necessary. Use imshow to display the result and verify.
Here is the code I used to convert a 16-bit RAW (grayscale) image in a similar way; I renormalize the image before displaying it.
import cv2
import numpy as np

# Interpret the byte string as 16-bit pixels and reshape to the frame size
framenp = np.frombuffer(framestr, dtype=np.uint16).reshape((1024, 1280))
# Renormalize to float in [0, 1] for display (astype avoids reinterpreting
# the raw buffer, which is what assigning to .dtype would do)
framenp = framenp.astype(np.float64) / framenp.max()
cv2.imshow('frame', cv2.resize(framenp, (640, 480)))
cv2.waitKey(0)
I'm using a Windows build of gphoto2 to generate a byte stream. I take the byte stream, look for the JPEG header (ff d8) and footer (ff d9), and display a single image from the stream. Whenever I pass the parsed byte string into imdecode, it returns None, even though I pass in all of the data including the ff d8 / ff d9 markers.
import subprocess as sp

import cv2
import numpy as np

pipe = sp.Popen('gphoto2 --stdout-size --capture-movie', stdout=sp.PIPE)
founda = False
foundb = False
bytesl = b''
while True:
    bytesl = bytesl + pipe.stdout.readline()
    if not founda:                        # 'not', not bitwise '~'
        a = bytesl.find(b'\xff\xd8')      # JPEG start
        if a != -1:
            bytesl = bytesl[a:]           # only slice once the marker exists
            founda = True
    if founda and not foundb:
        b = bytesl.find(b'\xff\xd9')      # JPEG end
        if b != -1:
            foundb = True
    if founda and foundb:
        jpgi = bytesl[:b + 2]
        imfbuffh = cv2.imdecode(np.frombuffer(jpgi, dtype=np.uint8),
                                cv2.IMREAD_COLOR)
        break
I keep getting nothing back from imdecode and I'm not sure why; the parsing appears to extract the byte string correctly. Any help would be greatly appreciated.
Edit:
Something else I've noticed: if I just read a JPG from a file, np.shape on the array from np.frombuffer reports something like (140000, 1), whereas when I build the array from the byte string I get (140000,). I've tried expanding the dimensions, but that didn't work.
Edit2:
Well, I realized that the header of the MJPEG stream is not just a standard JPEG header. I'm not sure how to convert it to the standard format. If anyone has any tips, that would be great.
Edit3:
I simplified the code to just read the pipe data and write it to a file.
I have two test cases: one where I use --capture-movie 2 and one where I use --capture-image-and-download, so that in the first case I capture 2 frames of MJPEG data and in the second I capture 1 frame of JPEG data. I tried to display the data for both cases with my previous code, and it failed to display the image even when I simply waited for stdout to finish rather than reading the data in real time.
Here is the code I used just to write the bytes to a file. In my previous edit I was recording the byte string from a print statement (stupid, I know; I'm not very good at this). It should be noted that I think these byte strings need to be decoded.
import subprocess as sp

pipe = sp.Popen('gphoto2 --stdout-size --capture-movie 2', stdout=sp.PIPE)
pipedata = pipe.stdout.read()
with open('C:\\Users\\Work\\frame2out.txt', 'wb') as f:
    f.write(pipedata)
Attached are links to the two cases.
2 Frames from --capture-movie
https://www.dropbox.com/s/3wvyg8s1tflzwaa/frame2out.txt?dl=0
Bytes from --capture-image-and-download
https://www.dropbox.com/s/3arozhvfz6a77lr/imageout.txt?dl=0
In OpenCV it is possible to save an image to disk with a certain JPEG compression. Is there also a way to do this in memory? Or should I write a function using cv2.imwrite() that writes the file to disk, loads it back, and removes it again? If anyone knows a better way, that is also fine.
The use case is real-time data augmentation; using something other than OpenCV would likely add unnecessary overhead.
Example of the desired function: im = cv2.imjpgcompress(90)
You can use imencode:
encode_param = [int(cv2.IMWRITE_JPEG_QUALITY), 90]
result, encimg = cv2.imencode('.jpg', img, encode_param)
(The default value for IMWRITE_JPEG_QUALITY is 95.)
You can decode it back with:
decimg = cv2.imdecode(encimg, 1)
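For a complete picture, here is a minimal in-memory round trip (the input file name is a placeholder of mine):

import cv2

img = cv2.imread('input.png')  # hypothetical source image

# Encode to an in-memory JPEG buffer at quality 90; nothing touches disk
encode_param = [int(cv2.IMWRITE_JPEG_QUALITY), 90]
result, encimg = cv2.imencode('.jpg', img, encode_param)
assert result, 'imencode failed'

# Decode straight back from the buffer
decimg = cv2.imdecode(encimg, cv2.IMREAD_COLOR)
print(encimg.nbytes, decimg.shape)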
What I mean by a binary string is the raw content of an image file (that's what wand.image.make_blob() returns).
Is there a way to load it in OpenCV?
Edit:
cv2.imdecode() doesn't work
img = cv2.imdecode( buf=wand_img.make_blob(), flags=cv2.IMREAD_UNCHANGED)
TypeError: buf is not a numpy array, neither a scalar
Have you tried cv2.imdecode, which takes an image buffer and turns it into a CvMat object? Though I am not sure about this one.
See : http://docs.opencv.org/3.0-beta/modules/imgcodecs/doc/reading_and_writing_images.html
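Building on that suggestion, a sketch of the missing step: imdecode needs a numpy array rather than bytes, so wrap the blob with np.frombuffer first (wand_img is assumed to be an existing wand.image.Image):

import cv2
import numpy as np

# make_blob() returns the encoded file bytes; imdecode wants a uint8 array
buf = np.frombuffer(wand_img.make_blob(), dtype=np.uint8)
img = cv2.imdecode(buf, cv2.IMREAD_UNCHANGED)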
I am streaming some data down from a webcam. When I get all of the bytes for a full image (in a string called byteString) I want to display the image using OpenCV. Done fast enough, this will "stream" video from the webcam to an OpenCV window.
Here's what I've done to set up the window:
cvNamedWindow('name of window', CV_WINDOW_AUTOSIZE)
And here's what I do when the byte string is complete:
img = cvCreateImage(IMG_SIZE,PIXEL_DEPTH,CHANNELS)
buf = ctypes.create_string_buffer(byteString)
img.imageData = ctypes.cast(buf, ctypes.POINTER(ctypes.c_byte))
cvShowImage('name of window', img)
cvWaitKey(0)
For some reason this is producing an error:
File "C:\Python26\lib\site-packages\ctypes_opencv\highgui_win32.py", line 226, in execute
return func(*args, **kwargs)
WindowsError: exception: access violation reading 0x015399E8
Does anybody know how to do what I'm trying to do / how to fix this crazy violation error?
I actually solved this problem and forgot to post the solution. Here's how I did it, though it may not be entirely robust:
I analyzed the headers coming from the network camera's MJPEG stream, then read from the stream one byte at a time; when I detected that the header of the next image had appeared in the byte string, I cut the last 42 bytes off (since that's the length of the header).
That left me with the bytes of one JPEG, so I simply created a new Cv image by passing the byte string, wrapped in a StringIO object, to the open(...) method.
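A rough modern sketch of that approach, with the same caveat that it is not entirely robust (my assumptions: the per-frame header is 42 bytes as above, stream is a file-like object, and a second JPEG start marker is a usable sign that the next frame's header has begun; cv2.imdecode stands in for the PIL/StringIO step):

import cv2
import numpy as np

HEADER_LEN = 42  # this particular camera's per-frame header length

def read_mjpeg_frame(stream):
    # Accumulate bytes until the next frame's header shows up, then decode
    data = b''
    while True:
        data += stream.read(1)
        if data.count(b'\xff\xd8') >= 2:  # the next frame has started
            jpeg_bytes = data[:-HEADER_LEN]  # cut the trailing header bytes
            return cv2.imdecode(np.frombuffer(jpeg_bytes, np.uint8),
                                cv2.IMREAD_COLOR)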
Tyler:
I'm not sure what you are trying to do; I have a few guesses.
If you are simply trying to read images from a webcam connected to your PC, then this code should work:
import cv

cv.NamedWindow("camera", 1)
capture = cv.CaptureFromCAM(0)
while True:
    img = cv.QueryFrame(capture)
    cv.ShowImage("camera", img)
    if cv.WaitKey(10) == 27:
        break
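For reference, a sketch of the same loop with the modern cv2 API (my addition; the answer above uses the legacy cv bindings):

import cv2

cap = cv2.VideoCapture(0)
while True:
    ok, img = cap.read()
    if not ok:
        break
    cv2.imshow("camera", img)
    if cv2.waitKey(10) == 27:  # Esc quits, matching the loop above
        break
cap.release()
cv2.destroyAllWindows()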
Are you trying to stream video from an internet cam?
If so, you should check this other post:
opencv-with-network-cameras
If for some reason you cannot do it in any of these ways, then maybe you can just save the image to the hard drive and then load it in your OpenCV program with a simple cvLoadImage (of course, this way is much slower).
Another approach would be to set the new image's pixels by hand, reading each value from the byteString, something like this:
for (int x = 0; x < 640; x++) {
    for (int y = 0; y < 480; y++) {
        uchar *pixelxy = &((uchar*) (img->imageData + img->widthStep * y))[x];
        *pixelxy = buf[y * img->widthStep + x];
    }
}
This is also slow, but still faster than going through the hard drive.
Anyway, I hope some of this helps; you should also specify which OpenCV version you are using.
I'm trying to write a video application in PyQt4, and I've used Python ctypes to hook into a legacy video decoder library. The library gives me 32-bit ARGB data, and I need to turn that into a QImage. I've got it working as follows:
# Copy the rgb image data from the pointer into the buffer
memmove(self.rgb_buffer, self.rgb_buffer_ptr, self.buffer_size)
# Copy the buffer to a python string
imgdata = ""
for a in self.rgb_buffer:
    imgdata = imgdata + a
# Create a QImage from the string data
img = QImage(imgdata, 720, 288, QImage.Format_ARGB32)
The problem is that ctypes gives me the data as type "ctypes.c_char_Array_829440", and I need to turn it into a Python string so that I can construct a QImage. My copying mechanism is currently taking almost 300 ms per image, so it's painfully slow; the decode and display part of the process takes only about 50 ms.
Can anyone think of any cunning shortcuts I can take to speed up this process and avoid the need to copy the buffer twice as I'm currently doing?
The ctypes.c_char_Array_829440 instance has the property .raw, which returns a string possibly containing NUL bytes, and the property .value, which returns the string up to the first NUL byte if it contains one or more.
However, you can also use ctypes to access the string at self.rgb_buffer_ptr directly, like this:
ctypes.string_at(self.rgb_buffer_ptr, self.buffer_size); this would avoid the need for the memmove call.
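A minimal sketch of that fast path, assuming it runs inside the same method as the snippet in the question (so self.rgb_buffer_ptr and self.buffer_size exist as above):

import ctypes
from PyQt4.QtGui import QImage

# Read the whole buffer straight from the pointer in one call:
# no memmove into an intermediate buffer, no per-byte Python loop
imgdata = ctypes.string_at(self.rgb_buffer_ptr, self.buffer_size)
img = QImage(imgdata, 720, 288, QImage.Format_ARGB32)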