I'm using a Windows build of gphoto2 to generate a byte stream. I take the byte stream, look for the JPEG header (ff d8) and footer (ff d9) markers, and try to display a single image from the stream. Whenever I pass the parsed byte string into imdecode it returns None, even though I pass all of the data, including the ff d8 / ff d9 markers, into imdecode.
import subprocess as sp
import cv2
import numpy as np

pipe = sp.Popen('gphoto2 --stdout-size --capture-movie', stdout=sp.PIPE)
founda = False
foundb = False
bytesl = b''
while True:
    bytesl = bytesl + pipe.stdout.readline()
    if ~founda:
        a = bytesl.find(b'\xff\xd8')  # JPEG start
        bytesl = bytesl[a:]
        if a != -1:
            founda = True
    if founda and ~foundb:
        b = bytesl.find(b'\xff\xd9')  # JPEG end
        if a != -1 and b != -1:
            foundb = True
    if founda and foundb:
        jpgi = bytesl[:b+2]
        imfbuffh = cv2.imdecode(np.frombuffer(jpgi, dtype=np.uint8), cv2.IMREAD_COLOR)
I keep getting nothing from imdecode and I'm not sure why. The byte string appears to correctly parse the data. Any help would be greatly appreciated.
Edit:
Something else I've noticed: if I read a JPG from a file and call np.shape on the buffer object, I get something like (140000, 1), whereas when I call np.shape on the buffer built from the byte string I get (140000,). I've tried expanding the dimensions but that didn't work.
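For reference, a minimal check, assuming a local file named test.jpg (a placeholder), showing that cv2.imdecode accepts a flat (N,) uint8 buffer, so the shape difference alone shouldn't be what breaks the decode:

import numpy as np
import cv2

# Read raw JPEG bytes from a file (test.jpg is just a placeholder name)
with open('test.jpg', 'rb') as f:
    jpg_bytes = f.read()

buf = np.frombuffer(jpg_bytes, dtype=np.uint8)   # shape is (N,), one dimension
img = cv2.imdecode(buf, cv2.IMREAD_COLOR)        # decodes fine from a 1-D buffer
print(buf.shape, None if img is None else img.shape)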
Edit2:
Well, I realized that the header for the MJPEG is not just a standard JPEG header. I'm not sure how to convert it to the standard format. If anyone has any tips, that would be great.
Edit3:
I simplified the output-and-write-to-file code to just read the pipe data.
I have two test cases: one where I use --capture-movie 2 and one where I use --capture-image-and-download, so in the first case I capture 2 frames of MJPEG data and in the second I capture 1 frame of JPEG data. I tried to display the data for both cases with my previous code and it failed to display the image, even if I just wait for stdout to finish rather than reading the data in real time.
Here is the code I used just to write the bytes to a file. In my previous comment I was just recording the byte string from a print statement (stupid, I know; I'm not very good at this). It should be noted that I think these byte strings need to be decoded.
import subprocess as sp

pipe = sp.Popen('gphoto2 --stdout-size --capture-movie 2', stdout=sp.PIPE)
pipedata = pipe.stdout.read()
with open('C:\\Users\\Work\\frame2out.txt', 'wb') as f:
    f.write(pipedata)
Attached are links to the two cases.
2 Frames from --capture-movie
https://www.dropbox.com/s/3wvyg8s1tflzwaa/frame2out.txt?dl=0
Bytes from --capture-image-and-download
https://www.dropbox.com/s/3arozhvfz6a77lr/imageout.txt?dl=0
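For context, here is a minimal sketch of the marker-scanning approach that reads fixed-size binary chunks from the pipe instead of readline(). It assumes the stream is plain concatenated JPEG frames, which, per Edit2, may not hold for gphoto2's movie output:

import subprocess as sp
import cv2
import numpy as np

# Sketch only: assumes stdout contains plain JPEG frames back to back.
pipe = sp.Popen('gphoto2 --stdout-size --capture-movie', stdout=sp.PIPE)
buf = b''
while True:
    chunk = pipe.stdout.read(4096)   # read raw binary chunks, not lines
    if not chunk:
        break
    buf += chunk
    start = buf.find(b'\xff\xd8')    # JPEG SOI marker
    end = buf.find(b'\xff\xd9', start + 2) if start != -1 else -1  # JPEG EOI marker
    if start != -1 and end != -1:
        frame = buf[start:end + 2]
        buf = buf[end + 2:]          # keep the remainder for the next frame
        img = cv2.imdecode(np.frombuffer(frame, dtype=np.uint8), cv2.IMREAD_COLOR)
        if img is not None:
            cv2.imshow('frame', img)
            cv2.waitKey(1)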
Related
I want to play a base64-encoded sound in Python. I've tried using pygame.mixer, but all I get is a hiss of white noise.
This is an example of my code:
import pygame
coinflip = b'data:audio/ogg;base64,T2dnUwACAAAAAAAAAACYZ...' # Truncated for brevity
flip = pygame.mixer.Sound(coinflip)
ch = flip.play()
while ch.get_busy():
    pygame.time.wait(100)
The pygame mixer works well if I import a wav/mp3/ogg file, but I want to write a compact self-contained program that doesn't need external files, so I'm trying to embed a base64 encoded version of the sound in the Python code.
NB: The solution doesn't need to use pygame, but that would be preferable since I'm already using it elsewhere in the program.
The reason you hear white noise is that you are trying to play audio data with a different encoding than expected.
I think the documentation is not 100% clear about this, but it states that a Sound object represents actual sound sample data. It can be loaded from a file or a buffer. Apparently, when using a buffer, it does expect raw sample data, not some base64-encoded data (and not even raw MP3 or OGG file data).
Note that there has been an issue reported about this on the GitHub repository.
So there are two things you can do:
Get the raw bytes of your sound (e.g. using pygame.mixer.Sound(filename).get_raw(), or for simple sounds you could create them mathematically) and encode that in base64 format.
Wrap the original (MP3/OGG encoded) file data in a BytesIO object, which is a file-like object, so the Sound module will treat it like a file and properly decode it.
Note that in both cases, you still need to base64-decode the data first! The pygame module doesn't automatically do that for you.
Since you want a small file, option 2 is the best. But I'll give examples of both solutions.
Example 1
If you have the raw sample data, you could use that directly as the buffer argument for pygame.mixer.Sound(). Note that the sample data must match the frequency, bit size and number of channels used by the mixer. The following is a small example that plays a 400 Hz sine wave tone.
import base64
import pygame
# The following bytes object consists of 160 unsigned 8-bit samples,
# which are base64 encoded. When played at 8000 Hz, it results in a
# tone of 400 Hz. The duration of the sound is 0.02 seconds, so it
# should be looped 50 times per second for longer sounds.
base64_encoded_sound_data = b'''
gKfK5vj/+ObKp39YNRkHAQcZNViAp8rm+P/45sqngFg1GQcBBxk1WI
Cnyub4//jmyqeAWDUZBwEHGTVYf6fK5vj/+ObKp39YNRkHAQcZNViA
p8rm+P/45sqngFg1GQcBBxk1WH+nyub4//jmyqd/WDUZBwEHGTVYf6
fK5vj/+ObKp39YNRkHAQcZNViAp8rm+P/45sqnf1g1GQcBBxk1WA==
'''
pygame.mixer.init(frequency=8000, size=8, channels=1, allowedchanges=0)
sound_data = base64.b64decode(base64_encoded_sound_data)
sound = pygame.mixer.Sound(sound_data)
ch = sound.play(loops=50)
while ch.get_busy():
    pygame.time.wait(100)
Example 2
If you want to use an MP3 or OGG file (which is generally much smaller), you could do it like in the following example:
import base64
import io
import pygame
# Your base64-encoded data here.
# NOTE: Do NOT include the "data:audio/ogg;base64," part.
base64_encoded_sound_file_data = b'T2dnUwACAAAAAAAAAACY...' # Truncated for brevity
pygame.mixer.init()
sound_file_data = base64.b64decode(base64_encoded_sound_file_data)
assert sound_file_data.startswith(b'OggS') # just to prove it is an Ogg Vorbis file
sound_file = io.BytesIO(sound_file_data)
# The following line will only work with VALID data. With above example data it will fail.
sound = pygame.mixer.Sound(sound_file)
ch = sound.play()
while ch.get_busy():
    pygame.time.wait(100)
I would have preferred to use real data in this example as well, but the smallest useful Ogg file I could find was 9 kB, which would add about 120 long lines of data, and I don't think that is appropriate for a Stack Overflow answer. But if you replace it with your own data (which is hopefully a valid Ogg audio file), it should work.
import wave
import struct

s1 = wave.open('sound.wav', 'r')
frames_1 = []
for i in range(44100):
    frames_1.append(struct.unpack('<h', s1.readframes(1)))
Basically, I am trying to unpack s1.readframes(1), which should give an integer value between roughly -32000 and 32000. readframes() returns a bytes object, and the unpack call gives me the message "struct.error: unpack requires a buffer of 2 bytes".
I am not sure if I am even supposed to use struct.unpack(), or if there is something else I should do. I assume so, because when writing frames to a .wav file I first have to pack the values to bytes with struct.pack('<h', value) and then call writeframesraw(), but I am not sure I am using unpack() correctly. If I change '<h' to something else, it just increases the number of bytes in the error message. Ideally I want to use only the wave library for this code, but if there are other libraries that can read and write .wav files easily, please recommend them.
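A minimal sketch, assuming sound.wav is a 16-bit PCM file, that reads one sample per frame while accounting for the channel count (the error usually means readframes(1) returned a buffer that isn't exactly 2 bytes, e.g. for a stereo file or past the end of the file):

import wave
import struct

s1 = wave.open('sound.wav', 'rb')
nchannels = s1.getnchannels()    # e.g. 2 for a stereo file
sampwidth = s1.getsampwidth()    # bytes per sample; 2 means 16-bit audio
nframes = s1.getnframes()

# This sketch assumes sampwidth == 2 (16-bit samples).
frames_1 = []
for _ in range(nframes):
    frame = s1.readframes(1)     # one frame = sampwidth * nchannels bytes
    # '<h' unpacks one signed 16-bit value; unpack one value per channel
    samples = struct.unpack('<' + 'h' * nchannels, frame)
    frames_1.append(samples[0])  # keep only the first channel
s1.close()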
I have a service that sends text to an external text-to-speech service, which returns audio in the response. This is how I access the audio:
import io
import requests

res = requests.get(TTS_SERVICE_URL, params={"text": text_to_synth})
bytes_content = io.BytesIO(bytes(res.content))
audio = bytes_content.getvalue()
Now I would like to send multiple lines of text in different requests, receive all the audio content as bytes, merge them into one audio clip, and then play it. Can anyone guide me on how to merge the bytes_content into one audio byte stream?
I got this to work. Posting the answer here in case someone else faces the same problem; I solved it as follows.
Read the bytes_content into a numpy array using soundfile:
data, samplerate = sf.read(bytes_content)
datas.append(data)
where datas is a list to which each file to be concatenated is appended.
Then combine the files:
combined = np.concatenate(datas)
and convert back to a byte stream if needed
out = io.BytesIO()
sf.write(out, combined, samplerate=samplerate, format="wav")
I am pretty sure that this isn't the right way to do things, but this is what worked for me
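Putting the pieces together, a rough sketch of the whole loop might look like this; TTS_SERVICE_URL and the list of lines are placeholders taken from the question:

import io
import numpy as np
import requests
import soundfile as sf

TTS_SERVICE_URL = "http://localhost:5000/tts"             # hypothetical endpoint
lines_to_synth = ["First sentence.", "Second sentence."]  # placeholder input

datas = []
samplerate = None
for text in lines_to_synth:
    res = requests.get(TTS_SERVICE_URL, params={"text": text})
    bytes_content = io.BytesIO(res.content)
    data, samplerate = sf.read(bytes_content)   # decode the response into samples
    datas.append(data)

combined = np.concatenate(datas)                # join all clips end to end

out = io.BytesIO()                              # back to an in-memory byte stream
sf.write(out, combined, samplerate=samplerate, format="wav")
wav_bytes = out.getvalue()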
I have created an RTSP client in Python that receives an h264 stream and returns single raw h264 frames as binary strings. I am trying to process each h264 frame on the fly.
I have unsuccessfully tried several ways to convert this frame into a numpy array for processing.
So far I know that cv2.VideoCapture only accepts a file name as its argument, not a frame nor a StringIO object (a file-like pointer to a buffer), but I need to pass it my string.
I have also tried something like:
nparr = np.fromstring(frame_bin_str, np.uint8)
img_np = cv2.imdecode(nparr, cv2.CV_LOAD_IMAGE_COLOR)
I tried different flags, but this also failed miserably.
After many other failed attempts, I ran out of ideas.
To summarize what I need to do: I have a raw h264 frame in a variable and I need to create an OpenCV-valid numpy array from it, or somehow end up with a VideoCapture object containing that single frame, so I can process the frame.
Any pointers would be much appreciated.
Hope this all makes sense.
Thank you in advance
As Micka suggested, there is no support for h264 RAW format in OpenCV and we should convert it ourselves.
I think you should be reshaping the nparr to the shape of the incoming image; it is not necessary to do imdecode. Use imshow to display the result and verify.
Here is the code I used to convert a 16-bit RAW (grayscale) image in a similar way. I renormalized my image before displaying it.
import numpy as np
import cv2

# framestr holds the raw bytes of one 16-bit grayscale frame
framenp = np.frombuffer(framestr, dtype=np.uint16).reshape((1024, 1280))
# renormalize to float in [0, 1] so imshow can display it
framenp = framenp.astype(np.float64) / framenp.max()
cv2.imshow('frame', cv2.resize(framenp, (640, 480)))
cv2.waitKey(1)  # give the window a chance to refresh
I'm experiencing problems with the image not being encoded properly in my custom multipart/form-data POST.
After sending out the HTTP POST, I noticed the bytes representing the image are completely different. I did this comparison by capturing packets for a working scenario (using the web browser) and for my Python app.
There are no other issues with how the multipart body is constructed; it's just that the image is not being encoded properly for the body.
Here's what I did to open the image and prep it to be sent out:
image_data = open('plane.jpg', mode='rb').read()  ## image_data is the JPEG in bytes ------------- first bytes
body.append(str(image_data))  ## converting the data to a string so that it can be appended to the body list ------ bytes to string
body.append(CRLF)
body.append('--' + boundary + '--')
body.append(CRLF)
body = ''.join(body)
## starting the post
unicode_data = body.encode('utf-8', errors='ignore')  ## -------- string encoded
multipart_header['content-length'] = len(unicode_data)
req = urllib.request.Request('http://localhost/api/image/upload', data=unicode_data, headers=multipart_header)  ## Packet sent here; the image section of unicode_data looks wrong, but the other sections look good.
Image being uploaded: http://tinypic.com/view.php?pic=5aq3w6&s=6
So what is the correct way to encode this image and append it to the body to be sent? I don't want to use any APIs other than the ones that come with Python 3.3, and would like to stay within urllib / urllib2.
I tried appending the bytes version of the image to the body, but apparently a list of strings can only be joined into a string, which is why I created a new string from the image bytes; I think this is where it goes downhill.
Thanks, help is much appreciated!
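One way to fix this with only the standard library is to build the multipart body as bytes from the start, so the raw JPEG data never goes through str() or a UTF-8 re-encode. A minimal sketch, where the boundary value, the field name "file", and the localhost URL are assumptions based on the question:

import urllib.request

boundary = 'XXXXXXXXXXXXXXXXXX'  # any string that does not occur in the data
CRLF = b'\r\n'

with open('plane.jpg', 'rb') as f:
    image_data = f.read()        # raw JPEG bytes, never converted to str

body_parts = [
    b'--' + boundary.encode('ascii'),
    b'Content-Disposition: form-data; name="file"; filename="plane.jpg"',
    b'Content-Type: image/jpeg',
    b'',                         # blank line separates part headers from content
    image_data,
    b'--' + boundary.encode('ascii') + b'--',
    b'',
]
body = CRLF.join(body_parts)

headers = {
    'Content-Type': 'multipart/form-data; boundary=' + boundary,
    'Content-Length': str(len(body)),
}
req = urllib.request.Request('http://localhost/api/image/upload',
                             data=body, headers=headers)
response = urllib.request.urlopen(req)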