I am developing a service that receives a video as a parameter (a Base64-encoded file).
I want to start working with the video from memory since I already have it, and NOT write it to a file and then load it with cv2.VideoCapture().
Is there any way to do this?
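As far as I know, cv2.VideoCapture() cannot open an in-memory buffer directly, so one possible workaround (a sketch only, assuming the PyAV package is acceptable and the container format, e.g. MP4, is seekable) is to decode the frames with PyAV and hand them to OpenCV as numpy arrays:

import base64
import io
import av  # PyAV wraps FFmpeg and accepts file-like objects

def frames_from_base64(b64_video):
    # Decode the Base64 payload into raw bytes and wrap it in a BytesIO object
    video_bytes = base64.b64decode(b64_video)
    container = av.open(io.BytesIO(video_bytes))
    # Yield each video frame as a BGR numpy array, the layout OpenCV expects
    for frame in container.decode(video=0):
        yield frame.to_ndarray(format="bgr24")

The frames_from_base64 name is just for illustration; each yielded array can be used anywhere a frame from cv2.VideoCapture().read() would be.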
The task I want to accomplish is to send a copy of the currently open file to a location on the server, have the fast render farm PC open it, render the file, and then close itself, essentially offloading all hardware-intensive work onto one computer.
I also want to make sure that only one file is rendered/opened at a time.
What do I need to know to accomplish this? How would you go about it? This is about Maya batch rendering (.ma) as well as Nuke files (.nk).
You can try using the socket library (part of the standard library) and the flask library. With them you can establish a connection between two or more PCs.
For Flask, here is a site that can help you:
https://pythonbasics.org/flask-upload-file/#:~:text=It%20is%20very%20simple%20to,it%20to%20the%20required%20location.
For socket, here is another site:
https://www.thepythoncode.com/article/send-receive-files-using-sockets-python
And if you search on Google or YouTube you can find many tutorials about it.
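As a minimal sketch of the Flask side (the /upload route, the "scene" form field, and the incoming/ folder are placeholders of mine, not from the tutorial), the render machine could accept scene files like this:

import os
from flask import Flask, request
from werkzeug.utils import secure_filename

app = Flask(__name__)
UPLOAD_DIR = "incoming"  # hypothetical drop folder that the render farm PC watches
os.makedirs(UPLOAD_DIR, exist_ok=True)

@app.route("/upload", methods=["POST"])
def upload():
    # Expect the scene file (.ma or .nk) in the "scene" form field
    f = request.files["scene"]
    f.save(os.path.join(UPLOAD_DIR, secure_filename(f.filename)))
    return "received", 200

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)

Rendering only one file at a time could then be handled by a simple queue or lock on the render machine that picks up files from that folder sequentially.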
Background
I finally convinced someone to share his full archival node database (5868 GiB) for free (it now has to be built in RAM, which requires about $100,000 worth of RAM, but it can be run from an SSD once built).
However, he only wants to send it as a single tar file over raw TCP, using a rather slow (400 Mbps) connection for this task.
I need to get it onto Dropbox, and he doesn't want to use the https://www.dropbox.com/request/[my upload key here] page that allows uploading files through a web browser without a Dropbox account (it really annoyed him when I suggested using another method or compressing the database, to the point that he is on the verge of changing his mind about sharing it).
On my side, Dropbox allows using 10 TiB of storage for free for 30 days, and I haven't received the required SSD yet (once it arrives I will be able to download the database at a faster speed).
The problem
I'm fully aware of upload file to my dropbox from python script, but in my case the file doesn't fit into a memory buffer, nor even on disk.
And previously, in API v1, it wasn't possible to append data to an existing file (but I didn't find the answer for v2).
To upload a large file to the Dropbox API using the Dropbox Python SDK, you would use upload sessions to upload it in pieces. There's a basic example here.
Note that the Dropbox API only supports files up to 350 GB though.
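A rough sketch of what an upload session looks like with the v2 Python SDK, reading the incoming tar stream in fixed-size chunks (the token, destination path, and chunk size are placeholders, and src can be any file-like object, such as the socket the tar arrives on):

import dropbox

CHUNK = 64 * 1024 * 1024  # 64 MiB per append call (keep it under the per-request limit)
dbx = dropbox.Dropbox("YOUR_ACCESS_TOKEN")  # placeholder access token

def upload_stream(src, dest_path="/archival-node.tar"):
    first = src.read(CHUNK)
    session = dbx.files_upload_session_start(first)
    cursor = dropbox.files.UploadSessionCursor(session_id=session.session_id,
                                               offset=len(first))
    commit = dropbox.files.CommitInfo(path=dest_path)
    while True:
        chunk = src.read(CHUNK)
        if not chunk:
            # No more data: finish the session and commit the file
            return dbx.files_upload_session_finish(b"", cursor, commit)
        dbx.files_upload_session_append_v2(chunk, cursor)
        cursor.offset += len(chunk)

Because the data is appended chunk by chunk, neither the whole tar nor any large part of it ever has to sit in memory or on local disk.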
I would like to run a program on my laptop (Gazebo simulator) and send a stream of image data to a GCE instance, where it will be run through an object-detection network and sent back to my laptop in near real-time. Is such a set-up possible?
My best idea right now is, for each image:
Save the image as a JPEG on my personal machine
Stream the JPEG to a Cloud Storage bucket
Access the storage bucket from my GCE instance and transfer the file to the instance
In my Python script, convert the JPEG image to a numpy array and run it through the object detection network
Save the detection results in a text file and transfer to the Cloud Storage bucket
Access the storage bucket from my laptop and download the detection results file
Convert the detection results file to a numpy array for further processing
This seems like a lot of steps, and I am curious if there are ways to speed it up, such as reducing the number of save and load operations or transporting the image in a better format.
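For the Cloud Storage round trip (steps 2, 5, and 6 above), a minimal sketch with the google-cloud-storage client could look like this (the bucket and object names are made up):

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-detection-bucket")  # hypothetical bucket name

# Laptop side: push the latest frame up
bucket.blob("frames/frame_0001.jpg").upload_from_filename("frame_0001.jpg")

# GCE side: pull the frame, run detection, push the results back
bucket.blob("frames/frame_0001.jpg").download_to_filename("/tmp/frame_0001.jpg")
bucket.blob("results/frame_0001.txt").upload_from_string("x1,y1,x2,y2,label")

# Laptop side: fetch the detection results
results = bucket.blob("results/frame_0001.txt").download_as_bytes()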
If your question is "is it possible to set up such a system and do those actions in near real time?" then I think the answer is yes. If your question is "how can I reduce the number of steps in doing the above?" then I am not sure I can help and will defer to one of the experts on here - I can't wait to hear the answer!
I have implemented a system that I think is similar to what you describe for researching Forex trading algorithms (e.g. I upload data to storage from my laptop, Compute Engine workers pull the data and work on it, they post results back to storage, and I download the compiled results from my laptop).
I used the Google Pub/Sub architecture - apologies if you have already read up on this. It allows near-real-time messaging between programs. For example, you can have code looping on your laptop that scans a folder for new images. When they appear it automatically uploads the files to a bucket, and once they're in the bucket it can send a message to the instance(s) telling them that there are new files to process, or you can use the "change notification" feature of Google Storage buckets. The instances can do the work, send the results back to storage, and send a notification to the code running on your laptop that the work is done and the results are available for pick-up.
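For illustration, a bare-bones publish/subscribe pair with the google-cloud-pubsub client might look like this (the project, topic, and subscription IDs are placeholders):

from google.cloud import pubsub_v1

PROJECT = "my-project"          # placeholder project ID
TOPIC = "new-frames"            # placeholder topic
SUBSCRIPTION = "frame-worker"   # placeholder subscription

# Laptop side: announce that a new image has landed in the bucket
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(PROJECT, TOPIC)
publisher.publish(topic_path, b"frames/frame_0001.jpg").result()

# Instance side: react to those announcements as they arrive
subscriber = pubsub_v1.SubscriberClient()
sub_path = subscriber.subscription_path(PROJECT, SUBSCRIPTION)

def callback(message):
    print("process", message.data.decode())  # run detection on this object here
    message.ack()

streaming_pull = subscriber.subscribe(sub_path, callback=callback)
streaming_pull.result()  # block and keep receiving messages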
Note that I set this up for my project above and encountered problems to the point that I gave up on Pub/Sub. The reason was that the Python client library for Pub/Sub only supports 'asynchronous' message pulls, which seems to mean that the subscribers will pull multiple messages from the queue and process them in parallel. There are some features to help manage 'flow control' of messages built into the API, but even with them implemented I couldn't get it to work the way I wanted. For my particular application I wanted to process everything in order, one file at a time, because it was important to me to be clear about what the instance is doing and the order it's doing it in. There are several threads on Stack Overflow and Google Groups that discuss workarounds for this using queues, classes, allocating specific tasks to specific instances, etc., which I tried, but even these presented problems for me. Some of these links are:
"Run synchronous pull in PubSub using Python client API" and "pubsub problems pulling one message at a time", and there are plenty more if you would like them!
You may find that if processing an image is relatively quick, order isn't too important, and you don't mind an instance working on multiple things in parallel, then my problems don't really apply to your case.
FYI, I ended up just making a simple loop on my 'worker instances' that scans the 'task list' bucket every 30 seconds or whatever to look for new files to process, but obviously this isn't quite the real-time approach that you were originally looking for. Good luck!
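That worker loop can be as simple as something like the following sketch with the google-cloud-storage client (the bucket name and prefix are made up):

import time
from google.cloud import storage

client = storage.Client()
seen = set()

while True:
    # List everything under the hypothetical "tasks/" prefix and handle new arrivals
    for blob in client.list_blobs("my-task-bucket", prefix="tasks/"):
        if blob.name in seen or blob.name.endswith("/"):
            continue  # already processed, or just a folder placeholder
        blob.download_to_filename("/tmp/" + blob.name.split("/")[-1])
        seen.add(blob.name)
        # ... process the downloaded file here ...
    time.sleep(30)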
I'm working on a simple app that takes images, optimizes them, and saves them in cloud storage. I found an example that takes the file and uses PIL to optimize it. The code looks like this:
def inPlaceOptimizeImage(photo_blob):
    blob_key = photo_blob.key()
    new_blob_key = None
    img = Image.open(photo_blob.open())
    output = StringIO.StringIO()
    img.save(output, img.format, optimized=True, quality=90)
    opt_img = output.getvalue()
    output.close()
    # Create the file
    file_name = files.blobstore.create(mime_type=photo_blob.content_type)
    # Open the file and write to it
    with files.open(file_name, 'a') as f:
        f.write(opt_img)
    # Finalize the file. Do this before attempting to read it.
    files.finalize(file_name)
    # Get the file's blob key
    return files.blobstore.get_blob_key(file_name)
This works fine locally (although I don't know how well it's being optimized, because when I run the uploaded image through something like http://www.jpegmini.com/ it still gets reduced by 2.4x). However, when I deploy the app and try uploading images, I frequently get 500 errors and these messages in the logs:
F 00:30:33.322 Exceeded soft private memory limit of 128 MB with 156 MB after servicing 7 requests total
W 00:30:33.322 While handling this request, the process that handled this request was found to be using too much memory and was terminated. This is likely to cause a new process to be used for the next request to your application. If you see this message frequently, you may have a memory leak in your application.
I have two questions:
Is this even the best way to optimize and save images in cloud storage?
How do I prevent these 500 errors from occurring?
Thanks in advance.
The error you're experiencing is due to the memory limits of your instance class.
What I would suggest is editing your app.yaml file to configure your module and specify an instance class of F2 or higher.
In case you are not using modules, you should also add “module: default” at the beginning of your app.yaml file to let GAE know that this is your default module.
You can take a look at this article from the docs to see the different instance classes available and how to easily configure them.
Another, more basic workaround would be to limit the image size when uploading, but you will eventually run into a similar issue.
Regarding the previous matter and a way to optimize your images, you may want to take a look at the App Engine Images API, which provides the ability to manipulate image data using a dedicated Images service. In your case, you might like the "I'm Feeling Lucky" transformation. By using this API you might not need to upgrade your instance class.
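As a rough, untested sketch (the function name is mine), using the Images service instead of PIL could look like this:

from google.appengine.api import images

def optimizeWithImagesService(photo_blob):
    # Reference the blob by key so the decoding and transformation happen in the
    # dedicated Images service rather than in PIL inside this instance.
    img = images.Image(blob_key=photo_blob.key())
    img.im_feeling_lucky()  # the "I'm Feeling Lucky" transformation
    return img.execute_transforms(output_encoding=images.JPEG, quality=90)

The returned bytes would then be written back out to storage the same way as in your current code.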
I'm working on a project that involves streaming .OGG (or .mp3) files from my webserver. I'd prefer not to have to download the whole file and then play it; is there a way to do that in pure Python (no GStreamer, since I'm hoping to make it truly cross-platform)? Is there a way to use urllib to download the file a chunk at a time and load that into, say, PyGame to do the actual audio playing?
Thanks!
Assuming your server supports Range requests, you ask the server via the Range header for the start byte and end byte of the range you want:
import urllib2
req = urllib2.Request(url)
req.headers['Range'] = 'bytes=%s-%s' % (startByte, endByte)
f = urllib2.urlopen(req)
f.read()
You can implement a file object that always downloads just the needed chunk of the file from the server. Almost every library accepts a file object as input.
It will probably be slow because of network latency. You would need to download bigger chunks of the file, preload the next chunk in a separate thread, etc. In other words, you would need to implement the streaming client logic yourself.
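A bare-bones sketch of such a file object (Python 2, to match the snippet above; buffering, error handling, and seeking are omitted):

import urllib2

class HttpChunkFile(object):
    """Minimal read-only file-like object that fetches data via HTTP Range requests."""

    def __init__(self, url, chunk_size=64 * 1024):
        self.url = url
        self.chunk_size = chunk_size
        self.pos = 0

    def read(self, size=None):
        size = size or self.chunk_size
        req = urllib2.Request(self.url)
        req.headers['Range'] = 'bytes=%d-%d' % (self.pos, self.pos + size - 1)
        data = urllib2.urlopen(req).read()
        self.pos += len(data)
        return data

# Example: feed it to a decoder that accepts file-like objects
# stream = HttpChunkFile('http://example.com/song.ogg')
# header = stream.read(4096)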