I'm trying to save an image after processing it with OpenCV on a Python Bluemix server, but an error occurs. I'm reading a local image that I upload when pushing the app, but I can't write to the same location.
image = cv2.imread('static/images/image.jpg') #works
# do some process to the image...
cv2.imwrite('static/images/processed.jpg', processed_image) #works locally but not on the server
Do I need some permissions? A special folder? Is the server read-only?
Thanks in advance
I would probably look at using object storage for the image. This will give you more options if you later want to split your code into microservices. Writing to local storage is also not advised (see the sketch after the quoted guidance below):
Avoid Writing to the Local File System
Applications running on Cloud Foundry should not write files to the local file system for the following reasons:
Local file system storage is short-lived. When an application instance crashes or stops, the resources assigned to that instance are reclaimed by the platform, including any local disk changes made since the app started. When the instance is restarted, the application will start with a new disk image. Although your application can write local files while it is running, the files will disappear after the application restarts.
Instances of the same application do not share a local file system. Each application instance runs in its own isolated container. Thus a file written by one instance is not visible to other instances of the same application. If the files are temporary, this should not be a problem. However, if your application needs the data in the files to persist across application restarts, or the data needs to be shared across all running instances of the application, the local file system should not be used. We recommend using a shared data service like a database or blobstore for this purpose.
Source: https://docs.cloudfoundry.org/devguide/deploy-apps/prepare-to-deploy.html#filesystem
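A minimal sketch of what the object storage approach could look like, assuming IBM Cloud Object Storage through its S3-compatible API; the endpoint, bucket name, and credentials below are placeholders, and the processed image is encoded in memory rather than written to disk:
import cv2
import boto3

image = cv2.imread('static/images/image.jpg')
# ... process the image into processed_image ...
processed_image = image

# Encode the result in memory instead of writing to the ephemeral filesystem.
ok, buffer = cv2.imencode('.jpg', processed_image)

# Placeholder endpoint and credentials for an S3-compatible object store.
s3 = boto3.client(
    's3',
    endpoint_url='https://s3.us-south.cloud-object-storage.appdomain.cloud',
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
)
s3.put_object(Bucket='my-images', Key='processed.jpg', Body=buffer.tobytes())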
Related
I have a Flask application deployed on GKE, duplicated in several pods.
For performance issues, I want this app to read local files from the app repository.
I need to update these files frequently with another Python script.
I could implement an endpoint to fetch the refreshed files from a server, but if I ping this endpoint only one pod will update its local files, and I need all app instances to read from the latest data files.
Do you have an idea on how to solve this issue?
There are multiple solutions; the best one is to set up a shared NFS volume for those files: the Flask pods mount it read-only, and the Python script mounts it read-write and updates the files. A sketch of how each side could use the mount follows the reference link below.
An NFS volume allows an existing NFS (Network File System) share to be mounted into a Pod. Unlike emptyDir, which is erased when a Pod is removed, the contents of an NFS volume are preserved and the volume is merely unmounted. This means that an NFS volume can be pre-populated with data, and that data can be shared between pods.
See the reference link for creating a Kubernetes NFS volume on GKE and its advantages.
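A minimal sketch of how the two sides could use the share, assuming it is mounted at /mnt/shared-data (read-only in the Flask pods, read-write in the updater pod); the path and filename are placeholders:
import json
from pathlib import Path

SHARED_DIR = Path('/mnt/shared-data')
DATA_FILE = SHARED_DIR / 'data.json'

# Flask pods: read the latest data on each request (read-only mount).
def load_latest_data():
    with open(DATA_FILE) as f:
        return json.load(f)

# Updater script (read-write mount): write to a temp file, then swap it in
# atomically so readers never see a half-written file.
def refresh_data(new_data):
    tmp = SHARED_DIR / 'data.json.tmp'
    tmp.write_text(json.dumps(new_data))
    tmp.replace(DATA_FILE)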
I have a small Node.js / Express app deployed to Heroku.
I'd like to use a lightweight database like NeDB to persist some data. Is it possible to periodically back up / copy a file from Heroku if I use this approach?
File-based databases aren't a good fit for Heroku due to its ephemeral filesystem:
Each dyno gets its own ephemeral filesystem, with a fresh copy of the most recently deployed code. During the dyno’s lifetime its running processes can use the filesystem as a temporary scratchpad, but no files that are written are visible to processes in any other dyno and any files written will be discarded the moment the dyno is stopped or restarted. For example, this occurs any time a dyno is replaced due to application deployment and approximately once a day as part of normal dyno management.
Depending on your use case I recommend using a client-server database (this looks like a good fit here) or something like Amazon S3 for file storage.
What is the best method to grab files from a Windows shared folder on the same network?
Typically, I am extracting data from SFTPs, SalesForce, or database tables, but there are a few cases where end-users need to upload a file to a shared folder that I have to retrieve. My process up to now has been to have a script running on a Windows machine which just grabs any new/changed files and loads them to an SFTP, but that is not ideal. I can't monitor it in my Airflow UI, I need to change my password on that machine physically, mapped network drives seem to break, etc.
Is there a better method? I'd rather the ETL server handle all of this stuff.
Airflow is installed on remote Linux server (same network)
Windows folders are just standard UNC paths where people have access based on their NT ID. These users are saving files which I need to retrieve. These users are non-technical and did not want WinSCP installed to share the data through an SFTP instead or even a Sharepoint (where I could use Shareplum, I think).
I would like to avoid mounting these folders and instead use Python scripts to simply copy the files I need as per an Airflow schedule
Best if I can save my NT ID and password within an Airflow connection to access it with a conn_id
If I'm understanding the question correctly, the shared folder is on a Windows machine on your network, not on the server where your Airflow install is running. Is it possible to access the shared folder from that server instead?
I think a file sensor would work for your use case.
If you could auto-sync the shared folder to a cloud file store like S3, then you could use the standard S3KeySensor and S3PrefixSensor that are commonly used. I think this would simplify your solution, as you wouldn't have to be concerned with whether the machine(s) the tasks are running on have access to the folder.
Here are two examples of software that syncs a local folder on Windows to S3. Note that I haven't used either of them personally.
https://www.cloudberrylab.com/blog/how-to-sync-local-folder-with-amazon-s3-bucket-with-cloudberry-s3-explorer/
https://s3browser.com/amazon-s3-folder-sync.aspx
That said, I do think using FTPHook.retrieve_file is a reasonable solution if you can't have your files in cloud storage.
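A minimal sketch of the FTPHook approach, assuming an Airflow connection named my_ftp_conn that stores your credentials and placeholder remote/local paths; import paths differ between Airflow versions, so adjust to yours:
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.ftp.hooks.ftp import FTPHook

def fetch_file():
    # Host and credentials live in the Airflow connection, not in the code.
    hook = FTPHook(ftp_conn_id='my_ftp_conn')
    hook.retrieve_file('/incoming/report.csv', '/tmp/report.csv')

with DAG(
    dag_id='fetch_shared_file',
    start_date=datetime(2023, 1, 1),
    schedule_interval='@daily',
    catchup=False,
) as dag:
    PythonOperator(task_id='fetch_file', python_callable=fetch_file)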
I am writing a Python application which reads/parses a file of this kind, myalerts.ini:
value1=1
value2=3
value3=10
value4=15
Currently I store this file on the local filesystem. If I need to change it, I need physical access to that computer.
I want to move this file to the cloud so that I can change it from anywhere (another computer or my phone).
While the application is running on some machine, I should be able to change this file in the cloud, and the application running on another machine that I don't have physical access to should be able to read the updated file.
Notes,
I am new to both Python and AWS.
I am currently running it on my local Mac/Linux machine and planning to deploy on AWS.
There are many options!
Amazon S3: This is the simplest option. Each computer could download the file at regular intervals or just before it runs a process. If the file is big, the app could instead check whether the file has changed before downloading.
Amazon Elastic File System (EFS): If your applications are running on multiple Amazon EC2 instances, EFS provides a shared file system that can be mounted on each instance.
Amazon DynamoDB: A NoSQL database instead of a file. Much faster than parsing a file, but less convenient for updating values; you'd need to write a program to update them, e.g. from the command line.
AWS Systems Manager Parameter Store: A managed service for storing parameters. Applications (anywhere on the Internet) can request and update parameters. A great way to configure cloud-based applications!
If you are looking for minimal change and you want it accessible from anywhere on the Internet, Amazon S3 is the easiest choice.
Whichever way you go, you'll use the boto3 AWS SDK for Python.
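A minimal sketch of the Amazon S3 option, assuming a placeholder bucket name and that AWS credentials are already configured (environment variables, ~/.aws/credentials, or an EC2 instance role):
import boto3

s3 = boto3.client('s3')

# Download the latest settings just before the application runs its process.
s3.download_file('my-config-bucket', 'myalerts.ini', 'myalerts.ini')

# From any other machine, upload an edited copy to make it the new version:
# s3.upload_file('myalerts.ini', 'my-config-bucket', 'myalerts.ini')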
My site writes new .html files into /tmp after the dyno is created.
The CherryPy app is in /app due to Heroku's structure.
This prevents me from routing the .html files created with CherryPy. Any idea on how to do this?
Heroku's filesystem is ephemeral:
Each dyno gets its own ephemeral filesystem, with a fresh copy of the most recently deployed code. During the dyno’s lifetime its running processes can use the filesystem as a temporary scratchpad, but no files that are written are visible to processes in any other dyno and any files written will be discarded the moment the dyno is stopped or restarted. For example, this occurs any time a dyno is replaced due to application deployment and approximately once a day as part of normal dyno management.
It's not meant for permanent storage, and anything you write out to disk can disappear at any moment.
If you need to write data out persistently you can use something like Amazon S3 or store it in a database.
Will it be possible to serve the code directly from the database then, assuming that I write the code into the database?
Yes.
Heroku itself provides a PostgreSQL service, and many others are available from the add-ons marketplace.
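A minimal sketch of serving stored HTML from Heroku Postgres with CherryPy, assuming the DATABASE_URL environment variable that Heroku sets and a hypothetical pages table with slug and html columns:
import os

import cherrypy
import psycopg2

class Site:
    @cherrypy.expose
    def page(self, slug):
        # Fetch the page body from Postgres instead of the ephemeral disk.
        conn = psycopg2.connect(os.environ['DATABASE_URL'])
        with conn, conn.cursor() as cur:
            cur.execute("SELECT html FROM pages WHERE slug = %s", (slug,))
            row = cur.fetchone()
        conn.close()
        return row[0] if row else 'Page not found'

if __name__ == '__main__':
    cherrypy.quickstart(Site())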