I'm currently running a t2.micro instance on EC2 right now. I have the html/web interface side of it working, along with a MySQL database.
The site allows users to register and stores them in the DB via a PHP script.
I want there to be an actual Python application that queries the MySQL database and returns user data, to then be executed in a Python script.
What I cannot find is whether I host this Python application as a totally separate instance or if it can exist on the same instance, in a different directory. I ultimately just need to query the database, which makes me thing it must exist on the same instance.
Could someone please provide some guidance?
Let me just be clear: this is not a Python web app. This Python backend is entirely separate except making queries against the database.
Either approach is possible, but there are pros & cons to each.
Running separate Python app on the same server:
Pros:
Setting up local access to the database is fairly simple
Only need to handle backups or making snapshots, etc. for a single instance
Cons:
Harder to scale up individual pieces if you need more memory, processing power, etc. in the future
Running the Python app on a separate server:
Pros:
Separate pieces means you can scale up & down the hardware each piece is running on, according to their individual needs
If you're using all micro instances, you get more resources to work with, without any extra costs (assuming you're still meeting all the other 'free tier eligible' criteria)
Cons:
In general, more pieces == more time spent on configuration, administration tasks, etc.
You have to open up the database to non-local access
Simplest: open up the database to access from anywhere (e.g. all remote IP addresses), and have the Python app log in via the internet
Somewhat safer, more complex: set the Python app server up with an elastic IP, open up the database to access only from that address
Much safer, more complex: set up your own virtual private cloud (VPC), and allow connections to the database only from within the VPC. You'd have to configure public access for each of the servers for whatever public traffic you'll have, presumably ports 80 and/or 443.
Related
I'm currently working on a website where i want the user to upload one or more images, my flask backend will do some changes on these pictures and then return them back to the front end.
Where do I optimally save these images temporarily especially if there are more then one user at the same time on my website (I'm planning on containerizing the website). Is it safe for me to save the images in the folder of the website or do I need e.g. a database for that?
You should use a database, or external object storage like Amazon S3.
I say this for a couple of reasons:
Accidents do happen. Say the client does an HTTP POST, gets a URL back, and does an HTTP GET to retrieve the result. But in the meantime, the container restarts (because the system crashed; your cloud instance got terminated; you restarted the container to upgrade its image; the application failed); the container-temporary filesystem will get lost.
A worker can run in a separate container. It's very reasonable to structure this application as a front-end Web server, that pushes messages into a job queue, and then a back-end worker picks up messages out of that queue to process the images. The main server and the worker will have separate container-local filesystems.
You might want to scale up the parts of this. You can easily run multiple containers from the same image; they'll each have separate container-local filesystems, and you won't directly control which replica a request goes to, so every container needs access to the same underlying storage.
...and it might not be on the same host. In particular, cluster technologies like Kubernetes or Docker Swarm make it reasonably straightforward to run container-based applications spread across multiple systems; sharing files between hosts isn't straightforward, even in these environments. (Most of the Kubernetes Volume types that are easy to get aren't usable across multiple hosts, unless you set up a separate NFS server.)
That set of constraints would imply trying to avoid even named volumes as much as you can. It makes sense to use volumes for the underlying storage for your database, and it can make sense to use Docker bind mounts to inject configuration files or get log files out, but ideally your container doesn't really use its local filesystem at all and doesn't care how many copies of itself are running.
(Do not rely on Docker's behavior of populating a named volume on first use. There are three big problems with it: it is on first use only, so if you update the underlying image, the volume won't get updated; it only works with Docker named volumes and not other options like bind-mounts; and it only works in Docker proper and not in Kubernetes.)
Other decisions are possible given other sets of constraints. If you're absolutely sure you will never ever want to run this application spread across multiple nodes, Docker volumes or bind mounts might make sense. I'd still avoid the container-temporary filesystem.
I'm writing an application for a venue that will have large-scale competitions. In order to effectively manage those competitions, multiple employees need to engage with and modify a set of data in real-time, from multiple machines in the gym. I have created a Python application which accomplishes this by communicating with a MySQL server (which allows as many instances of the application as necessary to communicate with it).
Is there a nice way to get MySQL server installed on a client machine along with this Python application (It only necessarily needs to end up on one machine)? Perhaps is there a way to wrap the installers together? Am I asking the right question? I have no experience with application distribution, and I'm open to all suggestions.
Thanks.
The 'normal' way to do it is to have a network setup (ethernet and/or wireless) to connect many Clients (with your Python app) to a single Server (with MySQL installed).
To have the "Server" distributed among multiple machines becomes much messier.
Plan A: One Master replicating to many Slaves -- good for scaling reads, but not writes.
Plan B: Galera Cluster -- good for writing to multiple instances; repairs itself in some situations.
But if you plan on having the many clients go down a lot, you are better off having a single server that you try to keep up all the time and have a reliable network so that the clients can get to that on server.
I need to develop an application that listens to a kafka topic and saves the data to a DB (cassandra). It will be a high density stream of data so saving the data will be resource expensive. Once the data is saved it will be queried and exposed through a REST API.
I see two options, but both of them have downsides:
Option 1
Create two services, each one in a separate docker container. One would run only the kafka listener process in python and the other one a flask web server.
Advantages: Every container runs only one process
Downsides: Both services connect to the same DB, which is not ideal according to the microservices pattern architecture, for the services are not completely decoupled.
Option 2
Run both, kafka listener and web service in one container.
Advantages: Just one service to connect to the DB.
Downsides: More than one process running in a single docker container, and one of them (saving and updating) would be a lot more resource expensive than the other, so it would not scale uniformly.
Is there another way to go that doesn't involve moving to a monolithic architecture? Or which one of them is the best practice?
Go with option 1. Use Docker Compose for setting up your containers:
One "service" for your Kafka consumer.
One "service" for your REST API process.
If you want to containerize your database, add a Cassandra container for that as well.
Using Docker Compose will allow you to spin up things together with one command, you can have dependencies and links (DNS name resolution) between your containers, centralized logging, etc. - it's ideal for cases like yours.
Separating the containers will allow you to scale, to control the lifecycle of your applications, and it will allow you start/stop/update each application individually. Also, you only need to run a single process per container, which is a proven and recommended best practice. It makes controlling the lifecycle of the container and the application easier, and it also keeps your container lean and easier to manage.
Example: What do you do if your Kafka listener goes down and the REST API keeps running? To fix this, you have to restart the whole container (unless you want to SSH into the container and restart one of the processes). One process per container makes this trivial - you restart just that container.
The fact that both are pointing to the same database is irrelevant - that is just something you'll have to live with if both services use the same data. The alternative would be to synchronize between two databases (one that the Kafka listener writes to, and one for the REST API). This would add more complexity and overhead. If you do a clean design, you can still add that later if you see a value in separating the data - I wouldn't worry about that initially.
I am working on scaling out a webapp and providing some database redundancy for protection against failures and to keep the servers up when updates are needed. The app is still in development, so I have chosen a simple multi-master redundancy with two separate database servers to try and achieve this. Each server will have the Django code and host its own database, and the databases should be as closely mirrored as possible (updated within a few seconds).
I am trying to figure out how to set up the multi-master (master-master) replication between databases with Django and MySQL. There is a lot of documentation about setting it up with MySQL only (using various configurations), but I cannot find any for making this work from the Django side of things.
From what I understand, I need to approach this by adding two database entries in the Django settings (one for each master) and then write a database router that will specify which database to read from and which to write from. In this scenario, both databases should accept both reads and writes, and writes/updates should be mirrored over to the other database. The logic in the router could simply use a round-robin technique to decide which database to use. From there on, further configuration to set up the actual replication should be done through MySQL configuration.
Does this approach sound correct, and does anyone have any experience with getting this to work?
Your idea of the router is great! I would add that you need automatically detect whether a databases is [slow] down. You can detect that by the response time and by connection/read/write errors. And if this happens then you exclude this database from your round-robin list for a while, trying to connect back to it every now and then to detect if the databases is alive.
In other words the round-robin list grows and shrinks dynamically depending on the health status of your database machines.
The another important notice is that luckily you don't need to maintain this round-robin list common to all the web servers. Each web server can store its own copy of the round-robin list and its own state of inclusion and exclusion of databases into this list. This is just because a database server can be seen from one web server and can be not seen from another one due to local network problems.
I am creating a Python application that uses embedded SQLite databases. The programme creates the db files and they are on a shared network drive. At this point there will be no more than 5 computers on the network running the programme.
My initial thought was to ask the user on startup if they are the server or client. If they are the server then they create the database. If they are the client they must find a server instance on the network. The one way I suppose is to send all db commands from client to server and server implements in the database. Will that solve the shared db issue?
Alternatively, is there some way to create a SQLite "server". I presume this would be the quicker option if available?
Note: I can't use a server engine such as MySQL or PostgreSQL at this point but I am implementing using ORM and so when this becomes viable, it should be easy to change over.
Here's a "SQLite Server", http://sqliteserver.xhost.ro/, but it looks like not in maintain for years.
SQLite supports concurrency itself, multiple processes can read data at one time and only one can write data into it. Also, When some process is writing, it'll lock the whole database file for a few seconds and others have to wait in the mean time according official document.
I guess this is sufficient for 5 processes as yor scenario. Just you need to write codes to handle the waiting.