Port desktop to web application (bioinformatic) - python

I want to port a few bioinformatics programs that I wrote for Windows to web applications. They use a few bioinformatics packages such as BLAST, Bowtie and Primer3. These external tools usually take a file that the user provides, process it and create an output file, which I parse and display. In addition, these tools use specific databases, which are created and reused by the user.
Up to now I have been saving the databases created by the tools (the source file is also provided by the user) and the output results on the PC where my software is installed. I do not know how to handle such a setup on a web server. I cannot save all the databases created by users from all over the world, but at the same time it is quite painful to create a database again every time the user comes back (e.g. the human genome db is 2.7 GB and takes some time to create). I guess one user creates about 5-10 databases per tool, and I have 3 tools (1 MB - 50 GB).
How can this problem be solved with web apps?
Edit
To make things clearer: I only want to know whether there is a more sophisticated way to reuse data that the user creates. I was thinking about storing those files temporarily for a session. Charging for the service is not an option, because these tools are quite specific and I don't have many users. In addition, most users are close colleagues. After years of fighting with different operating systems, debugging and maintaining my programs, I am finally giving up on desktop versions (I do this in my private time). It is simply too time-consuming, and I also have some requests for Linux, Android and iOS.
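One way to picture the "store temporarily and reuse" idea is a cache directory keyed by a hash of the uploaded file, with cleanup of entries that have not been used for a while. This is only a minimal sketch: `build_db` is a hypothetical callback standing in for the real indexing tool, the directory layout is an assumption, and concurrent builds of the same input are not handled.

```python
import hashlib
import os
import shutil
import tempfile
import time


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def get_or_build_db(upload_path, cache_root, build_db):
    """Reuse the database built from identical input, else build it.

    build_db(upload_path, target_dir) is a hypothetical callback that would
    run the real indexing tool and write its files into target_dir.
    """
    db_dir = os.path.join(cache_root, file_digest(upload_path))
    if not os.path.isdir(db_dir):
        tmp_dir = tempfile.mkdtemp(dir=cache_root)
        build_db(upload_path, tmp_dir)
        os.rename(tmp_dir, db_dir)  # publish only after a complete build
    os.utime(db_dir, None)  # mark the entry as recently used
    return db_dir


def evict_stale(cache_root, max_age_seconds):
    """Delete cached databases not used within max_age_seconds."""
    now = time.time()
    removed = []
    for name in os.listdir(cache_root):
        path = os.path.join(cache_root, name)
        if os.path.isdir(path) and now - os.path.getmtime(path) > max_age_seconds:
            shutil.rmtree(path)
            removed.append(name)
    return removed
```

Because the key is derived from the file content, two users who upload the same genome would share one database, and a returning user gets the existing one as long as it has not been evicted.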
Thanks

Related

How to minimize download time for a single .jpg file download?

I am attempting to download a small image file (e.g. https://cdn4.telesco.pe/file/some_long_string.jpg) in the shortest possible time.
My machine pings 200ms to the server, but I'm unable to achieve better than 650ms.
What's the science behind fast-downloading of a single file? What are the factors? Is a multipart download possible?
I find many resources for parallelizing downloads of multiple files, but nothing on optimizing for download-time on a single file.
It is not so easy to compare those two types of response time.
The command-line ping is a much lower-level and faster kind of exchange in the network stack between two devices, computers or servers.
With a Python script that requests a file from a remote web server, you have much more overhead in the request, and every layer consumes some milliseconds: the speed of your local Python runtime, your operating system and that of the remote server (Windows/macOS/Linux), the web server in use and its configuration (Apache/IIS/nginx), etc.
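A rough back-of-the-envelope model can make this overhead concrete. The sketch below assumes the ping time is the network round-trip time, that the connection is fresh, and that the image is small enough to arrive in the first response flight. The function names are made up for illustration.

```python
def estimate_fetch_ms(rtt_ms, tls_round_trips=1, dns_ms=0.0, transfer_ms=0.0):
    """Rough lower bound for one HTTPS GET on a fresh connection.

    Counts 1 round trip for the TCP handshake, tls_round_trips for the TLS
    handshake (1 for TLS 1.3, 2 for a full TLS 1.2 handshake), and 1 for the
    HTTP request and the first bytes of the response.
    """
    round_trips = 1 + tls_round_trips + 1
    return dns_ms + round_trips * rtt_ms + transfer_ms


def estimate_reused_ms(rtt_ms, transfer_ms=0.0):
    """Same request over an already-open (keep-alive) connection."""
    return rtt_ms + transfer_ms
```

With a 200 ms round trip, `estimate_fetch_ms(200)` gives 600 ms, which is close to the observed 650 ms, while `estimate_reused_ms(200)` gives 200 ms. Under these assumptions, most of the time goes into connection setup rather than into the transfer itself.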

web service for recommendation system

I'm trying to build a recommendation system in Python using the LightFM library, with an API built on the Flask framework.
My question is more design related than coding.
The web service, which will be called when a user logs in to the website, receives a JSON with the user id and returns a JSON with the user id and 5 product SKUs to recommend.
I would like to save those recommendations in a DB. That way I can compare this table with other tables in the DB and find out whether a user has purchased a product that I recommended.
My concern (maybe it's stupid) is that everything will slow down if I open a connection to DB and write data in it.
Potentially the service can be called between 5k to 7k times per day.
Thanks
What I understand from your explanation is that you will be comparing the products the user actually selected with the ones you recommended. If you run that comparison, say, once a week, it won't affect your processing much.
Your concern is, would everything slow down if a DB connection is opened?
It won't slow down the service. At 5k calls per day, other factors are far more likely to slow the service down or stop it. For example, when the number of concurrent users gets too high, a single Python process will fail.
What you need to do here is use a web application server such as Gunicorn or uWSGI (see "Using Gunicorn with Flask").
Gunicorn starts multiple Python processes running Flask, so it will support a high number of concurrent users.
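To put the load in perspective: 7k calls per day is 7000 / 86400, or about 0.08 calls per second, which is roughly one call every 12 seconds on average. The sketch below shows what saving the five SKUs and later checking them against purchases could look like. It uses the standard-library `sqlite3` module as a stand-in for whatever database the site uses, and the table and column names are assumptions.

```python
import sqlite3


def init_db(conn):
    """Create the two illustrative tables if they do not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS recommendations ("
        "user_id TEXT NOT NULL, sku TEXT NOT NULL, "
        "created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS purchases ("
        "user_id TEXT NOT NULL, sku TEXT NOT NULL)"
    )


def save_recommendations(conn, user_id, skus):
    """Store one row per recommended SKU in a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO recommendations (user_id, sku) VALUES (?, ?)",
            [(user_id, sku) for sku in skus],
        )


def purchased_recommendations(conn):
    """Return (user_id, sku) pairs that were recommended and also bought."""
    return conn.execute(
        "SELECT DISTINCT r.user_id, r.sku FROM recommendations r "
        "JOIN purchases p ON p.user_id = r.user_id AND p.sku = r.sku "
        "ORDER BY r.user_id, r.sku"
    ).fetchall()
```

In the Flask handler, `save_recommendations` would be called once per request after the model has produced its five SKUs; the comparison query can run offline whenever the report is needed.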

Is it possible to install MySQL Server along with standalone client-side software?

I'm writing an application for a venue that will have large-scale competitions. In order to effectively manage those competitions, multiple employees need to engage with and modify a set of data in real-time, from multiple machines in the gym. I have created a Python application which accomplishes this by communicating with a MySQL server (which allows as many instances of the application as necessary to communicate with it).
Is there a nice way to get MySQL server installed on a client machine along with this Python application (It only necessarily needs to end up on one machine)? Perhaps is there a way to wrap the installers together? Am I asking the right question? I have no experience with application distribution, and I'm open to all suggestions.
Thanks.
The 'normal' way to do it is to have a network setup (ethernet and/or wireless) to connect many Clients (with your Python app) to a single Server (with MySQL installed).
To have the "Server" distributed among multiple machines becomes much messier.
Plan A: One Master replicating to many Slaves -- good for scaling reads, but not writes.
Plan B: Galera Cluster -- good for writing to multiple instances; repairs itself in some situations.
But if you expect the many clients to go down a lot, you are better off having a single server that you try to keep up all the time, plus a reliable network so that the clients can reach that one server.
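In the single-server setup, each client installation only needs to know where the server is. A minimal sketch of a per-client settings file read with the standard-library `configparser` follows; the host address and database name are made-up placeholders, and 3306 is MySQL's default port.

```python
import configparser

# Hypothetical client configuration; host and name are placeholders.
DEFAULT_CONFIG = """
[database]
host = 192.168.1.10
port = 3306
name = competitions
"""


def load_db_settings(text=DEFAULT_CONFIG):
    """Parse connection settings that every client machine shares."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["database"]
    return {
        "host": section.get("host"),
        "port": section.getint("port"),
        "database": section.get("name"),
    }
```

The resulting dictionary can be passed to whichever MySQL driver the application already uses, so only one machine in the gym needs the MySQL server installed.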

GUI for python app that uses interactive broker API that will eventually run on EC2

I have an Interactive Brokers [IB] account and am using the IB API to make an automated trading system in python. Version 1.0 is nearing the test stage.
I am thinking about creating a GUI for it so that I can watch various custom indicators in real time and adjust trading parameters. Everything (IB TWS/IB Gateway and my app) runs on my local Windows 10 PC (I could run it on Ubuntu if that made it easier). Startup config files are presently the only way to adjust parameters, and I then watch the results scroll by in the console window.
Eventually I would like to run both IB TWS/IB Gateway and the app on Amazon EC2/AWS and access them from anywhere. I only mention this because it may affect how the GUI should be set up now to avoid having to redo it later.
I am not going to write this myself and will contract someone else to do it. After spending 30+ hrs researching this I still really have no idea on what the best way would be to implement this (browser based, standalone app, etc.) and/or what skills the programmer would need for me to describe the job.
An estimate on how long it would take to get a bare bones GUI real-time displaying data from my app and real-time sending inputs back to my app would be additionally helpful.
The simplest and quickest way will probably be to add a GUI directly to your Python app. If you don't need it to be pretty or to run on mobile, I'd say go with Tkinter for simplicity. Then connect remotely to the machine where the app is running and control it from there.
Adding another component that will communicate with your Python App introduces a higher level of complexity which I think is redundant in this case.
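As a rough illustration of a GUI that lives inside the app, the sketch below keeps the adjustable parameters in a small thread-safe store that both the trading loop and a Tkinter window can use. The class, function and parameter names are hypothetical, and it assumes all parameters are numeric.

```python
import threading


class TradingParams:
    """Thread-safe parameter store shared by the trading loop and the GUI."""

    def __init__(self, **initial):
        self._lock = threading.Lock()
        self._values = dict(initial)

    def get(self, name):
        with self._lock:
            return self._values[name]

    def set(self, name, value):
        with self._lock:
            self._values[name] = value

    def snapshot(self):
        with self._lock:
            return dict(self._values)


def build_gui(params):
    """Build a small Tkinter window with one entry field per parameter."""
    import tkinter as tk  # imported lazily so the store works without a display

    root = tk.Tk()
    root.title("Trading parameters")
    for row, (name, value) in enumerate(params.snapshot().items()):
        tk.Label(root, text=name).grid(row=row, column=0, sticky="w")
        entry = tk.Entry(root)
        entry.insert(0, str(value))
        entry.grid(row=row, column=1)

        def apply(event=None, name=name, entry=entry):
            # Pressing Return writes the new value into the shared store.
            params.set(name, float(entry.get()))

        entry.bind("<Return>", apply)
    return root
```

One possible arrangement is to run the trading loop in a background thread that reads values with `params.get(...)`, and to call `build_gui(params).mainloop()` on the main thread.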
You didn't specify in detail what kind of data the app needs to display. If this includes any form of charting, I'd use existing charting software such as NinjaTrader / MultiCharts / Sierra Chart to run my indicators and see the status of positions, and restrict the GUI of the Python app to adjusting the trading parameters and reporting numerical stats.

Sharing tables across multiple sites

This is more of an architecture question, which I can't solve properly because I don't have enough experience with this kind of architecture. I'm currently running the solution with Python and SQLAlchemy, but the question is generic and the answer doesn't have to address those technologies.
I will try to explain it using the example of a public library. Imagine a public library with a server holding tables with all the books, scans (large binary images) and users. I've already written the client and server parts, which work great, but only locally for a single library.
Now I would like to have this set of server and clients for another public library (with more public libraries to come later). Having a local server for each library is desirable, as there is a lot of data to be transferred to and from the local server.
The complication comes from the requirement to share users (with their member cards) between libraries: if a user registers at library A, he should be able to go to library B without registering again. There's no need to see other user data in a library where he did not originally register, just his member account (id, login and password).
The simple solution would be:
having large data on local server
having users on cloud (some public server on internet)
The problem is that there are queries (for statistics, views, and so on), which run on local server and need accessing users, so I can't have users on a different server and database, because I couldn't then do select + join on such an architecture.
The solution which is left behind by previous developer and which other developers think is wrong, is to have the users table set up as replicated table (MariaDB + Galera), so it would end up having users table the same on cloud and each library site, so the previous code would work as if everything is just local, while sharing the users on the background with other libraries.
One of the problems with this is that the current version of our database (MariaDB) doesn't support (or has broken) partial replication (only some tables or some databases), so it would need patching of the MariaDB and distributing this patched version of database server to cloud and other sites, which stinks of various problems now and in the future, when new version of MariaDB will come out.
What would be the proper way of sharing these users between sites, while retaining the ability to do local selects and joins with the user table?
(Maybe there's a known design / architecture pattern for this, but I just don't know what to search for as I'm new to this.)
Thanks,
Miro
(Image: schema - sharing table between sites)
Start with a single source of truth for the user registrations. That is one server (or Galera cluster, for HA) somewhere (in HQ, in Cloud, wherever). Login queries remotely access that server.
Think about any place you log in -- you are going to some central site. My point is, that is the way everyone does it because it is fast, reliable, efficient, etc, with today's networks.
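A minimal sketch of this split follows, with two in-memory `sqlite3` databases standing in for the central user server and one library's local server. The table names, column names and helper functions are assumptions for illustration only. Local tables keep just the central user id, so local statistics can still group by user without holding the login data.

```python
import hashlib
import hmac
import os
import sqlite3


def make_demo_dbs():
    """Create a 'central' users database and a 'local' library database."""
    central = sqlite3.connect(":memory:")
    central.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, login TEXT UNIQUE, "
        "salt BLOB NOT NULL, pw_hash BLOB NOT NULL)"
    )
    local = sqlite3.connect(":memory:")
    local.execute("CREATE TABLE loans (user_id INTEGER NOT NULL, book TEXT NOT NULL)")
    return central, local


def hash_password(password, salt):
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def register(central, login, password):
    """Register a user once, on the central server only."""
    salt = os.urandom(16)
    cur = central.execute(
        "INSERT INTO users (login, salt, pw_hash) VALUES (?, ?, ?)",
        (login, salt, hash_password(password, salt)),
    )
    central.commit()
    return cur.lastrowid


def authenticate(central, login, password):
    """Return the central user id if the credentials match, else None."""
    row = central.execute(
        "SELECT id, salt, pw_hash FROM users WHERE login = ?", (login,)
    ).fetchone()
    if row and hmac.compare_digest(row[2], hash_password(password, row[1])):
        return row[0]
    return None


def loans_per_user(local):
    """Local statistic that needs only the user id, not the user record."""
    return local.execute(
        "SELECT user_id, COUNT(*) FROM loans GROUP BY user_id ORDER BY user_id"
    ).fetchall()
```

Any library site would call `authenticate` against the central database at login and then use the returned id in its own tables. Queries that need the login name next to local data would have to fetch it from the central server by id instead of joining locally.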
Next, what about images, etc? If they are shared across your sites, you may as well do them the same way. Look at any search engine for the last two decades -- images (etc) are fetched from a single site. (Actually a small number of sites, for redundancy, etc). Even the biggest web providers have no more than perhaps a dozen datacenters to service the entire world.
After that, you need to decide on Cloud vs dedicated (or even run your own datacenter).
For HA, Cloud providers do a lot. For do-it-yourself, there are various replication scenarios, Galera being one of the best (today). For true HA, you need two copies of your data geographically separated -- to protect from hurricanes, fires, floods, earthquakes, etc. Consider a WAN deployment of Galera, or some asynchronous replication (possibly even between two Galera clusters).
Another choice is whether the Users and Images tables need to be on separate servers. Only if the traffic and size are high do you need to consider separating them. For a huge Image library, you may need a large number of servers, at which point they should probably live on servers with the sole purpose of delivering images -- no Users, no HTML pages, etc. Even the "meta" info about images could be elsewhere in MySQL; the Images are in files, and all that runs on those servers is a web server tuned to deliver images. (I can think of multiple 'big guys' that do it this way.)
