Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
I am developing a script that saves a stream of prices received via websocket to a local database, but I am facing a design dilemma.
The stream consists of 1-minute candlestick data for around 200 instruments, received from a cryptocurrency exchange websocket. Whenever a new closed candle arrives, I want to save it to a local DB.
I started building it in Python with a MySQL DB, but I am unsure whether this approach is feasible.
Apologies for not posting code; this is more of a design/architecture dilemma.
Questions:
Can I save the messages directly to the DB, or would I need an intermediate step? (Performance concern.)
Is Python the best option, or should I look at JavaScript or some 3rd-party software?
Am I totally out of my mind for building this? I have a strategy that needs historical data from many instruments, and calling the REST API is not possible because I would hit the rate limit, so I am looking at working with websockets.
Thank you in advance
1 - I don't know if it's the best idea to save 200 values per minute; could you store just the average, highest, and lowest values instead?
If you want to store all the values, you could use InfluxDB, a time-series database designed for exactly this kind of workload (there is also a hosted InfluxDB Cloud offering).
2 - I think Python is appropriate for this use case.
3 - If you can't use the REST API, a websocket can be a good idea.
When I had the opportunity to work on massive time-series data (IoT sensors), I used InfluxDB for storage and MQTT for communication.
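As a rough illustration, writing one closed candle with the official influxdb-client package could look like the sketch below. The URL, token, bucket, and field names are placeholders, not values from your setup:

from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

# Placeholder connection details; use your own URL, token, org and bucket.
client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

# Example closed candle, in whatever shape your exchange sends it.
candle = {"symbol": "BTCUSDT", "open": 42000.0, "high": 42100.0, "low": 41950.0,
          "close": 42050.0, "volume": 12.3, "close_time": "2024-01-01T00:01:00Z"}

point = (Point("candles")
         .tag("symbol", candle["symbol"])
         .field("open", candle["open"])
         .field("high", candle["high"])
         .field("low", candle["low"])
         .field("close", candle["close"])
         .field("volume", candle["volume"])
         .time(candle["close_time"]))

write_api.write(bucket="market-data", record=point)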
Yes, you can, either as a JSON string or directly into table columns; it depends on the form of the data. If its format doesn't change, I would save it directly into the table. If the format can change (often), I would do both and save the raw data as a kind of fail-safe.
Python should work. Inserting data into a database is no problem in most languages; use the one you have the most experience with, feel most comfortable with, or want to learn.
This kind of program is a good exercise for learning a new language, or programming in general, and I don't think that 200 datasets per minute is too much to handle. So I don't think you're out of your mind; actually, most programmers I know have built something like this at some point.
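Here is a minimal sketch of that idea, parsed columns plus the raw message kept as a fail-safe. It uses the standard-library sqlite3 module so it runs as-is; with MySQL the statements are the same apart from the %s parameter style. The field names are assumptions about your message format:

import json
import sqlite3

conn = sqlite3.connect("candles.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS candles (
        symbol     TEXT,
        close_time TEXT,
        open REAL, high REAL, low REAL, close REAL, volume REAL,
        raw TEXT,  -- original message kept as a fail-safe
        PRIMARY KEY (symbol, close_time)
    )
""")

def save_candle(msg):
    # msg is one closed-candle message already decoded from JSON.
    conn.execute(
        "INSERT OR REPLACE INTO candles VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (msg["symbol"], msg["close_time"], msg["open"], msg["high"],
         msg["low"], msg["close"], msg["volume"], json.dumps(msg)),
    )
    conn.commit()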
Closed. This question is opinion-based. It is not currently accepting answers.
I am a somewhat experienced Python programmer. I will quickly describe my situation. Programming was always a nice hobby for me. I then started working at a company that did a lot of manual Excel processing, and one day I mentioned that I could probably automate it with Python.
One thing led to another, and now Python does the Excel work multiple times a day, running on an Intel NUC I deployed as a small server. It took some work to figure everything out, but the money has been good as well, no complaints.
They are quite happy with me and have lots of different plans.
They want me to design a website where the employees can fill out a form daily, and the data can be used elsewhere. I did some HTML and CSS in high school, but I know there needs to be a back-end to at least save the data that gets filled in.
I don't know where to start. I know SQL is the #1 language for data processing and PHP for handling the back-end, but I already know Python, which can also do back-end work.
I have two direct questions, but I am also looking for advice on the whole situation. Feel free to point anything out; I will read every comment.
My questions:
Could I run the webserver from my Intel NUC, or is this generally seen as bad practice? Also, is it true that a domain is all I would need if I run the webserver myself?
Is it worth it to learn SQL and PHP or should I stick to python?
I have tried looking online but found countless resources. I would like to create a large database with lots of data that I can use at any time. I think SQL is a good fit for this, but I am not looking to waste time.
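To make it concrete, this is roughly what I imagine the Python back-end could look like. It is just a sketch with made-up route, table, and field names, not code from my project:

import sqlite3

from flask import Flask, request

app = Flask(__name__)
DB = "forms.db"

with sqlite3.connect(DB) as conn:
    conn.execute("""CREATE TABLE IF NOT EXISTS entries
                    (employee TEXT,
                     submitted TEXT DEFAULT CURRENT_TIMESTAMP,
                     notes TEXT)""")

@app.route("/daily-form", methods=["POST"])
def daily_form():
    # Save one submitted form; the field names are invented.
    with sqlite3.connect(DB) as conn:
        conn.execute("INSERT INTO entries (employee, notes) VALUES (?, ?)",
                     (request.form["employee"], request.form["notes"]))
    return {"status": "saved"}

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)  # reachable on the office network, e.g. from the NUC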
Closed. This question is opinion-based. It is not currently accepting answers.
I want to create two microservices in Python: one posts data into the database every minute, and the other will process the data once it has been posted. I would like to know what would be an ideal architecture for this, and how it can be done in Python.
This sounds a lot like something that should be solved using the CQRS pattern. One service is responsible for updating the database and the other is responsible for consuming the data. This way you separate the update and read operations, which makes it very scalable.
I'm a big fan of an event-driven architecture when it makes sense, and since you are talking about RabbitMQ in your first solution, then I would probably continue down that path.
I would use two different topic types. One for commands and one for events. Commands would be things like "update entity" or whatever makes sense in your case. The events are things that happened like "entity updated". Your first service should subscribe to the relevant commands and send out an event after the operation is complete. The second service would subscribe to that event and do the processing that it is supposed to do.
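A bare-bones sketch of the first service with the pika client might look like this. I am using plain queues instead of topic exchanges to keep it short, and the queue names and message shape are assumptions:

import json

import pika  # RabbitMQ client for Python

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="commands.update_entity")
channel.queue_declare(queue="events.entity_updated")

def handle_command(ch, method, properties, body):
    command = json.loads(body)
    # ... write `command` to the database here ...
    event = {"type": "entity_updated", "id": command["id"]}
    ch.basic_publish(exchange="",
                     routing_key="events.entity_updated",
                     body=json.dumps(event))

# Service 1 consumes commands and publishes events; service 2 would run the same
# kind of loop against "events.entity_updated" and do its processing there.
channel.basic_consume(queue="commands.update_entity",
                      on_message_callback=handle_command,
                      auto_ack=True)
channel.start_consuming()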
Also, a quick note on message queues: there are a lot of different options out there. RabbitMQ is a solid but old choice, so you might benefit from looking at alternatives. I personally like Kafka a lot, but Redis and the managed queues provided by cloud services such as Azure or AWS, among many others, are also worth considering.
Closed. This question is opinion-based. It is not currently accepting answers.
I'm currently writing a chat application for my website and storing all the messages in one database, referring to the sender and receiver with foreign keys.
Is this really the smartest idea? I think the site could become slow when everybody is trying to access the same database, or when the number of sent messages gets big. Would it be smarter/faster to use one database per user and store his/her messages there? If yes, am I right that this is not really possible to achieve with Django?
Thanks for your answer!
Would it be smarter/faster, to use one database per user
Probably not, because you will still have the same quantity of data and will just have the overhead of multiple database instances.
this is not really possible to achieve with Django?
Django is not a database manager, so the question does not really make sense. For example, Oracle or SQLite allow a single query to access multiple databases, so if you used Oracle with Django, you could have a different database per user. Anyway, even if by default Django uses a single database connection, it can be configured to use multiple databases (thanks to #brunodesthuilliers for the info).
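For reference, the multi-database setup boils down to declaring several entries in settings.py and routing queries explicitly. The names, engines, and credentials below are just placeholders:

# settings.py (placeholder names and credentials)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "chat",
        "USER": "chat",
        "PASSWORD": "change-me",
        "HOST": "localhost",
    },
    "archive": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": "archive.sqlite3",
    },
}

# Elsewhere, a query can then be sent to a specific database:
# Message.objects.using("archive").filter(receiver=some_user)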
Database design is a rather complex task: you have to design the tables to structure the data, the indexes to speed up some accesses and/or enforce uniqueness, and optionally views to ease query writing. Once this is done (and the database is in use), you will have to define recurring tasks to clean up the database (index maintenance, archiving, and purging of archived data).
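To illustrate with the single-table design from the question, a Django model with a composite index for the common "messages for this user" lookup could look like this (field names assumed):

from django.conf import settings
from django.db import models

class Message(models.Model):
    sender = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE,
                               related_name="sent_messages")
    receiver = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE,
                                 related_name="received_messages")
    text = models.TextField()
    sent_at = models.DateTimeField(auto_now_add=True)

    class Meta:
        # Speeds up "all messages received by user X, newest first" queries.
        indexes = [models.Index(fields=["receiver", "sent_at"])]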
If you intend to use a large farm of servers to handle thousands of simultaneous accesses, a SQL database can indeed become a bottleneck, and you should consider NoSQL databases to spread the load over independent servers. But in that use case, Django would simply be the wrong tool for the job. Anyway, I know of no real-world use case where multiple databases are used just for performance reasons.
Closed. This question is opinion-based. It is not currently accepting answers.
I'm a CS student interning at a company that needs a web app to make looking at data much easier for the end user. I'm very new to web dev; I have experimented a little with HTML and CSS, but I have never touched JavaScript.
Anyway, my company has a web API that I have access to that returns a bunch of data points in JSON format. Upon doing some research online it seems that utilizing something like Django, Node.js, or Rails would be the best option to parse these JSON strings and return the data that I am interested in. Django seems like the best alternative because the documentation seems very good, and I know Python relatively well so the learning curve will not be too bad.
Do you guys think I have roughly the right idea so far? Would it be a good idea to use Django to parse hundreds of JSON strings, and then export the data to HTML in some way to construct the web app?
Thank you!
I <3 Django.
But in my opinion, Django is best suited to making objects out of data in a database. It does this using an MVC-ish structure and an object-relational mapper (ORM). I'll make some assumptions from your question:
your data isn't in a DB, but is a bunch of JSON strings
you're more interested in displaying, rather than manipulating this data
If those are true, I would think you want a front-end-focused system using JavaScript; that's the best fit for handling JSON, after all. Django or Rails would be overkill for parsing strings.
Look at Angular or Ember, et al.
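For what it's worth, the parsing step itself is tiny in any language; in Python, for example, pulling the data down and filtering it is only a few lines. The URL and field names here are invented:

import requests  # pip install requests

resp = requests.get("https://api.example-company.com/datapoints", timeout=10)
resp.raise_for_status()
points = resp.json()  # a list of dicts, one per data point

# Keep only the fields you care about (names invented for illustration).
interesting = [{"id": p["id"], "value": p["value"]}
               for p in points if p.get("value") is not None]
print(len(interesting), "points of interest")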
Node.js is great for requesting remote JSON data and real-time streaming. Pair it with socket.io and real-time data gets even more fun.
Take a look at D3.js, which uses SVG to generate stunning, customized data visualizations; for simple visualizations, the Google Charts API is really easy to begin with.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
For my link-scraping program (written in Python 3.3) I want to use a database to store around 100,000 websites:
just the URL,
a time stamp
and for each website a list of several properties
I don't have much knowledge about databases, but I found that the following may fit my purpose:
Postgresql
SQLite
Firebird
I'm interested in speed, both for accessing the database and for getting the desired information. For example: for website x, does property y exist, and if yes, read it. Write speed is of course also important.
My question: are there big differences in speed, or does it not matter for my small program? Maybe someone can tell me which database fits my requirements (and is easy to use from Python).
The size and scale of your database are not particularly large, and it's well within the scope of almost any off-the-shelf database solution.
Basically, what you're going to do is install the database server on your machine, and it will come up listening on a given port. You can then install a library in Python to access it.
For example, if you want to use PostgreSQL, you'll install it on your machine and it will listen on a port, by default 5432.
But if you just have the information you're talking about to store and retrieve, you may want to go with a NoSQL solution because it's very easy to get started with.
For example, you can install MongoDB on your server and then install pymongo. The pymongo tutorial will teach you pretty much everything you need for your application.
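As a rough example of the kind of document this question describes (database and collection names made up):

from datetime import datetime, timezone

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
sites = client["scraper"]["sites"]

sites.insert_one({
    "url": "https://example.com",
    "timestamp": datetime.now(timezone.utc),
    "properties": {"title": "Example", "outgoing_links": 12},
})

# "For website x, does property y exist, and if yes read it."
doc = sites.find_one({"url": "https://example.com",
                      "properties.title": {"$exists": True}},
                     {"properties.title": 1})
print(doc)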
If speed is the main criterion, then I would suggest going with an in-memory database.
Take a look at http://docs.python.org/2/library/sqlite3.html
It can be used as a normal on-disk database too; for in-memory mode, use the snippet below, and the DB gets created in RAM itself, which gives much faster run-time access.
import sqlite3
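# ':memory:' keeps the whole database in RAM; it disappears when the connection is closed.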
conn = sqlite3.connect(':memory:')
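From there it behaves like any other SQLite connection (just keep in mind the data is gone once the process exits). For example, with made-up table and column names:

conn.execute("CREATE TABLE sites (url TEXT PRIMARY KEY, ts TEXT, props TEXT)")
conn.execute("INSERT INTO sites VALUES (?, ?, ?)",
             ("https://example.com", "2024-01-01T00:00:00Z", '{"lang": "en"}'))
print(conn.execute("SELECT props FROM sites WHERE url = ?",
                   ("https://example.com",)).fetchone())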