For my link scraping program (written in Python 3.3) I want to use a database to store around 100,000 websites:
just the URL,
a time stamp
and for each website a list of several properties
I don't have much knowledge about databases, but found that the following may fit my purpose:
Postgresql
SQLite
Firebird
I'm interested in speed (both accessing the database and retrieving the wanted information). For example: for website x, does property y exist, and if yes, read it. Write speed is of course also important.
My question: Are there big differences in speed, or does it not matter for my small program? Maybe someone can tell me which database fits my requirements (and is easy to handle from Python).
The size and scale of your database are not particularly large, and it's well within the scope of almost any off-the-shelf database solution.
Basically, what you're going to do is install the database server on your machine, and it will listen on a given port. You can then install a Python library to access it.
For example, if you want to use PostgreSQL, you'll install it on your machine and it will listen on a port such as 5432 (the default).
But if you just have the information you're talking about to store and retrieve, you probably want to go with a NoSQL solution, because it's very easy.
For example, you can install MongoDB on your server, then install PyMongo. The PyMongo tutorial will teach you pretty much everything you need for your application.
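As a rough sketch of that approach (the database and collection names here are made up, and a local MongoDB instance is assumed), storing and querying the data from the question could look like this:

import pymongo
from datetime import datetime, timezone

client = pymongo.MongoClient()        # connects to localhost:27017 by default
sites = client.scraper.sites          # hypothetical database "scraper", collection "sites"

# store one website: URL, timestamp and a dict of properties
sites.insert_one({
    "url": "http://example.com",
    "ts": datetime.now(timezone.utc),
    "properties": {"title": "Example", "links": 12},
})

# for website x, does property y exist? If yes, read it.
doc = sites.find_one({"url": "http://example.com",
                      "properties.title": {"$exists": True}},
                     {"properties.title": 1})
if doc:
    print(doc["properties"]["title"])

The query uses MongoDB's $exists operator, which maps directly onto the "does property y exist" check from the question.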
If speed is the main criterion, then I would suggest going with an in-memory database.
Take a look at http://docs.python.org/2/library/sqlite3.html
It can be used as a normal on-disk database too; for in-memory mode use the snippet below, and the database is created in RAM itself, giving much faster run-time access.
import sqlite3
conn = sqlite3.connect(':memory:')
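Continuing that snippet, a minimal sketch of the use case in the question (table and column names are made up; the properties are stored as a JSON string) might be:

import json

conn.execute("CREATE TABLE sites (url TEXT PRIMARY KEY, ts TEXT, properties TEXT)")
conn.execute("INSERT INTO sites VALUES (?, ?, ?)",
             ("http://example.com", "2013-09-01T12:00:00",
              json.dumps({"title": "Example", "links": 12})))

# for website x, does property y exist? If yes, read it.
row = conn.execute("SELECT properties FROM sites WHERE url = ?",
                   ("http://example.com",)).fetchone()
if row:
    props = json.loads(row[0])
    if "title" in props:
        print(props["title"])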
I am a somewhat experienced Python programmer. I will quickly describe my situation. Programming has always been a nice hobby for me. I then started working at a company that did lots of manual Excel processing. One day I mentioned that I could probably automate this with Python.
One thing led to another, and now Python does the Excel work multiple times a day, running from an Intel NUC I deployed as a small server. It has been some work figuring everything out, but the money has been good as well, no complaints.
They are quite happy with me and have lots of different plans.
They want me to design a website where the employees can fill out a form daily so the data can be used elsewhere. I've done some HTML and CSS in high school, but I know there needs to be a back-end to at least save the data that gets filled in.
I don't know where to start. I know SQL is the #1 language in data processing and PHP in back-end handling, but I already know Python, which can also do back-end work.
I have two direct questions, but I am also looking for advice on the whole situation. Feel free to point anything out; I will read every comment.
My questions:
Could I run the web server from my Intel NUC, or is this generally seen as bad practice? Also, is it true that I would only need a domain if I run the web server myself?
Is it worth it to learn SQL and PHP, or should I stick to Python?
I have tried looking online but found countless resources. I would like to create a large database with lots of data that I can use at any time. I think SQL is good for this, but I am not looking to waste time.
I'm trying to migrate some Perl code to Python. It uses Sleepycat::DbXml 'simple' to get read access to a .dbxml file: it creates an XmlManager, calls createQueryContext, openContainer and query to get an XmlValue. I have found https://pypi.org/project/berkeleydb/ which supports Berkeley DB in general, but it has no mention of this XML layer. Is there an existing API I can use in Python 3?
Berkeley DB XML does come with Python bindings. I ended up having to modify the SWIG interface files to get it to run with Python 3. If you are interested in building for a recent Python, you will need to make some modifications to the Python interface file. Specifically, you have to:
redefine PYSTR_* macros to use unicode strings
make changes to the initialization code to return the module
update the Python 3 iterator code to use __next__ via a %rename pragma
potentially add code for missing objects and changed interfaces, e.g. I added an XmlResultsIterator, and added some code to XmlManager to let me reindex containers.
You then need to regenerate the SWIG interface and recompile the module. I don't know Stack Overflow's policy on posting patches, but if it's allowed I'd be happy to post the patches that I created for dbxml 6.1.4 and Python 3.9 for you. Getting it all compiled is a little bit of work, but very doable.
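Once the bindings build, the migrated calls should look roughly like the sketch below. The container name and query are placeholders, and the exact signatures follow the C++ API, so check them against the generated module:

from dbxml import XmlManager

# hypothetical container name and XQuery expression
mgr = XmlManager()
container = mgr.openContainer("data.dbxml")
ctx = mgr.createQueryContext()

results = mgr.query("collection('data.dbxml')//record", ctx)
for value in results:        # iteration relies on the __next__ rename described above
    print(value.asString())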
Berkeley DB and Berkeley DB XML are two different products. My Python bindings (legacy "bsddb3" and current "berkeleydb") only interface with Berkeley DB.
I am not aware of any Python bindings for Berkeley DB XML.
I am a freelancer and do commercial contracts, if that option would be useful to you.
I am developing a script to save a stream of prices received via websocket to a local database, but I am facing a design dilemma.
The stream consists of 1-minute candlestick data for around 200 instruments, received via a cryptocurrency exchange websocket. When newly closed candles are received, I want to save them to a local DB.
I started building it in Python with a MySQL DB, but I am dubious about its feasibility.
Apologies for not posting code; this is more of a design/architecture dilemma.
Questions:
Can I save the messages directly to the DB, or would I need an intermediate step? (Performance concern.)
Is Python the best option, or should I look at JavaScript or some third-party software?
Am I totally out of my mind for building this? I have a strategy that needs historical data from many instruments, and calling the REST API is not possible as I would hit the rate limit, so I am looking at working with websockets.
Thank you in advance
1 - I don't know if it's the best idea to save 200 values per minute; could you store just the average, the highest and the lowest values?
If you want to store all values, you can use InfluxDB:
InfluxDB Cloud is the most powerful time series database as a service — free to start, easy to use, fast, serverless, elastic scalability.
2 - I think Python is appropriate for this use.
3 - If you can't use the REST API, a websocket can be a good idea.
When I had the opportunity to work with massive time-series data (IoT sensors), I used InfluxDB for storage and MQTT for communication.
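As a rough sketch of what writing one closed candle to InfluxDB could look like with the influxdb-client package (the URL, token, org, bucket and measurement names are placeholders):

from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

# one closed 1-minute candle for one instrument (made-up values)
point = (Point("candles")
         .tag("symbol", "BTCUSDT")
         .field("open", 26501.2).field("high", 26510.0)
         .field("low", 26495.5).field("close", 26503.1)
         .field("volume", 12.34))
write_api.write(bucket="market-data", record=point)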
Yes, you can. Either as a JSON string or directly into table columns, but that depends on the form of the data. If its format doesn't change, I would save it directly into the table. If the format can change (often), I would do both and save the raw data as a kind of fail-safe.
Python should work. Inserting data into a database is no problem in most languages. Use the one you have the most experience with, feel most comfortable with, or want to learn.
This kind of program is a good exercise for learning a new language or programming in general, and I don't think that 200 datasets per minute is too much to handle. So I don't think you're out of your mind; actually, most programmers I know have built something like this at some point.
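For the "directly into the table" variant, a minimal sketch using mysql-connector-python (credentials, table and column names are made up) might be:

import mysql.connector

conn = mysql.connector.connect(host="localhost", user="trader",
                               password="secret", database="market")
cur = conn.cursor()

# one closed candle received from the websocket (hypothetical dict)
candle = {"symbol": "BTCUSDT", "open_time": "2023-01-01 00:00:00",
          "open_price": 26501.2, "high_price": 26510.0,
          "low_price": 26495.5, "close_price": 26503.1, "volume": 12.34}

cur.execute(
    "INSERT INTO candles (symbol, open_time, open_price, high_price, low_price, close_price, volume) "
    "VALUES (%(symbol)s, %(open_time)s, %(open_price)s, %(high_price)s, %(low_price)s, %(close_price)s, %(volume)s)",
    candle)
conn.commit()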
I'm using Python (3.7.4) to make a text/password manager, where the user can store text under different tabs (using tkinter for the interface) and a "master login" is used to access all the data.
The only experience I've got with saving/storing data is using CSV files, and looping through them to get values.
Can anyone recommend any way I can store text so that it can't simply be opened from Windows Explorer and needs some sort of key to be opened?
The natural alternative to CSV files is a database. A solution like SQLite might be enough for your case. Directly from the documentation:
SQLite is a C library that provides a lightweight disk-based database
that doesn’t require a separate server process and allows accessing
the database using a nonstandard variant of the SQL query language.
Some applications can use SQLite for internal data storage. It’s also
possible to prototype an application using SQLite and then port the
code to a larger database such as PostgreSQL or Oracle.
Once you have learned SQLite, you can think about encrypting your database: in this post you will find many ideas for encrypting your SQLite database.
Otherwise, you can switch to other, more complete and complex DBMSs. The important thing is that you consider moving from CSV files to using a DB.
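One possible sketch of that idea, using the third-party cryptography package (an assumption on my part, not something from the linked post): encrypt each value before it goes into SQLite and decrypt it after reading.

import sqlite3
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # in a real app, derive/store this from the master login
fernet = Fernet(key)

conn = sqlite3.connect("vault.db")
conn.execute("CREATE TABLE IF NOT EXISTS secrets (name TEXT PRIMARY KEY, data BLOB)")

# encrypt before writing
conn.execute("INSERT INTO secrets VALUES (?, ?)",
             ("email", fernet.encrypt(b"hunter2")))
conn.commit()

# decrypt after reading
blob = conn.execute("SELECT data FROM secrets WHERE name = ?", ("email",)).fetchone()[0]
print(fernet.decrypt(blob).decode())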
Take a look at SQLite as a lightweight local database, and use something like SQLCipher for strong AES-256 encryption.
I haven't read it fully, but take a look at this blog for a Python implementation.
Could you tell me which is the best database-platform-independent Python driver, similar to PDO in PHP or JDBC in Java? Thank you in advance.
I believe "best" is a matter of your preferences, but I find SQLAlchemy Core convenient. It supports quite a few database dialects and offers an ORM layer that is optional to use. It's easy to swap databases without making any code changes (I'm running SQLite3 in-memory for my test suite but Oracle and Postgres in production). You also get connection pooling and other stuff for free.
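A minimal sketch of the "swap the database by changing one URL" point (the connection strings are placeholders):

from sqlalchemy import create_engine, text

# swap "sqlite+pysqlite:///:memory:" for e.g. "postgresql+psycopg2://user:pw@host/db"
engine = create_engine("sqlite+pysqlite:///:memory:")

with engine.begin() as conn:   # transaction is committed on exiting the block
    conn.execute(text("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"))
    conn.execute(text("INSERT INTO users (name) VALUES (:name)"), {"name": "alice"})
    rows = conn.execute(text("SELECT id, name FROM users")).all()
    print(rows)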
Python has an ODBC driver which might fit your needs, especially if you are already familiar with JDBC.
Furthermore, the Python database API specification (DB-API) defines a common interface for database modules, which might also help to abstract from the actual database implementation (I can't really tell how many implementations adhere to this standard, as I have only ever worked with the sqlite module, but the standard claims 'most').
A third option is mentioned here: using JDBC with Jython.
Which would be best for you depends on what you actually want to achieve, but from your comparison with JDBC, I suspect the first option, using the ODBC driver, might be the best fit.
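For illustration, the shared DB-API pattern looks much the same across compliant drivers; here it is with the built-in sqlite3 module. Only the connect() call and the parameter style really change between drivers:

import sqlite3   # any DB-API 2.0 driver exposes connect(), cursors, execute() and fetch*()

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (x INTEGER)")
cur.execute("INSERT INTO t (x) VALUES (?)", (42,))   # parameter style varies per driver
conn.commit()
print(cur.execute("SELECT x FROM t").fetchall())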