How can I map ontology components to a relational database? - python

I already have an OWL ontology which contains classes, instances and object properties. How can I map them to a relational database such as MySQL using Python as the programming language (I prefer Python)?
For example, an ontology can contain the classes "Country" and "City" and instances like "United States" and "NYC".
So I need to store them in relational database tables. I would like to know if there are Python libraries to do so.

If I understand you well, I think you could use SQLite with Python. SQLite is great because you just have to import the library with:
import sqlite3
And then there is no need for a server. Things are stored in a file, generally ending with .db
Have a look at the docs, the examples are helpful: https://docs.python.org/2/library/sqlite3.html
EDIT: To review or create your database and tables, I advise you to use sqlitebrowser, which is light and easy to use: http://sqlitebrowser.org/
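For instance, here is a minimal sketch of one possible relational layout; the three-table idea and all names are illustrative assumptions, not a standard OWL-to-SQL mapping:

import sqlite3

# Illustrative schema: one table per ontology notion (classes, instances);
# an object_properties table could follow the same pattern.
conn = sqlite3.connect("ontology.db")
cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS classes (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("""CREATE TABLE IF NOT EXISTS instances (
                   id INTEGER PRIMARY KEY,
                   name TEXT,
                   class_id INTEGER REFERENCES classes(id))""")

# Store the example from the question: class "Country", instance "United States"
cur.execute("INSERT INTO classes (name) VALUES (?)", ("Country",))
cur.execute("INSERT INTO instances (name, class_id) VALUES (?, ?)",
            ("United States", cur.lastrowid))
conn.commit()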

Use the right tool for the job. You're using RDF; the fact that it carries OWL axioms is immaterial, and you want to store and query it. Use an RDF database: they're optimized for storing and querying RDF. It's a waste of your time to home-grow storage and querying in MySQL when other folks have already figured out how best to do this.
As an aside, there is a way to map RDF to a relational database. There's a formal specification for this; it's called R2RML.
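If you do end up rolling your own mapping, a hedged sketch with rdflib (pip install rdflib) shows how to pull classes and their instances out of the OWL file; how the triples become tables is your schema design, not something rdflib dictates:

from rdflib import Graph
from rdflib.namespace import RDF, OWL

g = Graph()
g.parse("ontology.owl")  # rdflib guesses the serialization; pass format="xml" if needed

# Every owl:Class, and every resource typed as an instance of it
for cls in g.subjects(RDF.type, OWL.Class):
    for inst in g.subjects(RDF.type, cls):
        print("%s is an instance of %s" % (inst, cls))

These pairs could then feed INSERT statements like the SQLite example above.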

Related

Python ORM - save or read sql data from/to files

I'm completely new to managing data using databases so I hope my question is not too stupid but I did not find anything related using the title keywords...
I want to set up a SQL database to store computation results; these are performed using a Python library. My idea was to use a Python ORM like SQLAlchemy or peewee to store the results to a database.
However, the computations are done by several people on many different machines, including some that are not directly connected to the internet: it is therefore impossible to simply use one common database.
What would be useful to me would be a way of saving the data in the ORM's format to be able to read it again directly once I transfer the data to a machine where the main database can be accessed.
To summarize, I want to do:
On the 1st machine: Python data -> ORM object -> ORM.fileformat
After transfer on a connected machine: ORM.fileformat -> ORM object -> SQL database
Would anyone know if existing ORMs offer that kind of feature?
Is there a reason why some of the machines cannot be connected to the internet?
If you really can't connect them, what I would do is set up a database and the Python app on each machine where data is collected or generated. Have each machine use the app to store into its own local database; later you can create a dump of each database and import those results into one central database.
Not the ideal solution but it will work.
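For example, with SQLite the dump step is built in; a sketch (file names are made up, and for MySQL/Postgres you would use mysqldump/pg_dump instead):

import sqlite3

# On the collecting machine: dump the local database to a plain SQL script
local = sqlite3.connect("local_results.db")
with open("results_dump.sql", "w") as f:
    for line in local.iterdump():
        f.write(line + "\n")

# Later, on the connected machine: replay the dump into the central database
central = sqlite3.connect("central.db")
with open("results_dump.sql") as f:
    central.executescript(f.read())
central.commit()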
OK, thanks to MAhsan's and Padraic's answers I was able to find out how this can be done: the CSV format is indeed easy to use for import/export from a database.
Here are examples for SQLAlchemy (import 1, import 2, and export) and peewee
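For the record, a minimal sketch of the CSV round-trip with SQLAlchemy; the Result model and its columns are illustrative:

import csv
from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Result(Base):  # illustrative model
    __tablename__ = "results"
    id = Column(Integer, primary_key=True)
    value = Column(String)

# Machine without access: write ORM objects out as CSV
def export_results(session, path):
    with open(path, "w") as f:
        writer = csv.writer(f)
        for r in session.query(Result):
            writer.writerow([r.id, r.value])

# Connected machine: read the CSV back into the main database
def import_results(session, path):
    with open(path) as f:
        for row in csv.reader(f):
            session.merge(Result(id=int(row[0]), value=row[1]))
    session.commit()

merge() rather than add() avoids primary-key collisions if the same dump is imported twice.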

Storing unstructured data with ramses to be searched with Ramses-API?

I would like to give my users the possibility to store unstructured data in JSON-Format, alongside the structured data, via an API generated with Ramses.
Since the data is made available via Elasticsearch, I am trying to make this data indexed and searchable, too.
I can't find any mention of this in the docs or by searching.
Would this be possible and how would one do it?
Cheers /Carsten
I put an answer here because I needed to give several docs links and this is a new SO account limited to a couple: https://gitter.im/ramses-tech/ramses?at=56bc0c7a4dfe1fa71ffc0b61
This is Chris's answer, copied from gitter.im:
You can use the dict field type for "unstructured data", as it takes arbitrary json. If the db engine is postgres, it uses jsonfield under the hood, and if the db engine is mongo, it's converted to a bson document as usual. Either way it should index automatically as expected in ES and will be queryable through the Ramses API.
The following ES queries are supported on documents/fields: nefertari.readthedocs.org/en/stable/making_requests.html#query-syntax-for-elasticsearch
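For example (hypothetical endpoint, port, and field names; whether this exact nested settings.theme syntax works is an assumption to verify against the query-syntax docs above):

import requests

# Query a nested key of a dict field through the generated Ramses API
resp = requests.get("http://localhost:6543/api/users",
                    params={"settings.theme": "dark"})
print(resp.json())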
See the docs for field types here, start at the high level (ramses) and it should "just work", but you can see what the code is mapped to at each level below down to the db if desired:
ramses: ramses.readthedocs.org/en/stable/fields.html
nefertari (underlying web framework): nefertari.readthedocs.org/en/stable/models.html#wrapper-api
nefertari-sqla (postgres-specific engine): nefertari-sqla.readthedocs.org/en/stable/fields.html
nefertari-mongodb (mongo-specific engine): nefertari-mongodb.readthedocs.org/en/stable/fields.html
Let us know how that works out, sounds like it could be a useful thing. So far we've just used that field type to hold data like user settings that the frontend wants to persist but that the API isn't otherwise concerned with.

Can I query a YAML dataset in Python?

Similar to Is there a query language for JSON? and the more specific How can I filter a YAML dataset with an attribute value? - I would like to:
hand-edit small amounts of data in YAML files
perform arbitrary queries on the complete dataset (probably in Python, open to other ideas)
work with the resulting subset in Python
It doesn't appear that PyYAML has a feature like this, and today I can't find the link I had to the YQuery language, which wasn't a mature project anyway (or maybe I dreamt it).
Is there a (Python) library that offers YAML queries? If not, is there a Pythonic way to "query" a set of objects other than just iterating over them?
I don't think there is a direct way to do it. But PyYAML reads YAML files into a dict representing everything in the file. Afterwards you can perform all dict-related operations. The question python query keys in a dictionary based on values mentions some Pythonic "query" styles.
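For example, a small sketch (the file name and keys are made up):

import yaml

with open("dataset.yml") as f:
    data = yaml.safe_load(f)  # here, assume the file holds a list of dicts

# "query" with a plain comprehension, e.g. all records whose status is active
active = [rec for rec in data if rec.get("status") == "active"]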
bootalchemy provides a means to do this via SQLAlchemy. First, define your schema in a SQLAlchemy model. Then load your YAML into a SQLAlchemy session using bootalchemy. Finally, perform queries on that session. (You don't have to commit the session to an actual database.)
Example from the PyPI page (assume model is already defined):
from bootalchemy.loader import Loader

# (this simulates loading the data from YAML)
data = [
    {'Genre': [
        {'name': "action",
         'description': 'Car chases, guns and violence.'}
    ]}
]

# load YAML data into session using pre-defined model
loader = Loader(model)
loader.from_list(session, data)

# query SQLAlchemy session
genres = session.query(Genre).all()

# print results
print [(genre.name, genre.description) for genre in genres]
Output:
[('action', 'Car chases, guns and violence.')]
You could try to use jsonpath. Yes, that's meant for JSON, not YAML, but as long as you have JSON-compatible data structures this should work, because you're working on the parsed data, not on the JSON or YAML representation. (It seems to work with the Python libraries jsonpath and jsonpath-rw.)
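A quick sketch of that combination, using PyYAML plus jsonpath-rw (pip install jsonpath-rw); the document shape and the path expression are illustrative:

import yaml
from jsonpath_rw import parse

doc = yaml.safe_load("""
store:
  books:
    - {title: A, price: 5}
    - {title: B, price: 15}
""")

# jsonpath runs on the parsed dict/list structure, not on the YAML text
expr = parse("store.books[*].title")
print([match.value for match in expr.find(doc)])  # -> ['A', 'B']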
You can check the following tools:
yq for CLI queries, like with jq,
yaml-query another CLI query tool written in Python.

php DAL out of a schema

I developed a web platform in PHP a year ago, and I was kinda proud of the data access layer I wrote for it. Since then, I started re-using the same concept over and over. But now I'm thinking of taking it to the next level: instead of re-writing the whole database access code, I'd like to create a tool that will parse my SQL schema and generate the DAL classes by itself.
The information needed from the SQL schema in order to generate the code is:
* Tables
* Fields
* Fields types
* Foreign keys
Indeed, I looked for SQL parsers and found some, but I ended up deciding to do this differently. Instead of generating the code from the SQL schema itself, I'd generate it from metadata that I'd create according to the database's real schema.
I thought of something like:
TableName[
FieldA : Type;
FieldB: Type;
]
TableName2[
FieldA : Type, FK(TableName.FieldA);
FieldZ: Type;
]
This is not a spec at all, it's just a quick thinking result that says what kind of stuff I'd like to achieve.
The question now is:
Does python have some built-in API, or maybe some 3rd party library I could use to parse some format that'd let me define my schema as stated above?
I don't want to reinvent the wheel, and I'm not interested at all in writing my own parser, all I want is getting a basic and working tool ASAP.
Thanks
The immediate thought would be to simply use regular Python syntax to define your tables:
{
    'TableName': {'FieldA': ['Type', FK(..)], 'FieldB': ['type']}
}
and so on.
You could however have a look at how django does it: you define a class and add properties to that class, which will then represent your model. This model can then be used to generate the SQL statements, and is also valid - and easily extendable - Python code.
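A framework-free sketch of that class-based idea (all names are illustrative):

class Field(object):
    def __init__(self, type_, fk=None):
        self.type = type_
        self.fk = fk

class TableName2(object):
    FieldA = Field("int", fk="TableName.FieldA")
    FieldZ = Field("varchar(255)")

# the generator introspects the class attributes to emit SQL or DAL code
for name, field in vars(TableName2).items():
    if isinstance(field, Field):
        print("%s: %s (fk=%s)" % (name, field.type, field.fk))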
Other suggestions could be to use a JSON structure to represent your data, and then write some code to parse it. This would be similar to using the existing Python syntax, but would be easier to parse in other languages (the example given above is almost valid JSON out of the box; replace ' with ").
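A sketch of that JSON variant (the schema layout here is hypothetical, not a standard):

import json

schema = json.loads("""
{
  "TableName": {
    "FieldA": {"type": "int"},
    "FieldB": {"type": "varchar(255)"}
  },
  "TableName2": {
    "FieldA": {"type": "int", "fk": "TableName.FieldA"}
  }
}
""")

for table, fields in schema.items():
    for field, spec in fields.items():
        print("%s.%s: %s (fk=%s)" % (table, field, spec["type"], spec.get("fk")))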

how to make table partitions?

I am not very familiar with databases, and so I do not know how to partition a table using SQLAlchemy.
Your help would be greatly appreciated.
There are two kinds of partitioning: Vertical Partitioning and Horizontal Partitioning.
From the docs:
Vertical Partitioning
Vertical partitioning places different kinds of objects, or different tables, across multiple databases:
engine1 = create_engine('postgres://db1')
engine2 = create_engine('postgres://db2')
Session = sessionmaker(twophase=True)
# bind User operations to engine 1, Account operations to engine 2
Session.configure(binds={User:engine1, Account:engine2})
session = Session()
Horizontal Partitioning
Horizontal partitioning partitions the rows of a single table (or a set of tables) across multiple databases. See the "sharding" example in attribute_shard.py.
Just ask if you need more details on those, preferably describing what you want to do.
It's quite an advanced subject for somebody not familiar with databases, but try Essential SQLAlchemy (you can read the key parts on Google Book Search -- p 122 to 124; the example on p. 125-126 is not freely readable online, so you'd have to purchase the book or read it on commercial services such as O'Reilly's Safari -- maybe on a free trial -- if you want to read the example).
Perhaps you can get better answers if you mention whether you're talking about vertical or horizontal partitioning, why you need partitioning, and what underlying database engines you are considering for the purpose.
Automatic partitioning is a very database engine specific concept and SQLAlchemy doesn't provide any generic tools to manage partitioning. Mostly because it wouldn't provide anything really useful while being another API to learn. If you want to do database level partitioning then do the CREATE TABLE statements using custom Oracle DDL statements (see Oracle documentation how to create partitioned tables and migrate data to them). You can use a partitioned table in SQLAlchemy just like you would use a normal table, you just need the table declaration so that SQLAlchemy knows what to query. You can reflect the definition from the database, or just duplicate the table declaration in SQLAlchemy code.
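A minimal sketch of the reflection approach (the connection string and table name are made up; it assumes the partitioned table was already created with custom DDL):

from sqlalchemy import create_engine, MetaData, Table

engine = create_engine("oracle://user:password@dsn")  # hypothetical
metadata = MetaData()

# Reflect the existing partitioned table; from here on you query it
# exactly like any ordinary table.
measurements = Table("measurements", metadata,
                     autoload=True, autoload_with=engine)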
Very large datasets are usually time-based, with older data becoming read-only or read-mostly and queries usually only look at data from a time interval. If that describes your data, you should probably partition your data using the date field.
There's also application level partitioning, or sharding, where you use your application to split data across different database instances. This isn't all that popular in the Oracle world due to the exorbitant pricing models. If you do want to use sharding, then look at SQLAlchemy documentation and examples for that, for how SQLAlchemy can support you in that, but be aware that application level sharding will affect how you need to build your application code.
