Creating a table in python - python

I want to build a table in python with three columns and later on fetch the values as necessary.
I am thinking dictionaries are the best way to do it, which has key mapping to two values.
|column1 | column 2 | column 3 |
| MAC | PORT NUMBER | DPID |
| Key | Value 1 | Value 2 |
proposed way :
// define a global learning table
globe_learning_table = defaultdict(set)
// add port number and dpid of a switch based on its MAC address as a key
// packet.src will give you MAC address in this case
globe_learning_table[packet.src].add(event.port)
globe_learning_table[packet.src].add(dpid_to_str(connection.dpid))
// getting value of DPID based on its MAC address
globe_learning_table[packket.src][????]
I am not sure if one key points to two values how can I get the particular value associated with that key.
I am open to use any another data structure as well, if it can build this dynamic table and give me the particular values when necessary.

Why a dictionary? Why not a list of named tuples, or a collection (list, dictionary) of objects from some class which you define (with attributes for each column)?
What's wrong with:
class myRowObj(object):
def __init__(self, mac, port, dpid):
self.mac = mac
self.port = port
self.dpid = dpid
myTable = list()
for each in some_inputs:
myTable.append(myRowObj(*each.split())
... or something like that?
(Note: myTable can be a list, or a dictionary or whatever is suitable to your needs. Obviously if it's a dictionary then you have to ask what sort of key you'll use to access these "rows").
The advantage of this approach is that your "row objects" (which you'd name in some way that made more sense to your application domain) can implement whatever semantics you choose. These objects can validate and convert any values supplied at instantiation, compute any derived values, etc. You can also define a string and code representations of your object (implicit conversions for when one of your rows is used as a string or in certain types of development and debugging or serialization (_str_ and _repr_ special methods, for example).
The named tuples (added in Python 2.6) are a sort of lightweight object class which can offer some performance advantages and lighter memory footprint over normal custom classes (for situations where you only want the named fields without binding custom methods to these objects, for example).

Something like this perhaps?
>>> global_learning_table = collections.defaultdict(PortDpidPair)
>>> PortDpidPair = collections.namedtuple("PortDpidPair", ["port", "dpid"])
>>> global_learning_table = collections.defaultdict(collections.namedtuple('PortDpidPair', ['port', 'dpid']))
>>> global_learning_table["ff:" * 7 + "ff"] = PortDpidPair(80, 1234)
>>> global_learning_table
defaultdict(<class '__main__.PortDpidPair'>, {'ff:ff:ff:ff:ff:ff:ff:ff': PortDpidPair(port=80, dpid=1234)})
>>>
Named tuples might be appropriate for each row, but depending on how large this table is going to be, you may be better off with a sqlite db or something similar.

If it is small enough to store in memory and you want it to be a data structure, you could create an class that contains Values 1 & 2 and use that as the value for your dictionary mapping.
However, as Mr E pointed out, it is probably better design to use a database to store the information and retrieve as necessary from there. This will likely not result in significant performance loss.

Another option to keep in mind is an in-memory SQLite table. See the Python SQLite docs for a basic example:
11.13. sqlite3 — DB-API 2.0 interface for SQLite databases — Python v2.7.5 documentation
http://docs.python.org/2/library/sqlite3.html

I think you're getting two distinct objectives mixed up. You want a representative data structure, and (as I read it) you want to print it in a readable form. What gets printed as a table is not stored internally in the computer in two dimensions; the table presentation is a visual metaphor.
Assuming I'm right about what you want to accomplish, the way I'd go about it is by a) keeping it simple and b) using the right modules to save effort.
The simplest data structure that correctly represents your information is in my opinion a dictionary within a dictionary. Like this:
foo = {'00:00:00:00:00:00': {'port':22, 'dpid':42},
'00:00:00:00:00:01': {'port':23, 'dpid':43}}
The best module I have found for quick and dirty table printing is prettytable. Your code would look something like this:
foo = {'00:00:00:00:00:00': {'port':22, 'dpid':42},
'00:00:00:00:00:01': {'port':23, 'dpid':43}}
t = PrettyTable(['MAC', 'Port', 'dpid'])
for row in foo:
t.add_row([row, foo[row]['port'], foo[row]['dpid']])
print t

Related

Which database to store very large nested Python dicts?

My script produces data in the following format:
dictionary = {
(.. 42 values: None, 1 or 2 ..): {
0: 0.4356, # ints as keys, floats as values
1: 0.2355,
2: 0.4352,
...
6: 0.6794
},
...
}
where:
(.. 42 values: None, 1 or 2 ..) is a game state
inner dict stores calculated values of actions which are possible in that state
The problem is that the state space is very big (millions of states), so the whole data stucture cannot be stored in memory. That's why I'm looking for a database engine which would fit my needs and I could use with Python. I need to get the list of actions and their values in the given state (previously mentioned tuple of 42 values) and to modify value of given action in given state.
Check out ZODB: http://www.zodb.org/en/latest/
It's natve object DB for Python that supports transactions, caching, pluggable layers, pack operations (for keeping history) and BLOBs.
You can use a key-value cache solution. A good one is Redis. It`s very fast and simple, written on the C and more over than just a key value cache. Integration with python just several lines of code. The redis is also can be scaled very easy for the really big data. I worked in the game industry and understand what I am talking about.
Also, as already mentioned here, you can use more complex solution, not a cache, the database PostgresSQL. Now it supports a JSON binary format field - JSONB. I think the best python database ORM is the SQLAlchemy. It supports PostgresSQL out of the box. I will use this one in my code block. For example, you have a table
class MobTable(db.Model):
tablename = 'mobs'
id = db.Column(db.Integer, primary_key=True)
stats = db.Column(JSONB, index=True, default={})
If your have a mob with such json stats
{
id: 1,
title: 'UglyOrk',
resists: {cold: 13}
}
You can search all mobs with the not null cold resists
expr = MobTable.stats[("resists", "cold")]
q = (session.query(MobTable.id, expr.label("cold_protected"))
.filter(expr != None)
.all())
I recommend you use HD5f. It's a data base format that works perfectly with Python (it is specifically developed for Python) and stores the data in binary format. This reduces the size of the data to be stored a great extent! More importantly it gives you the ability of random access which I believe serves for your purposes. Also, if you do not use any compression method you will retrieve the data with the highest possible speed.
You can also store it as JSONB in PostgreSQL DB.
For connecting with PostgreSQL you can use psycopg2, which is compliant with Python Database API Specification v2.0.

Is there a data structure that can get value by key and get key by value both in Python?

Title might be a bit confusing, so here's an example, assume you have a really big dict like this:
{"James": “20492”, “Mike": "292", "Tony": "11134", "Timmy": "3984", ... }
Let's say all keys and values are unique, there're no duplicates. And I want to get James by providing id 20492, or get 292 by providing Mike.
Well, besides for creating another "reverse" dict like this: {"20492": "James", ... }, what other choices(better be elegant) do I have?
The best data structure for this (short of a database) is a dictionary. You would have to implement a reverse dictionary as well.
If you want to go the database route, and your data set is not large, you can use sqlite (bundled with recent versions of Python) and use the special :memory: location to create an in-memory database. It should be quite fast.

Store dictionary in database

I create a Berkeley database, and operate with it using bsddb module. And I need to store there information in a style, like this:
username = '....'
notes = {'name_of_note1':{
'password':'...',
'comments':'...',
'title':'...'
}
'name_of_note2':{
#keys same as previous, but another values
}
}
This is how I open database
db = bsddb.btopen['data.db','c']
How do I do that ?
So, first, I guess you should open your database using parentheses:
db = bsddb.btopen('data.db','c')
Keep in mind that Berkeley's pattern is key -> value, where both key and value are string objects (not unicode). The best way in your case would be to use:
db[str(username)] = json.dumps(notes)
since your notes are compatible with the json syntax.
However, this is not a very good choice, say, if you want to query only usernames' comments. You should use a relational database, such as sqlite, which is also built-in in Python.
A simple solution was described by #Falvian.
For a start there is a column pattern in ordered key/value store. So the key/value pattern is not the only one.
I think that bsddb is viable solution when you don't want to rely on sqlite. The first approach is to create a documents = bsddb.btopen['documents.db','c'] and store inside json values. Regarding the keys you have several options:
Name the keys yourself, like you do "name_of_note_1", "name_of_note_2"
Generate random identifiers using uuid.uuid4 (don't forget to check it's not already used ;)
Or use a row inside this documents with key=0 to store a counter that you will use to create uids (unique identifiers).
If you use integers don't forget to pack them with lambda x: struct.pack('>q', uid) before storing them.
If you need to create index. I recommend you to have a look at my other answer introducting composite keys to build index in bsddb.

Create dictionary of a sqlalchemy query object in Pyramid

I am new to Python and Pyramid. In a test application I am using to learn more about Pyramid, I want to query a database and create a dictionary based on the results of a sqlalchemy query object and finally send the dictionary to the chameleon template.
So far I have the following code (which works fine), but I wanted to know if there is a better way to create my dictionary.
...
index = 0
clients = {}
q = self.request.params['q']
for client in DBSession.query(Client).filter(Client.name.like('%%%s%%' % q)).all():
clients[index] = { "id": client.id, "name": client.name }
index += 1
output = { "clients": clients }
return output
While learning Python, I found a nice way to create a list in a for loop statement like the following:
myvar = [user.name for user in users]
So, the other question I had: is there a similar 'one line' way like the above to create a dictionary of a sqlalchemy query object?
Thanks in advance.
well, yes, we can tighten this up a bit.
First, this pattern:
index = 0
for item in seq:
frobnicate(index, item)
item += 1
is common enough that there's a builtin function that does it automatically, enumerate(), used like this:
for index, item in enumerate(seq):
frobnicate(index, item)
but, I'm not sure you need it, Associating things with an integer index starting from zero is the functionality of a list, you don't really need a dict for that; unless you want to have holes, or need some of the other special features of dicts, just do:
stuff = []
stuff.extend(seq)
when you're only interested in a small subset of the attributes of a database entity, it's a good idea to tell sqlalchemy to emit a query that returns only that:
query = DBSession.query(Client.id, Client.name) \
.filter(q in Client.name)
In the above i've also shortened the .name.like('%%%s%%' % q) into just q in name since they mean the same thing (sqlalchemy expands it into the correct LIKE expression for you)
Queries constructed in this way return a special thing that looks like a tuple, and can be easily turned into a dict by calling _asdict() on it:
so to put it all together
output = [row._asdict() for row in DBSession.query(Client.id, Client.name)
.filter(q in Client.name)]
or, if you really desperately need it to be a dict, you can use a dict comprehension:
output = {index: row._asdict()
for index, row
in enumerate(DBSession.query(Client.id, Client.name)
.filter(q in Client.name))}
#TokenMacGuy gave a nice and detailed answer to your question. However, I have a feeling you've asked a wrong question :)
You don't need to convert SQLALchemy objects to dictionaries before passing them to the template - that would be quite inconvenient. You can pass the result of a query as is and directly use SQLALchemy mapped objects in your template
q = self.request.params['q']
clients = DBSession.query(Client).filter(q in Client.name).all()
return {'clients': clients}
If you want to turn a SqlAlchemy object into a dict, you can use this code:
def obj_to_dict(obj):
return dict((col.name, getattr(obj, col.name)) for col in sqlalchemy_orm.class_mapper(obj.__class__).mapped_table.c)
there is another attribute of the mapped table that has the relationships in it , but the code gets dicey.
you don't need to cast an object into a dict for any of the template libraries, but if you decide to persist the data ( memcached, session, pickle, etc ) you'll either need to use dicts or write some code to 'merge' the persisted data back into the session.
a quick side note- if you render any of this data through json , you'll either need to have a custom json renderer that can handle datetime objects , or change the values in a function.

Is it possible to save a list of values into a SQLite column?

I want 3 columns to have 9 different values, like a list in Python.
Is it possible? If not in SQLite, then on another database engine?
You must serialize the list (or other Python object) into a string of bytes, aka "BLOB";-), through your favorite means (marshal is good for lists of elementary values such as numbers or strings &c, cPickle if you want a very general solution, etc), and deserialize it when you fetch it back. Of course, that basically carries the list (or other Python object) as a passive "payload" -- can't meaningfully use it in WHERE clauses, ORDER BY, etc.
Relational databases just don't deal all that well with non-atomic values and would prefer other, normalized alternatives (store the list's items in a different table which includes a "listID" column, put the "listID" in your main table, etc). NON-relational databases, while they typically have limitations wrt relational ones (e.g., no joins), may offer more direct support for your requirement.
Some relational DBs do have non-relational extensions. For example, PostGreSQL supports an array data type (not quite as general as Python's lists -- PgSQL's arrays are intrinsically homogeneous).
Generally, you do this by stringifying the list (with repr()), and then saving the string. On reading the string from the database, use eval() to re-create the list. Be careful, though that you are certain no user-generated data can get into the column, or the eval() is a security risk.
Your question is difficult to understand. Here it is again:
I want 3 columns to have 9 different values, like a list in Python. Is it possible? If not in SQLite, then on another database engine?
Here is what I believe you are asking: is it possible to take a Python list of 9 different values, and save the values under a particular column in a database?
The answer to this question is "yes". I suggest using a Python ORM library instead of trying to write the SQL code yourself. This example code uses Autumn:
import autumn
import autumn.util
from autumn.util import create_table
# get a database connection object
my_test_db = autumn.util.AutoConn("my_test.db")
# code to create the database table
_create_sql = """\
DROP TABLE IF EXISTS mytest;
CREATE TABLE mytest (
id INTEGER PRIMARY KEY AUTOINCREMENT,
value INTEGER NOT NULL,
UNIQUE(value)
);"""
# create the table, dropping any previous table of same name
create_table(my_test_db, _create_sql)
# create ORM class; Autumn introspects the database to find out columns
class MyTest(autumn.model.Model):
db = my_test_db
lst = [3, 6, 9, 2, 4, 8, 1, 5, 7] # list of 9 unique values
for n in lst:
row = MyTest(value=n) # create MyTest() row instance with value initialized
row.save() # write the data to the database
Run this code, then exit Python and run sqlite3 my_test.db. Then run this SQL command inside SQLite: select * from mytest; Here is the result:
1|3
2|6
3|9
4|2
5|4
6|8
7|1
8|5
9|7
This example pulls values from one list, and uses the values to populate one column from the database. It could be trivially extended to add additional columns and populate them as well.
If this is not the answer you are looking for, please rephrase your request to clarify.
P.S. This example uses autumn.util. The setup.py included with the current release of Autumn does not install util.py in the correct place; you will need to finish the setup of Autumn by hand.
You could use a more mature ORM such as SQLAlchemy or the ORM from Django. However, I really do like Autumn, especially for SQLite.

Categories