Future Agnostic South Data Migrations

I've been developing a Django app with South for a while, doing a sort of loose continuous deployment. Very shortly after my initial migration, I wrote a couple of data migrations that looked like this:
def forwards(self, orm):
    from django.core.management import call_command
    call_command("loaddata", "#######.json")
At the time, I didn't think anything of it. It had been easy enough to populate the database manually and then dump it all into a fixture. Then when I finally wrote some unit tests, I started getting errors like this:
Creating test database for alias 'default'...
Problem installing fixture '/home/axel/Workspace/02_ereader_blast/content/fixtures/99_deals.json': Traceback (most recent call last):
File "/home/axel/Workspace/02_ereader_blast/venv/local/lib/python2.7/site-packages/django/core/management/commands/loaddata.py", line 196, in handle
obj.save(using=using)
File "/home/axel/Workspace/02_ereader_blast/venv/local/lib/python2.7/site-packages/django/core/serializers/base.py", line 165, in save
models.Model.save_base(self.object, using=using, raw=True)
File "/home/axel/Workspace/02_ereader_blast/venv/local/lib/python2.7/site-packages/django/db/models/base.py", line 551, in save_base
result = manager._insert([self], fields=fields, return_id=update_pk, using=using, raw=raw)
File "/home/axel/Workspace/02_ereader_blast/venv/local/lib/python2.7/site-packages/django/db/models/manager.py", line 203, in _insert
return insert_query(self.model, objs, fields, **kwargs)
File "/home/axel/Workspace/02_ereader_blast/venv/local/lib/python2.7/site-packages/django/db/models/query.py", line 1593, in insert_query
return query.get_compiler(using=using).execute_sql(return_id)
File "/home/axel/Workspace/02_ereader_blast/venv/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 910, in execute_sql
cursor.execute(sql, params)
File "/home/axel/Workspace/02_ereader_blast/venv/local/lib/python2.7/site-packages/django/db/backends/postgresql_psycopg2/base.py", line 52, in execute
return self.cursor.execute(query, args)
DatabaseError: Could not load content.BookDeal(pk=1): column "entry_id" of relation "content_bookdeal" does not exist
LINE 1: INSERT INTO "content_bookdeal" ("id", "book_id", "entry_id",...
^
Installed 19 object(s) from 1 fixture(s)
Problem installing fixture '/home/axel/Workspace/02_ereader_blast/content/fixtures/99_deals_entries.json': Traceback (most recent call last):
File "/home/axel/Workspace/02_ereader_blast/venv/local/lib/python2.7/site-packages/django/core/management/commands/loaddata.py", line 190, in handle
for obj in objects:
File "/home/axel/Workspace/02_ereader_blast/venv/local/lib/python2.7/site-packages/django/core/serializers/json.py", line 47, in Deserializer
raise DeserializationError(e)
DeserializationError: Entry has no field named 'book_deals'
As far as I can tell, the loaddata command is using my most recent models rather than the models as South had frozen them at the time of the migration. Because I have changed the models significantly since then, the old fixture data no longer matches them and is rejected as invalid.
So my questions are:
What is the best way to set up future data migrations so that this doesn't happen?
How can I backpedal out of this situation, and take it to best-practices land?

I found the solution, courtesy of this stackoverflow question:
django loading data from fixture after backward migration / loaddata is using model schema not database schema
As noted in the top answer, I used a snippet that very elegantly patches where the loaddata command gets its model.
Note that I did have to broaden my freezes for those data migrations, so that they could access all the models they needed from the orm rather than directly.
This feels like the right way to solve the problem.
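For readers who cannot follow the link, here is a minimal sketch of that kind of patch. It assumes a Django version of that era, in which django.core.serializers.python exposes a private helper named _get_model that loaddata uses to resolve model identifiers; the helper name frozen_model_lookup is made up for illustration, and the serializer module is passed in as a parameter so the sketch does not depend on a particular Django install.

```python
from contextlib import contextmanager


@contextmanager
def frozen_model_lookup(serializer_module, orm):
    """Temporarily route serializer_module._get_model through South's frozen orm.

    serializer_module is assumed to be django.core.serializers.python (or any
    object with a _get_model attribute); orm is the frozen orm that South
    passes to forwards() and backwards().
    """
    original_get_model = serializer_module._get_model

    def get_model_from_orm(model_identifier):
        # South's orm is indexed by "app_label.ModelName" strings.
        return orm[str(model_identifier)]

    serializer_module._get_model = get_model_from_orm
    try:
        yield
    finally:
        # Always restore the original lookup, even if loading the fixture fails.
        serializer_module._get_model = original_get_model


# Hypothetical use inside a South data migration:
#
# def forwards(self, orm):
#     from django.core.management import call_command
#     from django.core.serializers import python
#     with frozen_model_lookup(python, orm):
#         call_command("loaddata", "#######.json")
```

Because the fixture is now deserialized against the frozen models, every model the fixture touches has to be present in that migration's freeze, which is why the freezes had to be broadened.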

My approach is highly likely to be a complete hack / abuse of South / the worst kind of practice, I think. However, if you know that your Django models conform to your data tables, I might go for a fresh-start approach:
Rename (or delete) the relevant migrations folders.
In the south_migrationhistory table, remove all the entries for the apps you are trying to give a fresh start.
Run python manage.py convert_to_south app-name.
Everything should be good to go.
If the data tables are canonical and your Django models are out of line, the way I conform my models to the tables is to run:
python manage.py inspectdb > inspectdb.py
Now I can compare the two versions of the django code for my models to get them in alignment. This enables me to go through the fresh start sequence.

Related

What causes this Attribute Error encountered when implementing LangChain's OpenAI LLM wrapper?

This is my first post here. I'm building a Python window application with PyQt5 that implements interactions with the OpenAI completions endpoint. So far, any code that I've written myself has performed fine, and I was reaching the point where I wanted to start implementing long-term memory for conversational interactions. I started by just running my own chain of prompts for categorizing and writing topical subjects and summaries to text files, but I decided it best to try exploring open source options to see how the programming community is managing things. This led me to LangChain, which seems to have some popular support behind it and already implements many features that I intend.
However, I have not had even the tiniest bit of success with it yet. Even the most simple examples don't perform, regardless of what context I'm implementing it in (within a class, outside a class, in an asynchronous loop, to the console, to my text browsers within the main window, whatever) I always get the same error message.
The simplest possible example:
import os
from langchain.llms import OpenAI
from local import constants #For API key
os.environ["OPENAI_API_KEY"] = constants.OPENAI_API_KEY
davinci = OpenAI(model_name= 'text-davinci-003', verbose=True, temperature=0.6)
text = "Write me a story about a guy who is frustrated with Python."
print("Prompt: " + text)
print(davinci(text))
It capably instantiates the wrapper and prints the prompt to the console, but at any point a command is sent through the wrapper's functions to receive generated text, it encounters this AttributeError.
Here is the traceback:
Traceback (most recent call last):
File "D:\Dropbox\Pycharm Projects\workspace\main.py", line 16, in <module>
print(davinci(text))
File "D:\Dropbox\Pycharm Projects\workspace\venv\lib\site-packages\langchain\llms\base.py", line 255, in __call__
return self.generate([prompt], stop=stop).generations[0][0].text
File "D:\Dropbox\Pycharm Projects\workspace\venv\lib\site-packages\langchain\llms\base.py", line 128, in generate
raise e
File "D:\Dropbox\Pycharm Projects\workspace\venv\lib\site-packages\langchain\llms\base.py", line 125, in generate
output = self._generate(prompts, stop=stop)
File "D:\Dropbox\Pycharm Projects\workspace\venv\lib\site-packages\langchain\llms\openai.py", line 259, in _generate
response = self.completion_with_retry(prompt=_prompts, **params)
File "D:\Dropbox\Pycharm Projects\workspace\venv\lib\site-packages\langchain\llms\openai.py", line 200, in completion_with_retry
retry_decorator = self._create_retry_decorator()
File "D:\Dropbox\Pycharm Projects\workspace\venv\lib\site-packages\langchain\llms\openai.py", line 189, in _create_retry_decorator
retry_if_exception_type(openai.error.Timeout)
AttributeError: module 'openai.error' has no attribute 'Timeout'
I don't expect that there is a fault in the LangChain library, because it seems like nobody else has experienced this problem. I imagine I may have some dependency issue? Or I do notice that others using the LangChain library are doing so in a notebook development environment, and my lack of familiarity in that regard is making me overlook some fundamental expectation of the library's use?
Any advice is welcome! Thanks!
What I tried: I initially just replaced my own function for managing calls to the completion endpoint with one that issued the calls through LangChain's llm wrapper. I expected it to work as easily as my own code had, but I received that error. I then stripped everything apart layer by layer attempting to instantiate the wrapper at every scope of the program, then I attempted to make the calls in an asynchronous function through a loop that waited to completion, and no matter what, I always get that same error message.

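I think it is a mismatch between your installed versions of OpenAI and LangChain (and possibly Python). The traceback shows LangChain referencing openai.error.Timeout, which the openai package you have installed does not define. Maybe try upgrading Python and the openai package. I'm new to Python and these things, but hopefully this helps.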
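A small diagnostic sketch along those lines, using only the standard library. The helper names module_has_attribute and installed_version are made up for illustration; the script simply reports which versions are installed and whether openai.error exposes the Timeout attribute that the traceback complains about.

```python
import importlib


def module_has_attribute(module_name, attribute):
    """Return True/False if the module imports, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attribute)


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if unknown."""
    try:
        from importlib import metadata  # available from Python 3.8
    except ImportError:
        return None
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("openai version:", installed_version("openai"))
    print("langchain version:", installed_version("langchain"))
    print("openai.error.Timeout present:",
          module_has_attribute("openai.error", "Timeout"))
```

If the last line prints False or None, the installed openai package does not match what the installed LangChain release expects, and aligning the two versions is the place to start.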

'tensorflow' has no attribute 'space_to_depth' error with tensorflow 2.3 when running yad2k to generate model h5 file

I am trying to generate the YOLOv2 model file yolo.h5 so that I can load this pre-trained model. I am trying to port the Andrew Ng Coursera YOLO assignment (which runs in TensorFlow 1.x) to TensorFlow 2.3.
I was able to port it cleanly thanks to the TensorFlow upgrade guide (https://www.tensorflow.org/guide/upgrade), but little did I realize that I could not download the yolo.h5 file (either it gets corrupted or the download times out). I therefore thought I should build one, and I followed the instructions from https://github.com/JudasDie/deeplearning.ai/issues/2.
It looked pretty straightforward: I cloned the YAD2K repo and downloaded both yolo.weights and yolo.cfg.
I ran the following command as per the instructions:
python yad2k.py yolo.cfg yolo.weights model_data/yolo.h5
But I got the following error:-
Traceback (most recent call last):
_main(parser.parse_args())
File "yad2k.py", line 233, in _main
Lambda(
File "/home/sunny/miniconda3/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 925, in __call__
return self._functional_construction_call(inputs, args, kwargs,
File "/home/sunny/miniconda3/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1117, in _functional_construction_call
outputs = call_fn(cast_inputs, *args, **kwargs)
File "/home/sunny/miniconda3/lib/python3.8/site-packages/tensorflow/python/keras/layers/core.py", line 903, in call
result = self.function(inputs, **kwargs)
File "/home/sunny/YAD2K/yad2k/models/keras_yolo.py", line 32, in space_to_depth_x2
return tf.space_to_depth(x, block_size=2)
AttributeError: module 'tensorflow' has no attribute 'space_to_depth'
From all the discussions I found, I gathered that the above needs to run in TensorFlow 1.x. However, that puts me back where I started; I would love to stick with TensorFlow 2.3.
Wondering if someone can guide me here. Frankly, all I need to get going is a model h5 file, but I thought generating one would be a better learning experience than downloading one.

The above problem goes away when you upgrade all of the code under the yad2k repo (particularly yad2k.py and the Python files under the models folder) to TensorFlow 2.x. The upgrade utility provided by TensorFlow does the work for you by replacing the original call with the compatible tf.compat.v1.space_to_depth(input=x, block_size=...).
Therefore, for those planning to do the hard job of downgrading TensorFlow and Keras, I would recommend trying the TensorFlow upgrade utility first. It saves a lot of time.
This takes care of my model h5 file creation. My bad: I didn't think about it when I was asking the question.
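To make the rename concrete, here is a small sketch of a version-tolerant lookup. It is an illustration, not the code the upgrade utility generates: the helper resolve_space_to_depth is hypothetical, and the TensorFlow module is passed in as an argument so the lookup logic can be read (and exercised) without a particular TensorFlow install. The three locations it checks are tf.space_to_depth (1.x), tf.nn.space_to_depth (2.x) and tf.compat.v1.space_to_depth (the alias mentioned above).

```python
def resolve_space_to_depth(tf):
    """Return a space_to_depth callable from whichever location this tf module has."""
    if hasattr(tf, "space_to_depth"):
        return tf.space_to_depth  # TensorFlow 1.x top-level name
    nn = getattr(tf, "nn", None)
    if nn is not None and hasattr(nn, "space_to_depth"):
        return nn.space_to_depth  # TensorFlow 2.x location
    return tf.compat.v1.space_to_depth  # compatibility alias


def space_to_depth_x2(x, tf):
    """Variant of YAD2K's space_to_depth_x2 that takes the tf module explicitly."""
    return resolve_space_to_depth(tf)(x, block_size=2)


# Hypothetical use with a real install:
#
# import tensorflow as tf
# y = space_to_depth_x2(x, tf)
```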

Can't install trigger network automation tools

I read in the howto documentation to install Trigger, but when I test in python environment, I get the error below:
>>> from trigger.netdevices import NetDevices
>>> nd = NetDevices()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/trigger/netdevices/__init__.py", line 913, in __init__
with_acls=with_acls)
File "/usr/local/lib/python2.7/dist-packages/trigger/netdevices/__init__.py", line 767, in __init__
production_only=production_only, with_acls=with_acls)
File "/usr/local/lib/python2.7/dist-packages/trigger/netdevices/__init__.py", line 83, in _populate
# device_data = _munge_source_data(data_source=data_source)
File "/usr/local/lib/python2.7/dist-packages/trigger/netdevices/__init__.py", line 73, in _munge_source_data
# return loader.load_metadata(path, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/trigger/netdevices/loader.py", line 163, in load_metadata
raise RuntimeError('No data loaders succeeded. Tried: %r' % tried)
RuntimeError: No data loaders succeeded. Tried: [<trigger.netdevices.loaders.filesystem.XMLLoader object at 0x7f550a1ed350>, <trigger.netdevices.loaders.filesystem.JSONLoader object at 0x7f550a1ed210>, <trigger.netdevices.loaders.filesystem.SQLiteLoader object at 0x7f550a1ed250>, <trigger.netdevices.loaders.filesystem.CSVLoader object at 0x7f550a1ed290>, <trigger.netdevices.loaders.filesystem.RancidLoader object at 0x7f550a1ed550>]
Does anyone have some idea how to fix it?

The NetDevices constructor is apparently trying to find a "metadata source" that isn't there.
Firstly, you need to define the metadata. Second, your code should handle the exception where none is found.

I'm the lead developer of Trigger. Check out the doc Working with NetDevices. It is probably what you were missing. We've done some work recently to improve the quality of the setup/install docs, and I hope that this is clearer now!
If you want to get started super quickly, you can feed Trigger a CSV-formatted NetDevices file, like so:
test1-abc.net.example.com,juniper
test2-abc.net.example.com,cisco
Just put that in a file, e.g. /tmp/netdevices.csv and then set the NETDEVICES_SOURCE environment variable:
export NETDEVICES_SOURCE=/tmp/netdevices.csv
And then fire up python and continue on with your examples and you should be good to go!
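The same quick-start steps can be sketched from inside Python. This is an illustration under one assumption: that Trigger reads NETDEVICES_SOURCE from the environment when it is first imported, so the variable has to be set before the import. The helper write_netdevices_csv is made up for this example; only the CSV layout and the variable name come from the steps above.

```python
import os
import tempfile


def write_netdevices_csv(rows, directory=None):
    """Write (hostname, vendor) pairs as CSV lines and return the file path."""
    fd, path = tempfile.mkstemp(suffix=".csv", dir=directory)
    with os.fdopen(fd, "w") as handle:
        for hostname, vendor in rows:
            handle.write("%s,%s\n" % (hostname, vendor))
    return path


devices = [
    ("test1-abc.net.example.com", "juniper"),
    ("test2-abc.net.example.com", "cisco"),
]
csv_path = write_netdevices_csv(devices)

# Assumption: this must happen before "from trigger.netdevices import NetDevices".
os.environ["NETDEVICES_SOURCE"] = csv_path
```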

I found that the default of /etc/trigger/netdevices.xml wasn't listed in the setup instructions. It did indicate to copy from the trigger source folder:
cp conf/netdevices.json /etc/trigger/netdevices.json
However, I didn't see on the installation page how to specify this file instead of the default NETDEVICES_SOURCE. As soon as I had a file in my /etc/trigger folder that NETDEVICES_SOURCE pointed to, it worked.
I recommend this to get the verifying functionality examples to work right away with minimal fuss:
cp conf/netdevices.xml /etc/trigger/netdevices.xml
Using Ubuntu 14.04 with Python 2.7.3

Threaded Sessions expiring on SQLAlchemy?

This is difficult to describe or show much code for, but I'll try. Essentially I have a multi-threaded desktop app that will frequently handle the adding/removing/changing of tables in threads. From what I read, I should use scoped_session and pass that around to the various threads to do the work (I think?). Here're some basic code examples:
class SQL():
    def __init__(self):
        self.db = create_engine('mysql+mysqldb://thesqlserver')
        self.metadata = MetaData(self.db)
        self.SessionObj = scoped_session(sessionmaker(bind=self.db, autoflush=True))

db = SQL()
session = db.SessionObj()
someObj = Obj(val, val2)
session.add(someObj)
session.commit()
The above class is what I'm using as the general access of SQL stuff. After creating a new session, performing a query and update/add to it, upon the session.commit(), I get the following error:
Traceback (most recent call last):
File "core\taskHandler.pyc", line 42, in run
File "core\taskHandler.pyc", line 184, in addTasks
File "core\sqlHandler.pyc", line 35, in commit
File "sqlalchemy\orm\session.pyc", line 624, in rollback
File "sqlalchemy\orm\session.pyc", line 338, in rollback
File "sqlalchemy\orm\session.pyc", line 369, in _rollback_impl
File "sqlalchemy\orm\session.pyc", line 239, in _restore_snapshot
File "sqlalchemy\orm\state.pyc", line 252, in expire
AttributeError: 'NoneType' object has no attribute 'expire'
Then, the next time another SQL operation is attempted:
Traceback (most recent call last):
File "core\taskHandler.pyc", line 44, in run
File "core\taskHandler.pyc", line 196, in deleteTasks
File "sqlalchemy\orm\query.pyc", line 2164, in scalar
File "sqlalchemy\orm\query.pyc", line 2133, in one
File "sqlalchemy\orm\query.pyc", line 2176, in __iter__
File "sqlalchemy\orm\query.pyc", line 2189, in _execute_and_instances
File "sqlalchemy\orm\query.pyc", line 2180, in _connection_from_session
File "sqlalchemy\orm\session.pyc", line 729, in connection
File "sqlalchemy\orm\session.pyc", line 733, in _connection_for_bind
File "sqlalchemy\orm\session.pyc", line 249, in _connection_for_bind
File "sqlalchemy\orm\session.pyc", line 177, in _assert_is_active
sqlalchemy.exc.InvalidRequestError: This Session's transaction has been rolled back by a nested rollback() call. To begin a new transaction, issue Session.rollback() first.
That's about as much as I know and I think the best I can describe. Any ideas on what I'm supposed to be doing here? It's all mud to me. Thanks in advance!

The funny part is, you missed the most critical part of the answer you "ripped the code from", which is that there is a Python function in the middle, which is executing some abstract operation (it's labeled as func()). That code illustrates a transactional wrapper for a function, and in the above example you instead have an object method called commit() that isn't otherwise calling upon any additional operations with the Session.
Here you have kind of a session-holding object called SQL() that is not really adding any usefulness to your program and makes it needlessly complicated, and is probably also the source of the issue. Unless your application intends to connect to many different databases at different times, and use SQL() objects to represent that state, there's not much use in building a class called "SQL" that has an "engine" stuck onto it. Just stick the engine in a module somewhere, as well as your scoped_session().
The engine and scoped_session represent a pattern called the factory pattern - they are objects that create some other useful object, in this case scoped_session creates a Session, and the Engine is used internally by the Session to create a Connection with which to talk to the database. It doesn't make much sense to place the Session object as a sibling member along with Engine and scoped_session - you'd be carrying around either the factories (the Engine and scoped_session), or the object itself that they create (the Session), which all depends on what you're trying to do.
The Session itself, remember here we're talking about the thing the factories create (Session), not the factories themselves (Engine and scoped_session), is not in the least bit thread safe. It is something you usually create only local to a function - it shouldn't be global, and if you're in fact using a single SQL() object across threads that's probably the problem here. The actual error you're getting, I'm not really sure what that is and I could only have a better clue if I knew the exact version of SQLAlchemy in use here, though the randomness of the error suggests that you have some kind of threading issue where something is becoming None in one thread as another expects that same object to be present.
So what you need to establish in this program is when exactly a particular thread of execution begins, what it needs to do with the database as it proceeds, and then when it ends. When you can establish a consistent pattern for that, you would then link a single Session to this thread, which goes for the lifespan of that thread, and is never shared. All the objects which are produced by this session must also not be shared to other threads - they are extensions of the Session's state. If you have "worker threads" in use, those worker threads should load up their own data as needed, within their own Session. The Session represents a live database transaction and you generally want transactions local to a single thread.
As this is not a web application you might want to forego the usage of scoped_session, unless you do in fact have a place for a thread-local pattern to be used.
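A minimal sketch of the pattern described above: each thread opens its own Session, does its work, and commits or rolls back before the thread ends. The names session_scope, run_in_own_session and start_worker are illustrative, and session_factory stands for whatever creates your sessions (for example, a module-level sessionmaker bound to the engine); nothing here is specific to a SQLAlchemy version beyond the commit(), rollback() and close() methods of Session.

```python
import threading
from contextlib import contextmanager


@contextmanager
def session_scope(session_factory):
    """Create one session, commit on success, roll back on error, always close."""
    session = session_factory()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()


def run_in_own_session(session_factory, work):
    """Run work(session) inside a session that belongs to the calling thread only."""
    with session_scope(session_factory) as session:
        work(session)


def start_worker(session_factory, work):
    """Start a thread that creates, uses, and disposes of its own session."""
    thread = threading.Thread(target=run_in_own_session,
                              args=(session_factory, work))
    thread.start()
    return thread
```

The session and any objects loaded through it stay inside work(); a thread that needs the same rows loads them again in its own session rather than receiving them from another thread.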

Deleting an object from an SQLAlchemy session before it's been persisted

My application allows users to create and delete Site objects. I have implemented this using session.add() and session.delete(). I then have 'Save' and 'Reset' buttons that call session.commit() and session.rollback().
If I add a new Site, then save/commit it, and then delete it, everything goes OK. However, if I try to remove an object from the session before it's been saved, I get a 'not persisted' error.
Code:
self.newSite = Site('foo')
self.session.add(self.newSite)
print self.session.new
self.session.delete(self.newSite)
Output:
IdentitySet([<Site('foo')>])
Traceback (most recent call last):
File "C:\Program Files\Eclipse\dropins\plugins\org.python.pydev.debug_2.2.1.2011071313\pysrc\pydevd_comm.py", line 744, in doIt
result = pydevd_vars.evaluateExpression(self.thread_id, self.frame_id, self.expression, self.doExec)
File "C:\Program Files\Eclipse\dropins\plugins\org.python.pydev.debug_2.2.1.2011071313\pysrc\pydevd_vars.py", line 375, in evaluateExpression
result = eval(compiled, updated_globals, frame.f_locals)
File "<string>", line 1, in <module>
File "C:\Python27\Lib\site-packages\sqlalchemy\orm\session.py", line 1245, in delete
mapperutil.state_str(state))
InvalidRequestError: Instance '<Site at 0x1ed5fb0>' is not persisted
I understand what's happening here, but I'm not sure what I should be doing instead.
Is there some other method of removing a not-yet-persisted object from a session? Or should I be calling session.flush() before attempting a deletion, in case the object I want to delete hasn't been flushed yet?
If it's the latter, how come session.query() auto-flushes (ensuring that pending objects show up in the query results), but session.delete() doesn't (which would ensure that pending objects can be deleted without error).

You can Session.expunge() it. I think the rationale with delete() being that way is, it worries you're not keeping track of things if you send it a pending. But I can see the other side of the story on that, I'll think about it. Basically the state implied by delete() includes some assumptions of persistence but they're probably not as significant as I'm thinking. An "expunge or delete" method then comes to mind, which is funny that's basically the "save or update" we originally copied from Hibernate, which just became "add". "add" can do the transitions of transient->pending as well as detached->persistent - would a potential "remove()" do both pending->transient and persistent->deleted ? too bad the scoped session already has "remove()"....
Session.query() autoflushes because it's about to go out to the database to emit some SQL to get some rows; so whatever you have locally needs to go out first. delete() just marks the state of an object so there's no need to invoke any SQL. If we wanted delete() to work on a pending, we'd just change that assertion.
Interestingly, if you rollback() the session, whatever you've add()'ed within that session, whether or not it got flushed, is expunged.
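The "expunge or delete" idea can be sketched as a small helper in application code. This is an illustration, not part of SQLAlchemy: remove_from_session is a made-up name, and it relies only on session.new (the collection of pending objects printed in the question), session.expunge() and session.delete().

```python
def remove_from_session(session, obj):
    """Expunge a pending object; mark a persistent one for deletion.

    Returns a short string describing which action was taken.
    """
    if obj in session.new:
        # Pending: added but never flushed, so there is no row to delete.
        session.expunge(obj)
        return "expunged"
    # Otherwise leave it to delete(), which still raises for objects
    # that are neither pending nor persistent.
    session.delete(obj)
    return "deleted"


# Hypothetical use with the code from the question:
#
# self.newSite = Site('foo')
# self.session.add(self.newSite)
# remove_from_session(self.session, self.newSite)
```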
