Django 1.7 dumpdata on Windows scrambles unicode characters - python

I use manage.py dumpdata --format xml --some-more-parameters to export a full dump of the database to xml. The database is MS sql server and I'm using pyodbc as the driver. The dumpdata command is run using PowerShell and since Django 1.7 does not support a --output argument for the dumpdata command I redirect the output into a file using PowerShell.
Unfortunately the database contains unicode characters (e.g. the country \xd6sterreich) and these characters are scrambled in the export file.
Here's what didn't work:
./manage.py dumpdata --format xml > export.xml
./manage.py dumpdata --format xml | out-file -encoding utf8 export.xml
./manage.py dumpdata --format xml | out-file -encoding ANY_OTHER_SUPPORTED_ENCODING export.xml
None of these commands work. Umlauts and accents are scrambled, and additionally the > export.xml method adds an invalid BOM to the file, which causes ./manage.py loaddata export.xml to abort with a UnicodeDecodeError when I try to import the dump on another host.
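For reference, a file that already carries a UTF-8 BOM can be repaired by round-tripping it through Python's utf-8-sig codec, which silently consumes a leading BOM. A minimal sketch (strip_bom is a hypothetical helper, not part of Django):
```python
def strip_bom(path):
    # utf-8-sig decodes the file and drops a leading BOM if one is present
    with open(path, encoding='utf-8-sig') as f:
        data = f.read()
    # write back as plain UTF-8, without a BOM
    with open(path, 'w', encoding='utf-8') as f:
        f.write(data)
```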
Any suggestions on how I could export the data and preserve the special characters? The same problem exists when using the json or yaml serializers.

I was able to work around this problem using my own export script. The script below will dump the data and store it in a utf-8 encoded xml file called export_CURRENT-DATE-TIME.xml. call_command() calls the dumpdata command in Django. The script below should be equivalent to using dumpdata with the following arguments:
./manage.py dumpdata --natural --natural-foreign --natural-primary --format xml --indent 2
import sys
import codecs
import os
import django
from django.core.management import call_command
from StringIO import StringIO
from datetime import datetime

# set up access to Django
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "PROJECT_NAME.settings")
django.setup()

# the actual export command
def do_work():
    # print(u"\xd6sterreich")
    call_command('dumpdata', use_natural_keys=True, use_natural_foreign_keys=True,
                 use_natural_primary_keys=True, format='xml', indent=2)

# nasty hack to work around encoding issues on Windows
_stdout = sys.stdout
sys.stdout = StringIO()
do_work()
value = sys.stdout.getvalue().decode('utf-8')
sys.stdout = _stdout

with codecs.open('export_{}.xml'.format(datetime.now().strftime("%Y-%m-%d_%H-%M")), 'w', 'utf-8-sig') as f:
    f.write(value)
print("export completed")
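On Python 3 the same stdout-swapping trick works with io.StringIO, and newer Django versions avoid the swap entirely by accepting a stdout argument to call_command (or the --output option of dumpdata). A minimal sketch of the capture idiom in pure Python, with print standing in for call_command:
```python
import io
import sys

def capture_stdout(func):
    """Run func() with sys.stdout swapped for an in-memory text buffer,
    returning whatever was printed; sys.stdout is always restored."""
    old_stdout = sys.stdout
    sys.stdout = io.StringIO()
    try:
        func()
        return sys.stdout.getvalue()
    finally:
        sys.stdout = old_stdout

# print() stands in here for call_command('dumpdata', ...)
xml = capture_stdout(lambda: print("\xd6sterreich"))
```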

Related

MYSQL Database Copy Using Python/Django

I need to create a copy of a database on my MySQL server from a Django application.
After a little research, I found mysqldump to be the best approach:
backup_file_path = f"/tmp/{src_database_name}_backup.sql"
backup_db_command = f"mysqldump -h {SQL_DB_HOST} -P 3306 -u {SQL_DB_USER} -p{SQL_DB_PASSWORD} {src_database_name} > {backup_file_path}"
print(backup_db_command)  # TODO: remove
with os.popen(backup_db_command, "r") as p:
    r = p.read()
    print(f"Backup Output: {r}")
restore_command = f"mysql -u root -p{SQL_DB_PASSWORD} {dest_database_name} < {backup_file_path}"
with os.popen(restore_command, "r") as p:
    r = p.read()
    print(f"Restore Output: {r}")
My Queries:
Are there any issues with this approach?
Are there better ways to copy a database using either Python or the Django ORM?
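On the first query: interpolating the password into a shell string exposes it to the process list (ps) and to quoting bugs. A hedged sketch of the same dump using subprocess without a shell (dump_argv and run_backup are hypothetical helpers; mysqldump must be on PATH):
```python
import subprocess

def dump_argv(host, port, user, db):
    # argument-list form: no shell involved, no password on the command line
    return ["mysqldump", "-h", host, "-P", str(port), "-u", user, db]

def run_backup(host, port, user, password, db, outfile):
    # MYSQL_PWD keeps the password out of the process list
    with open(outfile, "w", encoding="utf-8") as f:
        subprocess.run(dump_argv(host, port, user, db),
                       env={"MYSQL_PWD": password},
                       stdout=f, check=True)
```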
You can try using django-admin, Django's command-line utility, with dumpdata and loaddata.
Use the following command to dump data
django-admin dumpdata [app_label[.ModelName] [app_label[.ModelName] ...]] -o database_backup.json
Use the following command to load data (note: loaddata picks the serializer from the file extension, so use .json rather than .sql):
django-admin loaddata database_backup.json
Docs: dumpdata, loaddata

Python script called from PHP can't write a file

I have a problem with converting docx to pdf files in my script.
At first I tried to use a pure php-based solution, described here:
https://stackoverflow.com/a/20035739/12812601
Unfortunately this does not work (it creates an empty COM object, then throws a fatal error).
So I've tried to use a python script to do this.
I use a great script from here:
https://stackoverflow.com/a/20035739/12812601
So here is a problem.
The Python script run standalone (via the command line) works just fine and saves the converted PDF. Unfortunately, when I call it via PHP it can't save the converted file.
PHP scripts can create and write files in the same directory without any problem.
This is supposed to be a local setup only, so I don't care about portability.
Scripts:
*******PHP*******
<?php
//Script only for testing Python calls, tried different methods
error_reporting(E_ALL);
echo '<h1>Begin</h1>';
echo '<h2>Before call</h2>';
exec ('python dp.py');
echo '<h2>After exec call</h2>';
system('python dp.py');
echo '<h2>After Sys Call</h2>';
passthru('python dp.py');
echo '<h2>After Pass Call</h2>';
$w = get_current_user();
var_dump($w);
?>
*****Python*****
import sys
import os
import comtypes.client
import win32com.client
wdFormatPDF = 17
#static file names for testing
in_file = 'C:\\Users\\fake_user\\OneDrive\\Stuff\\f1.docx'
out_file = 'C:\\Users\\fake_user\\OneDrive\\Stuff\\f3.pdf'
print('BEGIN<br>\n')
word = win32com.client.Dispatch('Word.Application')
word.Visible = False
doc = word.Documents.Open(in_file)
print('\nOpened Docx\n<br>')
print(in_file)
doc.SaveAs(out_file, FileFormat=wdFormatPDF)
print('\nSaved\n<br>')
doc.Close()
word.Quit()
print('DONE\n')
*****Output from the browser*****
Begin
Before call
After exec call
BEGIN
Opened Docx
C:\Users\fake_user\OneDrive\Stuff\f1.docx
After Sys Call
BEGIN
Opened Docx
C:\Users\fake_user\OneDrive\Stuff\f1.docx
After Pass Call
string(5) "fake_user"
System configuration
Windows 7 Professional Edition Service Pack 1
Apache/2.4.26 (Win32)
OpenSSL/1.0.2l
PHP/7.1.7
Python 3.8.1
I tried to run Apache both as a system service and as a user who owns the OneDrive (name changed to "fake_user" here), so it shouldn't be a permissions issue (I think)
Any help appreciated

Store password in Ansible Vault and retrieve it from a Python script using an API

I have a requirement where I should not store any passwords in the script files in plain text. So I have created an Ansible vault file called "vault.yml" which contains username and password.
Is there some kind of API that I can use to look up these values from a Python script, called for example "test.py"?
What I would like in test.py is something like this:
username = ansible_api_get(key='username')
password = ansible_api_get(key='password')
P.S. - I don't have to use Ansible Vault, but that is preferred option as we would like to use all sensitive info with Vault and we want to integrate our scripts as much as possible.
Yes, ansible-vault is the Python library that you can use for this purpose.
vault.py
#!/usr/bin/env python3
''' get secrets from ansible-vault file with gpg-encrypted password '''
import os
import sys
from subprocess import check_output

import yaml
from ansible_vault import Vault

vault_file = sys.argv[1]
if os.path.exists(vault_file):
    get_vault_password = os.environ['ANSIBLE_VAULT_PASSWORD_FILE']
    if os.path.exists(get_vault_password):
        PASSWORD = check_output(get_vault_password).strip().decode("utf-8")
        secrets = yaml.safe_load(Vault(PASSWORD).load_raw(open(vault_file, encoding='utf-8').read()))
        print(secrets['username'])
        print(secrets['password'])
else:
    raise FileNotFoundError
As @nwinkler wisely says, you'll still have to have a password for the Ansible vault. As a developer you're probably familiar with signing your Git commits; the good news is that you can use the same GPG keys to encrypt/decrypt the file that stores the vault password, and this can work transparently.
If the environment variable ANSIBLE_VAULT_PASSWORD_FILE points to an executable, Ansible runs that executable to obtain the password required to decrypt a vault file (if the file is not executable, Ansible treats its contents as the plain-text password). The executable in this example needs just one line of shell to decrypt a file named vault_pw.gpg. Create that file with the vault password on a single line, encrypt it with your GPG key, and remove the plain-text original.
~/.bash_profile
I set up my shell to do this, and also to launch gpg-agent (for caching).
export ANSIBLE_VAULT_PASSWORD_FILE=~/.ansible_vault_password_exe
gpg-agent --daemon --write-env-file "${HOME}/.gpg-agent-info"
export GPG_TTY=$(tty)
~/.bashrc
This will ensure that only one gpg-agent is running:
if [ -f "${HOME}/.gpg-agent-info" ]
then
. "${HOME}/.gpg-agent-info"
fi
~/.ansible_vault_password_exe
#!/bin/sh
exec gpg -q -d ${HOME}/vault_pw.gpg

Jinja not imported when executing script from another one

I have a web server with CGI script calling python scripts.
When I try to execute another script from a main file (test1.py) via
os.system('/var/www/cgi-bin/readIRtemp.py ' + arg1 + ' ' + arg2 + ' ' + arg3)
I get this error message in /var/log/apache2/error.log:
import: not found
from: can't read /var/mail/jinja2
This is hard for me to understand, since the script works fine when called directly from the Python console!
Its content is:
import sys, os
from jinja2 import Environment, FileSystemLoader, select_autoescape

last20values = sys.argv[1]
currTempInDegreesCelcius = sys.argv[2]
print('test ' + last20values + ' ' + currTempInDegreesCelcius)
env = Environment(
    loader=FileSystemLoader('/var/www/html/templates'),
    autoescape=select_autoescape(['html', 'xml'])
)
template = env.get_template('IR.html')
updatedTemplate = template.render(arrayOfTemp=last20values, currTemp=currTempInDegreesCelcius)
Html_file = open("/var/www/html/IR.html", "w")
Html_file.write(updatedTemplate)
Html_file.close()
I read somewhere that when calling os.system() the script may run under a different user account, or some crazy thing like that... please help!
Of course I chmod 777 everything, but that doesn't help...
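For what it's worth, the "import: not found" message usually means the script was handed to /bin/sh rather than to Python (e.g. because of a missing shebang line), so the shell tried to run "import" as a command. One way to sidestep that, and shell-quoting issues too, is to name the interpreter explicitly and quote each argument. A sketch using the path from the question (build_command is a hypothetical helper):
```python
import shlex

def build_command(args):
    # naming the interpreter explicitly means a missing shebang cannot
    # cause the shell to execute the script line by line;
    # shlex.quote protects arguments containing spaces or shell metacharacters
    quoted = " ".join(shlex.quote(a) for a in args)
    return "python3 /var/www/cgi-bin/readIRtemp.py " + quoted
```
The resulting string can then be passed to os.system() as before.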

Django custom management command running Scrapy: How to include Scrapy's options?

I want to be able to run the Scrapy web crawling framework from within Django. Scrapy itself only provides a command line tool scrapy to execute its commands, i.e. the tool was not intentionally written to be called from an external program.
The user Mikhail Korobov came up with a nice solution, namely to call Scrapy from a Django custom management command. For convenience, I repeat his solution here:
# -*- coding: utf-8 -*-
# myapp/management/commands/scrapy.py
from __future__ import absolute_import
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    def run_from_argv(self, argv):
        self._argv = argv
        return super(Command, self).run_from_argv(argv)

    def handle(self, *args, **options):
        from scrapy.cmdline import execute
        execute(self._argv[1:])
Instead of calling e.g. scrapy crawl domain.com I can now do python manage.py scrapy crawl domain.com from within a Django project. However, the options of a Scrapy command are not parsed at all. If I do python manage.py scrapy crawl domain.com -o scraped_data.json -t json, I only get the following response:
Usage: manage.py scrapy [options]
manage.py: error: no such option: -o
So my question is, how to extend the custom management command to adopt Scrapy's command line options?
Unfortunately, Django's documentation of this part is not very extensive. I've also read the documentation of Python's optparse module, but afterwards I was none the wiser. Can anyone help me here? Thanks a lot in advance!
Okay, I have found a solution to my problem. It's a bit ugly but it works. Since the Django project's manage.py command does not accept Scrapy's command line options, I split the options string into two arguments which are accepted by manage.py. After successful parsing, I rejoin the two arguments and pass them to Scrapy.
That is, instead of writing
python manage.py scrapy crawl domain.com -o scraped_data.json -t json
I put spaces in between the options like this
python manage.py scrapy crawl domain.com - o scraped_data.json - t json
My handle function looks like this:
def handle(self, *args, **options):
    arguments = self._argv[1:]
    for arg in arguments:
        if arg in ('-', '--'):
            i = arguments.index(arg)
            new_arg = ''.join((arguments[i], arguments[i + 1]))
            del arguments[i:i + 2]
            arguments.insert(i, new_arg)
    from scrapy.cmdline import execute
    execute(arguments)
Meanwhile, Mikhail Korobov has provided the optimal solution. See here:
# -*- coding: utf-8 -*-
# myapp/management/commands/scrapy.py
from __future__ import absolute_import
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    def run_from_argv(self, argv):
        self._argv = argv
        self.execute()

    def handle(self, *args, **options):
        from scrapy.cmdline import execute
        execute(self._argv[1:])
I think you're really looking for Guideline 10 of the POSIX argument syntax conventions:
The argument -- should be accepted as a delimiter indicating the end of options.
Any following arguments should be treated as operands, even if they begin with
the '-' character. The -- argument should not be used as an option or as an operand.
Python's optparse module behaves this way, even on Windows.
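The effect of Guideline 10 is easy to see with optparse itself: everything after "--" is passed through as a positional argument, even tokens that look like options:
```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("-v", action="store_true", dest="verbose", default=False)

# "--" ends option parsing; "-o" and "out.json" survive as operands
opts, args = parser.parse_args(["-v", "--", "-o", "out.json"])
```
This is exactly why appending "-- -o scraped_data.json -t json" lets the unrecognized Scrapy options pass through manage.py untouched.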
I put the scrapy project settings module in the argument list, so I can create separate scrapy projects in independent apps:
# <app>/management/commands/scrapy.py
from __future__ import absolute_import
import os
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    def handle(self, *args, **options):
        os.environ['SCRAPY_SETTINGS_MODULE'] = args[0]
        from scrapy.cmdline import execute
        # scrapy ignores args[0], requires a mutable seq
        execute(list(args))
Invoked as follows:
python manage.py scrapy myapp.scrapyproj.settings crawl domain.com -- -o scraped_data.json -t json
Tested with scrapy 0.12 and django 1.3.1
