Converting a MSSQL varbinary to a base64 string - python

The goal
Take the data in mssql, an image, convert to base64 and embed in an email.
Deets
I have an image, stored in a varbinary column in a mssql database.
0xFFD8FFE00....
On the other end, I'm querying it out into an ancient Jython environment because that's all I have access to.
When I query and print, I appear to get a signed array of bytes or chars (maybe?).
>>> array('b', [-1, -40, -1, -32, 0, 16,...
Another thread had suggested dumping it into the b64 encoder
import base64
encoded = base64.b64encode(queryResult)
Which gave me an error TypeError: b2a_base64(): 1st arg can't be coerced to String
The thread also mentioned converting it to json, but since I'm in Python 2.4 land, I don't have access to import json or import simplejson. Using a json interpreter here seems like a major kludge to me.
I've also tried to convert it on the SQL end with decompress and casting to xml, neither of those work at all. The images work fine when passed as an email attachment, so they aren't corrupted as far as I can tell. To embed them in an html template, I need to get that Base64 string out.
I am missing something, I don't work with this stuff often enough to figure it out. I am aware of signed/unsigned, endian-ness at a high level but I can't quite crack this nut.

Converting Column values from VARBINARY to Base64
In most cases we need to work on multiple rows in a table, and we want to convert only the VARBINARY data into a Base64 string. The basic approach is the same as above, except for the XML XQuery solution, which needs a different method.
Option 1: Convert binary to Base64 using JSON
select Id,AvatarBinary
from openjson(
(
select Id,AvatarBinary
from AriTestTbl
for json auto
)
) with(Id int, AvatarBinary varchar(max))
GO
Option 2: Convert binary to Base64 using XML XQuery
select Id,
cast('' as xml).value(
'xs:base64Binary(sql:column("AriTestTbl.AvatarBinary"))', 'varchar(max)'
)
from AriTestTbl
GO
Option 3: Convert binary to Base64 using XML and the hint "for xml path"
select Id,AvatarBinary,s
from AriTestTbl
cross apply (select AvatarBinary as '*' for xml path('')) T (s)
GO
Hope this helps...
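If the conversion has to happen on the Python side instead, the signed values printed in the question can be mapped to unsigned bytes before encoding. The following is a minimal sketch written for Python 3; the function names are made up for illustration, and the sample data is just the six header bytes shown in the question. On an old Jython 2.x interpreter the analogous step would probably be base64.b64encode(queryResult.tostring()), but that is an untested assumption.

```python
import base64
from array import array


def signed_bytes_to_base64(values):
    """Encode a sequence of signed bytes (-128..127) as a Base64 string."""
    # Masking with 0xFF maps each signed value to its unsigned form,
    # e.g. -1 -> 0xFF and -40 -> 0xD8.
    raw = bytes(v & 0xFF for v in values)
    return base64.b64encode(raw).decode("ascii")


def jpeg_img_tag(b64_string):
    """Build an HTML <img> tag with an inline data URI (assumes JPEG data)."""
    return '<img src="data:image/jpeg;base64,%s">' % b64_string


# The first bytes shown in the question (a JPEG header).
query_result = array("b", [-1, -40, -1, -32, 0, 16])
encoded = signed_bytes_to_base64(query_result)
tag = jpeg_img_tag(encoded)
```

The resulting tag string can be dropped into the HTML template. Whether a given mail client renders inline data URIs is a separate question.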

Related

How to load json files with rasdaman

I'm studying array database management systems a bit, in particular rasdaman. I understand superficially the architecture and how the system works with sets and multidimensional arrays instead of tables, as is usual in a relational DBMS. I'm trying to save my own type of data to check whether this kind of database can give me better performance for my specific problem (geospatial data in a particular format: DGGS). To do so, I created my own base type based on a struct as indicated by the documentation, then created my array type, set type, and finally my collection for testing. I'm trying to insert data into this collection with the following idea:
query_executor.execute_update_from_file("insert into test_json_dict values decode($1, 'json', '{\"formatParameters\": {\"domain\": \"[0:1000]\",\"basetype\": struct { char k, long v } } })'", "...path.../rasdapy-demo/dggs_sample.json")
I'm using the rasdapy library to work from Python instead of using rasql only (I use it anyway to validate small things), but I have been fighting with error messages that give little to no information:
Internal error: RasnetClientComm::executeQuery(): illegal status value 5
My source file has this type of data into it:
{
"N1": 6
}
A simple dict with a key and a value; I want to save both. I also tried a bigger dict with multiple keys and values, but since the rasdaman decode function expects a basetype definition (if I understand correctly), I changed my data source to a simple dict. Clearly either my decoding definition is wrong or my source file has the wrong format, but I haven't been able to find any examples on the web. Any ideas on how to proceed? Maybe I am approaching this from the wrong perspective and should use the OGC Web Coverage Service (WCS) standard instead? I don't understand it yet, so I have been avoiding it. Any advice or direction is greatly appreciated. Thanks in advance.
Edit:
I have been trying to load CSV data with the following format:
1 930
2 461
..
and the following query
query_executor.execute_update_from_file("insert into test_json_dict values decode($1, 'csv', '{\"formatParameters\": {\"domain\": \"[1:255]\",\"basetype\": struct { char key, long value } } })'", "...path.../rasdapy-demo/dggs_sample_4.csv")
but still no results, even though it looks quite similar to the CSV/JSON examples in the documentation. What could be the issue?
It seems that my problem was trying to use the rasdapy library. The library works fine, but when working with data formats like CSV and JSON it is best to use the rasql command-line tool. The documentation states:
filePaths - An array of absolute paths to input files to be decoded, e.g. ["/path/to/rgb.tif"]. This improves ingestion performance if the data is on the same machine as the rasdaman server, as the network transport is bypassed and the data is read directly from disk. Supported only for GDAL, NetCDF, and GRIB data formats.
and also it says:
As a first parameter the data to be decoded must be specified. Technically this data must be in the form of a 1D char array. Usually it is specified as a query input parameter with $1, while the binary data is attached with the --file option of the rasql command-line client tool, or with the corresponding methods in the client API.
It would be interesting to know whether rasdapy takes this into account. In any case, rasql gives much better error messages, so I recommend it to anyone having a similar problem.
An example command could be:
rasql -q 'insert into test_basic values decode($1, "csv", "{ \"formatParameters\": {\"domain\": \"[0:1,0:2]\",\"basetype\": \"long\" } }")' --out string --file "/home/rasdaman/Documents/TFM/include/DGGS-Comparison/rasdapy-demo/dggs_sample_6.csv" --user rasadmin --passwd rasadmin
using this data:
1,2,3,2,1,3
After that, you just have to make it more and more complex as you need.
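Most of the pain in the command above is the nested quoting of the format parameters. As a rough sketch, the argument list can be assembled from Python with json.dumps so the escaping is generated rather than typed by hand. The function name and its parameters are hypothetical; only the rasql flags that appear in the command above are used, and the file path is a placeholder.

```python
import json


def build_rasql_insert(collection, domain, basetype, data_file, user, password):
    """Return an argument list for a rasql CSV insert (illustrative helper)."""
    params = json.dumps(
        {"formatParameters": {"domain": domain, "basetype": basetype}}
    )
    # The parameters sit inside a double-quoted rasql string literal,
    # so every inner double quote is backslash-escaped.
    escaped = params.replace('"', '\\"')
    query = 'insert into %s values decode($1, "csv", "%s")' % (collection, escaped)
    return [
        "rasql", "-q", query,
        "--out", "string",
        "--file", data_file,
        "--user", user,
        "--passwd", password,
    ]


cmd = build_rasql_insert(
    "test_basic", "[0:1,0:2]", "long",
    "/path/to/sample.csv",  # placeholder path, not from the original post
    "rasadmin", "rasadmin",
)
```

The list could then be handed to subprocess.run(cmd) on a machine where rasql is installed; passing a list avoids a second layer of shell quoting.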

Overflow error when reading json file

I am trying to read a json which includes a number of tweets, but I get the following error.
OverflowError: int too large to convert
The script filters multiple JSON files to get specific tweets, and it crashes when it reaches a specific JSON file.
The line that creates the error is this one :
df_temp = pd.read_json(path_or_buf=json_path, lines=True)
Here is the error in the cmd
Just store the user id as a String, and treat it like it is one (this is actually what you should do when dealing with this kind of ids). If you can't change the json input format, you can always parse it like a string before parsing it like a json object, and add the quotes to the id code, using for instance regexes: Regex in python.
I don't know which library you are using to parse the JSON, but an explicit conversion may also work: either try a "getString"-style method on the number instead of a "getInt"-style one, or force Python to treat the object as a string, with something like x = str(json.getId())
Note that Python does not convert an integer to a string implicitly ("" + number raises a TypeError), so the conversion has to be explicit.
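One way to apply the "treat the id as a string" advice without changing the input files is to parse each line with the standard json module, which handles arbitrarily large integers, and convert the id fields before the data reaches pandas. This is only a sketch: the function name is made up, and the field name "id" is an assumption about the tweet format.

```python
import json


def load_tweets(lines, id_keys=("id",)):
    """Parse JSON-lines text and store the given id fields as strings."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        for key in id_keys:
            if key in record:
                # Keep huge ids as text so no fixed-width integer is involved.
                record[key] = str(record[key])
        records.append(record)
    return records


sample_lines = [
    '{"id": 1180591620717411303423, "text": "hello"}',
    "",
    '{"id": 42, "text": "world"}',
]
tweets = load_tweets(sample_lines)
```

The resulting list of dicts can be passed to pandas.DataFrame(tweets) in place of the pd.read_json call.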

Python Azure Queue, getting error

I am struggling with an encoding issue. I am still trying to figure out the Python3 encoding scheme. I am trying to upload a json object from Python into an Azure Queue. I am using Python3
I make the json object
response = {"UserImageId": 636667744866847370, "OutputImageName": "car-1807177_with_blue-2467336_size_1020_u38fa38.png"}
queue_service.put_message(response_queue, json.dumps(response))
When it gets to the queue, I get the error
{"imgResponse":"The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or an illegal character among the padding characters. ","log":null,"$return":""}
So I have to do something else, because apparently I need to base64 encode my string. So I try
queue_service.put_message(response_queue, base64.b64encode(json.dumps(response).encode('utf-8')))
and I get
TypeError: message should be of type str
The error comes from the Azure Storage Queue package. If I check the type of the above expression, it is of type bytes (makes sense).
So my question is: how do I encode my JSON object into something that the queue service will understand? I would really like to be able to keep the _ and - and . characters in the image name.
If anyone is looking to solve this problem using QueueClient rather than QueueService, here is what worked for me:
import json
from azure.storage.queue import QueueServiceClient, QueueClient, QueueMessage, TextBase64EncodePolicy
conn_string = '[YOUR_CONNECTION_STRING_HERE]'
queue_client = QueueClient.from_connection_string(
conn_string,
'[QUEUE_NAME_HERE]',
message_encode_policy=TextBase64EncodePolicy()
)
queue_client.send_message(json.dumps({'a':'b'}))
This is what I had to do in my code (with the older QueueService API) to make it work:
import os
from azure.storage.queue import QueueService, QueueMessageFormat
queue_service = QueueService(account_name=os.getenv('storageAccount'), account_key=os.getenv('storageKey'))
queue_service.encode_function = QueueMessageFormat.text_base64encode
After that I could just put messages:
queue_service.put_message('bbbb', message) # 'bbbb' is a queue name
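The TypeError in the question comes from passing bytes where a str is expected. If you prefer to encode by hand rather than configure the client, a small sketch like the following turns the JSON text into a Base64 str; the helper name is hypothetical, and it uses only the standard library.

```python
import base64
import json


def to_base64_text(payload):
    """Serialize payload to JSON and return it as a Base64-encoded str."""
    text = json.dumps(payload)
    encoded_bytes = base64.b64encode(text.encode("utf-8"))
    # b64encode returns bytes; decoding as ASCII gives the str the queue wants.
    return encoded_bytes.decode("ascii")


response = {
    "UserImageId": 636667744866847370,
    "OutputImageName": "car-1807177_with_blue-2467336_size_1020_u38fa38.png",
}
message = to_base64_text(response)
```

The message value is a plain str, so it can be passed to put_message as in the question. The _, - and . characters survive because the consumer decodes the Base64 back to the original JSON text.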

How can I ensure that my Python regular expression outputs a dictionary?

I'm using Beej's Python Flickr API to ask Flickr for JSON. The unparsed string Flickr returns looks like this:
jsonFlickrApi({'photos': 'example'})
I want to access the returned data as a dictionary, so I have:
photos = "jsonFlickrApi({'photos': 'test'})"
# to match {'photos': 'example'}
response_parser = re.compile(r'jsonFlickrApi\((.*?)\)$')
parsed_photos = response_parser.findall(photos)
However, parsed_photos is a list, not a dictionary (according to type(parsed_photos)). It outputs like:
["{'photos': 'test'}"]
How can I ensure that my parsed data ends up as a dictionary type?
If you're using Python 2.6 or later, you can just use the json module to parse JSON stuff.
import json
json.loads(dictString)
If you're using an earlier version of Python, you can download the simplejson module and use that.
Example:
>>> json.loads('{"hello" : 4}')
{u'hello': 4}
You need to use a JSON parser to convert the string representation to actual Python data structure. Take a look at the documentation of the json module in the standard library for some examples.
In other words you'd have to add the following line at the end of your code
photos = json.loads(parsed_photos[0])
PS. In theory you could also use eval to achieve the same effect, as JSON is (almost) compatible with Python literals, but doing that would open a huge security hole. Just to let you know.
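Putting the two steps together, a minimal sketch might look like the following. The function name is made up, and the sample string uses double quotes because json.loads rejects the single-quoted form shown in the question; the assumption is that the real Flickr response is valid JSON inside the wrapper.

```python
import json
import re

# Matches the whole jsonFlickrApi(...) wrapper and captures what is inside.
JSONP_WRAPPER = re.compile(r"^\s*jsonFlickrApi\((.*)\)\s*$", re.DOTALL)


def parse_flickr_response(text):
    """Strip the jsonFlickrApi(...) wrapper and parse the JSON payload."""
    match = JSONP_WRAPPER.match(text)
    if match is None:
        raise ValueError("response is not wrapped in jsonFlickrApi(...)")
    return json.loads(match.group(1))


photos = parse_flickr_response('jsonFlickrApi({"photos": "test"})')
```

The greedy (.*) together with the end anchor keeps any parentheses inside the payload intact, which the non-greedy pattern in the question also does only because of its $ anchor.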

python csv - export data and format color, text format, etc

I'm exporting data from Python with the csv library, which works pretty well. After a web search I cannot find any information about how to format the exported data with Python.
For example. I export a csv row within python like this.
for hour_record in object_list:
    row = list()
    for field in fields:
        try:
            row.append(getattr(hour_record, field))
        except:
            pass
    writer.writerow(row)
Now I'm wondering how I can pass some formatting information to the row.append call. For example, I might want to format the text red and make sure that it is formatted as a number, etc.
Does anybody have an idea how this works?
CSV is used only for plain text. If you want formatting information to be contained then you must either embed HTML fragments, or you must add the attributes as separate fields. Either option will require a consumer that understands said formatting mechanism.
Instead of csv, you should use numpy.genfromtxt and numpy.savetxt:
>>> data = numpy.genfromtxt('mydata.txt', dtype=None)
>>> # ... work with the data recarray object ...
>>> numpy.savetxt('results.txt', data)
If you look at help(numpy.savetxt) you can see that there is a 'fmt' option that allows you to specify the output format.
To output colors, you have to use HTML as said in another answer, or you can use the module terminalcolor which is only to output to STDOUT.
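As a rough illustration of the HTML route mentioned in the answers, the rows can be written as an HTML table, where a style attribute carries the color that CSV cannot. The function name and the red_columns parameter are hypothetical, and the consumer has to be something that renders HTML.

```python
from html import escape


def rows_to_html(rows, red_columns=()):
    """Render rows as an HTML table, coloring the given column indexes red."""
    lines = ["<table>"]
    for row in rows:
        cells = []
        for index, value in enumerate(row):
            style = ' style="color: red"' if index in red_columns else ""
            # escape() keeps characters like < and & from breaking the markup.
            cells.append("<td%s>%s</td>" % (style, escape(str(value))))
        lines.append("<tr>%s</tr>" % "".join(cells))
    lines.append("</table>")
    return "\n".join(lines)


table_html = rows_to_html([["Monday", 7.5], ["Tuesday", 8]], red_columns={1})
```

Spreadsheet programs can usually open such a table, but if real spreadsheet formatting is the goal, a library that writes a native spreadsheet format is likely the better fit.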
