How to load JSON files with rasdaman - Python

I'm studying array database management systems a bit, in particular rasdaman. I understand the architecture superficially and how the system works with sets and multidimensional arrays instead of the tables that are usual in relational DBMSs. I'm trying to store my own type of data to check whether this kind of database can give me better performance for my specific problem (geospatial data in a particular format: DGGS). To do so I created my own basic type based on a structure as indicated by the documentation, created my array type, set type, and finally my collection for testing. I'm trying to insert data into this collection with the following idea:
query_executor.execute_update_from_file("insert into test_json_dict values decode($1, 'json', '{\"formatParameters\": {\"domain\": \"[0:1000]\",\"basetype\": struct { char k, long v } } })'", "...path.../rasdapy-demo/dggs_sample.json")
I'm using the rasdapy library to work from Python instead of using rasql only (I use rasql anyway to validate small things), but I have been fighting with error messages that give little to no information:
Internal error: RasnetClientComm::executeQuery(): illegal status value 5
My source file contains this kind of data:
{
"N1": 6
}
A simple dict with a key and a value; I want to save both. I also tried a bigger dict with multiple keys and values in it, but since the rasdaman decode function expects a basetype definition (if I understand correctly), I changed my data source to a simple dict. It's clear that either I'm not writing the right definition for decoding or my source file has the wrong format, but I haven't been able to find any examples on the web. Any ideas on how to proceed? Maybe I'm even approaching this whole thing from the wrong perspective and should try the OGC Web Coverage Service (WCS) standard instead? I don't understand it yet, so I have been avoiding it. Anyway, any advice or direction is greatly appreciated. Thanks in advance.
Edit:
I have been trying to load CSV data with the following format:
1 930
2 461
..
and the following query
query_executor.execute_update_from_file("insert into test_json_dict values decode($1, 'csv', '{\"formatParameters\": {\"domain\": \"[1:255]\",\"basetype\": struct { char key, long value } } })'", "...path.../rasdapy-demo/dggs_sample_4.csv")
but still no results, even though it looks quite similar to the CSV/JSON examples in the documentation. What could be the issue?

It seems that my problem was trying to use the rasdapy library. The lib works fine, but when working with data formats like CSV and JSON it is best to use the rasql command-line option. The documentation states:
filePaths - An array of absolute paths to input files to be decoded, e.g. ["/path/to/rgb.tif"]. This improves ingestion performance if the data is on the same machine as the rasdaman server, as the network transport is bypassed and the data is read directly from disk. Supported only for GDAL, NetCDF, and GRIB data formats.
and also it says:
As a first parameter the data to be decoded must be specified. Technically this data must be in the form of a 1D char array. Usually it is specified as a query input parameter with $1, while the binary data is attached with the --file option of the rasql command-line client tool, or with the corresponding methods in the client API.
It would be interesting to know whether rasdapy takes this into account. In any case, rasql gives far better error messages, so I recommend it to anyone having a similar problem.
An example command could be:
rasql -q 'insert into test_basic values decode($1, "csv", "{ \"formatParameters\": {\"domain\": \"[0:1,0:2]\",\"basetype\": \"long\" } }")' --out string --file "/home/rasdaman/Documents/TFM/include/DGGS-Comparison/rasdapy-demo/dggs_sample_6.csv" --user rasadmin --passwd rasadmin
using this data:
1,2,3,2,1,3
After that, you can make the query more complex step by step as needed.
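For what it's worth, in my original failing queries the format-parameters string was not valid JSON: the struct definition was unquoted, and a stray ) sat inside the string before its closing quote. Mirroring the working command above, a struct-typed variant might look like this (a sketch only, with a placeholder file path; I have not verified this exact form against rasdaman):

rasql -q 'insert into test_json_dict values decode($1, "csv", "{ \"formatParameters\": {\"domain\": \"[0:1]\",\"basetype\": \"struct { char k, long v }\" } }")' --out string --file "/path/to/dggs_sample_4.csv" --user rasadmin --passwd rasadmin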

Related

What is the equivalent of the Perl DB_File module in Python?

I was asked by my supervisor to convert some Perl scripts into Python. I'm baffled by a few lines of code, and I am also relatively inexperienced with Python. I'm an IT intern, so this is something of a challenge.
Here are the lines of code:
my %sybase;
my $S= tie %sybase, "DB_File", $prismfile, O_RDWR|O_CREAT, 0666, $DB_HASH or die "Cannot open: $!\n";
$DB_HASH->{'cachesize' } = $cache;
I'm not sure what the equivalent of this statement is in Python. DB_File is a Perl module; DB_HASH is a database type that allows arbitrary keys/values to be stored in a data file, at least according to the Perl documentation.
After that, the next lines of code also got me stumped on how to convert this to the equivalent in Python as well.
$scnt=0;
while(my $row=$msth->fetchrow_arrayref()) {
    $scnt++;
    $srow++;
    #if ($scnt <= 600000) {
    $S->put(join('#',@{$row}[0..5]),join('#',@{$row}[6..19]));
    perf(500000,'sybase') ; #if $VERBOSE ;
    # }
}
I'll probably use fetchall() in Python to store the entire result set, then work through it row by row. But I'm not sure how to implement the join() calls correctly in Python, especially since these lines slice the row with ranges like [0..5]. It also seems to write the output to a data file (note the put()). I'm not sure what perf() does either. Can anyone help me out here?
I'd appreciate any kind of help here. Thank you very much.
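A minimal sketch of a Python analogue, assuming the standard-library dbm module as a stand-in for DB_File (both are on-disk key/value stores, though the file formats differ), with prismfile and msth mirroring the Perl variables (perf() looks like a progress/timing helper, which the sketch omits):

import dbm

# tie %sybase, "DB_File", $prismfile, O_RDWR|O_CREAT, 0666, $DB_HASH
# roughly maps to opening a dbm database; "c" creates the file if
# missing and opens it read/write (like O_RDWR|O_CREAT)
db = dbm.open(prismfile, "c", 0o666)

for row in msth.fetchall():  # fetchrow_arrayref() loop -> fetchall()
    # Perl's @{$row}[0..5] slice is inclusive of index 5, hence row[0:6]
    key = "#".join(str(x) for x in row[0:6])
    value = "#".join(str(x) for x in row[6:20])  # [6..19] -> row[6:20]
    db[key] = value  # $S->put(key, value)

db.close()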

Check if JSON var has nullable key (Twitter Streaming API)

I'm downloading tweets from the Twitter Streaming API using Tweepy. I've managed to check whether the downloaded data has keys such as 'extended_tweet', but I'm struggling with a specific key nested inside another key.
def on_data(self, data):
    savingTweet = {}
    if not "retweeted_status" in data:
        dataJson = json.loads(data)
        if 'extended_tweet' in dataJson:
            savingTweet['text'] = dataJson['extended_tweet']['full_text']
        else:
            savingTweet['text'] = dataJson['text']
        if 'coordinates' in dataJson:
            if 'coordinates' in dataJson['coordinates']:
                savingTweet['coordinates'] = dataJson['coordinates']['coordinates']
            else:
                savingTweet['coordinates'] = 'null'
I'm checking 'extended_tweet' properly, but when I try to do the same with ['coordinates']['coordinates'] I get the following error:
TypeError: argument of type 'NoneType' is not iterable
Twitter documentation says that key 'coordinates' has the following structure:
"coordinates":
{
"coordinates":
[
-75.14310264,
40.05701649
],
"type":"Point"
}
I managed to solve it by just wrapping the problematic check in a try/except, but I don't think that's the most suitable approach to the problem. Any other ideas?
So the Twitter API docs are probably lying a bit about what they return (shock horror!), and it looks like you're getting a None in place of the expected data structure. You've already decided against using try/except, so I won't go over that, but here are a few other suggestions.
Using dict get() default
There are a couple of options that occur to me. The first is to make use of the default argument of dict's get() method: you can provide a fallback if the expected key does not exist, which allows you to chain together multiple calls.
For example you can achieve most of what you are trying to do with the following:
return {
    'text': data.get('extended_tweet', {}).get('full_text', data['text']),
    # "or {}" also covers the key being present with an explicit None value
    'coordinates': (data.get('coordinates') or {}).get('coordinates', 'null')
}
It's not super pretty, but it does work. It's likely to be a little slower than what you are doing, too.
Using JSONPath
Another option, which is likely overkill for this situation, is to use a JSONPath library, which will allow you to search within data structures for items matching a query. Something like:
from jsonpath_rw import parse

matches = parse('extended_tweet.full_text').find(data)
if matches:
    print(matches[0].value)
This is going to be a lot slower than what you are doing, and for just a few fields it's overkill, but if you are doing a lot of this kind of work it could be a handy tool in the box. JSONPath can also express much more complicated paths, or very deeply nested paths where the get method might not work or would be unwieldy.
Parse the JSON first!
The last thing I would mention is to make sure you parse your JSON before you do your test for "retweeted_status". Checking the raw string means the test will also trigger if that text appears anywhere else, say inside the text of a tweet.
JSON parsing with a competent library is usually extremely fast too, so unless you are having real speed problems it's not necessarily worth worrying about.
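A minimal sketch of that ordering, reusing the handler from the question:

def on_data(self, data):
    tweet = json.loads(data)          # parse first
    if 'retweeted_status' in tweet:   # now a key lookup, not a substring search
        return
    # ... continue with the extended_tweet / coordinates handling above ...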

Integrating OHLC values from a Python API into MT5 using MQL5

I have obtained the OHLC values from iqoption and am trying to find a way to use them with MT5.
Here is how I got the values:
import time
from iqoptionapi.stable_api import IQ_Option
I_want_money=IQ_Option("email","password")
goal="EURUSD"
print("get candles")
print(I_want_money.get_candles(goal,60,111,time.time()))
The above code library is here: iqoptionapi
The line I_want_money.get_candles(goal,60,111,time.time()) outputs JSON.
Now I am getting JSON in the output, so it works like an API, I guess.
Meanwhile, I tried to create a custom symbol in MT5 called iqoption. I just want to add the OHLC data from the API to it, so that it keeps fetching data from iqoption and displays the chart in the chart window for the custom symbol iqoption.
But I am not able to load the data into the custom symbol. Kindly help me.
Edited
This is the code for live-streaming data from iqoption:
from iqoptionapi.stable_api import IQ_Option
import logging
import time
logging.basicConfig(level=logging.DEBUG,format='%(asctime)s %(message)s')
I_want_money=IQ_Option("email","password")
I_want_money.start_candles_stream("EURUSD")
thread=I_want_money.collect_realtime_candles_thread_start("EURUSD",100)
I_want_money.start_candles_stream("USDTRY")
thread2=I_want_money.collect_realtime_candles_thread_start("USDTRY",100)
time.sleep(3)
#Do some thing
ans=I_want_money.thread_collect_realtime.items()
for k, v in ans:
    print(k, v)
I_want_money.collect_realtime_candles_thread_stop(thread)
I_want_money.stop_candles_stream("EURUSD")
I_want_money.collect_realtime_candles_thread_stop(thread2)
I_want_money.stop_candles_stream("USDTRY")
Ok, you need to:
1. receive the feed from the broker (I hope you succeeded) (in Python)
2. write it into a file (in Python)
3. read and parse it (in MT5)
4. add it to the history centre/Market Watch (in MT5)
So, you receive the data as a string after
I_want_money.get_candles(goal,60,111,time.time())
This string might be JSON or a JSON array.
The important question is of course where you are going to put the data. An expert advisor in MQL4/5 can access only two folders (if not applying DLLs):
C:\Users\MY_NAME_IS_DANIEL_KNIAZ\AppData\Roaming\MetaQuotes\Terminal\MY_TERMINAL_ID_IN_HEX_FORMAT\MQL4\Files
and
C:\Users\MY_NAME_IS_DANIEL_KNIAZ\AppData\Roaming\MetaQuotes\Terminal\Common\Files
in the latter case you need to open the file with const int handle=FileOpen(fileName,flags|FILE_COMMON);
To parse JSON you can use the jason.mqh library (https://www.mql5.com/en/code/13663); there are a few others. As far as I remember it has a bug: it cannot parse an array of objects correctly. To overcome that, I would suggest writing each tick on a separate line.
And last, you will receive data from your Python application at random times and write it into the Common or terminal folder. The MT5 robot will read the file and delete it. To avoid confusion, it is better to guarantee that each file has a unique name: either a random number (random.randint(1,1000)) or milliseconds from datetime can help.
So far, you have the Python code:
import os
from datetime import datetime

receivedString = I_want_money.get_candles(goal, 60, 111, time.time())
# raw string (r'...') so the backslashes are not treated as escapes
filePath = r'C:\Users\MY_NAME_IS_DANIEL_KNIAZ\AppData\Roaming\MetaQuotes\Terminal\MY_TERMINAL_ID_IN_HEX_FORMAT\MQL4\Files\iqoptionfeed'
# strip ":" from the timestamp, since it is not allowed in Windows file names
fileName = os.path.join(filePath, "_" + goal + "_" + str(datetime.now()).replace(":", "-") + ".txt")
with open(fileName, "w") as file:
    for string_ in receivedString:
        file.write(string_)
In case you created a thread, you write such a file each time you receive an answer from the thread.
Next, you need that data in MT5.
The easiest way is to loop over the existing files, read each one if you can (or give up if you cannot), delete it after reading, and then process the data received.
The easiest and fastest way would of course be to use 0MQ, but let us do it without DLLs.
In order to read the files, you need to set up a timer that runs as fast as possible. Since a Windows app cannot sleep less than 15.6 ms, your timer should sleep that amount of time.
string path;
int OnInit()
  {
   EventSetMillisecondTimer(16);
   path="iqoptionfeed\\*";
   return(INIT_SUCCEEDED);
  }
void OnDeinit(const int reason) { EventKillTimer(); }

string _fileName;
long   _search_handle;
void OnTimer()
  {
   _search_handle=FileFindFirst(path,_fileName);
   if(_search_handle!=INVALID_HANDLE)
     {
      do
        {
         ResetLastError();
         FileIsExist(_fileName);
         if(GetLastError()!=ERR_FILE_IS_DIRECTORY)
            processFile("iqoptionfeed\\"+_fileName); // prepend the folder itself, not the search mask held in `path`
        }
      while(FileFindNext(_search_handle,_fileName));
      FileFindClose(_search_handle);
     }
  }
This piece of code loops over the folder and processes each file it manages to find.
Now reading the file (two functions) and processing the message inside it:
void processFile(const string fileName)
  {
   string message;
   if(ReadFile(fileName,message))
      processMessage(message,fileName);
  }

bool ReadFile(const string fileName,string &result,const bool common=false)
  {
   const int handle=FileOpen(fileName,common?(FILE_COMMON|FILE_READ):FILE_READ);
   if(handle==INVALID_HANDLE)
     {
      printf("%i - failed to find file %s (probably doesn't exist!). error=%d",__LINE__,fileName,GetLastError());
      return(false);
     }
   Read(handle,result);
   FileClose(handle);
   if(!FileDelete(fileName,common?FILE_COMMON:0))
      printf("%i - failed to delete file %s/%d. error=%d",__LINE__,fileName,common,GetLastError());
   return(true);
  }

void Read(const int handle,string &message)
  {
   string text="";
   while(!FileIsEnding(handle) && !IsStopped())
     {
      text+=FileReadString(handle)+"\n"; // MQL4-style StringConcatenate() assignment does not carry over to MQL5; += works in both
     }
   //printf("%i %s - %s.",__LINE__,__FUNCTION__,text);
   message=text;
  }
And last but not least: process the obtained file.
As suggested above, it contains one JSON-formatted tick per line, separated by \r\n.
Our goal is to add each tick to the symbol. To parse the JSON, jason.mqh is an available solution, but you can of course parse it manually.
// requires jason.mqh (https://www.mql5.com/en/code/13663) for CJAVal
void processMessage(const string message,const string fileName)
  {
   string symbolName=getSymbolFromFileName(fileName);
   if(!SymbolSelect(symbolName,true))
     {
      if(!CustomSymbolCreate(symbolName))
         return;
     }
   string lines[];
   int size=StringSplit(message,(ushort)'\n',lines);
   for(int i=0;i<size;i++)
     {
      if(StringLen(lines[i])==0)
         continue;
      CJAVal jLine(jtUNDEF,NULL);
      jLine.Deserialize(lines[i]);
      // here I assume that you receive a json line like
      // { "time":2147483647,"bid":1.16896,"ask":1.16906,"some_other_data":"someOtherDataThatYouMayAlsoUse" }
      MqlTick ticks[1]; // CustomTicksAdd() expects an array of ticks
      ticks[0].time=(datetime)jLine["time"].ToInt();
      ticks[0].bid=(double)jLine["bid"].ToDbl();
      ticks[0].ask=(double)jLine["ask"].ToDbl();
      ResetLastError();
      if(CustomTicksAdd(symbolName,ticks)<0)
         printf("%i %s - failed to upload tick: %s %s %.5f %.5f. error=%d",__LINE__,__FILE__,symbolName,TimeToString(ticks[0].time),ticks[0].bid,ticks[0].ask,GetLastError());
     }
  }

string getSymbolFromFileName(const string fileName)
  {
   string elements[];
   int size=StringSplit(fileName,(ushort)'_',elements);
   if(size<2)
      return NULL;
   return elements[1];
  }
Do not forget to add debugging info and check GetLastError() if for some reason you get errors.
Can this work in a backtester? Of course not. First, OnTimer() is not supported in the MQL tester. Next, you need some history records to get it running. If you do not have any history, nobody can help you unless a broker can give it to you; the best idea would be to start collecting and storing it right now, so that when the project is ready (maybe in another couple of months) you will have the data and can test and optimize the strategy with it. You can load the collected set into the tester (MQL5 is really the next step in algo-trading development compared to MQL4), either manually or with something like tickDataSuite and its Csv2Fxt.ex4 file, which makes HST binary files the tester can read and process; anyway, that is another question, and nobody can tell you whether your broker stores its data somewhere to provide it to you.
After re-reading what you wrote (and edited), I can see you want:
a symbol synchronized with iqoption [ through your proxy / remotely ]
The symbol could be used for backtesting
The symbol could be used for on-screen live/strategy/indicator run
That implies operations outside a strategy/indicator, which MT platforms do not allow in an automated manner. You can achieve it manually by providing a data package, parsing it to CSV, and importing it with the custom symbol creator. It is well documented here.
Unfortunately, you chose a platform that by design is meant for self-contained strategies and indicators, more for beginners than for professionals taking it seriously.
Refer to the link I provided and see for yourself. The official docs state you can create a custom symbol via the MQL reference, yet even though the foreword claims support for third-party providers, this is not referenced anywhere else and no integration possibilities are shown.
custom indicators
custom symbol properties

Using boto (AWS Python), how do I get a list of IAM users?

Newbie question. I'm expecting to be able to do something like the following:
import boto
from boto.iam.connection import IAMConnection

cfn = IAMConnection()
userList = cfn.get_all_users()
for user in userList:
    print user.user_id
The connection works fine, but the last line returns the error "'unicode' object has no attribute 'user_id'".
I did a type(userList) and it reported the type as <class 'boto.jsonresponse.Element'>, which doesn't appear (?) to be documented. Using normal JSON parsing doesn't appear to work either.
From another source, it looks as if the intent is that the results of an operation like this are supposed to be "pythonized."
Anyway, I'm pretty new to boto, so I assume there's some simple way to do this that I just haven't stumbled across.
Thx!
For some of the older AWS services, boto takes the XML responses from the service and turns them into nice Python objects. For others, it takes the XML response and transliterates it directly into native Python data structures. The boto.iam module is of the latter form. So, the actual data returned by get_all_users() looks like this:
{u'list_users_response':
    {u'response_metadata':
        {u'request_id': u'8d694cbd-93ec-11e3-8276-993b3edf6dba'},
     u'list_users_result':
        {u'users':
            [{u'path': u'/', u'create_date': u'2014-01-21T17:19:45Z',
              u'user_id': u'<userid>', u'arn': u'<arn>', u'user_name': u'foo'},
             {...next user...}
            ]
        }
    }
}
So, all of the data you want is there, it's just a bit difficult to find. The boto.jsonresponse.Element object returned does give you a little help. You can actually do something like this:
data = cfn.get_all_users()
for user in data.user:
    print(user['user_name'])
but for the most part you just have to dig around in the data returned to find what you are looking for.
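For example, since the returned Element behaves like a dict (it supports key access), one way to dig the user list out of the structure shown above might be this sketch (key names taken from the dump above, not from documented API):

data = cfn.get_all_users()
# walk the nested keys exactly as they appear in the raw response
users = data['list_users_response']['list_users_result']['users']
for user in users:
    print(user['user_name'], user['user_id'])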

Python: parse VARIANT (?)

I have to read a file in Python that uses Microsoft VARIANTs (I think; I really don't know much about Microsoft code :S). Basically, I want to know if there are Python packages that can do this for me.
To explain - the file I'm trying to read is just a whole bunch of { 2-byte integer, <data> } repeated over and over, where the 2-byte integer specifies what the <data> is.
The 2-byte integer corresponds to the Microsoft data types in VARIANT: VT_I2, VT_I4, etc, and based on the type I can write code to read in and coerce <data> to an appropriate Python object.
My current attempt is along the following lines:
values = []
while True:
    dtype = file.read(2)
    if not dtype:
        break
    value = None
    # translate dtype (I've put in VT_XX myself to match up with Microsoft)
    if dtype == VT_I2:
        value = file.read(2)
    elif dtype == VT_I4:
        value = file.read(4)
    # ... and so on for other types
    # append value to the list of values
    values.append(value)
# return the values we read
return values
The thing is, I'm having trouble working out how to convert some of the bytes to the appropriate Python objects (for example VT_BSTR, VT_DECIMAL, VT_DATE). However, before I go further, I'd like to know if there are any existing Python packages that do this logic for me (i.e. take in a file object/bytes and parse it into a set of Python objects, be they floats, ints, dates, strings, ...).
It just seems like this is a fairly common thing to do.
However, I've been having difficulty searching for packages to do it because, not knowing anything about Microsoft code, I don't have the terminology for the appropriate googling. (If it is relevant, I am running Linux.)
The win32com package in pywin32 will do just that for you. The documentation is quite underwhelming, but there's a variant.html included explaining the basic use, and a lot of tutorials and references online.
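If you end up decoding the fixed-size numeric types by hand instead (for instance on Linux, where pywin32 is not available), the standard-library struct module covers the coercion step. A sketch, assuming little-endian data; the VT_* constants here are illustrative values you would confirm against the VARIANT type-code table:

import struct

VT_I2, VT_I4, VT_R8 = 2, 3, 5  # assumed from the VARIANT type-code table

def read_value(f, dtype):
    # coerce fixed-size VARIANT payloads into Python numbers
    if dtype == VT_I2:
        return struct.unpack('<h', f.read(2))[0]  # 16-bit signed int
    if dtype == VT_I4:
        return struct.unpack('<l', f.read(4))[0]  # 32-bit signed int
    if dtype == VT_R8:
        return struct.unpack('<d', f.read(8))[0]  # 64-bit IEEE double
    raise ValueError("unhandled VARIANT type: %r" % (dtype,))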
