I have an old PHP system that uses RFID. We have a SQL query which converts the RFID codes. I am converting the PHP system to Odoo 11. Here is the SQL query:
UPDATE members set member_id=100000*floor(fob_id/65536)+(fob_id%65536)
It is basically the fob ID multiplied by 100000 and then divided by 65536.
But later they discovered that if a fob number ended with a 7 or 9, I think, it was being calculated wrong, so the floor part always rounds down and the % adds on the remainder after dividing them (I think that is how it worked).
How can I get the same result in Python (version 3.5) as the above query?
Here is my code:
@api.onchange('barcode2')
def _onchange_barcode2(self):
    self.barcode = (self.barcode2)
    a = (self.barcode2) * 100000
    fob = (math.floor(a)) / 65536 + (self.barcode2) % 65536
    self.memberid = fob
I have an RFID number of 0005225306 which should give 07947962, but with the above code I get 8,021,146.
Am I using the % operator correctly? If I input 0005225306 into my old PHP system, which uses MySQL UPDATE members set member_id=100000*floor(fob_id/65536)+(fob_id%65536), it gives the correct value of 07947962.
Any idea why my Python code is not getting the same value?
I have no idea about your barcode* fields and their data types, but take care when parsing them. And the order of those calculations is important:
@api.onchange('barcode2')
def _onchange_barcode2(self):
    # let's suppose the barcode fields are strings
    # because of leading zeros
    self.barcode = self.barcode2
    try:
        barcode_int = int(self.barcode2)
        fob = math.floor(barcode_int / 65536)
        fob = int((fob * 100000) + barcode_int % 65536)
        # let's suppose memberid is a char field
        self.memberid = '{:08d}'.format(fob)
    except ValueError:
        # handle parse problems, e.g. raise an Odoo Warning
        pass
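As a quick standalone check of that conversion (outside Odoo), here is a minimal sketch using the sample fob number from the question; it assumes the value arrives as a string with leading zeros:
import math

fob_raw = "0005225306"  # fob/RFID number as scanned, with leading zeros
fob_id = int(fob_raw)

# Same arithmetic as the MySQL query:
# member_id = 100000 * floor(fob_id / 65536) + (fob_id % 65536)
member_id = 100000 * math.floor(fob_id / 65536) + fob_id % 65536

print('{:08d}'.format(member_id))  # prints 07947962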
Could you give a very simple example of using Redis' xread and xadd in Python (one that shows the type and format of the return values from xread and of the input to xadd)? I've already read a lot of documentation, but none of it is in Python.
The Redis doc gives an example:
> XADD mystream * sensor-id 1234 temperature 19.8
1518951480106-0
but if I try it in Python:
sample = {b"hello":b"12"}
id = r.xadd("mystream", sample)
I get this error:
redis.exceptions.ResponseError: WRONGTYPE Operation against a key holding the wrong kind of value
Make sure to flush before running, just to make sure that there isn't already a list / stream with the same name:
redis-cli flushall
import redis
from json import JSONEncoder

if __name__ == '__main__':
    r = redis.Redis(host='localhost', port=6379, db=0)
    encoder = JSONEncoder()
    sample = {"hello": encoder.encode([1234, 125, 1235, 1235])}  # converts list to string
    stream_name = 'mystream'
    for i in range(10):
        r.xadd(stream_name, sample)

    # "$" doesn't seem to work in python
    read_samples = r.xread({stream_name: b"0-0"})
Based on the redis-py documentation:
Redis initialization:
r = redis.Redis(host='localhost')
To add a key-value pair (key-value should be a dictionary):
r.xadd(stream_name, {key: value})
Block to read:
r.xread({stream_name: '$'}, None, 0)
stream_name and ID should be passed as a dictionary.
$ means the newest message.
Moreover, instead of passing a normal ID for the stream mystream I
passed the special ID $. This special ID means that XREAD should use
as last ID the maximum ID already stored in the stream mystream, so
that we will receive only new messages, starting from the time we
started listening. (from here)
COUNT should be None if you want to receive whatever new messages arrive, rather than capping the number of messages returned.
A BLOCK value of 0 means block with a timeout of 0 milliseconds (that is, never time out).
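Putting those pieces together, a minimal consumer sketch (assuming a local Redis server and the redis-py client; the stream name mystream is just an example). Run it in one shell, then XADD to mystream from another shell or process to see it unblock:
import redis

r = redis.Redis(host='localhost', port=6379, db=0)

# Block forever (BLOCK 0) until an entry newer than "now" ('$') is added to the stream;
# COUNT is None, so everything new is returned in one call.
new_entries = r.xread({'mystream': '$'}, count=None, block=0)
print(new_entries)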
Looking at the help (or the docstrings (1), (2)) for the functions, they're quite straightforward:
>>> import redis
>>> r = redis.Redis()
>>> help(r.xadd)
xadd(name, fields, id='*', maxlen=None, approximate=True)
    Add to a stream.
    name: name of the stream
    fields: dict of field/value pairs to insert into the stream
    id: Location to insert this record. By default it is appended.
    maxlen: truncate old stream members beyond this size
    approximate: actual stream length may be slightly more than maxlen

>>> help(r.xread)
xread(streams, count=None, block=None)
    Block and monitor multiple streams for new data.
    streams: a dict of stream names to stream IDs, where
             IDs indicate the last ID already seen.
    count: if set, only return this many items, beginning with the
           earliest available.
    block: number of milliseconds to wait, if nothing already present.
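To see the shapes the question asks about, a short sketch (again assuming a local Redis server and redis-py; the exact IDs and version-dependent details will differ, so treat the commented output as approximate):
import redis

r = redis.Redis(host='localhost', port=6379, db=0)

# xadd takes a dict of field/value pairs and returns the generated entry ID.
entry_id = r.xadd('mystream', {'sensor-id': '1234', 'temperature': '19.8'})
print(type(entry_id), entry_id)  # e.g. <class 'bytes'> b'1518951480106-0'

# Reading from ID 0-0 returns everything currently in the stream, roughly of the form
# [(stream_name, [(entry_id, {field: value, ...}), ...]), ...], with bytes throughout by default.
result = r.xread({'mystream': '0-0'})
print(result)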
My table contains user query data. I generate a hashed string by doing the following:
queries = Query.objects.values('id', 'name')
# creating a bytes string using the ID, NAME columns and a string "yes" (this string could be anything, I've considered yes as an example)
data = (str(query['id']) + str(query['name']) + "yes").encode()
link_hash = hashlib.pbkdf2_hmac('sha256', data, b"satisfaction", 100000)
link_hash_string = binascii.hexlify(link_hash).decode()
I've sent this hash string via email, embedded in a link which is checked when the user visits that link. My current method of checking whether the hash (taken from the GET parameter in the link) matches some data in the table is this:
queries = Query.objects.values('id', 'name')

# I've set replyHash as a string here as an example, it is generated by the code written above,
# but the hash will be got from the GET parameter in the link
replyHash = "269e1b3de97b10cd28126209860391938a829ef23b2f674f79c1436fd1ea38e4"

# Currently iterating through all the queries and checking each of the queries
for query in queries:
    data = (str(query['id']) + str(query['name']) + "yes").encode()
    link_hash = hashlib.pbkdf2_hmac('sha256', data, b"satisfaction", 100000)
    link_hash_string = binascii.hexlify(link_hash).decode()
    if replyHash == link_hash_string:
        print("It exists, valid hash")
        query['name'] = "BooBoo"
        query.save()
        break
The problem with this approach is that if I have a large table with thousands of rows, this method will take a lot of time. Is there an approach using annotation or aggregation or something else which will perform the same action in less time?
I give a lot of information on the methods that I used to write my code. If you just want to read my question, skip to the quotes at the end.
I'm working on a project that has a goal of detecting sub populations in a group of patients. I thought this sounded like the perfect opportunity to use association rule mining as I'm currently taking a class on the subject.
There are 42 variables in total. Of those, 20 are continuous and had to be discretized. For each variable, I used the Freedman-Diaconis rule to determine how many categories to divide it into.
def Freedman_Diaconis(column_values):
    # sort the list first
    column_values[1].sort()
    first_quartile = int(len(column_values[1]) * .25)
    third_quartile = int(len(column_values[1]) * .75)
    fq_value = column_values[1][first_quartile]
    tq_value = column_values[1][third_quartile]
    iqr = tq_value - fq_value
    n_to_pow = len(column_values[1])**(-1/3)
    h = 2 * iqr * n_to_pow
    retval = (column_values[1][-1] - column_values[1][1])/h
    test = int(retval+1)
    return test
From there I used min-max normalization
def min_max_transform(column_of_data, num_bins):
    min_max_normalizer = preprocessing.MinMaxScaler(feature_range=(1, num_bins))
    data_min_max = min_max_normalizer.fit_transform(column_of_data[1])
    data_min_max_ints = take_int(data_min_max)
    return data_min_max_ints
to transform my data, and then I simply took the integer portion to get the final categorization.
def take_int(list_of_float):
    ints = []
    for flt in list_of_float:
        asint = int(flt)
        ints.append(asint)
    return ints
I then also wrote a function that I used to combine this value with the variable name.
def string_transform(prefix, column, index):
    transformed_list = []
    transformed = ""
    if index < 4:
        for entry in column[1]:
            transformed = prefix + str(entry)
            transformed_list.append(transformed)
    else:
        prefix_num = prefix.split('x')
        for entry in column[1]:
            transformed = str(prefix_num[1]) + 'x' + str(entry)
            transformed_list.append(transformed)
    return transformed_list
This was done to differentiate variables that have the same value, but appear in different columns. For example, having a value of 1 for variable x14 means something different from getting a value of 1 in variable x20. The string transform function would create 14x1 and 20x1 for the previously mentioned examples.
After this, I wrote everything to a file in basket format
def create_basket(list_of_lists, headers):
    #for filename in os.listdir("."):
    #    if filename.e
    if not os.path.exists('baskets'):
        os.makedirs('baskets')
    down_length = len(list_of_lists[0])
    with open('baskets/dataset.basket', 'w') as basketfile:
        basket_writer = csv.DictWriter(basketfile, fieldnames=headers)
        for i in range(0, down_length):
            basket_writer.writerow({"trt": list_of_lists[0][i], "y": list_of_lists[1][i], "x1": list_of_lists[2][i],
                                    "x2": list_of_lists[3][i], "x3": list_of_lists[4][i], "x4": list_of_lists[5][i],
                                    "x5": list_of_lists[6][i], "x6": list_of_lists[7][i], "x7": list_of_lists[8][i],
                                    "x8": list_of_lists[9][i], "x9": list_of_lists[10][i], "x10": list_of_lists[11][i],
                                    "x11": list_of_lists[12][i], "x12": list_of_lists[13][i], "x13": list_of_lists[14][i],
                                    "x14": list_of_lists[15][i], "x15": list_of_lists[16][i], "x16": list_of_lists[17][i],
                                    "x17": list_of_lists[18][i], "x18": list_of_lists[19][i], "x19": list_of_lists[20][i],
                                    "x20": list_of_lists[21][i], "x21": list_of_lists[22][i], "x22": list_of_lists[23][i],
                                    "x23": list_of_lists[24][i], "x24": list_of_lists[25][i], "x25": list_of_lists[26][i],
                                    "x26": list_of_lists[27][i], "x27": list_of_lists[28][i], "x28": list_of_lists[29][i],
                                    "x29": list_of_lists[30][i], "x30": list_of_lists[31][i], "x31": list_of_lists[32][i],
                                    "x32": list_of_lists[33][i], "x33": list_of_lists[34][i], "x34": list_of_lists[35][i],
                                    "x35": list_of_lists[36][i], "x36": list_of_lists[37][i], "x37": list_of_lists[38][i],
                                    "x38": list_of_lists[39][i], "x39": list_of_lists[40][i], "x40": list_of_lists[41][i]})
and I used the apriori package in Orange to see if there were any association rules.
rules = Orange.associate.AssociationRulesSparseInducer(patient_basket, support=0.3, confidence=0.3)

print "%4s %4s %s" % ("Supp", "Conf", "Rule")
for r in rules:
    my_rule = str(r)
    split_rule = my_rule.split("->")
    if 'trt' in split_rule[1]:
        print 'treatment rule'
        print "%4.1f %4.1f %s" % (r.support, r.confidence, r)
Using this technique, I found quite a few association rules with my testing data.
THIS IS WHERE I HAVE A PROBLEM
When I read the notes for the training data, there is this note
...That is, the only
reason for the differences among observed responses to the same treatment across patients is
random noise. Hence, there is NO meaningful subgroup for this dataset...
My question is: why do I get multiple association rules that imply there are subgroups, when according to the notes I shouldn't see anything?
I'm getting lift numbers above 2, as opposed to the 1 you would expect if everything were random, as the notes state.
Supp Conf Rule
0.3 0.7 6x0 -> trt1
Even though my code runs, I'm not getting results anywhere close to what should be expected. This leads me to believe that I messed something up, but I'm not sure what it is.
After some research, I realized that my sample size is too small for the number of variables that I have. I would need a much larger sample size in order to really use the method that I was using. In fact, the method that I tried to use was developed with the assumption that it would be run on databases with hundreds of thousands or millions of rows.
I am writing a Django application which deals with financial data processing.
I have to load a large amount of data (more than 1,000,000 records) from a MySQL table and convert the records to JSON data in my Django views, as follows:
trades = MtgoxTrade.objects.all()
data = []
for trade in trades:
    js = dict()
    js['time'] = trade.time
    js['price'] = trade.price
    js['amount'] = trade.amount
    js['type'] = trade.type
    data.append(js)
return data
The problem is that the for loop is very slow (it takes more than 9 seconds for 200,000 records). Is there any effective way to convert DB records to JSON-format data in Python?
Updated: I have run the code from Mike Housky's answer in my environment (ActivePython 2.7, Win7), with these code changes and results:
def create_data(n):
    from api.models import MtgoxTrade
    result = MtgoxTrade.objects.all()
    return result
Build ............ 0.330999851227
For loop ......... 7.98400020599
List Comp. ....... 0.457000017166
Ratio ............ 0.0572394796312
For loop 2 ....... 0.381999969482
Ratio ............ 0.047845686326
You will find the for loop takes about 8 seconds! And if I comment out the for loop, then the list comprehension also takes about that long:
Times:
Build ............ 0.343000173569
List Comp. ....... 7.57099986076
For loop 2 ....... 0.375999927521
My new question is whether the for loop touches the database. But I did not see any DB access in the logs. So strange!
Here are several tips/things to try.
Since you eventually need to make a JSON string from the queryset, use Django's built-in serializers:
from django.core import serializers

data = serializers.serialize("json",
                             MtgoxTrade.objects.all(),
                             fields=('time', 'price', 'amount', 'type'))
You can make serialization faster by using the ujson or simplejson modules. See the SERIALIZATION_MODULES setting.
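As a rough illustration of that setting (the serializer module path below is hypothetical; you would have to write a small module that implements Django's serializer interface, for example by subclassing django.core.serializers.json.Serializer and using ujson for the dumping step):
# settings.py
# Register a custom serialization format backed by a (hypothetical) ujson-based serializer module.
SERIALIZATION_MODULES = {
    'ujson': 'myproject.serializers.ujson_serializer',
}
You could then call serializers.serialize('ujson', ...) instead of 'json'.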
Also, instead of getting all the field values from the record, be explicit and get only what you need to serialize:
MtgoxTrade.objects.all().values('time','price','amount','type')
Also, you may want to use the iterator() method of a queryset:
...For a QuerySet which returns a large number of objects that you
only need to access once, this can result in better performance and a
significant reduction in memory...
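For instance, combining the two previous suggestions (a small sketch; process() is just a placeholder for whatever you do with each record):
# Stream only the needed fields as dicts, without caching the whole result set in memory.
for row in MtgoxTrade.objects.values('time', 'price', 'amount', 'type').iterator():
    process(row)  # placeholder for your own handling of each record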
Also, you can split your huge queryset into batches, see: Batch querysets.
Also see:
Why is iterating through a large Django QuerySet consuming massive amounts of memory?
Memory efficient Django Queryset Iterator
django: control json serialization
You can use a list comprehension as that prevents many dict() and append() calls:
trades = MtgoxTrade.objects.all()
data = [{'time': trade.time, 'price': trade.price, 'amount': trade.amount, 'type': trade.type}
        for trade in trades]
return data
Function calls are expensive in Python so you should aim to avoid them in slow loops.
This answer is in support of Simeon Visser's observation. I ran the following code:
import gc, random, time

if "xrange" not in dir(__builtins__):
    xrange = range

class DataObject(object):
    def __init__(self, time, price, amount, type):
        self.time = time
        self.price = price
        self.amount = amount
        self.type = type

def create_data(n):
    result = []
    for index in xrange(n):
        s = str(index)
        result.append(DataObject("T"+s, "P"+s, "A"+s, "ty"+s))
    return result

def convert1(trades):
    data = []
    for trade in trades:
        js = dict()
        js['time'] = trade.time
        js['price'] = trade.price
        js['amount'] = trade.amount
        js['type'] = trade.type
        data.append(js)
    return data

def convert2(trades):
    data = [{'time': trade.time, 'price': trade.price, 'amount': trade.amount, 'type': trade.type}
            for trade in trades]
    return data

def convert3(trades):
    ndata = len(trades)
    data = ndata*[None]
    for index in xrange(ndata):
        t = trades[index]
        js = dict()
        js['time'] = t.time
        js['price'] = t.price
        js['amount'] = t.amount
        js['type'] = t.type
        #js = {"time" : t.time, "price" : t.price, "amount" : t.amount, "type" : t.type}
    return data

def main(n=1000000):
    t0s = time.time()
    trades = create_data(n)
    t0f = time.time()
    t0 = t0f - t0s

    gc.disable()

    t1s = time.time()
    jtrades1 = convert1(trades)
    t1f = time.time()
    t1 = t1f - t1s

    t2s = time.time()
    jtrades2 = convert2(trades)
    t2f = time.time()
    t2 = t2f - t2s

    t3s = time.time()
    jtrades3 = convert3(trades)
    t3f = time.time()
    t3 = t3f - t3s

    gc.enable()

    print ("Times:")
    print (" Build ............ " + str(t0))
    print (" For loop ......... " + str(t1))
    print (" List Comp. ....... " + str(t2))
    print (" Ratio ............ " + str(t2/t1))
    print (" For loop 2 ....... " + str(t3))
    print (" Ratio ............ " + str(t3/t1))

main()
Results on Win7, Core 2 Duo 3.0GHz:
Python 2.7.3:
Times:
Build ............ 2.95600008965
For loop ......... 0.699999809265
List Comp. ....... 0.512000083923
Ratio ............ 0.731428890618
For loop 2 ....... 0.609999895096
Ratio ............ 0.871428659011
Python 3.3.0:
Times:
Build ............ 3.4320058822631836
For loop ......... 1.0200011730194092
List Comp. ....... 0.7500009536743164
Ratio ............ 0.7352942070195492
For loop 2 ....... 0.9500019550323486
Ratio ............ 0.9313733946208623
Those vary a bit, even with GC disabled (much more variance with GC enabled, but about the same results). The third conversion timing shows that a fair-sized chunk of the saved time comes from not calling .append() a million times.
Ignore the "For loop 2" times. This version has a bug and I am out of time to fix it for now.
First you have to check if the performance loss happens while fetching the data from the database or inside the loop.
There is no real option that will give you a significant speedup, not even the list comprehension mentioned above.
However there is a huge difference in performance between Python 2 and 3.
A simple benchmark showed me that the for loop is roughly 2.5 times faster with Python 3.3 (using a simple benchmark like the following):
import time

ts = time.time()
data = list()
for i in range(1000000):
    d = {}
    d['a'] = 1
    d['b'] = 2
    d['c'] = 3
    d['d'] = 4
    d['a'] = 5
    data.append(d)
print(time.time() - ts)
/opt/python-3.3.0/bin/python3 foo2.py
0.5906929969787598
python2.6 foo2.py
1.74390792847
python2.7 foo2.py
0.673550128937
You will also note that there is a significant performance difference between Python 2.6 and 2.7.
I think it's worth trying a raw query against the database, because a Model adds a lot of extra boilerplate code to the fields (I believe fields are properties), and, as previously mentioned, function calls are expensive. See the documentation; there is an example at the bottom of the page that uses dictfetchall, which seems like the thing you are after.
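A rough sketch of that approach, following the dictfetchall pattern from the Django docs (the table name api_mtgoxtrade is only a guess derived from the app and model names in the question; check your actual table name):
from django.db import connection

def dictfetchall(cursor):
    # Return all rows from a cursor as a list of dicts (helper shown in the Django docs).
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]

def trades_as_dicts():
    cursor = connection.cursor()
    cursor.execute("SELECT time, price, amount, type FROM api_mtgoxtrade")
    return dictfetchall(cursor)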
You might want to look into the values method. It will return an iterable of dicts instead of model objects, so you don't have to create a lot of intermediate data structures. Your code could be reduced to
return MtgoxTrade.objects.values('time', 'price', 'amount', 'type')
First, I'm new to Python and I work with ArcGIS 9.3.
I'd like to run a loop with the "Select_Analysis" tool. I have a layer "stations" composed of all the bus stations of a city.
The layer has a field "rte_id" that indicates which line a station is located on.
I'd like to save the stations with "rte_id" = 1 in one layer, the stations with "rte_id" = 2 in another, and so on. Hence the use of the Select_Analysis tool.
So, I decided to make a loop (I have 70 different "rte_id" values .... so 70 different layers to create!). But it does not work and I'm totally lost!
Here is my code:
import arcgisscripting, os, sys, string

gp = arcgisscripting.create(9.3)
gp.AddToolbox("C:/Program Files (x86)/ArcGIS/ArcToolbox/Toolboxes/Data Management Tools.tbx")

stations = "d:/Travaux/NantesMetropole/Traitements/SIG/stations.shp"
field = "rte_id"

for i in field:
    gp.Select_Analysis (stations, "d:/Travaux/NantesMetropole/Traitements/SIG/stations_" + i + ".shp", field + "=" + i)
    i = i+1
print "ok"
And here is the error message:
gp.Select_Analysis (stations, "d:/Travaux/NantesMetropole/Traitements/SIG/stations_" + i + ".shp", field + "=" + i)
TypeError: can only concatenate list (not "str") to list
Have you got any ideas to solve my problem?
Thanks in advance!
Julien
The main problem here is in the line
for i in field:
You are trying to iterate over a string, the field name ("rte_id").
That is not correct.
You need to iterate over all possible values of the field "rte_id".
Easiest solution:
If you know that the field "rte_id" has values 1 to 70 (for example), then you can try:
for i in range(1, 71):
    shp_name = "d:/Travaux/NantesMetropole/Traitements/SIG/stations_" + str(i) + ".shp"
    expression = '{0} = {1}'.format(field, i)
    gp.Select_Analysis(stations, shp_name, expression)
    print "ok"
More sophisticated solution:
You need to get a list of all the unique values of the field "rte_id" (in SQL terms, to perform a GROUP BY).
I think it is not actually possible to perform a GROUP BY operation on SHP files with a single tool.
You can use a SearchCursor, iterate through all the features, and build a list of the unique values of your field, but this is a more complex task.
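Something along these lines might work as a starting point (an untested sketch based on the 9.3 geoprocessor cursor API; check the exact method names against your ArcGIS version, and note it reuses gp, stations and field from your script):
# Collect the unique rte_id values by walking the attribute table with a SearchCursor.
unique_ids = set()
rows = gp.SearchCursor(stations)
row = rows.Next()
while row:
    unique_ids.add(row.GetValue(field))
    row = rows.Next()

# Then export one shapefile per unique value.
for value in sorted(unique_ids):
    shp_name = "d:/Travaux/NantesMetropole/Traitements/SIG/stations_" + str(value) + ".shp"
    gp.Select_Analysis(stations, shp_name, '{0} = {1}'.format(field, value))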
Another way is to use the Summarize option on the shapefile's attribute table in ArcMap (open the table, right-click on the column header). You will get a dbf table with the unique values, which you can read in your script.
I hope this helps you get started!
I don't have ArcGIS at hand right now, so I can't write and test a full script.
You will need to make substantial changes to this code in order to get it to do what you want. You may just want to download the Split Layer By Attribute code from ArcGIS Online, which does the exact same thing.