Float Values Rounding with SQLAlchemy and MySQL - python

My question is similar to this unanswered question: SQLAlchemy commits makes float to be rounded
I have a text file of data that looks like this:
#file camera date mjd focus error
ibcy02blq UVIS1 08/03/09 55046.196630 0.57857 0.55440
ibcy02bnq UVIS1 08/03/09 55046.198330 -0.15000 0.42111
ibcy03j8q UVIS1 08/11/09 55054.041650 -0.37143 0.40802
ibcy03jaq UVIS1 08/11/09 55054.043350 -0.91857 0.51859
ibcy04m4q UVIS1 08/18/09 55061.154900 -0.32333 0.52327
ibcy04m6q UVIS1 08/18/09 55061.156600 -0.24867 0.66651
ibcy05b7q UVIS1 09/05/09 55079.912670 0.64900 0.58423
ibcy05b9q UVIS1 09/05/09 55079.914370 0.82000 0.50202
ibcy06meq UVIS1 10/02/09 55106.909840 -0.09667 0.24016
But once I read it into my MySQL database it looks like this:
+------+-----------+--------+------------+---------+----------+
| id | filename | camera | date | mjd | focus |
+------+-----------+--------+------------+---------+----------+
| 1026 | ibcy02blq | UVIS1 | 2009-08-03 | 55046.2 | 0.57857 |
| 1027 | ibcy02bnq | UVIS1 | 2009-08-03 | 55046.2 | -0.15 |
| 1028 | ibcy03j8q | UVIS1 | 2009-08-11 | 55054 | -0.37143 |
| 1029 | ibcy03jaq | UVIS1 | 2009-08-11 | 55054 | -0.91857 |
| 1030 | ibcy04m4q | UVIS1 | 2009-08-18 | 55061.2 | -0.32333 |
| 1031 | ibcy04m6q | UVIS1 | 2009-08-18 | 55061.2 | -0.24867 |
| 1032 | ibcy05b7q | UVIS1 | 2009-09-05 | 55079.9 | 0.649 |
| 1033 | ibcy05b9q | UVIS1 | 2009-09-05 | 55079.9 | 0.82 |
| 1034 | ibcy06meq | UVIS1 | 2009-10-02 | 55106.9 | -0.09667 |
| 1035 | ibcy06mgq | UVIS1 | 2009-10-02 | 55106.9 | -0.1425 |
+------+-----------+--------+------------+---------+----------+
The mjd column is being truncated and I'm not sure why. I understand that there are floating point precision errors for something like 1/3 but this looks more like some type of rounding is being implemented.
Here is the code I use to ingest the data into the database:
def make_focus_table_main():
    """The main controller for the make_focus_table
    module."""
    logging.info('Process Starting')
    filename_list = glob.glob('/grp/hst/OTA/focus/source/FocusModel/UVIS*FocusHistory.txt')
    logging.info('Found {} files'.format(len(filename_list)))
    for filename in filename_list:
        logging.info('Reading data from {}'.format(filename))
        output_list = []
        with open(filename, 'r') as f:
            data = f.readlines()
        for line in data[1:]:
            line = line.split()
            output_dict = {}
            output_dict['filename'] = line[0]
            output_dict['camera'] = line[1]
            output_dict['date'] = datetime.strptime(line[2], '%m/%d/%y')
            output_dict['mjd'] = float(line[3])
            output_dict['focus'] = float(line[4])
            output_list.append(output_dict)
        logging.info('Beginning bulk insert of records.')
        engine.execute(Focus.__table__.insert(), output_list)
        logging.info('Database insert complete.')
    logging.info('Process Complete')
I've used pdb to check that the values are not being truncated prior to being passed to the database (i.e. Python/SQLAlchemy is not performing the rounding). I can verify this in the INSERT command SQLAlchemy issues:
2014-04-11 13:08:20,522 INFO sqlalchemy.engine.base.Engine INSERT INTO focus (filename, camera, date, mjd, focus) VALUES (%s, %s, %s, %s, %s)
2014-04-11 13:08:20,602 INFO sqlalchemy.engine.base.Engine (
('ibcy02blq', 'UVIS2', datetime.datetime(2009, 8, 3, 0, 0), 55046.19663, 1.05778),
('ibcy02bnq', 'UVIS2', datetime.datetime(2009, 8, 3, 0, 0), 55046.19833, 1.32333),
('ibcy03j8q', 'UVIS2', datetime.datetime(2009, 8, 11, 0, 0), 55054.04165, 1.57333),
('ibcy03jaq', 'UVIS2', datetime.datetime(2009, 8, 11, 0, 0), 55054.04335, 0.54333),
('ibcy04m4q', 'UVIS2', datetime.datetime(2009, 8, 18, 0, 0), 55061.1549, -1.152),
('ibcy04m6q', 'UVIS2', datetime.datetime(2009, 8, 18, 0, 0), 55061.1566, -1.20733),
('ibcy05b7q', 'UVIS2', datetime.datetime(2009, 9, 5, 0, 0), 55079.91267, 2.35905),
('ibcy05b9q', 'UVIS2', datetime.datetime(2009, 9, 5, 0, 0), 55079.91437, 1.84524)
... displaying 10 of 1025 total bound parameter sets ...
('ichl05qwq', 'UVIS2', datetime.datetime(2014, 4, 2, 0, 0), 56749.05103, -2.98),
('ichl05qxq', 'UVIS2', datetime.datetime(2014, 4, 2, 0, 0), 56749.05177, -3.07))
2014-04-11 13:08:20,959 INFO sqlalchemy.engine.base.Engine COMMIT
Here is how the column is defined in my SQLAlchemy classes:
class Focus(Base):
    """ORM for the table storing the focus measurement information."""
    __tablename__ = 'focus'
    id = Column(Integer(), primary_key=True)
    filename = Column(String(17), index=True, nullable=False)
    camera = Column(String(5), index=True, nullable=False)
    date = Column(Date(), index=True, nullable=False)
    mjd = Column(Float(precision=20, scale=10), index=True, nullable=False)
    focus = Column(Float(15), nullable=False)
    __table_args__ = (UniqueConstraint('filename', 'camera',
        name='focus_uniqueness_constraint'),)
Here is the SQL that's logged from SQLAlchemy with echo=True when I create the table:
CREATE TABLE focus (
    id INTEGER NOT NULL AUTO_INCREMENT,
    filename VARCHAR(17) NOT NULL,
    camera VARCHAR(5) NOT NULL,
    date DATE NOT NULL,
    mjd FLOAT(20) NOT NULL,
    focus FLOAT(15) NOT NULL,
    PRIMARY KEY (id),
    CONSTRAINT focus_uniqueness_constraint UNIQUE (filename, camera)
)
So far, so good. But here's what I see in MySQL with a SHOW CREATE TABLE focus;:
CREATE TABLE `focus` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`filename` varchar(17) NOT NULL,
`camera` varchar(5) NOT NULL,
`date` date NOT NULL,
`mjd` float NOT NULL,
`focus` float NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `focus_uniqueness_constraint` (`filename`,`camera`),
KEY `ix_focus_filename` (`filename`),
KEY `ix_focus_mjd` (`mjd`),
KEY `ix_focus_date` (`date`),
KEY `ix_focus_camera` (`camera`)
) ENGINE=InnoDB AUTO_INCREMENT=1193 DEFAULT CHARSET=latin1
Somehow the FLOAT definition changed! Is this some type of MySQL configuration setting? I'm just running this on my local host right now, but if this is a configuration setting I'm concerned about the portability of this code onto a production server if I continue to use floats. I could just switch to a decimal column type as I've seen in other SO questions since I need exact values but I would like to understand what's going on here.
Update: Just to expand a little on two-bit-alchemist's answer, here is how it changes my query:
> SELECT ROUND(mjd,10) FROM focus LIMIT 10;
+------------------+
| ROUND(mjd,10) |
+------------------+
| 55046.1953125000 |
| 55046.1992187500 |
| 55054.0429687500 |
| 55054.0429687500 |
| 55061.1562500000 |
| 55061.1562500000 |
| 55079.9140625000 |
| 55079.9140625000 |
| 55106.9101562500 |
| 55106.9101562500 |
+------------------+
10 rows in set (0.00 sec)
Notice that more digits are displayed now, although they no longer match the input: 55046.19663 comes back as 55046.1953125, which is what a single-precision float can hold at this magnitude. I had no idea SELECT was rounding values, but I guess this makes sense if you think about how a floating point representation works. It uses the full bytes allocated for that number, and how many decimals you display is arbitrary up to the full length of the float: https://stackoverflow.com/a/20482699/1216837
Specifying the precision only seems to affect if it's stored as a double or a single: http://dev.mysql.com/doc/refman/5.0/en/floating-point-types.html.
But, what's also interesting/annoying is that I have to worry about this same thing when issuing a SELECT from the SQLAlchemy layer:
query = psf_session.query(Focus).first()
print query.filename, query.mjd, query.focus
This gives me ibcy02blq 55046.2 1.05778, so the values are still being rounded. Again, this makes sense because SQLAlchemy is just issuing SQL commands anyway. All in all, this is motivating me to switch to a DECIMAL column type: http://dev.mysql.com/doc/refman/5.0/en/fixed-point-types.html
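If the column does become DECIMAL, it may help to parse the text with decimal.Decimal rather than float, so the digits in the file reach the database unchanged. A minimal sketch, assuming the whitespace-separated layout shown at the top of the question; parse_focus_line is a hypothetical helper, not part of the original script:

```python
from datetime import datetime
from decimal import Decimal


def parse_focus_line(line):
    """Parse one data line into a dict, keeping mjd and focus as exact decimals."""
    fields = line.split()
    return {
        'filename': fields[0],
        'camera': fields[1],
        'date': datetime.strptime(fields[2], '%m/%d/%y').date(),
        'mjd': Decimal(fields[3]),    # exact: no binary rounding
        'focus': Decimal(fields[4]),
    }


# Sample line copied from the data file shown in the question.
row = parse_focus_line('ibcy02blq UVIS1 08/03/09 55046.196630 0.57857 0.55440')
print(row['mjd'])
```

Whether the driver passes Decimal values through unchanged depends on the column type, so this only helps once the column itself is exact.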

It looks like all your values were printed with exactly six digits (except where .0 was left off in a couple of places). While I can't find any documentation on this, I suspect this is simply a default MySQL behavior for displaying float values in the context of a SELECT statement.
Based on the CREATE TABLE statement you provided, the internal representation is correct, so you need only add something like ROUND(mjd, 3) to your statement, with the first argument being the column to round and the second being the number of digits to round to (which can be larger than what is displayed now).
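The values returned by ROUND(mjd, 10) in the question's update can be reproduced without a database. A small sketch using only the standard library's struct module, on the assumption that the mjd column ended up as a 4-byte single-precision float (which is what `float` in the SHOW CREATE TABLE output suggests); as_float32 is a name made up for this illustration:

```python
import struct


def as_float32(value):
    """Round-trip a Python float (a double) through 4-byte single precision."""
    return struct.unpack('f', struct.pack('f', value))[0]


for mjd in (55046.19663, 55046.19833):
    print(mjd, '->', as_float32(mjd))
# 55046.19663 -> 55046.1953125
# 55046.19833 -> 55046.19921875
```

If that assumption holds, adjacent single-precision values near 55046 are 1/256 apart, so ROUND only changes how many digits are displayed; keeping six decimals would need a wider type such as DOUBLE or an exact DECIMAL column.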

Related

Why pywinauto won't detect the control identifiers in a window except the title bar?

I've been trying to automate the Cinebench window with Python using pywinauto, as this is the best library that I came across. I made a few projects that worked well, but with Cinebench I don't get any control identifiers (except for the title bar and the usual three top buttons). My main objective is to be able to automatically start benchmarks and read the final score.
I didn't come here to bother you all as soon as I hit an issue, so here are all of the things that I tried:
Switching backend="uia" to backend="win32". Result: the code stopped working.
Waiting for the window to load, using time.sleep(). Result: no difference.
Adding timeout=10 to the .connect() function. Result: no difference.
Researching whether Cinebench has an API. Result: it doesn't, as far as I could find.
Researching whether there is another library to do it. Result: didn't find any.
I really don't want to do this by clicking at fixed coordinates, and even then I wouldn't be able to read the score, so it would be useless.
The code that I used:
import os
from pywinauto import Application
app = Application(backend="uia").start(rf"C:/Users/{os.getlogin()}/Desktop/MasterBench/Benchmarks/Cinebench.exe")
app = Application(backend="uia").connect(title=CINEBENCH_WINDOW_NAME, timeout=10)
app.CINEBENCHR23200.print_control_identifiers()
What I got:
Control Identifiers:
Dialog - 'CINEBENCH R23.200' (L-8, T-8, R1928, B1088)
['CINEBENCH R23.200', 'CINEBENCH R23.200Dialog', 'Dialog']
child_window(title="CINEBENCH R23.200", control_type="Window")
|
| TitleBar - '' (L16, T-5, R1920, B23)
| ['TitleBar']
| |
| | Menu - 'Sistema' (L0, T0, R22, B22)
| | ['SistemaMenu', 'Sistema', 'Menu', 'Sistema0', 'Sistema1']
| | child_window(title="Sistema", auto_id="MenuBar", control_type="MenuBar")
| | |
| | | MenuItem - 'Sistema' (L0, T0, R22, B22)
| | | ['Sistema2', 'SistemaMenuItem', 'MenuItem']
| | | child_window(title="Sistema", control_type="MenuItem")
| |
| | Button - 'Riduci a icona' (L1779, T8, R1826, B22)
| | ['Button', 'Riduci a iconaButton', 'Riduci a icona', 'Button0', 'Button1']
| | child_window(title="Riduci a icona", control_type="Button")
| |
| | Button - 'Ripristino' (L1826, T8, R1872, B22)
| | ['Button2', 'Ripristino', 'RipristinoButton']
| | child_window(title="Ripristino", control_type="Button")
| |
| | Button - 'Chiudi' (L1872, T8, R1928, B22)
| | ['Button3', 'Chiudi', 'ChiudiButton']
| | child_window(title="Chiudi", control_type="Button")

How to disable text wrap in a columnar column?

|---------|------------------|------------------|-----------|------------------|
|serial no|ggggggg name |status |status code|AAAAAAAAAurl |
|==============================================================================|
|1 |ggggggggggg-kkkkkk|Healthy |200 |http://aaaaaaaaaaa|
| |e | | |-service.dev.sdddd|
| | | | |1.cccc.cc/health/l|
| | | | |ive |
|---------|------------------|------------------|-----------|------------------|
|2 |zzzzzzzz-jjjjjj |Healthy |200 |http://ddddddddddd|
| | | | |ader.dev.ffffff.cc|
| | | | |cc.cc/health/live |
|---------|------------------|------------------|-----------|------------------|
I am trying to get the entire URL in the last column onto a single row. I am using the Columnar library (https://pypi.org/project/Columnar/) to print this. I tried a few things, such as setting the max and min column width as described there, but none of them work.
Edit: Headers are simply names of the columns, you can name it anything you want
from columnar import columnar
headers = ['serial no', 'service name', 'status', 'status code']
...
tabledata = []
counter = 0
for x in services:
    zzz = requests.get("http://xxx.yyy"+ x)
    counter = counter + 1
    i = counter
    myrowdata = [i, x, zzz.text, zzz.status_code]
    tabledata.append(myrowdata)
table = columnar(tabledata, headers, no_borders=True, max_column_width=None)
print(table)
1.) You missed the column name "url" from headers.
You should do as follows:
headers = ['serial no', 'service name', 'status', 'status code', 'url']
2.) You have to add url to myrowdata:
myrowdata = [i, x, zzz.text, zzz.status_code, "http://xxx.yyy"+ x]
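The two fixes above amount to keeping the headers and every row the same length. A hand-rolled sketch of the same five-column table, which pads cells with str.ljust and never wraps them; this is a swapped-in technique rather than a columnar feature, and format_table plus the sample row are made up for illustration:

```python
def format_table(headers, rows):
    """Left-justify every column to its widest cell; cells are never wrapped."""
    cells = [[str(c) for c in row] for row in [headers] + rows]
    widths = [max(len(row[i]) for row in cells) for i in range(len(headers))]
    return '\n'.join(
        '  '.join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in cells
    )


headers = ['serial no', 'service name', 'status', 'status code', 'url']
# Hypothetical row: five values, matching the five headers.
rows = [[1, 'example-service', 'Healthy', 200,
         'http://example.invalid/health/live']]
print(format_table(headers, rows))
```

Long lines may still be wrapped visually by a narrow console, but the string itself keeps each URL on one row.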
Update:
If you have applied the fixes above, run the script in an external system terminal to see the real result, because some IDE consoles constrain the display width:
In Spyder:
SERIAL NO SERVICE NAME STATUS STATUS CODE URL
1 Anyname Anytext Anystatus_code http://aaaaaaaaaaaaaaaaaaa
aadddddddddddddddddddddddd
dddddddaaaaaaaaa.com
In external system terminal:

Formatting a table from a CSV file

I'm trying to make a table from the data in a CSV file using only the csv module. Could anyone tell me what I should do to display the '|' at the end of every row (just after the last element in the row)?
Here's what I have so far:
def display_playlist( filename ):
    if filename.endswith('.csv')==False: #check if it ends with CSV extension
        filename = filename + ('.csv') #adding .csv if given without .csv extension
    max_element_length=0
    #aligning columns to the longest elements
    for row in get_datalist_from_csv( filename ):
        for element in row:
            if len(element)>max_element_length:
                max_element_length=len(element)
    # print(max_element_length)
    #return max_element_length
    print('-----------------------------------------------------------------------------')
    for row in get_datalist_from_csv( filename ):
        for element in row:
            print('| ', end='')
            if (len(element)<=4 and element.isdigit==True):
                print(pad_to_length(element,4), end=' |') #trying to get '|' at the end
            else:
                print(pad_to_length(element, max_element_length), end=' ')
        print('\n')
    print('-----------------------------------------------------------------------------')
## Read data from a csv format file
def get_datalist_from_csv( filename ):
    ## Create a 'file object' f, for accessing the file:
    with open( filename ) as f:
        reader = csv.reader(f) # create a 'csv reader' from the file object
        datalist = list( reader ) # create a list from the reader
    return datalist # we have a list of lists
## For aligning table columns
## It adds spaces to the end of a string to make it up to length n.
def pad_to_length( string, n):
    return string + " "* (n-len(string)) ## s*n gives empty string for n<1
The output I get for now is:
| Track | Artist | Album | Time
| Computer Love | Kraftwerk | Computer World | 7:15
| Paranoid Android | Radiohead | OK Computer | 6:27
| Computer Age | Neil Young | Trans | 5:24
| Digital | Joy Division | Still | 2:50
| Silver Machine | Hawkwind | Roadhawks | 4:39
| Start the Simulator | A-Ha | Foot of the Mountain | 5:11
| Internet Connection | M.I.A. | MAYA | 2:56
| Deep Blue | Arcade Fire | The Suburbs | 4:29
| I Will Derive! | MindofMatthew | You Tube | 3:17
| Lobachevsky | Tom Lehrer | You Tube | 3:04
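One way to get the closing '|' is to build each row as a single string and join the padded cells with ' | ', so the delimiter handling lives in one place. A minimal sketch, assuming the rows are the list of lists returned by get_datalist_from_csv and that every row has the same number of cells; format_rows is a hypothetical helper, and it sizes each column separately rather than using one global width:

```python
def format_rows(rows):
    """Return one '| a | b |' string per row, each column padded to its widest cell."""
    # zip(*rows) transposes the rows into columns so each width is computed once.
    widths = [max(len(cell) for cell in column) for column in zip(*rows)]
    return ['| ' + ' | '.join(cell.ljust(width) for cell, width in zip(row, widths)) + ' |'
            for row in rows]


# A few rows taken from the playlist shown in the question.
sample = [['Track', 'Artist', 'Album', 'Time'],
          ['Computer Love', 'Kraftwerk', 'Computer World', '7:15'],
          ['Digital', 'Joy Division', 'Still', '2:50']]
for line in format_rows(sample):
    print(line)
```

Because every line is padded to the same total length, the dashed separator could also be sized from len(line) instead of being hard-coded.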

Getting single letters instead of sentences after applying NLTK's sentence tokenizer in Python 3.5.1

import codecs, os
import re
import string
import mysql
import mysql.connector
y_ = ""
'''Searching and reading text files from a folder.'''
for root, dirs, files in os.walk("/Users/ultaman/Documents/PAN dataset/Pan Plagiarism dataset 2010/pan-plagiarism-corpus-2010/source-documents/test1"):
    for file in files:
        if file.endswith(".txt"):
            x_ = codecs.open(os.path.join(root,file),"r", "utf-8-sig")
            for lines in x_.readlines():
                y_ = y_ + lines
'''Tokenizing the sentences of the text file.'''
from nltk.tokenize import sent_tokenize
raw_docs = sent_tokenize(y_)
tokenized_docs = [sent_tokenize(y_) for sent in raw_docs]
'''Removing punctuation marks.'''
regex = re.compile('[%s]' % re.escape(string.punctuation))
tokenized_docs_no_punctuation = ''
for review in tokenized_docs:
    new_review = ''
    for token in review:
        new_token = regex.sub(u'', token)
        if not new_token == u'':
            new_review+= new_token
    tokenized_docs_no_punctuation += (new_review)
print(tokenized_docs_no_punctuation)
'''Connecting and inserting tokenized documents without punctuation in database field.'''
def connect():
    for i in range(len(tokenized_docs_no_punctuation)):
        conn = mysql.connector.connect(user = 'root', password = '', unix_socket = "/tmp/mysql.sock", database = 'test' )
        cursor = conn.cursor()
        cursor.execute("""INSERT INTO splitted_sentences(sentence_id, splitted_sentences) VALUES(%s, %s)""",(cursor.lastrowid,(tokenized_docs_no_punctuation[i])))
        conn.commit()
        conn.close()
if __name__ == '__main__':
    connect()
After running the above code, the result looks like
2 | S | N |
| 3 | S | o |
| 4 | S | |
| 5 | S | d |
| 6 | S | o |
| 7 | S | u |
| 8 | S | b |
| 9 | S | t |
| 10 | S | |
| 11 | S | m |
| 12 | S | y |
| 13 | S |
| 14 | S | d
in the database.
It should be like:
1 | S | No doubt, my dear friend.
2 | S | no doubt.
I suggest the following edits (use whichever you like); this is what I used to get your code running. Your issue is that review in "for review in tokenized_docs:" is already a string, so token in "for token in review:" is a single character. To fix this I tried:
tokenized_docs = ['"No doubt, my dear friend, no doubt; but in the meanwhile suppose we talk of this annuity.', 'Shall we say one thousand francs a year."', '"What!"', 'asked Bonelle, looking at him very fixedly.', '"My dear friend, I mistook; I meant two thousand francs per annum," hurriedly rejoined Ramin.', 'Monsieur Bonelle closed his eyes, and appeared to fall into a gentle slumber.', 'The mercer coughed;\nthe sick man never moved.', '"Monsieur Bonelle."']
'''Removing punctuation marks.'''
regex = re.compile('[%s]' % re.escape(string.punctuation))
tokenized_docs_no_punctuation = []
for review in tokenized_docs:
    new_token = regex.sub(u'', review)
    if not new_token == u'':
        tokenized_docs_no_punctuation.append(new_token)
print(tokenized_docs_no_punctuation)
and got this:
['No doubt my dear friend no doubt but in the meanwhile suppose we talk of this annuity', 'Shall we say one thousand francs a year', 'What', 'asked Bonelle looking at him very fixedly', 'My dear friend I mistook I meant two thousand francs per annum hurriedly rejoined Ramin', 'Monsieur Bonelle closed his eyes and appeared to fall into a gentle slumber', 'The mercer coughed\nthe sick man never moved', 'Monsieur Bonelle']
The final format of the output is up to you. I prefer using lists. But you could concatenate this into a string as well.
nw = []
for review in tokenized_docs[0]:
    new_review = ''
    for token in review:
        new_token = regex.sub(u'', token)
        if not new_token == u'':
            new_review += new_token
    nw.append(new_review)
'''Inserting into database'''
def connect():
    for j in nw:
        conn = mysql.connector.connect(user = 'root', password = '', unix_socket = "/tmp/mysql.sock", database = 'Thesis' )
        cursor = conn.cursor()
        cursor.execute("""INSERT INTO splitted_sentences(sentence_id, splitted_sentences) VALUES(%s, %s)""",(cursor.lastrowid,j))
        conn.commit()
        conn.close()
if __name__ == '__main__':
    connect()
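The difference between the two outcomes comes down to the container type: indexing a string yields single characters, while indexing a list of strings yields whole sentences. A short sketch with two made-up sentences standing in for the sent_tokenize output (strip_punctuation is a hypothetical helper, not part of NLTK):

```python
import re
import string

PUNCTUATION = re.compile('[%s]' % re.escape(string.punctuation))


def strip_punctuation(sentences):
    """Remove punctuation from each sentence, dropping any that become empty."""
    cleaned = (PUNCTUATION.sub('', sentence) for sentence in sentences)
    return [sentence for sentence in cleaned if sentence]


# Stand-in for the list that sent_tokenize would return.
sentences = ['No doubt, my dear friend.', 'no doubt.']

as_string = ''.join(strip_punctuation(sentences))  # one long string
as_list = strip_punctuation(sentences)             # one entry per sentence

print(as_string[0])  # N
print(as_list[0])    # No doubt my dear friend
```

Looping over as_list in connect() would therefore insert one sentence per row, whereas looping over the indices of as_string inserts one character per row.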

Is there an easy way to add permissions to a user or a group in Django?

I'm currently adding permissions to users and groups like this:
permissions = list(
    Permission.objects.filter(
        Q(codename='add_server', content_type=ContentType.objects.get(app_label='bildverteiler', name='server'))
        | Q(codename='change_server', content_type=ContentType.objects.get(app_label='bildverteiler', name='server'))
        | Q(codename='delete_server', content_type=ContentType.objects.get(app_label='bildverteiler', name='server'))
        | Q(codename='change_group', content_type=ContentType.objects.get(app_label='bildverteiler', name='group'))
        | Q(codename='add_group', content_type=ContentType.objects.get(app_label='bildverteiler', name='group'))
        | Q(codename='delete_group', content_type=ContentType.objects.get(app_label='bildverteiler', name='group'))
        | Q(codename='change_user', content_type=ContentType.objects.get(app_label='auth', name='user'))
        | Q(codename='add_user', content_type=ContentType.objects.get(app_label='auth', name='user'))
        | Q(codename='delete_user', content_type=ContentType.objects.get(app_label='auth', name='user'))
    )
)
some_user_obj.user_permissions = permissions
Is there a better way? Maybe without the query?
This solution has fewer characters:
query = Q()
for app_label, name in [
        ('bildverteiler', 'server'),
        ('bildverteiler', 'group'),
        ('auth', 'user')]:
    query |= Q(content_type__app_label=app_label, content_type__name=name,
               codename__in=['%s_%s' % (perm_name, name) for perm_name in ['change', 'add', 'delete']])
Permission.objects.filter(query)
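The codename list in that loop can be generated and checked on its own, without touching the database. A plain-Python sketch of that step only; permission_codenames, MODELS and ACTIONS are names invented here, and the app and model names are the ones from the question:

```python
from itertools import product

ACTIONS = ('add', 'change', 'delete')
MODELS = [('bildverteiler', 'server'),
          ('bildverteiler', 'group'),
          ('auth', 'user')]


def permission_codenames(models=MODELS, actions=ACTIONS):
    """Return (app_label, codename) pairs such as ('auth', 'add_user')."""
    return [(app_label, '%s_%s' % (action, model))
            for (app_label, model), action in product(models, actions)]


for app_label, codename in permission_codenames():
    print(app_label, codename)
```

Each pair could then feed a lookup such as Permission.objects.get(content_type__app_label=app_label, codename=codename), assuming codenames are unique within an app.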
