Undo Table Conversion with Astropy - python

I have a FITS file with a BinTableHDU that has many entries that have been converted from digital numbers to various units like volts and currents. I would like to turn off this conversion and access the original digital number values that were stored in the table.
This table makes use of TTYPE, TSCAL, TZERO, and TUNIT keys in the header to accomplish the conversions. So I could use these header keys to undo the conversion manually.
undo = (converted - tzero ) / tscal
Since the astropy.table.Table and astropy.table.QTable classes automatically interpret these fields, I can't help but think I'm overlooking a way to use astropy functions to undo the conversion. Is there an attribute or function in the astropy QTable or Table class that would let me undo the conversion automatically (or perhaps another, easier way to do it)?
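For reference, the manual undo I have in mind would look roughly like this (the column name and the TSCAL/TZERO index here are hypothetical placeholders, not from my actual file):
from astropy.io import fits

with fits.open('data.fits') as hdus:
    hdr = hdus[1].header
    converted = hdus[1].data['some_column']   # hypothetical column name
    tscal = hdr.get('TSCAL3', 1.0)            # hypothetical column number
    tzero = hdr.get('TZERO3', 0.0)
    raw = (converted - tzero) / tscal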
EDIT: Thanks to the answer by Tom Aldcroft below.
In my situation a FITS binary table can be grabbed and put into a pandas.DataFrame either converted or unconverted using the following code:
import pandas as pd
from astropy.table import Table
from astropy.io import fits
# grab the converted data, scaled into physical values/units (volts, current)
converted = Table(fits.getdata(file, 2)).to_pandas()
# grab unconverted data in its original raw form
unconverted = Table(fits.getdata(file, 2)._get_raw_data()).to_pandas()

You'll need to be using the direct astropy.io.fits interface instead of the high-level Table interface. From there something like this might work:
from astropy.io import fits
with fits.open('data.fits') as hdus:
    hdu1 = hdus[1].data
    raw = hdu1._get_raw_data()
See https://github.com/astropy/astropy/blob/645a6f96b86238ee28a7057a4a82adae14885414/astropy/io/fits/fitsrec.py#L1022

Related

How to display (and format) datetime data in Qtableview with pyside2

I am trying to display datetime values in a QTableView. I have found this working pyside2 example (scroll down) for string and float type data:
PySide + QTableView example
What would I need to change within the table model so that I can display datetime data? How could this data be formatted so it is displayed as, for example, '01.05.2019'?
I do NOT want to convert the datetime data to strings beforehand, since then the data cannot be sorted in a meaningful way when clicking on the table header...
Thanks a lot!
Just return the data as a QDateTime (or QDate or QTime). The QTableView should be fine with that.
If you want to format the date differently then it starts to get complicated: you'll need to convert it to a string using your own formatting. Then to get the sorting right, you'll need to return the original date data in some other role (Qt::UserRole) and set that to be the sort role, as explained in this answer (which also suggests an alternative approach using a delegate).
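A minimal sketch of the first suggestion (PySide2; the model class, the self._dates attribute, and the single-column layout are assumptions for illustration, not part of the original answer):
from PySide2.QtCore import Qt, QAbstractTableModel, QModelIndex, QDate, QDateTime, QTime

class DateTableModel(QAbstractTableModel):
    def __init__(self, dates, parent=None):
        super().__init__(parent)
        self._dates = dates  # assumed: a list of datetime.datetime objects

    def rowCount(self, parent=QModelIndex()):
        return len(self._dates)

    def columnCount(self, parent=QModelIndex()):
        return 1

    def data(self, index, role=Qt.DisplayRole):
        if not index.isValid():
            return None
        if role == Qt.DisplayRole:
            dt = self._dates[index.row()]
            # returning a QDateTime lets the view display and sort it natively
            return QDateTime(QDate(dt.year, dt.month, dt.day),
                             QTime(dt.hour, dt.minute, dt.second))
        return None
Sorting then compares the QDateTime values themselves rather than formatted strings, so chronological order is preserved when clicking the header.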

TypeError: cannot convert the series to <class 'float'> iterrows

I'm trying to get python to read a row from a csv, and then place the contents into DynamoDB, then do the same with each subsequent row, until the end. I've modified Amazon's example code so that it reads like this:
from __future__ import print_function # Python 2/3 compatibility
import boto3
import decimal
import pandas as pd
dynamodb = boto3.resource('dynamodb', region_name='us-west-2', aws_access_key_id='123456789', aws_secret_access_key='987654321')
table = dynamodb.Table('Loyalty_One')
data = pd.read_csv('testdb.csv')
data[['collector_key']] = data[['collector_key']].astype(float)
for i, rows in data.iterrows():
    collector_key = float(data['collector_key'])
    sales = float(data['sales'])
    print("Adding data:", collector_key, sales)
    table.put_item(
        Item={
            'collector_key': collector_key,
            'sales': sales,
        }
    )
When I do, I get the error: TypeError: cannot convert the series to <class 'float'>
You can see the file here, it's nothing out of the ordinary:
https://s3.amazonaws.com/vshideler-data/testdb.csv
It seems to me that line 16 (and 17) should be extracting the contents of each row in the column. I thought that it might be possible that the header is interfering but doing pd.read_csv with "header=None" just gets me a different error altogether.
(although it occurs to me that I can probably just use the original dataframe instead of a csv file)
I hope this is helpful.
If the item is a Series containing a single number, you can cast it with a conventional cast.
For example, you can cast like this:
import numpy as np

XX = AA[AA.SHOUT == "HAHA"]["TYPE"]         # get the TYPE column for the matching rows
XX = float(XX) if len(XX) == 1 else np.NaN  # cast only when exactly one value matched
print(XX)
In real production code you would have to handle more exceptions, but for this limited scenario it is the easiest way to convert the data.
Good luck.
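Applying that idea to the posted loop: the error most likely comes from calling float() on the whole data['collector_key'] column instead of the per-row value yielded by iterrows(). A hedged sketch of the corrected loop (DynamoDB also generally wants Decimal rather than float for numeric attributes, which is presumably why the original script imports decimal):
from decimal import Decimal

for i, row in data.iterrows():
    # use the per-row Series from iterrows(), not the full DataFrame column
    collector_key = Decimal(str(row['collector_key']))
    sales = Decimal(str(row['sales']))
    print("Adding data:", collector_key, sales)
    table.put_item(Item={'collector_key': collector_key, 'sales': sales})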

Point type in sqlalchemy?

I found this regarding Point type in Postgres: http://www.postgresql.org/docs/current/interactive/datatype-geometric.html
Is there the SQLAlchemy version of this?
I am storing values in this manner: (40.721959482, -73.878993913)
You can use geoalchemy2, which is an extension to sqlalchemy and can be used with flask-sqlalchemy too.
from sqlalchemy import Column
from geoalchemy2 import Geometry
# and import others

class Shop(db.Model):
    # other fields
    coordinates = Column(Geometry('POINT'))
You can extend UserDefinedType to achieve what you want.
Here's an example I found that gets pretty close to what you want by subclassing UserDefinedType.
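A minimal sketch of that approach (this is not the linked example; it assumes a plain PostgreSQL point column and (lat, lon) tuples on the Python side):
from sqlalchemy.types import UserDefinedType

class Point(UserDefinedType):
    cache_ok = True  # SQLAlchemy 1.4+; safe here because the type holds no state

    def get_col_spec(self, **kw):
        return "POINT"

    def bind_processor(self, dialect):
        def process(value):
            # value is assumed to be a (lat, lon) tuple
            if value is None:
                return None
            return "(%s,%s)" % (value[0], value[1])
        return process

    def result_processor(self, dialect, coltype):
        def process(value):
            # Postgres returns the point as a string like "(40.72,-73.87)"
            if value is None:
                return None
            x, y = value.strip("()").split(",")
            return float(x), float(y)
        return process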
Note that Mohammad Amin's answer is valid only if your point is intended to be a geographic point (latitude and longitude constraints). It doesn't apply if you want to represent an arbitrary point on a plane. Also, in that case you would need to install the PostGIS extension, which I encourage if you are working with geographic points, as it provides a lot of utilities and extra functions.
It turned out to be a slight modification of Mohammad's answer. Yes, I needed to add a point column via geoalchemy2/sqlalchemy:
from sqlalchemy import Column
from geoalchemy2 import Geometry
# and import others

class Shop(db.Model):
    # other fields
    coordinates = Column(Geometry('POINT'))
On the PostGIS/Postgres and pgloader side, since I'm loading from a CSV that formats my latitude and longitude points as "(40.721959482, -73.878993913)", I needed to run a vim macro over all of my entries (there are a lot) to force that column to match how points are written in PostGIS. So I turned the CSV column into point(40.721959482 -73.878993913), and then at table creation I declared the location column with the geometry(point) datatype: location geometry(point).

How to override/prevent sqlalchemy from ever using floating point type?

I've been using pandas to transform raw file data and import it into a database. Oftentimes we use large integers as primary keys. When using pandas' to_sql function without explicitly specifying column types, it will sometimes assign large integers the float type (rather than bigint).
As you can imagine, much hair was lost when we realized our selects and joins weren't working.
Of course, we can go through and via trial-error manually assign problem columns as bigint, but we'd rather outright disable float altogether and instead force bigint since we work with an extremely large amount of tables, and sometimes an extremely large amount of columns that we can't spend time individually fact-checking. We basically never want a float type in any table definition, ever.
Any way to override floating point type (either in pandas, sqlalchemy, or numpy) as bigint?
ie:
import pandas as pd
from sqlalchemy import create_engine
e = create_engine('mysql+pymysql://user:pass@host')

columns = ['foo', 'bar']
data = [
    [123456789, 'one'],
    [234567890, 'two'],
    [345678901, 'three'],
]

df = pd.DataFrame(data=data, columns=columns)
df.to_sql('table', e, flavor='mysql', schema='schema', if_exists='replace')
Unfortunately, this code does not reproduce the effect; it is committed as bigint. The problem happens when loading data from certain csv or xls files, and it happens when transferring from one MySQL database to another (latin1), which one would assume to be an isometric copy.
There's nothing to the code at all, it's just:
import pandas as pd
from sqlalchemy import create_engine
e = create_engine('mysql+pymysql://user:pass@host')
df = pd.read_sql('SELECT * FROM source_schema.source_table;', e)
df.to_sql('target_table', e, flavor='mysql', schema='target_schema')
Creating a testfile.csv:
thing1,thing2
123456789,foo
234567890,bar
345678901,baz
didn't reproduce the effect either. I know for a fact it happens with data from the NPPES Dissemination files; perhaps it has to do with the encoding? I have to convert the files from Windows-1252 to UTF-8 in order for MySQL to even accept them.
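One possible workaround, not from the original thread: pass an explicit dtype mapping to to_sql so that any column pandas inferred as float is created as BIGINT instead. The column detection below is a hedged sketch and assumes those float columns really hold integers with no NaNs:
import pandas as pd
from sqlalchemy import create_engine
from sqlalchemy.types import BigInteger

e = create_engine('mysql+pymysql://user:pass@host')
df = pd.read_sql('SELECT * FROM source_schema.source_table;', e)

# map every float-inferred column to BIGINT in the CREATE TABLE emitted by to_sql
float_cols = {col: BigInteger for col in df.columns if df[col].dtype.kind == 'f'}
df.to_sql('target_table', e, schema='target_schema', dtype=float_cols)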

FITS file change

I have some data given to me by my mentor. The data consists of thousands of .fits files. Some of the .fits files are older versions of the others, and the way the data tables are constructed is different. Here is what I mean:
Let's say I have two .fits files: FITS1.fits and FITS2.fits
$ python
>>> import pyfits
>>> a = pyfits.getdata('FITS1.fits')
>>> b = pyfits.getdata('FITS2.fits')
>>> a.names
['time', 'timeerr', 'sap_flux', 'sap_flux_err']
>>> b.names
['time', 'sap_flux', 'timeerr', 'sap_flux_err']
Does anyone know of a way I can switch around the columns in the data tables, so that FITS2.fits's format matches FITS1.fits?
Your best bet is to not use pyfits directly, but to use the newer, shinier Astropy Table interface. You can read in a FITS table like:
from astropy.table import Table
table = Table.read('FITS1.fits')
As demonstrated in the section on modifying tables, you can then reorder the columns like:
table = table[['time', 'timeerr', 'sap_flux', 'sap_flux_err']]
(technically this creates a new copy of the table, with the columns selected in the order you wanted them to be in; however IIRC this does not copy the underlying column arrays and should still be a fast operation).
It is also perfectly possible to do this with the legacy pyfits interface, but I wouldn't recommend it for most cases.
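If you then want to persist the reordered table (not covered in the original answer), a one-liner like the following should work; the output filename is just a placeholder:
table.write('FITS2_reordered.fits', overwrite=True)  # placeholder filename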
You shouldn't write code which depends on the order of keys in a dictionary - a dictionary is a hash table and the order in which keys are stored is essentially arbitrary.
If you need to match or compare the entries you should get a list of the keys and sort them.
It's probable that pyfits builds a dictionary in the order that the keys are stored in the FITS header, but that isn't necessarily true.
