One of our old pieces of legacy SQL code converts a numerical column using the HASHBYTES function with SHA2_256.
The entire process is moving to Python, as we are building some advanced usage on top of the legacy work. However, when we run the same SQL code through the connector, HASHBYTES('SHA2_256', column_name) is returning values with a lot of garbage.
Running the code in SQL results in this:
Column Encoded_Column
101286297 0x7AC82B2779116F40A8CEA0D85BE4AA02AF7F813B5383BAC60D5E71B7BDB9F705
Running the same SQL query from Python results in:
Column Encoded_Column
101286297
b"z\xc8+'y\x11o#\xa8\xce\xa0\xd8[\xe4\xaa\x02\xaf\x7f\x81;S\x83\xba\xc6\r^q\xb7\xbd\xb9\xf7\x05"
The code is:
SELECT Column, HASHBYTES('SHA2_256', CONVERT(VARBINARY(8), Column)) AS Encoded_Column FROM table
I have tried the usual garbage removal, but it isn't helping.
You are getting the right result, but it is displayed as raw bytes (this is why you see the b prefix in b"...").
Looking at the result from SQL, you have the same data displayed as hexadecimal.
So to transform the python result you can do:
x = b"z\xc8+'y\x11o#\xa8\xce\xa0\xd8[\xe4\xaa\x02\xaf\x7f\x81;S\x83\xba\xc6\r^q\xb7\xbd\xb9\xf7\x05"
x.hex().upper()
And the result will be:
'7AC82B2779116F40A8CEA0D85BE4AA02AF7F813B5383BAC60D5E71B7BDB9F705'
Which is what you had in SQL.
The 0x at the start of the SQL result, which is not present in the Python output, is just SQL Server's prefix for hexadecimal literals; it is not part of the hash value itself.
And finally, if you are working with pandas you can convert the whole column with:
df["Encoded_Column"] = df["Encoded_Column"].apply(lambda x: x.hex().upper())
# And if you want the '0x' at the start do:
df["Encoded_Column"] = "0x" + df["Encoded_Column"]
I have a db that contains a blob column (the original question included a screenshot of its binary representation, not reproduced here).
The value that I'm interested in is encoded as a little-endian unsigned long long (8 bytes) at the end of the blob. Reading this value works fine like this:
from struct import unpack

p = session.query(Properties).filter((Properties.object_id == 1817012) & (Properties.name.like("%OwnerUniqueID"))).one()
id = unpack("<Q", p.value[-8:])[0]  # last 8 bytes, little-endian uint64
id in the above example is 1657266.
Now what I would like to do is the reverse. I have the row object p, I have a number in decimal format (using the same 1657266 for testing purposes), and I want to write that number in little-endian format to those same 8 bytes.
I've been trying to do so via a SQL statement:
UPDATE properties
SET value = (SELECT substr(value, 1, length(value) - 8) || x'b249190000000000'
             FROM properties
             WHERE object_id = 1817012 AND name LIKE '%OwnerUniqueID%')
WHERE object_id = 1817012 AND name LIKE '%OwnerUniqueID%'
But when I do it like that, I can't read it anymore, at least not with SQLAlchemy. When I try the same code as above, I get the error message Could not decode to UTF-8 column 'properties_value' with text '☻', so it looks like it's now stored in a different format.
Interestingly, a normal select statement in DB Browser still works fine, and the blob is still displayed exactly as before.
Now ideally I'd like to be able to write just those 8 bytes using the SQLAlchemy ORM but I'd settle for a raw SQL statement if that's what it takes.
I managed to get it to work with SQLAlchemy by essentially reversing the process I used to read it. In hindsight, using + to concatenate and [:-8] to slice off the correct part seems pretty obvious.
from struct import pack

p = session.query(Properties).filter((Properties.object_id == 1817012) & (Properties.name.like("%OwnerUniqueID"))).one()
p.value = p.value[:-8] + pack("<Q", 1657266)  # replace the trailing 8 bytes
By turning on ECHO for SQLAlchemy I got the following raw SQL statement:
UPDATE properties SET value=? WHERE properties.object_id = ? AND properties.name = ?
(<memory at 0x000001B93A266A00>, 1817012, 'BP_ThrallComponent_C.OwnerUniqueID')
Which is not particularly helpful if you want to do the same thing manually, I suppose.
It's worth noting that the raw SQL statement in my question not only works as far as reading with DB Browser is concerned, but also with the game client that uses the db in question. It's only SQLAlchemy that seems to have trouble, apparently trying to decode the value as UTF-8.
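For anyone hitting the same UTF-8 decode error: keeping the byte-splicing entirely in Python, as above, sidesteps it. Here is a minimal, hedged round-trip check (with a stand-in blob, since the real row isn't shown) that the pack/unpack pair really does target the last 8 bytes:

from struct import pack, unpack

new_id = 1657266
blob = bytes(16)                           # stand-in for p.value
patched = blob[:-8] + pack("<Q", new_id)   # splice the new little-endian uint64 onto the tail
assert unpack("<Q", patched[-8:])[0] == new_id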
I am fetching data from SQL databases (both Oracle and MS SQL) in Python code using the pyodbc and cx_Oracle packages. Python automatically converts all SQL datetime fields to datetime.datetime. Is there any way I can capture the data as-is from SQL into a file? The same happens to null and integer columns as well.
1) Date: value in DB and expected: 12-AUG-19 12.00.01.000 -- Python output: 2019-08-12 00:00:01
2) Null becomes a NaN
3) Integer values 1 and 0 become True and False.
I tried to google the issue, and it seems to be a common one across packages like pyodbc, cx_Oracle, and pandas.read_sql.
I would like the data to appear exactly as it does in the database.
We are calling an Oracle/SQL Server stored proc, NOT a SQL query, to get this result, and we can't change the stored proc, so we cannot use CAST in the SQL query.
The pyodbc fetchall() output is the table as a list; we lose the formatting of the data as soon as it is captured in Python.
Could someone help with this issue?
I'm not sure about Oracle, but on the SQL Server side, you could change the command you use so that you capture the results of the stored proc in a temp table, and then you can CAST() the columns of the temp table.
So if you currently call a stored proc on SQL Server like this: EXEC {YourProcName}
Then you could change your command to something like this:
CREATE TABLE #temp
(
    col1 INT
    ,col2 DATETIME
    ,col3 VARCHAR(20)
);

INSERT INTO #temp
EXEC [sproc];

SELECT
    col1 = CAST(col1 AS VARCHAR(20))
    -- UPPER() so FORMAT()'s 'Aug' matches the expected 'AUG' casing
    ,col2 = UPPER(CAST(FORMAT(col2, 'dd-MMM-yy ') AS VARCHAR)) + REPLACE(CAST(CAST(col2 AS TIME(3)) AS VARCHAR), ':', '.')
    ,col3
FROM #temp;

DROP TABLE #temp;
You'll want to create your temp table using the same column names and datatypes that the proc outputs. Then you can CAST() numeric values to VARCHAR, and for dates/datetimes you can use FORMAT() to define your date string format. The example here should produce the format you want, 12-AUG-19 12.00.01.000. I couldn't find a single format string that gave the correct output, so I broke the date and time elements apart, formatted each the expected way, and concatenated the casted values.
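If the SQL command can't be changed at all, a hedged alternative is to normalize the values on the Python side after fetchall(). A minimal sketch (the names and the sample row are illustrative, not from the asker's proc):

import datetime

def render(value):
    # Render a fetched value the way it appears in the database.
    if value is None:
        return "NULL"                      # instead of NaN
    if isinstance(value, bool):
        return "1" if value else "0"       # undo the BIT -> bool coercion
    if isinstance(value, datetime.datetime):
        ms = value.microsecond // 1000
        return value.strftime("%d-%b-%y %I.%M.%S.").upper() + f"{ms:03d}"
    return str(value)

row = (1, datetime.datetime(2019, 8, 12, 0, 0, 1), None)
print([render(v) for v in row])            # ['1', '12-AUG-19 12.00.01.000', 'NULL']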
I am new to this and trying to learn Python. I wrote a select statement in Python where I used a parameter:
Select """cln.customer_uid = """[(num_cuid_number)])
TypeError: string indices must be integers
I agree with the others; this doesn't really look like Python by itself.
Even without seeing the rest of the code, I'll guess that num_cuid_number holds a string value, so you'll want to convert it to an integer for the select statement to work.
num_cuid_number is most likely a string in your code; the "string indices" the error refers to are the ones in the square brackets, since """cln.customer_uid = """[(num_cuid_number)] tries to index the string literal with it. So first check that variable to see what you actually received; it is probably a string when it should be an integer value.
Let me give you an example of Python code to execute (just for reference: I have used SQLAlchemy with Flask):
@app.route('/get_data/')
def get_data():
    base_sql = """
        SELECT cln.customer_uid = '%s' FROM cln
    """ % (num_cuid_number)
    data = db.session.execute(base_sql).fetchall()
Pretty sure you are trying to create a select statement with a "where" clause here. There are many ways to do this; for example, using raw SQL, the query should look similar to this:
query = "SELECT * FROM cln WHERE customer_uid = %s"
parameters = (num_cuid_number,)
Separating the parameters from the query protects against SQL injection. You can then take these two variables and execute them with your db engine, like:
results = db.execute(query, parameters)
This will work. However, especially in Python, it is more common to use a package like SQLAlchemy to make queries more "flexible" (in other words, without manually constructing an actual string as a query string). You can do the same thing using SQLAlchemy core functionality:
query = cln.select().where(cln.c.customer_uid == num_cuid_number)  # cln is a Table, so columns live on .c
results = db.execute(query)
Note: I simplified "db" in both examples; you'd actually use a cursor, session, engine, or similar to execute your queries, but that wasn't your question.
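For completeness, here is a fully self-contained version of the raw-SQL variant using SQLAlchemy's text() construct; the in-memory SQLite database and the schema here are stand-ins for whatever engine and tables you actually have:

from sqlalchemy import create_engine, text

engine = create_engine("sqlite:///:memory:")
num_cuid_number = 42

with engine.connect() as conn:
    conn.execute(text("CREATE TABLE cln (customer_uid INTEGER, name TEXT)"))
    conn.execute(text("INSERT INTO cln VALUES (:uid, :name)"), {"uid": 42, "name": "example"})
    rows = conn.execute(text("SELECT * FROM cln WHERE customer_uid = :cuid"), {"cuid": num_cuid_number}).fetchall()
    print(rows)  # [(42, 'example')]

Binding :cuid as a parameter keeps the query string fixed and lets the driver handle quoting.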
I use the simple query below to select from a table based on the date:
select * from tbl where date = '2019-10-01'
The simple query is part of a much larger query that extracts information from many tables on the same server. I don't have execute access on the server, so I can't install a stored procedure to make my life easier. Instead, I read the query into Python and try to replace certain values inside single-quoted strings, such as:
select * from tbl where date = '<InForceDate>'
I use a simple Python function (below) to replace the placeholder with another value like 2019-10-01, but str.replace() isn't replacing anything when I look at the output. However, I tried this with a placeholder that wasn't in quotes and it worked. I'm sure I'm missing something fundamental, but I haven't uncovered why it works without quotes and fails with quotes.
Python:
def generate_sql(sql_path, inforce_date):
    with open(sql_path, 'r') as sql_file:  # the original read pd_sql_path, which doesn't match the parameter name
        sql_string = sql_file.read()
    return sql_string.replace('<InForceDate>', inforce_date)
Can anyone point me in the right direction?
Nevermind folks -- problem solved, but I haven't quite figured out why. File encoding is my guess.
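The encoding guess is plausible: if the .sql file was saved with a byte-order mark (or as UTF-16), the literal '<InForceDate>' in the file no longer matches the search string character-for-character, and replace() silently does nothing. A hedged tweak that reads plain UTF-8 but also strips a UTF-8 BOM if one is present:

def generate_sql(sql_path, inforce_date):
    # utf-8-sig behaves like utf-8 but drops a leading BOM
    with open(sql_path, 'r', encoding='utf-8-sig') as sql_file:
        sql_string = sql_file.read()
    return sql_string.replace('<InForceDate>', inforce_date)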
I'm trying to use pandas read_sql to validate some fields in my app.
When I read my db using SQL Developer, I get these values:
603.29
1512.00
488.61
488.61
But reading the same SQL query using pandas, the decimal separator is ignored and the decimals are merged into the whole-number part. So I end up getting these values:
60329.0
1512.0
48861.0
48861.0
How can I fix it?
I've found a workaround for now: convert the column you want to string in the query, then after pandas has read it you can convert the string back to whatever type you want.
Even though this works, it doesn't feel right to do so.
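A runnable sketch of that workaround, with an in-memory SQLite database standing in for the real one (table and column names are made up):

import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (amount TEXT)")
conn.executemany("INSERT INTO payments VALUES (?)", [("603.29",), ("1512.00",), ("488.61",)])

df = pd.read_sql("SELECT amount FROM payments", conn)  # values arrive as strings
df["amount"] = pd.to_numeric(df["amount"])             # convert back to a numeric dtype
print(df)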
Could you specify which SQL database you are using? I just encountered a similar problem, which I overcame by defining the datatype more specifically in the query; here is the solution that worked for me. I guess you could use a similar approach and see if it works.
In my MySQL database I have four columns with very high precision.
When I tried to read them with the query below, they were truncated to 5 digits after the decimal separator:
query = """select
y_coef,
y_intercept,
x_coef,
x_intercept
from TABLE_NAME"""
df = pd.read_sql(query, connection)
However, when I specified that I want them with a precision of 15 digits after the decimal separator, as below, they were no longer truncated:
query = """select
cast(y_coef as decimal(15, 15)) as y_coef,
cast(y_intercept as decimal(15, 15)) as y_intercept,
cast(x_coef as decimal(15, 15)) as x_coef,
cast(x_intercept as decimal(15, 15)) as x_intercept
from TABLE_NAME"""
df = pd.read_sql(query, connection)
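One more knob that may matter in cases like this: pandas.read_sql has a coerce_float parameter (True by default) that converts decimal.Decimal values returned by the driver into floats. Keeping them as Decimal preserves whatever precision the driver delivered; query and connection are the same as above:

df = pd.read_sql(query, connection, coerce_float=False)  # keep decimal.Decimal values intact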