Python vs Go Hashing Differences


I have a Go program:
package main
import (
    "crypto/hmac"
    "crypto/sha1"
    "fmt"
)
func main() {
    val := []byte("nJ1m4Cc3")
    hasher := hmac.New(sha1.New, val)
    fmt.Printf("%x\n", hasher.Sum(nil))
    // f7c0aebfb7db2c15f1945a6b7b5286d173df894d
}
And a Python (2.7) program that is attempting to reproduce the Go code (using crypto/hmac)
import hashlib
val = u'nJ1m4Cc3'
hasher = hashlib.new("sha1", val)
print hasher.hexdigest()
# d67c1f445987c52bceb8d6475c30a8b0e9a3365d
Using the hmac module gives me a different result but still not the same as the Go code.
import hmac
val = 'nJ1m4Cc3'
h = hmac.new("sha1", val)
print h.hexdigest()
# d34435851209e463deeeb40cba7b75ef
Why do these print different values when they use the same hash on the same input?

You have to make sure that the input in both scenarios is equivalent and that the processing method in both scenarios is equivalent.
In both cases, the input should be the same binary blob. In your Python program you define a unicode object, and you do not take control of its binary representation. Replace the u prefix with a b, and you are fine (this is the explicit way to define a byte sequence in Python 2.7 and 3). This is not the actual problem, but it is better to be explicit here.
The problem is that you apply different methods in your Go and Python implementations.
Given that Python is the reference
In Go, there is no need to import "crypto/hmac" at all: in Python you just build a SHA1 hash of your data. In Go, the equivalent would be:
package main
import (
    "crypto/sha1"
    "fmt"
)
func main() {
    data := []byte("nJ1m4Cc3")
    fmt.Printf("%x", sha1.Sum(data))
}
Test and output:
go run hashit.go
d67c1f445987c52bceb8d6475c30a8b0e9a3365d
This reproduces what your first Python snippet creates.
Edit: I have simplified the Go code a bit, to not make Python look more elegant. Go is quite elegant here, too :-).
Given that Go is the reference
import hmac
import hashlib
data = b'nJ1m4Cc3'
h = hmac.new(key=data, digestmod=hashlib.sha1)
print h.hexdigest()
Test & output:
python hashit.py
f7c0aebfb7db2c15f1945a6b7b5286d173df894d
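This reproduces what your Go snippet creates. Note that the Go code never writes any data to the hasher, so val serves only as the HMAC key and the message is empty. I am, however, not sure about the cryptographic significance of an HMAC over an empty message.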
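To see side by side why the three results in the question disagree, here is a small Python 3 sketch. The digests it is meant to reproduce are the ones quoted in this thread. The third computation is a reconstruction of the question's hmac.new("sha1", val) call, based on the fact that in Python 2.7 the first positional argument of hmac.new is the key and the digest defaults to MD5 when none is given, which would explain the 32-character result.

```python
import hashlib
import hmac

data = b"nJ1m4Cc3"

# Plain SHA-1 of the data (what the first Python snippet computes).
plain_sha1 = hashlib.sha1(data).hexdigest()

# HMAC-SHA1 keyed with the data over an empty message
# (what the Go program computes, since nothing is written to the hasher).
hmac_sha1_empty = hmac.new(key=data, msg=b"", digestmod=hashlib.sha1).hexdigest()

# HMAC-MD5 with the literal key "sha1" and the data as the message
# (how Python 2.7 interprets hmac.new("sha1", val), MD5 being its default).
hmac_md5_swapped = hmac.new(key=b"sha1", msg=data, digestmod=hashlib.md5).hexdigest()

print(plain_sha1)
print(hmac_sha1_empty)
print(hmac_md5_swapped)
```

Three different algorithm and key combinations give three different digests, even though the same eight bytes appear in each call.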

Related

Get Python to read a return code from C# in a reliable fashion

I've written a large-ish program in Python, which I need to talk to a smaller C# script. (I realise that getting Python and C# to talk to each other is not an ideal state of affairs, but I'm forced to do this by a piece of hardware, which demands a C# script.) What I want to achieve in particular - the motivation behind this question - is that I want to know when a specific caught exception occurs in the C# script.
I've been trying to achieve the above by getting my Python program to look at the C# script's return code. The problem I've been having is that, if I tell C# to give a return code x, my OS will receive a return code y and Python will receive a return code z. While a given x always seems to correspond to a specific y and a specific z, I'm having difficulty deciphering the relationship between the three; they should be the same.
Here are the specifics of my setup:
My version of Python is Python 3.
My OS is Ubuntu 20.04.
I'm using Mono to compile and run my C# script.
And here's a minimal working example of the sort of thing I'm talking about:
This is a tiny C# script:
namespace ConsoleApplication1
{
    class Script
    {
        const int ramanujansNumber = 1729;
        bool Run()
        {
            return false;
        }
        static int Main(string[] args)
        {
            Script program = new Script();
            if(program.Run()) return 0;
            else return ramanujansNumber;
        }
    }
}
If I compile this using mcs Script.cs, run it using mono Script.exe and then run echo $?, it prints 193. If, on the other hand, I run this Python script:
import os
result = os.system("mono Script.exe")
print(result)
it prints 49408. What is the relationship between these three numbers: 1729, 193, 49408? Can I predict the return code that Python will receive if I know what the C# script will return?
Note: I've tried using Environment.Exit(code) in the C# script instead of having Main return an integer. I ran into exactly the same problem.

With os.system, the documentation explicitly states that the result is in the same format as for os.wait, i.e.:
a 16-bit number, whose low byte is the signal number that killed the process, and whose high byte is the exit status (if the signal number is zero); the high bit of the low byte is set if a core file was produced.
So in your case it looks like:
>>> 193<<8
49408
You might want to switch that part to subprocess, e.g. as in the answer to this question.
UPD: As for the mono return code, it looks like only its low byte is used (i.e. the exit status is expected to be between 0 and 255). At least:
>>> 1729 & 255
193
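The two relationships above can be combined in a short Python sketch. The helper names truncate_exit_code and decode_wait_status are made up for illustration; the second one simply follows the 16-bit layout quoted from the documentation and assumes a POSIX system.

```python
def truncate_exit_code(code):
    """Exit status a POSIX parent sees: only the low 8 bits survive."""
    return code & 0xFF


def decode_wait_status(status):
    """Split an os.system()/os.wait() style status into its parts."""
    signal_number = status & 0x7F      # low 7 bits: killing signal, 0 if none
    core_dumped = bool(status & 0x80)  # high bit of the low byte
    exit_code = (status >> 8) & 0xFF   # high byte: exit status
    return exit_code, signal_number, core_dumped


shell_code = truncate_exit_code(1729)  # what `echo $?` shows
python_status = shell_code << 8        # what os.system() returns
print(shell_code, python_status, decode_wait_status(python_status))
```

With subprocess.run(["mono", "Script.exe"]).returncode the decoding step is not needed, since returncode already holds the exit status rather than the packed 16-bit value.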

Translate from get_ipython().magic(u'R ...') to simple rpy2 commands

I have a Python script (converted from an IPython Notebook) with
def train_mod_inR():
    get_ipython().magic(u'R -i myinput')
    get_ipython().magic(u'R -o res res <- Rfunction(myinput)')
    return res
I am trying to just run the script using Python without any of the magic, so I am using rpy2. Does anyone know an easy way to translate the function above so it just works with standard rpy2? E.g., is it:
def train_mod_inR():
    rpy2.robjects('-i myinput')
    rpy2.robjects('-o res res <- Rfunction(myinput)')
    return res
In fact, does anyone have a good workaround for this in general? I have a feeling this will require a bit more than what I've shown. Is there any command in rpy2 directly that will allow us to interpret R the way the magic command does?

You were pretty close.
An exact translation of what is happening in the magic would be
(NOTE: the object "converter" is defined in the next code block to break down what is happening; just assume that it is an rpy2 converter for now):
import rpy2.robjects
from rpy2.robjects.conversion import localconverter
# Take the Python object known as "myinput", pass it to R
# (going through the rpy2 conversion layer), and make it known to R
# as "myinput".
with localconverter(converter) as cv:
    rpy2.robjects.globalenv['myinput'] = myinput
# Run the R function known (to R) as "Rfunction", using the object known
# to R as "myinput" as the argument. Make the resulting object known to
# R as "res".
rpy2.robjects.r('res <- Rfunction(myinput)')
# Take the R object "res", pass it to Python (going through the rpy2
# conversion layer), and make it known to Python as "res".
with localconverter(converter) as cv:
    res = rpy2.robjects.globalenv['res']
The conversion can be customized in the R magic (https://rpy2.github.io/doc/v3.1.x/html/interactive.html#module-rpy2.ipython.rmagic), but assuming that you are using the current default it would be (testing whether numpy or pandas are installed is left as an exercise for the reader):
from rpy2.robjects.conversion import Converter
converter = Converter('ipython conversion')
if has_numpy:
    from rpy2.robjects import numpy2ri
    converter += numpy2ri.converter
if has_pandas:
    from rpy2.robjects import pandas2ri
    converter += pandas2ri.converter
Note that conversion can be an expensive step (for example, there is no C-level mapping between R and Python for arrays of strings so they have to be copied over a loop), and depending on how that R result is used elsewhere in the code you may want to go for a lighter conversion (or no conversion).
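The has_numpy and has_pandas flags left as an exercise could be set, for instance, with importlib from the standard library. This is only one way to do it, and the helper name is_installed is an assumption for illustration, not something rpy2 provides.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


has_numpy = is_installed("numpy")
has_pandas = is_installed("pandas")
print(has_numpy, has_pandas)
```

Checking with find_spec avoids paying the import cost of numpy or pandas when the converter ends up not being used.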
Additional comment
All of the above applies when one wants to replicate exactly what happens in an "R magic" cell in Jupyter. The core part is the execution of the cell, that is, the evaluation of the string:
rpy2.robjects.r('res <- Rfunction(myinput)')
The evaluation of that string can be moved to Python, progressively or all at once. In this case only the latter is possible because there is only one function call. The function known to R as "Rfunction" can be mapped to a Python object that is callable:
rfunction = rpy2.robjects.r('Rfunction')
If you are only interested in having res in Python, the creation of the symbols myinput and res in R's globalenv might no longer be necessary, and the whole thing can be shortened to:
with localconverter(converter) as cv:
    res = rfunction(myinput)

Calling C++ function from Inside Python function

In R we can use Rcpp to call a C++ function such as the one below:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
SEXP critcpp(SEXP a, SEXP b){
    NumericMatrix X(a);
    NumericVector crit(b);
    int p = X.ncol();
    NumericMatrix critstep(p,p);
    NumericMatrix deltamin(p,p);
    List lst(2);
    for (int i = 0; i < (p-1); i++){
        for (int j = i+1; j < p; j++){
            // ... some calculations ...
        }
    }
    lst[0] = critstep;
    lst[1] = deltamin;
    return lst;
}
I want to do the same thing in python.
I have gone through Boost, SWIG etc., but it all seems complicated to my newbie Python eyes.
Can the Python wizards here kindly point me in the right direction?
I need to call this C++ function from inside a Python function.

Since I think the only real answer is to spend some time rewriting the function you posted, or to write some sort of wrapper for it (absolutely possible but quite time consuming), I'm answering with a completely different approach...
Without going through any sort of compiled conversion, a much faster way (in terms of programming time, not efficiency) may be to call the R interpreter directly, with the module containing the function you posted, from within Python through the rpy2 module, as described here. It requires the pandas module to handle the data frames from R.
The modules to use (in Python) are:
import numpy as np # for handling numerical arrays
import scipy as sp # a good utility
import pandas as pd # for data frames
from rpy2.robjects.packages import importr # for importing your module
import rpy2.robjects as ro # for calling R interpreter from within python
import pandas.rpy.common as com # for storing R data frames in pandas data frames (pandas.rpy was removed in later pandas releases; rpy2's pandas2ri covers this there)
In your code you should import your module by calling importr
importr('your-module-with-your-cpp-function')
and you can send directly commands to R by issuing:
ro.r('x = your.function( blah blah )')
x_rpy = ro.r('x')
type(x_rpy)
# => rpy2.robjects.your-object-type
You can store your data in a data frame with:
py_df = com.load_data('variable.name')
and push a data frame back through:
r_df = com.convert_to_r_dataframe(py_df)
ro.globalenv['df'] = r_df
This is for sure a workaround for your question, but it may be considered as a reasonable solution for certain applications, even if I do not suggest it for "production".

How do you import a Python library within an R package using rPython?

The basic question is this: Let's say I was writing R functions which called python via rPython, and I want to integrate this into a package. That's simple---it's irrelevant that the R function wraps around Python, and you proceed as usual. e.g.
# trivial example
# library(rPython)
add <- function(x, y) {
    python.assign("x", x)
    python.assign("y", y)
    python.exec("result = x+y")
    result <- python.get("result")
    return(result)
}
But what if the Python code inside the R functions requires users to import Python libraries first? e.g.
# python code, not R
import numpy as np
print(np.sin(np.deg2rad(90)))
# R function that calls Python via rPython
# *this function will not run without first executing `import numpy as np`
print_sin <- function(degree){
    python.assign("degree", degree)
    python.exec('result = np.sin(np.deg2rad(degree))')
    result <- python.get('result')
    return(result)
}
If you run this without importing the library numpy, you will get an error.
How do you import a Python library in an R package? How do you comment it with roxygen2?
It appears the R standard is this:
# R function that calls Python via rPython
# *this version executes `import numpy as np` itself on every call
print_sin <- function(degree){
    python.assign("degree", degree)
    python.exec('import numpy as np')
    python.exec('result = np.sin(np.deg2rad(degree))')
    result <- python.get('result')
    return(result)
}
Each time you run an R function, you will import an entire Python library.

As @Spacedman and @DirkEddelbuettel suggest, you could add a .onLoad/.onAttach function to your package that calls python.exec to import the modules that will typically always be required by users of your package.
You could also test whether the module has already been imported before importing it, but (a) that gets you into a bit of a regress problem, because you need to import sys in order to perform the test, and (b) the answers to that question suggest that, at least in terms of performance, it shouldn't matter, e.g.
If you want to optimize by not importing things twice, save yourself the hassle because Python already takes care of this.
(although admittedly there is some discussion elsewhere on that page about possible scenarios where there could be a performance cost).
But maybe your concern is stylistic rather than performance-oriented ...
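The quoted claim that Python already takes care of repeated imports can be checked from Python itself. A small sketch, using a standard-library module as a stand-in for numpy (the helper name import_and_report is made up for illustration):

```python
import importlib
import sys


def import_and_report(name):
    """Import a module and report whether it was already cached in sys.modules."""
    was_cached = name in sys.modules
    module = importlib.import_module(name)
    return module, was_cached


first, _ = import_and_report("json")
second, cached_on_second_call = import_and_report("json")

# The second import is a lookup in sys.modules that returns the same object.
print(first is second, cached_on_second_call)
```

So calling python.exec('import numpy as np') inside every R function should cost little after the first call within the same Python session, although the .onLoad approach keeps the function bodies cleaner.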

Call a Python 2.7 module in Python 3.2

I have a GUI program that I have developed in Python 3.2 to extract various geospatial data products. I need to call a module I have developed in Python 2.7.
I am looking for a way to be able to call the Python 2.7 code using a Python 2.7 interpreter inside of Python 3.2 program. I cannot port the 2.7 to Python 3.2 as it uses a Python version installed with ESRI ArcMap and relies on the arcpy module which is not available for Python 3. My only idea right now would be to use subprocess to call the module as a batch process however this is a bit messy and I would prefer the two programs had some relationship.
Thank you in advance.

You could spawn the python 2.7 process as a server handling RPC requests from your GUI running on 3.2. That will work either over the network, or local pipes, or shared memory, or your system's message bus, or many other ways. You just need to translate your library's API into some kind of serialised messages.
Let's say your library has a function: (super simplified example)
def add(a, b):
    return a+b
You'd wrap this in some server, let's say a flask app, which does:
from flask import Flask, jsonify, request
import your_lib
app = Flask(__name__)
@app.route("/add", methods=["POST"])
def handle_add():
    data = request.get_json()
    ret = your_lib.add(data['a'], data['b'])
    return jsonify(ret)
and on the client side, send and unpack the values using something like requests.
You could even make it fairly transparent by implementing a translator module with methods named the same as the library itself and doing import your_http_wrapper as your_library_name.
The trick now is to make sure all your parameters can be serialised and that you can realistically send all the arguments/return values in a reasonable time on each call. Also, you lose the ability to change the contents of the variables you pass to the wrapper, because the server will modify only the local copy (unless you implement serialising all those modifications as well)
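The client side of the "/add" endpoint could look roughly like the following Python 3 sketch. It uses urllib from the standard library instead of requests to stay dependency-free; the server address in BASE_URL and the helper name build_add_request are assumptions for illustration.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # assumed address of the Python 2.7 server


def build_add_request(a, b, base_url=BASE_URL):
    """Build the POST request carrying the arguments as JSON."""
    payload = json.dumps({"a": a, "b": b}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/add",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def add(a, b):
    """Stand-in for your_lib.add that forwards the call over HTTP."""
    with urllib.request.urlopen(build_add_request(a, b)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Saving this as a module and importing it under the library's name gives the transparent wrapper described above; add(1, 2) only works while the 2.7 server process is running.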

Try using subprocess.check_output(['C:\\python27\\python.exe', 'yourModule.py']).
If you want to call a specific function within the 2.7 file, you can pass additional command-line arguments. The call would look like:
subprocess.check_output(['C:\\python27\\python.exe', 'yourModule.py', 'funcName'])
And in the 2.7 file you can add:
import sys
if __name__=='__main__':
    if 'funcName' in sys.argv:
        funcName()
    else:
        pass  # ... execute normally
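To get results back from the other interpreter, one option is to have the child print JSON and parse it in the caller. The sketch below is an assumption-laden illustration: it uses sys.executable as a stand-in for the Python 2.7 path, and an inline script instead of yourModule.py, so that it runs anywhere.

```python
import json
import subprocess
import sys


def call_in_other_interpreter(interpreter, code, *args):
    """Run `code` in another interpreter and parse its stdout as JSON."""
    output = subprocess.check_output([interpreter, "-c", code] + list(args))
    return json.loads(output.decode("utf-8"))


# Child code written so that it is valid in both Python 2.7 and Python 3.
CHILD_CODE = (
    "import json, sys; "
    "print(json.dumps({'sum': int(sys.argv[1]) + int(sys.argv[2])}))"
)

# Replace sys.executable with e.g. 'C:\\python27\\python.exe' in real use.
result = call_in_other_interpreter(sys.executable, CHILD_CODE, "2", "3")
print(result)
```

Only values that survive a round trip through JSON can be exchanged this way, which is the same serialisation constraint as with the RPC approach.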

Somewhat late but in case someone stumbles over this thread:
I've written a module that transparently runs parts of a program in other python interpreters. Basically it provides decorators and base classes that replace functions and objects with proxies that interface with other interpreters.
It's intended for compatibility, e.g. running python2 only code in python3, or accessing C modules from pypy.
In the following example, the loop runs 4x as fast under PyPy as it would with plain Python.
#!/usr/local/bin/python
from cpy2py import TwinMaster, twinfunction
import sys
import time
import math
# loops in PyPy
@twinfunction('pypy')
def prime_sieve(max_val):
    start_time = time.time()
    primes = [1] * 2 + [0] * (max_val - 1)
    for value, factors in enumerate(primes):
        if factors == 0:
            for multiple in xrange(value*value, max_val + 1, value):
                primes[multiple] += 1
    return {'xy': [
        [primes[idx] == 0 for idx in range(minidx, minidx + int(math.sqrt(max_val)))]
        for minidx in range(0, max_val, int(math.sqrt(max_val)))
    ], 'info': '%s in %.1fs' % (sys.executable, time.time() - start_time)}
# matplotlib in CPython
@twinfunction('python')
def draw(xy, info='<None>'):
    from matplotlib import pyplot
    pyplot.copper()
    pyplot.matshow(xy)
    pyplot.xlabel(info, color="red")
    pyplot.show()
if __name__ == '__main__':
    twins = [TwinMaster('python'), TwinMaster('pypy')]
    for twin in twins:
        twin.start()
    data = prime_sieve(int(1E6))
    draw(**data)
