Why is writing to I2C so much slower in Python compared to C? - python

I am trying to find out why the same code runs 25 times slower from Python than from C when I write to I2C, even though I use CDLL. Below I describe step by step what I am doing.
The version of Raspberry PI: Raspberry PI 3 Model B
OS: Raspbian Buster Lite Version:July 2019
GCC version: gcc (Raspbian 8.3.0-6+rpi1) 8.3.0
Python version: Python 3.7.3
The device I am talking to over I2C is an MCP23017. All I do is write 0 and 1 to pin B0. Here is my code written in C:
// test1.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/i2c-dev.h>
#include <time.h>
int init() {
    int fd = open("/dev/i2c-1", O_RDWR);
    ioctl(fd, I2C_SLAVE, 0x20);
    return fd;
}

void deinit(int fd) {
    close(fd);
}

void makewrite(int fd, int v) {
    char buffer[2] = { 0x13, 0x00 };
    buffer[1] = v;
    write(fd, buffer, 2);
}

void mytest() {
    clock_t tb, te;
    int n = 1000;
    int fd = init();
    tb = clock();
    int v = 1;
    for (int i = 0; i < n; i++) {
        makewrite(fd, v);
        v = 1 - v;
    }
    te = clock();
    printf("Time: %.3lf ms\n", (double)(te - tb) / n / CLOCKS_PER_SEC * 1e3);
    deinit(fd);
}

int main() {
    mytest();
    return 0;
}
I compile and run it with the command:
gcc test1.c -o test1 && ./test1
It gives me the result:
pi@raspberrypi:~/dev/i2c_example $ gcc test1.c -o test1 && ./test1
Time: 0.020 ms
I may conclude that writing to the pin takes 0.02 milliseconds.
After that I create a .so file so that I can access these functions from my Python script:
gcc -c -fPIC test1.c -o test1.o && gcc test1.o -shared -o test1.so
And my Python script to test:
# test1.py
import ctypes
from time import time
test1so = ctypes.CDLL("/home/pi/dev/i2c_example/test1.so")
test1so.mytest()
n = 1000
fd = test1so.init()
tb = time()
v = 1
for _ in range(n):
    test1so.makewrite(fd, v)
    v = 1 - v
te = time()
print("Time: {:.3f} ms".format((te - tb) / n * 1e3))
test1so.deinit(fd)
This gives me the result:
pi@raspberrypi:~/dev/i2c_example $ python test1.py
Time: 0.021 ms
Time: 0.516 ms
I cannot understand why the call of makewrite is 25 times slower from Python, even though I am actually calling the same C code. I also found that if I comment out write(fd, buffer, 2); in test1.c, or change fd to 1, the times given by the C and Python programs are comparable; there is no such huge difference.
// in test1.c
write(fd, buffer, 2); -> write(1, buffer, 2);
Running the C program:
pi@raspberrypi:~/dev/i2c_example $ gcc test1.c -o test1 && ./test1
...Time: 0.012 ms
Running the Python program:
pi@raspberrypi:~/dev/i2c_example $ python3 test1.py
...Time: 0.009 ms
...Time: 0.021 ms
This confused me a lot. Can anybody tell me why this happens, and how I can improve the performance of I2C access from Python via the C shared library?
Summary:
Descriptor: 1 (stdout)
Execution time of makewrite in pure C: 0.009 ms
Execution time of makewrite called from Python via the shared library: 0.021 ms
This result is expected. The difference is not large, and can be explained by the Python loop and the per-call overhead being less efficient than the plain C loop.
Descriptor: I2C
Execution time of makewrite in pure C: 0.021 ms
Execution time of makewrite called from Python via the shared library: 0.516 ms
After switching the file descriptor to I2C, the execution time in pure C increased by around 0.012 ms, so I would expect the execution time when calling from Python to be 0.021 ms + 0.012 ms = 0.033 ms, because all my changes are inside makewrite, so Python should not be affected by this internal detail (it is hidden in the .so file). But I get 0.516 ms instead of 0.033 ms, which confuses me.
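One way to narrow this down is to time a trivial foreign call in isolation, to see how much of the per-iteration cost is plain ctypes dispatch rather than the I2C transaction itself. A sketch (assumes a Linux-like system where ctypes.CDLL(None) exposes libc; getpid stands in for any cheap C function):

```python
import ctypes
from time import perf_counter

# getpid is a near-free C call: timing it isolates the cost of ctypes dispatch.
libc = ctypes.CDLL(None)  # on Linux, exposes the libc symbols of the process

N = 100000
tb = perf_counter()
for _ in range(N):
    libc.getpid()
te = perf_counter()

per_call_ms = (te - tb) / N * 1e3
print("ctypes dispatch: {:.4f} ms per call".format(per_call_ms))
```

If the measured dispatch cost is only a few microseconds, the extra time is being spent in or around the write itself rather than in the foreign-call machinery, which points at something environmental (scheduling, CPU frequency scaling) rather than at ctypes.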

Related

For loop in C breaks randomly

I've been struggling with this loop in C for a while now. I'm trying to create a string array through a for loop (which I'm not sure I'm doing correctly. I hope I am). Every time I enter a string with a space in it, the for loop breaks and skips all iterations. For example, if I write S 1 in the command line, it would break.
This is the code:
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <string.h>
int main(){
    int players;
    int jerseys;
    int count = 0;
    int i;
    scanf("%d", &jerseys);
    scanf("%d", &players);
    char size[jerseys], p[players][100];
    for(jerseys; jerseys > 0; jerseys--){
        scanf(" %c", &size[count]);
        count++;
    }
    getchar();
    count = 0;
    for(players; players>0; players--){
        /*scanf(" %s", p[0] ); */ /*you cant assign arrays in C.*/
        getchar();
        fgets(p[count], 100, stdin);
        printf("%s", p[count]);
        printf("%s", p[count][2]); /* LINE 29 */
        printf("Hello\n");
        count ++;
    }
    return 0;
}
Moreover, on line 29, if I change the index from 2 to 1, the loop instantly breaks, no matter what I put.
I have a python code for what I essentially want from C:
given = []
jerseys = int(input())
if jerseys == 0:
    print(0)
players = int(input())
j = []
requests = 0
for _ in range(jerseys):
    size = input()
    j.append(size)
for _ in range(players):
    p = input().split()
I've looked at many places, and I think the problem is with the array, not the new lines, but I have no clue.
Edit:
This is something that would look like what I want to input(and what I usually try):
3
3
S
M
L
S 1
S 3
L 2
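For reference, the Python sketch above can be completed so that it parses this kind of input end to end (a sketch of the intended behavior only, not of the C fix; the helper name parse is made up here):

```python
def parse(lines):
    # Read the two counts first, then one size per line, then one request
    # per line, mirroring the sample input above.
    it = iter(lines)
    jerseys = int(next(it))
    players = int(next(it))
    sizes = [next(it).strip() for _ in range(jerseys)]
    requests = [next(it).split() for _ in range(players)]
    return sizes, requests

sample = ["3", "3", "S", "M", "L", "S 1", "S 3", "L 2"]
sizes, requests = parse(sample)
print(sizes)     # ['S', 'M', 'L']
print(requests)  # [['S', '1'], ['S', '3'], ['L', '2']]
```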
If the input characters do not match the conversion specifiers, or are of the wrong type for formatted input, scanf terminates, leaving the offending character as the next character to be read.
If you type 1` on the command line, then jerseys is set to 1, but players ends up effectively random because the ` does not match the %d format. So in your program, your players variable may be a huge int.
So when you use scanf, you'd better to check the return value like
if (scanf("%d", &players) != 1) {
/* error handle */
}
When I ran the code, a segmentation fault was raised.
the posted code does not cleanly compile!
Here is the output from the gcc compiler:
gcc -ggdb3 -Wall -Wextra -Wconversion -pedantic -std=gnu11 -c "untitled.c" -o "untitled.o"
untitled.c: In function ‘main’:
untitled.c:17:5: warning: statement with no effect [-Wunused-value]
17 | for(jerseys; jerseys > 0; jerseys--){
| ^~~
untitled.c:24:5: warning: statement with no effect [-Wunused-value]
24 | for(players; players>0; players--){
| ^~~
untitled.c:29:18: warning: format ‘%s’ expects argument of type ‘char *’, but argument 2 has type ‘int’ [-Wformat=]
29 | printf("%s", p[count][2]); /* LINE 29 */
| ~^ ~~~~~~~~~~~
| | |
| char * int
| %d
untitled.c:10:9: warning: unused variable ‘i’ [-Wunused-variable]
10 | int i;
| ^
Compilation finished successfully.
Please correct the code AND check the returned status from the C library I/O functions
Regarding:
Compilation finished successfully.
since the compiler output several warnings, this statement only means the compiler made some (not necessarily correct) guesses as to what you meant.

Arduino-Python Serial Communication HC-SR04

I'm trying to measure distance using an HC-SR04 and then write it to Python's console. The first output Python shows is correct, but the second and later outputs look like Python read the distance value in two parts. I think I am having some timing issue, but I tried changing the delays on both sides and that didn't work.
When I run the code output looks like this:
C:\Users\mobyr\PycharmProjects\1\venv\Scripts\python.exe C:/Users/mobyr/Desktop/mesafeolc.py
distance 2.18 m
distance
2. m
distance 18
m
Process finished with exit code 0
I get the correct values on the Arduino Serial Monitor:
2.18
2.19
2.18
2.17
2.17
2.17
2.18
2.20
2.17
Python code:
import serial
import time
arduino = serial.Serial('COM3', 9600)
def Measure():
    distance = arduino.read(4)
    time.sleep(1)
    print("distance " + distance + " m")
    return float(distance)

while True:
    output = Measure()
    if output > 5:
        break
Arduino code:
const int echo_pin = 9;
const int trig_pin = 10;

void setup() {
    Serial.begin(9600);
    pinMode(echo_pin, INPUT);
    pinMode(trig_pin, OUTPUT);
}

void loop() {
    double period, distance;
    digitalWrite(trig_pin, LOW);
    delayMicroseconds(2);
    digitalWrite(trig_pin, HIGH);
    delayMicroseconds(10);
    digitalWrite(trig_pin, LOW);
    period = pulseIn(echo_pin, HIGH);
    distance = (period / 2) / 29.1;
    distance = distance / 100; // to convert cm to m
    delay(500);
    Serial.println(distance);
}
The line distance = arduino.read(4) reads exactly 4 bytes, but each line the Arduino sends is longer than that, because Serial.println appends a line break after the value.
I would recommend using distance = arduino.readline() instead, then you can handle values of any size.
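Concretely, the readline-based approach might look like this (a sketch; the port name "COM3" and 9600 baud come from the question, and the helper names are made up here):

```python
def parse_line(raw):
    # e.g. parse_line(b"2.18\r\n") == 2.18; strip() drops the "\r\n"
    # that Serial.println appends after the number.
    return float(raw.decode("ascii").strip())

def read_distances(port):
    # readline() blocks until a whole line (ending in "\n") has arrived,
    # so a value can no longer be split across two reads like "2." and "18".
    while True:
        distance = parse_line(port.readline())
        print("distance {:.2f} m".format(distance))
        if distance > 5:
            break

# Usage (hardware required); pyserial assumed installed:
#   import serial
#   read_distances(serial.Serial("COM3", 9600, timeout=2))
print(parse_line(b"2.18\r\n"))  # 2.18
```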

Eigen + MKL or OpenBLAS slower than Numpy/Scipy + OpenBLAS

I'm starting with C++ at the moment, want to work with matrices, and want to speed things up in general. I worked with Python + NumPy + OpenBLAS before.
I thought C++ + Eigen + MKL might be faster, or at least not slower.
My c++ code:
#define EIGEN_USE_MKL_ALL
#include <iostream>
#include <Eigen/Dense>
#include <Eigen/LU>
#include <chrono>
using namespace std;
using namespace Eigen;
int main()
{
    int n = Eigen::nbThreads( );
    cout << "#Threads: " << n << endl;
    uint16_t size = 4000;
    MatrixXd a = MatrixXd::Random(size,size);
    clock_t start = clock ();
    PartialPivLU<MatrixXd> lu = PartialPivLU<MatrixXd>(a);
    float timeElapsed = double( clock() - start ) / CLOCKS_PER_SEC;
    cout << "Elapsed time is " << timeElapsed << " seconds." << endl ;
}
My Python code:
import numpy as np
from time import time
from scipy import linalg as la
size = 4000
A = np.random.random((size, size))
t = time()
LU, piv = la.lu_factor(A)
print(time()-t)
My timings:
C++ 2.4s
Python 1.2s
Why is C++ slower than Python?
I am compiling the C++ code using:
g++ main.cpp -o main -lopenblas -O3 -fopenmp -DMKL_LP64 -I/usr/local/include/mkl/include
MKL is definitely working: if I disable it, the running time is around 13 s.
I also tried C++ + OpenBLAS, which gives me around 2.4 s as well.
Any ideas why C++ and Eigen are slower than numpy/scipy?
The timing is just wrong. That's a typical symptom of wall clock time vs. CPU time. When I use the system_clock from the <chrono> header it “magically” becomes faster.
#define EIGEN_USE_MKL_ALL
#include <iostream>
#include <Eigen/Dense>
#include <Eigen/LU>
#include <chrono>
int main()
{
    int const n = Eigen::nbThreads( );
    std::cout << "#Threads: " << n << std::endl;
    int const size = 4000;
    Eigen::MatrixXd a = Eigen::MatrixXd::Random(size,size);
    auto start = std::chrono::system_clock::now();
    Eigen::PartialPivLU<Eigen::MatrixXd> lu(a);
    auto stop = std::chrono::system_clock::now();
    std::cout << "Elapsed time is "
              << std::chrono::duration<double>{stop - start}.count()
              << " seconds." << std::endl;
}
I compile with
icc -O3 -mkl -std=c++11 -DNDEBUG -I/usr/include/eigen3/ test.cpp
and get the output
#Threads: 1
Elapsed time is 0.295782 seconds.
Your Python version reports 0.399146080017 on my machine.
Alternatively, to obtain comparable timing you could use time.clock() (CPU time) in Python instead of time.time() (wall clock time).
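A note for current Python: time.clock() was removed in Python 3.8. The wall-clock versus CPU-time distinction is now spelled time.perf_counter() versus time.process_time(); a small sketch of how the two diverge:

```python
import time

# Sleeping consumes wall-clock time but almost no CPU time: the same kind of
# mismatch as clock() vs. chrono::system_clock above, in the other direction.
w0, c0 = time.perf_counter(), time.process_time()
time.sleep(0.2)
w1, c1 = time.perf_counter(), time.process_time()

wall = w1 - w0
cpu = c1 - c0
print("wall {:.3f} s, cpu {:.3f} s".format(wall, cpu))
```

The converse also holds: with several MKL threads busy, process_time() (like C's clock()) sums CPU seconds across threads and can report more than perf_counter(), which is why clock()-based timings of multithreaded code look inflated.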
This is not a fair comparison. The Python routine is operating on float precision while the C++ code needs to crunch doubles. This exactly doubles the computation time.
>>> type(np.random.random_sample())
<type 'float'>
You should compare with MatrixXf instead of MatrixXd and your MKL code should be equally fast.

Use numpy with Cython

I want to create a .so file from Python and call the .so file from C.
To do this I used Cython to convert the .pyx to a .so:
## print_me.pyx
cimport numpy as cnp
import numpy as np
cimport cython
cpdef public char* print_me(f):
    # I know this numpy line does nothing
    cdef cnp.ndarray[cnp.complex128_t, ndim=3] a = np.zeros((3,3,3), dtype=np.complex128)
    return f
Then I used setup.py to actually convert .pyx to .so
## setup.py
from distutils.core import setup
from Cython.Build import cythonize
import numpy as np
setup(
    ext_modules=cythonize("print_me.pyx"),
    include_dirs=[np.get_include()]
)
By running the following command line, I was able to create .so file
python setup.py build_ext --inplace
When I tried to load the .so file using the following C code, I got a segmentation fault.
/* toloadso.c */
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
#include <time.h>
#include <python2.7/Python.h>
int main(void)
{
    // define function
    void *handle;
    char* (*print_me)(PyObject*);
    char *error;
    PyObject* filename = PyString_FromString("hello");

    // load so file
    handle = dlopen("./print_me.so", RTLD_LAZY);
    if (!handle) {
        fprintf(stderr, "%s\n", dlerror());
        exit(EXIT_FAILURE);
    }
    dlerror();

    // get function handler from so file
    print_me = (char* (*)(PyObject*))dlsym(handle, "print_me");

    // check if handler got error
    error = dlerror();
    if (error != NULL) {
        fprintf(stderr, "%s\n", error);
        exit(EXIT_FAILURE);
    }

    // execute loaded function
    printf("%s\n", (char*)(*print_me)(filename));
    dlclose(handle);
    exit(EXIT_SUCCESS);
}
I compiled this .c file with following command:
gcc -fPIC -I/usr/include/ -o toloadso toloadso.c -lpython2.7 -ldl
(It compiled without error or warning)
When I tried to run this code, I got a segmentation fault:
[root@localhost ~]# ./toloadso
Segmentation fault
If I comment out the following line in print_me.pyx
cdef cnp.ndarray[cnp.complex128_t, ndim=3] a = np.zeros((3,3,3), dtype=np.complex128)
My C code runs without error, but once I uncomment this line, it does not work.
I think that trying to use numpy in cython generates an error somehow.
How can I fix it??
I thank you so much for your reply
You must initialize the numpy C API by calling import_array().
Add this line to your cython file:
cnp.import_array()
And as pointed out by @user4815162342 and @DavidW in the comments, you must call Py_Initialize() and Py_Finalize() in main().
Thank you for your help. I got some useful information, even though it did not directly solve my problem.
Following others' advice, rather than loading the print_me function from the .so file with dlopen, I decided to call it directly from C. This is what I did.
# print_me.pyx
import numpy as np
cimport numpy as np
np.import_array()
cdef public char* print_me(f):
    cdef int[2][4] ll = [[1, 2, 3, 4], [5, 6, 7, 8]]
    cdef np.ndarray[np.int_t, ndim=2] nll = np.zeros((4, 6), dtype=np.int)
    print nll
    nll += 1
    print nll
    return f + str(ll[1][0])
This is my .c file
// main.c
#include <python2.7/Python.h>
#include "print_me.h"
int main()
{
    // initialize python
    Py_Initialize();
    PyObject* filename = PyString_FromString("hello");
    initsquare_number();
    //initprint_me();

    // call python-oriented function
    printf("%s\n", print_me(filename));

    // finalize python
    Py_Finalize();
    return 0;
}
I compiled them as follows:
# to generate print_me.c and print_me.h
cython print_me.pyx
# to build main.c and print_me.c into main.o and print_me.o
cc -c main.c print_me.c -I/usr/include/python2.7 -I/usr/lib64/python2.7/site-packages/numpy/core/include
# to link .o files
cc -lpython2.7 -ldl main.o print_me.o -o main
# execute main
./main
This produces the following output:
[[0 0 0 0 0 0]
[0 0 0 0 0 0]
[0 0 0 0 0 0]
[0 0 0 0 0 0]]
[[1 1 1 1 1 1]
[1 1 1 1 1 1]
[1 1 1 1 1 1]
[1 1 1 1 1 1]]
hello5
Thank you for all of your help again!! :)

MD5 a string multiple times gives different results on different platforms

t.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <openssl/md5.h>
static char* unsigned_to_signed_char(const unsigned char* in, int len) {
    char* res = (char*)malloc(len * 2 + 1);
    int i = 0;
    memset(res, 0, len * 2 + 1);
    while (i < len) {
        sprintf(res + i * 2, "%02x", in[i]);
        i++;
    }
    return res;
}

static unsigned char * md5(const unsigned char * in) {
    MD5_CTX ctx;
    unsigned char * result1 = (unsigned char *)malloc(MD5_DIGEST_LENGTH);
    MD5_Init(&ctx);
    printf("len: %lu \n", strlen(in));
    MD5_Update(&ctx, in, strlen(in));
    MD5_Final(result1, &ctx);
    return result1;
}

int main(int argc, char *argv[])
{
    const char * i = "abcdef";
    unsigned char * data = (unsigned char *)malloc(strlen(i) + 1);
    strncpy(data, i, strlen(i));
    unsigned char * result1 = md5(data);
    free(data);
    printf("%s\n", unsigned_to_signed_char(result1, MD5_DIGEST_LENGTH));
    unsigned char * result2 = md5(result1);
    free(result1);
    printf("%s\n", unsigned_to_signed_char(result2, MD5_DIGEST_LENGTH));
    unsigned char * result3 = md5(result2);
    free(result2);
    printf("%s\n", unsigned_to_signed_char(result3, MD5_DIGEST_LENGTH));
    return 0;
}
makefile
all:
	cc t.c -Wall -L/usr/local/lib -lcrypto
and t.py
#!/usr/bin/env python
import hashlib
import binascii
src = 'abcdef'
a = hashlib.md5(src).digest()
b = hashlib.md5(a).digest()
c = hashlib.md5(b).hexdigest().upper()
print binascii.b2a_hex(a)
print binascii.b2a_hex(b)
print c
The results of the Python script on Debian 6 x86 and macOS 10.6 are the same:
e80b5017098950fc58aad83c8c14978e
b91282813df47352f7fe2c0c1fe9e5bd
85E4FBD1BD400329009162A8023E1E4B
The C version's output on macOS is:
len: 6
e80b5017098950fc58aad83c8c14978e
len: 48
eac9eaa9a4e5673c5d3773d7a3108c18
len: 64
73f83fa79e53e9415446c66802a0383f
Why is it different from Debian 6?
Debian environment:
gcc (Debian 4.4.5-8) 4.4.5
Python 2.6.6
Linux shuge-lab 2.6.26-2-686 #1 SMP Thu Nov 25 01:53:57 UTC 2010 i686 GNU/Linux
OpenSSL was installed from testing repository.
MacOS environment:
i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3)
Python 2.7.1
Darwin Lees-Box.local 10.7.0 Darwin Kernel Version 10.7.0: Sat Jan 29 15:17:16 PST 2011; root:xnu-1504.9.37~1/RELEASE_I386 i386
OpenSSL was installed from MacPort.
openssl @1.0.0d (devel, security)
OpenSSL SSL/TLS cryptography library
I think you are allocating exactly enough bytes for the MD5 result, without a terminating \0. Then you are calculating the MD5 of a block of memory that starts with the result of the previous MD5 calculation but has some random bytes after it. You should allocate one byte more for the result and set it to \0.
My proposal:
...
unsigned char * result1 = (unsigned char *)malloc(MD5_DIGEST_LENGTH + 1);
result1[MD5_DIGEST_LENGTH] = 0;
...
The answers so far don't seem to me to have stated the issue clearly enough. Specifically the problem is the line:
MD5_Update(&ctx, in, strlen(in));
The data block you pass in is not '\0' terminated, so the call to update may try to process further bytes beyond the end of the MD5_DIGEST_LENGTH buffer. In short, stop using strlen() to work out the length of an arbitrary buffer of bytes: you know how long the buffers are supposed to be so pass the length around.
You don't '\0' terminate the string you're passing to md5 (which I suppose takes a '\0' terminated string, since you don't pass it the length). The code
memset( data, 0, sizeof( strlen( i ) ) );
memcpy( data, i, strlen( i ) );
is completely broken: sizeof( strlen( i ) ) is the same as sizeof( size_t ), 4 or 8 on typical machines. But you don't want the memset anyway. Try replacing these with:
strcpy( data, i );
Or better yet:
std::string i( "abcdef" );
, then pass i.c_str() to md5 (and declare md5 to take a char const*). (I'd use an std::vector<unsigned char> in md5() as well, and have it return it. And unsigned_to_signed_char would take the std::vector<unsigned char> and return std::string.)
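For completeness, the whole chain can be reproduced in Python 3, where hashlib operates on bytes with an explicit length, so no terminator problems can creep in; the expected values are the Debian ones quoted above:

```python
import hashlib

src = b"abcdef"
a = hashlib.md5(src).digest()  # 16 raw bytes, not a '\0'-terminated string
b = hashlib.md5(a).digest()    # hash exactly those 16 bytes
c = hashlib.md5(b).hexdigest().upper()

print(a.hex())  # e80b5017098950fc58aad83c8c14978e
print(b.hex())  # b91282813df47352f7fe2c0c1fe9e5bd
print(c)        # 85E4FBD1BD400329009162A8023E1E4B
```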
