Detect gray things with OpenCV - python

I'd like to detect an object using OpenCV that is distinctly different from the other elements in the scene because it's gray. This is convenient: I can just test R == G == B, and the check is independent of luminosity, but doing it pixel by pixel is slow.
Is there a faster way to detect gray things? Maybe there's an OpenCV method that does the R == G == B test... cv2.inRange does color thresholding, but it's not quite what I'm looking for.

The fastest method I can find in Python is to use slicing to compare each channel. After a few test runs, this method is upwards of 200 times faster than two nested for-loops.
import cv2
import numpy as np

im = cv2.imread("test.png")      # im is a BGR uint8 image
bg = im[:, :, 0] == im[:, :, 1]  # B == G
gr = im[:, :, 1] == im[:, :, 2]  # G == R
slices = np.bitwise_and(bg, gr, dtype=np.uint8) * 255
This will generate a binary image where gray objects are indicated by white pixels. If you do not need a binary image, but only a logical array where gray pixels are indicated by True values, this method gets even faster:
slices = np.bitwise_and(bg, gr)
Omitting the type cast and multiplication yields a method 500 times faster than nested loops.
Running this operation on a test image correctly detects the gray object in the resulting mask.
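For anyone who wants to reproduce the speedup claim, here is a minimal benchmark sketch on a synthetic image (the exact factor will vary with image size and hardware):
import time
import numpy as np

im = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)  # synthetic test image

t0 = time.perf_counter()
mask_loop = np.zeros(im.shape[:2], dtype=bool)
for y in range(im.shape[0]):
    for x in range(im.shape[1]):
        b, g, r = im[y, x]
        mask_loop[y, x] = b == g == r
t1 = time.perf_counter()

mask_vec = np.bitwise_and(im[:, :, 0] == im[:, :, 1], im[:, :, 1] == im[:, :, 2])
t2 = time.perf_counter()

print(f"loops: {t1 - t0:.2f}s, slicing: {t2 - t1:.5f}s, speedup: {(t1 - t0) / (t2 - t1):.0f}x")
assert (mask_loop == mask_vec).all()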

I'm surprised that such a simple check is slow; you are probably not coding it efficiently.
Here is a short piece of code that should do it for you. It is optimal neither in speed nor in memory, but it is quite good in number of lines of code :)
std::vector<cv::Mat> planes;
cv::split(image, planes);
cv::Mat mask = planes[0] == planes[1];
mask &= planes[1] == planes[2];
For the sake of it, here is a comparison with what would, in my opinion, be the fastest way to do it (without parallelization):
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <iostream>
#include <vector>
#include <cassert>
#include <sys/time.h> // gettimeofday

static
double
P_ellapsedTime(struct timeval t0, struct timeval t1)
{
    // return elapsed time in seconds
    return (t1.tv_sec - t0.tv_sec) * 1.0 + (t1.tv_usec - t0.tv_usec) / 1000000.0;
}

int
main(int argc, char* argv[])
{
    struct timeval t0, t1;
    cv::Mat image = cv::imread(argv[1]);
    assert(image.type() == CV_8UC3);
    std::vector<cv::Mat> planes;
    std::cout << "Image resolution=" << image.rows << "x" << image.cols << std::endl;

    gettimeofday(&t0, NULL);
    cv::split(image, planes);
    cv::Mat mask = planes[0] == planes[1];
    mask &= planes[1] == planes[2];
    gettimeofday(&t1, NULL);
    std::cout << "Time using split: " << P_ellapsedTime(t0, t1) << "s" << std::endl;

    cv::Mat mask2 = cv::Mat::zeros(image.size(), CV_8U);
    unsigned char *imgBuf = image.data;
    unsigned char *maskBuf = mask2.data;
    gettimeofday(&t0, NULL);
    for (; imgBuf != image.dataend; imgBuf += 3, maskBuf++)
        *maskBuf = (imgBuf[0] == imgBuf[1] && imgBuf[1] == imgBuf[2]) ? 255 : 0;
    gettimeofday(&t1, NULL);
    std::cout << "Time using loop: " << P_ellapsedTime(t0, t1) << "s" << std::endl;

    cv::namedWindow("orig", 0);
    cv::imshow("orig", image);
    cv::namedWindow("mask", 0);
    cv::imshow("mask", mask);
    cv::namedWindow("mask2", 0);
    cv::imshow("mask2", mask2);
    cv::waitKey(0);
}
Bench on an image:
Image resolution=3171x2179
Time using split: 0.06353s
Time using loop: 0.029044s
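For Python users, the same split-and-compare idea is a short sketch with cv2.split (the input path here is hypothetical):
import cv2
import numpy as np

im = cv2.imread("test.png")  # hypothetical input image, BGR uint8
b, g, r = cv2.split(im)
mask = ((b == g) & (g == r)).astype(np.uint8) * 255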

Related

Save data (matrix or ndarray) with Python, then load it in C++ (as an OpenCV Mat)

I've created some data in numpy that I would like to use in a separate C++ program. Therefore I need to save the data using python and later load it in C++. What is the best way of doing this?
My numpy ndarray is float32 and of shape [10000 x 18 x 5]. I can save it, for example, using
numpy.save(filename, data)
Is there an easy way to load such data in C++? Target structure could be an Eigen::Matrix for example.
After searching for hours I found my year-old example files.
Caveat:
solution only covers 2D matrices
not suited for 3 dimensional or generic ndarrays
Write numpy array to ascii file with header specifying nrows, ncols:
def write_matrix2D_to_ascii(filename, matrix2D):
    nrows, ncols = matrix2D.shape
    with open(filename, "w") as file:
        # write header [rows x cols]
        file.write(f"{nrows} {ncols}")
        file.write("\n")
        # write values, one row per line
        for row in range(nrows):
            for col in range(ncols):
                value = matrix2D[row, col]
                file.write(str(value))
                file.write(" ")
            file.write("\n")
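For example, a call like this (a sketch with a small 2x3 matrix):
import numpy as np

data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])
write_matrix2D_to_ascii("data-file.txt", data)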
Example output data-file.txt looks like this (first row is header specifying nrows and ncols):
2 3
1.0 2.0 3.0
4.0 5.0 6.0
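As an aside, numpy.savetxt can produce the same layout in one call (a sketch; savetxt's default float format is scientific notation, which the C++ reader below parses just the same):
import numpy as np

matrix2D = np.array([[1.0, 2.0, 3.0],
                     [4.0, 5.0, 6.0]])
# comments="" suppresses the default "# " prefix, so the header is the bare "nrows ncols" line
np.savetxt("data-file.txt", matrix2D,
           header=f"{matrix2D.shape[0]} {matrix2D.shape[1]}", comments="")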
C++ function to read the matrix from the ascii file into an OpenCV matrix:
#include <iostream>
#include <fstream>
#include <iomanip> // set precision of output string
#include <opencv2/core/core.hpp> // OpenCV matrices for storing data
using namespace std;
using namespace cv;
void readMatAsciiWithHeader( const string& filename, Mat& matData)
{
    cout << "Create matrix from file :" << filename << endl;
    ifstream inFileStream(filename.c_str());
    if (!inFileStream) {
        cout << "File cannot be found" << endl;
        exit(-1);
    }

    int rows, cols;
    inFileStream >> rows;
    inFileStream >> cols;
    matData.create(rows, cols, CV_32F);
    cout << "numRows: " << rows << "\t numCols: " << cols << endl;
    matData.setTo(0); // init all values to 0

    float *dptr;
    for (int ridx = 0; ridx < matData.rows; ++ridx) {
        dptr = matData.ptr<float>(ridx);
        for (int cidx = 0; cidx < matData.cols; ++cidx, ++dptr) {
            inFileStream >> *dptr;
        }
    }
    inFileStream.close();
}
Driver code to use the above function in a C++ program:
Mat myMatrix;
readMatAsciiWithHeader("path/to/data-file.txt", myMatrix);
For completeness, some code to save the data using C++:
int saveMatAsciiWithHeader( const string& filename, Mat& matData)
{
    if (matData.empty()) {
        cout << "File could not be saved. MatData is empty" << endl;
        return 0;
    }

    ofstream oStream(filename.c_str());
    // Create header
    oStream << matData.rows << " " << matData.cols << endl;
    // Write data
    for (int ridx = 0; ridx < matData.rows; ridx++)
    {
        for (int cidx = 0; cidx < matData.cols; cidx++)
        {
            oStream << setprecision(9) << matData.at<float>(ridx, cidx) << " ";
        }
        oStream << endl;
    }
    oStream.close();
    cout << "Saved " << filename.c_str() << endl;
    return 1;
}
Future work:
solution for 3D matrices
conversion to Eigen::Matrix
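In the meantime, a file written by the C++ saver can be read back in Python with numpy.loadtxt (a minimal sketch):
import numpy as np

# skiprows=1 skips the "nrows ncols" header; the shape comes from the rows themselves
data = np.loadtxt("data-file.txt", skiprows=1, dtype=np.float32)
print(data.shape)  # (2, 3) for the example file above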

Same random numbers in C++ as computed by Python3 numpy.random.rand

I would like to duplicate in C++ the testing for some code that has already been implemented in Python3 which relies on numpy.random.rand and randn values and a specific seed (e.g., seed = 1).
I understand that Python's random implementation is based on a Mersenne twister. The C++ standard library also supplies this in std::mersenne_twister_engine.
The C++ version returns an unsigned int, whereas Python rand is a floating point value.
Is there a way to obtain the same values in C++ as are generated in Python, and to be sure that they are the same? And the same for an array generated by randn?
You can do it this way for integer values:
import numpy as np
np.random.seed(12345)
print(np.random.randint(256**4, dtype='<u4', size=1)[0])
#include <iostream>
#include <random>

int main()
{
    std::mt19937 e2(12345);
    std::cout << e2() << std::endl;
}
The result of both snippets is 3992670690.
By looking at the source code of rand, you can implement it in your C++ code this way:
import numpy as np
np.random.seed(12345)
print(np.random.rand())
#include <iostream>
#include <iomanip>
#include <random>

int main()
{
    std::mt19937 e2(12345);
    int a = e2() >> 5;
    int b = e2() >> 6;
    double value = (a * 67108864.0 + b) / 9007199254740992.0;
    std::cout << std::fixed << std::setprecision(16) << value << std::endl;
}
Both random values are 0.9296160928171479.
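You can verify the bit manipulation without leaving Python: draw the two raw 32-bit outputs with randint (which, as shown above, consumes one Mersenne output per value) and rebuild the double by hand:
import numpy as np

np.random.seed(12345)
raw = np.random.randint(256**4, dtype='<u4', size=2)  # two raw 32-bit Mersenne outputs
a = int(raw[0]) >> 5  # keep the top 27 bits
b = int(raw[1]) >> 6  # keep the top 26 bits
print((a * 67108864.0 + b) / 9007199254740992.0)  # 0.9296160928171479 (a*2**26 + b, scaled by 2**53)

np.random.seed(12345)
print(np.random.rand())  # same value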
It would be convenient to use std::generate_canonical, but it uses a different method to convert the output of the Mersenne twister to a double:
double value = std::generate_canonical<double, std::numeric_limits<double>::digits>(e2);
This produces 0.8901547132827379, which differs from the output of the Python code. A likely reason is that generate_canonical is more heavily optimized than the conversion used in NumPy, avoiding costly floating-point operations such as multiplication and division, as can be seen in its source code. Its result also appears to be implementation-dependent, whereas NumPy produces the same value on all platforms.
For completeness, and to avoid re-inventing the wheel, here is an implementation of both numpy.rand and numpy.randn in C++.
The header file:
#ifndef RANDOMNUMGEN_NUMPYCOMPATIBLE_H
#define RANDOMNUMGEN_NUMPYCOMPATIBLE_H

#include "RandomNumGenerator.h"

// Uniform distribution - numpy.rand
class RandomNumGen_NumpyCompatible {
public:
    RandomNumGen_NumpyCompatible();
    RandomNumGen_NumpyCompatible(std::uint_fast32_t newSeed);

    std::uint_fast32_t min() const { return m_mersenneEngine.min(); }
    std::uint_fast32_t max() const { return m_mersenneEngine.max(); }

    void seed(std::uint_fast32_t seed);
    void discard(unsigned long long); // NOTE!! Advances and discards twice as many values as passed in, to keep tracking with NumPy order
    uint_fast32_t operator()(); // Simply returns the next Mersenne value from the engine
    double getDouble(); // Calculates the next uniformly random double as numpy.rand does

    std::string getGeneratorType() const { return "RandomNumGen_NumpyCompatible"; }

private:
    std::mt19937 m_mersenneEngine;
};

///////////////////

// Gaussian distribution - numpy.randn
class GaussianRandomNumGen_NumpyCompatible {
public:
    GaussianRandomNumGen_NumpyCompatible();
    GaussianRandomNumGen_NumpyCompatible(std::uint_fast32_t newSeed);

    std::uint_fast32_t min() const { return m_mersenneEngine.min(); }
    std::uint_fast32_t max() const { return m_mersenneEngine.max(); }

    void seed(std::uint_fast32_t seed);
    void discard(unsigned long long); // NOTE!! Advances and discards twice as many values as passed in, to keep tracking with NumPy order
    uint_fast32_t operator()(); // Simply returns the next Mersenne value from the engine
    double getDouble(); // Calculates the next normally (Gaussian) distributed random double as numpy.randn does

    std::string getGeneratorType() const { return "GaussianRandomNumGen_NumpyCompatible"; }

private:
    bool m_haveNextVal;
    double m_nextVal;
    std::mt19937 m_mersenneEngine;
};

#endif
And the implementation:
#include "RandomNumGen_NumpyCompatible.h"
RandomNumGen_NumpyCompatible::RandomNumGen_NumpyCompatible()
{
}
RandomNumGen_NumpyCompatible::RandomNumGen_NumpyCompatible(std::uint_fast32_t seed)
: m_mersenneEngine(seed)
{
}
void RandomNumGen_NumpyCompatible::seed(std::uint_fast32_t newSeed)
{
m_mersenneEngine.seed(newSeed);
}
void RandomNumGen_NumpyCompatible::discard(unsigned long long z)
{
//Advances and discards TWICE as many values to keep with Numpy order
m_mersenneEngine.discard(2*z);
}
std::uint_fast32_t RandomNumGen_NumpyCompatible::operator()()
{
return m_mersenneEngine();
}
double RandomNumGen_NumpyCompatible::getDouble()
{
int a = m_mersenneEngine() >> 5;
int b = m_mersenneEngine() >> 6;
return (a * 67108864.0 + b) / 9007199254740992.0;
}
///////////////////
GaussianRandomNumGen_NumpyCompatible::GaussianRandomNumGen_NumpyCompatible()
: m_haveNextVal(false)
{
}
GaussianRandomNumGen_NumpyCompatible::GaussianRandomNumGen_NumpyCompatible(std::uint_fast32_t seed)
: m_haveNextVal(false), m_mersenneEngine(seed)
{
}
void GaussianRandomNumGen_NumpyCompatible::seed(std::uint_fast32_t newSeed)
{
m_mersenneEngine.seed(newSeed);
}
void GaussianRandomNumGen_NumpyCompatible::discard(unsigned long long z)
{
//Burn some CPU cyles here
for (unsigned i = 0; i < z; ++i)
getDouble();
}
std::uint_fast32_t GaussianRandomNumGen_NumpyCompatible::operator()()
{
return m_mersenneEngine();
}
double GaussianRandomNumGen_NumpyCompatible::getDouble()
{
if (m_haveNextVal) {
m_haveNextVal = false;
return m_nextVal;
}
double f, x1, x2, r2;
do {
int a1 = m_mersenneEngine() >> 5;
int b1 = m_mersenneEngine() >> 6;
int a2 = m_mersenneEngine() >> 5;
int b2 = m_mersenneEngine() >> 6;
x1 = 2.0 * ((a1 * 67108864.0 + b1) / 9007199254740992.0) - 1.0;
x2 = 2.0 * ((a2 * 67108864.0 + b2) / 9007199254740992.0) - 1.0;
r2 = x1 * x1 + x2 * x2;
} while (r2 >= 1.0 || r2 == 0.0);
/* Box-Muller transform */
f = sqrt(-2.0 * log(r2) / r2);
m_haveNextVal = true;
m_nextVal = f * x1;
return f * x2;
}
After doing a bit of testing, it does seem that the values are within a tolerance (see fdermishin's comment) when the C++ unsigned int is divided by the maximum value for an unsigned int, like this:
#include <limits>
...
std::mt19937 generator1(seed); // mt19937 is a standard mersenne_twister_engine
unsigned val1 = generator1();
std::cout << "Gen 1 random value: " << val1 << std::endl;
std::cout << "Normalized Gen 1: " << static_cast<double>(val1) / std::numeric_limits<std::uint32_t>::max() << std::endl;
However, Python's version seems to skip every other value.
Given the following two programs:
#!/usr/bin/env python3
import sys
import numpy as np

def main():
    np.random.seed(1)
    for i in range(0, 10):
        print(np.random.rand())

###########
# Call main and exit success
if __name__ == "__main__":
    main()
    sys.exit()
and
#include <cstdlib>
#include <iostream>
#include <random>
#include <limits>

int main()
{
    unsigned seed = 1;
    std::mt19937 generator1(seed); // mt19937 is a standard mersenne_twister_engine
    for (unsigned i = 0; i < 10; ++i) {
        unsigned val1 = generator1();
        std::cout << "Normalized, #" << i << ": " << (static_cast<double>(val1) / std::numeric_limits<std::uint32_t>::max()) << std::endl;
    }
    return EXIT_SUCCESS;
}
the Python program prints:
0.417022004702574
0.7203244934421581
0.00011437481734488664
0.30233257263183977
0.14675589081711304
0.0923385947687978
0.1862602113776709
0.34556072704304774
0.39676747423066994
0.538816734003357
whereas the C++ program prints:
Normalized, #0: 0.417022
Normalized, #1: 0.997185
Normalized, #2: 0.720324
Normalized, #3: 0.932557
Normalized, #4: 0.000114381
Normalized, #5: 0.128124
Normalized, #6: 0.302333
Normalized, #7: 0.999041
Normalized, #8: 0.146756
Normalized, #9: 0.236089
I could easily skip every other value in the C++ version, which should give me numbers that match the Python version (within a tolerance). But why would Python's implementation seem to skip every other value, or where do these extra values in the C++ version come from?

LibSVM Predict Not working

I've coded my classifier with libsvm's Python utility, and it has worked quite well so far. Here is an example of how I call the Python API:
print svmutil.svm_predict([2], [f.flatten().tolist()], libsvm_model, '-b 1')
where f is a (1024,1) vector.
I have saved the model, and loaded it using the C++ API. However, when I attempt to load and predict the same vector, it gives me wrong results.
cv::Mat oneCol = fcMat.row(0);
svm_node *x = (struct svm_node *) malloc(1025 * sizeof(struct svm_node));
for (int i = 0; i < 1024; i++) {
    x[i].index = i;
    x[i].value = (double)oneCol.at<float>(i);
}
x[1024].index = -1;

double *prob_estimates = NULL;
prob_estimates = (double *) malloc(svmModel->nr_class * sizeof(double));
double retVal = svm_predict_probability(svmModel, x, prob_estimates);
cout << retVal << endl;
for (int j = 0; j < svmModel->nr_class; j++)
    cout << prob_estimates[j] << endl;
Here, I attempt to load the vector from an OpenCV object as shown above. However, the prediction comes out wrong. Is something wrong here?
for (int i = 0; i < 1024; i++) {
    x[i].index = i + 1;
    x[i].value = (double)oneCol.at<float>(i);
}
In LibSVM, indexes start at 1. Who knew :(
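Equivalently, the 1-based numbering can be made explicit on the Python side by passing a dict instead of a list (a sketch reusing f, libsvm_model, and svmutil from the question):
# libsvm's python wrapper numbers list features implicitly from 1;
# a dict makes the same 1-based indexing explicit
features = {i + 1: float(v) for i, v in enumerate(f.flatten())}
print svmutil.svm_predict([2], [features], libsvm_model, '-b 1')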

How to write 2D std vector of floats to HDF5 file and then read it in python

I want to write a 2D vector of floats to an HDF5 file.
I used the following code (writeh5.cpp):
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <H5Cpp.h>

using namespace H5;
using namespace std;

int main(void) {
    int nrow = 5;
    int ncol = 4;
    vector<vector< double > > vec2d;
    vec2d.resize(nrow, vector<double>(ncol, 0.0));
    srand((unsigned)time(0));

    vector< vector< double > >::iterator row;
    vector< double >::iterator col;
    for (row = vec2d.begin(); row != vec2d.end(); row++) {
        cout << endl;
        for (col = row->begin(); col != row->end(); col++) {
            *col = (rand() / (RAND_MAX + 1.0));
            cout << *col << '\t';
        }
    }
    cout << endl;

    H5File file("test.h5", H5F_ACC_TRUNC);
    // dataset dimensions
    hsize_t dimsf[2];
    dimsf[0] = nrow;
    dimsf[1] = ncol;
    DataSpace dataspace(2, dimsf);
    DataType datatype(H5::PredType::NATIVE_DOUBLE);
    DataSet dataset = file.createDataSet("data", datatype, dataspace);
    // dataset.write(vec2d.data(), H5::PredType::NATIVE_DOUBLE);
    dataset.write(&vec2d[0][0], H5::PredType::NATIVE_DOUBLE);

    cout << endl << " vec2d has " << endl;
    for (row = vec2d.begin(); row != vec2d.end(); row++) {
        cout << endl;
        for (col = row->begin(); col != row->end(); col++) {
            cout << *col << '\t';
        }
    }
    cout << endl;

    dataset.close();
    dataspace.close();
    file.close();
    return 0;
}
I compiled it using g++ writeh5.cpp -I/usr/include/hdf5/ -lhdf5_cpp -lhdf5 -Wall
A run of the code produced the following output:
0.325553 0.598941 0.364489 0.0125061
0.374205 0.0319419 0.380329 0.815621
0.863754 0.386279 0.0173515 0.15448
0.703936 0.372486 0.728436 0.991631
0.666207 0.568983 0.807475 0.964276
And the file test.h5
Then, when I read this file from Python (using the following):
import h5py
import numpy as np
file = h5py.File("test.h5", 'r')
dataset = np.array(file["data"])
print dataset
file.close()
I got
[[ 3.25553381e-001 5.98941262e-001 3.64488814e-001 1.25061036e-002]
[ 0.00000000e+000 2.42092166e-322 3.74204732e-001 3.19418786e-002]
[ 3.80329057e-001 8.15620518e-001 0.00000000e+000 2.42092166e-322]
[ 8.63753530e-001 3.86278684e-001 1.73514970e-002 1.54479635e-001]
[ 0.00000000e+000 2.42092166e-322 7.03935940e-001 3.72486182e-001]]
The first row is good; the other rows are garbage.
I tried dataset.write(&vec2d[0], ...) and dataset.write(vec2d[0].data(), ...); I got similar problems.
I want to:
Write an HDF5 file with the contents of a 2D std::vector of doubles,
Read the file in Python and store the contents in a numpy array.
What am I doing wrong?
Apparently, I am not allowed to pass a std::vector of vectors to the write function. Copying the elements of the vector to a static array solves the problem, because the write function happily accepts this array.
However, I am not happy with this solution; I expected to be able to pass the vectors directly to the write function.
Here is the code:
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <H5Cpp.h>

using namespace H5;
using namespace std;

int main(void) {
    int nrow = 5;
    int ncol = 4;
    vector<vector< double > > vec2d;
    vec2d.resize(nrow, vector<double>(ncol, 0.0));
    srand((unsigned)time(0));

    // generate some data
    vector< vector< double > >::iterator row;
    vector< double >::iterator col;
    for (row = vec2d.begin(); row != vec2d.end(); row++) {
        cout << endl;
        for (col = row->begin(); col != row->end(); col++) {
            *col = (rand() / (RAND_MAX + 1.0));
            cout << *col << '\t';
        }
    }
    cout << endl;

    // copy the data into a contiguous static array
    double varray[nrow][ncol];
    for (int i = 0; i < nrow; ++i) {
        cout << endl;
        for (int j = 0; j < ncol; ++j) {
            varray[i][j] = vec2d[i][j];
        }
    }

    H5File file("test.h5", H5F_ACC_TRUNC);
    // dataset dimensions
    hsize_t dimsf[2];
    dimsf[0] = nrow;
    dimsf[1] = ncol;
    DataSpace dataspace(2, dimsf);
    DataType datatype(H5::PredType::NATIVE_DOUBLE);
    DataSet dataset = file.createDataSet("data", datatype, dataspace);
    dataset.write(varray, H5::PredType::NATIVE_DOUBLE);

    cout << endl;
    dataset.close();
    dataspace.close();
    file.close();
    return 0;
}
I ran into the same problem when I converted my data from a vector to a dynamic 2D array. The problem is not that the write call will not accept a vector; it is that it does not understand the concept of a pointer array and only writes out contiguous memory. A vector of vectors is not contiguous in memory: it is an array of pointers to a bunch of separately allocated vectors. That is why the first row was correct when you passed the first element: the rest of the table is just whatever garbage sits in memory after the first vector.
My solution was to create one giant 1D vector and perform my own indexing to convert back and forth, as sketched below. This is similar to the approach in h5_writedyn.c https://www.hdfgroup.org/ftp/HDF5/examples/misc-examples/h5_writedyn.c
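The index arithmetic is the usual row-major mapping; a quick illustration (shown in Python with numpy for brevity) of what the 1D buffer has to implement:
import numpy as np

nrow, ncol = 5, 4
flat = np.arange(nrow * ncol, dtype=np.float64)  # stand-in for the flat 1D buffer

# manual row-major indexing, as the flattened C++ vector must do:
i, j = 3, 2
assert flat[i * ncol + j] == flat.reshape(nrow, ncol)[i, j]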
What is this? You say a run of your code gives:
0.325553 0.598941 0.364489 0.0125061
0.374205 0.0319419 0.380329 0.815621
0.863754 0.386279 0.0173515 0.15448
0.703936 0.372486 0.728436 0.991631
0.666207 0.568983 0.807475 0.964276
I don't see a print in your C++ code. Did you read the file with some other tool?
(Yes, this is a clarifying question, but it requires too much formatting to fit in a comment.)
https://stackoverflow.com/a/24622720/901925
Writing 2-D array int[n][m] to HDF5 file using Visual C++
The solution talks about writing a vector of vectors. It also talks about writing variable length arrays.
You may have to put the dataset write in a row iterator:
for (row = vec2d.begin(); row != vec2d.end(); row++) {
    dataset.write(row->data(), H5::PredType::NATIVE_DOUBLE);
    // or dataset.write(&(*row)[0], ...)?
}

C++ - Reading in 16bit .wav files

I'm trying to read in a .wav file, which I thought was giving me the correct result; however, when I plot the same audio file in Matlab or Python, the results are different.
[Plots omitted: the result my C++ code gives vs. the result Python (plotted with matplotlib) gives.]
The results do not seem that different, but when it comes to analysis, this is messing up my results.
Here is the code that converts:
for (int i = 0; i < size; i += 2)
{
    int c = (data[i + 1] << 8) | data[i];
    double t = c / 32768.0;
    //cout << t << endl;
    rawSignal.push_back(t);
}
Where am I going wrong, given that this conversion seems fine and produces such similar results?
Thanks
EDIT:
Code to read the header / data:
bool readHeader(ifstream& file) {
    s_riff_hdr riff_hdr;
    s_chunk_hdr chunk_hdr;
    long padded_size; // Size of extra bits
    vector<uint8_t> fmt_data; // Vector to store the FMT data.
    s_wavefmt *fmt = NULL;

    file.read(reinterpret_cast<char*>(&riff_hdr), sizeof(riff_hdr));
    if (!file) return false;
    if (memcmp(riff_hdr.id, "RIFF", 4) != 0) return false;
    //cout << "size=" << riff_hdr.size << endl;
    //cout << "type=" << string(riff_hdr.type, 4) << endl;
    if (memcmp(riff_hdr.type, "WAVE", 4) != 0) return false;

    do
    {
        file.read(reinterpret_cast<char*>(&chunk_hdr), sizeof(chunk_hdr));
        if (!file) return false;
        padded_size = ((chunk_hdr.size + 1) & ~1);

        if (memcmp(chunk_hdr.id, "fmt ", 4) == 0)
        {
            if (chunk_hdr.size < sizeof(s_wavefmt)) return false;
            fmt_data.resize(padded_size);
            file.read(reinterpret_cast<char*>(&fmt_data[0]), padded_size);
            if (!file) return false;
            fmt = reinterpret_cast<s_wavefmt*>(&fmt_data[0]);
            sample_rate2 = fmt->sample_rate;

            if (fmt->format_tag == 1) // PCM
            {
                if (chunk_hdr.size < sizeof(s_pcmwavefmt)) return false;
                s_pcmwavefmt *pcm_fmt = reinterpret_cast<s_pcmwavefmt*>(fmt);
                bits_per_sample = pcm_fmt->bits_per_sample;
            }
            else
            {
                if (chunk_hdr.size < sizeof(s_wavefmtex)) return false;
                s_wavefmtex *fmt_ex = reinterpret_cast<s_wavefmtex*>(fmt);
                if (fmt_ex->extra_size != 0)
                {
                    if (chunk_hdr.size < (sizeof(s_wavefmtex) + fmt_ex->extra_size)) return false;
                    uint8_t *extra_data = reinterpret_cast<uint8_t*>(fmt_ex + 1);
                    // use extra_data, up to extra_size bytes, as needed...
                }
            }
            //cout << "extra_size=" << fmt_ex->extra_size << endl;
        }
        else if (memcmp(chunk_hdr.id, "data", 4) == 0)
        {
            // process chunk data, according to fmt, as needed...
            size = padded_size;
            if (bits_per_sample == 16)
            {
                //size = padded_size / 2;
            }
            data = new unsigned char[size];
            file.read(reinterpret_cast<char*>(data), size);
            if (!file) return false;
        }
        else
        {
            // skip other chunks as needed...
            file.ignore(padded_size);
            if (!file) return false;
        }
    } while (!file.eof());

    return true;
}
This is where the "conversion to double" happens:
if (bits_per_sample == 8)
{
    uint8_t c;
    //cout << size;
    for (unsigned i = 0; i < size; i++)
    {
        c = (unsigned)(unsigned char)(data[i]);
        double t = (c - 128) / 128.0;
        rawSignal.push_back(t);
    }
}
else if (bits_per_sample == 16)
{
    for (int i = 0; i < size; i += 2)
    {
        int c;
        c = (unsigned) (unsigned char) (data[i + 2] << 8) | data[i];
        double t = c / 32768.0;
        rawSignal.push_back(t);
    }
}
Note how 8-bit files work correctly?
I suspect your problem may be that data is an array of signed char values. So, when you do this:
int c = (data[i + 1] << 8) | data[i];
… it's not actually doing what you wanted. Let's look at some simple examples.
If data[i+1] == 64 and data[i] == 64, that's going to be 0x4000 | 0x40, or 0x4040, all good.
If data[i+1] == -64 and data[i] == -64, that's going to be 0xffffc000 | 0xffffffc0, or 0xffffffc0, which is obviously wrong.
If you were using unsigned char values, this would work, because instead of -64 those numbers would be 192, and you'd end up with 0xc000 | 0xc0 or 0xc0c0, just as you want. (But then your /32768.0 would give you numbers in the range 0.0 to 2.0, when you presumably want -1.0 to 1.0.)
Suggesting a "fix" is difficult without knowing exactly what you're trying to do, but the details matter. Standard PCM .wav files store 16-bit samples as signed little-endian integers. So the two bytes should be assembled as unsigned char values, avoiding the sign extension shown above, and the combined 16-bit result should then be reinterpreted as signed (e.g. via int16_t) before dividing by 32768.0. That yields the conventional floating-point range of -1.0 to just under 1.0, which is presumably what your Matlab and Python plots show.
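Since you already plot the file in Python, here is a minimal reference decode to compare your C++ output against (a sketch; the path is hypothetical, and numpy's "<i2" dtype performs exactly the signed little-endian interpretation discussed above):
import wave
import numpy as np

with wave.open("input.wav", "rb") as wav:  # hypothetical 16-bit PCM file
    raw = wav.readframes(wav.getnframes())

# "<i2" = signed 16-bit little-endian, which is what PCM .wav samples are
samples = np.frombuffer(raw, dtype="<i2").astype(np.float64) / 32768.0
print(samples.min(), samples.max())  # values lie in [-1.0, 1.0)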
