Manipulating data arrays and adding noise - python

I just cannot seem to get this right! :-(
I have code that is supposed to read in a .fits file, add normally distributed noise to its pixels, and then re-save the file. So far, it just does not seem to be working at all. There's a lot of extra code, so I only posted the portion that's relevant; assume that everything this slice of code needs to read in exists, because it does. "poisson" is a previously inputted variable, i.e. a "poisson" value of 1 corresponds to one standard deviation from a mean of zero. Yes, the word "poisson" is a bit of a misnomer, and I should really overhaul my code to amend that.
My first issue is: what does im0 = im[0] mean? It doesn't seem to be the first row of pixels in the .fits file, because when I change the integer in the brackets to anything besides 0, I get an index error. On top of that, the normalNoise = np.random.normal(0,poisson) call is incomplete, because I'm missing the third parameter, "size" (a tuple of ints), and I have no idea what that means. My images are 130 x 130 pixels, if that means anything.
im = pf.open(name)
im0 = im[0]
normalNoise = np.random.normal(0,poisson)
print im0.data
test = im0.data + normalNoise
print test
im0.data = test
stringee = 'NOISE'
pf.writeto(stringee+str(poisson)+name, data=test, clobber=True, header=im0.header)
print poisson
This should ideally spit out the same image but with added noise, except it doesn't!

I do not know the underlying library, but if im is an opened FITS file, im[0] is probably a reference to the primary HDU; if there is only a primary HDU (as for any simple FITS file containing a single image), any higher index leads to an index error.
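As for the missing size argument: np.random.normal(0, poisson) with no size returns a single float, so you would be adding the same offset to every pixel rather than per-pixel noise. Passing size=im0.data.shape (here (130, 130)) draws one sample per pixel. A minimal sketch, assuming pf is pyfits (or astropy.io.fits) and that name and poisson are defined as in the question:
import numpy as np

im = pf.open(name)  # pf assumed to be pyfits / astropy.io.fits
im0 = im[0]  # primary HDU holds the image
# one normally distributed sample per pixel; matches the (130, 130) image shape
noise = np.random.normal(0, poisson, size=im0.data.shape)
im0.data = im0.data + noise
pf.writeto('NOISE' + str(poisson) + name, data=im0.data, header=im0.header, clobber=True)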


plot results from user defined ACT Extension

As a result of my simulation, I want the volume of a surface body (computed using a convex hull algorithm). The calculation itself is done in seconds, but plotting the result takes a long time, which becomes a problem for the future design of experiments. I think the main problem is that a matrix (size = number of nodes, over 33,000) is filled with the same volume value just so it can be plotted. Is there any other way to obtain that value without creating this matrix? (The value retrieved must be selected as an output parameter afterwards.)
It must be noted that the volume value is computed in Python in an intermediate script, then saved to an output file that is later read by IronPython in the main script in Ansys ACT.
Thanks!
The matrix creation in the intermediate script (myICV is the computed volume):
import numpy as np

NodeNo = np.array(Col_1)  # Col_1: list of node numbers from the solver
ICV = np.full_like(NodeNo, myICV)  # fill every node with the same volume value
np.savetxt(outputfile, (NodeNo, ICV), delimiter=',', fmt='%f')
Plot of the results in the main script:
import csv  # after the CPython function

resfile = opfile
# read the node numbers and the scaled values back from the intermediate file
reader = csv.reader(open(resfile, 'rb'), quoting=csv.QUOTE_NONNUMERIC)
NodeNos = next(reader)
ICVs = next(reader)
#ScaledUxs = next(reader)
a = int(NodeNos[1])
b = ICVs[1]
ExtAPI.Log.WriteMessage(a.GetType().ToString())
ExtAPI.Log.WriteMessage(b.GetType().ToString())
userUnit = ExtAPI.DataModel.CurrentUnitFromQuantityName("Length")
DispFactor = units.ConvertUnit(1, userUnit, "mm")
for id in collector.Ids:
    collector.SetValues(int(NodeNos[NodeNos.index(id)]), {ICVs[NodeNos.index(id)] * DispFactor})  # plot the value on each node
ExtAPI.Log.WriteMessage("ICV read")
So far the result looks like this (screenshot of the plotted result omitted).
Considering that your 'CustomPost' object is not relevant for visualization but just passes the volume calculation on as a parameter, without adding many changes to the workflow I suggest you change the 'Scoping Method' to 'Geometry' and then select a single node (if the extension result type is 'Node'; you can check this in the xml file), instead of 'All Bodies'.
If your code runs slowly because of the plotting, this should fix it, because you will be requesting just one node.
As you are referring to DoE, I understand you expect to run this model iteratively and read the parameter result. An easy trick might be to generate a 'NamedSelection' by 'Worksheet' and select 'Mesh Node' (Entity Type) with 'NodeID' as criterion, equal to '1' for example. Even if you change the mesh between iterations, there will always be a node with ID 1, so your NamedSelection is guaranteed to be generated successfully in each iteration.
Then you can scope your 'CustomPost' to 'NamedSelection' and select the one you created. This should work.
If your extension does not accept 'NamedSelection' as 'Scoping Method' and you are changing the mesh in each iteration (if you are not, you can directly scope a node), I think it is time to manually write the parameter as an 'Input Parameter' in the 'Parameter Set'. But in this way you will have to control the execution of the model from the Workbench platform.
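If you scope to a single node like that, the intermediate script no longer needs to fill the full matrix either; a minimal sketch, assuming node ID 1 is the node your NamedSelection scopes to:
import numpy as np

# write a single (node, volume) pair instead of filling 33,000+ nodes;
# keeps the same two-row layout the main script reads, just with one node
np.savetxt(outputfile, ([1], [myICV]), delimiter=',', fmt='%f')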
I am curious to see how it goes.

Corner Refine Method, CORNER_REFINE_SUBPIX, from ArUco with Python and OpenCV

Hi There
I want to increase the accuracy of the marker detection from aruco.detectMarkers. So I want to use the corner refine method CORNER_REFINE_SUBPIX, but I do not understand how to set it in Python.
Sample code:
import cv2 as cv
from cv2 import aruco

dictionary = aruco.Dictionary_get(aruco.DICT_4X4_50)  # whichever dictionary you use

frame = cv.imread("test.png")
gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
para = aruco.DetectorParameters_create()
det_corners, ids, rejected = aruco.detectMarkers(gray, dictionary, parameters=para)
aruco.drawDetectedMarkers(frame, det_corners, ids)
Things I have tried:
para.cornerRefinementMethod()
para.cornerRefinementMethod(aruco.CORNER_REFINE_SUBPIX)
para.cornerRefinementMethod.CORNER_REFINE_SUBPIX
para = aruco.DetectorParameters_create(aruco.CORNER_REFINE_SUBPIX)
para = aruco.DetectorParameters_create(para.cornerRefinementMethod(aruco.CORNER_REFINE_SUBPIX))
They did not work. I'm pretty new to Python and ArUco, so I hope that there is a simple and obvious solution.
I would also like to implement enclosed markers like in the documentation (page 4). Do you happen to know if there is a way to generate these enclosed markers in Python?
Concerning the first part of your question, you were pretty close: I assume your trouble is in switching and tweaking the "para" options. If so, you only need to set the corresponding value on the parameters object, like
para.cornerRefinementMethod = aruco.CORNER_REFINE_SUBPIX
Note that "aruco.CORNER_REFINE_SUBPIX" is simply an integer. You can verify this by typing type(aruco.CORNER_REFINE_SUBPIX) in the console. Thus assigning values to the "para" object works like mentioned above.
You might also want to tweak the para.cornerRefinementWinSize which seems to be implemented in units of code pixels, not actual image pixel units.
Concerning the second part, you might have to write a function that adds the boxes at the corner points, which you can get from the detectMarkers function. Note that the corner points are always ordered clockwise, so you can easily assign the correct offset values (like "up & left", "up & right", etc.).
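Putting the first part together, a minimal sketch (the dictionary choice is again just an example):
import cv2 as cv
from cv2 import aruco

dictionary = aruco.Dictionary_get(aruco.DICT_4X4_50)  # assumed dictionary
para = aruco.DetectorParameters_create()
para.cornerRefinementMethod = aruco.CORNER_REFINE_SUBPIX
para.cornerRefinementWinSize = 5  # subpixel search window size
gray = cv.cvtColor(cv.imread("test.png"), cv.COLOR_BGR2GRAY)
det_corners, ids, rejected = aruco.detectMarkers(gray, dictionary, parameters=para)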
para.cornerRefinementMethod = 1
may also work, since CORNER_REFINE_SUBPIX has the integer value 1.

Tesseract OCR fails to detect varying font size and letters that are not horizontally aligned

I am trying to detect the text on these price labels, which is always clearly preprocessed. Although tesseract can easily read the text written above the price, it fails to detect the price values; I am using the Python bindings pytesseract, and it also fails when run from the CLI. Most of the time it recognizes the price portion as only one or two characters.
Sample 1:
tesseract D:\tesseract\tesseract_test_images\test.png output
And the output of the sample image is this.
je Beutel
13
However, if I crop and stretch the price so the digits look separated and the same font size, the output is just fine.
Processed image (cropped and shrunk price):
je Beutel
1,89
How do I get tesseract OCR to work as I intended, since I will be going over a lot of similar images?
Edit: added more price tag images (sample5, sample6, sample7; not reproduced here).
The problem is that the image you are using is small. When tesseract processes the image, it considers '8', '9' and ',' to be a single letter and thus predicts '3', or it may consider '8' and ',' to be one letter and '9' a different letter, and so produces the wrong output. The image shown below explains it.
A simple solution is to increase the image size by a factor of 2 or 3, or even more depending on the size of your original image, before passing it to tesseract, so that it detects each letter individually, as shown below. (Here I increased the size by a factor of 2.)
Below is a simple Python script that should do the job:
import pytesseract
import cv2

img = cv2.imread('dKC6k.png')
img = cv2.resize(img, None, fx=2, fy=2)  # upscale by a factor of 2 in both axes
data = pytesseract.image_to_string(img)
print(data)
Detected text:
je Beutel
89
1.
Now you can simply extract the required data from the text and format it as per your requirement:
data = data.replace('\n\n', '\n')  # drop the blank lines
data = data.split('\n')  # -> ['je Beutel', '89', '1.', ...]
dollars = data[2].strip(',').strip('.')  # '1.' -> '1'
cents = data[1]  # '89'
print('{}.{}'.format(dollars, cents))
Desired Format:
1.89
The problem is that the Tesseract engine was not trained to read this kind of text topology.
You can:
train your own model; in particular, you will need to provide images with variations of the topology (positions of the characters). You can actually use the same image and shuffle the positions of the characters.
reorganize the image into clusters of text and use tesseract on those; in particular, I would take the cents part and move it to the right of the comma, in which case you can use tesseract out of the box. A few relevant criteria would be the height of the clusters (to differentiate cents from integers) and the position of the clusters (read from left to right).
In general, computer vision algorithms (including CNNs) give you tools to obtain a higher representation of an image (features or descriptors), but they do not create the logic or algorithm for processing those intermediate results in a certain way.
In your case that would be:
"if the height of those letters is smaller, it's cents",
"if the height and vertical position are the same, it's part of the same number, either on the left of the comma or on the right".
The thing is that this is difficult to reach through training, while at the same time it is extremely simple for a human to write as an algorithm. Sorry for not giving you an actual implementation; my text is the pseudo code.
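As a rough illustration of that pseudo code (my own sketch under assumed names, not a tested implementation), you could group per-character boxes by height with OpenCV contours; 'price.png' and the 0.7 height ratio are placeholders:
import cv2

# binarize a price image and find one bounding box per character
img = cv2.imread('price.png', cv2.IMREAD_GRAYSCALE)  # placeholder file name
_, thresh = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4 signature
boxes = [cv2.boundingRect(c) for c in contours]  # (x, y, w, h) per character

# "if the height of those letters is smaller, it's cents"
tallest = max(h for (x, y, w, h) in boxes)
integers = sorted((b for b in boxes if b[3] > 0.7 * tallest), key=lambda b: b[0])
cents = sorted((b for b in boxes if b[3] <= 0.7 * tallest), key=lambda b: b[0])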
TrainingTesseract2
TrainingTesseract4
Joint Unsupervised Learning of Deep Representations and Image Clusters

Using a function in a for loop python

I wrote a function to write a file based on different inputs. The function works fine when I use it for individual parts. However, when I use it in a for loop, it does not work, and I do not know why.
Below is my code
import controDictRead as CD
import numpy as np

Totallength = 180000
x = np.arange(0., Totallength, 1.)
h = 300 * np.ones(len(x))
Relaxationzone = 1000  # relaxation zone width
#I = 2
#x_input = 1000
#y_start = -200
y_end = 9
z = 0.05
nPoints = 4000
dx = 20000
threshold = 1000
x_input = np.zeros(int(np.floor(Totallength / dx)))
y_start = np.zeros(len(x_input))
I = np.zeros(len(x_input))
for i in range(len(x_input)):
    if i == 0:
        x_input[i] = Relaxationzone
    else:
        x_input[i] = dx * i
    if x_input[i] >= Totallength - Relaxationzone:
        x_input[i] = Totallength - Relaxationzone
    I[i] = i + 1
    y_start[i] = np.interp(x_input[i], x, h)
    controlDict_new = CD.controlDicWritter(I, x_input[i], y_start[i], y_end, z, nPoints)
All I am trying to do is write a file called controlDict. It has a list of locations with a format like the one below:
location1
{
type uniform;
axis y;
start (1000 -300 0.05 );
end (1000 9 0.05 );
nPoints 3000;
}
All my controlDictRead function does is find the location name I am interested in (e.g. location1) and then replace the values (e.g. start, end, nPoints) with the ones I input. This function works fine when I input these values one by one, for example
controlDict_new=CD.controlDicWritter(3,x_input[2],y_start[2],y_end,z,nPoints)
However, when I do it using a loop like the one shown at the beginning, it does not work: the loop gets stuck at the first location and keeps writing values onto the first location throughout the loop, and I am not sure why. Any suggestions?
It would be helpful to also see the source of the controDictRead module, but I have a suspicion about what might be wrong:
Your loop passes the entire array I to CD.controlDicWritter, but your working example passes a single integer (3). Is that intentional?
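If that is indeed the problem, passing the i-th element inside the loop (cast to int, to match the integer in your working call) should fix it:
# pass this location's counter value, not the whole array I
controlDict_new = CD.controlDicWritter(int(I[i]), x_input[i], y_start[i], y_end, z, nPoints)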

Max value of array won't save ASCII text to file

I am reading points from a frequency spectrum graph into an array like this:
self.iq_fft = self.dofft(self.iq)  # gets values of all data points in block (in dB)
x = self.iq_fft  # puts values into x for simplicity
maxi = max(x)
print maxi  # this part works fine

def dofft(self, iq):
    N = len(iq)
    iq_fft = scipy.fftpack.fftshift(scipy.fft(iq))  # shift zero frequency to the center
    return iq_fft  # (the snippet presumably returns this, since the print above works)
Everything works great and it prints out the maximum value of the array to the screen perfect, however, when I try to save the max value to a file, like this:
savetxt('myfilelocation', maxi, fmt='%1.4f')#saves max(x) in binary format
It ends up saving a binary-looking value to the text file, not the nice pretty ASCII one that printed to the screen. Funny thing is, when I just dump the whole array to a file it looks fine. Any ideas on how to fix this?
numpy.savetxt saves an array to a file, and it looks like maxi is a Python float. It's sort of unusual to use savetxt to save a scalar (why not just use Python's built-in file I/O functions?), but if you must, this should work:
savetxt('myfilelocation', [maxi], fmt='%1.4f')
Notice that I've put maxi in a list. Also, make sure that myfilelocation doesn't end with .gz; if it does, numpy will compress it. When in doubt, just use .txt, or no extension at all.
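For comparison, the built-in file I/O route mentioned above is just:
# format the scalar yourself and write it as plain text
with open('myfilelocation.txt', 'w') as f:
    f.write('%1.4f\n' % maxi)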
