I'm trying to create a new FITS file out of two older ones using PyFITS.
import pyfits
from sys import stdout
from sys import argv
import time
file1 = argv[1]
file2 = argv[2]
hdu1 = pyfits.open(file1)
hdu2 = pyfits.open(file2)
new0 = hdu1[0]
new1 = hdu1[0]
sci1 = hdu1[0].data
sci2 = hdu2[0].data
for r in range(0, len(sci1)):
    for c in range(0, len(sci1[r])):
        add = sci1[r][c] + sci2[r][c]
        new0.data[r][c] = add
for r in range(0, len(sci1)):
    for c in range(0, len(sci1[r])):
        print "(" + str(r) + ", " + str(c) + ") FirstVal = " + str(sci1[r][c]) + " || SecondVal = " + str(sci2[r][c])
        print "\t New File/Add = " + str(new0.data[r][c])
All it prints out is the first value, i.e. sci1[r][c]. This means that the variable isn't being modified at all. How can I make it modify? I'm very new to using FITS.
What you have done here is make sci1 a reference to new0.data: both names point to the same array, so the assignment to new0.data also changes sci1. The intended variable is being modified, but your print loop prints the same object twice.
If you want a copy instead of a reference, use the array's copy method, in this case sci1 = new0.data.copy()
This is also not how numpy, which pyfits uses to represent its images, is meant to be used. Instead of loops, you apply operations to whole arrays, which in most cases is easier to read and significantly faster. If you want to add two FITS images represented as numpy arrays in place:
new0.data += new1.data
print new0.data
or if you want to create a new image out of the sum of both inputs:
sum_image = new0.data + new1.data
# put it into a pyfits HDU (primary FITS extension)
hdu = pyfits.PrimaryHDU(data=sum_image)
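The difference between a reference and a copy can be seen without any FITS files. The following is a minimal sketch that uses small hand-made numpy arrays as stand-ins for hdu1[0].data and hdu2[0].data; the names data, other, alias and snapshot are made up for the illustration.

```python
import numpy as np

# Stand-ins for hdu1[0].data and hdu2[0].data (hypothetical values).
data = np.array([[1.0, 2.0], [3.0, 4.0]])
other = np.array([[10.0, 20.0], [30.0, 40.0]])

alias = data            # a second name for the same array, like sci1 above
snapshot = data.copy()  # an independent copy of the current values

data += other           # in-place addition over the whole array, no loops

print(alias is data)                 # both names refer to one object
print(alias[0, 0], snapshot[0, 0])   # the alias changed, the copy did not
```

Only the copy keeps the original values, which is why the print loop in the question shows the same number twice.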
So I have a set of positional data that comes from a factory sensor. It produces x, y and z values in meters relative to a known lat/long position. I have a function that will convert a distance in meters from the lat/long, but I need to feed the x and y data through a Pythagoras function to determine that distance. Let me try to clarify with an example of the JSON data the sensor gives.
[
  {
    "id": "84eb18677194",
    "name": "forklift_0001",
    "areaId": "Tracking001",
    "areaName": "Hall1",
    "color": "#FF0000",
    "coordinateSystemId": "CoordSys001",
    "coordinateSystemName": null,
    "covarianceMatrix": [
      0.82,
      -0.07,
      -0.07,
      0.55
    ],
    "position": [ #this is the x,y and z data, in meters from the ref point
      18.11,
      33.48,
      2.15
    ],
In this branch the forklift is 18.11m along and 33.48m up from the reference lat/long. The sensor is 2.15m high, which is a constant piece of info I don't need. To work out the distance from the reference point I need to use Pythagoras and then convert that result back into lat/long so my analysis tool can present it.
My problem (as far as Python goes) is that I can't figure out how to make it see 18.11 and 33.48 as the x and y and tell it to disregard 2.15 entirely. Here is what I have so far.
import math
import json
import pprint
import os
from glob import iglob
rootdir_glob = 'C:/Users/username/Desktop/test_folder**/*"' # Note the added asterisks, use forward slash
# This will return absolute paths
file_list = [f for f in
             iglob('C:/Users/username/Desktop/test_folder/13/00**/*', recursive=True)
             if os.path.isfile(f)]
for f in file_list:
    print('Input file: ' + f) # Replace with desired operations
    with open(f, 'r') as f:
        distros = json.load(f)
    output_file = 'position_data_blob_14' + str(output_nr) + '.csv' #output file name may be changed
def pythagoras(a,b):
    value = math.sqrt(a*a + b*b)
    return value
result = pythagoras(str(distro['position'])) #I am totally stuck here :/
print(result)
This piece of script is part of a wider project to parse the file by machine and people, and also by work and non-work times of day.
If someone could give me some tips on how to make the Pythagoras part work I'd be really grateful. I am not sure if I should define it as a function, but as I've typed this I am wondering if it should be a 'for' loop which uses the x and y and ignores the z.
All help really appreciated.
Try this:
position = distro['position'] # Get the full list
result = pythagoras(position[0], position[1]) # Get the first and second element from the list
print(result)
Why do you use str() on the function's argument? What were you trying to do?
You're passing one input, a list of numbers, into a function that takes two numbers as input. There are two solutions to this - either change what you pass in, or change the function.
distro['position'] = [18.11, 33.48, 2.15], so for the first solution all you need to do is pass in distro['position'][0] and distro['position'][1]:
result = pythagoras(distro['position'][0], distro['position'][1])
Alternatively (which in my opinion is more elegant), pass in the list to the function and have the function extract the values it cares about:
def pythagoras(input_triple):
    a, b, c = input_triple  # c (the height) is unpacked but not used
    value = math.sqrt(a*a + b*b)
    return value
result = pythagoras(distro['position'])
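As a further variation on the answers above, the standard library's math.hypot computes the same square root of a sum of squares, and slicing the list keeps only x and y. This is a minimal sketch, not the asker's actual script: the sample string is an abridged stand-in for one sensor file, and planar_distance is a name made up for the illustration.

```python
import json
import math

# Abridged stand-in for one sensor file; only the fields used here are kept.
sample = '[{"name": "forklift_0001", "position": [18.11, 33.48, 2.15]}]'

def planar_distance(position):
    """Distance from the reference point using x and y only; z is ignored."""
    x, y = position[:2]
    return math.hypot(x, y)

for distro in json.loads(sample):
    # Each entry is a dict, so the position list is looked up by key.
    print(distro["name"], planar_distance(distro["position"]))
```

Slicing with position[:2] means the function also works if the sensor ever reports only two coordinates.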
The solution I used was:
for f in file_list:
    print('Input file: ' + f) # Replace with desired operations
    with open(f, 'r') as f:
        distros = json.load(f)
        output_file = '13_01' + str(output_nr) + '.csv' #output file name may be changed
        with open(output_file, 'w') as text_file:
            for distro in distros:
                position = distro['position']
                # no trailing comma here, otherwise result becomes a one-element tuple
                result = math.sqrt(position[0]*position[0] + position[1]*position[1])
                print(result, file=text_file)
    print('Output written to file: ' + output_file)
    output_nr = output_nr + 1
Did you check with the data type of the parameters you're passing?
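def pythagoras(a,b):
    value = math.sqrt(int(a)**2 + int(b)**2)
    return value
This works when the inputs are integers. For decimal values such as 18.11, use float() instead, because int() truncates the fractional part.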
I would really appreciate your help simplifying some code that lets me run a data analysis on Salesforce leads.
I have the following dataframes, which as you can see are split because I cannot handle more than 550 objects in a single list.
Iterlist = list()
for x in range(0, int(len(List)/550)+1):
    m = List[550*x: 550*x+550]
    Iterlist.append(m)
Iterlist0= pd.DataFrame(Iterlist[0])
Iterlist1= pd.DataFrame(Iterlist[1])
Iterlist2= pd.DataFrame(Iterlist[2])
...and so on until the initial longer list is split
...
converted in the following lists for formatting reasons:
A= Iterlist0["Id"].tolist()
mylistA = "".join(str(x)+ "','" for x in A)
mylistA = mylistA[:-2]
mylistA0 = "('" + mylistA + ")"
B = Iterlist1["Id"].tolist()
mylistB = "".join(str(x)+ "','" for x in B)
mylistB = mylistB[:-2]
mylistB1 = "('" + mylistB + ")"
C = Iterlist2["Id"].tolist()
mylistC = "".join(str(x)+ "','" for x in C)
mylistC = mylistC[:-2]
mylistC2 = "('" + mylistC + ")"
and so on...
...
I want to create a loop that allows me to query from salesforce each of the lists using the following code template for example:
queryA='SELECT '+cols[1]+', '+cols[2]+', '+cols[3]+', '+cols[4]+', '+cols[5]+', '+cols[6]+', '+cols[7]+', '+cols[8]+' FROM LeadHistory WHERE LeadId IN '+mylistA0
and then finally:
sf = Salesforce(username='xxx', password='xxx', security_token='xxx')
leadhistory = sf.query_all(queryA)
I don't want to write out numerous dataframes, lists and queries with specific names over and over to get to the result. I would like a single line for each of the steps written above, and to let Python handle the naming automatically according to the number of 550-element lists.
I am new to this programming language and any tip would help me a lot. I think it is possible to simplify this a lot, but I have no idea how it can be done.
Thanks in advance!
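One possible way to avoid the numbered variables is to keep the chunks and the queries in lists and loop over them. The following is only a sketch under stated assumptions: chunked and build_query are hypothetical helper names, and the column names and IDs are placeholders standing in for cols[1] to cols[8] and the real list of lead IDs.

```python
def chunked(items, size=550):
    """Yield consecutive slices of items, each with at most size elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def build_query(columns, lead_ids):
    """Build one query string for a single chunk of lead IDs."""
    id_list = "('" + "','".join(str(i) for i in lead_ids) + "')"
    return ("SELECT " + ", ".join(columns) +
            " FROM LeadHistory WHERE LeadId IN " + id_list)

# Placeholder inputs (assumptions, not the real Salesforce fields or IDs).
columns = ["Id", "LeadId", "Field"]
all_ids = ["ID%04d" % n for n in range(1200)]

# One query per chunk, held in a list instead of queryA, queryB, ...
queries = [build_query(columns, chunk) for chunk in chunked(all_ids)]
print(len(queries))
```

Each string in queries could then be passed to sf.query_all in a loop, appending every result to a list, so that nothing needs an individually numbered name.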
Python/Numpy Problem. Final year Physics undergrad... I have a small piece of code that creates an array (essentially an n×n matrix) from a formula. I reshape the array to a single column of values, create a string from that, format it to remove extraneous brackets etc, then output the result to a text file saved in the user's Documents directory, which is then used by another piece of software. The trouble is above a certain value for "n" the output gives me only the first and last three values, with "...," in between. I think that Python is automatically abridging the final result to save time and resources, but I need all those values in the final text file, regardless of how long it takes to process, and I can't for the life of me find how to stop it doing it. Relevant code copied beneath...
import numpy as np; import os.path ; import os
'''
Create a single column matrix in text format from Gaussian Eqn.
'''
save_path = os.path.join(os.path.expandvars("%userprofile%"),"Documents")
name_of_file = 'outputfile' #<---- change this as required.
completeName = os.path.join(save_path, name_of_file+".txt")
matsize = 32
def gaussf(x,y): #defining gaussian but can be any f(x,y)
    pisig = 1/(np.sqrt(2*np.pi) * matsize) #first term
    sumxy = (-(x**2 + y**2)) #sum of squares term
    expden = (2 * (matsize/1.0)**2) # 2 sigma squared
    expn = pisig * np.exp(sumxy/expden) # and put it all together
    return expn
matrix = [[ gaussf(x,y) ]\
          for x in range(-matsize/2, matsize/2)\
          for y in range(-matsize/2, matsize/2)]
zmatrix = np.reshape(matrix, (matsize*matsize, 1))  # column
string2 = (str(zmatrix).replace('[','').replace(']','').replace(' ', ''))
zbfile = open(completeName, "w")
zbfile.write(string2)
zbfile.close()
print completeName
num_lines = sum(1 for line in open(completeName))
print num_lines
Any help would be greatly appreciated!
Generally you should iterate over the array/list if you just want to write the contents.
zmatrix = np.reshape(matrix, (matsize*matsize, 1))
with open(completeName, "w") as zbfile: # with closes your files automatically
    for row in zmatrix:
        zbfile.writelines(map(str, row))
        zbfile.write("\n")
Output:
0.00970926751178
0.00985735189176
0.00999792646484
0.0101306077521
0.0102550302672
0.0103708481917
0.010477736974
0.010575394844
0.0106635442315
.........................
But using numpy we simply need to use tofile:
zmatrix = np.reshape(matrix, (matsize*matsize, 1))
# pass sep or you will get binary output
zmatrix.tofile(completeName,sep="\n")
Output is in the same format as above.
Calling str on the matrix gives you the same formatted output you get when you print it, so what you are writing to the file is that formatted, truncated output.
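The truncation can be reproduced in a few lines. This sketch assumes numpy's default print options, under which arrays with more than 1000 elements are summarised; the array z and the temporary output path are stand-ins for zmatrix and completeName, and np.savetxt is used here as one more way to write every value.

```python
import os
import tempfile
import numpy as np

# Stand-in for zmatrix: 32 * 32 = 1024 values in a single column.
z = np.arange(32 * 32, dtype=float).reshape(-1, 1)

# str() follows numpy's print options and summarises large arrays with "...".
truncated = "..." in str(z)

# np.savetxt writes every element, one row per line, regardless of size.
path = os.path.join(tempfile.mkdtemp(), "outputfile.txt")
np.savetxt(path, z, fmt="%.12g")
with open(path) as handle:
    num_lines = sum(1 for _ in handle)

print(truncated, num_lines)
```

With matsize = 32 the array has 1024 elements, which is just over that default threshold and would explain why the problem only appears above a certain n.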
Since you are using Python 2, xrange would be more efficient than range, which creates a list. Having multiple imports separated by semicolons is also not recommended; you can simply write:
import numpy as np, os.path, os
Variable and function names should also use underscores: z_matrix, zb_file, complete_name, etc.
You shouldn't need to fiddle with the string representations of numpy arrays. One way is to use tofile:
zmatrix.tofile('output.txt', sep='\n')
I am new to Python and want an auto-completing variable inside a while loop. I'll try to give a minimal example. Let's assume I have the following files in my folder, each starting with the same letters and an increasing number, but ending with what are effectively random numbers:
a_i=1_404
a_i=2_383
a_i=3_180
I want a while loop like
while 1 <= 3:
old_timestep = 'a_i=n_*'
actual_timestep = 'a_i=(n+1)_*'
... (some functions that need the filenames saved in the two above initialised variables)
n = n+1
So if I start the loop I want it to automatically process all files in my directory. As a result there are two questions:
1) How do I tell python that (in my example I used the '*') I want the filename to be completed automatically?
2) How do I use a formula inside the filename (in my example the '(n+1)')?
Many thanks in advance!
1) To my knowledge you can't do that automatically. I would store all filenames in the directory in a list and then do a search through that list
from os import listdir
from os.path import isfile, join
dir_files = [ f for f in listdir('.') if isfile(join('.',f)) ]
n = 1
while n <= 3:
    old_timestep = "a_i=" + str(n) + "_"  # prefix for the current value of n
    for f in dir_files:
        if f.startswith(old_timestep):
            pass  # process file
    n += 1
2) You can use string concatenation
f = open("a_i=" + str(n + 1) + "remainder of filename", 'w')
You can use the glob module to do * expansion:
import glob
# glob.glob returns a list, so take its first match
old = glob.glob('a_i={}_*'.format(n))[0]
actual = glob.glob('a_i={}_*'.format(n + 1))[0]
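Putting the glob approach into the loop from the question might look like the sketch below. It creates the three example files in a scratch directory purely for illustration; find_timestep is a hypothetical helper name, and raising an error when there is not exactly one match is a design choice, not something the question specifies.

```python
import glob
import os
import tempfile

# Create the three example files in a scratch directory (illustration only).
workdir = tempfile.mkdtemp()
for name in ("a_i=1_404", "a_i=2_383", "a_i=3_180"):
    open(os.path.join(workdir, name), "w").close()

def find_timestep(directory, n):
    """Return the single file whose name starts with 'a_i=<n>_'."""
    matches = glob.glob(os.path.join(directory, "a_i={}_*".format(n)))
    if len(matches) != 1:
        raise ValueError("expected one match for n={}, got {}".format(n, len(matches)))
    return matches[0]

pairs = []
n = 1
while n < 3:  # stop one short so that the file for n + 1 still exists
    old_timestep = find_timestep(workdir, n)
    actual_timestep = find_timestep(workdir, n + 1)
    pairs.append((os.path.basename(old_timestep), os.path.basename(actual_timestep)))
    n += 1

print(pairs)
```

The format call answers the second part of the question: the expression n + 1 is evaluated first and its result is inserted into the pattern.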
I found a way to solve 1) myself by first doing a) then b):
1) a) Truncate the file names so that only the first x characters are left:
for i in {1..3}; do
    cp a_i=$((i))_* a_i=$((i))
done
b) Build the now predictable file names from n inside the loop:
n = 1
while n <= 3:
    old = 'a_i=' + str(n)
    new = 'a_i=' + str(n+1)
    # ... use old and new here ...
    n = n + 1
Here str() converts the integer n into a string so that it can be concatenated.
Thanks for your input!
First-time post and python newb who has exhausted all other options. I am interested in appending selected raster properties (using the arcpy.GetRasterProperties_management(input_raster, "property_type") function) to a comma-delimited table, but am having trouble figuring out how to do this for multiple results. As an abridged example (of my actual script), I have created two 'for' loops; one for each raster property I am interested in outputting (i.e. Cell Size X, Cell Size Y). My list of rasters include S01Clip_30m through S05Clip_30m. My goal is to create a .txt file that should look something like this:
RasterName, CellSizeX, CellSizeY
S01Clip_30m, 88.9372, 88.9375
S02Clip_30m, 88.9374, 88.9371
The code I have so far is below (with some uncertain, botched syntax at the bottom). When I run it, I get this result:
S05Clip_30m, 88.9374
(last raster in the list, CellSizeY)
I appreciate any help you can provide on the crucial bottom code block.
import arcpy
from arcpy import env
env.workspace = ('C:\\StudyAreas\\Aggregates.gdb')
InFolder = ('C:\\dre\\python\\tables')
OutputFile = open(InFolder + '\\' + 'RasterProps.txt', 'a')
rlist = arcpy.ListRasters('*','*')
for grid in rlist:
    if grid[-8:] == "Clip_30m":
        result = arcpy.GetRasterProperties_management(grid,'CELLSIZEX')
        CellSizeX = result.getOutput(0)
for grid in rlist:
    if grid[-8:] == "Clip_30m":
        result = arcpy.GetRasterProperties_management(grid,'CELLSIZEY')
        CellSizeY = result.getOutput(0)
> I know the syntax below is incorrect, but I know there are *some* elements that
> should be included based on other example scripts that I have...
> if result.getOutput(0) == CellSizeX:
> coltype = CellSizeX
> elif result.getOutput(0) == CellSizeY:
> coltype = CellSizeY
> r = ''.join(grid)
> colname = r[0:]
> OutputFile.writelines(colname+','+coltype+'\n')
After receiving help from another Q&A forum on my script, I am now providing the answer to my own GIS-related question to close this thread (and move to gis.stackexchange :) - thanks to L.Yip's comment). Here is the final corrected script which outputs my two raster properties (Cell Size in X-direction, Cell Size in Y-direction) for a list of rasters into a .txt file:
import arcpy
from arcpy import env
env.workspace = ('C:\\StudyAreas\\Aggregates.gdb')
InFolder = ('C:\\dre\\python\\tables')
OutputFile = open(InFolder + '\\' + 'RasterProps.txt', 'a')
rlist = arcpy.ListRasters('*','*')
for grid in rlist:
    if grid[-8:] == "Clip_30m":
        resultX = arcpy.GetRasterProperties_management(grid,'CELLSIZEX')
        CellSizeX = resultX.getOutput(0)
        resultY = arcpy.GetRasterProperties_management(grid,'CELLSIZEY')
        CellSizeY = resultY.getOutput(0)
        OutputFile.write(grid + ',' + str(CellSizeX) + ',' + str(CellSizeY) + '\n')
OutputFile.close()
My results after running the script:
S01Clip_30m,88.937158083333,88.9371580833333
S02Clip_30m,88.937158083333,88.937158083333
S03Clip_30m,88.9371580833371,88.9371580833333
S04Clip_30m,88.9371580833308,88.937158083333
S05Clip_30m,88.9371580833349,88.937158083333
Thanks!
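The table described in the question also had a header row, which the script above does not write. A possible way to add it is the standard csv module, sketched here without arcpy: get_cell_sizes is a hypothetical stand-in for the two GetRasterProperties_management calls, its return values are placeholders copied from the example table in the question, and the raster names are made up.

```python
import csv
import io

def get_cell_sizes(raster_name):
    """Hypothetical stand-in for the two arcpy property lookups."""
    return "88.9372", "88.9375"  # placeholder values from the question's example

# Made-up raster list; the last name should be skipped by the filter.
raster_names = ["S01Clip_30m", "S02Clip_30m", "DEM_10m"]

buffer = io.StringIO()  # swap for open(path, "w", newline="") to write a real file
writer = csv.writer(buffer)
writer.writerow(["RasterName", "CellSizeX", "CellSizeY"])  # header, written once
for name in raster_names:
    if name.endswith("Clip_30m"):
        cell_x, cell_y = get_cell_sizes(name)
        writer.writerow([name, cell_x, cell_y])

print(buffer.getvalue())
```

Writing the header before the loop and one row per raster inside it keeps both properties on the same line, which is what the two separate loops in the original attempt could not do.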