Reading / modifying fortran file - python

I have a big fortran (f90) file that is used to generate an input file for numerical simulations. I would like to create a GUI to set the parameters of the simulations written within the f90 file.
Right now I am trying to create a python script that can find, store and modify (if necessary) these variables (there are a lot of them).
I am trying to access values that are stored this way (this is a shortened version):
! ib : ply orientation
! ply at 0 : ib = 1
! ply at 90 : ib = 2
! ply at 45 : ib = 3
! ply at -45 : ib = 4
a = 7500.
b = 8.
e = 579.
!--------------Centre
x0 = 0.
y0 = 0.
!--------------Radius
Rfoc = 1600.
!--------------End of geometric parameters
So far, the code looks like this:
import re

geomVar = ["a =", "b =", "e =", "Rfoc ="]
endGeom = "End of geometric parameters"

def goFetchGeom(fileNameVariable, valueInput):
    valueOutput = []
    for line in fileNameVariable.readlines():
        # Looking for value numbers associated with search value
        if any(string in line for string in valueInput):
            result = [float(s) for s in re.findall(r'\b\d+\b', line)]
            valueOutput.append(result[0])
        # End of search loop to avoid reading the whole file
        elif endGeom in line:
            fileNameVariable.seek(0)  # rewind the file
            return valueOutput

with open("mesh_Comp_UD.f90", "r+") as f:
    geomValue = goFetchGeom(f, geomVar)

print geomValue
First problem: it picks up lines that contain "ib = ..." because of the substring "b =" stored in geomVar.
Second problem: if a line has multiple variables, I only get the first one, not all of them.
Because they are numerous, I would like to pass the variables I'm looking for to the function as a list. Moreover, since the file is very long, I do not want to restart the search from scratch for every variable.
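One way to address both problems at once (shown below as a minimal sketch, not a definitive fix) is to anchor the search on whole variable names with word boundaries and to collect every assignment found on a line. The variable list, end marker and file name are taken from the code above; returning a dict instead of a list is my own choice so the file only needs to be scanned once:

import re

geom_vars = ["a", "b", "e", "Rfoc"]
end_geom = "End of geometric parameters"

# \b word boundaries keep "b" from matching inside "ib";
# the second group captures the number that follows "=".
assign_re = re.compile(r'\b(%s)\b\s*=\s*([-+]?\d+\.?\d*)' % "|".join(geom_vars))

def go_fetch_geom(path):
    values = {}
    with open(path) as f:
        for line in f:
            for name, number in assign_re.findall(line):
                values[name] = float(number)
            if end_geom in line:  # stop once the geometric block ends
                break
    return values

print(go_fetch_geom("mesh_Comp_UD.f90"))
# e.g. {'a': 7500.0, 'b': 8.0, 'e': 579.0, 'Rfoc': 1600.0}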

Related

How can I extract starting line numbers associated with function definitions of a C program, by writing a parser in python?

I am trying to write a Python script to extract the starting line numbers of function definitions in a C program. I used a C parsing library in Python, called "Pyclibrary", and I am using it to extract function names from my C file. I then put these names in a list, iterate through it, search for the line numbers where they are found, and delete duplicates by storing only the first instance of each search. But it fails for those cases where the first instance is not the function definition. I need to refine my logic. Any leads would be appreciated.
Here is my code:
from pyclibrary import CParser
from pyclibrary import CLibrary
import pandas as pd

parser = CParser(['path/to/c/file/sample.c'])
my_list = []
list_of_func = []
d1 = []
d2 = []
d3 = []
func1 = parser.defs['functions']
inside_function = 0
left_brack_num = 0

for i in func1:
    my_list.append(str(i))

with open('path/to/c/file/sample.c') as myFile:
    for num, line in enumerate(myFile, 1):
        for i in range(len(my_list)):
            if my_list[i] in line:
                list_of_func.append([my_list[i], num])
                d1.append(my_list[i])
                d2.append(num)
                inside_function = 1
        if inside_function == 1:
            left_brack_num += line.count("{")
            if "}" in line:
                left_brack_num -= line.count("}")
                if left_brack_num == 0:
                    d3.append(num)
                    inside_function = 0

Data = {'Function Name': d1, 'Starting Line number': d2}
df2d = pd.DataFrame(Data)
df2d.drop_duplicates(subset='Function Name',
                     keep='first', inplace=True)
snd = pd.Series(list_of_func)
print(df2d)
Manually parsing a C file is generally a bad idea; there are a lot of corner cases and you will end up reinventing the wheel.
If you can compile your file with debug symbols, you can find your symbols easily with:
nm -l ./foo --defined-only | grep :
Where:
nm Lists the symbols defined in the binary
-l Writes the file and line number where the symbol is defined
The grep keeps only the user-defined symbols.
For instance, if I try with this file:
int a;
int f1(){}
int f2(){}
int main(){}
Compiled with gcc -o foo foo.c -g, I get the following symbols:
000000000000402c B a /home/user/foo.c:1
0000000000001125 T f1 /home/user/foo.c:2
000000000000112c T f2 /home/user/foo.c:3
0000000000001133 T main /home/user/foo.c:4
Please note that I get both the functions and the global variables. If you want only the functions, you can filter on the 2nd field and keep only the entries with the T value.
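If you then want that information back in Python, here is a rough sketch of parsing the nm output (the binary name ./foo is an assumption carried over from the example above, and the binary must have been built with -g):

import subprocess

# Run nm and capture its text output.
output = subprocess.check_output(
    ["nm", "-l", "--defined-only", "./foo"], universal_newlines=True)

for line in output.splitlines():
    fields = line.split()
    # Expected fields: address, symbol type, name, file:line.
    # Keep only functions, i.e. symbols of type 'T'.
    if len(fields) >= 4 and fields[1] == "T":
        name, location = fields[2], fields[3]
        print(name, location)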
If you really want to start from your C file, you might want to use cscope (see this post).

How to read in multiple documents with the same code?

So I have a couple of documents, each of which has an x and y coordinate (among other stuff). I wrote some code which is able to filter out said x and y coordinates and store them into float variables.
Now ideally I'd want to find a way to run the same code on all the documents I have (the number is not fixed, but let's say 3 for now), extract the x and y coordinates of each document, and calculate an average of these 3 x-values and 3 y-values.
How would I approach this? I've never done this before.
I successfully created the code to extract the relevant data from 1 file.
Also note: In reality each file has more than just 1 set of x and y coordinates but this does not matter for the problem discussed at hand.
I'm just saying that so that the code does not confuse you.
with open('TestData.txt', 'r') as f:
    full_array = f.readlines()

del full_array[1:31]
del full_array[len(full_array)-4:len(full_array)]

single_line = full_array[1].split(", ")

x_coord = float(single_line[0].replace("1 Location: ", ""))
y_coord = float(single_line[1])
size = float(single_line[3].replace("Size: ", ""))

# Remove unnecessary stuff
category = single_line[6].replace(" Type: Class: 1D Descr: None", "")
In the end I'd like not to have to write the same code again for each file, especially since the number of files may vary. Right now I have 3 files, which equals 3 sets of coordinates, but on another day I might have 5, for example.
Use os.walk to find the files that you want. Then for each file, do your calculation.
https://docs.python.org/2/library/os.html#os.walk
First of all, create a function that reads a file via its file name and does the parsing your way. Then iterate through the directory; I assume the files are all in the same directory.
Here is the basic code:
import os

def readFile(filename):
    try:
        with open(filename, 'r') as file:
            data = file.read()
        return data
    except:
        return ""

directory = 'C:\\Users\\UserName\\Documents'
for filename in os.listdir(directory):
    #print(filename)
    # os.listdir returns bare names, so join them with the directory path
    data = readFile(os.path.join(directory, filename))
    print(data)
    #parse here
    #do the calculation here
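Building on that, a minimal sketch of the averaging step. parse_coords is a hypothetical helper that wraps the extraction code from the question; the directory path and the .txt filter are assumptions:

import os

def parse_coords(path):
    # Extraction logic taken from the question, applied to one file.
    with open(path, 'r') as f:
        full_array = f.readlines()
    del full_array[1:31]
    del full_array[len(full_array)-4:len(full_array)]
    single_line = full_array[1].split(", ")
    x = float(single_line[0].replace("1 Location: ", ""))
    y = float(single_line[1])
    return x, y

directory = 'C:\\Users\\UserName\\Documents'
coords = [parse_coords(os.path.join(directory, name))
          for name in os.listdir(directory)
          if name.endswith('.txt')]

avg_x = sum(x for x, _ in coords) / len(coords)
avg_y = sum(y for _, y in coords) / len(coords)
print(avg_x, avg_y)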

Using python to process a LaTeX file and create random questions

I am using python to pre-process a LaTeX file before running it through the LaTeX compiler. I have a python script which reads a .def file. The .def file contains some python code at the top which helps to initialize problems with randomization. Below the python code I have LaTeX code for the problem. For each variable in the LaTeX code, I use the symbol # to signify that it should be randomized and replaced before compiling. For example, I may write #a to be a random integer between 1 and 10.
There may be larger issues with what I'm trying to do, but so far it's working mostly as I need it to. Here is a sample .def file:
a = random.choice(range(-3,2))
b = random.choice(range(-6,1))
x1 = random.choice(range(a-3,a))
x2 = x1+3
m = 2*x1 - 2*a + 3
y1 = (x1-a)**2+b
y2 = (x2-a)**2+b
xmin = a - 5
xmax = a + 5
ymin = b-1
ymax = b+10
varNames = [["#a", str(a)],["#b", str(b)], ["#x1",str(x1)], ["#x2",str(x2)], ["#m", str(m)], ["#y1",str(y1)], ["#y2",str(y2)], ["#xmin", str(xmin)], ["#xmax", str(xmax)], ["#ymin", str(ymin)], ["#ymax", str(ymax)]]
#####
\question The graph of $f(x) = (x-#a)^2+#b$ is shown below along with a secant line between the points $(#x1,#y1)$ and $(#x2,#y2)$.
\begin{center}
\begin{wc_graph}[xmin=#xmin, xmax=#xmax, ymin=#ymin, ymax=#ymax, scale=.75]
\draw[domain=#a-3.16:#a + 3.16, smooth] plot ({\x}, {(\x-#a)^2+#b});
\draw (#x1,#y1) to (#x2,#y2);
\pic at (#x1,#y1) {closed};
\pic at (#x2,#y2) {closed};
\end{wc_graph}
\end{center}
\begin{parts}
\part What does the slope of the secant line represent?
\vfill
\part Compute the slope of the secant line.
\vfill
\end{parts}
\newpage
As you can see, removing the #a and replacing it with the actual value of a is starting to become tedious. In my python script, I just replace all of the #ed things in my latexCode string.
for x in varNames:
    latexCode = latexCode.replace(x[0], x[1])
which seems to work okay. However, it seems obnoxious and ugly.
My Question: Is there a better way of working between the variable identifier and the string version of the identifier? It would be great if I could simply make a python list of variable names I'm using in the .def file, and then run a simple command to update the LaTeX code. What I've done is cumbersome and painful. Thanks in advance!
Yes: either eval(name), getattr(obj, name), or globals()[name]. In this case I'd say globals()[name].
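A minimal sketch of that idea, assuming the replacement runs in the same module where the .def assignments were executed (so globals() can see a, b, x1, ...) and that latexCode already holds the template:

# Only the bare variable names are listed; globals() supplies their values.
var_names = ["a", "b", "x1", "x2", "m", "y1", "y2",
             "xmin", "xmax", "ymin", "ymax"]

# Replace longer names first so a short placeholder like "#x1" can never
# clobber a longer one that starts with the same characters.
for name in sorted(var_names, key=len, reverse=True):
    latexCode = latexCode.replace("#" + name, str(globals()[name]))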
Also vars would work:
https://docs.python.org/3/library/functions.html#vars
In the following fragment it's e.g. used to make objects know their own name:
def _setPublicElementNames(self):
    for var in vars(self):
        if not var.startswith('_'):
            getattr(self, var)._name = var
Yet another, unnecessarily complicated, solution would be to generate a .py file with the right replace statements from your .def file.

Create name value pairs in python

I have a python script that has the following output stored in a variable called jvmData:
Stats name=jvmRuntimeModule, type=jvmRuntimeModule#
{
name=HeapSize, ID=1, description=The total memory (in KBytes) in the Java virtual machine run time., unit=KILOBYTE, type=BoundedRangeStatistic, lowWaterMark=1048576, highWaterMark=1048576, current=1048576, integral=0.0, lowerBound=1048576, upperBound=2097152
name=FreeMemory, ID=2, description=The free memory (in KBytes) in the Java virtual machine run time., unit=KILOBYTE, type=CountStatistic, count=348466
name=UsedMemory, ID=3, description=The amount of used memory (in KBytes) in the Java virtual machine run time., unit=KILOBYTE, type=CountStatistic, count=700109
name=UpTime, ID=4, description=The amount of time (in seconds) that the Java virtual machine has been running., unit=SECOND, type=CountStatistic, count=3706565
name=ProcessCpuUsage, ID=5, description=The CPU Usage (in percent) of the Java virtual machine., unit=N/A, type=CountStatistic, count=0
}
What I would like to do is simply print out name/value pairs for the important parts, which in this case would simply be:
HeapSize=1048576
FreeMemory=348466
UsedMemory=700109
UpTime=3706565
ProcessCpuUsage=0
I'm not at all good with Python :) The only solution in my head seems very long-winded: split the lines, throw away the first, second and last lines, then loop through each line with different cases (sometimes current, sometimes count), find the length of the string, etc.
Perhaps (well, definitely) I am missing something: some nice function I can use to put these into the equivalent of a Java HashMap or something?
The "equivalent of a java HashMap" would be called a dictionary in python. As for how to parse this, just iterate over the lines that contain the data, make a dict of all key/value pairs in the line and have a special case for the HeapSize:
jvmData = "..." #the string holding the data
jvmLines = jvmData.split("\n") #a list of the lines in the string
lines = [l.strip() for l in jvmLines if "name=" in l] #filter all data lines
result = {}
for line in lines:
data = dict(s.split("=") for s in line.split(", "))
#the value is in "current" for HeapSize or in "count" otherwise
value = data["current"] if data["name"] == "HeapSize" else data["count"]
result[data["name"]] = value
As you seem to be stuck on Jython2.1, here's a version that should work with it (obviously untested). Basically the same as above, but with the list comprehension and generator expression replaced by filter and map respectively, and without using the ternary if/else operator:
jvmData = "..." #the string holding the data
jvmLines = jvmData.split("\n") #a list of the lines in the string
lines = filter(lambda x: "name=" in x, jvmLines) #filter all data lines
result = {}
for line in lines:
data = dict(map(lambda x: x.split("="), line.split(", ")))
if data["name"] == "HeapSize":
result[data["name"]] = data["current"]
else:
result[data["name"]] = data["count"]
Try something using the find function and a small regular expression:
import re

final_map = {}
NAME = 'name='
COUNT = 'count='
HIGHWATERMARK = "highWaterMark="

def main():
    with open(r'<file_location>', 'r') as file:
        lines = [line for line in file if re.search(r'^name', line)]
    for line in lines:
        sub = COUNT if line.find(COUNT) != -1 else HIGHWATERMARK
        final_map[line[line.find(NAME)+5:line.find(',')]] = line[line.find(sub)+len(sub):].split(',')[0].strip()
        print line[line.find(NAME)+5:line.find(',')]+'='+final_map[line[line.find(NAME)+5:line.find(',')]]

if __name__ == '__main__':
    main()
Output:
HeapSize=1048576
FreeMemory=348466
UsedMemory=700109
UpTime=3706565
ProcessCpuUsage=0

Python: How to extract string from text file to use as data

This is my first time writing a Python script and I'm having some trouble getting started. Let's say I have a txt file named Test.txt that contains this information:
x y z Type of atom
ATOM 1 C1 GLN D 10 26.395 3.904 4.923 C
ATOM 2 O1 GLN D 10 26.431 2.638 5.002 O
ATOM 3 O2 GLN D 10 26.085 4.471 3.796 O
ATOM 4 C2 GLN D 10 26.642 4.743 6.148 C
What I want to do is eventually write a script that will find the center of mass of these three atoms. So basically I want to sum up all of the x values in that txt file with each number multiplied by a given value depending on the type of atom.
I know I need to define the positions for each x-value, but I'm having trouble figuring out how to make these x-values be represented as numbers instead of text from a string. I have to keep in mind that I'll need to multiply these numbers by a value that depends on the type of atom, so I need a way to keep them defined for each atom type. Can anyone push me in the right direction?
mass_dictionary = {'C': 12.0107,
                   'O': 15.999
                   #Others...?
                   }

# If your files are this structured, you can just
# hardcode some column assumptions.
coords_idxs = [6, 7, 8]
type_idx = 9

# Open file, get lines, close file.
# Probably prudent to add try-except here for bad file names.
f_open = open("Test.txt", 'r')
lines = f_open.readlines()
f_open.close()

# Initialize an array to hold needed intermediate data.
output_coms = []
total_mass = 0.0

# Loop through the lines of the file.
for line in lines:
    # Split the line on white space.
    line_stuff = line.split()
    # If the line is empty or fails to start with 'ATOM', skip it.
    if (not line_stuff) or (not line_stuff[0] == 'ATOM'):
        pass
    # Otherwise, append the mass-weighted coordinates to a list and increment total mass.
    else:
        output_coms.append([mass_dictionary[line_stuff[type_idx]]*float(line_stuff[i]) for i in coords_idxs])
        total_mass = total_mass + mass_dictionary[line_stuff[type_idx]]

# After getting all the data, finish off the averages.
avg_x, avg_y, avg_z = tuple(map(lambda x: (1.0/total_mass)*sum(x), [[elem[i] for elem in output_coms] for i in [0, 1, 2]]))

# A lot of this will be better with NumPy arrays if you'll be using this often or on
# larger files. Python Pandas might be an even better option if you want to just
# store the file data and play with it in Python.
Basically, using the open function in Python you can open any file. So you can do something as follows (the following snippet is not a solution to the whole problem, but an approach):
def read_file():
    f = open("filename", 'r')
    for line in f:
        line_list = line.split()
        ....
        ....
    f.close()
From this point on you have a nice setup of what you can do with these values. Basically, the second line just opens the file for reading. The third line defines a for loop that reads the file one line at a time, and each line goes into the line variable.
The last line in that snippet basically breaks the string, at every whitespace, into a list. So line_list[0] will be the value in your first column, and so forth. From this point, if you have any programming experience, you can just use if statements and such to get the logic that you want.
** Also keep in mind that the values stored in that list will all be strings, so if you want to perform any arithmetic operations such as adding, you have to be careful.
* Edited for syntax correction
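To make that caveat concrete, a tiny self-contained example (the column indices are an assumption based on the sample lines above):

sample = "ATOM 1 C1 GLN D 10 26.395 3.904 4.923 C"
line_list = sample.split()

# Columns 6-8 are still strings after split(); convert them before doing math.
x, y, z = float(line_list[6]), float(line_list[7]), float(line_list[8])
print(x, y, z)  # 26.395 3.904 4.923, now as floats ready for arithmetic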
If you have pandas installed, check out the read_fwf function, which imports a fixed-width file and creates a DataFrame (a 2-d tabular data structure). It'll save you lines of code on import and also give you a lot of data munging functionality if you want to do any additional data manipulations.
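For example, a minimal sketch (untested; the column names are assumptions based on the sample lines above, and read_fwf's default column inference may need tweaking for the real file):

import pandas as pd

names = ["record", "serial", "atom", "residue", "chain", "resnum",
         "x", "y", "z", "element"]
df = pd.read_fwf("Test.txt", skiprows=1, names=names)

# Numeric columns come back as floats, so arithmetic works directly.
print(df[["x", "y", "z"]].mean())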
