Python3: How to get value from other file with function within class? - python

I want to create a class for storing attributes of the many data files that my script has to process. The attributes are values that are found in the datafiles, or values that are calculated from other values that are found in the data files.
Unfortunately, I'm not understanding the output of the code that I've written to accomplish that goal. What I think this should do is: print the name of the file being processed and a value seqlength from that file. The actual output is given below the code.
class SrcFile:
    def __init__(self, which):
        self.name = which
    def seqlength(self):
        with open(self.name) as file:
            linecounter = 0
            for line in file:
                linecounter += 1
                if linecounter == 3:
                    self.seqlength = int(line.split()[0])
                    break
for f in files:
    file = SrcFile(f)
    print(file.name, file.seqlength)
This prints file.name as expected, but for file.seqlength it returns a value that I don't understand.
../Testdata/12_indels.ss <bound method SrcFile.seqlength of <__main__.SrcFile object at 0x10066cad0>>
It's clear to me that I'm not understanding something fundamental about classes and functions. Is it clear to you what I'm missing here?

.seqlength is a method and needs (), but you are also not returning anything from it. Try this instead:
def seqlength(self):
    with open(self.name) as file:
        linecounter = 0
        for line in file:
            linecounter += 1
            if linecounter == 3:
                return int(line.split()[0])
And then call it:
for f in files:
    file = SrcFile(f)
    print(file.name, file.seqlength())
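To make the difference between referring to a method and calling it concrete, here is a minimal, self-contained sketch. It assumes a small temporary file with made-up contents (the real data files are not shown in the question) and uses enumerate instead of a manual line counter.

```python
import os
import tempfile


class SrcFile:
    def __init__(self, which):
        self.name = which

    def seqlength(self):
        """Return the first field of the third line as an int (None if the file is shorter)."""
        with open(self.name) as handle:
            for lineno, line in enumerate(handle, start=1):
                if lineno == 3:
                    return int(line.split()[0])
        return None


# Hypothetical sample data: the third line starts with the sequence length.
with tempfile.NamedTemporaryFile("w", suffix=".ss", delete=False) as tmp:
    tmp.write("header\ncomment\n42 other fields\n")
    sample_path = tmp.name

src = SrcFile(sample_path)
method_object = src.seqlength   # no parentheses: the bound method itself
length = src.seqlength()        # parentheses: the method runs and returns a value
print(method_object)
print(length)
os.remove(sample_path)
```

Printing method_object reproduces the "bound method" text from the question, while length holds the integer read from the file.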

That's because .seqlength is a method, so it has to be called. The method also needs to return the value instead of only assigning it.
Try:
print(file.name, file.seqlength())

Related

Export JSON data from python objects

I'm trying to make a phone book in python and I want to save all contacts in a file, encoded as JSON, but when I try to read the exported JSON data from the file, I get an error:
Extra data: line 1 column 103 - line 1 column 210 (char 102 - 209)
(It works fine when I have only one object in "list.txt")
This is my code:
class contacts:
    def __init__(self, name="-", phonenumber="-", address="-"):
        self.name= name
        self.phonenumber= phonenumber
        self.address= address
        self.jsonData=json.dumps(vars(self),sort_keys=False, indent=4)
        self.writeJSON(self.jsonData)
    def writeJSON(self, jsonData):
        with open("list.txt", 'a') as f:
            json.dump(jsonData, f)
ted=contacts("Ted","+000000000","Somewhere")
with open('list.txt') as p:
    p = json.load(p)
    print p
The output in list.txt:
"{\n \"phonenumber\": \"+000000000\", \n \"name\": \"Ted\", \n \"address\": \"Somewhere\"\n}"
Now, if I add another object, it can't read the JSON data anymore. If my way of doing it is wrong, how else should I export the JSON code of every object in a class, so it can be read back when I need to?
The reason this isn't working is that this code path gives you an invalid JSON structure. With one contact you get this:
{"name":"", "number":""}
While with 2 contacts you would end up with this:
{"name":"", "number":""}{"name":"", "number":""}
The second one is invalid json because 2 objects should be encoded in an array, like this:
[{"name":"", "number":""},{"name":"", "number":""}]
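The difference can be checked with json.loads directly. In this small sketch (the sample strings are made up for illustration), two objects written back to back fail with the same "Extra data" error as in the question, while the array form parses:

```python
import json

concatenated = '{"name": "Ted"}{"name": "Joe"}'   # two objects back to back
as_array = '[{"name": "Ted"}, {"name": "Joe"}]'   # the same objects in one array

try:
    json.loads(concatenated)
    error_message = None
except json.JSONDecodeError as exc:
    error_message = exc.msg   # short description of the parse failure

contacts = json.loads(as_array)
print(error_message)
print(contacts)
```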
The problem with your code design is that you're writing to the file every time you create a contact. A better idea is to create all contacts and then write them all to the file at once. This is cleaner, and will run more quickly since file I/O is one of the slowest things a computer can do.
My suggestion is to create a new class called Contact_Controller and handle your file IO there. Something like this:
import json
class Contact_Controller:
    def __init__(self):
        self.contacts = []
    def __repr__(self):
        # json.dumps(self) would raise TypeError, so serialize the contacts' attributes instead
        return json.dumps([vars(contact) for contact in self.contacts])
    def add_contact(self, name="-", phonenumber="-", address="-"):
        new_contact = Contact(name,phonenumber,address)
        self.contacts.append(new_contact)
        return new_contact
    def save_to_file(self):
        with open("list.txt", 'w') as f:
            # str() of a list uses each element's __repr__, which returns JSON here
            f.write(str(self.contacts))
class Contact:
    def __init__(self, name="-", phonenumber="-", address="-"):
        self.name= name
        self.phonenumber= phonenumber
        self.address= address
    def __repr__(self):
        return json.dumps({"name": self.name, "phonenumber": self.phonenumber, "address": self.address})
contact_controller = Contact_Controller()
ted = contact_controller.add_contact("Ted","+000000000","Somewhere")
joe = contact_controller.add_contact("Joe","+555555555","Somewhere Else")
contact_controller.save_to_file()
with open('list.txt') as p:
    p = json.load(p)
    print(p)
I've also changed it to use the built-in __repr__() special method. Python calls that method whenever it needs a representation of the object, for example for repr() or when a list containing the object is converted to a string.
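An alternative that does not depend on __repr__ is to hand json.dump a list of plain dictionaries. The following is only a sketch under a few assumptions: it targets Python 3, the class name ContactBook and the load_from_file method are hypothetical, and it writes to a temporary directory rather than the working directory.

```python
import json
import os
import tempfile


class Contact:
    def __init__(self, name="-", phonenumber="-", address="-"):
        self.name = name
        self.phonenumber = phonenumber
        self.address = address


class ContactBook:  # hypothetical name, plays the role of Contact_Controller
    def __init__(self):
        self.contacts = []

    def add_contact(self, *args, **kwargs):
        contact = Contact(*args, **kwargs)
        self.contacts.append(contact)
        return contact

    def save_to_file(self, path):
        # One json.dump call writes the whole list as a single JSON array.
        with open(path, "w") as f:
            json.dump([vars(c) for c in self.contacts], f, indent=4)

    @classmethod
    def load_from_file(cls, path):
        # Rebuild Contact objects from the dictionaries stored in the file.
        book = cls()
        with open(path) as f:
            for fields in json.load(f):
                book.add_contact(**fields)
        return book


path = os.path.join(tempfile.mkdtemp(), "list.txt")
book = ContactBook()
book.add_contact("Ted", "+000000000", "Somewhere")
book.add_contact("Joe", "+555555555", "Somewhere Else")
book.save_to_file(path)

restored = ContactBook.load_from_file(path)
print([c.name for c in restored.contacts])
```

Because the file always contains exactly one top-level JSON value, json.load can read it back no matter how many contacts were saved.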
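In writeJSON, you opened the file for append (mode='a'), which works fine the first time but not on subsequent calls, because each call adds another top-level JSON value to the file. Opening the file in overwrite mode ('w') keeps the file valid, although it then holds only the most recently written contact:
with open("list.txt", 'w') as f: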

How to Return to the First Line in a Urlopen Object

I am iterating over a .dat file saved on an HTTP website using
import urllib2
test_file = urllib2.urlopen('http://~/file.dat')
And then, I have a function which iterates the file
def f(file):
    while True:
        iter = file.readline()
        if iter == "":
            break
        print iter
If I want to call this function twice without opening the test_file again:
f(test_file)
f(test_file)
then what should I add into the f function?
Update:
Since I am not allowed to change anything outside the function, I finally came up with a silly but effective solution:
def f(file):
    while True:
        iter = file.readline()
        if iter == "":
            break
        print iter
    global test_file
    test_file = urllib2.urlopen('http://~/file.dat')
Thanks to everyone who answered my question!
f.seek(0)
returns you to the start of a file. The argument is the byte position in the file. The object returned by urlopen is a network stream, though, and cannot be rewound that way.
So the best thing to do is to save the output of read() in a variable and then read it through the StringIO.readline() method, which works like f.readline() but in memory.
import urllib2
import StringIO
t_fh = urllib2.urlopen('http://ftp.cs.stanford.edu/pub/sgb/test.dat')
test_data = t_fh.read()
def f(data):
    buf = StringIO.StringIO(data)
    while True:
        line = buf.readline()
        if line == "":
            break
        print line
f(test_data)
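In Python 3 the same idea uses io.StringIO (or io.BytesIO for raw bytes). The sketch below skips the download and assumes a made-up string standing in for the decoded response body; the function name read_lines is hypothetical.

```python
import io

# Stand-in for the decoded body of the response (downloaded once, kept in memory).
test_data = "first line\nsecond line\nthird line\n"


def read_lines(buf):
    """Rewind the in-memory buffer, then return its lines without newlines."""
    buf.seek(0)  # in-memory buffers are seekable, unlike the network stream
    return [line.rstrip("\n") for line in buf]


buffer = io.StringIO(test_data)
first_pass = read_lines(buffer)
second_pass = read_lines(buffer)  # works again because of seek(0)
print(first_pass)
print(second_pass)
```

Since the rewind happens inside the function, calling it twice needs no change to the code outside it.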

Is it possible to print a next line in a code?

Is it possible to write a function that prints the next line of code?
def print_next_line():
    sth
import fxx
print 'XXX'
print_next_line()
file.split('/')
....
>>> 'XXX'
>>> 'file.split('/')'
I was thinking that it could be somewhere in the stack, but I'm not sure, because it is the next line, not the previous one.
A straightforward approach: I use the inspect module to determine the file and line where print_next_line was called, and then read the file to find the next line. You might want to add some error handling here (for example, what if there is no next line in the file?).
def print_next_line():
    def get_line(f, lineno):
        with open(f) as fp:
            lines = fp.readlines()
        return lines[lineno-1]
    import inspect
    callerframerecord = inspect.stack()[1]
    frame = callerframerecord[0]
    info = inspect.getframeinfo(frame)
    line_ = info.lineno
    file_ = info.filename
    print get_line(file_, line_ + 1)
print 'XXX'
a = 1
print_next_line()
b = a*2
All you need is a profiling tool or just a debugger.
Use Python's inspect module:
import inspect
def print_next_line():
    lineno = inspect.currentframe().f_back.f_lineno
    with open(__file__) as f:
        print(f.readlines()[lineno].rstrip())
Well, you could open() your .py file, iterate to find the specific line, and then print it.
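The approaches above can be combined with the standard linecache module, which returns an empty string instead of raising when the file or line does not exist. This is only a sketch: the helper name line_after is hypothetical, and it assumes the calling code lives in a real file on disk (not an interactive session).

```python
import inspect
import linecache


def line_after(filename, lineno):
    """Return the source line following lineno (1-based), or '' if there is none."""
    return linecache.getline(filename, lineno + 1).rstrip("\n")


def print_next_line():
    """Print (and return) the source line that follows the call to this function."""
    caller = inspect.currentframe().f_back  # frame of the code that called us
    text = line_after(caller.f_code.co_filename, caller.f_lineno)
    print(text)
    return text
```

Using the caller's own co_filename rather than __file__ means the function still works when it is imported from another module.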

"Write" method for generator in python

I wrote a class to deal with large files and I want to make a "write" method for the class so that I can easily make changes to the data in the file and then write out a new file.
What I want to be able to do is:
1.) Read in the original file
sources = Catalog(<filename>)
2.) Make changes to the data contained in the file
for source in sources:
    source['blah1'] = source['blah1'] + 4
3.) Write out the updated value to a new file
sources.catalog_write(<new_filename>)
To this end I wrote a fairly straightforward generator,
class Catalog(object):
    def __init__(self, fname):
        self.data = open(fname, 'r')
        self.header = ['blah1', 'blah2', 'blah3']
    def next(self):
        line = self.data.readline()
        line = line.lstrip()
        if line == "":
            self.data.close()
            raise StopIteration()
        cols = line.split()
        if len(cols) != len(self.header):
            print "Input catalog is not valid."
            raise StopIteration()
        for element, col in zip(self.header, cols):
            self.__dict__.update({element:float(col)})
        return self.__dict__.copy()
    def __iter__(self):
        return self
This is my attempt at a write method:
def catalog_write(self, outname):
    with open(outname, "w") as out:
        out.write(" ".join(self.header) + "\n")
        for source in self:
            out.write(" ".join(map(str, source)) + "\n")
But I get the following error when I try to call that class method,
  File "/Catalogs.py", line 53, in catalog_write
    for source in self:
  File "/Catalogs.py", line 27, in next
    line = self.data.readline()
ValueError: I/O operation on closed file
I realize that this is because generators are generally a one-time deal, but I know that there are workarounds (like this question and this post). I'm not sure what the best way to do this is, though. These files are quite large and I'd like reading and using them to be as efficient as possible (both time-wise and memory-wise). Is there a Pythonic way to do this?
Assumptions made:
Input File: [ infile ]
1.2 3.4 5.6
4.5 6.7 8.9
Usage:
>>> a = Catalog('infile')
>>> a.catalog_write('outfile')
Now Output File: [ outfile ]
blah1 blah2 blah3
1.2 3.4 5.6
4.5 6.7 8.9
Writing it again to another file: [ outfile2 ]
>>> a.catalog_write('outfile2')
Now Output File: [ outfile2 ]
blah1 blah2 blah3
1.2 3.4 5.6
4.5 6.7 8.9
From what you have posted, it looks like you need to reopen your data file (assuming it is the file object whose file name is stored as self.fname).
Modify your __init__ to save fname as an attribute.
Create a placeholder data object at first. I am not opening the file in __init__, so that it can be opened and closed as needed inside your next() method. The placeholder only needs a closed attribute, like a file object, so that next() can check whether self.data.closed is True and, if so, reopen the file and read from it.
def __init__(self, fname):
    self.fname = fname
    # Placeholder: a plain object() cannot take new attributes, but a function object can
    self.data = lambda: None
    self.data.closed = True
    self.header = ['blah1', 'blah2', 'blah3']
Now the next method is modified as follows:
def next(self):
    if self.data.closed:
        self.data = open(self.fname, "r")
    line = self.data.readline()
    line = line.lstrip()
    if line == "":
        if not self.data.closed:
            self.data.close()
        raise StopIteration()
    cols = line.split()
    if len(cols) != len(self.header):
        print "Input catalog is not valid."
        if not self.data.closed:
            self.data.close()
        raise StopIteration()
    for element, col in zip(self.header, cols):
        self.__dict__.update({element:float(col)})
    return self.__dict__.copy()
Your catalog_write method should be as follows.
Note that any modifications to the data must be done within the for loop, as shown.
def catalog_write(self, outname):
    with open(outname, "w") as out:
        out.write(" ".join(self.header) + "\n")
        for source in self:
            source['blah1'] = 444 # Data modified.
            # Write the values in header order
            out.write(" ".join(map(str, [source[h] for h in self.header])) + "\n")
I assumed that you want the updated values written as columns under the headers in the outname file.
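Another option, sketched here for Python 3, is to make __iter__ a generator method so that every loop over the catalog reopens the file and streams it one line at a time. This is an illustration under stated assumptions, not the code from the question: the transform parameter and the add_four helper are hypothetical, an invalid row raises ValueError instead of printing a message, and the input is a small temporary file with the sample values used above.

```python
import os
import tempfile


class Catalog:
    """Re-iterable catalog: each iteration reopens the file and streams it line by line."""

    def __init__(self, fname, header=("blah1", "blah2", "blah3")):
        self.fname = fname
        self.header = list(header)

    def __iter__(self):
        # A generator method: every 'for' loop gets a fresh pass over the file.
        with open(self.fname) as data:
            for line in data:
                cols = line.split()
                if not cols:
                    continue  # skip blank lines
                if len(cols) != len(self.header):
                    raise ValueError("Input catalog is not valid.")
                yield dict(zip(self.header, map(float, cols)))

    def catalog_write(self, outname, transform=lambda source: source):
        # 'transform' is a hypothetical hook for per-row changes.
        with open(outname, "w") as out:
            out.write(" ".join(self.header) + "\n")
            for source in self:
                source = transform(source)
                out.write(" ".join(str(source[h]) for h in self.header) + "\n")


def add_four(source):
    source["blah1"] += 4
    return source


workdir = tempfile.mkdtemp()
infile = os.path.join(workdir, "infile")
outfile = os.path.join(workdir, "outfile")
with open(infile, "w") as f:
    f.write("1.2 3.4 5.6\n4.5 6.7 8.9\n")

sources = Catalog(infile)
sources.catalog_write(outfile, transform=add_four)
with open(outfile) as f:
    written = f.read().splitlines()
print(written)
```

Only one row is held in memory at a time, and the catalog can be iterated or written as often as needed because no file handle is kept open between passes.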

Search for a string with in a module in a python file using Python

#!/usr/bin/env python
import sys
import binascii
import string
sample = "foo.apples"
data_file = open("file1.py","r")
dat_file = open("file2.txt", "w")
for line in data_file:
    if sample in line:
        dat_file.writelines(line)
dat_file.close()
When I do this I am able to find the string foo.apples. The problem is foo.apples is present in various lines in the python file. I want those lines which are inside a particular function. I need the lines within this def function.
Example:
def start():
    foo.apples(a,b)
    foo.apples(c,d) ... so on.
The following program finds defs and will append the sample string to the output file if the indentation remains within the def.
import re
sample = 'foo.apples'
data_file = open("file1.py", "r")
out_file = open("file2.txt", "w")
within_def = False
def_indent = 0
for line in data_file:
    def_match = re.match(r'(\s*)def\s+start\s*\(', line) # EDIT: fixed regex
    if def_match and not within_def:
        within_def = True
        def_indent = len(def_match.group(1))
    elif within_def and re.match(r'\s{%s}\S' % def_indent, line):
        within_def = False
    if within_def and sample in line:
        out_file.writelines(line)
out_file.close()
data_file.close()
Tested working on an example file1.py.
One slightly off-the-beaten-path approach to this would be to use the getsource function of the inspect module. Consider the following (theoretical) test1.py file:
class foo(object):
    apples = 'granny_smith'
    @classmethod
    def new_apples(cls):
        cls.apples = 'macintosh'
def start():
    """This is a pretty meaningless python function.
    Attempts to run it will definitely result in an exception being thrown"""
    print foo.apples
    foo.apples = 3
    [x for x in range(10)]
    import bar as foo
Now you want to know about the start code:
import inspect
import test1 #assume it is somewhere that can be imported
print inspect.getsource(test1.start)
Ok, now we have only the source of that function. We can now parse through that:
for line in inspect.getsource(test1.start).splitlines():
    if 'foo.apples' in line:
        print line
There are some advantages here -- python does all the work of parsing out the function blocks when it imports the file. The downside though is that the file actually needs to be imported. Depending on where your files are coming from, this could introduce a HUGE security hole in your program -- You'll be running (potentially) "untrusted" code.
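If importing the file is too risky, the standard ast module can locate the function without executing anything. The sketch below assumes Python 3.8 or later (for end_lineno) and a source file that parses under the running interpreter; the sample source and the helper name lines_in_function are made up for illustration.

```python
import ast

source = '''\
def other():
    foo.apples(x, y)

def start():
    foo.apples(a, b)
    bar.pears(c)
    foo.apples(c, d)
'''


def lines_in_function(source_text, func_name, needle):
    """Return lines inside func_name that contain needle, without running the code."""
    tree = ast.parse(source_text)
    all_lines = source_text.splitlines()
    matches = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            # lineno/end_lineno are 1-based and inclusive (end_lineno needs Python 3.8+)
            body = all_lines[node.lineno - 1:node.end_lineno]
            matches.extend(line for line in body if needle in line)
    return matches


found = lines_in_function(source, "start", "foo.apples")
print(found)
```

For a real file, the text passed in would come from reading file1.py; the call in other() is ignored because it lies outside the line range of start().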
Here's a very non pythonic way, untested, but it should work.
sample = "foo.apples"
infile = open("file1.py", "r")
outfile = open("file2.txt", "w")
in_function = False
for line in infile.readlines():
    if in_function:
        if line[0] in(" ", "\t"):
            if sample in line:
                outfile.write(line)
        else:
            in_function = False
    elif line.strip() == "def start():":
        in_function = True
infile.close()
outfile.close()
I would suggest turning this into a function that takes the sample, the input file, and the name of the function to search as its parameters. It would then return a list of all the lines that contain the text.
def findFromFile(file, word, function):
    in_function = False
    matches = []
    infile = open(file, "r")
    for line in infile.readlines():
        if in_function:
            if line[0] in(" ", "\t"):
                if word in line:
                    matches.append(line)
            else:
                in_function = False
        elif line.strip() == "def %s():"%function:
            in_function = True
    infile.close()
    return matches
