Printing as a dictionary in python - python

I have a problem in Python. I want to write a function that takes a file from the user and prints its contents to a new file (example.txt).
The old file looks like this:
{'a':1,'b':2...)
and I want the new file to look like this:
a 1,b 2(the next line)
The function I wrote runs, but nothing shows up in the new file. Can someone help me, please?
def printing(file):
    infile=open(file,'r')
    outfile=open('example.txt','w')
    dict={}
    file=dict.values()
    for key,values in file:
        print key
        print values
    outfile.write(str(dict))
    infile.close()
    outfile.close()

This creates a new empty dictionary:
dict={}
dict is not a good name for a variable as it shadows the built-in type dict and could be confusing.
This makes the name file point at the values in the dictionary:
file=dict.values()
file will be empty because dict was empty.
This iterates over pairs of values in file.
for key,values in file:
As file is empty nothing will happen. However if file weren't empty, the values in it would have to be pairs of values to unpack them into key, values.
This converts dict to a string and writes it to the outfile:
outfile.write(str(dict))
Note that write only accepts a string and will not call str on the object for you, so the str call is needed; passing the dictionary directly raises a TypeError:
outfile.write(dict)
You don't actually do anything with infile.
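Putting these points together, here is a minimal sketch of a version that actually reads the input. It assumes each line of the input file holds one complete dictionary literal and that the wanted output is one line per dictionary with "key value" pairs joined by commas. The out_path parameter and the use of ast.literal_eval are my assumptions, not something from the question.

```python
import ast


def printing(in_path, out_path="example.txt"):
    """Rewrite each dictionary literal in in_path as 'key value' pairs."""
    with open(in_path, "r") as infile, open(out_path, "w") as outfile:
        for line in infile:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # literal_eval parses a Python literal without running code
            data = ast.literal_eval(line)
            pairs = ["{} {}".format(key, value) for key, value in data.items()]
            outfile.write(",".join(pairs) + "\n")
```

Calling printing('input.txt') on a file containing {'a':1,'b':2} should then leave a 1,b 2 in example.txt; input.txt is only a placeholder name.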

You can use the re module (regular expressions) to get what you need. A solution could look like this; of course you can customize it to fit your needs. Hope this helps.
import re
def printing(file):
    # Append to example.txt; both files are closed when the with block ends.
    with open(file, 'r') as f, open('example.txt', 'a') as outfile:
        for line in f:
            # Replace everything except letters, digits, newlines and dots with a space.
            new_string = re.sub(r'[^a-zA-Z0-9\n.]', ' ', line)
            outfile.write(new_string)
printing('output.txt')

Related

Dictionary returned empty from function

I'm writing a function that takes in a fasta file that may have multiple sequences and returns a dictionary with the accession number as the key and the full length header as the value. When I run the function, I get an empty dictionary, but when I run the same code outside of a function, I get the desired dictionary. I assume this means that somewhere in the function my dictionary is being overwritten with nothing, because if it had to do with my regex pattern the code wouldn't work outside the function either, but I can't seem to find where the dictionary might be getting reset. I tried doing some searching for a solution, but the only questions I found with the same problem had obvious problems (no return value or one of the input variables was overwritten).
here is my function code and output (assume re is imported):
def getheader(file):
    '''
    compiles headers from a fasta file into a dictionary
    Parameters
    ----------
    file : file
        file that contains the sequences with headers and has not been read
    Returns
    -------
    dictionary with acc no as key and full header as value
    '''
    head_dict = {}
    for line in file:
        if line.startswith(">"):
            acc = re.search(">([^\s]+)", line).group(1)
            header = line.rstrip()
            head_dict[acc] = header
    return head_dict
file = open("seq1.txt", "r")
head_dict = getheader(file)
print(head_dict)
output
{}
Here is the input/output when I run it outside of a function:
import re
file = open("seq1.txt", "r")
head_dict = {}
for line in file:
    if line.startswith(">"):
        key = re.search(">([^\s]+)", line).group(1)
        value = line.rstrip()
        head_dict[key] = value
print(head_dict)
output
{'AF12345': '>AF12345 test sequence 1'}
where seq1.txt is the following without the quotes and the txt file has an actual return instead of "\n", I just wasn't sure how to format it correctly.
">AF12345 test sequence 1\nCGATATTCCCATGCGGTTTATTTATGCAAAACTGTGACGTTCGCTTGA"
The above is where my code is now. Prior to this I had a function that created two dictionaries and returned a specific one based on an input argument. One dictionary was the accession number and sequences, and the other was this dictionary I'm trying to create now. I did not have any problem getting the first dictionary returned. So I decided to split the function and have one for each dictionary despite it seeming repetitive. I've also tried moving the head_dict[key] = value to outside the conditional statement, and got the same problem. I've tried changing the variable names from key to acc and from value to header, but still got the same result (you can see in my example outside the function that the variables were originally key and value). I just tried making the empty dictionary an argument so that it is initialized outside of the function, but I still get an empty dictionary returned. I'm not sure what to try now. Thanks in advance!
Note: I'm sure this can be solved more efficiently with another library, but this is for a class and the instructor is very against using other libraries. We have to learn how to do it ourselves first.
Edit: Now I'm frustrated. I rewrote the code to add in creating the input file for people to use to run my code and help me with it, but it works. I'm sorry to have wasted everyone's time. Here is what I rewrote. If you notice a difference between this and what I posted above, please let me know.
import re
def getheader(file):
    '''
    compiles headers from a fasta file into a dictionary
    Parameters
    ----------
    file : file
        file that contains the sequences with headers and has not been read
    Returns
    -------
    dictionary with acc no as key and full header as value
    '''
    head_dict = {}
    for line in file:
        if line.startswith(">"):
            acc = re.search(">([^\s]+)", line).group(1)
            header = line.rstrip()
            head_dict[acc] = header
    return head_dict
file = open("bookishbubs_test1.txt", "w")
file.write(">AF12345 test sequence 1\nCGATATTCCCATGCGGTTTATTTATGCAAAACTGTGACGTTCGCTTGA")
file.close()
file = open("bookishbubs_test1.txt", "r")
headers = getheader(file)
print(headers)
file.close()
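The two listings look identical, so the difference is probably in how the file object was handled before the call rather than in the function itself. One common cause of this symptom, which the post does not confirm, is that the file object had already been iterated over (for example by an earlier loop in the same session), so the function finds nothing left to read. A small sketch of that effect, using io.StringIO as a stand-in for the open file and a hypothetical get_headers helper:

```python
import io
import re


def get_headers(lines):
    """Map accession number -> full header line for every '>' line."""
    head_dict = {}
    for line in lines:
        if line.startswith(">"):
            acc = re.search(r">(\S+)", line).group(1)
            head_dict[acc] = line.rstrip()
    return head_dict


# io.StringIO stands in for an open text file here.
handle = io.StringIO(">AF12345 test sequence 1\nCGATATTCCCATGCGG\n")

first = get_headers(handle)   # reads every line
second = get_headers(handle)  # nothing left to read, so the result is empty
handle.seek(0)                # rewind to the start of the data
third = get_headers(handle)   # same result as the first call

print(first)
print(second)
print(third)
```

If that is what happened, reopening the file or calling seek(0) before passing it to the function would give the expected dictionary.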

Add a new list in a new line on a .txt file in python

I'm trying to add a list to an existing .txt file in Python 3. Currently I have the following code:
def add_grade(list, new_add):
    with open("file.txt", new_add) as file:
        for item in list:
            file.write("{}\t".format(item))
add_grade(list, "w")
I have created a list:
list = [string1, integer1a, integer1b]
It works fine when creating a new txt file. But if I replace "w" with "a" I do not get the desired output. My question is: how do I add the components of a new list to the file on a new line? My current output looks like this when I add new variables:
string1 integer1a integer1b string2 integer2a integer2a
I would like instead the following displayed:
string1 integer1a integer1b
string2 integer2a integer2a
...
How can I add a new line after each list is inserted?
You can do this quite easily with the Python 3 print() function, just specify the separator and file with the sep and file parameters, and print() will take care of the details for you:
def function(iterable, filename, new_add):
    with open(filename, new_add) as file:
        print(*iterable, sep='\t', file=file)
Note that I renamed the list parameter to iterable because using list shadows the built-in list class, and the function should work with any iterable.
This code will also work in Python 2 if you add the following to the top of your file:
from __future__ import print_function
This should work; you need to write a line break after writing your list to the file:
def add_grade(list, new_add):
    with open("file.txt", new_add) as file:
        for item in list:
            file.write("{}\t".format(item))
        # end the line once all items of this list are written
        file.write("\n")
add_grade(list, "w")
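A variant of the same idea, as a sketch: joining the items with tabs avoids the trailing tab, and opening in "a" mode means each call appends one new line. The function name append_row and the path parameter are placeholders of mine, not part of the question.

```python
def append_row(items, path="file.txt"):
    """Append the items to path as one tab-separated line."""
    with open(path, "a") as handle:
        # str() lets strings and integers be mixed in the same list
        handle.write("\t".join(str(item) for item in items) + "\n")
```

Calling append_row(["string1", 1, 2]) and then append_row(["string2", 3, 4]) should leave two separate lines in the file.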

File input output python save file

I'm trying to take a dictionary and write it to a file. Each line of output is supposed to contain a key and its value separated by a single space. So far I have the following:
def save(diction):
    savefile = open("save.txt", "w")
    for line in diction:
        values = line.split()
        savefile.write(values)
    savefile.close()
I'm not sure if this is writing to the file or if it even saves the file with the .close() function. Any advice?
The values are being written to the file on savefile.write(values), however the method with which you are opening and closing the file is a bit dangerous. If you encounter an error the file may never be closed. It's better to use with to ensure that the file will automatically be closed when leaving the with block, whether by normal execution or on error. Also, you probably mean to iterate through diction.items()?
def save(diction):
    with open("save.txt", "w") as savefile:
        for key, value in diction.items():
            values = "%s %s\n" % (key, value)
            savefile.write(values)
If you try to call your original function, it will raise an exception like this:
----> 7 savefile.write(values)
TypeError: must be str, not list
… and then quit. So no, it's not writing anything, because there's an error in your code.
What does that error mean? Well, obviously something in that line is a list when it should be a str. It's not savefile, it's not write, so it must be values. And in fact, values is a list, because that's what you get back from split. (If it isn't obvious to you, add some debugging code into the function to print(values), or print(type(values)) and run your code again.)
If you look up the write method in the built-in help or on the web, or re-read the tutorial section Methods of File Objects, you will see that it does in fact require a string.
So, how do you write out a list?
You have to decide what you want it to look like. If you want it to look the same as it does in the print statement, you can just call str to get that:
savefile.write(str(values))
But usually, you want it to be something that looks nice to humans, or something that's easy to parse for later scripts you write.
For example, if you want to print out each values as a line made up of 20-character-wide columns, you could do something like this:
savefile.write(''.join(format(value, '<20') for value in values) + '\n')
If you want something you can parse later, it's usually better to use one of the modules that knows how to write (and read) specific formats—csv, json, pickle, etc.
The tutorial chapter on Input and Output is worth reading for more information.
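As a sketch of the "parse it later" option, the standard json module can write the dictionary and read it back. The function names and the save.json file name below are placeholders of mine; note that JSON turns all keys into strings.

```python
import json


def save_json(data, path="save.json"):
    """Write the dictionary to path as JSON."""
    with open(path, "w") as handle:
        json.dump(data, handle, indent=2)


def load_json(path="save.json"):
    """Read the dictionary back from path."""
    with open(path) as handle:
        return json.load(handle)
```

This trades the plain "key value" line layout for a format that another script can load without any custom parsing.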
def save_dict(my_dict):
    with open('save.txt', 'w') as f:
        for key in my_dict.keys():
            f.write(key + ' ' + my_dict[key] + '\n')
if __name__ == "__main__":
    my_dict = {'a': '1',
               'b': '2'}
    save_dict(my_dict)
def save(diction):
    with open('save.txt', 'w') as savefile:
        for key, value in diction.items():
            savefile.write(str(key)+ ' ' + str(value) + '\n')

Create a function for add_new_field in Python

I am trying to create a function for a task I mentioned earlier for adding a new field to some files that are stored in one folder. I read about the function method in Python and followed but it does not work. I would be happy if someone could guide me about it.
def newfield(infile,outfile):
    infile.readlines()
    outfile = ["%s\t%s" %(item.strip(),2) for item in infile]
    outfile.write("\n".join(output))
    outfile.close()
    return outfile
and then I try to call it:
a = open("E:/SAGA/data/2006last/test/325145404.all","r")
b = open("E:/SAGA/data/2006last/test/325145404_edit.all","w")
newfield(a,b)
Moreover, when we create a function, should we call it in the same Python shell, or can we save it as a file and then call it?
Looks like the problem is you're assigning the list you build to the same name as the output file. You're also reading all of the lines out of the file and not assigning the result anywhere, so when your list comprehension iterates over it, it's already at the end of the file. So you can drop the readlines line.
def newfield(infile,outfile):
    output = ["%s\t%s" %(item.strip(),2) for item in infile]
    outfile.write("\n".join(output))
    outfile.close()
    return outfile
The line infile.readlines() consumes the whole input file. All lines are read and forgotten because they are not assigned to a variable.
The construct ["%s\t%s" %(item.strip(),2) for item in infile] is a list comprehension. This expression returns a list. By assigning it to the variable outfile the old value of outfile - probably a file object - is forgotten and replaced by the list returned by the list comprehension. This list object does not have a write method. By assigning the list to another variable the file object in outfile is preserved.
Try this:
def newfield(infile,outfile):
    items = ["%s\t%s" %(item.strip(),2) for item in infile]
    outfile.write("\n".join(items))
    outfile.close()
    return outfile
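For comparison, here is a sketch that takes file paths instead of open file objects and lets with close both files. The name add_field and the value parameter are hypothetical, and unlike the versions above it ends every line, including the last, with a newline.

```python
def add_field(in_path, out_path, value=2):
    """Copy in_path to out_path, appending a tab and value to every line."""
    with open(in_path, "r") as infile, open(out_path, "w") as outfile:
        for line in infile:
            # strip() drops the newline and surrounding whitespace, as in the original
            outfile.write("{}\t{}\n".format(line.strip(), value))
```

On the last part of the question: a function can be saved in a file and used from elsewhere. If it is saved as, say, fields.py (a made-up name), then from fields import add_field makes it available in the shell or in another script.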

Python assign function to dict value

I am trying to write a script that opens files based on dictionary values. For each key/value it opens a file based on that value's name and assigns that file to the name of the value (I think it's going wrong here). So far it opens the files for me, with the right names, so that works. However, I think the name I assign the open(file) function to is wrong, since the rest of my function does not open the files anymore, and I can't close them.
Example of my script:
filelist=[]
codeconv={"agt":"r1p1d", "aga":"r2p1d"}
for value in codeconv.values():
    value=open("c:\Biochemistry\Pythonscripts\Splittest\split"+value+".txt", 'w')
    filelist.append(value)
"Here I do something with the files"
for f in filelist:
    f.close()
So the problem is, how do I assign the open(file) function to the correct name? (Here r1p1d and r2p1d for example) And how do I later call those again?
The error I get now is:
AttributeError: 'str' object has no attribute 'close'
on the f.close() line.
EDIT: It now works as I want it to, using the following code: (I also included the 'here I do something' part now just for clarity)
result=open("C:\\Biochemistry\\Pythonscripts\\Illuminaresults.txt", "r")
filelist=[]
codeconv={"agt":"r1p1d", "aga":"r2p1d"}
opened_files={}
for key, value in codeconv.items():
    filename="c:\Biochemistry\Pythonscripts\Splittest\split"+value+".txt"
    file=open(filename, 'w')
    opened_files.update({key: file})
for line in result:
    if line[0]==">":
        lastline=line
    if line[0:3] in codeconv and len(line)==64:
        f=opened_files[line[0:3]]
        f.write(lastline+line)
    else: continue
for f in opened_files.values():
    f.close()
result.close()
I got another problem now though when I try to write the next part of my script, but that's probably something for another question, as it gives a Windowserror not related to this part. Thanks for the help all!
If I understand you correctly, you want to create variables dynamically from the dictionary so that you can assign the opened files to them, is that it?
I would suggest using another dictionary instead, like this:
codeconv={"agt":"r1p1d", "aga":"r2p1d"}
opened_files = {}
for key, value in codeconv.items():
    file_name = r"c:\Biochemistry\Pythonscripts\Splittest\split%s.txt" % value
    file=open(file_name, 'w')
    opened_files.update({key: file})
You can now access your opened files from the dictionary like this:
f = opened_files['agt']
f.write("some text")  # the files were opened with 'w', so write rather than read
....
and for your later code, close them like this:
for f in opened_files.values():
    f.close()
I'd caution against reusing "value" in that first loop, for clarity and future bugs that might be introduced there.
But, the code you posted actually works, so obviously there's something else going on that we can't diagnose here. What you need to do is inspect the contents of filelist before you try to close the file handles in it. That will probably point you towards the answer.
Perhaps you want to append tuples to filelist so the name is associated with the file object
codeconv={"agt":"r1p1d", "aga":"r2p1d"}
for value in codeconv.values():
    f=open("c:\Biochemistry\Pythonscripts\Splittest\split"+value+".txt", 'w')
    filelist.append((value, f))
"Here I do something with the files"
for name, f in filelist:
    f.close()
I can't see a problem in your code, maybe it's not in the part pasted? (in filelist, or the snipped bit).
But anyway, I would rewrite it to avoid using the same variable, value, for two different things:
filelist = []
codeconv={"agt":"r1p1d", "aga":"r2p1d"}
for file_id in codeconv.values():
    f = open("c:\Biochemistry\Pythonscripts\Splittest\split"+file_id+".txt", 'w')
    filelist.append(f)
"Here I do something with the files"
for f in filelist:
    if isinstance(f, str):
        print "WARNING: Expected file handle, got string:", f
    else:
        f.close()
(I also added a bit of troubleshooting code)
The files are already open - your filelist object contains open file objects that you can iterate over (for example, with for line_of_text in filelist[0]:) or call other functions of (see dir(file) for other members).
You can defer opening the file by assigning a lambda and calling it later, for example:
for value in codeconv.values():
    value2 = lambda: open(complete_filename)
    filelist.append(value2)
my_file_object = filelist[0]()
You may prefer to store these in a dictionary:
for value in codeconv.itervalues():
    filedict[value] = lambda: open(complete_filename)
my_file_object = filedict["r1p1d"]()
Or if you really want to create new variables (I strongly recommend not doing this, but since you asked):
for value in codeconv.itervalues():
    globals()[value] = open(complete_filename)
    # or
    #globals()[value] = lambda: open(complete_filename)
    # if you prefer
Finally, you close the files as you already are (substituting filedict.itervalues() for filelist if you use a dictionary instead).
(Obviously you need to replace complete_filename in the above examples with however you calculate the actual filename. I shouldn't need to say this, but I've been stung too often by leaving out these sorts of details.)
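One detail to watch when the filename is computed inside the loop: a lambda looks up its variables when it is called, not when it is created, so every lambda built in the loop would see the last filename. A sketch of one way around that, using functools.partial to store each path at creation time. The temporary directory and the names openers and out_dir are placeholders of mine standing in for the real output folder.

```python
import functools
import os
import tempfile

codeconv = {"agt": "r1p1d", "aga": "r2p1d"}
out_dir = tempfile.mkdtemp()  # placeholder for the real output folder

openers = {}
for name in codeconv.values():
    path = os.path.join(out_dir, "split" + name + ".txt")
    # partial stores this iteration's path; a bare lambda would look up
    # `path` when called and see only the last value assigned in the loop.
    openers[name] = functools.partial(open, path, "w")

# Nothing has been opened yet; the file is created only on the call.
with openers["r1p1d"]() as handle:
    handle.write("foo\n")
```

Only splitr1p1d.txt exists after this runs, because the opener for r2p1d was never called.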
You're not assigning the function, you are assigning its return value, which apparently is a string in this case instead of a file object, and that is a bit surprising. Are you doing anything with the filelist in the "here I do something" part?
[edit]
OK, it sounds like you want a dict that maps each key to its file object (with example usage).
codeconv={"agt":"r1p1d", "aga":"r2p1d"}
filedict = {key: open(r"c:\Biochemistry\Pythonscripts\Splittest\split"+value+".txt", 'w')
            for key, value in codeconv.items()}
#Here I do something with the files
filedict["agt"].write("foo")
filedict["aga"].write("bar")
for f in filedict.values():
    f.close()
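If an error occurs between opening and closing, the explicit close loop is never reached. A sketch of the same key-to-file dictionary built with contextlib.ExitStack from the standard library, which closes every registered file when the with block ends. The temporary directory is a placeholder of mine for the real Splittest folder.

```python
import contextlib
import os
import tempfile

codeconv = {"agt": "r1p1d", "aga": "r2p1d"}
out_dir = tempfile.mkdtemp()  # placeholder for the real output folder

with contextlib.ExitStack() as stack:
    # enter_context registers each file so the stack closes it on exit
    filedict = {
        key: stack.enter_context(
            open(os.path.join(out_dir, "split" + value + ".txt"), "w")
        )
        for key, value in codeconv.items()
    }
    filedict["agt"].write("foo")
    filedict["aga"].write("bar")
# All files are closed here, even if the block above raised an error.
```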
