I know there is a StringIO stream in Python, but is there such a thing as a file stream in Python? Also, is there a better way for me to look up these things? Documentation, etc...
I am trying to pass a "stream" to a "writer" object I made. I was hoping that I could pass a file handle/stream to this writer object.
I am guessing you are looking for open(). http://docs.python.org/library/functions.html#open
outfile = open("/path/to/file", "w")
[...]
outfile.write([...])
Documentation on all the things you can do with streams (these are called "file objects" or "file-like objects" in Python): http://docs.python.org/library/stdtypes.html#file-objects
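Since you want to pass a stream to a writer object you made, note that any object with a write() method will do, thanks to duck typing. A minimal sketch (the Writer class and its log() method are hypothetical, just to illustrate the idea):

import sys
from StringIO import StringIO

class Writer(object):
    """Hypothetical writer that accepts any file-like object."""
    def __init__(self, stream):
        self.stream = stream        # only a .write() method is needed

    def log(self, message):
        self.stream.write(message + "\n")

# the same Writer works with a real file, an in-memory buffer, or stdout
Writer(open("out.txt", "w")).log("to a file")
Writer(StringIO()).log("to memory")
Writer(sys.stdout).log("to the console")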
There is a builtin file() which works much the same way. Here are the docs: http://docs.python.org/library/functions.html#file and http://python.org/doc/2.5.2/lib/bltin-file-objects.html.
If you want to print all the lines of the file do:
for line in file('yourfile.txt'):
    print line
Of course there is more, like .seek(), .close(), .read(), .readlines(), ... basically the same protocol as for StringIO.
Edit: You should use open() instead of file(), which has the same API; file() is gone in Python 3.
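To see how interchangeable they are, here is a small sketch doing the same iteration over an in-memory StringIO instead of a real file:

from StringIO import StringIO

buf = StringIO("line one\nline two\n")
for line in buf:      # same iteration protocol as a real file
    print line.rstrip()
buf.seek(0)           # same seek()/read() methods too
print buf.read()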
In Python, all I/O operations are wrapped in a high-level API: file-like objects.
This means that any file-like object behaves the same way and can be passed to any function expecting one. This is called duck typing, and for file-like objects you can expect the following behavior:
open / close / IO Exceptions
iteration
buffering
reading / writing / seeking
StringIO, file objects, and all other file-like objects really can be substituted for one another, and you don't have to manage the I/O yourself.
As a little demo, let's see what you can do with stdout, the standard output, which is a file-like object:
import sys
# replace the standard output with a real opened file
sys.stdout = open("out.txt", "w")
# printing won't show anything on screen; it will write to the file
print "test"
All file-like objects behave the same way, and you should use them the same way:
# try to open it;
# do not bother checking whether the stream is available or not
try:
    stream = open("file.txt", "w")
except IOError:
    # if it doesn't work, too bad!
    # this error is the same for StringIO, file, etc.,
    # which makes your code highly flexible
    pass
else:
    stream.write("yeah!")
    stream.close()
# since Python 2.6 (and in Python 3), you'd do the same using a context manager:
with open("file2.txt", "w") as stream:
    stream.write("yeah!")
# closing is taken care of automatically
Note that file-like objects' methods share a common behavior, but the way to create a file-like object is not standardized:
import urllib
# urllib doesn't use "open", and it doesn't raise only IOError exceptions
stream = urllib.urlopen("http://www.google.com")
# but this is a file-like object, and you can rely on that:
for line in stream:
    print line
One last word: just because two objects work the same way does not mean the underlying behavior is the same; it's important to understand what you are working with. In the last example, using the "for" loop on an Internet resource is risky: you don't know whether you'll end up with an endless stream of data.
In that case, using:
print stream.read(10000)   # another file-like object method
is safer. High-level abstractions are powerful, but they don't save you from needing to know how things work underneath.
Related
I can successfully redirect my output to a file, however this appears to overwrite the file's existing data:
import subprocess
outfile = open('test','w') #same with "w" or "a" as opening mode
outfile.write('Hello')
subprocess.Popen('ls',stdout=outfile)
will remove the 'Hello' line from the file.
I guess a workaround is to store the output elsewhere as a string or something (it won't be too long), and append this manually with outfile.write(thestring) - but I was wondering if I am missing something within the module that facilitates this.
You sure can append the output of subprocess.Popen to a file, and I use it daily. Here's how I do it:
log = open('some file.txt', 'a') # so that data written to it will be appended
c = subprocess.Popen(['dir', '/p'], stdout=log, stderr=log, shell=True)
(of course, this is a dummy example, I'm not using subprocess to list files...)
By the way, other objects that behave like files (with a write() method in particular) could replace this log object, so you can buffer the output and do whatever you want with it (write it to a file, display it, etc.) [but this seems not so easy, see my comment below].
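If you do want to buffer the output yourself first, one way (a sketch, assuming a Unix 'ls' command) is to capture it through a pipe and append it manually:

import subprocess

log = open('some file.txt', 'a')
log.write('some text, as header of the file\n')

# capture the child's output instead of handing it the file directly
p = subprocess.Popen(['ls'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
output, _ = p.communicate()

log.write(output)   # now the header is guaranteed to come first
log.close()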
Note: what may be misleading is that subprocess will write its output before what you write yourself. The reason is buffering: Python keeps your own write() in its internal buffer, while the child process writes straight to the file descriptor. So, here's the way to use this:
log = open('some file.txt', 'a')
log.write('some text, as header of the file\n')
log.flush() # <-- here's something not to forget!
c = subprocess.Popen(['dir', '/p'], stdout=log, stderr=log, shell=True)
So the hint is: do not forget to flush the output!
Well, the problem is that if you want the header to actually be a header, you need to flush it before the rest of the output is written to the file :D
Is the data in the file really overwritten? On my Linux host I see the following behavior:
1) running your code in a separate directory gives:
$ cat test
test
test.py
test.py~
Hello
2) if I add outfile.flush() after outfile.write('Hello'), the result is slightly different:
$ cat test
Hello
test
test.py
test.py~
But the output file has Hello in both cases. Without an explicit flush() call, the file's buffer is flushed when the Python process terminates.
Where is the problem?
I am making a game in Python 2.7 for fun and am trying to make a map to go along with it. I am using file I/O to read and write the map, and I have also set Notepad++ to silent update; however, I can only see the changes once my program has fully run, and I want to view the file as it is updated.
Here is the code I am testing with:
from time import sleep

map = open(r'C:\Users\Ryan\Desktop\Codes\Python RPG\Maps\map.txt', 'r+')
map.truncate()
print "file deleted"
sleep(1)
worldMap = open(r'C:\Users\Ryan\Desktop\Codes\Python RPG\Maps\worldMap.txt', 'r')
for line in worldMap:
    map.write(line)
print "file updated"
worldMap.close()
map.close()
Any help is greatly appreciated :)
By default, Python uses buffered I/O. This means that written data is stored in memory before it is actually written to the file. Calling the file's flush() method forces the buffered data out to the file.
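So in your copy loop you can force each line out as it is written; a self-contained sketch (os.fsync is optional and additionally pushes the OS cache to disk):

import os
from time import sleep

map_file = open('map.txt', 'w')
for line in ['#####\n', '#...#\n', '#####\n']:
    map_file.write(line)
    map_file.flush()               # push Python's buffer to the OS
    os.fsync(map_file.fileno())    # optionally force the OS cache to disk
    sleep(1)                       # watch the file grow line by line
map_file.close()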
I would like to know a very basic thing of Python programming (as I am a very basic programmer right now): how can I save a result (either a list, a string, or whatever) to a file in Python?
I've been searching a lot, but I couldn't find any good answer to this.
I was thinking about the ".write()" method, but (for instance) it doesn't seem to work with strings, and I don't really know what it is supposed to do anyway.
So, my situation is that I have binary files which I would like to edit; I found it easy to convert them to strings and modify them, and now I'd like to save them i) back to binary files (jpeg images) and ii) in the folder I want.
How would I do that? Please I need some help.
UPDATE
Here is the script I'm trying to run:
import os, sys

newpath = r'C:/Users/Umberto/Desktop/temporary'
if not os.path.exists(newpath):
    os.makedirs(newpath)

data = open('C:/Users/Umberto/Desktop/Prove_Script/Varie/_BR_Browse.001_2065642654_1.BINARY', 'rb+')
edit_data = str(data.read())
out_dir = os.path.join(newpath, 'feed', 'address')
data.close()

# do my edits later on...
edit_data.write(newpath)   # <-- this is the line that fails
edit_data.close()
The error I get is:
AttributeError: 'str' object has no attribute 'write'
UPDATE_2
I tried to use the pickle module to serialize my binary file, modify it, and save it at the end, but I'm still not getting it to work... This is what I've been trying so far:
import cPickle as pickle

binary = open(r'C:\Users\Umberto\Desktop\Prove_Script\Varie\_BR_Browse.001_2065642654_1.BINARY', 'rb')
out = open(r'C:\Users\Umberto\Desktop\Prove_Script\Varie\preview.txt', 'wb')
pickle.dump(binary, out, 1)
TypeError Traceback (most recent call last)
<ipython-input-6-981b17a6ad99> in <module>()
----> 1 pprint.pprint (pickle.dump (binary, out, 1))
C:\Python27\ArcGIS10.1\lib\copy_reg.pyc in _reduce_ex(self, proto)
68 else:
69 if base is self.__class__:
---> 70 raise TypeError, "can't pickle %s objects" % base.__name__
71 state = base(self)
72 args = (self.__class__, base, state)
TypeError: can't pickle file objects
Another thing I didn't get is whether I am supposed to create a file to point to (in my case I had to create "out", otherwise I wouldn't have the right arguments for the pickle method) or whether that's not necessary.
Hope I'm getting close to the solution.
P.S.: I tried also with pickle.dumps (), not achieving a nicer result though...
If you're opening a binary file and saving another binary file you could do something like this:
with open('file.jpg', 'rb') as jpgFile:
    contents = jpgFile.read()

# ... some operations on contents here ...

with open('file2.jpg', 'wb') as jpgFile:
    jpgFile.write(contents)
Some comments:
'rb' and 'wb' mean read and write in binary mode, respectively. More info on why 'b' is recommended when working with binary files here.
Python's with statement takes care of closing the file when exiting the block.
If you need to save lists, strings, or other objects and retrieve them later, use pickle as others pointed out.
You can use the standard Python module named "pickle".
You can read about it here: pickle documentation
Reading and writing any data structure becomes very easy:
pickle.dump(obj, file_handler)   # serialize an object to a file
pickle.load(file_handler)        # deserialize from a file
Or you can serialize to a string with pickle.dumps(...) and load back with pickle.loads(...).
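A minimal round trip might look like this (using binary mode, which also matters across platforms):

import pickle

data = {'results': [1, 2, 3], 'name': 'test run'}

with open('data.pkl', 'wb') as out:    # binary mode for writing
    pickle.dump(data, out, pickle.HIGHEST_PROTOCOL)

with open('data.pkl', 'rb') as inp:    # binary mode for reading
    restored = pickle.load(inp)
print restored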
I am trying to read only one file from a tar.gz archive. All operations on the tarfile object work fine, but when I read from a concrete member, a StreamError is always raised; check this code:
import tarfile

fd = tarfile.open('file.tar.gz', 'r|gz')
for member in fd.getmembers():
    if not member.isfile():
        continue
    cfile = fd.extractfile(member)
    print cfile.read()
    cfile.close()
fd.close()
cfile.read() always causes "tarfile.StreamError: seeking backwards is not allowed"
I need to read the contents into memory, not dump them to a file (extractall works fine).
Thank you!
The problem is this line:
fd = tarfile.open('file.tar.gz', 'r|gz')
You don't want 'r|gz', you want 'r:gz'.
If I run your code on a trivial tarball, I can even print out the member and see test/foo, and then I get the same error on read that you get.
If I fix it to use 'r:gz', it works.
From the docs:
mode has to be a string of the form 'filemode[:compression]'
...
For special purposes, there is a second format for mode: 'filemode|[compression]'. tarfile.open() will return a TarFile object that processes its data as a stream of blocks. No random seeking will be done on the file… Use this variant in combination with e.g. sys.stdin, a socket file object or a tape device. However, such a TarFile object is limited in that it does not allow to be accessed randomly, see Examples.
'r|gz' is meant for when you have a non-seekable stream, and it only provides a subset of the operations. Unfortunately, it doesn't seem to document exactly which operations are allowed—and the link to Examples doesn't help, because none of the examples use this feature. So, you have to either read the source, or figure it out through trial and error.
But, since you have a normal, seekable file, you don't have to worry about that; just use 'r:gz'.
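So the corrected version of your snippet is a one-character change:

import tarfile

fd = tarfile.open('file.tar.gz', 'r:gz')   # ':' instead of '|'
for member in fd.getmembers():
    if not member.isfile():
        continue
    cfile = fd.extractfile(member)
    print cfile.read()
    cfile.close()
fd.close()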
In addition to the file mode issue, I hit this error because I was attempting to seek on a network stream.
I had the same error when trying to requests.get the file, so I extracted everything to a tmp directory:
import os
import tarfile
from lzma import LZMAFile   # Python 3.3+; backports.lzma on Python 2

# stream == requests.get
inputs = [tarfile.open(fileobj=LZMAFile(stream), mode='r|')]
t = "/tmp"
for tarfileobj in inputs:
    tarfileobj.extractall(path=t, members=None)
for fn in os.listdir(t):
    with open(os.path.join(t, fn)) as payload:
        print(payload.read())
I got a file that contains a data structure with test results from a Windows user. He created this file using the pickle.dump command. On Ubuntu, I tried to load these test results with the following program:
import pickle
import my_module
f = open('results', 'r')
print pickle.load(f)
f.close()
But I get an error inside the pickle module saying there is no module named "my_module".
Could the problem be due to corruption in the file, or maybe to the move from Windows to Linux?
The problem lies in pickle's way of handling newline characters: some of the line-feed characters corrupt module names in the dumped/loaded data.
Storing and loading files in binary mode may help, but I was having trouble with that too. After a long time reading docs and searching, I found that pickle supports several different "protocols" for storing data, and for backward compatibility it defaults to the oldest one: protocol 0, the original ASCII protocol.
You can select a more modern protocol by specifying the protocol keyword when dumping the data, something like this:
pickle.dump(someObj, open("dumpFile.dmp", 'wb'), protocol=2)
or by choosing the highest protocol available (currently 2):
pickle.dump(someObj, open("dumpFile.dmp", 'wb'), protocol=pickle.HIGHEST_PROTOCOL)
The protocol version is stored in the dump file, so pickle.load() handles it automatically.
You should open the pickled file in binary mode, especially if you are using pickle on different platforms. See this and this question for an explanation.
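That is, something like this on both platforms (a minimal sketch):

import pickle

results = {'passed': 10, 'failed': 2}

# always write pickles in binary mode...
with open('results', 'wb') as f:
    pickle.dump(results, f)

# ...and read them back in binary mode too, on Windows and Linux alike
with open('results', 'rb') as f:
    print pickle.load(f)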