Attribute error in 'file' and 'File' object in hdf5 file - python

I created an HDF5 file using file = open(). In this case I can write to and read from the file, but calling file.keys() raises AttributeError: 'file' object has no attribute 'keys'.
Then I created a new HDF5 file using file = h5py.File(). In this case I can read the file and call file.keys() without any error, but I cannot write to it. The error is AttributeError: 'File' object has no attribute 'write'.
What are the reasons behind these errors? Is there any difference between a 'file' object and a 'File' object?

open() returns an object of type file, the built-in Python 2 type that represents a file. It has a simple, low-level interface, and you would use it to read a text file or to parse the content (text or binary) yourself. The methods of the file type are documented here - https://docs.python.org/2/library/stdtypes.html#bltin-file-objects
h5py.File() returns a different type of object that adds functionality for handling the HDF5 format and provides its own API, e.g. the keys() method you mention.
When opening a h5py.File() you specify how you want to open it, e.g. r+ for read/write mode. Someone with a better understanding of the h5py library may be able to give a better explanation, but the reason you cannot call write() on the h5py.File object is that it has no write method, as the error message says.
Check out the API docs for h5py; it provides different methods for writing different kinds of data to the file - http://docs.h5py.org/en/latest/high/dataset.html
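As a minimal sketch of the difference, assuming h5py and NumPy are installed: data goes into an h5py.File by creating named datasets rather than by calling write(). The file name example.h5 and the dataset name "values" are made up for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.h5")

# "w" creates the file (truncating any existing one) and allows writing.
with h5py.File(path, "w") as f:
    # Data is stored in named datasets instead of through a write() method.
    f.create_dataset("values", data=np.arange(5))

# "r" opens the file read-only; keys() lists the top-level names.
with h5py.File(path, "r") as f:
    names = list(f.keys())
    stored = f["values"][:]  # read the whole dataset back as a NumPy array
```

The built-in open() would return the raw bytes of the same file, which is why the object it returns has read() and write() but no keys().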

Related

TypeError: object of type 'IndirectObject' has no len()

I'm trying to get the contents of a PDF file using the Python PyPDF2 package, but I'm getting this error:
TypeError: object of type 'IndirectObject' has no len()
This happens with one particular file; it works fine with the remaining files. Is there any reasonable logic behind it?
More details of the error are attached.

Retrieving data with openstacksdk from openstack's object storage

I'm trying to retrieve pickled data that I uploaded to an OpenStack object store, using openstacksdk's connection.get_object(container, object). I get a response, but the file body is a string. I can save it to a file with the outfile option without issues, but I would like to work with it directly, without first saving it to a file and then loading that file with pickle.
Simply using pickle's load and loads doesn't work, as neither takes string objects. Is there another way to retrieve the data so I can work with the pickled data directly, or is there some way to parse the string or set a config parameter on get_object()?
If you are using Python 3, pickle expects a bytes-like object. The load method takes a file object opened in binary mode and reads the bytes from it. The loads method takes a bytes-like object directly, not a string, so you will need to convert the string to bytes.
See: Best way to convert string to bytes in Python 3?
EDIT:
I found the solution. For pickled objects or any other files retrieved from OpenStack with openstacksdk, there are a few ways of dealing with the data without resorting to disk.
My implemented solution was to use the connection's get_object_raw method:
conn = connection(foo, bar, **args)
pickle.loads(conn.get_object_raw('containerName', 'ObjectName').content)
get_object_raw returns a response object whose content attribute holds the binary file content, which is the pickled data that pickle can load.
You could also create a temporary in-memory file with io.BytesIO and pass it as the outfile argument to get_object on the connection object.
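The two approaches can be sketched with the standard library alone. This is only an illustration: the payload variable stands in for the binary body a download would return, and no openstacksdk call is made here.

```python
import io
import pickle

original = {"name": "example", "values": [1, 2, 3]}

# Stand-in for the downloaded binary body (assumption: the real bytes
# would come from the object storage response).
payload = pickle.dumps(original)

# pickle.loads accepts a bytes-like object directly.
from_bytes = pickle.loads(payload)

# pickle.load needs a binary file-like object; io.BytesIO provides one
# in memory, so nothing is written to disk.
buffer = io.BytesIO(payload)
from_buffer = pickle.load(buffer)
```

If a BytesIO object is filled by a download instead, its position sits at the end afterwards, so call buffer.seek(0) before passing it to pickle.load.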

Resolve Attribute Error - Can I do this by defining variable?

I am new to Python and am wondering how to address the following attribute error. I believe I need to define/declare the file variable? Thanks for any suggestions, here is my script:
AttributeError Traceback (most recent call last)
in
51
52 # Write methods to print to Financial_Analysis_Summary
---> 53 file.write("Financial Analysis")
54 file.write("\n")
55 file.write("----------------------------")
AttributeError: 'str' object has no attribute 'write'
From your code and the error, I think you've defined the variable 'file' as a string. The str class has no write() method, hence this error. For more help, include the whole script, mainly the part where the variable 'file' is assigned. You could also use print() to print the details mentioned above, or create a new class with a method that prints what you need.
It looks like you've somehow defined file as a string rather than a file. What you should do is define it thus:
summary_file = open("C:/someFolder/someOtherFolder/Financial_Analysis_Summary.txt",
                    mode='r+', encoding='utf8')
and then write to it.
The first argument to the open function is the file path. The mode is how you want to access the file: 'r' lets you read the file and nothing else; 'r+' lets you read and write while leaving the preexisting text in place (although if you write to the middle of the file you'll still overwrite whatever was there). Both 'r' and 'r+' throw a FileNotFoundError if the file doesn't yet exist, while the other modes create it, so use 'w' instead if the summary file does not exist yet. 'w' deletes what was in the file and lets you write to it; 'a' lets you write text only to the end of the file; 'w+' and 'a+' are the same as 'w' and 'a' except that they also let you read from the file. You can add b to the end of any of these to work with the file as bytes rather than strings. The encoding only applies in text mode; set it to the same encoding you'll use to view the file (usually 'utf8') to avoid garbling non-ASCII characters.
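A short sketch of how these modes behave, assuming Python 3. The temporary folder and the placeholder summary line are made up for illustration.

```python
import os
import tempfile

# Hypothetical location for the summary file, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "Financial_Analysis_Summary.txt")

# 'r+' needs an existing file, so opening a missing one fails.
try:
    open(path, mode="r+", encoding="utf8")
    missing_raises = False
except FileNotFoundError:
    missing_raises = True

# 'w' creates the file (or empties an existing one) before writing.
with open(path, mode="w", encoding="utf8") as summary_file:
    summary_file.write("Financial Analysis\n")
    summary_file.write("----------------------------\n")

# 'a' appends to the end and keeps the existing text.
with open(path, mode="a", encoding="utf8") as summary_file:
    summary_file.write("Summary line (placeholder)\n")

# 'r' reads the result back.
with open(path, mode="r", encoding="utf8") as summary_file:
    lines = summary_file.read().splitlines()
```

Using a name such as summary_file for the file object, and keeping the path in a separate string variable, avoids calling write() on the string by mistake.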

Modify flow file attributes in NiFi with Python sys.stdout?

In my pipeline I have a flow file that contains some data I'd like to add as attributes to the flow file. I know in Groovy I can add attributes to flow files, but I am less familiar with Groovy and much more comfortable with using Python to parse strings (which is what I'll need to do to extract the values of these attributes). The question is, can I achieve this in Python when I use ExecuteStreamCommand to read in a file with sys.stdin.read() and write out my file with sys.stdout.write()?
So, for example, I use the code below to extract the timestamp from my flowfile. How do I then add ts as an attribute when I'm writing out ff?
import sys
ff = sys.stdin.read()
t_split = ff.split('\t')
ts = t_split[0]
sys.stdout.write(ff)
Instead of writing the entire file back, you can simply write the attribute value extracted from the input FlowFile:
sys.stdout.write(ts)  # the timestamp in your case
and then set the Output Destination Attribute property of the ExecuteStreamCommand processor to the desired attribute name.
The output of the stream command will then be put into an attribute of the original FlowFile, and that FlowFile can be found in the queue of the original relationship.
For more details, you can refer to ExecuteStreamCommand-Properties
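Put together, the script side of this approach might look like the sketch below. The function name write_timestamp and the sample record are hypothetical; the streams are passed in as arguments so the logic can be tried without NiFi.

```python
import io
import sys


def write_timestamp(stdin, stdout):
    """Read tab-separated content and write only its first field."""
    content = stdin.read()
    ts = content.split("\t")[0]
    stdout.write(ts)
    return ts


# Demonstration with in-memory streams and a made-up record. In the script
# run by the processor, the call would be write_timestamp(sys.stdin, sys.stdout).
demo_out = io.StringIO()
write_timestamp(io.StringIO("2021-01-01T00:00:00\tpayload\n"), demo_out)
```

Only the timestamp reaches stdout, so that is the value the processor would store in the configured attribute.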
If you're not importing any native (CPython) modules, you can try ExecuteScript with Jython rather than ExecuteStreamCommand. I have an example in Jython in an ExecuteScript cookbook. Note that you don't use stdin/stdout with ExecuteScript, instead you have to get the flow file from the session and either transfer it as-is (after you're done reading) or overwrite it (there are examples in the second part of the cookbook).

save string to a binary file in python

I would like to know a very basic thing about Python programming (I am a very basic programmer right now): how can I save a result (a list, a string, or whatever) to a file in Python?
I've been searching a lot, but I couldn't find any good answer to this.
I was thinking about the ".write()" method, but it doesn't seem to work with strings, for instance, and I don't really know what it is supposed to do.
My situation is that I have binary files which I would like to edit. I found it easy to convert them to strings and modify them, and now I'd like to save them i) back to binary files (JPEG images) and ii) in the folder I want.
How would I do that? Please, I need some help.
UPDATE
Here is the script I'm trying to run:
import os, sys
newpath= r'C:/Users/Umberto/Desktop/temporary'
if not os.path.exists (newpath):
    os.makedirs (newpath)
data= open ('C:/Users/Umberto/Desktop/Prove_Script/Varie/_BR_Browse.001_2065642654_1.BINARY', 'rb+')
edit_data= str (data.read () )
out_dir= os.path.join (newpath, 'feed', 'address')
data.close ()
# do my edits in a second time...
edit_data.write (newpath)
edit_data.close ()
The error I get is:
AttributeError: 'str' object has no attribute 'write'
UPDATE_2
I tried to use pickle module to serialize my binary file, modify it and save it at the end, but still not getting it to work... This is what I've been trying so far:
import cPickle as pickle
binary= open ('C:\Users\Umberto\Desktop\Prove_Script\Varie\_BR_Browse.001_2065642654_1.BINARY', 'rb')
out= open ('C:\Users\Umberto\Desktop\Prove_Script\Varie\preview.txt', 'wb')
pickle.dump (binary, out, 1)
TypeError Traceback (most recent call last)
<ipython-input-6-981b17a6ad99> in <module>()
----> 1 pprint.pprint (pickle.dump (binary, out, 1))
C:\Python27\ArcGIS10.1\lib\copy_reg.pyc in _reduce_ex(self, proto)
68 else:
69 if base is self.__class__:
---> 70 raise TypeError, "can't pickle %s objects" % base.__name__
71 state = base(self)
72 args = (self.__class__, base, state)
TypeError: can't pickle file objects
Another thing I didn't get is whether I am supposed to create a file to point to (in my case I had to create "out", otherwise I wouldn't have the right arguments for the pickle method) or whether that's not necessary.
I hope I'm getting close to the solution.
P.S.: I also tried pickle.dumps(), but the result was no better.
If you're opening a binary file and saving another binary file you could do something like this:
with open('file.jpg', 'rb') as jpgFile:
    contents = jpgFile.read()
# (some operations on contents here)
with open('file2.jpg', 'wb') as jpgFile:
    jpgFile.write(contents)
Some comments:
'rb' and 'wb' mean read and write in binary mode, respectively. More info on why 'b' is recommended when working with binary files here.
Python's with statement takes care of closing the file when exiting the block.
If you need to save lists, strings or other objects and retrieve them later, use pickle as others pointed out.
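Applied to the situation in the question, a sketch might look like this, assuming Python 3. All folder and file names are made up, and a tiny generated file stands in for the real binary input.

```python
import os
import tempfile

# Hypothetical folders and file names, used only for illustration.
base = tempfile.mkdtemp()
source_path = os.path.join(base, "input.BINARY")
out_dir = os.path.join(base, "temporary", "feed")
target_path = os.path.join(out_dir, "output.BINARY")

# Create a small binary file to stand in for the real input.
with open(source_path, "wb") as f:
    f.write(b"\x00\x01HEADER\xff\xfe")

# Read the raw bytes; there is no need to convert them with str().
with open(source_path, "rb") as f:
    data = f.read()

# Edit the bytes directly (bytes objects support replace, slicing, etc.).
edited = data.replace(b"HEADER", b"header")

# Make sure the output folder exists, then write the bytes to a file there.
os.makedirs(out_dir, exist_ok=True)
with open(target_path, "wb") as f:
    f.write(edited)
```

The write() call belongs to the file object returned by open(), not to the data, which is why calling edit_data.write() on a string raises the AttributeError.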
You can use the standard Python module named "pickle".
You can read about it here: pickle documentation
Reading and writing most data structures becomes very easy:
pickle.dump(obj, file_handler)  # serialize an object to a file
pickle.load(file_handler)  # deserialize an object from a file
or you can serialize to a string with pickle.dumps(...) and load from it with pickle.loads(...)
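A minimal round trip, assuming Python 3. The file name result.pkl and the sample data are made up. Note that pickle serializes Python objects such as lists and strings; passing an open file handle as the object to dump is what triggers "can't pickle file objects".

```python
import os
import pickle
import tempfile

# Hypothetical output path, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "result.pkl")

# Any picklable object works here: lists, strings, dicts, and so on.
result = ["a string", [1, 2, 3], {"key": "value"}]

# The file must be opened in binary mode for writing...
with open(path, "wb") as file_handler:
    pickle.dump(result, file_handler)

# ...and in binary mode for reading.
with open(path, "rb") as file_handler:
    restored = pickle.load(file_handler)
```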
