I am using a library that has a function like the one below, which expects a file location. The thing is, I store the contents of this file in AWS, so I can use the AWS API to securely return the values stored in that resource as a string. How can I transform this string value so it can be passed to a function that expects a file path, without actually writing the string values to the local directory as a .txt or .json file? (I know I can convert the string to a JSON object, but I'm not certain how that solves the fact that this function is looking for a path.) Function below:
file_location = 'String values I get returned when I return it from its safe and secure location'
password = credential_function.from_keyfile_name(file_location)
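Since the credential library isn't identified here, the following is only a sketch of one common workaround, assuming the function genuinely requires a real path on disk: write the secret to a short-lived temporary file (which lives in the system temp directory rather than the working directory) and delete it as soon as it has been read. If the library also exposes a dict/info-based constructor, json.loads(secret_string) avoids files entirely.

# Sketch only: `credential_function.from_keyfile_name` is the unidentified call
# from the question and is assumed to accept any readable path.
import os
import tempfile

secret_string = 'String values I get returned when I return it from its safe and secure location'

with tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False) as tmp:
    tmp.write(secret_string)          # the secret only ever exists in the temp directory
    tmp_path = tmp.name

try:
    password = credential_function.from_keyfile_name(tmp_path)
finally:
    os.remove(tmp_path)               # remove the temporary copy immediately after use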
We have a Pyspark pair RDD which stores the path of .owl files as key and the file contents as value.
I wish to carry out reasoning using Owlready2. To load an ontology from OWL files, the get_ontology() function is used. However, the given function expects an IRI (a sort of URL) to the file, whereas I have the file contents as a str in Python.
Is there a way I could make this work out?
I have tried the following:
Used get_ontology(file_contents).load() --> this obviously does not work as the function expects a file path.
Used get_ontology(file_contents) --> no error, but the ontology does not get loaded, so reasoning does not happen.
Answering my own question.
The load() function in Owlready2 has a couple more arguments that are not mentioned anywhere in the documentation. The function definitions of the package can be seen here.
Quoting from there, def load(self, only_local = False, fileobj = None, reload = False, reload_if_newer = False, **args) is the function signature.
We can see that a fileobj can also be passed, which is None by default. Further, the line fileobj = open(f, "rb") tells us that the file needs to be read in binary mode.
Taking all this into consideration, the following code worked for our situation:
from io import BytesIO                   # to create a file-like object
from owlready2 import get_ontology

my_str = RDDList[1][1]                   # the pair-RDD cell with the string data
my_str_as_bytes = str.encode(my_str)     # convert to bytes, since load() reads the file in binary mode
fobj = BytesIO(my_str_as_bytes)

abox = get_ontology("some-random-path").load(fileobj=fobj)  # the path is insignificant; notice the fileobj=fobj
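For completeness, a minimal sketch of the reasoning step that can follow, assuming the default world is used and a Java runtime is available (Owlready2 drives the HermiT reasoner through Java):

from owlready2 import sync_reasoner

with abox:           # inferred facts get written into this ontology
    sync_reasoner()  # runs HermiT by default; requires Java on the PATH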
I have a .txt file which contains a variable for my model. If I copy and paste the contents of the file into my program as
def Wind_phi_definition(model, i):
    return m.Wind_phi[i] == -19.995904426195736*Sinc(0.04188790204786391*(0. + m.lammda[i]*180/np.pi))*Sinc(0.08975979010256552*(-65. + m.phi[i]*180/np.pi))

m.Wind_phi_const = Constraint(m.N, rule=Wind_phi_definition)
The code executes without issue. I want to speed this up by making the program read directly from the .txt file.
I have tried to read the variable as
f = open("savedWindXPython.txt", "r")
a = f.readline()
f.close()

def Wind_lammda_definition(model, i):
    return m.Wind_phi[i] == a

m.Wind_phi_const = Constraint(m.N, rule=Wind_lammda_definition)
However, an error is returned
AttributeError: 'str' object has no attribute 'is_relational'
I understand that this happens because Python is reading this as a string and not as a Pyomo variable. I have attempted to get around this problem by using exec(a) instead of just a in the definition of m.Wind_phi. However, I still obtain an error, this time it says
'NoneType' object has no attribute 'is_relational'
Is there a way to do what I want and define the variable by reading the .txt file instead of having to copy its contents by hand?
What you are trying to achieve is called unmarshalling (or deserialization).
If your variable is a simple number, deserializing it is as easy as int(my_string_var). If it looks like the one you linked (where you need function calls in it), there is more work to be done. The library you're using could also provide some serialization/deserialization feature; unfortunately for you, it seems that Pyomo doesn't provide this at the moment.
So, if you have access to whoever or whatever is writing your .txt files in the first place, consider serializing the data into JSON (or whatever data-storage format you're most comfortable with; if you're not comfortable with any, try JSON). Then you will know how to deserialize it in your code.
Otherwise, you will have to write a string parser to extract the data from your string and deserialize it into Python objects. This could be trivial if you're only trying to parse this specific variable, but could also end up being rather challenging if you are trying to deserialize a lot of these variables.
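As a rough illustration of the JSON route, assuming you control the writer: store the numbers that define the expression rather than its source code, and rebuild the expression in the model. The file name and key names below are made up for the example; m, Sinc and np are the objects already used in the question.

import json

# Writer side: dump the coefficients (illustrative names) instead of Python source.
params = {
    "amplitude": -19.995904426195736,
    "k_lammda": 0.04188790204786391,
    "k_phi": 0.08975979010256552,
    "phi_offset": -65.0,
}
with open("wind_phi_params.json", "w") as f:
    json.dump(params, f)

# Model side: deserialize the numbers and rebuild the Pyomo expression itself.
with open("wind_phi_params.json") as f:
    p = json.load(f)

def Wind_phi_definition(model, i):
    return m.Wind_phi[i] == p["amplitude"] \
        * Sinc(p["k_lammda"] * (0. + m.lammda[i] * 180 / np.pi)) \
        * Sinc(p["k_phi"] * (p["phi_offset"] + m.phi[i] * 180 / np.pi))

m.Wind_phi_const = Constraint(m.N, rule=Wind_phi_definition)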
I'm trying to build an AWS lambda function that accepts a file upload and then parses it in memory. The file is an xlsx file, and the content comes in to the lambda function looking like this in the body key of the event:
Beginning:
----------------------------300017151060007960655534
Content-Disposition: form-data; name="tag_list"; filename="test-list.xlsx"
Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
PK
�y�N docProps/PK
And the end of the string looks like this:
[Content_Types].xmlPK�;
----------------------------475068475306850797919587--
If I do a head/tail of the actual file on my computer, it appears that the file starts at the PK and ends at the xmlPK�;. I've attempted to slice this section out and create a BytesIO object or a SpooledTemporaryFile, but none of these options work. They all give me something like invalid seek position, or bad zip file errors.
My goal is to load this xlsx file into memory and then parse it using openpyxl.
My current function looks a little something like this. I keep trying to format it differently; sometimes I decode it, sometimes not.
def lambda_handler(event, context):
    file_index = event['body'].index('PK')
    file_string = event['body'][file_index:]
    file_end = file_string.index(';')
    file = file_string[:file_end].encode('utf-8')
I then try to pass the file string into BytesIO or a SpooledTemporaryFile, but they all give me errors...
Note, I do NOT want to use S3.
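One possible direction, sketched under an assumption the question doesn't state: if API Gateway is configured with binary media types, event['body'] arrives base64-encoded (with event['isBase64Encoded'] set), so the xlsx bytes are never mangled by text decoding, and the multipart part can be cut out and handed straight to openpyxl in memory.

import base64
from io import BytesIO

from openpyxl import load_workbook

def lambda_handler(event, context):
    body = event['body']
    if event.get('isBase64Encoded'):
        body = base64.b64decode(body)          # intact bytes, thanks to the binary media type
    else:
        body = body.encode('latin-1')          # best effort; binary may already be corrupted here

    # Crude multipart extraction for illustration only: the payload sits between
    # the blank line after the part's headers and the closing boundary.
    boundary = body.split(b'\r\n', 1)[0]
    payload = body.split(b'\r\n\r\n', 1)[1]
    payload = payload.rsplit(b'\r\n' + boundary, 1)[0]

    workbook = load_workbook(BytesIO(payload))  # parse the xlsx entirely in memory
    return {"statusCode": 200, "body": "sheets: {}".format(workbook.sheetnames)}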
I am using
csvFile=open("yelloUSA.csv",'w+')
to make a new csv file.
Now I want to write files named 1 to 10, but I want to use formatting to build the names. How can I do that?
I used the code below, but it shows an error:
for i in range(0,10):
    csvFile=open("{0}.csv",'w+').format(i)
    writer=csv.writer(csvFile)
format is a method of a string object. You are applying it to the result of the open function, which is a file object. You need to apply it to the filename string:
filename = "{0}.csv".format(i)
csvFile=open(filename ,'w+')
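Putting it together with the loop from the question (a sketch; the row written at the end is just a placeholder):

import csv

for i in range(1, 11):                                  # files named 1.csv through 10.csv
    filename = "{0}.csv".format(i)
    with open(filename, 'w', newline='') as csvFile:    # the with-block closes the file automatically
        writer = csv.writer(csvFile)
        writer.writerow(["example", i])                 # placeholder row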
Essentially I want to do a sort of makeshift super word count, but I'm uncertain how to create a dict object from a directory path (passed in as an argument), as opposed to a list, to do what I need to do.
While I want to create a dictionary object, I also want to turn the ASCII contents of the keys, which are filenames, into email or message objects using the email module. Then I want to extract the body using the payload and parse it that way. I have an example below:
mylist = os.listdir(sys.stdin)
for emails in mylist:
    email_str = emails.open()
    # uncertain if this will get all emails and their content or not
    # all emails are supposed to have a unique identifier, they are essentially still just ascii
    file_dict = {emails: email_str}
    # file_dict = dict(zip(mylist, mylist))

for emails in file_dict[emails]:
    msg = email.message_from_string(email_str)
    body = msg.get_payload(decode=True)
    # I'm not entirely sure how message objects and sub objects work, but I want the header to
    # signature and I'm not sure about the type of emails as far as header style
    # pretend I have a parsing method here that implements the word count and prints it as a dict:
    body.parse(regex)
I don't entirely need the keys other than to parse their values so I may consider using message_from_file instead.
You can use any string as a file path, and you can even use relative file paths. If you're just trying to format data for yourself, you could iterate through your list of email paths and store the output:
file_dict = {}
for emailpath in list_of_email_paths:   # e.g. emailpath = 'someemailpath'
    # open the path -- only works if the path exists
    f = open(emailpath)
    file_dict[emailpath] = f.read()
    f.close()
It's not a great idea to use open file objects as keys (if that's even possible); just read them and store a string as an identifier. Read the docs on os.path for more (btw, you have to import with import os, not import os.path).
Aside from that, any immutable object or reference can be a dictionary key, so there's no problem with storing paths as keys. Python doesn't care where the path came from, nor does the dict care if its keys are paths ;)
Unfortunately, because you are asking to be shown so much information at once, my answer has to be a bit more general to give an overview. Even though you stated that your example is all purely pseudocode, it's all so completely wrong that it's hard to know what you understand and what parts you don't, so I will cover all the bases of what you said in your comments.
How to read files
You are misusing os.listdir, as it takes a string path, not a file-type object. But personally, for this I like to use glob. It saves a few steps by letting you get the full path, filtered by a pattern. Let's assume all your email files end in .mail:
import sys
import glob

first_path = sys.argv[1]
pattern = "%s/*.mail" % first_path

for mail in glob.iglob(pattern):
    # the with context will close the file automatically
    with open(mail) as f:
        data = f.read()
        # do something with data here
Parsing email formats
The examples for using the email module are extensive, so there is no point in my showing them here other than giving you a link to review: http://docs.python.org/library/email-examples.html
If the files are actually emails, then you should be able to use this module to parse them and read the message body of each one, along the lines of the sketch below.
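A minimal sketch of that step, assuming each file is a plain RFC 2822 message (the helper name is made up for illustration):

import email

def message_body_from_text(raw_text):
    """Parse one raw email string and return its body text."""
    msg = email.message_from_string(raw_text)
    if msg.is_multipart():
        # take the first text/plain part; real mail can be more involved
        for part in msg.walk():
            if part.get_content_type() == "text/plain":
                return part.get_payload(decode=True).decode(errors="replace")
        return ""
    payload = msg.get_payload(decode=True)
    return payload.decode(errors="replace") if payload else ""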
Using a dictionary
Using a dictionary is no different in this case than in any general case of a Python dict. You would start by creating an empty dict:
file_dict = {}
And on every loop of your directory listing you will always have the string path name, which you wanted to be your key. Whether you read the file's raw data using the first example or used the email module to get the message body, either way you will end up with some chunk of text data:
for mail in glob.iglob(pattern):
    ...
    # do stuff to get the text data from the file
    data = some_approach_to_reading_file()
    ...
    file_dict[mail] = data
Now you have a file_dict with the key being the path to the original file, and the value being the read data.
Summary
With these three sections, you should have plenty of general information to put this all together.