Forwarded Email parsing in Python/Any other language?

Forwarded Email parsing in Python/Any other language? - python

I have some mails in txt format, that have been forwarded multiple times.
I want to extract the content/the main body of the mail. This should be at the last position in the hierarchy..right? (Someone point this out if I'm wrong).
The email module doesn't give me a way to extract the content. if I make a message object, the object doesn't have a field for the content of the body.
Any idea on how to do it? Any module that exists for the same or any any particular way you can think of except the most naive one of-course of starting from the back of the text file and looking till you find the header.
If there is an easy or straightforward way/module with any other language ( I doubt), please let me know that as well!
Any help is much appreciated!

The email module doesn't give me a way to extract the content. if I make a message object, the object doesn't have a field for the content of the body.
Of course it does. Have a look at the Python documentation and examples. In particular, look at the walk and payload methods.

Try get_payload on the parsed Message object. If there is only one message, the return type will be string, otherwise it will be a list of Message objects.
Something like this:
messages = parsed_message.get_payload()
while type(messages) <> Types.StringType:
messages = messages[-1].get_payload()

Related

Splinter find_by_css excluding a class item

class="conversation hasLabels read"
Hello, everyone, I am trying to access an unread email using a for loop and specifying a class
browser.find_by_css(.conversation.hasLabels.hasAttachments)
The problem here is some emails have class="read" so when the for loop executes, it takes all the read ones also but that is a problem since the emails don't have an unread element.
For better understanding, I would like to access a class providing exclusive parameters.

Apparently using the script this way grabs only what you ask so this solves my problem.
browser.find_by_css('div[class="conversation hasLabels hasAttachments"]')

You can just add a :not
.conversation.hasLabels.hasAttachments:not(.read)

It can help you if I understood the essence of the problem correctly:
browser.find_by_xpath('//*[contains(#class, "conversation")][contains(#class, "hasLabels")][not(contains(#class, "read"))]')

Replace words from an API in python

I am using an API I found online for one of my scripts, and I am wondering if I can change one word from the API to something else. My code is:
import requests
people = requests.get('https://insult.mattbas.org/api/insult')
print("Welcome to the insult machine!\nType somebody you want to insult!")
b = input()
print(people.replace("You", b))
Is replace not a command? If so, what plugin and/or commands would I need to do it? Thanks!

The value returned from requests.get isn’t a string, it’s a response object and that class has no replace method.
Have a look at the structure of that class. For example, you can do r = requests.get(...) and r.text.replace(...).
In other words, you need to operate on the text part of the response object.

struggling to parse an object using jsonlines

I'm having trouble parsing the body of a request using jsonlines. I'm using tornado as the server and this is happening inside a post() method.
My purpose in this is to parse the request's body into separate JSONs, then iterate over them with a jsonlines Reader, do some work on each one and then push them to a DB.
I solved this problem by dumping the utf-8 encoded body into a file and then used:
with jsonlines.open("temp.txt") as reader:
That works for me. I can iterate over the entire file with
for obj in reader:
I just feel like this is an unnecessary overhead that can be reduced if I can understand what's keeping me from just using this bit of code instead:
log = self.request.body.decode("utf-8")
with jsonlines.Reader(log) as reader:
for obj in reader:
the exception I get is this:
jsonlines.jsonlines.InvalidLineError: line contains invalid json:
Expecting property name enclosed in double quotes: line 1 column 2
(char 1) (line 1)
I've tried searching for this error here and all I found were examples where people tried using incorrectly formatted jsons that have one quote instead of double quotes. That is not the case for me. I debugged the request and saw that the string that returns from the decode method indeed has double quotes for both properties and values.
here is a sample of the body of the request I send (this is what it looks like in Postman):
{"type":"event","timestamp":"2018-03-25 09:19:50.999","event":"ButtonClicked","params":{"screen":"MainScreen","button":"SettingsButton"}}
{"type":"event","timestamp":"2018-03-25 09:19:51.061","event":"ScreenShown","params":{"name":"SettingsScreen"}}
{"type":"event","timestamp":"2018-03-25 09:19:53.580","event":"ButtonClicked","params":{"screen":"SettingsScreen","button":"MissionsButton"}}
{"type":"event","timestamp":"2018-03-25 09:19:53.615","event":"ScreenShown","params":{"name":"MissionsScreen"}}
You can reproduce the exception by using this simple bit of code in a post method and sending the lines I provided through Postman:
log = self.request.body.decode("utf-8")
with jsonlines.Reader(log) as currentlog:
for obj in currentlog:
print("obj")
As a sidenote: Postman sends the data as text, not JSON.
If you need any more information to answer this question, please let me know.
One thing I did notice is that the string that returns from the decode method starts and ends with one quote. I guess this is because of the double quotes in the JSONs themselves. Is it related in any way?
An example:
'{"type":"event","timestamp":"2018-03-25 09:19:50.999","event":"ButtonClicked","params":{"screen":"MainScreen","button":"SettingsButton"}}'
Thanks for any help!

jsonlines.Reader accepts iterable as an arg ("The first argument must be an iterable that yields JSON encoded strings" not json-encoded single string as in your example), but, after .decode("utf-8"), log would be a string, which happen to support iterable interface. So when reader calls under the hood next(log) it will get first item of a log string, i.e. character { and will try to process it as an json-line which would be obviously invalid. Try log = log.split() before passing log to the Reader.

Python filter for postfix

I am trying to make a simple Python filter for postfix, to add in a 'Reply-to' header to certain messages.
What I've done so far is to take the email from stdin, and parse it into an email object like so:
raw = sys.stdin.readlines()
msg = email.message_from_string(''.join(raw))
Then I've played with headers etc.
msg.add_header('Reply-to', 'foo#bar.com')
And now want to re-inject that back into postfix. Reading the filter readme associated with postfix, I should pass it back using the 'sendmail' command. However, I'm not sure how to pass the email object over to sendmail, for example using subprocess's 'call()' or whether I should use the smtplib's 'smtplib.SMTP()'?
What would be the 'correct' method?
Thanks

You should be able to use both methods, but smtplib.SMTP() is more flexible and makes the error handling easier.
If you need an example, have a look at my framework for python filters:
https://github.com/gryphius/fuglu/blob/master/fuglu/src/fuglu/connectors/smtpconnector.py#L67
the re_inject method does exactly that (FUSMTPClient is a subclass of smtplib.SMTP), so basically it boils down to:
client = smtplib.SMTP('127.0.0.1',<yourportnumber for the receiving postfix instance>)
client.sendmail(<envelope from>, <envelope to>, <yourmessageobject>.as_string())

Passing a Python list using JSON and Django

I'm trying to send a Python list in to client side (encoded as JSON). This is the code snippet which I have written:
array_to_js = [vld_id, vld_error, False]
array_to_js[2] = True
jsonValidateReturn = simplejson.dumps(array_to_js)
return HttpResponse(jsonValidateReturn, mimetype='application/json')
How do I access it form client side? Can I access it like the following?
jsonValidateReturn[0]
Or how do I assign a name to the returned JSON array in order to access it?
Actually I'm trying to convert a server side Ajax script that returns an array (see Stack Overflow question Creating a JSON response using Django and Python that handles client side POST requests, so I wanted the same thing in return with Python, but it didn't go well.

The JSON array will be dumped without a name / assignment.
That is, in order to give it a name, in your JavaScript code you would do something like this:
var my_json_data_dump = function_that_gets_json_data();
If you want to visualize it, for example, substitute:
var my_json_data_dump = { 'first_name' : Bob, 'last_name': smith };
Also, like Iganacio said, you're going to need something like json2.js to parse the string into the object in the last example. You could wrap that parsing step inside of function_that_gets_json_data, or if you're using jQuery you can do it with a function like jQuery.getJSON().
json2.js is still nice to have, though.
In response to the comment (I need space and markup):
Yes, of course. All the Python side is doing is encoding a string representation (JSON) for you. You could do something like 'var blah = %s' % json.dumps(obj_to_encode) and then on the client side, instead of simply parsing the response as JSON, you parse it as JavaScript.
I wouldn't recommend this for a few reasons:
You're no longer outputting JSON. What if you want to use it in a context where you don't want the variable name, or can't parse JavaScript?
You're evaluating JavaScript instead of simply parsing JSON. It's an operation that's open to security holes (if someone can seed the data, they might be able to execute a XSS attack).
I guess you're facing something I think every Ajax developer runs in to. You want one place of truth in your application, but now you're being encouraged to define variables and whatnot in JavaScript. So you have to cross reference your Python code with the JavaScript code that uses it.
I wouldn't get too hung up on it. I can't see why you would absolutely need to control the name of the variable from Python in this manner. If you're counting on the variable name being the same so that you can reference it in subsequent JavaScript or Python code, it's something you might obviate by simply restructuring your code. I don't mean that as a criticism, just a really helpful (in general) suggestion!

If both client and server are in Python, here's what you need to know.
Server. Use a dictionary to get labels on the fields. Write this as the response.
>>> import json
>>> json.dumps( {'vld_id':1,'vls_error':2,'something_else':True} )
'{"vld_id": 1, "something_else": true, "vls_error": 2}'
Client. After reading the response string, create a Python dictionary this way.
>>> json.loads( '{"vld_id": 1, "something_else": true, "vls_error": 2}' )
{u'vld_id': 1, u'something_else': True, u'vls_error': 2}

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Forwarded Email parsing in Python/Any other language? - python

The email module doesn't give me a way to extract the content. if I make a message object, the object doesn't have a field for the content of the body. Of course it does. Have a look at the Python documentation and examples. In particular, look at the walk and payload methods.

Try get_payload on the parsed Message object. If there is only one message, the return type will be string, otherwise it will be a list of Message objects. Something like this: messages = parsed_message.get_payload() while type(messages) <> Types.StringType: messages = messages[-1].get_payload()

Related

Splinter find_by_css excluding a class item

Replace words from an API in python

struggling to parse an object using jsonlines

Python filter for postfix

Passing a Python list using JSON and Django

Categories

Resources