Parsing Gmail Labels into tree structure

Parsing Gmail Labels into tree structure - python

For those who don't know. If you have a gmail account, you can make various "folders" in your gmail account called labels. Now all these really are - are just a list of values with a name (its not actually a tree structure)
family
family/finance
However, Gmail will under the hood parse this into a tree structure.
The online gmail API will only return a list similar to what we have on the left
['family','family/eat/me','family/finance']
I want to parse this into a structure like
[{'name':'family','parent':None,'children':['family/eat/me','family/finance']},
{'name':'family/eat/me','parent':'family','children':[]},
{'name':'family/finance','parent':'family','children':[]}]
However, I am really stumped, anyone have any ideas on how to do this?

Related

Python: Lookup List of Emails Across Various URLs, Return Specific Information

Here's my dilemma.
I have a couple of excel workbooks, let's call them workbook-A and workbook-B.
Workbook-A holds a unique identifier, which happens to be a list of email addresses.
On workbook-B, I have a list of URLs, each URL opens a unique online excel workbook with information that includes emails as well.
Here is what I'm asking for help with.
What would be the best approach to have python take each email address on workbook-A, and look it up against the list of URLs in workbook-B, and then return which URL it found it on in workbook-A (on a blank column)?
Here is a diagram to further explain what I'm asking.
Logic Diagram

Task Automation Help: (no similar entries)

Problem: I’m a nurse, and part of my job is to pull up a list of “unsigned services”. Then of course, take these charts, and send them to the right person.
The person before me did not do this, leaving me THOUSANDS of charts to pull up by patient name and DOB, select the right document, and send to the right person.
I have figured out how to use selenium with python to automate logging in, using input to send keys to search the correct patient, and even to pull up the correct document that needs signed.
How do I have the program do this, for every chart? How do I have python work down the list of names and DOB’s without my having to manually put them in?
Anything I look for on my own is just examples of applying a basic function to a list of numbers and that isn’t my goal.
Thanks for your help!

Methods to extract email signature

I would like to extract email signatures from a single-column Pandas data frame where each row contains a discrete email message as a string. Some emails are HTML encoded and some are not. They can be of any email provider (e.g.: Gmail, Microsoft, Yahoo, etc.).
I know that Gmail signatures are contained in a div where class="email_signature" which simplifies parsing those. My dilemma is: what is the best way to extract non-gmail email signatures? Is there a regex that captures the content of an email? How can I apply this regex over the Pandas data frame in Python?
I'd provide an example but the data is private and frankly I don't think it's necessary for this question.

Checkout SigParser.com. It is an API for doing this pretty much. It uses email signatures to extract contact data. Is this what you're looking for?

How do I pass around data between pages in a GAE app?

I have a fairly basic GAE app that takes some input, fetches some data from a webpage, parses, then presents it to the user. Right now, the fairly spare input HTML form POSTs the arguments to the output 'file' which is wholly generated by the handler for that URL.
I'd like to do a couple things with the data (e.g. graph it at a landing page perhaps, then write it to an output file), but I don't know how I should pass the parsed data between the different handlers. I could maybe encode it then successively POST it to other handlers, but my gut says that I shouldn't need to HTTP the data back and forth within my app—it seems terribly inefficient (my gut is also hungry...).
In fairly broad swaths (or maybe a link to an example), how should my handlers handle this?
Later thoughts (edit)
My very rough idea now is to have the form submitted to a page that 1) enters the subsequent query into a database (datastore?) keyed to some hash, then uses that to 2) grab and parse all the data. The parsed data would be stored in memory (memcache?) for near-immediate use to graph it and/or process it into a variety of tabular formats for download. The script that does said parsing redirects to a unique URL based on the hash which can be referred to in order to get the data.
The thought would be that you could save the URL, then if you visit it later when the data has been lost, it can re-query the source to get it back/update it.
Reasonable? Should I be looking at other things?

Outlook automation / Object Model query (Python) - unique ID for each email?

I've been using the MSDN to build a simple app that automates some Outlook 2010 basics.
It's going well, but I'm just stuck at something simple I think.
My question is this:
I've been able to get objects based on email folders, and even emails, and iterate through them, outputting email subjects as strings, or folder names as strings.
I've been able to get the info into listboxes, but I'm wondering, let's say I want to do something with a specific email I have selected in a listbox, does anybody know if the mailitem object has a property like a unique ID that I could have hidden somewhere or in a SQLite DB that I could use as a reference to do something with said email instead of having to search through the folder again by subject or name ?
The same question kinda applies to what I'm doing to find a specific folder, looping through the inbox folder and if I find the folder by name, then output that folder object. Surely there's a more efficient way to search by name in one step, without looping through folders to find subfolders etc ?
This isn't necessarily a python question, more about how the objects work.
Any help is much appreciated
MSDN Links:
Outlook Object Model Reference
http://msdn.microsoft.com/en-us/library/office/ff870566%28v=office.14%29.aspx
Folders Object
http://msdn.microsoft.com/en-us/library/office/ff870798%28v=office.14%29.aspx
Items Object
http://msdn.microsoft.com/en-us/library/office/ff870897%28v=office.14%29.aspx
MailItem Object
http://msdn.microsoft.com/en-us/library/office/ff870912%28v=office.14%29.aspx

You are probably looking for EntryID.
But please be aware that this ID is only unique/constant per .pst file.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Parsing Gmail Labels into tree structure - python

Related

Python: Lookup List of Emails Across Various URLs, Return Specific Information

Task Automation Help: (no similar entries)

Methods to extract email signature

How do I pass around data between pages in a GAE app?

Outlook automation / Object Model query (Python) - unique ID for each email?

Categories

Resources