extract data from gmail add to spreadsheet- Google apps script

I have searched, copied and modified code, and tried to break down what others have done, and I still can't get this right.
I have email receipts from an ecommerce website, and I am trying to harvest particular details from each email and save them to a spreadsheet with a script.
Here is the entire script as I have it now.
function menu(e) {
var ui = SpreadsheetApp.getUi();
ui.createMenu('programs')
.addItem('parse mail', 'grabreceipt')
.addToUi();
}
function grabreceipt() {
var ss = SpreadsheetApp.getActiveSheet();
var ss = SpreadsheetApp.getActiveSpreadsheet();
var s = ss.getSheetByName("Sheet1");
var threads = GmailApp.search("(subject:order receipt) and (after:2016/12/01)");
var a=[];
for (var i = 0; i<threads.length; i++)
{
var messages = threads[i].getMessages();
for (var j=0; j<messages.length; j++)
{
var messages = GmailApp.getMessagesForThread(threads[i]);
for (var j = 0; j < messages.length; j++) {
a[j]=parseMail(messages[j].getPlainBody());
}
}
var nextRow=s.getDataRange().getLastRow()+1;
var numRows=a.length;
var numCols=a[0].length;
s.getRange(nextRow,1,numRows,numCols).setValues(a);
}
function parseMail(body) {
var a=[];
var keystr="Order #,Subtotal:,Shipping:,Total:";
var keys=keystr.split(",");
var i,p,r;
for (i in keys) {
//p=keys[i]+(/-?\d+(,\d+)*(\.\d+(e\d+)?)?/);
p=keys[i]+"[\r\n]*([^\r^\n]*)[\r\n]";
//p=keys[i]+"[\$]?[\d]+[\.]?[\d]+$";
r=new RegExp(p,"m");
try {a[i]=body.match(p)[1];}
catch (err) {a[i]="no match";}
}
return a;
}
}
So the email data to pluck from comes as text only like this:
Order #89076
(body content, omitted)
Subtotal: $528.31
Shipping: $42.66 via Priority Mail®
Payment Method: Check Payment- Money order
Total: $570.97
Note: mywebsite order 456. Customer asked about this and that... etc.
The original code regex was designed to grab content, following the keystr values which were easily found on their own line. So this made sense:
p=keys[i]+"[\r\n]*([^\r^\n]*)[\r\n]";
This works okay, but it also captures everything that follows on the line, as in the line Shipping: $42.66 via Priority Mail®.
My data is more blended, and I only want to take the numbers, including decimals. So I have this instead, which validates on regex101.com:
p=keys[i]+"[\$]?[\d]+[\.]?\d+$";
The expression on its own, [\$]?[\d]+[.]?\d+$, works great, but I still get "no match" for each row.
Additionally, this search returns 22 threads, yet it populates 39 rows in the spreadsheet. I cannot figure out why 39.

Your regex is not working as expected because you are not escaping the "\" characters in the string you use to create the regex.
So a regex like this
"\s?\$?(\d+\.?\d+)"
needs to be escaped like so:
"\\s?\\$?(\\d+\\.?\\d+)"
The code below is your parseMail() modified to work as a snippet here. If you copy it into your Apps Script code, delete the document.getElementById() lines.
You can try your example in the snippet below; it will only give you the numbers.
function parseMail(body) {
if(body == "" || body == undefined){
var body = document.getElementById("input").value
}
var a=[];
var keystr="Order #,Subtotal:,Shipping:,Total:";
var keys=keystr.split(",");
var i,p,r;
for (i in keys) {
p=keys[i]+"\\s?\\$?(\\d+\\.?\\d+)";
r=new RegExp(p,"m");
try {a[i]=body.match(r)[1];} // match against the compiled RegExp r
catch (err) {a[i]="no match";}
}
document.getElementById("output").innerHTML = a.join(";")
return a;
}
<textarea id ="input"></textarea>
<div id= "output"></div>
<input type = "button" value = "Parse" onclick = "parseMail()">
Hope that helps
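For comparison, here is a minimal Python sketch of the same extraction. It is not Apps Script and not the asker's code. It only illustrates the escaping point: in Python a raw string (r"...") passes the backslashes through to the regex engine unchanged, which plays the role of the doubled backslashes above. The names parse_mail, KEYS and NUMBER are made up for this illustration, and the sample body is the one from the question.

```python
import re

# Labels to look for, taken from the question's keystr.
KEYS = ["Order #", "Subtotal:", "Shipping:", "Total:"]

# Raw string: "\s", "\$" and "\d" reach the regex engine as written.
# Same pattern as in the answer: optional space, optional "$", then a number.
NUMBER = r"\s?\$?(\d+\.?\d+)"


def parse_mail(body):
    """Return the number following each key, or "no match" if it is absent."""
    values = []
    for key in KEYS:
        # re.escape keeps characters in the label from being read as regex syntax.
        match = re.search(re.escape(key) + NUMBER, body)
        values.append(match.group(1) if match else "no match")
    return values


sample = "\n".join([
    "Order #89076",
    "(body content, omitted)",
    "Subtotal: $528.31",
    "Shipping: $42.66 via Priority Mail®",
    "Payment Method: Check Payment- Money order",
    "Total: $570.97",
])

print(parse_mail(sample))  # ['89076', '528.31', '42.66', '570.97']
```

Note that the pattern \d+\.?\d+ needs at least two digits, so a single-digit value would come back as "no match".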

Related

Python InDesign scripting: Get overflowing textbox from preflight for automatic resizing

Thanks to this great answer I was able to figure out how to run a preflight check for my documents using Python and the InDesign script API. Now I wanted to work on automatically adjusting the text size of the overflowing text boxes, but was unable to figure out how to retrieve a TextBox object from the Preflight object.
I referred to the API specification, but all the properties only seem to yield strings which do not uniquely define the TextBoxes, like in this example:
Errors Found (1):
Text Frame (R=2)
Is there any way to retrieve the violating objects from the Preflight, in order to operate on them later on? I'd be very thankful for additional input on this matter, as I am stuck!

If all you need is to find and fix the overset errors, I'd propose this solution.
Here is a simple ExtendScript to fix the text overset error. It decreases the font size in all overflowing text frames in the active document:
var doc = app.activeDocument;
var frames = doc.textFrames.everyItem().getElements();
var f = frames.length
while(f--) {
var frame = frames[f];
if (frame.overflows) resize_font(frame)
}
function resize_font(frame) {
app.scriptPreferences.enableRedraw = false;
while (frame.overflows) {
var texts = frame.parentStory.texts.everyItem().getElements();
var t = texts.length;
while(t--) {
var characters = texts[t].characters.everyItem().getElements();
var c = characters.length;
while (c--) characters[c].pointSize = characters[c].pointSize * .99;
}
}
app.scriptPreferences.enableRedraw = true;
}
You can save it in any folder and run it from the Python script:
import win32com.client

app = win32com.client.Dispatch('InDesign.Application.CS6')
doc = app.Open(r'd:\temp\test.indd')
profile = app.PreflightProfiles.Item('Stackoverflow Profile')
print('Profile name:', profile.name)

process = app.PreflightProcesses.Add(doc, profile)
process.WaitForProcess()
errors = process.processResults
print('Errors:', errors)

if errors[:4] != 'None':
    script = r'd:\temp\fix_overset.jsx'  # <-- here is the script to fix overset
    print('Run script', script)
    app.DoScript(script, 1246973031)  # run the jsx script
    # 1246973031 --> ScriptLanguage.JAVASCRIPT
    # https://www.indesignjs.de/extendscriptAPI/indesign-latest/#ScriptLanguage.html
    process = app.PreflightProcesses.Add(doc, profile)
    process.WaitForProcess()
    errors = process.processResults
    print('Errors:', errors)  # it should print 'None'
    if errors[:4] == 'None':
        doc.Save()

doc.Close()
input('\nDone... Press <ENTER> to close the window')

Thanks to the excellent answer from Yuri I was able to solve my problem, although there are still some shortcomings.
In Python, I load my documents and check whether the preflight detects any problems. If so, I move on to adjusting the text frames.
myDoc = app.Open(input_file_path)
profile = app.PreflightProfiles.Item(1)
process = app.PreflightProcesses.Add(myDoc, profile)
process.WaitForProcess()
results = process.processResults
if "None" not in results:
    # Fix errors
    script = open("data/script.jsx")
    app.DoScript(script.read(), 1246973031, variables.resize_array)
    process.WaitForProcess()
    results = process.processResults
    # Check if problems were resolved
    if "None" not in results:
        info_fail(card.name, "Error while running preflight")
        myDoc.Close(1852776480)
        return FLAG_PREFLIGHT_FAIL
I load the JavaScript file stored in script.jsx, that consists of several components. I start by extracting the arguments and loading all the pages, since I want to handle them individually. I then collect all text frames on the page in an array.
var doc = app.activeDocument;
var pages = doc.pages;
var resizeGroup = arguments[0];
var condenseGroup = arguments[1];
// Loop over all available pages separately
for (var pageIndex = 0; pageIndex < pages.length; pageIndex++) {
var page = pages[pageIndex];
var pageItems = page.allPageItems;
var textFrames = [];
// Collect all TextFrames in an array
for (var pageItemIndex = 0; pageItemIndex < pageItems.length; pageItemIndex++) {
var candidate = pageItems[pageItemIndex];
if (candidate instanceof TextFrame) {
textFrames.push(candidate);
}
}
What I wanted to achieve was a setup where, if one text frame in a group overflows, the text size of all the text frames in that group is adjusted as well. E.g. text frame 1 overflows when set to size 8, but no longer when set to size 6. Since text frame 1 is in the same group as text frame 2, both of them will be adjusted to size 6 (assuming the second frame does not overflow at this size).
To handle this, I pass an array containing the groups. I then check whether the text frame is contained in one of these groups (which is rather tedious; I had to write my own methods, since InDesign's ExtendScript does not support modern functions like filter(), as far as I know).
// Check if TextFrame overflows, if so add all TextFrames that should be the same size
for (var textFrameIndex = 0; textFrameIndex < textFrames.length; textFrameIndex++) {
var textFrame = textFrames[textFrameIndex];
// If text frame overflows, adjust it and all the frames that are supposed to be of the same size
if (textFrame.overflows) {
var foundResizeGroup = filterArrayWithString(resizeGroup, textFrame.name);
var foundCondenseGroup = filterArrayWithString(condenseGroup, textFrame.name);
var process = false;
var chosenGroup, type;
if (foundResizeGroup.length > 0) {
chosenGroup = foundResizeGroup;
type = "resize";
process = true;
} else if (foundCondenseGroup.length > 0) {
chosenGroup = foundCondenseGroup;
type = "condense";
process = true;
}
if (process) {
var foundFrames = findTextFramesFromNames(textFrames, chosenGroup);
adjustTextFrameGroup(foundFrames, type);
}
}
}
If this is the case, I adjust either the text size or the second axis of the text (which condenses the text for my variable font). This is done using the following functions:
function adjustTextFrameGroup(resizeGroup, type) {
// Check if some overflowing textboxes
if (!someOverflowing(resizeGroup)) {
return;
}
app.scriptPreferences.enableRedraw = false;
while (someOverflowing(resizeGroup)) {
for (var textFrameIndex = 0; textFrameIndex < resizeGroup.length; textFrameIndex++) {
var textFrame = resizeGroup[textFrameIndex];
if (type === "resize") decreaseFontSize(textFrame);
else if (type === "condense") condenseFont(textFrame);
else alert("Unknown operation");
}
}
app.scriptPreferences.enableRedraw = true;
}
function someOverflowing(textFrames) {
for (var textFrameIndex = 0; textFrameIndex < textFrames.length; textFrameIndex++) {
var textFrame = textFrames[textFrameIndex];
if (textFrame.overflows) {
return true;
}
}
return false;
}
function decreaseFontSize(frame) {
var texts = frame.parentStory.texts.everyItem().getElements();
for (var textIndex = 0; textIndex < texts.length; textIndex++) {
var characters = texts[textIndex].characters.everyItem().getElements();
for (var characterIndex = 0; characterIndex < characters.length; characterIndex++) {
characters[characterIndex].pointSize = characters[characterIndex].pointSize - 0.25;
}
}
}
function condenseFont(frame) {
var texts = frame.parentStory.texts.everyItem().getElements();
for (var textIndex = 0; textIndex < texts.length; textIndex++) {
var characters = texts[textIndex].characters.everyItem().getElements();
for (var characterIndex = 0; characterIndex < characters.length; characterIndex++) {
characters[characterIndex].setNthDesignAxis(1, characters[characterIndex].designAxes[1] - 5)
}
}
}
I know that this code can be improved upon (and am open to feedback). For example, if a group consists of multiple text frames, the procedure will run for all of them, even though it only needs to run once. I was getting pretty frustrated with the old JavaScript, and the impact is negligible. The remaining functions are only helper functions, which I'd like to replace with more modern versions. Sadly, as already stated, I think those are simply not available.
Thanks once again to Yuri, who helped me immensely!
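The group-resizing idea can also be sketched outside InDesign. The following is a pure-Python simulation, not InDesign API code: the Frame class and its fits_at field (the largest point size at which the text still fits) are hypothetical stand-ins for a real text frame and its overflows property, and the min_size guard is an addition that the ExtendScript above does not have.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical stand-in for an InDesign text frame."""
    name: str
    fits_at: float      # assumed: largest point size at which the text still fits
    point_size: float

    @property
    def overflows(self):
        return self.point_size > self.fits_at


def shrink_group(frames, step=0.25, min_size=1.0):
    """Shrink every frame in the group in lockstep until none overflows."""
    while any(frame.overflows for frame in frames):
        # Guard (not in the original script): stop before sizes become unusable.
        if any(frame.point_size - step < min_size for frame in frames):
            raise ValueError("group still overflows at the minimum size")
        for frame in frames:
            frame.point_size -= step
    return frames


title = Frame("title", fits_at=6.0, point_size=8.0)
body = Frame("body", fits_at=7.5, point_size=8.0)
shrink_group([title, body])
print(title.point_size, body.point_size)  # 6.0 6.0
```

Because the whole group is handled in a single call, keeping a set of already-processed group names would be one way to avoid running the procedure once per member frame.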

How to get two RichText features to be mutually exclusive

So basically I've added two custom features for coloring text to a RichTextBlock, and I'd like to make them so selecting one for a portion of text would automatically unselect the other color button, much like it's already the case for h tags.
I've searched for a bit but didn't find much, so I guess I could use some help, be it advice, instruction or even code.
My features go like this :
@hooks.register('register_rich_text_features')
def register_redtext_feature(features):
    feature_name = 'redtext'
    type_ = 'RED_TEXT'
    tag = 'span'
    control = {
        'type': type_,
        'label': 'Red',
        'style': {'color': '#bd003f'},
    }
    features.register_editor_plugin(
        'draftail', feature_name, draftail_features.InlineStyleFeature(control)
    )
    db_conversion = {
        'from_database_format': {tag: InlineStyleElementHandler(type_)},
        'to_database_format': {
            'style_map': {
                type_: {'element': tag, 'props': {'class': 'text-primary'}}
            }
        },
    }
    features.register_converter_rule(
        'contentstate', feature_name, db_conversion
    )
The other one is similar but color is different.

This is possible, but it requires jumping through many hoops in Wagtail. The h1…h6 tags work like this out of the box because they are block-level formatting: each block within the editor can only be of one type. Here you’re creating this RED_TEXT formatting as inline formatting ("inline style"), which intentionally supports multiple formats being applied to the same text.
If you want to achieve this mutually exclusive implementation anyway – you’ll need to write custom JS code to auto-magically remove the desired styles from the text when attempting to add a new style.
Here is a function that does just that. It goes through all of the characters in the user’s selection, and removes the relevant styles from them:
/**
* Remove all of the COLOR_ styles from the current selection.
* This is to ensure only one COLOR_ style is applied per range of text.
* Replicated from https://github.com/thibaudcolas/draftjs-filters/blob/f997416a0c076eb6e850f13addcdebb5e52898e5/src/lib/filters/styles.js#L7,
* with additional "is the character in the selection" logic.
*/
export const filterColorStylesFromSelection = (
content: ContentState,
selection: SelectionState,
) => {
const blockMap = content.getBlockMap();
const startKey = selection.getStartKey();
const endKey = selection.getEndKey();
const startOffset = selection.getStartOffset();
const endOffset = selection.getEndOffset();
let isAfterStartKey = false;
let isAfterEndKey = false;
const blocks = blockMap.map((block) => {
const isStartBlock = block.getKey() === startKey;
const isEndBlock = block.getKey() === endKey;
isAfterStartKey = isAfterStartKey || isStartBlock;
isAfterEndKey = isAfterEndKey || isEndBlock;
const isBeforeEndKey = isEndBlock || !isAfterEndKey;
const isBlockInSelection = isAfterStartKey && isBeforeEndKey;
// Skip filtering through the block chars if out of selection.
if (!isBlockInSelection) {
return block;
}
let altered = false;
const chars = block.getCharacterList().map((char, i) => {
const isAfterStartOffset = i >= startOffset;
const isBeforeEndOffset = i < endOffset;
const isCharInSelection =
// If the selection is on a single block, the char needs to be in-between start and end offsets.
(isStartBlock &&
isEndBlock &&
isAfterStartOffset &&
isBeforeEndOffset) ||
// Start block only: after start offset
(isStartBlock && !isEndBlock && isAfterStartOffset) ||
// End block only: before end offset.
(isEndBlock && !isStartBlock && isBeforeEndOffset) ||
// Neither start nor end: just "in selection".
(isBlockInSelection && !isStartBlock && !isEndBlock);
let newChar = char;
if (isCharInSelection) {
char
.getStyle()
.filter((type) => type.startsWith("COLOR_"))
.forEach((type) => {
altered = true;
newChar = CharacterMetadata.removeStyle(newChar, type);
});
}
return newChar;
});
return altered ? block.set("characterList", chars) : block;
});
return content.merge({
blockMap: blockMap.merge(blocks),
});
};
This is taken from the Draftail ColorPicker demo, which you can see running in the Draftail Storybook’s "Custom formats" example.
To implement this kind of customisation in Draftail, you’d need to use the controls API. Unfortunately that API isn’t currently supported out of the box in Wagtail’s integration of the editor (see wagtail/wagtail#5580), so at the moment in order for this to work you’d need to customize Draftail’s initialisation within Wagtail as well.
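To make the logic of that filter easier to follow, here is a minimal Python sketch of the same idea on a simplified model. It is not Draft.js or Wagtail code: a block of text is modelled as a list of style sets, one per character, and the function names are made up for this illustration. Like the function above, it assumes the colour styles share a common prefix (COLOR_), so a type named RED_TEXT as in the question would need renaming to fit.

```python
def strip_color_styles(char_styles, start, end, prefix="COLOR_"):
    """Return a copy with prefixed styles removed from characters in [start, end)."""
    return [
        {style for style in styles if not style.startswith(prefix)}
        if start <= index < end
        else set(styles)
        for index, styles in enumerate(char_styles)
    ]


def apply_exclusive_color(char_styles, start, end, color_style):
    """Clear any existing colour in the selection, then apply the new one."""
    cleared = strip_color_styles(char_styles, start, end)
    for index in range(start, min(end, len(cleared))):
        cleared[index].add(color_style)
    return cleared


# Four characters: the selection covers indices 1 and 2.
text = [{"COLOR_RED"}, {"COLOR_RED", "BOLD"}, {"COLOR_BLUE"}, set()]
result = apply_exclusive_color(text, 1, 3, "COLOR_GREEN")
print(result[1] == {"BOLD", "COLOR_GREEN"})  # True
```

Characters outside the selection keep whatever colour they had, and non-colour styles such as BOLD are left alone.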

Not able to pass arguments with spaces to python script from c#

I am calling a Python script from C# using ProcessStartInfo. The script receives a JSON string as its argument.
It works fine if we pass JSON without any spaces, but if there is a space, the JSON is split at the space: only the part before it is passed as the argument and the rest is ignored.
public static bool ExecutePythonScript(string jRequest, string fileType)
{
string pythonExePath = Convert.ToString(ConfigurationManager.AppSettings["PythonExe"]);
bool bIsExecutionSuccess = true;
try
{
var psi = new ProcessStartInfo();
psi.FileName = pythonExePath;
var script = @"C:Scripts\pdf-to-csv.py";
psi.Arguments = $"\"{script}\" \"{jRequest}\"";
psi.UseShellExecute = false;
psi.CreateNoWindow = true;
psi.RedirectStandardOutput = true;
psi.RedirectStandardError = true;
var errors = "";
var results = "";
using (var process = Process.Start(psi))
{
errors = process.StandardError.ReadToEnd();
results = process.StandardOutput.ReadToEnd();
}
if (!string.IsNullOrEmpty(errors))
bIsExecutionSuccess = false;
}
catch(Exception ex)
{
bIsExecutionSuccess = false;
}
return bIsExecutionSuccess;
}
Python script to accept arguments
input_params = sys.argv[1]
input_params = input_params.replace("'",'"')
data_params = json.loads(input_params)
Is there a way I can pass jRequest with spaces to the Python script?

Python script parameters can be wrapped in single quotes in order to read the whole string including spaces.
Try wrapping the JSON string in single quotes.
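Whichever quoting style is used, the goal is that the whole JSON string arrives in the script as a single sys.argv[1]. The sketch below shows that receiving side. As an assumption for the sake of a runnable example, Python's subprocess stands in for the C# caller and a one-line child program stands in for pdf-to-csv.py; the names CHILD and send_json are hypothetical.

```python
import json
import subprocess
import sys

# Child program (stand-in for pdf-to-csv.py): parse argv[1] as JSON and echo it back.
CHILD = "import json, sys; print(json.dumps(json.loads(sys.argv[1])))"


def send_json(payload):
    """Pass payload to the child as one command-line argument and return its reply."""
    completed = subprocess.run(
        # A list of arguments: each element reaches the child as exactly one argv
        # entry, so spaces and double quotes inside the JSON are preserved.
        [sys.executable, "-c", CHILD, json.dumps(payload)],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


request = {"file name": "my report.pdf", "type": "pdf to csv"}
print(send_json(request) == request)  # True
```

On the C# side, the double quotes inside the JSON are the likely trouble spot when the argument is itself wrapped in double quotes; escaping them, or (on newer .NET versions, as far as I know) adding the argument through ProcessStartInfo.ArgumentList instead of building the Arguments string by hand, should have the same effect as the list form above.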

How do I access fields in an active Revit schedule via RevitPythonShell/IronPython?

I'm working on an IronPython script for Revit 2016. For starters, I'm trying to access values (as text) in an active Revit schedule, and load them into a variable. This works well enough for non-calculated values.
However, some of my schedule fields are calculated. Here's a sample schedule (all values here are calculated):
[Image: schedule snippet]
The Revit API shows two methods, TableView.GetCalculatedValueName() and TableView.GetCalculatedValueText(), which I'd like to use, but they don't seem to work as advertised.
doc = __revit__.ActiveUIDocument.Document
uidoc = __revit__.ActiveUIDocument
schedule = doc.ActiveView
tableData = schedule.GetTableData()
print(tableData)
tableName = schedule.GetCellText(SectionType.Header,0,0)
qty = schedule.GetCalculatedValueText(SectionType.Body,4,1)
calcValName = schedule.GetCalculatedValueName(SectionType.Body,4,1)
print(tableName)
print("Calculated Qty is: " + qty)
print("Calculated Value Name is: " + calcValName)
Running this code (in Revit) produces the following output:
88-06134-01
Calculated Qty is:
Calculated Value Name is:
I'd like to point out that using TableView.GetCellText() actually works on calculated values, but it's the GetCalculatedValueName() that I'd really like to make work here.

I have done the same thing, but in C# for Revit 2019. I hope you can follow it.
You can access the values of the schedule data without exporting. First, get all the schedules and read the data cell by cell. Second, create a dictionary and store the data as key/value pairs. You can then use the schedule data however you want.
Here is the implementation.
public void getScheduleData(Document doc)
{
FilteredElementCollector collector = new FilteredElementCollector(doc);
IList<Element> collection = collector.OfClass(typeof(ViewSchedule)).ToElements();
foreach (Element e in collection)
{
ViewSchedule viewSchedule = e as ViewSchedule;
TableData table = viewSchedule.GetTableData();
TableSectionData section = table.GetSectionData(SectionType.Body);
int nRows = section.NumberOfRows;
int nColumns = section.NumberOfColumns;
if (nRows > 1)
{
List<List<string>> scheduleData = new List<List<string>>();
for (int i = 0; i < nRows; i++)
{
List<string> rowData = new List<string>();
for (int j = 0; j < nColumns; j++)
{
rowData.Add(viewSchedule.GetCellText(SectionType.Body, i, j));
}
scheduleData.Add(rowData);
}
List<string> columnData = scheduleData[0];
scheduleData.RemoveAt(0);
DataMapping(columnData, scheduleData);
}
}
}
public static void DataMapping(List<string> keyData, List<List<string>>valueData)
{
List<Dictionary<string, string>> items= new List<Dictionary<string, string>>();
string prompt = "Key/Value";
prompt += Environment.NewLine;
foreach (List<string> list in valueData)
{
for (int key=0, value =0 ; key< keyData.Count && value< list.Count; key++,value++)
{
Dictionary<string, string> newItem = new Dictionary<string, string>();
string k = keyData[key];
string v = list[value];
newItem.Add(k, v);
items.Add(newItem);
}
}
foreach (Dictionary<string, string> item in items)
{
foreach (KeyValuePair<string, string> kvp in item)
{
prompt += "Key: " + kvp.Key + ",Value: " + kvp.Value;
prompt += Environment.NewLine;
}
}
Autodesk.Revit.UI.TaskDialog.Show("Revit", prompt);
}
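Since the question is about IronPython, here is a short Python sketch of the mapping step only. It does not call the Revit API: it assumes the cell texts have already been read into a list of rows (for example with GetCellText, as above), with the first row holding the column names. The function name and the sample values are made up. Unlike DataMapping above, which creates one dictionary per cell, this variant builds one dictionary per schedule row.

```python
def rows_to_records(rows):
    """Turn [header, row, row, ...] into a list of {column name: cell text} dicts."""
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical cell texts, as they might come back from a schedule body.
schedule_rows = [
    ["Part Number", "Qty"],
    ["88-06134-01", "4"],
    ["88-06135-01", "2"],
]

for record in rows_to_records(schedule_rows):
    print(record)
```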

Django - How to display Scientific Notation on Admin Page Field?

I have a field in my admin page that I'd like to display in Scientific Notation.
Right now it displays something ugly like this. How can I display this as 4.08E+13?
Right now I'm using a standard Decimal field in the model.
Any advice is greatly appreciated.
I'm on Django 1.2.

You have to use %e to get the scientific notation format:
Basic Example:
x = 374.534
print("%e" % x)
# 3.745340e+02
Precision of 2
x = 374.534
print("{0:.2E}".format(x))
# 3.75E+02
x = 12345678901234567890.534
print("{0:.2E}".format(x))
# 1.23E+19
Precision of 3
print("{0:.3E}".format(x))
# 1.235E+19
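The same format specification also works on Decimal values, which is what a Decimal field holds. A minimal sketch, assuming a small helper (to_scientific is a hypothetical name) that an admin display method could call; the Django wiring itself is not shown here:

```python
from decimal import Decimal


def to_scientific(value, digits=2):
    """Format a float or Decimal in scientific notation with the given precision."""
    return format(value, ".{}E".format(digits))


print(to_scientific(Decimal("40800000000000")))  # 4.08E+13
```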

Well, here's a workaround, since I can't figure out how to do this within the Django Python code. I have the admin pages run some custom JavaScript to do the conversion after the page is loaded.
Details:
Create this javascript file called "decimal_to_sci_not.js" and place it in your media directory:
/*
Finds and converts decimal fields > N into scientific notation.
*/
THRESHOLD = 100000;
PRECISION = 3;
function convert(val) {
// ex. 100000 -> 1.00e+5
return parseFloat(val).toPrecision(PRECISION);
}
function convert_input_fields() {
var f_inputs = django.jQuery('input');
f_inputs.each(function (index, domEl) {
var jEl = django.jQuery(this);
var old_val = parseFloat(jEl.val());
if (old_val >= THRESHOLD) {
jEl.val(convert(old_val));
}
});
}
function convert_table_cells() {
//Look through all td elements and replace the first n.n number found inside
//if greater than or equal to THRESHOLD
var cells = django.jQuery('td');
var re_num = /\d+\.\d+/m; //match any numbers w decimal
cells.each(function (index, domEl) {
var jEl = django.jQuery(this);
var old_val_str = jEl.html().match(re_num);
var old_val = parseFloat(old_val_str);
if (old_val >= THRESHOLD) {
jEl.html(jEl.html().replace(old_val_str,convert(old_val)));
}
});
}
django.jQuery(document).ready( function () {
convert_input_fields();
convert_table_cells();
});
Then update your admin.py code classes to include the javascript file:
class MyModel1Admin(admin.ModelAdmin):
    class Media:
        js = ['/media/decimal_to_sci_not.js']

admin.site.register(MyModel1, MyModel1Admin)
