Python - SharePoint List - SharePlum - python

Currently my firm is using Excel to pull the contents of a SharePoint List. The list gets pulled in and a set of VBA macros are run to produce the required Excel workbooks. This method provides all the joy associated with working with Excel and VBA.
There is one thing this method does do: It maintains the formatting found in the SharePoint List all the way through to Excel. So Bolding and underlines in the text fields are maintained.
Question: using Python and SharePlum is it possible to maintain the bolding found in SharePoint List text fields. Want to extract the text field from SharePoint List and copy it into an Excel cell.
Have not found any postings on this exact topic.
Thank you for your attention to this matter.
KD

Related

Selenium, paste formatted text on web field

I am working with Python and Selenium trying to create an automation for a website.
I have had no issues previously because I take a text from an excel cell using openpyxl, but for this case, I need to paste a text that contains an embedded html link.
When I copy and paste the text from Word into a sheet it lost the format.
When doing it manually agents copy the text from word and paste it into the web field.
Is there any way to achieve the same, maybe a word library for python or similar?
Thanks in advance for your support.
I used openpyxl to get the text from the cell.
I use selenium sendkeys function to send the text to the input field, I expected the text to retain formatting.

How to embed an XLSX local file into HTML page with Python and Django

For a Python web project (with Django) I developed a tool that generates an XLSX file. For questions of ergonomics and ease for users I would like to integrate this excel on my HTML page.
So I first thought of converting the XLSX to an HTML array, with the xlsx2html python library. It works but since I can’t determine the desired size for my cells or trim the content during conversion, I end up with huge cells and tiny text..
I found an interesting way with the html tag associated with OneDrive to embed an excel window into a web page, but my file being in my code and not on Excel Online I cannot import it like that. Yet the display is perfect and I don’t need the user to interact with this table.
I have searched a lot for other methods but apart from developing a function to browse my file and generate the script of the html table line by line, I have the feeling that I cannot simply use a method to convert or display it on my web page.
I am not accustomed to this need and wonder if there would not be a cleaner method to display an excel file in html.
Does it make sense to develop a function that builds my html table script in str? Or should I find a library that does it? Maybe there is a specific Django library ?
Thank you for your experience

Navigate through a pdf file to find specific pages and extract tabular data from image with python

I've come across an assignment which requires me to extract tabular data from images in a pdf file to neatly formatted dataframes via python code. There are several files to be processed and the relevant pages in all the files the may have different page numbers, hence the sequence of steps for this problem (my assumption) are:
Navigate to relevant section of the pdf
Extract images of the tabular data
Extract data from the images, format and convert to dataframes.
Some google searches resulted in me finding libraries for pdf text extraction, table extraction and more - modular solutions only.
I would appreciate some help in this regard. What packages should I use? Is my approach correct?
Can I get references to any helpful code snippets for similar problems?
page structure of the required tables
This started as a comment. I believe the answer is valid as it is in no way an endorsement of the service. I don't even use it. I know Azure uses SO as well.
This is the stuff of commercial services. You can try Azure Form Recognizer (with which I am not affiliated):
https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer
Here are some python examples of how to use it:
https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/how-to-guides/try-sdk-rest-api?pivots=programming-language-python
The AWS equivalent is Textract https://aws.amazon.com/textract
The Google Cloud version is called Form Parser - see https://cloud.google.com/document-ai/docs/processors-list#processor_form-parser

How to make XLRD read hyperlinks in XLSX cells?

This is not a duplicate although the issue has been raised in this forum in 2011Getting a hyperlink URL from an Excel document, 2013 Extracting Hyperlinks From Excel (.xlsx) with Python and 2014 Getting the URL from Excel Sheet Hyper links in Python with xlrd; there is still no answer.
After some deep dive into the xlrd module, it seems the Data_sheet.hyperlink_map.get((row, col)) item trips because "xlrd cannot read the hyperlink without formatting_info, which is currently not supported for xlsx" per #alecxe at Extracting Hyperlinks From Excel (.xlsx) with Python.
Question: has anyone has made progress with extracting URLs from hyperlinks stored in an excel file. Say, of all the customer data, there is a column of hyperlinks. I was toying with the idea of dumping the excel sheet as an html page and proceed per usual scraping (file on local drive). But that's not a production solution. Supplementary: is there any other module that can extract the url from a .cell(row,col).value() call on the hyperlink-cell. Is there a solution in mechanize? Many thanks.
I had the same problem trying to get the hyperlinks from the cells of a xlsx file. The work around I came up with is simply converting the Excel sheet to xls format, from which I could manage to get the hyperlinks withount any trouble, and once finished the editing, I formatted it back to the original xlsx file.
I don't know if this should work for your specific needs, or if the change of format implies some consecuences I am not aware of, but I think it's worth a try.
I was able to read and use hyperlinks to copy files with openpyxl. It has a cell_obj.hyperlink and cell_obj.hyperlink.target which will grab the link value. I made a list of the cell row col values which had hyperlinks, then appended them to a list and then looped through the list to move the linked files.

Using Python To Access E-mail (Lotus Notes)

Basically, I need to write a Python script that can download all of the attachment files in an e-mail, and then organize them based on their name. I am new to using Python to interact with other applications, so I was wondering if you had any pointers regarding this, mainly how to create a link to Lotus Notes (API)?
You can do this in LotusScript as an export of data. This could be an agent that walks down a view in Notes, selects a document, document attachments could be put into a directory. Then with those objects in the directory(ies) you can run any script you like like a shell script or whatever.
With LotusScript you can grab out meta data or other meaningful text for your directory name. Detach the objects you want from richtext then move to the next document. The view that you travel down will effect the type of documents that you are working with.

Categories