Python get data from custom tab in file's properties - python

I'm strungling with what's seems to be custom metadata tab.
I spend 16 hour trying to write a snippet to read data from this tab 'Phenom-World':
I tried different approach without any success :
Exif
Win32com
os.stat
My attempts are returning information relative to File>Properties>Details (size, résolution, date, etc) but nothing from the tab named 'Phenom-World'.
Any help please ? I'm kind of despair now.
Thanks for your help !
Here's the file, it's a .jpeg
I'm using python 3.8, windows 10
EDIT #1
It seems that the phenom-world tab can only be seen by computers where a software from this same brand is installed.
I send the file 6.jpeg to another computer able to read it, it return exactly the same datas, so those metadata are embeded in the file even through if they are 'hidden' for computer without the phenom software.

Related

What are the core.xxxxx files created by tensorflow?

I've been using Tensorflow 2.0 in Jupyter Lab and I notice that there are files created called core.xxxxx such as :
core.18564
core.21528
core.28128
and more in the same naming convention format.
I've tried to open them in an editor (VI) and when I do my editor crashes (for now at least). These files are between 7 GB to 37 GB each (!!!)
Question: What are these files and what is their purpose?
Note: I've looked on Stackoverflow and to the best of my knowledge, I can't seem to find the answer here. I apologize if this turns out to be a duplicate.
These are linux core dump files because the program is crashing for some unknown reason.
See: https://en.wikipedia.org/wiki/Core_dump for more info.
The number in the file name is process id that created the core file.
They can be removed to recover disk space.

Python-based PDF parser integrated with Zapier

I am working for a company which is currently storing PDF files into a remote drive and subsequently manually inserting values found within these files into an Excel document. I would like to automate the process using Zapier, and make the process scalable (we receive a large amount of PDF files). Would anyone know any applications useful and possibly free for converting PDFs into Excel docs and which integrate with Zapier? Alternatively, would it be possible to create a Python script in Zapier to access the information and store it into an Excel file?
This option came to mind. I'm using google drive as an example, you didn't say what you where using as storage, but Zapier should have an option for it.
Use cloud convert, doc parser (depends on what you want to pay, cloud convert at least gives you some free time per month, so that may be the closest you can get).
Create a zap with this step:
Trigger on new file in drive (Name: Convert new Google Drive files with CloudConvert)
Convert file with CloudConvert
Those are two options by Zapier that I can find. But you could also do it in python from your desktop by following something like this idea. Then set an event controller in windows event manager to trigger an upload/download.
Unfortunately it doesn't seem that you can import JS/Python libraries into zapier, however I may be wrong on that. If you could, or find a way to do so, then just use PDFminer and "Code by Zapier". A technician might have to confirm this though, I've never gotten libraries to work in zaps.
Hope that helps!

Extremely new user to Python. "No module named request" error while trying code to detect image subdomains in a website to extract them to a folder

I may sound rather uninformed writing this, and unfortunately, my current issue may require a very articulate answer to fix. Therefore, I will try to be specific as possible as to ensure that my problem can be concisely understood.
My apologizes for that- as this Python code was merely obtained from a friend of mine who wrote it for me in order to complete a certain task. I myself had had extremely minimal programming knowledge.
Essentially, I am running Python 3.6 on a Mac. I am trying to work out a code that allows Python to scan through a bulk of a particular website's potentially existent subdomains in order to find possibly-existent JPG images files contained within said subdomains, and download any and all of the resulting found files to a distinct folder on my Desktop.
The Setup-
The code itself, named "download.py" on my computer, is written as follows:
import urllib.request
start = int(input("Start range:100000"))
stop = int(input("End range:199999"))
for i in range(start, stop + 1):
filename = str(i).rjust(6, '0') + ".jpg"
url = "http://website.com/Image_" + filename
urllib.request.urlretrieve(url, filename)
print(url)
(Note that the words "website" and "Image" have been substituted for the actual text included in my code).
Before I proceed, perhaps some explanation would be necessary.
Basically, the website in question contains several subdomains that include .JPG images, however, the majority of the exact URLs that allow the user to access these sub-domains are unknown and are a hidden component of the internal website itself. The format is "website.com/Image_xxxxxx.jpg", wherein x indicates a particular digit, and there are 6 total numerical digits by which only when combined to make a valid code pertain to each of the existent images on the site.
So as you can see, I have calibrated the code so that Python will initially search through number values in the aforementioned URL format from 100000 to 199999, and upon discovering any .JPG images attributed to any of the thousands of link combinations, will directly download all existent uncovered images to a specific folder that resides within my Desktop. The aim would be to start from that specific portion of number values, and upon running the code and fetching any images (or not), continually renumbering the code to work my way through all of the possible 6-digit combos until the operation is ultimately a success.
(Possible Side-Issue- Although I am fairly confident that my friend's code is written in a manner so that Python will only download .JPG files to my computer from images that actually do exist on that particular URL, rather than swarming my folder with blank/bare files from every single one of URL attempts regardless of whether that URL happens to be successful or not, I am admittedly not completely certain. If the latter is the case, informing me of a more suitable edit to my code would be tremendously appreciated.)
The Execution-
Right off the bat, the code experienced a large error. I'll list through the series of steps that led to the creation of said error.
#1- Of course, I first copy-pasted the code into a text document, and saved it as "download.py". I saved it inside of a folder named "Images" where I sought the images to be directly downloaded to. I used BBEdit.
#2- I proceeded, in Terminal, to input the commands "cd Desktop/Images" (to account for the file being held within the "Images" folder on my Desktop), followed by the command "Python download.py" (to actually run the code).
As you can see, the error which I obtained following my attempt to run the code was the ImportError: No module named request. Despite me guessing that the answer to solving this is simple, I can legitimately say I have got such minimal knowledge regarding Python that I've absolutely no idea how to solve this.
Hint: Prior to making the download.py file, the folder, and typing the Terminal code the only interactions I made with Python were downloading the program (3.6) and placing it in my toolbar. I'm not even quite sure if I am required to create any additional scripts/text files, or make any additional downloads before a script like this would work and successfully download the resulting images into my "Images" folder as is my desired goal. If I sincerely missed something integral at any point during this long read, hopefully, someone in here can provide a thoroughly detailed explanation as to how to solve my issue.
Finishing statements for those who've managed to stick along this far:
Thank you. I know this is one hell of a read, and I'm getting more tired as I go along. What I hope to get out of this question is
1.) Obviously, what would constitute a direct solution to the "No module named request" Input Error in Terminal. In other words, what I did wrong there or am missing.
2.) Any other helpful information that you know would assist this code, for example, if there is any integral step or condition I've missed or failed to meet that would ultimately cause the entirety of my code to cease to work. If you do see a fault in this, I only ask of you to be specific, as I've not got much experience in the programming world. After all, I know there is a lot of developers out here that are far more informed and experienced than am I. Thanks.
urllib.request is in Python 3 only. When running 'python' on a Mac, you're running Python 2 by default. Try running executing with python3.
python --version
might need to
brew install python3
urllib.request is a Python 3 construct. Most systems run Python 2 as default and this is what you get when you run simply python.
To install Python 3, go to https://brew.sh/ and follow the instructions to install the Hombrew package manager. Then run
brew install python3
python3 download.py

error with Xyce Gnuplot.py

I am using Xyce which is a circuit simulator. I am using it to export a .CSV file and a .prn file. I found Xycegnuplot.py "https://github.com/OpenXyce/Xyce/blob/master/utils/gnuplotXyce.py". I am trying to use it to plot my output variables from Xyce, howver, every time I run gnuplotXyce.py as mentioned by its author I get an error " Import Error" at the "from finblock import findblock" line and I don't know what is that error.
Please help.
Thanks
If you are going to use Xyce, you should probably get the official version from Sandia National Laboratories instead of from the OpenXyce site on github. This version was forked by an anonymous github user, and has not been updated since last fall. Since that update, Sandia released Xyce 6.2 and the OpenXyce creator did not import the new release.
You should also probably join the xyce-users group on googlegroups, where the Xyce developers monitor all questions and try to answer them promptly. It is only by happenstance that I found your question here on stackoverflow.
The "gnuplotXyce.py" script is not really maintained, and might not have been kept working with all the changes that have been made to Xyce since its release. That said, the python script depends on a number of python modules including gnuplot-py which should be available from http://gnuplot-py.sourceforge.net. The "findblock.py" module that you say cannot be found is also present in the "utils" directory of the Xyce source code, alongside gnuplotXyce.py. If you have the whole utils directory downloaded, this error should go away.
I just tried gnuplotXyce.py on a simple netlist with csv output and it didn't work, so my assumption is that the script was not maintained and will need to be fixed.
The script does sort of work if you use the native Xyce standard (.prn) format (i.e. don't specify "format=csv" on your .print line). Unfortunately, it does not leave the window open after it finishes plotting, so it is rather useless. If you use the "--ps" option, though, a correct postscript file will be created that can be viewed in any postscript viewer, or printed on a postscript printer (or through a properly set-up Linux CUPS printer that understands postscript).
The CSV format in Xyce was primarily created in order to allow import into spreadsheets such as Excel or OpenOffice-scalc, which programs have their own plotting utilities.
The ".prn" standard format works well in gnuplot. There is an example of how to use gnuplot to do this display in the document "Using Open Source Schematic Capture Tools With Xyce" on the Sandia Labs Xyce web site (in the documentation and tutorials section).
The official Xyce web site is http://xyce.sandia.gov/

django/python: How to convert pptx/docx formats to PDF using python?

First of all, I agree that this might sound like a question which has already been asked many times in the past. However I couldn't find any answer that was relevant to me in the similar questions so I'll try to be more specific.
I would need to transform PPTX/DOCX files into PDF using Python but I don't have any experience in file format conversion. I have been looking in many places/forums/websites, read a lot of documentation and came across some useful libraries (python-pptx and pyPdf mainly), but I still don't know where to start.
When looking on the Internet, I can see many websites that offer file format conversions as a paying service, even with advanced API's: submit a file via POST and get the transformed PDF file in return. This could work for me, but I am really interested in writing myself the code that does the conversion work from OOXML to PDF.
How would you start doing this? Or is it just impossible on my own?
Thanks for your help!
After some research and with the help of python-pptx's creator, I was able to write to the PowerPoint COM interface using a Virtual Machine.
In case someone reads this thread, this is how I managed to get this done:
- Setup a VM with Microsoft Windows/Office installed on it ;
- Install Python, Django and win32com libraries on the VM.
The files are sent locally from the original Django project to the virtual machine (which are on the same network) through a simple POST request. The file is converted on the VM using win32com.client (which is just a simple call to the win32com.client library) and then sent back as a response to the original Django view, which in turn processes the response.
Note: it took me some time to realize I needed to use the #csrf_exempt decorator for this setup to work.

Categories