Parse .docx in python 3 - python

I am currently writing a python 3 program that parses through certain docx files and extracts the text and images from them. I have been trying to use docx but it will not import into my program. I have installed lxml, Pillow, and python-docx yet it does not import. When I try to use python-docx from the terminal I cannot use example-extracttext.py or example-makedocument.py which brings me to believe that the installation didn't run properly. Is there a way I can check if this installed correctly or is there a way to get this working properly so I can import it into my project? I am on Ubuntu 13.10.

I recommend you try the latest version of python-docx which is installed like this:
$ pip install python-docx
Documentation is available here: http://python-docx.readthedocs.org/
Installation should result in a message that looks successful. It's possible you'll need to install using sudo to temporarily assume root privileges:
$ sudo pip install python-docx
After installation you should be able to do the following in the Python interpreter:
>>> from docx import Document
>>>
If instead you get something like this, the install didn't go properly:
>>> from docx import Document
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named docx
As you can provide more feedback on your attempts I can elaborate the answer.
Note that after v0.2.x the python-docx package was rewritten. The API of v0.3.x+ is different as well as the package name and repository location. All further development will be on the new version. If you're just starting out with the package going with the latest is probably a good idea as the old one will just be receiving legacy support going forward.
Also, the Python 3 support was added with v0.3.0. Prior versions are not Python 3 compatible.

you can solve your import problem by first uninstalling the existant installation and then install it with pip3. solved my problem with this
pip uninstall python-docx
pip3 install python-docx

Use Command- sudo pip install --pre python-docx for the latest version of python-docx.

Related

ModuleNotFoundError: No module named 'ruamel'

I'm using a Kubernetes inventory builder script found here: https://github.com/kubernetes-sigs/kubespray/blob/master/contrib/inventory_builder/inventory.py
On line 36, the ruamel YML library is imported using the code from ruamel.yaml import YAML. This library can be found here: https://pypi.org/project/ruamel.yaml/
On my OSX device (Mojave 10.14.3), if I run pip list, I can clearly see the most up to date version of ruamel.yaml:
If I run pip show ruamel.yaml, I get the following output:
I'm running the script with this command: CONFIG_FILE=inventory/mycluster/hosts.ini python3 contrib/inventory_builder/inventory.py 10.0.0.1 10.0.0.2 10.0.0.4 10.0.0.5
Bizarrely, it returns the following error:
Traceback (most recent call last):
File "contrib/inventory_builder/inventory.py", line 36, in <module>
from ruamel.yaml import YAML
ModuleNotFoundError: No module named 'ruamel'
I have very little experience with Python, so don't understand how this could be failing. Have I installed the library incorrectly or something? From the documentation on the ruamel.yml project page, it looks like the script is calling the library as it should be.
Thanks in advance
In my case, I was installing this with pip3 install ruamel.yaml, and it was puting the package in /usr/local/lib/python3.9/site-packages/, but the python3 binary on the machine was pinned to Python 3.7, so trying to import that module was sending the ModuleNotFoundError message.
What helped to fix this, was to install the module with python3 -m pip install ruamel.yaml, running pip via the python3 binary makes sure it runs on the same version, in this case 3.7, and gets installed via the correct version number site-packages.
pip is set to point to the Python 2 installation. To install the library under Python 3, do pip3 install ruamel.yml.
you're using python 3 and want to use the package that is with python 2. Go to the directory where your python 3 is, navigate to Scripts and use the pip in there to install the needed library.
This helped me (adding version number to python):
CONFIG_FILE=inventory/mycluster/hosts.yaml python3.6 contrib/inventory_builder/inventory.py ${IPS[#]}
[python 3.10.x].
There is no package called ruamel.yaml
what worked is pip install ruamel-yaml

I want to check the grammatical mistakes of user data by using NLTK (python)?

I have download language-check package and pasted in D:\Lib\site-packages\nltk but when I type the code over python interpreter as:
import language_check
It gives me an error: no module named language_check.
You need to properly install the package first, for example using
pip install language_check
here are a couple of links to help you:
https://docs.python.org/3/installing/
How do I install Python packages on Windows?

Python cannot import openpyxl

I am running Windows 7 and using Python 2.7.
I have installed openpyxl using easy_install. It looks like the installation was successful. I changed the directory and fired up Python.
>>> import openpyxl
>>>
So, this should mean that Python is able to find openpyxl. However, when I execute a simple test program excell_tutorial1.py and run it, I get the following:
Traceback (most recent call last):
File "C:/Python27/playground/excell_tutorial1.py", line 7, in <module>
from openpyxl import Workbook
ImportError: No module named openpyxl
Very confusing! It could find it in prompt line but not in the program!
import os, sys
the_module ="C:\\Python27\\Lib\\site-packages\\openpyxl-2.3.3-py2.7.egg\\openpyxl"
if the_module not in sys.path:
sys.path.append(the_module)
if the_module in sys.path:
print sys.path.index(the_module)
print sys.path[18]
so, this gives me:
18
C:\Python27\Lib\site-packages\openpyxl-2.3.3-py2.7.egg\openpyxl
Anyone can think of what the problem might be?
Much appreciated
I had the same problem solved using instead of pip or easy install one of the following commands :
sudo apt-get install python-openpyxl
sudo apt-get install python3-openpyxl
The sudo command also works better for other packages.
While not quite what you ran into here (since you state that you are using python 2.7), for those who run into this issue and are using python 3, you may be unintentionally installing to python 2 instead. To force the install to python 3 (instead of 2) use pip3 instead.
See this thread for more info:
No module named 'openpyxl' - Python 3.4 - Ubuntu
Try deleting all openpyxl material from C:\Python27\Lib\site-packages\
Once you do that try reinstalling it using pip. (This what worked for me)
At times this can be a simple permission issue. As it was in my case. I installed it in my local directory with my login.
python ./setup.py install
but some of the other user were not able to import the module.
They were getting this error:
ImportError: No module named openpyxl
Hence I simply gave exe permission to 'others'
chmod -R 755
That solves the problem at least in my case.
Go to the directory where pip is installed, for eg.C:\Python27\Scripts and open cmd (simply type cmd in address bar ). Now run the command "pip install openpyxl". It will install the package itself. Hope this will solve your problem.
Try this:
!pip install openpyxl
I had the same issue on 3.8.2
I found out that python was installed in two locations on my machine (probably py and python, just a guess)
Here:
C:\Users<userAccount>\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8\LocalCache\local-packages\Python38
and Here:
C:\Python38
I deleted the one in my C drive and everything is working well now. I would double check to see where your packages are getting installed first, before deleting. Which ever one is being used, keep that one.
For this case, check to see where this package got installed:
C:\Users\<userAccount>\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8\LocalCache\local-packages\Python38\site-packages\openpyxl
keep that directory.
What worked for me was to open the terminal as an administrator, cd to the 'scripts' file of where python (different for each version) is stored, and then install using pip:
cd C:\Users\Salfa\AppData\Local\Programs\Python\Python39\Scripts
pip install openpyxl
This resolved the problem for me.

When import docx in python3.3 I have error ImportError: No module named 'exceptions'

when I import docx I have this error:
File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/docx-0.2.4-py3.3.egg/docx.py", line 30, in <module>
from exceptions import PendingDeprecationWarning
ImportError: No module named 'exceptions'
How to fix this error (python3.3, docx 0.2.4)?
If you are using python 3x don't do pip install docx instead go for
pip install python-docx
It is compatible with python 3.x
Official Documentation available here: https://pypi.org/project/python-docx/
When want to use import docx, be sure to install python-docx, not docx.You can install the module by running pip install python-docx.
The installation name docx is for a different module
However,
when you are going to import the python-docx module,
you’ll need to run
import docx, not import python-docx.
if still you want to use docx module then:
First of all, you will need to make sure that the docx module is installed.
If not then simply run pip install docx.
If it shows '*requirement already satisfied*'
then the solution is :
Go to the library to find docx.py file,
you'll need to go to directory where you installed python then \Lib\site-packages\ and find docx.py file
Open docx.py file in text editor and find this code
from exceptions import PendingDeprecationWarning
Replace the above code with
try:
from exceptions import PendingDeprecationWarning
except ImportError:
pass
Save the file
Now you can run import docx module in Python 3.x without any problem
Uninstall docx module with pip uninstall docx
Download python_docx-0.8.6-py2.py3-none-any.whl file from http://www.lfd.uci.edu/~gohlke/pythonlibs/
Run pip install python_docx-0.8.6-py2.py3-none-any.whl to reinstall docx.
This solved the above import error smoothly for me.
If you're using python 3.x, Make sure you have both python-docx & docx installed.
Installing python-docx :
pip install python-docx
Installing docx :
pip install docx
You may be install docx, not python-docx
You can see this for install python-docx
http://python-docx.readthedocs.io/en/latest/user/install.html#install
In Python 3 exceptions module was removed and all standard exceptions were moved to builtin module. Thus meaning that there is no more need to do explicit import of any standard exceptions.
copied from
pip install python-docx
this worked for me, try installing with admin mode
The problem, as was noted previously in comments, is the docx module was not compatible with Python 3. It was fixed in this pull-request on github: https://github.com/mikemaccana/python-docx/pull/67
Since the exception is now built-in, the solution is to not import it.
docx.py
## -27,7 +27,12 ##
except ImportError:
TAGS = {}
-from exceptions import PendingDeprecationWarning
+# Handle PendingDeprecationWarning causing an ImportError if using Python 3
+try:
+ from exceptions import PendingDeprecationWarning
+except ImportError:
+ pass
+
from warnings import warn
import logging
I had the same problem, but pip install python-docx worked for me, I'm using python 3.7.1
You need to make it work with python3.
sudo pip3 install python-docx
This installation worked for me in Python3 without any further additions.
python3
>> import docx
PS: Note that the 'pip install python-docx' or apt-get python3-docx are not useful.
I had the same problem, after installing docx module, got a bunch of errors regarding docx, .oxml and lxml...seems that for my case the package was installed in this folder:
C:\Program Files\Python3.7\Lib\site-packages
and I moved it one step back, to:
C:\Program Files\Python3.7\Lib
and this solved the issue.
Had faced similar issue and found a work around have posted answer aswell under similar question find its link :https://stackoverflow.com/a/74609166/17385292
python imports exceptions module by itself
comment out import line. it worked for me.

How do I install Python packages/ wxPython on Mac OSX?

I'm very new to python and any non-basic computer functions in general, but I'm having a very basic problem and I can't figure out how to fix it. Any time I download a module from the internet and try to import it in python, I get an error message. For example, I just downloaded wxPython after being instructed to do so on a tutorial program for Python I've been using, and after entering "import wx" I got:
Traceback (most recent call last):
File "<pyshell#4>", line 1, in <module>
import wx
ImportError: No module named wx
How do I fix this so that python can find modules I download?
Thanks!!
Python version 2.7.3, and I downloaded wxPython from the download link on the website. Another thing I noticed: whenever I type in python setup.py install in the Terminal, I get:
/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python
can't open file 'setup.py': [Errno 2] No such file or directory
Which seems to be another huge problem?
There are a few things you need to do to actually debug this:
Run python and check what version you are running.
type where python and figure out if you have multiple versions of python running at once.
With pip or easy_install, read the output to check where they are installing packages. It's somewhat likely that they are installing to the system-wide Python 2.6, as opposed to the version that you want it to be installed to (Python 2.7).
If you find any packages installed in the wrong place with 3, uninstall them with pip uninstall <packagename> and then specifically reinstall them to 2.7 with easy_install-2.7 or pip-2.7 install. If you don't see the option for easy_install-2.7 or pip-2.7, you need to install distribute and run its setup.py file with the specific version of Python you are using.
Make sure you are actually in the directory when running setup.py. For example, to install distribute, you need to cd into the appropriate directory to install.
Finally, a separate note: it's far easier to install packages with easy_install or pip, as opposed to downloading them separately. You should try doing that first. Again, distribute has more info.

Categories