gensim file not found error - python

I am executing the following line:
id2word = gensim.corpora.Dictionary.load_from_text('wiki_en_wordids.txt')
This code is available at "https://radimrehurek.com/gensim/wiki.html". I downloaded the Wikipedia corpus and generated the required files; wiki_en_wordids.txt is one of them. The file is in the following location:
~/gensim/results/wiki_en
When I execute the code above, I get the following error:
Traceback (most recent call last):
File "~\Python\Python36-32\temp.py", line 5, in <module>
id2word = gensim.corpora.Dictionary.load_from_text('wiki_en_wordids.txt')
File "~\Python\Python36-32\lib\site-packages\gensim\corpora\dictionary.py", line 344, in load_from_text
with utils.smart_open(fname) as f:
File "~\Python\Python36-32\lib\site-packages\smart_open\smart_open_lib.py", line 129, in smart_open
return file_smart_open(parsed_uri.uri_path, mode)
File "~\Python\Python36-32\lib\site-packages\smart_open\smart_open_lib.py", line 613, in file_smart_open
return open(fname, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'wiki_en_wordids.txt'
Even though the file is available in the required location I get that error. Should I place the file in any other location? How do I determine what the right location is?

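A relative path such as 'wiki_en_wordids.txt' is resolved against the current working directory of the Python process, not against the folder that contains the file or the library code that eventually opens it. Here the script runs from a different directory than ~/gensim/results/wiki_en, so open() cannot find the file. Note that os.path.abspath alone does not help, because it also resolves against the current working directory.
One way to handle this is to pass the full path to the file (assuming ~ in the location above is your home directory):
import os
import gensim
path = os.path.join(os.path.expanduser('~'), 'gensim', 'results', 'wiki_en', 'wiki_en_wordids.txt')
id2word = gensim.corpora.Dictionary.load_from_text(path)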
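To see where Python actually looks for a relative file name, it can help to print the current working directory and the path the name resolves to. The following is a minimal diagnostic sketch; the helper name explain_relative_path is made up for illustration and is not part of gensim.

```python
import os

def explain_relative_path(name):
    """Return (working directory, resolved absolute path, whether it exists)."""
    cwd = os.getcwd()
    # abspath joins the name onto the current working directory and normalizes it,
    # which is the same location open(name) would use.
    resolved = os.path.abspath(name)
    return cwd, resolved, os.path.exists(resolved)

cwd, resolved, exists = explain_relative_path('wiki_en_wordids.txt')
print('working directory:', cwd)
print('resolved path:', resolved)
print('exists:', exists)
```

If "exists" is False, either change into the directory that holds the file before running the script or pass the full path to load_from_text.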

Related

Error when opening a pg_dump file with pgdumplib

I'm using the pgdumplib library. Unfortunately, I get an error when I try to open the file. The file is in the same folder as the Python script. I'm using Python 3.7.
Code:
import pgdumplib
dump = pgdumplib.load('test.dump')
print('Database: {}'.format(dump.toc.dbname))
print('Archive Timestamp: {}'.format(dump.toc.timestamp))
print('Server Version: {}'.format(dump.toc.server_version))
print('Dump Version: {}'.format(dump.toc.dump_version))
for line in dump.table_data('public', 'pgbench_accounts'):
    print(line)
Error:
Traceback (most recent call last):
File "C:/Users/user/data/test.py", line 3, in <module>
dump = pgdumplib.load('test.dump')
File "C:\Users\user\venv\data\lib\site-packages\pgdumplib\__init__.py", line 24, in load
return dump.Dump(converter=converter).load(filepath)
File "C:\Users\user\venv\data\lib\site-packages\pgdumplib\dump.py", line 228, in load
raise ValueError('Path {!r} does not exist'.format(path))
ValueError: Path 'test.dump' does not exist
If you are running your code from C:/Users/user/700Joach/project/ and you have the following line in your script:
dump = pgdumplib.load('test.dump')
Then Python looks for test.dump at the following path:
C:/Users/user/700Joach/project/test.dump
In other words, load('test.dump') receives a relative path, which is resolved against the directory you run the code from (the current working directory), not the directory that contains the script.
There are several ways to resolve the issue: either move test.dump to the directory from which you run your code, or provide an absolute path to test.dump as follows:
dump = pgdumplib.load('C:/Users/user/700Joach/project/test.dump')
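If the dump file always sits next to the script, another option is to build the absolute path from the script's own location, so the code works regardless of the directory it is started from. This is a minimal sketch; the helper name path_next_to is hypothetical, and the pgdumplib call is shown only as a comment.

```python
from pathlib import Path

def path_next_to(script_file, filename):
    """Return an absolute path to `filename` in the same folder as `script_file`."""
    return Path(script_file).resolve().parent / filename

# In a real script you would pass __file__, for example:
# dump_path = path_next_to(__file__, 'test.dump')
# dump = pgdumplib.load(str(dump_path))
```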

Pytorch Torch.save FileNotFoundError

When I call torch.save to save my model to a "tmp_file", it raises a FileNotFoundError. The traceback is as follows:
Traceback (most recent call last):
File "C:/Users/Haoran/Documents/GitHub/dose-response/python/simulations/hdr.py", line 234, in <module>
test_hdr_continuous()
File "C:/Users/Haoran/Documents/GitHub/dose-response/python/simulations/hdr.py", line 195, in test_hdr_continuous
model = fit_mdn(X[:split], y[:split], nepochs=20)
File "C:\Users\Haoran\Documents\GitHub\dose-response\python\simulations\continuous.py", line 192, in fit_mdn
torch.save(model, tmp_file)
File "C:\Users\Haoran\Documents\GitHub\dose-response\python\venv\lib\site-packages\torch\serialization.py", line 161, in save
return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol))
File "C:\Users\Haoran\Documents\GitHub\dose-response\python\venv\lib\site-packages\torch\serialization.py", line 116, in _with_file_like
f = open(f, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp_file_4358f298-a1d9-4c81-9e44-db4d8f1b4319'
Strangely, everything works perfectly on my Mac, but I get this error on my Windows desktop.
As shmee observed, you are trying to write to /tmp/[...] on a Windows machine, where that directory usually does not exist. That is why you get a FileNotFoundError.
To make your code OS-agnostic, you may find Python's tempfile module useful, especially NamedTemporaryFile: it creates a temporary file and returns a file object whose name attribute holds the path, so you can use it in your program.
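As an illustration, the hard-coded /tmp prefix can be replaced with the directory reported by tempfile.gettempdir(), which exists on Windows, macOS, and Linux. This is a minimal sketch; make_tmp_path is a hypothetical helper, and the plain open() call stands in for torch.save(model, tmp_file), which is not run here.

```python
import os
import tempfile
import uuid

def make_tmp_path(prefix='tmp_file_'):
    """Build a unique file path inside the platform's temporary directory."""
    return os.path.join(tempfile.gettempdir(), prefix + str(uuid.uuid4()))

tmp_file = make_tmp_path()
with open(tmp_file, 'wb') as f:  # stand-in for torch.save(model, tmp_file)
    f.write(b'model bytes')
os.remove(tmp_file)  # clean up once the file is no longer needed
```

Building the path yourself avoids a known Windows limitation of NamedTemporaryFile: with the default settings, the file cannot be opened a second time by name while it is still open.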

Winerror 3: File not found using ac2git

I was using the ac2git tool to convert my AccuRev depot to a Git repository.
I am getting the following error when running the command python ac2git.py after following the necessary steps, as instructed here.
2016-08-29 09:54:14,058 - ac2git - ERROR - The script has encountered an exception, aborting!
Traceback (most recent call last):
File "ac2git.py", line 3596, in AccuRev2GitMain
rv = state.Start(isRestart=args.restart, isSoftRestart=args.softRestart)
File "ac2git.py", line 2974, in Start
self.RetrieveStreams()
File "ac2git.py", line 1556, in RetrieveStreams
tr, commitHash = self.RetrieveStream(depot=depot, stream=streamInfo,dataRef=dataRef, stateRef=stateRef, hwmRef=hwmRef, startTransaction=self.config.accurev.startTransaction, endTransaction=endTr.id)
File "ac2git.py", line 1511, in RetrieveStream
dataTr, dataHash = self.RetrieveStreamData(stream=stream, dataRef=dataRef,stateRef=stateRef)
File "ac2git.py", line 1394, in RetrieveStreamData
commitHash = self.Commit(transaction=tr, allowEmptyCommit=True,messageOverride="transaction {trId}".format(trId=tr.id), parents=[], ref=dataRef)
File "ac2git.py", line 670, in Commit
self.PreserveEmptyDirs()
File "ac2git.py", line 440, in PreserveEmptyDirs
if git.GetGitDirPrefix(path) is None and len(os.listdir(path)) == 0:
FileNotFoundError: [WinError 3] The system cannot find the path specified:'C:///Users/*****/*****/app/node_modules/bower/node_modules/update-notifier/node_modules/latest-version/node_modules/package-json/node_modules/registry-url/node_modules/npmconf/node_modules/config-chain/node_modules/proto-list'
The error is quite vague and I can't seem to find any documentation on this tool that can help with the error. Has anyone faced this issue before?
I am not familiar with the tool you are using, but the last line in the output excerpt you provided seems to give the best information:
FileNotFoundError: [WinError 3] The system cannot find the path specified:'C:///Users/*****/*****/app/node_modules/bower/node_modules/update-notifier/node_modules/latest-version/node_modules/package-json/node_modules/registry-url/node_modules/npmconf/node_modules/config-chain/node_modules/proto-list'
That path looks malformed, with extra slashes and directory names that are not valid within the file system. Also, the path is 227 characters long in the output, and if the directory names between "Users" and "app" are long enough, you could be hitting the 260-character path length limit (MAX_PATH) in Windows.
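To check whether path length is the culprit, you could scan the working tree for paths at or beyond the limit. The sketch below is only an assumption about how such a check might look; find_long_paths is a hypothetical helper and is not part of ac2git.

```python
import os

# Classic Windows MAX_PATH: 260 characters including the terminating NUL,
# so a path of 260 or more characters does not fit.
MAX_PATH = 260

def find_long_paths(root, limit=MAX_PATH):
    """Yield absolute paths under `root` whose length is at least `limit`."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.abspath(os.path.join(dirpath, name))
            if len(full) >= limit:
                yield full
```

For example, list(find_long_paths('C:/some/checkout')) returns the entries that would be affected; the path shown here is a placeholder.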

Python os.walk() failing

I have created a script to give me the list of files in a folder directory. Yet, I am occasionally getting this error. What does this mean?
Portion of the error:
Script failed due to an error:
Traceback (most recent call last):
File "<script>", line 12, in <module>
File "C:\Program Files\Nuix\Nuix 6\lib\jython.jar\Lib\os.py", line 309, in walk
File "C:\Program Files\Nuix\Nuix 6\lib\jython.jar\Lib\os.py", line 299, in walk
File "C:\Program Files\Nuix\Nuix 6\lib\jython.jar\Lib\genericpath.py", line 41, in isdir
File "C:\Program Files\Nuix\Nuix 6\lib\jython.jar\Lib\genericpath.py", line 41, in isdir
java.lang.AbstractMethodError: org.python.modules.posix.PythonPOSIXHandler.error(Ljnr/constants/platform/Errno;Ljava/lang/String;Ljava/lang/String;)V
at jnr.posix.BaseNativePOSIX.stat(BaseNativePOSIX.java:309)
at jnr.posix.CheckedPOSIX.stat(CheckedPOSIX.java:265)
at jnr.posix.LazyPOSIX.stat(LazyPOSIX.java:267)
The script:
import os
import codecs
import shutil
import datetime
import sys
exportpath = 'P:/Output/Export7/{6136BAF2-85BA-4E64-8C11-A2C59398FC02}/'
tempnativefolder = 'NATIVESOrig'
for dir, sub, file in os.walk(exportpath + tempnativefolder):
    for fname in file:
        #source path
        source = os.path.join(dir, fname).replace('\\', '/')
        print source
print("Natives moved to subfolders")
I found out that the presence of certain characters (see the "diamond with question mark" character in the screenshot) in the file name causes the issue. Once I replaced those, my script worked. Thanks so much.
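A character shown as a diamond with a question mark is often U+FFFD, the Unicode replacement character, or a character that could not be decoded properly. Assuming that is the case here, a sketch like the following could list the offending names before the main script runs. It is written for Python 3 rather than the Jython 2 runtime used above, and both helper names are hypothetical.

```python
import os

def has_suspect_chars(name):
    """Return True if `name` contains U+FFFD or cannot be encoded as UTF-8."""
    if '\ufffd' in name:
        return True
    try:
        name.encode('utf-8')
    except UnicodeEncodeError:
        # Lone surrogates appear when a file name could not be decoded cleanly.
        return True
    return False

def find_suspect_names(root):
    """Return full paths of files under `root` whose names look damaged."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for fname in filenames:
            if has_suspect_chars(fname):
                hits.append(os.path.join(dirpath, fname))
    return hits
```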
What the error means: AbstractMethodError means that some code tried to call a method which was not implemented.
PythonPOSIXHandler implements jnr.posix.POSIXHandler. JRuby also uses JNR, and the interface is subtly different between the two. JRuby's newer copy of JNR has one additional #error(Errno, String, String) method, and Jython's implementation lacks it because it was compiled against a version of the interface from before that method existed.
I usually see this problem in the other direction - where stuff in Jython's jar breaks JRuby. I assume it entirely depends on the order of the jars in the classpath.

How to wait for a folder/file to be complete to finally process it in Python?

I'm trying to code a little script that watches a defined directory with a while-loop. Every file or directory that is in this directory is compressed to RAR and moved to another directory after the process is completed.
My problem: every time I copy a file or folder to this directory, the script doesn't wait and starts the process the second it sees a new file or folder. When the files or folders are bigger than a few kilobytes, the loop breaks with a permission error.
Since I'm a Python beginner, I don't know which module to use. Is there a module that checks whether the file or folder the tool wants to process is in use by another process? Or am I going in the wrong direction?
Edit: added the code for directory-only listening:
import os
import shutil
import subprocess

watchDir = "L:\\PythonTest\\testfolder\\"
finishedDir = "L:\\PythonTest\\finishedfolders\\"
rarfilesDir = "L:\\PythonTest\\rarfiles\\"
rarExe = "L:\\PythonTest\\rar.exe"
rarExtension = ".rar"
rarCommand = "a"
while True:
    dirList = [name for name in os.listdir(watchDir) if os.path.isdir(os.path.join(watchDir,name))]
    for entryName in dirList:
        if not os.path.exists((os.path.join(finishedDir,entryName))):
            sourcePath = os.path.join(watchDir,entryName)
            entryNameStripped = entryName.replace(" ", "")
            os.chdir(watchDir)
            archiveName = rarfilesDir+entryNameStripped+rarExtension
            subprocesscall = [rarExe, rarCommand, archiveName, entryName]
            subprocess.call(subprocesscall, shell=True)
            shutil.move(sourcePath,finishedDir)
When I run the script and try to add a file of several GB (named #filename# in the following lines) these errors occur:
Creating archive L:\PythonTest\rarfiles\#filename#.rar
Cannot open #filename#
The process cannot access the file, since it's used by another process.
Adding #filename# OK
WARNING: Cannot open 1 file
Done
Traceback (most recent call last):
File "C:\Python34\lib\shutil.py", line 522, in move
os.rename(src, real_dst)
PermissionError: [WinError 5] Access denied: #filepath#
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "L:/Python Test/test.py", line 35, in <module>
shutil.move(sourcePath,finishedDir)
File "C:\Python34\lib\shutil.py", line 531, in move
copytree(src, real_dst, symlinks=True)
File "C:\Python34\lib\shutil.py", line 342, in copytree
raise Error(errors)
shutil.Error: #filepath#
Instead of using os.listdir, you can use os.walk. For every directory in the tree, os.walk yields a 3-tuple: dirpath (the path of the directory), dirnames (all the subdirectories in dirpath), and filenames (all the files in dirpath).
for dirpath, dirnames, filenames in os.walk('path-of-directory'):
    pass  # do your work with dirpath, dirnames and filenames here
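Note that os.walk on its own does not tell you whether a copy into the watched folder has finished. One common heuristic, which is an assumption on my part and not something the answer above proposes, is to poll the total size of the folder and start processing only after it has stopped changing for a few checks in a row. A minimal sketch, with hypothetical helper names:

```python
import os
import time

def tree_size(path):
    """Return the total size in bytes of all files under `path`."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # The file may vanish or be locked while it is being copied.
                pass
    return total

def wait_until_stable(path, interval=2.0, checks=3):
    """Block until the size of `path` is unchanged for `checks` polls in a row."""
    last = -1
    stable = 0
    while stable < checks:
        size = tree_size(path)
        if size == last:
            stable += 1
        else:
            stable = 0
            last = size
        time.sleep(interval)
    return last
```

In the loop from the question, wait_until_stable(sourcePath) could be called before the rar step. The heuristic can still be fooled by a copy that stalls for longer than interval * checks seconds, so the values are worth tuning.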
