mrjob NoFileFound Exception with Cloudera CDH 5 cluster - python

I am getting this error while trying to run an mrjob example on a Hadoop cluster.
I have set up my HADOOP_HOME and I can also create a new directory on the HDFS file system.
I can run Python map-reduce jobs if I use Hadoop streaming directly. The issue only occurs with mrjob.
When I run this command:
python mr_word_freq_count.py -r hadoop --hadoop-bin /usr/bin/hadoop -o hdfs:///user/zkdmkrq/out1 hdfs:///user/zkdmkrq/input1
I get:
no configs found; falling back on auto-configuration
no configs found; falling back on auto-configuration
creating tmp directory /tmp/mr_word_freq_count.zkdmkrq.20150226.172000.917957
writing wrapper script to /tmp/mr_word_freq_count.zkdmkrq.20150226.172000.917957/setup-wrapper.sh
STDERR: mkdir: `hdfs:///user/zkdmkrq/tmp/mrjob/mr_word_freq_count.zkdmkrq.20150226.172000.917957/files/': No such file or directory
Traceback (most recent call last):
File "mr_word_freq_count.py", line 37, in <module>
MRWordFreqCount.run()
File "/usr/lib/python2.6/site-packages/mrjob/job.py", line 494, in run
mr_job.execute()
File "/usr/lib/python2.6/site-packages/mrjob/job.py", line 512, in execute
super(MRJob, self).execute()
File "/usr/lib/python2.6/site-packages/mrjob/launch.py", line 147, in execute
self.run_job()
File "/usr/lib/python2.6/site-packages/mrjob/launch.py", line 208, in run_job
runner.run()
File "/usr/lib/python2.6/site-packages/mrjob/runner.py", line 458, in run
self._run()
File "/usr/lib/python2.6/site-packages/mrjob/hadoop.py", line 238, in _run
self._upload_local_files_to_hdfs()
File "/usr/lib/python2.6/site-packages/mrjob/hadoop.py", line 265, in _upload_local_files_to_hdfs
self._mkdir_on_hdfs(self._upload_mgr.prefix)
File "/usr/lib/python2.6/site-packages/mrjob/hadoop.py", line 273, in _mkdir_on_hdfs
self.invoke_hadoop(['fs', '-mkdir', path])
File "/usr/lib/python2.6/site-packages/mrjob/fs/hadoop.py", line 109, in invoke_hadoop
raise CalledProcessError(proc.returncode, args)
subprocess.CalledProcessError: Command '['/usr/bin/hadoop', 'fs', '-mkdir', 'hdfs:///user/zkdmkrq/tmp/mrjob/mr_word_freq_count.zkdmkrq.20150226.172000.917957/files/']' returned non-zero exit status 1

I actually found the solution to this issue.
I had to alter the mrjob/hadoop.py file. The exact fix is described here:
https://github.com/Yelp/mrjob/issues/850
Hope it helps anyone who encounters this issue.
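For readers who cannot open the link: the patch itself is not quoted here, so what follows is only a sketch of one likely cause, not the fix from that issue. On Hadoop 2 (which CDH 5 ships), `hadoop fs -mkdir` does not create missing parent directories unless `-p` is passed, and the traceback shows mrjob calling `fs -mkdir` on a nested path. The helper names below are hypothetical and are not part of mrjob.

```python
import subprocess


def hdfs_mkdir_args(path, hadoop_bin="/usr/bin/hadoop", parents=True):
    """Build the argument list for `hadoop fs -mkdir`.

    hdfs_mkdir_args is a hypothetical helper, not an mrjob function.
    With parents=True, `-p` is added so that missing parent
    directories are created as well (Hadoop 2 behaviour).
    """
    args = [hadoop_bin, "fs", "-mkdir"]
    if parents:
        args.append("-p")
    args.append(path)
    return args


def hdfs_mkdir(path, **kwargs):
    """Run the command; raises CalledProcessError on a non-zero exit."""
    subprocess.check_call(hdfs_mkdir_args(path, **kwargs))
```

If this assumption holds, creating the parent directory by hand first (for example `hdfs:///user/zkdmkrq/tmp/mrjob/`) would be another way to test it without editing mrjob.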

Related

How to overwrite a Django app on PythonAnywhere?

After deploying the Django app to PythonAnywhere a second time (I re-edited the code in VS Code and did a git push), I got the following error.
WARNING: Package(s) not found: django
Traceback (most recent call last):
File "/home/hogehohe/.local/bin/pa_autoconfigure_django.py", line 47, in <module>
main(arguments['<git-repo-url>'], arguments['--domain'], arguments['--python'], nuke=arguments.get('--nuke'))
File "/home/hogehohe/.local/bin/pa_autoconfigure_django.py", line 36, in main
project.update_settings_file()
File "/home/hogehohe/.local/lib/python3.6/site-packages/pythonanywhere/django_project.py", line 74, in update_settings_file
new_django = version.parse(self.virtualenv.get_version("django")) >= version.parse("3.1")
File "/home/hogehohe/.local/lib/python3.6/site-packages/pythonanywhere/virtualenvs.py", line 32, in get_version
output = subprocess.check_output(commands).decode()
File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
**kwargs).stdout
File "/usr/lib/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/home/hogehohe/.virtualenvs/hogehohe.pythonanywhere.com/bin/pip', 'show', 'django']' returned non-zero exit status 1.
The command is
$ pa_autoconfigure_django.py https://github.com/[user_name]/[project_name].git --nuke
The first deployment succeeded but the second one did not. I don't know the cause or how to overwrite the existing deployment.
You need to have a requirements.txt file in your project that specifies the packages that you need for your app. I'm guessing that your first project had one that included django and that your second one does not.
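A quick way to check that guess before redeploying is to look for django in the project's requirements.txt. This is a minimal sketch with a hypothetical helper name; it only handles plain requirement lines (name plus optional version specifier), not every form pip accepts.

```python
import re


def requirements_include(requirements_text, package):
    """Return True if `package` is named on a requirement line.

    requirements_include is a hypothetical helper for illustration.
    Comments, blank lines and pip option lines (starting with '-')
    are ignored; names are compared case-insensitively.
    """
    wanted = package.lower().replace("_", "-")
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue
        match = re.match(r"[A-Za-z0-9._-]+", line)
        if match and match.group(0).lower().replace("_", "-") == wanted:
            return True
    return False
```

For example, reading the file with `open("requirements.txt").read()` and passing the text with `"django"` should return True for a project that will deploy cleanly, if the guess above is right.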

ValueError: path is on mount 'D:', start on mount 'C:' while doing pip install

I am new to Python and its ecosystem, so I am having trouble figuring out how to fix an error I get while trying to install a library for a project. I understand it has something to do with the fact that my computer has two hard drives, but I do not know how to fix it. (I know putting the project on the other drive probably would, but that drive is too small and really only holds my operating system.)
This is the traceback of the error:
Traceback (most recent call last):
File "D:\(name)\(project)\venv\lib\site-packages\pip-19.0.3-py3.8.egg\pip\_internal\cli\base_command.py", line 179, in main
status = self.run(options, args)
File "D:\(name)\(project)\venv\lib\site-packages\pip-19.0.3-py3.8.egg\pip\_internal\commands\install.py", line 384, in run
installed = install_given_reqs(
File "D:\(name)\(project)\venv\lib\site-packages\pip-19.0.3-py3.8.egg\pip\_internal\req\__init__.py", line 53, in install_given_reqs
requirement.install(
File "D:\(name)\(project)\venv\lib\site-packages\pip-19.0.3-py3.8.egg\pip\_internal\req\req_install.py", line 910, in install
self.move_wheel_files(
File "D:\(name)\(project)\venv\lib\site-packages\pip-19.0.3-py3.8.egg\pip\_internal\req\req_install.py", line 437, in move_wheel_files
move_wheel_files(
File "D:\(name)\(project)\venv\lib\site-packages\pip-19.0.3-py3.8.egg\pip\_internal\wheel.py", line 458, in move_wheel_files
clobber(source, dest, False, fixer=fixer, filter=filter)
File "D:\(name)\(project)\venv\lib\site-packages\pip-19.0.3-py3.8.egg\pip\_internal\wheel.py", line 424, in clobber
record_installed(srcfile, destfile, changed)
File "D:\(name)\(project)\venv\lib\site-packages\pip-19.0.3-py3.8.egg\pip\_internal\wheel.py", line 351, in record_installed
newpath = normpath(destfile, lib_dir)
File "D:\(name)\(project)\venv\lib\site-packages\pip-19.0.3-py3.8.egg\pip\_internal\wheel.py", line 68, in normpath
return os.path.relpath(src, p).replace(os.path.sep, '/')
File "C:\Users\(name)\AppData\Local\Programs\Python\Python38\lib\ntpath.py", line 703, in relpath
raise ValueError("path is on mount %r, start on mount %r" % (
ValueError: path is on mount 'D:', start on mount 'C:'
One more detail, which I don't think is relevant but include just in case: I am installing this package in PyCharm, which runs the pip command itself.
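To show where the ValueError comes from: the last frames of the traceback call os.path.relpath with a path on D: and a start directory on C:, and on Windows no relative path exists between two drives. The sketch below reproduces that with ntpath (the Windows flavour of os.path, importable on any platform). safe_relpath is a hypothetical helper for illustration only; it is not something pip provides and it does not fix the install.

```python
import ntpath


def safe_relpath(path, start):
    """Relative path from `start` to `path`, using Windows path rules.

    safe_relpath is a hypothetical helper. When the two paths are on
    different drives, ntpath.relpath raises ValueError; in that case
    the absolute path is returned instead.
    """
    try:
        return ntpath.relpath(path, start)
    except ValueError:
        # Raised when `path` and `start` are on different drives.
        return ntpath.abspath(path)
```

With both paths on D: this returns a relative path; with the project on D: and the start directory on C: it falls back to the absolute path instead of raising.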

Installing Google Cloud SDK throwing error in install.py when using install.bat

I am trying to install the Google Cloud SDK using install.bat. I have tried downloading the bundled-Python versions 275 and the current version 276; both fail at the same spot. It is able to find Python in the platform/bundledpython folder, so that is not the issue. I have also tried the suggestions online, including making sure that the "Find" command works at a command prompt. Any help is appreciated.
The latest available version is: 276.0.0
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
Traceback (most recent call last):
File "C:\google-cloud-sdk\google-cloud-sdk\\bin\bootstrapping\install.py", line 225, in <module>
main()
File "C:\google-cloud-sdk\google-cloud-sdk\\bin\bootstrapping\install.py", line 203, in main
Install(pargs.override_components, pargs.additional_components)
File "C:\google-cloud-sdk\google-cloud-sdk\\bin\bootstrapping\install.py", line 148, in Install
_CLI.Execute(['--quiet', 'components', 'list'])
File "C:\google-cloud-sdk\google-cloud-sdk\lib\googlecloudsdk\calliope\cli.py", line 1007, in Execute
self._HandleAllErrors(exc, command_path_string, specified_arg_names)
File "C:\google-cloud-sdk\google-cloud-sdk\lib\googlecloudsdk\calliope\cli.py", line 1040, in _HandleAllErrors
exceptions.HandleError(exc, command_path_string, self.__known_error_handler)
File "C:\google-cloud-sdk\google-cloud-sdk\lib\googlecloudsdk\calliope\exceptions.py", line 527, in HandleError
core_exceptions.reraise(exc)
File "C:\google-cloud-sdk\google-cloud-sdk\lib\googlecloudsdk\core\exceptions.py", line 146, in reraise
six.reraise(type(exc_value), exc_value, tb)
File "C:\google-cloud-sdk\google-cloud-sdk\lib\googlecloudsdk\calliope\cli.py", line 981, in Execute
resources = calliope_command.Run(cli=self, args=args)
File "C:\google-cloud-sdk\google-cloud-sdk\lib\googlecloudsdk\calliope\backend.py", line 809, in Run
display_info=self.ai.display_info).Display()
File "C:\google-cloud-sdk\google-cloud-sdk\lib\googlecloudsdk\calliope\display.py", line 483, in Display
self._printer.Print(self._resources)
File "C:\google-cloud-sdk\google-cloud-sdk\lib\googlecloudsdk\core\resource\resource_printer_base.py", line 279, in Print
self.Finish()
File "C:\google-cloud-sdk\google-cloud-sdk\lib\googlecloudsdk\core\resource\table_printer.py", line 467, in Finish
self._out.write(line)
File "C:\google-cloud-sdk\google-cloud-sdk\lib\googlecloudsdk\core\log.py", line 239, in write
self._Write(plain_text, styled_text)
File "C:\google-cloud-sdk\google-cloud-sdk\lib\googlecloudsdk\core\log.py", line 232, in _Write
self.__stream_wrapper.stream.write(stream_msg)
I just had the same problem trying to install the latest Google Cloud SDK (276.0.0). The Windows setup was stuck on "Installing components".
Looking at the process list with Process Explorer, I could see the command it was stuck on, which is probably how you got your stack trace. I took the command line and ran it in a separate Administrator cmd.exe (paths may differ per system, but the idea is the same).
cd "C:\Program Files (x86)\Google\Cloud SDK"
SET "CLOUDSDK_CORE_DISABLE_PROMPTS=1"
SET "CLOUDSDK_CONFIG=%APPDATA%\gcloud"
"C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\install.bat" --quiet --disable-installation-options --path-update "FALSE" --usage-reporting "true" --additional-components beta powershell
Running them would produce the same error / stack trace.
Editing log.py and commenting out (prefix with #) line 232 would make it get further, but it runs into a separate problem:
ERROR: Cannot use bundled Python installation to update Cloud SDK in non-interactive mode.
Please run again in interactive mode.
Enable prompts by setting the CLOUDSDK_CORE_DISABLE_PROMPTS environment variable to 0, remove --quiet and --disable-installation-options from the install.bat command line, and run it again.
SET "CLOUDSDK_CORE_DISABLE_PROMPTS=0"
"C:\Program Files (x86)\Google\Cloud SDK\google-cloud-sdk\install.bat" --path-update "FALSE" --usage-reporting "true" --additional-components beta powershell
This time it should continue, start a new console where it actually installs the components, and eventually finish successfully.
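If you would rather script that second, interactive run than type it, the command and environment can be assembled like this. It is only a sketch: interactive_install_command is a hypothetical helper, the flags are the ones quoted in this answer, and the SDK root is whatever path your installer used.

```python
import os


def interactive_install_command(sdk_root, base_env=None):
    """Build (args, env) for the interactive install.bat run.

    interactive_install_command is a hypothetical helper. It mirrors
    the second command above: prompts enabled, and no --quiet or
    --disable-installation-options.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CLOUDSDK_CORE_DISABLE_PROMPTS"] = "0"
    args = [
        sdk_root + "\\google-cloud-sdk\\install.bat",
        "--path-update", "FALSE",
        "--usage-reporting", "true",
        "--additional-components", "beta", "powershell",
    ]
    return args, env
```

On Windows, something like `subprocess.call(args, env=env)` from an Administrator prompt should then be equivalent to typing the two lines above, though I have only run the commands by hand.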

"abort: The system cannot find the file specified" in Mercurial

I have a large (~700MB) Mercurial repository. I can clone the repo fine without updating (and also it's totally browsable on Bitbucket, where it's hosted) but I can't update the working directory to the latest changeset because I get the following error:
... lot of getting [path] lines here
getting path/to/some/file.ext
abort: The system cannot find the file specified
[command returned code 255 Wed Jun 24 00:51:37 2015]
The last file before the error actually exists in the repo (it's visible in Bitbucket too).
I thought the issue was because of too long paths, but even cloning to a drive root yields the same. Paths could still be too long but "path/to/some/file.ext" is just 60 characters.
Running the command with traceback yields this:
Traceback (most recent call last):
File "mercurial\dispatch.pyo", line 160, in _runcatch
File "mercurial\dispatch.pyo", line 885, in _dispatch
File "mercurial\dispatch.pyo", line 646, in runcommand
File "mercurial\dispatch.pyo", line 976, in _runcommand
File "mercurial\dispatch.pyo", line 947, in checkargs
File "mercurial\dispatch.pyo", line 882, in <lambda>
File "mercurial\util.pyo", line 716, in check
File "mercurial\extensions.pyo", line 168, in closure
File "mercurial\util.pyo", line 716, in check
File "hgext\mq.pyo", line 3505, in mqcommand
File "mercurial\util.pyo", line 716, in check
File "mercurial\commands.pyo", line 6402, in update
File "mercurial\hg.pyo", line 535, in clean
File "mercurial\hg.pyo", line 520, in updaterepo
File "mercurial\merge.pyo", line 1140, in update
File "mercurial\merge.pyo", line 772, in applyupdates
File "mercurial\subrepo.pyo", line 246, in submerge
File "mercurial\context.pyo", line 252, in sub
File "mercurial\subrepo.pyo", line 341, in subrepo
File "mercurial\subrepo.pyo", line 1206, in __init__
File "mercurial\subrepo.pyo", line 1216, in _ensuregit
File "mercurial\subrepo.pyo", line 1294, in _gitnodir
File "subprocess.pyo", line 710, in __init__
File "subprocess.pyo", line 958, in _execute_child
WindowsError: [Error 2] The system cannot find the file specified
The repo has git subrepos (these are public repos on GitHub). And hg-git works for me otherwise, I'm able to pull from and push to git repos from hg.
Anybody with an idea how to solve this?
Solved the issue: Lazy Badger pointed me in the right direction. The problem was that the path to the git executable wasn't in my PATH environment variable.
Adding C:\Program Files (x86)\Git\bin\ (or wherever git.exe is on your system) to PATH under System variables solved it. I used Rapid Environment Editor for this because my PATH was over 1024 characters, so setx wasn't working.
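A quick way to confirm whether git is visible on PATH, which is what the subprocess call at the bottom of the traceback needs, is a check like the one below. It is a small sketch using shutil.which from Python 3.3+ (so run it with a separate Python 3, not the Python bundled with hg); describe_executable is a hypothetical helper name.

```python
import os
import shutil


def describe_executable(name="git"):
    """Report whether `name` can be found on the current PATH."""
    location = shutil.which(name)
    if location is None:
        entries = os.environ.get("PATH", "").split(os.pathsep)
        return "%s not found on PATH (%d entries searched)" % (name, len(entries))
    return "%s found at %s" % (name, location)
```

Calling `print(describe_executable("git"))` in a new console after changing PATH shows whether the change took effect.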

Marmalade deploy - deployment.py, line 62, in SetOpt

I have a MacBook Pro Retina and I'm trying to create a project from the .mkb file, but I get this error:
Building project: /Users/sergioandreotti/Downloads/twins/template/marmalade/FeedtheTwins.mkb
Traceback (most recent call last):
File "/Developer/Marmalade/6.1/s3e/makefile_builder/mkb.py", line 209, in <module>
run()
File "/Developer/Marmalade/6.1/s3e/makefile_builder/mkb.py", line 137, in run
main(sys.argv)
File "/Developer/Marmalade/6.1/s3e/makefile_builder/mkb.py", line 32, in main
exit_code = mkb_main.run(argv)
File "/p4/sdkbuild/sdk/main/s3e/makefile_builder/mkb_main.py", line 3461, in run
File "/p4/sdkbuild/sdk/main/s3e/makefile_builder/mkb_main.py", line 3619, in run2
File "/p4/sdkbuild/sdk/main/s3e/makefile_builder/mkb_main.py", line 2697, in process_mkb_for_platform
File "/p4/sdkbuild/sdk/main/s3e/makefile_builder/mkb_main.py", line 690, in process
File "/p4/sdkbuild/sdk/main/s3e/makefile_builder/mkb_main.py", line 2602, in process_file
File "/p4/sdkbuild/sdk/main/s3e/makefile_builder/mkb_main.py", line 2124, in process
File "/p4/sdkbuild/sdk/main/s3e/makefile_builder/mkb_main.py", line 2124, in <lambda>
File "/p4/sdkbuild/sdk/main/s3e/makefile_builder/mkb_main.py", line 1971, in process
File "/p4/sdkbuild/sdk/main/s3e/makefile_builder/mkb_main.py", line 1130, in process_deployment_line
File "/p4/sdkbuild/sdk/main/s3e/makefile_builder/deployment.py", line 62, in SetOpt
NameError: global name 'output' is not defined
Press enter to continue...
I've found this solution: https://devnet.madewithmarmalade.com/questions/2784/mkb-fails-to-build.html
but I don't think it's the best one.
It works for building the project, but I run into other problems when I deploy with the Marmalade deploy tool.
Sometimes the deploy fails, and the error log shows the same "global name 'output' is not defined" error.
I'm not able to save my configuration in the .mkb, because if I do, the deploy fails the next time I reload the configuration.
It was my fault: I had Marmalade 6.1.1 installed instead of 6.1.2.
6.1.1 doesn't provide Retina support, and the .mkb had the "Enable 4-inch Retina Support" tag set.
