TensorFlow Get started page - print first 5 rows - python

I'm using PyCharm, and when I try to execute the statement from here:
!head -n5 {train_dataset_fp}
IDE complains that this is SyntaxError: invalid syntax and program never executes. I thought the entire tutorial on TensorFlow is in Python, but seems like this code from completely different language. Has anyone proceed successfully through the TensorFlow: Get Started tutorial?

This is not a python command, this is a unix one, to launch the head program.
You can use PyCharm to open a Terminal on your target machine, and type:
head -n5 {train_dataset_fp}
... replacing {train_dataset_fp} with the actual path to your dataset, which you obtained/printed in the previous step of the tutorial, c.f. lines:
train_dataset_fp = tf.keras.utils.get_file(fname=os.path.basename(train_dataset_url),
origin=train_dataset_url)
print("Local copy of the dataset file: {}".format(train_dataset_fp))

Since you're on Windows, you need to use Windows commands to achieve what head would do. If you have Powershell installed, you can use the command gc. If you don't, here's a workaround to print the first 5 lines of file.txt, prefixed with the line number:
findstr /n ".*" file.txt | findstr /b "[1-5]:"
inspired by this answer. Basically it numbers all lines in the file and then picks the first five. Obviously pretty inefficient for large files though. Use the "!" prefix as needed.

Related

ValueError: need more than 0 values to unpack (Python 2)

I am trying to replicate another researcher's findings by using the Python file that he added as a supplement to his paper. It is the first time I am diving into Python, so the error might be extremely simple to fix, yet after two days I haven't still. For context, in the Readme file there's the following instruction:
"To run the script, make sure Python2 is installed. Put all files into one folder designated as “cf_dir”.
In the script I get an error at the following lines:
if __name__ == '__main__':
cf_dir, cf_file, cf_phys_file = sys.argv[1:4]
os.chdir(cf_dir)
cf = pd.read_csv(cf_file)
cf_phys = pd.read_csv(cf_phys_file)
ValueError: need more than 0 values to unpack
The "cf_file" and "cf_phys_file" are two major components of all files that are in the one folder named "cf_dir". The "cf_phys_file" relates only to two survey question's (Q22 and Q23), and the "cf_file" includes all other questions 1-21. Now it seems that the code is meant to retrieve those two files from the directory? Only for the "cf_phys_file" the columns 1:4 are needed. The current working directory is already set at the right location.
The path where I located "cf_dir" is as follows:
C:\Users\Marc-Marijn Ossel\Documents\RSM\Thesis\Data\Suitable for ML\Data en Artikelen\Per task Suitability for Machine Learning score readme\cf_dir
Alternative option in readme file,
In the readme file there's this option, but also here I cannot understand how to direct the path to the right location:
"Run the following command in an open terminal (substituting for file names
below): python cfProcessor_AEAPnP.py cf_dir cf_file cf_phys_file task_file jobTaskRatingFile
jobDataFile OESfile
This should generate the data and plots as necessary."
When I run that in "Command Prompt", I get the following error, and I am not sure how to set the working directory correctly.
- python: can't open file 'cfProcessor_AEAPnP.py': [Errno 2] No such file or directory
Thanks for the reading, and I hope there's someone who could help me!
Best regards & stay safe out there during Corona!!
Marc
cf_dir, cf_file, cf_phys_file = sys.argv[1:4]
means, the python file expects few arguments when called.
In order to run
python cfProcessor_AEAPnP.py cf_dir cf_file cf_phys_file task_file jobTaskRatingFile jobDataFile OESfile
the command prompt should be in that folder.
So, open command prompt and type
cd path_to_the_folder_where_ur_python_file_is_located
Now, you would have reached the path of the python file.
Also, make sure you give full path in double quotes for the arguments.

Can't get working command line on prompt to work on subprocess

I need to extract text from a PDF. I tried the PyPDF2, but the textExtract method returned an encrypted text, even though the pdf is not encrypted acoording to the isEncrypted method.
So I moved on to trying accessing a program that does the job from the command prompt, so I could call it from python with the subprocess module. I found this program called textExtract, which did the job I wanted with the following command line on cmd:
"textextract.exe" "download.pdf" /to "download.txt"
However, when I tried running it with subprocess I couldn't get a 0 return code.
Here is the code I tried:
textextract = shlex.split(r'"textextract.exe" "download.pdf" /to "download.txt"')
subprocess.run(textextract)
I already tried it with shell=True, but it didn't work.
Can anyone help me?
I was able to get the following script to work from the command line after installing the PDF2Text Pilot application you're trying to use:
import shlex
import subprocess
args = shlex.split(r'"textextract.exe" "download.pdf" /to "download.txt"')
print('args:', args)
subprocess.run(args)
Sample screen output of running it from a command line session:
> C:\Python3\python run-textextract.py
args: ['textextract.exe', 'download.pdf', '/to', 'download.txt']
Progress:
Text from "download.pdf" has been successfully extracted...
Text extraction has been completed!
The above output was generated using Python 3.7.0.
I don't know if your use of spyder on anaconda affects things or not since I'm not familiar with it/them. If you continue to have problems with this, then, if it's possible, I suggest you see if you can get things working directly—i.e. running the the Python interpreter on the script manually from the command line similar to what's shown above. If that works, but using spyder doesn't, then you'll at least know the cause of the problem.
There's no need to build a string of quoted strings and then parse that back out to a list of strings. Just create a list and pass that:
command=["textextract.exe", "download.pdf", "/to", "download.txt"]
subprocess.run(command)
All that shlex.split is doing is creating a list by removing all of the quotes you had to add when creating the string in the first place. That's an extra step that provides no value over just creating the list yourself.

Python subprocess.call with multiline string EOF

I've hit a issue that I don't really understand how to overcome. I'm trying to create a subprocess in python to run another python script. Not too difficult. The issue is I'm unable to get around is EOF error when a python file includes a super long string.
Here's an example of what my files look like.
Subprocess.py:
### call longstr.py from the primary pyfile
subprocess.call(['python longstr.py'], shell = True)
Longstr.py
### called from subprocess.py
### the actual string is a lot longer; this is an example to illustrate how the string is formatted
lngstr = """here is a really long
string (which has many n3w line$ and "characters")
that are causing the python file to state the file is ending early
"""
print lngstr
Printer error in terminal
SyntaxError: EOF while scanning triple-quoted string literal
As a work around, I tried to remove all linebreaks as well as all spaces to see if it was due to it being multi-line. That still returned the same result.
My assumption is that when the subprocess is running and the shell is doing something with the file contents, when the new line is reached the shell itself is freaking out and that's what's terminating the process; not the file.
What is the correct workaround for having subprocess run a file like this?
Thank you for your help.
Answering my own question here; my problem was that I didn't file.close() before trying to execute a subprocess.call.
If you encounter this problem, and are working with recently written files this could be your issue too. Thank you to everyone who read or responded to this thread.

Python command line: editing mistake on previous line?

When using python via the command line, if I see a mistake on a previous line of a nested statement is there any way to remove or edit that line once it has already been entered?
e.g.:
>>> file = open("file1", "w")
>>> for line in file:
... parts = line.split('|') <-- example, I meant to type '\' instead
... print parts[0:1]
... print ";"
... print parts[1:]
so rather than retyping the entire thing all over to fix one char, can I go back and edit something in hindsight?
I know I could just code it up in vim or something and have a persistent copy I can do anything I want with, but I was hoping for a handy-dandy trick with the command line.
-- thanks!
You can't do such a thing in the original python interpreter, however, if you use the last version of IPython, it provides a lightweight GUI (looks like a simple shell, but is a GUI in fact) which features multi-line editing, syntax highlighting and a bunch of other things. To use IPython GUI, run it with the ipython qtconsole command.
Not that I know of in all the years I've been coding Python. That's what text editors are for =)
If you are an Emacs user, you can set your environment up such that the window is split into the code buffer and Python shell buffer, and then execute your entire buffer to see the changes.
Maybe. The Python Tutorial says:
Perhaps the quickest check to see whether command line editing is supported is typing Control-P to the first Python prompt you get. If it beeps, you have command line editing; see Appendix Interactive Input Editing and History Substitution for an introduction to the keys. If nothing appears to happen, or if ^P is echoed, command line editing isn’t available; you’ll only be able to use backspace to remove characters from the current line.
In addition to #MatToufoutu's suggestion, you might also take a look at DreamPie, though it's just a GUI for the shell without IPython's other extensions.
Now instead of ipython use
jupyter console
in cmd prompt

What to do with "The input line is too long" error message?

I am trying to use os.system() to call another program that takes an input and an output file. The command I use is ~250 characters due to the long folder names.
When I try to call the command, I'm getting an error: The input line is too long.
I'm guessing there's a 255 character limit (its built using a C system call, but I couldn't find the limitations on that either).
I tried changing the directory with os.chdir() to reduce the folder trail lengths, but when I try using os.system() with "..\folder\filename" it apparently can't handle relative path names. Is there any way to get around this limit or get it to recognize relative paths?
Even it's a good idea to use subprocess.Popen(), this does not solve the issue.
Your problem is not the 255 characters limit, this was true on DOS times, later increased to 2048 for Windows NT/2000, and increased again to 8192 for Windows XP+.
The real solution is to workaround a very old bug in Windows APIs: _popen() and _wpopen().
If you ever use quotes during the command line you have to add the entire command in quoates or you will get the The input line is too long error message.
All Microsoft operating systems starting with Windows XP had a 8192 characters limit which is now enough for any decent command line usage but they forgot to solve this bug.
To overcome their bug just include your entire command in double quotes, and if you want to know more real the MSDN comment on _popen().
Be careful because these works:
prog
"prog"
""prog" param"
""prog" "param""
But these will not work:
""prog param""
If you need a function that does add the quotes when they are needed you can take the one from http://github.com/ssbarnea/tendo/blob/master/tendo/tee.py
You should use the subprocess module instead. See this little doc for how to rewrite os.system calls to use subprocess.
You should use subprocess instead of os.system.
subprocess has the advantage of being able to change the directory for you:
import subprocess
my_cwd = r"..\folder\"
my_process = subprocess.Popen(["command name", "option 1", "option 2"], cwd=my_cwd)
my_process.wait() # wait for process to end
if my_process.returncode != 0:
print "Something went wrong!"
The subprocess module contains some helper functions as well if the above looks a bit verbose.
Assuming you're using windows, from the backslashes, you could write a .bat file from python and then os.system() on that. It's a hack.
Make sure when you're using '\' in your strings that they're being properly escaped.
Python uses the '\' as the escape character, so the string "..\folder\filename" evaluates to "..folderfilename" since an escaped f is still an f.
You probably want to use
r"..\folder\filename"
or
"..\\folder\\filename"
I got the same message but it was strange because the command was not that long (130 characters) and it used to work, it just stopped working one day.
I just closed the command window and opened a new one and it worked.
I have had the command window opened for a long time (maybe months, it's a remote virtual machine).
I guess is some windows bug with a buffer or something.

Categories