setting a "checkpoint" in Pycharm - python

I am using Python 2.7 with PyCharm and I am working on some quite large text files; they are about 3 GB in total.
I need to run LDA, PoS tagging, and other feature extraction methods on the data from the files, but every time I test my code, it has to read the files and go through the same process all over again from the beginning.
This is why I often use Jupyter because all the data / variables in previous cells are kept in memory.
Is there any way to do something similar with PyCharm?
For instance, let's say I am adding features to do_some_feature_extraction():
def do_some_feature_extraction(str_list):
    # feature extraction 1
    # feature extraction 2
    pass
str_list = []
with open("some_file.txt", "rb") as f_in:
    for line in f_in:
        str_list.append(line)
do_some_feature_extraction(str_list)
Say there was an error in "feature extraction 1" and I fixed it.
When I run the code again, there is another error in "feature extraction 2", so I fix that and run the code again from the beginning.
Instead of doing this, can I set some sort of checkpoint before executing do_some_feature_extraction(str_list)?

Click in the gutter on the left side of your code, next to the line number (or where the line number would be if you have them turned off).
A red dot should appear (this is called a breakpoint).
Now run the script in debug mode.
When execution reaches your breakpoint, click the Console tab,
and then click the interactive terminal button (>_) to work directly with the context of the program.
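Outside the debugger, a different technique from the breakpoint approach above is to cache the slow intermediate result on disk, so that later runs skip straight to the code being worked on. The following is only a minimal sketch under that assumption: checkpoint, read_lines, and the cache file name are hypothetical names chosen for illustration, and pickling is just one possible storage format.

```python
import os
import pickle


def checkpoint(cache_path, compute):
    """Return the result cached at cache_path, or run compute() and cache it."""
    if os.path.exists(cache_path):
        # A previous run already did the slow work; reuse its result.
        with open(cache_path, "rb") as f_cache:
            return pickle.load(f_cache)
    result = compute()
    with open(cache_path, "wb") as f_cache:
        pickle.dump(result, f_cache, protocol=pickle.HIGHEST_PROTOCOL)
    return result


def read_lines(path):
    """Slow step from the question: read every line of the input file."""
    with open(path, "rb") as f_in:
        return [line for line in f_in]
```

In the question's script this could be used as str_list = checkpoint("str_list.pkl", lambda: read_lines("some_file.txt")). The gain is largest when the cached step is expensive processing (for example PoS tagging) rather than plain file reading, and the cache file has to be deleted by hand whenever the input or the cached step changes.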


Unknown problem in Python. Printing does not work and files are not saved correctly

When I write something, such as variable assignments, the file is automatically renamed after the code I typed.
For example, if I write:
a = 20
f = 15
print(a+f)
then the code file will automatically be renamed to the first line, i.e. "a = 20"
Then, when I try to run the code, the program outputs nothing but "Python" and some incomprehensible words.
What could it be related to?
I installed the latest version of Visual Studio Code with Python; both are freshly installed, so there should be no problems. But this time something went wrong.
After reinstalling the program, the problem remains.
First of all, unless you have a special requirement, please do not use Code Runner to run the script; the official Python extension is a better choice.
In addition, the dot on your file tab means that the file has not been saved. You can add the following setting to settings.json to enable automatic saving:
"files.autoSave": "afterDelay",
You may have created the file via File --> New File... --> Python File. At that point the file has not been named or saved, and you can see that no such file appears in the Explorer list.
So the file tab shows the first line of code. This is a feature of VS Code; you can refer to this link. Because the file has not been saved, executing the script will cause problems.
You can rename the script file directly (F2), or VS Code will prompt you to name the file when you save. Another way to create a file is to right-click in the Explorer, choose New File..., and enter a filename ending with the .py extension.

Saving retrieved documents from API

I am currently trying to download a CIF file from materialsproject.org, which is only possible via an API. They told me to use MyBinder.org to run their code:
from mp_api.client import MPRester
from pymatgen.analysis.diffraction.xrd import XRDCalculator
from pymatgen.symmetry.analyzer import SpacegroupAnalyzer
with MPRester(api_key='8dI6UZHs3Nc9lxTp75RrJcPdwvPn6jZb') as mpr:
    # first retrieve the relevant structure
    structure = mpr.get_structure_by_material_id('mp-980949')
# important to use the conventional structure to ensure
# that peaks are labelled with the conventional Miller indices
sga = SpacegroupAnalyzer(structure)
conventional_structure = sga.get_conventional_standard_structure()
# this example shows how to obtain an XRD diffraction pattern
# these patterns are calculated on-the-fly from the structure
calculator = XRDCalculator(wavelength='CuKa')
pattern = calculator.get_pattern(conventional_structure)
When I run the code it tells me "Retrieving MaterialsDoc documents: 100%". How do I go on from here? I assume it has only retrieved the document, not downloaded it onto my PC yet. I have exactly zero knowledge about programming and APIs. It also doesn't need to be downloaded as a CIF file; a simple .txt file would also help, since I could create my own CIF file from that.
I tried running the code on my PC with Python, but nothing really happens. After searching Google for ways to download data retrieved from APIs, I copied some code that others used to download retrieved data, but that didn't work either.
The little you've provided (which may be all they told you?) assumes prior knowledge of or familiarity with Python and/or MyBinder. If they didn't provide you that then perhaps they directed you to other materials and resources on their sites?
Here's how you'd accomplish this, I think, without the authors helping further:
1. Click here to start up a temporary session that already has all the dependencies needed to run your code installed.
2. Start a new notebook and enter your code. (Alternatively, to have that code and the further steps already in a notebook, enter !curl -OL https://gist.githubusercontent.com/fomightez/8ef2f588965fbc10f19db79ee0035094/raw/778f399652a08a1f4be70400c5592c59e59cd876/demo_use_mp_api_with_MaterialsProject.ipynb in the new notebook cell and run it. Give it a few seconds to fetch that notebook file, then double-click the demo_use_mp_api_with_MaterialsProject.ipynb file that should now appear in the file browser panel on the left. With that notebook open, select 'Run' > 'Run All Cells' from the menu area at the top and look there for the rest of what I write about here.)
3. I noted that it didn't generate any files even though it said 'Retrieving MaterialsDoc documents'. So the retrieved contents must be among the Python objects now active in the session, it seems.
4. To see what those are, I looked at the assigned variables, entered each one as the last line of a Jupyter notebook cell, and ran those cells. For example, I entered the following in a cell:
structure
Ran that cell and then I saw:
Structure Summary
Lattice
abc : 7.086166087833533 7.086166087833533 8.514034991951231
angles : 54.50319138768853 54.50319138768853 54.77885193228324
volume : 264.22028210282747
A : 3.259891 6.291809 0.0
B : -3.259891 6.291809 0.0
C : 0.0 5.567899 6.441063
pbc : True True True
PeriodicSite: W (1.9253, 6.2918, 0.0000) [0.7953, 0.2047, 0.0000]
PeriodicSite: W (-1.9253, 6.2918, 0.0000) [0.2047, 0.7953, 0.0000]
PeriodicSite: Cl (0.0000, 13.1912, 5.4807) [0.6718, 0.6718, 0.8509]
PeriodicSite: Cl (0.0000, 7.1635, 5.2908) [0.2058, 0.2058, 0.8214]
PeriodicSite: Cl (1.7149, 10.5162, 4.5739) [0.7845, 0.2585, 0.7101]
PeriodicSite: Cl (-1.7149, 10.5162, 4.5739) [0.2585, 0.7845, 0.7101]
PeriodicSite: Cl (1.7149, 7.6353, 1.8672) [0.7415, 0.2155, 0.2899]
PeriodicSite: Cl (-1.7149, 7.6353, 1.8672) [0.2155, 0.7415, 0.2899]
PeriodicSite: Cl (0.0000, 10.9880, 1.1503) [0.7942, 0.7942, 0.1786]
PeriodicSite: Cl (0.0000, 4.9603, 0.9603) [0.3282, 0.3282, 0.1491]
I also did that with pattern and conventional_structure.
(Note that the last line of a Jupyter notebook cell is special: whatever expression is on that line gets evaluated, and its result is displayed as the cell's output, as in a REPL.)
5. You can select and copy what the output shows for those, and then make files on your local machine by pasting the clipboard into your favorite code editor. For your needs maybe that is sufficient? If you read on, I describe how you can send the values assigned to the variables to text files you can download without copying and pasting.
6. If you want to make text files from the contents assigned to each variable, you can use the Jupyter/IPython %store magic to send the value of a variable to a text file. I'll use pattern as an example this time. Enter the following in a cell to save the value of pattern to a text file:
%store pattern >pattern.txt
You'll see pattern.txt show up in the file browser a few seconds later after it automatically updates. You can run ls in a cell if you don't want to wait. It should be listed among the files present in the current working directory.
7. Anything useful that you make needs to be downloaded from the temporary session before it times out. So if you made text files in step #6, you'll want to download them from the remote session to your local machine. If you did this work in a notebook, you may want to download that as well. For example, that's how I got this, which I optionally suggested you fetch and run in your session in step #2 above.
8. To download the files shown in the file browser on the left, locate them in the list, right-click each one individually, and select 'Download' from the menu that comes up. You'll be prompted for where to save the files on your local machine.
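As a plain-Python alternative to the %store magic described above, the printable form of any variable can be written to a text file directly. This is only a sketch: save_as_text is a hypothetical helper name, not part of mp_api or pymatgen, and it relies solely on the object's str() output, which is roughly what the notebook displays.

```python
def save_as_text(obj, path):
    """Write the printable form of obj to a plain text file at path."""
    with open(path, "w") as f_out:
        f_out.write(str(obj))
        f_out.write("\n")
```

In the notebook from the question this would be called as save_as_text(structure, "structure.txt"), after which the file can be downloaded from the file browser like any other. If an actual CIF file is wanted, pymatgen structures should, as far as I know, also support structure.to(filename="mp-980949.cif"); treat that call as an assumption to check against the pymatgen documentation.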

Saving lines of script to a text file?

I am wondering whether it is possible to save a few lines of code (as they are) from a Python script to a text file while the script runs in PyCharm. I run the script every time with new arguments, and I would like to save these arguments automatically along with the other results, so that I know which arguments lead to which results without having to record them manually.
For example, I want to save the following line of Python code from the script to a text file so that I know I multiplied loss2 by 0.75:
"return loss + 0.75*loss2"
OR
"self.lstm = tf.keras.layers.Bidirectional(LSTM(80, return_sequences=False, return_state=False), merge_mode='ave')"
I want to save these kinds of script lines to a text file while the script is running, specifically in PyCharm.

I'm using same py file in VS Code to learn different examples. Why does it still run the first block of code after I delete and write different code?

I started with this example:
cars = ['audi', 'bmw', 'subaru', 'toyota']
for car in cars:
    if car == 'bmw':
        print(car.upper())
    else:
        print(car.title())
I deleted this and moved on to a new example, within the same .py file:
requested_topping = 'mushrooms'
if requested_topping != 'anchovies':
    print("Hold the anchovies!")
After I click Run on this second piece of code, VS Code still prints the output for the cars example. What could be the matter?
The white dot indicates the file has not been saved.
You can add this in the settings.json file:
"files.autoSave": "afterDelay",
"files.autoSaveDelay": 1000,
to enable the autosave function in VS Code.
The issue is unlikely to be due to the code you've written. It is likely related to how you're saving your .py file and how it is being run.
Are you running the file using the 'play' button / Ctrl+F5 in VS Code, or are you double-clicking the .py file in File Explorer / Finder?
If it's the latter, then perhaps you're not saving the file in VS Code first; try saving it with Ctrl+S before running.
Screenshots would likely go a long way toward helping others answer this question.
Press Command+N (New File...), change the language mode from Plain Text to Python, paste the new block, and run it?

Problem with exiting a Word doc using Python

This is my first time posting here, so be kind :) I am writing a program that opens many Microsoft Word 2007 documents (well in excess of 1000), reads from a certain table in each document, and writes that information to an Excel file. I have all of this working. The only problem is that when I run my code, it does not close MS Word after opening each document; I have to do this manually at the end of the run by opening Word and selecting the Exit Word option from the Home menu. Another problem is that if I run the program twice in a row, everything goes wrong on the second run: it prints the same thing repeatedly no matter which document is selected. I think this may have to do with how MS Word decides which document is active, e.g. it may still be opening the last active document that was not closed during the previous run. Here is my code for the opening and closing part; I won't bore you with the rest:
MSWord = win32com.client.Dispatch("Word.Application")
MSWord.Visible = 0
# Open a specific file
#myWordDoc = tkFileDialog.askopenfilename()
MSWord.Documents.Open("C:\\Documents and Settings\\fdosier" + chosen_doc)
#Get the textual content
docText = MSWord.Documents[0].Content
charText = MSWord.Documents[0].Characters
# Get a list of tables
ListTables = MSWord.Documents[0].Tables
------Main Code---------
MSWord.Documents.Close
MSWord.Documents.Quit
del MSWord
Basically, Python is not VBA, so this:
MSWord.Documents.Close
is equivalent to:
getattr(MSWord.Documents, "Close")
i.e. you just get some method object and do nothing with it. You need to call the method with the call operator (the parentheses :) :
MSWord.Documents.Close()
Accordingly for .Quit.
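The difference between referencing a method and calling it can be seen without Word at all. The sketch below uses a stand-in class, FakeDocuments, which is purely hypothetical and only mimics the shape of MSWord.Documents; it is not the win32com object itself.

```python
class FakeDocuments(object):
    """Hypothetical stand-in for MSWord.Documents, for illustration only."""

    def __init__(self):
        self.closed = False

    def Close(self):
        self.closed = True


docs = FakeDocuments()

docs.Close  # only looks the method up; nothing is executed
still_open = not docs.closed

docs.Close()  # the parentheses actually invoke the method
now_closed = docs.closed
```

After the bare docs.Close line the stand-in is still open; only the docs.Close() call changes its state, which is the same reason Word stays running in the question's code.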
Before your MSWord.Quit, did you try using:
MSWord.ActiveWindow.Close()
Or, even more simply, just doing
MSWord.Quit()
I don't really understand whether you are trying to close a document or the application.
I think you need an MSWord.Quit() at the end (before and/or instead of the del).
