On Python 2.7, I am currently using the following code to send data via a POST request to a webpage (unfortunately, I cannot really change this). I prepare a string data according to http://everydayscripting.blogspot.co.at/2009/09/python-jquery-open-browser-and-post.html, write it to a file, and then open the file with webbrowser.open:
f = tempfile.NamedTemporaryFile(delete=False)  # keep the file around after close()
f.write(data)
f.close()
webbrowser.open(f.name)  # open the prepared page, which submits the POST request
time.sleep(1)            # give the browser time to read the file
f.unlink(f.name)         # remove the temporary file
However, I have learned that sleeping for a second is sometimes a little too little: I might delete the file before the data have been submitted.
How can I avoid this?
One idea is, of course, to delete the file later, but when could this be? The whole thing is a method in a class - is there a method that is reliably executed on destruction? Or is it somehow possible to start the browser in a way that it does not return until the tab is closed?
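One possible sketch, assuming that deleting at interpreter exit is late enough for your use case: register the cleanup with atexit instead of sleeping (the function name and the .html suffix are just illustrative, not from the original post):

import atexit
import os
import tempfile
import webbrowser

def submit_via_browser(data):
    # Write the prepared form/HTML data to a named temp file.
    f = tempfile.NamedTemporaryFile(suffix=".html", delete=False)
    f.write(data)
    f.close()
    # Remove the file when the interpreter exits, instead of
    # guessing how long the browser needs it.
    atexit.register(os.unlink, f.name)
    webbrowser.open(f.name)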
Here is my code for accessing and editing the file:
def edit_default_settings(self, setting_type, value):
    with open("cam_settings.json", "r") as f:
        cam_settings = json.load(f)
    cam_settings[setting_type] = value
    with open("cam_settings.json", 'w') as f:
        json.dump(cam_settings, f, indent=4)
I use it in a program that runs for several hours a day, and about once a week I notice that the cam_settings.json file has become empty (literally empty: the file explorer shows 0 bytes), but I can't imagine how that is possible.
I would be glad to hear some comments on what could go wrong.
I can't see any issues with the code itself, but there can be an issue with the execution environment. Are you running the code in a multi-threaded environment, or running multiple instances of the same program at once?
This situation can arise if the code is executed in parallel and multiple threads/processes try to access the file at the same time. Try logging each time the function is executed and whether it completed successfully, and add exception handlers and error logging.
If this is the problem, using buffers or a singleton pattern (a single writer that serializes access to the file) can solve the issue, as in the sketch below.
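If concurrent writers within one process do turn out to be the cause, here is a minimal sketch of serializing the read-modify-write with a module-level lock (the lock name, function signature, and path default are illustrative, not from the original post):

import json
import threading

_settings_lock = threading.Lock()  # hypothetical lock shared by all writers

def edit_default_settings(setting_type, value, path="cam_settings.json"):
    # Serialize the whole read-modify-write cycle so that two threads
    # can never truncate and rewrite the file at the same time.
    with _settings_lock:
        with open(path, "r") as f:
            cam_settings = json.load(f)
        cam_settings[setting_type] = value
        with open(path, "w") as f:
            json.dump(cam_settings, f, indent=4)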
As #Chels said, the file is truncated when it's opened with 'w'. That doesn't explain why it stays that way; I can only imagine that happening if your code crashed. Maybe you need to check logs for code crashes (or change how your code is run so that crash reasons get logged, if they aren't).
But there's a way to make this process safer in case of crashes. Write to a separate file and then replace the old file with the new file, only after the new file is fully written. You can use os.replace() for this. You could do this simply with a differently-named file:
with open(".cam_settings.json.tmp", 'w') as f:
json.dump(cam_settings, f, indent=4)
os.replace(".cam_settings.json.tmp", "cam_settings.json")
Or you could use a temporary file from the tempfile module.
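For example, a rough sketch of that tempfile variant could look like this (the function name save_settings and the .tmp suffix are assumptions, not from the original answer):

import json
import os
import tempfile

def save_settings(cam_settings, path="cam_settings.json"):
    # Write to a temporary file in the same directory, then atomically
    # swap it into place; a crash mid-write leaves the old file intact.
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(cam_settings, f, indent=4)
    os.replace(tmp_path, path)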
When opening a file with the "w" parameter, the file's existing content is erased as soon as it is opened (you actually replace what was written before).
Not sure if this is what you are looking for, but it could be one of the reasons why "cam_settings.json" becomes empty after the call to open("cam_settings.json", 'w')!
In such a case, to append text instead, use the "a" parameter:
open("cam_settings.json", 'a')
I'm trying to loop through some license IDs to get data from a website. For example, when I enter the ID "E09OS0018" in the search box, I get a list of one school/daycare. But when I run the following code in my Python script (website link and arguments obtained from the developer tools), I get no data in the file. What's wrong with this requests.get() call? If I should use requests.post() instead, what arguments would I use with requests.post() (I'm not very familiar with that approach)?
flLicenseData = requests.get('https://cares.myflfamilies.com/PublicSearch/SuggestionSearch?text=E09OS0018&filter%5Bfilters%5D%5B0%5D%5Bvalue%5D=e09os0018&filter%5Bfilters%5D%5B0%5D%5Boperator%5D=contains&filter%5Bfilters%5D%5B0%5D%5Bfield%5D=&filter%5Bfilters%5D%5B0%5D%5BignoreCase%5D=true&filter%5Blogic%5D=and')
openFile = open('fldata', 'wb')
for chunk in flLicenseData.iter_content(100000):
    openFile.write(chunk)
Do openFile.flush() before checking the file's contents.
Most likely, you are reading the file before the contents have actually been written to it.
There can be a lag between the content being written to the file handle and the content actually being transferred to the physical file, due to the levels of buffering between the programming-language API, the OS, and the physical file.
Use openFile.flush() to ensure that the data is written to the file.
An excellent explanation of flush can be found here.
Alternatively, close the file with openFile.close() or use a context manager:
with open('fldata', 'wb') as open_file:
    for chunk in flLicenseData.iter_content(100000):
        open_file.write(chunk)
I'm trying to create an asynchronous function that reads the constantly updating log file and gets every line of it. That's what I have for now:
async def log_reader():
    with open(LOG_PATH, "r", encoding='utf-8', errors='ignore') as logfile:
        logfile.seek(0, os.SEEK_END)
        while True:
            line = logfile.readline()
            if not line:
                await asyncio.sleep(0.2)
                continue
            # do stuff
It works fine until the log file is recreated (for example by log rotation). I was thinking about checking whether the file's size has become smaller than it was, which would mean it was refreshed, but I feel there must be a better option for that.
Any tips are welcome.
To detect that the file has been refreshed, you can check its inode. Get it from the path using os.stat and then extract the inode number (st_ino). If the inode you get is different from the previous one, you'll have to reopen the file (so doing this inside a with block may not be easy).
To optimise it a bit so you don't stat the file all the time, you could add a check interval that you can easily accept, but which is longer than the usual delay between log lines.
This will work if the file has been replaced, which is the usual method of rotating logfiles. It will not work if the file has only been truncated.
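A rough sketch of that idea, assuming a hypothetical handle_line() callback for each line and keeping the 0.2 s polling from the question, could look like this:

import asyncio
import os

async def log_reader(log_path):
    first_open = True
    while True:
        # Remember which inode we opened so we can detect replacement.
        current_inode = os.stat(log_path).st_ino
        with open(log_path, "r", encoding="utf-8", errors="ignore") as logfile:
            if first_open:
                # Skip the existing history only on the very first open.
                logfile.seek(0, os.SEEK_END)
                first_open = False
            while True:
                line = logfile.readline()
                if line:
                    handle_line(line)  # hypothetical per-line callback
                    continue
                await asyncio.sleep(0.2)
                if os.stat(log_path).st_ino != current_inode:
                    # The file was replaced (e.g. rotated): reopen it.
                    break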
Do files opened like file("foo.txt") have any info about file modification time?
Basically I want to know if the file has been modified or replaced since a certain time, but if the file is replaced between checking modification time and opening the file, then you have inaccurate information.
How can I be sure?
Thanks.
UPDATE
#rubayeet: Thanks for the answer (+1), I actually didn't think of that. But... what do I do if the modification time has changed? Perhaps I reload the file. But what if it changes again that time? If the file is being touched regularly, I could end up in a loop forever! What I really want is a way to just get an open file handle and a modification time to go with it, without a potential infinite loop.
PS The answer you gave was actually plenty good enough for my purposes as the file won't be changed regularly, its general interest on my part now.
UPDATE 2
Thinking the previous update through (and experimenting a little), I realize that simply knowing the file's modification time at the point the file was opened is not much use: if the file is modified while you are reading it, some or all of the modified data can end up in what you read. So you'd have to open and read/process the whole file and then check the mtime again (as per #rubayeet's answer) to see whether you may have stale data.
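For what it's worth, a small sketch of that read-then-recheck idea (the function name and the retry limit are arbitrary assumptions) could look like:

import os

def read_consistent(path, max_retries=5):
    # Re-read the file until the mtime is the same before and after
    # reading, so the result cannot be a mix of old and new content.
    for _ in range(max_retries):
        mtime_before = os.path.getmtime(path)
        with open(path) as fp:
            data = fp.read()
        if os.path.getmtime(path) == mtime_before:
            return data
    raise RuntimeError("file kept changing while being read")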
For simple modtimes you would use:
from os.path import getmtime
modtime = getmtime('/file/to/path')
If you want something like callback functionality, you could check the inotify bindings for Python: pyinotify.
You essentially set up a WatchManager, which notifies you in an event loop if any changes happen in the monitored directory. You register for specific events, like a file being opened (which changes the modtime if it is written to).
If you are interested in exclusive access to a file, I would point you to the fcntl module, which has some low-level file-locking mechanisms on file descriptors.
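A rough sketch of the pyinotify approach (watching one directory for completed writes; the handler and event names follow pyinotify's conventions as I recall them, so double-check against the docs):

import pyinotify

class ModifiedHandler(pyinotify.ProcessEvent):
    def process_IN_CLOSE_WRITE(self, event):
        # Called whenever a file in the watched directory is written and closed.
        print "modified:", event.pathname

wm = pyinotify.WatchManager()
wm.add_watch('/dir/to/watch', pyinotify.IN_CLOSE_WRITE)
notifier = pyinotify.Notifier(wm, ModifiedHandler())
notifier.loop()  # blocks and dispatches events as they arrive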
import os
filepath = '/path/to/file'
modifytime1 = os.path.getmtime(filepath)
fp = open(filepath)
modifytime2 = os.path.getmtime(filepath)
if modifytime1 != modifytime2:
    print "File modified after opening"
I needed to download a file within a python program, someone told me to do this.
source = urllib2.urlopen("http://someUrl.com/somePage.html").read()
open("/path/to/someFile", "wb").write(source)
It works very well, but I would like to understand the code.
When you have something like
patatoe = 1
isn't that a variable?
And when you have something like:
blabla()
isn't that how you define a function?
Please, I would LOVE to understand the code correctly.
The word "source" is a variable. When you call urllib2's urlopen method and pass it a URL, it will open that url. You could then type "source.read()" to read the web page (i.e. download it). In your example, it's combined into one line. See http://docs.python.org/library/urllib2.html
The second piece opens a file. The first argument is the path to the file. The "wb" part means that it will write in binary mode. If the file already exists, it will be overwritten. Normally, I would write it like this:
f = open("/path/to/someFile", "wb")
f.write(source)
f.close()
The way you're doing it is a shortcut. When that code is run and the script ends, the file is closed automatically. See also http://docs.python.org/tutorial/inputoutput.html#reading-and-writing-files
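For completeness, here is a sketch of the same thing with a with block (not part of the original snippet), which closes the file as soon as the block ends:

import urllib2

source = urllib2.urlopen("http://someUrl.com/somePage.html").read()
with open("/path/to/someFile", "wb") as f:
    f.write(source)  # the file is closed automatically when the block exits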
You define a function using the def keyword:
def f():
...
Without it, you are simply calling the function. open(...) returns a file object, which you then use to write the data out. It's practically the same as this:
f = open(...)
f.write(source)
It isn't quite the same, though, since the variable f holds onto the file object until it goes out of scope, whereas calling open(...).write(source) creates a temporary reference to the file object that disappears immediately after write() returns. The consequence of this is that the single-line form will immediately flush and close the file, while the two-line form will keep the file open (and possibly some or all of the output buffered) until f goes out of scope.
You can observe this behaviour in the interactive shell:
>>> f = open('xxx', 'w')
>>> f.write('hello')
>>> open('yyy', 'w').write('world')
Now, without exiting the interactive shell, open another terminal window and check the contents of xxx and yyy. They'll both exist, but only yyy will have anything in it. Also, if you go back to Python and invoke f = None or del f, you'll find that xxx has now been written to.
The first line is assigning the result of downloading the file to the variable source. source is then written to disk.
To answer your broader points:
You're right that variables are assigned with an equals sign (=). What we're doing in that first line is assigning whatever we receive from the URL to the variable source.
Parentheses (()) are used to call functions which have been defined by def. To call a function means to ask the function to act. The things inside of the parentheses are called arguments.
You should start with Learn Python the Hard Way to get an understanding of what is happening.
Here's a (hopefully understandable) explanation of the code I showed you the other day (How to download a file in python - feel free to comment here or on that question if you need any more details / explanation):
# Open a local file called "someFile.html" for writing (like opening notepad.exe but not entering text yet)
out_file = open("/path/to/someFile.html", "wb")
# Connect to the server at someUrl.com and ask for "somePage.html" - the socket sends the "GET /somePage.html HTTP/1.1" request.
# This is like typing the url in your browser window and (if there were an option for it) only getting the headers but not the page content yet.
conn = urllib2.urlopen("http://someUrl.com/somePage.html")
# Read the contents of the remote file "somePage.html".
# This is what actually gets data from the web server and
# saves the data into the 'pageSource' variable.
pageSource = conn.read()
# Write the data we got from the web page to our local file that we opened earlier: 'someFile.html'
out_file.write(pageSource)

# Close the local file so everything is flushed to disk.
out_file.close()