I'm following the python tutorial seen here (LINK) with this code:
#!/usr/bin/python
import os
import stat
filename = '/tmp/tmpfile'
mode = 0600|stat.S_IRUSR
# filesystem node specified with different modes
os.mknod(filename, mode)
That works well, but I want to create the file with group write permissions. When I change the mode to "write by group":
mode = 0600|stat.S_IWGRP
(from LINK2), the script runs without throwing an error, but the file doesn't end up with group write permissions. All of the mode permissions work except group write and others write.
How can I get my python/uwsgi/nginx app to create files with group write permissions?
try this:
mode = stat.S_IFREG | stat.S_IWGRP | stat.S_IRUSR
from the help docstring:
mknod(filename [, mode=0600, device])
Create a filesystem node (file, device special file or named pipe)
named filename. mode specifies both the permissions to use and the
type of node to be created, being combined (bitwise OR) with one of
S_IFREG, S_IFCHR, S_IFBLK, and S_IFIFO. For S_IFCHR and S_IFBLK,
device defines the newly created device special file (probably using
os.makedev()), otherwise it is ignored.
P.S. Try running umask in bash. If it returns something other than 0000, then it is automagically subtracting that value from the mode you specify (man 2 umask). Try running umask 0000, then run your Python script again!
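If you can't change the shell's umask, here's a minimal sketch that clears it around the mknod call so the group-write bit survives (filename taken from the question; the exact permission bits are an assumption):
import os
import stat

filename = '/tmp/tmpfile'
# S_IFREG marks a regular file; the rest are the permission bits
mode = stat.S_IFREG | stat.S_IRUSR | stat.S_IWUSR | stat.S_IWGRP

old_umask = os.umask(0)    # umask bits are masked out of mode, so clear them
try:
    os.mknod(filename, mode)
finally:
    os.umask(old_umask)    # restore the previous umask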
Related
By default, open creates files with octal permission 666: -rw-rw-rw-. I wonder if there's a way to make open create files with the execution bit set. For instance, assuming my system's umask value is 0000, any file created with open will have permission -rw-rw-rw-:
$ umask
0000
>>> open("aaa", "w")
$ ls -l aaa
-rw-rw-rw- 1 Kuser Kuser 0 Jun 19 08:44 aaa
I'm looking for a way to set the default permission value of open to octal 777 so I can write executable files directly, without os.chmod. Or, more generally, is there a way to achieve this in Python, perhaps using lower-level file-processing tools from the os module? touch and most editors use octal permission 666 by default.
I wasn't able to get the execution bit set on files created by the touch command either; touch also uses 666 by default.
Note: this is just an artificial question.
open accepts an opener argument: a callable that returns a file descriptor. os.open is such a callable, and its mode parameter defaults to 0o777 (still filtered by the umask).
import os
with open("aaa", "w", opener=os.open) as f:
    ...
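If you'd rather pass an explicit mode than rely on os.open's default, you can wrap os.open in your own opener. A minimal sketch, assuming a hypothetical opener_777 helper (the name and the script contents are just for illustration):
import os

def opener_777(path, flags):
    # open() calls this as opener(path, flags); pass an explicit mode through
    return os.open(path, flags, mode=0o777)

# the resulting permissions are still filtered by the process umask
with open("script.sh", "w", opener=opener_777) as f:
    f.write("#!/bin/sh\necho hello\n")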
On a Windows machine, I am trying to get a file's mode using the os module in Python, like this (short snippet):
import os
from stat import *
file_stat = os.stat(path)
mode = file_stat[ST_MODE]
An example for the mode I got for a file is 33206.
My question is: how can I convert it to the Linux-style octal representation (for example, 666)?
Thanks to all repliers!
Edit:
Found my answer down here :) For anyone who wants to understand this topic further:
understanding and decoding the file mode value from stat function output
Check if this translates properly:
import os
import stat
file_stat = os.stat(path)
mode = file_stat[stat.ST_MODE]
print oct(stat.S_IMODE(mode))
For your example:
>>> print oct(stat.S_IMODE(33206))
0666
Took it from here. Read it for more explanation.
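To see what S_IMODE actually strips off: 33206 is 0100666 in octal, i.e. the regular-file type bits plus permission bits 0666. A quick illustration:
import stat

mode = 33206                   # 0100666 in octal
print oct(stat.S_IFMT(mode))   # 0100000 -> the file-type bits (S_IFREG)
print oct(stat.S_IMODE(mode))  # 0666    -> just the permission bits
print stat.S_ISREG(mode)       # True    -> it's a regular file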
One workaround would be to use os.system(r'attrib -h -s d:\your_file.txt'), where you can use the attribute switches:
R - assigns the "Read-Only" attribute to your selected files or folders.
H - assigns the "Hidden" attribute to your selected files or folders.
A - prepares your selected files or folders for "Archiving".
S - changes your selected files or folders by assigning the "System" attribute.
First of all, thank you to everyone on Stack Overflow for past, present, and future help. You've all saved me from disaster (both of my own design and otherwise) too many times to count.
The present issue is part of a decision at my firm to transition from a Microsoft SQL Server 2005 database to PostgreSQL 9.4. We have been following the notes on the Postgres wiki (https://wiki.postgresql.org/wiki/Microsoft_SQL_Server_to_PostgreSQL_Migration_by_Ian_Harding), and these are the steps we're following for the table in question:
Download table data [on Windows client]:
bcp "Carbon.consensus.observations" out "Carbon.consensus.observations" -k -S [servername] -T -w
Copy to Postgres server [running CentOS 7]
Run Python pre-processing script on Postgres server to change encoding and clean:
import sys
import os
import re
import codecs
import fileinput

base_path = '/tmp/tables/'
cleaned_path = '/tmp/tables_processed/'
files = os.listdir(base_path)

for filename in files:
    source_path = base_path + filename
    temp_path = '/tmp/' + filename
    target_path = cleaned_path + filename
    BLOCKSIZE = 1048576 # or some other, desired size in bytes
    with open(source_path, 'r') as source_file:
        with open(target_path, 'w') as target_file:
            start = True
            while True:
                contents = source_file.read(BLOCKSIZE).decode('utf-16le')
                if not contents:
                    break
                if start:
                    if contents.startswith(codecs.BOM_UTF8.decode('utf-8')):
                        contents = contents.replace(codecs.BOM_UTF8.decode('utf-8'), ur'')
                contents = contents.replace(ur'\x80', u'')
                contents = re.sub(ur'\000', ur'', contents)
                contents = re.sub(ur'\r\n', ur'\n', contents)
                contents = re.sub(ur'\r', ur'\\r', contents)
                target_file.write(contents.encode('utf-8'))
                start = False
    for line in fileinput.input(target_path, inplace=1):
        if '\x80' in line:
            line = line.replace(r'\x80', '')
        sys.stdout.write(line)
Execute SQL to load table:
COPY consensus.observations FROM '/tmp/tables_processed/Carbon.consensus.observations';
The issue is that the COPY command is failing with a unicode error:
[2015-02-24 19:52:24] [22021] ERROR: invalid byte sequence for encoding "UTF8": 0x80
Where: COPY observations, line 2622420: "..."
Given that this could very likely be because of bad data in the table (which also contains legitimate non-ASCII characters), I'm trying to find the actual byte sequence in context, and I can't find it anywhere (I've tried sed to look at the line in question, regexes to replace the character as part of the preprocessing, etc.). For reference, this grep returns nothing:
cat /tmp/tables_processed/Carbon.consensus.observations | grep --color='auto' -P "[\x80]"
What am I doing wrong in tracking down where this byte sequence sits in context?
I would recommend loading the SQL file (which appears to be /tmp/tables_processed/Carbon.consensus.observations) into an editor that has a hex mode. This should allow you to see it (depending on the exact editor) in context.
gVim (or terminal-based Vim) is one option I would recommend.
For example, if I open an SQL copy file in gVim that has this content:
1 1.2
2 1.1
3 3.2
I can then convert it into hex mode via the command %!xxd (in gVim or terminal Vim) or the menu option Tools > Convert to HEX.
That yields this display:
0000000: 3109 312e 320a 3209 312e 310a 3309 332e 1.1.2.2.1.1.3.3.
0000010: 320a 2.
You can then run %!xxd -r to convert it back, or the Menu option Tools > Convert back.
Note: This actually modifies the file, so it would be advisable to do this to a copy of the original, just in case the changes somehow get written (you would have to explicitly save the buffer in Vim).
This way, you can see both the hex sequences on the left, and their ASCII equivalent on the right. If you search for 80, you should be able to see it in context. With gVim, the line numbering will be different for both modes, though, as is evidenced by this example.
It's likely the first 80 you find will be that line, though, since if there were earlier ones, it likely would've failed on those instead.
Another tool which might help, and which I've used in the past, is the graphical hex editor GHex. Since that's a GNOME project, I'm not quite sure it'll work with CentOS. wxHexEditor supposedly works with CentOS and looks promising from its website, although I've not yet used it. It's pitched as a "hex editor for massive files", so if your SQL file is large, that might be the way to go.
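If a hex editor is overkill, a short Python scan can also print the byte offset, line number, and surrounding context of every raw 0x80 byte; a rough sketch (path taken from the question):
data = open('/tmp/tables_processed/Carbon.consensus.observations', 'rb').read()

pos = data.find('\x80')
while pos != -1:
    line_no = data.count('\n', 0, pos) + 1        # 1-based line number
    context = data[max(0, pos - 20):pos + 20]     # bytes around the hit
    print "offset %d, line %d: %r" % (pos, line_no, context)
    pos = data.find('\x80', pos + 1)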
Python 2.4.x (cannot install any non-stock modules).
Question for you all (assuming the use of subprocess.Popen).
Say you had 20 - 30 machines - each with 6 - 10 files on them that you needed to read into a variable.
Would you prefer to scp into each machine, once for each file (120 - 300 scp commands total), reading each file into a variable after it's copied down and then discarding the file?
Or: ssh into each machine, once for each file, reading the file into memory (120 - 300 ssh commands total)?
Unless there's some other way to grab all the desired files in one shot per machine (the files are named YYYYMMDD.HH.blah; the given range would be 20111023.00 - 20111023.23) and read them into memory that I cannot think of?
Depending on the size of the file, you can possibly do something like:
...
files = "file1 file2 ..."
myvar = ""
for tm in machine_list:
    myvar = myvar + subprocess.check_output(["ssh", "user@" + tm, "/bin/cat " + files])
...
file1 file2 etc. are space-delimited. Assuming all are Unix boxes, you can /bin/cat them all in one shot from each machine (this assumes you are simply loading the ENTIRE content into one variable). Variations of the above are possible; ssh will be simpler to diagnose.
At least that's my thought.
UPDATE
use something like this instead (subprocess.check_output is not available on Python 2.4):
myvar = myvar + Popen(["ssh", "user@" + tm, ...], stdout=PIPE).communicate()[0]
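Putting that together for Python 2.4, a rough sketch (the host list, user name, and the YYYYMMDD.HH.blah range are filled in from the question; adjust to your environment):
from subprocess import Popen, PIPE

machine_list = ["host1", "host2"]    # your 20-30 machines
files = " ".join(["20111023.%02d.blah" % h for h in range(24)])

contents = {}
for tm in machine_list:
    p = Popen(["ssh", "user@" + tm, "/bin/cat " + files], stdout=PIPE)
    contents[tm] = p.communicate()[0]    # one ssh connection per machine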
Hope this helps.
scp lets you:
Copy entire directories using the -r flag: scp -r g0:labgroup/ .
Specify a glob pattern: scp 'g0:labgroup/assignment*.hs' .
Specify multiple source files: scp 'g0:labgroup/assignment1*' 'g0:labgroup/assignment2*' .
Not sure what sort of globbing is supported; odds are it's just done by the remote shell. I'm also not sure if it's smart enough to merge copies from the same server into one connection.
You could run a remote command via ssh that uses tar to bundle the files you want (letting the result go to standard out), capture the output into a Python variable, then use Python's tarfile module to split the files up again. I'm not actually sure how tarfile works; you may have to put the captured output into a file-like StringIO object before accessing it with tarfile.
This will save you a bit of time, since you'll only have to connect to each machine once, reducing time spent in ssh negotiation. You also avoid having to use local disk storage, which could save a bit of time and/or energy — useful if you are running in laptop mode, or on a device with a limited file system.
If the network connection is relatively slow, you can speed things up further by using gzip or bzip compression; decompression is supported by tarfile.
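A rough sketch of that idea, assuming Unix hosts and the question's file naming (the user, host, and StringIO detour are assumptions):
import subprocess
import tarfile
import StringIO

# remote tar writes the archive to stdout; capture it without touching disk
p = subprocess.Popen(["ssh", "user@host", "tar cf - 20111023.*.blah"],
                     stdout=subprocess.PIPE)
data = p.communicate()[0]

archive = tarfile.open(fileobj=StringIO.StringIO(data), mode="r:")
for member in archive.getmembers():
    contents = archive.extractfile(member).read()    # one file's bytes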
As an extra to Inerdia's answer, yes, you can get scp to transfer multiple files over one connection by using brace patterns:
scp "host:{path/to/file1,path/to/file2}" local_destination
And you can use the normal goodies of brace patterns if your files have common prefixes or suffixes:
scp "host:path/to/{file1,file2}.thing" local_destination
Note that the patterns are inside quotes, so they're not expanded by the shell before calling scp. I have a host with a noticeable connection delay, on which I created two empty files. Then executing the copy like the above (with the brace pattern quoted) resulted in a delay and then both files quickly transferred. When I left out the quotes, so the local shell expanded the braces into two separate host:file arguments to scp, then there was a noticeable delay before the first file and between the two files.
This suggests to me that Inerdia's suggestion of specifying multiple host:file arguments will not reuse the connection to transfer all the files, but using quoted brace patterns will.
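In Python this quoting takes care of itself, since subprocess passes the pattern as a single argument with no local shell expansion; the remote shell then expands the braces. A sketch with the question's file naming (user, host, and destination are assumptions):
import subprocess

hours = ",".join(["%02d" % h for h in range(24)])
source = "user@host1:20111023.{%s}.blah" % hours    # one brace pattern
subprocess.call(["scp", source, "/tmp/incoming/"])  # single connection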
In Python I'm running os.system('chmod o+w filename.png') so I can overwrite the file with pngcrush.
These are the permissions after I set them in python:
-rw-rw-rw- 1 me users 925 Sep 20 11:25 filename.png
Then I attempt:
os.system('pngcrush filename.png filename.png')
which is supposed to overwrite the file, but I get:
Cannot overwrite input file filename.png
What could be the problem? Isn't pngcrush being run as an 'other' user, for which write permissions are enabled?
Thanks!
The problem is with the way you execute the pngcrush program, not with the permissions of filename.png or with Python: giving pngcrush the same file for both input and output is simply invalid.
Give pngcrush either the -e or the -d option to tell it where to write its output. Read its man page for more information.
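For example, with -e you give the output a different extension and then swap it into place; a sketch (the "-crushed.png" suffix is arbitrary):
import os

# pngcrush writes filename-crushed.png instead of overwriting its input
os.system('pngcrush -e "-crushed.png" filename.png')
os.rename('filename-crushed.png', 'filename.png')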
Perhaps pngcrush isn't letting you use the same name for both input and output files? Have you tried changing the output filename? If so, what were the results?
As an aside (not related to the problem of the input and output files being the same), you can change the mode of a file using os.chmod, which is more efficient than running chmod:
import os
import stat
path = "filename.png"
mode = os.stat(path).st_mode # get current mode
newmode = mode | stat.S_IWOTH # set the 'others can write' bit
os.chmod(path, newmode) # set new mode
Perhaps you're supposed to give a different (non-existing) filename for output. Have you tried the same in a shell?