Writing to a file - Python

I need my code to display output like this:
Hammad | Won | 5
The code I'm using is:
f = open("Statistics.txt", "a")
f.write(str(player_name) +' '+ str(Outcome)+' '+str(max_guesses)+"\n"
f = open("Statistics.txt", "r")
print(f.read())
f.close()
I need the output to be:
Hammad | Won | 6
Instead I'm getting:
Hammad Won 6

Python does not add the | character automatically when concatenating strings; you have to add it yourself:
f.write(str(player_name) +' | '+ str(Outcome)+' | '+str(max_guesses)+"\n")
PS: f.write needs a closing parenthesis (all function calls do).

Try replacing the write line with:
f.write(f'{player_name} | {Outcome} | {max_guesses}\n')

Replace f.write with this
f.write(str(player_name)+' | '+str(Outcome)+' | '+str(max_guesses)+"\n")
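Putting it together, a minimal sketch (assuming player_name, Outcome and max_guesses are already defined; with closes the file automatically):
with open("Statistics.txt", "a") as f:
    f.write(f'{player_name} | {Outcome} | {max_guesses}\n')
with open("Statistics.txt", "r") as f:
    print(f.read())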


Printing issues when moving from Python 2 to 3 with code

I have code that is written in Python 2 and I am trying to convert it to Python 3. However, it does not print in matrix format; I am getting the output in an unrecognizable table format.
The code is following:
for n in cols:
    print('\t', n),
print
cost = 0
for g in sorted(costs):
    print('\t', g)
    for n in cols:
        y = res[g][n]
        if y != 0:
            print (y),
            cost += y * costs[g][n]
        print ('\t'),
    print
print ("\n\nTotal Cost = ", cost)
The expected output is in the following format:
|0 |A |B |C |D |E |
|- |- |- |- |- |- |
|W |  |  |20|  |  |
|X |  |40|10|  |20|
|Y |  |20|  |  |30|
|Z |  |  |  |  |60|
Total Cost = 1000
Could you suggest what changes I need to make in this code?
In py2, print was a statement and did not take parentheses; in py3 it is a function, so they are a must.
python3 simple print
# in py3 a print call must look like this
print("my text")
python3 print with no newline / carriage return
Also, in your py2 code you have print ('\t'), — mind the trailing comma after the print, which means "do not add a newline after printing".
In python3 this would translate to
print('\t', end='')
Your print calls must always be enclosed in parentheses, like this:
print("\t", n)
Another thing - whenever you see the Python 2 print command with
a comma at the end, such as
print y,
you need to change that line to:
print(y,end="")
This will print the variable without a new line.
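Putting both rules together, the whole loop converted to Python 3 might look like this (a sketch; cols, res and costs are assumed to be defined as in your program):
for n in cols:
    print('\t', n, end='')   # py2 "print x," becomes end='' in py3
print()                      # bare py2 "print" becomes print()
cost = 0
for g in sorted(costs):
    print('\t', g, end='')
    for n in cols:
        y = res[g][n]
        if y != 0:
            print(y, end='')
            cost += y * costs[g][n]
        print('\t', end='')
    print()
print("\n\nTotal Cost = ", cost)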

Print multiline strings side-by-side

I want to print the items from a list on the same line.
The code I have tried:
dice_art = ["""
-------
| |
| N |
| |
------- ""","""
-------
| |
| 1 |
| |
------- """] etc...
player = [0, 1, 2]
for i in player:
    print(dice_art[i], end='')
output =
ASCII0
ASCII1
ASCII2
I want output to =
ASCII0 ASCII1 ASCII2
This code still prints the ASCII art representation of my die on a new line. I would like to print it on the same line to save space and show each player's roll on one screen.
Since the elements of dice_art are multiline strings, this is harder than just changing the end argument.
First, remove newlines from the beginning of each string and make sure all lines in ASCII art have the same length.
Then try the following
player = [0, 1, 2]
lines = [dice_art[i].splitlines() for i in player]
for l in zip(*lines):
    print(*l, sep='')
If you apply the described changes to your ASCII art, the code will print
------- ------- -------
|     ||     ||     |
|  N  ||  1  ||  2  |
|     ||     ||     |
------- ------- -------
The fact that your boxes are multiline changes everything.
Your intended output, as I understand it, is this:
------- -------
|     ||     |
|  N  ||  1  |   ...and so on...
|     ||     |
------- -------
You can do this like so:
art_split = [art.split("\n") for art in dice_art]
zipped = zip(*art_split)
for elems in zipped:
    print("".join(elems))
# ------- -------
# |     ||     |
# |  N  ||  1  |
# |     ||     |
# ------- -------
N.B. You need to guarantee that each line of your art is the same length. If the lines of hyphens are shorter than the others, your alignment will be off.
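If the lines are not guaranteed to be equal length, a small sketch that pads every line to a common width before zipping (assumes dice_art as defined in the question):
art_split = [art.strip('\n').split('\n') for art in dice_art]
width = max(len(line) for art in art_split for line in art)
padded = [[line.ljust(width) for line in art] for art in art_split]  # pad to equal width
for elems in zip(*padded):
    print(''.join(elems))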
In the future, if you provide the intended output, you can get much better responses.
Change print(dice_art[i], end='') to:
print(dice_art[i], end=' ')
(Note the space between the two quotes in the end argument.)
If you want the output printed dynamically (flushed immediately), use the following syntax:
print(dice_art[i], sep=' ', end='', flush=True)
A join command should do it.
dice_art = ['ASCII0', 'ASCII1', 'ASCII2']
print(" ".join(dice_art))
The output would be:
ASCII0 ASCII1 ASCII2

Fastest way to "grep" big files

I have big log files (from 100MB to 2GB) that contain a (single) particular line I need to parse in a Python program. I have to parse around 20,000 files. And I know that the searched line is within the 200 last lines of the file, or within the last 15000 bytes.
As it is a recurring task, I need it to be as fast as possible. What is the fastest way to do it?
I have thought about 4 strategies:
read the whole file in Python and search a regex (method_1)
read only the last 15,000 bytes of the file and search a regex (method_2)
make a system call to grep (method_3)
make a system call to grep after tailing the last 200 lines (method_4)
Here are the functions I created to test these strategies:
import os
import re
import subprocess

def method_1(filename):
    """Method 1: read whole file and regex"""
    regex = r'\(TEMPS CP :[ ]*.*S\)'
    with open(filename, 'r') as f:
        txt = f.read()
    match = re.search(regex, txt)
    if match:
        print match.group()

def method_2(filename):
    """Method 2: read part of the file and regex"""
    regex = r'\(TEMPS CP :[ ]*.*S\)'
    with open(filename, 'r') as f:
        size = min(15000, os.stat(filename).st_size)
        f.seek(-size, os.SEEK_END)
        txt = f.read(size)
    match = re.search(regex, txt)
    if match:
        print match.group()

def method_3(filename):
    """Method 3: grep the entire file"""
    cmd = 'grep "(TEMPS CP :" {} | head -n 1'.format(filename)
    process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
    print process.communicate()[0][:-1]

def method_4(filename):
    """Method 4: tail of the file and grep"""
    cmd = 'tail -n 200 {} | grep "(TEMPS CP :"'.format(filename)
    process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
    print process.communicate()[0][:-1]
I ran these methods on two files ("trace" is 207MB and "trace_big" is 1.9GB) and got the following computation time (in seconds):
+----------+-----------+-----------+
| | trace | trace_big |
+----------+-----------+-----------+
| method_1 | 2.89E-001 | 2.63 |
| method_2 | 5.71E-004 | 5.01E-004 |
| method_3 | 2.30E-001 | 1.97 |
| method_4 | 4.94E-003 | 5.06E-003 |
+----------+-----------+-----------+
So method_2 seems to be the fastest. But is there any other solution I did not think about?
Edit
In addition to the previous methods, Gosha F suggested a fifth method using mmap :
import contextlib
import math
import mmap

def method_5(filename):
    """Method 5: use memory mapping and regex"""
    regex = re.compile(r'\(TEMPS CP :[ ]*.*S\)')
    offset = max(0, os.stat(filename).st_size - 15000)
    ag = mmap.ALLOCATIONGRANULARITY
    offset = ag * (int(math.ceil(offset / ag)))
    with open(filename, 'r') as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY, offset=offset)
        with contextlib.closing(mm) as txt:
            match = regex.search(txt)
            if match:
                print match.group()
I tested it and got the following results:
+----------+-----------+-----------+
| | trace | trace_big |
+----------+-----------+-----------+
| method_5 | 2.50E-004 | 2.71E-004 |
+----------+-----------+-----------+
You may also consider using memory mapping (mmap module) like this:
def method_5(filename):
    """Method 5: use memory mapping and regex"""
    regex = re.compile(r'\(TEMPS CP :[ ]*.*S\)')
    offset = max(0, os.stat(filename).st_size - 15000)
    with open(filename, 'r') as f:
        with contextlib.closing(mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY, offset=offset)) as txt:
            match = regex.search(txt)
            if match:
                print match.group()
Also, some side notes:
In the case of using a shell command, ag (The Silver Searcher) may in some cases be orders of magnitude faster than grep (although with only 200 lines of greppable text the difference probably vanishes compared to the overhead of starting a shell).
Just compiling your regex at the beginning of the function may make some difference.
Probably faster to do the processing in the shell so as to avoid the python overhead. Then you can pipe the result into a python script. Otherwise it looks like you did the fastest thing.
Seeking and then matching a regex should be very fast. Methods 2 and 4 are the same, but you incur the extra overhead of Python making a system call.
Does it have to be in Python? Why not a shell script?
My guess is that method 4 will be the fastest/most efficient. That's certainly how I'd write it as a shell script. And it's got to be faster than 1 or 3. I'd still time it in comparison to method 2 to be 100% sure, though.
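If the surrounding program moves to Python 3, a hedged sketch of the winning approach (method_2) adapted for it could look like this (tail_search is a name made up for illustration; the file is opened in binary mode, so the pattern is bytes):
import os
import re

def tail_search(filename, tail_bytes=15000):
    regex = re.compile(rb'\(TEMPS CP :[ ]*.*S\)')  # bytes pattern for binary mode
    size = os.stat(filename).st_size
    with open(filename, 'rb') as f:
        f.seek(-min(tail_bytes, size), os.SEEK_END)  # jump straight to the tail
        txt = f.read()
    match = regex.search(txt)
    if match:
        print(match.group().decode('utf-8', errors='replace'))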

Convert Python Pretty table to CSV using shell or batch command line

What's an easy way to convert the output of Python PrettyTable to a programmatically usable format such as CSV?
The output looks like this:
C:\test> nova list
spu+--------------------------------------+--------+--------+------------+-------------+-----------------------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+--------+--------+------------+-------------+-----------------------------------+
| 6bca09f8-a320-44d4-a11f-647dcec0aaa1 | tester | ACTIVE | - | Running | OpenStack-net=10.0.0.1, 10.0.0.3 |
+--------------------------------------+--------+--------+------------+-------------+-----------------------------------+
Perhaps this will get you close:
nova list | grep -v '\-\-\-\-' | sed 's/^[^|]\+|//g' | sed 's/|\(.\)/,\1/g' | tr '|' '\n'
This will strip the --- lines
Remove the leading |
Replace all but the last | with ,
Replace the last | with \n
Here's a real ugly one-liner
import csv
s = """\
spu+--------------------------------------+--------+--------+------------+-------------+-----------------------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+--------+--------+------------+-------------+-----------------------------------+
| 6bca09f8-a320-44d4-a11f-647dcec0aaa1 | tester | ACTIVE | - | Running | OpenStack-net=10.0.0.1, 10.0.0.3 |
+--------------------------------------+--------+--------+------------+-------------+-----------------------------------+"""
result = [tuple(filter(None, map(str.strip, splitline))) for line in s.splitlines() for splitline in [line.split("|")] if len(splitline) > 1]
with open('output.csv', 'wb') as outcsv:
    writer = csv.writer(outcsv)
    writer.writerows(result)
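Note that 'wb' is the Python 2 convention for the csv module; under Python 3 the writer expects text mode, so the equivalent would be something like:
with open('output.csv', 'w', newline='') as outcsv:  # Python 3: text mode plus newline=''
    writer = csv.writer(outcsv)
    writer.writerows(result)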
I can unwrap it a bit to make it nicer:
splitlines = s.splitlines()
splitdata = [line.split("|") for line in splitlines]
splitdata = [line for line in splitdata if len(line) > 1]
# toss the lines that don't have any data in them -- pure separator lines
header, *data = [[field.strip() for field in line if field.strip()] for line in splitdata]
result = [header] + data
# I'm really just separating these, then re-joining them, but sometimes having
# the headers separately is an important thing!
Or possibly more helpful:
result = []
for line in s.splitlines():
    splitdata = line.split("|")
    if len(splitdata) == 1:
        continue  # skip lines with no separators
    linedata = []
    for field in splitdata:
        field = field.strip()
        if field:
            linedata.append(field)
    result.append(linedata)
@AdamSmith's answer has a nice method for parsing the raw table string. Here are a few additions to turn it into a generic function (I chose not to use the csv module so there are no additional dependencies).
def ptable_to_csv(table, filename, headers=True):
    """Save PrettyTable results to a CSV file.

    Adapted from @AdamSmith https://stackoverflow.com/questions/32128226

    :param PrettyTable table: Table object to get data from.
    :param str filename: Filepath for the output CSV.
    :param bool headers: Whether to include the header row in the CSV.
    :return: None
    """
    raw = table.get_string()
    data = [tuple(filter(None, map(str.strip, splitline)))
            for line in raw.splitlines()
            for splitline in [line.split('|')] if len(splitline) > 1]
    if table.title is not None:
        data = data[1:]
    if not headers:
        data = data[1:]
    with open(filename, 'w') as f:
        for d in data:
            f.write('{}\n'.format(','.join(d)))
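A hypothetical usage sketch (assumes the prettytable package is installed; the table contents here are made up for illustration):
from prettytable import PrettyTable

t = PrettyTable(['ID', 'Name', 'Status'])
t.add_row(['6bca09f8', 'tester', 'ACTIVE'])
ptable_to_csv(t, 'output.csv')  # writes "ID,Name,Status" and then the data row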
Here's a solution using a regular expression. It also works for an arbitrary number of columns (the number of columns is determined by counting the number of plus signs in the first input line).
input_string = """spu+--------------------------------------+--------+--------+------------+-------------+-----------------------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+--------+--------+------------+-------------+-----------------------------------+
| 6bca09f8-a320-44d4-a11f-647dcec0aaa1 | tester | ACTIVE | - | Running | OpenStack-net=10.0.0.1, 10.0.0.3 |
+--------------------------------------+--------+--------+------------+-------------+-----------------------------------+"""
import re, csv, sys

def pretty_table_to_tuples(input_str):
    lines = input_str.split("\n")
    num_columns = len(re.findall(r"\+", lines[0])) - 1
    line_regex = r"\|" + (r" +(.*?) +\|" * num_columns)
    for line in lines:
        m = re.match(line_regex, line.strip())
        if m:
            yield m.groups()

w = csv.writer(sys.stdout)
w.writerows(pretty_table_to_tuples(input_string))
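Running this should print CSV to stdout along these lines; note the csv writer quotes the Networks field because it contains a comma:
ID,Name,Status,Task State,Power State,Networks
6bca09f8-a320-44d4-a11f-647dcec0aaa1,tester,ACTIVE,-,Running,"OpenStack-net=10.0.0.1, 10.0.0.3"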

Writing list in CSV to create a model file

I have CSV files that I have to modify using Python.
The number of files varies each time. The input CSV files I have are just lists of coordinates (x, y, z), and I have to turn each file into a 'model' which contains the same coordinates plus some information/headers.
The model looks like this :
Number | 1 | |
Head | N | E | El
Begin | list | list | list
| . | . | .
| . | . | .
| . | . | .
| . | . | .
End | . | . | .
| . | . | .
BeginR | Ok | |
EndR | | |
The dots are the coordinates that are in the lists.
So far I've managed to write almost everything.
What's left is to write the Begin and the End in the first column.
Because the size of the list varies, I have difficulty placing them where they should be: Begin on the same line as the first coordinates and End on the second-to-last coordinate line.
This is my updated code :
for i in ficList:
    with open(i, newline='') as f:
        reader = csv.reader(f, delimiter=';')
        next(reader)  # skip the header
        for row in reader:
            coord_x.append(row[0])  # X
            coord_y.append(row[1])  # Y
            coord_z.append(row[2])  # Z
    list_list = [coord_x, coord_y, coord_z]  # list of coordinates
    len_x = len(coord_x)  # length of list
    with open(i, 'w', newline='') as fp:
        writer = csv.writer(fp, delimiter=';')
        writer.writerow(['Number', number])
        writer.writerow(['Head', 'N', 'E', 'El'])
        for l in range(len_x):
            if l == 0:
                writer.writerow(['Begin', list_list[0][l], list_list[1][l], list_list[2][l]])
            if l == len_x-2:
                writer.writerow(['End', list_list[0][l], list_list[1][l], list_list[2][l]])
            writer.writerow(['', list_list[0][l], list_list[1][l], list_list[2][l]])  # write the coordinates
        writer.writerow(['BeginR', 'Ok'])
        writer.writerow(['EndR'])
    coord_x.clear()  # empty list x
    coord_y.clear()  # empty list y
    coord_z.clear()  # empty list z
You're probably better off defining the row labels in advance in a map, then looking them up for each row. Also, list_list is not really needed; you should just stick to the separate vectors:
...
with open(i, 'w', newline='') as fp:
    writer = csv.writer(fp, delimiter=';')
    writer.writerow(['Number', number])
    writer.writerow(['Head', 'N', 'E', 'El'])
    row_label_map = {0: 'Begin', len_x-2: 'End'}
    for l in range(len_x):
        row_label = row_label_map.get(l, "")
        writer.writerow([row_label, coord_x[l], coord_y[l], coord_z[l]])
    writer.writerow(['BeginR', 'Ok'])
    writer.writerow(['EndR'])
...
Also you don't need to clear the vectors coord_x etc. afterwards as they will be deleted when they go out of scope.
With your latest code, I am guessing the issue is that you first write the line with the Begin tag and then write it again without it. Move the logic into an if..elif..else block:
for l in range(len_x):
    if l == 0:
        writer.writerow(['Begin', list_list[0][l], list_list[1][l], list_list[2][l]])
    elif l == len_x-2:
        writer.writerow(['End', list_list[0][l], list_list[1][l], list_list[2][l]])
    else:
        writer.writerow(['', list_list[0][l], list_list[1][l], list_list[2][l]])  # write the coordinates
To me it seems like it would be easier to first modify the input CSV to include an extra column that holds the Begin and End tags, with sed, like this:
sed -e 's/^/,/' -e '1s/^/Begin/' -e '$ s/^/End/' -e 's/^,/ ,/' test.csv
Then you can simply print the columns as they are, without having to add logic in Python for when to insert the additional tags. This assumes that the input CSV is called test.csv.
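As an illustration, for a hypothetical test.csv containing the three rows 1,2,3 / 4,5,6 / 7,8,9, the pipeline would produce:
Begin,1,2,3
 ,4,5,6
End,7,8,9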
