How to erase part of a read only file when printing it

Practically, I'm reading a file line by line and then printing onto the screen in pygame.
textbeingread = f.readline()
The code takes 'textbeingread' and uses it to show text on the screen, but because each piece of writing is on a separate line, it shows a little icon indicating that there is a line underneath it (not exactly sure how to show it). I was just wondering if there is a way (because each line is a different length) to omit the last character in the line but use everything else. Thanks in advance :)

textbeingread = f.readline()[:-1]
or
textbeingread = f.readline()[:-2]
It depends on whether you want to remove just the newline character or also the character before it.

textbeingread = f.readline().rstrip("\r\n")
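To compare the two suggestions above, here is a minimal sketch. The sample lines and the helper name strip_line_ending are made up for illustration. Slicing with [:-1] always drops one character, even when the line has no newline (as is often the case for the last line of a file), while rstrip("\r\n") only removes trailing line-ending characters.

```python
def strip_line_ending(line):
    # Hypothetical helper: removes only trailing "\r" and "\n" characters.
    return line.rstrip("\r\n")

# Made-up sample lines: Unix ending, Windows ending, and no ending at all.
lines = ["first line\n", "second line\r\n", "last line"]

# Slicing blindly removes the final character of every line.
sliced = [line[:-1] for line in lines]

# rstrip leaves a line untouched when there is nothing to strip.
stripped = [strip_line_ending(line) for line in lines]
```

With these inputs, the sliced version leaves a stray "\r" on the second line and cuts the "e" off the last one, which is why rstrip is usually the safer choice.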

Related

Spaces are different in the same sentence/string

I have some stuff in an Excel spreadsheet, which is loaded into a webpage, where the content is displayed. However, what I have noticed is, that some of the content has weird formatting, i.e. a sudden line shift or something.
Then I tried copying the text from the spreadsheet and pasting it into Notepad++ with "Show White Space and Tab" enabled, and the output was this:
The second line is the one directly copied from the spreadsheet, whereas the first one is where I copied the string into a variable in Python, printed it, and then copied the output from the output console.
As you can see, the first line has a dot for every space, while the other is missing some dots. I suspect that is what is causing the trouble, especially because the line shift happens at exactly those places.
I have tried to just do something like:
import pandas as pd
data = pd.read_excel("my_spreadsheet.xlsx")
data["Strings"] = [str(x).replace(" ", " ") for x in data["Strings"]]
data.to_excel("my_spreadsheet.xlsx", index=False)
But that didn't change anything, as if I copied it straight from the output console.
So, is there an easy way to make all the spaces the same type of space, or do I have to do something else?

I think you would need to figure out which exact character is being used there.
You can load the file and print out the characters one by one together with the character code to figure out what's what.
See the code example below. I added some code to skip alphanumeric characters to reduce the actual output somewhat...
with open("filename.txt") as infile:
    text = infile.readlines()

def print_ordinal(text: str, skip_alphanum: bool=True):
    for line in text:
        for character in line:
            if not (skip_alphanum and character.isalnum()):
                print(f"{character} - {ord(character)}")

print_ordinal(text)
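Once the odd characters are identified, one possible follow-up is to map them to a plain space. The sketch below assumes the culprits are Unicode space separators (category "Zs", which includes the no-break space U+00A0 and the thin space U+2009); that is a guess, since the question does not show the actual code points. The function name normalize_spaces and the sample string are made up.

```python
import unicodedata

def normalize_spaces(text):
    # Replace every Unicode space separator (category "Zs") with an
    # ordinary ASCII space; all other characters are kept unchanged.
    return "".join(
        " " if unicodedata.category(ch) == "Zs" else ch
        for ch in text
    )

# Made-up sample containing a no-break space and a thin space.
sample = "total\u00a0amount\u2009due"
cleaned = normalize_spaces(sample)
```

With pandas, the same function could be applied to the column from the question, for example data["Strings"].map(normalize_spaces), assuming every cell holds a string.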

Python not inserting an EOF character after file close?

I'm having a strange problem where some Python code that prints to a file is not inserting an EOF character. Basically, the Python script generates runscripts to later be submitted as jobs on a cluster. I essentially wrote the entire runscript between """'s, allowing for variables to be plugged in (to vary some parameters in my simulation). I write the runscripts using the
with open(file_name, 'w') as runscrpt:
    runscrpt.write("""ENTIRE_FILE_CONTENTS_HERE""")
syntax. I can give the actual code if necessary, but it's not much more than the above. Despite the script running fine and generating all of my runscripts, nothing happened whenever I submitted them. It took me a long time to figure out why, but it's because they're missing an EOF character. I can fix it by, for example, opening one, adding some trailing whitespace or a blank line somewhere in vim, and resaving the file.
Why isn't Python inserting the EOF character, and is there a better way to fix this than manually making trivial edits to all the files with vim?

Sounds like you mean there is no EOL (not EOF!) at the end, because that's what diff will typically tell you. Just add a newline at the end of the write (make sure there is a newline before the final """ terminator, or write a separate newline explicitly).
with open(file_name, 'w') as runscript:
    runscript.write("""ENTIRE_FILE_CONTENTS_HERE\n""")
(As a bonus, I added the missing vowel.)
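If the runscript text comes from many templates, a small wrapper can guarantee the final newline so it never has to be remembered by hand. This is only a sketch: the helper name write_script, the temporary path, and the script body are made up for illustration.

```python
import os
import tempfile

def write_script(path, body):
    # Append a newline only when the text does not already end with one.
    if not body.endswith("\n"):
        body += "\n"
    with open(path, "w") as script:
        script.write(body)

# Made-up example: the body deliberately lacks a trailing newline.
demo_path = os.path.join(tempfile.mkdtemp(), "run.sh")
write_script(demo_path, "#!/bin/sh\necho hello")

with open(demo_path) as script:
    written = script.read()
```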

Why do different download methods result in a different display?

When I download the file from the web with Firefox,
http://quotes.money.163.com/service/lrb_000559.html
it looks fine in Excel.
When I download the file with my Python code,
from urllib.request import urlopen
url="http://quotes.money.163.com/service/lrb_000559.html"
html=urlopen(url)
outfile=open("g:\\000559.csv","w")
outfile.write(html.read().decode("gbk"))
outfile.close()
it looks strange when I open it in Excel: one line is filled with the proper content and the next line is blank. You can try it on your PC.
Why do different download methods result in a different display?

My guess is that the line endings are changed when decoding and writing the result in Python. Try using a binary file instead. Off the top of my head, I think it would go something like this:
outfile=open("g:\\000559.csv","wb")
outfile.write(html.read())

Add a 'b' flag to the file open, i.e. change this:
outfile=open("g:\\000559.csv","w")
To this:
outfile=open("g:\\000559.csv","wb")
Explanation: the original file uses \r\n line endings, and in text mode on Windows Python converts each \n to \r\n when writing, so you end up with an extra carriage return at the end of every line (\r\r\n).
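If the decoded text is still needed (for example to re-encode it), another option is to stay in text mode but pass newline="" to open(), which turns off newline translation on write. The sketch below uses a made-up helper save_text and made-up sample data rather than the real download.

```python
import os
import tempfile

def save_text(path, text, encoding="utf-8"):
    # newline="" disables newline translation, so "\r\n" in the text is
    # written as-is instead of becoming "\r\r\n" on Windows.
    with open(path, "w", encoding=encoding, newline="") as outfile:
        outfile.write(text)

# Made-up CSV content that already uses Windows line endings.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv")
save_text(demo_path, "a,b\r\n1,2\r\n")

# Read the raw bytes back to check what actually landed on disk.
with open(demo_path, "rb") as infile:
    raw = infile.read()
```

For the code in the question this would correspond to open("g:\\000559.csv", "w", newline="") while keeping the decode("gbk") call.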

How to read line by line from stdin in python

Everyone knows how to count the characters from STDIN in C. However, when I tried to do that in python3, I found it puzzling. (counter.py)
import sys
chrCounter = 0
for line in sys.stdin.readline():
    chrCounter += len(line)
print(chrCounter)
Then I try to test the program by
python3 counter.py < counter.py
The answer is only the length of the first line, "import sys". In fact, the program ONLY reads the first line from standard input and discards the rest.
It works if I replace sys.stdin.readline() with sys.stdin.read():
import sys
print(len(sys.stdin.read()))
However, this program is obviously NOT suitable for large input. Please give me an elegant solution. Thank you!

It's simpler:
for line in sys.stdin:
    chrCounter += len(line)
The file-like object sys.stdin is automatically iterated over line by line. If you call .readline() on it, you only read the first line (and then iterate over that line character by character). If you call .read(), you read the entire input into a single string and iterate over that character by character.
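Put together as a complete program, the line-by-line approach might look like the sketch below. The function name count_chars is made up, and an io.StringIO object stands in for standard input so the example can run without piping anything in.

```python
import io
import sys

def count_chars(stream):
    # Iterating over a text stream yields one line at a time, so the
    # whole input never has to be held in memory at once.
    total = 0
    for line in stream:
        total += len(line)
    return total

# Stand-in for stdin with two made-up lines of input.
demo = io.StringIO("import sys\nprint('hi')\n")
demo_total = count_chars(demo)

# On real input you would call count_chars(sys.stdin) instead.
```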

The answer from Tim Pietzcker is IMHO the correct one. There are 2 similar ways of doing this. Using:
for line in sys.stdin:
and
for line in sys.stdin.readlines():
The second option is closer to your original code. The difference between these two options is made clear by using e.g. the following modification of the for-loop body and using keyboard for input:
for line in sys.stdin.readlines():
    line_len = len(line)
    print('Last line was', line_len, 'chars long.')
    chrCounter += len(line)
If you use the first option (for line in sys.stdin:), then the lines are processed right after you hit enter.
If you use the second option (for line in sys.stdin.readlines():), then the whole file is first read, split into lines and only then they are processed.

If I just wanted a character count, I'd read in blocks at a time instead of lines at a time:
# 4096 chosen arbitrarily. Pick any other number you want to use.
print(sum(iter(lambda:len(sys.stdin.read(4096)), 0)))

Parse log file in python

I have a log file that has lines that look like this:
"1","2546857-23541","f_last","user","4:19 P.M.","11/02/2009","START","27","27","3","c2546857-23541",""
Each line in the log has 12 double-quoted sections, and the 7th double-quoted section in the string comes from where the user typed something into the chat window:
"22","2546857-23541","f_last","john","4:38 P.M.","11/02/2009","
What's up","245","47","1","c2546857-23541",""
This string also shows the issue I'm having; There are areas in the chat log where the text the user typed is on a new line in the log file instead of the same line like the first example.
So basically I want the lines in the second example to look like the first example.
I've tried using Find/Replace in N++ and I am able to find each "orphaned" line but I was unable to make it join the line above it.
Then I thought of making a python file to automate it for me, but I'm kind of stuck about how to actually code it.
Python errors out at this line when running unutbu's code:
"1760","4746880-00129","bwhiteside","tom","11:47 A.M.","12/10/2009","I do not see ^"refresh your knowledge
^" on the screen","422","0","0","c4746871-00128",""

The csv module is smart enough to recognize when a quoted item is not finished (and thus must contain a newline character).
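import csv

with open('data.log', 'r') as fin:
    # newline='' stops Python from adding an extra \r on Windows
    with open('data2.log', 'w', newline='') as fout:
        reader = csv.reader(fin, delimiter=',', quotechar='"', escapechar='^')
        # with doublequote=False the writer needs an escapechar too,
        # otherwise it cannot write fields containing a quote
        writer = csv.writer(fout, delimiter=',', escapechar='^',
                            doublequote=False, quoting=csv.QUOTE_ALL)
        for row in reader:
            # join the orphaned chat text back onto one line
            row[6] = row[6].replace('\n', ' ')
            writer.writerow(row)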

If your data is valid CSV, you can use Python's csv.reader class. It should work just fine with your sample data. It may not work correctly depending on what an embedded double quote looks like in the source system. See: http://docs.python.org/library/csv.html#module-contents.

Unless I'm misunderstanding the problem, you simply need to read in the file and remove any newline characters that occur between double quote characters.
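A rough sketch of that manual approach is below. It walks through the text one character at a time, tracks whether it is currently inside a quoted section, and replaces newlines found there with a space. The function name join_quoted_newlines is made up, and the sketch assumes that ^ escapes an embedded quote (as in the sample line from the question) and that line endings are plain \n.

```python
def join_quoted_newlines(text, quote='"', escape="^"):
    out = []
    in_quotes = False   # True while between an opening and closing quote
    escaped = False     # True right after the escape character
    for ch in text:
        if escaped:
            # The character after an escape is copied literally.
            escaped = False
        elif ch == escape:
            escaped = True
        elif ch == quote:
            in_quotes = not in_quotes
        elif ch == "\n" and in_quotes:
            # A newline inside a quoted section becomes a space.
            ch = " "
        out.append(ch)
    return "".join(out)

# Made-up log excerpt with one field broken across two lines.
sample = '"22","john","\nWhat\'s up","245"\n"1","a","b"\n'
joined = join_quoted_newlines(sample)
```

Unlike the csv module, this does no real parsing, so it is only suitable when the quoting in the log is consistent.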
