Python: Checking and adding to a text file - python

I've been working on this and googling for hours, and I can't figure out what is going wrong.
The purpose of this program is to check a text file for stock market ticker symbols and add a ticker only if it is not already in the file.
There are two things going wrong. When the text file is empty, it won't add any tickers at all. When the text file contains even a single character, it adds every ticker you give it, regardless of whether that ticker is already on the list.
import re
def tickerWrite(tick):
    readTicker = open('Tickers.txt', 'r')
    holder = readTicker.readlines()
    readTicker.close()
    if check(tick) == False:
        writeTicker = open('Tickers.txt', 'w')
        holder.append(tick.upper() + '\n')
        writeTicker.writelines(holder)
        writeTicker.close()
def check(ticker):
    with open('Tickers.txt') as tList:
        for line in tList:
            if re.search(ticker, line):
                return True
            else:
                return False
Another module calls AddReadTickers.tickerWrite(ticker) in order to add tickers entered by a user.

First of all, I suggest using
    if not check(tick):
instead of
    if check(tick) == False:
Second, I think it is better to open the file in append mode:
    writeTicker = open('Tickers.txt', 'a')
Then you will not need holder at all.
Here is my rewrite of the code:
from __future__ import print_function
import re
import sys
def tickerWrite(tick):
    if not check(tick):
        with open('Tickers.txt', 'a') as writeTicker:
            print(tick.upper(), file=writeTicker)
def check(ticker):
    with open('Tickers.txt') as tList:
        # True if any line matches, not just the first; False for an empty file
        return any(re.search(ticker, line) for line in tList)
if __name__ == '__main__':
    tickerWrite(sys.argv[1])
It seems to work for me.

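The check function should return False by default. For an empty Tickers.txt the loop body never runs, so the function returns None, and the test "if check(tick) == False:" is then never true. This is why it won't add any ticker to an empty file.
For the second problem, my guess is that it is caused by the content of ticker. Since you use the ticker as a regex pattern, it can give unexpected results when the ticker contains characters that are special in regular expressions. Note also that the else branch returns False as soon as the first line fails to match, so later lines are never checked. As I understand it, you can simply compare the strings inside the loop
    if ticker == line.strip():
        return True
and put "return False" after the loop.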
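Putting the points from both answers together, here is a minimal sketch rather than a definitive fix. The function name add_ticker and the temporary file used in the demo are assumptions for illustration; the original code uses tickerWrite and a fixed 'Tickers.txt'. It compares upper-cased, stripped strings instead of using a regex, and returns a plain bool so an empty or missing file is handled.

```python
import os
import tempfile

def add_ticker(path, ticker):
    """Append ticker (upper-cased) to path unless it is already listed.

    Returns True if the ticker was added, False if it was already present.
    """
    ticker = ticker.strip().upper()
    existing = set()
    if os.path.exists(path):
        with open(path) as f:
            # Strip the trailing newline so "AAPL\n" compares equal to "AAPL"
            existing = {line.strip().upper() for line in f}
    if ticker in existing:
        return False
    with open(path, "a") as f:  # append mode: no need to rewrite the file
        f.write(ticker + "\n")
    return True

# Demo on a throwaway file (hypothetical path, not the asker's Tickers.txt)
path = os.path.join(tempfile.mkdtemp(), "Tickers.txt")
print(add_ticker(path, "aapl"))  # added: the file did not exist yet
print(add_ticker(path, "AAPL"))  # not added: already present
print(add_ticker(path, "msft"))  # added
```

A set lookup is an exact match, so a ticker such as "A" is not treated as present just because "AAPL" is in the file, which a substring or regex search would do.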

Related

Removing '\n' from a string without using .translate, .replace or strip()

I'm making a simple text-based game as a learning project. I'm trying to add a feature where the user can input 'save' and their stats will be written to a text file named 'save.txt', so that after the program has been stopped, the player can load their previous stats and play from where they left off.
Here is the code for the saving:
# user inputs 'save' and class attributes are saved to the text file as text, one line at a time
elif first_step == 'save':
    f = open("save.txt", "w")
    f.write(f'''{player1.name}
{player1.char_type} #value is 'Wizard'
{player1.life}
{player1.energy}
{player1.strength}
{player1.money}
{player1.weapon_lvl}
{player1.wakefulness}
{player1.days_left}
{player1.battle_count}''')
    f.close()
But, I also need the user to be able to load their saved stats next time they run the game. So they would enter 'load' and their stats will be updated.
I'm trying to read the text file one line at a time, so that the value of each line becomes the value of the relevant class attribute, in order. If I do this without first converting it to a string I get issues, such as some lines being skipped because Python reads two lines as one and puts them together in a list.
So, I tried the following:
In the example below, I'm only showing the data for the class attributes 'player1.name' and 'player1.char_type' seen above, to keep this question as short as possible.
elif first_step == 'load':
    f = open("save.txt", 'r')
    player1.name_saved = f.readline() #reads the first line of the text file and assigns its value to player1.name_saved
    player1.name_saved2 = str(player1.name_saved) # converts the value of player1.name_saved to a string and saves that string in player1.name_saved2
    player1.name = player1.name_saved2 #assigns the value of player1.name_saved to the class attribute player1.name
    player1.char_type_saved = f.readlines(1) #reads the second line of the txt file and saves it in player1.char_type_saved
    player1.char_type_saved2 = str(player1.char_type_saved) #converts the value of player1.char_type_saved into a string and assigns that value to player1.char_type_saved2
At this point, I would assign the value of player1.char_type_saved2 to the class attribute player1.char_type so that the value of player1.char_type enables the player to load the previous character type from the last time they played the game. This should make the value of player1.char_type = 'Wizard' but I'm getting '['Wizard\n']'
I tried the following to remove the brackets and \n:
final_player1.char_type = player1.char_type_saved2.translate({ord(c): None for c in "[']\n" }) #this is intended to remove everything from the string except for Wizard
For some reason, the above only removes the square brackets and punctuation marks but not \n from the end.
I then tried the following to remove \n:
final_player1.char_type = final_player1.char_type.replace("\n", "")
final_player1.char_type is still 'Wizard\n'
I've also tried using strip() but I've been unsuccessful.
If anyone could help me with this I would greatly appreciate it. Sorry if I have overcomplicated this question but it's hard to articulate it without lots of info. Let me know if this is too much or if more info is needed to answer.
If '\n' is always at the end it may be best to use:
s = 'wizard\n'
s = s[:-1]
print(s, s)
Output:
wizard wizard
But I still think strip() is best:
s = 'wizard\n'
s = s.strip()
print(s, s)
Output:
wizard wizard
Normally it should work with just
char_type = "Wizard\n"
char_type = char_type.replace("\n", "")  # strings are immutable, so keep the returned copy
print(char_type)
The output will be "Wizard"
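One likely reason the question's translate() and replace() calls appeared to do nothing: f.readlines(1) returns a list, and str() of a list produces its repr, in which the newline is written as the two characters backslash and n rather than a real newline. The sketch below shows that, and reads the file without going through str() at all. The helper name load_lines, the temporary file, and the sample values are assumptions for illustration.

```python
import os
import tempfile

def load_lines(path):
    """Return the saved values as a list of strings, one per line, without newlines."""
    with open(path) as f:
        return [line.rstrip("\n") for line in f]

# Demo on a throwaway file with made-up values
save_path = os.path.join(tempfile.mkdtemp(), "save.txt")
with open(save_path, "w") as f:
    f.write("Merlin\nWizard\n10")

name, char_type, life = load_lines(save_path)
print(repr(char_type))

# Why the question's cleanup did not work on str(['Wizard\n']):
as_text = str(["Wizard\n"])
print("\n" in as_text)   # is there a real newline character in it?
print("\\n" in as_text)  # or a backslash followed by the letter n?
```

Values read this way are still strings, so numeric stats such as life would need int(...) before being assigned back to the player object.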

Method to search for specific character strings

I have two different kinds of URLs in a list:
The first kind looks like this and starts with the word 'meldung':
meldung/xxxxx.html
The other kind starts with 'artikel':
artikel/xxxxx.html
I want to detect if a URL starts with 'meldung' or 'artikel' and then do different operations based on that. To achieve this I tried to use a loop with if and else conditions:
for line in r:
    if re.match(r'^meldung/', line):
        print('je')
    else:
        print('ne')
I also tried this with line.startswith():
for line in r:
    if line.startswith('meldung/'):
        print('je')
    else:
        print('ne')
But neither method works, since the strings I am checking don't have any whitespace.
How can I do this correctly?
You can just use the following, if the links are stored as strings within the list:
for line in r:
    if 'meldung' in line:
        print('je')
    else:
        print('ne')
What about this:
r = ['http://example.com/meldung/page1.html', 'http://example.com/artikel/page2.html']
for line in r:
    url_tokens = line.split('/')
    if url_tokens[-2] == 'meldung':
        print(url_tokens[-1]) # the xxxxx.html part
    elif url_tokens[-2] == 'artikel':
        print('ne')
    else:
        print('something else')
You can do it using regex:
import re
def check(string):
    # re.match anchors at the start; the group applies the anchor to both alternatives
    if re.match(r'(?:meldung|artikel)/', string):
        print("je")
    else:
        print("ne")
for line in r:
    check(line)
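Since the goal was to do different things for the two kinds of URL, a small classifier may be easier to work with than printing. This is only a sketch: the function name classify, the sample URLs, and the guess that stray whitespace or newlines could be what defeats startswith() are all assumptions, as the question does not show the real list.

```python
def classify(url):
    """Return 'meldung', 'artikel' or 'other' for a relative URL."""
    url = url.strip()  # drop surrounding spaces and newlines before testing the prefix
    if url.startswith("meldung/"):
        return "meldung"
    if url.startswith("artikel/"):
        return "artikel"
    return "other"

# Made-up sample data
urls = ["meldung/12345.html", "  artikel/67890.html\n", "video/1.html"]
for u in urls:
    print(classify(u))
```

If the entries are full URLs rather than relative paths, as in the split('/') answer above, the prefix test would have to be applied to the path part instead.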

How to pass string variable into search function?

I am having issues passing a string variable into a search function.
Here is what I'm trying to accomplish:
I have a file full of values and I want to check the file to make sure a specific matching line exists before I proceed. I want to ensure that the line <endSW=UNIQUE-DNS-NAME-HERE<> exists if a valid <begSW=UNIQUE-DNS-NAME-HERE<> exists and is reachable.
Everything works fine until I call if searchForString(searchString,fileLoc): which always returns false. If I assign the variable 'searchString' a direct value and pass it it works, so I know it must be something with the way I'm combining the strings, but I can't seem to figure out what I'm doing wrong.
If I examine the data that 'searchForString' is using I see what seems to be valid values:
values in fileLines list:
['<begSW=UNIQUE-DNS-NAME-HERE<>', ' <begPortType=UNIQUE-PORT-HERE<>', ' <portNumbers=80,443,22<>', ' <endPortType=UNIQUE-PORT-HERE<>', '<endSW=UNIQUE-DNS-NAME-HERE<>']
value of searchVar:
<endSW=UNIQUE-DNS-NAME-HERE<>
An example of the entry in the file is:
<begSW=UNIQUE-DNS-NAME-HERE<>
<begPortType=UNIQUE-PORT-HERE<>
<portNumbers=80,443,22<>
<endPortType=UNIQUE-PORT-HERE<>
<endSW=UNIQUE-DNS-NAME-HERE<>
Here is the code in question:
def searchForString(searchVar,readFile):
    with open(readFile) as findMe:
        fileLines = findMe.read().splitlines()
        print fileLines
        print searchVar
        if searchVar in fileLines:
            return True
        return False
    findMe.close()
fileLoc = '/dir/folder/file'
fileLoc.lstrip()
fileLoc.rstrip()
with open(fileLoc,'r') as switchFile:
    for line in switchFile:
        #declare all the vars we need
        lineDelimiter = '#'
        endLine = '<>\n'
        begSWLine= '<begSW='
        endSWLine = '<endSW='
        begPortType = '<begPortType='
        endPortType = '<endPortType='
        portNumList = '<portNumbers='
        #skip over commented lines -(REMOVE THIS)
        if line.startswith(lineDelimiter):
            pass
        #checks the file for a valid switch name
        #checks to see if the host is up and reachable
        #checks to see if there is a file input is valid
        if line.startswith(begSWLine):
            #extract switch name from file
            switchName = line[7:-3]
            #check to make sure switch is up
            if pingCheck(switchName):
                print 'Ping success. Host is reachable.'
                searchString = endSWLine+switchName+'<>'
                #THIS PART IS SUCKING, WORKS WITH DIRECT STRING PASS
                #WONT WORK WITH A VARIABLE
                if searchForString(searchString,fileLoc):
                    print 'found!'
                else:
                    print 'not found'
Any advice or guidance would be extremely helpful.
Hard to tell without the file's contents, but I would try
switchName = line[7:-2]
So that would look like
>>> '<begSW=UNIQUE-DNS-NAME-HERE<>'[7:-2]
'UNIQUE-DNS-NAME-HERE'
Additionally, you could look into regex searches to make your cleanup more versatile.
import re
# re.findall(search_expression, string_to_search)
>>> re.findall(r'\=(.+)(?:\<)', '<begSW=UNIQUE-DNS-NAME-HERE<>')[0]
'UNIQUE-DNS-NAME-HERE'
>>> re.findall(r'\=(.+)(?:\<)', ' <portNumbers=80,443,22<>')[0]
'80,443,22'
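Fixed slice offsets such as line[7:-3] depend on exactly how each line ends (for example "\n" versus "\r\n"), so one hedged alternative is to strip the line and let a regex pull out the name. The sketch below is written for Python 3; the helper names switch_name and has_end_marker are hypothetical, and the sample lines are the ones quoted in the question.

```python
import re

# Matches a whole "<begSW=NAME<>" line and captures NAME
BEG_SW = re.compile(r"^<begSW=(.+)<>$")

def switch_name(line):
    """Return the switch name from a begSW line, or None if the line is not one."""
    m = BEG_SW.match(line.strip())  # strip() removes "\n", "\r\n" and indentation
    return m.group(1) if m else None

def has_end_marker(lines, name):
    """True if a matching "<endSW=NAME<>" line exists among lines."""
    target = "<endSW=" + name + "<>"
    return any(line.strip() == target for line in lines)

sample = [
    "<begSW=UNIQUE-DNS-NAME-HERE<>",
    " <begPortType=UNIQUE-PORT-HERE<>",
    " <portNumbers=80,443,22<>",
    " <endPortType=UNIQUE-PORT-HERE<>",
    "<endSW=UNIQUE-DNS-NAME-HERE<>",
]
name = switch_name(sample[0] + "\r\n")  # simulate a Windows line ending
print(name)
print(has_end_marker(sample, name))
```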
I found the question "How to recursively iterate over XML tags in Python using ElementTree?" and used the methods detailed there to parse an XML file instead of using a TXT file.

Filtering a CSV file in python

I have downloaded this csv file, which creates a spreadsheet of gene information. What is important is that the HLA-* columns contain gene information. If the gene is at too low a resolution, e.g. DQB1*03, then the row should be deleted. If the data is at too high a resolution, e.g. DQB1*03:02:01, then the :01 tag at the end needs to be removed. So, ideally I want the proteins to be in the format DQB1*03:02, with two levels of resolution after DQB1*. How can I tell Python to look for these formats and handle the data stored in them accordingly?
e.g.
if (csvCell is of format DQB1*03:02:01):
    delete the :01 # but do this in a general format
elif (csvCell is of format DQB1*03):
    delete row
else:
    goto next line
UPDATE: Edited code I referenced
import csv
import re
import sys
csvdictreader = csv.DictReader(open('mhc.csv','r+b'), delimiter=',')
csvdictwriter = csv.DictWriter(file('mhc_fixed.csv','r+b'), fieldnames=csvdictreader.fieldnames, delimiter=',')
csvdictwriter.writeheader()
targets = [name for name in csvdictreader.fieldnames if name.startswith('HLA-D')]
for rowfields in csvdictreader:
    keep = True
    for field in targets:
        value = rowfields[field]
        if re.match(r'^\w+\*\d\d$', value):
            keep = False
            break # quit processing target fields
        elif re.match(r'^(\w+)\*(\d+):(\d+):(\d+):(\d+)$', value):
            rowfields[field] = re.sub(r'^(\w+)\*(\d+):(\d+):(\d+):(\d+)$',r'\1*\2:\3', value)
        else: # reduce gene resolution if too high
            # by only keeping the first two fields if three are present
            rowfields[field] = re.sub(r'^(\w+)\*(\d+):(\d+):(\d+)$',r'\1*\2:\3', value)
    if keep:
        csvdictwriter.writerow(rowfields)
Here's something that I think will do what you want. It's not as simple as Peter's answer because it uses Python's csv module to process the file. It could probably be rewritten and simplified to treat the file as plain text, as his does, but that should be easy.
import csv
import re
import sys
csvdictreader = csv.DictReader(sys.stdin, delimiter=',')
csvdictwriter = csv.DictWriter(sys.stdout, fieldnames=csvdictreader.fieldnames, delimiter=',')
csvdictwriter.writeheader()
targets = [name for name in csvdictreader.fieldnames if name.startswith('HLA-')]
for rowfields in csvdictreader:
    keep = True
    for field in targets:
        value = rowfields[field]
        if re.match(r'^DQB1\*\d\d$', value): # gene resolution too low?
            keep = False
            break # quit processing target fields
        else: # reduce gene resolution if too high
            # by only keeping the first two fields if three are present
            rowfields[field] = re.sub(r'^DQB1\*(\d\d):(\d\d):(\d\d)$',
                                      r'DQB1*\1:\2', value)
    if keep:
        csvdictwriter.writerow(rowfields)
The hardest part for me was determining what you wanted to do.
Here's an ultra-simple filter:
import sys
for line in sys.stdin:
    line = line.replace( ',DQB1*03:02:01,', ',DQB1*03:02,' )
    if line.find( ',DQB1*03,' ) == -1:
        sys.stdout.write( line )
Or, if you want to use regular expressions:
import re
import sys
for line in sys.stdin:
    line = re.sub( r',DQB1\*03:02:01,', ',DQB1*03:02,', line )
    if re.search( r',DQB1\*03,', line ) is None:
        sys.stdout.write( line )
Run it as
python script.py < data.csv
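The answers above hard-code DQB1 or specific values. A more general version of the same idea, for any gene name and any number of colon-separated fields, might look like the sketch below. It is written for Python 3 and is an assumption-laden illustration: the names normalize and filter_rows, the column name HLA-DQB1, and the three sample rows are made up, and cells that do not look like an allele are simply left unchanged.

```python
import csv
import io
import re

# GENE*field1[:field2[:field3...]], e.g. DQB1*03:02:01
ALLELE = re.compile(r"^(\w+)\*(\d+)((?::\d+)*)$")

def normalize(value):
    """Return (keep_row, new_value) for one cell."""
    m = ALLELE.match(value)
    if not m:
        return True, value            # not allele-like: leave the cell alone
    extra = m.group(3).split(":")[1:]  # ":02:01" -> ["02", "01"]
    if not extra:
        return False, value           # only one field, resolution too low
    return True, "{}*{}:{}".format(m.group(1), m.group(2), extra[0])

def filter_rows(rows, targets):
    """Yield rows whose target cells all have at least two fields, truncated to two."""
    for row in rows:
        keep = True
        for field in targets:
            ok, row[field] = normalize(row[field])
            if not ok:
                keep = False
                break
        if keep:
            yield row

# Made-up sample data standing in for the real CSV file
sample = "id,HLA-DQB1\n1,DQB1*03:02:01\n2,DQB1*03\n3,DQB1*03:02\n"
reader = csv.DictReader(io.StringIO(sample))
targets = [n for n in reader.fieldnames if n.startswith("HLA-")]
kept = list(filter_rows(reader, targets))
print([(r["id"], r["HLA-DQB1"]) for r in kept])
```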

what is wrong with my encoding/cipher code?

My code, which is meant to replace certain letters (a with e, e with a, and s with 3 specifically), is not working, but I am not quite sure what the error is, as it is not changing the text file I am feeding it.
pattern = "ae|ea|s3"
def encode(pattern, filename):
    message = open(filename, 'r+')
    output = []
    pattern2 = pattern.split('|')
    for letter in message:
        isfound = False
        for keypair in pattern2:
            if letter == keypair[0]:
                output.append(keypair[1])
                isfound = True
            if isfound == True:
                break;
        if isfound == False:
            output.append(letter)
    message.close()
I've been racking my brain trying to figure this out for a while now.
It is not changing the text file because you do not replace the file's contents with the output you create. Instead, this function builds the output list and drops it at the end of the function. Either return the output from the function and store it outside, or replace the file in the function by writing to the file without appending.
As this seems like an exercise I prefer to not add the code to do it, as you will probably learn more from writing the function yourself.
Here is a quick implementation with the desired result, you will need to modify it yourself to read files, etc:
def encode(pattern, string):
    # build a lookup table, e.g. {'a': 'e', 'e': 'a', 's': '3'}
    rep = {}
    for pair in pattern.split("|"):
        rep[pair[0]] = pair[1]
    out = []
    for c in string:
        # keep characters that have no replacement
        out.append(rep.get(c, c))
    return "".join(out)
print(encode("ae|ea|s3", "Hello, this is my default string to replace"))
#output => "Hallo, thi3 i3 my dafeult 3tring to rapleca"
If you want to modify a file, you need to specifically tell your program to write to the file. Simply appending to your output variable will not change it.
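To make the last point concrete, here is one way the read, replace, and write-back steps could fit together. Treat it as a sketch under assumptions: it swaps in str.maketrans and str.translate in place of the loops used above, the names encode_text and encode_file are made up, and the demo uses a temporary file rather than the asker's file.

```python
import os
import tempfile

def encode_text(pattern, text):
    """Replace characters according to a pattern such as "ae|ea|s3"."""
    # {'a': 'e', 'e': 'a', 's': '3'}; translate() applies all swaps in one pass,
    # so a -> e is not immediately turned back into a
    table = str.maketrans({pair[0]: pair[1] for pair in pattern.split("|")})
    return text.translate(table)

def encode_file(pattern, filename):
    """Read the file, encode its contents, and overwrite it with the result."""
    with open(filename) as f:
        text = f.read()
    with open(filename, "w") as f:  # "w" truncates, so the old text is replaced
        f.write(encode_text(pattern, text))

# Demo on a throwaway file
demo_path = os.path.join(tempfile.mkdtemp(), "message.txt")
with open(demo_path, "w") as f:
    f.write("Hello, this is my default string to replace")
encode_file("ae|ea|s3", demo_path)
with open(demo_path) as f:
    print(f.read())
```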
