I have a file that looks like this:
1234:AnneShirly:anneshirley#seneca.ca:4:5[SRT111,OPS105,OPS110,SPR100,ENG100]
3217:Illyas:illay#seneca.ca:2:4[SRT211,OPS225,SPR200,ENG200]
1127:john Marcus:johnmarcus#seneca.ca:1:4[SRT111,OPS105,SPR100,ENG100]
0001:Amin Malik:amin_malik#seneca.ca:1:3[OPS105,SPR100,ENG100]
I want to be able to ask the user for an input (the student number at the beginning of each line) and then ask which course they want to delete (the course codes are in the list). The program should then delete that course from that student's list without deleting other instances of the course, because other students take the same courses.
studentid = input("enter studentid")
course = input("enter the course to delete")

with open("studentDatabase.dat") as file:
    lines = file.readlines()

with open("studentDatabase.dat", "w") as file:
    for line in lines:
        if line.find(course) == -1:
            file.write(line)
This just deletes the whole line but I only want to delete the course
Welcome to the site. You have a little ways to go to make this work, and it would be good if you put some additional effort into this before asking somebody to code it up. Let me suggest a structure that you can work on/augment; if you get stuck, you can re-post by editing your question above and/or commenting back on this answer. Here is a framework that I suggest:
Make a section of code to read your whole .dat file into memory. I would suggest putting the data into a dictionary that looks like this:
data = {1001: (name, email, <whatever the digits stand for>, [SRT111, OPS333, ...]),
        1044: ( ... )}
basically a dictionary with the ID as the key and the rest in a tuple or list. Test that, make sure it works OK by inspecting a few values.
Make a little "control loop" that uses your input statements, and see if you can locate the "record" from your dictionary. Add some "if" logic to do "something" if the ID is not found or if the user enters something like "quit" to exit/break the loop. Test it to make sure it can find the ID's and then test it again to see that it can find the course in the list inside the tuple/list with the data. You probably need another "if" statement in there to "do something" if the course is not in the data element. Test it.
Make a little "helper function" that can re-write a data element with the course removed. A suggested signature would be:
def remove_course(data_element, course):
    # make the new data element (name, ..., [reduced course list])
    return new_data_element
Test it, make sure it works.
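For instance, it might end up looking something like this (just a sketch; it assumes the value layout built in the EDIT below, i.e. (name, email, some_number, course_count, courses)):

def remove_course(data_element, course):
    # unpack, drop the course, and rebuild the element with the count updated
    name, email, some_number, course_count, courses = data_element
    new_courses = [c for c in courses if c != course]
    return (name, email, some_number, len(new_courses), new_courses)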
Put those pieces together and you should have the ingredients to change the dictionary by using the loop and function to put the new data element into the dictionary, over-writing the old one.
Write a widget to write the new .dat file from the dictionary in its entirety.
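A possible sketch of that writer (write_dat is a made-up name; it assumes the same value layout and re-creates the original id:name:email:n:count[...] line format, zero-padding the ID to four digits like the sample data):

def write_dat(filename, data):
    # rebuild one line per student in the original format
    with open(filename, 'w') as dst:
        for stu_id, (name, email, some_number, course_count, courses) in data.items():
            dst.write(f"{stu_id:04d}:{name}:{email}:{some_number}:"
                      f"{course_count}[{','.join(courses)}]\n")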
EDIT:
You can make the dictionary from a data file with something like this:
filename = 'student_data.dat'

data = {}   # an empty dictionary to stuff the results in

# use a context manager to handle opening/closing the file...
with open(filename, 'r') as src:
    # loop through the lines
    for line in src:
        # strip any whitespace from the end and tokenize the line by ":"
        tokens = line.strip().split(':')
        # check it... (remove later)
        print(tokens)
        # gather the pieces, make conversions as necessary...
        stu_id = int(tokens[0])
        name = tokens[1]
        email = tokens[2]
        some_number = int(tokens[3])
        # splitting the number from the list of courses is a little complicated
        # you *could* do this more elegantly with regex, but for your level,
        # here is a simple way to find the "chop points" and split this up...
        last_blobs = tokens[4].split('[')
        course_count = int(last_blobs[0])
        course_list = last_blobs[1][:-1]   # everything except the last bracket
        # split up the courses by comma
        courses = course_list.split(',')
        # now stuff that into the dictionary...
        # a little sanity check:
        if data.get(stu_id):
            print(f'duplicate ID found: {stu_id}. OVERWRITING')
        data[stu_id] = (name,
                        email,
                        some_number,
                        course_count,
                        courses)

for key, value in data.items():
    print(key, value)
I've got something for you. What you want to do is find the student first and then delete the course, like this:
studentid = input("enter studentid")
course = input("enter the course to delete")

with open("studentDatabase.dat") as file:
    lines = file.readlines()

with open("studentDatabase.dat", "w") as file:
    for line in lines:
        if studentid in line:                 # check if it's the right student
            line = line.replace(course, "")   # replace the course with nothing
        file.write(line)
You want to check whether we are looking at the correct student, and if so write the line back without the course code. Hope you find it useful.
Related
I'm working on a program which should be able to handle basic library tasks. I have a problem with a class method which is supposed to offer the user the possibility to remove a certain book from the library. The list of books is contained in an external text file with the following format (author, title):
Vibeke Olsson, Molnfri bombnatt
Axel Munthe, Boken om San Michele
The method I'm using is shown below:
def removeBook(self):
    removal_of_book = input("What's the book's title, author you'd like to remove?: ")
    with open("books1.txt", "r+") as li:
        new_li = li.readlines()
        li.seek(0)
        for line in new_li:
            if removal_of_book not in line:
                li.write(line)
        li.truncate()
    print(removal_of_book + " is removed from the system!")
The problem with this method is that every row containing removal_of_book gets removed (or rather, not rewritten to the file). I know the method is far from optimal and should probably be scrapped, but I'm completely lost in finding an alternative.
Does anyone have a better solution to this problem?
You can create your lines to write into the new file on the fly using a list comprehension and then write them to the new file afterwards (using the same file name to overwrite the original file):
def removeBook(self):
    to_remove = input("What's the book's title, author you'd like to remove?: ")
    with open("books1.txt", "r") as li:
        new_li = [line for line in li.readlines() if to_remove not in line]
    with open('books1.txt', 'w') as new_file:
        new_file.writelines(new_li)
    print(to_remove + " is removed from the system!")
Note that string membership checking is case sensitive, so you are expecting your user to match your case in the original file exactly. You might think about converting the strings to lower-case prior to performing your check using lower().
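For example, only the membership test in the list comprehension would need to change:

new_li = [line for line in li.readlines() if to_remove.lower() not in line.lower()]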
OK, so I'm trying to write a program that creates a dictionary of son:father entries and another dictionary that contains father:son entries. The program must present the user a menu with five options.
The text file is this: john:fred, fred:bill, sam:tony, jim:william, william:mark, krager:holdyn, danny:brett, danny:issak, danny:jack, blasaen:zade, david:dieter, adamLseth, seth:enos
Problem Statement:
Write a program that creates a dictionary of son:father entries and another dictionary that contains father:son entries. Your program must present the user a menu with five options. The following is an example only:
Father/Son Finder
0 – Quit
1 – Find a Father
2 – Find a Grandfather
3 – Find a Son
4 – Find a Grandson
Option 0 ends the program.
Option 1 prompts the user for the name of a son. If the dictionary contains the son:father pair, the program displays the father. Otherwise, the program should tell the user it does not know who the father is.
Option 2 prompts the user for the name of a grandson. If the dictionary contains enough information, the program displays the grandfather. Otherwise, the program should tell the user it does not know who the grandfather is.
Option 3 prompts the user for the name of a father. If the dictionary contains the son:father pair, the program displays the son. Otherwise, the program should tell the user it does not know who the son is.
Option 4 prompts the user for the name of a grandfather. If the dictionary contains enough information, the program displays the grandson. Otherwise, the program should tell the user it does not know who the grandson is.
The program must create the dictionary structure and populate it from data contained in a file provided to you. In addition, the program must continue to ask the user for a menu choice until the user chooses to quit.
I have this thus far. I haven't gotten very far in it...
sons_fathers = {}
fathers_sons = {}

# open filename: names.dat
fo = open("names.dat", "r")
data = fo.read()
print(data)
for line in fo:   # note: fo is already at end-of-file after fo.read(), so this loop gets nothing
    pass          # TODO: build the dictionaries here
here is the flow chart: ![Flow chart](https://jsu.blackboard.com/bbcswebdav/pid-2384378-dt-content-rid-3427920_1/xid-3427920_1)
Thanks for the help. I need it lol.
Let's hope nobody gives you an exact solution to this homework.
Here are some hints: you need to know what you can do with strings; string.split() will help you a lot. Also, read about what you can do with dictionaries. You will also need the raw_input function.
The rest is simple programming. Good luck.
From how you describe your solution, I don't think a dictionary is what you want for this.
The keys must be unique.
# won't work, keys aren't unique
father_son = {'danny':'brett', 'danny':'issak', 'danny':'jack'}
You could however try a dictionary with a list as the value:
father_son = {'danny':['brett','issak', 'jack']}
if 'danny' in father_son.keys() and 'brett' in father_son['danny']:
    # do something
Or you could use a list of 2-tuples that stores the pairs:
father_son = [('danny', 'brett'), ('danny', 'issak'), ('danny', 'jack')]
if ('danny', 'brett') in father_son:
    # do something
sons_fathers = {}   # one father per son
fathers_sons = {}   # one or many sons per father; use a list or
                    # set for the values

with open("names.dat", "r") as fo:   # use a context manager to close the file automatically
    for line in fo:                  # ?? is there only one line, or one pair per line ??
        # do something with line
        # assume you extracted "son" and "father"
        sons_fathers[son] = father
        if father in fathers_sons:
            fathers_sons[father].add(son)
        else:
            fathers_sons[father] = {son}
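If the file really is the single comma-separated line shown in the question, the "do something with line" part could be roughly this inner loop (just a sketch, assuming every pair is well formed):

# split "john:fred, fred:bill, ..." into pairs, then each pair into son and father
for pair in line.strip().split(','):
    son, father = pair.strip().split(':')
    # ... then fill the two dictionaries as shown above ...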
I have a lot of rows like below in a file:
{"first_name":"John","last_name":"Smith","age":30}
{"first_name":"Tim","last_name":"Johnson","age":34}
I first tried importing this as a dictionary with the json module so I could just print the values of the keys. The problem is some of the lines are missing the right curly bracket or have other issues and the fields aren't in the same order per line. That is preventing the import.
So now I am trying to do this with a regex. I have this:
fo = open("c:\\newgoodtestsample.txt", "r")
x = fo.read()
match1 = re.search('first_name"(.*?)"(.*?)"', x)
if match1:
print match1.group(2)
That returns the value of just the name. I would like to be able to return other fields as well. This worked in a regex tester but I can't get it to work in my code:
(first_name|last_name|age)"(.*?)"(.*?)"
Lastly, once that is figured out, I need to read each line in the file (not just the first one) and print the requested regex data from each line into a file. I have tried inserting a for loop but I keep getting the first line repeated over and over so I must be inserting it incorrectly. Any assistance is appreciated.
The following seems to do what you want: the regex gives you back, as matching groups, the key/value pairs from each line, and the dictionary comprehension turns them into a dict you can look up by field name.
I also encourage you to use the with context manager, as it will close the file handle automatically once all lines have been read, which is easy to do with a simple for loop.
with open("c:\\newgoodtestsample.txt", "r") as fo:
for line in fo:
result = re.findallr'"(\w*?)":"?(\w*)"?', line)
d = {k:v for k,v in re.findall(r'"(\w*?)":"?(\w*)"?', line)}
if 'first_name' in d:
# print first_name into file
else:
# print empty first_name field
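For instance, filling in those two branches to write one line per input row (the output filename here is made up):

import re

with open("c:\\newgoodtestsample.txt", "r") as fo, open("c:\\output.txt", "w") as out:
    for line in fo:
        d = {k: v for k, v in re.findall(r'"(\w*?)":"?(\w*)"?', line)}
        # write the first_name if the line had one, otherwise an empty field
        out.write(d.get('first_name', '') + '\n')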
I have a small scraping script. I have a file with 2000 names, and I use these names to search for video IDs on YouTube. Because of the amount, it takes a pretty long time to get all the IDs, so I can't do it in one go. What I want is to find where my last scrape ended and then start from that position. What is the best way to do this? I was thinking about adding each used name to a list and then just checking whether it's in the list; if not, start scraping. But maybe there's a better way to do this? (I hope so.)
Here is the part that takes a name from the file and scrapes the IDs. What I want is that when I quit scraping, the next time I start it, it runs not from the beginning but from the point where it ended last time:
import itertools
import json
import requests

index = 0
# f, api_key and id_file are set up earlier in the script
for name in itertools.islice(f, index, None):
    parameters = {'key': api_key, 'q': name}
    request_url = requests.get('https://www.googleapis.com/youtube/v3/search?part=snippet&maxResults=1&type=video&fields=items%2Fid', params = parameters)
    videoid = json.loads(request_url.text)
    if 'error' in videoid:
        pass
    else:
        index += 1
        id_file.write(videoid['items'][0]['id']['videoId'] + '\n')
        print videoid['items'][0]['id']['videoId']
You could just remember the index number of the last scraped entry. Every time you finish scraping one entry, increment a counter, then assuming the entries in your text file don't change order, just pick up again at that number?
The simplest answer here is probably mitim's answer. Just keep a file that you rewrite with the last-processed index after each line. For example:
import itertools
import os

savepath = os.path.expanduser('~/.myprogram.lines')
skiplines = 0
try:
    with open(savepath) as f:
        skiplines = int(f.read())
except:
    pass   # no valid save file yet; start from the top

with open('names.txt') as f:
    for linenumber, line in itertools.islice(enumerate(f), skiplines, None):
        do_stuff(line)
        with open(savepath, 'w') as savefile:
            savefile.write(str(linenumber + 1))   # number of lines done so far
However, there are other ways you could do this that might make more sense for your use case.
For example, you could rewrite the "names" file after each name is processed to remove the first line. Or, maybe better, preprocess the list into an anydbm (or even sqlite3) database, so you can more easily remove (or mark) names once they're done.
Or, if you might run against different files, and need to keep a progress for each one, you could store a separate .lines file for each one (probably in a ~/.myprogram directory, rather than flooding the top-level home directory), or use an anydbm mapping pathnames to lines done.
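For example, the sqlite3 variant mentioned above might look roughly like this (the database filename, table name, and column names are all made up; do_stuff stands in for the actual scraping):

import sqlite3

conn = sqlite3.connect('names.db')
conn.execute('CREATE TABLE IF NOT EXISTS names (name TEXT PRIMARY KEY, done INTEGER DEFAULT 0)')

# one-time load from the text file; INSERT OR IGNORE makes re-running harmless
with open('names.txt') as f:
    conn.executemany('INSERT OR IGNORE INTO names (name) VALUES (?)',
                     ((line.strip(),) for line in f))
conn.commit()

# process only the names not yet marked done, marking each as we go
todo = [row[0] for row in conn.execute('SELECT name FROM names WHERE done = 0')]
for name in todo:
    do_stuff(name)
    conn.execute('UPDATE names SET done = 1 WHERE name = ?', (name,))
    conn.commit()
conn.close()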
Trying to write code that searches hash values for specific strings (input by the user) and returns the hash if the search query is present in that line.
Doing this mostly just to learn Python a bit more, but it could be a real-world application used by an HR department to search a .csv resume database for specific words in each resume.
I'd like this program to look through a .csv file that has three entries per line (id#;applicant name;resume text)
I set it up so that it creates a hash, then creates a string for the resume-text hash entry, and I am trying to use the .find() function to return the entire hash for each instance.
What I'd like is: if the word "gpa" is used as a search query and it is found in s['resumetext'] for three applicants (rows in the .csv file), it prints the id, name, and resume for every row that has it (all three applicants).
As it is right now, my program prints the first row in the .csv file (print resume['id'], resume['name'], resume['resumetext']) no matter what the search query is, whether it's in the resumetext or not.
Lastly, are there better ways of doing this, such as searching Word documents, PDFs, and .txt files in a folder for specific words using Python? (I've just started reading about the re module and am wondering if that may be the route, rather than putting everything in a .csv file.)
def find_details(id2find):
    resumes_f = open("resume_data.csv")
    for each_line in resumes_f:
        s = {}
        (s['id'], s['name'], s['resumetext']) = each_line.split(";")
        resumetext = str(s['resumetext'])
        if resumetext.find(id2find):
            return(s)
        else:
            print "No data matches your search query. Please try again"

searchquery = raw_input("please enter your search term")
resume = find_details(searchquery)
if resume:
    print resume['id'], resume['name'], resume['resumetext']
The line
resumetext = str(s['resumetext'])
is redundant, because s['resumetext'] is already a string (since it comes as one of the results from a .split call). So, you can merge this line and the next into
if id2find in s['resumetext']: ...
Your following else is misaligned -- with it placed like that, you'll print the message over and over again. You want to place it after the for loop (and the else isn't needed, though it would work), so I'd suggest:
for each_line in resumes_f:
    s = dict(zip('id name resumetext'.split(), each_line.split(";")))
    if id2find in s['resumetext']:
        return(s)
print "No data matches your search query. Please try again"
I've also shown an alternative way to build dict s, although yours is fine too.
What #Justin Peel said. Also, to be more Pythonic, I would change
if resumetext.find(id2find) != -1: to if id2find in resumetext:
A few more changes: you might want to lower-case both the comparison and the user input so it matches GPA, gpa, Gpa, etc. You can do this with searchquery = raw_input("please enter your search term").lower() and resumetext = s['resumetext'].lower(). You'll note I removed the explicit cast around s['resumetext'], as it's not needed.
One change that I recommend for your code is changing
if resumetext.find(id2find):
to
if resumetext.find(id2find) != -1:
because find() returns -1 if id2find isn't in resumetext. Otherwise, it returns the index where id2find is first found in resumetext, which could be 0. As #Personman commented, the original check gives a false positive because -1 is interpreted as True in Python.
I think the problem has something to do with the fact that find_details() only returns the first entry for which the search string is found in resumetext. It might be good to make find_details() a generator instead; then you could iterate over it and print the found records one by one.
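A sketch of what that generator version might look like, reusing the dict-building line shown above and keeping the Python 2 style of the question:

def find_details(id2find):
    with open("resume_data.csv") as resumes_f:
        for each_line in resumes_f:
            s = dict(zip('id name resumetext'.split(), each_line.split(";")))
            if id2find in s['resumetext']:
                yield s

searchquery = raw_input("please enter your search term")
matches = list(find_details(searchquery))
if matches:
    for resume in matches:
        print resume['id'], resume['name'], resume['resumetext']
else:
    print "No data matches your search query. Please try again"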