removing duplicate id entry from text file using python

I have a text file which contains this data (each line corresponds to code,entry1,entry2):
a,1,2
b,2,3
c,4,5
....
....
Here a, b, c, ... will always be unique.
Every time I read this file in Python, it is either to create a new entry (for example d,6,7) or to update existing values (say a,1,2 to a,4,3).
I use the following code:
# (fragment from inside a function, hence the return statements)
data = ['a',5,6]
datastring = ''
for d in data:
    datastring = datastring + str(d) + ','
try:
    with open("opfile.txt", "a") as f:
        f.write(datastring + '\n')
        f.close()
    return(True)
except:
    return(False)
This appends any data as a new entry.
I am trying something like this which checks the first character of each line:
f = open("opfile.txt", "r")
for x in f:
    if(x[0] == username):
        pass
I don't know how to combine these two so that the check is done on the first character (let's call it the id): if an entry with that id is already in the file, it should be replaced with the new data while all other lines remain the same; otherwise the data should be added as a new line.

Read the file into a dictionary that uses the first field as keys. Update the appropriate dictionary entry, then write it back.
Use the csv module to parse and format the file.
import csv
data = ['a',5,6]
with open("opfile.txt", "r", newline='') as infile:
    incsv = csv.reader(infile)
    # key each non-empty row by its first field (the id)
    d = {row[0]: row for row in incsv if len(row) != 0}
# replaces the row if the id exists, otherwise adds a new one
d[data[0]] = data
with open("opfile.txt", "w", newline='') as outfile:
    outcsv = csv.writer(outfile)
    outcsv.writerows(d.values())
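The same idea can be wrapped in a reusable function. This is only a minimal sketch, assuming the file already exists, fits in memory, and uses the first field as the unique id; the name upsert_row is hypothetical and not part of the answer above.

```python
import csv


def upsert_row(path, new_row):
    """Replace the row whose first field matches new_row[0], or append it."""
    # Read existing rows into a dict keyed by id (dicts keep insertion order).
    with open(path, "r", newline="") as infile:
        rows = {row[0]: row for row in csv.reader(infile) if row}
    # Overwrites an existing id in place, or adds a new key at the end.
    rows[str(new_row[0])] = [str(field) for field in new_row]
    with open(path, "w", newline="") as outfile:
        csv.writer(outfile).writerows(rows.values())
```

Calling upsert_row("opfile.txt", ['a', 5, 6]) would then update the existing a line, while upsert_row("opfile.txt", ['d', 6, 7]) would add a new one.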

First, append any new rows to the file.
Second, to update an existing row, rewrite the file with write, replacing the matching line:
def update_record(file_name, field1, field2, field3):
    with open(file_name, 'r') as f:
        lines = f.readlines()
    with open(file_name, 'w') as f:
        for line in lines:
            # compare the id field exactly; "field1 in line" would also match substrings
            if line.split(',', 1)[0] == field1:
                f.write(field1 + ',' + str(field2) + ',' + str(field3) + '\n')
            else:
                f.write(line)

Related

Getting unique values from csv file, output to new file

I am trying to get the unique values from a csv file. Here's an example of the file:
12,life,car,good,exellent
10,gift,truck,great,great
11,time,car,great,perfect
The desired output in the new file is this:
12,10,11
life,gift,time
car,truck
good,great
excellent,great,perfect
Here is my code:
def attribute_values(in_file, out_file):
    fname = open(in_file)
    fout = open(out_file, 'w')
    # get the header line
    header = fname.readline()
    # get the attribute names
    attrs = header.strip().split(',')
    # get the distinct values for each attribute
    values = []
    for i in range(len(attrs)):
        values.append(set())
    # read the data
    for line in fname:
        cols = line.strip().split(',')
        for i in range(len(attrs)):
            values[i].add(cols[i])
        # write the distinct values to the file
        for i in range(len(attrs)):
            fout.write(attrs[i] + ',' + ','.join(list(values[i])) + '\n')
    fout.close()
    fname.close()
The code currently outputs this:
12,10
life,gift
car,truck
good,great
exellent,great
12,10,11
life,gift,time
car,car,truck
good,great
exellent,great,perfect
How can I fix this?

You could try to use zip to iterate over the columns of the input file, and then eliminate the duplicates:
import csv
def attribute_values(in_file, out_file):
    with open(in_file, "r") as fin, open(out_file, "w") as fout:
        for column in zip(*csv.reader(fin)):
            items, row = set(), []
            for item in column:
                if item not in items:
                    items.add(item)
                    row.append(item)
            fout.write(",".join(row) + "\n")
Result for the example file:
12,10,11
life,gift,time
car,truck
good,great
exellent,great,perfect
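As a possible alternative, dict.fromkeys can do the order-preserving de-duplication in one step, since dictionaries keep insertion order in Python 3.7+. This is a sketch only: it assumes every row has the same number of fields (zip stops at the shortest row), and the name unique_columns is hypothetical.

```python
import csv


def unique_columns(in_file, out_file):
    with open(in_file, newline="") as fin, open(out_file, "w", newline="") as fout:
        writer = csv.writer(fout)
        # zip(*rows) transposes the rows into columns.
        for column in zip(*csv.reader(fin)):
            # dict.fromkeys drops repeats but keeps first-seen order.
            writer.writerow(list(dict.fromkeys(column)))
```

Unlike a plain set, this keeps the values in the order they first appear in the file.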

How to replace a line with another line?

I am trying to make a contact book application with command-line arguments. This is the code I have written so far to update the details of a particular contact: args.name holds the name of the contact and args.number holds the new number.
How can I update the entire line? When I run this, it replaces the entire file, contacts.txt, with an empty string. This functionality will also help in the delete function.
thefile = open("contacts.txt","w+")
lines = thefile.readlines()
for line in lines:
    if name in line:
        line.replace(line,"Name: "+ args.name + " Number: "+args.number+ "\n")

You could first read the data from the file, create an empty string, conditionally append each line (or its replacement) to that string, and then write the resulting string over the existing file.
f1 = open('contacts.txt','r')
data = f1.readlines()
f1.close()
new_data = ""
for line in data:
    if name in line:
        update = line.replace(line,"Name: "+ args.name + " Number: "+args.number+ "\n")
        new_data += update
    else:
        new_data += line
f2 = open('contacts.txt','w')
f2.write(new_data)
f2.close()

When you open a file with "w+", Python erases the file!
First you should write two functions: one that reads the data and one that writes it.
def reader():
    f = open("MYFILE.txt", "r")
    lines = f.readlines()
    f.close()
    return lines
def writer(data):
    f = open("MYFILE.txt", "w")
    for i in data:
        f.write(i)
    f.close()
Then you can update the lines however you want:
lines = reader()
for i in range(len(lines)):
    if lines[i] == "Something\n":
        lines[i] = "New_Value\n"
writer(lines)
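Since the question mentions that the same functionality should also help with deleting, here is a rough sketch of one function that handles both cases. It assumes each contact is stored as a line of the form "Name: X Number: Y", as in the question; the function name rewrite_contacts and its parameters are hypothetical.

```python
def rewrite_contacts(path, name, new_number=None):
    """Update the number for `name`, or delete the contact if new_number is None."""
    prefix = f"Name: {name} Number: "
    # Read first: mode "r" does not truncate the file.
    with open(path, "r") as f:
        lines = f.readlines()
    kept = []
    for line in lines:
        if line.startswith(prefix):
            if new_number is not None:
                kept.append(f"{prefix}{new_number}\n")
            # Otherwise the line is skipped, which deletes the contact.
        else:
            kept.append(line)
    # Only now is it safe to open with "w", which truncates before writing.
    with open(path, "w") as f:
        f.writelines(kept)
```

Matching on the whole "Name: ... Number: " prefix, rather than using name in line, avoids accidentally touching contacts whose name merely contains the search string.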

Update Txt file in python

I have a text file with names and results. If the name already exists, only the result should be updated. I tried with this code and many others, but without success.
The content of the text file looks like this:
Ann, 200
Buddy, 10
Mark, 180
Luis, 100
PS: I started 2 weeks ago, so don't judge my bad code.
from os import rename
def updatescore(username, score):
    file = open("mynewscores.txt", "r")
    new_file = open("mynewscores2.txt", "w")
    for line in file:
        if username in line:
            splitted = line.split(",")
            splitted[1] = score
            joined = "".join(splitted)
            new_file.write(joined)
        new_file.write(line)
    file.close()
    new_file.close()
maks = updatescore("Buddy", "200")
print(maks)

I would suggest reading the csv in as a dictionary and just updating the one value.
import csv
d = {}
with open('test.txt', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        key,value = row
        d[key] = value
d['Buddy'] = 200
with open('test2.txt','w', newline='') as f:
    writer = csv.writer(f)
    for key, value in d.items():
        writer.writerow([key,value])

The main problem is that your for loop always writes line to the new file; nothing tells it not to do that when a score is being replaced. All that was needed was an else statement below the if statement:
from os import rename
def updatescore(username, score):
    file = open("mynewscores.txt", "r")
    new_file = open("mynewscores2.txt", "w")
    for line in file:
        if username in line:
            splitted = line.split(",")
            splitted[1] = score
            print (splitted)
            joined = ", ".join(splitted)
            print(joined)
            new_file.write(joined+'\n')
        else:
            new_file.write(line)
    file.close()
    new_file.close()
maks = updatescore("Buddy", "200")
print(maks)

You can try this: it adds the username if it doesn't exist, and otherwise updates it.
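def updatescore(username, score):
    with open("mynewscores.txt", "r+") as file:
        lines = file.read().splitlines()
        for i, line in enumerate(lines):
            # match the name field exactly, not as a substring
            if line.split(",")[0].strip() == username:
                lines[i] = f"{username}, {score}"
                break
        else:
            # no matching name was found, so add a new entry
            lines.append(f"{username}, {score}")
        # rewrite from the start; overwriting a single line in place would
        # corrupt the file whenever the new line has a different length
        file.seek(0)
        file.write("\n".join(lines) + "\n")
        file.truncate()
maks = updatescore("Buddy", "300")
maks = updatescore("Mario", "50")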

You have new_file.write(joined) inside the if block, which is good, but you also have new_file.write(line) outside the if block.
Outside the if block, it's putting both the original and the fixed line into the file. Both versions end up on the same line because splitted[1] originally held the score together with the trailing \n, and replacing it with score drops that newline character.
You also want to add the comma back, since you took the commas out when you used line.split(','), and restore the newline: joined = ','.join(splitted) + '\n'
I got the result you seem to be expecting when I put in both these fixes.
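To see what split and join do to one of these lines, here is a small illustrative snippet (the literal line is just sample data taken from the question's file):

```python
line = "Buddy, 10\n"
splitted = line.split(",")
print(splitted)                  # ['Buddy', ' 10\n'] : the newline lives in the last field

splitted[1] = "200"
print(repr("".join(splitted)))   # 'Buddy200' : comma and newline are both gone

# Put the comma, the space and the newline back explicitly.
fixed = ",".join([splitted[0], " 200\n"])
print(repr(fixed))               # 'Buddy, 200\n'
```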
Next time you should include what you are expecting for output and what you're giving as input. It might be helpful if you also include what Error or result you actually got.
Welcome to Python BTW

Removed issues from your code:
def updatescore(username, score):
    file = open("mynewscores.txt", "r")
    new_file = open("mynewscores2.txt", "w")
    for line in file.readlines():
        splitted = line.split(",")
        if username == splitted[0].strip():
            # keep the space after the comma and the trailing newline
            splitted[1] = " " + str(score) + "\n"
            joined = ",".join(splitted)
            new_file.write(joined)
        else:
            new_file.write(line)
    file.close()
    new_file.close()

I believe this is the simplest/most straightforward way of doing things.
Code:
import csv
def update_score(name: str, score: int) -> None:
    with open('../resources/name_data.csv', newline='') as file_obj:
        reader = csv.reader(file_obj)
        data_dict = dict(curr_row for curr_row in reader)
    data_dict[name] = score
    with open('../out/name_data_out.csv', 'w', newline='') as file_obj:
        writer = csv.writer(file_obj)
        writer.writerows(data_dict.items())
update_score('Buddy', 200)
Input file:
Ann,200
Buddy,10
Mark,180
Luis,100
Output file:
Ann,200
Buddy,200
Mark,180
Luis,100

How to not just add a new first column to csv but alter the header names

I would like to do the following: read a csv file, add a new first column, rename some of the columns, and then load the records from the csv file. Ultimately, I would like the first column to be populated with the file name.
I'm fairly new to Python and I've more or less worked out how to change the fieldnames; however, loading the data is a problem, because it is looking for the original fieldnames, which no longer match.
Code snippet
import csv
import os
inputFileName = "manifest1.csv"
outputFileName = os.path.splitext(inputFileName)[0] + "_modified.csv"
with open(inputFileName, 'rb') as inFile, open(outputFileName, 'wb') as outfile:
    r = csv.DictReader(inFile)
    fieldnames = ['MapSvcName','ClientHostName', 'Databasetype', 'ID_A', 'KeepExistingData', 'KeepExistingMapCache', 'Name', 'OnPremisePath', 'Resourcestype']
    w = csv.DictWriter(outfile,fieldnames)
    w.writeheader()
    # *** Here is where I start to go wrong
    # copy the rest
    for node, row in enumerate(r,1):
        w.writerow(dict(row))
Error
  File "D:\Apps\Python27\ArcGIS10.3\lib\csv.py", line 148, in _dict_to_list
    + ", ".join([repr(x) for x in wrong_fields]))
ValueError: dict contains fields not in fieldnames: 'Databases [xsi:type]', 'Resources [xsi:type]', 'ID'
I would like some assistance, not just to learn but to truly understand what I need to do.
Cheers and thanks
Peter
Update:
I think I've worked it out.
import csv
import os
inputFileName = "manifest1.csv"
outputFileName = os.path.splitext(inputFileName)[0] + "_modified.csv"
with open(inputFileName, 'rb') as inFile, open(outputFileName, 'wb') as outfile:
    r = csv.reader(inFile)
    w = csv.writer(outfile)
    header = next(r)
    header.insert(0, 'MapSvcName')
    #w.writerow(header)
    next(r, None) # skip the first row from the reader, the old header
    # write new header
    w.writerow(['MapSvcName','ClientHostName', 'Databasetype', 'ID_A', 'KeepExistingData', 'KeepExistingMapCache', 'Name', 'OnPremisePath', 'Resourcestype'])
    prevRow = next(r)
    prevRow.insert(0, '0')
    w.writerow(prevRow)
    for row in r:
        if prevRow[-1] == row[-1]:
            val = '0'
        else:
            val = prevRow[-1]
        row.insert(0,val)
        prevRow = row
        w.writerow(row)
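For the stated goal (a new first column populated with the file name, plus renamed headers), a simpler route might look like the sketch below. It assumes Python 3 rather than the Python 2.7 used above. The old-to-new name mapping is a guess inferred from the error message, and add_filename_column and RENAMES are hypothetical names.

```python
import csv
import os

# Assumed mapping from old to new header names (inferred from the error message).
RENAMES = {
    "Databases [xsi:type]": "Databasetype",
    "Resources [xsi:type]": "Resourcestype",
    "ID": "ID_A",
}


def add_filename_column(input_path, output_path, new_column="MapSvcName"):
    with open(input_path, newline="") as in_file, \
         open(output_path, "w", newline="") as out_file:
        reader = csv.reader(in_file)
        writer = csv.writer(out_file)
        header = next(reader)  # consume the old header exactly once
        # Rename known columns, leave the others unchanged.
        writer.writerow([new_column] + [RENAMES.get(h, h) for h in header])
        file_name = os.path.basename(input_path)
        for row in reader:
            # Prepend the file name to every data row.
            writer.writerow([file_name] + row)
```

Because csv.reader and csv.writer work by position rather than by field name, renaming the header does not affect how the data rows are copied, which sidesteps the DictWriter error.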

Python CSV script

I'm trying to add a new column by copying col#3 and then appending #hotmail to the new column.
Here is the script. The only problem is that it does not finish processing the input file: the output file shows only 61409 rows, whereas the input file has 61438 rows.
Also, there is an error message (the input file does not have an empty line at the end):
email = row[3]
IndexError: list index out of range
inFile = 'c:\\Python27\\scripts\\intake.csv'
outFile = 'c:\\Python27\\scripts\\final.csv'
with open(inFile, 'rb') as fp_in1, open(outFile, 'wb') as fp_out1:
    writer = csv.writer(fp_out1, delimiter=",")
    reader = csv.reader(fp_in1, delimiter=",")
    for col in reader:
        del col[6:]
        writer.writerow(col)
    headers = next(reader)
    writer.writerow(headers + ['email2'])
    for row in reader:
        if len(row) > 3:
            email = email.split('#', 1)[0] + '#hotmail.com'
            writer.writerow(row + [email])
It looks like you edited the code you received in your earlier answer.
Change
email = email.split('#', 1)[0] + '#hotmail.com'
to
email = row[3].split('#', 1)[0] + '#hotmail.com'
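Putting that fix into context, a single-pass version might look like the following sketch. It assumes Python 3, assumes the intent is to keep only the first six columns before appending email2, and chooses to write an empty email2 for rows that are too short; the function name add_email2 is hypothetical.

```python
import csv


def add_email2(in_path, out_path):
    with open(in_path, newline="") as fp_in, open(out_path, "w", newline="") as fp_out:
        reader = csv.reader(fp_in)
        writer = csv.writer(fp_out)
        headers = next(reader)[:6]  # keep the first six columns only
        writer.writerow(headers + ["email2"])
        for row in reader:
            row = row[:6]
            if len(row) > 3:
                # copy col#3 and swap everything after '#' for hotmail.com
                email = row[3].split("#", 1)[0] + "#hotmail.com"
            else:
                email = ""  # short row: nothing to copy
            writer.writerow(row + [email])
```

Reading the header with next(reader) before the loop matters: a csv.reader can only be iterated once, so a first loop that consumes every row leaves nothing for a second loop.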
