Good evening everyone,
I am fairly new to Python, and at the moment I'm struggling with how to properly edit a file (.txt or .csv) in Python. I am trying to write a little program that will take each line of a text file, encrypt it, and then overwrite the file line by line and save it. The relevant part of my code looks like this so far:
with open('/home/path/file.csv', 'r+') as csvfile:
    for row in csv.reader(csvfile, delimiter='\t'):
        y = []
        for i in range(0, len(row)):
            x = encrypt(row[i], password)
            y.append(x)
        csvfile.write(''.join(y))
Which, when executed, does nothing. I've played with the code a little; sometimes it runs into a
TypeError: expected a character buffer object
The encryption function returns a string, and my file consists of 3 strings per row, separated by a tab, like this:
key1 value1 value1'
key2 value2 value2'
key3 value3 value3'
...
The csv.reader seems to read the file properly and returns one list per row, and y ends up holding the list of encrypted phrases. However, I can't seem to get file.write() to actually overwrite the file. Does anyone know how to get around this?
Any help would be greatly appreciated.
Thanks,
Andy
You've opened the file and are trying to write to it while the csv reader is still iterating over it, which won't work reliably. Read and encrypt everything first, then open the file again for writing:
import csv

with open('/home/path/file.csv', 'r') as csvfile:
    encrypted_rows = []
    for row in csv.reader(csvfile, delimiter='\t'):
        encrypted_rows.append([encrypt(field, password) for field in row])

with open('/home/path/file.csv', 'w') as csvfile:
    for y in encrypted_rows:
        csvfile.write('\t'.join(y) + '\n')
I never like to overwrite my files; disk space is cheap.
import csv

with open('/home/path/file.csv', 'r') as csvfile:
    with open('/home/path/file.enc', 'w') as csvencryptedfile:
        for row in csv.reader(csvfile, delimiter='\t'):
            y = []
            for i in range(0, len(row)):
                x = encrypt(row[i], password)
                y.append(x)
            csvencryptedfile.write('\t'.join(y))
            csvencryptedfile.write('\n')
def players_stats():
    with open("playerstststsst.csv") as csvfile:
        csvreader = csv.reader(csvfile, delimiter=',',newline:'\n')
        player = (input("Enter the subject to use as filter:").lower().title())
        for row in csvreader:
            if player in str(row[0]):
                print(row)
I have provided an image of the CSV file. I have tried using line breaks to put the name, number, position, and date in a vertical layout with a header, but it's not working for some reason. I've tried everything; can someone please help?
this is the image of the CSV file (image not shown)
This is what the results should look like (image not shown)
If you want to print each field from the row on a separate line, just loop over them.
As an aside, probably refactor your function so it doesn't require interactive input.
import csv

def players_stats(player):
    player = player.lower().title()
    with open("playerstststsst.csv", newline='') as csvfile:
        csvreader = csv.reader(csvfile, delimiter=',')
        for idx, row in enumerate(csvreader):
            if idx == 0:
                titles = row
            # row[0] is already a str, no need to convert
            elif player in row[0]:
                for key, value in zip(titles, row):
                    print("%s: %s" % (key, value))

# Let the caller ask the user for input if necessary
players_stats(input("Enter the subject to use as filter:"))
Probably a better design for most situations is to read the CSV file into memory at the beginning of your script, and then simply search through the corresponding data structure without needing to touch the disk again.
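For example, a rough sketch of that read-once-then-query approach (the function names and the single-load structure are mine, not from the question):

```python
import csv

def load_players(path):
    # Read the whole CSV into memory once: header row plus data rows.
    with open(path, newline='') as csvfile:
        rows = list(csv.reader(csvfile, delimiter=','))
    return rows[0], rows[1:]

def find_players(titles, data, player):
    # Search the in-memory rows; no disk access is needed per query.
    player = player.lower().title()
    return [dict(zip(titles, row)) for row in data if player in row[0]]
```

You load the file once at startup and can then call find_players as many times as you like.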
Right now I have a small script that writes and read data to a CSV file.
Brief snippet of the write function:
with open(filename, 'w') as f1:
    writer = csv.writer(f1, delimiter=';', lineterminator='\n')
    for a, b in my_function:
        do_things_to_get_data
        writer.writerow([tech_link, str(total), str(avg), str(unique_count)])
Then brief snippet of reading the file:
infile = open(filename, "r")
for line in infile:
    row = line.split(";")
    tech = row[0]
    total = row[1]
    average = row[2]
    days_worked = row[3]
    do_things_with_each_row_of_data
I'd like to skip the CSV part altogether and see if I can just hold all that data in a variable, but I'm not sure what that looks like. Any help is appreciated.
Thank you.
...no point in me saving data to a csv file just to read it later in the script
Just keep it in a list of lists
data = []
for a, b in my_function:
    do_things_to_get_data
    data.append([tech_link, str(total), str(avg), str(unique_count)])
...
for tech, total, average, days_worked in data:
    do_things_with_each_row_of_data
It might be worth saving each row as a namedtuple or a dictionary
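A sketch of the namedtuple variant (the field names are taken from the read loop above; the sample row is hypothetical):

```python
from collections import namedtuple

# Field names assumed from the question's read loop.
Record = namedtuple('Record', ['tech', 'total', 'average', 'days_worked'])

data = []
# In place of my_function: one hypothetical sample row.
data.append(Record('api-gateway', str(12), str(4.0), str(3)))

for rec in data:
    # Fields are now accessed by name rather than by index.
    print(rec.tech, rec.total, rec.average, rec.days_worked)
```

Named access means a later change to the column order only has to be made in one place.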
I'm new to Stack Overflow and relatively new to Python. I have tried searching the site for an answer to this question, but haven't found one related to matching values between CSV and txt files.
I'm writing a simple Python script that reads in each row from a large CSV file (~600k lines), grabs a value from that row, assigns it to a variable, then uses the variable to try to find a matching value in a large txt file (~1.8M lines). It's not working and I'm not sure why.
Here's a snippet from the source.csv file:
DocNo,Title,DOI
1,"Title One",10.1080/02724634.2016.1269539
2,"Title Two",10.1002/2015ja021888
3,"Title Three",10.1016/j.palaeo.2016.09.019
Here's a snippet from the lookup.txt file (note that it's separated by \t):
DOI 10.1016/j.palaeo.2016.09.019 M First
DOI 10.1016/j.radmeas.2015.12.002 M First
DOI 10.1097/SCS.0000000000002859 M First
Here's the offending code:
import csv
with open('source.csv', newline='', encoding="ISO-8859-1") as f, open('lookup.txt', 'r') as i:
    reader = csv.reader(f, dialect='excel')
    counter = 0
    for line in i:
        for row in reader:
            doi = row[2]
            doi = str(doi)  # I think this might actually be redundant...
            if doi in line:
                # This will eventually do more interesting things, but right now it's just a test
                print(doi)
                break
            else:
                # This will be removed--is also just a test (so I can watch progress)
                print(counter)
                counter += 1
Currently, when it runs, it just counts the lines, even though there's a matching doi in each file.
The maddening thing is that when I give doi a hard-coded value, it executes as it should. This makes me think that either the slashes in doi are breaking things somehow, or I need to convert the data type of the doi variable.
For example, this works:
doi = "10.1016/j.palaeo.2016.09.019"
for line in i:
    if doi in line:
        print(doi)
        break
    else:
        print(counter)
        counter += 1
Thanks in advance for your help, I'm at my wit's end!
Your problem is that the inner for row in reader: loop does not start over from the beginning on each pass through the outer loop; it keeps going from wherever it stopped the last time. As soon as one line of the lookup file i finds no match, the inner loop runs the reader to exhaustion, and every later pass through for row in reader: does nothing (empty loop).
You probably want to keep the lookup lines in a list, as a first step. Turning it into a lookup dict by parsing the row would likely be the next step.
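A sketch of that list-then-dict idea (the lookup line format is assumed to be tab-separated with the DOI in the second field, as in the snippet above):

```python
def build_doi_set(lookup_lines):
    # The lookup file is tab-separated; the DOI is assumed to be the second field.
    dois = set()
    for line in lookup_lines:
        parts = line.rstrip('\n').split('\t')
        if len(parts) > 1:
            dois.add(parts[1])
    return dois

def match_rows(csv_rows, dois):
    # Yield source rows whose last column (the DOI) appears in the lookup set.
    for row in csv_rows:
        if row and row[-1] in dois:
            yield row
```

Because the lookup data is read exactly once into a set, each of the ~600k CSV rows costs a single O(1) membership test instead of a scan of the 1.8M-line file.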
Here is a demonstration of what happens:
$ cat 1.txt
row1
row2
row3
$ cat 2.txt
row A
row B
row C
with open('1.txt', 'r') as i, open('2.txt', 'r') as j:
    for irow in i:
        print("irow", irow.strip())
        for jrow in j:
            print("jrow", jrow.strip())
irow row1
jrow row A
jrow row B
jrow row C
irow row2
irow row3
You can try this:
import csv
data = csv.reader(open('data1.csv'))
data1 = [i.strip('\n').split()[1] for i in open('data2.txt')]
file_data = [i[-1] for i in data if i[-1] in data1]
Output from sample files provided:
['10.1016/j.palaeo.2016.09.019']
I have a program which outputs data into a CSV file. These files use two delimiters: , between fields and " around text fields, and the text itself also contains commas.
How can I work with these two delimiters?
My current code gives me a "list index out of range" error. If the CSV file is needed I can provide it.
Current code:
def readcsv():
    with open('pythontest.csv') as csvfile:
        dialect = csv.Sniffer().sniff(csvfile.read(1024), delimiters=',"')
        csvfile.seek(0)
        reader = csv.reader(csvfile, dialect)
        for row in reader:
            asset_ip_addresses.append(row[0])
            service_protocollen.append(row[1])
            service_porten.append(row[2])
            vurn_cvssen.append(row[3])
            vurn_risk_scores.append(row[4])
            vurn_descriptions.append(row[5])
            vurn_cve_urls.append(row[6])
            vurn_solutions.append(row[7])
The CSV file I'm working with: http://www.pastebin.com/bUbDC419
It seems to have problems handling the second line. If I append the rows to a list, the first row comes out fine, but the second row is taken as one whole item and is no longer separated at the commas.
I guess it has something to do with the line breaks inside the quoted fields.
I don't think you should need to define a custom dialect, unless I'm missing something.
The official documentation shows you can provide quotechar as a keyword to the reader() method. The example from the documentation modified for your code:
import csv

with open('pythontest.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in reader:
        # do something with the row
row is a list of strings, one per field in the row, with the " quotes removed.
The issue with the index out of range suggests that one of the row[x] cannot be accessed.
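One way to confirm that is to guard the indexing; this sketch separates well-formed rows from rows too short to index safely (the eight-column layout is assumed from the question's code, and the function name is mine):

```python
import csv

def read_rows(path, expected_fields=8):
    # Collect well-formed rows; set aside any row too short to index up to row[7].
    good, bad = [], []
    with open(path, newline='') as csvfile:
        reader = csv.reader(csvfile, delimiter=',', quotechar='"')
        for row in reader:
            # A short or blank row would raise IndexError on row[7].
            (good if len(row) >= expected_fields else bad).append(row)
    return good, bad
```

Printing the bad list will show you exactly which input line the sniffed dialect failed to parse.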
OK, I think I understand what kind of file you are reading... let's say the content of your CSV file looks like this
192.168.12.255,"Great site, a lot of good, recommended",0,"Last, first, middle"
192.168.0.255,"About cats, dogs, must visit!",1,"One, two, three"
Here is the code that will let you read it line by line; text in quotes is kept as a single array element and is not split further. The parameter you need is quoting=csv.QUOTE_ALL:
import csv

with open('students.csv', newline='') as f:
    reader = csv.reader(f, delimiter=',', quoting=csv.QUOTE_ALL)
    for row in reader:
        print(row[0])
        print(row[1])
        print(row[2])
        print(row[3])
The printed output will look like this
192.168.12.255
Great site, a lot of good, recommended
0
Last, first, middle
192.168.0.255
About cats, dogs, must visit!
1
One, two, three
P.S. The solution is based on the official documentation; see https://docs.python.org/3/library/csv.html
How about a quick fix that would split a CSV row like a,"b,c",d into the strings a, b, c, d:
def readcsv():
    with open('pythontest.csv') as csvfile:
        dialect = csv.Sniffer().sniff(csvfile.read(1024), delimiters=',"')
        csvfile.seek(0)
        reader = csv.reader(csvfile, dialect)
        for rowx in reader:
            row = [e.split(',') if isinstance(e, str) else e for e in rowx]
            # do your stuff on row
I would like to use the Python CSV module to open a CSV file for appending. Then, from a list of CSV files, I would like to read each csv file and write it to the appended CSV file. My script works great - except that I cannot find a way to remove the headers from all but the first CSV file being read. I am certain that my else block of code is not executing properly. Perhaps my syntax for my if else code is the problem? Any thoughts would be appreciated.
writeFile = open(append_file, 'a+b')
writer = csv.writer(writeFile, dialect='excel')
for files in lstFiles:
    readFile = open(input_file, 'rU')
    reader = csv.reader(readFile, dialect='excel')
    for i in range(0, len(lstFiles)):
        if i == 0:
            oldHeader = readFile.readline()
            newHeader = writeFile.write(oldHeader)
            for row in reader:
                writer.writerow(row)
        else:
            reader.next()
            for row in reader:
                row = readFile.readlines()
                writer.writerow(row)
    readFile.close()
writeFile.close()
You're effectively iterating over lstFiles twice. For each file in your list, you're running your inner for loop up from 0. You want something like:
writeFile = open(append_file, 'a+b')
writer = csv.writer(writeFile, dialect='excel')
headers_needed = True
for input_file in lstFiles:
    readFile = open(input_file, 'rU')
    reader = csv.reader(readFile, dialect='excel')
    oldHeader = reader.next()
    if headers_needed:
        writer.writerow(oldHeader)
        headers_needed = False
    for row in reader:
        writer.writerow(row)
    readFile.close()
writeFile.close()
You could also use enumerate over the lstFiles to iterate over tuples containing the iteration count and the filename, but I think the boolean shows the logic more clearly.
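For comparison, a self-contained Python 3 sketch of the enumerate version (the function name and the pathlist-plus-output-path signature are mine):

```python
import csv

def merge_csv_files(paths, out_path):
    # Append every CSV in `paths` to out_path, keeping only the first file's header.
    with open(out_path, 'w', newline='') as out:
        writer = csv.writer(out, dialect='excel')
        for i, path in enumerate(paths):
            with open(path, newline='') as src:
                reader = csv.reader(src, dialect='excel')
                header = next(reader)
                if i == 0:  # the enumerate index replaces the boolean flag
                    writer.writerow(header)
                writer.writerows(reader)
```

The i == 0 test does the same job as headers_needed, at the cost of being slightly less self-describing.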
You probably do not want to mix iterating over the csv reader and directly calling readline on the underlying file.
I think you're iterating too many times (over various things: both your list of files and the files themselves). You've definitely got some consistency problems; it's a little hard to be sure since we can't see your variable initializations. This is what I think you want:
with open(append_file, 'a+b') as writeFile:
    need_headers = True
    for input_file in lstFiles:
        with open(input_file, 'rU') as readFile:
            headers = readFile.readline()
            if need_headers:
                # Write the headers only if we need them
                writeFile.write(headers)
                need_headers = False
            # Now write the rest of the input file.
            for line in readFile:
                writeFile.write(line)
I took out all the csv-specific stuff since there's no reason to use it for this operation. I also cleaned the code up considerably to make it easier to follow, using the files as context managers and a well-named boolean instead of the "magic" i == 0 check. The result is a much nicer block of code that (hopefully) won't have you jumping through hoops to understand what's going on.