Copying CSV File when integer exceeds maximum - python

If the user's input exists in a specific column of a row, for example DD in the WORD4 column (row[3]), the program then asks them to enter an integer; if that integer is over a certain number, the program writes the line, including the integer, to a second CSV file.
Something like so:
a0,a1,a2,a3,a4
JA,BV,PA,DD,6
The errors I received were:
TypeError: writerows() takes exactly one argument (2 given)
And
TypeError: can only concatenate list (not "str") to list
Thanks to Joel Johnson and Stevieb for the solution to this problem!
The solution is as follows (thanks, Joel Johnson):

First, you need to use with open('CSVFile2.csv', 'a') as f: to write anything to the file (use 'a' if you want to keep any content already in CSVFile2.csv, or 'w' if you want to overwrite it).
Second, since you are only trying to write one row with the format
['JA','BV','PA','DD','6'], use writer.writerow() instead of writer.writerows(); otherwise you will end up with J,A,B,V,P,A,D,D,6 as your output.
Third, simply append integer_input to row before passing it to writer.writerow(); the example converts it with str() first.
If you have any other questions, I would refer you to the csv module documentation.
Example:
import csv
with open('CSVFile1.csv', "rb") as csvfile:
    a = csv.reader(csvfile, delimiter=',')
    for row in a:
        if user_input in row[3] and integer_input>5:
            # the with block closes f, so no explicit f.close() is needed
            with open('CSVFile2.csv', 'a') as f:
                new_row = row
                new_row.append(str(integer_input))
                writer = csv.writer(f)
                writer.writerow(new_row)
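To see why writerow() rather than writerows() is the right call for a single row, here is a minimal sketch using in-memory buffers (the io.StringIO buffers and the variable names are illustrative, not part of the original answer):

```python
import csv
from io import StringIO

row = ['JA', 'BV', 'PA', 'DD', '6']

# writerow() treats the list as one row with five fields
single = StringIO()
csv.writer(single).writerow(row)

# writerows() expects a sequence of rows, so each string is
# treated as its own row and split into characters
multiple = StringIO()
csv.writer(multiple).writerows(row)

print(repr(single.getvalue()))
print(repr(multiple.getvalue()))
```

The second buffer shows each two-letter string broken into single-character fields, which is the symptom described above.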

writerows() only takes one parameter. The code below appends integer_input to the row, then sends the entire row to writerow() as its only parameter. I've also moved the opening of the output file outside the loop; otherwise, if more than one match occurs, the file would be reopened on each iteration (and overwritten each time if it were opened in 'w' mode).
with open('CSVFile1.csv', 'rb') as csvfile:
    fh = csv.reader(csvfile)
    wfh = open('CSVFile2.csv', 'ab')
    for row in fh:
        if user_input in row[3] and int(integer_input) > 5:
            row.append(integer_input)
            writer = csv.writer(wfh)
            writer.writerow(row)
    wfh.close()
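Both answers above use Python 2 file modes ('rb', 'ab'). For Python 3, a rough sketch of the same idea could look like the following. The function name copy_matching_rows, the threshold parameter and the temporary files are assumptions made for illustration only:

```python
import csv
import os
import tempfile

def copy_matching_rows(src_path, dst_path, user_input, integer_input, threshold=5):
    """Append rows whose 4th column contains user_input, plus integer_input."""
    if int(integer_input) <= threshold:
        return 0  # the integer is not over the threshold, so copy nothing
    copied = 0
    # newline='' lets the csv module control line endings in Python 3
    with open(src_path, newline='') as src, open(dst_path, 'a', newline='') as dst:
        writer = csv.writer(dst)  # created once, outside the loop
        for row in csv.reader(src):
            if len(row) > 3 and user_input in row[3]:
                writer.writerow(row + [str(integer_input)])
                copied += 1
    return copied

# Small demonstration with temporary files (hypothetical data from the question)
tmp_dir = tempfile.mkdtemp()
src_file = os.path.join(tmp_dir, 'CSVFile1.csv')
dst_file = os.path.join(tmp_dir, 'CSVFile2.csv')
with open(src_file, 'w', newline='') as f:
    csv.writer(f).writerows([['a0', 'a1', 'a2', 'a3', 'a4'],
                             ['JA', 'BV', 'PA', 'DD', '6']])

copied = copy_matching_rows(src_file, dst_file, 'DD', 7)
with open(dst_file, newline='') as f:
    result = list(csv.reader(f))
print(copied, result)
```

Building a new list with row + [...] avoids mutating the row that came from the reader, and checking the threshold once up front avoids opening the output file when nothing will be written.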

Related

Python reading in integers from a csv file into a list

I am having some trouble trying to read a particular column in a csv file into a list in Python. Below is an example of my csv file:
Col 1 Col 2
1,000,000 1
500,000 2
250,000 3
Basically I am wanting to add column 1 into a list as integer values and am having a lot of trouble doing so. I have tried:
for row in csv.reader(csvfile):
    list = [int(row.split(',')[0]) for row in csvfile]
However, I get a ValueError that says "invalid literal for int() with base 10: '"1'
I then tried:
for row in csv.reader(csvfile):
    list = [(row.split(',')[0]) for row in csvfile]
This time I don't get an error however, I get the list:
['"1', '"500', '"250']
I have also tried changing the delimiter:
for row in csv.reader(csvfile):
    list = [(row.split(' ')[0]) for row in csvfile]
This almost gives me the desired list however, the list includes the second column as well as, "\n" after each value:
['"1,000,000", 1\n', etc...]
If anyone could help me fix this it would be greatly appreciated!
Cheers
You should choose your delimiter wisely:
if your numbers use . as the decimal mark, use , as the delimiter; if they use , as the decimal mark, use ; as the delimiter.
Moreover, as referred by the doc for csv.reader you can use the delimiter= argument to define your delimiter, like so:
with open('myfile.csv', 'r') as csvfile:
    mylist = []
    for row in csv.reader(csvfile, delimiter=';'):
        mylist.append(row[0]) # careful here with [0]
or short version:
with open('myfile.csv', 'r') as csvfile:
    mylist = [row[0] for row in csv.reader(csvfile, delimiter=';')]
To parse your number to a float, you will have to do
float(row[0].replace(',', ''))
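You can open the file and split on whitespace using regular expressions:
import re
with open('filename.csv') as f:
    file_data = [re.split(r'\s+', line.strip()) for line in f]
# skip the header row and drop the thousands separators before converting
final_data = [int(fields[0].replace(',', '')) for fields in file_data[1:]]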
First of all, you must parse your data correctly. It is not, in fact, CSV (comma-separated values) but rather TSV (tab-separated values), and you should tell the csv reader so (I'm assuming the separator is a tab, but you can theoretically use any whitespace with a few tweaks):
for row in csv.reader(csvfile, delimiter="\t"):
Second of all, you should strip your integer values of any commas as they don't add new information. After that, they can be easily parsed with int():
int(row[0].replace(',', ''))
Third of all, you really really should not iterate the same list twice. Either use a list comprehension or normal for loop, not both at the same time with the same variable. For example, with list comprehension:
import csv
from io import StringIO
csvfile = StringIO("Col 1\tCol 2\n1,000,000\t1\n500,000\t2\n250,000\t3\n")
reader = csv.reader(csvfile, delimiter="\t")
next(reader, None) # skip the header
lst = [int(row[0].replace(',', '')) for row in reader]
Or with normal iteration:
csvfile = StringIO("Col 1\tCol 2\n1,000,000\t1\n500,000\t2\n250,000\t3\n")
reader = csv.reader(csvfile, delimiter="\t")
lst = []
for i, row in enumerate(reader):
    if i == 0:
        continue # your custom header-handling code here
    lst.append(int(row[0].replace(',', '')))
In both cases, lst is set to [1000000, 500000, 250000] as it should. Enjoy.
By the way, using the built-in name list as a variable is an extremely bad idea, because it shadows the built-in type.
UPDATE. There's one more option that I find interesting. Instead of setting the delimiter explicitly you can use csv.Sniffer to detect it e.g.:
csvdata = "Col 1\tCol 2\n1,000,000\t1\n500,000\t2\n250,000\t3\n"
csvfile = StringIO(csvdata)
dialect = csv.Sniffer().sniff(csvdata)
reader = csv.reader(csvfile, dialect=dialect)
and then just like the snippets above. This will continue working even if you replace tabs with semicolons or commas (would require quotes around your weird integers) or, possibly, something else.
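The output shown in the question ('"1,000,000", 1\n') suggests the file may actually contain quoted numbers separated by a comma and a space. Assuming that layout (the original file is not shown, so the sample data below is a guess), a minimal sketch that lets the csv module handle the quotes could look like this:

```python
import csv
from io import StringIO

# Hypothetical file content reconstructed from the output in the question
data = (
    'Col 1, Col 2\n'
    '"1,000,000", 1\n'
    '"500,000", 2\n'
    '"250,000", 3\n'
)

# skipinitialspace=True ignores the space that follows each delimiter
reader = csv.reader(StringIO(data), skipinitialspace=True)
next(reader, None)  # skip the header row

# remove the thousands separators before converting to int
values = [int(row[0].replace(',', '')) for row in reader]
print(values)
```

Because the quoted field is parsed by the reader, the commas inside "1,000,000" are never mistaken for delimiters, and only the thousands separators need to be removed by hand.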

Python working with CSV with 2 delimiters

I have a program which outputs its data into a CSV file. These files contain 2 delimiters: , between fields and "" around text. The text also contains commas.
How can I work with these 2 delimiters?
My current code gives me list index out of range. If the CSV file is needed I can provide it.
Current code:
def readcsv():
    with open('pythontest.csv') as csvfile:
        dialect = csv.Sniffer().sniff(csvfile.read(1024),delimiters=',"')
        csvfile.seek(0)
        reader = csv.reader(csvfile,dialect)
        for row in reader:
            asset_ip_addresses.append(row[0])
            service_protocollen.append(row[1])
            service_porten.append(row[2])
            vurn_cvssen.append(row[3])
            vurn_risk_scores.append(row[4])
            vurn_descriptions.append(row[5])
            vurn_cve_urls.append(row[6])
            vurn_solutions.append(row[7])
The CSV file I'm working with: http://www.pastebin.com/bUbDC419
It seems to have problems handling the second line. If I append the rows to a list, the first row seems to be OK, but the second row seems to be taken as a whole and is no longer separated at the commas.
I guess it has something to do with the "enters" (line breaks) inside the text.
I don't think you should need to define a custom dialect, unless I'm missing something.
The official documentation shows you can provide quotechar as a keyword to the reader() method. The example from the documentation modified for your code:
import csv
with open('pythontest.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in reader:
        print(row)  # do something with the row
row is a list of strings for each item in the row with " quotes removed.
The issue with the index out of range suggests that one of the row[x] cannot be accessed.
OK, I think I understand what kind of file you are reading... let's say the content of your CSV file looks like this
192.168.12.255,"Great site, a lot of good, recommended",0,"Last, first, middle"
192.168.0.255,"About cats, dogs, must visit!",1,"One, two, three"
Here is code that will let you read it line by line; text in quotes is returned as a single list element and is not split at its commas. The example passes quoting=csv.QUOTE_ALL (the reader's default quote handling, with quotechar '"', also keeps quoted text together):
import csv
with open('students.csv', newline='') as f:
    reader = csv.reader(f, delimiter=',', quoting=csv.QUOTE_ALL)
    for row in reader:
        print(row[0])
        print(row[1])
        print(row[2])
        print(row[3])
The printed output will look like this
192.168.12.255
Great site, a lot of good, recommended
0
Last, first, middle
192.168.0.255
About cats, dogs, must visit!
1
One, two, three
PS solution is based on the latest official documentation, see here https://docs.python.org/3/library/csv.html
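The question mentions line breaks ("enters") inside the quoted text and a list index out of range error. As a rough sketch of how both might be handled, the following reads quoted fields that contain commas and a line break, and skips rows that are too short. The sample data, the read_rows helper and the expected_columns parameter are made up for illustration:

```python
import csv
from io import StringIO

# Made-up sample: the second record has a line break inside a quoted field
sample = (
    '192.168.12.255,"Great site, a lot of good, recommended",0,"Last, first, middle"\n'
    '192.168.0.255,"About cats,\ndogs, must visit!",1,"One, two, three"\n'
    '\n'
)

def read_rows(fileobj, expected_columns):
    """Return parsed rows, skipping blank or short rows instead of failing."""
    rows = []
    for row in csv.reader(fileobj, delimiter=',', quotechar='"'):
        if len(row) < expected_columns:
            continue  # avoids "list index out of range" on incomplete rows
        rows.append(row)
    return rows

rows = read_rows(StringIO(sample), expected_columns=4)
for row in rows:
    print(row)
```

When reading from a real file in Python 3, opening it with newline='' (as in the answer above) is what lets the reader see line breaks inside quoted fields correctly.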
How about a quick solution like this?
It is a quick fix that splits each field of a row such as a,"b,c",d at its commas, so that a, b, c and d end up as separate strings:
def readcsv():
    with open('pythontest.csv') as csvfile:
        dialect = csv.Sniffer().sniff(csvfile.read(1024),delimiters=',"')
        csvfile.seek(0)
        reader = csv.reader(csvfile,dialect)
        for rowx in reader:
            row=[e.split(r',') if isinstance(e,str) else e for e in rowx]
            #do your stuff on row

How to search CSV line for string in certain column, print entire line to file if found

Sorry, very much a beginner with Python and could really use some help.
I have a large CSV file, items separated by commas, that I'm trying to go through with Python. Here is an example of a line in the CSV.
123123,JOHN SMITH,SMITH FARMS,A,N,N,12345 123 AVE,CITY,NE,68355,US,12345 123 AVE,CITY,NE,68355,US,(123) 555-5555,(321) 555-5555,JSMITH#HOTMAIL.COM,15-JUL-16,11111,2013,22-DEC-93,NE,2,1\par
I'd like my code to scan each line and look at only the 9th item (the state). For every line where that item matches my query, I'd like the entire line to be written to a CSV.
The problem I have is that my code will find every occurrence of my query throughout the entire line, instead of just the 9th item. For example, if I scan looking for "NE", it will write the above line in my CSV, but also one that contains the string "NEARY ROAD."
Sorry if my terminology is off, again, I'm a beginner. Any help would be greatly appreciated.
I've listed my code below:
import csv
with open('Sample.csv', 'rb') as f, open('NE_Sample.csv', 'wb') as outf:
    reader = csv.reader(f, delimiter=',')
    writer = csv.writer(outf)
    for line in f:
        if "NE" in line:
            print ('Found: []'.format(line))
            writer.writerow([line])
You're not actually using your reader to read the input CSV, you're just reading the raw lines from the file itself.
A fixed version looks like the following (untested):
import csv
with open('Sample.csv', 'rb') as f, open('NE_Sample.csv', 'wb') as outf:
    reader = csv.reader(f, delimiter=',')
    writer = csv.writer(outf)
    for row in reader:
        if row[8] == 'NE':
            print ('Found: {}'.format(row))
            writer.writerow(row)
The changes are as follows:
Instead of iterating over the input file's lines, we iterate over the rows parsed by the reader (each of which is a list of each of the values in the row).
We check to see if the 9th item in the row (i.e. row[8]) is equal to "NE".
If so, we output that row to the output file by passing it in, as-is, to the writer's writerow method.
I also fixed a typo in your print statement - the format method uses braces (not square brackets) to mark replacement locations.
This snippet should solve your problem:
import csv
with open('Sample.csv', 'rb') as f, open('NE_Sample.csv', 'wb') as outf:
    reader = csv.reader(f, delimiter=',')
    writer = csv.writer(outf)
    for row in reader:
        if "NE" in row:
            print ('Found: {}'.format(row))
            writer.writerow(row)
In your code, if "NE" in line checks whether "NE" is a substring of the string line, which is not what you intended, because the lines are the raw lines of your input file.
If you use if "NE" in row:, where row is a parsed line of your input file, you are doing exact element matching against every column in the row.
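The three kinds of test discussed in these answers behave differently, which a small comparison can make concrete. The sample rows below are invented for illustration (shortened versions of the line in the question), and the variable names are arbitrary:

```python
import csv
from io import StringIO

# Invented sample rows; the first field is just an id for reporting
sample = (
    "1,JOHN SMITH,SMITH FARMS,A,N,N,12345 123 AVE,CITY,NE,68355,US\n"
    "2,JANE DOE,DOE FARMS,A,N,N,1 NEARY ROAD,CITY,IA,52001,US\n"
    "3,BOB ROE,ROE FARMS,A,N,N,9 MAIN ST,CITY,IA,52001,NE\n"
)

# Substring test on raw lines: also matches "NEARY" and similar text
substring_ids = [line.split(",")[0] for line in sample.splitlines() if "NE" in line]

rows = list(csv.reader(StringIO(sample)))

# Element test on parsed rows: matches "NE" in any column
element_ids = [row[0] for row in rows if "NE" in row]

# Column test: matches only when the 9th item (index 8) equals "NE"
column_ids = [row[0] for row in rows if row[8] == "NE"]

print(substring_ids, element_ids, column_ids)
```

Only the column test restricts the match to the state field, which is what the question asks for.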

Remove double quotes from iterator when using csv writer

I want to create a csv from an existing csv, by splitting its rows.
Input csv:
A,R,T,11,12,13,14,15,21,22,23,24,25
Output csv:
A,R,T,11,12,13,14,15
A,R,T,21,22,23,24,25
So far my code looks like:
def update_csv(name):
    #load csv file
    file_ = open(name, 'rb')
    #init first values
    current_a = ""
    current_r = ""
    current_first_time = ""
    file_content = csv.reader(file_)
    #LOOP
    for row in file_content:
        current_a = row[0]
        current_r = row[1]
        current_first_time = row[2]
        i = 2
        #Write row to new csv
        with open("updated_"+name, 'wb') as f:
            writer = csv.writer(f)
            writer.writerow((current_a,
                             current_r,
                             current_first_time,
                             ",".join((row[x] for x in range(i+1,i+5)))
                             ))
        #do only one row, for debug purposes
        return
But the row contains double quotes that I can't get rid of:
A002,R051,02-00-00,"05-21-11,00:00:00,REGULAR,003169391"
I've tried to use writer = csv.writer(f,quoting=csv.QUOTE_NONE) and got a _csv.Error: need to escape, but no escapechar set.
What is the correct approach to delete those quotes?
I think you could simplify the logic to split each row into two using something along these lines:
def update_csv(name):
    with open(name, 'rb') as file_:
        with open("updated_"+name, 'wb') as f:
            writer = csv.writer(f)
            # read one row from input csv
            for row in csv.reader(file_):
                # write 2 rows to new csv
                writer.writerow(row[:8])
                writer.writerow(row[:3] + row[8:])
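If a row can hold more than two groups of values, the same slicing idea can be generalised. The sketch below is one possible way to do that, assuming three leading key columns and groups of five values as in the example input; the split_row helper and its parameters are hypothetical:

```python
import csv
from io import StringIO

def split_row(row, key_count=3, chunk_size=5):
    """Yield one output row per chunk, each prefixed with the key columns."""
    keys, values = row[:key_count], row[key_count:]
    for start in range(0, len(values), chunk_size):
        yield keys + values[start:start + chunk_size]

source = StringIO("A,R,T,11,12,13,14,15,21,22,23,24,25\n")
target = StringIO()
writer = csv.writer(target)
for row in csv.reader(source):
    # writerows() accepts any iterable of rows, including a generator
    writer.writerows(split_row(row))

split_output = target.getvalue().splitlines()
print(split_output)
```

Because every value is passed to the writer as a separate item, no field contains a comma and no quotes appear in the output.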
writer.writerow is expecting an iterable such that it can write each item within the iterable as one item, separate by the appropriate delimiter, into the file. So:
writer.writerow([1, 2, 3])
would write "1,2,3\n" to the file.
Your call provides it with an iterable, one of whose items is a string that already contains the delimiter. It therefore needs some way to either escape the delimiter or a way to quote out that item. For example,
writer.writerow([1, '2,3'])
Doesn't just give "1,2,3\n", but e.g. '1,"2,3"\n' - the string counts as one item in the output.
Therefore if you want to not have quotes in the output, you need to provide an escape character (e.g. '/') to mark the delimiters that shouldn't be counted as such (giving something like "1,2/,3\n").
However, I think what you actually want to do is include all of those elements as separate items. Don't ",".join(...) them yourself, try:
writer.writerow((current_a, current_r,
                 current_first_time, *row[i+1:i+5]))
to provide the relevant items from row as separate items in the tuple.
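The three behaviours described in this answer (quoting a joined field, writing separate items, and escaping instead of quoting) can be compared side by side. This is a small sketch using in-memory buffers; the write_one helper is hypothetical and the values are shortened from the question:

```python
import csv
from io import StringIO

def write_one(row, **writer_options):
    """Write a single row to an in-memory buffer and return the text."""
    buffer = StringIO()
    csv.writer(buffer, **writer_options).writerow(row)
    return buffer.getvalue()

# One item that already contains the delimiter gets quoted
joined = write_one(['A002', 'R051', '05-21-11,00:00:00'])

# Separate items need no quoting
separate = write_one(['A002', 'R051', '05-21-11', '00:00:00'])

# QUOTE_NONE works only when an escape character is provided
escaped = write_one(['A002', 'R051', '05-21-11,00:00:00'],
                    quoting=csv.QUOTE_NONE, escapechar='/')

print(repr(joined))
print(repr(separate))
print(repr(escaped))
```

Passing the values as separate items is usually the cleaner choice, since the escaped form changes the data that a later reader will see unless it uses the same escape character.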

replace blank values in column in csv with python

I am trying to replace blank values in a certain column (column 6 'Author' for example) with "DMD" in CSV using Python. I am fairly new to the program, so a lot of the lingo throws me. I have read through the CSV Python documentation but there doesn't seem to be anything that is specific to my question. Here is what I have so far. It doesn't run. I get the error 'dict' object has no attribute replace. It seems like there should be something similar to replace in the dict. Also, I am not entirely sure my method to search the field is accurate. Any guidance would be appreciated.
import csv
inputFileName = "C:\Author.csv"
outputFileName = os.path.splitext(inputFileName)[0] + "_edited.csv"
field = ['Author']
with open(inputFileName) as infile, open(outputFileName, "w") as outfile:
    r = csv.DictReader(infile)
    w = csv.DictWriter(outfile, field)
    w.writeheader()
    for row in r:
        row.replace(" ","DMD")
        w.writerow(row)
I think you're pretty close. You need to pass the fieldnames to the writer and then you can edit the row directly, because it's simply a dictionary. For example:
with open(inputFileName, "rb") as infile, open(outputFileName, "wb") as outfile:
    r = csv.DictReader(infile)
    w = csv.DictWriter(outfile, r.fieldnames)
    w.writeheader()
    for row in r:
        if not row["Author"].strip():
            row["Author"] = "DMD"
        w.writerow(row)
turns
a,b,c,d,e,Author,g,h
1,2,3,4,5,Smith,6,7
8,9,10,11,12,Jones,13,14
13,14,15,16,17,,18,19
into
a,b,c,d,e,Author,g,h
1,2,3,4,5,Smith,6,7
8,9,10,11,12,Jones,13,14
13,14,15,16,17,DMD,18,19
I like using if not somestring.strip(): because that way it won't matter if there are no spaces, or one, or seventeen and a tab. I also prefer DictReader to the standard reader because this way you don't have to remember which column Author is living in.
[PS: The above assumes Python 2, not 3.]
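Since the answer above targets Python 2, here is a rough Python 3 sketch of the same approach. It uses in-memory buffers so it can be run as-is; with real files you would open both with newline='' instead of the "rb"/"wb" modes. The sample data mirrors the example above, with whitespace in the blank Author field as an added assumption:

```python
import csv
from io import StringIO

# In-memory stand-ins for the input and output files
source = StringIO(
    "a,b,c,d,e,Author,g,h\n"
    "1,2,3,4,5,Smith,6,7\n"
    "8,9,10,11,12,Jones,13,14\n"
    "13,14,15,16,17,  ,18,19\n"
)
target = StringIO()

reader = csv.DictReader(source)
# reuse the input header so every column is written back out
writer = csv.DictWriter(target, fieldnames=reader.fieldnames)
writer.writeheader()
for row in reader:
    if not row["Author"].strip():  # empty or whitespace-only
        row["Author"] = "DMD"
    writer.writerow(row)

output_lines = target.getvalue().splitlines()
print(output_lines)
```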
Dictionaries don't need the replace method because simple assignment does this for you:
for row in r:
    if row["header-6"] == "":
        row["header-6"] = "DMD"
    w.writerow(row)
Where header-6 is the name of your sixth column
Also note that your call to DictWriter appears to have the wrong fieldnames argument. That argument should be a list (or other sequence) containing all the headers of your new CSV, in order.
For your purposes, it appears to be simpler to use the vanilla reader:
import csv
import os
inputFileName = "C:\Author.csv"
outputFileName = os.path.splitext(inputFileName)[0] + "_edited.csv"
with open(inputFileName) as infile, open(outputFileName, "w") as outfile:
    r = csv.reader(infile)
    w = csv.writer(outfile)
    w.writerow(next(r)) # Writes the header unchanged
    for row in r:
        if row[5] == "":
            row[5] = "DMD"
        w.writerow(row)
(1) To use os.path.splitext, you need to add an import os.
(2) Dicts don't have a replace method; dicts aren't strings. If you're trying to alter a string that's the value of a dict entry, you need to reference that dict entry by key, e.g. row['Author']. If row['Author'] is a string (should be in your case), you can do a replace on that. Sounds like you need an intro to Python dictionaries, see for example http://www.sthurlow.com/python/lesson06/ .
(3) A way to do this, that also deals with multiple spaces, no spaces etc. in the field, would look like this:
field = 'Author'
marker = 'DMD'
....
## longhand version
candidate = str(row[field]).strip()
if candidate:
    row[field] = candidate
else:
    row[field] = marker
or
## shorthand version
row[field] = str(row[field]).strip() or marker
Cheers
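The shorthand relies on the fact that an empty string is falsy, so the or falls through to the marker. A tiny sketch of that idiom follows; the fill_blank helper and the sample row are hypothetical:

```python
def fill_blank(value, marker='DMD'):
    """Return the stripped value, or marker when only whitespace remains."""
    return str(value).strip() or marker

# Made-up row with a whitespace-only Author field
row = {'Title': 'Notes', 'Author': '  \t '}
row['Author'] = fill_blank(row['Author'])
print(row)
```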
with open('your file', 'r+') as f2:
    txt=f2.read().replace('#','').replace("'",'').replace('"','').replace('&','')
    f2.seek(0)
    f2.write(txt)
    f2.truncate()
Keep it simple and replace your choice of characters.
