Reading and writing variables from CSV file in Python (Selenium) - python

I'm having some difficulties with my code - wondering if anyone could help me as to where I'm going wrong.
The general syntax of the goal I'm trying to achieve is:
Get user input
Split input into individual variables
Write variables (amend) to 'data.csv'
Read variables from newly amended 'data.csv'
Add variables to list
If variable 1 <= length of list, #run some code
If variable 2 <= length of list, #run some code
Here is my python code:
from selenium import webdriver
import time
import csv
x = raw_input("Enter numbers separated by a space")
integers = [[int(i)] for i in x.split()]
with open("data.csv", "a") as f:
    writer = csv.writer(f)
    writer.writerows(integers)
with open('data.csv', 'r') as f:
    file_contents = f.read()
previous_FONs = file_contents.split(' ')
if list.count(integers[i]) == 1:
    # run some code
elif list.count(integers[i]) == 2:
    # run some code
The error message I'm receiving is TypeError: count() takes exactly one argument (0 given)

Because of the following line
integers = [[int(i)] for i in x.split()]
you're creating a list of lists. Therefore you're passing lists to the count method. Try this one:
integers = [int(i) for i in x.split()]
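To see the difference, here is a minimal sketch (with a hypothetical input of "3 5 3"):

```python
x = "3 5 3"  # hypothetical user input

nested = [[int(i)] for i in x.split()]  # [[3], [5], [3]] -- a list of lists
flat = [int(i) for i in x.split()]      # [3, 5, 3] -- a flat list of ints

print(nested.count([3]))  # 2: you must search for the wrapped list [3]
print(flat.count(3))      # 2: searches for the bare integer
print(nested.count(3))    # 0: the bare int 3 is not an element of nested
```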
Edit: Based on your explanation of what you want to achieve, this code should do it:
import csv

x = raw_input('Enter numbers separated by a space: ')
new_FONs = [[int(i)] for i in x.split()]

with open('data.csv', 'ab') as f:  # binary mode for csv files on Python 2
    writer = csv.writer(f)
    writer.writerows(new_FONs)

with open('data.csv', 'r') as f:
    all_FONs_str = [line.split() for line in f]
all_FONs = [[int(FON[0])] for FON in all_FONs_str]

# For each of the user-input numbers
for FON in new_FONs:
    # Count the occurrences of this number in the CSV file
    FON_count = all_FONs.count(FON)
    if FON_count == 1:
        print('{0} occurs once'.format(FON[0]))
        # do stuff
    elif FON_count == 2:
        print('{0} occurs twice'.format(FON[0]))
        # do stuff
I have changed the name of the list read from the CSV to all_FONs, as a reminder that it contains both the old entries and the new ones (since we wrote the new ones to the file before reading it back).
In addition, you need to convert the entries: reading from CSV gives you strings, not integers, which would make the comparison fail. Maybe the whole conversion to int is not necessary and you could just work on strings throughout; that depends on what you need.
Edit2: Sorry, I forgot to change input to raw_input for Python 2.7 :)
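A quick Python 3 illustration of the strings-vs-ints point above, using io.StringIO in place of the file:

```python
import csv
import io

# csv.reader always yields strings, even for numeric-looking fields
rows = list(csv.reader(io.StringIO("3\n5\n")))
print(rows)  # [['3'], ['5']] -- strings, not ints

as_ints = [[int(cell) for cell in row] for row in rows]
print(as_ints)  # [[3], [5]]
```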

Related

My code is not working properly only on my device

I created a program to build a csv file containing every number from 0 to 1000000, one per row:
import csv

nums = list(range(0, 1000000))
with open('codes.csv', 'w') as f:
    writer = csv.writer(f)
    for val in nums:
        writer.writerow([val])
Then another program to remove a number, taken as input, from the file:
import csv
import os

while True:
    members = input("Please enter a number to be deleted: ")
    lines = list()
    with open('codes.csv', 'r') as readFile:
        reader = csv.reader(readFile)
        for row in reader:
            if all(field != members for field in row):
                lines.append(row)
            else:
                print('Removed')
    os.remove('codes.csv')
    with open('codes.csv', 'w') as writeFile:
        writer = csv.writer(writeFile)
        writer.writerows(lines)
The above code works fine on every device except my PC: the first program creates the csv file with an empty row between every number, and with the second program the number of empty rows multiplies and the file size grows accordingly.
What is wrong with my device, then?
Thanks in advance
I think you shouldn't use a csv file for single-column data; use a json file instead.
Also, the code you've written to check which values not to remove is unnecessary. Instead you could write a list of numbers to the file, read it back into a variable, remove the number you want using the list.remove() method, and then write it back to the file.
Here's how I would've done it:
import json

with open("codes.json", "w") as f:  # Write the numbers to the file
    f.write(json.dumps(list(range(0, 1000000))))

nums = None
with open("codes.json", "r") as f:  # Read the list in the file into nums
    nums = json.load(f)

to_remove = int(input("Number to remove: "))
nums.remove(to_remove)  # Removes the number you want

with open("codes.json", "w") as f:  # Dump the list back to the file
    f.write(json.dumps(nums))
Seems like you have different python versions.
There is a difference between the built-in python2 open() and python3 open(): python3 defaults to universal newlines mode, while python2 newline handling depends on the mode argument of open().
The csv module docs provide a few examples where open() is called with the newline argument explicitly set to the empty string, newline='':
import csv

with open('some.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(someiterable)
Try to do the same. Without the explicit newline='', your writerows call probably adds an extra newline character per row.
CSV stands for Comma-Separated Values; here your records also contain stray spaces.
To remove the empty lines, add newline="" when opening the file for writing.
Since this format is tabular data, you cannot simply delete an element, or the table will shift; you need to insert an empty string or "NaN" in place of the deleted element.
I reduced the number of entries and laid them out as a table for clarity.
import csv

def write_csv(file, seq):
    with open(file, 'w', newline='') as f:
        writer = csv.writer(f)
        for val in seq:
            writer.writerow([v for v in val])

nums = ((j * 10 + i for i in range(0, 10)) for j in range(0, 10))
write_csv('codes.csv', nums)

nums_new = []
members = input("Please enter a number, from 0 to 100, to be deleted: ")
with open('codes.csv', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        rows_new = []
        for elem in row:
            if elem == members:
                elem = ""
            rows_new.append(elem)
        nums_new.append(rows_new)
write_csv('codesdel.csv', nums_new)

More efficient way to go through .csv file?

I'm trying to parse through a dictionary in a .CSV file, using two lists in separate .txt files so that the script knows what to look for. The idea is to find a line in the .CSV file that matches both a Word and an IDNumber, and then pull out a third value if there is a match. However, the code runs really slowly. Any ideas how I could make it more efficient?
import csv

IDNumberList_filename = 'IDs.txt'
WordsOfInterest_filename = 'dictionary_WordsOfInterest.txt'
Dictionary_filename = 'dictionary_individualwords.csv'

WordsOfInterest_ReadIn = open(WordsOfInterest_filename).read().split('\n')
#IDNumberListtoRead = open(IDNumberList_filename).read().split('\n')

for CurrentIDNumber in open(IDNumberList_filename).readlines():
    for CurrentWord in open(WordsOfInterest_filename).readlines():
        FoundCurrent = 0
        with open(Dictionary_filename, newline='', encoding='utf-8') as csvfile:
            reader = csv.DictReader(csvfile)
            for row in reader:
                if ((row['IDNumber'] == CurrentIDNumber) and (row['Word'] == CurrentWord)):
                    FoundCurrent = 1
                    CurrentProportion = row['CurrentProportion']
        if FoundCurrent == 0:
            CurrentProportion = 0
        else:
            CurrentProportion = 1
            print('found')
First of all, consider loading the file dictionary_individualwords.csv into memory; I guess that a python dictionary is the proper data structure for this case.
You are opening the CSV file N times, where N = (# lines in IDs.txt) * (# lines in dictionary_WordsOfInterest.txt). If the file is not too large, you can avoid that by saving its content to a dictionary or a list of lists.
In the same way, you reopen dictionary_WordsOfInterest.txt every time you read a new line from IDs.txt.
Also, it seems that you are checking every possible combination of the pair (CurrentIDNumber, CurrentWord) from the txt files. So you can, for example, store the ids in one set and the words in another, and for each row of the csv file check whether both the id and the word are in their respective sets.
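A sketch of that set-based suggestion, as a single pass over the CSV (file and column names are taken from the question; the function name is mine):

```python
import csv

def load_proportions(ids_path, words_path, csv_path):
    """Build sets from the two .txt files, then scan the CSV once,
    keeping CurrentProportion for every (IDNumber, Word) match."""
    with open(ids_path) as f:
        ids = {line.strip() for line in f if line.strip()}
    with open(words_path) as f:
        words = {line.strip() for line in f if line.strip()}

    proportions = {}  # (IDNumber, Word) -> CurrentProportion
    with open(csv_path, newline='', encoding='utf-8') as csvfile:
        for row in csv.DictReader(csvfile):
            # set membership is O(1), so each row costs constant time
            if row['IDNumber'] in ids and row['Word'] in words:
                proportions[(row['IDNumber'], row['Word'])] = row['CurrentProportion']
    return proportions
```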
As you use readlines for the .txt files, you already build in-memory lists from them. You should build those lists first and then parse the csv file only once. Something like:
import csv

IDNumberList_filename = 'IDs.txt'
WordsOfInterest_filename = 'dictionary_WordsOfInterest.txt'
Dictionary_filename = 'dictionary_individualwords.csv'

numberlist = open(IDNumberList_filename).readlines()
wordlist = open(WordsOfInterest_filename).readlines()

FoundCurrent = 0
with open(Dictionary_filename, newline='', encoding='utf-8') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        for CurrentIDNumber in numberlist:
            for CurrentWord in wordlist:
                if ((row['IDNumber'] == CurrentIDNumber) and (row['Word'] == CurrentWord)):
                    FoundCurrent = 1
                    CurrentProportion = row['CurrentProportion']
if FoundCurrent == 0:
    CurrentProportion = 0
else:
    CurrentProportion = 1
    print('found')
Beware: untested

Check whether string is in CSV

I want to search a CSV file and print either True or False, depending on whether or not I found the string. However, I'm running into the problem whereby it will return a false positive if it finds the string embedded in a larger string of text. E.g.: It will return True if string is foo and the term foobar is in the CSV file. I need to be able to return exact matches.
username = input()
if username in open('Users.csv').read():
    print("True")
else:
    print("False")
I've looked at using mmap, re and csv module functions, but I haven't got anywhere with them.
EDIT: Here is an alternative method:
import re
import csv

username = input()
with open('Users.csv', 'rt') as f:
    reader = csv.reader(f)
    for row in reader:
        re.search(r'\bNOTSUREHERE\b', username)
When you look inside a csv file using the csv module, it returns each row as a list of columns. So if you want to look up your string, you should modify your code as follows:
import csv

username = input()
with open('Users.csv', 'rt') as f:
    reader = csv.reader(f, delimiter=',')  # good point by @paco
    for row in reader:
        for field in row:
            if field == username:
                print("is in file")
but as it is a csv file, you might expect the username to be in a given column:
with open('Users.csv', 'rt') as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        if username == row[2]:  # if the username should be in column 3 (-> index 2)
            print("is in file")
You should have a look at the csv module in python.
import csv

is_in_file = False
with open('my_file.csv', 'rb') as csvfile:
    my_content = csv.reader(csvfile, delimiter=',')
    for row in my_content:
        if username in row:
            is_in_file = True
print(is_in_file)
It assumes that your delimiter is a comma (replace it with your delimiter), and that username has been defined previously; also change the file name to yours.
The code loops through all the lines in the CSV file. row is a list of strings containing each element of the row. For example, if your CSV file contains Joe,Peter,Michel, then row will be ['Joe', 'Peter', 'Michel']. You can then check whether your username is in that list.
I have used the top answer; it works and looks OK, but it was too slow for me.
I had an array of many strings that I wanted to check against a large csv file, with no other requirements.
For this purpose I used (simplified; I actually iterated through an array of strings and did other work than print):
with open('my_csv.csv', 'rt') as c:
    str_arr_csv = c.readlines()
Together with:
if str(my_str) in str(str_arr_csv):
    print("True")
The reduction in time was about ~90% for me. The code looks ugly, but I'm all about speed. Sometimes.
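Note that substring-matching the joined lines this way reintroduces the false-positive problem from the question (foo also matches inside foobar). One way to keep both the speed and exact matching is to load every field into a set up front; a sketch (the function name and file name are mine):

```python
import csv

def build_field_set(path):
    """Read the CSV once and collect every stripped field into a set,
    so each later lookup is O(1) and matches whole fields only."""
    fields = set()
    with open(path, newline='') as f:
        for row in csv.reader(f):
            fields.update(cell.strip() for cell in row)
    return fields

# usage sketch:
# users = build_field_set('Users.csv')
# 'foo' in users   # exact field match; 'foobar' no longer counts as a hit
```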
import csv

scoresList = []
with open("playerScores_v2.txt") as csvfile:
    scores = csv.reader(csvfile, delimiter=",")
    for row in scores:
        scoresList.append(row)

playername = input("Enter the player name you would like the score for:")
print("{0:40} {1:10} {2:10}".format("Name", "Level", "Score"))
for i in range(0, len(scoresList)):
    print("{0:40} {1:10} {2:10}".format(scoresList[i][0], scoresList[i][1], scoresList[i][2]))
EXTENDED ALGO:
As my csv can have values padded with spaces, such as:
", atleft,atright , both " ,
I patched the code of zmo as follows:
if field.strip() == username:
and it's ok, thanks.
OLD FASHION ALGO
I had previously coded an 'old fashioned' algorithm that takes care of any allowed separator (here comma, space and newline), so I was curious to compare performance.
With 10000 rounds on a very simple csv file, I got:
------------------ algo 1 old fashion ---------------
done in 1.931804895401001 s.
------------------ algo 2 with csv ---------------
done in 1.926626205444336 s.
As this is not too bad (only 0.25% longer), I think this good old hand-made algorithm can help somebody (and it will be useful if there are more parasitic characters, since strip only handles whitespace).
This algorithm works on bytes and can be used for things other than strings.
It searches for a name not embedded in another by checking that the bytes to its left and right are allowed separators.
It mainly uses loops, ejecting as soon as possible through break or continue.
def separatorsNok(x):
    return (x != 44) and (x != 32) and (x != 10) and (x != 13)  # comma space lf cr

# set as a function to be able to run several chained tests
def searchUserName(userName, fileName):
    # read file as binary (supposed to be utf-8, like userName)
    f = open(fileName, 'rb')
    contents = f.read()
    lenOfFile = len(contents)
    # set username in bytes
    userBytes = bytearray(userName.encode('utf-8'))
    lenOfUser = len(userBytes)
    posInFile = 0
    posInUser = 0
    while posInFile < lenOfFile:
        found = False
        posInUser = 0
        # search full name
        while posInFile < lenOfFile:
            if (contents[posInFile] == userBytes[posInUser]):
                posInUser += 1
                if (posInUser == lenOfUser):
                    found = True
                    break
            posInFile += 1
        if not found:
            continue
        # found a full name, check if isolated on left and on right
        # left ok at very beginning or space or comma or new line
        if (posInFile > lenOfUser):
            if separatorsNok(contents[posInFile - lenOfUser]):  # previousLeft
                continue
        # right ok at very end or space or comma or new line
        if (posInFile < lenOfFile - 1):
            if separatorsNok(contents[posInFile + 1]):  # nextRight
                continue
        # found and bordered
        break
    # main while
    if found:
        print(userName, "is in file")  # at posInFile-lenOfUser+1
    else:
        pass
To check: searchUserName('pirla', 'test.csv')
As in the other answers, the code exits at the first match, but it can easily be extended to find all matches.
HTH
#!/usr/bin/python
import csv

with open('my.csv', 'r') as f:
    lines = f.readlines()

cnt = 0
for entry in lines:
    if 'foo' in entry:
        cnt += 1

print("No of foo entry Count :".ljust(20, '.'), cnt)

Writing csv file in Python: unwanted commas appear

I'm trying to extract a list of dictionary values into a file with the following code:
import csv

def function(file, output, deli=','):
    dictionary = dict()
    with open(file, 'r') as source, open(output, 'w') as outp:
        data = csv.reader(source)
        line0 = next(data)
        i = 0
        for element in line0:
            dictionary[i] = element
            i += 1
        my_writer = csv.writer(outp)
        for element in dictionary.values():
            print(element)
            my_writer.writerow(element)

if __name__ == '__main__':
    from sys import argv
    if len(argv) == 2:
        function(argv[1])
    elif len(argv) == 3:
        function(argv[1], argv[2])
    elif len(argv) == 4:
        function(argv[1], argv[2], argv[3])
    print("ok")
To run this code on the shell, I use the command:
python function.py input output
However, trying on a csv file like:
alpha, beta, gamma, delta
I get the following result:
a,l,p,h,a
,b,e,t,a
,g,a,m,m,a
,d,e,l,t,a
I tried to change the delimiter to ' ' and I got the same result with spaces instead of commas.
Am I missing something?
The problem is in these lines:

for element in dictionary.values():
    print(element)
    my_writer.writerow(element)

From the docs, the argument to writerow should be a list of objects. In your case the argument is a string, which is also iterable.
So what you are actually doing is my_writer.writerow("alpha"), which is written as a,l,p,h,a.
You should simply wrap each value in a list:
my_writer.writerow([element])
Also, you are getting leading commas because your CSV string is alpha, beta, gamma, delta, so after the split the elements are ['alpha', ' beta', ' gamma', ' delta']. You could use strip() to remove the leading spaces.
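The point about writerow iterating its argument can be seen in isolation (using io.StringIO, so no file is needed):

```python
import csv
import io

buf = io.StringIO()
w = csv.writer(buf, lineterminator='\n')

w.writerow('alpha')            # a string is iterated character by character
w.writerow(['alpha', 'beta'])  # a list writes one field per element

print(buf.getvalue())
# a,l,p,h,a
# alpha,beta
```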
You are looping over the characters in the line instead of the actual elements:

with open(file_, 'r') as source, open(output, 'w') as outp:
    data = csv.reader(source, delimiter=deli)
    #line0 = next(data)
    i = 0
    for element in data:
        dictionary[i] = element
        i += 1

Also, it's considered good practice not to shadow built-ins; in this case you are shadowing file. PEP 8 recommends appending an underscore to your variable name instead.

Reading csv file and compare objects to a list

I have a .txt file,primary list, with strings like this:
f
r
y
h
g
j
and I have a .csv file,recipes list, with rows like this:
d,g,r,e,w,s
j,f,o,b,x,q,h
y,n,b,w,q,j
My program goes through each row and counts the number of objects that belong to the primary list; for example, in this case the outcome is:
2
3
2
I always get 0; the mistake must be silly, but I can't figure it out:
from __future__ import print_function
import csv

primary_data = open('test_list.txt', 'r')
primary_list = []
for line in primary_data.readlines():
    line.strip('\n')
    primary_list.append(line)

recipes_reader = csv.reader(open('test.csv', 'r'), delimiter=',')
for row in recipes_reader:
    primary_count = 0
    for i in row:
        if i in primary_list:
            primary_count += 1
    print(primary_count)
Here's the bare-essentials pedal-to-the-metal version:
from __future__ import print_function
import csv

with open('test_list.txt', 'r') as f:  # with statement ensures your file is closed
    primary_set = set(line.strip() for line in f)

with open('test.csv', 'rb') as f:  #### see note below ####
    for row in csv.reader(f):  # delimiter=',' is the default
        print(sum(i in primary_set for i in row))  # i in primary_set has int value 0 or 1
Note: in Python 2.x, always open csv files in binary mode. In Python 3.x, always open csv files with newline=''.
Reading into primary_list keeps the trailing \n on each number; you should remove it when appending to primary_list:

for line in primary_data:
    primary_list.append(line.strip())

Note the strip call. Also, as you can see, you don't really need readlines, since for line in primary_data already does what you need when primary_data is a file object.
Now, as a general comment: since you're using the primary list for lookups, I suggest replacing the list with a set. This will make things much faster if the list is large; Python sets are very efficient for key-based lookup, while lists are not designed for that purpose.
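That suggestion, sketched with the sample data from the question inlined:

```python
# the primary list from the .txt file, stored as a set for O(1) lookups
primary_set = {'f', 'r', 'y', 'h', 'g', 'j'}

# the rows of the recipes .csv file
rows = [
    ['d', 'g', 'r', 'e', 'w', 's'],
    ['j', 'f', 'o', 'b', 'x', 'q', 'h'],
    ['y', 'n', 'b', 'w', 'q', 'j'],
]

counts = [sum(item in primary_set for item in row) for row in rows]
print(counts)  # [2, 3, 2] -- matches the expected outcome in the question
```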
The following code would solve the problem:

from __future__ import print_function
import csv

primary_data = open('test_list.txt', 'r')
primary_list = [line.rstrip() for line in primary_data]

recipies_reader = csv.reader(open('recipies.csv', 'r'), delimiter=',')
for row in recipies_reader:
    count = 0
    for i in row:
        if i in primary_list:
            count += 1
    print(count)
Output
2
3
2
