I'm trying to extract rows from a CSV file by city with `re.findall()`, but when I do that and write the results to another CSV file, it loops over and over many times!
import io
import csv
import re
import codecs
lines=0
outfile1 =codecs.open('/mesh/وسطى.csv','w','utf_8')
outfile6 =codecs.open('/mesh/أخرى.csv','w','utf_8')
with io.open('/mishal.csv','r',encoding="utf-8",newline='') as f:
    reader = csv.reader(f)
    for row in f :
        for rows in row:
            lines += 1
            # Central region (الوسطى)
            m = re.findall('\u0634\u0642\u0631\u0627\u0621',row)
            if m:
                outfile1.write(row)
            else:
                outfile6.write(row)
print("saved In to mishal !")
f.close()
I want the re.findall() check to run just once for each match, not loop many times whenever there's a match.
[Screenshot of the output showing the excessive looping]
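A csv reader returns a list for each line of the file, but your code loops over f rather than reader, so each row is a line of text and the inner loop runs once per character. Either way, your conditional writes happen once for each item in each row, which is why the output repeats. It isn't clear exactly what you want, but if your intent is to check whether there is a match anywhere in the row rather than in each item, iterate over the reader, check every item, and write once per row:
for row in reader:
    lines += 1  # counts rows
    match = False
    for item in row:
        # Central region (الوسطى)
        if re.search('\u0634\u0642\u0631\u0627\u0621',item):
            match = True
            break
    # row is a list here, so join it back into a line (quoting is not preserved)
    line = ','.join(row) + '\n'
    if match:
        outfile1.write(line)
    else:
        outfile6.write(line)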
You could accomplish the same thing by iterating over the lines in the file directly, without using a csv reader:
with io.open('/mishal.csv','r',encoding="utf-8",newline='') as f:
    for line in f:
        # Central region (الوسطى)
        if re.search('\u0634\u0642\u0631\u0627\u0621',line):
            outfile1.write(line)
        else:
            outfile6.write(line)
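If the file contains quoted fields, keeping the csv module on both the reading and the writing side may be safer than joining strings by hand. The following is only a sketch under assumptions: split_rows is a hypothetical helper name, and the sample data is made up and held in memory with io.StringIO instead of the real files.

```python
import csv
import io
import re

# The city name from the question, written with the same escapes.
CITY = re.compile('\u0634\u0642\u0631\u0627\u0621')

def split_rows(source, matched_out, other_out, pattern=CITY):
    """Write each CSV row exactly once: to matched_out if any field
    matches the pattern, otherwise to other_out. Returns the row count."""
    matched_writer = csv.writer(matched_out)
    other_writer = csv.writer(other_out)
    count = 0
    for row in csv.reader(source):
        count += 1
        if any(pattern.search(field) for field in row):
            matched_writer.writerow(row)
        else:
            other_writer.writerow(row)
    return count

# Hypothetical sample data standing in for mishal.csv.
sample = io.StringIO('1,\u0634\u0642\u0631\u0627\u0621\r\n2,other\r\n', newline='')
matched, others = io.StringIO(newline=''), io.StringIO(newline='')
total = split_rows(sample, matched, others)
```

With real files, the three arguments would be file objects opened with encoding="utf-8" and newline=''. The any() call is what guarantees a single write per row, regardless of how many fields match.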
Related
I'm trying to print lines from a CSV in random order.
Let's say the CSV has the 10 lines below:
1,One
2,Two
3,Three
4,Four
5,Five
6,Six
7,Seven
8,Eight
9,Nine
10,Ten
If I write code like the one below, it prints each line as a list, in the same order as in the CSV:
import csv
with open("MyCSV.csv") as f:
    reader = csv.reader(f)
    for row_num, row in enumerate(reader):
        print(row)
Instead, I'd like the order to be random.
It's just a print for now. Later I'll pass each line as a list to a function.
This should work. You can reuse the lines list in your code as it is shuffled.
import random
with open("tmp.csv", "r") as f:
    lines = f.readlines()
random.shuffle(lines)
print(lines)
import csv
import random
csv_elems = []
with open("MyCSV.csv") as f:
    reader = csv.reader(f)
    for row_num, row in enumerate(reader):
        csv_elems.append(row)
random.shuffle(csv_elems)
print(csv_elems[0])
As you can see, I'm only printing the first element; you can iterate over the whole list, or keep shuffling and printing.
You can define a list, append all the rows of the CSV file to it, then shuffle it and print the rows. Assume the list is named temp:
import csv
import random
temp = []
with open("your csv file.csv") as file:
    reader = csv.reader(file)
    for row_num, row in enumerate(reader):
        temp.append(row)
random.shuffle(temp)
for row in temp:
    print(row)
Why not use pandas to handle the CSV?
import pandas as pd
data = pd.read_csv("MyCSV.csv", header=None)  # header=None because the sample file has no header row
To get the samples you are looking for, just write:
data.sample()   # one random row
data.sample(5)  # five random rows
If you also want to pass each line to a function:
data_after_function = data.apply(function_name, axis=1)  # axis=1 calls the function once per row
Inside the function you can cast the row into a list with list().
Hope this helps!
Couple of things to do:
1. Store CSV into a sequence of some sort
2. Get the data randomly
For 1, it’s probably best to use some form of sequence comprehension (I’ve gone for nested tuple in a list as it seems you want the row numbers and we can’t use dictionaries for shuffle).
We can use the random module for number 2.
import random
import csv
with open("MyCSV.csv") as f:
    reader = csv.reader(f)
    my_csv = [(row_num, row) for row_num, row in enumerate(reader)]
# get only 1 item from the list at random
random_row = random.choice(my_csv)
# randomise the order of all the rows; random.shuffle() works in place and
# returns None, so random.sample() is used to get a new shuffled list
shuffled_csv = random.sample(my_csv, len(my_csv))
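The difference between shuffling in place and getting a shuffled copy is easy to trip over, so here is a small illustration. The row data is hard-coded and made up for the example; the variable names are not from the answers above.

```python
import random

# Made-up (row_num, row) pairs in the shape produced by enumerate(reader).
rows = [(0, ['1', 'One']), (1, ['2', 'Two']), (2, ['3', 'Three'])]

# random.shuffle reorders the list it is given and returns None.
in_place = list(rows)
result = random.shuffle(in_place)

# random.sample returns a new list and leaves rows untouched.
shuffled_copy = random.sample(rows, len(rows))
```

Assigning the return value of random.shuffle() therefore always stores None; either shuffle first and then use the list, or use random.sample() when the original order must be kept.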
I'm trying to get a specific cell value from a CSV file and count the number of rows, but if I count the rows before reading the cell, I get an error.
My code is:
import os
import sys
import csv
with open('C:\Users\Administrator\Desktop\python test\update_test\datalog.csv','rb') as csvfile:
    data= csv.reader(csvfile)
    row_count=sum(1 for row in data)
    data=list(data)
    text=data[0][0]
    print(text)
    print row_count
You can't just read from a file twice. sum(1 for row in data) already read all the data, so data = list(data) is an empty list, because the file pointer is at the end of the file and won't return more data unless you rewind the file to the start.
You don't even need to use the sum() call, remove it. You can get the same count with len(data) after you used list() on it:
# Python 3 form: raw string for the Windows path, text mode with newline=''
with open(r'C:\Users\Administrator\Desktop\python test\update_test\datalog.csv', newline='') as csvfile:
    data = csv.reader(csvfile)
    data = list(data)
    text = data[0][0]
    print(text)
    print(len(data))
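To see the exhaustion for yourself, or if you really do need two passes, the underlying file can be rewound with seek(0). This sketch uses an in-memory io.StringIO with made-up contents in place of datalog.csv.

```python
import csv
import io

# In-memory stand-in for datalog.csv (hypothetical contents).
csvfile = io.StringIO('a,b\r\nc,d\r\n', newline='')

reader = csv.reader(csvfile)
first_pass = sum(1 for _ in reader)   # consumes every row
second_pass = list(reader)            # nothing left to read

csvfile.seek(0)                       # rewind the underlying file
rows = list(csv.reader(csvfile))      # a fresh reader sees the data again
```

Reading everything into a list once, as above, is usually simpler than rewinding, since both the count and the cells are then available from the same object.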
I want to read rows from an Excel table, but at some point during reading I would like to stop reading forward and read the previous lines again (backward reading). How can I go back to previous rows?
import csv
file = open('ff.csv2', 'rb')
reader = csv.reader(file)
for row in reader:
    print row
You could always store the rows in a list and then access each one by index.
import csv
file = open('ff.csv2', 'r')
lines = []  # rows read so far
def somereason(row):
    # some logic to decide if stop reading by returning True
    return False # keep on reading
for row in csv.reader(file):
    if somereason(row):
        break
    lines.append(row)
# visit the stored lines in reverse
for row in reversed(lines):
    print(row)
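Stepping back to one particular earlier row is then plain list indexing. A minimal sketch, assuming made-up sample data and an arbitrary stop condition chosen only for illustration:

```python
import csv
import io

# Hypothetical sample data; replace with an open file in real use.
source = io.StringIO('1,One\r\n2,Two\r\n3,Three\r\n4,Four\r\n', newline='')

rows = []
for row in csv.reader(source):
    rows.append(row)
    if row[0] == '3':        # assumed stop condition, purely illustrative
        break

current = len(rows) - 1      # index of the row where reading stopped
previous_row = rows[current - 1]
backward = rows[::-1]        # every stored row, newest first
```

The reader itself can only move forward, so anything you may want to revisit has to be kept in memory like this (or the file has to be reopened).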
I am new to handling CSV files with Python and I want to write code that does the following. I have a pattern such as:
pattern="3-5;7;10-16" (which may vary)
and I want to delete (in this case) rows 3 to 5, 7, and 10 to 16.
Does anyone have an idea how to do that?
You cannot simply delete lines from a csv. Instead, you have to read it in and then write it back with the accepted values. The following code works:
import csv
pattern="3-5;7;10-16"
off = []  # 1-based row numbers to drop
for i in pattern.split(';'):
    if '-' in i:
        off += range(int(i.split('-')[0]),int(i.split('-')[1])+1)
    else:
        off += [int(i)]
with open('test.txt') as f:
    reader = csv.reader(f)
    reader = [','.join(item) for i,item in enumerate(reader) if i+1 not in off]
print(reader)
with open('input.txt', 'w') as f2:
    for i in reader:
        f2.write(i+'\n')
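The pattern parsing can also be pulled out into small functions, which makes it easier to check on its own. This is a sketch rather than a drop-in replacement: parse_rows and drop_rows are hypothetical names, and the input is a made-up list of strings instead of a file.

```python
def parse_rows(pattern):
    """Turn a pattern such as "3-5;7;10-16" into a set of row numbers."""
    rows = set()
    for part in pattern.split(';'):
        if '-' in part:
            start, end = part.split('-')
            rows.update(range(int(start), int(end) + 1))
        else:
            rows.add(int(part))
    return rows

def drop_rows(lines, pattern):
    """Return the lines whose 1-based position is not named in pattern."""
    skip = parse_rows(pattern)
    return [line for number, line in enumerate(lines, start=1)
            if number not in skip]

# Eighteen made-up lines, row1 .. row18.
kept = drop_rows(['row%d' % n for n in range(1, 19)], "3-5;7;10-16")
```

A set is used for the row numbers because membership tests on it stay fast however long the pattern gets.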
I have a for loop that prints 4 details:
deats = soup.find_all('p')
for n in deats:
    print n.text
The output is 4 printed lines.
Instead of printing, what I'd like to do is have each 'n' written to a different column in a .csv. Obviously, when I use a regular .write() it puts it in the same column. In other words, how would I make it write each iteration of the loop to the next column?
You would build the CSV row in a loop (or with a list comprehension). I will show the explicit loop for ease of reading; you can change it to a single list comprehension yourself.
row = []
for n in deats:
    row.append(n.text)
Now you have row ready to write to the .csv file using the writerow() method of a csv.writer() object.
Hi, try it like this:
import csv
# output.csv is the output file name! (Python 3 mode shown; use "wb" on Python 2)
csv_output = csv.writer(open("output.csv", "w", newline=""))
csv_output.writerow(["Col1","Col2","Col3","Col4"]) # Setting first row with all column titles
temp = []
deats = soup.find_all('p')
for n in deats:
    temp.append(str(n.text))
csv_output.writerow(temp)
You use the csv module for this:
import csv
with open('output.csv', 'w', newline='') as csvfile:  # Python 3 mode; use 'wb' on Python 2
    opwriter = csv.writer(csvfile, delimiter=',')
    opwriter.writerow([n.text for n in deats])
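The key point is that each writerow() call produces one row, and each element of the list passed to it becomes one column. A small sketch of the contrast, using a hard-coded list with made-up values in place of the BeautifulSoup results and an in-memory buffer in place of output.csv:

```python
import csv
import io

# Stand-in for [n.text for n in deats]; the values are made up.
details = ['Name', 'Address, with comma', 'Phone', 'Email']

buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
writer.writerow(details)            # one call, one row, four columns
for detail in details:
    writer.writerow([detail])       # one call per item: four rows, one column each

output = buffer.getvalue()
```

Note that the writer quotes the field containing a comma, which a hand-built ",".join() would not do.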
extra_stuff = "pie", "cake", "eat", "too"  # placeholder values
some_file.write(",".join(n.text for n in deats) + "," + ",".join(str(s) for s in extra_stuff))
Is that all you are looking for?