I have the following code:
url = "https://www.reformagkh.ru/opendata/export/"
regions = ["150", "101"]
csv_files = []
for region in regions:
result = requests.get(url, params={"t":region})
zf = ZipFile(BytesIO(result.content))
for filename in zf.namelist():
if filename.endswith(".csv"):
file = zf.open(filename)
csv_files.append(file)
if len(csv_files) == 1:
reader = csv.reader(TextIOWrapper(file, 'utf-8'))
for row in reader:
print(row)
else:
print("Error")
I have two links where some zipped CSV files are located, and I need to open and read them. The main question is how to work with a list of URLs and open them one by one.
When I try to debug and fix it, I get a 400 error and a problem with the loop. Could somebody give me advice on how to handle it?
These are the links I need to open and handle:
['https://www.reformagkh.ru/opendata/export/150',
'https://www.reformagkh.ru/opendata/export/101']
You need to prepare the URL in the loop instead of passing the region as params.
Use f-strings to prepare the URL on Python 3.6+:
for region in regions:
    url_cur = f"{url}{region}"
    result = requests.get(url_cur)
Use format() if you are on a Python version older than 3.6:
for region in regions:
    url_cur = "{}{}".format(url, region)
    result = requests.get(url_cur)
You also need to create the csv_files list anew for each URL.
The complete code would be:
url = "https://www.reformagkh.ru/opendata/export/"
regions = ["150", "101"]
for region in regions:
cur_url = f"{url}{region}"
result = requests.get(cur_url)
zf = ZipFile(BytesIO(result.content))
csv_files = [] # create a new list everytime
for filename in zf.namelist():
if filename.endswith(".csv"):
file = zf.open(filename)
csv_files.append(file)
if len(csv_files) == 1:
reader = csv.reader(TextIOWrapper(file, 'utf-8'))
for row in reader:
print(row)
else:
print("Error")
regions = ["150", "101"]
csv_files = []
for region in regions:
    url = "https://www.reformagkh.ru/opendata/export/%s" % region
    result = requests.get(url)
    zf = ZipFile(BytesIO(result.content))
    for filename in zf.namelist():
        if filename.endswith(".csv"):
            file = zf.open(filename)
            csv_files.append(file)
    if len(csv_files) == 1:
        reader = csv.reader(TextIOWrapper(file, 'utf-8'))
        for row in reader:
            print(row)
    else:
        print("Error")
I think it is much easier with %s. I often use the same method.
Related
How do I get the screen names from a list of Twitter IDs? I have the IDs saved in a pandas dataframe, and there are 38194 IDs that I wish to match to their screen names so I can do a network analysis. I am using Python, but I am quite new to coding, so I do not know if this is even possible. I have tried the following:
import csv
import time

# friend_list is assumed to be a comma-separated string of Twitter IDs,
# and api an authenticated tweepy API object
myIds = friend_list
if myIds:
    myIds = myIds.replace(' ', '')
    myIds = myIds.split(',')
    # Set a new list object
    myHandleList = []
    i = 0
    # Loop through the list of usernames
    for idnumber in myIds:
        u = api.get_user(myIds[i])
        uid = u.screen_name
        myHandleList.append(uid)
        i = i + 1
    # Print the lists
    print('Twitter-Ids', myIds)
    print('Usernames', myHandleList)
    # Set a filename based on the current time
    csvfilename = "csvoutput-" + time.strftime("%Y%m%d-%H%M%S") + ".csv"
    print('We also outputted a CSV-file named ' + csvfilename + ' to your file parent directory')
    with open(csvfilename, 'w') as myfile:
        wr = csv.writer(myfile, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
        wr.writerow(['username', 'twitter-id'])
        j = 0
        for handle in myHandleList:
            writeline = myHandleList[j], myIds[j]
            wr.writerow(writeline)
            j = j + 1
else:
    print('The input was empty')
Updating your loop, as I believe you are pretty close.
myHandleList = []
myIds = ['1031291359', '960442381']
for idnumber in myIds:
    u = api.get_user(idnumber)
    myHandleList.append(u.screen_name)
print(myHandleList)
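As a side note (my sketch, not part of the fix above): with 38194 IDs, calling get_user once per ID will hit rate limits quickly. If you are on tweepy 3.x, lookup_users accepts up to 100 user IDs per request, so batching looks roughly like this:

myHandleList = []
# process the IDs in batches of 100, the maximum lookup_users accepts per call
for start in range(0, len(myIds), 100):
    batch = myIds[start:start + 100]
    for u in api.lookup_users(user_ids=batch):
        myHandleList.append(u.screen_name)
print(myHandleList)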
Sorry if this has been asked, but is it possible to skip a column when writing to a csv file?
Here is the code I have:
with open("list.csv","r") as f:
reader2 = csv.reader(f)
for row in reader2:
url = 'http://peopleus.intelius.com/results.php?ReportType=33&qi=0&qk=10&qp='+row
req = urllib.request.Request(url)
response = urllib.request.urlopen(req)
html = response.read()
retrieved_name = b'class="singleName">(.*?)<\/h1'
retrieved_number = b'<div\sclass="phone">(.*?)<\/div'
retrieved_nothing = b"(Sorry\swe\scouldn\\'t\sfind\sany\sresults)"
if re.search(retrieved_nothing,html):
noth = re.search(retrieved_nothing.decode('utf-8'),html.decode('utf-8')).group(1)
add_list(phone_data, noth)
else:
if re.search(retrieved_name,html):
name_found = re.search(retrieved_name.decode('utf-8'),html.decode('utf-8')).group(1)
else:
name_found = "No name found on peopleus.intelius.com"
if re.search(retrieved_number,html):
number_found = re.search(retrieved_number.decode('utf-8'),html.decode('utf-8')).group(1)
else:
number_found = "No number found on peopleus.intelius.com"
add_list(phone_data, name_found, number_found)
with open('column_skip.csv','a+', newline='') as mess:
writ = csv.writer(mess, dialect='excel')
writ.writerow(phone_data[-1])
time.sleep(10)
Assuming that there is data in the first three rows of column_skip.csv, can I have my program start writing its info in column 4?
Yeah, don't use the csv.writer method; write it as a simple file write operation:
file_path = 'your_csv_file.csv'
with open(file_path, 'w') as fp:
    # following are the data you want to write to csv
    fp.write("%s, %s, %s" % ('Name of col1', 'col2', 'col4'))
    fp.write("\n")
I hope this helps...
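If you would rather keep csv.writer, a minimal sketch (assuming phone_data holds (name, number) tuples as in the question) is to pad each row with empty cells so the data starts at column 4:

import csv

with open('column_skip.csv', 'a+', newline='') as mess:
    writ = csv.writer(mess, dialect='excel')
    # three empty placeholder cells, then the real data starting at column 4
    writ.writerow(['', '', ''] + list(phone_data[-1]))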
I'm attempting to get a series of weather reports from a website. I have the code below, which creates the needed URLs for the XMLs I want. What would be the best way to save the returned XMLs with different names?
with open('file.csv') as csvfile:
    towns_csv = csv.reader(csvfile, dialect='excel')
    for rows in towns_csv:
        x = float(rows[2])
        y = float(rows[1])
        url = "http://api.met.no/weatherapi/locationforecast/1.9/?"
        lat = "lat=" + format(y)
        lon = "lon=" + format(x)
        text = url + format(lat) + ";" + format(lon)
I have been saving single XMLs with this code:
response = requests.get(text)
xml_text = response.text
winds = bs4.BeautifulSoup(xml_text, "xml")
f = open('test.xml', "w")
f.write(winds.prettify())
f.close()
The first column of the CSV file has city names in it; I would ideally like to use those names to save each XML file as it is created. I'm sure another for loop would do it, I'm just not sure how to write it.
Any help would be great, thanks again Stack.
You have done most of the work already. Just use rows[0] as your filename. Assuming rows[0] is 'mumbai', then rows[0]+'.xml' will give you 'mumbai.xml' as the filename. You might want to check if city names have spaces which need to be removed, etc.
import csv
import bs4
import requests

with open('file.csv') as csvfile:
    towns_csv = csv.reader(csvfile, dialect='excel')
    for rows in towns_csv:
        x = float(rows[2])
        y = float(rows[1])
        url = "http://api.met.no/weatherapi/locationforecast/1.9/?"
        lat = "lat=" + format(y)
        lon = "lon=" + format(x)
        text = url + format(lat) + ";" + format(lon)
        response = requests.get(text)
        xml_text = response.text
        winds = bs4.BeautifulSoup(xml_text, "xml")
        f = open(rows[0] + '.xml', "w")
        f.write(winds.prettify())
        f.close()
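If the city names may contain spaces or other characters that are awkward in filenames, a small sketch of sanitizing rows[0] first (the pattern below is an assumption; adjust it to your data):

import re

safe_name = re.sub(r'[^\w\-]', '_', rows[0])  # keep letters, digits, underscore, hyphen
f = open(safe_name + '.xml', "w")
f.write(winds.prettify())
f.close()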
I have a csv file partList.csv with strings that I want to use to search through a larger group of txt files. For some reason when I use the direct string 'L 99' I get a result. When I load the string L 99 from the csv I get no result.
partList.csv only contains cells in the first column with part numbers, one of which is L-99. txt_files_sample\5.txt is a text document that at some point contains the string L 99
My code:
def GetPartList():
partList = []
f = open('partList.csv', 'rb')
try:
reader = csv.reader(f)
for row in reader:
part = row[0].replace('-',' ').strip()
partList.append(part)
finally:
f.close()
return partList
def FindFileNames(partList):
i = 0
files = []
for root, dirs, filenames in os.walk('txt_files_sample'):
for f in filenames:
document = open(os.path.join(root, f), 'rb')
for line in document:
if partList[i] in line:
#if 'L 99' in line:
files.append(f)
break
i = i + 1
return files
print FindFileNames(GetPartList())
The code, as it stands above produces:
>>> []
If I uncomment if 'L 99' in line: and comment out if partList[i] in line: I get the result:
>>> ['5.txt']
So, using Martijn's input, I discovered the issue was how I looped over partList. Rewriting FindFileNames() worked:
def FindFileList(partList):
    files = []
    for root, dirs, filenames in os.walk('txt_files'):
        for f in filenames:
            a = 0
            document = open(os.path.join(root, f), 'rb')
            for line in document:
                if a == 1:
                    break
                for partNo in partList:
                    if partNo in line:
                        files.append(f)
                        a = 1
            document.close()
    return files
With the updated code I got a result that was an accurate list of filenames.
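For what it's worth, the same check can be written more compactly with any(); a sketch assuming the same partList and directory layout:

import os

def FindFileList(partList):
    files = []
    for root, dirs, filenames in os.walk('txt_files'):
        for f in filenames:
            with open(os.path.join(root, f), 'rb') as document:
                # stop scanning this file as soon as any part number matches a line
                if any(partNo in line for line in document for partNo in partList):
                    files.append(f)
    return files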
I am trying to iterate through several CSV files in a directory, grab a particular cell (same cell location) from each CSV file (cell location found when opened in Excel), and then post all the similar cells in a single CSV or xls file, one after the other.
I have written the code below (with some researched help), but I am just iterating over the first CSV file in my list and printing the same value once for each CSV file in the list. Could anybody point me in the right direction?
Here's my poor attempt!
import xlwt
import xlrd
import csv
import glob
import os
files = ['1_IQ_QTA.csv','2_IQ_QTA.csv','3_IQ_QTA.csv','4_IQ_QTA.csv']
n = 0
row = 0
filename = ('outputList.csv', 'a')
fname = files[n]
workbookr = xlrd.open_workbook(fname)
sheetr = workbookr.sheet_by_index(0)
workbookw = xlwt.Workbook()
sheetw = workbookw.add_sheet('test')
while n < len(files):
    fname = files[n]
    workbookr = xlrd.open_workbook(fname[n])
    data = [sheetr.cell_value(12, 1) for col in range(sheetr.ncols)]
    for index, value in enumerate(data):
        sheetw.write(row, index, value)
    workbookw.save('outputList.csv')
    row = row + 1
    n = n + 1
workbookw.save('outputList.csv')
My code is still a bit messy, I may have leftover code from my various attempts!
Thanks
MikG
Assuming you are just trying to make a CSV file of the same cell from each file: if you had 4 files, your output file will have 4 entries.
files = ['1_IQ_QTA.csv','2_IQ_QTA.csv','3_IQ_QTA.csv','4_IQ_QTA.csv']
n = 0
row = 0
outputfile = open('outputList.csv', 'w')
cellrow = 12 #collect the cell (12, 1) from each file and put it in the output list
cellcolumn = 1
while n < len(files):
    fname = files[n]
    currentfile = open(fname, 'r')
    for i in range(cellrow):
        currentrow = currentfile.readline()
    # print currentrow #for testing
    columncnt = 0
    currentcell = ''
    openquote = False
    for char in currentrow:
        if char == '"' and not openquote:
            openquote = True
        elif char == '"' and openquote:
            openquote = False
        elif char == ',' and not openquote:
            columncnt += 1
            if columncnt == cellcolumn:
                cellvalue = currentcell
                # print cellvalue #for testing
            currentcell = ''
        else:
            currentcell += char
    outputfile.write(cellvalue + ',')
    currentfile.close()
    n += 1
outputfile.close()
It seemed to me that since you already had a CSV it would be easier to deal with as a regular file and parse through to find the right information, plus nothing to import. Happy coding!
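For comparison, a minimal sketch of the same cell-grabbing using the csv module, which handles quoted commas for you (assuming each file has at least 13 rows and 2 columns, and fits in memory):

import csv

files = ['1_IQ_QTA.csv', '2_IQ_QTA.csv', '3_IQ_QTA.csv', '4_IQ_QTA.csv']
cells = []
for fname in files:
    with open(fname, 'r') as f:
        rows = list(csv.reader(f))
        cells.append(rows[12][1])  # the cell at row 12, column 1 (zero-indexed)

with open('outputList.csv', 'w') as out:
    csv.writer(out).writerow(cells)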
I think you have an error at this line in the while loop:
workbookr = xlrd.open_workbook(fname[n])
must be:
workbookr = xlrd.open_workbook(fname)
otherwise your workbookr remains as you set it before outside the loop:
fname = files[n]
workbookr = xlrd.open_workbook(fname)
which is the first file in your list.
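One more thing worth noting (my addition, not part of the answer above): sheetr is also assigned outside the loop, so it would keep pointing at the first workbook's sheet even after workbookr is fixed. A sketch of the corrected start of the loop:

while n < len(files):
    fname = files[n]
    workbookr = xlrd.open_workbook(fname)
    sheetr = workbookr.sheet_by_index(0)  # refresh the sheet for each file
    # ... rest of the loop unchanged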
Since they are just csv files, there is no need for the excel libraries.
#!/usr/bin/env python
import argparse, csv

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='merge csv files on field')
    parser.add_argument('--version', action='version', version='%(prog)s 1.0')
    parser.add_argument('infile', nargs='+', type=str, help='list of input files')
    parser.add_argument('--col', type=int, default=0, help='Column to grab')
    parser.add_argument('--row', type=int, default=0, help='Row to grab')
    parser.add_argument('--out', type=str, default='temp.csv', help='name of output file')
    args = parser.parse_args()

    data = []
    for fname in args.infile:
        with open(fname, 'rb') as df:
            reader = csv.reader(df)
            for index, line in enumerate(reader):
                if index == args.row:
                    data.append([line[args.col]])  # one grabbed cell per output row
                    break

    writer = csv.writer(open(args.out, 'wb'), dialect='excel')
    writer.writerows(data)
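Assuming the script above is saved as merge_csv.py (a hypothetical name), it could be run against the files from the question like this:

python merge_csv.py 1_IQ_QTA.csv 2_IQ_QTA.csv 3_IQ_QTA.csv 4_IQ_QTA.csv --row 12 --col 1 --out outputList.csv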