Hi so I loaded a JSON file into a list using the following code:
import json
postal_mtl = ['H9W','H4W','H9P','H3B','H3A','H2Z','H3H','H3G','H3X','H9J','H1B','H1G','H1H','H4X','H2Y','H9R','H3Z','H3Y']
data = []
with open('business.json',encoding="utf8") as f:
    for line in f:
        data.append(json.loads(line))
Now I am trying to find the number of restaurants in Montreal in this dataset (coming from Yelp). I tried the following code:
compteur3 = 0
for i in range(len(data)):
    if data[i]['postal_code'][0:3] in postal_mtl and 'Restaurants' in data[i]['categories']:
        compteur3 += 1
print(compteur3)
But I am getting an error saying "argument of type 'NoneType' is not iterable". I guess Python considers data[i]['categories'] to be a NoneType? Why is that? If I enter the following, I can see that it's clearly a string:
data[5]['categories']
'Shipping Centers, Couriers & Delivery Services, Local Services, Printing Services'
Now I just want to iterate over all the elements in my data list and find each line where we have the word 'Restaurants' (I got the Montreal stuff fixed)... Any idea ? Thanks !
Based on the code provided, it seems that the error is most likely coming from the if condition. Specifically, it is most likely coming from the statement 'Restaurants' in data[i]['categories']. Under the hood, Python is trying to iterate through data[i]['categories'] to see if 'Restaurants' is in it. If data[i]['categories'] is None, that would cause this error.
This may be caused by the JSON not being formatted the way you expected. Perhaps, if no categories were listed in the 'categories' field, a null was stored instead of an empty string or list. To check for this in your code, you can try the following:
compteur3 = 0
for i in range(len(data)):
    is_inmontreal = data[i]['postal_code'][0:3] in postal_mtl
    is_restaurant = data[i]['categories'] and 'Restaurants' in data[i]['categories']
    if is_inmontreal and is_restaurant:
        compteur3 += 1
print(compteur3)
Above, I simply split the condition into two parts. Functionally, this is the same as having the conditions on one line; it just makes the code slightly clearer. However, I also added a check in is_restaurant to see if data[i]['categories'] has a positive truth value. In effect, this checks that the value is not None and is not empty (an empty string or list). If you really want to be explicit, you can also do
is_restaurant = data[i]['categories'] is not None and 'Restaurants' in data[i]['categories']
Depending on how dirty the data is, you may need to go a little further than this and use exception handling. However, the above is just speculation as I do not know what the data looks like.
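If the data turns out to be messier than a few null categories, one option is to normalize each field before testing it. The following is a minimal sketch, assuming each record is a dict whose 'postal_code' and 'categories' values may be missing or None; the helper name count_restaurants and the sample records are made up for illustration.

```python
def count_restaurants(records, postal_prefixes):
    """Count records in the given postal areas whose categories mention 'Restaurants'."""
    count = 0
    for record in records:
        # `or ''` turns a missing key or a JSON null into an empty string.
        postal_code = record.get('postal_code') or ''
        categories = record.get('categories') or ''
        if postal_code[:3] in postal_prefixes and 'Restaurants' in categories:
            count += 1
    return count


# Made-up sample records, not taken from the Yelp dataset.
sample = [
    {'postal_code': 'H3B 1A1', 'categories': 'Restaurants, Pizza'},
    {'postal_code': 'H3B 2B2', 'categories': None},
    {'postal_code': 'M5V 3C3', 'categories': 'Restaurants'},
    {'categories': 'Restaurants'},
]
print(count_restaurants(sample, ['H3B', 'H3A']))
```

Using dict.get here avoids a KeyError for records that lack a field entirely, which the explicit "is not None" check alone would not cover.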
I am trying to take the value from the input and put it into the browser.find_elements_by_xpath("//div[@class='v1Nh3 kIKUG _bz0w']") function. However, the string formatting doesn't work, since the result is a list, hence it throws the AttributeError.
Does anyone know any alternatives to use with lists (possibly without iterating over each file)?
xpath_to_links = input('Enter the xpath to links: ')
posts = browser.find_elements_by_xpath("//div[@class='{}']").format(devops)
AttributeError: 'list' object has no attribute 'format'
It looks like the error comes from calling format in the wrong place: instead of operating on the string "//div[@class='{}']", it is called on the list returned by find_elements_by_xpath. Could you try replacing your code with one of the following lines?
posts = browser.find_elements_by_xpath("//div[@class='{}']".format(devops))
posts = browser.find_elements_by_xpath(f"//div[@class='{devops}']")
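The difference is only in where the closing parenthesis sits: the string has to be formatted before it is handed to the lookup. Here is a small sketch that runs without a browser. fake_find_elements is a hypothetical stand-in for find_elements_by_xpath that simply returns a list, and the @class spelling assumes an attribute test was intended.

```python
def fake_find_elements(xpath):
    # Hypothetical stand-in for the Selenium call: it returns a list, like the real method.
    return [xpath]


class_name = 'v1Nh3 kIKUG _bz0w'

# Format the string first, then pass the finished XPath to the lookup.
xpath = "//div[@class='{}']".format(class_name)
posts = fake_find_elements(xpath)
print(posts)

# Calling .format on the returned list reproduces the AttributeError.
try:
    fake_find_elements("//div[@class='{}']").format(class_name)
except AttributeError as exc:
    print(exc)
```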
I am currently writing a program which uses the Companies House API to return a JSON file containing information about a certain company.
I am able to retrieve the data easily using the following commands:
r = requests.get('https://api.companieshouse.gov.uk/company/COMPANY-NO/filing-history', auth=('API-KEY', ''))
data = r.json()
With that information I can do an awful lot; however, I've run into a problem which I was hoping you guys could possibly help me with. What I aim to do is go through every nested entry in the JSON file and check whether the values of certain keys match certain criteria. If the values of two keys match, then other code is executed.
One of the keys is the date of an entry, and I would like to ignore results that are older than a certain date, I have attempted to do this with the following:
date_threshold = datetime.date.today() - datetime.timedelta(days=30)
for each in data["items"]:
    date = ['date']
    type = ['type']
    if date < date_threshold and type is "RM01":
        print("wwwwww")
In case it isn't clear, what I'm attempting to do (albeit very badly) is assign each of the entries to a variable, which then gets tested against certain criteria.
However, this doesn't work; Python raises a type error:
TypeError: unorderable types: list() < datetime.date()
Which makes me think the date is being stored as a string, and so I can't compare it to the datetime value set earlier, but when I check the API documentation (https://developer.companieshouse.gov.uk/api/docs/company/company_number/filing-history/filingHistoryItem-resource.html), it says clearly that the 'date' entry is returned as a date type.
What am I doing wrong? It's very clear that I'm extremely new to Python, given what I presume is the atrocity of my code, but in my head it seems to make at least a little sense. In case none of this is clear, I basically want to go through all the entries in the JSON file, and if the date and type match a certain description, then other code can be executed (in this case I have just used random text).
Any help is greatly appreciated! Let me know if you need anything cleared up.
:)
EDIT
After tweaking my code to the below:
for each in data["items"]:
    date = each['date']
    type = each['type']
    if date is '2016-09-15' and type is "RM01":
        print("wwwwww")
The code executes without any errors, but the words aren't printed, even though I know there is an entry in the json file with that exact date, and that exact type, any thoughts?
SOLUTION:
Thanks to everyone for helping me out. I had made a couple of very basic errors; the code that works as expected is below:
for each in data["items"]:
    date = each['date']
    typevariable = each['type']
    if date == '2016-09-15' and typevariable == "RM01":
        print("wwwwww")
This prints the word "wwwwww" 3 times, which is correct seeing as there are 3 entries in the JSON that fulfil those criteria.
You need to first convert your date variable to a date type using datetime.datetime.strptime() before comparing it with date_threshold.
You are comparing a list-type variable, date (date = ['date'] creates a one-element list), with date_threshold, which is a datetime.date.
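Putting both points together: the loop needs to read the fields from each entry, parse the date string, and compare with == rather than is. This is a minimal sketch, assuming the API returns dates as 'YYYY-MM-DD' strings (as the value '2016-09-15' in the question suggests) and that entries on or after the threshold should be kept. The function name and the sample data are illustrative only.

```python
import datetime


def recent_items(items, wanted_type, threshold):
    """Return the items of wanted_type dated on or after threshold."""
    matches = []
    for item in items:
        # Parse the JSON date string into a datetime.date so it can be compared.
        item_date = datetime.datetime.strptime(item['date'], '%Y-%m-%d').date()
        # Use == to compare values; `is` tests object identity.
        if item_date >= threshold and item['type'] == wanted_type:
            matches.append(item)
    return matches


# Made-up entries standing in for data["items"].
items = [
    {'date': '2016-09-15', 'type': 'RM01'},
    {'date': '2016-07-01', 'type': 'RM01'},
    {'date': '2016-09-20', 'type': 'AA'},
]
threshold = datetime.date(2016, 9, 30) - datetime.timedelta(days=30)
print(recent_items(items, 'RM01', threshold))
```

In real use the threshold would come from datetime.date.today(), as in the question; a fixed date is used here so the result is reproducible.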
Code is importing another file, which is working perfectly.
But, there is a problem in the line where I try to import the csv file, with a column called 'account key', returning the TypeError above.
import file_import as fi
Function for collectively finding data necessary from a csv file.
def unique_students(csv_file):
    unique_students_list = set()
    for information in csv_file:
        unique_students_list.add(csv_file["account_key"])
    return len(unique_students_list)
#enrollment_num_rows = len(fi.enrollments)
#engagement_num_rows = len(fi.daily_engagement)
#submission_num_rows = len(fi.project_submissions)
#enrollment_num_unique_students = unique_students(fi.enrollments)
#engagement_num_unique_students = unique_students(fi.daily_engagement)
#submission_num_unique_students = unique_students(fi.project_submissions)
csv_file["account_key"]
Lists expect an integer index. As far as I know, only dictionaries accept string indices.
I'm not entirely sure what this is supposed to do; I think your logic is flawed. You bind information in the for loop, then never use it. Even if the list did accept a string index, all it would do is populate the set with the same information over and over, since the for loop body stays the same on every iteration. This would only work if you were expecting csv_file to be a custom container type that had side effects when indexed (like advancing some internal counter).
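If the intent is to count distinct account keys, a minimal sketch might look like this. It assumes csv_file is a list of dicts (for example, rows produced by csv.DictReader) that each have an 'account_key' field; the sample rows are made up.

```python
def unique_students(rows):
    """Return the number of distinct account keys in rows."""
    unique_keys = set()
    for row in rows:
        # Index the current row (a dict), not the list that holds the rows.
        unique_keys.add(row['account_key'])
    return len(unique_keys)


# Made-up rows standing in for fi.enrollments.
enrollments = [
    {'account_key': '448', 'status': 'canceled'},
    {'account_key': '448', 'status': 'current'},
    {'account_key': '700', 'status': 'current'},
]
print(unique_students(enrollments))
```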
Warning: I'm a total newbie; apologies if I didn't search for the right thing before submitting this question. I found lots on how to ignore errors, but nothing quite like what I'm trying to do here.
I have a simple script that I'm using to grab data off a database, parse some fields apart, and re-write the parsed values back to the database. Multiple users are submitting to the database according to a delimited template, but there is some degree of non-compliance, meaning sometimes the string won't contain all/any delimiters. My script needs to be able to handle those instances by throwing them out entirely.
I'm having trouble throwing out non-compliant strings, rather than just ignoring the errors they raise. When I've tried try-except-pass, I've ended up getting errors when my script attempts to append parsed values into the array I'm ultimately writing back to the db.
Originally, my script said:
def parse_comments(comments):
    parts = comments.split("||")
    if len(parts) < 20:
        raise ValueError("Comment didn't have enough || delimiters")
    return Result._make([parts[i].strip() for i in xrange(2, 21, 3)])
Fully compliant uploads would append Result to an array and write back to db.
I've tried try/except:
def parse_comments(comments):
    parts = comments.split("||")
    try:
        Thing._make([parts[i].strip() for i in xrange(2, 21, 3)])
    except:
        pass
    return Thing
But I end up getting an error when I try to append the parsed values to an array -- specifically TypeError: 'type' object has no attribute '__getitem__'
I've also tried:
def parse_comments(comments):
    parts = comments.split("||")
    if len(parts) >= 20:
        Thing._make([parts[i].strip() for i in xrange(2, 21, 3)])
    else:
        pass
    return Thing
but to no avail.
tl;dr: I need to parse stuff and append parsed items. If a string can't be parsed how I want it, I want my code to ignore that string entirely and move on.
But I end up getting an error when I try to append the parsed values to an array -- specifically TypeError: 'type' object has no attribute '__getitem__'
Because Thing means the Thing class itself, not an instance of that class.
You need to think more clearly about what you want to return when the data is invalid. It may be the case that you can't return anything directly usable here, so that the calling code has to explicitly check.
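One way to make the calling code check explicitly is to return None for non-compliant input and skip those results. Below is a sketch in Python 3 (so range replaces xrange), assuming Result is a namedtuple with seven fields; the field names are placeholders because the real definition is not shown. Note that reading index 20 needs at least 21 parts, so the length check here uses 21 rather than 20.

```python
from collections import namedtuple

# Placeholder field names; the real Result definition is not shown in the question.
Result = namedtuple('Result', 'a b c d e f g')


def parse_comments(comments):
    """Return a Result, or None when the string lacks enough '||' delimiters."""
    parts = comments.split('||')
    if len(parts) < 21:
        return None
    # Return the instance built by _make, not the Result class itself.
    return Result._make(parts[i].strip() for i in range(2, 21, 3))


def parse_all(all_comments):
    """Parse every comment and keep only the compliant ones."""
    parsed = (parse_comments(c) for c in all_comments)
    return [result for result in parsed if result is not None]


# A made-up compliant string with 21 parts, plus one non-compliant string.
good = '||'.join(str(n) for n in range(21))
results = parse_all([good, 'no delimiters here'])
print(results)
```

Returning None keeps the decision about what to do with bad rows in the caller, which can count, log, or silently drop them.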
I am not sure I understand everything you want to do. But I think you are not catching the error at the right place. You said yourself that it arose when you wanted to append the value to an array. So maybe you should do:
try:
    # append the parsed values to an array
except TypeError:
    pass
You should name the exception type to catch after except; otherwise it will catch any exception, even a user's Ctrl+C, which raises a KeyboardInterrupt.
I wrote this code in Python, but it has many errors. Could you check it for me, as I'm new to Python?
for i in range(phones):
    pho = int(raw_input("Phone Number %d : " % (i+1)))
    phNums.append(pho)
    for name in range(phot):
        name1 = '{0}/phone.txt'.format(pathname)
        file = open(name1, 'w')
        file.write = (pho)
        file.close()
The first issue is that when I run the script, I get this error:
Traceback (most recent call last):
file.write = (pho)
AttributeError: 'file' object attribute 'write' is read-only
The script should work as follows:
First, the user gives the number of prefixes and the count of progs.
The script computes phot, which is progs / phones.
Then the user enters each phone number.
The script takes the first number and writes it to the text file "phone.txt" in folders 1, 2, 3, ... until it reaches phot folders, then moves to the next number and continues writing to the next phot folders, and so on.
Please check what the issue is with the code.
phNums.append(pho) adds ints to your list phNums. You then try to iterate over phNums[i], which is a single int from your list; an int cannot be iterated over, which is why you get the error.
Just iterate over each element directly:
for i in phNums:
    name = '{}phone.txt'.format(i)  # i is an int, so use str.format to build the string
Or looking at your code, you may have meant to iterate over a list of names that is not provided in your code.
This is a simple error that has already been answered sufficiently, so rather than give you a direct answer, I want to direct you through my debugging process. Hopefully this will save time in the future.
First step is to check where the error occurred:
for name in phNums[i]:
Ok, now what was the error?
TypeError: 'int' object is not iterable
The "for" statement will iterate over a given iterable and is generally like this:
for item in iterable:
Now we know which part of the line to look at. What type is phNums[i] ?
phNums is defined earlier as a list, but with phNums[i], we want to find the type of the items in that list. It appears that the only time the script adds to the list is when appending pho, which is an int.
There's the issue, you're trying to iterate over an int item! The error makes sense now, but how should we fix it?
Note: I tried going further, but your description is unclear. Do you want the folder structure to be:
1/phone.txt
2/phone.txt
3/phone.txt
where phone.txt contains the phone number in each?
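If that is the intended layout, a minimal Python 3 sketch could look like the following. It assumes each phone number is written to a fixed number of consecutively numbered folders (the phot value from the question), and it calls write() instead of assigning to it, since the assignment is what triggered the read-only AttributeError. The function name, the sample numbers, and the temporary base directory are all illustrative.

```python
import os
import tempfile


def write_phone_files(base_dir, phone_numbers, folders_per_number):
    """Write each number to phone.txt in its own run of numbered folders."""
    folder_index = 1
    for number in phone_numbers:
        for _ in range(folders_per_number):
            folder = os.path.join(base_dir, str(folder_index))
            os.makedirs(folder, exist_ok=True)
            # write() is a method: call it with a string rather than assigning to it.
            with open(os.path.join(folder, 'phone.txt'), 'w') as f:
                f.write(str(number))
            folder_index += 1
    # Number of folders written in total.
    return folder_index - 1


# Made-up numbers, written under a throwaway temporary directory.
base_dir = tempfile.mkdtemp()
total = write_phone_files(base_dir, [5550100, 5550101], 2)
print(total, sorted(os.listdir(base_dir)))
```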