Google's radarsearch API results - python
I'm trying to geolocate all the businesses related to a keyword in my city. I first use the Radar Search API to retrieve the Place IDs, and then the Places API to get more information about each Place ID (such as the name or the formatted address).
In my first approach I split the city into 9 circles, each with a 22 km radius, avoiding rural zones where no businesses are expected. This way I obtained approximately 150 businesses (after removing duplicates caused by the overlapping circles). This result is not reliable, because the company's official website states there are 245.
To retrieve ALL the businesses, I then split the city into circles of 10 km radius, so that roughly 50 pairs of coordinates cover the whole city, this time including both rural and non-rural zones. Surprisingly, I now obtain only 81 businesses! How is this possible?
I'm storing all the information in separate dictionaries, and I noticed that the amount of data in each dictionary grows with the radius and is always the same for a fixed radius.
Apart from the previous question, is there any way to limit the number of results each request yields?
The code I'm using is the following:
import json
import urllib2

dict1 = {}
radius = 20000
keyword = 'keyword'
key = YOUR_API_KEY
url_base = "https://maps.googleapis.com/maps/api/place/radarsearch/json?"
list_dicts = []
# lon_txt and lat_txt hold the longitudes/latitudes of the circle centres
for i, (lo, la) in enumerate(zip(lon_txt, lat_txt)):
    url = url_base + 'location=' + str(lo) + ',' + str(la) + '&radius=' + str(radius) + '&keyword=' + keyword + '&key=' + key
    response = urllib2.urlopen(url)
    table = json.load(response)
    if table['status'] == 'OK':
        for j, line in enumerate(table['results']):
            temp = {j: line['place_id']}
            dict1.update(temp)
        list_dicts.append(dict1)
    else:
        pass
I finally managed to solve this problem.
The issue was that the dictionary must be re-initialized on each loop iteration; otherwise the same dict object is appended to the list over and over. Now it stores all the information and I retrieve what I wanted from the beginning.
import json
import urllib2

dict1 = {}
radius = 20000
keyword = 'keyword'
key = YOUR_API_KEY
url_base = "https://maps.googleapis.com/maps/api/place/radarsearch/json?"
list_dicts = []
for i, (lo, la) in enumerate(zip(lon_txt, lat_txt)):
    url = url_base + 'location=' + str(lo) + ',' + str(la) + '&radius=' + str(radius) + '&keyword=' + keyword + '&key=' + key
    response = urllib2.urlopen(url)
    table = json.load(response)
    if table['status'] == 'OK':
        for j, line in enumerate(table['results']):
            temp = {j: line['place_id']}
            dict1.update(temp)
        list_dicts.append(dict1)
        dict1 = {}  # re-initialize so the next circle gets its own dict
    else:
        pass
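Since the circles overlap, the same place can appear in several responses. As a small follow-up sketch (reusing the lon_txt, lat_txt, radius, keyword, key and url_base variables from above), collecting the Place IDs into a set drops duplicates across circles automatically:

import json
import urllib2

place_ids = set()
for lo, la in zip(lon_txt, lat_txt):
    url = (url_base + 'location=' + str(lo) + ',' + str(la) +
           '&radius=' + str(radius) + '&keyword=' + keyword + '&key=' + key)
    table = json.load(urllib2.urlopen(url))
    if table['status'] == 'OK':
        # A set keeps each place_id only once, however many circles return it
        place_ids.update(line['place_id'] for line in table['results'])

print len(place_ids)  # number of distinct businesses found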
Related
Making a dictionary out of API data
In the first part of the project, using NeuroMorpho.org, I found the ID numbers of the neurons I need. Now I want to download certain morphometric data for those neurons and save it in a dictionary. I used the following code:

import requests

n = [100,101,1016,1019,102,1020,1023,1026,1028,1029,103,1030,1034,1035,104,1040,108828,108829,108830,108831,108838,108839,108840,108841,111008,111009,111010,111011,111012,111013,111014,111015,111016,111017,111018,111019,111020,111021,111022,111023,111024,111025,111026,111027,111028,112460,112461,112462,112463,112464]

rat_morphometry = []
for i in n:
    url = "http://neuromorpho.org/api//morphometry/id/" + str(i)
    response = requests.get(url)
    json_data = response.json()
    rat_morphometry.append(json_data)
print(rat_morphometry)

All of the data gets printed (it takes time). To start, I tried to extract just the surface data:

import pandas as pd

df_dict = {'Surface': list()}
for row in rat_morphometry:
    df_dict['Surface'].append(str(row['surface']))
rat_morphometry_df = pd.DataFrame(df_dict)
print(rat_morphometry)

However, I'm not getting anything in the form of a DataFrame; basically, I'm getting the same output as in the first section. Additionally, I'm now only getting info for about 8 neurons. I'm afraid this is due to NeuroMorpho limitations on morphometric data (they state: "The number of the neuron measurements listed per page. Please note that the maximum size allowed per page is 500."). The problem is that the first section doesn't show me the number of pages I could loop through to get the info for all neurons. Thanks in advance if anyone can help!
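For what it's worth, a minimal sketch of the DataFrame step, assuming each element of rat_morphometry is a dict with a 'surface' field as in the code above; note that it prints the DataFrame rather than the raw list:

import pandas as pd

# Build one 'Surface' value per neuron record collected above
df_dict = {'Surface': [row.get('surface') for row in rat_morphometry]}
rat_morphometry_df = pd.DataFrame(df_dict)
print(rat_morphometry_df)  # print the DataFrame itself, not rat_morphometry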
Python geocoder limit
I have postal code data in Excel, which I read into Python as a list. I use the geocoder library to get the latitude and longitude for each postal code so I can put them on a map later on.

g = geocoder.google('1700002')
g.latlng

g.latlng gives me a list with [latitude, longitude] in it. Since the geocoder takes strings only, I converted the values from float to int to get rid of the trailing .0 (e.g. 133.0 becomes 133), and then converted them to strings.

yubin_frame = yubin['yubin']  # post data
# 1st convert to int to get rid of the float part
yubin_list_int = map(int, yubin_list)
# then convert everything to string
yubin_list_str = map(str, yubin_list_int)

I made this for loop to build a list of latitude/longitude pairs:

# create a new list that includes all the data in Yubin_zahyou
Yubin_zahyou = []
for i in range(len(yubin_list_str)):
    Yubin_zahyou.append(geocoder.google(yubin_list_str[i]).latlng)

My problem is that I have nearly 30,000 entries and geocoder returns results for only about 2,500 of them! Does this mean geocoder has a limit, or did I make a mistake somewhere?
Yes, it has a rate limit, as documented under Providers in https://github.com/DenisCarriere/geocoder and at https://developers.google.com/maps/documentation/geocoding/usage-limits. For Google, it only allows about 2,500 requests per day.
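As an illustration only (the daily quota still applies, so throttling alone will not get all 30,000 postal codes through in one day), a sketch that spaces out the requests and keeps whatever geocoder returns:

import time
import geocoder

def geocode_all(codes, delay=0.2):
    # Geocode each postal code string, pausing between requests
    # to avoid hammering the service.
    results = []
    for code in codes:
        g = geocoder.google(code)
        results.append(g.latlng)  # may be empty if the lookup failed
        time.sleep(delay)
    return results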
Python find average of element in list with multiple elements
I have a ticker that grabs current information for multiple elements and adds it to a list in the format trade_list.append([[trade_id, results]]). Say we're tracking trade_ids 4555, 5555 and 23232: the trade_list keeps ticking away, adding their results to the list, and I then want to find the average of each trade_id's results individually. The code works roughly like this (pseudocode):

# Find accounts
for a in accounts:
    # find open trades of the account
    for t in range(len(trades)):
        # do some math
        trades_list.append(trade_id, result)
    avernum = 0
    average = []
    for r in range(len(trades_list)):
        average.append(trades_list[r][1])  # This is the value attached to the trade_id
        avernum += 1
    results = float(sum(average) / avernum)
    results_list.append([[trade_id, results]])

This fills up really quickly. This is after two ticks:

print(results_list)
[[[53471, 28.36432]], [[53477, 31.67835]], [[53474, 32.27664]], [[52232, 1908.30604]], [[52241, 350.4758]], [[53471, 28.36432]], [[53477, 31.67835]], [[53474, 32.27664]], [[52232, 1908.30604]], [[52241, 350.4758]]]

These averages move and change very quickly. I want to use results_list to track and watch them, and then compare previous averages to current ones. I was thinking something like:

for r in range(len(results_list)):
    if results_list[r][0] == trade_id:
        restick.append(results_list[r][1])
resnum = len(restick)
if restick[resnum] > restick[resnum-1]:
    # do fancy things
Here is some short code that does what I think you have described, although I might have misunderstood. It does exactly what you say: select everything that has a certain trade_id and return its average.

TID_INDEX = 0
DATA_INDEX = 1

def id_average(t_id, arr):
    filt_arr = [i[DATA_INDEX] for i in arr if i[TID_INDEX] == t_id]
    return sum(filt_arr) / len(filt_arr)
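A quick usage example, assuming the entries have been flattened to plain [trade_id, result] pairs (the results_list in the question wraps each pair in an extra list, so it would need unwrapping first):

results = [[53471, 28.36432], [53477, 31.67835], [53471, 28.40000]]
print(id_average(53471, results))  # (28.36432 + 28.40000) / 2 = 28.38216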
Python: Finding a path between nodes within groups with nested dictionaries
I have a dataset containing historical transaction records for real estate properties. Each property has an ID number. To check whether the data is complete, for each property I am identifying a "transaction chain": I take the original buyer and go through all intermediate buyer/seller combinations until I reach the final buyer of record. So for data that looks like this:

Buyer|Seller|propertyID
Bob|Jane|23
Tim|Bob|23
Karl|Tim|23

the transaction chain will look like:

[Jane, Bob, Tim, Karl]

I am using three datasets to do this. The first contains the names of only the first buyer of each property, the second contains the names of all intermediate buyers and sellers, and the third contains only the final buyer for each property. I use three datasets so I can follow the process given by vikramls answer here. In my version of the graph dictionary, each seller is a key to its corresponding buyer, and the oft-cited find_path function finds the path from the first seller to the last buyer.

The problem is that the dataset is very large, so I get a maximum recursion depth reached error. I think I can solve this by nesting the graph dictionary inside another dictionary where the key is the property ID number, and then searching for the path within ID groups. However, when I tried:

graph = {}
propertyIDgraph = {}
with open('buyersAndSellers.txt', 'r') as f:
    for row in f:
        propertyid, seller, buyer = row.strip('\n').split('|')
        graph.setdefault(seller, []).append(buyer)
        propertyIDgraph.setdefault(propertyid, []).append(graph)
f.close()

it assigned every buyer/seller combination to every property ID. I would like it to assign buyers and sellers only to their corresponding property ID.
You might attempt something like the following, adapted from https://www.python.org/doc/essays/graphs/

from collections import namedtuple

Transaction = namedtuple('Transaction', ['Buyer', 'PropertyId'])

graph = {}
## maybe datasource is a db or a file
for data in datasource:
    graph.setdefault(data.seller, []).append(Transaction(data.buyer, data.property_id))
## builds something like
## graph = {'Jane': [Transaction('Bob', 23)],
##          'Bob':  [Transaction('Tim', 23)],
##          'Tim':  [Transaction('Karl', 23)]}

def find_transaction_path(graph, original_seller, current_owner, target_property_id, path=[]):
    assert target_property_id is not None
    path = path + [original_seller]
    if original_seller == current_owner:
        return path
    if original_seller not in graph:
        return None
    shortest = None
    for node in graph[original_seller]:
        if node.Buyer not in path and node.PropertyId == target_property_id:
            newpath = find_transaction_path(graph, node.Buyer, current_owner, target_property_id, path)
            if newpath:
                if not shortest or len(newpath) < len(shortest):
                    shortest = newpath
    return shortest
I wouldn't recommend appending the graph like that; it attaches the same graph object to every property ID. Better to check whether the key exists first and only then append to the existing entry. Try this:

graph = {}
propertyIDgraph = {}
with open('buyersAndSellers.txt', 'r') as f:
    for row in f:
        propertyid, seller, buyer = row.strip('\n').split('|')
        if seller in graph:
            graph[seller] = graph[seller] + [buyer]
        else:
            graph[seller] = [buyer]
        if propertyid in propertyIDgraph:
            propertyIDgraph[propertyid] = propertyIDgraph[propertyid] + [graph]
        else:
            propertyIDgraph[propertyid] = [graph]
f.close()

Here is a link that may be useful: syntax for creating a dictionary into another dictionary in python
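What the question seems to need is a separate graph per property ID rather than one shared graph object. A minimal sketch (same buyersAndSellers.txt format as above) using a nested dictionary:

propertyIDgraph = {}
with open('buyersAndSellers.txt', 'r') as f:
    for row in f:
        propertyid, seller, buyer = row.strip('\n').split('|')
        # One independent {seller: [buyers]} graph per property ID
        graph = propertyIDgraph.setdefault(propertyid, {})
        graph.setdefault(seller, []).append(buyer)

# propertyIDgraph['23'] -> {'Jane': ['Bob'], 'Bob': ['Tim'], 'Tim': ['Karl']}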
Simple example of retrieving 500 items from dynamodb using Python
I'm looking for a simple example of retrieving 500 items from DynamoDB while minimizing the number of queries. I know there's a "multiget" function that would let me break this up into chunks of 50 queries, but I'm not sure how to do this. I'm starting with a list of 500 keys. I'm then thinking of writing a function that takes this list of keys, breaks it up into "chunks", retrieves the values, stitches them back together, and returns a dict of 500 key-value pairs. Or is there a better way to do this? As a corollary, how would I "sort" the items afterwards?
Depending on your schema, there are two ways of efficiently retrieving your 500 items.

1. Items are under the same hash_key, using a range_key: use the query method with the hash_key. You may ask to sort the range_keys A-Z or Z-A.

2. Items are on "random" keys: you said it, use the BatchGetItem method. Good news: the limit is actually 100 items per request, or 1 MB max. You will have to sort the results on the Python side.

On the practical side, since you use Python, I highly recommend the Boto library for low-level access, or the dynamodb-mapper library for higher-level access (disclaimer: I am one of the core devs of dynamodb-mapper).

Sadly, neither of these libraries provides an easy way to wrap the batch_get operation. By contrast, there is a generator for scan and for query which 'pretends' you get everything in a single query.

In order to get optimal results with the batch query, I recommend this workflow:

- submit a batch with all of your 500 items
- store the results in your dicts
- re-submit with the UnprocessedKeys as many times as needed
- sort the results on the Python side

Quick example: I assume you have created a table "MyTable" with a single hash_key.

import boto

# Helper function. This is more or less the code
# I added to the develop branch
def resubmit(batch, prev):
    # Empty (re-use) the batch
    del batch[:]

    # The batch answer contains the list of
    # unprocessed keys grouped by tables
    if 'UnprocessedKeys' in prev:
        unprocessed = prev['UnprocessedKeys']
    else:
        return None

    # Load the unprocessed keys
    for table_name, table_req in unprocessed.iteritems():
        table_keys = table_req['Keys']
        table = batch.layer2.get_table(table_name)

        keys = []
        for key in table_keys:
            h = key['HashKeyElement']
            r = None
            if 'RangeKeyElement' in key:
                r = key['RangeKeyElement']
            keys.append((h, r))

        attributes_to_get = None
        if 'AttributesToGet' in table_req:
            attributes_to_get = table_req['AttributesToGet']

        batch.add_batch(table, keys, attributes_to_get=attributes_to_get)

    return batch.submit()

# Main
db = boto.connect_dynamodb()
table = db.get_table('MyTable')
batch = db.new_batch_list()

keys = range(100)  # Get items from 0 to 99

batch.add_batch(table, keys)

res = batch.submit()
while res:
    print res  # Do some useful work here
    res = resubmit(batch, res)
# The END

EDIT: I've added a resubmit() function to BatchList in the Boto develop branch. It greatly simplifies the workflow:

- add all of your requested keys to BatchList
- submit()
- resubmit() as long as it does not return None

This should be available in the next release.
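On the "sort them afterwards" point, once the retrieved items have been merged into a plain dict they can be reordered entirely on the Python side; a small sketch with hypothetical data:

# items: {key: item} as accumulated from the batch responses
items = {17: {'name': 'foo'}, 3: {'name': 'bar'}, 42: {'name': 'baz'}}

# Re-order by key, or by any attribute of the item
sorted_by_key = [items[k] for k in sorted(items)]
sorted_by_name = sorted(items.values(), key=lambda item: item['name'])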