How to extract a list of points from Django PolygonField?

In one of my models that contains a PolygonField I have a to_dict method that takes all the values and turns them into a readable dictionary.
I do a similar thing for a model that has a PointField and it looks like this:
'point': {
    'type': 'Point',
    'coordinates': [self.point.x, self.point.y]
},
For the PolygonField I have to loop through the points to put them in the dictionary. I tried this, but as expected, Django complained:
'polygon': {
    'type': 'Polygon',
    'coordinates': [
        for point in self.path:
            [point.x, point.y]
    ]
},
How do you add all the points from a PolygonField into a dictionary?

Figured it out!
'polygon': {
    'type': 'Polygon',
    'coordinates': [[point.x, point.y] for point in self.polygon]
},
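One caveat, assuming GeoDjango's GEOS geometries: iterating a Polygon yields its rings rather than individual points, and the GeoJSON spec expects Polygon coordinates to be a list of rings, each a list of [x, y] positions. A sketch of the fully nested form, using the Polygon's coords property:
'polygon': {
    'type': 'Polygon',
    # polygon.coords is a tuple of rings, each a tuple of (x, y) tuples
    # (assuming 2D geometries); convert them to lists for JSON output
    'coordinates': [[[x, y] for x, y in ring] for ring in self.polygon.coords]
},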

Related

Best way to convert string values to nested JSON

I have a string that looks like this:
Standard,NonStandard=[Hybrid,Non-standard,Preferred],AnotherOne=[a, b, c]
and am looking for ways to convert it to this dictionary/JSON structure in Python.
[{
    'value': 'Standard',
},
{
    'value': 'NonStandard',
    'sub': ['Hybrid', 'Non-standard', 'Preferred']
},
{
    'value': 'AnotherOne',
    'sub': ['a', 'b', 'c']
}]
I think I can do this by looping over the string and keeping track of the =[ and closing ], but was wondering if there is a more "pythonic" solution.
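For illustration, a minimal regex-based sketch, assuming names never contain '=' or ',' and that brackets are never nested:
import re

raw = 'Standard,NonStandard=[Hybrid,Non-standard,Preferred],AnotherOne=[a, b, c]'

# Each term is a name, optionally followed by "=[item, item, ...]"
pattern = re.compile(r'([^,=]+)(?:=\[([^\]]*)\])?')

result = []
for name, sub in pattern.findall(raw):
    entry = {'value': name.strip()}
    if sub:
        entry['sub'] = [item.strip() for item in sub.split(',')]
    result.append(entry)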

Format an f-string for each dataframe object

Requirement
My requirement is to have Python code extract some records from a database, format them, and upload the formatted JSON to a sink.
Planned approach
1. Create JSON-like templates for each record. E.g.
json_template_str = '''{{
    "type": "section",
    "fields": [
        {{
            "type": "mrkdwn",
            "text": "Today *{total_val}* customers saved {percent_derived}%."
        }}
    ]
}}'''
2. Extract records from DB to a dataframe.
3. Loop over the dataframe and replace the {var} variables in bulk using something like .format(**locals())
Question
I haven't worked with dataframes before.
What would be the best way to accomplish step 3? Currently I am:
3.1 Looping over the dataframe rows one by one: for i, df_row in df.iterrows():
3.2 Assigning
total_val = df_row['total_val']
percent_derived = df_row['percent_derived']
3.3 In the loop, formatting and adding the string to a list: block.append(json.loads(json_template_str.format(**locals())))
I was trying to use the assign() method of the dataframe, but was not able to figure out how to use a lambda function to create a new column with the expected value.
As a novice in pandas, I feel there might be a more efficient way to do this (which may even involve changing the JSON template string, which I can totally do). It would be great to hear thoughts and ideas.
Thanks for your time.
I would not write a JSON string by hand, but rather create a corresponding Python object and then use the json library to convert it into a string. With this in mind, you could try the following:
import copy
import pandas as pd

# some sample data
df = pd.DataFrame({
    'total_val': [100, 200, 300],
    'percent_derived': [12.4, 5.2, 6.5]
})

# template dictionary for a single block
json_template = {
    "type": "section",
    "fields": [
        {
            "type": "mrkdwn",
            "text": "Today *{total_val:.0f}* customers saved {percent_derived:.1f}%."
        }
    ]
}

# a function that will insert data from each row
# of the dataframe into a block
def format_data(row):
    json_t = copy.deepcopy(json_template)
    text_t = json_t["fields"][0]["text"]
    json_t["fields"][0]["text"] = text_t.format(
        total_val=row['total_val'], percent_derived=row['percent_derived'])
    return json_t

# create a list of blocks
result = df.agg(format_data, axis=1).tolist()
The resulting list looks as follows, and can be converted into a JSON string if needed:
[{
    'type': 'section',
    'fields': [{
        'type': 'mrkdwn',
        'text': 'Today *100* customers saved 12.4%.'
    }]
}, {
    'type': 'section',
    'fields': [{
        'type': 'mrkdwn',
        'text': 'Today *200* customers saved 5.2%.'
    }]
}, {
    'type': 'section',
    'fields': [{
        'type': 'mrkdwn',
        'text': 'Today *300* customers saved 6.5%.'
    }]
}]
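If the JSON string itself is needed downstream, the standard json module handles the serialization:
import json

json_str = json.dumps(result, indent=2)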

Extracting dictionary values from dataframe and creating a new Pandas column

I have a Pandas DataFrame with several columns, one of which contains dictionaries with coordinates in a list. This is what an entry looks like:
{'type': 'Point', 'coordinates': [-120.12345, 50.23456]}
I would like to extract this data and create 2 new columns in the original DataFrame, one for the latitude and one for longitude.
ID    Latitude      Longitude
1     -120.12345    50.23456
I have not been able to find a simple solution at this point and would be grateful for any guidance.
You can access the dictionary's get method through the .str accessor:
test = pd.DataFrame(
    {
        "ID": [1, 2],
        "point": [
            {'type': 'Point', 'coordinates': [-120.12345, 50.23456]},
            {'type': 'Point', 'coordinates': [-10.12345, 50.23456]}
        ]
    },
)
pd.concat([
    test["ID"],
    pd.DataFrame(
        test['point'].str.get('coordinates').to_list(),
        columns=['Latitude', 'Longitude']
    )
], axis=1)
You can use str to fetch the required structure:
df = pd.DataFrame({'Col' : [{'type': 'Point', 'coordinates': [-120.12345, 50.23456]}]})
df['Latitude'] = df.Col.str['coordinates'].str[0]
df['Longitude'] = df.Col.str['coordinates'].str[1]
OUTPUT:
Col Latitude Longitude
0 {'type': 'Point', 'coordinates': [-120.12345, ... -120.12345 50.23456
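A side note, assuming these dicts follow the GeoJSON convention: GeoJSON stores coordinates as [longitude, latitude], so index 0 is actually the longitude. If that convention applies to your data, swap the assignments:
df['Longitude'] = df.Col.str['coordinates'].str[0]
df['Latitude'] = df.Col.str['coordinates'].str[1]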

How to extract the value of a given key based on the value of another key, from a list of nested (not-always-existing) dictionaries [Python]

I have a list of dictionaries called api_data, where each dictionary has this structure:
{
    'location': {
        'indoor': 0,
        'exact_location': 0,
        'latitude': '45.502',
        'altitude': '133.9',
        'id': 12780,
        'country': 'IT',
        'longitude': '9.146'
    },
    'sampling_rate': None,
    'id': 91976363,
    'sensordatavalues': [
        {
            'value_type': 'P1',
            'value': '8.85',
            'id': 197572463
        },
        {
            'value_type': 'P2',
            'value': '3.95',
            'id': 197572466
        },
        {
            'value_type': 'temperature',
            'value': '20.80',
            'id': 197572625
        },
        {
            'value_type': 'humidity',
            'value': '97.70',
            'id': 197572626
        }
    ],
    'sensor': {
        'id': 24645,
        'sensor_type': {
            'name': 'DHT22',
            'id': 9,
            'manufacturer': 'various'
        },
        'pin': '7'
    },
    'timestamp': '2020-04-18 18:37:50'
},
This structure is not complete for each dictionary, meaning that sometimes a dictionary, a list element or a key is missing.
I want to extract the value of a key when the key value of the same dictionary is equal to a certain value.
For example, for dictionary sensordatavalues, I want the value of the key 'value' when 'value_type' is equal to 'P1'.
I have developed this code using for loops and if conditions, but I bet it is heavily inefficient.
How can I do it in a quicker and more efficient way?
Please note that sensordatavalues always exists
for sensor in api_data:
    sensordatavalues = sensor['sensordatavalues']
    # L_sdv = len(sensordatavalues)
    for physical_quantity_recorded in sensordatavalues:
        if physical_quantity_recorded['value_type'] == 'P1':
            PM10_value = physical_quantity_recorded['value']
If you are confident that the value 'P1' is unique to the key you are searching, you can use the in operator with dict.values().
It should be OK to omit this assignment: sensordatavalues = sensor['sensordatavalues']
for sensor in api_data:
    for physical_quantity_recorded in sensor['sensordatavalues']:
        if 'P1' in physical_quantity_recorded.values():
            PM10_value = physical_quantity_recorded['value']
For a single record (one dict from api_data), you just need one for loop:
record = api_data[0]
for x in record["sensordatavalues"]:
    if x["value_type"] == "P1":
        print(x["value"])
Output:
8.85
Use the dictionary .get() method: if the key does not exist, it returns a default value instead of raising a KeyError.
for sensor in api_data:
    for physical_quantity_recorded in sensor['sensordatavalues']:
        if physical_quantity_recorded.get('value_type', 'default_value') == 'P1':
            PM10_value = physical_quantity_recorded.get('value', 'default_value')
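A related idiom, not used in the answers above: the built-in next() with a generator expression returns the first matching value and stops scanning early. A minimal sketch:
for sensor in api_data:
    PM10_value = next(
        (d['value'] for d in sensor['sensordatavalues']
         if d.get('value_type') == 'P1'),
        None,  # default when this record has no P1 reading
    )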
This is an alternative: jmespath allows you to search and filter a nested dict/JSON.
A quick summary of jmespath: to access a key, use the . notation; if your values are in a list, you access them via the [] notation.
NB: the dict is wrapped in a data variable.
import jmespath

# sensordatavalues is a key, so we can access it directly.
# Its values are wrapped in a list, which we access with the
# bracket ([]) notation. We want the dict where value_type is P1;
# in jmespath, a filter is introduced with a ? mark. Pass the
# filter, then access the key we are interested in: value.
expression = jmespath.compile('sensordatavalues[?value_type==`P1`].value')
expression.search(data)
['8.85']

Using Python module Glom, Extract Irregular Nested Lists into a Flattened List of Dictionaries

Glom makes accessing complex nested data structures easier.
https://github.com/mahmoud/glom
Given the following toy data structure:
target = [
    {
        'user_id': 198,
        'id': 504508,
        'first_name': 'John',
        'last_name': 'Doe',
        'active': True,
        'email_address': 'jd@test.com',
        'new_orders': False,
        'addresses': [
            {
                'location': 'home',
                'address': 300,
                'street': 'Fulton Rd.'
            }
        ]
    },
    {
        'user_id': 209,
        'id': 504508,
        'first_name': 'Jane',
        'last_name': 'Doe',
        'active': True,
        'email_address': 'jd@test.com',
        'new_orders': True,
        'addresses': [
            {
                'location': 'home',
                'address': 251,
                'street': 'Maverick Dr.'
            },
            {
                'location': 'work',
                'address': 4532,
                'street': 'Fulton Cir.'
            },
        ]
    },
]
I am attempting to extract all address fields in the data structure into a flattened list of dictionaries.
from glom import glom
import pprint
"""
Purpose: Test the use of Glom
"""
# Create Glomspec
spec = [{'address': ('addresses', 'address') }]
# Glom the data
result = glom(target, spec)
# Display
pprint.pprint(result)
The above spec provides:
[
    {'address': [300]},
    {'address': [251]}
]
The desired result is:
[
    {'address': 300},
    {'address': 251},
    {'address': 4532}
]
What Glomspec will generate the desired result?
As of glom 19.1.0 you can use the Flatten() spec to succinctly get the results you want:
from glom import glom, Flatten
glom(target, (['addresses'], Flatten(), [{'address': 'address'}]))
# [{'address': 300}, {'address': 251}, {'address': 4532}]
And that's all there is to it!
You may also want to check out the convenient flatten() function, as well as the powerful Fold() spec, for all your flattening needs :)
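For reference, a minimal sketch of that standalone flatten() function, which collapses one level of nesting by default:
from glom import flatten

flatten([[300], [251, 4532]])
# [300, 251, 4532]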
Prior to 19.1.0, glom did not have first-class flattening or reduction (as in map-reduce) capabilities. But one workaround would have been to use Python's built-in sum() function to flatten the addresses:
>>> from glom import glom, T, Call # pre-19.1.0 solution
>>> glom(target, ([('addresses', [T])], Call(sum, args=(T, [])), [{'address': 'address'}]))
[{'address': 300}, {'address': 251}, {'address': 4532}]
Three steps:
1. Traverse the lists, as you had done.
2. Call sum on the resulting list, flattening/reducing it.
3. Filter down the items in the resulting list to only contain the 'address' key.
Note the usage of T, which represents the current target, sort of like a cursor.
Anyways, no need to do that anymore, in part due to this answer. So, thanks for the great question!
