My dictionary looks like the one below, and I am following this link to update the values under the "Column_Type" key. Basically, I would like to replace "String" with "VARCHAR(256)", "Date" with "NUMBER(4,0)", "Int" with "NUMBER" and "Numeric" with "NUMBER". Whenever I run the code below, the values in my dictionary are not updated. My desired output for the updated dictionary is shown further down.
Please note: the position of each type within Column_Type might vary as well. For example, "String" is currently at position 1, but it might be at position 3 later on.
{'Column_name': ['Name', 'Salary', 'Date', 'Phone'], 'Column_Type': ['String', 'Numeric', 'Date', 'Int']}
Code:
for key1, key2 in my_dict.items():
    if key2== 'String':
        my_dict[key2] = "VARCHAR(256)"
print(my_dict)
Desired Output:
{'Column_name': ['Name', 'Salary', 'Date', 'Phone'], 'Column_Type': ['VARCHAR(256)', 'NUMBER', 'NUMBER(4,0)', 'NUMBER']}
In your example, your keys are "Column_name" and "Column_Type". There is no key named "String" in your dict. Both values in your dict are lists, so neither of them is equal to the string "String" either.
What you want is to replace a specific value in a list.
Try like this:
for index, value in enumerate(my_dict["Column_Type"]):
    if value == "String":
        my_dict["Column_Type"][index] = "VARCHAR(256)"
This replaces the value in the list, not the dict. That is what you want.
If you need to replace multiple values you can use a dict, like @Jeremy suggested:
type_strs = {
'String': 'VARCHAR(256)',
'Numeric': 'NUMBER',
'Date': 'NUMBER(4,0)',
'Int': 'NUMBER'
}
for index, value in enumerate(my_dict["Column_Type"]):
    my_dict["Column_Type"][index] = type_strs.get(value, value)
Here, the .get() function on a dict returns the value corresponding to the key given by the first argument, or the second argument if no such key exists.
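As a quick illustration of that fallback behaviour, here is a minimal sketch that runs .get() against a list containing a type that is missing from the mapping. The 'Boolean' entry is a made-up value used only for this example.

```python
# Mapping from source type names to target SQL types (same as above).
type_strs = {
    'String': 'VARCHAR(256)',
    'Numeric': 'NUMBER',
    'Date': 'NUMBER(4,0)',
    'Int': 'NUMBER',
}

# 'Boolean' is a hypothetical type with no entry in type_strs.
column_types = ['String', 'Boolean', 'Int']

# .get(value, value) falls back to the original value when the key is missing.
converted = [type_strs.get(value, value) for value in column_types]
print(converted)  # ['VARCHAR(256)', 'Boolean', 'NUMBER']
```

With type_strs[value] instead of .get(), the same list would raise a KeyError on 'Boolean', so which one to use depends on whether unknown types should pass through or fail loudly.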
type_strs = {
'String': 'VARCHAR(256)',
'Numeric': 'NUMBER',
'Date': 'NUMBER(4,0)',
'Int': 'NUMBER'
}
my_dict['Column_Type'] = [type_strs[t] for t in my_dict['Column_Type']]
I would recommend a dictionary instead of if statements for translating the type strings.
In this line you are comparing a whole list with a single string: if key2== 'String':
While you iterate, key2 holds the full list ['String', 'Numeric', 'Date', 'Int'], so you need to go through the elements of that list and compare each one. You can do that with a for loop.
The program is as follows:
my_dict={'Column_name': ['Name', 'Salary', 'Date', 'Phone'], 'Column_Type': ['String', 'Numeric', 'Date', 'Int']}
# We create this variable to track the position of the current element
position=0
# We iterate over the list of column types
for i in my_dict['Column_Type']:
    # If the element is equal to the string
    if i == 'String':
        # We assign the new value at that position
        my_dict['Column_Type'][position]="VARCHAR(256)"
    # And add one to the position
    position+=1
print(my_dict)
Output
{'Column_name': ['Name', 'Salary', 'Date', 'Phone'], 'Column_Type': ['VARCHAR(256)', 'Numeric', 'Date', 'Int']}
You can use dict.update({key: value}), which adds a key or replaces the value stored under an existing key.
Example:
# Dictionary of strings to ints
word_freq = {
"Hello": 56,
"at": 23,
"test": 43,
"this": 43
}
# Adding a new key value pair
word_freq.update({'before': 23})
print(word_freq)
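Applied to the dictionary from the question, update() replaces the whole list stored under 'Column_Type', so the new list has to be built first. This is only a sketch; the type_strs mapping is borrowed from the other answers.

```python
my_dict = {
    'Column_name': ['Name', 'Salary', 'Date', 'Phone'],
    'Column_Type': ['String', 'Numeric', 'Date', 'Int'],
}

# Mapping borrowed from the other answers in this thread.
type_strs = {
    'String': 'VARCHAR(256)',
    'Numeric': 'NUMBER',
    'Date': 'NUMBER(4,0)',
    'Int': 'NUMBER',
}

# update() swaps the list under 'Column_Type' for a rebuilt one;
# it does not modify individual elements of the old list.
my_dict.update({'Column_Type': [type_strs.get(t, t) for t in my_dict['Column_Type']]})
print(my_dict['Column_Type'])  # ['VARCHAR(256)', 'NUMBER', 'NUMBER(4,0)', 'NUMBER']
```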
Related
I am trying to find out whether the string 'Date' is present in a list of items. If 'Date' is not present, I want to get an empty list.
Code
data = [['Organizations', 'Name', 'San Franciso', 11, 32],
['CreativeTeamRoles', 'Description', 'Music Director', 945, 959],
['Persons', 'FullName', 'Salonen', 5761, 5778],
['CreativeTeamRoles', 'Description', 'Conductor', 7322, 7331],
['SoloistRoles', 'Description', 'Piano', 7627, 7632],
['Performances', 'Starttime', '2:00PM', 8062, 8068],
['Performances', 'Date', '2021-05-07', 8247, 8252],
['Performances', 'Endtime', '7:30PM', 8262, 8268]]
output_list = [item for items in data for item in items if 'Date' in item]
Since the lists contain both strings and integers, I am getting an error:
TypeError: argument of type 'int' is not iterable
Try this:
[d for d in data if 'Date' in d]
From the question, it seems you want a Boolean that tells you whether a given string is present inside a nested list. You can try this, which returns only True or False:
print(any('Date' in i for i in data))
If you want the lists that contain the given string, then:
print([i for i in data if 'Date' in i])
Tell me if this works for you.
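To make the difference concrete: the TypeError in the question comes from testing 'Date' in item where item is an integer such as 11, whereas testing against the whole row compares elements with == and never iterates over an integer. A minimal sketch on a shortened copy of the data (the 'Missing' search term is made up to show the empty result):

```python
data = [
    ['Organizations', 'Name', 'San Franciso', 11, 32],
    ['Performances', 'Date', '2021-05-07', 8247, 8252],
    ['Performances', 'Endtime', '7:30PM', 8262, 8268],
]

# Membership is tested against each row (a list), so the integers are harmless.
rows_with_date = [row for row in data if 'Date' in row]
has_date = any('Date' in row for row in data)

# No row contains 'Missing', so the result is an empty list.
rows_with_missing = [row for row in data if 'Missing' in row]

print(rows_with_date)     # [['Performances', 'Date', '2021-05-07', 8247, 8252]]
print(has_date)           # True
print(rows_with_missing)  # []
```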
I am reading a .csv called courses. Each row corresponds to a course which has an id, a name, and a teacher. They are to be stored in a Dict. An example:
list_courses = {
1: {'id': 1, 'name': 'Biology', 'teacher': 'Mr. D'},
...
}
While iterating the rows using enumerate(file_csv.readlines()) I am performing the following:
list_courses={}
for idx, row in enumerate(file_csv.readlines()):
    # Skip blank rows.
    if row.isspace(): continue
    # If we're using the row, turn it into a list.
    row = row.strip().split(",")
    # If it's the header row, take note of the header. Use these values for the dictionaries' keys.
    # As of 3.7 a Dict remembers the order in which the keys were inserted.
    # Since the order is constant, simply load each other row into the corresponding key.
    if not idx:
        sheet_item = dict.fromkeys(row)
        continue
    # Loop through the keys in sheet_item. Assign the value found in the row, converting to int where necessary.
    for idx, key in enumerate(list(sheet_item)):
        sheet_item[key] = int(row[idx].strip()) if key == 'id' or key == 'mark' else row[idx].strip()
    # Course list
    print("ADDING COURSE WITH ID {} TO THE DICTIONARY:".format(sheet_item['id']))
    list_courses[sheet_item['id']] = sheet_item
    print("\tADDED: {}".format(sheet_item))
    print("\tDICT : {}".format(list_courses))
Thus, the list_courses dictionary is printed after each sheet_item is added to it.
Now comes the issue - when reading in two courses, I expect that list_courses should read:
list_courses = {
1: {'id': 1, 'name': 'Biology', 'teacher': 'Mr. D'},
2: {'id': 2, 'name': 'History', 'teacher': 'Mrs. P'}
}
However, the output of my print statements (substantiated by errors later in my program) is:
ADDING COURSE WITH ID 1 TO THE DICTIONARY:
ADDED: {'id': 1, 'name': 'Biology', 'teacher': 'Mr. D'}
DICT : {1: {'id': 1, 'name': 'Biology', 'teacher': 'Mr. D'}}
ADDING COURSE WITH ID 2 TO THE DICTIONARY:
ADDED: {'id': 2, 'name': 'History', 'teacher': 'Mrs. P'}
DICT : {1: {'id': 2, 'name': 'History', 'teacher': 'Mrs. P'}, 2: {'id': 2, 'name': 'History', 'teacher': 'Mrs. P'}}
Thus, the id under which sheet_item is added to list_courses is correct (1 or 2); however, the assignment that occurs for the second course appears to overwrite the value for key 1. I'm not even sure how this is possible. Please let me know your thoughts.
You're using the same dictionary for the header and for every row. You never create a new dictionary after the header, so each row's assignments overwrite the previous row's values, and every entry in list_courses refers to that one dictionary.
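The effect can be reproduced without any CSV at all. In this stripped-down sketch (the names shared and courses are made up for the illustration), one dict is stored under two keys and then mutated:

```python
shared = {}
courses = {}

shared['id'] = 1
courses[1] = shared   # stores a reference to shared, not a copy

shared['id'] = 2      # mutates the one and only dict
courses[2] = shared

print(courses)                   # {1: {'id': 2}, 2: {'id': 2}}
print(courses[1] is courses[2])  # True
```

Both keys point at the same object, which is why the entry for key 1 appears to change when the second row is read.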
Store the keys in a list, and create a new sheet_item for each row, just before the inner for loop:
list_courses={}
keys = None # Let Python know this is defined
for idx, row in enumerate(file_csv.readlines()):
    # Skip blank rows.
    if row.isspace(): continue
    # If we're using the row, turn it into a list.
    row = row.strip().split(",")
    # If it's the header row, take note of the header. Use these values for the dictionaries' keys.
    # As of 3.7 a Dict remembers the order in which the keys were inserted.
    # Since the order is constant, simply load each other row into the corresponding key.
    if not idx:
        keys = row
        continue
    sheet_item = {}
    # Loop through the keys in sheet_item. Assign the value found in the row, converting to int where necessary.
    for idx, key in enumerate(keys):
        sheet_item[key] = int(row[idx].strip()) if key == 'id' or key == 'mark' else row[idx].strip()
    # Course list
    print("ADDING COURSE WITH ID {} TO THE DICTIONARY:".format(sheet_item['id']))
    list_courses[sheet_item['id']] = sheet_item
    print("\tADDED: {}".format(sheet_item))
    print("\tDICT : {}".format(list_courses))
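As an alternative sketch, the standard library's csv.DictReader reads the header itself and yields a fresh dict for every row, which avoids the shared-dictionary problem by construction. The io.StringIO object below is only a stand-in for the real file, and its contents are made up for this example; the 'mark' conversion from the original code is left out for brevity.

```python
import csv
import io

# Stand-in for the real file; the contents are invented for this example.
file_csv = io.StringIO("id,name,teacher\n1,Biology,Mr. D\n\n2,History,Mrs. P\n")

list_courses = {}
for row in csv.DictReader(file_csv):
    # DictReader yields a new dict per row and skips completely blank lines.
    course = {key: value.strip() for key, value in row.items()}
    course['id'] = int(course['id'])
    list_courses[course['id']] = course

print(list_courses)
# {1: {'id': 1, 'name': 'Biology', 'teacher': 'Mr. D'}, 2: {'id': 2, 'name': 'History', 'teacher': 'Mrs. P'}}
```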
I have a very large list of dictionaries, and the items are unordered. I would like to group certain fields under a new key. For example:
input= [{'name':'emp1','state':'TX','areacode':'001','mobile':123},{'name':'emp1','state':'TX','areacode':'002','mobile':234},{'name':'emp1','state':'TX','areacode':'003','mobile':345},{'name':'emp2','state':'TX','areacode':None,'mobile':None},]
For the above input, I would like to group areacode and mobile under a new key, contactoptions:
opdata = [{'name':'emp1','state':'TX','contactoptions':[{'areacode':'001','mobile':123},{'areacode':'002','mobile':234},{'areacode':'003','mobile':345}]},{'name':'emp2','state':'TX','contactoptions':[{'areacode':None,'mobile':None}]}]
I am doing this now with two long iterations. I would like to achieve the same result more efficiently, as the number of records is large. I am open to using existing methods from packages like pandas.
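Try:
import pandas as pd
df = pd.DataFrame(input)  # assumes input is the list of dictionaries from the question
result = (
    df.groupby(['name', 'state'])
    .apply(lambda x: x[['areacode', 'mobile']].to_dict(orient='records'))
    .reset_index(name='contactoptions')
).to_dict(orient='records')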
With regular dictionaries, you can do it in a single pass/loop using the setdefault method and no sorting:
data = [{'name':'emp1','state':'TX','areacode':'001','mobile':123},{'name':'emp1','state':'TX','areacode':'002','mobile':234},{'name':'emp1','state':'TX','areacode':'003','mobile':345},{'name':'emp2','state':'TX','areacode':None,'mobile':None}]
merged = dict()
for d in data:
    od = merged.setdefault(d["name"],{k:d[k] for k in ("name","state")})
    od.setdefault("contactoptions",[]).append({k:d[k] for k in ("areacode","mobile")})
merged = list(merged.values())
output:
print(merged)
# [{'name': 'emp1', 'state': 'TX', 'contactoptions': [{'areacode': '001', 'mobile': 123}, {'areacode': '002', 'mobile': 234}, {'areacode': '003', 'mobile': 345}]}, {'name': 'emp2', 'state': 'TX', 'contactoptions': [{'areacode': None, 'mobile': None}]}]
As you described, you want to group the input items by 'name' and 'state' together.
My suggestion is to build a dictionary whose keys combine 'name' and 'state', such as 'emp1-TX', and whose values are lists of 'areacode' and 'mobile' entries, such as [{'areacode':'001','mobile':123}]. This way, the output can be produced in one iteration.
Output:
{'emp1-TX': [{'areacode':'001','mobile':123}, {'areacode':'002','mobile':234}, {'areacode':'003','mobile':345}], 'emp2-TX': [{'areacode':None,'mobile':None}]}
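A minimal sketch of that suggestion, assuming the records are in a list named data and that joining the two fields with a hyphen cannot produce ambiguous keys:

```python
data = [
    {'name': 'emp1', 'state': 'TX', 'areacode': '001', 'mobile': 123},
    {'name': 'emp1', 'state': 'TX', 'areacode': '002', 'mobile': 234},
    {'name': 'emp1', 'state': 'TX', 'areacode': '003', 'mobile': 345},
    {'name': 'emp2', 'state': 'TX', 'areacode': None, 'mobile': None},
]

grouped = {}
for d in data:
    # Composite key such as 'emp1-TX'.
    key = "{}-{}".format(d['name'], d['state'])
    # setdefault creates the list on first sight of the key, then we append to it.
    grouped.setdefault(key, []).append({'areacode': d['areacode'], 'mobile': d['mobile']})

print(grouped)
```

If names or states can themselves contain a hyphen, a tuple key such as (d['name'], d['state']) would be a safer choice than a joined string.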
I have a list of dictionaries called api_data, where each dictionary has this structure:
{
    'location': {
        'indoor': 0,
        'exact_location': 0,
        'latitude': '45.502',
        'altitude': '133.9',
        'id': 12780,
        'country': 'IT',
        'longitude': '9.146'
    },
    'sampling_rate': None,
    'id': 91976363,
    'sensordatavalues': [
        {
            'value_type': 'P1',
            'value': '8.85',
            'id': 197572463
        },
        {
            'value_type': 'P2',
            'value': '3.95',
            'id': 197572466
        },
        {
            'value_type': 'temperature',
            'value': '20.80',
            'id': 197572625
        },
        {
            'value_type': 'humidity',
            'value': '97.70',
            'id': 197572626
        }
    ],
    'sensor': {
        'id': 24645,
        'sensor_type': {
            'name': 'DHT22',
            'id': 9,
            'manufacturer': 'various'
        },
        'pin': '7'
    },
    'timestamp': '2020-04-18 18:37:50'
},
This structure is not complete for every dictionary: sometimes a dictionary, a list element or a key is missing.
I want to extract the value of one key when another key of the same dictionary is equal to a certain value.
For example, within sensordatavalues, I want the value of the key 'value' when 'value_type' is equal to 'P1'.
I have developed this code using for loops and if statements, but I bet it is heavily inefficient.
How can I do it in a quicker and more efficient way?
Please note that sensordatavalues always exists.
for sensor in api_data:
    sensordatavalues = sensor['sensordatavalues']
    # L_sdv = len(sensordatavalues)
    for physical_quantity_recorded in sensordatavalues:
        if physical_quantity_recorded['value_type'] == 'P1':
            PM10_value = physical_quantity_recorded['value']
If you are confident that the value 'P1' only ever appears under the key you are searching, you can use the 'in' operator with dict.values().
It should also be fine to omit this assignment: sensordatavalues = sensor['sensordatavalues']
for sensor in api_data:
    for physical_quantity_recorded in sensor['sensordatavalues']:
        if 'P1' in physical_quantity_recorded.values():
            PM10_value = physical_quantity_recorded['value']
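If you are working with a single dictionary from the list (here api_data refers to one such dictionary), you just need one for loop:
for x in api_data["sensordatavalues"]:
    if x["value_type"] == "P1":
        print(x["value"])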
Output:
8.85
Use the dict.get() method: if the key does not exist, it returns the default value you pass as the second argument.
for physical_quantity_recorded in api_data['sensordatavalues']:
    if physical_quantity_recorded.get('value_type', 'default_value') == 'P1':
        PM10_value = physical_quantity_recorded.get('value', 'default_value')
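Since keys may be missing, the lookup can be wrapped in a small helper that stops at the first match and never raises on an incomplete record. This is only a sketch: find_value is a made-up name, and the api_data list below is a trimmed, invented sample rather than real API output.

```python
def find_value(sensor, value_type, default=None):
    """Return the 'value' of the first reading whose 'value_type' matches."""
    readings = sensor.get('sensordatavalues', [])
    # next() stops at the first match; the second argument is used when nothing matches.
    return next(
        (r.get('value', default) for r in readings if r.get('value_type') == value_type),
        default,
    )

# Trimmed, invented sample: the second record has no P1 reading,
# the third has a P1 reading without a 'value' key.
api_data = [
    {'sensordatavalues': [{'value_type': 'P1', 'value': '8.85'},
                          {'value_type': 'P2', 'value': '3.95'}]},
    {'sensordatavalues': [{'value_type': 'temperature', 'value': '20.80'}]},
    {'sensordatavalues': [{'value_type': 'P1'}]},
]

pm10_values = [find_value(sensor, 'P1') for sensor in api_data]
print(pm10_values)  # ['8.85', None, None]
```

The loop over readings is still linear in the number of readings per sensor, so this is mainly about robustness and readability rather than a large speed-up.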
An alternative is jmespath, which lets you search and filter a nested dict/JSON.
A short summary of jmespath: to access a key, use the . notation; if your values are in a list, you access them via the [] notation.
NB: the dict is stored in a variable named data.
import jmespath
# sensordatavalues is a key, so we can access it directly
# the values of sensordatavalues are wrapped in a list
# to access them we use the brackets ([])
# we are interested in the dict where value_type is P1
# in jmespath, a filter starts with a ? inside the brackets
# pass the filter
# and finally access the key we are interested in: value
expression = jmespath.compile("sensordatavalues[?value_type=='P1'].value")
expression.search(data)
['8.85']
I am trying to convert CSV to JSON. If I encounter CSV headers with the naming convention "columnName1.0.columnName2.0.columnName3", I need to create a nested JSON object: {columnName1 : {columnName2 : {columnName3 : value }}}.
So far I am able to split the header into a list of sub-column names and create the nested structure, but I am unable to assign the value. Any help?
import csv
import re

data = open(str(fileName.strip("'")),'rb')
reader = csv.DictReader(data,delimiter = ',',quotechar='"')
# Get the header
for line in reader:
    for x,y in line.items():
        columns = re.split(r"\.\d\.",x)
        if len(columns) == 1:
            continue
        else:
            print "COLUMNS %s"%columns
            testLine = {}
            for subColumnName in reversed(columns):
                testLine = {subColumnName: testLine}
            # Need to assign value y?
            print "LINE%s"%testLine
Output:
COLUMNS ['experience', 'title']
LINE{'experience': {'title': {}}}
COLUMNS ['experience', 'organization', 'profile_url']
LINE{'experience': {'organization': {'profile_url': {}}}}
COLUMNS ['experience', 'start']
LINE{'experience': {'start': {}}}
COLUMNS ['raw_experience', 'organization', 'profile_url']
LINE{'raw_experience': {'organization': {'profile_url': {}}}}
COLUMNS ['raw_experience', 'end']
LINE{'raw_experience': {'end': {}}}
COLUMNS ['experience', 'organization', 'name']
LINE{'experience': {'organization': {'name': {}}}}
The innermost value is currently {}, the initial value of testLine. Start from the cell value instead (y in your loop), so that it ends up at the deepest level:
testLine = y
for subColumnName in reversed(columns):
    testLine = {subColumnName: testLine}
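Wrapped in a function, the same idea looks like the sketch below. It is written for Python 3, nest is a made-up helper name, and the header and value are sample inputs chosen for the illustration.

```python
import re

def nest(columns, value):
    """Wrap value in one dict per column name, innermost name last."""
    nested = value
    for name in reversed(columns):
        nested = {name: nested}
    return nested

# Sample header in the "a.0.b.0.c" convention from the question.
header = "experience.0.organization.0.profile_url"
columns = re.split(r"\.\d\.", header)

print(columns)                      # ['experience', 'organization', 'profile_url']
print(nest(columns, "some value"))  # {'experience': {'organization': {'profile_url': 'some value'}}}
```

Note that each header produces its own nested dict; combining several headers that share a prefix (for example two columns under experience) would need an extra merge step that is not shown here.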