Update list at index 3 with new value - python

Greetings everyone. I'm a new user of Stack Overflow. I'm developing code that needs to capture some faculty data into a list, update that list, merge it with another list, and FTP a CSV file to a vendor. First things first.
I created an empty list
records: List[EmptyRecord] = []
and, after calling
records.extend(faculty_records)
I now have a list of faculty data. The email address is at index 3 of each record.
I have a docstring SQL statement, GET_MAIL, that returns the email addresses I need in order to update the value at index 3 of each record in faculty_records. I think I need some sort of
records.insert(3, '{email address}')
inside a loop over all the values in faculty_records.
I have the username at index 2 and the ID at index 4 of each record to match which address to update. It's PeopleSoft data, so the ID in the list has to match the emplid from the SQL results.
Can someone assist in getting my pseudocode into python?
Once I get the values updated, I need to merge with my student data list, which should be as easy as
records.extend(student_records)
and send both student and faculty data to a vendor.

insert() adds a new element at that index, shifting all the following elements over to make room; it doesn't replace. Just use ordinary assignment to replace an element.
Loop through the records with a for loop to find the record with the username you want to update, then assign to element 3 to update the email.
for record in records:
    if record[2] == username_to_update:
        record[3] = new_email
        break
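Since the matching in the question is actually on emplid (index 4) rather than username, a batch version might look like the sketch below, assuming the GET_MAIL results are fetched as (emplid, email) rows; the cursor variable here is hypothetical:
# Assumption: cursor.execute(GET_MAIL) has been run and each row is (emplid, email).
email_by_emplid = {emplid: email for emplid, email in cursor.fetchall()}

for record in records:
    new_email = email_by_emplid.get(record[4])  # ID/emplid is at index 4
    if new_email is not None:
        record[3] = new_email                   # email is at index 3
A dict lookup keeps this to one pass over the records instead of rescanning the SQL results for each one.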

Related

Quick search in dataframe

We are inserting data into MongoDB using pymongo.
We need to ensure that we insert a value into the db only if it is not already present.
Since running a find() for every record to check for duplicates takes a long time, one suggestion was to pull all of the existing values for an identifier from the db into a list and search that list instead of calling find() repeatedly.
The issue I am facing: if there are 1 million records for that identifier in the db, I bring all 1 million into a list, and while processing 3k new records, checking each one against the million took around 9 minutes for the entire run. I am assuming this list can be replaced by a dataframe for a faster search, but can someone please help me achieve that?
This is my code snippet that brings the records into a list first:
project_keys = {"_id": 0}
for key in queryString:
    project_keys[key] = 1
curr_data_arr = list(audit_collection.find({"source_name": default_attributes['source_name']}, project_keys))
Processing of new records:
arr = []
duplicate_data_count = 0
new_data_count = 0
df_dict = df.to_dict('records')
for row in df_dict:
    query_string = {}
    for i in range(len(queryString)):
        if queryString[i] == "source_name":
            query_string[queryString[i]] = default_attributes["source_name"]
        else:
            query_string[queryString[i]] = row[queryString[i]]
    if len(query_string) != 0:
        audit_data = next((sub for sub in curr_data_arr if sub == query_string), None)
        logger.info(f"Duplicate data check in list from mongo db, result is {audit_data}")
    else:
        audit_data = {}
    # do not insert/update if a record was found in the audit data; log it and move to the next record
    if audit_data:
        duplicate_data_count += 1
        # logger.info(f"Duplicate record found for audit. Duplicate record being inserted: {row}")
        continue
    # insert if no record was found
    else:
        arr.append(row)
        new_data_count += 1
if len(arr) > 0:
    logger.info(f"Length of unique array is {len(arr)}")
    audit_collection.insert_many(arr)
In the snippet above, this is the line that takes a long time: audit_data = next((sub for sub in curr_data_arr if sub == query_string), None). Example: the db has 1 million records, and inserting my current 3k records took 9 minutes because of that check. If I remove the line and skip the duplicate check, the insertion completes within 1 minute. How do I make the duplicate check faster? Can I bring the data into a dataframe and search there instead of in a list? NOTE: the search would be based on the primary key / queryString. Would that be faster? Please provide code snippets.
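A dataframe will not make this faster by itself; the slow part is the linear scan over a million dicts for every new row, and a hash-based lookup removes it. A minimal sketch, reusing the question's variable names and assuming the projected values are hashable (strings/numbers):
# Build the set of existing key tuples once: one pass over the million records.
key_fields = list(queryString)
existing = {tuple(doc[k] for k in key_fields) for doc in curr_data_arr}

arr = []
duplicate_data_count = 0
new_data_count = 0
for row in df.to_dict('records'):
    key = tuple(
        default_attributes["source_name"] if k == "source_name" else row[k]
        for k in key_fields
    )
    if key in existing:
        duplicate_data_count += 1
        continue
    existing.add(key)  # also catches duplicates within the new batch
    arr.append(row)
    new_data_count += 1

if arr:
    audit_collection.insert_many(arr)
Each membership test is O(1) on average, so checking 3k rows against 1 million existing keys should take seconds rather than minutes.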

Populate new column in dataframe based on dictionary key matches values in another column and some more conditions

I have a data frame, and a dictionary (ec2info) with the EC2 instance details.
Now I want to add a new column, 'Instance Name', and populate it on the condition that an instance ID from the dictionary appears in the 'ResourceId' column; depending on what is in that instance ID's Name field in the dictionary, I want to populate the new column value for each matching entry.
Finally, I want to create separate data frames for my specific use cases, e.g. to get only BoxUsage results. Something like this:
box_usage = df[df['lineItem/UsageType'].str.contains('BoxUsage')]
print(box_usage.groupby('Instance Name')['lineItem/BlendedCost'].sum())
The new column value is not coming up against the respective ResourceId as I expect; it is coming up sequentially instead.
I have tried a bunch of things, including the code above, but no luck yet. Any help?
After struggling through several options, I used the .apply() approach and it did the trick:
# create the column up front, defaulting to 'Other'
df.insert(loc=17, column='Instance_Name', value='Other')

def update_col(x):
    # x is a ResourceId; look it up among the EC2 instance IDs
    for key, val in ec2info.items():
        if x == key:
            if ('MyAgg' in val['Name']) | ('MyAgg-AutoScalingGroup' in val['Name']):
                return 'SharkAggregator'
            if ('MyColl AS Group' in val['Name']) | ('MyCollector-AutoScalingGroup' in val['Name']):
                return 'SharkCollector'
            if ('MyMetric AS Group' in val['Name']) | ('MyMetric-AutoScalingGroup' in val['Name']):
                return 'Metric'

df['Instance_Name'] = df.ResourceId.apply(update_col)
df.Instance_Name.fillna(value='Other', inplace=True)
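A vectorized alternative, as a sketch under the same assumption that ec2info maps instance IDs to dicts with a 'Name' key: build an ID-to-label dict once and use Series.map, so ec2info is not rescanned for every row.
def label_for(name):
    # same matching rules as update_col above
    if 'MyAgg' in name or 'MyAgg-AutoScalingGroup' in name:
        return 'SharkAggregator'
    if 'MyColl AS Group' in name or 'MyCollector-AutoScalingGroup' in name:
        return 'SharkCollector'
    if 'MyMetric AS Group' in name or 'MyMetric-AutoScalingGroup' in name:
        return 'Metric'
    return 'Other'

id_to_label = {key: label_for(val['Name']) for key, val in ec2info.items()}
df['Instance_Name'] = df['ResourceId'].map(id_to_label).fillna('Other')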

Pass a key-array to a Lotus-Notes COM method

I am trying to get a specific document from a Domino view.
The view has 3 columns: Name, Surname, Age.
The problem is that Name is not unique, so I need to get the document that matches 'John' in the Name column (1st column) as well as 'Doe' in the second column (Surname).
So obviously the following won't work: doc = view.GetDocumentByKey('John')
There is a NotesView COM class with a .GetDocumentByKey() method that accepts a key array, but I have not been able to pass a key array from Python.
I have tried the following:
doc = view.GetDocumentByKey('John Doe')
doc = view.GetDocumentByKey('John, Doe')
doc = view.GetDocumentByKey(('John', 'Doe'))
doc = view.GetDocumentByKey(['John', 'Doe'])
But none of them are able to get the needed document.
What is the correct way to pass a key array?
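One approach worth trying with pywin32 is to wrap the keys in an explicit VARIANT array so the COM bridge passes them through as a real array rather than flattening them. This is only a sketch and is untested against Domino:
import pythoncom
import win32com.client

# Pack the keys as a SAFEARRAY of VARIANTs (assumption: view is a
# NotesView object obtained via win32com.client.Dispatch).
keys = win32com.client.VARIANT(
    pythoncom.VT_ARRAY | pythoncom.VT_VARIANT,
    ['John', 'Doe'],
)
doc = view.GetDocumentByKey(keys, True)  # second argument: exact match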
EDIT:
Solution found. There was a sorted hidden column with unique values that I ended up using.

Updating a single value in a list of dictionaries

I have a webpage where I display a list of passenger information as a table. This is the usual stuff: arrival_time, flight number, etc.
I've made the table editable, so when the user clicks a certain column he can edit the information in it. When he finally clicks confirm, I send only the edited columns, along with their values and row numbers, to the view, which is supposed to locate the row of the list from the data I sent, find the key, and update the original value.
The json values I get from editing a column look like this:
[{u'classname': u'flight', u'column': 6, u'value': u'F1521', u'row': 0}, {u'classname': u'flight', u'column': 6, u'value': u'FI521', u'row': 1}]
The code I have so far to update the value looks like this:
# Original data
query = UploadOrderModel.objects.filter(hashId=receiptNr)

# Gives me a list of dictionaries containing these keys
data = query.values("office", "reserv", "title", "surname", "firstN",
                    "arrival", "flight", "transferF", "hotelD", "tour")

# Update
json_dump = json.loads(request.body)
if json_dump:
    for d in json_dump:
        # get row number
        row = int(d['row'])
        # updates dictionary value at row
        data[row][d['classname']] = d['value']
But this does not update the value. I have checked that it is getting the correct values to update, and it is; row is correct, and if I print out
data[row][d['classname']]
I get the element I want to update. Is there anything really obvious I'm missing here?
Should I be making updates to the entire row instead, i.e. update the entire dictionary at the current location?
EDIT:
I'm still having problems. First off, I misread your good answer, lyschoening. I thought you meant that values() does not return a writable list, silly me. The saving of the model is done later in the code and works as expected. However, I still have the problem that the dictionary at the location I'm trying to update does not update at all. :/
OK, I found out what the problem was.
Django's values() does not return a list of dictionaries, as it appears to, but a ValuesQuerySet.
It is therefore not possible to do updates on it as one would with a regular list.
All I had to do was turn it into a list of dictionaries:
updates = [item for item in data]
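Put together, a minimal sketch of the fixed flow (list(data) is equivalent to the comprehension above):
data = list(query.values("office", "reserv", "title", "surname", "firstN",
                         "arrival", "flight", "transferF", "hotelD", "tour"))

for d in json.loads(request.body):
    # each edit carries the row index and the column name (classname) to change
    data[int(d['row'])][d['classname']] = d['value']
The likely reason the in-place assignment appeared to do nothing is that indexing an unevaluated ValuesQuerySet can issue a fresh query and hand back a new dict each time, so the mutation never lands in anything that is kept.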

Adding Keys and Values to Python Dictionary in Reverse Order

I have written a simple script that prints out and adds the name of a table and its associated column headings to a Python list:
for table in arcpy.ListTables():
    for field in arcpy.ListFields(table):
        b.append(field.name + "," + fc)
print b
In each table there are a number of column headings, and in many instances two or more tables contain the same column headings. Instead of a list, I want a sort of reverse Python dictionary, where the keys are the column headings and the values are the table names; the idea is to find all the tables each column heading appears in.
I've been playing around all afternoon and I think I'm overthinking this, so I came here for some help. If anyone can suggest how I can accomplish this, I would appreciate it.
Thanks,
Mike
Try this:
result = {}
for table in arcpy.ListTables():
    for field in arcpy.ListFields(table):
        result.setdefault(field.name, []).append(table)
If I understand correctly, you want to map from a column name to a list of tables that have columns with that name. That should be easy enough to do with a defaultdict:
from collections import defaultdict

header_to_table_dict = defaultdict(list)
for table in arcpy.ListTables():
    for field in arcpy.ListFields(table):
        header_to_table_dict[field.name].append(table.name)
I'm not sure if table.name is what you want to save, exactly, but this should get you on the right track.
You want to create a dictionary in which each key is a field name, and each value is a list of table names:
# initialize the dictionary
col_index = {}
for table in arcpy.ListTables():
    for field in arcpy.ListFields(table):
        if field.name not in col_index:
            # this is a field name we haven't seen before,
            # so initialize a dictionary entry with an empty list
            # as the corresponding value
            col_index[field.name] = []
        # add the table name to the list of tables for this field name
        col_index[field.name].append(table.name)
And then, if you want a list of tables that contain the field LastName:
list_of_tables = col_index['LastName']
If you're using a database that is case-insensitive with respect to column names, you might want to convert field.name to upper case before testing the dictionary.
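A small sketch of that normalization, assuming case-insensitive column names are what you want:
col_index = {}
for table in arcpy.ListTables():
    for field in arcpy.ListFields(table):
        # normalize on insert so lookups can normalize the same way
        col_index.setdefault(field.name.upper(), []).append(table.name)

list_of_tables = col_index['LastName'.upper()]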
