mongodb remove document that match each element in array [duplicate] - python

This question already has answers here:
Remove multiple documents from mongo in a single query
(5 answers)
Closed 5 years ago.
mongodb
{'id':'a'}
{'id':'b'}
{'id':'c'}
{'id':'d'}
......
python
pool = ['a','b','c']
for element in pool:
    mongodb.remove({'id':element})
That is the situation.
I have a list of ids, and I want to remove the document matching each one from MongoDB.
Is there a better way than removing them one by one?

db.collection.remove({'id':{'$in': pool}})
This removes all the matching documents in one go. (Use '_id' instead of 'id' if the values in pool are the documents' primary keys.)
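In current PyMongo the same idea is usually written with delete_many, because Collection.remove was deprecated in PyMongo 3 and removed in PyMongo 4. The following is a minimal sketch, assuming the documents carry the `id` field from the question; `delete_by_ids` is a hypothetical helper name, and `collection` is assumed to be a PyMongo Collection.

```python
def delete_by_ids(collection, ids):
    """Delete every document whose 'id' field equals one of `ids`.

    `collection` is assumed to be a pymongo Collection (or any object
    exposing a compatible delete_many method). Hypothetical helper.
    """
    # $in matches documents whose field equals any value in the list,
    # so the whole batch is deleted with a single query.
    result = collection.delete_many({"id": {"$in": list(ids)}})
    # deleted_count reports how many documents were actually removed.
    return result.deleted_count
```

Called as `delete_by_ids(db.collection, pool)`, this replaces the loop from the question with one request to the server.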

If you want to delete whole documents:
The .remove() method takes a query object, and you can use a regular expression in it:
db.collection.remove({ "id": /your_regex/})
This removes every document whose id matches your regular expression.
If you only want to remove a specific field, use the $unset operator, like this:
db.collection.update({}, {$unset: {"field":1}}, {multi: true})
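The two shell commands above translate to PyMongo roughly as follows. This is only a sketch: `delete_matching` and `drop_field` are hypothetical helper names, `collection` is assumed to be a PyMongo Collection, and the field name `id` is taken from the question.

```python
def delete_matching(collection, pattern):
    """Delete every document whose 'id' matches the regular expression."""
    # $regex takes the pattern as a string, so no shell-style /.../ literal is needed.
    result = collection.delete_many({"id": {"$regex": pattern}})
    return result.deleted_count


def drop_field(collection, field):
    """Remove `field` from every document while keeping the documents."""
    # The empty filter {} selects all documents; the value given to
    # $unset is ignored by the server, so an empty string is used here.
    result = collection.update_many({}, {"$unset": {field: ""}})
    return result.modified_count
```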

Related

Replace String only if it is not present

Let's say I have a dictionary, like this {"view": object.get("view")}. Let's say the value to be returned is "Table".
I want to add "_string" to it only if it is not already present. So Table should be Table_string, but if it is already Table_string, then let it be.
For example {"view": object.get(f"{self.view}_string")}, but only if "_string" is not already present. How can I do that using python?
You can do it with the when/otherwise and regexp_replace functions in PySpark (assuming the data is in a DataFrame column named "value"):
df.withColumn("value", when(col("value") == "Table", regexp_replace(col("value"), "Table", "Table_string")).otherwise(col("value")))
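Since the question is about a plain Python dictionary rather than a Spark DataFrame, a string check may be all that is needed. A minimal sketch, where `ensure_suffix` is a hypothetical helper name:

```python
def ensure_suffix(value, suffix="_string"):
    """Return `value` with `suffix` appended unless it already ends with it."""
    if value.endswith(suffix):
        return value
    return value + suffix


print(ensure_suffix("Table"))         # Table_string
print(ensure_suffix("Table_string"))  # Table_string
```

In the question's terms this could be used as `{"view": object.get(ensure_suffix(self.view))}`, assuming `self.view` holds a string.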

join two JSON objects in Python on common data point [duplicate]

This question already has answers here:
Why does creating a list of tuples using list comprehension requires parentheses?
(2 answers)
Why do tuples in a list comprehension need parentheses? [duplicate]
(3 answers)
Closed 8 months ago.
I keep getting a syntax error for this and Google is no help for my specific issue.
I'm trying to merge two data sets into a single dictionary. One data set comes from https://universalis.app/api/v2/marketable and looks to be an array. The other comes from https://raw.githubusercontent.com/ffxiv-teamcraft/ffxiv-teamcraft/master/apps/client/src/assets/data/items.json and appears to be just an object of objects. Example below with what I've tried.
Code:
import requests
import json
url = "https://universalis.app/api/v2/marketable"
response = json.loads(requests.get(url).text)
marketableItems = [
    item
    for item in response
]
url = "https://raw.githubusercontent.com/ffxiv-teamcraft/ffxiv-teamcraft/master/apps/client/src/assets/data/items.json"
allItemsResponse = json.loads(requests.get(url).text)
itemDictionary = [
    Item, allItemsResponse[str(Item)]["en"]
    for Item in marketableItems
]
This produces:
    Item, allItemsResponse[str(Item)]["en"]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: did you forget parentheses around the comprehension target?
I've googled a fair bit for this exact Syntax Error, but I'm not really able to find any sort of guide on how to join two objects like this. I'm able to get allItemsResponse[str(Item)]["en"] to return data, I just want it paired with the original data from the first URL.
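The error message names the fix: a tuple used as the element of a comprehension has to be wrapped in parentheses. The sketch below uses small stand-in data in place of the two HTTP responses (the ids and names are invented for illustration) and shows both a list of pairs and a dictionary keyed by item id.

```python
# Stand-ins for the two API responses; all values are invented for illustration.
marketable_items = [2, 3, 5]
all_items_response = {
    "2": {"en": "example item 2"},
    "3": {"en": "example item 3"},
    "5": {"en": "example item 5"},
    "7": {"en": "example item 7"},  # present in the item data but not marketable
}

# List of (id, name) tuples: the parentheses around the tuple are required.
item_pairs = [
    (item, all_items_response[str(item)]["en"])
    for item in marketable_items
]

# A dictionary keyed by id, built with a dict comprehension instead.
item_dictionary = {
    item: all_items_response[str(item)]["en"]
    for item in marketable_items
}

print(item_pairs)          # [(2, 'example item 2'), (3, 'example item 3'), (5, 'example item 5')]
print(item_dictionary[3])  # example item 3
```

If an id from the first response might be missing from the second, `all_items_response.get(str(item))` with a check for `None` avoids a `KeyError`.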

Possible to convert string to a named variable or alternative? [duplicate]

This question already has answers here:
How do I create variable variables?
(17 answers)
How can I create multiple variables from a list of strings? [duplicate]
(2 answers)
generating variable names on fly in python [duplicate]
(6 answers)
Closed 3 years ago.
I have a ticker and I want to check a specific list of tickers to see if the ticker is found. If it is found, it will replace it.
The new tickers come from another data source and therefore do not know which specific list of tickers to check. In order to find that list, I can pass the lists name as a string but upon iterating the code (naturally) recognizes this as string as opposed to a list to iterate.
Is there a way to have the code/function recognize that the string is actually a specific list to be checked? In reading other questions, I know this may not be possible...in that case what is an alternative?
list_1=['A','B']
list_2=['C','D']
old_ticker='A'
new_ticker='E'
assigned_list='list_1'
def replace_ticker(old_ticker,new_ticker,list):
    for ticker in list:
        if new_ticker in list:
            return
        else:
            list.append(new_ticker)
            list.remove(old_ticker)
replace_ticker(old_ticker,new_ticker,assigned_list)
You key the needed lists by name in a dictionary:
ticker_directory = {
    "list_1": list_1,
    "list_2": list_2
}
Now you can accept the name and get the desired list as ticker_directory[assigned_list].
list_1=['A','B']
list_2=['C','D']
lists = {
    'list_1':list_1,
    'list_2':list_2
}
old_ticker='A'
new_ticker='E'
assigned_list='list_1'
def replace_ticker(old_ticker,new_ticker,list_name):
    if old_ticker not in lists[list_name]:
        return
    else:
        lists[list_name].append(new_ticker)
        lists[list_name].remove(old_ticker)
replace_ticker(old_ticker,new_ticker,assigned_list)
print(lists[assigned_list])
This is the complete program, as far as I understood the question.
@Prune already answered this; I have just given the whole solution.
There are at least two possibilities:
1. As noted in the comments, this is overkill, but it is possible:
Use eval() to evaluate a string as a Python expression. More at the link:
https://thepythonguru.com/python-builtin-functions/eval/
For example:
list_name = 'list_1'
eval('{}.append(new_ticker)'.format(list_name))
2. Use locals(), a dictionary of the variables in the current scope. This is similar to the other answers, but you do not have to build the dictionary by hand, which would also require knowing all the variable names. Note that locals() only sees the lists when it is called in the scope where they are defined; for module-level lists accessed from inside a function, use globals() instead.
list_name = 'list_1'
locals()[list_name].append(new_ticker)

json strip multiple lists [duplicate]

This question already has answers here:
Why can't Python parse this JSON data? [closed]
(3 answers)
Closed 5 years ago.
I am looking for more info regarding this issue I have. So far I have checked the JSON encoding/decoding but it was not precisely what I was looking for.
I am looking for some way to strip this kind of list quite easily:
//response
{
"age":[
{"#":"1","age":10},
{"#":"2","age":12},
{"#":"3","age":16},
{"#":"4","age":3}
],
"age2":[
{"#":"1","age":10},
{"#":"2","age":12},
{"#":"3","age":16},
{"#":"4","age":3}
],
"days_month":31,
"year":2017
}
So how do I easily extract the data? For example, I want to get the age of the person in age2 with # == 3.
To get the values for year and days_month, I found this solution with Google:
j=json.loads(r.content)
print(j['year'])
I have probably missed something somewhere on the internet, but I could not find a solution for this specific case.
I think this is what @Jean-François Fabre tried to indicate:
import json
response = """
{
"age":[
{"#":"1","age":10},
{"#":"2","age":12},
{"#":"3","age":16},
{"#":"4","age":3}
],
"age2":[
{"#":"1","age":10},
{"#":"2","age":12},
{"#":"3","age":16},
{"#":"4","age":3}
],
"days_month":31,
"year":2017
}
"""
j = json.loads(response)
# note that the [2] means the third element in the "age2" list-of-dicts
print(j['age2'][2]['#']) # -> 3
print(j['age2'][2]['age']) # -> 16
json.loads() converts a string in JSON format into a Python object. In particular, it converts JSON objects into Python dictionaries and JSON arrays into Python lists. This means you can access the contents of the result, stored here in the variable j, just as you would any native combination of those Python types (and it will look very similar to what is shown in the response).
As the search criterion you are looking for is not contained in the indices of the respective datastructures, I would do it using a list comprehension. For your example, this would be
[person['age'] for person in j['age2'] if person['#'] == u'3'][0]
This iterates through all the items in the list under 'age2' and collects the age of every item whose '#' is '3' into a list. The [0] selects the first entry of that list.
However, this scans the whole list on every lookup. If you have large datasets, you might want to have a look at pandas:
import pandas
df = pandas.DataFrame(j['age2'])
df[df['#'] == '3']['age']
which can be much faster as long as your data can be represented as a series or table.
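Without pandas, two small variations on the list comprehension may already be enough. This is a sketch that assumes the same structure as the response above, trimmed to the "age2" list: `next()` with a generator stops at the first match, and a dictionary built once makes repeated lookups cheap.

```python
import json

# Trimmed copy of the response from the question (only the "age2" list).
response = """
{"age2": [
    {"#": "1", "age": 10},
    {"#": "2", "age": 12},
    {"#": "3", "age": 16},
    {"#": "4", "age": 3}
]}
"""
j = json.loads(response)

# One-off lookup: stops at the first match and returns None if nothing matches.
age = next((person["age"] for person in j["age2"] if person["#"] == "3"), None)
print(age)  # 16

# Repeated lookups: build an index keyed by "#" once, then look up directly.
age_by_number = {person["#"]: person["age"] for person in j["age2"]}
print(age_by_number["4"])  # 3
```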

In Python, how to iterate more than once for an iterative object? [duplicate]

This question already has answers here:
Why can't I iterate twice over the same iterator? How can I "reset" the iterator or reuse the data?
(5 answers)
Closed 4 years ago.
I encountered some code that gets back an iterable object from the Dynamo database, and I can do:
print [en["student_id"] for en in enrollments]
However, when I do a similar thing again:
print [en["course_id"] for en in enrollments]
Then the second iteration prints nothing, because the object can be iterated only once and it has already reached its end.
The question is: how can we iterate over it more than once, (1) when it is known to contain only a few items, and (2) when we know it will contain lots of items (say a million) and we don't want to use a lot of additional memory?
Relatedly, I looked up rewind, and it seems to exist for PHP and Ruby, but not for Python?
enrollments is a generator. Either recreate the generator if you need to iterate again, or convert it to a list first:
enrollments = list(enrollments)
Take into account that APIs often use generators to avoid memory bloat; a list must have references to all objects it contains, so all those objects have to exist at the same time. A generator can produce the elements one by one, as needed; your list comprehension discards those objects again once the 'student_id' key has been extracted.
The alternative is to iterate just once, and do all the things with each object you want to do. So instead of running two list comprehensions, run one regular for loop and extract all the data you need in one place, appending to separate lists as you go along:
courses = []
students = []
for enrollment in enrollments:
    courses.append(enrollment['course_id'])
    students.append(enrollment['student_id'])
rewind in PHP is unrelated to this; Python has fileobj.seek(0) to do the same, but file objects are not generators.
import itertools
it1, it2 = itertools.tee(enrollments, 2)
This looks like the answer from here: Why can't I iterate twice over the same data?
But it is only suitable if you do not need to iterate too many times.
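A small runnable sketch of the tee approach, using a stand-in generator in place of the database result (`fake_enrollments` and its data are invented for illustration). Note that in CPython the count has to be passed to tee positionally.

```python
import itertools


def fake_enrollments():
    """Stand-in generator for the database result (invented data)."""
    yield {"student_id": 1, "course_id": "math"}
    yield {"student_id": 2, "course_id": "art"}


enrollments = fake_enrollments()

# Split the one-shot generator into two independent iterators.
# After this call the original `enrollments` should not be used directly.
it1, it2 = itertools.tee(enrollments, 2)

student_ids = [en["student_id"] for en in it1]
course_ids = [en["course_id"] for en in it2]

print(student_ids)  # [1, 2]
print(course_ids)   # ['math', 'art']
```

Because tee has to buffer every item that one iterator has consumed and the other has not, fully exhausting `it1` before touching `it2` keeps all items in memory, just as `list(enrollments)` would. For the million-item case, the single-pass loop shown earlier avoids that cost.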
