parse data from Dictionary in python

So I have this list of dictionaries, "runs":
[{
'id': 12,
'suite_id': 2,
'name': 'name',
'description': "desc.",
'nice_id': 3,
'joku_id': None,
'onko': False,
'eikai': False,
'tehty': None,
'config': None,
'config_ids': [],
'passed_count': 1,
'blocked_count': 2,
'untested_count': 3,
'retest_count': 4,
'failed_count': 5,
'custom_status1_count': 0,
'custom_status2_count': 0,
'custom_status3_count': 0,
'custom_status4_count': 0,
'custom_status5_count': 0,
'custom_status6_count': 0,
'custom_status7_count': 0,
'projekti_id': 1,
'plan_id': None,
'created_on': 12343214,
'created_by': 11,
'url': 'google.com'
}, {
'id': 16,
'suite_id': 2,
'name': 'namae)',
'description': "desc1",
'nice_id': 5,
'joku_id': None,
'onko': False,
'eikai': False,
'tehty': None,
'config': None,
'config_ids': [],
'passed_count': 100,
'blocked_count': 1,
'untested_count': 3,
'retest_count': 2,
'failed_count': 5,
'custom_status1_count': 0,
'custom_status2_count': 0,
'custom_status3_count': 0,
'custom_status4_count': 0,
'custom_status5_count': 0,
'custom_status6_count': 0,
'custom_status7_count': 0,
'prokti_id': 7,
'plan_id': None,
'created_on': 4321341644,
'created_by': 11,
'url': 'google.com/2' }
The key "id" appears about 50 times; the above is just a part of the data.
I need to find all the "id" values (not joku_id, nice_id, etc., only "id") and make a string/dict of them,
and the same for name and description.
I have tried:
j = json.load(run)
ids = (j["id"])
j = json.load(run)
names = (j["name"])
j = json.load(run)
descriptions = (j["description"])
but it returns:
AttributeError: 'list' object has no attribute 'read'
I also need to send a request with a specific id; in this case the specific id is selected by the index o, so id[o].
The request code is below:
test = client.send_get('get_tests/1/ ')
So I need to have id[o] instead of the 1.
I have tried:
test = client.send_get('get_tests/' + id[o] + '/ ')
but it returns:
TypeError: 'int' object is not subscriptable

Maybe this can help you:
id = []
for i in runs:
    id.append(i.get('id'))
print(id)
Output:
[12, 16]

You are trying to pass a list to the json.load function. Please read the docs: load() does not accept lists; it accepts
a .read()-supporting file-like object containing a JSON document. Since your data is already a Python list, no JSON parsing is needed.

If you want your result as a list of dictionaries, then:
result = [{x: y} for i in range(len(data)) for x, y in data[i].items() if x == 'id' or x == 'name' or x == 'description']
Output:
[{'name': 'name'}, {'id': 12}, {'description': 'desc.'}, {'name': 'namae)'}, {'id': 16}, {'description': 'desc1'}]
Here data is your list of dictionaries.
I hope this answer is helpful.
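
Putting the answers above together, a minimal sketch that collects the three fields and builds the request path might look like the following. It assumes runs is already a Python list of dicts (so json.load is not needed), uses a shortened two-entry sample of the data from the question, and treats o as a plain list index. The id values are ints, so they have to be converted with str() before being joined to the path. The call to client.send_get is left as a comment because the client object comes from the question and is not defined here.

```python
# Shortened sample of the "runs" data from the question (assumption: same shape).
runs = [
    {'id': 12, 'suite_id': 2, 'name': 'name', 'description': 'desc.', 'nice_id': 3},
    {'id': 16, 'suite_id': 2, 'name': 'namae)', 'description': 'desc1', 'nice_id': 5},
]

# Only the exact keys are read, so 'suite_id' and 'nice_id' are ignored.
ids = [run['id'] for run in runs]
names = [run['name'] for run in runs]
descriptions = [run['description'] for run in runs]

o = 1  # hypothetical index of the run we want to query
endpoint = 'get_tests/' + str(ids[o])  # str() avoids mixing int and str
# test = client.send_get(endpoint)     # client is the object from the question

print(ids)
print(endpoint)
```

The second error in the question, 'int' object is not subscriptable, suggests that id held a single integer rather than a list at that point, which is why the sketch indexes the ids list instead.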

Related

OperationFailure: unknown top level operator: $ne (MongoDB)

What is wrong with this code? When I try to run it I get OperationFailure: unknown top level operator: $ne full error: {'ok': 0.0, 'errmsg': 'unknown top level operator: $ne', 'code': 2, 'codeName': 'BadValue'}.
Any ideas what this means? Thank you in advance :)
import pandas as pd
def length_vs_references(articles):
    res = {"1-5" : 0, "6-10" : 0, "11-15" : 0, "16-20" : 0, "21-25" : 0, "25-30" : 0, ">30" :0}
    n = {"1-5" : 0, "6-10" : 0, "11-15" : 0, "16-20" : 0, "21-25" : 0, "25-30" : 0, ">30" :0}
    cursor = articles.aggregate([
        {'$match': {'$and' : [{'references': {'$exists': False}
        }, {'$ne':['$page_end', '']}, {'$ne':['$page_start', '']} ]}},
        {'$project': {'len_refernces': {"$size": '$references'},
                      'pages': {'$subtract': [{"$toInt": 'page_end'},
                                              {"$toInt" : 'page_start'}]}}},
        {'$bucket' :{
            '$groupBy': '$pages',
            'boundaries': [ 0, 6, 11, 16, 21, 26, 31, 1000000],
            'default': 'Other',
            'key': {
                'output': {"average": {"$avg" : '$len_references'}},
            }
        }
        }
    ])
    return cursor
print(length_vs_references(articles))

Reading between the lines I suspect you want:
cursor = articles.aggregate([
{'$match': {'references': {'$exists': False}, 'page_end': {'$ne': ''}, 'page_start': {'$ne': ''}}},
{'$project': {'len_refernces': {"$size": '$references'},
'pages': {'$subtract': [{"$toInt": '$page_end'},
{"$toInt": '$page_start'}]}}},
{'$bucket': {
'groupBy': '$pages',
'boundaries': [0, 6, 11, 16, 21, 26, 31, 1000000],
'default': 'Other'
}
}
])
You don't need to AND your match filters as they are ANDed by default. I'm guessing you are trying to filter out blank page_end and page_start items. If not, please describe what you are trying to do.
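
For background on the error itself: $match takes a query document, in which operators are nested under the field they test. The two-argument form {'$ne': [a, b]} is an aggregation expression, and as far as I know it is only accepted in $match when wrapped in $expr (MongoDB 3.6 and later). A small sketch of both spellings, built as plain Python data so it can be inspected without a server; the variable names are illustrative only:

```python
# Query-operator form: the operator sits under the field it tests.
match_query_form = {'$match': {
    'page_end': {'$ne': ''},
    'page_start': {'$ne': ''},
}}

# Aggregation-expression form: field paths use a '$' prefix and the
# whole expression must be wrapped in $expr.
match_expr_form = {'$match': {'$expr': {'$and': [
    {'$ne': ['$page_end', '']},
    {'$ne': ['$page_start', '']},
]}}}

# Either dict could be used as the first pipeline stage, e.g.
# articles.aggregate([match_query_form, ...])
print(match_query_form)
print(match_expr_form)
```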

Fuzzywuzzy for a list of dictionaries

I have a list of dictionaries (API response) and I use the following function to search for certain nations:
def nation_search(self):
    result = next((item for item in nations_v2 if (item["nation"]).lower() == (f"{self}").lower()), False)
    if result:
        return result
    else:
        return next((item for item in nations_v2 if (item["leader"]).lower() == (f"{self}").lower()), False)
Two example entries:
nations_v2 = [{'nation_id': 5270, 'nation': 'Indo-Froschtia', 'leader': 'Saxplayer', 'continent': 2, 'war_policy': 4, 'domestic_policy': 2, 'color': 15, 'alliance_id': 790, 'alliance': 'Rose', 'alliance_position': 3, 'cities': 28, 'offensive_wars': 0, 'defensive_wars': 0, 'score': 4945, 'v_mode': False, 'v_mode_turns': 0, 'beige_turns': 0, 'last_active': '2020-08-10 04:04:48', 'founded': '2014-08-05 00:09:31', 'soldiers': 0, 'tanks': 0, 'aircraft': 2100, 'ships': 0, 'missiles': 0, 'nukes': 0},
{'nation_id': 582, 'nation': 'Nightsilver Woods', 'leader': 'Luna', 'continent': 4, 'war_policy': 4, 'domestic_policy': 2, 'color': 10, 'alliance_id': 615, 'alliance': 'Seven Kingdoms', 'alliance_position': 2, 'cities': 23, 'offensive_wars': 0, 'defensive_wars': 0, 'score': 3971.25, 'v_mode': False, 'v_mode_turns': 0, 'beige_turns': 0, 'last_active': '2020-08-10 00:22:16', 'founded': '2014-08-05 00:09:35', 'soldiers': 0, 'tanks': 0, 'aircraft': 1725, 'ships': 115, 'missiles': 0, 'nukes': 0}]
I want to add a fuzzy-search using fuzzywuzzy to get 5 possible matches in case there's a spelling error in the argument passed into the function but I can't seem to figure it out.
I only want to search in values for nation and leader.

If you need 5 possible matches, use process.extract:
from fuzzywuzzy import process
def nation_search(self):
    nations_only = [v2['nation'].lower() for v2 in nations_v2]
    leaders_only = [v2['leader'].lower() for v2 in nations_v2]
    matched_nations = process.extract((f"{self}").lower(), nations_only, limit=5)
    matched_leaders = process.extract((f"{self}").lower(), leaders_only, limit=5)
    return matched_nations, matched_leaders
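
The function above returns matched strings with scores, not the original dictionaries. As a rough sketch of how the matches could be mapped back to the full records, the example below swaps in the standard library's difflib.get_close_matches in place of fuzzywuzzy, so it runs without extra packages. The function name fuzzy_nation_search, the cutoff of 0.6, and the trimmed sample data are assumptions made for illustration.

```python
from difflib import get_close_matches

# Trimmed sample of the API response from the question.
nations_v2 = [
    {'nation_id': 5270, 'nation': 'Indo-Froschtia', 'leader': 'Saxplayer'},
    {'nation_id': 582, 'nation': 'Nightsilver Woods', 'leader': 'Luna'},
]

def fuzzy_nation_search(query, nations, limit=5, cutoff=0.6):
    """Return up to `limit` nation dicts whose nation or leader resembles `query`."""
    # Map each lowercased searchable value back to its full record.
    lookup = {}
    for item in nations:
        for key in ('nation', 'leader'):
            lookup.setdefault(item[key].lower(), item)
    close = get_close_matches(query.lower(), list(lookup), n=limit, cutoff=cutoff)
    # Drop duplicates (same record matched by nation and leader), keeping order.
    results = []
    for name in close:
        if lookup[name] not in results:
            results.append(lookup[name])
    return results

matches = fuzzy_nation_search('Saxplayr', nations_v2)  # misspelled leader name
print(matches)
```

The same lookup-dictionary idea should carry over to process.extract: search the lowercased names, then use each returned string to fetch its record.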

Why is this weird ID returned when using todoist python api?

This question is regarding the use of todoist python api.
After adding an item (that's what a task is called in the API) I get these weird-looking IDs. I say weird because a regular id is just an integer. These IDs are not usable in any way; I can't call api.items.get_by_id() with such an id. What is going on? How do I get out of this weird state?
'reply email': 'b5b4eb2c-b28f-11e9-bd8d-80e6500af142',
I printed out a few more IDs, and all the integer ones work well with all API calls, the UUID ones throw an exception.
3318771761
3318771783
3318771807
3318771823
3318771843
61c30a10-b2a0-11e9-98d7-80e6500af142
62326586-b2a0-11e9-98d7-80e6500af142
62a3ea9e-b2a0-11e9-98d7-80e6500af142
631222ac-b2a0-11e9-98d7-80e6500af142
63816338-b2a0-11e9-98d7-80e6500af142
63efd14c-b2a0-11e9-98d7-80e6500af142

You have to call api.commit() to do the sync and get the final id. The id you're trying to use is just a temporary id that is valid only until the sync is done.
>>> import todoist; import os; token = os.environ.get('token'); api = todoist.TodoistAPI(token); item=api.items.add('test');
>>> item
Item({'content': 'test',
'id': '3b4d77c0-b891-11e9-a080-2afbabeedbe3',
'project_id': 1490600000})
>>> api.commit()
{'full_sync': False, 'sync_status': {'3b4d793c-b891-11e9-a080-2afbabeaeaef': 'ok'}, 'temp_id_mapping': {'3b4d77c0-b891-11e9-a080-2afbabeaeaef': 3331113774}, 'labels': [], 'project_notes': [], 'filters': [], 'sync_token': 'EEVprctG1E39VDqJfu_cwyhpO6rkOaavyU5r70Eu0nY1ZjsWSjssGr4qLLLikucJAu_Zakld7DniBsEyZ7i820dqcZNcOAbUcbzHFpMpSjzr-GALTA', 'day_orders': {}, 'projects': [], 'collaborators': [], 'day_orders_timestamp': '1564672738.25', 'live_notifications_last_read_id': 2259500000, 'items': [{'legacy_project_id': 1484800000, 'day_order': -1, 'assigned_by_uid': 5050000, 'labels': [], 'sync_id': None, 'section_id': None, 'in_history': 0, 'child_order': 3, 'date_added': '2019-08-06T21:29:18Z', 'id': 3331113774, 'content': 'test2', 'checked': 0, 'user_id': 5050000, 'due': None, 'priority': 1, 'parent_id': None, 'is_deleted': 0, 'responsible_uid': None, 'project_id': 1490600000, 'date_completed': None, 'collapsed': 0}], 'notes': [], 'reminders': [], 'due_exceptions': [], 'live_notifications': [], 'sections': [], 'collaborator_states': []}
>>> item
Item({'assigned_by_uid': 5050000,
'checked': 0,
'child_order': 3,
'collapsed': 0,
'content': 'test',
'date_added': '2019-08-06T21:29:18Z',
'date_completed': None,
'day_order': -1,
'due': None,
'id': 3331100000,
'in_history': 0,
'is_deleted': 0,
'labels': [],
'legacy_project_id': 148400000,
'parent_id': None,
'priority': 1,
'project_id': 1490660000,
'responsible_uid': None,
'section_id': None,
'sync_id': None,
'user_id': 5050000})
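
The commit output above includes a 'temp_id_mapping' entry that pairs each temporary UUID with its permanent integer id. A minimal sketch of using that mapping follows; resolve_item_id is a hypothetical helper, not part of the todoist library, and commit_response is a trimmed-down stand-in for the dict returned by api.commit().

```python
def resolve_item_id(item_id, commit_response):
    """Return the permanent id for item_id, or item_id itself if no mapping exists."""
    mapping = commit_response.get('temp_id_mapping', {})
    return mapping.get(item_id, item_id)

# Stand-in for the api.commit() result, keeping only the relevant key.
commit_response = {
    'temp_id_mapping': {'3b4d77c0-b891-11e9-a080-2afbabeaeaef': 3331113774},
}

# A temporary UUID resolves to its integer id.
print(resolve_item_id('3b4d77c0-b891-11e9-a080-2afbabeaeaef', commit_response))
# An id that is already permanent is returned unchanged.
print(resolve_item_id(3318771761, commit_response))
```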

Access specific dict value in Python

I have a dict object structured in this way:
{'snapshots': [{'snapshot': 'test_2018.11.19', 'uuid': 'Lv1C02wIRYGIljr3S16eIQ', 'version_id': 5060699, 'version': '5.6.6', 'indices': ['cribiscom_x_mydocs_entries_201712'], 'state': 'SUCCESS', 'start_time': '2018-11-19T16:57:44.014Z', 'start_time_in_millis': 1542646664014, 'end_time': '2018-11-19T16:57:46.380Z', 'end_time_in_millis': 1542646666380, 'duration_in_millis': 2366, 'failures': [], 'shards': {'total': 3, 'failed': 0, 'successful': 3}}]}
I would like to get the value of the key state, but I don't really understand how to do that, since 'snapshots' sits inside a dict and its value is itself a composite object.
Can anyone explain it to me?

This is just a matter of accessing values from a dictionary. You can do it like this:
mydict = {'snapshots': [{'snapshot': 'test_2018.11.19', 'uuid':'Lv1C02wIRYGIljr3S16eIQ', 'version_id': 5060699, 'version': '5.6.6', 'indices': ['cribiscom_x_mydocs_entries_201712'], 'state': 'SUCCESS', 'start_time': '2018-11-19T16:57:44.014Z', 'start_time_in_millis': 1542646664014, 'end_time': '2018-11-19T16:57:46.380Z', 'end_time_in_millis': 1542646666380, 'duration_in_millis': 2366, 'failures': [], 'shards': {'total': 3, 'failed': 0, 'successful': 3}}]}
value = mydict['snapshots'][0]['state']
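
To make the chained lookup easier to follow, here is the same access broken into steps, plus a variant that collects the state of every snapshot in the list. The data is a shortened version of the dict from the question; the intermediate variable names are only for illustration.

```python
# Shortened version of the dict from the question.
mydict = {'snapshots': [
    {'snapshot': 'test_2018.11.19', 'state': 'SUCCESS',
     'shards': {'total': 3, 'failed': 0, 'successful': 3}},
]}

# Step by step: outer dict -> list -> inner dict -> value.
snapshots = mydict['snapshots']  # a list of dicts
first = snapshots[0]             # the first (and here only) dict
state = first['state']           # the string stored under 'state'

# If the list may hold several snapshots, collect all their states.
states = [snap['state'] for snap in mydict['snapshots']]

print(state)
print(states)
```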

limit() and sort() order pymongo and mongodb

Despite reading people's answers stating that the sort is done first, the evidence shows something different: the limit is done before the sort. Is there a way to force the sort to always go first?
views = mongo.db.view_logging.find().sort([('count', 1)]).limit(10)
Whether I use .sort().limit() or .limit().sort(), the limit takes precedence. I wonder if this is something to do with pymongo...

According to the documentation, regardless of which goes first in your chain of commands, sort() is always applied before limit().
You can also study the .explain() results of your query and look at the execution stages - you will find that the sorting input stage examines all of the filtered documents (in your case, all documents in the collection) and then the limit is applied.
Let's go through an example.
Imagine there is a foo database with a test collection having 6 documents:
>>> col = db.foo.test
>>> for doc in col.find():
...     print(doc)
{'time': '2016-03-28 12:12:00', '_id': ObjectId('56f9716ce4b05e6b92be87f2'), 'value': 90}
{'time': '2016-03-28 12:13:00', '_id': ObjectId('56f971a3e4b05e6b92be87fc'), 'value': 82}
{'time': '2016-03-28 12:14:00', '_id': ObjectId('56f971afe4b05e6b92be87fd'), 'value': 75}
{'time': '2016-03-28 12:15:00', '_id': ObjectId('56f971b7e4b05e6b92be87ff'), 'value': 72}
{'time': '2016-03-28 12:16:00', '_id': ObjectId('56f971c0e4b05e6b92be8803'), 'value': 81}
{'time': '2016-03-28 12:17:00', '_id': ObjectId('56f971c8e4b05e6b92be8806'), 'value': 90}
Now, let's execute queries with different order of sort() and limit() and check the results and the explain plan.
Sort and then limit:
>>> from pprint import pprint
>>> cursor = col.find().sort([('time', 1)]).limit(3)
>>> sort_limit_plan = cursor.explain()
>>> pprint(sort_limit_plan)
{u'executionStats': {u'allPlansExecution': [],
u'executionStages': {u'advanced': 3,
u'executionTimeMillisEstimate': 0,
u'inputStage': {u'advanced': 6,
u'direction': u'forward',
u'docsExamined': 6,
u'executionTimeMillisEstimate': 0,
u'filter': {u'$and': []},
u'invalidates': 0,
u'isEOF': 1,
u'nReturned': 6,
u'needFetch': 0,
u'needTime': 1,
u'restoreState': 0,
u'saveState': 0,
u'stage': u'COLLSCAN',
u'works': 8},
u'invalidates': 0,
u'isEOF': 1,
u'limitAmount': 3,
u'memLimit': 33554432,
u'memUsage': 213,
u'nReturned': 3,
u'needFetch': 0,
u'needTime': 8,
u'restoreState': 0,
u'saveState': 0,
u'sortPattern': {u'time': 1},
u'stage': u'SORT',
u'works': 13},
u'executionSuccess': True,
u'executionTimeMillis': 0,
u'nReturned': 3,
u'totalDocsExamined': 6,
u'totalKeysExamined': 0},
u'queryPlanner': {u'indexFilterSet': False,
u'namespace': u'foo.test',
u'parsedQuery': {u'$and': []},
u'plannerVersion': 1,
u'rejectedPlans': [],
u'winningPlan': {u'inputStage': {u'direction': u'forward',
u'filter': {u'$and': []},
u'stage': u'COLLSCAN'},
u'limitAmount': 3,
u'sortPattern': {u'time': 1},
u'stage': u'SORT'}},
u'serverInfo': {u'gitVersion': u'6ce7cbe8c6b899552dadd907604559806aa2e9bd',
u'host': u'h008742.mongolab.com',
u'port': 53439,
u'version': u'3.0.7'}}
Limit and then sort:
>>> cursor = col.find().limit(3).sort([('time', 1)])
>>> limit_sort_plan = cursor.explain()
>>> pprint(limit_sort_plan)
{u'executionStats': {u'allPlansExecution': [],
u'executionStages': {u'advanced': 3,
u'executionTimeMillisEstimate': 0,
u'inputStage': {u'advanced': 6,
u'direction': u'forward',
u'docsExamined': 6,
u'executionTimeMillisEstimate': 0,
u'filter': {u'$and': []},
u'invalidates': 0,
u'isEOF': 1,
u'nReturned': 6,
u'needFetch': 0,
u'needTime': 1,
u'restoreState': 0,
u'saveState': 0,
u'stage': u'COLLSCAN',
u'works': 8},
u'invalidates': 0,
u'isEOF': 1,
u'limitAmount': 3,
u'memLimit': 33554432,
u'memUsage': 213,
u'nReturned': 3,
u'needFetch': 0,
u'needTime': 8,
u'restoreState': 0,
u'saveState': 0,
u'sortPattern': {u'time': 1},
u'stage': u'SORT',
u'works': 13},
u'executionSuccess': True,
u'executionTimeMillis': 0,
u'nReturned': 3,
u'totalDocsExamined': 6,
u'totalKeysExamined': 0},
u'queryPlanner': {u'indexFilterSet': False,
u'namespace': u'foo.test',
u'parsedQuery': {u'$and': []},
u'plannerVersion': 1,
u'rejectedPlans': [],
u'winningPlan': {u'inputStage': {u'direction': u'forward',
u'filter': {u'$and': []},
u'stage': u'COLLSCAN'},
u'limitAmount': 3,
u'sortPattern': {u'time': 1},
u'stage': u'SORT'}},
u'serverInfo': {u'gitVersion': u'6ce7cbe8c6b899552dadd907604559806aa2e9bd',
u'host': u'h008742.mongolab.com',
u'port': 53439,
u'version': u'3.0.7'}}
As you can see, in both cases the sort is applied first and affects all the 6 documents and then the limit limits the results to 3.
And, the execution plans are exactly the same:
>>> from copy import deepcopy # just in case
>>> cursor = col.find().sort([('time', 1)]).limit(3)
>>> sort_limit_plan = deepcopy(cursor.explain())
>>> cursor = col.find().limit(3).sort([('time', 1)])
>>> limit_sort_plan = deepcopy(cursor.explain())
>>> sort_limit_plan == limit_sort_plan
True
Also see:
How do you tell Mongo to sort a collection before limiting the results?

The MongoDB documentation states that, for a find() query, sort() is applied first, then skip() sets the starting point within the sorted results, and limit() is applied last.
This holds regardless of the order in your code. The reason is that Mongo collects all the methods for the query, arranges the sort, skip and limit steps in that fixed order, and then runs the query.
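
If limiting before sorting is ever what is actually wanted, one option is an aggregation pipeline, whose stages are applied in the order they are listed. The sketch below only builds the two pipelines as plain Python data; run_pipeline is a hypothetical helper and assumes collection is a pymongo Collection, and the field name 'time' is borrowed from the example in the answer above.

```python
# Stages run top to bottom: take the first 3 documents, then sort those 3.
limit_then_sort = [
    {'$limit': 3},
    {'$sort': {'time': 1}},  # 1 means ascending
]

# Sort the whole collection, then keep the first 3 (same result as find().sort().limit()).
sort_then_limit = [
    {'$sort': {'time': 1}},
    {'$limit': 3},
]

def run_pipeline(collection, pipeline):
    """Hypothetical helper: run a pipeline and return the documents as a list."""
    return list(collection.aggregate(pipeline))

print(limit_then_sort)
print(sort_then_limit)
```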

I suspect you're passing the wrong key in the sort parameter, something like "$key_name" instead of just "key_name".
See "How do you tell Mongo to sort a collection before limiting the results?" for a solution to the same problem as yours.

Logically it should be whatever comes first in the pipeline, but MongoDB always sorts before it limits.
In my test, the sort operation takes precedence regardless of whether it comes before or after the limit. However, this appears to be very strange behavior to me.
My sample dataset is:
[
{
"_id" : ObjectId("56f845fea524b4d098e0ef81"),
"number" : 48.98052410874508
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef82"),
"number" : 50.98747461471063
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef83"),
"number" : 81.32911244349772
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef84"),
"number" : 87.95549919039071
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef85"),
"number" : 81.63582683594402
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef86"),
"number" : 43.25696270026136
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef87"),
"number" : 88.22046335409453
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef88"),
"number" : 64.00556739160076
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef89"),
"number" : 16.09353150244296
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef8a"),
"number" : 17.46667776660574
}
]
Python test code:
import pymongo
client = pymongo.MongoClient("mongodb://localhost:27017")
database = client.get_database("test")
collection = database.get_collection("collection")
print("----------------[limit -> sort]--------------------------")
result = collection.find().limit(5).sort([("number", pymongo.ASCENDING)])
for r in result:
    print(r)
print("----------------[sort -> limit]--------------------------")
result = collection.find().sort([("number", pymongo.ASCENDING)]).limit(5)
for r in result:
    print(r)
Result:
----------------[limit -> sort]--------------------------
{u'_id': ObjectId('56f845fea524b4d098e0ef89'), u'number': 16.09353150244296}
{u'_id': ObjectId('56f845fea524b4d098e0ef8a'), u'number': 17.46667776660574}
{u'_id': ObjectId('56f845fea524b4d098e0ef86'), u'number': 43.25696270026136}
{u'_id': ObjectId('56f845fea524b4d098e0ef81'), u'number': 48.98052410874508}
{u'_id': ObjectId('56f845fea524b4d098e0ef82'), u'number': 50.98747461471063}
----------------[sort -> limit]--------------------------
{u'_id': ObjectId('56f845fea524b4d098e0ef89'), u'number': 16.09353150244296}
{u'_id': ObjectId('56f845fea524b4d098e0ef8a'), u'number': 17.46667776660574}
{u'_id': ObjectId('56f845fea524b4d098e0ef86'), u'number': 43.25696270026136}
{u'_id': ObjectId('56f845fea524b4d098e0ef81'), u'number': 48.98052410874508}
{u'_id': ObjectId('56f845fea524b4d098e0ef82'), u'number': 50.98747461471063}
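
To see why this output means the sort ran first, the two possible orders can be simulated in plain Python on the same ten numbers, with no database involved. This sketch assumes that an unsorted find() would return the documents in the order they are listed in the sample dataset; the variable names are illustrative.

```python
# The "number" values from the sample dataset, in listed order.
numbers = [
    48.98052410874508, 50.98747461471063, 81.32911244349772,
    87.95549919039071, 81.63582683594402, 43.25696270026136,
    88.22046335409453, 64.00556739160076, 16.09353150244296,
    17.46667776660574,
]

# Sort everything, then keep five: what MongoDB did in both runs.
sort_first = sorted(numbers)[:5]

# Keep the first five, then sort them: what limit-before-sort would give.
limit_first = sorted(numbers[:5])

# The values printed by both queries in the result above.
observed = [
    16.09353150244296, 17.46667776660574, 43.25696270026136,
    48.98052410874508, 50.98747461471063,
]

print(sort_first == observed)
print(limit_first == observed)
```

Only the sort-first simulation reproduces the printed result; a limit-first run would have returned values such as 81.32911244349772, which never appear.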

The accepted answer didn't work for me, but this does:
last5 = db.collection.find( {'key': "YOURKEY"}, sort=[( '_id', pymongo.DESCENDING )] ).limit(5)
with the limit outside and the sort inside of the find argument.
