How to use Counter in Python for dictionaries

I'm trying to count the employees' titles.
I've tried a lot of approaches, but I don't think I've applied them correctly to this scenario.
employees = [
    {
        "email": "jonathan2532.calderon#gmail.com",
        "employee_id": 101,
        "firstname": "Jonathan",
        "lastname": "Calderon",
        "title": "Mr",
        "work_phone": "(02) 3691 5845"
    }]
EDIT:
from collections import Counter
class Employee:
    def __init__(self, title,):
        self.title = title
title_count = Counter()
for employee in [Employee("title") for data in employees]:
    title_count[employee.title,] += 1
print(title_count)
Counter({('title',): 4})
I can't seem to get the specific names there.

In your example, for title in employees actually yields a dict object on every iteration, since employees is a list of dict objects. While Counter accepts a dict mapping as input, that isn't what you're looking for here. The cnt['title'] increment simply adds 1 on each iteration, so it effectively counts the number of dict objects in the employees list.
To count by title, you first have to pull the title out of each dict object in your list.
>>> from collections import Counter
>>> titles = [e['title'] for e in employees]
>>> Counter(titles)
Counter({'Mr': 2, 'Mrs': 1, 'Ms': 1})
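The same count can be built without the intermediate list. This is a minimal sketch, assuming sample data shaped like the question's list of dicts (the four records below are made up for illustration):

```python
from collections import Counter

# Hypothetical sample data with the same shape as the question's list of dicts.
employees = [
    {"employee_id": 1, "title": "Mr"},
    {"employee_id": 2, "title": "Mr"},
    {"employee_id": 3, "title": "Mrs"},
    {"employee_id": 4, "title": "Ms"},
]

# Feed the titles straight into Counter with a generator expression.
title_count = Counter(e["title"] for e in employees)

print(title_count["Mr"])           # 2
print(title_count.most_common(1))  # [('Mr', 2)]
```

Counter is a dict subclass, so title_count["Mr"] works like a normal lookup, and a title that never occurred returns 0 instead of raising KeyError.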

A few things here. Welcome to Stack Overflow. Please read how to ask a good question. Next, Python is trying to help you out with the error it is giving you.
Try copying and pasting a portion of the error into Google. Then, visit the docs on the data type you are trying to use. I think your question has been edited, but it will still help.
Finally, we need to see a minimal, complete, and verifiable example, that is, the code you're using to try to solve your problem.
It helps to think about the structure of your data:
from collections import Counter
class Employee:
    def __init__(self, title, employee_id):
        # all other fields omitted
        self.title = title
        self.employee_id = employee_id
Here is some minimal data for your problem (arguably you could use a little less).
employees = [
    {
        "title": "Mr",
        "employee_id": 1
    },
    {
        "title": "Mr",
        "employee_id": 2
    },
    {
        "title": "Mrs",
        "employee_id": 3
    },
    {
        "title": "Ms",
        "employee_id": 4
    }
]
Define other necessary data structures.
title_count = Counter()
# Just to demo results.
for employee in [Employee(**data) for data in employees]:
    print(f"title: {employee.title} id: {employee.employee_id}")
I'll leave the **data notation up to google. But now you have some well-structured data and can process it accordingly.
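For readers who would rather not search for it, here is a small sketch of what the ** notation does, reusing the two-field Employee class from this answer. The names unpacked and explicit are only for illustration:

```python
class Employee:
    def __init__(self, title, employee_id):
        self.title = title
        self.employee_id = employee_id

data = {"title": "Mr", "employee_id": 1}

# ** unpacks the dict into keyword arguments, so these two calls are equivalent.
unpacked = Employee(**data)
explicit = Employee(title="Mr", employee_id=1)

print(unpacked.title, unpacked.employee_id)  # Mr 1
```

Note that the dict keys must match the parameter names of __init__. A dict with an extra key such as "email" would raise a TypeError with this class.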
# Now we have some Employee objects with named fields that are
# easier to work with.
for employee in [Employee(**data) for data in employees]:
    title_count[employee.title] += 1
print(title_count) # Counter({'Mr': 2, 'Mrs': 1, 'Ms': 1})

Related

Create dictionary using JSON data

I have a JSON file that has movie data in it. I want to create a dictionary that has the movie title as the key and a count of how many actors are in that movie as the value. An example from the JSON file is below:
{
    "title": "Marie Antoinette",
    "year": "2006",
    "genre": "Drama",
    "summary": "Based on Antonia Fraser's book about the ill-fated Archduchess of Austria and later Queen of France, 'Marie Antoinette' tells the story of the most misunderstood and abused woman in history, from her birth in Imperial Austria to her later life in France.",
    "country": "USA",
    "director": {
        "last_name": "Coppola",
        "first_name": "Sofia",
        "birth_date": "1971"
    },
    "actors": [
        {
            "first_name": "Kirsten",
            "last_name": "Dunst",
            "birth_date": "1982",
            "role": "Marie Antoinette"
        },
        {
            "first_name": "Jason",
            "last_name": "Schwartzman",
            "birth_date": "1980",
            "role": "Louis XVI"
        }
    ]
}
I have the following but it's counting all of the actors from all of the movies instead of each movie and the number of actors per movie. I'm not sure how to do this correctly as I'm newer to Python so help would be great.
import json
def actor_count(json_data):
    with open("movies_db.json", 'r') as file:
        data = json.load(file)
        for t in data:
            title = [t['title'] for t in data]
            for element in data:
                for actor in element['actors']:
                    rolee = [actor['role'] for movie in data for actor in movie['actors']]
                    len_role = [len(role)]
            newD = dict(zip(title, len_role))
            print(newD)
json_data = open('movies_db.json')
actor_count(json_data)
You show json that only contains a dictionary, yet you seem to process it as if it were a list of dictionaries with the structure you have shown. Pending clarification, I am answering here as if the latter is true -- you have a list of dictionaries, since you would be asking a different question about a different error if this was not the case.
In your function, each element of data is a dictionary that contains the information for a single movie. To get a dict correlating the title to the count of actors in this movie, you just need to access the "title" key and the length of the "actors" key for each element.
def actor_count(json_data):
    movie_actors = {}
    for movie in json_data:
        title = movie["title"]
        num_actors = len(movie["actors"])
        movie_actors[title] = num_actors
    return movie_actors
Alternatively, use a dictionary comprehension to build this dictionary:
def actor_count(json_data):
    movie_actors = {movie["title"]: len(movie["actors"]) for movie in json_data}
    return movie_actors
Now, load your json file once, and use that when you call actor_count. This will return a dictionary mapping each movie title to the number of actors.
with open("movies_db.json", 'r') as file:
    data = json.load(file)
actor_count(data)
Note that loading the json file again in the function is unnecessary, since you already did it before calling the function, and are passing the parsed object to the function.
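To try this without the file, here is a self-contained sketch. It assumes, as this answer does, that the file holds a JSON array of movie objects. The raw string below is a made-up stand-in for movies_db.json, and the second movie is invented for illustration:

```python
import json

def actor_count(movies):
    """Map each movie title to its number of actors."""
    return {movie["title"]: len(movie["actors"]) for movie in movies}

# Hypothetical stand-in for the contents of movies_db.json (a JSON array).
raw = """
[
  {"title": "Marie Antoinette",
   "actors": [{"last_name": "Dunst"}, {"last_name": "Schwartzman"}]},
  {"title": "Untitled Example", "actors": []}
]
"""

movies = json.loads(raw)  # parse once, outside the function
counts = actor_count(movies)
print(counts)  # {'Marie Antoinette': 2, 'Untitled Example': 0}
```

Keeping the parsing outside the function means actor_count can be tested with any list of dicts, not only with data read from disk.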
If you want to keep your current logic of using list comprehensions, and then zipping the resultant lists to create a dict, that is also possible although slightly less efficient. There are significant changes you will need to make:
def actor_count(json_data):
    title = [t['title'] for t in json_data]
    n_actors = [len(t['actors']) for t in json_data]
    newD = dict(zip(title, n_actors))
    return newD
As before, no need to read the file again in the function
You're already looping over all elements in json_data as part of the list comprehension, so no need for another loop outside this.
You can get the number of actors simply by len(t['actors'])
You seem to have misconceptions about how list comprehensions and loops work. A list comprehension is a self-contained loop that builds a list. If you have a list comprehension, there's usually no need to surround it by the same for ... in ... statement that already exists in the comprehension.
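As a rough illustration of that point, the sketch below builds the same list twice, once with an explicit loop and once with a comprehension. The two-movie list is made up for the example:

```python
# Hypothetical data, just large enough to show the equivalence.
movies = [
    {"title": "A", "actors": ["x", "y"]},
    {"title": "B", "actors": ["z"]},
]

# Explicit loop: create an empty list and append on each iteration.
titles_loop = []
for movie in movies:
    titles_loop.append(movie["title"])

# Comprehension: the loop is already inside the brackets.
titles_comp = [movie["title"] for movie in movies]

print(titles_loop == titles_comp)  # True
```

Wrapping the comprehension in another for movie in movies loop would only rebuild the same list once per movie.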
def actor_count(json_data):
    newD = dict()
    with open("movies_db.json", 'r') as file:
        data = json.load(file)
    for t in data:
        if t == 'title':
            title_ = json_data[t]
            newD[ title_ ] = 0
        if t == 'actors':
            newD[ title_ ] = len(json_data[t])
    print(newD)
Output:
{'Marie Antoinette': 2}

How to add a key if it is not present in a JSON Object

I have a JSON array. I'm iterating over it and trying to print the value of a particular key from each JSON object, but I am getting a KeyError.
employees = [
    {
        "id": "101",
        "name": "abc",
        "mobile": "123"
    },
    {
        "id": "102",
        "name": "xyz"
    }
]
for employee in employees:
    print(employee['mobile'])
I want to add the key 'mobile' to each JSON object where 'mobile' does not exist. Can you please help me? How can I do this in Python?
mobile = "mobile"
for employee in employees:
    if mobile not in employee:
        employee[mobile]=123999
Your second employee object has no 'mobile' key, so the code raises a KeyError, as it should. To avoid that, you can use an if statement to skip printing the mobile key when it doesn't exist:
for employee in employees:
    if 'mobile' in employee:
        print(employee['mobile'])
You can also add the key to existing objects:
for employee in employees:
    # setdefault only adds the key when it is missing, so existing numbers are kept
    employee.setdefault('mobile', '123')
    print(employee['mobile'])
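Two built-in dict methods cover both cases. The sketch below uses the question's data; the fallback values "n/a" and "unknown" are placeholders chosen for illustration:

```python
employees = [
    {"id": "101", "name": "abc", "mobile": "123"},
    {"id": "102", "name": "xyz"},
]

# dict.get returns a fallback instead of raising KeyError; the dict is not modified.
mobiles = [employee.get("mobile", "n/a") for employee in employees]
print(mobiles)  # ['123', 'n/a']

# dict.setdefault inserts the key only when it is missing.
for employee in employees:
    employee.setdefault("mobile", "unknown")

print(employees[1]["mobile"])  # unknown
```

Use get when you only need to read safely, and setdefault when the missing key should actually be stored in the object.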
Next time though before asking a question I suggest doing a bit of research as this kind of question has already been asked many times ;)

Is it possible to store an array in Django model?

I was wondering if it's possible to store an array in a Django model?
I'm asking this because I need to store an array of ints (e.g. [1,2,3]) in a field and then be able to search for a specific array and get a match with it or with its possible combinations.
I was thinking of storing the arrays as strings in CharFields and then, when I need to search for something, concatenating the values (obtained by filtering another model) with '[', ']' and ',' and using an object filter with that generated string. The problem is that I would have to generate each possible combination and then filter them one by one until I get a match, and I believe that this might be inefficient.
So, I hope you can give me other ideas that I could try.
I'm not asking for code, necessarily, any ideas on how to achieve this will be good.
I have two pieces of advice for you:
1) Use ArrayField if you are using PostgreSQL as your database. You can read more about ArrayField in the Django documentation.
2) Encode your array as JSON and store it either as a plain string or in a JSONField.
I'd personally prefer option 1 since it is the cleaner and nicer way, but depending on what you are actually using to store your data, it might not be available to you.
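Option 2 can be sketched with the standard json module alone. This is only an illustration of the encode and decode steps; the variable names are made up, and how the string is stored (CharField, TextField, and so on) is left open:

```python
import json

values = [1, 2, 3]

# Encode the list as a JSON string before saving it in a text column.
encoded = json.dumps(values)
print(encoded)  # [1, 2, 3]

# Decode it again after loading the row.
decoded = json.loads(encoded)
print(decoded == values)  # True
```

An exact-match search then becomes a plain string comparison against json.dumps of the wanted list, which means element order matters: [1, 2, 3] and [3, 2, 1] encode to different strings.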
Yes, you can use it like this:
from django.contrib.postgres.fields import ArrayField
from django.db import models
class Board(models.Model):
    pieces = ArrayField(ArrayField(models.IntegerField()))
However, ArrayField is only available when using PostgreSQL as the database.
If you aren't using Postgres, I recommend Django's validate_comma_separated_integer_list validator.
https://docs.djangoproject.com/en/dev/ref/validators/#django.core.validators.validate_comma_separated_integer_list
You use it as a validator on a CharField().
I don't know why nobody has suggested it, but you can always pickle things and put the result into a binary field.
The advantages of this method are that it will work with just about any database, it's efficient, and it's applicable to more than just arrays. The downside is that you can't have the database run queries on the pickled data (not easily, anyway).
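A minimal sketch of the pickling round trip, using only the standard library. The variable names are illustrative, and storing blob in a binary column is assumed rather than shown:

```python
import pickle

values = [1, 2, 3]

# pickle.dumps returns bytes, which is what a binary column expects.
blob = pickle.dumps(values)
print(type(blob).__name__)  # bytes

# pickle.loads rebuilds the original Python object.
restored = pickle.loads(blob)
print(restored == values)  # True
```

One caveat worth keeping in mind: unpickling can execute arbitrary code, so only call pickle.loads on data your own application wrote.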
You can store a JSON object, and arrays nested inside that JSON work as well:
if (data != "attachmentto") {
    formData.append(data, this.recipe[data])
    console.log('rec data ', data)
}
else if (data == "attachmentto") {
    console.log('rec data434 ', this.recipe.attachmentto)
    var myObj = { name: this.recipe.attachmentto.name, age: 31, city: "New York" };
    let kokos = JSON.stringify(myObj);
    // this.recipe.attachmentto.name = this.recipe.attachmentto.name
    formData.append('attachmentto', kokos)
}
Django backend:
class Video(models.Model):
    objects = models.Manager()
    title = models.CharField(max_length=80)
    description = models.TextField(max_length=300)
    picture = JSONField(encoder=None)
    price = models.IntegerField(default=0)
    url = models.URLField()
    category = models.CharField(max_length=50)
    subcategory = models.TextField(max_length=50)
    attachmentto = JSONField(encoder=None)
    # attachmentto2 = models.JSONField()
    user = models.ForeignKey(User, on_delete=models.CASCADE, default=1)
and result on backend API:
{
    "id": 174,
    "title": "ads",
    "description": "ads",
    "picture": {
        "name": "https://v3.vuejs.org/logo.png"
    },
    "price": 0,
    "user": 1,
    "rating_average": 0,
    "attachmentto": {
        "age": 31,
        "city": "New York",
        "name": [
            "https://restdj.herokuapp.com/media/uploads/ftf_3_cJ0V7TF.png",
            "https://restdj.herokuapp.com/media/uploads/ftf_3.jpg"
        ]
    }
},
It works nicely. Notice that we send a JSON object and that it contains an array.
kokos is the full JSON string designed for Django:
var myObj = { name: this.recipe.attachmentto.name, age: 31, city: "New York" };
let kokos = JSON.stringify(myObj);
formData.append('attachmentto', kokos)
In the snippet above, name: this.recipe.attachmentto.name is an array.
Here is the array:
"name": [
    "https://restdj.herokuapp.com/media/uploads/ftf_3_cJ0V7TF.png",
    "https://restdj.herokuapp.com/media/uploads/ftf_3.jpg"
]

How to mix JSON-serialized Django models with flat JSON

The problem I'm having is with mixing serialized Django models using django.core.serializers with some other piece of data and then trying to serialize the entire thing using json.dumps.
Example code:
scores = []
for indicator in indicators:
    score_data = {}
    score_data["indicator"] = serializers.serialize("json", [indicator,])
    score_data["score"] = evaluation.indicator_percent_score(indicator.id)
    score_data["score_descriptor"] = \
        serializers.serialize("json",
            [form.getDescriptorByPercent(score_data["score"]),],
            fields=("order", "value", "title"))
    scores.append(score_data)
scores = json.dumps(scores)
return HttpResponse(scores)
Which returns a list of objects like this:
{
indicator: "{"pk": 345345, "model": "forms.indicator", "fields": {"order": 1, "description": "Blah description.", "elements": [10933, 4535], "title": "Blah Title"}}",
score: 37.5,
score_descriptor: "{"pk": 66666, "model": "forms.descriptor", "fields": {"order": 1, "value": "1.00", "title": "Unsatisfactory"}}"
}
The problem can be seen in the JSON: the serialized Django models are wrapped in an extra set of quotation marks. This makes it very hard to work with on the client side, because when I try to do something like
indicator.fields.order
it evaluates to nothing because the browser thinks I'm dealing with a string of some sort.
Ideally I would like valid JSON without the conflicting quotations that make it unreadable. Something akin to a list of objects like this:
{
    indicator: {
        pk: 12931231,
        fields: {
            order: 1,
            title: "Blah Title",
        }
    },
    etc.
}
Should I be doing this in a different order, using a different data structure, different serializer?
My solution involved dropping the use of django.core.serializers and instead using django.forms.model_to_dict and django.core.serializers.json.DjangoJSONEncoder.
The resulting code looked like this:
scores = []
for indicator in indicators:
    score_data = {}
    score_data["indicator"] = model_to_dict(indicator)
    score_data["score"] = evaluation.indicator_percent_score(indicator.id)
    score_data["score_descriptor"] = \
        model_to_dict(form.getDescriptorByPercent(score_data["score"]),
            fields=("order", "value", "title"))
    scores.append(score_data)
scores = json.dumps(scores, cls=DjangoJSONEncoder)
The problem seemed to arise from the fact that I was essentially serializing the Django models twice: once with the django.core.serializers function and once with json.dumps.
The solution was to convert each model to a dictionary first, put it in the dictionary with the other data, and then serialize everything once with json.dumps and the DjangoJSONEncoder.
Hope this helps someone, because I couldn't find my specific issue but was able to piece it together from answers to other Stack Overflow posts.
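The double-serialization effect can be reproduced without Django. In this sketch a plain dict stands in for the model (its values are taken from the question's sample output), and the names double and single are made up for illustration:

```python
import json

# A plain dict standing in for a model that has been converted with model_to_dict.
indicator = {"pk": 345345, "fields": {"order": 1, "title": "Blah Title"}}

# Serialized twice: the inner json.dumps result is a string, so the outer
# json.dumps escapes it and the client receives a quoted string.
double = json.dumps({"indicator": json.dumps(indicator), "score": 37.5})

# Serialized once: the nested dict becomes a nested JSON object.
single = json.dumps({"indicator": indicator, "score": 37.5})

print(type(json.loads(double)["indicator"]).__name__)       # str
print(json.loads(single)["indicator"]["fields"]["order"])   # 1
```

With the single pass, the client can read indicator.fields.order directly instead of having to parse the inner string a second time.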

Django/python querysets - adding another "child" queryset as an item to each list object?

Apologies if the answer to this is obvious - I'm very new to django/python & haven't been able to find a solution in my searching so far.
I have a straightforward queryset, eg
members = LibraryMembers.objects.all()
with this I can do:-
for m in members:
    member_books = LibraryBorrows.objects.filter(member_id=m[u'id'])
What I really want though is to be able to serialize the results into json, so it looks something like this:-
{
    "members":
    [
        {
            "id" : "1",
            "name" : "Joe Bloggs",
            "books":
            [
                {
                    "name" : "Five Go Exploring",
                    "author" : "Enid Blyton"
                },
                {
                    "name" : "Princess of Mars",
                    "author" : "Edgar Rice Burroughs"
                }
            ]
        }
    ]
}
To my mind, the obvious thing to try was:-
for m in members:
    m[u'books'] = LibraryBorrows.objects.filter(member_id=m[u'id'])
However I'm getting TypeError: 'LibraryBorrows' object does not support item assignment
Is there any way to achieve what I'm after?
Model instances are indeed not dicts. Now if you want dicts instead of model instances, then QuerySet.values() is your friend: you get a list of dicts with only the required fields, and you avoid the overhead of retrieving unneeded fields from the database and building full-blown model instances.
>>> members = LibraryMember.objects.values("id", "name")
>>> print(members)
[{"id" : 1, "name" : "Joe Bloggs"},]
Then your code would look like:
members = LibraryMember.objects.values("id", "name")
for m in members:
    m["books"] = LibraryBorrows.objects.filter(
        member_id=m['id']
    ).values("name", "author")
Now you still have to issue one additional db query for each parent row, which may not be that efficient, depending on the number of LibraryMember rows. If you have hundreds or more, a better approach would be to query on LibraryBorrows instead, including the related fields from LibraryMember, then regroup the rows based on the LibraryMember id, i.e.:
from itertools import groupby
def filter_row(row):
    # Drop the member columns so only the book fields remain.
    for name in ("librarymember__id", "librarymember__name"):
        del row[name]
    return row
members = []
rows = LibraryBorrows.objects.values(
    'name', 'author', 'librarymember__id', 'librarymember__name'
).order_by('librarymember__id')
# groupby only groups adjacent rows, hence the order_by above.
for key, group in groupby(rows, lambda r: r['librarymember__id']):
    group = list(group)
    member = {
        'id': group[0]['librarymember__id'],
        'name': group[0]['librarymember__name'],
        'books': [filter_row(row) for row in group]
    }
    members.append(member)
NB : this can be seen as premature optimization (and would be if you only have a couple LibraryMember in your db), but trading hundreds or more queries for one single query and a bit of postprocessing usually makes a real difference for "real life" datasets.
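The regrouping step can be tried without a database. The sketch below uses plain dicts shaped like the rows that .values() would return. The keys member_id and member_name and the third book are made up for illustration, and the rows are assumed to be already sorted by member id:

```python
from itertools import groupby

# Hypothetical rows, one per borrowed book, already sorted by member id.
rows = [
    {"name": "Five Go Exploring", "author": "Enid Blyton",
     "member_id": 1, "member_name": "Joe Bloggs"},
    {"name": "Princess of Mars", "author": "Edgar Rice Burroughs",
     "member_id": 1, "member_name": "Joe Bloggs"},
    {"name": "Emma", "author": "Jane Austen",
     "member_id": 2, "member_name": "Ann Other"},
]

members = []
# groupby yields one (key, iterator) pair per run of adjacent rows with the same key.
for member_id, group in groupby(rows, key=lambda r: r["member_id"]):
    group = list(group)
    members.append({
        "id": member_id,
        "name": group[0]["member_name"],
        "books": [{"name": r["name"], "author": r["author"]} for r in group],
    })

print(len(members))              # 2
print(len(members[0]["books"]))  # 2
```

If the rows were not sorted by the grouping key, groupby would emit the same member more than once, which is why the ordering matters.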
Well m is a LibraryMember object so you won't be able to treat it as a dictionary. As a side note: Most people don't name the models in plural form since they are just a class modeling an object, not a collection of objects.
One possible solution is to make a list of dictionaries with the values that you need from both objects, something like this in a one-liner:
o = [ { "id": m.id, "name": m.name, "books": [{"name": b.name, "author": b.author} for b in m.libraryborrows_set.all()] } for m in LibraryMembers.objects.all()]
Note that you can use the related manager to get the books for a given member. For better clarity:
o = []
for m in LibraryMembers.objects.all():
    member_books = [{"name": b.name, "author": b.author} for b in m.libraryborrows_set.all()]
    o.append( { "id": m.id, "name": m.name, "books": member_books } )
EDIT:
To serialize all the fields:
members = []
# Note: _meta.get_all_field_names() exists only on older Django versions;
# it was removed in Django 1.10.
for member in LibraryMembers.objects.all():
    member_details = {}
    for field in member._meta.get_all_field_names():
        member_details[field] = getattr(member, field)
    books = []
    for book in member.libraryborrows_set.all():
        book_details = {}
        for field in book._meta.get_all_field_names():
            book_details[field] = getattr(book, field)
        books.append(book_details)
    member_details['books'] = books
    members.append(member_details)
I also found DjangoFullSerializers which I hadn't heard about until today:
http://code.google.com/p/wadofstuff/wiki/DjangoFullSerializers
