Function to copy fields from a model to another model in django - python

I want to merge two different models that have different but overlapping fields.
I am trying to write a function that copies the fields, with their data, from model A to model B.
def create_field():
    old = DetailItem.objects.all()
    new = CrawlItem.objects.all()
    for item in old:
        c = CrawlItem.objects.get(title=item.title, description=item.description, link=item.link, cpvcode=item.cpvcode, postalcode=item.postalcode)
        c.save()
I don't know where the mistake is. I want to end up with a model that contains the data from the old model plus some new fields.
Here is my code for the two models:
class DetailItem(Base):
    title = models.CharField(max_length=500)
    description = models.CharField(max_length=20000)
    link = models.URLField()
    cpvcode = models.ManyToManyField('CPVCode', related_name='cpv')
    postalcode = models.ForeignKey('PostalCode', on_delete=models.SET_NULL, null=True, blank=True, related_name='postal')
    def __str__(self):
        return self.title
class CrawlItem(Base):
    guid = models.CharField(primary_key=True, max_length=500)
    title = models.CharField(max_length=500)
    link = models.URLField()
    description = models.CharField(max_length=2000)
    pubdate = models.DateTimeField()
    detail = models.ForeignKey('DetailItem', on_delete=models.SET_NULL, null=True, blank=True, related_name='crawldetail')
    def __str__(self):
        return str(self.title)
This is what I want to get:
class CrawlItem(Base):
    guid = ...
    title = ...
    link = ...
    cpvcodes = ...
    postalcode = ...
    pubdate = ...
    vergabestelle = ...
    vergabeart = ...
    anmeldefrist = ...
    description = ...
Any ideas how to get there are highly appreciated!

It's not entirely clear which objects already exist in your database, and when you consider two objects to be "equal". Assuming a CrawlItem is "equal" to a "DetailItem" when title, description and link are the same, then you can use the update_or_create function like this:
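for item in old:
    # if matching, the existing item is updated, otherwise a new item is created
    crawl_item, created = CrawlItem.objects.update_or_create(
        title=item.title, description=item.description, link=item.link,
        defaults={'postalcode': item.postalcode},
    )
    # many-to-many values cannot go through defaults; set them once the row exists
    crawl_item.cpvcode.set(item.cpvcode.all())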
Alternatively, if the two models are linked by the foreign key shown in your models (and you want to remove it later on), then you don't even need to check for "equal" objects, because you already have all the related ones (assuming title, description and link are already equal):
for item in old:
    item.crawldetail.update(postalcode=item.postalcode)
    for crawl_item in item.crawldetail.all():  # update() cannot write many-to-many fields
        crawl_item.cpvcode.set(item.cpvcode.all())

In your for loop you are only trying to select the CrawlItem with the same values as the DetailItem, using the get() queryset method.
If you want to create a CrawlItem, you should use the create() method (docs here -> https://docs.djangoproject.com/en/2.2/ref/models/querysets/#create)
c = CrawlItem.objects.create(title=item.title, ..., postalcode=item.postalcode)
The object is saved when create() is called, so you don't need to call save() afterwards; c is set to the newly created object.
For performance reasons you can use the bulk_create() method as follows (docs here -> https://docs.djangoproject.com/en/2.2/ref/models/querysets/#bulk-create)
new_crawlitems = []
for item in old:
    # cpvcode is many-to-many, so it cannot be passed to the constructor;
    # set it after the objects have been saved
    new_crawlitems.append(CrawlItem(title=item.title, description=item.description, link=item.link, postalcode=item.postalcode))
CrawlItem.objects.bulk_create(new_crawlitems)
Hope this helps and points you in the right direction.
G.
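As a rough sketch of the copying step itself: the overlapping scalar fields can be collected into a keyword dict once and then reused for create(), update_or_create() or the constructor. The example below uses SimpleNamespace objects as stand-ins for model instances so that it runs without a database. SHARED_FIELDS and shared_field_values are hypothetical names, and the field list is an assumption based on the models in the question.

```python
from types import SimpleNamespace

# Assumed list of scalar fields present on both models.
# Many-to-many fields are intentionally left out.
SHARED_FIELDS = ("title", "description", "link")


def shared_field_values(source, field_names=SHARED_FIELDS):
    """Collect the named attributes of `source` into a keyword dict."""
    return {name: getattr(source, name) for name in field_names}


# Stand-in for a DetailItem instance (hypothetical sample data).
old_item = SimpleNamespace(
    title="Tender 1",
    description="Road works",
    link="https://example.com/1",
    extra="not copied",
)

kwargs = shared_field_values(old_item)
print(kwargs)
```

With the real models the dict could be passed on as CrawlItem(**kwargs). A many-to-many field such as cpvcode is deliberately excluded: Django does not accept it as a constructor argument, so it has to be assigned with .set() after the object has been saved.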

Related

How to fetch related model in django_tables2 to avoid a lot of queries?

I might be missing something simple here, or I simply lack the know-how.
I have three models: Site, SiteField and, most importantly, SiteFieldValue.
My idea is to create a django-tables2 table (for Site) that shows the values from SiteFieldValue in each row, for a specific site, under the matching header. The problem is that each site can have 50 or more of them. That number, multiplied by the number of columns with render_ functions and by the number of sites, adds up to a lot of queries, and I want to avoid that.
My question is: is it possible to, for example, prefetch all the values for each site (SiteFieldValue.objects.filter(site=record).first() somewhere in the SiteListTable class), put them into an array, and then use them in the render_ functions by simply looking up the value assigned to a key (the id of the field)?
Models:
class Site(models.Model):
    name = models.CharField(max_length=100)
class SiteField(models.Model):
    name = models.CharField(max_length=100)
    description = models.CharField(max_length=500, null=True, blank=True)
    def __str__(self):
        return self.name
class SiteFieldValue(models.Model):
    site = models.ForeignKey(Site, on_delete=models.CASCADE)
    field = models.ForeignKey(SiteField, on_delete=models.CASCADE)
    value = models.CharField(max_length=500)
Table view
class SiteListTable(tables.Table):
    name = tables.Column()
    importance = tables.Column(verbose_name='Importance', empty_values=())
    vertical = tables.Column(verbose_name='Vertical', empty_values=())
    #... and many more to come... all values based on siteFieldValue
    def render_importance(self, value, record):
        q = SiteFieldValue.objects.filter(site=record, field=1).first()
        # ^^ I don't want this!! I would want the SiteFieldValue to be prefetched somewhere else for that model and just check the array for field id in here.
        if (q):
            return q.value
        else:
            return None
    def render_vertical(self, value, record):
        q = SiteFieldValue.objects.filter(site=record, field=2).first()
        # ^^ I don't want this!! I would want the SiteFieldValue to be prefetched somewhere else for that model and just check the array for field id in here.
        if (q):
            return q.value
        else:
            return None
    class Meta:
        model = Site
        attrs = {
            "class": "table table-striped", "thead": {'class': 'thead-light',}}
        template_name = "django_tables2/bootstrap.html"
        fields = ("name", "importance", "vertical",)
This might get you started. I've broken it up into steps but they can be chained quite easily.
# Get all the objects you'll need. You can filter as appropriate (say, by site__name).
qs = SiteFieldValue.objects.select_related('site', 'field')
# let's keep things simple and only get the values we want
qs_values = qs.values('site__name', 'field__name', 'value')
# qs_values is a queryset. For ease of manipulation, let's make it a list
qs_list = list(qs_values)
# set up a final dict
final_dict = {}
# create the keys (sites) and values so they are all grouped
for site in qs_list:
    # create the sub-dict for the fields if not already created
    if site['site__name'] not in final_dict:
        final_dict[site['site__name']] = {}
    # 'name' is the table column that holds the site name
    final_dict[site['site__name']]['name'] = site['site__name']
    final_dict[site['site__name']][site['field__name']] = site['value']
# now let's convert our dict of dicts into a list of dicts
# for use as per the django-tables2 docs
data = []
for site in final_dict:
    data.append(final_dict[site])
Now you have a list of dicts, e.g.
[{'name': site__name, 'col1name': value, ...}], and can pass it to the table as shown in the django-tables2 docs.
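The grouping step can be tried out without a database. The sketch below is a compact variant of the loop above, assuming the rows have the shape produced by qs.values('site__name', 'field__name', 'value'); the sample rows and the helper name group_by_site are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows, shaped like the output of
# qs.values('site__name', 'field__name', 'value').
rows = [
    {"site__name": "alpha", "field__name": "importance", "value": "high"},
    {"site__name": "alpha", "field__name": "vertical", "value": "retail"},
    {"site__name": "beta", "field__name": "importance", "value": "low"},
]


def group_by_site(rows):
    """Collapse one row per (site, field) pair into one dict per site."""
    grouped = defaultdict(dict)
    for row in rows:
        site_row = grouped[row["site__name"]]
        site_row["name"] = row["site__name"]           # column for the site name
        site_row[row["field__name"]] = row["value"]    # one column per field
    return list(grouped.values())


data = group_by_site(rows)
print(data)
```

Because the whole list is built from a single query, the render_ functions are no longer needed for these columns; a site without a value for some field simply has no such key in its dict.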

Annotating a Subquery with queryset filtering methods from another model through Many to Many fields

I'm not sure how to make this possible, but I'm hoping to understand the intended method to do the following:
I have a simple model:
class Author(models.Model):
    id = models.TextField(primary_key=True, default=uuid4)
    name = models.TextField()
    main_titles = models.ManyToManyField(
        "Book",
        through="BookMainAuthor",
        related_name="main_authors",
    )
    objects = AuthorManager.from_queryset(AuthorQuerySet)()
class Book(models.Model):
    id = models.TextField(primary_key=True, default=uuid4)
    title = models.TextField()
    genre = models.ForeignKey("Genre", on_delete=models.CASCADE)
    objects = BookManager.from_queryset(BookQuerySet)()
class BookQuerySet(QuerySet):
    def by_genre_sci_fi(self) -> QuerySet:
        return self.filter(**self.LOOKUP_POPULAR_SCIFI)
I'd like to add a new QuerySet method for AuthorQuerySet to annotate Author objects using the method from BookQuerySet above. I've tried the following, which is incorrect:
class AuthorQuerySet(QuerySet):
    def annotate_with_total_titles_by_genres(self) -> QuerySet:
        main_titles_by_genre_sci_fi_query = (
            Book.objects.filter(main_authors__in=[OuterRef('pk')]).all()
            .by_genre_sci_fi()
            .annotate(cnt=Count('*'))
            .values('cnt')[:1]
        )
        return self.annotate(
            sci_fi_titles_total=Subquery(main_titles_by_genre_sci_fi_query, output_field=IntegerField())
        )
Intended usage:
annotated_authors = Author.objects.filter(<some filter>).annotate_with_total_titles_by_genres()
There are additional fields in the lookup not shown in the model above, but the method here is working, and returns a BookQuerySet filtered by the lookup:
Book.objects.filter(main_authors__in=['some_author_id']).all().by_genre_sci_fi()
Similarly, I can run the subquery independently and get the count like so:
(
    Book.objects.filter(main_authors__in=['some_author_id']).all()
    .by_genre_sci_fi()
    .annotate(cnt=Count('*'))
    .values('cnt')[:1]
)
Out[1]: <BookQuerySet [{'cnt': 1}]>
However, when I try to annotate using the AuthorQuerySet method above, I get None for every entry.
I wonder if there is an issue here with OuterRef and the in lookup, which will evaluate each character independently if it receives a string. If I try running it without the square brackets, I get:
ProgrammingError: syntax error at or near ""bookshop_author"" LINE 1: ...RE (U0."deleted_at" IS NULL AND U1."author_id" IN "bookshop_...

What is the "instance" being passed to the to_representation function of my ListSerializer?

The goal of this project is to create an API that refreshes hourly with the most up-to-date betting odds for a list of games that I'll be scraping hourly from the internet. The target structure for the returned JSON is each game as the parent object, with the nested children being the top 1 record, by updated date, for each of the linesmakers being scraped. My understanding is that the best way to accomplish this is to modify the to_representation function within the ListSerializer to return the appropriate queryset.
Because I need the game_id of the parent element to grab the children of the appropriate game, I've attempted to pull the game_id out of the data that gets passed. The issue is that this line looks to be populated correctly when I inspect its contents through an exception, but when I let the full code run, I get a list index out of range exception.
For example:
class OddsMakerListSerializer(serializers.ListSerializer):
    def to_representation(self, data):
        game = data.all()[0].game_id
        #if I put this here it evaluates to 1 which should run the raw sql below correctly
        raise Exception(game)
        data = OddsMaker.objects.filter(odds_id__in = RawSQL(''' SELECT o.odds_id
            FROM gamesbackend_oddsmaker o
            INNER JOIN (
                SELECT game_id
                , oddsmaker
                , max(updated_datetime) as last_updated
                FROM gamesbackend_oddsmaker
                WHERE game_id = %s
                GROUP BY game_id
                , oddsmaker
            ) l on o.game_id = l.game_id
            and o.oddsmaker = l.oddsmaker
            and o.updated_datetime = l.last_updated
            ''', [game]))
        #if I put this here the data appears to be populated correctly and contain the right data
        raise Exception(data)
        data = [game for game in data]
        return data
Now, if I remove these raise Exception lines, I get the list index out of range error. My initial thought was that something else depends on "data" being returned as a list, so I added the list comprehension, but that doesn't resolve the issue.
So, my questions are: 1) Is there an easier way to accomplish what I'm going for? I'm not using a Postgres backend, so DISTINCT ON isn't available to me. 2) If not, it's not clear to me what the instance being passed in is, or what is expected to be returned. I've consulted the documentation and it looks as though it expects a dictionary, which might be part of the issue, but again the error message references a list. https://www.django-rest-framework.org/api-guide/serializers/#overriding-serialization-and-deserialization-behavior
I appreciate any help in understanding what is going on here in advance.
Edit:
The rest of the serializers:
class OddsMakerSerializer(serializers.ModelSerializer):
    class Meta:
        list_serializer_class = OddsMakerListSerializer
        model = OddsMaker
        fields = ('odds_id','game_id','oddsmaker','home_ml',
                  'away_ml','home_spread','home_spread_odds',
                  'away_spread_odds','total','total_over_odds',
                  'total_under_odds','updated_datetime')
class GameSerializer(serializers.ModelSerializer):
    oddsmaker_set = OddsMakerSerializer(many=True, read_only=True)
    class Meta:
        model = Game
        fields = ('game_id','date','sport', 'home_team',
                  'away_team','home_score', 'away_score',
                  'home_win','away_win', 'game_completed',
                  'oddsmaker_set')
models.py:
class Game(models.Model):
    game_id = models.AutoField(primary_key=True)
    date = models.DateTimeField(null=True)
    sport = models.CharField(max_length=256, null=True)
    home_team = models.CharField(max_length=256, null=True)
    away_team = models.CharField(max_length=256, null=True)
    home_score = models.IntegerField(default=0, null=True)
    away_score = models.IntegerField(default=0, null=True)
    home_win = models.BooleanField(default=0, null=True)
    away_win = models.BooleanField(default=0, null=True)
    game_completed = models.BooleanField(default=0, null=True)
class OddsMaker(models.Model):
    odds_id = models.AutoField(primary_key=True)
    game = models.ForeignKey('Game', on_delete = models.CASCADE)
    oddsmaker = models.CharField(max_length=256)
    home_ml = models.IntegerField(default=999999)
    away_ml = models.IntegerField(default=999999)
    home_spread = models.FloatField(default=999)
    home_spread_odds = models.IntegerField(default=9999)
    away_spread_odds = models.IntegerField(default=9999)
    total = models.FloatField(default=999)
    total_over_odds = models.IntegerField(default=999)
    total_under_odds = models.IntegerField(default=999)
    updated_datetime = models.DateTimeField(auto_now=True)
views.py:
class GameView(viewsets.ModelViewSet):
    queryset = Game.objects.all()
    serializer_class = GameSerializer
Thanks
To answer the question in the title:
The instance passed to Serializer.to_representation() is the instance you pass when initializing the serializer:
queryset = MyModel.objects.all()
Serializer(queryset, many=True)
instance = MyModel.objects.all().first()
Serializer(instance)
Usually you don't have to inherit from ListSerializer per se. You can inherit from BaseSerializer, and whenever you pass many=True during initialization it will automatically become a ListSerializer. You can see this in action here
To solve your problem:
from django.db.models import Max
class OddsMakerListSerializer(serializers.ListSerializer):
    def to_representation(self, data):  # data passed is a queryset of oddsmaker
        # Do your filtering here
        latest_date = data.aggregate(
            latest_date=Max('updated_datetime')
        ).get('latest_date').date()
        # the model field is updated_datetime, so the lookups must use that name
        latest_records = data.filter(
            updated_datetime__year=latest_date.year,
            updated_datetime__month=latest_date.month,
            updated_datetime__day=latest_date.day
        )
        return super().to_representation(latest_records)
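Note that the filter above keeps every row from the most recent calendar day, which is not quite the same as the top record per oddsmaker that the question asks for. Here is a minimal sketch of that selection in plain Python, assuming each row is a dict with oddsmaker and updated_datetime keys (for example from .values()). latest_per_oddsmaker is a hypothetical helper, not part of Django REST framework, and the sample rows are invented.

```python
from datetime import datetime


def latest_per_oddsmaker(rows):
    """Return the most recently updated row for each oddsmaker."""
    latest = {}
    for row in rows:
        best = latest.get(row["oddsmaker"])
        if best is None or row["updated_datetime"] > best["updated_datetime"]:
            latest[row["oddsmaker"]] = row
    return list(latest.values())


# Hypothetical sample rows for a single game.
rows = [
    {"oddsmaker": "A", "home_ml": -110, "updated_datetime": datetime(2019, 5, 1, 10)},
    {"oddsmaker": "A", "home_ml": -120, "updated_datetime": datetime(2019, 5, 1, 11)},
    {"oddsmaker": "B", "home_ml": -105, "updated_datetime": datetime(2019, 4, 30, 9)},
]

latest_rows = latest_per_oddsmaker(rows)
print(latest_rows)
```

Inside to_representation the same idea could be applied to the model instances (comparing obj.updated_datetime per obj.oddsmaker) before handing the reduced list to super().to_representation(). This needs no DISTINCT ON, at the cost of loading all rows for the game into memory.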

django-tables 2 M2M field not shown

I am trying to show an M2M field in a django-tables2 table, as seen in Django-tables2: How to use accessor to bring in foreign columns? and Accessing related models with django-tables2
Using: foreigncolumn = tables.Column(accessor='foreignmodel.foreigncolumnname'), I only see a '--'...
# The models:
class Organism(models.Model):
    species_name = models.CharField(max_length=200)
    strain_name = models.CharField(max_length=200)
    eukaryotic = models.BooleanField(default=True)
    lipids = models.ManyToManyField('Lipid', blank=True)
class Lipid(models.Model):
    lm_id = models.CharField(max_length=100)
    common_name = models.CharField(max_length=100, blank=True)
    category = models.CharField(max_length=100, blank=True)
# The tables:
class OrganismTable(tables.Table):
    name = tables.LinkColumn('catalog:organism-detail', text=lambda record: record.species_name, args=[A('pk')])
    lp = tables.Column(accessor='Lipid.common_name')
    class Meta:
        model = Organism
        sequence = ['name','lp']
        exclude = ['id','species_name']
Any idea what I'm doing wrong?
This does not work so easily for ManyToManyFields because of the simple way Accessor works. You could display the repr of the related QuerySet via 'lipids.all' but that does not seem sufficient here. You can, however, add a property (or method) to your Organism model and use it in the accessor. This way, you can display any custom information related to the instance:
class Organism(models.Model):
    # ...
    @property
    def lipid_names(self):
        return ', '.join(l.common_name for l in self.lipids.all())  # or similar
class OrganismTable(tables.Table):
    # ...
    lp = tables.Column(accessor='lipid_names')
I would then recommend adding prefetch_related('lipids') to the Organism QuerySet that you pass to the table, for better performance.

Django: Linking two tables

First of all, I'm aware that this question might have been answered already, but there are two reasons why I'm opening another one. First, obviously, I'm struggling with the Django syntax. Second, and perhaps more importantly, I'm not quite sure whether my database setup even makes sense at this point. So, please bear with me.
I work in a hospital, and one of my daily struggles is that a single drug can often have a lot of different names. So I thought that would be a good task to practice some Django with.
Basically I want two tables: one that simply links the drug's "nickname" to its actual name, and another that links the actual name to some additional information, something along the lines of a wiki page.
What I've come up with so far:
(django)django@ip:~/medwiki$ cat medsearch/models.py
from django.db import models
# Create your models here.
class medsearch(models.Model):
    proprietary_name = models.CharField(max_length = 100, unique = True)
    non_proprietary_name = models.CharField(max_length = 100, unique = True)
    def __str__(self):
        return self.non_proprietary_name
class medwiki(models.Model):
    proprietary_name = models.ForeignKey('medisearch', on_delete=models.CASCADE)
    cetegory = models.CharField(max_length = 255)
    #wiki = models.TextField() etc.
    def __str__(self):
        return self.proprietary_name
(django)django@ip-:~/medwiki$
So, I can add a new "medsearch object" just fine. However, when adding the "Category" at medwiki I get __str__ returned non-string (type medsearch). Presumably because there's more than one key in medsearch? I thus suspect that ForeignKey is not suited for this application, and I know that there are other ways to link tables in Django. However, I don't know which one to choose and how to implement it correctly.
Hopefully, some of you have some ideas?
EDIT: Here's what I've come up with so far:
class Proprietary_name(models.Model):
    proprietary_name = models.CharField(max_length = 100, unique = True)
    def __str__(self):
        return self.proprietary_name
class Category(models.Model):
    category = models.CharField(max_length = 100, unique = True)
    def __str__(self):
        return self.category
class Mediwiki(models.Model):
    proprietary_name = models.ManyToManyField(Proprietary_name)
    non_proprietary_name = models.CharField(max_length = 100, unique = True)
    category = models.ManyToManyField(Category)
    wiki_page = models.TextField()
    def __str__(self):
        return self.non_proprietary_name
Now I can assign different categories and different proprietary_names to one drug, which works great so far.
So does looking up the non-proprietary_name when I know the proprietary "nick name".
>>> Mediwiki.objects.get(proprietary_name__proprietary_name="Aspirin")
<Mediwiki: acetylsalicylic acid>
>>>
However, I'd also like to display all the proprietary_names, when I know the non_proprietary_name. Do I have to further change the database design, or am I just missing some other thing here?
This would work:
    return self.proprietary_name.proprietary_name
But that doesn't really make sense!
The main issue is that you've named the foreign key to medsearch proprietary_name.
The second issue is just one of convention. In Python (and many other programming languages), class names conventionally start with an uppercase letter.
The following would be better:
class MedSearch(models.Model):
    proprietary_name = models.CharField(max_length=100, unique=True)
    non_proprietary_name = models.CharField(max_length=100, unique=True)
    def __str__(self):
        return self.non_proprietary_name
class MedWiki(models.Model):
    med_search = models.ForeignKey('MedSearch', on_delete=models.CASCADE, related_name='wikis')
    category = models.CharField(max_length=255)
    #wiki = models.TextField() etc.
    def __str__(self):
        return self.med_search.proprietary_name
As you note, the proprietary_name field on medwiki is a ForeignKey. You can't return that value directly from the __str__ method because that needs to return a string. You need to convert that value into a string before returning it: either use the default string representation of the medsearch instance:
return str(self.proprietary_name)
or choose a specific string field to return:
return self.proprietary_name.proprietary_name
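The error itself is plain Python behaviour and can be reproduced without Django. In the sketch below, ordinary classes stand in for the two models; BrokenWiki, FixedWiki and safe_str are hypothetical names used only for this illustration.

```python
class MedSearch:
    """Plain-Python stand-in for the medsearch model."""

    def __init__(self, proprietary_name, non_proprietary_name):
        self.proprietary_name = proprietary_name
        self.non_proprietary_name = non_proprietary_name

    def __str__(self):
        return self.non_proprietary_name


class BrokenWiki:
    """Mimics the original medwiki: __str__ returns the related object."""

    def __init__(self, med_search):
        self.proprietary_name = med_search  # behaves like the ForeignKey value

    def __str__(self):
        return self.proprietary_name  # a MedSearch object, not a str


class FixedWiki(BrokenWiki):
    def __str__(self):
        return str(self.proprietary_name)  # convert before returning


def safe_str(obj):
    """Return str(obj), or the error text if __str__ misbehaves."""
    try:
        return str(obj)
    except TypeError as exc:
        return f"TypeError: {exc}"


aspirin = MedSearch("Aspirin", "acetylsalicylic acid")
print(safe_str(BrokenWiki(aspirin)))
print(safe_str(FixedWiki(aspirin)))
```

The first call reports a TypeError about __str__ returning a non-string, and the second returns the related object's own string representation. The number of fields on medsearch has nothing to do with it.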
