I'm super new to Python (I started about 3 weeks ago) and I'm trying to make a script that scrapes web pages for information. After it has retrieved the information, it runs it through a function to format it and then passes it to a class that takes 17 variables as parameters. The class uses this information to calculate some other variables and currently has a method to construct a dictionary. The code works as intended, but SonarLint, a plugin I'm using with PyCharm, flags 17 parameters as too many.
I've had a look for alternate ways to pass the information to the class, such as in a tuple or a list but couldn't find much information that seemed relevant. What's the best practice for passing many variables to a class as parameters? Or shouldn't I be using a class for this kind of thing at all?
I've reduced the number of variables and the amount of code for legibility, but here is the class:
class GenericEvent:
    def __init__(self, type, date_scraped, date_of_event, time, link,
                 blurb):
        countdown_delta = date_of_event - date_scraped
        countdown = countdown_delta.days
        if countdown < 0:
            has_passed = True
        else:
            has_passed = False
        self.type = type
        self.date_scraped = date_scraped
        self.date_of_event = date_of_event
        self.time = time
        self.link = link
        self.countdown = countdown
        self.has_passed = has_passed
        self.blurb = blurb
    def get_dictionary(self):
        event_dict = {}
        event_dict['type'] = self.type
        event_dict['scraped'] = self.date_scraped
        event_dict['date'] = self.date_of_event
        event_dict['time'] = self.time
        event_dict['url'] = self.link
        event_dict['countdown'] = self.countdown
        event_dict['blurb'] = self.blurb
        event_dict['has_passed'] = self.has_passed
        return event_dict
I've been passing the variables as key:value pairs to the class after I've cleaned up the data the following way:
from datetime import date
event_info = GenericEvent(type="Lunar",
                          date_scraped=date(2019, 1, 30),   # 30/01/19
                          date_of_event=date(2019, 7, 28),  # 28/07/19
                          time="12:00",
                          link="www.someurl.com",
                          blurb="Some string.")
and retrieving a dictionary by calling:
event_info.get_dictionary()
I intend to add other methods to the class to be able to perform other operations too (not just to create 1 dictionary) but would like to resolve this before I extend the functionality of the class.
Any help or links would be much appreciated!
One option is a named tuple:
from typing import Any, NamedTuple
class GenericEvent(NamedTuple):
    type: Any
    date_scraped: Any
    date_of_event: Any
    time: Any
    link: str
    blurb: str
    # countdown and has_passed are computed, so they are properties, not fields
    @property
    def countdown(self):
        countdown_delta = self.date_of_event - self.date_scraped
        return countdown_delta.days
    @property
    def has_passed(self):
        return self.countdown < 0
    def get_dictionary(self):
        return {
            **self._asdict(),
            'countdown': self.countdown,
            'has_passed': self.has_passed,
        }
(Replace the Anys with the fields’ actual types, e.g. datetime.datetime.)
Or, if you want it to be mutable, a data class.
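The data class option could look roughly like the sketch below. It assumes the dates are datetime.date objects and that time is a plain string (the question does not say which types are used), and it keeps countdown and has_passed as computed properties rather than stored fields.

```python
from dataclasses import asdict, dataclass
from datetime import date


@dataclass
class GenericEvent:
    # Field types are assumptions; swap in whatever the scraper really produces.
    type: str
    date_scraped: date
    date_of_event: date
    time: str
    link: str
    blurb: str

    @property
    def countdown(self) -> int:
        # Days from the scrape date to the event date (negative once it has passed).
        return (self.date_of_event - self.date_scraped).days

    @property
    def has_passed(self) -> bool:
        return self.countdown < 0

    def get_dictionary(self) -> dict:
        # asdict() covers the declared fields; the computed values are added by hand.
        return {
            **asdict(self),
            "countdown": self.countdown,
            "has_passed": self.has_passed,
        }


event = GenericEvent(
    type="Lunar",
    date_scraped=date(2019, 1, 30),
    date_of_event=date(2019, 7, 28),
    time="12:00",
    link="www.someurl.com",
    blurb="Some string.",
)
event.blurb = "Updated blurb."  # allowed: data classes are mutable by default
print(event.get_dictionary())
```

Unlike the named tuple, instances can be changed after creation, and the computed properties always reflect the current field values.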
I don't think there's anything wrong with what you're doing. You could, however, take your parameters in as a single dict object, and then deal with them by iterating over the dict or doing something explicitly with each one. Seems like that would, in your case, make your code messier.
Since all of your parameters to your constructor are named parameters, you could just do this:
def __init__(self, **params):
This would give you a dict named params that you could then process. The keys would be your parameter names, and the values the parameter values.
If you aligned your param names with what you want the keys to be in your get_dictionary method's return value, saving off this parameter as a whole could make that method trivial to write.
Here's an abbreviated version of your code (with a few syntax errors fixed) that illustrates this idea:
from pprint import pprint
class GenericEvent:
    def __init__(self, **params):
        pprint(params)
event_info = GenericEvent(type="Lunar",
                          date_scraped="30/01/19",
                          date_of_event="28/07/19",
                          time="12:00",
                          link="www.someurl.com",
                          blurb="Some string.")
Result:
{'blurb': 'Some string.',
 'date_of_event': '28/07/19',
 'date_scraped': '30/01/19',
 'link': 'www.someurl.com',
 'time': '12:00',
 'type': 'Lunar'}
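To illustrate the point about get_dictionary becoming trivial, here is a minimal sketch that stores the keyword arguments and hands them back. The attribute name params is an assumption made for this example, and only three of the fields are passed to keep it short.

```python
class GenericEvent:
    def __init__(self, **params):
        # Keep the keyword arguments; the keys are the parameter names.
        self.params = dict(params)

    def get_dictionary(self):
        # Return a copy so callers cannot mutate the stored parameters.
        return dict(self.params)


event_info = GenericEvent(type="Lunar",
                          link="www.someurl.com",
                          blurb="Some string.")
print(event_info.get_dictionary())
```

The trade-off is that **params accepts any keyword at all, so a misspelled or missing parameter name is not caught when the object is created.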
I recently started working with Python classes, since I need them for oTree, a Python framework for online experiments.
In one file, I define the pages that I want to be created, using classes. So essentially, in oTree, each class corresponds to a new page. The thing is, all pages (and so all classes) are basically the same, with the exception of two parameters, as shown in the following code:
class Task1(Page):
    form_model = 'player'
    form_fields = ['Envie_WordsList_Toy']
    def is_displayed(self):
        return self.round_number == self.participant.vars['task_rounds'][1]
    def vars_for_template(player):
        WordsList_Toy = Constants.WordsList_Toy.copy()
        random.shuffle(WordsList_Toy)
        return dict(
            WordsList_Toy=WordsList_Toy
        )
    @staticmethod
    def live_method(player, data):
        player.WTP_WordsList_Toy = int(data)
    def before_next_page(self):
        self.participant.vars['Envie_WordsList_Toy'] = self.player.Envie_WordsList_Toy
        self.participant.vars['WTP_WordsList_Toy'] = self.player.WTP_WordsList_Toy
So here, the only thing that would change would be the name of the class, as well as the suffix of the variable WordsList_ used throughout this code, which is Toy.
Naively, what I tried to do is to define a function that would take those two parameters, such as this:
def page_creation(Task_Number,name_type):
    class Task+str(Task_Number)(Page):
        form_model = 'player'
        form_fields = ['Envie_WordsList_'+str(name_type)]
        def is_displayed(self):
            return self.round_number == self.participant.vars['task_rounds'][1]
        def vars_for_template(player):
            WordsList_+str(name_type) = Constants.WordsList+str(name_type).copy()
            random.shuffle(WordsList_+str(name_type))
            return dict(
                WordsList_+str(name_type)=WordsList_+str(name_type)
            )
        @staticmethod
        def live_method(player, data):
            player.WTP_WordsList_+str(name_type) = int(data)
        def before_next_page(self):
            self.participant.vars['Envie_WordsList_+str(name_type)'] = self.player.Envie_WordsList_+str(name_type)
            self.participant.vars['WTP_WordsList_+str(name_type)'] = self.player.WTP_WordsList_+str(name_type)
Obviously, it does not work, since it does not seem possible to construct variable names (or class identifiers) this way. I only started working seriously with Python a few weeks ago, so some of its aspects may still escape me. Could you help me with this issue? Thank you.
You can generate dynamic classes using the type constructor:
MyClass = type("MyClass", (BaseClass1, BaseClass2), {"attr1": "value1", ...})
In your case, that would be:
cls = type(f"Task{Task_Number}", (Page,), {"form_fields": [f"Envie_WordsList_{name_type}"], ...})
Note that you still have to construct your common methods like __init__, is_displayed and so on, as inner functions of the class factory:
def class_factory(*args, **kwargs):
    ...
    def is_displayed(self):
        return self.round_number == self.participant.vars['task_rounds'][1]
    def vars_for_template(player):
        ...
    # staticmethod wrapping is done below
    def live_method(player, data):
        ...
    cls = type(..., {
        "is_displayed": is_displayed,
        "vars_for_template": vars_for_template,
        "live_method": staticmethod(live_method),
        ...,
    })
    return cls
A decorator such as @staticmethod (or @classmethod) can also be applied as a plain function: {"live_method": staticmethod(live_method)}
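Putting the pieces together, a complete factory might look like the sketch below. Page and Constants here are stand-ins defined only so the example runs on its own, the word lists are made up, and make_task_page is a hypothetical name. Names that cannot be written literally are handled with getattr and setattr. Whether oTree picks up dynamically created page classes is not something this sketch verifies.

```python
import random


class Page:
    """Stand-in for oTree's Page base class (assumption, illustration only)."""


class Constants:
    """Stand-in for the question's Constants; the word lists are made up."""
    WordsList_Toy = ["ball", "doll", "kite"]
    WordsList_Food = ["apple", "bread", "rice"]


def make_task_page(task_number, name_type):
    # Build the attribute names once instead of concatenating them everywhere.
    words_attr = f"WordsList_{name_type}"
    envie_attr = f"Envie_WordsList_{name_type}"
    wtp_attr = f"WTP_WordsList_{name_type}"

    def is_displayed(self):
        return self.round_number == self.participant.vars["task_rounds"][1]

    def vars_for_template(player):
        words = getattr(Constants, words_attr).copy()
        random.shuffle(words)
        return {words_attr: words}

    def live_method(player, data):
        setattr(player, wtp_attr, int(data))

    def before_next_page(self):
        self.participant.vars[envie_attr] = getattr(self.player, envie_attr)
        self.participant.vars[wtp_attr] = getattr(self.player, wtp_attr)

    # type(name, bases, namespace) creates the class object.
    return type(
        f"Task{task_number}",
        (Page,),
        {
            "form_model": "player",
            "form_fields": [envie_attr],
            "is_displayed": is_displayed,
            "vars_for_template": vars_for_template,
            "live_method": staticmethod(live_method),
            "before_next_page": before_next_page,
        },
    )


Task1 = make_task_page(1, "Toy")
Task2 = make_task_page(2, "Food")
print(Task1.__name__, Task1.form_fields)
print(Task2.__name__, Task2.form_fields)
```

Because the result is assigned to module-level names (Task1, Task2), the generated classes can be referenced anywhere a hand-written class could be.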
I'm learning about classes in Python, particularly nested classes.
I'm trying to execute the code below and I get an error, "int object is not callable", but I don't understand why!
All I want is to create an object that represents a man; he has hands, and the hands have their own size, length, etc.
I want to be able to set the hand size and get its value in the most elegant and easy way possible, but nothing works for me. I tried the code below and really thought it would work, but it didn't, and now I genuinely don't know what to do.
class Man:
    def __init__(self, name):
        self.name = name
        self.hand = self.Hand_Object()  # Here we reference an Object called
                                        # "hand" to the subclass "Hand_Object".
    def length(self, length):
        self.length = length
    def handsize(self, size=None):  # This "handsize()" function will call the
                                    # subclass function "length()" out from the
                                    # Hand_Object class when it will be issued
                                    # in the program.
        if size == None:
            return self.hand.length()
        else:
            self.hand.length(size)  # The "length()" function of the "Hand_Object"
                                    # class requires a variable, so when we call
                                    # that function we need to add a variable to it.
    class Hand_Object:
        def length(self, length=None):
            if length == None:
                return self.length
            else:
                self.length = length
        def fingers(self, fingers):
            self.fingers = fingers
myman = Man('shlomi')
myman.handsize(6)
print(myman.handsize())  # Here I get the error.
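The issue is the line self.length = length in Hand_Object. The first call with an argument overwrites the method length on that instance with an integer, so the next call, self.hand.length(), tries to call the integer. You should call the function and the variable something different.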
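One way to keep the names apart is sketched below, with the value stored in _length and exposed through a property. The class name Hand, the attribute name _length, and the use of a property instead of a get/set method are choices made for this example, not part of the original code.

```python
class Hand:
    def __init__(self, length=None):
        # Stored under a different name than the public accessor.
        self._length = length

    @property
    def length(self):
        return self._length

    @length.setter
    def length(self, value):
        self._length = value


class Man:
    def __init__(self, name):
        self.name = name
        self.hand = Hand()


myman = Man("shlomi")
myman.hand.length = 6      # goes through the setter
print(myman.hand.length)   # goes through the getter and prints 6
```

Because the attribute and the accessor no longer share a name, assigning a value can never replace the accessor itself.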
I'm writing a script to find the moving average of different stocks. This script runs continuously, looping through my API call to add the current price to a list before averaging it. This works fine; however, I'd like to put this into a function where the only input I need to give is the name of the stock. I'd like this script to work for as many stocks as I want to specify, at the same time. That's where I run into issues, because for each stock I need a predefined empty list that can hold the pricing information.
I've been trying to use the name of the stock to then create a related list, but as I now understand it it's not a great idea using one variable name to create another variable, so I'm not sure what to do. I believe the usual solution here would be to use a dictionary, but I'm a beginner to programming in general so I haven't figured out how to fit that into my situation. Any help would be greatly appreciated.
def sma(stock_name):
    list_exists = stock_name + "_list" in locals() or stock_name + "_list" in globals()
    if list_exists:
        print()
    else:
        stock_name + "_list" = []  # Problem line, I would like for this to create a list called stock_name_list
    stock_price = requests.get("myapi.com", params={"stock_name": stock_name, "bla bla": "blah"})
    stock_name_list.append(stock_price)
When an operation depends on data that belongs to one particular thing (here, the price history of a single stock), that is usually a good time to think about using classes. The proposed class below encapsulates the name of a stock and the list of data specific to it, and performs sma on it:
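import requests
class Stock:
    n = 10  # number of most recent prices to average
    def __init__(self, name):
        self.name = name
        self.data = []
    def sma(self):
        # requests.get returns a Response object, so extract the numeric
        # price from it before averaging
        stock_price = requests.get("myapi.com", params={"stock_name": self.name, "bla bla": "blah"})
        self.data.append(stock_price)
        window = self.data[-self.n:]
        return sum(window) / len(window)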
Now you can maintain a dictionary of these objects. Any time you encounter a new stock, you just add an item to the dictionary:
stocks = {}
def sma(name):
    stock = stocks.get(name)
    if stock is None:  # None is what get returns when the key is missing
        stock = Stock(name)
        stocks[name] = stock
    return stock.sma()
The nice thing is that you now have a dictionary of named datasets. If you want to add a different statistic, just add a method to the Stock class that implements it.
I defined a global sma function here that calls the eponymous method on the object it finds in your dictionary. You can carry encapsulation to an extreme by making the method perform the action of the function if called statically with a name instead of an instance. For example:
class Stock:
    n = 10
    named_stocks = {}  # This is a class variable that replaces the global stocks
    def __init__(self, name):
        self.name = name
        self.data = []
    def sma(self):
        if isinstance(self, str):
            self = Stock.get_stock(self)
        stock_price = requests.get("myapi.com", params={"stock_name": self.name, "bla bla": "blah"})
        self.data.append(stock_price)
        window = self.data[-self.n:]
        return sum(window) / len(window)
    @classmethod
    def get_stock(cls, name):
        stock = cls.named_stocks.get(name)
        if stock is None:
            stock = cls(name)
            cls.named_stocks[name] = stock
        return stock
Now that there is a check for isinstance(self, str), you can call the sma method in one of two ways. You can call it directly on an instance, which knows its own name:
aapl = Stock('AAPL')
aapl.sma()
OR
Stock.get_stock('AAPL').sma()
Alternatively, you can call it on the class, and pass in a name:
Stock.sma('AAPL')
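To see the registry and the moving window working without a network call, here is a self-contained sketch. fetch_price is a hypothetical stand-in for the API request that simply returns 100, 101, 102 and so on, and the window size of 3 is chosen only to keep the output short.

```python
from itertools import count

_fake_ticks = count(start=100)


def fetch_price(name):
    """Hypothetical stand-in for the API call; returns 100.0, 101.0, 102.0, ..."""
    return float(next(_fake_ticks))


class Stock:
    n = 3  # window size, kept small for the demo
    named_stocks = {}  # registry of Stock objects keyed by name

    def __init__(self, name):
        self.name = name
        self.data = []

    def sma(self):
        # Record the newest price, then average the last n prices.
        self.data.append(fetch_price(self.name))
        window = self.data[-self.n:]
        return sum(window) / len(window)

    @classmethod
    def get_stock(cls, name):
        # Create the Stock on first use, reuse it afterwards.
        if name not in cls.named_stocks:
            cls.named_stocks[name] = cls(name)
        return cls.named_stocks[name]


averages = [Stock.get_stock("AAPL").sma() for _ in range(4)]
print(averages)  # [100.0, 100.5, 101.0, 102.0]
```

The fourth average drops the first price, which shows that only the last n entries are used.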
Use a defaultdict:
from collections import defaultdict
stock_name_to_stock_prices = defaultdict(list)
stock_name_to_stock_prices['STOCK_NAME'].append(123.45)
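Applied to the moving-average case, the defaultdict approach might look like this sketch. The window size, the function name record_and_average, and passing the price in as an argument (rather than fetching it from an API) are all assumptions made to keep the example runnable.

```python
from collections import defaultdict, deque

WINDOW = 3  # assumed window size

# Each new stock name gets its own bounded deque on first access.
prices = defaultdict(lambda: deque(maxlen=WINDOW))


def record_and_average(stock_name, price):
    window = prices[stock_name]  # created automatically if missing
    window.append(price)         # the oldest price falls off once full
    return sum(window) / len(window)


record_and_average("AAPL", 10.0)
record_and_average("AAPL", 20.0)
record_and_average("MSFT", 5.0)
latest = record_and_average("AAPL", 30.0)
print(latest)  # 20.0
```

No per-stock list has to be declared in advance, which is the part that was awkward with dynamically named variables.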
My question is about getter/setter-type functionality in Python. I have a class, Week_Of_Meetings, that takes a blob of data from my Google Calendar and does some calculations on it.
wom = Week_Of_Meetings(google_meetings_blob)
I want to be able to return something like:
wom.total_seconds_in_meetings() # returns 36000
But I don't understand how the getter/setter-style @property decorator can help me do this. In Java, I would use member variables, but you don't interact with them the same way in Python. How can I return calculations without starting with them in the constructor?
class Week_Of_Meetings:
    def __init__(self, google_meetings_blob):
        self.google_meetings_blob = google_meetings_blob
    def get_meetings_list(self, google_meetings_blob):
        meetings_list = []
        for meeting_id, meeting in enumerate(self.google_meetings_blob, 1):
            summary = self._get_summary(meeting)
            start = parse(meeting['start'].get('dateTime', meeting['start'].get('date')))
            end = parse(meeting['end'].get('dateTime', meeting['end'].get('date')))
            duration = end - start
            num_attendees = self._get_num_attendees(meeting.get('attendees'))
            m = Meeting(meeting_id, summary, start, end, duration, num_attendees)
            meetings_list.append(m)
        return meetings_list
    def _get_summary(self, meeting):
        summary = meeting.get('summary', 'No summary given')
        return summary
    def _get_num_attendees(self, num_attendees):
        if num_attendees is None:
            num_attendees = 1  # if invited only self to meeting
        else:
            num_attendees = len(num_attendees)
        return num_attendees
When I add self.total_seconds_in_meetings to __init__(), I get "NameError: global name 'total_seconds_in_meetings' is not defined." That makes sense. It hasn't been defined. But I can't define it when it's supposed to be the result of calculations done on the google_meetings_blob. So, I'm confused where the 'total_seconds_in_meetings' goes in the class.
Thank you for the help!
Of course Python has member variables. How would classes work without them? You can set and get any instance data via self, as you are already doing with self.google_meetings_blob in __init__.
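A computed value does not need to be stored in __init__ at all; it can be calculated on demand from the instance data. The sketch below uses a simplified stand-in for the calendar blob (a list of start/end datetime pairs) and a hypothetical class name, so it shows the pattern rather than the real Google Calendar structure.

```python
from datetime import datetime


class WeekOfMeetings:
    def __init__(self, meetings):
        # meetings: list of (start, end) datetime pairs, a simplified
        # stand-in for the parsed calendar data.
        self.meetings = meetings

    @property
    def total_seconds_in_meetings(self):
        # Computed from instance data each time it is read.
        return sum((end - start).total_seconds() for start, end in self.meetings)


wom = WeekOfMeetings([
    (datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 10)),        # 1 hour
    (datetime(2024, 1, 2, 14), datetime(2024, 1, 2, 15, 30)),   # 1.5 hours
])
print(wom.total_seconds_in_meetings)  # 9000.0
```

With @property the value is read without parentheses. Dropping the decorator gives an ordinary method that is called as wom.total_seconds_in_meetings(), which is the form used in the question.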
Below I have defined a basic class with some functions.
I am not sure of the best way to pass data from one function to be used in another.
My solution was to pass a dataframe as a parameter to the function. I am sure there is a better and more technically correct way so please call it out.
What I am trying to understand is why some functions in the class require () when calling them, while others can be called using just the function name.
Once an asx object called "market" has been created, the two examples are:
market.all_companies()
Returns: a dataframe
market.valid_industry
Returns: a series
class YourError( Exception ): pass
class asx(object):
    def __init__(self, name):
        try:
            #initialise
            self.name = name
            #all companies on asx downloaded from asx website csv
            df = pd.read_csv('http://asx.com.au/asx/research/ASXListedCompanies.csv', skiprows=1)
            df.columns = ["company","asx_code","industry"]
            df["yahoo_code"] = df["asx_code"]+".AX"
            self.companies = df
            self.industry = self.all_industry(df)
            self.valid_stocks = self.valid_stocks(df)
            self.valid_industry = self.valid_industry(df)
        except:
            raise YourError("asx companies CSV not available")
    def all_companies(self):
        return self.companies
    def valid_industry(self,df):
        return df["industry"].value_counts()
    def all_industry(self,df):
        return df["industry"].value_counts()
    def valid_stocks(self,df):
        return df[(df["industry"]!= "Not Applic") & (df["industry"]!="Class Pend")]
market = asx("asx")
market.all_companies()
market.valid_industry
All functions require (), but you're doing something nasty in your __init__, where you replace a function with a series:
self.valid_industry = self.valid_industry(df)
This overwrites valid_industry on the instance being created, so it is no longer a function but the value returned from self.valid_industry(df).
Don't use the same name for attributes and methods, and it will all make sense.
Your methods also don't need df passed in as an argument, since you have already assigned it to self.companies, so your
def valid_industry(self,df):
    return df["industry"].value_counts()
becomes:
def valid_industry(self):
    return self.companies["industry"].value_counts()
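The same idea in a small, dependency-free sketch: the stored result and the method that computes it get different names, and the method reads from self instead of taking the data as an argument. The class name Market, the sample rows, and the use of collections.Counter in place of pandas value_counts() are assumptions made so the example runs without downloading the CSV.

```python
from collections import Counter


class Market:
    def __init__(self, companies):
        # companies: list of dicts with an "industry" key (made-up sample rows)
        self.companies = companies
        # Store the computed result under a name that no method uses.
        self.industry_counts = self.count_industries()

    def count_industries(self):
        # Reads from self.companies, so no data argument is needed.
        return Counter(row["industry"] for row in self.companies)


market = Market([
    {"asx_code": "AAA", "industry": "Banks"},
    {"asx_code": "BBB", "industry": "Energy"},
    {"asx_code": "CCC", "industry": "Banks"},
])
print(market.industry_counts["Banks"])       # attribute access, prints 2
print(market.count_industries()["Energy"])   # method call, prints 1
```

Here the attribute is always read without parentheses and the method is always called with them, so the two can no longer be confused.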