Importing CSV lines into a class - python

This is an offensively simple question and I feel bad for even asking it, to some extent. I've been banging my head against the wall on this one for two days now.
I'm trying to do an object-oriented program that takes the lines of a CSV and turns each line into a variable that I can use down the road. I want to somehow (I can't figure out how) get each line of that CSV into a class. I know this might not even be the best way to do this, but I'm constrained to solve the problem this way for other reasons.
I don't know enough Python to even know how to look up a solution, and I need to know how to do this for a project I'm working on.
Here is the code I am basing this off:
import argparse
from collections import defaultdict
import csv
class Actor(object):
    """An actor with bounded rationality.

    The methods on this class such as u_success, u_failure, eu_challenge are
    meant to be calculated from the actor's perspective, which in practice
    means that the actor's risk aversion is always used, including to calculate
    utilities for other actors.

    I don't understand why an actor would assume that other actors share the
    same risk aversion, or how this implies that it is from the given actor's
    point of view, but as far as I can tell this is faithful to BDM's original
    formulation as well as Scholz's replication.
    """
    def __init__(self, name, c, s, x, model, r=1.0):
        self.name = name
        self.c = c  # capabilities, float between 0 and 1
        self.s = s  # salience, float between 0 and 1
        self.x = x  # number representing position on an issue
        self.model = model
        self.r = r  # risk aversion, float between .5 and 2

    def __str__(self):
        return self.__repr__()

    def __repr__(self):
        return '%s(x=%s,c=%s,s=%s,r=%.2f)' % (
            self.name, self.x, self.c, self.s, self.r)

    def compare(self, x_j, x_k, risk=None):
        """Difference in utility to `self` between positions x_j and x_k."""
        risk = risk or self.r
        position_range = self.model.position_range
        x_k_distance = (abs(self.x - x_k) / position_range) ** risk
        x_j_distance = (abs(self.x - x_j) / position_range) ** risk
        return self.c * self.s * (x_k_distance - x_j_distance)

    def u_success(self, actor, x_j):
        """Utility to `actor` of successfully challenging position x_j."""
        position_range = self.model.position_range
        val = 0.5 - 0.5 * abs(actor.x - x_j) / position_range
        return 2 - 4 * val ** self.r

    def u_failure(self, actor, x_j):
        """Utility to `actor` of failing to challenge position x_j."""
        position_range = self.model.position_range
        val = 0.5 + 0.5 * abs(actor.x - x_j) / position_range
        return 2 - 4 * val ** self.r

    def u_status_quo(self):
        """Utility to `self` of the status quo."""
        return 2 - 4 * (0.5 ** self.r)

    def eu_challenge(self, actor_i, actor_j):
        """Expected utility to `actor_i` of `actor_i` challenging `actor_j`.

        This is calculated from the perspective of actor `self`, which in
        practice means that `self.r` is used for risk aversion.
        """
        prob_success = self.model.probability(actor_i.x, actor_j.x)
        u_success = self.u_success(actor_i, actor_j.x)
        u_failure = self.u_failure(actor_i, actor_j.x)
        u_status_quo = self.u_status_quo()
        eu_resist = actor_j.s * (
            prob_success * u_success + (1 - prob_success) * u_failure)
        eu_not_resist = (1 - actor_j.s) * u_success
        eu_status_quo = self.model.q * u_status_quo
        return eu_resist + eu_not_resist - eu_status_quo

    def danger_level(self):
        """The amount of danger the actor is in from holding its policy position.

        The smaller this number is, the more secure the actor is, in that it
        expects fewer challenges to its position from other actors.
        """
        return sum(self.eu_challenge(other_actor, self) for other_actor
                   in self.model.actors if other_actor != self)

    def risk_acceptance(self):
        """Actor's risk acceptance, based on its current policy position.

        I have two comments:
        - It seems to me that BDM's intent was that in order to calculate
          risk acceptance, one would need to compare an actor's danger level
          across different policy positions that the actor could hold. Instead,
          Scholz compares the actor's danger level to the danger level of all
          other actors. This comparison doesn't seem relevant, given that other
          actors will have danger levels not possible for the given actor
          because of differences in salience and capability.
        - Even (what I assume to be) BDM's original intention is an odd way to
          calculate risk acceptance, given that the actor's policy position may
          have been coerced, rather than having been chosen by the actor based
          on its security preferences.
        """
        # Alternative calculation, which I think is more faithful to
        # BDM's original intent.
        # orig_position = self.x
        # possible_dangers = []
        # for position in self.model.positions():
        #     self.x = position
        #     possible_dangers.append(self.danger_level())
        # self.x = orig_position
        # max_danger = max(possible_dangers)
        # min_danger = min(possible_dangers)
        # return ((2 * self.danger_level() - max_danger - min_danger) /
        #         (max_danger - min_danger))
        danger_levels = [actor.danger_level() for actor in self.model.actors]
        max_danger = max(danger_levels)
        min_danger = min(danger_levels)
        return ((2 * self.danger_level() - max_danger - min_danger) /
                (max_danger - min_danger))

    def risk_aversion(self):
        risk = self.risk_acceptance()
        return (1 - risk / 3.0) / (1 + risk / 3.0)

    def best_offer(self):
        offers = defaultdict(list)
        for other_actor in self.model.actors:
            if self.x == other_actor.x:
                continue
            offer = Offer.from_actors(self, other_actor)
            if offer:
                offers[offer.offer_type].append(offer)
        best_offer = None
        best_offer_key = lambda offer: abs(self.x - offer.position)
        # This is faithful to Scholz' original code, but it appears to be a
        # mistake, since Scholz' paper and BDM clearly state that each actor
        # chooses the offer that requires him to change position the
        # least. Instead, Scholz included a special case for compromises which
        # results in some bizarre behavior, particularly in Round 4 when
        # Belgium compromises with Netherlands to an extreme position rather
        # than with France.
        def compromise_best_offer_key(offer):
            top = (abs(offer.eu) * offer.actor.x +
                   abs(offer.other_eu) * offer.other_actor.x)
            return top / (abs(offer.eu) + abs(offer.other_eu))
        if offers['confrontation']:
            best_offer = min(offers['confrontation'], key=best_offer_key)
        elif offers['compromise']:
            best_offer = min(offers['compromise'],
                             key=compromise_best_offer_key)
        elif offers['capitulation']:
            best_offer = min(offers['capitulation'], key=best_offer_key)
        return best_offer
class Offer(object):
    CONFRONTATION = 'confrontation'
    COMPROMISE = 'compromise'
    CAPITULATION = 'capitulation'
    OFFER_TYPES = (
        CONFRONTATION,
        COMPROMISE,
        CAPITULATION,
    )

    def __init__(self, actor, other_actor, offer_type, eu, other_eu, position):
        if offer_type not in self.OFFER_TYPES:
            raise ValueError('offer_type "%s" not in %s'
                             % (offer_type, self.OFFER_TYPES))
        self.actor = actor  # actor receiving the offer
        self.other_actor = other_actor  # actor proposing the offer
        self.offer_type = offer_type
        self.eu = eu
        self.other_eu = other_eu
        self.position = position

    @classmethod
    def from_actors(cls, actor, other_actor):
        eu_ij = actor.eu_challenge(actor, other_actor)
        eu_ji = actor.eu_challenge(other_actor, actor)
        if eu_ji > eu_ij > 0:
            offer_type = cls.CONFRONTATION
            position = other_actor.x
        elif eu_ji > 0 > eu_ij and eu_ji > abs(eu_ij):
            offer_type = cls.COMPROMISE
            concession = (other_actor.x - actor.x) * abs(eu_ij / eu_ji)
            position = actor.x + concession
        elif eu_ji > 0 > eu_ij and eu_ji < abs(eu_ij):
            offer_type = cls.CAPITULATION
            position = other_actor.x
        else:
            return None
        return cls(actor, other_actor, offer_type, eu_ij, eu_ji, position)

    def __str__(self):
        return self.__repr__()

    def __repr__(self):
        type_to_fmt = {
            self.CONFRONTATION: '%s loses confrontation to %s',
            self.COMPROMISE: '%s compromises with %s',
            self.CAPITULATION: '%s capitulates to %s',
        }
        fmt = type_to_fmt[self.offer_type] + "\n\t%s vs %s\n\tnew_pos = %s"
        return fmt % (self.actor.name, self.other_actor.name, self.eu,
                      self.other_eu, self.position)
class BDMScholzModel(object):
    """An expected utility model for political forecasting."""

    def __init__(self, data, q=1.0):
        self.actors = [
            Actor(name=item['Actor'],
                  c=float(item['Capability']),
                  s=float(item['Salience']),
                  x=float(item['Position']),
                  model=self)
            for item in data]
        self.name_to_actor = {actor.name: actor for actor in self.actors}
        self.q = q
        positions = self.positions()
        self.position_range = max(positions) - min(positions)

    @classmethod
    def from_csv_path(cls, csv_path):
        return cls(csv.DictReader(open(csv_path, 'rU')))

    def actor_by_name(self, name):
        return self.name_to_actor.get(name)

    def __getitem__(self, key):
        return self.name_to_actor.get(key)

    def positions(self):
        return list({actor.x for actor in self.actors})

    def median_position(self):
        positions = self.positions()
        median = positions[0]
        for position in positions[1:]:
            votes = sum(actor.compare(position, median, risk=1.0)
                        for actor in self.actors)
            if votes > 0:
                median = position
        return median

    def mean_position(self):
        return (sum(actor.c * actor.s * actor.x for actor in self.actors) /
                sum(actor.c * actor.s for actor in self.actors))

    def probability(self, x_i, x_j):
        if x_i == x_j:
            return 0.0
        # `sum_all_votes` below is faithful to Scholz' code, but I think it is
        # quite contrary to BDM's intent. Instead, we should have:
        #     denominator = sum(actor.compare(x_i, x_j) for actor in self.actors)
        # This would make sure that prob(x_i, x_j) + prob(x_j, x_i) == 1.
        # However, because of the odd way that salience values are used as
        # the probability that an actor will resist a proposal, this results in
        # the actors almost always confronting each other.
        # My theory is that Scholz got around the confrontation problem by
        # introducing this large denominator, causing extremely small
        # probability values. This prevents actors from confronting each other
        # constantly, but the result is comical, in that the challenging actor
        # always has a vanishingly small chance of winning a conflict, yet the
        # challenged actor often gives up without a fight because of low
        # salience.
        sum_all_votes = sum(abs(actor.compare(a1.x, a2.x))
                            for actor in self.actors
                            for a1 in self.actors
                            for a2 in self.actors)
        return (sum(max(0, actor.compare(x_i, x_j)) for actor in self.actors) /
                sum_all_votes)

    def update_risk_aversions(self):
        for actor in self.actors:
            actor.r = 1.0
        actor_to_risk_aversion = [(actor, actor.risk_aversion())
                                  for actor in self.actors]
        for actor, risk_aversion in actor_to_risk_aversion:
            actor.r = risk_aversion

    def update_positions(self):
        actor_to_best_offer = [(actor, actor.best_offer())
                               for actor in self.actors]
        for actor, best_offer in actor_to_best_offer:
            if best_offer:
                print best_offer
                actor.x = best_offer.position

    def run_model(self, num_rounds=1):
        print 'Median position: %s' % self.median_position()
        print 'Mean position: %s' % self.mean_position()
        for round_ in range(1, num_rounds + 1):
            print ''
            print 'ROUND %d' % round_
            self.update_risk_aversions()
            self.update_positions()
        print ''
        print 'Median position: %s' % self.median_position()
        print 'Mean position: %s' % self.mean_position()
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument(
        'csv_path',
        help='path to csv with input data')
    parser.add_argument(
        'num_rounds',
        help='number of rounds of simulation to run',
        type=int)
    args = parser.parse_args()
    model = BDMScholzModel.from_csv_path(args.csv_path)
    model.run_model(num_rounds=args.num_rounds)

Yeah, that's a lot of code, but by reading it and then running it, I can see what's going on.
You're probably getting this error:
% python2 so.py sample.csv 1
Traceback (most recent call last):
File "so.py", line 336, in <module>
model = BDMScholzModel.from_csv_path(args.csv_path)
File "so.py", line 241, in from_csv_path
return cls(csv.DictReader(open(csv_path, 'rU')))
File "so.py", line 233, in __init__
for item in data]
KeyError: 'Actor'
And you're getting that error because just creating a DictReader doesn't actually read the data; reading is still a set of steps you have to carry out explicitly. Here's the minimal example from the Python 2 docs for DictReader:
import csv

with open('names.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(row['first_name'], row['last_name'])
In your case, you want to pass a list of dicts to your BDMScholzModel constructor, and in its __init__() method turn those individual dicts into Actors.
So, your from_csv_path() classmethod needs to look more like that example, with these changes:
- create an empty list before creating the reader: data = []
- inside the row-in-reader loop, just append each row to data: data.append(row) (DictReader handles the field/key names for you)
- after the whole with-open block, finally call your BDMScholzModel initializer with your data: return cls(data)
I did all that. Then sketched up this sample CSV:
sample.csv
Actor,Capability,Salience,Position
foo,1,1,1
bar,2,2,2
baz,3,3,3
I also added a debug-print statement just before the cls(data) call at the end of my new from_csv_path() classmethod:
    print 'debug data: %s\n' % data
    return cls(data)
And running:
python2 so.py sample.csv 1
got me:
debug data: [
{'Capability': '1', 'Position': '1', 'Salience': '1', 'Actor': 'foo'},
{'Capability': '2', 'Position': '2', 'Salience': '2', 'Actor': 'bar'},
{'Capability': '3', 'Position': '3', 'Salience': '3', 'Actor': 'baz'}
]
Median position: 3.0
Mean position: 2.57142857143
ROUND 1
Median position: 3.0
Mean position: 2.57142857143
Here's my complete from_csv_path() method:
@classmethod
def from_csv_path(cls, csv_path):
    data = []
    with open(csv_path) as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            data.append(row)
    print 'debug data: %s\n' % data
    return cls(data)
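As a side note (my addition, not part of the original answer): in Python 3 the whole append loop can collapse into a single list() call, since a DictReader is just an iterable of dicts. A standalone sketch of that helper:

```python
import csv

def load_rows(csv_path):
    # Consume the DictReader into a list of dicts up front, so the
    # file can be closed before the data is passed to the model.
    with open(csv_path, newline='') as csvfile:
        return list(csv.DictReader(csvfile))
```

The result can then be handed to the BDMScholzModel constructor exactly like the data list built above.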

Related

I can’t understand why the ‘team’ term keeps coming up undefined?

While trying to complete an assignment, my result keeps coming up as partially correct in the ZyBooks system. I’ve tried everything I can possibly think of to solve the issue, and have no idea what else to try. Here are the instructions for the assignment:
class Team:
    def __init__(self):
        self.name = 'team'
        self.wins = 0
        self.losses = 0

    # TODO: Define get_win_percentage()
    def get_win_percentage(self):
        return team.wins / (team.wins + team.losses)

    # TODO: Define print_standing()
    def print_standing(self):
        print(f'Win percentage: {team.get_win_percentage():.2f}')
        if team.get_win_percentage() >= 0.5:
            print('Congratulations, Team', team.name, 'has a winning average!')
        else:
            print('Team', team.name, 'has a losing average.')

if __name__ == "__main__":
    team = Team()
    user_name = input()
    user_wins = int(input())
    user_losses = int(input())
    team.name = user_name
    team.wins = user_wins
    team.losses = user_losses
    team.print_standing()
I’m passing all the auto-generated tests aside from the last three, and I can’t understand why. The TODOs have to be included as well.
As noted in comments, in the below you've used team rather than self.
# TODO: Define get_win_percentage()
def get_win_percentage(self):
    return team.wins / (team.wins + team.losses)

# TODO: Define print_standing()
def print_standing(self):
    print(f'Win percentage: {team.get_win_percentage():.2f}')
    if team.get_win_percentage() >= 0.5:
        print('Congratulations, Team', team.name, 'has a winning average!')
    else:
        print('Team', team.name, 'has a losing average.')
Corrected:
# TODO: Define get_win_percentage()
def get_win_percentage(self):
    return self.wins / (self.wins + self.losses)

# TODO: Define print_standing()
def print_standing(self):
    print(f'Win percentage: {self.get_win_percentage():.2f}')
    if self.get_win_percentage() >= 0.5:
        print('Congratulations, Team', self.name, 'has a winning average!')
    else:
        print('Team', self.name, 'has a losing average.')
Just to add to joshmeranda's answer:
Inside a class, when you access an attribute of the object you use the "self" parameter. This parameter refers to the instance itself.
For example, to reference the instance:
def print_wins(self):
    print(self.wins)
When you do:
team = Team()
you create an instance of Team called "team". With it, you can print the number of wins, for example: print(team.wins)
Or call the method "print_wins":
team.print_wins()

How to print my class and not get memory reference

I have a class called Pension, with attributes like a person's name, age, savings and a growth rate.
I have a class method which calculates the person's total savings at retirement year.
Under my main function, I want to print the class instances to see if my code is working as intended, but I don't know how to, as I only get the memory reference when printing.
How can I print a class instance so that it goes through all its attributes, runs the result function, and prints the result? Worth noting: to run the function result, which calculates the total pension, the growth rate is user-inputted in a function of its own (which is run in main()).
For example, if I try to print the second-to-last line, print(pensions), I only get the memory reference. So in this case, if a person (whose data I read in from a file) has saved up 1000 dollars (using my result method), I would like that fact to be printed in a list.
This is my code:
class Pension:
    def __init__(self, name, age, savings, growth):
        self.name = name
        self.age = age
        self.savings = savings
        self.growth = growth

    def result(self):
        amount = self.savings
        rate = 1 + (self.growth / 100)
        years = 65 - self.age
        return (amount * (1 - pow(rate, years))) / (1 - rate)

def convert(elem: str):
    if not elem.isdigit():
        return elem
    return float(elem)

def convert_row(r: list) -> list:
    return [convert(e) for e in r]

def get_growth(msg: str = "Enter growth rate: "):
    return float(input(msg).strip())

def main():
    with open('personer.txt') as f:
        raw_data = f.readlines()
    data = [row.split("/") for row in raw_data]
    data = [convert_row(row) for row in data]
    pensions = [Pension(*i, get_growth()) for i in data]

main()
From the Pension object's perspective it doesn't actually matter how the growth is provided. Also, in this case it may be worth making result a property; then there's no need to call it as a function (just access it like any other attribute, and the value will be calculated dynamically).
You can customize the __str__ method to return any str representation of your object.
class Pension:
    def __init__(self, name, age, savings, growth):
        self.name = name
        self.age = age
        self.savings = savings
        self.growth = growth

    @property
    def result(self):
        amount = self.savings
        rate = 1 + (self.growth / 100)
        years = 65 - self.age
        return (amount * (1 - pow(rate, years))) / (1 - rate)

    def __str__(self):
        return f"Pension:\n{self.name=}\n{self.age=}\n{self.savings=}\n{self.growth=}\n{self.result=}"
And then just:
for p in pensions:
    print(p)
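To see the property idea in action, here is a self-contained sketch with made-up numbers (the name and values are hypothetical, not from the question's input file):

```python
class Pension:
    def __init__(self, name, age, savings, growth):
        self.name = name
        self.age = age
        self.savings = savings
        self.growth = growth

    @property
    def result(self):
        # Geometric-series sum of yearly deposits of `savings`
        # growing at `growth` percent until age 65.
        amount = self.savings
        rate = 1 + (self.growth / 100)
        years = 65 - self.age
        return (amount * (1 - pow(rate, years))) / (1 - rate)

p = Pension("Alice", 63, 1000.0, 10.0)  # hypothetical person
print(p.result)  # accessed like an attribute, no parentheses
```

With 2 years at 10% growth, the two deposits are worth 1000 * 1.1 + 1000 = 2100.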

Optimisation algorithm, constraints and score calculation configurations with Optapy

I am using the Optapy library in python, and I am using the school timetabling instance on GitHub as a base. I have few questions regarding the library configurations:
How do I choose the optimisation algorithm (e.g. tabu search or simulated annealing)?
How does OptaPy calculate the score of a solution? Do I have the option to change the score calculation type in Python?
How can I decide the weights for each constraint, beyond hard or soft constraints?
I was looking at OptaPlanner User Guide, but I am not sure how to implement it on python.
Guidance appreciated.
OptaPy can be configured using the programmatic API. The config classes can be found in the optapy.config package. In particular, you choose the optimisation algorithm via withPhases:
import optapy.config

solver_config = optapy.config.solver.SolverConfig().withEntityClasses(get_class(Lesson)) \
    .withSolutionClass(get_class(TimeTable)) \
    .withConstraintProviderClass(get_class(define_constraints)) \
    .withTerminationSpentLimit(Duration.ofSeconds(30)) \
    .withPhases([
        optapy.config.constructionheuristic.ConstructionHeuristicPhaseConfig(),
        optapy.config.localsearch.LocalSearchPhaseConfig()
            .withAcceptorConfig(optapy.config.localsearch.decider.acceptor.LocalSearchAcceptorConfig()
                .withSimulatedAnnealingStartingTemperature("0hard/0soft"))
    ])
(the above configures simulated annealing).
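Swapping in tabu search instead would look roughly like the fragment below. This is my sketch based on the equivalent OptaPlanner acceptor config (entityTabuSize); the exact OptaPy mirror of these setters should be double-checked against the OptaPy docs:

```python
import optapy.config

# Local search phase using tabu search instead of simulated annealing.
# withEntityTabuSize mirrors OptaPlanner's entityTabuSize setting;
# 7 is a commonly suggested starting value, to be tuned per problem.
local_search_phase = optapy.config.localsearch.LocalSearchPhaseConfig() \
    .withAcceptorConfig(
        optapy.config.localsearch.decider.acceptor.LocalSearchAcceptorConfig()
            .withEntityTabuSize(7))
```

This phase config would then be passed to withPhases in place of the simulated annealing one above.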
Recently added were the @easy_score_calculator and @incremental_score_calculator decorators, which allow you to define an EasyScoreCalculator or an IncrementalScoreCalculator respectively. For example (an EasyScoreCalculator that maximizes value):
@optapy.easy_score_calculator
def my_score_calculator(solution: Solution):
    total_score = 0
    for entity in solution.entity_list:
        total_score += 0 if entity.value is None else entity.value
    return optapy.score.SimpleScore.of(total_score)

solver_config = optapy.config.solver.SolverConfig()
termination_config = optapy.config.solver.termination.TerminationConfig()
termination_config.setBestScoreLimit('9')
solver_config.withSolutionClass(optapy.get_class(Solution)) \
    .withEntityClasses(optapy.get_class(Entity)) \
    .withEasyScoreCalculatorClass(optapy.get_class(my_score_calculator)) \
    .withTerminationConfig(termination_config)
or with an IncrementalScoreCalculator (NQueens):
@optapy.incremental_score_calculator
class IncrementalScoreCalculator:
    score: int
    row_index_map: dict
    ascending_diagonal_index_map: dict
    descending_diagonal_index_map: dict

    def resetWorkingSolution(self, working_solution: Solution):
        n = working_solution.n
        self.row_index_map = dict()
        self.ascending_diagonal_index_map = dict()
        self.descending_diagonal_index_map = dict()
        for i in range(n):
            self.row_index_map[i] = list()
            self.ascending_diagonal_index_map[i] = list()
            self.descending_diagonal_index_map[i] = list()
            if i != 0:
                self.ascending_diagonal_index_map[n - 1 + i] = list()
                self.descending_diagonal_index_map[-i] = list()
        self.score = 0
        for queen in working_solution.queen_list:
            self.insert(queen)

    def beforeEntityAdded(self, entity: any):
        pass

    def afterEntityAdded(self, entity: any):
        self.insert(entity)

    def beforeVariableChanged(self, entity: any, variableName: str):
        self.retract(entity)

    def afterVariableChanged(self, entity: any, variableName: str):
        self.insert(entity)

    def beforeEntityRemoved(self, entity: any):
        self.retract(entity)

    def afterEntityRemoved(self, entity: any):
        pass

    def insert(self, queen: Queen):
        row = queen.row
        if row is not None:
            row_index = queen.row
            row_index_list = self.row_index_map[row_index]
            self.score -= len(row_index_list)
            row_index_list.append(queen)
            ascending_diagonal_index_list = self.ascending_diagonal_index_map[queen.getAscendingDiagonalIndex()]
            self.score -= len(ascending_diagonal_index_list)
            ascending_diagonal_index_list.append(queen)
            descending_diagonal_index_list = self.descending_diagonal_index_map[queen.getDescendingDiagonalIndex()]
            self.score -= len(descending_diagonal_index_list)
            descending_diagonal_index_list.append(queen)

    def retract(self, queen: Queen):
        row = queen.row
        if row is not None:
            row_index = queen.row
            row_index_list = self.row_index_map[row_index]
            row_index_list.remove(queen)
            self.score += len(row_index_list)
            ascending_diagonal_index_list = self.ascending_diagonal_index_map[queen.getAscendingDiagonalIndex()]
            ascending_diagonal_index_list.remove(queen)
            self.score += len(ascending_diagonal_index_list)
            descending_diagonal_index_list = self.descending_diagonal_index_map[queen.getDescendingDiagonalIndex()]
            descending_diagonal_index_list.remove(queen)
            self.score += len(descending_diagonal_index_list)

    def calculateScore(self) -> optapy.score.SimpleScore:
        return optapy.score.SimpleScore.of(self.score)

solver_config = optapy.config.solver.SolverConfig()
termination_config = optapy.config.solver.termination.TerminationConfig()
termination_config.setBestScoreLimit('0')
solver_config.withSolutionClass(optapy.get_class(Solution)) \
    .withEntityClasses(optapy.get_class(Queen)) \
    .withScoreDirectorFactory(optapy.config.score.director.ScoreDirectorFactoryConfig()
        .withIncrementalScoreCalculatorClass(optapy.get_class(IncrementalScoreCalculator))) \
    .withTerminationConfig(termination_config)
If by weights you mean ConstraintConfiguration (which allows you to define custom constraint weights per problem), that is not exposed via OptaPy yet. If you mean how to make a constraint weigh more or less, either change the second parameter to penalize/reward (if constant), or add a third parameter that computes the constraint multiplier (which the second parameter will be multiplied by), like so:
def undesired_day_for_employee(constraint_factory: ConstraintFactory):
    return constraint_factory.forEach(shift_class) \
        .join(availability_class, [Joiners.equal(lambda shift: shift.employee,
                                                 lambda availability: availability.employee),
                                   Joiners.equal(lambda shift: shift.start.date(),
                                                 lambda availability: availability.date)
                                   ]) \
        .filter(lambda shift, availability: availability.availability_type == AvailabilityType.UNDESIRED) \
        .penalize('Undesired day for employee', HardSoftScore.ofSoft(2),
                  lambda shift, availability: get_shift_duration_in_minutes(shift))
(this constraint penalizes by 2 soft for every minute an employee works on an UNDESIRED day)

Creating a List of a List in heavily repeated python functions

I'm rather new to Python, especially when it comes to class attributes and how they work. I've come across a problem where I have a function builddata, which outputs a list (CoarseGraining) of a few ints and sends it to another function coarse_grain.
Over the course of the script, these functions are called hundreds of times, with CoarseGraining being different every time. What I want to do is either:
a) every time CoarseGraining reaches coarse_grain, use that instance, but also save it to a larger list, which after several repetitions of the function will contain all the different CoarseGraining configurations, so they can be used later; or
b) define this process elsewhere, where CoarseGraining is instead sent to two functions: it goes through its usual process in one, but is also collected into this so-called list of lists, which can then be used.
I should also mention that all these functions are defined within the same class, MultiFitter. I'd prefer method a) for simplicity, but any possible solution would be great. Below is a small excerpt of what I'm talking about.
Cheers
class MultiFitter(object):
    def __init__(
        self, models, mopt=None, ratio=False, fast=True, extend=False,
        fitname=None, wavg_svdcut=None, **fitterargs
    ):
        super(MultiFitter, self).__init__()
        models = [models] if isinstance(models, MultiFitterModel) else models
        self.models = models
        self.fit = None  # last fit
        self.ratio = ratio
        self.mopt = mopt
        self.fast = fast
        self.extend = extend
        self.wavg_svdcut = wavg_svdcut
        self.fitterargs = fitterargs
        self.fitname = (
            fitname if fitname is not None else
            lambda x: x
        )

    def builddata(self, data=None, pdata=None, prior=None, mf=None):
        if mf is None:
            mf = self._get_mf()
        mf['flatmodels'] = self.flatten_models(mf['models'])
        if pdata is None:
            if data is None:
                raise ValueError('no data or pdata')
            pdata = gvar.BufferDict()
            for m in mf['flatmodels']:
                M = m.builddata(data)
                CoarseGraining = []
                c1 = 1
                c2 = 0
                for i in range(1, M.shape[0]):
                    z = gvar.evalcorr([M[c2], M[i]])
                    corrValue = z[1][0]
                    if corrValue >= 0.7:
                        c1 = c1 + 1
                        if i == M.shape[0] - 1:
                            CoarseGraining.append(int(c1))
                    else:
                        CoarseGraining.append(int(c1))
                        c2 = c2 + c1
                        c1 = 1
                        if i == M.shape[0] - 1:
                            CoarseGraining.append(int(1))
                pdata[m.datatag] = (
                    m.builddata(data) if m.ncg <= 1 else
                    MultiFitter.coarse_grain(m.builddata(data), CoarseGraining)
                )

    @staticmethod
    def coarse_grain(G, CoarseGraining):
        G = numpy.asarray(G)
        D = []
        counter = 0
        for i, ncg in enumerate(CoarseGraining):
            D.append(str(numpy.sum(G[..., counter:counter + ncg], axis=-1) / ncg))
            counter = counter + ncg
        D = numpy.asarray(D)
        print(D, 'IS THIS IT???')
        print(D, '\n')
        # return numpy.transpose([G])
        return G
One way is to make coarse_grain a regular method of class MultiFitter and instantiate full_list in your class __init__. Then append to the list in your coarse_grain method.
You can then access your list of lists via self.full_list.
def __init__(...):
    self.full_list = []

def coarse_grain(self, G, CoarseGraining):
    G = numpy.asarray(G)
    D = []
    counter = 0
    for i, ncg in enumerate(CoarseGraining):
        D.append(str(numpy.sum(G[..., counter:counter + ncg], axis=-1) / ncg))
        counter = counter + ncg
    D = numpy.asarray(D)
    self.full_list.append(D)
    return G

Segmentation fault 11, python hash with lists, hashing 1 million objects

When I try to make and hash objects from a file containing one million songs, I get a weird segmentation error after about 12000 successful hashes.
Anyone have any idea why this:
Segmentation fault: 11
happens when I run the program?
I have these classes for hashing the objects:
class Node():
    def __init__(self, key, value=None):
        self.key = key
        self.value = value

    def __str__(self):
        return str(self.key) + " : " + str(self.value)

class Hashtable():
    def __init__(self, hashsize, hashlist=[None]):
        self.hashsize = hashsize * 2
        self.hashlist = hashlist * (self.hashsize)

    def __str__(self):
        return self.hashlist

    def hash_num(self, name):
        result = 0
        name_list = list(name)
        for letter in name_list:
            result = (result * self.hashsize + ord(letter)) % self.hashsize
        return result

    def check(self, num):
        if self.hashlist[num] != None:
            num = (num + 11**2) % self.hashsize  # Check here a lot!
            chk_num = self.check(num)  # here too
            return chk_num  # learn this
        else:
            return num

    def check_atom(self, num, name):
        if self.hashlist[num].key == name:
            return num
        else:
            num = (num + 11**2) % self.hashsize
            chk_num = self.check_atom(num, name)  # read here
            return chk_num  # read this

    def put(self, name, new_atom):
        node = Node(name)
        node.value = new_atom
        num = self.hash_num(name)
        chk_num = self.check(num)
        print(chk_num)
        self.hashlist[chk_num] = node

    def get(self, name):
        num = self.hash_num(name)
        chk_num = self.check_atom(num, name)
        atom = self.hashlist[chk_num]
        return atom.value
And I call upon the function in this code:
from time import *
from hashlist import *
import sys

sys.setrecursionlimit(1000000000)

def lasfil(filnamn, h):
    with open(filnamn, "r", encoding="utf-8") as fil:
        for rad in fil:
            data = rad.split("<SEP>")
            artist = data[2].strip()
            song = data[3].strip()
            h.put(artist, song)

def hitta(artist, h):
    try:
        start = time()
        print(h.get(artist))
        stop = time()
        tidhash = stop - start
        return tidhash
    except AttributeError:
        pass

h = Hashtable(1000000)
lasfil("write.txt", h)
The reason you're getting a segmentation fault is this line:
sys.setrecursionlimit(1000000000)
I assume you added it because you received a RuntimeError: maximum recursion depth exceeded. Raising the recursion limit doesn't allocate any more memory for the call stack, it just defers the aforementioned exception. If you set it too high, the interpreter runs out of stack space and accesses memory that doesn't belong to it, causing random errors (likely segfaults, but in theory anything is possible).
The real solution is to not use unbounded recursion. For things like balanced search trees, where the recursion depth is limited to a few dozen levels, it's okay, but you can't replace long loops with recursion.
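To make that concrete for this code (my illustration, not from the original answer): the recursive check can be rewritten as a loop, which removes the need for setrecursionlimit entirely. A sketch of just the probing logic, outside the original class:

```python
def probe(hashlist, hashsize, num):
    # Iterative open addressing: step by 11**2 (as in the original check)
    # until a free slot is found, instead of recursing once per occupied
    # slot. Assumes the table has at least one free slot, otherwise this
    # loops forever, just as the recursive version would overflow.
    while hashlist[num] is not None:
        num = (num + 11**2) % hashsize
    return num
```

The same transformation applies to check_atom: keep stepping in a while loop until hashlist[num].key == name.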
Also, unless this is an exercise in creating hash tables, you should just use the built in dict. If it is an exercise in creating hash tables, consider this a hint that something about your hash table sucks: It indicates a probe length of at least 1000, more likely several thousand. It should only be a few dozen at most, ideally in the single digits.
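For comparison, here is the same load-and-lookup done with a plain dict (my sketch, assuming the same <SEP>-delimited format as the question's file):

```python
def load_artists(filename):
    # Maps artist -> last seen song title; a built-in dict replaces
    # the hand-rolled hash table, probing, and recursion entirely.
    table = {}
    with open(filename, encoding="utf-8") as f:
        for line in f:
            fields = line.split("<SEP>")
            artist = fields[2].strip()
            song = fields[3].strip()
            table[artist] = song
    return table
```

Lookup is then just table[artist], with average O(1) cost even for a million entries.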
