Python, pandas, data analysis - python

def section_articles():
    Biology = (df2["Section"]=="Biology").sum()
    Chemistry = (df2["Section"]=="Chemistry").sum()
    Computer_Science = (df2["Section"]=="Computer Science").sum()
    Earth_Environment = (df2["Section"]=="Earth & Environment").sum()
    Mathematics = (df2["Section"]=="Mathematics").sum()
    Physics = (df2["Section"]=="Physics").sum()
    Statistics = (df2["Section"]=="Statistics").sum()
    return()
print ("Biology",Biology)
print ("Chemistry",Chemistry)
print ("Computer_Science",Computer_Science)
print ("Earth_Environment",Earth_Environment)
print ("Mathematics",Mathematics)
print ("Physics",Physics)
print ("Statistics",Statistics)
section_articles()
I am expecting the number of articles in each section, but I am getting the error: name 'Biology' is not defined.
Can someone help me, please?

The issue is that the variables Biology, Chemistry, etc. are local variables defined inside the section_articles function, so they are not accessible outside of it. To use the values outside, the function has to return them, and you need to assign the function's return value to a variable:
def section_articles():
    Biology = (df2["Section"]=="Biology").sum()
    Chemistry = (df2["Section"]=="Chemistry").sum()
    Computer_Science = (df2["Section"]=="Computer Science").sum()
    Earth_Environment = (df2["Section"]=="Earth & Environment").sum()
    Mathematics = (df2["Section"]=="Mathematics").sum()
    Physics = (df2["Section"]=="Physics").sum()
    Statistics = (df2["Section"]=="Statistics").sum()
    return (Biology, Chemistry, Computer_Science, Earth_Environment, Mathematics, Physics, Statistics)
section_counts = section_articles()
print ("Biology",section_counts[0])
print ("Chemistry",section_counts[1])
print ("Computer_Science",section_counts[2])
print ("Earth_Environment",section_counts[3])
print ("Mathematics",section_counts[4])
print ("Physics",section_counts[5])
print ("Statistics",section_counts[6])
A tidier version stores the count for each section in a dictionary and then loops through the dictionary to print the values:
def section_articles():
    sections = {"Biology": (df2["Section"]=="Biology").sum(),
                "Chemistry": (df2["Section"]=="Chemistry").sum(),
                "Computer Science": (df2["Section"]=="Computer Science").sum(),
                "Earth & Environment": (df2["Section"]=="Earth & Environment").sum(),
                "Mathematics": (df2["Section"]=="Mathematics").sum(),
                "Physics": (df2["Section"]=="Physics").sum(),
                "Statistics": (df2["Section"]=="Statistics").sum()}
    return sections
section_counts = section_articles()
for section, count in section_counts.items():
    print(f"{section}: {count}")

Your function returns an empty tuple (), so the counts computed inside it are not available outside it.
One way to fix the error and reduce visual noise is to build and return a dictionary, then loop over it:
def section_articles():
    list_of_sections = ["Biology", "Chemistry", "Computer Science",
                        "Earth & Environment", "Mathematics", "Physics", "Statistics"]
    # Map each section name to the number of rows in df2 with that section.
    return {k: (df2["Section"] == k).sum() for k in list_of_sections}
for k, v in section_articles().items():
    print(k, v)
Another variant:
list_of_sections = ["Biology", "Chemistry", "Computer Science",
                    "Earth & Environment", "Mathematics", "Physics", "Statistics"]
def section_articles(section):
    # Count the rows of df2 whose "Section" column equals the given section.
    return (df2["Section"] == section).sum()
for section in list_of_sections:
    print(section, section_articles(section))
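If the goal is simply one count per section, the per-section comparisons can also be replaced by a single tally. The following is a minimal sketch using only the standard library. It assumes the section labels are available as a plain list (for example from df2["Section"].tolist()), and the sample labels are made up for illustration. In pandas itself, df2["Section"].value_counts() gives the same per-section counts directly.

```python
from collections import Counter

# Hypothetical stand-in for df2["Section"].tolist(); not data from the question.
section_labels = ["Biology", "Physics", "Biology", "Statistics", "Physics", "Biology"]

def section_articles(labels):
    """Return a dict mapping each section name to its number of articles."""
    return dict(Counter(labels))

section_counts = section_articles(section_labels)
for section, count in sorted(section_counts.items()):
    print(f"{section}: {count}")
```

Because the function returns the dictionary, the caller can use the counts anywhere, which is what the original function with its bare return() did not allow.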

Related

Creating a subject and grading system in Python

I am trying to create a grading system for a university project.
We are told to have 3 global lists:
Emner = ["INFO100","INFO104","INFO110","INFO150","INFO125"]
FagKoder = [["Informasjonsvitenskap","INF"],["Kognitiv vitenskap","KVT"]]
Karakterer=[["INFO100","C"],["INFO104","B"],["INFO110","E"]]
With these lists we are supposed to create a way to view the subjects (Emner) with their grades from Karakterer, but we should also be able to view subjects without grades. It should be displayed like this:
We should also be able to add new subjects to Emner and add new grades to Karakterer. All of this should be displayed as in the picture above.
I have been trying all kinds of different ways of doing this, but I keep returning to one of two problems. Either I am not able to print a subject without a grade, or, if I add a new subject (Emne) and want to add a grade (Karakter), I am not able to attach it to the right subject, as it just gets saved at the first subject without a grade.
I hope someone can help me with this, I'm going crazy here!
Code I have so far:
def emneliste():
    global Emner
    global Karakterer
    emne,kar = zip(*Karakterer)
    ans = [list(filter(None, i)) for i in itertools.zip_longest(Emner,kar)]
def LeggTilEmne():
    global Karakterer
    global Emner
    nyttEmne = input("Skriv ny emnekode (4Bokstaver + 3 tall): ")
    if nyttEmne not in Emner:
        while re.match('^[A-Å]{3,4}[0-9]{3}$',nyttEmne):
            Emner.append(nyttEmne)
            print(nyttEmne + " Er lagt til!")
            start()
        print("Feil format")
        LeggTilEmne()
    else:
        print("Dette Emnet er allerede i listen din")
        start()
def SettKarakter():
    global Karakterer
    global Emner
    VelgEmne = input("Hvilke emne? ")
    Emne,Karakter = zip(*Karakterer)
    if str(VelgEmne) not in str(Emner):
        print("Dette faget er ikke i din liste")
        feil = input("om du heller ønsket å opprette fag trykk 2, ellers trykk enter ")
        if feil == str(2):
            LeggTilEmne()
        else:
            start()
    else:
        if str(VelgEmne) in str(Karakterer):
            index = Karakterer.index([VelgEmne,"C"])
            Karakterer.pop(index)
            SettKar = input("Karakter? ")
            Emner.append([VelgEmne,SettKar])
            print("Karakter " + SettKar + " Er Lagt til i " + VelgEmne)
            start()
        else:
            SettKar = input("Karakter? ")
            if str(VelgEmne) in str(Emner):
                index = Emner.index(VelgEmne)
                print(index)
                Emner.pop(index)
                Emner.insert(index,[VelgEmne,SettKar])
                print("Karakter " + SettKar + " Er Lagt til i " + VelgEmne)
                start()
            else:
                print("Virker Ikke")
                start()
You can make Karakterer a dict instead so that you can iterate through the subjects in Emner and efficiently look up if a subject is in Karakterer with the in operator:
Karakterer = dict(Karakterer)
for subject in Emner:
    print(*([subject] + ([Karakterer[subject]] if subject in Karakterer else [])))
This outputs:
INFO100 C
INFO104 B
INFO110 E
INFO150
INFO125
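The same dictionary also addresses the second problem in the question, attaching a grade to the right subject: the subject code is the key, so there is no list position to get wrong. A minimal sketch, assuming the lists from the question. The helper names set_grade and format_subjects and the subject INFO132 are hypothetical and not part of the original code:

```python
Emner = ["INFO100", "INFO104", "INFO110", "INFO150", "INFO125"]
Karakterer = dict([["INFO100", "C"], ["INFO104", "B"], ["INFO110", "E"]])

def set_grade(subject, grade):
    """Add the subject if it is new, then set or overwrite its grade."""
    if subject not in Emner:
        Emner.append(subject)
    Karakterer[subject] = grade

def format_subjects():
    """Return one line per subject, with the grade when there is one."""
    return [" ".join([s, Karakterer[s]]) if s in Karakterer else s for s in Emner]

set_grade("INFO150", "A")   # grade an existing subject that had no grade
set_grade("INFO132", "B")   # hypothetical new subject, added together with its grade
print("\n".join(format_subjects()))
```

Subjects without a grade, such as INFO125 here, are still listed, just without a grade after the code.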
Here's an updated GradeHandler class demo. I tried to allow for updating grades, removing subjects, etc.:
__name__ = 'DEMO'
class GradeHandler(object):
    EMNER = ["INFO100","INFO104","INFO110","INFO150","INFO125"]
    FAGKODER = [["Informasjonsvitenskap","INF"],["Kognitiv vitenskap","KVT"]]
    KARAKTERER = [["INFO100","C"],["INFO104","B"],["INFO110","E"]]
    def __init__(self):
        # Copy the class-level defaults so each instance changes only its own lists.
        self.Emner = list(self.EMNER)
        self.FagKoder = list(self.FAGKODER)
        self.Karakterer = list(self.KARAKTERER)
        self.__create_grade_dict()
    def remove_subject(self, subject_name):
        """
        Remove a subject (and its grade) from the instance's lists.
        """
        self.Emner = [i for i in self.Emner if i != subject_name]
        self.Karakterer = [p for p in self.Karakterer if p[0] != subject_name]
        self.__create_grade_dict()
    def add_subject(self, subject_name):
        """
        Append a subject to the instance's subject list.
        """
        if subject_name not in self.Emner:
            self.Emner.append(subject_name)
            self.__create_grade_dict()
    def __create_grade_dict(self):
        """
        Build the subject -> grade dictionary; subjects without a grade map to ''.
        """
        karakterer_dict = dict(self.Karakterer)
        self.grade_dict = dict()
        for i in self.Emner:
            self.grade_dict[i] = karakterer_dict.get(i, '')
    def update_grade(self, subject_name, grade='A'):
        """
        Update a grade in the grade dictionary.
        Will also add the subject if it is not already in the subject list.
        """
        if subject_name not in self.Emner:
            self.Emner.append(subject_name)
        # Keep Karakterer in sync so the grade survives a rebuild of grade_dict.
        self.Karakterer = [p for p in self.Karakterer if p[0] != subject_name]
        self.Karakterer.append([subject_name, grade])
        self.grade_dict[subject_name] = grade
    def print_grades(self, subject_name=None):
        """
        Print dictionary results.
        """
        if subject_name is None:
            for k, v in self.grade_dict.items():
                print('{} {}'.format(k, v))
        else:
            if subject_name in self.grade_dict.keys():
                print('{} {}'.format(subject_name, self.grade_dict[subject_name]))
if __name__ == 'DEMO':
    ### Create an instance of the GradeHandler and print initial grades.
    gh = GradeHandler()
    gh.print_grades()
    ### Append a class
    gh.add_subject('GE0124')
    gh.print_grades()
    ### Add grade
    gh.update_grade('GE0124', 'B+')
    gh.print_grades()
    ### Update grades
    gh.update_grade('GE0124', 'A-')
    gh.print_grades()
    ### Remove subject (will also remove grade).
    gh.remove_subject('GE0124')
    gh.print_grades()

Calculating the average of specific numbers in a mixed text file?

I have a program that reads a file that has student names, IDs, majors, and GPAs in it.
For example (there is much more to the file):
OLIVER
8117411
English
2.09
OLIVIA
6478288
Law
3.11
HARRY
5520946
English
1.88
AMELIA
2440501
French
2.93
I have to figure out:
which medicine majors made the honor roll and
the average GPA of all the math majors
All I have right now is the list of medicine majors that made honor roll. I have no idea how to start calculating the average GPA of math majors. Any help is appreciated, and thanks in advance.
This is the code I currently have:
import students6
file = open("students.txt")
name = "x"
while name != "":
    name, studentID, major, gpa = students6.readStudents6(file)
    print(name, gpa, major, studentID)
    if major == "Medicine" and gpa > "3.5":
        print("Med student " + name + " made the honor roll.")
    if major == "Math":
Here is the students6.py file that is being imported:
def readStudents6(file):
    name = file.readline().rstrip()
    studentID = file.readline().rstrip()
    major = file.readline().rstrip()
    gpa = file.readline().rstrip()
    return name, studentID, major, gpa
You need to keep the data somewhere; currently each call that reads the file only gives you one tuple, which is then thrown away.
Store the tuples in a list, then write one function that filters the students by their major and one that computes the average GPA of a given list of students.
You might want to convert the GPA to a float when reading:
with open("s.txt","w") as f:
    f.write("OLIVER\n8117411\nEnglish\n2.09\nOLIVIA\n6478288\nLaw\n3.11\n" + \
            "HARRY\n5520946\nEnglish\n1.88\nAMELIA\n2440501\nFrench\n2.93\n")
def readStudents6(file):
    name = file.readline().rstrip()
    studentID = file.readline().rstrip()
    major = file.readline().rstrip()
    gpa = float(file.readline().rstrip()) # make float
    return name, studentID, major, gpa
Two new helper functions that work on the returned student-data tuples:
def filterOnMajor(major,studs):
    """Filters the given list of students (studs) by its 3rd tuple-value. Students
    data is given as (name,ID,major,gpa) tuples inside the list."""
    return [s for s in studs if s[2] == major] # filter on certain major
def avgGpa(studs):
    """Returns the average GPA of all provided students. Students data
    is given as (name,ID,major,gpa) tuples inside the list."""
    return sum( s[3] for s in studs ) / len(studs) # calculate avgGpa
Main program:
students = []
with open("s.txt","r") as f:
    while True:
        try:
            stud = readStudents6(f)
            if stud[0] == "":
                break
            students.append( stud )
        except ValueError: # float("") fails once the end of the file is reached
            break
print(students , "\n")
engl = filterOnMajor("English",students)
print(engl, "Avg: ", avgGpa(engl))
Output:
# all students (reformatted)
[('OLIVER', '8117411', 'English', 2.09),
('OLIVIA', '6478288', 'Law', 3.11),
('HARRY', '5520946', 'English', 1.88),
('AMELIA', '2440501', 'French', 2.93)]
# english major with avgGPA (reformatted)
[('OLIVER', '8117411', 'English', 2.09),
('HARRY', '5520946', 'English', 1.88)] Avg: 1.9849999999999999
See: PyTut: List comprehensions and Built in functions (float, sum)
def prettyPrint(studs):
    for name,id,major,gpa in studs:
        print(f"Student {name} [{id}] with major {major} reached {gpa}")
prettyPrint(engl)
Output:
Student OLIVER [8117411] with major English reached 2.09
Student HARRY [5520946] with major English reached 1.88
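Applied to the two tasks in the question, the same idea might look like the sketch below. It is only an illustration under stated assumptions: the sample records are made up (the excerpt in the question contains no Medicine or Math majors), the names read_students, honor_roll and math_average are hypothetical, and 3.5 is the cutoff taken from the question's own code. Note that the question compares gpa > "3.5" as strings, which compares text rather than numbers, so the GPA is converted to float here.

```python
import io

# Made-up sample records in the question's four-lines-per-student layout.
SAMPLE = (
    "ALICE\n1000001\nMedicine\n3.80\n"
    "BOB\n1000002\nMath\n3.00\n"
    "CAROL\n1000003\nMath\n2.50\n"
    "DAVE\n1000004\nMedicine\n3.20\n"
)

def read_students(file):
    """Read (name, id, major, gpa) tuples until the file is exhausted."""
    students = []
    while True:
        name = file.readline().rstrip()
        if name == "":          # an empty name line means end of file
            break
        student_id = file.readline().rstrip()
        major = file.readline().rstrip()
        gpa = float(file.readline().rstrip())
        students.append((name, student_id, major, gpa))
    return students

# io.StringIO stands in for open("students.txt") so the sketch is self-contained.
students = read_students(io.StringIO(SAMPLE))

# Medicine majors on the honor roll; 3.5 is the cutoff used in the question's code.
honor_roll = [name for name, _, major, gpa in students
              if major == "Medicine" and gpa > 3.5]

# Average GPA of the math majors, guarding against an empty list.
math_gpas = [gpa for _, _, major, gpa in students if major == "Math"]
math_average = sum(math_gpas) / len(math_gpas) if math_gpas else None

print("Honor roll:", honor_roll)
print("Math average:", math_average)
```

With a real file, passing the object returned by open("students.txt") to read_students should work the same way.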

Python: TypeError: 'list' object is not callable on global variable

I am currently in the process of programming a text-based adventure in Python as a learning exercise. I want "help" to be a global command, stored as values in a list, that can be called at (essentially) any time. As the player enters a new room, or the help options change, I reset the help_commands list with the new values. However, when I debug the following script, I get a 'list' object is not callable TypeError.
I have gone over my code time and time again and can't seem to figure out what's wrong. I'm somewhat new to Python, so I assume it's something simple I'm overlooking.
player = {
    "name": "",
    "gender": "",
    "race": "",
    "class": "",
    "HP": 10,
}
global help_commands
help_commands = ["Save", "Quit", "Other"]
def help():
    sub_help = '|'.join(help_commands)
    print "The following commands are avalible: " + sub_help
def help_test():
    help = ["Exit [direction], Open [object], Talk to [Person], Use [Item]"]
    print "Before we go any further, I'd like to know a little more about you."
    print "What is your name, young adventurer?"
    player_name = raw_input(">> ").lower()
    if player_name == "help":
        help()
    else:
        player['name'] = player_name
        print "It is nice to meet you, ", player['name'] + "."
help_test()
Edit:
You're like my Python guru, Moses. That fixed my problem, however now I can't get the values in help_commands to be overwritten by the new commands:
player = {
    "name": "",
    "gender": "",
    "race": "",
    "class": "",
    "HP": 10,
}
# global help_commands
help_commands = ["Save", "Quit", "Other"]
def help():
    sub_help = ' | '.join(help_commands)
    return "The following commands are avalible: " + sub_help
def help_test():
    print help()
    help_commands = ["Exit [direction], Open [object], Talk to [Person], Use [Item]"]
    print help()
    print "Before we go any further, I'd like to know a little more about you."
    print "What is your name, young adventurer?"
    player_name = raw_input(">> ").lower()
    if player_name == "help":
        help()
    else:
        player['name'] = player_name
        print "It is nice to meet you, ", player['name'] + "."
help_test()
Thoughts?
You are mixing the name of a list with that of a function:
help = ["Exit [direction], Open [object], Talk to [Person], Use [Item]"]
And then:
def help():
    sub_help = '|'.join(help_commands)
    print "The following commands are avalible: " + sub_help
Inside help_test, the name help refers to the list, not to the function, and a list is not callable; calling help() there therefore raises the TypeError.
Consider renaming the list or, better still, both, since the name help is already used by a builtin function.
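The follow-up problem from the edit (the new commands never show up) has a related cause: a plain assignment to help_commands inside a function creates a new local name and leaves the module-level list untouched. A minimal sketch, written in Python 3 syntax unlike the question's Python 2 code; the function names show_help, enter_room_local and enter_room_in_place are hypothetical:

```python
help_commands = ["Save", "Quit", "Other"]

def show_help():
    """Build the help text from the module-level list (renamed so it does not shadow the builtin help)."""
    return "The following commands are available: " + " | ".join(help_commands)

def enter_room_local():
    # Plain assignment creates a new local name; the module-level list is untouched.
    help_commands = ["Exit [direction]", "Open [object]"]
    return show_help()

def enter_room_in_place():
    # Slice assignment replaces the contents of the existing module-level list.
    help_commands[:] = ["Exit [direction]", "Open [object]"]
    return show_help()

before = enter_room_local()
after = enter_room_in_place()
print(before)
print(after)
```

Declaring global help_commands at the top of the function and then assigning to it is the other common way to get the same effect.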

Get variable name from input string without using an elif statement for all possibilities

I'm new to programming in Python 3.4.3 but I've started my first project that will help me as a chemist. I've created a series of code that asks the user to input the name of a chemical (e.g. water or ethanol) and then returns a list of chemical properties of the chemical. My code is below. I left out the chemical data I hard coded to make it easier to read hopefully.
print ("Welcome to [program name]! Copyright (C) 2015 [name omitted] All Rights Reserved.")
print ("----" * 4)
class Chemical():
    def __init__(self, data):
        self.data = data
    def getData(self):
        return self.data
chemical1 = Chemical("Data\nData\nData\n")
chemicalName = input("Choose a chemical: ")
if chemicalName == "chemical1":
    print (chemical1.getData())
elif chemicalName == "Other chemical name":
    # I now have a lot of elif statements, to account for all possible chemicals
    print (other_chemical.getData())
else:
    print ("\nThe chemical you chose hasn't been added yet\n")
When executed, if the input is chemical1 what will be produced is:
data
data
data
Basically, I have a huge list of chemicals with their data hard coded into the script and a large number of elif statements to account for each chemical (there are now 26 chemicals). The whole thing works perfectly but I would like to reduce the number of elif statements in order to reduce the number of lines of code. Perhaps by writing a for loop or some other expression as suggested by a friend of mine who has some experience with Python. I'm still a beginner so I'm not sure of all the different methods.
I tried this:
chemicalName = input("Choose a chemical: ")
print (chemicalName.getData())
but when I ran the script I got an error that said:
Traceback (most recent call last):
  File "C:\Users\john\Desktop\test.py", line 42, in <module>
    print (chemicalName.getData())
AttributeError: 'str' object has no attribute 'getData'
The idea is that when I input a chemical, e.g. chemical2, the input is used directly in print (chemicalName.getData()) as if I had written chemical2 there, which would then print the data I entered in chemical2 = Chemical("Data\nData\nData")
I'm not sure what I can do at this point. To give you all an idea, I have around 150 lines of code with all the elif statements so any ideas or feedback is very welcome.
You can use a dictionary and work like this
class Chemical():
    def __init__(self, name, blah1, blah2):
        self.name = name
        self.blah1 = blah1
        self.blah2 = blah2
chemicals = {}
chemical1 = Chemical("chemical1", "Data", "Data")
chemicals[chemical1.name] = chemical1
chemicalName = input("Choose a chemical: ")
if chemicalName in chemicals:
    chemical = chemicals[chemicalName]  # look up the object by the name typed in
    print (chemical.name + " " + chemical.blah1 + " " + chemical.blah2)
else:
    print ("\nThe chemical you chose hasn't been added yet\n")
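The same dictionary lookup also works with the Chemical class exactly as written in the question, keeping getData. A minimal sketch: the keys water and ethanol are the examples mentioned in the question, the data strings are placeholders, and the helper name describe is hypothetical.

```python
class Chemical:
    def __init__(self, data):
        self.data = data

    def getData(self):
        return self.data

# Placeholder entries; the real property data would go here.
chemicals = {
    "water": Chemical("Data\nData\nData\n"),
    "ethanol": Chemical("Data\nData\nData\n"),
}

def describe(chemical_name):
    """Return the stored data for a chemical, or a fallback message."""
    chemical = chemicals.get(chemical_name)
    if chemical is None:
        return "\nThe chemical you chose hasn't been added yet\n"
    return chemical.getData()

print(describe("water"))
print(describe("unobtainium"))  # a name that is not in the dictionary
```

In the interactive script, describe(input("Choose a chemical: ")) would take the place of the whole if/elif chain, and adding a chemical only means adding one dictionary entry.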

"None" given when attempting to return the value of dict in Python

I'm trying to return the values of a dict when creating an instance of a class in Python, but I keep getting "None" returned instead.
I'm very new to Python, so I'm sure there is an easy answer to this one.
After running the below:
class TestTwo(object):
    def __init__(self):
        self.attributes = {
            'age': "",
            'name' : "",
            'location': ""
        }
    def your_age(self):
        self.attributes['age'] = raw_input("What is your age? > ")
        self.your_name()
    def your_name(self):
        self.attributes['name'] = raw_input("What is your name? > ")
        self.your_location()
    def your_location(self):
        self.attributes['location'] = raw_input("Where do you live? > ")
        self.results()
    def results(self):
        print "You live in %s" % self.attributes['location']
        print "Your number is %s" % self.attributes['age']
        print "Your name is %s" % self.attributes['name']
        d = self.attributes
        return d
output = TestTwo().your_age()
print output
I end up with this:
MacBook-Pro-2:python johnrougeux$ python class_test.py
What is your age? > 30
What is your name? > John
Where do you live? > KY
You live in KY
Your number is 30
Your name is John
None
Instead of "None", I was expecting "{'age': '30', 'name': 'John', 'location': 'KY'}"
What am I missing?
Only results() returns something. You need to pass its return value along the call chain by returning it in the other functions if you want them to return something, too:
def your_age(self):
    self.attributes['age'] = raw_input("What is your age? > ")
    return self.your_name()
def your_name(self):
    self.attributes['name'] = raw_input("What is your name? > ")
    return self.your_location()
def your_location(self):
    self.attributes['location'] = raw_input("Where do you live? > ")
    return self.results()
Of course, this kind of chaining is extremely ugly, but I'm sure you already know that. If not, rewrite your code like this:
In each of those functions, just set the value and do not call any of your other functions. Then add a function such as this:
def prompt_data(self):
    self.your_age()
    self.your_name()
    self.your_location()
In the code using the class, do this:
t2 = TestTwo()
t2.prompt_data()
output = t2.results()
The function your_age() doesn't return a value, so output is None.
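Put together, the restructuring suggested above might look like the following sketch. It is written in Python 3 (input instead of raw_input), and the ask parameter is a hypothetical hook, not part of the original class, added only so that scripted answers can replace keyboard input and the example runs unattended.

```python
class TestTwo:
    def __init__(self, ask=input):
        # `ask` is a hypothetical hook so the prompts can be replaced in tests.
        self.ask = ask
        self.attributes = {"age": "", "name": "", "location": ""}

    # Each method only stores its own answer and calls nothing else.
    def your_age(self):
        self.attributes["age"] = self.ask("What is your age? > ")

    def your_name(self):
        self.attributes["name"] = self.ask("What is your name? > ")

    def your_location(self):
        self.attributes["location"] = self.ask("Where do you live? > ")

    def prompt_data(self):
        self.your_age()
        self.your_name()
        self.your_location()

    def results(self):
        # The only method that returns something: the collected answers.
        return self.attributes

# Scripted answers stand in for keyboard input.
answers = iter(["30", "John", "KY"])
t2 = TestTwo(ask=lambda prompt: next(answers))
t2.prompt_data()
output = t2.results()
print(output)
```

Calling TestTwo() without arguments falls back to the normal interactive prompts.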
