Python object and its elements - python

I am studying code written by someone else.
I think the author wrote it using OOP.
When I print the results object sims,
the output looks something like this:
[<WorldCupSim.WorldCupSim object at 0x0000018DC471B908>, <WorldCupSim.WorldCupSim object at 0x0000018DC471B5C0>]
The number of objects in sims depends on the number of iterations.
In this case, I ran two iterations.
I want to print the elements of the object sims.
It seems I need to give more details about it.
Please tell me what other information I should provide.
I am confused by the code.
Thanks
Zep

What you are seeing is a default representation of the list sims. If you do
print(sims[0])
and WorldCupSim.WorldCupSim defines a string representation (which is what print() shows you), you should see something more useful.
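As a minimal sketch of what "defining a string representation" means: the class Sim below is a hypothetical stand-in for WorldCupSim.WorldCupSim, and its attributes winner and goals are invented for illustration. If the real class cannot be edited, vars() on one element still lists its attributes.

```python
class Sim:
    """Hypothetical stand-in for WorldCupSim.WorldCupSim."""

    def __init__(self, winner, goals):
        # Invented attributes, for illustration only.
        self.winner = winner
        self.goals = goals

    def __repr__(self):
        # Used by print() on the object and on a list that contains it.
        return f"Sim(winner={self.winner!r}, goals={self.goals})"


sims = [Sim("Brazil", 3), Sim("France", 2)]
print(sims)           # each element is shown through its __repr__
print(vars(sims[0]))  # attribute dictionary of one element
```

With only the default representation, print(sims) would show the memory addresses seen in the question; vars(sims[0]) works either way.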

From your output, sims is a list of objects. You can use a for loop to iterate over the list. Below is a sample snippet:
for o in sims:
    print(o)
It prints every object inside the sims variable.

Related

Finding a set of values

I have a curve_hex generator that produces two coordinates, and I'm looking for only one of them. The script finds a single coordinate value easily, but if I specify several coordinates to search for, it does not find anything.
wanted='58611774815559422402170859520215717661755632997646071327165159211728464937238'
curve_hex='(58611774815559422402170859520215717661755632997646071327165159211728464937238, 108722706890170119196943746054760504186165603293283329661416022207913727808252)'
if str(curve_hex).find(wanted)!=-1:
    print (curve_hex[str(curve_hex).find(wanted):str(curve_hex).find(wanted)+len(wanted)])
else:
    print ('not')
A single value is found normally. But if I add several values, the script raises an error:
wanted='58611774815559422402170859520215717661755632997646071327165159211728464937238', '108722706890170119196943746054760504186165603293283329661416022207913727808252'
curve_hex='(58611774815559422402170859520215717661755632997646071327165159211728464937238, 108722706890170119196943746054760504186165603293283329661416022207913727808252)'
if str(curve_hex).find(wanted)!=-1:
    print (curve_hex[str(curve_hex).find(wanted):str(curve_hex).find(wanted)+len(wanted)])
else:
    print ('not')
Please tell me how to do this correctly and what I am doing wrong. I have just started learning Python.
It is great that you are learning Python. I would also suggest spending some time in the Stack Overflow section that explains how to ask a question, because I am not 100% sure what you are trying to achieve with your code.
From my understanding, you are happy with your if/else conditions and your only problem is that you can't add multiple values to wanted. I would convert wanted to a list that includes all your wanted values and loop over those.
Something like this:
wanted = ['586..{copy the whole value here}', '108...{copy the value here}']
curve_hex='(58611774815559422402170859520215717661755632997646071327165159211728464937238, 108722706890170119196943746054760504186165603293283329661416022207913727808252)'
for value in wanted:
    if str(curve_hex).find(value)!=-1:
        print (curve_hex[str(curve_hex).find(value):str(curve_hex).find(value)+len(value)])
    else:
        print ('not')
You are mixing up some basic concepts.
Definitions like these result in different types:
wanted='58611774815559422402170859520215717661755632997646071327165159211728464937238'
type(wanted) # string
wanted='58611774815559422402170859520215717661755632997646071327165159211728464937238', '108722706890170119196943746054760504186165603293283329661416022207913727808252'
type(wanted) # tuple
curve_hex='(58611774815559422402170859520215717661755632997646071327165159211728464937238, 108722706890170119196943746054760504186165603293283329661416022207913727808252)'
type(curve_hex) # string
So you should choose a type first. A tuple is the best choice here.
wanted=('58611774815559422402170859520215717661755632997646071327165159211728464937238', '108722706890170119196943746054760504186165603293283329661416022207913727808252')
curve_hex=('58611774815559422402170859520215717661755632997646071327165159211728464937238', '108722706890170119196943746054760504186165603293283329661416022207913727808252')
for i in wanted:
    for x,j in enumerate(curve_hex):
        if i in j:
            print(f'{i} in {curve_hex} at position {x}')
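If curve_hex really arrives as a string, as in the question, one option is to parse it into a tuple of integers first and then test for exact membership rather than substrings. This is a sketch under that assumption; the second value in wanted (12345) is made up to show the "not found" case.

```python
import ast

curve_hex = '(58611774815559422402170859520215717661755632997646071327165159211728464937238, 108722706890170119196943746054760504186165603293283329661416022207913727808252)'
wanted = (
    58611774815559422402170859520215717661755632997646071327165159211728464937238,
    12345,  # hypothetical value that is not in curve_hex
)

# Safely parse the string "(x, y)" into a real tuple of two integers.
coords = ast.literal_eval(curve_hex)

# Exact comparison of whole numbers, not substring matching.
found = [value for value in wanted if value in coords]
missing = [value for value in wanted if value not in coords]

print("found:", found)
print("not found:", missing)
```

Comparing integers avoids accidental matches where a wanted value happens to be a substring of a longer coordinate.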

Problem re-creating a forloop from a Kaggle competition

Recently for a course at University, our teacher asked us to re-create one of Kaggle's competitions. I chose to do this one.
I was able to follow the tutorial relatively well, until I reached the for loop they wrote to clean the text in the data frame. Here it is:
# Get the number of reviews based on the dataframe column size
num_reviews = train["review"].size
# Initialize an empty list to hold the clean reviews
clean_train_reviews = []
# Loop over each review; create an index i that goes from 0 to the length
# of the movie review list
for i in xrange( 0, num_reviews ):
    # Call our function for each one, and add the result to the list of
    # clean reviews
    clean_train_reviews.append( review_to_words( train["review"][i] ) )
My problem is that they use 'xrange' in the loop. It was never assigned to anything during the tutorial, so it raises an error when I try to run the code. I also checked the code provided by the author on GitHub and they did exactly the same thing. So, my question is: is this simply a mistake on their end, or am I missing something? If I am not missing anything, what should be in the place of 'xrange'?
I've tried assigning the relevant dataframe column to a variable I can then use here, but I then get a TypeError, stating 'Series' object is not callable. My knowledge in Python is a bit elementary still, so I apologize if I am simply missing something obvious. Appreciate any help!
range and xrange are both built-in Python functions, but xrange exists only in Python 2.
In Python 2 the two are implemented differently and have different characteristics. The points of comparison are:
Return type: range() returns a list; xrange() returns an xrange object.
Memory: range() builds the whole list, so it takes a lot of memory; xrange() produces values on demand, so it takes little.
Operation usage: because range() returns a list, every list operation can be used on its result. xrange() returns an xrange object, so list operations cannot be applied to it, which is a disadvantage.
Speed: because xrange() evaluates lazily and produces only the values that are actually required, it is generally faster to create than range().
You are probably having this issue because you are using Python 3, where xrange no longer exists.
Just use the range function instead.
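A small sketch of the Python 3 version of the loop. To keep it self-contained, review_to_words here is a simplified hypothetical stand-in for the tutorial's function, and a plain list stands in for train["review"].

```python
def review_to_words(raw_review):
    # Hypothetical stand-in for the tutorial's cleaning function:
    # lower-case the text and collapse repeated whitespace.
    return " ".join(raw_review.lower().split())


# Plain list standing in for the train["review"] column.
reviews = ["A GREAT   movie", "Not  my Favourite"]

# Python 3: range() replaces xrange() and is lazy as well.
clean_train_reviews = []
for i in range(0, len(reviews)):
    clean_train_reviews.append(review_to_words(reviews[i]))

# Equivalent form without an explicit index.
clean_again = [review_to_words(r) for r in reviews]

print(clean_train_reviews)
```

In the tutorial's code the only change needed should be replacing xrange with range.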

python dictionary for incremental values

I would like to perform the following:
Given a total number of observations (in this case the variable 'total_models'), I would like to split it up for parallel processing by a given number of Python sessions (the 'sessions' and 'by' variables). I figure it is best to perform this task using a dictionary.
The desired result is shown in the 'obs_dict' object below, and the code should work for any given input to 'total_models', 'sessions' and 'by'. Can you help me create the desired output as a dictionary object? If possible, I would like to see the answer using some sort of list or dictionary comprehension.
total_models=1000000
sessions=4
by=int(total_models/sessions)
### Desired Output.
obs_dict={1:'0:250000',2:'250001:500000',3:'500001:750000',4:'750001:1000000'}
obs = {i+1: str(i*by+1)+':'+str((i+1)*by) for i in range(sessions)}
Edited, Kyle:
For totals that do not divide evenly, wrapping the division in a ceil ensures we do not go over 'total_models':
import math
total_models=1000326
sessions=5
by=math.ceil(total_models/sessions)
obs = {i+1: str(i*by+1)+':'+str(min((i+1)*by,total_models)) for i in range(sessions)}
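The two snippets above can be folded into one helper. This is only a sketch: the function name split_ranges is invented here, and it keeps the answer's convention of 1-based, inclusive bounds, so the first range starts at 1 rather than at the 0 shown in the question's desired output.

```python
import math


def split_ranges(total_models, sessions):
    """Return {session: 'start:end'} with 1-based, inclusive bounds."""
    # Round the chunk size up so every observation is covered.
    by = math.ceil(total_models / sessions)
    return {
        i + 1: f"{i * by + 1}:{min((i + 1) * by, total_models)}"
        for i in range(sessions)
    }


print(split_ranges(1000000, 4))
print(split_ranges(1000326, 5))
```

The min() call caps the last range at total_models when the total does not divide evenly.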

Using list elements as variable names in Python

I'm guessing this will be a really simple problem but I have no solution yet.
I have a long code that does modelling and updates values of variables for optimisation. The code is initially written like this:
def init_old(x,y):
    return {(k):olddict[k][x][0]*prod[y] for k in realnames}
Q_house=init_old("Q_house","P_house")
Q_car=init_old("Q_car","P_car")
Q_holiday=init_old("Q_holiday","P_holiday")
I already can simplify it a bit with a comprehension:
ListOfExpenses=["house","car","holiday"]
Q_house, Q_car, Q_holiday=[init_old("Q_"+i,"P0_"+i) for i in ListOfExpenses]
I am trying to find an equivalent but more flexible way of writing that final line, so that I can easily change the list of expenses and the "Q_..." variables together:
ListOfExpenses=["house","car","holiday"]
ListOfCost=["Q_house","Q_car","Q_holiday"]
Elements_Of_ListOfCost=[init_old("Q_"+i,"P0_"+i) for i in ListOfExpenses]
So when I look up Q_house, Q_car or Q_holiday later, I get the same value as Q_house=init_old("Q_house","P_house") calculated in the original code.
I don't want to use dictionaries for now, as they would require major changes to the rest of the code, and calling dictionaries causes problems in some of the other functions. Thanks in advance for the help.
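One possible sketch, not taken from the thread: at module level, globals() is the dictionary of module variables, so assigning into it creates names such as Q_house without changing the rest of the code. The contents of olddict, prod and realnames below are hypothetical placeholders, and the "P_" prefix follows the original three-line version.

```python
# Hypothetical data standing in for olddict, prod and realnames.
realnames = ["a", "b"]
olddict = {
    "a": {"Q_house": [1], "Q_car": [2], "Q_holiday": [3]},
    "b": {"Q_house": [4], "Q_car": [5], "Q_holiday": [6]},
}
prod = {"P_house": 10, "P_car": 20, "P_holiday": 30}


def init_old(x, y):
    return {k: olddict[k][x][0] * prod[y] for k in realnames}


ListOfExpenses = ["house", "car", "holiday"]

# Create one module-level variable per expense, named Q_<expense>.
for name in ListOfExpenses:
    globals()["Q_" + name] = init_old("Q_" + name, "P_" + name)

print(Q_house)
print(Q_car)
```

Creating variables dynamically like this makes code harder to follow and only works for module-level names, so a dictionary is usually preferable once the rest of the code allows it.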

Python conciseness confuses me

I have been looking at Pandas: run length of NaN holes, and this code fragment from the comments in particular:
Series([len(list(g)) for k, g in groupby(a.isnull()) if k])
As a python newbie, I am very impressed by the conciseness but not sure how to read this. Is it short for something along the lines of
myList = []
for k, g in groupby(a.isnull()):
    if k:
        myList.append(len(list(g)))
Series(myList)
In order to understand what is going on I was trying to play around with it but get an error:
list object is not callable
so not much luck there.
It would be lovely if someone could shed some light on this.
Thanks,
Anne
You've got the translation correct. However, the code you give cannot be run because a is a free variable.
My guess is that you are getting the error because you have assigned a list object to the name list. Don't do that, because list is the built-in name for the list type.
Also, in future please always provide a full stack trace, not just one part of it. Please also provide sufficient code that at least there are no free variables.
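A self-contained sketch of both points, assuming itertools.groupby and using a plain list of booleans in place of a.isnull() so that no pandas objects are needed. The function shadowed is invented to reproduce the error message safely.

```python
from itertools import groupby

# Plain list standing in for a.isnull(): True marks a missing value.
is_null = [False, True, True, False, True, False, False, True, True, True]

# Comprehension form: length of every run of consecutive True values.
run_lengths = [len(list(g)) for k, g in groupby(is_null) if k]

# Equivalent explicit loop.
my_list = []
for k, g in groupby(is_null):
    if k:
        my_list.append(len(list(g)))

print(run_lengths)  # [2, 1, 3]


def shadowed():
    list = [1, 2, 3]  # shadows the built-in list inside this function
    try:
        return list("abc")  # calls the list object, not the type
    except TypeError as exc:
        return str(exc)


print(shadowed())  # 'list' object is not callable
```

The shadowing is confined to the function here; in an interactive session, del list restores access to the built-in.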
If that is all of your code, then you have only a few possibilities:
myList.append is really a list
len is really a list
list is really a list
isnull is really a list
groupby is really a list
Series is really a list
The error occurs somewhere inside groupby.
I'm going to go ahead and strike out myList.append (because that is impossible unless you are using your own groupby function for some reason) and Series. Unless you are importing Series from somewhere strange, or you are re-assigning the variable, we know Series can't be a list. A similar argument can be made for a.isnull.
So that leaves us with two real possibilities. Either you have re-assigned something somewhere in your script to be a list where it shouldn't be, or the error is inside groupby.
I think you're using the wrong groupby: itertools.groupby takes an array or list as an argument, whereas groupby in pandas may evaluate the first argument as a function. I especially think this because isnull() returns an array-like object.
