Getting AttributeError while calling RandomForest() - python

I have been trying to do hyperopt tuning using the following models, but I keep getting this traceback. I have tried changing the parameters and added different code for n_estimators, but to no avail. I am not able to solve it with any of the solutions available online.
# Defining Search Space
space = hp.choice('classifiers', [
    {
        'model': LogisticRegression(),
        'params': {
            'model__penalty': hp.choice('lr.penalty', ['l2']),
            'model__C': hp.choice('lr.C', np.arange(0.005, 1.0, 0.01))
        }
    },
    {
        'model': BernoulliNB(),
        'params': {}
    },
    {
        'model': tree.DecisionTreeClassifier(),
        'params': {
            'model__max_depth': hp.choice('tree.max_depth', range(5, 30, 1)),
        }
    },
    {
        'model': xgb.XGBClassifier(),
        'params': {
            'model__max_depth': hp.choice('xgb.max_depth', range(5, 30, 1)),
            'model__learning_rate': hp.loguniform('learning_rate', 0.01, 0.5),
            'model__gamma': hp.loguniform('xbg.gamma', 0.0, 2.0),
            'model__random_state': 42
        }
    },
    # {
    #     'model': GradientBoostingClassifier(),
    #     'params': {
    #         'model__n_estimators': hp.uniformint('n_estimators', 100, 500),
    #         'model__max_depth': hp.uniformint('max_depth', 2, 20),
    #         'model__random_state': 42
    #     }
    # },
    {
        'model': RandomForestClassifier(),
        'params': {
            'model__n_estimators': hp.randint('rf.n_estimators_', [100, 200, 300, 400]),
            'model__max_depth': hp.uniformint('rf.max_depth', 2, 20),
            'model__min_samples_split': hp.uniformint('rf.min_samples_split', 2, 10),
            'model__bootstrap': hp.choice('rf.bootstrap', [True, False]),
            'model__max_features': hp.choice('rf.max_features', ['auto', 'sqrt']),
            'model__random_state': np.random.RandomState(42)
        }
    }
])
Traceback (most recent call last):
File "<input>", line 4, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/hyperopt/pyll_utils.py", line 18, in wrapper
return f(label, *args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/hyperopt/pyll_utils.py", line 72, in hp_choice
return scope.switch(ch, *options)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/hyperopt/pyll/base.py", line 188, in __call__
return self.symbol_table._new_apply(
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/hyperopt/pyll/base.py", line 61, in _new_apply
pos_args = [as_apply(a) for a in args]
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/hyperopt/pyll/base.py", line 61, in <listcomp>
pos_args = [as_apply(a) for a in args]
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/hyperopt/pyll/base.py", line 211, in as_apply
named_args = [(k, as_apply(v)) for (k, v) in items]
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/hyperopt/pyll/base.py", line 211, in <listcomp>
named_args = [(k, as_apply(v)) for (k, v) in items]
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/hyperopt/pyll/base.py", line 217, in as_apply
rval = Literal(obj)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/hyperopt/pyll/base.py", line 534, in __init__
o_len = len(obj)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/sklearn/ensemble/_base.py", line 195, in __len__
return len(self.estimators_)
AttributeError: 'RandomForestClassifier' object has no attribute 'estimators_'
I have tried everything at this point and would appreciate any/all help. Thank you!
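For what it's worth, the traceback itself points at the cause: hp.choice wraps every value in the option dicts as a pyll Literal, and Literal.__init__ calls len(obj). RandomForestClassifier inherits __len__ from sklearn's BaseEnsemble, which returns len(self.estimators_), and estimators_ only exists after fitting, hence the AttributeError (LogisticRegression and BernoulliNB pass because they define no __len__). A common workaround is to keep estimator instances out of the search space and store a plain name instead, instantiating the model inside the objective. Below is a minimal sketch of that pattern; the objective body, X, and y are illustrative assumptions, not code from the post. Note also that hp.randint expects an integer upper bound, not a list, so the n_estimators line would need hp.choice in any case.
from hyperopt import fmin, hp, tpe, STATUS_OK
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

space = hp.choice('classifiers', [
    {
        'type': 'rf',  # a plain string is safe to wrap as a pyll Literal
        'params': {
            'n_estimators': hp.choice('rf.n_estimators', [100, 200, 300, 400]),
            'max_depth': hp.uniformint('rf.max_depth', 2, 20),
        }
    },
])

def objective(args):
    # X and y are assumed to be defined elsewhere (illustrative only)
    if args['type'] == 'rf':
        model = RandomForestClassifier(random_state=42, **args['params'])
    score = cross_val_score(model, X, y, cv=3).mean()
    return {'loss': -score, 'status': STATUS_OK}

best = fmin(objective, space, algo=tpe.suggest, max_evals=50)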


how to add dictionary object name to json object

I have 3 python dictionaries as below:
gender = {'Female': 241, 'Male': 240}
marital_status = {'Divorced': 245, 'Engaged': 243, 'Married': 244, 'Partnered': 246, 'Single': 242}
family_type = {'Extended': 234, 'Joint': 235, 'Nuclear': 233, 'Single Parent': 236}
I add them to a list:
lst = [gender, marital_status, family_type]
And create a JSON string, which I need to save as a JSON file, using json.dumps:
jf = json.dumps(lst, indent = 4)
When we look at the jf object:
print(jf)
[
    {
        "Female": 241,
        "Male": 240
    },
    {
        "Divorced": 245,
        "Engaged": 243,
        "Married": 244,
        "Partnered": 246,
        "Single": 242
    },
    {
        "Extended": 234,
        "Joint": 235,
        "Nuclear": 233,
        "Single Parent": 236
    }
]
Is there a way to use the dictionary name as the key and get output as below?
{
    "gender": {
        "Female": 241,
        "Male": 240
    },
    "marital_status": {
        "Divorced": 245,
        "Engaged": 243,
        "Married": 244,
        "Partnered": 246,
        "Single": 242
    },
    "family_type": {
        "Extended": 234,
        "Joint": 235,
        "Nuclear": 233,
        "Single Parent": 236
    }
}
You'll have to do this manually by creating a dictionary and mapping each name to its sub-dictionary yourself.
my_data = {'gender': gender, 'marital_status':marital_status, 'family_type': family_type}
Edit: an example of writing to an output file using json.dump:
with open('myfile.json', 'w') as writer:
    json.dump(my_data, writer)
As per your requirement, you can do it like this by replacing the lst line:
dict_req = {"gender":gender, "marital_status":marital_status, "family_type":family_type}
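Since Python objects don't carry their variable names at runtime, the mapping from name to dictionary has to be written out (or built from a known list of names). A minimal end-to-end sketch using the three dictionaries above (myfile.json is an assumed output path):
import json

gender = {'Female': 241, 'Male': 240}
marital_status = {'Divorced': 245, 'Engaged': 243, 'Married': 244, 'Partnered': 246, 'Single': 242}
family_type = {'Extended': 234, 'Joint': 235, 'Nuclear': 233, 'Single Parent': 236}

# map each name to its dictionary explicitly
my_data = {'gender': gender, 'marital_status': marital_status, 'family_type': family_type}

with open('myfile.json', 'w') as writer:
    json.dump(my_data, writer, indent=4)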

How to extract items inside JSON one by one with regex condition

I use the Google Vision API on my project. The OCR result returns a JSON file that represents all the items the API recognized, with coordinates. I want to add a feature that runs through the whole JSON to find the item I want and then stores the coordinate and the description in an array/list.
This is the returned JSON format:
{
    "textAnnotations": [
        {
            "description": "a",
            "boundingPoly": {
                "vertices": [
                    {"x": 235, "y": 409},
                    {"x": 247, "y": 408},
                    {"x": 250, "y": 456},
                    {"x": 238, "y": 457}
                ]
            }
        },
        {
            "description": "b",
            "boundingPoly": {
                "vertices": [
                    {"x": 235, "y": 409},
                    {"x": 247, "y": 408},
                    {"x": 250, "y": 456},
                    {"x": 238, "y": 457}
                ]
            }
        },
        {c...}, {d...}, {e...}
    ],
    "fullTextAnnotation": {
        "pages": "not important",
        "text": "a\nb\nc\nd\ne\n"
    }
}
My aim is to find 2 items and calculate whether they are parallel. For example, I want to find out whether b or c or d or e is parallel with a, and I have already stored the coordinates of a in a list with this method:
def getJson():
    try:
        f = open('json_file.json', 'r', encoding="utf-8")
        string = f.read()
        origin_data = json.loads(string)
        return origin_data
    except Exception as e:
        print(e)
        print(traceback.format_exc())

def get_keywords_coordinates(origin_data):
    __nodes = [__node for __node in origin_data['textAnnotations'] if __node['description'] == "a"]
    __keyword_coords = []
    for __lv in range(0, 4):
        __tempx = __node['boundingPoly']['vertices'][__lv]['x']
        __keyword_coords.append(__tempx)
        __tempy = __node['boundingPoly']['vertices'][__lv]['y']
        __keyword_coords.append(__tempy)
    return __keyword_coords
where keyword_coords is the list that contains the coordinates, and it looks like this:
keyword_coords = [235, 409, 247, 408, 250, 456, 238, 457]
I will put it and another keyword's coordinates into a function to do that calculation, but I have no idea how to get the coordinates of b, c, d, and e one by one. (a/b/c/d/e is just an example; in the real situation the item names can't be hard-coded, so I may let the program find the keywords with some regex.)
How should I deal with this?
I don't know what exactly you want to do, but it doesn't need regex; a normal for-loop is enough to work with the items one by one.
First I would change get_keywords_coordinates to get all items and coordinates:
def get_keywords_coordinates(data):
    results = []
    for item in data['textAnnotations']:
        key = item["description"]
        coords = []
        for point in item["boundingPoly"]['vertices']:
            coords.append(point['x'])
            coords.append(point['y'])
        results.append((key, coords))
    return results
results = get_keywords_coordinates(data)
print('--- coords ---')
print(results)
Result:
--- coords ---
[
('a', [235, 409, 247, 408, 250, 456, 238, 457]),
('b', [335, 409, 347, 408, 350, 456, 338, 457]),
('c', [435, 409, 447, 408, 450, 456, 438, 457])
]
And I would get some selected item (i.e. the first item, with a) and create a list without this item:
selected = results[0]
#rest = results[1:]
rest = results.copy()    # more useful if I selected an item at a different index
rest.remove(selected)    # more useful if I selected an item at a different index
print('--- items ---')
print('selected:', selected)
print('rest :', rest)
print('---')
Result:
--- items ---
selected: ('a', [235, 409, 247, 408, 250, 456, 238, 457])
rest : [('b', [335, 409, 347, 408, 350, 456, 338, 457]), ('c', [435, 409, 447, 408, 450, 456, 438, 457])]
And I could use a for-loop to compare the selected item with the other items, one by one:
for item in rest:
    print('compare', selected[0], 'with', item[0])
    print(selected[0], selected[1])
    print(item[0], item[1])
Result:
compare a with b
a [235, 409, 247, 408, 250, 456, 238, 457]
b [335, 409, 347, 408, 350, 456, 338, 457]
compare a with c
a [235, 409, 247, 408, 250, 456, 238, 457]
c [435, 409, 447, 408, 450, 456, 438, 457]
Full example:
data = {
    "textAnnotations": [
        {
            "description": "a",
            "boundingPoly": {
                "vertices": [
                    {"x": 235, "y": 409},
                    {"x": 247, "y": 408},
                    {"x": 250, "y": 456},
                    {"x": 238, "y": 457}
                ]
            }
        },
        {
            "description": "b",
            "boundingPoly": {
                "vertices": [
                    {"x": 335, "y": 409},
                    {"x": 347, "y": 408},
                    {"x": 350, "y": 456},
                    {"x": 338, "y": 457}
                ]
            }
        },
        {
            "description": "c",
            "boundingPoly": {
                "vertices": [
                    {"x": 435, "y": 409},
                    {"x": 447, "y": 408},
                    {"x": 450, "y": 456},
                    {"x": 438, "y": 457}
                ]
            }
        },
    ],
    "fullTextAnnotation": {
        "pages": "not important",
        "text": "a\nb\nc\nd\ne\n"
    }
}

def get_keywords_coordinates(data):
    results = []
    for item in data['textAnnotations']:
        key = item["description"]
        coords = []
        for point in item["boundingPoly"]['vertices']:
            coords.append(point['x'])
            coords.append(point['y'])
        results.append((key, coords))
    return results

results = get_keywords_coordinates(data)
print('--- coords ---')
print(results)

selected = results[0]
#rest = results[1:]
rest = results.copy()
rest.remove(selected)

print('--- keywords ---')
print('selected:', selected)
print('rest :', rest)
print('---')

for item in rest:
    print('compare', selected[0], 'with', item[0])
    print(selected[0], selected[1])
    print(item[0], item[1])
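The parallel check itself isn't shown in the question, but with the (key, coords) tuples above it reduces to comparing edge angles. A hedged sketch, assuming "parallel" means the top edges of the two bounding boxes (vertex 1 to vertex 2) have nearly the same angle; the function name and tolerance are illustrative:
import math

def are_parallel(coords_a, coords_b, tol_degrees=2.0):
    # coords are flat lists [x1, y1, x2, y2, x3, y3, x4, y4]
    angle_a = math.degrees(math.atan2(coords_a[3] - coords_a[1], coords_a[2] - coords_a[0]))
    angle_b = math.degrees(math.atan2(coords_b[3] - coords_b[1], coords_b[2] - coords_b[0]))
    return abs(angle_a - angle_b) <= tol_degrees

for item in rest:
    print(selected[0], 'parallel with', item[0], ':', are_parallel(selected[1], item[1]))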

Iterate through nested JSON in Python

js = {
    "status": "ok",
    "meta": {
        "count": 1
    },
    "data": {
        "542250529": [
            {
                "all": {
                    "spotted": 438,
                    "battles_on_stunning_vehicles": 0,
                    "avg_damage_blocked": 39.4,
                    "capture_points": 40,
                    "explosion_hits": 0,
                    "piercings": 3519,
                    "xp": 376586,
                    "survived_battles": 136,
                    "dropped_capture_points": 382,
                    "damage_dealt": 783555,
                    "hits_percents": 74,
                    "draws": 2,
                    "battles": 290,
                    "damage_received": 330011,
                    "frags": 584,
                    "stun_number": 0,
                    "direct_hits_received": 1164,
                    "stun_assisted_damage": 0,
                    "hits": 4320,
                    "battle_avg_xp": 1299,
                    "wins": 202,
                    "losses": 86,
                    "piercings_received": 1004,
                    "no_damage_direct_hits_received": 103,
                    "shots": 5857,
                    "explosion_hits_received": 135,
                    "tanking_factor": 0.04
                }
            }
        ]
    }
}
Let us name this JSON "js" as a variable; this variable will be assigned inside a for-loop.
To understand better what I'm doing here: I'm trying to collect data from a game.
This game has hundreds of different tanks; each tank has a tank_id, which I can POST to the game server to get the performance data back as "js":
for tank_id: json = requests.post(tank_id) etc...
and I fetch all these values into my database as shown in the screenshot.
My python code for it:
my python code for it:
def api_get():
    for property in js['data']['542250529']['all']:
        spotted = property['spotted']
        battles_on_stunning_vehicles = property['battles_on_stunning_vehicles']
        # etc
        # ...
        insert_to_db(spotted, battles_on_stunning_vehicles, etc....)
the exception is:
for property in js['data']['542250529']['all']:
TypeError: list indices must be integers or slices, not str
and when:
print(js['data']['542250529'])
I get the rest of the js as a string and I can't iterate; it can't be used as a valid JSON string. Also, what's inside js['data']['542250529'] is a list containing only the item 'all'... any help would be appreciated.
You just missed [0] to get the first item in a list:
def api_get():
    for property in js['data']['542250529'][0]['all']:
        spotted = property['spotted']
        # ...
Look carefully at the data structure in the source JSON.
There is a list containing the dictionary with a key of all. So you need to use js['data']['542250529'][0]['all'] not js['data']['542250529']['all']. Then you can use .items() to get the key-value pairs.
See below.
js = {
    "status": "ok",
    "meta": {
        "count": 1
    },
    "data": {
        "542250529": [
            {
                "all": {
                    "spotted": 438,
                    "battles_on_stunning_vehicles": 0,
                    "avg_damage_blocked": 39.4,
                    "capture_points": 40,
                    "explosion_hits": 0,
                    "piercings": 3519,
                    "xp": 376586,
                    "survived_battles": 136,
                    "dropped_capture_points": 382,
                    "damage_dealt": 783555,
                    "hits_percents": 74,
                    "draws": 2,
                    "battles": 290,
                    "damage_received": 330011,
                    "frags": 584,
                    "stun_number": 0,
                    "direct_hits_received": 1164,
                    "stun_assisted_damage": 0,
                    "hits": 4320,
                    "battle_avg_xp": 1299,
                    "wins": 202,
                    "losses": 86,
                    "piercings_received": 1004,
                    "no_damage_direct_hits_received": 103,
                    "shots": 5857,
                    "explosion_hits_received": 135,
                    "tanking_factor": 0.04
                }
            }
        ]
    }
}

for key, val in js['data']['542250529'][0]['all'].items():
    print("key:", key, " val:", val)

# Or this way
for key in js['data']['542250529'][0]['all']:
    print("key:", key, " val:", js['data']['542250529'][0]['all'][key])

jsonb join not working properly in sqlalchemy

I have a query that joins on a jsonb column in Postgres, which I want to convert to SQLAlchemy in Django using the aldjemy package:
SELECT anon_1.key AS tag,
       count(anon_1.value ->> 'polarity') AS count_1,
       anon_1.value ->> 'polarity' AS anon_2
FROM feedback f
JOIN tagging t ON t.feedback_id = f.id
JOIN jsonb_each(t.json_content -> 'entityMap') AS anon_3 ON true
JOIN jsonb_each(((anon_3.value -> 'data') - 'selectionState') - 'segment') AS anon_1 ON true
WHERE f.id = 2
GROUP BY anon_1.value ->> 'polarity', anon_1.key;
The json_content field stores data in the following format:
{
    "entityMap": {
        "0": {
            "data": {
                "people": {
                    "labelId": 5,
                    "polarity": "positive"
                },
                "segment": "a small segment",
                "selectionState": {
                    "focusKey": "9xrre",
                    "hasFocus": true,
                    "anchorKey": "9xrre",
                    "isBackward": false,
                    "focusOffset": 75,
                    "anchorOffset": 3
                }
            },
            "type": "TAG",
            "mutability": "IMMUTABLE"
        },
        "1": {
            "data": {
                "product": {
                    "labelId": 6,
                    "polarity": "positive"
                },
                "segment": "another segment",
                "selectionState": {
                    "focusKey": "9xrre",
                    "hasFocus": true,
                    "anchorKey": "9xrre",
                    "isBackward": false,
                    "focusOffset": 138,
                    "anchorOffset": 79
                }
            },
            "type": "TAG",
            "mutability": "IMMUTABLE"
        }
    }
}
I wrote the following sqlalchemy code to achieve the query
first_alias = aliased(func.jsonb_each(Tagging.sa.json_content["entityMap"]))
print(first_alias)
second_alias = aliased(
    func.jsonb_each(
        first_alias.c.value.op("->")("data")
        .op("-")("selectionState")
        .op("-")("segment")
    )
)
polarity = second_alias.c.value.op("->>")("polarity")
p_tag = second_alias.c.key
_count = (
    Feedback.sa.query()
    .join(
        CampaignQuestion,
        CampaignQuestion.sa.question_id == Feedback.sa.question_id,
        isouter=True,
    )
    .join(Tagging)
    .join(first_alias, true())
    .join(second_alias, true())
    .filter(CampaignQuestion.sa.campaign_id == campaign_id)
    .with_entities(p_tag.label("p_tag"), func.count(polarity), polarity)
    .group_by(polarity, p_tag)
    .all()
)
print(_count)
But it is giving me a NotImplementedError: Operator 'getitem' is not supported on this expression error when accessing first_alias.c.
The stack trace:
Traceback (most recent call last):
File "/home/.cache/pypoetry/virtualenvs/api-FPSaTdE5-py3.8/lib/python3.8/site-packages/rest_framework/views.py", line 506, in dispatch
response = handler(request, *args, **kwargs)
File "/home/work/api/app/campaign/views.py", line 119, in results_p_tags
d = campaign_service.get_p_tag_count_for_campaign_results(id)
File "/home/work/api/app/campaign/services/campaign.py", line 177, in get_p_tag_count_for_campaign_results
return campaign_selectors.get_p_tag_counts_for_campaign(campaign_id)
File "/home/work/api/app/campaign/selectors.py", line 196, in get_p_tag_counts_for_campaign
polarity = second_alias.c.value.op("->>")("polarity")
File "/home/.cache/pypoetry/virtualenvs/api-FPSaTdE5-py3.8/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 1093, in __get__
obj.__dict__[self.__name__] = result = self.fget(obj)
File "/home/.cache/pypoetry/virtualenvs/api-FPSaTdE5-py3.8/lib/python3.8/site-packages/sqlalchemy/sql/selectable.py", line 746, in columns
self._populate_column_collection()
File "/home/.cache/pypoetry/virtualenvs/api-FPSaTdE5-py3.8/lib/python3.8/site-packages/sqlalchemy/sql/selectable.py", line 1617, in _populate_column_collection
self.element._generate_fromclause_column_proxies(self)
File "/home/.cache/pypoetry/virtualenvs/api-FPSaTdE5-py3.8/lib/python3.8/site-packages/sqlalchemy/sql/selectable.py", line 703, in _generate_fromclause_column_proxies
fromclause._columns._populate_separate_keys(
File "/home/.cache/pypoetry/virtualenvs/api-FPSaTdE5-py3.8/lib/python3.8/site-packages/sqlalchemy/sql/base.py", line 1216, in _populate_separate_keys
self._colset.update(c for k, c in self._collection)
File "/home/.cache/pypoetry/virtualenvs/api-FPSaTdE5-py3.8/lib/python3.8/site-packages/sqlalchemy/sql/base.py", line 1216, in <genexpr>
self._colset.update(c for k, c in self._collection)
File "/home/.cache/pypoetry/virtualenvs/api-FPSaTdE5-py3.8/lib/python3.8/site-packages/sqlalchemy/sql/operators.py", line 434, in __getitem__
return self.operate(getitem, index)
File "/home/.cache/pypoetry/virtualenvs/api-FPSaTdE5-py3.8/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 831, in operate
return op(self.comparator, *other, **kwargs)
File "/home/.cache/pypoetry/virtualenvs/api-FPSaTdE5-py3.8/lib/python3.8/site-packages/sqlalchemy/sql/operators.py", line 434, in __getitem__
return self.operate(getitem, index)
File "/home/.cache/pypoetry/virtualenvs/api-FPSaTdE5-py3.8/lib/python3.8/site-packages/sqlalchemy/sql/type_api.py", line 75, in operate
return o[0](self.expr, op, *(other + o[1:]), **kwargs)
File "/home/.cache/pypoetry/virtualenvs/api-FPSaTdE5-py3.8/lib/python3.8/site-packages/sqlalchemy/sql/default_comparator.py", line 173, in _getitem_impl
_unsupported_impl(expr, op, other, **kw)
File "/home/.cache/pypoetry/virtualenvs/api-FPSaTdE5-py3.8/lib/python3.8/site-packages/sqlalchemy/sql/default_comparator.py", line 177, in _unsupported_impl
raise NotImplementedError(
NotImplementedError: Operator 'getitem' is not supported on this expression
Any help would be greatly appreciated.
PS: The SQLAlchemy version I'm using for this is 1.4.6.
I used the same SQLAlchemy query expression before in a Flask project using SQLAlchemy version 1.3.22, and it was working correctly.
I fixed the issue by using table-valued functions, as mentioned in the docs, and accessing the ColumnCollection of the function using indices instead of keys. The code is as follows:
first_alias = func.jsonb_each(Tagging.sa.json_content["entityMap"]).table_valued(
    "key", "value"
)
second_alias = func.jsonb_each(
    first_alias.c[1].op("->")("data").op("-")("selectionState").op("-")("segment")
).table_valued("key", "value")
polarity = second_alias.c[1].op("->>")("polarity")
p_tag = second_alias.c[0]
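For readers hitting the same error, here is a self-contained sketch of the table_valued() pattern (SQLAlchemy 1.4+; the table and column names are stand-ins, not the aldjemy models from the question):
from sqlalchemy import Column, Integer, MetaData, Table, func, select, true
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import JSONB

metadata = MetaData()
tagging = Table(
    "tagging",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("json_content", JSONB),
)

# jsonb_each() as a table-valued function; its columns are then
# addressed positionally via .c[0] / .c[1]
entity_map = func.jsonb_each(tagging.c.json_content["entityMap"]).table_valued("key", "value")

stmt = (
    select(entity_map.c[0].label("entity_key"), entity_map.c[1].label("entity_value"))
    .select_from(tagging.join(entity_map, true()))
)
print(stmt.compile(dialect=postgresql.dialect()))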

Partial updating of object in elastic search using python

So the puamapi/apiobjects_american/4901 object looks like this:
{
    "_id": "4701",
    "_index": "puamapi",
    "_source": {
        "CatRais": null,
        "Classification": "Photographs",
        "Constituents": [],
        "CreditLine": "Gift of H. Kelley Rollings, Class of 1948, and Mrs. Rollings",
        "CuratorApproved": 0,
        "DateBegin": 1921,
        "DateEnd": 1921,
        "Dated": "1921",
        "Department": "Photography",
        "DimensionsLabel": "image: 19.3 x 24.6 cm (7 5/8 x 9 11/16 in.)\r\nsheet: 20.2 x 25.4 cm (7 15/16 x 10 in.)",
        "Edition": null,
        "Medium": "Gelatin silver print",
        "ObjectID": 4701,
        "ObjectNumber": "1995-341",
        "ObjectStatus": "Accessioned Object",
        "Restrictions": "Restricted",
        "SortNumber": " 1995 341",
        "SysTimeStamp": "AAAAAAAAC3k="
    },
    "_type": "apiobjects_american",
    "_version": 4,
    "found": true
}
I want to do a partial update on the object, where we add a constituent to the constituent array.
The record looks like this:
{
    'params': {'item': [{'ConstituentID': 5}]},
    'script': 'if (ctx._source[Constituents] == null) {ctx._source.Constituents = item } else { ctx._source.Constituents+= item }'
}
And then I add with an elastic search instance in python:
es.update(index="puamapi", doc_type="apiobjects_american", id=4901, body=record)
But I'm getting this error:
Traceback (most recent call last):
File "json_to_elasticsearch.py", line 138, in <module>
load_xrefs(api_xrefs)
File "json_to_elasticsearch.py", line 118, in load_xrefs
load_xref(table, xref_map[table][0], xref_map[table][1], json.load(file)["RECORDS"])
File "json_to_elasticsearch.py", line 109, in load_xref
es.update(index=database, doc_type=table1, id=id1, body=record)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/utils.py", line 69, in _wrapped
return func(*args, params=params, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/__init__.py", line 460, in update
doc_type, id, '_update'), params=params, body=body)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/transport.py", line 329, in perform_request
status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/http_urllib3.py", line 109, in perform_request
self._raise_error(response.status, raw_data)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/base.py", line 108, in _raise_error
raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.RequestError: TransportError(400, u'illegal_argument_exception', u'[Bastion][127.0.0.1:9300][indices:data/write/update[s]]')
Any insights would be appreciated. Thanks!
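One thing that stands out (a hedged observation; with this 1.x/2.x-era client the same 400 can also mean dynamic scripting is disabled on the cluster): the script references ctx._source[Constituents] without quotes, so Groovy sees an undefined variable named Constituents. A sketch of the corrected record, assuming inline scripting is enabled:
record = {
    'script': "if (ctx._source.Constituents == null) { ctx._source.Constituents = item } else { ctx._source.Constituents += item }",
    'params': {'item': [{'ConstituentID': 5}]}
}
es.update(index="puamapi", doc_type="apiobjects_american", id=4901, body=record)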
