'collections.OrderedDict' object has no attribute 'head' - python

import pandas as pd
xl = pd.ExcelFile('/Users/denniz/Desktop/WORKINGPAPER/FDIPOLITICS/python.xlsx')
dfs = pd.read_excel(xl, sheet_name=None, dtype={'COUNTRY': str, 'YEAR': int, 'govtcon': float, 'trans': float}, na_values="Missing")
dfs.head()
After running the code above, I got the following error:
AttributeError: 'collections.OrderedDict' object has no attribute 'head'

sheet_name=None will not give you a single DataFrame here. If you only need one sheet, pass its index (or name) instead, and you can combine the reading lines like this:
import pandas as pd

dfs = pd.read_excel('/Users/denniz/Desktop/WORKINGPAPER/FDIPOLITICS/python.xlsx', sheet_name=0, dtype={'COUNTRY': str, 'YEAR': int, 'govtcon': float, 'trans': float}, na_values="Missing")
dfs.head()

I have read the API reference for pandas.read_excel: it returns either a DataFrame or a dict of DataFrames.
Since you set sheet_name=None, you get all sheets back as a dict of DataFrames, keyed by sheet name.
So in your code snippet, dfs is a dict, not a DataFrame, and a dict obviously has no head method. Your code should look like dfs[sheet_name].head().
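A minimal sketch of working with the returned dict (the shortened path and the sheet name 'Sheet1' are placeholders, not from the question):

import pandas as pd

# sheet_name=None returns {sheet_name: DataFrame} for every sheet in the workbook
dfs = pd.read_excel('python.xlsx', sheet_name=None)

# Inspect each sheet in turn
for name, frame in dfs.items():
    print(name)
    print(frame.head())

# Or pick out one sheet by its name
# dfs['Sheet1'].head()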

Related

ValueError: DataFrame constructor not properly called! when converting dictionaries within list to pandas dataframe

I want to convert a list of dictionaries to a pandas dataframe, however, I got ValueError: DataFrame constructor not properly called!
Below is an example and how I got the data:
import requests
import pandas as pd
# Send an HTTP GET request to the URL
response = requests.get(url)
# Decode the JSON data into a dictionary
scrapped_data = response.text
Content of response.text is:
[{"id":123456,"date":"12-12-2022","value":37},{"id":123456,"date":"13-12-2022","value":38}]
I want to convert it to a dataframe format like the following:
id      date        value
123456  12-12-2022  37
123456  13-12-2022  38
I tried the following methods:
df = pd.DataFrame(scrapped_data)
df = pd.DataFrame_from_dict(scrapped_data)
df = pd.DataFrame(scrapped_data, orient='columns')
all got the same value errors.
I also tried:
df = pd.json_normalize(scrapped_data)
but got NotImplementedError
The type of scrapped_data is str (a JSON-formatted string).
Thanks for your help, let me know if you have any questions
One reason for receiving this error from pandas is passing a str as data. Your data appears to come in as a str; if that is the case, parse it with json.loads first:
import json
import pandas as pd

original_data = '[{"id":"123456","date":"12-12-2022","value":"37"}, {"id":"123456","date":"13-12-2022","value":"38"}]'
scraped_data = json.loads(original_data)
df = pd.DataFrame(data=scraped_data)
df
As you said, scrapped_data is a string, so you need to convert it into Python objects first (for example with the loads method from the json library).
If scrapped_data = '[{"id":"123456","date":"12-12-2022","value":"37"}, {"id":"123456","date":"13-12-2022","value":"38"}]',
then df = pd.DataFrame(json.loads(scrapped_data)) gives the frame you want.
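An alternative sketch, assuming a reasonably recent pandas: pd.read_json parses the JSON text directly (newer versions expect a file-like object, hence the StringIO wrapper):

import pandas as pd
from io import StringIO

scrapped_data = '[{"id":123456,"date":"12-12-2022","value":37},{"id":123456,"date":"13-12-2022","value":38}]'

# read_json understands the list-of-records layout natively;
# convert_dates=False keeps the 'date' column as plain strings
df = pd.read_json(StringIO(scrapped_data), convert_dates=False)
print(df)
#        id        date  value
# 0  123456  12-12-2022     37
# 1  123456  13-12-2022     38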

Use string literal instead of header name in Pandas csv file manipulation

Python 3.9.5/Pandas 1.1.3
I use the following code to create a nested dictionary object from a csv file with headers:
import pandas as pd
import json
import os
csv = "/Users/me/file.csv"
csv_file = pd.read_csv(csv, sep=",", header=0, index_col=False)
csv_file['org'] = csv_file[['location', 'type']].apply(lambda s: s.to_dict(), axis=1)
This creates a nested object called org from the data in the columns called location and type.
Now let's say the type column doesn't even exist in the csv file, and I want to pass a literal string as the type value instead of values from a column. So, for example, I want to create a nested object called org using the values from the location column as before, but just use the string foo for all values of a key called type. How can I accomplish this?
You could just build it by hand:
csv_file['org'] = csv_file['location'].apply(lambda x: {'location': x,
                                                        'type': 'foo'})
Use ChainMap. This allows you to use multiple columns (columns_to_use), and even to override existing ones (if type is among those columns, it will be overridden):
from collections import ChainMap
# .. some code
csv_file['org'] = csv_file[columns_to_use].apply(
    lambda s: ChainMap({'type': 'foo'}, s.to_dict()), axis=1)
BTW, without adding constant values it could be done with df.to_dict():
csv_file['org'] = csv_file[['location', 'type']].to_dict('records')
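A self-contained sketch of the ChainMap approach (the sample frame and columns_to_use are made up for illustration; the extra dict() call just turns each ChainMap into a plain dict so it prints and serializes like one):

import pandas as pd
from collections import ChainMap

# Hypothetical stand-in for the csv file
csv_file = pd.DataFrame({'location': ['NYC', 'LA'], 'type': ['bar', 'baz']})
columns_to_use = ['location', 'type']

# {'type': 'foo'} sits first in the ChainMap, so it wins over the column value
csv_file['org'] = csv_file[columns_to_use].apply(
    lambda s: dict(ChainMap({'type': 'foo'}, s.to_dict())), axis=1)

print(csv_file['org'].tolist())
# [{'location': 'NYC', 'type': 'foo'}, {'location': 'LA', 'type': 'foo'}]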

Compare JSON with Dictionary and put Key-Value in JSON in Python

I have one dictionary and one JSON file. I want to check whether the data exists in the dictionary and, where it matches, put the key-value pair into the JSON.
import numpy as np
import pandas as pd

df = pd.read_csv("Iris2.csv", encoding='ISO-8859-1')
df.head()

dict_from_csv = pd.read_csv('Iris2.csv', encoding='ISO-8859-1', header=None, index_col=0, squeeze=True).to_dict()
print(dict_from_csv)
And then I read the JSON attribute:
import pandas as pd

json = pd.read_json(r'C:/Users/IT City/Downloads/data.json')
print(json)

df.venue_info = pd.DataFrame(json.venue_info.values.tolist())['venue_name']
print(df.venue_info)
Now I have the dictionary built from the csv file, dict_from_csv, and the JSON attribute df.venue_info.
I first compared the JSON venue_name values with the dictionary and got the required results, so I finally have the Lat attribute. Now I want to add this new attribute to the JSON file wherever the Lat matches; otherwise it should place an empty attribute there.
for x in df.venue_info:
    if x in dict_from_csv:
        #print(x)
        #print(dict_from_csv[x])
        Lat = x + ":" + dict_from_csv[x]
        print(Lat)
    else:
        print("Not found")
Please help me in this regard. Thank you.

Pandas itertuples are not named tuples as expected?

Using this page from the Pandas documentation, I wanted to read a CSV into a dataframe, and then turn that dataframe into a list of named tuples.
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.itertuples.html?highlight=itertuples
I ran the code below...
import pandas as pd

def csv_to_tup_list(filename):
    myfile = filename
    df = pd.read_csv(myfile, sep=',')
    df.columns = ["term", "code"]
    tup_list = []
    for row in df.itertuples(index=False, name="Synonym"):
        tup_list.append(row)
    return tup_list

test = csv_to_tup_list("test.csv")
type(test[0])
... and the type returned is pandas.core.frame.Synonym, not named tuple. Is this how it is supposed to work, or am I doing something wrong?
My CSV data is just two columns of data:
a,1
b,2
c,3
for example.
"Named tuple" is not a type. namedtuple is a type factory. pandas.core.frame.Synonym is the type it created for this call, using the name you picked:
for row in df.itertuples(index=False, name="Synonym"):
# ^^^^^^^^^^^^^^
This is expected behavior.
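A quick sketch to confirm this (the toy frame is made up; _fields is part of the standard namedtuple API):

import pandas as pd

df = pd.DataFrame({"term": ["a", "b", "c"], "code": [1, 2, 3]})
rows = list(df.itertuples(index=False, name="Synonym"))

print(type(rows[0]).__name__)      # Synonym -- the name you picked
print(rows[0]._fields)             # ('term', 'code') -- namedtuple API works
print(isinstance(rows[0], tuple))  # True: namedtuples are tuple subclasses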

Pandas naming multiple sheets according to list of lists

First time I post here, and I am rather a newbie.
Anyhow, I have been playing around with Pandas and Numpy to make some calculations from Excel.
Now I want to create an .xlsx file to output my results to, and I want each sheet to be named after the dataframe being written to it.
This is my code; I tried a couple of different solutions, but I can't figure out how to write it.
In the code you can see that save_excel just makes numbered sheets (and it works great), while save_excelB tries to do what I am describing, but I can't get it to work.
from generate import A, b, L, dr, dx
from pandas import DataFrame as df
from pandas import ExcelWriter as ew

A = df(A)  # turning numpy arrays into dataframes
b = df(b)
L = df(L)
dr = df(dr)
dx = df(dx)

C = [A, b, L, dr, dx]  # making a list of the dataframes to iterate through

def save_excel(filename, item):
    w = ew(filename)
    for n, i in enumerate(item):
        i.to_excel(w, "sheet%s" % n, index=False, header=False)
    w.save()

def save_excelB(filename, item):
    w = ew(filename)
    for name in item:
        i = globals()[name]
        i.to_excel(w, sheet_name=name, index=False, header=False)
    w.save()
I run both the same way: I call the function with the file name and pass the list C I made as item.
So it would be:
save_excelB("file.xlsx", C)
and this is what I get
TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed
You need to pass string literals of the data frame names to your function, not the actual data frame objects:
C = ['A', 'b', 'L', 'dr', 'dx']

def save_excelB(filename, item):
    w = ew(filename)
    for name in item:
        i = globals()[name]
        i.to_excel(w, sheet_name=name, index=False, header=False)
    w.save()

save_excelB("file.xlsx", C)
You can even build C dynamically from all dataframes currently in the global environment by checking which items are pandas DataFrames:
import pandas as pd
...
C = [i for i in globals() if type(globals()[i]) is pd.core.frame.DataFrame]
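As a side note, a sketch of a different approach that avoids globals() lookups altogether by keeping the name-to-frame mapping in a dict (the sample frames are made up; the with-block saves and closes the file, which also sidesteps w.save() being removed in pandas 2.0):

import pandas as pd

# Hypothetical stand-ins for the arrays imported from generate
frames = {
    "A": pd.DataFrame([[1, 2], [3, 4]]),
    "b": pd.DataFrame([[5], [6]]),
}

def save_excel_named(filename, frames):
    # ExcelWriter as a context manager writes the file on exit
    with pd.ExcelWriter(filename) as w:
        for name, frame in frames.items():
            frame.to_excel(w, sheet_name=name, index=False, header=False)

save_excel_named("file.xlsx", frames)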
