How to set a variable when using Pandas' read_csv - python

"test.csv" has columns "col_a", "col_b" and "col_c".
# import pandas
import pandas as pd
df = pd.read_csv('./data/test.csv',header=0,dtype={'col_a':object,'col_b':object,'col_c':object})
This code works well. However, I would like to move the dtype mapping into a variable, "key_word", as follows, but it does not work. Why? How should I modify this code?
# import pandas
import pandas as pd
key_word='col_a':object,'col_b':object,'col_c':object
df = pd.read_csv('./data/test.csv',header=0,dtype={key_word})

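Make key_word a dictionary by initializing it like this:
key_word={'col_a':object,'col_b':object,'col_c':object}
Then pass it directly, without extra braces: dtype=key_word. That should do the trick. Right now it cannot work, because the assignment without curly brackets is a syntax error, and dtype={key_word} would try to put a dict inside a set, which raises a TypeError.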
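A minimal sketch of the fix, using a small in-memory CSV (made-up sample data read through io.StringIO) in place of ./data/test.csv:

```python
import io
import pandas as pd

# Hypothetical stand-in for ./data/test.csv so the example is self-contained.
csv_text = "col_a,col_b,col_c\n001,2,x\n010,3,y\n"

# The dtype mapping lives in an ordinary dict variable.
key_word = {'col_a': object, 'col_b': object, 'col_c': object}

# Pass the dict itself; dtype={key_word} would try to build a set containing a dict.
df = pd.read_csv(io.StringIO(csv_text), header=0, dtype=key_word)

print(df.dtypes)
print(df)
```

Because every column is read as object, values such as 001 keep their leading zeros instead of being parsed as integers.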

Related

Getting error TypeError: __init__() got an unexpected keyword argument 'sheet_name' when passing sheet_name argument to read_excel in pandas

I'm running this in a Jupyter notebook. When I call it with just the file path it works fine, but when I try to specify a sheet it gives me the error. What would be the right syntax to make this parameter (and, I guess, other parameters) work?
import pandas as pd
import datetime
import numpy as np
df = pd.DataFrame(pd.read_excel(the file path's name would be here), sheet_name='the sheets name'
)
df
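As Yuca said in the comments, you need to remove the pd.DataFrame wrapper. read_excel already returns a DataFrame, and in your code sheet_name ends up as an argument to pd.DataFrame rather than to read_excel.
Your code should look like this: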
import pandas as pd
import datetime
import numpy as np
df = pd.read_excel('C:/path/to/file.xlsx', sheet_name='the sheets name')
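The error can be reproduced without any Excel file, which suggests it comes from pd.DataFrame rather than from read_excel. A small sketch, with made-up data standing in for the sheet contents:

```python
import pandas as pd

# Hypothetical data standing in for the sheet contents.
rows = {'Name': ['a', 'b'], 'Number': [1, 2]}

try:
    # sheet_name is not a DataFrame constructor argument, so this fails.
    pd.DataFrame(rows, sheet_name='Sheet1')
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```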

Why is my csv file not printing after I import it?

import pandas as pd
tbl1 = pd.import_csv('sample_prices.csv')
tbl1.print()
I am still not getting any output. It does not even raise an error.
The code might be written this way.
import pandas as pd
tbl1 = pd.read_csv('sample_prices.csv')
print(tbl1)
pd.import_csv doesn't exist. You probably meant to use pd.read_csv instead.
import pandas as pd
tbl1 = pd.read_csv('sample_prices.csv')
print(tbl1)  # a DataFrame has no .print() method; use the built-in print()
That said, I'm not sure why it wouldn't raise an error...
If you have a custom function called import_csv, you'll want to call it like this:
import pandas as pd
tbl1 = import_csv('sample_prices.csv')
print(tbl1)  # assuming import_csv returns a DataFrame, which has no .print() method
...without the pd. prefix.
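For what it's worth, in a standard pandas install both pd.import_csv and tbl1.print() should fail with an AttributeError, so a completely silent run may mean the script was not actually executed. A quick check, using made-up data in place of sample_prices.csv:

```python
import io
import pandas as pd

# pandas exposes read_csv but no import_csv, so this is False.
has_import_csv = hasattr(pd, 'import_csv')
print(has_import_csv)

# Hypothetical stand-in for sample_prices.csv.
csv_text = "item,price\napple,1.2\npear,0.8\n"
tbl1 = pd.read_csv(io.StringIO(csv_text))

# A DataFrame has no print() method either; use the built-in print().
has_print_method = hasattr(tbl1, 'print')
print(has_print_method)
print(tbl1)
```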
It should be written like this:
import pandas as pd
tbl1 = pd.read_csv('sample_prices.csv')
print(tbl1)

Problem when importing Excel File with Pandas

I'm new to Python and was hoping someone could help me out.
I imported an Excel file using pandas just to play around with. However, when I try to do any additional analysis or coding on the data, it only uses the header row of the Excel file.
Here's one of the snippets I used:
import pandas as pd
df = pd.read_excel(r'C:\Users\at0789\Documents\Test File.xlsx')
data=list(df)
print(data)
Here's the output:
runfile('C:/Users/at0789/.spyder-py3/temp.py', wdir='C:/Users/at0789/.spyder-py3')
['Name', 'Number', 'Color', 'Date']
This is what my test file looks like:
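list(df) returns only the column labels, which is why you see just the header row. Keep the r prefix on the path string: without it, Python 3 reads \U as the start of a Unicode escape and raises a SyntaxError.
You also don't have to print the df in an interactive console or notebook; just evaluate it, like this:
import pandas as pd
df = pd.read_excel(r'C:\Users\at0789\Documents\Test File.xlsx')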
df
import pandas as pd
df = pd.read_excel(r'C:\Users\at0789\Documents\Test File.xlsx')
Here df is a DataFrame.
DataFrames have many built-in functions, so you can often do the work in fewer lines of code and with good performance.
One of the best features is that you can query the data much as you would with an SQL query.
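A short sketch of the difference between the column labels and the rows, plus an SQL-like filter. Only the column names come from the question's output; the row values are made up for illustration:

```python
import pandas as pd

# Hypothetical rows; only the column names come from the question's output.
df = pd.DataFrame({
    'Name': ['Ann', 'Bob'],
    'Number': [1, 2],
    'Color': ['red', 'blue'],
    'Date': ['2019-01-01', '2019-01-02'],
})

# Iterating a DataFrame yields its column labels, not its rows.
header_only = list(df)

# One list per row.
rows = df.values.tolist()

# SQL-like filtering: rows where Number > 1.
filtered = df.query('Number > 1')

print(header_only)
print(rows)
print(filtered)
```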

Columns names issues using pandas.read_csv

I am pretty new to python.
I am trying to import the SMS Spam Collection data using pandas' read_csv function.
The import went well.
But as the file does not have a header, I tried to set the column names (variable names "status" and "message") and ended up with an empty DataFrame.
Here is my code:
import numpy as np
import pandas as pd
file_loc=r"C:\Users\User\Documents\JP\SMSCollection.txt"
df=pd.read_csv(file_loc,sep='\t')
The above code works well; I got [5571 rows x 2 columns].
But when I add columns using the following line of code
df.columns=["status","message"]
I ended up with an empty df.
Any help on this?
Thanks
You could try to set the column names at read time. header=None tells pandas that the file has no header row, so the first line is kept as data:
df=pd.read_csv(file_loc,sep='\t',header=None,names=["status","message"])
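A self-contained sketch of the effect, using a tiny made-up tab-separated sample instead of SMSCollection.txt:

```python
import io
import pandas as pd

# Hypothetical tab-separated sample with no header line.
raw = "ham\tSee you at five\nspam\tYou won a prize\nham\tOk\n"

# Default: the first line is taken as the header, so one message is lost.
with_default = pd.read_csv(io.StringIO(raw), sep='\t')

# header=None keeps every line as data; names supplies the column labels.
df = pd.read_csv(io.StringIO(raw), sep='\t', header=None,
                 names=['status', 'message'])

print(len(with_default), len(df))
print(df)
```

With the default settings the first message becomes the column labels, so the frame is one row short.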

read Json file using Pandas

I am trying to read a JSON file using pandas' read_json function. I am getting a result, but not the one I want.
My result has the first row as a header (titles), and I want to ignore the first row in my result.
Below is my python code.
import json
import pandas as pd
result=pd.read_json('dummy_DB_clean.json')
print(result)
I tried pandas' json_normalize() function but did not get the desired output.
If any of you have come across this problem, please suggest a solution.
Thanks,
Try this:
import json
import pandas as pd
df=pd.read_json('dummy_DB_clean.json')
df.drop(df.head(1).index, inplace=True)
print(df)
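An alternative sketch that avoids inplace=True, assuming the JSON is a list of records. The sample below is made up, since the contents of dummy_DB_clean.json are not shown:

```python
import io
import pandas as pd

# Hypothetical records standing in for dummy_DB_clean.json.
json_text = '[{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]'

df = pd.read_json(io.StringIO(json_text))

# Positional slice that skips the first row and returns a new frame.
trimmed = df.iloc[1:].reset_index(drop=True)

print(trimmed)
```

The original df is left unchanged, which makes it easier to compare the two frames while debugging.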
