python not showing values in max() - python

I ran the following code in the PyCharm editor:
import pandas as pd
df=pd.read_csv('Train_UWu5bXk.csv')
df.min()
The output does not show any values:
C:\Users\krishna\PycharmProjects\pdproject\Scripts\python.exe
C:/Users/krishna/Python37-32/Scripts/phythonncode/load1pandas.py
Process finished with exit code 0
Can someone help me?

The problem is solved by wrapping the call in print(). A script, unlike an interactive session, does not echo the value of a bare expression:
import pandas as pd
df=pd.read_csv('Train_UWu5bXk.csv')
print(df.min())
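To see why the explicit print() matters, here is a minimal sketch using only the standard library. It assumes min([3, 1, 2]) as a stand-in for df.min(), and the helper name run_and_capture is made up for illustration. Compiling in "exec" mode mimics running a script file, while "single" mode mimics typing the line at an interactive prompt.

```python
import contextlib
import io


def run_and_capture(source, mode):
    """Execute source in the given compile mode and return what it wrote to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(compile(source, "<example>", mode), {})
    return buffer.getvalue()


# "exec" mode is how a script runs: the bare expression is evaluated, then discarded.
script_bare = run_and_capture("min([3, 1, 2])", "exec")

# "single" mode is how an interactive prompt runs: the value is echoed.
interactive_bare = run_and_capture("min([3, 1, 2])", "single")

# An explicit print() produces output in a script as well.
script_print = run_and_capture("print(min([3, 1, 2]))", "exec")

print(repr(script_bare), repr(interactive_bare), repr(script_print))
```

The same applies to df.min(): run from PyCharm as a script, the result is computed but never displayed unless it is passed to print().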

Related

Pandas shows a bunch of errors when running simple code

I am trying to write a program that chooses a recipe for me to cook at random, so I found a CSV file to get the recipes from, but when I tried running some code this happened.
I'm new to pandas, so sorry if this is stupid.
`
import random
import pandas as pd
import numpy as np
pd.read_csv('recipes.csv')
pd.isnull()
`
https://github.com/cweber/cookbook/blob/master/recipes.csv (this is the link to the csv file)
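Two things in the snippet above are likely to cause trouble: the DataFrame returned by pd.read_csv() is never stored, and pd.isnull() is called without the object it is supposed to inspect. Below is a rough sketch of one way to do it. The inline CSV text and its column names (name, minutes) are made up for illustration and will not match the real recipes.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for recipes.csv; the real file's columns may differ.
csv_text = "name,minutes\nPancakes,20\nSoup,\nCurry,45\n"

# Keep the DataFrame that read_csv returns instead of discarding it.
df = pd.read_csv(io.StringIO(csv_text))

# pd.isnull() needs an object to inspect; the DataFrame method needs no argument.
missing_per_column = df.isnull().sum()

# Pick one row (one recipe) at random.
choice = df.sample(n=1)
print(choice)
```

With the real file, io.StringIO(csv_text) would be replaced by the path 'recipes.csv'.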

VS Code Output result different from Terminal result

I just started learning python about a week ago and I am using VS code.
I am trying to run the following code:
import pandas as pd
df = pd.read_csv('ClassMarks.csv')
pd.set_option('display.max_rows', 85)
df['Final Marks'] = df['CPQ']+df['HW']+df['Tutorials']+df['Tests']
print(df)
In the Output window it gives:
File "/Users/User1/Desktop/test", line 1, in <module>
import pandas as pd
ImportError: No module named pandas
The Terminal, however, gives me the result table I expect, with no errors.
Does anyone know what is happening or how to fix it?
Thanks in advance!
Edit:
The problem was that my Output window was using Python 2.7.16 by default. To change it to Python 3.8.2, I found the answer in this video:
https://www.youtube.com/watch?v=06I63_p-2A4,
starting from min 43:00.
I hope this helps others!
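A quick way to confirm this kind of mismatch is to have the script report which interpreter is running it. The sketch below sticks to calls that work on both Python 2.7 and Python 3; the function name describe_interpreter is made up for illustration.

```python
import sys


def describe_interpreter(module_name="pandas"):
    """Return the interpreter path, its version, and whether module_name can be imported."""
    try:
        __import__(module_name)
        found = True
    except ImportError:
        found = False
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "module_found": found,
    }


info = describe_interpreter()
print(info)
```

Running this once from the Output window and once from the Terminal should show two different executable paths if the two are using different Python installations.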

Writing a UDF in Python using Pandas throws an error

We are trying to write Hive UDFs in Python to clean our data. The UDF we tried uses Pandas, and it throws an error.
Other Python code that does not use Pandas works fine. Kindly help us understand the problem. The Pandas code is below.
We have already tried several variations of the Pandas code, with no luck. Since the Python code without Pandas works, we are confused about why this one fails.
import sys
import pandas as pd
import numpy as np
for line in sys.stdin:
    df = line.split('\t')
    df1 = pd.DataFrame(df)
    df2=df1.T
    df2[0] = np.where(df2[0].str.isalpha(), df2[0], np.nan)
    df2[1] = np.where(df2[1].astype(str).str.isdigit(), df2[1], np.nan)
    df2[2] = np.where(df2[2].astype(str).str.len() != 10, np.nan,
                      df2[2].astype(str))
    #df2[3] = np.where(df2[3].astype(str).str.isdigit(), df2[3], np.nan)
    df2 = df2.dropna()
    print(df2)
I get this error:
FAILED: Execution Error, return code 20003 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. An error occurred when trying to close the Operator running your custom script.
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
I think you'll need to look at the detailed job logs for more information.
My first guess is that Pandas is not installed on a data node.
This answer looks appropriate for you if you intend to bundle dependencies with your job: https://stackoverflow.com/a/2869974/7379644
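If installing Pandas on every data node is not an option, the same row checks can be written with the standard library alone. The sketch below is not the asker's code but a swapped-in alternative. It assumes the first three tab-separated fields are the ones being validated (alphabetic, digits only, exactly 10 characters), and the names clean_row and clean_stream are made up for illustration. It also writes tab-separated lines rather than a printed DataFrame, since a streaming script's output is normally read back as delimited text.

```python
import io
import sys


def clean_row(line):
    """Return the row as a tab-separated string, or None if a check fails."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        return None
    if not fields[0].isalpha():   # first field: letters only
        return None
    if not fields[1].isdigit():   # second field: digits only
        return None
    if len(fields[2]) != 10:      # third field: exactly 10 characters
        return None
    return "\t".join(fields)


def clean_stream(stream, out):
    """Copy only the rows that pass every check from stream to out."""
    for line in stream:
        cleaned = clean_row(line)
        if cleaned is not None:
            out.write(cleaned + "\n")


# In-memory demonstration with made-up rows.
sample = io.StringIO(
    "alice\t42\t0123456789\n"
    "bob1\t7\t0123456789\n"
    "carol\tx\t0123456789\n"
    "dave\t5\tshort\n"
)
result = io.StringIO()
clean_stream(sample, result)
sys.stdout.write(result.getvalue())
```

In the actual UDF script the last lines would be replaced by a single call, clean_stream(sys.stdin, sys.stdout).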

No lines of output shown after importing pandas

I installed pandas through pip, but when I import it, the code runs but no output is shown at all, right after the import statement.
Here's a sample of my code
import xlrd, xlwt
print("1")
import pandas as pd
print("2")
from math import trunc
1 is printed, but 2 isn't. After 1 is printed, the script just hangs for a few seconds and terminates. This occurs regardless of the code written below the import statement. I also seem to get the same error for the openpyxl module. Does anyone know a fix to this?
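When a script dies silently during an import, it can help to run the import in a child process and look at its exit code and anything written to stderr. This is only a diagnostic sketch, assuming Python 3.7 or later. The helper name try_import is made up, and json is used as a stand-in module so the example runs anywhere; substitute "pandas" or "openpyxl" to test those.

```python
import subprocess
import sys


def try_import(module_name, timeout=60):
    """Import module_name in a child interpreter; return (exit code, stderr text)."""
    result = subprocess.run(
        # -X faulthandler asks Python to dump a traceback on a hard crash.
        [sys.executable, "-X", "faulthandler", "-c", "import " + module_name],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stderr


returncode, stderr_text = try_import("json")
print("exit code:", returncode)
print("stderr:", stderr_text or "(empty)")
```

A nonzero exit code with a traceback in stderr points at a crash inside the imported package or one of its compiled dependencies rather than at the code below the import.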

Read SAS file with pandas

I'm trying to use the pandas read_sas() function.
First, I create a SAS dataset by running this code in SAS:
libname tmp 'c:\temp';
data tmp.test;
do i=1 to 100;
x=rannor(0);
output;
end;
run;
Now, in IPython, I do this:
import numpy as np
import pandas as pd
%cd C:\temp
pd.read_sas('test.sas7bdat')
Pretty straightforward and seems like it should work. But I just get this error:
TypeError: read() takes at most 1 argument (2 given)
What am I missing here? I'm using pandas version 0.18.0.
According to the issue report linked below, this bug will be fixed in 0.18.1.
https://github.com/pydata/pandas/issues/12647
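To check whether an installation already includes the fix, the version string can be compared against the release named above. This is a small sketch that assumes 0.18.1 is the first fixed release, as the answer suggests. The helper names version_tuple and has_read_sas_fix are made up for illustration.

```python
import re


def version_tuple(version):
    """Convert '0.18.1' to (0, 18, 1), keeping only the leading numeric parts."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def has_read_sas_fix(installed, fixed_in="0.18.1"):
    """Return True if the installed version is at least the assumed fixed release."""
    return version_tuple(installed) >= version_tuple(fixed_in)


affected = has_read_sas_fix("0.18.0")
patched = has_read_sas_fix("0.18.1")
print(affected, patched)
```

In practice the installed version would come from pd.__version__ rather than a hard-coded string.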
