Get row values of CSV file having particular value using python - python

I have multiple
I want to get rows of Name.
I know how to get a row by index using a DataFrame, but I want to select it by row name, because the index might change.
For example:
(row=="Name" ) or (row== "name")
The output should look like:
Thanks in advance!

If you want the Name column:
name = df['Name']
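Since the question asks for a match on either "Name" or "name", here is a minimal sketch of a case-insensitive header lookup. The CSV contents and the variable names `csv_text`, `matches`, and `names` are made up for illustration, and it assumes "Name" is a column header rather than a row label.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice pass a file path to pd.read_csv.
csv_text = "ID,Name,Age\n1,Alice,30\n2,Bob,25\n"
df = pd.read_csv(io.StringIO(csv_text))

# Find headers equal to "name" regardless of case or surrounding spaces,
# so the lookup does not depend on the column's position.
matches = [col for col in df.columns if str(col).strip().lower() == "name"]

# Take the values of the first matching column, or an empty list if none match.
names = df[matches[0]].tolist() if matches else []
print(names)
```

If "Name" is instead a row label in the first column, the same comparison can be applied to that column's values to build a boolean mask.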

Related

Python: select the last 5 values of an Excel column

I wonder if it is possible to retrieve the last 5 values of an excel column instead of all the values in that same column.
Currently I am able to select all the data in the column with the following piece of code:
var= pd.read_excel("Path/MyFile.xlsx",'MS3',skiprows=15)
xDate = list(var['Date'])
Is there a way to retrieve the last 5 values in this column?
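Yes, you can use tail, which works like head but counts from the end:
var.tail(5)
This returns the last 5 rows of the whole DataFrame; use var['Date'].tail(5) to get just that column.
Alternatively, slice the list you already have:
xDate[-5:]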
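Both suggestions can be compared side by side. This is a small sketch with an in-memory DataFrame standing in for the Excel sheet; the sample values are invented.

```python
import pandas as pd

# Stand-in for the sheet loaded with pd.read_excel (values are made up).
var = pd.DataFrame(
    {
        "Date": ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"],
        "Value": [10, 11, 12, 13, 14, 15, 16, 17],
    }
)

# Option 1: take the last five entries of the column with tail().
last_five = var["Date"].tail(5).tolist()

# Option 2: convert the column to a list, then slice from the end.
xDate = list(var["Date"])
last_five_sliced = xDate[-5:]

print(last_five)
print(last_five_sliced)
```

Both approaches give the same five values; `tail` avoids building the full list first.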

Return Name of first record df.iloc[0] using Python numpy

I am new to Python and am trying to create a function that returns the Name of a record in a dataset using numpy. I have sorted the data in descending order on the column 'silver medals' to retrieve the record that I would like to return. I'm sure there is an easier way to get the top value, but I'm new to this and am trying to learn one step at a time.
df.sort_values(by=['silver medals'], inplace =True, ascending=False)
When I use
df.iloc[0]
to return the record details, I can see that the bottom of the output says:
Detail.....
Name: Country Name, dtype: object
I can use the line below to return the abbreviated country name:
df['ID'].iloc[0]
However, I am trying to return the full name. I believe the column that holds the full name is at position 0 and has no header, so I'm not sure how to reference it.
I have tried the following, but none of them work. What am I doing incorrectly? Any help would be appreciated.
df[0].iloc[0]
df[''].iloc[0]
df[' '].iloc[0]
Your index contains the country names.
The index cannot be accessed like a column because it's not a column.
You can put the index back into the columns by doing: df = df.reset_index()
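To make the answer concrete, here is a minimal sketch with invented data. It assumes the index holds the full country names and is named "Country"; the real dataset's index name may differ or be missing.

```python
import pandas as pd

# Invented data: full country names in the index, abbreviations in "ID".
df = pd.DataFrame(
    {"ID": ["USA", "GBR", "CHN"], "silver medals": [39, 17, 28]},
    index=pd.Index(["United States", "Great Britain", "China"], name="Country"),
)
df = df.sort_values(by="silver medals", ascending=False)

# The full name of the first record is its index label.
top_name = df.index[0]

# The same label is what df.iloc[0] prints as "Name: ..." at the bottom.
top_name_from_row = df.iloc[0].name

# After reset_index() the names become an ordinary column.
flat = df.reset_index()
top_name_from_column = flat["Country"].iloc[0]

print(top_name, top_name_from_row, top_name_from_column)
```

If the index has no name, `reset_index()` puts the labels in a column called `index` instead.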

How to feed new columns every time in a loop to a spark dataframe?

My task is to read the columns of a Cassandra table into a dataframe to perform some operations. If the table has 5 columns, I want to feed the data like this:
the first column in the first iteration,
the first and second columns in the second iteration, into the same dataframe,
and so on.
I need generic code. Has anyone tried something similar? Please help me out with an example.
This will work (pandas; pd.concat is used because DataFrame.append was removed in pandas 2.0):
df2 = pd.DataFrame()
for i in range(len(df.columns)):
    df2 = pd.concat([df2, df.iloc[:, 0:i+1]], sort=True)
Because the column names repeat on each iteration, concatenation aligns on those names and keeps adding rows rather than new columns.
You can extract the names from the dataframe's schema, then access those columns and use them however you want.
names = df.schema.names
columns = []
for name in names:
    columns.append(name)
    # df[columns] now holds the columns seen so far; use it the way you want
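The growing column selection can be sketched without a Spark session. The helper `column_prefixes` and the sample names below are hypothetical; in PySpark the list would come from `df.schema.names`, and each prefix could be passed to `df.select(*cols)`.

```python
from typing import Iterator, List, Sequence


def column_prefixes(names: Sequence[str]) -> Iterator[List[str]]:
    """Yield the first 1, 2, ..., n column names in turn."""
    for end in range(1, len(names) + 1):
        yield list(names[:end])


# Stand-in for df.schema.names (assumed column names).
names = ["id", "name", "city"]

prefixes = list(column_prefixes(names))
for cols in prefixes:
    # With a PySpark DataFrame this step would be: subset = df.select(*cols)
    print(cols)
```

Keeping the prefix logic separate from the dataframe makes it usable with any table, whatever its column count.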

how to change row index back to a column in pandas dataframe?

I have a dataframe which, after grouping, looks like this:
Now I want to move the row index (name) to be the first column. How do I do that?
I tried this:
gr.reset_index(drop=True)
but the result is like this:
The name field now holds the count information.
Don't pass the drop parameter: as its name suggests, it drops the index instead of keeping it. It is also probably better to rename the index, since you already have a name column:
gr.index.name = "company"
gr = gr.reset_index()
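A small sketch of this fix, using an invented grouped result in which both the index and the count column are called "name":

```python
import pandas as pd

# Invented grouped result: company names in the index, counts in "name".
gr = pd.DataFrame(
    {"name": [3, 2]},
    index=pd.Index(["acme", "globex"], name="name"),
)

# Rename the index first so it does not clash with the existing "name" column.
gr.index.name = "company"

# Without drop=True the index is kept and becomes the first column.
out = gr.reset_index()
print(out)
```

With `drop=True` the company names would be discarded and replaced by a plain 0..n-1 index, which matches what the question observed.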

Python Pandas printing out values of each cells

I am trying to fetch values from an Excel file using a pandas dataframe and print out the value of each cell.
I'm using read_excel() to populate the dataframe, and
I am looking for specific rows using the following line of code:
df.loc[df['Parcel_ID'] == parcel]
parcel being an arbitrary input from the user. And I simply use this to print out the cells:
row = df.loc[df['Parcel_ID'] == parcel]
print(row['Category'])
print(row['Sub_Category'])
.
.
(more values)
What I want is only the values from the cells, yet I also get dtypes, column names, and other junk that I don't want to see. How would I print only the value from each cell?
If you have several values in your row, you could use the following:
row['Category'].values.tolist()
row['Sub_Category'].values.tolist()
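IIUC the following should work:
print(row['Category'].iloc[0])
print(row['Sub_Category'].iloc[0])
What is returned is a Series, in your case a Series with a single element, which you index by position with .iloc[0] to get a scalar value. A plain [0] looks up the label 0 and fails unless the matching row happens to have that label.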
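Here is a minimal runnable sketch of printing bare cell values. The parcel data is invented; only the column names come from the question.

```python
import pandas as pd

# Invented data using the column names from the question.
df = pd.DataFrame(
    {
        "Parcel_ID": ["A1", "B2", "C3"],
        "Category": ["Land", "Building", "Land"],
        "Sub_Category": ["Farm", "Office", "Forest"],
    }
)

parcel = "B2"  # stands in for the user's input
row = df.loc[df["Parcel_ID"] == parcel]

# The matching row has label 1, not 0, so positional access is needed.
category = row["Category"].iloc[0]
sub_category = row["Sub_Category"].iloc[0]
print(category)
print(sub_category)

# If several rows can match, collect every value as a plain list.
all_categories = row["Category"].tolist()
print(all_categories)
```

Printing the scalar or the list avoids the dtype and name lines that come with printing a Series.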
How to find a value in one column (Column1name) from a value in another column (Column2name):
df = pd.read_excel('file.xlsm', 'SheetName')  # read one sheet of an Excel file
Then you have two ways to get that:
First way (returns an array of every match):
Value_Target = df.loc[df['Column2name'] == 'Value_Key', 'Column1name'].values
Second way (returns the first match only):
Value_Target = df['Column1name'][df['Column2name'] == 'Value_Key'].values[0]
Value_Key is the value you already have, in Column2name.
Value_Target is the value you want to find, in Column1name.
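The lookup can be sketched with a guard for the case where the key is absent, since `.values[0]` raises an IndexError on an empty result. The helper `lookup` and the sample data are hypothetical.

```python
import pandas as pd

# Invented data: keys in Column2name, targets in Column1name.
df = pd.DataFrame(
    {
        "Column1name": ["red", "green", "blue"],
        "Column2name": ["k1", "k2", "k3"],
    }
)


def lookup(frame, key):
    """Return the first Column1name value whose Column2name equals key, else None."""
    matches = frame.loc[frame["Column2name"] == key, "Column1name"]
    return matches.iloc[0] if not matches.empty else None


value_target = lookup(df, "k2")
missing = lookup(df, "nope")
print(value_target, missing)
```

Passing the mask and the column to a single `.loc` call also avoids chained indexing.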
