How to convert Pandas data frame into 2 D array [duplicate] - python

This question already has answers here:
Pandas convert dataframe to array of tuples
(10 answers)
Closed 3 years ago.
I have a pandas DataFrame:
X =
id var1 var2
0 20000049588638 3 61.62
1 100798486386 3 61.62
2 100799238114 3 61.62
I want to convert this into a simple 2D array (a list of tuples) so that I can write it to a Teradata database.
Required output:
X =
[(20000049588638,3,61.62),
(100798486386,3,61.62),
(100799238114,3,61.62)]
I tried this:
X = X.values.tolist()
But I am getting the following output:
[[20000049588638, '3', '61.62'],
[100798486386, '3', '61.62'],
[100799238114, '3', '61.62']]
I am not able to write this into the database.
Please advise.

As mentioned in the linked question, you can use itertuples() and wrap the result in a list:
list(X.itertuples(index=False, name=None))
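The output quoted in the question shows var1 and var2 as strings, which suggests those columns have object dtype; itertuples() on its own would carry the strings over. A minimal sketch, assuming the columns hold numeric strings and that casting them with astype is acceptable (the frame below is rebuilt by hand from the question, so its dtypes are an assumption):

```python
import pandas as pd

# Hand-built copy of the frame in the question.
# Assumption: var1 and var2 are stored as strings, as the quoted output suggests.
X = pd.DataFrame({
    "id": [20000049588638, 100798486386, 100799238114],
    "var1": ["3", "3", "3"],
    "var2": ["61.62", "61.62", "61.62"],
})

# Cast the string columns to numeric dtypes before building the tuples
X = X.astype({"var1": "int64", "var2": "float64"})

# index=False drops the index; name=None yields plain tuples instead of namedtuples
rows = list(X.itertuples(index=False, name=None))
print(rows)
```

Checking X.dtypes before and after the cast is a quick way to confirm whether the strings come from the frame itself or from a later step.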

Related

How to select rows from a pandas DataFrame by the data type of a feature's values when the feature contains more than one type of value [duplicate]

This question already has answers here:
Select row from a DataFrame based on the type of the object(i.e. str)
(3 answers)
Closed 3 months ago.
I have a dataframe with 3 features: id, name and point. I need to select the rows where the 'point' value is a string.
id  name  point
0   x     5
1   y     6
2   z     ten
3   t     nine
4   q     two
How can I split the dataframe based on the type of the values in one feature?
I tried to adapt the select_dtypes method but got lost. I also tried to split the dataset using
df[df[point].dtype == str] or df[df[point].dtype is str]
but neither worked.
Technically, the answer would be:
out = df[df['point'].apply(lambda x: isinstance(x, str))]
But this would also select rows containing a string representation of a number ('5').
If you want to select "strings" as opposed to "numbers", whether those are real numbers or string representations, you could use:
m = pd.to_numeric(df['point'], errors='coerce')
out = df[df['point'].notna() & m.isna()]
The question now is: what if you have '1A' or 'AB123' as a value?
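To make the difference between the two approaches concrete, here is a small sketch. The frame mirrors the question's data; the last two rows ('5' and '1A') are hypothetical additions used only to show the edge cases mentioned above.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [0, 1, 2, 3, 4, 5, 6],
    "name": ["x", "y", "z", "t", "q", "r", "s"],
    # Rows 5 and 6 are hypothetical: a numeric string and a mixed string
    "point": [5, 6, "ten", "nine", "two", "5", "1A"],
})

# Approach 1: keep every value whose Python type is str (includes '5')
by_type = df[df["point"].apply(lambda x: isinstance(x, str))]

# Approach 2: keep non-missing values that cannot be parsed as a number
m = pd.to_numeric(df["point"], errors="coerce")
non_numeric = df[df["point"].notna() & m.isna()]

print(by_type["point"].tolist())
print(non_numeric["point"].tolist())
```

With this mask, a value such as '1A' fails numeric parsing and is therefore kept with the strings; whether that is the desired behaviour depends on the data.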

Convert objects to numeric values [duplicate]

This question already has answers here:
Change Pandas String Column with commas into Float
(2 answers)
Closed 6 months ago.
I have a CSV file with a column full of numbers. These numbers can be formatted like 45.11, 1,234.33, 122.33, 10,222.22, etc.
Right now they show up as objects in my data frame, and I need to convert them to numeric. I have tried:
df['Value'].astype(str).astype(float)
But I am getting errors like this:
ValueError: could not convert string to float: '1,054.43'
Does anyone know how to handle these oddly formatted numbers?
This should do the job:
import pandas as pd

vals = {'Value': ["45.11", "1,234.33", "122.33", "10,222.22"]}
df = pd.DataFrame(vals)
# Strip the thousands separators, then cast the cleaned strings to float
df.Value = df.Value.apply(lambda x: x.replace(",", "")).astype(float)
print(df.Value)
Output:
0 45.11
1 1234.33
2 122.33
3 10222.22
Name: Value, dtype: float64
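A variation on the same idea, sketched with the vectorized .str accessor instead of apply with a lambda. It assumes every entry in the column is a string that uses a comma only as a thousands separator.

```python
import pandas as pd

df = pd.DataFrame({"Value": ["45.11", "1,234.33", "122.33", "10,222.22"]})

# regex=False treats "," as a literal character rather than a pattern
cleaned = df["Value"].str.replace(",", "", regex=False)
df["Value"] = cleaned.astype(float)

print(df["Value"].dtype)
```

If the data comes straight from a CSV file, pandas.read_csv also accepts a thousands argument (for example thousands=','), which may avoid the cleanup step altogether.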

Need to plot Pairplot for a dataframe that has duplicate indices [duplicate]

This question already has answers here:
dataframe to long format
(2 answers)
Reshape wide to long in pandas
(2 answers)
Closed 9 months ago.
I have a dataframe 'df' of shape (310, 7) and need to draw a pairplot for it, but I get ValueError: cannot reindex from a duplicate axis when I do it the usual way:
sns.pairplot(df,hue='Class')
ValueError: cannot reindex from a duplicate axis
The data is of this form:
[data]
P_incidence P_tilt L_angle S_slope P_radius S_Degree Class
0 38.505273 16.964297 35.112814 21.540976 127.632875 7.986683 Normal
1 54.920858 18.968430 51.601455 35.952428 125.846646 2.001642 Normal
2 44.362490 8.945435 46.902096 35.417055 129.220682 4.994195 Normal
3 48.318931 17.452121 48.000000 30.866809 128.980308 -0.910941 Normal
4 45.701789 10.659859 42.577846 35.041929 130.178314 -3.388910 Normal
I tried removing the duplicates using:
df.loc[df['L_angle'].duplicated(), 'L_angle'] = ''
But this converts the column to object dtype, and I am not able to undo it.
The expected output plot is as follows:
[expected]
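The title mentions duplicate indices, and this error message typically refers to repeated index labels rather than repeated values inside a column. A minimal sketch of checking for and removing duplicate labels, assuming the duplicates came from concatenating frames (the two small frames and their rounded values are made up for illustration):

```python
import pandas as pd

# Hypothetical pieces; concatenating them repeats the labels 0 and 1
part_a = pd.DataFrame({"P_tilt": [16.96, 18.97], "Class": ["Normal", "Normal"]})
part_b = pd.DataFrame({"P_tilt": [8.95, 17.45], "Class": ["Normal", "Normal"]})
df = pd.concat([part_a, part_b])

# False here means at least one index label occurs more than once
print(df.index.is_unique)

# Replace the index with a fresh 0..n-1 range and discard the old labels
df_unique = df.reset_index(drop=True)
print(df_unique.index.is_unique)

# The plotting call from the question could then be retried, e.g.:
# sns.pairplot(df_unique, hue='Class')
```

This leaves the column values and dtypes untouched, unlike overwriting duplicated values with empty strings.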

Creating a DataFrame in pandas: why does it empty the zip of tuples? [duplicate]

This question already has answers here:
Why can't I iterate twice over the same iterator? How can I "reset" the iterator or reuse the data?
(5 answers)
Closed 1 year ago.
import pandas as pd
sentences=['aaa','bbb','ccc']
labels = [1,2,3]
infos = zip(sentences , labels)
df_synthesize = pd.DataFrame(infos, columns = ['content','label'])
print(df_synthesize)
print(list(infos))
I use infos to initialize the dataframe; however, after the creation, infos yields an empty list:
print(list(infos))
[]
This is quite weird. Why does this happen?
pandas version : 1.1.5
Try this (pandas 1.1.5):
import pandas as pd
sentences = ['aaa', 'bbb', 'ccc']
labels = [1, 2, 3]
# Materialize the zip into a list so it can be iterated more than once
infos = list(zip(sentences, labels))
df_synthesize = pd.DataFrame(infos, columns=['content', 'label'])
print(df_synthesize)
Output
content label
0 aaa 1
1 bbb 2
2 ccc 3
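The underlying reason is that zip() returns a one-shot iterator: once something has consumed it (here, the DataFrame constructor), nothing is left for a second pass. A small pandas-free sketch of the same behaviour:

```python
sentences = ['aaa', 'bbb', 'ccc']
labels = [1, 2, 3]

infos = zip(sentences, labels)

# The first pass consumes the iterator
first_pass = list(infos)
# The second pass finds it already exhausted
second_pass = list(infos)

print(first_pass)
print(second_pass)
```

Wrapping the zip in list(), as in the answer above, stores the pairs so they can be reused as often as needed.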

Changing pandas column values into another format [duplicate]

This question already has answers here:
How to convert string representation of list to a list
(19 answers)
Closed 3 years ago.
The labels column in my test dataframe, test['labels'], looks like:
0 ['Edit Distance']
1 ['Island Perimeter']
2 ['Longest Substring with At Most K Distinct Ch...
3 ['Valid Parentheses']
4 ['Intersection of Two Arrays II']
5 ['N-Queens']
For each value in the column, which is a string representation of list ("['Edit Distance']"), I want to apply the function below to convert it into an actual list.
ast.literal_eval(VALUE HERE)
What is a straightforward way to do this?
Use:
import ast
test['labels'] = test['labels'].apply(ast.literal_eval)
print (test)
labels
0 [Edit Distance]
1 [Island Perimeter]
2 [Longest Substring with At Most K Distinct Ch]
3 [Valid Parentheses]
4 [Intersection of Two Arrays II]
5 [N-Queens]
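ast.literal_eval raises an exception when a string is not a valid Python literal, so a single malformed cell would stop the apply call. A hedged sketch of a tolerant variant; the helper name safe_parse and the malformed sample row are hypothetical, and returning None for bad cells is just one possible choice:

```python
import ast

import pandas as pd


def safe_parse(value):
    """Hypothetical helper: parse a list literal, or return None if parsing fails."""
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return None


test = pd.DataFrame({
    "labels": [
        "['Edit Distance']",
        "['Island Perimeter']",
        "['N-Queens'",  # hypothetical malformed entry (missing closing bracket)
    ]
})

test["labels"] = test["labels"].apply(safe_parse)
parsed = test["labels"].tolist()
print(parsed)
```

After the conversion the first two cells hold real Python lists, so operations such as len() or indexing work per row.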
