Can someone tell me why the columns are arranged like that, or how to fix it?
Thanks
# import libraries
import numpy as np
import pandas as pd
from time import time
import mysql.connector
from IPython.display import display # Allows the use of display() for DataFrames
data = pd.read_csv("car_dataset.csv", delimiter = ";")
# Display the first 5 rows
display(data.head(n=5))
I don't know what else to try.
If you look closely, your data is delimited by , not ;. Remove the delimiter parameter.
data = pd.read_csv("car_dataset.csv")
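To see why the wrong delimiter squashes everything into one column, here is a minimal sketch. The inline sample text and its column names are made up for illustration and are not taken from car_dataset.csv. The last call uses sep=None with the Python engine, which asks pandas to guess the separator.

```python
import io
import pandas as pd

# Hypothetical comma-delimited sample standing in for the real file.
sample = "make,model,year\nFord,Focus,2012\nFiat,Panda,2015\n"

# Wrong separator: each whole line is parsed as a single column.
wrong = pd.read_csv(io.StringIO(sample), delimiter=";")

# Default separator (comma): three separate columns.
right = pd.read_csv(io.StringIO(sample))

# sep=None with the Python engine lets pandas sniff the separator.
sniffed = pd.read_csv(io.StringIO(sample), sep=None, engine="python")

print(wrong.shape, right.shape, sniffed.shape)
```

If the one-column shape shows up for your own file, opening it in a text editor and checking which character separates the fields is usually the quickest check.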
import pandas as pd
tbl1 = pd.import_csv('sample_prices.csv')
tbl1.print()
and still not receiving anything? It does not even come up with an error.
The code might be written this way.
import pandas as pd
tbl1 = pd.read_csv('sample_prices.csv')
print(tbl1)
pd.import_csv doesn't exist. You probably meant to use pd.read_csv instead.
import pandas as pd
tbl1 = pd.read_csv('sample_prices.csv')
print(tbl1)
That said, I'm not sure why it wouldn't raise an error...
If you have a custom function called import_csv, you'll want to call it like this:
import pandas as pd
tbl1 = import_csv('sample_prices.csv')
tbl1.print()
...without the pd. prefix.
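On the question of whether an error should appear: a quick check, assuming a standard pandas install, suggests both pd.import_csv and a .print() method on a DataFrame raise AttributeError. The small DataFrame below is made-up stand-in data.

```python
import pandas as pd

# The pandas module has read_csv but no import_csv attribute.
print(hasattr(pd, "read_csv"))
print(hasattr(pd, "import_csv"))

# A DataFrame has no print() method either, so calling it fails.
tbl1 = pd.DataFrame({"price": [1.5, 2.0]})  # made-up stand-in data
try:
    tbl1.print()
    raised = False
except AttributeError as exc:
    raised = True
    print("AttributeError:", exc)
```

If no error shows up at all, the script is probably not the one being run, or its output is being swallowed somewhere.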
It should be written like this:
import pandas as pd
tbl1 = pd.read_csv('sample_prices.csv')
print(tbl1)
I've exported an Excel file to a CSV in which all the columns and entries look correct and normal. However, when I load it into a data frame and print the head, the output becomes messy and unreadable because the values no longer line up under their columns.
As you can see in the image, the values are not neatly under user_id.
https://imgur.com/a/gbWaTwi
I'm using the following code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import warnings
warnings.filterwarnings("ignore")
then
df1 = pd.read_csv('../doc.csv', low_memory=False)
df1.head
Call head() and print the result. Writing .head without parentheses only refers to the method; it does not return the rows.
print(df1.head())
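A short sketch of the difference, using a made-up three-row DataFrame in place of doc.csv:

```python
import pandas as pd

df1 = pd.DataFrame({"user_id": [101, 102, 103]})  # made-up stand-in data

method = df1.head      # no parentheses: a bound method, no rows are selected
rows = df1.head(2)     # with parentheses: a DataFrame holding the first 2 rows

print(callable(method))
print(type(rows).__name__)
print(rows)
```

Printing the bound method instead of calling it is what produces the long, unaligned dump of the whole frame.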
"test.csv" has columns "col_a", "col_b" and "col_c".
#import pandas
import pandas as pd
df = pd.read_csv('./data/test.csv',header=0,dtype={'col_a':object,'col_b':object,'col_c':object})
This code works well. I would like to change it to use a variable "key_word" as follows, but it does not work. Why? How should I modify this code?
#import pandas
import pandas as pd
key_word='col_a':object,'col_b':object,'col_c':object
df = pd.read_csv('./data/test.csv',header=0,dtype={key_word})
Make key_word a dictionary by initializing it like this:
key_word={'col_a':object,'col_b':object,'col_c':object}
Then pass it directly as dtype=key_word rather than dtype={key_word}. That should do the trick. Right now it cannot work, because the assignment without curly brackets is a syntax error.
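Put together, a minimal sketch might look like the following. The CSV text is made up to stand in for test.csv; only the column names come from the question.

```python
import io
import pandas as pd

# Made-up contents standing in for test.csv.
csv_text = "col_a,col_b,col_c\n001,x,1.50\n002,y,2.25\n"

# A real dict, built once and reusable across calls.
key_word = {"col_a": object, "col_b": object, "col_c": object}

# Pass the dict itself; dtype={key_word} would try to put the dict
# inside a set, which fails because dicts are not hashable.
df = pd.read_csv(io.StringIO(csv_text), header=0, dtype=key_word)

print(df.dtypes)
print(df["col_a"].tolist())
```

Because every column is read as object, values such as 001 keep their leading zeros instead of being parsed as integers.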
I am trying to open a few Excel files inside a directory and then do things with the data (like taking the average of one row across three files).
My main goal right now is just to display the information in each Excel file. I used the following code to do so. But when I display it, it prints elements '0' through '29', then skips 30-50, and then prints 51-80.
Here is a snip of my output on python:
import numpy as np
import scipy.io as sio
import scipy
import matplotlib.pyplot as plt
import os
import pandas as pd
from tkinter import filedialog
from tkinter import *
import matplotlib.image as image
import xlsxwriter
import openpyxl
import xlwt
import xlrd
#GUI
root=Tk()
root.withdraw() #closes tkinter window pop-up
path=filedialog.askdirectory(parent=root,title='Choose a file')
path=path+'/'
print('Folder Selected',path)
files=os.listdir(path)
length=len(files)
print('Files inside the folder',files)
Files=[]
for s in os.listdir(path):
    Files.append(pd.read_excel(path+s))
print(Files)
I'm quite sure your data is being correctly read. The dots between rows 29 and 51 show that there is more data there. pandas elides these rows, so your console looks cleaner. If you want to see all the rows, you could use the solution from this answer:
with pd.option_context('display.max_rows', None, 'display.max_columns', 3):
    print(Files)
Here None removes the display limit on rows, and 3 caps the number of columns displayed. The pandas documentation on options has more detail.
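To check the effect of the option without the Excel files, here is a small sketch on a made-up 100-row DataFrame. It compares the printed text with and without the row limit, assuming default pandas display settings.

```python
import pandas as pd

df = pd.DataFrame({"value": range(100)})  # made-up data, 100 rows

# Default settings: rows in the middle are elided in the printed text.
truncated = repr(df)

# Temporarily lift the row limit; the setting reverts after the block.
with pd.option_context("display.max_rows", None):
    full = repr(df)

print(len(truncated.splitlines()), len(full.splitlines()))
```

The context manager only changes how the frame is rendered; the data in df is identical in both cases.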
This is actually the standard way to print the data. Notice the ellipsis between 29 and 51:
29 7.8000 [cont.]
...
51 12.19999 [cont.]
You can still operate on every row. To get the number of rows in a dataframe, you can call
len(df.index)
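As a quick illustration that the elided rows are still there, this sketch counts and averages a made-up 100-row column; the name value is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"value": range(100)})  # made-up data, 100 rows

n_rows = len(df.index)            # counts every row, not just the printed ones
mean_value = df["value"].mean()   # computed over all 100 rows

print(n_rows, mean_value)
```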
I am using Spyder as my Python IDE.
I tried to run this Python code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
path = os.path.join(os.getcwd(), 'ex1data1.txt')  # avoids the invalid '\e' escape in the string literal
data = pd.read_csv(path, header = None, names = ['Population', 'Profit'])
data.head()
In theory it is supposed to show me a table of data since I am using
data.head()
at the end
but when I run it, it only shows:
I thought it was supposed to display a new window with the data table.
What is wrong?
You are calling data.head() in a .py script. Unlike an interactive console or notebook, a script does not echo the value of a bare expression, so the result is computed and then discarded. Try print(data.head()).
You want to print your data.head() so do that...
print(data.head())
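A minimal sketch of the two cases side by side. The three data lines are made up to stand in for ex1data1.txt, which is assumed to be a headerless two-column comma-separated file.

```python
import io
import pandas as pd

# Made-up lines standing in for ex1data1.txt (no header row).
text = "6.1101,17.592\n5.5277,9.1302\n8.5186,13.662\n"

data = pd.read_csv(io.StringIO(text), header=None, names=["Population", "Profit"])

data.head()          # bare expression: evaluated, then discarded in a script
print(data.head())   # explicit print: the table appears on standard output
```

Note that print writes text to the console; it does not open a separate window with a table.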