trying to import a excel csv (?!) file with panda - python

I am new to Python/Panda and I am trying to import the following file in Jupyter notebook via pd.read_
Initial file lines:
either pd.read_excel or pd.read_csv returned an error.
eliminating the first row allowed me to read the file but all csv data were not separated.

could you share the line of code you have used so far to import the data?
Maybe try this one here:
data = pd.read_csv(filename, delimiter=',')
It is always easier for people to help you if you share the relevant code accompanied by the error you are getting.

Related

Picking out a specific column in a table

My goal is to import a table of astrophysical data that I have saved to my computer (obtained from matching 2 other tables in TOPCAT, if you know it), and extract certain relevant columns. I hope to then do further manipulations on these columns. I am a complete beginner in python, so I apologise for basic errors. I've done my best to try and solve my problem on my own but I'm a bit lost.
This script I have written so far:
import pandas as pd
input_file = "location\\filename"
dataset = pd.read_csv(input_file,skiprows=12,usecols=[1])
The file that I'm trying to import is listed as having file type "File", in my drive. I've looked at this file in Notepad and it has a lot of descriptive bumf in the first few rows, so to try and get rid of this I've used "skiprows" as you can see. The data in the file is separated column-wise by lines--at least that's how it appears in Notepad.
The problem is when I try to extract the first column using "usecol" it instead returns what appears to be the first row in the command window, as well as a load of vertical bars between each value. I assume it is somehow not interpreting the table correctly? Not understanding what's a column and what's a row.
What I've tried: Modifying the file and saving it in a different filetype. This gives the following error:
FileNotFoundError: \[Errno 2\] No such file or directory: 'location\\filename'
Despite the fact that the new file is saved in exactly the same location.
I've tried using "pd.read_table" instead of csv, but this doesn't seem to change anything (nor does it give me an error).
When I've tried to extract multiple columns (ie "usecol=[1,2]") I get the following error:
ValueError: Usecols do not match columns, columns expected but not found: \[1, 2\]
My hope is that someone with experience can give some insight into what's likely going on to cause these problems.
Maybie you can try dataset.iloc[:,0] . With iloc you can extract the column or line you want by index(not only). [:,0] for all the lines of 1st column.
The file is incorrectly named.
I expect that you are reading a csv file or an xlsx or txt file. So the (windows) path would look similar to this:
import pandas as pd
input_file = "C:\\python\\tests\\test_csv.csv"
dataset = pd.read_csv(input_file,skiprows=12,usecols=[1])
The error message tell you this:
No such file or directory: 'location\\filename'

Everything in csv file converted to int64?

I'm trying to work with a csv file(link:https://github.com/mwaskom/seaborn-data/blob/master/tips.csv) on jupyter notebook. When I perform .dtypes to the file it only returns the result dtype('int64') and nothing else. How can I get other data types such as float64 and object for every columns?
Also I'm using the file from uploading the csv file to jupyter notebook. Used the code below to read it.
df = pd.read_csv('tips.csv')
It's weird because when I ran the exact same code yesterday it showed data types for each column. Does anyone know what the problem may be?
The local file on your laptop isn't what you think it is; check the output of
import csv
with open('tips.csv', newline='') as f:
reader = csv.reader(f)
for _ in range(2):
print(next(reader))
and confirm that your local copy of 'tips.csv' is missing data. Re-download the file from your linked source if needed.

Open an existing excel file and writing a new line in Python

I've searched through some old answers about my problem, but couldn't find an answer.
The problem: I want to open an existing Excel file, than write a new-line with a list and than save the current Excel-File.
My current code is:
import pandas
import pandas as pd
l_bsp = range(1,13)
df = pd.read_excel("Existing_file.xlsx")
df.loc[df.shape[0]+1] = l_bsp
print(df)
Right now it only changes my Dataframe without changing the Excel File. How to I add the list into the existing excel file?
Thank you.
Your code only read the Excel file without writing it back. You can use command such as df.to_excel(). For better understanding of the writing process, suggest you also take a look at the pandas user guide on Writing Excel files

Opening csv file in jupyter notebook

I tried to open a csv file in jupyter notebook, but it shows error message. And I didn't understand the error message. CSV file and jupyter notebook file is in the same directory. plz check the screenshot to see the error message
jupyter notebook code
csv file and jupyter notebook file is in same directory
As others have written it's a bit difficult to understand what exactly is your problem.
But why don't you try something like:
with open("file.csv", "r") as table:
for row in table:
print(row)
# do something
Or:
import pandas as pd
df = pd.read_csv("file.csv", sep=",")
# shows top 10 rows
df.head(10)
# do something
You can use the in-built csv package
import csv
with open('my_file.csv') as csv_file:
csv_reader = csv.reader(csv_file, delimiter=',')
for row in csv_reader:
print(row)
This will print each row as an array of items representing each cell.
However, using Jupyter notebook you should use Pandas to nicely display the csv as a table.
import pandas as pd
df = pd.read_csv("test.csv")
# Displays top 5 rows
df.head(5)
# Displays whole table
df
Resources
The csv module implements classes to read and write tabular data in CSV format. It allows programmers to say, “write this data in the format preferred by Excel,” or “read data from this file which was generated by Excel,” without knowing the precise details of the CSV format used by Excel.
Read More CSV: https://docs.python.org/3/library/csv.html
pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
Read More Pandas: https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html
Use pandas for csv reading.
import pandas as pd
df=pd.read_csv("AppleStore.csv")
You can used head/tail function to see the values. Use dtypes to see the types of all the values. You can check the documentation.

Writing value to given filed in csv file using pandas or csv module

Is there any way you can write value to specific place in given .csv file using pandas or csv module?
I have tried using csv_reader to read the file and find a line which fits my requirements though I couldn't figure out a way to switch value which is in the file to mine.
What I am trying to achieve here is that I have a spreadsheet of names and values. I am using JSON to update the values from the server and after that I want to update my spreadsheet also.
The latest solution which I came up with was to create separate sheet from which I will get updated data, but this one is not working, though there is no sequence in which the dict is written to the file.
def updateSheet(fileName, aValues):
with open(fileName+".csv") as workingSheet:
writer = csv.DictWriter(workingSheet,aValues.keys())
writer.writeheader()
writer.writerow(aValues)
I will appreciate any guidance and tips.
You can try this way to operate the specified csv file
import pandas as pd
a = ['one','two','three']
b = [1,2,3]
english_column = pd.Series(a, name='english')
number_column = pd.Series(b, name='number')
predictions = pd.concat([english_column, number_column], axis=1)
save = pd.DataFrame({'english':a,'number':b})
save.to_csv('b.csv',index=False,sep=',')

Categories