Read data of a CSV file to create a new CSV file - python

I have some data in a CSV file. As you can see in the code, I can read the file and print the info I need. The problem is when I try to create a new CSV file with some of the info from the original CSV file. I would like to save my analyzed info in a new CSV, but I don't know how to use the original info to make a new file.
Data.csv
import csv
with open('Data.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        analyzed = (row[0],row[3],row[3]<0.25)
        print(analyzed)

You probably want to use pandas when it comes to CSV files or table-like data:
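import pandas as pd
# DataFrame.from_csv was removed in pandas 1.0; read_csv is its replacement
df_data = pd.read_csv('Data.csv')
# Analyze
for index, row in df_data.iterrows():
    pass
# index=False keeps the row index out of the output file
df_data.to_csv('new_Data.csv', index=False)
For reading you have several options like
pandas.read_csv
pandas.read_table
(pandas.DataFrame.from_csv, which older answers often show, no longer exists in current pandas) and, as you see, use
pandas.DataFrame.to_csv
to save your transformed or newly created DataFrame.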
For installation run
pip install pandas
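If installing pandas is not an option, the same result can probably be reached with the standard csv module that the question already uses. The sketch below is only an illustration: the function name write_analyzed, the output file name, and the choice to skip rows whose fourth column is not numeric (for example a header row) are assumptions, not something stated in the question.

```python
import csv

def write_analyzed(src_path, dst_path, threshold=0.25):
    """Write columns 0 and 3 of src_path to dst_path, plus a below-threshold flag."""
    with open(src_path, newline='') as src, open(dst_path, 'w', newline='') as dst:
        reader = csv.reader(src, delimiter=',')
        writer = csv.writer(dst)
        for row in reader:
            try:
                # csv.reader yields strings, so convert before comparing
                value = float(row[3])
            except (ValueError, IndexError):
                # assumption: skip header rows and rows that are too short
                continue
            writer.writerow([row[0], row[3], value < threshold])

if __name__ == '__main__':
    write_analyzed('Data.csv', 'new_Data.csv')
```

Converting with float() matters because comparing a string with 0.25 raises a TypeError in Python 3.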

Related

How do I write data to an existing file in Excel using Pandas?

I used the following code to read data in file_1 then write that to a new file_2.
import pandas as pd
inventory = pd.read_excel('file_1.xlsx', skiprows=3)
inventory.to_excel('file_2.xlsx')
file_2 is newly created each time. How do I write the data to a specific tab in an existing file without clearing out other tabs that contain data?
ExcelWriter can be used to append to an existing Excel file using mode='a'. Append mode is only supported by the openpyxl engine, so pass engine='openpyxl'. Specify the sheet name with the sheet_name parameter.
with pd.ExcelWriter('file_2.xlsx', mode='a', engine='openpyxl') as writer:
    inventory.to_excel(writer, sheet_name='Sheet_name_1')

Opening csv file in jupyter notebook

I tried to open a CSV file in a Jupyter notebook, but it shows an error message that I don't understand. The CSV file and the notebook file are in the same directory. Please check the screenshot to see the error message.
As others have written, it's a bit difficult to understand what exactly your problem is.
But why don't you try something like:
with open("file.csv", "r") as table:
    for row in table:
        print(row)
        # do something
Or:
import pandas as pd
df = pd.read_csv("file.csv", sep=",")
# shows top 10 rows
df.head(10)
# do something
You can use the built-in csv module:
import csv
with open('my_file.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    for row in csv_reader:
        print(row)
This will print each row as a list of strings, one per cell.
However, in a Jupyter notebook it is nicer to use pandas, which displays the CSV as a table.
import pandas as pd
df = pd.read_csv("test.csv")
# Displays top 5 rows
df.head(5)
# Displays whole table
df
Resources
The csv module implements classes to read and write tabular data in CSV format. It allows programmers to say, “write this data in the format preferred by Excel,” or “read data from this file which was generated by Excel,” without knowing the precise details of the CSV format used by Excel.
Read More CSV: https://docs.python.org/3/library/csv.html
pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
Read More Pandas: https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html
Use pandas for csv reading.
import pandas as pd
df=pd.read_csv("AppleStore.csv")
You can use the head/tail functions to see the values. Use dtypes to see the types of all the columns. You can check the documentation.

how to import filtered rows before reading csv pandas

Hi, I have to load a large number of CSV files into a pandas DataFrame. Can I filter the data from these CSV files before loading it, so that I don't get a memory error?
In the existing setup it gives me a memory error.
I have a column Location which has 32 values, but I only want 3-4 locations, filtered before importing.
Is this possible?
You can use the csv library to read line by line and keep only the records you need:
import csv
with open('names.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(row['first_name'], row['last_name'])
After that you can save your filtered rows to a CSV file using writerow.
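The snippet above only prints two columns, so here is a minimal sketch of the filtering step itself. It assumes each file has a header row containing a Location column, as described in the question. The function name filter_csv_by_location, the file names, and the location values are hypothetical.

```python
import csv

def filter_csv_by_location(src_path, dst_path, wanted_locations, column='Location'):
    """Stream src_path row by row and copy only the rows whose column value is wanted."""
    wanted = set(wanted_locations)
    kept = 0
    with open(src_path, newline='') as src, open(dst_path, 'w', newline='') as dst:
        reader = csv.DictReader(src)
        # reuse the input header so the output has the same columns
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if row[column] in wanted:
                writer.writerow(row)
                kept += 1
    return kept

if __name__ == '__main__':
    # hypothetical file names and location values
    filter_csv_by_location('big_file.csv', 'filtered.csv', ['London', 'Paris', 'Berlin'])
```

Only one row is held in memory at a time, so the smaller filtered file can then be loaded with pandas.read_csv as usual.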

Parsing and saving the rows of excel file using pandas

I have an Excel file with a column that has some text in each row.
I'm using pandas pd.read_excel() to open this .xlsx file. I would then like to save every row of this column as a distinct .txt file containing the text from that row. Can this be done via pandas?
The basic idea would be to use an iterator to loop over the rows, opening each file and writing the value in, something like:
import pandas as pd
df = pd.read_excel('test.xlsx')
for i, value in enumerate(df['column']):
    with open(f'row-{i}.txt', 'w') as fd:
        # str() guards against non-text cells such as numbers or empty (NaN) values
        fd.write(str(value))

How to delete rows (NOT columns) in a csv file

I am trying to delete a particular row (NOT a column) in a csv file for a class project. When I deleted columns I put:
r=row
r[22], r[21]
# and so on
So how do I specify that I want to delete rows? I am working with census data and want to get rid of that extra row of headers that are always in census tables.
Thank you for any help.
Convert your csv reader to a list and slice the appropriate indexes off:
import csv
# in Python 3, open csv files in text mode with newline='' (not 'rb')
with open('file.csv', newline='') as f:
    reader = csv.reader(f)
    rows = list(reader)[1:] # no more header row
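Slicing only changes the list in memory, so the rows still have to be written back out. This is a minimal sketch of dropping one row by position and saving the result. The helper name drop_row and the file names are hypothetical, and it assumes the file is small enough to hold in memory.

```python
import csv

def drop_row(src_path, dst_path, row_index):
    """Copy src_path to dst_path, leaving out the row at row_index (0-based)."""
    with open(src_path, newline='') as src:
        rows = list(csv.reader(src))
    del rows[row_index]  # raises IndexError if the file has no such row
    with open(dst_path, 'w', newline='') as dst:
        csv.writer(dst).writerows(rows)

if __name__ == '__main__':
    # e.g. remove a second header line (index 1) while keeping the first
    drop_row('file.csv', 'file_clean.csv', 1)
```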
Use pandas; it makes handling and editing data and files easy.
You can open your csv file and convert it to a pandas dataframe with:
import pandas
df = pandas.read_csv('file.csv')
After that you can use the drop method:
df = df.drop(df.index[0])
In this example I'm deleting the row with index 0, which is the first row after the header line that read_csv uses for the column names. Note that drop returns a new dataframe, so assign the result.
Pandas documentation
