Loading a CSV file where the data is all in one column - python

I am importing a CSV file whose data all ends up in a single column (in the underlying text file the values are separated by ";").
Is there any way to load the data into Anaconda (using pandas) so that it is in separate columns, or can it be split into columns afterwards?
The data can be found at the following web-address (this is data about sunspots):
http://www.sidc.be/silso/INFO/snmtotcsv.php
From this website http://www.sidc.be/silso/datafiles
I have managed to do this so far:
# Start code by loading the pandas command set
from pandas import *
#Initial setup commands
import warnings
warnings.simplefilter('ignore', FutureWarning)
import matplotlib
matplotlib.rcParams['axes.grid'] = True # show gridlines by default
%matplotlib inline
from scipy.stats import spearmanr
#load data from CSV file
startdata = read_csv('SN_m_tot_V2.0.csv',header=None)
startdata = startdata.reset_index()

I received an answer elsewhere; the lines of code that take into account both the lack of column headings and the semicolon separator are:
colnames=['Year','Month','Year (fraction)','Sunspot number','Std dev.','N obs.','Provisional']
ssdata=read_csv('SN_m_tot_V2.0.csv',sep=';',header=None,names=colnames)
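For anyone who wants to try the call without downloading the file, here is a minimal sketch. It assumes the file has no header row and uses ";" as the separator, as described above; the two sample rows are made up for illustration and are not real sunspot values.

```python
import io

import pandas as pd

# Made-up rows in the same layout as the file: no header line, ';' separator.
sample = (
    "1900;01;1900.042;10.0;1.0;5;1\n"
    "1900;02;1900.123;20.0;1.5;6;1\n"
)

colnames = ['Year', 'Month', 'Year (fraction)', 'Sunspot number',
            'Std dev.', 'N obs.', 'Provisional']

# header=None stops pandas from using the first data row as column names;
# names= supplies the headings instead.
ssdata = pd.read_csv(io.StringIO(sample), sep=';', header=None, names=colnames)
print(ssdata)
```

Importing pandas as `pd` rather than with `from pandas import *` keeps it clear where `read_csv` comes from.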

Related

Plotting multiple graphs from multiple text files in python

I have multiple text files in a directory. The first line of each text file is the header line. The remaining lines hold columns of data. I have to plot the 7th column against the 5th column for each text file, and I want to plot all the graphs with a loop in a single script. Can anyone please help me do this? Thank you in advance.
You can use pandas and matplotlib.pyplot:
import matplotlib.pyplot as plt
import pandas as pd
# sep= accepts the separator of your data i.e. ' ' space ',' comma etc
table = pd.read_csv('your_file_name.txt', sep=' ')
table.plot(x='header_of_5th_col', y='header_of_7th_col')
I also suggest checking the pandas documentation on loading and plotting data.
You can then put the read_csv and table.plot lines in a loop to plot every graph you need.
Code for getting all files in a specified directory:
import os
files = os.listdir("path/to/directory")
print(files)
For reading the files I would suggest the pandas library, and for plotting, matplotlib.
A more detailed solution would need more information about exactly what data is given and what output is expected,
for example the first few lines of one of the files and a rough sketch (made in Paint or similar) of what the result should look like.
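Putting the two answers together, a loop over the directory might look like the sketch below. It assumes space-separated files with a header line and at least seven columns; the function name `plot_col7_vs_col5` and the demo files are hypothetical, created only so the example runs on its own.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd


def plot_col7_vs_col5(directory, sep=" "):
    """Plot the 7th column against the 5th for every .txt file in directory."""
    fig, ax = plt.subplots()
    for path in sorted(Path(directory).glob("*.txt")):
        table = pd.read_csv(path, sep=sep)  # first line is used as the header
        # iloc is 0-based: position 4 is the 5th column, position 6 the 7th.
        ax.plot(table.iloc[:, 4], table.iloc[:, 6], label=path.name)
    ax.set_xlabel("5th column")
    ax.set_ylabel("7th column")
    ax.legend()
    return ax


# Hypothetical demo data: two small files with seven columns each.
demo_dir = tempfile.mkdtemp()
for name in ("a.txt", "b.txt"):
    rows = ["c1 c2 c3 c4 c5 c6 c7"]
    rows += [f"0 0 0 0 {i} 0 {i * i}" for i in range(5)]
    Path(demo_dir, name).write_text("\n".join(rows) + "\n")

ax = plot_col7_vs_col5(demo_dir)
ax.figure.savefig(Path(demo_dir, "all_files.png"))
```

Selecting columns by position avoids having to know the header names; to get one figure per file instead of one shared figure, move the `plt.subplots()` call inside the loop.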

Reading .ASC format file in python

I am dealing with a .asc file that contains weight and height data. I just want to compute the BMI of each person from this data. I cannot make sense of the dataframe produced by reading the file.
import pandas as pd
df = pd.read_table("data.asc")
I am not able to make sense of the result that I get. Please help me out.
I recently had to work with a file with the extension "asc". My solution was the following:
I opened the file in a text editor and checked which separator it used, then converted it into a spreadsheet. In my case, I turned the document into a "csv" file.
After that I ran:
import pandas as pd
df = pd.read_csv('path to your file')
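The conversion step can often be skipped by passing the separator straight to pandas. The sketch below assumes a tab-separated file; the column names `weight_kg` and `height_cm` and the sample values are hypothetical, since the real layout of the .asc file is not shown in the question.

```python
import io

import pandas as pd

# Hypothetical file contents; for a real file, read the text with open("data.asc").
raw = "weight_kg\theight_cm\n70\t175\n80\t160\n"

# Step 1: inspect the first line with repr() so the separator becomes visible.
first_line = raw.splitlines()[0]
print(repr(first_line))

# Step 2: pass the separator found in step 1 to read_csv.
df = pd.read_csv(io.StringIO(raw), sep="\t")

# Step 3: BMI = weight in kg divided by the square of height in metres.
df["bmi"] = df["weight_kg"] / (df["height_cm"] / 100) ** 2
print(df)
```

If the file turns out to use runs of spaces instead of tabs, `sep=r"\s+"` is the usual alternative.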

Problem with csv data imported on jupyter notebook

I'm new on this site, so please be indulgent if I make a mistake :)
I recently imported a CSV file into my Jupyter notebook for a student project. I want to use the data in specific columns of this file. The problem is that after import, the file appears as a table with 5286 rows (one per measurement date and hour) in a single column (which packs together all the variables I want to use, separated by ;).
I don't know how to turn this into a regular table.
I used this code to import my CSV from my board:
import pandas as pd
data = pd.read_csv('/work/Weather_data/data 1998-2003.csv','error_bad_lines = false')
Output:
Desired output: the same data in multiple columns, separated on ;.
You can try this:
import pandas as pd
data = pd.read_csv('<location>', sep=';')
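To see why the separator matters, the sketch below reads the same text twice. The sample rows and column names are invented for illustration; only the ";" separator comes from the question.

```python
import io

import pandas as pd

# Invented sample in the same style: one header line, values separated by ';'.
sample = (
    "date;temperature;humidity\n"
    "1998-01-01 00:00;3.2;81\n"
    "1998-01-01 01:00;3.0;83\n"
)

# Default separator is ',', so each line lands in a single column.
one_column = pd.read_csv(io.StringIO(sample))

# With sep=';' every variable gets its own column.
split = pd.read_csv(io.StringIO(sample), sep=';')

print(one_column.shape, split.shape)
```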

Python pandas: create dataframe from CSV embedded within a web txt file

I am trying to import CSV-formatted data into a pandas dataframe. The CSV data is inside a .txt file that is located at a web URL. The issue is that I only want to import the part (or parts) of the .txt file that is formatted as CSV (see image below). Essentially I need to skip the first 9 rows and then import rows 10-16 as CSV.
My code
import csv
import pandas as pd
import io
url = "http://www.bom.gov.au/climate/averages/climatology/windroses/wr15/data/086282-3pmMonth.txt"
df = pd.read_csv(io.StringIO(url), skiprows = 9, sep =',', skipinitialspace = True)
df
I get a lengthy error message that ultimately says "EmptyDataError: No columns to parse from file".
I have looked at similar examples, such as "Read .txt file with Python Pandas - strings and floats", but this case is different.
The code above parses the URL string itself as CSV content rather than the text fetched from that URL. To see what I mean, take out the skiprows parameter and then show the data frame. You'll see this:
Empty DataFrame
Columns: [http://www.bom.gov.au/climate/averages/climatology/windroses/wr15/data/086282-3pmMonth.txt]
Index: []
Note that the columns are the URL itself.
Import requests (you may have to install it first) and then try this:
content = requests.get(url).content
df = pd.read_csv(io.StringIO(content.decode('utf-8')),skiprows=9)
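To keep only rows 10-16, `nrows` can be combined with `skiprows`. The sketch below works on a locally built string so it runs without network access; the preamble, column names, and values are hypothetical stand-ins, and it assumes line 10 is the header followed by six data rows.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded text. With the real file it would be:
#   text = requests.get(url).content.decode('utf-8')
preamble = [f"Preamble line {i}" for i in range(1, 10)]   # lines 1-9
table = ["Month, Calm, North"]                            # line 10: header
table += [f"{m}, {m}, {m * 2}" for m in range(1, 7)]      # lines 11-16: data
footer = ["Footer note that is not part of the table"]
text = "\n".join(preamble + table + footer) + "\n"

df = pd.read_csv(
    io.StringIO(text),
    skiprows=9,             # skip the nine preamble lines
    nrows=6,                # read six data rows after the header, then stop
    skipinitialspace=True,  # drop the space that follows each comma
)
print(df)
```

`nrows` counts data rows and excludes the header, so rows 10-16 correspond to `nrows=6` under the assumption stated above.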

python pandas read_excel returns an AssertionError: importing a file with images

I can't use the read_excel method from the pandas library in my IPython notebook.
After some testing and cleaning of the Excel file, I understood that there is a complete column of drawings (or images). When I deleted this column, the error message stopped. Does somebody know how to configure read_excel to collect only the data? This is my code:
import pandas as pd
import os
# File selection
userfilepath = r'C:\Temp'
filename = "exportCS12.xlsx"
filenameCS12 = os.path.join(userfilepath, filename)
print(filenameCS12)
# workbook upload
df = pd.read_excel(filenameCS12, sheet_name='Sheet1')
The pandas import was failing because the Excel file was not clean. Problem solved with openpyxl, which can navigate the workbook and read only the validated areas.
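The answer does not show its openpyxl code, so the following is only a sketch of one way to do it: read cell values from a chosen column range with openpyxl and build the dataframe by hand. The workbook is created in memory so the example is self-contained; the sheet contents and the column range are assumptions.

```python
import io

import openpyxl
import pandas as pd

# Build a small workbook in memory as a stand-in for exportCS12.xlsx.
wb = openpyxl.Workbook()
ws = wb.active
ws.title = "Sheet1"
ws.append(["id", "value"])
ws.append([1, 10.5])
ws.append([2, 20.0])
buffer = io.BytesIO()
wb.save(buffer)
buffer.seek(0)

# Reopen it and read only the cell values in columns 1-2 (assumed data range).
# values_only=True yields plain Python values instead of Cell objects.
workbook = openpyxl.load_workbook(buffer, data_only=True)
sheet = workbook["Sheet1"]
rows = sheet.iter_rows(min_col=1, max_col=2, values_only=True)
header = next(rows)  # first row holds the column names
df = pd.DataFrame(list(rows), columns=list(header))
print(df)
```

For a real file, pass the path (for example `filenameCS12`) to `load_workbook` and adjust `min_col`/`max_col` so that the column of drawings is left out.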
