I am trying to load data from a web source and save it as an Excel file, but I'm not sure how to do it. What should I do?
import requests
import pandas as pd
import xmltodict
url = "https://www.kstan.ua/sitemap.xml"
res = requests.get(url)
raw = xmltodict.parse(res.text)
data = [[r["loc"], r["lastmod"]] for r in raw["urlset"]["url"]]
print("Number of sitemaps:", len(data))
df = pd.DataFrame(data, columns=["links", "lastmod"])
df.to_csv("output.csv", index=False)
Or, to write an Excel file directly:
df.to_excel("output.xlsx")
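The list comprehension above assumes every sitemap entry has a lastmod value, which is optional in sitemaps. A minimal sketch of a more tolerant parse follows. It swaps xmltodict for the standard library's xml.etree.ElementTree and runs on an inline sample document rather than a live request; SAMPLE_XML, the example.com URLs and the helper name sitemap_to_frame are made up for illustration.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Namespace used by standard sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample data standing in for res.text.
SAMPLE_XML = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2023-01-01</lastmod></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""


def sitemap_to_frame(xml_text):
    """Return one row per <url>, leaving lastmod empty when it is missing."""
    root = ET.fromstring(xml_text)
    rows = []
    for url in root.findall("sm:url", SITEMAP_NS):
        rows.append({
            "links": url.findtext("sm:loc", default=None, namespaces=SITEMAP_NS),
            "lastmod": url.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS),
        })
    return pd.DataFrame(rows, columns=["links", "lastmod"])


df = sitemap_to_frame(SAMPLE_XML)
print(df)
```

From here, df.to_excel("output.xlsx", index=False) writes the file; pandas needs an Excel engine such as openpyxl installed for that step.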
You can write the DataFrame to Excel using the pandas ExcelWriter, like this:
import pandas as pd
with pd.ExcelWriter('path_to_file.xlsx') as writer:
    dataframe.to_excel(writer)
If you want to create multiple sheets in the same file:
with pd.ExcelWriter('csv_s/results.xlsx') as writer:
    same_res.to_excel(writer, sheet_name='same')
    diff_res.to_excel(writer, sheet_name='sheet2')
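To check that both sheets really end up in one workbook, the round trip can be sketched in memory. This is only an illustration: the contents of same_res and diff_res are invented, the workbook is written to a BytesIO buffer instead of a file, and the code assumes openpyxl as the Excel engine (it skips the write if openpyxl is not installed).

```python
import importlib.util
import io

import pandas as pd

# Hypothetical sample frames; the real same_res / diff_res come from your own code.
same_res = pd.DataFrame({"key": [1, 2], "value": ["a", "b"]})
diff_res = pd.DataFrame({"key": [3], "value": ["c"]})

sheets = None
if importlib.util.find_spec("openpyxl") is not None:
    buffer = io.BytesIO()
    # The context manager saves the workbook when the block exits.
    with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
        same_res.to_excel(writer, sheet_name="same", index=False)
        diff_res.to_excel(writer, sheet_name="sheet2", index=False)

    # sheet_name=None reads every sheet into a dict keyed by sheet name.
    buffer.seek(0)
    sheets = pd.read_excel(buffer, sheet_name=None)
    print(list(sheets))
else:
    print("openpyxl is not installed, so the .xlsx write was skipped")
```

Replacing the buffer with a path such as 'csv_s/results.xlsx' writes the same workbook to disk.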
I want to convert my CSV file to Excel, but the first line of the CSV gets read as the header.
I first created a CSV from the lists below, then used pandas to convert it to Excel.
import csv
import pandas as pd
id=["id",1,2,3,4,5]
name=["name","Salma","Ahmad","Manar","Mustapha","Zainab"]
age=["age",14,12,15,13,10]
# this is how I created the csv file
Csv='path/csvfile.csv'
open_csv=open(Csv, 'w')
outfile=csv.writer(open_csv)
outfile.writerows([id]+[name]+[age])
open_csv.close()
#Excel file
Excel='path/Excelfile.xlsx'
Excel_open=open(Excel, 'w')
csv_file=pd.read_csv(Csv)
csv_file.to_excel(Excel)
This is what I get from this code
"Results"
I want the Id title to be in the same column as name and age
I would suggest this instead:
import pandas as pd
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "name": ["Salma", "Ahmad", "Manar", "Mustapha", "Zainab"],
    "age": [14, 12, 15, 13, 10],
})
# to_excel returns None, so there is nothing to assign
df.to_excel("excel_file.xlsx", index=False)
Building the DataFrame from a dict like this is simpler and easier to read.
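If the intermediate CSV is still needed, the likely cause of the header problem is that writerows was given each list as a row, so the file's first line is the whole id list. A minimal sketch of one fix, assuming the same three lists, is to transpose them with zip so that each CSV row holds one record. The file is replaced by an in-memory StringIO buffer here, and the variable names ids, names and ages are chosen for the example.

```python
import csv
import io

import pandas as pd

ids = ["id", 1, 2, 3, 4, 5]
names = ["name", "Salma", "Ahmad", "Manar", "Mustapha", "Zainab"]
ages = ["age", 14, 12, 15, 13, 10]

# zip turns the three column lists into rows:
# ("id", "name", "age"), (1, "Salma", 14), ...
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerows(zip(ids, names, ages))

# The first CSV line is now the real header, so read_csv picks it up correctly.
buffer.seek(0)
df = pd.read_csv(buffer)
print(df)
```

Calling df.to_excel(..., index=False) on this frame then gives three columns titled id, name and age.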
I want to know if there is a simple way to get a DataFrame from an .xlsm file. I tried plain pandas with pd.ExcelFile, but it doesn't read the data correctly.
So for now I have this:
import xlrd
import pandas as pd
cartera_improd = xlrd.open_workbook("CARTERA IMPRODUCTIVA - FORMATOV1.xlsm")
base_ici = cartera_improd.sheet_by_name("BASE ICI")
print (base_ici.row_values(1))
print (base_ici.nrows)
data_ici = list()
for i in range(base_ici.nrows):
    data_ici.append(base_ici.row_values(i))
data_ici = pd.DataFrame(data_ici)
To read an .xlsm file you can use pd.read_excel directly (pass sheet_name if the data is not on the first sheet):
import pandas as pd
df = pd.read_excel('CARTERA IMPRODUCTIVA - FORMATOV1.xlsm', sheet_name='BASE ICI')
print(df.head())
How can I convert the output I get from a PrettyTable to a pandas DataFrame and save it as an Excel file?
This is my code that produces the PrettyTable output:
import difflib
from prettytable import PrettyTable
prtab = PrettyTable()
prtab.field_names = ['Item_1', 'Item_2']
for item in Items_2:
    prtab.add_row([item, difflib.get_close_matches(item, Items_1)])
print(prtab)
I'm trying to convert this to a pandas DataFrame, but I get an error saying "DataFrame constructor not properly called!". My conversion code is shown below:
AA = pd.DataFrame(prtab, columns = ['Item_1', 'Item_2']).reset_index()
I found this method recently:
pretty_table.get_csv_string()
It converts the table to a CSV string, which you can then write to a CSV file.
I use it like this:
tbl_as_csv = pretty_table.get_csv_string().replace('\r','')
with open("output_path.csv", "w") as text_file:
    text_file.write(tbl_as_csv)
Load the data into a DataFrame first, then export to PrettyTable and Excel:
import io
import difflib
import pandas as pd
import prettytable as pt
data = []
for item in Items_2:
    data.append([item, difflib.get_close_matches(item, Items_1)])
df = pd.DataFrame(data, columns=['Item_1', 'Item_2'])
# Export to prettytable
# https://stackoverflow.com/a/18528589/190597 (Ofer)
# Use io.StringIO with Python3, use io.BytesIO with Python2
output = io.StringIO()
df.to_csv(output)
output.seek(0)
print(pt.from_csv(output))
# Export to Excel file
filename = '/tmp/output.xlsx'
# The context manager saves and closes the workbook on exit
with pd.ExcelWriter(filename) as writer:
    df.to_excel(writer, sheet_name='Sheet1')
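One detail worth noting: get_close_matches returns a list, so each Item_2 cell above holds a Python list, which Excel and CSV files show as its text representation. A small sketch of one way around that, joining the matches into a plain string before building the DataFrame, is below. The contents of Items_1 and Items_2 are hypothetical sample data, and the ", " separator is just one possible choice.

```python
import difflib

import pandas as pd

# Hypothetical sample data standing in for the real lists.
Items_1 = ["ape", "apple", "peach", "puppy"]
Items_2 = ["appel", "zzz"]

rows = []
for item in Items_2:
    matches = difflib.get_close_matches(item, Items_1)
    # Join the matches so each cell is a plain string; no match gives "".
    rows.append({"Item_1": item, "Item_2": ", ".join(matches)})

df = pd.DataFrame(rows, columns=["Item_1", "Item_2"])
print(df)
```

The resulting frame can be passed to to_excel or to_csv in the same way as before.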
I have a CSV file in which I modified two columns. How can I print my CSV file with the updated columns? My code is the following:
import pandas as pd
import csv
data = pd.read_csv("testdataset.csv")
data = data.join(pd.get_dummies(data["ship_from"]))
data = data.drop("ship_from", axis=1)
data['market_name'] = data['market_name'].map(lambda x: str(x)[39:-1])
data = data.join(pd.get_dummies(data["market_name"]))
data = data.drop("market_name", axis=1)
Thank you in advance!
You can write to a file with pandas.DataFrame.to_csv:
data.to_csv('your_file.csv')
Alternatively, you can view the CSV text without writing a file:
print(data.to_csv())
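By default to_csv also writes the row index as an unnamed first column, which is often not wanted when the goal is an updated copy of the original file. A minimal sketch with index=False, on a small made-up frame that mimics the ship_from step of the question (the price column and its values are invented for the example):

```python
import pandas as pd

# Hypothetical stand-in for testdataset.csv.
data = pd.DataFrame({"price": [10, 20, 30], "ship_from": ["US", "DE", "US"]})

# Same steps as in the question: one-hot encode, then drop the original column.
data = data.join(pd.get_dummies(data["ship_from"]))
data = data.drop("ship_from", axis=1)

# Without a path, to_csv returns the CSV text; index=False omits the row index.
csv_text = data.to_csv(index=False)
print(csv_text)
```

Passing a path, as in data.to_csv('your_file.csv', index=False), writes the same text to disk.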
I was trying to convert the below JSON file into a csv file.
JSON file
[{
"SubmitID":1, "Worksheet":3, "UserID":65,
"Q1":"395",
"Q2":"2178",
"Q3":"2699",
"Q4":"1494"},{
"SubmitID":2, "Worksheet":3, "UserID":65,
"Q4":"1394"},{
"SubmitID":3, "Worksheet":4, "UserID":65,
"Q1":"1629",
"Q2":"1950",
"Q3":"0117",
"Q4":"1816",
"Empty":" "}]
However, my Python code below gives the error message "TypeError: Expected String or Unicode". How should I modify my program to make it work?
import json
import pandas as pd
f2 = open('temp.json')
useful_input = json.load(f2)
df=pd.read_json(useful_input)
print(df)
df.to_csv('results.csv')
You just need to pass the file path to pd.read_json():
df=pd.read_json("temp.json")
You don't need the json module at all.
Try:
import pandas as pd
df=pd.read_json("temp.json")
print(df)
df.to_csv('results.csv')
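The error in the question comes from handing read_json an object that json.load has already parsed. If the parsed data is needed anyway, another option is to give the list of dicts straight to the DataFrame constructor. A minimal sketch, using the records from the question as an inline string (JSON_TEXT is a name chosen for this example) instead of temp.json:

```python
import json

import pandas as pd

# The records from the question, inlined so the example is self-contained.
JSON_TEXT = """[
 {"SubmitID":1, "Worksheet":3, "UserID":65,
  "Q1":"395", "Q2":"2178", "Q3":"2699", "Q4":"1494"},
 {"SubmitID":2, "Worksheet":3, "UserID":65, "Q4":"1394"},
 {"SubmitID":3, "Worksheet":4, "UserID":65,
  "Q1":"1629", "Q2":"1950", "Q3":"0117", "Q4":"1816", "Empty":" "}
]"""

records = json.loads(JSON_TEXT)   # a list of dicts, like json.load(f2)
df = pd.DataFrame(records)        # keys missing from a record become NaN

csv_text = df.to_csv(index=False)
print(csv_text)
```

Because the constructor does not reinterpret the values, string answers such as "0117" stay strings in the DataFrame.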
import pandas as pd
df = pd.read_json('data.json')
df.to_csv('data.csv', index=False, columns=['title', 'subtitle', 'date', 'description'])
import pandas as pd
df = pd.read_csv("data.csv")
df = df[df.columns[:4]]
df = df.dropna(how='all')  # dropna returns a new DataFrame, so keep the result
df.to_json('data.json', orient='records')