I'm trying to append values stored from user input to a CSV file. Is there a way to hard code the headers ("Player Name", "Age" and "Number") so that every time I run the code these headers don't appear again in the CSV?
Edit: I want the headers to appear the first time the script runs, but not on the runs after that.
data = {'Player Name': [playersNames[0]],
        'Age': [Age[0]],
        'Number': [number]}
df = pd.DataFrame(data)
df.to_csv("Info.csv",mode='a',index=False, header=True)
print(df)
At the moment it does this:
Player Name,Age,Number
x,15,73
Player Name,Age,Number
y,25,70
Try:
import pathlib
csvfile = pathlib.Path('Info.csv')
df.to_csv(csvfile, mode='a', index=False, header=not csvfile.exists())
If you don't want the header to be appended as well, then set header=False in your call to to_csv.
In this situation, where you are running the script multiple times but only want the header written on the first execution, you can check whether the file already exists:
import os
exists = os.path.exists('Info.csv')
Then pass header=not exists when calling to_csv.
See the pandas to_csv docs for the full list of parameters.
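Putting the two pieces together, a minimal sketch (the sample row just reuses the values from the question's output):
import os
import pandas as pd

data = {'Player Name': ['x'], 'Age': [15], 'Number': [73]}  # sample row; use your stored values here
exists = os.path.exists('Info.csv')
df = pd.DataFrame(data)
# write the header only on the first run, when the file doesn't exist yet
df.to_csv('Info.csv', mode='a', index=False, header=not exists)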
I have a problem with pandas read_excel; this is my code:
import pandas as pd
df = pd.read_excel('myExcelfile.xlsx', 'Table1', engine='openpyxl', header=1)
print(len(df))
If I run this code in PyCharm on a Windows PC, I get the right length of the DataFrame, which is 28757,
but if I run the same code on my Linux server, I get only 26645 as output.
Any ideas what the reason for that is?
Thanks
Try this way:
import pandas as pd
data = pd.read_excel('Advertising.xlsx')
data.head()
I got the solution.
The problem was an empty first row in my .xlsx file.
My file is automatically created by another program, so I used openpyxl to delete the first row and save the result as a new .xlsx file.
import openpyxl
path = 'myExcelFile.xlsx'
book = openpyxl.load_workbook(path)
sheet = book['Tabelle1']
# delete the first row; rows are 1-indexed in openpyxl
sheet.delete_rows(1, 1)
#save in new file:
book.save('myExcelFile_new.xlsx')
Attention: this code sample doesn't check whether the first row is actually empty!
It deletes the first row whether there is content in it or not.
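If you want to guard against deleting a row that has data in it, a small check like this would do (a sketch, reusing the sheet object from the snippet above):
# delete the first row only if every cell in it is empty
if all(cell.value is None for cell in sheet[1]):
    sheet.delete_rows(1, 1)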
I'm a novice when it comes to Python, and in order to learn it I was working on a side project. My goal is to track the prices of my YGO cards using the yu-gi-oh prices API https://yugiohprices.docs.apiary.io/#
I am attempting to manually enter the print tag for each card and then have the API pull the data and populate the spreadsheet with the name of the card and its trait, in addition to the price data, so that the sheet is updated every time I run the code.
My idea was to use a for loop to have the API look up each print tag, store the information in an empty dictionary, and then post the results to the Excel file. I added an example of the spreadsheet.
Please let me know if I can clarify further. Any suggestions that would help me achieve the goal of this project would be appreciated. Thanks in advance
import requests
import json
import pandas as pd
df = pd.read_excel("api_ygo.xlsx")
print(df[:5]) # see the first 5 rows
response = requests.get('http://yugiohprices.com/api/price_for_print_tag/print_tag')
print(response.json())
data = []
for i in df:
    print_tag = i[2]
    request = requests.get('http://yugiohprices.com/api/price_for_print_tag/print_tag' + print_tag)
    data.append(print_tag)
print(data)
def jprint(obj):
    text = json.dumps(obj, sort_keys=True, indent=4)
    print(text)
jprint(response.json())
Example Spreadsheet
Iterating over a pandas dataframe can be done using df.apply(). This has the added advantage that you can store the results directly in your dataframe.
First define a function that returns the desired result. Then apply the relevant column to that function while assigning the output to a new column:
import requests
import pandas as pd
import time
df = pd.DataFrame(['EP1-EN002', 'LED6-EN007', 'DRL2-EN041'], columns=['print_tag']) #just dummy data, in your case this is pd.read_excel
def get_tag(print_tag):
    request = requests.get('http://yugiohprices.com/api/price_for_print_tag/' + print_tag) # this URL works; the one in your code wasn't correct
    time.sleep(1) # sleep for a second to prevent sending too many API calls per minute
    return request.json()
df['result'] = df['print_tag'].apply(get_tag)
You can now export this column to a list of dictionaries with df['result'].tolist(). Or even better, you can flatten the results into a new dataframe with pd.json_normalize:
df2 = pd.json_normalize(df['result'])
df2.to_excel('output.xlsx') # save dataframe as new excel file
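If you want to keep the original print tags next to the flattened price data, you can concatenate the two frames on their shared index before exporting (a sketch, reusing df and df2 from above):
df3 = pd.concat([df['print_tag'], df2], axis=1)
df3.to_excel('output.xlsx', index=False)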
I'm trying to take a dictionary object in python, write it out to a csv file, and then read it back in from that csv file.
But it's not working. When I try to read it back in, it gives me the following error:
EmptyDataError: No columns to parse from file
I don't understand this for two reasons. Firstly, since I used pandas' very own to_csv method, it should be giving me the correct format for a CSV. Secondly, when I print out the header values of the DataFrame I'm trying to save (by doing print(df.columns.values)), it says I do in fact have headers ("one" and "two"). So if the object I was writing out had column names, I don't know why they aren't found when I try to read it back.
import pandas as pd
testing = {"one":1,"two":2 }
df = pd.DataFrame(testing, index=[0])
file = open('testing.csv','w')
df.to_csv(file)
new_df = pd.read_csv("testing.csv")
What am I doing wrong?
Thanks in advance for the help!
When you pass an open file handle to pandas.DataFrame.to_csv, the data may still be sitting in an unflushed buffer when you call read_csv, so the file looks empty. It's simpler to drop the open() call and pass the path directly, letting pandas open and close the file itself; pass index=False to skip writing the index column.
import pandas as pd
testing = {"one":1,"two":2 }
df = pd.DataFrame(testing, index=[0])
df.to_csv('testing.csv', index=False)
new_df = pd.read_csv("testing.csv")
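Alternatively, if you really do want to manage the file object yourself, close it (for example with a with block) before reading the file back, so the buffer is flushed to disk:
with open('testing.csv', 'w', newline='') as f:
    df.to_csv(f, index=False)
new_df = pd.read_csv('testing.csv')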
I can't imagine this question wasn't asked before, but I'm not able to find the answer here:
I have an Excel file loaded as a DataFrame and used DataFrame.groupby on it. Now I want to save every single group into ONE new Excel file, using a DIFFERENT sheet for every group. All I was able to do is create a lot of new files with one group in every file. My new "solution" does nothing.
df = pd.read_excel(file)
neurons = df.groupby("Tags")
#writing Keys into a list
tags = neurons.groups.keys()
tags = list(tags)
for keyInTags in tags:
    cells = neurons.get_group(keyInTags)
    cells.to_excel("final.xlsx", sheet_name=keyInTags)
I get no errors, but also no new file and no writing to an existing file.
Actually, I believe this is a better solution. Replace your for loop with this code:
writer = pd.ExcelWriter('excel_file_name.xlsx')
for keyInTags in tags:
    cells = neurons.get_group(keyInTags)
    cells.to_excel(writer, sheet_name=keyInTags)
writer.save()
Here is a cleaner solution for anyone still looking:
import pandas as pd
df = pd.read_excel("input.xlsx")
with pd.ExcelWriter("output.xlsx") as writer:
    for name, group in df.groupby("column_name"):
        group.to_excel(writer, index=False, sheet_name=name[:31])
The [:31] slice keeps each sheet name within Excel's 31-character limit.
I have a csv (input.csv) file as shown below:
VM IP Naa_Dev Datastore
vm1 xx.xx.xx.x1 naa.ab1234 ds1
vm2 xx.xx.xx.x2 naa.ac1234 ds1
vm3 xx.xx.xx.x3 naa.ad1234 ds2
I want to use this csv file as an input file for my Python script. In this file the first line, i.e. (VM IP Naa_Dev Datastore), is the column heading, and each value is separated by a space.
So my question is: how can I use this csv file for input values in Python, so that if I look up the IP of vm1 the script picks up xx.xx.xx.x1, and in the same way, looking up the VM whose Naa_Dev is naa.ac1234 gives vm2?
I am using Python version 2.7.8
Any help is much appreciated.
Thanks
Working with tabular data like this, the best way is using pandas.
Something like:
import pandas
dataframe = pandas.read_csv('csv_file.csv', sep=' ') # the file is space-separated, so pass the delimiter
# finding IP by vm
print(dataframe[dataframe.VM == 'vm1'].IP)
# OUTPUT: xx.xx.xx.x1
# or find by Naa_Dev
print(dataframe[dataframe.Naa_Dev == 'naa.ac1234'].VM)
# OUTPUT: vm2
For importing csv into python you can use pandas, in your case the code would look like:
import pandas as pd
df = pd.read_csv('input.csv', sep=' ')
and for locating certain rows in the created DataFrame you have multiple options (which you can easily find in the pandas docs or just by googling 'filter data python'), for example:
df['VM'].where(df['Naa_Dev'] == 'naa.ac1234')
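If you want a plain string instead of a pandas Series, you can combine a boolean mask with .loc (a sketch, assuming the df from above):
ip = df.loc[df['VM'] == 'vm1', 'IP'].iloc[0]
print(ip) # xx.xx.xx.x1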
Use the pandas module to read the file into a DataFrame. There are a lot of parameters for reading csv files with pandas.read_csv. The dataframe.to_string() function is extremely useful.
Solution:
# import module with alias 'pd'
import pandas as pd
# Open the CSV file; the delimiter is set to a single space, and then
# we specify the column names.
dframe = pd.read_csv("file.csv",
                     delimiter=" ",
                     header=0, # the file already has a header row; don't read it as data
                     names=["VM", "IP", "Naa_Dev", "Datastore"])
# print will output the table
print(dframe)
# to_string will allow you to align and adjust content,
# e.g. justify="left" to align columns to the left.
print(dframe.to_string(justify="left"))
Pandas is probably the best answer but you can also:
import csv
your_list = []
with open('dummy.csv') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=' ')
    for row in reader:
        your_list.append(row)
print(your_list)
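You can then answer the original lookups with generator expressions over your_list (a sketch):
# IP of vm1
print(next(row['IP'] for row in your_list if row['VM'] == 'vm1'))
# VM that owns naa.ac1234
print(next(row['VM'] for row in your_list if row['Naa_Dev'] == 'naa.ac1234'))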