import pandas as pd
olympics = pd.read_csv('olympics.csv')
Edition NOC Medal
0 1896 AUT Silver
1 1896 FRA Gold
2 1896 GER Gold
3 1900 HUN Bronze
4 1900 GBR Gold
5 1900 DEN Bronze
6 1900 USA Gold
7 1900 FRA Bronze
8 1900 FRA Silver
9 1900 USA Gold
10 1900 FRA Silver
11 1900 GBR Gold
12 1900 SUI Silver
13 1900 ZZX Gold
14 1904 HUN Gold
15 1904 USA Bronze
16 1904 USA Gold
17 1904 USA Silver
18 1904 CAN Gold
19 1904 USA Silver
I can pivot the data frame to get an aggregate value:
pivot = olympics.pivot_table(index='Edition', columns='NOC', values='Medal', aggfunc='count')
NOC AUT CAN DEN FRA GBR GER HUN SUI USA ZZX
Edition
1896 1.0 NaN NaN 1.0 NaN 1.0 NaN NaN NaN NaN
1900 NaN NaN 1.0 3.0 2.0 NaN 1.0 1.0 2.0 1.0
1904 NaN 1.0 NaN NaN NaN NaN 1.0 NaN 4.0 NaN
Rather than having the total number of medals in values=, I would like each cell to be a tuple (a triple) of (#Gold, #Silver, #Bronze), with (0, 0, 0) instead of NaN.
How do I do that succinctly and elegantly?
No need for pivot_table here: once the counts are assembled, plain reshaping (unstack) handles tuple values fine. The plan:
use value_counts to count all medals
create a MultiIndex covering all combinations of editions, countries, and medals
reindex with fill_value=0
counts = olympics.groupby(['Edition', 'NOC']).Medal.value_counts()
mux = pd.MultiIndex.from_product(
    [c.values for c in counts.index.levels], names=counts.index.names)
counts = counts.reindex(mux, fill_value=0).unstack('Medal')
counts = counts[['Gold', 'Silver', 'Bronze']]  # order matches the requested triple
pd.Series([tuple(l) for l in counts.values.tolist()], counts.index).unstack()
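For reuse, the same steps wrapped in a small helper (the wrapper and its name are my addition, not part of the answer itself):
import pandas as pd

def medal_triples(df):
    """One (#Gold, #Silver, #Bronze) tuple per Edition/NOC cell,
    (0, 0, 0) where a country won nothing that year."""
    counts = df.groupby(['Edition', 'NOC']).Medal.value_counts()
    mux = pd.MultiIndex.from_product(
        [c.values for c in counts.index.levels], names=counts.index.names)
    counts = counts.reindex(mux, fill_value=0).unstack('Medal')
    counts = counts[['Gold', 'Silver', 'Bronze']]
    return pd.Series([tuple(row) for row in counts.values.tolist()],
                     counts.index).unstack()

pivot = medal_triples(olympics)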
Apologies if this question is elementary, but I'm new to scraping and am trying to perform a simple scrape of NFL futures prices from a website, without any luck. My code is below. At this point I'm just trying to get something, anything, to return (ultimately I'll pull the text of the team names and futures prices), but this code returns None and [] (an empty list) for the find and find_all calls, respectively. I got the find/find_all parameters by inspecting the first line of the page (Baltimore Ravens), where I saw that the team names are held in a span with the class "style_label__2KJur".
I suspect this has something to do with how the HTML is loaded. When I print(nfl_futures), I don't see any of the HTML that I inspected for the first line, which is presumably why I get no results. If this is true, how do I expose all of the HTML I need in order to scrape this data?
Appreciate the help.
import requests
from bs4 import BeautifulSoup
url = "https://www.pinnacle.com/en/football/nfl/matchups#futures"
r = requests.get(url).content
nfl_futures = BeautifulSoup(r, "lxml")
first_line = nfl_futures.find('span', class_="style_label__2KJur")
lines = nfl_futures.find_all('span', class_="style_label__2KJur")
print(first_line)
print(lines)
Output:
None
[]
Process finished with exit code 0
This site is hardly a simple scrape. The page is dynamic. You could use selenium to render the page first, then grab the HTML to parse with bs4 (see the sketch below). Or, as stated, grab the data from the API, but then you need to do a little data manipulation to join the pieces. I always prefer the API method, as it's robust and more efficient.
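For completeness, a minimal sketch of the selenium route (assumes a working chromedriver on your PATH; the fixed sleep is only illustrative, an explicit wait on a known element is more robust):
import time
from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://www.pinnacle.com/en/football/nfl/matchups#futures")
time.sleep(5)  # let the JavaScript render the futures markup
html = driver.page_source
driver.quit()

soup = BeautifulSoup(html, "lxml")
print(len(soup.find_all("span", class_="style_label__2KJur")))
The API route below avoids the browser overhead entirely: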
import requests
import pandas as pd

url = 'https://www.pinnacle.com/config/app.json'
jsonData = requests.get(url).json()
x_api_key = jsonData['api']['haywire']['apiKey']
headers = {'X-API-Key': x_api_key}

matchups_url = "https://guest.api.arcadia.pinnacle.com/0.1/leagues/889/matchups"
jsonData_matchups = requests.get(matchups_url, headers=headers).json()

# flatten each matchup's participants, carrying matchup-level fields along
df = pd.json_normalize(jsonData_matchups,
                       record_path=['participants'],
                       meta=['id', 'type', ['special', 'category'], ['special', 'description']],
                       meta_prefix='participants.',
                       errors='ignore')

df['id'] = df['id'].fillna(0).astype(int).astype(str)
df['participants.id'] = df['participants.id'].fillna(0).astype(int).astype(str)
df = df.rename(columns={'id': 'participantId', 'participants.id': 'matchupId'})

df_matchups = df[df['participants.type'] == 'matchup']
df_special = df[df['participants.type'] == 'special']

straight_url = 'https://guest.api.arcadia.pinnacle.com/0.1/leagues/889/markets/straight'
jsonData_straight = requests.get(straight_url, headers=headers).json()

# flatten the straight markets down to one row per price
df_straight = pd.json_normalize(jsonData_straight,
                                record_path=['prices'],
                                meta=['type', 'matchupId'],
                                errors='ignore')

df_straight['matchupId'] = df_straight['matchupId'].fillna(0).astype(int).astype(str)
df_straight['participantId'] = df_straight['participantId'].fillna(0).astype(int).astype(str)

df_filter = df_straight[df_straight['designation'].isin(['home', 'away', 'over', 'under'])]
df_filter = df_filter.pivot_table(index=['matchupId', 'participantId'],
                                  columns='designation',
                                  values=['points', 'price']).reset_index(drop=False)
df_filter.columns = ['.'.join(x) if x[-1] != '' else x[0] for x in df_filter.columns]

nfl_futures = pd.merge(df_special, df_straight, how='left',
                       left_on=['matchupId', 'participantId'],
                       right_on=['matchupId', 'participantId'])
nfl_matchups = pd.merge(df_matchups, df_filter, how='left',
                        left_on=['matchupId', 'participantId'],
                        right_on=['matchupId', 'participantId'])
Output:
Here's what the first 10 of 324 rows look like for futures:
print(nfl_futures.head(10).to_string())
alignment participantId name order rotation matchupId participants.type participants.special.category participants.special.description points price designation type
0 neutral 1326753860 Over 0 3017.0 1326753859 special Regular Season Wins Dallas Cowboys Regular Season Wins? 9.5 108 NaN total
1 neutral 1326753861 Under 0 3018.0 1326753859 special Regular Season Wins Dallas Cowboys Regular Season Wins? 9.5 -129 NaN total
2 neutral 1336218775 Trevor Lawrence 0 5801.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 312 NaN moneyline
3 neutral 1336218776 Justin Fields 0 5802.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 461 NaN moneyline
4 neutral 1336218777 Zach Wilson 0 5803.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 790 NaN moneyline
5 neutral 1336218778 Trey Lance 0 5804.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 655 NaN moneyline
6 neutral 1336218779 Mac Jones 0 5805.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 807 NaN moneyline
7 neutral 1336218780 Kyle Pitts 0 5806.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 1095 NaN moneyline
8 neutral 1336218781 Najee Harris 0 5807.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 1015 NaN moneyline
9 neutral 1336218782 DeVonta Smith 0 5808.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 1903 NaN moneyline
And here are the week 1 matchup lines:
print(nfl_matchups.to_string())
alignment participantId name order rotation matchupId participants.type participants.special.category participants.special.description points.away points.home points.over points.under price.away price.home price.over price.under
0 home 0 Tampa Bay Buccaneers 1 NaN 1327265167 matchup NaN NaN 6.5 -6.5 51.5 51.5 107.0 -118.0 101.0 -112.0
1 away 0 Dallas Cowboys 0 NaN 1327265167 matchup NaN NaN 6.5 -6.5 51.5 51.5 107.0 -118.0 101.0 -112.0
2 home 0 Washington Football Team 1 NaN 1327265554 matchup NaN NaN 0.0 0.0 44.5 44.5 -115.0 104.0 -106.0 -106.0
3 away 0 Los Angeles Chargers 0 NaN 1327265554 matchup NaN NaN 0.0 0.0 44.5 44.5 -115.0 104.0 -106.0 -106.0
4 home 0 Detroit Lions 1 NaN 1327265774 matchup NaN NaN -7.5 7.5 46.0 46.0 101.0 -111.0 -106.0 -106.0
5 away 0 San Francisco 49ers 0 NaN 1327265774 matchup NaN NaN -7.5 7.5 46.0 46.0 101.0 -111.0 -106.0 -106.0
6 home 0 Las Vegas Raiders 1 NaN 1327266134 matchup NaN NaN -4.5 4.5 51.0 51.0 -110.0 -100.0 -106.0 -106.0
7 away 0 Baltimore Ravens 0 NaN 1327266134 matchup NaN NaN -4.5 4.5 51.0 51.0 -110.0 -100.0 -106.0 -106.0
8 home 0 Los Angeles Rams 1 NaN 1327266054 matchup NaN NaN 7.5 -7.5 45.0 45.0 -114.0 103.0 -106.0 -106.0
9 away 0 Chicago Bears 0 NaN 1327266054 matchup NaN NaN 7.5 -7.5 45.0 45.0 -114.0 103.0 -106.0 -106.0
10 home 0 Kansas City Chiefs 1 NaN 1327265828 matchup NaN NaN 6.0 -6.0 52.5 52.5 102.0 -112.0 -106.0 -106.0
11 away 0 Cleveland Browns 0 NaN 1327265828 matchup NaN NaN 6.0 -6.0 52.5 52.5 102.0 -112.0 -106.0 -106.0
12 home 0 Carolina Panthers 1 NaN 1327265337 matchup NaN NaN 4.0 -4.0 43.0 43.0 -105.0 -105.0 -106.0 -106.0
13 away 0 New York Jets 0 NaN 1327265337 matchup NaN NaN 4.0 -4.0 43.0 43.0 -105.0 -105.0 -106.0 -106.0
14 home 0 Cincinnati Bengals 1 NaN 1327265711 matchup NaN NaN -3.5 3.5 48.0 48.0 -105.0 -105.0 -106.0 -106.0
15 away 0 Minnesota Vikings 0 NaN 1327265711 matchup NaN NaN -3.5 3.5 48.0 48.0 -105.0 -105.0 -106.0 -106.0
16 home 0 New Orleans Saints 1 NaN 1327266000 matchup NaN NaN -2.5 2.5 50.0 50.0 -118.0 107.0 -106.0 -106.0
17 away 0 Green Bay Packers 0 NaN 1327266000 matchup NaN NaN -2.5 2.5 50.0 50.0 -118.0 107.0 -106.0 -106.0
18 home 0 Buffalo Bills 1 NaN 1327265283 matchup NaN NaN 7.0 -7.0 50.0 50.0 -116.0 105.0 -106.0 -106.0
19 away 0 Pittsburgh Steelers 0 NaN 1327265283 matchup NaN NaN 7.0 -7.0 50.0 50.0 -116.0 105.0 -106.0 -106.0
20 home 0 Tennessee Titans 1 NaN 1327265444 matchup NaN NaN 3.0 -3.0 51.0 51.0 -102.0 -108.0 -116.0 104.0
21 away 0 Arizona Cardinals 0 NaN 1327265444 matchup NaN NaN 3.0 -3.0 51.0 51.0 -102.0 -108.0 -116.0 104.0
22 home 0 New York Giants 1 NaN 1327265931 matchup NaN NaN -1.0 1.0 42.5 42.5 -110.0 100.0 -106.0 -106.0
23 away 0 Denver Broncos 0 NaN 1327265931 matchup NaN NaN -1.0 1.0 42.5 42.5 -110.0 100.0 -106.0 -106.0
24 home 0 Atlanta Falcons 1 NaN 1327265598 matchup NaN NaN 3.5 -3.5 48.0 48.0 -108.0 -102.0 -106.0 -105.0
25 away 0 Philadelphia Eagles 0 NaN 1327265598 matchup NaN NaN 3.5 -3.5 48.0 48.0 -108.0 -102.0 -106.0 -105.0
26 home 0 Indianapolis Colts 1 NaN 1327265657 matchup NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
27 away 0 Seattle Seahawks 0 NaN 1327265657 matchup NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
28 home 0 New England Patriots 1 NaN 1327265876 matchup NaN NaN 2.5 -2.5 45.5 45.5 104.0 -115.0 103.0 -115.0
29 away 0 Miami Dolphins 0 NaN 1327265876 matchup NaN NaN 2.5 -2.5 45.5 45.5 104.0 -115.0 103.0 -115.0
Try using html.parser instead of lxml. Also, print your nfl_futures variable to check whether you are getting an HTML page at all.
If you are, then check inside the HTML whether the element(s) you are looking for actually exist; a quick diagnostic along those lines is sketched below.
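A possible diagnostic (url as in the question; purely a sketch):
import requests
from bs4 import BeautifulSoup

url = "https://www.pinnacle.com/en/football/nfl/matchups#futures"
nfl_futures = BeautifulSoup(requests.get(url).content, "html.parser")

print(nfl_futures.prettify()[:1000])      # what did the server actually return?
print("style_label" in str(nfl_futures))  # is the class anywhere in the raw HTML?
If the class never appears in the raw HTML, the content is rendered by JavaScript and requests alone will not see it.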
I am trying to crawl NBA player info from https://nba.com/players, which requires clicking the "Show Historic" toggle on the page.
Part of the HTML for the input button is shown below:
<div aria-label="Show Historic Toggle" class="Toggle_switch__2e_90">
  <input type="checkbox" class="Toggle_input__gIiFd" name="showHistoric">
  <span class="Toggle_slider__hCMQQ Toggle_sliderActive__15Jrf Toggle_slidercerulean__1UnnV">
  </span>
</div>
I simply use find_element_by_xpath to locate the input and click it:
button_show_historic = driver.find_element_by_xpath("//input[@name='showHistoric']")
button_show_historic.click()
However it says:
Exception has occurred: ElementNotInteractableException
Message: element not interactable
(Session info: chrome=88.0.4324.192)
Could anyone help solve this issue? Is it because the input is not visible?
Simply wait for the span element, not the input element, and click that.
wait = WebDriverWait(driver, 30)
driver.get('https://www.nba.com/players')
wait.until(EC.element_to_be_clickable((By.XPATH,"//button[.='I Accept']"))).click()
wait.until(EC.element_to_be_clickable((By.XPATH,"//input[#name='showHistoric']/preceding::span[1]"))).click()
Imports:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
Also, to find an API, look under Developer Tools -> Network -> Headers,
and check the Response to see whether it gets populated.
Most probably the problem is that you don't have any wait code. You should wait until the page is loaded. You can use Python's simple sleep function:
import time
time.sleep(3)  # wait 3 seconds
# do your action
Or you can use an explicit wait; see the documentation at selenium.dev. A minimal sketch follows.
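An explicit-wait sketch, reusing the XPath from the answer above (driver is your existing WebDriver; the 10-second timeout is arbitrary):
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# wait up to 10 seconds for the toggle's slider to become clickable
wait = WebDriverWait(driver, 10)
slider = wait.until(EC.element_to_be_clickable(
    (By.XPATH, "//input[@name='showHistoric']/preceding::span[1]")))
slider.click()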
No need to use selenium when there's an API. Try this:
import requests
import pandas as pd

url = 'https://stats.nba.com/stats/playerindex'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36',
    'Referer': 'http://stats.nba.com'}
payload = {
    'College': '',
    'Country': '',
    'DraftPick': '',
    'DraftRound': '',
    'DraftYear': '',
    'Height': '',
    'Historical': '1',
    'LeagueID': '00',
    'Season': '2020-21',
    'SeasonType': 'Regular Season',
    'TeamID': '0',
    'Weight': ''}

jsonData = requests.get(url, headers=headers, params=payload).json()
cols = jsonData['resultSets'][0]['headers']
data = jsonData['resultSets'][0]['rowSet']
df = pd.DataFrame(data, columns=cols)
Output: [4589 rows x 26 columns]
print(df.head(20).to_string())
PERSON_ID PLAYER_LAST_NAME PLAYER_FIRST_NAME PLAYER_SLUG TEAM_ID TEAM_SLUG IS_DEFUNCT TEAM_CITY TEAM_NAME TEAM_ABBREVIATION JERSEY_NUMBER POSITION HEIGHT WEIGHT COLLEGE COUNTRY DRAFT_YEAR DRAFT_ROUND DRAFT_NUMBER ROSTER_STATUS PTS REB AST STATS_TIMEFRAME FROM_YEAR TO_YEAR
0 76001 Abdelnaby Alaa alaa-abdelnaby 1.610613e+09 blazers 0 Portland Trail Blazers POR 30 F 6-10 240 Duke USA 1990.0 1.0 25.0 NaN 5.7 3.3 0.3 Career 1990 1994
1 76002 Abdul-Aziz Zaid zaid-abdul-aziz 1.610613e+09 rockets 0 Houston Rockets HOU 54 C 6-9 235 Iowa State USA 1968.0 1.0 5.0 NaN 9.0 8.0 1.2 Career 1968 1977
2 76003 Abdul-Jabbar Kareem kareem-abdul-jabbar 1.610613e+09 lakers 0 Los Angeles Lakers LAL 33 C 7-2 225 UCLA USA 1969.0 1.0 1.0 NaN 24.6 11.2 3.6 Career 1969 1988
3 51 Abdul-Rauf Mahmoud mahmoud-abdul-rauf 1.610613e+09 nuggets 0 Denver Nuggets DEN 1 G 6-1 162 Louisiana State USA 1990.0 1.0 3.0 NaN 14.6 1.9 3.5 Career 1990 2000
4 1505 Abdul-Wahad Tariq tariq-abdul-wahad 1.610613e+09 kings 0 Sacramento Kings SAC 9 F-G 6-6 235 San Jose State France 1997.0 1.0 11.0 NaN 7.8 3.3 1.1 Career 1997 2003
5 949 Abdur-Rahim Shareef shareef-abdur-rahim 1.610613e+09 grizzlies 0 Memphis Grizzlies MEM 3 F 6-9 245 California USA 1996.0 1.0 3.0 NaN 18.1 7.5 2.5 Career 1996 2007
6 76005 Abernethy Tom tom-abernethy 1.610613e+09 warriors 0 Golden State Warriors GSW 5 F 6-7 220 Indiana USA 1976.0 3.0 43.0 NaN 5.6 3.2 1.2 Career 1976 1980
7 76006 Able Forest forest-able 1.610613e+09 sixers 0 Philadelphia 76ers PHI 6 G 6-3 180 Western Kentucky USA 1956.0 NaN NaN NaN 0.0 1.0 1.0 Career 1956 1956
8 76007 Abramovic John john-abramovic 1.610610e+09 None 1 Pittsburgh Ironmen PIT None F 6-3 195 Salem USA NaN NaN NaN NaN 9.5 NaN 0.7 Career 1946 1947
9 203518 Abrines Alex alex-abrines 1.610613e+09 thunder 0 Oklahoma City Thunder OKC 8 G 6-6 190 FC Barcelona Spain 2013.0 2.0 32.0 NaN 5.3 1.4 0.5 Career 2016 2018
10 1630173 Achiuwa Precious precious-achiuwa 1.610613e+09 heat 0 Miami Heat MIA 5 F 6-8 225 Memphis Nigeria 2020.0 1.0 20.0 1.0 5.9 3.9 0.6 Season 2020 2020
11 101165 Acker Alex alex-acker 1.610613e+09 clippers 0 LA Clippers LAC 3 G 6-5 185 Pepperdine USA 2005.0 2.0 60.0 NaN 2.7 1.0 0.5 Career 2005 2008
12 76008 Ackerman Donald donald-ackerman 1.610613e+09 knicks 0 New York Knicks NYK G 6-0 183 Long Island-Brooklyn USA 1953.0 2.0 NaN NaN 1.5 0.5 0.8 Career 1953 1953
13 76009 Acres Mark mark-acres 1.610613e+09 magic 0 Orlando Magic ORL 42 C 6-11 220 Oral Roberts USA 1985.0 2.0 40.0 NaN 3.6 4.1 0.5 Career 1987 1992
14 76010 Acton Charles charles-acton 1.610613e+09 rockets 0 Houston Rockets HOU 24 F 6-6 210 Hillsdale USA NaN NaN NaN NaN 3.3 2.0 0.5 Career 1967 1967
15 203112 Acy Quincy quincy-acy 1.610613e+09 kings 0 Sacramento Kings SAC 13 F 6-7 240 Baylor USA 2012.0 2.0 37.0 NaN 4.9 3.5 0.6 Career 2012 2018
16 76011 Adams Alvan alvan-adams 1.610613e+09 suns 0 Phoenix Suns PHX 33 C 6-9 210 Oklahoma USA 1975.0 1.0 4.0 NaN 14.1 7.0 4.1 Career 1975 1987
17 76012 Adams Don don-adams 1.610613e+09 pistons 0 Detroit Pistons DET 10 F 6-7 210 Northwestern USA 1970.0 8.0 120.0 NaN 8.7 5.6 1.8 Career 1970 1976
18 200801 Adams Hassan hassan-adams 1.610613e+09 nets 0 Brooklyn Nets BKN 8 F 6-4 220 Arizona USA 2006.0 2.0 54.0 NaN 2.5 1.2 0.2 Career 2006 2008
19 1629121 Adams Jaylen jaylen-adams 1.610613e+09 bucks 0 Milwaukee Bucks MIL 6 G 6-0 225 St. Bonaventure USA NaN NaN NaN 1.0 0.3 0.4 0.3 Season 2018 2020
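A possible follow-up once the frame is built (my addition; FROM_YEAR/TO_YEAR are columns shown in the output above):
df.to_csv('nba_players.csv', index=False)        # persist the full player index
historic = df[df['TO_YEAR'].astype(int) < 2020]  # e.g. careers that ended before 2020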
I am trying to reshape the following dataframe into panel-data form by moving the "Award Year" column so that each year becomes an individual column.
Out[34]:
Award Year 0
State
Alabama 2003 89
Alabama 2004 92
Alabama 2005 108
Alabama 2006 81
Alabama 2007 71
... ...
Wyoming 2011 4
Wyoming 2012 2
Wyoming 2013 1
Wyoming 2014 4
Wyoming 2015 3
[648 rows x 2 columns]
I want each year to be an individual column. Here is an example:
Out[48]:
State 2003 2004 2005 2006
0 NewYork 10 10 10 10
1 Alabama 15 15 15 15
2 Washington 20 20 20 20
I have read up on stack/unstack, but I don't think I want a multi-level index as a result. I have been looking through the documentation at to_frame etc., but I can't find what I am looking for.
If anyone can help, that would be great!
Use set_index with append=True, then select column 0 and use unstack to reshape:
df = df.set_index('Award Year', append=True)['0'].unstack()
Result:
Award Year 2003 2004 2005 2006 2007 2011 2012 2013 2014 2015
State
Alabama 89.0 92.0 108.0 81.0 71.0 NaN NaN NaN NaN NaN
Wyoming NaN NaN NaN NaN NaN 4.0 2.0 1.0 4.0 3.0
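If you want State back as a regular column, as in the desired output, a small follow-up (my addition):
df = df.set_index('Award Year', append=True)['0'].unstack().reset_index()
df.columns.name = None  # drop the leftover 'Award Year' axis label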
Pivot Table can help.
df2 = pd.pivot_table(df, values='0', columns='Award Year', index=['State'])
df2
Result:
Award Year 2003 2004 2005 2006 2007 2011 2012 2013 2014 2015
State
Alabama 89.0 92.0 108.0 81.0 71.0 NaN NaN NaN NaN NaN
Wyoming NaN NaN NaN NaN NaN 4.0 2.0 1.0 4.0 3.0
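And if you prefer a balanced panel with 0 instead of NaN for the missing years, a one-line follow-up (my addition, not part of either answer):
df2 = df2.fillna(0).astype(int)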
I need to reshape a csv pivot table. A small extract looks like:
country location confirmedcases_10-02-2020 deaths_10-02-2020 confirmedcases_11-02-2020 deaths_11-02-2020
0 Australia New South Wales 4.0 0.0 4 0.0
1 Australia Victoria 4.0 0.0 4 0.0
2 Australia Queensland 5.0 0.0 5 0.0
3 Australia South Australia 2.0 0.0 2 0.0
4 Cambodia Sihanoukville 1.0 0.0 1 0.0
5 Canada Ontario 3.0 0.0 3 0.0
6 Canada British Columbia 4.0 0.0 4 0.0
7 China Hubei 31728.0 974.0 33366 1068.0
8 China Zhejiang 1177.0 0.0 1131 0.0
9 China Guangdong 1177.0 1.0 1219 1.0
10 China Henan 1105.0 7.0 1135 8.0
11 China Hunan 912.0 1.0 946 2.0
12 China Anhui 860.0 4.0 889 4.0
13 China Jiangxi 804.0 1.0 844 1.0
14 China Chongqing 486.0 2.0 505 3.0
15 China Sichuan 417.0 1.0 436 1.0
16 China Shandong 486.0 1.0 497 1.0
17 China Jiangsu 515.0 0.0 543 0.0
18 China Shanghai 302.0 1.0 311 1.0
19 China Beijing 342.0 3.0 352 3.0
Is there any ready-to-use pandas tool to turn it into something like:
country location date confirmedcases deaths
0 Australia New South Wales 2020-02-10 4.0 0.0
1 Australia Victoria 2020-02-10 4.0 0.0
2 Australia Queensland 2020-02-10 5.0 0.0
3 Australia South Australia 2020-02-10 2.0 0.0
4 Cambodia Sihanoukville 2020-02-10 1.0 0.0
5 Canada Ontario 2020-02-10 3.0 0.0
6 Canada British Columbia 2020-02-10 4.0 0.0
7 China Hubei 2020-02-10 31728.0 974.0
8 China Zhejiang 2020-02-10 1177.0 0.0
9 China Guangdong 2020-02-10 1177.0 1.0
10 China Henan 2020-02-10 1105.0 7.0
11 China Hunan 2020-02-10 912.0 1.0
12 China Anhui 2020-02-10 860.0 4.0
13 China Jiangxi 2020-02-10 804.0 1.0
14 China Chongqing 2020-02-10 486.0 2.0
15 China Sichuan 2020-02-10 417.0 1.0
16 China Shandong 2020-02-10 486.0 1.0
17 China Jiangsu 2020-02-10 515.0 0.0
18 China Shanghai 2020-02-10 302.0 1.0
19 China Beijing 2020-02-10 342.0 3.0
20 Australia New South Wales 2020-02-11 4.0 0.0
21 Australia Victoria 2020-02-11 4.0 0.0
22 Australia Queensland 2020-02-11 5.0 0.0
23 Australia South Australia 2020-02-11 2.0 0.0
24 Cambodia Sihanoukville 2020-02-11 1.0 0.0
25 Canada Ontario 2020-02-11 3.0 0.0
26 Canada British Columbia 2020-02-11 4.0 0.0
27 China Hubei 2020-02-11 33366.0 1068.0
28 China Zhejiang 2020-02-11 1131.0 0.0
29 China Guangdong 2020-02-11 1219.0 1.0
30 China Henan 2020-02-11 1135.0 8.0
31 China Hunan 2020-02-11 946.0 2.0
32 China Anhui 2020-02-11 889.0 4.0
33 China Jiangxi 2020-02-11 844.0 1.0
34 China Chongqing 2020-02-11 505.0 3.0
35 China Sichuan 2020-02-11 436.0 1.0
36 China Shandong 2020-02-11 497.0 1.0
37 China Jiangsu 2020-02-11 543.0 0.0
38 China Shanghai 2020-02-11 311.0 1.0
39 China Beijing 2020-02-11 352.0 3.0
Use pd.wide_to_long:
print(pd.wide_to_long(df, stubnames=["confirmedcases", "deaths"],
                      i=["country", "location"], j="date", sep="_",
                      suffix=r'\d{2}-\d{2}-\d{4}').reset_index())
country location date confirmedcases deaths
0 Australia New South Wales 10-02-2020 4.0 0.0
1 Australia New South Wales 11-02-2020 4.0 0.0
2 Australia Victoria 10-02-2020 4.0 0.0
3 Australia Victoria 11-02-2020 4.0 0.0
4 Australia Queensland 10-02-2020 5.0 0.0
5 Australia Queensland 11-02-2020 5.0 0.0
6 Australia South Australia 10-02-2020 2.0 0.0
7 Australia South Australia 11-02-2020 2.0 0.0
8 Cambodia Sihanoukville 10-02-2020 1.0 0.0
9 Cambodia Sihanoukville 11-02-2020 1.0 0.0
10 Canada Ontario 10-02-2020 3.0 0.0
11 Canada Ontario 11-02-2020 3.0 0.0
12 Canada British Columbia 10-02-2020 4.0 0.0
13 Canada British Columbia 11-02-2020 4.0 0.0
14 China Hubei 10-02-2020 31728.0 974.0
15 China Hubei 11-02-2020 33366.0 1068.0
16 China Zhejiang 10-02-2020 1177.0 0.0
17 China Zhejiang 11-02-2020 1131.0 0.0
18 China Guangdong 10-02-2020 1177.0 1.0
19 China Guangdong 11-02-2020 1219.0 1.0
20 China Henan 10-02-2020 1105.0 7.0
21 China Henan 11-02-2020 1135.0 8.0
22 China Hunan 10-02-2020 912.0 1.0
23 China Hunan 11-02-2020 946.0 2.0
24 China Anhui 10-02-2020 860.0 4.0
25 China Anhui 11-02-2020 889.0 4.0
26 China Jiangxi 10-02-2020 804.0 1.0
27 China Jiangxi 11-02-2020 844.0 1.0
28 China Chongqing 10-02-2020 486.0 2.0
29 China Chongqing 11-02-2020 505.0 3.0
30 China Sichuan 10-02-2020 417.0 1.0
31 China Sichuan 11-02-2020 436.0 1.0
32 China Shandong 10-02-2020 486.0 1.0
33 China Shandong 11-02-2020 497.0 1.0
34 China Jiangsu 10-02-2020 515.0 0.0
35 China Jiangsu 11-02-2020 543.0 0.0
36 China Shanghai 10-02-2020 302.0 1.0
37 China Shanghai 11-02-2020 311.0 1.0
38 China Beijing 10-02-2020 342.0 3.0
39 China Beijing 11-02-2020 352.0 3.0
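The date level comes out as the raw day-first suffix ('10-02-2020'); a possible follow-up converting it to a real datetime and sorting into the order shown in the question (my addition):
out = pd.wide_to_long(df, stubnames=["confirmedcases", "deaths"],
                      i=["country", "location"], j="date", sep="_",
                      suffix=r'\d{2}-\d{2}-\d{4}').reset_index()
out["date"] = pd.to_datetime(out["date"], format="%d-%m-%Y")  # the suffix is day-first
out = out.sort_values(["date", "country", "location"]).reset_index(drop=True)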
Yes, and you can achieve it by reshaping the dataframe.
First you have to melt the columns to have them as values:
df = df.melt(['country', 'location'],
             [p for p in df.columns if p not in ['country', 'location']],
             'key',
             'value')
#> country location key value
#> 0 Australia New South Wales confirmedcases_10-02-2020 4
#> 1 Australia Victoria confirmedcases_10-02-2020 4
#> 2 Australia Queensland confirmedcases_10-02-2020 5
#> 3 Australia South Australia confirmedcases_10-02-2020 2
#> 4 Cambodia Sihanoukville confirmedcases_10-02-2020 1
#> .. ... ... ... ...
#> 75 China Sichuan deaths_11-02-2020 1
#> 76 China Shandong deaths_11-02-2020 1
#> 77 China Jiangsu deaths_11-02-2020 0
#> 78 China Shanghai deaths_11-02-2020 1
#> 79 China Beijing deaths_11-02-2020 3
After that, you need to split the values in the key column:
key_split_series = df.key.str.split("_", expand=True)
df["key"] = key_split_series[0]
df["date"] = key_split_series[1]
#> country location key value date
#> 0 Australia New South Wales confirmedcases 4 10-02-2020
#> 1 Australia Victoria confirmedcases 4 10-02-2020
#> 2 Australia Queensland confirmedcases 5 10-02-2020
#> 3 Australia South Australia confirmedcases 2 10-02-2020
#> 4 Cambodia Sihanoukville confirmedcases 1 10-02-2020
#> .. ... ... ... ... ...
#> 75 China Sichuan deaths 1 11-02-2020
#> 76 China Shandong deaths 1 11-02-2020
#> 77 China Jiangsu deaths 0 11-02-2020
#> 78 China Shanghai deaths 1 11-02-2020
#> 79 China Beijing deaths 3 11-02-2020
In the end, you just need to pivot the table to have confirmedcases and deaths back as columns:
df = df.set_index(["country", "location", "date", "key"])["value"].unstack().reset_index()
#> key country location date confirmedcases deaths
#> 0 Australia New South Wales 10-02-2020 4 0
#> 1 Australia New South Wales 11-02-2020 4 0
#> 2 Australia Queensland 10-02-2020 5 0
#> 3 Australia Queensland 11-02-2020 5 0
#> 4 Australia South Australia 10-02-2020 2 0
#> .. ... ... ... ... ...
#> 35 China Shanghai 11-02-2020 311 1
#> 36 China Sichuan 10-02-2020 417 1
#> 37 China Sichuan 11-02-2020 436 1
#> 38 China Zhejiang 10-02-2020 1177 0
#> 39 China Zhejiang 11-02-2020 1131 0
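For reference, the same three steps as a single method chain (the chaining style is my choice, not the answerer's; pivot_table's default mean is harmless here because each group holds exactly one value):
out = (df.melt(['country', 'location'], var_name='key', value_name='value')
         .assign(date=lambda d: d.key.str.split('_').str[1],
                 key=lambda d: d.key.str.split('_').str[0])
         .pivot_table(index=['country', 'location', 'date'],
                      columns='key', values='value')
         .reset_index())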
Use {array}.reshape((-1, 1)) if there is only one feature and {array}.reshape((1, -1)) if there is only one sample. (Note that reshape is a NumPy-array operation, e.g. on df.values; it is not a pandas DataFrame method.)
I have two big CSV files, which I have converted to Pandas dataframes. Both have columns with the same names, in the same order: event_name, category, category_id, description. I want to append one dataframe to the other and, finally, write the resulting dataframe to a CSV. I wrote this code:
#appending a new dataframe to the older dataframe
data = pd.read_csv("dataset.csv")
data1 = pd.read_csv("dataset_new.csv")
dfs = [data, data1]
pd.concat([df.squeeze() for df in dfs], ignore_index=True)
dfs = pd.DataFrame(columns=['event_name','category', 'category_id', 'description'])
dfs.to_csv('dataset_append.csv', encoding='utf-8', index=False)
I wanted to show you the output of print(dfs), but I couldn't, because Stack Overflow shows the following error when the output is too long:
Body is limited to 30000 characters; you entered 32132.
Would you please show me a code snippet that you have used successfully
to append Pandas dataframes?
Edit 1:
print(dfs)
output:
---------------------------------------------------------
[ Unnamed: 10 Unnamed: 100 Unnamed: 101 Unnamed: 102 Unnamed: 103 \
0 NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN
5 NaN NaN NaN NaN NaN
6 NaN NaN NaN NaN NaN
7 NaN NaN NaN NaN NaN
8 NaN NaN NaN NaN NaN
9 NaN NaN NaN NaN NaN
10 NaN NaN NaN NaN NaN
11 NaN NaN NaN NaN NaN
12 NaN NaN NaN NaN NaN
13 NaN NaN NaN NaN NaN
14 NaN NaN NaN NaN NaN
15 NaN NaN NaN NaN NaN
16 NaN NaN NaN NaN NaN
17 NaN NaN NaN NaN NaN
18 NaN NaN NaN NaN NaN
19 NaN NaN NaN NaN NaN
20 NaN NaN NaN NaN NaN
21 NaN NaN NaN NaN NaN
22 NaN NaN NaN NaN NaN
23 NaN NaN NaN NaN NaN
24 NaN NaN NaN NaN NaN
25 NaN NaN NaN NaN NaN
26 NaN NaN NaN NaN NaN
27 NaN NaN NaN NaN NaN
28 NaN NaN NaN NaN NaN
29 NaN NaN NaN NaN NaN
... ... ... ... ... ...
1159 NaN NaN NaN NaN NaN
1160 NaN NaN NaN NaN NaN
1161 NaN NaN NaN NaN NaN
1162 NaN NaN NaN NaN NaN
Unnamed: 104 Unnamed: 105 Unnamed: 106 Unnamed: 107 Unnamed: 108 \
0 NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN
5 NaN NaN NaN NaN NaN
6 NaN NaN NaN NaN NaN
7 NaN NaN NaN NaN NaN
... ... ... ... ... ...
1161 NaN NaN NaN NaN NaN
1162 NaN NaN NaN NaN NaN
... Unnamed: 94 \
0 ... NaN
1 ... NaN
2 ... NaN
3 ... NaN
4 ... NaN
5 ... NaN
6 ... NaN
7 ... NaN
8 ... NaN
9 ... NaN
10 ... NaN
11 ... NaN
12 ... NaN
13 ... NaN
14 ... NaN
15 ... NaN
16 ... NaN
17 ... NaN
18 ... NaN
19 ... NaN
20 ... NaN
21 ... NaN
22 ... NaN
23 ... NaN
24 ... NaN
25 ... NaN
26 ... NaN
27 ... NaN
28 ... NaN
29 ... NaN
... ... ...
1133 ... NaN
1134 ... NaN
1135 ... NaN
1136 ... NaN
1137 ... NaN
1138 ... NaN
1139 ... NaN
1140 ... NaN
1141 ... NaN
1142 ... NaN
1143 ... NaN
1144 ... NaN
1145 ... NaN
1146 ... NaN
1147 ... NaN
1148 ... NaN
1149 ... NaN
1150 ... NaN
1151 ... NaN
1152 ... NaN
1153 ... NaN
1154 ... NaN
1155 ... NaN
1156 ... NaN
1157 ... NaN
1158 ... NaN
1159 ... NaN
1160 ... NaN
1161 ... NaN
1162 ... NaN
Unnamed: 95 Unnamed: 96 Unnamed: 97 Unnamed: 98 Unnamed: 99 \
0 NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN
... ... ... ... ... ...
1133 NaN NaN NaN NaN NaN
1134 NaN NaN NaN NaN NaN
1135 NaN NaN NaN NaN NaN
1136 NaN NaN NaN NaN NaN
category category_id \
0 Business 2
1 stage shows 33
2 Literature 15
3 Science & Technology 22
4 health 11
5 Science & Technology 22
6 Outdoor 19
7 stage shows 33
8 nightlife 30
9 fashion & lifestyle 6
10 Government & Activism 25
11 stage shows 33
12 Religion & Spirituality 21
13 Outdoor 19
14 management 17
15 Science & Technology 22
16 nightlife 30
17 Outdoor 19
18 FAMILy & kids 5
19 fashion & lifestyle 6
20 FAMILy & kids 5
21 games 10
22 hobbies 32
23 hobbies 32
24 Religion & Spirituality 21
25 health 11
26 fashion & lifestyle 6
27 career & education 31
28 health 11
29 arts 1
... ... ...
1133 Sports & Fitness 23
1134 Sports & Fitness 23
1135 Sports & Fitness 23
1136 Sports & Fitness 23
1137 Sports & Fitness 23
1138 Sports & Fitness 23
1139 Sports & Fitness 23
1140 Sports & Fitness 23
1141 Sports & Fitness 23
1142 Sports & Fitness 23
1143 Sports & Fitness 23
1144 Sports & Fitness 23
1145 Sports & Fitness 23
1146 Sports & Fitness 23
1147 Sports & Fitness 23
1148 Sports & Fitness 23
1149 Sports & Fitness 23
1150 Sports & Fitness 23
1151 Sports & Fitness 23
1152 Sports & Fitness 23
1153 Sports & Fitness 23
1154 Sports & Fitness 23
1155 Sports & Fitness 23
1156 Sports & Fitness 23
1157 Sports & Fitness 23
1158 Sports & Fitness 23
1159 Sports & Fitness 23
1160 Sports & Fitness 23
1161 Sports & Fitness 23
1162 Sports & Fitness 23
description \
0 Josh Talks in partnership with Facebook is all...
1 Unwind on the strums of Guitar & immerse your...
2 Book review for grade 3 and above learners. 3 ...
3 ..About Organizer:.This is the official page f...
4 Blood Donation is organized under the banner o...
5 A day "Etched with Innovation and Learning" to...
6 Our next destination for Fun with us is "Goa" ...
7 Enjoy the Soulful and Unplugged Performance of...
8 Get ready with your dance shoes on as our favo...
9 FESTIVE HUES -- a fashion and lifestyle exhibi...
10 On Aug. 8, Dr. Ambedkar presides over the Depr...
11 It's A Rapper Boys..And M Write A New Rap song...
12 The Spiritual Makeover..A weekend workshop tha...
13 Our next destination for Fun with us is "Goa" ...
14 Project Management is all about getting the th...
15 World Conference Next Generation Testing 2018 ...
16 ..About Organizer:.Whitefield is now #Sherlocked!
17 On occasion of 72th Independence Day , Udaan O...
18 *Smilofy Special Superstar*.A Talent hunt for ...
19 ITEEHA is coming back to Bengaluru, after a fa...
20 This is an exciting course for kids to teach t...
21 ..About Organizer:.PPG Lounge is a next genera...
22 Touch Feel Try & Buy the latest #car and #bike...
23 Sniper Media is organising an exclusive semina...
24 He has all sorts of powers and able solve any ...
25 registration fee 50/₹ we r providing free c...
26 World Biggest Pageant Miss & Mrs World Queen a...
27 ..About Organizer:.Canam Consultants - India's...
28 Innopharm is an effort to bring innovations in...
29 The first Central India Art and Design Expo - ...
... ...
1133 As the cricket fever grips the country again, ...
1134 An evening of fun, food, drinks and rooting fo...
1135 The time has come, who will take their place S...
1136 Do you want to prove that Age is not a barrier...
1137 We Invite All The Corporate Companies To Be A ...
1138 PlayTM happy to announce you that conducting o...
1139 A Mix of fun rules and cricketing skills. Afte...
1140 Shuttle Swap presents Singles, Doubles and Mix...
1141 Yonex Mavis 350 Shuttle will be used State/Nat...
1142 Light up the FIFA World Cup with Bud90 Match S...
1143 We are charmed to launch the SVSEVENTZ.COM 5-A...
1144 We corephysio FC invite you for our first foot...
1145 After completing the 2nd season of Bangalore S...
1146 As the cricket fever grips the country again, ...
1147 Introducing BOX Cricket Super 6 Corporate Cric...
1148 After the sucess of '1st Matt & Mudd T20 Leagu...
1149 Hi All, It is my pleasure to officially announ...
1150 Sign up: Get early updates, free movie voucher...
1151 About VIVO Pro Kabaddi 2018: A new season of t...
1152 The Hero Indian Super League (ISL) is India's ...
1153 Limited time offer: Free Paytm Movie Voucher w...
1154 The 5th edition of the Indian Super League is ...
1155 Calling all Jamshedpur FC fans! Here's your ch...
1156 Empower yourself and progress towards a health...
1157 Making people happy when they feel that its en...
1158 LOVE YOGA ?- but too busy with work during the...
1159 The coolest way to tour the city ! Absorb the ...
1160 Ready to be a part of India's Biggest Walkatho...
1161 The event will comprise of the following Open ...
1162 RUN FOR CANCER CHILDREN On world Cancer Day 3r...
event_name
0 Josh Talks Hyderabad 2018
1 Guitar Night With Ashmik Patil
2 Book Review - August 2018 - 2
3 Csaw'18
4 Blood donation camp
5 Rajasthan Youth Innovation and Technical Intel...
6 Goa – Fun All the Way!!! - Mom N Kids
7 The AnshUdhami Project LIVE at Tales & Spirits...
8 Friday Fiesta featuring Pearl
9 FESTIVE HUES
10 Nagpur
11 Yo Yo Deep SP The Rapper
12 The Spiritual Makeover
13 Goa Fun All the Way - Women Only group Tour
14 MS Project 2016 - A one day seminar
15 World Conference Next Generation Testing
16 Weekend Booster - Happy Hour
17 Ladies Only Camping : Freedom To Travel (Seaso...
18 Special superstar
19 Malaysian Batik Workshop
20 EQ Enhancement Course (5-10 years)
21 CS:GO Tournament 2018 - PPGL
22 Auto Mall at Mantri Square Bangalore
23 A Seminar by Ojas Rajani (Bollywood celebrity ...
24 rishikesh katti greatest Spirituality guru of ...
25 free BMD camp held on 26 jan 2018
26 Miss and Mrs Bhopal Madhya Pradesh India World...
27 USA, Canada & Singapore Application Days 2018
28 Innopharm 3
29 Kalasrishti Art and Design Expo
... ...
1133 Asia cup live screening at la casa Brewery+ ki...
1134 Asia Cup 2018 live screening at La Casa Brewer...
1135 FIFA FINAL AT KORAMANGALA TETTO - With #fifa#f...
1136 Womenasia Indoor Cricket Championship
1137 Switch Hit Corporate Cricket Tournament
1138 PlayTM Sports Arena Box Cricket league
1139 The Box Cricket League Edition II (16-17-18 No...
1140 Shuttle Swap Badminton Tournament - With Singl...
1141 SPARK BADMINTON LEAGUE - OCT 14th 2018
1142 Bud90 Match Screenings at Loft38
1143 5 A-Side Football Tournament
1144 5 vs 5 Football league - With Back 2 Track events
1145 Bangalore Sports Carnival Table Tennis Juniors...
1146 Asia cup live screening at la casa Brewery+ ki...
1147 Super 6 Corporate Cricket League
1148 Coolulu is organizing MATT & MUD T20 Cricket L...
1149 United Sportzs Pure Corporate Cricket season-10
1150 Sign up for updates on the VIVO Pro Kabaddi Se...
1151 VIVO Pro Kabaddi - UP Yoddha vs Patna Pirates ...
1152 HERO Indian Super League 2018-19: Kerala Blast...
1153 HERO ISL: FC Goa Memberships
1154 Hero Indian Super League 2018-19: Delhi Dynamo...
1155 HERO Indian Super League 2018-19: Jamshedpur F...
1156 Yoga Therapy Classes in Bangalore
1157 Saree Walkathon
1158 Weekend Yoga Teachers Training Program
1159 Bangalore Walks
1160 Oxfam Trailwalker Bengaluru
1161 TAD Pune 2018 (Triathlon Aquathlon Duathlon)
1162 RUN FOR CANCER CHILDREN
[1163 rows x 241 columns], event_name category \
0 Musical Camping at Dahanu Chiku farm outdoor
1 Adventure Camping at Wada outdoor
2 Kaas Plateau Tour outdoor
3 Pawna Lake Camping, kevre, Lonavala outdoor
4 Night Trek and Camping at Korigad Fort outdoor
5 PARAMOTORING outdoor
6 WATERFALL TREK & BEACH CAMPING (NAGALAPURAM: N... outdoor
7 Happiest Land On Earth - Bhutan outdoor
8 4 Days serial hiking in Sahyadris - Sep 29 to ... outdoor
9 Ride To Valparai outdoor
10 Dzongri Trek - Gateway to Kanchenjunga Mountain outdoor
11 Skandagiri Night Trek With Camping outdoor
12 Kalsubai Trek | Plan The Unplanned outdoor
13 Bike N Hike Skandagiri outdoor
14 Unplanned Stories - Episode 6 - Travel Tales outdoor
15 Feast on authentic flavors from Goa! outdoor
16 The Boot Camp outdoor
17 The HandleBards: Romeo and Juliet at Ranga Sha... outdoor
18 Workshop on Metagenomic Sequencing on the Grid... Science & Technology
19 Aerovision Science & Technology
20 Electric Vehicle Technology Workshop Science & Technology
21 BPM Strategy Summit Science & Technology
22 Summit of Interior Designers & Architecture Science & Technology
23 SMART ASIA India Expo& Summit Science & Technology
24 A Smart City Life Exhibition Science & Technology
25 OPEN SOURCE INDIA Science & Technology
26 SolarRoofs India Bangalore Science & Technology
27 International Conference on Innovative Researc... Science & Technology
28 International Conference on Business Managemen... Science & Technology
29 DevOn Summit Bangalore - Digital Transformations Science & Technology
.. ... ...
144 Asia cup live screening at la casa Brewery+ ki... Sports & Fitness
145 Asia Cup 2018 live screening at La Casa Brewer... Sports & Fitness
146 FIFA FINAL AT KORAMANGALA TETTO - With #fifa#f... Sports & Fitness
147 Womenasia Indoor Cricket Championship Sports & Fitness
148 Switch Hit Corporate Cricket Tournament Sports & Fitness
149 PlayTM Sports Arena Box Cricket league Sports & Fitness
150 The Box Cricket League Edition II (16-17-18 No... Sports & Fitness
151 Shuttle Swap Badminton Tournament - With Singl... Sports & Fitness
152 SPARK BADMINTON LEAGUE - OCT 14th 2018 Sports & Fitness
153 Bud90 Match Screenings at Loft38 Sports & Fitness
..            ...                                                  ...
170 Bangalore Walks Sports & Fitness
171 Oxfam Trailwalker Bengaluru Sports & Fitness
172 TAD Pune 2018 (Triathlon Aquathlon Duathlon) Sports & Fitness
173 RUN FOR CANCER CHILDREN Sports & Fitness
category_id description \
0 19 Dear All Camping Lovers, Come take camping exp...
1 19 Our Adventure campsite at Wada is developed wi...
2 19 Type: Eco Tour Height: 3937 FT above MSL (Appr...
3 19 Our Pawna Lake Camping site is located near Ke...
4 19 Type: Hill Fort Height: 3050 Feet above MSL (A...
23 22 Making 'Smart Cities Mission' a Reality The SM...
24 22 A Smart City Life A Smart City Life Exhibition...
25 22 Asia's No. 1 Convention on Open Source Started...
26 22 The conference will offer an excellent platfor...
27 22 Provides a leading forum for the presentation ...
28 22 Provide opportunity for the global participant...
29 22 The biggest event about Digital Transformation...
.. ... ...
144 23 As the cricket fever grips the country again, ...
145 23 An evening of fun, food, drinks and rooting fo...
146 23 The time has come, who will take their place S...
147 23 Do you want to prove that Age is not a barrier...
148 23 We Invite All The Corporate Companies To Be A ...
149 23 PlayTM happy to announce you that conducting o...
150 23 A Mix of fun rules and cricketing skills. Afte...
151 23 Shuttle Swap presents Singles, Doubles and Mix...
152 23 Yonex Mavis 350 Shuttle will be used State/Nat...
153 23 Light up the FIFA World Cup with Bud90 Match S...
154 23 We are charmed to launch the SVSEVENTZ.COM 5-A...
155 23 We corephysio FC invite you for our first foot...
156 23 After completing the 2nd season of Bangalore S...
157 23 As the cricket fever grips the country again, ...
158 23 Introducing BOX Cricket Super 6 Corporate Cric...
159 23 After the sucess of '1st Matt & Mudd T20 Leagu...
160 23 Hi All, It is my pleasure to officially announ...
161 23 Sign up: Get early updates, free movie voucher...
162 23 About VIVO Pro Kabaddi 2018: A new season of t...
163 23 The Hero Indian Super League (ISL) is India's ...
164 23 Limited time offer: Free Paytm Movie Voucher w...
165 23 The 5th edition of the Indian Super League is ...
166 23 Calling all Jamshedpur FC fans! Here's your ch...
167 23 Empower yourself and progress towards a health...
168 23 Making people happy when they feel that its en...
169 23 LOVE YOGA ?- but too busy with work during the...
170 23 The coolest way to tour the city ! Absorb the ...
171 23 Ready to be a part of India's Biggest Walkatho...
172 23 The event will comprise of the following Open ...
173 23 RUN FOR CANCER CHILDREN On world Cancer Day 3r...
Unnamed: 4 Unnamed: 5
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
24 NaN NaN
25 NaN NaN
26 NaN NaN
27 NaN NaN
28 NaN NaN
29 NaN NaN
.. ... ...
144 NaN NaN
145 NaN NaN
146 NaN NaN
147 NaN NaN
148 NaN NaN
149 NaN NaN
[174 rows x 6 columns]]
What's wrong with a simple:
pd.concat([df1, df2], ignore_index=True).to_csv('File.csv', index=False)
This will work if they have the same columns.
A more verbose way to extract specific columns would be:
(pd.concat([df1[['event_name', 'category', 'category_id', 'description']],
            df2[['event_name', 'category', 'category_id', 'description']]],
           ignore_index=True)
 .to_csv('File.csv', index=False))
Separate notes:
You are initializing a DataFrame with just column names and then writing that empty frame to the CSV; the result of pd.concat is never assigned.
Why are you using .squeeze() to convert each frame to a 1-D dataset?
A corrected version is sketched below.
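A corrected version of the original script along those lines (file names from the question; a sketch, not the only way):
import pandas as pd

data = pd.read_csv("dataset.csv")
data1 = pd.read_csv("dataset_new.csv")

cols = ['event_name', 'category', 'category_id', 'description']
# keep only the shared columns, stack the rows, and write the result once
appended = pd.concat([data[cols], data1[cols]], ignore_index=True)
appended.to_csv('dataset_append.csv', encoding='utf-8', index=False)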