This is my code:
import pandas as pd
import requests

# first page: write the CSV with a header row
response = requests.get('https://www.magicbricks.com/mbsrp/propertySearch.html?editSearch=Y&category=S&propertyType=10002,10003,10021,10022,10001,10017,10000&bedrooms=11700,11701,11702,11703,11704,11705,11706,11707,11708,11709,11710&city=4320&page=2&groupstart=30&offset=0&maxOffset=248&sortBy=premiumRecent&postedSince=-1&pType=10002,10003,10021,10022,10001,10017,10000&isNRI=N&multiLang=en')
df = pd.json_normalize(response.json()['resultList'], max_level=0)
df.to_csv('property_data.csv', mode='a')

# remaining pages: append rows without repeating the header
for i in range(3, 102):
    response = requests.get(f'https://www.magicbricks.com/mbsrp/propertySearch.html?editSearch=Y&category=S&propertyType=10002,10003,10021,10022,10001,10017,10000&bedrooms=11700,11701,11702,11703,11704,11705,11706,11707,11708,11709,11710&city=4320&page={i}&groupstart={30 * (i - 1)}&offset=0&maxOffset=248&sortBy=premiumRecent&postedSince=-1&pType=10002,10003,10021,10022,10001,10017,10000&isNRI=N&multiLang=en')
    df = pd.json_normalize(response.json()['resultList'])
    df.to_csv('property_data.csv', mode='a', header=False)
An issue I am encountering while appending to the property_data.csv file is that, because certain keys are missing from some of the JSON records, data from different columns gets mixed up. For example:
df = pd.read_csv("property_data.csv", on_bad_lines='skip')
df['floorNo'].unique()
produces the result
array([nan, '9', '38', '12', '10', '17', '16', '48', '28', '3', '15',
'35', '5', '30', '21', '27', '45', '11', 'Lift', 'Air Conditioned',
'Private jaccuzi', 'Jogging and Strolling Track', 'North - East',
'South -West', 'East', 'West', 'South', 'North', 'South - East',
'ABAMNC', 'MCGM', 'Water Availability 24 Hours Available',
'Water Availability 6 Hours Available',
'Water Availability 12 Hours Available',
'Water Availability 1 Hour Available', '800', '355', '550', '507',
'250', '750', '1310', '1620', '1295', '650', '1012', '600', '2000',
'555', '775', '920', '1095', '882', '1500', '630', '1100', '699',
'900', '540', 'Intercom Facility', '71353', '68345', '77877',
'56356', '54016', '68703', '64595',
'ff260c41-dd5c-44a1-866f-c9cedca191e8graj6227',
'96de2716-6a3a-488f-a34c-f01563544696ghar1657',
'5d6b1f8e-73a4-4439-a956-c4c8367e8bd7raje3800',
'9f75a3cd-f749-42aa-bc77-39ef9d2ad8c7yoge1410',
'0955f6a3-f09e-417b-a2e1-8332b6d467f0vaid3337',
'b2ce8fad-2392-4b26-95d9-edaf4bf46cddkhuz5152',
'6e729b73-e20b-4b43-832d-30cef3094fc4chan8207',
'b09c339d-5667-4095-a756-b54356a44ddckash6262',
'50abb073-dfdd-4e24-b544-712b6d8bcc43bina7067',
'58d5ac1c-cdb3-4542-96c3-581b3a9fb8d1meha4330',
'35d87e1a-5dfd-4c8f-9192-0acbb8d36d60hari1100',
'53d875a6-1074-443e-b7d7-a1cb55f63d6anila1380',
'3cad0738-bdf0-4f1b-81f3-4bb890a16e77scsh5048', '+91-81XXXXXXX88',
...
'prathamesh apartments, Matunga East, Mumbai', 'Bhandup',
'40 X 100 sqft sqft', 'Freehold', 'Co-operative Society',
'Power Back Up', 'Near Sitladevi Temple, Kotak Mahindra Bank',
'Near Lodha World School', 'Newly Constructed Property',
'Near Mohan Baug Banquet hall', 'Smart Home'], dtype=object)
How do I make it so that the data is appended to the correct column in the CSV file, and, if the file does not yet have a certain column, a new one is created with the values of the other entries set to null?
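One way to avoid the misalignment, sketched below under the assumption that the URL parameters stay as above, is to collect each page into a list of DataFrames and concatenate once at the end; pd.concat aligns frames by column name, creating any missing columns and filling them with NaN:

import pandas as pd
import requests

base_url = ('https://www.magicbricks.com/mbsrp/propertySearch.html?editSearch=Y'
            '&category=S&propertyType=10002,10003,10021,10022,10001,10017,10000'
            '&bedrooms=11700,11701,11702,11703,11704,11705,11706,11707,11708,11709,11710'
            '&city=4320&page={page}&groupstart={start}&offset=0&maxOffset=248'
            '&sortBy=premiumRecent&postedSince=-1'
            '&pType=10002,10003,10021,10022,10001,10017,10000&isNRI=N&multiLang=en')

frames = []
for i in range(2, 102):
    response = requests.get(base_url.format(page=i, start=30 * (i - 1)))
    frames.append(pd.json_normalize(response.json()['resultList'], max_level=0))

# concat aligns on column names: columns missing from a page become NaN
pd.concat(frames, ignore_index=True).to_csv('property_data.csv', index=False)

Writing the file once with index=False also avoids the stray index column that the repeated mode='a' writes produce.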
There are dozens of questions here on SO with a title very similar to this one - but most of them seem related to some iframe, which prevents Selenium from accessing the intended tag, node, or whatever.
In my case, I'm trying to access this site. All I want is to read the data in a table - it is easy to identify, given that it is inside a div with a very particular ID. The table is also simple to read. Despite that, there is this dynamic-comp tag, which seems to be my stumbling block - I can access every element outside of it, and no element inside it at all - be it by ID, class, tag name, whatever.
How do I handle this? Is this some kind of special IFrame? I'd have tried the .switchTo approach, but the dynamic-comp elements have no ID or class, just the tag alone.
EDIT: I also tried adding wait = WebDriverWait(driver,20), just in case.
Didn't work. My goal is to iterate through the dates using the date selector, so I intend to read the table multiple times.
The table you need is the last one inside #oReportCell. To get it you can use the XPath (//td[@id='oReportCell']//table)[last()], or the CSS selector #oReportCell table and take the last match.
To get the table with requests and BeautifulSoup (@PedroLobito's solution proposal), you can use pandas to collect and save the data:
import requests
from bs4 import BeautifulSoup

params = (
    ('path', 'conteudo/txcred/Reports/TaxasCredito-Consolidadas-porTaxasAnuais-Historico.rdl'),
    ('parametros', ''),
    ('exibeparametros', 'true'),
)
response = requests.get('https://www.bcb.gov.br/api/relatorio/pt-br/contaspub', params=params)

# the report HTML is embedded in the JSON response under 'conteudo'
page = BeautifulSoup(response.json()['conteudo'], 'lxml')
table = page.select('#oReportCell table')[-1]
for tr in table.find_all('tr'):
    row_values = [td.text.strip() for td in tr.find_all('td')]
    print(row_values)
Output:
['', '', '', '']
['', '', 'Taxas de juros']
['Posição', 'Instituição', '% a.m.', '% a.a.']
['1', 'SINOSSERRA S/A - SCFI', '0,47', '5,76']
['2', 'GRAZZIOTIN FINANCIADORA SA CFI', '0,81', '10,13']
['3', 'BCO CATERPILLAR S.A.', '0,91', '11,44']
['4', 'BCO DE LAGE LANDEN BRASIL S.A.', '0,91', '11,54']
['5', 'BCO VOLKSWAGEN S.A', '0,93', '11,76']
['6', 'BCO KOMATSU S.A.', '1,02', '12,92']
['7', 'BCO SANTANDER (BRASIL) S.A.', '1,13', '14,43']
['8', 'BCO VOLVO BRASIL S.A.', '1,16', '14,80']
['9', 'BCO DO ESTADO DO RS S.A.', '1,32', '17,07']
['10', 'BV FINANCEIRA S.A. CFI', '1,39', '18,05']
['11', 'FINANC ALFA S.A. CFI', '1,42', '18,43']
['12', 'AYMORÉ CFI S.A.', '1,44', '18,75']
['13', 'BCO RIBEIRAO PRETO S.A.', '1,46', '19,05']
['14', 'BCO BRADESCO S.A.', '1,47', '19,15']
['15', 'TODESCREDI S/A - CFI', '1,72', '22,75']
['16', 'CAIXA ECONOMICA FEDERAL', '2,46', '33,84']
['17', 'SIMPALA S.A. CFI', '2,50', '34,48']
['18', 'LEBES FINANCEIRA CFI SA', '3,12', '44,60']
['19', 'BCO RENDIMENTO S.A.', '3,15', '45,06']
['20', 'BECKER FINANCEIRA SA - CFI', '3,52', '51,47']
['21', 'BCO DO BRASIL S.A.', '3,61', '53,08']
['22', 'BCO CETELEM S.A.', '3,70', '54,65']
['23', 'LECCA CFI S.A.', '3,87', '57,65']
['24', 'HS FINANCEIRA', '3,98', '59,65']
['25', 'CREDIARE CFI S.A.', '4,17', '63,32']
['26', 'KREDILIG S.A. - CFI', '4,42', '68,06']
['27', 'CENTROCRED S.A. CFI', '4,60', '71,61']
['28', 'SENFF S.A. - CFI', '4,79', '75,31']
['29', 'ZEMA CFI S/A', '4,81', '75,68']
['30', 'VIA CERTA FINANCIADORA S.A. - CFI', '5,32', '86,31']
['31', 'OMNI BANCO S.A.', '5,35', '86,93']
['32', 'OMNI SA CFI', '5,42', '88,47']
['33', 'LUIZACRED S.A. SCFI', '5,55', '91,16']
['34', 'BCO HONDA S.A.', '5,67', '93,89']
['35', 'BCO LOSANGO S.A.', '5,71', '94,70']
['36', 'BANCO SEMEAR', '6,00', '101,13']
['37', 'NEGRESCO S.A. - CFI', '6,24', '106,69']
['38', 'GAZINCRED S.A. SCFI', '6,60', '115,24']
['39', 'PORTOCRED S.A. - CFI', '7,03', '125,93']
['40', 'AGORACRED S/A SCFI', '7,27', '132,10']
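To collect and save the rows with pandas, as mentioned above, a minimal sketch (continuing from the snippet; the output filename is just an example) is to hand the table markup to pd.read_html:

import pandas as pd

# pd.read_html parses the <table> markup into a list of DataFrames (uses lxml, already required above)
df = pd.read_html(str(table))[0]
df.to_csv('taxas_credito.csv', index=False)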
To locate and print the items from the Posição, Instituição, % a.m. and % a.a. columns, you need to induce WebDriverWait for visibility_of_all_elements_located(), and you can use the following locator strategy:
Using XPATH:
driver.get('https://www.bcb.gov.br/estatisticas/reporttxjuros?path=conteudo%2Ftxcred%2FReports%2FTaxasCredito-Consolidadas-porTaxasAnuais-Historico.rdl&nome=Hist%C3%B3rico%20Posterior%20a%2001%2F01%2F2012&exibeparametros=true')
print([my_elem.text for my_elem in WebDriverWait(driver, 60).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[text()='Instituição']//following::tr[@valign='top']//td/div")))])
Note: you have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Console Output:
['1', 'SINOSSERRA S/A - SCFI', ' 0,47', ' 5,76', '2', 'GRAZZIOTIN FINANCIADORA SA CFI', ' 0,81', ' 10,13', '3', 'BCO CATERPILLAR S.A.', ' 0,91', ' 11,44', '4', 'BCO DE LAGE LANDEN BRASIL S.A.', ' 0,91', ' 11,54', '5', 'BCO VOLKSWAGEN S.A', ' 0,93', ' 11,76', '6', 'BCO KOMATSU S.A.', ' 1,02', ' 12,92', '7', 'BCO SANTANDER (BRASIL) S.A.', ' 1,13', ' 14,43', '8', 'BCO VOLVO BRASIL S.A.', ' 1,16', ' 14,80', '9', 'BCO DO ESTADO DO RS S.A.', ' 1,32', ' 17,07', '10', 'BV FINANCEIRA S.A. CFI', ' 1,39', ' 18,05', '11', 'FINANC ALFA S.A. CFI', ' 1,42', ' 18,43', '12', 'AYMORÉ CFI S.A.', ' 1,44', ' 18,75', '13', 'BCO RIBEIRAO PRETO S.A.', ' 1,46', ' 19,05', '14', 'BCO BRADESCO S.A.', ' 1,47', ' 19,15', '15', 'TODESCREDI S/A - CFI', ' 1,72', ' 22,75', '16', 'CAIXA ECONOMICA FEDERAL', ' 2,46', ' 33,84', '17', 'SIMPALA S.A. CFI', ' 2,50', ' 34,48', '18', 'LEBES FINANCEIRA CFI SA', ' 3,12', ' 44,60', '19', 'BCO RENDIMENTO S.A.', ' 3,15', ' 45,06', '20', 'BECKER FINANCEIRA SA - CFI', ' 3,52', ' 51,47', '21', 'BCO DO BRASIL S.A.', ' 3,61', ' 53,08', '22', 'BCO CETELEM S.A.', ' 3,70', ' 54,65', '23', 'LECCA CFI S.A.', ' 3,87', ' 57,65', '24', 'HS FINANCEIRA', ' 3,98', ' 59,65', '25', 'CREDIARE CFI S.A.', ' 4,17', ' 63,32', '26', 'KREDILIG S.A. - CFI', ' 4,42', ' 68,06', '27', 'CENTROCRED S.A. CFI', ' 4,60', ' 71,61', '28', 'SENFF S.A. - CFI', ' 4,79', ' 75,31', '29', 'ZEMA CFI S/A', ' 4,81', ' 75,68', '30', 'VIA CERTA FINANCIADORA S.A. - CFI', ' 5,32', ' 86,31', '31', 'OMNI BANCO S.A.', ' 5,35', ' 86,93', '32', 'OMNI SA CFI', ' 5,42', ' 88,47', '33', 'LUIZACRED S.A. SCFI', ' 5,55', ' 91,16', '34', 'BCO HONDA S.A.', ' 5,67', ' 93,89', '35', 'BCO LOSANGO S.A.', ' 5,71', ' 94,70', '36', 'BANCO SEMEAR', ' 6,00', ' 101,13', '37', 'NEGRESCO S.A. - CFI', ' 6,24', ' 106,69', '38', 'GAZINCRED S.A. SCFI', ' 6,60', ' 115,24', '39', 'PORTOCRED S.A. - CFI', ' 7,03', ' 125,93', '40', 'AGORACRED S/A SCFI', ' 7,27', ' 132,10']
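Continuing from the snippet above: the locator returns one flat list of cell texts, so you may want to regroup it into rows of four before using it:

cells = [my_elem.text.strip() for my_elem in WebDriverWait(driver, 60).until(
    EC.visibility_of_all_elements_located((By.XPATH,
        "//div[text()='Instituição']//following::tr[@valign='top']//td/div")))]
# regroup the flat list into (Posição, Instituição, % a.m., % a.a.) rows
rows = [cells[i:i + 4] for i in range(0, len(cells), 4)]
for posicao, instituicao, mensal, anual in rows:
    print(posicao, instituicao, mensal, anual)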
I am making a username scraper and I really can't understand why the HTML is 'disappearing' when I parse it. Let's take this site for example:
http://www.lolking.net/leaderboards#/eune/1
See how there is a tbody with a bunch of table rows in it?
Well, when I parse it and print it to the shell, the tbody is empty:
<div style="background: #333; box-shadow: 0 0 2px #000; padding: 10px;">
<table class="lktable" id="leaderboard_table" width="100%">
<thead>
<tr>
<th style="width: 80px;">
Rank
</th>
<th style="width: 80px;">
Change
</th>
<th style="width: 100px;">
Tier
</th>
<th>
Summoner
</th>
<th style="width: 150px;">
Top Champions
</th>
</tr>
</thead>
<tbody>
</tbody>
</table>
</div>
</div>
Why is this happening and how can I fix it?
This site needs JavaScript to work. JavaScript is used to populate the table by forming a web request, which probably points to a back-end API. This means that the "raw" HTML, without the effects of any JavaScript, has an empty table.
We can actually see this empty table if we visit the site with JavaScript disabled.
BeautifulSoup doesn't cause this JavaScript to execute. Instead, have a look at some alternative libraries which do, such as the more advanced Selenium.
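A minimal sketch of that Selenium route, assuming a local Chrome driver and using the table id from the markup above:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("http://www.lolking.net/leaderboards#/eune/1")
# wait until the site's JavaScript has filled the previously empty tbody
rows = WebDriverWait(driver, 20).until(
    EC.presence_of_all_elements_located((By.CSS_SELECTOR, "#leaderboard_table tbody tr")))
for row in rows:
    print([td.text for td in row.find_elements(By.TAG_NAME, "td")])
driver.quit()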
You can get all the data in JSON format; all you need to do is parse a value from a script tag inside the original page source and pass it to "http://www.lolking.net/leaderboards/some_value_here/eune/1.json":
from bs4 import BeautifulSoup
import requests
import re

# the page's own JavaScript fetches '/leaderboards/<value>/...'; pull that value out
patt = re.compile(r"\$\.get\('/leaderboards/(\w+)/")
js = "http://www.lolking.net/leaderboards/{}/eune/1.json"

soup = BeautifulSoup(requests.get("http://www.lolking.net/leaderboards#/eune/1").content, "html.parser")
script = soup.find("script", text=re.compile(r"\$\.get\('/leaderboards/"))
val = patt.search(script.text).group(1)
data = requests.get(js.format(val)).json()
data gives you JSON that contains all the player info, like:
{'data': [{'division': '1',
'global_ranking': '12',
'league_points': '1217',
'lks': '2961',
'losses': '31',
'most_played_champions': [{'assists': '238',
'champion_id': '236',
'creep_score': '7227',
'deaths': '131',
'kills': '288',
'losses': '5',
'played': '39',
'wins': '34'},
{'assists': '209',
'champion_id': '429',
'creep_score': '5454',
'deaths': '111',
'kills': '204',
'losses': '3',
'played': '27',
'wins': '24'},
{'assists': '155',
'champion_id': '81',
'creep_score': '4800',
'deaths': '103',
'kills': '168',
'losses': '8',
'played': '26',
'wins': '18'}],
'name': 'Sadastyczny',
'previous_ranking': '2',
'profile_icon_id': 7,
'ranking': '1',
'region': 'eune',
'summoner_id': '42893043',
'tier': '6',
'tier_name': 'CHALLENGER',
'wins': '128'},
{'division': '1',
'global_ranking': '30',
'league_points': '1128',
'lks': '2956',
'losses': '180',
'most_played_champions': [{'assists': '928',
'champion_id': '24',
'creep_score': '37601',
'deaths': '1426',
'kills': '1874',
'losses': '64',
'played': '210',
'wins': '146'},
{'assists': '501',
'champion_id': '67',
'creep_score': '16836',
'deaths': '584',
'kills': '662',
'losses': '37',
'played': '90',
'wins': '53'},
{'assists': '124',
'champion_id': '157',
'creep_score': '5058',
'deaths': '205',
'kills': '141',
'losses': '14',
'played': '28',
'wins': '14'}],
'name': 'Richor',
'previous_ranking': '1',
'profile_icon_id': 577,
'ranking': '2',
'region': 'eune',
'summoner_id': '40385818',
'tier': '6',
'tier_name': 'CHALLENGER',
'wins': '254'},
{'division': '1',
'global_ranking': '49',
'league_points': '1051',
'lks': '2953',
'losses': '47',
'most_played_champions': [{'assists': '638',
'champion_id': '117',
'creep_score': '11927',
'deaths': '99',
'kills': '199',
'losses': '7',
'played': '66',
'wins': '59'},
{'assists': '345',
'champion_id': '48',
'creep_score': '8061',
'deaths': '99',
'kills': '192',
'losses': '11',
'played': '43',
'wins': '32'},
{'assists': '161',
'champion_id': '114',
'creep_score': '5584',
'deaths': '64',
'kills': '165',
'losses': '11',
'played': '31',
'wins': '20'}],
As you can see in Chrome Dev Tools, the site sends 2 XHR requests to get the data, and displays it by using JavaScript.
Since BeautifulSoup is an HTML parser, it will not execute JavaScript. You should use a tool like Selenium, which emulates a real browser.
But in this case you might be better off using the API they use to get the data. You can easily see which URLs they get the data from by looking in the Network tab: reload the page, select XHR, and use that info to create your own requests with something like Python Requests.
I want to create a list that produces the output of:
[001,002,003,004,005]
and keeps going until 300. Having the 0's in front of the digits is essential. I tried a method such as:
a = []
for i in range(0, 3):
    for j in range(0, 10):
        for k in range(0, 10):
            a.append(i j k)
However, for obvious reasons, the append function does not behave in the manner I would like.
Do people have any suggestions on how else I could do this?
You cannot produce a list with integers that are presented with padding, no. You can produce strings with leading zeros:
a = [format(i + 1, '03d') for i in range(300)]
The format() function formats the integer to a field width of 3 characters, with leading zeros to pad out the length, as encoded by the format spec '03d'.
Demo:
>>> [format(i + 1, '03d') for i in range(300)]
['001', '002', '003', '004', '005', '006', '007', '008', '009', '010', '011', '012', '013', '014', '015', '016', '017', '018', '019', '020', '021', '022', '023', '024', '025', '026', '027', '028', '029', '030', '031', '032', '033', '034', '035', '036', '037', '038', '039', '040', '041', '042', '043', '044', '045', '046', '047', '048', '049', '050', '051', '052', '053', '054', '055', '056', '057', '058', '059', '060', '061', '062', '063', '064', '065', '066', '067', '068', '069', '070', '071', '072', '073', '074', '075', '076', '077', '078', '079', '080', '081', '082', '083', '084', '085', '086', '087', '088', '089', '090', '091', '092', '093', '094', '095', '096', '097', '098', '099', '100', '101', '102', '103', '104', '105', '106', '107', '108', '109', '110', '111', '112', '113', '114', '115', '116', '117', '118', '119', '120', '121', '122', '123', '124', '125', '126', '127', '128', '129', '130', '131', '132', '133', '134', '135', '136', '137', '138', '139', '140', '141', '142', '143', '144', '145', '146', '147', '148', '149', '150', '151', '152', '153', '154', '155', '156', '157', '158', '159', '160', '161', '162', '163', '164', '165', '166', '167', '168', '169', '170', '171', '172', '173', '174', '175', '176', '177', '178', '179', '180', '181', '182', '183', '184', '185', '186', '187', '188', '189', '190', '191', '192', '193', '194', '195', '196', '197', '198', '199', '200', '201', '202', '203', '204', '205', '206', '207', '208', '209', '210', '211', '212', '213', '214', '215', '216', '217', '218', '219', '220', '221', '222', '223', '224', '225', '226', '227', '228', '229', '230', '231', '232', '233', '234', '235', '236', '237', '238', '239', '240', '241', '242', '243', '244', '245', '246', '247', '248', '249', '250', '251', '252', '253', '254', '255', '256', '257', '258', '259', '260', '261', '262', '263', '264', '265', '266', '267', '268', '269', '270', '271', '272', '273', '274', '275', '276', '277', '278', '279', '280', '281', '282', '283', '284', '285', '286', '287', '288', '289', '290', '291', '292', '293', '294', '295', '296', '297', '298', '299', '300']
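On Python 3.6+ the same list can also be built with an f-string, using the identical '03d' format spec:

a = [f'{i + 1:03d}' for i in range(300)]  # '001', '002', ..., '300'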
You could subclass list and overload its __repr__ method to call str.zfill on each number:
class NumList(list):
    def __repr__(self):
        return '[' + ', '.join([str(x).zfill(3) for x in self]) + ']'
Demo:
>>> class NumList(list):
...     def __repr__(self):
...         return '[' + ', '.join([str(x).zfill(3) for x in self]) + ']'
...
>>> NumList([1, 2, 3, 4, 5])
[001, 002, 003, 004, 005]
>>>
To make the exact list you want, do NumList(range(1, 301)).
Note however that this does not make integers with leading zeros (as @MartijnPieters said, that is impossible). The output is still a string. All this is doing is telling Python how to display those integers when they are output to the console.
I have a very long HTML file that looks exactly like this - html file. I want to be able to parse the file such that I get the information in the form of a tuple.
Example:
<tr>
    <td>Cech</td>
    <td>Chelsea</td>
    <td>30</td>
    <td>£6.4</td>
</tr>
The above information will look like ("Cech", "Chelsea", 30, 6.4). However, if you look closely at the link I posted, the HTML example comes under a <h2>Goalkeepers</h2> tag. I need this tag too, so the resulting tuple will look like ("Cech", "Chelsea", 30, 6.4, "Goalkeepers"). Further down the file, a bunch of players come under the <h2> tags Midfielders, Defenders and Forwards.
I tried using the beautifulsoup and nltk libraries and got lost. So now I have the following code:
import nltk
from urllib import urlopen
url = "http://fantasy.premierleague.com/player-list/"
html = urlopen(url).read()
raw = nltk.clean_html(html)
print raw
which just strips the HTML file of all its tags and gives something like this:
Cech
Chelsea
30
£6.4
Although I could write a bad piece of code that reads every line and assigns it to a tuple, I cannot come up with any solution that also incorporates the player position (the string present in the <h2> tags). Any solutions / suggestions will be greatly appreciated.
The reason I am inclined towards using tuples is so that I can use unpacking, and I plan on populating a MySQL table with the unpacked values.
from bs4 import BeautifulSoup
from pprint import pprint
import requests

# fetch the page from the question
html = requests.get("http://fantasy.premierleague.com/player-list/").content
soup = BeautifulSoup(html)
h2s = soup.select("h2")        # get all h2 elements
tables = soup.select("table")  # get all tables

first = True
title = ""
players = []
for i, table in enumerate(tables):
    if first:
        # every h2 element has 2 tables: table size = 8, h2 size = 4,
        # so for every 2 tables there is 1 h2
        title = h2s[int(i / 2)].text
    for tr in table.select("tr"):
        player = (title,)  # create a player
        for td in tr.select("td"):
            player = player + (td.text,)  # add the td info to the player
        if len(player) > 1:
            # if the tr actually contains a player (not only the title), add it
            players.append(player)
    first = not first

pprint(players)
output:
[('Goalkeepers', 'Cech', 'Chelsea', '30', '£6.4'),
('Goalkeepers', 'Hart', 'Man City', '28', '£6.4'),
('Goalkeepers', 'Krul', 'Newcastle', '21', '£5.0'),
('Goalkeepers', 'Ruddy', 'Norwich', '25', '£5.0'),
('Goalkeepers', 'Vorm', 'Swansea', '19', '£5.0'),
('Goalkeepers', 'Stekelenburg', 'Fulham', '6', '£4.9'),
('Goalkeepers', 'Pantilimon', 'Man City', '0', '£4.9'),
('Goalkeepers', 'Lindegaard', 'Man Utd', '0', '£4.9'),
('Goalkeepers', 'Butland', 'Stoke City', '0', '£4.9'),
('Goalkeepers', 'Foster', 'West Brom', '13', '£4.9'),
('Goalkeepers', 'Viviano', 'Arsenal', '0', '£4.8'),
('Goalkeepers', 'Schwarzer', 'Chelsea', '0', '£4.7'),
('Goalkeepers', 'Boruc', 'Southampton', '42', '£4.7'),
('Goalkeepers', 'Myhill', 'West Brom', '15', '£4.5'),
('Goalkeepers', 'Fabianski', 'Arsenal', '0', '£4.4'),
('Goalkeepers', 'Gomes', 'Tottenham', '0', '£4.4'),
('Goalkeepers', 'Friedel', 'Tottenham', '0', '£4.4'),
('Goalkeepers', 'Henderson', 'West Ham', '0', '£4.0'),
('Defenders', 'Baines', 'Everton', '43', '£7.7'),
('Defenders', 'Vertonghen', 'Tottenham', '34', '£7.0'),
('Defenders', 'Taylor', 'Cardiff City', '14', '£4.5'),
('Defenders', 'Zverotic', 'Fulham', '0', '£4.5'),
('Defenders', 'Davies', 'Hull City', '28', '£4.5'),
('Defenders', 'Flanagan', 'Liverpool', '0', '£4.5'),
('Defenders', 'Dawson', 'West Brom', '0', '£3.9'),
('Defenders', 'Potts', 'West Ham', '0', '£3.9'),
('Defenders', 'Spence', 'West Ham', '0', '£3.9'),
('Midfielders', 'Özil', 'Arsenal', '24', '£10.6'),
('Midfielders', 'Redmond', 'Norwich', '20', '£5.0'),
('Midfielders', 'Mavrias', 'Sunderland', '5', '£5.0'),
('Midfielders', 'Gera', 'West Brom', '0', '£5.0'),
('Midfielders', 'Essien', 'Chelsea', '0', '£4.9'),
('Midfielders', 'Brown', 'West Brom', '0', '£4.3'),
('Forwards', 'van Persie', 'Man Utd', '24', '£13.9'),
('Forwards', 'Cornelius', 'Cardiff City', '1', '£5.4'),
('Forwards', 'Elmander', 'Norwich', '7', '£5.4'),
('Forwards', 'Murray', 'Crystal Palace', '0', '£5.3'),
('Forwards', 'Vydra', 'West Brom', '2', '£5.3'),
('Forwards', 'Proschwitz', 'Hull City', '0', '£4.3')]
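Since the stated goal is to populate a MySQL table by unpacking these tuples, here is a minimal sketch using pymysql's executemany; the connection parameters and the players table schema are placeholders, and the values are still strings (e.g. '£6.4'), so clean them first if your columns are numeric:

import pymysql

conn = pymysql.connect(host='localhost', user='user', password='secret', db='fantasy')
with conn.cursor() as cur:
    # one %s placeholder per element of each (position, name, club, points, price) tuple
    cur.executemany(
        "INSERT INTO players (position, name, club, points, price) VALUES (%s, %s, %s, %s, %s)",
        players)
conn.commit()
conn.close()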