Get data with boundaries using regex - python

I would like to get the labels and data from this script using regex. I have tried this:
pattern = re.compile(r'/blabels: ],/b')
print(pattern)
result = soup.find("script", text=pattern)
But I get None when using word boundaries.
This is the soup:
<script>
Chart.defaults.LineWithLine = Chart.defaults.line;
new Chart(document.getElementById("chart-overall-mentions"), {
type: 'LineWithLine',
data: {
labels: [1637005508000,1637006108000,1637006708000,1637007308000,1637007908000,1637008508000,1637009108000,1637009708000,1637010308000,1637010908000,1637011508000,1637012108000,1637012708000,1637013308000,1637013908000,1637014508000,1637015108000,1637015708000,1637016308000,1637016908000,1637017508000,1637018108000,1637018708000,1637019308000,1637019908000,1637020508000,1637021108000,1637021708000,1637022308000,1637022908000,1637023508000,1637024108000,1637024708000,1637025308000,1637025908000,1637026508000,1637027108000,1637027708000,1637028308000,1637028908000,1637029508000,1637030108000,1637030708000,1637031308000,1637031908000,1637032508000,1637033108000,1637033708000,1637034308000,1637034908000,1637035508000,1637036108000,1637036708000,1637037308000,1637037908000,1637038508000,1637039108000,1637039708000,1637040308000,1637040908000,1637041508000,1637042108000,1637042708000,1637043308000,1637043908000,1637044508000,1637045108000,1637045708000,1637046308000,1637046908000,1637047508000,1637048108000,1637048708000,1637049308000,1637049908000,1637050508000,1637051108000,1637051708000,1637052308000,1637052908000,1637053508000,1637054108000,1637054708000,1637055308000,1637055908000,1637056508000,1637057108000,1637057708000,1637058308000,1637058908000,1637059508000,1637060108000,1637060708000,1637061308000,1637061908000,1637062508000,1637063108000,1637063708000,1637064308000,1637064908000,1637065508000,1637066108000,1637066708000,1637067308000,1637067908000,1637068508000,1637069108000,1637069708000,1637070308000,1637070908000,1637071508000,1637072108000,1637072708000,1637073308000,1637073908000,1637074508000,1637075108000,1637075708000,1637076308000,1637076908000,1637077508000,1637078108000,1637078708000,1637079308000,1637079908000,1637080508000,1637081108000,1637081708000,1637082308000,1637082908000,1637083508000,1637084108000,1637084708000,1637085308000,1637085908000,1637086508000,1637087108000,1637087708000,1637088308000,1637088908000,1637089508000,1637090108000,163
7090708000,1637091308000],
datasets: [{
data: [13,10,20,26,21,23,24,21,24,35,25,31,42,24,24,20,23,22,17,23,30,11,16,20,9,10,22,10,19,16,15,16,17,19,10,20,24,14,19,15,13,9,13,17,20,16,15,21,18,25,15,14,16,15,16,14,14,21,10,9,5,9,9,13,14,9,9,18,15,11,11,6,12,14,19,17,16,11,20,14,21,13,15,12,14,10,20,16,25,17,17,11,23,11,13,11,19,10,17,19,10,20,22,19,19,27,28,18,20,22,18,16,17,18,14,17,19,18,20,11,13,20,15,15,18,14,13,14,14,11,19,14,14,11,11,15,26,12,15,15,11,4,3,6],
pointRadius: 0,
borderColor: "#666",
fill: true,
yAxisID:'yAxis1'
},
]
},
options: {
tooltips: {
mode: 'index',
bodyFontSize: 18,
intersect: false,
titleFontSize: 16,
},
.
.
.
</script>

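Here is how you can do that. Note that /b in your pattern is a literal slash followed by b, not the word boundary \b, and "labels: ]," only matches those exact characters, so the pattern never matches the script text.
Get the script tag - you can use a regex, too, if that is the only way to obtain that node
Then run a regex search against the node text/string to get your final output.
You can use
import re
# Get the script node with text matching your pattern
item = soup.find("script", text=re.compile(r'\blabels:\s*\['))
# Search the node text for the bracketed list that follows "labels:"
match = re.search(r'\blabels:\s*\[([^][]*)]', item.string)
if match:
    labels = map(int, match.group(1).split(','))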
Output:
>>> print(list(labels))
[1637005508000, 1637006108000, 1637006708000, 1637007308000, 1637007908000, 1637008508000, 1637009108000, 1637009708000, 1637010308000, 1637010908000, 1637011508000, 1637012108000, 1637012708000, 1637013308000, 1637013908000, 1637014508000, 1637015108000, 1637015708000, 1637016308000, 1637016908000, 1637017508000, 1637018108000, 1637018708000, 1637019308000, 1637019908000, 1637020508000, 1637021108000, 1637021708000, 1637022308000, 1637022908000, 1637023508000, 1637024108000, 1637024708000, 1637025308000, 1637025908000, 1637026508000, 1637027108000, 1637027708000, 1637028308000, 1637028908000, 1637029508000, 1637030108000, 1637030708000, 1637031308000, 1637031908000, 1637032508000, 1637033108000, 1637033708000, 1637034308000, 1637034908000, 1637035508000, 1637036108000, 1637036708000, 1637037308000, 1637037908000, 1637038508000, 1637039108000, 1637039708000, 1637040308000, 1637040908000, 1637041508000, 1637042108000, 1637042708000, 1637043308000, 1637043908000, 1637044508000, 1637045108000, 1637045708000, 1637046308000, 1637046908000, 1637047508000, 1637048108000, 1637048708000, 1637049308000, 1637049908000, 1637050508000, 1637051108000, 1637051708000, 1637052308000, 1637052908000, 1637053508000, 1637054108000, 1637054708000, 1637055308000, 1637055908000, 1637056508000, 1637057108000, 1637057708000, 1637058308000, 1637058908000, 1637059508000, 1637060108000, 1637060708000, 1637061308000, 1637061908000, 1637062508000, 1637063108000, 1637063708000, 1637064308000, 1637064908000, 1637065508000, 1637066108000, 1637066708000, 1637067308000, 1637067908000, 1637068508000, 1637069108000, 1637069708000, 1637070308000, 1637070908000, 1637071508000, 1637072108000, 1637072708000, 1637073308000, 1637073908000, 1637074508000, 1637075108000, 1637075708000, 1637076308000, 1637076908000, 1637077508000, 1637078108000, 1637078708000, 1637079308000, 1637079908000, 1637080508000, 1637081108000, 1637081708000, 1637082308000, 1637082908000, 1637083508000, 1637084108000, 1637084708000, 
1637085308000, 1637085908000, 1637086508000, 1637087108000, 1637087708000, 1637088308000, 1637088908000, 1637089508000, 1637090108000, 1637090708000, 1637091308000]
Once the node is obtained, the \blabels:\s*\[([^][]*)] regex searches for:
\b - a word boundary
labels: - a fixed string
\s* - zero or more whitespace characters
\[ - a [ char
([^][]*) - Group 1 (this is what you will need to split on commas later): zero or more chars other than ] and [
] - a ] char.
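The question also asks for the data values, and the same pattern shape can be reused for them. The following is a minimal sketch that assumes the script text is already available as a string; script_text is a shortened stand-in for the real script, and extract_int_list is a hypothetical helper, not part of BeautifulSoup or re.

```python
import re

# Shortened stand-in for the <script> text in the question (values truncated for brevity).
script_text = """
new Chart(document.getElementById("chart-overall-mentions"), {
type: 'LineWithLine',
data: {
labels: [1637005508000,1637006108000,1637006708000],
datasets: [{
data: [13,10,20],
pointRadius: 0,
}]
}
});
"""

def extract_int_list(key, text):
    """Return the integers inside `key: [ ... ]`, or an empty list if no such list is found."""
    match = re.search(r'\b' + re.escape(key) + r':\s*\[([^][]*)]', text)
    if not match:
        return []
    # Group 1 holds the comma-separated values between the brackets.
    return [int(item) for item in match.group(1).split(',') if item.strip()]

labels = extract_int_list('labels', script_text)
data = extract_int_list('data', script_text)
print(labels)
print(data)
```

Because the pattern requires a [ right after the key, the outer "data: {" object is skipped and only the "data: [" list inside datasets is captured.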

Related

Removing different string patterns from Pandas column

I have the following column which consists of email subject headers:
Subject
EXT || Transport enquiry
EXT || RE: EXTERNAL: RE: 0001 || Copy of enquiry
EXT || FW: Model - Jan
SV: [EXTERNAL] Calculations
What I want to achieve is:
Subject
Transport enquiry
0001 || Copy of enquiry
Model - Jan
Calculations
For this I am using the code below, but it only seems to apply the first regular expression I pass and ignores the rest:
def clean_subject_prelim(text):
    text = re.sub(r'^EXT \|\| $' , '' , text)
    text = re.sub(r'EXT \|\| RE: EXTERNAL: RE:', '' , text)
    text = re.sub(r'EXT \|\| FW:', '' , text)
    text = re.sub(r'^SV: \[EXTERNAL]$' , '' , text)
    return text
df['subject_clean'] = df['Subject'].apply(lambda x: clean_subject_prelim(x))
Why is this not working? What am I missing here?
You can use
pattern = r"""(?mx) # MULTILINE and VERBOSE modes on
^ # start of a line
(?: # non-capturing group start
EXT\s*\|\|\s*(?:RE:\s*EXTERNAL:\s*RE:|FW:)? # EXT || or EXT || RE: EXTERNAL: RE: or EXT || FW:
| # or
SV:\s*\[EXTERNAL] # SV: [EXTERNAL]
) # non-capturing group end
\s* # zero or more whitespaces
"""
df['subject_clean'] = df['Subject'].str.replace(pattern, '', regex=True)
Since re.X (the inline (?x) flag) is used, literal spaces and # chars in the pattern must be escaped; alternatively, use \s* or \s+ to match zero or more, or one or more, whitespace characters.
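To check the verbose-mode pattern without pandas, here is a minimal sketch that applies an equivalent pattern to the four subjects from the question using plain re. The names prefix_re, subjects and cleaned are only illustrative.

```python
import re

# Verbose-mode pattern: unescaped whitespace and "# ..." comments are ignored by the regex engine.
prefix_re = re.compile(r"""
    ^                                   # start of the string
    (?:
        EXT\s*\|\|\s*                   # EXT || prefix
        (?:RE:\s*EXTERNAL:\s*RE:|FW:)?  # optional RE: EXTERNAL: RE: or FW:
      |
        SV:\s*\[EXTERNAL\]              # SV: [EXTERNAL] prefix
    )
    \s*                                 # whitespace after the prefix
""", re.VERBOSE)

subjects = [
    "EXT || Transport enquiry",
    "EXT || RE: EXTERNAL: RE: 0001 || Copy of enquiry",
    "EXT || FW: Model - Jan",
    "SV: [EXTERNAL] Calculations",
]

# Remove the matched prefix from each subject line.
cleaned = [prefix_re.sub("", subject) for subject in subjects]
print(cleaned)
```

The spaces in the subjects are matched by \s*, not by the literal spaces used to lay out the pattern, which is why the pattern can be indented freely.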
Remove the $ anchor from the first and last expressions, and reorder the substitutions so that the longer, more specific prefixes are removed before the bare EXT || prefix. Like this:
import pandas as pd
import re
def clean_subject_prelim(text):
    text = re.sub(r'EXT \|\| RE: EXTERNAL: RE:', '', text)
    text = re.sub(r'EXT \|\| FW:', '', text)
    text = re.sub(r'^EXT \|\|', '', text)
    text = re.sub(r'^SV: \[EXTERNAL]', '', text)
    # Strip the whitespace left behind after the removed prefix
    return text.strip()
data = {"Subject": [
    "EXT || Transport enquiry",
    "EXT || RE: EXTERNAL: RE: 0001 || Copy of enquiry",
    "EXT || FW: Model - Jan",
    "SV: [EXTERNAL] Calculations"]}
df = pd.DataFrame(data)
df['subject_clean'] = df['Subject'].apply(clean_subject_prelim)

how to do complex pdf extraction with regex

I have a PDF file that contains lottery ticket winners, and I want to extract all winning tickets grouped by prize.
PDF file
I tried this:
import re
import pdfplumber
prize_re = re.compile(r"^\d[a-z]")
cons_prize_re = re.compile(r"^Cons")
ticket1_line_re = re.compile(r"^\d[)]")
ticket2_line_re = re.compile(r"^\d{4}")
ticket3_line_re = re.compile(r"[A-Z] \d{6}")
with pdfplumber.open("./test11.pdf") as pdf:
    for i in range(len(pdf.pages)):
        page_text = pdf.pages[i].extract_text()
        for line in page_text.split("\n"):
            if prize_re.match(line) or cons_prize_re.match(line) or ticket1_line_re.match(line) or ticket2_line_re.match(line) or ticket3_line_re.search(line):
                print(line)
I got the output below, but I don't know how to assign each ticket to its prize. Also, the Cons prize ticket numbers look strange: some are run together (AN 867952AO 867952AP should be AN 867952 AO 867952 AP ...), and I don't know why:
1st Prize Rs :7000000/- 1) AU 867952 (MANANTHAVADY)
Cons Prize-Rs :8000/- AN 867952AO 867952AP 867952 AR 867952AS 867952
AT 867952 AV 867952 AW 867952AX 867952AY 867952
AZ 867952
2nd Prize Rs :500000/- 1) AZ 499603 (ADOOR)
3rd Prize Rs :100000/- 1) AN 215264 (KOTTAYAM)
2) AO 852774 (PATTAMBI)
3) AP 953655 (KOTTAYAM)
4) AR 638904 (PAYYANUR)
5) AS 496774 (VAIKKOM)
6) AT 878990 (WAYANADU)
7) AU 703702 (PUNALUR)
8) AV 418446 (WAYANADU)
9) AW 994685 (KOZHIKKODE)
10) AX 317550 (PATTAMBI)
11) AY 854780 (CHITTUR)
12) AZ 899905 (KARUNAGAPALLY
...
Instead, I want to get:
[
{
"1st Prize Rs :7000000",
"tickets": [
"AU 867952"
]
},
{
"Cons Prize-Rs :8000",
"tickets": [
"AN 867952",
"AO 867952",
"AP 867952",
"AR 867952",
...
]
},
...
]
How can I achieve this?
You could first match every complete prize section across all the pages, using capture groups for its parts.
You can then post-process the 3rd capture group to get the separate "tickets" and build the wanted data structure in a loop.
For the first step, you can use a pattern that matches the start of every prize section and captures everything up to the next prize section.
^(\w+ Prize[-\s]Rs\s*):(\d+)/-(?:\s*\d+\))?\s*(.*(?:\n(?!\w+ Prize\b).*)*)
For the post-processing, you can use a pattern for the ticket formats that matches either 2 uppercase chars, a space and 6 digits, or 4 or more digits delimited by whitespace boundaries.
(?:[A-Z]{2} \d{6}(?!\d)|(?<!\S)\d{4,}(?!\S))
Example code using the pdf file from the question:
import re
import pdfplumber
import json
pattern = r"^(\w+ Prize[-\s]Rs\s*):(\d+)/-(?:\s*\d+\))?\s*(.*(?:\n(?!\w+ Prize\b).*)*)"
# Collect the text of all pages so that a prize section can span a page break
with pdfplumber.open("./test11.pdf") as pdf:
    all_text = ""
    for page in pdf.pages:
        # Fall back to an empty string in case a page yields no text
        all_text += '\n' + (page.extract_text() or "")
coll = []
for match in re.finditer(pattern, all_text, re.MULTILINE):
    dct = {}
    # Group 1 is the prize name, group 2 the amount, group 3 the ticket block
    dct[match.group(1)] = match.group(2)
    dct["tickets"] = re.findall(r"(?:[A-Z]{2} \d{6}(?!\d)|(?<!\S)\d{4,}(?!\S))", match.group(3))
    coll.append(dct)
print(json.dumps(coll, indent=4))
Output
[
{
"1st Prize Rs ": "120000000",
"tickets": [
"XG 218582"
]
},
{
"Cons Prize-Rs ": "500000",
"tickets": [
"XA 218582",
"XB 218582",
"XC 218582",
"XD 218582",
"XE 218582"
]
},
{
"2nd Prize Rs ": "5000000",
"tickets": [
"XA 788417",
"XB 161796",
"XC 319503",
"XD 713832",
"XE 667708",
"XG 137764"
]
},
....
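The two-step approach can also be tried without the PDF. The sketch below runs the same two patterns on a few lines copied from the extracted text shown in the question, including the run-together Cons tickets. The dictionary keys prize, amount and tickets are an illustrative choice, not the structure used in the answer above.

```python
import re

# Sample lines copied from the extracted text shown in the question (not the real PDF).
sample_text = """1st Prize Rs :7000000/- 1) AU 867952 (MANANTHAVADY)
Cons Prize-Rs :8000/- AN 867952AO 867952AP 867952 AR 867952AS 867952
AT 867952 AV 867952 AW 867952AX 867952AY 867952
AZ 867952
2nd Prize Rs :500000/- 1) AZ 499603 (ADOOR)"""

# Step 1: one match per prize section; group 3 runs until the next "... Prize" line.
section_re = re.compile(
    r"^(\w+ Prize[-\s]Rs\s*):(\d+)/-(?:\s*\d+\))?\s*(.*(?:\n(?!\w+ Prize\b).*)*)",
    re.MULTILINE,
)
# Step 2: individual tickets inside a section.
ticket_re = re.compile(r"(?:[A-Z]{2} \d{6}(?!\d)|(?<!\S)\d{4,}(?!\S))")

prizes = []
for section in section_re.finditer(sample_text):
    prizes.append({
        "prize": section.group(1).strip(),
        "amount": int(section.group(2)),
        "tickets": ticket_re.findall(section.group(3)),
    })

for entry in prizes:
    print(entry["prize"], entry["amount"], entry["tickets"])
```

Since the ticket pattern does not require whitespace after the six digits, tickets that the extraction glued together (867952AO) are still split correctly.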

Translate unicode characters in a string to ISO 8859-1 Characters

I have the following string:
'\x7B\x22h\x22\x3A\x5B\x7B\x22id\x22\x3A\x22242611\x22,\x22minute\x22\x3A\x222\x22,\x22result\x22\x3A\x22MissedShots\x22,\x22X\x22\x3A\x220.9359999847412109\x22,\x22Y\x22\x3A\x220.534000015258789\x22,\x22xG\x22\x3A\x220.1072189137339592\x22,\x22player\x22\x3A\x22Ari\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x222930\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22Head\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Wanderson\x22,\x22lastAction\x22\x3A\x22Chipped\x22\x7D,\x7B\x22id\x22\x3A\x22242612\x22,\x22minute\x22\x3A\x224\x22,\x22result\x22\x3A\x22SavedShot\x22,\x22X\x22\x3A\x220.8059999847412109\x22,\x22Y\x22\x3A\x220.7069999694824218\x22,\x22xG\x22\x3A\x220.021672379225492477\x22,\x22player\x22\x3A\x22Cristian\x20Ram\x5Cu00edrez\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x225477\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22LeftFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3Anull,\x22lastAction\x22\x3A\x22None\x22\x7D,\x7B\x22id\x22\x3A\x22242613\x22,\x22minute\x22\x3A\x224\x22,\x22result\x22\x3A\x22SavedShot\x22,\x22X\x22\x3A\x220.7780000305175782\x22,\x22Y\x22\x3A\x220.505\x22,\x22xG\x22\x3A\x220.023817993700504303\x22,\x22player\x22\x3A\x22Mauricio\x20Pereyra\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x222922\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,
\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Viktor\x20Claesson\x22,\x22lastAction\x22\x3A\x22Pass\x22\x7D,\x7B\x22id\x22\x3A\x22242614\x22,\x22minute\x22\x3A\x2217\x22,\x22result\x22\x3A\x22MissedShots\x22,\x22X\x22\x3A\x220.9330000305175781\x22,\x22Y\x22\x3A\x220.41\x22,\x22xG\x22\x3A\x220.01863950863480568\x22,\x22player\x22\x3A\x22Ari\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x222930\x22,\x22situation\x22\x3A\x22FromCorner\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22Head\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Mauricio\x20Pereyra\x22,\x22lastAction\x22\x3A\x22Aerial\x22\x7D,\x7B\x22id\x22\x3A\x22242617\x22,\x22minute\x22\x3A\x2221\x22,\x22result\x22\x3A\x22SavedShot\x22,\x22X\x22\x3A\x220.710999984741211\x22,\x22Y\x22\x3A\x220.534000015258789\x22,\x22xG\x22\x3A\x220.015956614166498184\x22,\x22player\x22\x3A\x22Ivan\x20Ignatyev\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x226025\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Ari\x22,\x22lastAction\x22\x3A\x22Pass\x22\x7D,\x7B\x22id\x22\x3A\x22242621\x22,\x22minute\x22\x3A\x2231\x22,\x22result\x22\x3A\x22MissedShots\x22,\x22X\x22\x3A\x220.7959999847412109\x22,\x22Y\x22\x3A\x220.4640000152587891\x22,\x22xG\x22\x3A\x220.03898102045059204\x22,\x22player\x22\x3A\x22Viktor\x20Claesson\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x225478\x22,\x22situation\x
22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3Anull,\x22lastAction\x22\x3A\x22None\x22\x7D,\x7B\x22id\x22\x3A\x22242622\x22,\x22minute\x22\x3A\x2236\x22,\x22result\x22\x3A\x22MissedShots\x22,\x22X\x22\x3A\x220.759000015258789\x22,\x22Y\x22\x3A\x220.3509999847412109\x22,\x22xG\x22\x3A\x220.05237437039613724\x22,\x22player\x22\x3A\x22Mauricio\x20Pereyra\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x222922\x22,\x22situation\x22\x3A\x22DirectFreekick\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22LeftFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3Anull,\x22lastAction\x22\x3A\x22Standard\x22\x7D,\x7B\x22id\x22\x3A\x22242624\x22,\x22minute\x22\x3A\x2242\x22,\x22result\x22\x3A\x22BlockedShot\x22,\x22X\x22\x3A\x220.919000015258789\x22,\x22Y\x22\x3A\x220.37\x22,\x22xG\x22\x3A\x220.10843519121408463\x22,\x22player\x22\x3A\x22Sergei\x20Petrov\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x222920\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Viktor\x20Claesson\x22,\x22lastAction\x22\x3A\x22Pass\x22\x7D,\x7B\x22id\x22\x3A\x22242625\x22,\x22minute\x22\x3A\x2248\x22,\x22result\x22\x3A\x22MissedShots\x22,\x22X\x22\x3A\x220.7719999694824219\x22,\x22Y\
x22\x3A\x220.385\x22,\x22xG\x22\x3A\x220.023656079545617104\x22,\x22player\x22\x3A\x22Aleksandr\x20Martynovich\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x222790\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Yuri\x20Gazinskiy\x22,\x22lastAction\x22\x3A\x22Pass\x22\x7D,\x7B\x22id\x22\x3A\x22242626\x22,\x22minute\x22\x3A\x2249\x22,\x22result\x22\x3A\x22MissedShots\x22,\x22X\x22\x3A\x220.715999984741211\x22,\x22Y\x22\x3A\x220.4879999923706055\x22,\x22xG\x22\x3A\x220.013118931092321873\x22,\x22player\x22\x3A\x22Yuri\x20Gazinskiy\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x222929\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Viktor\x20Claesson\x22,\x22lastAction\x22\x3A\x22Pass\x22\x7D,\x7B\x22id\x22\x3A\x22242627\x22,\x22minute\x22\x3A\x2254\x22,\x22result\x22\x3A\x22BlockedShot\x22,\x22X\x22\x3A\x220.909000015258789\x22,\x22Y\x22\x3A\x220.3529999923706055\x22,\x22xG\x22\x3A\x220.09400425851345062\x22,\x22player\x22\x3A\x22Magomed\x2DShapi\x20Suleymanov\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x225926\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00
\x3A00\x22,\x22player_assisted\x22\x3A\x22Viktor\x20Claesson\x22,\x22lastAction\x22\x3A\x22Pass\x22\x7D,\x7B\x22id\x22\x3A\x22242628\x22,\x22minute\x22\x3A\x2254\x22,\x22result\x22\x3A\x22BlockedShot\x22,\x22X\x22\x3A\x220.8859999847412109\x22,\x22Y\x22\x3A\x220.31799999237060544\x22,\x22xG\x22\x3A\x220.061035316437482834\x22,\x22player\x22\x3A\x22Magomed\x2DShapi\x20Suleymanov\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x225926\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22LeftFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Yuri\x20Gazinskiy\x22,\x22lastAction\x22\x3A\x22Pass\x22\x7D,\x7B\x22id\x22\x3A\x22242629\x22,\x22minute\x22\x3A\x2255\x22,\x22result\x22\x3A\x22Goal\x22,\x22X\x22\x3A\x220.9269999694824219\x22,\x22Y\x22\x3A\x220.46\x22,\x22xG\x22\x3A\x220.523554801940918\x22,\x22player\x22\x3A\x22Magomed\x2DShapi\x20Suleymanov\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x225926\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Viktor\x20Claesson\x22,\x22lastAction\x22\x3A\x22Pass\x22\x7D,\x7B\x22id\x22\x3A\x22242630\x22,\x22minute\x22\x3A\x2266\x22,\x22result\x22\x3A\x22MissedShots\x22,\x22X\x22\x3A\x220.915\x22,\x22Y\x22\x3A\x220.5420000076293945\x22,\x22xG\x22\x3A\x220.3631550371646881\x22,\x22player\x22\x3A\x22Christian\x20Cueva\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x226799\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22
\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Cristian\x20Ram\x5Cu00edrez\x22,\x22lastAction\x22\x3A\x22Cross\x22\x7D,\x7B\x22id\x22\x3A\x22242631\x22,\x22minute\x22\x3A\x2271\x22,\x22result\x22\x3A\x22BlockedShot\x22,\x22X\x22\x3A\x220.685\x22,\x22Y\x22\x3A\x220.485\x22,\x22xG\x22\x3A\x220.03188558667898178\x22,\x22player\x22\x3A\x22Cristian\x20Ram\x5Cu00edrez\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x225477\x22,\x22situation\x22\x3A\x22DirectFreekick\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22LeftFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3Anull,\x22lastAction\x22\x3A\x22Standard\x22\x7D,\x7B\x22id\x22\x3A\x22242632\x22,\x22minute\x22\x3A\x2272\x22,\x22result\x22\x3A\x22BlockedShot\x22,\x22X\x22\x3A\x220.8909999847412109\x22,\x22Y\x22\x3A\x220.4809999847412109\x22,\x22xG\x22\x3A\x220.09532035887241364\x22,\x22player\x22\x3A\x22Sergei\x20Petrov\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x222920\x22,\x22situation\x22\x3A\x22FromCorner\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3Anull,\x22lastAction\x22\x3A\x22None\x22\x7D,\x7B\x22id\x22\x3A\x22242634\x22,\x22minute\x22\x3A\x2275\x22,\x22result\x22\x3A\x22Goal\x22,\x22X\x22\x3A\x220.794000015258789\x22,\x22Y\x22\x3A\x220.47900001525878905\x22,\x22xG\x22\x3A\x220.0520350337028
5034\x22,\x22player\x22\x3A\x22Ivan\x20Ignatyev\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x226025\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3Anull,\x22lastAction\x22\x3A\x22None\x22\x7D,\x7B\x22id\x22\x3A\x22242635\x22,\x22minute\x22\x3A\x2277\x22,\x22result\x22\x3A\x22SavedShot\x22,\x22X\x22\x3A\x220.904000015258789\x22,\x22Y\x22\x3A\x220.5\x22,\x22xG\x22\x3A\x220.13627302646636963\x22,\x22player\x22\x3A\x22Viktor\x20Claesson\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x225478\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Christian\x20Cueva\x22,\x22lastAction\x22\x3A\x22Chipped\x22\x7D,\x7B\x22id\x22\x3A\x22242637\x22,\x22minute\x22\x3A\x2281\x22,\x22result\x22\x3A\x22MissedShots\x22,\x22X\x22\x3A\x220.8719999694824219\x22,\x22Y\x22\x3A\x220.6840000152587891\x22,\x22xG\x22\x3A\x220.07149235904216766\x22,\x22player\x22\x3A\x22Cristian\x20Ram\x5Cu00edrez\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x225477\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Mauricio\x20Pereyra\x22,\x22lastAction\x22\x3A\x22Pass\x22
\x7D,\x7B\x22id\x22\x3A\x22242638\x22,\x22minute\x22\x3A\x2282\x22,\x22result\x22\x3A\x22SavedShot\x22,\x22X\x22\x3A\x220.895\x22,\x22Y\x22\x3A\x220.6329999923706054\x22,\x22xG\x22\x3A\x220.10822572559118271\x22,\x22player\x22\x3A\x22Ivan\x20Ignatyev\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x226025\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Magomed\x2DShapi\x20Suleymanov\x22,\x22lastAction\x22\x3A\x22Pass\x22\x7D,\x7B\x22id\x22\x3A\x22242639\x22,\x22minute\x22\x3A\x2286\x22,\x22result\x22\x3A\x22BlockedShot\x22,\x22X\x22\x3A\x220.7680000305175781\x22,\x22Y\x22\x3A\x220.6519999694824219\x22,\x22xG\x22\x3A\x220.019864005967974663\x22,\x22player\x22\x3A\x22Viktor\x20Claesson\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x225478\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3Anull,\x22lastAction\x22\x3A\x22None\x22\x7D,\x7B\x22id\x22\x3A\x22242640\x22,\x22minute\x22\x3A\x2287\x22,\x22result\x22\x3A\x22BlockedShot\x22,\x22X\x22\x3A\x220.795\x22,\x22Y\x22\x3A\x220.7330000305175781\x22,\x22xG\x22\x3A\x220.01853240840137005\x22,\x22player\x22\x3A\x22Christian\x20Cueva\x22,\x22h_a\x22\x3A\x22h\x22,\x22player_id\x22\x3A\x226799\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x
22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Sergei\x20Petrov\x22,\x22lastAction\x22\x3A\x22Pass\x22\x7D\x5D,\x22a\x22\x3A\x5B\x7B\x22id\x22\x3A\x22242615\x22,\x22minute\x22\x3A\x2218\x22,\x22result\x22\x3A\x22SavedShot\x22,\x22X\x22\x3A\x220.89\x22,\x22Y\x22\x3A\x220.4279999923706055\x22,\x22xG\x22\x3A\x220.3485192060470581\x22,\x22player\x22\x3A\x22Andrei\x20Panyukov\x22,\x22h_a\x22\x3A\x22a\x22,\x22player_id\x22\x3A\x226138\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Nikolay\x20Dimitrov\x22,\x22lastAction\x22\x3A\x22Pass\x22\x7D,\x7B\x22id\x22\x3A\x22242616\x22,\x22minute\x22\x3A\x2218\x22,\x22result\x22\x3A\x22MissedShots\x22,\x22X\x22\x3A\x220.8719999694824219\x22,\x22Y\x22\x3A\x220.4\x22,\x22xG\x22\x3A\x220.30197691917419434\x22,\x22player\x22\x3A\x22Nikolay\x20Dimitrov\x22,\x22h_a\x22\x3A\x22a\x22,\x22player_id\x22\x3A\x225496\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3Anull,\x22lastAction\x22\x3A\x22Rebound\x22\x7D,\x7B\x22id\x22\x3A\x22242618\x22,\x22minute\x22\x3A\x2222\x22,\x22result\x22\x3A\x22BlockedShot\x22,\x22X\x22\x3A\x220.86\x22,\x22Y\x22\x3A\x220.5\x22,\x22xG\x22\x3A\x220.06458419561386108\x22,\x22player\x22\x3A\x22Eric\x20Bicfalvi\x22,\x22h_a\x22\x3A\x22a\x22,\x22player_id\x22\x3A\x225483\x22,\x22situation\x22\x3A\x2
2OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22LeftFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Othman\x20El\x20Kabir\x22,\x22lastAction\x22\x3A\x22Cross\x22\x7D,\x7B\x22id\x22\x3A\x22242619\x22,\x22minute\x22\x3A\x2224\x22,\x22result\x22\x3A\x22SavedShot\x22,\x22X\x22\x3A\x220.8030000305175782\x22,\x22Y\x22\x3A\x220.6179999923706054\x22,\x22xG\x22\x3A\x220.035318903625011444\x22,\x22player\x22\x3A\x22Eric\x20Bicfalvi\x22,\x22h_a\x22\x3A\x22a\x22,\x22player_id\x22\x3A\x225483\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22LeftFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3Anull,\x22lastAction\x22\x3A\x22None\x22\x7D,\x7B\x22id\x22\x3A\x22242620\x22,\x22minute\x22\x3A\x2229\x22,\x22result\x22\x3A\x22MissedShots\x22,\x22X\x22\x3A\x220.81\x22,\x22Y\x22\x3A\x220.585\x22,\x22xG\x22\x3A\x220.04518153518438339\x22,\x22player\x22\x3A\x22Othman\x20El\x20Kabir\x22,\x22h_a\x22\x3A\x22a\x22,\x22player_id\x22\x3A\x226587\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22LeftFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Andrei\x20Panyukov\x22,\x22lastAction\x22\x3A\x22HeadPass\x22\x7D,\x7B\x22id\x22\x3A\x22242623\x22,\x22minute\x22\x3A\x2241\x22,\x22result\x22\x3A\x22SavedShot\x22,\x22X\x22\x3A\x220.7580000305175781\x22,\x22Y\x22
\x3A\x220.575\x22,\x22xG\x22\x3A\x220.022277826443314552\x22,\x22player\x22\x3A\x22Othman\x20El\x20Kabir\x22,\x22h_a\x22\x3A\x22a\x22,\x22player_id\x22\x3A\x226587\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Nikolay\x20Dimitrov\x22,\x22lastAction\x22\x3A\x22Pass\x22\x7D,\x7B\x22id\x22\x3A\x22242633\x22,\x22minute\x22\x3A\x2274\x22,\x22result\x22\x3A\x22BlockedShot\x22,\x22X\x22\x3A\x220.750999984741211\x22,\x22Y\x22\x3A\x220.35700000762939454\x22,\x22xG\x22\x3A\x220.01709834299981594\x22,\x22player\x22\x3A\x22Nikolay\x20Dimitrov\x22,\x22h_a\x22\x3A\x22a\x22,\x22player_id\x22\x3A\x225496\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22LeftFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Eric\x20Bicfalvi\x22,\x22lastAction\x22\x3A\x22Pass\x22\x7D,\x7B\x22id\x22\x3A\x22242636\x22,\x22minute\x22\x3A\x2279\x22,\x22result\x22\x3A\x22BlockedShot\x22,\x22X\x22\x3A\x220.8530000305175781\x22,\x22Y\x22\x3A\x220.3920000076293945\x22,\x22xG\x22\x3A\x220.042678408324718475\x22,\x22player\x22\x3A\x22Yuri\x20Bavin\x22,\x22h_a\x22\x3A\x22a\x22,\x22player_id\x22\x3A\x223085\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22playe
r_assisted\x22\x3Anull,\x22lastAction\x22\x3A\x22None\x22\x7D,\x7B\x22id\x22\x3A\x22242641\x22,\x22minute\x22\x3A\x2292\x22,\x22result\x22\x3A\x22MissedShots\x22,\x22X\x22\x3A\x220.705999984741211\x22,\x22Y\x22\x3A\x220.534000015258789\x22,\x22xG\x22\x3A\x220.01331254467368126\x22,\x22player\x22\x3A\x22Eric\x20Bicfalvi\x22,\x22h_a\x22\x3A\x22a\x22,\x22player_id\x22\x3A\x225483\x22,\x22situation\x22\x3A\x22OpenPlay\x22,\x22season\x22\x3A\x222018\x22,\x22shotType\x22\x3A\x22RightFoot\x22,\x22match_id\x22\x3A\x229071\x22,\x22h_team\x22\x3A\x22FC\x20Krasnodar\x22,\x22a_team\x22\x3A\x22Ural\x22,\x22h_goals\x22\x3A\x222\x22,\x22a_goals\x22\x3A\x220\x22,\x22date\x22\x3A\x222018\x2D12\x2D02\x2011\x3A00\x3A00\x22,\x22player_assisted\x22\x3A\x22Andrey\x20Egorychev\x22,\x22lastAction\x22\x3A\x22Pass\x22\x7D\x5D\x7D'
I want to transform this string into the following JSON-object:
{"h":[{"id":"242611","minute":"2","result":"MissedShots","X":"0.9359999847412109","Y":"0.534000015258789","xG":"0.1072189137339592","player":"Ari","h_a":"h","player_id":"2930","situation":"OpenPlay","season":"2018","shotType":"Head","match_id":"9071","h_team":"FC Krasnodar","a_team":"Ural","h_goals":"2","a_goals":"0","date":"2018-12-02 11:00:00","player_assisted":"Wanderson","lastAction":"Chipped"},{"id":"242612","minute":"4","result":"SavedShot","X":"0.8059999847412109","Y":"0.7069999694824218","xG":"0.021672379225492477","player":"Cristian Ramirez","h_a":"h","player_id":"5477","situation":"OpenPlay","season":"2018","shotType":"LeftFoot","match_id":"9071","h_team":"FC Krasnodar","a_team":"Ural","h_goals":"2","a_goals":"0","date":"2018-12-02 11:00:00","player_assisted":null,"lastAction":"None"},{"id":"242613","minute":"4","result":"SavedShot","X":"0.7780000305175782","Y":"0.505","xG":"0.023817993700504303","player":"Mauricio Pereyra","h_a":"h","player_id":"2922","situation":"OpenPlay","season":"2018","shotType":"RightFoot","match_id":"9071","h_team":"FC Krasnodar","a_team":"Ural","h_goals":"2","a_goals":"0","date":"2018-12-02 11:00:00","player_assisted":"Viktor Claesson","lastAction":"Pass"},{"id":"242614","minute":"17","result":"MissedShots","X":"0.9330000305175781","Y":"0.41","xG":"0.01863950863480568","player":"Ari","h_a":"h","player_id":"2930","situation":"FromCorner","season":"2018","shotType":"Head","match_id":"9071","h_team":"FC Krasnodar","a_team":"Ural","h_goals":"2","a_goals":"0","date":"2018-12-02 11:00:00","player_assisted":"Mauricio Pereyra","lastAction":"Aerial"},{"id":"242617","minute":"21","result":"SavedShot","X":"0.710999984741211","Y":"0.534000015258789","xG":"0.015956614166498184","player":"Ivan Ignatyev","h_a":"h","player_id":"6025","situation":"OpenPlay","season":"2018","shotType":"RightFoot","match_id":"9071","h_team":"FC Krasnodar","a_team":"Ural","h_goals":"2","a_goals":"0","date":"2018-12-02 
11:00:00","player_assisted":"Ari","lastAction":"Pass"},{"id":"242621","minute":"31","result":"MissedShots","X":"0.7959999847412109","Y":"0.4640000152587891","xG":"0.03898102045059204","player":"Viktor Claesson","h_a":"h","player_id":"5478","situation":"OpenPlay","season":"2018","shotType":"RightFoot","match_id":"9071","h_team":"FC Krasnodar","a_team":"Ural","h_goals":"2","a_goals":"0","date":"2018-12-02 11:00:00","player_assisted":null,"lastAction":"None"},{"id":"242622","minute":"36","result":"MissedShots","X":"0.759000015258789","Y":"0.3509999847412109","xG":"0.05237437039613724","player":"Mauricio Pereyra","h_a":"h","player_id":"2922","situation":"DirectFreekick","season":"2018","shotType":"LeftFoot","match_id":"9071","h_team":"FC Krasnodar","a_team":"Ural","h_goals":"2","a_goals":"0","date":"2018-12-02 11:00:00","player_assisted":null,"lastAction":"Standard"},{"id":"242624","minute":"42","result":"BlockedShot","X":"0.919000015258789","Y":"0.37","xG":"0.10843519121408463","player":"Sergei Petrov","h_a":"h","player_id":"2920","situation":"OpenPlay","season":"2018","shotType":"RightFoot","match_id":"9071","h_team":"FC Krasnodar","a_team":"Ural","h_goals":"2","a_goals":"0","date":"2018-12-02 11:00:00","player_assisted":"Viktor Claesson","lastAction":"Pass"},{"id":"242625","minute":"48","result":"MissedShots","X":"0.7719999694824219","Y":"0.385","xG":"0.023656079545617104","player":"Aleksandr Martynovich","h_a":"h","player_id":"2790","situation":"OpenPlay","season":"2018","shotType":"RightFoot","match_id":"9071","h_team":"FC Krasnodar","a_team":"Ural","h_goals":"2","a_goals":"0","date":"2018-12-02 11:00:00","player_assisted":"Yuri Gazinskiy","lastAction":"Pass"},{"id":"242626","minute":"49","result":"MissedShots","X":"0.715999984741211","Y":"0.4879999923706055","xG":"0.013118931092321873","player":"Yuri Gazinskiy","h_a":"h","player_id":"2929","situation":"OpenPlay","season":"2018","shotType":"RightFoot","match_id":"9071","h_team":"FC Krasnodar","a_team":"Ural"}]}
With the help of a table I worked out what each of the following escaped characters represents:
\x7B = {
\x22 = "
\x3A = :
\x5B = [
\x7D = }
\x5D = ]
As a result, I have written the following script:
string = '\x7B\x22h\x22\x3A\....\\x22\x22\x7D\x5D\x7D'
cleaned_string = string.replace("\x7B","{")
cleaned_string = cleaned_string.replace('\x22', '"')
cleaned_string = cleaned_string.replace("\x3A", ":")
cleaned_string = cleaned_string.replace("\x5B", "[")
cleaned_string = cleaned_string.replace("\x7D", "}")
cleaned_string = cleaned_string.replace("\x5D", "]")
s = json.loads(cleaned_string)
However, the problem arises in the player and player_assisted fields. French or Spanish names in particular can contain many special characters, and my script currently only transforms the "standard" characters listed above. In theory there are many other characters that would need to be transformed. How can I do this easily?
Thanks in advance,
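
One possible approach, sketched here under assumptions: if the scraped text is plain ASCII in which every special character is written as a literal backslash sequence (\x7B, \x22 and so on), Python's built-in unicode_escape codec can decode all of them in one step, so no per-character replace table is needed. The function name and the sample string below are hypothetical and only for illustration; the sample is not taken from the data above.

```python
import json


def decode_hex_escaped_json(raw):
    """Decode literal \\xNN sequences in raw, then parse the result as JSON.

    Assumption: raw is pure ASCII, i.e. every non-ASCII character
    is itself written as an escape sequence.
    """
    text = raw.encode("ascii").decode("unicode_escape")
    return json.loads(text)


# Hypothetical sample: the raw-string prefix keeps the backslashes literal,
# which mimics text scraped from a page.
raw = (r'\x7B\x22h\x22\x3A\x5B\x7B\x22player\x22\x3A\x22Andr\xE9\x20Silva\x22,'
       r'\x22player_assisted\x22\x3Anull\x7D\x5D\x7D')

data = decode_hex_escaped_json(raw)
print(data["h"][0]["player"])
```

If accented names still come out garbled, the escapes may represent UTF-8 bytes rather than code points; in that case an extra .encode("latin-1").decode("utf-8") round trip on the decoded text may be needed, but that depends on how the site encodes its data.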

Capture property names

I'm scanning a ".twig" (PHP template) file and trying to capture the property names of an object.
The twig file contains lines (strings) like these:
{{ product.id }}
{{ product.parentProductId }}
{{ product.countdown.startDate | date('Y/m/d H:i:s') }}
{{ product.countdown.endDate | date('Y/m/d H:i:s') }}
{{ product.countdown.expireDate | date('Y/m/d H:i:s') }}
{{ product.primaryImage.originalUrl }}
{{ product.image(1).originalUrl }}
{{ product.image(1).thumbUrl }}
{{ product.priceWithTax(preferences.default_currency) | money }}
The things I want to capture are:
.id
.parentProductId
.countdown
.startDate
.endDate
.expireDate
.primaryImage
.originalUrl
.image(1)
.originalUrl
.thumbUrl
.priceWithTax(preferences.default_currency)
Basically, I'm trying to figure out the properties of the product object. I have the following pattern, but it doesn't capture chained properties. For example,
"{{.+?product(\.[a-zA-Z]+(?:\(.+?\)){,1})++.+?}}" captures only .startDate, but it should capture both .countdown and .startDate separately. Is this not possible, or am I missing something?
regex101
I could capture ("{{.+?product((?:\.[a-zA-Z]+(?:\(.+?\)){,1})+).+?}}") it as a whole (.countdown.startDate) and later check/split it, but this sounds troublesome.
If you want to handle it with a single regex, you might want to use the PyPI regex module:
import regex
s = """{{ product.id }}
{{ product.parentProductId }}
{{ product.countdown.startDate | date('Y/m/d H:i:s') }}
{{ product.primaryImage.originalUrl }}
{{ product.image(1).originalUrl }}
{{ product.priceWithTax(preferences.default_currency) | money }}"""
rx = r'{{[^{}]*product(\.[a-zA-Z]+(?:\([^()]+\))?)*[^{}]*}}'
l = [m.captures(1) for m in regex.finditer(rx, s)]
print([item for sublist in l for item in sublist])
# => ['.id', '.parentProductId', '.countdown', '.startDate', '.primaryImage', '.originalUrl', '.image(1)', '.originalUrl', '.priceWithTax(preferences.default_currency)']
See the Python demo
The {{[^{}]*product(\.[a-zA-Z]+(?:\([^()]+\))?)*[^{}]*}} regex will match
{{ - {{ substring
[^{}]* - 0+ chars other than { and }
product - the substring product
(\.[a-zA-Z]+(?:\([^()]+\))?)* - Capturing group 1: zero or more sequences of
\. - a dot
[a-zA-Z]+ - 1+ ASCII letters
(?:\([^()]+\))? - an optional sequence of (, 1+ chars other than ( and ) and then )
[^{}]* - 0+ chars other than { and }
}} - a }} substring.
If you are only limited to re, you will need to capture all the properties into 1 capturing group (wrap this (\.[a-zA-Z]+(?:\([^()]+\))?)* with (...)) and then run a regex based post-process to split by . not inside parentheses:
import re
rx = r'{{[^{}]*product((?:\.[a-zA-Z]+(?:\([^()]+\))?)*)[^{}]*}}'
l = re.findall(rx, s)
res = []
for m in l:
    res.extend([".{}".format(n) for n in filter(None, re.split(r'\.(?![^()]*\))', m))])
print(res)
# => ['.id', '.parentProductId', '.countdown', '.startDate', '.primaryImage', '.originalUrl', '.image(1)', '.originalUrl', '.priceWithTax(preferences.default_currency)']
See this Python demo
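
As a variation on the re-only approach, the post-processing step could also be a second findall that reuses the per-property sub-pattern instead of a split with a lookahead. This is a minimal sketch, not part of the original answer; the names CHAIN, PART and product_properties are made up for illustration.

```python
import re

# Pass 1: grab the whole property chain that follows "product".
CHAIN = re.compile(r'{{[^{}]*product((?:\.[a-zA-Z]+(?:\([^()]+\))?)*)[^{}]*}}')
# Pass 2: one property, optionally followed by a parenthesised argument list.
PART = re.compile(r'\.[a-zA-Z]+(?:\([^()]+\))?')


def product_properties(template):
    """Return every property of product, in order of appearance."""
    props = []
    for chain in CHAIN.findall(template):
        props.extend(PART.findall(chain))
    return props


sample = """{{ product.countdown.startDate | date('Y/m/d H:i:s') }}
{{ product.image(1).thumbUrl }}
{{ product.priceWithTax(preferences.default_currency) | money }}"""

print(product_properties(sample))
# => ['.countdown', '.startDate', '.image(1)', '.thumbUrl', '.priceWithTax(preferences.default_currency)']
```

Because PART consumes the parenthesised argument together with its property name, the dot inside (preferences.default_currency) is never treated as a separator.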
Try this one; it captures everything in your requirement:
^{{ product(\..*?[(][^\d\/]+[)]).*?}}|^{{ product(\..*?)(\..*?)?(?= )
demo and explanation at regex 101
I've decided to stick to re (instead of regex, as suggested by Victor) and this is what I ended up with:
import re, json

# Read the whole template into memory.
with open("test.twig", "r", encoding="utf-8") as file:
    content = file.read()

patterns = {
    "template" : r"{{[^{}]*product((?:\.[a-zA-Z]+(?:\([^()]+\))?)*)[^{}]*}}",
    "prop"     : r"^[^\.]+$",                  # .id
    "subprop"  : r"^[^\.()]+(\.[^\.]+)+$",     # .countdown.startDate
    "itemprop" : r"^[^\.]+\(\d+\)\.[^\.]+$",   # .image(1).originalUrl
    "method"   : r"^[^\.]+\(.+\)$",            # .priceWithTax(preferences.default_currency)
}

temp_re = re.compile(patterns["template"])
matches = temp_re.findall(content)
product = {}

for match in matches:
    match = match[1:]  # drop the leading dot
    if re.match(patterns["prop"], match):
        product[match] = match
    elif re.match(patterns["subprop"], match):
        match = match.split(".")
        if match[0] not in product:
            product[match[0]] = []
        if match[1] not in product[match[0]]:
            product[match[0]].append(match[1])
    elif re.match(patterns["itemprop"], match):
        match = match.split(".")
        # Collapse concrete indexes such as image(1) into image(i).
        array = re.sub(r"\(\d+\)", "(i)", match[0])
        if array not in product:
            product[array] = []
        if match[1] not in product[array]:
            product[array].append(match[1])
    elif re.match(patterns["method"], match):
        product[match] = match

props = json.dumps(product, indent=4)
print(props)
Example output:
{
    "id": "id",
    "parentProductId": "parentProductId",
    "countdown": [
        "startDate",
        "endDate",
        "expireDate"
    ],
    "primaryImage": [
        "originalUrl"
    ],
    "image(i)": [
        "originalUrl",
        "thumbUrl"
    ],
    "priceWithTax(preferences.default_currency)": "priceWithTax(preferences.default_currency)"
}

Python regex sub confusion

There are four keywords: title, blog, tags, state
Extra occurrences of a keyword are being removed from the value matched for that keyword.
Example:
The input "blog: blog state title tags and" returns "state title tags and" instead of
"blog state title tags and"
The sub function should be matching .+ after it sees blog:, so I don't know why it treats blog as an exception to .+
Regex:
re.sub(r'((^|\n|\s|\b)(title|blog|tags|state)(\:\s).+(\n|$))', matcher, a)
Code:
def n15():
    import re
    a = """blog: blog: fooblog
state: private
title: this is atitle bun
and text"""
    kwargs = {}
    def matcher(string):
        v = string.group(1).replace(string.group(2), '').replace(string.group(3), '').replace(string.group(4), '').replace(string.group(5), '')
        if string.group(3) == 'title':
            kwargs['title'] = v
        elif string.group(3) == 'blog':
            kwargs['blog_url'] = v
        elif string.group(3) == 'tags':
            kwargs['comma_separated_tags'] = v
        elif string.group(3) == 'state':
            kwargs['post_state'] = v
        return ''
    a = re.sub(r'((^|\n|\s|\b)(title|blog|tags|state)(\:\s).+(\n|$))', matcher, a)
    a = a.replace('\n', '<br />')
    a = a.replace('\r', '')
    a = a.replace('"', r'\"')
    a = '<p>' + a + '</p>'
    kwargs['body'] = a
    print kwargs
Output:
{'body': '<p>and text</p>', 'post_state': 'private', 'blog_url': 'foo', 'title': 'this is a bun'}
Edit:
Desired Output:
{'body': '<p>and text</p>', 'post_state': 'private', 'blog_url': 'fooblog', 'title': 'this is atitle bun'}
replace(string.group(3), '')
replaces every occurrence of 'blog' in the matched text with '', including the ones inside the value you want to keep.
Rather than trying to strip all the other parts out of the matched string, which will be hard to get right, I suggest capturing the string you actually want in the original match:
r'((^|\n|\s|\b)(title|blog|tags|state)(\:\s)(.+)(\n|$))'
This has () around the .+ to capture that part of the string. Then put
v = string.group(5)
at the start of matcher (string is the name the question's code gives to the match object).
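
Put together, the suggestion might look like the sketch below, written for Python 3. The names KEYWORD_LINE, FIELD_NAMES and extract_fields are made up for illustration, and the sample input uses a single "blog: " prefix; with the question's "blog: blog: fooblog" line, group 5 would be everything after the first "blog: ", that is "blog: fooblog".

```python
import re

# Same pattern as in the question, with an extra group (5) around the value.
KEYWORD_LINE = re.compile(r'((^|\n|\s|\b)(title|blog|tags|state)(\:\s)(.+)(\n|$))')

# Maps each keyword to the kwargs key used in the question's code.
FIELD_NAMES = {
    'title': 'title',
    'blog': 'blog_url',
    'tags': 'comma_separated_tags',
    'state': 'post_state',
}


def extract_fields(text):
    """Return (fields, body): the keyword values and the text left over."""
    fields = {}

    def matcher(match):
        # Group 3 is the keyword, group 5 is the value after ": ".
        fields[FIELD_NAMES[match.group(3)]] = match.group(5)
        return ''

    body = KEYWORD_LINE.sub(matcher, text)
    return fields, body


text = """blog: fooblog
state: private
title: this is atitle bun
and text"""

fields, body = extract_fields(text)
print(fields)
print(body)
```

Since the value is read straight from its own group, keyword text that happens to appear inside the value (the "blog" in "fooblog", the "title" in "atitle") is left untouched.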
