I have a long string that looks like the one below. This is a small part of it, but the pattern repeats.
The raw data that I get when I read the API with Requests:
...
}
#VER B 160 20201020 "Test Bolag AB (4117)" 20201223
{
#TRANS 3001 {6 "1000050"} -180000 "" "" 0
#TRANS 2611 {6 "1000050"} -45000 "" "" 0
#TRANS 1510 {6 "1000050"} 225000 "" "" 0
}
#VER A 2 20200212 "Test Bolag AB1" 20201223
{
#TRANS 1930 {} -7549 "" "" 0
#TRANS 2641 {} 1209.75 "" "" 0
#TRANS 7990 {} 6339.25 "" "" 0
}
...
The code I've written now:
lst = r.text.split('}\r\n')
for i in range(len(lst)):
    tmpstr1 = str(lst[i])
    tmpstr2 = tmpstr1.replace(" ",";")
    tmpstr3 = tmpstr2.replace("\\r","")
    tmpstr4 = tmpstr3.replace("\\n","")
    tmpstr5 = re.sub(r'[^A-Za-z0-9;,#-.]', '', tmpstr4)
    tmpstr6 = tmpstr5.replace("#VER;","")
    tmplst1 = tmpstr6.split('#TRANS;')
    tmpstr7 = str(tmplst1)
    tmpstr8 = str(tmplst1[0])
    tmpstr9 = tmpstr7.replace(";0",tmpstr8)
    tmpstr10 = re.sub(r'[^A-Za-z0-9;,-]', '', tmpstr9)
    tmpstr11 = tmpstr10.strip()
    tmplst2 = tmpstr11.split(',')
    tmplst2.pop(0)
    lst[i] = str(tmplst2)
print(lst[200])
This is what I get now:
['3001;6;1000050;-180000;;B;160;20201020;Kundfaktura;Test;Bolag;AB;4117;20201223', '2611;6;1000050;-45000;;B;160;20201020;Kundfaktura;Test;Bolag;AB;4117;20201223', '1510;6;1000050;225000;;B;160;20201020;Kundfaktura;Test;Bolag;AB;4117;20201223']
This is what I want to get:
3001;6;1000050;-180000;;B;160;20201020;Kundfaktura;Test Bolag AB;4117;20201223
2611;6;1000050;-45000;;B;160;20201020;Kundfaktura;Test Bolag AB;4117;20201223
1510;6;1000050;225000;;B;160;20201020;Kundfaktura;Test Bolag AB;4117;20201223
Thanks in advance!
Just save them in a list of lists.
Each entry then gets its own list, which makes the data more flexible for future use.
To do this, define a list before the loop and append to it on every iteration.
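As a rough sketch of that idea, the snippet below parses the sample from the question line by line and appends one list per #TRANS row, with the fields of the enclosing #VER line attached. The names parse_vouchers, VER_RE and TRANS_RE are made up for this illustration, and the regular expressions only assume the layout shown in the sample, not the full file format. The sample has no "Kundfaktura" field, so the joined rows will not match the desired output exactly.

```python
import re

# Sample copied from the question; the real input would be r.text.
SAMPLE = '''#VER B 160 20201020 "Test Bolag AB (4117)" 20201223
{
#TRANS 3001 {6 "1000050"} -180000 "" "" 0
#TRANS 2611 {6 "1000050"} -45000 "" "" 0
#TRANS 1510 {6 "1000050"} 225000 "" "" 0
}
#VER A 2 20200212 "Test Bolag AB1" 20201223
{
#TRANS 1930 {} -7549 "" "" 0
#TRANS 2641 {} 1209.75 "" "" 0
#TRANS 7990 {} 6339.25 "" "" 0
}
'''

# Assumed layouts, inferred only from the sample above.
VER_RE = re.compile(r'#VER\s+(\S+)\s+(\S+)\s+(\S+)\s+"([^"]*)"\s+(\S+)')
TRANS_RE = re.compile(r'#TRANS\s+(\S+)\s+\{([^}]*)\}\s+(\S+)')


def parse_vouchers(text):
    """Return a list of lists: one inner list per #TRANS row."""
    rows = []
    ver = None  # fields of the most recent #VER line
    for line in text.splitlines():
        line = line.strip()
        match = VER_RE.match(line)
        if match:
            ver = list(match.groups())
            continue
        match = TRANS_RE.match(line)
        if match and ver is not None:
            account, dims, amount = match.groups()
            # Dimension values like: 6 "1000050" -> ['6', '1000050']
            dims = dims.replace('"', '').split()
            rows.append([account] + dims + [amount] + ver)
    return rows


rows = parse_vouchers(SAMPLE)
for row in rows:
    print(";".join(row))
```

Because each row stays a list until the final join, the quoted name keeps its spaces and individual fields can still be reordered or dropped before printing.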
How can I rank the rows of a data frame based on a row value? That is, I have a row that contains text data and I want to assign a rank based on similarity.
Expected output
I have tried the Levenshtein distance, but I am not sure how to apply it to the whole table.
import string

import editdistance as ed  # assumption: `ed` refers to the editdistance package
import pandas as pd

def bow(x=None):
    x = x.lower()
    words = x.split(' ')
    words.sort()
    x = ' '.join(words)
    exclude = set('{}{}'.format(string.punctuation, string.digits))
    x = ''.join(ch for ch in x if ch not in exclude)
    x = '{} '.format(x.strip())
    return x
#intents = load_intents(export=True)
df['bow'] = df['name'].apply(lambda x: bow(x))
df.sort_values(by='bow',ascending=True,inplace=True)
last_bow = ''
recs = []
for idx,row in df.iterrows():
    record = {
        'name': row['name'],
        'bow': row['bow'],
        'lev_distance': ed.eval(last_bow,row['bow'])
    }
    recs.append(record)
    last_bow = row['bow']
intents = pd.DataFrame(recs,columns=['name','bow','lev_distance'])
l = intents[intents['lev_distance'] <= lev_distance_range]
r = []
for x in l.index.values:
    r.append(x - 1)
    r.append(x)
r = list(set(r))
l = intents.iloc[r,:]
Using textdistance, you could try this:
import pandas as pd
import textdistance
df = pd.DataFrame(
    {
        "text": [
            "Rahul dsa",
            "Rasul dsad",
            "Raul ascs",
            "shrez",
            "Indya",
            "Indi",
            "shez",
            "india",
            "kloa",
            "klsnsd",
        ],
    }
)
df = (
    df
    .assign(
        match=df["text"].map(
            lambda x: [
                i
                for i, text in enumerate(df["text"])
                if textdistance.jaro_winkler(x, text) >= 0.9
            ]
        )
    )
    .sort_values(by="match")
    .drop(columns="match")
)
print(df)
# Output
text
0 Rahul dsa
1 Rasul dsad
2 Raul ascs
3 shrez
6 shez
4 Indya
5 Indi
7 india
8 kloa
9 klsnsd
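If installing textdistance is not an option, the same grouping idea can be sketched with the standard library's difflib.SequenceMatcher. Note that this swaps Jaro-Winkler for a different similarity ratio, so scores are not comparable with the ones above. The helper names similarity and group_similar and the 0.75 threshold are assumptions chosen for illustration only.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_similar(texts, threshold=0.75):
    """Give each text the label of the first earlier group it resembles."""
    representatives = []  # first text seen in each group
    labels = []
    for text in texts:
        for label, rep in enumerate(representatives):
            if similarity(text, rep) >= threshold:
                labels.append(label)
                break
        else:
            # No existing group is similar enough: start a new one.
            representatives.append(text)
            labels.append(len(representatives) - 1)
    return labels


texts = ["india", "kloa", "India", "shrez"]
labels = group_similar(texts)
# Sort by group label so that similar texts end up next to each other.
ranked = [text for _, text in sorted(zip(labels, texts), key=lambda pair: pair[0])]
print(ranked)
```

The labels can also be assigned as a new data frame column and passed to sort_values, which keeps the ranking step separate from the similarity measure.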
I want to automatically edit .txt files with code. Every line containing victory_points should be removed and re-entered in another form after the "history={" statement. But in the end, the output contains an additional history={. Why?
Code:
def überschreiben(filename,vp, capital):
    data_out=open(filename,"r")
    data_in=open(filename+"_output.txt","w")
    vpsegment=False
    for line in data_out:
        if "\thistory" in line:
            data_in.write(line+'\n\t\tvictory_points = { '+str(capital)+' '+str(vp)+' }\n')
        if "\t\tvictory_points" in line:
            vppivot=line
            vpsegment=True
        if vpsegment==True:
            if "}" in line:
                data_in.write("")
                vpsegment=False
            else:
                data_in.write("")
        else:
            data_in.write(line)
    data_in.close()
    data_out.close()
Input:
state={
id=1
name="STATE_1" # Corsica
manpower = 322900
state_category = town
history={
owner = FRA
victory_points = { 3838 1 }
buildings = {
infrastructure = 4
industrial_complex = 1
air_base = 1
3838 = {
naval_base = 3
}
}
add_core_of = FRA
}
provinces={
3838 9851 11804
}
}
Output:
[...]
state_category = town
history={
victory_points = { 00001 8 }
history={
owner = FRA
buildings = {
infrastructure = 4
industrial_complex = 1
air_base = 1
3838 = {
naval_base = 3
}
}
add_core_of = FRA
}
provinces={
3838 9851 11804
}
}
Where does the second history={ come from?
Let's look at what happens when the loop reads the line history={ :
if "\thistory" in line:
    data_in.write(line+'\n\t\tvictory_points = { '+str(capital)+' '+str(vp)+' }\n')
The line contains "\thistory", so this writes the line itself (the first "history={") followed by the new victory_points line.
if "\t\tvictory_points" in line:
    vppivot=line
    vpsegment=True
Nothing happens here because the line does not contain "\t\tvictory_points".
if vpsegment==True:
    if "}" in line:
        data_in.write("")
        vpsegment=False
    else:
        data_in.write("")
else:
    data_in.write(line)
vpsegment is still False, so execution falls through to the else branch and writes the line a second time. That is where the second "\thistory={" comes from.
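One possible way to avoid the duplicate is to make sure each input line is written at most once. The sketch below works on a list of lines instead of open files so that it is easy to try out. The function name move_victory_points is hypothetical, and it assumes the same tab-indented layout as the input in the question.

```python
def move_victory_points(lines, vp, capital):
    """Drop old victory_points entries and add a new one after history={."""
    out = []
    in_vp_block = False  # True while skipping a multi-line victory_points block
    for line in lines:
        if in_vp_block:
            # Keep dropping lines until the block closes.
            if "}" in line:
                in_vp_block = False
            continue
        if "\t\tvictory_points" in line:
            # Drop the old entry; keep skipping if it does not close here.
            in_vp_block = "}" not in line
            continue
        out.append(line)  # every other line is written exactly once
        if "\thistory" in line:
            out.append("\t\tvictory_points = { %s %s }\n" % (capital, vp))
    return out


sample = [
    "state={\n",
    "\thistory={\n",
    "\t\towner = FRA\n",
    "\t\tvictory_points = { 3838 1 }\n",
    "\t\tadd_core_of = FRA\n",
    "\t}\n",
    "}\n",
]
result = move_victory_points(sample, vp=8, capital=3838)
print("".join(result))
```

With real files, the list can come from readlines() and the result can be passed to writelines().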
I have a JSON array shown below.
[
"3D3iAR9M4HDETajfD79gs9BM8qhMSq5izX",
"35xfg4UnpEJeHDo55HNwJbr1V3G1ddCuVA"
]
I would like to attach a value (self.tx_amount_5) to each string so that I get a JSON object something like this:
{
"3D3iAR9M4HDETajfD79gs9BM8qhMSq5izX" : 100000,
"35xfg4UnpEJeHDo55HNwJbr1V3G1ddCuVA" : 100000
}
The part of code that has generated the first JSON array is:
r = requests.get('http://api.blockcypher.com/v1/btc/main/addrs/A/balance')
balance = r.json()['balance']
with open("Entries#x1.csv") as f,open("winningnumbers.csv") as nums:
    nums = set(imap(str.rstrip, nums))
    r = csv.reader(f)
    results = defaultdict(list)
    for row in r:
        results[sum(n in nums for n in islice(row, 1, None))].append(row[0])
self.number_matched_0 = results[0]
self.number_matched_1 = results[1]
self.number_matched_2 = results[2]
self.number_matched_3 = results[3]
self.number_matched_4 = results[4]
self.number_matched_5 = results[5]
self.number_matched_5_json = json.dumps(self.number_matched_5, sort_keys = True, indent = 4)
print(self.number_matched_5_json)
if len(self.number_matched_3) == 0:
    print('Nobody matched 3 numbers')
else:
    self.tx_amount_3 = int((balance*0.001)/ len(self.number_matched_3))
if len(self.number_matched_4) == 0:
    print('Nobody matched 4 numbers')
else:
    self.tx_amount_4 = int((balance*0.1)/ len(self.number_matched_4))
if len(self.number_matched_5) == 0:
    print('Nobody matched 5 numbers')
else:
    self.tx_amount_5 = int((balance*0.4)/ len(self.number_matched_5))
If I understand correctly, you can create the dictionary like this:
import json
s="""[
"3D3iAR9M4HDETajfD79gs9BM8qhMSq5izX",
"35xfg4UnpEJeHDo55HNwJbr1V3G1ddCuVA"
]"""
d = {el: self.tx_amount_5 for el in json.loads(s)}
print(d)
which produces
{'3D3iAR9M4HDETajfD79gs9BM8qhMSq5izX': 100000,
'35xfg4UnpEJeHDo55HNwJbr1V3G1ddCuVA': 100000}
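Since the addresses already exist as a Python list before json.dumps is called, the round trip through a JSON string is probably not needed. A minimal sketch, assuming the list and the amount are available as plain variables (addresses and tx_amount below are stand-ins for self.number_matched_5 and self.tx_amount_5):

```python
import json

# Stand-ins for self.number_matched_5 and self.tx_amount_5.
addresses = [
    "3D3iAR9M4HDETajfD79gs9BM8qhMSq5izX",
    "35xfg4UnpEJeHDo55HNwJbr1V3G1ddCuVA",
]
tx_amount = 100000

# Map every address to the same amount, then serialise as a JSON object.
payouts = dict.fromkeys(addresses, tx_amount)
payouts_json = json.dumps(payouts, sort_keys=True, indent=4)
print(payouts_json)
```

dict.fromkeys is safe here because the shared value is an immutable integer. With a mutable value, a dict comprehension would be the better choice.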
I have a JSON file with objects and a text file with several groups. Each group has 5 numbers, and I store them in lists this way: the first number of each group goes in list 1, the second number of each group goes in list 2, and so on. I basically have to match each object of the JSON with each group I created. The problem is that I always get the last element of the JSON as the result. The groups from the text file are created correctly.
This is my code:
import json
NUM_LIST = 5
index = 0
def report(a, b, c, d, e, index):
    json_file = 'json_global.json'
    json_data = open(json_file)
    data = json.load(json_data)
    i = 0
    index = 0
    item = 0
    cmd = " "
    ind = 0
    for node in data:
        for i in range(0, 5):
            item = data[i]['item']
            cmd = data[i]['command']
            index+= 1
    print item, cmd, a, b, c, d, e
f = open("Output.txt", "r")
lines = [line.rstrip() for line in f if line != "\n"]
NUM_LISTS = 5
groups = [[] for i in range(NUM_LISTS)]
listIndex = 0
for line in lines:
    if "Transactions/Sec for Group" not in line:
        groups[listIndex].append(float(line))
        listIndex += 1
        if listIndex == NUM_LISTS:
            listIndex = 0
value0 = groups[0]
value1 = groups[1]
value2 = groups[2]
value3 = groups[3]
value4 = groups[4]
for i in range(0, 5):
    a = value0[i]
    b = value1[i]
    c = value2[i]
    d = value3[i]
    e = value4[i]
    i += 1
    report(a, b, c, d, e, index)
The Json file looks like:
[
    {
        "item": 1,
        "command": "AA"
    },
    {
        "item": 2,
        "command": "BB"
    },
    {
        "item": 3,
        "command": "CC"
    },
    {
        "item": 4,
        "command": "DD"
    },
    {
        "item": 5,
        "command": "EE"
    }
]
The text file looks like this:
Transactions/Sec for Group = AA\CODE1\KK
1011.5032
2444.8864
2646.6893
2740.8531
2683.8178
Transactions/Sec for Group = BB\CODE1\KK
993.2360
2652.8784
3020.2740
2956.5260
3015.5910
Transactions/Sec for Group = CC\CODE1\KK
1179.5766
3271.5700
4588.2059
4174.6358
4452.6785
Transactions/Sec for Group = DD\CODE1\KK
1112.2567
3147.1466
4014.8404
3913.3806
3939.0626
Transactions/Sec for Group = EE\CODE1\KK
1205.8499
3364.8987
4401.1702
4747.4354
4765.7614
The logic in the body of the program works fine and the groups look correct, but instead of listing items 1 to 5 from the JSON file, every row shows item 5 with command EE. What should appear instead is items 1, 2, 3, 4, 5, each with its own command.
My list 1 will have the numbers: 1011.5032, 993.2360, 1179.5766, 1112.2567, 1205.8499.
My list 2 will have the numbers: 2444.8864, 2652.8784, 3271.5700, 3147.1466, 3364.8987.
The Python version I'm using is 2.6.
Based on your explanation it's hard to tell what you're trying to do -- do you mean the nested loop below? The inner loop executes 5 times, but in every iteration it overwrites the previous values for item and cmd.
for node in data:
    for i in range(0, 5):
        item = data[i]['item']
        cmd = data[i]['command']
        index+= 1
Try printing the values each time the inner loop executes:
for node in data:
    for i in range(0, 5):
        item = data[i]['item']
        cmd = data[i]['command']
        print item, cmd
        index+= 1
I think this code is your problem:
for node in data:
    for i in range(0, 5):
        item = data[i]['item']
        cmd = data[i]['command']
After this executes, item will always be 5 and cmd will always be "EE". Perhaps the indentation is off for the code beneath it, and that code is supposed to be inside the loop?
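If the goal is one output row per JSON object, paired with the numbers of the matching group, one option is to read the JSON once and walk both sequences together with zip. The sketch below uses shortened inline data in place of json_global.json and Output.txt. The helper read_groups and the reworked report signature are assumptions for illustration, and print is called with a single string so the code should also run on Python 2.6.

```python
import json

# Shortened stand-ins for json_global.json and Output.txt.
JSON_TEXT = '''[
    {"item": 1, "command": "AA"},
    {"item": 2, "command": "BB"},
    {"item": 3, "command": "CC"}
]'''

TEXT_LINES = [
    "Transactions/Sec for Group = AA\\CODE1\\KK", "1011.5032", "2444.8864",
    "Transactions/Sec for Group = BB\\CODE1\\KK", "993.2360", "2652.8784",
    "Transactions/Sec for Group = CC\\CODE1\\KK", "1179.5766", "3271.5700",
]


def read_groups(lines):
    """Return one list of floats per 'Transactions/Sec for Group' header."""
    groups = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if "Transactions/Sec for Group" in line:
            groups.append([])  # a header starts a new group
        else:
            # Assumes every number comes after at least one header line.
            groups[-1].append(float(line))
    return groups


def report(data, groups):
    """Pair the n-th JSON object with the n-th group of numbers."""
    rows = []
    for node, values in zip(data, groups):
        rows.append((node["item"], node["command"]) + tuple(values))
    return rows


rows = report(json.loads(JSON_TEXT), read_groups(TEXT_LINES))
for row in rows:
    print(" ".join(str(value) for value in row))
```

Loading the JSON once and pairing by position avoids the inner loop that kept overwriting item and cmd.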
I have a reply like this:
(Result){
rows[] =
(ResultRow){
param1 = "value1"
values[] =
(ResultValue){
paramx1 = "valuex1"
paramx2 = "valuex2"
paramx3 = "valuex3"
paramx4 = "valuex4"
paramx5 = "valuex5"
},
},
rownum = 1
}
When I want to print the value of param1, I do this:
for row in reply.rows:
    print row.param1
but I don't know how to print the value of paramx1 from the tuple values[].