Hello, I'm trying to get this program to print out the list data for the corridor entered in the class call at the bottom, but it only prints out the very last row in the list. The program takes in a .csv file and turns it into a list. I'm not by any means a very experienced Python programmer.
class csv_get(object): # class to bring the .csv file into the program
import os
os.chdir(r'C:\Users\U2970\Documents\ArcGIS') # raw string so the backslashes aren't treated as escapes
gpsTrack = open('roadlog_intersection_export_02_18_2014_2.csv', 'rb')
# Figure out position of lat and long in the header
headerLine = gpsTrack.readline()
valueList = headerLine.split(",")
class data_set(object): # place columns from .csv file into a python dictionary
dict = {'DESC' : csv_get.valueList.index("TDD_DESC"),
'ROUTE_NAME' : csv_get.valueList.index("ROUTE_NAME"),
'CORRIDOR': csv_get.valueList.index("CORRIDOR"),
'ROADBED': csv_get.valueList.index("DC_RBD"),
'BEG_RP': csv_get.valueList.index("BEG_RP"),
'END_RP': csv_get.valueList.index("END_RP"),
'DESIGNATION': csv_get.valueList.index("NRLG_SYS_DESC")}
class columns_set(object): # append the dict into a list
new_list = []
for line in csv_get.gpsTrack.readlines():
segmentedLine = line.split(",")
new_list.append([segmentedLine[data_set.dict['DESC']],\
'{:>7}'.format(segmentedLine[data_set.dict['ROUTE_NAME']]),\
'{:>7}'.format(segmentedLine[data_set.dict['CORRIDOR']]),\
'{:>7}'.format(segmentedLine[data_set.dict['ROADBED']]),\
'{:>7}'.format(segmentedLine[data_set.dict['BEG_RP']]),\
'{:>7}'.format(segmentedLine[data_set.dict['END_RP']]),\
'{:>7}'.format(segmentedLine[data_set.dict['DESIGNATION']])])
class data:
def __init__(self,corridor):
for col in columns_set.new_list: # for each column in the list new_list
self.desc = col[0]
self.route = col[1] # assigns column names to column numbers
self.corridor = col[2]
self.roadbed = col[3]
self.beg_rp = col[4]
self.end_rp = col[5]
self.designation = col[6]
def displayData(self): # print data for corridor number entered
print self.desc,\
self.route,\
self.corridor,\
self.roadbed,\
self.beg_rp,\
self.end_rp,\
self.designation
set1 = data('C000021') # corridor number to be sent into data class
# should print all the corridor data but only prints very last record
set1.displayData()
You're only storing data from the current row, and overwriting it with each row. A line like self.desc = col[0] says "overwrite self.desc so it refers to the value of col[0]."
I hate to say it, but all of this code is flawed at a fundamental level. Your classes, except for data, are really functions. And even data is defective because it pulls in hardwired elements from outside itself.
You really should use Python's included CSV module to parse a CSV file into lists of lists. It can even give you a list of dictionaries and handle the header line.
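For instance, here is a minimal sketch with the csv module (the sample rows and route values are made up; the column names are taken from the code above):

```python
import csv
import io

# Stand-in for the real file, with the same header names as in the question:
sample = io.StringIO(
    "TDD_DESC,ROUTE_NAME,CORRIDOR,DC_RBD,BEG_RP,END_RP,NRLG_SYS_DESC\n"
    "desc one,I-15,C000021,NB,0.0,1.5,Interstate\n"
    "desc two,I-70,C000099,EB,2.0,3.5,Interstate\n"
)

reader = csv.DictReader(sample)   # handles the header line for you
rows = list(reader)               # one dictionary per data row
matches = [r for r in rows if r['CORRIDOR'] == 'C000021']
for r in matches:
    print(r['TDD_DESC'], r['ROUTE_NAME'], r['BEG_RP'], r['END_RP'])
```

With the real file you would pass open('roadlog_intersection_export_02_18_2014_2.csv') to DictReader instead of the StringIO stand-in.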
I'm trying to retrieve the index of a row within a dataframe using the loc method and a comparison of data from another dataframe within a for loop. Maybe I'm going about this wrong, I dunno. Here's a bit of information to help give the problem some context...
The following function imports some inventory data into a pandas dataframe from an xlsx file; this seemingly works just fine:
def import_inventory():
import warnings
try:
with warnings.catch_warnings(record=True):
warnings.simplefilter("always")
return pandas.read_excel(config_data["inventory_file"],header=1)
except Exception as E:
writelog.error(E)
sys.exit(E)
The following function imports some data from a combination of CSV files, creating a singular dataframe to work from during comparison; this seemingly works just fine:
def get_report_results():
output_dir = f"{config_data['output_path']}/reports"
report_ids = []
......
...execute and download the report csv files
......
reports_content = []
for path,current_directory,files in os.walk(output_dir):
for file in files:
file_path = os.path.join(path,file)
clean_csv_data(file_path) # This function simply cleans up the CSV content (removes blank rows, removes unnecessary footer data); updates same file that was sent in upon successful completion
current_file_content = pandas.read_csv(file_path,index_col=None,header=7)
reports_content.append(current_file_content)
reports_content = pandas.concat(reports_content,axis=0,ignore_index=True)
return reports_content
The problem exists here, in the following function, which is supposed to search the reports content for the existence of an ID value and then grab that row's index so I can use it in the future to modify some columns and add some columns.
def search_reports(inventory_df,reports_df):
for index,row in inventory_df.iterrows():
reports_index = reports_df.loc[reports_df["Inventory ID"] == row["Inv ID"]].index[0]
print(reports_df.iloc[reports_index]["Lookup ID"])
Here's the error I receive upon comparison
Length of values (1) does not match length of index (4729)
I can't quite figure out why this is happening. If I pull everything out of functions the work seems to happen the way it should. Any ideas?
There's a bit more work happening to the dataframe that comes from import_inventory, but didn't want to clutter the question. It's nothing major - one function adds a few columns that splits out a comma-separated value in the inventory into its own columns, another adds a column based on the contents of another column.
Edit:
As requested, the full stack trace is below. I've also included the other functions that operate on the original inventory_df object between its retrieval (import_inventory) and its final comparison (search_reports).
This function again operates on the inventory_df dataframe; this time it retrieves a single column from each row (if it has data) and breaks the semicolon-separated list of key-value pair tags apart for further inspection. If it finds a known key, it creates the necessary column for it and populates that row with the found value.
def sort_tags(inventory_df):
cluster_key = "Cluster:"
nodetype_key = "NodeType:"
project_key = "project:"
tags = inventory_df["Tags List"]
for index,tag in tags.items():
if not pandas.isna(tag):
tag_keysvalues = tag.split(";")
if any(cluster_key in string for string in tag_keysvalues):
pair = [x for x in tag_keysvalues if x.startswith(cluster_key)]
key_value_split = pair[0].split(":")
inventory_df.loc[index, "Cluster Name"] = key_value_split[1]
if any(nodetype_key in string for string in tag_keysvalues):
pair = [x for x in tag_keysvalues if x.startswith(nodetype_key)]
key_value_split = pair[0].split(":")
inventory_df.loc[index, "Node Type"] = key_value_split[1]
if any(project_key in string for string in tag_keysvalues):
pair = [x for x in tag_keysvalues if x.startswith(project_key)]
key_value_split = pair[0].split(":")
inventory_df.loc[index, "Project Name"] = key_value_split[1]
return inventory_df
This function compares the new inventory DF with a CSV import-to-DF of the old inventory. It creates new columns based on old inventory data if it finds a match. I know this is ugly code, but I'm hoping to replace it when I can find a solution to my current problem.
def compare_inventories(old_inventory_df,inventory_df):
aws_rowcount = len(inventory_df)
now = parser.parse(datetime.utcnow().isoformat()).replace(tzinfo=timezone.utc).astimezone(tz=None)
for a_index,a_row in inventory_df.iterrows():
if a_row["Comments"] != "none":
for o_index,o_row in old_inventory_df.iterrows():
last_checkin = parser.parse(str(o_row["last_checkin"])).replace(tzinfo=timezone.utc).astimezone(tz=None)
if (a_row["Comments"] == o_row["asset_name"]) and ((now - timedelta(days=30)) <= last_checkin):
inventory_df.loc[a_index,["Found in OldInv","OldInv Address","OldInv Asset ID","Inv ID"]] = ["true",o_row["address"],o_row["asset_id"],o_row["host_id"]]
return inventory_df
Here's the stack trace for the error:
Traceback (most recent call last):
File "c:\Users\beefcake-quad\Code\INVENTORYAssetSnapshot\main.py", line 52, in main
reports_index = reports_df.loc[reports_df["Inventory ID"] == row["Inv ID"]].index
File "c:\Users\beefcake-quad\Code\INVENTORYAssetSnapshot\.venv\lib\site-packages\pandas\core\ops\common.py", line 70, in new_method
return method(self, other)
File "c:\Users\beefcake-quad\Code\INVENTORYAssetSnapshot\.venv\lib\site-packages\pandas\core\arraylike.py", line 40, in __eq__
return self._cmp_method(other, operator.eq)
File "c:\Users\beefcake-quad\Code\INVENTORYAssetSnapshot\.venv\lib\site-packages\pandas\core\series.py", line 5625, in _cmp_method
return self._construct_result(res_values, name=res_name)
File "c:\Users\beefcake-quad\Code\INVENTORYAssetSnapshot\.venv\lib\site-packages\pandas\core\series.py", line 3017, in _construct_result
out = self._constructor(result, index=self.index)
File "c:\Users\beefcake-quad\Code\INVENTORYAssetSnapshot\.venv\lib\site-packages\pandas\core\series.py", line 442, in __init__
com.require_length_match(data, index)
File "c:\Users\beefcake-quad\Code\INVENTORYAssetSnapshot\.venv\lib\site-packages\pandas\core\common.py", line 557, in require_length_match
raise ValueError(
ValueError: Length of values (1) does not match length of index (7150)
reports_index = reports_df.loc[report_data["Inventory ID"] == row["Inv ID"].index[0]
is missing a ] at the end — as written, .index[0] is applied to row["Inv ID"] instead of to the result of the .loc[] lookup.
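As a sketch with the closing bracket in place (toy dataframes with made-up values; the .index[0] access is also guarded so an empty match doesn't raise an IndexError):

```python
import pandas as pd

# Toy stand-ins for the real dataframes (column names from the question):
reports_df = pd.DataFrame({"Inventory ID": [10, 20, 30],
                           "Lookup ID": ["a", "b", "c"]})
inv_id = 20                           # would be row["Inv ID"] inside the loop

mask = reports_df["Inventory ID"] == inv_id
matches = reports_df.loc[mask].index  # note the ] closing .loc before .index
if len(matches) > 0:                  # guard: no match would make .index[0] fail
    reports_index = matches[0]
    print(reports_df.iloc[reports_index]["Lookup ID"])
```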
The title may sound confusing...but this is what I need to do:
I have a list (which will be variable in length, with different values depending on various scenarios), e.g.: list1 = ['backup', 'downloadMedia', 'createAlbum']. From this list, I need to create one of the following for each of these items (and obviously the name will update depending on the item in the list):
I need to create a new list called: testcases_backup = []
I need to create a new list called: results_backup = []
I need to create a new list called: screenshot_paths_backup = []
And lastly, I need to open a new worksheet, which requires: worksheet1 = workbook.add_worksheet('Results'). Of note in this case, I will need to iterate 1, 2, 3, ... for the worksheet name for each of the items in the list. So for the first iteration, for 'backup', it will be worksheet1, then 2 for downloadMedia, etc.
I have tried using dictionaries, but at this point I am not making any real progress.
My attempt (I have very limited experience with dictionaries):
master_test_list = ['backup', 'downloadMedia', 'createAlbum']
master_test_dict = {}
def addTest(test, worksheet, testcases_list, results_list, screenshots_path_list):
master_test_dict[test] = worksheet
master_test_dict[test] = testcases_list
master_test_dict[test] = results_list
master_test_dict[test] = screenshots_path_list
for test in master_test_list:
addTest(test, "worksheet"+str(master_test_list.index(test)+1), "testcases_list_"+test, "results_list_"+test, "screenshots_path_list_"+test)
print(results_list_backup)
I thought this might work...but I just get strings inside the lists, and so I cannot define them as lists:
worksheets = []
for i in range(len(master_test_list)):
worksheets.append(str(i+1))
worksheets = ["worksheet%s" % x for x in worksheets]
testcases = ["testcases_list_%s" % x for x in master_test_list]
results = ["results_%s" % x for x in master_test_list]
screenshot_paths = ["screenshot_paths_%s" % x for x in master_test_list]
for w in worksheets:
w = workbook.add_worksheet('Results')
for t in testcases:
t = []
for r in results:
r = []
for s in screenshot_paths:
s = []
Adding a second answer since the code is significantly different, addressing the specified request for how to create n copies of lists:
def GenerateElements():
# Insert your code which generates your list here
myGeneratedList = ['backup', 'downloadMedia', 'createAlbum']
return myGeneratedList
def InitDict(ListOfElements):
# Don't make a new free-floating list for each element of list1. Generate and store the lists you want in a dictionary
return dict([[x, []] for x in ListOfElements])
def RunTest():
for myContent in list1:
# Do whatever you like to generate the data u need
myTestCaseList = ['a', 'b']
myResultsList = [1, 2]
myScreenshot_Paths_List = ['sc1', 'sc2']
# 1 Store your created list for test case of item 'myContent' from list1 in a dictionary
testcases[myContent].append(myTestCaseList)
# 2 Same but your results list
results[myContent].append(myResultsList)
# 3 Same but your screenshot_paths list
screenshot_paths[myContent].append(myScreenshot_Paths_List)
# 4 Make an excel sheet named after the item from list1
# run_vba_macro("C:\\Users\\xx-_-\\Documents\\Coding Products\\Python (Local)\\Programs\\Python X Excel\\AddSheets.xlsm","SheetAdder", "AddASheet", myContent)
list1 = GenerateElements()
testcases, results, screenshot_paths = InitDict(
list1), InitDict(list1), InitDict(list1)
NumTests = 5 # Number of tests you want
for x in range(NumTests):
RunTest()
What's going on here is just defining some initialization functions and then exercising them in a couple of lines.
My understanding is that you are running a series of tests, where you want a list of the inputs and outputs to be a running tally kind of thing. As such, this code uses a dictionary to store a list of lists. The dictionary key is how you identify which log you're looking at: test cases log vs results log vs screenshot_paths log.
As per my understanding of your requirements, each dictionary element is a list of lists where the 1st list is just the output of the first test. The second list is the first with the outcome of the second test/result appended to it. This goes on, so the structure looks like:
testcases= [ [testcase1] , [testcase1,testcase2] , [testcase1,testcase2,testcase3] ]
etc.
If this isn't exactly what you want you can probably modify it to suit your needs.
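A stripped-down sketch of that idea (hypothetical test names and outcomes), showing how a dictionary keyed by test name replaces the dynamically named lists:

```python
master_test_list = ['backup', 'downloadMedia', 'createAlbum']

# One dictionary per kind of log; each key maps to that test's own list.
testcases = {name: [] for name in master_test_list}
results = {name: [] for name in master_test_list}

for run in range(3):                        # three hypothetical test runs
    for name in master_test_list:
        testcases[name].append('case_%d' % run)
        results[name].append(run % 2 == 0)  # fake pass/fail outcome

print(testcases['backup'])   # ['case_0', 'case_1', 'case_2']
```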
Your explanation leaves some things to be imagined, but I think I've got what you need. There are two files: the .py Python file and an Excel file, which is the spreadsheet serving as a foundation for adding sheets. You can find the ones I made on my GitHub:
https://github.com/DavidD003/LearningPython
Here is the Excel code. Sharing it first because it's shorter. If you don't want to download mine, then make a workbook called 'AddSheets.xlsm' with a module called 'SheetAdder', and within that module put the following code:
Public Sub AddASheet(nm)
Application.DisplayAlerts = False 'Reset on workbook open event, since we need it to be False here right up to the point of saving and closing
Dim NewSheet As Worksheet
Set NewSheet = ThisWorkbook.Sheets.Add
NewSheet.Name = nm
End Sub
Make sure to add this to the 'ThisWorkbook' code in the 'MicroSoft Excel Objects' folder of the VBA project:
Private Sub Workbook_Open()
Application.DisplayAlerts = True
End Sub
The python script is as follows:
See [this question][1] for an example of how to format the filepath as a string for the function argument. I removed mine here.
import win32com.client as wincl
import os
# Following modified from https://stackoverflow.com/questions/58188684/calling-vba-macro-from-python-with-unknown-number-of-arguments
def run_vba_macro(str_path, str_modulename, str_macroname, shtNm):
if os.path.exists(str_path):
xl = wincl.DispatchEx("Excel.Application")
wb = xl.Workbooks.Open(str_path, ReadOnly=0)
xl.Visible = True
xl.Application.Run(os.path.basename(str_path)+"!" +
str_modulename+'.'+str_macroname, shtNm)
wb.Save()
wb.Close()
xl.Application.Quit()
del xl
# Insert your code which generates your list here
list1 = ['backup', 'downloadMedia', 'createAlbum']
# Don't make a new free-floating list for each element of list1. Generate and store the lists you want in a dictionary
testcases = dict([[x, []] for x in list1])
results = dict([[x, []] for x in list1])
screenshot_paths = dict([[x, []] for x in list1])
for myContent in list1:
myTestCaseList = [] # Do whatever you like to generate the data u need
myResultsList = []
myScreenshot_Paths_List = []
# 1 Store your created list for test case of item 'myContent' from list1 in a dictionary
testcases[myContent].append(myTestCaseList)
# 2 Same but your results list
results[myContent].append(myResultsList)
# 3 Same but your screenshot_paths list
screenshot_paths[myContent].append(myScreenshot_Paths_List)
# 4 Make an excel sheet named after the item from list1
run_vba_macro("C:\\Users\\xx-_-\\Documents\\Coding Products\\Python (Local)\\Programs\\Python X Excel\\AddSheets.xlsm",
"SheetAdder", "AddASheet", myContent)
I started working on this before you updated your question with a code sample, so bear in mind I haven't looked at your code at all lol. Just ran with this.
Here is a summary of what all of the above does:
-Creates an excel sheet with a sheet for every element in 'list1', with the sheet named after that element
-Generates 3 dictionaries, one for test cases, one for results, and one for screenshot paths, where each dictionary has a list for each element from 'list1', with that list as the value for the key being the element in 'list1'
[1]: https://stackoverflow.com/questions/58188684/calling-vba-macro-from-python-with-unknown-number-of-arguments
I'm working on a program and want to write my results into a comma-separated file, like a CSV.
new_throughput =[]
self.t._interval = 2
self.f = open("output.%s.csv"%postfix, "w")
self.f.write("time, Byte_Count, Throughput \n")
cur_throughput = stat.byte_count
t_put.append(cur_throughput)
b_count = (cur_throughput/131072.0) # bytes -> megabits (131072 = 2**20/8)
b_count_list.append(b_count)
L = [y-x for x,y in zip(b_count_list, b_count_list[1:])] #subtracting current value - previous, saves value into list
for i in L:
new_throughput.append(i/self.t._interval)
self.f.write("%s,%s,%s,%s \n"%(self.experiment, b_count, b_count_list,new_throughput)) #write to file
When running this code I get this in my CSV file:
picture here.
It somehow prints out the previous value every time.
What I want is new row for each new line:
time , byte_count, throughput
20181117013759,0.0,0.0
20181117013759,14.3157348633,7.157867431640625
0181117013759,53.5484619141,, 19.616363525390625
I don't have a working minimal example, but your last line should refer to the last member of each list, not the whole list. Something like this:
self.f.write("%s,%s,%s,%s \n"%(self.experiment, b_count, b_count_list[-1],new_throughput[-1])) #write to file
Edit: ...although if you want this simple solution to work, then you should initialize the lists with one initial value, e.g. [0], otherwise you'd get a "list index out of range error" at the first iteration according to your output.
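A standalone sketch of both points (made-up byte counts, plain variables in place of the class attributes, and the timestamp hard-coded): the lists are seeded with an initial 0.0 so [-1] and [-2] always exist, and each iteration writes one row built from the last member of each list.

```python
interval = 2.0
b_count_list = [0.0]    # seed so [-1] and [-2] exist on the first iteration
new_throughput = [0.0]

with open("output.csv", "w") as f:
    f.write("time,Byte_Count,Throughput\n")
    for raw_bytes in (0, 1876623, 7019934):    # made-up cumulative byte counts
        b_count = raw_bytes / 131072.0         # bytes -> megabits
        b_count_list.append(b_count)
        new_throughput.append((b_count_list[-1] - b_count_list[-2]) / interval)
        # write ONE row per sample, using only the last member of each list:
        f.write("%s,%s,%s\n" % ("20181117013759",
                                b_count_list[-1], new_throughput[-1]))
```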
Description of the problem:
I have an external *.xls file that I have converted to a *.csv file containing blocks of data such as:
"Legend number one";;;;Number of items;6
X;-358.6806792;-358.6716338;;;
Y;0.8767189;0.8966855;Avg;;50.1206378
Z;-0.7694626;-0.7520983;Std;;-0.0010354
D;8.0153902;8;Err;;1.010385
;;;;;
There are many, many blocks.
Each block may contain some additional lines of data:
"Legend number six";;;;Number of items;19
X;-358.6806792;-358.6716338;;;
Y;0.8767189;0.8966855;Avg;;50.1206378
Z;-0.7654644;-0.75283;Std;;-0.0010354
D;8.0153902;8;Err;;1.010385
A;0;1;Value;;0
B;1;0;;;
;;;;;
The structure is such that a new empty line separates each block, which is the ';;;;;' line in my samples.
The first line after this begins with a unique identifier of the block.
It appears that each line contains 6 elements, such as key1;elem1;elem2;key2;elem3;elem4, which would be nice to represent as two 3-element vectors, key1;elem1;elem2 and key2;elem3;elem4, on two separate lines. Example for the second sample:
"Legend number six";;
;;Number of items;19
X;-358.6806792;-358.6716338;
;;
Y;0.8767189;0.8966855;
Avg;;50.1206378
Z;-0.7654644;-0.75283;
Std;;-0.0010354
D;8.0153902;8;
Err;;1.010385
A;0;1;
Value;;0
B;1;0;
;;
;;;;;
Some are empty but I do not want to discard them for the moment.
But I would like to end up with a DataFrame containing columnwise elements for each block of data.
The cleanest "pre-solution" I have so far:
With this Python code I ended up with a more organized "list of dictionaries":
import os, sys, re, glob
import pandas as pd
csvFile = os.path.join(workingDir,'file.csv')
h = 0 # Number of lines to skip in head
s = 2 # number of values per key
s += 1
str1 = 'Number of items'
# Reading file in a global list and storing each line in a sublist:
A = [line.split(';') for line in open(csvFile).read().split('\n')]
# This code splits each 6-elements sublist in one new sublist
# containing two-elements; each element with 3 values:
B = [(';'.join(el[:s])+'\n'+';'.join(el[s:])).split('\n') for el in A]
# Init empty structures:
names = [] # to store block unique identifier (the name in the legend)
L = [] # future list of dictionaries
for el in (B):
for idx,elj in enumerate(el):
vi = elj.split(';')[1:]
# Here we grep the name only when the 2nd element of
# the first line contains the string "Number of items",
# which is constant all over the file:
if len(vi)>1 and vi[0]==str1:
name = el[idx-1].split(';')[0]
names.append(name)
#print(name)
# We loop again over B to append in a new list one dictionary
# per vector of 3 elements because each vector of 3 elements
# is structured like ; key;elem1;elem2
for el in (B):
for elj in (el):
k = elj.split(';')[0]
v = elj.split(';')[1:]
# Little tweak because the key2;elem3;elem4 of the
# first line (the one containing the name) have the
# key in the second place like "elem3;key2;elem4" :
if len(v)>1 and v[0]==str1:
kp = v[0]
v = [v[1],k]
k = kp
if k!='':
dct = {k:v}
L.append(dct)
So far I have been unsuccessful at extracting the name as a global identifier and all the values of the blocks as variables. I can't play with some modulo-based technique because of the variable amount of information in each separate block of data, even if all blocks contain at least some common keys.
I also tried a while condition within a for loop over each dictionary, but it's a mess now.
zip could potentially be a nice option but I don't really know how to use it properly.
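For instance, splitting one 6-element line from the sample above and pairing keys with values via zip (illustrative only):

```python
line = "Y;0.8767189;0.8966855;Avg;;50.1206378"
parts = line.split(";")              # 6 elements: key1, v1, v2, key2, v3, v4
first, second = parts[:3], parts[3:] # the two 3-element vectors
print(first)    # ['Y', '0.8767189', '0.8966855']
print(second)   # ['Avg', '', '50.1206378']

# zip pairs each key with its list of values, positionwise:
pairs = dict(zip([first[0], second[0]], [first[1:], second[1:]]))
```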
Target DataFrame:
What I'd like to end up with should ideally look something like a DataFrame containing:
index 'Number of items' 'X' '' 'Y' 'Avg' 'Z' 'Std' ...
"Legend number one" 6 ...
"Legend number six" 19 ...
"Legend number 11" 6 ...
"Legend number 15" 18 ...
The column names are the keys, and the table contains the values for each block of data on a separate line.
If there is a numbered index and a new column with "Legend name", that's OK as well.
CSV sample to play with:
"Legend number one";;;;Number of items;6
X;8.6806792;8.6716338;;;
Y;0.1557;0.1556;Avg;;50.1206378
Z;-0.7859;-0.7860;Std;;-0.0010354
D;8.0153902;8;Err;;1.010385
;;;;;
"Legend number six";;;;Number of items;19
X;56.6806792;56.6716338;;;
Y;0.1324;0.1322;Avg;;50.1206378
Z;-0.7654644;-0.75283;Std;;-0.0010354
D;8.0153902;8;Err;;1.010385
A;0;1;Value;;0
B;1;0;;;
;;;;;
"Legend number 11";;;;Number of items;6
X;358.6806792;358.6716338;;;
Y;0.1324;0.1322;Avg;;50.1206378
Z;-0.7777;-0.7778;Std;;-0.0010354
D;8.0153902;8;Err;;1.010385
;;;;;
"Legend number 15";;;;Number of items;18
X;58.6806792;58.6716338;;;
Y;0.1324;0.1322;Avg;;50.1206378
Z;0.5555;0.5554;Std;;-0.0010354
D;8.0153902;8;Err;;1.010385
A;0;1;Value;;0
B;1;0;;;
C;0;0;k;1;0
;;;;;
I'm using Ubuntu and Python 3.6 but the script must work on a Windows computer as well.
Appending this to the previous code should work pretty well:
Dict1 = {}
for elem in L:
for key,val in elem.items():
if key in names:
name = key
Dict2 = {}
else:
Dict2[key] = val
Dict1[name] = Dict2
df1 = pd.DataFrame.from_dict(Dict1, orient='index')
df2 = pd.DataFrame(index=df1.index)
for col in df1.columns:
colS = df1[col].apply(pd.Series)
colS = colS.rename(columns = lambda x : col+'_'+ str(x))
df2 = pd.concat([df2[:], colS[:]], axis=1)
df2.to_csv('output.csv', sep=',', index=True, header=True)
There are probably many other ways to go about this...
This link was helpful:
https://chrisalbon.com/python/data_wrangling/pandas_expand_cells_containing_lists/
I started fiddling with Python for the first time a week or so ago and have been trying to create a script that will replace instances of a string in a file with a new string. The actual reading and creation of a new file with the intended strings seems to be successful, but error checking at the end of the file displays output suggesting that there is an error. I checked a few other threads, but couldn't find a solution or alternative that fit what I was looking for or was at a level I was comfortable working with.
Apologies for the messy/odd code structure; I am very new to the language. The initial four variables are example values.
editElement = "Testvalue"
newElement = "Testvalue2"
readFile = "/Users/Euan/Desktop/Testfile.csv"
writeFile = "/Users/Euan/Desktop/ModifiedFile.csv"
editelementCount1 = 0
newelementCount1 = 0
editelementCount2 = 0
newelementCount2 = 0
#Reading from file
print("Reading file...")
file1 = open(readFile,'r')
fileHolder = file1.readlines()
file1.close()
#Creating modified data
fileHolder_replaced = [row.replace(editElement, newElement) for row in fileHolder]
#Writing to file
file2 = open(writeFile,'w')
file2.writelines(fileHolder_replaced)
file2.close()
print("Modified file generated!")
#Error checking
for row in fileHolder:
if editElement in row:
editelementCount1 +=1
for row in fileHolder:
if newElement in row:
newelementCount1 +=1
for row in fileHolder_replaced:
if editElement in row:
editelementCount2 +=1
for row in fileHolder_replaced:
if newElement in row:
newelementCount2 +=1
print(editelementCount1 + newelementCount1)
print(editelementCount2 + newelementCount2)
Expected output would be the last two instances of 'print' displaying the same value, however...
The first instance of print returns the value of A + B as expected.
The second line only returns the value of B (from fileHolder), and from what I can see, A has indeed been converted to B (in fileHolder_replaced).
Edit:
For example,
if the first two counts show A and B to be 2029 and 1619 respectively (fileHolder), the last two counts show A as 0 and B as 2029 (fileHolder_replaced). Obviously this is missing the original value of B.
So, a more extended version of what I said in the comment:
If you look for "Testvalue" in the modified file, it will find the string even where it has become "Testvalue2". That's because the original value is a substring of the modified value. Therefore it should find twice the number of occurrences — or, more precisely, the number of lines in which the string occurs.
If you query
if newElement in row
it will check whether the string newElement is contained in the string row.
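A short demonstration of that substring effect, using the example values from the question:

```python
editElement = "Testvalue"
newElement = "Testvalue2"

row = "a,b,Testvalue2,c"      # a line from the modified file
print(editElement in row)     # True -- "Testvalue" is a substring of "Testvalue2"
print(newElement in row)      # True
```

So every line containing the replacement still counts as a match for the original string in the final check.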