Replace item in string formatted as csv line

Replace item in string formatted as csv line - python

The goal is to replace the second field of csv_line with new_item in an elegant way. This question differs from the topics listed by Rawing because it works with a different data structure, though those topics can serve as inspiration.
# Please assume that csv_line has not been imported from a file.
csv_line = 'unknown_item1,unknown_old_item2,unknown_item3'
new_item = 'unknown_new_item2'
goal = 'unknown_item1,unknown_new_item2,unknown_item3'
# Works but error prone. Non-replaced items could be inadvertently swapped.
# In addition, not convenient if string has many fields.
item1, item2, item3 = csv_line.split(',')
result = ','.join([item1, new_item, item3])
print(result) # unknown_item1,unknown_new_item2,unknown_item3
# Less error prone but ugly.
result_list = []
new_item_idx = 1
for i, item in enumerate(csv_line.split(',')):
    result_list += [item] if i != new_item_idx else [new_item]
result = ','.join(result_list)
print(result) # unknown_item1,unknown_new_item2,unknown_item3
# Ideal (not-error prone) but not working.
csv_line.split(',')[1] = new_item
print(csv_line) # unknown_item1,unknown_old_item2,unknown_item3

The second item can be replaced using Python's csv module together with io.StringIO() objects. These behave like files but read from and write to an in-memory string:
import csv
import io
csv_line = 'unknown_item1,unknown_old_item2,unknown_item3'
new_item = 'unknown_new_item2'
row = next(csv.reader(io.StringIO(csv_line)))
row[1] = new_item
output = io.StringIO()
csv.writer(output).writerow(row)
goal = output.getvalue()
print(goal)
This would display goal as (followed by the '\r\n' line terminator that csv.writer appends by default):
unknown_item1,unknown_new_item2,unknown_item3
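One reason to prefer the csv module over a plain split(',') is that it respects quoting, so a comma inside a quoted field does not shift the indexes. The sketch below wraps the approach above in a hypothetical helper, replace_field (not part of the original answer). It assumes a single-line record and passes lineterminator='' so the result carries no trailing newline:

```python
import csv
import io


def replace_field(csv_line, index, new_item):
    """Return csv_line with the field at position index replaced (hypothetical helper)."""
    # Parse with the csv module so quoted commas stay inside their field.
    row = next(csv.reader(io.StringIO(csv_line)))
    row[index] = new_item
    output = io.StringIO()
    # lineterminator='' suppresses the line ending csv.writer adds by default.
    csv.writer(output, lineterminator='').writerow(row)
    return output.getvalue()


plain = replace_field('unknown_item1,unknown_old_item2,unknown_item3', 1, 'unknown_new_item2')
quoted = replace_field('"Smith, John",old,x', 1, 'new')
print(plain)
print(quoted)
```

With a plain split(','), the second call would treat "Smith and John" as two separate fields and replace the wrong one.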

l = csv_line.split(',')
l[1] = new_item
csv_line = ','.join(l)

In the line csv_line.split(',')[1] = new_item, you do not alter the csv_line variable at all. You need to assign the new list created with .split() to a variable before you can change the elements within it:
new_csv = csv_line.split(',')
new_csv[1] = new_item
print(','.join(new_csv))

This seems the most pythonic:
csv_line = 'unknown_item1,old_item2,unknown_item3'
old_index = 1
new_item = 'new_item2'
goal = 'unknown_item1,new_item2,unknown_item3'
items = csv_line.split(',')
items[old_index] = new_item
print(','.join(items))
print(goal)
Output:
unknown_item1,new_item2,unknown_item3
unknown_item1,new_item2,unknown_item3

Related

How can I make the for work in this function?

could you help me with this problem? damned for! :p
def exchange(x):
    r = requests.get(URL1 + x + URL2)
    js = r.json()
    df = pd.DataFrame.from_dict(js, orient="index").transpose()
    return df
If I capture the data with the following code, each individual append() gives the expected answer:
c = exchange("tiendacrypto")
d = exchange("belo")
c.append(d)
But I can't find the error in the for loop:
a = []
for i in exchanges:
    print(exchange(i))
    a = exchange(i)
    a.append(a)

The issue here is the reassignment of a on the second line of the for loop body.
You need to use a different variable name.
a = []
for i in exchanges:
    print(exchange(i))
    x = exchange(i)
    a.append(x)
Notice that a is no longer reassigned on each iteration.

You're using a twice.
a = []
for i in exchanges:
    print(exchange(i))
    a = exchange(i)  # Here a is overwritten!
    a.append(a)
Use a separate name for the list of results instead:
results = []
for i in exchanges:
    df = exchange(i)
    print(df)
    results.append(df)

Python list data filtering

I have a list that holds names of files, some of which are almost identical except for their timestamp string section. The list is in the format of [name-subname-timestamp] for example:
myList = ['name1-001-20211202811.txt', 'name1-001-202112021010.txt', 'name1-002-202112021010.txt', 'name2-002-202112020811.txt']
What I need is a list that holds for every name and subname, the most recent file derived by the timestamp. I have started by creating a list that holds every [name-subname]:
name_subname_list = []
for row in myList:
    name_subname_list.append((row.rpartition('-')[0]))
name_subname_list = set(name_subname_list) # {'name1-001', 'name2-002', 'name1-002'}
Not sure if it is the right approach, moreover I am not sure how to continue. Any ideas?

This code does what you asked for.
For each name-subname, it keeps the corresponding newest file:
from datetime import datetime as dt
dic = {}
for i in myList:
    sp = i.split('-')
    name_subname = sp[0]+'-'+sp[1]
    mytime = sp[2].split('.')[0]
    if name_subname not in dic:
        dic[name_subname] = mytime
    else:
        if dt.strptime(mytime, "%Y%m%d%H%M") > dt.strptime(dic[name_subname], "%Y%m%d%H%M"):
            dic[name_subname] = mytime
result = []
for name_subname in dic:
    result.append(name_subname+'-'+dic[name_subname]+'.txt')
which outputs result as:
['name1-001-202112021010.txt',
'name1-002-202112021010.txt',
'name2-002-202112020811.txt']

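Try this. Note that max compares the timestamp strings lexicographically, so it only picks the newest file when every timestamp is zero-padded to the same width. In the sample list, '20211202811' is one digit short and compares as larger than '202112021010'.
myList = ['name1-001-20211202811.txt', 'name1-001-202112021010.txt', 'name1-002-202112021010.txt', 'name2-002-202112020811.txt']
dic = {}
for name in myList:
    parts = name.split('-')
    dic.setdefault(parts[0] + '-' + parts[1], []).append(parts[2])
unique_list = []
for key,value in dic.items():
    unique_list.append(key + '-' + max(value))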
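Both answers boil down to grouping by name-subname and keeping the entry with the latest timestamp. A minimal sketch of that idea follows, comparing parsed datetime values rather than raw strings. The files list is hypothetical sample data with every timestamp zero-padded to 12 digits (YYYYMMDDHHMM), and parse_name is an illustrative helper, not something from the answers above:

```python
from datetime import datetime

# Hypothetical sample data; all timestamps use the 12-digit YYYYMMDDHHMM form.
files = [
    'name1-001-202112020811.txt',
    'name1-001-202112021010.txt',
    'name1-002-202112021010.txt',
    'name2-002-202112020811.txt',
]


def parse_name(filename):
    """Split 'name-subname-timestamp.txt' into (group key, parsed timestamp)."""
    key, _, tail = filename.rpartition('-')
    stamp = tail.split('.')[0]
    return key, datetime.strptime(stamp, '%Y%m%d%H%M')


newest = {}
for filename in files:
    key, when = parse_name(filename)
    # Keep the file if it is the first or the most recent seen for this key.
    if key not in newest or when > newest[key][0]:
        newest[key] = (when, filename)

result = [filename for _, filename in newest.values()]
print(result)
```

Storing the original filename next to the parsed timestamp avoids having to rebuild the name (and its extension) afterwards.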

CSV Python Outputting: Outputting non-matching field once rather than once for every item in list

I've been trying to figure this out for about a year now and I'm really burnt out on it so please excuse me if this explanation is a bit rough.
I cannot include job data, but it would be accurate to imagine 2 csv files both with the first column populated with values (Serial numbers/phone numbers/names, doesn't matter - just values). Between both csv files, some values would match while other values would only be contained in one or the other (Timmy is in both files and is a match, Robert is only in file 1 and does not match any name in file 2).
I can successfully output a csv value ONCE that exists in both csv files (i.e. both files contain "Value78", output file will contain "Value78" only once).
When I try to tack on an else statement to my if condition, to handle non-matching items, the program will output 1 entry for every item it does not match with (makes 100% sense, matches happen once but every other comparison result besides the match is a non-match).
I cannot envision a structure or method to hold the fields that don't match back so that they can be output once and not overrun my terminal or output file.
My goal is to output two csv files, matches and non-matches, with the non-matches having only one entry per value.
Anyways, onto the code:
import csv
MYUNITS = 'MyUnits.csv'
VENDORUNITS = 'VendorUnits.csv'
MATCHES = 'Matches.csv'
NONMATCHES = 'NonMatches.csv'
with open(MYUNITS,mode='r') as MFile, \
     open(VENDORUNITS,mode='r') as VFile, \
     open(MATCHES,mode='w') as OFile, \
     open(NONMATCHES,mode='w') as NFile:
    MyReader = csv.reader(MFile,delimiter=',',quotechar='"')
    MyList = list(MyReader)
    VendorReader = csv.reader(VFile,delimiter=',',quotechar='"')
    VList = list(VendorReader)
    for x in range(len(MyList)):
        for y in range(len(VList)):
            if str(MyList[x][0]) == str(VList[y][0]):
                OFile.write(MyList[x][0] + '\n')
            else:
                pass
The "else: pass" is where the logic of filtering out non-matches is escaping me. Outputting from this else statement will write the non-matching value (len(VList) - 1) times for an iteration that DOES produce 1 match, the entire len(VList) for an iteration with no match. I've tried using a counter and only outputting if the counter equals the len(VList), (incrementing in the else statement, writing output under the scope of the second for loop), but received the same output as if I tried outputting non-matches.

Below is one way you might go about deduplicating and then writing to a file:
import csv
MYUNITS = 'MyUnits.csv'
VENDORUNITS = 'VendorUnits.csv'
MATCHES = 'Matches.csv'
NONMATCHES = 'NonMatches.csv'
list_of_non_matches = []
with open(MYUNITS,mode='r') as MFile, \
     open(VENDORUNITS,mode='r') as VFile, \
     open(MATCHES,mode='w') as OFile, \
     open(NONMATCHES,mode='w') as NFile:
    MyReader = csv.reader(MFile,delimiter=',',quotechar='"')
    MyList = list(MyReader)
    VendorReader = csv.reader(VFile,delimiter=',',quotechar='"')
    VList = list(VendorReader)
    for x in range(len(MyList)):
        for y in range(len(VList)):
            if str(MyList[x][0]) == str(VList[y][0]):
                OFile.write(MyList[x][0] + '\n')
                break
        else:
            # Runs only when the inner loop finished without finding a match
            list_of_non_matches.append(MyList[x][0])
    # Remove duplicates from the non matches, keeping their order
    new_list = []
    for x in list_of_non_matches:
        if x not in new_list:
            new_list.append(x)
    # Write the new list to a file
    for i in new_list:
        NFile.write(i + '\n')

Does this work?
import csv
MYUNITS = 'MyUnits.csv'
VENDORUNITS = 'VendorUnits.csv'
MATCHES = 'Matches.csv'
NONMATCHES = 'NonMatches.csv'
with open(MYUNITS,'r') as MFile, \
     open(VENDORUNITS,'r') as VFile, \
     open(MATCHES,'w') as OFile, \
     open(NONMATCHES,'w') as NFile:
    MyReader = csv.reader(MFile,delimiter=',',quotechar='"')
    MyList = list(MyReader)
    MyVals = [x[0] for x in MyList]
    VendorReader = csv.reader(VFile,delimiter=',',quotechar='"')
    VList = list(VendorReader)
    vVals = [x[0] for x in VList]
    for val in MyVals:
        if val in vVals:
            OFile.write(val + '\n')
        else:
            NFile.write(val + '\n')
#for x in range(len(MyList)):
# for y in range(len(VList)):
# if str(MyList[x][0]) == str(VList[y][0]):
# OFile.write(MyList[x][0] + '\n')
# else:
# pass

Sorry, I had some issues with my PC. I was able to solve my own question the night I posted. The solution I used is so simple I'm kicking myself for not figuring it out way sooner:
import csv
MYUNITS = 'MyUnits.csv'
VENDORUNITS = 'VendorUnits.csv'
MATCHES = 'Matches.csv'
NONMATCHES = 'NonMatches.csv'
with open(MYUNITS,mode='r') as MFile, \
     open(VENDORUNITS,mode='r') as VFile, \
     open(MATCHES,mode='w') as OFile, \
     open(NONMATCHES,mode='w') as NFile:
    MyReader = csv.reader(MFile,delimiter=',',quotechar='"')
    MyList = list(MyReader)
    VendorReader = csv.reader(VFile,delimiter=',',quotechar='"')
    VList = list(VendorReader)
    for x in range(len(MyList)):
        tmpStr = ''
        for y in range(len(VList)):
            if str(MyList[x][0]) == str(VList[y][0]):
                tmpStr = ''  # Reset to blank so the check below fails; works because of the break
                OFile.write(MyList[x][0] + '\n')
                break
            else:
                tmpStr = str(MyList[x][0])
        if tmpStr != '':
            NFile.write(tmpStr + '\n')
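The nested loop can also be avoided by collecting the vendor values into a set and testing membership once per value. The sketch below is one possible way to do that, not the method used in the answers above. It uses hypothetical in-memory data in place of MyUnits.csv and VendorUnits.csv, and it also writes each matching value only once:

```python
import csv
import io

# Hypothetical in-memory stand-ins for MyUnits.csv and VendorUnits.csv.
my_units = io.StringIO('Timmy,1\nRobert,2\nTimmy,3\nAlice,4\n')
vendor_units = io.StringIO('Timmy,x\nAlice,y\nZed,z\n')

# Collect the first column of the vendor file once; set lookups replace the inner loop.
vendor_values = {row[0] for row in csv.reader(vendor_units) if row}

matches, non_matches = [], []
seen = set()
for row in csv.reader(my_units):
    if not row or row[0] in seen:
        continue  # skip blank lines and values already handled
    seen.add(row[0])
    (matches if row[0] in vendor_values else non_matches).append(row[0])

print(matches)
print(non_matches)
```

Each list can then be written to its output file with one write per entry, so every value appears exactly once.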

Python - change value of list item

My code is as follows, with comments. It runs fine until it comes to changing the value of a list item, i.e. data[x][y] = something.
Tdata = cursor.fetchall() #Get data from MYSQL database
data = list(Tdata) #Convert into list...not sure if absolutely required
APIData = APIDataList()
MPLlat = 0.0
MPLLon = 0.0
RadiusOI = 15
for i in (range(0,len(data))):
    MPLCount = 0
    MPLlat = data[i][2]
    MPLLon = data[i][3]
    MPLCount = CountofbikesnearMPL(MPLlat, MPLLon, RadiusOI)
    if MPLCount>0:
        data[i][4] = MPLCount  # ERROR: this is where the error occurs:
                               # "'tuple' object does not support
                               # item assignment"
I really can't figure out why this is happening and have tried googling, but with no success. Any help will be deeply appreciated.
Thanks in advance.
C

cursor.fetchall() returns a list of tuples.
That means that data[i] will be a tuple, which is by definition immutable. If you want to modify data[i], you will need to turn your tuples into lists
data = [list(row) for row in Tdata]
or replace the entire row via tuple concatenation
data[i] = data[i][:4] + (MPLCount,) + data[i][5:]
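Both options can be tried without a database. The following sketch uses hypothetical rows shaped like a fetchall() result (the values are made up for illustration) to show the error and the two workarounds side by side:

```python
# Hypothetical rows shaped like the result of cursor.fetchall(): a list of tuples.
rows = [(1, 'a', 51.5, -0.1, 0), (2, 'b', 48.9, 2.3, 0)]

try:
    rows[0][4] = 7  # tuples are immutable, so this raises TypeError
except TypeError as exc:
    error_message = str(exc)

# Option 1: convert every row to a list, then assign in place.
as_lists = [list(row) for row in rows]
as_lists[0][4] = 7

# Option 2: keep tuples and rebuild the row around the new value.
rows[0] = rows[0][:4] + (7,) + rows[0][5:]

print(error_message)
print(as_lists[0])
print(rows[0])
```

Option 1 is simpler when many fields change. Option 2 keeps the rows as tuples, which may matter if later code relies on them being hashable.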

Maybe it is cleaner to write it using enumerate:
for i, elem in enumerate(data):
    # MPLCount = 0 - I suppose it is unnecessary since you overwrite the value below
    MPLlat = elem[2]
    MPLLon = elem[3]
    MPLCount = CountofbikesnearMPL(MPLlat, MPLLon, RadiusOI)
    if MPLCount > 0:
        data[i] = elem[:4] + (MPLCount,) + elem[5:]
Is there a reason you name variables and functions ThisWay? If not, it would be nice to follow PEP 8 and name them this_way. In any case, be consistent and don't mix the two styles.

An idiomatic Python version of Ruby code with a while loop that tests a statement?

The following Ruby code iterates over the source string and produces a list of the cumulative words delimited by a '.' character, other than those after the last '.'.
For example, given a source string of 'Company.Dept.Group.Team' the result will be ...
["Company.Dept.Group", "Company.Dept", "Company"]
Given that a while loop in Python (I believe) will test only an expression and not a statement as shown below, how would one best write this in idiomatic Python?
#ruby
source = 'Company.Dept.Group.Team'
results = []
temp = source.clone
while (i = temp.rindex('.')) # test statement not supported in Python?
  temp = temp[0...i]
  results << temp
end
p results # >> ["Company.Dept.Group", "Company.Dept", "Company"]

The Python idiom is something like this:
while True:
    i = temp.rfind('.')  # rfind returns -1 when there is no '.'; rindex would raise ValueError
    if i == -1:
        break
    ...

>>> source = 'Company.Dept.Group.Team'
>>> last = []
>>> [last.append(s) or '.'.join(last) for s in source.split('.')[:-1]]
['Company', 'Company.Dept', 'Company.Dept.Group']

To accomplish this in general, I'd probably do:
source = 'Company.Dept.Group.Team'
split_source = source.split('.')
results = ['.'.join(split_source[0:x]) for x in range(len(split_source) - 1, 0, -1)]
print(results)
A literal translation would be more like:
source = 'Company.Dept.Group.Team'
temp = source
results = []
while True:
    i = temp.rfind('.')
    if i < 0:
        break
    temp = temp[0:i]
    results.append(temp)
print(results)
Or, if you prefer:
source = 'Company.Dept.Group.Team'
temp = source
results = []
try:
    while True:
        temp = temp[0:temp.rindex('.')]
        results.append(temp)
except ValueError:
    pass
print(results)
Or:
source = 'Company.Dept.Group.Team'
temp = source
results = []
i = temp.rfind('.')
while i != -1:
    temp = temp[0:i]
    results.append(temp)
    i = temp.rfind('.')
print(results)
As you point out, the fact that you cannot treat assignment as an expression makes these cases a bit inelegant. I think the former case(s), i.e. "while True", are more common than the last one.
For more background, this post looks pretty good: http://effbot.org/pyfaq/why-can-t-i-use-an-assignment-in-an-expression.htm
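Since Python 3.8, assignment expressions (the := operator, PEP 572) let the assignment and the test share the while condition, much like the Ruby original. A minimal sketch, assuming Python 3.8 or later:

```python
source = 'Company.Dept.Group.Team'
temp = source
results = []
# rfind returns -1 once no '.' is left, which ends the loop.
while (i := temp.rfind('.')) != -1:
    temp = temp[:i]
    results.append(temp)
print(results)
```

The parentheses around the assignment are needed because := binds more loosely than the comparison.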

If you get used to Python you see list comprehensions and iterators/generators everywhere!
Python could be
source = 'Company.Dept.Group.Team'
# generate substrings
temp = source.split(".")
results = [".".join(temp[:i+1]) for i,s in enumerate(temp)]
# pop the team (alternatively slice the team out above)
results.pop()
# reverse results
results.reverse()
print(results) # should yield ["Company.Dept.Group", "Company.Dept", "Company"]
but most probably there are more idiomatic solutions ...

I would do
>>> import re
>>> source = 'Company.Dept.Group.Team'
>>> results = [source[:m.start()] for m in re.finditer(r"\.", source)]
>>> results
['Company', 'Company.Dept', 'Company.Dept.Group']
(use reversed(results) if you want the order to be reversed).
A more or less literal translation of your code into Python would be
source = 'Company.Dept.Group.Team'
results = []
temp = source
while True:
    try:
        i = temp.rindex('.')
        temp = temp[:i]
        results.append(temp)
    except ValueError:
        break
print(results)
