How to read a comma-separated string in one cell using Python

I have a project in which I need to read data from an Excel file, and I use openpyxl to read it. I tried reading the data as strings first and then converting them to integers; however, an error occurs, I think because some cells contain several numbers separated by commas. I am trying to build a nested list, but I am still new to Python.
My code looks like this:
# storing S
S_follow = []
for row in range(2, max_row+1):
    if (sheet.cell(row,3).value is not None):
        S_follow.append(sheet.cell(row, 3).value);
# to convert the list from string to int, nested list
for i in range(0, len(S_follow)):
    S_follow[i] = int(S_follow[i])
print(S_follow)
The data I am trying to read is:
['2,3', 4, '5,6', 8, 7, 9, 8, 9, 3, 11, 0]
hoping for your help

When you convert the values to integers in the loop near the end of your script, you can check whether each value is an integer or a string. If it is a string, split it on the comma, convert the pieces to integers, collect them in a temporary list (called, say, strVal), and append that list to a new list (called, say, S_follow_int). If the value is not a string, append it to S_follow_int unchanged.
data = ['2,3', 4, '5,6', 8, 7, 9, 8, 9, 3, 11, 0]
S_follow = []
S_follow_int = []
# stand-in for the Excel loop: collect the non-empty values
for value in data:
    if value is not None:
        S_follow.append(value)
# to convert the list from string to int, nested list
for i in range(0, len(S_follow)):
    # if the current value is a string, split it, convert the values to integers,
    # put them in a temp list called strVal and then append it to S_follow_int
    if type(S_follow[i]) is str:
        x = S_follow[i].split(',')
        strVal = []
        for y in x:
            strVal.append(int(y))
        S_follow_int.append(strVal)
    # else it is already an integer, so just append it to S_follow_int unchanged
    else:
        S_follow_int.append(S_follow[i])
print(S_follow_int)
However, I would recommend checking the data type (str or int) of each value in the initial loop that retrieves the data from the Excel file, rather than pushing all values to S_follow and converting them afterwards, like this:
#simplified representation of the logic you can use for your script
data = ['2,3', 4, '5,6', 8, 7, 9, 8, 9, 3, 11, 0]
x = []
for dat in data:
    if dat is not None:
        if type(dat) is str:
            y = dat.split(',')
            strVal = []
            for z in y:
                strVal.append(int(z))
            x.append(strVal)
        else:
            x.append(dat)
print(x)

S_follow = ['2,3', 4, '5,6', 8, 7, 9, 8, 9, 3, 11, 0]
for i in range(0, len(S_follow)):
    try:
        s = S_follow[i].split(',')
        del S_follow[i]
        for j in range(len(s)):
            s[j] = int(s[j])
        S_follow.insert(i,s)
    except AttributeError as e:
        S_follow[i] = int(S_follow[i])
print(S_follow)
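
The approaches above can be folded into one small helper. This is only a sketch: the name parse_cell is made up for illustration, and it assumes every cell holds either an int or a string of comma-separated integers. In the original script it could be applied to sheet.cell(row, 3).value inside the reading loop.

```python
def parse_cell(value):
    """Return an int cell unchanged, or a list of ints for an 'a,b' string.

    parse_cell is a hypothetical helper, not part of openpyxl.
    """
    if isinstance(value, str):
        # int() tolerates surrounding whitespace, so '2, 3' also works
        return [int(part) for part in value.split(',')]
    return value


# sample values standing in for the spreadsheet column (None = empty cell)
data = ['2,3', 4, '5,6', 8, 7, 9, 8, 9, 3, 11, 0, None]
# skip empty cells, mirroring the "is not None" check in the question
S_follow = [parse_cell(v) for v in data if v is not None]
print(S_follow)
```

isinstance is used instead of type(...) is str so that subclasses of str are handled as well.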

Related

How to extract data from a text file and add it to a list?

Python noob here. I have this text file that has data arranged in a particular way, shown below.
x = 2,4,5,8,9,10,12,45
y = 4,2,7,2,8,9,12,15
I want to extract the x values and y values from this and put them into their respective arrays for plotting graphs. I looked into some sources but could not find a particular solution as they all used the "readlines()" method that returns as a list with 2 strings. I can convert the strings to integers but the problem that I face is how do I only extract the numbers and not the rest?
I did write some code;
#lists for storing values of x and y
x_values = []
y_values = []
#opening the file and reading the lines
file = open('data.txt', 'r')
lines = file.readlines()
#splitting the first element of the list into parts
x = lines[0].split()
#This is a temporary variable to remove the "," from the string
temp_x = x[2].replace(",","")
#adding the values to the list and converting them to integer.
for i in temp_x:
    x_values.append(int(i))
This gets the job done but the method I think is too crude. Is there a better way to do this?
You can use read().splitlines() and removeprefix() (str.removeprefix() requires Python 3.9 or later):
with open('data.txt') as file:
    lines = file.read().splitlines()
x_values = [int(x) for x in lines[0].removeprefix('x = ').split(',')]
y_values = [int(y) for y in lines[1].removeprefix('y = ').split(',')]
print(x_values)
print(y_values)
# output:
# [2, 4, 5, 8, 9, 10, 12, 45]
# [4, 2, 7, 2, 8, 9, 12, 15]
Since you're new to Python, here's a tip: never open a file without closing it. It is common practice to use a with statement to make sure that happens. As for your solution, you can do this:
with open('data.txt', 'r') as file:
    # extract the lines
    lines = file.readlines()
# extract the x and y values
# strip() removes the trailing newline, so the last value also passes isnumeric()
x_values = [
    int(el) for el in lines[0].strip().replace('x = ', '').split(',') if el.isnumeric()
]
y_values = [
    int(el) for el in lines[1].strip().replace('y = ', '').split(',') if el.isnumeric()
]
# the final output
print(x_values, y_values)
output:
[2, 4, 5, 8, 9, 10, 12, 45] [4, 2, 7, 2, 8, 9, 12, 15]
This approach uses a dictionary to store the data.
# read data from file
with open('data.txt', 'r') as fd:
    lines = fd.readlines()
# store in a (x,y)-dictionary
out = {}
for label, coord in zip(('x', 'y'), lines):
    # drop the "x = " part, then cast the strings to integers
    out[label] = list(map(int, coord.split('=')[1].split(',')))
# display data
print(out)
#{'x': [2, 4, 5, 8, 9, 10, 12, 45], 'y': [4, 2, 7, 2, 8, 9, 12, 15]}
print(out['y'])
#[4, 2, 7, 2, 8, 9, 12, 15]
If the desired output is a pair of lists, just substitute the main part with
out = []
for coord in lines:
    # drop the "x = " part, then cast the strings to integers
    out.append(list(map(int, coord.split('=')[1].split(','))))
X, Y = out
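
If the file may contain more than two series, or the labels are not known in advance, the label can be read from each line instead of being hard-coded. A minimal sketch, assuming every non-empty line has the form name = v1,v2,...; the function name parse_series is hypothetical, and a list of strings stands in for the lines of data.txt:

```python
def parse_series(lines):
    """Map each 'name = v1,v2,...' line to {name: [v1, v2, ...]}."""
    series = {}
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        # partition() splits on the first '=' only
        name, _, values = line.partition('=')
        # int() ignores surrounding whitespace, including the trailing newline
        series[name.strip()] = [int(v) for v in values.split(',')]
    return series


# stand-in for: with open('data.txt') as f: lines = f.readlines()
lines = ['x = 2,4,5,8,9,10,12,45\n', 'y = 4,2,7,2,8,9,12,15\n']
out = parse_series(lines)
print(out['x'])
print(out['y'])
```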

Python: Convert list of numbers to according letters

I know the answer is going to be obvious once I see it, but I can't find how to convert my output list of numbers back to letters after I've manipulated the list.
I am putting in data here:
import string
print [ord(char) - 96 for char in raw_input('Write Text: ').lower()]
and I want to be able to reverse this so after I manipulate the list, I can return it back to letters.
example: input gasoline / output [7, 1, 19, 15, 12, 9, 14, 5]
manipulate the output with append or other
then be able to return it back to letters.
Everything I find when searching only converts letters to numbers, and nothing converts that list of numbers back to letters.
Thank you!
It can be done by using the chr() built-in function:
my_list = [7, 1, 19, 15, 12, 9, 14, 5]
out = ""
for char in my_list:
    out += chr(96 + char)
print(out) # Prints gasoline
If you want the final output as a list of characters, use listOfChar; if you want a single string, use aWord.
l = [7, 1, 19, 15, 12, 9, 14, 5] # l is your list of integers
listOfChar = list(map(chr, [x+96 for x in l]))
aWord = "".join(map(chr, [x+96 for x in l])) # your word here is "gasoline"
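
Putting both directions side by side makes the round trip easier to check. This is a sketch in Python 3 syntax (the question uses Python 2's print statement and raw_input); the helper names to_numbers and to_letters are made up for illustration, and the input is assumed to contain only the letters a to z.

```python
def to_numbers(text):
    """'a' -> 1, 'b' -> 2, ... (assumes only letters a-z)."""
    return [ord(ch) - 96 for ch in text.lower()]


def to_letters(numbers):
    """Inverse of to_numbers: 1 -> 'a', 2 -> 'b', ..."""
    return ''.join(chr(n + 96) for n in numbers)


numbers = to_numbers('gasoline')
print(numbers)
numbers.append(19)  # manipulate the list, here by adding the code for 's'
print(to_letters(numbers))
```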

using += to populate a list through while loop gives me an error

I have a very basic understanding that += and .append are quite similar in terms of appending a new element to a list. However, I find they behave differently when I try to populate a list with random integer values in a while loop. append works well, but running my program with += gives me an error:
TypeError: 'int' object is not iterable
Here is my code:
1. Use +=
import random
random_list = []
list_length = 20
# Write code here and use a while loop to populate this list of random integers.
i = 0
while i < 20:
    random_list += random.randint(0,10)
    i = i + 1
print random_list
**TypeError: 'int' object is not iterable**
2. Use .append
import random
random_list = []
list_length = 20
# Write code here and use a while loop to populate this list of random integers.
i = 0
while i < 20:
    random_list.append(random.randint(0,10))
    i = i + 1
print random_list
**[4, 7, 0, 6, 3, 0, 1, 8, 5, 10, 9, 3, 4, 6, 1, 1, 4, 0, 10, 8]**
Does anyone know why this would happen?
This happens because += on a list extends it with the elements of another iterable (such as a list); it does not append a single item. An int is not iterable, hence the error.
It behaves much like:
items = items + new_value
except that += modifies the list in place and accepts any iterable, whereas + requires both operands to be lists. If new_value isn't a list, this fails because you can't use + to add an item to a list.
items = items + 5 # TypeError: can only concatenate list (not "int") to list
The solution is to make the value into a one-item list:
items += [value]
Or use .append, the preferred way to add a single item to a list.
Yes, it's tricky. Just add a , at the end of random.randint(0, 10):
import random
random_list = []
list_length = 20
# Write code here and use a while loop to populate this list of random integers.
i = 0
while i < 20:
    random_list += random.randint(0, 10),
    i += 1
print random_list
It will print:
[4, 7, 7, 10, 0, 5, 10, 2, 6, 2, 6, 0, 2, 7, 5, 8, 9, 8, 0, 2]
The trailing , turns the value into a one-element tuple, which is iterable, so += can extend the list with it.
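
The difference between the two operations can be seen in a few lines. This sketch uses Python 3 syntax, unlike the Python 2 print statements in the question, and the variable names are arbitrary.

```python
items = []
items.append(5)   # adds the int itself
items += [6]      # extends with the elements of a one-item list
items += (7,)     # a one-element tuple is iterable too
items += 'ab'     # a string is iterable, so each character is added

try:
    items += 8    # an int is not iterable
except TypeError as exc:
    error = str(exc)

print(items)
print(error)
```

The 'ab' case shows why += is best thought of as extend: it adds each element of whatever iterable it is given.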

Counting like elements in a list and appending list

I am trying to create a list in Python with values pulled from an active excel sheet. I want it to pull the step # value from the excel file and append it to the list while also including which number of that element it is. For example, 1_1 the first time it pulls 1, 1_2 the second time, 1_3 the third, etc. My code is as follows...
import win32com.client
xl = win32com.client.Dispatch("Excel.Application")
CellNum = xl.ActiveSheet.UsedRange.Rows.Count
Steps = []
for i in range(2,CellNum + 1): #Create load and step arrays in abaqus after importing from excel
    if str(int(xl.Cells(i,1).value))+('_1' or '_2' or '_3' or '_4' or '_5' or '_6') in Steps:
        StepCount = 1
        for x in Steps:
            if x == str(int(xl.Cells(i,1).value))+('_1' or '_2' or '_3' or '_4' or '_5' or '_6'):
                StepCount+=1
        Steps.append(str(int(xl.Cells(i,1).value))+'_'+str(StepCount))
    else:
        Steps.append(str(int(xl.Cells(i,1).value))+'_1')
I understand that without the excel file, the program will not run for any of you, but I was just wondering if it is some simple error that I am missing. When I run this, the StepCount does not go higher than 2 so I receive a bunch of 1_2, 2_2, 3_2, etc elements. I've posted my resulting list below.
>>> Steps
['1_1', '2_1', '3_1', '4_1', '5_1', '6_1', '7_1', '8_1', '9_1', '10_1', '11_1', '12_1',
'13_1', '14_1', '1_2', '14_2', '13_2', '12_2', '11_2', '10_2', '2_2', '3_2', '9_2',
'8_2', '7_2', '6_2', '5_2', '4_2', '3_2', '2_2', '1_2', '2_2', '3_2', '4_2', '5_2',
'6_2', '7_2', '8_2', '9_2', '10_2', '11_2', '12_2', '13_2', '14_2', '1_2', '2_2']
EDIT #1: So, if the ('_1' or '_2' or '_3' or '_4' or '_5' or '_6') will ALWAYS only use _1, is it this line of code that is messing with my counter?
if x == str(int(xl.Cells(i,1).value))+('_1' or '_2' or '_3' or '_4' or '_5' or '_6'):
Since it is only using _1, it will only count 1_1 and not check 1_2, 1_3, 1_4, etc
EDIT #2: Now I am using the following code. My input list is also below.
from collections import defaultdict
StepsList = []
Steps = []
tracker = defaultdict(int)
for i in range(2,CellNum + 1):
    StepsList.append(int(xl.Cells(i,1).value))
>>> StepsList
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1, 14, 13, 12, 11, 10, 2, 3, 9, 8,
7, 6, 5, 4, 3, 2, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 1, 2]
for cell in StepsList:
    Steps.append('{}_{}'.format(cell, tracker[cell]+1)) # This is +1 because the tracker starts at 0
    tracker[cell]+=1
I get the following error: ValueError: zero length field name in format from the for cell in StepsList: iteration block
EDIT #3: Got it working. For some reason it didn't like
Steps.append('{}_{}'.format(cell, tracker[cell]+1))
So I just changed it to
for cell in StepsList:
    tracker[cell]+=1
    Steps.append(str(cell)+'_'+str(tracker[cell]))
Thanks for all of your help!
This line:
if str(int(xl.Cells(i,1).value))+('_1' or '_2' or '_3' or '_4' or '_5' or '_6') in Steps:
does not do what you think it does. ('_1' or '_2' or '_3' or '_4' or '_5' or '_6') will always return '_1'. It does not iterate over that series of or values looking for a match.
Without seeing expected input vs. expected output, it's hard to point you in the correct direction to actually get what you want out of your code, but likely you'll want to leverage itertools.product or one of the other combinatoric methods from itertools.
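
To illustrate the point about or: it returns its first truthy operand, so the parenthesised expression is simply the string '_1'. The sketch below shows that, and one way to test each suffix in turn with any(); the sample list and variable names are made up for illustration.

```python
# 'or' returns its first truthy operand, so this is just '_1'
suffix = ('_1' or '_2' or '_3')
print(suffix)

steps = ['1_1', '2_1', '1_2']  # hypothetical sample data
step = '1'

# check every suffix explicitly instead of relying on 'or'
already_seen = any(step + s in steps for s in ('_1', '_2', '_3'))

# count how many labels already exist for this step number
count = sum(1 for x in steps if x.startswith(step + '_'))
print(already_seen, count)
```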
Update
Based on your comments, I think that this is a way of solving your problem. Assuming an input list of the following:
in_list = [1, 1, 1, 2, 3, 3, 4]
You can do the following:
from collections import defaultdict
tracker = defaultdict(int) # defaultdict is just a regular dict with a default value at new keys (in this case 0)
steps = []
for cell in in_list:
    steps.append('{}_{}'.format(cell, tracker[cell]+1)) # This is +1 because the tracker starts at 0
    tracker[cell]+=1
Result:
>>> steps
['1_1', '1_2', '1_3', '2_1', '3_1', '3_2', '4_1']
There are likely more efficient ways to do this using combinations of itertools, but this way is certainly the most straight-forward
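
The ValueError mentioned in the question's EDIT #2 (zero length field name in format) is what Python 2.6 raises for auto-numbered {} fields; explicit indices such as {0}_{1} work there as well as on later versions. A sketch of the same counting approach wrapped in a function, where label_occurrences is a hypothetical name:

```python
from collections import defaultdict


def label_occurrences(values):
    """Label each value with its running occurrence count, e.g. 1 -> '1_1', '1_2'."""
    tracker = defaultdict(int)  # missing keys start at 0
    labels = []
    for value in values:
        tracker[value] += 1
        # explicit field indices keep this compatible with Python 2.6
        labels.append('{0}_{1}'.format(value, tracker[value]))
    return labels


print(label_occurrences([1, 1, 1, 2, 3, 3, 4]))
```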

Searching for key string within target string in Python recursively

The following is for Python 3.2.3.
I would like to write a function that takes two arguments, a key string and a target string. The function is to recursively determine (it must be recursive) the positions of the key string in the target string.
Currently, my code is as follows.
def posSubStringMatchRecursive(target,key):
    import string
    index=str.rfind(target, key)
    if index !=-1:
        print (index)
        target=target[:(index+len(key)-1)]
        posSubStringMatchRecursive(target,key)
The issue with this is that there is no way to store all the locations of the key string in the target string in a list as the numbers indicating the location will just be printed out.
So, my question is, is there any way to change the code such that the positions of the key string in the target string can be stored in a list?
Example Output
posSubStringMatchRecursive('aatcgdaaaggraaa', 'aa')
13
12
7
6
0
Edit
The following code seems to work without the issue in Ashwini's code. Thanks, Lev.
def posSubStringMatchRecursive(target,key):
    import string
    index=str.rfind(target, key)
    if index ==-1:
        return []
    else:
        target=target[:(index+len(key)-1)]
        return ([index] + posSubStringMatchRecursive(target,key))
def posSubStringMatchRecursive(target,key,res):
    import string
    index=str.rfind(target, key)
    if index !=-1:
        target=target[:(index+len(key)-1)]
        res.append(index) #append the index to the list res
        return posSubStringMatchRecursive(target,key,res) #Use return here when calling recursively else your program will return None, and also pass res to the function
    else:
        return res
print(posSubStringMatchRecursive('aatcgdaaaggraaa', 'aa',[])) #pass an empty list to the function
print(posSubStringMatchRecursive('aatcgdaaaggraaa', 'a',[]))
output:
[13, 12, 7, 6, 0]
[14, 13, 12, 8, 7, 6, 1, 0]
Since it suspiciously resembles a homework question, here's an example of a recursive function that returns a list:
In [1]: def range_rec(limit):
   ...:     if limit == 0:
   ...:         return []
   ...:     else:
   ...:         return ([limit-1] + range_rec(limit-1)[::-1])[::-1]
   ...:
In [2]: range_rec(10)
Out[2]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
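
The same list-returning pattern can be applied to the substring search itself. The sketch below is one possible variant, not the code from the answers above: it searches from the left with str.find and a start offset, so the positions come back in ascending order and overlapping matches are included. The function name find_positions is hypothetical.

```python
def find_positions(target, key, start=0):
    """Recursively collect every index at which key occurs in target."""
    index = target.find(key, start)
    if index == -1:
        return []  # base case: no further match
    # continue one character past this match so overlaps are found too
    return [index] + find_positions(target, key, index + 1)


print(find_positions('aatcgdaaaggraaa', 'aa'))
print(find_positions('aatcgdaaaggraaa', 'a'))
```

Passing the start offset avoids slicing the string on every call, at the cost of one extra parameter.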
