I am trying to convert a number as follows:
input -> 123456
output -> ['1','2','3','4','5','6']
The following loop does work:
number = 123456
defg = []
abc = str(number)
for i in range(len(abc)):
    defg.append(abc[i])
However, when I try this in the form of a one-line for loop, the output is [None, None, None, None, None, None].
My one line loop is as follows:
number = 123456
defg = []
abc = str(number)
defg= [[].append(abc[i]) for i in range(len(abc))]
Any idea what I am doing wrong?
The answers above are concise, but I personally prefer vanilla list comprehension over map, as powerful as it is. I add this answer simply for the sake of adding diversity to the mix, showing that the same can be achieved without using map.
str_list = [abc[i] for i in range(len(abc))]
We can strive for a more Pythonic solution by avoiding direct indexing and using a simpler loop.
better_str_list = [char for char in abc]
This list comprehension is basically just a translation of the for loop you already have.
Try this:
x = 527876324
print(list(str(x)))
Output
['5', '2', '7', '8', '7', '6', '3', '2', '4']
Here is the solution:
list(map(lambda x: str(x), str(number)))
Out[13]: ['1', '2', '3', '4', '5', '6']
or you can do this
list(map(str, str(number)))
Out[13]: ['1', '2', '3', '4', '5', '6']
or this
list(str(number))
['1', '2', '3', '4', '5', '6']
This is a duplicate; you can see more here:
Splitting integer in Python?
Solution
This gives the result you expect:
number = 123456
defg = []
abc = str(number)
[defg.append(abc[i]) for i in range(len(abc))]
print(defg)
When the comprehension runs, the values are appended to defg, but each append call modifies the list in place and returns None. Your one-liner assigns the list of those return values to defg, so it shows None entries.
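To make the None behaviour concrete, here is a small sketch (the names scratch and returned are only illustrative): list.append changes the list in place and returns None, so a comprehension built from append calls collects those None results.

```python
number = 123456
abc = str(number)

scratch = []
# append() mutates the list in place and returns None
returned = scratch.append(abc[0])
print(returned)  # None
print(scratch)   # ['1']

# A comprehension collects the value of its expression, so use the
# character itself as the expression instead of an append() call.
defg = [abc[i] for i in range(len(abc))]
print(defg)      # ['1', '2', '3', '4', '5', '6']
```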
Recommended way
You can use the list function to convert the number to a list
number = 865393410
result = list(str(number))
print(result)
If you want ints in the list, try this:
number = 865393410
result = []
for i in list(str(number)):
    result.append(int(i))
print(result)
With single line loop
number = 865393410
result = []
[result.append(int(i)) for i in list(str(number))]
print(result)
The last one is recommended for you
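As an alternative, a comprehension can also build the list of integers directly, without calling append for its side effect. This is a minimal sketch that assumes a non-negative integer (a leading minus sign would make int('-') fail); the name digits is only illustrative.

```python
number = 865393410

# Each character of the string form is converted back to an int
digits = [int(ch) for ch in str(number)]
print(digits)  # [8, 6, 5, 3, 9, 3, 4, 1, 0]
```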
I am trying to process a list of files:
file_list = ['.DS_Store', '9', '7', '6', '8', '01', '4', '3', '2', '5']
The goal is to find the files whose names have only one character.
I tried this code
import re
r = re.compile('[0-9]')
result_list = list(filter(r.match, file_list))
result_list
and got
['9', '7', '6', '8', '01', '4', '3', '2', '5']
where '01' should not be included.
I made a workaround
tmp = []
for i in file_list:
    if len(i)==1:
        tmp.append(i)
tmp
and I got
['9', '7', '6', '8', '4', '3', '2', '5']
This is exactly what I want, although the method is ugly.
How can I use regex in Python to finish the task?
r = re.compile('^[0-9]$')
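The ^ matches the beginning of the string and $ matches the end. Without them, r.match only checks that the string starts with a digit, which is why '01' was accepted.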
And if you really want it to match any character, not just numbers, it should be
r = re.compile('^.$')
The . in the regex is a single-character wildcard.
Match a string if it's simply any single character appearing at the beginning of the string (^.) right before the end of the string ($):
^.$
Regex101
Your Python then becomes:
r = re.compile('^.$')
result_list = list(filter(r.match, file_list))
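As a sketch of the same idea without explicit anchors, re.fullmatch (available since Python 3.4) requires the whole string to match the pattern. The variable names below follow the question; digit is an illustrative name.

```python
import re

file_list = ['.DS_Store', '9', '7', '6', '8', '01', '4', '3', '2', '5']

digit = re.compile('[0-9]')

# match() only anchors at the start, so '01' passes because it begins with a digit
assert digit.match('01') is not None

# fullmatch() succeeds only if the entire string is a single digit
result_list = [name for name in file_list if digit.fullmatch(name)]
print(result_list)  # ['9', '7', '6', '8', '4', '3', '2', '5']
```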
Your code is equivalent to
[ i for i in file_list if len(i)==1]
This method covers every case in which the file name has exactly one character, whatever that character is.
this is my code:
positions = []
for i in lines[2]:
    if i not in positions:
        positions.append(i)
print (positions)
print (lines[1])
print (lines[2])
the output is:
['1', '2', '3', '4', '5']
['is', 'the', 'time', 'this', 'ends']
['1', '2', '3', '4', '1', '5']
I want the variable "positions" to be ['2','3','4','1','5'],
so instead of dropping the second occurrence of a duplicate in "lines[2]", it should drop the first occurrence and keep the last.
You can reverse your list, create the positions, and then reverse the result back, as mentioned by @tobias_k in the comments:
lst = ['1', '2', '3', '4', '1', '5']
positions = []
for i in reversed(lst):
    if i not in positions:
        positions.append(i)
list(reversed(positions))
# ['2', '3', '4', '1', '5']
You'll need to first detect what values are duplicated before you can build positions. Use a collections.Counter() object to test if a value has been seen more than once:
from collections import Counter
counts = Counter(lines[2])
positions = []
for i in lines[2]:
    counts[i] -= 1
    if counts[i] == 0:
        # only add if this is the 'last' value
        positions.append(i)
This'll work for any number of repetitions of values; only the last value to appear is ever used.
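For reference, the counting approach can be wrapped in a small function. This is only a sketch: keep_last_occurrence is a hypothetical name, Counter is imported from the collections module, and the input is assumed to be a sequence (it is iterated twice).

```python
from collections import Counter

def keep_last_occurrence(items):
    """Return items without duplicates, keeping each value's last occurrence."""
    remaining = Counter(items)    # how many copies of each value are still ahead
    result = []
    for item in items:
        remaining[item] -= 1
        if remaining[item] == 0:  # no later copy exists, so this is the last one
            result.append(item)
    return result

print(keep_last_occurrence(['1', '2', '3', '4', '1', '5']))
# ['2', '3', '4', '1', '5']
```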
You could also reverse the list, and track what you have already seen with a set, which is faster than testing against the list:
positions = []
seen = set()
for i in reversed(lines[2]):
    if i not in seen:
        # only add if this is the first time we see the value
        positions.append(i)
        seen.add(i)
positions = positions[::-1] # reverse the output list
Both approaches require two passes: the Counter approach needs a first pass to create the counts mapping, and the reversal approach needs a second pass to reverse the output list. Which is faster will depend on the size of lines[2] and the number of duplicates in it, and whether or not you are using Python 3 (where Counter performance was significantly improved).
You can use a dictionary to save the last position of each element and then build a new list from that information:
>>> data=['1', '2', '3', '4', '1', '5']
>>> temp={ e:i for i,e in enumerate(data) }
>>> sorted(temp, key=lambda x:temp[x])
['2', '3', '4', '1', '5']
>>>
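A related sketch, assuming Python 3.7 or later where dict preserves insertion order: dict.fromkeys on the reversed list keeps the first occurrence seen from the end, which is the last occurrence in the original order. The name last_kept is only illustrative.

```python
data = ['1', '2', '3', '4', '1', '5']

# Deduplicate from the end, then restore the original direction
last_kept = list(dict.fromkeys(reversed(data)))[::-1]
print(last_kept)  # ['2', '3', '4', '1', '5']
```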
I am splitting a string in Python and my goal is to split by commas, except those between quotation marks. I am using
fields = line.strip().split(",")
but some strings are like the following one:
10,20,"Installations, machines",3,5
How can I use regular expressions for accomplishing this?
Although I agree that regular expressions may not be the best tool for the job, I found the problem quite interesting on its own.
import re
split_on_commas = re.compile(r'[^,]*".*"[^,]*|[^,]+|(?<=,)|^(?=,)').findall
This regexp consists of four alternatives, tried in this order:
any number of non-commas, followed by a substring enclosed between double quotes, followed by any number of non-commas;
at least one non-comma;
an empty substring following a comma;
an empty substring at the start of the string, and followed by a comma.
Some tests:
assert split_on_commas('10,20,"aaa, bbb",3,5') == ['10', '20', '"aaa, bbb"', '3', '5']
assert split_on_commas('10,,20,"aaa, bbb",3,5') == ['10', '', '20', '"aaa, bbb"', '3', '5']
assert split_on_commas('10,,,20,"aaa, bbb",3,5') == ['10', '', '', '20', '"aaa, bbb"', '3', '5']
assert split_on_commas(',10,20,"aaa, bbb",3,5') == ['', '10', '20', '"aaa, bbb"', '3', '5']
assert split_on_commas('10,20,"aaa, bbb",3,5,') == ['10', '20', '"aaa, bbb"', '3', '5', '']
assert split_on_commas('10,20,"aaa, bbb" ccc,3,5') == ['10', '20', '"aaa, bbb" ccc', '3', '5']
assert split_on_commas('10,20,ccc "aaa, bbb",3,5') == ['10', '20', 'ccc "aaa, bbb"', '3', '5']
assert split_on_commas('10,20,"aaa, bbb" "ccc",3,5,') == ['10', '20', '"aaa, bbb" "ccc"', '3', '5', '']
assert split_on_commas('10,20,"aaa, bbb" "ccc, ddd",3,5,') == ['10', '20', '"aaa, bbb" "ccc, ddd"', '3', '5', '']
assert split_on_commas('10,20,"aaa, "bbb",3,5') == ['10', '20', '"aaa, "bbb"', '3', '5']
assert split_on_commas('10,20,"",3,5') == ['10', '20', '""', '3', '5']
assert split_on_commas('10,20,",",3,5') == ['10', '20', '","', '3', '5']
assert split_on_commas(',,,') == ['', '', '', '']
assert split_on_commas('') == []
assert split_on_commas(',') == ['', '']
assert split_on_commas('","') == ['","']
assert split_on_commas('",') == ['"', '']
assert split_on_commas(',"') == ['', '"']
assert split_on_commas('"') == ['"']
Update: comparison with the csv module solution
Similar questions have been asked many times on SO, and each time the best / accepted answer was "Just use the csv module". Perhaps it's useful to point out some differences between the recommended solution and my re proposition. But first, devise a csv function with the same interface as split (not idiomatic, but consistent with the original requirement):
import csv
split_on_commas = lambda s: csv.reader([s]).next()
The first thing to be aware of is that csv.reader does more than a smart split. The external delimiters are suppressed:
assert split_on_commas('10,20,"aaa, bbb",3,5') == ['10', '20', 'aaa, bbb', '3', '5']
Which can lead to some strange behaviours:
assert split_on_commas('10,20,"aaa, bbb" ccc,3,5') == ['10', '20', 'aaa, bbb ccc', '3', '5']
assert split_on_commas('10,20,aaa", bbb ccc",3,5') == ['10', '20', 'aaa"', ' bbb ccc"', '3', '5']
I am sure this is not a problem with a generated CSV, since the offending double quotes would be escaped.
More shocking is the fact that this module (in Python 2) still does not support Unicode:
split_on_commas(u'10,20,"Juan, Chô",3,5')
---------------------------------------------------------------------------
UnicodeEncodeError Traceback (most recent call last)
<ipython-input-83-a0ef82b5fc26> in <module>()
----> 1 split_on_commas(u'10,20,"Juan, Chô",3,5')
<ipython-input-81-18a2b4070348> in <lambda>(s)
1 if __name__ == "__main__":
2 import csv
----> 3 split_on_commas = lambda s: csv.reader([s]).next()
4
5 assert split_on_commas('10,20,"aaa, bbb",3,5') == ['10', '20', 'aaa, bbb', '3', '5']
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf4' in position 15: ordinal not in range(128)
But there is of course a third difference: my solution has not been thoroughly tested, and is not guaranteed to work in the cases I didn't think of... Now, since this approach seems to have several real use cases (e.g., non-TSV files, non-ASCII input), I would be glad if some regex guru, far from dismissing it as dangerous, could help to find out its limitations and improve it.
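For readers on Python 3, here is a minimal sketch of the csv-based splitter. It assumes Python 3, where the built-in next() replaces the .next() method and csv.reader works on ordinary (Unicode) strings; split_with_csv is a hypothetical name chosen to avoid clashing with the regex version.

```python
import csv

def split_with_csv(line):
    """Split one line of comma-separated text using the csv module."""
    # csv.reader expects an iterable of lines, so wrap the single line in a list
    return next(csv.reader([line]))

print(split_with_csv('10,20,"Installations, machines",3,5'))
# ['10', '20', 'Installations, machines', '3', '5']
print(split_with_csv('10,20,"Juan, Chô",3,5'))
# ['10', '20', 'Juan, Chô', '3', '5']
```

As in the Python 2 version, the surrounding double quotes are removed from quoted fields.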
This is how I'd do it:
import re
data = "my string \"string is nice\" other string "
print(re.findall(r'(\w+|".*?")', data))
The output will be:
['my', 'string', '"string is nice"', 'other', 'string']
I don't think there's anything to explain here as the regex speaks for itself. Anyway, if you have any doubts I recommend regex101
\w+ - match any word character [a-zA-Z0-9_]
" - matches the characters " literally
.*? - matches any character (except newline), as few times as possible
If you also want to get rid of the square brackets, do this:
import re
string = "my string \"string is nice\" other string "
parsed_string = re.findall(r'(\w+|".*?")', string)
print(", ".join(parsed_string))
The output will be:
my, string, "string is nice", other, string
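Applied to the line from the question, the same pattern behaves as sketched below. Note the assumption built into it: it extracts tokens rather than splitting on commas, so an unquoted field containing a space is broken apart and empty fields disappear. The names token, fields, loose and gaps are only illustrative.

```python
import re

token = re.compile(r'\w+|".*?"')

line = '10,20,"Installations, machines",3,5'
fields = token.findall(line)
print(fields)  # ['10', '20', '"Installations, machines"', '3', '5']

# Caveat 1: an unquoted field with a space is split into separate tokens
loose = token.findall('a b,c')
print(loose)   # ['a', 'b', 'c']

# Caveat 2: empty fields between commas are silently dropped
gaps = token.findall('10,,20')
print(gaps)    # ['10', '20']
```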
As jonrsharpe and Alan Moore mentioned, Python's built-in csv module would be a much better solution.
As per their own example:
import csv
with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
Regular expressions will not work well here.
You can split by comma and then recombine...
Or use the csv module as suggested in the comments...
line = '10,20,"Installations, machines",3,5'
fields = line.strip().split(",")
result = []
tmpfield = ''
for checkfield in fields:
    # glue the piece onto any pending partial field
    tmpfield = checkfield if tmpfield=='' else tmpfield +','+ checkfield
    if tmpfield.strip().startswith('"'):
        # a quoted field is complete only once its closing quote appears
        if tmpfield.strip().endswith('"'):
            result.append(tmpfield)
            tmpfield = ''
    else:
        result.append(tmpfield)
        tmpfield = ''
if tmpfield != '':
    result.append(tmpfield)
print(result)
I would like to slice a set within a list, but every time I do so, I get an empty list in return.
What I am trying to accomplish (maybe there is an easier way):
I have a list of sets
each set has 5 items
I would like to compare a new set against the list (to check whether the set already exists in the list)
the first and the last item in the set are irrelevant for the comparison, so only positions 2-4 are used when searching for already existing sets
Here is my code:
result_set = ['1', '2', '3', '4', '5']
result_matrix = []
result_matrix.append(result_set)
Slicing the set is no problem:
print(result_set[1:4])
['2', '3', '4']
But slicing the set inside the list returns an empty list:
print(result_matrix[:][1:4])
[]
I would expect:
[['2', '3', '4']]
I think this is what you want to do:
>>> target_set = ['2', '3', '4']
>>> any([l for l in result_matrix if target_set == l[1:-1]])
True
>>> target_set = ['1', '2', '3']
>>> any([l for l in result_matrix if target_set == l[1:-1]])
False
Generalising and making that a function:
def is_set_in_matrix(target_set, matrix):
    return any(True for l in matrix if list(target_set) == l[1:-1])
>>> result_matrix = [['1', '2', '3', '4', '5']]
>>> is_set_in_matrix(['1', '2', '3'], result_matrix)
False
>>> is_set_in_matrix(['2', '3', '4'], result_matrix)
True
# a quirk - it also works with strings...
>>> s = '234'
>>> is_set_in_matrix(s, result_matrix)
True
Note that I have used l[1:-1] to ignore the first and last elements of the "set" in the comparison. This is more flexible should you ever need sets of different lengths.
>>> result_set = ['1', '2', '3', '4', '5']
>>> print(result_set[1:4])
['2', '3', '4']
>>> result_matrix.append(result_set[1:4])
>>> result_matrix
[['2', '3', '4']]
Using result_matrix[:] returns a copy of the whole matrix, so a second slice such as [1:4] is applied to the outer list rather than to the set inside it. You need to index the inner list first and then slice it.
>>> result_matrix = []
>>> result_matrix.append(result_set)
>>> result_matrix[:]
[['1', '2', '3', '4', '5']]
>>> result_matrix[:][0]
['1', '2', '3', '4', '5']
>>> result_matrix[0][1:4]
['2', '3', '4']
Also, as pointed out by falsetru:
>>> result_matrix = []
>>> result_matrix.extend(result_set)
>>> result_matrix
['1', '2', '3', '4', '5']
>>> result_matrix[1:4]
['2', '3', '4']
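To get the output the question expected, the slice has to be applied to each inner list rather than to the outer one. The following is a sketch under the question's setup (lists standing in for "sets"); middles, new_set and already_present are hypothetical names.

```python
result_set = ['1', '2', '3', '4', '5']
result_matrix = [result_set]

# Slicing the outer list: it has only one element, so [1:4] is empty
outer = result_matrix[:][1:4]
print(outer)    # []

# Slicing every inner list gives the expected result
middles = [row[1:4] for row in result_matrix]
print(middles)  # [['2', '3', '4']]

# Checking whether a new set's middle items already exist in the matrix
new_set = ['9', '2', '3', '4', '0']
already_present = new_set[1:4] in middles
print(already_present)  # True
```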