Naming lists dynamically in Python

As the title says I'm trying to name lists dynamically in Python. The purpose of the code is to create lists around consecutive letters. Here is my code:
consecutive_duplicates=["a","a","a","a","b","c","c","a","a","d","e","e","e","e","X"]
count=0
name=0
for i in consecutive_duplicates:
    if (consecutive_duplicates[count]==consecutive_duplicates[count+1] or
            consecutive_duplicates[count]==consecutive_duplicates[count-1]):
        consecutive_duplicates[name].append(i)
        count=count+1
    else:
        consecutive_duplicates[name+1].append(i)
        name=name+1
I'm at a loss as to how to name the lists. Obviously this doesn't work as it is. What would make it work?
I'm also having trouble defining the dynamic lists. How should I do that?

You can use itertools.groupby:
>>> from itertools import groupby
>>> lis = ["a","a","a","a","b","c","c","a","a","d","e","e","e","e","X"]
>>> [list(g) for k,g in groupby(lis)]
[['a', 'a', 'a', 'a'], ['b'], ['c', 'c'], ['a', 'a'], ['d'], ['e', 'e', 'e', 'e'], ['X']]
And instead of creating dynamic variables it's better to use a dict:
>>> dic = { 'lis'+str(i): list(g) for i,(k,g) in enumerate(groupby(lis), 1)}
>>> dic['lis1']
['a', 'a', 'a', 'a']
>>> dic['lis2']
['b']
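If you would rather see the grouping spelled out as a plain loop, closer to what the question was attempting, here is a minimal sketch. The function name group_consecutive is made up for illustration; it is meant to produce the same groups as the groupby call above.

```python
def group_consecutive(items):
    """Collect runs of equal, adjacent items into sublists (hypothetical helper)."""
    groups = []
    for item in items:
        # Extend the current run if this item equals the last item seen.
        if groups and groups[-1][-1] == item:
            groups[-1].append(item)
        else:
            # Otherwise start a new run.
            groups.append([item])
    return groups


lis = ["a", "a", "a", "a", "b", "c", "c", "a", "a", "d", "e", "e", "e", "e", "X"]
groups = group_consecutive(lis)
print(groups)
```

The groups live in one list and are addressed by position (groups[0], groups[1], ...), so no dynamically named variables are needed.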

Related

replace duplicate values in a list with 'x'?

I am trying to understand the process of creating a function that can replace duplicate strings in a list of strings. For example, I want to convert this list
mylist = ['a', 'b', 'b', 'a', 'c', 'a']
to this
mylist = ['a', 'b', 'x', 'x', 'c', 'x']
Initially, I know I need to create my function and iterate through the list:
def replace(foo):
    newlist= []
    for i in foo:
        if foo[i] == foo[i+1]:
            foo[i].replace('x')
    return foo
However, I know there are two problems with this. The first is that I get an error stating
list indices must be integers or slices, not str
so I believe I should instead be operating on the range of this list, but I'm not sure how to implement it. The other is that this would only help me if the duplicate letter comes directly after the current element (i).
Unfortunately, that's as far as my understanding of the problem reaches. If anyone can provide some clarification on this procedure for me, I would be very grateful.
Go through the list, and keep track of what you've seen in a set. Replace things you've seen before in the list with 'x':
mylist = ['a', 'b', 'b', 'a', 'c', 'a']
seen = set()
for i, e in enumerate(mylist):
    if e in seen:
        mylist[i] = 'x'
    else:
        seen.add(e)
print(mylist)
# ['a', 'b', 'x', 'x', 'c', 'x']
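Since the question asks for a function, the same set-based idea can be wrapped up so that it returns a new list and leaves the input untouched. This is only a sketch; the name replace_duplicates and the placeholder parameter are assumptions made for illustration.

```python
def replace_duplicates(items, placeholder='x'):
    """Return a copy of items with repeated values replaced by placeholder."""
    seen = set()      # values encountered so far
    result = []
    for item in items:
        if item in seen:
            result.append(placeholder)
        else:
            seen.add(item)
            result.append(item)
    return result


original = ['a', 'b', 'b', 'a', 'c', 'a']
replaced = replace_duplicates(original)
print(replaced)
```

Tracking seen values in a set keeps each membership check cheap, and it avoids confusing a placeholder already written to the output with a value from the input.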
Simple Solution.
my_list = ['a', 'b', 'b', 'a', 'c', 'a']
new_list = []
for i in range(len(my_list)):
    if my_list[i] in new_list:
        new_list.append('x')
    else:
        new_list.append(my_list[i])
print(my_list)
print(new_list)
# output
#['a', 'b', 'b', 'a', 'c', 'a']
#['a', 'b', 'x', 'x', 'c', 'x']
The other solutions use indexing, which isn't strictly required.
Put simply, check whether each value is already in the new list: if it is, append 'x'; otherwise append the value itself. If you want to use a function:
old = ['a', 'b', 'b', 'a', 'c']
def replace_dupes_with_x(l):
    tmp = list()
    for char in l:
        if char in tmp:
            tmp.append('x')
        else:
            tmp.append(char)
    return tmp
new = replace_dupes_with_x(old)
You can use the following solution:
from collections import defaultdict
mylist = ['a', 'b', 'b', 'a', 'c', 'a']
ret, appear = [], defaultdict(int)
for c in mylist:
    appear[c] += 1
    ret.append(c if appear[c] == 1 else 'x')
Which will give you:
['a', 'b', 'x', 'x', 'c', 'x']

Removing duplicates (not by using set)

My data look like this:
let = ['a', 'b', 'a', 'c', 'a']
How do I remove the duplicates? I want my output to be something like this:
['b', 'c']
When I use the set function, I get:
set(['a', 'c', 'b'])
This is not what I want.
One option would be (as derived from Ritesh Kumar's answer here)
let = ['a', 'b', 'a', 'c', 'a']
onlySingles = [x for x in let if let.count(x) < 2]
which gives
>>> onlySingles
['b', 'c']
Try this,
>>> let
['a', 'b', 'a', 'c', 'a']
>>> dict.fromkeys(let).keys()
['a', 'c', 'b']
>>>
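For reference, here is a sketch of the same idea on Python 3.7 or later, where dicts preserve insertion order (the output above comes from an older interpreter). The variable name unique_in_order is made up for illustration.

```python
let = ['a', 'b', 'a', 'c', 'a']

# dict.fromkeys keeps the first occurrence of each key, in input order.
unique_in_order = list(dict.fromkeys(let))
print(unique_in_order)
```

Note that this keeps one copy of every value, so it yields three items rather than the ['b', 'c'] the question asks for.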
Sort the input, then removing duplicates becomes trivial:
data = ['a', 'b', 'a', 'c', 'a']
def uniq(data):
    last = None
    result = []
    for item in data:
        if item != last:
            result.append(item)
            last = item
    return result
print(uniq(sorted(data)))
# prints ['a', 'b', 'c']
This is basically the shell's cat data | sort | uniq idiom.
The cost is O(N * log N), same as with a tree-based set.
Instead of sorting, or linearly re-scanning the main list to count each item's occurrences, count the occurrences once and then filter on the items that appear exactly once:
>>> from collections import Counter
>>> let = ['a', 'b', 'a', 'c', 'a']
>>> [k for k, v in Counter(let).items() if v == 1]
['c', 'b']
You have to look at the sequence at least once regardless, although it makes sense to limit the number of times you do so.
If you really want to avoid any type of set or other hashed container (perhaps because you can't use them?), then yes, you can sort it, then use:
>>> from itertools import groupby, islice
>>> [k for k,v in groupby(sorted(let)) if len(list(islice(v, 2))) == 1]
['b', 'c']
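If the original order of the surviving items matters, one possible variation on the Counter approach is to count once and then filter the original list instead of the counter's keys. This is a sketch; the variable names are made up for illustration.

```python
from collections import Counter

let = ['a', 'b', 'a', 'c', 'a']

counts = Counter(let)  # single pass to count every item
# Second pass over the input keeps the items in their original order.
only_singles = [x for x in let if counts[x] == 1]
print(only_singles)
```

This makes two linear passes, instead of calling let.count(x) once per element.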

Python: how to separate a list into several lists based on empty strings?

I'm working on a list like this, a = ['a','b','','','c','d']; the real list includes thousands of data entries. Is there a neat way to turn a into [['a','b'],['c','d']], given that the data is really huge?
You can use itertools.groupby for this. You basically group by consecutive empty strings, or consecutive non-empty strings. Then keep all groups that were grouped by True from the lambda in a list comprehension.
>>> from itertools import groupby
>>> [list(i[1]) for i in groupby(a, lambda i: i != '') if i[0]]
[['a', 'b'], ['c', 'd']]
For another example
>>> b = ['a','b','','','c','d', '', 'e', 'f', 'g', '', '', 'h']
>>> [list(i[1]) for i in groupby(b, lambda i: i != '') if i[0]]
[['a', 'b'], ['c', 'd'], ['e', 'f', 'g'], ['h']]
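Because the real list is described as huge, the same split can also be written as a generator that yields one chunk at a time. This is a minimal sketch without itertools; the name split_on_empty is made up for illustration.

```python
def split_on_empty(items):
    """Yield lists of consecutive non-empty strings, split at empty strings."""
    chunk = []
    for item in items:
        if item != '':
            chunk.append(item)
        elif chunk:
            # An empty string closes the current chunk, if there is one.
            yield chunk
            chunk = []
    if chunk:
        # Emit the trailing chunk when the input does not end with ''.
        yield chunk


a = ['a', 'b', '', '', 'c', 'd']
chunks = list(split_on_empty(a))
print(chunks)
```

Iterating over split_on_empty(a) directly, rather than wrapping it in list(), avoids holding all the chunks in memory at once.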

python .count for multidimensional arrays (list of lists)

How would I count the number of occurrences of some value in a multidimensional array made with nested lists? as in, when looking for 'foobar' in the following list:
list = [['foobar', 'a', 'b'], ['x', 'c'], ['y', 'd', 'e', 'foobar'], ['z', 'f']]
it should return 2.
(yes I am aware that I could write a loop that just searches through all of it, but I dislike that solution as it is rather time-consuming, (to write and during runtime))
.count maybe?
>>> list = [['foobar', 'a', 'b'], ['x', 'c'], ['y', 'd', 'e', 'foobar'], ['z', 'f']]
>>> sum(x.count('foobar') for x in list)
2
First join the lists together using itertools, then count each occurrence using the collections module:
import itertools
from collections import Counter
some_list = [['foobar', 'a', 'b'], ['x', 'c'], ['y', 'd', 'e', 'foobar'], ['z', 'f']]
totals = Counter(itertools.chain.from_iterable(some_list))
print(totals["foobar"])
>>> from collections import Counter
>>> my_list = [['foobar', 'a', 'b'], ['x', 'c'], ['y', 'd', 'e', 'foobar'], ['z', 'f']]
>>> counted = Counter(item for sublist in my_list for item in sublist)
>>> counted.get('foobar', 'not found!')
2
>>> counted.get('missing', 'not found!')  # value absent from the counter
'not found!'
This flattens the sublists and then uses Counter from the collections module to count each item.
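The answers above assume exactly two levels of nesting. If the lists can be nested to an arbitrary depth, a recursive count is one option. This is a sketch that assumes only list objects act as containers; the name count_nested is made up for illustration.

```python
def count_nested(data, target):
    """Count occurrences of target in arbitrarily nested lists."""
    total = 0
    for item in data:
        if isinstance(item, list):
            # Recurse into sublists and add their counts.
            total += count_nested(item, target)
        elif item == target:
            total += 1
    return total


nested = [['foobar', 'a', 'b'], ['x', 'c'], ['y', 'd', 'e', 'foobar'], ['z', 'f']]
result = count_nested(nested, 'foobar')
print(result)
```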

Python: filtering lists by indices

In Python I have a list of elements aList and a list of indices myIndices. Is there any way I can retrieve all at once those items in aList having as indices the values in myIndices?
Example:
>>> aList = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> myIndices = [0, 3, 4]
>>> aList.A_FUNCTION(myIndices)
['a', 'd', 'e']
I don't know any method to do it. But you could use a list comprehension:
>>> [aList[i] for i in myIndices]
Definitely use a list comprehension, but here is a function that does it (list has no method for this). This is arguably a misuse of itemgetter; I am posting it just for the sake of knowledge. Note that it returns a tuple rather than a list.
>>> from operator import itemgetter
>>> a_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
>>> my_indices = [0, 3, 4]
>>> itemgetter(*my_indices)(a_list)
('a', 'd', 'e')
Indexing by lists can be done in numpy. Convert your base list to a numpy array and then apply another list as an index:
>>> from numpy import array
>>> array(aList)[myIndices]
array(['a', 'd', 'e'],
dtype='|S1')
If you need, convert back to a list at the end:
>>> from numpy import array
>>> a = array(aList)[myIndices]
>>> a.tolist()
['a', 'd', 'e']
In some cases this solution can be more convenient than list comprehension.
You could use map
list(map(aList.__getitem__, myIndices))
or operator.itemgetter
import operator
f = operator.itemgetter(*myIndices)
f(aList)
If you do not require a list with simultaneous access to all elements, but just wish to use all the items in the sub-list iteratively (or pass them to something that will), it's more efficient to use a generator expression rather than a list comprehension:
(aList[i] for i in myIndices)
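As a small illustration of passing such a generator straight to a consumer, here is a sketch that joins the selected items into one string. The variable names are made up for illustration.

```python
a_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
my_indices = [0, 3, 4]

# The generator produces items on demand; no intermediate list is built.
selected = (a_list[i] for i in my_indices)
joined = "".join(selected)
print(joined)
```

Keep in mind that a generator can be consumed only once; build a list instead if the selected items are needed again.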
Alternatively, you could go with a functional approach using map and a lambda function.
>>> list(map(lambda i: aList[i], myIndices))
['a', 'd', 'e']
I wasn't happy with these solutions, so I created a Flexlist class that simply extends the list class, and allows for flexible indexing by integer, slice or index-list:
class Flexlist(list):
    def __getitem__(self, keys):
        if isinstance(keys, (int, slice)): return list.__getitem__(self, keys)
        return [self[k] for k in keys]
Then, for your example, you could use it with:
aList = Flexlist(['a', 'b', 'c', 'd', 'e', 'f', 'g'])
myIndices = [0, 3, 4]
vals = aList[myIndices]
print(vals) # ['a', 'd', 'e']
