Parse a list into a list of lists in python

I am trying to figure out how to parse a list into a list of lists.
tileElements = browser.find_element(By.CLASS_NAME, 'tile-container')
tileHTML = (str(tileElements.get_attribute('innerHTML')))
tileNUMS = re.findall(r'\d+', tileHTML)
NumTiles = int(len(tileNUMS)/4)
#parse out list, each 4 list items are one tile
print(str(tileNUMS))
print(str(NumTiles))
TileList = [[i+j for i in range(len(tileNUMS))]for j in range (NumTiles)]
print(str(TileList))
The first part of this code works fine and gives me a list of tile numbers:
['2', '3', '1', '2', '2', '4', '4', '2']
However, what I need is a list of lists made out of this and that is where I am getting stuck.
The list of lists should be 4 elements long and look like this:
[['2', '3', '1', '2'] , ['2', '4', '4', '2']]
It should be able to do this for as many tiles as there are in the game (up to 19 I believe). It would be really nice if when the middle numbers are repeated that the two outside numbers are replaced with the latest value from the source list.

You can use a list comprehension to get slices from the list like so.
elements = ['2', '3', '1', '2', '2', '4', '4', '2']
size = 4
result = [elements[i:i+size] for i in range(0, len(elements), size)]
(By the way, there's no need to cast things into str to print them, and tileHTML is probably already a string, too.)
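The same slicing can be wrapped in a small helper so it works for any number of tiles. This is only a sketch: the name chunk is made up for illustration, and it assumes that a trailing group shorter than size should be kept rather than dropped.

```python
def chunk(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    # Step through the list `size` positions at a time and slice each group.
    return [items[i:i + size] for i in range(0, len(items), size)]


tile_nums = ['2', '3', '1', '2', '2', '4', '4', '2']
tiles = chunk(tile_nums, 4)
print(tiles)  # [['2', '3', '1', '2'], ['2', '4', '4', '2']]
```

Because the slice end is clamped to the list length, a list whose length is not a multiple of size simply ends with a shorter group.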

Related

How to sort a list of integers that are stored as string in python [duplicate]

This question already has answers here:
How to sort python list of strings of numbers
(4 answers)
Closed 2 years ago.
I tried to sort a list of strings that are actually integers, but I do not get the right order. How do I sort the list according to the integer value of each string?
a = ['10', '1', '3', '2', '5', '4']
print(sorted(a))
Output:
['1', '10', '2', '3', '4', '5']
Output wanted:
['1', '2', '3', '4', '5', '10']
Pass a key function to sorted() that converts each string to an int, so the comparison uses the numeric value rather than the text:
sorted(a,key=lambda i: int(i))
Output :
['1', '2', '3', '4', '5', '10']
A shorter way: sorted(a, key=int). Thanks to @Mark for commenting.
One way to approach this problem is to convert the list to a list of integers, sort it, and then convert the result back to a list of strings.
You could convert the strings to integers, sort them, and then convert back to strings. Example using list comprehensions:
sorted_a = [str(x) for x in sorted(int(y) for y in a)]
More verbose version:
int_a = [int(x) for x in a] # Convert all elements of a to ints
sorted_int_a = sorted(int_a) # Sort the int list
sorted_str_a = [str(x) for x in sorted_int_a] # Convert all elements of int list to str
print(sorted_str_a)
Note: @tedd's solution is the preferred one; I would definitely recommend it over this.
Whenever you have a list of elements and you want to sort by some property of the elements, use the key argument (see the docs).
Here's what it looks like:
>>> a = ['10', '1', '3', '2', '5', '4']
>>> print(sorted(a))
['1', '10', '2', '3', '4', '5']
>>> print(sorted(a, key=lambda el: int(el)))
['1', '2', '3', '4', '5', '10']
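As a short illustration of the key argument, the sketch below passes int directly as the key (equivalent to the lambda) and shows that list.sort() accepts the same argument. The variable name by_value is only illustrative.

```python
a = ['10', '1', '3', '2', '5', '4']

# sorted() returns a new list; the elements stay strings because
# int is only used to compute the comparison key.
by_value = sorted(a, key=int)
print(by_value)  # ['1', '2', '3', '4', '5', '10']

# list.sort() takes the same key argument and sorts in place.
a.sort(key=int, reverse=True)
print(a)  # ['10', '5', '4', '3', '2', '1']
```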

Use index of first and second repeated index in list

There are lots of similar posts out there, but I could not find something that directly matched, or resulted in a solution to, the issue I am dealing with.
I want to use the second instance of a repeated index contained in a list as the index of another list. When the function is executed, I want all numbers from the start of the list up to the first '*' to print after Code1, all numbers between the first '*' and the second '*' to print after Code2, and then all numbers following the second '*' until the end of the list to print after Code3. Example data for digit would be "['1', '2', '3', '4', '5', '*', '6', '*', '7', '8', '9', '10', '1']".
In other words, I want the code below to print , assuming those digits exist, User Code: 12345, Pass Code: 6, Pin Code: 789101, all in one line.
print_string += 'User Code: {} '.format(''.join(str(dig) for dig in digit[:digit.index('*')])) + \
'Pass Code: {} '.format(''.join(str(dig) for dig in digit[digit.index('*'):digit.index('*')])) + \
'Pin Code: {} '.format(''.join(str(dig) for dig in digit[digit.index('*'):]))
print(print_string)
Essentially, I would like to call the first asterisk as the right index for User Code, the first asterisk as the left index and the second asterisk as the right index for Pass Code, and the second asterisk as the left index for Pin Code.
I just cannot figure out how to make it look for sequential asterisks. If there is a simpler way to do this, please let me know!
Given,
L = ['1', '2', '3', '4', '5', '*', '6', '*', '7', '8', '9', '10', '1']
Then
str.join('', L)
will form a string
'12345*6*789101'
which you can split into the three parts
parts = str.join('', L).split('*')
and then pull out what you need
user_code = parts[0]
pass_code = parts[1]
pin = parts[2]
If you actually have all the digits in a list-like shape inside a string,
"['1', '2', '3', '4', '5', '*', '6', '*', '7', '8', '9', '10', '1']"
it might be worth just having them as a list, then you can use the join/split method above.
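If you would rather keep working with list slices, as in the question, list.index() accepts an optional start position, which is one way to find the second asterisk. The following is a minimal sketch that assumes the list contains at least two '*' entries (otherwise index() raises ValueError); the variable names first and second are illustrative.

```python
digit = ['1', '2', '3', '4', '5', '*', '6', '*', '7', '8', '9', '10', '1']

first = digit.index('*')              # position of the first asterisk
second = digit.index('*', first + 1)  # search again, starting just past the first

user_code = ''.join(digit[:first])            # everything before the first '*'
pass_code = ''.join(digit[first + 1:second])  # between the two asterisks
pin_code = ''.join(digit[second + 1:])        # everything after the second '*'

print('User Code: {} Pass Code: {} Pin Code: {}'.format(user_code, pass_code, pin_code))
# User Code: 12345 Pass Code: 6 Pin Code: 789101
```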

how to remove the first occurrence of an integer in a list

this is my code:
positions = []
for i in lines[2]:
    if i not in positions:
        positions.append(i)
print (positions)
print (lines[1])
print (lines[2])
the output is:
['1', '2', '3', '4', '5']
['is', 'the', 'time', 'this', 'ends']
['1', '2', '3', '4', '1', '5']
I want the variable "positions" to be ['2','3','4','1','5'],
so instead of dropping the second duplicate from "lines[2]" it should drop the first duplicate.
You can reverse your list, build the positions, and then reverse the result back, as mentioned by @tobias_k in the comments:
lst = ['1', '2', '3', '4', '1', '5']
positions = []
for i in reversed(lst):
    if i not in positions:
        positions.append(i)
list(reversed(positions))
# ['2', '3', '4', '1', '5']
You'll need to first detect which values are duplicated before you can build positions. Use a collections.Counter() object to count how many occurrences of each value are still to come:
from collections import Counter
counts = Counter(lines[2])
positions = []
for i in lines[2]:
    counts[i] -= 1
    if counts[i] == 0:
        # only add if this is the 'last' value
        positions.append(i)
This'll work for any number of repetitions of values; only the last value to appear is ever used.
You could also reverse the list, and track what you have already seen with a set, which is faster than testing against the list:
positions = []
seen = set()
for i in reversed(lines[2]):
    if i not in seen:
        # only add if this is the first time we see the value
        positions.append(i)
        seen.add(i)
positions = positions[::-1] # reverse the output list
Both approaches require two iterations: in the first approach, one to create the counts mapping; in the second, one to reverse the output list. Which is faster will depend on the size of lines[2] and the number of duplicates in it, and whether or not you are using Python 3 (where Counter performance was significantly improved).
You can use a dictionary to save the last position of each element and then build a new list from that information:
>>> data=['1', '2', '3', '4', '1', '5']
>>> temp={ e:i for i,e in enumerate(data) }
>>> sorted(temp, key=lambda x:temp[x])
['2', '3', '4', '1', '5']
>>>
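A compact variant of the reverse-and-deduplicate idea is sketched below. It assumes Python 3.7 or later, where a plain dict keeps insertion order; on older versions collections.OrderedDict.fromkeys would be needed instead.

```python
data = ['1', '2', '3', '4', '1', '5']

# Walk the list backwards so the last occurrence of each value is seen first,
# let dict.fromkeys drop the later repeats, then restore the original direction.
positions = list(dict.fromkeys(reversed(data)))[::-1]
print(positions)  # ['2', '3', '4', '1', '5']
```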

how to combine list of tuples based on common value into a list

I have a long list of tuples, could look like this for example:
[('5','9'), ('10','11'), ('1','2'), ('1','3'), ('1','4'), ('2','7'), ('3','8'), ('2','1'), ('3','1'), ('3','4'), ('5','6'), ('5','10'), ('10','12'), ('11','13'), ('13','14')]
I need to combine them into lists if they share anything in common. So the output in the example would be:
['11', '10', '13', '12', '14', '5', '6', '9']
['1', '3', '2', '4', '7', '8']
To clarify: the input is a list of tuples. I need to combine all tuples with shared element into one list. So if I will have: [('1','2'), ('1','3'), ('1','4'), ('4','5')], all the elements should all be put into one list ['1', '2', '3', '4', '5'], because they are linked through the tuples.
I tried to come up with something going through dictionaries but failed miserably. I am sure there is some "easier" solution.
thank you
Martin
Looks like you are doing union-find operations. https://en.wikipedia.org/wiki/Disjoint-set_data_structure
Start with creating singleton disjoint sets, each having a number occurring in your list. Next, for each tuple, union the sets corresponding to the numbers in the tuple.
The above solution will have running time nearly linear in the number of tuples. See A set union find algorithm for possible union-find implementations.
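The union-find steps described above can be sketched as follows. This is a minimal illustration, not a tuned implementation: the function name connected_groups is made up, and it uses path halving only, without union by rank or size, so it does not carry the full near-linear guarantee.

```python
def connected_groups(pairs):
    """Group elements that are linked, directly or indirectly, by the pairs."""
    parent = {}

    def find(x):
        # Register unseen elements as their own root (a singleton set).
        parent.setdefault(x, x)
        # Walk up to the root, shortening the path as we go (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # union the two sets

    # Collect every element under the root of its set.
    groups = {}
    for x in parent:
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())


pairs = [('5', '9'), ('10', '11'), ('1', '2'), ('1', '3'), ('1', '4'),
         ('2', '7'), ('3', '8'), ('2', '1'), ('3', '1'), ('3', '4'),
         ('5', '6'), ('5', '10'), ('10', '12'), ('11', '13'), ('13', '14')]
groups = connected_groups(pairs)
for group in groups:
    print(sorted(group, key=int))
```

Each group is returned as a set, so the order of elements inside a group carries no meaning; the example sorts them numerically only for display.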
We can frame this problem as that of finding connected components in an undirected graph. All distinct numbers that appear in the list can be treated as the nodes (vertices) of the graph, and the pairs as its edges.
Here is a simple algorithm with inline comments:
l = [('5','9'), ('10','11'), ('1','2'), ('1','3'), ('1','4'), ('2','7'), ('3','8'), ('2','1'), ('3','1'), ('3','4'), ('5','6'), ('5','10'), ('10','12'), ('11','13'), ('13','14')]
# get all unique elements ("nodes") of `l'
nodes = set().union(*map(set, l))
# start with each node in its own connected component
comp = {node:{node} for node in nodes}
# now repeatedly merge pairs of components connected by edges in `l'
while True:
    merged = False
    new_l = [] # will drop edges that have already been used in a merge
    for n1, n2 in l:
        if comp[n1] is not comp[n2]:
            # the two connected components are not the same, so merge them
            new_comp = comp[n1] | comp[n2]
            for n in new_comp:
                comp[n] = new_comp
            merged = True
        else:
            # keep the pair for the next iteration
            new_l.append((n1, n2))
    if not merged:
        # all done
        break
    l = new_l
# now print all distinct connected components
for c in set(map(frozenset, comp.values())):
    print(list(c))
This prints out (the order of the elements and of the components may vary, since sets are unordered):
['1', '3', '2', '4', '7', '8']
['11', '10', '13', '12', '14', '5', '6', '9']

Randomly extract x items from a list using python

Starting with two lists such as:
lstOne = [ '1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
lstTwo = [ '1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
I want to have the user input how many items they want to extract, as a percentage of the overall list length, and the same indices from each list to be randomly extracted. For example say I wanted 50% the output would be
newLstOne = ['8', '1', '3', '7', '5']
newLstTwo = ['8', '1', '3', '7', '5']
I have achieved this using the following code:
from random import randrange
lstOne = [ '1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
lstTwo = [ '1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
LengthOfList = len(lstOne)
print(LengthOfList)
PercentageToUse = input("What Percentage Of Reads Do you want to extract? ")
RangeOfListIndices = []
HowManyIndicesToMake = (float(PercentageToUse)/100)*float(LengthOfList)
print(HowManyIndicesToMake)
for x in lstOne:
    if len(RangeOfListIndices)==int(HowManyIndicesToMake):
        break
    else:
        random_index = randrange(0,LengthOfList)
        RangeOfListIndices.append(random_index)
print(RangeOfListIndices)
newlstOne = []
newlstTwo = []
for x in RangeOfListIndices:
    newlstOne.append(lstOne[int(x)])
for x in RangeOfListIndices:
    newlstTwo.append(lstTwo[int(x)])
print(newlstOne)
print(newlstTwo)
But I was wondering if there was a more efficient way of doing this, in my actual use case this is subsampling from 145,000 items. Furthermore, is randrange sufficiently free of bias at this scale?
Thank you
Q. I want to have the user input how many items they want to extract, as a percentage of the overall list length, and the same indices from each list to be randomly extracted.
A. The most straight-forward approach directly matches your specification:
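import random
# Python 3 spelling; on Python 2 use raw_input() and xrange() instead
percentage = float(input('What percentage? '))
k = int(len(list1) * percentage // 100) # sample size must be an int
indices = random.sample(range(len(list1)), k) # k distinct positions
new_list1 = [list1[i] for i in indices]
new_list2 = [list2[i] for i in indices]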
Q. in my actual use case this is subsampling from 145,000 items. Furthermore, is randrange sufficiently free of bias at this scale?
A. In Python 2 and Python 3, the random.randrange() function completely eliminates bias (it uses the internal _randbelow() method that makes multiple random choices until a bias-free result is found).
In Python 2, the random.sample() function is slightly biased but only in the round-off in the last of 53 bits. In Python 3, the random.sample() function uses the internal _randbelow() method and is bias-free.
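For reference, the index-sampling approach can be packaged as a small Python 3 function. This is a sketch under stated assumptions: the name sample_parallel and the seed parameter are made up for illustration (seeding is only there to make a run repeatable), and the sample size is rounded down.

```python
import random


def sample_parallel(list_a, list_b, percentage, seed=None):
    """Pick the same random positions from two equal-length lists."""
    if len(list_a) != len(list_b):
        raise ValueError("lists must have the same length")
    k = int(len(list_a) * percentage // 100)     # number of items to keep
    rng = random.Random(seed)                    # seed only for reproducible runs
    indices = rng.sample(range(len(list_a)), k)  # k distinct positions
    return [list_a[i] for i in indices], [list_b[i] for i in indices]


lst_one = [str(n) for n in range(1, 11)]
lst_two = [str(n) for n in range(1, 11)]
new_one, new_two = sample_parallel(lst_one, lst_two, 50, seed=0)
print(new_one)
print(new_two)
```

Because random.sample draws without replacement, no position is picked twice, unlike repeated calls to randrange.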
Just zip your two lists together, use random.sample to do your sampling, then zip again to transpose back into two lists.
import random
# list() is needed on Python 3, where zip() returns an iterator rather than a sequence
_zips = random.sample(list(zip(lstOne, lstTwo)), 5)
new_list_1, new_list_2 = zip(*_zips)
demo:
list_1 = range(1,11)
list_2 = list('abcdefghij')
_zips = random.sample(list(zip(list_1, list_2)), 5)
new_list_1, new_list_2 = zip(*_zips)
new_list_1
Out[33]: (3, 1, 9, 8, 10)
new_list_2
Out[34]: ('c', 'a', 'i', 'h', 'j')
The way you are doing it looks mostly okay to me.
If you want to avoid sampling the same object several times, you could proceed as follows:
a = len(lstOne)
choose_from = list(range(a)) # list of the indices 0 .. len(lstOne) - 1
random.shuffle(choose_from)
n_wanted = a // 2 # number of items to keep (50% here, as in the question's example)
for i in choose_from[:n_wanted]: # take the first n_wanted shuffled indices
    newlstOne.append(lstOne[i]) # same random locations in both original lists,
    newlstTwo.append(lstTwo[i]) # appended to the two new lists in sequence
