how to extract values from python sublists - python

data_sets = [
['O'],
['X'],
# These data sets put Sheet A in all possible locations and orientations
# Data sets 2 - 9
['O', ['Sheet A', 'Location 1', 'Upright']],
['O', ['Sheet A', 'Location 2', 'Upright']],
['O', ['Sheet A', 'Location 3', 'Upright']],
['O', ['Sheet A', 'Location 4', 'Upright']],
['O', ['Sheet A', 'Location 1', 'Upside down']],
['O', ['Sheet A', 'Location 2', 'Upside down']],
['O', ['Sheet A', 'Location 3', 'Upside down']],
['O', ['Sheet A', 'Location 4', 'Upside down']]
]
for each in data_sets:
    if 'Sheet A' in each:
        print('1')
When I run this, it doesn't print anything, because I don't think it's going through all the sublists. How can I get this to work?

You can use itertools.chain.from_iterable:
import itertools
for each in data_sets:
    # flattens each data set by one level before testing membership
    if "Sheet A" in itertools.chain.from_iterable(each):
        print("1")
1
1
1
1
1
1
1
1

The in operator is not recursive. It looks for the item among the elements of the list itself. If an element is itself a list, in won't descend into that list to look for the string.
In your case, you could:
- check that the list has at least 2 items
- perform the in test on the second item
like this:
for each in data_sets:
    if len(each) > 1 and 'Sheet A' in each[1]:
        print('1')
Of course, if the structure is more complex or not fixed, you have to use a recursive approach that tests each item's type, like this: Python nested list recursion search
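A minimal sketch of such a recursive search, assuming the data consists only of strings and (possibly nested) lists. The function name contains and the shortened data_sets below are made up for illustration:

```python
def contains(nested, target):
    """Return True if target appears anywhere in a list nested to any depth."""
    for item in nested:
        if isinstance(item, list):
            # Test the item type: descend into sublists.
            if contains(item, target):
                return True
        elif item == target:
            return True
    return False


# Shortened sample data; the third entry is nested one level deeper.
data_sets = [
    ['O'],
    ['O', ['Sheet A', 'Location 1', 'Upright']],
    ['X', [['Sheet A', 'Location 2', 'Upside down']]],
]

hits = [contains(each, 'Sheet A') for each in data_sets]
print(hits)  # [False, True, True]
```

Unlike the len(each) > 1 check, this does not depend on the sublist sitting at index 1.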

def listChecker(list_elems):
    for list_elem in list_elems:
        if "Sheet A" in list_elem:
            print("1")
        # recurse only when this element holds further lists
        if any(isinstance(elem, list) for elem in list_elem):
            listChecker(list_elem)
listChecker(data_sets)
You can also use this function. It prints 1 for each nested list that contains "Sheet A". Just pass your list object to it.

You can also check it with count:
for each in data_sets:
    if len(each) > 1 and each[1].count("Sheet A"):
        print('1')
len(each) > 1 checks that the list has more than one item.
each[1] is the second item of the list (the sublist), and .count("Sheet A") returns the number of times "Sheet A" occurs in it.

Related

Testing if strip in column was successful with polars

I have developed a function to strip a dataset using polars. Now I want to check with a test whether the strip was successful, using the following logic. But this code uses pandas. How can I do the same with polars?
def test_strip():
    df = pd.DataFrame({
        'ID': [1, 1, 1, 1, 1],
        'Entity': ['Entity 1 ', 'Entity 2', 'Entity 3', 'Entity 4', 'Entity 5'],
        'Table': ['Table 1', ' Table 2', 'Table 3', 'Table 4', None],
        'Local': ['Local 1', 'Local 2 ', None, 'Local 4', 'Local 5'],
        'Global': ['Global 1', ' Global 2', 'Global 3', None, ' Global 5'],
        'mandatory': ['M', 'M', 'M', 'CM ', 'M']
    })
    job = first_job(
        config=test_config,
        copying_list=copying,
    )
    result = job.run(df)
    df_clean, *_ = result
    for column in df_clean.columns:
        for value in df_clean[column]:
            if isinstance(value, str) and (value.startswith(" ") or value.endswith(" ")):
                raise AssertionError(f"Strip failed for column '{column}'")
This should do it...
def test_strip(df):
    bad_rows = df.filter(
        pl.any([pl.col(x).str.contains("(^ )|( $)") for x in df.columns])
    )
    if bad_rows.shape[0] == 0:
        return "all good"
    else:
        str_cols = ', '.join(
            bad_rows.melt()
            .filter(pl.col('value').str.contains("(^ )|( $)"))
            .get_column('variable')
            .unique()
            .to_list()
        )
        raise AssertionError(f"Strip failed for column(s): {str_cols}")
The meat and potatoes is the bad_rows assignment. It uses a list comprehension to build one regex test per column; the regex matches a space right after the beginning-of-string anchor (^ ) or right before the end-of-string anchor ( $). That list is wrapped in pl.any so that a match in any column selects the row. If bad_rows has 0 rows, everything worked and the function returns a message saying so. Otherwise it raises the error and tells you which columns were bad.
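The regex itself can be tried outside polars. Here is a minimal sketch with Python's re module, assuming the same pattern string; has_edge_space and the sample values are illustrative names, not part of the answer. For a pattern this simple the two regex engines should behave the same, but that is an assumption here.

```python
import re

# Same pattern as in the answer: a space right after the start anchor,
# or a space right before the end anchor.
PATTERN = re.compile(r"(^ )|( $)")


def has_edge_space(value):
    """Return True if value is a string with a leading or trailing space."""
    return isinstance(value, str) and PATTERN.search(value) is not None


samples = ["Entity 1 ", " Table 2", "Local 4", None]
flags = [has_edge_space(s) for s in samples]
print(flags)  # [True, True, False, False]
```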

How can I return the entire dictionary?

This is my method. I am having trouble with returning the entire dictionary
def get_col(amount):
    letter = 0
    value = []
    values = {}
    for i in range(amount):
        letter = get_column_letter(i + 1)
        [value.append(row.value) for row in ws[letter]]
        values = dict(zip(letter, [value]))
        value = []
    return values
I want it to output it like this:
{'A': ['ID', 'value is 1', 'value is 2', 'value is 3', 'value is 4', 'value is 5', 'value is 6']}
{'B': ['Name', 'value is 1', 'value is 2', 'value is 3', 'value is 4', 'value is 5', 'value is 6']}
{'C': ['Math', 'value is 1', 'value is 2', 'value is 3', 'value is 4', 'value is 5', 'value is 6']}
But when the return is inside the 'for' loop, it only returns
{'A': ['ID', 'value is 1', 'value is 2', 'value is 3', 'value is 4', 'value is 5', 'value is 6']}
and when the return is outside the 'for' loop, it returns
{'C': ['Math', 'value is 1', 'value is 2', 'value is 3', 'value is 4', 'value is 5', 'value is 6']}
Any help would be appreciated. Thank you!
I am assuming you want all of the data in one dictionary:
values = dict(zip(letter, [value]))
Currently this part of your code overwrites the dictionary on every iteration. That is why you get only the "A" dict when you return before the for loop finishes, and why you get only the "C" dict when you return after the loop finishes: the "A" and "B" dicts were overwritten.
Put the return after the for loop, and instead of
values = dict(zip(letter, [value]))
use
values[letter] = value
as this adds a new key/value pair to the dict on each iteration.
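Put together, the change might look like the sketch below. So that it runs on its own, it uses a plain dict named columns in place of the openpyxl worksheet ws, and a hypothetical column_letter helper in place of get_column_letter (only valid for columns 1 to 26). Both are stand-ins, not the real API.

```python
# Stand-in data: a dict mapping column letters to cell values, used here
# instead of an openpyxl worksheet so the sketch is self-contained.
columns = {
    "A": ["ID", "value is 1", "value is 2"],
    "B": ["Name", "value is 1", "value is 2"],
    "C": ["Math", "value is 1", "value is 2"],
}


def column_letter(index):
    """Hypothetical stand-in for get_column_letter; valid for 1..26 only."""
    return chr(ord("A") + index - 1)


def get_col(amount):
    values = {}
    for i in range(amount):
        letter = column_letter(i + 1)
        # Add one key per column instead of rebinding the whole dict.
        values[letter] = list(columns[letter])
    return values  # after the loop, so every column is included


result = get_col(3)
print(result)
```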
P.S. This is my first post; I hope it helps and is understandable.
Edit: If you want a list of three dictionaries, as your desired output shows, do this:
def get_col(amount):
    letter = 0
    value = []
    values = []
    for i in range(amount):
        letter = get_column_letter(i + 1)
        [value.append(row.value) for row in ws[letter]]
        values.append(dict(zip(letter, [value])))
        value = []
    return values
Your desired output is not a single dictionary. It's a list of dictionaries.
In the for loop, you create a new dictionary at each iteration. When you return, you get either the first one or the last one, depending on whether the return is inside or outside the loop, respectively.
You need to return a list of the created dictionaries:
def get_col(amount):
    letter = 0
    value = []
    values = {}
    values_list = []
    for i in range(amount):
        letter = get_column_letter(i + 1)
        [value.append(row.value) for row in ws[letter]]
        values = dict(zip(letter, [value]))
        value = []
        values_list.append(values)
    return values_list

Adding values from one list to another when they share value

I'm trying to add values from List2 if the type is the same in List1. All the data is strings within lists. This isn't the exact data I'm using, just a representation. This is my first programme so please excuse any misunderstandings.
List1 = [['Type A =', 'Value 1', 'Value 2', 'Value 3'], ['Type B =', 'Value 4', 'Value 5']]
List2 = [['Type Z =', 'Value 6', 'Value 7', 'Value 8'], ['Type A =', 'Value 9', 'Value 10', 'Value 11'], ['Type A =', 'Value 12', 'Value 13']]
Desired result:
new_list =[['Type A =', 'Value 1', 'Value 2', 'Value 3', 'Value 9', 'Value 10', 'Value 11', 'Value 12', 'Value 13'], ['Type B =', 'Value 4', 'Value 5']]
Current attempt:
newlist = []
for values in List1:
    for valuestoadd in List2:
        if values[0] == valuestoadd[0]:
            newlist = [List1 + [valuestoadd[1:]]]
        else:
            print("Types don't match")
return newlist
This would work for me if there weren't two Type A's in List2, as this causes my code to create two instances of List1. If I were able to add the values at a specific index of the list, that would be great, but I can work around that.
It's probably easier to use a dictionary for this:
def merge(d1, d2):
    return {k: v + d2[k] if k in d2 else v for k, v in d1.items()}
d1 = {'A': [1, 2, 3], 'B': [4, 5, 6]}
d2 = {'A': [7, 8, 9], 'C': [0]}
print(merge(d1, d2))
If you must use a list, it's fairly easy to temporarily convert to a dictionary and back to a list:
from collections import defaultdict
def list_to_dict(xss):
    d = defaultdict(list)
    for xs in xss:
        d[xs[0]].extend(xs[1:])
    return d
def dict_to_list(d):
    return [[k, *v] for k, v in d.items()]
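As a rough illustration, the three functions can be chained on the lists from the question like this. The pipeline order (convert both lists, merge, convert back) is my reading of the answer, not something it spells out:

```python
from collections import defaultdict


def merge(d1, d2):
    # Keep only the keys of d1, extending their values with d2's when present.
    return {k: v + d2[k] if k in d2 else v for k, v in d1.items()}


def list_to_dict(xss):
    d = defaultdict(list)
    for xs in xss:
        d[xs[0]].extend(xs[1:])  # repeated types accumulate under one key
    return d


def dict_to_list(d):
    return [[k, *v] for k, v in d.items()]


List1 = [['Type A =', 'Value 1', 'Value 2', 'Value 3'],
         ['Type B =', 'Value 4', 'Value 5']]
List2 = [['Type Z =', 'Value 6', 'Value 7', 'Value 8'],
         ['Type A =', 'Value 9', 'Value 10', 'Value 11'],
         ['Type A =', 'Value 12', 'Value 13']]

new_list = dict_to_list(merge(list_to_dict(List1), list_to_dict(List2)))
print(new_list)
```

Because merge only walks the keys of its first argument, 'Type Z =' is dropped and both 'Type A =' entries from List2 end up after the values from List1.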
Rather than using List1 + [valuestoadd[1:]], you should use newlist[0].append(valuestoadd[1:]), so that it never creates a new list and only appends to the existing one. The [0] is necessary so that it appends to the first sublist (the 'Type A =' one in this example) rather than to the outer list.
newlist = List1  # you're doing this already; note this makes newlist another name for List1, not a copy
for values in List1:
    for valuestoadd in List2:
        if values[0] == valuestoadd[0]:
            # adds the values to the end of the first sublist;
            # to target whichever sublist matched, append to values instead
            newlist[0].append(valuestoadd[1:])
        else:
            print("Types don't match")
Output:
[['Type A =', 'Value 1', 'Value 2', 'Value 3', ['Value 9', 'Value 10', 'Value 11'], ['Value 12', 'Value 13']], ['Type B =', 'Value 4', 'Value 5']]
This does, sadly, add the values as a nested list. If you want them as individual values, you need to iterate through the list you're adding and append each value to newlist[0].
This could be achieved with another for loop, like so:
if values[0] == valuestoadd[0]:
    for subvalues in valuestoadd[1:]:  # walks through the values to add
        newlist[0].append(subvalues)  # appends each value individually
Output:
[['Type A =', 'Value 1', 'Value 2', 'Value 3', 'Value 9', 'Value 10', 'Value 11', 'Value 12', 'Value 13'], ['Type B =', 'Value 4', 'Value 5']]
I agree with the other answers that it would be better to use a dictionary right away. But if, for some reason, you want to stick to the data structure you have, you could transform it into a dictionary and back:
type_dict = {}
for tlist in List1 + List2:
    curr_type = tlist[0]
    type_dict[curr_type] = tlist[1:] if curr_type not in type_dict else type_dict[curr_type] + tlist[1:]
new_list = [[k] + type_dict[k] for k in type_dict]
When creating new_list, you can take the keys from only a subset of type_dict if you do not want to include all of them.
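For example, to get the desired result from the question, the keys could be restricted to the types that occur in List1. This is a sketch; the name wanted is made up for illustration, and type_dict is built with dict.get as a shorter equivalent of the conditional above:

```python
List1 = [['Type A =', 'Value 1', 'Value 2', 'Value 3'],
         ['Type B =', 'Value 4', 'Value 5']]
List2 = [['Type Z =', 'Value 6', 'Value 7', 'Value 8'],
         ['Type A =', 'Value 9', 'Value 10', 'Value 11'],
         ['Type A =', 'Value 12', 'Value 13']]

# Collect all values per type, in order of appearance.
type_dict = {}
for tlist in List1 + List2:
    curr_type = tlist[0]
    type_dict[curr_type] = type_dict.get(curr_type, []) + tlist[1:]

# Only keep the types that List1 already had.
wanted = [tlist[0] for tlist in List1]
new_list = [[k] + type_dict[k] for k in wanted]
print(new_list)
```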

Get elements of a sublist in python based on indexes of a different list

I have two lists of lists.
I want to get the elements from second list of lists, based on a value from the first list of lists.
If I have simple lists, everything goes smoothly, but once I have lists of lists, I'm missing something at the end.
Here is the code working for two lists (N = names, and V = values):
N = ['name 1', 'name 2','name 3','name 4','name 5','name 6','name 7','name 8','name 9','name 10']
V = ['val 1', 'val 2','val 3','val 4','val 5','val 6','val 7','val 8','val 9','val 10']
bool_ls = []
NN = N
for i in NN:
    if i == 'name 5':
        i = 'y'
    else:
        i = 'n'
    bool_ls.append(i)
# GOOD INDEXES = GI
GI = [i for i, x in enumerate(bool_ls) if x == 'y']
# SELECT THE GOOD VALUES = "GV" FROM V
GV = [V[index] for index in GI]
If I define a function, it works well when applied to the two lists:
def GV(N,V,name):
    bool_ls = []
    NN = N
    for i in NN:
        if i == name:
            i = 'y'
        else:
            i = 'n'
        bool_ls.append(i)
    GI = [i for i, x in enumerate(bool_ls) if x == 'y']
    GV = [V[index] for index in GI]
    return GV
Once I try lists of lists, I cannot get similar results. My code so far looks like this:
NN = [['name 1', 'name 2','name 3'], ['name 1', 'name 2','name 3'], ['name 1', 'name 2','name 3'], ['name 1', 'name 2','name 3'], ['name 1', 'name 2','name 3'], ['name 1', 'name 2','name 3']]
VV = [['val 1', 'val 2', 'val 3'], ['val 1', 'val 2', 'val 3'], ['val 1', 'val 2', 'val 3'], ['val 1', 'val 2', 'val 3'], ['val 1', 'val 2', 'val 3']]
def GV(NN,VV,name):
    bool_ls = []
    NNN = NN
    for j in NNN:
        for i in j:
            if i == name:
                i = 'y'
            else:
                i = 'n'
            bool_ls.append(i)
    # here is where I'm lost
Help greatly appreciated! Thank you.
You can pair up the items of both lists using zip and then filter in a list comprehension.
For the flat lists:
def GV(N, V, name):
    return [j for i, j in zip(N, V) if i == name]
For the nested lists, you'll add an extra level of nesting:
def GV(NN, VV, name):
    return [j for tup in zip(NN, VV) for i, j in zip(*tup) if i == name]
In case you want a list of lists, you can move the nesting into new lists inside the parent comprehension.
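A sketch of that list-of-lists variant, assuming each sublist of names lines up with the sublist of values at the same position. The name gv_nested and the shortened sample data are made up for illustration:

```python
def gv_nested(NN, VV, name):
    """For each (names, values) pair of sublists, keep the values whose name matches."""
    return [
        [value for key, value in zip(names, values) if key == name]
        for names, values in zip(NN, VV)
    ]


NN = [['name 1', 'name 2', 'name 3'], ['name 1', 'name 2', 'name 3']]
VV = [['val 1', 'val 2', 'val 3'], ['val 4', 'val 5', 'val 6']]

result = gv_nested(NN, VV, 'name 2')
print(result)  # [['val 2'], ['val 5']]
```

Note that zip stops at the shorter input, so if NN has more sublists than VV the extra ones are silently ignored.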
There's an easier way to do what your function is doing, but, to answer your question, you just need two loops (one for each level of lists): the first loop iterates over the list of lists, the second iterates over the inner lists and does the somewhat odd y or n thing to choose a value.
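One way the two loops could look is sketched below. It keeps the question's idea of collecting the good indexes first, and skips the y/n list. It assumes VV has at least the same shape as NN; the name gv_loops and the sample data are made up for illustration:

```python
def gv_loops(NN, VV, name):
    good_indexes = []
    for row, names in enumerate(NN):      # first level: the list of lists
        for col, key in enumerate(names):  # second level: one inner list
            if key == name:
                good_indexes.append((row, col))
    # Select the good values from VV using the collected (row, col) pairs.
    return [VV[row][col] for row, col in good_indexes]


NN = [['name 1', 'name 2', 'name 3'], ['name 1', 'name 2', 'name 3']]
VV = [['val 1', 'val 2', 'val 3'], ['val 4', 'val 5', 'val 6']]

picked = gv_loops(NN, VV, 'name 3')
print(picked)  # ['val 3', 'val 6']
```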

Append every other line from a text file? (Python)

What I'm trying to do is append every other line in my text file into a list, and then the other lines into a separate list. E.g.
Text File 'example'
Item 1
Item 2
Item 3
Item 4
Item 5
So I want 'Item 1', 'Item 3' and 'Item 5' in a list called exampleOne and the other items in a list called exampleTwo?
I've tried for ages to try and work this out by myself by slicing and then appending in different ways, but I just can't seem to get it, if anyone could help it would be greatly appreciated!
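from itertools import zip_longest  # Python 3 name; it was izip_longest in Python 2
with open("some_file.txt") as f:
    # zip_longest(f, f) pairs consecutive lines; with an odd line count, linesB ends in None
    linesA, linesB = zip(*zip_longest(f, f))
is one way you could do something like this.
This basically just abuses the fact that file handles are iterators.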
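To see the paired-iterator trick in isolation, here is a Python 3 sketch that uses io.StringIO as a stand-in for the open file. The list names follow the question; stripping the newlines and dropping the None filler are my additions:

```python
import io
from itertools import zip_longest

# io.StringIO stands in for an open text file so the sketch is self-contained.
f = io.StringIO("Item 1\nItem 2\nItem 3\nItem 4\nItem 5\n")

# zip_longest(f, f) pulls two consecutive lines from the same iterator per step;
# the missing partner of the last line (odd line count) is filled with None.
pairs = list(zip_longest(f, f))

exampleOne = [first.strip() for first, _ in pairs]
exampleTwo = [second.strip() for _, second in pairs if second is not None]

print(exampleOne)  # ['Item 1', 'Item 3', 'Item 5']
print(exampleTwo)  # ['Item 2', 'Item 4']
```

Using plain zip(f, f) instead would drop 'Item 5', because zip stops as soon as one side runs out.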
What about:
with open('example') as f:
    lists = [[], []]
    i = 0
    for line in f:
        lists[i].append(line.strip())
        i ^= 1
print(lists[0]) # ['Item 1', 'Item 3', 'Item 5']
print(lists[1]) # ['Item 2', 'Item 4']
Or simpler, with enumerate:
with open('example') as f:
    lists = [[], []]
    for i, line in enumerate(f):
        lists[i % 2].append(line.strip())
print(lists[0]) # ['Item 1', 'Item 3', 'Item 5']
print(lists[1]) # ['Item 2', 'Item 4']
EDIT
print(lists[0][0]) # 'Item 1'
print(lists[0][1]) # 'Item 3'
print(lists[0][2]) # 'Item 5'
print(lists[1][0]) # 'Item 2'
print(lists[1][1]) # 'Item 4'
