Python 2d Element Sort using recursion - python

How can I sort a 2D Python list alphabetically based on one of its elements?
[['8617622', 'Inner Seas', '', '56.71657', '-10.45898', 'H'
, 'SEA', '', '', '', '', '', '', '0', '', '-9999', '', '2013-10-01']
,['8617622', 'Blue seas', '', '56.71657', '-10.45898', 'H', 'SEA', ''
, '', '', '', '', '', '0', '', '-9999', '', '2013-10-01']]
Notice that the second element of the second inner list is 'Blue seas', while that of the first is 'Inner Seas'. Can someone please help me write a function that sorts the inner lists alphabetically by their second element?

Just sort the list in place, passing a key function that returns the second element of each inner list, as shown below:
l = [['8617622', 'Inner Seas', '', '56.71657', '-10.45898', 'H', 'SEA', '', '', '', '', '', '', '0', '', '-9999', '', '2013-10-01'],['8617622', 'Blue seas', '', '56.71657', '-10.45898', 'H', 'SEA', '', '', '', '', '', '', '0', '', '-9999', '', '2013-10-01']]
l.sort(key=lambda x:x[1])
print l
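One caveat with the plain key above: string comparison is case-sensitive, so uppercase letters sort before lowercase ones. A minimal sketch of a case-insensitive variant, assuming Python 3 and using shortened, made-up rows (the names rows, by_name and by_name_ci are only for illustration):

```python
from operator import itemgetter

# Shortened, hypothetical rows; only the second column matters here.
rows = [
    ['8617622', 'inner seas', '56.71657'],
    ['8617622', 'Blue seas', '56.71657'],
    ['8617622', 'arctic', '56.71657'],
]

# Case-sensitive: 'B' sorts before any lowercase letter.
by_name = sorted(rows, key=itemgetter(1))

# Case-insensitive: compare lower-cased copies of the second element.
by_name_ci = sorted(rows, key=lambda row: row[1].lower())

print([r[1] for r in by_name])     # ['Blue seas', 'arctic', 'inner seas']
print([r[1] for r in by_name_ci])  # ['arctic', 'Blue seas', 'inner seas']
```

sorted returns a new list and leaves rows untouched; rows.sort(key=...) with the same key would sort in place instead.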

Related

Get a element from list, from pair of 3

Hello, I have a list whose elements come in groups of three, given below:
labels = ['', '', '5000','', '2', '','', '', '1000','mm-dd-yy', '', '','', '', '15','dd/mm/yy', '', '', '', '3', '','', '', '200','', '2', '','mm-dd-yy', '', '','', '', '','', '', '']
In the above list the elements come in groups of three, i.e. ('', '', '5000') is one group, ('', '2', '') is the second, ('mm-dd-yy', '', '') is another, and so on.
Now I want to check every group of three in the list and get the element that is not blank.
('', '', '5000') gives '5000'
('', '2', '') gives '2'
('mm-dd-yy', '', '') gives 'mm-dd-yy'
If all three are blank, it should return a blank, i.e.
('', '', '') gives '', like the last two groups in the list.
So from the above list my output should be:
required_list = ['5000','2','1000','mm-dd-yy','15','dd/mm/yy','3','200','2','mm-dd-yy','','']
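Since the group size is fixed at three, you can use a for loop with a step of 3 in range(start, end, step) and concatenate the three strings of each group. This works here because at most one string per group is non-empty.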
labels = ['', '', '5000','', '2', '','', '', '1000','mm-dd-yy', '', '','', '', '15','dd/mm/yy', '', '', '', '3', '','', '', '200','', '2', '','mm-dd-yy', '', '','', '', '','', '', '']
res1=[]
for i in range(0,len(labels),3):
    res1.append(labels[i]+labels[i+1]+labels[i+2])
print(res1)
#List Comprehension
res2=[labels[i]+labels[i+1]+labels[i+2] for i in range(0,len(labels),3)]
print(res2)
Output:
['5000', '2', '1000', 'mm-dd-yy', '15', 'dd/mm/yy', '3', '200', '2', 'mm-dd-yy', '', '']
I think this should give you the required result. It is not ideal for performance, but it gets the job done and should be pretty easy to follow:
labels = ['', '', '5000','', '2', '','', '', '1000','mm-dd-yy', '', '','', '', '15','dd/mm/yy', '', '', '', '3', '','', '', '200','', '2', '','mm-dd-yy', '', '','', '', '','', '', '']
def chunks(ls):
    chunks = []
    start = 0
    end = len(ls)
    step = 3
    for i in range(start, end, step):
        chunks.append(ls[i:i+step])
    return chunks
output = []
for chunk in chunks(labels):
    nonEmptyItems = [s for s in chunk if len(s) > 0]
    if len(nonEmptyItems) > 0:
        output.append(nonEmptyItems[0])
    else:
        output.append('')
print(output)
All the previous answers laboriously create a new list of triplets, then iterate on that list of triplets.
There is no need to create this intermediate list.
def gen_nonempty_in_triplets(labels):
    return [max(labels[i:i+3], key=len) for i in range(0, len(labels), 3)]
labels = ['', '', '5000','', '2', '','', '', '1000','mm-dd-yy', '', '','', '', '15','dd/mm/yy', '', '', '', '3', '','', '', '200','', '2', '','mm-dd-yy', '', '','', '', '','', '', '']
print(gen_nonempty_in_triplets(labels))
# ['5000', '2', '1000', 'mm-dd-yy', '15', 'dd/mm/yy', '3', '200', '2', 'mm-dd-yy', '', '']
Interestingly, there are many different ways to implement "get the element which is not blank".
I chose to use max(..., key=len) to select the longest string.
Almost every answer you received uses a different method!
Here are a few different methods that were suggested. They are equivalent when at most one element of the triplet is nonempty, but they behave differently if the triplet contains two or more nonempty elements.
# selects the longest string
max(labels[i:i+3], key=len)
# selects the first nonempty string
next((i for i in labels[i:i+3] if i), '')
# concatenates all three strings
labels[i]+labels[i+1]+labels[i+2]
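To make the difference concrete, here is a small sketch that applies the three methods to a made-up triplet with two non-empty strings (the triplet and the variable names are hypothetical, not taken from the question's data):

```python
# Hypothetical triplet with two non-empty strings.
triplet = ['ab', '', 'xyz']

# Selects the longest string.
longest = max(triplet, key=len)

# Selects the first non-empty string, or '' if there is none.
first_nonempty = next((s for s in triplet if s), '')

# Concatenates all three strings.
concatenated = ''.join(triplet)

print(longest)         # xyz
print(first_nonempty)  # ab
print(concatenated)    # abxyz
```

With the question's data, where each triplet has at most one non-empty string, all three expressions give the same result.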
Split the list into slices of three, then get the first non-empty element of each slice with next.
labels = ['', '', '5000','', '2', '','', '', '1000','mm-dd-yy', '', '','', '', '15','dd/mm/yy', '', '', '', '3', '','', '', '200','', '2', '','mm-dd-yy', '', '','', '', '','', '', '']
length = len(labels)
list_by_3 = [labels[i:i+3] for i in range(0, length, 3)]
required_list = []
for triplet in list_by_3:
    required_list.append(
        # the generator expression needs its own parentheses when next() gets a default
        next((i for i in triplet if i), "")
    )
>>> required_list
['5000', '2', '1000', 'mm-dd-yy', '15', 'dd/mm/yy', '3', '200', '2', 'mm-dd-yy', '', '']

Updating a variable to the value of a string only returns the first character of the string

I'm trying to hard-code the major ticks for a plot by creating an array that I will then attach to the x-axis of the graph. However, I can't get the array to come out correctly. I created an array of empty strings, xticks, and want to set every 5th entry to the corresponding value from major_ticks, but the updated entries contain only the first character of each value in major_ticks.
length_x = 21
import numpy as np
xticks=np.full(length_x,'',dtype=str)
#print(xticks) returns ['' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '' '']
major_ticks=np.linspace(-10,10,5,dtype=int)
#print(major_ticks) returns [-10 -5 0 5 10]
i=0
for j in range(len(xticks)):
    if j%5==0:
        xticks[j]=str(major_ticks[i])
        i+=1
print(xticks) #returns ['-' '' '' '' '' '-' '' '' '' '' '0' '' '' '' '' '5' '' '' '' '' '1']
Please help me understand why this is happening. I've been banging my head against the wall for 3 hours now.
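This happens because np.full with dtype=str and an empty fill value doesn't create an array of arbitrary-length strings, but an array of strings holding at most one character each, so longer values are truncated on assignment: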
np.full(length_x,'',dtype=str).dtype
dtype('<U1')
Typically I wouldn't recommend using NumPy for string operations. Replacing xticks=np.full(length_x,'',dtype=str) with xticks = [''] * length_x will give you what you want.
I think there's something funky going on with your np.full declaration. Switching to using python lists will make it easier:
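import numpy as np
length_x = 21  # same value as in the question
major_ticks=np.linspace(-10,10,5,dtype=int)
xticks = []
i=0
for j in range(length_x):
    if j%5==0:
        # every 5th position gets the next major tick label
        tick = str(major_ticks[i])
        i += 1
    else:
        tick = ''
    xticks.append(tick)
print(xticks)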
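For comparison, the same labels can be built without NumPy at all. This is only a sketch under the question's numbers (21 positions, a major tick every 5th position); the step variable and the hard-coded major_ticks list are assumptions made for illustration:

```python
length_x = 21
step = 5
major_ticks = [-10, -5, 0, 5, 10]  # hard-coded stand-in for np.linspace(-10, 10, 5, dtype=int)

# Position j is a major tick when j is a multiple of step;
# j // step is then the index of the matching label.
xticks = [str(major_ticks[j // step]) if j % step == 0 else ''
          for j in range(length_x)]

print(xticks)
```

Because Python lists hold ordinary str objects, there is no fixed width to truncate against.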
In [129]: major_ticks=np.linspace(-10,10,5,dtype=int)
In [130]: major_ticks.shape
Out[130]: (5,)
In [133]: major_ticks
Out[133]: array([-10, -5, 0, 5, 10])
In [134]: major_ticks.astype(str)
Out[134]: array(['-10', '-5', '0', '5', '10'], dtype='<U21')
Making strings from major_ticks. 21 is bigger than needed, but who's counting?
In [135]: xticks=np.full(21,'',dtype='U21')
In [136]: xticks
Out[136]:
array(['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
'', '', '', ''], dtype='<U21')
In [138]: i=0
     ...: for j in range(len(xticks)):
     ...:     if j%5==0:
     ...:         xticks[j] = str(major_ticks[i])
     ...:         i+=1
     ...:
In [139]: xticks
Out[139]:
array(['-10', '', '', '', '', '-5', '', '', '', '', '0', '', '', '', '',
'5', '', '', '', '', '10'], dtype='<U21')
But we can fill the string array directly:
In [140]: xticks=np.full(21,'',dtype='U21')
In [141]: xticks[0::5] = major_ticks
In [142]: xticks
Out[142]:
array(['-10', '', '', '', '', '-5', '', '', '', '', '0', '', '', '', '',
'5', '', '', '', '', '10'], dtype='<U21')
The integers are converted to the string dtype as they are added to xticks.
It seems that dtype=str combined with an empty fill value ends up with a length limit of 1. Set an explicit width such as dtype='U10' instead (or 'S10' for byte strings).

How to remove values from a list of list in Python

I have a list as
[['Name', 'Place', 'Batch'], ['11', '', 'BBI', '', '2015'], ['12', '', 'CCU', '', '', '', '', '2016'], ['13', '', '', 'BOM', '', '', '', '', '2017']]
I want to remove all the '' from the list.
The code I tried is :
>>> for line in list:
...     if line == '':
...         list.remove(line)
...
>>> print(list)
The output that it shows is:
[['Name', 'Place', 'Batch'], ['11', '', 'BBI', '', '2015'], ['12', '', 'CCU', '', '', '', '', '2016'], ['13', '', '', 'BOM', '', '', '', '', '2017']]
Can someone suggest what's wrong with this ?
This is what you want:
a = [['Name', 'Place', 'Batch'], ['11', '', 'BBI', '', '2015'], ['12', '', 'CCU', '', '', '', '', '2016'], ['13', '', '', 'BOM', '', '', '', '', '2017']]
b = [[x for x in y if x != ''] for y in a]  # kudos @Moses
print(b) # prints: [['Name', 'Place', 'Batch'], ['11', 'BBI', '2015'], ['12', 'CCU', '2016'], ['13', 'BOM', '2017']]
Your solution does not work because on each iteration line is an entire sublist, not an element of a sublist, so it never equals ''.
It also seems that you are trying to modify the original list in place (without creating any additional variables). That is a reasonable goal, but removing items from a list while looping over it is not advisable: a for loop can silently skip elements, and an index-based loop can raise an IndexError.
Lastly, do not use the name list: it is the name of Python's built-in list type, and assigning to it shadows the built-in.
You're running your test on the sublists, not on the items they contain, so you'd need a nested for loop to do what you want. However, removing items from a list with list.remove while iterating over it usually leads to unpredictable results.
You can, however, use a list comprehension to filter out the empty strings:
r = [[i for i in x if i != ''] for x in lst]
#                  ^^^^^^^^^^ filter condition
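If the sublists really have to be changed in place, one option is slice assignment, which replaces the contents of each existing list object instead of binding a new list. A minimal sketch, assuming Python 3 (the names rows and alias are only for illustration):

```python
rows = [
    ['Name', 'Place', 'Batch'],
    ['11', '', 'BBI', '', '2015'],
    ['12', '', 'CCU', '', '', '', '', '2016'],
]

alias = rows[1]  # a second reference to the same sublist

for row in rows:
    # Build the filtered list first, then overwrite the sublist's contents.
    # Nothing is removed while the sublist itself is being iterated.
    row[:] = [x for x in row if x != '']

print(rows)   # [['Name', 'Place', 'Batch'], ['11', 'BBI', '2015'], ['12', 'CCU', '2016']]
print(alias)  # ['11', 'BBI', '2015']
```

Any other reference to a sublist, such as alias here, sees the cleaned contents, which is the practical difference from building a brand-new nested list.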

"why can't I use the Python "in" operator to tell if a word is present in a CSV file?"

I'm splitting a string into words and, for each word, going through each row of a CSV file to check whether the word appears in any row. If it does, I want to print that row.
import csv
tempfile = open("word.csv",'r')
listtext = csv.reader(tempfile)
sti = "this is not a very good string"
sss = sti.split(" ")
print sss
for word in sss:
    print word
    for x in listtext:
        if str(word) in x:
            print x
This is the output I'm getting
['this', 'is', 'not', 'a', 'very', 'good', 'string']
this
is
not
a
very
good
string
This is the output I want
['proud', 'flesh', '', '', '', '', '', '', '', '', '', '', '', 'n', '-0.38925']
['proud', 'of', '', '', '', '', '', '', '', '', '', '', '', 'a', '-0.03118']
['proud', '', '', '', '', '', '', '', '', '', '', '', '', 'a', '0.5']
['tarquin', 'the', 'proud', '', '', '', '', '', '', '', '', '', '', 'n', '-0.07997']
I get this output with the following code:
for x in listtext:
    if "proud" in x:
        print x
A csv.reader is an iterator: it yields rows whenever asked for its next item, until all the rows have been consumed. Once they've been consumed, it's exhausted and has no more content: you can't run that same reader through a loop a second time and get more results from it.
However, by iterating first over your list of words to match, and then against the contents of the CSV file, you're trying to do exactly that -- with the effect that the CSV file will be read only once, matched against the first word in your string ("this"), but not any other.
If you want to create an object you can iterate through more than once, you'll want something like:
listtext = list(csv.reader(tempfile))
Or you can invert your loops: iterate first over the rows of the input file, and then over the words you want to match against each row.
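A rough sketch of the inverted-loop variant, assuming Python 3. To keep it self-contained it reads made-up CSV text from an io.StringIO object instead of word.csv; csv_text, words and matches are hypothetical names:

```python
import csv
import io

# Made-up stand-in for the contents of word.csv.
csv_text = "proud,flesh,n\nproud,of,a\nvery,good,a\n"
words = "this is not a very good string".split()

matches = []
# Outer loop over the rows: the reader is consumed exactly once.
for row in csv.reader(io.StringIO(csv_text)):
    # Inner check over the words: keep the row if any word equals one of its cells.
    if any(word in row for word in words):
        matches.append(row)

for row in matches:
    print(row)
# ['proud', 'of', 'a']
# ['very', 'good', 'a']
```

Note that word in row tests for a cell equal to the word, not for a substring inside a cell, which matches the behaviour of the original if str(word) in x test.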
Your problem as answered by Charles is that once you've spun through the csv with for x in listtext:, the csv row reader has been exhausted and there is nothing left to process the next time through the outer loop.
Any time you find yourself spinning through a list multiple times, ask yourself if there is some way to index it. You could use a dict that uses a cell as key and a list of rows where that cell appears as value. collections.defaultdict does some of the bookkeeping for you by creating items in the dict on first access. Your code could be
import csv
import collections
# index csv rows by column
csv_index = collections.defaultdict(list)
with open("word.csv", "rb") as tempfile:
    for row in csv.reader(tempfile):
        for col in row:
            csv_index[col].append(row)
# sample to check
sti = "this is not a very good string"
sss = sti.split(" ")
print sss
# lookup words from sample and print
for word in sss:
    print word
    for row in csv_index[word]:
        print(row)
This outputs
['this', 'is', 'not', 'a', 'very', 'good', 'string']
this
is
not
a
['proud', 'of', '', '', '', '', '', '', '', '', '', '', '', 'a', '-0.03118']
['proud', '', '', '', '', '', '', '', '', '', '', '', '', 'a', '0.5']
very
good
string

Filter function doesn't work in python 2.7

For some reason I can't get the filter function to work.
I'm trying to remove empty strings from a list. After reading Remove empty strings from a list of strings, I'm trying to utilize the filter function.
import csv
import itertools
importfile = raw_input("Enter Filename(without extension): ")
importfile += '.csv'
test=[]
#imports plant names, effector names, plant numbers and leaf numbers from csv file
with open(importfile) as csvfile:
    lijst = csv.reader(csvfile, delimiter=';', quotechar='|')
    for row in itertools.islice(lijst, 0, 4):
        test.append([row])
test1 = list(filter(None, test[3]))
print test1
This however returns:
[['leafs', '3', '', '', '', '', '', '', '', '', '', '']]
What am I doing wrong?
You have a list inside a list, so filter(None, ...) is applied to the outer list, whose only item is a non-empty (and therefore truthy) list; the empty strings inside it are not affected. You can use, say, a nested list comprehension to reach into the inner list and filter out falsy objects:
lst = [['leafs', '3', '', '', '', '', '', '', '', '', '', '']]
test1 = [[x for x in i if x] for i in lst]
# [['leafs', '3']]
You filter a list of lists, where the inner item is a non-empty list.
>>> print filter(None, [['leafs', '3', '', '', '', '', '', '', '', '', '', '']])
[['leafs', '3', '', '', '', '', '', '', '', '', '', '']]
If you filter the inner list, the one that contains strings, everything works as expected:
>>> print filter(None, ['leafs', '3', '', '', '', '', '', '', '', '', '', ''])
['leafs', '3']
I was indeed filtering a list of lists. The problem in my code was:
for row in itertools.islice(lijst, 0, 4):
    test.append([row])
This should be:
for row in itertools.islice(lijst, 0, 4):
    test.append(row)
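A side note for anyone running this on Python 3 rather than 2.7: there, filter returns a lazy iterator instead of a list, so it needs to be wrapped in list() before printing or indexing. A small sketch with made-up rows (rows and cleaned are hypothetical names):

```python
# Made-up rows standing in for the first lines of the CSV file.
rows = [
    ['leafs', '3', '', '', ''],
    ['plant', '', 'A1', ''],
]

# filter(None, row) drops every falsy item (here: the empty strings);
# list() materialises the iterator that Python 3 returns.
cleaned = [list(filter(None, row)) for row in rows]

print(cleaned)  # [['leafs', '3'], ['plant', 'A1']]
```

The list(filter(None, ...)) call in the question already has this shape, so it behaves the same on both versions once each row is appended without the extra brackets.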
