Loop to Match Parts of List - python

My code:
#prints out samenodes
f = open('newerfile.txt')
mylist = list(f)
count = 0
i = 1
while count < 1000:
    if mylist[i] == mylist[i+12] and mylist [i+3] == mylist [i+14]:
        print mylist[i]
    count = count+1
    i = i+12
My intention is to look at elt 1 and elt 2. If elt 1 == elt 13 AND elt 2 == elt 14, I want to print elt 1. Then I want to look at elt 13 and elt 14. If elt 13 matches elt 13+12 AND elt 14 matches elt 14+12, I want to print it, and so on.
There are certainly parts of my list that fit these criteria, but the program returns no output.

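One problem is your indices. Be advised that list indices begin at 0, so elt 1 is mylist[0]. The second comparison should also pair mylist[i+1] with mylist[i+13], not mylist[i+3] with mylist[i+14].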
I'm surprised nobody's answered this yet:
#prints out samenodes
f = open('newerfile.txt')
mylist = list(f)
count = 0
i = 0
while count < 1000:
    #print mylist[i]
    #print mylist[i+12]
    #print mylist[i+13]
    #print mylist[i+14]
    #...use prints to help you debug
    if mylist[i] == mylist[i+12] and mylist [i+1] == mylist [i+13]:
        print mylist[i]
    count = count+1
    i = i+12
This is probably what you want.

To iterate over multiple lists (technically, iterables) in "lockstep", you can use zip. In this case, you want to iterate over four versions of mylist, offset by 0, 12, 1 and 13.
zippedLists = zip(mylist, mylist[12:], mylist[1:], mylist[13:])
Next, you want the 0th, 12th, 24th, etc. elements. This is done with a slice:
slicedList = zippedLists[::12]
Then you can iterate over that:
for elt1, elt13, elt2, elt14 in slicedList:
    if elt1 == elt13 and elt2 == elt14:
        print elt1
Putting it together with the file operations, we get
#prints out samenodes
f = open('newerfile.txt')
mylist = list(f)
zippedLists = zip(mylist, mylist[12:], mylist[1:], mylist[13:])
slicedList = zippedLists[::12]
for elt1, elt13, elt2, elt14 in slicedList:
    if elt1 == elt13 and elt2 == elt14:
        print elt1
Code like this is generally considered more "pythonic" than your current version, as using list indexes is generally discouraged when you are iterating over the list.
Note that if you've got a huge number of elements in your list, the above code creates (and at some point destroys) five extra lists. You may therefore get better memory performance from the equivalent functions in itertools, which use lazy iterators to avoid copying lists needlessly:
from itertools import islice, izip
#prints out samenodes
f = open('newerfile.txt')
mylist = list(f)
# islice(mylist, n, None) is the lazy equivalent of mylist[n:]
zippedLists = izip(mylist, islice(mylist, 12, None), islice(mylist, 1, None), islice(mylist, 13, None))
slicedList = islice(zippedLists, 0, None, 12)
for elt1, elt13, elt2, elt14 in slicedList:
    if elt1 == elt13 and elt2 == elt14:
        print elt1
There's probably a way in itertools to avoid slurping the entire file into mylist, but I'm not sure I remember what it is - I think that is the use case for itertools.tee.
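For readers on Python 3, where izip no longer exists and the built-in zip is already lazy, the lockstep idea from this answer might be sketched as follows. The function name matching_blocks, the step parameter, and the sample data are illustrative assumptions, not part of the original answer.

```python
from itertools import islice


def matching_blocks(lines, step=12):
    """Yield lines[i] for i = 0, step, 2*step, ... whenever
    lines[i] == lines[i + step] and lines[i + 1] == lines[i + step + 1]."""
    quads = zip(
        lines,                          # element i
        islice(lines, step, None),      # element i + step
        islice(lines, 1, None),         # element i + 1
        islice(lines, step + 1, None),  # element i + step + 1
    )
    # Keep only every step-th tuple: i = 0, step, 2*step, ...
    for first, first_later, second, second_later in islice(quads, 0, None, step):
        if first == first_later and second == second_later:
            yield first


# Hypothetical data: blocks of 3 lines; blocks 0 and 1 share their first two lines.
sample = ["a", "b", "x", "a", "b", "y", "c", "d", "z"]
print(list(matching_blocks(sample, step=3)))  # ['a']
```

Because zip stops at the shortest input, the loop ends on its own once fewer than step + 2 elements remain, so no index can run past the end of the list.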

Related

How to take only 3 duplicates from a list with 6 or more duplicates and put them in another list

I wrote some code that solves my problem, but it's too slow. Are there any other solutions?
Input
['13906577679124','13906577679124','13906577679124','13906577679124','13906577679124','13906577679124','13906577679124','142404643442629780','142404643442629780','142404643442629780','142404643442629780','142404643442629780','142404643442629780']
from collections import Counter
bs = []
ls = []
for item in append_to_table2:
    counted = Counter(ls)
    if counted[item] <= 2:
        ls.append(item)
    else:
        bs.append(item)
print(ls)
Output
['13906577679124', '13906577679124', '13906577679124', '142404643442629780', '142404643442629780', '142404643442629780']
You can keep a running count in a dict and compare as you go, so nothing is recounted:
d = {}
ls = []
bs = []
for item in append_to_table2:
    d[item] = d.get(item, 0) + 1  # running count for this item
    if d[item] <= 3:
        ls.append(item)  # first three occurrences
    else:
        bs.append(item)  # everything beyond the third
You can use a list comprehension on the original list with a condition on the item counts, like so:
counted = Counter(append_to_table2)
bs=[i for i in append_to_table2 if counted[i] > 2]
print(bs)
I may have misunderstood the question, but here is what I'm trying to do: any element that occurs fewer than 3 times is appended to ls that many times. For any element that occurs 3 or more times, 3 occurrences go to ls and the rest go to bs.
from collections import Counter
from itertools import repeat
org=['13906577679124','13906577679124','13906577679124','13906577679124','13906577679124','13906577679124','13906577679124','142404643442629780','142404643442629780','142404643442629780','142404643442629780','142404643442629780','142404643442629780']
append_to_table2=list(set(org))
bs = []
ls = []
for item in append_to_table2:
    ele_count = org.count(item)
    if ele_count<= 2:
        ls.extend(repeat(item, ele_count))
    else:
        left=ele_count-3
        ls.extend(repeat(item, 3))
        bs.extend(repeat(item, left))
print(ls)
Output:
['13906577679124', '13906577679124', '13906577679124', '142404643442629780', '142404643442629780', '142404643442629780']
I'm also new, so any suggestions to improve the code would be very much appreciated.
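As one possible suggestion for the answer above: org.count(item) rescans the whole list for every distinct value, and going through set() loses the original order. A minimal sketch that counts everything once with Counter is shown below. The function name split_by_limit, its limit parameter, and the stand-in data are assumptions made for illustration.

```python
from collections import Counter


def split_by_limit(items, limit=3):
    """Return (kept, overflow): up to `limit` copies of each value go to
    `kept`, any remaining copies go to `overflow`."""
    kept, overflow = [], []
    # Counter is built once; on Python 3.7+ it iterates in first-seen order.
    for value, count in Counter(items).items():
        kept.extend([value] * min(count, limit))
        overflow.extend([value] * max(count - limit, 0))
    return kept, overflow


org = ["a"] * 7 + ["b"] * 6 + ["c"] * 2  # hypothetical stand-in data
ls, bs = split_by_limit(org)
print(ls)  # ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c']
print(bs)  # ['a', 'a', 'a', 'a', 'b', 'b', 'b']
```

Note that this groups equal values together in the output, which matches the sample data in the question but would reorder an input whose duplicates are interleaved.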
This is your code condensed into a single expression. Note that it still rebuilds Counter(ls) on every iteration, so it is shorter but not faster.
bs = []
ls = []
[ls.append(item) if Counter(ls)[item] <= 2 else bs.append(item) for item in append_to_table2]
print(ls)

Compare each element of a list in existing order with elements of a second list in order as long as items in lists are equal

Compare each element of a list in existing order with elements of a second list in existing order as long as items in lists are equal. Stop if they're not equal and give me as result the index and name of the last match.
I thought it's straightforward with a while loop but it seems like this has to be approached with a for-loop.
The first list I want to compare:
nk_script_file_path
['P:', 'Projects', '2019_projects', '1910_My_Project', '01_Production_IN', '01_OFX', '01_Comp', '00_Nuke', 'relink_test_v001.nk']
The second list I want to compare it to:
node_filepath
['P:', 'Projects', '2019_projects', '1910_My_Project', '02_Production_OUT', '01_OFX', '01_Comp', '00_Nuke', '040_ALY', '040_ALY_040_HROTERRORBLADE', '040_ALY_040_HROTERRORBLADE_prev_Gamma22_apcs_mov', '040_ALY_040_HROTERRORBLADE_prev_v14_Gamma22_apcs.mov']
What I've tried
nk_script_file_path = r"P:/Projects/2019_projects/1910_My_Project/01_Production_IN/01_OFX/01_Comp/00_SO/relink_test_v001.nk".split("/")
node_filepath = r"P:/Projects/2019_projects/1910_My_Project/02_Production_OUT/01_OFX/01_Comp/00_S=/040_ALY/040_ALY_040_HROTERRORBLADE/040_ALY_040_HROTERRORBLADE_prev_Gamma22_apcs_mov/040_ALY_040_HROTERRORBLADE_prev_v14_Gamma22_apcs.mov".split("/")
# Compare file paths
path_object = 0
while nk_script_file_path in node_filepath:
    path_object += 1
print path_object
print node_filepath[path_object]
Result I'm looking for:
"3"
or
"1910_My_Project"
You can use zip() with enumerate() to find the first index at which the lists differ; the last match is the element just before it. In this example, if no difference is found, i is set to -1:
lst1 = ['P:', 'Projects', '2019_projects', '1910_My_Project', '01_Production_IN', '01_OFX', '01_Comp', '00_Nuke', 'relink_test_v001.nk']
lst2 = ['P:', 'Projects', '2019_projects', '1910_My_Project', '02_Production_OUT', '01_OFX', '01_Comp', '00_Nuke', '040_ALY', '040_ALY_040_HROTERRORBLADE', '040_ALY_040_HROTERRORBLADE_prev_Gamma22_apcs_mov', '040_ALY_040_HROTERRORBLADE_prev_v14_Gamma22_apcs.mov']
for i, (a, b) in enumerate(zip(lst1, lst2)):
    if a != b:
        break
else:
    i = -1
print('First difference is at index:', i)
Prints:
First difference is at index: 4
nk_script_file_path= r"P:/Projects/2019_projects/1910_My_Project/01_Production_IN/01_OFX/01_Comp/00_SO/relink_test_v001.nk".split("/")
node_filepath = r"P:/Projects/2019_projects/1910_My_Project/02_Production_OUT/01_OFX/01_Comp/00_S=/040_ALY/040_ALY_040_HROTERRORBLADE/040_ALY_040_HROTERRORBLADE_prev_Gamma22_apcs_mov/040_ALY_040_HROTERRORBLADE_prev_v14_Gamma22_apcs.mov".split("/")
j = 0
for i in nk_script_file_path:
    if i != node_filepath[j]:
        j = j-1
        break
    else:
        j += 1
print(nk_script_file_path[j])
print(j)
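Both answers above can be folded into one small helper that returns the index and the name of the last match directly, which is the result the question asks for. This is only a sketch: the function name last_common is an assumption, and it returns None when even the first elements differ.

```python
def last_common(parts_a, parts_b):
    """Return (index, item) for the last position at which both lists still
    agree, or None if the very first elements already differ."""
    last = None
    for index, (a, b) in enumerate(zip(parts_a, parts_b)):
        if a != b:
            break
        last = (index, a)
    return last


lst1 = ['P:', 'Projects', '2019_projects', '1910_My_Project',
        '01_Production_IN', '01_OFX', '01_Comp', '00_Nuke', 'relink_test_v001.nk']
lst2 = ['P:', 'Projects', '2019_projects', '1910_My_Project',
        '02_Production_OUT', '01_OFX', '01_Comp', '00_Nuke', '040_ALY']
print(last_common(lst1, lst2))  # (3, '1910_My_Project')
```

Since zip stops at the shorter list, two lists that agree all the way yield the last index of the shorter one instead of raising an IndexError.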

removing numbers which are close to each other in a list

I have a list like
mylist = [75,75,76,77,78,79,154,155,154,156,260,262,263,550,551,551,552]
I need to remove numbers that are close to each other, meaning within at most four of one another:
num-4 <= x <= num+4
The list I need at the end should look like:
list = [75,154,260,550]
or
list = [76,156,263,551]
It doesn't really matter which number stays in the list, as long as only one of each group of close numbers remains.
I tried this, which raised an error:
for i in range(len(l)):
    for j in range(len(l)):
        if i==j or i==j+1 or i==j+2 or i == j+3:
            pp= l.pop(j)
            print(pp)
            print(l)
IndexError: pop index out of range
And this one, which doesn't work the way I need:
for q in li:
    for w in li:
        print(q,'////',w)
        if q == w or q ==w+1 or q==w+2 or q==w+3:
            rem = li.remove(w)
Thanks
The below uses groupby to identify runs from the iterable that start with a value start and contain values that differ from start by no more than 4. We then collect all of those start values into a list.
from itertools import groupby
def runs(difference=4):
    start = None
    def inner(n):
        nonlocal start
        if start is None:
            start = n
        elif abs(start-n) > difference:
            start = n
        return start
    return inner
print([next(g) for k, g in groupby(mylist, runs())])
# [75, 154, 260, 550]
This assumes that the input data is already sorted. If it's not, you'll have to sort it: groupby(sorted(mylist), runs()).
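The same run-based idea can also be written as a plain loop, which some readers may find easier to follow than the groupby closure. This is a minimal sketch, not the answer's own code; the function name thin_out is an assumption, and it sorts its input first.

```python
def thin_out(values, difference=4):
    """Keep the first value of each run; a run holds every value that lies
    within `difference` of the run's first value."""
    kept = []
    for value in sorted(values):
        # Start a new run when the value is too far from the current run's start.
        if not kept or value - kept[-1] > difference:
            kept.append(value)
    return kept


mylist = [75, 75, 76, 77, 78, 79, 154, 155, 154, 156, 260, 262, 263, 550, 551, 551, 552]
print(thin_out(mylist))  # [75, 154, 260, 550]
```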
You can accomplish this using a set or list; you don't need a dict.
usedValues = set()
newList = []
for v in mylist:
    if v not in usedValues:
        newList.append(v)
        for lv in range(v - 4, v + 5):
            usedValues.add(lv)
print(newList)
This method stores all values within 4 of every value kept so far. When you look at a new value from mylist, you only need to check whether you've seen something in its ballpark before by checking usedValues.

Funny behaviour of my recursive function

t = 8
string = "1 2 3 4 3 3 2 1"
string.replace(" ","")
string2 = [x for x in string]
print string2
for n in range(t-1):
    string2.remove(' ')
print string2
def remover(ca):
    newca = []
    print len(ca)
    if len(ca) == 1:
        return ca
    else:
        for i in ca:
            newca.append(int(i) - int(min(ca)))
        for x in newca:
            if x == 0:
                newca.remove(0)
        print newca
        return remover(newca)
print (remover(string2))
It's supposed to be a program that takes in a list of numbers and subtracts min(list) from every number in the list. It works fine for the first few iterations but not towards the end. I've added print statements here and there to help out.
EDIT:
t = 8
string = "1 2 3 4 3 3 2 1"
string = string.replace(" ","")
string2 = [x for x in string]
print len(string2)
def remover(ca):
    newca = []
    if len(ca) == 1: return()
    else:
        for i in ca:
            newca.append(int(i) - int(min(ca)))
        while 0 in newca:
            newca.remove(0)
        print len(newca)
        return remover(newca)
print (remover(string2))
for x in newca:
    if x == 0:
        newca.remove(0)
Iterating over a list and removing things from it at the same time can lead to strange and unexpected behavior, because the loop skips the element that slides into the removed slot. Try using a while loop instead.
while 0 in newca:
    newca.remove(0)
Or a list comprehension:
newca = [item for item in newca if item != 0]
Or create yet another temporary list:
newnewca = []
for x in newca:
    if x != 0:
        newnewca.append(x)
print newnewca
return remover(newnewca)
(Not a real answer, JFYI:)
Your program can be waaay shorter if you decompose it into proper parts.
def aboveMin(items):
    min_value = min(items) # only calculate it once
    return differenceWith(min_value, items)
def differenceWith(min_value, items):
    result = []
    for value in items:
        result.append(value - min_value)
    return result
The above pattern can, as usual, be replaced with a comprehension:
def differenceWith(min_value, items):
    return [value - min_value for value in items]
Try it:
>>> print aboveMin([1, 2, 3, 4, 5])
[0, 1, 2, 3, 4]
Note how no item is ever removed, and that data are generally not mutated at all. This approach helps reason about programs a lot; try it.
So IF I've understood the description of what you expect, I believe the script below would result in something closer to your goal.
Logic:
- split() returns a list of the "numbers" given to raw_input. Even if you had used the output of replace, you would end up with one very long number (you removed the spaces that separated the numbers), and your list comprehension then splits that string into single digits, which does not match your described intent.
- You should test that each input provided is an integer.
- As you already print inside your function, there is no need for it to return anything.
- Avoid adding zeros to your new array; just test first.
string = raw_input()
array = string.split()
intarray = []
for x in array:
    try:
        intarray.append(int(x))
    except ValueError:
        pass  # skip anything that is not an integer
def remover(arrayofint):
    newarray = []
    minimum = min(arrayofint)
    for i in arrayofint:
        if i > minimum:
            newarray.append(i - minimum)
    if len(newarray) > 0:
        print newarray
        remover(newarray)
remover(intarray)
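The recursion above can also be unrolled into a loop. The following Python 3 sketch assumes the goal is the one described in the question (subtract the minimum, drop the zeros, repeat until nothing is left); collecting the intermediate lists in steps instead of printing them is a choice made here for illustration only.

```python
def remover(numbers):
    """Repeatedly subtract the minimum from every number and drop the zeros,
    collecting each non-empty intermediate list."""
    steps = []
    current = list(numbers)
    while current:
        smallest = min(current)  # computed once per round
        # Values equal to the minimum would become 0, so they are skipped.
        current = [n - smallest for n in current if n > smallest]
        if current:
            steps.append(current)
    return steps


numbers = [int(x) for x in "1 2 3 4 3 3 2 1".split()]
print(remover(numbers))  # [[1, 2, 3, 2, 2, 1], [1, 2, 1, 1], [1]]
```

No list is mutated while it is being iterated, which avoids the skipped-element problem described in the first answer.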

Python 3.0+ Calculating Mode

I have written a program to calculate the most often occurring number. This works great unless you have 2 most occurring numbers in a list such as 7,7,7,9,9,9. For that I wrote in:
if len(modeList) > 1 and modeList[0] != modeList[1]:
    break
but then I run into other problems with a set of numbers such as 7,9,9,9,9. What do I do? Below is my code, which calculates a single mode.
list1 = [7,7,7,9,9,9,9]
numList=[]
modeList=[]
finalList =[]
for i in range(len(list1)):
    for k in range(len(list1)):
        if list1[i] == list1[k]:
            numList.append(list1[i])
numList.append("EOF")
w = 0
for w in range(len(numList)):
    if numList[w] == numList[w + 1]:
        modeList.append(numList[w])
    if numList[w + 1] == "EOF":
        break
w = 0
lenMode = len(modeList)
print(lenMode)
while lenMode > 1:
    for w in range(lenMode):
        print(w)
        if w != lenMode - 1:
            if modeList[w] == modeList[w + 1]:
                finalList.append(modeList[w])
                print(w)
    lenFinal = len(finalList)
    modeList = []
    for i in range(lenFinal):
        modeList.append(finalList[i])
    finalList = []
    lenMode = len(modeList)
and then
print(modeList)
We have not learned counters but I would be open to it if someone could explain!
I would just use collections.Counter for this:
>>> from collections import Counter
>>> c = Counter([7,9,9,9,9])
>>> max(c.items(), key=lambda x:x[1])[0]
9
This is really rather simple. All it does is count how many times each value appears in the list, and then selects the element with the highest count.
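I would use statistics.mode() for this. Before Python 3.8 it raises StatisticsError if there is more than one mode; from 3.8 onward it returns the first mode encountered, and statistics.multimode() returns all of them. If you need to handle multiple modes on an older version (it's not clear to me whether that's the case), you probably want to use a collections.Counter object as suggested by NPE.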
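If ties do need to be reported, the Counter approach extends naturally: find the highest count, then keep every value that reaches it. A minimal sketch follows; the function name modes is an assumption, and on Python 3.8+ statistics.multimode() covers the same ground.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)  # maps each value to its number of occurrences
    if not counts:
        return []
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print(modes([7, 7, 7, 9, 9, 9]))  # [7, 9]
print(modes([7, 9, 9, 9, 9]))     # [9]
```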
