This is my current code but I am not entirely sure how to append the list with the results of the search. If anyone could provide any help it would be appreciated.
import sys
import re
with open('text.log') as f:
    z=[]
    count = 0
    match = re.compile(r'^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(.?)$')
    for l in f:
        if match.search(l):
            z = l.strip().split("\t")[5:8]
            z.pop(1)
            print(z[1]) # Now print them
            print('\n')
            count += 1
            if count == 20:
                print("\n\n\n\n\n-----NEW GROUPING OF 20 RESULTS-----\n\n\n\n\n\n")
                count = 0
        else:
            print('wrong')
sys.exit()
A few thoughts:
1) Whenever you have questions about regular expressions, you can use tools like the Python Regex Tool Site to confirm your REs are doing what you think they're doing.
2) In your comment, you said that you don't want every element in ip to be printed. The any() function will return True if any of the elements in the iterable are True, hence the function name.
if any(match.search(s) for s in ip):
    # this if statement will be true if ANY of the elements of ip match the
    # regex, and all of the statements under it will be executed
    print(ip) # now you're printing the whole list, so even the ones that
              # didn't match will be printed
3) If you want the tryme() function to return just matching elements of the ip list, try this:
def tryme(ip):
    # raw string, so that \. reaches the regex engine unchanged
    match = re.compile(r"^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$")
    return [element for element in ip if match.search(element)]
Don't forget to modify the main body of your code to catch the returned list and print it out.
4) You have a useless assignment:
with open('text.log') as f:
    listofip=[]
    count = 0
    for l in f:
        listofip = l.strip().split("\t")[5:8] # Grab the elements you want...
The second assignment to listofip overwrites the empty list you created, so you should either get rid of the listofip = [] or use a different name when you slice up your input line. Based on your original question title, I think something like this may be more appropriate for you:
import sys
import re
import operator
# using my definition of tryme() from section 3 of my answer
with open('text.log') as f:
    list_of_ips = []
    count = 0
    retriever = operator.itemgetter(5, 7) # same fields as [5:8] followed by pop(1)
    for l in f:
        matches = tryme(retriever(l.strip().split("\t")))
        list_of_ips.extend(matches) # extend, so the list stays flat
        count += len(matches) # add however many ips matched to current count
        if count >= 20:
            print(list_of_ips)
            print("\n\n\n\n\n-----NEW GROUPING OF RESULTS-----\n\n\n\n\n\n")
            count = 0 # reset count
            list_of_ips = [] # empty the list of ips
This will iterate over the file, grab the elements that match your RE, add them to a list, and print them out once there are at least 20 in the list. Note that it may print groups larger than 20. I have also added operator.itemgetter() to simplify your slicing and popping.
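As a hedged illustration of points 2 and 3, the sketch below runs the filter on made-up tab-separated lines instead of text.log. The sample data and the names sample_lines, OCTET and IP_RE are assumptions for the example, not part of the original log or code.

```python
import re
from operator import itemgetter

# Dotted-quad pattern from point 3, assembled from one octet sub-pattern.
OCTET = r'([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])'
IP_RE = re.compile(r'^(' + OCTET + r'\.){3}' + OCTET + r'$')

def tryme(fields):
    """Return only the fields that look like IPv4 addresses."""
    return [field for field in fields if IP_RE.search(field)]

# Hypothetical log lines with nine tab-separated fields each.
sample_lines = [
    "a\tb\tc\td\te\t10.0.0.1\tx\t192.168.1.20\tz",
    "a\tb\tc\td\te\tnot-an-ip\tx\t999.1.1.1\tz",
]

retriever = itemgetter(5, 7)  # fields 5 and 7 of each line
list_of_ips = []
for line in sample_lines:
    # extend() keeps one flat list of matching strings
    list_of_ips.extend(tryme(retriever(line.strip().split("\t"))))

print(list_of_ips)
```

Only the two well-formed addresses from the first line end up in list_of_ips; the fields of the second line are filtered out.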
I am writing a function to grow a tree:
def collect_append(collect,split):
    collect.append(split)
    return collect
def tree(string,passwords,collect): #collect is a list and passwords is also a list
    matching_list = []
    match = 0
    if len(string)==0:
        print(collect)
        return 0
    for j in passwords:
        for i in range(min(len(j),len(string))):
            if string[i]!=j[i]:
                break
    else :
        matching_list.append(j)
        match = match + 1
    if match == 0:
        return 1
    else:
        for split in matching_list:
            x =tree(string.strip(split),passwords,collect_append(collect,split))
        return x
My question is, for each split in matching_list(say two), I want to add different strings to the existing list at that point (i.e. I want two versions of list).
In this case the collect_append function I use is modifying the list in first iteration of the for loop and using the same for further iterations. What I want is to just modify the collect list just for the parameter and without permanently changing it. Is there a way to do this?
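I see two serious errors in your code. First, this else clause does not do what you intend:
for j in passwords:
    for i in range(...):
        if ...:
            break
else:
    ...
Since the break is in the inner for loop, the outer for loop is never exited via a break. A for/else clause runs exactly when its loop finishes without a break, so this else runs once, after every password has been examined, instead of once per matching password. Second, this doesn't do what you want: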
string.strip(split)
You're trying to remove split from the beginning of string but you're removing all the letters in split from both ends of string, bruising it badly. Here's one way to do it correctly:
string[len(split):]
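A small sketch of the difference, using an example string chosen for illustration: str.strip() treats its argument as a set of characters to trim from both ends, while slicing removes exactly len(split) characters from the front.

```python
string = "catcher"
split = "cat"

# strip() removes any of the characters c, a, t from both ends,
# so it also eats the second "c".
stripped = string.strip(split)

# Slicing drops exactly len(split) characters from the front.
sliced = string[len(split):]

print(stripped)  # her
print(sliced)    # cher
```

On Python 3.9 or later, string.removeprefix(split) gives the same result as the slice when string starts with split.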
I'm going to go out on a limb, and rewrite your code to do what I think you want it to do:
def tree(string, passwords, collect):
    length = len(string)
    if length == 0:
        return False
    matching_list = []
    for j in passwords:
        i = min(len(j), length)
        if string[:i] == j[:i]:
            matching_list.append(j)
    if not matching_list:
        return False
    result = False
    for split in matching_list:
        local_collection = [split]
        if split == string or tree(string[len(split):], passwords, local_collection):
            collect.append(local_collection)
            result = True
    return result
collection = []
print(tree('dogcatcher', ['cat', 'catch', 'cher', 'dog', 'dogcat', 'dogcatcher', 'er'], collection))
print(collection)
OUTPUT
% python3 test.py
True
[['dog', ['cat', ['cher']], ['catch', ['er']]], ['dogcat', ['cher']], ['dogcatcher']]
%
Giving you a tree of all the ways to assemble string from the words in passwords.
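On the original question of giving each branch its own version of the list: a minimal sketch, assuming the goal is to list every way of splitting the string into passwords. The expression collect + [split] builds a new list for the recursive call and leaves the caller's list untouched. The function name segmentations and the results parameter are illustrative and not taken from the original code.

```python
def segmentations(string, passwords, collect, results):
    """Append to results every way of writing string as a sequence of passwords."""
    if not string:
        # The whole string has been consumed, so collect is a complete split.
        results.append(collect)
        return
    for split in passwords:
        if string.startswith(split):
            # collect + [split] creates a new list, so sibling branches
            # never see each other's additions.
            segmentations(string[len(split):], passwords, collect + [split], results)

results = []
words = ['cat', 'catch', 'cher', 'dog', 'dogcat', 'dogcatcher', 'er']
segmentations('dogcatcher', words, [], results)
print(results)
```

Passing collect.copy() and appending inside the call would work as well; the concatenation just keeps the copy and the addition in one expression.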
I am using this code: https://pastebin.com/mQkpxdeV
wordlist[overticker] = thesentence[0:spaces]
in this function:
def mediumparser(inpdat3):
    spaceswitch = 0
    overticker = 0
    thesentence = "this sentence is to count the spaces"
    wordlist = []
    while spaceswitch == 0:
        spaces = thesentence.find(' ')
        wordlist[overticker] = thesentence[0:spaces] # this is where we save the words at
        thesentence = thesentence[spaces:len(thesentence)] # this is where we change the sentence for the
                                                           # next run-through
        print('here2')
        print(wordlist)
I can't figure out why it just keeps saying list index out of range.
The program seems to work but it gives an error, what am I doing wrong? I have looked through this book by Mark Lutz for an answer and I can't find one.
The "list index out of range" problem is never caused by list slicing, as shown in this simple test:
>>> l = []
>>> l[0:1200]
[]
>>> l[-400:1200]
[]
so the problem is with your left hand assignment wordlist[overticker] which uses a list access, not slicing, and which is subject to "list index out of range".
Just those 4 lines of your code are enough to find the issue:
wordlist = []
while spaceswitch == 0:
    spaces = thesentence.find(' ')
    wordlist[overticker] = ...
wordlist is just empty. You have to extend/append the list (or use a dictionary if you want to dynamically create items according to a key)
Instead of assigning to wordlist[overticker] while wordlist is an empty list, you will need to use append, since an empty list has no index to assign to.
wordlist.append(thesentence[0:spaces])
Alternatively, you can pre-initialize the list with 20 empty strings.
wordlist = [""]*20
wordlist[overticker] = thesentence[0:spaces]
P.S.
wordlist[overticker] is called indexing, wordlist[1:10] is called slicing.
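Putting the append() advice together, here is a hedged sketch of the word-splitting loop. The function name split_words is illustrative, and the loop exit on find() returning -1 is an addition, since the snippet in the question never changes spaceswitch.

```python
def split_words(sentence):
    """Split sentence on single spaces, collecting the words with append()."""
    wordlist = []
    while True:
        spaces = sentence.find(' ')
        if spaces == -1:
            # No space left: whatever remains is the last word.
            wordlist.append(sentence)
            break
        wordlist.append(sentence[:spaces])
        # Skip past the space so the next find() makes progress.
        sentence = sentence[spaces + 1:]
    return wordlist

print(split_words("this sentence is to count the spaces"))
```

For this input the built-in "...".split(' ') returns the same list, so the loop is mainly useful as an exercise.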
I want to count how many times the same string occurs in a list.
Example
list = ['jack','jeen','jeen']
number_of_jeen = getnumber('jeen',list)
print(number_of_jeen)
Output
2
I have tried this so far
def getnumber(string_var,list):
    if any(string_var in s for s in list):
        print("This user exits !")
There's a built-in method count that does this.
number_of_jeen = list.count('jeen')
Use Counter from collections:
from collections import Counter
list = ['jack','jeen','jeen']
count = Counter(list)
print(count['jeen'])
Try just using a simple counting loop. The idea is to loop through the list you have. In each iteration you check whether the name you are looking at is equal to 'jeen' in this case. If it is, increment your counter; otherwise just move on to the next name. Below is an example:
listNames = ['jack','jeen','jeen']
occurrences = 0
for i in listNames:
    if i == 'jeen':
        occurrences += 1
print(occurrences)
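For completeness, a short sketch of the getnumber() function from the question built on list.count(), next to the Counter variant. The variable name names is an assumption chosen to avoid shadowing the built-in list.

```python
from collections import Counter

def getnumber(string_var, names):
    """Return how many times string_var occurs in names."""
    return names.count(string_var)

names = ['jack', 'jeen', 'jeen']

print(getnumber('jeen', names))   # 2
print(Counter(names)['jeen'])     # 2
print(getnumber('bob', names))    # 0
```

Counter is the better fit when counts for many different names are needed, because it walks the list only once.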
I have been working on a sort of encryption tool in python. This bit of code is for the decryption feature.
The point is to take the given numbers and insert them into a list from where they will be divided by the given keys.
My idea for code is below but I keep getting the out of list index range whenever I try it out. Any suggestions? Keep in mind I'm a beginner:
need = []
detr = raw_input('What would you like decrypted?')
count = 0
for d in detr:
    if (d == '.' or d == '!') or (d.isalpha() or d== " "):
        count +=1
    else:
        need[count].append(d)
The problem is you are attempting to overwrite list values that don't exist.
list.append(item) adds item to the end of list. list[index] = item replaces the existing item at position index; it cannot create a position that does not exist yet.
list = [0,0,0]
list.append(0) # [0,0,0,0]
list[0] = 1 # [1,0,0,0]
list[99] = 1 # IndexError: list assignment index out of range
You should get rid of the count variable entirely. You could append None in the case of d==' ' etc. or just ignore them.
The way I understood your description you want to extract the numbers in a string and append them to a list using a for-loop to iterate over each character.
I think it would be easier doing it with regular expressions (something like r'([\d]+)').
But following joconner's advice to "get rid of the count variable":
need = []
detr = input('What would you like decrypted?\n')
i = iter(detr) # get an iterator
# iterate over the input-string
for d in i:
    numberstr = ""
    try:
        # as long as there are digits
        while d.isdigit():
            # append them to a cache-string
            numberstr += d
            d = next(i)
    except StopIteration:
        # occurs when there are no more characters in detr
        pass
    if numberstr != "":
        # convert the cache-string to an int
        # and append the int to the need-array
        need.append(int(numberstr))
# print the need-array to see what is inside
print(need)
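The regular-expression route mentioned above could look like the following sketch. The function name extract_numbers and the sample input are illustrative; re.findall() with r'\d+' returns every run of consecutive digits.

```python
import re

def extract_numbers(text):
    """Return every run of digits in text as an int."""
    return [int(run) for run in re.findall(r'\d+', text)]

need = extract_numbers("12 abc.34!5 6")
print(need)  # [12, 34, 5, 6]
```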
I have been struggling with managing some data. I have data that I have turned into a list of lists. Each basic sublist has a structure like the following:
<1x>begins
<2x>value-1
<3x>value-2
<4x>value-3
some indeterminate number of other values
<1y>next observation begins
<2y>value-1
<3y>value-2
<4y>value-3
some indeterminate number of other values
this continues for an indeterminate number of times in each sublist
EDIT: I need to get all the occurrences of <2, <3 & <4 separated out and grouped together. I am creating a new list of lists: [[<2x>value-1, <3x>value-2, <4x>value-3], [<2y>value-1, <3y>value-2, <4y>value-3]]
EDIT: All of the lines that follow <4x> and <4y> (and for that matter <4anyalpha>) have the same type of coding, and I don't know a priori how high the numbers can go. Just think of these as SGML tags that are not closed. I used numbers because my fingers were hurting from all the coding I have been doing today.
The solution I have come up with finally is not very pretty
listINeed=[]
for sublist in biglist:
    for line in sublist:
        if '<2' in line:
            var2=line
        if '<3' in line:
            var3=line
        if '<4' in line:
            var4=line
            templist=[]
            templist.append(var2)
            templist.append(var3)
            templist.append(var4)
            listINeed.append(templist)
            templist=[]
            var4=var2=var3=''
I have looked at ways to try to clean this up but have not been successful. This works fine; I just saw this as another opportunity to learn more about Python, because I would think that this should be processable by a one-line function.
itertools.groupby() can get you by.
itertools.groupby(biglist, operator.itemgetter(2))
If you want to pick out the second, third, and fourth elements of each sublist, this should work:
listINeed = [sublist[1:4] for sublist in biglist]
You're off to a good start by noticing that your original solution may work but lacks elegance.
You should parse the string in a loop, creating a new variable for each line.
Here's some sample code:
import re
s = """<1x>begins
<2x>value-1
<3x>value-2
<4x>value-3
some indeterminate number of other values
<1y>next observation begins
<2y>value-1
<3y>value-2
<4y>value-3"""
firstMatch = re.compile(r'^<1\D') # start of any observation: <1x>, <1y>, ...
numMatch = re.compile(r'^<(\d+)')
listIneed = []
templist = None
for line in s.splitlines(): # split on line breaks, not on every space
    if firstMatch.match(line):
        if templist is not None:
            listIneed.append(templist)
        templist = [line]
    elif numMatch.match(line):
        #print('The matching number is %s' % numMatch.match(line).group(1))
        templist.append(line)
if templist is not None: listIneed.append(templist)
print(listIneed)
If I've understood your question correctly:
import re
def getlines(ori):
    matches = re.finditer(r'(<([1-4])[a-zA-Z]>.*)', ori)
    mainlist = []
    sublist = []
    for sr in matches:
        if int(sr.groups()[1]) == 1:
            if sublist != []:
                mainlist.append(sublist)
            sublist = []
        else:
            sublist.append(sr.groups()[0])
    else:
        # for/else: runs once the loop is done, to store the last group
        mainlist.append(sublist)
    return mainlist
...would do the job for you, if you felt like using regular expressions.
The version below would break all of the data down into sublists (not just the first four in each grouping) which might be more useful depending what else you need to do to the data. Use David's listINeed = [sublist[1:4] for sublist in biglist] to get the first four results from each list for the specific task above.
import re
def getlines(ori):
    # \d+ rather than \d*, so int() never receives an empty string
    matches = re.finditer(r'(<(\d+)[a-zA-Z]>.*)', ori)
    mainlist = []
    sublist = []
    for sr in matches:
        if int(sr.groups()[1]) == 1:
            print("1 found!")
            if sublist != []:
                mainlist.append(sublist)
            sublist = []
        else:
            sublist.append(sr.groups()[0])
    else:
        # for/else: runs once the loop is done, to store the last group
        mainlist.append(sublist)
    return mainlist
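Since the question starts from a list of lines rather than one big string, here is a hedged sketch of the same idea applied line by line. The names group_observations, TAG_RE and wanted are illustrative, and the sample sublist is the data from the question.

```python
import re

# A tag is "<", a number, one letter, ">" at the start of the line.
TAG_RE = re.compile(r'^<(\d+)[a-zA-Z]>')

def group_observations(lines, wanted=(2, 3, 4)):
    """Collect the wanted tagged lines of each observation.

    A line tagged <1...> starts a new observation.
    """
    groups = []
    current = None
    for line in lines:
        m = TAG_RE.match(line)
        if not m:
            continue  # untagged line, ignore it
        number = int(m.group(1))
        if number == 1:
            current = []
            groups.append(current)
        elif number in wanted and current is not None:
            current.append(line)
    return groups

sublist = [
    "<1x>begins", "<2x>value-1", "<3x>value-2", "<4x>value-3",
    "some indeterminate number of other values",
    "<1y>next observation begins", "<2y>value-1", "<3y>value-2", "<4y>value-3",
]
list_i_need = group_observations(sublist)
print(list_i_need)
```

For the full biglist, a nested comprehension such as [g for sub in biglist for g in group_observations(sub)] would flatten the per-sublist results into one list of groups.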