I am writing custom code to fetch IP addresses and some other details from a text file.
Suppose the text file has the following content, with a tenant_id and an IP on each line:
cbdf25542c194a069464f69efff4859a 45.45.45.45
cbdf25542c194a069464f69efff4859a 1.6.7.3
cbdf25542c194a069464f69efff4859a 1.7.6.2
1235b3a73ad24b9c86cf301525310b24 2.3.7.5
1235b3a73ad24b9c86cf301525310b24 6.5.2.1
I have already written code that fetches the IPs and the tenant IDs separately.
The code is as follows:
import re

files = open("/root/flattext", "r")
# create empty lists
ips = []
tenants = []
# read through the file
for text in files.readlines():
    # strip off the \n
    text = text.rstrip()
    # IP and tenant fetch (re.findall returns a list, never None)
    regex = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})$', text)
    regex1 = re.findall(r'[0-9A-Za-z]{32}', text)
    if regex and regex not in ips:
        ips.append(regex)
    if regex1 and regex1 not in tenants:
        tenants.append(regex1)
ip_valuess = [''.join(ip) for ip in ips if ip]
tenant_ids = [''.join(tenant) for tenant in tenants if tenant]
# cleanup and close files
files.close()
This gives the IPs and the tenant IDs as two separate lists.
What I need is to fetch the IPs that belong to a particular tenant ID.
For example, given 1235b3a73ad24b9c86cf301525310b24 as the tenant_id,
the result should be
2.3.7.5, 6.5.2.1.
Could someone please take a look and suggest a better way to do this?
Why use regex? A plain split works fine here; just use a defaultdict:
from collections import defaultdict
data = defaultdict(list)
with open(r"D:\ip.txt", 'r') as fl:
    for i in fl.readlines():
        i = i.strip()
        # first field is the tenant ID, second is the IP
        data[i.split(" ")[0]].append(i.split(" ")[1])
print(list(data.items()))
Output:
[('1235b3a73ad24b9c86cf301525310b24', ['2.3.7.5', '6.5.2.1']), ('cbdf25542c194a069464f69efff4859a', ['45.45.45.45', '1.6.7.3', '1.7.6.2'])]
If your file is not structured and there is no space to split on, then try regex:
import re
from collections import defaultdict
data = defaultdict(list)
pattern_ip = r'([\d]{1,3}(?=\.|$))'
pattern_tenat = r'^[a-z0-9]{32}'
with open(r"D:\ip.txt", 'r') as fl:
    for i in fl.readlines():
        i = i.strip()
        # rebuild the IP from its octets and take the 32-character tenant ID
        ip = '.'.join(re.findall(pattern_ip, i))
        tent = ''.join(re.findall(pattern_tenat, i))
        data[tent].append(ip)
print(list(data.items()))
Output:
[('1235b3a73ad24b9c86cf301525310b24', ['2.3.7.5', '6.5.2.1']), ('cbdf25542c194a069464f69efff4859a', ['45.45.45.45', '1.6.7.3', '1.7.6.2'])]
See the live regex demos: DEMOTENANT and DEMOIP.
Use split and defaultdict:
from collections import defaultdict
results = defaultdict(list)
with open('flattext', 'r') as f:
    for row in f.read().strip().split('\n'):
        # skip blank lines
        if row.strip() != "":
            tenant_id, ip = row.split()
            results[tenant_id].append(ip)
print(results.get('1235b3a73ad24b9c86cf301525310b24', None))
print(list(results.items()))
Output:
['2.3.7.5', '6.5.2.1']
And results content:
[
('1235b3a73ad24b9c86cf301525310b24', ['2.3.7.5', '6.5.2.1']),
('cbdf25542c194a069464f69efff4859a', ['45.45.45.45', '1.6.7.3', '1.7.6.2'])
]
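The same grouping can be wrapped in a small function so that the lookup for one tenant is a single dictionary access. This is only a sketch: the name `group_ips_by_tenant` is made up for illustration, the input is an in-memory list standing in for the file's lines, and it assumes every useful line holds exactly two whitespace-separated fields.

```python
from collections import defaultdict


def group_ips_by_tenant(lines):
    """Map each tenant ID to the list of IPs seen for it, in input order."""
    grouped = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        tenant_id, ip = parts
        grouped[tenant_id].append(ip)
    return dict(grouped)


# Sample data copied from the question; a real run would pass an open file.
sample = [
    "cbdf25542c194a069464f69efff4859a 45.45.45.45",
    "cbdf25542c194a069464f69efff4859a 1.6.7.3",
    "cbdf25542c194a069464f69efff4859a 1.7.6.2",
    "1235b3a73ad24b9c86cf301525310b24 2.3.7.5",
    "1235b3a73ad24b9c86cf301525310b24 6.5.2.1",
]
ips_by_tenant = group_ips_by_tenant(sample)
print(ips_by_tenant.get("1235b3a73ad24b9c86cf301525310b24", []))
# prints ['2.3.7.5', '6.5.2.1']
```

Returning a plain dict means a lookup for an unknown tenant does not silently create an empty entry, which a defaultdict would do.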
Related
I have two Excel files: one with a small number of IDs, and another with a large number of IDs plus IP addresses.
How do I compare them line by line and then print a concatenation of the ID (first file) and the ID + IP (second file) cells for the IDs that matched?
Where could I go from here?
import re
router_id = r'[0-9]{5}'  # regex for finding a 5-digit id like '65432'
ip_multi = r'[0-9]+(?:\.[0-9]+){3}'  # for finding an IP address
def parsing_func(filename, method):
    with open(filename, 'r') as file:
        lines = str(file.readlines())
        exp_data = re.findall(method, lines)
    return exp_data
table_id = set(parsing_func('file1.txt', router_id))
table_addr = set(parsing_func('file2.txt', router_id))
table_mk = set(parsing_func('file2.txt', ip_multi))
content = '\n'.join(table_id)
print(content)
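One possible next step is to keep each ID together with the IP found on the same line, instead of collecting IDs and IPs into separate sets. The sketch below assumes both inputs are plain text (as the `.txt` names in the code above suggest) and that each line of the second file carries one 5-digit ID and one IP address. The function name `match_ids` and the sample lines are hypothetical.

```python
import re

ROUTER_ID = re.compile(r'\b[0-9]{5}\b')        # a standalone 5-digit id
IP_ADDR = re.compile(r'[0-9]+(?:\.[0-9]+){3}')  # same IP pattern as above


def match_ids(id_lines, addr_lines):
    """Return (id, ip) pairs from addr_lines whose id also occurs in id_lines."""
    wanted = set()
    for line in id_lines:
        wanted.update(ROUTER_ID.findall(line))

    matched = []
    for line in addr_lines:
        id_match = ROUTER_ID.search(line)
        ip_match = IP_ADDR.search(line)
        if id_match and ip_match and id_match.group() in wanted:
            matched.append((id_match.group(), ip_match.group()))
    return matched


# Hypothetical sample data standing in for file1.txt and file2.txt.
ids = ["65432", "11111"]
ids_with_ips = ["65432 10.0.0.1", "22222 10.0.0.2", "11111 192.168.1.5"]
for router, ip in match_ids(ids, ids_with_ips):
    print(router, ip)
```

Searching line by line keeps the association between an ID and its address, which is lost once everything is flattened into sets.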
Background Information
I have a program that I'm using for pinging a service and printing the results back to a window. I'm currently trying to extend this program with a kind of 'settings' file that users can edit to change a) the host that is pinged and b) the timeout.
What I've tried so far
file = open("file.txt", "r")
print (file.read())
settings = file.read()
# looking for the value of 'host'
pattern = 'host = "(.*)'
variable = re.findall(pattern, settings)[0]
print(test)
As for what is contained within the file.txt file:
host = "youtube.com"
pingTimeout = "1"
However, my attempts have been unsuccessful as this comes up with the following
error:
IndexError: list index out of range
And so, my question is:
Can anyone point me in the right direction? To recap, I am asking how I can take an input from a file (in this case host = "youtube.com") and save it as a variable 'host' within the Python file.
First, as Patrick Haugh pointed out, you can't call read() twice on the same file object. Second, using regex to parse a simple key = value format is a bit overkill.
host, pingTimeout = None, None  # Maybe initialize these to a default value
with open("settings.txt", "r") as f:
    for line in f:
        key, value = line.strip().split(" = ")
        if key == 'host':
            host = value
        if key == 'pingTimeout':
            pingTimeout = int(value)
print(host, pingTimeout)
Note that the expected input format would have no quotes for the example code above.
host = youtube.com
pingTimeout = 1
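If the settings file keeps the quotes shown in the question, a small variation can strip them while parsing. This is only a sketch: `parse_settings` is a hypothetical helper, it assumes one `key = value` pair per line, and it returns every value as a string.

```python
def parse_settings(lines):
    """Parse 'key = value' lines into a dict, dropping optional double quotes."""
    settings = {}
    for line in lines:
        line = line.strip()
        if not line or "=" not in line:
            continue  # ignore blank lines and lines without a key/value pair
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


# The two lines from the question's file.txt, used here as in-memory input.
settings = parse_settings(['host = "youtube.com"', 'pingTimeout = "1"'])
host = settings["host"]
ping_timeout = int(settings["pingTimeout"])
print(host, ping_timeout)
```

Because `partition` splits on the first `=` only, spacing around the sign does not matter, and the same function accepts the unquoted format as well.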
I tried this; it may help:
import re
filename = "<your text file with hostname>"
host = None
with open(filename) as f:
    lines = f.read().splitlines()
for line in lines:
    if re.search('host', line):
        key, val = line.split('=')
        # drop the surrounding whitespace and the quotes
        host = val.strip().replace("\"", "")
        break
print(host)
I've developed a program that stores a list of ids, like so:
For my purposes, however, the data should be in sequential form, so that in the first pair of ids "889926212541448192" becomes 1 and "889919950248448000" becomes 2. That is, the resulting file should be something like:
Here the first id connects with 2, 3 and 6, and id 4 connects only with 5, forming a network.
I have no experience in this area, and I cannot find a way to read the data like this.
I tried writing some programs, but they only read row by row and not column by column, id to id. The data is saved by the following program:
import json
arq = open('ids.csv','w')
arq.write('Source'+','+'Target')
arq.write("\n")
lista_rede = [] #list to store all ids
with open('dados_twitter.json', 'r') as f:
    for line in f:
        lista = []
        tweet = json.loads(line) # parse the line into a Python dictionary
        lista = list(tweet.keys()) # list of the tweet's keys
        try:
            if 'retweeted_status' in lista:
                id_rt = json.dumps(tweet['retweeted_status']['id_str'])
                id_status = json.dumps(tweet['id_str'])
                lista_rede.append(tweet['id_str'])
                lista_rede.append(tweet['retweeted_status']['id_str'])
                arq.write( id_status +','+ id_rt )
                arq.write("\n")
            if 'quoted_status' in lista: # test for the key, not its value
                id_rt = json.dumps(tweet['quoted_status']['id_str'])
                id_status = json.dumps(tweet['id_str'])
                lista_rede.append(tweet['id_str'])
                lista_rede.append(tweet['quoted_status']['id_str'])
                arq.write( id_status +','+ id_rt )
                arq.write("\n")
        except:
            continue
arq.close()
As a result, I have a file with the ids in pairs of interactions.
How can I rearrange this data when reading it, or how should I write it instead? In Python or another language?
The following snippet would do the job:
import re
header = ''
id_dict = {}
# read the ids
with open('ids.csv') as fr:
    header = fr.readline()
    for line in fr:
        ids = [int(s) for s in re.findall(r'\d+', line)]
        try:
            id_dict[int(ids[0])].append(int(ids[1]))
        except KeyError:
            id_dict[int(ids[0])] = [int(ids[1])]
# sort the ids
for key in id_dict:
    id_dict[key].sort()
# save the sorted ids in a new file
with open('ids_sorted.txt', 'w') as fw:
    # fw.write(header)
    for key in sorted(id_dict):
        for value in id_dict[key]:
            fw.write("{0} {1}\n".format(key, value))
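The question also asks for the long ids to be replaced by sequential numbers. A minimal sketch of that relabeling step follows. It assumes ids are numbered in order of first appearance, which the question does not state explicitly, and `relabel_edges` is a hypothetical name. Apart from the first two ids, which come from the question, the sample ids are made up.

```python
def relabel_edges(pairs):
    """Replace each distinct id with 1, 2, 3, ... in order of first appearance."""
    mapping = {}
    edges = []
    for source, target in pairs:
        for node in (source, target):
            if node not in mapping:
                mapping[node] = len(mapping) + 1
        edges.append((mapping[source], mapping[target]))
    return edges, mapping


# Ids are kept as strings so that no precision is lost on very long values.
pairs = [
    ("889926212541448192", "889919950248448000"),
    ("889926212541448192", "889900000000000001"),  # made-up id
    ("889900000000000002", "889900000000000003"),  # made-up ids
]
edges, mapping = relabel_edges(pairs)
for source, target in edges:
    print(source, target)
```

The returned `mapping` can be saved alongside the edge list so the sequential numbers can be translated back to the original ids later.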
I have a snort log file named "logs" and want to extract IP addresses from it and store them in another file named "blacklist". The program extracts unique IP addresses, but if I run it again, it adds the previous IPs a second time. I want the program to first check whether an IP is already in the blacklist file; if so, it should ignore it, and otherwise add the unique IPs from the logs file to the blacklist. Code:
#!/usr/bin/python
import re
mylist1 = []
mylist2 = []
mylist3 = []
mylist4 = []
logfile = open('/var/log/snort/logs', 'r')
blklist = open('blacklist', 'ab+')
for line in open ('blacklist', 'r').readlines():
    mylist4.append(line)
for l in logfile.readlines():
    l = l.rstrip()
    ip = re.findall(r'[0-9]+(?:\.[0-9]+){3}',l)
    if ip is not None and ip not in mylist1:
        mylist1.append(ip)
for ip in mylist1:
    addr = ",".join(ip)
    if ',' in addr:
        a = addr.split(',')
        for ip in a:
            addr = "".join(ip)
            if addr is not '':
                mylist2.append(addr)
            else:
                mylist3.append(addr)
for x in blklist:
    mylist2.append(x.strip())
for x in mylist2:
    if x not in mylist3 and x not in mylist4:
        blklist.write(x+'\n')
        mylist3.append(x)
Logs file is:
12/16-10:34:27.070967 [**] [1:10000001:1] snort alert [1:0000001] [**][classification ID: 0] [Priority ID: 0] {ICMP} 192.168.40.19 -> 192.168.50.29
12/16-10:34:27.070967 [**] [1:10000001:1] snort alert [1:0000001] [**][classification ID: 0] [Priority ID: 0] {ICMP} 192.168.50.29 -> 192.168.30.20
Output of blacklist file after first program run:
192.168.30.20
192.168.50.29
192.168.40.19
Output of blacklist file after second program run:
192.168.30.20
192.168.50.29
192.168.40.19
192.168.30.20
192.168.50.29
192.168.40.19
Any help, please?
You can read everything from your blacklist file and your log into lists. Join those lists and then output a set back to the blacklist file (a set holds only unique values). Since opening the file with 'w+' empties it, you will end up with a unique list of all new and old IPs. If the order matters (I doubt it does), then a set will cause issues. Let me know and I can revamp the below.
if __name__ == '__main__':
    import re
    blacklist = list(open("blacklist", 'r').read().split('\n'))
    logfile = list(open("/var/log/snort/logs", 'r').read().split('\n'))
    newentry = []
    for entry in logfile:
        ips = re.findall(r'[0-9]+(?:\.[0-9]+){3}', entry)
        for ip in ips:
            newentry.append(ip)
    newblacklist = blacklist + newentry
    with open("blacklist", 'w+') as f:
        # drop empty strings left over from splitting on '\n'
        f.write('\n'.join(set(ip for ip in newblacklist if ip)))
You could utilize the Python container type set which stores only unique elements. The procedure below should work for you:
create a 'current' blacklist set
read the blacklist file IP's into the current set
create a 'delta' blacklist set
for each IP address in the log file
    if not already in current blacklist
        add the IP into the delta set
append (by writing) the delta set into the black list file
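The steps above can be sketched in Python roughly as follows. The function name `find_new_ips` is hypothetical, the inputs are in-memory lists standing in for the two files, and the IP pattern is the one used in the question.

```python
import re

IP_PATTERN = re.compile(r'[0-9]+(?:\.[0-9]+){3}')


def find_new_ips(blacklist_lines, log_lines):
    """Return the sorted IPs found in log_lines that are not yet blacklisted."""
    # 'current' set: what the blacklist already holds (newlines stripped)
    current = {line.strip() for line in blacklist_lines if line.strip()}
    # 'delta' set: addresses seen in the log but missing from the blacklist
    delta = set()
    for line in log_lines:
        for ip in IP_PATTERN.findall(line):
            if ip not in current:
                delta.add(ip)
    return sorted(delta)


# The two log lines from the question.
log = [
    "12/16-10:34:27.070967 [**] [1:10000001:1] snort alert [1:0000001] [**]"
    "[classification ID: 0] [Priority ID: 0] {ICMP} 192.168.40.19 -> 192.168.50.29",
    "12/16-10:34:27.070967 [**] [1:10000001:1] snort alert [1:0000001] [**]"
    "[classification ID: 0] [Priority ID: 0] {ICMP} 192.168.50.29 -> 192.168.30.20",
]
first_run = find_new_ips([], log)
# Simulate the second run: the blacklist now holds what the first run wrote.
second_run = find_new_ips([ip + "\n" for ip in first_run], log)
print(first_run)
print(second_run)
```

In a real script the returned list would be appended to the blacklist file opened in 'a' mode. Stripping the newline from each blacklist line before comparing is what keeps the second run from adding anything.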
I'd like to add only the servers that don't already exist in the file.
My current code:
f = open(filename,'a')
for server in cmo.getServers() :
    print >>f, server.getListenAddress()
Thanks in advance
Try this:
data = set( [i.strip() for i in open( filename, 'r' ).readlines()] )
for server in cmo.getServers() :
    data.add( server.getListenAddress() )
open( filename, 'w' ).write('\n'.join(data))
Build a list of servers already present in the file:
present = [l.strip() for l in open(filename)]
(assuming the file format is just one server per line, no other symbols).
Then check if an address is in the list:
for server in cmo.getServers():
    address = server.getListenAddress()
    if address not in present:
        print >>f, address
This assumes that the addresses you get from getServers() will not repeat.
If that's also possible, then build a set of them first:
new = set(server.getListenAddress() for server in cmo.getServers())
for address in new:
    if address not in present:
        print >>f, address
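If the order of the addresses should be kept as well, the check can be written with a set for fast lookup and a list for the result. This is a sketch in plain Python 3 rather than the scripting environment used above: `addresses_to_append` is a hypothetical helper, and the list of addresses stands in for what `server.getListenAddress()` would return for each server.

```python
def addresses_to_append(existing_lines, addresses):
    """Return the addresses missing from the file, in order, without repeats."""
    seen = {line.strip() for line in existing_lines if line.strip()}
    to_append = []
    for address in addresses:
        if address not in seen:
            seen.add(address)  # also guards against repeats in the input
            to_append.append(address)
    return to_append


# Hypothetical sample data: two addresses already in the file, three incoming.
existing = ["10.0.0.1\n", "10.0.0.2\n"]
incoming = ["10.0.0.2", "10.0.0.3", "10.0.0.3"]
print(addresses_to_append(existing, incoming))
# prints ['10.0.0.3']
```

The result can then be written to the file opened in append mode, one address per line.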