I have a file that I read lines from and manipulate the strings.
Here is an example of a few of the lines (the file is Intel's HEX format if you're interested):
:10DE50003EDE179280DB0338D2C32202023CD2D3CB
:10DE600022021792A0DB0338E2C32202023CE2D373
:10DE7000220292533EDEB0906400C4FF022082432F
:10DE80003EDE3741324190C3B8240013FDDBFF056D
:10DE900057494D453420D0D88CDEFDDB8FDEFF03A3
A buddy suggested I create an array with the first 4:7 bytes as the index, EG, DE50, then use the remaining 16 bytes as the data (00 after DE50 is not used, and last byte is not used). He said I could use HEX and add let's say, 10 to the DE50 to get DE5A and therefore locate the byte associated with that index. Problem is, I can't figure out a way to do that. Is it even possible? This would allow me to then address any byte I want by knowing the HEX index which would be really powerful.
Thank you!
There is an Intel HEX package on PyPI (intelhex); you should probably look at that first.
Here are some examples copied from the docs.
Once created, an IntelHex object can be loaded with data. This is only
necessary if “source” was unspecified in the constructor. You can also
load data several times (but if addresses in those files overlap you
get exception AddressOverlapError). This error is only raised when
reading from hex files. When reading from other formats, without
explicitly calling merge, the data will be overwritten. E.g.:
>>> from intelhex import IntelHex
>>> ih = IntelHex() # create empty object
>>> ih.loadhex('foo.hex') # load from hex
>>> ih.loadfile('bar.hex',format='hex') # also load from hex
>>> ih.fromfile('bar.hex',format='hex') # also load from hex
NOTE: using IntelHex.fromfile is the recommended way.
All of the above examples will read from HEX files. IntelHex also
supports reading straight binary files. For example:
>>> from intelhex import IntelHex
>>> ih = IntelHex() # create empty object
>>> ih.loadbin('foo.bin') # load from bin
>>> ih.fromfile('bar.bin',format='bin') # also load from bin
>>> ih.loadbin('baz.bin',offset=0x1000) # load binary data and place them
>>> # starting with specified offset
Finally, data can be loaded from an appropriate Python dictionary.
This will permit you to store the data in an IntelHex object to a
builtin dictionary and restore the object at a later time. For
example:
>>> from intelhex import IntelHex
>>> ih = IntelHex('foo.hex') # create object loaded from foo.hex
>>> pydict = ih.todict() # dump contents to pydict
...do something with the dictionary...
>>> newIH = IntelHex(pydict) # recreate object with dict
>>> another = IntelHex() # make a blank instance
>>> another.fromdict(pydict) # now another is the same as newIH
You're on the right track here, but you can't have an "array" indexed by hex characters. Arrays, and lists, are always indexed by integers, starting with 0.
If you know the initial offset (which you do, from the first line), you can make an index very easily. For example, everything from 'DE50' to 'DE5F' should be line #0, right? So, convert the hex address to an integer, divide by 16 (truncating fractions), and subtract the first line's own value computed the same way (0xDE50 // 16). Like this:
with open('hexfile.txt') as f:
    lines = list(f)
# The 4-digit address is characters 3..6: after the ':' and the 2-digit byte count.
offset = int(lines[0][3:7], 16) // 16
def get_line(hex_index):
    index = int(hex_index, 16) // 16
    return lines[index - offset]
Alternatively, you could use a dict keyed off the hex indices, instead of a list, and then do what your friend suggested:
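with open('hexfile.txt') as f:
    # Key each line by its 4-digit address field, e.g. 'DE50'.
    lines = {line[3:7]: line for line in f}
def get_line(hex_index):
    # Round down to the start of the 16-byte line, e.g. 'DE5A' -> 'DE50'.
    base_hex_index = hex_index[:3] + '0'
    return lines[base_hex_index]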
However, this seems to be just adding extra complexity to your data structure for no benefit. If you've got sequential lines, just treat them sequentially. And if you've got numbers as hex strings, just convert them to numbers to treat them as indices.
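If the goal is to look up any single byte by its address (the friend's suggestion), one option is to unpack every data record into a dict keyed by integer address. This is a minimal sketch, assuming plain data records (type 00) with 16-bit addresses as in the sample lines. parse_hex_records is a hypothetical helper name, and the checksum is not verified here:

```python
def parse_hex_records(lines):
    """Map each absolute address to its byte value (data records only)."""
    memory = {}
    for line in lines:
        line = line.strip()
        if not line.startswith(":"):
            continue  # skip blank or malformed lines
        count = int(line[1:3], 16)        # number of data bytes in the record
        address = int(line[3:7], 16)      # 16-bit start address
        record_type = int(line[7:9], 16)  # 00 means "data"
        if record_type != 0:
            continue
        data = bytes.fromhex(line[9:9 + 2 * count])
        for i, value in enumerate(data):
            memory[address + i] = value
    return memory


sample = [
    ":10DE50003EDE179280DB0338D2C32202023CD2D3CB",
    ":10DE600022021792A0DB0338E2C32202023CE2D373",
]
memory = parse_hex_records(sample)
byte_at = memory[0xDE50 + 0x0A]  # the byte stored at address 0xDE5A
print(hex(byte_at))
```

Because the keys are ordinary integers, "add 10 to DE50" is just integer arithmetic; hex is only a way of writing the number.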
How can I replace multiple bytes in a bytearray? For example:
b"\x00\x01\x02\x03\x04\x05"
I want to replace \x02\x03 with \xFF\xFF and \x04\x05 with \xEE\xEE. How can I do this all at once?
The replace method also works on bytes (and bytearray) objects in Python:
a = b"\x00\x01\x02\x03\x04\x05"
b = a.replace(b"\x02\x03", b"\xFF\xFF").replace(b"\x04\x05", b"\xEE\xEE")
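Since the question asks about a bytearray, here is a small sketch of the same idea driven by a mapping of replacements. The replacements dict is an illustrative name, not part of any API. Note that the replacements run one after another, so a later pattern could match bytes produced by an earlier one:

```python
data = bytearray(b"\x00\x01\x02\x03\x04\x05")

# Hypothetical table of old -> new byte sequences.
replacements = {
    b"\x02\x03": b"\xFF\xFF",
    b"\x04\x05": b"\xEE\xEE",
}

for old, new in replacements.items():
    # bytearray.replace returns a new bytearray; it does not modify in place.
    data = data.replace(old, new)

print(data)
```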
so I am capturing packets with Pydivert. I can print out the full packet payload by using
print(packet.tcp.payload)
OR
print(packet.payload)
output was
b'\x03\x00\x34\xe2\xd1' //continued like this
same output in both cases. I printed out the type by using
print(type(packet.payload))
This showed the type to be
<class 'bytes'>
I would like to take say the first 10 byte positions from the output and type it out and also save it into a variable so when I'm modifying the payload, I exclude the initial bytes and then modify the remaining parts. So I can somehow attach the separated out bytes to my newly created bytes to create a final byte stream like for example:
TotalByteStream = (initial bytes which I separated out) + b'\x03\x00\x34\xe2\xd1\x78\x23\x45\x79' //continued like this as needed
//And then do
packet.payload = TotalByteStream
Is this possible?
I'm not sure I understand your question, but you can manipulate bytes in a manner similar to strings.
If you have your original payload:
>>> payload_1 = b'\x03\x00\xf4\xe2\xd1'
>>> type(payload_1)
<class 'bytes'>
>>> payload_1
b'\x03\x00\xf4\xe2\xd1'
You can slice off the first few bytes:
>>> part = payload_1[:2]
>>> part
b'\x03\x00'
And later create a new payload where you prepend the part variable
>>> payload_2 = part + b'\xf5\xe5\xd5'
>>> payload_2
b'\x03\x00\xf5\xe5\xd5'
>>> payload_1
b'\x03\x00\xf4\xe2\xd1'
So you get a new payload with the same starting bytes. Does this answer your question? Or did I misunderstand your issue?
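Applied to the "first 10 bytes" case from the question, the same slicing might look like the sketch below. rebuild_payload is a hypothetical helper, and the sample bytes are made up for illustration:

```python
def rebuild_payload(original, header_len, new_body):
    """Keep the first header_len bytes of original and attach new_body."""
    header = original[:header_len]  # slicing bytes yields bytes
    return header + new_body


original = b"\x03\x00\x34\xe2\xd1\x78\x23\x45\x79\x10\x11\x12"
header = original[:10]                      # saved for later use
new_payload = rebuild_payload(original, 10, b"\xaa\xbb")
print(header)
print(new_payload)
```

The result is an ordinary bytes object, so it could then be assigned back to the payload as described in the question.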
I am currently using Python to transmit a Python dictionary from one Raspberry Pi to another over a 433 MHz link, using virtual wire (vw.py) to send data.
The issue with vw.py is that data being sent is in string format.
I am successfully receiving the data on PI_no2, and now I am trying to reformat the data so it can be placed back in a dictionary.
I have created a small snippet to test with, and created a temporary string in the same format it is received as from vw.py
So far I have successfully split the string at the colon, and I am now trying to get rid of the double quotes, without much success.
my_status = {}
#temp is in the format the data is received
temp = "'mycode':['1','2','firstname','Lastname']"
key,value = temp.split(':')
print key
print value
key = key.replace("'",'')
value = value.replace("'",'')
my_status.update({key:value})
print my_status
Gives the result
'mycode'
['1','2','firstname','Lastname']
{'mycode': '[1,2,firstname,Lastname]'}
I require the value to be in the format
['1','2','firstname','Lastname']
but the replace gets rid of all the single quotes.
You can use ast.literal_eval
import ast
temp = "'mycode':['1','2','firstname','Lastname']"
key,value = map(ast.literal_eval, temp.split(':'))
status = {key: value}
Will output
{'mycode': ['1', '2', 'firstname', 'Lastname']}
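A variation on the same idea, shown only as a sketch: wrapping the received text in braces turns it into a dict literal, so ast.literal_eval can parse it in one step without splitting on the colon. This assumes the received string always has the 'key':value shape from the question:

```python
import ast

temp = "'mycode':['1','2','firstname','Lastname']"

# Adding braces makes the text a valid Python dict literal.
status = ast.literal_eval("{" + temp + "}")
print(status)
```

This also avoids trouble if a value ever contains a colon, since nothing is split by hand.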
This shouldn't be hard to solve. What you need to do is strip away the [ ] in your list string, then split by ,. Once you've done this, iterate over the elements and add them to a list. Your code should look like this:
string = "[1,2,firstname,lastname]"
string = string.strip("[")
string = string.strip("]")
values = string.split(",")
final_list = []
for val in values:
    final_list.append(val)
print final_list
This will print:
['1', '2', 'firstname', 'lastname']
Then take this list and insert it into your dictionary:
d = {}
d['mycode'] = final_list
The advantage of this method is that you can handle each value independently. If you need to convert 1 and 2 to int then you'll be able to do that while leaving the other two as str.
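As a rough sketch of that per-value handling, the loop can convert the purely numeric items to int and leave the rest as str. The isdigit check is an assumption about what counts as a number here; it only accepts non-negative integers:

```python
raw = "[1,2,firstname,lastname]"

# Drop the surrounding brackets, then split on commas.
values = raw.strip("[]").split(",")

# Convert items made only of digits; keep everything else as text.
converted = [int(v) if v.isdigit() else v for v in values]
print(converted)
```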
As an alternative to cricket_007's suggestion of using a syntax tree parser: your format is very similar to the standard YAML format. This is a pretty lightweight and intuitive framework, so I'll suggest it:
import yaml  # PyYAML
a = "'mycode':['1','2','firstname','Lastname']"
print yaml.safe_load(a.replace(":",": "))
# prints the dictionary {'mycode': ['1', '2', 'firstname', 'Lastname']}
The only thing that's different between your format and YAML is that the colon needs a space after it.
It also will distinguish between primitive data types for you, if that's important. Drop the quotes around 1 and 2 and it determines that they're numerical.
Tadhg McDonald-Jensen suggested pickling in the comments. This will allow you to store more complicated objects, though you may lose the human-readable format you've been experimenting with
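A minimal sketch of the pickling route, assuming the link can only carry text (as described in the question), so the pickled bytes are hex-encoded before sending. The variable names are illustrative, and unpickling should only be done on data from a sender you trust:

```python
import pickle

status = {'mycode': ['1', '2', 'firstname', 'Lastname']}

# Sender side: dict -> bytes -> hex text that can travel as a string.
blob = pickle.dumps(status)
text = blob.hex()

# Receiver side: hex text -> bytes -> dict.
restored = pickle.loads(bytes.fromhex(text))
print(restored)
```

Hex encoding doubles the size of the message, which may matter on a slow radio link.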
This is similar to, Python creating dynamic global variable from list, but I'm still confused.
I get lots of flo data in a semi-proprietary format. I've already used Python to strip the data to my needs and save it into a JSON file called badactor.json, in the following format:
[saddr as a integer, daddr as a integer, port, date as Julian, time as decimal number]
An arbitrary example [1053464536, 1232644361, 2222, 2014260, 15009]
I want to go through my weekly/monthly flo logs and save everything by Julian date. To start I want to go through the logs and create a list that is named according to the Julian date it happened, i.e, 2014260 and then save it to the same name 2014260.json. I have the following, but it is giving me an error:
#!/usr/bin/python
import sys
import json
import time
from datetime import datetime
import calendar
#these are variables I've had to use throughout, kind of boilerplate for now
x=0
templist2 = []
templist3 = []
templist4 = []
templist5 = []
bad = {}
#this is my list of "bad actors", list is in the following format
#[saddr as a integer, daddr as a integer, port, date as Julian, time as decimal number]
#or an arbitrary example [1053464536, 1232644361, 2222, 2014260, 15009]
badactor = 'badactor.json'
with open(badactor, 'r') as f1:
    badact = json.load(f1)
    f1.close()
for i in badact:
    print i[3] #troubleshooting to verify my value is being read in
    tmp = str(i[3])
    print tmp#again just troubleshooting
    tl=[i[0],i[4],i[1],i[2]]
    bad[tmp]=bad[tmp]+tl
    print bad[tmp]
Trying to create the variable is giving me the following error:
Traceback (most recent call last):
File "savetofiles.py", line 39, in <module>
bad[tmp]=bad[tmp]+tl
KeyError: '2014260'
By the time your code is executed, there is no key "2014260" in the "bad" dict.
Your problem is here:
bad[tmp]=bad[tmp]+tl
You're saying "add tl to something that doesn't exist."
Instead, you seem to want to do:
bad[tmp]=tl
I suggest you initialize bad to be an empty collections.defaultdict instead of just regular built-in dict. i.e.
import collections
...
bad = collections.defaultdict(list)
That way, initial empty list values will be created for you automatically the first time a date key is encountered and the error you're getting from the bad[tmp]=bad[tmp]+tl statement will go away since it will effectively become bad[tmp]=list()+tl — where the list() call just creates and returns an empty list — the first time a particular date is encountered.
It's also not clear whether you really need the tmp = str(i[3]) conversion, because values of any hashable type (which includes immutable built-ins such as int) are valid dictionary (or defaultdict) keys, not just strings (assuming i[3] isn't a string already). Regardless, subsequent code would be more readable if you named the result something else, like julian_date = i[3] (or julian_date = str(i[3]) if the conversion really is required).
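Putting those suggestions together, a grouping loop might look like the sketch below. The first record is the example from the question and the other two are made up for illustration. It uses append so that each record stays a separate sublist, which is an assumption about the desired result (the original + would merge everything into one flat list):

```python
from collections import defaultdict

# [saddr, daddr, port, Julian date, time]; only the first row is from the question.
badact = [
    [1053464536, 1232644361, 2222, 2014260, 15009],
    [1053464537, 1232644362, 80, 2014260, 15010],
    [1053464538, 1232644363, 443, 2014261, 15011],
]

bad = defaultdict(list)  # missing keys start out as an empty list
for saddr, daddr, port, julian_date, when in badact:
    bad[julian_date].append([saddr, when, daddr, port])

for julian_date, records in bad.items():
    print(julian_date, records)
```

Each key in bad could then be written to its own file, for example with json.dump.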
I would like to get the size of a populated dictionary in python. I tried this:
>>> dict = {'1':1}
>>> import sys
>>> print dict
{'1': 1}
>>> sys.getsizeof(dict)
140
but this apparently wouldn't do it. The return value I'd expect is 2 (Bytes). I'll have a dictionary with contents like:
{'L\xa3\x93': '\x15\x015\x02\x00\x00\x00\x01\x02\x02\x04\x1f\x01=\x00\x9d\x00^\x00e\x04\x00\x0b', '\\\xe7\xe6': '\x15\x01=\x02\x00\x00\x00\x01\x02\x02\x04\x1f\x01B\x00\xa1\x00_\x00c\x04\x02\x17', '\\\xe8"': '\x15\xff\x1d\x02\x00\x00\x00\x01\x02\x02\x04\x1f\x01:\x00\x98\x00Z\x00_\x04\x02\x0b', '\\\xe6#': '\x15\x014\x02\x00\x00\x00\x01\x02\x02\x04\x1f\x01#\x00\x9c\x00\\\x00b\x04\x00\x0b'}
and I want to know how many Bytes of data I need to send. my index is 6 Bytes but how long is the content? I know here it's 46Bytes per index, so I'd like to know that I need to transmit 4*(6+46) Bytes.... How do I do this best?
Thanks,
Ron
So, does only this give me the real length when I need to transmit the content Byte by Byte?
from binascii import hexlify
#non_mem_macs is my dictionary
non_mem_macs_len = 0
for idx in non_mem_macs:
    non_mem_macs_len += len(hexlify(idx))
    non_mem_macs_len += len(hexlify(non_mem_macs[idx]))
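For comparison, here is a sketch that computes both counts, assuming Python 3 bytes keys and values. sys.getsizeof reports the memory used by the dict object itself, not the length of its contents, so len is the tool to use. hexlify produces two hex characters per byte, so the hex-encoded length is exactly twice the raw length. payload_lengths and the sample data are hypothetical:

```python
from binascii import hexlify


def payload_lengths(table):
    """Return (raw_bytes, hex_chars) needed to send every key and value."""
    raw = sum(len(key) + len(value) for key, value in table.items())
    hex_chars = sum(
        len(hexlify(key)) + len(hexlify(value)) for key, value in table.items()
    )
    return raw, hex_chars


# Made-up sample: 3-byte keys with 23-byte values, as in the question's data.
non_mem_macs = {
    b"L\xa3\x93": b"\x15" * 23,
    b"\\\xe7\xe6": b"\x16" * 23,
}
raw_len, hex_len = payload_lengths(non_mem_macs)
print(raw_len, hex_len)
```

With 3-byte keys and 23-byte values, each entry is 26 raw bytes, or 6 + 46 = 52 characters once hex-encoded, which matches the figures in the question.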