Unique session id in Python

How do I generate a unique session id in Python?

UPDATE: 2016-12-21
A lot has happened in the last ~5 years. /dev/urandom has been updated and is now considered a high-entropy source of randomness on modern Linux kernels and distributions. In the last 6 months we've seen entropy starvation on a Linux 3.19 kernel using Ubuntu, so I don't think this issue is "resolved", but it is now sufficiently difficult to end up with low-entropy randomness when asking for any amount of randomness from the OS.
I hate to say this, but none of the other solutions posted here are correct with regards to being a "secure session ID."
# pip install M2Crypto
import base64, M2Crypto
def generate_session_id(num_bytes = 16):
    return base64.b64encode(M2Crypto.m2.rand_bytes(num_bytes))
Neither uuid() nor os.urandom() is a good choice for generating session IDs. Both may generate random-looking results, but random does not mean secure when the underlying entropy is poor. See "How to Crack a Linear Congruential Generator" by Haldir or NIST's resources on Random Number Generation. If you still want to use a UUID, then use a UUID that was generated with a good initial random number:
import uuid, M2Crypto
# A UUID needs exactly 16 bytes
uuid.UUID(bytes = M2Crypto.m2.rand_bytes(16))
# UUID('5e85edc4-7078-d214-e773-f8caae16fe6c')
or:
# pip install pyOpenSSL
import uuid, OpenSSL
uuid.UUID(bytes = OpenSSL.rand.bytes(16))
# UUID('c9bf635f-b0cc-d278-a2c5-01eaae654461')
M2Crypto is the best OpenSSL API in Python at the moment, as pyOpenSSL appears to be maintained only to support legacy applications.

You can use the uuid library like so:
import uuid
my_id = uuid.uuid1() # or uuid.uuid4()

Python 3.6 makes most other answers here a bit out of date. Python 3.6 and later include the secrets module, which is designed for precisely this purpose.
If you need to generate a cryptographically secure string for any purpose on the web, refer to that module.
https://docs.python.org/3/library/secrets.html
Example:
import secrets
def make_token():
    """
    Creates a cryptographically-secure, URL-safe string
    """
    return secrets.token_urlsafe(16)
In use:
>>> make_token()
'B31YOaQpb8Hxnxv1DXG6nA'

import os, base64
def generate_session():
    return base64.b64encode(os.urandom(16))

It can be as simple as creating a random number. Of course, you'd have to store your session IDs in a database or something and check each one you generate to make sure it's not a duplicate, but odds are it never will be if the numbers are large enough.
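The approach described above could be sketched roughly as follows. This is only an illustration: the existing_ids set stands in for whatever database or store actually holds the sessions, and the function name new_session_id is hypothetical.

```python
import secrets


def new_session_id(existing_ids, num_bytes=16):
    """Return a random hex ID that is not already in existing_ids.

    existing_ids is a plain set here (an assumption); in a real
    application it would be a lookup against the session store.
    """
    while True:
        candidate = secrets.token_hex(num_bytes)  # 2 hex chars per byte
        if candidate not in existing_ids:
            existing_ids.add(candidate)
            return candidate


store = set()
print(new_session_id(store))
```

With 16 random bytes the retry loop is expected to run exactly once in practice; the duplicate check is a safety net rather than the main source of uniqueness.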

Related

Is it "cryptographically secure" to seed MicroPython pseudo random number generator with an input from os.urandom?

I am trying to generate true random numbers that would be considered cryptographically secure in MicroPython (a variant of Python that is used for microcontrollers). MicroPython does not currently support Python's secrets library.
I understand that I can use os.urandom to generate cryptographically secure random numbers, but would like to bring in the conveniences of setting minimums, maximums, ranges, choices, etc... that are available in Python's (and MicroPython's) random library.
In order to do this, I am contemplating "seeding" the pseudo random number generator with a sufficiently large input from os.urandom (please see example code below). This code considers some of the concepts described here: https://stackoverflow.com/a/72908523/17870197
What are the security implications of this approach? Would numbers output by this code be considered cryptographically secure?
import os
import random
count = 4
def generate_true_random_int(min_int, max_int):
    seed_bytes = os.urandom(32)
    seed_int = int.from_bytes(seed_bytes, "big")
    random.seed(seed_int)
    return random.randint(min_int, max_int)
for x in range(count):
    min_int = 1
    max_int = 9999
    true_random_int = generate_true_random_int(min_int, max_int)
    print(true_random_int)
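One way to sidestep the seeding question might be to skip the random module entirely and derive bounded integers straight from os.urandom with rejection sampling. The sketch below is an assumption-laden illustration, not a vetted implementation: the name secure_randint is hypothetical, and it has only been reasoned about against CPython semantics, so a given MicroPython port may need adjustments.

```python
import os


def secure_randint(min_int, max_int):
    """Return an integer in [min_int, max_int] drawn directly from os.urandom."""
    if min_int > max_int:
        raise ValueError("min_int must not exceed max_int")
    span = max_int - min_int + 1

    # Smallest number of bytes whose value range covers the span.
    num_bytes = 1
    while 256 ** num_bytes < span:
        num_bytes += 1

    # Largest multiple of span that fits in num_bytes. Candidates at or
    # above it are rejected so that every result is equally likely
    # (this avoids modulo bias).
    limit = (256 ** num_bytes // span) * span
    while True:
        candidate = int.from_bytes(os.urandom(num_bytes), "big")
        if candidate < limit:
            return min_int + candidate % span


for _ in range(4):
    print(secure_randint(1, 9999))
```

Because every output comes directly from the OS source, the quality of the result depends only on os.urandom and not on the properties of a seeded pseudo random generator.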

How to see the "Bundle" output of Hypothesis Python library? (Stateful testing)

When using the hypothesis library and performing stateful testing, how can I see or output the Bundle "services" the library is trying on my code?
Example
import hypothesis.strategies as st
from hypothesis.strategies import integers
from hypothesis.stateful import Bundle, RuleBasedStateMachine, rule, precondition
class test_servicediscovery(RuleBasedStateMachine):
    services = Bundle('services')
    @rule(target=services, s=st.integers(min_value=0, max_value=2))
    def add_service(self, s):
        return s
The question is: how do I print / see the Bundle "services" variable, generated by the library?
In the example you've given, the services bundle isn't being tried on your code - you're adding things to it, but never using them as inputs to another rule.
If you are, running Hypothesis in verbose mode will show all inputs as they happen; or even in normal mode failing examples will print all the values used.

Nail down system requirements: MIN_CPU_COUNT, MIN_RAM

I develop a server component with Python.
I want to nail down the system requirements:
MIN_CPU_COUNT
MIN_RAM
...
Is there a way (maybe in setup.py) to define something like this?
My software needs at least N CPUs and M RAM?
Why? Because we had trouble in the past because operators moved the server component to a less capable server and we could not ensure the service-level agreement.
I implemented it this way.
Feedback welcome
from django.conf import settings
import psutil
import humanfriendly
from djangotools.utils import testutils
class Check(testutils.Check):
    @testutils.skip_if_not_prod
    def test_min_cpu_count(self):
        min_cpu_count=getattr(settings, 'MIN_CPU_COUNT', None)
        self.assertIsNotNone(min_cpu_count, 'settings.MIN_CPU_COUNT not set. Please supply a value.')
        self.assertLessEqual(min_cpu_count, psutil.cpu_count())
    @testutils.skip_if_not_prod
    def test_min_physical_memory(self):
        min_physical_memory_orig=getattr(settings, 'MIN_PHYSICAL_MEMORY', None)
        self.assertIsNotNone(min_physical_memory_orig, "settings.MIN_PHYSICAL_MEMORY not set. Please supply a value. Example: MIN_PHYSICAL_MEMORY='4G'")
        min_physical_memory_bytes=humanfriendly.parse_size(min_physical_memory_orig)
        self.longMessage=False
        self.assertLessEqual(min_physical_memory_bytes, psutil.virtual_memory().total, 'settings.MIN_PHYSICAL_MEMORY=%r is not satisfied. Total physical memory of current hardware: %r' % (
            min_physical_memory_orig, humanfriendly.format_size(psutil.virtual_memory().total)))
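Where psutil is not available, roughly the same checks can be sketched with the standard library alone. This is a minimal sketch under stated assumptions: the function names total_ram_bytes and unmet_requirements are hypothetical, and the RAM lookup relies on os.sysconf, which exists only on Unix-like systems (it reports "unknown" elsewhere).

```python
import os


def total_ram_bytes():
    """Return total physical RAM in bytes, or None if it cannot be determined."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing on Windows; unknown names raise ValueError.
        return None
    if pages <= 0 or page_size <= 0:
        return None
    return pages * page_size


def unmet_requirements(min_cpu_count, min_ram_bytes):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    cpu_count = os.cpu_count()
    if cpu_count is None or cpu_count < min_cpu_count:
        problems.append("need %d CPUs, found %r" % (min_cpu_count, cpu_count))
    ram = total_ram_bytes()
    if ram is None or ram < min_ram_bytes:
        problems.append("need %d bytes of RAM, found %r" % (min_ram_bytes, ram))
    return problems


# Example: require 2 CPUs and 4 GiB of RAM.
for problem in unmet_requirements(2, 4 * 1024 ** 3):
    print(problem)
```

Treating an undeterminable value as a failure is a deliberate choice here: for an SLA check it seems safer to flag an unknown machine than to silently pass it.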

Fixed identifier for a machine (uuid.getnode)

I'm trying to find something I can use as a unique string/number for my script that is fixed for a machine and easily obtainable (cross-platform). I presume a machine has a network card. It doesn't need to be truly unique, but it must stay fixed in the long run and be as rare as possible.
I know a MAC address can be changed, and I'd probably warn about that in my script; however, I don't expect anyone to change their MAC each morning.
What I came up with is uuid.getnode(), but in the docs there is:
If all attempts to obtain the hardware address fail, we choose a random 48-bit number
Does it mean that for each function call I get another random number, therefore it's not possible to use it if MAC is unobtainable?
...on a machine with multiple network interfaces the MAC address of any one of them may be returned.
Does this sentence mean getnode() gets a random (or the first) MAC from all those available? What if it returned MAC A on the first run and MAC B the next time? There'd be no problem if I got a fixed list (sort, concatenate, tadaaa!).
I'm asking because I have no way to test it myself.
I managed to test the first part on my Android device: on each new Python run it created a new random number, so it's not usable at all for this purpose.
The second problem more or less resolved itself: if the docs say it may return any one of them, then it's not something you can rely on (and I couldn't find a machine to test it on). A nice package, netifaces, came to the rescue; it does a similar thing:
netifaces.interfaces() # returns e.g. ['lo', 'eth0', 'tun2']
netifaces.ifaddresses('eth0')[netifaces.AF_LINK]
# returns [{'addr': '08:00:27:50:f2:51', 'broadcast': 'ff:ff:ff:ff:ff:ff'}]
However, I gave up on using MACs and found something more stable.
Now to the identifiers:
1) Windows:
Executing this one and getting output may be good enough:
wmic csproduct get UUID
or the one I used and is available in registry (HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Cryptography):
import _winreg  # Python 2 name; the module is called winreg in Python 3
registry = _winreg.HKEY_LOCAL_MACHINE
address = 'SOFTWARE\\Microsoft\\Cryptography'
keyargs = _winreg.KEY_READ | _winreg.KEY_WOW64_64KEY
key = _winreg.OpenKey(registry, address, 0, keyargs)
value = _winreg.QueryValueEx(key, 'MachineGuid')
_winreg.CloseKey(key)
unique = value[0]
2) Linux:
/sys/class/dmi/id/board_serial
or
/sys/class/dmi/id/product_uuid
or if not root:
cat /var/lib/dbus/machine-id
3) Android:
If you are working with Python and don't want to mess with Java stuff, then this should work pretty well:
import subprocess
cmd = ['getprop', 'ril.serialnumber']
unique = subprocess.check_output(cmd)[:-1]  # drop the trailing newline
but if you like Java, then go for this answer, although even ANDROID_ID's uniqueness is rather debatable if it's allowed to change; therefore a serial number is most likely a safer bet.
Note that, as already mentioned in the linked answer, even ril.serialnumber can be null/empty or non-existent (missing key). The same applies to the official Android API, whose documentation clearly states:
A hardware serial number, if available.
4) Mac/iPhone:
I couldn't find any solution as I don't have access to any of these, but if there is a variable that holds the machine id value, then you should be able to get there with simple subprocess.check_output()
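For the Linux paths listed above, a minimal sketch that tries each candidate file in order might look like this. The function name read_machine_id and the default ordering are assumptions made for illustration; the DMI files usually require root, which is why unreadable files are simply skipped.

```python
def read_machine_id(paths=(
    "/sys/class/dmi/id/product_uuid",
    "/sys/class/dmi/id/board_serial",
    "/var/lib/dbus/machine-id",
)):
    """Return the first non-empty identifier found in paths, or None."""
    for path in paths:
        try:
            with open(path) as handle:
                value = handle.read().strip()
        except OSError:
            # Missing file or insufficient permissions: try the next one.
            continue
        if value:
            return value
    return None


print(read_machine_id())
```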
I wouldn't recommend using a MAC address for a unique machine identifier, since it can change depending on the network being used. Rather, I'd recommend using the machine's native GUID, assigned by the operating system during install. I wrote a small, cross-platform PyPI package that queries a machine's native GUID called py-machineid.
Essentially, it looks like this, but with some Windows-specific WMI registry queries for a more accurate ID. The package also has support for hashing the ID, to anonymize it.
import subprocess
import sys
def run(cmd):
    """Run a shell command and return its stripped stdout, or None on failure."""
    try:
        return subprocess.run(cmd, shell=True, capture_output=True, check=True, encoding="utf-8") \
            .stdout \
            .strip()
    except Exception:
        return None
def guid():
    if sys.platform == 'darwin':
        return run(
            "ioreg -d2 -c IOPlatformExpertDevice | awk -F\\\" '/IOPlatformUUID/{print $(NF-1)}'",
        )
    if sys.platform == 'win32' or sys.platform == 'cygwin' or sys.platform == 'msys':
        # run() returns None on failure, so guard before splitting
        output = run('wmic csproduct get uuid')
        return output.split('\n')[2].strip() if output else None
    if sys.platform.startswith('linux'):
        return run('cat /var/lib/dbus/machine-id') or \
               run('cat /etc/machine-id')
    if sys.platform.startswith('openbsd') or sys.platform.startswith('freebsd'):
        return run('cat /etc/hostid') or \
               run('kenv -q smbios.system.uuid')
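The idea of hashing the ID to anonymize it can be illustrated with the standard library. This is a sketch of the general technique, not py-machineid's actual implementation; the function name hashed_machine_id and the app_key parameter are hypothetical.

```python
import hashlib
import hmac


def hashed_machine_id(raw_id, app_key):
    """Return a hex digest that is stable per machine and per application.

    Keying the hash with an application-specific string means two
    applications cannot correlate their hashed IDs for the same machine.
    """
    return hmac.new(
        app_key.encode("utf-8"),
        raw_id.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


# Placeholder values for illustration only.
print(hashed_machine_id("example-raw-machine-id", "my-app"))
```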
For Mac/iPhone, you can try the command below:
import subprocess
# For me the UUID was at index -2 of the split list; if you have trouble, try another index.
subprocess.check_output("ioreg -rd1 -c IOPlatformExpertDevice | grep -E '(UUID)'", shell=True).decode().split('"')[-2]
uuid.getnode will return the same value for every call within a single run of your app. If it has to defer to the random algorithm, then you will get a different value when you start a new instance of your app.
The implementation of getnode shows why. This is roughly what the routine looks like in Python 3.7 (comments mine, code simplified for clarity):
_node = None
def getnode():
    global _node
    if _node is not None:
        # Return cached value
        return _node
    # calculate node using platform specific logic like unix functions, ifconfig, etc
    _node = _get_node_from_platform()
    if not _node:
        # couldn't get node id from the system. Just make something up
        _node = _get_random_node()
    return _node
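The uuid documentation notes that the random fallback sets the multicast bit (the least significant bit of the first octet), so a script can at least detect when getnode() did not return a hardware address. A small sketch, with the helper name is_random_node being an assumption:

```python
import uuid


def is_random_node(node):
    """Return True if the 48-bit node value has the multicast bit set.

    The first octet occupies bits 40..47, so its least significant
    bit is bit 40 of the integer.
    """
    return bool((node >> 40) & 0x01)


node = uuid.getnode()
kind = "random fallback" if is_random_node(node) else "hardware address"
print(hex(node), kind)

# Within one process the value is cached, so repeated calls agree.
print(uuid.getnode() == node)
```

This only tells the two cases apart; it does not make the random value stable across runs, so a fallback result should still not be stored as a machine identifier.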

Determining the Bazaar version number from Python without calling bzr

I have a Django (Python) project that needs to know what version its code is on in Bazaar for deployment purposes. This is a web application, so I don't want to do the following, because it fires off a new subprocess and that's not going to scale:
import subprocess
subprocess.Popen(["bzr", "revno"], stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()
Is there a way to parse Bazaar repositories to calculate the version number? Bazaar itself is written in Python and contains this code for calculating the revno, which makes me think it isn't exactly trivial.
rh = self.revision_history()
revno = len(rh)
Edit: Final fix
from bzrlib.branch import BzrBranch
branch = BzrBranch.open_containing('.')[0]
revno = len(branch.revision_history())
Edit: Final fix but for real this time
from bzrlib.branch import BzrBranch
branch = BzrBranch.open_containing('.')[0]
revno = branch.last_revision_info()[0]
You can use Bazaar's bzrlib API to get information about any given Bazaar repository.
>>> from bzrlib.branch import BzrBranch
>>> branch = BzrBranch.open('.')
>>> branch.last_revision_info()
More examples are available here.
Do it once and cache the result (in a DB/file, if need be)? I doubt the version is going to change that much.
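The caching suggestion could look roughly like this: the subprocess runs once per process and later calls reuse the stored value. The names get_revno and _cached_revno are hypothetical, and the run parameter exists only so the lookup can be swapped out (for example in tests).

```python
import subprocess

_cached_revno = None


def get_revno(run=subprocess.check_output):
    """Return the Bazaar revision number, invoking bzr at most once per process."""
    global _cached_revno
    if _cached_revno is None:
        output = run(["bzr", "revno"])
        _cached_revno = int(output.decode().strip())
    return _cached_revno
```

Since deployed code does not change while the process is running, a per-process cache like this should stay accurate until the next restart.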
