Convert random characters in a string to uppercase - python

I'm trying to change the case of characters in text strings randomly, so that instead of just having an output like
>>>david
I will end up having something like
>>>DaViD
>>>dAviD
The code I have right now is this:
import random
import string
print("Name Year")
text_file = open("names.txt", "r")
for line in text_file:
    print(line.strip() + "".join([random.choice(string.digits) for x in range(1, random.randint(1,9))]))
and it outputs this:
>>>JOHN01361
I want that string to be something like
>>>jOhN01361
>>>john01361
>>>JOHN01361
>>>JoHn01361

Well, your specification is actually to randomly uppercase characters, and if you were so inclined, you could achieve that with the following generator expression:
import random
s = "..."
s = "".join( random.choice([k.upper(), k ]) for k in s )
but there may be nicer ways ...
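Applied to the code in the question, a minimal sketch might look like the following. The helper names `random_case` and `random_suffix` are made up for illustration. Each character is chosen from both its upper- and lower-case form, so input that is already uppercase (such as JOHN) can still come out mixed.

```python
import random
import string

def random_case(text, rng=random):
    """Return text with each character independently upper- or lower-cased."""
    return "".join(rng.choice((ch.upper(), ch.lower())) for ch in text)

def random_suffix(rng=random, max_digits=8):
    """Return between 0 and max_digits random digits, like the question's loop."""
    count = rng.randint(0, max_digits)
    return "".join(rng.choice(string.digits) for _ in range(count))

if __name__ == "__main__":
    # Stand-in names; the question reads them from names.txt instead.
    for name in ["JOHN", "david"]:
        print(random_case(name) + random_suffix())
```

Only the name goes through `random_case`; the digits are appended afterwards, so they are unaffected either way.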

you probably want to do something like:
import random
lol = "lol apples"
def randomupper(c):
    if random.random() > 0.5:
        return c.upper()
    return c.lower()
lol = ''.join(map(randomupper, lol))
EDIT:
As pointed out by Shawn Chin in the comments, this can be simplified to:
lol = "".join((c.upper(), c)[random.random() > 0.5] for c in lol)
Very cool, but slower than using map.
EDIT 2:
running some timer tests, it seems that
"".join( random.choice([k.upper(), k ]) for k in s )
is over 5 times slower than the map method. Can anyone confirm this?
Times are:
no map: 5.922078471303955
map: 4.248832001003303
random.choice: 25.282491881882898

The following might be slightly more efficient than Nook's solution; it also doesn't rely on the text being lower-case to start with:
import random
txt = 'JOHN01361'
''.join(random.choice((x,y)) for x,y in zip(txt.upper(),txt.lower()))

Timing different implementations just for fun:
#!/usr/bin/env python
import random
def f1(s):
    return ''.join(random.choice([x.upper(), x]) for x in s)
def f2(s):
    return ''.join((x.upper(), x)[random.randint(0, 1)] for x in s)
def f3(s):
    def randupper(c):
        return random.random() > 0.5 and c.upper() or c
    return ''.join(map(randupper, s))
def f4(s):
    return ''.join(random.random() > 0.5 and x.upper() or x for x in s)
if __name__ == '__main__':
    import timeit
    timethis = ['f1', 'f2', 'f3', 'f4']
    s = 'habia una vez... truz'
    for f in timethis:
        print('%s: %s' % (f,
            timeit.repeat('%s(s)' % f, 'from __main__ import %s, s' % f,
                          repeat=5, number=1000)))
These are my times:
f1: [0.12144303321838379, 0.13189697265625, 0.13808107376098633, 0.11335396766662598, 0.11961007118225098]
f2: [0.22459602355957031, 0.23735499382019043, 0.19971895217895508, 0.2097780704498291, 0.22068285942077637]
f3: [0.044358015060424805, 0.051508903503417969, 0.045358896255493164, 0.047426939010620117, 0.042778968811035156]
f4: [0.04383397102355957, 0.039394140243530273, 0.039273977279663086, 0.045912027359008789, 0.039510011672973633]

Related

Need help regarding random string generation python

What I want is to generate a string in this specific format: l+l+l+d+d+d+d+l+d+l+l+l+l+d+d+d+d+l+d+l+l+l+l+d+d+d+d+l+d+l+l+l+l+d+d+d+d+l+d+l With each l and d being a different string or number.
The issue is when I try to generate, the whole thing is the same value/string. But I want it different.
Here is an example:
What I am getting:
lll9999l9llll9999l9llll9999l9llll9999l9l
What I need:
bfb7491w3anfr4530x2zzbg9891u2rbep8421m9s
def id_gen():
    l = random.choice(string.ascii_lowercase)
    d = random.choice(string.digits)
    id = l+l+l+d+d+d+d+l+d+l+l+l+l+d+d+d+d+l+d+l+l+l+l+d+d+d+d+l+d+l+l+l+l+d+d+d+d+l+d+l
    print(id)
The result:
lll9999l9llll9999l9llll9999l9llll9999l9l
I need this to generate something different :)
This seems to work for me:
def gen_id():
    pattern = 'lllddddldllllddddldllllddddldllllddddldl'
    digits = [random.choice(string.digits) for i in range(len(pattern))]
    letters = [random.choice(string.ascii_lowercase) for i in range(len(pattern))]
    return ''.join(digits[i] if pattern[i] == 'd' else letters[i] for i in range(len(pattern)))
testing:
>>> gen_id()
'lnx1066k0hnrd5409d1nhgo1254t6rzyw5165f8v'
>>> gen_id()
'sbc7119f4ythd8845i1afay1900f4wjcv0659b4e'
>>> gen_id()
'yan6228r0nebj5097y7jnwh7065s7osra0391j5f'
>>>
seems different enough... please, don't forget to import string, random =)
To avoid drawing more random values than needed, IMHO this is the best solution (random.choices needs Python 3.6+, and the count must be passed as the keyword argument k):
def gen_id(pattern):
    l = len(pattern)
    d = pattern.count('d')
    digits = random.choices(string.digits, k=d)
    letters = random.choices(string.ascii_lowercase, k=l-d)
    return ''.join(digits.pop() if pattern[i] == 'd' else letters.pop() for i in range(l))
You can use this to get a random combination of letters and digits in the desired order:
def letter():
    return random.choice(string.ascii_lowercase)
def digit():
    return random.choice(string.digits)
def id_gen():
    return letter() + digit() + letter() + letter() # ldll
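The code in the question repeats one value because `random.choice` is called only once for `l` and once for `d`, and those two results are then reused. A minimal sketch of a pattern-driven variant that draws a fresh character for every position is shown below. The `ALPHABETS` mapping and the default pattern are assumptions made for illustration.

```python
import random
import string

# Hypothetical mapping from pattern symbols to the alphabet each one draws from.
ALPHABETS = {"l": string.ascii_lowercase, "d": string.digits}

def gen_id(pattern="lllddddldl"):
    """Draw a fresh random character for every position in the pattern."""
    return "".join(random.choice(ALPHABETS[symbol]) for symbol in pattern)

if __name__ == "__main__":
    print(gen_id())
    print(gen_id("lllddddldllllddddldl"))
```

Adding another symbol (say, uppercase letters) would only need one more entry in the mapping.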

Finding most sequences of specified length

I'm trying to write python code that will take a string and a length, and search through the string to tell me which sub-string of that particular length occurs the most, prioritizing the first if there's a tie.
For example, "cadabra abra" 2 should return ab
I tried:
import sys
def main():
    inputstring = str(sys.argv[1])
    length = int(sys.argv[2])
    Analyze(inputstring, length)
def Analyze(inputstring, length):
    count = 0;
    runningcount = -1;
    sequence = ""
    substring = ""
    for i in range(0, len(inputstring)):
        substring = inputstring[i:i+length]
        for j in range(i+length,len(inputstring)):
            #print(runningcount)
            if inputstring[j:j+2] == substring:
                print("runcount++")
                runningcount += 1
                print(runningcount)
                if runningcount > count:
                    count = runningcount
                    sequence = substring
    print(sequence)
main()
But can't seem to get it to work. I know I'm at least doing something wrong with the counts, but I'm not sure what. This is my first program in Python too, but I think my problem is probably more with the algorithm than the syntax.
Try to use built-in methods; they will make your life easier. For example:
>>> s = "cadabra abra"
>>> x = 2
>>> l = [s[i:i+x] for i in range(len(s)-x+1)]
>>> l
['ca', 'ad', 'da', 'ab', 'br', 'ra', 'a ', ' a', 'ab', 'br', 'ra']
>>> max(l, key=lambda m:s.count(m))
'ab'
EDIT:
Much simpler syntax, as per Stefan Pochmann's comment:
>>> max(l, key=s.count)
import sys
from collections import OrderedDict
def main():
    inputstring = sys.argv[1]
    length = int(sys.argv[2])
    analyze(inputstring, length)
def analyze(inputstring, length):
    d = OrderedDict()
    for i in range(0, len(inputstring) - length + 1):
        substring = inputstring[i:i+length]
        if substring in d:
            d[substring] += 1
        else:
            d[substring] = 1
    maxlength = max(d.values())
    for k,v in d.items():
        if v == maxlength:
            print(k)
            break
main()
Pretty good stab at a solution for a first Python program. As you learn the language, spend some time reading the excellent documentation. It is full of examples and tips.
For example, the standard library includes a Counter class for counting things (obviously) and an OrderedDict class which remembers the order in which keys are entered. The documentation also includes an example that combines the two to make an OrderedCounter, which can be used to solve your problem like this:
from collections import Counter, OrderedDict
class OrderedCounter(Counter, OrderedDict):
    pass
def analyze(s, n):
    substrings = (s[i:i+n] for i in range(len(s)-n+1))
    counts = OrderedCounter(substrings)
    return max(counts.keys(), key=counts.__getitem__)
analyze("cadabra abra", 2)
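On Python 3.7 and later, a plain Counter may be enough, because dictionaries (and Counter is a dict subclass) keep insertion order. The sketch below relies on that and on `max` returning the first maximal item it encounters. The function name `most_common_substring` is made up for illustration.

```python
from collections import Counter

def most_common_substring(s, n):
    """Return the most frequent length-n substring, earliest one on ties."""
    counts = Counter(s[i:i + n] for i in range(len(s) - n + 1))
    # max() keeps the first maximal key it sees, and keys iterate in insertion order.
    return max(counts, key=counts.get)

if __name__ == "__main__":
    print(most_common_substring("cadabra abra", 2))  # ab
```

If `n` is longer than the string there are no substrings at all, and `max` raises ValueError, so callers may want to check for that case first.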

Printing alphabets advanced by n in Python

How can I write a Python program that takes some letters as input and prints each letter advanced by n positions in the alphabet? Example:
my_string = 'abc'
expected_output = 'cde' # n=2
One way I've thought is by using str.maketrans, and mapping the original input to (alphabets + n). Is there any other way?
PS: xyz should translate to abc
I've tried to write my own code as well for this, (apart from the infinitely better answers mentioned):
number = 2
prim = """abc! fgdf """
final = prim.lower()
for x in final:
    if(x =="y"):
        print("a", end="")
    elif(x=="z"):
        print("b", end="")
    else:
        conv = ord(x)
        x = conv+number
        print(chr(x),end="")
Any comments on how to not convert special chars? thanks
If you don't care about wrapping around, you can just do:
def shiftString(string, number):
    return "".join(map(lambda x: chr(ord(x)+number), string))
If you do want to wrap around (think Caesar cipher), you'll need to specify where the alphabet starts and how many symbols it has:
def shiftString(string, number, start=97, num_of_symbols=26):
    # Only characters inside [start, start+num_of_symbols) are shifted
    return "".join(map(lambda x: chr(((ord(x)+number-start) %
        num_of_symbols)+start) if start <= ord(x) < start+num_of_symbols
        else x, string))
That would, e.g., convert abcxyz, when given a shift of 2, into cdezab.
If you actually want to use it for "encryption", make sure to exclude non-alphabetic characters (like spaces etc.) from it.
edit: Shameless plug of my Vigenère tool in Python
edit2: Now only converts in its range.
How about something like
>>> my_string = "abc"
>>> n = 2
>>> "".join([ chr(ord(i) + n) for i in my_string])
'cde'
Note: As mentioned in the comments, the question is a bit vague about what to do when edge cases such as xyz are encountered.
Edit: To take care of edge cases, you can write something like
>>> from string import ascii_lowercase
>>> lower = ascii_lowercase
>>> input = "xyz"
>>> "".join([ lower[(lower.index(i)+2)%26] for i in input ])
'zab'
>>> input = "abc"
>>> "".join([ lower[(lower.index(i)+2)%26] for i in input ])
'cde'
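The `str.maketrans` idea mentioned in the question can be sketched as follows. This is only one possible take: the function name `shift_letters` is an assumption, and the sketch handles ASCII letters only, wrapping around and leaving every other character untouched.

```python
import string

def shift_letters(text, n):
    """Shift ASCII letters by n places with wrap-around; leave other characters alone."""
    n %= 26
    lower, upper = string.ascii_lowercase, string.ascii_uppercase
    # Map each alphabet onto a copy of itself rotated by n positions.
    table = str.maketrans(lower + upper,
                          lower[n:] + lower[:n] + upper[n:] + upper[:n])
    return text.translate(table)

if __name__ == "__main__":
    print(shift_letters("abc", 2))            # cde
    print(shift_letters("xyz", 2))            # zab
    print(shift_letters("Hello, World!", 2))  # Jgnnq, Yqtnf!
```

Because characters missing from the translation table pass through unchanged, spaces and punctuation need no special handling.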
I've made the following change to the code:
number = 2
prim = """Special() ops() chars!!"""
final = prim.lower()
for x in final:
    if(x =="y"):
        print("a", end="")
    elif(x=="z"):
        print("b", end="")
    elif (ord(x) in range(97, 123)):  # 97..122 is 'a'..'z'
        conv = ord(x)
        x = conv+number
        print(chr(x),end="")
    else:
        print(x, end="")
**Output**: urgekcn() qru() ejctu!!
test_data = (('abz', 2), ('abc', 3), ('aek', 26), ('abcd', 25))
# translate every character
def shiftstr(s, k):
    if not (isinstance(s, str) and isinstance(k, int) and k >=0):
        return s
    a = ord('a')
    return ''.join([chr(a+((ord(c)-a+k)%26)) for c in s])
for s, k in test_data:
    print(shiftstr(s, k))
print('----')
# translate at most 26 characters, rest look up dictionary at O(1)
def shiftstr(s, k):
    if not (isinstance(s, str) and isinstance(k, int) and k >=0):
        return s
    a = ord('a')
    d = {}
    l = []
    for c in s:
        v = d.get(c)
        if v is None:
            v = chr(a+((ord(c)-a+k)%26))
            d[c] = v
        l.append(v)
    return ''.join(l)
for s, k in test_data:
    print(shiftstr(s, k))
Testing shiftstr_test.py (above code):
$ python3 shiftstr_test.py
cdb
def
aek
zabc
----
cdb
def
aek
zabc
It covers wrapping.

How to compress by removing duplicates in python?

I have strings with blocks of the same character in, eg '1254,,,,,,,,,,,,,,,,982'. What I'm aiming to do is replace that with something along the lines of '1254(,16)982' so that the original string can be reconstructed. If anyone could point me in the right direction that would be greatly appreciated
You're looking for run-length encoding: here is a Python implementation based loosely on this one.
import itertools
def runlength_enc(s):
    '''Return a run-length encoded version of the string'''
    enc = ((x, sum(1 for _ in gp)) for x, gp in itertools.groupby(s))
    removed_1s = [((c, n) if n > 1 else c) for c, n in enc]
    joined = [["".join(g)] if n == 1 else list(g)
              for n, g in itertools.groupby(removed_1s, key=len)]
    return list(itertools.chain(*joined))
def runlength_decode(enc):
    # Runs are (char, count) tuples; check the type, since a literal chunk can also have length 2
    return "".join((c[0] * c[1] if isinstance(c, tuple) else c) for c in enc)
For your example:
print(runlength_enc("1254,,,,,,,,,,,,,,,,982"))
# ['1254', (',', 16), '982']
print(runlength_decode(runlength_enc("1254,,,,,,,,,,,,,,,,982")))
# 1254,,,,,,,,,,,,,,,,982
(Note that this will be efficient only if there are very long runs in your string).
If you don't care about the exact compressed form, you may want to look at zlib.compress and zlib.decompress. zlib is a standard Python library that can compress a single string and will probably get better compression than a self-implemented compression algorithm.
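A minimal round trip through zlib might look like this. The sketch assumes Python 3, where zlib operates on bytes, so the string is encoded before compressing and decoded after decompressing.

```python
import zlib

original = "1254" + "," * 16 + "982"

# zlib works on bytes, so encode the string first.
compressed = zlib.compress(original.encode("utf-8"))
restored = zlib.decompress(compressed).decode("utf-8")

print(len(original), len(compressed))
print(restored == original)  # True
```

For a string this short the compressed form is not guaranteed to be smaller than the input; the benefit tends to show up on longer or more repetitive data.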
Using regular expressions:
s = '1254,,,,,,,,,,,,,,,,982'
import re
c = re.sub(r'(.)\1+', lambda m: '(%s%d)' % (m.group(1), len(m.group(0))), s)
print(c) # 1254(,16)982
Using itertools:
import itertools
c = ''
for ch, g in itertools.groupby(s):
    k = len(list(g))
    c += ch if k == 1 else '(%s%d)' % (ch, k)
print(c) # 1254(,16)982
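Since the question also asks for the original string to be reconstructable, here is a hedged sketch of a matching decoder for the `(,16)` notation. The names `rle_encode` and `rle_decode` are made up for illustration. The format is ambiguous if the original text already contains something shaped like `(x3)`, so this only round-trips safely for input without such sequences.

```python
import re

def rle_encode(s):
    """Replace each run of two or more identical characters with '(<char><count>)'."""
    return re.sub(r'(.)\1+',
                  lambda m: '(%s%d)' % (m.group(1), len(m.group(0))), s)

def rle_decode(encoded):
    """Expand every '(<char><count>)' group back into the repeated character."""
    return re.sub(r'\((.)(\d+)\)',
                  lambda m: m.group(1) * int(m.group(2)), encoded)

if __name__ == "__main__":
    s = "1254" + "," * 16 + "982"
    packed = rle_encode(s)
    print(packed)                   # 1254(,16)982
    print(rle_decode(packed) == s)  # True
```

A more robust format would escape literal parentheses or use a list of tuples, as in the run-length answer above.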

removing non-numeric characters from a string

strings = ["1 asdf 2", "25etrth", "2234342 awefiasd"] #and so on
Which is the easiest way to get [1, 25, 2234342]?
How can this be done without a regex module or expression like (^[0-9]+)?
One could write a helper function to extract the prefix:
def numeric_prefix(s):
    n = 0
    for c in s:
        if not c.isdigit():
            return n
        else:
            n = n * 10 + int(c)
    return n
Example usage:
>>> strings = ["1asdf", "25etrth", "2234342 awefiasd"]
>>> [numeric_prefix(s) for s in strings]
[1, 25, 2234342]
Note that this will produce correct output (zero) when the input string does not have a numeric prefix (as in the case of empty string).
Working from Mikel's solution, one could write a more concise definition of numeric_prefix:
import itertools
def numeric_prefix(s):
    n = ''.join(itertools.takewhile(lambda c: c.isdigit(), s))
    return int(n) if n else 0
new = []
for item in strings:
    new.append(int(''.join(i for i in item if i.isdigit())))
print(new)
[1, 25, 2234342]
Basic usage of regular expressions:
import re
strings = ["1asdf", "25etrth", "2234342 awefiasd"]
regex = re.compile(r'^(\d*)')
for s in strings:
    mo = regex.match(s)
    print(s, '->', mo.group(0))
1asdf -> 1
25etrth -> 25
2234342 awefiasd -> 2234342
Building on sahhhm's answer, you can fix the "1 asdf 1" problem by using takewhile.
from itertools import takewhile
def isdigit(char):
    return char.isdigit()
numbers = []
for string in strings:
    result = takewhile(isdigit, string)
    resultstr = ''.join(result)
    if resultstr:
        number = int(resultstr)
        if number:
            numbers.append(number)
So you only want the leading digits? And you want to avoid regexes? Probably there's something shorter but this is the obvious solution.
nlist = []
for s in strings:
    if not s or s[0].isalpha(): continue
    for i, c in enumerate(s):
        if not c.isdigit():
            nlist.append(int(s[:i]))
            break
    else:
        nlist.append(int(s))
