I am trying to write a loop that generates strings. I want to build a small collection of strings from 1 character up to 5 characters long.
So, starting from the string 1, I want to go up to 55555. With digits alone that seems easy, since I could just add numbers, but with alphanumeric characters it gets tricky.
Here is an explanation:
I have a collection of alphanumeric characters as a string, s = "123ABC". First I want to create all possible 1-character strings from it, so I will have 1, 2, 3, A, B, C. Then I want to increase the string length by one, so I get 11, 12, 13 and so on, through every possible combination up to CA, CB, CC, and I want to continue this way up to CCCCCC. I am confused by the loop: I can generate a temporary string, but looping inside it to rotate the characters is tricky.
This is what I have done so far:
i = 0
strr = "123ABC"
while i < len(strr):
    t = strr[0] * (i+1)
    for q in range(0, len(t)):
        # Here I need help to rotate more
        pass
    i += 1
Can anyone explain this to me or point me to a resource where I can find a solution?
You may want to use the itertools.permutations function. Note that permutations never reuses a character, so strings such as 11 or CC are not produced:
import itertools
chars = '123ABC'
for i in range(1, len(chars)+1):
    print(list(itertools.permutations(chars, i)))
EDIT:
To get a list of strings, try this:
import itertools
chars = '123ABC'
strings = []
for i in range(1, len(chars)+1):
    # join each tuple of characters into a single string
    strings.extend(''.join(x) for x in itertools.permutations(chars, i))
This is a nested loop. Different depths of recursion produce all possible combinations.
strr = "123ABC"
def prod(items, level):
    if level == 0:
        yield []
    else:
        for first in items:
            for rest in prod(items, level-1):
                yield [first] + rest
for ln in range(1, len(strr)+1):
    print("length:", ln)
    for s in prod(strr, ln):
        print(''.join(s))
This is also called the Cartesian product, and itertools has a corresponding function, itertools.product.
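As a minimal sketch of that itertools route: product with the repeat argument yields every tuple of a given length with characters allowed to repeat, which is what the question asks for. The helper name strings_up_to and the max_len parameter are made up for this illustration.

```python
from itertools import product

def strings_up_to(chars, max_len):
    """Yield every string of length 1..max_len built from chars, repetition allowed."""
    for length in range(1, max_len + 1):
        # product(chars, repeat=length) is the Cartesian product chars x ... x chars
        for combo in product(chars, repeat=length):
            yield ''.join(combo)

result = list(strings_up_to("123ABC", 2))
print(result[:8])   # ['1', '2', '3', 'A', 'B', 'C', '11', '12']
print(len(result))  # 42, i.e. 6 + 36
```

With six characters the count grows as 6 + 6^2 + ... + 6^n, so for long strings it is probably better to iterate over the generator than to build the full list.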
How do I take the first character from each string in a list, join them together, then the second character from each string, join them together, and so on - and eventually create one combined string?
For example, if I have strings like these:
homanif
eiesdnt
ltiwege
lsworar
I want the end result to be helloitsmeiwaswonderingafter
I put together a very hacky version of this which does the job but appends extra gibberish. Since it is also prone to the index going out of range, I don't think this is a good approach:
final_c = ['homanif', 'eiesdnt', 'ltiwege', 'lsworar']
final_message = ""
current_char = 0
for i in range(len(final_c[1])):
    for c in final_c:
        final_message += c[current_char]
    current_char += 1
final_message += final_c[0][:-1]
print(final_message)
gives me helloitsmeiwaswonderingafterhomani when it should simply stop at helloitsmeiwaswonderingafter.
How do I improve this?
Problems related to iterating in some convoluted order can often be solved elegantly with itertools.
Using zip
You can use zip and itertools.chain together.
from itertools import chain
final_c = ['homanif', 'eiesdnt', 'ltiwege', 'lsworar']
final_message = ''.join(chain.from_iterable(zip(*final_c))) # 'helloitsmeiwaswonderingafter'
If the strings in final_c may have different lengths, you can tweak the code to use itertools.zip_longest.
from itertools import chain, zip_longest
final_message = ''.join(filter(None, chain.from_iterable(zip_longest(*final_c))))
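A small sketch of the unequal-length case, assuming missing characters should simply be skipped. It passes fillvalue='' to zip_longest so that no filter step is needed; the function name interleave is only for illustration.

```python
from itertools import chain, zip_longest

def interleave(words):
    """Join the first characters of every word, then the second characters, and so on."""
    # fillvalue='' pads the shorter words with empty strings, which vanish in join
    columns = zip_longest(*words, fillvalue='')
    return ''.join(chain.from_iterable(columns))

print(interleave(['homanif', 'eiesdnt', 'ltiwege', 'lsworar']))  # helloitsmeiwaswonderingafter
print(interleave(['abc', 'd', 'ef']))                            # adebfc
```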
Using cycle
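The fun part with itertools is that it offers plenty of clever short solutions for iterating over objects. Here is another using itertools.cycle. It stops as soon as one of the strings runs out of characters.
from itertools import cycle, takewhile
final_c = ['homanif', 'eiesdnt', 'ltiwege', 'lsworara']
# next(it, None) keeps StopIteration from leaking out of the generator (a RuntimeError since Python 3.7)
chars = (next(it, None) for it in cycle([iter(w) for w in final_c]))
final_message = ''.join(takewhile(lambda ch: ch is not None, chars))  # 'helloitsmeiwaswonderingafter'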
You can use a nested comprehension:
x = ["homanif",
     "eiesdnt",
     "ltiwege",
     "lsworar"]
y = "".join(x[i][j]
            for j in range(len(x[0]))
            for i in range(len(x)))
or use nested joins and zip
y = "".join("".join(y) for y in zip(*x))
Here is code that works for me:
final_c = ["homanif", "eiesdnt", "ltiwege", "lsworar"]
final_message = ""
current_char = 0
for i in range(len(final_c[1])):
    for c in final_c:
        final_message += c[current_char]
    current_char += 1
# final_message += final_c[0][:-1]
print(final_message)
I hope it helps.
I don't understand what you expect from this line:
final_message += final_c[0][:-1]
The code works just fine without it. Either remove that line or use a generator expression instead:
final_message = "".join(final_c[i][j] for j in range(len(final_c[0])) for i in range(len(final_c)))
This gives the expected output:
helloitsmeiwaswonderingafter
It looks like you can build a matrix of shape n x m, where n is the number of words and m is the number of characters in a word (the following code works only if all your words have the same length):
import numpy as np
n = len(final_c) # number of words in your list
m = len(final_c[0]) # number of characters in a word
# one array element per character, in reading order
array = np.array(list(''.join(final_c)))
# reshape the array so that each row is one word
matrix = array.reshape(n, m)
# transposing and flattening reads the matrix column by column
''.join(matrix.transpose().flatten())
I'm trying to count how many times any character repeats in a word. The repetitions must be sequential.
For example, the method with input "loooooveee" should return 6 (4 times 'o', 2 times 'e').
I'm trying to implement string-level functions, and I can do it that way, but is there an easier way, such as a regex or something similar?
Original question: order of repetition does not matter
You can subtract the number of unique letters from the total number of letters. set applied to a string returns the collection of its unique letters.
x = "loooooveee"
res = len(x) - len(set(x)) # 6
Or you can use collections.Counter, subtract 1 from each value, then sum:
from collections import Counter
c = Counter("loooooveee")
res = sum(i-1 for i in c.values()) # 6
New question: repetitions must be sequential
You can use itertools.groupby to group sequential identical characters:
from itertools import groupby
g = groupby("aooooaooaoo")
res = sum(sum(1 for _ in j) - 1 for i, j in g) # 5
To avoid the nested sum calls, you can use itertools.islice:
from itertools import groupby, islice
g = groupby("aooooaooaoo")
res = sum(1 for _, j in g for _ in islice(j, 1, None)) # 5
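To make the difference between the two readings of the question concrete, here is a small side-by-side sketch. The function names are invented for this example; the logic is the set approach and the groupby approach shown above.

```python
from itertools import groupby

def count_any_repeats(text):
    """Repeats anywhere in the string: total letters minus unique letters."""
    return len(text) - len(set(text))

def count_sequential_repeats(text):
    """Repeats only within runs of identical adjacent characters."""
    # each run of length k contributes k - 1 repetitions
    return sum(len(list(run)) - 1 for _, run in groupby(text))

print(count_any_repeats("loooooveee"), count_sequential_repeats("loooooveee"))    # 6 6
print(count_any_repeats("aooooaooaoo"), count_sequential_repeats("aooooaooaoo"))  # 9 5
```

The two agree on "loooooveee" only because each repeated letter occurs in a single run; as soon as a letter reappears later in the string, the counts diverge.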
You could use a regular expression if you want:
import re
rx = re.compile(r'(\w)\1+')
repeating = sum(x[1] - x[0] - 1
                for m in rx.finditer("loooooveee")
                for x in [m.span()])
print(repeating)
This correctly yields 6 and makes use of the .span() function.
The expression is
(\w)\1+
which captures a word character (one of a-zA-Z0-9_) and tries to repeat it as often as possible.
See a demo on regex101.com for the repeating pattern.
If you want to match any character (that is, not only word characters), change your expression to:
(.)\1+
See another demo on regex101.com.
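A compact sketch of the any-character variant, wrapped in a function. It uses the length of each match instead of .span(), which should be equivalent; the re.DOTALL flag is an addition here so that runs of newlines are counted too, and the name count_repeats_regex is hypothetical.

```python
import re

# (.)\1+ matches a character followed by one or more copies of itself
_RUN = re.compile(r'(.)\1+', re.DOTALL)

def count_repeats_regex(text):
    """Count sequential repetitions: a run of k equal characters adds k - 1."""
    return sum(len(m.group(0)) - 1 for m in _RUN.finditer(text))

print(count_repeats_regex("loooooveee"))  # 6
print(count_repeats_regex("a  b"))        # 1 (the doubled space; \w would not match it)
```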
Try this:
word = input('something:')
total = 0  # avoid shadowing the built-in sum
chars = set(word)  # get the set of unique characters
for item in chars:  # iterate over the set and output the count for each repeated item
    if word.count(item) > 1:
        total += word.count(item)
        print('{}|{}'.format(item, word.count(item)))
print('Total:' + str(total))
EDIT:
added total count of repetitions
Since it doesn't matter where the repetition is occurring or which characters are being repeated, you can make use of the set data structure provided in Python. It will discard the duplicate occurrences of any character or an object.
Therefore, the solution would look something like this:
def measure_normalized_emphasis(text):
    return len(text) - len(set(text))
This gives the expected result for the example.
Also, make sure to check the edge cases, which is good practice anyway.
I think your code is comparing the wrong things.
You start by finding the last character:
char = text[-1]
Then you compare this to itself:
for i in range(1, len(text)):
    if text[-i] == char: #<-- surely this is text[-1] to begin with?
Why not just run through the characters:
def measure_normalized_emphasis(text):
    char = text[0]
    emphasis_size = 0
    for i in range(1, len(text)):
        if text[i] == char:
            emphasis_size += 1
        else:
            char = text[i]
    return emphasis_size
This seems to work.
I need to make a sequence of random strings that increase (or decrease) in alphabetical order. For example: "ajikfk45kJDk", "bFJIPH7CDd", "c".
The simplest thing to do is to create N random strings and then sort them.
So, how do you create a random string? Well, you haven't specified what your rule is, but your three examples are strings of 1 to 12 characters taken from the set of ASCII lowercase, uppercase, and digits, so let's do that.
import random
import string
length = random.randrange(1, 13)
letters = random.choices(string.ascii_letters + string.digits, k=length)
s = ''.join(letters)  # not named "string", which would shadow the module
So, just do this N times, then sort it.
Putting it together:
import random
import string
chars = string.ascii_letters + string.digits
def make_string():
    return ''.join(random.choices(chars, k=random.randrange(1, 13)))
def make_n_strings(n):
    return sorted(make_string() for _ in range(n))
This should be simple enough that you can customize it however you want. Want case-insensitive sorting? Just add key=str.upper to the sorted call. Want some other distribution of lengths? Just replace the randrange. And so on.
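A sketch of what such customization might look like. All parameter names here (min_len, max_len, ignore_case, descending, seed) are invented for the example, and the seed exists only to make a demo reproducible.

```python
import random
import string

CHARS = string.ascii_letters + string.digits

def make_sorted_strings(n, min_len=1, max_len=12, ignore_case=False,
                        descending=False, seed=None):
    """Return n random alphanumeric strings in sorted order."""
    rng = random.Random(seed)  # private generator; seed=None means unseeded
    words = [''.join(rng.choices(CHARS, k=rng.randint(min_len, max_len)))
             for _ in range(n)]
    # key=None is the default comparison; str.upper gives case-insensitive order
    key = str.upper if ignore_case else None
    return sorted(words, key=key, reverse=descending)

words = make_sorted_strings(5, seed=0)
print(words)
```

Note that the default ordering compares code points, so all uppercase letters sort before all lowercase ones unless ignore_case is set.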
You can call Python 3's chr() function in a loop, generating random numbers within the ASCII range you want.
You can find all the ASCII categories in an ASCII table, for example on Wikipedia.
For example:
chr(99)
-> equals c
More information about the chr() function is in the official Python 3 documentation.
The simplest way I can think of is
from random import randint
# printable ASCII runs from 33 ('!') to 126 ('~'); 127 is the DEL control character
a = ''.join(sorted([chr(randint(33, 126)) for i in range(randint(1, 20))], reverse=False))
print(a)
reverse=True makes it descending.
There are many ways to do that; here is a simple example in Python 3 using ASCII character codes:
from random import randint
def generateString(minLength, maxLength):
    result = ""
    resultLength = randint(minLength, maxLength)
    for i in range(resultLength):
        charType = randint(1, 3)
        if charType == 1:
            # digit
            result += chr(randint(48, 57))
        elif charType == 2:
            # uppercase letter
            result += chr(randint(65, 90))
        elif charType == 3:
            # lowercase letter
            result += chr(randint(97, 122))
    return result
# Example
print(generateString(1, 20))
From any *.fasta DNA sequence (only 'ACTG' characters) I must find all subsequences that contain at least one occurrence of each letter.
For example, from the sequence 'AAGTCCTAG' I should be able to find: 'AAGTC', 'AGTC', 'GTCCTA', 'TCCTAG', 'CCTAG' and 'CTAG' (iterating over each letter).
I have no clue how to do that in Python 2.7. I tried regular expressions, but they did not find every variant.
How can I achieve that?
You could find all substrings of length 4 or more, and then filter those down to the shortest ones that contain each letter at least once:
s = 'AAGTCCTAG'
def get_shortest(s):
    l, b = len(s), set('ATCG')
    options = [s[i:j+1] for i in range(l) for j in range(i,l) if (j+1)-i > 3]
    return [i for i in options if len(set(i) & b) == 4 and (set(i) != set(i[:-1]))]
print(get_shortest(s))
Output:
['AAGTC', 'AGTC', 'GTCCTA', 'TCCTAG', 'CCTAG', 'CTAG']
This is another way you can do it. It may not be as fast or as neat as chrisz's answer, but it may be a little simpler for beginners to read and understand.
DNA = 'AAGTCCTAG'
toSave = []
for i in range(len(DNA)):
    letters = ['A', 'G', 'T', 'C']
    j = i
    seq = []
    while len(letters) > 0 and j < len(DNA):
        seq.append(DNA[j])
        try:
            letters.remove(DNA[j])
        except ValueError:
            # this letter was already seen in the current window
            pass
        j += 1
    if len(letters) == 0:
        toSave.append(seq)
print(toSave)
Since the substring you are looking for may be of almost any length, a FIFO queue seems to work. Append one letter at a time and check whether there is at least one of each letter. If so, yield the substring. Then remove letters from the front and keep checking until the window is no longer valid.
def find_agtc_seq(seq_in):
    chars = 'AGTC'
    cur_str = []
    for ch in seq_in:
        cur_str.append(ch)
        while all(map(cur_str.count, chars)):
            yield "".join(cur_str)
            cur_str.pop(0)
seq = 'AAGTCCTAG'
for substr in find_agtc_seq(seq):
    print(substr)
That seems to result in the substrings you are looking for:
AAGTC
AGTC
GTCCTA
TCCTAG
CCTAG
CTAG
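The same queue idea can be sketched with collections.deque and a Counter, so that neither popping from the front nor checking the letters requires rescanning the window. This is a variation for illustration, not the answer's original code; the name minimal_windows and the alphabet parameter are assumptions.

```python
from collections import Counter, deque

def minimal_windows(seq, alphabet='ACGT'):
    """Yield, for each start position, the shortest substring containing every letter."""
    window = deque()
    counts = Counter()
    for ch in seq:
        window.append(ch)
        counts[ch] += 1
        # shrink from the left while the window still holds every letter
        while all(counts[c] > 0 for c in alphabet):
            yield ''.join(window)
            counts[window.popleft()] -= 1

print(list(minimal_windows('AAGTCCTAG')))
# ['AAGTC', 'AGTC', 'GTCCTA', 'TCCTAG', 'CCTAG', 'CTAG']
```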
I really wanted to create a short answer for this, so this is what I came up with!
s = 'AAGTCCTAG'
d = 'ACGT'
c = len(d)
while c <= len(s):
    x,c = s[:c],c+1
    if all(l in x for l in d):
        print(x)
        s,c = s[1:],len(d)
It works as follows:
c is set to the length of the string of characters that must all be present (d = ACGT).
The while loop examines successively longer prefixes of s for as long as c does not exceed the length of s.
It does this by increasing c by 1 on each iteration of the while loop.
If every character in d (ACGT) exists in the prefix, we print it, reset c to its initial value, and drop the first character of s.
The loop continues until the string s is shorter than d.
Result:
AAGTC
AGTC
GTCCTA
TCCTAG
CCTAG
CTAG
To get the output in a list instead:
s = 'AAGTCCTAG'
d = 'ACGT'
c,r = len(d),[]
while c <= len(s):
    x,c = s[:c],c+1
    if all(l in x for l in d):
        r.append(x)
        s,c = s[1:],len(d)
print(r)
Result:
['AAGTC', 'AGTC', 'GTCCTA', 'TCCTAG', 'CCTAG', 'CTAG']
If you can break the sequence into a list, e.g. of 5-letter sequences, you could then use this function to find repeated sequences.
from itertools import groupby
import numpy as np
def find_repeats(input_list, n_repeats):
    flagged_items = []
    for item in input_list:
        # Create itertools.groupby object
        groups = groupby(str(item))
        # Create list of tuples: (character, number of repeats)
        result = [(label, sum(1 for _ in group)) for label, group in groups]
        # Extract just number of repeats
        char_lens = np.array([x[1] for x in result])
        # Append to flagged items
        if any(char_lens >= n_repeats):
            flagged_items.append(item)
    # Return flagged items
    return flagged_items
#--------------------------------------
test_list = ['aatcg', 'ctagg', 'catcg']
find_repeats(test_list, n_repeats=2) # Returns ['aatcg', 'ctagg']
I am fairly new to programming and have been learning some of the material through HackerRank. However, there is this one objective or challenge that I am currently stuck on. I've tried several things but still cannot figure out what exactly I am doing wrong.
Objective: Read N and output the numbers between 0 and N without any white spaces or using a string method.
N = int(input())
listofnum = []
for i in range(1, N + 1):
    listofnum.append(i)
print(*listofnum)
Output:
1 2 3
N = int(input())
answer = ''
for i in range(1, N + 1):
    answer += str(i)
print(answer)
This is the closest I can think of to 'not using any string methods', although technically it is using str.__new__/__init__/__add__ in the background or some equivalent. I certainly think it fits the requirements of the question better than using ''.join.
Without using any string method, using just integer division and a list to reverse the digits, and printing them with sys.stdout.write:
import sys
N = int(input())
for i in range(1, N+1):
    l = []
    # peel off the digits, least significant first
    while i:
        l.append(i % 10)
        i //= 10
    for c in reversed(l):
        sys.stdout.write(chr(c + 48))  # 48 is the ASCII code of '0'
Or, as tdelaney suggested, an even more hardcore method:
import os, sys, struct
N = int(input())
for i in range(1, N+1):
    l = []
    while i:
        l.append(i % 10)
        i //= 10
    for c in reversed(l):
        # write the raw ASCII byte straight to the stdout file descriptor
        os.write(sys.stdout.fileno(), struct.pack('b', c + 48))
All of this is great fun, but the best way would be a one-liner using a generator expression with str.join() and str construction:
"".join(str(x) for x in range(1,N+1))
Each number is converted to a string, and join concatenates them all with an empty separator.
You can print the numbers inside the loop. Just use the end keyword in print:
print(i, end="")
Try ''.join([str(i) for i in range(1, N + 1)])
One way to accomplish this is to append the numbers to an empty string.
out = ''
for i in range(1, N + 1):  # 1..N, matching the range used in the question
    out += str(i)
print(out)
You can make use of print()'s sep argument to "bind" each number together from a list comprehension:
>>> print(*[el for el in range(0, int(input())+1)], sep="")
10
012345678910
>>>
You have to do some simple math for this. What they expect is that you multiply each list element by a power of ten and add the results together. As an example, let's say you have an array:
a = [2,3,5]
and you need to output:
235
Then you multiply the elements, going from right to left, by 10^0, 10^1 and 10^2. Use this code after you have built the list (it assumes every element is a single digit):
a = list(map(int, a))
total = 0
for i in range(len(a)):
    # a[-(i + 1)] walks the list from right to left
    total += (10**i) * a[-(i + 1)]
print(total)
You are done!
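The powers-of-ten idea breaks down once an element has more than one digit (10, 11, ...). A possible extension, sketched here under the assumption that the goal is the digits of 1..N written back to back, shifts the running total by the number of digits of each new number. The function name concat_numbers is made up for this example.

```python
def concat_numbers(n):
    """Build the integer whose digits are 1, 2, ..., n written back to back."""
    result = 0
    for i in range(1, n + 1):
        # find the smallest power of ten strictly greater than i
        shift = 1
        while shift <= i:
            shift *= 10
        # make room for i's digits, then add them
        result = result * shift + i
    return result

print(concat_numbers(3))   # 123
print(concat_numbers(12))  # 123456789101112
```

Printing the resulting integer still converts it to text internally, but no string method is called explicitly.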