I have a list of 22,000 strings like abc.wav. I want to extract a specific character from each of them in Python, for example the character just before .wav. How can I do that?
You could find the position of a character with .split(), but if you want the character at a known position you can index directly: list[stringNum][letterNum]. Calling list[stringNum].split("a") gives you the pieces of the string on either side of each "a", and the length of the first piece is the index of the first "a". Just a simple algorithm idea; you'd have to play around with it.
I am assuming you are trying to reconstruct the same string without the letter before the extension.
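resultList = []
for item in filenames:  # filenames is your list of strings (naming it list would shadow the built-in)
    newstr, extstr = item.rsplit('.', 1)  # split on the last dot only
    locstr = newstr[:-1]  # change the slice here depending on the char you want to remove
    endstr = locstr + '.' + extstr  # put the dot back before the extension
    resultList.append(endstr)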
If you only want to save a list of the removed letters, do the following:
resultList = []
for item in filenames:  # filenames is your list of strings
    newstr = item.rsplit('.', 1)[0]
    endstr = newstr[-1]  # the character just before the extension
    resultList.append(endstr)
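The two loops above can also be sketched with os.path.splitext, which splits on the last dot and keeps the dot with the extension. The function names below are made up for illustration, and the sample list is an assumption standing in for the real 22,000 names.

```python
import os.path

def char_before_extension(filename):
    """Return the last character of the name part ('' if the name part is empty)."""
    stem, _ext = os.path.splitext(filename)
    return stem[-1:]  # slicing avoids an IndexError on an empty stem

def drop_char_before_extension(filename):
    """Return the filename with the character just before the extension removed."""
    stem, ext = os.path.splitext(filename)  # ext keeps its leading dot
    return stem[:-1] + ext

# Hypothetical sample data
names = ["abc.wav", "take.01.wav", "x.wav"]
removed = [char_before_extension(n) for n in names]
rebuilt = [drop_char_before_extension(n) for n in names]
print(removed)
print(rebuilt)
```

List comprehensions like these should be fast enough for a list of this size.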
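With pandas, you can split each name into the stem and the extension with str.extract:
import pandas as pd

df = pd.DataFrame({'something': ['asb.wav', 'xyz.wav']})
df.something.str.extract(r"(\w*)(\.wav$)", expand=True)
Gives:
     0     1
0  asb  .wav
1  xyz  .wav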
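If pandas is not otherwise needed, a similar pattern can be applied with the standard re module. This is only a sketch: the helper name char_before_wav is made up, and the pattern captures just the single character before a trailing .wav, which is what the question asks for.

```python
import re

# One capturing group: the single character immediately before a trailing ".wav".
BEFORE_WAV = re.compile(r"(.)\.wav$")

def char_before_wav(name):
    """Return the character before '.wav', or None if the name does not match."""
    match = BEFORE_WAV.search(name)
    return match.group(1) if match else None

chars = [char_before_wav(n) for n in ["asb.wav", "xyz.wav", "notes.txt"]]
print(chars)
```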
I am using a Python dictionary to translate DNA bases into codons, but I want the program to recognise when nonsense is entered. Currently all that happens is a KeyError when using something like
"codon += cod[F[x]]"
Is there a way to search a string of bases (AGCTATATCAT, for example) for characters not found in the dictionary? For example, if characters other than ACGT were in it, how would I detect this?
Thanks
You can check if all characters in a string are in a given set by doing the following:
if set(string).difference(set("AGCT")):
    # There are characters other than 'AGCT' in string
    print("invalid characters found")
else:
    # All characters in the string are one of "AGCT"
    print("all characters are valid")
A quick way to validate that every character in a string is a valid base is the set.issuperset method. E.g.:
valid_bases = set('ACGT')
for s in ('AGCTAT', 'ATCQAT'):
    print(s, valid_bases.issuperset(s))
output
AGCTAT True
ATCQAT False
If you wish to identify the illegal chars, you can use set difference:
valid_bases = set('ACGT')
for s in ('AGCTAT', 'ATCQAT', 'ATCQAZT'):
    bad = set(s) - valid_bases
    print(s, bad or "ok")
output
AGCTAT ok
ATCQAT {'Q'}
ATCQAZT {'Z', 'Q'}
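To tie the set checks back to the original KeyError, here is a rough sketch that reports where the bad characters are and validates before any dictionary lookup. The function names and the tiny CODON_TABLE are assumptions for illustration only; the asker's real dictionary (cod) is not shown in the question.

```python
VALID_BASES = frozenset("ACGT")

# Hypothetical, deliberately incomplete table; a real one has 64 entries.
CODON_TABLE = {"AGC": "Ser", "TAT": "Tyr", "ATG": "Met", "GCT": "Ala"}

def find_invalid_bases(sequence):
    """Return (index, character) pairs for every character that is not A, C, G or T."""
    return [(i, ch) for i, ch in enumerate(sequence) if ch not in VALID_BASES]

def translate(sequence, codon_table):
    """Validate first, then look up each complete triplet; unknown codons become '?'."""
    problems = find_invalid_bases(sequence)
    if problems:
        raise ValueError(f"invalid bases at {problems}")
    usable = len(sequence) - len(sequence) % 3  # ignore a trailing partial codon
    return [codon_table.get(sequence[i:i + 3], "?") for i in range(0, usable, 3)]

print(find_invalid_bases("ATCQAZT"))
print(translate("AGCTATATCAT", CODON_TABLE))
```

Using dict.get with a placeholder avoids the KeyError for valid triplets that are simply missing from the table, while the ValueError gives a clear message for characters that are not bases at all.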
Is there a way to add comments into a multiline string, or is it not possible? I'm trying to write data into a csv file from a triple-quote string. I'm adding comments in the string to explain the data. I tried doing this, but Python just assumed that the comment was part of the string.
"""
1,1,2,3,5,8,13 # numbers in the Fibonacci sequence
1,4,9,16,25,36,49 # numbers of the square number sequence
1,1,2,5,14,42,132,429 # numbers in the Catalan number sequence
"""
No, it's not possible to have comments in a string. How would Python know that the hash sign # in your string is supposed to be a comment, and not just a hash sign? It makes a lot more sense to interpret the # character as part of the string than as a comment.
As a workaround, you can make use of automatic string literal concatenation:
(
    "1,1,2,3,5,8,13\n"  # numbers in the Fibonacci sequence
    "1,4,9,16,25,36,49\n"  # numbers of the square number sequence
    "1,1,2,5,14,42,132,429"  # numbers in the Catalan number sequence
)
If you add comments into the string, they become part of the string. If that weren't true, you'd never be able to use a # character in a string, which would be a pretty serious problem.
However, you can post-process the string to remove comments, as long as you know this particular string isn't going to have any other # characters.
For example:
import re

s = """
1,1,2,3,5,8,13 # numbers in the Fibonacci sequence
1,4,9,16,25,36,49 # numbers of the square number sequence
1,1,2,5,14,42,132,429 # numbers in the Catalan number sequence
"""
s = re.sub(r'#.*', '', s)
If you also want to remove trailing whitespace before the #, change the regex to r'\s*#.*'.
If you don't understand what these regexes are matching and how, see regex101 for a nice visualization.
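For comparison, the same clean-up can be sketched without a regex, working line by line. The helper name strip_comments is made up, and it relies on the same assumption as above: the data itself never contains a # character. It also drops the blank lines that the triple-quoted string leaves at the start and end, which matters if the result goes straight into a csv file.

```python
def strip_comments(text, marker="#"):
    """Remove marker-to-end-of-line comments, trailing whitespace and blank lines."""
    cleaned = []
    for line in text.splitlines():
        data = line.split(marker, 1)[0].rstrip()  # keep only what precedes the marker
        if data:  # skip lines that were empty or comment-only
            cleaned.append(data)
    return "\n".join(cleaned)

raw = """
1,1,2,3,5,8,13 # numbers in the Fibonacci sequence
1,4,9,16,25,36,49 # numbers of the square number sequence
"""
clean = strip_comments(raw)
rows = [line.split(",") for line in clean.splitlines()]
print(clean)
```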
If you plan to do this many times in the same program, you can even use a trick similar to the popular D = textwrap.dedent idiom:
import functools
import re

C = functools.partial(re.sub, r'#.*', '')
And now:
s = C("""
1,1,2,3,5,8,13 # numbers in the Fibonacci sequence
1,4,9,16,25,36,49 # numbers of the square number sequence
1,1,2,5,14,42,132,429 # numbers in the Catalan number sequence
""")
The part where I go from the number values I obtained back to characters to spell out a word is not working. It says I need to use an integer for the last part?
# accept string
print "This program reduces and decodes a coded message and determines if it is a palindrome"
string=(str(raw_input("The code is:")))
# change it to lower case
string_lowercase=string.lower()
print "lower case string is:", string_lowercase
# strip special characters
specialcharacters="1234567890~`!##$%^&*()_-+={[}]|\:;'<,>.?/"
for char in specialcharacters:
    string_lowercase=string_lowercase.replace(char,"")
print "With the specials stripped out the string is:", string_lowercase
# input offset
offset=(int(raw_input("enter offset:")))
# conversion of text to ASCII code
result=[]
for i in string_lowercase:
    code=ord(i)
    result.append([code-offset])
# conversion from ASCII code to text
text=''.join(chr(i) for i in result)
print "The decoded string is:", text.format(chr(result))
It looks like you have a list of lists instead of a list of ints when you call result.append([code-offset]). This means later when you call chr(i) for i in result, you are passing a list instead of an int to chr().
Try changing this to result.append(code-offset).
Other small suggestions:
raw_input already gives you a string, so there's no need to explicitly cast it.
Your removal of special characters can be written more efficiently as:
special_characters = "1234567890~`!##$%^&*()_-+={[}]|\\:;'<,>.?/"
string_lowercase = ''.join(c for c in string_lowercase if c not in special_characters)
This iterates through string_lowercase once, instead of once per character in special_characters.
When appending to the list, use code-offset instead of [code-offset]. As written, you are storing each value as a one-item list instead of storing the ASCII value directly.
Hence your code should be:
result = []
for i in string_lowercase:
    code = ord(i)
    result.append(code-offset)
However, you can simplify this code to:
result = [ord(ch)-offset for ch in string_lowercase]
You can simplify even further. A one-liner to get the decoded string is:
decoded_string = ''.join(chr(ord(ch)-offset) for ch in string_lowercase)
Example with offset as 2:
>>> string_lowercase = 'abcdefghijklmnopqrstuvwxyz'
>>> offset = 2
>>> decoded_string = ''.join(chr(ord(ch)-offset) for ch in string_lowercase)
>>> decoded_string
'_`abcdefghijklmnopqrstuvwx'
You are passing a list to chr when it only accepts integers. Try result.append(code-offset). [code-offset] is a one-item list.
Specifically, instead of:
result=[]
for i in string_lowercase:
    code=ord(i)
    result.append([code-offset])
use:
result=[]
for i in string_lowercase:
    code=ord(i)
    result.append(code-offset)
If you understand list comprehension, this works too: result = [ord(i)-offset for i in string_lowercase]
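Putting the fix together, a minimal Python 3 sketch of the whole decode step might look like this. The names decode and is_palindrome are made up, and treating "special characters" as string.digits + string.punctuation is an assumption rather than the asker's exact list. Like the original, it subtracts the offset without wrapping around, so spaces and letters near the start of the alphabet can turn into non-letters.

```python
import string

# Assumption: "special characters" means ASCII digits and punctuation.
SPECIAL = set(string.digits + string.punctuation)

def decode(message, offset):
    """Lowercase, drop special characters, then shift every character down by offset."""
    cleaned = "".join(ch for ch in message.lower() if ch not in SPECIAL)
    # chr() needs an int, so pass ord(ch) - offset directly (not a one-item list)
    return "".join(chr(ord(ch) - offset) for ch in cleaned)

def is_palindrome(text):
    """True if text reads the same forwards and backwards."""
    return text == text[::-1]

decoded = decode("T#c2e!gEct", 2)
print(decoded, is_palindrome(decoded))
```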
If I am building a basic encryption program in python that reassigns A to C and D to F and so on, what is a simple algorithm I could use to do this?
I have a list named alphabet that holds each letter, then a variable that takes in the user input to change to the encrypted version.
str.translate should be the easiest way:
table = str.maketrans(
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "cdefghijklmnopqrstuvwxyzabCDEFGHIJKLMNOPQRSTUVWXYZAB"
)
s = "Test String"
print(s.translate(table))
Output:
Vguv Uvtkpi
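Rather than typing out the shifted alphabets by hand, the table can be built for any shift. This is a sketch, and make_shift_table is a made-up helper name; a negative shift gives the matching decryption table.

```python
import string

def make_shift_table(shift):
    """Build a str.translate table that rotates ASCII letters by shift positions."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    k = shift % 26  # also handles negative and large shifts
    return str.maketrans(
        lower + upper,
        lower[k:] + lower[:k] + upper[k:] + upper[:k],
    )

encrypt = make_shift_table(2)
decrypt = make_shift_table(-2)

secret = "Test String".translate(encrypt)
plain = secret.translate(decrypt)
print(secret, plain)
```

Characters that are not in the table, such as spaces and digits, pass through str.translate unchanged.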
There are two major parts to this: first, ciphering a single letter; and second, applying that to the whole string. We'll start with the first one.
You said you had a list with the alphabet in it. Suppose, too, that we have a letter.
>>> letter = 'F'
If we want to replace that letter with the letter two spaces down in the alphabet, first we'll probably want to find the numerical value of that letter. To do that, use index:
>>> alphabet.index(letter)
5
Next, you can add the offset to it and access it in the list again:
>>> alphabet[alphabet.index(letter) + 2]
'H'
But wait, this won't work if we try a letter like Z, because when we add the offset, we'll go off the end of the list and get an error. So we'll wrap the value around before getting the new letter:
>>> alphabet[(alphabet.index('Z') + 2) % len(alphabet)]
'B'
So now we know how to change a single letter. Python makes it easy to apply it to the whole string. First, put our single-letter version into a function:
>>> def cipher_letter(letter):
...     return alphabet[(alphabet.index(letter) + 2) % len(alphabet)]
...
We can use map to apply it over a sequence. Then we get an iterable of ciphered characters, which we can join back into a string.
>>> ''.join(map(cipher_letter, 'HELLOWORLD'))
'JGNNQYQTNF'
If you want to leave characters that are not in alphabet in place, add a test in cipher_letter that checks whether letter is in alphabet first and, if not, just returns letter. Voilà.
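A sketch of that last variation, assuming alphabet is a list of the 26 uppercase letters (the question does not show its exact contents). The offset parameter and the cipher wrapper are additions for illustration.

```python
import string

# Assumption: the asker's alphabet list holds the uppercase letters A-Z.
alphabet = list(string.ascii_uppercase)

def cipher_letter(letter, offset=2):
    """Shift one letter by offset, wrapping around; leave other characters alone."""
    if letter not in alphabet:
        return letter
    return alphabet[(alphabet.index(letter) + offset) % len(alphabet)]

def cipher(text, offset=2):
    """Apply cipher_letter to every character and join the results."""
    return "".join(cipher_letter(ch, offset) for ch in text)

encrypted = cipher("HELLO, WORLD!")
decrypted = cipher(encrypted, -2)
print(encrypted, decrypted)
```

Because alphabet.index scans the list on every call, this is slower than the str.translate approach for long inputs, but it keeps each step visible.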