I am able to convert Hindi text written in the Latin (English) script back to Devanagari:
import codecs,string
from indic_transliteration import sanscript
from indic_transliteration.sanscript import SchemeMap, SCHEMES, transliterate

def is_hindi(character):
    maxchar = max(character)
    if u'\u0900' <= maxchar <= u'\u097f':
        return character
    else:
        print(transliterate(character, sanscript.ITRANS, sanscript.DEVANAGARI))

character = 'bakrya'
is_hindi(character)
Output:
बक्र्य
But if I try something like this, nothing gets converted:
character = 'Bakrya विकणे आहे'
is_hindi(character)
Output:
Bakrya विकणे आहे
Expected Output:
बक्र्य विकणे आहे
I also tried the library Polyglot but I am getting similar results with it.
Preface: I know nothing of Devanagari, so you will have to bear with me.
First, consider your function. It can return two things, character or None (print just outputs something, it doesn't actually return a value). That makes your first output example originate from the print function, not Python evaluating your last statement.
Then, when you consider your second test string, it will see that there's some Devanagari text and just return the string back. What you have to do, if this transliteration works as I think it does, is to apply this function to every word in your text.
I modified your function to:
def is_hindi(character):
    maxchar = max(character)
    if u'\u0900' <= maxchar <= u'\u097f':
        return character
    else:
        return transliterate(character, sanscript.ITRANS, sanscript.DEVANAGARI)
and modified your call to
' '.join(map(is_hindi, character.split()))
Let me explain, from right to left. First, I split your test string into the separate words with .split(). Then, I map (i.e., apply the function to every element) the new is_hindi function to this new list. Last, I join the separate words with a space to return your converted string.
Output:
'बक्र्य विकणे आहे'
If I may suggest, I would place this splitting/mapping functionality into another function, to make things easier to apply.
Edit: I had to modify your test string from 'Bakrya विकणे आहे' to 'bakrya विकणे आहे' because B wasn't being converted. This can be fixed in a generic text with character.lower().
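The suggestion above (wrapping the split/map/join step in its own function) could be sketched roughly as follows. The names contains_devanagari, convert_mixed_text and fake_transliterate are hypothetical, and the transliterator is passed in as a callable so the sketch runs without the library installed. It also checks every character of a word rather than only the largest one, which is a small deviation from the original test.

```python
def contains_devanagari(word):
    """Return True if any character falls in the Devanagari Unicode block."""
    return any('\u0900' <= ch <= '\u097f' for ch in word)


def convert_mixed_text(text, to_devanagari):
    """Transliterate only the words that contain no Devanagari characters.

    `to_devanagari` is a hypothetical callable that takes one Latin-script
    word and returns its Devanagari form.
    """
    converted = []
    for word in text.split():
        if contains_devanagari(word):
            converted.append(word)  # already Devanagari: keep as is
        else:
            # Lower-case first, since capital letters were not converted.
            converted.append(to_devanagari(word.lower()))
    return ' '.join(converted)


# Stand-in transliterator, used only to show the control flow.
def fake_transliterate(word):
    return '[' + word + ']'


print(convert_mixed_text('Bakrya विकणे आहे', fake_transliterate))
```

With the library from the question, the callable would presumably be something like lambda w: transliterate(w, sanscript.ITRANS, sanscript.DEVANAGARI).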
I have lists of strings, some are hashtags - like #rabbitsarecool others are short pieces of prose like "My rabbits name is fred."
I have written a program to separate them:
def seperate_hashtags_from_prose(*strs):
    props = []
    hashtags = []
    for x in strs:
        if x[0]=="#" and x.find(' ')==-1:
            hashtags += x
        else:
            prose += x
    return hashtags, prose

seperate_hashtags_from_prose(["I like cats","#cats","Rabbits are the best","#Rabbits"])
This program does not work. In the above example, when I debug it, it tells me that on the first loop:
x=["I like cats","#cats","Rabbits are the best","#Rabbits"].
This is not what I would have expected. My intuition is that something about the way the loop over optional arguments is constructed is causing an error, but I can't see why.
There are several issues.
The most obvious is switching between props and prose. The code you posted does not run.
As others have commented, if you use the * in the function call, you should not make the call with a list. You could use seperate_hashtags_from_prose("I like cats","#cats","Rabbits are the best","#Rabbits") instead.
The line hashtags += x does not do what you think it does. When you use += with a list on the left and a string on the right, the list is extended with each character of the string, so "#cats" would be added as five separate one-character items. You probably meant hashtags.append(x) instead.
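Putting the three points together, a corrected version might look like the sketch below. The function is renamed to separate_hashtags_from_prose here (spelling only), and the variable names items and chars are illustrative.

```python
def separate_hashtags_from_prose(*strs):
    """Split the arguments into hashtags and prose."""
    hashtags = []
    prose = []
    for s in strs:
        # A hashtag starts with '#' and contains no spaces.
        if s.startswith('#') and ' ' not in s:
            hashtags.append(s)
        else:
            prose.append(s)
    return hashtags, prose


items = ["I like cats", "#cats", "Rabbits are the best", "#Rabbits"]
# The * unpacks the list into separate positional arguments.
hashtags, prose = separate_hashtags_from_prose(*items)
print(hashtags)  # ['#cats', '#Rabbits']
print(prose)     # ['I like cats', 'Rabbits are the best']

# For comparison: += with a string extends the list character by character.
chars = []
chars += "#cats"
print(chars)     # ['#', 'c', 'a', 't', 's']
```

Using startswith also avoids an IndexError on an empty string, which x[0] would raise.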
def double(text_var):
    for i in range(len(text_var)):
        if text_var[i]=='a' and text_var[i+1]=='b':
            text_var=text_var.replace(text_var[i],'0')
            text_var=text_var.replace(text_var[i+1],'')
    return text_var
I'm trying to achieve something like this: ab -> 0
Both characters should be replaced by one character (zero in this case).
This should work fine:
def double(text_var):
    return text_var.replace("ab", "0")
Here are the reasons your code doesn't work:
Python indexes begin at 0, so the last valid index is len(text_var) - 1. On the final iteration, text_var[i+1] reaches past the end of the string, which is why you get a "string index out of range" error. To fix that, change
for i in range(len(text_var))
to
for i in range(len(text_var)-1)
This part of your code:
text_var=text_var.replace(text_var[i],'0')
text_var=text_var.replace(text_var[i+1],'')
Although you passed text_var[i] and text_var[i+1] into the replace() method, those expressions are evaluated first, so replace() only receives the characters 'a' and 'b'. Python then replaces every a and every b in the string, whether or not they are adjacent.
The correct way to approach this is like this:
def double(text_var):
    return text_var.replace("ab", "0")
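A small experiment can make the difference between character-wise and substring replacement visible. The sample string and the variable names below are made up for illustration.

```python
text = "cab and banana"

# Character-wise replacement touches every 'a' and every 'b' in the string.
charwise = text.replace("a", "0").replace("b", "")
print(charwise)   # c0 0nd 0n0n0

# Substring replacement only touches the adjacent pair "ab".
pairwise = text.replace("ab", "0")
print(pairwise)   # c0 and banana
```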
Not quite sure what the correct title should be.
I have a function with 2 inputs: def color_matching(color_old, color_new). This function should check the strings in both arguments and assign a new string to either one if there is a hit.
def color_matching(color_old, color_new):
    if ('<color: none' in color_old):
        color_old = "NoHighlightColor"
    elif ('<color: none' in color_new):
        color_new = "NoHighlightColor"
And so forth. The problem is that each of the arguments can be matched to 1 of 14 different categories ("NoHighlightColor" being one of them). I'm sure there is a better way to do this than repeating the if statement 28 times for each mapping but I'm drawing a blank.
You can first parse your input arguments. If, for example, an argument looks like this:
old_color='<color: none attr:ham>'
you can parse it to get only the value of the relevant attribute you need:
_old_color=old_color.split(':')[1].split()[0]
That way _old_color='none'
Then you can use a dictionary such as {'none':'NoHighlightColor'}; let's call it colors_dict
old_color=colors_dict.get(_old_color, old_color)
That way if _old_color exists as a key in the dictionary old_color will get the value of that key, otherwise, old_color will remain unchanged
So your final code should look similar to this:
def color_matching(color_old, color_new):
    """ Assuming you've predefined colors_dict """
    # Parsing to get both colors
    _old_color = color_old.split(':')[1].split()[0]
    _new_color = color_new.split(':')[1].split()[0]
    # Checking if the first one is a hit
    _result_color = colors_dict.get(_old_color, None)
    # If it was a hit (not None) then assign it to the first argument
    if _result_color:
        color_old = _result_color
    else:
        color_new = colors_dict.get(_new_color, color_new)
    # Return both values so the caller can see the result
    return color_old, color_new
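As a self-contained illustration of the dictionary approach, here is a minimal sketch. The contents of colors_dict (including 'YellowHighlight') and the helper parse_color are hypothetical, and the sketch maps both arguments independently rather than stopping at the first hit. It also assumes the color value is followed by whitespace, as in '<color: none attr:ham>'.

```python
# Hypothetical mapping; the real one would hold all 14 categories.
colors_dict = {
    'none': 'NoHighlightColor',
    'yellow': 'YellowHighlight',
}


def parse_color(raw):
    """Extract the value after 'color:' from e.g. '<color: none attr:ham>'."""
    return raw.split(':')[1].split()[0]


def color_matching(color_old, color_new):
    """Map each argument to its category, leaving unknown values unchanged."""
    color_old = colors_dict.get(parse_color(color_old), color_old)
    color_new = colors_dict.get(parse_color(color_new), color_new)
    return color_old, color_new


print(color_matching('<color: none attr:ham>', '<color: yellow attr:eggs>'))
# ('NoHighlightColor', 'YellowHighlight')
```

Adding a new category then means adding one dictionary entry instead of two more if branches.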
You can replace conditionals with a data structure:
def match(color):
    matches = {'<color: none': 'NoHighlightColor', ... }
    for substring, ret in matches.items():
        if substring in color:
            return ret
But you seem to have a problem that requires a proper parser for the format you are trying to recognize.
You might build one from simple string operations like "<color:none jaja:a>".split(':')
You could maybe hack one with a massive regex.
Or use a powerful parser generated by a library like this one
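For the regex route, a modest pattern may already be enough. The sketch below is an assumption about the input format (a "<color:" prefix followed by a single word); COLOR_PATTERN and extract_color are hypothetical names.

```python
import re

# Hypothetical pattern: captures the word that follows "<color:",
# allowing optional whitespace after the colon.
COLOR_PATTERN = re.compile(r'<color:\s*(\w+)')


def extract_color(raw):
    """Return the color word, or None if the string does not match."""
    match = COLOR_PATTERN.search(raw)
    return match.group(1) if match else None


print(extract_color('<color: none attr:ham>'))  # none
print(extract_color('<color:none jaja:a>'))     # none
print(extract_color('no color here'))           # None
```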
def Change(_text):
    L = len(_text)
    _i = 2
    _text[_i] = "*"
    _i += 2
    print(_text)
How can I add a mark (e.g. *) at every second index in a string?
Why are you using _ in your variables? If it is for any of these reasons then you are OK, if it is a made up syntax, try not to use it as it might cause unnecessary confusion.
As for your code, try:
def change_text(text):
    for i in range(len(text)):
        if i % 2 == 0:  # i is even
            print(text[:i] + "*" + text[i+1:])
When you run change_text("tryout string") the output will look like:
*ryout string
tr*out string
tryo*t string
tryout*string
tryout s*ring
tryout str*ng
tryout strin*
If you meant something else, give an example input and the desired output.
See How to create a Minimal, Complete, and Verifiable example
PS: Please realize that strings are immutable in Python, so you cannot actually change a string, only create new ones from it. If you want to change it in place, you might be better off storing it as a list, for example. Like they have done here.
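The list-based approach mentioned in the PS could be sketched like this. The function name mark_every_second and its marker parameter are hypothetical, and the sketch assumes the mark should replace the character at every even index, as in the loop above.

```python
def mark_every_second(text, marker="*"):
    """Replace the character at every even index with the marker."""
    chars = list(text)            # lists are mutable, strings are not
    for i in range(0, len(chars), 2):
        chars[i] = marker
    return "".join(chars)         # build a new string from the list


print(mark_every_second("tryout"))  # *r*o*t
```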
Are you trying to separate every two letters with an asterisk?
testtesttest
te*st*te*st*te*st
You could do this using itertools.zip_longest to split the string up, and '*'.join to rebuild it with the markers inserted
from itertools import zip_longest

def add_marker(s):
    return '*'.join([''.join(x) for x in zip_longest(*[iter(s)]*2, fillvalue='')])
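If the zip_longest idiom feels opaque, plain slicing should give the same result. This is an alternative sketch, not the answer's method; add_marker_sliced and its size and marker parameters are hypothetical.

```python
def add_marker_sliced(s, size=2, marker='*'):
    """Insert the marker between consecutive chunks of `size` characters."""
    chunks = [s[i:i + size] for i in range(0, len(s), size)]
    return marker.join(chunks)


print(add_marker_sliced("testtesttest"))  # te*st*te*st*te*st
print(add_marker_sliced("abcde"))         # ab*cd*e
```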
I'm super new to Python, and I'm trying to create a very simple function to be used in a larger map coloring program.
The idea of the function is to have a set of variables attributed to different regions (string1) with colors assigned to them, (r,g,b) and then test if the regions touch another region of the same color by recursively looking through a set of region borders (string2) to find variables+colors that match.
The input format would look like this:
("Ar, Bg, Cb", "AB,CB,CA")
This would return True, meaning no two regions of the same color touch.
Here's my code segment so far:
def finding_double_char_function(string1, string2):
    if string2=="":
        return True
    elif string2[0]+"r" and string2[1]+"r" in string1 or string1[::-1]:
        return False
    elif string2[0]+"g" and string2[1]+"g" in string1 or string1[::-1]:
        return False
    elif string2[0]+"b" and string2[1]+"b" in string1 or string1[::-1]:
        return False
    else:
        return finding_double_char_function(string1, (string2[3:]))
I keep getting False when I expected True. Can anyone help? Thanks a lot.
You have several problems in this, but your main problem is operator precedence: "in" binds more tightly than "and", which in turn binds more tightly than "or". What you've written is a little more readable like this:
elif ((string2[0]+"r" and (string2[1]+"r" in string1))
      or string1[::-1]):
In other words, you've used strings as boolean values. The value you get from this is not what you expected: string1[::-1] is a non-empty string, so the whole condition is always true and the function returns False on the first check. I think what you're trying to do is to see whether either constructed string (such as "Ar") is in string1, either forward or backward.
"in" tests only one pair of operands; there's no distributive property of "and" and "or" over "in".
Here's the first part rewritten properly:
elif (string2[0]+"r" in string1 and
      string2[1]+"r" in string1):
Does this get you going?
Also, stick in print statements to trace your execution and print out useful values along the way.
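To check how the original condition evaluates, a short experiment along these lines may help. It uses the example input from the question; the variable names original and fixed are illustrative.

```python
string1 = "Ar, Bg, Cb"
string2 = "AB,CB,CA"

# Parsed as: (("Ar") and ("Br" in string1)) or string1[::-1]
original = string2[0] + "r" and string2[1] + "r" in string1 or string1[::-1]
print(repr(original))   # the reversed string, which is truthy
print(bool(original))   # True

# Each membership test written out explicitly.
fixed = (string2[0] + "r" in string1) and (string2[1] + "r" in string1)
print(fixed)            # False: "Br" is not in string1
```

Because the reversed string is never empty here, the original elif branch is taken every time.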
If I understood correctly, your problem could be solved like this:
def intersect(str1, str2):
    if (not str2):
        return True
    if (str1[str1.find(str2[0]) + 1] == str1[str1.find(str2[1]) + 1]):
        return False
    else:
        return intersect(str1, str2[3:])
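A non-recursive variant that first parses the colors into a dictionary might be easier to debug. This is a sketch under the assumption that every region name and every color is a single character; the function name no_same_color_neighbors is hypothetical.

```python
def no_same_color_neighbors(regions, borders):
    """Return True if no two bordering regions share a color."""
    # "Ar, Bg, Cb" -> {"A": "r", "B": "g", "C": "b"}
    colors = {}
    for item in regions.split(","):
        item = item.strip()
        if item:
            colors[item[0]] = item[1]

    # "AB,CB,CA" -> check each bordering pair in turn.
    for pair in borders.split(","):
        pair = pair.strip()
        if pair and colors[pair[0]] == colors[pair[1]]:
            return False
    return True


print(no_same_color_neighbors("Ar, Bg, Cb", "AB,CB,CA"))  # True
print(no_same_color_neighbors("Ar, Br, Cb", "AB,CB,CA"))  # False
```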