How to convert string to seo-url? - python

I would like to convert an accented string to an SEO-friendly URL...
For instance:
"Le bébé (de 4 ans) a également un étrange "rire""
to:
"le-bebe-de-4-ans-a-egalement-un-etrange-rire"
Any solution, please?
Thanks !

This is what I use:
def _doStringSEOptiomization(objectName, pageName, lang, objectId):
    """
    Takes the name of an offer as input and performs these steps:
    1- Converts every accented vowel variant to its plain vowel
    2- Through a series of regexes, removes the unwanted characters and returns
       a string suitable for a search-engine-friendly, indexable link
    """
    try:
        import re  # import the regex module
        Speaker.log_debug(GREEN("core.ws_site.do_sites_offers_data_redux._doStringSEOptiomization() input: objectName=%s, pageName=%s, lang=%s, objectId=%s" % (objectName, pageName, lang, objectId)))
        # map accented vowel variants (HTML entities already decoded) to plain vowels
        vocalMap = {'a': ['à', 'á', 'â', 'ã', 'ä', 'å', 'ā', 'æ'],
                    'e': ['è', 'é', 'ê', 'ë', 'ē'],
                    'i': ['ì', 'í', 'î', 'ï', 'ī'],
                    'o': ['ò', 'ó', 'ô', 'õ', 'ö', 'ō', 'œ'],
                    'u': ['ù', 'ú', 'û', 'ü', 'ū']
                    }
        objectName = objectName.lower()  # lowercase the input string
        for vocale, lista in vocalMap.iteritems():  # each map entry yields a key and a list
            for elemento in lista:  # iterate over every element of the list
                objectName = objectName.replace(elemento, vocale)  # replace the accented character with the plain vowel
        objectName = objectName.replace("/", "-")
        objectName = re.sub("[^a-z0-9_\s-]", "", objectName)  # strip all unwanted characters
        objectName = re.sub("[\s-]+", " ", objectName)        # collapse runs of spaces and dashes
        objectName = re.sub("[\s_]", "-", objectName)         # turn spaces and underscores into dashes
        objectName = pageName + "--" + objectName
        objectName += "-" + lang + "-" + str(objectId)  # append the language and the offer id
    except Exception, s:
        Speaker.log_error("_doStringSEOptiomization(): Error=%s" % RED(s))
    return objectName
You have to adapt it for your situation.

This might (or might not) be enough:
import re
import unidecode

def normalized_id(title):
    title = unidecode.unidecode(title).lower()
    return re.sub('\W+', '-', title.replace("'", '')).strip('-')
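A quick check with the string from the question (assuming the third-party unidecode package is installed):
print(normalized_id(u'Le bébé (de 4 ans) a également un étrange "rire"'))
# expected output: le-bebe-de-4-ans-a-egalement-un-etrange-rire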

>>> a = u'Le bébé (de 4 ans) a également un étrange "rire"'
>>> r = unicodedata.normalize('NFKD',a).encode('cp1256','ignore')
>>> r = unicode(re.sub('[^\w\s-]','',r).strip().lower())
>>> r = re.sub('[-\s]+','-',r)
>>> print r
le-bebe-de-4-ans-a-egalement-un-etrange-rire
I use cp1256 here to handle accented characters... (note: despite my parenthetical, cp1256 is the Windows Arabic code page, not Latin-1; encoding to 'ascii' with 'ignore' after the NFKD normalization works just as well)
Perfect ! Thanks a lot !

If you have Django around, you can use its defaultfilter slugify (or adapt it for your needs).
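For example (a minimal sketch, assuming Django is installed; slugify is importable from django.template.defaultfilters, which backs the template filter):
from django.template.defaultfilters import slugify

print(slugify(u'Le bébé (de 4 ans) a également un étrange "rire"'))
# le-bebe-de-4-ans-a-egalement-un-etrange-rire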

[~]$ python
Python 2.7.1 (r271:86882M, Nov 30 2010, 10:35:34)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import unicodedata
>>> import re
>>> def seo_string(x):
...     r = unicodedata.normalize('NFKD',x).encode('ascii','ignore')
...     r = unicode(re.sub('[^\w\s-]','',x).strip().lower())
...     return re.sub('[-\s]+','-',r)
...
>>> seo_string(u'Le bébé (de 4 ans) a également un étrange "rire"')
u'le-bb-de-4-ans-a-galement-un-trange-rire'
With thanks to Django's great built-in slugify filter; however, as written it won't replace é with e the way the solution posted by @doncallisto does (the second line of seo_string passes x, the original string, to re.sub instead of r, so the NFKD/ASCII step is effectively discarded).

Here are several ways to do it: Generating Slugs by Armin Ronacher.
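For reference, a minimal Python 2 sketch in the same spirit as that post (my own reconstruction, not a verbatim copy):
import re
from unicodedata import normalize

_punct_re = re.compile(r'[\t !"#$%&\'()*\-/<=>?@\[\\\]^_`{|},.]+')

def slugify(text, delim=u'-'):
    """Generate an ASCII-only slug from a unicode string."""
    result = []
    for word in _punct_re.split(text.lower()):
        word = normalize('NFKD', word).encode('ascii', 'ignore')
        if word:
            result.append(word)
    return unicode(delim.join(result))

print(slugify(u'Le bébé (de 4 ans) a également un étrange "rire"'))
# le-bebe-de-4-ans-a-egalement-un-etrange-rire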


How can I modify an "if condition" in order to apply it to different lists at the same time?

I wrote a script to extract sentences containing a particular pattern from a huge set. The issue is that, for some patterns, I check the value of an attribute at the beginning or the end of the match to see whether the word is present in a particular list. I have 4 dictionaries, each with two lists of positive and negative words. So far I am able to use the function I wrote with one dictionary, and I am wondering how to improve it so that I can use it with the 4 dictionaries at the same time without duplicating the block that loops over the dictionary.
I give an example with two dictionaries (since the script is quite long, this is a small example with all the necessary elements):
import os

import pandas as pd
import spacy
import spacy.attrs
from spacy.attrs import POS
from spacy import displacy
from spacy.lang.fr import French
from spacy.tokenizer import Tokenizer
from spacy.util import compile_prefix_regex, compile_infix_regex, compile_suffix_regex
from spacy.lemmatizer import Lemmatizer
from spacy.matcher import Matcher

nlp = spacy.load("fr_core_news_md")

# ##################### Lexicons
# Diko lexicon
lexicon = open(os.path.join('/h/Ressources/Diko.txt'), 'r', encoding='utf-8')
data = pd.read_csv(lexicon, sep=";", header=None)
data.columns = ["id", "terme", "pol"]
pol_diko_pos = data.loc[data.pol == 'positive', 'terme']
liste_pos_D = list(pol_diko_pos)
print(liste_pos_D[1])
pol_diko_neg = data.loc[data.pol == 'negative', 'terme']
liste_neg_D = list(pol_diko_neg)
#print(type(liste_neg_D))

# Polarimots lexicon
lexicon_p = open(os.path.join('/h/Ressources/polarimots.txt'), 'r', encoding='utf-8')
data_p = pd.read_csv(lexicon_p, sep="\t", header=None)
#data.columns = ["terme", "pol", "pos", "degre"]
data_p.columns = ["ind", "terme", "cat", "pol", "fiabilité"]
pol_polarimot_pos = data_p.loc[data_p.pol == 'POS', 'terme']
liste_pos_P = list(pol_polarimot_pos)
print(liste_pos_P[1])
pol_polarimot_neg = data_p.loc[data_p.pol == 'NEG', 'terme']
liste_neg_P = list(pol_polarimot_neg)
#print(type(liste_neg_P))

# ############################# Result lists
sentence_not_extract_lexique_1 = []  # all sentences without the specified pattern
sentence_extract_lexique_1 = []      # sentences whose pattern[0] is present in the first lexicon
sentence_not_extract_lexique_2 = []  # all sentences without the specified pattern
sentence_extract_lexique_2 = []      # sentences whose pattern[0] is present in the second lexicon
list_token_pos = []        # tokens found in the lexicon
list_token_neg = []        # tokens found in the lexicon
list_token_not_found = []  # tokens not found in the lexicon

# PATTERNS
pattern1 = [{"POS": {"IN": ["VERB", "AUX", "ADV", "NOUN", "ADJ"]}},
            {"IS_PUNCT": True, "OP": "*"},
            {"LOWER": "mais"}]
pattern1_tup = (pattern1, 1, True)
pattern3 = [{"LOWER": {"IN": ["très", "trop"]}},
            {"POS": {"IN": ["ADV", "ADJ"]}}]
pattern3_tup = (pattern3, 0, True)
pattern4 = [{"POS": "ADV"},  # negation adverb
            {"POS": "PRON", "OP": "*"},
            {"POS": {"IN": ["VERB", "AUX"]}},
            {"TEXT": {"IN": ["pas", "plus", "aucun", "aucunement", "point", "jamais", "nullement", "rien"]}}]
pattern4_tup = (pattern4, None, False)

# Tuple of patterns
pattern_list_tup = [pattern1_tup, pattern3_tup, pattern4_tup]
pattern_name = ['first', 'second', 'third', 'fourth']
length_of_list = len(pattern_list_tup)
print('length', length_of_list)

# index of the attribute value to check in the lexicon
value_of_attribute = [0, -1, -1]

# Lexicons to use ([negative, positive])
lexique_1 = [liste_neg_D, liste_pos_D]
lexique_2 = [liste_neg_P, liste_pos_P]

# text (example of some sentences)
file = ["Le film est superbe mais cette édition DVD est nulle !",
        "J'allais dire déplorable, mais je serais peut-être un peu trop extrême.",
        "Hélas, l'impression de violence, bien que très bien rendue, ne sauve pas cette histoire gothique moderne de la sécheresse scénaristique, le tout couvert d'un adultère dont le propos semble être gratuit, classique mais intéressant...",
        "Tout ça ne me donne pas envie d'utiliser un pieu mais plutôt d'aller au pieu (suis-je drôle).",
        "Oui biensur, il y a la superbe introduction des parapluies au debut, et puis lorsqu il sent des culs tout neufs et qu il s extase, j ai envie de faire la meme chose apres sur celui de ma voisine de palier (ma voisine de palier elle a un gros cul, mais j admets que je voudrais bien lui foute mon tarin), mais c est tout, apres c est un film tres noir, lent et qui te plonge dans le depression.",
        "Et bien hélas ce DVD ne m'a pas appris grand chose par rapport à la doc des agences de voyages et la petite dame qui fait ses dessins est bien gentille mais tout tourne un peu trop autour d'elle.",
        "Au final on passe de l'un a l'autre sans subtilité, et on n'arrive qu'à une caricature de plus : si Kechiche avait comme but initial déclaré de fustiger les préjugés, c'est le contraire qui ressort de ce ''film'' truffé de clichés très préjudiciables pour les quelques habitants de banlieue qui ne se reconnaîtront pas dans cette lourde farce.",
        "-ci écorche les mots, les notes... mais surtout nos oreilles !"]

# Loop to check each sentence and extract the sentences with the specified pattern from above
for pat in range(0, length_of_list):
    matcher = Matcher(nlp.vocab)
    matcher.add("matching_2", None, pattern_list_tup[pat][0])
    # print(pat)
    # print(pattern_list_tup[pat][0])
    for sent in file:
        doc = nlp(sent)
        matches = matcher(doc)
        for match_id, start, end in matches:
            span = doc[start:end].lemma_.split()
            #print(f"{pattern_name[pat]} pattern found: {span}")
This is the part I want to modify in order to use it with another dictionary; the goal is to be able to retrieve the sentences extracted by 4 different dictionaries, make a comparison, and then check which sentences are present in more than two lists.
            # Condition to use the lexicon and extract the sentence
            if pattern_list_tup[pat][2]:
                if span[value_of_attribute[pat]] in lexique_1[pattern_list_tup[pat][1]]:
                    if sent not in sentence_extract_lexique_1:
                        sentence_extract_lexique_1.append(sent)
                    if pattern_list_tup[pat][1] == 1:
                        list_token_pos.append(span[value_of_attribute[pat]])
                    if pattern_list_tup[pat][1] == 0:
                        list_token_neg.append(span[value_of_attribute[pat]])
                else:
                    list_token_not_found.append(span[value_of_attribute[pat]])  # the text form is not in the lexicon, the lemma form is needed
                    sentence_not_extract_lexique_1.append(sent)
            else:
                if sent not in sentence_extract_lexique_1:
                    sentence_extract_lexique_1.append(sent)

print(len(sentence_extract_lexique_1))
print(sentence_extract_lexique_1)
One solution I found is to duplicate the code above and change the name of the list where the sentences are stored, but since I have 2 dictionaries (actually 4 in the original), duplicating it would make the code much longer. Is there a way to combine looping over the 2 dictionaries and append the result to the right list? So, for example, when I use lexique_1, all the sentences extracted are sent to sentence_extract_lexique_1, and so on for the others.
In my opinion, try using an if-elif-else chain, or just an if-elif block, since the elif statement catches the specific condition of interest: in your case, the specific comparison you want to check against the sentences. Keep in mind that the if-elif-else chain is a good approach, but it only works when you need a single test to pass: Python stops at the first test that passes and skips the rest. It is efficient and lets you test for one specific condition.
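A minimal sketch of the deduplication the question asks for, under the assumption that the lexicons, patterns and helper lists defined above are available (the names lexicons and results below are illustrative, not from the original script): pair each lexicon with its own result lists and loop over that mapping, so the matching block is written only once.
# Hypothetical sketch: one mapping from lexicon name to its word lists and result lists.
lexicons = {
    "lexique_1": [liste_neg_D, liste_pos_D],   # [negative, positive]
    "lexique_2": [liste_neg_P, liste_pos_P],
}
results = {name: {"extract": [], "not_extract": []} for name in lexicons}

for pat in range(len(pattern_list_tup)):
    matcher = Matcher(nlp.vocab)
    matcher.add("matching_2", None, pattern_list_tup[pat][0])
    for sent in file:
        doc = nlp(sent)
        for match_id, start, end in matcher(doc):
            span = doc[start:end].lemma_.split()
            token = span[value_of_attribute[pat]]
            for name, lexique in lexicons.items():
                if not pattern_list_tup[pat][2]:
                    # pattern that does not use a lexicon: keep the sentence directly
                    if sent not in results[name]["extract"]:
                        results[name]["extract"].append(sent)
                elif token in lexique[pattern_list_tup[pat][1]]:
                    if sent not in results[name]["extract"]:
                        results[name]["extract"].append(sent)
                else:
                    results[name]["not_extract"].append(sent)

# Sentences extracted by more than one lexicon:
common = set(results["lexique_1"]["extract"]) & set(results["lexique_2"]["extract"])
print(len(common))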

When using win32print + cx_Freeze, printing doesn't work and produces no error

Windows 10 (x64)
Python 3.6.3
cx_Freeze 5.1.1
pypiwin32 223 / pywin32 224
I made a module for printing; the module works fine when launched as a script.
Once passed through cx_Freeze, printing no longer works, and no error message is produced.
Here is my setup.py for creating builds (by: python setup.py build)
# -*- coding: utf-8 -*-
import sys, os
from cx_Freeze import setup, Executable

PythonPath = os.path.split(sys.executable)[0]  # get python path

includes = []
excludes = []
packages = ["win32print"]
includefiles = ["OptiWeb.ico"]

options = {"includes": includes,
           "excludes": excludes,
           "packages": packages,
           "include_files": includefiles,
           "optimize": 2
           }

base = None
if sys.platform == 'win32':
    base = "Win32GUI"
    options["include_msvcr"] = True

executables = [Executable(script="OptiPrint.py", base=base, icon="Optiweb.ico")]

setup(name="OptiPrint",
      options={"build_exe": options},
      version="1.0",
      author="ND",
      description="print test",
      executables=executables)
And now my code for printing:
# coding: utf8
import win32print


class ZPLLabel(object):

    def __init__(self, printerName):
        self.printerName = printerName
        self.printerDevice = win32print.OpenPrinter(self.printerName)
        self.job = win32print.StartDocPrinter(self.printerDevice, 1, ("Etiquette", None, "RAW"))
        self.eraseAll()
        self.defineFormat()

    def eraseAll(self):
        win32print.StartPagePrinter(self.printerDevice)
        str2print = "~JA"
        win32print.WritePrinter(self.printerDevice, str2print.encode("utf8"))  # write the label format
        win32print.EndPagePrinter(self.printerDevice)  # signal the end of what there is to print
        self.printerDevice.close()  # close the print channel and trigger printing of what precedes
        #del self.job
        self.printerDevice = win32print.OpenPrinter(self.printerName)
        self.job = win32print.StartDocPrinter(self.printerDevice, 1, ("Etiquette", None, "RAW"))

    def defineFormat(self):
        margeLeft = 150
        margeTop = 20
        interLine = 39
        shiftLeft = 20
        vDec = 25
        #win32print.StartPagePrinter(p)
        str2print = "^XA\n"  # start of format
        str2print += "^CI28"
        # FO field origin: first value is the field x position in dots, second the y position in dots
        # the printer is 200 dpi (7.874 dots per mm; here 12.7 mm and 6.35 mm)
        # ADN: A ==> font, D ==> font D, N ==> normal orientation, then character height and width in dots
        # FD data to print for the field
        # FS end of the field
        str2print += "^DFFORMAT"
        str2print += "^LH" + str(margeLeft) + "," + str(margeTop)
        # a rounded frame
        str2print += "^FO0,0^GB500,330,3,B,2^FS"
        #str2print += "^FO"+str(shiftLeft)+","+str(interLine)+"^ADN,24,12^FDEtiquette de débit Sangle^FS\n"  # label title
        str2print += "^FO"+str(shiftLeft)+","+str(1*interLine-vDec)+"^ADN,32,14^FDOF N° : ^FS^FO"+str(shiftLeft+160)+","+str(1*interLine-vDec)+"^ADN,32,14^FN1^FS"
        str2print += "^FO"+str(shiftLeft)+","+str(2*interLine-vDec)+"^ADN,32,14^FDPRODUIT : ^FS^FO"+str(shiftLeft+215)+","+str(2*interLine-vDec)+"^ADN,32,14^FN2^FS"
        str2print += "^FO"+str(shiftLeft)+","+str(3*interLine-vDec)+"^ADN,24,12^FN3^FS"
        str2print += "^FO"+str(shiftLeft)+","+str(4*interLine-vDec)+"^ADN,32,14^FDSANGLE : ^FS^FO"+str(shiftLeft+200)+","+str(4*interLine-vDec)+"^ADN,32,14^FN4^FS"
        str2print += "^FO"+str(shiftLeft)+","+str(5*interLine-vDec)+"^ADN,24,12^FN5^FS"
        str2print += "^FO"+str(shiftLeft)+","+str(6*interLine-vDec)+"^ADN,28,13^FDNombre de coupe : ^FS^FO"+str(shiftLeft+250)+","+str(6*interLine-vDec)+"^ADN,28,13^FN6^FS"
        str2print += "^FO"+str(shiftLeft)+","+str(7*interLine-vDec)+"^ADN,28,13^FDLongueur coupée : ^FS^FO"+str(shiftLeft+250)+","+str(7*interLine-vDec)+"^ADN,28,13^FN7^FS"
        str2print += "^FO"+str(shiftLeft)+","+str(8*interLine-vDec)+"^ADN,24,12^FDEmplacement : ^FS^FO"+str(shiftLeft+160)+","+str(8*interLine-vDec)+"^ADN,24,12^FN8^FS"
        str2print += "^XZ"  # end of the label format
        win32print.WritePrinter(self.printerDevice, str2print.encode("utf8"))  # write the label format

    def printLabel(self, orderNum, productSku, productName, webSku, webName, partNum, partLength, emplacement):
        str2print = "^XA\n"  # start of the label
        str2print += "^XFFORMAT"  # recall the stored format
        str2print += "^FN1^FD"+orderNum+"^FS"
        str2print += "^FN2^FD"+productSku+"^FS"
        str2print += "^FN3^FD"+productName+"^FS"
        str2print += "^FN4^FD"+webSku+"^FS"
        str2print += "^FN5^FD"+webName+"^FS"
        str2print += "^FN6^FD"+str(partNum)+"^FS"
        str2print += "^FN7^FD"+partLength+"^FS"
        str2print += "^FN8^FD"+emplacement+"^FS"
        str2print += "^XZ"  # end of the label
        win32print.WritePrinter(self.printerDevice, str2print.encode("utf8"))  # write the label

    def endLabel(self):
        self.printerDevice.close()  # close the print channel and trigger printing of what precedes
        del self.job


def newPrintLabel():
    zpl = ZPLLabel('Zebra ZP 450 CTP')
    zpl.printLabel("20009999", "1035691", "Harnais Energy TWIN ss porte outil L/XL",
                   "90008318", "SA/SANGLE NOIRE 20 MM", 35, "0.38m", "Bavaroise réglable")
    zpl.endLabel()


if __name__ == '__main__':
    app = newPrintLabel()
I suppose some package or DLL is missing to make it run when frozen.
I tried to add win32api and win32com, but it doesn't change the result.
Thanks for your help, which is for sure welcome.
Try to use win32ui and win32con as done in the answers of python win32print not printing.
In this case you should also keep base defined as in your original question (regarding my comment to your question).
For those who experience such an issue:
My code was not properly written.
Each StartDocPrinter call must be matched by an EndDocPrinter call; apparently this does not cause trouble when run as a script, but it does have an impact on the frozen version.
So the correct sequence of instructions must be something like:
self.printerName = printerName
self.printerDevice = win32print.OpenPrinter(self.printerName)
self.job = win32print.StartDocPrinter(self.printerDevice, 1, ("Etiquette", None, "RAW"))
win32print.StartPagePrinter(self.printerDevice)
str2print="..." # define what you want to print
win32print.WritePrinter(self.printerDevice, str2print.encode("utf8")) #write
win32print.EndPagePrinter(self.printerDevice) # end the page
win32print.EndDocPrinter(self.printerDevice) # end the doc
self.printerDevice.close() # close the printer thread
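As a further illustration (my own sketch, not part of the original answer), the paired calls can be kept together with a context manager, using only documented win32print functions:
import contextlib
import win32print


@contextlib.contextmanager
def raw_print_job(printer_name, doc_name="Etiquette"):
    """Open a RAW print job and make sure the printer handle is always closed."""
    handle = win32print.OpenPrinter(printer_name)
    try:
        win32print.StartDocPrinter(handle, 1, (doc_name, None, "RAW"))
        win32print.StartPagePrinter(handle)
        yield handle
        win32print.EndPagePrinter(handle)   # end the page
        win32print.EndDocPrinter(handle)    # end the doc, matching StartDocPrinter
    finally:
        win32print.ClosePrinter(handle)


# usage (printer name as in the question):
# with raw_print_job('Zebra ZP 450 CTP') as h:
#     win32print.WritePrinter(h, "^XA...^XZ".encode("utf8"))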

Where is the error in my code trying to compare case-insensitively using Python?

I have code that reads files and compares the content with a user input, ignoring case.
I used a list comprehension to loop through the content and compare it with the user input.
The problem is that the list comprehension returns an empty list, although the entered word exists. Example:
textContent
Les hiboux
Charles Baudelaire
Cycle 3
*
POESIE
Sous les ifs noirs qui les abritent
Les hiboux se tiennent rangés
Ainsi que des dieux étrangers
Dardant leur œil rouge. Ils méditent.
Sans remuer ils se tiendront
Jusqu'à l'heure mélancolique
Où, poussant le soleil oblique,
Les ténèbres s'établiront.
Leur attitude au sage enseigne
Qu'il faut en ce monde qu'il craigne
Le tumulte et le mouvement ;
L'homme ivre d'une ombre qui passe
Porte toujours le châtiment
D'avoir voulu changer de place.
Les Fleurs du Mal
1857
Charles Pierre Baudelaire (1821 – 1867) est un poète français.
user-input: charl
word exist : Charles--charle--CHARLE
x=self.lineEditSearch.text()
print(x)
textString=self.ReadingFileContent(Item)
#self.varStr =[c for c in textString if c.islower() or c.isupper() or c.capitalize()]
self.varStr =[i for i in textString if i.lower() == x.lower()]
print(self.varStr)
If
user_input = "charl"
word_exist = ["Charles","charle","CHARLE","Hello"]
Then
output = [item for item in word_exist if user_input.lower() in item.lower()]
print(output)
# ['Charles', 'charle', 'CHARLE']
Is this what you are looking for?
Your problem is that you are only putting into self.varStr the members of textString that satisfy i.lower() == x.lower(), which means "being exactly equal to x (case-insensitively)".
You want to pick up members that contain x.
You can do that by changing i.lower() == x.lower() into x.lower() in i.lower()
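A minimal sketch applied to the code in the question, under the assumption that ReadingFileContent returns the whole file content as one string (in which case it also needs to be split into words before filtering):
x = self.lineEditSearch.text()
textString = self.ReadingFileContent(Item)

# split the content into words, then keep those containing the search term, ignoring case
self.varStr = [word for word in textString.split() if x.lower() in word.lower()]
print(self.varStr)  # e.g. ['Charles', 'Charles'] for the input "charl"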

Flask response character set

I'm developing my API and currently I store one of my fields in Spanish. The database is Postgres 9.5 (field type is varchar):
File Encoding : utf-8
Text field:
"text" varchar(65536) COLLATE "default"
When I return the value from text, I use flask_sqlalchemy to get my data.
class Serializer(object):
    """
    Serialize information from database to a JSON dictionary
    """

    def serialize(self):
        return {c: getattr(self, c) for c in inspect(self).attrs.keys()}

    @staticmethod
    def serialize_list(l):
        return [m.serialize() for m in l]


class AutoSerialize(object):
    """
    Mixin for retrieving public fields of model in json-compatible format
    """
    __public__ = None

    def get_public(self, exclude=(), extra=()):
        "Returns model's PUBLIC data for jsonify"
        data = {}
        keys = self._sa_instance_state.attrs.items()
        public = self.__public__ + extra if self.__public__ else extra
        for k, field in keys:
            if public and k not in public: continue
            if k in exclude: continue
            value = self._serialize(field.value)
            if value:
                data[k] = value
        return data

    @classmethod
    def _serialize(cls, value, follow_fk=False):
        if type(value) in (datetime,):
            ret = value.isoformat()
        elif hasattr(value, '__iter__'):
            ret = []
            for v in value:
                ret.append(cls._serialize(v))
        elif AutoSerialize in value.__class__.__bases__:
            ret = value.get_public()
        else:
            ret = value
        return ret
My field in my Model is defined as follows and my class inherits Serializer and AutoSerialize:
description = Column(String(65536), nullable=True)
This is how I return my values to API client:
articles = Model.Bits.query.order_by(Model.Bits.publishedAt.desc()).limit(10).all()

if articles:
    log.info('api() | %d Articles found ' % len(articles))
    response = []
    values = ['author', 'title', 'description', 'url', 'urlToImage', 'publishedAt']
    response = [{value: getattr(d, value) for value in values} for d in articles]
    return jsonify(articles=response, status='ok', source='my_source', sortBy='latest')
My response looks like this using curl:
{
"author": "Bros Lopez",
"description": "Spotify quiere ser m\u00e1s competitivo en su servicio de recomendaciones de contenido frente a marcas como Apple Music y Pandora. La empresa anunci\u00f3 la compra de la startup francesa Niland, la cual utiliza un sistema de inteligencia artificial para b\u00fasqueda y recomendaciones de contenido. Con 50 millones de usuarios activos Spotify quiere ser m\u00e1s rentable, pues a pesar de que el a\u00f1o pasado gener\u00f3 $3.1 mmdd en ventas, su margen bruto fue del 14%, pagando cerca de 2.7 mmdd en sellos discogr\u00e1ficos y editoriales. Por su parte, Pandora, unos de los principales competidores de Spotify, podr\u00eda ser adquirida por la empresa de radiodifusi\u00f3n SiriusXM, quien el a\u00f1o pasado le hizo una propuesta de compra por $3.4 mmdd. More Info",
"publishedAt": "Fri, 19 May 2017 20:00:00 GMT",
"title": "\u00bfPandora o Spotify?",
"url": "http://www.cnbc.com/2017/05/18/spotify-buys-niland-french-ai-music-startup.html",
"urlToImage": "https://ci4.googleusercontent.com/proxy/TWmEZRwlpPQrjs4HGZGx2041GryyquO7CjSR0oVBK-JUy4Xv3qHSiDow056iW8DV059chC93zFeXc4GVHKnzPpweUy-JzamK-l9pkW-Hgl1PnOun5s4XsE7K2NXBJljp-1Ltf5jyjfcn4j63Hv68FdFuqsw5UNTFBKkFug0=s0-d-e1-ft#https://gallery.mailchimp.com/f82949535ab2aab4bafde98f6/images/1f0dc47c-358b-4625-8744-105ffccfed98.jpg"
}
Is the encoding correct? I tried a different client and the characters are displayed correctly, so I'm not sure whether it is up to the client or the server to display the info properly.
It is the client's job to parse such characters: the \uXXXX sequences are valid JSON escapes, and curl obviously doesn't decode them "out of the box". Depending on the OS/shell/encoding you are using, there may be ways to pipe the response to some other command that decodes those characters, or some similar approach.
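If you would rather have the server emit raw UTF-8 instead of \u escapes, a minimal sketch (assuming a Flask version where jsonify still honours the JSON_AS_ASCII config flag; it was removed in Flask 2.3):
from flask import Flask, jsonify

app = Flask(__name__)
app.config['JSON_AS_ASCII'] = False  # jsonify then returns UTF-8 instead of \uXXXX escapes


@app.route('/demo')
def demo():
    return jsonify(description=u"La empresa anunció la compra de la startup francesa Niland")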

python - cut a string in 2 lines

I'm looking for a one-liner (using str.join, I think) to cut a long string when the number of words is too high. I have the beginning but I don't know how to insert \n.
example = "Au Fil Des Antilles De La Martinique a Saint Barthelemy"
nmbr_word = len(example.split(" "))
if nmbr_word >= 6:
# cut the string to have
result = "Au Fil Des Antilles De La\nMartinique a Saint Barthelemy"
Thanks
How about using the textwrap module?
>>> import textwrap
>>> s = "Au Fil Des Antilles De La Martinique a Saint Barthelemy"
>>> textwrap.wrap(s, 30)
['Au Fil Des Antilles De La', 'Martinique a Saint Barthelemy']
>>> "\n".join(textwrap.wrap(s, 30))
'Au Fil Des Antilles De La\nMartinique a Saint Barthelemy'
How about this (note that nmbr_word in the question is a count, so the word list itself is needed):
words = example.split(" ")
'\n'.join(' '.join(words[i:i+6]) for i in xrange(0, len(words), 6))
