PyEnchant Dict empty? - python

For context, I'm trying to use PyEnchant for a uni project where I write ciphers and try to decipher text without knowing the key. PyEnchant is meant to verify when the text has been deciphered, since it should detect once the deciphering algorithm is returning real words in the dictionary.
My issue comes from trying to use the French dictionary. Yesterday everything was working, but since I tried to set it up on my other computer, the French dictionary isn't working at all (the English dictionary works just fine).
For example, check("Bonjour") returns False and suggest("Bonj") returns an empty list...
I tried reinstalling pyenchant and manually re-added the .aff and .dic files for the French dictionary, but it hasn't changed anything. What's going on?
import enchant
de = enchant.Dict("en_US")
df = enchant.Dict("fr")
print(df.check("bonjour"))
print(enchant.dict_exists("fr"))
print(de.check("hello"))
>>False
>>True
>>True
For information, I placed the .aff and .dic files in my hunspell folder, and it worked yesterday, so I don't see why it wouldn't work today.
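Since dict_exists("fr") returns True while every lookup fails, one thing worth ruling out is that the copied .dic file is empty or truncated. The following is a minimal sketch using only the standard library, not a PyEnchant API: the helper name inspect_dic is hypothetical. It relies on the Hunspell convention that the first line of a .dic file holds the approximate word count and each following line holds one entry.

```python
from pathlib import Path


def inspect_dic(path, encoding="utf-8"):
    """Return (declared_count, actual_count) for a Hunspell .dic file."""
    # errors="replace" keeps the check working even if the file is really
    # ISO-8859-1/15 rather than UTF-8; we only count lines here.
    text = Path(path).read_text(encoding=encoding, errors="replace")
    lines = text.splitlines()
    if not lines:
        return 0, 0
    # First line: approximate number of entries (strip a possible BOM first).
    tokens = lines[0].lstrip("\ufeff").split()
    declared = int(tokens[0]) if tokens and tokens[0].isdigit() else 0
    # Remaining non-blank lines: one entry each, "word" or "word/FLAGS".
    actual = sum(1 for line in lines[1:] if line.strip())
    return declared, actual
```

If both numbers come back as 0, or the actual count is far below the declared one, the file was probably not copied correctly. If the counts look healthy, the problem more likely lies in which folder or file name the backend is actually picking up.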

Related

Is there a way to set the title and author metadata properties of a PDF in Python?

I need to create numerous ADA-compliant PDFs from Word documents. I used a code snippet based on comtypes.client, which works very well to create a PDF, but when I run the accessibility checker on the PDF, it gives a Title FAIL, and it lists my name as the author rather than my organization's name. Is there any way to set the title and author while making the PDF or, alternatively, after it is finished? I would prefer to use Python, but if there are any other simpler methods out there, I am game.
I have looked at PyPDF2, but it seems to only set 'custom metadata' rather than actually change or set the title/author properties. (Plus, the code snippet I tried from the web kept returning an error. I am not pasting the code, though, as I don't think it does what I need anyway.)
I cannot tell whether something like pdftk does what I need. I can't find any way to do it with the free version, and I see this example https://sejh.wordpress.com/2014/11/26/changing-pdf-titles-with-pdftk/, but at best it looks like it might work for the title but not the author, and I am not sure if there is an easy way to run the script for many PDFs...
I've also looked into EXIF, which seems to only read PDFs, but points to XMP as a way to write the metadata. Even after resurfacing from an extensive XMP rabbit hole, I still can't tell if it would be useful or not.
So I thought I would try here to see if anyone has a nice, easy Python solution or, if not, can point me to a rabbit hole worth going down, with any hints on how to navigate it to find an answer.
Much appreciated!
I was able to solve my issue at the Word document stage by using the core_properties attribute in Python's docx package (I was not aware of this attribute at the time of my original post).
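import docx
doc = docx.Document()  # pass a path, e.g. docx.Document('report.docx'), to open an existing file
cp = doc.core_properties
cp.author = 'author name'
cp.title = 'title content'
cp.subject = 'subject content'
doc.save('report.docx')  # hypothetical filename; saving writes the properties to disk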
When I then used Python's comtypes to convert the Word doc to PDF, the metadata transferred successfully.
The general reason you appear as Author is that the machine user is recorded as the author (even if that user is your secretary), so the easiest fix is to run the PDF app under an "Organisation" login. But as you point out, pdftk allows you to make changes, so stick with that. Use Python to write the needed changes to NewInfo.txt, then shell out to run pdftk.
InfoBegin
InfoKey: Creator
InfoValue: Stack OverFlow
InfoBegin
InfoKey: Producer
InfoValue: Status Quo
InfoBegin
InfoKey: Author
InfoValue: K Steinmann
InfoBegin
InfoKey: Title
InfoValue: Whatever You Want, Whatever You Need, dah dah dah...
pdftk input.pdf update_info NewInfo.txt output output.pdf
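The "write NewInfo.txt, then shell out" step could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function names build_info_text and update_pdf_info are made up here, pdftk is assumed to be on the PATH, and the command line is the one shown above.

```python
import subprocess
from pathlib import Path


def build_info_text(fields):
    """Render a dict of Info fields in the InfoBegin/InfoKey/InfoValue layout."""
    blocks = []
    for key, value in fields.items():
        blocks.append(f"InfoBegin\nInfoKey: {key}\nInfoValue: {value}\n")
    return "".join(blocks)


def update_pdf_info(input_pdf, output_pdf, fields, info_path="NewInfo.txt"):
    """Write the info file, then run: pdftk IN update_info INFO output OUT."""
    # Plain ASCII values are the safe case; for non-ASCII text pdftk also
    # offers an update_info_utf8 operation.
    Path(info_path).write_text(build_info_text(fields), encoding="utf-8")
    subprocess.run(
        ["pdftk", str(input_pdf), "update_info", str(info_path),
         "output", str(output_pdf)],
        check=True,  # raise CalledProcessError if pdftk reports a failure
    )
```

For many PDFs, update_pdf_info could be called in a loop over the files in a folder, writing each result to a separate output name so the originals stay untouched.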

Not able to write files from PYTTSX3

I am doing AI research and have been using pyttsx3 for the voice. pyttsx3 does not produce visemes, though it does produce limited timing for each word. I intend to write a function that takes in the text and then produces the visemes for animating an avatar. The problem is that the data is trapped in the engine. I wanted to save the audio to a file, then load it and extract what I need, but save_to_file appears to be a dummy function. I have spent days researching it, and the best I have gotten are files of 0 (zero) bytes. What is going on?
I came across this when searching for solutions to the same problem.
Eventually, what I noticed was that it was to do with the file name. When I chose the file name to be 'test.mp3' or 'name.mp3', it saved properly, but if I changed the name to 'text.mp3' or basically anything else while keeping everything else exactly the same, it produced a file of zero bytes in size.
When I saved it to a different folder but the same name, e.g. 'folder/test.mp3', then it also failed, presumably because it's the full absolute path name which matters.
My hack/workaround was to just always use the file name 'test.mp3' in the same directory, and then use python's os module to move it or change the name immediately afterwards, e.g.:
engine.save_to_file(text, "test.mp3")
engine.runAndWait()  # the file is only written once the queue has run
os.rename("test.mp3", "folder/nameIwanttogive.mp3")
So 'test.mp3' seemed to work for me. Maybe see what works by initially copying and pasting some demo code from a website.
I had the same problem. It was because I forgot one little thing:
engine.save_to_file('hello', 'test.mp3')
only queues the request. To actually carry it out, use this code:
import pyttsx3 as tts
engine = tts.init()
engine.save_to_file('hello', 'HelloSound.mp3')
engine.runAndWait()  # processes the queued commands and writes the file
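To make the zero-byte symptom easy to spot, the two calls can be wrapped in a small helper that reports the size of the resulting file. This is a sketch rather than part of pyttsx3: save_speech is a hypothetical name, and the engine is passed in so the helper works with whatever pyttsx3.init() returns.

```python
import os


def save_speech(engine, text, path):
    """Queue text for synthesis, run the queue, and return the file size in bytes."""
    engine.save_to_file(text, path)  # only queues the command
    engine.runAndWait()              # runs the queue; the file is written here
    # 0 means the file is missing or empty, i.e. the problem described above.
    return os.path.getsize(path) if os.path.exists(path) else 0


# Typical use (requires pyttsx3 to be installed):
#   import pyttsx3
#   size = save_speech(pyttsx3.init(), "hello", "HelloSound.mp3")
```

A return value of 0 after runAndWait() suggests the issue is in the driver or the installed version rather than in a forgotten call.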
I just updated my pyttsx3 to 2.90 and it started to work! Just reinstall.
Try:
pip uninstall pyttsx3
and then:
pip install pyttsx3

Why do my spacy tags not print?

So I have been trying to use spaCy to tag text, and a few days ago, in the same program, I was able to do just that. However, when I tried to use it today, I found that I would only get blank space when I tried to output (not just print) any kind of tag outside of unicode. To make sure it had nothing to do with my program, I opened up a new virtualenv, imported spacy again, and used the tagging documentation code without changing anything. Everything up to defining the 'doc' variable seems to work fine, and when I print out tags in unicode (without the underscore), it works just fine.
I am still unable to output any kind of tag, though. At this point I do not know what I need to do. What could be going wrong?
import spacy
nlp = spacy.load('en')
doc = nlp(u'They told us to duck.')
for word in doc:
    print(word.text, word.lemma, word.lemma_, word.tag, word.tag_, word.pos, word.pos_)

Extremely new user to Python. "No module named request" error while trying code to detect image subdomains in a website to extract them to a folder

I may sound rather uninformed writing this, and unfortunately, my current issue may require a very articulate answer to fix. Therefore, I will try to be as specific as possible to ensure that my problem can be clearly understood.
My apologies for that, as this Python code was obtained from a friend of mine who wrote it for me in order to complete a certain task. I myself have extremely minimal programming knowledge.
Essentially, I am running Python 3.6 on a Mac. I am trying to work out a code that allows Python to scan through a bulk of a particular website's potentially existent subdomains in order to find possibly-existent JPG images files contained within said subdomains, and download any and all of the resulting found files to a distinct folder on my Desktop.
The Setup-
The code itself, named "download.py" on my computer, is written as follows:
import urllib.request
start = int(input("Start range:100000"))
stop = int(input("End range:199999"))
for i in range(start, stop + 1):
    filename = str(i).rjust(6, '0') + ".jpg"
    url = "http://website.com/Image_" + filename
    urllib.request.urlretrieve(url, filename)
    print(url)
(Note that the words "website" and "Image" have been substituted for the actual text included in my code).
Before I proceed, perhaps some explanation would be necessary.
Basically, the website in question contains several subdomains that include .JPG images, however, the majority of the exact URLs that allow the user to access these sub-domains are unknown and are a hidden component of the internal website itself. The format is "website.com/Image_xxxxxx.jpg", wherein x indicates a particular digit, and there are 6 total numerical digits by which only when combined to make a valid code pertain to each of the existent images on the site.
So as you can see, I have calibrated the code so that Python will initially search through number values in the aforementioned URL format from 100000 to 199999, and upon discovering any .JPG images attributed to any of the thousands of link combinations, will directly download all existent uncovered images to a specific folder that resides within my Desktop. The aim would be to start from that specific portion of number values, and upon running the code and fetching any images (or not), continually renumbering the code to work my way through all of the possible 6-digit combos until the operation is ultimately a success.
(Possible Side-Issue- Although I am fairly confident that my friend's code is written in a manner so that Python will only download .JPG files to my computer from images that actually do exist on that particular URL, rather than swarming my folder with blank/bare files from every single one of URL attempts regardless of whether that URL happens to be successful or not, I am admittedly not completely certain. If the latter is the case, informing me of a more suitable edit to my code would be tremendously appreciated.)
The Execution-
Right off the bat, the code experienced a large error. I'll list through the series of steps that led to the creation of said error.
#1- Of course, I first copy-pasted the code into a text document, and saved it as "download.py". I saved it inside of a folder named "Images" where I sought the images to be directly downloaded to. I used BBEdit.
#2- I proceeded, in Terminal, to input the commands "cd Desktop/Images" (to account for the file being held within the "Images" folder on my Desktop), followed by the command "Python download.py" (to actually run the code).
The error I obtained following my attempt to run the code was ImportError: No module named request. Although I'm guessing the solution is simple, I have such minimal knowledge of Python that I have absolutely no idea how to solve this.
Hint: Prior to making the download.py file, the folder, and typing the Terminal code the only interactions I made with Python were downloading the program (3.6) and placing it in my toolbar. I'm not even quite sure if I am required to create any additional scripts/text files, or make any additional downloads before a script like this would work and successfully download the resulting images into my "Images" folder as is my desired goal. If I sincerely missed something integral at any point during this long read, hopefully, someone in here can provide a thoroughly detailed explanation as to how to solve my issue.
Finishing statements for those who've managed to stick along this far:
Thank you. I know this is one hell of a read, and I'm getting more tired as I go along. What I hope to get out of this question is
1.) Obviously, what would constitute a direct solution to the "No module named request" ImportError in Terminal. In other words, what I did wrong there or am missing.
2.) Any other helpful information that you know would assist this code, for example, if there is any integral step or condition I've missed or failed to meet that would ultimately cause the entirety of my code to cease to work. If you do see a fault in this, I only ask of you to be specific, as I've not got much experience in the programming world. After all, I know there are a lot of developers out here who are far more informed and experienced than I am. Thanks.
urllib.request exists in Python 3 only. When you run 'python' on a Mac, you get Python 2 by default. Try running the script with python3. To check which version 'python' runs:
python --version
If python3 is not installed, you might need to run:
brew install python3
urllib.request is a Python 3 construct. Most systems run Python 2 as default and this is what you get when you run simply python.
To install Python 3, go to https://brew.sh/ and follow the instructions to install the Homebrew package manager. Then run
brew install python3
python3 download.py
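On the side question about blank files: urlretrieve raises an error for a URL that does not exist instead of saving an empty file, so the original loop would stop at the first missing image. A minimal sketch of a loop that skips missing images and carries on is below. The names image_name and download_range are hypothetical, and the base URL is the placeholder from the question.

```python
import urllib.error
import urllib.request

BASE_URL = "http://website.com/Image_"  # placeholder from the question


def image_name(number):
    """Zero-pad the number to six digits and add the .jpg extension."""
    return str(number).rjust(6, "0") + ".jpg"


def download_range(start, stop, base_url=BASE_URL):
    """Try every number in [start, stop]; return the file names that were saved."""
    saved = []
    for number in range(start, stop + 1):
        name = image_name(number)
        try:
            urllib.request.urlretrieve(base_url + name, name)
        except urllib.error.URLError:
            # HTTPError (e.g. 404 Not Found) is a subclass of URLError,
            # so missing images are skipped and no file is written for them.
            continue
        saved.append(name)
    return saved
```

Run with python3 from inside the "Images" folder, download_range(100000, 199999) would save only the images that exist there. A hundred thousand requests is a lot of traffic for one site, so a short pause between requests may be sensible.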

How do you Edit Video ID3 Tags in Python?

I've been slowly writing a little project to create a Movie/TV Show/Music ID3 tag editor that can be used on batches of files from my iTunes library. I started designing the GUI in python after finding a bunch of ID3 tag editors here: http://wiki.python.org/moin/UsefulModules#ID3Handling
Unfortunately when it came time to set up the actual ID3 tag editor I discovered that none of the libraries on the page, or any of the other ones that I found, like eyeD3, could handle actual movie files. I've already put a lot of effort into the python app and I was hoping someone could tell me one of three things (if not, I have to switch to Java, which, according to an earlier post apparently has a library for handling mp4 files' ID3 tags):
1.) Is there some library/package that you know can be used to edit the ID3 tags of MOVIES, specifically MP4 files? (The vast majority of the stuff out there that I've been able to find, and I have done my homework well on this, is for audio only.) I have to point out that eyeD3 was actually great for MP3s and REALLY easy to use, so the more like eyeD3 the better.
2.) Is there some sort of file reader-like library that would allow me to read the entire file, change the ID3 tag to what I want, and then write it back? In desperation I tried to open the MP4 file with Notepad++ and found that the first line of the file is the ID3 tag, but I was unable to decode it. If you know something on this level, like how to go in and edit it manually with Python, that would be appreciated.
or
3.) Some sort of script in another language which I would be able to execute with Python (keep in mind, however, that I only know Java and Python at the moment). This would have to be something ridiculously simple to use, like:
edit_MP4_Tag(filename, title, artist, etc...) which I doubt exists.
Thanks in advance for any help.
Massively overdue, but if you're on OSX, you can use appscript to alter metadata via iTunes itself, which is very easy if slow.
The great thing about this is that it works on any kind of media and you can alter iTunes-specific metadata, too.
If not, you can use Atomic Parsley via subprocess or similar.
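The AtomicParsley route can be wrapped so that it looks much like the one-line call the question wished for. This is a rough sketch under assumptions: the AtomicParsley binary is assumed to be installed and on the PATH, only the --title, --artist and --overWrite options are used, and the function names are made up for illustration. Note that MP4 files store their metadata in iTunes-style atoms rather than ID3 frames, which is why MP3-oriented libraries such as eyeD3 do not handle them.

```python
import subprocess


def atomicparsley_command(path, title=None, artist=None, overwrite=True):
    """Build the AtomicParsley argument list for the requested tags."""
    cmd = ["AtomicParsley", str(path)]
    if title is not None:
        cmd += ["--title", title]
    if artist is not None:
        cmd += ["--artist", artist]
    if overwrite:
        # Replace the original file instead of leaving a separate tagged copy.
        cmd.append("--overWrite")
    return cmd


def edit_mp4_tag(path, **tags):
    """Run AtomicParsley on one file; raises CalledProcessError on failure."""
    subprocess.run(atomicparsley_command(path, **tags), check=True)
```

Calling edit_mp4_tag("movie.mp4", title="Some Title", artist="Some Artist") in a loop over a folder would cover the batch case.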
To start, I do not know much about Python, so I'm just going to give you my 2 cents, as no one has answered you yet.
Second, if you know where in the string the ID3 tag is, it should be workable. You will need to write your code from scratch, as there is nothing really out there for ID3 in movies in Python.
Now, this all has to do with STRING and SUBSTRING. You have to open the file, grab the substring you need, and modify it by trial and error. A great example is given in this by Prashant Khandelwal.
Good luck.
