I'm thinking of writing a Python program that finds the lyrics of a song whose name I provide. I think the whole process boils down to the steps below. This is what I want the program to do when I run it:
prompt me to enter a name of a song
copy that name
open a web browser (google chrome for example)
paste that name in the address bar and find information about the song
open a page that contains the lyrics
copy the lyrics
run a text editor (like Microsoft Word for instance)
paste the lyrics
save the new text file with the name of the song
I am not asking for code, of course. I just want to know the concepts or ideas about how to use python to interact with other programs
To be more specific, I want to know, for example, how we point out where the address bar is in Google Chrome and tell Python to paste the name there, or how we tell Python to copy the lyrics, paste them into a Microsoft Word document, and then save it.
I've been reading (and am still reading) several books on Python: Byte of Python, Learn Python the Hard Way, Python for Dummies, Beginning Game Development with Python and Pygame. However, it seems that I only (or almost only) learn to create programs that work on their own (I can't tell my program to do things with other programs that are already installed on my computer).
I know that my question sounds rather silly, but I really want to know how it works: how we tell Python to recognize that this part of the Google Chrome browser is the address bar and that it should paste the name of the song into it. The whole idea of making Python interact with another program is really vague to me, and I very much want to grasp it.
Thank you everyone, whoever spends their time reading my very long question.
ttriet204
If what you're really looking into is a good excuse to teach yourself how to interact with other apps, this may not be the best one. Web browsers are messy, the timing is going to be unpredictable, etc. So, you've taken on a very hard task—and one that would be very easy if you did it the usual way (talk to the server directly, create the text file directly, etc., all without touching any other programs).
But if you do want to interact with other apps, there are a variety of different approaches, and which is appropriate depends on the kinds of apps you need to deal with.
Some apps are designed to be automated from the outside. On Windows, this nearly always means they have a COM interface, usually with an IDispatch interface, for which you can use pywin32's COM wrappers; on Mac, it means an AppleEvent interface, for which you use ScriptingBridge or appscript; on other platforms there is no universal standard. IE (but probably not Chrome) and Word both have such interfaces.
Some apps have a non-GUI interface—whether that's a command line you can drive with popen, or a DLL/SO/DYLIB you can load up through ctypes. Or, ideally, someone else has already written Python bindings for you.
Some apps have nothing but the GUI, and there's no way around doing GUI automation. You can do this at a low level, by crafting WM_ messages to send via pywin32 on Windows, using the accessibility APIs on Mac, etc., or at a somewhat higher level with libraries like pywinauto, or possibly at the very high level of selenium or similar tools built to automate specific apps.
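Of the three kinds above, the command-line route is the easiest to try out. Here is a minimal sketch in Python 3, assuming the other program reads standard input and writes standard output. The helper name run_and_capture is made up for illustration, and a child Python process stands in for the "other app":

```python
import subprocess
import sys


def run_and_capture(args, input_text=None):
    """Run a command-line program and return its standard output as text."""
    completed = subprocess.run(
        args,
        input=input_text,     # text sent to the child's standard input
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # work with str rather than bytes
        check=True,           # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout


# Stand-in "other program" (an assumption for this sketch): a child Python
# process that upper-cases whatever it receives on standard input.
child = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

print(run_and_capture(child, input_text="we are the champions"))
```

Replacing child with the command line of a real tool is all it takes to drive that tool instead. Nothing here touches a GUI.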
So, you could do this with anything from selenium for Chrome and COM automation for Word, to crafting all the WM_ messages yourself. If this is meant to be a learning exercise, the question is which of those things you want to learn today.
Let's start with COM automation. Using pywin32, you directly access the application's own scripting interfaces, without having to take control of the GUI from the user, figure out how to navigate menus and dialog boxes, etc. This is the modern version of writing "Word macros"—the macros can be external scripts instead of inside Word, and they don't have to be written in VB, but they look pretty similar. The last part of your script would look something like this:
import win32com.client
word = win32com.client.Dispatch('Word.Application')
word.Visible = True
doc = word.Documents.Add()
# Selection belongs to the Word application object, not to the document
word.Selection.TypeText(my_string)
doc.SaveAs(r'C:\TestFiles\TestDoc.doc')
If you look at Microsoft Word Scripts, you can see a bunch of examples. However, you may notice they're written in VBScript. And if you look around for tutorials, they're all written for VBScript (or older VB). And the documentation for most apps is written for VBScript (or VB, .NET, or even low-level COM). And all of the tutorials I know of for using COM automation from Python, like Quick Start to Client Side COM and Python, are written for people who already know about COM automation, and just want to know how to do it from Python. The fact that Microsoft keeps changing the name of everything makes it even harder to search for. How would you guess that googling for OLE automation, ActiveX scripting, Windows Scripting Host, etc. would have anything to do with learning about COM automation? So, I'm not sure what to recommend for getting started. I can promise that it's all as simple as it looks from that example above, once you do learn all the nonsense, but I don't know how to get past that initial hurdle.
Anyway, not every application is automatable. And sometimes, even if it is, describing the GUI actions (what a user would click on the screen) is simpler than thinking in terms of the app's object model. "Select the third paragraph" is hard to describe in GUI terms, but "select the whole document" is easy—just hit control-A, or go to the Edit menu and Select All. GUI automation is much harder than COM automation, because you either have to send the app the same messages that Windows itself sends to represent your user actions (e.g., see "Menu Notifications") or, worse, craft mouse messages like "go (32, 4) pixels from the top-left corner, click, mouse down 16 pixels, click again" to say "open the File menu, then click New".
Fortunately, there are tools like pywinauto that wrap up both kinds of GUI automation to make it a lot simpler. And there are tools like swapy that can help you figure out what commands you want to send. If you're not wedded to Python, there are also tools like AutoIt and Actions that are even easier than using swapy and pywinauto, at least when you're getting started. Going this way, the last part of your script might look like:
word.Activate()
word.MenuSelect('File->New')
word.KeyStrokes(my_string)
word.MenuSelect('File->Save As')
word.Dialogs[-1].FindTextField('Filename').Select()
word.KeyStrokes(r'C:\TestFiles\TestDoc.doc')
word.Dialogs[-1].FindButton('OK').Click()
Finally, even with all of these tools, web browsers are very hard to automate, because each web page has its own menus, buttons, etc. that aren't Windows controls, but HTML. Unless you want to go all the way down to the level of "move the mouse 12 pixels", it's very hard to deal with these. That's where selenium comes in—it scripts web GUIs the same way that pywinauto scripts Windows GUIs.
The following script uses Automa to do exactly what you want (tested on Word 2010):
def find_lyrics():
    print 'Please minimize all other open windows, then enter the song:'
    song = raw_input()
    start("Google Chrome")
    # Disable Google's autocompletion and set the language to English:
    google_address = 'google.com/webhp?complete=0&hl=en'
    write(google_address, into="Address")
    press(ENTER)
    write(song + ' lyrics filetype:txt')
    click("I'm Feeling Lucky")
    press(CTRL + 'a', CTRL + 'c')
    press(ALT + F4)
    start("Microsoft Word")
    press(CTRL + 'v')
    press(CTRL + 's')
    click("Desktop")
    write(song + ' lyrics', into="File name")
    click("Save")
    press(ALT + F4)
    print("\nThe lyrics have been saved in file '%s lyrics' "
          "on your desktop." % song)
To try it out for yourself, download Automa.zip from its Download page and unzip into, say, c:\Program Files. You'll get a folder called Automa 1.1.2. Run Automa.exe in that folder. Copy the code above and paste it into Automa by right-clicking into the console window. Press Enter twice to get rid of the last ... in the window and arrive back at the prompt >>>. Close all other open windows and type
>>> find_lyrics()
This performs the required steps.
Automa is also a Python library. To use it as such, add the line
from automa.api import *
to the top of your scripts, and add the file library.zip from Automa's installation directory to your PYTHONPATH environment variable.
If you have any other questions, just let me know :-)
Here's an implementation in Python of @Matteo Italia's comment:
You are approaching the problem from a "user perspective" when you
should approach it from a "programmer perspective"; you don't need to
open a browser, copy the text, open Word or whatever, you need to
perform the appropriate HTTP requests, parse the relevant HTML,
extract the text and write it to a file from inside your Python
script. All the tools to do this are available in Python (in
particular you'll need urllib2 and BeautifulSoup).
#!/usr/bin/env python
import codecs
import json
import sys
import urllib
import urllib2

import bs4  # pip install beautifulsoup4


def extract_lyrics(page):
    """Extract lyrics text from given lyrics.wikia.com html page."""
    soup = bs4.BeautifulSoup(page)
    result = []
    for tag in soup.find('div', 'lyricbox'):
        if isinstance(tag, bs4.NavigableString):
            if not isinstance(tag, bs4.element.Comment):
                result.append(tag)
        elif tag.name == 'br':
            result.append('\n')
    return "".join(result)


# get artist, song to search
artist = raw_input("Enter artist:")
song = raw_input("Enter song:")

# make request
query = urllib.urlencode(dict(artist=artist, song=song, fmt="realjson"))
response = urllib2.urlopen("http://lyrics.wikia.com/api.php?" + query)
data = json.load(response)

if data['lyrics'] != 'Not found':
    # print short lyrics
    print(data['lyrics'])
    # get full lyrics
    lyrics = extract_lyrics(urllib2.urlopen(data['url']))
    # save to file
    filename = "[%s] [%s] lyrics.txt" % (data['artist'], data['song'])
    with codecs.open(filename, 'w', encoding='utf-8') as output_file:
        output_file.write(lyrics)
    print("written '%s'" % filename)
else:
    sys.exit('not found')
Example
$ printf "Queen\nWe are the Champions" | python get-lyrics.py
Output
I've paid my dues
Time after time
I've done my sentence
But committed no crime
And bad mistakes
I've made a few
I've had my share of sand kicked [...]
written '[Queen] [We are the Champions] lyrics.txt'
If you really want to open a browser, etc., look at selenium. But that's overkill for your purposes. Selenium is used to simulate button clicks and the like, for testing how websites behave in various browsers. Mechanize is less of an overkill for this.
What you really want to do is understand how a browser (or any other program) works under the hood i.e. when you click on the mouse or type on the keyboard or hit Save, what does the program do behind the scenes? It is this behind-the-scenes work that you want your python code to do.
So, use urllib, urllib2 or requests (or heck, even scrapy) to request a web page (learn how to put together the url to a google search or the php GET request of a lyrics website). Google also has a search API that you can take advantage of, to perform a google search.
Once you have the results from your page request, parse them with xml, beautifulsoup, lxml, etc. and find the section of the result that has the information you're after.
Now that you have your lyrics, the simplest thing to do is open a text file and dump the lyrics in there and write to disk. But if you really want to do it with MS Word, then open a doc file in notepad or notepad++ and look at its structure. Now, use python to build a document with similar structure, wherein the content will be the downloaded lyrics.
If this method fails, you could look into pywinauto or similar to automate pasting the text into an MS Word doc and clicking Save.
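The request, parse, and save steps described above can be sketched roughly as follows. This is Python 3 with only the standard library (html.parser stands in for BeautifulSoup here), and the function names, the example base URL, and the file naming are assumptions for illustration, not any particular lyrics site's API:

```python
import os
import urllib.parse
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text of an HTML fragment, turning <br> tags into newlines."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def build_query_url(base_url, **params):
    """Build a GET URL such as base_url?artist=...&song=..."""
    return base_url + "?" + urllib.parse.urlencode(params)


def html_to_text(html):
    """Strip the tags from an HTML fragment and return its text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


def save_lyrics(song, text, directory="."):
    """Write the lyrics to '<song> lyrics.txt' and return the file path."""
    path = os.path.join(directory, song + " lyrics.txt")
    with open(path, "w", encoding="utf-8") as output_file:
        output_file.write(text)
    return path


# Hypothetical endpoint, for illustration only; fetching the page itself
# would be urllib.request.urlopen(url).read().decode("utf-8").
url = build_query_url("http://example.com/api.php",
                      artist="Queen", song="We are the Champions")
print(url)
print(html_to_text("<div>I've paid my dues<br>Time after time</div>"))
```

No browser and no word processor are involved: the script does the behind-the-scenes work itself.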
Citation: Matteo Italia, g.d.d.c from the comments on the OP
You should look into a package called selenium for interacting with web browsers.
Related
I want to make a bot that copies messages from a discord server and then pastes them into the Minecraft chat. I'm not talking about rcon.
I copied some code that takes the last message and puts it in a text file:
async def copy(ctx):
    with open("file.txt", "w") as f:
        async for message in ctx.history(limit=1000):
            f.write(message.content + "\n")
But I can't find this text file. I've tried putting it in the same folder with main.py and replacing "file.txt" with a full path to the file but it still won't work.
If I manage to get this whole "copy message into txt/variable" thing I should be able to finish the "paste stuff into chat" thing.
Please help me I'm stupid.
Not Python-specific, although you can try AutoHotkey for this (assuming you are dealing with 2 different apps consuming 2 different windows). AutoIt is also a solid alternative for this. Both are Windows specific and if you are using some other operating system, then good luck finding a proper macro recorder.
If you only need to output the text to a file, Selenium might be a better solution, given the plethora of options, as Discord can be opened in a browser too.
For a pure Python solution, you might look into PyWinAuto (again, Windows-specific).
The website "Download the GLEIF Golden Copy and Delta Files" has buttons that download data that I want to retrieve automatically with a Python script. Usually when I want to download a file, I use mget or similar, but that will not work here (at least I don't think it will).
For some reason I cannot fathom, the producers of the data seem to want to force one to manually download the files. I really need to automate this to reduce the number of steps for my users (and frankly for me), since there are a great many files in addition to these and I want to automate as many as possible (all of them).
So my question is this - is there some kind of python package for doing this sort of thing? If not a python package, is there perhaps some other tool that is useful for it? I have to believe this is a common annoyance.
Yup, you can use BeautifulSoup to scrape the URLs then download them with requests.
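A rough sketch of that idea, using only the Python 3 standard library (html.parser in place of BeautifulSoup, urllib.request in place of requests). The function names and the list of file suffixes are assumptions for illustration. If the page builds its download buttons with JavaScript, the links may not be in the raw HTML at all, and this approach would not find them:

```python
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL
                self.links.append(urljoin(self.base_url, href))


def find_download_links(html, base_url, suffixes=(".zip", ".csv")):
    """Return the links on the page that end in one of the given suffixes."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [link for link in collector.links
            if link.lower().endswith(suffixes)]


def download(url, destination):
    """Save the resource at url to a local file (performs a network request)."""
    urllib.request.urlretrieve(url, destination)


sample = '<a href="/files/golden.zip">Download</a> <a href="about.html">About</a>'
print(find_download_links(sample, "https://example.org/data/"))
```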
I am learning how to do webscraping, crawlers etc. and I came across this repo. I understand how the code works, what the input and outputs should be, but how do I run it in a terminal on Windows? How do I call the respective .txt files and test the search engine?
I saw that someone else asked that and the creator showed them this link here. But it still doesn't explain how to actually apply it to files.
The author of logicx24 has hard coded the target text files in querytexts.py. See line 122 which reads:
q = Query(['pg135.txt', 'pg76.txt', 'pg5200.txt'])
The list input to Query are all references to files that exist in the corpus directory. Try changing that to include a different file in their corpus directory. Better yet, add a new target text file of your own and use that.
Good luck!
Why are you using text files? I don't get it. Either way, you could just use Python itself to do that. Use the selenium library for Python. There's a tutorial to installing this here. Once that's done, just use this code if you're using Google:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
PATH = r"C:\Program Files (x86)\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get("https://www.google.com")
search = driver.find_element_by_css_selector(".gLFyf.gsfi")
time.sleep(5)
search.send_keys("Desired Input Text Goes Here")
search.send_keys(Keys.RETURN)
Don't worry if it takes a while to load. It usually does that. If you want to reduce the amount of time it takes, use a lower number for the parameter on line 8 (time.sleep(5)). Assuming you've gone ahead and learned a bit more about Selenium, there isn't really much else to talk about apart from one thing: line 7 (search = driver.find_element_by_css_selector(".gLFyf.gsfi")). Assuming you've learned advanced CSS selectors already (if you have literally no experience in web development, specifically HTML and CSS, you can just copy-paste the code), .gLFyf.gsfi is simply the CSS selector for the search bar in Google. You can find the selector for the search bar in any engine by inspecting the page source with Ctrl + Shift + I on Windows. You can use any other Selenium element selector for this as long as it works. Make sure to also change the URL on line 6 (driver.get("https://www.google.com")) to match that of your search engine if you're not using Google.
Sorry if this seemed a bit vague or strange. If you don't really care, feel free to download Selenium, copy-paste the code, and move on. Otherwise, I suggest also learning Selenium and HTML/CSS if you haven't already.
Here is the code:
Sublime plugin:
File 1: open_in_default_program.py:
# https://github.com/SublimeTextIssues/Core/issues/2368
import webbrowser
import sublime_plugin


class OpenInDefaultProgramCommand(sublime_plugin.TextCommand):
    def run(self, edit):
        if self.view.file_name():
            webbrowser.open_new_tab("file://" + self.view.file_name())

    def is_visible(self):
        return self.view.file_name() is not None and (
            self.view.file_name()[-5:] == ".html" or
            self.view.file_name()[-3:] == ".md" or
            self.view.file_name()[-4:] == ".ahk")
File 2: Context.sublime-menu:
[
    { "command": "open_in_default_program" },
]
AutoHotkey test file:
Test.ahk:
MsgBox Something
My question:
It works for HTML and Markdown files. It also works for AutoHotkey files, but how? From what I can see, it uses the browser. AutoHotkey files can't be opened in a browser, yet this plugin launches them without any problem. Why does it work?
Here is another plugin for opening files in default application, but it's much more complex: https://github.com/SublimeText/OpenDefaultApplication/blob/master/OpenDefault.py
This is mentioned in the documentation for webbrowser.open:
Note that on some platforms, trying to open a filename using this function, may work and start the operating system’s associated program. However, this is neither supported nor portable.
The reason for this is that some browsers, when given a file they don't know how to handle, will automatically open it in the default program for that file. For example, on Windows, Internet Explorer is basically the same program as Windows Explorer,1 so asking Internet Explorer to open a file it doesn't know how to handle has basically the same effect as double-clicking that file in Windows Explorer.
Of course other browsers might do nothing, or copy the file to your Downloads directory, or pop up a dialog asking you what you want to do with this file. That's why the docs say "this is neither supported nor portable".
It's also worth noting that, like many of the stdlib modules, the docs for webbrowser have a link to the source code at the top, and the source code is pretty straightforward, simple Python code. You can see that ultimately, it's just using the subprocess module to call something like (depending on your detected browser, and possibly with some browser-specific options to tell it "don't start a whole new browser, tell the existing browser window to open a new tab"):
iexplore.exe file://path/to/your/file
You can easily work out exactly what command it's running and experiment running the same command in your shell/command prompt.
The more complex plugin shows the way to do this as portably as possible:
On Windows, you can call os.startfile.
On other platforms, you run a command-line tool. (The plugin seems to work out the right tool at install time, store it in a settings file, and look it up in that file.)
On macOS, it's open.
On FreeDesktop systems, including most modern Linux distros, it's xdg-open.
Those three options are usually enough to cover 99% of your users, and almost all of the remaining users will be people who know what they're doing and can figure out what to put in your settings file. (Unless, of course, you're developing for mobile, in which case you'll want to write special handlers for iOS and Android.)
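Those three options can be sketched in a few lines of Python 3. The function names are made up for illustration, and the platform argument exists only so the choice can be checked without actually launching anything:

```python
import os
import subprocess
import sys


def opener_command(platform=sys.platform):
    """Return the command-line opener for a platform, or None on Windows."""
    if platform.startswith("win"):
        return None        # Windows uses os.startfile rather than a command
    if platform == "darwin":
        return "open"      # macOS
    return "xdg-open"      # FreeDesktop systems, e.g. most Linux distros


def open_in_default_app(path):
    """Open path with whatever program the OS associates with it."""
    command = opener_command()
    if command is None:
        os.startfile(path)  # this function only exists on Windows
    else:
        subprocess.run([command, path], check=True)
```

Calling open_in_default_app on an .ahk file on Windows would then launch AutoHotkey directly, with no browser in between.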
1. This isn't really true anymore in modern Windows, but it's close enough to illustrate the point.
Is there a reasonably standard and cross platform way to print text (or even PS/PDF) to the system defined printer?
Assuming CPython here, not something clever like using Jython and the Java printing API.
This has only been tested on Windows:
You can do the following:
import os
os.startfile("C:/Users/TestFile.txt", "print")
This will start the file in its default opener with the verb 'print', which prints to your default printer. It only requires the os module, which comes with the standard library.
Unfortunately, there is no standard way to print using Python on all platforms. So you'll need to write your own wrapper function to print.
You need to detect the OS your program is running on, then:
For Linux -
import subprocess
lpr = subprocess.Popen("/usr/bin/lpr", stdin=subprocess.PIPE)
# communicate() sends the data (bytes), closes stdin, and waits for lpr to finish
lpr.communicate(your_data_here)
For Windows: http://timgolden.me.uk/python/win32_how_do_i/print.html
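A minimal sketch of such a wrapper in Python 3, assuming lpr is on the PATH on non-Windows systems and that the file type has a "print" verb registered on Windows. The function name print_file and its platform and run parameters are made up for illustration; they are there so the dispatch logic can be exercised without a printer:

```python
import os
import subprocess
import sys


def print_file(path, platform=sys.platform, run=subprocess.run):
    """Send a file to the default printer; return the backend that was used."""
    if platform.startswith("win"):
        # Windows: ask the shell to run the file type's "print" verb
        os.startfile(path, "print")
        return "startfile"
    # Elsewhere (assumption): hand the file to the lpr command
    run(["lpr", path], check=True)
    return "lpr"
```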
More resources:
Print PDF document with python's win32print module?
How do I print to the OS's default printer in Python 3 (cross platform)?
To print to any printer on the network you can send a PJL/PCL print job directly to a network printer on port 9100.
Please have a look at the below link that should give a good start:
http://frank.zinepal.com/printing-directly-to-a-network-printer
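As a rough sketch of the raw-port approach in Python 3: open a TCP connection to port 9100 and send the job bytes. The function name send_raw_job is made up for illustration, and the data must already be in a language the printer understands (PJL/PCL, PostScript, or plain text on some models):

```python
import socket


def send_raw_job(host, data, port=9100, timeout=10.0):
    """Send raw job bytes to a network printer and return the byte count."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(data)
    return len(data)


# Hypothetical usage; 192.0.2.10 is a placeholder address, not a real printer:
# send_raw_job("192.0.2.10", b"Hello from Python\r\n\f")
```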
Also, if you can call the Windows command prompt, you can use an FTP put to send your page to the printer. The link below should give you the details. I have used this method for HP printers, but I believe it will work for other printers too.
http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?objectID=bpj06165
You can try wx library. It's a cross platform UI library. Here you can find the printing tutorial:
https://web.archive.org/web/20160619163747/http://wiki.wxpython.org/Printing
I find this to be the superior solution, at least when dealing with web applications. The idea is this: convert the HTML page to a PDF document and send that to a printer via gsprint.
Even though gsprint is no longer in development, it works really, really well. You can choose the printer and the page orientation and size among several other options.
I convert the web page to PDF using Puppeteer, which drives headless Chrome. But you need to pass in the session cookie to maintain credentials.