Python image search wrapper

I'm trying to figure out a straightforward way to do some image searching in Python with more results than Google's API allows (it caps out at 64). I spent some time trying to use the Python BOSS Mashup Framework, and I followed the install instructions carefully, but attempting to do any searches always returns authorization errors.
I basically just want to search on a term and grab the first, say, 100 image URLs that come up. The particular search engine doesn't matter. Any advice on a simple way to do this would be appreciated!

Not sure if this is simple enough, but you could use Mechanize to directly access Google Image search, click the next link for more results, etc. It's very easy to use, almost trivial.
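A minimal sketch of that approach with mechanize. The result-parsing regex is an assumption (Google's markup changes often, and scraping may violate its terms of service), so treat this as a starting point rather than a working scraper:

```python
import re
import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)                      # Google's robots.txt disallows bots
br.addheaders = [("User-Agent", "Mozilla/5.0")]  # look like a normal browser

urls = []
br.open("http://images.google.com/images?q=kittens")
while len(urls) < 100:
    html = br.response().read().decode("utf-8", "ignore")
    # hypothetical pattern -- inspect the live HTML to find the real one
    urls.extend(re.findall(r"imgurl=(http[^&]+)", html))
    try:
        br.follow_link(text="Next")  # "click" the next-page link
    except mechanize.LinkNotFoundError:
        break

print(urls[:100])
```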

Related

Python: reading specific parts of the screen and taking actions based on them

I am kind of new to Python and trying to learn the language as best I can. Right now I'm stuck trying to automate something in a game and don't know how I should approach it.
What I'm trying to automate: the program should be able to read how much Food, Wood, etc. is in the storage at any moment (and ideally whether the storage is full or not). If the storage is full, or has reached the value needed to upgrade the storage for Food, for example, then it should perform the upgrade.
My first thought was to detect that the storage is full using pyautogui, but I'm unsure. Sorry this is such a long message, and I don't know if you understand what I'm trying to explain; if not, just ask.
Python doesn't recognize the values of characters drawn on the screen; for this type of automation an image-recognition library is recommended (I recommend OpenCV), and for clicking on the screen you can use pynput.
OpenCV link: https://pypi.org/project/opencv-python/
pynput link: https://pypi.org/project/pynput/
You can also watch the following project, since it can help you get started:
https://www.youtube.com/watch?v=vXqKniVe6P8
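Here's a minimal sketch of the OpenCV route, assuming you've saved a screenshot of the game's "storage full" icon as full_icon.png; the 0.9 match threshold and the click target are assumptions you'd tune for your game:

```python
import cv2
import numpy as np
import pyautogui
from pynput.mouse import Button, Controller

# grab the current screen and convert it to an OpenCV BGR image
screen = cv2.cvtColor(np.array(pyautogui.screenshot()), cv2.COLOR_RGB2BGR)
template = cv2.imread("full_icon.png")

# template matching: slide the icon over the screen and score each position
result = cv2.matchTemplate(screen, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(result)

if max_val > 0.9:  # icon found: the storage is full
    h, w = template.shape[:2]
    mouse = Controller()
    mouse.position = (max_loc[0] + w // 2, max_loc[1] + h // 2)
    mouse.click(Button.left)  # click the icon, e.g. to open the upgrade dialog
```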

How to: Python script that will 'click' on a portion of my screen, and then do key commands?

Python noobie.
I'm trying to make Python select a portion of my screen. In this case, it is a small window within a Firefox window -- it's Firebug source code. And then, once it has selected the right area, control-A to select all and then control-C to copy. If I could figure this out then I would just do the same thing and paste all of the copies into a .txt file.
I don't really know where to begin -- are there libraries for this kind of thing? Is it even possible?
I would look into PyQt or PySide, which are Python wrappers on top of Qt.
Qt is a big monster, but it's very well documented, and I'm sure it will help you further in your project once you've grabbed your screen section.
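A minimal sketch of the screen-grab part with PyQt5 (the modern call is QScreen.grabWindow; the region coordinates below are placeholders for wherever your Firebug panel sits):

```python
from PyQt5.QtWidgets import QApplication

app = QApplication([])
screen = app.primaryScreen()
# grabWindow(window_id, x, y, width, height); window id 0 = the whole desktop
pixmap = screen.grabWindow(0, 100, 200, 800, 600)
pixmap.save("firebug_region.png", "png")
```

Note this captures pixels, not text; to get the source code itself, see the approach below.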
As you've mentioned in the comments, the data is all in the HTML to start (I'm guessing it's greyed out in your Firebug screenshot since it's a hidden element). This approach avoids the complexity of trying to automate a browser. Here's a rough outline of how I would get the data:
Download the HTML for the whole page - I'd do this manually at first (i.e. File > Save from a browser), and if there are a bunch of pages you want to process, figure out how to download them all later. If you want to use Python for this part, I'd recommend urllib2. The URLs for each page are probably pretty structured, so you could easily store them in a list, then download each one and save it locally.
Write a script to parse the HTML - don't use regex. Since you're using Python, use something like Beautiful Soup, which will create a nice object representation of the page, and then you can get the elements you want.
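A rough sketch of both steps together, modernized a little (urllib2 became urllib.request in Python 3, and Beautiful Soup now lives in the bs4 package); the URL pattern and the <pre> selector are placeholders to swap out after inspecting the real markup:

```python
import urllib.request
from bs4 import BeautifulSoup

for page_num in range(1, 10):
    url = "http://example.com/source?page=%d" % page_num  # hypothetical URL scheme
    html = urllib.request.urlopen(url).read()

    soup = BeautifulSoup(html, "html.parser")
    # pull the text of every <pre> block -- swap in the tags you actually need
    snippets = [pre.get_text() for pre in soup.find_all("pre")]

    with open("page_%d.txt" % page_num, "w") as f:
        f.write("\n".join(snippets))
```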
You mention you're new to Python, so there's definitely going to be a learning curve around this, but it actually sounds like a pretty doable project to use to learn some more Python.
If you run into specific obstacles with each step, start a new question with a bit of sample code, showing what you're trying to accomplish, and people will be more than willing to help out.

How can I render a bar code as an image through google app engine with python?

After a day's reading, I'm no closer to finding a good way to render a bar code through Google App Engine with Python. The closest I've found is PyPNG, a pure-Python image library, but it looks as though that would require quite a bit of work to make usable. Does anyone have any good suggestions before I walk down the tumultuous path of rolling my own?
Thanks,
Richard
You might want to look at a webservice to generate the barcode. I found the site below on Google, but there are many results if you search for "barcode webservice". Many of them are code for you to download and use, but some appear to be live webservices.
http://www.barcodesoft.com/barcode-generator-webservice.aspx
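A minimal sketch of how that would look on App Engine's old webapp framework; the webservice URL and its query parameters are hypothetical, so substitute whichever service you pick:

```python
import urllib

from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

class BarcodePage(webapp.RequestHandler):
    def get(self):
        code = self.request.get("code", "0123456789")
        # let the browser fetch the image straight from the webservice,
        # so App Engine never has to render an image itself
        img_url = "http://barcode.example.com/generate?%s" % urllib.urlencode(
            {"code": code, "type": "code128"})
        self.response.out.write('<img src="%s" alt="barcode">' % img_url)

application = webapp.WSGIApplication([("/", BarcodePage)])

if __name__ == "__main__":
    run_wsgi_app(application)
```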
Edit: Now that I think about it, another option may be to generate the barcode in JS/HTML, depending on your needs. This SO question and its answers might be helpful:
https://stackoverflow.com/questions/835082/what-is-the-best-javascript-barcode-generator

Google app engine datastore tag cloud with python

We have some unstructured textual data in our app engine datastore. I wanted to create a 'one off' tag cloud of one property on a subset of the datastore objects. After a look around, I can't see any framework that will allow me to do this without writing it myself.
The way I had in mind was:
Write a map (as in map reduce) function to go over every object of the particular type in a datastore,
Split the text string into words
For each word increment a counter
Use the final counts to generate the tag cloud with some third party software (offline - any suggestions here welcome)
As I've never done this before, I was wondering, firstly, whether there is some framework around that does this for me (please), and if not, whether I'm approaching it the right way, i.e. please feel free to point out gaping holes in the plan.
Feed TagCloud and PyTagCloud are two possibilities.
Feed TagCloud Generator Gadget for Google App Engine might fit your needs. Unfortunately, it's undocumented; fortunately, it's rather simple, though I'm not sure how well suited it is to what you want. It operates on a feed and appears to be somewhat flexible, so if you have a feed of your site, it might not be too much trouble to integrate, though all processing will be online.
PyTagCloud is also worth a look. You'll be able to do the processing offline, and it generates rather handsome clouds. All you'll have to do to get this working is export your datastore; the splitting and counting will be done for you, as PyTagCloud can operate on text files. Following the instructions in the App Engine docs about Uploading and Downloading Data will show you how to export the datastore to your local machine. You'll want to write an "Exporter Class" and have PyTagCloud operate on the output.
If you decide to roll your own, you probably want to skip the online processing and use the offline Uploading and Downloading Data method above, unless you want a dynamically updated cloud. Iterating over your entire datastore and doing the counts online is the most annoying and expensive part of the task, and it only makes sense if you want or need a dynamic tag cloud. As above, I'd recommend writing an "Exporter Class" and operating on the exported data locally.
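A minimal sketch of that offline route, assuming you've already dumped the property's text to a local file with the bulk loader (the file name is hypothetical; the make_tags/create_tag_image calls follow PyTagCloud's own README example):

```python
from pytagcloud import create_tag_image, make_tags
from pytagcloud.lang.counter import get_tag_counts

with open("exported_descriptions.txt") as f:
    text = f.read()

# get_tag_counts does the word splitting and counting for you
tags = make_tags(get_tag_counts(text), maxsize=120)
create_tag_image(tags, "cloud.png", size=(900, 600), fontname="Lobster")
```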

Writing a Faster Python Spider

I'm writing a spider in Python to crawl a site. Trouble is, I need to examine about 2.5 million pages, so I could really use some help making it optimized for speed.
What I need to do is examine the pages for a certain number and, if it is found, record the link to the page. The spider is very simple; it just needs to sort through a lot of pages.
I'm completely new to Python, but have used Java and C++ before. I have yet to start coding it, so any recommendations on libraries or frameworks to include would be great. Any optimization tips are also greatly appreciated.
You could use MapReduce like Google does, either via Hadoop (specifically with Python: 1 and 2), Disco, or Happy.
The traditional line of thought is: write your program in standard Python; if you find it is too slow, profile it and optimize the specific slow spots. You can make these slow spots faster by dropping down to C, using C/C++ extensions, or even ctypes.
If you are spidering just one site, consider using wget -r (an example).
Where are you storing the results? You can use PiCloud's cloud library to parallelize your scraping easily across a cluster of servers.
As you are new to Python, I think the following may be helpful for you :)
If you are writing a regex to search for a certain pattern in the page, compile your regex wherever you can and reuse the compiled object (see the sketch after these tips).
BeautifulSoup is an HTML/XML parser that may be of some use for your project.
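A small sketch of the compiled-regex tip; the number 42 stands in for whatever number your spider is looking for:

```python
import re

TARGET = re.compile(r"\b42\b")  # compiled once, reused across millions of pages

def page_contains_number(html):
    return TARGET.search(html) is not None
```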
Spidering somebody's site with millions of requests isn't very polite. Can you instead ask the webmaster for an archive of the site? Once you have that, it's a simple matter of text searching.
You waste a lot of time waiting for network requests when spidering, so you'll definitely want to make your requests in parallel. I would probably save the result data to disk and then have a second process looping over the files searching for the term. That phase could easily be distributed across multiple machines if you needed extra performance.
What Adam said. I did this once to map out Xanga's network. The way I made it faster is by having a thread-safe set containing all usernames I had to look up. Then I had 5 or so threads making requests at the same time and processing them. You're going to spend way more time waiting for the page to DL than you will processing any of the text (most likely), so just find ways to increase the number of requests you can get at the same time.
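A minimal sketch of that threaded approach, using a thread-safe Queue of URLs and a handful of workers; the target number and the URL list are placeholders:

```python
import queue
import re
import threading
import urllib.request

TARGET = re.compile(rb"\b42\b")       # the number being searched for (assumed)
url_queue = queue.Queue()             # thread-safe work queue
hits, hits_lock = [], threading.Lock()

def worker():
    while True:
        url = url_queue.get()
        try:
            html = urllib.request.urlopen(url, timeout=10).read()
            if TARGET.search(html):   # record pages containing the number
                with hits_lock:
                    hits.append(url)
        except Exception:
            pass                      # skip pages that fail to download
        finally:
            url_queue.task_done()

for i in range(100):                  # placeholder URL list
    url_queue.put("http://example.com/page%d" % i)

for _ in range(5):                    # "5 or so threads", as described above
    threading.Thread(target=worker, daemon=True).start()

url_queue.join()                      # wait until every URL is processed
print(len(hits), "pages contained the number")
```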
