How can I import hbase in python? - python

I'm trying to play around with HBase in Python, and I am using the Cloudera repository to install the Hadoop/HBase packages. The installation seems to work, as I can access and work on the database using the shell, but it's not fully working within Python.
I know that to communicate with HBase I need Thrift, so I downloaded and compiled it from source. I can import thrift into Python, but when I do from hbase import Hbase, I get module-not-found errors.
Does anyone know what package/module I would need to get it to work? I looked around easy_install and yum (I'm using CentOS 6) with no luck. I did find an article where a person using Debian installed it by running sudo aptitude install python-hbase. I don't have that command/package, so I'm not sure how to get it (or whether I have to compile it from source).
Also, if it helps, I installed most of the base from Cloudera and followed some of the instructions (the ones that didn't require an install) from http://yannramin.com/2008/07/19/using-facebook-thrift-with-python-and-hbase/
Any help/tips/suggestions would be great.
Thanks!

Have a look at HappyBase (see https://github.com/wbolster/happybase for info). It is the modern way to interact with HBase from Python. It covers the complete Thrift API but wraps it in a much better interface.

Okay, I figured it out. If anyone else has problems with this in the future, it's actually pretty easy. The step where you run thrift --gen py Hbase.thrift creates an hbase folder in the location where you ran that command. Simply take that folder and copy it to your default module folder (or to the folder where you run your program) and it should work.

Search for /src/contrib/thriftfs/gen-py under the Hadoop installation folder.
Copy the output of thrift --gen py Hbase.thrift to the location below (the part up to /home/hadoop/data/ will differ in your case): /home/hadoop/data/hadoop-1.0.4/src/contrib/thriftfs/gen-py
Then:
$ python
import sys
sys.path.append("/home/hadoop/data/hadoop-1.0.4/src/contrib/thriftfs/gen-py")
import hbase
It should work now
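The sys.path approach above can be tried out without HBase or Thrift installed. This is only a minimal sketch: the package name demo_gen_pkg and the temporary directory are made-up stand-ins for the generated hbase folder and for wherever you copied it, and import_from_dir is a hypothetical helper, not part of Thrift.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_dir(package_name, parent_dir):
    """Import package_name from parent_dir, adding parent_dir to sys.path if needed."""
    parent_dir = str(Path(parent_dir).resolve())
    if parent_dir not in sys.path:
        sys.path.append(parent_dir)
    importlib.invalidate_caches()  # make sure freshly created files are noticed
    return importlib.import_module(package_name)


# Stand-in for the folder that the generator writes (hypothetical names).
gen_py_dir = Path(tempfile.mkdtemp())
package_dir = gen_py_dir / "demo_gen_pkg"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("GREETING = 'generated module loaded'\n")

module = import_from_dir("demo_gen_pkg", gen_py_dir)
print(module.GREETING)
```

The directory added to sys.path has to be the parent of the package folder, not the package folder itself; that is the usual reason an import still fails after appending a path.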

Related

How do I use Visual Studio Code to connect to a newly created MongoDB collection?

I saw this question was asked once before but the answer did not help my situation.
I am a beginner software development student trying to connect Visual Studio Code to a newly created collection in MongoDB and display the collection, through a python file.
I'm sorry if that explanation doesn't use established terminology, I'm not entirely sure what I'm doing yet. I am using the code my professor provided and yet I get a whole host of syntax errors and a message that says there is no module named MongoClient. I am not allowed to use anything but VS Code so trying a different environment isn't an option. Here is my code:
Import MongoClient
from pymongo import MongoClient
url = “mongodb+srv://admin:admin@cluster0.zcfrz.mongodb.net/pytech?retryWrites=true”
client = MongoClient(url)
db = client.pytech
print(db.list_collection_names)
Any guidance would be truly appreciated.
You got the error saying there is no module named MongoClient simply because of the first line.
You can't import MongoClient directly; you need to import it from pymongo.
You can read more about how to use pymongo in the official docs.
So the correct way, based on the code you provided, is (just remove the first line):
from pymongo import MongoClient
url = "mongodb+srv://admin:admin@cluster0.zcfrz.mongodb.net/pytech?retryWrites=true"
client = MongoClient(url)
db = client.pytech
print(db.list_collection_names())
You also need to make sure you have pymongo installed. To do that, see the pymongo page on PyPI.
I'm assuming you have Python 3 and pip3 installed. It's simple and straightforward; just use pip to install it:
pip3 install pymongo
Alternatively:
python3 -m pip install pymongo
Also, I noticed you are using the mongodb+srv protocol, so you need to install dnspython as well:
pip install dnspython
Looks like I was missing a "()" in my last line of code. I had print(db.list_collection_names), but what was needed was print(db.list_collection_names()). Since list_collection_names is a method, I needed another set of parentheses inside the print call.
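The difference between the two print calls can be seen without a MongoDB server. In this small sketch, FakeDatabase is a hypothetical stand-in that only mimics the method name; it is not pymongo's class.

```python
class FakeDatabase:
    """Hypothetical stand-in for a pymongo database object; no server needed."""

    def __init__(self, names):
        self._names = list(names)

    def list_collection_names(self):
        # Return a copy so callers cannot modify the internal list.
        return list(self._names)


db = FakeDatabase(["students", "courses"])

# Without parentheses you get the method object itself, not its result.
print(db.list_collection_names)

# With parentheses the method is called and the list of names is returned.
print(db.list_collection_names())
```

The first print shows a "bound method" description rather than raising an error, which is why this mistake is easy to overlook.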
Check whether there is a suffix like venv or conda on your selected Python interpreter, shown in the lower left corner of VS Code. If not, you're not using a virtual environment but the global one.
pythonPath is set in settings.json and can be generated automatically by selecting a Python interpreter. Run pip show pymongo in the integrated terminal to check whether the module is installed in the currently selected Python environment. If it is, you should be able to import pymongo and use MongoClient without errors.
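A similar check can be made from inside Python itself, which confirms what the interpreter actually running your file can see. This is a rough sketch using only the standard library; module_location is a hypothetical helper name.

```python
import importlib.util
import sys


def module_location(name):
    """Return the path the current interpreter would import `name` from, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    return spec.origin


# The interpreter running this script; compare it with the one selected in VS Code.
print("Interpreter:", sys.executable)
for name in ("json", "pymongo"):
    print(name, "->", module_location(name))
```

If pymongo prints None here while pip show pymongo reports it as installed, pip and the selected interpreter most likely belong to different environments.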

Erwin API with Python

I am trying to get a clear idea of how to get the Erwin-generated DDL objects with Python. I am aware that the Erwin API needs to be used. What I am looking for is which Python module and which API need to be used, and how to use them. I would be thankful for an example!
Here is a start:
import win32com.client
ERwin = win32com.client.Dispatch("erwin9.SCAPI")
I haven't been able to browse the scapi dll so what I know is from trial and error. Erwin publishes VB code that works, but it is not straightforward to convert.
Install pywin32 (run the below from pip folder e.g. c:\Program Files\Python37\Scripts)
python -m pip install pywin32
python pywin32_postinstall.py -install
Sample script to extract DDL using Erwin's Forward Engineer functionality (change paths accordingly):
import win32com.client
api = win32com.client.Dispatch("erwin9.SCAPI")
unit = api.PersistenceUnits.Add("c:/models/data_model.erwin", "RDO=Yes")
unit.FEModel_DDL("c:/scripts/ddl_script.sql")
For the above to work, Erwin application should be running (probably).

Set up Brew installed Python 2.7.X as sdk for Intellij / Pycharm

I am trying to import the brew installed version of python by emulating the Global Libraries structure existing for the (mostly) working mac os built-in 2.7.2. However IJ is unable to infer the types or to create the library properly.
Update: this is a large existing project. Creating a new project just to get a different version of Python is not an option.
Here are the steps:
Try to create new Global Library: Fail : no python .
OK, so I use Copy to clone the built-in SDK:
Now - let us try to emulate the paths included in the original built-in but with the brew base dir: here is a starting point:
And here is one of the exact entries from the builtin library:
So let us click on the + to add it:
So .. IJ is unable to handle it properly. I also tried a half dozen others - all with same shrug result from IJ.
So then what is the correct process?
Update: Here is the project SDK dialog (thanks to scribbles).
And trying to add it: but the "OK" button is not enabled! So IJ is not able to load it.
New Project -> Select SDK.
See this video if you still have any questions.
EDIT: Is this more along the lines of what you're looking for (link)?
This is old, but I ran into the same problem with the current Python 2 install from Homebrew on High Sierra. Instead of choosing a directory as was needed in the previous setup, I just set up the Python SDK pointing to the python executable link in /usr/local/opt/python/libexec/bin (which is the directory I added to my path for Python 2). It seems to be working just fine now.
Hopefully this will help someone.

How to bundle Python dependencies in IronWorker?

I'm writing a simple IronWorker in Python to do some work with the AWS API.
To do so I want to use the boto library which is distributed via PyPi repository. The boto library is not installed by default in the IronWorker runtime environment.
How can I bundle the boto library dependency with my IronWorker code?
Ideally I'm hoping I can use something like the gem dependency bundling available for Ruby IronWorkers, i.e. in myRuby.worker specify
gemfile '../Gemfile', 'common', 'worker' # merges gems from common and worker groups
In the Python Loggly sample, I see that the hoover library is used:
#here we have to include hoover library with worker.
hoover_dir = os.path.dirname(hoover.__file__)
shutil.copytree(hoover_dir, worker_dir + '/loggly') #copy it to worker directory
However, I can't see where/how you specify which hoover library version you want, or where to download it from.
What is the official/correct way to use 3rd party libraries in Python IronWorkers?
Newer iron_worker versions have native support for the pip command.
So, you need:
runtime "python"
exec "something.py"
pip "boto"
pip "someotherpip"
full_remote_build true
[edit]We've worked on our toolset a bit since this answer was written and accepted. The answer from my colleague below is the recommended course moving forward.[/edit]
I wrote the Python client library for IronWorker. I'm also employed by Iron.io.
If you're using the Python client library, the easiest (and recommended) way to do this is to just copy over the library's installed folder, and include it when uploading the package. That's what the Python Loggly sample is doing above. As you said, that doesn't specify a version or where to download the library from, because it doesn't care. It just takes the one installed on your system and uses it. Whatever you get when you enter "import boto" on your local machine is what would be uploaded.
The other option is using our CLI to upload your worker, with a .worker file.
To do this, here's what you'd need to do:
Create a botoworker.worker file:
runtime "binary"
build 'pip install --install-option="--prefix=`pwd`/pips" boto'
file 'botoworker.py'
exec "botoworker.sh"
That second line is the pip command that will be run to install the dependency. You can modify it like you would any pip command run from the command line. It's going to execute that command on the worker during the "build" phase, so it's only executed once instead of every time you run a task.
The third line should be changed to the Python file you want to run--it's your Python worker file. Here's the one we used to test this:
import boto
If you save that as botoworker.py, the above should work without any modification. :)
The fourth line is a shell script that's going to actually run your worker. I've included the one we used below. Just save it as botoworker.sh, and you won't have to worry about modifying the .worker file above.
PYTHONPATH="$HOME/pips/lib/python2.7/site-packages:$PYTHONPATH" python botoworker.py "$@"
You'll notice it refers to your Python file--if you don't name your Python file botoworker.py, remember to change it here, too. All this does is set your PYTHONPATH to include the installed library, and then runs your Python file.
To upload this, just make sure you have the CLI installed (gem install iron_worker_ng, making sure your Ruby version is 1.9.3 or higher) and then run "iron_worker upload botoworker" in your shell, from the same directory your botoworker.worker file is in.
Hope this helps!

connecting Python 2.6.1 with MySQLdb

I am using Python 2.6.1 and I want to connect to MySQL using MySQLdb. I installed MySQL on my system, and I am trying to use MySQL-python-1.2.2.win32-py2.6 from the http://www.codegood.com/archives/4 site, but it's not working.
When I run my application, it says: No module named MySQLdb
Could anyone please provide the proper setup for MySQLdb?
Thanks in advance.
The best setup for Windows that I've found:
http://www.codegood.com/downloads?dl_cat=2
EDIT: Removed original link (it's an ad farm now :( )
The module is probably not in your Python search path.
Check whether that module is in your Python path. On Windows, you may find it in the registry:
HKLM\Software\Python\PythonCore\2.6\PythonPath
Be careful editing it.
You may also alter the Python path programmatically as follows:
import sys
sys.path.append('somepath_to_the_module_you_wanted')
import the_module_you_wanted
Hope that helps
I was having this problem and then I realised I was importing MySQLdb erroneously - it's case sensitive:
Incorrect: >>>import mysqldb
Correct: >>>import MySQLdb
Silly mistake, but cost me a few hours!
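The case sensitivity can be checked quickly with a module that is always present. This sketch uses the standard-library json module purely as a stand-in for MySQLdb, and is_importable is a hypothetical helper name.

```python
import importlib.util


def is_importable(name):
    """True if the current interpreter can find a top-level module called `name`."""
    return importlib.util.find_spec(name) is not None


# The standard-library module is spelled `json`; other spellings are different names.
for candidate in ("json", "JSON", "Json"):
    print(candidate, is_importable(candidate))
```

As far as I know, Python applies the same case check on case-insensitive file systems such as those on Windows, unless the PYTHONCASEOK environment variable is set.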
Generally, (good) Python modules provide a 'setup.py' script that takes care of things like proper installation (google for 'distutils python'). MySQLdb is a "good" module in this sense.
Since you're using Windows, things might be a bit more complex. I assume you already installed MySQLdb following the instructions and it still gives this problem. What I would do is open a cmd.exe window, cd to the directory containing the 'setup.py' script, and type something like
C:\Python26\Python.exe setup.py install
if this does not work, then grab the module somewhere else, maybe at the place where it is actively developed: http://sourceforge.net/projects/mysql-python/
See this post on the mysql-python blog: MySQL-python-1.2.3 beta 2 released - dated March 2009. Looks like MySQLdb for Python 2.6 is still a work in progress...
I went for a compiled binary; that's the best way to go on Windows. There is a good source maintained by someone.
I wrote about it here before, because some months down the line I will forget how I solved this and be searching Stack again :/
http://vangel.3ezy.com/archives/101-Python-2.4-2.5-2.6-and-2.7-Windows-MySQLdb-python-installation.html
