open access file in python - python

I am not able to open an Access file using Python. I am not sure whether the problem is with the .mdb file or with the Python commands.
In [1]: import sys, subprocess
In [2]: DATABASE = 'Exam_BackUp.mdb'
In [3]: table_names = subprocess.Popen(["mdb-tables", "-1", DATABASE], stdout=subprocess.PIPE).communicate()[0]
Couldn't open database.
How can I tell whether the file is a Microsoft Access file?
I have checked that mdbtools is installed on my Ubuntu server.
I need to open the (Access or Fortran) file and save the contents to CSV.

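Why not try opening it with an ODBC driver?
A good example is here; reproducing it for your case would be along the lines of:
import pyodbc
# Assumes an Access ODBC driver is installed. The driver name below is the
# one used on Windows; other platforms register a different name.
DBfile = 'Exam_BackUp.mdb'  # DBQ expects the path to the database file
conn = pyodbc.connect('DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=' + DBfile)
cursor = conn.cursor()
# Do whatever you want with SQL selects, etc
cursor.close()
conn.close()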

You can convert it from the terminal using mdbtools like this:
Install mdbtools (on Ubuntu the command-line tools come from the system package):
sudo apt-get install mdbtools
Then look for the names of the tables inside the mdb file:
home/Docs$ mdb-tables 'file.mdb'
And finally convert a table to .csv with this line:
home/Docs$ mdb-export 'file.mdb' 'name_of_table' > 'file.csv'
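To address the "is this really an Access file?" part of the question from Python, here is a minimal sketch. It assumes the usual Jet/ACE header layout, where the format name starts at byte offset 4, and it assumes the mdb-tables and mdb-export commands are on the PATH. The function names looks_like_access_file and export_tables_to_csv are made up for this example.

```python
import subprocess
from pathlib import Path

# Bytes 4..18 of a Jet/ACE file spell out one of these format names.
ACCESS_SIGNATURES = (b"Standard Jet DB", b"Standard ACE DB")


def looks_like_access_file(path):
    """Return True if the file header carries an Access (Jet/ACE) signature."""
    with open(path, "rb") as handle:
        header = handle.read(32)
    return header[4:19] in ACCESS_SIGNATURES


def export_tables_to_csv(database, out_dir="."):
    """Export every table to <out_dir>/<table>.csv using the mdbtools CLI."""
    if not looks_like_access_file(database):
        raise ValueError(f"{database} does not look like an Access database")
    # "-1" makes mdb-tables print one table name per line.
    names = subprocess.run(
        ["mdb-tables", "-1", str(database)],
        check=True, capture_output=True, text=True,
    ).stdout.splitlines()
    written = []
    for table in filter(None, names):
        target = Path(out_dir) / f"{table}.csv"
        with open(target, "wb") as out:
            subprocess.run(["mdb-export", str(database), table], check=True, stdout=out)
        written.append(target)
    return written
```

If looks_like_access_file returns False for Exam_BackUp.mdb, the "Couldn't open database." message most likely comes from the file itself rather than from the Python commands.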

Related

Successfully access MySQL database from python using optuna

Since the optuna documentation does not say which MySQL components are required, I installed the complete MySQL package on my Windows 10 machine. I then searched for MySQL on my PC (the installer does not reveal the installation folder) and added the following to the Path variable:
C:\Program Files\MySQL\MySQL Server 8.0\bin
I have successfully created the mysqltesteexample database.
Using python SQL connectors, I can reproduce the output using:
import mysql.connector
mydb = mysql.connector.connect(
    host="localhost",
    user="root",
    password="Start123"
)
print(mydb)
mycursor = mydb.cursor()
mycursor.execute("SHOW DATABASES")
for x in mycursor:
    print(x)
mydb = mysql.connector.connect(
    host="localhost",
    user="root",
    password="Start123",
    database="mysqltesteexample"
)
Connecting to mysqltesteexample does not raise an error, so everything seems to be fine. However, optuna is not able to connect to my database.
My Python script looks like this. It is the code from the optuna documentation; I only changed the name of the test database.
study0 = optuna.create_study(storage="mysql://root@localhost/mysqltesteexample",study_name="distributed-example")
study0 = optuna.create_study(storage="mysql+pymysql://root:Start123@localhost:3306/mysqltesteexample",study_name="distributed-example")
All attempts to modify the URL string according to https://docs.sqlalchemy.org/en/14/core/engines.html failed with the following error: ImportError: Failed to import DB access module for the specified storage URL. Please install appropriate one.
Can you please help me to get it done? Thank you in advance, please don't be too harsh.
Finally, I made it. I had to install some further packages from the cmd:
py -3.8 -m easy_install mysql-python
py -3.8 -m pip install mysqlclient
Python packages: as well documented as they are (rolls eyes).
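For anyone hitting the same ImportError: the scheme of the storage URL decides which Python driver gets imported. A plain mysql:// URL defaults to the MySQLdb module (which mysqlclient provides), while mysql+pymysql:// needs the pymysql package. Below is a rough sketch of building such a URL safely. The helper name build_mysql_storage_url is made up for this example, and the credentials are the ones from the question.

```python
from urllib.parse import quote_plus


def build_mysql_storage_url(user, password, host, database, port=3306, driver="pymysql"):
    """Build a SQLAlchemy-style URL such as mysql+pymysql://user:pw@host:3306/db."""
    # quote_plus escapes characters like "@" or "/" that would break the URL.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    return f"mysql+{driver}://{credentials}@{host}:{port}/{database}"


url = build_mysql_storage_url("root", "Start123", "localhost", "mysqltesteexample")
print(url)

# With optuna and the chosen driver installed (e.g. `pip install pymysql`),
# the URL would then be passed on as:
# study = optuna.create_study(storage=url, study_name="distributed-example")
```

Note that user and host are separated by "@", and that the driver named after the "+" must be importable in the same interpreter that runs optuna.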

Export dataframe in pyspark to excel file given the 'openpyxl' module is not installed

I am trying to write my Spark dataframes to an Excel file to generate the desired reports, by converting them to pandas dataframes and then using
panda_df = df.toPandas()
writer = pd.ExcelWriter(filename)
panda_df.to_excel(writer,'Sheet1', startcol = 0, startrow = 0)
This gives the following error:
File "/usr/lib64/python2.6/site-packages/pandas/io/excel.py", line 350, in __init__
from openpyxl.workbook import Workbook
ImportError: No module named openpyxl.workbook
I am running this on a remote server and do not have admin rights. sudo apt-get fails with "Sudo: apt-get: command not found", and I cannot use pip either because it is not installed. Is there any other way I can write my dataframes to Excel?
You can proceed as follows.
You can clone the library from its source repository here:
git clone https://bitbucket.org/openpyxl/openpyxl
Go into the openpyxl directory, then run the following to install it for your user without admin permission:
python setup.py install --user
Then, you can add the path to openpyxl to your code as follows:
import sys
sys.path.append('/path/to/openpyxl/folder')
import pandas as pd
panda_df = df.toPandas()
writer = pd.ExcelWriter(filename)
panda_df.to_excel(writer, 'Sheet1', startcol=0, startrow=0)
writer.save()  # writes the workbook to disk (recent pandas versions use writer.close())
Alternatively, you can use the Spark2 datasource of the HadoopOffice library (which also supports Python). It can read and write Excel files that are encrypted, linked to other workbooks, contain metadata, etc.
Furthermore, it has a low-footprint mode, which lets you write larger Excel files quickly without requiring large amounts of memory or CPU:
https://github.com/ZuInnoTe/spark-hadoopoffice-ds
The datasource is based on the HadoopOffice library, which enables virtually any Hadoop application to read/write Excel files because it provides corresponding Hadoop FileInputFormats and FileOutputFormats:
https://github.com/ZuInnoTe/hadoopoffice
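If nothing at all can be installed on the server, one possible fallback (not a true .xlsx file) is to write CSV, which Excel opens directly and which only needs the standard library. The sketch below assumes Python 3. The helper name write_rows_as_csv and the sample data are made up, standing in for df.columns and df.collect() on the Spark dataframe.

```python
import csv
import os
import tempfile


def write_rows_as_csv(path, header, rows):
    """Write a header plus data rows to a CSV file that Excel can open."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


# Hypothetical stand-in for `df.columns` and `df.collect()` on a Spark dataframe.
header = ["name", "score"]
rows = [("alice", 90), ("bob", 85)]
report_path = os.path.join(tempfile.gettempdir(), "report.csv")
write_rows_as_csv(report_path, header, rows)
```

Formatting, multiple sheets and formulas are lost this way, so this only fits plain tabular reports.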

Twittersearch module is not getting imported to api file

I am trying to build a Twitter search API. Everything was working fine, but suddenly the TwitterSearch module is no longer being imported. I am using Python 3.4.2 on Windows 8.1 64-bit. I tried easy_install twittersearch; it installs the package successfully and everything looks fine, but the error appears when I run this code:
from TwitterSearch import *
import pyodbc
cnxn = pyodbc.connect(driver='{SQL Server}', server='localhost', database='capstone', trusted_connection='yes')
cursor = cnxn.cursor()
cursor.execute("select word from dbo.search where sl in (select max(sl) from dbo.search)")
for row in cursor.fetchall():
    print(row)
    print("This is positive data ")
    term = row[0]
try:
    tso = TwitterSearchOrder()
    tso.set_keywords([term])
    tso.set_language('en')
except TwitterSearchException as e:  # report errors raised by the library
    print(e)
When I execute this, it shows the following error:
Traceback (most recent call last):
File "C:\Python34\search_test.py", line 18, in <module>
tso=TwitterSearchOrder()
NameError: name 'TwitterSearchOrder' is not defined
But the module is actually there, and I don't know why it is not being recognized. Until two days ago it ran successfully in IDLE, but not from the command prompt. I have since reinstalled Python and added all the tools, and now it shows this error in both IDLE and the command prompt.
TIA
Make sure that TwitterSearch is in your lib\site-packages path. If it is, you must have installed it correctly. If not, a simple python -m pip install TwitterSearch from the command prompt will do.
However, there seems to be an issue (bug) with TwitterSearch with respect to its functions. You might consider using an alternative API in the meantime; I would suggest Tweepy.
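One thing worth checking (an assumption, not a confirmed diagnosis) is whether Python is importing the package you expect. A local file with a similar name, for example a script called twittersearch.py on a case-insensitive filesystem, can shadow the installed package, and then from TwitterSearch import * brings in none of the expected names. The sketch below shows the check. The helper name describe_module is made up, and it is demonstrated on the standard-library json module so that it runs anywhere.

```python
import importlib


def describe_module(module_name, expected_attribute):
    """Report which file a module was loaded from and whether it exposes a name."""
    module = importlib.import_module(module_name)
    return {
        "file": getattr(module, "__file__", None),
        "has_attribute": hasattr(module, expected_attribute),
    }


# Demonstrated on a standard-library module so the sketch runs anywhere;
# for the question above the call would be
# describe_module("TwitterSearch", "TwitterSearchOrder").
info = describe_module("json", "dumps")
print(info["file"])
print(info["has_attribute"])
```

If the reported file is not under lib\site-packages, or has_attribute is False, the import is picking up something other than the installed library.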

sqlite3.connect() not working in python 3.3

It's probably something quite easy, but I can't figure out why my script won't work. I'm trying to connect to my sqlite3 database, but Eclipse returns the error: "Undefined variable from import: connect". I'm running Python 3.3 in a virtualenv on Linux. Thanks for your help!
from urllib.request import urlopen
import datetime
import sqlite3
class Crawler():
    def storeContent(self, html, url):
        conn = sqlite3.connect('database.db')
        c = conn.cursor()
        c.execute("INSERT .. ", [item, item])
        conn.commit()  # commit() belongs to the connection, not the cursor
        c.close()
        conn.close()
It seems like Alex Barcelo resolved this issue here.
What worked for me on Ubuntu was almost the same*:
cd /usr/lib/python2.7/lib-dynload/
sudo ln -s _sqlite3.x86_64-linux-gnu.so _sqlite3.so
After that, I had to reconfigure the Python Interpreter for my PyDev project:
Project Properties -> PyDev-Interpreter/Grammar -> Click here to configure an interpreter not listed, then delete, run auto-config for the python environment you're using, and hit "Apply".
*Replace "python2.7" with the version of python you're using sqlite3 with, and if "_sqlite3.x86_64-linux-gnu.so" is not the right name of the file for your linux system, you can normally search for it using "locate _sqlite3"
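Since "Undefined variable from import" is reported by PyDev's code analysis rather than by Python itself, it can help to confirm separately that the interpreter really can load and use sqlite3. Here is a minimal sketch using an in-memory database. The table name pages and its columns are made up for this example.

```python
import sqlite3


def sqlite_round_trip():
    """Create an in-memory database, insert one row, and read it back."""
    conn = sqlite3.connect(":memory:")
    try:
        cur = conn.cursor()
        cur.execute("CREATE TABLE pages (url TEXT, html TEXT)")
        cur.execute("INSERT INTO pages VALUES (?, ?)", ("http://example.com", "<html></html>"))
        conn.commit()  # commit on the connection object
        cur.execute("SELECT url, html FROM pages")
        return cur.fetchall()
    finally:
        conn.close()


print(sqlite3.sqlite_version)
print(sqlite_round_trip())
```

If this runs from the virtualenv's interpreter without an ImportError, the module itself is fine and only the PyDev interpreter configuration needs refreshing.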

Using portable python to connect to an Access database

I want to use Portable Python 2.7.x to connect to an Access database. I can't seem to get it working, as it doesn't include the pyodbc library. Is there another way to connect using Portable Python?
The newest version of Portable Python has an option to install pyodbc, but you have to select it yourself because it is not installed by default.
Click on the modules option
Select the option for pyodbc
I did it in a different way.
Here is what I did on my Mac (Snow Leopard):
Download the pyodbc source.
Extract it and cd into that directory. Run 'python setup.py build' and then take the 'pyodbc.so' file from the build directory. Create a new Python file named 'pyodbc.py' with the content given below (and put the 'pyodbc.so' file next to it).
def __bootstrap__():
    global __bootstrap__, __loader__, __file__
    import sys, pkg_resources, imp
    __file__ = pkg_resources.resource_filename(__name__,'pyodbc.so')
    __loader__ = None; del __bootstrap__, __loader__
    imp.load_dynamic(__name__,__file__)
__bootstrap__()
(Remember to put the above code in a file named 'pyodbc.py' and to keep the 'pyodbc.so' file next to it.)
Finally, put both files wherever you want to use them, or add their location to sys.path at run time:
>>> import sys
>>> sys.path.insert(0,"/my_portable/location") # location to dir which contains those two files
After doing all this, I put those two files next to my test Python file, and there I am able to import 'pyodbc' without installing it.
>>> import pyodbc
>>> dir(pyodbc)
