I am running this code to import multiple files into a table in MySQL, and it returns 'Engine' object has no attribute 'cursor'. I found many similar topics where the answer was about the pandas version, but I use pandas 0.19, so that might not be the reason. Could anyone help? Or is there any other way to import multiple text files into MySQL?
import MySQLdb
import os
import glob
import pandas
import datetime
import MySQLdb as mdb
import requests
from sqlalchemy import create_engine
engine = create_engine("mysql://vn_user:thientai3004@127.0.0.1/vietnam_stock?charset=utf8")  # '@' separates the credentials from the host
indir = r'E:\DataExport'  # raw string so the backslash is not treated as an escape
os.chdir(indir)
fileList=glob.glob('*.txt')
dfList = []
colnames= ['Ticker','Date','Open','High','Low','Close','Volume']
for filename in fileList:
    print(filename)
    df = pandas.read_csv(filename, header=0)
    df.to_sql('daily_price', engine, if_exists='append', index=False)
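As for other ways to get multiple text files into MySQL: one option is to build a single DataFrame first and write it in one call. This is only a minimal sketch that reuses the engine, fileList and colnames defined above:

# Sketch: read every text file, stack the rows, then write once.
frames = [pandas.read_csv(f, header=0, names=colnames) for f in fileList]
all_prices = pandas.concat(frames, ignore_index=True)
all_prices.to_sql('daily_price', engine, if_exists='append', index=False, chunksize=1000)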
I have one more problem, with my app on Streamlit. I am using the PyDrive library to connect to Google Drive, where I am uploading photos for educational purposes. The problem is that every week the app throws an authentication error. The app is on Streamlit Cloud. In my GitHub directory I have a clients_secrets.json file, a creds.json file and also a settings.yaml file.
Here is my code:
import streamlit as st
import pandas as pd
import requests
import sqlalchemy as db
#import matplotlib.pyplot as plt
import leafmap.foliumap as leafmap
from sqlalchemy import create_engine
from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlalchemy import text
from plotly import graph_objects as go
from plotly.subplots import make_subplots
import branca
#import os
from pathlib import Path
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
#import logging
import numpy as np
from PIL import Image
#logging.getLogger('googleapicliet.discovery_cache').setLevel(logging.ERROR)
gauth = GoogleAuth()       # picks up settings.yaml from the working directory
gauth.CommandLineAuth()    # interactive command-line OAuth flow
drive = GoogleDrive(gauth)
What I am doing at the moment is replacing the creds.json file whenever my credentials expire, but I wish I did not have to do that every time. Any idea how I can make the authentication process automatic, without having to change the creds.json file every week?
So far I have only tried swapping in a new creds.json file every week.
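A pattern that is often suggested for this (just a sketch, assuming creds.json was saved with a refresh token, e.g. get_refresh_token: True in settings.yaml) is to load the stored credentials, refresh them when the access token has expired, and save them back, so the interactive flow is only needed the first time:

from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

gauth = GoogleAuth()                     # reads settings.yaml by default
gauth.LoadCredentialsFile("creds.json")  # previously saved credentials, if any

if gauth.credentials is None:
    gauth.CommandLineAuth()              # first run: interactive flow
elif gauth.access_token_expired:
    gauth.Refresh()                      # expired: use the stored refresh token
else:
    gauth.Authorize()                    # still valid: just authorize

gauth.SaveCredentialsFile("creds.json")  # persist the (possibly refreshed) credentials
drive = GoogleDrive(gauth)

Note that on Streamlit Cloud the app also needs somewhere writable to persist the refreshed creds.json, which is a separate constraint from the code itself.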
I am getting that error and have already checked the forums without finding an answer, as I do have the correct imports installed. Here is my current import list:
import xlwings as xw
import xlsxwriter as xlsx
import xlrd
import xlwt
from xlutils.copy import copy
import pandas as pd
import win32com.client as win32
import openpyxl as xl
from openpyxl import load_workbook
import numpy as np
import datetime
import os.path
import warnings
Here's the code I'm getting errors on:
wb = xlsx.Workbook(File_path)
ws = ws
ws.set_column('A:A', 60)
ws comes from:
ws = wb.sheets['Data']
Any help would be greatly appreciated. Thanks!
You need to replace
ws = wb.sheets['Data']
with
ws = wb.get_worksheet_by_name('Data')
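For context, here is a minimal self-contained sketch of that flow with xlsxwriter (the file name is just an example standing in for File_path). In a workbook created by xlsxwriter the sheet has to be added before get_worksheet_by_name can find it, otherwise the call returns None:

import xlsxwriter

wb = xlsxwriter.Workbook('example.xlsx')  # example path, stands in for File_path
wb.add_worksheet('Data')                  # a new workbook has no sheets until one is added
ws = wb.get_worksheet_by_name('Data')     # returns None if no sheet with that name exists
ws.set_column('A:A', 60)                  # widen column A to 60 characters
wb.close()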
Language: Python 3.8.3
I ran into this error when working with my xlxs file: ModuleNotFoundError: No module named 'xlxswriter'
import xlxswriter
import pandas as pd
from pandas import DataFrame
path = ('mypath.xlxs')
xl = pd.ExcelFile(path)
print(xl.sheet_names)
How can I fix this?
Instead of typing xlxs, type xlsx, like this:
import xlsxwriter
import pandas as pd
from pandas import DataFrame
path = ('mypath.xlsx')
xl = pd.ExcelFile(path)
print(xl.sheet_names)
It'll work.
The module name is xlsxwriter not xlxswriter, so replace that line with:
import xlsxwriter
I am trying to access Gremlin via AWS Glue with PySpark as the runtime. As gremlinpython is an external library, I downloaded its .whl file and placed it in AWS S3. Then it asked for "aenum", so I did the same, and then isodate was required. So I just wanted to know whether there is any package I can use instead of adding each module separately.
Below is the sample script I am testing initially, with all the modules, to keep it simple.
import boto3
import os
import sys
import site
import json
import pandas as pd
#from setuptools.command import easy_install
from importlib import reload
from io import StringIO
s3 = boto3.client('s3')
#dir_path = os.path.dirname(os.path.realpath(__file__))
#os.path.dirname(sys.modules['__main__'].__file__)
#install_path = os.environ['GLUE_INSTALLATION']
#easy_install.main( ["--install-dir", install_path, "gremlinpython"] )
#(site)
from gremlin_python import statics
from gremlin_python.structure.graph import Graph
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.process.traversal import T, Column
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
The required libraries are listed below; after adding them there are no more module-related errors.
tornado-6.0.4-cp35-cp35m-win32.whl
isodate-0.6.0-py2.py3-none-any.whl
aenum-2.2.4-py3-none-any.whl
gremlinpython-3.4.8-py2.py3-none-any.whl
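If the Glue version in use supports it (Glue 2.0 and later), one way to avoid tracking each wheel by hand (a sketch only; the job name is hypothetical) is to let pip resolve gremlinpython and its dependency chain through the --additional-python-modules job argument:

import boto3

glue = boto3.client('glue')

# Ask Glue to pip-install gremlinpython (and with it aenum, isodate, tornado, ...)
# at job start instead of shipping individual wheels from S3.
glue.start_job_run(
    JobName='my-gremlin-job',  # hypothetical job name
    Arguments={
        '--additional-python-modules': 'gremlinpython==3.4.8',
    },
)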
I am loading website data into Postgres using Python/Django. There are around 9 CSV files that I am loading into the database, but it is taking very long, around 10 hours, to fetch the data from the website and load it into Postgres. I would like to improve the performance of this load. Can you guys please help with that?
This is just one DataFrame, but I have 9 more like it; overall the data is less than 500k records.
from django.db import models
from django_pandas.managers import DataFrameManager
import pandas as pd
from sqlalchemy import create_engine
import zipfile
import os
from urllib.request import urlopen
import urllib.request
import io
from io import BytesIO
class mf(models.Model):
    # '@' separates the credentials from the host in the connection string
    pg_engine = create_engine('postgresql://user:password@server:host/db')

    # download the zip archive in memory and read one pipe-delimited file from it
    zf = zipfile.ZipFile(BytesIO(urllib.request.urlopen('http://md_file.zip').read()))
    df1 = pd.read_csv(zf.open('nm.dat'), header=None, delimiter='|', index_col=0,
                      names=['aaa', 'xxxx', 'yyy', 'zzz'])
    df1.to_sql('nm', pg_engine, if_exists='replace')
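One change that usually helps the load step itself (a sketch adapted from the COPY example in the pandas documentation; it assumes pandas 0.24+ for the method argument and psycopg2 as the Postgres driver) is to pass a custom method to to_sql so each DataFrame is bulk-loaded with COPY instead of row-by-row INSERTs:

import csv
from io import StringIO

def psql_insert_copy(table, conn, keys, data_iter):
    # Bulk-load rows with PostgreSQL COPY (adapted from the pandas docs).
    dbapi_conn = conn.connection  # raw psycopg2 connection
    with dbapi_conn.cursor() as cur:
        buf = StringIO()
        csv.writer(buf).writerows(data_iter)
        buf.seek(0)
        columns = ', '.join('"{}"'.format(k) for k in keys)
        table_name = '{}.{}'.format(table.schema, table.name) if table.schema else table.name
        cur.copy_expert('COPY {} ({}) FROM STDIN WITH CSV'.format(table_name, columns), buf)

# then, for each DataFrame:
df1.to_sql('nm', pg_engine, if_exists='replace', method=psql_insert_copy, chunksize=10000)

The same callable can be reused for the other files; the remaining time is then mostly the download rather than the inserts.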