I am trying to import data from Salesforce to Databricks using simple_salesforce. It works fine for objects with few fields, but it started failing when an object has many fields.
Below is what I am trying
from simple_salesforce import Salesforce
import pandas as pd

sf = Salesforce(
    username=username,
    password=password,
    security_token=security_token,
    domain="test"
)
# the SELECT clause lists all of the object's fields, around 1000 of them
df = (pd.DataFrame(sf.query("SELECT <all ~1000 fields> FROM " + table)['records'])
        .dropna(axis='columns', how='all')
        .drop(['attributes'], axis=1))
Error
Error Code 414. Response content: <h1>Bad Message 414</h1><pre>reason: URI Too Long</pre>
SOQL can handle queries up to 100K characters. Have you hit that limit?
Can you cut it into two queries? SELECT Id, ExternalId__c, Afield__c, Bfield__c, ... say up to "M", and then SELECT Id, ExternalId__c, Mfield__c, Nfield__c, ... ?
Or split them another way, maybe into more and less important ones? You could even hide certain fields from the integration user (remove the checkboxes in the Profile / Permission Set) if you can think of something that doesn't have to be synchronised.
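For example, a minimal sketch of the two-query approach, assuming the field API names are already in a Python list (the fields list, the split point, and the merge on Id are illustrative, not from the original post):

import pandas as pd

half = len(fields) // 2  # fields: hypothetical list of the ~1000 field API names (without Id)
q1 = "SELECT Id, " + ", ".join(fields[:half]) + " FROM " + table
q2 = "SELECT Id, " + ", ".join(fields[half:]) + " FROM " + table

df1 = pd.DataFrame(sf.query_all(q1)['records']).drop(['attributes'], axis=1)
df2 = pd.DataFrame(sf.query_all(q2)['records']).drop(['attributes'], axis=1)

# stitch the two halves back together on the record Id
df = df1.merge(df2, on='Id')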
I have an SQL database that stores all my data. I have a column there called “client_name” where I have client1, client2, etc.
I have created a basic dash authentication where as soon as my dash application loads, it asks the user for a username and password.
How can I make it so that if the user enters “client1” as the username, the app automatically filters my SQL database to read only the rows that belong to client1, so that all visuals display client1’s data?
Sample code
import dash
import dash_auth
import pandas as pd
from sqlalchemy import create_engine

# User dictionary
USERNAME_PASSWORD_PAIRS = {
    'client1': 'client1',
    'client2': 'client2'
}

app = dash.Dash(__name__)

# Basic authentication
auth = dash_auth.BasicAuth(app, USERNAME_PASSWORD_PAIRS)

# Connect and read data from SQL Server
odbc_params = f'DRIVER={driver};SERVER=tcp:{server};PORT=1433;DATABASE={database};UID={username};PWD={password}'
connection_string = f'mssql+pyodbc:///?odbc_connect={odbc_params}'
engine = create_engine(connection_string)
query = "SELECT * FROM [dbo].[test]"  # table with all client data
df = pd.read_sql(query, engine)
engine.dispose()
So I want “df” to contain the filtered data depending on whether “client1” or “client2” was used to log in.
Thanks for the help
You have a column in df which is named "client" or something similar. You need to get the name of the logged-in user and then filter the df for it:
username = auth.get_username()
df_filtered_for_client = df[df["client"].isin([username])]
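If loading the whole table first is too heavy, you could also push the filter into the query itself; a sketch assuming the engine from the question and that the SQL column is client_name as described there:

from sqlalchemy import text
import pandas as pd

username = auth.get_username()

# parameterised query: only the logged-in client's rows are read
query = text("SELECT * FROM [dbo].[test] WHERE client_name = :client")
df = pd.read_sql(query, engine, params={"client": username})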
I continue to get an error "ProgrammingError: 002003 (42S02): SQL compilation error: Object 'Table' does not exist or not authorized." I am using the following code:
import snowflake.connector

con = snowflake.connector.connect(
    user="user.name",
    authenticator="externalbrowser",
    warehouse="ware house name",
    database="db name",
    schema="schema name"
)
cur = con.cursor()
sql = "select * from Table"
cur.execute(sql)
df = cur.fetch_pandas_all()
When I execute the code in a Jupyter Notebook, the browser window opens and authenticates my creds, but when it gets to the sql execute line the error above is raised, telling me that the table does not exist. When I open up Snowflake in my browser I can see that the table does exist in the correct warehouse, database and schema I have in my code.
Has anyone else ever experienced this? Do I need to authorize my user to be able to access this table via Python and Jupyter Notebook?
It's likely your session doesn't have a role assigned to it (current role).
You can add the role to your list of connection session parameters,
e.g. add something like the following:
role = 'RICH_ROLE',
You might want to consider setting a default role for your user.
ALTER USER userNameHere SET DEFAULT_ROLE = 'THE_BEST_ROLE';
docs link: https://docs.snowflake.com/en/sql-reference/sql/alter-user.html
Also, when all else fails, use the fully qualified table name; note this won't help much if the role isn't set:
sql = "select * from databaseName.schemaName.TableName"
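To see what the session actually resolved to, you can also inspect the current context from Python before querying; CURRENT_ROLE and friends are standard Snowflake context functions, and con here is the connection from the question:

cur = con.cursor()
cur.execute("SELECT CURRENT_ROLE(), CURRENT_WAREHOUSE(), CURRENT_DATABASE(), CURRENT_SCHEMA()")
print(cur.fetchone())  # a NULL here points at the missing piece of session context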
We have data in a Snowflake cloud database that we would like to move into an Oracle database. As we would like to work toward refreshing the Oracle database regularly, I am trying to use SQLAlchemy to automate this.
I would like to do this using Core because my team is all experienced with SQL, but I am the only one with Python experience. I think it would be easier to tweak the data pulls if we just pass SQL strings. Plus the Snowflake db has some columns with JSON that seems easier to parse using direct SQL since I do not see JSON in the SnowflakeDialect.
I have established connections to both databases and am able to do select queries from both. I have also manually created the tables in our Oracle db so that the keys and datatypes match what I am pulling from Snowflake. When I try to insert, though, my Jupyter notebook just continuously says "Executing Cell" and hangs. Any thoughts on how to proceed or how to get the notebook to tell me where the hangup is?
from sqlalchemy import create_engine,pool,MetaData,text
from snowflake.sqlalchemy import URL
import pandas as pd
eng_sf = create_engine(URL(  # engine for Snowflake
    account='account',
    user='user',
    password='password',
    database='database',
    schema='schema',
    warehouse='warehouse',
    role='role',
    timezone='timezone'
))
eng_o = create_engine("oracle+cx_oracle://{}[{}]:{}@{}".format('user', 'proxy', 'password', 'database'), poolclass=pool.NullPool)  # engine for Oracle
meta_o = MetaData()
meta_o.reflect(bind=eng_o)
person_o = meta_o.tables['bb_lms_person']  # other Oracle tables follow this example
meta_sf = MetaData()
meta_sf.reflect(bind=eng_sf,only=['person']) # other snowflake tables as well, but for simplicity, let's look at one
person_sf = meta_sf.tables['person']
person_query = """
SELECT ID
,EMAIL
,STAGE:student_id::STRING as STUDENT_ID
,ROW_INSERTED_TIME
,ROW_UPDATED_TIME
,ROW_DELETED_TIME
FROM cdm_lms.PERSON
"""
with eng_sf.begin() as connection:
    result = connection.execute(text(person_query)).fetchall()  # this snippet runs and returns result as expected

with eng_o.begin() as connection:
    connection.execute(person_o.insert(), result)  # this is a coin flip: sometimes it runs, sometimes it just hangs forever

eng_sf.dispose()
eng_o.dispose()
I've checked the typical offenders. The keys for both person_o and the result are all lowercase and match. Any guidance would be appreciated.
Use the metadata for the table: build the fTable_Stage update (or insert) as fluent calls and assign the values by column name. This is quite safe because only columns that exist in the table metadata can be used. I am updating three fields: LateProbabilityDNN, Sentiment_Polarity, Sentiment_Subjectivity.
from sqlalchemy import create_engine, MetaData, Table
from sqlalchemy.orm import sessionmaker

engine = create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
connection = engine.connect()
metadata = MetaData()
Session = sessionmaker(bind=engine)
session = Session()
fTable_Stage = Table('fTable_Stage', metadata, autoload=True, autoload_with=engine)
stmt = fTable_Stage.update().where(fTable_Stage.c.KeyID == keyID).values(
    LateProbabilityDNN=round(float(late_proba), 2),
    Sentiment_Polarity=round(my_valance.sentiment.polarity, 2),
    Sentiment_Subjectivity=round(my_valance.sentiment.subjectivity, 2)
)
connection.execute(stmt)
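Back on the original insert that hangs, two things may be worth trying (a sketch, not a confirmed fix: echo=True is SQLAlchemy's built-in statement logging, and converting the fetched rows to plain dicts assumes SQLAlchemy 1.4+):

# recreate the Oracle engine with echo=True so the notebook prints each statement,
# showing the last SQL issued before a hang
eng_o = create_engine(
    "oracle+cx_oracle://{}[{}]:{}@{}".format('user', 'proxy', 'password', 'database'),
    poolclass=pool.NullPool, echo=True
)

# pass plain dicts rather than Row objects to the executemany-style insert
rows = [dict(r._mapping) for r in result]
with eng_o.begin() as connection:
    connection.execute(person_o.insert(), rows)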
I'm pretty new to SQL but I need it for a school project. I'm trying to make a (python) web-app which requires accounts. I'm able to put data into my SQL database but now I need some way to verify if an e-mail (inputted via html form) already exists inside the database. Probably the easiest query ever but I haven't got a single clue on how to get started. :(
I'm sorry if this is a duplicate question but I can't find anything out there that does what I need.
If you are using SQLAlchemy in your project:
from flask import request

@app.route("/check_email")
def check_email():
    # get the email from your form data
    email = request.form.get("email")
    # check whether someone already registered with this email
    user = Users.query.filter_by(email=email).first()
    if not user:
        # the email doesn't exist
        pass
    else:
        # the email exists
        pass
Users.query.filter_by(email=email).first() is equivalent to the SQL:
SELECT * FROM users WHERE email = 'EMAIL_FROM_FORM_DATA'
If you are using pymysql (or something like that):
import pymysql
from flask import request

@app.route("/check_email")
def check_email():
    # get the email from your form data
    email = request.form.get("email")
    conn = pymysql.connect(host='localhost', port=3306, user='', password='', database='essentials')
    cs1 = conn.cursor()
    params = [email]
    # execute() returns the number of affected rows
    count = cs1.execute('select * from users where email=%s', params)  # parameterised to prevent SQL injection
    if count == 0:
        # no user registered with this email
        pass
    else:
        # the email exists
        # and if you want to fetch the user's info:
        user_info = cs1.fetchall()  # user_info will be a tuple of rows
    # close the connection
    cs1.close()
    conn.close()
I was able to solve my issue by simply using INSERT IGNORE and then checking, via the primary key, whether the insert was ignored.
Thank you to everyone that helped out though!
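For reference, a minimal sketch of that pattern with the pymysql cursor from the earlier snippet (the users table and email column are the ones from this thread):

# INSERT IGNORE skips the insert when the key already exists;
# execute() returns the affected-row count, so 0 means the row was ignored
count = cs1.execute('INSERT IGNORE INTO users (email) VALUES (%s)', [email])
if count == 0:
    # the email already existed
    pass
else:
    conn.commit()  # a new row was inserted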
I'm creating an iOS client for App.net and I'm attempting to set up a push notification server. Currently my app can add a user's App.net account id (a string of numbers) and an APNS device token to a MySQL database on my server. It can also remove this data. I've adapted code from these two tutorials:
How To Write A Simple PHP/MySQL Web Service for an iOS App - raywenderlich.com
Apple Push Notification Services in iOS 6 Tutorial: Part 1/2 - raywenderlich.com
In addition, I've adapted this awesome python script to listen in to App.net's App Stream API.
My Python is horrendous, as is my MySQL knowledge. What I'm trying to do is access the APNS device token for the accounts I need to notify. My database table has two fields/columns for each entry: one for user_id and one for device_token. I'm not sure of the terminology, please let me know if I can clarify this.
I've been trying to use peewee to read from the database but I'm in way over my head. This is a test script with placeholder user_id:
import logging
from pprint import pprint
import peewee

db = peewee.MySQLDatabase("...", host="localhost", user="...", passwd="...")

class MySQLModel(peewee.Model):
    class Meta:
        database = db

class Active_Users(MySQLModel):
    user_id = peewee.CharField(primary_key=True)
    device_token = peewee.CharField()

db.connect()

# This is the placeholder user_id
userID = '1234'

token = Active_Users.select().where(Active_Users.user_id == userID)
pprint(token)
This then prints out:
<class '__main__.User'> SELECT t1.`id`, t1.`user_id`, t1.`device_token` FROM `user` AS t1 WHERE (t1.`user_id` = %s) [u'1234']
If the code didn't make it clear, I'm trying to query the database for the row with the user_id of '1234' and I want to store the device_token of the same row (again, probably the wrong terminology) into a variable that I can use when I send the push notification later on in the script.
How do I correctly return the device_token? Also, would it be easier to forgo peewee and simply query the database using python-mysqldb? If that is the case, how would I go about doing that?
The call Active_Users.select().where(Active_Users.user_id == userID) returns a query of matching users, but you are assigning it to a variable called token as if you expected just the device_token.
Your assignment should be this:
matching_users = Active_Users.select().where(Active_Users.user_id == userID)  # a query of matching users, even if there's just one
user = matching_users.first()  # first() returns None when nothing matched
if user is not None:
    token = user.device_token
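And for the second part of the question, the same lookup without peewee, using MySQLdb directly (a sketch; the credentials are placeholders and the table name active_users is assumed):

import MySQLdb

conn = MySQLdb.connect(host="localhost", user="...", passwd="...", db="...")
cur = conn.cursor()

# parameterised query, returning only the column we need
cur.execute("SELECT device_token FROM active_users WHERE user_id = %s", (userID,))
row = cur.fetchone()  # None when no row matched

if row is not None:
    token = row[0]

cur.close()
conn.close()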