Neo4j Bolt StatementResult to Pandas DataFrame


Based on the example from Neo4j:
from neo4j.v1 import GraphDatabase, basic_auth
driver = GraphDatabase.driver("bolt://localhost", auth=basic_auth("neo4j", "neo4j"))
session = driver.session()
session.run("CREATE (a:Person {name:'Arthur', title:'King'})")
result = session.run("MATCH (a:Person) WHERE a.name = 'Arthur' RETURN a.name AS name, a.title AS title")
for record in result:
    print("%s %s" % (record["title"], record["name"]))
session.close()
Here result is of type neo4j.v1.session.StatementResult. How can I load this data into a pandas DataFrame without explicitly iterating?
pd.DataFrame.from_records(result) doesn't seem to help.
This is what I have, using a list comprehension:
resultlist = [[record['title'], record['name']] for record in result]
pd.DataFrame.from_records(resultlist, columns=['title', 'name'])

The best I can come up with is a list comprehension similar to yours, but less verbose:
df = pd.DataFrame([r.values() for r in result], columns=result.keys())
The py2neo package seems to be more suitable for DataFrames, as it's fairly straightforward to return a list of dictionaries. Here's the equivalent code using py2neo:
import py2neo
# Some of these keyword arguments are unnecessary, as they are the default values.
graph = py2neo.Graph(bolt=True, host='localhost', user='neo4j', password='neo4j')
graph.run("CREATE (a:Person {name:'Arthur', title:'King'})")
query = "MATCH (a:Person) WHERE a.name = 'Arthur' RETURN a.name AS name, a.title AS title"
df = pd.DataFrame(graph.data(query))

Casting result records into dictionaries does the trick:
df = pd.DataFrame([dict(record) for record in result])
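The comprehension-based answers above can be wrapped in small helpers. This is a minimal sketch, assuming each record exposes keys() and values() the way the driver's records do. FakeRecord is a hypothetical stand-in (not part of neo4j) so the snippet runs without a database, and pandas is imported only when a DataFrame is actually requested.

```python
def records_to_rows(records):
    """Return (columns, rows) from an iterable of record-like objects."""
    columns, rows = None, []
    for record in records:
        if columns is None:
            # Take the column names from the first record.
            columns = list(record.keys())
        rows.append(list(record.values()))
    return columns or [], rows


def records_to_frame(records):
    """Build a DataFrame; pandas is imported lazily (assumed installed)."""
    import pandas as pd
    columns, rows = records_to_rows(records)
    return pd.DataFrame(rows, columns=columns)


class FakeRecord:
    """Hypothetical stand-in for a driver record (not part of neo4j)."""

    def __init__(self, **fields):
        self._fields = fields

    def keys(self):
        return list(self._fields)

    def values(self):
        return list(self._fields.values())


demo = [FakeRecord(name="Arthur", title="King"),
        FakeRecord(name="Guinevere", title="Queen")]
columns, rows = records_to_rows(demo)
print(columns, rows)
```

With a live session, records_to_frame(session.run(query)) should give the same frame as the one-liners above, as long as the result has not already been consumed.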

What about:
from neo4j.v1 import GraphDatabase
from pandas import DataFrame
uri = "bolt://localhost:7687"
driver = GraphDatabase.driver(uri, auth=("",""))
get_instances = """MATCH (n)--(m)
RETURN n
LIMIT 10
"""
with driver.session() as graphDB_Session:
    result = graphDB_Session.run(get_instances)
    df = DataFrame(result.records(), columns=result.keys())
Works for me.

In py2neo v4, converting to a pandas DataFrame is even easier:
import py2neo
# uri, username and password are placeholders for your own connection details.
graph = py2neo.Graph(uri, user=username, password=password)
df = graph.run("Your query goes here").to_data_frame()

Related

Problems querying with python to BigQuery (Python String Format)

I am trying to run a BigQuery query from Python that updates all the values of a row. With a plain string the query works, but as soon as I introduce string formatting it fails. Below is the same query, reduced to fewer columns.
The connection to BigQuery (defining the Client, etc.) is already set up and works properly.
I tried:
"UPDATE `riscos-dev.survey_test.data-test-bdrn` SET informaci_meteorol_gica = {inf}, risc = {ri} WHERE objectid = {obj_id}".format(inf = df.informaci_meteorol_gica[index], ri = df.risc[index], obj_id = df.objectid[index])
The input values are:
df.informaci_meteorol_gica[index] = 'Neu', df.risc[index] is also a string, and df.objectid[index] = 3
I get the following error message:
BadRequest: 400 Braced constructors are not supported at [1:77]

Instead of the string's format method, I suggest another approach, using Python f-string formatting:
def build_query():
    inf = "'test_inf'"
    ri = "'test_ri'"
    obj_id = "'test_obj_id'"
    return f"UPDATE `riscos-dev.survey_test.data-test-bdrn` SET informaci_meteorol_gica = {inf}, risc = {ri} WHERE objectid = {obj_id}"
if __name__ == '__main__':
    query = build_query()
    print(query)
The result is:
UPDATE `riscos-dev.survey_test.data-test-bdrn` SET informaci_meteorol_gica = 'test_inf', risc = 'test_ri' WHERE objectid = 'test_obj_id'
I mocked the query parameters in my example with:
inf = "'test_inf'"
ri = "'test_ri'"
obj_id = "'test_obj_id'"
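An alternative to pasting quoted values into the SQL text is to use BigQuery query parameters, which keeps braces out of the statement and leaves quoting to the client library. The following is only a sketch: the column types (STRING, INT64) are assumptions based on the values described in the question, build_params and run_update are hypothetical helper names, and google-cloud-bigquery is imported lazily so the parameter-building part runs without it.

```python
UPDATE_SQL = (
    "UPDATE `riscos-dev.survey_test.data-test-bdrn` "
    "SET informaci_meteorol_gica = @inf, risc = @ri "
    "WHERE objectid = @obj_id"
)


def build_params(inf, ri, obj_id):
    """Return (name, BigQuery type, value) triples for the placeholders."""
    # Types are assumed from the question: two strings and an integer id.
    return [
        ("inf", "STRING", str(inf)),
        ("ri", "STRING", str(ri)),
        ("obj_id", "INT64", int(obj_id)),
    ]


def run_update(client, inf, ri, obj_id):
    """Run the UPDATE with query parameters (needs google-cloud-bigquery)."""
    from google.cloud import bigquery  # imported lazily; assumed installed
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter(name, type_, value)
            for name, type_, value in build_params(inf, ri, obj_id)
        ]
    )
    return client.query(UPDATE_SQL, job_config=job_config).result()


# "Alt" is a made-up value for the risc column, used only for illustration.
print(build_params("Neu", "Alt", 3))
```

With the already configured client from the question, the call would look like run_update(client, df.informaci_meteorol_gica[index], df.risc[index], df.objectid[index]).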

Fetching results from cypher bolt statement

I am trying to access Neo4j using the Neo4j Python driver. I am running the following code to get a property of a thing A. I open the driver and session directly from GraphDatabase and use session.run() to execute graph queries. These queries return a BoltStatementResult object. My question is: how can this object be converted to the actual result that I need (the property of thing A)?
from neo4j import GraphDatabase
uri = "bolt://abc:7687"
driver = GraphDatabase.driver(uri, auth=("neo4j", "password"))
def matchQuestion(tx, intent, thing):
    result = tx.run("MATCH (e:thing) WHERE e.name = {thing} "
                    "RETURN e.description", thing=thing)
    print(result)
with driver.session() as session:
    session.read_transaction(matchQuestion, "define", "A")

result = tx.run("MATCH (e:thing) WHERE e.name = {thing} "
                "RETURN e.description AS description", thing=thing)
for line in result:
    print(line["description"])
or
print(result.single())
You could also specify the item position, like this:
print(result.single()[0])
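Instead of printing inside the transaction function, the value can be returned to the caller. This is a minimal sketch, assuming the result object offers single() as used above and that single() yields None when nothing matches. The $thing placeholder is an assumption about the server version: newer Neo4j releases expect $name, while the {thing} form in the question is the older syntax. get_description is a hypothetical name.

```python
QUERY = (
    "MATCH (e:thing) WHERE e.name = $thing "
    "RETURN e.description AS description"
)


def get_description(tx, thing):
    """Transaction function: return the description, or None if no match."""
    record = tx.run(QUERY, thing=thing).single()
    return record["description"] if record is not None else None


# With a real driver this would be called as (names taken from the question):
#     with driver.session() as session:
#         description = session.read_transaction(get_description, "A")
```

read_transaction passes the function's return value through, so the caller receives the property itself rather than a result object.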

Using Python elasticsearch_dsl with nested objects

I want to use elasticsearch_dsl with Python for the following:
from elasticsearch import Elasticsearch
es_server = 'my_server_name'
es_port = '9200'
es_index_name = 'my_index_name'
es_connection = Elasticsearch([{'host': es_server, 'port': es_port}])
es_query = '{"query":{"bool":{"must":[{"term":{"data.party.fullName":"john do"}}],"must_not":[],"should":[]}},"from":0,"size":1,"sort":[],"facets":{}}'
my_results = es_connection.search(index=es_index_name, body=es_query)
print(my_results)
es_query = '{"query": {"nested" : {"filter" : {"term" : {"party.phoneList.phoneFullNumber" : "4081234567"}},"path" : "party.phoneList"}},"from" :0,"size" : 1}'
my_results = es_connection.search(index=es_index_name, body=es_query)
print(my_results)
I am able to write the first query with the DSL, but I am not sure about the second one:
from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search, Q
client = Elasticsearch('my_server:9200')
s = Search(using=client, index = "my_index").query("term",fullName="john do ")
response = s.execute()
print(response)
I am not sure how to write the DSL query for the nested object party.phoneList.phoneFullNumber.
I am new to ES and could not figure out how to handle nested objects.
I looked at https://github.com/elastic/elasticsearch-dsl-py/issues/28 but could not quite figure it out.
Thanks!

Just use __ instead of . (a dot is not allowed in a Python keyword argument name) and wrap the term query in a nested query:
s = Search(using=client, index="my_index")
s = s.query("nested",
    path="party.phoneList",
    query=Q("term", party__phoneList__phoneFullNumber="4081234567")
)
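For comparison, here is a sketch of the raw request body that this nested query is expected to correspond to, built with plain dictionaries so it can be inspected without a cluster. nested_term_body is a hypothetical helper, and the exact serialization produced by elasticsearch_dsl is an assumption here. Note that it places the term clause under "query", as in the answer, whereas the raw body in the question used "filter".

```python
import json


def nested_term_body(path, field, value, size=1):
    """Build a raw search body with a term query wrapped in a nested query."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"term": {field: value}},
            }
        },
        "from": 0,
        "size": size,
    }


body = nested_term_body(
    "party.phoneList", "party.phoneList.phoneFullNumber", "4081234567"
)
print(json.dumps(body, indent=2))
```

A body like this could be passed to es_connection.search(index=es_index_name, body=body) in the same way as the JSON strings in the question.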

py2neo command in neo4j BOLT driver

I have a command written in python using the py2neo to access the name of an exchange. This works.
graph = Graph()
stmt = "MATCH (i:Index{uniqueID: 'SPY'})-[r:belongsTo]->(e:Exchange) RETURN e.name"
exchName = graph.cypher.execute(stmt)[0][0]
Can this be converted to a BOLT neo4j-driver statement? I always get an error. I want to avoid an iterator statement where I loop through the StatementResult.
driver = GraphDatabase.driver("bolt://localhost", auth=basic_auth("neo4j", "neo4j"))
session = driver.session()
stmt = "MATCH (i:Index{uniqueID: 'SPY'})-[r:belongsTo]->(e:Exchange) RETURN e.name"
exchName = session.run(stmt)[0][0]
TypeError: 'StatementResult' object is not subscriptable

Try to store the results of session.run() in a list to retain them:
driver = GraphDatabase.driver("bolt://localhost", auth=basic_auth("neo4j", "neo4j"))
session = driver.session()
stmt = "MATCH (i:Index{uniqueID: 'SPY'})-[r:belongsTo]->(e:Exchange) RETURN e.name"
# transform to list to retain result
exchName = list(session.run(stmt))[0][0]
See the docs: http://neo4j.com/docs/developer-manual/current/#result-retain
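Converting to a list materializes every record. When only the first value is needed, pulling a single record from the iterator is enough. This is a minimal sketch, assuming the result is iterable and that records can be indexed by position, as the list-based line above implies. first_value is a hypothetical helper and the sample rows are stand-in data, not real query output.

```python
def first_value(records, default=None):
    """Return the first field of the first record, or default if empty."""
    record = next(iter(records), None)
    return default if record is None else record[0]


# Stand-in data: a list of one-field rows in place of session.run(stmt).
print(first_value([("NYSE Arca",)]))
print(first_value([], default="unknown"))
```

With the driver this would read exchName = first_value(session.run(stmt)), which also avoids an IndexError when the query matches nothing.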

How to print all columns in SQLAlchemy ORM

Using SQLAlchemy, I am trying to print out all of the attributes of each model that I have in a manner similar to:
SELECT * from table;
However, I would like to do something with each model instance's information as I get it. So far the best that I've been able to come up with is:
for m in session.query(model).all():
    print([getattr(m, x.__str__().split('.')[1]) for x in model.__table__.columns])
    # additional code
And this will give me what I'm looking for, but it's a fairly roundabout way of getting it. I was kind of hoping for an attribute along the lines of:
m.attributes
# or
m.columns.values
I feel I'm missing something and there is a much better way of doing this. I'm doing this because I'll be printing everything to .CSV files, and I don't want to have to specify the columns/attributes that I'm interested in, I want everything (there's a lot of columns in a lot of models to be printed).

This is an old post, but I ran into a problem with the actual database column names not matching the mapped attribute names on the instance. We ended up going with this:
from sqlalchemy import inspect
inst = inspect(model)
attr_names = [c_attr.key for c_attr in inst.mapper.column_attrs]
Hope that helps somebody with the same problem!
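Since the goal in the question is to dump every attribute to CSV, the attribute names gathered this way can drive a generic writer. This is a minimal sketch using only the standard library: write_csv is a hypothetical helper, and Person is a stand-in class so the snippet runs without a database. In practice attr_names would come from inspect(model).mapper.column_attrs as shown above and instances from session.query(model).

```python
import csv
import io


def write_csv(instances, attr_names, fileobj):
    """Write one header row plus one row per instance to fileobj."""
    writer = csv.writer(fileobj)
    writer.writerow(attr_names)
    for obj in instances:
        writer.writerow([getattr(obj, name) for name in attr_names])


class Person:
    """Hypothetical stand-in for a mapped model instance."""

    def __init__(self, id, name):
        self.id = id
        self.name = name


buffer = io.StringIO()
write_csv([Person(1, "Arthur"), Person(2, "Merlin")], ["id", "name"], buffer)
print(buffer.getvalue())
```

Because the helper looks attributes up by name, it works with mapped attribute names even when they differ from the database column names.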

Probably the shortest solution (see the recent documentation):
from sqlalchemy.inspection import inspect
columns = [column.name for column in inspect(model).c]
The last line might be more readable if rewritten as three lines:
table = inspect(model)
for column in table.c:
    print(column.name)

Building on Rodney L's answer:
model = MYMODEL
columns = [m.key for m in model.__table__.columns]

Take a look at SQLAlchemy's metadata reflection feature.
A Table object can be instructed to load information about itself from the corresponding database schema object already existing within the database. This process is called reflection.

print(repr(model.__table__))
Or just the columns:
print(str(list(model.__table__.columns)))

I believe this is the easiest way:
print([cname for cname in m.__dict__.keys()])
EDIT: The answer above me using sqlalchemy.inspection.inspect() seems to be a better solution.

Put this together and found it helpful:
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
engine = create_engine('mysql+pymysql://testuser:password@localhost:3306/testdb')
DeclarativeBase = declarative_base()
metadata = DeclarativeBase.metadata
metadata.bind = engine
# configure Session class with desired options
Session = sessionmaker()
# associate it with our custom Session class
Session.configure(bind=engine)
# work with the session
session = Session()
And then:
d = {k: metadata.tables[k].columns.keys() for k in metadata.tables.keys()}
Example output of print(d):
{'orderdetails': ['orderNumber', 'productCode', 'quantityOrdered', 'priceEach', 'orderLineNumber'],
'offices': ['addressLine1', 'addressLine2', 'city', 'country', 'officeCode', 'phone', 'postalCode', 'state', 'territory'],
'orders': ['comments', 'customerNumber', 'orderDate', 'orderNumber', 'requiredDate', 'shippedDate', 'status'],
'products': ['MSRP', 'buyPrice', 'productCode', 'productDescription', 'productLine', 'productName', 'productScale', 'productVendor', 'quantityInStock'],
'employees': ['employeeNumber', 'lastName', 'firstName', 'extension', 'email', 'officeCode', 'reportsTo', 'jobTitle'],
'customers': ['addressLine1', 'addressLine2', 'city', 'contactFirstName', 'contactLastName', 'country', 'creditLimit', 'customerName', 'customerNumber', 'phone', 'postalCode', 'salesRepEmployeeNumber', 'state'],
'productlines': ['htmlDescription', 'image', 'productLine', 'textDescription'],
'payments': ['amount', 'checkNumber', 'customerNumber', 'paymentDate']}
Or, alternatively:
from sqlalchemy.sql import text
cmd = "SELECT * FROM information_schema.columns WHERE table_schema = :db ORDER BY table_name,ordinal_position"
result = session.execute(
text(cmd),
{"db": "classicmodels"}
)
result.fetchall()

I'm using SQLAlchemy v1.0.14 on Python 3.5.2.
Assuming you can connect to an engine with create_engine(), I was able to display all columns using the following code. Replace "my connection string" and "my table name" with the appropriate values.
from sqlalchemy import create_engine, MetaData, Table, select
engine = create_engine('my connection string')
conn = engine.connect()
metadata = MetaData(conn)
t = Table("my table name", metadata, autoload=True)
columns = [m.key for m in t.columns]
columns
The last line just displays the column names collected by the previous statement.

You may be interested in what I came up with to do this.
from sqlalchemy.orm import class_mapper
from sqlalchemy.orm.properties import ColumnProperty, RelationshipProperty
import collections
# structure returned by get_metadata function.
MetaDataTuple = collections.namedtuple("MetaDataTuple",
        "coltype, colname, default, m2m, nullable, uselist, collection")
def get_metadata_iterator(class_):
    for prop in class_mapper(class_).iterate_properties:
        name = prop.key
        # skip private attributes and primary/foreign key columns
        if name.startswith("_") or name == "id" or name.endswith("_id"):
            continue
        md = _get_column_metadata(prop)
        if md is None:
            continue
        yield md
def get_column_metadata(class_, colname):
    prop = class_mapper(class_).get_property(colname)
    md = _get_column_metadata(prop)
    if md is None:
        raise ValueError("Not a column name: %r." % (colname,))
    return md
def _get_column_metadata(prop):
    name = prop.key
    m2m = False
    default = None
    nullable = None
    uselist = False
    collection = None
    proptype = type(prop)
    if proptype is ColumnProperty:
        coltype = type(prop.columns[0].type).__name__
        try:
            default = prop.columns[0].default
        except AttributeError:
            default = None
        else:
            if default is not None:
                default = default.arg(None)
        nullable = prop.columns[0].nullable
    elif proptype is RelationshipProperty:
        coltype = RelationshipProperty.__name__
        m2m = prop.secondary is not None
        nullable = prop.local_side[0].nullable
        uselist = prop.uselist
        if prop.collection_class is not None:
            collection = type(prop.collection_class()).__name__
        else:
            collection = "list"
    else:
        return None
    return MetaDataTuple(coltype, str(name), default, m2m, nullable, uselist, collection)

I use this because it's slightly shorter:
for m in session.query(*model.__table__.columns).all():
    print(m)
