I am using confluent-kafka and I need to serialize my keys as strings and produce some messages. I have working code for the case where I retrieve the schema from the schema registry and use it to produce a message. The problem is that it fails when I try to read the schema from a local file instead.
The code below is the working one for the schema registry:
import argparse
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import StringSerializer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka import SerializingProducer
import avro.schema
SCHEMA_HOST = '192.168.40.10'
TOPIC = 'my_topic'
SCHEMA = 'path/to/schema.avsc'
# Just parse arguments
parser = argparse.ArgumentParser(description="Avro Kafka Generator")
parser.add_argument('--schema_registry_host', default=SCHEMA_HOST, help="schema registry host")
parser.add_argument('--schema', type=str, default=SCHEMA, help="schema to produce under")
parser.add_argument('--topic', type=str, default=TOPIC, help="topic to publish to")
parser.add_argument('--frequency', type=float, default=1.0, help="number of message per second")
args = parser.parse_args()
# Actual code
schema_registry_conf = {'url': "http://{}:8081".format(SCHEMA_HOST)}
schema_registry_client = SchemaRegistryClient(schema_registry_conf)
schema = schema_registry_client.get_latest_version(subject_name=TOPIC + "-value")
# schema = schema_registry_client.get_schema(schema.schema_id)
schema = schema_registry_client.get_schema(schema.schema_id)
schema_str = schema.schema_str
pro_conf = {"auto.register.schemas": False}
avro_serializer = AvroSerializer(schema_registry_client=schema_registry_client, schema_str=schema_str, conf=pro_conf)
conf = {'bootstrap.servers': "{}:9095".format(args.schema_registry_host),
'schema.registry.url': "http://{}:8081".format(args.schema_registry_host)}
# avro_producer = AvroProducer(conf, default_value_schema=value_schema)
producer_conf = {'bootstrap.servers': "{}:9095".format(SCHEMA_HOST),
'key.serializer': StringSerializer('utf_8'),
'value.serializer': avro_serializer}
avro_producer = SerializingProducer(producer_conf)
But when I try a variation that reads the schema from a local file, it fails:
# Read schema from local file
value_schema = avro.schema.Parse(open(args.schema, "r").read())
schema_str = open(args.schema, "r").read().replace(' ', '').replace('\n', '')
pro_conf = {"auto.register.schemas": True}
avro_serializer = AvroSerializer(schema_registry_client=schema_registry_client, schema_str=schema_str, conf=pro_conf)
this part is common to both versions:
producer_conf = {'bootstrap.servers': "{}:9095".format(SCHEMA_HOST),
'key.serializer': StringSerializer('utf_8'),
'value.serializer': avro_serializer}
avro_producer = SerializingProducer(producer_conf)
avro_producer.produce(topic=args.topic, value=message)
The error I am getting is the following:
KafkaError{code=_VALUE_SERIALIZATION,val=-161,str="'RecordSchema'
object has no attribute 'lookup_schema'"}
Obviously this isn't the best approach, and even if it worked the code would be ugly and error prone. But it doesn't work at all, so I need some help on how to read a local schema and use it with the AvroSerializer afterwards.
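I can't reproduce the failure here, but one thing worth noting: AvroSerializer accepts the schema as a plain JSON string (which is exactly what schema_str already is in the working registry version), so the avro.schema.Parse call and the whitespace stripping shouldn't be needed at all. A minimal sketch of the local-file variant, reusing schema_registry_client, SCHEMA_HOST, args and message from the snippets above (the literal key is just an example):
# Read the raw Avro schema text from the local .avsc file.
with open(args.schema, "r") as f:
    schema_str = f.read()

avro_serializer = AvroSerializer(
    schema_registry_client=schema_registry_client,
    schema_str=schema_str,
    conf={"auto.register.schemas": True},
)

producer_conf = {
    'bootstrap.servers': "{}:9095".format(SCHEMA_HOST),
    'key.serializer': StringSerializer('utf_8'),
    'value.serializer': avro_serializer,
}
avro_producer = SerializingProducer(producer_conf)

# Keys are serialized as UTF-8 strings, values with the Avro schema.
avro_producer.produce(topic=args.topic, key="some-key", value=message)
avro_producer.flush()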
This is quite a specific question about running a Python file under nohup on Linux.
Backstory: I am trying to save down streaming data (an IG Markets broadcast signal). Since I am trying to run it on a remote server (so I don't have to keep my own local desktop up 24/7),
I use nohup, but somehow nohup will not engage while the script listens to the broadcast signal.
Below is the example Python code:
#!/usr/bin/env python
#-*- coding:utf-8 -*-
"""
IG Markets Stream API sample with Python
"""
user_ = 'xxx'
password_ = 'xxx'
api_key_ = 'xxx' # this is the 1st api key
account_ = 'xxx'
acc_type_ = 'xxx'
fileLoc = 'marketdata_IG_spx_5min.csv'
list_ = ["CHART:IX.D.SPTRD.DAILY.IP:5MINUTE"]
fields_ = ["UTM", "LTV", "TTV", "BID_OPEN", "BID_HIGH", \
"BID_LOW", "BID_CLOSE",]
import time
import sys
import traceback
import logging
import warnings
import pandas as pd  # needed for the CSV read/append in on_charts_update
warnings.filterwarnings('ignore')
from trading_ig import (IGService, IGStreamService)
from trading_ig.lightstreamer import Subscription
cols_ = ['timestamp', 'data']
# A simple function acting as a Subscription listener
def on_prices_update(item_update):
# print("price: %s " % item_update)
print("xxxxxxxx
))
# A simple function acting as a Subscription listener
def on_charts_update(item_update):
# print("price: %s " % item_update)
print(xxxxxx"\
.format(
stock_name=item_update["name"], **item_update["values"]
))
res_ = [xxxxx"\
.format(
stock_name=item_update["name"], **item_update["values"]
).split(' '))]
# display(pd.DataFrame(res_))
try:
data_ = pd.read_csv(fileLoc)[cols_]
data_ = data_.append(pd.DataFrame(res_, columns = cols_))
data_.to_csv(fileLoc)
print('there is data and we are reading it')
# display(data_)
except:
pd.DataFrame(res_, columns = cols_).to_csv(fileLoc)
print('there is no data and we are saving first time')
time.sleep(60) # sleep for 1 min
def main():
logging.basicConfig(level=logging.INFO)
# logging.basicConfig(level=logging.DEBUG)
ig_service = IGService(
user_, password_, api_key_, acc_type_
)
ig_stream_service = IGStreamService(ig_service)
ig_session = ig_stream_service.create_session()
accountId = account_
################ my code to set sleep function to sleep/read at only certain time intervals
s_time = time.time()
############################
# Making a new Subscription in MERGE mode
subscription_prices = Subscription(
mode="MERGE",
# make sure to put L1 in front of the instrument name
items= list_,
fields= fields_
)
# adapter="QUOTE_ADAPTER")
# Adding the "on_price_update" function to Subscription
subscription_prices.addlistener(on_charts_update)
# Registering the Subscription
sub_key_prices = ig_stream_service.ls_client.subscribe(subscription_prices)
print('this is the line here')
input("{0:-^80}\n".format("HIT CR TO UNSUBSCRIBE AND DISCONNECT FROM \
LIGHTSTREAMER"))
# Disconnecting
ig_stream_service.disconnect()
if __name__ == '__main__':
main()
#######
Then I try to run it on Linux using this command: nohup python marketdata.py
where marketdata.py is basically the Python code above.
Somehow, nohup will not engage. Can anyone see what I am missing in my code?
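One likely culprit (this is an assumption, not something stated above): main() ends with input(...), and under nohup stdin is redirected away from the terminal, so that call returns or raises EOFError immediately, after which the script unsubscribes and disconnects. A minimal sketch of keeping the subscription alive when no terminal is attached, which would replace the input(...) line in main():
import sys
import time

def wait_for_exit():
    if sys.stdin.isatty():
        # Interactive run: keep the original behaviour and wait for a keypress.
        input("{0:-^80}\n".format("HIT CR TO UNSUBSCRIBE AND DISCONNECT FROM LIGHTSTREAMER"))
    else:
        # Detached run (e.g. under nohup): just sleep until the process is killed.
        while True:
            time.sleep(60)
With that in place, the script can be started with something like nohup python -u marketdata.py > marketdata.log 2>&1 & (the -u flag and the redirection are only suggestions for getting unbuffered log output).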
I have been trying to patch the list_blobs() function of ContainerClient, but have not been able to do so successfully. The code below outputs a MagicMock object, and the function isn't patched as I would expect it to be (I am trying to patch it to return the list ['Blob1', 'Blob2']).
#################Script File
import sys
from datetime import datetime, timedelta
import pyspark
import pytz
import yaml
# from azure.storage.blob import BlobServiceClient, ContainerClient
from pyspark.dbutils import DBUtils as dbutils
import azure.storage.blob
# Open Config
def main():
spark_context = pyspark.SparkContext.getOrCreate()
spark_context.addFile(sys.argv[1])
stream = None
stream = open(sys.argv[1], "r")
config = yaml.load(stream, Loader=yaml.FullLoader)
stream.close()
account_key = dbutils.secrets.get(scope=config["Secrets"]["Scope"], key=config["Secrets"]["Key Name"])
target_container = config["Storage Configuration"]["Container"]
target_account = config["Storage Configuration"]["Account"]
days_history_to_keep = config["Storage Configuration"]["Days History To Keep"]
connection_string = (
"DefaultEndpointsProtocol=https;AccountName="
+ target_account
+ ";AccountKey="
+ account_key
+ ";EndpointSuffix=core.windows.net"
)
blob_service_client: azure.storage.blob.BlobServiceClient = (
azure.storage.blob.BlobServiceClient.from_connection_string(connection_string)
)
container_client: azure.storage.blob.ContainerClient = (
blob_service_client.get_container_client(target_container)
)
blobs = container_client.list_blobs()
print(blobs)
print(blobs)
utc = pytz.UTC
delete_before_date = utc.localize(
datetime.today() - timedelta(days=days_history_to_keep)
)
for blob in blobs:
if blob.creation_time < delete_before_date:
print("Deleting Blob: " + blob.name)
container_client.delete_blob(blob, delete_snapshots="include")
if __name__ == "__main__":
main()
#################Test File
import unittest
from unittest import mock
import DeleteOldBlobs
class DeleteBlobsTest(unittest.TestCase):
def setUp(self):
pass
#mock.patch("DeleteOldBlobs.azure.storage.blob.ContainerClient")
#mock.patch("DeleteOldBlobs.azure.storage.blob.BlobServiceClient")
#mock.patch("DeleteOldBlobs.dbutils")
#mock.patch("DeleteOldBlobs.sys")
#mock.patch('DeleteOldBlobs.pyspark')
def test_main(self, mock_pyspark, mock_sys, mock_dbutils, mock_blobserviceclient, mock_containerclient):
# mock setup
config_file = "Delete_Old_Blobs_UnitTest.yml"
mock_sys.argv = ["unused_arg", config_file]
mock_dbutils.secrets.get.return_value = "A Secret"
mock_containerclient.list_blobs.return_value = ["ablob1", "ablob2"]
# execute test
DeleteOldBlobs.main()
# TODO assert actions taken
# mock_sys.argv.__get__.assert_called_with()
# dbutils.secrets.get(scope=config['Secrets']['Scope'], key=config['Secrets']['Key Name'])
if __name__ == "__main__":
unittest.main()
Output:
<MagicMock name='BlobServiceClient.from_connection_string().get_container_client().list_blobs()' id='1143355577232'>
What am I doing incorrectly here?
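For context, here is my reading of why the patch does not take effect (an interpretation, not something stated in the question): patching DeleteOldBlobs.azure.storage.blob.ContainerClient replaces the class itself, but main() never calls list_blobs() on that class; it calls it on whatever BlobServiceClient.from_connection_string(...).get_container_client(...) returns, which is a child of the BlobServiceClient mock. A short sketch of configuring the return value through that chain inside test_main:
# Reach the object main() actually calls list_blobs() on and stub its return value.
container_client = (
    mock_blobserviceclient.from_connection_string.return_value
    .get_container_client.return_value
)
container_client.list_blobs.return_value = ["Blob1", "Blob2"]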
I'm not able to execute your code at the moment, but I have tried to simulate it. To do this I created the following 3 files in the path /<path-to>/pkg/sub_pkg1 (where pkg and sub_pkg1 are packages).
File ContainerClient.py
def list_blobs(self):
return "blob1"
File DeleteOldBlobs.py
from pkg.sub_pkg1 import ContainerClient
# Open Config
def main():
blobs = ContainerClient.list_blobs()
print(blobs)
print(blobs)
File DeleteBlobsTest.py
import unittest
from unittest import mock
from pkg.sub_pkg1 import DeleteOldBlobs
class DeleteBlobsTest(unittest.TestCase):
def setUp(self):
pass
def test_main(self):
mock_containerclient = mock.MagicMock()
with mock.patch("DeleteOldBlobs.ContainerClient.list_blobs", mock_containerclient.list_blobs):
mock_containerclient.list_blobs.return_value = ["ablob1", "ablob2"]
DeleteOldBlobs.main()
if __name__ == '__main__':
unittest.main()
If you execute the test code you obtain the output:
['ablob1', 'ablob2']
['ablob1', 'ablob2']
This output means that the function list_blobs() is indeed mocked by mock_containerclient.list_blobs.
I don't know whether this post is useful to you, as I'm not able to simulate your code any better at the moment, but I hope my code can inspire you to find your real solution.
The structure of that answer didn't match my solution; perhaps both will work, but it was important for me to patch pyspark even though I never call it directly, or exceptions would be thrown when my code tried to interact with Spark.
Perhaps this will be useful to someone:
#mock.patch("DeleteOldBlobs.azure.storage.blob.BlobServiceClient")
#mock.patch("DeleteOldBlobs.dbutils")
#mock.patch("DeleteOldBlobs.sys")
#mock.patch('DeleteOldBlobs.pyspark')
def test_list_blobs_called_once(self, mock_pyspark, mock_sys, mock_dbutils, mock_blobserviceclient):
# mock setup
config_file = "Delete_Old_Blobs_UnitTest.yml"
mock_sys.argv = ["unused_arg", config_file]
account_key = 'Secret Key'
mock_dbutils.secrets.get.return_value = account_key
bsc_mock: mock.Mock = mock.Mock()
container_client_mock = mock.Mock()
blob1 = Blob('newblob', datetime.today())
blob2 = Blob('oldfile', datetime.today() - timedelta(days=20))
container_client_mock.list_blobs.return_value = [blob1, blob2]
bsc_mock.get_container_client.return_value = container_client_mock
mock_blobserviceclient.from_connection_string.return_value = bsc_mock
# execute test
DeleteOldBlobs.main()
#Assert Results
container_client_mock.list_blobs.assert_called_once()
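The Blob helper used for blob1 and blob2 above isn't defined in the snippet. Since main() only reads .name and .creation_time, a simple stand-in is enough to run the test; this is an assumed definition, not the author's (note that main() compares creation_time against a UTC-localized datetime, so the author's real helper presumably supplies timezone-aware values):
from collections import namedtuple

# Hypothetical stand-in for the Blob helper: main() only touches
# blob.name and blob.creation_time.
Blob = namedtuple('Blob', ['name', 'creation_time'])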
I am new to Azure, and dealing with all these paths is proving to be extremely challenging. I am trying to create a pipeline that contains a dataprep.py step and an AutoML step. What I want to do is (after passing the input to the dataprep block and performing several operations on it) save the resulting TabularDataset in the datastore and also have it as an output, so I can reuse it in my train block.
My dataprep.py file
-----dataprep stuff and imports
parser = argparse.ArgumentParser()
parser.add_argument("--month_train", required=True)
parser.add_argument("--year_train", required=True)
parser.add_argument('--output_path', dest = 'output_path', required=True)
args = parser.parse_args()
run = Run.get_context()
ws = run.experiment.workspace
datastore = ws.get_default_datastore()
name_dataset_input = 'Customer_data_'+str(args.year_train)
name_dataset_output = 'DATA_PREP_'+str(args.year_train)+'_'+str(args.month_train)
# get the input dataset by name
ds = Dataset.get_by_name(ws, name_dataset_input)
df = ds.to_pandas_dataframe()
# apply is one of my dataprep functions that I defined earlier
df = apply(df, args.month_train)
# this is where I am having issues: I want to save this in the datastore but also have it as output
ds = Dataset.Tabular.register_pandas_dataframe(df, args.output_path ,name_dataset_output)
The pipeline step definition:
from azureml.data import OutputFileDatasetConfig
from azureml.pipeline.steps import PythonScriptStep
prepped_data_path = OutputFileDatasetConfig(name="output_path", destination = (datastore, 'managed-dataset/{run-id}/{output-name}'))
dataprep_step = PythonScriptStep(
name="dataprep",
script_name="dataprep.py",
compute_target=compute_target,
runconfig=aml_run_config,
arguments=["--output_path", prepped_data_path, "--month_train", month_train,"--year_train",year_train],
    allow_reuse=True,
)
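For what it's worth, a common pattern here (a sketch based on my reading of azureml-core's OutputFileDatasetConfig, not code from the question) is to treat output_path as a plain output folder inside dataprep.py and let the pipeline definition promote that folder to a registered tabular dataset when the step completes, via read_delimited_files() and register_on_complete(). Roughly:
# Inside dataprep.py: write the prepared frame into the mounted output folder.
import os
os.makedirs(args.output_path, exist_ok=True)
df.to_csv(os.path.join(args.output_path, "prepped.csv"), index=False)

# In the pipeline definition: register the step output as a tabular dataset.
prepped_data_path = (
    OutputFileDatasetConfig(name="output_path",
                            destination=(datastore, 'managed-dataset/{run-id}/{output-name}'))
    .read_delimited_files()                        # interpret the CSV output as tabular
    .register_on_complete(name="DATA_PREP_output")  # dataset name is a placeholder
)
The downstream AutoML/train step can then consume prepped_data_path as an input (for example via as_input()) or fetch the registered dataset by name.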
I'm working on a PoC of Amundsen (Lyft's data catalog tool) for introducing a data catalog.
So I use the file 'sample_mysql_loader.py' and insert my database information.
OS: Ubuntu 18.04
Python: 3.6.9
pip: 21.3.1
elasticsearch: 8.0.0
neo4j: 3.5.26
I installed Amundsen using the Docker quickstart.
import logging
import sys
import textwrap
import uuid
from elasticsearch import Elasticsearch
from pyhocon import ConfigFactory
from sqlalchemy.ext.declarative import declarative_base
from databuilder.extractor.mysql_metadata_extractor import MysqlMetadataExtractor
from databuilder.extractor.neo4j_extractor import Neo4jExtractor
from databuilder.extractor.neo4j_search_data_extractor import Neo4jSearchDataExtractor
from databuilder.extractor.sql_alchemy_extractor import SQLAlchemyExtractor
from databuilder.job.job import DefaultJob
from databuilder.loader.file_system_elasticsearch_json_loader import FSElasticsearchJSONLoader
from databuilder.loader.file_system_neo4j_csv_loader import FsNeo4jCSVLoader
from databuilder.publisher import neo4j_csv_publisher
from databuilder.publisher.elasticsearch_publisher import ElasticsearchPublisher
from databuilder.publisher.neo4j_csv_publisher import Neo4jCsvPublisher
from databuilder.task.task import DefaultTask
from databuilder.transformer.base_transformer import NoopTransformer
import pymysql
pymysql.install_as_MySQLdb()
es_host = None
neo_host = None
if len(sys.argv) > 1:
es_host = sys.argv[1]
if len(sys.argv) > 2:
neo_host = sys.argv[2]
es = Elasticsearch([
{'host': es_host or 'localhost'},
])
DB_FILE = '/tmp/test.db'
SQLITE_CONN_STRING = 'sqlite:////tmp/test.db'
Base = declarative_base()
NEO4J_ENDPOINT = f'bolt://{neo_host or "localhost"}:7687'
neo4j_endpoint = NEO4J_ENDPOINT
neo4j_user = 'neo4j'
neo4j_password = 'test'
LOGGER = logging.getLogger(__name__)
# todo: connection string needs to change
def connection_string():
    user = 'user_name'
    password = 'password'
    host = 'host'
    port = 'port'
    db = 'db_name'
    return "mysql://%s:%s@%s:%s/%s" % (user, password, host, port, db)
def run_mysql_job():
where_clause_suffix = textwrap.dedent("""
    where c.table_schema = 'db_name'
""")
tmp_folder = '/var/tmp/amundsen/table_metadata'
node_files_folder = f'{tmp_folder}/nodes/'
relationship_files_folder = f'{tmp_folder}/relationships/'
job_config = ConfigFactory.from_dict({
f'extractor.mysql_metadata.{MysqlMetadataExtractor.WHERE_CLAUSE_SUFFIX_KEY}': where_clause_suffix,
f'extractor.mysql_metadata.{MysqlMetadataExtractor.USE_CATALOG_AS_CLUSTER_NAME}': True,
f'extractor.mysql_metadata.extractor.sqlalchemy.{SQLAlchemyExtractor.CONN_STRING}': connection_string(),
f'loader.filesystem_csv_neo4j.{FsNeo4jCSVLoader.NODE_DIR_PATH}': node_files_folder,
f'loader.filesystem_csv_neo4j.{FsNeo4jCSVLoader.RELATION_DIR_PATH}': relationship_files_folder,
f'publisher.neo4j.{neo4j_csv_publisher.NODE_FILES_DIR}': node_files_folder,
f'publisher.neo4j.{neo4j_csv_publisher.RELATION_FILES_DIR}': relationship_files_folder,
f'publisher.neo4j.{neo4j_csv_publisher.NEO4J_END_POINT_KEY}': neo4j_endpoint,
f'publisher.neo4j.{neo4j_csv_publisher.NEO4J_USER}': neo4j_user,
f'publisher.neo4j.{neo4j_csv_publisher.NEO4J_PASSWORD}': neo4j_password,
f'publisher.neo4j.{neo4j_csv_publisher.JOB_PUBLISH_TAG}': 'unique_tag', # should use unique tag here like {ds}
})
job = DefaultJob(conf=job_config,
task=DefaultTask(extractor=MysqlMetadataExtractor(), loader=FsNeo4jCSVLoader()),
publisher=Neo4jCsvPublisher())
return job
def create_es_publisher_sample_job(elasticsearch_index_alias='table_search_index',
elasticsearch_doc_type_key='table',
model_name='databuilder.models.table_elasticsearch_document.TableESDocument',
cypher_query=None,
elasticsearch_mapping=None):
"""
:param elasticsearch_index_alias: alias for Elasticsearch used in
amundsensearchlibrary/search_service/config.py as an index
:param elasticsearch_doc_type_key: name the ElasticSearch index is prepended with. Defaults to `table` resulting in
`table_search_index`
:param model_name: the Databuilder model class used in transporting between Extractor and Loader
:param cypher_query: Query handed to the `Neo4jSearchDataExtractor` class, if None is given (default)
it uses the `Table` query baked into the Extractor
:param elasticsearch_mapping: Elasticsearch field mapping "DDL" handed to the `ElasticsearchPublisher` class,
if None is given (default) it uses the `Table` query baked into the Publisher
"""
# loader saves data to this location and publisher reads it from here
extracted_search_data_path = '/var/tmp/amundsen/search_data.json'
task = DefaultTask(loader=FSElasticsearchJSONLoader(),
extractor=Neo4jSearchDataExtractor(),
transformer=NoopTransformer())
# elastic search client instance
elasticsearch_client = es
# unique name of new index in Elasticsearch
elasticsearch_new_index_key = 'tables' + str(uuid.uuid4())
job_config = ConfigFactory.from_dict({
f'extractor.search_data.extractor.neo4j.{Neo4jExtractor.GRAPH_URL_CONFIG_KEY}': neo4j_endpoint,
f'extractor.search_data.extractor.neo4j.{Neo4jExtractor.MODEL_CLASS_CONFIG_KEY}': model_name,
f'extractor.search_data.extractor.neo4j.{Neo4jExtractor.NEO4J_AUTH_USER}': neo4j_user,
f'extractor.search_data.extractor.neo4j.{Neo4jExtractor.NEO4J_AUTH_PW}': neo4j_password,
f'loader.filesystem.elasticsearch.{FSElasticsearchJSONLoader.FILE_PATH_CONFIG_KEY}': extracted_search_data_path,
f'loader.filesystem.elasticsearch.{FSElasticsearchJSONLoader.FILE_MODE_CONFIG_KEY}': 'w',
f'publisher.elasticsearch.{ElasticsearchPublisher.FILE_PATH_CONFIG_KEY}': extracted_search_data_path,
f'publisher.elasticsearch.{ElasticsearchPublisher.FILE_MODE_CONFIG_KEY}': 'r',
f'publisher.elasticsearch.{ElasticsearchPublisher.ELASTICSEARCH_CLIENT_CONFIG_KEY}':
elasticsearch_client,
f'publisher.elasticsearch.{ElasticsearchPublisher.ELASTICSEARCH_NEW_INDEX_CONFIG_KEY}':
elasticsearch_new_index_key,
f'publisher.elasticsearch.{ElasticsearchPublisher.ELASTICSEARCH_DOC_TYPE_CONFIG_KEY}':
elasticsearch_doc_type_key,
f'publisher.elasticsearch.{ElasticsearchPublisher.ELASTICSEARCH_ALIAS_CONFIG_KEY}':
elasticsearch_index_alias,
})
# only optionally add these keys, so need to dynamically `put` them
if cypher_query:
job_config.put(f'extractor.search_data.{Neo4jSearchDataExtractor.CYPHER_QUERY_CONFIG_KEY}',
cypher_query)
if elasticsearch_mapping:
job_config.put(f'publisher.elasticsearch.{ElasticsearchPublisher.ELASTICSEARCH_MAPPING_CONFIG_KEY}',
elasticsearch_mapping)
job = DefaultJob(conf=job_config,
task=task,
publisher=ElasticsearchPublisher())
return job
if __name__ == "__main__":
# Uncomment next line to get INFO level logging
# logging.basicConfig(level=logging.INFO)
loading_job = run_mysql_job()
loading_job.launch()
job_es_table = create_es_publisher_sample_job(
elasticsearch_index_alias='table_search_index',
elasticsearch_doc_type_key='table',
model_name='databuilder.models.table_elasticsearch_document.TableESDocument')
job_es_table.launch()
But I got the error message below. How can I connect my database to Amundsen?
elasticsearch.exceptions.RequestError: RequestError(400, 'mapper_parsing_exception', 'Root mapping definition has unsupported parameters: [table : {properties={schema={analyzer=simple, type=text, fields={raw={type=keyword}}}, cluster={type=text}, description={analyzer=simple, type=text}, display_name={type=keyword}, column_descriptions={analyzer=simple, type=text}, programmatic_descriptions={analyzer=simple, type=text}, tags={type=keyword}, badges={type=keyword}, database={analyzer=simple, type=text, fields={raw={type=keyword}}}, total_usage={type=long}, name={analyzer=simple, type=text, fields={raw={type=keyword}}}, last_updated_timestamp={format=epoch_second, type=date}, unique_usage={type=long}, column_names={analyzer=simple, type=text, fields={raw={type=keyword}}}, key={type=keyword}}}]')
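I can't reproduce this setup, but the error itself is the classic symptom of publishing a mapping that still contains a document type ('table') to an Elasticsearch 7+/8 cluster; mapping types were removed there, while the databuilder ElasticsearchPublisher used in this sample still creates a typed 'table' mapping. A small hedged check of which server version is actually running (the URL is an assumption for the quickstart setup):
from elasticsearch import Elasticsearch

# Print the Elasticsearch server version; the typed 'table' mapping published
# by this sample is only accepted by pre-7 clusters.
es = Elasticsearch("http://localhost:9200")
print(es.info()['version']['number'])
If the server reports 7.x or 8.x, the usual way out is to run the Elasticsearch version pinned by the Amundsen quickstart docker-compose, or a databuilder version whose publisher emits type-less mappings.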
I've been using the following code, which is from the pynetdicom library examples, to query and retrieve some images from a remote server (SCP) to my machine (SCU).
"""
Query/Retrieve SCU AE example.
This demonstrates a simple application entity that support the Patient
Root Find and Move SOP Classes as SCU. In order to receive retrieved
datasets, this application entity must support the CT Image Storage
SOP Class as SCP as well. For this example to work, there must be an
SCP listening on the specified host and port.
For help on usage,
python qrscu.py -h
"""
import argparse
from netdicom.applicationentity import AE
from netdicom.SOPclass import *
from dicom.dataset import Dataset, FileDataset
from dicom.UID import ExplicitVRLittleEndian, ImplicitVRLittleEndian, \
ExplicitVRBigEndian
import netdicom
# netdicom.debug(True)
import tempfile
# parse commandline
parser = argparse.ArgumentParser(description='storage SCU example')
parser.add_argument('remotehost')
parser.add_argument('remoteport', type=int)
parser.add_argument('searchstring')
parser.add_argument('-p', help='local server port', type=int, default=9999)
parser.add_argument('-aet', help='calling AE title', default='PYNETDICOM')
parser.add_argument('-aec', help='called AE title', default='REMOTESCU')
parser.add_argument('-implicit', action='store_true',
                    help='negotiate implicit transfer syntax only',
                    default=False)
parser.add_argument('-explicit', action='store_true',
                    help='negotiate explicit transfer syntax only',
                    default=False)
args = parser.parse_args()
if args.implicit:
ts = [ImplicitVRLittleEndian]
elif args.explicit:
ts = [ExplicitVRLittleEndian]
else:
ts = [
ExplicitVRLittleEndian,
ImplicitVRLittleEndian,
ExplicitVRBigEndian
]
# call back
def OnAssociateResponse(association):
print "Association response received"
def OnAssociateRequest(association):
print "Association resquested"
return True
def OnReceiveStore(SOPClass, DS):
print("FINALLY ENTERED")
print "Received C-STORE", DS.PatientName
try:
# do something with dataset. For instance, store it.
file_meta = Dataset()
file_meta.MediaStorageSOPClassUID = '1.2.840.10008.5.1.4.1.1.2'
# !! Need valid UID here
file_meta.MediaStorageSOPInstanceUID = "1.2.3"
# !!! Need valid UIDs here
file_meta.ImplementationClassUID = "1.2.3.4"
filename = '%s/%s.dcm' % (tempfile.gettempdir(), DS.SOPInstanceUID)
ds = FileDataset(filename, {},
file_meta=file_meta, preamble="\0" * 128)
ds.update(DS)
#ds.is_little_endian = True
#ds.is_implicit_VR = True
ds.save_as(filename)
print "File %s written" % filename
except:
pass
# must return appropriate status
return SOPClass.Success
# create application entity
MyAE = AE(args.aet, args.p, [PatientRootFindSOPClass,
PatientRootMoveSOPClass,
VerificationSOPClass], [StorageSOPClass], ts)
MyAE.OnAssociateResponse = OnAssociateResponse
MyAE.OnAssociateRequest = OnAssociateRequest
MyAE.OnReceiveStore = OnReceiveStore
MyAE.start()
# remote application entity
RemoteAE = dict(Address=args.remotehost, Port=args.remoteport, AET=args.aec)
# create association with remote AE
print "Request association"
assoc = MyAE.RequestAssociation(RemoteAE)
# perform a DICOM ECHO
print "DICOM Echo ... ",
st = assoc.VerificationSOPClass.SCU(1)
print 'done with status "%s"' % st
print "DICOM FindSCU ... ",
d = Dataset()
d.PatientsName = args.searchstring
d.QueryRetrieveLevel = "PATIENT"
d.PatientID = "*"
st = assoc.PatientRootFindSOPClass.SCU(d, 1)
print 'done with status "%s"' % st
for ss in st:
if not ss[1]:
continue
# print ss[1]
try:
d.PatientID = ss[1].PatientID
except:
continue
print "Moving"
print d
assoc2 = MyAE.RequestAssociation(RemoteAE)
gen = assoc2.PatientRootMoveSOPClass.SCU(d, 'SAMTEST', 1)
for gg in gen:
print
print gg
assoc2.Release(0)
print "QWEQWE"
print "Release association"
assoc.Release(0)
# done
MyAE.Quit()
Running the program, I get the following output:
Request association
Association response received
DICOM Echo ... done with status "Success "
DICOM FindSCU ... done with status "<generator object SCU at 0x106014c30>"
Moving
(0008, 0052) Query/Retrieve Level CS: 'PATIENT'
(0010, 0010) Patient's Name PN: 'P*'
(0010, 0020) Patient ID LO: 'Pat00001563'
Association response received
QWEQWE
Moving
(0008, 0052) Query/Retrieve Level CS: 'PATIENT'
(0010, 0010) Patient's Name PN: 'P*'
(0010, 0020) Patient ID LO: 'Pat00002021'
Association response received
QWEQWE
Release association
Echo works, so I know my associations are working, and I am able to query and see the files on the server, as suggested by the output. As you can see, however, OnReceiveStore does not seem to be called. I am pretty new to DICOM and I was wondering what might be the cause. Correct me if I am wrong, but I think the line gen = assoc2.PatientRootMoveSOPClass.SCU(d, 'SAMTEST', 1) is supposed to invoke OnReceiveStore. If not, some insight on how C-STORE should be called is appreciated.
DICOM C-MOVE is a complicated operation if you are not very familiar with DICOM.
You do not call OnReceiveStore yourself; it is called when the SCP starts sending instances to your application. You issue a C-MOVE command to the SCP with the AE title to which you want the images sent, in your case SAMTEST. Since this AE title is the only destination parameter of the C-MOVE command, the SCP needs to be configured beforehand with the IP and port of the SAMTEST AE. Have you done this?
I don't know why the code is not printing the SCP's responses to the C-MOVE command; it looks like it should. Those responses would give a good indication of what is happening. You can also check the logs on the SCP side to see what is going on.
Also, here is some quite good reading about the C-MOVE operation.
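As a side note, the code in the question uses the legacy netdicom 0.x API. Below is a rough sketch of the C-MOVE plus storage-SCP part of the same flow with the current pynetdicom (1.5+) event API, under the same assumption that the remote SCP has been configured with SAMTEST's host and port; the addresses, ports and patient ID are placeholders:
from pydicom.dataset import Dataset
from pynetdicom import AE, evt, StoragePresentationContexts
from pynetdicom.sop_class import PatientRootQueryRetrieveInformationModelMove

def handle_store(event):
    # Called for every C-STORE the remote SCP sends back to us.
    ds = event.dataset
    ds.file_meta = event.file_meta
    ds.save_as(ds.SOPInstanceUID + '.dcm', write_like_original=False)
    return 0x0000  # Success

ae = AE(ae_title='SAMTEST')
ae.supported_contexts = StoragePresentationContexts          # act as Storage SCP
ae.add_requested_context(PatientRootQueryRetrieveInformationModelMove)

# Storage SCP listening on the port the remote SCP will connect back to.
scp = ae.start_server(('', 9999), block=False,
                      evt_handlers=[(evt.EVT_C_STORE, handle_store)])

query = Dataset()
query.QueryRetrieveLevel = 'PATIENT'
query.PatientID = 'PAT_ID_FROM_FIND'   # unique key, e.g. taken from an earlier C-FIND

assoc = ae.associate('remotehost', 104, ae_title='REMOTESCU')  # placeholder address
if assoc.is_established:
    # Ask the remote SCP to send the matching instances to the SAMTEST AE (us).
    for status, identifier in assoc.send_c_move(
            query, 'SAMTEST', PatientRootQueryRetrieveInformationModelMove):
        print(status)
    assoc.release()
scp.shutdown()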