Reading metadata results from pyoai - python

I'm working with pyoai library on python3.7 to harvest metadata using oai-pmh protocol but i'm getting troubles at the moment of read list of records
from oaipmh.client import Client
from oaipmh.metadata import MetadataRegistry, oai_dc_reader
URL = 'http://revista-iberoamericana.pitt.edu/ojs/index.php/Iberoamericana/oai'
registry = MetadataRegistry()
registry.registerReader('oai_dc', oai_dc_reader)
client = Client(URL, registry)
for record in client.listRecords(metadataPrefix='oai_dc'):
print(record)
i was especting a kind of xml file on tuples, but the results are like this:
(<oaipmh.common.Header object at 0x00000251FAA16A20>, <oaipmh.common.Metadata object at 0x00000251FAA160B8>, None)
(<oaipmh.common.Header object at 0x00000251FA9DB5C0>, <oaipmh.common.Metadata object at 0x00000251FA9C6518>, None)
(<oaipmh.common.Header object at 0x00000251FA9DB0F0>, <oaipmh.common.Metadata object at 0x00000251FA9DB208>, None)
could you tellme if i'm forgetting something

You can use record[1].getMap()
https://tinker.edu.au/resources/recipes/api-via-oai-pmh/

Related

How can I import JSON files

I have a problem. I have several JSON files. I do not want to create manually Collections and import these files. I found this question Bulk import of .json files in arangodb with python, but unfortunately I got an error [OUT] AttributeError: 'Database' object has no attribute 'collection'.
How can I import several JSON files and import them fully automatically via Python in Collections?
from pyArango.connection import *
conn = Connection(username="root", password="")
db = conn.createDatabase(name="test")
a = db.collection('collection_name') # <- here is the error
for x in list_of_json_files:
with open(x,'r') as json_file:
data = json.load(json_file)
a.import_bulk(data)
I also looked at the documentation from ArangoDB https://www.arangodb.com/tutorials/tutorial-python/
There is no "collection" method in db instance, which you try to call in your code on this line:
a = db.collection('collection_name') # <- here is the error
According to docs you should use db.createCollection method of db instance.
studentsCollection = db.createCollection(name="Students")

how to delete documents in batch in firestore using python

I've a collection named XYZ in my firestore. And there are 500 documents with different fields in it.
I have to delete multiple documents using a where clause from the collection.
cred = credentials.Certificate('XXXX')
app = firebase_admin.initialize_app(cred)
db = firestore.Client()
batch = db.batch()
doc_ref = db.collection('collection_name').where(u'month', '==', 07).get()
for doc in doc_ref:
batch.delete(doc)
batch.commit()
I tried this but ending up with an error
AttributeError: AttributeError: 'DocumentSnapshot' object has no attribute '_document_path'
Looking for help!
You are passing a DocumentSnapshot object to batch.delete(), which is not allowed. You must pass a DocumentReference object instead, which can be found in a property of a DocumentSnapshot.
batch.delete(doc.reference)

TypeError when importing json data with pymongo

I am trying to import json data from a link containing valid json data to MongoDB.
When I run the script I get the following error:
TypeError: document must be an instance of dict, bson.son.SON, bson.raw_bson.RawBSONDocument, or a type that inherits from collections.MutableMapping
What am I missing here or doing wrong?
import pymongo
import urllib.parse
import requests
replay_url = "http://live.ksmobile.net/live/getreplayvideos?"
userid = 769630584166547456
url2 = replay_url + urllib.parse.urlencode({'userid': userid}) + '&page_size=1000'
print(f"Replay url: {url2}")
raw_replay_data = requests.get(url2).json()
uri = 'mongodb://testuser:password#ds245687.mlab.com:45687/liveme'
client = pymongo.MongoClient(uri)
db = client.get_default_database()
replays = db['replays']
replays.insert_many(raw_replay_data)
client.close()
I saw that you are getting the video information data for 22 videos.
You can use :
replays.insert_many(raw_replay_data['data']['video_info'])
for saving them
You can make one field as _id for mongodb document
use the following line before insert_many
for i in raw_replay_data['data']['video_info']:
i['_id'] = i['vid']
this will make the 'vid' field as your '_id'. Just make sure that the 'vid' is unique for all videos.

How to convert suds object to xml string

This is a duplicate to this question:
How to convert suds object to xml
But the question has not been answered: "totxt" is not an attribute on the Client class.
Unfortunately I lack of reputation to add comments. So I ask again:
Is there a way to convert a suds object to its xml?
I ask this because I already have a system that consumes wsdl files and sends data to a webservice. But now the customers want to alternatively store the XML as files (to import them later manually). So all I need are 2 methods for writing data: One writes to a webservice (implemented and tested), the other (not implemented yet) writes to files.
If only I could make something like this:
xml_as_string = My_suds_object.to_xml()
The following code is just an example and does not run. And it's not elegant. Doesn't matter. I hope you get the idea what I want to achieve:
I have the function "write_customer_obj_webservice" that works. Now I want to write the function "write_customer_obj_xml_file".
import suds
def get_customer_obj():
wsdl_url = r'file:C:/somepathhere/Customer.wsdl'
service_url = r'http://someiphere/Customer'
c = suds.client.Client(wsdl_url, location=service_url)
customer = c.factory.create("ns0:CustomerType")
return customer
def write_customer_obj_webservice(customer):
wsdl_url = r'file:C:/somepathhere/Customer.wsdl'
service_url = r'http://someiphere/Customer'
c = suds.client.Client(wsdl_url, location=service_url)
response = c.service.save(someparameters, None, None, customer)
return response
def write_customer_obj_xml_file(customer):
output_filename = r'C\temp\testxml'
# The following line is the problem. "to_xml" does not exist and I can't find a way to do it.
xml = customer.to_xml()
fo = open(output_filename, 'a')
try:
fo.write(xml)
except:
raise
else:
response = 'All ok'
finally:
fo.close()
return response
# Get the customer object always from the wsdl.
customer = get_customer_obj()
# Since customer is an object, setting it's attributes is very easy. There are very complex objects in this system.
customer.name = "Doe J."
customer.age = 42
# Write the new customer to a webservice or store it in a file for later proccessing
if later_processing:
response = write_customer_obj_xml_file(customer)
else:
response = write_customer_obj_webservice(customer)
I found a way that works for me. The trick is to create the Client with the option "nosend=True".
In the documentation it says:
nosend - Create the soap envelope but don't send. When specified, method invocation returns a RequestContext instead of sending it.
The RequestContext object has the attribute envelope. This is the XML as string.
Some pseudo code to illustrate:
c = suds.client.Client(url, nosend=True)
customer = c.factory.create("ns0:CustomerType")
customer.name = "Doe J."
customer.age = 42
response = c.service.save(someparameters, None, None, customer)
print response.envelope # This prints the XML string that would have been sent.
You have some issues in write_customer_obj_xml_file function:
Fix bad path:
output_filename = r'C:\temp\test.xml'
The following line is the problem. "to_xml" does not exist and I can't find a way to do it.
What's the type of customer? type(customer)?
xml = customer.to_xml() # to be continued...
Why mode='a'? ('a' => append, 'w' => create + write)
Use a with statement (file context manager).
with open(output_filename, 'w') as fo:
fo.write(xml)
Don't need to return a response string: use an exception manager. The exception to catch can be EnvironmentError.
Analyse
The following call:
customer = c.factory.create("ns0:CustomerType")
Construct a CustomerType on the fly, and return a CustomerType instance customer.
I think you can introspect your customer object, try the following:
vars(customer) # display the object attributes
help(customer) # display an extensive help about your instance
Another way is to try the WSDL URLs by hands, and see the XML results.
You may obtain the full description of your CustomerType object.
And then?
Then, with the attributes list, you can create your own XML. Use an XML template and fill it with the object attributes.
You may also found the magic function (to_xml) which do the job for you. But, not sure the XML format matches your need.
client = Client(url)
client.factory.create('somename')
# The last XML request by client
client.last_sent()
# The last XML response from Web Service
client.last_received()

Python: Automatically updating Twitch moderator list using API

I'm trying to auto update a moderator list using this API:
https://tmi.twitch.tv/group/user/ice3lade/chatters
I am accessing and storing it with
from urllib.request import urlopen
response = urlopen('https://tmi.twitch.tv/group/user/ice3lade/chatters')
chatlist = response.read()
But attempting to simply use it as a dictionary e.g.
print(chatlist("chatters"))
Returns an error
TypeError: 'bytes' object is not callable
I'm a total python noob so any help is appreciated. How do I either access this as a dictionary directly from the API, or how to store the data I get from reading the API as a proper dictionary?
Made a fairly reasonable solution, chatlist gives the full dictionary, chatters gives all the keys and values within the chatters dictionary, and moderators gives the list of moderators.
from urllib.request import urlopen
from json import loads
response = urlopen('https://tmi.twitch.tv/group/user/xflixx_teampokerstars/chatters')
readable = response.read().decode('utf-8')
chatlist = loads(readable)
chatters = chatlist['chatters']
moderators = chatters['moderators']
Didn't know json was required to decode the API.

Categories