Create HDF5 Group /Table if it does not exist - python

I am building an HDF5 file using the PyTables Python package. The file will be updated every day with the latest tick data. I want to create two groups, Quotes and Trades, and tables for different futures expiries. I want to check whether the group Quotes exists and, if it does not, create it. What is the best way to do this in PyTables?
Here is a code snippet of where I am right now:
hdf_repos_filters = tables.Filters(complevel=1, complib='zlib')
for instrument in instruments:
    if options.verbose:
        ...  # body of this branch was not included in the snippet
    hdf_file = os.path.join(dest_path, "{}.h5".format(instrument))
    store = tables.open_file(hdf_file, mode='a', filters=hdf_repos_filters)
    # This is where I want to check whether the group "Quotes" and "Trades" exist and if not create it

Kapil is on the right track: you do want the __contains__ method. Because it is a double-underscore method, though, it is not intended to be called directly but through an alternate interface. In this case that interface is the in operator. So to check whether a file hdf_file contains a group "Quotes" you can run:
with tables.open_file(hdf_file) as store:
    if "/Quotes" in store:
        print(f"Quotes already exists in the file {hdf_file}")
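To go one step further and create the missing groups, as the question asks, a minimal sketch could look like the following. It relies on File.create_group and File.get_node from PyTables; the helper names ensure_group and open_with_groups are made up for illustration, and mode="a" is taken from the question's snippet.

```python
def ensure_group(store, name):
    """Return the top-level group /<name>, creating it first if it is missing."""
    path = "/" + name
    if path in store:                      # calls File.__contains__ under the hood
        return store.get_node(path)
    return store.create_group("/", name)   # new group directly under the root


def open_with_groups(hdf_file, names=("Quotes", "Trades")):
    """Open hdf_file in append mode and make sure every group in names exists."""
    import tables  # PyTables; imported lazily so ensure_group has no hard dependency

    store = tables.open_file(hdf_file, mode="a")
    for name in names:
        ensure_group(store, name)
    return store  # the caller is responsible for calling store.close()
```

Calling open_with_groups(hdf_file) inside the loop from the question would then hand back a store that is guaranteed to contain /Quotes and /Trades, whether or not the file existed before.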

I think I have figured it out.
I am using the File.__contains__(path) method in the File class in PyTables.
As per the documentation:
File.__contains__(path)
Is there a node with that path?
Returns True if the file has a node with the given path (a string), False otherwise.
PyTables File class

Related

office365-rest-client: How to get name of file modifier

I am mapping a SharePoint document library using office365-rest-client. My intention is to make a dictionary of the form:
file_dict = {File.serverRelativeUrl: [file_attribute_1, file_attribute_2, ...]}
Where File.serverRelativeUrl is a string, and one of the above-mentioned attributes is to be the name of the latest person to modify the file.
The File class (seen here) also has a method modified_by() that I have been trying to use to determine the name of the person who last modified the file. However, using this returns an instance of the User class (seen here).
When looking at the code behind User, it doesn't appear to contain any method that would allow for the name of the modifier to be retrieved.
When looking at the files saved within my SharePoint document library, it is clear that the names of these users are present:
Therefore, I would like to know the following:
Does anybody know of the correct method to determine the names of the file modifiers (if one exists)?
Alternatively, is it possible to determine the email addresses of these users instead?
I have already attempted to use the File.properties attribute, but have found that the modifier name / mailing address is not included within these properties:
import json
# Get file
print(json.dumps(file.properties))

How to check if a key already exists in the database using Firebase and Python?

Essentially, I'll be using a database of this structure:
to keep track of the users' xp. Under the xp_data section, there will be multiple timestamps and xp numbers for each timestamp. A function will run every 24 hours, that will log the users' XP. I want to have some way to check if the player is already in the database (and if so, add to their existing xp count) and if not, create a new node for them. Here is my code for writing to the server:
db_ref = db.reference('/')
for i in range(100):
    tom = await mee6API.levels.get_leaderboard_page(i)
    if xp_trigger:
        break
    this_lb_list = {}
    for l in tom['players']:
        if l['xp'] < 300:
            xp_trigger = True
            break
        this_lb_list.update({l['id']: {'name': l['username'], 'xp_data': {time.strftime(time_format_str, time.gmtime()): l['xp']}}})
        details += [{ int(l['id']) : l['xp']}]
    print(i)
    db_ref.update(this_lb_list)
Basically, this code loops through each page in the leaderboard, obtains the XP for each user, and adds it to a dict, which is then used to update the database. There are two problems with this code: first, it does not check whether the user already exists, and second, as a result, it overwrites the user's existing data. I've also attempted to write the data for each player individually, but problem 1 was still an issue, and it was painfully slow. What can I do to rectify this?
When you pass a value for a property in update(), that value replaces the entire existing value of the property in the database. So while update() leaves the properties you don't specify in the call unmodified, it does completely replace any property you do specify.
To add a value to an existing property, you'll want to specify the entire path as the key, separating the various child nodes with /.
So something like:
this_lb_list.update({'xp_data/13-Auth-2021': l['xp']})
This will write only the 13-Auth-2021 of xp_data, leaving all other child nodes of xp_data unmodified.
You'll of course want to use a variable for the date/time, but the important thing is that you specify it in the key, and not in the value of the dictionary.
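As a rough sketch of how the whole dictionary could be built with such path keys: the helper name build_xp_updates, the sample players, and the timestamp string below are made up for illustration, and the sketch assumes that db_ref.update() accepts keys containing / as described above. Note that Firebase keys may not contain characters such as . or #, so the timestamp format has to avoid them.

```python
def build_xp_updates(players, timestamp, min_xp=300):
    """Build a multi-path update dict with one '<id>/xp_data/<timestamp>' key per player."""
    updates = {}
    for player in players:
        if player["xp"] < min_xp:   # same cut-off as in the question
            break
        user_id = player["id"]
        # Each key is a full path, so only these two leaves are written per user.
        updates[f"{user_id}/name"] = player["username"]
        updates[f"{user_id}/xp_data/{timestamp}"] = player["xp"]
    return updates


# Hypothetical sample data in the shape of tom['players'] from the question.
players = [
    {"id": "111", "username": "alice", "xp": 950},
    {"id": "222", "username": "bob", "xp": 420},
]
updates = build_xp_updates(players, "2021-08-13")
# db_ref.update(updates)  # one call; older xp_data entries stay untouched
```

With keys built this way there should be no need to check whether a user already exists: a new user gets the nodes created, and an existing user only gains one more entry under xp_data.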

How to call a file name when I want to use a small portion of the name?

I am trying for my code to pull a file when only a portion of the file name changes.
Example: I want to pull the file named JEFF_1234.csv where 1234 is an input from a GUI window.
The reason for this file name structure is that I want to have one main database that holds multiple files for a specific part number. So if the user inputs a part number of 1234, that will point to 4 files: JEFF_1234.csv, SAM_1234.csv, FRED_1234.csv and JACK_1234.csv.
What you need is a way to update a template with some dynamic value.
A neat way to do this is to define a template string using curly brackets as place-holders for the content that will be generated at runtime.
jefffile_template = "JEFF_{t}.csv"
Then, once you know the value that belongs in the placeholder, you can convert your template into the actual string:
jeff_filename = jefffile_template.format(t="1234")
Which will store the value of JEFF_1234.csv into the variable jeff_filename for usage later in your program.
There are other similar ways of calling formatting functions, but this by-name style is my preferred method.
...and in case you're wondering, yes, this was still valid in 2.7.
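Extending the same idea to all four files from the question, a small sketch might look like this. The names PREFIXES, FILENAME_TEMPLATE and filenames_for_part, as well as the optional directory argument, are assumptions made for the example.

```python
import os

# Prefixes taken from the file names listed in the question.
PREFIXES = ("JEFF", "SAM", "FRED", "JACK")
FILENAME_TEMPLATE = "{prefix}_{part}.csv"


def filenames_for_part(part_number, directory=""):
    """Return the path of every file that belongs to the given part number."""
    return [
        os.path.join(directory, FILENAME_TEMPLATE.format(prefix=prefix, part=part_number))
        for prefix in PREFIXES
    ]


# part_number would come from the GUI input in the real program.
for path in filenames_for_part("1234"):
    print(path)
```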

Python Docx Module merges tables when added subsequently to document

I'm using the python-docx module and python 3.9.0 to create word docx files with python. The problem I have is the following:
A) I defined a table style named my_table_style
B) I open my template, add one table of that style to my document object and then I store the created file with the following code:
import os
from docx import Document
template_path = os.path.realpath(__file__).replace("test.py","template.docx")
my_file = Document(template_path)
my_file.add_table(1,1,style="my_table_style").rows[-1].cells[0].paragraphs[0].add_run("hello")
my_file.save(template_path.replace("template.docx","test.docx"))
When I now open test.docx, it's all good, there's one table with one row saying "hello".
NOW, when I use this syntax to create two of these tables:
import os
from docx import Document
template_path = os.path.realpath(__file__).replace("test.py","template.docx")
my_file = Document(template_path)
my_file.add_table(1,1,style="my_table_style").rows[-1].cells[0].paragraphs[0].add_run("hello")
my_file.add_table(1,1,style="my_table_style").rows[-1].cells[0].paragraphs[0].add_run("hello")
my_file.save(template_path.replace("template.docx","test.docx"))
Instead of getting two tables, each with one row saying "hello", I get one single table with two rows, each saying "hello". The formatting is however correct, according to my_table_style, so it seems that python-docx merges two subsequently added tables of the same table style. Is this normal behavior? How can I avoid that?
Cheers!
HINTS:
When I use print(len(my_file.tables)) to print the number of tables present in my_file, I actually get "2"! Also, when I change the style used in the second add_table line, it all works, so this seems to be related to using the same style. Any ideas, anyone?
Alright, so I figured it out, it seems to be default behaviour by Word to do what's described above. I manually created a table style my_custom_style in the template.docx file where I customized the table border lines etc. to have the format I want to have as if I would have two tables.
Instead of then using two add_table() statements, I used
new_table = my_file.add_table(1,1,style = "my_custom_style")
first_row = new_table.rows[-1]
second_row = new_table.add_row()
(You can access table styles defined in your template via python-docx simply by passing the style name you gave the table style when you created it manually in the Word template that you open as your Document object. Just make sure you tick the "add this table style to the word template" option when saving the style in Word, and it should all work.) Everything is working now.
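A different workaround, not the one used in the answer above, is to keep two real tables and put an empty paragraph between them, on the assumption that Word only joins tables that sit directly next to each other in the document body. The sketch below uses Document.add_table, Document.add_paragraph and Table.cell from python-docx; the helper name add_separate_tables is made up for illustration.

```python
def add_separate_tables(document, texts, style=None):
    """Add one 1x1 table per text, with an empty paragraph between consecutive tables."""
    tables = []
    for index, text in enumerate(texts):
        if index > 0:
            document.add_paragraph()  # separator so the tables are not adjacent
        table = document.add_table(rows=1, cols=1, style=style)
        table.cell(0, 0).text = text
        tables.append(table)
    return tables


# Usage with the names from the question (requires python-docx and the template):
# from docx import Document
# my_file = Document(template_path)
# add_separate_tables(my_file, ["hello", "hello"], style="my_table_style")
# my_file.save(template_path.replace("template.docx", "test.docx"))
```

The trade-off is a visible blank line between the tables, which the single-table approach from the answer avoids.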

PyYAML and unusual tags

I am working on a project that uses the Unity3D game engine. For some of the pipeline requirements, it is best to be able to update some files from external tools using Python. Unity's meta and anim files are in YAML, so I thought this would be straightforward enough using PyYAML.
The problem is that Unity's format uses custom attributes and I am not sure how to work with them as all the examples show more common tags used by Python and Ruby.
Here is what the top lines of a file look like:
%YAML 1.1
%TAG !u! tag:unity3d.com,2011:
--- !u!74 &7400000
AnimationClip:
  m_ObjectHideFlags: 0
  m_PrefabParentObject: {fileID: 0}
  ...
When I try to read the file I get this error:
could not determine a constructor for the tag 'tag:unity3d.com,2011:74'
Now after looking at all the other questions asked, this tag scheme does not seem to resemble those questions and answers. For example this file uses "!u!" which I was unable to figure out what it means or how something similar would behave (my wild uneducated guess says it looks like an alias or namespace).
I can do a hack way and strip the tags out but that is not the ideal way to try to do this. I am looking for help on a solution that will properly handle the tags and allow me to parse & encode the data in a way that preserves the proper format.
Thanks,
-R
I also had this problem, and the internet was not very helpful. After bashing my head against this problem for 3 days, I was able to sort it out...or at least get a working solution. If anyone wants to add more info, please do. But here's what I got.
1) The documentation on Unity's YAML file format(they call it a "textual scene file" because it contains text that is human readable) - http://docs.unity3d.com/Manual/TextualSceneFormat.html
It is a YAML 1.1 compliant format. So you should be able to use PyYAML or any other Python YAML library to load up a YAML object.
Okay, great. But it doesn't work. Every YAML library has issues with this file.
2) The file is not correctly formed. It turns out, the Unity file has some syntactical issues that make YAML parsers error out on it. Specifically:
2a) At the top, it uses a %TAG directive to create an alias (a tag handle) for the prefix "tag:unity3d.com,2011:". It looks like:
%TAG !u! tag:unity3d.com,2011:
What this means is anywhere you see "!u!", replace it with "tag:unity3d.com,2011:". That is why the error message above shows the tag as 'tag:unity3d.com,2011:74'.
2b) Then it goes on to use "!u!" all over the place before each object stream. But the problem is that - to be YAML 1.1 compliant - it should actually declare the tag alias again for each stream (any time a new object starts with "--- "; in YAML terms each of these is a separate document, and directives apply only to the document that follows them). Declaring it once at the top and never again is only valid for the first stream, and the next stream knows nothing about "!u!", so it errors out.
Also, this tag is useless. It basically prepends "tag:unity3d.com,2011:" to the class ID at the start of each stream. Which we don't care about. We already know it's a Unity YAML file. Why clutter the data?
3) The object types are given by Unity's Class ID. Here is the documentation on that:
http://docs.unity3d.com/Manual/ClassIDReference.html
Basically, each stream is defined as a new class of object...corresponding to the IDs in that link. So a "GameObject" is "1", etc. The line looks like this:
--- !u!1 &100000
So the "--- " defines a new stream. The "!u!" is an alias for "tag:unity3d.com,2011" and the "&100000" is the file ID for this object (inside this file, if something references this object, it uses this ID....remember YAML is a node-based representation, so that ID is used to denote a node connection).
The next line is the root of the YAML object, which happens to be the name of the Unity Class...example "GameObject". So it turns out we don't actually need to translate from Class ID to Human Readable node type. It's right there. If you ever need to use it, just take the root node. And if you need to construct a YAML object for Unity, just keep a dictionary around based on that documentation link to translate "GameObject" to "1", etc.
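A minimal sketch of such a dictionary is shown below. It only contains the two class IDs that appear in this thread (GameObject is 1, AnimationClip is 74); a real table would be filled in from the Class ID reference. The names UNITY_CLASS_IDS and stream_header are made up for the example.

```python
# Class name -> Unity Class ID; only the two IDs mentioned in this thread.
UNITY_CLASS_IDS = {
    "GameObject": 1,
    "AnimationClip": 74,
}


def stream_header(class_name, file_id):
    """Rebuild the '--- !u!<classID> &<fileID>' line that starts a Unity object."""
    return "--- !u!{} &{}".format(UNITY_CLASS_IDS[class_name], file_id)


print(stream_header("GameObject", 100000))
```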
The other problem is that most YAML parsers (PyYAML is the one I tested) only support 3 types of YAML objects out of the box:
Scalar
Sequence
Mapping
You can define/extend custom nodes. But this amounts to hand writing your own YAML parser because you have to define EXPLICITLY how each YAML constructor is created, and outputs. Why would I use a Library like PyYAML, then go ahead and write my own parser to read these custom nodes? The whole point of using a library is to leverage previous work and get all that functionality from day one. I spent 2 days trying to make a new constructor for each class ID in unity. It never worked, and I got into the weeds trying to build the constructors correctly.
THE GOOD NEWS/SOLUTION:
Turns out, all the Unity nodes I've ever run into so far are basic "Mapping" nodes in YAML. So you can throw away the custom node mapping and just let PyYAML auto-detect the node type. From there, everything works great!
In PyYAML, you can pass a file object, or a string. So, my solution was to write a simple 5 line pre-parser to strip out the bits that confuse PyYAML(the bits that Unity incorrectly syntaxed) and feed this new string to PyYAML.
1) Remove line 2 entirely, or just ignore it:
%TAG !u! tag:unity3d.com,2011:
We don't care. We know it's a unity file. And the tag does nothing for us.
2) For each stream declaration, remove the tag alias ("!u!") and remove the class ID. Leave the fileID. Let PyYAML auto-detect the node as a Mapping node.
--- !u!1 &100000
becomes...
--- &100000
3) The rest, output as is.
The code for the pre-parser looks like this:
def removeUnityTagAlias(filepath):
    """
    Name: removeUnityTagAlias()
    Description: Loads a file object from a Unity textual scene file, which is in a pseudo YAML style, and strips the
        parts that are not YAML 1.1 compliant. Then returns a string as a stream, which can be passed to PyYAML.
        Essentially removes the "!u!" tag and the class ID from each "--- " line and keeps the "&" file ID. PyYAML
        seems to handle the rest just fine after that.
    Returns: String (YAML stream as string)
    """
    result = str()
    with open(filepath, 'r') as sourceFile:
        for line in sourceFile:
            if line.startswith('--- !u!'):
                # remove the tag, but keep file ID; strip() drops the newline that split() leaves on the ID
                result += '--- ' + line.split(' ')[2].strip() + '\n'
            else:
                # Just copy the contents...
                result += line
    return result
To create a PyYAML object from a Unity textual scene file, call your pre-parser function on the file:
import yaml
# This fixes Unity's YAML %TAG alias issue.
fileToLoad = '/Users/vlad.dumitrascu/<SOME_PROJECT>/Client/Assets/Gear/MeleeWeapons/SomeAsset_test.prefab'
UnityStreamNoTags = removeUnityTagAlias(fileToLoad)
ListOfNodes = list()
# safe_load_all needs no Loader argument and only builds plain Python objects
for data in yaml.safe_load_all(UnityStreamNoTags):
    ListOfNodes.append( data )
# Example, print each object's name and type
for node in ListOfNodes:
    nodeType = list(node.keys())[0]  # root key = Unity class name; list() is needed on Python 3
    if 'm_Name' in node[nodeType]:
        print( 'Name: ' + str(node[nodeType]['m_Name']) + ' NodeType: ' + nodeType )
    else:
        print( 'Name: ' + 'No Name Attribute' + ' NodeType: ' + nodeType )
Hope that helps!
-Vlad
PS. To Answer the next issue in making this usable:
You also need to walk the entire project directory and parse all ".meta" files for the "GUID", which is Unity's inter-file reference. So, when you see a reference in a Unity YAML file for something like:
m_Materials:
- {fileID: 2100000, guid: 4b191c3a6f88640689fc5ea3ec5bf3a3, type: 2}
That file is somewhere else. And you can recursively open that one to find any dependencies.
I just ripped through the game project and saved a dictionary of GUID:Filepath Key:Value pairs which I can match against.
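A rough sketch of that directory walk could look like the following. It assumes each ".meta" file carries a top-level line of the form "guid: <hex string>" and that the asset it describes has the same path without the ".meta" suffix; the function name build_guid_map is made up for the example.

```python
import os


def build_guid_map(project_root):
    """Map each GUID found in a .meta file to the path of the asset it describes."""
    guid_to_path = {}
    for dirpath, _dirnames, filenames in os.walk(project_root):
        for filename in filenames:
            if not filename.endswith(".meta"):
                continue
            meta_path = os.path.join(dirpath, filename)
            with open(meta_path, "r") as meta_file:
                for line in meta_file:
                    # Assumed layout: a top-level "guid: ..." line in the meta file.
                    if line.startswith("guid:"):
                        guid = line.split(":", 1)[1].strip()
                        guid_to_path[guid] = meta_path[:-len(".meta")]
                        break
    return guid_to_path
```

A guid taken from a reference such as the m_Materials entry above can then be looked up with guid_to_path.get(guid) to find the file to open next.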
