Hi all. I am trying to open an intake catalog from a YAML file with intake.open_catalog('test.yaml'), but I get a ConstructorError: could not determine a constructor for the tag 'tag:yaml.org,2002:python/object:intake.catalog.local.LocalCatalogEntry'
Does anyone know why this could be? Am I using the wrong open command? I also tried intake.open_yaml_file_cat(). Or is my YAML file perhaps formatted incorrectly? The script and my YAML file are here.
intake.open_catalog('test.yaml') is the right command. The contents of the YAML file are the culprit. How was the test.yaml file produced?
Never mind... I see the notebook
I generate it in the intake_serialize.ipynb notebook in that repository. I am trying to load the URL of a catalog as a catalog, walk down a few levels of depth, and save that new catalog as a YAML file that can also be loaded as a catalog.
Thanks for helping!
The meat of it is:
with open('test.yaml', 'w') as f:
    f.write(yaml.dump(stac_cat.walk(depth=10)))
@jukent, since stac_cat.walk(...) returns a dictionary with Python objects that may or may not be serializable, serializing this dict results in an invalid YAML file... You will need to jump through some hoops to get a valid YAML file :frown:.
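A minimal sketch of why this happens, assuming only PyYAML is installed. The Entry class is a hypothetical stand-in for an object such as LocalCatalogEntry; it is not intake code. yaml.dump() writes the object with a Python-specific tag, and yaml.safe_load() then refuses to construct it, which is the same kind of error reported at the top of the thread.

```python
import yaml


class Entry:
    """Hypothetical stand-in for a custom object such as LocalCatalogEntry."""

    def __init__(self, name):
        self.name = name


# The default dumper serializes arbitrary objects using a Python-specific tag.
text = yaml.dump({"item": Entry("demo")})
has_python_tag = "!!python/object" in text

# The safe loader does not know how to construct Python-specific tags.
try:
    yaml.safe_load(text)
    loaded_ok = True
except yaml.constructor.ConstructorError:
    loaded_ok = False

print(text)
print("safe_load succeeded:", loaded_ok)
```

So the dump step succeeds without complaint; the problem only shows up when the file is read back with a loader that does not know these tags.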
I don't know how to deal with these problematic Python objects (e.g. the satstac.item.Item, which I believe comes from https://github.com/sat-utils/sat-stac/blob/master/satstac/item.py).
It's my understanding that satstac.item.Item isn't serializable. So, if your goal is to serialize the walked catalog, you may have to exclude some of the items.
An alternative would be to serialize just the top level (the parents). For example:
Screen-Shot-2021-08-05-at-11.50.01-AM.png
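One way to picture the "exclude some of the items" idea is sketched below. It is only an illustration under assumptions: keep_plain and Opaque are hypothetical names, the nested dict is made up rather than taken from a real walk() result, and the output is not guaranteed to be a valid intake catalog. It just shows how to drop values that the safe dumper cannot represent.

```python
import yaml

_DROP = object()  # sentinel marking a value that should be excluded
_PLAIN = (str, int, float, bool, type(None))


def keep_plain(value):
    """Recursively keep plain YAML types and drop everything else."""
    if isinstance(value, _PLAIN):
        return value
    if isinstance(value, dict):
        cleaned = {k: keep_plain(v) for k, v in value.items()}
        return {k: v for k, v in cleaned.items() if v is not _DROP}
    if isinstance(value, (list, tuple)):
        return [v for v in map(keep_plain, value) if v is not _DROP]
    return _DROP  # custom objects end up here


class Opaque:
    """Hypothetical non-serializable object (think satstac.item.Item)."""


# Made-up nested structure standing in for a walked catalog.
walked = {
    "name": "demo",
    "depth": 2,
    "item": Opaque(),
    "children": [{"id": "a", "obj": Opaque()}],
}

text = yaml.safe_dump(keep_plain(walked))
round_trip = yaml.safe_load(text)
print(round_trip)
```

The trade-off is that the dropped objects are simply gone from the file, so this only helps when the remaining plain fields are what you need.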
Thanks @Anderson Banihirwe. Could you point me to some documentation on what makes a valid YAML that can be opened by intake? I thought any dictionary could be turned into a YAML.
I think the purpose of the project is to find those hoops and figure out how to jump through them to get more than just the top-level in the YAML.
Could you point me to some documentation on what makes a valid YAML that can be opened by intake?
There isn't documentation about this :frown:.
I thought any dictionary could be turned into a YAML.
That's right... It's the YAML loading part that creates all sorts of issues. intake uses the default yaml loader (which isn't aware of some of these custom objects).
The main test I use: if you can read the YAML file with the default pyyaml loader, intake should be able to do the same (https://github.com/intake/intake/blob/6959346c1db430547546627989875ce0cbdfb53f/intake/utils.py#L75)
import yaml
with open('test.yaml') as f:
    data = yaml.safe_load(f)
I don't know if creating custom YAML loaders is going to help for your use case, but here's an example in case you are interested: https://stackoverflow.com/questions/58924168/loading-custom-objects-with-pyyaml
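The safe_load check above can be wrapped in a small helper, sketched here under assumptions: loads_with_safe_loader is a hypothetical name, and the two sample files are made up for illustration (the first loosely follows the sources/driver/args layout of an intake catalog, the second carries a Python-specific tag).

```python
import os
import tempfile

import yaml


def loads_with_safe_loader(path):
    """Return (True, "") if yaml.safe_load can read the file, else (False, reason)."""
    try:
        with open(path) as f:
            yaml.safe_load(f)
    except yaml.YAMLError as err:
        return False, str(err)
    return True, ""


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.yaml")
    bad = os.path.join(tmp, "bad.yaml")

    # Plain mappings and strings only.
    with open(good, "w") as f:
        f.write("sources:\n  demo:\n    driver: csv\n    args:\n      urlpath: data.csv\n")

    # Contains a Python object tag that the safe loader rejects.
    with open(bad, "w") as f:
        f.write("item: !!python/object:some.module.Thing {}\n")

    good_result = loads_with_safe_loader(good)
    bad_result = loads_with_safe_loader(bad)

print(good_result)
print(bad_result[0])
```

Passing this check only means the file is plain YAML; whether intake accepts its contents as a catalog is a separate question.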
Thanks @Anderson Banihirwe. It seems the yaml.dump() method adds all sorts of things to the YAML that the intake.yaml() method excludes, and I need to figure out what those are so that I can save the YAML a few layers down in a form that is still readable by intake. I will reach out if I have any more specific questions.
Even just knowing that I am reading the YAML correctly is a huge help in finding the error!
:+1: sounds good...
By the way, there's a shortcut method .save() for saving an intake catalog object to a YAML file:
In [14]: url = 'https://raw.githubusercontent.com/sat-utils/sat-stac/master/test/catalog/catalog.json'
In [15]: cat = intake.open_stac_catalog(url)
In [16]: list(cat)
Out[16]: ['stac-catalog-eo']
In [17]: walked_cat_dict = cat.walk(depth=10)
In [18]: type(walked_cat_dict)
Out[18]: dict
In [19]: walked_cat = intake.catalog.Catalog.from_dict(walked_cat_dict)
In [21]: walked_cat.save('test.yaml')
@Joe Hamman and I were looking at that method today as a possible avenue as well.
Last updated: May 16 2025 at 17:14 UTC