ferc_xbrl_extractor.instance

Parse a single instance.

Module Contents

Classes

Period

Pydantic model that defines an XBRL period.

DimensionType

Indicate dimension type.

Axis

Pydantic model that defines an XBRL Axis.

Entity

Pydantic model that defines an XBRL Entity.

Context

Pydantic model that defines an XBRL Context.

Fact

Pydantic model that defines an XBRL Fact.

Instance

Class to encapsulate a parsed instance.

InstanceBuilder

Class to manage parsing XBRL filings.

Functions

instances_from_zip(→ list[InstanceBuilder])

Get list of instances from specified path to zipfile.

get_filing_name(→ str)

Generate the filing filename based on its metadata, as seen in the RSS feed.

get_instances(→ list[InstanceBuilder])

Get list of instances from specified path.

Attributes

ferc_xbrl_extractor.instance.XBRL_INSTANCE = 'http://www.xbrl.org/2003/instance'[source]
class ferc_xbrl_extractor.instance.Period(/, **data: Any)[source]

Bases: pydantic.BaseModel

Pydantic model that defines an XBRL period.

A period can be instantaneous or a duration of time. Instantaneous periods have only the end_date field, while duration periods have both start_date and end_date.

instant: bool[source]
start_date: str | None[source]
end_date: str[source]
classmethod from_xml(elem: lxml.etree._Element) Period[source]

Construct Period from XML element.
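The instant/duration distinction described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: PeriodSketch and period_from_xml are hypothetical names, a dataclass stands in for the Pydantic model, and xml.etree.ElementTree stands in for lxml. The element names (instant, startDate, endDate) come from the XBRL instance namespace.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

# Same namespace URI as the documented XBRL_INSTANCE attribute.
XBRL_INSTANCE = "http://www.xbrl.org/2003/instance"


@dataclass
class PeriodSketch:
    """Hypothetical stand-in for the documented Period model."""

    instant: bool
    start_date: Optional[str]
    end_date: str


def period_from_xml(elem: ET.Element) -> PeriodSketch:
    """Build a PeriodSketch from a period element."""
    instant = elem.find(f"{{{XBRL_INSTANCE}}}instant")
    if instant is not None:
        # Instantaneous periods carry a single date, stored as end_date.
        return PeriodSketch(instant=True, start_date=None, end_date=instant.text)
    # Duration periods carry both a start and an end date.
    start = elem.find(f"{{{XBRL_INSTANCE}}}startDate")
    end = elem.find(f"{{{XBRL_INSTANCE}}}endDate")
    return PeriodSketch(instant=False, start_date=start.text, end_date=end.text)


instant_xml = f'<period xmlns="{XBRL_INSTANCE}"><instant>2021-12-31</instant></period>'
duration_xml = (
    f'<period xmlns="{XBRL_INSTANCE}">'
    "<startDate>2021-01-01</startDate><endDate>2021-12-31</endDate></period>"
)
instant_period = period_from_xml(ET.fromstring(instant_xml))
duration_period = period_from_xml(ET.fromstring(duration_xml))
```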

class ferc_xbrl_extractor.instance.DimensionType(*args, **kwds)[source]

Bases: enum.Enum

Indicate dimension type.

XBRL has two kinds of dimensions: explicit (all allowable values are defined in the taxonomy) and typed (values are dynamic).

EXPLICIT[source]
TYPED[source]
class ferc_xbrl_extractor.instance.Axis(/, **data: Any)[source]

Bases: pydantic.BaseModel

Pydantic model that defines an XBRL Axis.

Axes (or dimensions; the terms are interchangeable in XBRL) are used to identify individual facts when the entity ID and period are insufficient. All axes are turned into columns and become part of the primary key for the table they belong to.

name: str[source]
value: str = ''[source]
dimension_type: DimensionType[source]
classmethod strip_prefix(name: str) str[source]

Strip XML prefix from name.

classmethod from_xml(elem: lxml.etree._Element) Axis[source]

Construct Axis from XML element.
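As a rough illustration of explicit versus typed dimensions, the sketch below reads both kinds of member element from the XBRL dimensions namespace. It is an assumption-laden stand-in rather than the library's code: DimensionKind, strip_prefix_sketch, and axis_from_xml_sketch are hypothetical names, the axis and member names in the sample XML are invented, and whether the library strips the prefix from member values is not stated in this page, so the values are kept as-is here.

```python
import xml.etree.ElementTree as ET
from enum import Enum, auto

# Namespace of the XBRL Dimensions instance elements.
XBRLDI = "http://xbrl.org/2006/xbrldi"


class DimensionKind(Enum):
    """Hypothetical stand-in for the documented DimensionType enum."""

    EXPLICIT = auto()
    TYPED = auto()


def strip_prefix_sketch(name: str) -> str:
    """Drop a leading 'prefix:' from a qualified name, if one is present."""
    return name.split(":", 1)[1] if ":" in name else name


def axis_from_xml_sketch(elem: ET.Element) -> tuple[str, str, DimensionKind]:
    """Return (axis name, value, kind) for an explicit or typed member element."""
    name = strip_prefix_sketch(elem.attrib["dimension"])
    if elem.tag == f"{{{XBRLDI}}}explicitMember":
        # Explicit: the value is the element text, a member defined in the taxonomy.
        return name, (elem.text or "").strip(), DimensionKind.EXPLICIT
    if elem.tag == f"{{{XBRLDI}}}typedMember":
        # Typed: the value is the text of the single child element.
        return name, (elem[0].text or "").strip(), DimensionKind.TYPED
    raise ValueError(f"Not a dimension member element: {elem.tag}")


explicit_xml = (
    f'<xbrldi:explicitMember xmlns:xbrldi="{XBRLDI}" '
    'dimension="ferc:UtilityTypeAxis">ferc:ElectricUtilityMember'
    "</xbrldi:explicitMember>"
)
typed_xml = (
    f'<xbrldi:typedMember xmlns:xbrldi="{XBRLDI}" '
    'xmlns:ferc="http://example.com/ferc" dimension="ferc:PlantNameAxis">'
    "<ferc:PlantNameDomain>Plant A</ferc:PlantNameDomain>"
    "</xbrldi:typedMember>"
)
explicit_axis = axis_from_xml_sketch(ET.fromstring(explicit_xml))
typed_axis = axis_from_xml_sketch(ET.fromstring(typed_xml))
```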

class ferc_xbrl_extractor.instance.Entity(/, **data: Any)[source]

Bases: pydantic.BaseModel

Pydantic model that defines an XBRL Entity.

Entities are used to identify individual XBRL facts. An Entity should contain a unique identifier, as well as any dimensions defined for a table.

identifier: str[source]
dimensions: list[Axis][source]
classmethod from_xml(elem: lxml.etree._Element) Entity[source]

Construct Entity from XML element.

snakecase_dimensions() list[str][source]

Return list of dimension names in snakecase.

check_dimensions(primary_key: list[str]) bool[source]

Check if Context has extra axes not defined in primary key.

class ferc_xbrl_extractor.instance.Context(/, **data: Any)[source]

Bases: pydantic.BaseModel

Pydantic model that defines an XBRL Context.

Contexts are used to provide useful background information for facts. The context indicates the entity, time period, and any other dimensions which apply to the fact.

c_id: str[source]
entity: Entity[source]
period: Period[source]
classmethod from_xml(elem: lxml.etree._Element) Context[source]

Construct Context from XML element.

check_dimensions(primary_key: list[str]) bool[source]

Check if Context has extra axes not defined in primary key.

Facts that are missing axes from the primary key can be treated as totals across those axes, but facts with extra axes would not fit in the table.

Parameters:

primary_key – Primary key of table.
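The rule above (missing axes are acceptable, extra axes are not) can be sketched as a subset check. The function name, the column names, and the choice to return True when the context fits are assumptions made for illustration; the page does not state which boolean the real method returns in each case.

```python
def fits_primary_key(context_axes: list[str], primary_key: list[str]) -> bool:
    """Return True when every axis on the context is a column of the primary key."""
    return all(axis in primary_key for axis in context_axes)


# Hypothetical primary key for a table with one axis column.
primary_key = ["entity_id", "filing_name", "date", "utility_type_axis"]

# No axes at all: can be treated as a total across utility_type_axis.
total_fits = fits_primary_key([], primary_key)
# Exactly the table's axis: fits.
exact_fits = fits_primary_key(["utility_type_axis"], primary_key)
# An axis the table does not define: does not fit.
extra_fits = fits_primary_key(["utility_type_axis", "plant_name_axis"], primary_key)
```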

as_primary_key(filing_name: str, axes: list[str]) dict[str, str][source]

Return a dictionary that represents the context as composite primary key.
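One way to picture a context as a composite primary key is a flat dictionary that combines the entity, filing, period, and axis values. Everything specific in this sketch is an assumption: the function name, the key names (entity_id, date, start_date, end_date), and the "total" placeholder used for axes the context does not carry are illustrative choices, not the library's documented output.

```python
def as_primary_key_sketch(
    entity_id: str,
    filing_name: str,
    period: dict,
    context_axes: dict[str, str],
    axes: list[str],
) -> dict[str, str]:
    """Flatten a context into one dictionary usable as a composite key."""
    key = {"entity_id": entity_id, "filing_name": filing_name}
    if period["instant"]:
        # Instant periods contribute a single date column.
        key["date"] = period["end_date"]
    else:
        # Duration periods contribute a start and an end column.
        key["start_date"] = period["start_date"]
        key["end_date"] = period["end_date"]
    for axis in axes:
        # Axes missing from the context are treated as totals (placeholder value).
        key[axis] = context_axes.get(axis, "total")
    return key


duration_key = as_primary_key_sketch(
    entity_id="C000001",
    filing_name="example_filing",
    period={"instant": False, "start_date": "2021-01-01", "end_date": "2021-12-31"},
    context_axes={},
    axes=["utility_type_axis"],
)
```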

__hash__()[source]

Just hash Context ID as it uniquely identifies contexts for an instance.

class ferc_xbrl_extractor.instance.Fact(/, **data: Any)[source]

Bases: pydantic.BaseModel

Pydantic model that defines an XBRL Fact.

A fact is a single “data point”, which contains a name, value, and a Context to give background information.

name: str[source]
c_id: str[source]
value: str | None[source]
classmethod from_xml(elem: lxml.etree._Element) Fact[source]

Construct Fact from XML element.

f_id() str[source]

A unique identifier for the Fact.

Most fact entries have an id attribute, but some do not, so it cannot be relied on. Instead we assume that each fact is uniquely identified by its context ID and the concept name.

NB: this is a function, not a property. It would be a property, but a property is not pickleable within Pydantic 1.x.
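The identification scheme described above can be illustrated by joining the context ID and concept name into one string. FactSketch is a hypothetical dataclass stand-in for the Pydantic model, and the ":" separator and sample concept name are assumptions; only the idea of combining the two fields comes from the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FactSketch:
    """Hypothetical stand-in for the documented Fact model."""

    name: str
    c_id: str
    value: Optional[str]

    def f_id(self) -> str:
        """Combine context ID and concept name into one identifier."""
        return f"{self.c_id}:{self.name}"


# The same concept reported in two contexts yields two distinct identifiers.
fact_a = FactSketch(name="example_concept", c_id="c-01", value="100")
fact_b = FactSketch(name="example_concept", c_id="c-02", value="250")
```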

class ferc_xbrl_extractor.instance.Instance(contexts: dict[str, Context], instant_facts: dict[str, list[Fact]], duration_facts: dict[str, list[Fact]], filing_name: str, publication_time: datetime.datetime)[source]

Class to encapsulate a parsed instance.

This class should be constructed using the InstanceBuilder class. Instance wraps the contexts and facts parsed by the InstanceBuilder, and is used to construct dataframes from fact tables.

get_facts(instant: bool, concept_names: list[str], primary_key: list[str]) dict[str, list[Fact]][source]

Return a dictionary that maps Context IDs to a list of facts for each context.

Parameters:
  • instant – Whether to get facts with an instant period (True) or a duration period (False).

  • concept_names – Names of concepts, each of which maps to a column name and to the name of facts.

  • primary_key – Names of columns in the primary key, used to filter facts.
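The filtering and grouping described by these parameters might look roughly like the sketch below. It is not the library's implementation: the function name and plain-tuple fact representation are hypothetical, the facts are assumed to be pre-selected by period type (so there is no instant argument), and the axis check mirrors the "no extra axes" rule documented for Context.check_dimensions.

```python
from collections import namedtuple

# Hypothetical lightweight fact record for illustration only.
SketchFact = namedtuple("SketchFact", ["name", "c_id", "value"])


def get_facts_sketch(facts, context_axes, concept_names, primary_key):
    """Group wanted facts by context ID, skipping contexts with extra axes.

    facts: iterable of SketchFact, already limited to one period type.
    context_axes: maps each context ID to the list of axis names it carries.
    """
    wanted = set(concept_names)
    grouped: dict[str, list] = {}
    for fact in facts:
        if fact.name not in wanted:
            continue  # Concept does not belong to this table.
        if any(axis not in primary_key for axis in context_axes[fact.c_id]):
            continue  # Context has an axis the table cannot hold.
        grouped.setdefault(fact.c_id, []).append(fact)
    return grouped


facts = [
    SketchFact("concept_a", "c-01", "1"),
    SketchFact("concept_b", "c-01", "2"),
    SketchFact("concept_a", "c-02", "3"),
    SketchFact("concept_z", "c-01", "4"),
]
context_axes = {"c-01": [], "c-02": ["other_axis"]}
grouped = get_facts_sketch(
    facts, context_axes, ["concept_a", "concept_b"], ["entity_id", "date"]
)
```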

class ferc_xbrl_extractor.instance.InstanceBuilder(file_info: str | BinaryIO, name: str, publication_time: datetime.datetime)[source]

Class to manage parsing XBRL filings.

parse(fact_prefix: str = 'ferc') Instance[source]

Parse a single XBRL instance using XML library directly.

This will return an Instance class which wraps the data parsed from the filing in question.

Parameters:

fact_prefix – Prefix to identify facts in filing (defaults to ‘ferc’).

Returns:

An Instance wrapping the dictionary of contexts in the filing, the dictionaries of facts in the filing, and the name of the filing.

Return type:

Instance

ferc_xbrl_extractor.instance.instances_from_zip(instance_path: pathlib.Path | io.BytesIO) list[InstanceBuilder][source]

Get list of instances from specified path to zipfile.

Parameters:

instance_path – Path to zipfile containing XBRL filings.
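Because instance_path may be either a path or an in-memory buffer, the standard zipfile module is a natural fit, since zipfile.ZipFile accepts both. The sketch below only shows how archive members could be enumerated and read; the function name and the ".xbrl" extension filter are assumptions, and the real function returns InstanceBuilder objects rather than raw bytes.

```python
import io
import zipfile


def read_zip_members(instance_path) -> dict[str, bytes]:
    """Map each filing-like member name in the archive to its raw bytes."""
    with zipfile.ZipFile(instance_path) as archive:
        return {
            name: archive.read(name)
            for name in archive.namelist()
            if name.endswith(".xbrl")  # Assumed filing extension.
        }


# Build a small archive in memory to exercise the sketch.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("filing_1.xbrl", "<xbrl/>")
    archive.writestr("notes.txt", "not a filing")
buffer.seek(0)

members = read_zip_members(buffer)
```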

ferc_xbrl_extractor.instance.get_filing_name(filing_metadata: dict[str, str | int]) str[source]

Generate the filing filename based on its metadata, as seen in the RSS feed.

This uses the same logic as pudl_archiver.archivers.ferc.xbrl.archive_year.

NOTE: the published time appears to be in America/New_York. We need to make the archivers explicitly use UTC everywhere, but until then we will force America/New_York in this function.
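The effect of forcing a timezone onto a naive timestamp can be shown with the standard zoneinfo module. This is only an illustration of the time zone handling mentioned in the note: the function name is hypothetical, the conversion to UTC is added here to make the offset visible, and how the real function formats the time into a filename is not covered.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def published_as_new_york(naive_published: datetime) -> datetime:
    """Interpret a naive published time as America/New_York wall-clock time."""
    return naive_published.replace(tzinfo=ZoneInfo("America/New_York"))


# Noon in New York is 17:00 UTC in winter (EST) and 16:00 UTC in summer (EDT).
winter_utc = published_as_new_york(datetime(2023, 1, 15, 12, 0)).astimezone(timezone.utc)
summer_utc = published_as_new_york(datetime(2023, 7, 15, 12, 0)).astimezone(timezone.utc)
```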

ferc_xbrl_extractor.instance.get_instances(instance_path: pathlib.Path | io.BytesIO) list[InstanceBuilder][source]

Get list of instances from specified path.

Parameters:

instance_path – Path to one or more XBRL filings.