ferc_xbrl_extractor.cli
=======================

.. py:module:: ferc_xbrl_extractor.cli

.. autoapi-nested-parse::

   A command line interface (CLI) to the XBRL extractor.


Functions
---------

.. autoapisummary::

   ferc_xbrl_extractor.cli.parse
   ferc_xbrl_extractor.cli.write_to_sqlite
   ferc_xbrl_extractor.cli.write_to_duckdb
   ferc_xbrl_extractor.cli.load_extracted
   ferc_xbrl_extractor.cli.run_main
   ferc_xbrl_extractor.cli.convert_duckdb_into_parquet
   ferc_xbrl_extractor.cli.convert_and_validate_datapackage_sqlite_to_parquet
   ferc_xbrl_extractor.cli.write_datapackage
   ferc_xbrl_extractor.cli.main


Module Contents
---------------

.. py:function:: parse()

   Process base commands from the CLI.


.. py:function:: write_to_sqlite(sqlite_engine: sqlalchemy.engine.Engine, table_name: str, table_data: pandas.DataFrame)

   Write one table to a SQLite database.


.. py:function:: write_to_duckdb(duckdb_path: str, table_name: str, table_data: pandas.DataFrame)

   Write one table to a DuckDB database.


.. py:function:: load_extracted(extracted: ferc_xbrl_extractor.xbrl.ExtractOutput, sqlite_uri: str, duckdb_path: str | None) -> None

   Write extracted data to SQLite/DuckDB databases.


.. py:function:: run_main(filings: list[pathlib.Path] | list[io.BytesIO], taxonomy: str | pathlib.Path | io.BytesIO, output_dir: pathlib.Path, sqlite_path: pathlib.Path, duckdb_path: pathlib.Path, form_number: int, workers: int | None, batch_size: int | None, loglevel: str, logfile: pathlib.Path | None, requested_tables: list[str] | None = None, instance_pattern: str = '')

   Log setup, taxonomy finding, and SQL IO.


.. py:function:: convert_duckdb_into_parquet(duckdb_path: pathlib.Path, parquet_dir: pathlib.Path)

   Convert the DuckDB database into a directory of Parquet files.

   We do this using ``COPY``. We tried using ``EXPORT DATABASE``, but it
   sanitizes the table names, which removes the schedule numbers they
   contain, so we can't use it.


.. py:function:: convert_and_validate_datapackage_sqlite_to_parquet(datapackage_path: pathlib.Path) -> dict

   Convert the SQLite datapackage into one that points at Parquet files.

   * point ``path`` at individual Parquet files rather than at the monolithic SQLite db
   * update format/metadata fields
   * remove irrelevant dialect field


.. py:function:: write_datapackage(datapackage: dict, output_dir: pathlib.Path)

   Write a datapackage to ``datapackage.json`` inside ``output_dir``.

   ``output_dir`` must exist.


.. py:function:: main()

   Parse arguments and pass to run_main.
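

Example
-------

The datapackage steps listed for ``convert_and_validate_datapackage_sqlite_to_parquet`` and ``write_datapackage`` can be illustrated with a minimal standard-library sketch. It is not the module's implementation. The helper names ``point_resources_at_parquet`` and ``write_datapackage_sketch`` are hypothetical. The descriptor keys ``resources``, ``name`` and ``format``, and the one-file-per-resource naming ``<name>.parquet``, are assumptions modelled on a Frictionless-style descriptor. Validation is omitted.

.. code-block:: python

   import copy
   import json
   import tempfile
   from pathlib import Path


   def point_resources_at_parquet(datapackage: dict) -> dict:
       """Return a copy of the descriptor whose resources reference Parquet files."""
       converted = copy.deepcopy(datapackage)  # leave the caller's dict untouched
       for resource in converted.get("resources", []):
           # Assumption: one Parquet file per resource, named after the resource.
           resource["path"] = f"{resource['name']}.parquet"
           # Assumption: "format" is the field that records the storage format.
           resource["format"] = "parquet"
           # The dialect block only describes the SQLite source, so drop it.
           resource.pop("dialect", None)
       return converted


   def write_datapackage_sketch(datapackage: dict, output_dir: Path) -> Path:
       """Write the descriptor to datapackage.json inside an existing directory."""
       if not output_dir.is_dir():
           raise FileNotFoundError(f"{output_dir} must exist before writing")
       target = output_dir / "datapackage.json"
       target.write_text(json.dumps(datapackage, indent=2))
       return target


   # Hypothetical descriptor with a single table stored in one SQLite file.
   example = {
       "name": "example_package",
       "resources": [
           {
               "name": "table_a",
               "path": "example.sqlite",
               "format": "sqlite",
               "dialect": {"table": "table_a"},
           }
       ],
   }

   converted = point_resources_at_parquet(example)
   with tempfile.TemporaryDirectory() as tmp:
       written = write_datapackage_sketch(converted, Path(tmp))
       round_tripped = json.loads(written.read_text())

Working on a deep copy keeps the original SQLite descriptor available, for instance if both descriptors need to be written side by side.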