pipeline-sec-filings's Introduction

Pre-Processing Pipeline for SEC Filings

This repo implements a document pre-processing pipeline for SEC filings. Currently, the pipeline is capable of extracting narrative text from user-specified sections in 10-K, 10-Q, and S-1 filings.

Developer Quick Start

  • Using pyenv to manage virtualenvs is recommended

    • Mac install instructions. See here for more detailed instructions.
      • brew install pyenv-virtualenv
      • pyenv install 3.8.15
    • Linux instructions are available here.
  • Create a virtualenv to work in and activate it, e.g. for one named sec-filings:

    pyenv virtualenv 3.8.15 sec-filings
    pyenv activate sec-filings

  • Run make install

  • Start a local Jupyter notebook server with make run-jupyter
    OR
    start only the FastAPI app locally with make run-web-app

Quick Tour

You can run this Colab notebook to see how pipeline-section.ipynb extracts the narrative text sections from an SEC Filing and defines an API.

Extracting Narrative Text from an SEC Filing

To retrieve narrative text section(s) from an iXBRL S-1, 10-K, or 10-Q document (or amended version S-1/A, 10-K/A, or 10-Q/A), post the document to the /section API. You can try this out by downloading the sample documents using make dl-test-artifacts. Then, from the sample-docs folder, run:

curl -X 'POST' \
  'http://localhost:8000/sec-filings/v0.2.1/section' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F '[email protected]' \
  -F section=RISK_FACTORS | jq -C . | less -R

Note that additional -F section parameters may be included in the curl request to fetch multiple sections at once. Valid sections for 10-Ks, 10-Qs, and S-1s are available on the SEC website. You can also reference this file for a list of valid section parameters, e.g. RISK_FACTORS or MANAGEMENT_DISCUSSION.
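For callers who prefer Python to curl, a request carrying several section fields could be assembled with the standard library alone. This is a minimal sketch, not code from this repo: build_section_request and its parameters are hypothetical, the file name is a placeholder, and the name of the upload form field is left as a parameter rather than assumed.

```python
import uuid
from typing import Iterable
from urllib.request import Request


def build_section_request(
    url: str,
    file_field: str,
    filename: str,
    file_bytes: bytes,
    sections: Iterable[str],
) -> Request:
    """Build (but do not send) a multipart POST with one part per section."""
    boundary = uuid.uuid4().hex
    parts = []
    # Repeating the "section" field mirrors passing several -F section=... flags.
    for section in sections:
        parts.append(
            (
                f"--{boundary}\r\n"
                'Content-Disposition: form-data; name="section"\r\n\r\n'
                f"{section}\r\n"
            ).encode("utf-8")
        )
    # The filing itself is sent as a file part.
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{file_field}"; '
            f'filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    headers = {
        "accept": "application/json",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return Request(url, data=b"".join(parts), headers=headers, method="POST")


request = build_section_request(
    "http://localhost:8000/sec-filings/v0.2.1/section",
    file_field="file",  # assumption: use the field name the API expects
    filename="filing.txt",  # hypothetical file name
    file_bytes=b"<html>...</html>",  # placeholder for the filing contents
    sections=["RISK_FACTORS", "MANAGEMENT_DISCUSSION"],
)
# Once the local API is running, send it with urllib.request.urlopen(request).
```

Building the request separately from sending it makes it easy to inspect the body before anything goes over the wire.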

You'll get back a response that looks like the following. Piping through jq and less formats and colors the output and lets you scroll through the results.

{
  "RISK_FACTORS": [
    {
      "text": "You should carefully consider the risks described in this section. Our future performance is subject to risks and uncertainties that could have a material adverse effect on our business, results of operations, and financial condition and the trading price of our common stock. We may be subject to other risks and uncertainties not presently known to us. In addition, please see our note about forward-looking statements included in the MD&A.",
      "type": "NarrativeText"
    },
    {
      "text": "Our revenue is subject to volatility in metal prices, which could negatively affect our results of operations or cash flow.",
      "type": "NarrativeText"
    },
    {
      "text": "Market prices for gold, silver, copper, nickel, and other metals may fluctuate widely over time and are affected by numerous factors beyond our control. These factors include metal supply and demand, industrial and jewelry fabrication, investment demand, central banking actions, inflation expectations, currency values, interest rates, forward sales by metal producers, and political, trade, economic, or banking conditions.",
      "type": "NarrativeText"
    },
    ...
  ]
}
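Once the JSON has been decoded, the response is a mapping from section name to a list of elements. The sketch below shows one way to pull out just the narrative text. The helper name narrative_text_by_section is hypothetical, and the "Title" element type in the sample is invented for illustration; only NarrativeText appears in the output above.

```python
from typing import Dict, List


def narrative_text_by_section(
    response: Dict[str, List[dict]]
) -> Dict[str, List[str]]:
    """Keep only the text of NarrativeText elements, grouped by section."""
    return {
        section: [
            element["text"]
            for element in elements
            if element.get("type") == "NarrativeText"
        ]
        for section, elements in response.items()
    }


# A shortened stand-in for a decoded API response.
sample_response = {
    "RISK_FACTORS": [
        {
            "text": "Our revenue is subject to volatility in metal prices.",
            "type": "NarrativeText",
        },
        # "Title" is a made-up element type, used here to show the filter.
        {"text": "Item 1A", "type": "Title"},
    ]
}

texts = narrative_text_by_section(sample_response)
print(texts["RISK_FACTORS"])
```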

You can also pass in custom section regex patterns using the section_regex parameter. For example, you can run the following command to request the risk factors section:

curl -X 'POST' \
  'http://localhost:8000/sec-filings/v0.2.1/section' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F '[email protected]' \
  -F 'section_regex=risk factors'  | jq -C . | less -R

The result will be:

{
  "REGEX_0": [
    {
      "text": "You should carefully consider the risks described in this section. Our future performance is subject to risks and uncertainties that could have a material adverse effect on our business, results of operations, and financial condition and the trading price of our common stock. We may be subject to other risks and uncertainties not presently known to us. In addition, please see our note about forward-looking statements included in the MD&A.",
      "type": "NarrativeText"
    },
    {
      "text": "Our revenue is subject to volatility in metal prices, which could negatively affect our results of operations or cash flow.",
      "type": "NarrativeText"
    },
    {
      "text": "Market prices for gold, silver, copper, nickel, and other metals may fluctuate widely over time and are affected by numerous factors beyond our control. These factors include metal supply and demand, industrial and jewelry fabrication, investment demand, central banking actions, inflation expectations, currency values, interest rates, forward sales by metal producers, and political, trade, economic, or banking conditions.",
      "type": "NarrativeText"
    },
    ...
  ]
}

As with the section parameter, you can request multiple regexes by passing in multiple values for the section_regex parameter. The requested pattern will be treated as a raw string.
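Before sending a pattern, it can help to try it locally against a few candidate section titles. The sketch below assumes the service applies the pattern with Python's re module and matches case-insensitively; neither is confirmed here, and both the helper matching_titles and the sample titles are hypothetical.

```python
import re
from typing import Iterable, List


def matching_titles(pattern: str, titles: Iterable[str]) -> List[str]:
    """Return the titles in which the pattern is found."""
    # Assumption: case-insensitive search; the service may use other flags.
    compiled = re.compile(pattern, re.IGNORECASE)
    return [title for title in titles if compiled.search(title)]


# Hypothetical section titles, for illustration only.
titles = ["Item 1A. Risk Factors", "Item 1. Business", "Item 3. Legal Proceedings"]

risk = matching_titles(r"risk factors", titles)
either = matching_titles(r"risk factors|legal proceedings", titles)
print(risk)
print(either)
```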

You can also use special regex characters in your pattern, as shown in the example below:

 curl -X 'POST' \
  'http://localhost:8000/sec-filings/v0.2.1/section' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F '[email protected]' \
  -F "section_regex=^(\S+\W?)+$"

You can always replace the header -H 'accept: application/json' with -H 'accept: text/csv' depending on the format you want to fetch from the API as follows:

 curl -X 'POST' \
  'http://localhost:8000/sec-filings/v0.2.1/section' \
  -H 'accept: text/csv' \
  -H 'Content-Type: multipart/form-data' \
  -F '[email protected]' \
  -F section=RISK_FACTORS | jq -C . | less -R

The result will be:

"section,element_type,text\r\nRISK_FACTORS,NarrativeText,\"You should carefully consider the risks described in this section. Our future performance is subject to risks and uncertainties that could have a material adverse effect on our business, results of operations, and financial condition and the trading price of our common stock. We may be subject to other risks and uncertainties not presently known to us. In addition, please see our note about forward-looking statements included in the MD&A.\"\r\nRISK_FACTORS,NarrativeText,\"Our revenue is subject to volatility in metal prices, which could negatively affect our results of operations or cash flow.\"\r\nRISK_FACTORS,NarrativeText,\"Market prices for gold, silver, copper, nickel, and other metals may fluctuate widely over time and are affected by numerous factors beyond our control. These factors include metal supply and demand, industrial and jewelry fabrication, investment demand, central banking actions, inflation expectations, currency values, interest rates, forward sales by metal producers, and political, trade, economic, or banking conditions.\"\r\n
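The sample output above looks like a JSON-encoded string (a leading quote and escaped \r\n sequences) that wraps the CSV text. Assuming that is how the body arrives, a minimal sketch for turning it into rows is shown below. The helper name parse_csv_response is hypothetical, and the sample body is a shortened stand-in.

```python
import csv
import io
import json
from typing import Dict, List


def parse_csv_response(body: str) -> List[Dict[str, str]]:
    """Decode a JSON-encoded CSV string and return one dict per row."""
    # Assumption: the body is a JSON string literal that contains the CSV.
    csv_text = json.loads(body)
    # newline="" leaves the \r\n row terminators for the csv module to handle.
    return list(csv.DictReader(io.StringIO(csv_text, newline="")))


# Shortened stand-in for a response body, encoded the same way.
sample_body = json.dumps(
    "section,element_type,text\r\n"
    'RISK_FACTORS,NarrativeText,"Our revenue is subject to volatility in '
    'metal prices, which could negatively affect our results."\r\n'
)

rows = parse_csv_response(sample_body)
print(rows[0]["section"], rows[0]["element_type"])
```

The csv module handles the quoted text field, so commas inside the narrative text do not split the row.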

In addition, you can add the form field -F 'output_schema=labelstudio' if you want output that is compatible with Label Studio, as follows:

 curl -X 'POST' \
  'http://localhost:8000/sec-filings/v0.2.1/section' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F '[email protected]' \
  -F 'output_schema=labelstudio' \
  -F section=RISK_FACTORS | jq -C . | less -R

The result will be:

{
  "RISK_FACTORS": [
    {
      "data": {
        "text": "You should carefully consider the risks described in this section. Our future performance is subject to risks and uncertainties that could have a material adverse effect on our business, results of operations, and financial condition and the trading price of our common stock. We may be subject to other risks and uncertainties not presently known to us. In addition, please see our note about forward-looking statements included in the MD&A.",
        "ref_id": "7a912bb639b547404be4ceaf5d9083a9"
      }
    },
    {
      "data": {
        "text": "Our revenue is subject to volatility in metal prices, which could negatively affect our results of operations or cash flow.",
        "ref_id": "d4cc8e0e0c2b68ef69282c5250b721c9"
      }
    },
    ...
  ]
}
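Label Studio's JSON import expects a flat list of tasks, each carrying a data object. Since the response above groups such objects by section, one option is to flatten it before import. This is a sketch under that assumption: to_label_studio_tasks is a hypothetical helper, and the sample reuses the ref_id values shown above with shortened text.

```python
import json
from typing import Dict, List


def to_label_studio_tasks(response: Dict[str, List[dict]]) -> List[dict]:
    """Flatten the per-section lists into a single list of tasks."""
    tasks: List[dict] = []
    for elements in response.values():
        tasks.extend(elements)
    return tasks


# Shortened stand-in for a decoded labelstudio-schema response.
sample_response = {
    "RISK_FACTORS": [
        {
            "data": {
                "text": "You should carefully consider the risks.",
                "ref_id": "7a912bb639b547404be4ceaf5d9083a9",
            }
        },
        {
            "data": {
                "text": "Our revenue is subject to volatility in metal prices.",
                "ref_id": "d4cc8e0e0c2b68ef69282c5250b721c9",
            }
        },
    ]
}

tasks = to_label_studio_tasks(sample_response)
# The serialized list can be saved to a .json file for import.
tasks_json = json.dumps(tasks, indent=2)
print(len(tasks))
```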

Helper functions for SEC EDGAR API

You can use some of the functions provided in prepline_sec_filings.fetch to directly view or manipulate the filings available from the SEC's EDGAR API. For example, get_filing(cik, accession_number, your_organization_name, your_email) returns the text of the filing with accession number accession_number for the organization with CIK number cik. The parameters your_organization_name and your_email should contain your own information; they are passed along to EDGAR's API to identify the caller and are required by EDGAR. Alternatively, these parameters may be omitted if the environment variables SEC_API_ORGANIZATION and SEC_API_EMAIL are defined.
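The fallback from explicit arguments to environment variables can be sketched as follows. This is an illustration only, not the code in prepline_sec_filings.fetch: edgar_user_agent is a hypothetical helper, and the "organization email" format follows the User-Agent convention the SEC asks callers to use, which may differ from what the library sends.

```python
import os
from typing import Optional


def edgar_user_agent(
    organization: Optional[str] = None, email: Optional[str] = None
) -> str:
    """Resolve caller identity from arguments, then environment variables."""
    organization = organization or os.environ.get("SEC_API_ORGANIZATION")
    email = email or os.environ.get("SEC_API_EMAIL")
    if not organization or not email:
        raise ValueError(
            "Provide an organization and email, or set "
            "SEC_API_ORGANIZATION and SEC_API_EMAIL."
        )
    # Assumption: identity is sent as "<organization> <email>".
    return f"{organization} {email}"


print(edgar_user_agent("Example Org", "me@example.com"))
```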

Helper functions are also provided for cases where the CIK and/or accession numbers are not known. For example, get_form_by_ticker('mmm', '10-K', your_organization_name, your_email) returns the text of the latest 10-K filing from 3M, and open_form_by_ticker('mmm', '10-K', your_organization_name, your_email) opens the SEC index page for the same filing in a web browser.

Generating Python files from the pipeline notebooks

The Python module section.py contains the FastAPI code needed to serve the API. It's created with make generate-api, which derives the API from the notebook pipeline-section.ipynb.

You can generate the FastAPI APIs from all pipeline-notebooks/ by running make generate-api.

Docker

It is not necessary to run Docker in a local development environment; however, a Dockerfile and the make targets docker-build, docker-start-api, and docker-start-jupyter are provided for convenience.

You can also launch a Jupyter instance to try out the notebooks with Binder.

Security Policy

See our security policy for information on how to report security vulnerabilities.

Learn more

Section            Description
Company Website    Unstructured.io product and company info
EDGAR API          Documentation for the SEC
10-K Filings       Detailed documentation on 10-K filings
10-Q Filings       Detailed documentation on 10-Q filings
S-1 Filings        Detailed documentation on S-1 filings

pipeline-sec-filings's People

Contributors

asymness, cragwolfe, dependabot[bot], dgharsallah, duongvy0112, gokullan, laverdes, mthwrobinson, natygyoon, qued, rishav1707, ryannikolaidis, stamzid, tabossert, yuming-long

pipeline-sec-filings's Issues

make install does not work

 make install
>>> 
python3 -m pip install pip==23.1.2
Requirement already satisfied: pip==23.1.2 in /Users/xxxx/.pyenv/versions/3.8.15/envs/sec-filings/lib/python3.8/site-packages (23.1.2)
WARNING: There was an error checking the latest version of pip.
pip install -r requirements/base.txt
Collecting anyio==3.7.0 (from -r requirements/base.txt (line 7))
  Using cached anyio-3.7.0-py3-none-any.whl (80 kB)
Collecting attrs==23.1.0 (from -r requirements/base.txt (line 11))
  Using cached attrs-23.1.0-py3-none-any.whl (61 kB)
Collecting beautifulsoup4==4.12.2 (from -r requirements/base.txt (line 13))
  Using cached beautifulsoup4-4.12.2-py3-none-any.whl (142 kB)
Collecting bleach==6.0.0 (from -r requirements/base.txt (line 15))
  Using cached bleach-6.0.0-py3-none-any.whl (162 kB)
Collecting certifi==2023.5.7 (from -r requirements/base.txt (line 17))
  Using cached certifi-2023.5.7-py3-none-any.whl (156 kB)
Collecting charset-normalizer==3.1.0 (from -r requirements/base.txt (line 19))
  Using cached charset_normalizer-3.1.0-cp38-cp38-macosx_11_0_arm64.whl (121 kB)
Collecting click==8.1.3 (from -r requirements/base.txt (line 21))
  Using cached click-8.1.3-py3-none-any.whl (96 kB)
Collecting defusedxml==0.7.1 (from -r requirements/base.txt (line 26))
  Using cached defusedxml-0.7.1-py2.py3-none-any.whl (25 kB)
Collecting exceptiongroup==1.1.1 (from -r requirements/base.txt (line 28))
  Using cached exceptiongroup-1.1.1-py3-none-any.whl (14 kB)
Collecting fastapi==0.95.2 (from -r requirements/base.txt (line 30))
  Using cached fastapi-0.95.2-py3-none-any.whl (56 kB)
Collecting fastjsonschema==2.17.1 (from -r requirements/base.txt (line 32))
  Using cached fastjsonschema-2.17.1-py3-none-any.whl (23 kB)
Collecting h11==0.14.0 (from -r requirements/base.txt (line 34))
  Using cached h11-0.14.0-py3-none-any.whl (58 kB)
Collecting httptools==0.5.0 (from -r requirements/base.txt (line 36))
  Using cached httptools-0.5.0-cp38-cp38-macosx_10_9_universal2.whl (230 kB)
Collecting idna==3.4 (from -r requirements/base.txt (line 38))
  Using cached idna-3.4-py3-none-any.whl (61 kB)
Collecting importlib-metadata==6.6.0 (from -r requirements/base.txt (line 42))
  Using cached importlib_metadata-6.6.0-py3-none-any.whl (22 kB)
Collecting importlib-resources==5.12.0 (from -r requirements/base.txt (line 46))
  Using cached importlib_resources-5.12.0-py3-none-any.whl (36 kB)
Collecting jinja2==3.1.2 (from -r requirements/base.txt (line 48))
  Using cached Jinja2-3.1.2-py3-none-any.whl (133 kB)
Collecting joblib==1.2.0 (from -r requirements/base.txt (line 52))
  Using cached joblib-1.2.0-py3-none-any.whl (297 kB)
Collecting jsonschema==4.17.3 (from -r requirements/base.txt (line 56))
  Using cached jsonschema-4.17.3-py3-none-any.whl (90 kB)
Collecting jupyter-client==8.2.0 (from -r requirements/base.txt (line 58))
  Using cached jupyter_client-8.2.0-py3-none-any.whl (103 kB)
Collecting jupyter-core==5.3.0 (from -r requirements/base.txt (line 60))
  Using cached jupyter_core-5.3.0-py3-none-any.whl (93 kB)
Collecting jupyterlab-pygments==0.2.2 (from -r requirements/base.txt (line 67))
  Using cached jupyterlab_pygments-0.2.2-py2.py3-none-any.whl (21 kB)
Collecting lxml==4.9.2 (from -r requirements/base.txt (line 69))
  Using cached lxml-4.9.2.tar.gz (3.7 MB)
  Preparing metadata (setup.py) ... done
Collecting markupsafe==2.1.2 (from -r requirements/base.txt (line 71))
  Using cached MarkupSafe-2.1.2-cp38-cp38-macosx_10_9_universal2.whl (17 kB)
Collecting mistune==2.0.5 (from -r requirements/base.txt (line 75))
  Using cached mistune-2.0.5-py2.py3-none-any.whl (24 kB)
Collecting mypy==1.3.0 (from -r requirements/base.txt (line 77))
  Using cached mypy-1.3.0-cp38-cp38-macosx_11_0_arm64.whl (9.7 MB)
Collecting mypy-extensions==1.0.0 (from -r requirements/base.txt (line 79))
  Using cached mypy_extensions-1.0.0-py3-none-any.whl (4.7 kB)
Collecting nbclient==0.8.0 (from -r requirements/base.txt (line 81))
  Using cached nbclient-0.8.0-py3-none-any.whl (73 kB)
Collecting nbconvert==7.4.0 (from -r requirements/base.txt (line 83))
  Using cached nbconvert-7.4.0-py3-none-any.whl (285 kB)
Collecting nbformat==5.9.0 (from -r requirements/base.txt (line 85))
  Using cached nbformat-5.9.0-py3-none-any.whl (77 kB)
Collecting nltk==3.8.1 (from -r requirements/base.txt (line 89))
  Using cached nltk-3.8.1-py3-none-any.whl (1.5 MB)
Collecting numpy==1.24.3 (from -r requirements/base.txt (line 91))
  Using cached numpy-1.24.3-cp38-cp38-macosx_11_0_arm64.whl (13.8 MB)
Collecting packaging==23.1 (from -r requirements/base.txt (line 96))
  Using cached packaging-23.1-py3-none-any.whl (48 kB)
Collecting pandocfilters==1.5.0 (from -r requirements/base.txt (line 100))
  Using cached pandocfilters-1.5.0-py2.py3-none-any.whl (8.7 kB)
Collecting pkgutil-resolve-name==1.3.10 (from -r requirements/base.txt (line 102))
  Using cached pkgutil_resolve_name-1.3.10-py3-none-any.whl (4.7 kB)
Collecting platformdirs==3.5.1 (from -r requirements/base.txt (line 104))
  Using cached platformdirs-3.5.1-py3-none-any.whl (15 kB)
Collecting pydantic==1.10.8 (from -r requirements/base.txt (line 106))
  Using cached pydantic-1.10.8-cp38-cp38-macosx_11_0_arm64.whl (2.5 MB)
Collecting pygments==2.15.1 (from -r requirements/base.txt (line 108))
  Using cached Pygments-2.15.1-py3-none-any.whl (1.1 MB)
Collecting pyrsistent==0.19.3 (from -r requirements/base.txt (line 110))
  Using cached pyrsistent-0.19.3-cp38-cp38-macosx_10_9_universal2.whl (82 kB)
Collecting python-dateutil==2.8.2 (from -r requirements/base.txt (line 112))
  Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting python-dotenv==1.0.0 (from -r requirements/base.txt (line 114))
  Using cached python_dotenv-1.0.0-py3-none-any.whl (19 kB)
Collecting python-multipart==0.0.6 (from -r requirements/base.txt (line 116))
  Using cached python_multipart-0.0.6-py3-none-any.whl (45 kB)
Collecting pyyaml==6.0 (from -r requirements/base.txt (line 118))
  Using cached PyYAML-6.0.tar.gz (124 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error
  
  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [48 lines of output]
      running egg_info
      writing lib/PyYAML.egg-info/PKG-INFO
      writing dependency_links to lib/PyYAML.egg-info/dependency_links.txt
      writing top-level names to lib/PyYAML.egg-info/top_level.txt
      Traceback (most recent call last):
        File "/Users/xxxx/.pyenv/versions/sec-filings/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/Users/xxxx/.pyenv/versions/sec-filings/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/Users/xxxx/.pyenv/versions/sec-filings/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
          self.run_setup()
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 311, in run_setup
          exec(code, locals())
        File "<string>", line 288, in <module>
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/__init__.py", line 104, in setup
          return distutils.core.setup(**attrs)
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 184, in setup
          return run_commands(dist)
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
          dist.run_commands()
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
          self.run_command(cmd)
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 967, in run_command
          super().run_command(command)
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/command/egg_info.py", line 321, in run
          self.find_sources()
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/command/egg_info.py", line 329, in find_sources
          mm.run()
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/command/egg_info.py", line 550, in run
          self.add_defaults()
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/command/egg_info.py", line 588, in add_defaults
          sdist.add_defaults(self)
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/command/sdist.py", line 102, in add_defaults
          super().add_defaults()
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/sdist.py", line 250, in add_defaults
          self._add_defaults_ext()
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/sdist.py", line 335, in _add_defaults_ext
          self.filelist.extend(build_ext.get_source_files())
        File "<string>", line 204, in get_source_files
        File "/private/var/folders/4r/qt2_c_d93gg36q06mpq5pm_00000gn/T/pip-build-env-yh9zc8j4/overlay/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 107, in __getattr__
          raise AttributeError(attr)
      AttributeError: cython_sources
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
WARNING: There was an error checking the latest version of pip.
make: *** [install-base-pip-packages] Error 1

Improve Narrative Section Extraction

Right now, a SECSection regex is used to identify a TOC section in get_section_narrative. That generally works pretty well. The matching TOC title text is then used to look for the section in the content, but rather than sticking with the original regex, a more lenient match condition is ultimately used for 10-Ks and 10-Qs with match_10k_toc_title_to_section. The better approach is likely to stick with the original matching regex.

The lenient post-TOC match is why the EHC test fails for the BUSINESS section, and may be the reason for other failures as well.

Definition of Done

  • Updated section extraction logic such that fewer tests are marked as xfailed, in particular the EHC case mentioned above.

Rename the sample-sec-docs/ folder

The convention for preprocessing pipeline repos is to place sample documents in the sample-docs/ folder.

pipeline-sec-filings should follow this convention, i.e. sample-sec-docs/ should be moved to sample-docs/.

Definition of Done

  • The directory is moved, and the Makefile and tests are updated, e.g. starting with make dl-test-artifacts.
  • Grep'ing through the repo shows no references to sample-sec-docs

Quick tour Colab notebook not working

Hi, I'm getting the following error in the cell where get_form_by_ticker is called.

HTTPError: 403 Client Error: Forbidden for url: http://www.sec.gov/cgi-bin/browse-edgar?CIK=rgld&Find=Search&owner=exclude&action=getcompany

Could you please help me solve this issue?

Thanks.

Fix section collection

Problem:
Not all sections were collected into elements.

How to replicate this issue:
Run the following commands which test this document.

  • PYTHONPATH=. pytest "test_real_docs/test_real_examples.py::test_first_last[bj-SECSection.CERTAIN_TRADEMARKS-first]"
  • PYTHONPATH=. pytest "test_real_docs/test_real_examples.py::test_first_last[bj-SECSection.CERTAIN_TRADEMARKS-last]"

Note that the NarrativeText part of "CERTAIN_TRADEMARKS" section was not collected into elements in the first place.

This prospectus includes trademarks and service marks owned by us, including BJ’s Wholesale Club®, BJ’s®, Wellsley Farms®, Berkley Jensen®, My BJ’s Perks®, BJ’s Easy Renewal®, BJ’s Gas®, BJ’s Perks Elite®, BJ’s Perks Plus®, Inner Circle® and BJ’s Perks Rewards®. This prospectus also contains trademarks, trade names and service marks of other companies, which are the property of their respective owners. Solely for convenience, trademarks, trade names and service marks referred to in this prospectus may appear without the ®, ™ or SM symbols, but such references are not intended to indicate, in any way, that we will not assert, to the fullest extent under applicable law, our rights or the right of the applicable licensor to these trademarks, trade names and service marks. We do not intend our use or display of other parties’ trademarks, trade names or service marks to imply, and such use or display should not be construed to imply, a relationship with, or endorsement or sponsorship of us by, these other parties.
