
Comments (5)

scottzach1 commented on September 13, 2024

Changes made to this endpoint may have also broken the Python Azure-SDK package downstream:

from azure-rest-api-specs.

v-jiaodi commented on September 13, 2024

@xuhumsft Please help take a look, thanks.


scottzach1 commented on September 13, 2024

It appears that the behavior of this endpoint has changed significantly for us, but pagination is still just as broken.

Today I found the time to test the /threatIntelligence/main/indicators endpoint further. From what I can tell,
pagination is clearly broken.

Unless I am using the endpoint incorrectly, every page contains every single indicator. The generated nextLink
increments the $skip value, but the endpoint ignores it completely. Paging also never ends: the endpoint keeps
incrementing $skip in each nextLink, even past the defined $top.

Although this is manageable for small environments, it becomes a very serious concern when the number of indicators
exceeds ~6000 (depending on indicator pattern length), because we can no longer page through all the indicators in
the platform.

This is a significant problem for us as we are relying on this endpoint to protect our customers. Unfortunately, it
appears that the Threat Intelligence Data Connector + Graph API GET /security/tiIndicators is the
only reliable way to ingest indicators into customer environments. Yet somehow this approach is both beta (Graph) and
deprecated (Connector).

I'll show the behavior I'm experiencing below:

Script

For this example I have modified reproduce.py to help log duplicates. Let's name it reproduce_pages.py.

# reproduce_pages.py
import os
import sys
import urllib.parse
from collections import Counter

import requests
from azure.identity import ClientSecretCredential
from requests import Session


def pagination_example() -> None:
    secret = ClientSecretCredential(
        tenant_id=os.getenv("SENTINEL_TENANT_ID"),
        client_id=os.getenv("SENTINEL_CLIENT_ID"),
        client_secret=os.getenv("SENTINEL_CLIENT_SECRET"),
    )

    s = Session()
    s.headers = {
        "Accept": "application/json",
        "Authorization": "bearer " + secret.get_token("https://management.azure.com/.default").token,
    }

    subscription_id = os.getenv("SENTINEL_SUBSCRIPTION_ID")
    resource_group_name = os.getenv("SENTINEL_RESOURCE_GROUP_NAME")
    workspace_name = os.getenv("SENTINEL_WORKSPACE_NAME")

    params = {"api-version": "2024-03-01", "$top": 8_000}
    next_link = (
        f"https://management.azure.com/subscriptions/{subscription_id}/resourceGroups/{resource_group_name}"
        f"/providers/Microsoft.OperationalInsights/workspaces/{workspace_name}/providers/Microsoft.SecurityInsights"
        f"/threatIntelligence/main/indicators?{urllib.parse.urlencode(params)}"
    )

    counter = Counter()
    page = 1

    while next_link:
        r = s.get(next_link)
        print(f"# PAGE {page} - [{r.status_code}] GET {next_link}")
        j = r.json()
        try:
            r.raise_for_status()
        except requests.HTTPError:
            print(j, file=sys.stderr)
            raise

        indicators = j["value"]
        for i, indicator in enumerate(indicators, start=1):
            indicator_id = indicator["name"]
            counter[indicator_id] += 1

            # Only log the first and last 2 indicators (for brevity).
            # Compare positions rather than dict contents to avoid a list scan per item.
            if i <= 2 or i > len(indicators) - 2:
                print(f"[{i}] {indicator_id} (count={counter[indicator_id]})")
            elif i == 3:
                print("...")

        page += 1
        next_link = j.get("nextLink", None)


if __name__ == "__main__":
    pagination_example()

Sentinel Instance (Small)

When I run it against a Sentinel instance with 3262 indicators, we can see that every page contains every single
indicator. Each nextLink changes the $skip parameter, but as the responses show, the endpoint does not respect it.

foo@bar:~$ python reproduce_pages.py
# PAGE 1 - [200] GET https://management.azure.com/subscriptions/{SENTINEL_SUBSCRIPTION_ID}/resourceGroups/{SENTINEL_RESOURCE_GROUP_NAME}/providers/Microsoft.OperationalInsights/workspaces/{SENTINEL_WORKSPACE_NAME}/providers/Microsoft.SecurityInsights/threatIntelligence/main/indicators?api-version=2024-03-01&%24top=8000
[1] f313e338-7556-1044-b097-c24e2d2a9f9d (count=1)
[2] 228c1439-2ea6-0988-784d-108bed178331 (count=1)
...
[3261] 1780bb5e-6eb3-2fda-084a-7edd31f7e7cb (count=1)
[3262] 5d9bc8e3-060b-42c0-ecc6-2b04d02ed399 (count=1)
# PAGE 2 - [200] GET https://management.azure.com:443/subscriptions/{SENTINEL_SUBSCRIPTION_ID}/resourceGroups/{SENTINEL_RESOURCE_GROUP_NAME}/providers/Microsoft.OperationalInsights/workspaces/{SENTINEL_WORKSPACE_NAME}/providers/Microsoft.SecurityInsights/threatIntelligence/main/indicators?api-version=2024-03-01&$top=8000&$skip=1000
[1] f313e338-7556-1044-b097-c24e2d2a9f9d (count=2)
[2] 228c1439-2ea6-0988-784d-108bed178331 (count=2)
...
[3261] 1780bb5e-6eb3-2fda-084a-7edd31f7e7cb (count=2)
[3262] 5d9bc8e3-060b-42c0-ecc6-2b04d02ed399 (count=2)
# PAGE 3 - [200] GET https://management.azure.com:443/subscriptions/{SENTINEL_SUBSCRIPTION_ID}/resourceGroups/{SENTINEL_RESOURCE_GROUP_NAME}/providers/Microsoft.OperationalInsights/workspaces/{SENTINEL_WORKSPACE_NAME}/providers/Microsoft.SecurityInsights/threatIntelligence/main/indicators?api-version=2024-03-01&$top=8000&$skip=2000
[1] f313e338-7556-1044-b097-c24e2d2a9f9d (count=3)
[2] 228c1439-2ea6-0988-784d-108bed178331 (count=3)
...
[3261] 1780bb5e-6eb3-2fda-084a-7edd31f7e7cb (count=3)
[3262] 5d9bc8e3-060b-42c0-ecc6-2b04d02ed399 (count=3)
# PAGE 4 - [200] GET https://management.azure.com:443/subscriptions/{SENTINEL_SUBSCRIPTION_ID}/resourceGroups/{SENTINEL_RESOURCE_GROUP_NAME}/providers/Microsoft.OperationalInsights/workspaces/{SENTINEL_WORKSPACE_NAME}/providers/Microsoft.SecurityInsights/threatIntelligence/main/indicators?api-version=2024-03-01&$top=8000&$skip=3000
[1] f313e338-7556-1044-b097-c24e2d2a9f9d (count=4)
[2] 228c1439-2ea6-0988-784d-108bed178331 (count=4)
...
[3261] 1780bb5e-6eb3-2fda-084a-7edd31f7e7cb (count=4)
[3262] 5d9bc8e3-060b-42c0-ecc6-2b04d02ed399 (count=4)
# PAGE 5 - [200] GET https://management.azure.com:443/subscriptions/{SENTINEL_SUBSCRIPTION_ID}/resourceGroups/{SENTINEL_RESOURCE_GROUP_NAME}/providers/Microsoft.OperationalInsights/workspaces/{SENTINEL_WORKSPACE_NAME}/providers/Microsoft.SecurityInsights/threatIntelligence/main/indicators?api-version=2024-03-01&$top=8000&$skip=4000
[1] f313e338-7556-1044-b097-c24e2d2a9f9d (count=5)
[2] 228c1439-2ea6-0988-784d-108bed178331 (count=5)
...
[3261] 1780bb5e-6eb3-2fda-084a-7edd31f7e7cb (count=5)
[3262] 5d9bc8e3-060b-42c0-ecc6-2b04d02ed399 (count=5)
# PAGE 6 - [200] GET https://management.azure.com:443/subscriptions/{SENTINEL_SUBSCRIPTION_ID}/resourceGroups/{SENTINEL_RESOURCE_GROUP_NAME}/providers/Microsoft.OperationalInsights/workspaces/{SENTINEL_WORKSPACE_NAME}/providers/Microsoft.SecurityInsights/threatIntelligence/main/indicators?api-version=2024-03-01&$top=8000&$skip=5000
[1] f313e338-7556-1044-b097-c24e2d2a9f9d (count=6)
[2] 228c1439-2ea6-0988-784d-108bed178331 (count=6)
...
[3261] 1780bb5e-6eb3-2fda-084a-7edd31f7e7cb (count=6)
[3262] 5d9bc8e3-060b-42c0-ecc6-2b04d02ed399 (count=6)
# PAGE 7 - [200] GET https://management.azure.com:443/subscriptions/{SENTINEL_SUBSCRIPTION_ID}/resourceGroups/{SENTINEL_RESOURCE_GROUP_NAME}/providers/Microsoft.OperationalInsights/workspaces/{SENTINEL_WORKSPACE_NAME}/providers/Microsoft.SecurityInsights/threatIntelligence/main/indicators?api-version=2024-03-01&$top=8000&$skip=6000
[1] f313e338-7556-1044-b097-c24e2d2a9f9d (count=7)
[2] 228c1439-2ea6-0988-784d-108bed178331 (count=7)
...
[3261] 1780bb5e-6eb3-2fda-084a-7edd31f7e7cb (count=7)
[3262] 5d9bc8e3-060b-42c0-ecc6-2b04d02ed399 (count=7)
# PAGE 8 - [200] GET https://management.azure.com:443/subscriptions/{SENTINEL_SUBSCRIPTION_ID}/resourceGroups/{SENTINEL_RESOURCE_GROUP_NAME}/providers/Microsoft.OperationalInsights/workspaces/{SENTINEL_WORKSPACE_NAME}/providers/Microsoft.SecurityInsights/threatIntelligence/main/indicators?api-version=2024-03-01&$top=8000&$skip=7000
[1] f313e338-7556-1044-b097-c24e2d2a9f9d (count=8)
[2] 228c1439-2ea6-0988-784d-108bed178331 (count=8)
...
[3261] 1780bb5e-6eb3-2fda-084a-7edd31f7e7cb (count=8)
[3262] 5d9bc8e3-060b-42c0-ecc6-2b04d02ed399 (count=8)
# PAGE 9 - [200] GET https://management.azure.com:443/subscriptions/{SENTINEL_SUBSCRIPTION_ID}/resourceGroups/{SENTINEL_RESOURCE_GROUP_NAME}/providers/Microsoft.OperationalInsights/workspaces/{SENTINEL_WORKSPACE_NAME}/providers/Microsoft.SecurityInsights/threatIntelligence/main/indicators?api-version=2024-03-01&$top=8000&$skip=8000
[1] f313e338-7556-1044-b097-c24e2d2a9f9d (count=9)
[2] 228c1439-2ea6-0988-784d-108bed178331 (count=9)
...
[3261] 1780bb5e-6eb3-2fda-084a-7edd31f7e7cb (count=9)
[3262] 5d9bc8e3-060b-42c0-ecc6-2b04d02ed399 (count=9)
# PAGE 10 - [200] GET https://management.azure.com:443/subscriptions/{SENTINEL_SUBSCRIPTION_ID}/resourceGroups/{SENTINEL_RESOURCE_GROUP_NAME}/providers/Microsoft.OperationalInsights/workspaces/{SENTINEL_WORKSPACE_NAME}/providers/Microsoft.SecurityInsights/threatIntelligence/main/indicators?api-version=2024-03-01&$top=8000&$skip=9000
[1] f313e338-7556-1044-b097-c24e2d2a9f9d (count=10)
[2] 228c1439-2ea6-0988-784d-108bed178331 (count=10)
...
[3261] 1780bb5e-6eb3-2fda-084a-7edd31f7e7cb (count=10)
[3262] 5d9bc8e3-060b-42c0-ecc6-2b04d02ed399 (count=10)
Traceback (most recent call last):
<SIGKILL> ... manually killed script because this script will run forever!

(Note that on page 10, $skip is greater than $top.)
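To make that comparison easier to spot in logs, the paging parameters can be pulled out of each nextLink. This is only a minimal sketch using the standard library; the helper name `paging_params` is hypothetical and not part of the original script.

```python
from urllib.parse import parse_qs, urlsplit


def paging_params(next_link: str) -> dict[str, int]:
    """Return the $top and $skip values found in a nextLink URL (hypothetical helper)."""
    query = parse_qs(urlsplit(next_link).query)
    # parse_qs percent-decodes keys, so "%24top" and "$top" both end up as "$top".
    return {key: int(values[0]) for key, values in query.items() if key in ("$top", "$skip")}


if __name__ == "__main__":
    link = "https://management.azure.com:443/example?api-version=2024-03-01&$top=8000&$skip=9000"
    params = paging_params(link)
    if params.get("$skip", 0) > params.get("$top", 0):
        print(f"$skip ({params['$skip']}) exceeds $top ({params['$top']})")
```

Calling it inside the `while` loop of reproduce_pages.py would flag the first page where $skip passes $top.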

Sentinel Instance (Medium)

The problem becomes more serious when I run the same script in a Sentinel environment with 6919 indicators. Because
the first response tries to include every indicator, we immediately hit a hard limit: the request fails with a 400
error ("Response too large"), apparently due to a size limit on the endpoint.

foo@bar:~$ python reproduce_pages.py
# PAGE 1 - [400] GET https://management.azure.com/subscriptions/{SENTINEL_SUBSCRIPTION_ID}/resourceGroups/{SENTINEL_RESOURCE_GROUP_NAME}/providers/Microsoft.OperationalInsights/workspaces/{SENTINEL_WORKSPACE_NAME}/providers/Microsoft.SecurityInsights/threatIntelligence/main/indicators?api-version=2024-03-01&%24top=8000
{'error': {'code': 'BadRequest', 'message': 'Response too large. Please try a lower page size.'}}
Traceback (most recent call last):

This leaves us unable to paginate at all, because $skip is completely ignored by the endpoint, as shown in the
Sentinel Instance (Small) section above.
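Until the endpoint is fixed, a client-side guard can at least stop the loop from running forever. The sketch below is an assumption-laden workaround, not a fix: `iter_unique_indicators`, `fetch_page`, and `max_pages` are hypothetical names, and `fetch_page` stands in for something like `lambda url: s.get(url).json()`. It stops as soon as a page contributes no new indicator names. It does not help with the medium instance, where the first request already fails with a 400.

```python
from typing import Callable, Iterator, Optional


def iter_unique_indicators(
    fetch_page: Callable[[str], dict],
    first_link: str,
    max_pages: int = 1000,
) -> Iterator[dict]:
    """Yield each indicator once; stop when a page contains only repeats."""
    seen: set[str] = set()
    next_link: Optional[str] = first_link
    pages = 0

    while next_link and pages < max_pages:
        body = fetch_page(next_link)  # hypothetical callable returning the parsed JSON body
        pages += 1

        new_on_page = 0
        for indicator in body.get("value", []):
            name = indicator["name"]
            if name not in seen:
                seen.add(name)
                new_on_page += 1
                yield indicator

        # A page with no unseen names means the server is repeating itself.
        if new_on_page == 0:
            break
        next_link = body.get("nextLink")
```

With the behavior shown for the small instance, this would make two requests and return the 3262 indicators once.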

Any advice would be greatly appreciated.


v-jiaodi commented on September 13, 2024

@ityankel Can you help take a look?


scottzach1 commented on September 13, 2024

Hi team, has there been any follow-up on this issue? Is there a better place where I should escalate this instead?
