
ansible-risk-insight's Introduction

Ansible Risk Insight

Ansible Risk Insight (ARI) is a tool for evaluating the quality and risk of Ansible content. It works as a CLI tool, but can also be integrated as a Python module. ARI takes Ansible content files such as playbooks, projects, collections and roles as input, then parses them to build a call tree using static analysis.

[Figure: ARI overview]

ARI can apply rules to evaluate quality and risk for the tasks, roles, playbooks and taskfiles on the call tree.

[Figure: ARI applying rules]

ARI has a default set of rules, and users can easily define new rules (see the guide). The match() function defines which nodes in the call tree the rule applies to. In this example, the rule applies only to tasks. The process() function defines what should be computed, determined, and possibly changed.
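The match()/process() split can be sketched in a few lines of Python. This is only an illustration of the pattern: Node, RuleResult, NamedTaskRule and apply_rule are hypothetical stand-ins invented for this sketch, not ARI's actual classes or signatures (see the guide for those).

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins for illustration; these are NOT ARI's real classes.
@dataclass
class Node:
    type: str  # e.g. "task", "role", "playbook", "taskfile"
    name: str
    spec: dict = field(default_factory=dict)


@dataclass
class RuleResult:
    verdict: bool
    detail: dict


class NamedTaskRule:
    """Validating rule (assumed example): every task should have a name."""

    def match(self, node: Node) -> bool:
        # Select only task nodes from the call tree.
        return node.type == "task"

    def process(self, node: Node) -> RuleResult:
        # Compute the verdict and attach supporting detail.
        has_name = bool(node.name.strip())
        return RuleResult(verdict=has_name, detail={"task_name": node.name})


def apply_rule(rule, nodes):
    # process() is called only for nodes accepted by match().
    return [rule.process(n) for n in nodes if rule.match(n)]


nodes = [
    Node("playbook", "site.yml"),
    Node("task", "Install nginx"),
    Node("task", ""),
]
results = apply_rule(NamedTaskRule(), nodes)
```

Here the playbook node is skipped by match(), and the two task nodes each produce one result.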

ARI simulates the execution by visiting all nodes of the call tree in order, and it passes the context at the time each task runs to the process() function. The context of a task includes information about the current task spec, the call tree, and all variables defined before the task (see the example below). ARI takes all variable assignments and their precedence order into account.
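As a rough sketch of what "all variables defined before the task" could look like, the snippet below walks a list of tasks in order and builds a per-task context. Task, Context and walk are hypothetical names for this sketch, and the precedence model is deliberately reduced to three levels (play vars, then variables set by earlier tasks, then extra vars); Ansible's real precedence list is much longer.

```python
from dataclasses import dataclass, field


# Hypothetical types for illustration; not ARI's real context model.
@dataclass
class Task:
    name: str
    set_vars: dict = field(default_factory=dict)  # variables this task defines


@dataclass
class Context:
    current: Task
    variables: dict  # variables visible when the current task runs


def walk(tasks, play_vars=None, extra_vars=None):
    """Yield one Context per task, in execution order."""
    defined = dict(play_vars or {})
    for task in tasks:
        # Extra vars are merged last so they take precedence over everything else.
        visible = {**defined, **(extra_vars or {})}
        yield Context(current=task, variables=visible)
        # Variables set by this task become visible to later tasks only.
        defined.update(task.set_vars)


tasks = [Task("set port", {"port": 8080}), Task("start service")]
contexts = list(
    walk(tasks, play_vars={"port": 80, "user": "web"}, extra_vars={"user": "admin"})
)
```

The first task sees port 80 from the play vars, the second sees port 8080 set by the first task, and both see the user from the extra vars.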

There are different types of rules.

  • Validating rule: a rule to compute a verdict that validates the task content
  • Information rule: a rule to derive something about the task content. The result is reported as the rule result.
  • Mutating rule: a rule to apply some changes to the task content

All rules are applied to every step of execution in sequential order. Each rule can attach information, called an “annotation”, to the tree nodes, and other rules can refer to it.
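The interplay between rule order and annotations might look like the following sketch, in which an information rule writes an annotation and a validating rule that runs later reads it. The Node class, the annotation key "download.src" and both rule functions are assumptions made up for this example; they do not reflect ARI's real annotation schema.

```python
class Node:
    """Hypothetical tree node with a place for annotations."""

    def __init__(self, module, args):
        self.module = module
        self.args = args
        self.annotations = {}  # information shared between rules


def annotate_download(node):
    """Information rule: record where a task downloads content from."""
    if node.module in ("ansible.builtin.get_url", "ansible.builtin.uri"):
        node.annotations["download.src"] = node.args.get("url", "")


def validate_https(node):
    """Validating rule: relies on the annotation written by the rule above."""
    src = node.annotations.get("download.src")
    if src is None:
        return None  # rule does not apply to this node
    return src.startswith("https://")


def run_rules(nodes, rules):
    # Rules run in a fixed order on every node, so later rules
    # can rely on annotations attached by earlier ones.
    results = []
    for node in nodes:
        results.append({rule.__name__: rule(node) for rule in rules})
    return results


nodes = [
    Node("ansible.builtin.get_url", {"url": "http://example.com/a.tgz"}),
    Node("ansible.builtin.debug", {"msg": "hello"}),
]
results = run_rules(nodes, [annotate_download, validate_https])
```

Swapping the order of the two rules would make validate_https see no annotation, which is why the sequential order matters.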

To create useful annotations, ARI can crawl external sources such as Ansible Galaxy, Automation Hub, GitHub repositories, local directories, etc. to enrich the knowledge base available to rules. ARI pre-computes scan results for the crawled content and stores them in a data store (called "RAM"), which keeps

  • Collections, roles, tasks, modules
  • Metadata (digest, signature, timestamp, versions, repo url, license, etc.)
  • Module spec (acquired from module document via ansible-doc command)
  • Rule results from ARI

[Figure: ARI RAM list]

[Figure: ARI architecture]

Prerequisites

Currently this documentation assumes the following prerequisites.

  • pip command
  • ansible-galaxy command
  • ansible-doc command

Install

You can install ARI from the GitHub source code using the pip command.

$ pip install git+https://github.com/ansible/ansible-risk-insight.git

How to try

Role

ansible-risk-insight role <role_name>

Collection (now fixing an issue)

ansible-risk-insight collection <collection_name>

All intermediate files are installed under a temporary directory. The src directory, which includes dependency collections and roles, is moved under the ARI common directory to avoid repeated installs from the Galaxy repository. The location of the ARI common directory can be specified with the environment variable ARI_DATA_DIR (default = /tmp/ari-data).
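If you script around ARI, you may want to resolve the same directory yourself. The helper below is a minimal sketch that mirrors the documented behaviour (environment variable first, then the default); ari_data_dir is a hypothetical name and not a function exported by ARI.

```python
import os
from pathlib import Path


def ari_data_dir() -> Path:
    # Fall back to the documented default when ARI_DATA_DIR is unset.
    return Path(os.environ.get("ARI_DATA_DIR", "/tmp/ari-data"))
```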

Prepare backend data

ARI can crawl external sources such as Ansible Galaxy to enrich the knowledge base available to rules. ARI pre-computes scan results for the crawled content and stores them in a data store (called "RAM"), which keeps

  • Collections, roles, tasks, modules
  • Metadata (digest, signature, timestamp, versions, repo url, license, etc.)
  • Module spec (acquired from module document via ansible-doc command)
  • Findings from ARI

For example, you can set up the RAM with the following commands. The files are created under ARI_DATA_DIR.

# create a text file for the input
$ cat << EOS > ram_input_list.txt
collection amazon.aws
collection azure.azcollection
collection google.cloud
collection arista.eos
collection junipernetworks.junos
collection containers.podman
collection ansible.builtin
collection community.general
collection ansible.posix
collection arista.avd
EOS

# prepare the backend data based on the input
$ ari ram generate -f ram_input_list.txt

(this takes a while...)
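The input file above has one "<type> <name>" pair per line. If you generate or check such lists programmatically, a small parser along these lines may help. It is a sketch only: parse_ram_input is a hypothetical helper, and skipping blank lines and "#" comments is an assumption of this sketch, not documented ARI behaviour.

```python
def parse_ram_input(text):
    """Parse '<type> <name>' lines into (type, name) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comments are ignored.
        if not line or line.startswith("#"):
            continue
        kind, name = line.split(maxsplit=1)
        entries.append((kind, name))
    return entries


sample = """\
collection amazon.aws
collection ansible.posix
"""
entries = parse_ram_input(sample)
```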

Installation (for development)

git clone git@github.com:ansible/ansible-risk-insight.git
cd ansible-risk-insight
pip install -e .

ansible-risk-insight's People

Contributors

akasurde, dependabot[bot], gebhardtr, goneri, hirokuni-kitahara, mbwhite, rurikudo, shanemcd, yuji-watanabe-jp


ansible-risk-insight's Issues

Race condition when updating the ARI modules index

Observed
When updating an existing ARI knowledge base, some of the new collections had findings added but were absent from the modules index.

The effect is that, on lookup, the module is reported as not existing when in fact it does exist.

Possible Cause

Updates to module_index.json appear to be done at the scanner level, so if several scanners run in parallel, is there a potential race condition? Running multiple scanners is the standard way of working; with over 300 modules in the current set, running in parallel is important.
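One common way to make such a shared index safe under parallel writers is to serialize the read-modify-write step with a file lock and replace the file atomically. The sketch below shows that technique in isolation; it is not ARI's code, update_index and locked are hypothetical names, the index layout (short module name mapped to a list of entries) is assumed from the index excerpt quoted in another issue on this page, and fcntl is available on Unix-like systems only.

```python
import fcntl
import json
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def locked(lock_path):
    # Exclusive advisory lock; blocks until other holders release it.
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def update_index(index_path, new_entries):
    """Merge new_entries into the JSON index as one locked read-modify-write."""
    with locked(index_path + ".lock"):
        index = {}
        if os.path.exists(index_path):
            with open(index_path) as f:
                index = json.load(f)
        for short_name, entries in new_entries.items():
            index.setdefault(short_name, []).extend(entries)
        # Write to a temp file in the same directory, then rename,
        # so readers never observe a partially written index.
        fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(index_path) or ".")
        with os.fdopen(fd, "w") as f:
            json.dump(index, f, indent=4)
        os.replace(tmp_path, index_path)
```

Without the lock, two scanners can both read the old index and the later write silently drops the earlier one's entries, which would match the symptom described above.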

Unable to process ansible.builtin

I've cloned the latest code (as of 16:42 GMT, 7 March) and found that an error is raised for the ansible.builtin collection.

From collections/findings/ansible.builtin/unknown/unknown/error.log

Traceback (most recent call last):
  File "/home/matthew/github.com/wisdom/ansible-risk-insight/ansible_risk_insight/ram_generator.py", line 78, in scan
    self._scanner.evaluate(
  File "/home/matthew/github.com/wisdom/ansible-risk-insight/ansible_risk_insight/scanner.py", line 777, in evaluate
    scandata._prepare_dependencies()
  File "/home/matthew/github.com/wisdom/ansible-risk-insight/ansible_risk_insight/scanner.py", line 364, in _prepare_dependencies
    dep_dirs = ddp.prepare_dir(
  File "/home/matthew/github.com/wisdom/ansible-risk-insight/ansible_risk_insight/dependency_dir_preparator.py", line 115, in prepare_dir
    self.prepare_root_dir(root_install, is_src_installed)
  File "/home/matthew/github.com/wisdom/ansible-risk-insight/ansible_risk_insight/dependency_dir_preparator.py", line 150, in prepare_root_dir
    self.src_install()
  File "/home/matthew/github.com/wisdom/ansible-risk-insight/ansible_risk_insight/dependency_dir_preparator.py", line 348, in src_install
    self.root_install(self.tmp_install_dir)
  File "/home/matthew/github.com/wisdom/ansible-risk-insight/ansible_risk_insight/dependency_dir_preparator.py", line 410, in root_install
    self.move_src(tmp_src_dir, dst_src_dir)
  File "/home/matthew/github.com/wisdom/ansible-risk-insight/ansible_risk_insight/dependency_dir_preparator.py", line 687, in move_src
    raise ValueError("src {} is not directory".format(src))
ValueError: src /tmp/tmpryrruleg/src is not directory

The command was ari ram generate -f ram_input_list.txt

The input list had these entries:
collection amazon.aws
collection azure.azcollection
collection google.cloud
collection arista.eos
collection junipernetworks.junos
collection containers.podman
collection ansible.builtin
collection community.general
collection ansible.posix
collection arista.avd

Modules with rename / redirected history reported as not existing

Some modules that have a history of being renamed cause issues when searching for them. There are a number of concerns, but here is one example:

If you look for community.kubernetes.k8s, ARI reports that the module does not exist. However, the index file shows that it does exist, but under a changed fqcn. This makes it hard to search for.

   "k8s": [
        {
            "fqcn": "kubernetes.core.k8s",
            "type": "collection",
            "name": "kubernetes.core",
            "version": "2.4.0",
            "hash": "3713580886f8f9b509f5ce09cee360675ce45b170e53631d9307b76d1dcdbf2c"
        },
        {
            "fqcn": "kubernetes.core.k8s",
            "type": "collection",
            "name": "community.kubernetes",
            "version": "2.0.1",
            "hash": "ca9f014c93d6198a10afa93987e7de4a599e71897b7488fb7c5609e489211494"
        },
        {
            "fqcn": "community.okd.k8s",
            "type": "collection",
            "name": "community.okd",
            "version": "2.3.0",
            "hash": "9b5eb97135d64f063d2a021686336abbd5d798c9e57e4f8c1198cc66e79e4f99"
        },
        {
            "fqcn": "shanemcd.kubernetes.k8s",
            "type": "collection",
            "name": "shanemcd.kubernetes",
            "version": "0.11.0",
            "hash": "d9068b33c4d78c6dae57ba10607820ce0b3765a9d6c89b6fa87c9bd5c73772a7"
        }
    ],

The first 20 or so, which appear to be representative of the whole:

  "community.aws.ec2_asg_scheduled_action",
  "community.crypto.openssl_certificate",
  "community.elastic.elastic_bulk",
  "community.elastic.elastic_role",
  "community.elastic.elastic_snapshot",
  "community.elastic.elastic_user",
  "community.general.hana_query",
  "community.general.sap_task_list_execute",
  "community.general.sapcar_extract",
  "community.kubernetes.helm",
  "community.kubernetes.helm_repository",
  "community.kubernetes.k8s",
  "community.kubernetes.k8s_exec",
  "community.kubernetes.k8s_info",
  "consoledot.edgemanagement.custom_repositories",
  "consoledot.edgemanagement.groups",
  "dellemc.openmanage.idrac_firmware",
  "dellemc.openmanage.idrac_server_config_profile",
  "dellemc.openmanage.ome_device_info",
  "ibm.cloud.ibm_iam_access_group",
  "ibm.cloud.ibm_resource_group",
  "infoblox.nios_modules.nios_a_record",
  "infoblox.nios_modules.nios_cname_record",
  "infoblox.nios_modules.nios_dns_view",
  "infoblox.nios_modules.nios_host_record",
  "infoblox.nios_modules.nios_member",
  "infoblox.nios_modules.nios_mx_record",
  "infoblox.nios_modules.nios_naptr_record",
  "infoblox.nios_modules.nios_network_view",
  "infoblox.nios_modules.nios_nsgroup",
  "infoblox.nios_modules.nios_ptr_record",
  "infoblox.nios_modules.nios_srv_record",
  "infoblox.nios_modules.nios_txt_record",
  "infoblox.nios_modules.nios_zone",
  "servicenow.itsm.attachment"
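A lookup that tolerates this kind of redirect could consult the collection "name" field as well as "fqcn". The sketch below shows the idea using a shortened copy of the k8s index entry quoted above (hashes omitted); find_module is a hypothetical helper, not an ARI function, and it assumes the index is keyed by short module name.

```python
def find_module(index, fqcn):
    """Return entries that match fqcn directly or via the requested collection."""
    collection, _, short_name = fqcn.rpartition(".")
    matches = []
    for entry in index.get(short_name, []):
        # Direct hit on the stored fqcn, or the entry was recorded for the
        # requested collection even though its fqcn now points elsewhere.
        if entry["fqcn"] == fqcn or entry["name"] == collection:
            matches.append(entry)
    return matches


index = {
    "k8s": [
        {"fqcn": "kubernetes.core.k8s", "type": "collection",
         "name": "kubernetes.core", "version": "2.4.0"},
        {"fqcn": "kubernetes.core.k8s", "type": "collection",
         "name": "community.kubernetes", "version": "2.0.1"},
        {"fqcn": "community.okd.k8s", "type": "collection",
         "name": "community.okd", "version": "2.3.0"},
    ]
}
hits = find_module(index, "community.kubernetes.k8s")
```

With this data, the search for community.kubernetes.k8s finds the entry recorded for the community.kubernetes collection, whose fqcn is kubernetes.core.k8s.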

ARI scanning results in truncated output

ARI scanning results in truncated output. The issue was observed in the repo LoadBotTesting-2. Please refer to the play below:

Input Play:

---
  - name: Define timestamp
    set_fact: timestamp="{{ lookup('pipe', 'date +%Y%m%d_%H%M%S') }}"
    run_once: true
  - name: Define file to place results
    set_fact: template={{rootdir}}/{{host}}/{{host}}_{{datatype}}_{{timestamp}}
  - name: Create dropoff directory for host
    file:
      path: "{{ rootdir }}/{{ host }}"
      state: directory

Output Play:

- name: Define timestamp
  run_once: true
  ansible.builtin.set_fact:
    timestamp: "\"{{"

- name: Define file to place results
  ansible.builtin.set_fact:
    template: "{{rootdir}}/{{host}}/{{host}}_{{datatype}}_{{timestamp}}"

- name: Create dropoff directory for host
  ansible.builtin.file:
    path: "{{ _path_ }}"
    state: "directory"
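The truncated value in the first task is exactly what remains if the key=value string is cut at the first space. The sketch below reproduces that effect and contrasts it with quote-aware splitting. It is an illustration of one way such a truncation could arise, not a diagnosis of ARI's parser; parse_naive and parse_quoted are hypothetical helpers, and shlex is used here as a stand-in rather than Ansible's own argument splitter.

```python
import shlex


def parse_naive(raw):
    # Cuts at the first space, so a quoted value containing spaces is truncated.
    first = raw.split(" ")[0]
    key, _, value = first.partition("=")
    return {key: value}


def parse_quoted(raw):
    # shlex keeps quoted segments together and strips the outer quotes.
    result = {}
    for token in shlex.split(raw):
        key, _, value = token.partition("=")
        result[key] = value
    return result


raw = '''timestamp="{{ lookup('pipe', 'date +%Y%m%d_%H%M%S') }}"'''
naive = parse_naive(raw)
quoted = parse_quoted(raw)
```

The second task in the input has no spaces in its value, which would explain why it survives either approach.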

[Annotators] Refactor annotator design

Reduce duplicate code and repeated if conditions in the annotators by moving the logic into a base annotator class and invoking the module's annotator method from the base class. Is there a 1:1 mapping between annotator methods and modules?
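As a starting point for discussion, the proposed design could look roughly like this: the base class resolves a per-module method by name, so subclasses contain no if/elif chains. All class names, method names and annotation fields here are hypothetical, and the sketch assumes the 1:1 mapping between annotator methods and modules that the question above asks about.

```python
class BaseAnnotator:
    """Dispatches to an annotate_<module> method defined by subclasses."""

    def annotate(self, module, args):
        # Use the short module name, e.g. "get_url" for "ansible.builtin.get_url".
        short_name = module.rsplit(".", 1)[-1]
        handler = getattr(self, "annotate_" + short_name, None)
        if handler is None:
            return {}  # no annotator method registered for this module
        return handler(args)


class BuiltinAnnotator(BaseAnnotator):
    # One method per module; field names are made up for this sketch.
    def annotate_get_url(self, args):
        return {"category": "inbound_transfer",
                "src": args.get("url"), "dest": args.get("dest")}

    def annotate_file(self, args):
        return {"category": "file_change", "path": args.get("path")}


annotator = BuiltinAnnotator()
result = annotator.annotate(
    "ansible.builtin.get_url", {"url": "https://example.com/a", "dest": "/tmp/a"}
)
```

If the mapping turns out not to be 1:1, the base class could instead look up handlers in an explicit dictionary that maps several module names to one method.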

Better system structure between ARI components

Define interfaces and parameters

  • support multiple input patterns
  • refine ARI findings data format
  • document all commands for ARI
  • document all in/out

Data storage for intermediates and outputs

  • Dep Preparator implementation
  • RAM
  • RAM Generator
  • Prepare RAM for Galaxy using RAM Generator
  • Support search path for Ansible

ARI scan results in module_defaults being copied to each task, which is unexpected

An ARI scan results in module_defaults being copied to each of the tasks, which is unexpected. Please refer to the example below:

Input play:

- name: rds_instance / processor integration tests
  collections:
  - community.aws
  module_defaults:
    group/aws:
      aws_access_key: '{{ aws_access_key }}'
      aws_secret_key: '{{ aws_secret_key }}'
      security_token: '{{ security_token | default(omit) }}'
      region: '{{ aws_region }}'
  block:
  - name: Ensure the resource doesn't exist
    rds_instance:
      id: '{{ instance_id }}'
      state: absent
      skip_final_snapshot: true
    register: result

Output play:

- name: Ensure the resource doesn't exist
  register: result
  amazon.aws.rds_instance:
    id: "{{ instance_id }}"
    state: "absent"
    skip_final_snapshot: true
  module_defaults:
    group/aws:
      aws_access_key: '{{ aws_access_key }}'
      aws_secret_key: '{{ aws_secret_key }}'
      security_token: '{{ security_token | default(omit) }}'
      region: '{{ aws_region }}'

- ansible.builtin.assert:
    that:
    - not result.changed
  module_defaults:
    group/aws:
      aws_access_key: '{{ aws_access_key }}'
      aws_secret_key: '{{ aws_secret_key }}'
      security_token: '{{ security_token | default(omit) }}'
      region: '{{ aws_region }}'

This issue was observed in the repo LoadBotTesting-3, in the rds_instance_processor task file, but it is relevant to plays in other repos as well.

breaking change in 0.1.1: "TypeError: 'NoneType' object is not iterable" when calling ARIScanner.evaluate()

This code works in versions 0.0.0 and 0.1.0.
I pulled from the latest main (git rev-parse --short HEAD: 1cdca85e) and started getting this error.
We are initializing an ARIScanner and then calling its evaluate method on different scripts a couple of times. The first calls are fine, but when we try to call it with a different script we run into this Exception: TypeError: 'NoneType' object is not iterable.

Here's a more detailed stack trace:

line 17, in __init__
    val = self.scanner.evaluate(
  File "/Users/andrewjda/ansible-risk-insight/ansible_risk_insight/scanner.py", line 1006, in evaluate
    scandata.annotate()
  File "/Users/andrewjda/ansible-risk-insight/ansible_risk_insight/scanner.py", line 609, in annotate
    contexts = analyze(self.contexts)
  File "/Users/andrewjda/ansible-risk-insight/ansible_risk_insight/analyzer.py", line 64, in analyze
    for j, t in enumerate(ctx.tasks):
  File "/Users/andrewjda/ansible-risk-insight/ansible_risk_insight/models.py", line 1645, in tasks
    return self.taskcalls
  File "/Users/andrewjda/ansible-risk-insight/ansible_risk_insight/models.py", line 1641, in taskcalls
    return [t for t in self.sequence if t.type == RunTargetType.Task]

Let me know if you'd like any more information from me. Thanks!
