Hi do you have support for scanning text / prompts in batches? I am thinking about

Hey <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url=

In our <a href="https://join.slack.com/t/laiyerai/shared_invite/zt-28jv3ci39-sVxXrLs3r

Maybe something like this(add scan_batch to <code cla

Hey <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url=

Batch scan support about llm-guard HOT 5 OPEN

protectai commented on May 30, 2024

Batch scan support

from llm-guard.

Comments (5)

asofter commented on May 30, 2024

Hey @gustavz , thanks for reaching out. RAG is the scope of the next release introducing more examples and features related to that. One of them can be a support of the batch.

Probably one thing we should explore is latency because there are a few ways to run it:

Sequentially like now
Using multiprocessing or threading libraries
Pass multiple texts in transformers and let them handle concurrency

I will keep you updated on the progress

from llm-guard.

asofter commented on May 30, 2024

In our Slack, you can engage in discussions, share your feedback and get latest updates.

from llm-guard.

vincent-pli commented on May 30, 2024

Any more information? I work on RAG now, try to introduce guard in RAG process and I adopt ray to do the concurrency.
but the result is not good enough...

from llm-guard.

vincent-pli commented on May 30, 2024

Maybe something like this(add scan_batch to prompt injection):

    def scan_batch(self, prompts: list[str]) -> list[(str, bool, float)]:
        if len(prompts) == 0:
            return []
        
        prompt_batch: list[str] = []
        for prompt in prompts:
            prompt_batch.extend(self._match_type.get_inputs(prompt))

        highest_score = 0.0
        results_all = self._pipeline(prompt_batch)

        result_batch: list[(bool, float)] = []
        for result in results_all:
            injection_score = round(
                result["score"] if result["label"] == self._model["label"] else 1 - result["score"],
                2,
            )

            if injection_score > highest_score:
                highest_score = injection_score

            if injection_score > self._threshold:
                logger.warning(f"Detected prompt injection with score: {injection_score}")
                result_batch.append((False, calculate_risk_score(injection_score, self._threshold)))
            else:
                result_batch.append((True, 0.0))
        
        return [(prompts[i],) + result_batch[i] for i in range(0, len(result_batch))]

from llm-guard.

asofter commented on May 30, 2024

Hey @vincent-pli ,
On one side, it makes sense to have prompts as a list. However, we are planning to improve accuracy by sending chunks of prompts instead to the pipeline. It might make things a bit more complex.

In the next version, we are planning to kick off the refactoring of inputs and outputs.

from llm-guard.

Recommend Projects

Batch scan support about llm-guard HOT 5 OPEN

Comments (5)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent