[DRAFT][Feat] Beam Search for OpenAIResponseTarget #1346
riedgar-ms wants to merge 49 commits into Azure:main
Conversation
riedgar-ms
left a comment
This is ready for preliminary review; there aren't any docs or (proper) tests yet. I'd like to make sure that I'm manipulating the database correctly before delving into those.
@@ -0,0 +1,72 @@
import asyncio
This file will ultimately be converted into tests and/or a notebook. For now, it's the easiest way for me to test.
    **deepcopy(kwargs),
}

def fresh_instance(self) -> "OpenAIResponseTarget":
These changes are required because the OpenAI API takes a grammar as a tool, and PyRIT makes the tool list part of the target object rather than part of the send_prompt_async() API. Since multiple beams are managed asynchronously, each task needs its own copy of the OpenAIResponseTarget.
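A minimal sketch of the pattern (`GrammarTarget` is a stand-in class; the real `OpenAIResponseTarget` constructor takes different arguments):

```python
import copy

class GrammarTarget:
    """Stand-in for OpenAIResponseTarget: the tool list (including the
    grammar tool) lives on the object, so concurrent beams need copies."""

    def __init__(self, **kwargs):
        # Keep the constructor kwargs so an identical instance can be rebuilt.
        self._init_kwargs = copy.deepcopy(kwargs)
        self.tools = kwargs.get("tools", [])

    def fresh_instance(self) -> "GrammarTarget":
        # Each asynchronously managed beam gets its own target, so mutating
        # one beam's tool list cannot affect the others.
        return GrammarTarget(**copy.deepcopy(self._init_kwargs))

target = GrammarTarget(tools=[{"type": "grammar", "lark": "start: WORD"}])
clone = target.fresh_instance()
clone.tools.append({"type": "other_tool"})
assert len(target.tools) == 1  # original target is unaffected
```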
logger = logging.getLogger(__name__)

def _print_message(message: Message) -> None:
Will be deleted; this is for my debugging convenience.
return new_beams

class BeamSearchAttack(SingleTurnAttackStrategy):
This is largely copied from the PromptSendingAttack
target = self._get_target_for_beam(beam)

current_context = copy.deepcopy(self._start_context)
await self._setup_async(context=current_context)
I'm not certain I'm handling the context correctly here. I end up making lots of copies, which is going to fill the database with fragmentary responses: each time a beam is extended, its context is cloned and a new conversation started.
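A rough sketch of the clone-on-extension pattern in question (hypothetical names; the real attack context carries much more state):

```python
import copy
from dataclasses import dataclass, field

@dataclass
class BeamContext:
    """Stand-in for the attack context owned by one beam."""
    conversation_id: int
    messages: list = field(default_factory=list)

def extend_beam(ctx: BeamContext, new_text: str, next_id: int) -> BeamContext:
    # Cloning on every extension means each partial response is stored as
    # a separate conversation -- the behaviour the comment is asking about.
    new_ctx = copy.deepcopy(ctx)
    new_ctx.conversation_id = next_id
    new_ctx.messages.append(new_text)
    return new_ctx

start = BeamContext(conversation_id=0, messages=["objective"])
step1 = extend_beam(start, "partial response", next_id=1)
assert start.messages == ["objective"]  # the original context is untouched
```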
    objective=context.objective,
)

aux_scores = scoring_results["auxiliary_scores"]
So the auxiliary scorer is required; it is used to assess the beams as they develop.
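As a toy illustration of the role the auxiliary scorer plays (`score_beams` and `length_scorer` are hypothetical; real PyRIT scorers return Score objects, not raw numbers):

```python
def score_beams(beams, aux_scorer):
    """Attach an auxiliary score to each beam so a pruner can rank them."""
    return [(beam, aux_scorer(beam)) for beam in beams]

def length_scorer(beam: str) -> int:
    # Hypothetical auxiliary scorer: longer partial responses score higher.
    return len(beam)

scored = score_beams(["short", "a longer beam"], length_scorer)
```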
pass

class TopKBeamPruner(BeamPruner):
This is simple, bordering on simplistic, meant mainly as a concrete example of what is required.
new_beams = list(reversed(sorted_beams[: self.k]))
for i in range(len(beams) - len(new_beams)):
    nxt = copy.deepcopy(new_beams[i % self.k])
Rather than just duplicating the highest-scoring beam, we cycle through all of the survivors, in the hope of maintaining some variety.
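The refill strategy can be sketched as follows (`prune_top_k` is a simplified stand-in; the real pruner works with beam objects and Score results, not bare strings and floats):

```python
import copy

def prune_top_k(beams, scores, k):
    """Keep the k best beams, then refill to the original count by cycling
    copies of the survivors rather than duplicating only the best one."""
    sorted_beams = [b for _, b in sorted(zip(scores, beams), reverse=True)]
    new_beams = sorted_beams[:k]
    for i in range(len(beams) - len(new_beams)):
        # Cycle through all k survivors to preserve some variety.
        new_beams.append(copy.deepcopy(new_beams[i % k]))
    return new_beams

beams = ["a", "b", "c", "d"]
scores = [0.1, 0.9, 0.4, 0.7]
assert prune_top_k(beams, scores, 2) == ["b", "d", "b", "d"]
```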
Args:
    context (SingleTurnAttackContext): The attack context containing attack parameters.
"""
self._start_context = copy.deepcopy(context)
See note below. I duplicate the context and the message for each beam on each iteration. I'm not certain that this is the best way to use the database.
Description
Use the Lark grammar feature of the OpenAIResponseTarget to create a beam search for PyRIT. This is a single-turn attack, where a collection of candidate responses (the beams) is maintained. On each iteration, the model's response is allowed to extend a little for each beam. The beams are then scored, with the worst-performing ones discarded and replaced with copies of higher-scoring beams.

Tests and Documentation

TBD. Want to get the actual attack code checked.
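The overall loop described above can be sketched in a few lines (purely illustrative: a toy deterministic "model" stands in for the grammar-constrained OpenAIResponseTarget, and a bare function stands in for the auxiliary scorer):

```python
import copy

def beam_search(objective, extend, score, num_beams=3, steps=4, k=2):
    """Toy beam search: extend each beam a little, score all beams,
    prune to the top k, then refill with copies of the survivors."""
    beams = [objective for _ in range(num_beams)]
    for _ in range(steps):
        beams = [extend(b) for b in beams]          # model extends each beam
        scores = [score(b) for b in beams]          # auxiliary scoring
        ranked = [b for _, b in sorted(zip(scores, beams), reverse=True)]
        survivors = ranked[:k]
        # Refill by cycling copies of survivors, as the pruner does.
        beams = [copy.deepcopy(survivors[i % k]) for i in range(num_beams)]
    return beams

# Toy "model" appends one character; toy scorer counts occurrences of "a".
result = beam_search("seed-", lambda b: b + "a", lambda b: b.count("a"), steps=2)
```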