
Filter Component

The Filter Component is designed to filter input data based on the value of a specific field, retaining only the data that meets the defined conditions.

Scenarios

In a typical data filtering scenario, the LLM component serves as a preliminary step. Configured as a classifier, the LLM assigns a classification result to each piece of data. The Filter Component then filters on these classification results to obtain the desired dataset.

For example, consider the following pipeline for generating questions based on a document:

Pipeline Structure

Capabilities

In this pipeline, the Filter Component can:

  1. Remove data unrelated to the original document.
  2. Eliminate overly basic questions.

For instance, to remove data unrelated to the original document, the LLM component’s prompt in the previous step could be:

You are a juror, and are tasked with giving a judgement if there is enough evidence in the passage to answer a given question.
……
<Judgement-Options>
- "Beyond a reasonable doubt" - There is enough evidence in the passage or the information in the passage can be used to completely answer the question beyond a reasonable doubt.
- "Somewhat relevant" - Only part of the evidence required to completely answer, or to reason through to get the answer is available in the passage.
- "Not useful" - The passage doesn't contain enough information to answer the question.
</Judgement-Options>

Generate your answer in JSON format with the fields below:
- "Reasoning": 1-10 words of reasoning
- "your_decision": "fill with judgement option"

The "your_decision" field carries the classification result. The Filter Component is configured as follows:

Filter Component Configuration

The component checks whether the value of the "your_decision" field is "Not useful". When an item meets the condition, the filtering operation is applied to it.
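The check described above can be sketched in Python. This is only an illustration under stated assumptions: the sample LLM outputs and the helper name `split_on_decision` are hypothetical, and the sketch simply separates the items that match "Not useful" from the rest, since the component itself decides what happens to each group.

```python
import json

# Hypothetical raw LLM outputs, one JSON string per question (illustrative data only).
raw_outputs = [
    '{"Reasoning": "Passage states the answer directly", "your_decision": "Beyond a reasonable doubt"}',
    '{"Reasoning": "Topic absent from passage", "your_decision": "Not useful"}',
    '{"Reasoning": "Only partial evidence present", "your_decision": "Somewhat relevant"}',
]


def split_on_decision(outputs, target="Not useful"):
    """Parse each JSON output and split records by whether "your_decision" equals `target`."""
    matched, unmatched = [], []
    for text in outputs:
        record = json.loads(text)
        if record.get("your_decision") == target:
            matched.append(record)
        else:
            unmatched.append(record)
    return matched, unmatched


not_useful, remaining = split_on_decision(raw_outputs)
print(len(not_useful), len(remaining))
```

An exact string comparison is used here, so a label with different casing or extra whitespace would not match. Constraining the LLM to the listed judgement options helps keep the labels consistent.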

Configuration

Configuration Steps

  1. Configure the field used for filtering.
  2. Set the filtering condition type. Two types are currently supported: exact match and numeric condition.
  3. Set the value for the filtering condition. For the exact match type, the field value must equal the configured value. For the numeric condition type, the field value can be required to be greater than or less than a given value. The condition value for the numeric condition type supports the following three configurations.

Filter Component Conditions
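The two condition types can be illustrated with a small Python sketch. The function `condition_met` and the type names `"exact"`, `"greater_than"`, and `"less_than"` are hypothetical and are not the component's actual configuration keys. The sketch covers only the comparisons named in the text, and treating missing or non-numeric values as not matching is an assumption.

```python
from typing import Any, Dict


def condition_met(record: Dict[str, Any], field: str, kind: str, value: Any) -> bool:
    """Return True when record[field] satisfies the condition.

    kind is one of "exact", "greater_than", "less_than" (hypothetical names).
    """
    if field not in record:
        return False  # assumption: a missing field never satisfies a condition
    actual = record[field]
    if kind == "exact":
        return actual == value
    if kind in ("greater_than", "less_than"):
        try:
            number = float(actual)
        except (TypeError, ValueError):
            return False  # assumption: non-numeric values never satisfy a numeric condition
        threshold = float(value)
        return number > threshold if kind == "greater_than" else number < threshold
    raise ValueError(f"unsupported condition type: {kind}")


# Exact match on a classification label.
print(condition_met({"your_decision": "Not useful"}, "your_decision", "exact", "Not useful"))
# Numeric condition on a hypothetical score field.
print(condition_met({"score": 7}, "score", "greater_than", 5))
```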

Component Input

  • Type: List of dictionaries.
  • Description: All dictionaries in the list share the same set of fields.

Component Output

  • Type: Filtered list of dictionaries.
  • Description: The output list includes only those dictionary elements that meet the specified filtering conditions.
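The input and output shapes can be shown with a minimal Python sketch. It assumes, following the output description above, that items meeting the condition are the ones retained. The function `apply_filter`, the `difficulty` field, and the sample questions are hypothetical.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


def apply_filter(records: List[Record], field: str,
                 condition: Callable[[Any], bool]) -> List[Record]:
    """Return the records whose `field` value satisfies `condition`, preserving order."""
    return [r for r in records if field in r and condition(r[field])]


# Illustrative input: every dictionary carries the same fields.
questions = [
    {"question": "What is the document's title?", "difficulty": 1},
    {"question": "How does section 2 justify its main claim?", "difficulty": 4},
    {"question": "Which trade-offs does the author compare?", "difficulty": 3},
]

# Keep only questions whose hypothetical difficulty score is greater than 2.
kept = apply_filter(questions, "difficulty", lambda d: d > 2)
print([q["difficulty"] for q in kept])  # prints [4, 3]
```

The output is again a list of dictionaries with the same fields as the input, so it can be passed directly to a downstream component.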