Ingestions
The Ingestions plugins are used to ingest user or enterprise data into custom datasets
defined by the user or enterprise.
Navigate to Ingesting Data to learn more about the process of ingesting data into Ask Sage.
List of Ingestions Plugins & Agents
Index | Title | Access | Description of the Plugin/Agent | Category |
---|---|---|---|---|
1 | CSV Lines | Paid users only | Ingest each line of a CSV file as a separate training | Ingestion |
2 | Content into Dataset | Paid users only | Train text content into a specific dataset | Ingestion |
3 | File | Paid users only | Import file content, split it into chunks, train the chunks into a dataset, summarize the content, and ingest the summaries | Ingestion |
4 | plain/text content | Paid users only | Import plain/text content, split it into chunks, train the chunks into a dataset, summarize the content, and ingest the summaries | Ingestion |
CSV
The CSV Lines plugin is used to ingest each entry/line in a CSV file as a separate training.
Step 1 - Navigate to the Ask Sage Prompt Settings section and select Prompt Templates, then select the CSV Lines plugin.
Step 2 - Click on the Choose File button to upload the CSV file you want to ingest.
Step 3 - Provide a short description of the CSV file you are ingesting.
Step 5 - Select the dataset you want to ingest the CSV file into.
Step 6 - Click on the Submit button.
Step 7 - Execute the prefilled prompt generated by the plugin, which will loop through each line in the CSV and execute the prompt against each line.
After ingesting your CSV file, you can proceed to ask questions or generate text from the data you ingested.
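To make "each line as a separate training" concrete, the sketch below turns every data row of a CSV into one standalone text record. It is an illustration only, not the plugin's implementation: the function name `csv_to_records` is hypothetical, and pairing each value with its column header is an assumption about how a row could be made self-describing.

```python
import csv
import io


def csv_to_records(csv_text: str) -> list[str]:
    """Turn each data row of a CSV into one standalone text record.

    Hypothetical helper for illustration; not part of Ask Sage.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        # Pair every value with its column header so each record can be
        # understood on its own, without the rest of the file.
        records.append("; ".join(f"{key}: {value}" for key, value in row.items()))
    return records


sample = "name,role\nAda,engineer\nGrace,admiral\n"
for record in csv_to_records(sample):
    print(record)
```

Each printed record corresponds to one row, which is the unit the prefilled prompt loops over in Step 7.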
Content into Dataset
The Content into Dataset plugin is used to ingest text content into a specific dataset. One use case is text content that does not exist as a file (e.g., CSV, PDF, etc.).
An example is using our Summarize Website plugin to summarize a website's content and then ingesting the summarized content into a dataset via the Content into Dataset plugin.
Step 1 - Navigate to the Ask Sage Prompt Settings section and select Prompt Templates, then select the Content into Dataset plugin.
Step 2 - Enter the text content you want to ingest into the dataset. (Recommended: 500 tokens per ingestion)
Step 3 - Provide a short description of the text content you are ingesting.
Step 5 - Select the dataset you want to ingest the text content into.
Step 6 - Click on the Submit button.
Step 7 - Execute the prefilled prompt generated by the plugin, which will ingest the text content into the dataset.
The expected output is similar to that of the CSV Lines plugin: you get a confirmation of the ingestion and can proceed to ask questions or generate text from the data you ingested.
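If your text is longer than the recommended 500 tokens per ingestion, one option is to split it into smaller pieces before pasting each piece into the plugin. The sketch below is a rough aid under a stated assumption: it counts whitespace-separated words as a stand-in for tokens, which will not match the tokenizer Ask Sage actually uses. The function name `split_for_ingestion` is hypothetical.

```python
def split_for_ingestion(text: str, max_tokens: int = 500) -> list[str]:
    """Split text into pieces of at most max_tokens whitespace-separated words.

    Assumption: one word is treated as one token, which is only an estimate.
    """
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    words = text.split()
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]


pieces = split_for_ingestion("word " * 1200)
print([len(piece.split()) for piece in pieces])  # [500, 500, 200]
```

Because real tokenizers usually produce more tokens than words, a lower `max_tokens` value leaves some headroom.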
File
The File plugin is used to ingest file content, split it into chunks, train the chunks into a dataset, summarize the content, and ingest the summaries.
Supported file types are listed in the plugin description and in the Ingesting Data section.
Step 1 - Navigate to the Ask Sage Prompt Settings section and select Prompt Templates, then select the File plugin.
Step 2 - Click on the Choose File button to upload the file you want to ingest.
Step 3 - Select the file reader strategy from the dropdown list. The options are:
- Auto (default): The system automatically selects the most appropriate file reading strategy based on the file type and content. It aims to balance speed and accuracy.
- Fast: This strategy prioritizes speed over accuracy. It is useful when you need to quickly process a large number of files and can tolerate some loss in detail or accuracy.
- Hi_res (for OCR recognition): This strategy is designed for high-resolution processing, particularly for Optical Character Recognition (OCR). It is useful for extracting text from images or scanned documents where high accuracy is required.
If you are unsure which strategy to choose, leave it at the default “Auto” setting.
Step 4 - Provide a short description of the file content you are ingesting.
Step 5 - Enter the number of tokens you want to ingest per chunk. (Max 2,000 tokens per chunk for training)
Step 6 - Enter the prompt you want to use to summarize the content. (Keep default if unsure)
Step 7 - Select the dataset you want to ingest the file content into.
Step 8 - Click on the Submit button.
Step 9 - Execute the prefilled prompt generated by the plugin, which will ingest the file content and prompt you to confirm the summaries.
The expected output is a confirmation that lists the number of chunks ingested and the summaries of the content.
At the end of the output, you can accept the summaries for ingestion into the dataset or reject them.
The options are:
A) Ingest the data into the dataset: /yes
B) Skip: /skip
If you choose to skip, you are prompted to either re-run the summarization plugin on the summarized results, and then ingest the new summaries into the dataset, or stop the process.
C) Stop the process, which does not ingest the summarized results into the dataset: /stop
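The three commands can be read as a small decision flow. The sketch below models it for illustration only. It simplifies the skip branch by assuming that /skip re-runs the summarization and then asks again, and the names `Decision`, `parse_decision`, and `summaries_are_ingested` are hypothetical, not part of Ask Sage.

```python
from enum import Enum


class Decision(Enum):
    INGEST = "/yes"
    SKIP = "/skip"
    STOP = "/stop"


def parse_decision(reply: str) -> Decision:
    """Map a typed reply to one of the three documented commands."""
    try:
        return Decision(reply.strip().lower())
    except ValueError:
        raise ValueError(f"expected /yes, /skip, or /stop, got {reply!r}") from None


def summaries_are_ingested(replies: list[str]) -> bool:
    """Walk a sequence of replies; return True only if the flow ends with /yes."""
    for reply in replies:
        decision = parse_decision(reply)
        if decision is Decision.INGEST:
            return True
        if decision is Decision.STOP:
            return False
        # Decision.SKIP: assumed to re-run summarization and ask again.
    return False  # replies ran out before the user confirmed


print(summaries_are_ingested(["/skip", "/yes"]))  # True
print(summaries_are_ingested(["/stop"]))          # False
```

In this model, the summaries reach the dataset only when the sequence ends with /yes; /stop at any point leaves them out.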
Plain/text content
The plain/text content plugin is used to ingest plain/text content, split it into chunks, train the chunks into a dataset, summarize the content, and ingest the summaries.
The main difference between this plugin and the Content into Dataset plugin is that this plugin is able to ingest very large text content.
Step 1 - Navigate to the Ask Sage Prompt Settings section and select Prompt Templates, then select the plain/text content plugin.
Step 2 - Enter the text content you want to ingest into the dataset.
Step 3 - Provide a short description of the text content you are ingesting.
Step 4 - Enter the number of tokens you want to ingest per chunk. (Max 2,000 tokens per chunk for training)
Step 5 - Enter the prompt you want to use to summarize the content. (Keep default if unsure)
Step 6 - Select the dataset you want to ingest the text content into.
Step 7 - Click on the Submit button.
Step 8 - Execute the prefilled prompt generated by the plugin, which will ingest the text content and prompt you to confirm the summaries.
As with the File plugin, you can accept the summaries for ingestion into the dataset or reject them.
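The chunk-then-summarize flow described for this plugin can be sketched as follows. This is a minimal illustration under stated assumptions, not the plugin's actual code: words stand in for tokens, one summary is produced per chunk, and `chunk_and_summarize` and the stub summarizer `first_words` are hypothetical names. In the real plugin, the summary comes from the summarization prompt entered in Step 5.

```python
from typing import Callable

MAX_TOKENS_PER_CHUNK = 2000  # upper bound stated in Step 4


def chunk_and_summarize(
    text: str,
    tokens_per_chunk: int,
    summarize: Callable[[str], str],
) -> tuple[list[str], list[str]]:
    """Split text into chunks, then produce one summary per chunk.

    Assumption: whitespace-separated words approximate tokens.
    """
    if not 1 <= tokens_per_chunk <= MAX_TOKENS_PER_CHUNK:
        raise ValueError(
            f"tokens_per_chunk must be between 1 and {MAX_TOKENS_PER_CHUNK}"
        )
    words = text.split()
    chunks = [
        " ".join(words[start:start + tokens_per_chunk])
        for start in range(0, len(words), tokens_per_chunk)
    ]
    summaries = [summarize(chunk) for chunk in chunks]
    return chunks, summaries


def first_words(chunk: str, limit: int = 2) -> str:
    """Stub summarizer: keeps the first few words of a chunk."""
    return " ".join(chunk.split()[:limit])


chunks, summaries = chunk_and_summarize("a b c d e f g", 3, first_words)
print(len(chunks), summaries)  # 3 ['a b', 'd e', 'g']
```

The chunk count and the list of summaries correspond to what the confirmation output reports before you accept or reject the summaries.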