Ingesting Data into Ask Sage

In this section, we will guide you through the process of ingesting data into Ask Sage. Understanding this process is crucial to generating accurate results that are relevant to your use case.

This matters because the GenAI models are trained on a diverse range of open-source datasets, which are general rather than specific to your work. The more specific and relevant the data you ingest, the better the results you will get from the GenAI models. The steps below walk through how to ingest data into Ask Sage.

1) Ask Sage allows you to ingest data in a wide range of formats, including text, images, and audio. The GenAI models can then generate text from the data you ingest, which is useful for a variety of use cases.

2) Ask Sage users only have to ingest data once. The ingested data can then be used across multiple GenAI models on the platform.

Table of contents
  1. Define a Dataset
  2. Upload/Ingest Data
  3. Using the Dataset with GenAI Models
  4. Summary

Define a Dataset

The first step is to create a dataset in Ask Sage. A dataset is equivalent to a folder where you can store all the data you want to ingest into Ask Sage. You can create multiple datasets to organize your data based on specific use cases or projects.

To create a dataset, follow these steps:

  • In the Advanced Settings section, click on Upload New Files.
  • Click on the Create New Dataset button.
    • Enter a dataset name. Only alphanumeric characters and hyphens are allowed; spaces and special characters are not (e.g., my-dataset12345).
    • Click on the Create Dataset button. (If successful, you will see Dataset created)
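
The naming rule above can be checked locally before you type a name into the form. The following is a minimal sketch, assuming that "alphanumeric" means ASCII letters and digits; the function name `is_valid_dataset_name` is hypothetical and is not part of Ask Sage.

```python
import re

# Assumption: "alphanumeric" is read as ASCII letters and digits.
# The pattern allows only letters, digits, and hyphens.
DATASET_NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def is_valid_dataset_name(name: str) -> bool:
    """Return True if name is non-empty and uses only letters, digits, and hyphens."""
    return DATASET_NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["my-dataset12345", "my dataset", "my_dataset"]:
        print(candidate, is_valid_dataset_name(candidate))
```

The platform may apply further rules (for example, a length limit) that this sketch does not know about, so treat a passing result as a first check only.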

After creating a dataset, you can now start ingesting data into Ask Sage.

1) As a best practice, use a clear naming convention for your datasets so you can easily identify them when ingesting data.

2) On your local machine, you can create a folder with the same name as the dataset you created in Ask Sage. This will help you organize your data locally and easily upload it to Ask Sage.
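
If you prefer to script the local folder setup, a short sketch along these lines would work. The helper name `make_local_dataset_folder` and the base directory are assumptions chosen for illustration; the code only touches your local file system and does not communicate with Ask Sage.

```python
from pathlib import Path


def make_local_dataset_folder(dataset_name: str, base_dir: Path) -> Path:
    """Create (if needed) a local folder named after the dataset and return its path."""
    folder = base_dir / dataset_name
    # exist_ok=True makes the call safe to repeat for an existing folder.
    folder.mkdir(parents=True, exist_ok=True)
    return folder


if __name__ == "__main__":
    # Hypothetical location: an "ask-sage-data" directory in the home folder.
    path = make_local_dataset_folder("my-dataset12345", Path.home() / "ask-sage-data")
    print(f"Local folder ready: {path}")
```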

Upload/Ingest Data

After creating a dataset, you can upload/ingest data into Ask Sage. The supported data types, file formats, and per-file size limits are listed in the table below:

| Data Type | File Format | Example | Max Size Per File |
|---|---|---|---|
| Text | .txt, .docx, .pdf, .pptx, .ppt, .csv, .cc, .sql, .cs, .hh, .c, .php, .js, .py, .html, .xml, .msg, .odt, .epub, .eml, .rtf, .doc, .json, .md, .tsv, .yaml, .yml, .java, .rb, .sh, .bat, .ps1 | example.txt | 50MB |
| Image | .jpg, .jpeg, .png | example.jpg | 50MB |
| Audio | .wav, .mp3, .mp4, .mpeg, .mpga, .m4a, .webm | example.wav | 500MB |
| Compressed | .zip | example.zip | 50MB |
| Spreadsheet | .xlsx, .tsv | example.xlsx | 50MB |
| Presentation | .pptx, .ppt | example.pptx | 50MB |
| Code | .cc, .sql, .cs, .hh, .c, .php, .js, .py, .java, .rb, .sh, .bat, .ps1 | example.py | 50MB |
| E-book | .epub | example.epub | 50MB |
| Email | .eml, .msg | example.eml | 50MB |
| Rich Text | .rtf | example.rtf | 50MB |
| Markup | .md, .html, .xml | example.html | 50MB |
| Data Interchange | .json, .yaml, .yml | example.json | 50MB |

Be aware that images embedded in text documents will not be extracted. You will need to upload those images separately.
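
The formats and size limits in the table above can be checked locally before you upload. The sketch below is one possible pre-flight check, not an Ask Sage feature. The names `max_size_bytes` and `can_ingest` are hypothetical, and "MB" is assumed to mean 1024 × 1024 bytes, which the table does not specify.

```python
from pathlib import Path
from typing import Optional

MB = 1024 * 1024  # Assumption: the table's "MB" is read as 2**20 bytes.

# Extensions copied from the table above. Audio has the larger 500MB limit.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".webm"}
OTHER_EXTENSIONS = {
    ".txt", ".docx", ".pdf", ".pptx", ".ppt", ".csv", ".cc", ".sql", ".cs",
    ".hh", ".c", ".php", ".js", ".py", ".html", ".xml", ".msg", ".odt",
    ".epub", ".eml", ".rtf", ".doc", ".json", ".md", ".tsv", ".yaml", ".yml",
    ".java", ".rb", ".sh", ".bat", ".ps1", ".jpg", ".jpeg", ".png", ".zip",
    ".xlsx",
}


def max_size_bytes(filename: str) -> Optional[int]:
    """Return the per-file size limit in bytes, or None if the extension is not listed."""
    extension = Path(filename).suffix.lower()
    if extension in AUDIO_EXTENSIONS:
        return 500 * MB
    if extension in OTHER_EXTENSIONS:
        return 50 * MB
    return None


def can_ingest(filename: str, size_bytes: int) -> bool:
    """Return True if the extension is listed and the size is within its limit."""
    limit = max_size_bytes(filename)
    return limit is not None and size_bytes <= limit
```

For a file on disk, the size can be read with `Path(path).stat().st_size` and passed as `size_bytes`.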

To upload data into Ask Sage, navigate to the Ingest Files section and follow these steps:

  • Select the dataset you created from the dropdown list.
  • In the box, drag and drop the files you want to upload or click in the box to select files from your local machine.
  • After selecting the files, you will see the file names listed in the box. Review the files to ensure they are correct and delete any files you do not want to upload via the garbage bin icon.
  • Click on the Ingest Files button to start uploading the files.
  • If successful, you will see a white checkmark and the text Successfully Imported for each uploaded file.

1) You can upload multiple files at once by selecting multiple files from your local machine.
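
When uploading many files at once, it can help to review the contents of your local folder before dragging it into the upload box. The following sketch lists the files directly inside a folder together with their sizes. The function name `list_upload_candidates` is hypothetical, and the code only reads your local file system.

```python
from pathlib import Path
from typing import List, Tuple


def list_upload_candidates(folder: Path) -> List[Tuple[str, int]]:
    """Return (file name, size in bytes) for each file directly inside folder, sorted by name."""
    # Subfolders are skipped; only regular files are reported.
    return sorted((p.name, p.stat().st_size) for p in folder.iterdir() if p.is_file())


if __name__ == "__main__":
    # Lists the current working directory as a stand-in for your dataset folder.
    for name, size in list_upload_candidates(Path(".")):
        print(f"{name}: {size / (1024 * 1024):.2f} MB")
```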

Using the Dataset with GenAI Models

After ingesting data into Ask Sage, you can now use the data with any of the GenAI models available on the platform.

To use the data with the GenAI models, follow these steps:

  • Navigate to the Advanced Settings section.
  • Update the Advanced Settings to use the dataset(s) you created.
    • Update any other settings as needed (e.g., Model, Persona, Temperature, Personality, etc.)
  • Enter a prompt and submit your prompt.

The selected dataset(s) appear in the Advanced Settings section and also under the prompt window, so you can easily identify which dataset(s) are used with the prompt.

For optimal results with the ingested data, we recommend keeping the Temperature setting at its default value of 0.0 and ensuring that the Live setting is turned off. Incorrect settings may result in subpar outcomes or data contamination.
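
The recommendation above can be expressed as a small checklist in code. This is only an illustrative sketch: `PromptSettings` and `settings_warnings` are hypothetical names that model the two settings mentioned in this section, and they do not correspond to any Ask Sage API object.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptSettings:
    """Hypothetical record of the two settings discussed above."""
    temperature: float = 0.0  # Default value stated in this section.
    live: bool = False        # Recommended to be off when using ingested data.


def settings_warnings(settings: PromptSettings) -> List[str]:
    """Return a warning for each setting that deviates from the recommendation."""
    warnings = []
    if settings.temperature != 0.0:
        warnings.append("Temperature is not 0.0; results from ingested data may be less precise.")
    if settings.live:
        warnings.append("Live is on; consider turning it off when using ingested data.")
    return warnings


if __name__ == "__main__":
    for message in settings_warnings(PromptSettings(temperature=0.7, live=True)):
        print(message)
```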

The inference/response generated by the GenAI model utilizes the dataset assigned to the prompt.

Ask Sage users also benefit from the Show Explainability feature, which provides a detailed reference to the data used to generate the text when a dataset and/or the Live feature is used. This helps you understand the context of the generated text and verify that it is relevant and not a hallucination.

Summary

In this section, we guided you through the process of ingesting data into Ask Sage. Understanding this process is crucial to generating accurate results that are relevant to your work or organization.

Now that you have a better understanding of how to ingest data into Ask Sage, you are ready to start utilizing the platform and leveraging the power of GenAI!

Proceed to the next sections to learn more about Ask Sage! 🚀



Copyright © 2024 Ask Sage Inc. All Rights Reserved.