Copilot
Overview
The copilot for Workflow Designer is an AI assistant that helps streamline your workflow.
- It understands your requests in plain language. The copilot interprets your intent and places configured operators directly onto the canvas to solve the task, so there is no need to manually search for operators and set their parameters.
- The copilot speaks many languages, not just English. Don't hesitate to use your own language!
The copilot uses a customized large language model (LLM) that has been trained on two things:
- Understanding tasks related to data preparation and data analysis, and
- Translating these tasks into Altair AI Cloud operators.
What distinguishes the copilot for Workflow Designer from other AI assistants (like ChatGPT, Siri, ...) is that it is seamlessly integrated into the building process and has the current data as context. It can therefore find the right column in the data and correctly configure the returned operators, so that they are ready to be executed.
Using the copilot
Once a data table is loaded, you can simply click on the output port of an operator.
This will do two things:
- the data is shown in the panel at the bottom of the designer, and
- the port is extended with a plus icon, indicating that it is selected and that you want to continue using it.
Without copilot
Normally, you would now go over to the operator panel on the right-hand side of the designer and select an operator. Because we have a port selected, it is sufficient to click on the (+) plus icon next to the name. Then the operator will be placed directly on the canvas and connected to the previously selected port.
With copilot
But if you don’t know exactly which operator to use, or you want help with configuration, you can use the copilot by writing your request directly in the combined search/query field at the top of the panel.
Because the query is linked to a data table, you can reference elements of that table in your query. The easiest way is to directly state the column name(s) you want to work with. But the model is also smart enough to understand abstractions. So, for example, instead of writing the full name of "total product cost", it is also okay to say "price column". This makes it easier to write meaningful statements without needing to keep track of the exact column names. It's also in the nature of large language models to be relatively forgiving of spelling errors.
The language understanding capability of the model has another major benefit when working with the copilot: it can also understand prompts written in a language other than English. Both
- the German prompt "Sortiere die Daten aufsteigend nach Preis" and
- the Japanese prompt "データを価格の昇順に並べる"
will result in the same outcome as the English version, "order the data ascending by price".
The copilot can understand the content of the table and work with more abstract commands, for example:
- “Remove the ID column”
- “Sort by the label” (this works even if the label column has a completely different name, because the copilot also checks the metadata)
The new operators are sent directly to the canvas and connected to the previously selected port.
If in addition the auto-update feature is activated,
- the newly placed operator(s) will execute on the spot, so you can inspect the results directly, and
- the output port of the last added operator will be selected.
Then you get a non-stop process-building chain without the need to manually execute your workflow.
Example use cases
Data transformation
You’ve asked your data team for an overview of the latest customer sales data, so you can analyze recent behavior changes and start building a trend prediction model.
Unfortunately, the table is very messy and contains a lot of unnecessary information. It has confusing column names and the key values you’re looking for are not yet calculated.
This might take a while to sort out, and it will require a lot of operators that need to be configured individually.
Simply stating your intended actions is a lot easier and faster.
For an initial investigation, you want to focus on customers having a discount and you want to trim the data. A sample command sequence could be:
- “Keep only rows where the discount is greater than 0”
- “Keep only rows from 1 to 1000”
The first query adds a filter operator configured so that only customers with a discount value greater than zero remain. If multiple columns contain "discount" in their name, you can either be more specific by using the exact column name or change the selected column manually. Such refinements are much easier once you know where to look, even if you have never used Workflow Designer before.
The second query returns a Filter range operator that limits your data to the first one thousand rows.
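To make the effect of these two prompts concrete, here is a minimal sketch in plain Python. It is not the copilot's implementation and does not use any Altair AI Cloud API; the sample rows, the column names, and the helper functions `keep_discounted` and `keep_row_range` are hypothetical, and the 1-based, inclusive reading of "from 1 to 1000" is an assumption.

```python
# Hypothetical sample rows; the column names are assumptions for illustration.
rows = [
    {"customer_id": 1, "discount": 0.0},
    {"customer_id": 2, "discount": 0.1},
    {"customer_id": 3, "discount": 0.0},
    {"customer_id": 4, "discount": 0.25},
    {"customer_id": 5, "discount": 0.05},
]


def keep_discounted(table):
    """Mirror of 'Keep only rows where the discount is greater than 0'."""
    return [row for row in table if row["discount"] > 0]


def keep_row_range(table, first, last):
    """Mirror of 'Keep only rows from <first> to <last>'.

    Assumes 1-based, inclusive bounds; a range longer than the table
    simply returns all remaining rows.
    """
    return table[first - 1:last]


discounted = keep_discounted(rows)
trimmed = keep_row_range(discounted, 1, 1000)
```

In this sketch, `trimmed` holds the discounted customers only, capped at one thousand rows, which is the state of the data the following steps would build on.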
The next step is to create the target you’re interested in and prepare it for model building.
For example, the customer data contains the number of sold products and their price but is missing a total amount per transaction.
Creating a new column is again a simple statement of what you want to do:
- “Create a new column called "Total Price" by multiplying Amount and Price”
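As a rough illustration of what this prompt asks for, the calculation can be sketched in plain Python. The sample transactions and the helper `add_total_price` are hypothetical; the column names "Amount" and "Price" are taken from the prompt's wording, and the real operator the copilot places may be configured differently.

```python
# Hypothetical transactions; "Amount" and "Price" follow the prompt's wording.
transactions = [
    {"Amount": 3, "Price": 20.0},
    {"Amount": 1, "Price": 35.5},
]


def add_total_price(table):
    """Return new rows with an extra "Total Price" column (Amount * Price).

    The input rows are left unchanged.
    """
    return [{**row, "Total Price": row["Amount"] * row["Price"]} for row in table]


with_total = add_total_price(transactions)
```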
Data exploration and question answering
Another strength of the copilot system is that it allows very quick question answering and exploration of the data set.
The copilot is useful for building aggregations or extracting summary information out of the data.
Continuing with our example data from before, we can now start to ask some exploratory questions, like:
- “What’s the highest sales price?”
- “How many different customer IDs?”
- “Sum the total price by customer ID”
- “Calculate the impact on the customer rating”
The first three commands return Aggregate operators that answer your questions directly. The last one creates a whole chain of operators that returns a weight table with the correlations between the other columns and the customer rating. This helps to identify important driving factors for customer satisfaction.
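The three aggregation questions correspond to familiar summary calculations. The following plain-Python sketch shows one way to compute them; it is an illustration only, not how the Aggregate operator works internally, and the sample data and column names (`customer_id`, `price`, `total_price`) are assumptions. The correlation-based weight table from the last prompt is not covered here.

```python
from collections import defaultdict

# Hypothetical transactions; the column names are assumptions for illustration.
transactions = [
    {"customer_id": "C1", "price": 20.0, "total_price": 60.0},
    {"customer_id": "C2", "price": 35.0, "total_price": 35.0},
    {"customer_id": "C1", "price": 15.0, "total_price": 30.0},
]

# "What's the highest sales price?"
highest_price = max(row["price"] for row in transactions)

# "How many different customer IDs?"
distinct_customers = len({row["customer_id"] for row in transactions})

# "Sum the total price by customer ID"
total_by_customer = defaultdict(float)
for row in transactions:
    total_by_customer[row["customer_id"]] += row["total_price"]
```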
Tips and tricks
Here are some general hints on how to get the most out of the copilot.
- It’s not necessary to be extra wordy or polite with the copilot.
  The copilot won’t hold a grudge if you don’t add "please" at the end of your query. Phrasing your queries more explicitly can help, also to organize your own thoughts, but it’s also totally fine to keep the queries very brief and cut out all extra clutter.
- Remember that the copilot can understand context and can abstract common meanings.
  - If there’s a column called "cost", then "price", "total" or similar phrases are enough to allow the copilot to identify which column you mean.
  - It can also identify columns by their content (especially if they contain categories).
- Because translation was initially one of the most prominent use cases for natural language models, it might not be a surprise that the copilot can also understand more languages than just English. If you feel more comfortable writing prompts in your native language, feel free to do so.
- You can always combine multiple tasks in a single prompt. Simpler prompts in particular can be quickly strung together in a single sentence:
  "Change the role of the customer number to ID and then rename it to identifier" is a totally valid prompt that will result in the two operations chained together.