You are viewing the RapidMiner Developers documentation for version 9.1.

How to create custom tutorials

This article will guide you through:

  • Designing a new tutorial
  • Bundling the tutorial with your extension

Designing a tutorial

Tutorials are self-contained files that can be distributed via extensions or loaded from your local file system. While the file format is simple, it is important to comply with the structure described below.

File format

Tutorials are ZIP archives that use the .tutorial file extension. An archive contains at least one properties file at the top level and one tutorial folder with three files:

my_tutorial.tutorial 
├── group.properties
└── tutorial1
    ├── process.rmp
    ├── tutorial.properties
    └── steps.xml

The group.properties file defines the name of the tutorial chapter and contains a short description. It must be encoded as ISO-8859-1; Unicode encodings such as UTF-8 will most likely not work. Based on the example above, the file's content could look like this:

group.name=My first tutorial chapter
group.description=This is my first tutorial chapter.

The tutorial zip contains a folder for every tutorial; in the example above there is only one folder with the name tutorial1.

The tutorial.properties file defines the name of the tutorial and contains a short description. It must also be encoded as ISO-8859-1. For the example above, the file's content could look like this:

tutorial.name=My first tutorial
tutorial.description=This is my first tutorial.

The tutorial folder must contain exactly one RapidMiner process (*.rmp).

Furthermore, it must contain a steps.xml file that describes the steps of the tutorial. The format of this file is explained in the next section.

In addition to these files, you can bundle arbitrary repository entries. For instance, you can include a data set in your tutorial by copying the corresponding files from your local repository. You can also use an image as a background for your tutorial process by adding the image file after changing its extension to .blob. Examples of both can be found below.

Format of the steps.xml

The steps.xml file determines what is displayed in the tutorial panel that opens on the left side of RapidMiner Studio when a tutorial is in progress. Its content must be built according to these rules:

  • The content of the steps.xml file must be wrapped into a steps tag

      <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
      <steps>
      ...
      </steps>
    
  • The actual tutorial is broken down into an arbitrary number of steps. Each step is wrapped into a step tag and can have the attribute name:

      <step name="step name">...</step>
    

    You can use arbitrary content in each step. It will be shown as is. Don’t use any special HTML or styling since the tutorials will be compiled into other formats. Also avoid characters that have a special meaning in HTML or XML, such as the greater-than sign.

  • There is a special content type for tasks, which works similarly to ordered lists in HTML:

      <tasks>
          <task>...</task>
          <task>...</task>
      </tasks>
    
  • Another important element besides tasks is the info section which can be used to give more background information or describe other important facts or concepts. You can use arbitrary content within an info section including all tags described below. You cannot define tasks or steps within an info section though. The format is:

      <info>...</info>
    
  • At the end of each tutorial there is a section of questions where learners are encouraged to try out different things or can check what they learned in this tutorial. Questions work exactly like the tasks block:

      <questions>
          <question>...</question>
          <question>...</question>
      </questions>
    
  • For a normal text block that is neither a task, an info section, nor a question, use a text tag:

      <text>...</text>
    
  • In content, tasks, or info sections, you can add hyperlinks with the link tag:

      <link url="http://www.domain.com/path/">name</link>
    
  • You can emphasize content with the emph tag:

      <emph>...</emph>
    
  • You can add UI icons to the text with the icon tag. Here you can either use the full resource path to an icon

      <icon>path/to/icon.png</icon>
    

    or use icons delivered with RapidMiner Studio by using the format icon size/icon name, e.g.,

      <icon>16/media_play.png</icon>
    
  • The following tags should be used in all types of content to mark the corresponding types:

    • <op>Operator Name</op>
      
    • <param>Parameter Name</param>
      
    • <value>Parameter Value</value>
      
    • <folder>Repository or Folder Name</folder>
      
    • <file>File or Repository Entry Name</file>
      
    • <ui>Name of UI element</ui>
      

    Hierarchies (operators, folders, ...) are written with '/' as the separator.

  • To open the next tutorial add the following at the end of a tutorial:

      <nextTutorial>START NEXT TUTORIAL</nextTutorial>
    

An example of a steps.xml file is shown in the section A minimal tutorial below.

Developing tutorials

You can load your tutorials locally without bundling them with your extension. Bundling only becomes necessary if you want to distribute your tutorials.

All you need to do is move the tutorial archive into your .RapidMiner/tutorials directory. Note that tutorials stored in that directory supersede bundled tutorials: if you bundle a tutorial with your extension and also have a local copy installed, only the local copy will be loaded.

A minimal tutorial

Let us start with a simple tutorial group with only one tutorial containing a single step.

To create the process file open a new process in RapidMiner Studio and drag a Generate Data operator onto the process canvas:

[Figure: Simple RapidMiner process]

Export the process via the File menu (choose Export Process…) and save it as tutorial1.rmp. Note that you can also leave the process empty and export it the same way if you want your tutorial to start with an empty process.

Next we need a steps.xml file that describes what to do with this process:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<steps>
    <step name="The only step.">
        <text> Have fun with the tasks below!</text>
        <tasks>
            <task>
                <activity>
                    Connect
                </activity> the "out" port of <op>Generate Data</op> with the result port on the right.
            </task> 
            <task>
                <icon>16/media_play.png</icon>
                <activity>Run</activity>
                the process.
            </task>         
        </tasks>
        <questions>
            <question> What happens if you run the process <emph>without</emph> the connection?
            </question>
        </questions>
    </step>
</steps>
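
Because a hand-written steps.xml is easy to get wrong, it can help to parse the file before packaging it. The following is a minimal sketch that uses only the JDK's built-in DOM parser. The class name StepsXmlCheck and the check that the root element is called steps are assumptions made for illustration; this is not a validation that RapidMiner Studio itself performs.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of RapidMiner: sanity-checks a steps.xml before packaging.
public class StepsXmlCheck {

    /** Parses the XML and returns the number of step elements directly below the steps root. */
    static int countSteps(InputStream in) throws Exception {
        // parse() throws a SAXException if the document is not well-formed XML
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(in)
                .getDocumentElement();
        if (!"steps".equals(root.getTagName())) {
            throw new IllegalArgumentException(
                    "root element must be steps, found " + root.getTagName());
        }
        int count = 0;
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE && "step".equals(child.getNodeName())) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // A shortened version of the example above, embedded as a string for the demo
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
                + "<steps><step name=\"The only step.\"><text>Have fun!</text></step></steps>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println("steps found: " + countSteps(in));
    }
}
```

A well-formedness check of this kind catches unescaped characters such as a stray less-than sign or ampersand, but it says nothing about whether Studio accepts the tags used inside each step.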

As the last file for the tutorial folder, we need to create the tutorial.properties file:

tutorial.name=First tutorial
tutorial.description=A minimal tutorial.

We put these three files together in a folder that we simply call 1.

The last file we need is the group.properties file that describes the whole tutorial group (which contains only one tutorial in our simple case):

group.name=First tutorial chapter
group.description=A minimal tutorial chapter.
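
Since both properties files must be ISO-8859-1 encoded, one option is to generate them programmatically instead of relying on an editor's encoding settings. The sketch below is one way to do that with the JDK alone; the class name PropertiesWriter is hypothetical, and the file is written to the current working directory purely for demonstration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of RapidMiner: writes a properties file as ISO-8859-1.
public class PropertiesWriter {

    /** Writes the given key=value lines as ISO-8859-1 text. */
    static void writeLatin1(Path target, List<String> lines) throws IOException {
        // Files.write uses a strict encoder: a character that has no ISO-8859-1
        // representation raises an IOException instead of being silently replaced.
        Files.write(target, lines, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        Path target = Paths.get("group.properties");
        writeLatin1(target, Arrays.asList(
                "group.name=First tutorial chapter",
                "group.description=A minimal tutorial chapter."));
        System.out.println("wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

The same method can be used for tutorial.properties by passing the tutorial.name and tutorial.description lines.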

Finally, we need to create the tutorial archive. For this purpose, create a new ZIP archive containing the group.properties file and the folder 1 using an archiver of your choice. Then change the file extension from *.zip to *.tutorial, e.g., rename sample.zip to sample.tutorial. The final file structure should look as follows:

sample.tutorial 
├── group.properties
└── 1
    ├── tutorial1.rmp
    ├── tutorial.properties
    └── steps.xml

Make sure that the group.properties file and the folder 1 sit at the top level of the archive, i.e., that there is no extra folder between sample.tutorial and these two entries.
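
If you rebuild the archive often, the packaging step can be scripted. The following sketch zips the contents of a working directory (not the directory itself) with java.util.zip, which avoids the extra top-level folder, and lists the resulting entries so the layout can be inspected. The class name TutorialArchiver and the command-line arguments are assumptions for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of RapidMiner: packs a working directory into a .tutorial file.
public class TutorialArchiver {

    /** Zips all regular files below sourceDir, using paths relative to sourceDir as entry names. */
    static void pack(Path sourceDir, Path target) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(sourceDir)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(target);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Path file : files) {
                // ZIP entry names always use forward slashes, also on Windows
                String name = sourceDir.relativize(file).toString().replace('\\', '/');
                zip.putNextEntry(new ZipEntry(name));
                Files.copy(file, zip);
                zip.closeEntry();
            }
        }
    }

    /** Returns the entry names of an existing archive, e.g. to verify its layout. */
    static List<String> entryNames(Path archive) throws IOException {
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            return zip.stream().map(ZipEntry::getName).collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: TutorialArchiver <working directory> <target .tutorial file>");
            return;
        }
        Path target = Paths.get(args[1]);
        pack(Paths.get(args[0]), target);
        entryNames(target).forEach(System.out::println);
    }
}
```

For the example above, the listing should show the properties file of the group without any folder prefix and the three tutorial files prefixed with 1/. The target file should be placed outside the working directory so that it is not packed into itself.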

To load the newly created tutorial chapter, copy the tutorial file to .RapidMiner/tutorials and restart RapidMiner Studio. In the Learn menu, you should now see the newly created chapter:

[Figure: Tutorial selection]

Selecting the first (and only) tutorial of the new chapter should open the process we created in the first step and show the step on the left-hand side.

[Figure: Tutorial panel]

Including repository entries

Let us now add a custom data set to use in the tutorial. To do this, we first need this data to be in the repository of RapidMiner Studio.

For example, we can get some custom data into our repository as follows: as before, drag the Generate Data operator onto the process panel. Then add a Store operator and connect the operators.

[Figure: Data storer]

Configure the Store operator to store the data set in your repository, e.g., as //Local Repository/Custom data, and run the process.

Now right-click the newly created entry in the repository and choose Open in file browser. There are three files with the name "Custom data": Custom data.ioo, Custom data.md and Custom data.properties. Copy the .ioo file (the actual data) and the .md file (the meta data) and add them to the tutorial archive so that it has the following structure:

sample.tutorial 
├── group.properties
└── 1
    ├── tutorial1.rmp
    ├── tutorial.properties
    ├── steps.xml
    ├── Custom data.ioo
    └── Custom data.md

Restart RapidMiner Studio and have a look at the Samples repository:

[Figure: Repository structure]

You can see that the Custom data is now part of the tutorial folder 1. We can now add a second step to the steps.xml using the new data:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<steps>
    <step name="The first step.">
        <text> Have fun with the tasks below!</text>
        <tasks>
            <task>
                <activity>
                    Connect
                </activity> the "out" port of <op>Generate Data</op> with the result port on the right.
            </task> 
            <task>
                <icon>16/media_play.png</icon>
                <activity>Run</activity>
                the process.
            </task>         
        </tasks>
        <questions>
            <question> What happens if you run the process <emph>without</emph> the connection?
            </question>
        </questions>
    </step>
    <step name="The second step.">
        <tasks>
            <task>
                <activity>
                    Drag
                </activity> the data from the Repository at <file>//Samples/Tutorials/sample/1/Custom data</file> into the process.
            </task> 
            <task>
                <activity>
                    Connect
                </activity> the "out" port of <op>Retrieve</op> with the second result port on the right.
            </task> 
            <task>
                <icon>16/media_play.png</icon>
                <activity>Run</activity>
                the process.
            </task>         
        </tasks>
        <questions>
            <question> What is the difference between the two resulting data sets?
            </question>
        </questions>
    </step>
</steps>

The result (after restarting RapidMiner Studio) looks as follows:

[Figure: Second tutorial]

You can also use the Custom data in your process file tutorial1.rmp via a Retrieve operator, but you must use the absolute path //Samples/Tutorials/sample/1/Custom data.

Adding a background image

You can use a background image to give hints on how to fill the process canvas, for example:

[Figure: Tutorial background]

The easiest way to construct such an image is to take a screenshot of the finished tutorial process and adapt it with the graphics program of your choice.

Change the file extension of the image to .blob (for example, rename background.png to background.blob) and add the .blob file to the folder 1:

sample.tutorial 
├── group.properties
└── 1
    ├── tutorial1.rmp
    ├── tutorial.properties
    ├── steps.xml
    ├── Custom data.ioo
    ├── Custom data.md
    └── background.blob

Furthermore, we tell our process tutorial1.rmp to include the background image by adding the line

<background height="-1" location="//Samples/Tutorials/sample/1/background" width="-1" x="72" y="45"/>

at the end of the inner process, i.e., the last lines of the file should look like this:

      <background height="-1" location="//Samples/Tutorials/sample/1/background" width="-1" x="72" y="45"/>
    </process>
  </operator>
</process>

You can adjust the position of the background image by adjusting the x and y values. The easiest way to try different values is to use the XML panel in RapidMiner Studio and then to adjust the tutorial1.rmp file with the optimal values.
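
The location attribute in the example follows the pattern //Samples/Tutorials/archive name/folder/entry name, with file extensions dropped. The sketch below builds the background line from these parts. The pattern is inferred from the example in this article only, and the class BackgroundTag with its methods is hypothetical, so treat the output as a starting point to verify in Studio.

```java
// Hypothetical helper, not part of RapidMiner: builds the background line for a tutorial process.
public class BackgroundTag {

    /** Removes the last file extension, e.g. "background.blob" becomes "background". */
    static String stripExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    /** Builds the repository location, assuming the pattern seen in the example above. */
    static String location(String archiveFile, String folder, String entryFile) {
        return "//Samples/Tutorials/" + stripExtension(archiveFile)
                + "/" + folder + "/" + stripExtension(entryFile);
    }

    /** Formats the background element; height and width stay at -1 as in the example. */
    static String backgroundElement(String location, int x, int y) {
        return String.format(
                "<background height=\"-1\" location=\"%s\" width=\"-1\" x=\"%d\" y=\"%d\"/>",
                location, x, y);
    }

    public static void main(String[] args) {
        String location = location("sample.tutorial", "1", "background.blob");
        System.out.println(backgroundElement(location, 72, 45));
    }
}
```

Running the example prints the same line that was added to tutorial1.rmp above; different x and y values can then be tried without retyping the whole element.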

The tutorial process now looks like this:

[Figure: Tutorial with background]

Now you can add a second tutorial, for example:

sample.tutorial 
├── group.properties
├── 1
│   ├── tutorial1.rmp
│   ├── tutorial.properties
│   ├── steps.xml
│   ├── Custom data.ioo
│   ├── Custom data.md
│   └── background.blob
└── 2
    ├── tutorial2.rmp
    ├── tutorial.properties
    ├── steps.xml
    …

Don't forget to add

<nextTutorial>START NEXT TUTORIAL</nextTutorial>

at the end of the last step of your first tutorial.

Bundling tutorials

Bundling tutorials with your extension is a straightforward process. All you have to do is add your tutorial archives as resources and register them in the initialization code of your extension.

If you have no experience with writing extensions yet, please refer to our Creating your Own Extension guide. Sections 1-3 cover all you need to know to build an extension that can be used to distribute tutorials.

Adding tutorials as resources

By convention, build tools such as Maven and Gradle look for resources in the src/main/resources directory. We recommend using this structure for RapidMiner extensions as well.

Let us assume that you chose org.myorg.myextension as group id. Then your resources should be located under src/main/resources/org/myorg/myextension. Note that RapidMiner will add the tutorial directory to the path automatically. Thus, you could bundle the tutorial created above as …/org/myorg/myextension/tutorial/sample.tutorial:

my_extension
├── README.md
├── build.gradle
├── …
├── src
│   └── main
│       ├── java
│       │   └── …
│       └── resources
│           ├── org
│           │   └── myorg
│           │       └── myextension
│           │           └── tutorial
│           │               └── sample.tutorial
│           └── …
└── …

However, RapidMiner will not search for resources in that directory unless you register the location as a resource source. This is easily done in the initialization code of your extension: add the following line to the initPlugin() method:

/**
 * This method will be called directly after the extension is initialized. This is the first
 * hook during start up. No initialization of the operators or renderers has taken place when
 * this is called.
 */
public static void initPlugin() {
    // register extension resources
    Tools.addResourceSource(new ResourceSource(PluginInitMyExtension.class.getClassLoader(), 
        "org/myorg/myextension/"));
}

Now you can register the tutorial via the so-called tutorial registry. To register the tutorial designed above, you could add another line to the plugin initialization code:

public static void initPlugin() {
    // register extension resources
    Tools.addResourceSource(new ResourceSource(PluginInitMyExtension.class.getClassLoader(), 
        "org/myorg/myextension/"));
    // register sample tutorial
    TutorialRegistry.INSTANCE.register("sample");
}
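
If a registered tutorial does not show up, a first thing to rule out is that the archive is missing from the built extension. The sketch below checks whether the expected resource can be found by a class loader, assuming the layout described above (resource root plus the tutorial directory plus the registered name with the .tutorial extension). The class BundledTutorialCheck is hypothetical and uses only the JDK; it does not call any RapidMiner API.

```java
// Hypothetical helper, not part of RapidMiner: checks that a tutorial archive is on the classpath.
public class BundledTutorialCheck {

    /** Builds the resource path, assuming archives live in the tutorial directory below the root. */
    static String resourcePath(String resourceRoot, String tutorialName) {
        return resourceRoot + "tutorial/" + tutorialName + ".tutorial";
    }

    /** Returns true if the class loader can locate the tutorial archive. */
    static boolean isOnClasspath(ClassLoader loader, String resourceRoot, String tutorialName) {
        return loader.getResource(resourcePath(resourceRoot, tutorialName)) != null;
    }

    public static void main(String[] args) {
        ClassLoader loader = BundledTutorialCheck.class.getClassLoader();
        String root = "org/myorg/myextension/";
        System.out.println(resourcePath(root, "sample") + " found: "
                + isOnClasspath(loader, root, "sample"));
    }
}
```

Such a check fits naturally into a unit test of the extension, where the class loader of the test sees the contents of src/main/resources.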

Testing the tutorials

There are no further special steps involved in testing the bundled tutorials. All you need to do is to build a new version of your extension, e.g., via the command gradle clean installExtension.

But keep in mind that tutorials stored in your .RapidMiner directory override bundled tutorials of the same name. Thus, make sure to remove all working copies of your tutorials before starting RapidMiner Studio.
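
Leftover working copies are easy to forget, so a small check before starting Studio can help. The following sketch lists the tutorial files in a given directory whose names match the tutorials you bundle. It assumes that the .RapidMiner directory is located in the user's home directory; the class LocalTutorialCheck is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of RapidMiner: finds local copies that override bundled tutorials.
public class LocalTutorialCheck {

    private static final String EXTENSION = ".tutorial";

    /** Returns the bundled tutorial names for which a .tutorial file exists in tutorialsDir. */
    static List<String> shadowed(Path tutorialsDir, Collection<String> bundledNames)
            throws IOException {
        if (!Files.isDirectory(tutorialsDir)) {
            return Collections.emptyList();
        }
        try (Stream<Path> files = Files.list(tutorialsDir)) {
            return files.map(p -> p.getFileName().toString())
                    .filter(name -> name.endsWith(EXTENSION))
                    .map(name -> name.substring(0, name.length() - EXTENSION.length()))
                    .filter(bundledNames::contains)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the .RapidMiner directory lives in the user's home directory
        Path dir = Paths.get(System.getProperty("user.home"), ".RapidMiner", "tutorials");
        System.out.println("local copies overriding bundled tutorials: "
                + shadowed(dir, Arrays.asList("sample")));
    }
}
```

An empty list means that no local copy with the same name was found in that directory.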