What's New in RapidMiner Studio 9.2.0?

Released: February 05th, 2019

The following describes the new features, enhancements, and bug fixes in RapidMiner Studio 9.2.0:

New Features

  • Replaced old charts and advanced charts with new, powerful HTML5 visualizations. There are lots of new plot types and capabilities to explore! Main features:

    • New chart types: Step Line, Spline, Area, Step Area, Spline Area, Range (Line, Step, Column, Errorbars), Streamgraph, Bellcurve, Funnel, Pyramid, Heatmap, Treemap, Sankey, Packed Bubble, Vector, Wordcloud
    • Enhanced existing chart types and new ones with features like multi-attribute selection, grouping, stacking options, inversion, and displaying as a radar chart (for select charts)
    • Multiple y axes supported
    • Added plotline support (annotated marker lines on x/y/z axes)
    • Chart configurations are now automatically saved. You configure a chart for your data set, close Studio, come back the next day, and when you look at the data again, the same chart you configured will be there again!
    • Some plots can be combined with other plots. You can add as many of those combinable plots to a single chart as you want!
    • Lets you quickly select the basic settings to get started, but also fine-tune even minor chart details
    • Have multiple series in a single chart (e.g. something grouped by labels)? Try hovering over and clicking the legend items to highlight and hide the respective series!
  • Auto Model

    • Added support for textual data
    • Added feature selection for clustering
    • Added Fast Large Margin and Multiclass Logistic Regression learners
    • Improved feature extraction from dates (calculate all pairwise differences and differences to today)
    • Added predictions vs. label chart for regression
    • Added correlation as performance criterion for regression
    • Explain predictions is now optional in Auto Model and is only automatically activated for smaller data sets
    • Significantly improved runtimes of Auto Model for larger data sets
  • New text analysis operator for feature extraction for text, adding sentiments, and language detection: Text Vectorization

  • New operator for assigning batch numbers to data rows: Generate Batch

  • Cloud Connectivity

    • Added connectivity to Azure Data Lake Storage (Gen 1):
      • Read Azure Data Lake Storage
      • Loop Azure Data Lake Storage
      • Write Azure Data Lake Storage
  • Time Series

    • New Operator: Extract Coefficients (Polynomial Fit)

      • It fits a polynomial function to the time series and provides coefficients and (if selected) the discrepancy as features
      • It also provides the fitted function evaluated on the index values of the time series on an additional output port
    • New Operator: Exponential Smoothing

      • It smooths a time series by a factor alpha
    • New Operator: Lag

      • It lags (shifts) time series attributes relative to each other
  • Introducing the new Create ExampleSet operator, which creates example sets from functions, numbers, dates, etc. for quick prototyping
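
To illustrate what the new Exponential Smoothing and Lag operators in the Time Series list above compute, here is a minimal Python sketch. It is not the operators' implementation: the function names are hypothetical, the smoothing formula is the textbook simple exponential smoothing seeded with the first value, and the padding of lagged positions with None is an assumption made for this example.

```python
from typing import List, Optional


def exponential_smoothing(values: List[float], alpha: float) -> List[float]:
    """Textbook simple exponential smoothing (assumed, not taken from RapidMiner).

    s[0] = x[0]
    s[t] = alpha * x[t] + (1 - alpha) * s[t - 1]
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    smoothed: List[float] = []
    for x in values:
        if not smoothed:
            # Assumption: seed the smoothed series with the first observation.
            smoothed.append(x)
        else:
            smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


def lag(values: List[float], steps: int) -> List[Optional[float]]:
    """Shift a series forward by `steps` positions.

    Positions that have no predecessor are filled with None (an assumption;
    a real tool would typically use a missing-value marker).
    """
    if steps < 0:
        raise ValueError("steps must be non-negative")
    if steps == 0:
        return list(values)
    padding: List[Optional[float]] = [None] * min(steps, len(values))
    return padding + list(values[:-steps])


series = [10.0, 20.0, 30.0]
print(exponential_smoothing(series, alpha=0.5))  # [10.0, 15.0, 22.5]
print(lag(series, steps=1))                      # [None, 10.0, 20.0]
```

A smaller alpha gives more weight to the history and therefore a smoother result. Lagging one attribute against another lines up each value with an earlier value of the other series.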

Enhancements

  • Improved CPU utilization of parallel processes (e.g. when using nested Loops).
  • Added a pre-run check and better error descriptions for the "wrong predictions" and "correct predictions" conditions of Filter Examples

  • Attribute selection dialogs and comboboxes now display the type (numeric, nominal, date_time) of the attribute

  • Attribute selection dialogs now properly sort the available attributes on the left in a human-readable way

  • All "Legacy Result Access" operators are now deprecated. Existing processes that still use these operators will continue to work. Please use the operators Store and Retrieve in future processes.

    • Use Retrieve instead of Read Model, Read Clustering, Read Weights, Read Constructions, Read Performance, Read Parameters, Read Threshold and Read.
    • Use Store instead of Write Model, Write Clustering, Write Weights, Write Constructions, Write Performance, Write Parameters, Write Threshold and Write.
  • Improved meta data generation and propagation for several source operators

  • Combobox popups are now as wide as their content needs them to be, regardless of the actual combobox width. This can look a bit funny sometimes, but it's much more useful to be able to actually read the contents than go for nicer looks.

  • Better information in Auto Model for cases and settings where longer runtimes can be expected

  • Dialogs opened by extensions are no longer displaying a warning icon next to them

  • Changed style of tutorials to incorporate RapidMiner Academy

  • Improved default parameters for Gradient Boosted Trees
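
The "human-readable" attribute ordering mentioned in the list above is commonly achieved with a natural sort, where digit runs are compared as numbers rather than character by character. The sketch below shows that technique in Python. Whether Studio uses exactly this rule is an assumption, and the `natural_key` helper and the attribute names are hypothetical.

```python
import re
from typing import List, Tuple, Union


def natural_key(name: str) -> List[Tuple[int, Union[int, str]]]:
    """Build a sort key that compares digit runs numerically.

    Each chunk is tagged (0, number) or (1, lowercased text) so that numbers
    and text are never compared with each other directly.
    """
    chunks = [c for c in re.split(r"(\d+)", name) if c]
    return [(0, int(c)) if c.isdecimal() else (1, c.lower()) for c in chunks]


attributes = ["att10", "att2", "att1"]
print(sorted(attributes))                   # ['att1', 'att10', 'att2']
print(sorted(attributes, key=natural_key))  # ['att1', 'att2', 'att10']
```

Plain lexicographic sorting places "att10" before "att2" because it compares "1" with "2". The natural key compares 10 with 2 instead.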

Bugfixes

  • Cross Validation now applies Bessel's correction on the performance variance and standard deviation.
  • Connecting operators in an infinite loop no longer freezes RapidMiner Studio.
  • Fixed unhelpful error message: "Error while training the H2O model: {0}"
  • Fixed a rare bug in Log operator where a process seemingly was not stopping when it was done.
  • Fixed cause for sliders sometimes looking a bit broken.
  • Fixed rare bug in feature set navigator in Auto Model which could lead to misaligned plots and tables
  • Fixed rare bug in Automatic Feature Extraction which could lead to a wrong selection of final feature set
  • Fixed bug for data sets from read-only repositories shown in the results view and opened in Auto Model
  • Time Series
    • Fixed calculation of first quartile, median and third quartile in Extract Aggregates
    • Fixed a bug for all attributes selection when a filter type is selected which checks all Examples individually.
    • Fixed a bug in Apply Forecast operator, in case it was executed inside a parallel operator.
    • Fixed a bug for Windowing and Process Windows in case parameters were wrongly configured
  • Fixed Cross Validation returning the test example set with duplicate rows if multiple performance vectors were connected inside the cross validation. This did not affect any performance metrics.
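
The Bessel's correction item at the top of this list changes the divisor of the variance from n to n - 1, where n is the number of values (for Cross Validation, presumably the per-fold performance values). The following Python sketch shows the difference. The `std_dev` helper and the fold accuracies are made up for illustration and are not taken from RapidMiner.

```python
import math
from typing import Sequence


def std_dev(values: Sequence[float], bessel: bool = True) -> float:
    """Standard deviation with (n - 1) or without (n) Bessel's correction."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    divisor = n - 1 if bessel else n
    return math.sqrt(squared_deviations / divisor)


# Hypothetical accuracies from a 3-fold cross validation.
fold_accuracies = [0.80, 0.85, 0.90]

uncorrected = std_dev(fold_accuracies, bessel=False)  # divides by n
corrected = std_dev(fold_accuracies, bessel=True)     # divides by n - 1
print(f"uncorrected: {uncorrected:.4f}, corrected: {corrected:.4f}")
```

The corrected value is always the larger of the two, and the gap is most visible for a small number of folds.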

Development

  • Added utility class PersistentContentMapperStore. This class can be used to store arbitrary information in the local user cache, such as configurations of results for repository objects, or even things identified via a hash. One example is the HTML5 charts, which save their configuration that way.
  • Added utility class ColorChooserUtilities for opening an HSL color chooser
  • Added DistinctColorSlider and LinearGradientColorSlider UI components where users can select and change a list of distinct colors / linear color gradients conveniently
  • Added ExtendedJListTransferHandler class that allows re-ordering inside a JList via drag and drop
  • Added new interface CleanupRequiringComponent which GUI result components can use to indicate they need to clean something up after a result has been closed by the user. It is called whenever a result tab has been closed.
  • Added "BETA" tag support for result visualization cards (the cards shown on the left when viewing results in the Results view). Add a gui.cards.I18N_KEY.beta = true flag to your i18n properties to mark a result renderer as a Beta version.
  • Packages com.rapidminer.gui.plotter and com.rapidminer.gui.new_plotter were deprecated and will be removed in the future.