Categories

Versions

You are viewing the RapidMiner Studio documentation for version 9.9 - Check here for latest version

Execute SQL (JDBC Connectors)

Synopsis

This operator executes the specified SQL statement on the specified database.

Description

The Execute SQL operator executes the specified SQL statement on the specified SQL database. The SQL query can be specified through the query parameter. If the SQL query is in a file then the path of that file can be specified through the query file parameter. Please note that this operator cannot be used for loading data from databases. It can be used for executing SQL statements like CREATE or ADD etc. In order to load data from an SQL database, please use the Read Database operator. You need to have at least a basic understanding of databases, database connections and queries in order to use this operator properly. Please go through the parameters and the attached Example Process to understand the working of this operator.

Differentiation

Read Database

The Read Database operator is used for loading data from a database into RapidMiner. The Execute SQL operator cannot be used for loading data from databases. It can be used for executing SQL statements like CREATE or ADD etc on the database.

Input

  • connection (Connection)

    This input port expects a Connection object if any. See the parameter connection entry for more information.

  • through (IOObject)

    It is not compulsory to connect any object with this port. Any object connected at this port is delivered without any modifications to the output port. This operator can have multiple inputs. When one input is connected, another through input port becomes available which is ready to accept another input (if any). The order of inputs remains the same. The object supplied at the first through input port of the Execute SQL operator is available at the first through output port.

Output

  • connection (Connection)

    This output port delivers the Connection object from the input port. If the input port is not connected the port delivers nothing.

  • through (IOObject)

    The objects that were given as input are passed without changing to the output through this port. It is not compulsory to connect this port to any other port; the SQL command is executed even if this port is left without connections. The Execute SQL operator can have multiple outputs. When one output is connected, another through output port becomes available which is ready to deliver another output (if any). The order of outputs remains the same. The object delivered at the first through input port of the Execute SQL operator is delivered at the first through output port

Parameters

  • define_connectionThis parameter indicates how the database connection should be specified. It gives you four options: repository, predefined, url and jndi. The parameter is not visible if the connection input port is connected. Range: selection
  • connection_entryThis parameter is only available when the define connection parameter is set to repository. This parameter is used to specify a repository location that represents a connection entry. The connection can also be provided using the connection input port. Range: string
  • connectionThis parameter is only available when the define connection parameter is set to predefined. This parameter is used for connecting to a database using a predefined connection. You can have many predefined connections. You can choose one of them using the drop down list. You can add a new connections or modify previous connections using the button next to the drop down list. You may also accomplish this by clicking on Manage Database Connections... from the Tools menu in the main window. A new window appears. This window asks for several details e.g. Host, Port, Database system, schema, username and password. The Test button in this new window will allow you to check whether the connection can be made. Save the connection once the test is successful. After saving a new connection, it can be chosen from the drop down list of the connection parameter. You need to have a basic understanding of databases for configuring a connection. Range: selection
  • database_systemThis parameter is only available when the define connection parameter is set to url. This parameter is used for selecting the database system in use. It can have one of the following values: MySQL, PostgreSQL, Sybase, HSQLDB, ODBC Bridge (e.g. Access), Microsoft SQL Server (JTDS), Ingres, Oracle. Range: selection
  • database_urlThis parameter is only available when the define connection parameter is set to url. This parameter is used for defining the URL connection string for the database, e.g. 'jdbc:mysql://foo.bar:portnr/database'. Range: string
  • usernameThis parameter is only available when the define connection parameter is set to url. This parameter is used for specifying the username of the database. Range: string
  • passwordThis parameter is only available when the define connection parameter is set to url. This parameter is used for specifying the password of the database. Range: string
  • jndi_nameThis parameter is only available when the define connection parameter is set to jndi. This parameter is used for specifying the JNDI name for a data source. Range: string
  • queryThis parameter is used for specifying the SQL query which will be executed on the specified database. Range: string
  • query_fileThis parameter is used for selecting the file that contains the SQL query which will be executed on the specified database. Long queries are usually stored in files. Storing queries in files can also enhance reusability. Range: filename
  • prepare_statementIf checked, the statement is prepared, and '?' can be filled in using the parameters parameter. Range: boolean
  • parametersThis parameter specifies the Parameters to insert into '?' placeholders when the statement is prepared. Range: enumeration

Tutorial Processes

Creating a new table in mySQL database

The Execute SQL operator is used for creating a new table in an existing mySQL database. The define connection parameter is set to predefined. The define connection parameter was configured using the button next to the drop down list. The name of the connection was set to 'mySQLconn'. The following values were set in the connection parameter's wizard. The Database system was set to 'mySQL'. The Host was set to 'localhost'. The Port was set to '3306'. The Database scheme was set to 'golf'; this is the name of the database. The User was set to 'root'. No password was provided. You will need a password if your database is password protected. Set all the values and test the connection. Make sure that the connection works.

The query parameter is set to the following query: 'CREATE TABLE Weather(Temperature INTEGER)'. This query creates a new table named Weather in the 'golf' database. This table has one integer attribute named Temperature. Run the process, you will not see any results in RapidMiner because this operator did not return anything. It simply executed the query on the specified database. So, in order to see the changes you can open the database and verify that a new table has been created.