Creating a Wrangler Configuration

This section describes how to create a wrangler configuration in iWay Big Data Integrator (iBDI). A wrangler gives structure to unstructured data by assigning a schema to the data and then conforming the data to that schema.
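
As a rough illustration of what conforming data to a schema means, the following Python sketch maps delimited text onto a list of column names. It is not how iBDI implements the wrangler. The function name conform, the sample input, and the handling of short or long rows (pad with None, drop extra fields) are assumptions made for this example only.

```python
# Hypothetical schema: ordered column names, mirroring the First_name and
# Last_name columns that appear in the console output later in this procedure.
SCHEMA = ["First_name", "Last_name"]


def conform(raw_text, schema, column_delimiter=",", row_delimiter="\n"):
    """Split raw text into rows and map each row onto the schema's columns."""
    records = []
    for line in raw_text.split(row_delimiter):
        if not line:
            continue  # skip empty lines, such as the one after a trailing newline
        values = line.split(column_delimiter)
        # Assumption for this sketch: pad short rows with None and drop
        # extra fields so that every record has exactly the schema's columns.
        values = (values + [None] * len(schema))[:len(schema)]
        records.append(dict(zip(schema, values)))
    return records


# Made-up sample input, for illustration only.
rows = conform("Ada,Lovelace\nGrace,Hopper\n", SCHEMA)
```

Each element of rows is a dictionary keyed by column name, which is the structured view that the row and column delimiters in the procedure below make possible.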

Procedure: How to Create a Wrangler Configuration

  1. Expand an available iBDI project node in the Project Explorer tab, right-click the Wranglers folder, select New, and then click Other from the context menu.

    The New dialog opens, as shown in the following image.

  2. Type wrangler in the Wizards field to filter the selection, select Wrangler, and then click Next.

    The New Wrangler dialog opens, as shown in the following image.

    The Project Folder field is automatically populated with a folder path, which you can modify as required.

  3. Click Browse to the right of the Source field.

    The Select a Source dialog opens.

  4. Navigate to the file that will be used by the wrangler you are configuring, select the file, and then click OK.

    You are returned to the New Wrangler dialog where the Source field and Name field are now populated.

  5. Click Finish.

    The wrangler opens as a new tab (for example, FlumeData2.wrangler) in the iBDI workspace, as shown in the following image.

  6. Click Customize headers to modify the column header text.
  7. Click the title in a cell to change the column name.
  8. Click OK at the bottom of the dialog to commit the changes to the column headers.
  9. In the Wrangler Details section, enter the name of the Hive or RDBMS schema where the table will be created. The schema must belong to a data source that is currently connected.
  10. In the Table field, enter the name of the table to be created.
  11. Enter or modify the Row Delimiter and Column Delimiter values.
  12. Click Execute to write the changes to the database under the table and schema names that were specified.

    The console pane in the lower part of the iBDI user interface displays the results. For example:

    12/09/2016 03:30:43.590 [INFO] Executing statement: CREATE DATABASE IF NOT EXISTS new_schemax
    12/09/2016 03:30:43.610 [INFO] Executing statement: CREATE EXTERNAL TABLE new_schemax.personnel ( First_name STRING , Last_name STRING ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' LOCATION '/user/root/flume/JR'
    12/09/2016 03:30:43.707 [INFO] Wrangler execution was completed successfully.

  13. Perform the following steps to verify the results:
    1. Return to the Data Source Explorer and browse to the schema and table that you specified in Steps 9 and 10.

      The new table should be available.

    2. Right-click and select Refresh from the context menu or press F5 to view the updated data.

      Note: If the result set is very large, the refresh may take several minutes to complete before the table appears under the menu, as shown in the following image.
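
The console output in Step 12 shows how the values entered in Steps 9 through 11 end up in two HiveQL statements. The following Python sketch reproduces that mapping by assembling the same statements from the schema name, table name, column headers, delimiters, and source location. It is an illustration only, not iBDI code: the function build_statements and its parameter names are hypothetical, and the input values are copied from the console example.

```python
def build_statements(schema, table, columns, column_delimiter, row_delimiter, location):
    """Return the CREATE DATABASE and CREATE EXTERNAL TABLE statements as strings.

    columns is a list of (name, hive_type) pairs, in table order.
    """
    column_list = " , ".join(f"{name} {hive_type}" for name, hive_type in columns)
    create_db = f"CREATE DATABASE IF NOT EXISTS {schema}"
    create_table = (
        f"CREATE EXTERNAL TABLE {schema}.{table} ( {column_list} ) "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{column_delimiter}' "
        f"LINES TERMINATED BY '{row_delimiter}' "
        f"LOCATION '{location}'"
    )
    return [create_db, create_table]


# Values taken from the console example in Step 12.
statements = build_statements(
    schema="new_schemax",                  # Step 9: schema name
    table="personnel",                     # Step 10: table name
    columns=[("First_name", "STRING"), ("Last_name", "STRING")],  # Steps 6-8: headers
    column_delimiter=",",                  # Step 11: Column Delimiter
    row_delimiter="\\n",                   # Step 11: Row Delimiter (literal backslash-n)
    location="/user/root/flume/JR",        # directory of the source selected in Step 4
)

for statement in statements:
    print(statement)
```

Because the table is created as EXTERNAL with a LOCATION clause, the statement only layers a schema over the files already in that directory. The data itself is not copied or moved.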