Adding a new PDI Step
Adding and registering a new Pentaho Data Integration (PDI) Step is a straightforward process, streamlined by the design of the BICenter project. A list of all available PDI Steps can be found in this link.
After selecting the Step to add, follow the process below to add it to the system.
- Register the new step in the public > editor > diagrameditor.xml file.
- Add a new object to conf > configuration.json, under the subsection pertaining to the new component's type, with the desired component properties.
- If necessary, add extra functionality to the submitClick method in app > assets > javascripts > step > stepController.js or the applyChanges method in app > assets > javascripts > services > step.js.
- Create a new class in app > diSdk > step > parser with the appropriate name.
- If the component requires pre-processing or extra logic, you can alter the decodeStep method in app > diSdk > step > AbstractStep.java.
The following is an example of how to add the CSVFileInput component. For more information about this component's functioning you can check out this link.
- First, we have to add the following line to public > editor > diagrameditor.xml:
<add as="CSV File Input" template="CSVInput" icon="/assets/images/editor/rectangle.gif"/>
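The template value is the key that ties this registration to the rest of the system: the next step reuses it as the name of the object in conf > configuration.json. Below is a minimal sketch of checking that link, assuming both values are available as plain strings. The class TemplateNameCheck and the method templateOf are hypothetical helpers for illustration, not part of BICenter.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

class TemplateNameCheck {

    /** Returns the template attribute of a single <add .../> element, or "" if it is absent. */
    static String templateOf(String addElementXml) {
        try {
            Element add = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(addElementXml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            return add.getAttribute("template");
        } catch (Exception e) {
            // Malformed XML or parser setup failure: report it as a bad argument.
            throw new IllegalArgumentException("Could not parse: " + addElementXml, e);
        }
    }

    public static void main(String[] args) {
        String xmlLine = "<add as=\"CSV File Input\" template=\"CSVInput\" "
                + "icon=\"/assets/images/editor/rectangle.gif\"/>";
        // Assumed to be the "name" of the object registered in configuration.json.
        String configName = "CSVInput";

        String template = templateOf(xmlLine);
        if (!template.equals(configName)) {
            throw new IllegalStateException(
                    "template '" + template + "' does not match name '" + configName + "'");
        }
        System.out.println("template and name match: " + template);
    }
}
```

Running the sketch prints `template and name match: CSVInput`.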
- Next, we have to register the component's properties in the conf > configuration.json file. As the CSV File Input step is an Input type component, we register it under the Input Components. The object's name must be the same one we used in the template field in the prior step. Note that the shortName field should follow the naming specified by the Pentaho Kettle library (in our specific case, this can be seen in this link). In the componentProperties array we can specify the multiple inputs and fields of our component.
{
  "name": "CSVInput",
  "description": "CSV File Input",
  "shortName": "csvInput",
  "componentProperties": [
    {
      "name": "Step Name",
      "shortName": "stepName",
      "type": "input"
    },
    {
      "name": "File Name",
      "shortName": "fileName",
      "type": "fileinput"
    },
    {
      "name": "Delimiter",
      "shortName": "delimiter",
      "type": "input"
    },
    {
      "name": "Enclosure",
      "shortName": "enclosure",
      "type": "input"
    },
    {
      "name": "NIO Buffer Size",
      "shortName": "bufferSize",
      "type": "number"
    },
    {
      "name": "File Encoding",
      "shortName": "encoding",
      "type": "select",
      "componentMetadatas": [
        {
          "value": "UTF-8",
          "name": "UTF-8"
        },
        {
          "value": "ANSI",
          "name": "ANSI"
        }
      ]
    },
    {
      "name": "Lazy Conversion?",
      "shortName": "lazyConversionActive",
      "type": "checkbox"
    }
  ]
}
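The shortName values matter because server-side code finds properties by them. The decodeStep fragment later on this page matches them with equalsIgnoreCase, so a lookup for "Filename" finds the property registered as "fileName". Below is a minimal sketch of that lookup. ComponentProperty here is a hypothetical stand-in holding only the three JSON fields, not BICenter's actual model class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class ShortNameLookup {

    /** Hypothetical stand-in for one entry of the componentProperties array. */
    static final class ComponentProperty {
        final String name;
        final String shortName;
        final String type;

        ComponentProperty(String name, String shortName, String type) {
            this.name = name;
            this.shortName = shortName;
            this.type = type;
        }
    }

    /** Finds the first property whose shortName matches, ignoring case. */
    static Optional<ComponentProperty> findByShortName(List<ComponentProperty> properties, String wanted) {
        return properties.stream()
                .filter(p -> p.shortName.equalsIgnoreCase(wanted))
                .findFirst();
    }

    public static void main(String[] args) {
        List<ComponentProperty> properties = Arrays.asList(
                new ComponentProperty("Step Name", "stepName", "input"),
                new ComponentProperty("File Name", "fileName", "fileinput"),
                new ComponentProperty("Delimiter", "delimiter", "input"));

        System.out.println(findByShortName(properties, "Filename").isPresent());  // prints true
        System.out.println(findByShortName(properties, "separator").isPresent()); // prints false
    }
}
```

A shortName that differs from the expected one by more than letter case is not found, which in the decodeStep fragment means the property is skipped.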
- As BICenter is already prepared to receive files and all of our CSVInput's parameters, we can skip the third step (extending submitClick or applyChanges).
- Now we have to create a new class in the app > diSdk > step > parser directory. This can be done simply by copying any of the other classes already present in the folder. Don't worry about the lack of logic in this class: it is a mere extension of the AbstractStep class, which itself contains all the logic needed to communicate with PDI, and it exists so that our component can be detected and processed.
package diSdk.step.parser;

import diSdk.step.AbstractStep;
import models.Step;
import org.pentaho.di.trans.step.StepMetaInterface;
import org.w3c.dom.Element;

// Intentionally empty parser class: the shared logic lives in AbstractStep.
public class CSVInput extends AbstractStep {

    @Override
    public void decode(StepMetaInterface stepMetaInterface, Step step) throws Exception {
        // No step-specific decoding is added here.
    }

    @Override
    public Element encode(StepMetaInterface stepMetaInterface) throws Exception {
        // No step-specific encoding is added here.
        return null;
    }
}
- As we want our system to detect the CSV's fields automatically, without users having to enter them manually, we have to add some logic to the decodeStep method in app > diSdk > step > AbstractStep.java, which does an initial read of the file in order to extract its column names. This can be done by creating a new method in this class and calling it from the decodeStep method, or by injecting the logic directly into decodeStep.
// If dealing with CSVFileInput, read the input fields from the file header and define them.
if (shortName.equals("InputFields")) {
    // Resolve the file name from the step properties if it is not known yet.
    if (fileName == null) {
        Optional<StepProperty> fileNameStepProperty = stepProperties.stream()
                .filter(stepProperty -> stepProperty.getComponentProperty().getShortName().equalsIgnoreCase("Filename"))
                .findFirst();
        if (!fileNameStepProperty.isPresent())
            continue;
        fileName = fileNameStepProperty.get().getValue();
    }
    // Resolve the delimiter the same way.
    if (delimiter == null) {
        Optional<StepProperty> delimiterStepProperty = stepProperties.stream()
                .filter(stepProperty -> stepProperty.getComponentProperty().getShortName().equalsIgnoreCase("Delimiter"))
                .findFirst();
        if (!delimiterStepProperty.isPresent())
            continue;
        delimiter = delimiterStepProperty.get().getValue();
    }
    // try-with-resources closes the reader once the header line has been read.
    try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
        String header = br.readLine();
        String[] fields = new String[0];
        if (header != null) {
            // Quote the delimiter so that characters such as "|" are not treated as regex syntax.
            fields = header.split(java.util.regex.Pattern.quote(delimiter));
        }
        TextFileInputField[] value = new TextFileInputField[fields.length];
        for (int i = 0; i < fields.length; i++) {
            value[i] = new TextFileInputField();
            value[i].setName(fields[i]);
        }
        // Invoke the current method with the StepProperty value.
        invokeMethod(stepMetaInterface, method, value, databases);
    } catch (FileNotFoundException e) {
        // File not found: no input fields are defined for this step.
    }
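The header-reading part of the logic above can also be tried in isolation. Below is a minimal sketch, assuming the first line of the file holds the column names and that fields are not wrapped in an enclosure character. CsvHeaderFields, splitHeader and readHeaderFields are hypothetical names, and the sample reads from a StringReader instead of a file so that it runs anywhere.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.regex.Pattern;

class CsvHeaderFields {

    /** Splits a header line on a literal delimiter (not a regular expression). */
    static String[] splitHeader(String headerLine, String delimiter) {
        if (headerLine == null) {
            return new String[0]; // empty input: no columns
        }
        // Pattern.quote keeps delimiters such as "|" or "." from being read as regex syntax;
        // the limit -1 keeps trailing empty column names.
        return headerLine.split(Pattern.quote(delimiter), -1);
    }

    /** Reads only the first line from the reader and returns its column names. */
    static String[] readHeaderFields(BufferedReader reader, String delimiter) throws IOException {
        return splitHeader(reader.readLine(), delimiter);
    }

    public static void main(String[] args) throws IOException {
        String csv = "id|name|price\n1|apple|0.5\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(csv))) {
            System.out.println(Arrays.toString(readHeaderFields(reader, "|"))); // prints [id, name, price]
        }
    }
}
```

String.split interprets its argument as a regular expression, so a plain split("|") would split between every character. Quoting the delimiter avoids that. A header whose column names contain the delimiter inside an enclosure would still be split incorrectly by this sketch.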