XSV to XML converter





AboutThe Product



The XSV to XML converts CSV, TSV or any delimited files to XML files with very little effort. Mainly, it gives you a lot of control over processing of data during conversion.

During migration or loading exercises, many times, you need to convert delimited data (CSV, TSV etc) to XML load files. Sure, you can use out of the box converter to achieve this, but often, you want more control.

Imagine loading 10,000 objects and the loader throws error while loading 999th object stating “Folder not Found” or “Incorrect Format”

With this converter, you will be able to achieve the following.

  1. Validate rows that qualify for loading.  For example, validate data against existence of folder in the system. You can validate columns using existing validation methods or write your own in a very easy manner.
  2. Format columns. For example, formatting date field. You can format columns using existing formatting methods or write your own in a very easy manner.
  3. Dynamically calculate value for columns. For example, display data to internal value mapping.
  4. Move invalid rows into separate file and convert valid rows to loadable XML
  5. Log4J integration and logging that clearly shows columns that are invalid and the reasons for them to be invalid.
  6. Easily customize existing conversion framework to suite your needs.

To understand the solution by example, please download the free solution and run examples inside PACKAGE_HOME/example/xsvtoxml folder. The converter is currently referencing Windchill API to show how to handle Windchill validations, however it can be easily changed to run in a standalone mode.


Inputs and Outputs

At the minimum, the converter takes three inputs.

XSV File – File containing delimited data

Recipe File – A recipe file shows the converter how the output should look like.

Delimiter – You specify the delimiter XSV file is using. For example it would be “,” for CSV files and “\t” for TSV files

Example input file content

Notice, second row does not have first field which is the name field and should be written in error.csv after processing is comple

#Name(1),Part Number(2),Version(3),State(4),Folder(5),Effective Date(6)
#Here numbering of index is not necessary. This is just for informative purpose<strong>    METAL PRINT ASY,001089767,B,In Work,/Default,11-07-2018
,35798390,,In Work,/Default,2012-12-01


Converter Flow

Example Recipe file content

<csvBeginWTPart handler=”wt.part.LoadPart.beginCreateWTPart”>
<csvpartName>{column1}</csvpartName>
<csvpartNumber>{column2}</csvpartNumber>
<csvtype>separable</csvtype>
<csvgenericType>variant</csvgenericType>
<csvsource>make</csvsource>
<csvfolder>{column5}</csvfolder>
<csvlifecycle></csvlifecycle>
<csvview>Design</csvview>
<csvteamTemplate/>
<csvlifecyclestate>{column4}</csvlifecyclestate>
<csvtypedef/>
<csvversion>{column3}</csvversion>
<csviteration/>
<csvorganizationName/>
<csvorganizationID/>
</csvBeginWTPart>
<csvIBAValue handler=”wt.iba.value.service.LoadValue.createIBAValue”>
<csvdefinition>effectiveDate</csvdefinition>
<csvvalue1>{column6}</csvvalue1>
<csvvalue2></csvvalue2>
<csvdependency_id></csvdependency_id>
</csvIBAValue>
<csvEndWTPart handler=”wt.part.LoadPart.endCreateWTPart”>
<csvpublishFlag/>
<csvparentContainerPath/>
</csvEndWTPart>

Example Output file content

&lt;?xml version=”1.0″ encoding=”big5″?&gt;
&lt;!DOCTYPE NmLoader SYSTEM “standardX10.dtd”&gt;
&lt;NmLoader&gt;
&lt;csvBeginWTPart handler=”wt.part.LoadPart.beginCreateWTPart”&gt;
&lt;csvpartName&gt;METAL PRINT ASY&lt;/csvpartName&gt;
&lt;csvpartNumber&gt;001089767&lt;/csvpartNumber&gt;
&lt;csvtype&gt;separable&lt;/csvtype&gt;
&lt;csvgenericType&gt;variant&lt;/csvgenericType&gt;
&lt;csvsource&gt;make&lt;/csvsource&gt;
&lt;csvfolder&gt;/Default&lt;/csvfolder&gt;
&lt;csvlifecycle&gt;&lt;/csvlifecycle&gt;
&lt;csvview&gt;Design&lt;/csvview&gt;
&lt;csvteamTemplate/&gt;
&lt;csvlifecyclestate&gt;INWORK&lt;/csvlifecyclestate&gt;
&lt;csvtypedef/&gt;
&lt;csvversion&gt;B&lt;/csvversion&gt;
&lt;csviteration/&gt;
&lt;csvorganizationName/&gt;
&lt;csvorganizationID/&gt;
&lt;/csvBeginWTPart&gt;
&lt;csvIBAValue handler=”wt.iba.value.service.LoadValue.createIBAValue”&gt;
&lt;csvdefinition&gt;effectiveDate&lt;/csvdefinition&gt;
&lt;csvvalue1&gt;2018-07-11 00:00:00&lt;/csvvalue1&gt;
&lt;csvvalue2&gt;&lt;/csvvalue2&gt;
&lt;csvdependency_id&gt;&lt;/csvdependency_id&gt;
&lt;/csvIBAValue&gt;
&lt;csvEndWTPart handler=”wt.part.LoadPart.endCreateWTPart”&gt;
&lt;csvpublishFlag/&gt;
&lt;csvparentContainerPath/&gt;
&lt;/csvEndWTPart&gt;
&lt;/NmLoader&gt;

Example Error file content

,35798390,,In Work,/Default,2012-12-01

Example Config file content

<header><![CDATA[
<?xml version=”1.0″ encoding=”big5″?>
<!DOCTYPE NmLoader SYSTEM “standardX10.dtd”>
<NmLoader>
]]></header><footer><![CDATA[</NmLoader>]]></footer>             &lt;!– Name Column –&gt;
1
name

validateNull

truncate

length
20

&lt;!– Number Column –&gt;
2
number

validateNumber

&lt;!– State Column –&gt;
4
state

getMappedValue

In Work
INWORK

Cancelled
CANCELLED

&lt;!– Effective IBA Column –&gt;
6
Effective Date

validateDateFormat

format
dd-MM-yyyy

formatDate

inputFormat
dd-MM-yyyy

outputFormat
yyyy-MM-dd HH:mm:ss

&lt;!– Folder Column –&gt;
&lt;!– This section has been commented out so that you could test the conversion utility without having functioning Windchill –&gt;
&lt;!–
5
folder

validateFolderExistence

conainerPath
/wt.inf.container.OrgContainer=My Org/wt.pdmlink.PDMLinkProduct=Wind Turbine

–&gt;

Example Log File

RowInfoHolder
index ==> 1
line ==> METAL PRINT ASY,001089767,B,In Work,/Default,11-07-2018

isValid ==> true
[
CellInfoHolder
column ==> 1
name ==> name
value ==> METAL PRINT ASY
isValid ==> true
validators ==> [
method ==> validateNull
args ==> {}
result ==> true]
valueprocessors ==> [
method ==> truncate
args ==> {length=20}
result ==> METAL PRINT ASY],
CellInfoHolder
column ==> 2
name ==> number
value ==> 001089767
isValid ==> true
validators ==> [
method ==> validateNumber
args ==> {}
result ==> true]
valueprocessors ==> [],
CellInfoHolder
column ==> 4
name ==> state
value ==> INWORK
isValid ==> true
validators ==> []
valueprocessors ==> [
method ==> getMappedValue
args ==> {Cancelled=CANCELLED, In Work=INWORK}
result ==> INWORK],
CellInfoHolder
column ==> 6
name ==> Effective Date
value ==> 2018-07-11 00:00:00
isValid ==> true
validators ==> [
method ==> validateDateFormat
args ==> {format=dd-MM-yyyy}
result ==> true]
valueprocessors ==> [
method ==> formatDate
args ==> {outputFormat=yyyy-MM-dd HH:mm:ss, inputFormat=dd-MM-yyyy}
result ==> 2018-07-11 00:00:00]]

RowInfoHolder
index ==> 1
line ==> ,35798390,,In Work,/Default,2012-12-01

isValid ==> false
[
CellInfoHolder
column ==> 1
name ==> name
value ==>
isValid ==> false
validators ==> [
method ==> validateNull
args ==> {}
result ==> false]
valueprocessors ==> [
method ==> truncate
args ==> {length=20}
result ==> METAL PRINT ASY],
CellInfoHolder
column ==> 2
name ==> number
value ==> 35798390
isValid ==> true
validators ==> [
method ==> validateNumber
args ==> {}
result ==> true]
valueprocessors ==> [],
CellInfoHolder
column ==> 4
name ==> state
value ==> In Work
isValid ==> true
validators ==> []
valueprocessors ==> [
method ==> getMappedValue
args ==> {Cancelled=CANCELLED, In Work=INWORK}
result ==> INWORK],
CellInfoHolder
column ==> 6
name ==> Effective Date
value ==> 2012-12-01
isValid ==> false
validators ==> [
method ==> validateDateFormat
args ==> {format=dd-MM-yyyy}
result ==> false]
valueprocessors ==> [
method ==> formatDate
args ==> {outputFormat=yyyy-MM-dd HH:mm:ss, inputFormat=dd-MM-yyyy}
result ==> 2018-07-11 00:00:00]]
Skipping……..
Writing output
Done


Solution

You can download the converter along with the source code from the bottom of this page. The converter uses a “recipe” file that acts as an output definition file to convert XSV to XML.

You can execute conversion command with different parameters depending on your needs. Please look into examples folder to view sample files.

If you do not want to validate or format data but simply output XML, execute the following.

windchill com.ezcollab.converter.XSVToXMLConverter -r recipe.xml -i input.csv -d , -o output.xml

If you want to validate or format data, output XML, output invalid lines in separate file, add header, add footer, use following command.

windchill com.ezcollab.converter.XSVToXMLConverter -r recipe.xml -i input.csv -d , -o output.xml -c config.xml -e error.csv

If you want to validate data using Windchill specific validators and formatters, output XML, output invalid lines in separate file, add header, add footer, use following command.

windchill com.ezcollab.converter.XSVToXMLConverter -r recipe.xml -i input.csv -d , -o output.xml -c config.xml -e error.csv -u username -p password

Here

Args Description
-r Recipe file tells the converter how the repeatable blocks/nodes of  output XML should look like. The recipe file contains XML block that repeats for each row of CSV or TSV data. XML blocks have parameters ({column1}, {column2} …{column4} and so on) that gets  replaced by actual corresponding columns of CSV or TSV file.
-i Input CSV or TSV file. File that needs to be converted to XML
-d Delimiter of the data file
-o (Optional) Output file that contains resulting XML from the processing. Even though this is optional, you will use it most of the time.
-e (Optional) Error file. File that will contain rows that did not pass validations.
-c (Optional) Configuration file. The configuration file contains definitions for header, footer, validations and formats. Please go to Advance Configuration to understand how configuration works.
-u (Optional) Only if your validator or formatter needs windchill access
-p (Optional) Only if your validator or formatter needs windchill access

 


Advance Configuration

Logging

The solution uses Log4J for logging. Currently, logs are routed to Standard out. If you want to output logs to a file, update following file.
PACKAGE_HOME/com/ezcollab/log4j.properties

You may familiarize yourself with log4j by going to http://logging.apache.org/log4j/1.2/manual.html

Configuration

The configuration file contains definitions for header, footer, validators and formatters. com.ezcollab.converter.FormatterValidator class contains methods for validating and formatting.  You may add your own custom methods for additional validation and formatting functionalities.

Validators and Formatters get passed following three arguments based on which you can do your validation and formatting logic

(CellInfoHolder cellInfoHolder, RowInfoHolder rowInfoHolder, HashMap argMap)

Here,

CellInforHolder contains information about the cell that is getting processed. You can access following information of a cell

int index; //Index of the cell. 1st column is 1, 2nd is 2 and so on
String name; //Name of the column if you define it in config file
String value; //Value of the cell

RowInfoHolder contains information about the row that is getting processed. You can access following information of a row

int index; //Index of the row. 1st row is 1, 2nd is 2 and so on
String line; //The actual row
HashMap cellInfoHoldersMap//Map of index to CellInforHolder for that row. You can extract value of any column in that row by index or name.

argMap contains additional information passed to the converter from the config xml file. Any arg name/value pair inside args node of config file will be in argMap.

An example configuration file, config.xml,  for the converter is located in PACKAGE_HOME/example/xsvtoxml.  Following is the content of the configuration file. Please look at the explanation of different sections to understand the config file.

<meta>
<!– <header> – The value of the header section gets put at the top of the resulting XML. DTD Definition and Opening NmLoader node are always at the top of any load file. You can leave this as it is. –>
<header><![CDATA[
<?xml version=”1.0″ encoding=”big5″?>
<!DOCTYPE NmLoader SYSTEM “standardX10.dtd”>
<NmLoader>
]]>
</header>
<!– <footer> – The value of the footer section gets appended to the resulting XML. Closing NmLoader node is always at the bottom of any load file. You can leave this as it is. –>
<footer><![CDATA[</NmLoader>]]></footer>
<!– <columns> – This node contains definition for multiple columns. –>
<columns>
<!– <column> – Each column in XSV file can be represented using column. It is not required that you specify column definition for each column in XSV File. If you want to validate or process a column then you have to define the it here. –>
<column>
<!– <index> – Column index of the column for which you are creating definition. First column of XSV is 1. –>
<index>1</index>
<!– <name> – Name of the column. Validators and formatters will have this information in case you want to refer column information by its name for a particular row. –>
<name>name</name>
<!– <validators> – This node contains definition for multiple validators. –>
<validators>
<!– <validate> – Each validation method is represented by this node. –>
<validate>
<!– <name> – method name in com.ezcollab.converter.FormatterValidator class that needs to be called to validate –>
<name>validateNull</name>
</validate>
</validators>
<!– <valueprocessors> – This node contains definition for multiple formatters. –>
<valueprocessors>
<!– <processor> – Each formatter method is represented by this node. –>
<processor>
<name>truncate</name>
<!– <args> – This node can contain multiple arguments. –>
<args>
<!– <arg> – This node is optional. It is name value pair that will be passed to the com.ezcollab.converter.FormatterValidator as HashMap argument –>
<arg>
<name>length</name>
<value>20</value>
</arg>
</args>
</processor>
</valueprocessors>
</column>
<!– Number Column –>
<column>
<index>2</index>
<name>number</name>
<validators>
<validate>
<name>validateNumber</name>
</validate>
</validators>
</column>
<!– State Column –>
<column>
<index>4</index>
<name>state</name>
<valueprocessors>
<processor>
<name>getMappedValue</name>
<args>
<arg>
<name>In Work</name>
<value>INWORK</value>
</arg>
<arg>
<name>Cancelled</name>
<value>CANCELLED</value>
</arg>
</args>
</processor>
</valueprocessors>
</column>

<!– Effective IBA Column –>
<column>
<index>6</index>
<name>Effective Date</name>
<validators>
<validate>
<name>validateDateFormat</name>
<args>
<arg>
<name>format</name>
<value>dd-MM-yyyy</value>
</arg>
</args>
</validate>
</validators>
<valueprocessors>
<processor>
<name>formatDate</name>
<args>
<arg>
<name>inputFormat</name>
<value>dd-MM-yyyy</value>
</arg>
<arg>
<name>outputFormat</name>
<value>yyyy-MM-dd HH:mm:ss</value>
</arg>
</args>
</processor>
</valueprocessors>
</column>
<!– Folder Column –>
<!– This section has been commented out so that you could test the conversion utility without having functioning Windchill –>
<!–
<column>
<index>5</index>
<name>folder</name>
<validators>
<validate>
<name>validateFolderExistence</name>
<args>
<arg>
<name>conainerPath</name>
<value>/wt.inf.container.OrgContainer=My Org/wt.pdmlink.PDMLinkProduct=Wind Turbine</value>
</arg>
</args>
</validate>
</validators>
</column>
–>
</columns>
</meta>