SmartXML is an extract, transform, load (ETL) tool for processing XML files, developed in the Red programming language.[1] Its engine is designed to work with XML: it handles complex data structures, classifies documents, and transforms data into formats suitable for databases.

The application uses a virtual DOM-like representation named SmartDOM,[2] which allows it to process XML files without requiring an XSD schema and helps it extract, classify, and transform data. It addresses challenges outlined in XPath and XPointer: Locating Content in XML Documents by John Simpson.[3]
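
As a rough illustration of schema-free processing, the Python sketch below walks an XML document with the standard library and records every element path it meets, so the structure is discovered from the data rather than declared in an XSD. This is not SmartXML's SmartDOM, whose internals are not described here; the function name `collect_paths` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def collect_paths(xml_text):
    """Map each element path (e.g. 'order/item/name') to the text values found there."""
    root = ET.fromstring(xml_text)
    paths = defaultdict(list)

    def walk(node, prefix):
        # Build the path from the root down; no schema is consulted.
        path = f"{prefix}/{node.tag}" if prefix else node.tag
        text = (node.text or "").strip()
        if text:
            paths[path].append(text)
        for child in node:
            walk(child, path)

    walk(root, "")
    return dict(paths)


# Hypothetical sample input, used only for illustration.
SAMPLE = """
<order id="7">
  <customer>Ann</customer>
  <item><name>Pen</name><qty>2</qty></item>
  <item><name>Ink</name><qty>1</qty></item>
</order>
"""

index = collect_paths(SAMPLE)
```

An index of this kind could then drive field selection or classification, since every path present in the document is known after a single pass.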

SmartXML supports uploading data into PostgreSQL,[4] MongoDB,[5] and ArangoDB.

SmartXML implements proprietary parsing rules to prevent vulnerabilities such as XPath injection attacks.[6]
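
SmartXML's parsing rules are proprietary and not documented here. The general idea behind avoiding XPath injection can still be sketched: untrusted input is kept out of the query expression and compared as plain data instead. The Python example below is a minimal sketch under that assumption; `find_roles` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, used only for illustration.
USERS = """<users>
  <user><name>alice</name><role>admin</role></user>
  <user><name>bob</name><role>viewer</role></user>
</users>"""


def find_roles(xml_text, user_name):
    """Return the roles of users whose <name> equals user_name exactly.

    The user-supplied value is never interpolated into an XPath string,
    so quotes or operators inside it cannot change the query.
    """
    root = ET.fromstring(xml_text)
    roles = []
    for user in root.iter("user"):
        if user.findtext("name") == user_name:
            roles.append(user.findtext("role"))
    return roles
```

Calling `find_roles(USERS, "alice")` matches one user, while an input crafted like an injection payload, such as `' or '1'='1`, is treated as an ordinary name and matches nothing.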

Features

  • Schema Independence: Builds a virtual DOM-like representation of XML data, enabling transformations into tabular or JSON formats without relying on predefined XSD schemas.
  • Document Classification: Automatically classifies documents based on content, even without a fixed schema.
  • Field Extraction Configuration: Allows users to flexibly configure the required fields for data extraction.
  • Hierarchical Data Preservation: Generates SQL or JSON from XML, preserving hierarchical relationships for seamless database integration.
  • Database Compatibility: Supports both relational databases (e.g., PostgreSQL) and NoSQL databases for data loading.
  • Data Preprocessing with Built-In Grammars: Utilizes built-in grammars and lightweight natural language processing techniques for data cleansing and preprocessing.
  • Batch Processing Mode: Efficiently handles large-scale data transformations.
  • Secure Parsing Rules: Implements proprietary parsing rules to prevent vulnerabilities such as XPath injection attacks.
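
One common way to preserve hierarchy when moving XML into JSON or relational tables is shown in the Python sketch below. It is an assumption-laden illustration, not SmartXML's output format: `to_nested` and `to_rows` are hypothetical helpers, and the `(id, parent_id, tag, text)` row layout is chosen only for this example.

```python
import json
import xml.etree.ElementTree as ET


def to_nested(node):
    """Convert an element into a nested dict, keeping attributes, text, and children."""
    return {
        "tag": node.tag,
        "attrs": dict(node.attrib),
        "text": (node.text or "").strip(),
        "children": [to_nested(child) for child in node],
    }


def to_rows(root):
    """Flatten a tree into (id, parent_id, tag, text) rows.

    parent_id links each row to its parent, so the hierarchy survives
    in a flat, table-friendly form.
    """
    rows = []

    def walk(node, parent_id):
        node_id = len(rows) + 1  # ids are assigned in document order
        rows.append((node_id, parent_id, node.tag, (node.text or "").strip()))
        for child in node:
            walk(child, node_id)

    walk(root, None)
    return rows


# Hypothetical sample input, used only for illustration.
doc = ET.fromstring("<invoice><buyer>Ann</buyer><line><sku>A1</sku></line></invoice>")
nested_json = json.dumps(to_nested(doc))
rows = to_rows(doc)
```

The nested form suits document stores, while the parent-linked rows can be loaded into a relational table and joined on `parent_id` to rebuild the tree.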

References
