eSubmission validator

Introduction

To ensure that an electronic submission can be processed, a technical validation of the submission is recommended. ECPA supplies a validation engine, the eSubmission validator, which can validate both CADDY-xml and GHSTS dossiers.

The validation engine runs predefined test configurations, which are loaded into the engine and selected by the user.

Installation requirements

The eSubmission validator is currently available as two separate downloads:

  • Download for 32bit Windows
  • Download for 64bit Windows

Please check your local Windows installation before selecting the download.
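One way to check the installation from a script is to read the processor environment variables that Windows sets for every process. The following is a minimal sketch, not part of the validator: the function name windows_bitness is hypothetical, and it assumes that any architecture value ending in "64" (for example AMD64) calls for the 64bit download.

```python
import os


def windows_bitness(env=os.environ):
    """Return '64bit' or '32bit' based on Windows' processor environment variables.

    Hypothetical helper for illustration only. If neither variable is set
    (for example on a non-Windows system) it falls back to '32bit'.
    """
    # PROCESSOR_ARCHITEW6432 is only set when a 32-bit process runs on
    # 64-bit Windows, so it takes precedence over PROCESSOR_ARCHITECTURE.
    arch = env.get("PROCESSOR_ARCHITEW6432") or env.get("PROCESSOR_ARCHITECTURE", "")
    return "64bit" if arch.upper().endswith("64") else "32bit"


if __name__ == "__main__":
    print(windows_bitness())
```

The same information is shown in the Windows system settings; the script is only a convenience when many machines have to be checked.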

The eSubmission validator requires a separately installed Java runtime, which is not part of the installation. Currently Java 8 and Java 11 are supported. Please make sure that either the directory containing the java.exe file is part of the environment variable PATH, or the environment variable JAVA_HOME or JAVA_JRE is properly set. If needed, the full path to the java.exe file can also be added to the eSubmission.ini file (in the installation directory).
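To check beforehand whether a Java runtime can be found through these variables, a lookup along the following lines may help. This is a sketch under assumptions: the function name find_java is hypothetical, the search order (JAVA_HOME, then JAVA_JRE, then PATH) is chosen for illustration and is not necessarily the order the validator uses, and both variables are assumed to point to a Java directory that contains bin\java.exe.

```python
import os
import shutil
from pathlib import Path


def find_java(env=None, exe_name="java.exe"):
    """Return the path of the first java.exe found, or None.

    Hypothetical helper; the lookup order is an assumption made for
    this sketch and may differ from what the validator does.
    """
    env = os.environ if env is None else env

    # Assumed convention: JAVA_HOME / JAVA_JRE point to the Java
    # installation directory, with the executable in its bin folder.
    for var in ("JAVA_HOME", "JAVA_JRE"):
        home = env.get(var)
        if home:
            candidate = Path(home) / "bin" / exe_name
            if candidate.is_file():
                return str(candidate)

    # Fall back to searching the directories listed in PATH.
    return shutil.which(exe_name, path=env.get("PATH", ""))


if __name__ == "__main__":
    print(find_java() or "No java.exe found via JAVA_HOME, JAVA_JRE or PATH")
```

If the script finds nothing, adding the full path to the eSubmission.ini file remains the fallback described above.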

Please see the file readme.txt (in the installation directory) for further, more detailed information. Please read the instructions carefully before issuing a service request due to installation problems.

If you do not fully understand the instructions above and/or have insufficient Windows access rights, please contact your local IT department for support first.

We plan to stop supporting 32bit Windows in the next major release due to technical limitations of the runtime environment.

Usage in company intranets

If used in a company intranet, the validator has to apply the proxy settings from Windows to check and download the test configurations from the internet. The same proxy settings are also used in web browsers to access the internet.

In some circumstances the validator can retrieve the Windows proxy configuration automatically if the checkbox "Automatically detect proxy settings" on the settings pane is set.

If this does not work, the checkbox "Use proxy" has to be checked and the IP address and port of the proxy server have to be supplied.

There are several ways to find out the correct proxy server and port:

  • Ask your local IT department for those settings
  • Check the proxy settings in your web browser and look for a proxy server and port there. If your company uses a so-called PAC script (usually a URL with the suffix .pac), you may use an online tool like https://app.thorsen.pm/proxyforurl (use at your own risk!) to derive the proxy IP and port from the PAC script. Please note that the eSubmission validator cannot evaluate PAC scripts itself.
  • Use a dedicated tool to determine the settings, e.g. https://github.com/MarkusBernhardt/proxy-vole#proxy-vole-tester (use at your own risk!)
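A PAC script returns its answer as a string such as "PROXY proxy.example.com:8080; DIRECT", while the validator settings expect a separate server and port. The following minimal sketch shows how such a result string can be split into the two values. The function name parse_pac_result and the host proxy.example.com are hypothetical, and the sketch only handles entries of the form "PROXY host:port".

```python
def parse_pac_result(result):
    """Return (host, port) of the first PROXY entry in a PAC result, or None.

    Hypothetical helper for illustration; entries such as DIRECT or
    SOCKS are skipped.
    """
    for entry in result.split(";"):
        parts = entry.split()
        if len(parts) == 2 and parts[0].upper() == "PROXY":
            # Split at the last colon to separate host and port.
            host, sep, port = parts[1].rpartition(":")
            if sep and port.isdigit():
                return host, int(port)
    return None


if __name__ == "__main__":
    print(parse_pac_result("PROXY proxy.example.com:8080; DIRECT"))
    # prints ('proxy.example.com', 8080)
```

The host and port obtained this way correspond to the two values that have to be entered after checking "Use proxy".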

If you are not sure what specifically to do, please contact your local IT department.

Download and installation

The validator is provided as a setup file that installs the software on Windows. You may require local administration rights for the installation. Please note that Windows may show security warnings, because the setup is not digitally signed with a code signing certificate and Windows may not trust the installer due to its rather small number of downloads and installations. If you have questions, please contact your local IT support.

User interface

[Figure: eSubmission Validator GUI]

Test configurations for technical validation

A test configuration is a set of tests to be performed on a submission package. Different test types can be part of a test configuration:

  • Tests on the XML backbone of the submission
  • Tests of the submission package structure
  • Tests on the content (MD5 checksum and PDF/A-1b validation)
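To illustrate what an MD5 content test involves, the sketch below computes the MD5 checksum of a file and compares it with an expected value. It is not the validator's implementation: the function names md5_of_file and checksum_matches are hypothetical, and where the expected checksum comes from (for example the XML backbone of the submission) is left to the caller.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 checksum of a file as a lowercase hex string."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large PDF files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_md5):
    """Compare a file's MD5 checksum with an expected hex value.

    The comparison ignores case and surrounding whitespace in the
    expected value.
    """
    return md5_of_file(path) == expected_md5.strip().lower()
```

A mismatch indicates that the file was changed after its checksum was recorded.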

Please open the eSubmission validator for more details about the available technical test configurations. By default the eSubmission validator checks online for newly available technical test configurations. If you work from an intranet, you may need to provide proxy settings to run the update; in that case please contact your local IT support.

CADDY-xml test configurations

Test configuration name | Description
Complete | Complete check - consists of the checks on the XML backbone and file system, plus the MD5 and PDF/A checks (using Apache PDFBox for validation)
Partial | Partial check - same as the "Complete" test configuration, but without PDF/A-1b validation
PDF | Partial check - checks all attached PDF files for PDF/A-1b compliance (using Apache PDFBox for validation)

As PDF/A-1b validation is by far the most time-consuming part of a validation, it is possible either to run a Partial check without PDF/A validation or to check only for PDF/A compliance, e.g. when rechecking a submission in which only PDF content was modified.

GHSTS test configurations

As of now (6/2019), no agreed technical validation rules for GHSTS are available yet. Some test validation rules are available for demonstration purposes. Please contact the technical support if you are interested.

Proprietary test configurations

Industry and authorities may define additional validation rules that check specific business constraints. The test configurations use the standard validation language Schematron. Industry and authorities can adapt and extend the existing test configurations to test additional business constraints and host those validation rules in their own intranet test configuration repositories. If you want to use the eSubmission validator for such business validation purposes, please contact the technical support.

Functional overview

The eSubmission validator has the following main functionality:

  • Select applicable test configuration and run validation
  • Save a validation report in HTML, PDF or XML format
  • Get detailed validation error information on test level and issue level
  • Check for and download available test configurations from the online test configuration repository
  • Supply test configurations as ZIP files when no internet connection is available
  • Save test and issue information from the grid to CSV
  • Configuration of different user settings

Server-side integration

The eSubmission validator is also available as a server-side component without GUI that can be integrated into publication and ingestion workflows. If you want to integrate the validator into an existing environment, please contact the technical support.

PDF/A-1b validation components

The eSubmission validation engine can use one of two PDF/A-1b validation components.

The two validation components use different validation models and can therefore report different errors. Each of them currently has pros and cons. The Apache PDFBox component has proven to be more stable, with better runtime and lower memory consumption.

Please see e.g. https://github.com/veraPDF/veraPDF-library/issues/956 for a discussion of the differences. For details on how to use the two validation components, please contact the technical support.