Automatic PDF Processor - automatically process PDF files

The complete solution for automated processing of PDF documents

Automatic PDF Processor - Online help

Step-by-step instructions

Automatically print PDF files
Automatically rename PDF files
Automatically move PDF files
Automatically split PDF files
Automatically merge PDF files
Automatically send PDF files by email
Automatically make PDF files searchable (OCR)
Automatically save PDF attachments
Automatically extract PDF form data
Extract PDF data
Use a file grouping for printing contiguous files

1 Start

With Automatic PDF Processor, PDF files can be processed automatically. Any number of folders can be monitored to automatically print, rename, or move incoming PDF files to dynamically named folders. In addition to the document content, numerous metadata fields of the PDF document can be used, for example to include invoice information in the file path. The PDF files to be processed can be narrowed down with various profile-specific filters: the document text, PDF metadata, and general file information are available as filter criteria.

The program only processes PDF files.

The help describes the program functions and provides instructions for the use of Automatic PDF Processor.

2 Main menu – menu entries of the group Menu

2.1 Options…

Use this menu item to open the dialog box for customizing the program options.

2.2 Enter license key…

This menu entry opens the registration dialog, where you can enter the license key to unlock the full version of the program.

2.3 Help

Clicking on this menu item opens the online help in the default browser. To use the online help, an active Internet connection is required.

2.4 About…

This menu item opens a dialog window that displays the program version and the license state. The dialog window also contains links to technical support, the product's web page, etc.

2.5 Other --> Create error report

In case of technical problems, you can create an error report by using this menu item. The created file is named "Automatic PDF Processor - error report" and is saved to the Desktop. You can then send us the error report as an email attachment, together with a short description of the problem.

2.6 Other --> Clear error logs

This menu item allows the error log files to be emptied manually. However, the program automatically removes entries that are older than 7 days during every saving process.
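
The 7-day retention rule can be pictured with a minimal Python sketch. It is not the program's actual implementation; the entry format (timestamp, message) and the function name prune_entries are assumptions made for illustration.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=7)  # retention period stated in the help text


def prune_entries(entries, now):
    """Keep only (timestamp, message) entries that are at most 7 days old.

    The tuple layout is hypothetical; the real log format is not documented here.
    """
    cutoff = now - RETENTION
    return [(ts, msg) for ts, msg in entries if ts >= cutoff]
```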

2.7 Other --> Backup application data…

This menu item allows you to save the entire application data (profiles, log, etc.) as a ZIP archive in a directory of your choice. After saving, the ZIP archive is highlighted in Windows Explorer.
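
If you want to create a comparable backup from a script, the following Python sketch packs a data directory into a ZIP archive. The function name, the archive naming scheme "backup-<stamp>.zip" and the directories passed in are assumptions; the archive layout produced by the program itself may differ.

```python
import shutil
from pathlib import Path


def backup_app_data(app_data_dir, target_dir, stamp):
    """Pack app_data_dir into <target_dir>/backup-<stamp>.zip and return its path.

    The file name pattern is hypothetical and only used for this example.
    """
    base = Path(target_dir) / f"backup-{stamp}"
    archive = shutil.make_archive(str(base), "zip", root_dir=str(app_data_dir))
    return Path(archive)
```

The target directory should lie outside the directory being archived, so that the archive does not try to include itself.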

2.8 Other --> Restore application data…

This menu entry can be used to restore previously saved application data.

2.9 Exit

Use this menu item to exit the program.

3 Buttons of the profile toolbar

3.1 New profile…

A click on this button opens a window for creating a new profile.

3.2 Edit…

This button opens the window for editing the settings of the profile that is currently selected in the list.

3.3 Duplicate

Creates a copy of the selected profile.

3.4 Catch up…

This button allows you to apply the selected profiles retrospectively to all PDF files that meet the respective profile-specific filter criteria. Optionally, the search process can be limited to a certain period.

3.5 Activate

A click on this button activates the selected entries of the profile list, i.e., the profiles get the status Active and are applied to newly incoming PDF documents.

3.6 Deactivate

This button sets the selected profiles to the status Inactive, i.e., the profiles are ignored when processing newly incoming PDF documents.

3.7 Delete

By clicking the button, you can delete the selected profiles after a confirmation prompt.

3.8 Other

Contains the following entries:

Apply all profiles subsequently…
This menu entry opens a new window in which you can optionally restrict processing to files whose file date lies within a certain period. After confirmation, all PDF files in the monitored folders are processed retroactively using all active profiles.
Activate all profiles
With this button, you can set all profiles to the status Active, i.e., the profiles are applied to newly incoming PDF documents.
Deactivate all profiles
This button sets all profiles to the status Inactive, i.e., the profiles are ignored when processing newly incoming PDF documents.
Delete all profiles
Clicking this button deletes all profiles after a confirmation prompt.
Import profiles...
This menu item is used to import profiles from a JSON file.

3.8.1 Export profiles

Here you can export profiles, for example to transfer them to another computer. Choose from the following options:

Export all profiles
Export active profiles
Export selected profiles

4 Profile list

4.1 Status

Newly created profiles are given the status Active. All profiles with this status will be applied to incoming files. To disable a profile, click the check box at the beginning of each row. The status then changes to Inactive.

A status change causes the profile change date to be altered. A profile is only automatically applied to files whose detection date lies after the profile creation date and the profile change date.
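
The date rule above can be expressed as a one-line check. This is only an illustrative Python sketch; the function and parameter names are assumptions.

```python
from datetime import datetime


def profile_applies(detected, created, changed):
    """True if the file was detected after both the creation and the change date."""
    return detected > max(created, changed)
```

For example, a file detected before the most recent status change is not picked up automatically, even though it was detected after the profile was created. Such files can still be processed with the Catch up function.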

4.2 Name

The name of the profile.

4.3 Last use (regularly)

Shows the date and time of the last successful regular (that is, automatic) application of the profile.

4.4 Last use (subsequently)

Shows the date and time of the last successful application of a profile using the Catch-Up function.

4.5 Monitored folders

A list of all folders the profile will be applied to.

4.6 Comment

An optional comment on the profile.

4.7 Context menu of the profile list

The context menu of the profile list contains the following entries:

Select all
Select none
Invert selection
Export profile

5 Profile settings

5.1 General

5.1.1 Name

Here you can give the profile a meaningful name.

5.1.2 Comment

Optionally, you can enter a comment that will be displayed in the profile list.

5.1.3 Color highlighting in the log list

Here you have the option to determine a color in which the profile will be highlighted in the log list.

5.1.4 Password for PDF files (if needed)

If necessary, enter the required password for the PDF files here. The program tries to open the files without a password at first and only uses the password stored here if necessary.

5.2 File grouping

File groupings are required only for very special configurations. They allow files to be processed in dependence on one another, for example to print (or merge) an invoice together with a delivery bill, terms and conditions, and a shipping label while ensuring that the files stay together and in the correct order.

A file group consists of one or more primary files as well as secondary files. Secondary files are optional if the option to process multiple primary files is enabled. The primary file is considered to be the file received in the monitored folder.

Secondary files are identified with the help of a separate profile. In this profile, the filter criteria such as part of the file name are specified, and additional placeholders such as an invoice number are provided via extraction rules. The placeholders of the different profiles can be compared with each other via additional validity checks, for example, to ensure that all or certain files of the group contain the same invoice number in the document text.

The tasks of the primary subgroup are processed first, followed by those of the secondary subgroups in the visible order.

Files that have been processed as part of the primary subgroup will not be processed again in a regular manner, even during the current run. If necessary, the profile with the file grouping must be named so that it is executed last.

5.2.1 Use cases for file grouping

Use cases for using the primary subgroup without secondary subgroups:

Create collective invoices or protocols
Process unsorted incoming files sorted by file name after a short waiting time

Use cases for using the primary subgroup with secondary subgroups:

Process different but related files in a specific order and with individual configurations, for example print invoice, delivery bill and shipping label
Merge or send different but related files together

5.3 Monitored folders

Here you can add one or more folders. The profile is applied to all added folders if the specified filter criteria are met. When the option Including Sub-Folders is activated, all PDF files in subfolders of the added folders are processed as well.

5.4 Filter

Here you specify the (optional) conditions that the file properties must meet. The PDF document is processed only if all filter criteria are met. You can use logical operators to set several conditions for a file property. In this case, the AND operator takes precedence over the OR operator. The filter is case-insensitive: no distinction is made between uppercase and lowercase letters in the entered terms.

Wildcards such as the asterisk are not supported. Use the Regex placeholder instead. Please note that a Regex placeholder must either be the only filter value or be surrounded by logical operators only. Example: <BeginOfRegex>^Offer<EndOfRegex><AND>Company XYZ
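
To illustrate how AND binds more tightly than OR, here is a minimal Python sketch of such a filter evaluation. It is not the program's implementation. The <OR> token, the "contains" comparison for plain terms and the case-insensitive handling of regular expressions are assumptions; only <AND>, <BeginOfRegex> and <EndOfRegex> appear in the example above. Python's re module is also not fully identical to .NET regular expressions.

```python
import re

# A term that consists solely of a Regex placeholder.
REGEX_TERM = re.compile(r"<BeginOfRegex>(.*)<EndOfRegex>", re.DOTALL)


def term_matches(term, value):
    """Match one term: as a regular expression or as a case-insensitive substring."""
    term = term.strip()
    m = REGEX_TERM.fullmatch(term)
    if m:
        return re.search(m.group(1), value, re.IGNORECASE) is not None
    return term.lower() in value.lower()  # assumption: "contains" comparison


def filter_matches(expression, value):
    """Evaluate 'a<AND>b<OR>c' as (a AND b) OR c, so AND has precedence."""
    or_groups = expression.split("<OR>")  # "<OR>" is an assumed token name
    return any(
        all(term_matches(t, value) for t in group.split("<AND>"))
        for group in or_groups
    )
```

Splitting on OR first and on AND second is what gives AND the higher precedence: each OR branch is a group of terms that must all match.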

5.5 Example files

Specify here 5 or more PDF files that correspond to the ones you want to process. In the profile settings, you will get a preview of the extracted data, verification results etc. based on these sample files. If the PDF files are only to be printed, this specification is not necessary.

5.6 Data extraction

In this category, you can create or change rules for extracting data. For rules that have already been created, the following properties are previewed here:

Name of the rule
Comment
Data source
Data type
Verification result
Determined data

By double-clicking on the name of the rule, you can navigate directly to the properties of the rule to adjust them if necessary.

5.7 Extraction rules

5.7.1 General

5.7.1.1 Name

Here you can give the extraction rule a meaningful name. The name given here is used in several places in the program. For example, it is listed in the list of available placeholders. When exporting as a CSV file, it serves as an identifier for the respective column. It should therefore be short and concise. If you create several rules with the same name, it is sufficient if one of these rules produces a valid result.

5.7.1.2 Comment

Optionally, you can enter a comment that will be included in the rule overview.

5.7.1.3 Data source

The following sources are available for the extraction of data:

Determine data from document text
by using a keyword (variable position of data, for invoices etc.)
by specifying the position (fixed position of the data, for forms etc.)
Use document metadata (title, author, subject, keywords, created with, program name, date created, date modified, version number, number of pages)
Use file information (file name, folder name, path, path including file name, creation date, modification date)
Custom Text

Depending on the selection, different configuration options or tabs are available.

5.7.1.4 Data type

The following data types can be assigned to the extracted values:

Text
Date
Number
Query

Depending on the selection, different verification options (among other things) are available. For the data type Date, all components of the date are provided as placeholders, for example for use in file names. The type Query allows you to assign a certain value depending on whether a search term exists or not.
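
The behavior of the data types Date and Query can be sketched roughly as follows. The placeholder names (year, month, day), the function names and the case-insensitive search are assumptions for illustration; the program's actual placeholder names are shown in its link menu.

```python
from datetime import date


def date_components(d):
    """Split a date into zero-padded parts (hypothetical placeholder names)."""
    return {"year": f"{d.year:04d}", "month": f"{d.month:02d}", "day": f"{d.day:02d}"}


def query_value(text, search_term, if_found, if_missing):
    """Return one of two values depending on whether search_term occurs in text."""
    return if_found if search_term.lower() in text.lower() else if_missing
```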

5.7.2 Data determination (Keyword)

5.7.2.1 Select file

Here, you can navigate through the individual sample files (specified in 5.5) to check the extraction result and to ensure that the expected result is determined in all sample files. This is also possible in the rules overview, where the results of all established rules are displayed for the respective sample file.

5.7.2.2 Determine page

This step is optional here and only necessary if the document contains several pages and if the keyword can occur several times at unpredictable positions. If the occurrence of the keyword specified in the next step (for example "Invoice date:") always precedes the value to be determined, the default setting No Determination Necessary can be left here. Otherwise, either the corresponding page number or a unique keyword (including occurrence) can be entered to determine the page.

Note: Depending on the structure of the page, the first visible occurrence is not necessarily the first occurrence in the document structure. In this case, the occurrence number can be specified accordingly.

5.7.2.3 Set data area

Enter here the search term (for example "Invoice date:") and its occurrence (a .NET-compatible regular expression can also be entered as a search term). Then, you specify the data position, for example "To the right of the found location" if the value to be extracted directly follows the search term. There are two options for specifying the data range:

Text block
First character

The default setting Text Block covers all subsequent characters of the text block adjacent to the search word and is sufficient in most cases. However, if the text block overlaps into an unwanted adjacent data range, you must switch to the First Character setting. With this setting, only the first visible character of the text block is used as the extraction result - the data range must therefore be extended in almost all cases.

The preview located below the configuration area shows the currently extracted value.
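
The idea behind keyword-based extraction can be approximated in a few lines of Python. This sketch works on plain text lines, whereas the program works with positioned text blocks on the page, so it is a simplification. The function name is an assumption, and Python regular expressions differ from .NET in some details.

```python
import re


def extract_right_of(text, pattern, occurrence=1):
    """Return the text to the right of the n-th match of pattern, up to the line end.

    Returns None if the pattern occurs fewer than `occurrence` times.
    """
    matches = list(re.finditer(pattern, text))
    if len(matches) < occurrence:
        return None
    end = matches[occurrence - 1].end()
    line_end = text.find("\n", end)
    if line_end == -1:
        line_end = len(text)
    return text[end:line_end].strip()
```

With the text "Invoice date: 2024-03-01" and the search term "Invoice date:", the sketch yields "2024-03-01", which corresponds to the data position "To the right of the found location".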

5.7.2.4 Extend data area/Adjust data area extension

Here the data range can be extended or (for example by specifying negative values) reduced. The following options are available:

by X mm (or point)
up to before keyword
up to behind keyword
up to page margin

5.7.3 Data determination (Position)

This data retrieval type is only suitable for PDF documents in which the data is always found in the same position (mainly forms, or documents with fixed data in the footer).

Unlike keyword data retrieval, the page of the document to be used must be specified. The data to be extracted is defined here using a resizable selection rectangle. The selection rectangle can be unlocked or locked, i.e., protected against unintentional changes, by clicking the Change/Lock Position button.

The preview located below the configuration area shows the currently extracted value.

5.7.4 Clean-up

This tab provides numerous functions for cleaning up the text. For example, surrounding spaces or additionally captured text lines can be removed. Among other things, text can also be added. However, the main purpose of the cleanup is to prepare the extracted text for verification. In order to keep it as simple as possible, it is better to keep the text short here and format it only in a further step.

5.7.5 Verification

Here you can set up criteria which must be met by the extracted and cleaned text. For example, the number of extracted characters or the occurrence of certain terms can be verified. Failed verifications result in a processing error and are listed in the details in the log area on the Errors tab. For example, a different number of characters would be visible there as "Error: Number of characters".
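
A verification step of this kind might look like the following Python sketch. Only the message "Error: Number of characters" is taken from the text above; the function name, the parameters and the second message are hypothetical.

```python
def verify(value, expected_length=None, required_terms=()):
    """Return a list of error messages; an empty list means the value passed."""
    errors = []
    if expected_length is not None and len(value) != expected_length:
        errors.append("Error: Number of characters")
    for term in required_terms:
        if term not in value:
            # Hypothetical message, not taken from the program.
            errors.append(f"Error: Missing term '{term}'")
    return errors
```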

5.7.6 Formatting

This tab is only available for the data type Text and provides functions for final editing of the extracted text.

5.8 Task type: File operations (rename, move, copy …)

5.8.1 General settings

Here you can define whether the task is assigned the status Active and will therefore be executed when there are matching PDF files, or whether it is assigned the status Inactive. Active tasks are given a check mark in the task name.

5.8.2 Storage location

5.8.2.1 Folder

Select the base directory where the files should be stored by clicking the button labeled "...".

5.8.2.2 Subfolders

Use the Link Menu above the input field to select file properties that should be used to create an optional subfolder structure within the base directory.

5.8.2.3 File name

Use the Link Menu above the input field to select file properties from which the file name should be generated. If you leave this field empty, the original name of the PDF file is used.
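
How a storage path is assembled from a base directory, a subfolder structure and a file name can be sketched as follows. The "{name}" placeholder syntax, the function name and the replacement of invalid characters with an underscore are assumptions; the program uses its own placeholders, which are inserted via the Link Menu.

```python
import re
from pathlib import Path

INVALID = re.compile(r'[<>:"/\\|?*]')  # characters not allowed in Windows file names


def build_target(base_dir, subfolder_template, name_template, values, original_name):
    """Build the target path; an empty name template keeps the original file name."""
    clean = {k: INVALID.sub("_", str(v)) for k, v in values.items()}
    subfolders = subfolder_template.format_map(clean)
    file_name = name_template.format_map(clean) if name_template else original_name
    return Path(base_dir) / subfolders / file_name
```

Cleaning the values before they are inserted prevents an extracted value such as "ACME/Ltd" from accidentally creating an additional folder level.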

5.8.2.4 Name collisions

If a file with the determined name already exists, the collision rule set here is applied. Decide whether the program should overwrite the file, add a number or the processing date to the name, or cancel the operation.
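
The four collision rules can be illustrated with a short Python sketch. The rule identifiers ("overwrite", "number", "date", "cancel") and the exact naming formats, such as "name (1).pdf", are assumptions; the names generated by the program may look different.

```python
from datetime import date
from pathlib import Path


def resolve_collision(target, rule, today=None):
    """Return the path to write to, or None if the operation is cancelled."""
    target = Path(target)
    if not target.exists() or rule == "overwrite":
        return target
    if rule == "cancel":
        return None
    if rule == "date":
        stamp = (today or date.today()).isoformat()
        return target.with_name(f"{target.stem}_{stamp}{target.suffix}")
    if rule == "number":
        n = 1
        while True:  # count up until a free name is found
            candidate = target.with_name(f"{target.stem} ({n}){target.suffix}")
            if not candidate.exists():
                return candidate
            n += 1
    raise ValueError(f"unknown rule: {rule}")
```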

5.8.2.5 Execute program

For automatic further processing, the path of the stored file can be passed to a program as parameter. Here, you can optionally enter the path of a program that should be executed after storing the file.

5.8.2.6 Parameter

Optionally, use the link menu to enter one or more parameters that should be passed to the program to be executed.
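
Passing the stored file to an external program corresponds roughly to the following Python sketch. The function name and the order of the arguments (additional parameters first, file path last) are assumptions; in the program, the parameter line is composed freely via the link menu.

```python
import subprocess


def run_after_store(program, stored_file, extra_args=()):
    """Run program with the stored file's path as an argument; return the exit code."""
    cmd = [program, *extra_args, str(stored_file)]
    # Passing a list avoids quoting problems with paths that contain spaces.
    return subprocess.run(cmd, check=False).returncode
```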

5.9 Task type: Print

5.9.1 General settings

5.9.2 Print settings

Here you determine whether matching files should be printed immediately after receipt. Activate the respective check box, and then make the appropriate settings in the printer selection dialog. Confirm them by clicking the Print button in that dialog.

5.9.2.1 Page range

By using the Page Range dialog, you can specify which pages to print. For example, to skip the first page, select the type Some Pages, enter the value 2 in the field From Page, and select Relative To The Beginning. In the field To Page, enter the value 1 and select Relative To The End.

With the "Minimum number of pages of the document", you can ensure that only documents corresponding to this setting are printed.
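
The relative page specification can be made concrete with a small Python sketch. It assumes that the value 1 relative to the end means the last page, which is consistent with the example of skipping the first page; the function and parameter names are hypothetical.

```python
def resolve_pages(total, from_page, from_end, to_page, to_end, min_pages=1):
    """Return the 1-based page numbers to print, or [] if nothing qualifies.

    from_end / to_end: False = relative to the beginning, True = relative to the end.
    """
    if total < min_pages:
        return []
    first = total - from_page + 1 if from_end else from_page
    last = total - to_page + 1 if to_end else to_page
    first, last = max(first, 1), min(last, total)
    return list(range(first, last + 1))
```

For a five-page document, From Page 2 (relative to the beginning) and To Page 1 (relative to the end) result in pages 2 to 5. For a one-page document, the same setting results in no pages at all.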

5.9.2.2 Automatically rotate pages

Determines whether the pages to be printed should be rotated automatically. This is necessary, for example, if some documents/pages to be processed are in portrait and some in landscape format. The default setting is set in the program options.

5.10 Notifications

Here you can configure how the processing status of a file is reported.

5.10.1 Status notification via email

Select the case in which a notification is to be made:

on success
on errors
if filter criteria are not fulfilled

and enter one or more (comma-separated) recipients of the status message.

5.10.2 Status notification by an acoustic signal

Here you can optionally select a wave file that you want to play after successful processing.

6 Toolbar of the log area

The toolbar contains the following buttons:

6.1 Filter log entries

By using the log filter, the displayed log entries can be restricted. The filter criterion may be, for example, the date of receipt, limited to a period within the last week. The comparison operator is context-sensitive, that is, it always matches the selected filter criterion. By clicking the button with the plus or minus symbol, you can add further filters (linked with a logical AND) or remove the respective filter.

6.2 Create Excel report

This button generates an overview of all processing operations as an Excel report. The report contains the following sheets:

Overview
Successful processing
Errors
No match
No text

The report allows filtering and custom sorting of the processing data. In the program options, you can configure periodic sending of the current report.

7 Log List

The log list contains information about all processed files and is divided into the four tabs: Successful Processed, Error, No Match, No Text.

Depending on the active tab page, the context menu contains the following entries:

Open saved file
(opens a saved file with the associated program)
Select saved file in Explorer
(opens the Explorer and selects the saved file)
Copy log entry to clipboard
(copies relevant data of the selected entry to the clipboard)
Clear list
(removes all entries from the log list)

8 Status bar

8.1 Profiles

Shows the total number of profiles and the number of active profiles.

8.2 Log filter

Shows the number of entries that do not fulfill the conditions of the log filter, for example, entries that are too old.

8.3 Status

Displays the following status information:

the number of seconds until the next check for new PDF files
information about the current check (the number of files already checked)
information on current processing (the number of files already processed)
other information about the current process

9 Program Options

9.1 General

Here you can define settings such as the language of the program interface and the startup behavior as well as a centrally managed profile file.

As soon as a centrally managed profile file has been specified, no further changes can be made to the individual profiles - even deletion and recreation are blocked. Changes should only be made by the administrator. The administrator edits the profile file locally (i.e., in the application data directory "%AppData%\Automatic PDF Processor") and then copies it to the designated network folder (provided that the administrator's own installation does not use a centrally managed profile file).

The centrally managed profile file is reloaded by the individual processes before each run and the profile list in the main window is updated at regular intervals (every 5 minutes). Updating the profile list in the main window can be forced by minimizing the application to the notification area of the taskbar and then restoring it.

9.2 Processing

9.2.1 Interval of the check for new files to be processed

Determines at what interval (in seconds) Automatic PDF Processor checks whether new files have been stored.

9.2.2 Pause between processing multiple new files

Here you can set whether the program should pause for a certain time after processing a file. This option is relevant for the further processing of saved files.
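
The interplay of the check interval and the pause between files can be pictured as a simple polling loop. The following Python sketch only illustrates the principle; the function names, the default interval of 10 seconds and the rounds parameter (which merely limits the loop for testing) are assumptions.

```python
import time
from pathlib import Path


def find_new_pdfs(folder, seen, recursive=False):
    """Return PDF files in folder that are not yet in `seen`, and remember them."""
    root = Path(folder)
    candidates = root.rglob("*") if recursive else root.iterdir()
    new = sorted(
        p for p in candidates
        if p.is_file() and p.suffix.lower() == ".pdf" and p not in seen
    )
    seen.update(new)
    return new


def watch(folder, handle, interval_seconds=10, pause_seconds=0, rounds=None):
    """Check folder every interval_seconds; pause after each processed file."""
    seen = set()
    done = 0
    while rounds is None or done < rounds:
        for pdf in find_new_pdfs(folder, seen):
            handle(pdf)
            time.sleep(pause_seconds)  # gives follow-up programs time to finish
        time.sleep(interval_seconds)
        done += 1
```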

9.2.3 Log errors for the generation of error reports

Determines whether errors are recorded. If problems occur, this option must be activated because detailed information about the error is necessary to generate an error report. Otherwise, this option can be disabled to reduce write accesses to the hard disk.

9.2.4 Report general errors by email

Determines whether general errors, such as missing write permissions, are reported by email.

9.2.5 Move files with faulty processing to

Enables the specification of a general directory for files that could not be processed properly and therefore have to be checked or processed manually.

9.2.6 Move files, which do not fulfill filter criteria, to

Allows you to specify a common directory for files that do not meet all filter criteria.

9.2.7 Move files without text to

Allows you to specify a common directory for files that do not contain text (e.g., scanned files without OCR processing). Be sure to read section 1.0 (2nd paragraph) for more information.

9.2.8 Send files without text by email

Allows you to specify a mailbox for documents that do not contain text (e.g., scanned files without OCR processing). Be sure to read section 1.0 (2nd paragraph) for more information.

9.2.9 Process files only in the following time

This option allows files to be processed automatically only at the times specified here.
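
A time restriction of this kind boils down to a window check like the one in the following Python sketch. Whether the program supports windows that extend past midnight is not stated here; the sketch handles that case as an assumption, and the function name is hypothetical.

```python
from datetime import time


def within_window(now, start, end):
    """True if now lies in [start, end); the window may wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end
```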

9.3 Email account for sending

Here, the email account for sending notifications is set. You can use the default Outlook email account or provide custom settings. The account data specified here will be used to send status notifications to the recipients listed in the respective profile settings.

9.4 Excel report by email

This option allows you to define an interval at which an automatically generated Excel report on the processing operations of a specific period is sent to the specified email address.

9.5 Backups

If this option is enabled, the program periodically creates a complete backup of the application data as a ZIP archive in the specified folder. The backups can be used to restore application data to an earlier state via Menu --> Other --> Restore application data.

9.6 Print settings

Here you can define as a default setting whether pages to be printed should be rotated automatically. This setting can be overridden in the profile settings.

9.7 Other

9.7.1 Maximum number of entries in the log area

Defines the maximum number of entries available in the log area. The default setting is 25,000 entries.