Instance manager¶
Mork comes with a built-in instance manager, which manages the whole instance load-unload lifecycle, and provides a way to classify instances based on their properties.
Instance manager modes¶
The instance manager is configured using properties with the prefix instance.*. For example, the default instance path is configured using the property instance.path.default.
The default keyword can be replaced with an experiment name, overriding the instance path for each user-defined experiment if necessary.
The value of the instance.path property can be interpreted in multiple ways, depending on the path type.
Path is a "normal" file¶
In the simplest case, the path represents a single file. The instance manager asks the user-implemented InstanceImporter to load the instance from the file, and runs the experiment using only this file.
Path is a compressed file¶
If the path is a compressed file, the instance manager decompresses each file in memory and obtains an input stream over it, avoiding disk I/O. This is useful when the instances are compressed to save disk space, as decompression only happens in memory, when required. From the user's point of view this is transparent: the same method of the InstanceImporter is called regardless of whether the file is compressed. If the compressed file contains multiple files or directories, they are recursively enumerated.
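To illustrate the idea of reading compressed instances without touching the disk, here is a minimal sketch using the JDK's java.util.zip classes. It is not Mork's actual implementation: the class name InMemoryZipSketch and both of its methods are hypothetical, and ZIP is just one compressed format chosen for the example. Each entry is exposed as a BufferedReader, which is the same kind of object an importer receives.

```java
import java.io.BufferedReader;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch, not part of Mork
final class InMemoryZipSketch {

    /** Returns entry name -> first line, for every file inside the ZIP stream. */
    static Map<String, String> firstLines(InputStream zipped) {
        Map<String, String> result = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(zipped, StandardCharsets.UTF_8)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // only files hold instance data
                }
                // The reader is deliberately not closed: closing it would close the whole ZIP stream
                BufferedReader reader = new BufferedReader(new InputStreamReader(zip, StandardCharsets.UTF_8));
                result.put(entry.getName(), reader.readLine());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    /** Demo helper: builds a ZIP archive in memory from name -> content pairs. */
    static byte[] zip(Map<String, String> files) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes, StandardCharsets.UTF_8)) {
            for (Map.Entry<String, String> file : files.entrySet()) {
                out.putNextEntry(new ZipEntry(file.getKey()));
                out.write(file.getValue().getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

In a real importer the reader would be handed to the parsing code instead of reading only the first line; the point of the sketch is that no temporary file is ever written.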
Path is a directory¶
If the path is a directory, the instance manager will enumerate all files in the directory, and load each file as an instance. If the directory contains directories, they will be recursively enumerated. If there are compressed files inside the directory, they will be processed as explained in the previous section.
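The recursive enumeration described above can be approximated with the JDK's Files.walk. This is only a sketch under assumptions: DirectoryEnumerationSketch and listInstanceFiles are hypothetical names, handling of compressed files is left out, and the sorting is there just to make the result reproducible. Files ending in .index are skipped, since this page states that index files are ignored during enumeration unless configured explicitly.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch, not part of Mork
final class DirectoryEnumerationSketch {

    /** Lists every regular file under root, at any depth, skipping *.index files. */
    static List<Path> listInstanceFiles(Path root) {
        try (Stream<Path> walk = Files.walk(root)) {
            return walk
                    .filter(Files::isRegularFile)
                    .filter(p -> !p.getFileName().toString().endsWith(".index"))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```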
Path is an index file¶
Index files are a way to group instances together without splitting them into different folders. They are especially useful when instances are extremely big, to avoid data duplication.
For example, if we have a folder called instances, which contains all the known instances in the literature for the problem we are currently solving, we may want to execute only a subset of this instance set, for example when running preliminary experiments. An easy way to do this is to create a new instance folder, called preliminary-instances, and copy the instances that we want to run there. However, this approach has the downside of duplicating the data, which can be a problem if the instances are huge.
As an alternative, we can create a file inside the instances folder, called preliminary.index, which contains the names of the instances that we want to run. The instance manager will read this file and load only the instances that are listed there, ignoring the rest.
Info
Note that index files are identified by the .index extension, and are ignored when enumerating instances unless they are explicitly configured as the instance.path property.
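As a rough illustration of how an index file could be resolved, the sketch below reads a .index file and maps each listed name to a path inside the same folder. The exact file format is an assumption made for this example (one instance name per line, relative to the folder that contains the index file, blank lines ignored), and IndexFileSketch and resolveIndex are hypothetical names, not Mork API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch, not part of Mork
final class IndexFileSketch {

    /** Resolves each non-blank line of the index file against the folder that contains it. */
    static List<Path> resolveIndex(Path indexFile) {
        Path folder = indexFile.toAbsolutePath().getParent();
        try (Stream<String> lines = Files.lines(indexFile)) {
            return lines
                    .map(String::trim)
                    .filter(line -> !line.isEmpty())
                    .map(name -> folder.resolve(name))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Only the listed files are returned, so the remaining instances in the folder are never opened and no data is copied.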
Instance loading¶
In order to load instances, users must extend the InstanceImporter class.
By default, the following template is generated by the Mork generator:
public class ExampleInstanceImporter extends InstanceImporter<ExampleInstance> {

    @Override
    public ExampleInstance importInstance(BufferedReader reader, String suggestedName) throws IOException {
        // Create and return instance object from file data
        // TODO parse all data from the given reader however I want

        // TIP: You may use a Scanner if you prefer it to a BufferedReader:
        // Scanner sc = new Scanner(reader);

        // Call instance constructor when we have parsed all the data, passing data as we see fit
        var instance = new ExampleInstance(suggestedName);

        // IMPORTANT! Remember that instance data must be immutable from this point
        return instance;
    }
}
One important thing to always take into account is that instances must be immutable after they have been loaded.
This means that under no circumstance should the instance data be modified after the importInstance method has finished. The reason for this is that immutable data can be safely and efficiently shared between threads, which is a key feature for Mork's parallel execution.
Tip
Note that if you want to precalculate any kind of data, it is perfectly valid to do it while loading the instance, before returning.
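The following sketch shows one way to combine both ideas: data is parsed and a derived value is precalculated while loading, and everything is frozen before the object is returned. It is self-contained on purpose, so it does not extend Mork's Instance or InstanceImporter classes. The input format (a first line with the node and edge counts, followed by one edge per line) and all names, such as GraphInstanceSketch, are assumptions made for the example.

```java
import java.io.BufferedReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical sketch, not part of Mork
final class GraphInstanceSketch {
    private final String name;
    private final int nNodes;
    private final List<List<Integer>> neighbors; // unmodifiable at both levels
    private final int maxDegree;                 // precalculated while loading

    private GraphInstanceSketch(String name, int nNodes, List<List<Integer>> neighbors, int maxDegree) {
        this.name = name;
        this.nNodes = nNodes;
        this.neighbors = neighbors;
        this.maxDegree = maxDegree;
    }

    /** Parses "n m" followed by m lines "u v" (assumed format, 0-based node ids). */
    static GraphInstanceSketch parse(BufferedReader reader, String suggestedName) {
        Scanner sc = new Scanner(reader);
        int n = sc.nextInt();
        int m = sc.nextInt();
        List<List<Integer>> adjacency = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            adjacency.add(new ArrayList<>());
        }
        for (int e = 0; e < m; e++) {
            int u = sc.nextInt();
            int v = sc.nextInt();
            adjacency.get(u).add(v);
            adjacency.get(v).add(u);
        }
        // Precalculate derived data and freeze every list before returning
        List<List<Integer>> frozen = new ArrayList<>();
        int maxDegree = 0;
        for (List<Integer> list : adjacency) {
            frozen.add(List.copyOf(list));
            maxDegree = Math.max(maxDegree, list.size());
        }
        return new GraphInstanceSketch(suggestedName, n, List.copyOf(frozen), maxDegree);
    }

    String name() { return name; }
    int nNodes() { return nNodes; }
    int maxDegree() { return maxDegree; }
    List<Integer> neighbors(int node) { return neighbors.get(node); }
}
```

Because List.copyOf returns unmodifiable lists, any later attempt to modify the adjacency data fails with an UnsupportedOperationException instead of silently corrupting data shared between threads.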
Advanced loading methods¶
In some cases, the same logical instance is split across multiple files, or the instance data follows a custom binary format and you already have a method that receives a File object and initializes the instance. In these advanced use cases, where more control over the loading process is needed, the second overload of the importInstance method can be used instead. For example:
public class ExampleInstanceImporter extends InstanceImporter<ExampleInstance> {

    @Override
    public ExampleInstance importInstance(BufferedReader reader, String suggestedName) throws IOException {
        throw new UnsupportedOperationException("Loading from a BufferedReader is not supported");
    }

    @Override
    public ExampleInstance importInstance(String path) {
        var file = new File(path);
        // call our custom method to load the instance
        ExampleInstance instance = customLoadMethod(file);
        return instance;
    }
}
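As an illustration of what a method like customLoadMethod might look like, the sketch below reads a made-up binary layout with the JDK's DataInputStream. The layout (a 4-byte count followed by that many 8-byte doubles), the class name BinaryLoadSketch and the writeWeights helper are all assumptions for this example. It returns a plain array rather than an ExampleInstance to stay self-contained.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch, not part of Mork
final class BinaryLoadSketch {

    /** Reads an int count followed by that many doubles (assumed binary layout). */
    static double[] customLoadMethod(File file) {
        try (DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(file)))) {
            int n = in.readInt();
            double[] weights = new double[n];
            for (int i = 0; i < n; i++) {
                weights[i] = in.readDouble();
            }
            return weights;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper: writes data in the layout that customLoadMethod expects. */
    static void writeWeights(File file, double... weights) {
        try (DataOutputStream out = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(file)))) {
            out.writeInt(weights.length);
            for (double w : weights) {
                out.writeDouble(w);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```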
Instance solve order¶
By default, instances are solved in lexicographic order, which means sorting instances by their filename.
If you want to solve instances in a specific order, for example from smaller to bigger instances, you can override the compareTo method of the Instance class. For example:
/**
 * Sort instances by their node count, from smaller to bigger.
 *
 * @param o the instance to compare to
 * @return a negative integer, zero, or a positive integer as this instance
 *         is less than, equal to, or greater than the specified instance.
 */
@Override
public int compareTo(Instance o) {
    // Assumes all instances are ExampleInstance objects with an Integer nNodes field
    var other = (ExampleInstance) o;
    return this.nNodes.compareTo(other.nNodes);
}
Tip
Instead of manually implementing the comparison logic, which can be error-prone, we recommend using the Comparator.comparing methods or the property's own compareTo method directly, as in the previous example.
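A minimal, self-contained sketch of the Comparator.comparing approach mentioned in the tip follows. SizedInstance is a hypothetical stand-in for a user-defined instance class; it does not extend Mork's Instance so that the example compiles on its own. Ties in node count are broken by name, which keeps the order deterministic.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not part of Mork
final class SolveOrderSketch {

    static final class SizedInstance implements Comparable<SizedInstance> {
        // Smaller instances first; equal sizes are ordered by name
        private static final Comparator<SizedInstance> ORDER =
                Comparator.comparingInt(SizedInstance::nNodes).thenComparing(SizedInstance::name);

        private final String name;
        private final int nNodes;

        SizedInstance(String name, int nNodes) {
            this.name = name;
            this.nNodes = nNodes;
        }

        String name() { return name; }
        int nNodes() { return nNodes; }

        @Override
        public int compareTo(SizedInstance other) {
            return ORDER.compare(this, other);
        }
    }

    /** Returns the instance names in the order in which they would be solved. */
    static List<String> sortedNames(List<SizedInstance> instances) {
        List<SizedInstance> copy = new ArrayList<>(instances);
        copy.sort(Comparator.naturalOrder());
        List<String> names = new ArrayList<>();
        for (SizedInstance instance : copy) {
            names.add(instance.name());
        }
        return names;
    }
}
```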
Instance unloading¶
By default, instances are validated and cached before the experiment starts running, which means that if they fit in memory, I/O is reduced to a minimum. If they do not fit into memory, they are loaded and unloaded as needed, which is suboptimal but works fine without user intervention.
This feature is called instance preloading, and while it is enabled by default, it can be disabled by setting instances.preload to false. The main advantages of instance preloading are that you make sure all instances can actually be loaded before the experiment starts, so any mistake is detected as early as possible, and that you avoid I/O during the experiment execution.
Moreover, preloading is required if a custom solve order is used.
However, if the instances are huge, you do not need to solve the instances in any given order, and you are sure that all instances are valid, you can safely disable this feature.
Tip
As a general rule, do not use automatic parallelization if instances are huge. The reason for this is that multiple threads can be solving different instances, and therefore Mork will be forced to keep multiple instances in memory at the same time, which can produce out-of-memory errors.