tainacan/docs/importer-flow.md

164 lines
6.6 KiB
Markdown
Raw Normal View History

2018-03-30 22:40:21 +00:00
# The Importer flow
2018-04-01 14:34:05 +00:00
This page describes how Tainacan importer works and is a reference to write your own importer.
2018-03-30 22:40:21 +00:00
## Initializing a new importer
When the user starts a new import process, he/she first choose which import to use.
Once the Importer is chosen, the first thing that happens when is the creation of a new instance of the chosen Importer. This fires the `__construct()` method.
## Choose a collection
After choosing the importer, user will be given the choice to choose the destination collection.
If called from inside a collection, this step is skipped and the current collection is set as destination.
If the importer has the attribute `$import_structure_and_mapping` set to `true`, and the importer is called from repository level,
they will also be able to choose to create a new collection.
In that cases, a new collection is created, and the importer must implement a method called `create_fields_and_mapping()`, which, as the name says, will create all collections metadata and set the mapping for the importer.
2018-03-30 22:40:21 +00:00
## Set options
2018-03-30 22:40:21 +00:00
Now its time to set the importer options. Each importer may have its own set of options, that will be used during the import process. It could be anything, from the delimiter character in a CSV importer, to an API key for a importer that fetches something from an API.
2018-03-30 22:40:21 +00:00
Importer classes should declare the default values for its options in the `__construct()` method, by calling `set_default_options()`.
2018-03-30 22:40:21 +00:00
```
namespace Tainacan\Importer;
2018-03-30 22:40:21 +00:00
class MyImporter extends Importer {
function __construct() {
parent::construct();
$this->set_default_options(['
'foo' => 'bar'
']);
}
}
```
2018-03-30 22:40:21 +00:00
The Importer classes must also implement the `options_form` method, in which it will echo the form that the user will interact to set the options.
2018-03-30 22:40:21 +00:00
```
function options_form() {
?>
<label>Foo</label>
<input type="text" name="options[foo]" value="<?php echo $this->get_options('foo'); ?>" />
<?php
}
```
TODO: better html reference and validate_options() method.
## Fetch source
Next, users will choose the source. Each importer declares the kind of sources they accpet: URL, file, or both.
2018-03-30 22:40:21 +00:00
Importers do this by calling the `add_import_method()` and `remove_import_method` in its construction. By default, importers will support only `file` method. Here is an example of what an importer thar accepts only URLs should do:
```
function __construct() {
parent::construct();
$this->add_import_method('url');
$this->remove_import_method('file');
}
```
If the Importer accepts the `file` method, user will be prompted with a file input to upload a file. The file will then be saved and will be accessible via the `$this->tmp_file` attribute.
If the importer accepts the `url` method, user will be prompted with an text input to enter an URL.By default, the importer will fetch any given URL to a file. However, each importer may override the `fetch_from_remote()` method and do whatever it want to create the file. For example, it could make several paged requests.
From that point forward, the importer will behave just as if it was using the file method.
## Mapping
At this point the user is presented with an interface to map the metadata from the source to the metadata present in the chosen collection.
2018-03-30 22:40:21 +00:00
The Importer class must implement the `get_fields()` method, that will return an array of the metadata found in the source. It can either return an hard coded array or an array that is red from the source file. For example, an importer that fetches data from an api knows beforehand what are the metadata the api will return, however, an importer that reads from a csv, may want to return whatever is found in the first line of the array.
2018-03-30 22:40:21 +00:00
```
// Example 1: returns a hard coded set of metadata
2018-03-30 22:40:21 +00:00
function get_fields() {
return [
'title',
'description'
'date',
];
}
// Example 2: returns the columns of the first line of a CSV file
public function get_fields(){
$file = new \SplFileObject( $this->tmp_file, 'r' );
$file->seek(0 );
return $file->fgetcsv( $this->get_option('delimiter') );
}
```
## Importing Collection Structure and Mapping
Alternatively, an importer may also create all the metadata and mappings from the source. In that cases, the user does not have to map anything.
2018-03-30 22:40:21 +00:00
First thing an Importer must do to accomplish this is to declare it does so in the `construct()`, by setting `$import_structure_and_mapping` to `true`.
```
function __construct() {
parent::construct();
$this->import_structure_and_mapping = true;
}
```
Second, the importer must implement the `create_fields_and_mapping()` to populate the collection with metadata and set the mapping.
2018-03-30 22:40:21 +00:00
In order to do this, the Importer will use Tainacan internal API to create the metadata. Please refer to the documentation (TODO: write this documentation).
2018-03-30 22:40:21 +00:00
This method must be aware that even a brand new collection comes with two core metadata (title and description), and use them.
2018-03-30 22:40:21 +00:00
Again, this metadata can come from the file, the URL or may be hardcoded in the function.
2018-03-30 22:40:21 +00:00
Example
```
function create_fields_and_mapping() {
$metadata_repository = \Tainacan\Repositories\Metadata::get_instance();
2018-03-30 22:40:21 +00:00
$newMetadatum1 = new \Tainacan\Entities\Metadatum();
$newMetadatum1->set_name = 'New Metadatum';
$newMetadatum1->set_field_type = 'Tainacan\Metadatum_Types\Text';
$newMetadatum1->set_collection($this->collection);
2018-03-30 22:40:21 +00:00
$newMetadatum1->validate(); // there is no user input here, so we can be sure it will validate.
$newMetadatum1 = $metadata_repository->insert($newMetadatum1);
2018-03-30 22:40:21 +00:00
$source_fields = $this->get_fields();
$this->set_mapping([
$newMetadatum1->get_id() => $source_fields[0]
2018-03-30 22:40:21 +00:00
]);
}
TODO: helpers and explanation on how to fetch the core metadata and map them
2018-03-30 22:40:21 +00:00
```
2018-03-30 22:40:21 +00:00
## Run importer
Finally, everything is ready. The importer runs.
The `run()` method is called, the importer runs a step of the import process, and returns the number of items imported so far. The client (browser) will repeat this request as many times as necessary to complete the process and will give feedback to the user about the progress.
2018-04-01 14:34:05 +00:00
In order to allow this, the importer must implement the `get_total_items_from_source()` method, which will inform the total number of items present in the source.
All the steps and insertion are handled by the Importer super class. The importer class only have to implement one method (`process_item()`) to handle one single item. It will receive the index of this item and it must return the item in as an array, where each key is the identifier of the source metadatum (the same used in the mapping array), and the values are each metadatum value.
2018-03-30 22:40:21 +00:00
In the end, a report is generated with all the logs generated in the process.