How to Clean a CSV File During a Product Import

The foundations of connector creation are covered in the previous chapter (see How to Create a New Connector). In this hands-on exercise, we will create our own connector.

To stay focused on the main concepts, we will implement the simplest possible connector and avoid relying on too many existing elements.

The use case is to clean the following CSV file when importing products:

sku;name
uselesspart-sku-1;my full name 1
uselesspart-sku-2;my full name 2
uselesspart-sku-3;my full name 3

Here, we want to remove the prefix uselesspart- from the sku before running a classic import.
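
As a rough illustration of the intended transformation, independent of the PIM, the sketch below splits the sample rows and strips the prefix from the sku column. The helper name cleanSku is hypothetical and not part of Akeneo. It assumes PHP 7 or later and removes the prefix only when it appears at the start of the value, which is slightly stricter than a plain str_replace.

```php
<?php

/**
 * Hypothetical helper (not part of the PIM): removes the given prefix
 * from a sku, but only when the sku actually starts with it.
 */
function cleanSku(string $sku, string $prefix = 'uselesspart-'): string
{
    if (strpos($sku, $prefix) === 0) {
        return substr($sku, strlen($prefix));
    }

    return $sku;
}

$lines = [
    'sku;name',
    'uselesspart-sku-1;my full name 1',
    'uselesspart-sku-2;my full name 2',
];

// The first line holds the column names, the others hold the values.
// explode() is enough for this unquoted sample; a real reader also handles quoting.
$header = explode(';', array_shift($lines));

$rows = [];
foreach ($lines as $line) {
    $row = array_combine($header, explode(';', $line));
    $row['sku'] = cleanSku($row['sku']);
    $rows[] = $row;
}
```

With this sample, the first row becomes sku-1 with its name left unchanged. A sku that merely contains the prefix somewhere in the middle is left as-is.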

We assume that we’re using a standard edition with the icecat_demo_dev data set, in which sku and name already exist as attributes of the PIM.

Note

The code in this cookbook entry is available in the src directory: you can clone pim-docs (https://github.com/akeneo/pim-docs) and use a symlink to make the Acme bundle available in src/. The same approach can be applied to the XLSX product import.

Create the Connector

Create a new bundle:

<?php

namespace Acme\Bundle\CsvCleanerConnectorBundle;

use Symfony\Component\HttpKernel\Bundle\Bundle;

class AcmeCsvCleanerConnectorBundle extends Bundle
{
}

Register the bundle in AppKernel:

public function registerBundles()
{
    // ...
        new Acme\Bundle\CsvCleanerConnectorBundle\AcmeCsvCleanerConnectorBundle(),
    // ...
}

Create the ArrayConverter

The purpose of the array converter is to transform the array provided by the reader into the standard array format (see Understanding the Product Import).

<?php

namespace Acme\Bundle\CsvCleanerConnectorBundle\ArrayConverter\StandardToFlat;

use Pim\Component\Connector\ArrayConverter\ArrayConverterInterface;

class Product implements ArrayConverterInterface
{
    /** @var ArrayConverterInterface */
    protected $productConverter;

    /**
     * @param ArrayConverterInterface $productConverter
     */
    public function __construct(ArrayConverterInterface $productConverter)
    {
        $this->productConverter = $productConverter;
    }

    /**
     * {@inheritdoc}
     */
    public function convert(array $item, array $options = [])
    {
        // Clean the sku before delegating to the default converter
        $item['sku'] = str_replace('uselesspart-', '', $item['sku']);

        return $this->productConverter->convert($item, $options);
    }
}

Then we declare this new array converter service in array_converters.yml.

parameters:
    acme_csvcleanerconnector.array_converter.flat.product.class: 'Acme\Bundle\CsvCleanerConnectorBundle\ArrayConverter\StandardToFlat\Product'

services:
    acme_csvcleanerconnector.array_converter.flat.product:
        class: '%acme_csvcleanerconnector.array_converter.flat.product.class%'
        arguments:
            - '@pim_connector.array_converter.flat_to_standard.product_delocalized'

Note

Note that we use the Decorator pattern here by injecting the default array converter into our own class.

The big advantage of this practice is that it decouples your custom code from the PIM code. For instance, if an extra dependency is later injected into the constructor of the default array converter, your code will not be impacted.
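
To make the decoration concrete outside of the PIM, here is a minimal, self-contained sketch. ConverterInterface, FakeInnerConverter and CleaningConverter are stand-ins invented for this illustration: the interface only mirrors the convert(array $item, array $options = []) signature used above, and the fake plays the role of the default converter.

```php
<?php

// Stand-in for the PIM's ArrayConverterInterface (assumption: only convert() matters here).
interface ConverterInterface
{
    public function convert(array $item, array $options = []);
}

// Hypothetical inner converter: records what it receives and returns it untouched.
class FakeInnerConverter implements ConverterInterface
{
    public $received = [];

    public function convert(array $item, array $options = [])
    {
        $this->received[] = $item;

        return $item;
    }
}

// Decorator: cleans the sku, then delegates the real work to the wrapped converter.
class CleaningConverter implements ConverterInterface
{
    private $inner;

    public function __construct(ConverterInterface $inner)
    {
        $this->inner = $inner;
    }

    public function convert(array $item, array $options = [])
    {
        $item['sku'] = str_replace('uselesspart-', '', $item['sku']);

        return $this->inner->convert($item, $options);
    }
}

$inner = new FakeInnerConverter();
$converter = new CleaningConverter($inner);
$result = $converter->convert(['sku' => 'uselesspart-sku-1', 'name' => 'my full name 1']);
```

Because the decorator depends only on the interface, the wrapped converter can gain constructor arguments without this class changing, and a test can swap in a fake as done here.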

Finally, we add the following extension to load the service configuration files:

<?php

namespace Acme\Bundle\CsvCleanerConnectorBundle\DependencyInjection;

use Symfony\Component\HttpKernel\DependencyInjection\Extension;
use Symfony\Component\DependencyInjection\ContainerBuilder;
use Symfony\Component\DependencyInjection\Loader;
use Symfony\Component\Config\FileLocator;

class AcmeCsvCleanerConnectorExtension extends Extension
{
    public function load(array $configs, ContainerBuilder $container)
    {
        $loader = new Loader\YamlFileLoader($container, new FileLocator(__DIR__.'/../Resources/config'));
        $loader->load('array_converters.yml');
        $loader->load('jobs.yml');
        $loader->load('job_constraints.yml');
        $loader->load('job_defaults.yml');
        $loader->load('job_parameters.yml');
        $loader->load('readers.yml');
        $loader->load('steps.yml');
    }
}

Configure the Job

A Job is launched with a JobParameters instance, which holds its runtime parameters. We also need a ConstraintCollectionProviderInterface, which provides the job's form constraints, and a DefaultValuesProviderInterface, which provides the job's default values.

First we need to define the reader service with the new array converter:

services:
    acme_csvcleanerconnector.reader.file.csv_product:
        class: '%pim_connector.reader.file.csv_product.class%'
        arguments:
            - '@pim_connector.reader.file.csv_iterator_factory'
            - '@acme_csvcleanerconnector.array_converter.flat.product'
            - '@pim_connector.reader.file.media_path_transformer'
            - []

Then we need to create our custom step service definition to use our new reader:

services:
    acme_csvcleanerconnector.step.csv_product.import:
        class: '%pim_connector.step.item_step.class%'
        arguments:
            - 'import'
            - '@event_dispatcher'
            - '@akeneo_batch.job_repository'
            - '@acme_csvcleanerconnector.reader.file.csv_product'
            - '@pim_connector.processor.denormalization.product'
            - '@pim_connector.writer.database.product'

Finally we need to create a new job configuration that uses our custom step:

parameters:
    acme_csvcleanerconnector.job_name.csv_product_import_cleaner: 'csv_product_import_cleaner'

services:
    acme_csvcleanerconnector.job.csv_product_import_cleaner:
        class: '%pim_connector.job.simple_job.class%'
        arguments:
            - '%acme_csvcleanerconnector.job_name.csv_product_import_cleaner%'
            - '@event_dispatcher'
            - '@akeneo_batch.job_repository'
            -
                - '@pim_connector.step.charset_validator'
                - '@acme_csvcleanerconnector.step.csv_product.import'
        tags:
            - { name: akeneo_batch.job, connector: '%pim_connector.connector_name.csv%', type: '%pim_connector.job.import_type%' }

At this point, the job can be run from the command line, but it cannot yet be configured via the UI. We need to write a service providing the form type configuration for each parameter of our JobParameters instance:

services:
    acme_csvcleanerconnector.job_parameters.form_configuration_provider.product_csv_import_cleaner:
        class: '%pim_import_export.job_parameters.form_configuration_provider.product_csv_import.class%'
        arguments:
            - '@pim_import_export.job_parameters.form_configuration_provider.simple_csv_import'
            -
                - 'csv_product_import_cleaner'
            - '%pim_catalog.localization.decimal_separators%'
            - '%pim_catalog.localization.date_formats%'
        tags:
            - { name: pim_import_export.job_parameters.form_configuration_provider }

The constraint collection provider is declared in the same way:

services:
    acme_csvcleanerconnector.job.job_parameters.constraint_collection_provider.product_csv_import:
        class: '%pim_connector.job.job_parameters.constraint_collection_provider.product_csv_import.class%'
        arguments:
            - '@pim_connector.job.job_parameters.constraint_collection_provider.simple_csv_import'
            -
                - 'csv_product_import_cleaner'
        tags:
            - { name: akeneo_batch.job.job_parameters.constraint_collection_provider }

So is the default values provider:

services:
    acme_csvcleanerconnector.job.job_parameters.default_values_provider.product_csv_import:
        class: '%pim_connector.job.job_parameters.default_values_provider.product_csv_import.class%'
        arguments:
            - '@pim_connector.job.job_parameters.default_values_provider.simple_csv_import'
            -
                - 'csv_product_import_cleaner'
        tags:
            - { name: akeneo_batch.job.job_parameters.default_values_provider }

For further information, see How to Create a New Connector.

As with jobs.yml, the job_parameters.yml service file must be loaded in our AcmeCsvCleanerConnectorExtension.

Translate Job and Step labels in the UI

Behind the scenes, the service Pim\Bundle\ImportExportBundle\JobLabel\TranslatedLabelProvider provides translated Job and Step labels to be used in the UI.

This service uses the following conventions:
  • for a job label, given a $jobName, “batch_jobs.$jobName.label”
  • for a step label, given a $jobName and a $stepName, “batch_jobs.$jobName.$stepName.label”
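
The two conventions above can be sketched as a pair of small functions. These helpers are hypothetical, written only to show how the keys are composed, and are not methods of TranslatedLabelProvider. The job name and the import step name are the ones declared earlier in this cookbook; the validation step name is an assumption about how the charset validator step is named.

```php
<?php

// Hypothetical helpers illustrating the key conventions (not the PIM's own API).
function jobLabelKey(string $jobName): string
{
    return sprintf('batch_jobs.%s.label', $jobName);
}

function stepLabelKey(string $jobName, string $stepName): string
{
    return sprintf('batch_jobs.%s.%s.label', $jobName, $stepName);
}

$jobKey = jobLabelKey('csv_product_import_cleaner');
// → 'batch_jobs.csv_product_import_cleaner.label'

$importStepKey = stepLabelKey('csv_product_import_cleaner', 'import');
// → 'batch_jobs.csv_product_import_cleaner.import.label'

// Assumption: the charset validator step is named 'validation'.
$validationStepKey = stepLabelKey('csv_product_import_cleaner', 'validation');
// → 'batch_jobs.csv_product_import_cleaner.validation.label'
```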

Create a file Resources/translations/messages.en.yml in our bundle to translate the label keys:

batch_jobs:
    csv_product_import_cleaner:
        label: Product Import Cleaned Csv
        import.label: Product import cleaned
        validation.label:  File encoding validation

Use the new Connector

Now, if you refresh the cache, the new import job can be found under Extract > Import profiles > Create import profile.

You can run the job from the UI, or you can use the following command, where my_job_code is the code of the import profile you created:

php app/console akeneo:batch:job my_job_code