How to Clean a CSV File During a Product Import

The foundations of connector creation were covered in the previous chapter (see How to Create a New Connector). In this hands-on exercise, we will create our own specific connector.

To stay focused on the main concepts, we will implement the simplest possible connector and avoid using too many existing elements.

The use case is to clean the following CSV file when importing products:

sku;name
uselesspart-sku-1;my full name 1
uselesspart-sku-2;my full name 2
uselesspart-sku-3;my full name 3

Here, we want to remove the prefix uselesspart- from the sku before running a classic import.

We assume that we’re using a standard edition with the icecat_demo_dev data set, in which sku and name already exist as real attributes of the PIM.

Note

The code for this cookbook entry is available in the src directory of the pim-docs repository. You can clone pim-docs (https://github.com/akeneo/pim-docs) and use a symlink to make the Acme bundle available in your own src/ directory.

Create the Connector

Create a new bundle:

<?php

namespace Acme\Bundle\CsvCleanerConnectorBundle;

use Symfony\Component\HttpKernel\Bundle\Bundle;

class AcmeCsvCleanerConnectorBundle extends Bundle
{
}

Register the bundle in AppKernel:

public function registerBundles()
{
    $bundles = [
        // ...
        new Acme\Bundle\CsvCleanerConnectorBundle\AcmeCsvCleanerConnectorBundle(),
        // ...
    ];

    // ...

    return $bundles;
}

Configure the Job

Configure a job in Resources/config/batch_jobs.yml:

connector:
    name: Cleaned Csv Connector
    jobs:
        cleaner_product_import:
            title: acme_cleaner_connector.jobs.cleaner_product_import.title
            type:  import
            steps:
                import:
                    title: acme_cleaner_connector.jobs.cleaner_product_import.import.title
                    services:
                        reader:    pim_connector.reader.file.csv_product
                        processor: acme_csvcleanerconnector.processor.denormalization.product.flat
                        writer:    pim_connector.writer.doctrine.product

Here we create an import job which contains a single step: import.

The default step is Akeneo\Bundle\BatchBundle\Step\ItemStep.

An item step is configured with three elements: a reader, a processor and a writer.

Here, we’ll use a custom processor service, named acme_csvcleanerconnector.processor.denormalization.product.flat, but we’ll continue to use the default reader and writer.
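To make the three roles concrete, here is a minimal, self-contained sketch of how a reader, a processor and a writer cooperate. It is only an illustration: runItemStep and the three closures are hypothetical names, this is not the code of ItemStep, and the real step has more responsibilities than this loop shows.

```php
<?php

// Hypothetical stand-in for an item step: read each item, process it,
// then hand the processed items to the writer.
// This is NOT the Akeneo ItemStep implementation.
function runItemStep(callable $reader, callable $processor, callable $writer)
{
    $processed = [];

    // The reader returns one item per call and null once the source is exhausted.
    while (null !== ($item = $reader())) {
        $processed[] = $processor($item);
    }

    // The writer receives the processed items as a single batch.
    $writer($processed);
}

// Reader: returns the rows of the sample file one by one.
$rows = [
    ['sku' => 'uselesspart-sku-1', 'name' => 'my full name 1'],
    ['sku' => 'uselesspart-sku-2', 'name' => 'my full name 2'],
];
$reader = function () use (&$rows) {
    // array_shift() returns null when the array is empty.
    return array_shift($rows);
};

// Processor: transforms a single item (here, it cleans the sku).
$processor = function (array $item) {
    $item['sku'] = str_replace('uselesspart-', '', $item['sku']);

    return $item;
};

// Writer: collects the batch in memory instead of saving it to a database.
$written = [];
$writer = function (array $items) use (&$written) {
    $written = array_merge($written, $items);
};

runItemStep($reader, $processor, $writer);
```

In this cookbook, only the processor part is customized; the reader and the writer stay the ones shipped with the PIM.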

Important

We strongly advise reusing as many of the existing pieces as possible: this ensures that all business rules and validation are applied.

Configure the Processor

In fact, we’re using the default processor class, but we create a new service in Resources/config/processors.yml to change the injected array converter (pim_connector.array_converter.flat.product is replaced by acme_csvcleanerconnector.array_converter.flat.product). All other injected services remain the same.

services:
    acme_csvcleanerconnector.processor.denormalization.product.flat:
        class: '%pim_connector.processor.denormalization.product.class%'
        arguments:
            - '@acme_csvcleanerconnector.array_converter.flat.product'
            - '@pim_catalog.repository.product'
            - '@pim_catalog.builder.product'
            - '@pim_catalog.updater.product'
            - '@pim_catalog.validator.product'
            - '@akeneo_storage_utils.doctrine.object_detacher'
            - '@pim_catalog.comparator.filter.product'

Create the ArrayConverter

The purpose of the array converter is to transform the array provided by the reader into the standard array format (see Understand the CSV Product Import).

<?php

namespace Acme\Bundle\CsvCleanerConnectorBundle\ArrayConverter;

use Pim\Component\Connector\ArrayConverter\Flat\ProductStandardConverter as BaseProductConverter;
use Pim\Component\Connector\ArrayConverter\StandardArrayConverterInterface;

class ProductStandardConverter implements StandardArrayConverterInterface
{
    protected $baseProductConverter;

    public function __construct(BaseProductConverter $baseProductConverter)
    {
        $this->baseProductConverter = $baseProductConverter;
    }

    public function convert(array $item, array $options = [])
    {
        // [
        //     'sku'  => 'uselesspart-sku-1',
        //     'name' => 'my full name 1',
        // ]

        // clean sku
        $item['sku'] = str_replace('uselesspart-', '', $item['sku']);

        // [
        //     'sku'  => 'sku-1',
        //     'name' => 'my full name 1',
        // ]

        // use the base converter to convert to the standard format
        $convertedItem = $this->baseProductConverter->convert($item, $options);

        // [
        //     'sku' => [
        //         'data'   => 'sku-1',
        //         'locale' => NULL,
        //         'scope'  => NULL
        //     ],
        //     'name' => [
        //         'data'   => 'my full name 1',
        //         'locale' => NULL,
        //         'scope'  => NULL
        //     ],
        //     'enabled' => true
        // ]

        return $convertedItem;
    }
}
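Note that str_replace removes every occurrence of uselesspart-, wherever it appears in the sku. If only a leading prefix should be stripped, a stricter variant could look like the sketch below. removePrefix is a hypothetical helper introduced for illustration; it is not part of the PIM.

```php
<?php

// Hypothetical helper, not part of the PIM:
// strips $prefix from $value only when $value starts with it.
function removePrefix($value, $prefix)
{
    if ('' !== $prefix && 0 === strpos($value, $prefix)) {
        // The cast keeps the result a string on old PHP versions,
        // where substr() could return false for an empty remainder.
        return (string) substr($value, strlen($prefix));
    }

    return $value;
}

// The prefix is removed only at the start of the value.
$cleaned   = removePrefix('uselesspart-sku-1', 'uselesspart-');
$untouched = removePrefix('sku-uselesspart-1', 'uselesspart-');
```

In the convert method above, the call would then read $item['sku'] = removePrefix($item['sku'], 'uselesspart-');.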

Then we declare this new array converter service in Resources/config/array_converters.yml:

parameters:
    acme_csvcleanerconnector.array_converter.flat.product.class: Acme\Bundle\CsvCleanerConnectorBundle\ArrayConverter\ProductStandardConverter

services:
    acme_csvcleanerconnector.array_converter.flat.product:
        class: '%acme_csvcleanerconnector.array_converter.flat.product.class%'
        arguments:
            - '@pim_connector.array_converter.flat.product'

Note

Note that we use the Decorator Pattern here, by injecting the default array converter into our own class.

The big advantage of this practice is that it decouples your custom code from the PIM code. For instance, if an extra dependency is later injected into the constructor of the default array converter, your code will not be impacted.

Finally, we add the following extension to load the service configuration files:
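The decoupling can be illustrated with a small, self-contained sketch. ConverterInterface, DefaultConverter and CleaningConverter are hypothetical stand-ins for the PIM interface, the default array converter and our own class; they only mimic the shape of the real ones.

```php
<?php

// Hypothetical stand-ins, used only to illustrate the pattern.
interface ConverterInterface
{
    public function convert(array $item);
}

class DefaultConverter implements ConverterInterface
{
    private $defaultEnabled;

    // If this constructor gains new dependencies, only the code that
    // builds DefaultConverter changes, not the decorator below.
    public function __construct($defaultEnabled = true)
    {
        $this->defaultEnabled = $defaultEnabled;
    }

    public function convert(array $item)
    {
        // Adds an 'enabled' key unless the item already has one.
        return $item + ['enabled' => $this->defaultEnabled];
    }
}

class CleaningConverter implements ConverterInterface
{
    private $inner;

    // The decorator receives an already built converter;
    // it never needs to know how that converter is constructed.
    public function __construct(ConverterInterface $inner)
    {
        $this->inner = $inner;
    }

    public function convert(array $item)
    {
        // Clean first, then delegate to the decorated converter.
        $item['sku'] = str_replace('uselesspart-', '', $item['sku']);

        return $this->inner->convert($item);
    }
}

$converter = new CleaningConverter(new DefaultConverter());
$result = $converter->convert(['sku' => 'uselesspart-sku-1', 'name' => 'my full name 1']);
```

Had CleaningConverter extended DefaultConverter instead, any change to the parent constructor would have required a matching change in the subclass.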

<?php

namespace Acme\Bundle\CsvCleanerConnectorBundle\DependencyInjection;

use Symfony\Component\HttpKernel\DependencyInjection\Extension;
use Symfony\Component\DependencyInjection\ContainerBuilder;
use Symfony\Component\DependencyInjection\Loader;
use Symfony\Component\Config\FileLocator;

class AcmeCsvCleanerConnectorExtension extends Extension
{
    public function load(array $configs, ContainerBuilder $container)
    {
        $loader = new Loader\YamlFileLoader($container, new FileLocator(__DIR__.'/../Resources/config'));
        $loader->load('array_converters.yml');
        $loader->load('processors.yml');
    }
}

Use the New Connector

Now, if you refresh the cache, the new import job can be found under Extract > Import profiles > Create import profile.

You can run the job from the UI, or you can use the following command, replacing my_job_code with the code of the import profile you created:

php app/console akeneo:batch:job my_job_code