How to create an extension for The DataTank

Extending The DataTank's functionality without touching the core



tdtext

Have you, too, been dreaming of a world where anyone can write an extension for The DataTank and amaze the world of Open Data?

With tdtext, this becomes possible. Just as people can easily create CKAN extensions (ckanext, see what we did there?) with a minimum of documentation, you can create The DataTank extensions that plug into the core workflow. The DataTank extension system is built with developers in mind. Once The DataTank is installed, it should be a breeze to add scrapers, extra formatters, extra visualizations, strategies to read data, and so on.

Architecture

We have one singleton class called TdtextNotifier. This class is updated by tdt/core whenever something happens: the routes are loaded, the formatters are loaded, the documentation is updated, and so on. TdtextNotifier collects all updates and directs them to the registered tdtext classes. These classes can be added to the configuration. Which event goes to which class is decided by the interfaces each class implements.
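The dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the real TdtextNotifier: the class name SketchNotifier, its register() and notify() methods, and the AboutRoute extension are all hypothetical, and the two interfaces are stand-in declarations so the sketch runs on its own.

```php
<?php
// Stand-in declarations for two of the tdtext interfaces described in this document.
interface IRoutesEditor { public function editRoutes(&$routes); }
interface IFormattersEditor { public function editFormatters(&$formatters); }

// Hypothetical singleton notifier; NOT the real TdtextNotifier.
class SketchNotifier {
    private static $instance = null;
    private $extensions = array();

    private function __construct() {}

    public static function getInstance() {
        if (self::$instance === null) {
            self::$instance = new self();
        }
        return self::$instance;
    }

    // Assumption: in tdt/core the classes would be read from the configuration.
    public function register($extension) {
        $this->extensions[] = $extension;
    }

    // Forward an update to every extension that implements the given interface.
    public function notify($interface, $method, &$payload) {
        foreach ($this->extensions as $extension) {
            if ($extension instanceof $interface) {
                $extension->$method($payload);
            }
        }
    }
}

// Hypothetical extension that only cares about routes.
class AboutRoute implements IRoutesEditor {
    public function editRoutes(&$routes) {
        $routes["GET | about"] = get_class($this);
    }
}

$notifier = SketchNotifier::getInstance();
$notifier->register(new AboutRoute());

$routes = array();
$formatters = array();
$notifier->notify('IRoutesEditor', 'editRoutes', $routes);
$notifier->notify('IFormattersEditor', 'editFormatters', $formatters);

print_r($routes); // holds the "GET | about" key; $formatters is left untouched
```

Because AboutRoute implements only IRoutesEditor, it receives the routes update and is skipped for the formatters update.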

A new GitHub organisation

http://github.com/tdtext

This is the organisation where we will add all official extensions for The DataTank, along with a barebone repository that anyone can fork to start their own tdtextension.

It also contains these docs at http://tdtext.github.io/ which can be edited at http://github.com/tdtext/tdtext.github.io.

Installing extensions using Composer

If you're not familiar with Composer, check out http://getcomposer.org/doc/00-intro.md.

cd /path/to/tdt/root/
composer require tdtext/${extensionname}
composer update # this will also trigger the code to enable the extension. You can disable it in your config.

Writing a tdt extension

Once you have tested your extension by installing it locally, you can submit it to Packagist: https://packagist.org/packages/submit

The Interfaces (the hard way)

tdt/core/tdtext/IRoutesEditor

With the route mapper you can add a controller for a specific route. This comes in handy if you want to bypass all tdt controller handling. It is not recommended if you are going to serve data through this controller. It is recommended if you want to implement your own brand new functionality for certain URI patterns.

For example:

class MyOwnRouteController implements IRoutesEditor {

    public function editRoutes(&$routes){
        // Each key has two parts separated by a |:
        // the first part is the HTTP method, the second part is a regular expression.
        // The value is the name of the class that handles the route.
        $routes["GET | about/(?P<person>[^?]+)"] = get_class($this);
    }

    // $matches holds the groups captured by the regular expression
    public function GET($matches){
        echo "A page about " . urldecode($matches["person"]);
    }
}
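To make the key format more concrete, here is a rough sketch of how a route key could be split and matched against a request. The matchRoute() helper is hypothetical and does not come from tdt/core; the "#" regex delimiter and the anchoring of the pattern to the whole path are assumptions.

```php
<?php
// Hypothetical helper: find the controller registered for a request.
// Assumes patterns do not contain "#" and must match the whole path.
function matchRoute(array $routes, $method, $path) {
    foreach ($routes as $key => $controller) {
        // Split "GET | about/(...)" into the HTTP method and the pattern.
        // The limit of 2 keeps any "|" inside the regular expression intact.
        list($routeMethod, $pattern) = array_map('trim', explode('|', $key, 2));
        if ($routeMethod !== $method) {
            continue;
        }
        if (preg_match('#^' . $pattern . '$#', $path, $matches)) {
            return array($controller, $matches);
        }
    }
    return null; // no route matched
}

$routes = array("GET | about/(?P<person>[^?]+)" => "MyOwnRouteController");
list($controller, $matches) = matchRoute($routes, "GET", "about/Ada%20Lovelace");
echo $controller . " handles " . urldecode($matches["person"]) . "\n";
```

The named group person ends up in $matches["person"], which is the array a method such as GET($matches) would receive.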

tdt/core/tdtext/IDefinitionsEditor

This interface is called when the documentation in tdt/core is ready.

You can add your own documentation for a new resource if you want to have your own configuration in there. You can use it to define new configurations without them being stored in the database.

For example:

class NewDefinitionAdder implements IDefinitionsEditor {
    function editDefinitions(&$definitions){
        $definitions[...] = ...;
    }
}

tdt/core/tdtext/IFormattersEditor

Add your own formatter to the list.

interface IFormattersEditor {
    /**
     * Add or edit formatters in this array.
     */
    public function editFormatters(&$formatters);
}
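An implementation might look like the sketch below. The layout of the $formatters array is not specified here, so the mapping from format name to formatter class name is an assumption, as are the class names YamlFormatterAdder, YAMLFormatter and JSONFormatter. The interface is redeclared only so the sketch runs on its own.

```php
<?php
// Stand-in declaration so the sketch is self-contained.
interface IFormattersEditor {
    public function editFormatters(&$formatters);
}

// Hypothetical extension class that registers one extra format.
class YamlFormatterAdder implements IFormattersEditor {
    public function editFormatters(&$formatters) {
        // Assumption: keys are format names, values are formatter class names.
        $formatters["YAML"] = "YAMLFormatter";
    }
}

$formatters = array("JSON" => "JSONFormatter"); // hypothetical list built by the core
$adder = new YamlFormatterAdder();
$adder->editFormatters($formatters);

print_r(array_keys($formatters)); // the list now also contains "YAML"
```

Since the array is passed by reference, the extension can also overwrite or remove existing entries.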

tdt/core/tdtext/ITransformer

A transformer transforms an object after it is read into memory.

interface ITransformer {
    /**
     * Add to or edit an object once it has been read into memory.
     * @param $resourceconfiguration contains the identifier of a resource and its configuration
     * @param $object is the data object
     */
    public function transform($resourceconfiguration, &$object);
}
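As an illustration, a transformer could clean up string values after a resource has been read. The sketch assumes the data object is a (possibly nested) PHP array, which may not hold for every strategy. TrimTransformer and the sample data are hypothetical.

```php
<?php
// Stand-in declaration so the sketch is self-contained.
interface ITransformer {
    public function transform($resourceconfiguration, &$object);
}

// Hypothetical transformer: trims whitespace from every string value.
class TrimTransformer implements ITransformer {
    public function transform($resourceconfiguration, &$object) {
        // Assumption: the data object is a (possibly nested) array.
        array_walk_recursive($object, function (&$value) {
            if (is_string($value)) {
                $value = trim($value);
            }
        });
    }
}

$object = array(
    array("name" => "  alice ", "visits" => 3),
    array("name" => "bob  ", "visits" => 5),
);
$transformer = new TrimTransformer();
$transformer->transform(array("identifier" => "demo/people"), $object);

print_r($object); // names are trimmed, numbers are left alone
```

The object is modified in place through the reference, so transform() does not need to return anything.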

...WIP

Using an abstract class (recommended: it's there if you don't know about The DataTank internals)

tdt/core/tdtext/AFormatter

Write a new behaviour for a certain format.

For example:

class YAMLFormatter extends \tdt\core\tdtext\AFormatter {
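    public function __construct(){
        $this->name = "YAML"; // you can also override the behaviour of an existing format, for example XML or JSON
    }

    public function getGETParameters(){
        return array();
    }

    // "print" is a reserved word in PHP; using it as a method name requires PHP 7.0 or later
    public function print($resourceconfiguration, $parameters, $object){
        // YAML print code
    }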
}
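What the YAML print code could do is sketched below with a deliberately small helper. toSimpleYaml() is hypothetical and assumes the object is a nested array of scalars; it does no quoting or escaping, so it is not a complete YAML emitter.

```php
<?php
// Hypothetical helper: turn a nested array into indented "key: value" lines.
// Handles nested arrays of scalars only; no quoting or escaping is done.
function toSimpleYaml(array $data, $indent = 0) {
    $out = "";
    $pad = str_repeat("  ", $indent);
    foreach ($data as $key => $value) {
        if (is_array($value)) {
            // Nested arrays become an indented block under their key.
            $out .= $pad . $key . ":\n" . toSimpleYaml($value, $indent + 1);
        } else {
            $out .= $pad . $key . ": " . $value . "\n";
        }
    }
    return $out;
}

echo toSimpleYaml(array(
    "station" => array("name" => "Gent", "platforms" => 12),
));
```

A formatter's print method could echo the result of such a helper for the object it receives. For real data, a dedicated YAML library is likely the safer choice.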

tdt/core/tdtext/AScraper

Writing a scraper can be done easily by extending the AScraper class. Use this class if you want to have a certain URI in The DataTank to scrape data from a certain source.

An example can be found in the barebone repository, which you can fork to start your own: http://github.com/tdtext/barebone

tdt/core/tdtext/AStrategy

If you want to implement a new strategy next to the standard ones for reading a certain source (e.g. a NoSQL cluster or a certain data format), extend this abstract class.

Draft:

abstract class AStrategy implements ... {

    /**
     * Returns an array according to the discovery API of parameter objects.
     * They include the parameters needed to read a resource whose source uses this strategy.
     */
    abstract function getGETParameters();

    /**
     * Returns an array according to the discovery API of parameter objects.
     * They include documentation about whether the parameter is required or not when configuring a source of this strategy type through a PUT request.
     */
    abstract function getConfigParameters();

    /**
     * This is what happens when a resource configured with this strategy is read.
     * The resourceconfiguration contains the resource identifier and the config parameters (as defined by the getConfigParameters() function).
     */
    abstract function read($resourceconfiguration, $parameters);
}
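A concrete strategy could then look like the following sketch. Since the draft is still open, the abstract class is redeclared here as a stand-in without the elided implements clause. InlineJsonStrategy, its json and limit parameters, and the name-to-description shape of the parameter arrays are all assumptions made for illustration; the discovery API may expect a different structure.

```php
<?php
// Stand-in for the draft class, without the elided "implements" clause.
abstract class AStrategy {
    abstract function getGETParameters();
    abstract function getConfigParameters();
    abstract function read($resourceconfiguration, $parameters);
}

// Hypothetical strategy: the source is a JSON array stored in the configuration.
class InlineJsonStrategy extends AStrategy {

    function getGETParameters() {
        // Assumption: a simple name => description map.
        return array("limit" => "Maximum number of rows to return (optional)");
    }

    function getConfigParameters() {
        return array("json" => "JSON array of rows (required)");
    }

    function read($resourceconfiguration, $parameters) {
        $rows = json_decode($resourceconfiguration["json"], true);
        if (!is_array($rows)) {
            throw new InvalidArgumentException("Configuration does not contain a JSON array");
        }
        // Apply the optional GET parameter.
        if (isset($parameters["limit"])) {
            $rows = array_slice($rows, 0, (int) $parameters["limit"]);
        }
        return $rows;
    }
}

$config = array(
    "identifier" => "demo/rows",
    "json" => '[{"id":1},{"id":2},{"id":3}]',
);
$strategy = new InlineJsonStrategy();
$rows = $strategy->read($config, array("limit" => 2));
print_r($rows); // the first two rows
```

The split mirrors the draft: getConfigParameters() documents what is stored when the source is configured, while getGETParameters() documents what a reader may pass on each request.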