    HowTo ingest data with fluentd inside Wendelin

    A tutorial showing how to ingest data into Wendelin using fluentd.
    • Last Update: 2020-07-30
    • Version: 002
    • Language: en

    Abstract

    This short HowTo shows how to ingest data into the Wendelin platform using fluentd. To follow it, you need a running Wendelin instance and must know its URL and the username and password used to access it. No additional configuration is needed on the Wendelin side, as it comes preconfigured.

    You can read wendelin-HowTo.Install.Wendelin.Standalone to learn how to install Wendelin.

    For the purpose of this HowTo we will ingest a simple JSON record, but the data can be anything.

    Step 1: Install fluentd and Wendelin fluentd plugin

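    root@debian: ~$ apt install ruby ruby-dev
    # install the gems as the regular user who will run fluentd: --user-install puts them in that user's ~/.gem
    ivan@debian: ~$ gem install --user-install fluentd
    ivan@debian: ~$ gem install --user-install fluent-plugin-wendelin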

    Step 2: Clone the Wendelin fluentd plugin repository and start fluentd

    Before this step you need to know your Wendelin instance's URL, username and password.

    ivan@debian: ~$ git clone https://lab.nexedi.com/nexedi/fluent-plugin-wendelin.git
    ivan@debian: ~$ cd fluent-plugin-wendelin/example
    # set the proper username, password and URL in the configuration file!
    ivan@debian: ~/fluent-plugin-wendelin/example$ vi to_wendelin.conf
    # the path below depends on the installed Ruby version (2.7.0 here)
    ivan@debian: ~/fluent-plugin-wendelin/example$ ~/.gem/ruby/2.7.0/bin/fluentd -v -c to_wendelin.conf

    Step 3: Ingest

    ivan@debian: ~$ curl -X POST -d 'json={"foo1":"bar1"}' http://localhost:8888/test_sensor.test_product
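
    The same ingestion call can be scripted. The following is a minimal Python sketch, assuming fluentd's HTTP input listens on localhost:8888 exactly as in the curl command above; the function names build_ingest_request and ingest are hypothetical helpers introduced for illustration and are not part of fluentd or Wendelin.

```python
import json
import urllib.parse
import urllib.request

# Assumption: fluentd's HTTP input listens here, as in the curl example.
FLUENTD_URL = "http://localhost:8888"


def build_ingest_request(tag, record, base_url=FLUENTD_URL):
    """Build the POST request equivalent to `curl -X POST -d 'json=...'`.

    `tag` becomes the URL path (e.g. "test_sensor.test_product") and
    `record` is sent as the form field named "json".
    """
    body = urllib.parse.urlencode({"json": json.dumps(record)}).encode("ascii")
    return urllib.request.Request(f"{base_url}/{tag}", data=body, method="POST")


def ingest(tag, record, base_url=FLUENTD_URL):
    """Send one record to fluentd and return the HTTP status code."""
    request = build_ingest_request(tag, record, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

    With fluentd running as in Step 2, calling ingest("test_sensor.test_product", {"foo1": "bar1"}) should have the same effect as the curl command. Building the request separately from sending it makes it possible to inspect the URL and body without a running fluentd.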

    Step 4: Check that everything was successfully ingested on the Wendelin side

    Wendelin's data model is quite complex. For the purpose of this HowTo it is enough to see where the data was ingested. In this example it ends up in a "Data Stream" object whose reference is "test_sensor-test_product". Open this object's view to see its size, which should increase after each "curl" call.

    You can also use the following command line to read what was ingested into the Data Stream (use it as a template and insert the proper values for your setup!)

    ivan@debian: ~$ curl -su <your_wendelin_user>:<your_wendelin_password>  <Wendelin_URL>/erp5/data_stream_module/<Ingested_Data_Stream_Id> -r 0-19 > instance1.msgpack
    ivan@debian: ~$ python
    >>> import msgpack
    >>> # open in binary mode: msgpack data is bytes, not text
    >>> msgpack.unpackb(open("instance1.msgpack", "rb").read())
    [1596106294, {'foo1': 'bar1'}]
    # 1596106294 is the timestamp value inserted by the Wendelin fluentd plugin.
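
    The read-back step can also be sketched in Python using only the standard library. The sketch below builds the same authenticated range request as the curl command and turns a decoded record into a readable form. It assumes the timestamp is a Unix epoch value in seconds (UTC); build_range_request and describe_record are hypothetical helper names, and the stream URL must be replaced with the values for your setup.

```python
import base64
import urllib.request
from datetime import datetime, timezone


def build_range_request(stream_url, user, password, first_byte, last_byte):
    """Mirror `curl -u user:password -r first-last stream_url`.

    Uses HTTP Basic authentication and a Range header to request
    only part of the Data Stream content.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Range": f"bytes={first_byte}-{last_byte}",
    }
    return urllib.request.Request(stream_url, headers=headers)


def describe_record(record):
    """Split a decoded [timestamp, payload] record into (ISO time, payload).

    Assumption: the timestamp is a Unix epoch value in seconds, UTC.
    """
    timestamp, payload = record
    when = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return when.isoformat(), payload
```

    Under that assumption, describe_record([1596106294, {'foo1': 'bar1'}]) gives ('2020-07-30T10:51:34+00:00', {'foo1': 'bar1'}). The request object can be passed to urllib.request.urlopen, and the bytes read from the response handed to msgpack.unpackb as shown above.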