Okapi / OKAPI-860

Add HTTP (API calls, system calls) metrics to Okapi



    • CP: sprint 92, CP: sprint 94, CP: sprint 95
    • 8
    • Core: Platform


      To understand how Okapi performs at a detailed level, it would be useful to add the following custom metrics:

      1. Incoming request rate (in reqs/sec)
      2. Internal end-to-end processing time (from receiving the request until right before sending the response)
      3. Each outgoing API call needs its own set of metrics, including:
      • rate of calls to API x
      • count of calls made to API x (useful for certain operations suspected of being highly expensive)
      • response time: the time measured from right before sending the request until the last bit of the response has been received

      For example, when Okapi calls mod-authtoken, it would be useful to know the call rate, the number of calls, and the time spent waiting for the response.

      4. Count of errors
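The outgoing-call metrics listed above (rate, count, response time, error count) can be illustrated with a minimal JDK-only sketch. The class `OutgoingCallStats` and its methods are hypothetical and are not part of Okapi or Micrometer. Treating status codes of 400 and above as errors is an assumption made only for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical in-memory recorder; not part of Okapi or Micrometer. */
class OutgoingCallStats {
  private final Map<String, LongAdder> calls = new ConcurrentHashMap<>();
  private final Map<String, LongAdder> errors = new ConcurrentHashMap<>();
  private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();

  /** Records one finished outgoing call to the given module. */
  void record(String moduleId, long elapsedNanos, int statusCode) {
    calls.computeIfAbsent(moduleId, k -> new LongAdder()).increment();
    totalNanos.computeIfAbsent(moduleId, k -> new LongAdder()).add(elapsedNanos);
    if (statusCode >= 400) { // assumption: 4xx and 5xx count as errors
      errors.computeIfAbsent(moduleId, k -> new LongAdder()).increment();
    }
  }

  long count(String moduleId) {
    return sum(calls, moduleId);
  }

  long errorCount(String moduleId) {
    return sum(errors, moduleId);
  }

  /** Calls per second over an observation window chosen by the caller. */
  double ratePerSecond(String moduleId, double windowSeconds) {
    return count(moduleId) / windowSeconds;
  }

  /** Mean response time in milliseconds, or 0 when nothing was recorded. */
  double meanResponseMillis(String moduleId) {
    long n = count(moduleId);
    return n == 0 ? 0.0 : sum(totalNanos, moduleId) / 1_000_000.0 / n;
  }

  private static long sum(Map<String, LongAdder> map, String key) {
    LongAdder adder = map.get(key);
    return adder == null ? 0 : adder.sum();
  }

  public static void main(String[] args) {
    OutgoingCallStats stats = new OutgoingCallStats();
    long start = System.nanoTime(); // taken right before sending the request
    int status = 200;               // stand-in for a real HTTP call
    stats.record("mod-authtoken-1.2.3", System.nanoTime() - start, status);
    System.out.println(stats.count("mod-authtoken-1.2.3") + " call(s), mean "
        + stats.meanResponseMillis("mod-authtoken-1.2.3") + " ms");
  }
}
```

A metrics library such as Micrometer would normally replace this bookkeeping; the sketch only shows which values have to be captured around each call.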


      The metrics reported for incoming (proxied) calls and outgoing calls should include enough metadata to categorize and group them, including:

      • moduleId, e.g. mod-authtoken-1.2.3, unless it makes sense to tag with moduleName and moduleVersion separately
      • uri: the "path" of the HTTP call
      • queryString: to capture the parameters
      • whether it is an incoming (proxy) call or an outgoing (system) call that Okapi makes
      • phase for the "filter", e.g. 'auth'
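If moduleName and moduleVersion were tagged separately, the module id would need to be split. The following is a minimal sketch, assuming the version starts at the first hyphen that is followed by a digit. The helper `ModuleIdTags` is hypothetical; Okapi's own module id handling may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; Okapi's own module id parsing may differ. */
class ModuleIdTags {
  // The name is everything before the first "-<digit>"; the rest is the version.
  private static final Pattern ID = Pattern.compile("^(.+?)-(\\d.*)$");

  /** Returns {moduleName, moduleVersion}; the version is "" when the id has none. */
  static String[] split(String moduleId) {
    Matcher m = ID.matcher(moduleId);
    return m.matches()
        ? new String[] {m.group(1), m.group(2)}
        : new String[] {moduleId, ""};
  }

  public static void main(String[] args) {
    String[] parts = split("mod-authtoken-1.2.3");
    System.out.println("moduleName=" + parts[0] + " moduleVersion=" + parts[1]);
  }
}
```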

      The reported metrics should be easy to integrate with InfluxDB. The proposal is to use https://vertx.io/docs/vertx-micrometer-metrics/java/
      Micrometer concepts: https://micrometer.io/docs/concepts
      RED monitoring method: https://www.weave.works/blog/the-red-method-key-metrics-for-microservices-architecture/

      Design and implementation
      The following meters (and tags) are defined and implemented for Okapi HTTP proxy calls. They follow the same naming convention as the Vert.x core tools metrics. Note that the dots in a meter name are automatically converted to _ in InfluxDB. No error meter is defined for the server side because errors are already covered by the server processingTime meter.

      • org.folio.okapi.http.server.processingTime - meter type: Micrometer Timer. Tags:
        • host - Okapi instance. An example value: ip-172-31-19-200.ec2.internal/
        • tenant - FOLIO tenant id. An example value: diku
        • code - HTTP response code. An example value: 200
        • method - HTTP request method. An example value: GET
        • module - FOLIO module id. An example value: mod-authtoken-2.6.0-SNAPSHOT.73
        • url - HTTP request url as defined in the module descriptor. An example value: /users/{id}
      • org.folio.okapi.http.client.responseTime - meter type: Micrometer Timer. Tags:
        • all tags defined above plus
        • phase - FOLIO proxy phase. An example value: auth. The default value is handler
      • org.folio.okapi.http.client.errors - meter type: Micrometer Counter. Tags:
        • host, tenant, method, and url
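The meter names and tag sets above can be summarized in a small JDK-only sketch. The class `OkapiMeterTags` and its methods are hypothetical and do not reflect Okapi's actual code. The `influxName` helper only illustrates the dot-to-underscore conversion described above; the real conversion is done by the metrics backend integration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical tag builders mirroring the meter list; not Okapi's actual code. */
class OkapiMeterTags {
  static final String SERVER_TIME = "org.folio.okapi.http.server.processingTime";
  static final String CLIENT_TIME = "org.folio.okapi.http.client.responseTime";
  static final String CLIENT_ERRORS = "org.folio.okapi.http.client.errors";

  /** Tags for the server processingTime timer. */
  static Map<String, String> serverTags(String host, String tenant, int code,
      String method, String module, String url) {
    Map<String, String> tags = new LinkedHashMap<>();
    tags.put("host", host);
    tags.put("tenant", tenant);
    tags.put("code", Integer.toString(code));
    tags.put("method", method);
    tags.put("module", module);
    tags.put("url", url);
    return tags;
  }

  /** Tags for the client responseTime timer: the server tags plus phase. */
  static Map<String, String> clientTags(String host, String tenant, int code,
      String method, String module, String url, String phase) {
    Map<String, String> tags = serverTags(host, tenant, code, method, module, url);
    tags.put("phase", phase == null ? "handler" : phase); // default phase is handler
    return tags;
  }

  /** Tags for the client errors counter. */
  static Map<String, String> errorTags(String host, String tenant, String method, String url) {
    Map<String, String> tags = new LinkedHashMap<>();
    tags.put("host", host);
    tags.put("tenant", tenant);
    tags.put("method", method);
    tags.put("url", url);
    return tags;
  }

  /** Illustrates the name as it would appear in InfluxDB (dots become underscores). */
  static String influxName(String meterName) {
    return meterName.replace('.', '_');
  }

  public static void main(String[] args) {
    Map<String, String> tags = clientTags("ip-172-31-19-200.ec2.internal/", "diku", 200,
        "GET", "mod-authtoken-2.6.0-SNAPSHOT.73", "/users/{id}", "auth");
    System.out.println(influxName(CLIENT_TIME) + " " + tags);
  }
}
```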

                hji Hongwei Ji
                mtraneis Martin Tran