Troubleshooting the Production DB

Smarter insights enriched with Production DB metadata

What is the Metis Metadata Collector

The Metadata Collector collects metadata from your production database and sends it to Metis servers, this is it. The metadata is used by the Insights engine to provide more information to the user and calibrate the severity of the problem found.
For example, the Insights engine can know that a specific index is never used by the production DB. Or, while a table scan in the Staging DB read 5M rows, in the Production DB the query will read 100M rows.
Since the collector was designed for the Production env, it was built using the following principles:
  • Data Security: The collector only pushes data out and never listens to external requests
  • Data Privacy: The collector collects metadata only and not any data from the tables.
  • No impact on performance: The SQL queries used by the MMC run on the system catalog and usually finish in 1 -3 seconds. Also, some of the queries run once a day to minimize the impact on the PG server to almost nothing.
The code is Open Sourced (MIT license).

Data Flow

  1. 1.
    Deploy a Metadata Collector. Configure it to connect to the production DB.
  2. 2.
    The MC runs SQL commands every 1 to 24 hours. It uses a predefined configuration.
  3. 3.
    The metadata is sent to Metis SaaS
  4. 4.
    The Metis backend processes the data and generates insights.
  5. 5.
    View the metatada and insights in the web app Observability Reports.

Example of the collected metadata

To provide a holistic, end-to-end analysis, Metis collects metadata from the production database using the Metadata collector. The metadata contains information that exists only in the production database:
  • Storage: the size of the tables and indexes, number of dead rows, TOAST size
  • Usage: read and write operations on tables and indexes
  • Statistics: actual data distribution, last update date
  • Queries: duration, using disk for sorting
  • Configuration
  • Performance metrics: CPU, memory, IO, number of connections
  • WAL: size