MySQL Optimizer Tracer usage case with count(*)

What is Optimizer Trace? After reading topic about Optimizer Tracer by [Morgan Tocker][1] decided to test it. From [Optimizer Trace and EXPLAIN FORMAT=JSON in 5.7][2]: Optimizer trace is a new diagnostic tool introduced in MySQL 5.6 to show how the optimizer is working internally. It is similar to EXPLAIN, with a few notable differences: It doesn’t just show the intended execution plan, it shows the alternative choices. You enable the optimizer trace, then you run the actual query. It is far more verbose in its output. For understanding goal of article please read previous one about related verified optimizer BUG: [Playing with count() optimizer work][3] ** We have 2 queries: **select count() from sales; select count(*) from sales where sales_id > 0; Firstly let’s get explain plan for query with JSON format and as regular:

       -- JSON 
          mysql> explain format=json select count(*) from sales; |
          { "query_block": 
          { "select_id": 1, 
          "table": { "table_name": "sales", 
                     "access_type":  "index", 
                     "key": "sales_cust_idx", 
                     "used_key_parts": [ "CUSTOMER_ID" ] 
                    /*  used_key_parts */, 
                    "key_length": "4", 
                    "rows": 2489938, 
                    "filtered": 100,
                     "using_index": true }  
            /* table */ } 
           /* query_block */ } 

        mysql> explain select count(*) from salesG 
        *************************** 1. row *************************** 
        id: 1 
        select_type: SIMPLE 
        table: sales 
        type: index 
        possible_keys: NULL 
        key: sales_cust_idx 
        key_len: 4 
        ref: NULL 
        rows: 2489938 
        Extra: Using index 1 row in set (0.00 sec) 

Second query:

       -- JSON  
       mysql> explain format=json select count(*) from sales where sales_id > 0G         *************************** 1. row *************************** 
    EXPLAIN: 
    { "query_block": 
    { "select_id": 1, 
    "table": 
    { "table_name": "sales", 
      "access_type": "range",
      "possible_keys": [ "PRIMARY" ], "key": "PRIMARY", 
      "used_key_parts": [ "SALES_ID" ], "key_length": "4", 
      "rows": 1244969, 
      "filtered": 100, 
      "using_index": true, 
     "attached_condition": "(`sales`.`sales`.`SALES_ID` > 0)" } } } 


     mysql> explain select count(*) from sales where sales_id > 0G 
     *************************** 1. row *************************** 
     id: 1 
     select_type: SIMPLE 
     table: sales 
     type: range 
     possible_keys: PRIMARY 
     key: PRIMARY 
     key_len: 4 
     ref: NULL 
     rows: 1244969 
     Extra: Using where; Using index 1 row in set (0.00 sec)  

From Explain plan it is obvious that, first query will use “Index Scan”, second will use “Range Scan + Index”. First query will use, sales_cust_idx in customer_id column, second query will use primary key in sales_id column. From first view, there now difference between queries, but optimizer estimates half of rows when attaching sales_id > 0 condition. See related BUG: [#68814][4] Now let’s examine problem with Optimizer Tracer. So before running query you should enable optimizer trace:

      SET OPTIMIZER_TRACE="enabled=on",END_MARKERS_IN_JSON=on;
      SET OPTIMIZER_TRACE_MAX_MEM_SIZE=1000000; 

After run first query:

      mysql> select count(*) from sales; 
      +----------+ 
      | count(*) |
      +----------+
      | 2500003 |
      +----------+ 
     1 row in set (0.58 sec) 

Query to OPTIMIZER_TRACE table from information_schema:

     mysql> select query, trace from INFORMATION_SCHEMA.OPTIMIZER_TRACE;
     select count(*) from sales | 
    { "steps": 
   [ { "join_preparation": 
    { "select#": 1, 
    "steps": [ { "expanded_query": "/* select#1 */ select count(0) AS `count(*)` from `sales`" } ] /* steps */ }
    /* join_preparation */ },
    { "join_optimization": 
       { "select#": 1, 
         "steps": [ { "table_dependencies": [ { "table": "`sales`", "row_may_be_null": false, "map_bit": 0, "depends_on_map_bits": [ ] 
    /* depends_on_map_bits */ } ] 
    /* table_dependencies */ }, 
      { "rows_estimation": 
           [ { "table": "`sales`", "table_scan": { "rows": 2489938, "cost": 10347 }
          /* table_scan */ } ] 
          /* rows_estimation */ }, 
           { "considered_execution_plans": [ { "plan_prefix": [ ] 
           /* plan_prefix */,
           "table": "`sales`", 
           "best_access_path": { "considered_access_paths": [ { "access_type": "scan", "rows": 2.49e6, "cost": 508335, "chosen": true } ] 
          /* considered_access_paths */ } 
          /* best_access_path */, 
         "cost_for_plan": 508335,
         "rows_for_plan": 2.49e6,
         "chosen": true } ]
         /* considered_execution_plans */ },
        { "attaching_conditions_to_tables": { "original_condition": null,  "attached_conditions_computation": [ ] 
        /* attached_conditions_computation */,
       "attached_conditions_summary": [ { "table": "`sales`", "attached": null } ] 
        /* attached_conditions_summary */ } 
        /* attaching_conditions_to_tables */ },
       { "refine_plan": [ { "table": "`sales`", "access_type": "index_scan" } ] 
        /* refine_plan */ } ] 
        /* steps */ } 
        /* join_optimization */ },
        { "join_execution": { "select#": 1, "steps": [ ] 
        /* steps */ } 
        /* join_execution */ } ] 
        /* steps */ 

Interesting part for query 1 is -> “cost_for_plan”: 508335, “rows_for_plan”: 2.49e6, “chosen”: true Cost is 508335, rows for plan is 2.49e6 = 2490000 rows, is roughly equal to explain plan estimation.
Now second query:

     mysql> select count(*) from sales where sales_id > 0;
     +----------+ 
     | count(*) |
     +----------+ 
     | 2500003 |
     +----------+ 1 row in set (1.18 sec) 

Query to OPTIMIZER_TRACE:

     mysql> select query, trace from INFORMATION_SCHEMA.OPTIMIZER_TRACE; 
     select count(*) from sales where sales_id > 0 | 
     { "steps": [ { "join_preparation": 
        { "select#": 1, 
          "steps": [ { "expanded_query": "/* select#1 */ select count(0) AS `count(*)` from `sales` where (`sales`.`SALES_ID` > 0)" } ] 
        /* steps */ } 
        /* join_preparation */ },
       { "join_optimization": 
          { "select#": 1, 
            "steps": [ { "condition_processing": { "condition": "WHERE", "original_condition": "(`sales`.`SALES_ID` > 0)", "steps": [ { "transformation": "equality_propagation", 
           "resulting_condition": "(`sales`.`SALES_ID` > 0)" }, 
          { "transformation": "constant_propagation", "resulting_condition": "(`sales`.`SALES_ID` > 0)" }, 
          { "transformation": "trivial_condition_removal", "resulting_condition": "(`sales`.`SALES_ID` > 0)" } ] 
         /* steps */ } 
         /* condition_processing */ }, 
        { "table_dependencies": [ { "table": "`sales`", "row_may_be_null": false, "map_bit": 0, "depends_on_map_bits": [ ] 
        /* depends_on_map_bits */ } ] 
        /* table_dependencies */ }, 
       { "ref_optimizer_key_uses": [ ] 
        /* ref_optimizer_key_uses */ },
       { "rows_estimation": [ { "table": "`sales`", "range_analysis": { "table_scan": {  "rows": 2489938, "cost": 508337 } 
       /* table_scan */, 
       "potential_range_indices": [ { "index": "PRIMARY", "usable": true, "key_parts": [ "SALES_ID" ] 
      /* key_parts */ }, 
      { "index": "sales_cust_idx", "usable": false, "cause": "not_applicable" } ] 
      /* potential_range_indices */,
     "best_covering_index_scan": { "index": "sales_cust_idx", "cost": 500418, "chosen": true } 
      /* best_covering_index_scan */,
      "setup_range_conditions": [ ] 
      /* setup_range_conditions */, 
      "group_index_range": { "chosen": false, "cause": "not_group_by_or_distinct" } 
      /* group_index_range */, 
      "analyzing_range_alternatives": { "range_scan_alternatives": [ { "index": "PRIMARY", "ranges": [ "0 < SALES_ID" ] 
     /* ranges */, 
     "index_dives_for_eq_ranges": true, 
     "rowid_ordered": true, 
     "using_mrr": false, "index_only": true, "rows": 1244969, "cost": 251364, "chosen": true } ] 
    /* range_scan_alternatives */,
   "analyzing_roworder_intersect": { "usable": false, "cause": "too_few_roworder_scans" }  /* analyzing_roworder_intersect */ } 
    /* analyzing_range_alternatives */,
   "chosen_range_access_summary": { "range_access_plan": { "type": "range_scan", "index": "PRIMARY", "rows": 1244969, "ranges": [ "0 < SALES_ID" ] 
   /* ranges */ } 
   /* range_access_plan */, 
   "rows_for_plan": 1244969, "cost_for_plan": 251364, "chosen": true } 
   /* chosen_range_access_summary */ } 
   /* range_analysis */ } ] 
   /* rows_estimation */ },
    { "considered_execution_plans": [ { "plan_prefix": [ ] /* plan_prefix */, "table": "`sales`", "best_access_path": { "considered_access_paths": [ { "access_type": "range", "rows": 1.24e6, "cost": 500357, "chosen": true } ] 
    /* considered_access_paths */ } 
    /* best_access_path */, 
    "cost_for_plan": 500357, "rows_for_plan": 1.24e6, "chosen": true } ] 
    /* considered_execution_plans */ },
    { "attaching_conditions_to_tables": { "original_condition": "(`sales`.`SALES_ID` > 0)", "attached_conditions_computation": [ ] 
   /* attached_conditions_computation */,
   "attached_conditions_summary": [ { "table": "`sales`", "attached": "(`sales`.`SALES_ID` > 0)" } ] 
   /* attached_conditions_summary */ } 
   /* attaching_conditions_to_tables */ }, 
   { "refine_plan": [ { "table": "`sales`", "access_type": "range" } ] /* refine_plan */ } ] 
   /* steps */ } 
   /* join_optimization */ }, 
  { "join_execution": { "select#": 1, "steps": [ ] /* steps */ } /* join_execution */ } ] /* steps */ 

It is much more complicated with second query and due to lacking documentation for all output, i am looking for explanations from experts.
First thing is, it says in “potential_range_indices” that “index”: “sales_cust_index” is not usable:

      "potential_range_indices": 
      [ { "index": "PRIMARY", "usable": true, "key_parts": [ "SALES_ID" ] 
      /* key_parts */ },
      { "index": "sales_cust_idx", "usable": false, "cause": "not_applicable" } ] 

But in “best_covering_index_scan”, “index”: “sales_cust_idx” is marked as “chosen”:true

       "best_covering_index_scan": { "index": "sales_cust_idx", "cost": 500418, "chosen": true } 

Second thing is, in “range_scan_alternatives” and,

        "analyzing_range_alternatives": 
        { "range_scan_alternatives": [ { "index": "PRIMARY", "ranges": [ "0 < SALES_ID" ]  
        /* ranges */, 
       "index_dives_for_eq_ranges": true, "rowid_ordered": true, "using_mrr": false, "index_only": true, "rows": 1244969, "cost": 251364, "chosen": true } 

in “chosen_range_access_summary”, “rows_for_plan” is 1244969 and “cost_for_plan” is 251364

       "chosen_range_access_summary": 
       { "range_access_plan": 
        { "type": "range_scan", "index": "PRIMARY", "rows": 1244969, "ranges": [ "0 < SALES_ID" ] 
         /* ranges */ } 
         /* range_access_plan */, 
        "rows_for_plan": 1244969, "cost_for_plan": 251364, "chosen": true } 

But for final “best_access_path” “cost_for_plan” is increased to 500357 and “rows_for_plan” is 1.24e6 = 1240000:

       "best_access_path": { "considered_access_paths": [ { "access_type": "range", "rows": 1.24e6, "cost": 500357, "chosen": true } ] 
       /* considered_access_paths */ } 
       /* best_access_path */, 
       "cost_for_plan": 500357, "rows_for_plan": 1.24e6, "chosen": true }

Third thing is that, sales_id > 0 is rewritten to 0 < sales_id

      "ranges": [ "0 < SALES_ID" ] 

*** After explanations from community this article will be updated ***

http://www.tocker.ca/
http://www.tocker.ca/2015/05/25/optimizer-trace-and-explain-formatjson-in-5-7.html
https://mysql.az/playing-with-count-optimizer-work/
http://bugs.mysql.com/bug.php?id=68814

Author: Shahriyar Rzayev

Azerbaijan MySQL User Group leader.

9 thoughts on “MySQL Optimizer Tracer usage case with count(*)”

  1. The part about sales_cust_index not being usable, but then being used is similar to EXPLAIN. For index scan operations it will say possible_keys NULL, and then key used: sales_cust_index.

    With the remaining output, you can see that it finds a plan and then sees if other optimizations apply. They may be eliminated due to cost, or not being applicable.

    Like

    1. But for my instance for eg. exactly with my query, there is no NULL in possible_keys column in explain plan.

      It will use Primary key:
      mysql> explain select count(*) from sales where sales_id > 0G
      *************************** 1. row ***************************
      id: 1
      select_type: SIMPLE
      table: sales
      type: range
      possible_keys: PRIMARY
      key: PRIMARY
      key_len: 4
      ref: NULL
      rows: 1244969
      Extra: Using where; Using index
      1 row in set (0.00 sec)

      You mean that, even an index does not appear in EXPLAIN plan, it is possible that optimizer will use it internally?
      So, there is no note about “index”: “sales_cust_idx” in EXPLAIN plan. But from Optimizer Tracer we should say that in fact it will use this index internally?

      Like

      1. Here is an actual example:

        CREATE TABLE numbers (id INT NOT NULL PRIMARY KEY, b varchar(255));
        INSERT INTO numbers VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc'), (4, 'ddd');
        mysql> EXPLAIN SELECT sum(id) FROM numbersG
        ************************** 1. row ***************************
        id: 1
        select_type: SIMPLE
        table: numbers
        partitions: NULL
        type: index
        possible_keys: NULL <-- No possible key
        key: PRIMARY <-- Key used
        key_len: 4
        ref: NULL
        rows: 4
        filtered: 100.00
        Extra: Using index

        In your query there was no NULL in possible keys, because there was another plan chosen.  Had this plan been chosen, you would have seen output in EXPLAIN similar to the output I’ve pasted here 🙂

        Like

      2. 🙂 Thank you for answers,
        So in fact we should consider Optimizer Tracer as overall optimizer full pathway tracer, it is possible to see several plans in one output, but only one of them will be chosen among them.
        Is it true? )))

        Like

      3. 🙂 the true power of community.
        In the future who will search for Optimizer Tracer will read our comments and surely will understand why to use this tool.
        So in conclusion we can say that, in my query:
        “index”: “sales_cust_idx” was chosen as “best_covering_index_scan” (“chosen”: true), but it is not considered to be in “potential_range_indices” due to:

        {
        "index": "sales_cust_idx",
        "usable": false,
        "cause": "not_applicable"
        }

        In contrast PRIMARY key was applicable.
        Thank you again for clarifying ))

        Like

  2. That’s right.  It looked at the sales_cust_idx , but it was not applicable for range because your WHERE clause doesn’t have anything that filters by CUSTOMER_ID (the first column of sales_cust_idx).

    If you had an index on sales_id, but most sales were > 0, it would say it was applicable for range but not chosen due to cost (unable to filter out enough rows).  Try it!

    Like

    1. I have PRIMARY key on sales_id column,here is create statement for table 🙂
      CREATE TABLE `sales` (
      `SALES_ID` int(8) NOT NULL AUTO_INCREMENT,
      `CUSTOMER_ID` decimal(8,0) NOT NULL,
      `PRODUCT_ID` decimal(8,0) NOT NULL,
      `SALE_DATE` datetime NOT NULL,
      `QUANTITY` decimal(8,0) NOT NULL,
      `SALE_VALUE` decimal(8,0) NOT NULL,
      `DEPARTMENT_ID` decimal(8,0) DEFAULT '0',
      `SALES_REP_ID` decimal(8,0) DEFAULT '0',
      `GST_FLAG` decimal(8,0) DEFAULT NULL,
      `sale_status` char(1) DEFAULT NULL,
      `FREE_SHIPPING` char(1) DEFAULT '',
      `DISCOUNT` decimal(8,0) unsigned DEFAULT '0',
      PRIMARY KEY (`SALES_ID`),
      KEY `sales_cust_idx` (`CUSTOMER_ID`)
      ) ENGINE=InnoDB AUTO_INCREMENT=2500004 DEFAULT CHARSET=latin1

      Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s