This article explains how to configure the following settings in Hive:

  1. hive.server2.session.check.interval
  2. hive.server2.idle.operation.timeout
  3. hive.server2.idle.session.timeout

1). hive.server2.idle.session.timeout
Session will be closed when not accessed for this duration of time, in milliseconds; disable by setting to zero or a negative value.
For example, the value of “86400000” indicate that the session will be timed out after 1 day of inactivity.

2). hive.server2.session.check.interval
The check interval for session/operation timeout, in milliseconds, which can be disabled by setting to zero or a negative value.
For example, the value of “3600000” indicate that the session will be checked every 1 hour.

3) hive.server2.idle.operation.timeout
Operation will be closed when not accessed for this duration of time, in milliseconds; disable by setting to zero. For a positive value, checked for operations in terminal state only (FINISHED, CANCELED, CLOSED, ERROR). For a negative value, checked for all of the operations regardless of state.
For example, the value of “7200000” indicate that the query/operation will be timed out after 2 hours if it is still running.

So if you combine the three settings from above examples, we can summarize the following use cases:

1) If you started a HS2 session, beeline for example, and without doing anything afterwards, HS2 will trigger 24 session checking before it determines that 24 hours has passed since last activity, then session will be closed

2) If you worked on beeline for 2 hours and then leave the beeline open without doing anything afterwards, HS2 will trigger total of 26 session checking (2 while you worked and another 24 while in idle), and the session will be closed 26 hours after initially opened.

3) If you worked on beeline for 2 hours, and you started running a query that will run for 1 hour and then returns result, the idle timer actually starts from the time when data returns, so if you don’t do anything afterwards, HS2 will kill the session after another 24 hours, so in total, the session lasted 27 hours (2+1+24)

4) If, for instance, your query takes longer than the 2 hours defined by hive.server2.idle.operation.timeout, then the query will be cancelled, hence you will lose the result all together

5) If hive.server2.session.check.interval = 0 and hive.server2.idle.session.timeout > 0, then it will have the same effect that hive.server2.idle.session.timeout = 0, because no idle timer will be triggered as check interval is disabled

6) If hive.server2.session.check.interval > hive.server2.idle.session.timeout > 0, then the actual time out will be the same as hive.server2.session.check.interval.

For example if hive.server2.session.check.interval = 20 minutes, but hive.server2.idle.session.timeout = 10 minutes, then timeout will happen when checking happens, which is 20 minutes.

The basic rule of thumb is as follows:
hive.server2.session.check.interval < hive.server2.idle.operation.timeout < hive.server2.idle.session.timeout

And the recommended values are:

hive.server2.session.check.interval = 1 hour
hive.server2.idle.operation.timeout = 1 day
hive.server2.idle.session.timeout = 3 days

This will work for most of clusters, but you can change the value depending on how your cluster behaves and how long each query run for.

4 Comments

  1. Sai

    Hi Eric,

    Is this going to terminate the actual MapReduce job which is running in the background? I have set below properties, but MR job is still running even after 2 hours.

    hive.server2.session.check.interval = 60000 (in ms) # 10 minutes
    hive.server2.idle.operation.timeout = 1800000 (in ms) # 30 minutes
    hive.server2.idle.session.timeout = 2400000 (in ms) # 40 minutes

    1. Eric Lin

      Hi Sai,

      Thanks for your comment and apologies for the late response. I do not believe that Hive will kill the MR job that was triggered by the query that was run in the session. It would be a improvement for Hive.

  2. Adama DIABATE

    Hello,

    Thanks you for this great post!

    I am currently working on HDP 2.3.0.0 platform allowing Hive 1.2 queries.
    What are the default values of these parameters ?

    Regards

Leave a Reply

Your email address will not be published. Required fields are marked *