Structural Properties of Optimal Fidelity Selection Policies

Paper title

Structural Properties of Optimal Fidelity Selection Policies for Human-in-the-loop Queues

Authors

Piyush Gupta, Vaibhav Srivastava

Abstract

We study optimal fidelity selection for a human operator servicing a queue of homogeneous tasks. The operator can service a task at a normal or a high fidelity level, where fidelity refers to the degree of exactness and precision while servicing the task. High-fidelity servicing therefore yields higher-quality service but leads to longer service times and increased operator fatigue. We treat the human cognitive state as a lumped parameter that captures psychological factors such as workload and fatigue. The operator's service time distribution depends on her cognitive dynamics and the fidelity level selected for servicing the task. Her cognitive dynamics evolve as a Markov chain in which the cognitive state increases with high probability whenever she is busy and decreases while she is resting. The tasks arrive according to a Poisson process, and the operator is penalized at a fixed rate for each task waiting in the queue. We address the trade-off between high-quality service of a task and the penalty incurred by the resulting increase in queue length using a discrete-time semi-Markov decision process (SMDP) framework. We numerically determine an optimal policy and the corresponding optimal value function. Finally, we establish the structural properties of an optimal fidelity policy and provide conditions under which the optimal policy is a threshold-based policy.
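The model described in the abstract can be illustrated with a small value-iteration sketch. This is not the authors' implementation: every numeric value, the deterministic service-time function, the deterministic recovery while resting, the queue truncation, and the approximate holding cost are assumptions chosen only to make the example runnable. The names `service_time`, `q_value`, `rest_value`, and `solve` are hypothetical.

```python
import math

# All constants below are illustrative assumptions, not values from the paper.
LAM = 0.5                                  # Poisson arrival rate
Q_MAX = 8                                  # queue truncation level
N_COG = 5                                  # number of cognitive states
GAMMA = 0.95                               # discount factor per unit time
PENALTY = 1.0                              # penalty rate per waiting task
P_UP = 0.8                                 # P(cognitive state rises while busy)
REST_TIME = 1.0                            # length of one resting epoch
ACTIONS = ("normal", "high")
REWARD = {"normal": 4.0, "high": 7.0}      # reward per serviced task
BASE_TIME = {"normal": 1.0, "high": 2.0}   # baseline service durations


def service_time(c, a):
    """Assumed deterministic service time, shortest at the middle cognitive state."""
    return BASE_TIME[a] * (1.0 + 0.25 * abs(c - N_COG // 2))


def arrival_dist(mu, room):
    """P(k arrivals) for k = 0..room, with the Poisson tail lumped into k = room."""
    probs = [math.exp(-mu) * mu ** k / math.factorial(k) for k in range(room)]
    probs.append(max(0.0, 1.0 - sum(probs)))
    return probs


def q_value(V, q, c, a):
    """Value of servicing one task at fidelity a from state (q, c), q >= 1."""
    tau = service_time(c, a)
    # Approximate undiscounted waiting cost accrued during the service.
    cost = PENALTY * ((q - 1) * tau + LAM * tau ** 2 / 2)
    c_up = min(c + 1, N_COG - 1)
    expected_next = 0.0
    for k, pk in enumerate(arrival_dist(LAM * tau, Q_MAX - (q - 1))):
        nq = q - 1 + k
        expected_next += pk * (P_UP * V[nq][c_up] + (1 - P_UP) * V[nq][c])
    return REWARD[a] - cost + GAMMA ** tau * expected_next


def rest_value(V, c):
    """Value of resting for one epoch when the queue is empty."""
    c_down = max(c - 1, 0)
    probs = arrival_dist(LAM * REST_TIME, Q_MAX)
    return GAMMA ** REST_TIME * sum(pk * V[k][c_down] for k, pk in enumerate(probs))


def solve(tol=1e-6, max_iter=10_000):
    """Value iteration over states (queue length, cognitive state)."""
    V = [[0.0] * N_COG for _ in range(Q_MAX + 1)]
    for _ in range(max_iter):
        new_V = [[0.0] * N_COG for _ in range(Q_MAX + 1)]
        for c in range(N_COG):
            new_V[0][c] = rest_value(V, c)
            for q in range(1, Q_MAX + 1):
                new_V[q][c] = max(q_value(V, q, c, a) for a in ACTIONS)
        diff = max(abs(new_V[q][c] - V[q][c])
                   for q in range(Q_MAX + 1) for c in range(N_COG))
        V = new_V
        if diff < tol:
            break
    policy = {(q, c): max(ACTIONS, key=lambda a: q_value(V, q, c, a))
              for q in range(1, Q_MAX + 1) for c in range(N_COG)}
    return V, policy


if __name__ == "__main__":
    V, policy = solve()
    for c in range(N_COG):
        row = " ".join(policy[(q, c)][0].upper() for q in range(1, Q_MAX + 1))
        print(f"cognitive state {c}: {row}")
```

Running the script prints, for each cognitive state, the chosen fidelity (N or H) as the queue length grows from 1 to `Q_MAX`. Whether the printed pattern switches at a single queue length, as a threshold-based policy would, depends on the assumed parameters; the sketch does not verify the structural conditions stated in the paper.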
