Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Originality: As far as I know, this is the first paper to study the theory of online Markov decoding rigorously. Quality: The results our sound. Clarity: The paper is well-written overall. However the lower bounds are in a form that makes it a bit hard to compare them with the upper bounds. Significance: Markov decoding is an important scientific tool. Designing fast algorithms for this problem is imperative in my humble opinion. --- Edit --- Lowered score following comments from other reviewers.
The algorithm uses a discounted reward for lookahead observed tokens (of size L) and runs dynamic programming (similar to Viterbi) to decode the current state. The discounted reward is the key to obtain the provable bounds, however, I believe the authors have not emphasized its importance enough and it is more implicit inside the proofs. At line 228, authors mentioned beam search, similar to Viterbi, requires to compute the optimal path of the full sequence. Beam search is a greedy approach and for beam size of k, only looks at the k top recent paths for decoding. In general, I liked the idea, but I guess the paper requires polishing and better presentation to deliver its message more clearly. The paper could benefit from having a clear algorithm (pseudo-code). ======== I increased my score based on the provided feedback and other reviews.
The authors propose three online inference methods, i.e., peek search, randomized peek search, and peek reset, for Markov chain models. They also use a proof framework to prove the competitive ratio for each method. The proof method first computes the lower bound of ON, so that gets the relationship between the ON and OPT. And then it uses the properties of geometric series to find an upper bound of competitive ratio. The authors also design a dynamic programming method to efficiently compute reward accumulation, and prove its complexity. Finally, they prove the lower bounds of competitive ratio for general deterministic online algorithms. In Section 7, they compare their method, i.e., peek search, with another online inference method OSA on a genome dataset. I have to admit that I am not familiar with this area, so can only go through a part of the proof, and I am not able to evaluate the originality and quality of this work. Pros: 1. The paper is well written and easy to follow. The proof is clear and thorough. 2. I think the proof framework should be useful for proving similar problems. 4. The proposed methods have solid theoretical guarantees. 3. I believe that the lower bounds are important, which can give guidance to design new online inference method. Cons: 1. The paper mentions adversarial rewards a few times, but I did not see further discussion or proof of this topic. 2. It would be better if the authors can provide more details about how to compute the sum of geometric series, and how to set the \gamma in the appendix. 3. In the experiments, the authors only use one dataset, and show the results of peek search. It would be better if the authors can show results on more dataset and results of randomized peek search and peek reset. =============================== I have read the authors' responses, and other reviewers' comments. I did not change my score. I agree with the other reviewers that the theorems in this paper are interesting. The main reason that I did not change the score is that I am not quite familiar with this area, so I cannot evaluate the significance of this paper. I do not want to overestimate its value.