Summary and Contributions: This paper proposes a NN approach to predict implied volatility surfaces (IVSs). The proposed method can be combined with standard IVS models in quantitative finance, thus providing a NN-based correction when econometric models fail at replicating observed market prices.
Strengths: + The goal of this paper is clear, and the paper is generally well-written. + This paper addresses an important topic in quantitative finance: accurately predicting IVSs. + Adding no-arbitrage conditions to the loss function is novel. + The model presentation, mathematical formulas and symbol definitions are all very clear. + This is the first work that suggests the use of soft constraints on the IVS values to guide the training of a deep model.
Weaknesses: - Lack of theoretical justification for the proposed method. - A short period of data is used in the experiments, making the empirical results less convincing. I'm not sure whether the model still works during periods of important financial events, e.g., the financial crisis of 2007–2008. - The reference format is not consistent. - A typo in Line 190 in the MAPE equation.
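For concreteness, the standard MAPE definition the review presumably has in mind (a generic textbook formula, not the paper's exact Line 190 equation) can be written in a few lines:

```python
import numpy as np

def mape(y_true, y_pred):
    # Mean absolute percentage error, reported in percent.
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# Toy illustration with made-up values.
print(mape(np.array([1.0, 2.0, 4.0]), np.array([1.1, 1.8, 4.0])))
```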
Correctness: Yes
Clarity: Yes
Relation to Prior Work: Yes
Reproducibility: Yes
Additional Feedback:
Summary and Contributions: This work proposes a neural network approach to fit and predict implied volatility surfaces. The proposed approach satisfies the no-arbitrage principle, which is key in real finance applications. The method is evaluated on synthetic data and S&P 500 data. Overall, it is a high-quality submission, but I do have one concern: the paper studies a very niche problem in the finance industry, and many designs/concepts/novelties are heavily tied to the problem itself, so it may have a quite limited audience at a machine learning venue.
Strengths: * The writing is very clear and easy to follow, though it is not quite friendly to a non-finance audience, because many materials are placed in the appendix, probably due to the page limit. * I quite like the key design of the predictive model: instead of producing the total variance/implied volatility directly, the authors chose to use the NN to ‘correct’ an existing prediction from a mathematical finance model (e.g., SVI); what the NN produces is a ‘rescaling’ factor. This design is novel as far as I can tell, and it points to a neat way to build a hybrid of a NN and a mathematical finance model.
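The hybrid design described above can be sketched in a few lines. The SVI parameter values and the stand-in correction function below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def svi_total_variance(k, a=0.04, b=0.1, rho=-0.3, m=0.0, s=0.2):
    # Raw SVI parameterization of total variance (illustrative parameters).
    return a + b * (rho * (k - m) + np.sqrt((k - m) ** 2 + s ** 2))

def nn_rescale(k, tau):
    # Stand-in for the trained network's output. By construction it is
    # strictly positive, mirroring an activation like alpha * (1 + tanh(.)).
    alpha = 0.5
    return alpha * (1.0 + np.tanh(0.1 * k * tau))

def hybrid_surface(k, tau):
    # The hybrid prediction: prior model times a positive rescaling factor.
    return svi_total_variance(k) * nn_rescale(k, tau)

k = np.linspace(-1.0, 1.0, 5)
print(hybrid_surface(k, 0.5))
```

Because the correction factor is positive, the hybrid surface inherits the positivity of the prior everywhere, including in extrapolation regions.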
Weaknesses: * "This work is the first to suggest the use of soft-constraints on the IVS values to guide the training of a deep model" is an over-claim: the use of soft constraints on NN models for fitting implied volatility surfaces has already been investigated in Yu 2018 [60]. * The submission contains several typos, e.g., Line 33, ‘fir’ -> ‘for’, and Line 190, the MAPE error formulation. * I can understand that the final-layer activation \alpha(1+\tanh(\cdot)) with \alpha>0 is chosen to guarantee positivity, but (1+\tanh(\cdot)) is an odd choice; why not simply use a sigmoid? * It is not very clear how condition (6), i.e., the large-moneyness behaviour, is linked to the design of the loss term \mathcal{L}_{C6}. * As for the choice of prior model, SVI is not the state of the art among mathematical finance models; to name a few alternatives, the authors may want to try SSVI, extended SSVI, or a rough volatility model.
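On the activation question: the two choices are in fact equivalent up to rescaling, since 1 + tanh(x) = 2*sigmoid(2x), so \alpha(1+\tanh(\cdot)) is just a sigmoid with scaled input and output. A quick numerical check of the identity (standalone, not the authors' code):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Verify 1 + tanh(x) == 2 * sigmoid(2x) at a few sample points.
for x in [-3.0, -0.5, 0.0, 0.7, 2.5]:
    lhs = 1.0 + math.tanh(x)
    rhs = 2.0 * sigmoid(2.0 * x)
    assert abs(lhs - rhs) < 1e-12
```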
Correctness: Yes
Clarity: Yes
Relation to Prior Work: Yes
Reproducibility: Yes
Additional Feedback: NB: the broader impact section was not included.
Summary and Contributions: The authors propose an approach for modeling the implied volatility surface that works by using a neural network to correct the output of a prior (parametric) model, such as Black-Scholes. The main idea is a simple one: parametric models like BS can fail to reliably describe the volatility surface "away from the money" and may introduce spurious and undesirable arbitrage opportunities, but one can "correct" these artifacts using a secondary model (in this case a neural network). The paper is very well written, and while many of the details are pushed into the appendix, the authors did a very good job of describing a complex problem (modeling the implied volatility surface) while maintaining focus on their contribution. My main concern is that this paper has a very narrow scope and may be better suited to a financial machine learning venue. In addition, the authors failed to include a section on broader impact.
Strengths: The primary strength of this paper is that the authors present a practical solution to an important financial modeling problem and clearly demonstrate an improvement over standard parametric approaches to modeling the volatility surface, in terms of modeling market data, providing a stable price surface away from observed data (tail behavior) and in enforcing the no arbitrage conditions. This benefit is clearly visible in the results shown in the main paper.
Weaknesses: I only see a few weaknesses in this paper, other than a lack of relevance to the broader NeurIPS community. First, why did the authors use a feed-forward NN model? With only 2 features, I wonder if a linear model with an appropriate (say quadratic or cubic) basis expansion could provide enough modeling flexibility. Second, I would have liked to see some of the training details make it into the main paper. I assume standard gradient-based optimization methods were used, but some detail would be nice, especially with regard to the arbitrage constraints. In addition, more detail is needed on how the prior model was used to sample points to enforce the zero-arbitrage condition and tail behavior. Also, how large is the market data for a typical option used in the analysis? Finally, the authors mention a few existing applications of NNs to volatility surface smoothing in the related work section, but they don't compare to these models in the results section. The proposed system seems solid, but a more realistic baseline than BS or SVI would help drive this home.
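To make the basis-expansion suggestion concrete, here is a minimal NumPy-only sketch: a cubic monomial expansion of the two features (log-moneyness and maturity, with a synthetic smooth target purely for illustration, not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical two-feature inputs: log-moneyness k and maturity tau.
k = rng.uniform(-1.0, 1.0, 500)
tau = rng.uniform(0.1, 2.0, 500)
# Smooth synthetic "smile" target, purely for illustration.
y = 0.2 + 0.1 * k**2 + 0.05 * tau

def basis(k, tau):
    # All monomials in (k, tau) up to total degree 3, plus a constant.
    cols = [np.ones_like(k)]
    for d in range(1, 4):
        for i in range(d + 1):
            cols.append(k ** (d - i) * tau ** i)
    return np.column_stack(cols)

X = basis(k, tau)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef
print(np.max(np.abs(resid)))  # essentially zero: the target lies in the basis
```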
Correctness: As mentioned above, my main criticism of the empirical results is that the authors didn't directly compare to existing NN-based smoothing methods. It's not important that the proposed method be significantly better in all cases, but providing a more realistic baseline would be helpful.
Clarity: The paper is very well written. I only have a few minor comments: Line 26: "...interpolation and extrapolation..."; Line 33: "...flexible predictors for applications..."; Line 108: "and that the interest rate"; Figure 2 appears before Figure 1 in the PDF.
Relation to Prior Work: yes.
Reproducibility: Yes
Additional Feedback:
Summary and Contributions: This manuscript presents a neural network approach to fit, correct and extrapolate implied volatility surfaces from option prices. The two main novelties introduced are: 1/ the use of soft constraints (penalty terms) to characterize no-arbitrage conditions (the "correct" and "extrapolate" part), and 2/ a neural-network-based fit that includes a model-based prior; the product of this prior and the neural network output produces the surface.
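The soft-constraint idea can be illustrated with a hypothetical penalized loss. The hinge form and the names `butterfly`/`calendar` are illustrative assumptions, not the paper's exact \mathcal{L}_{C4}-\mathcal{L}_{C6} terms:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def penalized_loss(w_pred, w_obs, butterfly, calendar, lam=10.0):
    """Fit term plus hinge penalties on no-arbitrage conditions.

    w_pred, w_obs : predicted / observed total variances
    butterfly, calendar : arbitrage functionals evaluated on the surface;
        no-arbitrage requires both to be non-negative, so only the
        negative part is penalized (a soft constraint, not a hard one).
    """
    fit = np.mean((w_pred - w_obs) ** 2)
    penalty = np.mean(relu(-butterfly)) + np.mean(relu(-calendar))
    return fit + lam * penalty

# Toy check: violating the butterfly condition increases the loss.
w = np.array([0.04, 0.05, 0.06])
ok = penalized_loss(w, w, butterfly=np.array([0.1]), calendar=np.array([0.1]))
bad = penalized_loss(w, w, butterfly=np.array([-0.1]), calendar=np.array([0.1]))
print(ok, bad)
```

The penalty weight `lam` trades off fitting the (possibly arbitrageable) data against suppressing arbitrage in the output surface.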
Strengths: This work is significant for a few main reasons, among which: - Classical fitting techniques come in two flavors: one relying on rigid models (typically just the prior component), which struggle to fit the data; another relying on ad-hoc parametrizations that are more flexible, but are hard to calibrate so as not to include arbitrages (and are usually slower). This work tries to mix the best of both worlds. - Dual to vol fitting is option pricing. The pricing function is highly non-linear, unknown in closed form (for American options) and requires involved numerical methods (e.g., a PDE solver, or Longstaff-Schwartz Monte Carlo simulations). Having a robust and fast vol-fitting procedure will minimize calls to the pricers, resulting in drastic speed-ups.
Weaknesses: - The authors do not compare their work in terms of accuracy and speed to other techniques. - The objective function could use more work, especially given the data available to the authors as inputs. - Some details on how the authors pre-process their inputs are unclear.
Correctness: My only concern about the methodology is that the evaluation does not fully reflect the aim of the paper. The authors are trying to achieve two things: 1- fit the data (to the extent that the data is arbitrage-free) and 2- correct any arbitrage in the data. The authors do report RMSE and MAPE for the fitting part, as well as different metrics for the arbitrage penalties C_{4,5,6}. However, when RMSE/MAPE is large, it is unclear to me whether it is because the fit doesn't go through the arbitrage-free data (a real fitting issue), or because the data contains arbitrages and the correction part is doing its job. One could try to guess the answer using C_{4,5,6}, but this is not enough. For example, including metrics for how arbitrage-free the data itself is would enlighten the reader as to whether a large RMSE is expected or not.
Clarity: The paper is well written, short and concise. The appendix is a great addition and goes straight to the point. No concerns on that end.
Relation to Prior Work: The introduction clearly details prior work in this field.
Reproducibility: Yes
Additional Feedback: \section*{Comments}
\subsection*{General comments}
From a theoretical standpoint, the two novelties that the authors claim to introduce (combining a prior model with a neural network, and using soft constraints for the no-arbitrage conditions) are of critical importance and make this manuscript worthy of publication to me.
From a practical standpoint, the implementation details (whether regarding the loss function, the neural network inputs or the training set) could use more work before producing a useful implied volatility surface superior to those produced by more classical techniques. A comparison with these techniques would have been most welcome.
\subsection*{Objective function}