{"title": "Application of Neural Network Methodology to the Modelling of the Yield Strength in a Steel Rolling Plate Mill", "book": "Advances in Neural Information Processing Systems", "page_first": 698, "page_last": 705, "abstract": null, "full_text": "Application of Neural Network Methodology to the Modelling of the Yield Strength in a Steel Rolling Plate Mill \n\nAh Chung Tsoi \nDepartment of Electrical Engineering \nUniversity of Queensland, \nSt Lucia, Queensland 4072, \nAustralia. \n\nAbstract \n\nIn this paper, a tree based neural network, viz. MARS (Friedman, 1991), is applied to the modelling of the yield strength of a steel rolling plate mill. The inputs to the time series model are temperature, strain, strain rate, and interpass time, and the output is the corresponding yield stress. It is found that the MARS-based model reveals which input variables have a significant, nonlinear functional influence on the yield stress. The results are compared with those obtained by using a Kalman filter based online tuning method and other classification methods, e.g. CART, C4.5, and Bayesian classification. It is found that the MARS-based method consistently outperforms the other methods. \n\n1 Introduction \n\nHot rolling of steel slabs into flat plates is a common process in a steel mill. This technology has been in use for many years. The process of rolling hot slabs into plates is relatively well understood [see, e.g., Underwood, 1950]. But with intense international market competition, there is increasing demand on the quality of the finished plates. This demand for quality fuels the search for a better understanding of the underlying mechanisms of the transformation of hot slabs into plates, and for better control of the parameters involved. A better understanding of the controlling parameters should lead to a more optimal setting of the process controls, and ultimately to a better quality final product. 
\n\fANN Modelling of a Steel Rolling Plate Mill \n\nIn this paper, we consider the problem of modelling the plate yield stress in a hot steel rolling plate mill. Rolling is a process of plastic deformation; its objective is achieved by subjecting the material to forces of such a magnitude that the resulting stresses produce a permanent change of shape. Apart from the obvious dependence on the materials used, the characteristics of a material undergoing plastic deformation are described by stress, strain and temperature, if the rolling is performed on hot slabs. In addition, the interpass time, i.e., the time between passes of the slab through the rollers (an indirect measure of the rolling velocity), directly influences the metallurgical structure of the metal during rolling. \n\nThere is considerable evidence that the yield stress is also dependent on the strain rate. In fact, it is observed that as the strain rate increases, the initial yield point increases appreciably, but after an extension is achieved, the effect of strain rate on the yield stress is very much reduced [see, e.g., Underwood, 1950]. \n\nThe effect of temperature on the yield stress is important. It is shown that the resistance to deformation increases with a decrease in temperature. The resistance to deformation versus temperature diagram shows a \"hump\" in the curve, which corresponds to the temperature at which the structure of the material changes fundamentally [see, e.g., Underwood, 1950, Hodgson & Collinson, 1990]. \nUsing, e.g., an energy method, it is possible to formulate a theoretical model of the dependence of deformation resistance on temperature, strain, strain rate, and velocity (indirectly, the interpass time). 
One may then validate the theoretical model by performing a rolling experiment on a piece of material, perhaps under laboratory conditions [see, e.g., Horihata, Motomura, 1988, for consideration of a three roller system]. \n\nIt is difficult to apply the derived theoretical model to a practical situation, because in a practical process the measurements of strain and strain rate are not accurate. Secondly, one cannot possibly perform a rolling experiment on each new piece of material to be rolled. Thus, though the theoretical model may serve as a guide to our understanding of the process, it is not suitable for controller design purposes. \n\nThere are empirical models relating the resistance to deformation to temperature, strain and strain rate [see, e.g., Underwood, 1950, for an account of older models]. These models are often obtained by fitting the observed data to a general data model. \n\nThe following model has been found useful in fitting the observed practical data \n\nkm = a ε^b [sinh^{-1}(c ε̇ exp(d/T))]^f \n\n(1) \n\nwhere km is the yield stress, ε is the strain, ε̇ is the corresponding strain rate, and T is the temperature. a, b, c, d and f are unknown constants. It is claimed that this model will give a good prediction of the yield stress, especially at lower temperatures and for thin plate passes [Hodgson & Collinson, 1990]. \nThis model does not always give good predictions over all temperatures, as mill conditions vary with time and the model is only \"tuned\" on a limited set of data. \n\nIn order to overcome this problem, McFarlane, Telford, and Petersen [1991] have experimented with a recursive model based on the Kalman filter in control theory (see, e.g., Anderson, Moore [1980]) to update the parameters a, b, c, d, f in the above model. 
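The empirical law (1) is straightforward to evaluate numerically. The sketch below is an illustration only: the coefficient values a, b, c, d, f used here are hypothetical placeholders, not the constants fitted for the mill.

```python
import math

def yield_stress(strain, strain_rate, temperature, a, b, c, d, f):
    # Empirical law (1): km = a * strain**b * [asinh(c * strain_rate * exp(d/T))]**f
    z = c * strain_rate * math.exp(d / temperature)
    return a * strain**b * math.asinh(z)**f

# Hypothetical coefficients, for illustration only
a, b, c, d, f = 80.0, 0.2, 0.05, 3000.0, 0.3
km = yield_stress(0.3, 10.0, 1200.0, a, b, c, d, f)
```

Note that, since exp(d/T) decreases as T rises (for d > 0), this form reproduces the observation above that the resistance to deformation falls with increasing temperature.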
To better describe the material behaviour at different temperatures, the model explicitly incorporates two separate sub-models with a temperature dependence: \n1. Full recrystallisation (T < T_upper): \n\nkm = a ε^b [sinh^{-1}(c ε̇ exp(d/T))]^f \n\n(2) \n\nThe constants a, b, c, d, f are model coefficients. \n2. Partial recrystallisation (T_lower ≤ T ≤ T_upper): \n\nkm = a (ε + ε*)^b [sinh^{-1}(c ε̇ exp(d/T))]^f \n\n(3) \n\nt_0.5 = j (A_{i-1} ε_{i-1} + ε_i)^g f_1((q(T_{i-1}, T_i))^h) \n\n(4) \n\nA_i = f_2(t, t_0.5) \n\n(5) \n\nwhere A is the fractional retained strain; ε*, expressed as a Taylor series expansion of A_{i-1} ε_{i-1}, is the retained strain; t is the interpass time; t_0.5 is the 50% recrystallisation time; q(T_{i-1}, T_i) is a prescribed nonlinear function of T_{i-1} and T_i; f_1(.) and f_2(.) are pre-specified nonlinear functions; i is the roll pass number; j, h, g are model coefficients; T_upper is an experimentally determined temperature at which the material undergoes a permanent change in structure; and T_lower is a temperature below which the material does not exhibit any plastic behaviour. \n\nModel coefficients a, b, c, d, f, g, h, j are either estimated in a batch mode (i.e., all the past data are assumed to be available simultaneously) or adapted recursively on-line (i.e., only a limited number of the past data are available) using a Kalman filter algorithm, in order to provide the best model predictions [McFarlane, Telford, Petersen, 1991]. \nIt is noted that these models are motivated by the desire to fit a nonlinear model of a special type, i.e., one which has an inverse hyperbolic sine function. But, since the basic operation is data fitting, i.e., fitting a model to the set of given data, it is possible to consider more general nonlinear models. 
These models may not have any ready interpretation in metallurgical terms, but they may be better at fitting a nonlinear model to the given data set, in the sense that they may give a better prediction of the output. \n\nIt has been shown (see, e.g., Hornik et al., 1989) that a class of artificial neural networks, viz., a multilayer perceptron with a single hidden layer, can approximate any arbitrary input-output function to an arbitrary degree of accuracy. Thus it is reasonable to experiment with different classes of artificial neural network or induction tree structures for fitting the set of given data, and to examine which structure gives the best performance. \n\nThe structure of the paper is as follows: in section 2, a brief review of a special class of neural networks is given. In section 3, results of applying the neural network model to the plate mill data are given. \n\n2 A Tree Based Neural Network Model \n\nFriedman [1991] introduced a new class of neural network architecture called MARS (Multivariate Adaptive Regression Splines). This class of methods can be interpreted as a tree of neurons, in which each leaf of the tree consists of a neuron. The model of the neuron may be a piecewise linear polynomial, or a cubic polynomial, with the knot as a variable. In view of the lack of space, we refer interested readers to Friedman's paper [1991] for details of this method. \n\n3 Results \n\nMARS has been applied to the plate mill data. We have used the data in the following manner. \n\nWe concatenate different runs of the plate mill into a single time series. This consists of 2877 points corresponding to 180 individual plates, with approximately 16 passes per plate. There are 4 independent variables, viz., interpass time, temperature, strain, and strain rate. The desired output variable is the yield stress. 
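The concatenation of per-plate runs into a single series can be sketched as follows. The record layout and field names below are illustrative assumptions, not the actual format of the mill data.

```python
# Sketch: concatenating per-plate pass records into one time series.
# 'runs' is a list of plates; each plate is a list of passes; each pass is a
# dict holding the four inputs and the measured yield stress (names assumed).
def concatenate_runs(runs):
    X, y = [], []
    for plate in runs:
        for p in plate:
            X.append([p['interpass_time'], p['temperature'],
                      p['strain'], p['strain_rate']])
            y.append(p['yield_stress'])
    return X, y

runs = [[{'interpass_time': 8.0, 'temperature': 1150.0, 'strain': 0.25,
          'strain_rate': 12.0, 'yield_stress': 95.0}]]
X, y = concatenate_runs(runs)
```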
\n\nA plot of the individual variables, viz. temperature, strain, strain rate, interpass time and stress versus time, reveals that the variables can vary rather considerably over the entire time series. In addition, plots of stress versus temperature, stress versus strain, stress versus strain rate and stress versus interpass time reveal that the functional dependence could be highly nonlinear. \nWe have chosen to use an additive model (Friedman [1991]), instead of the more general multivariate model, as this allows us to observe any possible nonlinear functional dependence of the output on each of the inputs: \n\nkm = k_1 f_1(x_1) + k_2 f_2(x_2) + k_3 f_3(x_3) + k_4 f_4(x_4) \n\n(6) \n\nwhere x_1, ..., x_4 denote the four input variables, k_i, i = 1, 2, 3, 4 are gains, and f_i, i = 1, 2, 3, 4 are piecewise nonlinear polynomial models found by MARS. \n\nThe results are as follows: \n\nBoth the piecewise linear polynomial and the piecewise cubic polynomial are used to study this set of data. It is found that the cubic polynomial gives a better fit than the linear polynomial. Figure 1(a) shows the error plot between the estimated output from a cubic spline fit and the training data. It is observed that the error is very small; the maximum error is about -0.07. Figure 1(b) shows the plot of the predicted yield stress and the original yield stress over the set of training data. \n\nFigure 1: (a) The prediction error on the training data set (b) The prediction and the training data set superimposed \n\nThese figures indicate that the cubic polynomial fit has captured most of the variation of the data. It is interesting to note that in this model, the interpass time plays no significant part. This feature may be a peculiar aspect of this set of data points; it is not true in general. \nIt is found that the strain rate has the most influence on the data, followed by temperature, and then by strain. 
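An additive model of the form (6) can be illustrated with fixed, evenly spaced knots and an ordinary least squares fit. This stands in for, but is not, MARS's adaptive knot selection, and the synthetic data below are for illustration only.

```python
# Sketch of an additive piecewise-linear fit in the spirit of equation (6),
# using fixed knots and ordinary least squares rather than MARS's adaptive
# knot placement.
import numpy as np

def hinge_basis(x, n_knots=3):
    # Build [x, max(0, x - t1), ..., max(0, x - tk)] for one input variable.
    knots = np.linspace(x.min(), x.max(), n_knots + 2)[1:-1]
    return np.column_stack([x] + [np.maximum(0.0, x - t) for t in knots])

def fit_additive(X, y, n_knots=3):
    # X: (n_samples, n_inputs); one hinge expansion per input, plus an intercept.
    design = np.column_stack([np.ones(len(y))] +
                             [hinge_basis(X[:, j], n_knots) for j in range(X.shape[1])])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef, design

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 4))
y = 2.0 * X[:, 0] ** 2 + np.sin(3.0 * X[:, 1])  # a nonlinear additive target
coef, design = fit_additive(X, y)
residual = y - design @ coef
```

Because each input enters only through its own basis expansion, the fitted contribution of each variable can be plotted separately, which is the property exploited above to read off nonlinear functional dependencies.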
The model, once obtained, can be used to predict the yield stress from a given set of temperature, strain, and strain rate values. \n\nFigure 2(a) shows the prediction error between the yield stress and the predicted yield stress on a set of testing data, i.e., data which were not used to train the model, and Figure 2(b) shows a plot of the predicted yield stress superimposed on the original yield stress. \nIt is observed that the prediction on the set of testing data is reasonable. This indicates that the MARS model has captured most of the dynamics underlying the original training data, and is capable of extending this captured knowledge to a set of hitherto unseen data. \n\nFigure 2: (a) The prediction error on the testing data set (b) The prediction and the testing data set superimposed \n\n4 Comparison with the Results Obtained by Conventional Approaches \n\nIn order to compare the artificial neural network approach with more conventional methods for model tuning, the same data set was processed using: \n\n1. A MARS model with cubic polynomials \n2. An inverse hyperbolic sine law model using least squares batch parameter tuning \n3. An inverse hyperbolic sine law model using recursive least squares tuning \n4. CART based classification [Breiman et al., 1984] \n5. The C4.5 based method [Quinlan, 1986, 1987] \n6. Bayesian classification [Buntine, 1990] \n\nIn each case, we used a training data set of 78 plates (1242 passes) and a testing data set of 16 plates (252 passes). 
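The error statistics reported in the comparison below (mean%, mean abs% and std% of the relative prediction error) can be computed as in the following sketch; the sample numbers are illustrative, not mill data.

```python
# Relative prediction-error statistics as used in the comparison tables:
# mean%, mean abs% and std% of (k_meas - k_pred) / k_meas, in percent.
import statistics

def error_stats(k_meas, k_pred):
    rel = [(m - p) / m for m, p in zip(k_meas, k_pred)]
    return {
        'mean%': 100.0 * statistics.mean(rel),
        'mean abs%': 100.0 * statistics.mean(abs(r) for r in rel),
        'std%': 100.0 * statistics.stdev(rel),
    }

stats = error_stats([100.0, 120.0, 90.0], [98.0, 126.0, 90.0])
```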
In the cases of the CART, C4.5, and Bayesian classification methods, the yield stress variable is divided equally into 10 classes, and this is used as the desired output instead of the original real values. \nThe comparison of the results between MARS and the Kalman filter based approach is shown in the following table: \n\n            B11    B12    A11    A12    C11    C12 \nmean%      -0.64  -0.64  -0.2   4.5    1.69   2.38 \nmean abs%   4.61   4.61   4.22   5.3    3.5    5.3 \nstd%        6.26   6.26   5.11   6.25   4.9    4.7 \n\nwhere \nB11 = Batch tuning: tuning the model (forgetting factor = 1 in adaptation) on the training data \nB12 = Batch tuning: running the tuned model on the testing data \nA11 = Adaptation: on the training data \nA12 = Adaptation: on the testing data \nC11 = MARS on the training data \nC12 = MARS on the testing data, \nand mean% = mean((k_meas - k_pred)/k_meas), \nmean abs% = mean(abs(k_meas - k_pred)/k_meas), \nstd% = stdev((k_meas - k_pred)/k_meas), \nwhere mean and stdev stand for the mean and the standard deviation respectively, and k_meas, k_pred represent the measured and predicted values of the yield stress respectively. \n\nIt is found that the MARS based model performs extremely well compared with the other methods. The standard deviation of the prediction errors in the MARS model is considerably less than the corresponding standard deviation of prediction errors in a Kalman filter type batch or online tuning model on the testing data set. \n\nWe have also compared MARS with both the CART based method and the C4.5 based method. As both CART and C4.5 operate only on an output category, rather than a continuous output value, it is necessary to convert the yield stress into a category type of variable. We have chosen to divide the yield stress equally into 10 classes. With this modification, the CART and C4.5 methods are readily applicable. \nThe following table summarises the results of this comparison. 
The values given are the percentage prediction error on the testing data set for the various methods. In the case of MARS, we have converted the prediction from a continuous variable into the corresponding classes as used in the CART and C4.5 methods. \n\n   Bayes    CART    C4.5    MARS \n   65.4     12.99   16.14   6.2 \n\nIt is found that the MARS model is more consistent in predicting the output classes than the CART method, the C4.5 based method, or the Bayesian classifier. The fact that the MARS model performs better than the CART model can be seen as a confirmation that the MARS model is a generalisation of the CART model (see Friedman [1991]). But it is rather surprising to see that the MARS model outperforms a Bayesian classifier. \n\nThe results are similar over a number of other typical data sets, e.g., when the interpass time variable becomes significant. \n\n5 Conclusion \n\nIt is found that MARS can be applied to model the plate mill data with very good accuracy. In terms of predictive power on unseen data, it performs better than the more traditional methods, e.g., Kalman filter batch or online tuning methods, CART, C4.5 or the Bayesian classifier. \nIt is almost impossible to convert the MARS model into one of the models given in section 1. The Hodgson-Collinson model places a breakpoint at a temperature of 925 deg C, while in the MARS model the temperature breakpoints are found to be at 1017 deg C and 1129 deg C respectively. Hence it is difficult to convert the MARS model into those given by the Hodgson-Collinson model or the Kalman filter type models, or vice versa. \n\nA possible improvement to the current MARS technique would be to restrict the breakpoints so that they must lie within a temperature region where microstructural changes are known to occur. 
\n\n6 Acknowledgements \n\nThe author acknowledges the assistance given by the staff at the BHP Melbourne Research Laboratory in providing the data, as well as the background material for this paper. He especially thanks Dr D. McFarlane for generously giving his time to assist in the understanding of the more traditional approaches, and for providing the results on the Kalman filtering approach. He is also indebted to Dr W. Buntine, RIACS, NASA Ames Research Center, for providing an early version of the induction tree based programs. \n\n7 References \n\nAnderson, B.D.O., Moore, J.B., (1980). Optimal Filtering. Prentice Hall, Englewood Cliffs, NJ. \nBreiman, L., Friedman, J., Olshen, R.A., Stone, C.J., (1984). Classification and Regression Trees. Wadsworth, Belmont, CA. \nBuntine, W., (1990). A Theory of Learning Classification Rules. PhD Thesis, University of Technology, Sydney. \nFriedman, J., (1991). \"Multivariate Adaptive Regression Splines\". Annals of Statistics, to appear. (The implications of the paper for neural network models were also presented orally at the 1990 NIPS Conference.) \nHodgson, Collinson, (1990). Manuscript in preparation (the authors are with BHP Research Laboratories, Melbourne, Australia). \nHorihata, M., Motomura, M., (1988). \"Theoretical Analysis of 3-Roll Rolling Process by the Energy Method\". Transactions of the Iron and Steel Institute of Japan, 28:6, 434-439. \nHornik, K., Stinchcombe, M., White, H., (1989). \"Multilayer Feedforward Networks are Universal Approximators\". Neural Networks, 2, 359-366. \nMcFarlane, D., Telford, A., Petersen, I., (1991). Manuscript in preparation. \nQuinlan, R., (1986). \"Induction of Decision Trees\". Machine Learning, 1, 81-106. \nQuinlan, R., (1987). \"Simplifying Decision Trees\". International Journal of Man-Machine Studies, 27, 221-234. \nUnderwood, L.R., (1950). The Rolling of Metals. Chapman & Hall, London. 
\n\n\f", "award": [], "sourceid": 482, "authors": [{"given_name": "Ah", "family_name": "Tsoi", "institution": null}]}