Hamburg University - Chemistry - TMC - Stribeck - nonlin
POLYMER PHYSICS | IDF Analysis by Nonlinear Regression
Overview

Interface distribution functions (IDF) are increasingly computed from small-angle X-ray scattering (SAXS, USAXS) data and interpreted. The quantitative analysis of IDFs by model fitting, however, is advancing rather slowly. Fitting programs for different models are presented and described here.

The basic models

Using the presented programs, domain thickness distributions can be extracted from the IDFs. The distributions are characterized by their position (center of gravity), their width and their skewness. The model is, in general, based on stacking statistics. Two different kinds of stacking statistics are common:
As I have shown, each of these models can be unified with a third one ("homogeneous long period distribution").

The Programs

Requirements concerning computer and data

I offer executable programs for Linux and MS-DOS. The DOS version runs in the DOS boxes of Linux, Windows 95, Windows 98 and Windows Me. Under Windows 95 and Windows 98 the programs are quite slow, or can only be started after COMMAND.COM has been replaced (the programs work under the 4DOS shell). The data supplied must represent the IDF and must be written in ASCII format. Here is an extract from a data file (h-6.dat):
    Ap8.8dz1.05Fl595smx.43 60512-6 idf
    1.40000E+0000 2.28419E+0000
    1.50000E+0000 2.38694E+0000
    1.60000E+0000 2.50201E+0000
    1.70000E+0000 2.62754E+0000
    1.80000E+0000 2.76114E+0000
    1.90000E+0000 2.89997E+0000
    2.00000E+0000 3.04085E+0000
    ...

The first line is an arbitrary comment, which is repeated in the output of the regression program. The following lines contain the curve data. The first column contains the x-values (in nanometers), the second column the values of the IDF g_{1}(x). The x-values must lie on an equidistant grid. Negative x-values are not allowed. To accelerate the fitting process, every second point of the curve is ignored. The extrapolation of the resulting grid must contain the value x=0. No more than 512 points are allowed; a file commonly contains approx. 200 points.

Be careful not to supply IDF values that are too small or too big. If the x-values range from 0 to 100, the values of g_{1}(x) should cover an interval of approximately the same order of magnitude; poorly conditioned input data will not yield reasonable output. My program topas generates suitable data files when the command #write is issued.

Regression procedure

The programs read the data and parameter files. Then the initial simplex is generated in parameter space and moved thereafter. Every improvement of the residual sum of squares (RSS) is reported on screen in the terminal window. If no improvement is achieved, only the counter is advanced. If the program terminates regularly, the old parameter file is renamed: under MS-DOS it receives the suffix .bak, under Linux a second suffix .bak is appended. After this a new parameter file is written. It contains the values found, but an "annealed" simplex. Thus it is quite frequently possible to simply start the program again without editing the parameter file. The program is started repeatedly until there is no more improvement.
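The data-file requirements stated above (equidistant grid, no negative x-values, at most 512 points, thinned grid extrapolating to x=0) can be checked before a fitting run. The following is only a minimal sketch in Python: the function names parse_idf and check_idf are hypothetical and not part of the fitting programs, and the sketch assumes that the thinned grid keeps the first, third, fifth, ... point of the file.

```python
def parse_idf(text):
    """Split an IDF data file into its comment line and (x, g1) pairs."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    comment = lines[0]
    points = [tuple(float(v) for v in ln.split()[:2]) for ln in lines[1:]]
    return comment, points


def check_idf(points, max_points=512, rel_tol=1e-6):
    """Return a list of rule violations; an empty list means none were found."""
    xs = [x for x, _ in points]
    if len(xs) < 2:
        return ["need at least two points"]
    problems = []
    if len(xs) > max_points:
        problems.append("more than %d points" % max_points)
    if min(xs) < 0:
        problems.append("negative x-value")
    step = xs[1] - xs[0]
    if step <= 0:
        return problems + ["x-values must increase"]
    if any(abs((b - a) - step) > rel_tol * step for a, b in zip(xs, xs[1:])):
        problems.append("grid is not equidistant")
    # Assumption: the thinned grid starts at the first point and has step 2*step.
    # Extrapolated backwards, it must pass through x = 0.
    n = xs[0] / (2 * step)
    if abs(n - round(n)) > 1e-6:
        problems.append("thinned grid does not extrapolate to x = 0")
    return problems


# The extract from h-6.dat shown in the text (without the elided rest).
sample = """Ap8.8dz1.05Fl595smx.43 60512-6 idf
1.40000E+0000 2.28419E+0000
1.50000E+0000 2.38694E+0000
1.60000E+0000 2.50201E+0000
1.70000E+0000 2.62754E+0000
1.80000E+0000 2.76114E+0000
1.90000E+0000 2.89997E+0000
2.00000E+0000 3.04085E+0000
"""

comment, points = parse_idf(sample)
print(comment)
print(check_idf(points))
```

The check does not test the magnitude of the g_{1}(x) values; whether they are "too small or too big" remains a judgment on the conditioning of the data.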
After each regular program run the two files fit.dat and err.dat are written. Using suitable programs, additional information on the quality of the fit can be extracted from these ASCII files. Examples of program calls:
In the complete protocol the asymptotic confidence interval is reported with each parameter value. Moreover, the parameter correlation matrix is reported. If the matrix contains values > 0.96, a reduction of the number of parameters in the model should be considered; but perhaps the starting values were simply poor. If no confidence intervals are reported, the input data are poor, unacceptable or poorly conditioned. Multiplying the y-values by a factor may help to improve the matrix condition.

The parameter file

Each of the regression programs requires a parameter file. The program is started with the parameter values from this file.

Name: It is convenient to name the parameter file <exper>.par if the data are supplied in <exper>.dat. The data file h-6.dat, e.g., should be accompanied by a parameter file h-6.par, which might look as follows:

    #10000 10
    7 0.4 6 0.3 0.3 0.3
    8 4 7 0.3 0.3 0.3
    1.000000E-004 1.000000E-004 1.000000E-004 1.000000E-004 1.000000E-004 1.000000E-004
    1.000000E-004 1.000000E-004 1.000000E-004 1.000000E-004 1.000000E-004 1.000000E-004
    1.000000E-006

This is a parameter file for the program mr_2stac, which fits two stacks of domains to the IDF.

Syntax: If the first character on the first line is a "#", the parameter file is a simplified one. After it, two numbers have to be supplied on the first line. The first number is the maximum iteration count. The second number determines how far the initial simplex is blown up; it is given in percent of the starting parameter values that follow. The following numbers may be distributed arbitrarily over different lines. Next, the starting values for the model parameters are given (in the order in which they are defined in the model function). Thus "7" designates the starting value for the weight of the first stack. The following two numbers are the two average layer thicknesses (0.4 nm and 6 nm). Thereafter the three relative widths of the generating Gaussians are given.
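The simplified syntax can be illustrated by a small reader written in Python. This is only a sketch under stated assumptions, not the parser used by the regression programs: the function name parse_simplified_par and the dictionary keys are hypothetical, and the split of the trailing numbers into one value per parameter plus one final value for the RSS is inferred from the mr_2stac example (12 + 12 + 1 numbers).

```python
def parse_simplified_par(text, n_params):
    """Read a simplified parameter file (first character '#')."""
    if not text.startswith("#"):
        raise ValueError("not a simplified parameter file")
    first, _, rest = text[1:].partition("\n")
    header = first.split()
    if len(header) < 2:
        raise ValueError("first line needs iteration count and simplex size")
    max_iter = int(header[0])
    simplex_percent = float(header[1])
    # All further numbers may be spread over the lines in any way.
    numbers = [float(tok) for tok in header[2:] + rest.split()]
    # Assumption: n starting values, n per-parameter limits, 1 limit for the RSS.
    if len(numbers) != 2 * n_params + 1:
        raise ValueError("expected %d numbers, found %d"
                         % (2 * n_params + 1, len(numbers)))
    return {
        "max_iter": max_iter,
        "simplex_percent": simplex_percent,
        "start": numbers[:n_params],
        "param_limits": numbers[n_params:2 * n_params],
        "rss_limit": numbers[-1],
    }


# The h-6.par example from the text; mr_2stac uses 12 model parameters.
example = """#10000 10
7 0.4 6 0.3 0.3 0.3
8 4 7 0.3 0.3 0.3
1.000000E-004 1.000000E-004 1.000000E-004 1.000000E-004 1.000000E-004 1.000000E-004
1.000000E-004 1.000000E-004 1.000000E-004 1.000000E-004 1.000000E-004 1.000000E-004
1.000000E-006
"""

par = parse_simplified_par(example, n_params=12)
print(par["max_iter"], par["simplex_percent"])
print(par["start"])
```

Passing n_params explicitly keeps the sketch independent of any particular model; other programs with a different number of model parameters would need a different value.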
In almost any case it is a good choice to start with values of 0.3. The first relative width indicates the heterogeneity of the stack across the irradiated volume (or the skewness of the distribution functions, resp.). The two remaining widths are related to the generating Gaussians of the two layer thickness distributions. Then a similar parameter set is given for the second stack. The parameter file is completed by a list of constraints. Here, e.g., it is specified that the minimum is considered found when the parameter variation is confined to the 5th decimal and the variation of the residual sum of squares (last number: 1.000000E-006) is confined to the 7th decimal.

Program control keys

Two keys can be used to stop the running program:
Download

Uncorrelated domains (Lamellae)
One stack
Two stacks
One stack plus uncorrelated lamellae
Here it was assumed that the thicknesses of the uncorrelated lamellae vary according to a (symmetric) Gaussian function. Thus a parameter sgH is missing in the corresponding parameter set.

The most important helper program is extrafit (Linux, DOS). It extracts the model fit found from the file "fit.dat" generated in the latest run of a fitting program and writes it into an ASCII file.