The proposed solution is simply to use another distribution to model your data. You could have used any other multivariate continuous distribution of proper dimension. IMO it is not a valid answer as the distribution has no link to your data.

After inspection, it appears that the problem has nothing to do with Lilliefors’s test. In OT 1.15 we were using this test (under the wrong name of Kolmogorov) to select automatically a distribution suited to the input sample, but we switched to a more sophisticated selection algorithm (see MetaModelAlgorithm::BuildDistribution). It is based on a first pass using the raw Kolomgorov test (thus ignoring the fact that parameters have been estimated) then an information-based criterion is used to select the most relevant model (AIC, AICC, BIC depending on the value of the “MetaModelAlgorithm-ModelSelectionCriterion” key in ResourceMap. The problem is caused by the TrapezoidalFactory class during the Kolmogorov phase. I will provide a fix ASAP in OpenTURNS master. In the mean time, I have adapted the proposed solution to something more adapted to your data:

degree = 6
dimension_xi_X = 3
dimension_xi_Y = 450
enumerateFunction = ot.HyperbolicAnisotropicEnumerateFunction(dimension_xi_X, 0.8)
basis = ot.OrthogonalProductPolynomialFactory(
    [ot.StandardDistributionPolynomialFactory(ot.HistogramFactory().build(sample_X[:,i])) for i in range(dimension_xi_X)], enumerateFunction)
basisSize = enumerateFunction.getStrataCumulatedCardinal(degree)
#basis = ot.OrthogonalProductPolynomialFactory(
#    [ot.HermiteFactory()] * dimension_xi_X, enumerateFunction)
#basisSize = 450#enumerateFunction.getStrataCumulatedCardinal(degree)
adaptive = ot.FixedStrategy(basis, basisSize)
projection = ot.LeastSquaresStrategy(
    ot.LeastSquaresMetaModelSelectionFactory(ot.LARS(), ot.CorrectedLeaveOneOut()))
ot.ResourceMap.SetAsScalar("LeastSquaresMetaModelSelection-ErrorThreshold", 1.0e-7)
algo_chaos = ot.FunctionalChaosAlgorithm(sample_X, 
outputSampleChaos,basis.getMeasure(), adaptive, projection)
result_chaos = algo_chaos.getResult()
meta_model = result_chaos.getMetaModel()
metaModel = ot.PointToFieldConnection(postProcessing, 

I also implemented a quick and dirty estimator of the L2-error:

# Meta_model validation
iMax = 5
# Input values
sample_X_validation = ot.Sample(np.array(month_1_parameters_MSE.iloc[:iMax,0:3]))
print("sample size=", sample_X_validation.getSize())
# sample_X = ot.Sample(month_1_parameters_MSE[['Rseries','Rsh','Isc']])

# output values
Field = ot.Field(mesh,np.array(month_1_simulated.iloc[0:1]).transpose())
sample_Y_validation = ot.ProcessSample(1,Field)
for k in range(1,iMax):
    sample_Y_validation.add( np.array(month_1_simulated.iloc[k:k+1]).transpose() )

# In[18]:

graph = sample_Y_validation.drawMarginal(0)
drawables = graph.getDrawables()
graph2 = metaModel(sample_X_validation).drawMarginal(0)
drawables = graph2.getDrawables()
graph.setTitle('Model/Metamodel Validation')
drawables = graph.getDrawables()
L2_error = 0.0
for i in range(iMax):
    L2_error = (drawables[i].getData()[:,1]-drawables[iMax+i].getData()[:,1]).computeRawMoment(2)[0]
print("L2_error=", L2_error)

You get an error of 79.488 with the previous answer and 1.3994 with the new proposal. Here is a graphical comparison.

Comparison between test data & previous answer Comparison between test data & new proposal

CLICK HERE to find out more related problems solutions.

Leave a Comment

Your email address will not be published.

Scroll to Top