# Machine learning method for identification of significant parameters

1. ## Machine learning method for identification of significant parameters

Hello everyone,

Preface: I searched the web with Google but evidently failed to use the right keywords, so I'd better describe what I'm looking for to real people rather than a search engine:

I have a black-box system with several input and several output values. As a first step, I am looking for a method or tool that tells me which of the input parameters are significant and which I can ignore. Why? Because in the end I'd like to guess input parameters that produce a defined output, so I'd like the system to be as small as possible and not over-determined. For step 2 I'll need some learning algorithm to predict input parameters for a given output.

Can anyone please name some methods I can explore? For step 2 I could use neural networks, but maybe there's something better. For step 1 I can't find anything; some sort of correlation, maybe?

thanks and cheerio,
heikster


2. ## Re: Machine learning method for identification of significant parameters

Have you read about ANOVA (Analysis of Variance) and Linear Regression Analysis?

Do you have control of the input variables such that you can vary one to the exclusion of the others and measure its effect on the outputs? Or, are you faced with a situation of multiple inputs that are changing and affecting multiple outputs?

John


3. ## Re: Machine learning method for identification of significant parameters

Hi John,

I'm not familiar with regression of more than one independent variable. Would ANOVA cover such a topic?

I can only post-process existing data, but the input data sets contain variations of every variable. I thought there must be methods that can tell which parameters are essential and which don't matter much.

cheers,
heikster
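
For readers who, like the poster, have not met regression with more than one independent variable: below is a minimal sketch of an ordinary least-squares fit with two inputs, written in plain Python. The function names (`solve_linear_system`, `fit_multiple_regression`) and the toy data are assumptions made for illustration; nothing here comes from the thread itself, and a statistics package would normally be used instead.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("inputs are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_multiple_regression(rows, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 + ...; returns [b0, b1, ...]."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 models the intercept
    p = len(design[0])
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical toy data generated from y = 1 + 2*x1 - 3*x2 (no noise).
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (1.0, 3.0)]
y = [1 + 2 * x1 - 3 * x2 for x1, x2 in rows]
coeffs = fit_multiple_regression(rows, y)
print(coeffs)
```

The fitted coefficients describe how strongly the output responds to each input while the others are held fixed. Comparing them across inputs is only meaningful once the inputs are on comparable scales, for example after standardizing each column.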


4. ## Re: Machine learning method for identification of significant parameters

From Wikipedia ( http://en.wikipedia.org/wiki/ANOVA ):

"In the ANOVA setting, the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form."

Now, can you answer my question in post #2? Is there anything else that you can tell us about the problem? Is this a school assignment? If not, then what are the inputs and outputs? Are the inputs independent variables?

John
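
To make the quoted idea of partitioning variance concrete, here is a minimal sketch of a one-way ANOVA in plain Python. It splits the total sum of squares into a part explained by group membership and a part left over within the groups. The function name `one_way_anova` and the three small groups are assumptions chosen for illustration, not data from this thread.

```python
def one_way_anova(groups):
    """Return (ss_between, ss_within, f_stat) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]

    # Variation explained by which group an observation belongs to.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation that remains inside each group.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_stat


# Hypothetical output values measured at three settings of one input.
groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [10.0, 11.0, 12.0]]
ss_between, ss_within, f_stat = one_way_anova(groups)

# The two components add up to the total variation around the grand mean.
grand_mean = sum(v for g in groups for v in g) / 9
ss_total = sum((v - grand_mean) ** 2 for g in groups for v in g)
print(ss_between, ss_within, ss_total, f_stat)
```

A large F statistic means that most of the variation in the output is tied to the grouping input rather than to scatter within the groups, which is one way of judging whether that input matters.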

5. ## Re: Machine learning method for identification of significant parameters

My knowledge of statistics is basic, I'm not a native speaker, and my kids are running around, so my concentration isn't great. I don't understand the "to different sources of variation" bit.

My situation is basically the one from your question: "Or, are you faced with a situation of multiple inputs that are changing and affecting multiple outputs?"
School assignment? You must have very advanced schools; I didn't do anything like this in my studies (to be fair, I didn't specialize in statistics).
It's something I'd like to try for work. The black box is a real-time simulation tool (so every step takes real time + x, i.e. is "expensive"). The input parameters are 1.) constant parameters which I cannot change (vehicle data) and 2.) parameters I change during a heuristic optimization. Output: data about how the vehicle was driving, plus various vehicle dynamics signals.
This optimization is done for different kinds of vehicles, so input 1.) differs for every vehicle but stays constant during the optimization. The parameters 2.) are changed in order to reach certain dynamic values.
As every step is expensive, my idea was that when a new vehicle starts the optimization, I could take the parameters 1.) and some target output values and compute starting values for the parameters 2.). For that I need a model trained on existing data (it could be a neural network, but I guess there is much more in the machine learning toolbox). But beforehand I want to make sure that the model is not over-determined. This applies mainly to parameters 1.), so I want to know which vehicle parameters are important for the optimization and which are not. I already collect data during the optimization runs, and I was hoping to find a method that gives some sort of correlation value between each input value (or pair, tuple, ...) and some output value.
Is it any clearer now? Sorry that I'm so bad at explaining...
thanks a lot,
Heiko
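
The "correlation value between each input and some output" idea from this post can be sketched as a simple screening step: compute the Pearson correlation of every logged input column with one output column and rank the inputs by its absolute value. This is only an illustrative sketch in plain Python; the column names (`mass`, `wheelbase`, `paint_code`), the synthetic data, and the helper names `pearson` and `screen_inputs` are all assumptions and do not come from the poster's simulation.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)


def screen_inputs(inputs, output):
    """Rank input columns by |correlation| with one output column."""
    scores = {name: pearson(col, output) for name, col in inputs.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Synthetic stand-in for logged runs (hypothetical names and values).
rng = random.Random(0)
n_runs = 500
inputs = {
    "mass": [rng.uniform(1000, 2000) for _ in range(n_runs)],
    "wheelbase": [rng.uniform(2.4, 3.0) for _ in range(n_runs)],
    "paint_code": [rng.uniform(0, 1) for _ in range(n_runs)],  # irrelevant by design
}
# The output depends on mass and wheelbase only, plus a little noise.
output = [
    0.002 * m + 1.5 * w + rng.gauss(0, 0.1)
    for m, w in zip(inputs["mass"], inputs["wheelbase"])
]

ranking = screen_inputs(inputs, output)
for name, r in ranking:
    print(f"{name}: {r:+.2f}")
```

Such a ranking only reflects linear, one-input-at-a-time relationships. An input that matters only in combination with another, or in a strongly nonlinear way, can still score near zero, so a low correlation alone is weak evidence that a parameter can be dropped.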
