Some basic questions about Machine Learning

cherryman

Hello all, I have several basic questions about Machine Learning that I would like to clear up to solidify my knowledge.
I hope to find the answers here.

1. SVM
I have a one-dimensional feature vector (x) of length N and a target vector with labels of two classes (e.g. 0 and 1). The case is not linearly separable, so the class-conditional histograms of this feature overlap.
Can you explain why the SVM generates different hypotheses for different kernel functions?
Is it because, for the SVM, this is actually not a linear case? And when I use a kernel function, am I actually making a transformation z = phi(x)?
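To make the question concrete, here is a minimal sketch (assuming scikit-learn and made-up 1D data; not part of the original post) of the situation I mean: the same non-separable labels give different decision functions with a linear versus an RBF kernel.

import numpy as np
from sklearn.svm import SVC

# Toy 1D data: class 1 sits in the middle, class 0 on both sides,
# so no single threshold on x separates them (not linearly separable).
x = np.array([-3.0, -2.5, -2.0, -0.5, 0.0, 0.5, 2.0, 2.5, 3.0]).reshape(-1, 1)
y = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0])

# A linear kernel keeps the hypothesis a single threshold on x, while the
# RBF kernel implicitly maps x into a richer feature space, so the two
# models produce different decision boundaries on the same data.
for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel, C=1.0, gamma="scale")
    clf.fit(x, y)
    grid = np.linspace(-4, 4, 9).reshape(-1, 1)
    print(kernel, clf.predict(grid))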

2. Kernels
Can you explain the basic concept of a kernel? What does it actually do to the data set? I know that it is an inner product of the given data (x'x), but how does it affect the data set?
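As a small worked check of "a kernel is an inner product in a transformed space" (a sketch I am adding for illustration, not from the original post): for scalar inputs, the degree-2 polynomial kernel K(x, z) = (xz + 1)^2 equals the ordinary dot product of the explicit map phi(x) = (x^2, sqrt(2) x, 1).

import numpy as np

def phi(x):
    # Explicit feature map for the degree-2 polynomial kernel on scalars.
    return np.array([x**2, np.sqrt(2) * x, 1.0])

def poly_kernel(x, z):
    # The kernel computes the same inner product without ever building phi.
    return (x * z + 1.0) ** 2

x, z = 1.5, -0.7
print(np.dot(phi(x), phi(z)))   # inner product in the feature space
print(poly_kernel(x, z))        # identical value, computed directly from x and z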

3. SVM and class imbalance (classes with an uneven number of samples)
I have heard that the SVM is sensitive to imbalanced classes. Can you tell me whether that is true, and if so, why?
On the other hand, to determine the hyperplane the SVM needs only the support vectors; the rest of the data is not needed, except for the points that are misclassified
in the soft-margin SVM.
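For reference, a minimal sketch (assuming scikit-learn; the data below is made up) of the usual way uneven classes are compensated for, by giving the rarer class a larger misclassification penalty in the soft-margin objective:

import numpy as np
from sklearn.svm import SVC

# Imbalanced toy labels: 90 samples of class 0, 10 samples of class 1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(90, 2)),
               rng.normal(2.0, 1.0, size=(10, 2))])
y = np.array([0] * 90 + [1] * 10)

# class_weight='balanced' scales C for each class inversely to its frequency,
# so errors on the rare class cost more in the soft-margin objective.
clf = SVC(kernel="rbf", C=1.0, class_weight="balanced")
clf.fit(X, y)
print(clf.n_support_)  # number of support vectors per class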

4. Data normalization
Having a set of M real signals, I extracted 2D features from each of them. One signal yields N 2D features, so as a result I have M matrices of dimension N x 2. The task is to recognise 2 classes (0 and 1) from them, but one class occurs much more frequently than the other.
The main problem I have is as follows.
Within each signal (matrix) the features separate quite well, but when I build a training set from all of the signals, the features from the different classes overlap heavily. So I suspect the problem lies in the normalization. But how do I normalize properly? I have read that it is good practice to subtract the mean from the features and then divide by max(abs(x)) or by the standard deviation.
But if I do that on a vector that contains features from both classes, the result will differ from doing it on a vector that contains only one class.
My second hypothesis is that I have simply extracted weak features.
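A minimal sketch of the standardization I mean (names, shapes and data are placeholders I am adding for illustration), with the caveat that the mean and scale are computed once from the pooled training features and then reused, rather than recomputed per signal or per class:

import numpy as np

# train_features: a list of M arrays, each of shape (N, 2) -- placeholder data.
train_features = [np.random.randn(50, 2) for _ in range(5)]
X_train = np.vstack(train_features)

# Compute the statistics once, on the pooled training features of both classes.
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)

def normalize(X):
    # Apply the *same* training-set statistics to any signal, including new data,
    # so that signals normalized separately remain comparable.
    return (X - mu) / sigma

X_train_norm = normalize(X_train)
X_new_norm = normalize(np.random.randn(50, 2))  # e.g. a held-out signal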

Probably I'll have more questions, but right now that is all. If somebody can help me, I'll be very thankful.
 

Well, I see nobody has answered yet, so I will try to answer at least the first two questions:

The basic SVM works only if your data is linearly separable... if it isn't, then you'll have to try using different kernel functions. These kernel functions transform/map your data into a so-called "feature space". This feature space is usually higher-dimensional than your data space (in your case 2D).
The idea of the kernel trick is that the data in the feature space becomes linearly separable (by a hyperplane). So, if the feature space is 3D, we try to find the 2D plane that separates your data. Note that a hyperplane always has dimension n-1, where n is the feature space dimension.
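A hypothetical illustration of that mapping (made-up data, added here as a sketch): 2D points on two concentric rings are not linearly separable, but the explicit map phi(x1, x2) = (x1, x2, x1^2 + x2^2) adds a third coordinate (the squared radius) in which a flat plane separates them.

import numpy as np

# Two concentric rings in 2D: inner ring = class 0, outer ring = class 1.
theta = np.linspace(0, 2 * np.pi, 40, endpoint=False)
inner = np.c_[1.0 * np.cos(theta), 1.0 * np.sin(theta)]
outer = np.c_[3.0 * np.cos(theta), 3.0 * np.sin(theta)]

def phi(X):
    # Map (x1, x2) -> (x1, x2, x1^2 + x2^2): the new coordinate is the squared radius.
    return np.c_[X, (X ** 2).sum(axis=1)]

# In the 3D feature space the plane "squared radius = 5" separates the rings,
# even though no straight line does so in the original 2D space.
print(phi(inner)[:, 2].max(), phi(outer)[:, 2].min())  # 1.0 vs 9.0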

When this plane is found, we map it back into your original data space (so your data set is actually unaffected in the end). The plane becomes a curve in your data space, a non-linear boundary that separates your data properly in the original space.

An important aspect of the kernel trick is that you don't need to compute the coordinates of the feature space explicitly, because you only need the inner products for the whole optimization behind the SVM. That's why it is possible to have feature spaces of infinite dimension (the RBF kernel).
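A small sketch (added for illustration, placeholder data) of what "only the inner products are needed" looks like in practice: the RBF kernel matrix is computed directly from pairwise distances in the original space, without ever constructing the infinite-dimensional feature coordinates.

import numpy as np

def rbf_kernel_matrix(X, gamma=1.0):
    # K[i, j] = exp(-gamma * ||x_i - x_j||^2), computed from distances only.
    sq_norms = (X ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * sq_dists)

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])  # placeholder data
K = rbf_kernel_matrix(X)
print(K)  # a 3x3 Gram matrix; this is all the SVM optimizer needs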

Hope I helped at least a bit.

Greetings
 
Thanks a lot for the answer.
"An important aspect of the kernel trick is that you don't need to compute the coordinates of the feature space explicitly, because you only need the inner products for the whole optimization behind the SVM. That's why it is possible to have feature spaces of infinite dimension (the RBF kernel)."
I think that is the key of the kernel method, because the transformation/mapping you mention above can also be done with simple basis functions (e.g. polynomial, sigmoid, etc.).
I just didn't get the difference between those two methods before.
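To make that contrast concrete, a sketch (assuming scikit-learn; the data and the degree-2 setup are just illustrative assumptions): the two routes below behave very similarly, but the first builds the basis-expanded features explicitly, while the second leaves the expansion implicit inside the kernel.

import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import PolynomialFeatures

X = np.random.randn(100, 2)                              # placeholder data
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(int)      # non-linear labels

# Route 1: explicit basis expansion, then a plain linear SVM on phi(x).
phi = PolynomialFeatures(degree=2, include_bias=True)
clf_explicit = SVC(kernel="linear").fit(phi.fit_transform(X), y)

# Route 2: the polynomial kernel performs a comparable expansion implicitly,
# so phi(x) is never materialized (this is what makes the RBF kernel's
# infinite-dimensional feature space feasible).
clf_kernel = SVC(kernel="poly", degree=2, coef0=1.0, gamma=1.0).fit(X, y)

print(clf_explicit.score(phi.transform(X), y), clf_kernel.score(X, y))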
Thanks a lot once again. Indeed it was very helpful.
 
