
Effective Means of Handling Curse of Dimensionality

Abstract:

Increasing the number of dimensions in a data set degrades the performance of machine learning systems, because each added dimension enlarges the problem space under analysis and makes the data sparse. Since the effectiveness of machine learning algorithms depends directly on the volume of data available for learning, a larger space demands more data to learn well. To address this challenge, we usually reduce the dimensionality by looking for dimensions that are not directly related to the problem under analysis. To reduce dimensions efficiently, we need to answer the question: "What is the ideal dimensionality we can work with without compromising on the sensitivity of the dimensions?" This paper outlines the problem of dimensionality not just from the angle of the issues with high-dimensional data that lead to dimension reduction, but also analyses how to efficiently balance the dimensions through better data projection techniques for more accurate results.
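To make the sparsity argument concrete, here is a minimal sketch in Python. It assumes data spread uniformly over a unit hypercube, which is an illustrative simplification and not something the paper itself specifies. The helper names `edge_length_for_fraction` and `cells_in_grid` are hypothetical and chosen only for this example.

```python
def edge_length_for_fraction(fraction: float, dims: int) -> float:
    """Edge length of a sub-cube holding `fraction` of the unit hypercube's volume.

    Assumes uniformly distributed data, so volume fraction equals data fraction.
    """
    return fraction ** (1.0 / dims)


def cells_in_grid(bins_per_axis: int, dims: int) -> int:
    """Number of cells when every axis is split into `bins_per_axis` bins."""
    return bins_per_axis ** dims


if __name__ == "__main__":
    # To capture just 1% of the data, how much of each axis must a neighbourhood span?
    for dims in (1, 2, 10, 100):
        edge = edge_length_for_fraction(0.01, dims)
        cells = cells_in_grid(10, dims)
        print(f"dims={dims:>3}  edge={edge:.3f}  grid cells={cells:.3e}")
```

As the dimension count grows, a "local" neighbourhood has to span almost the full range of every axis, and the number of grid cells that would need samples grows exponentially. This is one way to read the claim that a larger space demands more data.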

Awaiting session recording. Will post it soon.
