Posts

Showing posts from 2011

SharePoint Guru Blog Post Back Online...

The last 8 months have been really hectic with delivery commitments, which kept me away from my blog. Taking some time now, I am thinking of reviving my blog posts, which were offline for quite some time because my membership with my hosting provider expired. Until I can make proper arrangements for that, I plan to post all my old blog posts here.

IT Next Generation Strategies Blog Post Launched

Last week, during a lunch meeting, I was talking to Greg (IT director of a manufacturing major) about different IT management strategies in the manufacturing sector. Gradually Greg started explaining his challenges in meeting the business expectation of maintaining optimal IT service while dealing with an outdated technical landscape. One of his biggest concerns was the rising total cost of ownership, which is compounded by rapidly changing IT technologies. Frankly, this is not just Greg's challenge; it is the challenge of every IT manager.

Discussing this topic in the evening with my colleague, we decided to launch a blog in this area so that we can share our experiences and ideas with like-minded others in the industry. This gave birth to IT Next Gen, a blog on the future of IT and on strategies that help technical managers stay flexible in the face of rapid change while meeting business expectations.

Sampling strategies for Imbalanced Learning

As discussed in my previous post, imbalanced data poses serious challenges in machine learning. One approach to combating this imbalance is to alter the training set so as to create a more balanced class distribution, so that the resulting sampled data set can be used with traditional data-mining algorithms. This can be achieved through:

Under-sampling, where the size of the majority class is reduced using techniques such as reducing redundancy or removing boundary candidates.

Over-sampling, where the size of the minority class is increased by adding more candidates to augment the data set.

A hybrid approach, where both over-sampling of the minority class and under-sampling of the majority class are attempted.

Each of these techniques is discussed below.

Random Over Sampling

In random over-sampling, the minority class instances are duplicated in the data set until a more balanced distribution is reached. As an illustration, consider a data s
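
The random over-sampling idea can be sketched in a few lines of Python. This is only a minimal illustration, not the method from the original post: the function name `random_oversample`, the toy data, and the choice to duplicate until every class matches the largest class are all assumptions made for the example (the text only asks for "a more balanced distribution").

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class.

    Hypothetical helper written for illustration only.
    """
    rng = random.Random(seed)          # fixed seed for repeatable runs
    counts = Counter(labels)
    target = max(counts.values())      # assumed goal: full balance

    out_samples = list(samples)
    out_labels = list(labels)
    for cls, n in counts.items():
        # Original rows belonging to this class; duplicates are drawn from here.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Hypothetical toy data: 8 majority rows (label 0) and 2 minority rows (label 1).
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_res, y_res = random_oversample(X, y)
balanced_counts = Counter(y_res)
print(balanced_counts)
```

Because the added rows are exact copies of existing minority rows, no new information enters the data set; only the class proportions seen by the learning algorithm change.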