


Basic Concepts and Techniques







Basic Concepts in Data Classification

Data classification is the process of organizing data into categories so that it can be used effectively. Classifying data makes it easier to locate and retrieve. It also reduces duplication, which lowers storage and backup costs. The main types of data classification are content-based, context-based and user-based (Ghaddar & Naoum-Sawaya, 2018). Content-based classification inspects the data itself for sensitive information. Context-based classification looks for indirect indicators of sensitive information, such as the application, location or creator of the data. User-based classification relies on a person manually assigning a category to each item.
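The difference between content-based and context-based classification can be illustrated with a minimal Python sketch. The patterns, folder names and function names below are hypothetical and chosen only for illustration; a real classification policy would be much broader.

```python
import re

# Hypothetical patterns for "sensitive" content (assumption, not from the source).
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),       # looks like a US Social Security number
    re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),  # looks like a 16-digit card number
]

# Hypothetical indirect indicators: folders whose files are treated as sensitive.
SENSITIVE_FOLDERS = {"hr", "finance", "legal"}


def classify_by_content(text: str) -> str:
    """Content-based: inspect the data itself for sensitive information."""
    if any(pattern.search(text) for pattern in SENSITIVE_PATTERNS):
        return "sensitive"
    return "general"


def classify_by_context(path: str) -> str:
    """Context-based: use an indirect indicator, here the folder a file is stored in."""
    folders = {part.lower() for part in path.split("/")[:-1]}
    if folders & SENSITIVE_FOLDERS:
        return "sensitive"
    return "general"
```

For example, classify_by_content("SSN: 123-45-6789") labels the text from what it contains, while classify_by_context("shared/HR/review.txt") labels the file from where it is stored without reading it.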

General framework for classification

Data classification consists of grouping data according to its relevance. Data is classified on the basis of the content it carries and the knowledge it involves. A data classification framework is a necessity because it provides the structure for this grouping. Such a framework is significant to enterprise organisations that benefit from big data.

What is a decision tree and decision tree modifier?

A decision tree is a supervised machine learning model in which the data is split according to specific parameters. It consists of nodes, edges (branches) and leaf nodes. Each internal node tests the value of a specific attribute. Each branch corresponds to an outcome of that test. Each leaf node predicts the outcome. A decision tree modifier, on the other hand, is the discriminator class that separates the training set so that each portion consists entirely of one class.
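The structure described above can be sketched in plain Python. This is an illustrative toy under stated assumptions, not a full learning algorithm: the class names, the attribute names and the hand-built tree are all hypothetical, and the tree is written by hand rather than learned from data.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    prediction: str  # the class label predicted at this leaf


@dataclass
class Node:
    attribute: str  # the attribute whose value this node tests
    # One branch per test outcome, leading to a subtree or a leaf.
    branches: Dict[str, Union["Node", Leaf]] = field(default_factory=dict)


def predict(tree, record):
    """Follow the branch matching each tested attribute until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.branches[record[tree.attribute]]
    return tree.prediction


def split(records, attribute):
    """Partition the training records by their value for one attribute."""
    groups = {}
    for record in records:
        groups.setdefault(record[attribute], []).append(record)
    return groups


def is_pure(records, label="label"):
    """True when every record in the portion belongs to a single class."""
    return len({record[label] for record in records}) <= 1


# Hypothetical hand-built tree: test "outlook" first, then "windy".
toy_tree = Node("outlook", {
    "sunny": Node("windy", {"yes": Leaf("stay in"), "no": Leaf("play")}),
    "rainy": Leaf("stay in"),
})
```

Calling predict(toy_tree, {"outlook": "sunny", "windy": "no"}) walks two nodes and one leaf. The split and is_pure helpers show the goal of separating the training set: a split is ideal when every resulting portion is pure.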

What is a hyperparameter?

A hyperparameter is a configuration external to the model whose value cannot be estimated from the data. Hyperparameters are specified by the practitioner and are mostly used to help estimate the model parameters. They are adjustable settings that are tuned to obtain a model with optimal performance (Wu et al., 2019).

Hyperparameter optimization is a challenge, especially when selecting a set of optimal hyperparameters. Hyperparameters are used to regulate the learning process.
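As a minimal illustration, the sketch below tunes one hyperparameter, the number of neighbours k in a one-dimensional nearest-neighbour classifier, with a plain grid search on a validation set. Grid search is used here only because it is simple; it is not the Bayesian optimization discussed by Wu et al. (2019). The function names and the toy data are hypothetical.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, validation, k):
    """Fraction of validation points labelled correctly for a given k."""
    correct = sum(knn_predict(train, x, k) == y for x, y in validation)
    return correct / len(validation)


def grid_search(train, validation, candidate_ks):
    """Try every candidate k and return the best one with all scores."""
    scores = {k: accuracy(train, validation, k) for k in candidate_ks}
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Hypothetical toy data: (feature, label); the point (1.3, "high") acts as noise.
train = [(1.0, "low"), (1.3, "high"), (1.5, "low"), (2.0, "low"),
         (4.0, "high"), (4.5, "high"), (5.0, "high")]
validation = [(1.2, "low"), (4.8, "high")]

best_k, scores = grid_search(train, validation, [1, 3, 7])
```

Here k is set by the practitioner before any prediction is made and cannot be read off the training data directly, which is what distinguishes it from a model parameter. A very small k follows the noisy point, a very large k ignores local structure, and the search picks a value in between.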

Model evaluation is a significant part of the model development process. It helps in finding the model that best represents the data and in estimating how well the selected model will perform. Common cross-validation pitfalls when choosing and assessing a model include using the same data both to select a model and to estimate its performance, selecting variables outside the cross-validation loop, and relying on a single cross-validation run.
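A minimal sketch of k-fold cross-validation in plain Python is given below. It assumes the data is already in random order (no shuffling is done), and the names k_fold_indices, cross_validate, fit and score are hypothetical placeholders for whatever model and metric are being assessed.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample lands in exactly one test fold."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_validate(data, k, fit, score):
    """Fit on k-1 folds, score on the held-out fold, and average the k scores."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(data), k):
        model = fit([data[i] for i in train_idx])
        scores.append(score(model, [data[i] for i in test_idx]))
    return sum(scores) / len(scores)
```

Any step that looks at the labels, such as variable selection or hyperparameter tuning, would have to happen inside fit so that it only sees the training folds. Repeating the procedure with different fold assignments gives a sense of how much a single run can vary.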


References

Ghaddar, B., & Naoum-Sawaya, J. (2018). High dimensional data classification and feature selection using support vector machines. European Journal of Operational Research, 265(3), 993-1004.

Wu, J., Chen, X. Y., Zhang, H., Xiong, L. D., Lei, H., & Deng, S. H. (2019). Hyperparameter optimization for machine learning models based on Bayesian optimization. Journal of Electronic Science and Technology, 17(1), 26-40.
