IJFANS International Journal of Food and Nutritional Sciences

Dimensionality Reduction for Brazilian Business Descriptions

Venkateswarlu B1*, Dr Somasekhar Donthu2
doi: 10.48047/IJFANS/V11/ISS10/421

Abstract

This work applies dimensionality reduction to a dataset of business descriptions of Brazilian enterprises classified by economic activity. The aim is to reduce the size of the data matrix without sacrificing significant information.

Dataset overview: The dataset contains 1080 documents, each a free-text business description of a Brazilian enterprise. The descriptions are divided into nine categories based on the National Classification of Economic Activities (CNAE). Prepositions were removed, words were reduced to their canonical forms, and each document was represented as a vector of word frequencies.

Data sparsity: Zeros occupy 99.22% of the matrix, so the dataset is extremely sparse. Most terms do not appear in most documents, which gives rise to a high-dimensionality problem.

Dimensionality reduction: Dimensionality reduction addresses this problem by reducing the number of variables, or features. It is divided into two categories: feature extraction and feature selection.

Feature engineering and extraction: Feature extraction converts raw data into features that can be used in modelling. Feature transformation modifies the data to improve the accuracy of the algorithms. Feature selection eliminates superfluous features.

Primary goal: The main objective is to reduce the dimensionality of the data matrix while preserving the crucial information, that is, to remove features (terms) while keeping as much of the important data as possible.

Vector space model: The texts in the database are represented using a vector space model, in which every term becomes a dimension.

Term weighting: Term-weighting approaches are used to identify the terms with the highest discriminative power. Keeping the most relevant terms and removing the less discriminative ones improves the dimensionality reduction.
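
The vector space representation, the sparsity measure, and term selection by weight can be illustrated with a minimal sketch. The abstract does not name the weighting scheme, so TF-IDF is used here purely as an assumption. The toy documents and the function names (`term_frequency_matrix`, `sparsity`, `tfidf_scores`, `select_top_terms`) are hypothetical and are not taken from the study.

```python
import math
from collections import Counter

# Toy stand-ins for preprocessed business descriptions (hypothetical data,
# not drawn from the CNAE dataset described in the abstract).
docs = [
    "retail sale food beverage",
    "wholesale sale food product",
    "software development consulting service",
    "software maintenance service",
]

def term_frequency_matrix(texts):
    """Vector space model: one row per document, one column per term."""
    tokenized = [t.split() for t in texts]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: j for j, w in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in tokenized]
    for i, toks in enumerate(tokenized):
        for w, count in Counter(toks).items():
            matrix[i][index[w]] = count
    return vocab, matrix

def sparsity(matrix):
    """Fraction of entries in the matrix that are zero."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for v in row if v == 0)
    return zeros / total

def tfidf_scores(vocab, matrix):
    """Score each term by its TF-IDF weight summed over all documents.

    Assumption: idf = log(N / df). A term present in every document
    therefore scores 0, i.e. it has no discriminative power.
    """
    n_docs = len(matrix)
    scores = {}
    for j, term in enumerate(vocab):
        df = sum(1 for row in matrix if row[j] > 0)
        idf = math.log(n_docs / df)
        scores[term] = sum(row[j] * idf for row in matrix)
    return scores

def select_top_terms(vocab, matrix, k):
    """Feature selection: keep the k highest-scoring terms (ties broken alphabetically)."""
    scores = tfidf_scores(vocab, matrix)
    kept = sorted(sorted(vocab, key=lambda t: (-scores[t], t))[:k])
    cols = [vocab.index(t) for t in kept]
    reduced = [[row[j] for j in cols] for row in matrix]
    return kept, reduced

vocab, matrix = term_frequency_matrix(docs)
kept, reduced = select_top_terms(vocab, matrix, k=5)
print(f"terms: {len(vocab)} -> {len(kept)}, sparsity: {sparsity(matrix):.2%}")
```

Summed TF-IDF is only one possible selection criterion. With class labels available, as with the nine CNAE categories, a supervised score could be substituted inside `tfidf_scores` without changing the rest of the sketch.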

Issue

Volume 11, Issue 10 (2022)
