DOI: https://doie.org/10.65985/APER.2026549597
Authors: Vidhya Rao, Surekha Kohle
Keywords: BoW, TF-IDF, Word2Vec, LASSO, Random Forest, XGBoost, Folk art paintings
Abstract: Online marketplaces have become major platforms for the commercialization of Indian folk and traditional paintings, yet empirical evidence on how textual descriptions influence artwork pricing remains limited. This study examines how descriptive language, along with marketplace indicators such as art type, painting area, ratings, and reviews, shapes prices of online folk art listings. We introduce the Indian Painting Ecommerce Metadata (IPEM) dataset comprising 385 manually authenticated online listings of Indian paintings, including textual descriptions, prices, physical dimensions, art form categories, and market signals. Manual verification excluded counterfeit and replica artworks, ensuring dataset reliability. Machine learning–based text analytics are applied using three representation techniques: Bag of Words (BoW), Term Frequency–Inverse Document Frequency (TF-IDF), and Word2Vec. Each representation is paired with an appropriate learning algorithm. High-dimensional and sparse BoW features are analyzed using LASSO (Least Absolute Shrinkage and Selection Operator) regression to enable feature selection and interpretability. TF-IDF representations are modeled using Random Forest, while Word2Vec embeddings are combined with XGBoost (Extreme Gradient Boosting) to exploit semantic interactions. Experimental results show that TF-IDF with Random Forest achieves the strongest predictive performance, explaining approximately 73% of the variance in log-transformed prices. The BoW + LASSO model reveals that keywords related to cultural identity, regional heritage, craftsmanship, traditional materials, and emotional aesthetics positively influence pricing, whereas decor-oriented, generic, or reproduction-related descriptors are negatively associated with value. The study provides managerial insights for sellers, marketplaces, and policymakers, emphasizing strategic text optimization to enhance visibility and pricing outcomes.
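The BoW + LASSO step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four listings and their prices are invented for illustration and are not drawn from IPEM, and the tokenizer, the helper names and the `alpha` value are all assumptions. The sketch builds a bag-of-words count matrix, log-transforms the prices, and fits an L1-penalised linear model by coordinate descent, so that uninformative words receive coefficients of exactly zero. Pure Python is used only to keep the example self-contained; in practice a library such as scikit-learn (`CountVectorizer`, `Lasso`) would be the usual choice.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on non-letters (a deliberately simple tokenizer)."""
    return "".join(c.lower() if c.isalpha() else " " for c in text).split()

def bag_of_words(docs):
    """Return (sorted vocabulary, count matrix) with one row per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([float(counts[w]) for w in vocab])
    return vocab, rows

def soft_threshold(z, gamma):
    """Soft-thresholding operator; this is what produces LASSO's exact zeros."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/2n)*||y - b0 - Xw||^2 + alpha*||w||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b0 = sum(y) / n  # unpenalised intercept
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = b0 + sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
        # Re-fit the intercept given the current weights.
        b0 = sum(y[i] - sum(X[i][k] * w[k] for k in range(p)) for i in range(n)) / n
    return b0, w

# Hypothetical listings and prices, invented for this sketch (not IPEM data).
docs = [
    "Handmade Madhubani painting on handmade paper",
    "Traditional Madhubani painting natural colours",
    "Wall decor print for living room",
    "Decor print poster",
]
prices = [4500.0, 5200.0, 600.0, 450.0]

vocab, X = bag_of_words(docs)
y = [math.log(p) for p in prices]  # the abstract models log-transformed prices

b0, w = lasso_coordinate_descent(X, y, alpha=0.1)  # alpha chosen arbitrarily
for word, coef in zip(vocab, w):
    if coef != 0.0:
        print(f"{word:12s} {coef:+.3f}")
```

Words whose coefficient survives the penalty are the ones the model treats as price-relevant; the sign gives the direction of the association with log price. With only four toy listings the output says nothing about real markets, and raising `alpha` drives more coefficients to zero.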
Type: Journal
Language: English
Publisher: ya tai jing ji bian ji bu
ISSN: 1000-6052