Abstract:
E-commerce platforms have made shopping convenient for people across the world. They have also created a genuinely open, highly competitive market in which large companies worldwide compete for the same global consumer base. To win this competition, e-commerce companies need a strong strategy backed by advanced technologies. One such strategy is to understand consumer behavior in order to improve the user experience of the e-commerce site. Consumer behavior can be observed through the log files associated with the website, and the most valuable of these for this purpose is the access log. The access log of an e-commerce website records the movement of consumers through the site. This movement carries valuable information about consumers' needs, choices, and purchase history, which can be extracted to influence purchases through appropriate ads, offers, and discounts. Moreover, this information correlates with various business parameters, which helps in making strong predictions for business intelligence. For these insights to be meaningful, the log data should be as large as possible, which makes big data analytics well suited to this application. In this project, Hadoop, a framework for distributed storage and processing on a cluster, is used for big data analysis. Without supporting tools such as Pig and Hive, however, Hadoop applications are difficult to access. A unique and convenient system with a command-line interface is presented to solve this problem; with it, one can easily derive consumer behavior from web access logs. The system is also evaluated explicitly to show that it is efficient, scalable, and big data compatible, and to give a complete picture of how such a system can be designed.
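
To make concrete what is meant by extracting consumer movement from an access log, the following is a minimal illustrative sketch in Python and not the system presented in this work. It assumes that log lines follow the Apache Common Log Format, which the abstract does not specify. The names `LOG_PATTERN`, `parse_line`, `page_views_per_visitor`, and the sample lines are hypothetical and introduced only for illustration.

```python
import re
from collections import Counter

# Assumption: lines follow the Apache Common Log Format, e.g.
#   host ident user [timestamp] "METHOD path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def page_views_per_visitor(lines):
    """Count successful (status 200) requests per (host, path) pair."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None and record["status"] == "200":
            counts[(record["host"], record["path"])] += 1
    return counts


# Hypothetical sample lines, made up for illustration only.
sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /product/42 HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Oct/2023:13:56:01 +0000] "GET /product/42 HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Oct/2023:13:57:12 +0000] "GET /cart HTTP/1.1" 200 512',
    '198.51.100.7 - - [10/Oct/2023:14:02:45 +0000] "GET /missing HTTP/1.1" 404 0',
    'not a log line',
]

views = page_views_per_visitor(sample)
for (host, path), n in sorted(views.items()):
    print(host, path, n)
```

On a Hadoop cluster, the same per-line parsing would correspond to a map step and the per-key counting to a reduce step; the sketch runs on a single machine purely to show the kind of information an access log carries.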