Zemoso provided end-to-end digital transformation services for an industry leader in data management and governance platforms for product information management (PIM). The company’s platform was recognized several times by reputed industry analysts (such as Gartner and Forrester) for its leadership in this space. Our client wanted to evolve their existing platform into a more intuitive, smart version of itself to retain their industry stronghold.
Grow, protect, and forecast e-commerce sales through improved data-driven decision-making around pricing, selection, advertising, and supply chain.
We co-built and future-proofed their industry-leading platform in a way that allowed them to be more agile and solve the data management problems of the future. This platform was ranked number one by both Gartner and Forrester. Specifically, this translates to:
Our clients love what we do:
The team aligned quickly, pivoted as needed, tested rapidly, and delivered consistent, incremental wins throughout the partnership.
We helped our client speed up development and evolution cycles, consistently delivering well-designed applications. After evaluating multiple providers, we chose Amazon Web Services (AWS) for its flexibility, its scalability, and the low developer friction of its SDKs. In partnership with their internal engineering team, we designed a multi-layered platform on a microservices architecture to keep the system modular and agile. The team used DevOps best practices for continuous integration and deployment to expand capabilities more quickly.
For each functionality, we evaluated many competing tech solutions and selected the best-suited solutions with the client’s engineering leadership.
Polymer JS was chosen to create the next-gen user interface with added controls for tiered access protocols. It simplified setting up the different application elements and the relationships between them.
Event stream processing (key to developing the deduplication functionality) was built using Apache Storm, a highly vetted solution for real-time stream processing. The team conducted a thorough evaluation of Apache Spark and Apache Storm. Storm was the better fit for this use case; Spark is a far more complex, general-purpose computation engine.
To ensure smart deduplication whenever a record is added or updated, a Kafka messaging layer transfers the data between applications. Kafka is a fast and scalable event distributor. The ‘deduplication’ and ‘match and merge’ queues were robust and processed high volumes of data with minimal downtime or data loss. Kafka also integrated seamlessly with Apache Storm for real-time streaming data analysis, which helped the client’s customers gain faster insights.
For search, the team chose Elasticsearch, a distributed, free and open search and analytics engine that powers search solutions for global giants like Microsoft, Netflix, Slack, and Uber. This text-based NoSQL search tool proved highly useful for indexing the data points needed to fulfill search parameters. The team used advanced indexing techniques such as n-grams to generate better match results. For instance, with only the first three letters as input, the search engine could match and return the product name.
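To make the three-letter example concrete, here is a minimal sketch of prefix matching with edge n-grams in plain Python. It is not the client’s Elasticsearch configuration, which the source does not describe; it only imitates the idea behind n-gram indexing. The function names, the catalog entries, and the minimum and maximum gram lengths are all assumptions chosen for illustration.

```python
from collections import defaultdict


def edge_ngrams(text, min_len=3, max_len=10):
    """Yield lowercase prefixes of each word, from min_len to max_len characters."""
    for word in text.lower().split():
        for size in range(min_len, min(len(word), max_len) + 1):
            yield word[:size]


def build_index(product_names):
    """Map every edge n-gram to the set of product names that produced it."""
    index = defaultdict(set)
    for name in product_names:
        for gram in edge_ngrams(name):
            index[gram].add(name)
    return index


def search(index, query):
    """Return, in sorted order, the product names with a word starting with the query."""
    return sorted(index.get(query.lower(), set()))


# Hypothetical catalog entries, for illustration only.
catalog = ["Wireless Keyboard", "Wired Mouse", "Keyboard Cover"]
index = build_index(catalog)

print(search(index, "key"))  # ['Keyboard Cover', 'Wireless Keyboard']
```

Because the prefixes are generated at index time, a lookup is a single dictionary access. The trade-off is a larger index in exchange for fast partial matches, and n-gram tokenization in a search engine makes the same trade.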
As different operators, suppliers, and vendors enter product data into the platform, it is important to maintain the integrity of the product information. One key functionality that enables this is deduplication. A retailer’s core functions depend on these systems accessing and displaying accurate information. We helped our MDM client ensure that the same product is not listed twice under different unique IDs.
Our team developed and deployed the deduplication algorithm on top of their newly upgraded tech stack. This algorithm showcased that the tech stack we evaluated and set up worked as intended.
The algorithm used similarity triggers on attributes such as name, brand, and color to flag potential duplicates, each with a probability index, in a database of over a million records. Every change in the system is an ‘event’. Each event is placed on a queue and goes through a ‘match and merge’ protocol. In the case of a suspect match, the event is assigned to a separate queue to be resolved manually, which keeps the records behind critical business functions accurate.
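The flow described above can be sketched in a few lines of Python. This is an illustrative approximation, not the client’s algorithm: the field weights, the two thresholds, the use of `difflib.SequenceMatcher` for string similarity, and the sample records are all assumptions. The score is a weighted similarity between 0 and 1 that stands in for the probability index.

```python
from collections import deque
from difflib import SequenceMatcher

# Hypothetical weights and thresholds; the source does not state the real values.
WEIGHTS = {"name": 0.6, "brand": 0.3, "color": 0.1}
MERGE_THRESHOLD = 0.9   # at or above: treated as a confident duplicate
REVIEW_THRESHOLD = 0.7  # between the thresholds: suspect match, needs a human


def field_similarity(a, b):
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def duplicate_score(record, candidate):
    """Weighted similarity across the compared fields."""
    return sum(
        weight * field_similarity(record[field], candidate[field])
        for field, weight in WEIGHTS.items()
    )


def route_event(record, existing, merge_queue, review_queue):
    """Score a new or updated record against existing ones and queue it."""
    best = max(existing, key=lambda c: duplicate_score(record, c), default=None)
    if best is None:
        return "new"
    score = duplicate_score(record, best)
    if score >= MERGE_THRESHOLD:
        merge_queue.append((record, best, score))
        return "merge"
    if score >= REVIEW_THRESHOLD:
        review_queue.append((record, best, score))
        return "review"
    return "new"


# Hypothetical records, for illustration only.
existing = [
    {"name": "Trail Runner 2", "brand": "Acme", "color": "Black"},
    {"name": "Steel Water Bottle", "brand": "Hydra", "color": "Silver"},
]
events = [
    {"name": "trail runner 2", "brand": "ACME", "color": "black"},
    {"name": "Trail Runner II", "brand": "Acme", "color": "Navy"},
    {"name": "Ceramic Coffee Mug", "brand": "Brewly", "color": "White"},
]

merge_queue, review_queue = deque(), deque()
results = [route_event(e, existing, merge_queue, review_queue) for e in events]
print(results)  # ['merge', 'review', 'new']
```

Comparing every event against every existing record, as this sketch does, would not scale to a million records. A production system would first narrow the candidates, for example with a search index, before scoring them.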
We co-created dashboards that make insights from the master data easier for the business to access and use. These were designed to help the client’s customers visualize data and create intelligent reports for better analysis. The dashboards covered areas such as change trends, update summaries, workflow SLAs, and governance summaries.
The team used Kibana, a browser-based analytics and search dashboard for Elasticsearch. This enabled quicker analysis and faster compilation of the data, and users could easily share the resulting reports with stakeholders. Kibana’s machine learning features also helped surface insights in the Elasticsearch data.
P.S. Since we work on early-stage products, many of them in stealth mode, we have strict non-disclosure agreements (NDAs). The data, insights, and capabilities discussed in this blog have been anonymized to protect our client’s identity and don’t include any proprietary information.
©2024 Zemoso Technologies. All rights reserved.