Discover how our expertise in data strategy, architecture, and implementation delivers measurable results for our clients.
Bol.com improves its product every day by measuring user behavior and identifying opportunities for improvement. They were using an expensive external tool to track and analyze user behavior. The tool worked, but not perfectly. They wanted to switch to an in-house tool based on Apache Kafka. It offered more possibilities, but it was much more complex to use. Our role? Make it easier.
The internal tool had much more information than the external one, but more is not always better. Too much detail made it hard to see the big picture. The goal was to simplify the tool so that analyses could be faster and easier.
We focused on two main areas: user behavior and financial results. Together with internal stakeholders, we designed a modular structure and built a proof of concept to analyze A/B tests thoroughly.
We broke the huge amount of data into different modules using dbt and BigQuery. We created a clear data flow, making A/B tests faster, more consistent, and more accurate. We introduced data tests and streamlined the entire process. Dashboards were built to work on this clean, organized data. The biggest challenge was the sheer volume: 100 TB per analysis was not unusual.
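To make the idea of a modular flow with data tests concrete, here is a minimal sketch in plain Python. It is not the dbt and BigQuery implementation used in the project: the event fields (`user_id`, `variant`, `event`), the function names, and the sample rows are all assumptions made for illustration. One function plays the role of a data test (no user may appear in more than one variant), and another aggregates clean events into per-variant conversion rates.

```python
from collections import defaultdict


def check_single_variant(events):
    """Data test: return the user ids that appear in more than one variant."""
    seen = defaultdict(set)
    for e in events:
        seen[e["user_id"]].add(e["variant"])
    return sorted(user for user, variants in seen.items() if len(variants) > 1)


def conversion_by_variant(events):
    """Aggregate event-level rows into users, converters, and rate per variant."""
    users = defaultdict(set)
    converters = defaultdict(set)
    for e in events:
        users[e["variant"]].add(e["user_id"])
        if e["event"] == "purchase":
            converters[e["variant"]].add(e["user_id"])
    return {
        variant: {
            "users": len(users[variant]),
            "converters": len(converters[variant]),
            "rate": len(converters[variant]) / len(users[variant]),
        }
        for variant in users
    }


# Hypothetical sample rows, invented for this sketch.
EVENTS = [
    {"user_id": "u1", "variant": "A", "event": "view"},
    {"user_id": "u1", "variant": "A", "event": "purchase"},
    {"user_id": "u2", "variant": "A", "event": "view"},
    {"user_id": "u3", "variant": "B", "event": "view"},
    {"user_id": "u3", "variant": "B", "event": "purchase"},
    {"user_id": "u4", "variant": "B", "event": "purchase"},
]

if __name__ == "__main__":
    violations = check_single_variant(EVENTS)
    if violations:
        raise ValueError(f"users in multiple variants: {violations}")
    for variant, stats in sorted(conversion_by_variant(EVENTS).items()):
        print(variant, stats)
```

Running the test before the aggregation mirrors the order in a tested pipeline: downstream modules and dashboards only see data that has passed the checks.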
The system was built for a production environment, where it runs permanently and automatically without any manual data input (very handy with 100 TB!). We organized internal knowledge-sharing sessions and discussed successes and improvements with business owners, data owners, analysts, and developers.
The complexity of analyses was reduced by a factor of ten. Analyses are now faster and more consistent, and the external tool is no longer needed (saving costs). Business users can explore data and generate insights themselves.
But Bol.com is always moving forward. Engineers quickly replaced the in-house Kafka solution with RudderStack, proving that data work is never really finished.
Every organization has data. Somewhere in your data could be a breakthrough innovation. This case showed how we analyzed user behavior on an e-commerce platform. Do you have questions or want insights into your sales process? With data-driven insights, you can do more. Contact us today.
ING is committed to equal opportunities and equal rights. They wanted to better measure and address the gender pay gap (GPG). They already had a proof of concept and an annual script-based tool, but it was neither flexible nor always reliable. The goal was to develop a robust, user-friendly tool that provides faster and more accurate insights.
Together with the business owner, we identified the value of investing in understanding and reducing the gender pay gap. We reviewed the existing tool and found areas for improvement, such as creating a direct database connection, validating and cleaning data, and improving statistical analysis to generate accurate and actionable insights.
Working with the HR Analytics team, we identified use cases and designed a modular system with clear components and strong analytical capabilities. We explored new statistical methods to improve analysis and planned for future needs.
We rewrote the script as an object-oriented program, applied best practices such as logging, and documented the entire process. This made the tool reliable and easy to use, and it added functionality for hypothesis testing. HR Analytics could now run simulations to determine the best ways to reduce the gender pay gap.
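As an illustration of what an object-oriented analysis with logging and hypothesis testing can look like, here is a minimal Python sketch. It is not ING's tool: the class name, method names, and salary figures are hypothetical, and the permutation test on the raw difference in means is a technique chosen for this example, not necessarily the statistical method used in the project.

```python
import logging
import random
from statistics import mean

logger = logging.getLogger("pay_gap")


class PayGapAnalysis:
    """Compares mean pay between two groups (hypothetical illustration)."""

    def __init__(self, salaries_a, salaries_b, seed=0):
        self.a = list(salaries_a)
        self.b = list(salaries_b)
        self.rng = random.Random(seed)  # fixed seed keeps runs reproducible

    def observed_gap(self):
        """Raw (unadjusted) difference in mean salary between the groups."""
        return mean(self.a) - mean(self.b)

    def permutation_p_value(self, n_permutations=2000):
        """Two-sided permutation test of the null hypothesis 'no gap'."""
        observed = abs(self.observed_gap())
        pooled = self.a + self.b
        n_a = len(self.a)
        extreme = 0
        for _ in range(n_permutations):
            # Reassign group labels at random and recompute the gap.
            self.rng.shuffle(pooled)
            gap = mean(pooled[:n_a]) - mean(pooled[n_a:])
            if abs(gap) >= observed:
                extreme += 1
        # Add one to numerator and denominator so the estimate is never zero.
        p_value = (extreme + 1) / (n_permutations + 1)
        logger.info("gap=%.2f p=%.4f n=%d", self.observed_gap(), p_value, n_permutations)
        return p_value


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Invented salaries (in thousands), for demonstration only.
    analysis = PayGapAnalysis([50, 51, 52, 53, 54, 55], [40, 41, 42, 43, 44, 45])
    print(analysis.observed_gap(), analysis.permutation_p_value())
```

A real analysis would typically also control for factors such as role, seniority, and location before interpreting any gap; this sketch deliberately leaves that out.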
The new system supports command-line options for hypothesis testing and includes Power BI dashboards. Its modular, object-oriented design makes it easy to adjust, maintain, and run analyses, with clear knowledge transfer and governance for the future.
The new tool reduced analysis time from weeks to just a few minutes, allowing gender pay differences to be identified faster and more accurately. Analyses are more flexible and consistent, making the tool easier to use. This supports ING in actively promoting gender equality and transparency. Learn more in the gender pay gap analysis, and in the ING annual reports.
This project aligns with our mission “Data for a Better Future” by promoting gender equality at work. It contributes to a fairer and more transparent workplace, which is an important part of broader societal progress and sustainable business practices.
Every organization has data. Your data may also hold breakthrough insights. In this case, we used data science and statistical analysis on HR data in a multinational. Do you have a question or want insights into your HR policies? With data-driven insights, you can do more. Contact us today.
ASML builds extremely advanced lithography machines called scanners. These machines print chip patterns onto wafers, the core product in semiconductor manufacturing. ASML wanted better insight into the performance of these machines to support a new type of service contract.
The idea is simple: higher machine performance means more wafers produced, which creates more value for the customer. With this model, ASML aligns the interests of both the company and the customer.
But what exactly is performance? It is the connection between machine uptime and machine productivity. In the past, this connection was made manually by a domain expert with more than 20 years of experience. To understand performance across the entire fleet of machines, this process needed to be automated.
Together with the domain expert, we defined what really determines machine performance. We looked at how uptime and productivity are related and which data sources describe them. We also studied how the expert performed the analysis: what information was needed, how much time it took, and how many analyses could be done each year. This gave us a clear understanding of the complexity and the business value before starting the implementation.
We designed how different data sources could be combined into one coherent system. A single machine can generate hundreds of gigabytes of data each year, so it is essential to find the important signals in a very large amount of data. We developed modular and scalable data products that transform raw event logs into useful datasets. These datasets make it possible to analyze and improve scanner performance.
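The step from raw event logs to a dataset that links uptime and productivity can be sketched as follows. This is a plain-Python illustration, not the Azure Databricks and Spark code used in the project. The event schema (`machine_id`, `state`, `start`, `end`, `wafers`), the state labels, and the sample numbers are all assumptions invented for this example.

```python
from collections import defaultdict
from datetime import datetime


def summarize_performance(events):
    """Turn raw state events into per-machine uptime and productivity figures."""
    totals = defaultdict(lambda: {"up_h": 0.0, "total_h": 0.0, "wafers": 0})
    for e in events:
        start = datetime.fromisoformat(e["start"])
        end = datetime.fromisoformat(e["end"])
        hours = (end - start).total_seconds() / 3600
        t = totals[e["machine_id"]]
        t["total_h"] += hours
        if e["state"] == "UP":
            t["up_h"] += hours
            t["wafers"] += e["wafers"]
    return {
        machine: {
            # Share of logged time in which the machine was available.
            "uptime": t["up_h"] / t["total_h"],
            # Productivity while the machine was up.
            "wafers_per_up_hour": t["wafers"] / t["up_h"] if t["up_h"] else 0.0,
            # Overall performance: uptime multiplied by productivity.
            "wafers_per_hour": t["wafers"] / t["total_h"],
        }
        for machine, t in totals.items()
    }


# Hypothetical event log for one machine over one day.
EVENT_LOG = [
    {"machine_id": "S1", "state": "UP", "start": "2024-01-01T00:00:00",
     "end": "2024-01-01T18:00:00", "wafers": 3600},
    {"machine_id": "S1", "state": "DOWN", "start": "2024-01-01T18:00:00",
     "end": "2024-01-02T00:00:00", "wafers": 0},
]

if __name__ == "__main__":
    print(summarize_performance(EVENT_LOG))
```

Keeping uptime and productivity as separate columns makes it possible to see whether a drop in overall output comes from more downtime or from lower throughput while running.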
In the first version, a proof of concept, we already showed clear impact: analysis time was reduced from several weeks to a few days. We then expanded this into a minimum viable product (MVP). For the first time, ASML could see performance across the entire fleet of machines. This was presented several times to senior leadership (VP and EVP level).
We also added the ability to explain why performance decreases, enabling root cause analysis and performance optimization. This helped engineering teams and leaders quickly identify and solve productivity issues. We called the tool “Where Are My Wafers”, because it shows exactly that: how well the machines perform and where improvements are possible. It also made fleet-wide comparisons possible, for example, understanding why some customers achieve better performance than others.
The system was built using Azure Databricks and Apache Spark. We created trusted datasets, an ASML concept for shared and reliable data across the organization. This supports strong data governance and security. The system runs automatically and is designed to be scalable and flexible, with clear knowledge transfer and governance to support long-term use.
The project reduced the time needed to identify performance improvements at customer sites from six weeks to less than one day. This frees highly skilled experts to focus on other important work and supports the new business model based on productivity-driven service contracts.
The estimated additional annual revenue from this contract model is €150 million, demonstrating a major financial impact. At the same time, optimizing scanner performance supports technological progress and innovation.
This project aligns with our mission “Data for a Better Future” by improving efficiency and productivity in semiconductor manufacturing, an industry that is essential for technological and societal progress.
Every organization has data. Somewhere in your data there may also be a breakthrough insight.
In this case, we created fleet-wide insights from data coming from many machines and data sources. Do you run a production factory? Or do your products send data back to a central system?
With the right bird’s-eye view of your data, you can discover new opportunities and improvements.
Contact us to learn more.