I. Formula decomposition
Formula decomposition takes a given indicator and uses a formula to break it down, layer by layer, into the factors that influence it.
For example, to analyze why a product's sales are low, decompose sales into their component factors with the formula method.
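As a rough illustration of the idea, the sketch below assumes one possible decomposition, sales = visitors x conversion rate x average order value. That formula, the function names, and the figures are all hypothetical and are not taken from the original article; the point is only that comparing each factor across two periods shows which one moved.

```python
def decompose_sales(visitors, conversion_rate, avg_order_value):
    """Assumed formula: sales = visitors x conversion rate x average order value."""
    return visitors * conversion_rate * avg_order_value


def factor_changes(before, after):
    """Relative change of each factor between two periods."""
    return {name: after[name] / before[name] - 1 for name in before}


# Hypothetical figures for two months.
last_month = {"visitors": 10000, "conversion_rate": 0.05, "avg_order_value": 40.0}
this_month = {"visitors": 10000, "conversion_rate": 0.04, "avg_order_value": 40.0}

print("sales last month:", decompose_sales(**last_month))
print("sales this month:", decompose_sales(**this_month))
for name, change in factor_changes(last_month, this_month).items():
    print(f"{name}: {change:+.1%}")
```

In this made-up data only the conversion rate changed, so the next layer of decomposition would look at what drives conversion.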
A variety of common data analysis methods"
II. Comparative analysis
The comparison method compares two or more groups of data and is the most commonly used method.
Isolated data is meaningless; differences only show up through comparison. Examples include year-on-year and period-over-period changes, growth rates and fixed-base ratios along the time dimension, as well as comparisons with competitors, between categories, and between characteristics and attributes. The comparison method reveals patterns in how data changes, is used frequently, and is often combined with other methods.
In the chart below comparing the sales of companies A and B, company A's sales rose overall and stayed above company B's, but company B grew faster than company A, and even though its growth rate slowed in the later period, its sales eventually caught up.
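The time-dimension comparisons mentioned above can be sketched in a few lines. The helper names and the sales figures here are hypothetical; `lag=1` gives period-over-period growth, and a lag equal to the number of periods in a year (for example 12 for monthly data) gives year-on-year growth.

```python
def growth_rates(series, lag=1):
    """Growth of each value relative to the value `lag` periods earlier."""
    return [series[i] / series[i - lag] - 1 for i in range(lag, len(series))]


def fixed_base_ratio(series, base_index=0):
    """Each value expressed as a multiple of one fixed base period."""
    return [value / series[base_index] for value in series]


# Hypothetical sales for two companies over four periods.
company_a = [100, 110, 120, 130]
company_b = [50, 75, 105, 130]

print("A growth:", [f"{g:.0%}" for g in growth_rates(company_a)])
print("B growth:", [f"{g:.0%}" for g in growth_rates(company_b)])
print("B vs. first period:", fixed_base_ratio(company_b))
```

With these invented numbers, B starts lower but grows faster in every period and reaches A's level by the last one, which is the kind of pattern absolute sales alone would hide.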
III. A/B testing
The A/B testing process is as follows.
(1) Analyze the current situation and form a hypothesis: analyze the business data, identify the most critical point for improvement, form a hypothesis about how to optimize it, and propose optimization suggestions. For example, if the user conversion rate is low, we might hypothesize that the promotional landing page converts too poorly and then look for ways to improve it.
(2) Set goals and make a plan: set a primary goal to measure how well each optimized version performs, and secondary goals to evaluate the impact of the optimized version on other aspects.
(3) Design and development: produce design prototypes of two or more optimized versions and complete the technical implementation.
(4) Allocate traffic: determine the traffic split for each version under test. At the start, the optimized version can receive a small share of traffic, which is then increased gradually as the situation allows.
(5) Collect and analyze data: collect the experimental data and judge validity and effect. If statistical significance reaches 95% or more and holds for a period of time, the experiment can end; if it is below 95%, the test may need to run longer; if it fails to reach 95%, or even 90%, for a long time, decide whether to stop the experiment.
(6) Act on the results: depending on the test results, release the new version, adjust the traffic split and keep testing, or, if the expected effect was not achieved, continue to optimize and iterate on the plan and redevelop it for another online test.
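The article does not say which statistical test sits behind the 95% threshold in step (5). One common choice for comparing two conversion rates is a two-sided two-proportion z-test, where "95% significance" is read as a p-value below 0.05. The sketch below makes that assumption; the function name and the visitor counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.

    conv_*: number of converted users; n_*: number of users exposed.
    Returns (z, p_value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both versions convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: original page (A) vs. optimized page (B).
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}, significant at 95%: {p < 0.05}")
```

A single significant reading is not enough on its own; as step (5) notes, the result should hold for a period of time before the experiment is ended.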
The flow chart is as follows.
IV. Quadrant analysis
Quadrant analysis divides data along two or more dimensions and uses coordinates to express the values of interest. The values translate directly into strategy, which then drives concrete action. The quadrant method is a strategy-driven way of thinking that is often used in product analysis, market analysis, customer management, and commodity management. For example, the figure below shows a four-quadrant distribution of ad clicks, with the X-axis running from low to high, left to right, and the Y-axis from low to high, bottom to top.
An ad with both a high click-through rate and a high conversion rate reaches a well-targeted audience and is a highly efficient ad. An ad with a high click-through rate but a low conversion rate draws clicks mainly because the ad itself is attractive; the low conversion rate indicates that the audience the ad targets does not match the product's actual audience. An ad with a high conversion rate but a low click-through rate targets an audience that closely matches the product's actual audience, but its content needs to be optimized to attract more clicks. Ads with both a low click-through rate and a low conversion rate can be abandoned. There is also the classic RFM model, which divides customers into eight quadrants along three dimensions: how recently they last purchased (Recency), how often they purchase (Frequency), and how much they spend (Monetary).
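The four ad cases above amount to two threshold checks. The following sketch assumes hypothetical cut-off values for "high" click-through rate (CTR) and conversion rate (CVR); in practice these would come from your own data, for example the median across all ads. The function name, labels, and ad figures are illustrative only.

```python
def classify_ad(ctr, cvr, ctr_threshold, cvr_threshold):
    """Place an ad in one of four quadrants by click-through and conversion rate."""
    high_ctr = ctr >= ctr_threshold
    high_cvr = cvr >= cvr_threshold
    if high_ctr and high_cvr:
        return "efficient ad, well-targeted audience"
    if high_ctr:
        return "attractive ad, mismatched audience"
    if high_cvr:
        return "matched audience, creative needs optimizing"
    return "candidate for abandonment"


# Hypothetical ads as (CTR, CVR) pairs and hypothetical thresholds.
ads = {
    "ad_1": (0.08, 0.05),
    "ad_2": (0.09, 0.01),
    "ad_3": (0.02, 0.06),
    "ad_4": (0.01, 0.01),
}
for name, (ctr, cvr) in ads.items():
    print(name, "->", classify_ad(ctr, cvr, ctr_threshold=0.05, cvr_threshold=0.03))
```

The RFM model follows the same pattern with three high/low splits instead of two, which is where its 2 x 2 x 2 = 8 segments come from.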
Advantages of the quadrant method:
(1) Find the common cause of a problem
Quadrant analysis groups events that share the same characteristics so that their common causes can be analyzed and summarized. In the advertising case above, for example, the events in the first quadrant can be used to distill effective promotion channels and strategies, while the third and fourth quadrants can be used to rule out ineffective promotion channels.
(2) Establish optimization strategies by group
For example, the RFM customer management model divides customers into different types according to quadrant, such as key development customers, key retention customers, general development customers, and general retention customers. Key development customers are given more resources, such as VIP service, personalized service, and additional sales. Potential customers can be sold higher-value products or offered incentives to attract them back.
V. Pareto analysis
Pareto analysis is based on Pareto's law, the classic 80/20 rule. In personal wealth, for example, it is said that 20% of the world's people hold 80% of the wealth. In data analysis, it can be understood as 20% of the data producing 80% of the effect, so the analysis should dig into that 20%. In practice the 80/20 rule is usually tied to ranking: the top 20% is treated as the effective data. The 80/20 method is about capturing the key points of an analysis and applies to any industry. Once you have found the key 20% and identified its characteristics, you can think about how to move the remaining 80% toward it to improve results.
It is generally used in product classification to measure value and build an ABC model. For example, if a retailer has 500 SKUs and the sales figure for each, deciding which SKUs matter most is a question of prioritization in business operations.
A common approach is to use the product SKU as the dimension and the corresponding sales as the base metric, rank the SKUs by sales from largest to smallest, and, for each product, calculate the cumulative sales of all SKUs up to and including it as a percentage of total sales.
If the cumulative percentage is within 70%, the SKU is classified as Class A; if it is between 70% and 90% (inclusive), Class B; if it is between 90% and 100% (inclusive), Class C. These percentages can be adjusted to fit your actual situation.
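The ranking and cumulative-share steps described above can be sketched as follows. The 70% and 90% cut-offs are the ones given in the text; the function name and the SKU sales figures are hypothetical.

```python
def abc_classify(sales_by_sku, a_cut=0.70, b_cut=0.90):
    """Assign each SKU to class A, B, or C by cumulative share of total sales."""
    total = sum(sales_by_sku.values())
    # Rank SKUs by sales, largest first.
    ranked = sorted(sales_by_sku.items(), key=lambda item: item[1], reverse=True)
    classes = {}
    cumulative = 0
    for sku, sales in ranked:
        cumulative += sales
        share = cumulative / total  # cumulative share up to and including this SKU
        if share <= a_cut:
            classes[sku] = "A"
        elif share <= b_cut:
            classes[sku] = "B"
        else:
            classes[sku] = "C"
    return classes


# Hypothetical sales per SKU.
sales = {"sku_1": 60, "sku_2": 25, "sku_3": 10, "sku_4": 5}
print(abc_classify(sales))
```

The cut-offs are parameters so they can be adjusted to the actual situation, as the text suggests.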
The ABC analysis model can be used to classify not only products and sales, but also customers and customer transactions. For example, which customers contribute 80% of the company's profit, and what share of all customers do they make up? If the answer is 20%, then with limited resources the company knows to focus on retaining that 20% of customers.
VI. Funnel analysis
The funnel method uses a funnel diagram, shaped somewhat like an inverted pyramid. It is a process-oriented way of thinking, often used to analyze processes whose numbers change from step to step, such as new user acquisition and shopping conversion.
The diagram above is a classic marketing funnel, showing the stages of the process from user acquisition to final conversion into a purchase. The conversion rate between adjacent stages is the metric used to quantify the performance of each step. The funnel model therefore first splits the whole purchase process into individual steps, then uses conversion rates to measure the performance of each step, and finally identifies the problematic steps through abnormal metrics, so that they can be fixed and optimized and the overall purchase conversion rate improved.
The core idea of the funnel model comes down to decomposition and quantification. For example, to analyze e-commerce conversion, we monitor user conversion at each tier and look for what can be optimized there. For users who do not follow the standard process, we map their conversion paths separately and shorten those paths to improve the user experience.
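The decomposition-and-quantification idea can be sketched as below. The stage names and user counts are hypothetical, and flagging the step with the lowest conversion rate is only a simple stand-in for spotting "abnormal" metrics; in practice each step would be compared against its own history or a benchmark.

```python
def funnel_conversion(steps):
    """steps: list of (stage_name, user_count) in funnel order.

    Returns the conversion rate between each pair of adjacent stages
    and the overall conversion rate from the first to the last stage.
    """
    rates = []
    for (prev_name, prev_count), (name, count) in zip(steps, steps[1:]):
        rates.append((f"{prev_name} -> {name}", count / prev_count))
    overall = steps[-1][1] / steps[0][1]
    return rates, overall


def weakest_step(rates):
    """The adjacent-stage transition with the lowest conversion rate."""
    return min(rates, key=lambda rate: rate[1])


# Hypothetical e-commerce funnel.
steps = [("visit", 1000), ("product page", 400), ("cart", 100),
         ("order", 50), ("payment", 40)]
rates, overall = funnel_conversion(steps)
for transition, rate in rates:
    print(f"{transition}: {rate:.0%}")
print(f"overall: {overall:.0%}, weakest: {weakest_step(rates)[0]}")
```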
There is also the classic growth hacking model, the AARRR model, which stands for Acquisition, Activation, Retention, Revenue, and Referral, that is, acquiring users, activating them, retaining them, earning revenue from them, and having them spread the product. It is one of the more common models in product operations: depending on the characteristics of the product and its position in the product life cycle, different metrics are emphasized and different operating strategies are developed.
The AARRR model diagram below makes it clear that the number of users decreases step by step across the user lifecycle. By breaking down and quantifying the entire lifecycle, we can compare the data horizontally and vertically, discover the corresponding problems, and keep optimizing iteratively.
VII. Path analysis
User path analysis tracks the behavioral path users take from a given start event to an end event, that is, it monitors the flow of users. It can be used to measure the effect of website optimization or marketing campaigns and to understand users' behavioral preferences. The ultimate aim is to meet business goals: guide users through the product's optimal path more efficiently and eventually prompt them to pay. How is user behavior path analysis carried out?
(1) Identify the first step users take in the website or app, then calculate the flow and conversion of each subsequent step in turn, so that the data faithfully reproduces the whole journey from opening the app to leaving it.
(2) View the distribution of users' paths through the product. For example, after visiting the home page of an e-commerce product, what percentage of users ran a search, what percentage visited a category page, and what percentage went directly to a product detail page?
(3) Perform path optimization analysis. For example, which path do users take most often, and at which step are users most likely to be lost?
(4) Identify user behavior characteristics from their paths. For example, analyze whether a user is goal-oriented, leaving as soon as the task is done, or browsing aimlessly.
(5) Segment users. Users are usually classified by their purpose in using the app. For example, users of a car app can be divided into interested, intending, and purchasing users, and the paths each type takes for different tasks can then be analyzed, such as which paths intending users follow to compare different models and what problems they run into. Another approach is to run an algorithmic cluster analysis over all user access paths, classify users by the similarity of their paths, and then analyze each category.
Take e-commerce as an example. Between logging in to the website or app and paying successfully, a buyer browses the home page, searches for products, adds items to the shopping cart, submits an order, and pays for it. In practice the path is not always this direct: after submitting an order, for example, the user may return to the home page to keep searching for products, or may cancel the order, and a different motive lies behind each path. Combining path analysis with other analysis models makes it possible to quickly identify user motivation and lead users onto the optimal or desired path.
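Steps (2) and (3) above reduce to counting transitions between consecutive events. The sketch below assumes each session is already available as an ordered list of page names; the data layout, function names, and sessions are hypothetical.

```python
from collections import Counter


def transition_counts(sessions):
    """Count how often each (from_page, to_page) step occurs across sessions."""
    counts = Counter()
    for events in sessions:
        # Pair each event with the one that follows it.
        counts.update(zip(events, events[1:]))
    return counts


def next_step_share(sessions, page):
    """Share of each next page among all steps that leave `page`."""
    outgoing = {dst: n for (src, dst), n in transition_counts(sessions).items()
                if src == page}
    total = sum(outgoing.values())
    return {dst: n / total for dst, n in outgoing.items()} if total else {}


# Hypothetical sessions, each an ordered list of pages visited.
sessions = [
    ["home", "search", "detail", "cart", "order", "pay"],
    ["home", "category", "detail"],
    ["home", "search", "detail"],
    ["home", "detail"],
]
print(next_step_share(sessions, "home"))
print(transition_counts(sessions).most_common(1))
```

The same counts could also serve as per-user features for the path-similarity clustering mentioned in step (5).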
Example of a user behavior path diagram.
VIII. Retention analysis
User retention refers to new members or users who, after a certain period of time, still perform a specific behavior such as visiting, logging in, using the product, or converting. The proportion of these retained users to the new users of the original period is the retention rate. Retention rate falls into three categories by cycle; the following uses retention identified by login behavior as an example.
The first type is daily retention, which can be further subdivided as follows.
(1) Next-day retention rate: (number of users added on the first day who logged in on the second day) / (total number of users added on the first day)
(2) Day 3 retention rate: (number of users added on the first day who logged in on the third day) / (total number of users added on the first day)
(3) Day 7 retention rate: (number of users added on the first day who logged in on the 7th day) / (total number of users added on the first day)
(4) Day 14 retention rate: (number of users added on the first day who logged in on the 14th day) / (total number of users added on the first day)
(5) Day 30 retention rate: (number of users added on the first day who logged in on the 30th day) / (total number of users added on the first day)
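The daily retention formulas above can be sketched as a single function. Following the definitions in the list, the sign-up day is counted as day 1, so next-day retention is `day_number=2`. The data layout (one sign-up date per user and a set of login dates per user), the function name, and the sample users are hypothetical.

```python
from datetime import date, timedelta


def day_n_retention(signup_dates, login_dates, cohort_day, day_number):
    """Share of users added on `cohort_day` who logged in on day `day_number`.

    signup_dates: {user_id: date the user was added}
    login_dates:  {user_id: set of dates on which the user logged in}
    day_number:   1 is the sign-up day itself, 2 is the next day, and so on.
    """
    cohort = [user for user, day in signup_dates.items() if day == cohort_day]
    if not cohort:
        return None  # no new users on that day, retention is undefined
    target = cohort_day + timedelta(days=day_number - 1)
    retained = sum(1 for user in cohort if target in login_dates.get(user, set()))
    return retained / len(cohort)


# Hypothetical cohort of four users added on 1 January 2024.
day1 = date(2024, 1, 1)
signups = {"a": day1, "b": day1, "c": day1, "d": day1}
logins = {
    "a": {date(2024, 1, 2), date(2024, 1, 7)},
    "b": {date(2024, 1, 2)},
    "c": {date(2024, 1, 3)},
}
for n in (2, 3, 7):
    print(f"day {n} retention: {day_n_retention(signups, logins, day1, n):.0%}")
```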
The second type is weekly retention: the number of users still logging in in each subsequent week, relative to the first week's new users.
The third type is monthly retention: the number of users still logging in in each subsequent month, relative to the first month's new users. Retention rates are calculated for new users, and the result is a half-matrix report (only half of the cells contain data), in which each row is a date and each column gives the retention rate for a different time period. Normally, the retention rate declines as time passes. The following is a monthly user retention curve, using monthly retention as the example.
IX. Cluster analysis
Cluster analysis is an exploratory data analysis method. We typically use it to group and categorize seemingly disordered objects in order to better understand the objects under study. A good clustering result has high similarity among objects within a group and low similarity between objects in different groups. In user research, many problems can be addressed with cluster analysis, such as classifying information on websites, finding associations between click behaviors on web pages, and classifying users. User classification is the most common case.
There are many common clustering methods, such as K-means, spectral clustering, and hierarchical clustering. Take the most common one, K-means, as an example, as follows.
As you can see, the data can be divided into three different clusters, and each cluster should have its own specific properties. Cluster analysis is a form of unsupervised learning: it groups data in the absence of labels. Once the data has been clustered, we generally analyze each cluster individually and in depth to obtain more detailed results.
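To make the K-means idea concrete, here is a minimal pure-Python sketch on hypothetical 2D points that form three well-separated groups. Standard K-means usually starts from randomly chosen centroids; this sketch swaps in a deterministic farthest-point initialization so the example is reproducible. In practice a library implementation would normally be used instead.

```python
import math


def kmeans(points, k, max_iter=100):
    """Minimal K-means on tuples of coordinates. Returns (labels, centroids)."""
    # Farthest-point initialization (an assumption made for reproducibility):
    # start from the first point, then repeatedly add the point that is
    # farthest from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical points in three visually separate groups.
points = [(1, 1), (1.5, 2), (1, 0.5),
          (8, 8), (9, 9), (8, 9.5),
          (1, 9), (0.5, 8), (1.5, 9.5)]
labels, centroids = kmeans(points, k=3)
print(labels)
print(centroids)
```

Each resulting cluster can then be profiled separately, for example by comparing the average attributes of the users it contains.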