
What are the methods of data analysis?

Nine commonly used methods are listed below for reference:

One, formula decomposition

The formula decomposition method takes a given indicator and uses a formula to break down, layer by layer, the factors that affect it.

Example: to analyze why a product's sales are low, decompose sales into its factors with a formula.
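A minimal sketch of such a decomposition follows. The source does not give the formula, so this assumes a common one (sales = visitors x conversion rate x average order value), and all names and numbers are hypothetical.

```python
# Assumed decomposition (not from the source):
#   sales = visitors * conversion_rate * avg_order_value
# All figures below are made up for illustration.

def sales(visitors, conversion_rate, avg_order_value):
    return visitors * conversion_rate * avg_order_value

def factor_changes(previous, current):
    """Relative change of each factor between two periods."""
    return {name: current[name] / previous[name] - 1 for name in previous}

last_month = {"visitors": 10_000, "conversion_rate": 0.05, "avg_order_value": 40.0}
this_month = {"visitors": 10_500, "conversion_rate": 0.03, "avg_order_value": 41.0}

changes = factor_changes(last_month, this_month)
# The factor with the most negative change is the first place to look.
weakest = min(changes, key=changes.get)

print(sales(**last_month), sales(**this_month))
print(weakest, changes[weakest])
```

Each factor can in turn be decomposed further (for example, visitors by channel), which is the "layer by layer" part of the method.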

Two, comparative analysis

The comparison method compares two or more groups of data and is the most common analysis method.

Isolated data has no meaning; differences only show up through comparison. Examples include, in the time dimension, year-on-year and period-over-period (chain) ratios, growth rates, and fixed-base ratios, as well as comparisons with competitors, between categories, and between characteristics and attributes. Comparison reveals how data changes. It is used frequently and is often combined with other methods.
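The time-dimension comparisons mentioned above can be sketched as follows. The monthly figures are invented, and the choice of January as the fixed base period is an assumption made only for illustration.

```python
# Hypothetical monthly sales for the first half of two consecutive years.
sales_last_year = [100, 110, 120, 130, 125, 140]
sales_this_year = [120, 126, 150, 143, 150, 161]

def growth(current, reference):
    """Relative growth of `current` over `reference`."""
    return current / reference - 1

# Year-on-year: each month against the same month one year earlier.
year_on_year = [growth(c, p) for c, p in zip(sales_this_year, sales_last_year)]

# Chain ratio (period-over-period): each month against the month before it.
chain = [
    growth(sales_this_year[i], sales_this_year[i - 1])
    for i in range(1, len(sales_this_year))
]

# Fixed-base ratio: each month against one fixed base period (here: January).
base = sales_this_year[0]
fixed_base = [month / base for month in sales_this_year]

print([round(x, 3) for x in year_on_year])
print([round(x, 3) for x in chain])
print([round(x, 3) for x in fixed_base])
```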

The following chart compares the sales of companies A and B. Company A's sales rise overall and are higher than company B's, but company B grows faster than company A, and even though its growth rate falls in the later period, its sales eventually catch up.

Three, A/B test

An A/B test prepares two or more versions of a Web or App interface or process, lets similar visitor groups access them over the same period, collects the user experience data and business data of each group, and finally identifies the best-performing version for formal adoption.

The process of an A/B test is as follows:

(1) Analyze the current situation and form a hypothesis: analyze business data, identify the most critical point for improvement, form a hypothesis about how to optimize it, and put forward an optimization proposal. For example, we find that the user conversion rate is low and hypothesize that the promotional landing page converts poorly, so we need to find ways to improve it.

(2) Set goals and formulate a plan: set the main goal, which is used to measure how good each optimized version is, and set auxiliary goals, which are used to assess the optimized version's impact on other aspects.

(3) Design and development: prototype two or more optimized versions and complete the technical implementation.

(4) Allocate traffic: determine the share of traffic diverted to each online test version. In the initial stage, the traffic sent to the optimized version can be small and then be increased gradually depending on the situation.

(5) Collect and analyze data: collect the experimental data and judge its validity and effect. If statistical significance reaches 95% or above and holds for a period of time, the experiment can be ended; if it is below 95%, the test may need to run longer; if it cannot reach 95%, or even 90%, for a long time, you need to decide whether to stop the experiment.

(6) Conclude: based on the results of the experiment, either release the new version, adjust the traffic split and continue testing, or, if the expected result was not achieved, keep optimizing and iterating on the plan and develop a new version for another online test.
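The significance check in step (5) can be sketched with a two-proportion z-test. The source does not say which test to use, so the z-test, the function name, and the sample counts here are assumptions; "95% significance" is read as a two-sided p-value below 0.05.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.

    conv_*: number of converted users, n_*: number of visitors per version.
    Assumes both groups are large enough for the normal approximation.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: version A (current) vs. version B (optimized).
z, p_value = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
significant_at_95 = p_value < 0.05
print(round(z, 2), significant_at_95)
```

Checking the result repeatedly while the test is still running inflates the false-positive rate, which is one reason the step above asks for the significance to hold for a period of time before ending the experiment.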

The flow chart is as follows:

Four, quadrant analysis

By dividing data along two or more dimensions, values are expressed as coordinates, and a value's position translates directly into a strategy that can then be put into practice. The quadrant method is strategy-driven thinking and is often used in product analysis, market analysis, customer management, commodity management and so on. For example, the chart below shows a four-quadrant distribution of ad clicks, with the X-axis running from low to high from left to right and the Y-axis running from low to high from bottom to top.

An ad with a high click-through rate and a high conversion rate reaches a relatively accurate audience and is an efficient ad. An ad with a high click-through rate and a low conversion rate attracts most of the people who click on it, but the low conversion shows that the audience it targets does not match the product's actual audience. An ad with a high conversion rate and a low click-through rate targets an audience that matches the product's actual audience well, but its content needs to be optimized to attract more clicks. Ads with both a low click-through rate and a low conversion rate can be abandoned. There is also the classic RFM model, which divides customers into eight segments along three dimensions: most recent purchase (Recency), purchase frequency (Frequency), and purchase amount (Monetary).
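As a rough sketch, the four ad quadrants described above can be assigned with two cut-off values. The thresholds, ad names, and rates below are hypothetical; in practice the cut-offs are often set from an average or median.

```python
def ad_quadrant(ctr, cvr, ctr_cut, cvr_cut):
    """Place an ad in one of four quadrants by click-through and conversion rate."""
    high_ctr = ctr >= ctr_cut
    high_cvr = cvr >= cvr_cut
    if high_ctr and high_cvr:
        return "high CTR / high CVR: efficient, keep"
    if high_ctr:
        return "high CTR / low CVR: audience mismatch, retarget"
    if high_cvr:
        return "low CTR / high CVR: good audience, improve the ad content"
    return "low CTR / low CVR: consider abandoning"

# Hypothetical ads as (click-through rate, conversion rate).
ads = {
    "ad_1": (0.08, 0.06),
    "ad_2": (0.09, 0.01),
    "ad_3": (0.02, 0.07),
    "ad_4": (0.01, 0.005),
}
CTR_CUT, CVR_CUT = 0.05, 0.03  # assumed cut-offs

placement = {
    name: ad_quadrant(ctr, cvr, CTR_CUT, CVR_CUT)
    for name, (ctr, cvr) in ads.items()
}
for name, label in placement.items():
    print(name, "->", label)
```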

Advantages of the quadrant method:

(1) Finding the common cause of a problem

Quadrant analysis groups events with the same characteristics for attribution analysis and summarizes their common cause. For example, in the advertising case above, effective promotion channels and strategies can be distilled from the events in the first quadrant, while the third and fourth quadrants help rule out ineffective promotion channels.

(2) Establishing optimization strategies by group

Quadrant analysis can be used to establish a separate optimization strategy for each quadrant. For example, the RFM customer management model uses the quadrants to divide customers into key development customers, key retention customers, general development customers, and general retention customers. Key development customers can be given more resources, such as VIP services, personalized services, and additional sales. Potential customers can be offered higher-value products or incentives that attract them to return.

Five, Pareto analysis

Pareto analysis derives from the classic 80/20 rule (the Pareto principle). For example, for personal wealth it is often said that 20% of the world's people hold 80% of the wealth. In data analysis, it can be understood as 20% of the data producing 80% of the effect, so the analysis should dig into that 20%. The 80/20 rule is usually applied together with ranking: the items ranked in the top 20% are treated as the effective data. It is a way of focusing on what matters most and applies to virtually any industry. Find the key items, identify their characteristics, and then think about how to move the remaining 80% toward that 20% to improve results.

It is generally used in product categorization to build an ABC model. For example, if a retailer has 500 SKUs and the sales for each of them, deciding which SKUs are important is a matter of prioritization in business operations.

The common practice is to use the product SKU as the dimension and its sales as the base metric, sort the SKUs by sales from largest to smallest, and compute for each SKU the cumulative sales up to and including that SKU as a percentage of total sales.

SKUs whose cumulative percentage is at most 70% are classified as Category A, those above 70% and up to 90% as Category B, and those above 90% and up to 100% as Category C. These thresholds can be adjusted to suit your actual situation.
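The ABC classification just described can be sketched as follows, using the 70% and 90% cut-offs from the text. The SKU names and sales figures are hypothetical.

```python
def abc_classify(sales_by_sku, a_cut=0.70, b_cut=0.90):
    """Classify SKUs as A/B/C by cumulative share of total sales."""
    total = sum(sales_by_sku.values())
    # Sort SKUs by sales, largest first.
    ranked = sorted(sales_by_sku.items(), key=lambda kv: kv[1], reverse=True)
    classes = {}
    running = 0
    for sku, amount in ranked:
        running += amount
        share = running / total  # cumulative share up to and including this SKU
        if share <= a_cut:
            classes[sku] = "A"
        elif share <= b_cut:
            classes[sku] = "B"
        else:
            classes[sku] = "C"
    return classes

# Hypothetical sales per SKU.
sku_sales = {
    "sku_1": 500, "sku_2": 150, "sku_3": 120,
    "sku_4": 100, "sku_5": 80, "sku_6": 50,
}
classes = abc_classify(sku_sales)
print(classes)
```

Note that with this rule a single SKU that alone exceeds the first cut-off is not placed in Category A; whether that is desirable depends on how the thresholds are tuned.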

The ABC analysis model can be used to classify not only products and sales but also customers and customer transactions. For example, which customers contribute 80% of the company's profit, and what share of all customers are they? If they make up 20%, then with limited resources the company knows to focus on retaining that 20% of customers.

Six, funnel analysis

The funnel method uses a funnel diagram, which looks a bit like an inverted pyramid. It is a process-oriented way of thinking, commonly used to analyze changes across the stages of a process, such as new-user acquisition or shopping conversion.

The diagram above is a classic marketing funnel, showing the stages from user acquisition to final conversion to purchase. The conversion rate between neighboring stages quantifies the performance of each step. The funnel model therefore splits the whole purchase process into steps, uses conversion rates to measure the performance of each step, and uses abnormal indicators to locate the problematic stage, so that the step can be fixed and optimized and the overall purchase conversion rate improved.

The core idea of the funnel model can be summarized as decomposition and quantification. For example, to analyze e-commerce conversion, we monitor user conversion at each level and look for points that can be optimized at each level. For users who do not follow the standard process, map their conversion path separately and shorten it to improve the user experience.
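The decomposition and quantification idea can be sketched as step-by-step conversion rates over a funnel. The stage names and user counts are hypothetical.

```python
# Hypothetical funnel: (stage name, number of users reaching that stage).
funnel = [
    ("visit home page", 10_000),
    ("view product", 4_000),
    ("add to cart", 1_200),
    ("submit order", 600),
    ("pay", 480),
]

def step_conversions(steps):
    """Conversion rate between each pair of neighboring stages."""
    return [
        (f"{a} -> {b}", n_b / n_a)
        for (a, n_a), (b, n_b) in zip(steps, steps[1:])
    ]

rates = step_conversions(funnel)
overall = funnel[-1][1] / funnel[0][1]
# The step with the lowest conversion rate is a candidate for optimization.
worst_step = min(rates, key=lambda r: r[1])

for name, rate in rates:
    print(f"{name}: {rate:.1%}")
print(f"overall: {overall:.1%}, weakest step: {worst_step[0]}")
```

The lowest rate is only a starting point: what counts as abnormal is usually judged against the same step's historical rate or a benchmark.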

There is also the classic growth hacking model, the AARRR model, which stands for Acquisition, Activation, Retention, Revenue, and Referral, that is, acquiring users, activating them, retaining them, earning revenue from them, and having them spread the product to others. It is a relatively common model in product operations: depending on the characteristics of the product and its position in the product life cycle, it focuses on different data indicators and ultimately leads to different operational strategies.

From the AARRR model diagram below, it is clear that the number of users decreases at each stage of the user lifecycle. By breaking down and quantifying each stage of the lifecycle, data can be compared horizontally and vertically to identify problems and drive continuous optimization and iteration.

Seven, path analysis

User path analysis tracks a user's behavior from the start of an event to its end, that is, it monitors user flow. It can be used to measure the effect of website optimization or of marketing and promotion, and to understand users' behavioral preferences. The ultimate goal is to meet business objectives by guiding users along the product's optimal path more efficiently and, in the end, encouraging them to pay. How is user behavior path analysis carried out?

(1) Identify the first step users take when they use the website or app, then calculate the flow and conversion of each subsequent step in turn, so that the data faithfully reproduces the whole process from the moment users open the app to the moment they leave.

(2) View the distribution of users' paths when using a product. For example, after visiting the home page of an e-commerce product, what percentage of users searched, what percentage of users visited the category page, and what percentage of users directly visited the product detail page.

(3) Conduct path optimization analysis. For example: which path do users take most often, and at which step are users most likely to drop off?

(4) Identify users' behavioral characteristics from their paths. For example, analyze whether a user is goal-oriented and leaves once the task is done, or browses without a particular purpose.

(5) Segment users. Users are usually categorized by their purpose in using the app. For example, the users of an automobile app can be subdivided into interested users, users with purchase intent, and purchasing users, and path analysis is carried out for each type of user and each task: for users with purchase intent, for instance, which paths they take to compare different models and what problems they run into. Another method is to run cluster analysis on all users' access paths, classify users by the similarity of their paths, and then analyze each type of user.
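Steps (2) and (3) above can be sketched by counting page-to-page transitions in recorded sessions. The session data and page names are hypothetical, and each session is assumed to be an ordered list of visited pages.

```python
from collections import Counter

# Hypothetical sessions: ordered pages visited by each user.
sessions = [
    ["home", "search", "product", "cart", "pay"],
    ["home", "category", "product"],
    ["home", "search", "product", "cart"],
    ["home", "product"],
    ["home", "search"],
]

def next_step_shares(sessions, page):
    """Share of each page visited right after `page`; 'exit' means the user left."""
    counts = Counter()
    for path in sessions:
        for i, current in enumerate(path):
            if current == page:
                nxt = path[i + 1] if i + 1 < len(path) else "exit"
                counts[nxt] += 1
    total = sum(counts.values())
    return {nxt: n / total for nxt, n in counts.items()}

# Step (2): where do users go after the home page?
after_home = next_step_shares(sessions, "home")

# Step (3): on which page do sessions end most often (drop-off point)?
exit_pages = Counter(path[-1] for path in sessions)
top_exit = exit_pages.most_common(1)[0][0]

print(after_home)
print(top_exit)
```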

Taking e-commerce as an example, from logging in to the website or app to paying successfully, buyers go through home page browsing, product search, adding to the shopping cart, submitting the order, and paying for the order. A user's real shopping process is intertwined and repetitive: after submitting an order, for example, the user may return to the home page to keep searching for products or may cancel the order, and there is a different motive behind each path. Combined with in-depth analysis using other models, path analysis makes it possible to identify user motivations quickly and lead users to the optimal or desired path.

User behavior path diagram example:

Eight, retention analysis

User retention refers to new members or users who, after a certain period of time, still show a specific attribute or behavior such as visiting, logging in, using the product, or converting. The proportion of retained users among the users who were new at the time is the retention rate. Retention rates fall into three categories by cycle. Taking retention identified by login behavior as an example:

The first type is daily retention, which can be subdivided as follows:

(1) Next-day retention rate: (number of users added on day 1 who still logged in on day 2) / (total number of users added on day 1)

(2) 3rd-day retention rate: (number of users added on day 1 who still logged in on day 3) / (total number of users added on day 1)

(3) 7th-day retention rate: (number of users added on day 1 who still logged in on day 7) / (total number of users added on day 1)

(4) 14th-day retention rate: (number of users added on day 1 who still logged in on day 14) / (total number of users added on day 1)

(5) 30th-day retention rate: (number of users added on day 1 who still logged in on day 30) / (total number of users added on day 1)
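The daily retention formulas can be sketched as a single function. It follows the convention used above, where day 1 is the day the user was added, so next-day retention is n = 2. The data structures, user IDs, and dates are hypothetical.

```python
from datetime import date, timedelta

def day_n_retention(signup_dates, login_dates, cohort_day, n):
    """Share of users added on `cohort_day` who logged in on day n.

    Day 1 is the signup day itself, so the target date is cohort_day + (n - 1).
    signup_dates: {user: date added}, login_dates: {user: set of login dates}.
    """
    cohort = {u for u, d in signup_dates.items() if d == cohort_day}
    if not cohort:
        return 0.0
    target = cohort_day + timedelta(days=n - 1)
    retained = {u for u in cohort if target in login_dates.get(u, set())}
    return len(retained) / len(cohort)

d1 = date(2024, 1, 1)
day = timedelta(days=1)

# Hypothetical data: four users added on day 1, one added a day later.
signup_dates = {"u1": d1, "u2": d1, "u3": d1, "u4": d1, "u5": d1 + day}
login_dates = {
    "u1": {d1, d1 + day, d1 + 2 * day, d1 + 6 * day},
    "u2": {d1, d1 + day, d1 + 2 * day},
    "u3": {d1, d1 + day},
    "u4": {d1},
    "u5": {d1 + day, d1 + 2 * day},
}

next_day = day_n_retention(signup_dates, login_dates, d1, 2)
third_day = day_n_retention(signup_dates, login_dates, d1, 3)
seventh_day = day_n_retention(signup_dates, login_dates, d1, 7)
print(next_day, third_day, seventh_day)
```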

The second type is weekly retention: the number of users still logging in each week relative to the users added in the first week.

The third type is monthly retention: the number of users still logging in each month relative to the users added in the first month. Retention rates are computed for new users, and the result is a triangular matrix report (only half of the cells contain data), in which each row is a date and each column is the retention rate for a given period. Normally, the retention rate decreases over time. The following monthly user retention curve is an example:

Nine, cluster analysis

Cluster analysis is an exploratory data analysis method. We usually use it to group and categorize seemingly disordered objects in order to understand the objects being studied better. A good clustering has high similarity among objects within a group and low similarity between groups. In user research, many problems can be addressed with cluster analysis, for example classifying information on a website, analyzing the relevance of click behavior on web pages, and classifying users. Of these, user classification is the most common case.

There are many common clustering methods, such as K-means, spectral clustering, and hierarchical clustering. Take the most common, K-means, as an example:

You can see that the data can be divided into three clusters (red, blue, and green), and each cluster should have its own distinct properties. Cluster analysis is a kind of unsupervised learning: it groups data without labels. Once we have clustered the data, we generally analyze each cluster individually and in depth to get more detailed results.
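A minimal pure-Python sketch of K-means (Lloyd's algorithm) is shown below. To keep it deterministic, it uses the first k points as initial centroids, which is a simplifying assumption; practical implementations normally use random or k-means++ initialization. The points are hypothetical.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iterations=100):
    """Plain K-means on a list of equal-length tuples.

    Initial centroids are simply the first k points (an assumption made
    here for reproducibility, not a recommended initialization).
    """
    centroids = list(points[:k])
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assignment, centroids

# Hypothetical 2-D points forming three visibly separate groups.
points = [
    (1.0, 1.0), (5.0, 5.0), (9.0, 1.0),
    (1.5, 2.0), (5.5, 4.5), (8.5, 1.5),
    (1.0, 0.5), (4.5, 5.5), (9.5, 0.5),
]
assignment, centroids = kmeans(points, k=3)
print(assignment)
print(centroids)
```

The number of clusters k has to be chosen in advance, and the result can depend on the initial centroids, so it is common to run the algorithm several times and compare the results.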
