My Analytic Research

Sunday, April 17, 2016

Stores Sales Forecasting using R+Tableau Approach

In our previous post we described using the R interface of Tableau for sales forecasting on the fly. The other case is when sales forecasting and analysis are performed by a separate R script and the results are loaded into Tableau. For sales forecasting, we used a linear model with LASSO regularization. The model takes into account seasonality, promo actions, and holidays. The forecasts for each store, produced by the R script, were loaded into Tableau. In the Tableau dashboard, users can analyze features of the historical data such as seasonality, sales aggregations, and sales distributions, displayed together with the sales forecasts. The forecasting is performed for each individual store on a daily time scale, so users can use Tableau to aggregate sales for groups of stores or to view sales on a daily, weekly, or monthly time scale. This case of store sales analysis can be tested here.
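
To make the modelling step more concrete, here is a minimal sketch of a LASSO-regularized linear model fitted by coordinate descent. It is not the R script used for the dashboard: the feature layout (weekend, promo, holiday, plus one unrelated column), the synthetic sales values, and all function names are assumptions made only for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the LASSO proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def predict(intercept, weights, row):
    return intercept + sum(w * v for w, v in zip(weights, row))


def lasso_fit(rows, targets, alpha, n_sweeps=300):
    """Coordinate descent for (1/2n)*sum((y - b0 - x.w)^2) + alpha*sum(|w|)."""
    n, p = len(rows), len(rows[0])
    weights = [0.0] * p
    intercept = sum(targets) / n
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0    # covariance of feature j with the residual that ignores feature j
            scale = 0.0  # mean square of feature j
            for row, y in zip(rows, targets):
                residual = y - predict(intercept, weights, row) + weights[j] * row[j]
                rho += row[j] * residual / n
                scale += row[j] * row[j] / n
            weights[j] = soft_threshold(rho, alpha) / scale if scale else 0.0
        # The intercept is not penalised: set it to the mean residual.
        intercept = sum(y - predict(0.0, weights, row) for row, y in zip(rows, targets)) / n
    return intercept, weights


def make_features(day):
    """Hypothetical feature row: [weekend, promo, holiday, unrelated]."""
    weekday = day % 7
    weekend = 1.0 if weekday >= 5 else 0.0
    promo = 1.0 if (day // 7) % 2 == 0 and weekday in (1, 2, 3) else 0.0
    holiday = 1.0 if day in (8, 25, 46) else 0.0
    unrelated = ((day * 37) % 11) / 10.0
    return [weekend, promo, holiday, unrelated]


X = [make_features(day) for day in range(56)]
# Synthetic daily sales: base 100, +20 on weekends, +30 on promo days, -40 on holidays.
y = [100.0 + 20.0 * f[0] + 30.0 * f[1] - 40.0 * f[2] for f in X]

intercept, weights = lasso_fit(X, y, alpha=0.05)
_, shrunk_to_zero = lasso_fit(X, y, alpha=1e6)
print([round(w, 2) for w in weights])
print(shrunk_to_zero)
```

With a small penalty the fitted weights stay close to the effects built into the synthetic data, while a very large penalty drives every weight to zero. In practice the penalty would be chosen by cross-validation.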

Sunday, August 30, 2015

Sales forecasting with limits

Sales forecasting is an important part of modern business intelligence. In this area, it is important to analyze the causes of time series behavior, e.g. a trend. If we apply well-known techniques without such analysis, we can obtain results that are statistically correct but incorrect from a business point of view. Predicted negative sales are one example of such incorrect results. Let us consider a sales time series that reflects the consumption of some goods. Suppose that the time series has a descending trend. This time series is shown below on daily and weekly scales.

Seasonal fluctuations of daily sales, presented as boxplots, are the following:

When we apply an approach based on the ARIMAX algorithm to forecasting this time series, we can receive negative values for predicted sales. This algorithm takes into account multilevel seasonality (day of the week, day of the month, and month of the year). You can see the result in the following figures.

Time series on the daily scale. The green line denotes the historical time series, the red line the fitted time series, and the blue line the forecasted time series.

Let us analyze the reasons that can cause a descending sales trend. One possible reason is that the vendor concentrates on another product of the same goods group and, as a result, consumers stop purchasing the analyzed product because of its higher price and lack of promotion. This can cause a descending trend. But among the consumers there is a group that will purchase this product despite its higher price, lack of promotion, or any other reasons. So, in forecasting it is important to take into account the structure of consumers, which can be described by an estimated minimum limit for sales. This limit can be taken into account using a transformation of the sales variable. One such approach is considered by Rob Hyndman here. Using this variable transformation with a minimum sales limit equal to 50 and applying the ARIMAX algorithm, we receive the following results (on daily and weekly scales):
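
The idea of forecasting on a transformed scale can be sketched as follows. The exact transformation is not spelled out in the post, so this example assumes the simple form y = log(sales - limit) with the limit of 50 mentioned above, and it uses a plain linear trend as a stand-in for ARIMAX. The data and function names are hypothetical.

```python
import math

SALES_FLOOR = 50.0  # minimum sales limit quoted in the text


def to_unbounded(sales, floor=SALES_FLOOR):
    """Map sales above the floor onto the whole real line."""
    return [math.log(s - floor) for s in sales]


def from_unbounded(values, floor=SALES_FLOOR):
    """Inverse transform: the result can approach the floor but not drop below it."""
    return [floor + math.exp(v) for v in values]


def linear_trend_forecast(series, horizon):
    """Least-squares straight line, extrapolated `horizon` steps ahead."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
             / sum((t - t_mean) ** 2 for t in range(n)))
    intercept = y_mean - slope * t_mean
    return [intercept + slope * (n + h) for h in range(horizon)]


# Synthetic descending sales: 200 units falling by 5 per day.
history = [200.0 - 5.0 * t for t in range(25)]

naive = linear_trend_forecast(history, 30)
bounded = from_unbounded(linear_trend_forecast(to_unbounded(history), 30))

print(round(min(naive), 1))    # extrapolating the raw series goes negative
print(round(min(bounded), 1))  # the transformed forecast stays above the floor
```

The model itself is unchanged. Only the scale on which it is fitted differs, which is why the same trick applies to ARIMAX or any other forecaster.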

So, using sales variable transformation with set sales limits, we can avoid incorrect results (e.g. negative sales) for sales forecasting with descending trends.

Thursday, January 9, 2014

Mining of Social Network Streams in Marketing, Predictive Analytics, and Risk Management.

The analysis of modern social networks is widely used in many business areas such as marketing, forecasting, financial and stock markets, etc. Marketing, predictive analytics, and risk management are parts of business intelligence (BI). BI provides reporting and analysis that can help make business decisions and show what happened and why. We would like to consider how data mining methods applied to unstructured data streams can be used in BI solutions. One can gather such data from different sources, e.g. social network streams, specialized forums, RSS channels, etc. We especially study how this type of analysis can be applied to predictive analytics and risk management. Let us consider the basics of these areas.

Predictive analytics is an area of data mining that deals with extracting information from data and using it to predict trends and behavior patterns. Predictive models analyze past performance to assess how likely a customer is to exhibit a specific behavior in order to improve marketing effectiveness. With the number of competing services available, businesses need to focus their efforts on maintaining continuous consumer satisfaction and rewarding consumer loyalty. So, it is important to analyze users' opinions, which can be retrieved from users' messages in social networks. Predictive analytics can also predict this behavior, so that the company can take proper actions to increase customer activity. Apart from identifying prospects, predictive analytics can also help to identify the most effective combination of product versions, marketing material, communication channels, and timing that should be used to target a given consumer. Predictive analytics and mining of social network streams can also be used to identify high-risk fraud candidates in business or the public sector.

Another area where social network streams can be applied is risk management, which is about the identification, assessment, and prioritization of risks. Social network streams make it possible to reveal quantitative characteristics of background factors for the processes under analysis. Monitoring indicators of social network streams makes it possible to control the probability and/or impact of unfortunate events or to maximize the realization of opportunities. Missing knowledge can be retrieved from the semi-structured data of message streams in social networks. This additional knowledge can also help to optimize the analyzed processes and minimize overall risk. Text stream mining enables us to reveal the dynamics of different risk sources by analyzing quantitative indicators retrieved from social network data. One of the important factors in risk management is users' opinion about some entity, e.g. a process, a service, etc. Such opinions can be retrieved using a sentiment mining approach applied to informational streams of social networks. Modern business intelligence systems widely use analytical methods for unstructured and semi-structured data gathered from different sources.

We would like to show the possibility of analyzing economic and financial indicators using streams of textual data and informational streams of social networks, special-purpose forums, and RSS channels. Consider the data mining of social network streams. To receive information streams, we used the Twitter API and special Python software for web scraping of special-purpose forums. The theoretical basis for the analysis was the theory of semantic fields, formal concept analysis, the theory of frequent sets and association rules, and sentiment mining methods. For the predictive analytics we used ARIMA and VAR models. The Granger test was used to find causality between time series. As a result of data mining of text messages, we receive time series of various quantitative characteristics of blog messages, e.g. the support and confidence of association rules. The next step is to find correlations between the time series obtained from social network data mining and the time series that represent real stock markets. At this step, we need to find time series of social media trends that not only correlate with stock market series but also have predictive potential. Visualization of data and infographics, on the basis of which an expert makes a decision, are very important for decision-making in risk management. That is why we attached great importance to the various ways of presenting our results. As our previous studies show, it is very important to detect and remove anomalous communities that form dynamically in tweet streams. We also showed that it is very important to single out the tweets of competent users and main influencers. We can find them using different methods of graph theory.
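
A simple way to screen a social media series for predictive potential is to correlate it with the market series at several lags. The sketch below is not the Granger test used in the study. It is only a lighter first check, and both series are synthetic values invented for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(leader, follower, lag):
    """Correlation between leader[t] and follower[t + lag] for lag >= 0."""
    if lag == 0:
        return pearson(leader, follower)
    return pearson(leader[:-lag], follower[lag:])


# Hypothetical daily support of a frequent itemset in a tweet stream.
support_series = [0.10, 0.12, 0.08, 0.15, 0.11, 0.20, 0.18, 0.09, 0.14, 0.22, 0.13, 0.17]
# Synthetic price that follows the support series with a two-day delay.
price_series = [100.30, 100.25] + [100.0 + 2.0 * s for s in support_series[:-2]]

correlations = {lag: lagged_correlation(support_series, price_series, lag)
                for lag in range(5)}
best_lag = max(correlations, key=correlations.get)
print(best_lag, round(correlations[best_lag], 3))
```

A series whose correlation peaks at a positive lag is a candidate leading indicator. A formal causality test is still needed before using it in a forecasting model.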

As an example, consider the dynamics of the popularity of some cosmetic brands, based on the downloaded tweet streams. Fig. 1-4 show the results obtained. We used various chart types to visualize our results. Such types of graphs may be used in business intelligence dashboards. They may also provide additional business information for experts in marketing, predictive analytics, and risk management.

Fig.1

Fig.2

Fig.3

Fig.4

Let us consider the dynamics of the chosen brands, based on the analysis of messages from economic forums. Those messages were downloaded from forums using corresponding Python software. Fig. 5-7 show the obtained results in different graphical presentations.

Fig.5

Fig.6

Fig.7

Now we consider the dynamics of quantitative characteristics of one company, based on the analysis of downloaded tweet streams. We chose Apple as an example. Fig. 8-9 show graphs with the dynamics of keyword frequent itemsets and the dynamics of users' opinion. These results reflect the dynamics of the popularity of Apple products and users' opinion towards them.

Fig.8

Fig.9

Our next step is to consider whether it is possible to predict Apple stock prices on the basis of the obtained time series of keyword frequent itemsets. In our previous studies, we conducted the Granger test for the time series of frequent itemsets and Apple stock prices. This test showed that the time series of frequent itemsets of the analyzed tweet stream cause, in the Granger sense, features of the dynamics of stock prices. We use the VAR model to analyze the possibility of predicting stock prices. This model takes into account both the dynamics of stock prices and the dynamics of some chosen frequent itemsets. Fig. 10-12 show the calculation results with different sets of frequent itemsets. The bold points are the predicted values, calculated on the basis of previous historical data. Fig. 10-11 show the predictions for three days ahead, and Fig. 12 shows the prediction for one day ahead. The confidence interval is marked in grey.

Fig.10

Fig.11

Fig.12

In the VAR model, we included the time series of frequent itemsets of keywords and users' opinions. The obtained data show that on some of the analyzed intervals the VAR model proved effective for a predictive analytics approach to stock market forecasting. In our further studies we are going to concentrate on algorithms for effectively selecting the sets of time series of frequent itemsets, with the aim of narrowing the confidence interval and obtaining more accurate predictions for longer time periods.
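
For readers unfamiliar with VAR models, here is a minimal first-order sketch, VAR(1), estimated by ordinary least squares. The study itself does not state the lag order or the software settings, so the order, the two-variable layout (price, itemset support), and the noise-free synthetic data are all assumptions chosen to keep the example short.

```python
def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_var1(series):
    """series: time-ordered observations [y1, y2, ...].
    Returns one row per variable: [const, a_1, ..., a_k]."""
    design = [[1.0] + list(prev) for prev in series[:-1]]
    size = len(series[0]) + 1
    gram = [[sum(row[i] * row[j] for row in design) for j in range(size)]
            for i in range(size)]
    coefs = []
    for var in range(size - 1):
        targets = [obs[var] for obs in series[1:]]
        moment = [sum(row[i] * y for row, y in zip(design, targets))
                  for i in range(size)]
        coefs.append(solve_linear(gram, moment))
    return coefs


def forecast_var1(coefs, last, steps):
    """Iterate the fitted equations forward from the last observation."""
    path, current = [], list(last)
    for _ in range(steps):
        current = [row[0] + sum(a * v for a, v in zip(row[1:], current))
                   for row in coefs]
        path.append(current)
    return path


# Hypothetical data-generating coefficients: [const, effect of y1, effect of y2].
TRUE_COEFS = [[1.0, 0.5, 0.2], [0.5, 0.1, 0.7]]
series = [[10.0, -4.0]]
for _ in range(10):
    series.append(forecast_var1(TRUE_COEFS, series[-1], 1)[0])

estimated = fit_var1(series)
prediction = forecast_var1(estimated, series[-1], 3)  # three steps ahead
print([[round(c, 3) for c in row] for row in estimated])
```

Because the synthetic series contains no noise, the estimated coefficients reproduce the generating ones. With real data, each equation would also carry residual variance, from which the confidence interval of the forecast is derived.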

Our previous similar investigations can be found at:

Graph Approach for Data Mining in BI Applications

The analysis of travelling trends, using tweets data mining

Comparing stock market charts with frequent sets of keywords in Twitter messages

Forecasting of Stock Financial Series Using Multivariate Vector Autoregressive Model

Granger Causality Test for Frequent Itemsets of Keywords in Financial Tweets

Forecasting of the winners and favorites of Eurovision Song Contest 2013

Tweet mining using NLP can help in goods marketing

We also give our selected scientific e-prints and links where we described the theoretical grounds of social network mining, which we used in our studies:

B. Pavlyshenko

Tweets Miner for Stock Market Analysis

In this paper, we present a software package for the data mining of Twitter microblogs with the purpose of their usage in the stock market analysis. The package is written in R language using appropriate R packages. We considered the model of tweets and then compared stock market charts with frequent sets of keywords in Twitter microblog messages.

URL: http://arxiv.org/ftp/arxiv/papers/1305/1305.7014.pdf

B. Pavlyshenko

Can Twitter Predict Royal Baby's Name?

We analyze the existence of possible correlation between public opinion of twitter users and the decision-making of persons who are influential in the society. In our study, we use the methods of quantitative processing of natural language, the theory of frequent sets, the algorithms of visual displaying of users' communities. It was revealed that the structure of dynamically formed users' communities participating in the discussion is determined by only a few leaders who influence significantly the viewpoints of other users.

URL: http://arxiv.org/ftp/arxiv/papers/1310/1310.3500.pdf

B. Pavlyshenko

Forecasting of Events by Tweet Data Mining

This paper describes the analysis of quantitative characteristics of frequent sets and association rules in the posts of Twitter microblogs related to different event discussions. For the analysis, we used a theory of frequent sets, association rules and a theory of formal concept analysis. We revealed the frequent sets and association rules which characterize the semantic relations between the concepts of analyzed subjects. The support of some frequent sets reaches its global maximum before the expected event but with some time delay. Such frequent sets may be considered as predictive markers that characterize the significance of expected events for blogosphere users. We showed that the time dynamics of confidence in some revealed association rules can also have predictive characteristics. Exceeding a certain threshold may be a signal for corresponding reaction in the society within the time interval between the maximum and the probable coming of an event. In this paper, we considered two types of events: the Olympic tennis tournament final in London, 2012 and the prediction of Eurovision 2013 winner.

URL: http://arxiv.org/pdf/1310.3499

Thursday, January 2, 2014

Graph Approach for Data Mining in BI Applications

I would like to share some thoughts about using graph theory in modern business intelligence (BI). Business intelligence is about providing reporting and analysis solutions that show business users what happened and why. Advanced analytics solutions, on the other hand, deliver deeper insight into what might happen in the future, based on high volumes of historical data and sophisticated modeling techniques. BI includes advanced econometric and statistical analyses, machine learning, predictive analytics, and other modern approaches, including methods of graph theory. Recently IBM revealed its technological trend predictions for the next 5 years (CNN). These are Smarter Classrooms, Smarter Stores, Smarter Medicine, Smarter Privacy and Protection, and Smarter Cities. The structure of data in these technology areas can be effectively represented by graphs. I would like to explain briefly the prospects of using the methods of graph theory. Graphs make it possible to establish connections between different component elements of a system. One of the main advantages of the graph approach is that one can analyze an entity not only as an element of some set or transaction, but also take into account the types of its relations with the nodes of the whole network. Graph approaches give new capabilities in such areas as fraud detection, finding key influencers, gatekeepers, suspicious users and processes, predictive analytics, anomaly detection, detection of artificial communities, recommendation systems, etc. Community detection, including the detection of fraudulent personalities in networks, is an important part of network analysis and has become a popular area of investigation. Recommendation systems process relations between users and other entities, e.g. products, processes, and services, based on users' activity such as purchasing. Graphs are most widely used to describe social ties between users. Many methods of social network analysis are based on graph theory.
Graphs make it possible not only to establish connections between users but also to include various entities in these structures. For this purpose an N-mode graph representation can be used. For a specific analysis, N-mode graphs can be transformed into a one-mode graph using different aggregation functions. So, using the graph approach, we can describe different types of users' relations together with relations of processes and other entities. To build a practical system it is necessary to consider quantitative graph characteristics, which may be further used in supervised and unsupervised classification algorithms. We can consider different graph characteristics that can be used for the analysis of graph vertices. The simple ones are betweenness and closeness centralities, vertex PageRank, authority, coreness, etc. These measures make it possible to estimate the importance of a node in the network. As an example, I took a social graph that was created from the tweet stream in my previous investigation, Detection of Community Anomalies in Twitter Trends. Fig. 1 shows the graph with detected communities.
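
As an illustration of one of these vertex measures, the sketch below computes coreness (the k-core number) by repeatedly peeling the vertex of lowest remaining degree, and then keeps the vertices above a threshold. The toy edge list and function names are assumptions made for the example. A graph library would normally be used for a real tweet graph.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def coreness(adjacency):
    """k-core number of every vertex, found by peeling lowest-degree vertices."""
    remaining = {node: set(neigh) for node, neigh in adjacency.items()}
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=lambda n: len(remaining[n]))
        k = max(k, len(remaining[node]))  # core numbers never decrease while peeling
        core[node] = k
        for neigh in remaining.pop(node):
            remaining[neigh].discard(node)
    return core


# Hypothetical graph: a triangle of users plus one loosely attached user.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
core = coreness(build_adjacency(edges))
dense_part = sorted(node for node, k in core.items() if k > 1)
print(core)
print(dense_part)
```

Filtering by coreness keeps only vertices embedded in a densely connected part of the network, which is how subgraphs with coreness above a chosen threshold can be extracted.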

Fig. 1

Fig. 2 shows the subgraph of the vertices with the authority > 0.

Fig. 2

Fig. 3 shows the subgraph of the vertices with the coreness > 1.

Fig. 3

Fig. 4 shows the subgraph of vertices with the coreness > 3.

Fig. 4

Using these scores, we can find both artificial anomalous communities and the communities with the users who are really important in the analyzed tweet stream. The removal of anomalous communities is very important in data mining of trends, since it enables us to get a real picture of users' minds. One more useful thing in social networks may be an additional service that would filter out anomalous communities; or it may inform other users about any suspicious users and informational streams.
Here are some other results. For the last several months we have been collecting tweet streams filtered by keywords such as apple, cosmetics, etc. Consider the tweet stream with the keyword 'apple'. Fig. 5 shows the revealed users' communities, formed on the basis of the analysis of graph relations.

Fig.5

Fig.6 demonstrates the cloud of keyword frequencies.

Fig.6

As a result of graph analysis, we revealed competent users and key influencers. The connections between the most competent users and influencers are shown in Fig. 7.

Fig.7

For further consideration we took the tweets of the most competent and authoritative users, as identified by several graph characteristics. The cloud of keyword frequencies for this tweet array is shown in Fig. 8. A user's authority was defined on the basis of the user's connections with other users, taking into account the connections of those users with others. In this analysis, we calculated the principal eigenvector of the product of the transposed adjacency matrix and the adjacency matrix of the graph. The most authoritative user appeared to be the one with the name jastinbieber. It is very interesting that Apple products occur in the tweets of this user not directly but implicitly, in the context of other topics, e.g. as a link to iTunes. In my opinion this is the most effective type of advertising.
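
The authority score described here, the principal eigenvector of the product of the transposed adjacency matrix and the adjacency matrix, can be approximated by power iteration. The following is a minimal sketch on a hypothetical mention graph. The user names and the fixed iteration count are assumptions.

```python
import math


def authority_scores(edges, iterations=100):
    """Power iteration for the principal eigenvector of A^T A,
    where A[src][dst] = 1 if src mentions or retweets dst."""
    nodes = sorted({node for edge in edges for node in edge})
    authority = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # hub = A * authority
        hub = {node: 0.0 for node in nodes}
        for src, dst in edges:
            hub[src] += authority[dst]
        # authority = A^T * hub
        updated = {node: 0.0 for node in nodes}
        for src, dst in edges:
            updated[dst] += hub[src]
        norm = math.sqrt(sum(v * v for v in updated.values())) or 1.0
        authority = {node: v / norm for node, v in updated.items()}
    return authority


# Hypothetical directed edges (who mentions whom).
edges = [("u1", "star"), ("u2", "star"), ("u3", "star"),
         ("u1", "news"), ("star", "news")]
scores = authority_scores(edges)
most_authoritative = max(scores, key=scores.get)
print(most_authoritative, {k: round(v, 3) for k, v in scores.items()})
```

A user who is never mentioned receives a score of zero, and a user mentioned by many active accounts rises to the top, which matches the intuition behind the measure.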

Fig. 8

We carried out a similar analysis for the tweet stream with the two keywords 'phone' and 'galaxy'. Fig. 9 shows the revealed users' communities, Fig. 10 shows the cloud of keyword frequencies, and Fig. 11 shows the cloud of keyword frequencies in the tweet array of the most competent and authoritative users. The most authoritative user in this analysis appeared to be the one with the name hayyouapp.

Fig. 9

Fig. 10

Fig. 11

In our next studies, we intend to research users' communities and the entities denoted by keywords in financial and business tweet streams. Our aim is to find out whether graph scores of the stream have predictive features that allow important business and financial time series to be predicted. In our previous studies we found out that frequent sets, based on tweet streams, can be used for financial predictive analytics. You can find these results here:
Forecasting of Stock Financial Series Using Multivariate Vector Autoregressive Model
Granger Causality Test for Frequent Itemsets of Keywords in Financial Tweets
Tweets Miner for Stock Markets
We intend to research how graph approach in the analysis of social networks can be used in financial analytics and business intelligence applications.

Thursday, December 12, 2013

Detection of Community Anomalies in Twitter Trends

Users, when forming their own views on different trends, pay great attention to other users' points of view. The ratio of the numbers of users holding different opinions is very important in the formation of a user's view. Obviously, some forces emerge that are interested in shaping users' trends and opinions. Such methods of influence are much more complicated than mere spam. In particular, a whole community with a given trend may be created artificially.
When users find themselves in such a community, they may get the wrong impression that the trend of this community is supported by a great number of users and, thus, that this trend is well-reasoned, analyzed, and unbiased. Having only a selective acquaintance with trends, it is very difficult for a user to detect that the communities which give rise to these trends are artificial. Such artificial trends may be created while discussing various political, social, economic, or financial issues.

One may detect artificial communities through long-term observation of informational streams on a given topic. Based on the analysis of quantitative characteristics of the communities formed, one can reveal some anomalies. The communities in which these anomalies are found may be regarded as anomalous and, thus, excluded from further consideration and from the informational stream.

In Ukraine, for the last few weeks there have been nonviolent mass protests against government policy and particularly against the breakdown of the association agreement with the EU, against coercion of peaceful demonstrators, etc. Evidently, these processes are reflected in social networks. It is also obvious that some forces are trying to influence network users' viewpoints on these events.
That is why it is interesting and important to analyze social network informational streams concerning events in Ukraine in order to reveal both the anomalous communities and the productive communities with effective discussions written by real users.

Using the Twitter API, we have been collecting tweets for several days with filtering keywords such as Ukraine, Euromaidan, etc. The analysis was conducted using the R and Python languages. From our point of view, the most effective analysis of tweets can be based on formal concept analysis, the theory of frequent itemsets and association rules, network theories, and supervised and unsupervised classification.

Users mention other users in their tweets. They also quote other users by retweeting their messages. This makes it possible to create connections among users and to build a graph that shows users' connections. On such a graph, one may single out different communities using various existing approaches. One popular approach is based on the notion of modularity, which describes the relation between the connections of vertices inside and outside the community.
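
Building such a graph from raw tweets can be sketched in a few lines. The example assumes each tweet is available as an (author, text) pair and treats every @name in the text, including the one in a retweet prefix, as a directed connection. The sample tweets are invented.

```python
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")


def mention_edges(tweets):
    """Count directed (author, mentioned_user) pairs, ignoring self-mentions."""
    edges = Counter()
    for author, text in tweets:
        for mentioned in MENTION.findall(text):
            if mentioned.lower() != author.lower():
                edges[(author.lower(), mentioned.lower())] += 1
    return edges


# Hypothetical tweets as (author, text) pairs.
tweets = [
    ("alice", "RT @bob: markets are up"),
    ("carol", "@bob what do you think, @alice?"),
    ("alice", "thanks @bob"),
    ("bob", "replying to myself @bob"),
]
edges = mention_edges(tweets)
print(dict(edges))
```

The edge weights record how often one user refers to another, and they can be kept or dropped depending on whether the later community detection step uses weighted edges.
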
To identify the communities that were formed dynamically in the discussion, we used a fast greedy modularity optimization algorithm. To lay out the graph, we used the Fruchterman-Reingold algorithm. This algorithm belongs to the force-directed, or spring, algorithms. The appearance of the graph is determined by the model used in force-directed algorithms. The distinctive feature of the model is that vertices are treated as balls affected by repulsive forces, and edges are treated as springs that attract the vertices they connect. We have built a network with user communities marked with different colours:
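
The quantity that a modularity optimization algorithm tries to maximize can be computed directly for any given partition. The sketch below evaluates the standard modularity Q for an undirected, unweighted graph. It does not implement the fast greedy search itself, and the toy graph is an assumption.

```python
def modularity(edges, community_of):
    """Q = sum over communities of (internal_edges / m) - (degree_sum / 2m)^2."""
    m = len(edges)
    internal = {}
    degree_sum = {}
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


# Hypothetical graph: two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in "abcdef"}

q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
print(round(q_split, 4), round(q_single, 4))
```

Splitting the graph into its two triangles gives a clearly positive score, while putting every vertex into one community gives zero. A greedy algorithm merges communities step by step as long as this score increases.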

3000 random tweets samples

10000 random tweets samples

50000 random tweets samples

Then we noticed that the obtained graph contains two big communities that appeared to be totally isolated. The analysis of those communities revealed that one of them has only one influencer, who is quoted by all the community members. Those users appeared to have no connections with any other user in the network. It is evidently an anomalous community, since one can hardly believe in the existence of a real big community whose members quote only one source and none of whom either writes his/her own tweets or retweets other users. Using the adhesion coefficient, we can define a measure of community isolation. For isolated communities the adhesion coefficient is equal to zero. Other features may be found by analyzing influencers' activity, the ratio of their tweets to retweets, users' activity inside the community, etc. All these characteristics may be used for training classifiers for anomaly detection.
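
Two of the features mentioned above can be sketched as follows. The post does not give a formula for the adhesion coefficient, so the count of edges crossing the community boundary is used here as a stand-in that is zero exactly when a community is isolated. The second function measures how much of the quoting inside a community goes to a single source. All names and data are hypothetical.

```python
from collections import Counter


def boundary_edge_count(edges, members):
    """Number of edges with exactly one endpoint inside the community."""
    members = set(members)
    return sum(1 for src, dst in edges if (src in members) != (dst in members))


def top_target_share(edges, members):
    """Share of internal edges that point at the single most quoted member."""
    members = set(members)
    targets = Counter(dst for src, dst in edges
                      if src in members and dst in members)
    total = sum(targets.values())
    return max(targets.values()) / total if total else 0.0


# Hypothetical directed edges (who quotes whom).
edges = [("b1", "hub"), ("b2", "hub"), ("b3", "hub"), ("b4", "hub"),
         ("x", "y"), ("y", "z"), ("z", "x"), ("x", "w"),
         ("w", "news"), ("q", "news"), ("q", "x")]
suspicious = {"b1", "b2", "b3", "b4", "hub"}
organic = {"x", "y", "z", "w"}

for name, group in (("suspicious", suspicious), ("organic", organic)):
    print(name, boundary_edge_count(edges, group), top_target_share(edges, group))
```

A community with no boundary edges in which every internal edge points at one account matches the anomalous pattern described above. Such numbers could serve as input features for a classifier.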

In our analysis, we have detected several big communities with different adhesion coefficients and different quantitative characteristics of influencers' activities. The analysis of top trends in the communities with a zero adhesion coefficient showed that their influencers are anomalous, and it is rather difficult to establish their social identity. On the other hand, for the big communities with the maximum adhesion coefficient, the top influencers are well-known Ukrainian and European politicians and news agencies.

One of the conclusions of this study is that a zero or minimum adhesion coefficient points to the anomalous nature of a given community, while a high adhesion coefficient indicates that the community is effective and productive.

Our next step was to remove the tweets belonging to the users who were identified as members of anomalous communities. As a result we obtained the following graph of users:

3000 random tweets samples (2 anomalous communities were found and removed)

10000 random tweets samples (3 anomalous communities were found and removed)

10000 random tweets samples (10 anomalous communities were found and removed)

There is no doubt that the removal of anomalous communities is very important in data mining of trends, as it enables us to get a real picture of users' minds. One more useful thing in social networks may be an additional service that would filter the activity of users from anomalous communities, or inform other users about suspicious users and informational streams.

In our further studies we are planning to analyze the other types of anomalous informational streams, using the theory of formal concept analysis, the theory of semantic fields, the theory of frequent itemsets and association rules.

Tuesday, October 15, 2013

Data Mining of Informational Stream in Social Networks

Data Mining of Informational Stream in Social Networks from Bohdan Pavlyshenko

Tuesday, July 23, 2013

Can Twitter predict royal baby's name? (Updated)

One of today's main news stories is the birth of the royal baby, the crown prince. We congratulate Kate and William on this event and wish much health and happiness to them and their son!

Is it possible to predict the crown prince's name on the basis of the analysis of tweets? Using NLP methods, the theory of frequent itemsets and association rules, I have analysed the tweets. For my analysis, I used the R environment and the algorithms I used in my previous studies. I've obtained the following distribution of names:

So, we'll see if there is really the crown prince's name among all these male names.

After the Royal baby's name was announced

At last the Royal baby's name has been announced: Prince George of Cambridge!

As a result, we can see that tweet mining could predict the Royal baby's name! What does this mean? Somebody wrote to me that this study is nuts. Taken literally, it is indeed not a serious problem. But the main goal of this study is to test whether there is a correlation between social network users' opinions and the decisions that can be made by individuals who are highly influential in certain spheres of society. As the obtained results show, such a correlation does exist. The Crown Prince's full name is George Alexander Louis. Unfortunately I don't know the history of England very well and I didn't take into account that the full name of the Royal baby may consist of three names. I studied once again the tweet array that had been downloaded before the Crown Prince's name was announced. Using the theory of frequent itemsets and association rules, we studied which names occur together in tweets. As the analysis showed, the three names George, Alexander and Louis form one of the top 5 frequent itemsets with the highest level of support.

Top of frequent itemsets:

    items                        support
1   {alexander,george,james}     0.135593220
2   {george,henry,james}         0.121725732
3   {george,james,louis}         0.104776579
4   {alexander,james,louis}      0.098613251
5   {alexander,george,louis}     0.098613251
6   {george,henry,louis}         0.095531587
7   {alexander,henry,james}      0.093990755
8   {alexander,george,henry}     0.093990755
9   {henry,james,louis}          0.092449923
10  {alexander,henry,louis}      0.090909091
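
The support of an itemset is simply the share of tweets that contain all of its items. The brute-force sketch below counts three-name itemsets in a handful of invented tweets. The original analysis was done in R, and a real implementation would use a pruning algorithm such as Apriori rather than enumerating every combination.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, size, min_support):
    """Support of every itemset of the given size that reaches min_support."""
    counts = Counter()
    for items in transactions:
        for combo in combinations(sorted(set(items)), size):
            counts[combo] += 1
    n = len(transactions)
    return {combo: count / n for combo, count in counts.items()
            if count / n >= min_support}


# Hypothetical tweets reduced to the set of names each one mentions.
transactions = [
    {"george", "james", "alexander"},
    {"george", "louis"},
    {"george", "alexander", "louis"},
    {"james"},
    {"george", "alexander", "louis", "henry"},
]
top = frequent_itemsets(transactions, size=3, min_support=0.3)
print(top)
```

Only the itemset that appears in two of the five toy tweets passes the threshold here. On the real tweet array the same counting produces a ranked list of name combinations.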

The formation of frequent itemsets can be represented as the following graph:

On the basis of frequent itemsets with three elements, we analysed the association rules with a high level of support and confidence. The names George, Alexander and Louis also appear in the top 5 association rules, sorted by the value of confidence:

Top of association rules:

    lhs                    rhs            support     confidence  lift
1   {james,louis}       => {george}       0.10477658  0.9855072   1.714730
2   {henry,louis}       => {george}       0.09553159  0.9841270   1.712328
3   {alexander,louis}   => {james}        0.09861325  0.9696970   2.192799
4   {alexander,louis}   => {george}       0.09861325  0.9696970   1.687221
5   {james,louis}       => {alexander}    0.09861325  0.9275362   4.459045
6   {george,louis}      => {james}        0.10477658  0.9189189   2.077973
7   {alexander,james}   => {george}       0.13559322  0.8888889   1.546619
8   {george,louis}      => {alexander}    0.09861325  0.8648649   4.157758
9   {alexander,george}  => {james}        0.13559322  0.8543689   1.932005
10  {george,louis}      => {henry}        0.09553159  0.8378378   3.649374
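
The three numeric columns of the listing are read here as support, confidence, and lift, which is an assumption based on the usual output of association rule software. Under that reading, the measures for a rule X => Y follow from itemset supports alone, as this sketch on invented tweets shows.

```python
def support(transactions, items):
    """Share of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def rule_metrics(transactions, lhs, rhs):
    """Support, confidence and lift of the rule lhs => rhs."""
    s_both = support(transactions, set(lhs) | set(rhs))
    s_lhs = support(transactions, lhs)
    s_rhs = support(transactions, rhs)
    confidence = s_both / s_lhs   # how often rhs appears when lhs does
    lift = confidence / s_rhs     # confidence relative to the base rate of rhs
    return s_both, confidence, lift


# Hypothetical tweets reduced to the set of names each one mentions.
transactions = [
    {"george", "james", "alexander"},
    {"george", "louis"},
    {"george", "alexander", "louis"},
    {"james"},
    {"george", "alexander", "louis", "henry"},
]
metrics = rule_metrics(transactions, {"alexander", "louis"}, {"george"})
print(metrics)
```

A lift above 1 means the names on the left-hand side make the right-hand name more likely than its overall frequency would suggest.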

The top 5 of obtained association rules can be represented as the following:

Consider the structure of the set of users who participated in the discussion of the prince's name. To identify the communities that were formed dynamically in the discussion under analysis, we used a fast greedy modularity optimization algorithm. To lay out the graph, we used the Fruchterman-Reingold algorithm. This algorithm belongs to the force-directed, or spring, algorithms. The appearance of the graph is determined by the model used in force-directed algorithms. The distinctive feature of the model is that vertices are treated as balls affected by repulsive forces, and edges are treated as springs that attract the vertices they connect. In the tweet arrays, we have found 6919 users who sent 37191 tweets. These tweets mentioned 2645 users. A substantial part of these mentions relates to retweets. For further analysis, we take active users who sent more than one tweet during the discussion or who were mentioned in tweets more than once. We have found 2,300 active users who sent more than one tweet, and 923 users who were mentioned in tweets more than once. Figure 6 shows the graph of users' interrelations, where shades of color mark the users' communities. On this graph, we can see that there are several large users' communities.

Revealed users' communities.

Our next step is to conduct the analysis after removing the most popular users, those mentioned in tweets 100 times or more. We have found only 6 such users. Having removed these users from the analysis, we obtained the community graph. The removed users constitute nearly 0.2% of all the users mentioned in tweets. It follows from the obtained data that if we remove only the most popular users from the analysis, the community structure changes significantly, and only numerous small communities are left.

Users' communities without six most popular users.

The results of the study demonstrate that tweet mining could predict the Royal baby's name. We showed that George, the first name of the newborn prince, was dominant in the spectrum of names before the official announcement. It follows from the obtained data that the theory of frequent sets gives a more precise prediction of the full name than an analysis of name frequencies, which can predict the first name only. The prince's three names, George, Alexander, and Louis, form a frequent itemset of words, and this itemset was among the top 5 frequent itemsets by support value. We also showed that the structure of the dynamically formed users' communities that participated in the discussion is defined by only a few leaders who have a significant influence on the position of other users. What do these results mean? Taken literally, it is not a serious problem. But the main goal of this study is to test whether there is any correlation between social network users' opinions and the decisions that can be made by individuals who are highly influential in certain spheres of society. In our studies, we revealed that such a correlation does exist. This means that there is a certain correlation between the bloggers' viewpoints and the decision-making of the Royal family as to the prince's name.

Popular retweets about the Royal Baby name:

"RT @Lord_Voldemort7: They should name the #RoyalBaby 'Weasley' so that in future people can go around singing "Weasley is our King." "
"RT @Lord_Voldemort7: #RoyalBabyName It seems only fitting that the son of Prince William and commoner Kate Middleton be named Severus Snape…"
"RT @PrincessKateNOT: We have decided to name our #RoyalBaby with a popular British boys name. Mohammed."
"RT @eonline: We still don't know the #RoyalBaby's name...but we may have an idea of what his surname could be!"
"RT @AmazingPhil: I think they should name him after his great grandfather! Prince Philip. #RoyalBaby"
"RT @AdamCatterall: I woke to see #thunderstorm was trending. For a moment I thought they'd let Kanye West name the #RoyalBaby"
"RT @gracehelbig: Should've named it "Norther Wester." #RoyalBaby"
"RT @wescraven: Suggested name for the new little prince... Freddy. #RoyalBaby"
"RT @Lord_Voldemort7: The #RoyalBaby has not yet been named. They should just call him 'You Know Who.'"
"RT @MelissaJoanHart: I seriously don't wanna go to bed without knowing the #royalbaby name. Isn't that ridiculous?! "
"RT @Telegraph: #RoyalBaby: George is the bookies' favourite for the new prince's name, followed by James, Alexander, Louis and Henry"
