Tuesday, December 11, 2012

The Model of Semantic Concepts Lattice For Data Mining Of Microblogs

        Modern data mining methods are used effectively to process Web content. Twitter is one of the most popular microblogging systems, in which users interact through short messages. This work proposes a model of a semantic concept lattice for data mining of microblogs. The model is shown to be effective for analysing semantic relations and for detecting association rules among keywords in an array of microblog messages.
        For the experimental study, a package of application programs was developed in Perl. Using this package and the Twitter API, a test array of messages containing the word "software" and the hashtag "#software" was downloaded, and a set of thematic messages on software topics was selected. Lattices of formal concepts were considered for semantic fields of different size and content, and tweets containing words from different semantic fields were analysed. The semantic concept lattice reflects how concepts interact in microblog messages. Filtering the input messages by the given semantic field yielded an array of 8920 tweets. The Lattice Miner software package was used to compute the concept lattice. From the concept lattice, association rules were derived that represent the relations between the semantic concepts of the analysed subjects.
        The theory of formal concept analysis is effective for the intelligent processing of microblog messages. Lattice models of semantic concepts make it possible to analyse semantically related sets of words and to construct association rules. Forming semantic fields from the array of identified frequent sets significantly narrows the search for association rules and reduces the size of the semantic concept lattice in text mining algorithms.
     Similar investigations were carried out for an array of Twitter messages with the hashtags "#london2012" and "olympics", which were downloaded during the Olympic Games in London (2012). We studied events at the Olympic Games, in particular the final of the tennis tournament.
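
To make the idea of a concept lattice over a semantic field more concrete, here is a minimal sketch in Python. It is not the Perl package or Lattice Miner used in this work: the four sample tweets, the field `SEMANTIC_FIELD` and all function names are assumptions made up for illustration. Each tweet is reduced to the field keywords it contains, and the formal concepts are found by closing the keyword sets under intersection.

```python
# Hypothetical semantic field and toy tweets (not the real data set).
SEMANTIC_FIELD = {"android", "developer", "london", "browser", "phones", "popular"}

tweets = [
    "Android developer wanted in London #software",
    "Popular Android phones this year #software",
    "New browser released #software",
    "Android developer tools update #software",
]


def keywords(text, field):
    """Return the semantic-field words that occur in one tweet."""
    tokens = {t.strip("#.,!?").lower() for t in text.split()}
    return frozenset(tokens & field)


def build_context(texts, field):
    """Formal context: tweet index -> field keywords; tweets with none are dropped."""
    ctx = {}
    for i, text in enumerate(texts):
        kws = keywords(text, field)
        if kws:
            ctx[i] = kws
    return ctx


def extent(ctx, intent):
    """All tweets whose keyword set contains every keyword of the intent."""
    return frozenset(g for g, attrs in ctx.items() if intent <= attrs)


def concepts(ctx, field):
    """All formal concepts (extent, intent) of the context.

    The intents are exactly the intersections of tweet keyword sets,
    plus the full field (the intent of the empty extent).
    """
    intents = {frozenset(field)}
    for attrs in ctx.values():
        intents |= {attrs & other for other in intents}
    pairs = [(extent(ctx, b), b) for b in intents]
    # Most general concepts (largest extents) first.
    return sorted(pairs, key=lambda c: (-len(c[0]), sorted(c[1])))


ctx = build_context(tweets, SEMANTIC_FIELD)
lattice = concepts(ctx, SEMANTIC_FIELD)
for ext, intent in lattice:
    print(sorted(ext), sorted(intent))
```

This brute-force closure is only practical for small semantic fields, which is one reason why narrowing the field before building the lattice matters.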

Examples of Galois Lattices:

Ideal & Filter for concept {android, developer, london}

Ideal & Filter for concept {london}

Ideal & Filter for concept {android, developer}

Ideal & Filter for concept {browser}

Ideal & Filter for concept {android}

Ideal & Filter for concept {android, phones, popular}
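
The ideal and the filter of a concept, as in the captions above, can be read off the lattice order: the filter holds the more general concepts (their keyword sets are subsets of the chosen one), the ideal holds the more specific ones (supersets). The Python sketch below shows this on a small hand-made list of intents; the list and the function names are assumptions and do not reproduce the lattices in the figures.

```python
# Hypothetical intents of a small lattice, loosely named after the concepts above.
ALL = frozenset({"android", "developer", "london", "browser", "phones", "popular"})
intents = [
    frozenset(),
    frozenset({"android"}),
    frozenset({"browser"}),
    frozenset({"android", "developer"}),
    frozenset({"android", "developer", "london"}),
    frozenset({"android", "phones", "popular"}),
    ALL,
]


def principal_filter(all_intents, intent):
    """Concepts above or equal to the given one: their intent is a subset."""
    return [b for b in all_intents if b <= intent]


def principal_ideal(all_intents, intent):
    """Concepts below or equal to the given one: their intent is a superset."""
    return [b for b in all_intents if b >= intent]


target = frozenset({"android", "developer"})
print("filter:", [sorted(b) for b in principal_filter(intents, target)])
print("ideal: ", [sorted(b) for b in principal_ideal(intents, target)])
```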

Investigation of the tennis final at the Olympic Games (London 2012)

Galois Lattice

 Ideal & Filter for concept {aug_05, men, federer, murrey}

 Ideal & Filter for concept {aug_04, women, williams, sharapova}

The dynamics of support for the association rules Gold->Sharapova, Gold->Williams

The dynamics of confidence for the association rules Gold->Sharapova, Gold->Williams
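
For readers unfamiliar with these two measures: the support of a rule A->B is the share of tweets that contain both A and B, and the confidence is the share of tweets containing A that also contain B. The Python sketch below computes both per day. The toy tweets, the day labels and the function names are invented for illustration and are not the data behind the plots.

```python
from collections import defaultdict

# Hypothetical toy data: (day label, keywords found in the tweet).
tweets = [
    ("aug_03", {"gold", "williams"}),
    ("aug_03", {"gold", "sharapova"}),
    ("aug_03", {"tennis"}),
    ("aug_03", {"gold"}),
    ("aug_04", {"gold", "williams"}),
    ("aug_04", {"gold", "williams", "sharapova"}),
    ("aug_04", {"williams"}),
    ("aug_04", {"gold", "williams"}),
]


def rule_metrics(itemsets, antecedent, consequent):
    """Return (support, confidence) of antecedent -> consequent."""
    total = len(itemsets)
    n_antecedent = sum(1 for s in itemsets if antecedent <= s)
    n_both = sum(1 for s in itemsets if (antecedent | consequent) <= s)
    support = n_both / total if total else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


def dynamics(data, antecedent, consequent):
    """Support and confidence of one rule, computed separately for each day."""
    by_day = defaultdict(list)
    for day, kws in data:
        by_day[day].append(kws)
    return {
        day: rule_metrics(sets, antecedent, consequent)
        for day, sets in sorted(by_day.items())
    }


for name in ("sharapova", "williams"):
    print("gold ->", name, dynamics(tweets, {"gold"}, {name}))
```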
