Observation and binning

The Social Computer can observe any system on the World Wide Web that provides a time-stamped feed of publicly shared information. It can also subscribe to sources that share private information feeds. However, we concentrate on information that human users deliberately share in public, in order to respect their (self-defined) private sphere. This restriction significantly limits the contextual features available for more accurate modelling of causality. We nevertheless argue that it defines an important area of research, one that will gain further importance when personal data management environments put users back in control of their data.

To reduce the effort of implementing individual data harvesters for each system, we recommend instantiating or linking with a real-time Web Observatory \citep{Tinati_2015}, a decentralised approach that enables access to historic and real-time data from and about systems on the Web. The observer component subscribes to one or more accessible feeds from such data sources, integrates them, and analyses the resulting unified activity feed. The analysis implements (a) an information extraction algorithm that looks for particular patterns in the discrete content elements of the feed and (b) a threshold heuristic that indicates a relevant burst of activity around a matched pattern. The observer holds all incoming content elements in bins, where each bin groups the elements that matched exactly the same unique set of informational patterns. We bin by unique sets of patterns rather than by matches of single patterns because this is a simple yet generic and extensible way to distinguish related from unrelated content with similar matched patterns.
When the employed heuristic indicates a burst, the process creator starts a new Social Computer process that involves all content elements already in the respective bin as well as those that continue to arrive in it.
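
The binning and burst detection described above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of the text: the two hashtag patterns, the names `PATTERNS`, `match_patterns` and `Observer`, and the concrete heuristic (at least `threshold` elements of one bin within a sliding `window` of seconds) are all hypothetical stand-ins, and timestamps are assumed to arrive in non-decreasing order.

```python
import re
from collections import defaultdict, deque

# Hypothetical patterns; the actual information extraction algorithm is not specified.
PATTERNS = {
    "hashtag_flood": re.compile(r"#flood\b", re.IGNORECASE),
    "hashtag_help": re.compile(r"#help\b", re.IGNORECASE),
}


def match_patterns(text):
    """Return the set of pattern names matched by one content element."""
    return frozenset(name for name, rx in PATTERNS.items() if rx.search(text))


class Observer:
    """Bins content elements by their exact set of matched patterns."""

    def __init__(self, threshold=3, window=60.0):
        self.threshold = threshold         # assumed burst size
        self.window = window               # assumed window length in seconds
        self.bins = defaultdict(list)      # pattern set -> all elements seen
        self.recent = defaultdict(deque)   # pattern set -> timestamps in window
        self.active = set()                # pattern sets with a running process

    def observe(self, timestamp, text):
        """Store one element; return its bin key if this element triggers a burst."""
        key = match_patterns(text)
        if not key:
            return None                    # no pattern matched, element is ignored
        self.bins[key].append((timestamp, text))
        recent = self.recent[key]
        recent.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while recent and timestamp - recent[0] > self.window:
            recent.popleft()
        if key not in self.active and len(recent) >= self.threshold:
            self.active.add(key)           # a new process would be created here
            return key
        return None
```

Because the bin key is the whole set of matched patterns, an element matching both `#flood` and `#help` lands in a different bin than one matching `#flood` alone, which is the distinction between related and unrelated content drawn above.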

Instructions

The task creator ensures that, from this point on, the content elements from the bin whose activity exceeded the threshold, now called tasks, are stored separately with a reference to their parent process. The procedural logic of the Social Computer builds on the principle that instructions are now captured for tasks from active processes (cf. Figure 2). The state of a task is formed by the sequence of all instructions captured in response to it, so that our Social Computer can be regarded as an applicative computing system following a functional programming style \citep*{Backus_1978}.
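
One way to read the applicative view of task state is as a fold over the captured instruction sequence. The following Python sketch is only an illustration under assumptions: the instruction format (pairs of an operation and an argument), the operations `add` and `remove`, and the names `apply_instruction` and `task_state` are hypothetical and are not taken from the Social Computer itself.

```python
from functools import reduce


def apply_instruction(state, instruction):
    """Pure function: return a new state, never mutate the old one."""
    op, arg = instruction              # hypothetical (operation, argument) pair
    if op == "add":
        return state | {arg}
    if op == "remove":
        return state - {arg}
    return state                       # unknown operations leave the state unchanged


def task_state(instructions):
    """Derive the state of a task from the full sequence of its instructions."""
    return reduce(apply_instruction, instructions, frozenset())
```

Under this reading nothing is updated in place: the state is recomputed from the stored instruction sequence whenever it is needed, so replaying the same sequence always yields the same state.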