|1||beatDB: A Large Scale Waveform Feature Repository [ENG]||2013|
|2||MoocViz: A Large Scale, Open Access, Collaborative, Data Analytics Platform for MOOCs [ENG]||2013|
|3||MOOC En Images (MIT Technical Report) [ENG]||2013|
|4||MOOCdb: Developing Standards and Systems for MOOC Data Science (MIT Technical Report) [ENG]||2013|
|5||MOOCdb: Developing Data Standards for MOOC Data Science (MOOCshop paper) [ENG]||2013|
|6||Machine Learning Algorithms for In-Database Analytics [ENG]||2013|
|7||Efficient training set use for blood pressure prediction in a large scale learning classifier system [ENG]||2013|
|8||Replacing the computer mouse [ENG]||2012|
|9||Artificial Intelligence: why should firms care? [ENG]||2012|
|10||Of the use of natural dialogue to hide MCQs in serious games [ENG/FR]||2012|
|11||Designing an intelligent dialogue system for serious games [ENG/FR]||2012|
|12||The medial Reticular Formation (mRF): a neural substrate for action selection? An evaluation via evolutionary computation. [ENG/FR]||2011|
|13||Fuzzy logic: introducing human reasoning within decision support systems? [ENG/FR]||2011|
|14||Fuzzy logic: between human reasoning and artificial intelligence [ENG/FR]||2011|
|15||Presentation on the Motion-Induced Blindness (MIB) phenomenon [ENG/FR]||2011|
|16||Presentation on the paper Automated Variable Weighting in k-Means Type Clustering [ENG/FR]||2010|
|17||Prediction of the water inflow to a lake [FR]||2010|
|Abstract: A great majority of the effort is spent assembling the
data and formulating the features, while, rather ironically, the model building exercise takes relatively
less time. beatDB aims at radically shrinking the time of large
scale investigations by judiciously pre-computing beat
features which are likely to be frequently used.
In this poster we present the structure of beatDB and use
beatDB for a concrete research study: predicting
acute hypotensive events from blood pressure.
|Abstract: In this paper we present an open access large scale analytics
platform that helps researchers analyze MOOC data from
multiple platforms without the need to share the data. It
allows researchers to share scripts/effort, compare results
and attempts to engage the community to achieve shared
educational science goals. The platform utilizes some well
known tools and packages and provides multiple levels of
access to address a wide variety of needs around the data.
We demonstrate the platform's capability by analyzing data
from two MOOCs, one from Coursera (offered by Stanford
University) and one from edX (offered by MITx). This is the
first time two courses from two platforms have been jointly
analyzed. The analysis and the platform are made possible
by the joint adoption of a data model called MOOCdb.
|Abstract: This report provides a view into different descriptive statistics extracted from the data recorded during 6.002x, the first course
offering by MITx. We have developed a generalizable analytics framework and this report demonstrates the use of this framework.
This is a working document and we are expanding the scope of this document as we add additional analytical tools and interfaces
to our framework.
|This MIT Technical Report is an extended version of the MOOCshop paper.
Abstract: The intent of this document is to enable the development of data standards for MOOCs and to build enabling technology. This document will be updated from time to time with feedback from the community as well as from our internal development process.
|Abstract: The intent of this article is to propose data standards for MOOCs. Our team has been conducting research related to mining information, building
models, and interpreting data from the inaugural course offered by edX, 6.002x:
Circuits and Electronics, since the Fall of 2012. This involves a set of steps,
undertaken in most data science studies, which entails positing a hypothesis,
assembling data and features (aka properties, covariates, explanatory variables,
decision variables), identifying response variables, building a statistical model,
then validating, inspecting and interpreting the model. In our domain, and others
like it that require behavioral analyses of an online setting, a great majority of
the effort (in our case approximately 70%) is spent assembling the data and
formulating the features, while, rather ironically, the model building exercise
takes relatively less time. As we advance to analyzing cross-course data, it has
become apparent that our algorithms which deal with data assembly and feature
engineering lack cross-course generality. This is not a fault of our software design.
The lack of generality reflects the diverse ad hoc data schemas we have adopted
for each course. These schemas partially result because some of the courses are
being offered for the first time and it is the first time behavioral data has been
collected. As well, they arise from initial investigations taking a local perspective
on each course rather than a global one extending across multiple courses.
|Abstract: Our project focused on extending the functionality of MADlib. MADlib is an open source machine
learning and statistics library which works with Postgres or Greenplum to provide in-database analytics.
Although some machine learning algorithms have been implemented in MADlib, there is room for additional contributions. We have implemented two different machine learning algorithms, symbolic regression
with genetic programming and adaptive boosting for MADlib, and are in the process of contributing our
code to the MADlib community codebase. We have also assessed the performance of our implementations
and compared their performance with the same algorithms outside MADlib.
|Abstract: We define a machine learning problem to forecast arterial blood pressure. Our goal is to solve this problem with a large scale learning classifier system. Because learning classifier systems are extremely computationally intensive and this problem's eventually large training set will be very costly to process, we address how to use less of the training set while not negatively impacting learning accuracy. Our approach is to allow competition among solutions which have not been evaluated on the entire training set. The best of these solutions are then evaluated on more of the training set while their offspring start off being evaluated on less of the training set. To keep selection fair, we divide competing solutions according to how many training examples they have been tested on.
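The fairness rule in the last sentence of this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the system described in the paper: the `Solution` class, the `select_survivors` function, and the sample accuracies are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Solution:
    name: str
    examples_seen: int  # number of training examples it has been tested on
    accuracy: float     # accuracy measured on those examples only


def select_survivors(population, keep_per_group=1):
    """Keep the best solutions within each group of equal evaluation effort.

    Solutions compete only against others tested on the same number of
    training examples, so a newly created solution is never ranked against
    one that has been scored on far more data.
    """
    groups = defaultdict(list)
    for sol in population:
        groups[sol.examples_seen].append(sol)

    survivors = []
    for examples_seen in sorted(groups):
        ranked = sorted(groups[examples_seen], key=lambda s: s.accuracy, reverse=True)
        survivors.extend(ranked[:keep_per_group])
    return survivors


# Hypothetical population: two solutions tested on 100 examples, two on 1000.
population = [
    Solution("a", 100, 0.71),
    Solution("b", 100, 0.64),
    Solution("c", 1000, 0.69),
    Solution("d", 1000, 0.75),
]
print([s.name for s in select_survivors(population)])
```

In this sketch, solution "a" survives even though "d" has a higher accuracy, because the two were never compared: each one is only ranked within its own group.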
|Abstract: In a few months the computer mouse will be half a century old. It is known to have many drawbacks, the
main ones being: loss of productivity due to constant switching
between keyboard and mouse, health issues such as RSI, medical
conditions that make the mouse impossible to use (e.g. a broken or amputated arm),
and, like the keyboard, an unnatural human-computer interface.
However, almost everybody still uses a computer mouse nowadays.
In this short article, we explore computer mouse alternatives. Our research shows that moving the mouse cursor can be done efficiently with the SmartNav device and mouse clicks can be emulated in many complementary ways. We believe that computer users can improve their productivity and their health by using those alternatives. There are a few exceptions such as advanced users of graphics editing programs or FPS gamers, who will still be more efficient using a computer mouse.
This article is intentionally short and not overly technical, our main motivation being to make the readers aware of these solutions and their efficiencies. Details can be found in the appendices and by following the URLs and references. The primary intended readers are computer scientists, people with RSI, physicians and interface pioneers. Feedback is highly welcome: this is work in progress, so feel free to e-mail the main author at email@example.com
|Talk given on May 30th, 2012 at the Swedish Chamber of Commerce in Paris. As I was reading an article about IBM Watson, a small sentence drew my attention: "Eighty or 90 per cent of these requests don't need Watson anyway, technology already exists for what they need.". This epitomizes the growing need for the business world to catch up with artificial intelligence's latest developments. What is AI? What is the state of the art? Why should I care? i.e. what can AI bring to the business world? From law to finance, any field will be reshaped in the long term by AI.
||Replace CS by AI|
Abstract of the original paper: A major weakness of serious games at the moment is that they often incorporate multiple choice questionnaires (MCQs). However, no study has demonstrated that MCQs can accurately assess the level of understanding of a learner. On the contrary, some studies have experimentally shown that allowing the learner to input a free-text answer in the program instead of just selecting one answer in an MCQ allows a much finer evaluation of the learner's skills. We therefore propose to design a conversational agent that can understand statements in natural language within a narrow semantic context corresponding to the area of competence on which we assess the learner. This feature is intended to allow a natural dialogue with the learner, especially in the context of serious games. Such interaction in natural language aims to hide the underlying MCQs. This paper presents our approach.
Abstract of the original paper: The objective of our work is to design a conversational agent (chatterbot) capable of understanding natural language statements in a restricted semantic domain. This feature is intended to allow a natural dialogue with a learner, especially in the context of serious games. This conversational agent will be tested in a serious game for training staff, by simulating a client. It does not address natural language understanding in full generality since, firstly, the semantic domain of a game is generally well defined and, secondly, we will restrict the types of sentences found in the dialogue.
The medial Reticular Formation (mRF) is located in the brainstem: it receives many sensory inputs and it can control motor actions through its projections on the spinal cord and cranial nerves. The mRF is phylogenetically one of the oldest neural structures of the brainstem, the latter being regarded as one of the oldest centers of the central nervous system. Consequently, it seems to be a low-level system for action selection.
The first model of the mRF was proposed by Kilmer and McCulloch in 1969, who already suggested that the mRF could be a "mode selector". Humphries et al. (2005) tested the efficiency of this model in the minimal survival task defined in Girard et al. (2003). It performed poorly, but another version of it that included artificially evolved weights performed reasonably well. As a result, Humphries proposed a second model of the mRF, based on neural network formalism and taking into account new anatomical data. Nevertheless, it performed poorly in the minimal survival task and turned out not to be very plausible anatomically.
In this Master's Thesis, we propose a new model of the mRF:
The model we obtain successfully manages the tasks of selection, indicating that the mRF can be used as an action selection system. We also demonstrate an anatomical property of the mRF, which coupled with the results of the paper Humphries et al. (2006) shows that it is very likely that the mRF network has a small-world structure.
This project was funded by the ANR, the French National Agency for Research (grant ANR-09-EMER-005-01), as part of the EvoNeuro project.
Fuzzy logic is based on solid mathematical foundations, including the mathematical theory of fuzzy sets, generalizing classical set theory. Firstly, we define fuzzy operators, which generalize operators of classical logic.
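As an illustration of how fuzzy operators generalize classical ones, here is a minimal sketch. It assumes the common min/max/complement definitions (often called the Zadeh operators); the report may use other operator families, and the function names are hypothetical.

```python
def fuzzy_and(a, b):
    """Conjunction: the degree to which both conditions hold."""
    return min(a, b)


def fuzzy_or(a, b):
    """Disjunction: the degree to which at least one condition holds."""
    return max(a, b)


def fuzzy_not(a):
    """Negation: the complement of a truth degree."""
    return 1.0 - a


# On the crisp values 0 and 1, the operators agree with Boolean logic.
for a in (0, 1):
    for b in (0, 1):
        assert fuzzy_and(a, b) == int(bool(a) and bool(b))
        assert fuzzy_or(a, b) == int(bool(a) or bool(b))
    assert fuzzy_not(a) == int(not a)

# On intermediate degrees, they return intermediate degrees.
print(fuzzy_and(0.7, 0.4), fuzzy_or(0.7, 0.4), fuzzy_not(0.25))
```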
As a second step, we see how fuzzy logic can imitate human reasoning. We analyze the contribution of fuzzy logic to the modeling of human reasoning, and also experimentally investigate whether the decisions taken by humans correspond to decisions taken by fuzzy systems. To this end, given that the literature is deficient on this point, we design an experiment and analyze the results.
In Chapter 5, we study the potential applications for databases and decision support systems. How can the advantages of fuzzy logic be integrated into databases? To what extent can decision-making systems use the flexibility of fuzzy logic?
We show that decision systems, which sit at the heart of the company and bring together all the interesting information from the operational databases, could benefit greatly from fuzzy logic: it gives them the keys to human reasoning and makes it possible to refine decision-making.
Database theorists know what fuzzy logic could bring them in terms of information modeling: more intuitive and more powerful queries on the one hand, and data more consistent with reality on the other. Many papers have been written, but few significant achievements have followed. The lack of consensus on a standard is probably the main reason.
Fuzzy logic is an extension of Boolean logic introduced by Lotfi Zadeh in 1965, based on the mathematical theory of fuzzy sets, which is a generalization of classical set theory. By introducing the concept of degree in the verification of a condition, allowing a condition to be in a state other than true or false, fuzzy logic brings valuable flexibility to reasoning, which makes it possible to take inaccuracies and uncertainties into account. One of the advantages of fuzzy logic for formalizing human reasoning is that the rules are stated in natural language.
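The notion of degree can be illustrated with a small sketch comparing a crisp threshold with a graded membership function. The predicate "the temperature is hot", the 20 °C and 30 °C bounds, and the function names are hypothetical choices made for the example.

```python
def ramp(x, low, high):
    """Degree in [0, 1]: 0 at or below `low`, 1 at or above `high`, linear in between."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def is_hot_boolean(temp_c):
    """Crisp version: the condition is either true or false."""
    return temp_c >= 25.0


def is_hot_fuzzy(temp_c):
    """Fuzzy version: the condition holds to some degree (assumed bounds)."""
    return ramp(temp_c, 20.0, 30.0)


# Around the threshold, the crisp answer flips while the fuzzy degree barely moves.
for temp in (24.9, 25.1):
    print(temp, is_hot_boolean(temp), round(is_hot_fuzzy(temp), 2))
```

A rule such as "if the temperature is hot, then the fan speed is high" can then be applied to the degree returned by `is_hot_fuzzy` rather than to a yes/no answer.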
In this report, we:
We show that fuzzy logic can explain many experiments that had undermined traditional models of human reasoning in the 20th century. We show how the non-additivity of probability judgments can be expressed in a fuzzy system. We then confront fuzzy logic with some paradoxes of classical logic when it tries to model human reasoning: the sorites paradox is typically the kind of threshold problem that fuzzy logic reduces, and the paradox of entailment does not pose a problem in fuzzy logic. It would be interesting to further explore Hempel's paradox and especially how we could express it in a neuro-fuzzy system. Similarly, the Wason selection task would require further analysis, this time by focusing on fuzzy modus ponens and modus tollens.
Thus fuzzy logic appears as a powerful theoretical framework for studying human reasoning. Surprisingly, we find only one study comparing the decisions made by humans with those of a fuzzy system, whose purpose was essentially to design a decision support system for medical personnel, not to analyze human reasoning as such. We conduct our own experiment and investigate whether a fuzzy system could mimic the results observed in humans. For this purpose, we use a technique for optimizing a fuzzy system using neural networks (neuro-fuzzy), through which we obtain good results, although the correlation between the two input criteria is high: a fuzzy system gives results closer to experimental values than those obtained by a polynomial system. This result reinforces the hypothesis that fuzzy logic can be used to explain decisions from human reasoning.
The visual system has a number of 'bugs', some of which we call illusions. Motion-induced blindness (MIB) belongs to a very interesting class of illusions in which objects in plain sight just disappear from phenomenal perception. Other classical examples of disappearance illusions are:
In addition, a number of neurological conditions usually involving lesions in the parietal cortex, such as hemineglect and extinction, lead to cases in which objects in plain view are not seen, or not noticed. For a good review of these phenomena, see the article "Psychophysical magic" by Kim and Blake (2005).
Motion-induced blindness (MIB) is a recently discovered and quite spectacular example of a disappearance illusion. The stimulus consists of a field of small objects, moving in a coherent way (either a 2D or 3D rotation, for example). Superimposed on this moving field are a number of high-contrast stationary objects. When most observers fixate a stationary point in this stimulus (such as one of the high-contrast objects, or a fixation point), after several seconds one or more of the stationary objects just disappear.
|Activate full-screen and fixate on the white point in the center. After a few seconds, you will notice that the yellow point seems to disappear.|
Abstract of the original paper: This paper proposes a k-means type clustering algorithm that can automatically calculate variable weights. A new step is introduced to the k-means clustering process to iteratively update variable weights based on the current partition of data and a formula for weight calculation is proposed. The convergency theorem of the new clustering process is given. The variable weights produced by the algorithm measure the importance of variables in clustering and can be used in variable selection in data mining applications where large and complex real data are often involved. Experimental results on both synthetic and real data have shown that the new algorithm outperformed the standard k-means type algorithms in recovering clusters in data.
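The extra weighting step described in this abstract can be sketched as follows. This is only an illustration of the idea: the inverse-dispersion update rule, the `beta` exponent, the small `eps` guard, and the function names are assumptions made for the example, not a transcription of the formula proposed in the paper.

```python
def update_weights(points, labels, centers, beta=2.0, eps=1e-12):
    """Recompute one weight per variable from the current partition.

    dispersion[j] sums the squared distances, along variable j, between
    each point and the center of its assigned cluster. Variables with a
    small within-cluster dispersion receive a large weight. Requires beta > 1.
    """
    n_vars = len(points[0])
    dispersion = [eps] * n_vars  # eps avoids division by zero (assumption)
    for x, label in zip(points, labels):
        for j in range(n_vars):
            dispersion[j] += (x[j] - centers[label][j]) ** 2

    exponent = 1.0 / (beta - 1.0)
    return [
        1.0 / sum((dispersion[j] / dispersion[t]) ** exponent for t in range(n_vars))
        for j in range(n_vars)
    ]


def weighted_distance(x, center, weights, beta=2.0):
    """Distance used in the assignment step once weights are known."""
    return sum((w ** beta) * (a - b) ** 2 for w, a, b in zip(weights, x, center))


# Hypothetical data: variable 0 separates the two clusters, variable 1 is noise.
points = [(0.0, 5.0), (0.2, -3.0), (10.0, 4.0), (10.2, -4.0)]
labels = [0, 0, 1, 1]
centers = [(0.1, 1.0), (10.1, 0.0)]

weights = update_weights(points, labels, centers)
print(weights)
```

In a full clustering loop, this step would alternate with the usual assignment and center-update steps of k-means. The weights sum to 1, and here almost all of the weight goes to variable 0, which is the kind of signal that could guide variable selection.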
The purpose of this project is to predict the water inflow to a lake, Lac St-Jean, from the history of this inflow and from snowmelt and precipitation in the watershed. All the data for this work have already been collected: our work aims to process, analyze and use these data to build a model that can accurately predict the lake's water inflow.
In the first part, we conduct a preliminary study of the data so as to extract general information. In the second part, we establish a classification of the data to see the main trends. In the third and last part, we build several predictive models and evaluate them with quality measures.