Description: |
This thesis analyzes usage data from nanoHUB.org, a web-based infrastructure for e-collaboration within the nanotechnology simulation community. Previous analysis of the nanoHUB database showed that nanoHUB usage data follow an unknown, heavy-tailed distribution. This thesis extends that analysis and develops an automatic anomaly detection method based on piecewise linear approximation. Here, an anomaly refers to collective user behavior that differs from that of other users. The results show that the method can accurately detect anomalies in the unknown, heavy-tailed distribution. This thesis also applies the anomaly detection method and principal component analysis to other databases in nanoHUB and successfully reveals differences between categories. |