Archive | Information Theory

A not so short description of a simple digital library

In my previous post I outlined some problems that are encountered when performing experiments. I then introduced an experimental design methodology to follow that attempts to minimise these problems. The focus was on the following reasons why one would perform thorough documentation in the requirements analysis phase: it uncovers requirements that have been misinterpreted or [...]

Some thoughts on why experimental design methodology matters.

Young or inexperienced scientists and engineers often struggle with designing experiments and managing the accompanying data. The result is time wasted redoing experiments because of some oversight. The problem is compounded when thorough design is thrown out the window altogether. Tears soon follow. Yet, designing experiments and managing the resulting data can be [...]

An example of topic models on the web

Have you ever followed someone on Twitter expecting great tweets, but instead you only see tweets about coffee and muffins? Topic models will come to your rescue. It all comes down to the fact that users need more control on Twitter. Let’s be honest: most tweets in your stream only receive a cursory glance. Have [...]

Quick-and-dirty data clustering with compression distance

Have you ever wanted to search a database of documents for something “similar” to a reference document? Or needed to identify near-duplicate records in a database? When dealing with large collections of data, manual sifting may not be feasible. Highly sophisticated pattern recognition techniques exist to extract features from data, and cluster items accordingly. However, [...]
