Lost in Just the Translation

About

This tool is a web-based implementation of a natural language-based steganographic system developed by Christian Grothoff at the UCLA Computer Science Department, Krista Grothoff, Ryan Stutsman, and Mikhail Atallah at CERIAS. The key idea behind the system is to hide a message in the noise inherent in natural language translation. This prototype implements some of the ideas presented in our research paper (more information is available in our paper about a previous version of this work).

Features

This tool can be used to steganographically encode or decode a message in user-defined or system-provided cover texts. The tool provides a variety of encoding options as described in the paper, including a selection of translation engines, multiple languages, and various transformations. The engine uses keyed cryptographic hashes of translations to encode information. For details about how the system works, please read our paper.
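
To make the keyed-hash idea concrete, here is a minimal sketch in Python. It is not the prototype's PHP code. It assumes that each cover sentence has several candidate translations, that sender and receiver share a secret key, and that the lowest bits of an HMAC-SHA256 over the chosen translation carry the hidden data. The hash function, the function names (keyed_bits, encode, decode) and the one-symbol-per-sentence layout are all assumptions made for illustration; the paper describes the actual protocol.

```python
import hashlib
import hmac


def keyed_bits(key: bytes, sentence: str, nbits: int) -> int:
    """Return the lowest nbits of HMAC-SHA256(key, sentence) as an integer."""
    digest = hmac.new(key, sentence.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") & ((1 << nbits) - 1)


def encode(key, candidates_per_sentence, symbols, nbits=1):
    """For each hidden symbol, pick a candidate translation whose keyed hash matches it.

    candidates_per_sentence: one list of alternative translations per cover sentence.
    symbols: hidden message as integers, each in range(2 ** nbits).
    """
    if len(symbols) > len(candidates_per_sentence):
        raise ValueError("cover text is too short for the hidden message")
    stego = []
    for symbol, candidates in zip(symbols, candidates_per_sentence):
        for candidate in candidates:
            if keyed_bits(key, candidate, nbits) == symbol:
                stego.append(candidate)
                break
        else:
            # No translation hashes to the needed value; a real system
            # would need more candidates (engines, transformations).
            raise ValueError("no candidate translation encodes this symbol")
    return stego


def decode(key, stego_sentences, nbits=1):
    """Recover the hidden symbols by re-hashing each received sentence."""
    return [keyed_bits(key, sentence, nbits) for sentence in stego_sentences]
```

In this sketch the receiver needs only the key and the received text, never the original cover or the translation engines. It also shows why a variety of engines and transformations helps: more candidates per sentence make it more likely that one of them hashes to the required value.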

Limitations

The prototype has limitations. It uses various on-line translation engines to obtain translations. These engines have high latency, are sometimes unavailable, and limit the rate at which sentences can be translated. To alleviate these problems, the prototype caches translations in a local database. However, the cache is unlikely to help with cover texts supplied by users, so user-supplied covers will probably not work if they are longer than a few sentences.
For longer hidden messages, the provided cover texts should be used. For those texts we have ensured that all translations are already cached, which sidesteps the problems of requesting translations on the fly.
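
The caching idea can be sketched as follows. This is a Python illustration using the standard sqlite3 module, not the prototype's PHP and PEAR database code. The table layout, the function names (make_cache, cached_translate) and the fetch callback, which stands in for a request to an on-line engine, are hypothetical.

```python
import sqlite3


def make_cache(path=":memory:"):
    """Open (or create) a translation cache; the schema is an assumption."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS translations ("
        " engine TEXT, source TEXT, target_lang TEXT, translation TEXT,"
        " PRIMARY KEY (engine, source, target_lang))"
    )
    return db


def cached_translate(db, engine, source, target_lang, fetch):
    """Return a cached translation, calling fetch(...) only on a cache miss."""
    row = db.execute(
        "SELECT translation FROM translations"
        " WHERE engine = ? AND source = ? AND target_lang = ?",
        (engine, source, target_lang),
    ).fetchone()
    if row is not None:
        return row[0]
    # Cache miss: this is the slow, rate-limited step for user-supplied covers.
    translation = fetch(engine, source, target_lang)
    db.execute(
        "INSERT INTO translations VALUES (?, ?, ?, ?)",
        (engine, source, target_lang, translation),
    )
    db.commit()
    return translation
```

With the provided cover texts every lookup is a hit, so no engine is contacted. With a new user-supplied text every sentence is a miss, which is where the latency and rate limits described above apply.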

Note: Changes in the web interfaces for our online translation sources and small, idiosyncratic changes in the way the translations are encoded by the engines make using the encoder extremely difficult. You will likely need your own translation software or sources to achieve any meaningful results.

Download

Please note that, because our original sources for translations have changed, this software will not provide meaningful results without substantial effort, as noted above. The encoder/decoder portion of the code is isolated fairly well and still serves as a practical example of the protocol described in detail in the paper.

You can download the current release here. LiJtT requires PHP 5 with libtidy support, a PEAR-supported SQL database (e.g. MySQL) and a PHP-enabled web server (e.g. Apache). The machine running LiJtT must be able to access the WWW in order to perform on-line queries to various translation engines. The initial database included in the distribution is currently around 8 MB.