It's essentially a specialized Linux distribution that comes with a lot of useful data software pre-installed and exposes a simple interface. For full documentation, see http://www.datasciencetoolkit.org/developerdocs.
Like data? Check out my Data Source Handbook from O'Reilly.
Version 0.50 - May 19th 2013
Country boundaries by Thematic Mapping.
Contains Ordnance Survey data © Crown copyright and database right 2010.
Irish boundaries by Ben Raue.
New Zealand boundaries from Statistics NZ.
Worldwide states and provinces from Natural Earth.
US neighborhood boundaries provided by Zillow under a CC-SLA license.
This product includes GeoLite data created by MaxMind, available from http://www.maxmind.com/.
Uses the Hpricot library for parsing HTML.
The Boilerpipe library is used to recognize and extract the main story text from documents.
Uses my Ruby port of Eamon Daly and Jon Orwant's original GenderFromName Perl module to classify first names.
Uses street and place data from OpenStreetMap.
Uses region and postal code data from GeoNames.
If you have any questions, comments, or suggestions, email us.